Emotions in Humans and Artifacts
Edited by Robert Trappl, Paolo Petta, and Sabine Payr
April 2003 | ISBN 0262201429 | 400 pp. | 35 illus. | $52.00 (hardback)
Emotions have been much studied and discussed in recent years. Most books, however, treat only one aspect of emotions, such as emotions and the brain, emotions and well-being, or emotions and computer agents. This interdisciplinary book presents recent work on emotions in neuroscience, cognitive science, philosophy, computer science, artificial intelligence, and software and game development. The book discusses the components of human emotion and how they might be incorporated into machines, whether artificial agents should convey emotional responses to human users and how such responses could be made believable, and whether agents should accept and interpret the emotions of users without displaying emotions of their own. It also covers the evolution and brain architecture of emotions, offers vocabularies and classifications for defining emotions, and examines emotions in relation to machines, games, virtual worlds, and music.

TABLE OF CONTENTS
PREFACE
1 EMOTIONS: FROM BRAIN RESEARCH TO COMPUTER GAME DEVELOPMENT BY ROBERT TRAPPL AND SABINE PAYR
2 A THEORY OF EMOTION, ITS FUNCTIONS, AND ITS ADAPTIVE VALUE BY EDMUND T. ROLLS
3 HOW MANY SEPARATELY EVOLVED EMOTIONAL BEASTIES LIVE WITHIN US? BY AARON SLOMAN
4 DESIGNING EMOTIONS FOR ACTIVITY SELECTION IN AUTONOMOUS AGENTS BY LOLA D. CAÑAMERO
5 EMOTIONS: MEANINGFUL MAPPINGS BETWEEN THE INDIVIDUAL AND ITS WORLD BY KIRSTIE L. BELLMAN
6 ON MAKING BELIEVABLE EMOTIONAL AGENTS BELIEVABLE BY ANDREW ORTONY
7 WHAT DOES IT MEAN FOR A COMPUTER TO ‘‘HAVE’’ EMOTIONS? BY ROSALIND W. PICARD
8 THE ROLE OF ELEGANCE IN EMOTION AND PERSONALITY: REASONING FOR BELIEVABLE AGENTS BY CLARK ELLIOTT
9 THE ROLE OF EMOTIONS IN A TRACTABLE ARCHITECTURE FOR SITUATED COGNIZERS BY PAOLO PETTA
10 THE WOLFGANG SYSTEM: A ROLE OF ‘‘EMOTIONS’’ TO BIAS LEARNING AND PROBLEM SOLVING WHEN LEARNING TO COMPOSE MUSIC BY DOUGLAS RIECKEN
11 A BAYESIAN HEART: COMPUTER RECOGNITION AND SIMULATION OF EMOTION BY EUGENE BALL
12 CREATING EMOTIONAL RELATIONSHIPS WITH VIRTUAL CHARACTERS BY ANDREW STERN
CONCLUDING REMARKS BY ROBERT TRAPPL
Preface
In Steven Spielberg’s movie Artificial Intelligence the question is raised: Do robots have emotions, especially love? Can they? Should they?

The study of emotions has become a hot research area during recent years. However, there is no book that covers this topic from different perspectives—from the brain researcher, the cognitive scientist, the philosopher, the AI researcher, the software developer, the interface designer, the computer game developer—thus enabling an overview of the current approaches and models. We therefore invited leading scientists and practitioners from these research and development areas to present their positions and discuss them in a two-day workshop held at the Austrian Research Institute for Artificial Intelligence in Vienna. All discussions were both video- and audiotaped, and the transcripts were sent to the participants.

This book opens with an overview chapter, which also presents the motivation—both for the book and the research covered—in more detail. It is followed by the chapters prepared by the participants especially for the book. Most of the chapters are followed by a condensation of the often vivid and intense discussions. However, on several occasions, questions were raised or comments were expressed at specific moments during the presentations, and these would hardly be understandable at the end of the chapter. We decided to place them at the right spot, clearly separated from the body of the chapter as dialogs, thus enabling the reader to skip them if desired. The book ends with a short chapter, ‘‘Concluding Remarks,’’ followed by ‘‘Recommended Readings,’’ short biographies, and the index.

First, we want to thank the contributors, who took great pains to enhance their original position papers into book chapters by including new material and by considering the comments made in and outside the discussions. Furthermore, we want to thank our colleagues at the Austrian Research Institute for Artificial Intelligence for their support—
especially Isabella Ghobrial-Willmann for efficiently organizing the travel and the hotels for the participants; Gerda Helscher, for her hard work preparing transcripts from the tape recordings; Christian Holzbaur for his technical expertise; and Ulrike Schulz, for carefully helping us in the final preparation of the book manuscript. Bob Prior of MIT Press was an ideal partner in our endeavor. Finally, we want to thank the Austrian taxpayers whose money enabled us to pay for the travel and hotels of the participants at the workshop. We are grateful to Dr. Norbert Rozsenich, Sektionschef, and Dr. René Fries, Ministerialrat, both currently at the Federal Ministry of Transport, Innovation, and Technology, who channeled this money to a project of which this workshop formed an integral part.

It is our hope that this book will serve as a useful guide to the different approaches for understanding the purpose and function of emotions, both in humans and in artifacts, and assist in developing the corresponding computer models.
1 Emotions: From Brain Research to Computer Game Development Robert Trappl and Sabine Payr
1.1 Motivation
There are at least three reasons why the topic of emotions has become such a prominent one during the last several years.

In the early nineties, a strange fact surfaced when people with specific brain lesions (which left their intellectual capacities intact but drastically reduced their capacity to experience emotions) were asked to make rational decisions: This capacity was severely diminished. These psychological experiments were repeated several times with similar results (Damásio 1994). As a result, the strange conclusion had to be drawn that rationality and emotionality are not opposed or even contradictory but that the reverse is true: Emotionality—evidently not including tantrums—is a prerequisite of rational behavior. In contrast, in interviews with twenty leading cognitive scientists in the United States (Baumgartner and Payr 1995), none of them had mentioned emotions! Also, AI research that had been done for decades without considering emotions had to incorporate this new aspect of research (figure 1.1).

Figure 1.1 The development of AI research paradigms: physical symbol systems, neural networks, embodied artificial intelligence, intelligent software agents, emotions.

A second reason: Humans—not only naive ones—treat computers like persons. This can be shown in several cleverly designed experiments (e.g., Reeves and Nass 1996). In one of these experiments, the test persons had to work with a dialog-oriented training program on a computer. Afterward, they were divided into two groups to evaluate the computer’s performance. One group had to answer the evaluation questions on the same computer that was used for the training task, the other group used a different computer. The persons who had to answer the questions on the same computer gave significantly more positive responses than the others—that is, they were less honest and more cautious—which means they automatically applied rules of politeness just as they would have done in interactions with persons. This happened even when the persons asked were students of computer science! Therefore, if people have emotional relations to computers, why
not make the computer recognize these emotions, or even express emotions itself (e.g., Picard 1997)?

From yet another area: Computer animation has made such rapid progress that it is now possible to generate faces and bodies that are hardly distinguishable from human ones, such as the main character in the movie Final Fantasy by Hironobu Sakaguchi, released in 2001. At present, the facial expressions, gestures, and general motion of these synthetic actors have to be either programmed or ‘‘motion captured’’ by cameras from human actors. Given the increasing number of such synthetic actors, this will become more and more difficult and, in the case of real-time interaction in, for example, computer games or educational programs, simply impossible. In order to give these synthetic actors some autonomy, personality models, at least simple ones initially, have to be developed in which emotions play a major role (Trappl and Petta 1997).

As a result, the number of books and papers published on this subject during recent years has increased considerably. However, they usually treat emotions only in one respect—for example: emotions and the brain, emotions and agents, emotions and computer users.
1.2 Aims of the Book
But what are the different aspects of emotions—emotions in humans and in artifacts? How are emotions viewed, analyzed, and, especially, modeled in different disciplines, for different purposes? What would happen if scientists and developers from different disciplines (from brain research, cognitive science, philosophy, computer science, and artificial intelligence, to developers of software and computer games) came together to present and discuss their views? The result of this interdisciplinary endeavor is this book, with the original position papers enhanced by incorporating additional, new material and by presenting condensations of the often intense and vivid discussions of the differing positions.
1.3 Contents
Chapter 2 is the contribution by Edmund Rolls, ‘‘A Theory of Emotion, Its Functions, and Its Adaptive Value.’’ The logical—but not the easiest—way into the broad issue of emotions in humans and artifacts is to ask what emotions are, or better, to try to delimit the class of those states in animals (including humans) that should be called emotional. Edmund Rolls offers a definition wherein emotional states are necessarily tied to reward and punishment: Emotions are states produced by instrumental reinforcers. Whether the reinforcers are primary (unlearned) or secondary (learned by association with primary reinforcers) is not a criterion; rather, he draws the dividing line between those stimuli that are (positively or negatively) reinforcing and those that are not—in which case they can evoke sensations, but not emotions. Rolls’s evolutionary approach to emotions leads him to search for the function of the emotional system in increasing fitness (i.e., the chance to pass on one’s genes). He finds this potential in the reaction-independent nature of emotions, or, in his words: ‘‘rewards and punishments provide a common currency for inputs to response selection mechanisms.’’ The evolutionary advantage, then, is to allow the animal to select arbitrarily among several possible (re)actions, thus increasing its adaptability to complex and new environments. As to the underlying brain architecture, Rolls distinguishes two different paths to action: On the one hand, there is the ‘‘implicit’’ system involving the orbitofrontal cortex and the amygdala. This system is geared toward rapid, immediate reactions and tends to accumulate the greatest possible reward in the present. On the other hand, there is the ‘‘explicit’’ system, in which the symbol-processing capabilities and short-term memory play an important part. This system allows some mammals, including humans, to make multistep plans that defer expected rewards to some point in the future, overriding the implicit system’s promises for immediate rewards. This second system would be, for Rolls, related to
consciousness as the ability to reflect one’s own thoughts (and some emotional states, as we might add), an ability that is necessary in order to evaluate and correct the execution of complex plans.

Aaron Sloman’s chapter ‘‘How Many Separately Evolved Emotional Beasties Live within Us?’’ is concerned with the confusion surrounding the use of emotion and related concepts in AI that are not well understood yet. Before discussing, for example, the necessity of emotions for artifacts or the functions of emotions in animals, an effort has to be made to understand what is meant and involved in each case. Sloman proposes a complex architecture that could allow an approach to these widely varying and ill-defined phenomena from an information-processing perspective. The architecture is based on an overlay of the input-output information flow model (perception-processing-action) and a model of three processing levels (reactive-deliberative-reflective)—reminiscent of the brain evolution model (‘‘old’’ reptilian and ‘‘newer’’ mammalian brains). This architecture results in what Sloman terms an ‘‘ecology of mind,’’ in that it accommodates coevolved suborganisms that acquire and use different kinds of information and process it in different ways, sometimes cooperating and sometimes competing with each other. This architecture—contrary to many proposals currently made by AI research—does not contain a specific ‘‘emotional system’’ whose function supposedly is to produce emotions. Instead, emotions are emergent properties of interactions between components that are introduced for other reasons and whose functioning may or may not generate emotion. Sloman’s view is that with the progress of research in this field, emotion, as well as other categories originating from naive psychology, may either become useless or may change their meaning radically—in analogy to some categories used in the early days of physics.

With chapter 4, ‘‘Designing Emotions for Activity Selection in Autonomous Agents,’’ by Lola Cañamero, we fully enter the domain of emotions in artifacts. Cañamero first presents an overview and categorization of different approaches to emotional modeling and evaluates them with regard to their adequacy for the design of mechanisms where emotions guide the selection of actions. From the point of view of the modeling goal, she considers phenomenon-based (or black box) models as arising from an engineering motivation, whereas design-based (or process) models are motivated by more scientific concerns. From the perspective of their view on emotions, she distinguishes component-based and functional models. Component-based models (e.g., Picard’s) postulate an artificial agent as having emotions when it has certain components characterizing human (animal) emotional systems. Functional models (e.g., Frijda’s), on the contrary, focus on how to transpose the properties of humans and their environment to a structurally different system so that the same functions and roles arise. Not surprisingly, functional models turn out to be more useful for action selection. Cañamero explores the relationship between emotion, motivation, and behavior in the experimental system ‘‘Gridland,’’ a dynamic environment with simple virtual beings. The connection between emotions, motivations, and behavior is achieved through the physiological metaphor of homeostatically controlled variables (drives) and ‘‘hormones.’’ Her motivation is clearly that of the engineer when she seeks to make virtual beings benefit from the better survival chances of emotional animals (including humans): rapid reactions, resolution of goal conflicts, and the social function of signaling relevant events to others.

Kirstie Bellman’s ‘‘Emotions: Meaningful Mappings Between the Individual and Its World’’ (chapter 5) first describes a specific function of emotions and its modeling (as presented in chapter 4). Then it presents a broader perspective on the part played by emotions in cognition, social interaction, and emergence of the self. Based on that, virtual worlds as open test beds with a hybrid (agent and human) population are presented. From an information processing perspective—which has been, historically, dominant in cognitive science and artificial intelligence—the functions of emotions are primarily seen in their implications for decision making, arousal, and motivation. Bellman, however, goes on to explore the role of emotions in establishing (a sense of) ‘‘self’’: emotions are always personal and experienced from a first-person point of view. The ‘‘self’’ is always a perceived and felt self. Emotions, then, and self-perception of them, play a crucial part in integrating cognitive and vital processes into a continuous and global construction. Without this integration, Bellman argues, it is not possible for a being to ‘‘make sense of the world.’’ She then advocates virtual worlds as test beds for collecting a new level of observations on the interactions among humans
and agents. Virtual worlds have the advantage of offering well-specified environments where human and virtual actors can meet on common ground, and, at the same time, of allowing the capture and analysis of interactions among humans, among agents, and between humans and agents.

In chapter 6, ‘‘On Making Believable Emotional Agents Believable,’’ Andrew Ortony starts out from the requirement that for the behavior of an emotional agent to be believable, it has to be consistent across similar situations, but also coherent across different kinds of situations and over longer periods of time. Emotions, motivations, and behaviors are not randomly related to the situations that give rise to them. While the mapping from types of emotions to classes of response tendencies is flexible, it is not totally accidental, in either form or intensity. Ortony distinguishes three major types of emotion response tendencies—expressive, information processing, and coping. Which tendencies will prevail in an individual (or in an artifact) is to some degree a question of the individual’s personality. Personality, in this view, is not a mere description of observed regularities, but a generative mechanism that drives behavior. Only personality models that reduce the vast number of observed personality traits to a small number of dimensions are capable of providing such a ‘‘behavior engine.’’ Factor structure is one such model, offering between two and five basic factors. Regulatory focus (‘‘promotion’’ vs. ‘‘prevention’’ focus), characterized as a preference for either gain/no-gain or nonloss/loss situations, is another option. For building a believable agent, it is therefore necessary to ensure appropriate internal responses (emotions) and appropriate external responses (behavior and behavioral inclinations), to arrange for coordination between internal and external responses, and to provide for individual appropriateness through a personality model.

In chapter 7, ‘‘What Does It Mean for a Computer to ‘Have’ Emotions?’’ Rosalind Picard approaches this question from two angles. First she lists possible motivations why computers should have certain emotional abilities. Her concern has been with ways in which computers could be more adaptive to users’ feelings—especially those of frustration. It may not be necessary for such an adaptive system to ‘‘have’’ emotions as long as it recognizes and deals with frustration in a useful way, but the nature of this task,
which involves real-time decision making with complex unpredictable inputs and limited resources, might eventually require mechanisms analogous to emotion. In the rest of the chapter, Picard differentiates four components of emotion: emotional appearance (comprising any kind of expression, behavior, or signal that makes emotion visible to others), multiple levels of emotion generation (meaning at least two levels: subconsciously generated emotions versus more reason-generated emotions), emotional experience (again, the subjective quality of emotions, needed for self-perception and self-monitoring), and mind-body interactions (including hidden regulatory and biasing functions of emotion, typically occurring without one’s awareness). Her term sentic modulation suggests that emotions not only give rise to specific ‘‘emotional behavior,’’ but modify behavior in general. Picard addresses the way computers with emotional abilities are presented to the general public. She does not see emotions as a phenomenon that will separate man and machine in the future—rather, she argues that all these known components of emotion should be implementable in machines, with the possible exception of fully implementing the third component, which seems to require conscious experience.

The focus of Clark Elliott in chapter 8, ‘‘The Role of Elegance in Emotion and Personality: Reasoning for Believable Agents,’’ is clearly on ‘‘alien AI’’ in contrast to ‘‘human AI’’—that is, the creation of anthropomorphic artifacts. What he terms ‘‘elegance’’ in the title of his chapter is the requirement that the agent should, first of all, support the user’s projection of emotions onto it, rather than (necessarily) exhibit a complex emotional model built strictly along the lines of what is known about emotions in humans. The effectiveness of the underlying emotion and personality theory in allowing fluid, intuitive interactions with the user, in his view, takes precedence over its psychological correctness. He illustrates his point with an experiment where listeners had to recall different ‘‘emotional’’ stories generated around the same simple plot. In his view, it is the user who endows the agent with personality and emotions without taking the artifact for a ‘‘real’’ personality. In a Gedankenexperiment where the user is told to kill or torture the artifact in order to save her cat from pain or discomfort, she would hardly hesitate. As a consequence, agents
should support the user in building up the illusion of social interaction, which is quite a different engineering goal from attempting to force the user into it.

In ‘‘The Role of Emotions in a Tractable Architecture for Situated Cognizers’’ (chapter 9), Paolo Petta begins with an investigation into the relevance of the emotional for situated agent engineering, which he demonstrates for reactivity, social abilities, autonomy, and proactiveness. Following Nico Frijda, he views the emotional as a flexible adaptation mechanism: the appraisal of an event with respect to its adaptational significance for the individual, followed by the generation of an action tendency aimed at changing the relationship between the individual and the environment. Petta then discusses appraisal theories of emotion, models of the appraisal process, and different agent architectures for emotion synthesis. As an example of an early appraisal-based architecture, he presents in more detail the Affective Reasoner, developed by Clark Elliott. These considerations led Petta to develop his own architecture framework for software agents in virtual environments—TABASCO (tractable appraisal-based architecture framework for situated cognizers). He gives a detailed description of the structure of TABASCO and of the role and functions of its components, and shows its links to psychological emotion research. TABASCO is not only a theoretical concept but has already been implemented in an exhibit in the Vienna Museum of Technology. Conclusions drawn from the experiences with this implementation will thus help set the directions of further research.

In chapter 10, ‘‘The Wolfgang System: A Role of ‘Emotions’ to Bias Learning and Problem Solving when Learning to Compose Music,’’ Douglas Riecken describes a working model of music composition in which dispositions are instrumental in deciding about steps in the elaboration of tonal monodies during composing. Composing is viewed as a process that creates an artifact to communicate some cognitive emotional effect. Wolfgang is a learning system based on a dynamic K-line memory. Its basic elements are E-nodes (emotion nodes), which are collections of information defining the emoting potential of a given primitive musical artifact.

Gene Ball’s ‘‘A Bayesian Heart: Computer Recognition and Simulation of Emotion’’ (chapter 11) discusses a system that uses a Bayesian network to analyze and generate emotional responses. The emotionally sensitive conversational interface he has in mind
should be able to detect the user’s emotional state in an unobtrusive way. He points out that any need for add-on devices like glasses, gloves, and so on might be a serious obstacle to the widespread use of emotional artifacts. The model of emotions he uses is deliberately simple, representing both short-term emotional states and longer-term personality traits on two orthogonal axes. The dimensions of valence and arousal are used to categorize emotions, while personality is characterized by friendliness and dominance as dimensions. The agent maintains two copies of the emotion model, one for assessing the user’s state, the other for generating the agent’s behavior. In the Bayesian network, internal states are linked to nodes representing the aspects of behavior that are assumed to be influenced by that state, such as choice of linguistic expression, vocalization, gesture, and posture.

Concluding the contributed chapters, ‘‘Creating Emotional Relationships with Virtual Characters’’ (chapter 12), by Andrew Stern, presents the computer toys ‘‘Petz’’ and ‘‘Babyz.’’ They are among the first commercial software products implementing virtual characters with emotions. The goal of these software toys is to enable the user to build an emotional relationship with them. Starting out from this design objective and these market requirements, the question of modeling emotions becomes a practical engineering and staging problem: Which emotions should such characters have, and how should they display them? What affects emotional states, and how do they influence behavior? An application that wants to present a believable, coherent illusion of life in an entertaining way puts several constraints on the architecture of emotions and on user interface design, in which Andrew Stern sees the current bottleneck for any ‘‘naturalistic’’ emotional relationship between humans and virtual characters. However, beyond issues of engineering, this chapter raises the question of what it means for humans to have an emotional relationship with a virtual being. Stern questions the usefulness of emotional relationships with what he calls ‘‘functional’’ agents while underlining the long tradition of emotional attachment to characters in entertainment—be they characters in novels, movies, cartoons, or TV series. Computer and robot toys with emotions add a new dimension to this tradition by introducing interactivity. Forming an emotional relationship with the artificial being becomes the goal of the game and, at the same time, its ‘‘plot.’’ The creation of the properties and
behaviors of such virtual beings is, in his view, an artistic work just as was the creation of famous fiction or cartoon characters.
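To give a concrete flavor of the kind of lightweight representation described for Ball’s model in chapter 11, the following short Python sketch encodes a short-term emotional state on valence and arousal axes, a personality on friendliness and dominance axes, and a single Bayesian update from one observable cue to the hidden valence of the user’s state. The numbers, the example wordings, and the naive-Bayes-style update are illustrative assumptions only; they are not taken from the network actually used in that chapter.

# Illustrative sketch only; the probabilities and the simple Bayes update are
# assumptions, not the network described in chapter 11.
emotion = {"valence": 0.0, "arousal": 0.0}               # short-term state, -1 .. +1
personality = {"friendliness": 0.5, "dominance": -0.2}   # long-term traits, -1 .. +1
# (The agent would keep two copies of such a model: one for assessing the user,
#  one for generating its own behavior.)

# Assumed likelihoods: P(observed wording | user's valence is negative/positive)
p_wording_given_valence = {
    "this stupid thing crashed again": {"negative": 0.70, "positive": 0.05},
    "hmm, that did not quite work":    {"negative": 0.40, "positive": 0.25},
}

def update_negative_valence_belief(prior_negative, wording):
    """One Bayes step: revise the belief that the user's valence is negative."""
    likelihood = p_wording_given_valence[wording]
    joint_neg = likelihood["negative"] * prior_negative
    joint_pos = likelihood["positive"] * (1.0 - prior_negative)
    return joint_neg / (joint_neg + joint_pos)

belief = update_negative_valence_belief(0.3, "this stupid thing crashed again")
print(round(belief, 2))   # about 0.86: the agent now suspects the user is frustrated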
References

Baumgartner, P., and Payr, S., eds. (1995): Speaking Minds: Interviews with Twenty Eminent Cognitive Scientists. Princeton University Press, Princeton.

Damásio, A. (1994): Descartes’ Error: Emotion, Reason, and the Human Brain. Putnam, New York.

Picard, R. (1997): Affective Computing. MIT Press, Cambridge.

Reeves, B., and Nass, C. (1996): The Media Equation: How People Treat Computers, Television, and New Media Like Real People and Places. CSLI Publications, Stanford, Calif., and Cambridge University Press, Cambridge.

Trappl, R., and Petta, P., eds. (1997): Creating Personalities for Synthetic Actors: Towards Autonomous Personality Agents. Springer, Berlin, Heidelberg, New York.
2 A Theory of Emotion, Its Functions, and Its Adaptive Value Edmund T. Rolls
Summary
Emotions may be defined as states elicited by reinforcers (rewards and punishments). This approach helps with understanding the functions of emotion and with classifying different emotions; it helps also in understanding what information processing systems in the brain are involved in emotion and how they are involved. The hypothesis is developed that brains are designed around reward and punishment evaluation systems, because this is the way that genes can build a complex system that will produce appropriate but flexible behavior to increase their fitness. By specifying goals rather than particular behavioral patterns of responses, genes leave much more open the possible behavioral strategies that might be required to increase their fitness.
2.1 Introduction
What are emotions? Why do we have emotions? What are the rules by which emotion operates? What are the brain mechanisms of emotion, and how can disorders of emotion be understood? Why does it feel ‘‘like something’’ to have an emotion? What motivates us to work for particular rewards such as food when we are hungry, or water when we are thirsty? How do these motivational control systems operate to ensure that we eat approximately the correct amount of food to maintain our body weight or to replenish our thirst? What factors account for the overeating and obesity that some humans show? Why is the brain built to have reward and punishment systems, rather than in some other way? Raising this issue of brain design and why we have reward and punishment systems (and emotion and motivation) produces a fascinating answer based on how genes can direct our behavior to increase their fitness. How does the brain produce behavior by using reward and punishment mechanisms? These are some of the questions considered in The Brain and Emotion (Rolls 1999), and introduced here.
2.2 A Theory of Emotion, and Some Definitions
Emotions can usefully be defined as states elicited by rewards and punishments, including changes in rewards and punishments (see also Rolls 1986a,b, 1990, 2000). A reward is anything for which an animal will work. A punishment is anything that an animal will work to escape or avoid. An example of an emotion might thus be happiness produced by being given a reward, such as a pleasant touch, praise, or winning a large sum of money. Another example of an emotion might be fear produced by the sound of a rapidly approaching bus, or the sight of an angry expression on someone’s face. We will work to avoid such stimuli, which are punishing. Another example would be frustration, anger, or sadness produced by the omission of an expected reward such as a prize, or the termination of a reward such as the death of a loved one. Another example would be relief, produced by the omission or termination of a punishing stimulus—such as the removal of a painful stimulus, or sailing out of danger. These examples indicate how emotions can be produced by the delivery, omission, or termination of rewarding or punishing stimuli, and go some way to indicate how different emotions could be produced and classified in terms of the rewards and punishments received, omitted, or terminated. A diagram summarizing some of the emotions associated with the delivery of reward or punishment, a stimulus associated with them, or with the omission of a reward or punishment, is shown in figure 2.1.

Figure 2.1 Some of the emotions associated with different reinforcement contingencies are indicated. Intensity increases away from the center of the diagram, on a continuous scale. The classification scheme created by the different reinforcement contingencies consists of (1) the presentation of a positive reinforcer (S+), (2) the presentation of a negative reinforcer (S−), (3) the omission of a positive reinforcer (S+) or the termination of a positive reinforcer (S+!), and (4) the omission of a negative reinforcer (S−) or the termination of a negative reinforcer (S−!). (After The Brain and Emotion, figure 3.1.)

Before accepting this approach, we should consider whether there are any exceptions to the proposed rule. Are there any emotions caused by stimuli, events, or remembered events that are not rewarding or punishing? Do any rewarding or punishing stimuli not cause emotions? We will consider these questions in more detail below. The point is that if there are no major exceptions, or if any exceptions can be clearly encapsulated, then we may have a good working definition at least of what causes emotions. Moreover, it is worth pointing out that many approaches to or theories of emotion (see Strongman 1996) have in common the fact that part of the process involves ‘‘appraisal’’ (e.g., Frijda 1986; Lazarus 1991; Oatley and Jenkins 1996). In all these theories, the concept of appraisal presumably involves assessing whether something is rewarding or punishing. The description in terms of reward or punishment adopted here seems more tightly and operationally specified. I next consider a slightly more formal definition than
rewards or punishments, in which the concept of reinforcers is introduced, and show how there has been a considerable history in the development of ideas along this line. The proposal that emotions can be usefully seen as states produced by instrumental reinforcing stimuli follows earlier work by Millenson (1967), Weiskrantz (1968), Gray (1975, 1987) and Rolls (1986a,b, 1990). (Instrumental reinforcers are stimuli that, if their occurrence, termination, or omission is made contingent upon the making of a response, alter the probability of the future emission of that response.) Some stimuli are unlearned reinforcers (e.g., the taste of food if the animal is hungry, or pain), while others may become reinforcing by learning, because of their association
with such primary reinforcers, thereby becoming ‘‘secondary reinforcers.’’ This type of learning may thus be called ‘‘stimulus-reinforcement association,’’ and occurs via a process like classical conditioning. If a reinforcer increases the probability of emission of a response on which it is contingent, it is said to be a ‘‘positive reinforcer’’ or ‘‘reward’’; if it decreases the probability of such a response it is a ‘‘negative reinforcer’’ or ‘‘punishment.’’ For example, fear is an emotional state that might be produced by a sound (the conditioned stimulus) that has previously been associated with an electrical shock (the primary reinforcer). The converse reinforcement contingencies produce the opposite effects on behavior. The omission or termination of a positive reinforcer (‘‘extinction’’ and ‘‘time out,’’ respectively—sometimes described as ‘‘punishing’’) decreases the probability of responses. Responses followed by the omission or termination of a negative reinforcer increase in probability—this pair of negative reinforcement operations being termed ‘‘active avoidance’’ and ‘‘escape,’’ respectively (see, further, Gray 1975; Mackintosh 1983). This foundation has been developed (see Rolls 1986a,b, 1990, 1999, 2000) to show how a very wide range of emotions can be accounted for as a result of the operation of a number of factors, including the following:

1. The reinforcement contingency (e.g., whether reward or punishment is given or withheld) (see figure 2.1).

2. The intensity of the reinforcer (see figure 2.1).

3. Any environmental stimulus might have a number of different reinforcement associations. (For example, a stimulus might be associated both with the presentation of a reward and of a punishment, allowing states such as conflict and guilt to arise.)

4. Emotions elicited by stimuli associated with different primary reinforcers will be different.

5. Emotions elicited by different secondary reinforcing stimuli will be different from each other (even if the primary reinforcer is similar).

6. The emotion elicited can depend on whether an active or passive behavioral response is possible. (For example, if an active behavioral response can occur to the omission of a positive reinforcer, then anger might be produced, but if only passive behavior is possible, then sadness, depression, or grief might occur.)

By combining these six factors, it is possible to account for a very wide range of emotions (for elaboration, see Rolls 1990, 1999).
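Read computationally, the classification sketched in figure 2.1 and in the factors above amounts to a mapping from reinforcement contingencies to candidate emotion labels. The following Python sketch is an illustration of that reading, not part of Rolls’s account: the function, the coarse labels, and the numeric intensity scale are assumptions chosen for the example, and only factors 1, 2, and 6 are represented.

# Illustrative sketch (assumed labels and thresholds, not Rolls's own formalization):
# mapping a reinforcement contingency to a coarse emotion label, after figure 2.1.
from dataclasses import dataclass

@dataclass
class Contingency:
    reinforcer: str                 # "positive" (S+) or "negative" (S-)
    event: str                      # "presented", "omitted", or "terminated"
    intensity: float                # 0.0 .. 1.0, increasing away from the center of figure 2.1
    active_response_possible: bool = True   # factor 6

def classify_emotion(c: Contingency) -> str:
    """Return a coarse emotion label for a single reinforcement contingency."""
    if c.reinforcer == "positive":
        if c.event == "presented":
            return "happiness"
        # omission or termination of an expected reward; factor 6 decides the form
        if c.active_response_possible:
            return "anger" if c.intensity > 0.7 else "frustration"
        return "grief" if c.intensity > 0.7 else "sadness"
    # negative reinforcer
    if c.event == "presented":
        return "fear"
    return "relief"   # omission or termination of an expected punisher

print(classify_emotion(Contingency("positive", "omitted", 0.4, active_response_possible=False)))
# -> sadness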
It is also worth noting that emotions can be produced just as much by the recall of reinforcing events as by external reinforcing stimuli; that cognitive processing (whether conscious or not) is important in many emotions—for very complex cognitive processing may be required to determine whether or not environmental events are reinforcing. Indeed, emotions normally consist of cognitive processing that analyses the stimulus, determines its reinforcing valence—and then an elicited mood change if the valence is positive or negative. In that an emotion is produced by a stimulus, philosophers say that emotions have an object in the world, and that emotional states are intentional, in that they are about something. We note that a mood or affective state may occur in the absence of an external stimulus, as in some types of depression, but that normally the mood or affective state is produced by an external stimulus, with the whole process of stimulus representation, evaluation in terms of reward or punishment, and the resulting mood or affect being referred to as emotion. Three issues receive discussion here (see further Rolls 1999, 2000). One is that rewarding stimuli such as the taste of food are not usually described as producing emotional states (though there are cultural differences here!). It is useful here to separate rewards related to internal homeostatic need states associated with, say, hunger and thirst, and to note that these rewards are not normally described as producing emotional states. In contrast, the great majority of rewards and punishments are external stimuli not related to internal need states such as hunger and thirst, and these stimuli do produce emotional responses. An example is fear produced by the sight of a stimulus that is about to produce pain. A second issue is that philosophers usually categorize fear in the previous example as emotion, but not pain. The distinction they make may be that primary (unlearned) reinforcers do not produce emotions, whereas secondary reinforcers (stimuli associated by stimulus-reinforcement learning with primary reinforcers) do. They describe the pain as a sensation. But neutral stimuli (such as a table) can produce sensations when touched. It accordingly seems to be much more useful to categorize stimuli according to whether they are reinforcing (in which case they produce emotions), or are not reinforcing (in which case they do not produce emotions). Clearly, there is a difference between primary reinforcers and learned reinforcers; but this is most precisely caught by
noting that this is the difference, and that it is whether a stimulus is reinforcing that determines if it is related to emotion. A third issue is that, as we are about to see, emotional states (i.e., those elicited by reinforcers) have many functions, and the implementations of only some of these functions by the brain are associated with emotional feelings (Rolls 1999), including evidence for interesting dissociations in some patients with brain damage between actions performed to reinforcing stimuli and what is subjectively reported. In this sense, it is biologically and psychologically useful to consider emotional states as including more than just those states associated with feelings of emotion.
2.3 The Functions of Emotion
The functions of emotion also provide insight into the nature of emotion. These functions, described more fully elsewhere (Rolls 1990, 1999), can be summarized as follows:

1. The elicitation of autonomic responses (e.g., a change in heart rate) and endocrine responses (e.g., the release of adrenaline). These prepare the body for action.

2. Flexibility of behavioral responses to reinforcing stimuli. Emotional (and motivational) states allow a simple interface between sensory inputs and action systems. The essence of this idea is that goals for behavior are specified by reward and punishment evaluation. When an environmental stimulus has been decoded as a primary reward or punishment, or (after previous stimulus-reinforcer association learning) a secondary rewarding or punishing stimulus, then it becomes a goal for action. The animal can then perform any action (instrumental response) to obtain the reward, or to avoid the punishment. Thus there is flexibility of action, and this is in contrast to stimulus-response, or habit, learning in which a particular response to a particular stimulus is learned. It also contrasts with the elicitation of species-typical behavioral responses by sign-releasing stimuli (such as pecking at a spot on the beak of the parent herring gull in order to be fed—Tinbergen 1951—where there is inflexibility of the stimulus and the response, and which can be seen as a very limited type of brain solution to the elicitation of behavior). The emotional route to action is flexible not only because any action can be performed to obtain the reward or avoid the punishment, but also because the animal can learn in as little as one trial that a reward or punishment is associated with a particular stimulus, in what is termed ‘‘stimulus-reinforcer association learning.’’ To summarize and formalize, two processes are involved in the actions being described. The first is stimulus-reinforcer association learning, and the second is instrumental learning of an operant response made to approach and obtain the reward or to avoid or escape from the punishment. Emotion is an integral part of this, for it is the state elicited in the first stage, by stimuli that are decoded as rewards or punishments, and this state has the property that it is motivating. The motivation is to obtain the reward or avoid the punishment, and animals must be built to obtain certain rewards and avoid certain punishments. Indeed, primary or unlearned rewards and punishments are specified by genes that effectively specify the goals for action. This is the solution that natural selection has found for how genes can influence behavior to promote their fitness (as measured by reproductive success), and for how the brain could interface sensory systems to action systems. Selecting between available rewards with their associated costs, and avoiding punishments with their associated costs, is a process that can take place both implicitly (unconsciously), and explicitly, using a language system to enable long-term plans to be made (Rolls 1999). These many different brain systems, some involving implicit evaluation of rewards, and others explicit, verbal, conscious evaluation of rewards and planned long-term goals, must all enter into the selector of behavior (figure 2.2). This selector is poorly understood, but it might include a process of competition between all the competing calls on output, and might involve the basal ganglia in the brain (see figure 2.2 and Rolls 1999).

3. Emotion is motivating, as just described. For example, fear learned by stimulus-reinforcement association provides the motivation for actions performed to avoid noxious stimuli.

4. Communication. Monkeys, for example, may communicate their emotional state to others by making an open-mouth threat to indicate the extent to which they are willing to compete for resources, and this may influence the behavior of other animals. This aspect of emotion was emphasized by Darwin (1872) and has been studied more recently by Ekman (1982, 1993). Ekman reviews evidence that humans can categorize facial expressions into the categories happy, sad, fearful, angry, surprised, and disgusted, and that this categorization may operate similarly in different cultures. He also describes how the facial muscles produce different expressions. Further investigations of the degree of cross-cultural universality of facial expression, its development in infancy, and its role in social behavior are described by Izard (1991) and Fridlund (1994). As shown below, there are neural systems in the amygdala and overlying temporal cortical visual areas that are specialized for the face-related aspects of this processing.

5. Social bonding. Examples of this are the emotions associated with the attachment of the parents to their young and the attachment of the young to their parents.

6. The current mood state can affect the cognitive evaluation of events or memories (see Oatley and Jenkins 1996). This may facilitate continuity in the interpretation of the reinforcing value of events in the environment. A hypothesis that back projections—from parts of the brain involved in emotion such as the orbitofrontal cortex and amygdala—implement this is described in The Brain and Emotion (Rolls 1999).

7. Emotion may facilitate the storage of memories. One way this occurs is that episodic memory (i.e., one’s memory of particular episodes) is facilitated by emotional states. This may be advantageous in that storing many details of the prevailing situation when a strong reinforcer is delivered may be useful in generating appropriate behavior in situations with some similarities in the future. This function may be implemented by the relatively nonspecific projecting systems to the cerebral cortex and hippocampus, including the cholinergic pathways in the basal forebrain and medial septum, and the ascending noradrenergic pathways (see chapter 4 in Rolls 1999 and Rolls and Treves 1998). A second way in which emotion may affect the storage of memories is that the current emotional state may be stored with episodic memories, providing a mechanism for the current emotional state to affect which memories are recalled. A third way that emotion may affect the storage of memories is by guiding the cerebral cortex in the representations of the world that are set up. For example, in the visual system it may be useful for perceptual representations or analyzers to be built that are different from each other if they are associated with different reinforcers, and for these to be less likely to be built if they have no association with reinforcement. Ways in which back projections—from parts of the brain important in emotion (such as the amygdala) to parts of the cerebral cortex—could perform this function are discussed by Rolls and Treves (1998).

8. Another function of emotion is that by enduring for minutes or longer after a reinforcing stimulus has occurred, it may help to produce persistent and continuing motivation and direction of behavior, to help achieve a goal or goals.

9. Emotion may trigger the recall of memories stored in neocortical representations. Amygdala back projections to the cortex could perform this for emotion in a way analogous to that in which the hippocampus could implement the retrieval in the neocortex of recent (episodic) memories (Rolls and Treves 1998).

Figure 2.2 Dual routes to the initiation of action in response to rewarding and punishing stimuli. The inputs from different sensory systems to brain structures such as the orbitofrontal cortex and amygdala allow these brain structures to evaluate the reward- or punishment-related value of incoming stimuli, or of remembered stimuli. The different sensory inputs enable evaluations within the orbitofrontal cortex and amygdala based mainly on the primary (unlearned) reinforcement value for taste, touch, and olfactory stimuli, and on the secondary (learned) reinforcement value for visual and auditory stimuli. In the case of vision, the ‘‘association cortex’’ which outputs representations of objects to the amygdala and orbitofrontal cortex is the inferior temporal visual cortex. One route for the outputs from these evaluative brain structures is via projections directly to structures such as the basal ganglia (including the striatum and ventral striatum) to enable implicit, direct behavioral responses based on the reward- or punishment-related evaluation of the stimuli to be made. The second route is via the language systems of the brain, which allow explicit (verbalizable) decisions involving multistep syntactic planning to be implemented. (After The Brain and Emotion, figure 9.4.)
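The two processes singled out under function 2 can be made concrete with a small Python sketch. It is an illustration only, under assumptions that are not in the text: a simple delta-rule update stands in for stimulus-reinforcer association learning (with a high learning rate so that an association can be acquired in roughly one trial), and a toy goal selector stands in for the flexible choice of an instrumental response.

# Illustrative sketch (assumed update rule and toy action set, not Rolls's model).
# Process (i): stimulus-reinforcer association learning - a stimulus acquires the
# reinforcement value of the primary reinforcer it has been paired with.
# Process (ii): flexible instrumental behavior - any available action may then be
# selected to approach the decoded reward or avoid the decoded punisher.

learned_value = {}      # stimulus -> learned reinforcement value
LEARNING_RATE = 0.9     # high rate, so the association is acquired in about one trial

def stimulus_reinforcer_update(stimulus, primary_reinforcement):
    """Classical-conditioning-like update of a stimulus's learned value."""
    v = learned_value.get(stimulus, 0.0)
    learned_value[stimulus] = v + LEARNING_RATE * (primary_reinforcement - v)

def select_goal_and_action(perceived_stimuli, available_actions):
    """Take the stimulus with the strongest learned value as the goal, then pick
    any available action that approaches it (if rewarding) or avoids it (if
    punishing); the particular response is arbitrary, only the goal is fixed."""
    goal = max(perceived_stimuli, key=lambda s: abs(learned_value.get(s, 0.0)))
    wanted = "approach" if learned_value.get(goal, 0.0) > 0 else "avoid"
    for action, (target, effect) in available_actions.items():
        if target == goal and effect == wanted:
            return goal, action
    return goal, None

# One pairing of a tone with a mild shock is enough to make the tone aversive:
stimulus_reinforcer_update("tone", -1.0)
actions = {"press_lever": ("tone", "avoid"), "approach_speaker": ("tone", "approach")}
print(select_goal_and_action(["tone", "table"], actions))   # ('tone', 'press_lever')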
2.4 Reward, Punishment, and Emotion in Brain Design: An Evolutionary Approach
The theory of the functions of emotion is further developed in chapter 10 of The Brain and Emotion (Rolls 1999). Some of the points made help to elaborate greatly on item no. 2 from section 2.3 above. In chapter 10 of The Brain and Emotion, the fundamental question of why we and other animals are built to use rewards and punishments to guide or determine our behavior is considered. Why are we built to have emotions, as well as motivational states? Is there any reasonable alternative around which evolution could have built complex animals? In this section, I outline several types of brain design, with differing degrees of complexity, and suggest that evolution can operate to influence action with only some of these types of design.
Taxes
A simple design principle is to incorporate mechanisms for taxes into the design of organisms. Taxes consist at their simplest of orientation toward stimuli in the environment—for example, the bending of a plant toward light, which results in maximum light collection by its photosynthetic surfaces. (When just turning rather than locomotion is possible, such responses are called tropisms.) With locomotion possible, as in animals, taxes include movements toward sources of nutrient and movements away from hazards such as very high temperatures. The design principle here is that animals have, through a process of natural selection, built receptors for certain dimensions of the wide range of stimuli in the environment, and have linked these receptors to mechanisms for particular responses in such a way that the stimuli are approached or avoided.
Reward and Punishment
As soon as we have approach toward stimuli at one end of a dimension (e.g., a source of nutrient) and away from stimuli at the other end of the dimension (in this case lack of nutrient), we can start to wonder when it is appropriate to introduce the terms rewards and punishments for the stimuli at the different ends of the dimension. By convention, if the response consists of a fixed reaction to obtain the stimulus (e.g., locomotion up a chemical gradient), we shall call this a taxis, not a reward. On the other
hand, if an arbitrary operant response can be performed by the animal in order to approach the stimulus, then we will call this rewarded behavior, and the stimulus the animal works to obtain is a reward. (The operant response can be thought of as any arbitrary action the animal will perform to obtain the stimulus.) This criterion, of an arbitrary operant response, is often tested by bidirectionality. For example, if a rat can be trained to either raise or lower its tail in order to obtain a piece of food, then we can be sure that there is no fixed relationship between the stimulus (e.g., the sight of food) and the response, as there is in a taxis. The role of natural selection in this process is to guide animals to build sensory systems that will respond to dimensions of stimuli in the natural environment along which actions can lead to better ability to pass genes on to the next generation—that is, to increased fitness. The animals must be built by such natural selection to make responses that will enable them to obtain more rewards—that is, to work to obtain stimuli that will increase their fitness. Correspondingly, animals must be built to make responses that will enable them to escape from, or learn to avoid, stimuli that will reduce their fitness. There are likely to be many dimensions of environmental stimuli along which responses can alter fitness. Each of these dimensions may be a separate reward-punishment dimension. An example of one of these dimensions might be food reward. It increases fitness to be able to sense nutrient need, to have sensors that respond to the taste of food, and to perform behavioral responses to obtain such reward stimuli when in that need or motivational state. Similarly, another dimension is water reward, in which the taste of water becomes rewarding when there is body fluid depletion (see chapter 7 of Rolls 1999). With many reward/punishment dimensions for which actions may be performed (see table 10.1 of Rolls 1999 for a nonexhaustive list), a selection mechanism for actions performed is needed. In this sense, rewards and punishments provide a common currency for inputs to response-selection mechanisms. Evolution must set the magnitudes of each of the different reward systems so that each will be chosen for action in such a way as to maximize overall fitness. Food reward must be chosen as the aim for action if a nutrient is depleted, but water reward as a target for action must be selected if current water depletion poses a greater threat to fitness than the current food depletion. This indicates that each reward must be carefully calibrated by evolution to have the right value in
the common currency for the competitive selection process. Other types of behavior, such as sexual behavior, must be selected sometimes, but probably less frequently, in order to maximize fitness (as measured by gene transmission to the next generation). Many processes contribute to increasing the chances that a wide set of different environmental rewards will be chosen over a period of time, including not only need-related satiety mechanisms, which decrease the rewards within a dimension, but also sensory-specific satiety mechanisms, which facilitate switching to another reward stimulus (sometimes within and sometimes outside the same main dimension), and attraction to novel stimuli. Finding novel stimuli rewarding is one way that organisms are encouraged to explore the multidimensional space in which their genes are operating. The above mechanisms can be contrasted with typical engineering design. In the latter, the engineer defines the requisite function and then produces special-purpose design features that enable the task to be performed. In the case of the animal, there is a multidimensional space within which many optimizations to increase fitness must be performed. The solution is to evolve reward/punishment systems tuned to each dimension in the environment that can increase fitness if the animal performs the appropriate actions. Natural selection guides evolution to find these dimensions. In contrast, in the engineering design of a robot arm, the robot does not need to tune itself to find the goal to be performed. The contrast is between design by evolution, which is ‘‘blind’’ to the purpose of the animal, and design by a designer who specifies the job to be performed (cf. Dawkins 1986). Another contrast is that for the animal, the space will be high dimensional, so that the most appropriate reward for current behavior (taking into account the costs of obtaining each reward) needs to be selected, whereas for the robot arm, the function to perform at any one time is specified by the designer. Another contrast is that the behavior (the operant response) most appropriate to obtain the reward must be selected by the animal, whereas the movement to be made by the robot arm is specified by the design engineer. The implication of this comparison is that operation by animals using reward and punishment systems tuned to dimensions of the environment that increase fitness provides a mode of operation that can work in organisms that evolve by natural selection. It is clearly a natural outcome of Darwinian evolution to operate using reward and punishment systems tuned to fitness-related dimensions of the environment—if arbitrary responses are to be made by the animals, rather than just preprogrammed movements, such as tropisms and taxes. Is there any alternative to such a reward/punishment-based system in this evolution-by-natural-selection situation? It is not clear that there is, if the genes are efficiently to control behavior. The argument is that genes can specify actions that will increase their fitness if they specify the goals for action. It would be very difficult for them in general to specify in advance the particular responses to be made to each of a myriad of different stimuli. This may be why we are built to work for rewards, avoid punishment, and to have emotions and needs (motivational states). This view of brain design in terms of reward and punishment systems built by genes that gain their adaptive value by being tuned to a goal for action offers a deep insight into how natural selection has shaped many brain systems—and is a fascinating outcome of Darwinian thought.
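The idea that rewards and punishments provide a common currency for response selection can be illustrated with a minimal Python sketch. The multiplicative combination of incentive value and internal need, and all of the numbers, are assumptions made for the example; they are one conventional simplification, not a claim about how the brain, or Rolls’s account, actually computes the currency.

# Illustrative sketch (assumed arithmetic and values): several reward dimensions
# are reduced to one "common currency" so that a response-selection mechanism can
# pick whichever behavior currently contributes most to fitness.

def common_currency_value(incentive, need, cost):
    """Value of acting on one dimension: external incentive value weighted by the
    internal need state (food only matters when hungry), minus the action's cost."""
    return incentive * need - cost

dimensions = {
    #           incentive, need (0..1), cost
    "food":    (1.0,       0.3,         0.1),
    "water":   (0.9,       0.8,         0.1),    # fluid depletion is currently greater
    "novelty": (0.4,       0.5,         0.05),   # keeps the animal exploring
}

values = {name: common_currency_value(*p) for name, p in dimensions.items()}
print(max(values, key=values.get))   # -> water: the greater current threat to fitness wins
# Need-related satiety would lower a dimension's need term after consumption, and
# sensory-specific satiety would lower the incentive term for the just-consumed
# stimulus, encouraging a switch to another reward over time.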
2.5
Dual Routes to Action
It is suggested (Rolls 1999) that there are two routes to action performed in relation to reward or punishment in humans. Examples of such actions include emotional and motivational behavior. The first route is via the brain systems that have been present in nonhuman primates such as monkeys, and to some extent in other mammals, for millions of years. These systems include the amygdala and, particularly well-developed in primates, the orbitofrontal cortex. These systems control behavior in relation to previous associations of stimuli with reinforcement. The computation that controls the action thus involves assessment of the reinforcement-related value of a stimulus. This assessment may be based on a number of different factors. One is the previous reinforcement history, which involves stimulus-reinforcement association learning using the amygdala, and its rapid updating—especially in primates using the orbitofrontal cortex. This stimulus-reinforcement association learning may involve quite specific information about a stimulus—for example, of the energy associated with each type of food by the process of conditioned appetite and satiety (Booth 1985). A second is the current motivational state—for example, whether hunger is present, whether other needs are satisfied, and so on. A third factor that affects the computed reward value of the stimulus is whether that reward has been received recently. If it
has been received recently but in small quantity, this may increase the reward value of the stimulus. This is known as incentive motivation or the ‘‘salted peanut’’ phenomenon. The adaptive value of such a process is that this positive feedback of reward value in the early stages of working for a particular reward tends to lock the organism onto behavior being performed for that reward. This means that animals that are, for example, almost equally hungry and thirsty will show hysteresis in their choice of action, rather than continually switching from eating to drinking and back with each mouthful of water or food. This introduction of hysteresis into the reward evaluation system makes action selection a much more efficient process in a natural environment, for constantly switching between different types of behavior would be very costly if all the different rewards were not available in the same place at the same time. (For example, walking half a mile between a site where water was available and a site where food was available after every mouthful would be very inefficient.) The amygdala is one structure that may be involved in this increase in the reward value of stimuli early on in a series of presentations, in that lesions of the amygdala (in rats) abolish the expression of this reward incrementing process, which is normally evident in the increasing rate of working for a food reward early on in a meal (Rolls and Rolls 1982). A fourth factor is the computed absolute value of the reward or punishment expected or being obtained from a stimulus—for example, the sweetness of the stimulus (set by evolution so that sweet stimuli will tend to be rewarding, because they are generally associated with energy sources), or the pleasantness of touch (set by evolution to be pleasant according to the extent to which it brings animals of the opposite sex together, and depending on the investment in time that the partner is willing to put into making the touch pleasurable, a sign that indicates the commitment and value for the partner of the relationship). After the reward value of the stimulus has been assessed in these ways, behavior is then initiated based on approach toward or withdrawal from the stimulus. A critical aspect of the behavior produced by this type of system is that it is aimed directly toward obtaining a sensed or expected reward, by virtue of connections to brain systems such as the basal ganglia, which are concerned with the initiation of actions (see figure 2.2). The expectation may of course involve behavior to obtain stimuli associated with reward, which might even be present in a chain.
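The hysteresis just described can be captured by a single additional term in such a selector: a positive-feedback bonus for the reward currently being worked for, largest early in a bout and decaying as satiety approaches. The sketch below is again only an illustration; the bonus size and switching cost are invented parameters, not empirical values.

```python
# Sketch of incentive motivation (the "salted peanut" effect): early receipt of
# a reward temporarily boosts that reward's value, locking behavior onto it and
# preventing costly switching between, e.g., eating and drinking after every
# mouthful. Parameter values are invented for illustration.

def net_value(base_value, is_current, bout_progress, switch_cost=0.15):
    """bout_progress: 0.0 at the start of a bout of behavior, 1.0 near satiety."""
    incentive_bonus = 0.3 * (1.0 - bout_progress) if is_current else 0.0
    penalty = 0.0 if is_current else switch_cost   # e.g., walking to the other site
    return base_value + incentive_bonus - penalty

# An animal almost equally hungry and thirsty keeps eating early in the meal:
eat = net_value(0.50, is_current=True, bout_progress=0.1)
drink = net_value(0.52, is_current=False, bout_progress=0.0)
assert eat > drink   # hysteresis: no dithering despite the slightly higher base value
```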
Part of the way in which the behavior is controlled with this first route to action is according to the reward value of the outcome. At the same time, the animal may only work for the reward if the cost is not too high. Indeed, in the field of behavioral ecology, animals are often thought of as performing optimally on some cost-benefit curve (see, e.g., Krebs and Kacelnik 1991). This does not at all mean that the animal thinks about the rewards and performs a cost-benefit analysis using a lot of thoughts about the costs, other rewards available and their costs, and so on. Instead, it should be taken to mean that in evolution, the system has evolved in such a manner that the way in which the reward varies with the different energy densities or amounts of food and the delay before it is received can be used as part of the input to a mechanism that has also been built to track the costs of obtaining the food (e.g., energy loss in obtaining it, risk of predation, etc.), and to then select, given many such types of reward and the associated cost, the current behavior that provides the most ‘‘net reward.’’ Part of the value of having the computation expressed in this reward-minus-cost form is that there is then a suitable ‘‘currency,’’ or net reward value, to enable the animal to select the behavior with currently the most net reward gain (or minimal aversive outcome). The second route to action in humans involves a computation with many ‘‘if . . . then’’ statements, to implement a plan to obtain a reward. In this case, the reward may actually be deferred as part of the plan, which might involve working first to obtain one reward, and only then to work for a second more highly valued reward, if this was thought to be overall an optimal strategy in terms of resource usage (e.g., time). In this case, syntax is required, because the many symbols (e.g., names of people) that are part of the plan must be correctly linked or bound. Such linking might be of the form: ‘‘If A does this, then B is likely to do this, and this will cause C to do this. . . .’’ The requirement of syntax for this type of planning implies that an output to language systems in the brain is required for this type of planning (see figure 2.2). This, the explicit language system in humans, may allow working for deferred rewards by enabling use of a one-off, individual plan appropriate for each situation. Another building block for such planning operations in the brain may be the type of short-term memory in which the prefrontal cortex is involved. This short-term memory may be, for example in nonhuman primates, of where in space a response has just been made. A development of this type of short-term
response memory system in humans to enable multiple short-term memories to be held in place correctly (preferably with the temporal order of the different items in the short-term memory coded correctly) may be another building block for the multiple-step ‘‘if . . . then’’ type of computation in order to form a multiple step plan. Such short-term memories are implemented in the (dorsolateral and inferior convexity) prefrontal cortex of nonhuman primates and humans (see Goldman-Rakic 1996; Petrides 1996) and may be part of the reason why prefrontal cortex damage impairs planning (see Shallice 1996). Of these two routes (see figure 2.2), it is the second that I have suggested above is related to consciousness. The hypothesis is that consciousness is the state that arises by virtue of having the ability to think about one’s own thoughts, which has the adaptive value of enabling one to correct long multistep syntactic plans. This latter system is thus the one in which explicit, declarative processing occurs. Processing in this system is frequently associated with reason and rationality, in that many of the consequences of possible actions can be taken into account. The actual computation of how rewarding a particular stimulus or situation is or will be probably still depends on activity in the orbitofrontal and amygdala, as the reward value of stimuli is computed and represented in these regions, and in that it is found that verbalized expressions of the reward (or punishment) value of stimuli are dampened by damage to these systems. (For example, damage to the orbitofrontal cortex renders painful input still identifiable as pain, but without the strong affective, ‘‘unpleasant,’’ reaction to it.) This language system, which enables long-term planning, may be contrasted with the first system, in which behavior is directed at obtaining the stimulus (including the remembered stimulus), which is currently most rewarding, as computed by brain structures that include the orbitofrontal cortex and amygdala. There are outputs from this system, perhaps those directed at the basal ganglia, which do not pass through the language system, and behavior produced in this way is described as implicit—and verbal declarations cannot be made directly about the reasons for the choice made. When verbal declarations are made about decisions made in this first system, those verbal declarations may be confabulations (reasonable explanations or fabrications) of reasons why the choice was made. These reasonable explanations would be generated to be consistent with the sense of continuity and self that is a characteristic of reasoning in the language system.
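A minimal sketch of what such a syntactic, multistep ‘‘if . . . then’’ planner might look like is given below. The rules, the goal, and the depth limit are invented examples used only to illustrate how an ordered sequence of steps, held in something like a short-term memory, allows an immediate reward to be deferred in favor of a longer plan.

```python
# Toy sketch of the second, explicit route: chaining "if ... then ..." rules into
# a multistep plan whose intermediate steps are held, in order, in a short-term
# memory. The rules and goal are invented examples, not claims about the brain.

RULES = [  # (if_state, action, then_state)
    ("hungry", "work for money", "has money"),
    ("has money", "buy food", "has food"),
    ("has food", "eat", "satiated"),
]

def plan(state, goal, depth=5):
    """Depth-limited forward chaining; returns an ordered list of actions."""
    if state == goal:
        return []
    if depth == 0:
        return None
    for if_state, action, then_state in RULES:
        if if_state == state:
            rest = plan(then_state, goal, depth - 1)
            if rest is not None:
                return [action] + rest          # temporal order is preserved
    return None

print(plan("hungry", "satiated"))
# ['work for money', 'buy food', 'eat']: the immediate reward (eating now) is
# deferred while earlier steps of the plan are executed first.
```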
The question then arises of how decisions are made in animals such as humans that have both the implicit, direct-rewardbased, and the explicit, rational, planning systems (see figure 2.2). One particular situation in which the first, implicit system may be especially important is when rapid reactions to stimuli with reward or punishment value must be made, for then the direct connections from structures such as the orbitofrontal cortex to the basal ganglia may allow rapid actions. Another is when there may be too many factors to be taken into account easily by the explicit, rational, planning system, and the implicit system may be used to guide action. In contrast, when the implicit system continually makes errors, it would then be beneficial for the organism to switch from automatic, direct action—based on obtaining what the orbitofrontal cortex system decodes as being the most positively reinforcing choice currently available—to the explicit conscious control system—which can evaluate with its long-term planning algorithms what action should be performed next. Indeed, it would be adaptive for the explicit system to be regularly assessing performance by the more automatic system, and to switch itself into control behavior quite frequently, as otherwise the adaptive value of having the explicit system would be less than optimal. Another factor that may influence the balance between control by the implicit and explicit systems is the presence of pharmacological agents such as alcohol, which may alter the balance toward control by the implicit system, may allow the implicit system to influence more the explanations made by the explicit system, and may within the explicit system alter the relative value it places on caution and restraint versus commitment to a risky action or plan. There may also be a flow of influence from the explicit, verbal system to the implicit system, in that the explicit system may decide on a plan of action or strategy, and exert an influence on the implicit system, which will alter the reinforcement evaluations made by and the signals produced by the implicit system. An example of this might be that if a pregnant woman feels that she would like to escape a cruel mate, but is aware that she may not survive in the jungle, then it would be adaptive if the explicit system could suppress some aspects of her implicit behavior toward her mate, so that she does not give signals that she is displeased with her situation. In the literature on self-deception, it has been suggested that unconscious desires may not be made explicit in consciousness (or may actually be repressed), so as not to compromise the explicit system in what it produces (see, e.g., Alexander
1975, 1979; Trivers 1976, 1985; and the review by Nesse and Lloyd 1992). Another example might be that the explicit system could (because of its long-term plans) influence the implicit system to increase its response to a positive reinforcer. One way in which the explicit system might influence the implicit system is by setting up the conditions in which, when a given stimulus (e.g., person) is present, positive reinforcers are given to facilitate stimulusreinforcement association learning by the implicit system of the person receiving the positive reinforcers. Conversely, the implicit system may influence the explicit system—for example, by highlighting certain stimuli in the environment that are currently associated with reward to guide the attention of the explicit system to such stimuli. However, it may be expected that there is often a conflict between these systems, in that the first, implicit system is able to guide behavior particularly to obtain the greatest immediate reinforcement, whereas the explicit system can potentially enable immediate rewards to be deferred, and longer-term, multistep plans to be formed. This type of conflict will occur in animals with a syntactic planning ability—that is, in humans and any other animals that have the ability to process a series of ‘‘if . . . then’’ stages of planning. This is a property of the human language system, and the extent to which it is a property of nonhuman primates is not yet fully clear. In any case, such conflict may be an important aspect of the operation of at least the human mind, because it is so essential for humans to correctly decide, at every moment, whether to invest in a relationship or a group that may offer long-term benefits, or whether to directly pursue immediate benefits (Nesse and Lloyd 1992). As Nesse and Lloyd (1992) describe, analysts have come to a somewhat similar position, for they hold that intrapsychic conflicts usually seem to have two sides, with impulses on one side and inhibitions on the other. Analysts describe the source of the impulses as the id, and the modules that inhibit the expression of impulses, because of external and internal constraints, the ego and superego, respectively (Leak and Christopher 1982; Trivers 1985; see Nesse and Lloyd 1992, p. 613). The superego can be thought of as the conscience, while the ego is the locus of executive functions that balance satisfaction of impulses with anticipated internal and external costs. A difference of the present position is that it is based on identification of dual routes to action implemented by different systems in the brain, each with its own selective advantage.
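One possible way of expressing the arbitration between the implicit and explicit systems described above in program form is sketched below: the explicit system periodically audits the implicit system's recent error rate and takes control when errors accumulate, and it can also bias the implicit system's reinforcement evaluations top-down. The thresholds, the audit interval, and the bias mechanism are hypothetical illustrations, not parameters proposed in this chapter.

```python
# Sketch of one possible arbitration scheme between the two routes. Thresholds
# and the bias term are hypothetical illustrations.

class DualRouteController:
    def __init__(self, error_threshold=0.3, audit_interval=10):
        self.error_threshold = error_threshold
        self.audit_interval = audit_interval
        self.recent_errors = []     # 1 = implicit choice led to a bad outcome
        self.explicit_bias = {}     # stimulus -> adjustment set by the explicit system
        self.steps = 0

    def implicit_value(self, stimulus, learned_value):
        # Top-down influence: explicit plans can raise or suppress a stimulus's value.
        return learned_value + self.explicit_bias.get(stimulus, 0.0)

    def record_outcome(self, was_error):
        self.recent_errors.append(1 if was_error else 0)
        self.recent_errors = self.recent_errors[-20:]    # sliding window

    def choose_controller(self):
        self.steps += 1
        regular_audit = self.steps % self.audit_interval == 0
        error_rate = sum(self.recent_errors) / max(len(self.recent_errors), 1)
        if regular_audit or error_rate > self.error_threshold:
            return "explicit"    # slow, multistep planning takes control
        # Otherwise the fast, direct route (e.g., orbitofrontal/amygdala outputs
        # to the basal ganglia) keeps control.
        return "implicit"
```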
2.6
Conclusion
This approach leads to an appreciation that in order to understand brain mechanisms of emotion and motivation, it is necessary to understand how the brain decodes the reinforcement value of primary reinforcers, how it performs stimulus-reinforcement association learning to evaluate whether a previously neutral stimulus is associated with reward or punishment and is therefore a goal for action, and how the representations of these neutral sensory stimuli are appropriate as an input to such stimulus-reinforcement learning mechanisms. Some of these issues are considered in The Brain and Emotion (Rolls 1999): for emotion in chapter 4, for feeding in chapter 2, for drinking in chapter 7, and for sexual behavior in chapter 8. The way in which the brain mechanisms and processes that underlie emotion can be understood using the approach described here is also provided in The Brain and Emotion (Rolls 1999).
Acknowledgments
The author has worked on some of the experiments described here with G. C. Baylis, L. L. Baylis, M. J. Burton, H. C. Critchley, M. E. Hasselmo, C. M. Leonard, F. Mora, D. I. Perrett, M. K. Sanghera, T. R. Scott, S. J. Thorpe, and F. A. W. Wilson, and their collaboration—and helpful discussions with or communications from M. Davies and C. C. W. Taylor (Corpus Christi College, Oxford), and M. S. Dawkins—are sincerely acknowledged. Some of the research described was supported by the Medical Research Council.
Discussion
Association, Representation, and Cognition
Ortony: Could you tell us what you mean by ‘‘cognition?’’ Rolls: When we have a primary reinforcer that can influence behavior, such as a taste, then cognitive processing might not be needed to produce a behavioral response to it. If I then have in my association cortex a representation of an object that can be used for many different functions—for example, for emotion, or for some automatic response—that’s a simple step at which to introduce
cognition: as soon as I have a representation that can be used for many things and is not tied to one. Sloman: Application-neutral representation, that’s the important thing! Rolls: But then there are obviously much higher levels of cognitive processing, including, as I have shown in my talk, language processing. If the output of all that cognitive processing is rewarding or punishing, then I say we have an emotional state. But there can be a tremendous amount of cognition involved in calculating whether something is rewarding or punishing. Bellman: You would not consider cognitive the actual processing or establishment of associations? Rolls: Inside our association cortex, we have got representations that can be used for many functions. The lowest level at which I would want to introduce the word ‘‘cognition’’ is for such a representation. So, when you are associating those representations with primary reinforcers, you are doing an association between something that you can say is cognitive and a primary reinforcer. So, cognition does come into emotional learning. Ortony: But we also, of course, have representations of sweetness and all other things we view as primary reinforcers. How are these representations created in your model? And if you do have a representation of sweetness, how come this representation is not involved when you actually taste something that is sweet? Why does it not activate an abstract representation of sweetness—not that I believe this, by the way, I am only pushing you a little—so that a minimum of cognition is involved even there? Rolls: I don’t want to take a strong position on whether we want to call that cognitive or not. Basically, what happens in the taste system is that you go through to the primary cortex, and you represent by populations of neurons what the taste is—sweet, salty, bitter, and sour—in a distributed way, independently of its reward value. If you go one stage on in the taste system towards secondary taste cortex, one synapse away you find something that is completely different: the sweet neurons only respond to the sweet taste if you are hungry, so that they get associated. . . . So we can see where the pleasantness of taste, the reward value, is decoded. I think it’s then a semantic issue whether you want to call that a representation or not. Obviously, it is a representation in the sense of information theory.
Sloman: This functional neutrality is important. For instance, take a trainable neural net which partitions the space of possible input, and when the input is in subset A, it produces response A, and when it is in subset B, it produces response B. Then you can say that the internal state produced by a particular set of inputs is a sort of representation. But this representation has just one effect, namely, to produce that particular output. And that is not a piece of cognition in your sense, although from my point of view, it might be a particular kind of representation.
Consciousness
Ortony: What’s the role of consciousness in all of this? That is, most of us think—at a sort of lay-level, in fact—that a quintessential ingredient of emotion is conscious awareness of being in an affected or disturbed state of some kind. I don’t know what the answer is, but I always think one could do all this without having consciousness and just go directly from the perception of the environment to the emotional response without having that quintessential ingredient in there. Rolls: So, deliberately, I said nothing about that, because that’s obviously a major problem. However, note that, as I said, a lot of our behavior could be reduced to an implicit system, by which I infer that it’s an unconscious system. But then, there is also the second group, the language group, that produces what I might be calling explicit understanding or conscious actions. The language system enables you to work for deferred rewards. Now, for me, consciousness might creep into the machine in the following way: I could imagine the first-order language processor which does ‘‘if-then’’ statements and calculates what counts as a reward. Computationally, there is a credit assignment problem in such a machine: What happens if the fifth step is the one that is responsible for not getting the correct answer? That’s like a credit assignment problem in a multilayer net. So the answer is to play back your reasoning, think about this step, think about the premises for each step, try and work out alternatives. Ortony: I actually did not mean to allude to the consciousness as the conscious selection of a response, but to the phenomenological quality of the emotion per se: the cowering that you get when some large object is approaching you rapidly. I am not concerned about
whether or not the cowering is a conscious response, whether it is consciously determined or something, I assume it isn’t. But feeling fear—it’s the quality of the feeling, if you like, the awareness of that, that I don’t understand. Sloman: But it’s well known that there are kinds of important human emotions of which the person who has the emotion can be unaware. Somebody can be angry, and if you ask him if he is aware of the fact he is angry, he would say that he is not. I believe this is linked to what Edmund was saying about that extra thing which I will, in my talk, call metamanagement. You may have the ability to inspect and evaluate and feel your state, but it does not follow that you always use that ability in every state in which it might be useful to use it. And there might be some case in which your attention is so firmly focused on what’s up there that you do not discover that you are angry or infatuated or even humiliated. Rolls: Right. So, what I am trying to do is to suggest an answer to the credit assignment problem which involves thinking about your own thoughts, and then to say that that state, for me at least, will have to feel like something. Now here we have the question of simple qualia—how does it feel to see something red or to be touched, and so on. I can’t see a function for those independently of anything else that I talked about so far. So, what I then suggest is that, when the machine is developed to the stage where it does feel like something, associated to its thinking about its own thoughts, then occasionally it has to think about these sensory events in the world. I actually hold the view then that these qualia are secondary to the development of such a system that can think about its own thoughts. They arise because they have to get into the computations, and it would not be parsimonious to say that the raw sensory inputs that you are processing about feel like nothing. I understand as parsimonious to say that they feel like something. Now, that argument is original, and I think most people would dismiss it by saying that there’s got to be something that is much more primary, that simple qualia came out much earlier in evolution. Ortony: The third question is: What are we going to do about the
data of Arne Öhman, who shows that, when you give phobics a phobic-relevant stimulus, you get an increase in GSR (galvanic skin response) with absolutely no awareness of the identity of the object. Öhman showed spider-phobics pictures of spiders, subliminally exposed. They had no idea that there were any pictures of
spiders, but their GSR showed the rudiments of a fear response to spiders which they did not show to, say, snakes, and vice versa for snake phobics. And then, of course, you would look at the role of primary reinforcers in your model, and you would say: I wonder where that’s coming from! Is that coming from the input? Joe LeDoux 1 claims it needn’t actually go through the association cortex—that there’s a direct route from the thalamus to the amygdala —it’s a question of time. Sloman: That does not prove that the association cortex is not involved. Ortony: No. The question is: How do you accommodate those data in this model? Rolls: Ok. Exactly like this: In order to see spiders, you have to go to the association cortex. Right? Basically that’s where you have representations of objects, including spiders. From there, you can go through the amygdala, the orbitofrontal cortex, and produce your implicit output. Why is it unconscious for you? Well, in backward masking experiments we have shown that when any cortex process lasts for only about 20 milliseconds, you don’t have any conscious aspect of it. And the reason for that is that it does not get from the association cortex out into the language system, about which you can make conscious thoughts. The language system has a higher threshold, as it were, than these other systems. Why that is so might be interesting, but that’s a side issue. But maybe you want to be real sure before you commit your symbolprocessing system to something, that there is something out there. So, that’s the answer. Ortony: Thank you. Yes, that’s fine. I like that.
Note

1. J. E. LeDoux, The Emotional Brain (New York: Simon and Schuster, 1996).
References

Alexander, R. D. (1975): The Search for a General Theory of Behaviour. Behav. Sci. 20: 77–100.
Alexander, R. D. (1979): Darwinism and Human Affairs. University of Washington Press, Seattle.
Booth, D. A. (1985): Food-Conditioned Eating Preferences and Aversions with Interoceptive Elements: Learned Appetites and Satiates. Ann. N.Y. Acad. Sci. 443: 22–37.
Darwin, C. (1872): The Expression of the Emotions in Man and Animals. University of Chicago Press, Chicago.
Dawkins, R. (1986): The Blind Watchmaker. Longman, Harlow.
Ekman, P. (1982): Emotion in the Human Face. 2nd ed. Cambridge University Press, Cambridge.
Ekman, P. (1993): Facial Expression and Emotion. Am. Psychol. 48: 384–392.
Fridlund, A. J. (1994): Human Facial Expression: An Evolutionary View. Academic Press, New York.
Frijda, N. H. (1986): The Emotions. Cambridge University Press, Cambridge.
Goldman-Rakic, P. S. (1996): The Prefrontal Landscape: Implications of Functional Architecture for Understanding Human Mentation and the Central Executive. Philos. Trans. R. Soc. Lond. B Biol. Sci. 351: 1445–1453.
Gray, J. A. (1975): Elements of a Two-Process Theory of Learning. Academic Press, New York.
Gray, J. A. (1987): The Psychology of Fear and Stress. 2nd ed. Cambridge University Press, Cambridge.
Izard, C. E. (1991): The Psychology of Emotions. Plenum, New York.
James, W. (1884): What Is an Emotion? Mind 9: 188–205.
Krebs, J. R., and Kacelnik, A. (1991): Decision Making. Chap. 4 in J. R. Krebs and N. B. Davies, eds., Behavioural Ecology, 105–136. Blackwell, Oxford.
Lazarus, R. S. (1991): Emotion and Adaptation. Oxford University Press, New York.
Leak, G. K., and Christopher, S. B. (1982): Freudian Psychoanalysis and Sociobiology: A Synthesis. Am. Psychol. 37: 313–322.
Mackintosh, N. J. (1983): Conditioning and Associative Learning. Oxford University Press, Oxford, London, New York.
Millenson, J. R. (1967): Principles of Behavioral Analysis. Macmillan, London.
Nesse, R. M., and Lloyd, A. T. (1992): The Evolution of Psychodynamic Mechanisms. Chap. 17 in J. H. Barkow, L. Cosmides, and J. Tooby, eds., The Adapted Mind, 601–624. Oxford University Press, Oxford, London, New York.
Oatley, K., and Jenkins, J. M. (1996): Understanding Emotions. Blackwell, Oxford.
Petrides, M. (1996): Specialized Systems for the Processing of Mnemonic Information within the Primate Frontal Cortex. Philos. Trans. R. Soc. Lond. B Biol. Sci. 351: 1455–1462.
Rolls, B. J., and Rolls, E. T. (1982): Thirst. Cambridge University Press, Cambridge.
Rolls, E. T. (1975): The Brain and Reward. Pergamon, Oxford, New York.
Rolls, E. T. (1986a): Neural Systems Involved in Emotion in Primates. In R. Plutchik and H. Kellerman, eds., Emotion: Theory, Research, and Experience. Vol. 3, 125–143. Academic Press, New York.
Rolls, E. T. (1986b): A Theory of Emotion, and Its Application to Understanding the Neural Basis of Emotion. In Y. Oomura, ed., Emotions: Neural and Chemical Control, 325–344. Karger, Basel.
Rolls, E. T. (1990): A Theory of Emotion, and Its Application to Understanding the Neural Basis of Emotion. Cogn. Emotion 4: 161–190.
Rolls, E. T. (1999): The Brain and Emotion. Oxford University Press, Oxford, London, New York.
Rolls, E. T. (2000): Précis of The Brain and Emotion. Behav. Brain Sci. 23: 177–233.
Rolls, E. T., and Treves, A. (1998): Neural Networks and Brain Function. Oxford University Press, Oxford, London, New York.
Shallice, T., and Burgess, P. (1996): The Domain of Supervisory Processes and Temporal Organization of Behaviour. Philos. Trans. R. Soc. Lond. B Biol. Sci. 351: 1405–1411.
Strongman, K. T. (1996): The Psychology of Emotion. 4th ed. Wiley, New York.
Tinbergen, N. (1951): The Study of Instinct. Oxford University Press, Oxford, London, New York.
Trivers, R. L. (1976): Foreword. In R. Dawkins, The Selfish Gene. Oxford University Press, Oxford, London, New York.
Trivers, R. L. (1985): Social Evolution. Benjamin Cummings, Menlo Park, Calif.
Weiskrantz, L. (1968): Emotion. In L. Weiskrantz, ed., Analysis of Behavioural Change, 50–90. Harper and Row, New York.
3 How Many Separately Evolved Emotional Beasties Live within Us? Aaron Sloman
Abstract
A problem that bedevils the study of emotions, and the study of consciousness, is that we assume a shared understanding of many everyday concepts, such as ‘‘emotion,’’ ‘‘feeling,’’ ‘‘pleasure,’’ ‘‘pain,’’ ‘‘desire,’’ ‘‘awareness,’’ and so forth. Unfortunately, these concepts are inherently very complex, ill-defined, and used with different meanings by different people. Moreover, this goes unnoticed, so that people think they understand what they are referring to even when their understanding is very unclear. Consequently, there is much discussion that is inherently vague, often at cross-purposes, and with apparent disagreements that arise out of people unwittingly talking about different things. We need a framework that explains how there can be all the diverse phenomena that different people refer to when they talk about emotions and other affective states and processes.

The conjecture on which this chapter is based is that adult humans have a type of information-processing architecture, with components that evolved at different times, including a rich and varied collection of components whose interactions can generate all the sorts of phenomena that different researchers have labeled ‘‘emotions.’’ Within this framework we can provide rational reconstructions of many everyday concepts of mind. We can also allow a variety of different architectures—found in children, brain damaged adults, other animals, robots, software agents, and so on—where different architectures support different classes of states and processes, and therefore different mental ontologies. Thus concepts like ‘‘emotion,’’ ‘‘awareness,’’ and so on will need to be interpreted differently when referring to different architectures. We need to limit the class of architectures under consideration, because for any class of behaviors there are indefinitely many architectures that can produce those behaviors. One important constraint is to consider architectures that might have been produced by biological evolution. This leads to the notion of a human architecture composed of many components that evolved under the influence of the other components as well as environmental needs and pressures. From this viewpoint, a mind is a kind of ecology of coevolved suborganisms acquiring and using different kinds of information and processing it in different ways, sometimes cooperating with one another and sometimes competing.

Within this framework, we can hope to study not only mechanisms underlying affective states and processes, but also other mechanisms that are often studied in isolation—for example, vision, action mechanisms, learning mechanisms, ‘‘alarm’’ mechanisms, and so forth. We can also explain why some models, and corresponding conceptions of emotion, are shallow, whereas others are deeper. Shallow models may be of practical use—for example, in entertainment and interface design. Deeper models are required if we are to understand what we are, how we can go wrong, and so on. This chapter is a snapshot of a long-term project addressing all these issues.
3.1
What Kinds of Emotions?
The study of emotions has recently become fashionable within AI and cognitive science. Unfortunately, all sorts of different things are labeled as ‘‘emotions.’’ This is perhaps understandable among young engineers who have not been trained in philosophy or psychology. However, even among specialists there are many different definitions of ‘‘emotion’’ and related concepts, such as ‘‘feeling,’’ ‘‘affect,’’ ‘‘motivation,’’ ‘‘mood,’’ and so on. For instance, some define emotions in terms of observable physical behaviors (such as weeping, grimacing, smiling, jumping for joy, etc.). Some define them in terms of measurable physiological changes—which need not be easily discernible externally, though they may be sensed internally (referred to by Picard 1997 as sentic modulation). Some define them in terms of the kinds of conscious experiences involved in having them—their phenomenology. Some define them in terms of the brain mechanisms that may be activated. Even when behavioral manifestations do occur, they may be to some extent culturally determined, casting doubt on behavioral criteria for emotions. For instance, the sounds people make when exhibiting pain can vary according to culture: ‘‘ouch’’ in English is replaced by ‘‘eina’’ in Afrikaans! Some researchers regard emotions as inherently social or cultural in nature, though this may be more true of having a guilty conscience than being terrified during an earthquake.
There is also disagreement over what sorts of evidence can be taken as relevant to the study of emotions. For instance, some will regard the behavior of skilled actors when asked to show certain emotions as demonstrating connections between emotions and externally observable behavior. Others will object that that merely reveals what happens when people are asked to act as if they had certain emotions, whereas naturally occurring emotions may be quite different. In some cases they may have no external manifestations, because people can often conceal their emotions. For some researchers, emotions, by definition, are linked to and differentiable in observable behavior, like weeping, grimacing, jumping for joy, growing tense, and so on, whereas other researchers are more interested in semantically rich emotions for which there are no characteristic, nonverbal, behavioral expressions (e.g., being worried that your work is not appreciated by your colleagues versus being worried that your political party is going to lose the next election, or being delighted that there is a sunny weather forecast for the day you have planned a picnic versus being delighted that someone you admire very much is impressed by your research, etc.). Most of the empirical, laboratory research on emotions has studied only simple, shallow emotions—largely ignoring semantic content—whereas most of the important human emotions (the ones that are important in our social lives, and which are the subject matter of gossip, poems, stories, plays, etc.) are deep and semantically rich.

Another common difficulty is that some people use the word emotion so loosely that it covers almost any affective state, including having a desire or motive, whereas in ordinary parlance we do not normally describe someone as being emotional just because they have goals, purposes, or preferences, or because they are enjoying a meal or finding their chair uncomfortable to sit in. If all such affective states were included as emotions, it would follow that people constantly have a large number of different emotions, because we all have multiple enduring goals, ambitions, tastes, preferences, ideals, and so on.

Another source of confusion concerns whether having an emotion necessarily involves being conscious of the emotion. According to some, this is a defining criterion, yet that does not square with the common observation that people can sometimes be angry, jealous, infatuated, or pleased at being flattered, and so on without being aware of being so, even though it may be obvious to others.
Another problem with the criterion is that it may rule out certain animals having emotions if they lack the ability to monitor and characterize their own states or lack the conceptual framework required to classify some states as emotions. Presumably, a newborn infant cannot classify its own mental states using our adult categories. Does that mean that it has no emotions? Perhaps it h a s them but does not feel them? Perhaps an infant’s behavioral manifestations of pain, distress, discomfort, and pleasure are simply part of the biologically important process of generating appropriate nurturing behavior in parents rather than being expressions of what the infant is aware of? There is no obvious way of resolving disagreements on these issues because of the ambiguities and confusion in the key concepts used. Yet another confusion concerns whether, in order to have emotions, an organism or machine must contain an emotion-producing module of some kind, or whether some or all emotions are simply states involving interactions between a host of processes that are not intrinsically emotional, as was argued in Wright, Sloman, and Beaudoin (1996). On the first view, it makes sense to ask how the emotion mechanism evolved and what biological function it has, whereas on the second view such questions make no sense. Another possibility is that the ambiguous word emotion sometimes refers to states and processes conforming to the first view, and sometimes to the second, because our usage is inconsistent. Because of this conceptual mess, anyone can produce a program, label some component of it the ‘‘emotion module’’ and proudly announce that they have developed a robot or software agent that has emotions. It will be hard to argue against such claims when there is no agreement on what emotions are. This is an extreme form of the phenomenon in AI of attributing mental states and human capabilities to programs on the basis of very shallow analogies, for which McDermott (1981) chided the AI community many years ago, though he was concerned with the undisciplined use of labels such as ‘‘plan,’’ ‘‘goal,’’ ‘‘infer.’’
3.2
A Modest Proposal
Is there any way to remove, or at least reduce, the muddle and confusion? Our proposal is to step back from the problem of defining emotions, or other affective states, and then ask two related questions. The first question is what sorts of animals or artifacts
we may be talking about, where different sorts are defined by their information processing architectures: the varieties of mechanisms and submechanisms they have for acquiring, interpreting, transforming, storing, and using information, including making decisions, initiating actions, generating and pursuing goals, and doing various kinds of self-monitoring. It is important that the kind of architecture under discussion is not defined by physical mechanisms and their organization, but rather by what a software engineer might refer to as a virtual machine architecture, which need have no simple relationship to the underlying physical architecture. The second question can then be asked about each sort of architecture: what sorts of states and processes can occur in animals or artifacts that have this architecture? We may find that different kinds of systems can produce different sorts of things that might loosely be called emotions. Having distinguished these different sorts of cases, we can then replace questions like ‘‘Can computer-based robots have emotions?’’ ‘‘Can software agents have emotions?’’ ‘‘Can insects have emotions?’’ ‘‘Can newborn infants have emotions?’’ with questions like ‘‘What sorts of emotions can various sorts of robots or software agents have?’’ ‘‘Which kinds of insects can have which kinds of emotions?’’ ‘‘Which kinds of emotions can newborn infants have, if any?’’ We can then perhaps define a large number of different sorts of emotions: E 1 , E2, E3, . . . , E25, . . . , E73, . . . , related to different types of architectures and different states and processes that can occur within those architectures. Moreover, having defined a certain class of emotions, E25, in terms of the architectures that make them possible, we can then ask ‘‘Which kinds of animals, or computing systems, can have emotions of type E25?’’ All this assumes that the characterization of types of organisms that is potentially the most fruitful for philosophical understanding, for scientific advance, and for applications such as therapy and education, is one that refers to the organism’s or machine’s information processing architecture. This amounts to adopting what Dennett (1978) referred to as ‘‘the design stance.’’ It is different from ‘‘the physical stance,’’ which describes the physical components and organization of the system, and ‘‘the intentional stance,’’ which ignores internal mechanisms and merely assumes that it is predictively useful to regard certain behaving systems as acting rationally on the basis of beliefs, desires, intentions, and so on. 1
The idea of explaining mental phenomena in terms of information processing architectures that support them is not new. For instance, Herbert Simon’s highly recommended paper (Simon 1967), written in the early 1960s, makes a start on the project of analyzing requirements for an architecture capable of supporting human emotions and motivations. Since then, there have been many proposals regarding architectures for intelligent systems, including, for instance, those described in the books by Albus (1981), Minsky (1987), Russell and Norvig (1995), Dennett (1996), Nilsson (1998), and my own early primitive efforts in my 1978 book. Different architectures will support different collections of states and processes: different mental ontologies. Using the design stance, we can then define different sorts of emotions, different kinds of awareness, different kinds of learning, different kinds of intentionality, and so forth in the context of the architectures that produce them. Then, vague and fruitless debates about which animals and which machines can have emotions or consciousness can be replaced by more productive factual investigations: we can ask which kinds of animals and machines can have which kinds of emotions, which kinds of awareness, and so on. So if we define twenty-five classes of emotions produced by different architectures, we may find that normal adult humans can have one subset, newborn infants another subset, chimpanzees yet another subset, particular sorts of robots yet another subset, and so on. For this activity, we do not need to assume that the systems we are analyzing are wholly or even partly rational, as would be required if we adopted the intentional stance, or if we attempted to build theories conforming to the ‘‘knowledge level’’ defined in Newell (1982). All this may help to counteract the fast-growing practice of publishing papers reporting yet another robot or softbot ‘‘with emotions’’ where ‘‘emotion’’ is defined so as to permit the author’s claim (often in ignorance of McDermott’s warnings). Instead, authors will have to specify precisely which class of emotions they are referring to. We shall find that some are far less interesting than others (e.g., because they depend on relatively shallow architectures), though they may be of some use for entertainment or educational purposes, or enhancing a user interface for inexperienced users. 2
We can also explain, from this standpoint, the proliferation of definitions of emotion and theories explaining how they work. Human minds have extraordinarily complex architectures, with components whose effects are not clearly separable without a deep theory of the architecture. People with a limited view of the system will unwittingly focus on different aspects of the complete system, unaware of what they are leaving out, like the proverbial ten blind men saying what an elephant is on the basis of which part they can feel (tusk, trunk, ear, leg, belly, tail, etc.). Each has a small part of a large and complex truth. The same is true of most theories of emotions and, more generally, most theories of mind.
3.3
Phenomena to Be Explained
In parallel with producing theories about explanatory architectures, it is helpful to collect examples of many types of actual phenomena, so that we have a broad range of cases for the theories to explain. Of course, we may misdescribe some of the phenomena if we don't yet have good theories on which to base our categorizations. So we should try to use descriptions that avoid premature commitment to a particular type of explanatory architecture. An example of a fairly neutral description that would be applicable to many animals might be: ‘‘When a large object is detected rapidly moving toward an organism, it will rapidly switch to some kind of evasive action.’’ We do not need to assume that such generalizations express universal laws: They will have many exceptions, including individuals who are injured or not fully awake, or who freeze instead of jumping to one side. There is also an inherent vagueness in the terms ‘‘large’’ and ‘‘rapidly.’’ So, because small or slow moving objects may trigger different kinds of behavior (e.g., repulsive, blocking, or catching behavior), the generalization may have exceptions relating to intermediate cases, which are neither definitely small nor definitely large, or neither definitely rapid nor slow in their motion.

The phenomena to be explained can include both commonplace examples (which most people know about from their own experience, observing others, gossiping, reading novels, etc.) and also phenomena that are discovered by experts from field studies or in various ‘‘unnatural’’ laboratory situations, and also information
derived from studies of effects of brain damage, or in ethological studies of other animals. We should try to build a theory that explains all such phenomena, while respecting constraints from neuroscience, psychology, biological evolution, as well as what has been learned from computer science, software engineering, and AI about feasibility, tractability, and so on of various solutions to problems. Of course, there may be disagreements regarding which phenomena to include as initial constraints. Some researchers may wish to work bottom up from the most fine grained empirical observations of behavioral details, or physiological changes, or brain structures and processes, while others may wish to start from more global characterizations of behaviors and their interactions and the global designs that might explain some of their features. The former involves attempting to design mechanisms that replicate or model the detailed phenomena, worrying later about how to put them together in the design of a complete organism, whereas the latter approach involves attempting to design a complete working system with a wide range of capabilities that at first merely approximate to the original phenomena, and then progressively adding more and more detailed mechanisms to match the finer grained requirements. A mixture of bottom-up and top-down approaches is also possible. Our own work is primarily top down (aiming simultaneously for progressive deepening and progressively increasing breadth of the architecture), though we attempt to learn from those working in bottom-up mode—especially people doing research on brain functions and mechanisms.
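As an illustration of how neutral such a description can remain while still being precise enough to implement, the generalization quoted above might be rendered as a defeasible reactive rule of the following kind. The numerical thresholds standing in for ‘‘large’’ and ‘‘rapidly,’’ and the particular exceptions listed, are arbitrary choices made only for this sketch.

```python
# Sketch of the "fairly neutral" generalization as a defeasible reactive rule.
# Thresholds and exceptions are illustrative choices only; the text stresses
# that such terms are inherently vague and that the generalization admits
# exceptions.

def evasive_response(apparent_size, approach_speed, awake=True, injured=False):
    LARGE = 0.5      # fraction of visual field (arbitrary stand-in for "large")
    RAPID = 2.0      # size-doubling rate per second (stand-in for "rapidly")
    if not awake or injured:
        return None                 # known exceptions: no reliable response
    if apparent_size > LARGE and approach_speed > RAPID:
        return "evade"              # e.g., jump aside (or, in some cases, freeze)
    if apparent_size <= LARGE and approach_speed > RAPID:
        return "block_or_catch"     # small fast objects may trigger other behavior
    return None

print(evasive_response(0.7, 3.0))   # 'evade'
print(evasive_response(0.2, 3.0))   # 'block_or_catch'
```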
3.4
Pitfalls
A danger in the bottom-up approach is that the mechanisms produced may be capable of performing very specific tasks but be incapable of contributing adequately to the functioning of a larger architecture. An example might be a visual feature detector that analyzes an input image and produces as output another image with the required features (e.g., edges) highlighted. This would be an inadequate mechanism if a complete visual system requires the output of an edge detector to be something quite unlike a modified image. A danger in the top-down approach (sometimes referred to, following Bates, Loyall, and Reilly 1991, as the ‘‘broad and shallow
approach’’) is that it can lead to designs that are incapable of being refined in ways that match the fine grained detailed specifications added later. For example, suppose a model of a complete agent includes a specific submechanism described as an ‘‘emotion generator,’’ perhaps assigning varying numerical values to a collection of emotion descriptors. It could turn out that this submechanism was totally spurious if the actual emotional phenomena found in organisms are not produced by a specific emotion generator but are emergent features of the interactions of other mechanisms with different functions. (An analogy would be supposing that the thrashing of an operating system had to be produced by a ‘‘thrash-generator’’ mechanism, rather than being an emergent feature of the interactions of other mechanisms under conditions of heavy load; see section 3.12 below, the subsection entitled Emergence versus Boxes.)

A danger common to all approaches is that they can lead to excessive focus on one particular design or architecture. This could be a serious mistake if there is considerable variation in architectures (e.g., between different kinds of animals or even between humans at different stages of development). For instance, it is likely that newborn human infants do not have the same sort of architecture as adults. Apart from the fact that any particular design might correspond to only a small subset of the naturally occurring cases, it is also true that we cannot fully understand the functional importance of particular designs unless we compare and contrast them with alternative designs. In other words, for a full understanding of a particular architecture, we need to explore a neighborhood in ‘‘design space’’ (Sloman 1993b). More specifically, we should explore architectural variation:

- Across the natural/artificial divide
- Across species
- Within species
- Within an individual during normal development
- Caused by brain damage
3.5
Objectives of the Analysis
In what follows, I shall sketch in schematic outline a hypothesized architecture with components that have evolved at different times, with older portions found in many organisms and newer portions in relatively few animals. It is conjectured that the oldest layer is
a purely reactive layer. A newer portion is a deliberative layer capable of creating descriptions of possible but not yet actual situations (i.e., doing ‘‘what-if’’ reasoning). The newest portion is the metamanagement (or reflective) layer, which is a collection of mechanisms for monitoring, categorizing, evaluating, and (to some extent) controlling the rest of the system. The third layer provides a type of self-consciousness that was previously lacking. We do not share the assumption made by Rolls (1999) that the third layer necessarily requires the use of a humanlike external language, though we acknowledge that its functionality will be considerably modified by the existence of such a language. It is possible that some other primates have metamanagement capabilities without having an external, humanlike language. However, all information processing mechanisms, in all animals, require some type of formalism for encoding information—whether it is factual information or control information. How many such formalisms there are, on what dimensions they vary, and what they are useful for is an important (and difficult) topic of ongoing research (e.g., Chandrasekaran, Glasgow, and Narayanan 1995; Karmiloff-Smith 1996; Peterson 1996; Sloman 1996b).
Architecture-Based Concepts of Emotions
On the basis of the hypothesized architecture, it is possible to produce a provisional crude classification of many familiar types of emotional processes into three different classes: (a) primary emotions, which depend only on the oldest reactive subsystems; (b) secondary emotions, which depend in part on newer deliberative mechanisms; and (c) tertiary emotions (previously referred to as ‘‘perturbances’’ in Wright, Sloman, and Beaudoin 1996), which depend on interactions with the recently evolved reflective or ‘‘metamanagement’’ layer (argued for in Beaudoin 1994). Within the three classes, we can distinguish further subclasses depending on precisely which submechanisms are involved. We can also distinguish different types of perceptual mechanisms of varying degrees of sophistication and different types of action mechanisms. Then, just as different classes of emotions are supported by different architectures, so also are different classes of perceptions and actions—some much richer and more flexible than others. Likewise, if there are different sorts of mechanisms for acquiring, storing, and later on reusing information, we can use the
differences to define different classes of learning and then ask which classes occur in which animals, or which occur at different stages in human development. All these distinctions provide a framework in which concepts of different sorts of mental states and processes are defined in terms of the kinds of information processing architectures they presuppose and the processes that occur in those architectures. By specifying architectures at the virtual machine level—familiar to software engineers—rather than at the physical level, and defining mental concepts in terms of states and processes within virtual machines, not physical machines, we avoid mistakes of materialist reductionism, though that is not a topic for this chapter.
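The following sketch shows the kind of bookkeeping this framework suggests: each class of emotion is defined by the layers of the architecture it presupposes, and an architecture supports a class only if it includes all the required layers. The layer names follow the text; the example architectures and their layer assignments are invented for illustration.

```python
# Sketch of the architecture-based classification just outlined. Layer names
# follow the chapter; the example architectures and helper names are invented
# illustrations, not Sloman's taxonomy.

LAYERS = ("reactive", "deliberative", "metamanagement")

EMOTION_CLASSES = {
    "primary": {"reactive"},
    "secondary": {"reactive", "deliberative"},
    "tertiary": {"reactive", "deliberative", "metamanagement"},   # "perturbances"
}

EXAMPLE_ARCHITECTURES = {        # hypothetical assignments, for illustration only
    "insect_like_agent": {"reactive"},
    "planning_softbot": {"reactive", "deliberative"},
    "adult_human": set(LAYERS),
}

def supported_emotion_classes(architecture):
    """Which classes of emotion can arise in an agent with these layers?"""
    layers = EXAMPLE_ARCHITECTURES[architecture]
    return [cls for cls, required in EMOTION_CLASSES.items() if required <= layers]

for name in EXAMPLE_ARCHITECTURES:
    print(name, "->", supported_emotion_classes(name))
```

On this scheme, the question ‘‘can agent X have emotions?’’ is replaced by a query over the classes its architecture supports, in the spirit of the earlier discussion.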
Types of Variation of Architectures
There are several different ways in which submechanisms in an architecture may vary from one design to another. At the lowest level, there may be differences in the type of physical implementation—for example, whether the system uses biological mechanisms, such as neurons, or computer-based mechanisms. In the latter case, there may be differences between systems implemented directly in hardware and those implemented using software that is interpreted by other software or possibly directly by the hardware. At present, we know very little about the implications of using one or another form of bottom-level implementation, though much prejudice abounds as to whether using nonbiological components does or does not rule out accurate replication or simulation of h u m a n mental states and processes. (For many people, the desire for a particular answer to be correct gets in the way of investigating which answer is actually correct.) Whatever the lowest level implementations used, there are also differences between the forms of representation available to the system—for example: whether it simply has a fixed collection of state registers whose values can vary over time; whether it is capable of using distributed overlaid representations, such as the rules encoded nonlocally in weights in certain kinds of neural nets; whether it can use forms of representation with a recursive syntax, so that structures of varying degrees of complexity can be created as needed; whether it can use hybrid modes of representation; whether it can discover the need for new forms of representation and start using them, and so on. Moreover, for a given general type
of representation there may be considerable variations in the type of semantic content that organisms can use those representations to encode. In particular, organisms that need different information about the environment will typically be restricted in what sorts of objects, properties, relations, and so on they can refer to. That is, there may be differences in the ontologies available to different organisms, or to different components of an organism. For instance, a chimpanzee and a flea close to it can be expected to be capable of acquiring, storing, and using quite different sorts of information about the environment. This is partly because they need different sorts of information. Less obviously, similar comments may be applicable to different subcomponents within the same organism (e.g., components that evolved at different times and perform different functions). For instance, within humans, one of the functions of our visual system is to provide information for posture control mechanisms, including information about optical flow patterns in the visual field, which can be clues to whether the individual has begun to fall forward or backward. Such information may be directly encoded as control signals causing various muscles to contract or relax to correct the detected motion. The posture control subsystem may be incapable of receiving or using any other kind of information about what is in the environment. Similarly, where complex actions, such as grasping or catching a moving object, require fine-grained control of motion using a visuomotor feedback loop, the information detected by the visual system for this purpose may be quite different from the information needed for another purpose—such as formulating a plan of future action, or recognizing a person’s facial expression. The different kinds of information may be processed by different subsystems even though they all get their information from the same optic array. (These issues are discussed in more detail in Sloman 1989, 1993a, 1996a.) In section 3.12, we shall return to a more detailed discussion of dimensions of variation of architectures, and ambiguities in descriptions of architectures by different researchers.
3.6 Can We Expect Intelligible Structure?
It is possible in principle that any attempt to build a theory of the human information processing architecture is doomed because the
Figure 3.1 Could the architecture be an unintelligible mess? The ovals represent processing components and the shaded rectangles information stores or buffers. In principle, there could be very large numbers of such components interconnected in such a complex and unstructured fashion that humans could never comprehend it.
actual architecture is a completely unstructured mess, as suggested in figure 3.1. However, I suggest that in producing a really complex, multifunction design, evolution, like a human designer, will be constrained to use a modular organization—so that changes to improve one module will not impact disastrously on the functionality of other modules, and also so that some economy of genetic encoding is possible for a design using several copies of approximately the same module. (This argument does not apply to evolution of relatively simple designs, where there is less need for functional decomposition to aid the search for a solution.) It is important that in this context modular does not mean or imply rigid or innate. As Simon (1969) pointed out, the boundaries between modules need not be sharp and clear: a complex system may be ‘‘nearly decomposable.’’ Moreover, during learning and development, new modules may be added and boundaries can change. Despite these qualifications, the claim that the design is modular contrasts with the claim that there is no intelligible structure in the architecture. An intermediate position can be found in Fodor (1983), where it was suggested that the central cognitive component of the human information processing architecture is an unintelligible mess, although it has various ‘‘encapsulated’’ sensory and motor
Figure 3.2 Fodor’s modular ‘‘sunflower’’ architecture. Fodor’s 1983 book suggested that humans have a collection of ‘‘encapsulated’’ sensory modules, shown as S1, S2, etc. feeding information into a complex and messy central cognitive system, which in turn sends signals to a collection of encapsulated motor mechanisms.
modules connected to it as indicated schematically in figure 3.2. Fodor’s idea was that the encapsulated components were largely determined innately, and their mode of functioning could not be modified by cognitive processes. In Sloman (1989) I discussed and criticized Fodor’s model along with the theory of vision proposed by David Marr (1982) in his very influential book, which proposed that a visual system functions largely in data-driven mode, extracting information from the visual array, which is then used to construct descriptions of objects in the environment in terms of their geometrical and physical properties (e.g., shape, location, motion, color, texture, etc.). Marr’s work essentially took one of the petals from Fodor’s sunflower and expanded it by dividing it into processing layers through which data flowed, starting from retinal images passing through various intermediate databases—such as a database of edge features, a binocular disparity map, an optical flow map, a texture feature map, a primal sketch, a 2.5-D surface-feature database giving orientations of surface fragments, viewer-centered object models, and scene-centered object descriptions—ultimately feeding into the cognitive system information about the shapes, motion, color, and texture of objects in the environment, along with hierarchical structural descriptions in some cases, and if object-types are recognized, their categories (e.g., ‘‘dog,’’ ‘‘horse,’’ ‘‘bucket’’). This one-way, data-driven, geometry/physics-based view of a visual system’s architecture can be contrasted both with models (e.g., Albus 1981; Sloman 1989), which allow perceptual systems to be partly driven in top-down mode, influenced by current needs and expectations, and also with the ideas of the psychologist Gibson (1979),
Figure 3.3 Various kinds of visual ambiguity. The Necker cube has two interpretations whose differences are purely geometrical (involving distances and angles). The vase-face figure has interpretations differing both in subtle figure-ground relationships, and also in which objects are recognized. The duck-rabbit figure also has an ambiguity regarding the direction in which the perceived animal is facing, a very abstract property which could be perceptually relevant to predators or prey—i.e., an affordance.
according to which, instead of being constrained to produce descriptions of the geometrical and physical properties of objects in the environment, a sensory system can produce descriptions that relate to the needs and capabilities of the organism. Gibson called these ‘‘affordances.’’ For example, vision may inform an animal of positive affordances (such as support, graspability, the possibility of passage, access to food, etc.) and negative affordances (such as obstruction, risk of injury, etc.). The richness and diversity of the output of a human visual system can be seen both from the descriptions we use of how things look (e.g., ‘‘He looks angry,’’ ‘‘That spider web looks fragile,’’ ‘‘That bar looks as if it is holding up the shelf,’’ ‘‘That knife looks dangerous,’’ ‘‘That equation looks solvable’’) and from the varieties of visual ambiguities in pictures illustrated in figure 3.3 and others found in textbooks on vision (such as Frisby 1979). This suggests that a visual system can include many subsystems able to detect and report information of very different kinds. In Sloman (1989), I proposed that some of those subsystems, instead of producing descriptions, produced control signals—for example, signals to trigger saccades or signals to the posture control system. Some of these different visual subsystems evolved early in our evolutionary history. Some evolved much later. Some—for example, modules for sight-reading music—are produced by individual learning rather than evolution. Later, I will relate all this to a conjectured architecture for adult humans.
3.7 A Schematic Overview of the Architecture
The particular sort of architecture we have been developing within the Cognition and Affect project 3 is rather complex and hard to understand. In what follows, I shall start from two simplified views of the architecture and then combine them in such a way as to provide a mnemonic overview. However, the combined picture is still an oversimplification, as various additional modules are needed to complete the specification. These will be described very briefly, and the whole thing related to various kinds of mental states and processes to be found in humans. We can also view other animals as sharing different subsets of the architecture, though possibly with different design and implementation details because they have to operate with different larger systems. For instance, mechanisms within a bird of prey for perceiving and using optical flow may be different from the optical flow mechanisms in a tiger or a human.
The Triple Tower Model
The first schematic view of a behaving system is based on the notion of flow of information through an organism (or robot) from sensors to motors, with some intervening processing mechanisms. A simplified version of this model is shown in figure 3.4. It assumes a division between sensory mechanisms, central processing mechanisms (possibly including changeable long-term states), and action mechanisms. Models of this type can vary according to how ‘‘thick’’ the perceptual and action pillars are (e.g., how much processing occurs between physical input and creation of central representations) and also whether the perceptual mechanisms are stratified (i.e., with different types of sensory or motor processing occurring in parallel, dealing with different levels of abstraction). In very simple cases (single-celled organisms?) a perceptual mechanism will be little more than a physical transducer, transforming external energy into a form that can be processed internally, and likewise action mechanisms may simply consist of motors that receive signals from the center, perhaps in the form of control vectors. We have already suggested that in humans visual processing is far more complicated than that, including at least the sorts of layers of analysis and interpretation described by Marr (1982), and possibly also additional layers allowing central control
Figure 3.4 The ‘‘triple tower’’ model of information flow. This is a modified version of the ‘‘triple tower’’ model in Nilsson (1998). Arrows represent flow of descriptive or control information. Such models can vary according to how much and what kind of processing is allowed in the perceptual and action modules, and whether information flow is unidirectional or includes internal feedback or ‘‘hypothesis-driven’’ or ‘‘goal-driven’’ control.
or modulation of the processing, so that the visual system is not an encapsulated package as suggested by Fodor (1983). Likewise, action mechanisms may include not only direct and simple signals to motors, but also, in sophisticated organisms, a control hierarchy in which high level symbolic instructions can be decomposed into instructions for several subsystems to execute concurrently. There may be some organisms that are too simple or unstructured to be usefully represented by the triple tower model (e.g., single-celled organisms). This chapter does not discuss them, though a more complete investigation of architectures should.
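The triple-tower division can be made concrete with a minimal sketch. The following Python fragment is purely illustrative: the class names, the toy ‘‘looming’’ test, and the string commands are assumptions introduced here for exposition, not part of the model described in the text.

```python
# A minimal sketch of the "triple tower" division: sensory tower ->
# central tower -> action tower. The transduction and decision rules
# below are invented toy examples.

class SensoryTower:
    """Transforms a raw external signal into an internal percept."""
    def perceive(self, raw_signal: float) -> dict:
        # In a very simple organism this is little more than transduction.
        return {"intensity": raw_signal, "looming": raw_signal > 0.8}


class CentralTower:
    """Central processing: maps percepts to action requests."""
    def decide(self, percept: dict) -> str:
        return "retreat" if percept["looming"] else "continue"


class ActionTower:
    """Turns action requests into (here, purely symbolic) motor signals."""
    def act(self, command: str) -> str:
        return f"motor:{command}"


def run_cycle(signal: float) -> str:
    sensed = SensoryTower().perceive(signal)
    chosen = CentralTower().decide(sensed)
    return ActionTower().act(chosen)


if __name__ == "__main__":
    print(run_cycle(0.3))  # motor:continue
    print(run_cycle(0.9))  # motor:retreat
```

How ‘‘thick’’ each tower is, and whether several such pipelines run in parallel at different levels of abstraction, is exactly the kind of design variation the text describes.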
Multilevel Systems
Another common architectural partition, depicted in figure 3.5, is concerned with a collection of processing layers that lie between sensors and motors and operate at different levels, where the levels may differ in type of processing, or in degree of power over lower levels, or in other respects. The idea is quite old in neuroscience. For example, Albus (1981, p. 184) presents the notion of a layered ‘‘triune’’ brain with a reptilian lowest level and two more recently evolved (old and new mammalian) levels above that. AI
Figure 3.5 A triple-layer model. Multilayer models can vary according to the mechanisms and functions of the different layers, what sorts of descriptive and control information can flow up and down between the layers, whether the layers operate concurrently and asynchronously, and whether there is some sort of dominance hierarchy, e.g., with high levels dominating lower levels. Various levels can be added, e.g., by splitting off a ‘‘reflex action’’ level below the reactive layer as in Davis (1996), or by splitting the top layer into different layers. Albus (1981) proposes a dominant top level labeled ‘‘Will.’’ Our model assumes a division related to evolutionary age, mechanisms used, and functional role, with all layers operating in parallel, asynchronously, all with sensory input and motor output connections. Differences between the layers are explained later.
researchers have been exploring a number of variants, of varying sophistication and plausibility, and varying kinds of control relations between layers. The ‘‘subsumption hierarchy’’ in Brooks (1991) is one of many examples (cf. Minsky 1987; Nilsson 1998). In some theories that propose different architectural layers, it is assumed that information comes from sensors via the lowest level, which then passes information up the hierarchy until the topmost level takes major decisions about what to do next and sends control signals down the hierarchy to the lowest level, which then drives motors, a sort of omega (Ω) model of information flow (e.g., Nilsson 1998, figure 15.2). By vertically stretching the central circle in Fodor’s architecture (figure 3.2) and adding some internal structure, we can see that the omega architecture and Fodor’s model have much in common. Many such systems are based on engineering designs motivated by some set of practical requirements. Our architecture is motivated largely by analysis of requirements for living organisms, especially humans, including the requirement that the system should have been produced by evolution (i.e., via an e-trajectory, in the terminology of Sloman 1998b).
Humans are able to carry out many skilled tasks involving perception and action (e.g., walking, driving a car, or operating a machine) at the same time as performing other more cognitively demanding tasks such as deliberating about where we should go next, or holding a conversation, or trying to assess and evaluate our state of mind (e.g., ‘‘Am I feeling too tired to go on driving?’’). This suggests that there are subsystems that can to some extent behave as complete organisms, acting in parallel with other suborganisms. In previous papers (e.g., Beaudoin 1994; Wright, Sloman, and Beaudoin 1996; Sloman 1998b, 2000a) we have explained some of the main features of the reactive, deliberative, and metamanagement (reflective) layers. Within this sort of architecture it is not assumed that higher level systems can totally dominate the others, though they may have partial control over them. This distinguishes our system from a ‘‘subsumption’’ architecture (Brooks 1991). Another difference is the ability of the second layer to perform ‘‘what-if’’ reasoning about chains of possible future actions in order to form plans along with the closely related ability to reason about possible past actions in order to form either explanations or analyses of previous successes and failures. It is worth emphasizing that although only one of the three layers is reactive, everything ultimately has to be implemented in reactive mechanisms. It should not surprise anyone familiar with computing systems that a mechanism of a certain type can be implemented in a mechanism of a very different type. In particular, mechanisms that consider actions before performing them can be implemented in mechanisms that do no such thing.
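As a rough illustration of three layers operating concurrently and asynchronously, with no layer fully dominating the others, consider the sketch below. The round-robin scheduler, the toy behaviors, and the dictionary representing the world are invented for exposition; the chapter specifies only the functional division between reactive, deliberative, and metamanagement processing.

```python
# Illustrative sketch of three layers stepping concurrently.
# All concrete behaviors here are assumptions, not the CogAff design.

class ReactiveLayer:
    def step(self, world):
        # Fast pattern -> response; no look-ahead.
        return "swerve" if world.get("obstacle") else "keep-going"

class DeliberativeLayer:
    def step(self, world):
        # Slow "what-if" exploration of a (toy) plan space.
        if world.get("goal") == "food" and not world.get("plan"):
            world["plan"] = ["walk-to-tree", "pick-fruit", "eat"]
        return world.get("plan") or "no plan yet"

class MetaManagementLayer:
    def step(self, world):
        # Monitors internal state, e.g. notices repeated failures.
        if world.get("failures", 0) > 2:
            world["plan"] = None          # force re-planning
            return "strategy abandoned"
        return "ok"

def run(world, ticks=3):
    layers = [ReactiveLayer(), DeliberativeLayer(), MetaManagementLayer()]
    for t in range(ticks):
        # Every layer gets a turn each tick; none has total control.
        for layer in layers:
            print(t, type(layer).__name__, layer.step(world))

if __name__ == "__main__":
    run({"goal": "food", "obstacle": False, "failures": 0})
```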
Combining the Two Views
To indicate how the triple tower and triple layer models can be combined, we superimpose the previous diagrams to produce figure 3.6, which indicates more clearly that both the sensory and the action subsystems may have different components operating in parallel at different levels of abstraction and feeding information to or getting information from different levels in the central three-layered system. The general idea is that as higher level cognitive functions developed, they required more sophisticated perceptual input, producing evolutionary pressure for the development of perceptual
Figure 3.6 Combining the tower and layer views. This gives a schematic overview of some of the functional differentiation within an architecture. Components not shown are described in the text and later figures.
systems able to operate with more abstract categories, as described above. For example, Rolls (1999) remarks that view-invariant object recognition (perhaps using object-centered forms of representation) is much less developed in nonprimates (e.g., rodents). The more abstract representations have the advantage of generalizing over a wider range of conditions (along with the disadvantage of providing less information about the details of any particular instance, though that might be compensated for by additional mechanisms operating in parallel). The more abstract percepts are needed for chunking of information in a form that supports the learning of associations that can be useful in deliberative reasoning. If handled by fast dedicated perceptual mechanisms, they can support faster high level reactions, compared with having to apply general purpose inference mechanisms to results of low level processing. Such perceptual mechanisms can process biologically important information that is often required—for example, whether another animal is angry, pleased, afraid, or where it is looking (as in figure 3.3). Some of the more abstract perceptual tasks may be given very specific submechanisms dedicated to those tasks (e.g., face recognition in humans). (Precisely how all this high level functionality is achieved remains a hard research problem, especially the problem of characterizing the many functions of vision and how they are achieved.)
Similarly, it may be useful if higher level mechanisms are able to give abstract commands to an action system, leaving it to highly trained or evolved submechanisms in an action control hierarchy to produce the detailed execution, sometimes simply by modulating actions generated by lower levels (walking faster to catch up with someone), and sometimes by replacing them (e.g., stopping to ask someone the way). To illustrate the power of specialized action routines, compare trying to write your name with your normal writing hand and then with the other hand. Although you have full knowledge of the task, it is much harder to do it with the unusual hand, unless you first spend a lot of time practicing—that is, training the appropriate action mechanisms. Knowing exactly what you want to write and being able to control your muscles is not enough! On this theory, we can think of many of the mechanisms in the various boxes as if they were suborganisms that have coevolved in such a way that the needs and capabilities of each suborganism applied evolutionary pressure influencing the development of other suborganisms. (For a similar view, see Popper 1976.) For instance, development of mechanisms for standing and walking with an inherently unstable upright posture on two legs produces pressure for the sensory systems to produce perceptual information that can rapidly detect motion that indicates a need for corrective action. So in humans, the visual system can use information from optical flow to send signals to posture control subsystems (Lee and Lishman 1975). This can happen in parallel with performing other visual functions, such as recognizing objects, people, and affordances in the environment and also providing visual feedback for grasping actions or direction control while walking. This model implies that information coming in via a particular sensory subsystem may be processed concurrently in several different ways, and that different kinds of information will be routed in parallel through various different subsystems, according to their current needs. This ‘‘labyrinthine’’ architecture (Sloman 1989) contrasts with the ‘‘one-way, data-driven, geometry/physics-based’’ view of a visual system mentioned above. It is also consistent with recent discoveries of multiple visual pathways (e.g., Goodale and Milner 1992), though the ideas presented here suggest that far more functionally distinct sensory pathways exist than have been discovered so far, including linkages between mechanisms for different sensory modalities that enable them to share some tasks, such as mutual disambiguation.
The task of cataloging the full variety of perceptual functions in humans and other animals has barely begun. This seriously restricts our ability to design humanlike or, for instance, squirrel-like robots. In particular, completing the task will require a full analysis of the many kinds of perceptual affordances that need to be detected to meet all the needs of different subsystems of the organism. (Some of the more complex affordances involving perception of possibilities and impossibilities are discussed in Sloman 1996d.)
3.8 The Need for ‘‘Alarm’’ Systems
Reactive mechanisms are generally faster than deliberative mechanisms because (as explained in Sloman 2000a) deliberative systems are inherently discrete and serial. They have these characteristics both because they need access to a reusable short-term memory structure and because they repeatedly need to access the same associative memory store (e.g., to answer questions like ‘‘Which actions are possible in situation S?’’ and ‘‘What will happen if action A3 is performed in situation S?’’). Reactive systems can sometimes also be relatively slow, if they use a lengthy sequence of chained internal reactions between sensory input and motor output—for instance, in the teleoreactive systems described in Nilsson (1994). In some situations, even a delay of fractions of a second may make the difference between catching or losing an opportunity to eat, or to avoid injury or death. For this reason, it may be useful, even in a purely reactive organism, to have an ‘‘alarm’’ mechanism that is relatively fast and that, either as a result of genetically determined design or as a result of previous learning from experience, or both, can very rapidly detect a need for some change in behavior and very rapidly produce that change by sending control signals to other subsystems. Such an alarm system is schematically depicted in figure 3.7, where the alarm system receives inputs from all the main parts of the architecture and can send its outputs to all the main parts. If the alarm mechanism uses an appropriate neural net architecture with not too many layers it can function very quickly, although being fast rules out performing complex deductions, exploring alternatives, and so forth. This means that such an alarm mechanism will necessarily be relatively stupid (compared with a deliberative system or a more
Figure 3.7 Adding alarm mechanisms. If there are contexts in which producing a behavioral response by ‘‘normal’’ means takes too long, it may be necessary for a very fast (trainable) pattern-directed ‘‘alarm’’ system to be able to interrupt and redirect other parts of the system. The figure shows only one global alarm system, though there may be several more specialized alarm systems.
sophisticated reactive system) and it can be expected sometimes to get things wrong. However, if, in situations where global interrupts are triggered, false positives are generally not as serious as failures to react quickly to dangers and opportunities, the advantages of a fast and stupid alarm system will outweigh the disadvantages. An alarm system may trigger various kinds of global responses, including freezing, fleeing, attacking, rolling up in a ball, rapidly jumping sideways to avoid a fast approaching large object, or changing direction or speed if already in motion. More subtle reactions might be changes in high level attention control systems (e.g., attending to some visual or auditory stimulus) or changes in physiological or cognitive state to enhance readiness for some kind of action or some kind of cognitive process, such as rapid decision making or planning. Some of these will be chemical changes— either localized or throughout the organism. Other changes will be muscular, or changes in heart rate, and so on. These physiological phenomena underlie various familiar emotions; however, as we shall see, they are not essential to all types of emotions.
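A hedged sketch may make the trade-off concrete: a shallow pattern check that receives inputs from many subsystems and can broadcast high-priority interrupts to all of them. The particular patterns, thresholds, and callback scheme below are assumptions introduced for illustration.

```python
# Sketch of a fast, "stupid" alarm mechanism: cheap pattern tests that can
# interrupt and redirect other subsystems. Patterns and thresholds are toys.

from typing import Callable, Dict, List

class AlarmSystem:
    def __init__(self):
        self.subscribers: List[Callable[[str], None]] = []

    def register(self, handler: Callable[[str], None]) -> None:
        # Major subsystems register to receive overriding signals.
        self.subscribers.append(handler)

    def monitor(self, percept: Dict[str, float]) -> None:
        # Shallow tests only: speed is bought at the price of subtlety,
        # so false positives are expected and tolerated.
        if percept.get("looming_size", 0.0) > 0.7:
            self._broadcast("FREEZE-OR-FLEE")
        elif percept.get("sudden_noise", 0.0) > 0.9:
            self._broadcast("ORIENT-TO-SOUND")

    def _broadcast(self, signal: str) -> None:
        for handler in self.subscribers:
            handler(signal)          # high-priority, global redirection

if __name__ == "__main__":
    alarm = AlarmSystem()
    alarm.register(lambda s: print("reactive layer interrupted:", s))
    alarm.register(lambda s: print("deliberation suspended:", s))
    alarm.monitor({"looming_size": 0.9})   # triggers a global response
    alarm.monitor({"looming_size": 0.1})   # nothing happens
```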
It is worth noting that although figure 3.7 depicts the alarm system as if it were a separate architectural subsystem, its specification is actually consistent with requirements for the reactive layer in the architecture. So alarm mechanisms can also be seen simply as subcomponents of the reactive layer in the architecture—which is how they were construed in some of our earlier papers. However, they typically differ from other reactive components in receiving inputs from many parts of the organism, and being able to send high priority control signals to many parts of the organism. Insofar as this means that a reactive mechanism can sometimes dominate the other layers, it differs from a simple subsumption architecture, as mentioned above, though some of the principles are similar. Empirically, it seems that in nature, many organisms have one or more such mechanisms. In humans, there are various innate reflex mechanisms that serve this type of purpose, some of them very narrow in their field of influence, such as the blinking reflex triggered by rapid motion, which can protect eyes from danger. Athletes, boxers, drivers of fast cars and airplanes, and musicians who sight-read musical scores all illustrate the existence in humans of very fast trainable reflexes, some of them involving very sophisticated information processing. Trainable fast reactive systems seem to be in the brain stem, the cerebellum, and the limbic system, and no doubt elsewhere in the brain. Precisely how to define the boundary between alarm systems and other fast reactive systems is not clear, though not all will have the global control capability required for an alarm system as indicated in the diagram. They certainly need not all be innate: in a complex and changing environment, it may be useful for organisms to be able to learn which situations need a fast global response. Later we shall link some of the processes in reactive systems— especially processes involving alarm mechanisms triggered by sensory inputs or physiological state monitors and producing global physiological changes—to a subset of emotions, namely, the primary emotions. If global alarms are triggered by events in the deliberative and metamanagement layers, they are secondary emotions. The subset of cases in which the effect of an alarm mechanism, or other ‘‘interrupt’’ mechanisms, is a disposition to override control at the metamanagement level will be described as tertiary emotions. All this provides only a crude, provisional architecture-
Figure 3.8 Additional components required. For the systems described previously to work, several additional subsystems are required, including personae (variable personalities), attitudes, formalisms, standards & values, categories, descriptions, moods (global processing states), motives, motive comparators, motive generators (Frijda’s ‘‘concerns’’), long-term associative memories, an attention filter, and a skill compiler.
based distinction among types of emotions, and some further subdivisions will be indicated later, though this chapter cannot present a complete overview.
3.9 Additional Components
The ninefold architectural decomposition previously outlined does not amount to a specification for a functional system. There are many details missing, listed in figure 3.8. This section gives a brief overview of some of them.
Explicit and Implicit Goal Mechanisms
An organism may have implicit or explicit goals. A goal is implicit when detection of a situation directly triggers an appropriate reaction, though generally there is no unique implicit goal. The reaction may have been designed or may have evolved to achieve some result, but without using any explicit representation of that result. Where there is an explicit, enduring representation that can be used to check whether the current situation matches the goal specification, we can say that there is an explicit goal. If sensors detecting contact with a very hot object trigger a withdrawal reaction, then the implicit goal may be to avoid damage
from the hot object, or to move to a state where there is no longer contact with the hot object. (In general, appropriate descriptions for implicit goals are not uniquely determined by the condition-action relationship.) For a very simple organism swimming in a soup of nutrients, detecting a shortage of energy or some other resource may trigger behavior such as opening the mouth so that nutrients are swallowed, perhaps along with orientation or movement controlled by sensors that detect density gradients for the nutrients. In some cases, however, detecting a need does not automatically determine the action required to satisfy the need. Which action is appropriate will depend on the context, which may have to be ascertained by examining the environment, searching information stores, and so on. For instance, if a need for food is detected, the organism may have to use knowledge to work out what to do in order to get food. If the knowledge is not available, then that determines another need, namely, to acquire knowledge that can then be used for the purpose of working out how to get food. Even if all the required knowledge is available (e.g., that the food is buried in a particular location some distance away), the achievement of the goal of consuming food may require many intermediate steps. It is possible for a purely reactive system to cope with these cases, provided that it already has a suitable collection of reactive behaviors. Detection of energy shortage may trigger production of an explicit internal state that effectively represents the goal of obtaining food. The new state has that representational function insofar as it helps to trigger additional behaviors that are terminated only when food is found, which in turn triggers that representation to be removed (or deactivated) and the food to be eaten. All this may involve several intermediate reactive steps, because the existence (or high level of activation) of the goal representation may repeatedly trigger actions determined by the combination of the goal and the changing context. This can continue until the current situation ‘‘matches’’ the goal (e.g., food is reached). This reactive goal-driven behavior can work because the organism in effect has a prestored plan for achieving the goal. The plan is implicit in a large collection of condition-action rules. However, if there are situations in which the prestored plans are not adequate, a planning mechanism may be needed for working out a sequence of actions that can achieve the goal.
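The reactive goal-driven loop just described can be pictured with a toy rule set in which detecting an energy shortage deposits an explicit goal token that keeps triggering condition-action rules until the situation matches the goal. Every concrete detail below (state variables, thresholds, rules) is invented for illustration only.

```python
# Toy illustration of a purely reactive system with an explicit goal token.
# The "plan" is implicit in the rule set; the goal token merely persists
# until the situation matches it.

def reactive_step(state):
    if state["energy"] < 3 and "get-food" not in state["goals"]:
        state["goals"].add("get-food")        # explicit, enduring goal token
    elif "get-food" in state["goals"] and not state["at_food"]:
        state["position"] += 1                # move toward remembered food site
        state["at_food"] = state["position"] >= state["food_at"]
    elif "get-food" in state["goals"] and state["at_food"]:
        state["energy"] += 5                  # eat
        state["goals"].discard("get-food")    # goal matched: deactivate it
    return state

if __name__ == "__main__":
    s = {"energy": 2, "position": 0, "food_at": 2, "at_food": False, "goals": set()}
    for _ in range(5):
        s = reactive_step(s)
        print(s["energy"], s["position"], s["goals"])
```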
Plan Formation in Reactive Systems
Organisms with a sufficiently rich and varied evolutionary history may have evolved a large enough collection of prestored plans to cope with most problems that arise in their environment—for example, insects and perhaps many other animals. However, there are problems if the evolutionary history of the type of organism is not adequate, or if the genetic structures are not large enough to store all the plans required, or if the individual brains are not large enough for them. A partial solution would be for individuals to be born with a partially specified reactive plan store that has surplus capacity. Then various kinds of learning (e.g., learned sequences of responses, such as can be used to train performing animals) might encode new plans that can thereafter be used when the need arises, in that sort of environment. The implicit reactive plan-following mechanisms may be more or less sophisticated depending on the formalism available for expressing goals and various conditions and actions. For example, the goal formalism may simply be a collection of on-off flags (perhaps allowing different activation levels between ‘‘off’’ and fully ‘‘on’’). Alternatively, the goal states may include parameters that can be bound to different values at different times, introducing more flexibility in the selection of appropriate actions. In some cases, it may be possible to formulate syntactically complex goals (e.g., ‘‘find food of type X not further away than distance D’’). However, syntactic complexity is more likely to be found in deliberative mechanisms.
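For concreteness, the three grades of goal formalism mentioned above might be encoded roughly as follows; the particular encodings are assumptions made here for exposition, not proposals from the text.

```python
# Illustrative encodings of increasingly sophisticated goal formalisms.

# 1. On-off flags, possibly with activation levels between 0 and 1.
goal_flags = {"get-food": 0.8, "avoid-light": 0.0}

# 2. Parameterized goals: one schema bound to different values at different times.
goal_parameterized = ("get-food", {"type": "fruit", "max_distance": 40})

# 3. Syntactically complex goals with recursive structure, e.g.
#    "find food of type X not further away than distance D".
goal_structured = ("find", ("food", ("type", "X")),
                           ("constraint", ("distance", "<=", "D")))

if __name__ == "__main__":
    for g in (goal_flags, goal_parameterized, goal_structured):
        print(g)
```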
Deliberative Mechanisms and What-If Reasoning
In some environments, the previously mentioned partial solution to the need for novel plans is inadequate (e.g., because the training process in reactive systems is too slow or too dangerous). Training a purely reactive system is satisfactory if the environment provides suitable ‘‘safe’’ contexts for learning. A deliberative layer is required only when evolutionary history and individual training opportunities do not provide a wide enough variety of plans that can be learned safely. Forms of learning that require trial and error, or reward and punishment, are not safe where the error or punishment entails death or serious injury. A deliberative mechanism enabling analysis of future possibilities can sometimes provide a
safer way of discovering useful new plans, as Craik pointed out in 1943, and various others have also noticed since. This allows our ‘‘hypotheses to die in our stead’’ as the philosopher Karl Popper (1976) noted (see also Dennett 1996). This process requires ‘‘what-if’’ reasoning mechanisms. Here, instead of a new goal immediately triggering behaviors in the reactive layer, it can trigger planning behavior in a deliberative layer. This typically involves exploring combinations of future possible actions until one is found that achieves the goal, or at least achieves a state in which it may be easier to achieve the goal. Only when the deliberative mechanism has created and selected an appropriate plan or partial plan (a process that may be more or less complex, depending on the organism and the environment) will it be possible to begin producing appropriate behavior under the control of the plan. Research in AI over the last forty years shows that such planning may be a far more complex process than most people might suspect if they have not tried to design working systems with such capabilities. Things get even more complex if the planning mechanism is interruptible, because it operates concurrently with perceptual mechanisms, goal generators (discussed below), and plan execution mechanisms (which may be busy executing a plan to achieve another goal). Among the important mechanisms required for such deliberative processes to work are the following:
1. A reusable short-term memory in which candidate plan steps and predictions of plan execution can be constructed and compared, prior to the selection of a plan, which can then be ‘‘frozen’’ in a longer-term, though still not necessarily permanent, memory until the plan has been achieved or abandoned. If there are mechanisms for learning by comparing the results of different plan-execution processes, a more sophisticated plan memory may be required.
2. A long-term extendable associative memory that can store reusable information in order to answer questions of forms like: ‘‘What actions are possible in situations of type S?’’ ‘‘What type of situation can result from performing action A in situation S?’’ ‘‘What are the good and bad aspects of being in a situation of type S?’’
As suggested in Sloman (2000a), this process might have evolved from a reactive condition-action system by a typical biological process of copying and modifying a previously evolved mecha-
nism. The same basic content-addressable associative engine for determining what action to produce in particular conditions could be copied (as is common in evolution) and the new one modified so that it is used to determine responses to inputs representing hypothetical situations. However, in order to be able to produce a set of possible actions, rather than an individual action, the mechanism would need to be able to cope with nonunique associations. This would also be required for a predictive mechanism in conditions of uncertainty, where a given event could be followed by alternative consequences. The precise sequence of evolutionary steps required to produce such functionality is a topic for further research. Although details are not known, it seems clear that some sort of trainable associative neural net could implement all the required functionality for such a memory. On this basis, along with further mechanisms of types explored in AI research on planning and problem solving, it would be possible for a planning system to be able to explore branching sets of alternative possible futures. If the associative memory included information about alternative conditions that could produce a particular consequence, it could also be used for a backward-chaining planner. If such a deliberative mechanism, when presented with a goal, is capable of discovering two or more alternative (partial or complete) plans for achieving that goal, then some additional evaluative criteria will be required for selecting among the alternative plans. In some cases, this could be done on the basis of how the plan relates to other current goals—for example, by preferring a plan that most promotes or least hinders other goals or by using standards, values, preferences, or ideals (compare chapter 6 of Sloman 1978). Notice that not all selections are necessarily based on selfish considerations. In addition, the organism may have general preferences—for example, for shorter plans rather than longer ones, or for less dangerous rather than more dangerous routes, or for easier terrain over shorter routes, or for familiar strategies over unfamiliar ones—at least in contexts where outcomes are partially uncertain. As I have explained in previous papers, the sort of deliberative mechanism that can produce new plans is likely to be discrete, serial, and slow (i.e., resource limited). Although parallel neural mechanisms may be used as an underlying implementation, the main operations are likely to be serial. One reason is the need to
reuse a limited short-term memory for building temporary structures, such as possible plans. Access to the long-term associative memory will be inherently serial if it can deal with only one query at a time. It may also be necessary to have a single high-level ‘‘control’’ mechanism for resolving major conflicts, because otherwise, multiple inconsistent decisions could be taken (e.g., going east to get food and west to get water). Finally, if the system is to be able to learn associations between events in the deliberative mechanism, the number of co-occurring events needs to be limited—for the complexity of searching for associations between subsets of events is exponential in the number of events. So learning requires parallelism to be constrained (I first heard this argument from Dana Ballard).
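A hedged sketch may help fix ideas: a long-term associative memory answers questions of the two forms listed above, while a serial search reuses a single short-term buffer to grow candidate plans, returning the first that satisfies the goal. The tiny domain, the breadth-first strategy, and the resource limit are illustrative choices only, not a claim about the actual mechanisms.

```python
# Sketch of "what-if" deliberation over an associative memory.
# The memory maps a situation to the (possibly non-unique) actions
# available in it and their predicted successor situations.

from collections import deque

ASSOCIATIVE_MEMORY = {
    ("home", "hungry"): [("walk-to-tree", ("tree", "hungry"))],
    ("tree", "hungry"): [("pick-and-eat", ("tree", "fed")),
                         ("walk-home", ("home", "hungry"))],
}

def plan(start, goal_test, limit=10):
    """Serial, resource-limited breadth-first what-if search."""
    frontier = deque([(start, [])])        # the reusable short-term buffer
    while frontier and limit > 0:
        state, steps = frontier.popleft()
        if goal_test(state):
            return steps                   # a plan, ready to be "frozen"
        for action, successor in ASSOCIATIVE_MEMORY.get(state, []):
            frontier.append((successor, steps + [action]))
        limit -= 1
    return None

if __name__ == "__main__":
    print(plan(("home", "hungry"), lambda s: s[1] == "fed"))
    # -> ['walk-to-tree', 'pick-and-eat']
```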
Discussion
Bellman: I am really troubled by the fast/slow dichotomy—this notion that somehow ‘‘fast’’ equals ‘‘stupid,’’ you know, stupid components: There is just no reason for these assumptions.
Sloman: I mean by ‘‘stupid’’ just ‘‘based on recognizing a pattern and producing a response associated with that pattern,’’ as opposed to using chains of deduction or creative and context-sensitive analysis of the situation. By definition, ‘‘alarm’’ mechanisms have to be fast. The kinds of mechanisms that operate very fast are likely also to be lacking in subtle discriminations. If they are produced deep in a reactive subsystem, they may be hard to suppress—even if they are known to be inappropriate.
Ortony: It just affects the reaction path, the behavior that’s associated with inputs having this structure.
Sloman: And because it’s fixed and pattern driven, it will sometimes make sense, and sometimes it will be stupid.
Rolls: Yes. But the deliberative system is necessarily slow, because it is serial.
Sloman: And it’s serial for several sorts of reasons: it’s reusing short-term memory resources required for creating temporary structures such as representations of plans or hypotheses, and I suspect that the long-term associative memory required for the deliberative processes will always be limited in the number of questions that we can answer at any one time. Even if it is imple-
mented as a highly parallel associative memory, it may be able to deal only with one question at a time, otherwise we get nonsensical answers—e.g., because of cross-talk between the processes answering different questions.
Bellman: My objection is that we don’t have to make those assumptions about the fast-slow. . . .
Picard: It’s the ‘‘more-reasoning-is-not-necessarily-better’’ problem. There are cases where people’s snap judgment turns out to be much smarter than the one that went back and thought more about it.
Sloman: Yes—a fast reactive response produced by a highly trained expert will often be far better than the results of explicit problem solving or creative design produced by a novice. This is very familiar in many contexts: playing chess, solving mathematical problems, writing computer programs, reacting to social encounters, et cetera. But in a situation that is subtly different from any encountered during the training process, that fast response may be inappropriate because a special feature of the situation is not noticed or not taken into account.
Bellman: Our notions about fast and slow processing in the nervous system have undergone fundamental, far-reaching changes over the last thirty-four years. There’s lots of arguments in debates, and I just think it is a distinction that you don’t need.
Sloman: It’s all relative! Whatever mechanisms are available, you may have some processes of recognizing and responding that go through very few intermediate stages, so that the response happens quickly; whereas other problems may require more complex analysis and problem solving, including combinatorial search, in a deliberative mechanism. Such a deliberative process will be slower than a reactive process in which responses are triggered directly and quickly. But sometimes the deliberative mechanism, despite its slowness, is required for dealing with novel situations. For an organism whose environment is always totally predictable and requires no new forms of behavior, the fast reactive mechanisms may suffice. If processing is going through a sequence of stages—e.g., exploring options, then it’s going to take longer than a process that simply associates a trained response with a recognized pattern. Sometimes deliberation gives you an advantage, but sometimes, if
you are depending on that extra flexibility, you are going to be eaten, because you will not react fast enough.
Ortony: So, Aaron, what you have here is a system whose sensitivity is such that it does not care about false alarms, but it can’t afford misses.
Sloman: It is a trade-off, not a simple matter of ‘‘not caring’’: if the environment is such that most of the time a particular type of fast response is best in certain contexts, then the appropriate mechanism for dealing quickly with those contexts may evolve or be produced by training. But it may also be capable of serious mistakes, though if they occur less often the risk may be acceptable. And, of course, it may occasionally go wrong. Moreover, in all the battles between species of different sorts, other animals might discover what triggers certain fast reactions so they lay a trap for it, and it falls into the trap. A highly trained boxer may use the highly trained reactions of his opponent in exactly that way—e.g., trapping him into adopting the wrong sort of guard.
Skill Compilers
For all those reasons, creating new plans is likely to be a slow process. However, once a plan has been created, it may be possible for that plan to be stored and reused later without having to repeat the planning process. That is relatively easy for a computer to do. In humans, it seems that plans that are created and followed a number of times, and are usually found to be successful, can have the appropriate training effect on the reactive system’s memory (the cerebellum perhaps?). That is, there are mechanisms whereby structures learned in slow deliberative processes can be transferred to the reactive system so that in the future they are available for rapid retrieval and execution, without using the deliberative mechanism, which is then able to perform some other task at the same time. In other words, the architecture includes one or more skill compilers. This leaves open many questions about the precise details by which such stored reactive types of expertise are created and selected when needed in the future. For instance, is the same formalism used by the slow deliberative mechanism or does the development of fluency and speed require transformation to a different formalism, as has often been suggested by analogy with the difference between interpreted and compiled programs?
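One way to picture a skill compiler, under the assumption that a few successful executions are the trigger, is a cache keyed by situation and goal: once a deliberatively produced plan has succeeded often enough, it becomes available for fast retrieval without re-planning. The threshold and data structures below are my own illustrative assumptions.

```python
# Sketch of a "skill compiler": successful plans migrate from the slow
# deliberative path to a fast, pattern-indexed store.

compiled_skills = {}          # (situation, goal) -> stored plan
success_counts = {}

def record_success(situation, goal, plan, threshold=3):
    key = (situation, goal)
    success_counts[key] = success_counts.get(key, 0) + 1
    if success_counts[key] >= threshold:
        compiled_skills[key] = plan        # now available for fast retrieval

def act(situation, goal, deliberate):
    key = (situation, goal)
    if key in compiled_skills:
        return compiled_skills[key]        # fast: no deliberation needed
    plan = deliberate(situation, goal)     # slow: serial what-if search
    record_success(situation, goal, plan)
    return plan

if __name__ == "__main__":
    slow_planner = lambda s, g: ["walk-to-tree", "pick-and-eat"]
    for _ in range(4):
        act("hungry-at-home", "fed", slow_planner)
    print(compiled_skills)
```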
Varieties of Motivational Submechanisms
The discussion so far has assumed that changes in body state (e.g., need for food or liquid) or in perceived situations can produce new goals, which may either trigger reactive behaviors if suitable ones exist, or trigger deliberative processes to produce one or more plans to achieve the goal. But that assumes that there is some mechanism for producing the goal in the first place. That is, the sort of organism we are discussing requires motive generators. Several issues regarding the functionality of motive generators were discussed in Beaudoin (1994). There are many different kinds of goals, or motives. They can be short term, long term, or even permanent. They can be triggered by physiological changes, by percepts, by deliberative processes (e.g., subgoals of a preexisting goal), by recollections, by metamanagement processes. Thus embedded in various parts of the organism there will be a collection of motive generators: MG1, MG2, MG3, . . . of various types. These can be activated asynchronously in parallel with other processes (e.g., while you are in the middle of building a shelter you may become tired and wish to rest, or hungry, or itchy, or curious about the noise coming from something out of sight). As a result of asynchronous generation, motives may be in conflict, so motive comparators are also needed for resolving conflicts (Sloman 1987a). Simple models often use a single numerical motive comparator, based on the assumption that every motive has an associated utility, where utilities are either scalar values or are at least partially ordered. However, in real life, things may be more complex, and the organism may have to learn which motives to treat as more important in various contexts—for example, by discovering the short-term and long-term consequences of failure to act on them. Such selection skills may be expressed in a collection of different specific motive comparators, MC1, MC2, . . . , some within the reactive layer, some within problem solving or planning mechanisms (e.g., comparing subgoals) and some as part of metamanagement (e.g., determining which ‘‘top level’’ goals to select when there are conflicts). Some organisms may have fixed, innately determined motive generators and motive comparators. Others may be capable of acquiring new ones through processes of learning, development, or being influenced by a culture. In other words, there may be motive generator generators (MGG) and motive comparator generators
(MCG). We can speculate about even more sophisticated organisms (or robots) that may also have motive generator comparators (MGC), and motive comparator comparators (MCC), which could play a role in creating or learning motive generators or motive comparators, respectively. (Perhaps we should consider yet more possibilities: MGGG, MGGC, MCGG, MCGC, MGCG, MGSS, . . . , etc.?)
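The following sketch shows two motive generators firing asynchronously from different sources (physiological state and percepts) and a context-specific motive comparator resolving the resulting conflict without appeal to a single numerical utility. The particular generators and the ordering rule are invented assumptions, not part of the theory.

```python
# Illustrative motive generators (MG) and a learned motive comparator (MC).

def mg_hunger(body):           # MG1: triggered by physiological state
    if body["energy"] < 3:
        return ("get-food", body["energy"])

def mg_curiosity(percepts):    # MG2: triggered by a percept
    if percepts.get("strange_noise"):
        return ("investigate-noise", None)

def mc_safety_first(motives):
    # MC1: a context-specific comparison rule, not a global utility scale:
    # food outranks curiosity when energy is low.
    order = {"get-food": 0, "investigate-noise": 1}
    return min(motives, key=lambda m: order.get(m[0], 99))

if __name__ == "__main__":
    body = {"energy": 2}
    percepts = {"strange_noise": True}
    motives = [m for m in (mg_hunger(body), mg_curiosity(percepts)) if m]
    print("active motives:", motives)
    print("selected:", mc_safety_first(motives))
```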
Evaluation Mechanisms
There are also evaluators. Various mechanisms within an organism may evaluate some aspect of the current state as good or bad (i.e., something to be preserved or something to be terminated)—this can be independent of a comparison with any goal or motive. Aesthetic experiences involve evaluation of the current state as worthwhile, even if it serves no purpose (e.g., looking at a rainbow or enjoying a tune). Evaluations play a role in many theories of emotions—for instance, as ‘‘concerns’’ in Frijda’s (1986) theory. 4 Similarly, possible futures, including actions or action sequences, can be evaluated as desirable or undesirable independently of any comparisons with specific alternatives. This may be part of the process of deciding whether a particular goal should be adopted or rejected. If the only available means are evaluated as bad, then the goal may be rejected, or perhaps postponed in the hope that new means may become available. Humans can also use their ‘‘what-if’’ reasoning capabilities to consider possible past events that did not actually occur, and evaluate them as good or bad. For instance, after some failure has occurred, this could be part of trying to understand what went wrong and what alternatives were possible. What it means to say that a mechanism evaluates a state or event as good or bad will be slightly different in different contexts. This cannot be defined simply in terms of the form of output of some classification mechanism, whether it is binary, symbolic, or a value on a continuous scale. Rather, what makes it an evaluation is the functional role of the classification within the larger architecture. For example, to say that the current situation is evaluated as good by an organism, or a submechanism, implies that, in the absence of other factors, the organism or mechanism will be disposed to try to preserve (or intensify) the situation, or resist occurrences that are likely to reduce or terminate it. Evaluation as bad is
similar, with appropriate changes. This is different from evaluating as good or bad a future action or situation, or one that is merely considered as a possibility during planning: In these cases, different effects will follow from the evaluation. Evaluation is also related to learning processes. For instance, learning mechanisms based on positive and negative reinforcement tend to increase the likelihood of actions that are found to produce results evaluated as good and to decrease the likelihood of actions if results are evaluated as bad. (This simple summary does not do justice to the variety of such learning phenomena.) Rolls (1999) has emphasized the importance of such evaluations and, like many other researchers, assumes that the evaluations make use of a ‘‘common currency,’’ so that conflicts between motives can be resolved by comparing evaluations. However, having something like a simple numerical scale for goodness and badness of states or events is often too crude, as consumer reports show when they present the results of comparisons in terms of a collection of descriptions of objects on different dimensions, such as cost, ease of use, reliability, results achieved, and so on (compare the analysis of ‘‘better’’ in Sloman 1969). So although simple organisms, and simple submechanisms in more sophisticated organisms, use a one-dimensional ordered scale of value, in general, evaluation is a more complex process, and different evaluations need not be commensurable. This is one reason why we have conjectured that humans use learned motive comparators: There would be little need for them if all comparisons could be based on relative value or utility. If there is no fixed, general, common currency, then different rules for comparing motives or the outcomes of actions may have to be learned separately. In some cases, they may be learned through cultural processes (including forms of indoctrination) rather than through individual learning. In other, sometimes tragic, cases, conflicts remain unresolved (e.g., where there is a choice between looking after one’s sick mother and leaving her to fight for one’s country). The existence of a problem does not imply the existence of an answer. I have deliberately not described evaluations as if they were always produced by the whole organism, because different submechanisms, which evolved at different times and serve different functions within the larger architecture, may perform their own evaluations concurrently and independently. This allows the
possibility of conflicting evaluations. Normally this will not arise, but it can and does occur—for instance, when curiosity and fear pull in different directions, or when pain and sexual pleasure have the same cause. Talk of evaluations does not imply anything about consciousness. However, if there is a metamanagement layer in the architecture, then because it can monitor some internal states and processes it may detect some of the evaluations produced by other subsystems. In that case, the organism is aware of its evaluations.
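To illustrate the functional-role reading of ‘‘evaluation,’’ the sketch below treats any classification as an evaluation insofar as it yields a disposition to preserve or terminate the evaluated state, and shows how two independent evaluators can pull in different directions. The toy evaluators and thresholds are, of course, inventions for exposition.

```python
# Evaluation defined by functional role: whatever a subsystem outputs counts
# as an evaluation if it disposes the system to preserve or terminate a state.

def evaluate_warmth(state):
    # Positive evaluation -> disposition to preserve the current state.
    return "preserve" if 18 <= state["temperature"] <= 24 else "terminate"

def evaluate_curiosity(state):
    # A different subsystem may evaluate the same state differently.
    return "preserve" if state.get("novelty", 0) > 0.5 else "terminate"

def dispositions(state):
    # Concurrent, independent evaluations; conflicts are possible and are
    # not resolved here (compare curiosity vs. fear in the text).
    return {"warmth": evaluate_warmth(state),
            "curiosity": evaluate_curiosity(state)}

if __name__ == "__main__":
    print(dispositions({"temperature": 30, "novelty": 0.9}))
    # {'warmth': 'terminate', 'curiosity': 'preserve'}  -> conflicting pulls
```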
Pleasure and Pain
Evaluation is closely related to the familiar concepts of ‘‘pleasure’’ and ‘‘pain,’’ concepts that are extremely difficult to analyze (compare Dennett 1978, chapter 11). However, the concepts seem to include the use of something like what I have called evaluators. A state of pleasure is one in which the current situation or activity is evaluated positively, producing a disposition to try to preserve it. Likewise, pain is connected with negative evaluation, producing a disposition to terminate or reduce something. The effects of such evaluations are only dispositions to act, because in many contexts the effects will be overridden by other factors: A person’s behavior may irritate you, producing the disposition to try to change him, or to avoid his presence. But there may be other factors that override the disposition, so that the negative evaluation does not lead to action, even if the disposition to act is quite strong. Self-control can be valuable. Evaluations can occur at different levels in the system, and in different subsystems, accounting for many different kinds of pleasures and pains. However, we would not be inclined to refer to them as pleasures or pains unless they occur at a fairly high level in a control hierarchy and potentially affect relatively global behavior rather than simply the reactions of a minor component embedded deep within the system performing some homeostatic function. However, there is still work to be done clarifying the relationship between evaluations and pleasures and pains.
about it. Despite all this, some people insist that these are cases of ‘‘low-intensity’’ emotions. This is one of many ways in which the pretheoretical notion of ‘‘emotion’’ is seriously indeterminate. The account we offer below restricts the term ‘‘emotion’’ to cases where evaluations produce a (fairly strong) disposition to trigger some sort of interruption or redirection of ongoing activities (e.g., through activation of a global alarm mechanism). Similarly, not all evaluations involve emotions. For instance, within a perceptual mechanism, detection of positive and negative affordances may involve evaluation of some object or situation as good or bad in relation to current goals or general needs. But this need not involve any kind of emotion; though in special cases such a percept may trigger activity in the alarm mechanism, causing some disruption or redirection, or a disposition to produce disruption or redirection—that is, an emotion. However, such dispositions may vary in strength, and perhaps only those above a (vague) threshold are normally classed as emotions, even if the disposition is overridden, as weak ones often are. In this respect, the normal concept of ‘‘emotion’’ has inherently vague boundaries.
Moods and Attitudes
Moods and attitudes are also affective concepts that can be defined architecturally. They are related to, but distinct from, emotions (though sometimes confused with them). They have in common that they tend to be longer lasting than goals, desires, or (most) emotions, though in other respects, moods and attitudes are very different from each other. Moods are relatively straightforwardly construed as semantics-free global states modulated by either perceived environment or internal processes, often involving chemical changes in humans. Moods in turn have a general modulating effect on other states and processes. Note, however, that when someone says ‘‘I am in the mood for dancing’’ this expresses a specific desire rather than the sort of mood discussed above, which might be a general state of elation, calm satisfaction, depression, irritability, optimism, and so on. I don’t know if this is simply a quirk of English, or whether the word for ‘‘mood’’ in other languages has similar divergent usage. In what follows I’ll ignore such cases.
Some moods (i.e., global states) may be adaptive or ‘‘rational’’ reactions to the environment. For example, in an environment where most attempts at partly risky actions fail, it can induce a cautious or pessimistic mood in which there is a strong disposition to select the less risky of available alternatives (e.g., lying low, or doing nothing), even where there are options that could achieve far better results. Similarly, in a more ‘‘friendly’’ environment where moderately risky actions are found generally to be successful, it can induce an optimistic mood, including a strong tendency to select the action with the most highly valued consequences, even when there is a risk of failure. Of course, in calling them rational reactions I do not imply that there is any conscious or deliberate adoption of the mood. Rather, it is implied that it might be rational for a designer to build in a mechanism for producing such moods. Similarly, evolution might select such mechanisms because they help organisms to tailor their behavior appropriately to the environment, even if individual organisms have no idea that that is happening. However, moods, like desires, emotions, attitudes, and preferences, may sometimes be dysfunctional (e.g., when they are results of addictions or brain damage of some sort). Further mechanisms for producing and changing moods will not be discussed here, though they are very important (there are probably several of them, e.g., some chemical, some symbolic). It is very likely that what I have called ‘‘moods’’ (enduring, global, general state modulations without any specific semantic content) need to be further subdivided according to their different architectural bases: another topic for future research.

Attitudes, like moods, are generally long-term states, though they are far more complex and varied than moods and they have semantic content. It is possible simultaneously to have attitudes to many individual people, to one’s family, one’s country, one’s job, political parties, particular styles of music, particular lifestyles, and so forth. Moreover, an attitude to one thing may have many components, including a collection of beliefs, expectations, motive generators, preferences, and evaluations and will generally be rich in semantic content (unlike moods, which need not be ‘‘about’’ anything). The fact that one can simultaneously have many attitudes, including loving some people and hating others, implies that the majority of them will typically be dormant at any particular time, like Frijda’s
(1986) ‘‘concerns.’’ As a result, we may have far more attitudes than we are aware of. Usually terms like ‘‘love’’ and ‘‘hate’’ refer to such attitudes even though they are often thought of as examples of emotions. Your love for members of your family, or the music of Bach, or your hatred of religious bigotry, endures at times when you are thinking about other things and you are not at all emotional. However, a new percept, thought, memory, or inference can interact with a dormant attitude and trigger a new mood, emotion, or motive produced by an associated motive generator. For instance, hearing that a beloved relative is seriously ill can trigger a state of great anxiety, and seeing a newspaper headline announcing electoral success of an individual for whom one has a very strong attitude of disapproval can trigger an emotion of anger and motivations to expose the person. In this sense, each attitude will involve a large collection of unrealized dispositions. Not all of these need be behavioral dispositions. For example, there are also dispositions to produce new mental states (as noted in Ryle 1949). There is still much work to do, clarifying architectural requirements for mechanisms involved in production, ‘‘storage,’’ modification (including decay), and activation of attitudes.
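To make the contrast concrete, here is a minimal Python sketch (not from the chapter; all class and method names are invented) in which a mood is a semantics-free scalar that globally biases risk-taking and drifts with recent success or failure, while an attitude is a dormant, content-bearing structure that only generates a motive when a relevant percept arrives.

```python
import random
from dataclasses import dataclass, field

class Mood:
    """Semantics-free global state: a single scalar, not 'about' anything.

    It drifts with recent success or failure and biases every choice the
    agent makes between risky and safe options.
    """
    def __init__(self, level=0.5, drift=0.1):
        self.level = level          # 0 = deep pessimism, 1 = strong optimism
        self.drift = drift

    def update(self, action_succeeded):
        delta = self.drift if action_succeeded else -self.drift
        self.level = min(1.0, max(0.0, self.level + delta))

    def choose(self, options):
        """options: list of (expected_payoff, success_probability) pairs."""
        if random.random() < self.level:
            return max(options, key=lambda o: o[0])   # optimistic: best payoff
        return max(options, key=lambda o: o[1])       # cautious: safest bet


@dataclass
class Attitude:
    """Long-term, mostly dormant state *with* semantic content."""
    target: str
    valence: float                                    # +1 love ... -1 hate
    beliefs: list = field(default_factory=list)

    def react(self, percept):
        """Return a new motive only if the percept touches the target."""
        if self.target not in percept:
            return None                               # stays dormant
        if self.valence > 0 and "in danger" in percept:
            return f"protect {self.target}"
        if self.valence < 0 and "succeeded" in percept:
            return f"expose {self.target}"
        return None


# A hostile environment (risky actions mostly fail) drags the mood down,
# so the cautious branch is chosen more often over time.
mood = Mood()
options = [(10.0, 0.2), (2.0, 0.95)]
for _ in range(50):
    payoff, p = mood.choose(options)
    mood.update(random.random() < p)
print(f"mood after 50 trials: {mood.level:.2f}")

# Attitudes lie dormant until a relevant percept triggers a motive.
attitudes = [Attitude("sister", +1.0), Attitude("candidate X", -0.8)]
for percept in ["sister is in danger", "candidate X succeeded in the election"]:
    for a in attitudes:
        motive = a.react(percept)
        if motive:
            print(percept, "->", motive)
```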
Attention Filters
For reasons discussed in Simon (1967) and our previous papers (including Sloman and Croucher 1981; Beaudoin 1994; Wright, Sloman, and Beaudoin 1996), it may be that newly generated motives, alarm signals, or other pieces of information transferred from the reactive layer to the deliberative layer may be disruptive if an already active goal or plan or activity is urgent and important and requires close and continuous attention. It was argued above that the deliberative system is likely to be resource limited. Consequently, when new goals are generated, or new important items of perceptual information are presented to the deliberative mechanism, dealing with the new goal or information can cause diversion of resources from urgent, important, and demanding tasks. In extreme cases, even thinking about whether to continue thinking about the new information or new goal may disrupt a really intricate process (e.g., listening to complex instructions for performing a very important task, or watching a child that may be about to do something dangerous).
Such disruptions and wasteful frequent redirections of attention could be reduced if the deliberative mechanism had some sort of attention filter with a dynamically variable threshold for determining which new goals, alarm signals, or other potential interrupts should be allowed to divert processing in the deliberative layer. The interrupt threshold could be high when current tasks are important, urgent, and demanding, and relatively low at other times. Because extremely important and urgent new goals could in principle be generated at any time (e.g., by the news that the building is on fire), it should never be possible for the filter totally to exclude every kind of interrupt. If the mechanisms that generate the potential interrupters also assign to them a quickly computed crude, heuristic measure of urgency and importance, which our previous papers labeled ‘‘insistence,’’ then whether a new motive diverts the deliberative process will depend on whether its insistence is above the current threshold. The suppression of pain produced by injury while a battle is still raging or an important football match is still in progress may be an example of the functioning of such a filter. 5 It is also useful to contrast the insistence of a motive, defined in relation to its power to penetrate an interrupt filter, with its intensity, defined as its ability to remain active once selected for action. Boosting intensity of motives currently being acted on can reduce wasteful cycling between different plans and goals. Because the process of deciding whether or not to allow something to disturb resource-limited higher level processes must itself not divert those resources, the interrupt filter will have to be relatively fast and stupid, like the global alarm system, and both are for that reason potentially error prone. A global interrupt filter for resource-limited deliberative mechanisms is a sort of converse of the global alarm mechanism. Whether the filter is actually implemented as a separate mechanism or whether it is merely part of the operation of the alarm system or motive generators is not clear. Perhaps several implementations are used, for filtering different types of interrupts, as Beaudoin (1994) proposed.
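A minimal sketch of such a filter, assuming a single numeric insistence value per candidate interrupt and a threshold raised while the current task is important and urgent; the class and method names are invented for illustration, and the clamping of the threshold below 1.0 reflects the requirement that no kind of interrupt be excluded entirely.

```python
class AttentionFilter:
    """Fast, 'stupid' gate in front of a resource-limited deliberative layer.

    Each candidate interrupt carries a crude, cheaply computed 'insistence'
    value; only items whose insistence exceeds the current threshold get
    through.  The threshold rises while the current task is important and
    urgent, and falls when the deliberative layer is idle.
    """
    def __init__(self, threshold=0.3):
        self.threshold = threshold

    def set_busy(self, importance, urgency):
        # Never let the threshold reach 1.0: a sufficiently insistent
        # interrupt ("the building is on fire") must always get through.
        self.threshold = min(0.95, 0.3 + 0.6 * importance * urgency)

    def admit(self, insistence):
        return insistence > self.threshold


filt = AttentionFilter()
filt.set_busy(importance=0.9, urgency=0.9)       # e.g., watching a toddler

candidates = [("mild hunger", 0.2), ("phone buzz", 0.5), ("fire alarm", 0.99)]
for label, insistence in candidates:
    print(label, "->", "interrupts" if filt.admit(insistence) else "filtered out")
```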
Switching Personae
In humans, it seems that the metamanagement layer does not have a rigidly fixed mode of operation. Rather, it is as if different personalities, using different evaluations, preferences, and control strategies, can inhabit/control the metamanagement system at different times (e.g., the same person may have different personalities when at home, when driving on a motorway, and when dealing with subordinates at the office). Switching control to a different personality involves turning on a large collection of skills, styles of thought and action, types of evaluations, decision-making strategies, reactive dispositions, associations, and possibly many other things. For such a thing to be possible, it seems that the architecture will require something like a store of ‘‘personalities,’’ a mechanism for acquiring new ones (e.g., via various social processes), mechanisms for storing new personalities and modifying or extending old ones, and mechanisms that can be triggered by external context to ‘‘switch control’’ between personalities. If such a system can go wrong, that could be part of the explanation of some kinds of multiple personality disorders. It is probably also related to mechanisms of social control. For example, if a social system or culture can influence how an individual represents, categorizes, evaluates, and controls his own deliberative processes, this might provide a mechanism whereby the individual learns things as a result of the experience of others, or mechanisms whereby individuals are controlled and made to conform to socially approved patterns of thought and behavior. An example would be a form of religious indoctrination that makes people disapprove of certain motives, thoughts, or attitudes.
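The following toy sketch (invented names; not a claim about how brains implement this) shows the minimal machinery the paragraph above calls for: a store of personae, a way of acquiring new ones, and context-triggered switching of which persona currently supplies evaluations.

```python
class Persona:
    """A bundle of evaluations, preferences, and control strategies."""
    def __init__(self, name, evaluations):
        self.name = name
        self.evaluations = evaluations      # maps situation -> judgement

    def evaluate(self, situation):
        return self.evaluations.get(situation, "no strong view")


class MetaManager:
    """Holds a store of personae and switches control on external context."""
    def __init__(self):
        self.store = {}
        self.active = None

    def learn(self, context, persona):
        self.store[context] = persona       # e.g., acquired via social processes

    def switch(self, context):
        self.active = self.store.get(context, self.active)
        return self.active


mm = MetaManager()
mm.learn("office", Persona("manager", {"slow driver ahead": "be patient"}))
mm.learn("motorway", Persona("driver", {"slow driver ahead": "get irritated"}))

for ctx in ["office", "motorway"]:
    p = mm.switch(ctx)
    print(ctx, "->", p.name, ":", p.evaluate("slow driver ahead"))
```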
3.10 Organisms with Subsets of the Architecture
Not all parts of the grid are necessarily present in all animals. The type of architecture described so far and conjectured as an explanation of much human functioning is relatively sophisticated. Organisms that evolved earlier, including most types of organisms still extant, do not have all this complexity. It seems that various subsets of the architecture are found in nature, which is consistent with the conjecture that the complete system may have evolved from simpler architectures by adding new layers, with increasing internal functional differentiation. Studying such architectures—that is, exploring the neighborhood of humans in ‘‘design space’’—may give us a deeper understanding of the trade-offs in our design.
Purely Reactive Organisms
As far as I know, insects, spiders, and many other evolutionarily old types of organisms are purely reactive, though they may include some very complex genetically determined reactive plans. In the case of web construction by spiders, the plans control behavior of individual organisms. In social insects, such as termites and bees, there are very complex achievements that result from parallel activation of reactive plans in different individuals (e.g., the construction of termite ‘‘cathedrals’’). It is as if evolution was presented with an enormous variety of different problems (different evolutionary niches) and consequently found an enormous variety of different solutions. So far, the discussion of reactive mechanisms has been very sketchy. The main feature is negative: reactive mechanisms do not include ‘‘what-if’’ reasoning mechanisms that can construct representations of possible futures—or, more generally, possible situations past, present, or future—categorize them, evaluate them, and so on. However, as explained previously, reactive mechanisms can include implicit or explicit goals. The following are the sorts of features that characterize a reactive agent, or the reactive subsystem of a hybrid agent:

- Many processes may be analog (continuous), while others use discrete condition/action mechanisms.
- Mechanisms and space are dedicated to specific tasks, so that parallelism can achieve great speed, and different processes can proceed without mutual interference.
- All internal information-bearing states use dedicated memory locations, registers, or neural components: There are no general-purpose reusable information stores as are required for deliberative mechanisms.
- Both conditions and actions may be purely internal, in sophisticated reactive systems.
- Reinforcement learning is possible, though this usually requires alterations to weights in a preexisting architecture.
- Stored plans can be implicit in learned chains of responses. There is no construction of new plans or structural descriptions in advance of execution, though implicit plan learning or plan modification can occur (e.g., by learning new condition-action chains).
- There is no explicit creation of alternative new structures (e.g., descriptions of action sequences) followed by evaluation and selection.
- Conflicts between reactions triggered simultaneously may be handled by vector addition, winner-takes-all nets, or conflict resolution rules triggered by the existence of conflicts—implicit reactive motive comparators.
- Most behaviors are genetically determined, though there may also be some learning (e.g., modification of tunable control loops, change of weights by reinforcement learning).
- Disaster may follow if the environment requires new plan structures. (In some species this may be partly compensated for by having large numbers of expendable agents.)

There are different classes of reactive architectures. Some use several processing layers: for example, hierarchical control loops, or subsumption architectures. Some include actions that manipulate internal state, including states representing temporary goals, as pointed out above and in Nilsson (1994). If reactive mechanisms (like many human forms of expertise) require rapid detection of fairly abstract features of the environment or rapid invocation of complex structured actions, then the perceptual and action ‘‘towers’’ may be layered, with hierarchical concurrent processing. If the reactive system includes a global alarm mechanism (as depicted in figure 3.9), it allows additional features, including rapid redirection of the whole system in light of sudden dangers or sudden opportunities. This may include such behaviors as freezing, fighting, attacking, feeding (pouncing), fleeing, mating, and more specific trained or innate automatic behavioral responses. Such reactions seem to be closely related to what Damasio (1994) and Picard (1997) call primary emotions. Another type of response (another sort of primary emotion) that may be rapidly produced by an alarm system is general arousal and alertness (e.g., attending carefully to certain sights, sounds, or smells, or increased general vigilance). If this is a purely internal change, it is distinct from the type of emotion that involves changes in external behavior or externally detectable physiological changes. These alarm-driven reactions producing global changes can also occur in ‘‘hybrid’’ organisms that have deliberative as well as reactive components.

Figure 3.9 A reactive system with alarm mechanism. How to design an insect with emotions?
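As an illustration only, here is a toy purely reactive agent in Python: a fixed set of condition-action rules plus a global alarm whose patterns are checked first and override everything else, loosely corresponding to the primary emotions just mentioned. The rules and percepts are invented.

```python
class ReactiveAgent:
    """Condition-action rules plus a global alarm that overrides them.

    There is no 'what-if' reasoning and no general-purpose memory: each
    rule reads the current percept and maps it directly onto a response.
    """
    RULES = [                         # ordinary reactive behaviours
        (lambda p: "food" in p,     "approach and feed"),
        (lambda p: "mate" in p,     "court"),
        (lambda p: "obstacle" in p, "turn left"),
    ]
    ALARMS = [                        # fast global overrides (primary emotions)
        (lambda p: "looming shadow" in p, "freeze"),
        (lambda p: "predator" in p,       "flee"),
    ]

    def act(self, percept):
        for condition, response in self.ALARMS:     # alarm checks come first
            if condition(percept):
                return response + "  (global alarm)"
        for condition, response in self.RULES:
            if condition(percept):
                return response
        return "keep wandering"


agent = ReactiveAgent()
for p in ["food ahead", "looming shadow overhead", "obstacle on the right"]:
    print(p, "->", agent.act(p))
```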
Hybrid Reactive and Deliberative Organisms
In addition to reactive mechanisms, many animals closer to humans in design space seem to have a deliberative capability, at least in a simple form, allowing brief ‘‘look ahead’’—for example, some mammals and perhaps some birds, though it is hard to be sure exactly what is going on in such cases. In 1925, Köhler reported studies of creative problem solving in chimpanzees that seemed to demonstrate an ability to ‘‘think ahead’’ and grasp the consequences of actions not yet performed—an ability that varied between individual apes. A hybrid reactive and deliberative architecture is depicted schematically in figure 3.10, including reactive and deliberative layers, along with a global alarm system, and some of the mechanisms mentioned previously in section 3.9. Some of the features of a deliberative mechanism in a hybrid architecture have been discussed above. To recapitulate:

- Explicit goals or motives, possibly with considerable syntactic richness supported by a reusable short-term memory, are manipulated and can drive the creation of plans.
- Descriptions of new possible sequences of actions are constructed and evaluated, without the actions being performed.
- A reusable general purpose short-term memory mechanism is required, which can be used for a variety of different tasks at different times, storing different information—unlike reactive systems, where all state information is in dedicated components.
- Plans found useful repeatedly can be transferred (by a skill compiler) to the reactive layer, where they can be invoked and executed quickly.
- Sensory mechanisms operate concurrently at different levels of abstraction, and high level (abstract) perceptual layers produce ‘‘chunked’’ information that is useful for stored generalizations required for ‘‘what-if’’ reasoning.
- Action mechanisms may also be layered, with the higher levels accepting more abstract descriptions of actions, which are automatically decomposed into lower-level actions.
- As explained earlier, the deliberative layer’s actions will be largely serial and comparatively slow.

Figure 3.10 A hybrid reactive and deliberative agent, with alarms. This type of hybrid architecture includes some of the mechanisms discussed in section 3.9, including motive generators, a content addressable long term memory, a variable threshold attention filter, and layered perception and action systems.
A fast-changing environment (including bodily changes triggering events in the reactive layer) can cause too many interrupts and new goals for a speed-limited deliberative layer to be able to process them all. Filtering such interrupts and new goals using dynamically varying interrupt thresholds (Sloman 1987a; Beaudoin 1994; Wright 1997a) may reduce the problem, as explained above. The filter has to act quickly and operate without using deliberative resources or sophisticated reasoning capabilities. Consequently, it is likely to make errors, though it may be trainable so that its heuristics fit the environment. A hybrid reactive/deliberative system is likely to require a global alarm mechanism for the reasons given in section 3.8. In addition to the effects an alarm mechanism has on reactive systems, in a hybrid system it might have new features. The alarm mechanism might, for instance, be triggered by the occurrence of certain hypothetical future or past possibilities represented in the ‘‘what-if’’ reasoning mechanism, instead of being triggered only by actual occurrences when they are detected by sensors and internal state monitors. In addition, in a hybrid system the alarm mechanism might be able to interrupt and redirect or modulate the deliberative layer as well as triggering changed external behaviors and modulating the reactive processes. These extensions to the functionality of the alarm mechanism could produce new sorts of states, such as becoming apprehensive about anticipated danger and later being relieved at discovering that the anticipated unpleasant events have not occurred. The alarm system may also produce a host of more specialized learned effects on the deliberative system (e.g., switching modes of thinking in specific ways that depend on current goals and the current environment). For instance, detection of rapidly approaching danger might trigger a new, less careful and less detailed mode of deliberation in order to increase the chance of finding a suitable plan of evasion rapidly, even if it is not necessarily the best plan. Where deliberative mechanisms can trigger global events through the alarm system, this seems to correspond to cases where cognitive processes trigger secondary emotions, as described in Damasio (1994). We can distinguish two types of secondary emotion produced in this way:
1. Purely central secondary emotions, where a certain pattern of activity in the deliberative mechanism causes the alarm mechanism to trigger certain rapid changes in the type or content of deliberative processing, but without producing any new external behavior or externally detectable physiological changes.
2. Partly peripheral secondary emotions, where the new global signals change not only central but also ‘‘peripheral,’’ externally detectable states of the body, such as sweating, muscular tension, changes in blood flow, and so on (described as sentic modulation by Picard 1997).

There may also be certain classes of secondary emotions that produce only peripheral changes without any modification of internal deliberative processes. Lie detectors may be dependent on such mechanisms. Notice that within the context of an architecture, we can define a collection of concepts distinguishing various types of processes independently of whether those processes are or are not actually found in real organisms. In other words, the architecturally defined concepts allow us to formulate empirical research questions that can then be answered by investigating organisms with the architectures. Yet more such concepts can be supported if a metamanagement layer is added to the architecture.
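A small sketch of a purely central secondary emotion of the kind described in item 1, assuming a toy plan representation in which each step carries a risk estimate; an imagined disaster during ‘‘what-if’’ look-ahead triggers the alarm, which only switches the deliberative layer into a faster, coarser planning mode. All names are invented.

```python
class HybridAgent:
    """Deliberative look-ahead whose *imagined* outcomes can trigger the alarm.

    The alarm does not change external behaviour here; it only switches the
    deliberative layer into a faster, coarser planning mode ('apprehension'
    as a purely central secondary emotion).
    """
    def __init__(self):
        self.mode = "careful"

    def alarm(self, reason):
        # Rapid global signal: trade planning depth for speed.
        self.mode = "quick-and-dirty"
        print(f"alarm: {reason} -> switch to {self.mode} deliberation")

    def deliberate(self, plan):
        for step, risk in plan:                    # 'what-if' simulation
            if risk > 0.8:                         # imagined disaster
                self.alarm(f"step '{step}' could go horribly wrong")
                break
        depth = 5 if self.mode == "careful" else 1
        return f"plan evaluated to depth {depth}"


agent = HybridAgent()
print(agent.deliberate([("cross the bridge", 0.2), ("climb the cliff", 0.9)]))
```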
Metamanagement with Alarms
We now return to another look at the complete system postulated previously, and depicted schematically in figure 3.7 and figure 3.8. We can present this sort of architecture in a slightly enriched form, but still schematically, as in figure 3.11. Metamanagement processes allow the following processes, which are not possible in the previous systems:

1. Concurrent self monitoring (of many internal processes, possibly including some intermediate databases in a layered perceptual system).
2. Categorization and evaluation of current states and processes, including those in the deliberative layer.
3. Self-modification (self-control)—though this may be partial, because concurrently active, reactive, or alarm mechanisms may be able to disrupt metamanagement processes—for example, because new items exceed the current threshold in the attention-filter mechanism mentioned previously.

Figure 3.11 A three-layered system with alarms: the Cogaff architecture. A metamanagement mechanism receives information from many parts of the system. On this basis it can classify, evaluate, and to some extent control various internal states and processes, including deliberative processes. However, it may be distracted or interrupted by information coming from the reactive mechanisms or perceptual system, or by global control signals from alarm mechanisms. The variable threshold interrupt filter mentioned previously can be used to protect both the deliberative and the metamanagement processes.

Why should this layer evolve? There are several functions that it could perform. Deliberative mechanisms with evolutionarily determined strategies may be too rigid. Internal monitoring and evaluation mechanisms may help the organism improve its planning and reasoning methods by doing the following:

- Detect situations when attention should be redirected (e.g., when working on one problem produces information that acts as a reminder of some other important task)
- Improve the allocation of scarce deliberative resources (e.g., detecting ‘‘busy’’ states and varying the interrupt threshold in a more context-sensitive fashion, compared with an automatic and inflexible adjunct to a resource-limited deliberative system)
- Record events, problems, decisions taken by the deliberative mechanism (e.g., to feed into learning mechanisms)
- Detect management patterns, such as that certain deliberative strategies work well only in certain conditions
- Allow exploration of new internal problem solving strategies, categorizations, self-description concepts, evaluation procedures, and generalizations about consequences of internal processes
- Allow diagnosis of injuries, illness, and other problems by describing internal symptoms to others with more experience
- Evaluate high-level strategies, relative to high-level long-term generic objectives or standards
- Using aspects of intermediate visual representations to communicate more effectively with others—for example, by using viewpoint-centered appearances to help direct someone else’s attention (‘‘look at where the hillside meets the left edge of the roof’’) or using drawings to communicate how things look (having access to contents of intermediate perceptual buffers and other internal states may cause some philosophically inclined robots to discover sensory qualia, and perhaps start wondering whether humans have them too!)

By doing all these things, metamanagement can promote various kinds of learning and development; reduce the frequency of failure in tasks; prevent one adopted goal from interfering with other goals; prevent endless, time-wasting efforts on problems that turn out not to be solvable; notice that a particular planning or problem solving strategy is slow and resource consuming, then possibly replace it with a faster or more elegant one; detect possibilities for structure sharing among actions; and finally allow more subtle cultural influences on behavior. However, none of these functions is likely to be performed perfectly, for various reasons. For instance, self-monitoring cannot give complete and error-free access to all the details of internal states and processing: It is just a form of perception, and like all forms of perception, it will involve abstracting, interpreting, and possibly introducing errors. Even when self-observations are accurate, self-evaluations may be ill judged and unproductive—for example, because some incorrect
generalizations from previous experiences cause wrong decisions to be made about what does and does not work, or because religious indoctrination causes some people to categorize normal healthy thoughts and motives as ‘‘sinful.’’ Even when self-observations are accurate, and self-evaluations are not flawed, decisions about what to do may not be carried out either because some of the processes are not within the control of metamanagement (e.g., you cannot stop yourself blushing) or because processes that are partly under its control can also be disrupted by other processes (e.g., intrusive noise or other salient percepts or features of the situation that trigger the alarm system to intervene when it would be better not to). The latter are characteristic of tertiary emotions (perturbances), where there is partial loss of control of attention. This is possible only in the presence of metamanagement, which allows some control of attention. Contrary to the claims of some theorists (perhaps Damasio and Picard?) there could be emotions at a purely cognitive level—an alarm mechanism triggered by events in the deliberative mechanism interrupting and diverting processing in deliberative and metamanagement systems without going through the primary emotion system. Some people are more prone than others to react with bodily symptoms, including externally detectable symptoms, when secondary or tertiary emotions occur. We could therefore distinguish partly peripheral and purely central secondary and tertiary emotions, producing four subcategories in addition to primary emotions.
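These distinctions can be encoded directly. The sketch below (an illustrative coding, not an official taxonomy) labels an emotion episode by the layer that triggered the alarm, whether metamanagement is disrupted, and whether peripheral bodily changes occur, yielding the primary category plus the four subcategories just mentioned.

```python
from dataclasses import dataclass

@dataclass
class EmotionEpisode:
    trigger_layer: str         # "reactive", "deliberative", or "metamanagement"
    disrupts_metamanagement: bool
    peripheral_changes: bool   # sweating, blood flow, posture, ...

def classify(ep):
    """Architecture-based labels, following the distinctions drawn above."""
    if ep.trigger_layer == "reactive":
        return "primary"
    kind = "tertiary" if ep.disrupts_metamanagement else "secondary"
    locus = "partly peripheral" if ep.peripheral_changes else "purely central"
    return f"{locus} {kind}"

episodes = [
    EmotionEpisode("reactive", False, True),        # startled by a loud bang
    EmotionEpisode("deliberative", False, False),   # quiet apprehension while planning
    EmotionEpisode("deliberative", True, True),     # grief repeatedly grabbing attention
]
for ep in episodes:
    print(classify(ep))
```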
3.11 Architectural Layers and Types of Emotions
Our everyday attributions of emotions, moods, attitudes, desires, and other affective states implicitly presuppose that people are information processors. This is partly because they generally have semantic content, including reference to the objects of the emotions. For instance, you cannot be angry without being angry about something, and to long for something, you need to know of its existence, its remoteness, and the possibility of being together again. That information plays a central role in the production and character of the emotion. Besides these semantic information states, anger, longing, and other emotions also involve complex control states. One who has
deep longing for X does not merely occasionally think it would be wonderful to be with X. In deep longing, thoughts are often uncontrollably drawn to X. Physiological processes (outside the brain), such as changes in posture, facial expression, blood pressure, muscular tensions, or amount of sweat produced, may be involved in some cases—for example, in primary emotions and in a subset of secondary and tertiary emotions, but not necessarily in all emotions. The importance of such physiological reactions is often overstressed by experimental psychologists under the influence of the James-Lange theory of emotions. Contrast the views of Oatley and Jenkins (1996), and what poets say about emotions. The fact that some theorists regard certain physiological phenomena as defining emotions, or somehow central to all emotions, illustrates the comparison in section 3.2 with a collection of blind men each trying to understand what an elephant is on the basis of the part he can feel. On the basis of our theory, we would predict such misunderstandings. On this theory, many sorts of control phenomena, including not only production of physical changes, but also redirection of perception, thought processes, attention, motivation, and actions, are possible if the information processing architecture is rich enough. In particular, the various architectural layers discussed above, along with the alarm system and mechanisms for generating new ‘‘high-insistence’’ motives, can explain (at least in outline) what have been called primary, secondary, and tertiary emotions. Primary emotions involve events triggered in reactive mechanisms that cause something like an alarm system to rapidly redirect reactive behaviors and usually also perceptual and motor subsystems. This can include such things as being startled, being disgusted by horrible sights and smells, being terrified by a large fast-approaching object, and perhaps various kinds of sexual and aesthetic arousal, though they should probably be given separate categories. In simple organisms, such primary emotions may provoke fighting, fleeing, freezing, and so forth. In more sophisticated cases, they could produce readiness for various kinds of actions rather than those actions themselves, along with increased general, or directed, alertness. Secondary emotions are triggered by events in the deliberative mechanism, during ‘‘what-if’’ reasoning processes, or by perceiving something relevant to an important deliberative activity or state
(e.g., a long-term goal). Such secondary emotions could occur during planning, during reflections on past events, or during idle thinking or reminiscences. The results could be various kinds of anxiety, relief, fear, pleasure at unexpected success, and so on. An example would be noticing during planning that an action being contemplated could go horribly wrong. This might produce a mixture of apprehension and extreme attentiveness to the task. The effects of such emotions may or may not include those typical of primary emotions. Within different architectural frameworks we can explore different kinds of secondary emotions and their various causes and effects. For instance, in an architecture that supports long-term dormant attitudes, a new high-level percept (e.g., unexpectedly seeing an animal mistreated) can interact with a dormant attitude (e.g., love of animals) to generate a host of reactions including generating new goals and mobilizing resources to achieve those goals. Secondary emotions will often include effects characteristic of tertiary emotions—namely, disruption or redirection of metamanagement processes, where the architecture includes metamanagement. Tertiary emotions depend on the third type of architectural layer (metamanagement, reflection), which provides capabilities for self-monitoring, self-evaluation and (partial) self-control, including control of attention and thought processes. In that case, there is also the possibility of some loss of control, which we previously called ‘‘perturbance,’’ and now refer to as ‘‘tertiary emotions.’’ This can include such states as feeling overwhelmed with shame, feeling humiliated, being infatuated or besotted, various aspects of grief, anger, excited anticipation, pride, and many more. These are typically human emotions, and are generally the stuff of plays, novels, and gossip. Many, though not all, of them involve social interactions, or the possibility of various kinds of social interactions. It is possible that some primates have simplified versions of tertiary emotions. Most socially important human emotions require having sophisticated concepts and knowledge and rich control mechanisms embedded in sophisticated architectures. Some of these emotions (e.g., patriotic fervour, dismay at the success of a despised politician, resenting being passed over for promotion, delight at solving a famous hard mathematical problem) are too semantically rich and complex to have ‘‘natural’’ behavioral expressions, though they can be expressed in language. (Many of the speeches written
by Shakespeare and other great playwrights are marvelous examples of the use of language to express emotions.) The situation is somewhat more complex than the previous remarks may appear to indicate. First of all, actual emotional states may be a mixture of all three kinds and other things, and the labels we use do not necessarily identify a simple category. In particular, the things we call ‘‘love,’’ ‘‘hate,’’ ‘‘jealousy,’’ ‘‘pride,’’ ‘‘ambition,’’ ‘‘embarrassment,’’ ‘‘grief,’’ ‘‘infatuation’’ can involve effects found in all three categories, because in ordinary usage, these words do not refer unambiguously to any of the particular categories that can be defined precisely in terms of the underlying architecture. Further, as shown in the previous section, the sort of architecture we have been discussing could potentially explain more varieties of emotions that differ according to which parts of the system are affected. Moreover, because we are not discussing static states but developing processes, with very varied etiology, more subtle distinctions could be made according to the dynamics of different sorts of processes, how they are generated, which kinds of feedback loops they trigger, how long they last, how their intensity rises and falls, which long term effects they have, and so on. More generally, within the sort of framework we have been presenting, we can begin to define an architecture-based ontology of mind (Sloman, to appear). Different sorts of architectures, found in different animals, and in humans at different stages of development, or with various kinds of brain damage, will support different mental ontologies. In particular, different animals and different sorts of humans will not all be capable of the same classes of emotions or emotion-like states. For instance, as suggested above, if newborn human infants do not have adult information processing architectures, then they will not be capable of having burning ambition, religious guilt, or national pride. From this viewpoint, it is a mistake to claim that all sorts of emotions have physiological effects outside the brain in the manner suggested by William James. Some will and some will not (e.g., the purely central secondary and tertiary emotions, when the architecture supports them). For which individuals or species this is possible will be an empirical question, not a question of definition. Likewise, it is a mistake to claim, as many do, that having an emotion necessarily involves being aware of having that emotion.
Leaving aside the ambiguity or unclarity of the concept of ‘‘being aware’’ there is first of all the fact that people can often be in emotional states that are evident to others but not to them (e.g., being angry or jealous or infatuated: a point often used in plays and novels). Second, from an architectural viewpoint, if being aware of X involves metamanagement capabilities and having self-monitoring capabilities directed at X, then organisms (or very young infants) that lack metamanagement may be incapable of being aware of internal states such as primary emotions, even if they have them. By defining various sorts of awareness in terms of the architectures and mechanisms that support them, we can replace one ambiguous, unanswerable question with a collection of more precise empirical questions. We can also use this to explain the existence of qualia, if they involve kinds of self-awareness made possible by the action of the third layer. The theories presented here are highly speculative. But that does not make them false! If they are close to the truth, then that has many implications for our understanding of many topics in cognitive science—for example, regarding kinds of individual development and learning that are possible, the varieties of perceptual processes that can occur and the different sorts of affordances that are processed in parallel; varieties of brain damage that might occur and their possible effects; the kinds of emotions and other affective states and processes that can occur. Finally, it should be acknowledged that our architecture-based theory of types of emotions and other affective states is not necessarily in conflict with theories that emphasize other aspects of emotions, such as their causes, semantic contents, and their effects. For instance, the theory in Ortony, Clore, and Collins (1988) concentrates on aspects of emotions and attitudes that are mostly orthogonal to the architectural issues and attempts to account for our pretheoretic emotional labels. A complete theory would need to encompass both viewpoints, though we see no reason to expect that pretheoretical taxonomies enshrined in colloquial language will survive unscathed.
3.12 What Sorts of Architectures Are Possible?
We know so little about possible information processing mechanisms and architectures (especially the extraordinarily powerful visual mechanisms implemented in animal brains) that it is premature to hope for a complete survey of types of architectures and their capabilities. It could turn out, as some have claimed, that any information processing architecture produced by millions of years of evolution is bound to be far too messy and unstructured for us to understand as engineers, scientists, or philosophers (see figure 3.1). Alternatively, it may turn out that evolution, like human designers, must use principles of modularity and reusability in order to achieve a robust and effective collection of architectures, such as we find in many kinds of animals. Figure 3.6 and figure 3.7 and our earlier discussion present more structured and modular architectures, combining a threefold division between perception, central processing, and action, and three levels of processing, with and without a global ‘‘alarm’’ mechanism. However, such diagrams can be misleading partly because they convey very different designs to different researchers. A frequent confusion is between diagrams indicating state transitions (flow charts) and diagrams indicating persisting, interacting components of an architecture. In the former, an arrow represents a possible change of state. In the latter, it represents flow of information between components. My diagrams are of the latter kind. To help us understand what to look for in naturally occurring architectures, it may be useful to attempt a preliminary overview of some features of architectures that have already been proposed or implemented. We can then begin to understand the trade-offs between various options and that should help us to understand the evolutionary pressures that shaped our minds. Researchers on architectures often propose a collection of layers. The idea of hierarchic control systems is very old, both in connection with analog feedback control and more recently in AI systems. There are many proposals for architectures with three or more layers, including not only ours but also those described by Albus (1981) and Nilsson (1998) mentioned previously, the subsumption architecture of Brooks (1991), the ideas in Johnson-Laird’s discussion (1993) of consciousness as depending on a high-level ‘‘operating system,’’ the multilevel architecture proposed for story understanding in Okada and Endo (1992), Minsky’s notion of A, B, and C brains (1987, section 6.4), and many others. On closer inspection, the layering in multilevel architectures means different things to different researchers, and in particular, different researchers refer to a so-called three-layer architecture but propose very different distinctions between the layers.
There seem to be several orthogonal distinctions at work, which, at present, I can classify only in a very crude fashion. The following should be read as a first categorization based on Sloman (2000b), which is likely to be revised in the near future.
Concurrently Active versus Pipelined Layers
In Albus (1981) and some of what Nilsson (1998) writes, the layers have a sequential processing function: Sensory information comes in (e.g., on the ‘‘left’’) via sensors to the bottom layer, gets abstracted and interpreted as it goes up through higher layers, then near the top, some decision is taken on the basis of the last stage of abstraction or interpretation, and then control information flows down through the layers and out to the motors (on the other side). I call this an omega architecture because the pattern of information flow is shaped like an Ω. Many AI models have this style. This can include hybrid architectures—for example, where the lower levels are competing neural nets whose activity is triggered by incoming sensory information and the higher levels are symbolic processes that help to select one of the competing subnets. An alternative is an architecture where the different layers are all concurrently active, with various kinds of control and other information constantly flowing within and between them in both directions, as in figure 3.6 and the ‘‘Cogaff’’ architecture in figure 3.11.
Dominance Hierarchies versus Functional Differentiation
A second distinction concerns whether higher levels dominate lower levels or merely attempt to control them, not always successfully and sometimes with the direction of control reversed. In the subsumption model (Brooks 1991), higher levels not only deal with more abstract state specifications, goals, and strategies, but also completely dominate lower levels (i.e., they can turn lower level behavior off, speed it up, slow it down, modulate it in other ways, etc.). This conforms to the standard idea of hierarchical control in engineering. By contrast, in a nonsubsumptive layered architecture (such as the Cogaff architecture) the ‘‘higher’’ levels manipulate more sophisticated and abstract information, but do not necessarily dominate the lower levels, although they may sometimes attempt
to do so. Higher levels may be able partially to control the lower levels but sometimes they lose control, either via alarm mechanisms or because other influences divert attention. For instance, attention can be diverted by sensory input with high salience (loud noises, bright flashes) and by newly generated motives with high ‘‘insistence’’ (e.g., hunger, sitting on a hard chair, etc.). In the Cogaff model, the majority of lower level reactive mechanisms cannot be directly controlled by the deliberative and metamanagement layers, especially those concerned with controlling bodily functions. Some training may be possible, however—a possibility allowed in the next dimension of variation.
Direct Control versus Trainability
In some layered systems, it is assumed that higher levels can directly control lower levels. A separate form of control that is not ‘‘immediate’’ is retraining. It is clear that in humans, higher levels can sometimes retrain lower levels even when they cannot directly control them. For instance, repeated performance of certain sequences of actions carefully controlled by the deliberative layer can cause a reactive layer to develop new chained condition-action behavior sequences, which can later run without higher level supervision. Fluent readers, skilled athletes, and musical sight-readers all make use of this. (The nature of the interface between central mechanisms and action control mechanisms, discussed in the section entitled Central to Motor Connections, below, is relevant here.)
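A minimal sketch of such retraining, assuming a plan is just an ordered list of (state, action) pairs rehearsed under deliberative control; ‘‘compiling’’ it simply installs the pairs as condition-action rules in the reactive layer. The names are invented.

```python
class ReactiveLayer:
    def __init__(self):
        self.rules = {}                      # condition (state) -> action

    def act(self, state):
        return self.rules.get(state, "no compiled response")


def compile_skill(plan, reactive_layer):
    """Transfer a repeatedly rehearsed plan into chained condition-action rules.

    After compilation the same sequence can run in the reactive layer
    without higher-level supervision.
    """
    for state, action in plan:
        reactive_layer.rules[state] = action


sight_reading = [("see C major chord", "place left hand"),
                 ("left hand placed", "play arpeggio"),
                 ("arpeggio done", "read next bar")]

reactive = ReactiveLayer()
compile_skill(sight_reading, reactive)
print(reactive.act("see C major chord"))     # now runs without deliberation
```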
Different Kinds of Processing versus Different Control Functions
On some models, different layers all use the same kinds of processing mechanisms (e.g., reactive behaviors) but perform different functions (e.g., because they operate at different levels of abstraction). In other models, there are different kinds of processing as well as different functional roles. For instance, our figures showing layered architectures include a lowest level that is purely reactive, whereas the second and third levels can do deliberative, ‘‘what-if’’ reasoning, using mechanisms able to represent possible future actions and consequences of actions, categorize them, evaluate them, and make selections. This is not how reactive systems behave.
Traditional AI planning systems can do this, and similar mechanisms may be relevant to explaining past events, doing mathematical reasoning, or doing general reasoning about counterfactual conditionals. However, it is likely (indeed inevitable) that the deliberative mechanisms that go beyond reactive mechanisms in explicitly representing alternative actions prior to selection are themselves implemented in reactive mechanisms—for example, reactive mechanisms that operate on structures in a temporary work space. Reactive mechanisms may be implemented in various kinds of lower level mechanisms, including chemical, neural, and symbolic information processing engines, and it is possible that the reliance on these is different at different levels in the architecture. Some kinds of high-level global control may use chemical mechanisms, which would be too slow and unstructured for intricate problem solving. Some have argued that human capabilities require quantum mechanisms, though I have never seen a convincing account of how they could explain any detailed mental phenomena.
Where Are Springs of Action?
A fifth distinction concerns whether new ‘‘intrinsic’’ motives (which are not subgoals generated in a planning process) all come from a single layer or whether they can originate in any layer. In one variant of the omega model, information flows up the layers and triggers motivational mechanisms only at the top. In other models, processes anywhere in the system may include motive generators. For instance, in the Cogaff architecture, physiological monitors in the reactive layer can generate new goals (e.g., to acquire food, to adjust posture, to get warmer). Some of the motives thus generated may be handled entirely by reactive goal-directed behaviors, while others have to be transferred to the deliberative layer for evaluation, adoption or rejection, and possibly planning if there is no previously stored strategy for achieving such goals in this sort of context. A full survey of theories of motive generation is beyond the scope of this chapter, but it is perhaps worth noting that not all motives benefit the individual when satisfied. In particular, it would be useful to attempt to explain various kinds of addictions (to drugs, eating, sex, power over others, gambling, computer games, etc.) in terms of sources of motivation in an architecture.
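The following sketch (invented names and thresholds) shows a reactive-layer motive generator driven by physiological variables; the routing rule, sending low-insistence motives to reactive behaviours and the rest to the deliberative layer, is an assumption made for the example rather than part of the theory.

```python
class Motive:
    def __init__(self, goal, insistence):
        self.goal = goal
        self.insistence = insistence


def physiological_monitor(state):
    """A reactive-layer motive generator: bodily variables spawn new motives."""
    motives = []
    if state["blood_sugar"] < 0.3:
        motives.append(Motive("acquire food", insistence=1.0 - state["blood_sugar"]))
    if state["posture_discomfort"] > 0.5:
        motives.append(Motive("adjust posture", insistence=state["posture_discomfort"] * 0.4))
    return motives


def route(motives, reactive_threshold=0.5):
    # Low-insistence motives are handled by reactive goal-directed behaviours;
    # the rest are handed to the deliberative layer for evaluation and planning.
    for m in motives:
        layer = "reactive behaviour" if m.insistence < reactive_threshold else "deliberative layer"
        print(f"{m.goal} (insistence {m.insistence:.2f}) -> {layer}")


route(physiological_monitor({"blood_sugar": 0.2, "posture_discomfort": 0.8}))
```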
Handling Competing Motives
Not all motives will be mutually consistent, so there has to be some way of dealing with conflicts. Architectures differ regarding the locus of such conflict resolution and the mechanisms deployed. For instance, in some forms of contention-scheduling models, schemata form coalitions and oppositions on the basis of fixed excitatory and inhibitory links in a network, and then some kind of numerical summation leads to selection, which is always done at the same level in the hierarchy. In other models, the detection of conflicts might use symbolic reasoning, and the resolution might be done at different levels for different sorts of conflicts. For instance, the decision whether to stay home and help granny or go to the marvelous concert might be handled in one part of the system, the decision whether to continue uttering the current unfinished sentence or to stop and take a breath would be handled in another part, and the decision to use placatory or abusive vocabulary when addressing someone who has angered you might be handled by yet another part of the system. In the last example, two parts might compete for control: a reactive part generating an impulse to be abusive, and a deliberative or metamanagement mechanism deciding that only placatory words will do any good in the long run.
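Here is a minimal version of the numerical, contention-scheduling style of conflict resolution mentioned above (fixed excitatory and inhibitory links, then winner takes all); the weights and motives are invented, and symbolic or level-specific resolution would of course look quite different.

```python
def contention_scheduling(motives, links):
    """Coalitions and oppositions via fixed links, then winner-takes-all.

    motives: {name: base_activation}
    links:   {(a, b): weight}, positive = excitatory, negative = inhibitory
    """
    net = dict(motives)
    for (a, b), w in links.items():
        net[b] = net.get(b, 0.0) + w * motives.get(a, 0.0)
    return max(net, key=net.get)


motives = {"help granny": 0.6, "go to concert": 0.7, "be abusive": 0.4}
links = {("help granny", "go to concert"): -0.5,     # mutually inhibitory
         ("go to concert", "help granny"): -0.5,
         ("help granny", "be abusive"): -0.3}
print("winner:", contention_scheduling(motives, links))
```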
Perceptual-to-Central Connections
Architectures with perceptual components differ in the relationships they propose between modes of processing in perceptual modules and more central layers. For example, is the perceptual processing itself layered, producing different levels of perceptual information to feed into different central layers, or is there a fixed entry level into the central mechanisms, after which the information may or may not be passed up a hierarchy, as in the omega model, and in Fodor’s model depicted in figure 3.2? The omega model could be described as using a ‘‘peephole’’ model of perception: Sensory information comes in via a limited orifice and then has to be processed and interpreted centrally. The Fodor model (in the version proposed by Marr 1982) could be described as the ‘‘telescope’’ model, in which information arrives through a narrow orifice after several layers of specialized processing. Both can be contrasted with the ‘‘multiwindow’’ model of perception in the Cogaff model presented above.
In ‘‘peephole’’ or ‘‘telescope’’ perceptual systems, the sensory mechanisms (simple transducers or more complex sensory analyzers) produce information about the environment and direct it all to some component of the central architecture. That may trigger processes that affect other parts. In figure 3.6 and subsequent figures, it is suggested that the perceptual processes are themselves layered, handling different levels of abstraction concurrently, with a mixture of top-down and bottom-up processing, and with different routes into different parts of the central system. For instance, deliberative mechanisms may need perceptual information chunked at a fairly high level of abstraction, whereas fine control of movement may require precise and continuously varying input into the reactive system. Differential effects of different kinds of brain damage seem to support the multiwindow multipathway model, which can also be defended on engineering grounds. (The multiwindow model was defended at greater length in Sloman 1989, though not with that label.)
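A toy sketch of the ‘‘multiwindow’’ idea, assuming three placeholder processing functions: each perceptual layer produces information at a different level of abstraction and routes it directly to a different central layer, instead of everything entering at a single point.

```python
def multiwindow_perception(raw_signal):
    """Concurrent perceptual layers routing different abstractions to
    different central layers (contrast 'peephole' perception, where all
    sensory information enters the central system at one point).
    The three processing steps here are placeholders, not real vision.
    """
    low = [x * 0.01 for x in raw_signal]            # fine detail -> reactive layer
    mid = sum(raw_signal) / len(raw_signal)         # object-level summary -> deliberative layer
    high = "cluttered" if mid > 50 else "clear"     # abstract affordance -> metamanagement
    return {"reactive": low, "deliberative": mid, "metamanagement": high}


routed = multiwindow_perception([12, 80, 43, 97])
for layer, info in routed.items():
    print(layer, "<-", info)
```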
Central-to-Motor Connections
An analogous distinction concerns the relationship between central and motor processing. Just as there is what I called ‘‘multiwindow’’ perception, ‘‘telescope’’ perception, and ‘‘peephole’’ perception, so too with action. At one extreme there is only a ‘‘narrow’’ channel linking the motor system only with the lowest level central mechanism, as in the omega model: There are motors and they all get signals directly from one part of the central mechanism (analogous to ‘‘peephole’’ perception). At another extreme, there can be a layered, hierarchical motor control system where control information of different sorts comes in directly at different levels, from different layers in the central system. In between is the ‘‘reverse telescope’’ model, where only fairly high-level abstract instructions (e.g., something like ‘‘pick up the hammer’’) are handed to the motor system, which then produces one or more transformations down to detailed motor instructions or muscle control signals. The Fodor model in figure 3.2 seems to cover both peephole and reverse telescope models of action. Humans seem to have motor systems with complex hierarchical skills, probably as do many other animals. An example supporting the multiwindow model is our ability to perform some well-rehearsed skill based on reactive behaviors while modulating it
under the control of high-level preferences and goals—for instance, while playing a musical instrument, or making a speech, or acting on the stage in such a way as to give the impression of being furtive, or confident, or caring, and so on. In some proposed architectures (e.g., Albus 1981) this hierarchical organization of action is acknowledged, but instead of the action hierarchy being a separate ‘‘tower’’ with its own layers communicating with several central processing layers, it is folded into the central control hierarchy, as if the reverse telescope is part of the central cognitive mechanism. This might give the hierarchical action mechanism more access to powerful central reasoning mechanisms and information stores, but would reduce the opportunities for central processes (e.g., planning, problem solving) to continue uninterrupted while complex actions are being carried out. The different models could be construed as describing similar systems viewed differently. However, I believe there are significant engineering design differences, with complex trade-offs that have not yet been investigated fully. In particular, designing the central system and the action systems with both having hierarchic organization and both able to operate concurrently has advantages for an organism or robot that needs to be able to think a long way ahead while carrying out complex actions. Moreover, allowing different parts of the action hierarchy to receive instructions directly from different parts of the central system allows control information to be channeled directly to where it is needed without all having to go through a selection bottleneck. Conflicts can be detected within the action ‘‘tower’’ and may either be resolved there, if they are routine conflicts, or may trigger some sort of interrupt in the central mechanisms. I conjecture that thinking about such design trade-offs may help us understand how the whole system evolved in humans and other animals as well as helping us come up with better engineering designs. Similar comments are applicable to the trade-offs involved in different architectures for perception.
Emergence versus ‘‘Boxes’’
One of the notable features of recent AI literature is the proliferation of architecture diagrams in which there is a special box labelled ‘‘emotions.’’ Contrast our figures, where there is no
specific component whose function is to produce emotions. Instead we explain several varieties of emotions as emergent properties of interactions between components that are there for other reasons, such as alarm mechanisms and mechanisms for diverting attention (which often happens without any emotion being generated). This is often compared with the emergence of thrashing in a multiprocessing architecture. The thrashing is a result of interactions between mechanisms for paging, swapping, and allocating resources fairly when there is a heavy load. There is no special thrashing module. As with emotions, thrashing may or may not be detected by the system within which it occurs: This depends on the availability of sophisticated self-monitoring processes. 6 Disagreements over whether to label components of an architecture using the word ‘‘emotion’’ may be partly terminological: For example, some theorists write as if all motives are emotions. Then a component that can generate motives may be described as an ‘‘emotion generator’’ by one person and as a ‘‘motive generator’’ by another. Separating them accords better with ordinary usage, because it is possible to have motives and desires without being at all emotional (e.g., when hungry), although intense desires can have the properties characteristic of disruptive emotional states. This is just one of many areas where we need far greater conceptual clarity, which may come in part from further study of varieties of architectures, their properties, and the states and processes they support. There are many cases where it is not clear whether some capability needs to be a component of the architecture, or an emergent feature of interactions between components. The attention filters in figure 3.10 and figure 3.11 are examples. Instead of using a special filtering mechanism, a design can produce the effects of filtering through interactions between competing components. The first alternative may be easier to implement and control. The second may be more flexible and general. There are many such design trade-offs still to be analyzed.
Dependence on Language
Some models postulate a close link between high-level internal processes and an external language. For instance, it is often suggested (Rolls 1998) that mechanisms analogous to metamanage-
ment could not exist without a public language used by social organisms, and in some of Dennett’s writings (1978, 1996) consciousness is explained as a kind of ‘‘talking to oneself.’’ A contrary view is that internal mechanisms and formalisms for deliberation and high level self-evaluation and control were necessary precursors to the development of human language as we know it. The truth is probably somewhere in between, with an interplay between the development of internal facilitating information processing mechanisms and social processes, which then influence and enhance those mechanisms—for instance, by allowing a culture to affect the development in individuals of categories for internal processes of self-evaluation (Freud’s ‘‘superego’’). However, it appears, from the capabilities of many animals without what we call language, that very rich and complex information processing mechanisms evolved long before external humanlike languages and probably still underpin them. Because the acquisition, storage, manipulation, retrieval, and use of information in such animals has important high-level features in common with uses of external languages (including the need for structural variability, extendability, and manipulability of the internal information medium), we can usefully extend the word language to refer to forms of internal representation and say that the use of language to think with (see with, desire with, intend with, plan with, etc.) is biologically prior to its use in external communication. This will almost certainly give us a better understanding of the phenomena normally referred to as language because of the way they depend on older inner languages. When we understand better how they work, we shall be in a better position to understand how social linguistic phenomena influence individual mental phenomena.
Purely Internal versus Partly External Implementation
A more subtle distinction concerns how far the implementation of an organism or intelligent artifact depends entirely on the internal mechanisms and how far the implementation is shared with the environment. The development in the 1970s of ‘‘compliant wrists’’ for robots, which made it far easier, for example, to program the ability to push a cylinder into a tightly fitting hole, illustrated the advantage in some cases of off-loading information processing into
mechanical interactions. Trail blazing and the design of ergonomically effective tools and furniture are other examples. From a philosophical viewpoint, a more interesting case is the ability to refer to a spatially located individual unambiguously. As explained long ago by Strawson (1959), whatever is within an individual cannot suffice to determine that some internal representation or thought refers to the Eiffel tower, as opposed to an exactly similar object on a ‘‘twin earth.’’ Instead, the referential capability depends in part on the agent’s causal and spatial relationships to the thing referred to. So attempting to implement all aspects of mental functioning entirely within a brain or robot is futile: There is always a subtle residue that depends on external relations, and an adequate theory of mind has to take account of that. In referring to parts of oneself, or parts of one’s own virtual machine, the problem of unique reference is solved partly by internal causal relationships (as explained in Sloman 1985, 1987b). Allowing that mental states with semantic content are not implemented solely in internal mechanisms distinguishes the theory developed here from common varieties of functionalism, which reduce mental phenomena to purely internal functional or computational phenomena. Our position is also distinct from methodological solipsism and other varieties of solipsism, because we allow semantic content to include reference to objects in the physical environment: a necessary condition for organisms or robots to function successfully.
Self-Bootstrapped Ontologies
I have been arguing that by analyzing each type of architecture, we can understand what sorts of processes can occur in it, and on that basis we can define an appropriate set of concepts for describing its ‘‘mental’’ states—that is, an appropriate mental ontology. However, some learning mechanisms can develop their own ways of clustering phenomena, according to what they have been exposed to and various other things, such as rewards and punishments. If a system with the kind of metamanagement layer depicted in the Cogaff architecture uses that ability on itself, it may develop a collection of concepts for categorizing its own internal states and processes that nobody else can understand because nobody else has been through that particular history of learning processes. The role those concepts play in subsequent internal
processing in such an architecture may exacerbate the uniqueness, complexity, and idiosyncratic character of its internal processing. For systems with that degree of sophistication and reflective capability, scientific understanding of what is going on within them may forever be limited to very coarse-grained categorizations and generalizations. This could be as true of robots as of humans, or bats (Nagel 1981).
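As a rough illustration of this point (a hedged sketch, not a claim about how any real system does it), an agent that clusters records of its own internal states ends up with category centroids whose meaning depends entirely on its individual history; a second agent with a different history would induce different, mutually unintelligible categories.

```python
# Hedged sketch: an agent clusters records of its own internal states (here,
# invented 2-D vectors, e.g. arousal vs. rate of attention shifts) and the
# resulting centroids become its private categories. Nothing here models a
# brain; the point is only that the categories depend on individual history.

import random

def kmeans(points, k, iters=10, seed=0):
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k),
                          key=lambda j: (p[0] - centroids[j][0]) ** 2 +
                                        (p[1] - centroids[j][1]) ** 2)
            clusters[nearest].append(p)
        centroids = [(sum(x for x, _ in c) / len(c), sum(y for _, y in c) / len(c))
                     if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids

# One agent's idiosyncratic history of internal states.
history = [(random.gauss(mx, 0.3), random.gauss(my, 0.3))
           for mx, my in [(0, 0), (2, 1), (1, 3)] for _ in range(30)]

print(kmeans(history, k=3))   # this agent's self-bootstrapped "mental categories"
```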
3.13 Humanlike Architectures
I have tried to bring out some of the design options that need to be faced when trying to explain the architecture of a human mind. When we understand what that architecture is, we can use it to define collections of concepts that will be useful for describing human mental states and processes, though we can expect to do that only to a certain degree of approximation for the reasons given in the preceding section. However, that degree of approximation may suffice to provide useful clarifications of many of our familiar concepts of mind, such as ‘‘belief,’’ ‘‘desire,’’ ‘‘intention,’’ ‘‘mood,’’ ‘‘emotion,’’ ‘‘awareness,’’ and many others. In particular, so many types of change are possible in such complex systems that we can expect to find our ordinary concepts of ‘‘learning’’ and ‘‘development’’ drowning in a sea of more precise architecture-based concepts. We may also be in a better position to understand how, after a certain stage of evolution, the architecture supported new types of interaction and the development of a culture—for instance, if the metamanagement layer, which monitors, categorizes, evaluates, and to some extent controls or redirects other parts of the system, absorbs many of its categories and its strategies from the culture. It seems that in humans the metamanagement layer is not a fixed system: Not only does it develop from very limited capabilities in infancy, but even in a normal adult it is as if there are different personalities ‘‘in charge’’ at different times and in different contexts (e.g., at home with the family, driving a car, in the office, at the pub with mates, being interviewed for a job, etc.). This suggests new ways of studying how a society or culture exerts subtle and powerful influences on individuals through the metamanagement processes. The existence of the third layer does not presuppose the existence of an external human language (e.g., chimpanzees may have some reflective capabilities), though, as argued above, it does presuppose the availability of some internal
information-bearing medium or formalism, as do the reactive and deliberative layers. When an external language develops, one of its functions may be to provide the categories and values to be used by individuals in judging their own mental processes (e.g., as selfish, or sinful, or clever, etc.). This would be a powerful form of social control, far more powerful than mechanisms for behavioral imitation, for instance. It might have evolved precisely because it allows what has been learned by a culture to be transmitted to later generations far more rapidly than if a genome had to be modified. However, even without this social role, the third layer would be useful to individuals, and that might have been a requirement for its original emergence in evolution. We can also hope to clarify more technical concepts. The common reference to ‘‘executive function’’ by psychologists and brain scientists seems to conflate aspects of the deliberative layer and aspects of the metamanagement layer. That they are different is shown by the existence of AI systems with sophisticated planning and problem solving and plan-execution capabilities without metamanagement (reflective) capabilities. A symptom would be a planner that doesn’t notice an obvious type of redundancy in the plan it produces, or subtle looping behavior. One consequence of having the third layer is the ability to attend to and reflect on one’s own mental states, which could cause intelligent robots to discover qualia, and wonder whether humans have them. All this should provide much food for thought for AI researchers working on multi-agent systems, as well as philosophers, brain scientists, social scientists, and biologists studying evolution.
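The difference can be illustrated with a small sketch (the names are hypothetical, and this is not the CogAff implementation or the SIM_AGENT toolkit): a naive planner that expands states without watching its own trace loops silently, while a metamanagement monitor attached to the same trace notices the loop at once.

```python
# Hypothetical sketch (not the CogAff implementation or the SIM_AGENT toolkit):
# a deliberative planner that never inspects its own trace loops silently,
# while a metamanagement monitor watching the same trace notices the loop.

class Deliberative:
    """Naive planner: expands states but does no self-monitoring."""
    def __init__(self, successors, goal):
        self.successors, self.goal = successors, goal

    def plan(self, state, monitor=None, limit=20):
        trace = [state]
        for _ in range(limit):
            if state == self.goal:
                return trace
            state = self.successors[state]             # no check for revisited states
            trace.append(state)
            if monitor and monitor.notice(trace):      # optional metamanagement hook
                return None
        return None                                    # gives up quietly after the limit

class MetaManagement:
    """Watches the deliberative trace and flags subtle looping."""
    def notice(self, trace):
        if len(set(trace)) < len(trace):
            print("metamanagement: planning is going round in a circle; redirect attention")
            return True
        return False

successors = {"A": "B", "B": "C", "C": "A"}            # toy state graph containing a cycle
planner = Deliberative(successors, goal="D")
planner.plan("A")                                       # loops silently until the limit
planner.plan("A", monitor=MetaManagement())             # the loop is noticed immediately
```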
3.14 Conclusion
This chapter has presented a view that can be summed up by saying that the architecture of a humanlike mind is a complex system that in some ways is like what Minsky called a society of mind and in some ways like an ecology of mind insofar as various components evolved within niches defined by the remainder of the system, just as happens in an ecology composed of different species of organisms. The mind is a collection of different evolved subspecies of suborganisms.
This view and its detailed elaboration (barely begun in this chapter) have important implications both for the science of mind (including animal minds and robot minds) and also for various engineering activities involving the production of systems to interact with minds, or to model minds for entertainment or other purposes. Much of this is conjectural: Many details still have to be filled in and consequences developed—both of which can come partly from building working models, partly from multidisciplinary empirical investigations. The work is very difficult, and will need to be pursued by different groups adopting different methodologies—some mainly empirical, some mainly theoretical and philosophical, some attempting to design working systems using a variety of approaches: bottom up, top down, and middle out, but with an open mind rather than dogmatic commitments to particular architectures or mechanisms. All these researchers will need to communicate their problems, their results, and their failures to one another, because otherwise they will not fully understand the constraints and opportunities that are relevant to their own approach. The particular sort of approach adopted here, emphasizing architecture-based ontologies for mind, can bring some order into the morass of studies of affect (e.g., helping us understand why there are myriad rival definitions of emotion and helping us move to a more useful synoptic viewpoint). This is partly analogous to the way in which our concepts of kinds of physical stuff (e.g., as represented in the periodic table of elements) and kinds of physical and chemical processes were enriched through an understanding of the previously hidden architecture of matter. For our task, the challenge is much greater: Whereas there is one physical world, with a single architecture (albeit with multiple levels), there are many architectures for organisms and possible intelligent machines, and correspondingly varied ontologies for mind. These are not, as is often supposed, differences of degree within a continuum of possibilities. There are many large and small discontinuities in ‘‘design space,’’ and we need to understand their implications, including the implications for evolutionary mechanisms and for individual learning and development. A particular application will be development of a conceptual framework for discussing which kinds of emotions and other mental phenomena can arise in software agents that lack the reactive
mechanisms required for controlling a physical body, replacing discussions that depend on arbitrary preferences for one or another definition of ‘‘emotion.’’ This work can lead to a better approach to comparative psychology, developmental psychology (the architecture that develops after birth), and enhance the study of effects of brain damage and disease. Only when you have a good understanding of the normal functioning of a complex system can you hope to understand ways in which it can go wrong. Although the scientific and philosophical implications of these ideas are profound, they are also relevant to engineers. The topics are relevant to designers of complex humanlike systems for practical purposes, including new advanced forms of computer-based interactive entertainment or development of humanlike software or hardware assistants, advisors, teachers, and so on. In particular, the issues need to be understood by anyone who wishes:
1. To build systems that effectively model human mental processes, including affective processes
2. To design systems that engage fruitfully with human beings, because that requires an understanding of how humans work (including an appreciation of the enormous individual variability among humans)
3. To design good teaching systems for helping people learn mathematics, languages, science, or other topics, because without understanding how minds work we cannot design systems (including classroom practices) that effectively enhance those modes
4. To produce teaching/training packages for would-be counselors, psychotherapists, and psychologists, because those packages need to be based on good theories of how people function normally and how that normal functioning can be disrupted, damaged, and so forth
5. To produce convincing synthetic characters in computer entertainments, because shallow, purely behavioral models may suffice for a while, but eventually they will be found to be dull, repetitive, rigid, and the task of extending the behavioral repertoires by adding more and more behaviors either by explicit programming or through imitative learning processes will turn out to be too tedious and too restrictive: deeper models can be expected to have more power and flexibility
Acknowledgments
The ideas presented here were developed in collaboration with Steve Allen, Luc Beaudoin, Brian Logan, Catriona Kennedy, Ian Millington, Riccardo Poli, Tim Read, Edmund Shing, Ian Wright, and others in the Cognition and Affect Project at the University of Birmingham. I have also benefited from interactions with, or reading the publications of, many colleagues elsewhere, including Margaret Boden, Dan Dennett, Stan Franklin, Pat Hayes, John McCarthy, Marvin Minsky, Roz Picard, the members of the workshop on Emotions in Humans and Artifacts in Vienna, August 1999, and those who attended the ‘‘Designing a Mind’’ workshop at the AISB Convention in April 2000. Papers and theses by students and colleagues can be found online: <http://www.cs.bham.ac.uk/research/cogaff/>. Our software tools, including the SIM_AGENT toolkit, are available free of charge on-line: <http://www.cs.bham.ac.uk/research/poplog/freepoplog.html>. I have, of course, also learned much from many others, including the authors listed in the bibliography. I am particularly grateful to Robert Trappl and Paolo Petta for making the workshop possible, and for their patience while I tried to get this chapter into a presentable shape. I am aware that there are still many gaps and much that needs to be clarified.
Discussion: Multilevel Processing
Ortony: Again: multilevel consciousness. Why is it not the case that, when I see a bear in the woods, as it were, the reaction is initiated by ‘‘If I hang around here, I am going to get eaten?’’ It’s just that it is not available to consciousness. Sloman: I have not said anything about consciousness. Ortony: No, I understand that! But ‘‘I see this bridge. What if I will go over it and it won’t support me?’’ seems to me to be fundamentally the same question as ‘‘I see this lion and if I hang around, I am going to get eaten.’’ The only difference is that, in the one case, it’s conscious information processing. Sloman: Yes. The lion case is exactly like the bridge case if it involves recognition of the lion as something that can produce some nasty future prospect. If, on the other hand, it’s a hard-wired
reactive response which is ‘‘Something large and noisy is rushing towards me,’’ then that could produce the primary kind of emotion. If, however, instead of perceived threatening behavior what happens is that the percept triggers a thought process including ‘‘That's a dangerous thing that might eat me,’’ then a secondary emotion may be triggered. These are two sorts of fear. Ortony: But, I mean, this is a question about how long I am going to hang around and test what the possibilities are. Sloman: You can have both these kinds of processes operating at once! Cañamero: What you described as metamanagement seems to be a self-monitoring system of the deliberative layer. But you need also a monitoring mechanism for the reactive layer. Sloman: It's not just monitoring of the deliberative layer. I believe it can also go down to the central mechanisms and other things below. But that's maybe not relevant to what you are saying, is it? Cañamero: Not exactly. I am wondering about the potential similarities between self-monitoring your reactive layer and self-monitoring your deliberative layer. They are different kinds of processes, but it seems to me that the mechanisms could be quite similar. Sloman: All these mechanisms are ultimately implemented in reactive systems of different sorts. So, at some level, they are all the same. But nevertheless, they have functional roles that are importantly different within the whole architecture. Cañamero: That's what I am not sure about. I see there are many functionally equivalent roles, in both. That's what I mean. Sloman: Let me put it this way: What the metamanagement level does, as I have tried to define—and I suspect I have not yet got it right—is to do to the internal system what the previous layers did to the environment and the organism's actions in the environment. So, the evolutionary processes that produce metamanagement may be another case of copying and then functionally differentiating. So, there can be a lot in common. Cañamero: Aha. Yes, but with the reactive layer, of course you monitor your relation with the environment, but you also monitor the internal processes. Sloman: There may be certain amounts of internal feedback which involves internal monitoring. All homeostatic processes can be
thought of as primitive forms of metamanagement. And the deliberative mechanism may have low-level detectors of what's going on in the planning subsystem. But I am talking about a different sort of more global perception, categorization, and evaluation of the whole system, which is capable of leading to high-level decision making. It is also very important that the metamanagement runs in parallel with the other subsystems, because while you are planning, and reflecting on features of your planning, you might suddenly realize that you are going on in a circle. Bellman: Does that mean qualia emerge at the top level? Sloman: Right. I believe if you built a robot that had everything I have been talking about implemented in sufficient detail, so that, for instance, not only does it see chairs and tables, but sometimes, it will also pay attention to the contents of its visual field, it will notice that besides a rectangular thing out there, which is the table top, there is a nonrectangular thing, which is this sort of obtuse-angled distorted object, which occurs because the table is viewed from a different angle. This ability to reflect not only on the 3-D properties of objects seen in the physical environment, but also on some of the 2-D properties of intermediate visual data structures is exactly the sort of thing that has led philosophers to speak of qualia. I do not believe that there is anything more to qualia than what will naturally exist in a fully implemented version of what I am talking about. Rolls: Let me say that I agree with a lot of that. May I just press you on the alarm systems, and exactly where emotion comes into this. If I understand you, you are saying that emotion comes in because of the alarms and has something to do with the alarms, or at least you have emphasized the alarms. Sloman: Yes. I think the primary and secondary emotions are intimately bound up with the alarm system recognizing something going on now in the environment, or maybe something going on in your deliberation about the future, and then grabbing control and sending overriding control signals, or signals that may be hard to resist. Rolls: But is emotion just that? I mean, obviously, when you integrate alarm, it may become emotional. But I am not sure if that gathers all of it. So, in a sense, you could talk about each of your three layers as having goals. But I would argue that, where each layer is implementing behaviors to get goals, then you'll have
emotion represented as the state elicited by the goal. So, for me, I can have emotions at all three layers, independently of alarms, as long as the system is working properly. Sloman: I suspect you are talking about what I called ‘‘evaluation’’ of things as good and bad. I want to distinguish pleasures and pains from emotions, since in the familiar sense of ‘‘emotional,’’ which implies being disturbed or partly out of control, you can enjoy something or find it mildly painful without being at all emotional about it—e.g., enjoying eating an ice cream. In contrast, it appears that you want to say that emotions include all those affective states. Rolls: But, then, what do you say that emotions are? Do you say they are just the alarms? Sloman: No, because the tertiary emotions may not necessarily involve the alarm mechanism. That can be something more subtle which I don’t yet fully understand, which involves a mechanism that is able to grab control of and redirect the metamanagement system so that your attention goes somewhere other than to where you have, or it has, decided that it should be. I have no details worked out of how that happens, I have no idea which bits of the brain might be involved. I suspect it’s got something to do with the frontal lobes, because some kinds of frontal lobe damage seem to produce reduced high-level control of your thinking processes. Bellman: Are alarms attentional devices? Sloman: Amongst other things. The alarms in the reactive system may do more than switch attention—they may freeze you, they may make you drop down, they may make you stop running. But they may also just suddenly turn on certain attention mechanisms generating faster or more focused processing of perceptual phenomena. Rolls: But is a consequence of what you said then that at the first and the second level, the emotions only arise by virtue of alarms, and not by virtue of other emotions? Sloman: Well, I hate to say ‘‘only,’’ because I do not believe that the word ‘‘emotion’’ is sufficiently clear. I think that there are at least a hundred different definitions of ‘‘emotion’’ in the literature. What I want to do is try to clarify what sort of architecture we have, what range of architectures there might be in other animals, and maybe in various kinds of software systems.
Maybe in future we will see happening to the word ‘‘emotion’’ something comparable to the development of physics. The architecture of matter developed, and we now have new ways of talking about physical elements, new ways of talking about chemical compounds, and different kinds of states of matter. When such new, more refined concepts develop, we often find that our old categories fit some of them, while maybe some are totally new and have never been thought of previously. And maybe in some cases where we thought there is one type of thing, we will now see that there are twenty-five significantly different ones. In other cases, where we said there were different things, we realize that, actually, there are just at the surface different variants of something that is deeply similar. And so I don't want to legislate about what emotions are. Rolls: Right. So, that's a nice difference between us. I am willing to say that emotions are the states elicited when we are doing evaluation of rewards or punishers. And you are just saying you are not quite sure what they are. Sloman: Well, I am saying I think that there is a variety of things which are covered by what we have called ‘‘emotions’’ in the history of our language, and different people focus on different subsets. E.g., the types that poets and novelists and garden-fence gossips focus on are typically different from the ones studied by brain scientists doing experiments on rats. The states that you are talking about, I am inclined to call pleasures and pains, but in my dialect they are not emotions, because often their effects are too limited. However, intense pleasure and intense pain can produce a kind of emotional state which involves a partial loss of control of thought processes. So, I am inclined to link emotions to ‘‘being moved,’’ i.e., being partly out of control. Rolls: Can I just check: you could so have emotion in a machine that did not have levels two and three? Sloman: Yes—it would have fairly primitive primary emotions. That's why I am happy to talk about an ‘‘emotional insect’’—it might be terrified in some simplified sense of ‘‘terrified.’’ Rolls: But then the issue of control by a higher level is not crucial to this. Sloman: Yes, it is not crucial to the case of primary emotions. It's a different sort of control. You can have a subsumptive architecture with different layers in the reactive system, controlling lower
levels. Something like primary emotions could occur in such a system. Cañamero: What I was trying to say before is that: The role that you say emotions play in the higher level, is the same role emotions play in the reactive system. Sloman: I didn't say emotions are playing any role. They are generally results of other things that have useful functions. The reactive alarm mechanisms may have evolved to produce specific, mostly useful, effects. But some of the more subtle and complex emotions in humans that involve actual or potential disruption of metamanagement—e.g., jealousy, obsessive ambition, elated anticipation, infatuation, are not necessarily all functionally useful states produced by specific emotional mechanisms. They are often emergent features of interactions between a number of different mechanisms. That's not having a role, that's being a side effect! There is not an emotional system that does those things. The mechanisms which produce such effects did not evolve to produce that effect: they are there for other reasons. They are there not to produce emotions, but to switch attention because there is something important going on, or when you need to notice that what you are doing now is probably not going to succeed, so you should switch. And if, in order to do that, you have to do more of the same kind of thing, you will never do it, because it uses resources that are fully committed. So, there are all sorts of mechanisms of control. They produce states as a result of interactions between those mechanisms and other things, and some of these interactions correspond roughly to what we mean by ‘‘emotion’’ in certain contexts. But then, as I said, not all emotions have any role, they are just states that occur as a result of other things having a role. Ortony: Aaron, let me try to give you three very quick scenarios. One is where you are at home alone at night, and your house is in the country, miles from anywhere, and it's dark. You hear creaking noises, and you get increasingly nervous. And over the course of forty-five minutes, you just become increasingly convinced and increasingly afraid that somebody is in the house and so on. And that's all you think about. It's just like: ‘‘Goodness, there is somebody in the house, and what am I going to do?’’ Now compare this to the case where you are driving along and you see somebody on the wrong side of the road driving towards you rather
fast, and you very hurriedly take evasive action. And finally, imagine that you are in the forest and you see that there is a bloody great tiger running like hell towards you. It would seem to me that all of these are cases of fear. In the last case, I can see you calling it a primary emotion. In the second case, you would call it a secondary emotion, because in there is some kind of planning and decision making. And in the first case, you have got this kind of obsessive rumination that is controlling all your information processing and attentional resources. But, if you agree roughly that all three are cases of fear, and that they vary in this kind of way, it seems to me terribly misleading to say: There are primary emotions, secondary emotions and tertiary emotions. What you would have to say is: There are emotions which can be realized in a primary fashion, and/or in a secondary fashion, and/or in a tertiary fashion. Sloman: Well, I don’t care about terminology, and as far as that is concerned, I use terms just because other people are using them. (Sloman—note added later: moreover, in humans many situations produce complex mixtures of reactions that include two or more things going on so that there may be primary, secondary, and tertiary emotions. In fact, I suspect that more detailed analysis will show that these three categories are just too crude to do justice to the full variety of phenomena.) Picard: I actually would like to add to that: I agree with distinguishing primary emotion from the others, for some cases, not for all; and I don’t agree with the distinction between secondary and tertiary. Though the extent to which I agree with the primary is based on the extent to which there is some neurological evidence suggesting this, that the stimulus triggers first a subconscious appraisal—or whatever you call it—an emotion, that then biases the thought, versus something that triggers the thought, and then that causes the emotion. Now, tertiary emotion, I see it more just as a matter of intensity. I am not sure if that’s the right word. I don’t see it as a separate category, but more as a gradation. In fact, as intensity fades, perturbances fade, too. (Sloman—note added later: But a machine or animal, or young child cannot have tertiary emotions if its architecture does not include metamanagement. If you abandon the study of the particular ways in which metamanagement changes the variety of possible cognitive and affective processes, that may simply reflect a difference of interest not a disagreement of substance.)
Ortony: Aaron, you can't get out of this problem by saying ‘‘I don't care about words!’’ because words carry a lot of baggage. When you say ‘‘primary,’’ ‘‘secondary,’’ and ‘‘tertiary’’ emotions, you are promising three distinct classes of emotions, as opposed to three different styles of emotional responding. Sloman: What I think is very important is: This list was just meant to be a suggestion, to remind you of things that exist. Now, it is true that a person who has these things in the second category may become obsessed by them and may never think about anything else and lose control, and therefore it becomes a tertiary emotion. However, if an animal is able to do this deliberation, and is triggered to change its ways of deliberating and interacting as a result of detecting the future possibility, but has not got the third layer, then it can't have a third kind of reaction. Ortony: That I agree about. Sloman: Right. So, what I am saying here is that our normal verbal descriptions do not adequately map onto the categories that I am talking about. The architecture-based categories are not things that you can expect to find precisely represented in our ordinary vocabulary. Any more than a vocabulary of physics, as it was a hundred years ago, will actually reflect the structure of matter. Ortony: But every research field has a culture, an established kind of language and culture for describing it. And ‘‘primary emotions’’ advertises something absolutely specific, and a total commitment to a class of emotions which are privileged in some way. It says that some subset of emotions have some special status. Sloman: No! I am trying to explain why different people focus on different features and explaining why they are all right about something: What they refer to exists. But other things also exist, and have different properties because they involve functionally different architectural components. I am saying, all the different sorts of things I and others here and poets and gossips have been referring to can occur within this multilevel architecture. The terminology of primary, secondary, and tertiary emotions is simply a first crude attempt to draw attention to some of the architecture-based differences.
Rolls: Right. But that does emphasize your important point, that the architecture in the machine is rather crucial. It depends crucially on how much of this set of layers you have implemented in the machine, the extent to which we want to say it has emotion.
Notes
1. It seems to me from Dennett's recent writings that over the years, he has gradually shifted from stressing the intentional stance to adopting the design stance, which explains how a system works in terms of its (usually hierarchical) functional decomposition into interacting components, and which can justify ascribing mental states and processes to organisms and machines on the basis of how they work. I don't know if he is aware of the shift or would even acknowledge it. His 1996 book, for instance, discusses designs for different kinds of minds. However, his definition there of the design stance is closer to McCarthy's definition of ‘‘the functional stance’’ (McCarthy 1995), which explains behavior of a system in terms of what its function is, without reference to how it works.
2. For example, I have used a very shallow model for tutorial purposes in my introduction to the SIM_AGENT toolkit: On-line. Available: <http://www.cs.bham.ac.uk/research/poplog/sim/teach/sim_feelings>.
3. Our implementations are still far behind our theoretical analysis, partly because of lack of resources and partly because there are still many gaps in the design.
4. More precisely, concerns, for Frijda, are ‘‘dominant demons’’—that is, dispositions that can manifest themselves through evaluations when appropriate. They are close to the attitudes discussed.
5. There is considerable further discussion of various types of filters in Beaudoin's Ph.D. thesis (1994).
6. Emergence in this sense does not involve any mysterious processes. There are two sorts of emergence that are usefully distinguished. In one sense, ‘‘agglomerative emergence,’’ emergent states, processes, and properties are merely large-scale features of a system, but can be defined in terms of the small-scale components that make them up, and their causal interactions can be derived logically or mathematically from the laws of behavior of the low-level features. A more interesting type of emergence involves processes in virtual machines whose ontologies are not definable in terms of the ontology of the underlying implementation machine, from which it follows that the laws of behavior of the entities in that ontology cannot be logically or mathematically derived from the laws of behavior of the implementation machine. An example would be a chess-playing virtual machine in a computer, where concepts like ‘‘king,’’ ‘‘capture,’’ and ‘‘win’’ cannot be defined in terms of the concepts of physics and (less obviously) the rules of chess and the playing strategies cannot be derived from the laws of physics and descriptions of the underlying physical machine. In these cases, there is circular causation, between the physical layer and the virtual machine layer, even though the physical layer may be causally complete. A more detailed, but still incomplete discussion of this point can be found on-line: <http://www.cs.bham.ac.uk/axs/misc/supervenience>.
References
Albus, J. S. (1981): Brains, Behaviour, and Robotics. Byte Books, McGraw Hill, Peterborough, N.H. Bates, J., Loyall, A. B., and Reilly, W. S. (1991): Broad Agents. AAAI Spring Symposium on Integrated Intelligent Architectures. American Association for Artificial Intelligence. Reprint, Sigart Bull. Vol. 2, no. 4 (August): 38–40.
Beaudoin, L. (1994): Goal Processing in Autonomous Agents. Ph.D. Thesis, School of Computer Science, University of Birmingham. On-line. Available: <http://www.cs.bham.ac.uk/research/cogaff/>. (Availability last checked 5 Nov 2002) Beaudoin, L., and Sloman, A. (1993): A Study of Motive Processing and Attention. In A. Sloman, D. Hogg, G. Humphreys, D. Partridge, and A. Ramsay, eds., Prospects for Artificial Intelligence, 229–238. IOS Press, Amsterdam. Brooks, R. A. (1991): Intelligence without Representation. Artif. Intell. 47: 139–159. Chandrasekaran, B., Glasgow, J., and Narayanan, N. H. eds. (1995): Diagrammatic Reasoning: Cognitive and Computational Perspectives. MIT Press, Cambridge. Craik, K. (1943): The Nature of Explanation. Cambridge University Press, Cambridge. Damasio, A. R. (1994): Descartes' Error. Emotion, Reason, and the Human Brain. Putnam, New York. Davis, D. N. (1996): Reactive and Motivational Agents: Towards a Collective Minder. In J. Mueller, M. Wooldridge, and N. Jennings, eds., Intelligent Agents III—Proceedings of the Third International Workshop on Agent Theories, Architectures, and Languages. Springer, Berlin, Heidelberg, New York. Dennett, D. C. (1978): Brainstorms: Philosophical Essays on Mind and Psychology. MIT Press, Cambridge. Dennett, D. C. (1996): Kinds of Minds: Towards an Understanding of Consciousness. Weidenfeld and Nicholson, London. Fodor, J. (1983): The Modularity of Mind. MIT Press, Cambridge. Frijda, N. H. (1986): The Emotions. Cambridge University Press, Cambridge. Frisby, J. P. (1979): Seeing: Illusion, Brain, and Mind. Oxford University Press, Oxford, London, New York. Gibson, J. [1979] (1986): The Ecological Approach to Visual Perception. Reprint, Lawrence Erlbaum Associates, Hillsdale, N.J. Goleman, D. (1996): Emotional Intelligence: Why It Can Matter More than IQ. Bloomsbury Publishing, London. Goodale, M., and Milner, A. (1992): Separate Visual Pathways for Perception and Action. Trends Neurosci. 15 (1): 20–25. Johnson-Laird, P. (1993): The Computer and the Mind: An Introduction to Cognitive Science. 2nd ed. Fontana Press, London. Karmiloff-Smith, A. (1996): Internal Representations and External Notations: A Developmental Perspective. In D. Peterson, ed., Forms of Representation: An Interdisciplinary Theme for Cognitive Science, 141–151. Intellect Books, Exeter, UK. Köhler, W. (1925): The Mentality of Apes. Tr. from the second revised edition by Ella Winter. London, K. Paul, Trench, Trubner & Co., Ltd; New York, Harcourt, Brace & Comp., Inc. Lee, D., and Lishman, J. (1975): Visual Proprioceptive Control of Stance. J. Hum. Mov. Stud. 1: 87–95. Margulis, L. (1998): The Symbiotic Planet: A New Look at Evolution. Weidenfeld and Nicolson, London. Marr, D. (1982): Vision. Freeman, New York, San Francisco. McCarthy, J. (1995): What Has Artificial Intelligence in Common with Philosophy? In C. S. Mellish, ed. Proceedings of the fourteenth International Joint Conference on Artificial Intelligence. Morgan Kaufmann, San Mateo, Calif. On-line: <http://www-formal.stanford.edu/jmc/aiphil/aiphil.htm>. (Availability last checked 5 Nov 2002) McDermott, D. (1981): Artificial Intelligence Meets Natural Stupidity. In J. Haugeland, ed., Mind Design, 143–160. MIT Press, Cambridge. Minsky, M. L. (1987): The Society of Mind. William Heinemann, London. Nagel, T. (1981): What Is It Like to Be a Bat? In D. Hofstadter and D. C. Dennett, eds., The Mind's I: Fantasies and Reflections on Self and Soul, 391–403. Penguin Books, Harmondsworth. Newell, A.
(1982): The Knowledge Level. Artif. Intell. 18 (1): 87–127. Newell, A. (1990): Unified Theories of Cognition. Harvard University Press, Cambridge. Nilsson, N. (1994): Teleo-Reactive Programs for Agent Control. J. Artif. Intell. Res. 1: 139–158.
Nilsson, N. J. (1998): Artificial Intelligence: A New Synthesis. Morgan Kaufmann, San Mateo, Calif. Oatley, K., and Jenkins, J. (1996): Understanding Emotions. Blackwell, Oxford. Okada, N., and Endo, T. (1992): Story Generation Based on Dynamics of the Mind. Comput. Intell. 8: 123–160. Ortony, A., Clore, G., and Collins, A. (1988): The Cognitive Structure of the Emotions. Cambridge University Press, Cambridge. Peterson, D., ed. (1996): Forms of Representation: An Interdisciplinary Theme for Cognitive Science. Intellect Books, Exeter, UK. Picard, R. (1997): Affective Computing. MIT Press, Cambridge. Popper, K. (1976): Unended Quest. Fontana/Collins, Glasgow. Rolls, E. T. (1999): The Brain and Emotion. Oxford University Press, Oxford, London, New York. Rolls, E. T. (2000): Precis of the Brain and Emotion. Behav. Brain Sci. 23: 177–233. Russell, S., and Norvig, P. (1995): Artificial Intelligence, A Modern Approach. Prentice Hall, Englewood Cliffs, N.J. Ryle, G. (1949): The Concept of Mind. Hutchinson, London, New York. Simon, H. A. [1967] (1979): Motivational and Emotional Controls of Cognition. In H. A. Simon, Models of Thought, 29–38. Yale University Press, New Haven. Simon, H. A. [1969] (1981): The Sciences of the Artificial. 2nd ed. MIT Press, Cambridge. Sloman, A. (1969): How to Derive ‘‘Better’’ from ‘‘Is.’’ Am. Philos. Q. 6: 43–52. Sloman, A. (1978): The Computer Revolution in Philosophy. Harvester Press (and Humanities Press), Hassocks, Sussex. Sloman, A. (1985): What Enables a Machine to Understand? In A. Joshi, ed., Proceedings of the ninth IJCAI, Los Angeles, pp. 995–1001. Morgan Kaufmann, Los Altos, Calif. Sloman, A. (1987a): Motives, Mechanisms, and Emotions. Cogn. Emotion 1 (3): 217–234. Reprinted in M. A. Boden, ed. (1990): The Philosophy of Artificial Intelligence, 231–247. Oxford Readings in Philosophy Series, Oxford University Press, Oxford, London, New York. Sloman, A. (1987b): Reference without Causal Links. In J. du Boulay, D. Hogg, and L. Steels, eds., Advances in Artificial Intelligence—II, 369–381. North Holland, Dordrecht. Sloman, A. (1989): On Designing a Visual System (Towards a Gibsonian computational model of vision). J. Exp. Theor. AI 1 (4): 289–337. Sloman, A. (1993a): The Mind as a Control System. In C. Hookway and D. Peterson, eds., Philosophy and the Cognitive Sciences, 69–110. Cambridge University Press, Cambridge. Sloman, A. (1993b): Prospects for AI as the General Science of Intelligence. In A. Sloman, D. Hogg, G. Humphreys, D. Partridge, and A. Ramsay, eds., Prospects for Artificial Intelligence, 1–10. IOS Press, Amsterdam. Sloman, A. (1994): Explorations in Design Space. In A. Cohn, ed., Proceedings of the eleventh European Conference on AI, August, Amsterdam, 578–582. John Wiley, Chichester, UK. Sloman, A. (1996a): Actual Possibilities. In L. C. Aiello and S. C. Shapiro, eds., Principles of Knowledge Representation and Reasoning: Proceedings of the Fifth International Conference (KR '96), 627–638. Morgan Kaufmann Publishers, San Mateo, Calif. Sloman, A. (1996b): Towards a General Theory of Representations. In D. M. Peterson, ed., Forms of Representation: An Interdisciplinary Theme for Cognitive Science, 118–140. Intellect Books, Exeter, UK. Sloman, A. (1997): What Sort of Control System Is Able to Have a Personality? In R. Trappl and P. Petta, eds., Creating Personalities for Synthetic Actors: Towards Autonomous Personality Agents, 166–209. Springer, Berlin, Heidelberg, New York. Sloman, A. (1998a): Damasio, Descartes, Alarms, and Meta-Management.
In J. C. Bezdek and L. O. Hall, eds. Proceedings of the International Conference on Systems, Man, and Cybernetics (SMC98), 2652–2657. IEEE, Piscataway, N.J.
Sloman, A. (1998b): The ‘‘Semantics’’ of Evolution: Trajectories and Trade-Offs in Design Space and Niche Space. In H. Coelho, ed., Progress in Artificial Intelligence, sixth Iberoamerican Conference on AI (IBERAMIA), 27–38. Springer, Lisbon. Sloman, A. (1999a): Review of Affective Computing by Rosalind Picard, 1997. AI Magazine, 20 (1): 127–133. Sloman, A. (1999b): What Sort of Architecture is Required for a Human-Like Agent? In M. Wooldridge and A. Rao, eds., Foundations of Rational Agency, 35–52. Kluwer Academic, Dordrecht. Sloman, A. (2000a): Architectural Requirements for Human-Like Agents Both Natural and Artificial. (What Sorts of Machines Can Love?) In K. Dautenhahn, ed., Human Cognition and Social Agent Technology, Advances in Consciousness Research, 163–195. John Benjamins, Amsterdam. Sloman, A. (2000b): Models of Models of Mind. In Proceedings of the Symposium on How to Design a Functioning Mind (AISB '00), Birmingham, April. Society for the Study of Artificial Intelligence and Simulation of Behaviour. University of Birmingham, UK. Sloman, A. (to appear): Architecture-Based Conceptions of Mind. In P. Gardenfors, K. Kijania-Placek, J. Wolenski, and A. Drinan, eds., Proceedings of the eleventh International Congress of Logic, Methodology, and Philosophy of Science. Kluwer, Dordrecht. Sloman, A., and Croucher, M. (1981): Why Robots Will Have Emotions. In Proceedings of the seventh International Joint Conference on AI, Vancouver, pp. 197–202. American Association for Artificial Intelligence, Menlo Park, Calif. Sloman, A., and Logan, B. (1999): Building Cognitively Rich Agents Using the Sim-Agent Toolkit. Commun. Assoc. Comput. Machinery 42 (3): 71–77. Strawson, P. F. (1959): Individuals: An Essay in Descriptive Metaphysics. Methuen, London. Wright, I. P., Sloman, A., and Beaudoin, L. (1996): Towards a Design-Based Analysis of Emotional Episodes. Philos. Psychiatry Psychol. 3 (2): 101–126. Wright, I. P., and Sloman, A. (1997a): An implementation of a protoemotional agent architecture. Technical Report CSRP-97-1, University of Birmingham, School of Computer Science. Wright, I. P. (1997b): Emotional Agents. Ph.D. thesis, School of Computer Science, University of Birmingham. On-line. Available: <http://www.cs.bham.ac.uk/research/cogaff/>. (Availability last checked 5 Nov 2002)
4 Designing Emotions for Activity Selection in Autonomous Agents
Lola D. Cañamero
What emotions are about is action (or motivation for action) and action control. —N. H. Frijda ‘‘Emotions in Robots’’
Abstract
This chapter advocates a ‘‘bottom-up’’ philosophy for the design of emotional systems for autonomous agents that is guided by functional concerns and considers the particular case of designing emotions as mechanisms for action selection. The concrete realization of these ideas implies that the design process must start with an analysis of the requirements that the features of the environment, the characteristics of the action-selection task, and the agent architecture impose on the emotional system. This is particularly important if we see emotions as mechanisms that aim at modifying or maintaining the relation of the agent with its (external and internal) environment (rather than modifying the environment itself) in order to preserve the agent's goals. Emotions can then be selected and designed according to the roles they play with respect to this relation.
4.1 Introduction
Autonomous agents are embodied, natural, or artificial systems in permanent interaction with dynamic, unpredictable, resource-limited, and (in general) social environments, where they must satisfy a set of possibly conflicting goals in order to survive. Emotions—at least a subset of them—are one of the mechanisms found in biological agents to better deal with such environments, enhancing their autonomy and adaptation. Following a biomimetic approach that takes inspiration from mechanisms existing in natural systems can be a very useful principle to design emotion-like mechanisms for artificial agents that are confronted with the same kind of environments and tasks.
The notions of autonomy and adaptation can be understood in a number of different ways, but we will restrict ourselves to considering the sense they have within action selection problems and how emotions relate to them in this particular context. As far as action selection is concerned, agents can be autonomous at different levels, depending, among other things, on their morphology, on their functional and cognitive complexity, on the features of the environment in which they are embedded, and on the kind of interactions they entertain within it. Different ‘‘levels’’ of autonomy can be paired with different ‘‘levels’’ of behavioral complexity, and we suspect that emotions might not be present or equally relevant at every level. At one level, we say an agent is autonomous if it can act by itself, without needing to be commanded by another agent. This type of autonomy can be put in correspondence with very simple behavior, such as reflex behavior responding to external stimuli alone—as in the case of purely reactive, environment-driven agents. No role seems to be left or needed for emotions in this very simple stimulus-response scheme. 1 A higher degree of autonomy is achieved when the agent can ‘‘choose’’ whether or not to pay attention and react to a given environmental stimulus, as a function of its relevance to the agent's behavior and goals. This implies that the agent has, or can be attributed, some internal goals or motivations that drive its behavior—it is thus motivationally autonomous. This is the notion of autonomy we will be concerned with in this paper and in relation to which emotions start to make sense. One of the main problems that motivationally autonomous agents are confronted with is activity (action, behavior) selection: how to choose the appropriate behavior at each point in time so as to work toward the satisfaction of the current goal (its most urgent need), paying attention at the same time to the demands and opportunities coming from the environment, and without neglecting, in the long term, the satisfaction of other needs, so that survival is guaranteed. Taking inspiration from ethology, the ‘‘classical’’ solutions proposed within (embodied) artificial intelligence (AI) to design action selection mechanisms for artificial autonomous agents or animats (Wilson 1991) rely either on reactive behaviors—responsive to external stimuli (e.g., Brooks 1986)—or on a repertoire of consummatory and appetitive behaviors, whose execution is guided by internal needs or motivations (e.g., Maes 1991; Donnart and Meyer 1994; Steels 1994). My claim
is that, although these models can work well for moderately dynamic environments, they are less appropriate when dynamism increases. In particular, they are not very well suited to deal with contingencies, coming from the external or the internal environment, requiring an urgent response. These situations are better dealt with when a further level of autonomy exists that allows the agent to free itself from the demands of internal needs in order to interrupt ongoing behavior and change goal priorities when the situation requires it—therefore to adapt quickly to rapidly changing circumstances. Some simple emotions play this role in biological agents (see, e.g., Simon 1967; Oatley and Johnson-Laird 1987 for representative articles on what has come to be known as the ‘‘interrupt’’ theory of emotions), as we will see in the section entitled Natural Emotions and Action Selection. In recent years, some agent architectures have been proposed that integrate a set of survival-related ‘‘basic’’ emotions 2 as an additional element for behavior selection (e.g., Cañamero 1997a; Velásquez 1998). This chapter reflects on some of the lessons learned from this endeavor to point out a number of issues and problems that need to be taken into account when designing emotions for action selection.
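A minimal sketch of this claim, using invented names and numbers rather than any of the cited architectures: under normal circumstances the behavior serving the largest internal deficit is selected, but an emotion-like interrupt triggered by an urgent external event overrides the motivational ranking.

```python
# Invented, minimal sketch of motivations plus an emotion-like interrupt.
# Under normal circumstances the behavior serving the largest internal
# deficit wins; an urgent external event overrides that ranking.

drives = {"energy": 0.6, "water": 0.3}                 # internal deficits in [0, 1]
behavior_for = {"energy": "feed", "water": "drink", "threat": "flee"}

def perceive(world):
    # Reduced to a single boolean "threat" percept for the sketch.
    return {"threat": world.get("predator_nearby", False)}

def select_action(world):
    percepts = perceive(world)
    if percepts["threat"]:
        # Emotion-like interrupt: ongoing, need-driven behavior is suspended
        # and goal priorities change regardless of current deficits.
        return behavior_for["threat"]
    most_urgent = max(drives, key=drives.get)          # normal motivational selection
    return behavior_for[most_urgent]

print(select_action({}))                               # -> 'feed'
print(select_action({"predator_nearby": True}))        # -> 'flee'
```

Nothing in the sketch decides whether the interrupt should be a separate mechanism or an emergent effect of competing behaviors; that design choice is exactly what the rest of the chapter discusses.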
4.2 Which Emotions?
The term emotions covers a wide range of complex phenomena that are difficult to group under a common definition. Emotions have many facets and can be approached from different levels of explanation, as they encompass neuroendocrine, physiological (visceral and muscular), behavioral, cognitive, communicative, and social aspects. All these aspects are not always active in every kind of emotion nor in every emotion episode. Indeed, some emotions seem to involve strong physiological and behavioral manifestations, with weak or largely absent cognitive aspects; on the contrary, the cognitive element seems to be predominant in other emotions, which would then be close to thoughts and that might (apparently) lack any physiological correlates; still other emotions seem to be predominantly social; also, biological and cultural influences vary considerably among different emotions. It thus seems unlikely that all this structural diversity (not to mention the functional one) can be easily accounted for by a single definition. A metaphor could perhaps be more helpful at this point. Emotions
may conveniently be thought of as complex dynamic processes that integrate various causally related subsystems, as put forward, for instance, by Mandler (1985) and Pfeifer (1991)—namely, the physiological, the cognitive-evaluative, the communicative-expressive, and the subjective experience subsystems. Other important subsystems equally involved in emotion seem to have been left out from this list, such as the motivational one, but the general idea of a dynamic system involving processes of causally related subsystems seems very promising in accounting for the diversity and complexity of emotions. In this chapter, I will not deal with emotions at this general level, though; nor will I consider all the different kinds of emotions mentioned in the previous paragraph, but will restrict myself to analyzing the features of some survival-related simple emotions that are involved in activity selection, and the advantages and drawbacks of taking inspiration from these features to design action-selection mechanisms for artificial autonomous agents.
Natural Emotions and Action Selection
Action selection is a kind of adaptation that occurs over short periods of time in highly dynamic environments to face, in a timely fashion, rapid and usually temporary changes in the environment. It might be accompanied and improved by learning—a type of adaptation that occurs over longer periods of time and that produces long-lasting changes in the organism itself—but in principle, learning is not necessary for action selection to happen. Action selection can be defined as the problem faced by a motivationally autonomous agent that must choose the appropriate behavior at each point in time so as to work toward the satisfaction of the current goal (its most urgent need), paying attention at the same time to the demands and opportunities coming from the environment, and without neglecting, in the long term, the satisfaction of other needs, so that survival is guaranteed. Motivational and emotional states can be considered to play complementary roles in this activity. Motivational states, such as hunger, thirst, aggression, social bonding, and so on are drives that constitute urges to action based on internal needs related with survival. Motivations involve arousal and satiation by a specific type of stimulus, and vary as a
They have three main functions (Kandel, Schwartz, and Jessell 1995):
1. They steer behavior toward, or away from, a specific goal.
2. They increase general alertness and energize the individual to action.
3. They combine individual behavioral components into a goal-oriented sequence.
Motivations are thus stimulus specific, require urgent satisfaction, and are goal or task oriented. In action selection, motivational states guide the choice of the behavior(s) that must be executed—those best contributing to the satisfaction of the most urgent need(s). They can be thought of as being concerned with appetitive processes that try to activate action as a response to deprivation; emotions, on the contrary, would rather be related with processes that try to stop ongoing behavior and are concerned with satiety processes of reequilibration (Pribram 1984). We could say that while motivations are in charge of driving behavior under normal circumstances, emotions take over behavior control and change goal priority in situations requiring rapid, urgent responses that often imply the interruption of ongoing behavior. But even though motivations and emotions may play complementary roles with respect to action selection, they cannot be placed at the same level. According to Tomkins (1984), emotions, unlike motivations, show some generality of object, time, density, and intensity, and this explains why a person can experience the same emotion under different circumstances and with different objects; in complex systems, emotions also amplify the effects of drives, which have insufficient strength as motives: ‘‘The affect system is the primary motivational system because without its amplification, nothing else matters, and with its amplification, anything else can matter’’ (p. 164). From an evolutionary perspective, emotions are biological functions of the nervous system involved in the survival of both the individual and the species in complex, dynamic, uncertain, resource-limited, social environments over which agents have little control. The emotions we are mainly considering here form the subset known in the emotion literature as ‘‘primary’’ or ‘‘basic,’’ in the sense that they are more closely related to survival and are universal (or very often found) across cultures, which might explain the fact that they have rather distinctive physiological and expressive manifestations.
Following LeDoux (1996), each (basic) emotion evolved for different reasons and may involve different brain subsystems. The different emotions can be regarded as mechanisms that contribute to safeguarding survival-related goals by modifying or maintaining the relation that an individual has with its (external and internal) environment in different ways (Frijda 1995): by protecting itself from environmental influences (fear), by blocking them (anger), by diminishing the risks of dealing with an unknown environment (anxiety), and so on. Considering the emotional system as a whole, emotions have many different functions, but as far as action selection is concerned, we are mainly interested in three of them:
1. Bodily adaptation that allows one to rapidly and appropriately deal with dangers, unexpected events, and opportunities.
2. Motivating and guiding action. At the simplest level, the categorization of events as pleasant/unpleasant and beneficial/noxious turns neutral stimuli into something to be pursued or avoided. As already mentioned, emotions can also change goal or motivation priority to deal with situations that need an urgent response, and amplify the effects of motivation.
3. Expressive and communicative function. The external manifestations of emotions displayed by an individual can be taken advantage of by its conspecifics to recognize its emotional state and use this as a social reference to anticipate its possible behavior or to assess the current situation.
Artificial Emotions for Action Selection?
Emotions are a fact in humans and other animals. However, why would we want/need to endow artificial agents with emotions? Two main answers are possible, depending on what our principal concern is when modeling emotions. If we have a more ‘‘scientific’’ interest, we will use artificial agents as a test bed for theories about natural emotions in animals and humans, providing a synthetic approach that is complementary to the analytic study of natural systems. If we have a more ‘‘engineering’’ motivation, we might want to exploit some of the roles that emotions play in biological systems in order to develop mechanisms and tools to ground and enhance autonomy, adaptation, and social interaction in artificial and hybrid agent societies.
The underlying hypothesis here is that, because natural emotions are mechanisms enhancing adaptation in dynamic, uncertain, and social environments—with limited resources and over which the individual has very limited control—an artificial agent confronted with an environment presenting similar features will need similar mechanisms in order to survive. In particular, as far as activity selection is concerned, we as engineers are interested in emotions as mechanisms that allow the agent to:
1. Have rapid reactions (fast bodily adaptation)
2. Contribute to resolving the choice among multiple goals (role in motivating and guiding behavior)
3. Signal relevant events to others (expressive and communicative function)
Deciding how to actually implement these mechanisms raises a number of issues, some of which we will examine in sections 4.3 and 4.4. But before proceeding with the analysis of these issues, the very idea of a biomimetic approach to emotion modeling needs some consideration. Natural emotions are the product of a long evolution. However, the designer of an autonomous agent could develop different mechanisms, possibly simpler and more optimized, in order to (independently) fulfill the same roles that emotions play in activity selection and in adaptation in general. Why then should we adopt the metaphor of emotions? My answer to this question is that emotions allow for a higher economy in design, at two levels:
• On the one hand, because an emotional system is a complex system connected to many other behavioral and cognitive subsystems, it can act on these other systems at the same time.
• On the other hand, because emotions are related with goals, rather than with particular behavioral response patterns (Frijda 1995; Rolls 1999), they contribute to the generation of richer, more varied, and flexible behavior.
Still another argument can be set forth against the use of emotion-like mechanisms for action selection in artificial autonomous agents. Indeed, natural emotions seem to be maladaptive on some occasions. Why use them at all, then, instead of designing more optimized mechanisms that do not present this drawback?
At the individual level—the one we are mostly concerned with here—emotions seem to be maladaptive especially when their intensity is too high, such as when strong fear prevents action, 3 but also, let us not forget, when their intensity is too low—for example, the absence of emotion prevents normal social interaction, as evidenced by the studies conducted by Damasio and his group (1994) on patients with damage to the prefrontal cortex and the amygdala. Other types of dysfunctionalities, such as some mental disorders, can be explained as improper synchronization of the different emotional subsystems (Rolls 1999). It would thus seem that emotions are mostly maladaptive ‘‘at the extremes,’’ but adaptive when the system is working under normal conditions. The designer of an artificial emotional system would therefore have two choices, depending on what use the system is intended for. We could design an emotional system that only works within the normal adaptive range, and gets ‘‘switched off’’ (or ‘‘on’’) or inhibited when approaching the dangerous zone. We could also model the full workings of a natural emotional system, including its dysfunctionalities, on the grounds of two main arguments:
• First, we could think that, even though emotions are sometimes maladaptive at the individual level, this maladaptiveness might still be adaptive in some way or another at the level of the species, and a computer simulation could help us understand how this could be.
• Second, an artificial model comprising maladaptive emotional phenomena could perhaps shed some light on the causes and developing factors of some emotional disorders in humans.
4.3 Models of Emotions for Activity Selection
First of all, I place myself within a ‘‘nouvelle’’—embodied, situated—artificial intelligence (AI) perspective and claim that this approach to AI is better suited than ‘‘classical’’ or symbolic AI to model emotions as mechanisms for action selection in autonomous agents. In my view, the emphasis of situated AI on complete creatures in closed-loop interaction with their environment allows for a more natural and coherent integration of emotions (at least the ‘‘noncognitive’’ or perhaps the ‘‘nonconscious’’ aspects of them) within the global behavior of the agent. This view has some implications concerning the way emotions are to be modeled in animats. Let us mention some of them.
• The aspect of emotions that is most relevant here is how they affect the relationship between the agent and its environment; therefore our model of emotions must clearly establish a link between emotions, motivation, behavior, and perception, and how they feed back into each other. This link states that emotion (as well as motivation, behavior, and perception) can affect and be affected by the other elements in a way that can be either beneficial (e.g., energize the body to allow the individual to escape faster from a predator) or noxious (e.g., cause perceptual troubles that lead to inappropriate behavior) for the agent.
• If we are to say that the agent ‘‘has’’ (some form of) emotion, this link must be grounded in the body of the agent—for instance, by means of a synthetic physiology (Cañamero 1997b)—because it is through the body that agents interact with the world. I am thus implying that emotions—natural or artificial—cannot exist without a body. 4
• The subjective and conscious aspects of emotions are not necessary for emotions to happen at this level, because we are primarily concerned with the ‘‘fast pathway’’ of emotion processing (LeDoux 1996). An investigation of the role of emotions in activity selection must thus start by modeling these nonconscious aspects. Only after they are well enough understood, and depending on the complexity of our creature and our action selection problem, might it make sense to add the ‘‘higher-level’’ aspects of emotions.
• Because we are dealing with complete autonomous creatures, emotions must be an integral part of the creature's architecture. This means that they must be grounded in an internal value system that is at the heart of the creature's autonomy. These internal values produce the valenced reactions that distinguish emotions from thoughts.
Concerning emotions themselves, several types of models implementable from an embodied AI perspective (and some of them from a symbolic AI perspective as well) and applicable to behavior selection problems can be found in the literature. I will classify them according to two criteria: the modeling goal and the viewpoint adopted on emotion.
The Modeling Goal
From this perspective, we can distinguish two ‘‘pure’’ models of emotions merging the two classifications of models in Sloman (1992) and Wehrle and Scherer (1995): phenomenon-based/black-box models and design-based/process models.
As I will argue below, both types of models can be suited for application to activity selection problems, and the choice of one or the other will depend on the reason why we want to endow our agents with emotions.
PHENOMENON-BASED OR BLACK-BOX MODELS
These models assume (implicitly or explicitly) the hypothesis that it is possible to somehow recognize the existence of an emotional state, and then measure its accompanying phenomena, such as physiological and cognitive causes and effects, behavioral responses, and so on. It is these accompanying phenomena that these models reproduce, paying exclusive attention to the input/output relation and disregarding the mechanisms underlying this relation. They thus incorporate an explicit, ‘‘prewired’’ model of emotions and emotional components in charge of producing some outputs given certain inputs. These models respond to a purely engineering motivation. They can be very useful tools when emotions are introduced as behavior-producing mechanisms related to particular goals or tasks. Therefore, they can be successfully used for behavior selection tasks in cases when both the features of the environment and the kind of interactions the agent can have with it are not too complex, are rather well known, and are determined in advance by the engineer of the system. In this case, however, the adaptive character of emotions and their roles are taken for granted from the outset; therefore the model cannot shed any light on the reasons and origins of the adaptive roles of emotions.
DESIGN-BASED OR PROCESS MODELS
These models pay attention to the way a system must be designed in order to produce a certain behavior (i.e., to the mechanisms underlying this behavior). These mechanisms can be biologically inspired or not, and can follow a top-down (e.g., Velásquez 1996), a bottom-up (e.g., Braitenberg 1984; Pfeifer 1993), or a middle-out approach (e.g., Cañamero 1997a, b). What really matters from this perspective is to come up with a design that allows us to establish a relation between the underlying mechanisms, the resulting behavior, and the environment where this behavior is situated, so as to better assess the applicability of the model.
Within this perspective, it is thus possible to elaborate different emotion-based behavior selection models using different mechanisms in order to assess their particular contributions and applicability, and to compare them as a first step toward progressively achieving a higher level of generalization in the understanding of their roles. Design-based models thus respond to a more ‘‘scientific’’ preoccupation—either because we use our artificial setting as a test bed for theories of emotions in natural systems, or because, even in the case when our main concern is to solve an AI or robotics problem, we hope to provide some feedback regarding the equivalent problem in biological systems. As for their use to design action-selection mechanisms, it can be more difficult to come up with the ‘‘appropriate’’ or expected behavior in each particular case, and it is not likely that these models are as robust as black-box ones, given their more exploratory nature. However, they are more useful in cases when neither the features of the environment nor the interactions of the agent with it are defined in advance or completely known, and in particular, when one of our goals is to understand the relationship of emotions to both.
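To make the contrast concrete, here is a minimal, purely illustrative sketch (in Python, not taken from the chapter) of what a phenomenon-based model amounts to in practice: a prewired table that maps recognized situations directly to emotional outputs, with no model of the underlying mechanisms. The rule names and responses are made-up assumptions.

```python
# Hypothetical sketch of a phenomenon-based (black-box) emotion model:
# a prewired mapping from recognized situations to emotional outputs,
# paying attention only to the input/output relation.

BLACK_BOX_RULES = {
    # perceived situation -> (emotion label, behavioral response)
    "predator_nearby":   ("fear",      "flee"),
    "goal_blocked":      ("anger",     "remove_obstacle"),
    "novel_object_seen": ("interest",  "approach"),
    "goal_achieved":     ("happiness", "resume_previous_activity"),
}

def black_box_emotion(situation: str):
    """Return the prewired emotional output for a recognized situation."""
    return BLACK_BOX_RULES.get(situation, ("neutral", "continue"))

# A design-based (process) model would instead simulate the mechanisms that
# produce such behavior (e.g., the hormone dynamics of section 4.5) and let
# emotional behavior emerge from agent-environment interaction.
print(black_box_emotion("predator_nearby"))  # ('fear', 'flee')
```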
The View on Emotion
From this perspective, we can distinguish structural models that split emotions into a set of components, and functional models, more interested in the adaptive, survival-related roles that emotions play in the interactions of the agent with its environment. Contrary to the previous classification criterion, these two types of models play complementary roles concerning the guidelines they can offer to design an action-selection mechanism.
COMPONENT-BASED MODELS
Component-based models postulate that an artificial agent can be said to have emotions when it has a certain number of components characterizing human (or animal) emotional systems. Picard (1997) proposes one such model with five components that are present in healthy human emotional systems—although not all the components need to be active at all times when the system is operating:
1. Behavior that an observer would believe arose from emotions (emergent emotions, emotional behaviors, and other external manifestations)
2. Fast, ‘‘primary’’ emotional responses to certain inputs
3. Ability to cognitively generate emotions by reasoning about situations, standards, preferences, or expectations that concern the individual's goals in some way or another
4. Emotional experience (i.e., physiological and cognitive awareness, and subjective feelings)
5. Interactions between emotions and other processes that imitate human cognitive and physical functions, such as memory, perception, attention, learning, decision making, concerns and motivations, regulatory mechanisms, immune system, and so on
This list is not intended to be a formal model capturing the essential features of human-level emotion, but rather a sort of extensional definition of what it means for a computer to fully ‘‘have’’ emotions. To evaluate whether the goal of endowing a computer with emotions has been reached, Picard (1997) also proposes a separate test to assess the presence of each component. For the purposes of our analysis, two questions are of particular relevance with respect to this model:
1. Are all its components necessary for action selection?
2. What kind of guidance can it provide for the design of an action-selection mechanism?
The full complexity of this model does not seem to be required for action selection. For this task, I would assume that component 1 is only needed in social ‘‘decision-making’’ situations; component 2 must be present; component 3 can be a plus in complex agents but it is not necessary; physiological awareness in component 4 is required, but neither cognitive awareness nor subjective feelings are; and component 5 is also needed for action selection. I would, however, argue that an agent that can make use of its (more limited) emotional mechanisms to face action selection problems in a sensible way ‘‘has’’ (some form of) emotions. The choice of the components to be included in the emotional system will depend not only on the complexity of the agent architecture and of the behavior to which it must give rise, but also on the nature and complexity of the action selection tasks that the agent will have to solve in the environment in which it is situated. A thorough analysis of action selection situations and of the types of architectures better suited for each of them is therefore needed before we can decide on a list of components relevant for a particular emotional system.
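Purely as a reading aid, the answer to the first question can be summarized in a small table-like structure; the wording is a paraphrase of the assessment above, not part of Picard's or the chapter's formulation.

```python
# Summary (paraphrased) of which of Picard's five components are needed
# for action selection, according to the assessment above.
PICARD_COMPONENTS_FOR_ACTION_SELECTION = {
    1: "only needed in social 'decision-making' situations",
    2: "must be present (fast, primary emotional responses)",
    3: "a plus in complex agents, but not necessary",
    4: "physiological awareness required; cognitive awareness and subjective feelings not",
    5: "needed (interactions between emotions and other processes)",
}
```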
As I have argued (Cañamero 1998), another major problem with component-based models is that they leave open the problem of how the different components relate to each other and what their relative priorities are in the execution of different tasks and for survival in general. For these reasons, component-based models are underconstrained from a design viewpoint, because all the choices are left to the designer. In my opinion, it seems more appropriate to conceive the design process in the opposite direction—starting with an analysis of the requirements that the environment, the action-selection task, and the agent architecture impose on the emotional system. In other words, the choice of ‘‘emotional components’’ must be guided by functional criteria.
FUNCTIONAL MODELS
Functional models pay attention to the properties of the structure of humans—and more generally animals—and their environment that can be transposed to a structurally different system (system = agent + environment) in order to give rise to the same functions or roles. One example of such properties that is most appropriate for action selection is provided by Frijda (1995):
Properties of humans/agents. Humans can be said to have the following properties relevant for the understanding of emotion:
• They are autonomous.
• They have multiple concerns. 5
• They possess the capacity to emit and respond to (social) signals.
• They possess limited energy- and information-processing resources and a single set of effectors.
Features of the environment. The human environment presents the following characteristics relevant to emotional systems:
• It contains limited resources for the satisfaction of concerns.
• It contains signals that can be used to recognize that opportunities for the satisfaction of the individual's goals and concerns or occurrences of threats might be present.
• It is uncertain (probabilistic).
• It is in part social.
Functions of emotions. From these characteristics (relevant to the understanding of emotion) of the human system and environment, Frijda concludes that the functions of emotion are as follows:
• To signal the relevance of events for the concerns of the system.
• To detect difficulties in solving the problems that these events pose with respect to the satisfaction of concerns.
• To provide goals for plans to solve these problems in case of difficulties, resetting goal priorities and reallocating resources.
• To do this in parallel for all concerns, simultaneously working toward whatever goal the system is actively pursuing at a given moment, as well as in the absence of any actually pursued goal.
These functional requirements are much more specific than the ‘‘components’’ ones, and therefore seem easier to attain, although they still need to be refined by adding the elements of the agent-environment system that are specific to the particular action selection situations the agent will have to solve (e.g., what the particular concerns of the agent are, what kind of signals can be exploited from the environment, etc.). Functional models also leave more freedom concerning the particular elements to be used in order to achieve these functionalities. However, as Frijda himself points out, this model remains underspecified as far as the underlying architecture and implementation mechanisms are concerned, and many problems and design issues remain open.
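As one possible reading of these functional requirements, the following minimal sketch (with invented names and thresholds, not an implementation from the chapter) shows how they might translate into an interface: a monitor that appraises every event for all concerns in parallel and resets goal priorities when a relevant difficulty or opportunity is detected.

```python
# Illustrative sketch (invented names/thresholds) of Frijda's functional
# requirements: signal concern relevance, detect difficulties, and reset
# goal priorities, in parallel for all concerns.
from dataclasses import dataclass, field

@dataclass
class Concern:
    name: str
    priority: float = 0.0   # current weight among the agent's goals

@dataclass
class ConcernMonitor:
    concerns: list = field(default_factory=list)

    def appraise(self, event: str, relevance: dict) -> None:
        """Appraise one event for ALL concerns and reallocate priorities."""
        for concern in self.concerns:                  # in parallel for all concerns
            signal = relevance.get(concern.name, 0.0)  # relevance of event to concern
            if signal > 0.5:                           # difficulty/opportunity detected
                concern.priority += signal             # reset goal priorities

monitor = ConcernMonitor([Concern("safety"), Concern("energy")])
monitor.appraise("predator_sighted", {"safety": 0.9})  # safety now takes precedence
```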
4.4 Design Issues
In his ‘‘New Fungus Eater’’ or emergent view of emotions, Pfeifer (1993) claims that all the controversies and open problems surrounding the characterization of emotions (such as the identification of the components of emotions, the debate on the existence of a set of basic emotions, or the issues of their origins, activation, and roles) are due to the adoption of an inadequate modeling approach for their study. According to him, ‘‘these controversies need not be resolved: if the approach is appropriate, they largely ‘disappear’ or are resolved automatically’’ (p. 918). Because most existing models up to the date of Pfeifer's paper were phenomenon based, he proposed the adoption of a bottom-up, design-based modeling approach, where emotional behavior is an emergent phenomenon in the eye of the beholder. Although I agree that design-based modeling, and its principle of going below the level of the phenomena we want to study, is the most appropriate for the understanding of the origins and adaptive value of emotions in action selection, Pfeifer's position presents two major drawbacks:
First, modeling and implementing emotions as purely emergent phenomena in the eye of the beholder rather than as an integral part of the agent architecture can be an exciting challenge for the designer of a robot, but this view misses what I consider two of the most important contributions of artificial emotional systems: their ability to relate to and influence different behavioral and cognitive subsystems at the same time, and the feedback they can provide to understand the mechanisms involved in natural emotions. Second, the claim that this approach dissolves all the problems seems really extreme. At best, it can dissolve (some of) the problems posed at the ‘‘phenomenon level,’’ but it moves the problems ‘‘down’’ to the design level. In particular, a fundamental problem arises from the outset: the elaboration of an ‘‘emotional agent’’ must be guided by a design concern, but what are the criteria that will guide our design choices? As a general guideline, I would propose the following two principles:
1. Design must be guided by the following question: What does this particular (type of) agent need emotions for in the particular (type of) environment it inhabits?
2. One should not put more emotion in the agent than what is required by the complexity of the agent-environment interaction.
These ideas are, however, too general and leave open many design problems. In Cañamero (1998, 2001), I examine some of the questions that the functional view of emotions leaves unanswered with respect to the design of artificial emotional systems in general—namely, concerning the level and complexity of the model, the controversy between engineering versus evolving the emotional system, and the evaluation of models of emotions and of their contributions to agent performance. Let us now consider some of the problems that must be taken into account when designing emotions for activity selection.
Study of Environments
A thorough comprehension of the role of emotions in activity selection requires understanding the precise relationship between the emotional systems and the problems that each emotion contributes to solve. These problems are best characterized in terms of types of environments that allow us to understand how and why the different emotions emerged and are adaptive in a particular context.
A classification of environments in terms of features that are relevant for action selection (dynamism, uncertainty, threats, availability of resources, social interactions, etc.) is thus a necessary first step toward a principled design of emotional systems contributing to this task. These features can provide good guidelines to design emotional systems for action selection—in at least three main respects. In the first and perhaps most trivial one, they provide good clues to decide what particular emotions the agent needs in that particular context. This is the case because environmental features define the kinds of problems that the agent is confronted with, and the activities or competences it must incorporate to deal with them. In doing so, they also indirectly indicate what functions or mechanisms are needed to ensure survival in that environment: protection from (particular types of) dangers, removal of obstacles, attachment to conspecifics, and so forth. In this case, therefore, emotions are selected on the grounds of their survival-related functions. Second, the characteristics of the environment allow us to assess how important each emotion is in that precise type of context, and therefore how ‘‘easily’’ or how often it should be/is likely to be activated—that is, what the most appropriate activation threshold is for each emotion 6 in a given type of environment. If we follow Frijda (1995) and see emotions as mechanisms that serve to safeguard one's own ‘‘concerns’’—goals, motivations—by modifying or maintaining one's relation to the environment, rather than as mechanisms aiming at modifying the environment itself, then the significance of the diverse types of relations one can have with the environment will depend, to a large extent, on the features present in it. Some of the modifications mentioned by Frijda include protection from influences from the environment in the case of fear, blocking these influences in the case of anger, diminishing the risks of dealing with an unknown environment in the case of anxiety, and so on. The weight that each of these mechanisms has in a particular type of environment can be either established by the designer or left for the agent to learn as a consequence of its actions. Third, the characteristics of the environment help us decide the ‘‘cognitive complexity’’ required for the emotional system (i.e., which ‘‘components’’ can be meaningfully included).
For example, in a moderately dynamic environment where resources have relatively stable locations, the use of an ‘‘emotional memory’’ of places where different objects are likely to be found (e.g., where to seek shelter when faced with a fearful situation) can be much more efficient than blind search, in spite of its additional cognitive/computational cost. On the contrary, if the environment is so dynamic that objects only remain at a particular location for a short period of time, this type of memory can be more inefficient than random search. It can even have a negative impact on survival, because the time used to recall and reach the position of an object that is likely to have moved could have been used more efficiently to execute a different behavior.
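A back-of-the-envelope illustration of this tradeoff, with made-up costs that are not from the chapter: the expected cost of memory-guided search grows with the probability that the remembered object has moved, and beyond some point it exceeds the cost of searching at random from the start.

```python
# Toy comparison (hypothetical costs) of memory-guided versus random search.

def memory_guided_cost(p_moved: float,
                       cost_to_remembered: float = 5.0,
                       cost_random: float = 20.0) -> float:
    """Go to the remembered location; if the object has moved (probability
    p_moved), fall back to random search."""
    return cost_to_remembered + p_moved * cost_random

for p in (0.1, 0.5, 0.9):
    print(f"p_moved={p:.1f}: memory-guided={memory_guided_cost(p):.1f}, random=20.0")
# Low p_moved (stable locations): memory wins (7.0 < 20.0).
# High p_moved (highly dynamic world): memory loses (23.0 > 20.0).
```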
Choice of Primitives
Emotions are included with a purpose in mind, and therefore we must carefully choose the mechanisms we need for the system to display the desired behavior, and the level at which it is meaningful to model them within our architecture. When emotions must play a role in behavior selection, are these mechanisms better thought of in terms of continuous dimensions or discrete categories, to map the usual distinction in emotion theory? The answer to this question largely depends on the type of model we are using: black box/phenomenon based, where we must explicitly include in the model the elements that will produce the desired behaviors; or process/design based, where the model must go below the level of the phenomenon under study to allow emotional behavior to emerge naturally out of the system's interactions with its environment. For the first type of model, a ‘‘predefined’’ set of ‘‘discrete’’ emotions seems the most natural choice, but this poses the problem of which ones to choose and how ‘‘realistic’’ our model of them must be. As we have seen in the previous section, the characteristics of the environment can help us decide which emotions are relevant in each particular case. As for how ‘‘realistic’’ our model should be, the answer largely depends on what our agent architecture is like and what our purpose is in building it. For the second type of model, emotional behavior could perhaps be better grounded in some underlying simple mechanisms or ‘‘internal value systems’’ (Pfeifer 1993) encoding features along some dimensions. Behavior that could be thought of as arising from emotions would then be an emergent property of the agent-environment interaction.
One could also think of encoding those features in a genotype that would lead to ‘‘emotional behavior’’ (the phenotype) and that could be evolved to generate agents with emotion-based action selection systems adapted to different types of environments. However, the problem of selecting the right primitives (the genes, in this case) to give rise to meaningful behavior is still present; it could even be harder in this case, as the genotype-phenotype relation is not a linear one, and it is not fully understood. In my view, however, presenting the discrete categories/continuous dimensions characterizations of emotions as conflicting ones is an ill-posed problem that opposes classifications belonging to different realms. Dimensions such as valence and arousal are ‘‘structural’’ features that are present in all emotions. If we implement emotions in our system as discrete categories, they must also possess these properties in order to function as natural emotions do. I would thus say that they are necessary features of emotions; however, they are not sufficient to characterize emotions because they leave out function (What is each emotion for in this agent-environment system? What are their precise roles in action selection?). And function is precisely the main criterion underlying classifications of emotions in terms of categories. The two characterizations are thus complementary, and both must be taken into account, explicitly or implicitly, when designing an action-selection mechanism, regardless of the primitives we choose to implement emotions.
Connection with Motivation and Behavior
How should we relate emotions to the other two main elements involved in activity selection, namely motivation and behavior? Let us consider behavior first. The connection between emotions and behavior is not a direct one. As pointed out by several authors (see, for instance, Ortony, Clore, and Collins 1988; Frijda 1995; Rolls 1999), the fact that emotions are related with goals, rather than with particular behavioral response patterns, explains why they contribute to the generation of richer, more varied, and flexible behavior. Some emotions, and this is the case in relation with action selection, show characteristic action tendencies, to put it in Frijda's terms, but this relation is far from being automatic or reflexlike. The relation between emotions and behavior is thus through goals or, in the context of action selection, motivations.
Some authors (e.g., Pfeifer 1993) place (basic) emotions at the same level as motivations or drives, but I consider emotions as second-order modifiers or amplifiers of motivation, following, for example, Tomkins (1984) and Frijda (1995). In an action-selection system, motivations set goal priorities and guide the choice of the behavior(s) to be executed—those best contributing to the satisfaction of the most urgent current need(s). What role is then left for emotions in action selection? Quoting Frijda, ‘‘emotions alert us to unexpected threats, interruptions, and opportunities’’ (1995, 506–507). For this, they may interrupt ongoing behavior by acting on (amplifying or modifying) motivation, which in some cases can imply resetting goal priorities. Emotions, at least as far as action selection is concerned, are better seen as a ‘‘second-order’’ behavior control or monitoring mechanism acting on top of motivational control to safeguard survival-related concerns by modifying or maintaining our relation to the environment. As argued in Cañamero (1997a) and Frijda (1995), in order to preserve the generality feature of emotions (their task independence), the connection between motivations and emotions must be indirect, with emotions running independently of, and in parallel with, motivations. One could, in principle, imagine different mechanisms for this indirect connection. The solution I have proposed, as we will see in the next section, relies on the use of a synthetic physiology through which these elements interact.
4.5 Behavior Selection Using Homeostasis and Hormones
The agent architecture I proposed (Cañamero 1997a, b) relies on both motivations and emotions to perform behavior selection. Its design can be seen as combining a ‘‘bottom-up’’ approach, which grounds emotions in a physiology, and a ‘‘top-down’’ one, where guidelines are drawn from functional considerations. The experimental setting is a two-dimensional world—Gridland—where agents (creatures) can perform a number of tasks in order to maintain their well-being and survive as long as possible—their ultimate goal. The simulated creatures, with very limited perception and action capabilities, have a synthetic physiology (survival-related controlled variables and hormones) that constitutes their internal state and gives them motivations controlled in a homeostatic way. Controlled variables are essential for the creature's survival, and their value must remain within a viability range.
Each of these variables activates a different motivation (aggression, cold, curiosity, fatigue, hunger, self-protection, thirst, and warm) when its value goes out of the viability range. Motivations are thus driven by internal needs or drives. Hormones are released when the agents have an emotional reaction. They can only act on specific receptors attached to the internal sensors that monitor the controlled variables. Each hormone can affect several variables, but it has a particular valence with respect to each of them; therefore, at least two different hormones are needed for each controlled variable. Hormones have two functions—communicative and synthesizing. On the one hand, they allow emotional states to selectively act on certain subsets of controlled variables—those responsive to them. On the other hand, they modify (increase or decrease, depending on the particular hormone) the amount of the controlled variables that are receptive to them. This way, emotional states contribute to keeping the stability of the internal milieu that is necessary for the creatures' survival and adaptation. At each point in time, a creature can have several ‘‘active’’ motivations with varying degrees of intensity; the intensity of each motivation depends on the magnitude of the error of its corresponding controlled variable—its deviation with respect to the ideal value. The motivation with the highest intensity will try to execute the behaviors that can best contribute to satisfying the most urgent need(s)—a consummatory behavior (attack, drink, eat, play, rest, walk, and withdraw) if the stimulus is present, an appetitive one (look-for, look-around, avoid, etc.) otherwise. The execution of a consummatory behavior affects both the external world and the creature's physiology. A second motivation is also taken into account to allow for opportunistic behaviors. For example, if a creature is primarily hungry and a bit less thirsty, it will actively look for food in order to satisfy its most urgent need; however, if it finds water on its way, it will stop and drink, and later continue its search for food. The creatures can also enter different emotional states (anger, boredom, fear, happiness, interest, and sadness) as a result of their interactions with the world—the presence of external objects or the occurrence of internal events caused by these interactions. Emotional states affect the creatures' physiology, attention, perception, motivation, and behavior.
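A minimal sketch of the homeostatic mechanism just described, written as illustrative Python rather than the original Gridland code: each controlled variable has an ideal value, motivation intensity equals the error of its variable, and the most intense motivation selects a consummatory behavior if its incentive stimulus is present, an appetitive one otherwise. Names and numbers are assumptions for illustration.

```python
# Illustrative sketch (not the original implementation) of homeostatically
# controlled motivations: intensity = error of the controlled variable, and
# the strongest motivation selects a consummatory or appetitive behavior.
from dataclasses import dataclass

@dataclass
class ControlledVariable:
    name: str
    value: float
    ideal: float
    viability: tuple          # (min, max) range compatible with survival

    @property
    def error(self) -> float:
        """Deviation from the ideal value; drives motivation intensity."""
        return abs(self.value - self.ideal)

@dataclass
class Motivation:
    name: str
    variable: ControlledVariable
    consummatory: str         # behavior executed when the stimulus is present
    appetitive: str           # behavior executed otherwise (e.g., look-for)

    @property
    def intensity(self) -> float:
        return self.variable.error

def select_behavior(motivations, perceived_incentives):
    """Pick the behavior serving the most urgent need (highest intensity).
    (The original architecture also considers a second motivation, allowing
    opportunistic behaviors such as drinking water found along the way.)"""
    winner = max(motivations, key=lambda m: m.intensity)
    if winner.name in perceived_incentives:   # incentive stimulus present?
        return winner.consummatory
    return winner.appetitive

energy = ControlledVariable("energy", value=0.3, ideal=1.0, viability=(0.1, 1.0))
water  = ControlledVariable("water",  value=0.8, ideal=1.0, viability=(0.1, 1.0))
motivations = [Motivation("hunger", energy, "eat",   "look-for-food"),
               Motivation("thirst", water,  "drink", "look-for-water")]
print(select_behavior(motivations, perceived_incentives={"thirst"}))  # look-for-food
```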
I will briefly examine here the design choices underlying the emotional system in terms of the issues presented in the previous section.
Design Choices
FEATURES OF THE ENVIRONMENT
The microworld, Gridland, is a two-dimensional toroidal grid with discrete space and time. It contains resources (food and water), obstacles, and two species of creatures: Abbotts—the ones integrating emotions—and Enemies—predators with a very simple behavior and a fixed arbitration mechanism. It presents the following features relevant for action selection.
Dynamism: Gridland is a highly dynamic environment in which all objects can change locations. The only memory that Abbotts have is thus memory of the different types of objects encountered in the past, but not of their location.
Availability of resources: A number of food and water sources determined by the user are placed (randomly or at selected locations) all over the microworld at the beginning of a run. Food and water can be consumed by both species, and several creatures can eat or drink from a source at the same time. When a source is exhausted, it disappears and a new one is created at a random location. Resources are thus never absent from the world, but their amount and location change continuously, and they might be missing in a particular area when needed.
Uncertainty: Uncertainty arises from two main sources: limited and noisy perception (uncertainty about the identity of objects), and the high dynamism of the world (uncertainty about the presence and location of objects beyond the small area currently perceived).
Threats: These can be present either in the external environment (Enemies, angry Abbotts) or in the internal one (physiological parameters reaching values that menace survival).
Social interactions: Most interactions, both intra- and interspecies, are either competitive (consuming the same resource) or aggressive (flee or fight). Abbotts can get happy and play in the presence of a conspecific, but in the current implementation, this does not lead to cooperation or to attachment behaviors.
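As an illustration only, these features could be captured in a simple configuration structure used when deciding which emotions the agent needs; the parameter names are illustrative assumptions, not Gridland's actual configuration.

```python
# Illustrative summary (hypothetical parameter names) of the features of
# Gridland that are relevant for action selection.
GRIDLAND_FEATURES = {
    "dynamism": "high",  # all objects can change location; no location memory kept
    "resources": {"food": True, "water": True, "amount_and_location_vary": True},
    "uncertainty": ["noisy perception", "limited perception range"],
    "threats": ["Enemies", "angry Abbotts", "internal physiological danger"],
    "social_interactions": ["competitive", "aggressive", "play (no cooperation)"],
}
```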
CHOICE OF PRIMITIVES
The model includes explicit mechanisms for (‘‘basic’’) emotions. Emotions are characterized by: a triggering event; an intensity proportional to the level of activation; an activation threshold; a list of hormones, which are released when the emotion is active; and a list of physiological manifestations. A subset of discrete categories corresponding to ‘‘primary’’ or ‘‘basic’’ emotions was chosen to function as monitoring mechanisms to deal with important survival-related situations that arise in the relations the creatures entertain with their environment, namely:
Anger: A mechanism to block the influences from the environment by abruptly stopping the current situation. Its triggering event is the fact that the accomplishment of a goal (a motivation) is menaced or undone.
Boredom: A mechanism to stop inefficient behavior that does not contribute to satisfying any of the creature's needs. Its triggering event is prolonged repetitive activity.
Fear: A defense mechanism against external threats. Its triggering event is the presence of Enemies.
Happiness: A double mechanism. On the one hand, it is a form of reequilibration triggered by the achievement of a goal. On the other hand, it is an attachment mechanism triggered by the presence of a conspecific, although this second role is not further exploited in the current implementation.
Interest: A mechanism for the creature to engage in interaction with the world. Its triggering event is the presence of a novel object.
Sadness: A mechanism to stop an active relation with the environment when the creature is not in a condition to get a need satisfied. By slowing down the creature (its metabolism and motor system) it prevents action for some time—long enough, it is hoped, for either the world or the internal state of the creature to change. Its triggering event is the inability to achieve a goal.
These discrete categories, however, also have properties of valence (through the release of hormones that act as ‘‘pain’’ or ‘‘pleasure’’ signals) and arousal (physiological activity).
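These primitives can be pictured as simple records, again purely as an illustrative sketch with made-up field values: each emotion links a triggering event to an activation threshold and a list of hormones released when it becomes active.

```python
# Illustrative sketch (made-up values) of the emotion primitives listed above.
from dataclasses import dataclass

@dataclass
class EmotionPrimitive:
    name: str
    trigger: str              # triggering event
    threshold: float          # activation threshold
    hormones: list            # hormones released when the emotion is active
    intensity: float = 0.0    # proportional to the level of activation

    def update(self, event: str, activation: float) -> list:
        """Release this emotion's hormones if its triggering event occurs
        with an activation above the threshold."""
        if event == self.trigger and activation >= self.threshold:
            self.intensity = activation
            return self.hormones
        self.intensity = 0.0
        return []

fear = EmotionPrimitive("fear", trigger="enemy_present",
                        threshold=0.4, hormones=["adrenaline_like"])
print(fear.update("enemy_present", activation=0.7))  # ['adrenaline_like']
```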
CONNECTION WITH MOTIVATION, BEHAVIOR, AND PERCEPTION
Emotions are activated either by significant events or by general stimulation patterns (sudden stimulation increase, sustained high stimulation level, sustained low stimulation level, and sudden stimulation decrease), and are discriminated by particular patterns of physiological parameters specific to each emotion. The model can either allow several emotions to be activated at the same time (Cañamero 1997b), all of which influence behavior to various degrees through hormone release, or adopt a winner-takes-all strategy (Cañamero 1997a), where a single emotion defines the affective state of the creature. Specific hormones that selectively modify the levels of controlled variables (and, to a larger extent, the readings of the sensors tracking these variables, to reflect the fact that the visceral effects of emotions are usually much slower than the behavioral ones) are released by the different emotions when they are active. In addition, other hormones are emotion independent, and they are released as a function of arousal. Emotions run in parallel with the motivational control system, continuously ‘‘monitoring’’ the external and internal environment for significant events. The connection with behavior is through their influence on motivation. Emotions modify motivations as follows. The effect of hormone release on the values of controlled variables is computed before motivations are assessed, producing a modification of the creature's motivational state. The error signals of the motivations can then be different from those they would have had in the absence of emotions, and therefore their activation level or intensity will be different as well. This can change the priority of motivations (control precedence), which in turn modifies what the creature attends to and what behavior gets selected—that is, ongoing behavior is interrupted and a new one is selected to rapidly deal with an (external or internal) threat, an inadaptive condition in behavior execution, or an opportunity. Hormonal modification of controlled variables can also change the way in which a behavior is executed—its duration or its motor strength, depending on the behavior—when the motivation's intensity is modified, as this intensity is passed to behaviors. In addition to modifying the motivational state, emotions also influence the creatures' perception of both the external world and their own body, and this can in turn affect their behavior. Perception of the external world is altered by hormonal modification of the ‘‘vigilance threshold’’ parameter in the ART-1 neural network (Carpenter and Grossberg 1988), which is responsible for forming and remembering object categories.
This leads, for example, to a coarser granularity in the categories formed and therefore to a ‘‘confused’’ state when emotions are active with a very high intensity, or to finer categorization under an ‘‘alert’’ state (moderately high emotion activation). Altered body perception is achieved through hormonal modification of the readings of internal sensors (those measuring the values of controlled variables). For example, endorphin is released under a very happy emotional state, and this reduces the perception of pain. Altered perception can lead either to more efficient or to inappropriate behavior, and this evidences the informational content of emotions, which go beyond mere hormone levels to constitute true ‘‘brain subsystems.’’ An example of altered perception producing more efficient behavior is the reduction of pain perception in critical circumstances. Examples of inefficient behaviors caused by emotion-altered perception are those produced under a ‘‘confused’’ state, leading to wrong object categorization. The formation of too coarse object categories can cause the incentive stimuli of quite different behaviors to be grouped under a single category, and thus trigger the activation of the wrong behavior. For example, if blocks and Enemies are grouped under the same category, when the fatigue motivation is active, an Abbott can try to sleep on top of an Enemy, be bitten by it, and feel pain. However, this inappropriate behavior can, in the end, turn out to be adaptive—intense pain will produce a fleeing reaction that can lead, depending on the subsequent interactions with the environment, either to a more ‘‘alert’’ state that raises the perception threshold and produces a finer (‘‘normal’’) categorization, or to death if pain keeps increasing and goes out of the viability range.
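Putting the pieces together, here is a deliberately simplified sketch of one control cycle of the mechanism described above (not the original implementation; the hormone effect table and numbers are invented): hormone effects are applied to the sensor readings of the controlled variables before motivation intensities are assessed, so an active emotion can change which motivation, and hence which behavior, wins.

```python
# Simplified control cycle (illustrative only): an active emotion releases
# hormones that bias sensor readings, and motivations are assessed on the
# biased readings, so control precedence can shift and interrupt behavior.

IDEAL = {"energy": 1.0, "integrity": 1.0}
HORMONE_BIAS = {"fear": {"integrity": -0.5}}   # invented effect table

def motivation_intensities(readings):
    """Intensity of each motivation = error of its controlled variable."""
    return {var: abs(IDEAL[var] - value) for var, value in readings.items()}

def control_cycle(sensor_readings, active_emotions):
    # 1. Hormones released by active emotions distort the sensor readings
    #    (visceral effects on the variables themselves are slower).
    biased = dict(sensor_readings)
    for emotion in active_emotions:
        for var, delta in HORMONE_BIAS.get(emotion, {}).items():
            biased[var] += delta
    # 2. Motivations are assessed on the biased readings (control precedence).
    intensities = motivation_intensities(biased)
    winner = max(intensities, key=intensities.get)
    # 3. The winning motivation selects the behavior; its intensity also sets
    #    the behavior's strength or duration.
    behavior = {"energy": "look-for-food", "integrity": "withdraw"}[winner]
    return behavior, round(intensities[winner], 2)

print(control_cycle({"energy": 0.6, "integrity": 0.9}, active_emotions=[]))
print(control_cycle({"energy": 0.6, "integrity": 0.9}, active_emotions=["fear"]))
# Without fear, hunger wins ('look-for-food'); with fear active, the biased
# integrity reading makes self-protection win ('withdraw').
```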
Looking Backward, Looking Forward
This model meets the main requirements that action selection in a highly dynamic environment imposes on an emotional system. It includes a wide enough repertoire of mechanisms to allow the agent to modify the relation with its environment, both external and internal, in ways that are beneficial for the attainment of the agent's goals, contributing to its survival and well-being. The connection between emotions, motivations, and behavior through a physiology is also appropriate for action selection.
In this respect, the fact that emotions constitute a ‘‘second-order’’ behavior control mechanism running in parallel with motivational control allows them to continuously monitor and influence the other two elements in a timely manner, while keeping their generality (i.e., independence) with respect to objects, tasks, and behaviors. Because the relation between emotions and behavior is only indirect—through motivations—the agent can show flexible and varied behavior. As for their connection to motivations, emotions are capable of appropriately modifying (by altering motivation/goal priorities) or amplifying (by raising or lowering the intensities of motivations) the effects of motivation on behavior to ensure proper treatment of situations needing an urgent response. One of the main problems with this architecture is that it was totally hand coded, and therefore very difficult to tune—particularly the connections among the different elements through the physiology. Using evolutionary techniques to generate different emotional systems would therefore be a very interesting direction to explore, as it would also allow us to evaluate the performance and adaptive value of emotions for different action selection tasks and environments. The emotional system was designed to meet the requirements of action selection in a very dynamic environment, where timely and fast responses are more important than sophisticated solutions. The emotional system is thus very simple. In particular, the more ‘‘cognitive’’ aspects of emotions, such as anticipation, appraisal of the situation and of the possible consequences of the emotional response, control of emotions, and emotion-related learning, were not explored. These aspects can, however, be important for more ‘‘complex’’ action-selection situations—in particular, when social relations come into play.
4.6 Conclusion
In this chapter, I have advocated a ‘‘bottom-up’’ philosophy for the design of emotional systems for autonomous agents, guided by functional concerns, and I have considered the particular case of emotions as a mechanism for action selection. In my view, the elaboration of artificial emotional systems should be guided by two main concerns:
1. Design must be guided by the following question: What does this particular (type of) agent need emotions for in the particular (type of) environment it inhabits?
2. We should be careful not to put more emotional machinery in the system than what is required by the complexity of the agent-environment interaction.
The concrete realization of these ideas implies that the design process must start with an analysis of the requirements that the features of the environment, the characteristics of the task, and the agent architecture impose on the emotional system. This is particularly important if we see emotions as mechanisms that aim at modifying or maintaining the relation of the agent with its (external and internal) environment—rather than modifying the environment itself—in order to preserve the agent's goals of well-being. Requirements determine what kind of activities and mechanisms are needed to ensure that this relation favors the preservation of the agent's goals. Emotions can then be selected and designed according to the roles they play with respect to this relation. In the case of action selection, the most relevant features of the environment are dynamism, uncertainty, threats, availability of resources, and social interactions. A careful analysis of these features can provide good pointers as to what kind of emotional system is needed for the agent to modify the relation with its particular environment in ways that are beneficial for the preservation of its own goals, especially in three main aspects: the choice of the emotions needed in that context, the relevance of each emotion in that precise environment, and the ‘‘cognitive complexity’’ (emotional ‘‘components’’) that is meaningful for that concrete emotional system. As for the requirements that action selection imposes on emotions, the most important consideration is how emotions relate to motivation and behavior in order to properly deal with situations requiring a rapid response—threats, unexpected events, inefficient behavior in the pursuit of goals, and so on. I see emotions as a ‘‘second-order’’ behavior control mechanism acting in parallel with, and on top of, motivational control, to continuously monitor the (external and internal) environment for situations in which the relation of the agent with its environment has some significance with respect to the agent's goals. They affect behavior selection indirectly, by amplifying or modifying the effects of the motivational system (e.g., resetting goal priorities and redirecting attention toward a situation that needs a prompt solution).
The solution I have proposed to connect these three elements—emotions, motivations, and behavior—while keeping them at different ‘‘levels’’ relies on a synthetic physiology through which they interact. However, this solution was conceived for a particular type of action-selection problem and environment, and other solutions might be more adequate in other contexts. Finally, the requirements that the agent architecture imposes on the emotional system can vary greatly depending on the type of architecture we are using, and this in turn depends on what our purpose is in endowing our agent with emotions.
Acknowledgments
I am grateful to Robert Trappl and Paolo Petta for creating the perfect atmosphere to meet with other colleagues and exchange ideas about Emotions in Humans and Artifacts in their lab in Vienna, and for (patiently!) encouraging me to write this chapter. Discussions with the other participants at the Vienna meeting were an invaluable source of ideas. David Moffat provided additional comments on an earlier draft of this chapter.
Discussion
A Box Labeled ‘‘Emotions’’
Elliott: May I add some reflections: As soon as you even label this work as being ‘‘emotion’’ work and start using these labels that we understand to mean some kind of social interaction, we are already couching this as work that we can understand because we know what it is. And if you extract this anthropomorphism or the tendency to it, I wonder if it’s—I do not want to use the word ‘‘hypocritical,’’ but—misleading, in a way. And the same would apply to design. If you design the agents in ways which acknowledge we are modeling this after human social systems, and when we are not doing that at the low level, but on a much higher, symbolic level, then here is the case that I think can be made: If you have a design that is understandable by us, even if it includes some aspects that might not necessarily be the most efficient, it’s still creating the impression that we understand what’s happening in the system.
Ortony: You know, it's like the underlying principle of Joe Bates's concept of believability. Believability means to cause people to anthropomorphize these agents. I mean, it's not that this is a weakness. The idea is: We want to make you people think that these are emotional beings. We exploit the tendency to anthropomorphism.
Sloman: And that can be an engineering goal, if what you produce is for entertainment. But I think it's very important to distinguish that goal from the goal which Edmund and I and other people have, which is to try to model something about human beings. I think there is a very interesting difference between your architecture and Edmund's, which other people may or may not have noticed, but I think this difference crops up all over places. There are some people who build architectures, and there's this box labeled ‘‘emotions.’’ Then other people, who produce series of emotions, build architectures, and there is no box labeled ‘‘emotions.’’ And I think anyone who puts in a box labeled ‘‘emotions’’ may be doing something useful for the engineering goals, but he has totally got it wrong as regards the modeling of human emotional systems. We have to be clear what sorts of goals we have in building these systems. And some are for entertainment purposes. Some are for educational purposes, some are controlling some equipment in a factory, or on a planet somewhere, and we have to give the thing a certain amount of autonomy. For different sorts of purposes, we will have different requirements. It's also worth pointing out that there are also different kinds of anthropomorphism. Some people in the room, or maybe everyone, will remember the days when debugging programs involved looking at code printout. Nowadays, that's very rarely done, because we are able to give machines the ability to tell us much more in terms that we can understand what's going on to help us with debugging and tracing. So, we anthropomorphize, because we interpret what the machine puts out as saying something about categories that we are also using for describing humans. That's a sort of anthropomorphism that's actually essential for getting computers to enhance themselves in both ways and to help us get to the next stage. So there are different kinds of anthropomorphism: some of those are there because they are fun and entertaining, others are there because they are deeply functional.
Picard: I was just reading on the way here a debate going on in Science and Science Weekly about why highly rational scientists who should know better would practice animism. Basically stating that things are living when they are not or, as Ben Shneiderman accuses computer scientists, talking about machines that learn when that misleads the general public into thinking the machines can do more than they really do, and so forth. Here is a letter to the editor that says: ''It should surprise nobody that animism is popular among sophisticated adults in any culture including our own. People and presumably other animals use animistic explanations because under most real world conditions, animism provides the best model for predicting how the world will respond to what people do. A fire-fighter facing a blazing building has neither the time nor the means to develop a three-dimensional finite elements model to predict the development of the fire. It's much more efficient to mentally model the fire as a hungry animal that can be stopped by depriving it of fuel to eat and air to breathe. Animism is the first resort for anyone trying to deal with a situation that is too complex or has too many unknowns to be modeled in a more 'rational' way.''—So, it says, when the chips are down, even sophisticated scientists use the best mental approach available and really don't care whether psychologists or philosophers approve of them or not. I think we need to separate when we are talking among ourselves who know what a phrase like ''recognize emotion'' means, and when we are talking to the public. We are using the closest approximation that we all can relate to for trying to describe these phenomena. We are trying to come up with the equivalent of these three-dimensional finite models, recognizing that there is some distinction here. But we need to be especially careful when we talk to the general public about these things: we should go to a little more trouble to elaborate on what we mean by the term emotion, or phrases recognize emotion and have emotion.
Bellman: I just want to say that the underlying assumption in that article, at least one, was that somehow such models are less sophisticated than are finite differential models. And of course that's just not true. That's part of the problem that we are trying to contend with exactly in this room, which is that somehow emotions are considered something that's primitive processing. I can't help start picking away at an attitude that is going into another line—that makes
that differentiation between the primitive processing of emotions versus a sophisticated ''rationalistic'' model. And, frankly, a rationalistic model is not nearly as sophisticated.
Picard: I think we all know what we are really talking about: We talk about intentions, desires, emotions—all this terminology (that refers only approximately to what we are trying to model or build), and we should just be careful when we go to a more general audience that we recognize that they may carry different conceptions for all that.
Sloman: Two separate issues here. One is I think a deep metaphysical point about how many layers of reality there actually are. And there was one view which falls into what I call the—I don't know who coined the phrase, but I like it—'nothing-but'-ery fallacy: ''There really is nothing but all these atoms rushing around,'' or whatever. This fallacy implies that there isn't higher level organization, there aren't intentions, there aren't feedback loops, there aren't, for instance, waves rushing across the sea at high speed with a destructive power. They are just all those atoms and molecules going round. It is just wrong to say that there is nothing but this low-level thing. There are lots of higher-order structures. And we don't have a good view of how many different kinds of ontology we really should use, as scientists, not just as naive people finding a convenient way to think about it, but as scientists. So, that's one issue. The second issue is that we should be careful, because we use a label that may not be justified. I wonder how many people here have read McDermott's article from about twenty years ago, called ''Artificial Intelligence Meets Natural Stupidity'' (see note 7), where he is talking about people who use the word ''plan'' or ''goal'' or ''infer'' in their programs, and then say ''my program has plans'' or ''my program does inferences'' or whatever. We are all in danger of making that overinterpretation of the things we designed, because we put labels on them that can mislead people, because the label does not entail the richness that they would assume. So, it's one thing to say you shouldn't make that mistake. It's another thing to say reality is just this flat 'nothing-but'-ery kind of model. And we shouldn't fall into this fallacy.
Cañamero: Just let me add a remark to finish: You (Aaron) said previously that one of the advantages of modeling emotions artificially is that you have a whole space of architectures that you can explore. But what criteria do you take to explore that space?
What Kind of Criteria?
Sloman: I am not sure what you mean. If you mean a criterion for choosing an architecture—it will depend on what you are trying to do. Are you trying to model human beings, or ants, or chimpanzees, or are you trying to build some robot to control a factory when there aren't any human beings there, or are you trying to produce an entertainment? You might choose very different things. For instance, things that I would call very shallow, unrealistic architectures insofar as they are put forward as models of how human beings work. But they are excellent devices to put into software for entertainment or educational purposes. And in particular I think that there is a space of architectures where you have a box labeled ''emotion.'' And they in general are shallow compared to the sort of thing that Edmund was talking about. And I think we need to do serious modeling if we want to understand human beings. But shallow architectures may be very good for particular kinds of applications.
Rolls: Just to follow that up: If you had a problem where the events in the world, the stimuli, were not reliably correlated with outcomes, so you had to keep learning about them and relearning, then having an emotional system in there that could learn that arbitrary stimuli were associated with rewards or punishers would be useful. So, that's one sort of environment. And then you reduce your ''genes'' (the program) to determine what these primary reinforcers ought to be. In other words: The design principle could be a self-evolving machine, using something like genetic algorithms in a natural world, and the genes have two jobs in this. One is to identify the dimensions of our environment along which it's building primary reinforcers, and the other is to design the actual general architecture of the machine, so that, for example, it has, in general, good learning properties and can perform, say, arbitrary responses to get these decoded rewards, and so on. So, for a biologist, this would be the class of problems for which at least sort of biologically realistic emotion would be most naturally applied.
Sloman: Let's make just a slight additional point there. From a biological point of view, I think it's clear that genes put things into us, which are for the purposes of genes, not for us. If you think about, for instance, what risks are involved in having babies . . . so, the genes have to work very hard to make you want to have babies, not
for the benefit of the individuals having these babies, but because without that, the genes would soon die out.
Riecken: The gene is a very good agent.
Sloman: The gene is—with its implicit goals and the way it functions—very powerful.
Stern: Now you are anthropomorphizing!
Sloman: It's not anthropomorphizing. This is the way to describe things, if you think, as I do, about the biosphere as a sort of self-organizing system.
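Rolls's ''two jobs of the genes'' lend themselves to a small computational caricature. The following sketch is purely illustrative and not from the discussion itself: a ''genome'' fixes which raw stimulus dimensions count as primary reinforcers, a simple learner then associates arbitrary cues with those reinforcers, and a toy genetic algorithm selects genomes that do well in a given world. The names, parameters, and learning rule are all assumptions.

    import random

    # Illustrative toy only (an assumption, not Rolls's model): the "genome"
    # says how reinforcing each raw stimulus dimension is; the agent then
    # learns which arbitrary cues predict gene-defined reinforcement.

    N_DIMS = 4           # assumed number of raw stimulus dimensions
    LEARNING_RATE = 0.3

    class Agent:
        def __init__(self, genome):
            self.genome = genome      # per-dimension reinforcement weights
            self.assoc = {}           # learned cue -> expected reinforcement

        def primary_reinforcement(self, stimulus):
            # innate, gene-specified evaluation of a raw stimulus vector
            return sum(g * s for g, s in zip(self.genome, stimulus))

        def learn(self, cue, stimulus):
            # associate an arbitrary cue with the gene-defined reinforcer
            target = self.primary_reinforcement(stimulus)
            old = self.assoc.get(cue, 0.0)
            self.assoc[cue] = old + LEARNING_RATE * (target - old)

        def choose(self, cues):
            # act on learned (secondary) value, not on the raw stimulus itself
            return max(cues, key=lambda c: self.assoc.get(c, 0.0))

    def fitness(genome, world, trials=50):
        # world: list of (cue, stimulus_vector) pairs with unique cues
        agent, total = Agent(genome), 0.0
        lookup = dict(world)
        for _ in range(trials):
            cue, stimulus = random.choice(world)
            agent.learn(cue, stimulus)
            chosen = agent.choose(list(lookup))
            total += agent.primary_reinforcement(lookup[chosen])
        return total

    def evolve(world, pop_size=20, generations=30):
        pop = [[random.uniform(-1, 1) for _ in range(N_DIMS)] for _ in range(pop_size)]
        for _ in range(generations):
            pop.sort(key=lambda g: fitness(g, world), reverse=True)
            parents = pop[: pop_size // 2]
            children = [[w + random.gauss(0, 0.1) for w in random.choice(parents)]
                        for _ in range(pop_size - len(parents))]
            pop = parents + children
        return max(pop, key=lambda g: fitness(g, world))

Even at this toy scale, the division of labor Rolls describes is visible: evolution shapes what counts as rewarding, while lifetime learning handles the unreliable mapping between cues and outcomes.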
Notes
1. Some authors, such as Damasio, would talk of emotional systems also at this level, and even in animals lacking a nervous system (see Damasio 1999).
2. The use of the expression basic emotions is highly controversial in the psychology literature, as authors do not agree on the subset of emotions that can be considered as basic, nor in what sense they are so. See Ekman (1992) for a characterization of basic emotions in terms of their distinguishing features, and Ortony and Turner (1990) for a discussion of the meaning of this term.
3. This freezing reaction produced by strong fear can be maladaptive in some contexts, but from an evolutionary perspective it has an adaptive value. For example, an animal often has a better chance of avoiding a predator's attack if it does not move, as this can give the impression that it is an inanimate being. This reaction can, however, be maladaptive if it lasts too long, or in other types of environments and situations where immediate action would be more appropriate. Some emotional dysfunctionalities can also be seen as remnants of our evolutionary past—reactions that were functional in the past but are no longer adaptive.
4. This is not the case when reasoning about emotions. One can perfectly well imagine a computer program that reasons about emotions (e.g., that recognizes emotional states from textual descriptions of behavior, or that advises appropriate responses to the emotional states of people) independently of its behavior or body, and indeed several examples of such programs exist in the literature. I would not say, however, that such a program ''has'' emotions.
5. Concerns are defined by Frijda as ''ultimate goals,'' such as seeking protection, shelter, well-being, and so on, as opposed to more concrete goals guiding specific behaviors. We could thus say that concerns are high-level, abstract goals related to viability (in the sense of Ashby 1960) and survival.
6. This threshold—the relevance of the emotion—can be hand-coded by the designer or learned by the agent, depending on the modeling approach adopted.
7. D. McDermott, Artificial Intelligence Meets Natural Stupidity, in Mind Design, ed. John Haugeland (MIT Press, Cambridge, 1981).
References
Ashby, W. R. (1960): Design for a Brain: The Origin of Adaptive Behavior. 2nd ed. Chapman and Hall, London.
Braitenberg, V. (1984): Vehicles: Experiments in Synthetic Psychology. MIT Press, Cambridge.
Brooks, R. A. (1986): A Robust Layered Control System for a Mobile Robot. IEEE J. Robotics Automation RA-2 (April): 14–23.
Brooks, R. A. (1991): Intelligence without Representation. Artif. Intell. 47: 139–159.
Cañamero, L. D. (1997a): Modeling Motivations and Emotions as a Basis for Intelligent Behavior. In W. L. Johnson, ed., Proceedings of the First International Conference on Autonomous Agents, 148–155. ACM Press, New York.
Cañamero, L. D. (1997b): A Hormonal Model of Emotions for Behavior Control. VUB AI-Lab Memo 97-06, Vrije Universiteit Brussel, Belgium.
Cañamero, L. D., ed. (1998): Issues in the Design of Emotional Agents. In Emotional and Intelligent: The Tangled Knot of Cognition. Papers from the 1998 AAAI Fall Symposium, TR FS-98-03, 49–54. AAAI Press, Menlo Park, Calif.
Cañamero, L. D. (2001): Emotions and Adaptation in Autonomous Agents: A Design Perspective. Cybern. Syst. 32 (5): 507–529.
Carpenter, G. A., and Grossberg, S. (1988): The ART of Adaptive Pattern Recognition by a Self-Organizing Neural Network. Computer (March): 77–88.
Damasio, A. (1994): Descartes' Error: Emotion, Reason, and the Human Brain. Putnam, New York.
Damasio, A. (1999): The Feeling of What Happens: Body and Emotions in the Making of Consciousness. Harcourt, New York.
Donnart, J. Y., and Meyer, J. A. (1994): A Hierarchical Classifier System Implementing a Motivationally Autonomous Animat. In D. Cliff, P. Husbands, J. A. Meyer, and S. W. Wilson, eds., From Animals to Animats 3: Proceedings of the Third International Conference on Simulation of Adaptive Behavior. MIT Press/Bradford Books, Cambridge.
Ekman, P. (1992): An Argument for Basic Emotions. Cogn. Emotion 6 (3/4): 169–200.
Frijda, N. H. (1995): Emotions in Robots. In H. L. Roitblat and J. A. Meyer, eds., Comparative Approaches to Cognitive Science, 501–516. MIT Press, Cambridge.
Kandel, E. R., Schwartz, J. H., and Jessell, T. M. (1995): Essentials of Neural Science and Behavior. Appleton and Lange, Norwalk, Conn.
LeDoux, J. (1996): The Emotional Brain. Simon and Schuster, New York.
Maes, P. (1991): A Bottom-Up Mechanism for Behavior Selection in an Artificial Creature. In J. A. Meyer and S. W. Wilson, eds., From Animals to Animats: Proceedings of the First International Conference on Simulation of Adaptive Behavior, 238–246. MIT Press, Cambridge.
Mandler, G. (1985): Mind and Body. W. W. Norton, New York.
Oatley, K., and Johnson-Laird, P. (1987): Towards a Cognitive Theory of Emotions. Cogn. Emotion 1: 29–50.
Ortony, A., and Turner, T. J. (1990): What's Basic about Basic Emotions? Psychol. Rev. 97: 315–331.
Ortony, A., Clore, G. L., and Collins, A. (1988): The Cognitive Structure of Emotions. Cambridge University Press, New York.
Pfeifer, R. (1991): A Dynamic View of Emotion with an Application to the Classification of Emotional Disorders. AI Memo 91-8, Vrije Universiteit Brussel.
Pfeifer, R. (1993): Studying Emotions: Fungus Eaters. In S. Goss, G. Nicolis, H. Bersini, and R. Dagonnier, eds., Proceedings of the Second European Conference on Artificial Life, ECAL '93, 916–927. ULB, Brussels, Belgium.
Pfeifer, R., and Scheier, C. (1999): Understanding Intelligence. MIT Press, Cambridge.
Picard, R. W. (1997): Affective Computing. MIT Press, Cambridge.
Pribram, K. H. (1984): Emotion: A Neurobehavioral Analysis. In K. R. Scherer and P. Ekman, eds., Approaches to Emotion, 13–38. Lawrence Erlbaum Associates, Hillsdale, N.J.
Rolls, E. T. (1999): The Brain and Emotion. Oxford University Press, Oxford, London, New York.
Simon, H. A. (1967): Motivational and Emotional Controls of Cognition. Psychol. Rev. 74 (1): 29–39.
Sloman, A. (1992): Prolegomena to a Theory of Communication and Affect. In A. Ortony, J. Slack, and O. Stock, eds., AI and Cognitive Science Perspectives on Communication. Springer, Berlin, Heidelberg, New York.
Steels, L. (1994): Building Agents with Autonomous Behavior Systems. In L. Steels and R. Brooks, eds., The ''Artificial Life'' Route to ''Artificial Intelligence'': Building Situated Embodied Agents. Lawrence Erlbaum Associates, Hillsdale, N.J.
Tomkins, S. S. (1984): Affect Theory. In K. R. Scherer and P. Ekman, eds., Approaches to Emotion, 163–195. Lawrence Erlbaum Associates, Hillsdale, N.J.
Velásquez, J. D. (1996): Cathexis: A Computational Model for the Generation of Emotions and their Influence in the Behavior of Autonomous Agents. M.S. Thesis, MIT Media Laboratory.
Velásquez, J. D. (1998): Modeling Emotion-Based Decision-Making. In L. D. Cañamero, ed., Emotional and Intelligent: The Tangled Knot of Cognition. Papers from the 1998 AAAI Fall Symposium, TR FS-98-03, 164–169. AAAI Press, Menlo Park, Calif.
Wehrle, T., and Scherer, K. (1995): Potential Pitfalls in Computational Modeling of Appraisal Processes: A Reply to Chwelos and Oatley. Cogn. Emotion 9: 599–616.
Wilson, S. W. (1991): The Animat Path to AI. In J.-A. Meyer and S. W. Wilson, eds., From Animals to Animats: Proceedings of the First International Conference on Simulation of Adaptive Behavior. MIT Press, Cambridge.
5 Emotions: Meaningful Mappings Between the Individual and Its World
Kirstie L. Bellman
Abstract
This chapter starts by addressing why one might want to consider emotions in computer programs (and more generally, in constructed complex systems). Drawing upon psychological and biological research, the chapter discusses how considerations of emotions quickly lead to the necessary consideration of self-awareness and consciousness. There is, in the wider intellectual community, a profusion of badly entangled concepts with rich and problematic terminology. Although this chapter is full of its own entanglements, the author hopes to contribute to the discussion by exploring some of these terms and speculating on what functional roles there could be for such properties in biological and artificial systems. Most of all, the chapter suggests a new type of test bed for empirically and experimentally trying out some of our ideas and distinctions about the role of emotional reasoning and emotional selves. Hence in the second half of the chapter, the author offers the idea that virtual worlds could be test beds in which we can test both our abstract notions about emotional selves and make observations and comparisons of the behaviors of both human and artificial beings within well-specified environments. Virtual worlds are engineered systems that can serve both as models of abstract systems and their functions and as experimental laboratories. The relationship in science between observing phenomenology and building models has always been a mutually beneficial one. Part of the problem in this area is that it crosses, by necessity, so many research disciplines. Virtual worlds also allow us an environment in which we can bring together and apply a diversity of tools, methodologies, and expertise. Constructed systems might actually prove to be a boon for studying issues such as emotions and self, because one can hypothesize concrete functions, roles within a system, and mechanisms for them, and see what is produced. Eventually, we want to understand what types of properties characterize different kinds
of emotional capabilities. We want to understand the implications of these different capabilities and how to make trade-offs among the extent of these capabilities for different types of constructed organisms, tasks, and operational environments.
5.1 Introduction: Why Consider Emotions in Constructed Systems?
All such terms as Intuition, Intellect, Emotion, Freedom, Logic, Immediacy, are already famous for their power to confuse and frustrate discussion.
—Ogden and Richards, The Meaning of Meaning
Despite an increasing number of leading scientists (Damasio 1994; Sacks 1995; Rolls 1999) concerned with emotions, the attitude toward discussions of emotion has not changed much over the last seventy-five years of scientific work. Furthermore, whereas it is clear that there are a lot of reasons for considering emotions in biological systems, why would we want to discuss emotions in computer programs? After all, emotions clearly exist in biological systems, where they are, at a minimum, phenomena that profoundly affect every aspect of our lives—from our social interactions and mental health to being ''intimately intermingled'' with cognition (Lindsay and Norman 1972, p. 637). Why shouldn't we leave the topic alone as a human—or at least animal—phenomenon, to be discussed by psychologists and physiologists but certainly not engineers and technologists? In this chapter, the author provides two answers. The first is an opinion that emotions in biological systems are not epiphenomena, but rather that they allow the animals who possess them to perform better than organisms that do not have them. This is not a dogmatic attitude so much as a heuristic one; it allows us to explore what types of roles and advantages emotions might have for an organism, as opposed to ending the conversation on emotions by simply enumerating the physiological correlates of emotion. Second, given as a starting point that emotions are indeed critical to biological systems, it must then be asked, ''Are emotions also necessary to enabling certain types of intellectual capabilities in constructed systems?'' This paper presents the author's reasons for believing that some type of analogues to emotional capabilities are necessary for autonomous constructed systems, such as ''agents'' or
robots, if we want intelligent, independent behavior in real-world environments. As a start, what is the usefulness that we can expect from having emotions in computer programs, robots, or artificial agents? Do we expect emotions to motivate the agent correctly? That is, to care correctly (about the things we care about)? To save one's own neck? Or do we expect emotions in agents to help with social interactions? That is, to result in altruistic, helpful, or cooperative behaviors? To motivate the agents to socialize or interact? Or are we attempting to gain for agents some different style of information processing? Some style perhaps that goes beyond the current ideas of logical or rationalistic cognitive styles? At a minimum, a better understanding of emotions, especially as made recognizable by the computer, could result in better man-machine interfaces and in machines able to respond more appropriately to the human user because they are able to perceive the human's emotions. This is a point made well by Picard (1997, and chapter 7 of this volume). However, we cannot begin to explore all the reasons for incorporating emotions into artificial agents until we better understand the role we think emotions play in animal cognition and in our own. Emotions in humans have been thought of as ''epiphenomena'' (e.g., an inconsequential side effect), as a primitive part of the brain's processing, or as a core quality of human processing, profoundly affecting but separable from cognition. The historical division between clinical psychology and cognitive psychology is representative of this last idea. There are long-standing traditions in literature, then reflected into psychological writings, of the idea that emotions undermine logical (i.e., good) thinking. Linguists have struggled for some time with the impact of emotion on what otherwise would be nicely expressed and logical descriptions of language. Ogden and Richards, in their early work, point out linguistic problems that are still unresolved: ''This ideal supposes for its realization that the language is fixed like an algebra, where a formula once established remains without change in all the operations in which it is used. But phrases are not algebraic formulae. Affectivity always envelops and colors the logical expression of the thought'' (1923, p. 152). Literary traditions, again reflected in later scientific writings, view emotional thinking as primitive (the ''beast'' within all men), and that part of the role of civilization is to control emotions so
that humans can go beyond their bestial ways and work together. Platonic ideals from Greek philosophy and the concepts of Freud's superego and id spring to mind. David Moffat (1999) gave a very nice example of this in Aristotle's dislike of emotions, which apparently he likened to a too eager servant who rushes off to do your bidding. The servant is fast but stupid and the results are undesirable. From this viewpoint, only rationality makes us better than the animals. But as always, human thought and societies are not logically consistent in their beliefs. Just as emotion is looked on as bestial, Western cultures also retain in their literature and everyday stories the importance of emotion (e.g., ''If I speak in the tongues of men and of angels, but have not love, I am a noisy gong or a clanging cymbal''). Cultures also emphasize its criticality to performing everyday jobs or especially its role in times of crisis. Hence there are the English phrases, such as ''go with your gut instincts'' or ''trust your instincts.'' Almost any doctor, policeman, soldier, or teacher will speak about the need to trust one's intuitions and hunches, one's ''feeling about a situation'' at certain points in their work. Many of these situations have to do with how to treat and react to other emotional beings, and could then be dismissed as using one's emotions to understand others' emotions. But in fact, there are many other reported moments having to do with physical objects: for example, evaluating the likelihood of an avalanche, or the spread of a fire, or the timing and severity of a storm. Or, and this brings it home to us, the intuitions of a scientist on the possible meanings of certain results. Part of the difficulty with thinking about emotions is that they are often associated, for very good reasons, with consciousness and with concepts of the self. Clearly, to make progress in this area, we need to start somehow teasing apart this tightly knit cluster of concepts: self-awareness, meaning, emotion, intuition, and mind. The approach taken in this chapter is to bring out some of these related concepts, explore them briefly, and then focus on an approach for making progress in this field by setting up virtual world test beds, in which we can test both our abstract notions about emotional selves and make observations on the interactions of both human and artificial beings within well-specified environments. Constructed systems might actually prove to be a boon for studying issues such as self and emotions, because one can
hypothesize concrete functions, explicit roles within a given system, and mechanisms for them, and see what is produced. Further, by providing environments that allow observations of both human and artificial beings, one can compare the real differences between them and the superior richness of biological ‘‘implementations.’’ Ironically, it may not in the end be as true that constructed systems need these concepts as it is that we need constructed systems to help refine our concepts. Virtual worlds can also give us a less emotionally charged common ground for studying emotions than just verbal debate within our scientific communities and ourselves.
5.2 Hints from Biological Systems on the Possible Roles of Emotion
As excellent recent work and summaries of the research have shown, emotional systems are old (Damasio 1994; McGinn 1999; Rolls 1999), a fact that argues for the fundamental roles of emotions in the viability of an adaptive living system. There is an enormous number of good and not so good social science studies on emotion: characterizing it; defining its roles and its impacts on perception, cognition, memory, language, and action; searching for cultural impacts; and so forth (Mandler 1984; Lindsay and Norman 1972). One of the most heated topics in this early research was on the differences between the emotions of animals and man. There are both everyday anecdotes and scientific evidence to support the common idea that animals share our emotions. Traditionally, that is one of the reasons we fear emotions. Although there has been a great deal of controversy over emotions in animals (many scientists dismiss any such comparisons as anthropomorphic thinking on the part of the researchers), there has been growing evidence both from ethological and neurophysiological research. First, we know that the parts of the brain, including the neurotransmitters that underlie at least the bulk of human emotions, are very old. Griffin (1984) argued that, at a certain point, one has to go through so many contortions to explain away the similarities between the emotions of animals and men that, in fact, Occam's Razor argues for the simpler truth: If one thinks one sees in animals even the nuances of emotions (envy, shame, pride, mischief), one is likely to be correct. Generalizing emotional experiences to animals forces us to consider many types of fundamental roles for emotions across a wide range of biological and adaptive systems. In the following sections, the author discusses several
possible roles for emotions: as reinforcement, arousal, and motivation; as prioritizing options and making choices; and last, as integrative processes of several types.
Emotions as Arousal and Motivation
Lindsay and Norman departed from many psychologists long ago by emphasizing the importance of cognition and emotion for each other (1972). However, as many psychologists and neuroscientists have done and still do, Lindsay and Norman emphasized the roles of emotions as instrumental to arousal of, motivation for, and evaluation by the organism. ''Words like expectations, uncertainty, disruption, discrepancy, dissonance, and conflict are key words in the experimental analysis of human emotion and motivation. In many types of motivational situations, the organism acts as if something were monitoring the ongoing cognitive processes, watching for potential trouble spots in dealing with the environment, and signaling when difficulties arise . . . when something is encountered that is new or discrepant from what is expected or potentially threatening, it acts like an interrupt mechanism, alerting the organism to the potential problem and mobilizing resources to deal with it. The result is a change in the level of arousal or activation'' (Lindsay and Norman 1972, p. 611). Later they present a high-level model that outlines some of the general features of this viewpoint: There are cognitive processes that create predictions of the world aided by memory processes. These ''predictions'' are compared with feedback from the system and its environment as the result of the system's actions. This leads, in the case of discrepancies between expectations and reality, to parallel chemical (hormonal) and neural activation. The resulting activity arouses, motivates, and activates the organism to act, to learn, and to remember. And in the case of humans, it is related to the experience of emotion. Rolls summarizes and has enhanced our understanding of the neurophysiological basis for emotions and their tie into attentional, motivational, and higher cognitive processes (1999 and chapter 2 of this volume). Unlike other areas in psychology, emotions have always required researchers to look at both the psychological and physiological correlates. As early as 1890, the great psychologist William James said, ''If we fancy some strong emotion and then try to abstract from our consciousness of it all the feelings of its bodily
symptoms, we find we have nothing left behind, no 'mind-stuff' out of which the emotion can be constituted, and that a cold and neutral state of intellectual perception is all that remains . . . can one fancy the stage of rage and picture no ebullition in the chest, no flushing of the face, no dilatation of the nostrils, no clenching of the teeth, no impulse to vigorous action?'' (quoted in Damasio 1994, p. 129). This ''embodied cognition'' (Varela, Thompson, and Rosch 1991, p. 172), this thinking and feeling state, has always been one of the most challenging aspects of thinking about emotion. Damasio, in his exciting book (1994), describes ''the essence of emotion as the collection of changes in body state that are induced in myriad organs by nerve cell terminals, under the control of a dedicated brain system, which is responding to the content of thoughts relative to a particular entity or event. Many of the changes in body state . . . are actually perceptible to an external observer. . . . Emotion is the combination of a mental evaluative process, simple or complex, with dispositional responses to that process, mostly toward the body proper, resulting in an emotional body state, but also toward the brain itself . . . resulting in additional mental changes'' (p. 139). Damasio, in a thoughtfully constructed argument, builds upon this base to gradually include higher level processes, ''the latter cortices in particular receive an account of what is happening in your body, moment by moment, which means that they get a 'view' of the ever-changing landscape of your body during an emotion. . . . There is nothing static about it, no baseline, no little man—the homunculus—sitting in the brain's penthouse like a statue, receiving signals from the corresponding part of the body. Instead there is change, ceaseless change. . . . The current body representations do not occur within a rigid cortical map as decades of human brain diagrams have insidiously suggested. They occur as a dynamic newly instantiated, 'on-line' representation of what is happening in the body now. . . . The process of continuous monitoring, that experience of what your body is doing while thoughts about specific contents roll by, is the essence of what I call a feeling'' (1994, pp. 144–145). Initially, this set of mechanisms leads only to a very concrete type of cognition. Essentially, feedback from the periphery produces emotional decisions by integrating all the complex factors that cannot be resolved by the rational system. The body does the
integration because it is too complex for the mind to decide. ‘‘The squirrel in my backyard, that runs up a tree to take cover from the neighbor’s adventurous black cat, has not reasoned much to decide on his action. He did not really think about his various options and calculate the costs and benefits of each. He saw the cat, was jolted by a body state and he ran’’ (1994, p. 189).
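The Lindsay and Norman ''interrupt'' view quoted earlier in this section is easy to caricature computationally. The sketch below is only a toy reading of it, not their model: an agent maintains expectations, compares them with feedback, and raises an arousal signal when the discrepancy is large. The class name, thresholds, and update rule are illustrative assumptions.

    # Toy reading of the expectation-discrepancy view (an assumption, not
    # Lindsay and Norman's formulation): a monitor compares predicted and
    # actual outcomes and raises arousal when expectations are violated.

    class DiscrepancyMonitor:
        def __init__(self, threshold=0.5, decay=0.9, update_rate=0.5):
            self.threshold = threshold      # how big a surprise triggers an "interrupt"
            self.decay = decay              # arousal fades if nothing surprising happens
            self.update_rate = update_rate  # how fast expectations track reality
            self.arousal = 0.0
            self.expected = {}              # event -> predicted outcome value

        def observe(self, event, outcome):
            prediction = self.expected.get(event, 0.0)
            discrepancy = abs(outcome - prediction)
            self.arousal *= self.decay
            if discrepancy > self.threshold:
                # the "interrupt": mobilize resources in proportion to the surprise
                self.arousal += discrepancy
            # revise the expectation toward what actually happened
            self.expected[event] = prediction + self.update_rate * (outcome - prediction)
            return self.arousal

    monitor = DiscrepancyMonitor()
    monitor.observe("door_opens", outcome=0.0)          # expected: arousal stays near zero
    print(monitor.observe("door_opens", outcome=3.0))   # surprising: arousal jumps to 3.0

Nothing here is specific to emotion, of course; the point is only that ''discrepancy leads to activation'' is a mechanism simple enough to state, while the parallel hormonal and neural effects the authors describe are exactly what such a sketch leaves out.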
Emotions as Prioritization, Choice, and Discernment
In fact, what is mathematical creation? It does not consist in making new combinations with mathematical entities already known. Anyone could do that, but the combinations so made would be infinite in number and most of them absolutely without interest. To create consists precisely in not making useless combinations and in making those which are useful and which are only a small minority. Invention is discernment, choice.
—Poincaré
Damasio argues that eventually in higher animals, especially humans, we gain use of the lower emotional mechanisms to support new cognitive capabilities. He argues that evolution is ''thrifty,'' so the brain structures required to support new strategies ''retain a functional link to their forerunners.'' He sees ''their purpose is the same, survival, and the parameters that control their operation and measure their success are also the same: well-being, absence of pain'' (1994, p. 190). Hence the system that developed to produce ''somatic markers'' for personal and social decision making could help in other decision making. These somatic markers would not necessarily be perceived as feelings, but could act ''covertly to highlight, in the form of an attentional mechanism, certain components over others, and to control, in effect, the go, stop, and turn signals necessary for some aspects of decision making and planning in nonpersonal, nonsocial domains'' (1994, p. 190). Hence, according to Damasio's somatic marker theory, emotions drastically reduce decision options, because the emotional associations give immediate feedback on different outcomes. We want to add that because this source of immediate feedback is internal, it is reasonable to imagine that animals capable of doing so would have advantages (in the risks incurred and the waste of time and energy) over those animals limited to simple reinforcement and trial-and-error learning capabilities.
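Read computationally, the somatic-marker idea amounts to pruning the option space with learned valences before any deliberate evaluation begins. The fragment below is a hypothetical illustration of that reading only, not Damasio's theory; the veto threshold, the update rule, and the example options are all assumptions.

    # Hypothetical illustration of marker-based pruning (not Damasio's model):
    # each option carries a valence learned from past outcomes, and strongly
    # negative options are dropped before any costly deliberation.

    def prune_by_markers(options, markers, veto=-0.5):
        """Keep only options whose stored marker is above the veto threshold."""
        return [o for o in options if markers.get(o, 0.0) > veto]

    def update_marker(markers, option, felt_outcome, rate=0.2):
        """Shift an option's marker toward the felt outcome of having chosen it."""
        old = markers.get(option, 0.0)
        markers[option] = old + rate * (felt_outcome - old)

    markers = {"approach_cat": -0.9, "climb_tree": 0.6, "freeze": 0.1}
    candidates = ["approach_cat", "climb_tree", "freeze", "explore"]
    print(prune_by_markers(candidates, markers))
    # ['climb_tree', 'freeze', 'explore'] -- the negatively marked option never
    # reaches deliberate cost-benefit evaluation at all.

The cheapness of the veto is the whole point: the remaining, much smaller set is what any slower, more ''rational'' evaluation would then have to work through.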
Next to being considered as motivation, emotions are most often discussed as belonging to or underlying some sort of evaluative process that weighs choices, prioritizes options, or (as Damasio says) gives ''cognitive guidance'' (1994, p. 130) to the organism. As shown in the quote by Poincaré (1908, quoted in Damasio 1994, p. 188), many have emphasized for some time now that our ability to reason cannot be in fact some enumeration of possibilities, which are then considered in some complete and methodical fashion. Our most rational reasoning turns out to be fragile, littered with intuitions and insights, prejudices and biases. Certain schools of artificial intelligence notwithstanding, we want computers mostly because they, in fact, don't think like us (Norman 1993). Computers can algorithmically enumerate a solution space and then evaluate the solutions in a way that we find as foreign as we find it enviable. The bottom line is that we process information ''with an attitude,'' and many believe that that attitude is strongly based on emotions and remembered experiences. Lola Cañamero (1998, and chapter 4 of this volume) describes her work in designing emotions for activity selection in autonomous agents. She points out that it is a dangerous and dynamical world, where biological or artificial agents need to rapidly choose among multiple goals and what to do. Emotions are mechanisms for adaptation that motivate and guide behavior, including social interactions and communication. Emotions are a critical part of the ''preference mechanisms'' for planning actions and responding to the environment. To summarize so far, this chapter assumes the following points. Emotions are old in biological systems. They not only alter cognition, but also in fact probably underlie it in many crucial ways. They certainly underlie social interactions. It is also quite clear that for Western man, emotions have always been both cherished and feared. Especially within our Western scientific culture, emotions ''cloud one's judgment.'' Although they are certainly seen as a powerful motivator (scientists ''love'' their science) and a powerful, intuitive, creative force (perhaps the ''ahas'' of scientific discovery?), emotion, in general, is seen as unreliable or untrustworthy. Hence every scientist is taught both to ''go with their intuitions'' and then to discipline such intuitions with a careful and methodical scientific process. What are the qualities of emotional reasoning that justify our attitudes toward it? It won't come when it is called (the elusive
''muses of creativity'') nor is it as controllable as many would like. It is distracting and riveting to the feeler. Whereas a certain type of cognition can feel very external—comfortably thinking while writing symbols on a page—emotional thinking is very much internal and first person. It is you and you are feeling, embodying whatever the emotions are. In a real sense, emotions mean something to the organism experiencing them. They are embodied (painfully so and wonderfully so). They elate, they terrify, and they energize. They are always in a context and they are always situated to a person, both in the person's memories of experiences and their current circumstances. Emotions also seem to be highly associative. That is, they bring in unexpected experiences and associations. Nearly everyone has had the experience, sometimes disconcerting, of smelling something, hearing something, or seeing something that evokes both a memory and an emotional response. Emotions seem to be powerful ways of tying together or ''mapping'' current events to their effects on us, current events with memories, and clusters of sensory and motor events with our thoughts about those events. In the following section, we now want to emphasize this role of emotions.
Emotions as Mapping Functions, Meaningful Integration, and Self
First, we considered how emotions could motivate an organism. Motivation here is used as an operational explanation of why it will do what it does. But then, as we saw, it is not enough to get the animal to do something. It must somehow plan what it should pay attention to and how to perform its activities, and there we saw again a role for emotions in helping to weigh our percepts, cognitions, and actions, and to provide us fundamental mechanisms for making decisions quickly. But, as is quite clearly implicit in many of the ideas presented thus far, emotions also play an integrative role. They are widespread and diffuse in their bodily and cognitive effects and hence they ''rally'' and mobilize our resources (Lindsay and Norman 1972; Damasio 1994; Rolls 1999). Insofar as different emotions mobilize clusters of associated effects, they also structure and coordinate the organism's cognitive and emotional responses to different internal and external situations. We believe in fact that emotional thinking is one of the powerful ways in which biological systems map the consequences of their
behaviors (the outcomes of both their thoughts and physical actions) to the consequences within certain environmental settings (the feedback coming back to their system). The reason for using the word map is that we consider ''association'' to be within the mind of a perceiver. By map, we want to widen our concepts of cognitive associative processes to include other processes (both internal regulatory and physiological mechanisms and physics external to the organism) that constrain and shape the organism's responses to the requirements of the external world's dynamics. This then allows us to consider both mechanisms under the control of the organism and embodied by the organism, and those external to it. Clearly, in emphasizing this notion of mapping, we are drawing on a great deal of important work in second-order cybernetics, qualitative dynamics in mathematics, and catastrophe theory as applied to biological systems. In recent years, two elegant theorists, Maturana and Varela, have drawn upon this work and greatly expanded it in their concepts of autopoiesis and self-organizing systems (1980, 1987). In their theories, a living system's structure is the physical embodiment of its organization. As described in Capra's highly provocative book, ''the product of its operation is itself . . . and that operation includes the manufacturing of boundaries or scopes of action'' (1996, p. 174). The role of cognition is very different to these dynamicists; ''the brain is not necessary for mind to exist. A bacterium, or a plant, has no brain but has a mind. The simplest organisms are capable of perception and thus of cognition. They do not see, but they nevertheless perceive changes in their environment—differences between light and shadow, hot and cold, higher and lower concentrations of some chemical, and the like. The new concept of cognition, the process of knowing, is thus much broader than thinking. It involves perception, emotion, and action—the entire process of life. . . . Descartes' characterization of mind as the 'thinking thing' (res cogitans) is finally abandoned. Mind is not a thing but a process—the process of cognition, which is identified with the process of life. The brain is a specific structure through which this process operates. The relationship between mind and brain therefore is one between process and structure'' (Capra 1996, p. 174). We find this viewpoint important and admirable in its emphasis on dynamics, on self-organization, and on nonvitalistic explanations of very difficult qualities in the biological organism. Yet, in the end,
there is an aggravating feeling that emotions (and concepts of self, as we shall see in a moment) are relegated once again to being epiphenomena—though perhaps not as inconsequential or uninteresting as in some behaviorists' view. Emotions are considered; they are part of the ''structural changes'' but it is not clear what impacts they have on the system at large. To the author, the above work is part of a long-standing ''creative tension'' in several related fields in the cognitive and neurosciences. How can we avoid the errors of rigid information processing and cognitive models, where the barely disguised homunculus symbolically processes overly static memory stores with reasoning processes strangely divorced from any bodily, cultural, and environmental context? Yet at the same time, how do we avoid overly general dynamical models that leave no room for the fact that we invent and use symbol systems, that we have will and intention, that we have very strongly a sense of self, and a mind that thinks about itself? In our work, we are trying to incorporate what we consider to be crucial features from both camps. Hence we believe that emotions do typify in many ways the same sort of qualities as many other dynamical processes (one of many mapping processes as defined above). However, we also present a view here that emotions are essential to a basic type of reasoning from the first person, from an ''I'' and about what is happening to that ''I.'' Reasoning here does not refer exclusively to the familiar deductive logical reasoning, but to a kind of generative processing based upon the input sensations and internal conditions. Whereas there are occasions for thinking about the world—observing it, exploring it—emotional thinking is very much a type of reasoning that is crucially self-centered. When one starts discussing emotion one is starting to discuss having a self—a perceived and felt self. Emotions are in terms of and help define that ''self.'' The purpose, we propose, of this self is to integrate experiences in a meaningful way into a self. Specifically, a self is a continuously maintained and global construction that speaks for the organism's reasoning and assessments at a global level.
Discussion
Elliott: Emotional reasoning also gives you the capacity to use a symbolic form or possibly a nonsymbolic representation of that
local viewpoint on the world and a personal viewpoint on the world to allow some representation of that local viewpoint for others and to get the social aspect.
Bellman: Yes. I suggest that empathy clearly is something that is advantageous to groups of animals that cluster together in small communities. It includes the cognitive capabilities, the learning capabilities, the socialization, all these different levels of processing that allow me to somehow understand something about your emotional responses. It's clearly a very good thing for a small tribe, for a group of animals or even for the whole species, for example, to understand that my peer is angry. And this is something that I can even understand across species.
Sloman: It's worth noting that what you said about emotions—that they are personal and local—also applies to some extent to all sensory perception, and especially to vision. Vision is exquisitely viewpoint-centered.
Bellman: Exactly. With a friend, I had a big debate about focus of attention versus personal view. But what I am talking here about, this personal viewpoint, is not the same thing as what we are focusing on at any given moment. I can change what I am focusing on, but I still also am aware that I have a personal viewpoint.
Elliott: You can focus on how I might be perceiving the world, maintaining your own personal viewpoint . . . thinking about me at the same time.
Cañamero: I have a comment on the self-centered nature of emotion. Adopting that assumption seems natural to us, but one can also think that it is the other way round. In Western cultures, we classify emotions around the individual's point of view. However, in other cultures, emotions are classified, for example, in terms of situations or social interactions rather than in terms of internal states and how events affect the individual. For some cultures, emotions are basically social: First they will help to grasp, conceptualize, and assess interactions among people and then people internalize that. So, instead of going from the inside to the outside, it's the other way around.
Bellman: That fits in very much with a lot of things I like to think about in terms of the social basis of knowledge, and meaning, and semantics. I think it's a very important point. Nonetheless, I am proposing that the direction of emotional reasoning is from the organism into the world.
Sloman: Maybe what you are expressing is a Western romantic view of emotion, and the other one comes from another culture?
Bellman: I understand, and that's why I am very careful to call it the Western scientific view.
We are composed of parts. In fact, if we look at the brain, we see many distinct architectures and ways of doing business—somehow integrated into a whole. Sometimes this is spoken of as having dozens of brains or different kinds of intelligences, and so forth. This is a fact recognized and struggled with in dozens of ways throughout our intellectual history: Minsky's Society of Mind (1985); Braitenberg's Vehicles (1984); Lewis Thomas's musings on all the multiple organisms we really are (1974); Candace Pert's immunological selves (Pert et al. 1985); Carl Sagan's triune brain (1977); and our own work (Bellman and Walter 1984), to name just a few. In our view, emotions not only get our attention and bring things into consciousness and the full force of attention, reasoning, and activities at a global level (Lindsay and Norman 1972), but in fact also act to create that global level in the first place. How does an organism work as an integrated whole? When it reasons, how does it reason about so many and speak and act for so many? What does it mean to have ''many in one?'' We propose here that emotions are a critical part of constructing a self, and that that self is a key strategy for integrating the many into one. We start the discussion with our experience of self.
5.3 The Experience of Self
Any science that deals with living organisms must needs cover the phenomenon of consciousness, because consciousness, too, is part of reality.
—Niels Bohr
Science is a grand thing when you can get it; in its real sense one of the grandest words in the world. But what do these men mean, nine times out of ten, when they use it nowadays? . . . They mean getting outside a man and studying him as if he were a gigantic insect; in what they would call a dry impartial light; . . . I don't deny the dry light may sometimes do good; though in one sense it's the very reverse of science. So far from being knowledge, it's
actually suppression of what we know. It's treating a friend as a stranger, and pretending something familiar is really remote and mysterious.
—Father Brown character, G. K. Chesterton
We see from the previous section that emotions have led us back to the necessary discussion of a self. Anyone reading this chapter knows something about the experience of self. That is part of the problem. In terms of the quote above, it is not only a friend, but also a beloved friend. How then do we examine it without diminishing our experience? For this reason and others, self has long been a difficult topic for scientists and philosophers. Is it epiphenomenon or essential? What are its properties and roles within biological systems? How conscious are we? Is it present in animals? In machines? As Sacks puts it so elegantly, ''that the brain is minutely differentiated is clear: There are hundreds of tiny areas crucial for every aspect of perception and behavior. . . . The miracle is how they all cooperate, are integrated together in the creation of a self. This, indeed, is the problem, the ultimate question, in neuroscience—and it cannot be answered, even in principle, without a global theory of brain function'' (1995, p. xvii). Damasio gives a fascinating set of case studies in support of what he calls the neural self (1994). He describes several patients with anosognosia (no longer any sense of feeling their bodies), who no longer speak from the perspective of having an I. He continues, ''I am in no way suggesting that all the contents of our minds are inspected by a single central knower and owner, and even less that such an entity would reside in a single brain place. I am saying though that our experiences tend to have a consistent perspective, as if there were indeed an owner and knower for most, though not all, contents. I imagine this perspective to be rooted in a relatively stable, endlessly repeated biological state'' (p. 238). Building on his somatic marker theory, he sees the neural substrate for self as being a third type of image generator. ''Finally consider all the ingredients I have described above—an object that is being represented, an organism responding to the object that is being represented, and a state of the self in the process of changing because of the organism's response to the object—are held simultaneously in working memory and attended, side-by-side or in rapid interpolation, in early sensory cortices. . . . Subjectivity emerges . . . in a third kind of image, that of an organism in the act of perceiving and responding to an object'' (1994, p. 242).
Self-Perception, Self-Monitoring, and Self-Reflection
Like others, we (Landauer and Bellman 1999a, b) have argued that self-perception and self-monitoring are critical features for goal-oriented, autonomous systems in order for them to move around their environments. In other words, one can imagine designing an organism or a robot with bumper-car feedback (it hits a wall and it stops or turns). In many ways, that can suffice for certain types of simple activities in very constrained environments—like a room with four rectangular walls and a hard floor. But even in elementary creatures, such as crabs, lizards, and crayfish, we see much more sophisticated adaptive mechanisms (von Uexküll 1934; Bellman and Walter 1984). For example, after hitting themselves on the side of a barrier, lizards can race around an enclosure's rocks and logs—they have somehow learned something processable about how their size fits into that world. As Churchland said (1984, p. 74), ''self-consciousness on this view is just a species of perception . . . self-consciousness is thus no more (and no less) mysterious than perception generally.'' He goes on to emphasize the considerable variety of ''self-monitoring'' (p. 185) that occurs at different levels. Recognition and perception of what is ''oneself'' and what is not oneself are difficult processes, but we readily can identify their occurrence in a number of biological systems, from single cells in immune systems (Pert et al. 1985) to mammals. It also does not feel like a stretch to imagine mechanisms that could make those perceptions available to higher level and more cognitive systems. We discuss the processing of such information next. Self-monitoring capabilities are not the same as ''self-reflection.'' For example, the means to monitor internal state and respond to that internal state are available to a thermostat. Ironically, although we have indisputable evidence for self-reflection in humans, our most concrete definitions of self-reflection capabilities come from the world of computer programs. Pattie Maes (1987) defines reflection as ''the process of reasoning about and/or acting upon oneself.'' Practically speaking, in computers, computational reflection means having machine-interpretable descriptions of the machine's resources. We have found in our approach (Landauer and Bellman 1996) that it is extremely useful to have not only state information available but also general metaknowledge about the limitations and required context information for all the resources. There are then processes that can act on this explicit knowledge about capabilities and state in order to better control the system in its
performance and maintenance (Landauer and Bellman 1996, 1997, 2000a; see also Brian Smith 1986 for an interesting discussion of some of the problems in representing and reasoning about self-reflective knowledge). It is clear from Damasio's discussions that he is thinking of his ''third type of image'' as being available for both self-monitoring and self-reflection in a sense compatible with the ideas described here. McGinn also emphasizes cognitive reflection but seems to separate it from emotions and the sensate conscious self (1999).
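To make the notion of ''machine-interpretable descriptions of the machine's resources'' slightly more concrete, here is a deliberately simplified, hypothetical sketch: each resource carries metaknowledge about the context it requires and its stated limits, and a reflective layer can report on and consult those descriptions before using the resource. The class names, fields, and checks are assumptions for illustration, not the Landauer and Bellman wrapping architecture itself.

    # Hypothetical sketch of computational reflection in the sense quoted above:
    # resources carry machine-interpretable metaknowledge (required context,
    # stated limits), and a reflective layer reasons about them before acting.
    # Illustrative only; not the Landauer-Bellman architecture.

    from dataclasses import dataclass, field
    from typing import Callable, Dict, List

    @dataclass
    class Resource:
        name: str
        action: Callable[[Dict], str]
        requires: List[str]                                     # context keys the resource needs
        limits: Dict[str, float] = field(default_factory=dict)  # known limitations

    class ReflectiveController:
        def __init__(self):
            self.resources: Dict[str, Resource] = {}
            self.state = {"battery": 1.0}        # simple self-monitored internal state

        def register(self, resource: Resource) -> None:
            self.resources[resource.name] = resource

        def describe(self, name: str) -> Dict:
            # reasoning *about* a capability rather than simply running it
            r = self.resources[name]
            return {"requires": r.requires, "limits": r.limits}

        def invoke(self, name: str, context: Dict) -> str:
            r = self.resources[name]
            missing = [k for k in r.requires if k not in context]
            if missing:
                return f"refused: missing context {missing}"
            if self.state["battery"] < r.limits.get("min_battery", 0.0):
                return "refused: battery below this resource's stated minimum"
            return r.action(context)

    ctrl = ReflectiveController()
    ctrl.register(Resource("planner", lambda ctx: f"plan to reach {ctx['goal']}",
                           requires=["goal", "map"], limits={"min_battery": 0.2}))
    print(ctrl.describe("planner"))                      # the metaknowledge itself
    print(ctrl.invoke("planner", {"goal": "charger"}))   # refused: the 'map' context is missing

The only point of the sketch is the separation it makes visible: the thermostat-level state check is self-monitoring, whereas acting on explicit descriptions of one's own resources is the beginning of self-reflection in Maes's sense.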
Emotional Meanings and Making Sense of the World
We have emphasized self as an integration concept—that ''global construct,'' as a set of self-monitoring and self-perception mechanisms, and as a set of self-reflection capabilities. Now we approach perhaps the most difficult topic yet. We believe that having a self is critical to having ''meanings'' and that here especially is where emotions both support and reflect a self. In his always intriguing and informative book (1995), Sacks draws upon Temple Grandin's self-reports of Asperger's syndrome (high-functioning autism) and his own experience with autism to discuss the ties between neurological difficulties with emotions and concepts of self. He and Grandin note that she is very rigid in the way she recites directions, a conversation, or an experience. Responding to Temple's comment that ''My mind is like a CD-ROM in a computer—like a quick access videotape. But once I get there, I have to play that whole part,'' he cites a fascinating speculation by Bruner. Apparently, Bruner believes that some of his autistic patients may in fact lack the ability to integrate their perceptual experiences ''with higher integrative ones and with concepts of self, so that relatively unprocessed, uninterpreted, unrevised images persist'' (1995, p. 282).
Discussion
Picard: I have one more comment on autistics: We don’t know how good their personal view is within themselves. But we know that they have a really hard and almost impossible time coming up with what other people’s personal view is. Most of us understand what our personal view is and can project our self-tied notions to
other people. And this notion of personal view is what other people have, too. And yet one thing autistics do have is that they have a tremendous focus of attention, so much that it's hard to get their attention and to divert it to something else. So, there is a real distinction between those two.

Bellman: I worked with autistics a lot. That's actually one of the few groups I could actually speak about out of my own knowledge about them. The experience, clinically, was actually a little bit different. A lot of autism in children would be better characterized by their very dispassionately seeing your emotion, and not caring about it at all. But they could in fact easily identify other persons' emotional states; for example, they could perfectly well recognize ''Kirstie is mad now.''

Picard: Yes. Actually, I am saying something different. What I wanted to point out was that if you start focusing them on some task it may be hard to make them think about something else. And, hard for them to predict what situation would give rise to anger. You know that autistic kids can learn and make an effort to do the socially right thing. An autistic person sees you get angry. He recognizes anger, but he does not understand why. He does not know how to predict that and how to understand the situation that gives rise to an emotion. He does not seem to be able to model what would go on inside another person.

This admittedly speculative thought helps support our argument here: Not only is self a critical global construction, but part of the role of that construction is exactly to live, experience, and reason in an interpreted, meaningful, and highly personal way within the organism's environment. A self helps maintain a coherent viewpoint and a set of values about the world relative to that system's needs and experience. Being in the world is not dispassionate, neutral, or unbiased. We may gain advantages by having processes that help us be more objective, but in a real sense, those are the secondary processes to the critical need to understand one's world in relationship to oneself. Emotions are not only mappings between the world and the self, but rather they are in one sense the meanings of that world to the organism. One can take this view and easily make use of Damasio-like mechanisms. At the simplest level, one can see how a frightened animal has a frightening world at that moment, and how any of its experiences may well be linked forever with that ''meaning.''
But in ways that we don't yet understand, we are hoping, with this focus, to explore much more than the processing of associations. Eventually, we want to understand how meanings lead to understanding, and how they are used in reasoning. Clearly, in humans, we do things with our language that remain much beyond the scope of simple ideas presented so far (but always colored by emotion, as Ogden said in the earlier quote above). Many mysteries remain. How global is the global construct that we have proposed here? How much does it ''speak for'' the many voices under it? How much do different partially compatible and incompatible meanings get expressed or reconciled, or do they? And, of course, for our considerations here, how can we draw useful ideas from the biological to help build better-behaving constructed systems?

As noted before, we are hoping to develop concepts that draw from both Capra-like dynamicists and traditional cognitive scientists and semioticians (in our view, semiotics is the study of symbol systems). We do this by emphasizing that, as seen in physics and chemistry, matter and energy exist as a continuum, but in fact there is something deep and important about the creation of objects and boundaries. Certainly, part of a biological system's important capabilities is the ability to create boundaries and objects. Capra and other dynamicists, in their laudable attempt to shake people out of a naive realism that acts as if the objects we perceive simply exist as such in the external world, run the risk of ignoring critical constructive processes that in fact help self-organizing systems structure themselves. In other words, constructing a self does not have to be any more vitalistic or ''old think'' than constructing a cell wall. In our opinion, our viewpoint and our selves (the way we will interpret and find meaning in our experiences) are as critical constructions to our functioning as cell walls, bones, or any other physiological mechanisms that help us to regulate and control our ''insides'' relative to the requirements of the external world. We believe that such construction is very important—that thingness (our collection of constructions) allows us to develop a number of important capabilities. Like these other mechanisms, boundaries and buffers are useful regulatory strategies for allowing differential states (between the inside and external world) and for allowing, literally, ''time to think.'' The immediacy of the external world becomes filtered, muted, and interpreted. One of the most important of these ''thingness'' constructions is this construction of a self as an object, as an entity. It takes for us
the multitude of our cells, our effectors and sensors, and all the rest, and represents them for us to reason about as a whole. It is a whole that we care about; that we experience; that we protect and work for; and that we present to others also as a whole. Self is the ultimate integration mechanism. Perhaps the only invariance of self is the style of this integration—over all my experiences, over all my living and dying cells, over all my variations—a whole. Further, it is an actively maintained construction, worked on continually (Damasio 1994). In order to wade out of this arena so full of verbal land mines, we want to emphasize that we are looking here for ideas on the possible functions that a ‘‘self’’ could play within complex systems. In the next section, we consider the advantages of developing a ‘‘self’’ for constructed systems and discuss our strategies for making progress in this field.
5.4
Strategy for Making Progress in This Field: Models and Test Beds
When we start discussing the roles of self, the conversations quickly become religious and—no surprise—personal. For example, Capra (1996), along with a small number of other thinkers, seems to admire a Buddhist resolution to the question of self. ‘‘The Buddhist doctrine of impermanence includes the notion that there is no self . . . that the idea of a separate, individual self is an illusion.’’ He adds, ‘‘The belief that all these fragments—in ourselves, in our environment, and in our society—are really separate has alienated us from nature and from our fellow human beings and thus has diminished us.’’ (1996, p. 269). Emotions in this framework are the same as cognition—part of the structural adjustments the organism is making to its world. Others we could cite would be equally passionate about the opposite view, that it is dangerous to not have a concept of self firmly in their science and society. As we have seen, we have a great deal of loaded language to overcome in this area of research. One of the most important things that we can now do as a field is to develop criteria and strategies for how to make progress in this area (e.g., what would constitute success?)—what experiments we should do; what is the type of data/information we would like to develop; and what data would convince us about the usefulness of any of these concepts and the roles and capabilities of ‘‘emotional selves.’’
We certainly cannot make up the success criteria for all the diverse parties who are interested in emotional agents. There are those who study emotion in order to construct better cognitive and social models of humans, to develop computer programs that will interact in more meaningful ways with humans, or, more simply, to develop computer programs that will entertain the user better because they ''act'' more emotionally. We want to study the analogs of emotional reasoning in computer programs because we want to develop more interesting ideas about the types of processes that autonomous systems need, including a definition of self (Bellman 2000). Therefore, instead of deciding the success criteria ourselves, we offer some strategies that allow us to set up overt agent models ''personifying,'' so to speak, our ideas, and to use virtual worlds as test bed environments that invite us to explore abstract ideas coupled with observation. Because less is known about virtual worlds than about agent models, we will spend the bulk of the discussion below on setting up virtual world test beds.
Using Agents as Models for Emotional Selves
One of the problems we face, if we start adding a notion of self in biological agents, is the danger of postulating a homunculus inside each human brain as if there were a little seat of self, sitting in and controlling all the rest. Building computational agents keeps us steadfastly away from such a luxury, because even if we were tempted to do so, we cannot put such a homunculus inside an agent. Instead, we must mechanically build up capabilities that we believe act like self, or provide some of the important reasoning properties of self. This is a healthy process, as long as we do not mistake our processes (and the way we build them) for the biological capabilities we are trying to model. In Bellman (2000), we proposed that autonomous systems require models and processes that support a concept of ''self'' in order to permit interesting autonomy, and that emotions are essential reflections of having a self, and indeed for developing one. Our focus on the importance of self-monitoring and self-reflective capabilities in an autonomous system is certainly not new, in either earlier philosophical, biological, and psychological works (Granit 1979; Miller 1981; Paul Churchland 1984) or in the recent work in constructing artificial agents (Maes 1987; Sloman 1997;
Barber and Kim 1999; Landauer and Bellman 1999a, b; and many more). Autonomy is the degree to which the individual components of some greater system determine their own goals or purposes and make their own decisions about how to use their capabilities in order to further those goals. In Bellman (2000), we go on to argue that different kinds of autonomy turn out to be one of the most successful strategies for adaptive behavior in a distributed system. Although many computer scientists don't think of a single animal as being distributed, one of the reasons for the rise of multicellular animals is that they are distributed in space and time relative to a single-cell animal. This property of distributedness is seen in even very small entities, such as a single cell, which has distributed elements within the microcosm of the cell wall. Suffice it to say for now that the natural world finds it useful or even essential to have parts—parts that divide up the continuities, and that can be used to accumulate, work against each other, and give to the world a patchwork of ''local'' effects within the context of more global effects, at many different time scales. The chapter goes on to describe several ''strategies'' used in adaptive distributed systems: distributed control, partitioning functionality, hiding internal reasoning capabilities, combining local and global views for better situational assessment, and combining local and global goals for more robust responsiveness.

However, the problem with multicellular animals, societies, or distributed systems in general is how to coordinate the activities of the diverse subcomponents while enjoying the advantages of that diversity and multiplicity. This led us to speculate on the need for the construction of a self, for many of the reasons described in the previous section. If we want to take advantage of the adaptive features of autonomous systems, then the local entity must be designed to organize its resources and apply them to the goals at hand. But part of its resources are those that constitute itself. Therefore the agent must have enough self-knowledge to reason about how it can move through and act within the environment with all its properties and capabilities. There are other advantages to incorporating self-knowledge into agents: Potentially, selves can know what they are not seeing! They can know when they are obstructed or can't solve a problem. If we also build in suitable communication and reporting capabilities, these selves could also ask us for the type of help they need. We need to build such capabilities (through learning
and other types of processes) into our constructed systems. For example, imagine an agent without a self that is supposed to cooperate with another agent on some type of activity. Without a self, it might have feedback from the environment and not realize that what was true for ''itself'' is not true for everyone or for the whole world. Hence a concept of self is key to reasoning about the incremental aspects of a plan, partial outcomes, and the viewpoint of others or even of oneself at different times. (Smith 1986 has made similar suggestions as to the value of self-reflection in communication among agents.) Initially, we define a ''self'' in agents as the knowledge (and associated reasoning processes) of the agent's capabilities, limits, and viewpoint (Landauer and Bellman 1999a, b). In traditional views, there is an emphasis on access to private views and meanings (e.g., the subjective, ''felt,'' and private experience of the individual) as being the key attribute of having a ''self.'' In biological systems, it is clearly true that the individual being has more access than any others to such experience. But in artificial systems, developers could potentially have access to all of the internal memories, activities, and processes of an artificial agent. Hence, for constructed autonomous systems, we initially would like to de-emphasize the property of having unique access to private information, and indeed the notion of ''felt'' experience, as the criterion for having a self. Instead, we emphasize the knowledge (and supporting processes) of one's boundaries (scope and extent of one's capabilities), functions, goals, sensations (as the more limited input from sensors), actions, algorithms, and other processes.

To summarize this argument, in the previous section we discussed emotion as critical to the formation of a ''self'' in biological systems. Here we emphasize the usefulness of having a self in constructing an autonomous system, where it provides part of the functionality needed to adapt a distributed system to an environment. It arises as a property that is the outcome of a number of other capabilities that are crucial to reasoning and adapting to an environment. To move one's self, an entity needs to ''know'' where it is and where it wants to go. Autonomy implies decision-making authority, and to make decisions implies some knowledge of the current state of the system—including its goals. When the decision we want to automate has to do with the system itself—that is, moving itself, making goals, or making decisions and acting upon them—we require self-knowledge. At the point that we have
embodied an agent with a limited set of sensors, effectors, and cognitive capabilities, allowed it to generate local goals and plans, provided it with feedback from the environment, and allowed it to record (and maybe even learn from) the results of its plans, we now have an agent that both benefits from and is limited to a viewpoint. The existence of private histories means that some parts of the viewpoint of an agent are never going to be public and should not be. Many linguists, especially Wittgenstein and Ogden, have dealt with the issues of ''private language.'' For this chapter, we simply want to note that it remains a hard theoretical and practical problem. As one builds up internal capabilities and the decision-making authority of an individual agent, how much and what type of information or processing should be made visible and shareable with both other agents (including human users) and human overseers?

Also, we have known for some time that it is very important for us to develop better concepts of viewpoint in computer science and artificial intelligence. In knowledge-based systems and related approaches (e.g., case-based reasoning, heuristics in simulations, and so forth), one is attempting to make a certain type of qualitative information computable. Actually, we usually want that information to be more than just computable—we want it to be increasingly analyzable. An important part of this type of knowledge is that it is not just ''qualitative,'' but rather that it contains an explicit point of view. In expert systems, the information was organized from the viewpoint of an expert or knowledgeable practitioner in the field. Hence a diagnostic for a set of medical conditions embodies how expert diagnosticians would view the existing evidence, what they would think about, ask questions about, and so forth. We believe that this emphasis on qualitative information from a collected set of expert observers led to the field of ''agents,'' in which the knowledge is both embodied and acted out from the point of view of the agent. A common limitation in this approach is that there is not enough explicitly stated or embodied about the use of that knowledge, the context for that use, and the problem being addressed (Landauer and Bellman 1999a, b). It is very difficult to effectively design agents without explicitly determining how the proposed capabilities lend themselves to specific goals and results in explicitly defined operational settings.
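As a purely illustrative sketch of the working definition above, the following hypothetical Python fragment treats an agent's ''self'' as explicit, queryable knowledge of its sensors, effectors, goals, and history; none of the names are taken from the chapter or from the cited systems.

```python
from dataclasses import dataclass, field

@dataclass
class SelfModel:
    """Explicit knowledge an agent keeps about itself: capabilities, limits, viewpoint."""
    sensors: dict            # feature name -> sensing range ("boundaries" of perception)
    effectors: dict          # effector name -> set of actions it can perform
    goals: list
    history: list = field(default_factory=list)   # private record of past outcomes

    def can_perceive(self, feature, distance):
        rng = self.sensors.get(feature)
        return rng is not None and distance <= rng

    def can_do(self, action):
        return any(action in acts for acts in self.effectors.values())

    def blind_spots(self, required_features):
        """Self-knowledge lets the agent know what it is *not* seeing."""
        return [f for f in required_features if f not in self.sensors]

    def report_needed_help(self, task_actions, required_features):
        """What the agent would have to ask another agent (or a human) to cover."""
        return {
            "actions_i_lack": [a for a in task_actions if not self.can_do(a)],
            "features_i_cannot_sense": self.blind_spots(required_features),
        }

# Hypothetical usage: before joining a joint task, the agent checks its own limits.
me = SelfModel(sensors={"obstacle": 2.0}, effectors={"wheels": {"move", "turn"}},
               goals=["reach_charger"])
print(me.report_needed_help(task_actions={"move", "grasp"},
                            required_features=["obstacle", "color"]))
# -> {'actions_i_lack': ['grasp'], 'features_i_cannot_sense': ['color']}
```

Even this toy self-model lets the agent report what it cannot sense or do, which is exactly the kind of information a cooperating agent or a human overseer would need.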
One of our approaches to studying the properties of viewpoint is to start with agents within well-specified environments that allow us to represent explicitly the contexts for the use of agents, and hence to explore both the problems in designing the right agent for the right task and deeper semantic issues. We believe that such virtual worlds would also contribute to the encouraging diversity of studies building emotional agents (Cañamero, chapter 4 of this volume, and many others). We believe that these studies will benefit from virtual worlds as a test bed in many ways, especially when it comes to designing mechanisms within agents for interacting with their worlds and other agents. For example, virtual worlds can help us specify what is needed from the agents' mechanisms that would help to evaluate the external world for resources, for cooperative others (or dangerous others), for feedback from the environment, for assessment of outcomes, and more. Hence we will now turn to a more thorough description of virtual worlds.
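Before turning to that description, a toy sketch may help make the idea of such evaluation mechanisms concrete: a hypothetical appraisal function maps what an agent perceives onto valenced ''meanings'' relative to its own needs, and those meanings bias action selection. The features, weights, and names below are invented for illustration only and are not drawn from the chapter or the cited systems.

```python
def appraise(percept, needs):
    """Map what the agent currently perceives onto valenced 'meanings'
    relative to its own needs; a toy stand-in for emotional appraisal."""
    appraisal = {}
    if percept.get("predator_nearby"):
        appraisal["fear"] = 1.0                      # threat to the self
    if percept.get("food_visible") and needs.get("energy", 1.0) < 0.3:
        appraisal["desire"] = 1.0 - needs["energy"]  # same object, need-dependent meaning
    if percept.get("conspecific_nearby") and not percept.get("predator_nearby"):
        appraisal["affiliation"] = 0.5
    return appraisal

def choose_action(appraisal):
    """Let the strongest current 'meaning' bias action selection."""
    if not appraisal:
        return "explore"
    strongest = max(appraisal, key=appraisal.get)
    return {"fear": "flee", "desire": "approach_food",
            "affiliation": "approach_other"}[strongest]

# The same world yields different meanings for differently situated agents:
world = {"food_visible": True, "conspecific_nearby": True}
print(choose_action(appraise(world, needs={"energy": 0.1})))  # hungry -> approach_food
print(choose_action(appraise(world, needs={"energy": 0.9})))  # sated  -> approach_other
```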
Virtual Worlds as Test Beds for Experiments on Self and Emotions
As we mentioned earlier in this chapter, this area of research is full of problematic language. We need an environment in which we can operationalize some of these concepts and actually observe such mechanisms in use. One of the most important things we need is an environment in which we can explore these very difficult ‘‘mappings’’ between goals, agent capabilities, agent behaviors, and interactions with the environment, and consequences or results in that environment. One of the most difficult issues has been that, heretofore, because we could not completely instrument the real world, we certainly could not capture all interactions between a system and its world. Now in virtual worlds we have an opportunity to do so. The disadvantage of course is that these worlds are not nearly as rich as real worlds. However, it is our experience that when one starts filling these worlds with information, objects, processes, and agents that move around the different rooms of these worlds and interact with each other, then the worlds are rich enough to be complex. If one now adds humans behaving through a number of modalities in these worlds while interacting with these objects and agents, we have more than sufficient complexity to be interesting enough for the foreseeable future.
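A minimal, hypothetical illustration of this kind of instrumentation is sketched below: because every action in a virtual world passes through software, each command, whoever issues it, can be recorded together with its context and outcome for later analysis. The world, commands, and names are invented for the example and do not come from any of the systems discussed here.

```python
import time

class InstrumentedWorld:
    """A toy virtual world in which every interaction is captured for analysis."""
    def __init__(self):
        self.rooms = {"lab": {"occupants": set(), "objects": {"lamp": "off"}}}
        self.log = []    # complete record of who did what, where, and with what result

    def act(self, actor, room, command, target=None):
        result = self._apply(room, command, target)
        self.log.append({"t": time.time(), "actor": actor, "room": room,
                         "command": command, "target": target, "result": result})
        return result

    def _apply(self, room, command, target):
        objs = self.rooms[room]["objects"]
        if command == "toggle" and target in objs:
            objs[target] = "on" if objs[target] == "off" else "off"
            return f"{target} is now {objs[target]}"
        return "nothing happens"

# Both a human user and a software agent act through the same interface,
# so their interactions end up in the same analyzable record.
w = InstrumentedWorld()
w.act("kirstie", "lab", "toggle", "lamp")
w.act("robot_guide", "lab", "toggle", "lamp")
print(len(w.log), w.log[-1]["result"])   # -> 2 lamp is now off
```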
Especially important is that these worlds force us to describe explicitly what the salient features in the ''world'' are that will be noticed by the agent, what the actions are that will be performed in this world to cause what results, and so forth. This, to our mind, has been the missing partner in the usual concentration on building agents to interact with the world. Last, we have a simplification not enjoyed in the real world: hard boundaries between the inside and outside of a system (although one can walk into an object in a virtual world and have it become a setting; Landauer and Bellman 1999a). Equally important is that we need a test bed and a style of experimentation that allows us to build, observe, refine, and accumulate knowledge about our agent experiments and implementations. Computer scientists, unlike researchers in other scientific fields, do not have a good track record in building off of each other's implementations. Partly due to the wide variety of languages, machines, and so on and the problems of integration, each researcher tends to rebuild capabilities rather than use someone else's. Virtual worlds, it is hoped, will encourage a new attitude and process for conducting and observing each other's efforts and sharing experiments and implementations.

VIRTUAL WORLDS: NEW PLACES FOR INTEGRATION AND ANALYSIS
Virtual worlds (Landauer and Bellman 1998c, 1999c), although drawing strongly from virtual reality technologies, differ from virtual reality in three ways. First, unlike most VR environments, virtual worlds are not necessarily homogeneous simulation environments; rather, they often have a large diversity of heterogeneous resources available through the environment. In some cases, one can access all of one's computing resources—models, editors, websites, and so forth—from within the virtual world. Second, many of the ''utilities'' or ''services'' in the environment are embodied as agents that move, interact, and talk within this environment. These agents will often come in through the same client-server protocol as a human user, and easily pass the Turing test in everyday conversation and activities. Human and artificial agents are often represented with an avatar (a character representation in text or graphics, depending upon the virtual world). Last, virtual worlds are organized in a spatial metaphor; each of the separate places is called a ''room.'' Unlike most VR environments, these rooms
are not just a picture or a description, but instead have rules or dynamical models underlying or constraining behaviors of entities in that room. The result is that the allowable types of agents, objects, and activities are all constrained within that subspace. Hence, even in rudimentary form, these settings begin to act like little ''ecosystems,'' with their own local physics and niche dynamics. One of the most important qualities of this for our discussion here is that these rooms are explicit models of the context within which we want to interpret and produce certain types of behaviors, interactions, and results.

Virtual worlds rose from three major lines of development and experience: (1) role-playing, multi-user Internet games called MUVEs (multi-user virtual environments); (2) virtual reality environments and advanced distributed simulation, especially those used in military training exercises; and (3) distributed computing environments, including the World Wide Web and the Internet. Because the text-based MUVEs have some very important properties for our discussion here and often are the least familiar to most readers, we will take a moment to describe them in more depth. Developed originally in 1979 (Bartle 1990), MUDs (multi-user dungeons originally, now more often multi-user domains) are the most interesting new development in computing for many years (O'Brien 1992; Riner and Clodius 1995; Leong 1999). Just in the last few years, they have moved out of the ''game'' arena into educational and corporate environments for distance learning, collaborative learning, literacy support (at all grade levels, including adult), corporate meeting support, professional organizations, and even technical conferences (Bellman 1997b; Landauer and Bellman 1996, 1997; Polichar 1996, 1997).

Imagine reading a story set in some time or place. If the story is well written, it can feel like one is actually experiencing that situation or even becoming that character, regardless of whether the story is fiction or nonfiction. Stories can present information about a situation that is usually only learned through experience; they are particularly good at descriptions of complex settings that are very hard to construct (Dautenhahn 1998a, b). If a MUD is designed well, it is like a well-written story in its power to transport the user to a different situation, but it has three other important features. For already created stories (often called ''quests''), it is interactive, which means that the reader can affect the behavior and outcome
of the story; so, in particular, the reader can explore the story in many different ways. Based on the reader's experiences in that world so far, they will also be exposed to certain characters, actions, and parts of the story. It is also multi-user, so that the reader can work with, play against, or interact with other readers. All users within the same room can see each other's actions, character descriptions, and conversation with others (except when others ''whisper,'' which is a private point-to-point communication not broadcast to the rest of the room). Most important of all, there is plenty of ''room'' (and much encouragement within this subculture) for users to create their own ''stories,'' characters, and places. Both humans and computer programs enter these worlds and act within them as distinct characters with names, descriptions, and behaviors. These MUD stories are stored as databases on servers (that provide other services for managing the multi-user world), accessed by the users (both human and computer-based processes) with client programs. A simple command language is provided by all the server programs for MUDs that allows users to move around, act, and talk within the virtual world. There is also a simple construction language in many MUDs that makes it easy for a player to immediately become a builder (a creator) in that world. A MUD implements a notion of places that we create, in which we interact with each other and use our computing tools together, instead of having all tool use and collaborative interactions mediated through tools used individually.

Like other virtual reality environments, multi-user domains employ an underlying spatial metaphor. Humans use spatial maps for many things. We are able to organize an enormous amount of data and information if we can place it in a spatial context. Multi-user domains elicit a surprisingly powerful sense of space using only text. Characters may gather in the same location for conversations and other group activities, where their interactions are not restricted (or interpreted) by the servers, and because the servers do not get in the way, it is as if they have become almost transparent (Gordon and Hall 1997). The sense of ''being there'' can be quite strong, and in fact, the emotional ''reality'' of human users comes across surprisingly well, and this in turn greatly enhances the sense of being there, making MUD experiences very compelling (Schwartz 1994; Turkle 1995; Landauer and Polichar 1998). This is why MUDs are VR
programs: The human interactions are real; only the physical ones are not (Riner and Clodius 1995; Clodius 1995). One reason for being interested in multi-user domains is that there may be hundreds of people in the MUD at any given time, moving around separately and independently, creating objects in real time, and interacting with each other. From just a social science viewpoint, MUDs are clearly an important new phenomenon. There are now thousands of internationally populated MUDs, some with as many as ten thousand active players. These players are not simply visitors, as to a website, but rather users who often spend several hours a day within that world. Some of these virtual communities have now been in existence as long as ten years. They have elected town officials in some places; they walk around their towns, have their own places that they build, and describe themselves as ''living'' there. They have imbued these virtual places with meanings. They have roles and functions that they play within those communities. As scientists, we want to understand more about why some of these virtual communities flourish over years and why others vanish within a month. Certainly for anyone interested in collaborative technologies, we need to know what they are doing right that they are able to live, work, and build together in these virtual communities.

One of the most important qualities of MUDs as systems is that text-based MUDs allow people the freedom and richness of word pictures, something that we can't imitate yet with any graphical environments. Text-based MUDs have a much richer and more dynamic visual imagery than, say, movies or games, because it is customized to each player's imagination. Even with the simplest of construction languages, people experience a deep sense of being present within these virtual environments, partly because they have built those environments from their own imagination. This sense of real presence and real interactions leads to real emotions and real social interactions, even though they are mediated through text displayed on a computer screen. This emotional reality will become important in a moment, when we consider using these as test beds for experiments on emotional and social interactions between human and artificial agents.

Multi-user domains are also great equalizers: All people—not just what we call ''technocrats''—can become builders in a very short amount of time. It is not the computer technology that makes MUDs work. It is the writers and artists who create the world and
the people who live in it. The computer networking software is actually rather simple; it has changed little in the last twenty years. The MUDs with better poets seem to last longer than the ones with better computer scientists. In fact, we've seen examples of eight- and nine-year-old children, who were raised in inner cities and were nearly illiterate, become, within a short amount of time, able to build up worlds (see Landauer and Bellman 1998c). Their teachers, in several different projects, have reported the children's enormous motivation to be part of these environments, which had a noticeable impact on their efforts to read and to write well. One little girl who comes to mind built a 30-room mansion with gardens and pools. Another, an equally shy little boy, showed the author his gardens, where, when you looked at the flowers, they blossomed.

This easy entrée to MUDs extends not only across age, as just discussed, but also across disabilities and gender. One of the most articulate people on one of the author's favorite MUDs is profoundly deaf; he is much more comfortable speaking to people on-line (and vice versa; other people are more comfortable speaking to him on-line). Some MUDs have a near gender balance; one of the largest MUDs of all has an ongoing culture of role-playing, and actually has a slight majority of female players. Aside from letting people quickly become not only players but authors of the worlds, another aspect of these environments is that they have very simple client-server architectures, which means that there are people in these environments who have only teletypes. Others have speech-generation boxes because they cannot see. These environments are worldwide. You don't need sophisticated equipment or programming experience to become a player or participant. This low cost of entry to MUDs has made these environments popular with a wide range of nontechnical people.

Different kinds of MUDs use different construction languages. Usually, the variant of the word MUD reflects the choice of language available. These different languages allow different classes of behaviors to be specified for the objects created by the users. Some of these objects can be created and used in real time. Pedagogically, this can be very powerful. At one meeting of mathematicians on a MUD, some colleagues were joking with the author about an ''infinitely parallel quantum computer.'' While the others joked, the author quickly created an object labeled thus with a few simple attributes, and threw it across the room to one of the others. Although this was done as a joke, think about the ability to make—
even at a primitive level—a new idea active and visible; something that others can pick up, modify, duplicate, and walk out of the room with. One of the colleagues present that day still has the ''quantum machine'' (in his virtual pocket, naturally). Another important quality about these environments is that every object is a state machine. Therefore, the objects that you are holding in your hand, the rooms you have walked through, the things you have accomplished in that environment can determine what you see, what objects do to you as you walk through this environment, and sometimes even where you go when you walk through a door or perform some action. These properties allow authors to set up ''quests'' or interactive stories that have game or logical features that must be accomplished to succeed. They are also easy ways of structuring learning material. One of our colleagues, a good amateur Egyptologist, set up a quest that requires one to learn some Middle Egyptian—both vocabulary and grammar rules. If you don't tell the boatman to take you across the river in proper Egyptian, you can't cross, nor can you talk to the idols that give you other clues for finding the treasure and solving the puzzles. This particular quest is implemented in a virtual world that uses one of the simplest MUD servers of all, a TinyMUD. For our purposes here, these properties also allow one to set up rules that define an initial ''ecology'' or ''physics'' for each room. Although not as sophisticated as the modeling provided in other virtual worlds, even the simple TinyMUD can help one to start describing the contexts for and the constraints on the interactions between avatars, agents, and objects.

Finally, the simple client-server architecture means that computer programs called ''robots'' can also be users, coming into the environment with the same interaction mechanisms that a human uses. They use the same commands that a human uses to move around the environment or construct new objects (see Foner 1997 for a description of one particularly interesting robot; and Johnson et al. 1999 for another application). The author couldn't tell one was a robot until it was in a group situation, where its responses became less coherent, because its underlying pattern matcher could not keep track of multiple threads of conversation. These robots give us many interesting ideas about the kinds of intelligent support that agents could provide for people within virtual environments. At present, there are prototype robots that take notes for people, tell them stories about the area, room, objects, and people
in the MUD, and play games with them. They can follow people around, help them find things, and do errands for them. They can tutor them, help them find digital material, and give them tailored presentations on the computer programs or other objects available in the virtual world. Last, some robots helped us to monitor and evaluate the behavior of others in the MUD (Bellman and Landauer 1997; Bellman 1997b). In the computer-aided education and training initiative (CAETI) project, a large educational technology research program sponsored by the author at the United States Defense Advanced Research Projects Agency (DARPA) from 1994 through 1998, these robots were also used as computer-based tutors and other evaluation agents (see Johnson et al. 1999 for an example).

In the DARPA CAETI program, we added to the basic MUVE capabilities in several ways: We developed more advanced MUVE architectures, especially ones with the ability to keep a text, 2-D, and 3-D version of the world in sync. This was especially important because it always allowed users several ways of sharing in the world, even if at times in a limited way. The advanced architectures also allowed us to distribute the functionality underlying these worlds in more powerful ways. A good example of this was the better distributed database management. We also made it possible to have many more types of heterogeneous tools available from within the environment. Some of these tools were new types of embodied intelligent utilities and agents that helped individual users (librarians, guides, and tutors) or conducted support activities across the world (evaluation agents). Some of these new tools also helped tailor resources to an individual user.

Unlike a formal mathematical space or even the usual homogeneous simulation system, part of the strength of a virtual world is its ability to become the common meeting ground for a variety of different types of symbol systems and processing capabilities. These different symbol systems and processes occur in a variety of forms within the virtual world. Information and processing capabilities can be ''packaged'' as ''agents,'' who often interact with human users in natural language, and freely move and act in the same way as a human user within the virtual world; as an ''object'' that is manipulated by human and agent users within the virtual world; or as part of the ''setting'' (e.g., the description, capabilities, and ''physics'' of one of the many ''places'' within a virtual world).
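Pulling together two of the properties described above, rooms that carry their own local rules or ''physics'' and objects that behave as simple state machines, the hypothetical sketch below shows how a room can constrain allowable actions and how an object's state can gate a quest-like interaction. It is only an illustration of the idea, not code from any MUD server, and all names are invented.

```python
class StatefulObject:
    """A MUD-style object as a small state machine: what it does depends on its state."""
    def __init__(self, name, states, transitions):
        self.name, self.state = name, states[0]
        self.transitions = transitions        # (state, action) -> (new_state, message)

    def interact(self, action):
        key = (self.state, action)
        if key in self.transitions:
            self.state, message = self.transitions[key]
            return message
        return f"The {self.name} does not respond."

class Room:
    """A room with its own local 'physics': rules that constrain allowable actions."""
    def __init__(self, name, allowed_actions):
        self.name, self.allowed_actions, self.objects = name, allowed_actions, {}

    def attempt(self, actor, action, obj_name):
        if action not in self.allowed_actions:
            return f"{action} is not possible in the {self.name}."
        return self.objects[obj_name].interact(action)

# A quest-like gate: the boatman only ferries you once addressed correctly.
boatman = StatefulObject("boatman", ["waiting", "ferrying"],
                         {("waiting", "speak_egyptian"): ("ferrying", "The boatman poles you across."),
                          ("waiting", "speak_english"): ("waiting", "The boatman stares blankly.")})
river = Room("riverbank", allowed_actions={"speak_egyptian", "speak_english", "look"})
river.objects["boatman"] = boatman
print(river.attempt("player", "speak_english", "boatman"))   # The boatman stares blankly.
print(river.attempt("player", "speak_egyptian", "boatman"))  # The boatman poles you across.
```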
The packaging of some process to become an object, an agent, or part of a setting in a virtual world hides, like a good application program interface (API) should, many details about the process and how that process works. The virtual world gives an appearance that here is a uniform world of objects, actors, and places all acting within the common ''physics'' of a setting and seen and heard in the same way. This is a reasonably successful and good strategy.

Virtual worlds also draw from distributed simulation and virtual reality (VR) technologies and research. VR and distributed simulations gave us experience with distributed simulation environments (especially for training). They also provided us with some good examples of multimedia and multisensory worlds (example worlds are as diverse as the Naval Postgraduate School's one for military operations training, such as hostage recovery; an MIT system for medical training; and NASA, Air Force, and Army flight simulators and tank trainers). They also produced high-end graphical environments and avatars (for example, ''Jack'' from the University of Pennsylvania, Lewis Johnson's pedagogical agents, or Perlin's ''dancers''; see Landauer and Bellman 1998c). VR research also developed the idea that one could use a spatial metaphor for working in even abstract spaces (e.g., data analysis). If one uses the virtual world as a work environment (i.e., as a front end to other tools and a place to meet), the computer-mediated environment means that all interactions—human-human, tool-tool, and human-tool—potentially can be captured and made available for analysis. Hence the constructed system becomes not only a model embodying our ideas, but a new type of laboratory within which we observe and are observed by our tools.

WHAT CAN WE STUDY IN THESE WORLDS?
What kinds of experiments and what kinds of issues might we explore in a virtual world? Although by no means complete, the list below is offered as stimulation and hopefully as a starting point for some to explore virtual worlds for their interests.
1. Goal-directed behavior, agency, will, and intention. If we want to do this in a functional manner then we need to do it by having environments that are well specified, like a virtual world, tasks that are well specified, observable interactions and consequences, and then the mechanisms for planning goals, responding to and acting within the environment embodied in an agent. In such a test bed we now have the potential to build up the types of correlations and causal relations we need to understand these complex relationships.

2. Interactive systems. Virtual worlds allow us to see both sides of the problem of designing interactive agents—for example, interactive with what? What does the agent in the environment notice (relevancy)? What is the feedback from the environment? How is that feedback related to the effects of the activities, both of the agent and its environment? How does the embodiment of the agent reflect its role within the environment?

3. Cooperative agents. What does an agent communicate with others in this world and where and why? What does an agent keep ''hidden'' in terms of its knowledge or processing? How do virtual worlds support nonverbal interactions and communications? How do agents negotiate?

4. Humans versus artificial agents. How do we compare the activities and roles of humans and agents in virtual worlds? How do their differing capabilities show up within different virtual worlds? How do we distribute functions and roles between human and artificial devices and for what reasons?

5. Comparisons across modeling formalisms and methodologies. In line with Lola Cañamero's comments at this workshop (chapter 4), we can compare different worlds, different types of agents, and different kinds of approaches in this type of test bed. This could provide the field with important common benchmarks. Cañamero suggested that a start here is to try to characterize the environments' attributes (uncertainty, etc.), allowable activities, and allowable interactions.

6. Viewpoints. We can explore ''viewpoints'' based not only on where an agent is and how it is embodied (scope and kind of sensors and effectors), but also how that viewpoint is altered by its experiences, capabilities, and embodiment. Clearly this is a critical step toward defining a notion of self in these test beds. Also, important to social agents, we can study schemes for combining viewpoints.

7. Dynamics and boundaries. We can tease apart the contribution of the underlying rules of the setting—the ecosystem, the physics—from the mechanisms and capabilities that we have embodied within the agent—essentially the organisms' agency from the environmental dynamics. We can also start to tease apart what constraints we have from the setting and what from the manner in which we have developed the agent.

8. Meaning and interpretation. Last, because we have clear definitions in computer science of interpreters, we can conduct new types of experiments in protocols and interpreters as we try to move from syntactic to semantic information in artificial agents. We have proposed semantic experiments elsewhere (Bellman 1997b).

5.5
Conclusions
Recently, the Los Angeles Times quoted John Searle as saying that consciousness as a philosophical problem is solved and has become a matter of science: ''Let the brain stabbers figure out how it works'' (28 December 1999, p. A23). Perhaps he is correct insofar as the philosophical problem goes, but as to the science, we have just begun. Consciousness, self-awareness, self-identity, and emotions require much more than finding out the physiological correlates. In order to understand those ideas we will need to understand meaning and, hardest of all, what we mean. As Gary Zukav passionately states, ''the irrelevancy that we attribute to feelings pervades our thinking and our values'' (1999, p. 60).

The approach taken in this chapter was to explore briefly the cluster of concepts related to emotion and self. We presented an argument here that emotional reasoning is fast, associative, situated, embodied, compelling, and self-centered. By contrast, computer-based agents currently are fast, employ a limited notion of associative capabilities, are situated and embodied only in a limited way in virtual worlds, are not yet particularly compelling, and are only unwittingly self-centered. There is no ''I,'' no knowledge of viewpoint, and only limited computational reflection. With our focus on exploring the functional roles that emotions and self could play within artificial systems, we discussed, based on the research from a diversity of fields, several possible roles for emotions, followed by several possible roles for a self. Of all the reasons for emotions (as motivators and as a crucial part of our ability to trim down decision processes), we emphasized the emotional self as an integration concept—that ''global construct,'' as a set of self-monitoring and self-perception mechanisms, and as a set of self-reflection capabilities. But key to all of those, we started to
explore how the self is critical to meanings and interpreting the organism's experience of the world. We further speculated that this especially is where emotions support and reflect the construction of a self. Clearly, if we want to have more intelligent computer programs—ones that can interact comfortably with us, understand our meanings, and even reason more powerfully in fluid, uncertain environments—we need to continue to consider how fundamental emotions are to our thinking, to ourselves, and to our meaningful understanding of the world. Our hopes and expectations for the results of having emotions in agents must be weighed against the shortcomings of that type of reasoning. It is local, immediate reasoning with a narrow focus on what the results are to the agent. It is not methodical or global (in the sense of considering all the possibilities or different viewpoints). Whereas other cognitive processes allow us to see other viewpoints, emotional thinking has an essentially personal viewpoint—''what is happening to me in this circumstance, and is it good or bad?'' This last sentence of course does not handle empathy adequately—our ability to feel as others do. Perhaps it became evolutionarily advantageous to have animals that empathize with the members of their groups. Perhaps we humans also are civilized (and animals too perhaps in their acculturation process) into considering the consequences of our actions on others. Perhaps we develop cognitive capabilities that help us reason about other viewpoints, including emotional ones. Perhaps all three of these reasons hold, and more, because this broader emotional reasoning will of course be critical to the nature of social agents. Clearly this broader emotional reasoning could be of great importance to the viability of any organism that mates, parents, fights, hunts, or cooperates with others. We will consider such social emotional reasoning more in future papers after we have had more experience and observations from ongoing experiments of social agents in virtual worlds.

We then focused on an approach for making progress in this field by setting up virtual world test beds, in which we can both test our abstract notions about emotional selves and observe the interactions of human and artificial beings within well-specified environments. As we stated above, constructed systems might actually prove to be a boon for studying issues such as self and emotions, because one can hypothesize concrete functions, explicit roles within a given system, and mechanisms for them,
and see what is produced. We suggested that virtual worlds provide an important and new kind of test bed for experimenting with concepts of emotions and selves in artificial agents. Virtual worlds have the advantages of allowing us to collect a new level of observations of the interactions among humans and agents, of humans and agents using tools, and of the results of activities in a well-described, interactive and instrumentable environment. This certainly does not solve all the hard problems, but it does allow us to conduct new kinds of experiments. In these experiments, we can define what role the dynamics or ‘‘rules’’ of the environment have on the individual, what the agent is supposed to notice from, respond to, interact with, and do in a given environment, and what capabilities the agent must have in order to respond and work within that environment. Further, by providing environments that allow observations of both human and artificial beings, one can compare the real differences between them and the superior richness of biological ‘‘implementations.’’ As noted above, ironically, it may not in the end be as true that constructed systems need these concepts as it is that we need constructed systems to help refine our concepts. We certainly could benefit from a less emotionally charged common ground for studying emotions than just verbal debate within our scientific communities and ourselves. Eventually, we want to understand what types of properties characterize different kinds of emotional capabilities. We want to understand the implications of these different capabilities and how to make trade-offs among these capabilities for different types of constructed organisms, tasks, and operational environments. Part of the problem in this area is that it crosses, by necessity, so many research disciplines. Virtual worlds also allow us an environment in which we can bring together and apply a diversity of tools, methodologies, and expertise. Last, a critical reason for understanding these issues is that this type of engineering approach will give us greater insight into what might be the reasons for our own sense of identity and self (and perhaps our many context-dependent selves)and our own social behavior and reasoning. This will not only build better systems to serve us, but also build systems that can more effectively and naturally include us as part of the systems. As noted elsewhere (Bellman 2000), this starts moving us toward the development of a more ‘‘livable technology.’’
References

Barber, K. S., and Kim, J. (1999): Constructing and Dynamically Maintaining Perspective-Based Agent Models in a Multi-Agent Environment. In O. Etzioni, J. Müller, and J. Bradshaw, eds., Proceedings of the Third Annual Conference on Autonomous Agents, ACM SIGART, 1–5 May, Seattle, 416–417. ACM Press, New York.

Bartle, R. (1990): Interactive Multi-User Computer Games. MUSE, Ltd. On-line. Available: <http://www.apocalypse.org/pub/u/lpb/muddex/bartle.txt>. (Availability last checked 5 Nov 2002)

Bellman, K. L. (1997a): Playing in the MUD: Turning Virtual Reality into Real Places. In R. J. Seidel and P. R. Chatelier, eds., Virtual Reality: Now or Tomorrow. Plenum Press, New York.

Bellman, K. L. (1997b): Sharing Work, Experience, Interpretation, and maybe even Meanings Between Natural and Artificial Agents. In Proc. SMC '97 5: 4127–4132. Orlando, Florida.

Bellman, K. L. (2000): Developing a Concept of Self for Constructed Autonomous Systems. In R. Trappl, ed., Cybernetics and Systems 2000, 693–698. Austrian Society for Cybernetic Studies, Vienna.

Bellman, K. L., and Goldberg, L. (1984): Common Origin of Linguistic and Movement Abilities. Am. J. Physiol. 246: R915–R921.

Bellman, K. L., and Landauer, C. (1997): A Note on Improving the Capabilities of Software Agents. In W. L. Johnson, ed., Proceedings of the First International Conference on Autonomous Agents, ACM SIGART, Marina Del Rey, Calif., 512–513. ACM Press, New York.

Bellman, K. L., and Landauer, C. (1998): Playing in the MUD: Virtual Worlds are Real Places. In Proceedings ECAI '98, Brighton, England, UK. Revised and extended in R. S. Aylett, ed. (2000): Special issue, J. Appl. Artif. Intell. 14 (1): 93–123.

Bellman, K. L., and Walter, D. O. (1984): Biological Processing. Am. J. Physiol. 246: R860–R867.

Braitenberg, V. (1984): Vehicles: Experiments in Synthetic Psychology. MIT Press, Cambridge.

Burka, L. P. (1999): The MUD Archive. On-line. Available: <http://www.apocalypse.org/pub/u/lpb/muddex>. (Availability last checked 5 Nov 2002)

Cañamero, L. D. (1998): Issues in the Design of Emotional Agents. In L. D. Cañamero, ed., Emotional and Intelligent: The Tangled Knot of Cognition. Papers from the 1998 AAAI Fall Symposium, 49–54. TR FS-98-03. AAAI Press, Menlo Park, Calif.

Capra, F. (1996): The Web of Life. Anchor Books, New York.

Churchland, P. (1984): Matter and Consciousness. MIT Press, Cambridge; Avon Books, New York.

Clodius, J. (1994): Concepts of Space and Place in a Virtual Community. On-line. Available: <http://www.dragonmud.org/people/jen/space.html>. (Availability last checked 26 July 2001)

Clodius, J. (1995): Computer-Mediated Interactions: Human Factors. MUDshop II, San Diego. On-line. Available: <http://www.dragonmud.org/people/jen/keynote.html>. (Availability last checked 5 Nov 2002)

Damasio, A. (1994): Descartes' Error: Emotion, Reason, and the Human Brain. Putnam, New York.

Dautenhahn, K. (1998a): The Art of Designing Socially Intelligent Agents: Science, Fiction, and the Human in the Loop. Appl. Artif. Intell. 12 (7): 573–618.

Dautenhahn, K. (1998b): Story-Telling in Virtual Environments. In R. Aylett, ed., Working Notes: Intelligent Virtual Environments, Workshop at the 13th Biennial European Conference on Artificial Intelligence (ECAI-98), Brighton, UK.

Foner, L. N. (1997): Entertaining Agents: A Sociological Case Study. In W. L. Johnson, ed., Proceedings of the First Int. Conf. on Autonomous Agents '97, Marina del Rey, Calif., 122–129. ACM Press, New York.

Gordon, A., and Hall, L. (1997): Collaboration with Agents in a Virtual World. TR NPCTRS-97-3, Department of Computing, University of Northumbria.
Granit, R. (1979): The Purposive Brain. MIT Press, Cambridge.

Griffin, D. R. (1984): Animal Thinking. Harvard University Press, Cambridge.

Heisenberg, W. (1971): Physics and Beyond. Harper and Row, New York.

James, W. [1890] (1950): The Principles of Psychology. Vol. 2. Reprint, Dover, New York.

Johnson, W., Rickel, L. J., Stiles, R., McCarthy, L., and Munro, A. (1999): Virtually Collaborating with Pedagogical Agents. In C. Landauer and K. Bellman, eds., Proceedings of VWsim '99, San Francisco, 171–176. The Society for Computer Simulation International, San Diego, Calif.

Landauer, C., and Bellman, K. L. (1996): Integration Systems and Interaction Spaces. In F. Baader and K. U. Schulz, eds., Proceedings of the First International Workshop on Frontiers of Combining Systems, Munich, 249–266. Kluwer Academic, Dordrecht, Netherlands.

Landauer, C., and Bellman, K. L. (1997): Model-Based Simulation Design with Wrappings. In Proceedings of OOS '97: Object Oriented Simulation Conference, Phoenix, 169–174. The Society for Computer Simulation International, San Diego, Calif.

Landauer, C., and Bellman, K. L. (1998a): MUDs, Integration Spaces, and Learning Environments. Thirty-first Hawaii Conference on System Sciences I: Collaboration Technologies, Kona, Hawaii. IEEE Computer Society Press, Los Alamitos, Calif.

Landauer, C., and Bellman, K. L. (1998b): Integration and Modeling in MUVEs. In C. Landauer and K. L. Bellman, eds., Proceedings of VWsim '98: Virtual Worlds and Simulation Conference, San Diego, 187–192. Society for Computer Simulation International, San Diego.

Landauer, C., and Bellman, K. L., eds. (1998c): Proceedings of VWsim '98: The 1998 Virtual Worlds and Simulation Conference, San Diego. Society for Computer Simulation International, San Diego.

Landauer, C., and Bellman, K. L. (1998d): Language Formation in Virtual Worlds. In F. DiCesare and M. Jafari, eds., Proceedings of SMC '98: The 1998 IEEE International Conference on Systems, Man, and Cybernetics, San Diego, 1365–1370. IEEE Computer Society Press, Los Alamitos, Calif.

Landauer, C., and Bellman, K. L. (1999a): Computational Embodiment: Constructing Autonomous Software Systems. Cybern. Syst. Int. J. 30 (2): 131–168.

Landauer, C., and Bellman, K. L. (1999b): Computational Embodiment: Agents as Constructed Complex Systems. Chapter 11 in K. Dautenhahn, ed., Human Cognition and Social Agent Technology. Benjamins, New York.

Landauer, C., and Bellman, K. L., eds. (1999c): Proceedings of VWsim '99: The 1999 Virtual Worlds and Simulation Conference, San Francisco. Society for Computer Simulation International, San Diego, Calif.

Landauer, C., and Bellman, K. L. (2000a): Reflective Infrastructure for Autonomous Systems. In R. Trappl, ed., Cybernetics and Systems 2000, 671–676. Austrian Society for Cybernetic Studies, Vienna.

Landauer, C., and Bellman, K. L. (2000b): Virtual Simulation Environments. In H. Sarjoughian, F. Cellier, and Ting-Sheng Fu, eds., Proceedings of the International Conference on AI, Simulation, and Planning, Tucson, Ariz. Society for Computer Simulation International, San Diego.

Landauer, C., and Polichar, V. (1998): More than Shared Artifacts: Collaboration via Shared Presence in MUDs. In Proceedings of WETICE '98: Workshop on Enabling Technologies for Interactive Collaborative Environments, Stanford University, 182–189. IEEE Computer Society Press, Los Alamitos, Calif.

Leong, L. (1999): The MUD Resources Collection. On-line. Available: <http://www.godlike.com/muds/>. (Availability last checked 5 Nov 2002)

Lindsay, P. H., and Norman, D. A. (1972): Human Information Processing: An Introduction to Psychology. Academic Press, New York.

Lombard, M., and Ditton, T. (1997): At the Heart of It All: The Concept of Presence. J. Comput. Mediated Commun. 3 (2). On-line. Available at <http://www.ascusc.org/jcmc>. (Availability last checked 5 Nov 2002)

Maes, P. (1987): Computational Reflection. TR 87-2. MIT AI Laboratory, Cambridge.
Kirstie L. Bellman
Mandler, G. (1984): Mind and Body: Psychology of Emotion and Stress. W. W. Norton, New York. Maturana, H., and Varela, F. (1980): Autopoiesis and Cognition. D. Reidel, Dordrecht, Holland. Maturana, H., and Varela, F. (1987): The Tree of Knowledge. Shambhala, Boston. McGinn, C. (1999): The Mysterious Flame: Conscious Minds in a Material World. Basic Books, New York. Miller, R. (1981): Meaning and Purpose in the Intact Brain. Oxford University Press, Oxford, London, New York. Minsky, M. (1985): The Society of Mind. Simon and Schuster, New York. Moffat, D. (1999): Personal communication. Norman, D. (1993): Things That Make Us Smart: Defending Human Attributes in the Age of the Machine. Addison Wesley, New York. O’Brien, M. (1992): Playing in the MUD. Ask Mr. Protocol Column. SUN Expert 3 (5): 19–20; 23: 25–27. Ogden, C. K., and Richards, I. A. (1923): The Meaning of Meaning. Harvest/HBJ Book, New York. Pert, C., Candace, B., Ruff, M. R., Weber, R. J., and Herkenham, M. (1985): Neuro-peptides and Their Receptors: A Psychosomatic Network. J. Immunol. 135 (2): 820–826. Picard, R. W. (1997): Affective Computing. MIT Press, Cambridge. Polichar, V. E. (1996): An Office MUD for Fun and Profit? Or Maybe Just Better Communication; login Magazine. Polichar, V. E. (1997): On the value of MUDs as instructional and research tools. Open letter provided to Northern Arizona University. Riner, R., and Clodius, J. (1995): Simulating Future Histories. Anthropol. Educ. Q. 26 (1): 95–104. On-line. Available: hhttp://www.dragonmud.org/people/jen/solsys.htmli. (Availability last checked 5 Nov 2002) Rolls, E. T. (1999): The Brain and Emotion. Oxford University Press, Oxford, London, New York. Sacks, O. (1995): An Anthropologist on Mars. Alfred A. Knopf, New York. Sagan, C. (1977): The Dragons of Eden: Speculations on the Evolution of Human Intelligence. Random House, New York. Schachter, S., and Singer, J. E. (1962): Cognitive, Social, and Physiological Determinants of Emotional State. Psychol. Rev. 69: 379–399. Schwartz, J. (1994): A Terminal Obsession. Washington Post, Style Section, 27 March. Searle, J. (1999): No Limits Hinder UC Thinker. Article by Tersy McDermott in Los Angeles Times, 28 December, p. A23. Sloman, A. (1997): Synthetic Minds. In W. Lewis Johnson ed., Proceedings of the First International Conference on Autonomous Agents, ACM SIGART, Marina Del Rey, Calif., 534–535. Association for Computing Machinery, New York. Smith, B. C. (1986): Varieties of Self-Reference. In J. Y. Halpern, ed., Theoretical Aspects of Reasoning about Knowledge, Proceedings of TARK 1986, 19–43. AAAI Publication, Morgan Kaufman Publishers, Los Altos, Calif. Smith, B. C. (1996): On the Origins of Objects. MIT Press, Cambridge. Thomas, L. (1974): The Lives of a Cell: Notes of a Biology Watcher. Bantam Books, New York. Turkle, S. (1995): Life on the Screen. Simon and Schuster, New York. Varela, F., Thompson, E., and Rosch, E. (1991): The Embodied Mind. MIT Press, Cambridge. von Uexku¨ll, J. [1934] (1957): A Stroll through the World of Animals and Men. In C. Schiller, trans., Instinctive Behavior: The Develoment of a Modern Concept. International Universities Press, New York. Zukav, G. (1999): The Seat of the Soul. Simon and Schuster, New York.
6 On Making Believable Emotional Agents Believable Andrew Ortony
Abstract
How do we make an emotional agent a believable emotional agent? Part of the answer is that we have to be able to design agents whose behaviors and motivational states have some consistency. This necessitates ensuring situationally and individually appropriate internal responses (emotions), ensuring situationally and individually appropriate external responses (behaviors and behavioral inclinations), and arranging for sensible coordination between internal and external responses. Situationally appropriate responses depend on implementing a robust model of emotion elicitation and emotion-to-response relations. Individual appropriateness requires a theory of personality viewed as a generative engine that provides coherence, consistency, and thus some measure of predictability.
6.1
Making Believable Emotional Agents Believable
What does it take to make an emotional agent a believable emotional agent? If we take a broad view of believability—one that takes us beyond trying to induce an illusion of life through what Stern (chapter 12 of this volume) refers to as the ‘‘Eliza effect,’’ to the idea of generating behavior that is genuinely plausible—then we have to do more than just arrange for the coordination of, for example, language and action. Rather, and certainly in the context of emotional agents, the behaviors to be generated—and the motivational states that subserve them—have to have some consistency, for consistency across similar situations is one of the most salient aspects of human behavior. If my mother responds with terror on seeing a mouse in her bedroom today, I generally expect her to respond with terror tomorrow. Unless there is some consistency in an agent’s emotional reactions and motivational states, as well as in the observable behaviors associated with such reactions and states, much of what the agent does will not make sense. To be sure, people do not always react in the same way in the same
kind of situation—there must be variability within consistency, but equally surely there is some consistency—enough in fact for it to be meaningful to speak of people behaving in character. An agent whose behaviors were so arbitrary that they made no sense would probably strike us as psychotic, and Parry (e.g., Colby 1981) notwithstanding, building psychotics is not generally what we have in mind when we think about building believable emotional agents or modeling human ones. But consistency is not sufficient for an agent to be believable. An agent’s behavior also has to be coherent. In other words, believability entails not only that emotions, motivations, and actions fit together in a meaningful and intelligible way at the local (moment-to-moment) level, but also that they cohere at a more global level—across different kinds of situations, and over quite long time periods. For example, I know that my daughter intensely dislikes meat—it disgusts her to even think about eating something that once had a face. Knowing this, I know that she would experience disgust if she were to suddenly learn that she was eating something that contained meat (e.g., beef bouillon, not vegetable bouillon), and I would expect her disgust to influence her behavior—she would grimace, and push the plate away, and make some hideous noise. In other words, I expect her emotion-related behaviors to be consonant with (i.e., appropriate for) her emotions. But I also expect coherence with other realms of her life. Accordingly, I would be amazed if she told me that just for the fun of it, she had taken a summer job in a butcher’s shop (unless perhaps I learned that she had taken the job with a view to desensitizing herself). Clearly, the issue of coherence is an important part of the solution to the problem of how to construct believable emotional agents.
6.2
Consistency and Variability in Emotions
It is an interesting fact about humans that they are often able to predict with reasonable accuracy how other individuals will respond to and behave in certain kinds of situations. These predictions are rarely perfect, partly because when we make them, we generally have imperfect information, and partly because the people whose behavior and responses we are predicting do not always respond in the same way in similar situations. Nevertheless, it is certainly possible to predict to some degree what other people
(especially those whom we know well) will do and how they will feel and respond (or be inclined to respond) under varying circumstances. We also know that certain kinds of people tend to respond in similar ways. In other words, to some extent, there is both within-individual consistency and cross-individual consistency. So what makes it possible to predict and understand with any accuracy at all other people’s feelings, inclinations, and behavior? At least part of the answer lies in the fact that their emotions and corresponding behavioral inclinations are not randomly related to the situations in which they find themselves, for if they were, we’d be unable to predict anything. But if the emotions, motivations, and behaviors of people are not randomly associated with the situations whence they arise, there must be some psychological constraints that limit the responses that are produced. And indeed, there are. Sometimes the constraints are very limiting (as with reflexes, such as the startle response) and sometimes they are less so—merely circumscribing a set of possibilities, with other factors, both personal and contextual, contributing to the response selection. But either way, there are constraints on the internal responses to situations—that is, on the internal affective states and conditions that arise in people—and on the external actions that are associated with those states and conditions.
Discussion
Sloman: You used the word ‘‘behavior’’ several times, and I suspect you are talking about intentions rather than behavior.
Ortony: Yes, that’s why I called it motivational-behavioral component.
Sloman: But it’s absolutely a crucial thing, for example, with regard to your daughter. She might well be going to work at a butcher’s for the same kind of reason as somebody who belongs to a police group might join a terrorists’ organization. It’s the intention that is important, and the behavior might just be an appropriate means.
Ortony: Right. And actually I meant to mention this in the imaginary context of my daughter going to work at the butcher’s, because one thing we would try to do to maintain our belief that people’s behavior is coherent is that we would come up with an explanation, such as: ‘‘she is trying to desensitize herself.’’ We
would not feel comfortable letting these two parts of behavior coexist—we would think that she was crazy or something.

There are two classes of theories in psychology that are relevant to these issues: theories of emotion and theories of personality. Consider first emotion theories—especially cognitive ones, which are often incorporated into affective artifacts. The principal agenda of cognitive theories of emotion is the characterization of the relation between people’s construals of the situations in which they find themselves and the kinds of emotions that result. The specification of such relationships is a specification of the constraints that construals of the world impose on emotional states. And these constraints are a major source of consistency, both within and across individuals. At the same time, they are only constraints—they do not come close to fully determining what a particular individual will feel or do on a particular occasion because they work in concert with several sources of variation. These are (1) individual differences in the mappings from world situations to construals (e.g., members of the winning and losing teams in a football game have different mappings from the same objective event), (2) individual differences in something that we might call emotionality (e.g., some of the team members might be more prone to respond emotionally to good or bad outcomes than others), and (3) the current state of the individual at the time (e.g., current concerns, goals, mood). Mappings from particular types of emotions to classes of behavioral inclinations and behaviors are similarly constrained, and thus constitute another source of consistency. This is an area that only a few psychologists (e.g., Averill 1982, on anger) have studied in any very deep way, except with respect to facial expressions (e.g., Ekman 1982), although it was of considerable interest to Darwin, who first wrote about it at length in his 1872 (first edition) book, The Expression of Emotions in Man and Animals. However, probably because the linkage between emotions and behaviors is often very flexible, there has been little effort to develop systematic accounts of it. But again, we know that the relation cannot be random, and this means that it ought to be possible to identify some principles governing constraints on the relation between what we feel and what we do, or are inclined to do. And again, whereas there are some constraining principles governing the emotion-behavior connection—principles that are the source of some
consistency—there are also various factors (e.g., emotionality, again) that give rise to variation. People only get into emotional states when they care about something (Ortony, Clore, and Foss 1987)—when they view something as somehow good or bad. If there’s no caring, there’s no emoting. This suggests that the way to characterize emotions is in terms of the different ways there might be for feeling good or bad about things. Furthermore, many traits can be regarded as chronic propensities to get into corresponding emotional states. For example, an anxious person is one who experiences fear emotions more easily (and therefore more frequently) than most people, and an affectionate person is one who is likely to experience (and demonstrate) affection more readily than less affectionate people. This means that if we have a way of representing and creating internal states that correspond to emotions, we can capture many traits too. This is important because, at the level of individuals—and this is one of my main points—traits are a major source of emotional and behavioral consistency. Many psychologists (e.g., Ortony, Clore, and Collins 1988; Roseman, Antoniou, and Jose 1996; Scherer 1997) have proposed schemes for representing the conditions under which emotions are elicited. In our own work (which in affective computing circles is often referred to as the OCC model), we proposed a scheme that we thought accommodated a wide range of emotions within the framework of twenty-two distinct emotion types. Over the years, Gerald Clore and I, together with some of our students, collected considerable empirical support for many of the basic ideas. However, for the purposes of building believable artifacts, I think we might want to consolidate some of our categories of emotions. So, instead of the rather cumbersome (and to some degree arbitrary) analysis we proposed in 1988, I think it is worth considering collapsing some of the original categories down to five distinct positive and five negative specializations of two basic types of affective reactions—positive and negative ones—as shown in table 6.1. I think that these categories have enough generative capacity to endow any affective agent with the potential for a rich and varied emotional life. As the information processing capabilities of the agent become richer, more elaborate ways of characterizing the good and the bad become possible, so that one can imagine a system starting with only the competence to differentiate positive from negative and then developing progressively more elaborate categories.
Table 6.1 Five specializations of generalized good and bad feelings (collapsed from Ortony, Clore, and Collins 1988)
Positive reactions
  because something good happened (joy, happiness, etc.)
  about the possibility of something good happening (hope)
  because a feared bad thing didn’t happen (relief)
  about a self-initiated praiseworthy act (pride, gratification)
  about an other-initiated praiseworthy act (gratitude, admiration)
  because one finds someone/thing appealing or attractive (love, like, etc.)
Negative reactions
  because something bad happened (distress, sadness, etc.)
  about the possibility of something bad happening (fear, etc.)
  because a hoped-for good thing didn’t happen (disappointment)
  about a self-initiated blameworthy act (remorse, self-anger, shame, etc.)
  about an other-initiated blameworthy act (anger, reproach, etc.)
  because one finds someone/thing unappealing or unattractive (hate, dislike, etc.)
The first entry in each group of six is the undifferentiated (positive or negative) reaction. The remaining five entries are specializations (the first pair goal-based, the second standards-based, and the last taste-based).
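To make the collapsed scheme concrete, the following sketch (not part of the original chapter) encodes the twelve categories of table 6.1 as a small Python structure, together with a toy mapping from a coarse appraisal to a category. The Appraisal fields, the decision order, and all names are illustrative assumptions rather than a specification of the OCC model.

from dataclasses import dataclass
from enum import Enum, auto
from typing import Optional

class Reaction(Enum):
    # Positive reactions (table 6.1, upper group)
    JOY = auto()            # because something good happened
    HOPE = auto()           # about the possibility of something good happening
    RELIEF = auto()         # because a feared bad thing didn't happen
    PRIDE = auto()          # about a self-initiated praiseworthy act
    GRATITUDE = auto()      # about an other-initiated praiseworthy act
    LIKING = auto()         # because someone/thing is found appealing
    # Negative reactions (table 6.1, lower group)
    DISTRESS = auto()       # because something bad happened
    FEAR = auto()           # about the possibility of something bad happening
    DISAPPOINTMENT = auto() # because a hoped-for good thing didn't happen
    REMORSE = auto()        # about a self-initiated blameworthy act
    ANGER = auto()          # about an other-initiated blameworthy act
    DISLIKING = auto()      # because someone/thing is found unappealing

@dataclass
class Appraisal:
    good: bool                   # the basic registration: good or bad for the agent
    prospective: bool = False    # concerns a prospect rather than an actual outcome
    disconfirmed: bool = False   # a feared or hoped-for prospect that did not materialize
    agent: Optional[str] = None  # "self" or "other" if an act is being judged
    taste_based: bool = False    # a reaction to sheer (un)appealingness

def reaction_type(a: Appraisal) -> Reaction:
    """Map a coarse appraisal onto one of the twelve collapsed categories."""
    if a.taste_based:
        return Reaction.LIKING if a.good else Reaction.DISLIKING
    if a.agent == "self":
        return Reaction.PRIDE if a.good else Reaction.REMORSE
    if a.agent == "other":
        return Reaction.GRATITUDE if a.good else Reaction.ANGER
    if a.disconfirmed:
        return Reaction.RELIEF if a.good else Reaction.DISAPPOINTMENT
    if a.prospective:
        return Reaction.HOPE if a.good else Reaction.FEAR
    return Reaction.JOY if a.good else Reaction.DISTRESS

# For example, a bad prospect yields fear; a feared prospect later disconfirmed yields relief:
assert reaction_type(Appraisal(good=False, prospective=True)) is Reaction.FEAR
assert reaction_type(Appraisal(good=True, disconfirmed=True)) is Reaction.RELIEF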
A simple example of this idea is that fear can be viewed as a special case of a negative feeling about something bad happening—with the bad thing being the prospect of something bad happening. If one adopts this position, then one is left with the idea that the main driving force underlying all emotions is the registration of good and bad and that discrete emotions can arise to the extent that the nature of what is good and bad for the agent can be and is elaborated. Indeed, this may well be how humans develop increasingly sophisticated emotion systems as they move from infancy through childhood to adulthood. So, specifying a mechanism that generates distinct emotions and other affective conditions seems not so hard—what is hard is to make it all believable. As I just indicated, a key issue is the need for affective artifacts to be able to parse the environment so as to understand its beneficial and harmful affordances—a crucial requirement for consistency, and thus also for believability. And a prerequisite for doing this is a coherent and relatively stable value system in terms of which the environment is appraised. As we indicated in OCC (and as illustrated in figure 6.1), such a system, at least in humans, is an amalgam of a goal hierarchy in which at least some of the higher-level goals are sufficiently enduring that
[Figure 6.1: an Event, Agent, or Object is appraised in terms of goals (events), yielding goal-based emotions such as joy, distress, hope, fear, relief, and disappointment; in terms of norms/standards (agents’ actions), yielding standards-based emotions such as pride, shame, gratification, remorse, admiration, and reproach, and (jointly with goals) compound emotions such as anger and gratitude; and in terms of tastes/attitudes (objects), yielding attitude-based emotions such as love and hate.]
Figure 6.1 The relation between different things being appraised, the representations in terms of which they are appraised, and the different classes of resulting emotions.
they influence behavior and emotions over an extended period (rather than transiently), a set of norms, standards, and values that underlie judgments of appropriateness, fairness, morality, and so on, and tastes and preferences whence especially value-laden sensory stimuli acquire their value. Another respect in which emotional reactions and their concomitant behaviors need some degree of consistency has to do with emotion intensity. It is not sufficient that similar situations tend to elicit similar emotions within an individual. Similar situations also elicit emotions of comparable intensity. In general, other things (external circumstances, and internal conditions such as moods, current concerns, etc.) being equal, the emotions that individuals experience in response to similar situations, and the intensity with
which they experience them, are reasonably consistent. Emotionally volatile people explode with the slightest provocation while their placid counterparts remain unmoved. In this connection, I’m reminded of a colleague (call him G) whom my (other) colleagues and I know to be unusually ‘‘laid back’’ and unemotional. One day several of us were having lunch together in an Italian restaurant when G managed to splash a large amount of tomato sauce all over his brilliant white, freshly laundered shirt. Many people would have become very angry at such an incident—I, for example, would no doubt have sworn profusely, and for a long time! G, on the other hand, said nothing; he revealed no emotion at all—not even as much as a mild kind of ‘‘oh dear, what a bother’’ reaction; he just quietly dipped his napkin into his water and started trying to wipe the brilliant red mess off his shirt (in fact making it worse with every wipe), while carrying on the conversation as though nothing had happened. Yet, unusual as his nonreaction might have been for people in general, those of us who witnessed this were not at all surprised by G’s reaction (although we were thoroughly amused) because we all know G to be a person who, when he emotes at all, consistently does so with very low intensity—that’s just the kind of person he is, that’s his personality.
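One very simple way to capture this kind of within-individual consistency of intensity is to treat emotionality as a stable per-individual gain on the appraised significance of a situation, together with an expression threshold below which nothing shows. The sketch below is purely illustrative; the linear form, the parameter names, and the numbers are assumptions, not anything proposed in the chapter.

from dataclasses import dataclass

@dataclass
class Temperament:
    emotionality: float          # chronic gain on appraised significance (low for G)
    expression_threshold: float  # felt intensity below which no reaction is visible

def felt_intensity(appraised_significance: float, mood: float, t: Temperament) -> float:
    """Situational significance scaled by the trait gain and nudged by current mood."""
    return max(0.0, t.emotionality * appraised_significance + mood)

def visible_reaction(intensity: float, t: Temperament) -> bool:
    """Only intensities above the individual's threshold produce observable behavior."""
    return intensity > t.expression_threshold

# The same tomato-sauce incident, appraised identically by two diners:
volatile = Temperament(emotionality=1.5, expression_threshold=0.2)
placid_g = Temperament(emotionality=0.2, expression_threshold=0.8)
for name, t in [("volatile colleague", volatile), ("G", placid_g)]:
    i = felt_intensity(appraised_significance=0.6, mood=0.0, t=t)
    print(name, round(i, 2), visible_reaction(i, t))  # visible for the volatile diner only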
6.3
Consistency and Variability in Emotion-Related Response Tendencies
The tomato sauce episode not only highlights questions about emotion intensity, it also, for the same reason, brings to the fore the question of the relation between (internal) emotional states and their related behaviors. To design a computational artifact that exhibits a broad range of believable emotional behavior, we have to be able to identify the general principles governing the relation between emotions and behavior, or, more accurately, behavioral inclinations, because, as Ekman (e.g., 1982) has argued so persuasively, at least in humans, social and cultural norms (display rules) often interfere with the ‘‘natural’’ expression (both in the face, and in behavior) of emotions. Associated with each emotion type is a wide variety of reactions, behaviors, and behavioral inclinations, which, for simplicity of exposition, I shall refer to collectively as ‘‘response tendencies’’ (as distinct from responses). Response tendencies range from involuntary expressive manifestations, many (e.g., flushing) having immediate physiological causes, through changes in the way in which
information is attended to and processed, to coping responses such as goal-oriented, planned actions (e.g., taking revenge). From this characterization alone, it is evident that one of the most salient aspects of emotional behavior is that some of it sometimes is voluntary and purposeful (goal-oriented, planned, and intentional) and some of it is sometimes involuntary and spontaneous—as when a person flies into an uncontrollable rage, trembles with fear, blushes with embarrassment, or cries with joy. Figure 6.2 sketches a general way of thinking about the constraints on the response tendencies for emotions. It shows three major types of emotion response tendencies (labeled ‘‘expressive,’’ ‘‘information-processing,’’ and ‘‘coping’’), each of which is elaborated below its corresponding box. The claim is that all emotion responses have these three kinds of tendencies associated with them. Note, however, that this is not the same as saying that in every case of every emotion, these tendencies have observable concomitants—they are tendencies to behave in certain ways, not actual behaviors. The first group—the expressive tendencies—are the usually spontaneous, involuntary manifestations of emotions that are often referred to by emotion theorists (following Darwin) as emotional expressions. These expressive tendencies are of three kinds: somatic (i.e., bodily), behavioral, and communicative (both verbal and nonverbal). Consider first the somatic tendencies. These are almost completely beyond the control of the person experiencing the emotion. For instance, the box marked ‘‘somatic’’ in figure 6.2 has a parenthetical ‘‘flushing’’ in it. This (and the other parenthetical entries) is presented (only) as an example of the kind of response tendencies that one might expect to find in the case of anger; it should be interpreted as indicating that when someone is angry, one possible somatic manifestation is that the person grows red in the face. Notice that this is not something that he or she chooses to do. We do not choose to flush—our physiology does it for us, without us asking. The next class of expressive tendencies are the behavioral ones. Again, these tendencies are fairly automatic, often hardwired, and relatively difficult (although not always impossible) to control; they are spontaneous actions that are rarely truly instrumental (although they might have vestigial instrumentality), such as kicking something in anger. So, to continue with the example of anger, I have in mind not the reasoned planful behaviors that might be entertained as part of a revenge strategy (they belong to the
‘‘coping’’ category), but the more spontaneous tendencies to exaggerate actions (as when one slams a door that one might have otherwise closed quietly), or the tendency to perform almost symbolic gestural actions (albeit, often culturally learned ones) such as clenching one’s fist. Finally, I have separated out communicative tendencies (while realizing that symbolic acts such as fist clenching also have communicative value) as a third kind of expressive response tendency. Still, I wish here to focus more on communication through the face, because historically this has been so central to emotion research. Communicative response tendencies are those that have the capacity to communicate information to others, even though they are often not intended to do so. They have communicative value because they are (sometimes pan-culturally) recognized as symptoms of emotions. They include nonverbal manifestations in the face, including those usually referred to by emotion theorists as ‘‘facial expressions’’ (e.g., scowling, furrowing of the brow), as well as verbal manifestations (e.g., swearing, unleashing torrents of invectives), and other kinds of oral (but nonverbal) responses such as growling, screaming, and laughing. The second, information processing, component has to do with changes in the way in which information is processed. A major aspect of this is the diversion of attention (again often quite involuntary) from those tasks that were commanding resources prior to the emotion-inducing event to issues related to the emotion-inducing event. One of the most striking cases of the diversion of attentional resources is the all-consuming obsessive focus that people often devote to situations that are powerfully emotional. In humans, this obsessive rumination can be truly extraordinary and often quite debilitating, as so convincingly depicted in much of the world’s great literature—consider, for example, Shakespeare’s Othello. The second part of the information processing response has to do with updating beliefs, attitudes, and more generally evaluations about other agents and objects pertinent to the emotion-inducing event—you increasingly dislike your car when it repeatedly infuriates you by breaking down on the highway, whereas your liking for an individual increases as he or she repeatedly generates positive affect in you (Ortony 1991). Finally, there are coping strategies, of which I have identified two kinds. One of these, problem-oriented coping, is what emotion theorists usually have in mind when they talk about coping;
namely, efforts to bring the situation under control—to change or perpetuate it—with the goal of improving a bad situation, or prolonging or taking advantage of a good one. In the case of anger, people often seek to do something that they think might prevent a recurrence of the problem, or that might somehow fix the problem. The more interesting kind of coping is emotion-oriented coping. This kind of coping has to do with managing emotions themselves—either one’s own, or those of some other agent or agents involved in the emotion-inducing situation. Self-regulating emotion-oriented coping responses focus on one’s own emotions. For example, if I am angry I might try to calm down, perhaps as a precursor to developing a sensible plan to solve the problem, or perhaps simply because I don’t like the feeling of being out of control. The other-modulating emotion management strategies can serve various purposes. For instance, if I induce distress in you because of what you did to me, not only might it make me feel better (i.e., it might help me to manage my own emotion of anger, hence the association in the figure between self-regulating and other-modulating responses), but it might also make you more likely to fix the problem you caused me (hence the link between emotion-oriented and problem-oriented responses). So, for example, suppose you are angry at somebody for smashing into your car. Developing or executing a plan to have the car fixed is a problem-oriented response, as would be a desire to prevent, block, or otherwise interfere with the antagonist’s prospects for doing the same kind of thing again. But one might also try to modulate the antagonist’s emotions by retaliating and getting one’s own back so as to ‘‘make him pay for what he did to me,’’ or one might try to induce fear or shame in him to make him feel bad, all with a view to making one’s self feel better. There is no requirement that any of these responses be ‘‘rational.’’ Indeed, if we designed only rational emotion response tendencies into our emotional agents, we would almost certainly fail to make our emotional agents believable. So the general claim is that a major source of consistency derives from the fact that all emotions constrain their associated response tendencies and all emotions have all or most of these tendencies. It should be clear from this discussion, and from table 6.2 (which indicates how the various constraints might be manifested in the emotions of anger and fear) that there is plenty of room for individual variation. Just as in the case of the emotions themselves, much of this variation is captured by traits—so many, although
Table 6.2 Sample manifestation of the different components for fear emotions (upper panel) and for anger emotions (lower panel)
Fear emotions (upper panel)
  Expressive
    Somatic: Trembling, shivering, turning pale, piloerection
    Behavioral: Freezing, cowering
    Communicative nonverbal: Screaming
  Information Processing
    Attentional: Obsessing about event, etc.
    Evaluative: Disliking source, viewing self as powerless/victim
  Coping
    Emotion self-regulating: Calming down, getting a grip
    Emotion other-modulating: Scaring away
    Problem-oriented coping: Getting help/protection, escaping, eliminating threat
Anger emotions (lower panel)
  Expressive
    Somatic: Shaking, flushing
    Behavioral: Fist-clenching
    Communicative verbal: Swearing
    Communicative nonverbal: Scowling, frowning, stomping, fist-pounding, etc.
  Information Processing
    Attentional: Obsessing about event, etc.
    Evaluative: Disliking and despising source
  Coping
    Emotion self-regulating: Calming down, getting a grip
    Emotion other-modulating: Causing distress to antagonist
    Problem-oriented coping: Preventing continuation or recurrence of problem
not all, of the ways in which a timid person responds to anger-inducing situations are predictably different from the ways in which an aggressive person responds.
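As a rough illustration of how such constraints might be represented, the contents of table 6.2 can be held in a nested structure that records, for each emotion type, its expressive, information-processing, and coping tendencies; a single trait parameter can then filter which options remain plausible for a given individual. The layout and the hypothetical timidity filter below are an illustrative sketch, not a specification taken from the chapter.

RESPONSE_TENDENCIES = {
    "fear": {
        "expressive": {
            "somatic": ["trembling", "shivering", "turning pale", "piloerection"],
            "behavioral": ["freezing", "cowering"],
            "communicative_nonverbal": ["screaming"],
        },
        "information_processing": {
            "attentional": ["obsessing about event"],
            "evaluative": ["disliking source", "viewing self as powerless/victim"],
        },
        "coping": {
            "emotion_self_regulating": ["calming down", "getting a grip"],
            "emotion_other_modulating": ["scaring away"],
            "problem_oriented": ["getting help/protection", "escaping", "eliminating threat"],
        },
    },
    "anger": {
        "expressive": {
            "somatic": ["shaking", "flushing"],
            "behavioral": ["fist-clenching"],
            "communicative_verbal": ["swearing"],
            "communicative_nonverbal": ["scowling", "frowning", "stomping", "fist-pounding"],
        },
        "information_processing": {
            "attentional": ["obsessing about event"],
            "evaluative": ["disliking and despising source"],
        },
        "coping": {
            "emotion_self_regulating": ["calming down", "getting a grip"],
            "emotion_other_modulating": ["causing distress to antagonist"],
            "problem_oriented": ["preventing continuation or recurrence of problem"],
        },
    },
}

def candidate_responses(emotion: str, timidity: float) -> list:
    """Tendencies constrain but do not determine behavior: a (hypothetical) timidity
    trait filters out confrontational options before any response is selected."""
    confrontational = {"causing distress to antagonist", "eliminating threat", "scaring away"}
    options = []
    for component in RESPONSE_TENDENCIES[emotion].values():
        for tendencies in component.values():
            for t in tendencies:
                if timidity > 0.7 and t in confrontational:
                    continue
                options.append(t)
    return options

# A timid agent's anger omits retaliation; an aggressive agent's anger retains it.
print("causing distress to antagonist" in candidate_responses("anger", timidity=0.9))  # False
print("causing distress to antagonist" in candidate_responses("anger", timidity=0.1))  # True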
6.4
Why Personality?
Traits are the stuff of personality theory. Personality psychologists disagree as to whether personality should be viewed merely as an empirical description of observed regularities, or whether it should be viewed as a driver of behavior. But for people interested in building affective artifacts, personality can only be interesting and relevant if one adopts the second position. If one really wants to build believable emotional agents, one is going to need to ensure situationally and individually appropriate internal responses (emotions), ensure situationally and individually appropriate external responses (behaviors and behavioral inclinations), and arrange for sensible coordination between internal and external responses. Situationally appropriate responses are controlled by incorporating models of emotion elicitation and of emotion-to-response relations of the kind I have just outlined. But to arrange
for individual appropriateness, we will have to incorporate personality, not to be cute, but as a generative engine that contributes to coherence, consistency, and predictability in emotional reactions and responses. The question is, how can we incorporate personality into an artifact without doing it trait by trait for the thousands of traits that make up a personality? In their famous 1938 monograph, Trait Names: A Psycho-lexical Study, Allport and Odbert identified some 18,000 English words as trait descriptors, and even though many of the terms they identified do not in fact refer to traits, the number still remains very large. Trying to construct personalities in a more or less piecemeal fashion, trait by trait, is probably quite effective if the number of traits implemented is relatively small and if the system complexity is relatively limited. To some extent, this appears to be the way in which emotional behaviors and expressions are constrained in Cybercafé—part of Hayes-Roth’s Virtual Theater Project at Stanford (e.g., Rousseau 1996), and to an even greater extent, in Virtual Petz and Babyz (see Stern, chapter 12 of this volume), and anyone who has interacted with these characters knows how compelling they are. However, if one has more stringent criteria for believability—as one might have, for example, in a soft-skills business training simulation, where the diversity and complexity of trait and trait constellations might have to be much greater—I suspect that a more principled mechanism is going to be necessary to produce consistent and coherent (i.e., believable) characters. Note, incidentally, that this implies that ‘‘believability’’ is a context-, or rather application-dependent notion. A character that is believable in an entertainment application might not be believable in an education or training application. One solution to the problem of how to achieve this higher level of believability is to exploit the fact that traits don’t live in isolation. If we know that someone is friendly we know that he has a general tendency or disposition to be friendly relative to people in general; we know that in a situation that might lead him to be somewhere on the unfriendly-friendly continuum, he is more likely to be toward the friendly end. However, we also know some other very important things—specifically, we know that he is likely to be kind, generous, outgoing, warm, sincere, helpful, and so on. In other words, we expect such a person to exhibit a number of correlated traits. This brings us back to the question of behavioral coherence. There is much empirical evidence that traits
cluster together and that trait space can be characterized in terms of a small number of factors—varying in number from two to five, depending on how one decides to group them. For our purposes here, the question of which version of the factor structure of personality one adopts may not be crucial (although I do have a personal preference). What matters is that the factor structure of trait space provides a meaningful way to organize traits. It provides a meaningful and powerful reduction of data to note that people whom we would normally describe as being outgoing or extroverted (as opposed to introverted) tend to be sociable, warm, and talkative, and that people who are forgiving, good-natured, and softhearted we generally think of as agreeable or likeable (as opposed to antagonistic). Similarly, people who are careful, well organized, hard working, and reliable we tend to think of as being conscientious (as opposed to negligent). These (extroversion, agreeableness, and conscientiousness) are three of the ‘‘big five’’ (e.g., McCrae and Costa 1987) dimensions of personality—the other two being openness (as opposed to closed to new experiences), and neuroticism (as opposed to emotional stability). The key point here is that such clusters, such groups of tendencies to behave and feel in certain kinds of ways, constitute one source of behavioral and emotional consistency and hence predictability of individuals. Viewed in this way, personality is the engine of behavior. You tend to react this way in situations of this kind because you are that kind of person. Personality is a (partial) determiner of, not merely a summary statement of, behavior. Consistent with this view (which is certainly not shared by all personality theorists) is the fact that some components of personality appear to be genetically based. All this suggests that to build truly believable emotional agents, we need to endow them with personalities that serve as engines of consistency and coherence rather than simply pulling small groups of traits out of the thin air of intuition. A general approach to doing this would be to identify generative mechanisms that might have the power to spawn a variety of particular states and behaviors simply by varying a few parameters. Many of the proposals in the personality literature provide a basis for this kind of approach. For example, one might start with the distinction between two kinds of regulatory focus (e.g., Higgins 1998), namely, promotion focus in which agents are more
concerned with attempting to achieve things that they need and want (e.g., they strive for nurturance, or the maintenance of ideals). Promotion focus is characterized as a preference for gain-nongain situations. In contrast, with prevention focus, agents seek to guard against harm (e.g., they strive for security) and exhibit a preference for nonloss-loss situations. Thus regulatory focus is a fundamental variable that characterizes preferred styles of interacting with the world. Different people at different times prefer to focus on the achievement of pleasurable outcomes (promotion focus), or on the avoidance of painful outcomes (prevention focus). These are essentially the same constructs as approach motivation and avoidance motivation (e.g., Revelle, Anderson, and Humphreys 1987), and are closely related to the idea that individuals differ in their sensitivity to cues for reward and punishment (Gray 1994). This can be clearly seen when we consider people’s gambling or sexual behavior (sometimes there’s not much difference): Those who are predominantly promotion focused (sensitive to cues for reward) focus on the possible gains rather than the possible losses—they tend to be high (as opposed to low) on the personality dimension of impulsivity; those with a prevention focus (sensitive to cues for punishment) prefer not to gamble so as to avoid the possible losses—these people tend to be high as opposed to low on the anxiety dimension. If an individual prefers one regulatory strategy over another, this will be evident in his behaviors, in his styles of interaction with the world, and with other agents in the world, and as such, it constitutes one aspect of personality. Probably the most productive way to think about regulatory focus is that in many of our encounters with the world, a little of each is present—the question then becomes, which one dominates, and to what degree. Different people will have different degrees of each, leading to different styles of interacting with the world. Still, some of each is what we would ordinarily strive for in designing an affective artifact. Without some counterbalancing force, each is dysfunctional. For example, unbridled promotion focus is associated with a high tolerance for negativity (including a high threshold for fear, pain, and the like), and that comes pretty close to being pathologically reckless behavior. I think it is possible to exploit these kinds of ideas in a principled way in designing our artifacts. We might start with the ideas of psychologists such as Eysenck, Gray, Revelle, and others (e.g., Rolls 1999; chapter 2 of this volume) who take the position that
there are biological substrates of personality (such as cue sensitivity). The virtue of this kind of approach is that it provides a biological basis for patterns of behaviors and, correspondingly, emotions, which can serve as the basis for generating some sort of systematicity and hence plausibility or believability of an artificial agent. Which particular activities a human agent actually pursues in the real world is of course also dependent on the particular situation and local concerns of that agent, as well, no doubt, as on other biologically based determinants of other components of personality. But at least we have a scientifically plausible and computationally tractable place to start, even though the specification of exactly how this can be done remains a topic for future research.
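To suggest what such a generative mechanism could look like computationally, the sketch below uses just two biologically inspired parameters, sensitivity to cues for reward and sensitivity to cues for punishment, and lets trait-like labels and gambling inclinations fall out of them. The thresholds, the linear valuation, and the names are illustrative assumptions, not claims about any existing model.

from dataclasses import dataclass

@dataclass
class Personality:
    reward_sensitivity: float      # approach / promotion-focus side (Gray, Higgins, Revelle)
    punishment_sensitivity: float  # avoidance / prevention-focus side

    @property
    def impulsive(self) -> bool:
        # strong reward sensitivity with a weak counterbalance reads as impulsivity
        return self.reward_sensitivity - self.punishment_sensitivity > 0.4

    @property
    def anxious(self) -> bool:
        return self.punishment_sensitivity - self.reward_sensitivity > 0.4

def prospect_value(expected_gain: float, expected_loss: float, p: Personality) -> float:
    """Appraise a prospect (e.g., a gamble) through the agent's regulatory weights."""
    return p.reward_sensitivity * expected_gain - p.punishment_sensitivity * expected_loss

# The same even-odds gamble, evaluated by a promotion-focused and a prevention-focused agent:
promotion_focused = Personality(reward_sensitivity=0.9, punishment_sensitivity=0.3)
prevention_focused = Personality(reward_sensitivity=0.3, punishment_sensitivity=0.9)
print(prospect_value(1.0, 1.0, promotion_focused) > 0, promotion_focused.impulsive)   # True True
print(prospect_value(1.0, 1.0, prevention_focused) > 0, prevention_focused.anxious)   # False True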
Discussion: On Modeling Emotion and Personality
Bellman: So you are telling me that personality would be this core biological basis that somehow constrains behavioral inclinations. I have been bothered for a long time about a lot of research in emotions, because I am confused by the tendency to want to have oversimplified models—why people keep trying to reduce the space to two or three bases. There are many disciplines that I have been in which try to take complex multidimensional problems and reduce them to two or three bases. And, yes, you can do that at some level, but you usually lose most of the interesting stuff when you do that.
Sloman: I have a worse problem: Why try to find any number of dimensions as opposed to finding what the underlying architecture is and generating these things?
Bellman: But the underlying architecture doesn’t have to be something with only two or three reinforcement/nonreinforcement bases. Why should that be the underlying architecture?
Rolls: The only theory is that one tries to get a principled approach here instead of doing something like factor analysis. The idea is to say something like this: So, what is it that causes emotion? If one would recreate and operationalize things where reinforcers are involved—most people find it difficult to think of exceptions to that—then one ought to pursue that idea and ask: What sorts of reinforcers are there? You know, you can give positive and negative reinforcement, you can withdraw positive and negative reinforcement. The second question is: What comes out of that? The
nice thing that Andrew is pointing us towards here is that personality might drop out of that research. If some individuals, by their genes, were a bit less sensitive to nonreward, or a bit less sensitive to punishment, it turns out that you would categorize them as extraverts. And so, one gets the dimension of personality without having to buy into any sort of special engineering specifications.
Ortony: But personality has consequences, because it constrains behaviors down the road, individual behaviors and motivations, and it makes one more likely to gamble in casinos and more likely to engage in unsafe sex and more likely to do a whole host of things which actually make people look as though they are individuals with some sort of stable underlying behavioral patterns.
Rolls: The idea is that here, then, is a sort of scientifically principled way to get personality. And I agree with your notion about consistency, but it is one of the quite nice things that come with personality. Notice that consistency is slightly different to persistence of emotions: If you have a nonreward, it’s helpful on the short-time scale to keep your behavior directed towards a reward. So, the emotional state ought to be continuing slightly. Consistency, on the other hand, is more of a long-term requirement for believable agents: Next time you come round to that similar situation for that individual, it makes sense if they behave in a somewhat similar way. At least we as biological organisms do, perhaps for the reasons that we have discussed.
Bellman: The comment was not that there aren’t some important principles. It is exactly how we model those important principles. I will give you a simple example: If we take language generation, we can model it, as we have learned to do over time. It has been very difficult, with all sorts and kinds of generative grammars. That’s a very different kind of modeling than from a simple combinatorics. I don’t know any cases, but one could imagine language as having been modeled as if it were a simple combinatoric problem. And eventually, people shifted to more interesting underlying modeling with grammars. That’s part of my point: I don’t see any reason why we should suppose, just because of a positive modeling, a simple combinatoric space. That’s what my comment was directed towards, not the lack of principles.
Sloman: But are you talking about building synthetic artifacts for some purpose, which may be useful, or are you talking about how human beings work? Because, if it’s the latter, there are going to be
constraints, and you can go and find out the biological bases, you can go and find out how the brain is involved.
Bellman: But there are lots of constraints that we know about in human language generation.
Sloman: So, the answer to your question is: Technically, you can take as many constraints as you get a handle on. And you do the best you can. And then someone else may come and have a better solution.
Bellman: Ok. I am suggesting a different style of modeling for it. I think that there is a long history of emotional modeling.
Petta: If you talk about human beings, the society and the environment as such are so complex that perhaps you have to leave out that part and just concentrate on the individual. You make an analysis of how the individual describes itself and end up with this collection of traits, which always just refers to the first person point of view. What I think is important however, especially when you are talking about artifacts, is that perhaps way more efforts should be put into performing a thorough lifeworld analysis—to use the terminology of Agre.¹ You would have to look at the whole system, what the environment provides and what kinds of couplings there are between a single architecture and between different instances of the architecture and/or what could happen in the environment, what kinds of dimensions, effects, and kinds of dynamics are relevant for your artifact. Personality, after all, is also something that is perceived about the other.
Sloman: Could you give an example of the kind of coupling you talked about?
Petta: Especially when you design an artifact, you want it to behave within given constraints. These constraints typically are characterized by, for instance, what kinds of interactions you want to occur, what kind of dynamics, how you want to stabilize its behavior. And once you know that, you can take a look at what happens in that environment, what can take place over a certain amount of time and how that relates to the possibilities of the agent, what its perceptional capabilities are, what its choice of actions is. Then you can start to consider how to constrain or direct those elements. Where do you put the incentives, where do you rather expand, where do you rather suppress? And these, in sum, turn out to be biases, constraints, which again can have their own dynamics coupled to the environment, and which, I would
assume, should lead to something recognizable as a certain personality.
Ortony: Part of this, I think, has to do with a fact that I did not talk about, the appraisal side. Also, you want a level of description, if you are thinking about this in general, that goes beyond any particular design intention we may have. So, in thinking in general about what you want to do when you construct believable agents, you are not going to have one set of criteria for a person or a system, and a different set of criteria for what makes it believable. For a pet robot, for example, you have to have some level of description that characterizes the interaction which organisms are likely to have in a physical world. So, this gets you to things like this, but stuff happens that gets appraised. Again, there are individual differences, but there are also similarities with respect to our goals. The norms or standards that we use to make judgments of the kind that lead us to approve or disapprove of things, which is different from goals, although they could in fact be collapsed if you said that you had a goal to maintain order or something. But the point is, it does not matter what the goals of the organism are. It only matters that it indeed has some goals. It is very difficult to imagine building an emotionally believable artifact which didn’t have goals. So, once you admit that it has got goals, the architecture should be such that it’s impervious to the particular kind of goals. It only cares what happens when goals are satisfied or blocked or failed. This really comes down to how you characterize the underlying value system in terms of which appraisals of the environment you are constructing. Is that an answer to your question or not?
Petta: Partly, because, actually, I refer to the box with norms and standards (cf. figure 6.1), because this is where you introduce aspects of the society which are beyond the individual. So, this is just one very evident spot where you gather stuff that is external to the single individual and put it into your picture.
Sloman: But if that does not get internalized by the individual, it has no effect on the individual’s behavior.
Petta: Yes, sure. Obviously, there must be a connection, there must be a coupling.
Sloman: Can I, in this context, say something which, at first, will contradict with what you say? You have been saying that you need consistency because of the predictability of personality. Now, if you actually look at human beings, but not necessarily at other
animals, you will find that there may be consistency in a particular individual’s behavior in a certain context only. If you put him in another context, you will get a different collection of behaviors, which is itself consistent. So, at home with your family, you may be kind and generous and thoughtful, whereas being in the aggressive MIT AI Lab or in the office, where there is a lot of competition and insecurity, you may suddenly behave in a very different way, but consistently in itself. I think that would not be possible for other animals.
Ortony: It might actually: animal as parent and animal as hunter, for example.
Sloman: Yes. In that case, it may be a very general characterization that, in some sense, there is not a single personality, there are subpersonalities which get turned on and off by the context.
Petta: By the context, by the environment. That’s precisely my point.
Ortony: Yes and no, is my reaction. At some level, of course, it’s true that we are going to behave less aggressively in an environment in which the interpersonal interactions are characterized by love, affection, and familial relations, than when we are in a hostile or in a competitive environment—at the workplace, for example.
Elliott: You can have one personality that appraises everything always the same way, but it’s appraising different things at different times.
Sloman: But even cues for punishment or reward might be a variable factor. They may be consistently low in this situation and consistently high when you are in that situation.
Elliott: Also, you can’t leave out that individuals are highly affected by moods, too, in that you can characterize being in the workplace as placing yourself in an aggressive mood.
Ortony: Let’s take a dimension like friendliness—a person you characterize as a friendly person would be represented on one side of a curve for ‘‘friendliness’’ in a group. But, for this particular individual, it is still true that there is a distribution of behaviors relative to his own behaviors, such that some of his behaviors are friendly and some are unfriendly. But this distribution lies inside the ‘‘friendly’’ sector with respect to the reference group.
Sloman: And that distribution might shift with context for the same individual.
Ortony: Yes, it could shift, but it’s still probably the case that a person whom you would regard as aggressive is going to have more aggressive behaviors in the aggressive roles.
Sloman: I think what you are saying is that there may be an even deeper consistency. I suspect that for some people, there is a degree of integration that is higher than for others. And in an extreme case you get personality disorders where this mechanism is going badly wrong.
Rolls: So, is the bottom line of this that your sensitivity to reward and punishment could be relatively set, but the actual coping strategies that you are bringing into play in different environments are appropriate for that particular environment? For example, if there is something you can do about it, you might be angry; but if there’s nothing you can do about it, you might be sad? Is that one way to try to rescue a sort of more biological approach? The basic biology might—the sensitivity to reward and punishment—be unchanged, but then you have different, as it were, coping strategies.
Sloman: This is an empirical question, and I have no reason to think that, at least for humans, it’s more like what I said than like what you said. But I could be wrong. We have to investigate.
Rolls: Yes, that’s right. But I think it’s worth underlining the fact that there are at least two possibilities to explain context dependency of personality.
Note
1. P. E. Agre, Computation and Human Experience (Cambridge University Press, Cambridge, 1997).
References
Allport, G. W., and Odbert, H. S. (1938): Trait Names: A Psycho-Lexical Study. Psychol. Monogr. 47 (1): whole no. 211.
Averill, J. R. (1982): Anger and Aggression: An Essay on Emotion. Springer, Berlin, Heidelberg, New York.
Colby, K. M. (1981): Modeling a Paranoid Mind. Behav. Brain Sci. 4: 515–560.
Darwin, C. (1872): The Expression of Emotions in Man and Animals. Murray, London.
Ekman, P., ed. (1982): Emotion in the Human Face. Cambridge University Press, Cambridge.
Gray, J. A. (1994): Neuropsychology of Anxiety. Oxford University Press, Oxford, London, New York.
Higgins, E. T. (1998): Promotion and Prevention: Regulatory Focus as a Motivational Principle. In M. P. Zanna, ed., Advances in Experimental Social Psychology. Academic Press, New York.
McCrae, R. R., and Costa, P. T. (1987): Validation of a Five-Factor Model of Personality across Instruments and Observers. J. Pers. Soc. Psychol. 52: 81–90.
Ortony, A. (1991): Value and Emotion. In W. Kessen, A. Ortony, and F. Craik, eds., Memories, Thoughts, and Emotions: Essays in Honor of George Mandler. Laurence Erlbaum Associates, Hillsdale, N.J.
Ortony, A., Clore, G. L., and Collins, A. (1988): The Cognitive Structure of Emotions. Cambridge University Press, Cambridge.
Ortony, A., Clore, G. L., and Foss, M. A. (1987): The Referential Structure of the Affective Lexicon. Cogn. Sci. 11: 341–364.
Revelle, W., Anderson, K. J., and Humphreys, M. S. (1987): Empirical Tests and Theoretical Extensions of Arousal Based Theories of Personality. In J. Strelau and H. J. Eysenck, eds., Personality Dimensions and Arousal, 17–36. Plenum Press, London.
Rolls, E. (1999): The Brain and Emotion. Oxford University Press, Oxford, London, New York.
Roseman, I. J., Antoniou, A. A., and Jose, P. E. (1996): Appraisal Determinants of Emotions: Constructing a More Accurate and Comprehensive Theory. Cogn. Emotion 10: 241–277.
Rousseau, D. (1996): Personality in Computer Characters. Paper presented at the Annual Meeting of the American Association of Artificial Intelligence. In H. Kitano, ed., ‘‘Entertainment and AI/A-Life,’’ AAAI Workshop Technical Report WS-96-03, AAAI Press, Menlo Park, Calif., 38–43.
Scherer, K. R. (1997): Profiles of Emotion-Antecedent Appraisal: Testing Theoretical Predictions Across Cultures. Cogn. Emotion 11: 113–150.
7 What Does It Mean for a Computer to ‘‘Have’’ Emotions? Rosalind W. Picard
There is a lot of talk about giving machines emotions, some of it fluff. Recently at a large technical meeting, a researcher stood up and talked of how a Barney stuffed animal (the purple dinosaur for kids) ‘‘has emotions.’’ He did not define what he meant by this, but after he repeated it several times, it became apparent that children attributed emotions to Barney, and that Barney had deliberately expressive behaviors that would encourage the kids to think Barney had emotions. But kids have attributed emotions to dolls and stuffed animals for as long as we know; and most of my technical colleagues would agree that such toys have never had and still do not have emotions. What is different now that prompts a researcher to make such a claim? Is the computational plush an example of a computer that really does have emotions? If not Barney, then what would be an example of a computational system that has emotions? I am not a philosopher, and this paper will not be a discussion of the meaning of this question in any philosophical sense. However, as an engineer I am interested in what capabilities I would require a machine to have before I would say that it ‘‘has emotions,’’ if that is even possible. Theorists still grapple with the problem of defining emotion, after many decades of discussion, and no clean definition looks likely to emerge. Even without a precise definition, one can still begin to say concrete things about certain components of emotion, at least based on what is known about human and animal emotions. Of course, much is still unknown about human emotions, so we are nowhere near being able to model them, much less duplicate all their functions in machines. Also, all scientific findings are subject to revision—history has certainly taught us humility, that what scientists believed to be true at one point has often been changed at a later date. I wish to begin by mentioning four motivations for giving machines certain emotional abilities (and there are more). One goal is to build robots and synthetic characters that can emulate living humans and animals—for example, to build a humanoid robot. A
second goal is to make machines that are intelligent, even though it is also impossible to find a widely accepted definition of machine intelligence. A third goal is to try to understand human emotions by modeling them. Although I find these three goals intriguing, my main focus is on a fourth: making machines less frustrating to interact with. Toward this goal, my research assistants and I have begun to develop computers that can identify and recognize situations that frustrate the user, perceiving not only the user’s behavior and expressions, but also what the system was doing at the time. Such signs of frustration can then be associated with potential causes for which the machine might be responsible or able to help, and the machine can then try to learn how to adjust its behavior to help reduce frustration. It may be as simple as the computer noticing that lots of fancy ‘‘smart’’ features are irritating to the user, and offering the user a way to remove all of them. Or, it may be that the computer’s sensitive acknowledgment of and adaptation to user frustration simply leads to more productive and pleasing interactions. One of the key ideas is that the system could associate expressions of users, such as pleasure and displeasure, with its own behavior, as a kind of reward and punishment. In this age of adaptive, learning computer systems, such feedback happens to be easy and natural for users to provide.
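As a loose illustration of the reward-and-punishment idea just described, here is a minimal sketch, assuming hypothetical feature names, signal values, and thresholds; it is not the system described above, only one way such feedback could be tallied. Signs of user frustration or pleasure are credited to whatever features were active at the time, and features that keep coinciding with frustration become candidates the system can offer to switch off.

```python
# Minimal sketch (illustrative only): user affect treated as reward/punishment
# for the features that were active when the affect was expressed.
from collections import defaultdict

class FrustrationFeedback:
    def __init__(self, disable_threshold=-3.0):
        self.scores = defaultdict(float)          # running score per feature
        self.disable_threshold = disable_threshold

    def observe(self, active_features, affect_signal):
        """affect_signal: +1 for signs of pleasure, -1 for signs of frustration,
        as estimated from behavior, expressions, and situational cues."""
        for feature in active_features:
            self.scores[feature] += affect_signal

    def features_to_offer_removal(self):
        # Features repeatedly associated with frustration become candidates
        # that the system can offer to turn off.
        return [f for f, s in self.scores.items() if s <= self.disable_threshold]

fb = FrustrationFeedback()
fb.observe({"spell_check"}, affect_signal=+1)          # helped once
for _ in range(3):
    fb.observe({"smart_autocorrect", "spell_check"}, affect_signal=-1)
print(fb.features_to_offer_removal())                  # ['smart_autocorrect']
```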
Discussion
Picard: One of the things that is controversial with respect to agents is if they should show empathy to people. This is sort of strange, a computer saying "That feels pretty bad, and I am sorry to hear that you had such a bad experience," when a computer has no feelings. You would think this would just upset people. In fact, Reeves and Nass 1 found the same surprises in their studies at Stanford. They tested their system with Stanford students who know that the machine does not have emotions.
Stern: Do you think people could possibly be thinking, well, the person who wrote this program has empathy?
Picard: This is one of the key factors that Reeves and Nass talk about and that we all address: Do they attribute any expression of feelings to the designer of the software? And if so, then the computer should not be saying "I," "my," "me," or whatever, it should
be saying "the makers of the software." And they ran experiments investigating that, too. They have concluded that it's the machine. Even though people know better, they act as if it's the machine and not the maker of the machine.
Elliott: One of our experiences is similar, and I would say that to the extent that there is complexity in the understanding, the feeling of sincerity goes up. If you say "Here is everything I know, it's not much, but I do know this," people seem to accept that, and to the extent that there is more complexity in there and it feels like if there is more understanding, they accept it even more. It does not matter if you say it's not real; it's a bit like flattery not being real, and still . . .
Picard: That's interesting. Yes.
Elliott: Flattery wears off, when you get it two or three times in a row, it's like: "Well, I heard this before." But if the complexity is there, it doesn't seem to wear off in the same way: "I know it's not real, but you seem to understand quite a few things about how I feel, and that satisfies me." If it's just "I am sorry, but I don't know why" instead of "I am sorry because I believe that you really wanted this thing, and you did not get it, and you are embarrassed that you did not get it."
Picard: This reminds me of the strategy I use with my two-year-old: if he tries to do something, and I don't like it and say "no," then he says "why?" If I give him a short explanation, he asks "why" again. If I give another short explanation, he says "why" again. If I give him a really long complex explanation, he gets bored and forgets about it. He is training me in a sense.
Bellman: I guess I would want to see more experimentation about the implicit people behind the artifacts, because I think we have some information from our experiences that says that people, even though they suspend disbelief, are actually very aware of the authorship by other human beings. And in fact, in our virtual world studies, we often get users who come up to the author and say: "I really enjoyed your robot. He is so great, he is so lovable, you know!"
Elliott: I think that is reflection after the fact, though.
Ball: But in these experiments, people know. There is no question about misunderstanding this computer. It's just that they are still affected.
Elliott: They are inherently engaged, and this satisfies this feeling of empathy.
Sloman: We are biologically programmed to respond to this kind of behavior. If this behavior comes from a computer, we will still respond.
Ortony: But we may at the same time also praise the author as we praise the parent of a child. We have enjoyed an interaction, we don't attribute the behavior of the child to the interaction alone, but we see, as Clark said, a sign of the parent in the child. So, we say: "I really liked your kid."
Picard: But people do say: "Can't you control your child?"
Ortony: Well, that's the other side.
Picard: Well, we are not going to be able to control, so to speak, these agents at some point. I think this is a responsibility decision to make as designers, while we are in control.

My first goal thus involves sensing and recognizing patterns of emotional information—dynamic expressive spatiotemporal forms that influence the face, voice, posture, and ways the person moves—as well as sensing and reasoning about other situational variables, such as whether the person retyped the same word many times and is now using negative language. All of this is what I refer to in shorthand as "recognizing emotion," although I should be clear that it means the first sentence of this paragraph, and not that a computer can know your innermost emotions, which involve thoughts and feelings that no person besides you can sense. But once a computer has recognized emotion, what should it do? Here lies my second main goal: giving the computer the ability to adapt to the emotional feedback in a way that does not further frustrate the user. Although "having emotion" may help with the first goal, I can imagine how to achieve the first goal without this ability. However, the second goal involves intricacies in regulating and managing ongoing perceptual information, attention, decision making, and learning. All of these functions in humans apparently involve emotion. This does not mean that we could not possibly implement them in machines without emotion. At the same time, it appears to be the case that all living intelligent systems have emotion in some form, and that humans have the most sophisticated emotion systems of all, as evinced not just by a greater development of limbic and cortical structures, but also by greater
facial musculature, a hairless face, and the use of artistic expression, including music, for expressing emotions beyond verbal articulation. Part of me would love to give a computer the ability to recognize and deal with frustration as well as a person can, without giving it emotions. I have no longing to make a computer into a companion; I am quite content with it as a tool. However, it has become a very complex adaptive tool that frustrates so many people that I think it’s time to look at how it can do a better job of adapting to people. I think emotion will play a key role in this. Let’s look more closely at four components of emotion that people have, and how these might or might not become a part of a machine.
7.1
Components of Emotion
I find it useful to identify at least four components when talking about emotions in the context of what one might want to try to implement in machines (figure 7.1). Some of these components already exist in some computational systems. The components are (1) emotional appearance, (2) multiple levels of emotion generation, (3) emotional experience, and (4) (a large category of) mind-body interactions. These four components are not intended to be self-evident from their short names, nor are they intended to be mutually exclusive or collectively exhaustive. Let me say what I mean by each, and why all four are important to consider.
A computer that "has emotions," in the sense that a person does, will be capable of:
1. Emotional appearance
2. Multilevel emotion generation
3. Emotional experience
4. Mind-body interactions
Figure 7.1
Figure 7.2

Emotional Appearance
Barney the stuffed animal sometimes sounds as if he is happy. Like a 3-D animated cartoon, he has expressions and behaviors that were designed to communicate certain emotions. "Emotional appearance" includes behavior or expressions that give the appearance that the system has emotions (figure 7.2). This component is the weakest of the four, in the sense that it is the easiest of the four to produce, at least at a superficial level. However, I include it because this quality is all that an outside observer (a nondesigner of the system, who cannot access or decipher its inward functions) has at his or her disposal in order to judge the emotional nature of the system. By and large, it is what the crew in the film 2001: A Space Odyssey did not perceive about the computer HAL until the end of the film; otherwise, they might have obtained earlier clues about HAL's increasingly harmful emotional state, which is finally illuminated when HAL says, "I'm afraid, Dave, I'm afraid." This component is also the most commonly implemented in machines today—primarily in agents and robots that display emotional behaviors in order to "look natural" or to "look believable." (Note: the following discussion occurred when the original slide had this component labeled as "emotional behavior.")
Discussion
Ortony: I think emotional behavior is not really interesting. Acting is emotional behavior—it’s all imitation and mimicry. The Mac’s
smile is not emotional behavior, unless it is actually initiated by and related to an emotion.
Picard: Actually, I could map these behaviors to your slide with the emotional response tendencies (figure 7.2).
Ortony: Well, no, because they are actually in response to an emotion, while the Mac's smile isn't in response to anything. It's just a curvy line.
Picard: No, no! It holds a response to an internal state that reads as "satisfactory, good . . . ," which I would not call an emotion, but some people would say it gives rise to this feeling that universally is recognized as "all is going well."
Ortony: If you say "emotional behavior" related to "multilevel emotion generation" [in figure 7.1], then I am perfectly happy. What I am not happy with is mimicry, acting, and all those other things that are "as if" and intended to pretend. My point is actually a causal connection between 2 and 1. These things are not independent—that's all I am saying.
Picard: I have never said that these things are independent. In fact, I talked about the explicit interconnections between these!
Ball: The behavior is irrelevant. I don't think it's necessary to show behaviors.
Picard: I am not saying that if you don't see emotional behavior, there is no emotion.
Ortony: I understand that. It's just the examples you give of emotional behavior that aren't actually caused by 2, 3, and therefore they are not examples of emotional behavior, they are "as if" behaviors! That's all.
Picard: And I am just talking about what they are received as. You know, to an observer it may not make a difference.
Ortony: We are talking about what they are capable of, not how they are interpreted.
Picard: I see your distinction. I think we could work it out as the semantics of what we are talking about.
Ortony: No, no. We have to work it out with a causal model as opposed to a set of causally unrelated independent things.
Picard: Ok. What I would say is that a person who has emotions is capable of emotional behavior.
Ortony: Yes, absolutely right.
Because emotional appearance results largely from emotional behavior, and because I include the making of facial, vocal, and other expressions as kinds of behavior, I have previously referred to this component as emotional behavior. I am here changing my two-word description because a couple of colleagues at the Vienna workshop argued that it was confusing; however, I am not changing what it refers to, which remains the emotional appearance of the system's behavior. Examples of systems with behaviors that appear to be emotional include the tortoises of W. Grey Walter (1950) and Braitenberg's Vehicles (Braitenberg 1984). When one of Braitenberg's little vehicles approached a light or backed rapidly away from it, observers described the behavior as "liking lights" or as "acting afraid of lights," both of which involve emotional attribution, despite the fact that the vehicles had no deliberately designed internal mechanisms of emotion. Today there are a number of efforts to give computers facial expressions; the Macintosh has been displaying a smile at people for years, and there is a growing tendency to build animated agents and other synthetic characters and avatars that would have emotional expressions. These expressive behaviors may result in people saying the system is "happy" or otherwise, because it appears that way. I think all of us would agree that the examples just given do not have internal feelings, and their behavior is not generated by emotions in the same sense that human or animal behavior is. However, the boundary is quickly blurred: Contrast a machine like the Apple Macintosh, which shows a smile because it is hardwired to do that in a particular machine state, with a new "emotional robot," which shows a smile (Johnstone 1999) because it has appraised its present state as good and its present situation as one where smiling can communicate something useful. The Mac's expression signals that the boot-up has succeeded and the machine is in a satisfactory state for the user to proceed. However, most of us would not say that the Mac is happy. More might say that the robot is happy, in a rudimentary kind of way. But, if the robot's happy facial expression were driven by a simple internal state labeled "satisfaction," then it would really be no different from the Mac's display of a smile. As the generation mechanisms become more complex and adapted for many such states and expressions, the argument that the expression or behavior really arose from an emotion becomes more compelling. The more complex the system, and the
higher the user’s expectations, the harder it also becomes for the system’s designer to craft the appearance of natural, believable emotions. Nonetheless, we should not let mere complexity fool us into thinking emotions are there. If a system really has emotions, then we expect to see those emotions influence and give rise to behavior on many levels. There are the obvious expressions and other observable emotional behaviors, like saying ‘‘Humph,’’ and turning abruptly away from the speaker; however, emotions also modulate nonemotional behaviors: The way you pick up a pen (a neutral behavior) is different when you are seething with anger versus when you are bubbling with delight. True emotions influence a number of internal functions, which are generally not apparent to anyone but the designer of the system (and in part to the system, to the extent that it is given a kind of ‘‘conscious awareness’’ of such). Some of emotion’s most important functions are those that are unseen, or at least very hard to see. The body-mind mechanisms for signaling and linking the many seen and unseen functions are primarily captured by the fourth component, which I’ll describe shortly.
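To make the Mac-versus-robot contrast above concrete, here is a toy sketch with invented names and thresholds; it does not model the Macintosh, any particular robot, or a specific appraisal theory. One routine ties an expression directly to a machine state, while the other produces the same expression only when an appraisal judges the current situation to be good and judges that expressing it would communicate something useful.

```python
# Toy contrast (illustrative only) between a hardwired expression and an
# appraisal-driven one; neither models a real product.

def hardwired_smile(machine_state: str) -> bool:
    # Expression tied directly to a machine state, with no appraisal at all.
    return machine_state == "boot_succeeded"

def appraisal_driven_smile(situation: dict) -> bool:
    # Expression follows from an appraisal: is the current state good for the
    # agent's goals, and is there anyone to communicate with?
    state_is_good = situation.get("goal_progress", 0.0) > 0.5
    audience_present = situation.get("user_present", False)
    return state_is_good and audience_present

print(hardwired_smile("boot_succeeded"))                                       # True
print(appraisal_driven_smile({"goal_progress": 0.8, "user_present": True}))    # True
print(appraisal_driven_smile({"goal_progress": 0.8, "user_present": False}))   # False
```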
Multiple Levels of Emotion Generation
Animals and people have fast subconscious brain mechanisms that perform high-priority survival-related functions, such as the response of fear in the face of danger or threat. LeDoux (1996) has described the subcortical pathway of fear’s ‘‘quick and dirty’’ mechanism, which precedes cortical involvement. This level of preconscious, largely innate, but not highly accurate emotion generation appears to be critical for survival in living systems. One can imagine giving robots and machines sensors that operate at a similar level—in a relatively hardwired way, detecting when the system’s critical parameters are in a danger zone, and triggering rapid protective responses, which can shortly thereafter be modified by slower, more accurate mechanisms. The level of emotions just described stands in contrast with slightly slower (although still very fast) emotion generation that tends to involve higher cortical functions and may or may not involve conscious appraisals (figure 7.3). If you jump out of the way of a snake, and suddenly realize it was only a stick, then that was probably an instance of the fast subconscious fear-generation mechanism. In contrast, if you hear that a convicted killer has
Multilevel emotion generation
• Fast, "hard-wired": fear (LeDoux), computer's power alarm, robot response to "pain"
• Slower, more "reasoned" emotions: rule-based, associative, flexible...
Figure 7.3
escaped a nearby prison, and consequently decide that you don't want to leave the house, then it is likely that your thoughts generated a form of a learned fear response, which subsequently influenced your decision. You may have never seen a convicted killer, but you cognitively know that such a person could be dangerous, and you associate with it a response that you learned from a similar but real experience. This learned fear response engages some of the same parts of the brain as the lower-level quick version of fear, but it additionally involves reasoning and cortical appraisal of an emotional situation. Some of the most common methods of "implementing emotions" in computers involve constructing rules for appraising a situation, which then give rise to an emotion appropriate to that situation. An example is the OCC model (Ortony, Clore, and Collins 1988), which was designed not to synthesize emotions but to reason about them, yet which works in part for either. Consider the generation of joy, which involves deciding that if an event happens, and that event is desirable, then it may result in joy for oneself or in happiness for another. A machine can use this rule-based reasoning either to try to infer another's emotion, or to synthesize an internal emotional state label for itself. All of this can happen in the machine in a cold and logical way, without anything that an outsider might observe as emotion. It can happen without any so-called conscious awareness or "feeling" of what the machine is doing. This kind of "emotion generation" does not need to give
rise to component one—emotional appearance—or to the other two components listed below, but it could potentially give rise to all of them. In a healthy human, such emotional appraisals are also influenced by one's feelings, via many levels of mechanisms. People appear to be able to reason in a cold way about emotions, with minimal if any engaging of observable bodily responses. However, more often there seem to be bodily changes and feelings associated with having an emotion, especially if the emotion is intense. An exception arises in certain neurologically impaired patients (e.g., see accounts in Damasio 1994) who show minimal signs of such somatic concomitants of emotion. If you show these patients grotesque blood-and-guts mutilation scenes, which cause most people to have high skin conductivity levels and to have a feeling of horror and revulsion, these patients will report in a cool cognitive way that the scenes are horrible and revolting, but they will not have any such feelings, nor will they have any measurable skin conductivity change. Their emotional detachment is remarkable, and might even seem like a feature, were it not for the serious problems in day-to-day rational functioning that such a lack of emotionality seems to be part of, rendering these otherwise intelligent people severely handicapped. What these patients have is similar to what machines that coldly appraise emotions can have—a level of emotion generation that involves appraisal, without any obvious level of bodily or somatic involvement. It is not clear to what extent normal people can have emotions without having any associated bodily changes other than those of unfelt thought patterns in the brain; consequently, the levels of emotion generation described here may not typically exist in normal people without being accompanied by some of the mind-body linkages in the fourth component, described below. Nonetheless, multilevel generation of emotion is an important component because of its descriptive power for what is believed to happen in human emotion generation, and because some of these levels have already been implemented to a certain degree in machines. It is also relevant for certain neurologically atypical people, such as high-functioning autistics, who describe their ability to understand emotions as "like a computer—having to reason about what an emotion is" versus understanding it intuitively. The two levels just described—quick and dirty subconsciously generated emotions and slightly slower, more reason-generated
emotions, are not the only possibilities. Nor does my choice of these two examples impose a belief that "reasoning" has to be conscious. My point is instead that here are examples of emotions occurring via different levels of mechanisms. I expect that neuroscientists will find unique patterns of activation (and deactivation) across cortical and subcortical regions for each kind of emotion—joy, fear, frustration, anger, and so forth, and possible unique patterns for significant variations in levels of these. I would also expect we would build multiple levels of activation of emotion-generation mechanisms in machines, varying in resources used and varying in timing and in influence, in accord with the specific roles of each emotion. Some would be quick and perhaps less accurate, while some would be more carefully deliberated. Some would be at a level that could be consciously attended, or at least attended by some "higher" mechanisms, while some would occur without any such monitoring or awareness. Some of the mechanisms would be easy to modify over time, while others would be fairly hardwired. Some of the emotion-generation mechanisms might be rule based, and easy to reason about—at least after the fact if not during—while others would be triggered by patterns of similarity that might not be easily explained. And many or even all of these mechanisms might be active at different levels contributing to background or mixed emotions, not just to a small set of discrete emotions. In summary, machines will have different combinations of mechanisms activating different emotions, a veritable orchestra for emotion generation.
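A rough sketch of how such an orchestra might be prototyped follows; all names, thresholds, and rules are assumptions made for illustration, not a description of any existing system. A fast, relatively hardwired layer watches survival-critical parameters, while a slower layer applies a simple OCC-style appraisal rule of the kind described above (a desirable event that occurs gives rise to joy; an anticipated undesirable one, to fear).

```python
# Sketch of multilevel emotion generation (illustrative only): a fast,
# hardwired layer plus a slower, rule-based appraisal layer.

def fast_layer(sensors: dict) -> list:
    """Quick-and-dirty hardwired checks: imprecise but immediate."""
    emotions = []
    if sensors.get("battery_level", 1.0) < 0.05:
        emotions.append(("fear", 1.0))            # critical parameter in danger zone
    if sensors.get("core_temperature", 0.0) > 90.0:
        emotions.append(("fear", 0.8))
    return emotions

def slow_layer(event: dict, goals: dict) -> list:
    """Slower appraisal: compare an appraised event against stored goals."""
    emotions = []
    desirability = goals.get(event.get("affects_goal"), 0.0)
    if event.get("occurred") and desirability > 0:
        emotions.append(("joy", desirability))      # desirable event happened
    elif event.get("expected") and desirability < 0:
        emotions.append(("fear", -desirability))    # undesirable event anticipated
    return emotions

# The fast layer can fire first and be revised by the slower appraisal later.
print(fast_layer({"battery_level": 0.03}))                       # [('fear', 1.0)]
print(slow_layer({"affects_goal": "finish_task", "occurred": True},
                 goals={"finish_task": 0.7}))                    # [('joy', 0.7)]
```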
Emotional Experience
We humans have the ability to perceive our personal emotional state and to experience a range of feelings, although many times we are not aware of or do not have the language to describe what we are feeling (figure 7.4). Our feelings involve sensing of physiological and biochemical changes particular to our human bodies (I include the brain and biochemical changes within it as part of the body). Even as machines acquire abilities to sense what their ‘‘bodies’’ are doing, the sensations remain different than those of human bodies, because the bodies are substantially different. In this sense machine feelings cannot duplicate human feelings. Nonetheless, machines need to be able to sense and monitor more
Emotional Experience
What one can perceive of one's own emotional state:
I. Cognitive or semantic label
II. Physiological changes
III. Subjective feeling, intuition
Problem: consciousness

Figure 7.4
of what is going on within and around their systems if they are to do a better job of regulating and adapting their own behavior. They will likely need mechanisms that perform the functions performed by what we call consciousness, if only to better evaluate what they are doing and learn from it. A great distinction exists between our experience and what machines might have. The quality of conscious awareness of our feelings and intuition currently defies mechanistic description, much less implementation in machines. Several of my colleagues think that it is just a matter of time and computational power before machines will ‘‘evolve’’ consciousness, and one of them tells me he’s figured out how to implement consciousness, but I see no scientific nuggets that support such belief. But I also have no proof that it cannot be done. It won’t be long before we can implement numerous functions of consciousness, such as awareness and monitoring of events in machines, but these functions should not be confused with the experience of self that we humans have. I do not yet see how we could computationally build even an approximation to the quality of emotional experience or experience of self that we have. Thus I remain a skeptic on whether machines will ever attain consciousness in the same way we humans think of that concept. Consciousness, and life, for that matter, involves qualities that I do not yet see humans as capable of creating, outside of procreation. Perhaps someday we will have such creative abilities; nonetheless, I do not see them arising as a natural progression of
past and present computational designs, not even with the advent of quantum computing.
Discussion
Riecken: I don't know what consciousness is.
Picard: It's more loaded than awareness. I prefer that we say (referring to figure 7.4): what we perceive of our own emotion, what something in us can perceive or become aware of. Because my list applied both to people and computers, I didn't want to put the word self in. I always said "one's," but, you know, that's not quite the same as "self." I think self is a more loaded word than just saying what this entity perceives of what's going on within this entity.
Sloman: A good operating system has a certain amount of self-awareness.
Ortony: It's a little tough for a machine to be aware of its physiological changes if it does not have a physiology.
Picard: A computer can sense physically. We recently hardwired the back of our monitor to sense surges in voltage, so it could sense the precise instant that it was displaying the image to subjects in one of our studies. The operating system did not give us the hooks to sense that. I think we need to build software and hardware that has better self-awareness.
Rolls: Well, can I say that I am really worried about saying that any machine has self-awareness in this sense?
Picard: Yes. It sounds just as dangerous as saying it has emotions.
Rolls: Whenever we use the word "awareness," it implies to me qualia of phenomenology. If you would replace that by "self-monitoring," we would not get into a problem.

If we can understand something, we can model it and build a computational model of it. Modeling is a form of imitation, not duplication. Thus I use the term "imitate" instead of "duplicate" with respect to implementing this component in machines. In fact, we should probably be more careful about using the phrase "imitating some of the known mechanisms of human emotion in machines" to describe much of the current research concerned with "giving machines emotion." For brevity and readability, the latter phrase is what I will continue to use, with hope that with this
Mind-Body Interaction: Emotions are NOT just "thoughts"
• Conscious and nonconscious events
• Regulatory and signaling mechanisms
• Biasing mechanisms, intuition
• Physiological and biochemical changes
• Sentic modulation, lying impacts pressure waveform of love; smiles induce joy...
Figure 7.5
paper, we will begin to find some common understanding for what this shorter expression represents.
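Reading "self-awareness" in the weaker sense of self-monitoring suggested in the discussion above, a system might at least keep a running summary of its own internal readings for other mechanisms to consult. The sketch below is illustrative only; the sensor names and the choice of a rolling average are assumptions, not a real monitoring API.

```python
# Illustrative self-monitoring (hypothetical sensor names, not a real API).
# The point is only that a system can observe and summarize its own state,
# which is far weaker than awareness in any phenomenological sense.
from collections import deque

class SelfMonitor:
    def __init__(self, window=50):
        self.history = deque(maxlen=window)       # keep the last `window` samples

    def sample(self, sensors: dict) -> None:
        """sensors: name -> reading, e.g. {'load': 0.7, 'error_rate': 0.01}."""
        self.history.append(dict(sensors))

    def summary(self) -> dict:
        if not self.history:
            return {}
        keys = self.history[-1].keys()
        return {k: round(sum(s.get(k, 0.0) for s in self.history) / len(self.history), 4)
                for k in keys}

monitor = SelfMonitor()
monitor.sample({"load": 0.72, "error_rate": 0.00})
monitor.sample({"load": 0.91, "error_rate": 0.05})
print(monitor.summary())      # {'load': 0.815, 'error_rate': 0.025}
```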
Mind-Body Interactions
The fourth component is a broad category including many signaling and regulatory mechanisms that emotion seems to provide in linking cognitive and other bodily activities (figure 7.5). Here we find that emotions often involve changes in bodily systems outside the brain, as well as inside the brain. There is evidence, for example, that emotions inhibit and activate different regions of the brain, facilitating some kinds of cognitive activity while inhibiting others. Researchers have shown numerous effects of emotion and mood biases on creative problem solving, perception, memory retrieval, learning, judgment, and more. (See Picard 1997a for a description of several such findings.) Not only do human emotions influence brain information processing, but they also influence the information processing that goes on in the gastrointestinal and immune systems (see Gershon 1998 for a description of information processing in the gut). Emotions modulate our muscular activity, shaping the spacetime trajectories of even very simple movements, such as the way we press on a surface when angry versus when joyful. I call the way in which emotions influence bodily activity sentic modulation, after Manfred Clynes’s (1977) work in sentics, where he first attempted to quantify and measure a spatiotemporal form of
emotion. Clynes found that even simple finger pressure, applied to a nondescript firm surface, took on a characteristic pattern when people tried to express different emotions. Moreover, some of the emotions had implications for cognitive states such as lying or telling the truth. Subjects were asked to physically express either anger or love while lying or while telling the truth, and their physical expressions (finger pressure patterns, measured along two dimensions) were recorded and measured. When subjects were asked to express anger, the expressions were not significantly different during lying than when telling the truth. However, when subjects were asked to express love, the expressions differed significantly when lying versus when telling the truth. In other words, their bodily emotional expression was selectively interfered with by the cognitive state of lying, given that it was not obviously interfered with in any other way. I expect that this particular love-lying interaction is one of many that remain to be characterized. The interaction between emotions and other physical and cognitive states is rich, and much work remains to be done to refine our understanding of which states inhibit and activate each other. As each interaction is functionally characterized in humans, so too might it be implemented in machines. Ultimately, if a machine is to duplicate human emotions, the level of duplication must include these many signaling, regulatory components of emotion, which weave interactive links among physical and mental states. Consider building synthetic pain-sensing and signaling mechanisms. Some machines will probably need an ability outside of their modifiable control to essentially feel bad at certain times—to sense a kind of highest-priority unpleasant attention-refocusing signal in a situation of dire self-preservation (where the word self is not intended to personify, but only to refer back to the machine). This ‘‘feeling,’’ however it is constructed, would be of the same incessantly nagging, attention-provoking nature that pain provides in humans. When people lose their sense of pain, they allow severe damage to their body, often as the accumulation of subtle small damages that go unnoticed. Brand and Yancey (1997) describe attempts to build automatic pain systems for people—in one case, a system that senses potentially damaging pressure patterns over time. The artificial pain sensors relay signs of pain to the patient via other negative attention-getting signals, such as an obnoxious sound in their ear. One of the ideas behind the artificial system is
to provide the advantages of pain—calling attention to danger—without the disadvantages—the bad feelings. The inputs approximate those of real pain inputs, and the outputs are symbolically the same: irritating and attention getting. Ironically, what people who use the artificial system do is either turn these annoying warnings off or ignore them, rationalizing that it can't be as bad as it sounds. Eventually the pain-impaired person gets seriously injured, although he or she doesn't really mind because it does not hurt. In short, the artificial pain system doesn't work; somehow it has to be "real" enough that you can't override or ignore it for long. Otherwise, injury accumulates, and the long-term prognosis is bad. Whatever version of "pain" we give machines, if its goal is system preservation, then it must effectively be impossible to turn off, except in the service of greater goals or by the machine's designer. This is not simply to say that pain avoidance should always have the highest priority. Self-preservation goals may at some point be judged as less important than another goal, as suggested by Asimov's three laws of robotics, where human life is placed above robot "life," although this presumes that such assessments could be accurately made by the robot. Humans sometimes endure tremendous pain and loss of life for a greater goal. Similar trade-offs in behavior are likely to be desirable in certain kinds of machines. Before concluding this section, let me restate that it is important to keep in mind that not all computers will need all components of all emotions. Just as simple animal forms do not need more than a few primary emotion mechanisms, not all computers will need all emotional abilities, and some will not need any emotional abilities. Humans are not distinguished from animals just by a higher ability to reason, but also by greater affective abilities. I expect more sophisticated computers to need correspondingly sophisticated emotion functions.
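One hedged way to render the requirement that a synthetic "pain" signal not be switched off by ordinary means is as a priority scheme. In the sketch below, which is an illustration rather than a proposal, the pain channel keeps demanding attention unless a goal of strictly higher priority is active or a designer-level override is supplied; all names and numbers are invented.

```python
# Sketch of a synthetic "pain" channel (illustrative only). Ordinary code
# cannot silence it: it is outranked only by a goal of higher priority or
# by a privileged designer override.

DESIGNER_KEY = "designer-override"       # stand-in for a privileged channel

class PainSignal:
    PAIN_PRIORITY = 90                   # near the top, but not absolute

    def __init__(self):
        self.level = 0.0                 # 0.0 = no pain, 1.0 = maximal

    def raise_pain(self, level: float) -> None:
        self.level = max(self.level, min(level, 1.0))

    def demands_attention(self, active_goal_priority: int,
                          override_key: str = "") -> bool:
        if override_key == DESIGNER_KEY:
            return False                 # only the designer may silence it
        if active_goal_priority > self.PAIN_PRIORITY:
            return False                 # a greater goal (e.g., protecting a human) wins
        return self.level > 0.0          # otherwise it keeps nagging

pain = PainSignal()
pain.raise_pain(0.6)
print(pain.demands_attention(active_goal_priority=50))    # True: refocus attention
print(pain.demands_attention(active_goal_priority=100))   # False: greater goal wins
```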
7.2
Discussion
What is my view on what it means for a computer to have emotion? Before closing this discussion, we should keep in mind that we are still learning the answer to this question for living systems, and the machine is not even alive. That said, I have tried to briefly describe four components of emotion that are present in people and to discuss how they might begin to be built into machines.
Evidence suggests that emotions...
• Coordinate/regulate mental processes
• Guide/bias attention and selection
• Signal meaningfulness
• Help with intelligent decision-making
• Enable resource-limited systems to deal with unpredictable, complex inputs, in an intelligent, flexible way
Figure 7.6
My claim, which opened this section, is that all four components of emotion occur in a healthy human. Each component in turn has many levels and nuances (figure 7.6). If we acknowledge, say, N = 60 such nuances, and implement them in a machine, then the machine can be said to have dozens of mechanisms of emotion that imitate, or possibly duplicate, those of the human emotional system. So, does the machine "have emotions" in the same sense that we do? There is a very basic problem with answering this: One could always argue that there are only N known human emotion mechanisms and more may become known; how many does the machine have to have before one will say that it has humanlike emotions? If we require all of them to be identified and implemented, then one can always argue that machines aren't there yet, because we can never be assured that we have understood and imitated everything there is to know. Consequently, one could never confidently say that machines have emotions in the sense that we do. The alternative is to agree on some value of N that suffices as a form of "critical mass." But that is also ultimately unsatisfactory. Furthermore, some machines will have or may benefit from aspects of emotion-like mechanisms that humans don't have. Animals doubtless already have different mechanisms of emotion than humans, and we are not troubled by the thought of someone saying that they have emotions. Ultimately, we face the fact that a precise equality between human and machine emotion mechanisms cannot be assured because we simply do not have complete lists of what there is to compare, nor do we know how incomplete our lists are.
Can't we do all this without giving the machines emotions? Sure. But, once we've given them all the regulatory, signaling, biasing, and other useful attention and prioritization mechanisms (by any other name) and done so in an integrated, efficient interwoven system, then we have essentially given the machine an emotion system, even if we don't call it that.

Figure 7.7
Machines are still not living organisms, despite the fact that we describe many living organisms as machines (figure 7.7). It has become the custom to associate machine behavior and human behavior without really thinking about the differences anymore. Despite the rhetoric, our man-made machines remain of a nature entirely different than living things. Does this mean they cannot have emotions? I think not, if we are clear that we are describing emotions as mechanisms with functional components like the four described here. Almost all of these have been implemented in machines at some level, and I can see a path toward implementing all of them. At the same time, it is prudent to acknowledge that one of the components, emotional experience, includes components of consciousness that have not yet been shown to be reducible to computational functions. Machines with all components but this one might be said to have emotion systems, but no real feelings. As we make lists of functions and match them, let us not forget that the whole process of representing emotions as mechanisms and functions for implementation in machines is approximate. The process is inherently limited to that which we can observe, represent, and reproduce. It would be arrogant and presumptuous to not admit that our abilities in these areas are finite and small compared to all that is unknown, which may be infinite. Remember that I began this presentation asking whether or not it was necessary to give machines emotions if all we are interested in is giving them the ability to recognize and respond appropriately to
a user's emotion. Suppose we just want the computer to see it has annoyed someone, and to change its behavior so as not to do that again; why bother giving it emotions? Well, we still may not have to bother, if we can give it all the functions that deal with complex unpredictable inputs in an intelligent and flexible way, carefully managing the limited resources, dynamically shifting them to what is most important, judging importance and salience, juggling priorities and attention, signaling the useful biases and action-readiness potentials that might lead to intelligent decisions and action, and so forth. Each of these functions, and others that might someday be added to this list, may possibly be implemented by means other than "emotions." However, it is the case that these functions, in humans, all seem to involve the emotional system. In particular, these functions involve the second, third, and fourth components of emotion that I've described, which sometimes give rise to the first component. We may find once we have implemented all of these useful functions, and integrated them in an efficient, flexible, and robust system, that we have essentially given the machine an emotion system, even if we don't call it that. Machines already have some mechanisms that implement (in part) the functions implemented by the human emotional system. Computers are acquiring computational functions of emotion systems whether or not one uses the "e" word. But computers do not have humanlike emotions in any rich or experiential natural sense. They may sense and label certain physical events as categories of "sensations," but they do not experience feelings like we do. They may have signals that perform many and possibly all of the functions performed by our feelings, but this does not establish equivalence between their emotion systems and ours. Computers may have mechanisms that imitate some of ours, but this is only in part, especially because our bodies differ and because so little is known about human emotions. It is science's methodology to try to reduce complex phenomena like emotions to a list of functional requirements, and it is the challenge of many in computer science to try to duplicate these in computers to different degrees, depending on the motivations of the research. But we must not be glib in presenting this challenge to the public, who thinks of emotion as the final frontier of what separates man from machine. When a scientist tells the public that a machine "has emotion" then the public concludes that not only
could Deep Blue beat a grand master, but also that Deep Blue could feel the joy of victory. The public is expecting that science will catch up with science fiction, that we will build HAL and other machines that have true feelings, and that emotions will consequently be reduced to a program available as a plug-in or free download if you click on the right ad. I think we do a disservice when we talk in such a way to the public, and that it is our role to clarify what aspects of emotion we are really implementing. Emotions are not the system that separates man and machine; the distinction probably lies with a less popular concept—the soul—an entity that currently remains ineffable, but is something more than a conscious living self. I don't have much to say about this, except that we should be clear to the public that giving a machine emotion does not imply giving it a soul. As I have described, the component of emotional experience is closely intertwined with a living self, and I remain uncertain about the possibility of reducing this component to computational elements. However, I think that the other three components of emotion are largely capable of machine implementation. If the day comes that scientists think that human emotion and its interactions with the mind and body are precisely described and understood, then it will not be many days afterward that such functions will be implemented in a machine, to the closest approximation possible to the human system. At that point, most people will probably say that the machine has emotions, and few scientists will focus on what this means. Instead, we will focus on how machines, made in our image, can be guided toward good and away from evil, while remaining free to do what we designed them to do.
Discussion: Emotional Maturity
Elliott: Socially intelligent systems or emotionally intelligent systems require emotions. And that is what we want to build, or what we want to study.
Picard: I don't actually think that we can say that emotionally intelligent systems require emotions. That's the question. I am trying to build emotionally intelligent systems, and I will see how far I can go without giving them emotions. Which components of emotions? I guess I can invent a mechanism for each of them, and I
don't need emotions to do that. But by the time we have built these special mechanisms that perform the decision making this way, maybe we discover that it would be more efficient to go back and just build an emotional system, because that one emotional system could maybe do it all.
Ortony: I thought you were going to say something else. I thought you were going to say that emotion did somehow fall out as a by-product of having built an integrated system.
Picard: Actually, I could say that as well, too. By the time you have done the integrated system without ever invoking the word "emotion," it is an emotional system. I have always thought this was obvious. I never bothered to say it. . . .
Ortony: It's not totally obvious, you know.
Picard: I used to design computer architectures for a living. You can play out all the goals of what you want to achieve on the table, and then you figure out how you combine it all in an efficient architecture. So, to me, this was just obvious. We have been focusing on systems that help people communicate emotions, that help them express emotion; or the machine might express it—and might try to recognize emotion. Is that going to make a really emotionally intelligent system, if we do that? I don't think so. These are exactly the capabilities autistics have. Autistics have emotions, they can express emotions, and they can sometimes pattern-recognize other people's emotions, and yet they are really difficult to interact with as human beings. We cannot consider them as emotionally intelligent.
Ortony: Let me offer a new word, which is emotional maturity. I think it is a much more felicitous term than emotional intelligence. I think emotional maturity is what it really is! I mean, it's a much more natural way to think about it. Emotions do in fact develop in humans. In the normal course of development, they mature, and people become emotionally sophisticated and capable of doing all these things that go under the rubric of "emotional intelligence." Or maybe there is a better word yet.
Sloman: Competent. Emotionally competent!
Bellman: Emotional competency in infants at a certain stage for certain things.
Ortony: Well, the reason I said "mature" is that it implies age-appropriate competence.
Picard: I know intelligence is a loaded word for a lot of people. And so I just list the set of learnable skills, as opposed to implying some innate capabilities.
Trappl: Maturity implies a genetic aspect.
Picard: Yes, it's rather hard to make a sure divide. But there are a lot of arguments for the case that you can teach people how to improve this set of skills, so to some degree, they are learnable, although to some degree they are probably also genetic.
Ortony: You can teach people to improve their posture, but that does not mean that development of posture is not a sort of natural maturation. There is a natural development of posture, and we can still correct it.
Note
1. B. Reeves and C. Nass, The Media Equation (Cambridge University Press, Cambridge, 1996).
References
Braitenberg, V. (1984): Vehicles: Experiments in Synthetic Psychology. MIT Press, Cambridge.
Brand, P. W., and Yancey, P. (1997): The Gift of Pain. Zondervan Publishing House, Grand Rapids, Michigan.
Clynes, M., Jurisevic, S., and Rynn, M. (1990): Inherent Cognitive Substrates of Specific Emotions: Love is Blocked by Lying but not Anger. Percept. Motor Skills 70: 195–206.
Clynes, M. (1977): Sentics: The Touch of Emotions. Anchor Press/Doubleday, New York.
Damasio, A. R. (1994): Descartes' Error: Emotion, Reason, and the Human Brain. Putnam, New York.
Gershon, M. D. (1998): The Second Brain: The Scientific Basis of Gut Instinct and a Groundbreaking New Understanding of Nervous Disorders of the Stomach and Intestines. HarperCollins, New York.
Johnstone, B. (1999): Japan's Friendly Robots. Technol. Rev. May/June 63–69.
LeDoux, J. E. (1996): The Emotional Brain. Simon and Schuster, New York.
Ortony, A., Clore, G. L., and Collins, A. (1988): The Cognitive Structure of Emotions. Cambridge University Press, Cambridge.
Picard, R. W. (1997a): Affective Computing. MIT Press, Cambridge.
Picard, R. W. (1997b): Does HAL Cry Digital Tears? Emotion and Computers. In D. G. Stork, ed., HAL's Legacy: 2001's Computer as Dream and Reality. MIT Press, Cambridge.
Roseman, I. J., Antoniou, A. A., and Jose, P. E. (1996): Appraisal Determinants of Emotions: Constructing a More Accurate and Comprehensive Theory. Cogn. Emotion 10: 241–277.
Walter, W. G. (1950): An Imitation of Life. Sci. Am. 182 (5): 42–45.
8 The Role of Elegance in Emotion and Personality: Reasoning for Believable Agents Clark Elliott
In this short chapter, intended to foster discussion at the Austrian Research Institute for Artificial Intelligence workshop on Emotions in Humans and Artifacts, we will suggest that elegance and artistic cohesion are important factors in emotion and personality models used for simulating anthropomorphic creatures on computers. For such models to be useful in either real time or delayed interaction with users, they must support the creation of a fluid social fabric, with material coming primarily from the user’s imagination, and they must support the user’s ability to continually generate the spark of inspiration that makes such interaction work. When the rhythm of such interaction locks in, a spell may be cast, creating a context in which inspired beings can exist, where previously there were only amusing, but disconnected, techniques. Without elegance in both the underlying theories that drive the interaction, and in the exposition of the theories, the spell will be broken and the inspired beings lost.
8.1
Elegance Leads to Generalization and Adaptability
Emotion and personality models for the computer will tend to fall into two categories, albeit with some overlap. The first category, with which we are less concerned here, contains theories that seek to illuminate human systems, both psychological and physiological. The second category contains theories that forsake human systems and processes and instead attempt to describe, symbolically, details of humanlike emotion and personality in ways that are, ultimately, useful for computer applications of many types. These two categories are analogous to what has been termed human AI and alien AI. With human AI, primarily concerned with illuminating real-life systems, granularity is not an issue: The systems exist in the physical, temporal world. The level at which translation into symbolic form takes place depends on the particular data being collected. With alien AI, however, wherein we are attempting to support at
least the illusion of humanlike emotional behavior and understanding (however we can get it), the granularity with which emotion and personality are being described is of paramount importance. We suggest that for symbolically descriptive systems (i.e., alien AI emotion and personality systems), selecting a suitable level of granularity that allows us to reasonably support a wide range of social scenarios may be critical. Doing this correctly allows us to provide a solid foundation for the elegance that gives rhythm and flow to the imaginative processes of users. Additionally, we can argue that this elegance, giving rise to a natural understanding and acceptance on the part of the user, will also allow our theories to work in a variety of different contexts: for generating personalities, for describing social processes, for understanding users, for tweaking tutoring systems, and so on.
8.2
A Multicontext Example
For example, consider that in all cases, emotion and personality models for computer applications must make the translation from the analog processes of the real world to the rough-hewn set of discrete states used in computer simulations. For a socially rich system, these discrete states will, currently, be high level and complex, and, necessarily, cut from relatively large chunks of time. If the theory we are using to drive our multicontext emotion model strives for deep coverage of many different aspects of emotion reasoning, it is possible that it will suffer from inconsistent granularity of representation: A more profound representation in one area of emotion reasoning may lend insight to the field as a whole, but may also weaken the rhythm of interaction, and thus the elegance of the theory, when it is used in believable agents. One aspect of the model, within which the representational problems are less difficult, will be richer and constructed of finer-grain components while another aspect of the model will be sparse. For the study of humanlike systems this may be fine; for artistic balance, and the creation of flow in interaction with users, it may be fatal. We are not championing any particular believable agent’s emotion and personality model in this exposition, but rather a style of model. Our point, independent of the particular theory, is that whatever (necessarily impoverished) representation is used, the
strengths of the theory must be emphasized in such a way that they can act as the glue holding together other agent-building techniques. Whatever the theory, it had best have enough internal consistency that the ebb and flow of interaction with it will transfer to a number of contexts. Users can become familiar at many levels (conscious and subliminal) with how such a social/reasoning process works. Most people never study conversation, but they understand the concept of taking turns, of asking for elaboration, and so on. Similarly, even without studying music, they will become familiar with standard chord progressions and melodic contour. The underlying rules for each of these two systems apply in many contexts (e.g., talking with a friend, arguing with a salesperson, watching people speak to each other in a foreign language; listening to country music, listening to a Beethoven symphony, listening to jazz). Similarly, if an emotion theory has internal consistency, users will come to know, and be able to use, the rules governing its structure in many contexts. For the purpose of discussion, we will use an example from work best known to us—that of Ortony, Clore, and Collins (1988; hereafter, OCC), as embodied in the "Affective Reasoner" (Elliott 1992; hereafter, AR). As used in the AR, the OCC theory uses relatively high-level symbolism for driving personality and emotion. It is also broad in coverage and uses a similar level of granularity for its varied aspects. We will show, through brief example, that because of these traits, and its inner consistency, such a representation is useful in a number of different believable agent contexts. Our discussion here is not about the quality of this particular theory, but rather about the quality of the style of this kind of theory, which could be manifested in a number of quite different theoretical models. In the following list, we briefly examine the OCC representation of hope (being pleased over the prospect of a favorable outcome with respect to one's goals), as used in the AR, in seven different contexts. Some subset of twenty-four such emotion templates (e.g., hope, anger, love, pity, etc.), each with considerably more detail than presented here, has been used in each of these seven contexts, as well as in others. We suggest, as a position, that it is the balance in the OCC/AR representation, and internal consistency, rather than its empirical rightness or wrongness that allow us to use it, fluidly, in so many different ways.
Personality Generation
Agent personality properties affected by the representation of hope include: (a) The ability to reason about the future. This can be as simple as holding a discrete representation of world features to be checked against future states in the system. When the features map to an unfolding event, outcomes can be confirmation or disconfirmation. (b) Character traits: tendency toward experiencing hope (optimistic personality), or the reverse (a less volatile personality—note that a pessimistic personality would involve the generation of worry, a form of fear); strength of hopefulness; duration of hopefulness without reinforcement; strength of ultimate satisfaction or disappointment, and so on. (c) Agent mood changes through dynamic tweaking of the degree of hopefulness "experienced" by the agent.

Emotion Generation
When an agent "observes" world situations that suggest a stored goal will obtain in the future, then possibly, depending on current constraints in both the external state of the world and the internal state of the agent, it will generate an instance of hope, and queue up a set of features to be compared against future events for the possible generation of satisfaction or disappointment. In observing agents, make the following changes of emotion state: (a) possibly generate an instance of a fortunes-of-others emotion if the same set of observed-world situations, also observed by the second agent, leads that second agent to believe that the first agent will experience an emotion, and the second agent is in a relationship with the first agent (e.g., the second agent is happy for the first agent because the triggering event is believed to have caused the first agent to be hopeful); or (b) possibly generate an emotion sequence wherein the expressed emotion of the first agent triggers an emotion in the second agent (e.g., resentment over the perceived hopefulness of the first agent, irrespective of other observed-world situations). (A minimal sketch of such a hope template appears after this list.)

Emotion Manipulation
If the system (e.g., a tutoring system) knows that a user has a certain goal, then lead them to believe that such a goal is likely to obtain in the future. Tweak subsequent actions to take advantage of the likelihood that the user is experiencing hope.

Emotion Understanding
If a user is hopeful, look for uncertainty with respect to as yet unrealized goals. What are these goals? Ask the user for information relevant to hope. Reason about personality traits in the individual. Know what questions to ask: Does the user have a positive personality? Look for features constituting the strength of the hope. How far in the future? How strong is the hope? How important is the goal?
Story Generation When generating story variations, if an agent has a goal, make it currently unrealized (or, if a preservation goal, realized but possibly threatened in the future), but believed to be realized in the near future by the agent intended to experience hope. Have other agents presume this desired goal of the agent. With a fixed external plot (that is, with no change to what happens in the world external to the agent), alter the story by having an agent desire a future event, rather than desire that the same future event not occur. Alter the personality of the agent from one that tends to believe in the negative outcome of the fixed future event, to one that tends to believe in the positive outcome of the fixed future event. In this way, while the events stay the same, the meaning of the events changes.

User Models Does this user tend to experience hope or fear? Is she motivated better by hope or by fear? If she is hopeful, find out what goal she values. How strong is the desire for this outcome? How confident is she that this goal will obtain? Ask. Users are motivated to answer questions when they feel that they are being understood.

Humor Consider this scenario: We, as the observing audience, know that an event will not obtain (e.g., the protagonist wants the girl to go out with him, and does not realize that her mammoth boyfriend has just returned with their drinks). In fact, it may already have been disconfirmed. The agent, with which we (a) may be in an adversarial relationship (possibly temporary), and (b) are able to identify (cognitive unit), plays out the elements of classic comedy wherein we observe the agent’s increasing hope, knowing that it can only, ultimately, end in disappointment. To be funny, the scenario has to be tweaked according to the represented features of our emotion and personality theory: If the cognitive bond is too strong, the scenario will not be funny, it will be sad. If the cognitive bond is too weak, we may not feel anything because it is not personal enough. The degree of funniness may depend on our own emotional involvement in the emotions of the central characters.
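Returning to the Emotion Generation context above, the following Python sketch (our own construction, not code from the AR, whose representation is considerably richer) shows how that step might be realized: when an observed situation suggests that a stored, as yet unrealized goal may obtain, an instance of hope is created and its features are queued for later confirmation (satisfaction) or disconfirmation (disappointment). The agent structure, field names, and intensity formula are all assumptions made for the illustration.

# Hedged sketch of the hope-generation step described under Emotion
# Generation above.  Names and the intensity formula are illustrative only.
def generate_hope(agent, situation):
    instances = []
    for goal in agent["goals"]:
        if situation.get("suggests") == goal["name"] and not goal["realized"]:
            hope = {
                "type": "hope",
                "goal": goal["name"],
                "intensity": goal["importance"] * situation.get("likelihood", 0.5),
            }
            instances.append(hope)
            # Features queued for matching against future events.
            agent["pending_prospects"].append(
                {"goal": goal["name"], "on_confirm": "satisfaction",
                 "on_disconfirm": "disappointment"}
            )
    return instances

agent = {"goals": [{"name": "paper accepted", "importance": 0.9, "realized": False}],
         "pending_prospects": []}
print(generate_hope(agent, {"suggests": "paper accepted", "likelihood": 0.7}))

The point is not the particular formula, but that the same small structure can be reused, essentially unchanged, across the seven contexts listed above.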
8.3
A More Extended Example—Capturing the Situational Humor in Two Stories
To illustrate how internal consistency (in this case, of the OCC theory) can lead to flexibility, and hence flow, we look a little more deeply into how the AR explores the generation of, and understanding of, humor by believable agents. This application, at first
pass, might seem quite far afield, but because the emotion theory is transportable to most any arena requiring insight into the human social fabric, even here we can carve our niche. We will look at two vignettes illustrating a particular kind of situational humor that has been loosely categorized as like ‘‘pulling a chair out from under a dignitary,’’ but which we will further refine as the type of humor wherein authority figures are taken down for violating a principle they have previously championed at another’s expense. Emotion types (e.g., anger, gloating, shame) are well defined in both OCC and the AR (see above). Syntax note: In general, we will use parentheses, ( ), for function arguments, and square brackets, [ ], for parenthetical comments where appropriate in the two examples.
Senior Graduate Student Gets Taken Down
‘‘In graduate school the older Ph.D. students lorded it over us. One day, a secretary sent out bulk e-mail to everyone in Computer Science asking that we each confirm our status on some unimportant matter. A first-year graduate student mistakenly used the ‘R’ form of reply—which replies to all recipients copied in on the original note, instead of the ‘r’ form—which only replies to the sender. A senior graduate student then replied to this poor e-mail neophyte, flaming him mercilessly for interrupting everyone’s busy academic work by being so technically inept, and so on. Of course, the senior Ph.D. student mistakenly used the ‘R’ form to reply as well. We all thought this was very funny.’’
The Dean Is Late!
‘‘The dean of our school, a rather formal, pontificating individual, devoted some minutes at the end of a faculty meeting discussing professionalism. We were a typical, frazzled faculty, with continuous interruptions during our day. Nonetheless, he belittled the rest of the faculty members in the school for coming late to meetings, saying that because these were scheduled in advance there really should be no excuse for not arriving on time. When he arrived ten minutes late to the next faculty meeting two days later, much chagrined, we were all greatly pleased at his expense.’’
Analysis
The parallel structure of these two stories is relatively obvious, and we offer one analysis here. We can find direct mappings for the following labelings within each of the stories. Step by step, using the existing emotion theory underlying the AR, we have the following:

Agent-A: Authority figure who later becomes an unwilling victim.
Agent-V: Original victims, who later delight in the downfall of the authority figure, Agent-A.
Relationship R1: Adversarial from the perspective of the authority figure toward the original victims, with respect to the principle P.
Relationship R2: Adversarial from the perspective of the original victims toward the authority figure, with respect to the principle P.

Agent-V, through an action, Act-1, violates a principle P, held by the authority figure, Agent-A, leading to emotion instance E1-a-shame. The authority figure (older grad student/dean) has an emotion instance, E1-b-mixed, of mixed emotions, comprising (a) E1-b-i-gloating (using the R1 adversarial relationship) over the misfortune of others, and (b) E1-b-ii-anger over the violation of the stated principle P, which leads to a thwarting of one of the authority figure’s goals G (not being disturbed via e-mail/the meeting beginning on time).

The original victims are held accountable, through Act-2, by the authority figure, for violating the principle P. The original victims (the new grad student, the faculty) are ashamed, emotion instance E2-shame, but the intensity of this emotion is increased over E1-a-shame by Act-2 [through emotion intensity variables such as sense of reality, and importance]. An adversarial relationship R2-adversarial, with respect to this principle, from the original victims to the authority figure, is established at this point, if not already present, through the authority figure holding the original victims accountable for their transgression.

The principle held by the authority figure, for which the observers have earlier been held accountable, is later violated by the authority figure, through Act-3. This yields an emotion instance E3-a of embarrassment (shame) of the authority figure over the violated principle P, and an instance E3-b of gloating of the original victims over the authority figure’s downfall. This gives us the emotion fabric:
P is held by Agent-V and Agent-A.
Act-1: Agent-V violates P; E1-a-shame(Agent-V, P); E1-b-mixed-emotions(Agent-A, P, G, R1) [E1-b-i-gloating(Agent-A, P, R1) / E1-b-ii-anger(Agent-A, [at] Agent-V, G, P)].
Act-2: E2-shame(Agent-V, P), intensity is increased by Act-2(Agent-A); relationship [Agent-V → R2-adversarial → Agent-A, with respect to P] is established if not already present.
Act-3: Agent-A violates P; E3-a-shame(Agent-A); E3-b-gloating(Agent-V over Agent-A).
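For concreteness, the three-act fabric above can also be written down as plain data. The following Python sketch is only an illustration of how one abstract schema could be bound to either vignette; it is not Affective Reasoner code, and the role names, strings, and binding mechanism are our own.

# Illustrative encoding of the three-act "emotion fabric" as data, plus a
# binding step that instantiates it for a concrete story.
FABRIC = [
    {"act": "Act-1", "actor": "Agent-V", "violates": "P",
     "emotions": ["shame(Agent-V, P)",
                  "gloating(Agent-A, P, R1)",
                  "anger(Agent-A, at=Agent-V, G, P)"]},
    {"act": "Act-2", "actor": "Agent-A", "holds_accountable": "Agent-V",
     "emotions": ["shame(Agent-V, P)  # intensified"],
     "establishes": "R2-adversarial(Agent-V -> Agent-A, wrt P)"},
    {"act": "Act-3", "actor": "Agent-A", "violates": "P",
     "emotions": ["shame(Agent-A, P)", "gloating(Agent-V, over=Agent-A)"]},
]

def instantiate(fabric, bindings):
    """Bind the abstract roles to a concrete story, e.g., the e-mail vignette."""
    def sub(text):
        for role, name in bindings.items():
            text = text.replace(role, name)
        return text
    return [{key: (sub(value) if isinstance(value, str) else [sub(v) for v in value])
             for key, value in step.items()} for step in fabric]

email_story = instantiate(FABRIC, {"Agent-A": "senior student",
                                   "Agent-V": "first-year student",
                                   "P": "don't reply-to-all"})
for step in email_story:
    print(step["act"], step["emotions"])

Binding the same fabric to the dean story only changes the role assignments, which is precisely the reuse across contexts that internal consistency is meant to buy.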
8.4
Discussion

Not previously well defined in the Affective Reasoner are the authority figure and the adversarial relationship with respect to a specific principle. (For example: I have a friend V who has another long-time friend K who will no longer come over to V’s house because V has purchased guns, which K is, in principle, opposed to. The two are mutually adversarial with respect to that principle, but are otherwise friends.) Each of these cases can be handled, ad hoc, without changing the computational framework in the AR.

First, that one is an authority figure is in this case really only with respect to the principle under consideration: the senior graduate student is an authority on sending return mail; the dean is an authority on meetings. Any time a person feels it his job to criticize others for the violation of a principle, for the purposes of this type of humor, we can consider him to consider himself an authority on that principle.

In the case of adversarial relationships being established with respect to certain principles, this is straightforward. In the AR, relationships have been more or less arbitrarily established (although dynamically set) anyway. For the purpose of studying humor, then, setting the relationships so that an agent is in an adversarial relationship with another agent need not be specific to a certain principle, but can continue to be, instead, global—thus encompassing that principle. From a future-design plausibility standpoint, it would seem that all fortunes-of-others emotions dependent on that particular principle, and that agent, could simply use whatever setting of the adversarial relationship was desired, ignoring the setting at other times. In this way, should V’s guns be stolen, K would gloat, but should V’s television be stolen K would feel pity.
Some Additional Intuitions
E1-b-mixed(Agent-A, P, G, R) does not seem necessary; E1-b-i-gloating(Agent-A, P, R) is sufficient; E1-b-ii-anger(Agent-A, [at] Agent-V, G, P) might work.

E1-a-shame(Agent-V, P) can be dropped and folded into E2-shame(Agent-V). It is only important that the adversarial relationship, R2, be established, or continued, with Act-2 as the trigger, and that Act-2 either cause, or intensify, E2-shame(Agent-V, P).

A (mild) form of E2-hate(Agent-V, [toward] Agent-A, P2) may be present, stemming from disliking of Agent-A by Agent-V, and from the violation of the principle, P2, ‘‘Do not point out my faults.’’

Agent-V does not necessarily have to hold the same principle P, but has to be held accountable for it by Agent-A, so that if shame is not present, then a new goal G2 sought by Agent-V must take the place of the principle P, such that violating the principle P held by Agent-A will lead to blocking of the goal G2. In other words, if the authority holds a principle to be valid, and an original victim violates the principle and gets slammed because of it, it may still be funny when the authority later violates that principle. Here we can also see anger replacing shame—that is, instead of E1-a-shame(Agent-V, P) we have E1-a-anger(Agent-V, P2, G2), where G2 is some goal thwarted by Agent-A as a consequence of Agent-V having violated Agent-A’s principle P, and P2 is Agent-V’s principle, ‘‘Do not thwart my goals on the basis of principles I do not believe in.’’ For example, many non-Christians gloated over the downfall of Jimmy Swaggart, even though they themselves might not have held the principles he was embarrassed to have violated. However, they felt that, through the political arena, their own goals were being blocked by Swaggart’s actions in response to their violation of Swaggart’s principles.

The original victim must identify with the new victim’s plight just before, or during, the moment that humor emerges. In the AR we use the cognitive unit mechanism wherein, in this case, the original victim would identify with the shame of the authority figure before moving into (or perhaps ‘‘simultaneously’’ with) gloating.

The principle violated by Agent-A must be the same as, or similar to, that recently violated by Agent-V, or the situation will not be funny.
The victim, Agent-A, must feel shame over the action that violates the principle, or it will not be funny. If the senior graduate student, or the dean, had simply taken the attitude, ‘‘Who cares . . . the rule only applies to peons,’’ it would have simply generated bad feeling, not humor.

8.5
Supporting the User rather than the Empirical Emotion Theory
The apex of interaction with believable agents is firmly located internally in the user. Because of this, it is not so much a matter of ‘‘rightness’’ and ‘‘wrongness’’ but rather of ‘‘goodness’’ (elegance, flow) and ‘‘badness’’ (lack of rhythm, unintuitiveness), which will determine the effectiveness of the underlying emotion and personality theory used. We use the following argument to illustrate that users are not really interested in computerized beings, but rather only in their projections of such beings.

Consider pain and pleasure, two very basic, elemental components of somatic emotion experience, and certainly states represented in some form in all believable agents. At what point have we really represented these states in the computer, on behalf of an agent? Choosing pain, the more dramatic of these two, for this exposition, let us work through a small hypothetical example.

Suppose that we build a computer system, ‘‘Samantha’’ (Sam), that incorporates state-of-the-art danger avoidance, sensors, stimulus-response learning, and so on. Now, we tell Sam-the-computer-system we are going to kill her, and begin slowly to dismantle her hardware and software. In other words, we are ‘‘torturing’’ Sam to death. The question arises, how much do we, as users, care? If Sam-the-computer-system is our friend, maintains her state, has a long history with us, and provides interesting company, we may be sad that Sam is dying. Furthermore, if Sam is able to clearly express her distress (in the case of the AR agents this would be through the use of arresting music, facial expressions, and spoken, somewhat inflected, text) we might even experience a somatic sympathetic response of remarkable strength. However, this response on our part is internally fabricated on the basis of our social nature. We, as users, provide all of the ‘‘juice’’ that makes this relationship work.

We are now told by someone, ‘‘I have placed your cat in the oven, and turned on the heat. To the extent that you increase Sam’s pain, I will reduce the cat’s pain.’’ Most of us (depending, of
course, on how we feel about the cat!) would immediately say, ‘‘Well, sorry Sam, it was nice knowing you,’’ and do whatever we can to help the cat. It is always in the back of our minds that, under the illusion, somewhere in Sam, there is a register labeled ‘‘pain’’ with a number in it. Sam does not really feel pain.

We can make the example even more striking by changing a few of the details. First, when we dismantle Sam, she believes that this is permanent, but it is not—we have a full backup. Second, instead of threatening to cook the cat, we merely threaten to dip it, repeatedly, in cold water, in a room in some other building where we can neither see, nor hear, what is going on. Third, just before Sam starts to (virtually) ‘‘scream’’ her agonized protests, we turn off the speakers and the monitor. Under such circumstances we can argue that even though the cat will not be harmed, few (one might hope none!) of us would be willing to make the cat suffer this discomfort. We know full well that Sam, in her own world, is going through a terrific martyrdom. We also know that if we were to immerse ourselves in Sam’s world through what have become rather hypnotic multimedia engagement techniques we might very well be extremely upset. However, this reasoning about our potential emotional involvement in Sam’s martyrdom is a poor relation to any real emotion on our parts. We have lost all experience of Sam’s suffering, because without our experience of it the martyrdom does not exist. Sam, independently of our experience of Sam, does not matter, but the cat does. The fact that we can fully restore Sam makes Sam’s suffering of no account whatsoever.

It can thus be argued that this clearly delineates the internal/external distinction in our subjective experience with respect to believable computer agents. We do not really believe that, in the ‘‘real’’ world, such agents have either the experience or the subjective qualities that we are willing to attribute to them. Nonetheless, we are quite willing to interact with them in a way that acknowledges their social qualities, so long as they can skillfully, and elegantly, support the illusion we create to support this interaction.
8.6
Elegance in Communication
Another important component of elegance in believable agents is in capitalizing on the communicative abilities, and especially the communicative adaptability, of humans. If people are placed together, they will find ways to communicate, even in a group
comprising widely different cultures and languages. Additionally, humans find ways to communicate with dogs, birds, and so on. We are robust in our social craft. By contrast, however, even the very best software agents are exceedingly brittle in their communicative capabilities. Additionally, humans are strongly motivated to communicate with other beings—this is in our nature. If we hold the belief that someone can understand us, we will ‘‘talk’’ to them. Thus, given that humans are both motivated to communicate and also quite adaptable in the ways they are willing to effect this, whereas computer agents have almost no adaptability in this area, it follows that a system supporting believable agents should place the burden of communication on the human participants.

However, here is where a touch of elegance is essential: To capitalize on this particular ‘‘hook,’’ the computer system must be designed with the theme not of mimicking human communication, but rather of supporting it. It may not be necessary, or even desirable, to have agent communication components embodied as fully realized, real-time morphing human faces with fully inflected, phoneme-tweaked, phrase-generated speech and supporting a complete understanding of the human world, etc. If our goal is to support human skill at communication rather than fully realize it independently, then our focus might rather be on creating new expressive attributes and repertory that are consistent, of uniform granularity, and maximally suited to the true nature of the believable agent.

Here is a perverse (but in its perversity, illustrative) example. Suppose an agent can generate only simple, albeit not always correct, speech patterns, such as the following:

Sam speaks: Clark desire(s) END CONVERSATION. I NOT desire(s) END CONVERSATION. Clark CONFLICT I. Clark FIGHT WITH I or Clark GIVE UP?

We can argue that although the syntax is poor, a human can easily learn to understand such expressions. However, despite its impoverished delivery, for a believable agent the content conveyed would be of interest. Agent elegance here is supported by the consistent delivery of information, not the correctness of the syntax. An elegant design might acknowledge then that the particular syntax, or lack thereof, is merely a little noise in the communication
channel. Accordingly, it would specify a graphical interface that does not, from an artistic perspective, look unnatural with broken English. In fact, a lifelike, intelligent-looking, 3-D talking head would likely raise the wrong sort of expectations, and call attention to the poor syntax, whereas a cartoon face, or an animal might be a good match. After all, if a dog could generate such broken English he would soon be making someone millions on the Tonight Show. No one would care that his pronouns are not correct—he is just a dog!
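As a toy illustration of this design stance (our construction, not part of the AR), a deliberately impoverished but uniform message template might look as follows in Python; the speaker and goal names echo the Sam example above and are purely hypothetical.

# Hypothetical sketch: a fixed template renders the agent's internal state in
# deliberately broken but uniform English, leaving interpretation to the human.
def render(speaker, listener, my_goal, their_goal):
    lines = [
        f"{listener} desire(s) {their_goal}.",
        f"I NOT desire(s) {their_goal}." if my_goal != their_goal
        else f"I desire(s) {their_goal}.",
    ]
    if my_goal != their_goal:
        lines.append(f"{listener} CONFLICT I.")
        lines.append(f"{listener} FIGHT WITH I or {listener} GIVE UP?")
    return f"{speaker} speaks: " + " ".join(lines)

print(render("Sam", "Clark",
             my_goal="CONTINUE CONVERSATION",
             their_goal="END CONVERSATION"))

Consistency of the template, rather than grammatical correctness, is what the human interlocutor can latch onto.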
8.7
Closing
We have taken the following position: Emotion and personality models for believable agents should best have an internal consistency and elegance that will come through in many different contexts. This elegance allows the user to create the illusion of inspired life in a number of different applications, through rhythmic interaction with computer agents indirectly supported by consistent rules in the underlying emotion model. Users interacting with believable agents care much more about their perceptions of agent personality and emotional life than they do about the agents themselves, and this is fundamentally unlike the way people relate to real emotional beings. Above all, the believable agents researcher should recognize that the biggest win is in supporting the user’s ability to inspire computer agents with simulated life, rather than in attempting to create that simulated life unilaterally.
References

Elliott, C. D. (1992): The Affective Reasoner: A Process Model of Emotions in a Multiagent System. Ph.D. thesis, Northwestern University, Illinois. On-line. Available: ftp://ftp.depaul.edu/pub/cs/ar/elliott-thesis.ps (availability last checked 5 Nov 2002).

Ortony, A., Clore, G. L., and Collins, A. (1988): The Cognitive Structure of Emotions. Cambridge University Press, Cambridge.
9 The Role of Emotions in a Tractable Architecture for Situated Cognizers Paolo Petta
The rehabilitation and appreciation of the emotional as a key element for successful coping with nondeterministic, highly dynamic environments by providing a flexible and adaptive mechanism is quickly gaining a wider acceptance, as reflected not only by the present volume, but also by the now steady stream of presentations, discussions, and pertinent publications that are no longer limited to highly specialized journals or events (see, e.g., Keltner and Gross 1999 for a review of functional accounts of emotions). One aspect of emotions that is of particular interest to us as agent engineers is the fact that emotions and affect provide a glimpse of what could be termed as being really at work in the control of natural beings. The classical emotional response triad is composed of physiological reaction as preparation of the body according to some tendencies to act; motor expression communicating the internal state and intention to others as well as eliciting affective response; and irreflexive and reflexive subjective feelings with overt (i.e., perceivable and evident to the self) influence on motivation, perception, learning, or memory. However, all of these are responses brought about by a mechanism that itself remains thoroughly hidden from possibilities of inspection or introspection. At the same time, there exists abundant literature providing complementary evidence by documenting the limited influence of the conscious self in the daily routine. Papers (such as Bargh and Chartrand 1999) eloquently report how the acquisition of routines in humans is unconditional and how rarely conscious acts of self-regulation occur, with the conscious self playing a causal role in guiding behavior only around 5 percent of the time under normal circumstances.1 To paraphrase Antonio Damasio (Damasio 1994, 1999) using a notion introduced to agent research by Phil Agre and Ian Horswill (Agre and Horswill 1997), the lifeworlds of individuals are much richer than what could be captured and maintained usefully in propositional form; the emotion process, then, can be seen as a hidden mechanism that interfaces to consciously inaccessible components that govern most of behavior. We see this
circumstance in turn as offering us a most welcome bridge of inspiration from the hopelessly complex natural human role model to the engineering of useful practical solutions to extant issues in the design of agent architectures. The present chapter reports on investigations being carried out by our group, where we try to integrate findings, notions, and ideas from emotion research with results from behavior-based robotics and related areas in the form of a tractable appraisal-based architecture for situated cognizers, TABASCO (tractable appraisal-based architecture framework for situated cognizers). This chapter is structured as follows: We first set out by identifying a range of contributions that the emotional could in principle be expected to bring to situated agent engineering, setting aside for the moment questions of engineering practicality. The next section then presents a brief summary of the nature and adaptive function of emotion according to widely accepted theories and proceeds to a more detailed discussion of appraisal theories of emotion, which are gathering increasing consensus in psychological research. Section 9.3 provides a review of research on agent architectures and emotion synthesis and presents a typical example of an early appraisal-based architecture. Section 9.4 gives an outline of TABASCO, relating the architecture to the areas of research in psychology, artificial life, and agent architectures. We conclude with a brief description of currently ongoing and scheduled future directions of research within our group.
9.1
Relevance of the Emotional for Agent Engineering
In this section, we identify a number of interesting properties of emotions that could justify taking a closer look to see whether what is known about the natural phenomenon could prove of relevance for the engineering of synthetic autonomous agents. We organize this compilation around the widely known agent characteristics (e.g., Wooldridge and Jennings 1995), namely, reactivity, social ability, autonomy, and proactiveness.
Emotions and Reactivity
Emotions can be characterized as fast, adapted responses to situational meanings. This implies that they implement operationalized abstractions from specific instances, so as to allow one and the same objective stimulus to elicit different emotions in different
individuals at the same time and even different emotions in the same individual (typically, but not always, at different times). Conversely, different objective stimuli may elicit similar emotional reactions in different (or one and the same) individuals. Table 9.1 summarizes how emotions differ from reflexes and physiological drives in terms of realizing a powerfully flexible stimulus-response coupling.

Table 9.1 Emotions vs. reflexes and physiological drives (adapted from Smith and Lazarus 1990)

Property              Reflex                        Physiological drive         Emotion
Stimulus source       Internal or external          Internal tissue             Internal or external
                      event; real                   deficit; real               event; real or imagined
Periodicity           Reactive                      Cyclical                    Reactive
Response flexibility  Low                           Moderate                    High
Examples              Startle, blink                Hunger, thirst              Anger, guilt, sadness

The capacity to provide adapted responses in individuals inhabiting some particular out of a very broad range of domains (differing degrees of urbanization, cultures, social groups, etc.) indicates the efficient integration of design-time and run-time solutions (as, e.g., explicated in the contribution by E. T. Rolls in chapter 2 of this volume and in further detail in Rolls 1999). The real-time properties of emotional responses furthermore suggest the presence of mechanisms capable of integrating different levels of situational analysis (cf., e.g., the notion of ‘‘the low and the high path’’ in LeDoux 1996). In addition, the roles of expressivity connected to emotional behavior already mentioned in the beginning of the chapter provide an example for how a fast, broad (in the sense of multipurpose), high-bandwidth local broadcast channel can be usefully put to a multitude of services (if at the same time presupposing the existence of emitters and sensors of information of adequate sophistication).
Emotions and Social Abilities
The prominent roles played by emotions in the social capabilities of humans are reflected in positions such as the weak social constructionism (e.g., Armon-Jones 1986), which defines emotions as
mostly social constructs (see also the contribution by A. Ortony, chapter 6 of this volume), if with the concession of the existence of a limited range of natural emotion responses of individuals. As such, emotions utilize some deeper models of the self, of others, of others as seen by the self, and so on (cf., e.g., the affective reasoner’s model of twice-removed points of view; Elliott 1992). Similarly, interpersonal models of emotions achieve a significantly broad coverage, employing a limited number of generative elements (e.g., the well-known model by T. D. Kemper) based on the notions of power—the capacity to compel another actor to do something not wished—and status—voluntary compliance with wishes, interests, and desires of another (Kemper 1993).

With respect to communication, involuntarily generated expressive behavior functions as social imperative (Frijda 1986) insofar as understanding and reacting to it are in turn involuntary. On one hand, this facilitates the access to second-level resources (i.e., resources that are not immediately at the disposal of and directly accessible to individuals—e.g., beings requiring being taken care of) and the management of established commitments (cf. Aubé 2001). The overt display of action tendencies serves additional purposes of coordination by communicating information about the intentions of the individual and thereby likely future activities. It is not accidental that an important aspect of the acquisition of social emotional competence regards the learning (and automation, cf. above) of display rules regulating under which circumstances it is (in)appropriate to show what overt signals of an individual’s emotional state.

Another important topic regards the interplay of emotions and social norms, where the role of emotions for the sustenance of social norms has been considered to be of such impact as to merit acknowledgment in influential economic models of societies (Elster 1996, 1999). Here, emotions are included in the definition of social norms as injunctions to behavior with the features of not being outcome oriented (in contrast to rules for rational action); of being shared across some or all members of a society; and of being enforced through sanctions as well as by emotions triggered by norm violation (even when the transgression cannot be observed). The apparent overdetermination of enforcement of social norms by both cold sanctions and hot emotions is explained by the decisive advantage of the emotional channel to be independent from actual material loss. Furthermore, already the prospect of unpleasant
outcomes (e.g., embarrassment, shame, or guilt) prevents norm violation, while pleasant emotions (e.g., pride, gratification, or admiration) favor the upholding of norms. Emotions are thus seen to function as punishments and rewards for instrumental conditioning, not unlike the proposal by E. T. Rolls (chapter 2 of this volume). In return, as explained above, social norms (mainly culturally defined ones) regulate emotions as socially constructed phenomena in terms of feeling rules, expression rules, and display rules, and by providing support for the coping with and management of emotions in regulation (cf. also Ortony, chapter 6 of this volume).

As a final example for the relevance of emotions in social abilities, we mention some examples of their role in the establishment and maintenance of long-term relationships. Here, emotions can restrict self-interested behavior in favour of cooperative altruistic behavior—for example, Trivers (1971) sees reciprocal relationships to be upheld by emotions such as moralistic aggression and guilt. Frank (1988) characterizes moral sentiments such as guilt and anger as commitment devices to act contrary to immediate self-interest. In this view, the predisposition to feel guilt is seen to commit to cooperation while the predisposition to outrage would commit to punishment, even if costly or out of proportion, thereby supporting the long-term gains of repeated interaction and deterrence of others. The requirements to enact this kind of regulation are the possibility to discern the presence of these dispositions via both the environment (reputation) and direct communication (affective signals). (We refer the interested reader to, e.g., Keltner and Haidt 1999; Levenson 1999; Staller and Petta 2001 for further information on interpersonal and social functions of emotions).
Emotions and Autonomy
McFarland and Bösser (1993) distinguish an autonomous agent from an automaton with state-dependent behavior by its being self-controlling and motivated. Motivation in turn is defined as a reversible internal process responsible for change in an agent’s behavior. Emotions can then be inserted into the picture as situated motivators in domains that are not simply task oriented, as endowment with multiple concerns (Frijda 1986) leads to caring about certain states and changes (not) to come about. The
emotional system signals the relevance of events for the concerns of the system and assesses its capabilities to cope with the opportunity or challenge detected. With respect to the previous subsection, emotions furthermore manage dependencies and commitments in social scenarios, thereby realizing a flexible administration of degree of autonomy (see also section entitled ‘‘Toward Taking Situatedness Seriously’’). To the extent that emotions develop at run time and in turn influence the development of a persistent agent over its lifetime, they furthermore form the basis for its individuality (cf. Damasio 1994).
Emotions and Proactiveness
Serious confrontation with situated resource-bounded systems has supported the progression in adaptive system design that led first from the specialization of the generic concept of rationality to one of bounded rationality (Russell and Norvig 1995; Russell 1997), and subsequently to full appreciation of the differences between the global designer’s view and the local, situated agent’s view, allowing for effective use of indexicals. Confrontation with the particular challenges posed and opportunities offered by rich dynamic environments highlighted the relevance of convergent approaches complementing the more traditional employment of correspondent models with reliance on properties of the (structure of the) environment forming the agent’s lifeworld (Agre 1995; Agre and Horswill 1997). In agreement with this view, the notion of control and the related supremacy of planning are increasingly being traded in for notions such as differential couplings and coordination (Clancey 1997). We refer to other contributions in this volume, in particular L. D. Cañamero (chapter 4), but also A. Sloman (chapter 3) and E. T. Rolls (chapter 2), for more detailed discussions of pertinent aspects of emotions.

Having thus summarized but a few of the multifarious reasons for which the topic of emotions may raise the interest of agent designers, we next take a closer look at the nature and adaptive function of emotion as exposed in appraisal theories of emotions—the particular theoretical approach chosen by us as a first main reference, mainly because of its level of abstraction, which arguably makes it a good match for integration into hybrid layered control architectures, and its openness to gradual inclusion of more detailed lower-level models (e.g., based on Damasio 1994, as in
The Role of Emotions in a Tractable Architecture
Ventura, Custodio, and Pinto-Ferreira 1998; Custodio, Ventura, and Pinto-Ferreira 1999; or Rolls 1999, as in Morén and Balkenius 2000).
9.2
The Nature and Adaptive Function of Emotion
Emotion and Adaptation
Emotion can be viewed as a flexible adaptation mechanism that has evolved from more rigid adaptational systems, such as reflexes and physiological drives (Scherer 1984; Smith and Lazarus 1990; Lazarus 1991). The flexibility of emotion is obtained by decoupling the behavioral reaction from the stimulus event. The heart of the emotion process thus is not a reflexlike stimulus-response pattern, but rather the appraisal of an event with respect to its adaptational significance for the individual, followed by the generation of an action tendency aimed at changing the relationship2 between the individual and the environment (Frijda 1986). Lazarus (1991) subsumes adaptive activities under the term coping. In the remainder of this section, appraisal, action tendencies, and coping are discussed in more detail.
Appraisal Theories of Emotion
‘‘Appraisal theories’’ of emotion emphasize the role of a hypothesized continuous evaluation of the environment according to dimensions regarding the individual. This evaluation is called appraisal. It is conceived of as a constituting element in emotion generation, mediating between events and emotions, explaining why the same event can give rise to different emotions in different individuals, or even in one and the same individual (usually at different times). Conversely, appraisals offer a framework for the identification of the conditions for the elicitation of different emotions, as well as for understanding what differentiates emotions from each other. Many theories have been developed in order to specify how many and which appraisal criteria are minimally needed for emotion differentiation (e.g., Roseman 1984, 1996; Scherer 1984, 1999; Smith and Ellsworth 1985; Frijda 1986; Ortony, Clore, and Collins 1988; Lazarus 1991; Roseman, Antoniou, and Jose 1996).
There is a high degree of consensus with respect to these appraisal criteria (see, e.g., Scherer 1988, 1999 for reviews). According to Reekum and Scherer (1997, pp. 259–260), these include the perception of a change in the environment that captures the subject’s attention (= novelty and expectancy), the perceived pleasantness or unpleasantness of the stimulus or event (= valence), the importance of the stimulus or event to one’s goals or concerns (= relevance and goal conduciveness or motive consistency), the notion of who or what caused the event (= agency or responsibility), the estimated ability to deal with the event and its consequences (= perceived control, power, or coping potential), and the evaluation of one’s own actions in relation to moral standards or social norms (= legitimacy), and one’s self-ideal.
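Purely as an engineering illustration (not part of TABASCO or of any of the cited theories), these criteria can be collected into a single record; the Python field names below are our paraphrases of the dimensions just listed.

# Hedged sketch: the consensus appraisal dimensions gathered into one record.
# Field names and value types are our own paraphrases, not an implemented API.
from dataclasses import dataclass

@dataclass
class Appraisal:
    novelty: float             # change that captures attention / expectancy violation
    valence: float             # intrinsic (un)pleasantness of the stimulus
    goal_relevance: float      # importance for current goals or concerns
    goal_conduciveness: float  # motive consistency of the event
    agency: str                # who or what caused the event
    coping_potential: float    # perceived control or power over the consequences
    legitimacy: float          # compatibility with norms and the self-ideal

# Example instantiation with made-up values.
blocked_path = Appraisal(novelty=0.9, valence=-0.6, goal_relevance=0.8,
                         goal_conduciveness=-0.7, agency="the dog's owner",
                         coping_potential=0.4, legitimacy=0.0)

Such a record says nothing yet about how its values are computed; that is the subject of the next subsection.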
Modeling the Appraisal Process
The main application of the appraisal theories cited above is the structural analysis of the semantics of emotion words. However, these structural theories do not specify the nature of the appraisal process. In response to Zajonc (1980), who criticized the ‘‘exaggerated cognitivism’’ of appraisal theories, several appraisal theorists pointed out that the appraisal process is not necessarily conscious or voluntary (Scherer 1984; Smith and Lazarus 1990; Frijda 1993).

Leventhal and Scherer (1987) attempt a more precise specification of the nature of the appraisal process. They suggest a hierarchical processing system consisting of three levels: sensorimotor, schematic, and conceptual. The sensorimotor level is based on innate hardwired feature detectors giving rise to reflexlike reactions. The schema concept was introduced by Bartlett (1932) and has been widely used in cognitive science as a central memory structure. In Leventhal and Scherer’s model, the schematic level is based on schema matching. The conceptual level involves reasoning and inference processes that are abstract, active, and reflective. Problem solving is an example of a conceptual-level process.

Smith and colleagues (1996) suggest a model of the appraisal process that builds on this distinction between schematic and conceptual processing. Schematic processing is fast, automatic, parallel, inflexible, and concrete. It can be thought of in terms of priming and spreading activation. In contrast, conceptual processing is slow, voluntary, serial, flexible, and relies on semantically accessible information. Smith and colleagues especially emphasize
the interactions between schematic and conceptual processing. For example, conceptual processing can activate, create, and alter schematic memories. On the other hand, a sufficiently activated schema can become available for conceptual processing. A novel feature of the model proposed in Smith and colleagues (1996) is the so-called appraisal register. It monitors appraisal information generated through schematic or conceptual processing, in addition to perceptual information, and generates an emotional reaction. The appraisal register is assumed to model the function of the amygdala, which plays an important role in the elicitation of fear (LeDoux 1996).

Although the proposals by Leventhal and Scherer (1987) and Smith and colleagues (1996) are preliminary and not very detailed, results of research in the areas of social cognition, clinical psychology, and neuropsychology strengthen the evidence for a multilevel appraisal process (Reekum and Scherer 1997; see also Teasdale 1999 for further information about multilevel theories of cognition-emotion relations).
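The division of labor just described can be caricatured in a few lines of Python. The sketch below is only our illustration of the multilevel idea, not an implementation of the Smith et al. (1996) model: a fast schematic pass and a slower conceptual pass each contribute partial appraisal information, and a register simply merges whatever has arrived.

# Illustrative sketch only: two appraisal levels feeding a shared register.
def schematic_appraisal(stimulus, schemas):
    # Fast, automatic: fires if the stimulus matches a stored schema.
    for schema in schemas:
        if schema["pattern"] in stimulus:
            return {"valence": schema["valence"]}
    return {"novelty": 1.0}  # no schema match is itself a novelty appraisal

def conceptual_appraisal(stimulus, goals):
    # Slow, deliberate: reasons about relevance to explicitly represented goals.
    return {"goal_relevance": 1.0 if any(g in stimulus for g in goals) else 0.0}

def appraisal_register(*partial_appraisals):
    # Combines whatever partial appraisal information the levels have produced.
    merged = {}
    for appraisal in partial_appraisals:
        merged.update(appraisal)
    return merged

stimulus = "a large dog blocks the path home"
print(appraisal_register(
    schematic_appraisal(stimulus, [{"pattern": "large dog", "valence": -0.8}]),
    conceptual_appraisal(stimulus, ["path home"])))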
Action Tendencies and Coping
The appraisal of the situation is just one step of the emotion process. For successful adaptation, emotion must have an effect on the actions of the individual. According to Frijda (1986), a ‘‘change in action readiness’’ is the essence of an emotion. As mentioned in the section entitled Emotions and Reactivity (under section 9.1), emotion does not lead to a fixed action, but to the generation of an action tendency. Action tendencies ‘‘are states of readiness to achieve or maintain a given kind of relationship with the environment. They can be conceived of as plans or programs to achieve such ends, which are put in a state of readiness’’ (Frijda 1986, p. 75). Avoidance is an example of an action tendency during fear. With respect to action control, action tendencies have the feature of ‘‘control precedence.’’ However, whether an action tendency actually leads to action depends on feasibility tests and the monitoring of progress.

Lazarus (1991) stresses that especially in humans, action tendencies do not automatically lead to action. He emphasizes the importance of coping, which ‘‘consists of cognitive and behavioral efforts to manage specific external or internal demands (and conflicts between them) that are appraised as taxing or exceeding the
resources of the person’’ (p. 112). According to Lazarus, an action tendency is an innate biological impulse, while ‘‘coping is a much more complex, deliberate, and often planful psychological process’’ that ‘‘draws heavily on appraisals about what is possible, likely to be effective in the specific context, and compatible with social and personal standards of conduct (e.g., display rules, action rules)’’ (p. 114). Coping processes can augment or inhibit the innate action tendencies.

To summarize, although Lazarus (1991) agrees with Frijda (1986) that flexible action tendencies are characteristic of emotion, he does not conceive of them as plans, in contrast to Frijda’s definition of action tendencies cited above. For Lazarus, plans are involved in coping, not in the generation of action tendencies.
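A toy Python sketch (our own simplification, not drawn from Frijda or Lazarus) may help fix the distinction: an appraisal first yields an action tendency with control precedence, and a separate coping step may then inhibit or replace it, for example under a display rule.

# Hedged toy illustration of action tendency vs. coping.
def action_tendency(appraisal):
    # Innate impulse: a sufficiently threatening appraisal readies avoidance.
    if appraisal.get("threat", 0) > 0.5:
        return {"tendency": "avoidance", "control_precedence": True}
    return None

def cope(tendency, context):
    # Deliberate regulation: e.g., a display rule may suppress flight in public.
    if tendency and context.get("in_public") and tendency["tendency"] == "avoidance":
        return {"tendency": "stay and monitor", "control_precedence": False}
    return tendency

print(cope(action_tendency({"threat": 0.8}), {"in_public": True}))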
9.3
Research on Agent Architectures and Emotion Synthesis
Wooldridge and Jennings (1995) as well as Jennings, Sycara, and Wooldridge (1998) provide thorough reviews of existing agent architectures. Here, only the distinction between deliberative, reactive, and hybrid architectures is briefly highlighted.

Wooldridge and Jennings (1995, p. 24) define a deliberative agent architecture ‘‘to be one that contains an explicitly represented, symbolic model of the world, and in which decisions (for example about what actions to perform) are made via logical (or at least pseudo-logical) reasoning, based on pattern matching and symbolic manipulation.’’ The characterizing feature of a deliberative agent is the ability to plan. Pfeifer (1996) reviews the main problems of the symbol-processing approach (e.g., the well-known symbol grounding and frame problems). These problems, together with theoretical results by Chapman (1987) about the unfeasibility of the planning approach in practical applications, led to the emergence of new AI (‘‘behavior-based AI,’’ ‘‘situated AI’’). Brooks (1986, 1991a, b), one of the main proponents of this new paradigm, argued that intelligence is not a property of disembodied systems, but a result of the interaction of an embodied agent with its environment. Further, he designed the subsumption architecture that does not make use of any symbolic representations or reasoning. Rather, the architecture is a collection of task-accomplishing behaviors that are implemented as augmented finite state machines arranged in a hierarchy. Higher layers
represent more abstract behaviors and can suppress the output of lower layers. The subsumption architecture is the prototypical example of a reactive agent architecture.

Despite the merits of reactive architectures, Jennings, Sycara, and Wooldridge (1998) enumerate a number of disadvantages. For example, the fact that overall behavior emerges from the interaction of simple behaviors implies that there is no principled methodology for engineering agents to fulfill specific tasks. As a result, so-called hybrid architectures have been developed that combine aspects of both deliberative and reactive architectures. Typically, hybrid architectures consist of a number of layers that are arranged hierarchically. Very common is a hierarchy of three layers. For example, the 3T architecture (Bonasso et al. 1997) consists of the layers ‘‘reactive skills,’’ ‘‘sequencing,’’ and ‘‘deliberation.’’ The reactive skills are coordinated by a skill manager (Yu, Slack, and Miller 1994). At the sequencing layer, the reactive action packages (RAPs) system (Firby 1989) is used for activating and deactivating ‘‘sets of skills to create networks that change the state of the world and accomplish specific tasks’’ (Bonasso et al. 1997, p. 238). The deliberation layer consists of the adversarial planner (AP) (Elsaesser and Slack 1994). Other examples of hybrid architectures are TouringMachines (Ferguson 1992, 1995) and Cypress (Wilkins et al. 1995), which combines a planner with a modification of the procedural reasoning system (PRS) (Georgeff and Lansky 1987).
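As a minimal, purely illustrative skeleton (not 3T itself, whose layers are far more elaborate), a three-layer hybrid agent can be pictured as a slow planner expanding tasks that a sequencing layer maps onto fast reactive skills; all names below are hypothetical.

# Hypothetical three-layer hybrid agent in the spirit of skills / sequencing /
# deliberation; it is not the 3T code.
class HybridAgent:
    def __init__(self, skills, task_library, planner):
        self.skills = skills        # reactive skills layer
        self.tasks = task_library   # sequencing layer (RAP-like task methods)
        self.planner = planner      # deliberation layer

    def step(self, percept, goal):
        plan = self.planner(goal, percept)        # slow, infrequent deliberation
        for task in plan:
            for skill_name in self.tasks[task]:   # expand each task into skills
                self.skills[skill_name](percept)  # fast, continuous execution

agent = HybridAgent(
    skills={"move": lambda percept: print("moving given", percept)},
    task_library={"reach-door": ["move"]},
    planner=lambda goal, percept: ["reach-door"],
)
agent.step({"obstacle": False}, "leave room")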
Nowadays, it is commonly recognized that research on functional process models of emotion synthesis requires whole agent designs (Picard 1997; Sloman 1997; chapters 7 and 3 of this volume). Various approaches have been adopted to this end, adopting guidelines from, for example, ethology, neurophysiology, and cognitive and social science (Loyall and Bates 1993; Blumberg 1994, 1997; Balkenius 1995; Reilly 1996; Cañamero 1997; Hayes-Roth, van Gent, and Huber 1997; Rousseau and Hayes-Roth 1997; Loyall 1997; Velasquez and Maes 1997; Wright 1997; Velasquez 1999; Martinho and Paiva 1999; Cañamero and Petta 2001; Petta and Cañamero 2001). In the following subsection, we briefly discuss the affective reasoner, a representative early appraisal-based architecture.
The Affective Reasoner as Appraisal-Based Architecture
The affective reasoner (Elliott 1992, 1993; Elliott, Rickel, and Lester 1997) is an architecture for abstract, domain-independent emotion reasoning. It uses a descriptive process model of emotion based on a cognitive theory of emotion-eliciting conditions (Ortony, Clore, and Collins 1988; O’Rorke and Ortony 1994) that defines 26 (originally 24) discrete emotion types. The ‘‘rudimentary personalities’’ engendered by the affective reasoner architecture comprise an ‘‘interpretive personality component’’—providing individuality with respect to their interpretation of situations—and a ‘‘manifestative personality component’’—conferring individuality with respect to the way agents express or manifest their emotions (cf. Elliott, chapter 8, and Ortony, chapter 6, this volume). An agent’s interpretive personality component is defined by means of a hierarchical ‘‘goals, standards, and preferences’’ (GSP) database; its manifestative personality component is encoded in an action database.

Processing flows from some initiating event in a simulated world through emotion and action generation to the final stages that simulate observation by other agents. Initiating situations are ‘‘construed’’ to fall into one of the eliciting condition classes: goal-relevant events; acts of accountable agents; attractive or unattractive objects; or combinations of the first three categories. In this way, each successful construal instantiates an emotion eliciting condition (EEC) relation structure, a nine-tuple encoding of the characteristics of the situation relevant for further processing
derived from the six-slot structure employed in the BORIS system (Dyer 1983). As the affective reasoner covers only a finite set of discrete emotions, the mapping of EECs to emotions involves some ad-hoc solutions to cope with a number of problems caused by the fact that often this mapping would not be unique but may reference multiple, and at times conflicting, emotions.

Instantiated emotions are mapped to actions by means of an emotion manifestation lexicon. For each of the original twenty-four emotion classes, seventeen action response categories are defined, comprising: somatic responses; behavioral responses; communicative responses; evaluative responses; and cognitive responses such as initiation of plans. For deployment in a given domain, a few (about three, in the example given in the Ph.D. thesis; Elliott 1992) actual response actions (ranked by ‘‘intensity’’) are to be defined for each action response category. The lexicon thus comprises over a thousand individual actions to be defined for each application domain. In addition, conflict sets have to be defined to explicitly rule out incompatible manifestations, which may be triggered by co-occurring emotions.

The affective reasoner also includes support for social emotions, by virtue of a ‘‘concerns of other’’ database, which at first is initialized with a partial, system-wide GSP database and extended via case-based abductive reasoning during the run time of the program. Modeling of other agents is limited to at most ‘‘twice-removed’’ cases (i.e., the representation of another agent’s representation of the concerns of those agents that are important to it).
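To make the processing flow concrete, the following Python sketch caricatures the pipeline just described: construal against a GSP database yields an eliciting-condition record, which maps to an emotion type, which is then looked up in a (here drastically shrunken) manifestation lexicon. None of these structures reproduces the affective reasoner's actual encodings; they are our illustrative stand-ins.

# Hedged caricature of the affective reasoner's pipeline:
# construal -> eliciting conditions -> emotion -> manifestation lexicon.
def construe(situation, gsp):
    if situation["event"] in gsp["goals"]:
        return {"class": "goal-relevant event", "desirability": "desirable",
                "prospect": situation.get("prospective", False)}
    return None

def elicit(eec):
    if eec and eec["class"] == "goal-relevant event":
        return "hope" if eec["prospect"] else "joy"
    return None

MANIFESTATION_LEXICON = {
    "joy": {"somatic": "smile", "communicative": "share the news"},
    "hope": {"cognitive": "form a plan", "communicative": "express optimism"},
}

eec = construe({"event": "grant funded", "prospective": True},
               {"goals": ["grant funded"], "standards": [], "preferences": []})
emotion = elicit(eec)
print(emotion, MANIFESTATION_LEXICON.get(emotion))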
Discussion
The following is a summary of some noteworthy points.

There exist robust, widely adopted solutions to fundamental aspects of the realization of situated software agents in virtual environments, in terms of hybrid layered architectures and organization of behavioral components, substantially facilitating the overall task of implementing such systems.

Demonstrating the validity of Pfeifer’s view (above), abstract, domain-independent architectures such as the affective reasoner run into problems when deployed in interactive virtual scenarios, as also openly recognized by Clark Elliott (1994): to be effective in such applications, affective reasoning has to have appropriate access to pertinent information about and from the world, and has
to be able to influence the overt external behavior as well as the internal information processing of an agent. The only means to achieve this is to properly integrate emotional competence into an architecture, which in turn has to be adapted to the environment in which the agent is situated (as presented, e.g., in Gratch 2000). J. Gratch proposes a generic appraisal mechanism that evaluates a planning process, focusing on the emotional significance of events as they relate to plans and goals. The most important of the small set of variables used to this end is desirability, defined in terms of the contribution of memory facts (the appraisal evaluates the plan memory only and does not consider external events) to plan achievement (i.e., all that poses a threat to the current plan is undesirable). Only two intensity variables are considered: probability of goal achievement and emotional goal importance. Even so, successful coupling of the affective reasoner at a high level was demonstrated, for example, in educational systems (Elliott, Rickel, and Lester 1999).

Shallow approaches with a reified representation of a finite number of discrete emotional states, such as the affective reasoner, demonstrate the problems of brittleness and consistency well known from the traditional research area of expert systems. Reification of emotions as identifiable system components and routing of all processing through these entities engenders the problem of how to proceed from these emotions for further system processing, leading to the adoption of ad-hoc constructions of dubious validity. Discretizing emotions all but rules out smooth and fine-grained system behavior; it is precisely this kind of capability that is becoming of increasing importance, as the information infrastructure in general and virtual environments in particular increasingly assume the characteristics of persistent social places, in which long-term relationships develop (see also chapters 8 and 12 in this volume).

Applications of functional emotion theory can provide an alternative approach for tackling at least some of the problems mentioned, in good part by virtue of seeing emotions as processes, as characteristics of a whole architecture, which pervasively influence the overall behavior of the agent. TABASCO (tractable appraisal-based architecture framework for situated cognizers) is an architecture framework for software agents situated in virtual environments that attempts to take
these points into account. It will be described in the following section.

[Figure 9.1: The TABASCO architecture framework. Two three-level components, Perception and Appraisal (sensory, schematic, conceptual) and Action (motor, schematic, conceptual), are coupled to the environment; the Appraisal Register mediates between them, and an Action Monitoring component feeds back from action to perception and appraisal.]
9.4
The TABASCO Architecture
Outline
The ultimate goal of our research is the integration of the emotion process within the architecture of a situated software agent. In TABASCO, we model emotions not as reified entities, but as adaptive processes related to the interaction of an agent with its environment. In particular, we need to model the steps that were identified in section 9.2 as characteristic of the emotion process: the appraisal process and the generation of action tendencies as well as coping activities. The aim of this section is to provide an outline of the TABASCO framework shown in figure 9.1.

The basic idea behind TABASCO is that the distinction between sensorimotor, schematic, and conceptual processing does not only apply to appraisal, but also to the generation of action, as proposed originally by Leventhal in his ‘‘perceptual-motor theory of emotion’’ (see, e.g., Leventhal 1984). So the two main components of the agent architecture, ‘‘perception and appraisal’’ as well as ‘‘action,’’ are three-level hierarchies, with sensory processing as the lowest level of the ‘‘perception and appraisal’’ component and motor mechanisms as the lowest level of the ‘‘action’’ component. In the following, the components of TABASCO are described in more detail.
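Before turning to the individual components, a skeletal Python sketch may convey the overall shape; it is our own illustration of the outline (and of the appraisal register described below), not TABASCO code, and every stub in it is a placeholder.

# Our own skeletal illustration of the outline (not TABASCO code): two
# three-level hierarchies coupled through an appraisal register.
class Tabasco:
    def __init__(self):
        self.perception = {"sensory": self.sense,
                           "schematic": self.match,
                           "conceptual": self.reason}
        self.action = {"motor": self.motor,
                       "schematic": self.sequence,
                       "conceptual": self.plan}

    # Perception-and-appraisal levels (placeholders).
    def sense(self, stimulus):  return {"sudden": "loud" in stimulus}
    def match(self, stimulus):  return {"schema_hit": "dog" in stimulus}
    def reason(self, stimulus): return {"goal_threat": "blocks" in stimulus}

    # Action levels (placeholders).
    def motor(self, command):   print("motor:", command)
    def sequence(self, task):   self.motor("execute a step of " + task)
    def plan(self, goal):       self.sequence("a plan for " + goal)

    def appraisal_register(self, stimulus):
        # Combine appraisal outcomes from all perception levels, then
        # influence the action component at the appropriate level.
        appraisal = {}
        for level in ("sensory", "schematic", "conceptual"):
            appraisal.update(self.perception[level](stimulus))
        if appraisal.get("sudden"):
            self.action["motor"]("startle")       # immediate pre-emotion response
        elif appraisal.get("goal_threat"):
            self.action["conceptual"]("removing the obstacle")  # initiate coping

Tabasco().appraisal_register("a large dog blocks the path")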
PERCEPTION AND APPRAISAL This component processes environmental stimuli and models the appraisal process. The input must be evaluated according to the appraisal criteria enumerated in the section entitled Appraisal Theories of Emotion (under section 9.2). Frijda (1993) argues that it is not necessary to model all appraisal checks explicitly. Rather, most of the appraisals ‘‘can be seen as implied by the very perceptual processes by which stimulus events are taken in’’ (Frijda 1993, p. 380). For example, it is not necessary to model a ‘‘novelty check’’ explicitly—the appraisal of novelty is implied by the fact that the input does not match any schema.

Leventhal and Scherer (1987) propose that at each level of processing, the input is evaluated with respect to all appraisal criteria. But they do not provide empirical evidence for this claim. In our view, it may well be that an appraisal does not occur at all levels. For example, it is difficult to see how sensory processing is related to standards or norms.

The implementation of the several evaluations made during appraisal is still an open question. But research by social psychologists, especially in the field of social cognition, may help to model the processes involved in these evaluations (see Fiske and Taylor 1991 for an overview of research on social cognition). For example, causal attribution has been investigated by social psychologists for decades. Findings of attribution theorists are relevant for the appraisal of agency or responsibility. Another research topic is the influence of social schemata on encoding, memory, and inference in social situations. The self is also an important research topic in social psychology. Self schemata have been proposed as representations of important attributes of the self. Further, research on self-regulation is relevant for modeling the appraisal process (e.g., for modeling the evaluation of one’s coping potential or the comparison of oneself to standards and norms).

The processes at each level can be summarized as follows: The sensory level consists of feature detectors for the detection of, for example, sudden, intense stimulation or the pleasantness of a stimulus. At the schematic level, the input is matched with schemata, especially with social and self schemata. Smith and colleagues (1996) propose the implementation of the schematic level as an associative network with a spreading activation mechanism. There are a number of associative network models of memory (e.g., ACT* by
Anderson 1983). In addition, work on semantic networks in the fields of AI and information retrieval can guide the design of the schematic level. The conceptual level involves abstract reasoning and inference based on propositional knowledge and beliefs. The inferences involved in causal attribution will be modeled at this level, as well as the evaluation of one’s actions in relation to norms or one’s self-ideal. Therefore, models of self and others will be placed at the conceptual level. Ferguson’s (1992, 1995) use of belief-desire-intention (BDI) models at the ‘‘Modeling Layer’’ of his hybrid TouringMachines architecture as well as Elliott’s ‘‘Concerns of Other’’ database (1992; see section 9.3) are relevant for the design of the conceptual level.
ACTION
The approach is to design and implement this component by drawing upon already existing hybrid agent architectures. Suitable candidates include, for example, the 3T architecture (Bonasso et al. 1997) mentioned in section 9.3. The conceptual level basically consists of a planner. According to Lazarus’s (1991) characterization of coping as a planful process, coping is implemented at the conceptual level. In 3T, the sequencing layer based on the RAP system (Firby 1989) can be seen to correspond to schematic processing. In fact, Earl and Firby (1997) merge ideas from the RAP system with Drescher’s schema mechanism (Drescher 1991). Action tendencies may be implemented using designs such as the RAP system. Although Frijda (1986) uses the term ‘‘plan’’ to characterize an action tendency, he also uses the term ‘‘flexible program’’ and states: ‘‘Flexible programs are those that are composed of alternative courses of action, that allow for variations in circumstances and for feedback from actions executed’’ (p. 83). This notion of a flexible program is very close to the functionality of a RAP, which is designed to react to feedback and ‘‘is simply a description of how to accomplish a task in the world under a variety of circumstances using discrete steps’’ (Bonasso et al. 1997, p. 240). The reactive skills of the 3T architecture correspond to sensorimotor processing. In figure 9.1, the lowest level of the action component actually contains only the motor commands of the reactive skills, while the sensory input is processed at the lowest level of the perception and appraisal component. One important class of motor mechanisms
relevant for modeling the emotion process is that of ‘‘expressive’’ motor mechanisms, such as facial and body expressions. These expressive behaviors can be implemented at the motor level of the action component.
APPRAISAL REGISTER
This component mediates between the perception and appraisal and the action components. The notion of an appraisal register is derived from Smith and colleagues (1996) (cf. the section entitled Modeling the Appraisal Process—under section 9.2). It detects and combines the appraisal outcomes from the sensory, schematic, and conceptual levels of the perception and appraisal component. Then it influences the action component based on the appraised state of the world. This influence may have several forms: First, it may lead to the immediate execution of a motor program, such as a startle or an orienting response. These responses may not be considered as full emotions, but as ‘‘preemotions’’ (Lazarus 1991) that enable the agent to gather more information for the appraisal process. Second, an action tendency may be generated by putting a RAP into the state of readiness. Third, a long-term planning process may be initiated in order to cope with the appraised event.
It should be noted that it is not necessary to wait for the appraisal outcomes of all levels of processing in order to influence the action component. A startle response occurs immediately after the sensory input has been processed, not after slow conceptual processes. It is therefore a matter of discussion whether the appraisal register is actually necessary. It may be possible to design a three-level architecture in which, at each level of processing, the appraisal process and the influence on action are directly coupled. On the other hand, the appraisal register has the advantage that the interplay between coping, action tendencies, and expressive behavior can be better coordinated.
ACTION MONITORING
Frijda (1993, p. 381) notes that a ‘‘major source of appraisals is the monitoring of action planning and action execution.’’ For example, the exhaustion of the whole repertoire of plans and actions without success leads to the appraisal that no coping potential is available. Therefore, TABASCO contains an action monitoring component that monitors the planning and execution processes in the action
component and sends the results of the monitoring process to the perception and appraisal component, where they are integrated within the appraisal process.
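The interplay just described can be summarized in a minimal Python sketch. It is not part of any TABASCO implementation; all class names, signals, and thresholds are illustrative assumptions made for this outline only.

from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional

@dataclass
class AppraisalOutcome:
    """Partial appraisal produced at one processing level (names are illustrative)."""
    level: str                       # "sensory", "schematic", or "conceptual"
    relevance: float = 0.0           # how concern-relevant the stimulus appears
    details: Dict[str, Any] = field(default_factory=dict)

class PerceptionAndAppraisal:
    """Three-level 'perception and appraisal' component (sketch)."""
    def appraise(self, stimulus: Any, monitor_report: Optional[Dict[str, Any]] = None) -> List[AppraisalOutcome]:
        outcomes = [
            # Sensory level: simple feature detectors (e.g., sudden, intense stimulation).
            AppraisalOutcome("sensory", relevance=1.0 if stimulus == "sudden_noise" else 0.1),
            # Schematic level: novelty is implied when no schema matches the input.
            AppraisalOutcome("schematic", details={"schema_match": None}),
        ]
        if monitor_report is not None:
            # Conceptual level: action monitoring feeds back into appraisal (e.g., coping potential).
            outcomes.append(AppraisalOutcome(
                "conceptual",
                details={"coping_potential": monitor_report.get("plans_left", 0) > 0}))
        return outcomes

class ActionComponent:
    """Three-level action component: planner / RAP-like schemata / motor mechanisms (sketch)."""
    def execute_motor_program(self, name: str) -> None:
        print("motor program:", name)
    def ready_action_tendency(self, rap: str) -> None:
        print("RAP put into readiness:", rap)
    def start_planning(self, goal: str) -> None:
        print("planner invoked for goal:", goal)

class AppraisalRegister:
    """Combines level-wise appraisal outcomes and influences the action component."""
    def mediate(self, outcomes: List[AppraisalOutcome], action: ActionComponent) -> None:
        # No need to wait for all levels: a strong sensory outcome acts at once (a 'pre-emotion').
        for o in outcomes:
            if o.level == "sensory" and o.relevance > 0.8:
                action.execute_motor_program("startle")               # immediate motor program
            elif o.level == "schematic" and o.details.get("schema_match") is None:
                action.ready_action_tendency("approach_or_avoid")     # action tendency via RAP readiness
            elif o.level == "conceptual" and not o.details.get("coping_potential", True):
                action.start_planning("cope_with_appraised_event")    # long-term coping

if __name__ == "__main__":
    pa, act, register = PerceptionAndAppraisal(), ActionComponent(), AppraisalRegister()
    # One cycle; a full system would feed the action monitoring report back into the next appraisal round.
    register.mediate(pa.appraise("sudden_noise", monitor_report={"plans_left": 0}), act)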
Situated Cognizers
The view of emotion as a flexible adaptation mechanism is fully compatible with the concept of a situated agent. Agre (1995), one of the proponents of situated AI, argues that the interaction between an agent and its environment should guide the analysis of living agents and the design of artificial ones. In accordance with this view, Lazarus (1991, p. 29) identifies the ‘‘person-environment relationship’’ as the ‘‘basic arena of analysis for the study of the emotion process.’’ Appraisal is the evaluation of this relationship, while action tendencies and coping are aimed at changing this relationship (Frijda 1986 also uses the term ‘‘relational action tendencies’’).
A guideline for modeling appraisal in a situated agent may be the concept of affordances. An affordance is defined by Gibson (1979, p. 127) as ‘‘what it offers the animal, what it provides or furnishes, either for good or ill.’’ The general idea is that a perceptually attuned animal actively perceives meaning in the environment without further interpretative cognitive processing. So there is a direct coupling between perception and action characteristic for situated agents. The field of ecological psychology is based on this idea. Both Frijda (1993) and Lazarus (1991) emphasize that appraisal may be based on the perception of affordances. An example of a system using a concept closely related to affordances is the PARETO plan execution system by Pryor (1994). PARETO is based on the RAP system (Firby 1989) and uses ‘‘reference features’’ for recognizing and reasoning about opportunities.
Baron (1988; McArthur and Baron 1983) applies the affordance concept to social perception (e.g., to emotion perception, impression formation, and causal attribution). He establishes a ‘‘dual-mode theory of social knowing’’ (Baron 1988), in which the ecological view and the traditional cognitive view are treated not as contradictory, but as complementary. The ecological view emphasizes the direct detection of information given in the stimulus configuration. In contrast, the cognitive view emphasizes the knowledge representations and inference processes people use to elaborate and interpret the stimulus.
We apply this dual-mode view to the nature of appraisal. Although appraisal may be based on the direct perception of (social) affordances, it nevertheless involves cognizing. Chomsky (1980) introduces this term to denote persons’ relations to their knowledge. Cognizing means having access to knowledge that is not necessarily accessible to consciousness and does not consist in warranted or justified belief. A typical example of cognized knowledge is a speaker’s knowledge of grammar. As described in section 9.2, the appraisal process is not necessarily conscious or voluntary, but nevertheless based on knowledge (e.g., represented in the form of schemata). So, besides the direct perception of affordances, cognizing is a central part of the appraisal process. We chose the term situated cognizers to emphasize the importance of both situatedness and cognizing for the view of emotion as a flexible adaptation mechanism adopted in this chapter.
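As a rough illustration of this dual-mode view (not an implementation from the chapter; the tags, schemata, and routine below are hypothetical), an appraisal function might first attempt direct, affordance-like pickup of meaning and fall back to schema-based cognizing only when that fails:

from typing import Any, Dict, Optional

# Direct, affordance-like tags attached to percepts (illustrative values only).
AFFORDANCE_TAGS: Dict[str, str] = {
    "smiling_visitor": "approachable",
    "raised_fist": "threatening",
}

# 'Cognized' knowledge: schemata the agent can use without conscious access to them.
SOCIAL_SCHEMATA: Dict[str, Dict[str, Any]] = {
    "greeting": {"expected_response": "greeting_back", "norm_violated_if_ignored": True},
}

def appraise(percept: str, context: Optional[str] = None) -> str:
    """Dual-mode appraisal sketch: direct affordance pickup first, schema-based elaboration second."""
    # Ecological mode: meaning detected directly in the stimulus configuration.
    if percept in AFFORDANCE_TAGS:
        return f"direct: percept affords '{AFFORDANCE_TAGS[percept]}' behavior"
    # Cognitive mode: elaborate the stimulus against stored schemata and norms.
    if context in SOCIAL_SCHEMATA:
        schema = SOCIAL_SCHEMATA[context]
        if percept != schema["expected_response"] and schema["norm_violated_if_ignored"]:
            return "cognized: expectation violated, appraise as norm-relevant"
    return "cognized: no schema matched, appraise as novel"

if __name__ == "__main__":
    print(appraise("smiling_visitor"))
    print(appraise("walks_away", context="greeting"))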
Tractability
An interesting and important aspect concerns the tractability of the architecture for situated cognizers, particularly with respect to the design of flexible representation schemes that are perceptually tractable in the sense expounded by Ian Horswill (1997). Horswill specifically addresses the problem of maintenance of complex world models with limited resources—that is, ‘‘the perception system’s problem of knowing how and when to allocate its internal resources to updating different components of the world model’’ (p. 290). While Horswill envisages solutions based on mixed reaction and deliberation, he does not hesitate to express reservations with respect to simple layering approaches. Instead, he finds that ‘‘In practice, hybrid systems work well when they are given problems in which the deliberative components (1) are given little to do and (2) work only on problems, such as topological path planning, in which the relevant portions of the world model are updated slowly or not at all. . . . I believe that the solution to perceptual and combinatoric problems lies not in adjoining traditional and reactive systems but in finding useful intermediate points in between’’ (p. 290).
It is our intention to devise TABASCO as a sibling of the 3T architecture that Horswill cites among the examples of existing ‘‘useful intermediate points’’: smart and programmable reactive systems (e.g., sensorimotor mechanisms) that are allowed to spawn
off graph search algorithms or resource allocators (in our case, e.g., coping strategies, action tendency generators, explicit local plans) as primitive operations when necessary. This design methodology is also in accordance with the bottom-up animat/artificial life approach mentioned in section 9.3.
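A hedged sketch of this design stance, with an entirely invented toy domain, might look as follows: a reactive skill handles the common case directly and spawns a graph search as a primitive operation only when its immediate coupling with the world breaks down.

import heapq
from typing import Callable, Dict, List, Tuple

GRAPH: Dict[str, List[Tuple[str, int]]] = {   # toy topological map (illustrative)
    "stage_left": [("centre", 1)],
    "centre": [("stage_right", 1), ("stage_left", 1)],
    "stage_right": [("centre", 1)],
}

def shortest_path(start: str, goal: str) -> List[str]:
    """Graph search spawned as a primitive operation, only when the reactive layer asks for it."""
    frontier: List[Tuple[int, str, List[str]]] = [(0, start, [start])]
    seen = set()
    while frontier:
        cost, node, path = heapq.heappop(frontier)
        if node == goal:
            return path
        if node in seen:
            continue
        seen.add(node)
        for nxt, step in GRAPH.get(node, []):
            heapq.heappush(frontier, (cost + step, nxt, path + [nxt]))
    return []

class ReactiveSkill:
    """A smart, programmable reactive behavior that normally just reacts,
    but may spawn a deliberative operation (here: graph search) when needed."""
    def __init__(self, deliberate: Callable[[str, str], List[str]]):
        self.deliberate = deliberate
    def step(self, here: str, goal: str, goal_visible: bool) -> str:
        if goal_visible:
            return f"move directly toward {goal}"            # pure reaction
        plan = self.deliberate(here, goal)                    # deliberation as a primitive op
        return f"follow planned route {plan}" if plan else "signal cognizant failure"

if __name__ == "__main__":
    skill = ReactiveSkill(shortest_path)
    print(skill.step("stage_left", "stage_right", goal_visible=False))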
A Software Architecture Framework
At least at the outset, TABASCO is intended as an architecture for software agents situated in virtual environments (but see the concluding section for forthcoming efforts involving hardware). We can put forward the following points in support of our ‘‘software-only’’ approach: The discussion between proponents of software and hardware approaches (e.g., the dispute between Etzioni and Brooks) has been settled, with the conclusion that both fields offer their respective rich bag of research problems (see Hexmoor et al. 1997). Bridging and connecting the two fields of software and hardware agent research, there is an important common stream of research concerned with software architectures that are increasingly used for hardware and software agents alike; witness the convergence on layered—often three-level—architectures (Hexmoor 1997). Of importance, many potential reservations from an artificial life point of view can be done away with by taking care to provide an environment that on one hand is sufficiently rich in detail (McFarland and Bösser 1993; Agre and Horswill 1997), and on the other hand comprises software agents modeled at an appropriate level of internal (information processing) and external (bodily) complexity. Finally, the result of this research could always be valued as a necessary and thus worthwhile first step toward a realization of embodied solutions (Gat 1995). In any case, it also remains an established fact that the development of hardware embodiments is substantially costlier in terms of required infrastructure, variety of required expertise, and project time.
Implementations
Naturally, a high aim such as that pursued with TABASCO requires continuous input from empirical investigations. These are aimed not only at assessing the validity of the theoretical design choices
in terms of practicality, sufficiency, and performance, but, significantly, also at acquiring a deeper, more immediate, first-hand understanding of the very domain of interacting in a dynamic environment as a situated being (cf. the difference between having a piece of knowledge, and actually using it—commented upon in Sousa 1996). First experiences with a straightforward linear implementation of the appraisal process described in Frijda (1986) were gathered with the development of FORREST, an adaptation of the Colin MUDbots produced at CMU by Michael L. Mauldin (1994), testing the effect of this model on the believability and concern satisfaction ability of an agent in a text-based, real-time virtual reality (Macmahon and Petta 1997; Petta et al. 2000). Shortly afterward, a unique opportunity to build upon the early promising experiences came about when a permanent interactive exhibit was commissioned for the reopening of the Vienna Museum of Technology in 1998. We summarize some salient features of the Invisible Person (Petta 1999; Petta et al. 1999, 2000) in the following, as the installation has undergone its first major overhaul. Most recently, a closer look was taken at the modeling of the interplay between social norms and emotions described in section 9.1 as a basis for inclusion in further developments in synthetic agent control.
The Invisible Person
As an instantiation of the TABASCO framework, the control architecture driving the Invisible Person (figure 9.2), an autonomous synthetic character inhabiting an ALIVE-style (Darrell et al. 1994) immersive virtual environment, was developed around a realization of a grounded emotion process as characterized in cognitive appraisal theory. A reduced set of four persistent concerns provides the ‘‘caring’’ basis for motivation; the agent is self-driven by the dynamics in concern satisfaction states induced by the indirect competition among the concerns: rest (= preventing physical exhaustion), interaction (= establishing contact, e.g., by greeting and subsequent engagement in various social behaviors, including taking photos of the users via a virtual hand camera and now also simple games such as tic-tac-toe played on a playfield projected onto the stage floor), competence/control (= managing to achieve more articulate and longer-lasting interaction behaviors, overcoming difficulties to bring about desired relational states), and variation (with respect to enacted interactions and interaction partners).
Figure 9.2 Children playing with the Invisible Person exhibit in the Vienna Museum of Technology.
The current action tendencies are continuously directly mapped onto a dynamic pulsing texture covering the body of the synthetic actor. Various features of this texture, such as coloring, speed, and direction of the pulses (toward or from the limbs), encode different aspects of the current action tendencies, such as valence and intensity. As can be asserted now after almost three years of continuous deployment, there is strong evidence for the efficacy and significance of the immediate display of this information, even if via such an artificial channel. Visitors readily pick up the consistent regularities between these quickly reacting dynamics and ensuing corresponding behavioral changes as clues for the segmentation and interpretation of the ongoing interaction.
BEHAVIOR LEVELS
Conceptually, the repertoire of the Invisible Person is structured along a number of levels, only the bottom three of which are reflected in actually implemented parts (figure 9.3). At the topmost level reside the concerns, and the main persistent goal of entertaining the visitors of the installation (i.e., engaging in entertaining, varied forms of interaction ranging from looser activities, such as dancing, to short-term interactions, as in following or chasing each other or having a picture of the user taken by the Invisible Person; to longer-lasting rule-controlled scenarios implemented in the form of simple games). This given has pervasive influence on design-time decisions such as selection, differentiation, and implementation of each of the lower activity levels described next, but also, for example, the kinds of perceptions provided.
Figure 9.3 The system architecture of the Invisible Person exhibit, consisting of vision (identification of user location, posture, and gesture), agent control (appraisal based on contexts, concerns, and user input; action tendency and concern update; segmentation; action selection; motion sequencing; texture parameters), and animation (motion blending, texture generation, video image acquisition, real-time rendering and compositing) subsystems.
The second conceptual design level is formed by acts (Handlungen; Weizsäcker 1993), abstract specifications of sequences of actions that are assigned a specific semantics in their entirety (e.g., the act of greeting). Acts operate upon conceptualizations of the domain, and their semantics are specified in terms of bringing about (or maintaining, or avoiding) specific relational states of the individual (the Invisible Person) and its environment (formed by the stage and any visitors). Examples include ‘‘idling,’’ ‘‘greeting,’’ ‘‘interacting,’’ or ‘‘waiting’’ acts.
At the next level, behavioral schemata are instantiations of acts in the world. Even though still abstract insofar as a behavioral schema is implemented in terms of further subcomponents, behaviors are seen as the unit, as a sequence of actions, that brings about the defining characteristics of an act in the world. Which behaviors are suited to realize a specific act depends on internal and external conditions; acts and behaviors do not stand in direct correspondence. For instance, handshaking could be an appropriate behavior to carry out the act of ‘‘greeting,’’ as could bowing. At the same
time, bowing may also serve as a gesture of submission. Behaviors are structured, in the sense of being composed out of distinct components, which in turn can be other behaviors or actual executable actions—packaged as discussed below.
SOME LESSONS FROM COGNITIVE ETHOLOGY
For the next lower level, we tap into cognitive ethology, borrowing from Janet Halperin’s observations on the nature of fixed action patterns (FAPs) (Halperin 1995). In ethology, FAPs are quantifiable, recognizable, and repeated patterns within what at first would appear to be a continuous flow of action. The hypothesis of the existence of such discrete behavior modules also in humans found support particularly in the domain of facial expressions and related body language—here the constrained range of differing behaviors is seen as crucial for achieving clear communication and for situations requiring fast action such as in fleeing or fighting.
Study of FAPs in animals and humans has uncovered general properties that contribute to solving a range of difficult challenges and provide essential features such as homeostasis, by virtue of such sophisticated features as ‘‘declaratively situated’’ goals (i.e., goals defined in terms of a target perception to be achieved and maintained); orienting components providing corrective movements to counter environmental perturbations; timing properties combining initial persistence with temporal boundedness (with typical FAP durations of a few seconds); context dependency for the effectiveness of releasing stimuli; and modifiability of the repertory of perceptors and behaviors. Complex activities are achieved by chaining of related function patterns. However, disposing of a nontrivial number of different FAPs (and related perceptual releasers, etc.) brings about the problem of an optimal action selection strategy, and this is where emotional states are seen to contribute.
Emotional states in animals are commonly defined by the group of FAPs that are potentiated (i.e., share an increased probability of being enacted) when a given stimulus is presented (cf. also Cañamero, chapter 4 of this volume). Each such set of FAPs thus defines a different discrete emotional state, while FAPs may be shared across states. Similar to the timing properties in FAPs, an emotional state preserves the potentiating of the defining FAPs over a time interval that can exceed by far the duration of the actual confrontation with the eliciting stimulus, which can be a simple sign
stimulus, such as the ‘‘happy face’’ icon, or even simpler sensory inputs like pain, and can be shared by both individual FAPs and emotional states. This persistence of bias toward a one-time decision about probably relevant FAPs reflects the recurrent existence of environmental states of more or less protracted stability. The initial persistence of FAPs serves to ensure or facilitate completion of whole functional programs, while their short duration still allows for rapid behavioral changes according to the stimulus situation. In contrast, high-level, temporally persistent states cannot make yes/no behavioral choices, to avoid inflexibility and slow response to changing conditions.
Although several sets may be potentiated at the same time, there also exist competing FAPs. Mapping of environmental information to a probabilistic bias toward a group of FAPs selected by a combination of evolutionary design and individual ‘‘run-time’’ adaptation is a plausible way to best exploit the capabilities of these two learning variants while deferring the actual selection of action to the latest possible moment. Consequently, emotional states are modifiable to differing degrees: they may change only slightly with respect to form (= what to do) and largely with respect to stimuli (e.g., what to be afraid of).
More complex cognitive assessments are associated with emotional states rather than FAPs, an indication of how the emotional extends more purely reactive FAP-based behavior toward more sophisticated ways of acting in the world. The complexity of certain contexts that determine the effectiveness of associated stimuli, and the corresponding existence of specialized emotional states, indicates the high cognitive demands involved, whether in sheer number of available FAPs or sophistication of required information processing capabilities. Neurophysiological findings (cf. Rolls, chapter 2 of this volume) confirm the fundamental role of cognitive processes in emotional activity.
Within the Invisible Person, the FAP level is formed out of specific sequences of physical activities that acquire domain-level semantics by virtue of their embedding within particular behaviors. Seen the other way around, FAPs form (recyclable) building bricks for the assembly of behaviors, such as moving to a particular location (specified in absolute coordinates or, e.g., relative to a specific user), or orienting toward a direction specified in absolute or relative terms. The Invisible Person implements both the timing properties of FAPs and emotional guidance (as different from immediate action selection; see also
below) via grouping of potentiated behaviors (by expressed action tendency—cf. also the notions of manifestation component and, in particular, emotion response tendencies in Andrew Ortony, chapter 6 of this volume, or the role of the Bayesian hub described in chapter 11 by Gene Ball) and endowing action tendencies with an intrinsic decay dynamic (empirically adapted to the physical environment) toward nominal levels that codetermine the baseline personality. The lowest level in this hierarchy is finally formed by single animations, out of which all of the Invisible Person’s activity in the magic mirror is composed by run-time blending; animations are parameterized in terms of rotation (e.g., to achieve full orientability), translation (e.g., scaling of step length), and execution speed (e.g., throttled with increasing levels of tiredness).
TOWARD TAKING SITUATEDNESS SERIOUSLY
The rather severely impoverished environment of the Invisible Person, particularly with respect to its sensorial capacities (even if recently improved by the installation of a second, vertically mounted camera in addition to the frontal one installed on top of the projection screen, greatly improving the tracking of multiple users), invites us to reconsider the role of sensing/perceiving and execution control in situated agents, along the lines of the proposal by Phil Agre (1995) to overcome the conceptual impasse between planning (i.e., looser coupling with reliance on internal conceptualizations) and reaction (i.e., closer coupling with reliance on sensing and perception). Clearly, under circumstances such as the ones given here, design-time prescriptions of what is of import are of particular relevance. As limited perception can only work in constrained domains, the role of the interface to the world here takes on more of a characteristic of (dis)confirmation or indication (Bickhard 2000) of a particular world state rather than of a provision of a richly differentiated stream of input. In any case, availability of sufficiently rich context (in particular, disprovable expectations) is a precondition for well-differentiated emotional experience (as, e.g., extensively argued in Frijda 1986).
In summary then, it would seem to follow that emotions are not so much about action selection (Cañamero, chapter 4 of this volume) as about monitoring and influencing (and perhaps at best, regulating) interaction, or maybe better still: coordination (Clancey 1997) of the individual’s performance in the environment. This stance
would seem to get further support when considering that the focus on the individual is but one out of many coexisting valid and relevant views of analysis. In relational social activities, the dyad in particular, but also bindings involving more individuals, form important additional viewpoints (and similarly one could move down-/inward toward the levels of agencies within the self; cf. Riecken, chapter 10, and Bellman, chapter 5 in this volume). Along with the appreciation of the existence of these other observational standpoints comes the insight that the individual can necessarily exert at best only partial control over the development of an interaction. We consider it important to stress that in this context not only consummatory behavior (as put forward by Cañamero), but any behavior (i.e., also appetitive behavior) exerts an influence on the environment via the broadcast of discernible affective signals of different degrees of explicitness (ranging from highly specific facial expressions to rather broad discernible traits of overall expressive behavior).
Petta and colleagues (1999; Petta, Pinto-Ferreira, and Ventura 1999) discuss the incremental change in appreciation of the role of the middle tier of the established trionic model, with the empirical realization of the special function of the scheduler in the middle-out model of control (Gat 1998). In this view, the scheduler is the location where important information from both lower-level behaviors—by means of cognizant failures—and spanned long-term planning activities converges. As argued in those publications, there seems to be evidence for a deeper, intimate relationship between these architectural engineering insights and theoretical and empirical findings in emotion research in psychology and the neurosciences. Also for these reasons, in the control architecture of the Invisible Person, the emotion subsystem was interfaced to the middle layer of a trionic architecture (Gat 1998; Hexmoor et al. 1997) (figure 9.4).
THREEFOLD USES OF EMOTIONS
In the Invisible Person, the emotion process serves three purposes: the uptake of concern-relevant stimuli (i.e., those that are of significance for the character’s relational activity); influencing action selection so as to achieve long-term overt behavioral consistency; and driving expressive behavior, reflecting the system’s view of the couplings currently in effect and announcing behavioral choices in the near-term future.
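The grouping of potentiated FAPs by action tendency and the intrinsic decay of tendencies toward personality-defining baselines described above might be sketched roughly as follows (all names, rates, and thresholds are invented for illustration and do not reproduce the installation’s actual parameters):

from dataclasses import dataclass
from typing import Dict, List

@dataclass
class ActionTendency:
    """One action tendency with an intrinsic decay toward a nominal (personality) level."""
    intensity: float
    baseline: float
    decay_rate: float = 0.2          # tuned empirically in the real installation; this value is arbitrary
    potentiated_faps: List[str] = None

    def update(self, dt: float, boost: float = 0.0) -> None:
        # Appraisal-driven boost, then relaxation toward the personality baseline.
        self.intensity += boost
        self.intensity += (self.baseline - self.intensity) * self.decay_rate * dt

def potentiated_behaviours(tendencies: Dict[str, ActionTendency], threshold: float = 0.5) -> List[str]:
    """FAPs grouped under currently strong tendencies get a higher chance of being enacted;
    the actual selection is deferred until the last possible moment."""
    faps: List[str] = []
    for t in tendencies.values():
        if t.intensity >= threshold and t.potentiated_faps:
            faps.extend(t.potentiated_faps)
    return faps

if __name__ == "__main__":
    tendencies = {
        "approach":  ActionTendency(0.9, baseline=0.3, potentiated_faps=["walk_to_user", "wave"]),
        "avoidance": ActionTendency(0.2, baseline=0.1, potentiated_faps=["step_back"]),
    }
    for _ in range(5):                       # stimulus gone: tendencies relax toward the baseline
        for t in tendencies.values():
            t.update(dt=1.0)
        print(potentiated_behaviours(tendencies))

The persistence of the boosted tendency over several cycles, followed by its relaxation, is what gives the overt behavior its longer-term consistency without committing to a single yes/no choice.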
Figure 9.4 The trionic control architecture of the Invisible Person: a behavior library; a communications layer exchanging vision data and animation status for parameterized animation selections and emotion system state; a scheduler responsible for event appraisal, behavior activation, and FAP selection and execution monitoring; an emotion subsystem handling concern maintenance and action tendency dynamics; and fixed action packages with start/context/ending conditions, idle/cleanup procedures, and cognizant failures.
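A toy sketch of this middle tier (again with invented names; the real scheduler, FAPs, and emotion subsystem are far richer) can convey how cognizant failures reported by fixed action packages feed the emotion subsystem that the scheduler keeps in the loop:

from typing import Callable, Dict, List, Optional

class EmotionSubsystem:
    """Maintains concerns and action tendency dynamics (sketch; names are assumptions)."""
    def __init__(self) -> None:
        self.tendencies: Dict[str, float] = {"approach": 0.3, "deactivation": 0.1}
    def on_event(self, event: str) -> None:
        # Cognizant failures reported by FAPs dampen approach and raise deactivation.
        if event == "cognizant_failure":
            self.tendencies["approach"] = max(0.0, self.tendencies["approach"] - 0.2)
            self.tendencies["deactivation"] += 0.2

class FixedActionPackage:
    """A FAP with an executable body and a cognizant-failure signal on unmet conditions."""
    def __init__(self, name: str, body: Callable[[], bool]):
        self.name, self.body = name, body
    def run(self) -> Optional[str]:
        return None if self.body() else "cognizant_failure"

class Scheduler:
    """Middle tier of the trionic architecture: activates behaviors, selects and monitors FAPs,
    and reports appraisal-relevant events to the emotion subsystem."""
    def __init__(self, emotions: EmotionSubsystem, library: List[FixedActionPackage]):
        self.emotions, self.library = emotions, library
    def tick(self) -> None:
        for fap in self.library:
            event = fap.run()                   # execution monitoring
            if event:
                self.emotions.on_event(event)   # appraisal-relevant feedback
        print("tendencies:", self.emotions.tendencies)

if __name__ == "__main__":
    library = [FixedActionPackage("greet_user", body=lambda: False)]  # fails: nobody on stage
    Scheduler(EmotionSubsystem(), library).tick()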
Implementation of the appraisal process is a topic of current research. Similar to Andrew Stern (chapter 12 of this volume), and in agreement with views put forward by theorists as, for example, in Frijda (1986; but differently from clear-cut approaches such as Scherer’s 1984 stimulus evaluation checks), we adopted the stance of mapping the theoretical notion onto a trionic agent control architecture model in an integrated (and therefore distributed) fashion. Stimulus uptake occurs at the different available context levels: the currently activated behavioral schema; the current activity implemented in an activated FAP (taking the place of 3T RAPs); and the broader context of the global ‘‘master plan’’ to entertain all visitors. Semantically significant nodes within the architecture (such as leaf nodes of FAPs signaling success or failure to the calling behavior, which in turn can attribute appropriate semantics, or interactions with the user announcing likely failure, success, or [im]possible lines of activity) differentially activate a fixed set of action tendencies that jointly work as appraisal register: approach (‘‘desire’’); being with; free activation; attending (‘‘interest’’); interrupting; dominating (‘‘arrogance’’); excitement; nonattending; deactivation (‘‘sorrow’’); avoidance; inhibition; agonistic (‘‘anger’’); rejecting; and submitting.
These action tendencies are then used to discriminate between eligible behaviors suited to satisfy current concern needs, to choose a behavior variant suitable to express the current action tendency distribution, to influence parameterizations of intended
actions submitted to the animation pipeline, and to bias choices made within executing behaviors and FAPs (cf. the information-processing and coping emotion response tendencies in chapter 6 by Andrew Ortony). As explained in some further detail in Petta (1999), and also treated in chapter 12 by Andrew Stern, a particularly important role of action tendencies is to provide a basis for the selection and enactment of a teleologically unbound display of the current internal state, which naturally falls into the sequence of selected behaviors at times of behavior completion or behavior change.
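To illustrate how such a distributed appraisal register might be wired up, the following sketch (hypothetical event names, activation weights, and behaviors; only the tendency labels are taken from the list above) maps architecture events onto tendency activations and then uses the resulting distribution to bias, rather than dictate, behavior choice:

import random
from typing import Dict

# A subset of the fixed set of action tendencies acting jointly as appraisal register.
TENDENCIES = ["approach", "attending", "avoidance", "agonistic", "deactivation"]

# Hypothetical mapping from semantically significant architecture events
# (e.g., a FAP leaf node signaling success or failure) to tendency activations.
EVENT_TO_ACTIVATION: Dict[str, Dict[str, float]] = {
    "fap_success":     {"approach": 0.6, "attending": 0.3},
    "fap_failure":     {"deactivation": 0.5, "agonistic": 0.3},
    "user_left_stage": {"deactivation": 0.4, "avoidance": 0.2},
    "user_waves":      {"approach": 0.8, "attending": 0.5},
}

def appraise_event(event: str, register: Dict[str, float]) -> None:
    """Differentially activate the action tendencies for one architecture event."""
    for tendency, boost in EVENT_TO_ACTIVATION.get(event, {}).items():
        register[tendency] = register.get(tendency, 0.0) + boost

def choose_behaviour(register: Dict[str, float], eligible: Dict[str, str]) -> str:
    """Use the tendency distribution to discriminate between eligible behavior variants."""
    weights = [max(register.get(t, 0.0), 0.01) for t in eligible]   # bias, not dictate
    return eligible[random.choices(list(eligible), weights=weights, k=1)[0]]

if __name__ == "__main__":
    register = {t: 0.0 for t in TENDENCIES}
    for event in ["user_waves", "fap_success"]:
        appraise_event(event, register)
    eligible = {"approach": "walk over and greet",
                "attending": "turn and watch",
                "avoidance": "retreat to corner"}
    print(register)
    print(choose_behaviour(register, eligible))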
9.5 Summary and Future Work
In this expository chapter, we have sketched out some of the background that we are drawing upon in the attempt to design TABASCO, a tractable appraisal-based architecture framework for situated cognizers. We have tried to cast some light on the relevance of each of the aspects identified in the project’s acronym. As with the work carried out so far, we expect the strong links we have tied between psychological emotion research and work on situated autonomous agents to keep bringing forward interesting insights and relevant further research topics in both areas. Scheduled future research will address integration of results from the investigations into the interplay of social norms and emotions and the control architecture of the Invisible Person, and testing in different scenarios. A focus will be placed on the appraisal mechanisms, to include relational reinforcement learning in an implementation taking up some of the ideas put forward by Damasio and by Rolls. We are preparing a collaboration with a robotics group to assess the validity of the obtained results for hardware agents in the context of scenarios such as RoboCup-Rescue. Efforts are also to continue in the area of human-computer communication, so as to make the architecture more easily deployable in different domains. As rightly and appropriately pointed out in Gene Ball’s closing remarks (chapter 11), striking results, such as those reflected in the photos of visitors of the installation taken by the Invisible Person itself (figure 9.5), provide not only a strong incentive to further the work, but should also serve as a reminder to remain consciously aware of the power of the natural phenomenon under investigation.
Figure 9.5 Pictures of visitors of the Invisible Person exhibit (reproduced with kind permission by the Invisible Person).
Acknowledgments
The author wishes to thank Alexander Staller, Rainer Hubovsky, Gudrun Novak, Sabine Payr, and Monika Farukuoye for their contributions in the preparation of this manuscript. During the carrying out of the work described, helpful comments were received from Andrew Ortony, Craig A. Smith, Carlos Pinto-Ferreira, Rodrigo Ventura, Henry Hexmoor, Joanna Bryson, Doug Riecken, Marvin Minsky, and many others. Robert Trappl initiated the research on this topic at the Austrian Research Institute for Artificial Intelligence and has been providing invaluable continuous support. The Austrian Research Institute for Artificial Intelligence is supported by the Austrian Federal Ministry of Education, Science, and Culture. This research is being carried out in part under project GZ 61.096/4-V/B/99 of the Austrian Federal Ministry of Transport, Innovation, and Technology.
Notes
1. We invite the readers to sit back for a moment and, for example, recall their own last experience of leaving the office with the firm intent to run some errands, only to remember it again while taking off their shoes back home.
2. It may be interesting to note how, from this point of view, the bidirectional application of the Bayesian net described in chapter 11 (by Gene Ball) could be seen to quite appropriately reflect the lack of (preferred) direction in relational dependency.
References Agre, P.E.(1995): Computational Research on Interaction and Agency.Artif. Intell. 72 (1/2): 1–52. Agre, P. E., and Horswill, I. (1997): Lifeworld Analysis. J. Artif. Intell. Res. 6: 111–145. Anderson, J . R . ( 1 9 8 3 ) : The Architecture of Cognition.Harvard University Press, Cambridge. Armon-Jones, C.(1986): The Thesis of Constructionism.In R.Harre´, ed., The Social Construction of Emotions.Blackwell, Oxford. Aube´, M.(2001): From Toda’s Urge Theory to the Commitment Theory of Emotions.In D. Can˜amero and P.Petta, eds., Cybernetics and Systems 32. Vol.2, Grounding Emotions in Adaptive Systems.Taylor & Francis, London. Balkenius, C.(1995): Natural Intelligence in Artificial Creatures.Cognitive Studies, 37. Lund University. Bargh, J . A . , and Chartrand, T.L.(1999): The Unbearable Automaticity of Being.Am. Psychol. 54 (7): 462–479. Baron, R.M.(1988): An Ecological Framework for Establishing a Dual-Mode Theory for Social Knowing.In D.Bar-Tal and A.W.Kruglanski, eds., The Social Psychology of Knowledge, 48–82.Cambridge University Press, Cambridge. Bartlett, F.C.(1932): Remembering: A Study in Experimental and Social Psychology. Cambridge University Press, Cambridge. Bickhard, M.H.(2000): Motivation and Emotion: An Interactive Process Model.In R . D . Ellis and N.Newton, eds., The Caldron of Consciousness, 161–178.John Benjamins, Philadelphia. Blumberg, B.M.(1994): Action-Selection in Hamsterdam: Lessons from Ethology.In D. Cliff, P.Husbands, J.-A.Meyer, and S.W.Wilson, eds., Proceedings of the Third International Conference on the Simulation of Adaptive Behavior, Brighton, England, 108–117.MIT Press, Cambridge. Blumberg, B.M.(1997): Old Tricks, New Dogs: Ethology and Interactive Creatures.Ph.D. thesis.MIT Press, Cambridge. Bonasso, R. P., Firby, R. J., Gat, E., Kortenkamp, D., Miller, D. P., and Slack, M. (1997): Experiences with an Architecture for Intelligent, Reactive Agents.In H.Hexmoor, ed., Special Issue: Software Architectures for Hardware Agents, J. Theoret. Exp. Artif. Intell. 9 (2/3): 237–256. Brooks, R . A . ( 1 9 8 6 ) : A Robust Layered Control System for a Mobile Robot.IEEE J. Robotics Automation 2 (1): 14–23. Brooks, R.A.(1991a): Intelligence without Reason.In J.Mylopoulos, R.Reiter, eds., Proceedings of the Twelfth International Joint Conference on Artificial Intelligence (IJCAI-91), Sydney, Australia, 569–595.Morgan Kaufman, San Mateo, Calif. Brooks, R.A.(1991b): Intelligence without Representation.Artif. Intell. 47: 139–159. Can˜amero, L.D.(1997): Modeling Motivations and Emotions as a Basis for Intelligent Behavior.In W.L.Johnson, ed., Proceedings of the First International Conference on Autonomous Agents, 148–155.ACM Press, Marina del Rey, Calif., New York. Can˜amero, L.D., ed.(1998): Emotional and Intelligent: The Tangled Knot of Cognition. Proceedings of 1998 AAAI Fall Symposium, TR FS-98-03.AAAI, Orlando, F l a . Can˜amero, L. D., and Petta, P., eds. (2001): Grounding Emotions in Adaptive Systems, Vol.1.Cybern. Syst. 32 (5). Chapman, D.(1987): Planning for Conjunctive Goals.Artif. Intell. 32: 333–378. Chomsky, N.(1980): Rules and Representations.Columbia University Press, New York. Clancey, W.J.(1997): Situated Cognition: On Human Knowledge and Computer Representations.Cambridge University Press, Cambridge.
Custo´dio, L., Ventura, R., and Pinto-Ferreira, C. A. (1999): Artificial Emotions and Emotion-Based Control Systems.In J.M.Fuertes, ed., Proceedings of the Seventh IEEE International Conference on Emerging Technologies and Factory Automation (ETFA ’99).Vol.2, pp.1415–1420.IEEE, Piscataway, N.J. Damasio, A.R.(1994): Descartes’ Error: Emotion, Reason, and the Human Brain.Putnam, New York. Damasio, A.R.(1999): The Feeling of What Happens: Body and Emotion in the Making of Consciousness.Harcourt Brace Jovanovich, New York. Darrell, T., Maes, P., Blumberg, B., and Pentland, A. P. (1994): A Novel Environment for Situated Vision and Behavior.TR no.261.MIT Media Laboratory, Perceptual Computing Section. Drescher, G.L.(1991): Made-Up Minds.MIT Press, Cambridge. Dyer, M.G.(1983): In-Depth Understanding: A Computer Model of Integrated Processing for Narrative Comprehension.MIT Press, Cambridge. Earl, C., and Firby, R. J. (1997): Combined Execution and Monitoring for Control of Autonomous Agents.In W.L.Johnson, ed., Proceedings of the First International Conference on Autonomous Agents, Marina del Rey, Calif., 88–95. ACM Press, New York. Elliott, C.D.(1992): The Affective Reasoner: A Process Model of Emotions in a Multiagent System.Ph.D.thesis, Northwestern University, Evanston, I l l . Elliott, C.D.(1993): Using the Affective Reasoner to Support Social Simulations.In R. Bajcsy, ed., Proceedings of the Thirteenth International Joint Conference on Artificial Intelligence, 194–201.Morgan Kaufmann, San Mateo, Calif. Elliott, C.D.(1994): Research Problems in the Use of a Shallow Artificial Intelligence Model of Personality and Emotion.In Proceedings of the Twelfth National Conference on Artificial Intelligence, 9–15.AAAI Press/MIT Press, Cambridge. Elliott, C. D., Rickel, J., and Lester, J. C. (1997): Integrating Affective Computing into Animated Tutoring Agents.In E.Andre´, ed., Proceedings of the IJCAI-97 Workshop ‘‘Animated Interface Agents: Making Them Intelligent,’’ Nagoya, Japan.IJCAI, Nagoya, Japan. Elliott, C. D., Rickel, J., and Lester, J. (1999): Lifelike Pedagogical Agents and Affective Computing: An Exploratory Synthesis.In M.J.Wooldridge and M.Veloso, eds., Artificial Intelligence Today.Springer, Berlin, Heidelberg, New York; Lecture Notes in Artificial Intelligence 1600, pp.195–212. Elsaesser, C., and Slack, M. G. (1994): Integrating Deliberative Planning in a Robot Architecture.In P.J.Weitz, ed., Proceedings of the AIAA/NASA Conference on Intelligent Robots in Field, Factory, Service, and Space (CIRFFSS ’94), Houston, Tex.782–787. Elster, J.(1996): Rationality and the Emotions.Economic J. 106 (438): 1386–1397. Elster, J.(1999): Alchemies of the Mind: Rationality and the Emotions.Cambridge University Press, Cambridge. Ferguson, I.A.(1992): Touring Machines: An Architecture for Dynamic, Rational, Mobile Agents.Ph.D.thesis, University of Cambridge, UK. Ferguson, I.A.(1995): On the Role of BDI Modelling for Integrated Control and Coordinated Behavior in Autonomous Agents. Appl. Artif. Intell. Special issue: Intelligent Agents and Multi-Agent Systems Part 1, 9 (4): 421–448. Firby, R.J.(1989): Adaptive Execution in Complex Dynamic Worlds.Ph.D.thesis, Yale University, New Haven. Fiske, S.T., and Taylor, S.E.(1991): Social Cognition.2nd ed.McGraw-Hill, New York. Frank, R.H.(1988): Passions within Reason: The Strategic Role of the Emotions.Norton, New York. Frijda, N.H.(1986): The Emotions.Cambridge University Press, Cambridge. 
Frijda, N.H.(1993): The Place of Appraisal in Emotion.In N.H.Frijda, ed., Appraisal and Beyond: The Issue of Cognitive Determinants of Emotion. Cogn. Emotion 7 (3/4): 357–388. Gat, E.(1995): On the Role of Simulation in the Study of Autonomous Mobile Robots. In H.Hexmoor and D.Kortenkamp, eds., AAAI 1995 Spring Symposium, Lessons
Learned from Implemented Software Architectures for Physical Agents.Technical Report SS-95-02, AAAI Press, Menlo Park, Calif. Gat, E.(1998): On Three-Layer Architectures.In D.Kortenkamp, R . P . B o n a s s o , and R.Murphy, eds., Artificial Intelligence and Mobile Robots.MIT Press, AAAI Press, Menlo Park, Calif. Georgeff, M.P., and Lansky, A.L.(1987): Reactive Reasoning and Planning.In Proceedings of the Sixth National Conference on Artificial Intelligence (AAAI-87), pp.677– 682.Morgan Kaufmann, San Mateo, Calif. Gibson, J.J.(1979): The Ecological Approach to Visual Perception.Houghton Mifflin, Boston. Gratch, J.(1999): Why You Should Buy an Emotional Planner.In J.D.Velasquez, ed., Workshop: ‘‘Emotion-Based Agent Architectures’’ (EBAA ’99).Third International Conference on Autonomous Agents (Agents ’99), Seattle, Wash.53–60.ACM Press, New York. ´ Gratch, J.(2000): Emile: Marshalling Passions in Training and Education. In C. Sierra, M. Gini, and J.S.Rosenschein, eds., Proceedings of the Fourth International Conference on Autonomous Agents, Spain, 325–332.ACM Press, New York. Halperin, J.R.P.(1995): Cognition and Emotion in Animals and Machines.In H.Roitblat and J.A.Meyer, eds., Comparative Approaches to Cognitive Science.MIT Press, Cambridge. Hayes-Roth, B., van Gent, R., and Huber, D. (1997): Acting in Character.In R.Trappl and P.Petta, eds., Creating Personalities for Synthetic Actors, LNAI 1195, p p . 9 2 – 1 1 2 . Springer, Berlin, Heidelberg, New York. Hexmoor, H., ed. (1997): Special issue: Software Architectures for Hardware Agents, J. Exp. Theor. Artif. Intell. 9(2/3). Hexmoor, H., Kortenkamp, D., and Horswill, I. (1997): Software Architectures for Hardware Agents.In H.Hexmoor, ed., Special issue: Software Architectures for Hardware Agents, J. Exp. Theor. Artif. Intell. 9 (2/3): 147–156. Horswill, I.(1997): Visual Architecture and Cognitive Architecture.In H.Hexmoor, ed., Special issue: Software Architectures for Hardware Agents, J. Exp. Theor. Artif. Intell. 9 (2/3): 277–292. Jennings, N. R., Sycara, K., and Wooldridge, M. (1998): A Roadmap of Agent Research and Development. Int. J. Auton. Agents Multi-Agent Syst. 1 (1): 275–306. Keltner, D., and Gross, J.J.(1999): Functional Accounts of Emotions.Cogn. Emotion 13 (5): 467–480. Keltner, D., and Haidt, J. (1999): Social Functions of Emotions at Four Levels of Analysis. Cogn. Emotion 13 (5): 505–521. Kemper, T.D.(1993): Sociological Models in the Explanation of Emotions.In M.Lewis and J.M.Haviland, eds., Handbook of Emotions, 41–52.Guilford Press, New York, London. Lazarus, R.S.(1991): Emotion and Adaptation.Oxford University Press, Oxford, London, New York. LeDoux, J.E.(1996): The Emotional Brain.Simon and Schuster, New York. Levenson, R.W.(1999): The Intrapersonal Functions of Emotion.Cogn. Emotion 13 (5): 481–504. Leventhal, H.(1984): A Perceptual-Motor Theory of Emotion.Adv. Exp. Soc. Psychol. 17: 117–182. Leventhal, H., and Scherer, K.R.(1987): The Relationship of Emotion to Cognition: A Functional Approach to a Semantic Controversy. Cogn. Emotion 1 (1): 3–28. Loyall, A.B.(1997): Believable Agents: Building Interactive Personalities.Ph.D.thesis, Carnegie-Mellon University, Pittsburgh. Loyall, A . B . , and Bates, J.(1993): Real-Time Control of Animated Broad Agents.In Proceedings of the Fifteenth Annual Conference of the Cognitive Science Society, Boulder, Colo.Laurence Erlbaum, Hillsdale, N.J. Macmahon, M., and Petta, P. (1997): FORREST: Forschung u¨ ber/Research on Emotion
Simulation. Technical Report TR-97-22, Österreichisches Forschungsinstitut für Artificial Intelligence, Vienna, Austria.
Martinho, C., and Paiva, A. (1999): ‘‘Underwater Love’’: Building Tristan and Isolda’s Personalities.In M.J.Wooldridge and M.Veloso, eds., Artificial Intelligence Today. Springer, Berlin, Heidelberg, New York; Lecture Notes in Artificial Intelligence 1600, pp.269–296. Mauldin, M.L.(1994): Chatterbots, TinyMUDs, and the Turing Test: Entering the Loebner Prize Competition.In Proceedings of the Twelfth National Conference on Artificial Intelligence, 16–21.AAAI Press/MIT Press, Cambridge. McArthur, L.Z., and Baron, R.M.(1983): Toward an Ecological Theory of Social Perception. Psychol. Rev. 90 (3): 215–238. McFarland, D.J.(1989): Goals, No Goals, and Own Goals.In A.Montefiore and D.Noble, eds., Goals, No Goals, and Own Goals.Unwin-Hyman, London. McFarland, D.J., and Bo¨sser, T.(1993): Intelligent Behavior in Animals and Robots. A Bradford Book, MIT Press, Cambridge. More´n, J., and Balkenius, C.(2000): Reflections on Emotion.In R.Trappl, ed., Cybernetics
¨ and Systems 2000, Osterreichische Studiengesellschaft fu¨r Kybernetik, Vienna, Austria.Vols.1–2, p p . 6 9 9 – 7 0 3 . O’Rorke, P., and Ortony, A. (1994): Explaining Emotions. Cogn. Sci. 18 (2): 283–323. Ortony, A., Clore, G. L., and Collins, A. (1988): The Cognitive Structure of Emotions. Cambridge University Press, Cambridge. Petta, P.(1999): Principled Generation of Expressive Behavior in an Interactive Exhibit.In J.D.Velasquez, ed., Workshop: ‘‘Emotion-Based Agent Architectures’’ (EBAA ’99), Third International Conference on Autonomous Agents (Agents ’99), Seattle, 94–98. ACM Press, New York. Petta, P., and Can˜amero L., eds. (2001): Cybernetics and Systems 32(5), Grounding Emotions in Adaptive Systems, V o l . 2 . Petta, P., Macmahon, M., and Staller, A. (2000): FORREST: Forschung u¨ber/Research on Emotion Simulation.In C.Landauer and K.L.Bellman, eds., Proceedings of the Virtual Worlds and Simulation Conference, 23–27 January, San Diego.Society for Computer Simulation International, San Diego. Petta, P., Pinto-Ferreira, C., and Ventura, R. (1999): Autonomy Control Software: Lessons from the Emotional.In H.Hexmoor, ed., Workshop: ‘‘Autonomy Control Software,’’ Third International Conference on Autonomous Agents (Agents ’99), Seattle, 74–77. ACM Press, New York. Petta, P., Staller A., Trappl, R., Mantler, S., Szalavari, Z., Psik, T., and Gervautz, M. (1999): Towards Engaging Full-Body Interaction.In H.J.Bullinger and P.H.Vossen, eds., Adjunct Conference Proceedings, HCI International ’99, Eighth International Conference on Human-Computer Interaction, Munich, Germany, 280–281.Fraunhofer IRB, Verlag. Petta, P., Staller, A., Trappl, R., Mantler, S., Psik, T., Szalavari, Z., and Gervautz, M. (2000): Die ‘‘Invisible Person’’ im Technischen Museum Wien. Ku¨ nstliche Intelligenz 2: 34–39 (in German). Pfeifer, R.(1988): Artificial Intelligence Models of Emotion.In V.Hamilton, G.H.Bower, and N.H.Frijda, eds., Cognitive Perspectives on Emotion and Motivation, Behavioral and Social Sciences 44, pp.287–320.Kluwer Academic Publishers, Dordrecht. Pfeifer, R.(1994): The Fungus Eater Approach to Emotion: A View From Artificial Intelligence. Cogn. Stud. 1: 42–57. Pfeifer, R.(1996): Symbols, Patterns, and Behavior: Towards a New Understanding of Intelligence.Technical Report IFI-AI-96.08, Artificial Intelligence Laboratory, D e p t . Computer Science, University of Zurich. Pfeifer, R., and Scheier, C. (1999): Understanding Intelligence.MIT Press/Bradford Books, Cambridge. Picard, R.W.(1997): Affective Computing.MIT Press, Cambridge. Pryor, L.M.(1994): Opportunities and Planning in an Unpredictable World.Ph.D.thesis, Northwestern University, Evanston, Ill. Reekum, C . M . v a n , and Scherer, K . R . ( 1 9 9 7 ) : Levels of Processing in EmotionAntecedent Appraisal.In G.Matthews, ed., Cognitive Science Perspectives on Personality and Emotion, 259–300.Elsevier, Amsterdam.
Reilly, W . S . N . ( 1 9 9 6 ) : Believable Social and Emotional Agents.Ph.D.thesis, Carnegie Mellon University, Pittsburgh. Rolls, E.T.(1999): The Brain and Emotion.Oxford University Press, Oxford, London, New York. Roseman, I.J.(1984): Cognitive Determinants of Emotion: A Structural Theory.In P.Shaver, ed., Review of Personality and Social Psychology.Vol.5, pp.11–36.Sage, Beverly Hills. Roseman, I.J.(1996): Why These Appraisals? Anchoring Appraisal Models to Research on Emotional Behaviour and Related Response Systems.In N.H.Frijda, ed., Proceedings of the ninth Conference of the International Society for Research on Emotions (ISRE ’96).ISRE, Toronto, Ontario, 106–110. Roseman, I.J., Antoniou, A . A . , and Jose, P.E.(1996): Appraisal Determinants of Emotions: Constructing a More Accurate and Comprehensive Theory. Cogn. Emotion 10 (3): 241–277. Rousseau, D., and Hayes-Roth, B. (1997): Improvisational Synthetic Actors with Flexible Personalities.Knowledge Systems Laboratory, TR no.KSL 97-10.Computer Science Department, Stanford University, Stanford. Russell, S.J.(1997): Rationality and Intelligence.Special issue: Economic Principles of Multi-Agent Systems, Artif. Intell. 94 (1–2): 57–77. Russell, S.J., and Norvig, P.(1995): Artificial Intelligence—A Modern Approach.Prentice Hall, Englewood Cliffs, N.J. Scherer, K.R.(1984): On the Nature and Function of Emotion: A Component Process Approach.In K.R.Scherer and P.Ekman, eds., Approaches to Emotion, 293–318. Erlbaum, Hillsdale, N.J. Scherer, K.R.(1988): Criteria for Emotion-Antecedent Appraisal: A Review.In V.Hamilton, G.H.Bower, and N.H.Frijda, eds., Cognitive Perspectives on Emotion and Motivation, 89–126.Kluwer, Dordrecht. Scherer, K.R.(1999): Appraisal Theory.In T.Dalgleish and M.Power, eds., Handbook of Cognition and Emotion, 637–663.Wiley, Chichester, London, New York. Sloman, A.(1978): The Computer Revolution in Philosophy: Philosophy, Science, and Models of Mind.Harvester Press (and Humanities Press), Hassocks, Sussex. Sloman, A.(1997): What Sort of Control System is Able to Have a Personality? In R.Trappl and P.Petta, eds., Creating Personalities for Synthetic Actors, LNAI 1195, pp.166–208.Springer, Berlin, Heidelberg, New York. Sloman, A., and Croucher, M.(1981): Why Robots Will Have Emotions.In A.Drinan, ed., Proceedings of the seventh International Joint Conference on AI, Vancouver, Canada. AAAI, Menlo Park, Calif. Smith, C . A . , and Ellsworth, P.C.(1985): Patterns of Cognitive Appraisal in Emotion. J. Pers. S oc. Psychol. 48: 813–838. Smith, C . A . , and Lazarus, R.S.(1990): Emotion and Adaptation.In L.A.Pervin, ed., Handbook of Personality: Theory and Research, 609–637.Guilford, New York. Smith, C. A., Griner, L. A., Kirby, L. D., and Scott, H. S. (1996): Toward a Process Model of Appraisal in Emotion.In N.H.Frijda, ed., Proceedings of the Ninth Conference of the International Society for Research on Emotion, ISRE ’96.ISRE Publications, Storrs. Sousa, R.de (1996): Prefrontal Kantians.A Review of Decartes’ Error: Emotion, Reason, and the Human Brain by Antonio R.Damasio.Cogn. Emotion 10 (3): 329–333. Staller, A., and Petta, P. (1998): Towards a Tractable Appraisal-Based Architecture for Situated Cognizers: In D.Can˜amero, C.Numaoka, and P.Petta, eds., Grounding Emotions in Adaptive Systems, workshop notes, Fifth International Conference of
the Society for Adaptive Behaviour (SAB ’98), 56–61. Zurich, Switzerland. Österreichisches Forschungsinstitut für Artificial Intelligence, Vienna, Austria. Staller, A., and Petta, P. (2001): Introducing Emotions into the Computational Study of Social Norms: A First Evaluation. J. Artif. Soc. Soc. Simulation 4 (1). Available online at: http://www.soc.surrey.ac.uk/JASSS/4/1/2.html. (Availability last checked 5 Nov 2002.) Teasdale, J. D. (1999): Multi-level Theories of Cognition—Emotion Relations. In T. Dalgleish and M. Power, eds., Handbook of Cognition and Emotion, 665–682. Wiley, Chichester, London, New York.
Trivers, R.L.(1971): The Evolution of Reciprocal Altruism.Q. Rev. Biol. 46: 35–57. Turner, R.M.(1993): The Tragedy of the Commons and Distributed AI Systems.In K . P . Sycara, ed., Proceedings of the Twelfth International Distributed Artificial Intelligence Workshop, Hidden Valley, Penn.; also: UNH CS T 93-01. The Robotics Institute, School of Computer Science, Carnegie Mellon University, Pittsburgh, Penn. Velasquez, J.D., ed.(1999): Emotion-Based Agent Architectures (EBAA ’99), workshop notes, Third International Conference on Autonomous Agents (Agents ’99), Seattle. ACM Press, New York. Velasquez, J.D., and Maes, P.(1997): Cathexis: A Computational Model of Emotions.In Proceedings of the First International Conference on Autonomous Agents, 518–519. Marina del Rey, Calif.ACM Press, New York. Ventura, R., Custodio, L., and Pinto-Ferreira, C. (1998): Emotions—The Missing Link? In L.D.Can˜amero, ed., Emotional and Intelligent: The Tangled Knot of Cognition, Proceedings of the 1998 AAAI Fall Symposium, TR FS-98-03, pp.170–175.AAAI, Orlando, Fla. Weizsa¨cker, C.F.von (1993): Zeit und Wissen. Hanser, Mu¨nchen, Wien. Wilkins, D . E . , Myers, K . L . , Lowrance, J . D . , and Wesley, L.P.(1995): Planning and Reacting in Uncertain and Dynamic Environments. J. Exp. Theor. Artif. Intell. 7 (1): 197–227. Wooldridge, M., and Jennings, N . R . ( 1 9 9 5 ) : Intelligent Agents: Theory and Practice. Knowledge Eng. Rev. 10 (2): 115–152. Wright, I.P.(1997): Emotional Agents.Ph.D.thesis, University of Birmingham, Birmingham, UK. Yu, S . T . , Slack, M . G . , and Miller, D.P.(1994): A Streamlined Software Environment for Situated Skills.In P.J.Weitz, ed., Proceedings of the AIAA/NASA Conference on Intelligent Robots in Field, Factory, Service, and Space (CIRFFSS ’94), 233–239. Houston, Tex.AIAA/NASA, Houston T e x . Zajonc, R.B.(1980): Feeling and Thinking: Preferences Need No Inferences.Am. Psychol. 2: 151–176.
10 The Wolfgang System: A Role of ‘‘Emotions’’ to Bias Learning and Problem Solving when Learning to Compose Music Douglas Riecken
For at best, the very aim of syntax oriented theories is misdirected; they aspire to describe the things that minds produce—without attempting to describe how they’re produced. —Marvin Minsky
10.1 An Emotional ‘‘Call to Arms’’
What are emotions? Words we use to discuss emotions are just that! They are words in a language such as English, German, or French used to characterize the observed behavior of a ‘‘black box’’ like a human, dog, or computer. Words dealing with emotions, such as love, hate, happy, and sad, do not tell us what is really occurring in the mind. What is required are good theories of emotions; perhaps that in turn requires good theories of mind. In the end, such theories would help us better understand what emotions are. There is a range of research areas on emotions that are useful to different researchers. For my work, I have been exploring theories of mind with a focus on memory, learning, and emergent ability. In doing so, my work has been deeply focused on the following:
1. The role of instincts and ‘‘emotions’’
2. Commonsense reasoning
3. Multistrategy reasoning
4. Representation
In essence, my studies explore cognitive architecture. Before entering into a discussion of the Wolfgang system, it is important to identify several key questions that continue to motivate my thinking.
Question 1: Where do Goals Come From?
Living systems (biological or silicon) have needs. If such needs are not addressed by the system or for the system, then the system will ‘‘die.’’ When such needs occur, the system will perform some
action that will address the need. Thus the system must be able to ‘‘formulate’’ a goal that it hopes will be satisfied by a ‘‘solution’’ in order to address a need. Are the first goals formulated by a living system’s primitive instincts for survival? What about goals that are cognitively complex—like composing a musical composition—or even more complex—like learning to become a composer? It is interesting to consider what goal and personal experiences helped Beethoven to ‘‘plan’’ and decide that his Symphony no. 5 in C Minor would begin and be based on a motif of four notes. How does a composer select the first two or three notes to a composition? What type of goal does this?
Question 2: Do Goals and Solutions that Satisfy Goals Become Biased Based on ‘‘Learning’’ Experiences?
Some living systems ‘‘learn.’’ One of my working assumptions is that the nervous system is a fantastic multimodal encoding and pattern-matching machine. In humans, there are enormous relationships of many simple little pieces of ‘‘information’’ about a world and its culture that a human memory system is constantly encoding and reformulating. Memory is a fantastic representation of enormous biased partial orderings of ‘‘information’’ that reflect an individual’s experiences. What happens when a baby or a child uses a perfectly good (legal) solution for a goal, but as the solution is applied, there are certain ‘‘properties’’ associated with the solution that have a ‘‘strong’’ positive or negative impact on the baby’s/child’s instincts or cognitive perception? How might such an experience impact the learning and the reuse of such a goal or solution?
Question 3: When in a Particular Mental State, What Is the ‘‘State of the Nervous System?’’ What Is the Mind Doing?
Experiences affect humans in complex ways. One might consider that reflection on a previous experience could affect a human ‘‘toward’’ a particular mental state. Is the state of the nervous system in some innate instinctive mode? Are ‘‘emotional’’ states, such as when I am happy or sad or angry, really words from a language attempting to map some abstract convolution of my mind’s ‘‘life experience’’ onto specific innate instinctive states of the nervous system?
From a continued series of questions like these, my work is based on the view that goals plus emotions enable learning. Without goal formulation and instinctive nervous system elements that bias goal formulation, learning would not be possible. This statement covers the full range from initial learning through cognitive development and beyond. As we learn, we continue to encode, and what we encode includes pieces of information that bias our memories to remember and motivate specific behaviors, and that bias the learned information and knowledge for future experiences and goals. The Wolfgang system (Riecken 1989), begun in 1987, was the initial prototype software system of a model that I have continued to work on. Later work, from 1991 to 1996, continued in a system called the M system (Riecken 1994), and the M system work continues to evolve today. The focus of this work is a theory of mind with attention to architecture addressing multistrategy reasoning, emotions, and representation (Riecken 2000). Two principal influences on this work relating to architecture continue to be recent work by Marvin Minsky (2001) and Aaron Sloman (chapter 3 of this volume). Several key influences on my work specific to emotions continue to be Manfred Clynes (1988), Paolo Petta (chapter 9 of this volume), Edmund Rolls (chapter 2 of this volume), Andrew Ortony and colleagues (Ortony, Clore, and Collins 1988; chapter 6 of this volume), and Rosalind Picard (1997; chapter 7 of this volume). The key focus of the Wolfgang work was to understand a role that emotions might perform in goal formulation and learning. A key point here is that Wolfgang is a system that continues to learn! It learns to become a composer. It is not a system that learns one thing and then stops learning. While the work on Wolfgang resulted in a system that learned to compose, the Wolfgang architecture has since evolved in my later work on the M system. It is in this work that efforts by Minsky and Sloman have helped to influence a better design for exploring the role of ‘‘emotions/instincts’’ in reflection and reaction in a cognitive architecture.
10.2 A Problem with Learning to Compose
In this chapter, I reflect on the design motivations for a system called Wolfgang that composes tonal monodies. The investigated problem concerns the definition of the evaluation criteria guiding
Wolfgang’s compositional processing and learning. The thesis of this work is derived from the hypothesis that a system’s innate sense of (musical) sound strongly influences the development of its perception, as well as composing habits. As the system develops its musical skills, it also develops a subjective use of a musical language biased by its sense of musical sounds and its adaptation to the cultural musical grammar of its environment. In 1987, I began a study in machine learning with a rather simple software system called Wolfgang. Wolfgang was a research project that focused on the development of compositional performance. This initial system applied Michalski’s STAR methodology (1983) for inductive machine learning in a knowledge-based system. In this first generation of Wolfgang, the evaluation criteria for guiding the composition process were derived from an explicit grammar of Western music; Wolfgang learned to compose simple compositions based on learning simple rules of syntax. In essence, the Wolfgang system was ‘‘programmed’’ to learn the syntax rules of a cultural grammar. It did not seem clear that this form of learning captured the ‘‘true spirit’’ of learning to compose music. It would appear that when an ‘‘individual’’ first hears or plays a new (musical) idea, some type of physical/‘‘emotional’’ reaction should occur. If Wolfgang was going to learn, it had to have fundamental biases that shape its behavior with each learning experience. In essence, Wolfgang should be ‘‘programmed’’ to formulate biases based on the emoting properties of an ‘‘auditory/musical’’ experience, not just on the rules of syntax. The motivation for this theory has similarities to learning in Lenat’s AM system (1982), insofar as the AM system was designed to be ‘‘curious’’ via a heuristic search of number theory concepts. Thus a second design of Wolfgang began. The second generation of Wolfgang introduces ‘‘emotional’’ criteria to constrain goal formulation both when Wolfgang learns and composes (Riecken 1989, 1992). Wolfgang was designed to guide its composition process biased by a cultural grammar of music, as well as by a disposition for crafting musical phrases such that they express a specific emotional characteristic. Specifically, the evaluation criteria guiding Wolfgang’s composing process consist of: a cultural grammar reflective of Wolfgang’s musical development, and an ability to realize the emotive potential of musical elements represented in the respective cultural grammar.
An abstract description of Wolfgang’s composition process is as follows. Based on the grammatical context of a given compositional decision, Wolfgang defines a set of legal solutions from its domain knowledge of music, which satisfy the cultural grammar; then Wolfgang selects from this set of legal solutions the solution that best satisfies Wolfgang’s current disposition in order to endow the current musical phrase with a specific emotive potential. Key enablers in the design and implementation of Wolfgang were an approach to knowledge representation (KR) based on Minsky’s K-line theory (1980) and Trans-Frames, as presented in his society of mind (SOM) theory (1985). Wolfgang’s architecture required that domain knowledge be represented in ‘‘micro’’ element structures so that they could be dynamically used to formulate and reformulate various partial solutions. The idea is not to represent whole ideas or facts, but to let the learning encode parts of ideas and facts as ‘‘simple’’ structures that can be chained together to form various partial orderings of knowledge. This is a core feature in SOM theory, where multiple agencies of reasoning are engaged in multistrategy reasoning. An important advantage in representing domain knowledge as dynamic orderings (in linked data structures) is that the diverse compositional levels of linked structures can be reused and adapted to represent changes in the composing behavior of Wolfgang as it continues to learn and evolve as a composer. This design approach was useful when compared to the classic problem of the plastic expert system. Consider the composer who lived for the first 25 years of his or her life in Brooklyn, New York and then for the next 12 years in South America. Clearly, the ‘‘fluidity of the human mind’’ enables a composer’s style to evolve due to new influences and experiences. Perhaps the diverse relationships of many partially ordered elements of knowledge acquired over time, in a computer composing ontology, is essential for fluid learning and development. The Wolfgang architecture, based on an interpretation of K-line theory, enabled the implementation of fundamental ‘‘emoting’’ biases representing Wolfgang’s initial sensation of sound and later its learning and perception of ‘‘musical sounds’’ and musical knowledge along with their respective emoting potentials. It was necessary that the emoting potentials be represented in a dynamic K-line network so that they would assert Wolfgang’s ‘‘emotional’’ disposition toward the current composing task and/or learning experience.
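To make this two-stage selection concrete, here is a minimal sketch in Python. It is purely illustrative: the candidate format, the grammar predicate, and the numbers are my assumptions, not Wolfgang's actual representation (which uses K-lines and blackboards, as described below).

```python
# Illustrative sketch only: candidates are plain dictionaries with four
# emotive weights, rather than Wolfgang's K-line structures.

def choose_next_artifact(candidates, is_grammatical, disposition):
    """Two-stage decision: (1) keep only candidates legal under the cultural
    grammar, (2) return the one with the highest emotive potential for the
    current disposition."""
    legal = [c for c in candidates if is_grammatical(c)]
    if not legal:
        raise ValueError("no candidate satisfies the cultural grammar")
    return max(legal, key=lambda c: c["emotive"][disposition])

# Hypothetical usage: three candidate intervals in some phrase context.
candidates = [
    {"name": "minor_third_down",
     "emotive": {"happiness": 0.1, "sadness": 0.8, "anger": 0.2, "meditativeness": 0.5}},
    {"name": "major_third_up",
     "emotive": {"happiness": 0.7, "sadness": 0.1, "anger": 0.2, "meditativeness": 0.3}},
    {"name": "tritone_leap",
     "emotive": {"happiness": 0.1, "sadness": 0.2, "anger": 0.9, "meditativeness": 0.0}},
]
in_grammar = lambda c: c["name"] != "tritone_leap"   # stand-in grammar check
print(choose_next_artifact(candidates, in_grammar, "sadness")["name"])  # minor_third_down
```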
10.3 General Discussion of Wolfgang
The second generation of Wolfgang introduces ‘‘emotional’’ criteria to guide both the composing process and learning to compose. Wolfgang allows a user to request it to compose a sixty-four-measure monody that realizes a specific emotional characteristic. The set of emotional characteristics includes happiness, sadness, anger, and meditativeness. Wolfgang’s disposition during a composing session will guide goal formulation and decision making so as to compose a composition with the user-requested emoting quality. It is important to note that during a composing session with a user, Wolfgang will change its disposition in order to create a composition that reflects some ‘‘maturity.’’ That is to say, Wolfgang’s disposition does not constrain the composing process to only one type of emoting quality for the resulting composition. While the overall composition is realized to communicate a specific emoting quality, Wolfgang will integrate other emoting qualities to enhance the quality of the composition. In this chapter, the term disposition is influenced by Minsky’s use of this term in his K-lines paper (1980). Minsky states, ‘‘I use ‘disposition’ to mean ‘a momentary range of possible behaviors’; technically it is the shorter term component of the state. In a computer program, a disposition might depend upon which items are currently active in a data base, e.g., as in Doyle’s (Doyle 1979) flagging of items that are ‘in’ and ‘out’ in regard to making decisions’’ (p. 131). Although the design of Wolfgang makes use of the concept of dispositions, the current work on Wolfgang is still an evolving attempt to integrate some of Minsky’s ideas; it is my view that considerable work is still required. In the remainder of this chapter, I will provide a general discussion of Wolfgang’s architecture. The Wolfgang system architecture is composed of the following five fundamental system components: (1) a corpus of E-nodes, (2) a K-line network, (3) a set of blackboard systems, (4) a disposition feedback facility, and (5) logfiles.
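As a rough scaffold for the discussion that follows, the five components might be grouped as in the hypothetical skeleton below; the names and types are mine, not Riecken's implementation.

```python
# Hypothetical skeleton of the five-component architecture described above.
from dataclasses import dataclass, field

@dataclass
class WolfgangSystem:
    e_nodes: dict = field(default_factory=dict)        # (1) corpus of E-nodes
    k_line_network: dict = field(default_factory=dict) # (2) K-line network (memory)
    blackboards: dict = field(default_factory=dict)    # (3) RBB/MBB/HBB/RHBB systems
    disposition: str = "happiness"                     # (4) disposition feedback state
    logfile: list = field(default_factory=list)        # (5) traces of recent sessions

    def compose(self, requested_quality: str, measures: int = 64):
        """Compose a monody whose overall emoting quality matches the request
        (placeholder body; each step is elaborated in the sections below)."""
        self.disposition = requested_quality
        return []   # would return the composed measures
```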
10.4 E-Nodes
E-nodes (emotion-nodes) are the most fundamental system component. Informally, an E-node is a collection of information defining the emoting potential of a given primitive musical artifact. The
term primitive musical artifact refers to such elements as vertical 2-note harmonic structures (e.g., a major third, a dominant seventh, etc.), horizontal 2-note harmonic structures (e.g., harmonic progressions consisting of paired vertical harmonic structures), amplitude, tempo, and simple rhythmic elements. The use of the term emotive refers to a primitive musical artifact’s potential to express emotion or sentiment in a demonstrative manner. The information provided by each E-node allows Wolfgang to interpret the emoting potential of each primitive musical artifact. While the musical artifacts mentioned are actually quite complex structures (in terms of music theory), for the design of Wolfgang they are viewed as simple elements, and in that sense are taken for granted as elements of the musical imagination. A critical design issue concerning E-nodes must now be reviewed. E-nodes are not instances of learned musical knowledge. An E-node is simply a qualification and quantification of the emotive potential of a given primitive musical artifact. The E-node functions to represent and simulate in the computer model Wolfgang’s sensation of each respective primitive musical artifact. In order for Wolfgang to be knowledgeable and perceive a given primitive musical artifact, a representation of the respective artifact must be encoded into its (system) memory, representing a learned experience. Once this representation has been encoded, it inherits its emoting potential from the respective E-node. Thus E-nodes serve as ‘‘innate’’ system properties. It is important to note that this type of learning is restricted to only primitive musical artifacts. In time, Wolfgang will begin to learn about compound musical artifacts. The term compound musical artifact refers to a musical artifact composed directly or indirectly (or both) of two or more primitive musical artifacts. The emoting potential for compound musical artifacts is computationally derived from the emoting potentials of the ‘‘simpler’’ musical artifacts that make up the respective compound musical artifact. Now that we have an abstract statement of what an E-node is, let us review it in more detail. An E-node is defined as an identifier and a set of four numeric values. Each numeric value defines the emotive potential of a particular emotion, as realized by the primitive musical artifact represented by an E-node. The four distinct emotions represented by the four numeric values assigned to each E-node are happiness, sadness, anger, and meditativeness. The design of Wolfgang provides each primitive musical artifact with its
range of emoting potential over the defined set of emotions. (The number of individual emotions supported is currently restricted to four so as to minimize complexity.) The significance of E-nodes lies in their representation of emoting potentials for the four emotions over the complete set of primitive musical artifacts. These emotive representations perform a critical role during Wolfgang’s development; they serve as operands by which emoting values are computed and assigned to the learning of a new primitive musical artifact, or to a compound musical artifact. The term development refers to the acquisition of musical knowledge by Wolfgang to improve its performance as a composer. Thus E-nodes provide emotive primitives (initial properties) used to computationally derive the emoting potential for the combinatorial development of Wolfgang’s musical knowledge. The decision to specify the emotive potential of all musical artifacts is motivated by our task of developing evaluation criteria for guiding the composing process. As we have learned thus far, one of the metrics that guides Wolfgang’s composition process consists of composing musical phrases that satisfy some current compositional disposition. This means that Wolfgang will attempt to compose a musical phrase provoking a specific emotion that matches its current disposition; at any moment, Wolfgang’s disposition is at some level of happiness, sadness, anger, or meditativeness. Therefore the selection of musical artifacts is biased toward those musical artifacts that provide the highest emotional potential matching Wolfgang’s current disposition; consequently, the system design of Wolfgang requires a method to specify the emoting potential of all musical artifacts.
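A minimal sketch of an E-node as just described, an identifier plus four emotive-potential values, follows. The field names, the sample values, and the averaging rule for compound artifacts are assumptions made for illustration; the chapter says the compound potential is ‘‘computationally derived’’ without fixing a formula.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ENode:
    """Innate emotive potentials of one primitive musical artifact.
    The four emotions follow the chapter; the values here are invented."""
    artifact_id: str
    happiness: float
    sadness: float
    anger: float
    meditativeness: float

    def potential(self, emotion: str) -> float:
        return getattr(self, emotion)

def derive_compound(parts, emotion):
    """Assumed derivation rule for a compound artifact: average of its parts."""
    return sum(p.potential(emotion) for p in parts) / len(parts)

major_third = ENode("vertical:major_third", 0.7, 0.25, 0.1, 0.3)
minor_third = ENode("vertical:minor_third", 0.1, 0.75, 0.2, 0.5)
print(derive_compound([major_third, minor_third], "sadness"))  # 0.5
```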
10.5 Memory Based on the K-line Theory
The storage, access, and management of Wolfgang’s musical domain knowledge is supported by a network of interconnected musical artifacts; these artifacts reflect Wolfgang’s musical experiences and development. The principal influence in the design of Wolfgang’s memory is Minsky’s K-line theory of memory (Minsky 1980, 1985): ‘‘When you ‘get an idea,’ or ‘solve a problem,’ or have a ‘memorable experience,’ you create what we shall call a K-line. This K-line gets connected to those ‘‘mental agencies’’ that were actively involved in the memorable mental event. When that K-line is later ‘activated,’ it reactivates some of those mental agencies,
creating a ‘partial mental state’ resembling the original’’ (Minsky 1980, p. 118). The K-line memory theory describes the behavior of a system by the dynamic relationships of linked elements (called K-lines). The complexity of a K-line can range from a representation defined by a single instance of information to extremely complex representations of behavior defined by a K-line composed of a large set of linked K-lines (a.k.a. K-trees). Minsky’s theory explains that sets of K-lines form societies of resources providing specific mental functions, and that these societies can dynamically form multiple connections (K-lines) with each other to create new K-lines, thus increasing the body of knowledge. It is the activation of sets of K-lines that brings about partial mental states. A total mental state is composed of several partial mental states active at a single moment in time. Wolfgang’s development (a.k.a. learning) is realized by constructing new K-line connections within or between partial mental states.
10.6 Advantages from K-line Memory Features
Wolfgang’s memory is its fundamental system component. In the second generation of Wolfgang, great care was taken to design a system that demonstrated improvement in its ability to ‘‘develop.’’ If Wolfgang is to develop composing skills, these skills should demonstrate some characteristic of personalized style. The design of Wolfgang’s memory attempts to capture two powerful qualities found in the K-line theory: flexibility and recall of personalized habits. The term flexibility refers to Wolfgang’s ability to store and use diverse knowledge during its development as a composing system. The issue of flexibility provided critical motivation for the evolution to the second generation of Wolfgang. The first-generation system consistently reached impassable stages of development. This problem resulted from a limitation common to knowledge-based systems: these systems have historically demonstrated their ability to perform quite well over restricted sets of problems, except that the acquisition and representation of a large body of knowledge quickly promotes conflicting assertions of facts and rules. The K-line model avoids conflicting assertions by allowing multiple patterns of diverse knowledge to represent different memorable experiences; musical experiences that provide good
results during a composing session are therefore encoded as individual K-lines in Wolfgang’s memory. This representation serves to minimize the number of rules that ‘‘govern’’ methods and facts, and thus to avoid the attendant problems of complexity. Wolfgang’s K-lines enable a range of diverse and reusable topologies of K-lines to be formulated and reformulated, both for building new K-lines that represent newly learned knowledge and abstractions, and for changing context by changing the activation of the K-line network that defines Wolfgang’s memory of musical methods and facts. The concept recall of personalized habits refers to Wolfgang’s ability to develop and apply its composing skills in a subjective manner reflecting its musical learning experiences. This feature of system performance is a direct result of the explicit method by which musical knowledge is encoded in Wolfgang’s memory as individual K-line instances of successful musical methods and facts. Thus as Wolfgang develops, it also develops a distinct style of composing; this results from the frequent combinations of collaborating K-lines that, over time, are referenced as individual compound K-lines. By attempting to model features of Minsky’s K-line memory, Wolfgang develops personal composing habits; these habits become the system’s compositional signature.
10.7 Design and Implementation of K-line Memory
Wolfgang’s memory is implemented as a frame-based spreading-activation network; semantic relationships within the network form individual K-lines. Each K-line is represented by a discrete set of network links. These network links interconnect supporting K-lines to represent musical knowledge. All K-lines are implemented as frame structures (Minsky 1975), known as K-line frames (KF). A KF is a structure (a.k.a. schema) representing specific knowledge of some object or concept; the structure associates features that are descriptive of a given object or concept. These features are represented as attributes called slots. Slot values can be either some physical value (such as a symbol or numerical value) or some process/function to be invoked to perform some task. In Wolfgang, the slot values in each KF identify the respective KF and its supporting structures and characteristics. An important set of slot values in each KF is a set of four numeric weights. Each numeric weight represents an emotive potential for the respective K-line; each weight respectively represents one of the four emotive
types used in the current design of Wolfgang (happiness, sadness, anger, and meditativeness). These four numeric weights serve to support computations used during composing decisions to select solutions that provide specific emotive qualities. As given solutions provide repeated successes, Wolfgang will attenuate these values and formulate ‘‘abstractions’’ as new K-lines referencing the solution(s) so as to reflect their utility for a given context. Over time, this is how Wolfgang’s musical composing signature emerges. The overall K-line network is partitioned into two functional parts: methods (e.g., methods of motivic development, methods of harmonization, etc.) and facts (specific instances of intervals, harmonic structures, etc.). Within a K-line network, K-lines are partitioned into distinct classes of musical artifacts, such as sets of melodic intervals, rhythmic patterns, harmonic progressions, methods for motivic development, and so on. This is done to provide effective management and efficient access of system memory; the management and access of K-line classes are supported via blackboard technologies (see Nii 1989 for discussion on blackboard technologies). Finally, each K-line class is implemented as a frame, called a musical-component-frame (MCF). Each MCF may contain an arbitrary number of slots; each slot in an MCF references a distinct KF contained within a K-line class. The MCFs are implemented as lists.
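The following sketch illustrates one plausible reading of a K-line frame (KF) with its four emotive weights, and of a musical-component frame (MCF) as a list of KF references. The slot names and the naive spreading-activation step are my assumptions, not the actual Wolfgang code.

```python
# Illustrative frame structures; Wolfgang's real slots and link semantics differ.

class KF:
    """K-line frame: identifies a K-line, its supporting K-lines, and the
    four emotive weights used in composing decisions."""
    def __init__(self, name, weights, supports=None):
        self.name = name
        self.weights = weights            # {"happiness": .., "sadness": .., ...}
        self.supports = supports or []    # links to supporting KFs
        self.activation = 0.0

    def activate(self, energy=1.0, decay=0.5):
        """Naive spreading activation down the support links (assumed rule)."""
        self.activation += energy
        for kf in self.supports:
            kf.activate(energy * decay, decay)

def make_mcf(kfs):
    """An MCF groups the KFs of one K-line class (e.g., melodic intervals) as a list."""
    return list(kfs)

leap = KF("interval:ascending_sixth", {"happiness": 0.8, "sadness": 0.1,
                                       "anger": 0.2, "meditativeness": 0.2})
motif = KF("method:sequence_motif", {"happiness": 0.5, "sadness": 0.3,
                                     "anger": 0.1, "meditativeness": 0.4},
           supports=[leap])
melodic_mcf = make_mcf([leap, motif])
motif.activate()
print(leap.activation)   # 0.5 (received spread activation from the motif KF)
```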
10.8 Blackboards as Knowledge Negotiators
Wolfgang’s blackboard systems serve as knowledge negotiators. They manage the interactions of K-lines from different K-line classes as they collaborate to compose a musical work. The blackboard technologies are implemented in a distributed hierarchical model. The model consists of a primary blackboard system supported by (three) subordinate blackboard systems. The primary blackboard system, called the root blackboard (RBB), manages the overall composition process. The three subordinate blackboard systems include the melodic blackboard (MBB), the harmonic blackboard (HBB), and the rhythmic blackboard (RHBB). These three blackboard systems manage composing processes relating to melody, harmony, and rhythm, respectively. Also, these three blackboard systems serve as blackboard knowledge sources (KSs) to the RBB.
Each blackboard system is implemented as a shared memory allowing access to its respective K-line classes (e.g., K-line melody classes, K-line methods of motivic development classes, etc.). The K-lines activated within these classes serve as KSs that attempt to assert the information each active K-line provides onto the blackboard. This information is evaluated by a blackboard scheduler. Each blackboard system comprises a scheduler for managing blackboard functions. In Wolfgang, the schedulers are implemented as inference engines composed of minimal sets of rules relating to the distinct tasks of each respective blackboard. The schedulers are the only system components within the Wolfgang architecture whose knowledge is defined explicitly by rules.
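A hypothetical sketch of this blackboard hierarchy is shown below: a root blackboard with three subordinates, each holding posted entries and a small rule-based scheduler. The posting protocol and rule format are assumptions.

```python
class Blackboard:
    def __init__(self, name, rules=None):
        self.name = name
        self.entries = []          # shared memory visible to the K-line classes (KSs)
        self.rules = rules or []   # scheduler: minimal set of task-specific rules

    def post(self, item):
        self.entries.append(item)

    def schedule(self):
        """Apply the scheduler rules to posted entries and return the accepted ones."""
        return [e for e in self.entries if all(rule(e) for rule in self.rules)]

class RootBlackboard(Blackboard):
    def __init__(self):
        super().__init__("RBB")
        # The three subordinate blackboards also act as knowledge sources to the RBB.
        self.subordinates = {
            "MBB": Blackboard("MBB"),
            "HBB": Blackboard("HBB"),
            "RHBB": Blackboard("RHBB"),
        }

rbb = RootBlackboard()
rbb.subordinates["MBB"].post({"kind": "melodic_fragment", "emotive": 0.7})
print(rbb.subordinates["MBB"].schedule())
```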
10.9 Disposition Feedback Facility
The disposition feedback facility provides Wolfgang with the ability to evaluate decisions made during a composing session. Evaluations are based on the emoting potential derived from each possible decision, and a ranking of each possible decision with regard to an ordered list of previous decisions that demonstrate high emoting potentials. This facility allows Wolfgang to mark for future use, during a given composing session, specific musical artifacts that compute high emotional potentials, while satisfying the current disposition of the system. Implementation of the disposition feedback facility consists of: a feedback loop, a variable called the *dispositionValue*, and an ordered list of decisions called the *listOfGoodElements*. Both variables, *dispositionValue* and *listOfGoodElements*, are asserted and maintained on the RBB. The variable *dispositionValue* maintains the current disposition of Wolfgang as one of the four emotion types: happy, sad, angry, and meditative. The variable *listOfGoodElements* provides Wolfgang with a list of musical elements that have been applied previously during the current composing session, and that have provided high emoting potentials. This list is significant; it acts as a short-term store of musical artifacts applicable in the motivic development of remaining musical phrases. Thus, during an individual composing session, Wolfgang creates and appends to the variable *listOfGoodElements* musical features derived from decisions made during the session that provide high emoting potentials. The feedback loop is implemented as a background process that sends status messages
to the RBB scheduler when an interesting event has occurred and has been posted on the RBB, MBB, HBB, or RHBB; the scheduler then appends the musical artifact associated with this event to the *listOfGoodElements*. The disposition feedback facility resolves composing decisions by evaluating the activation of K-lines that represent previously successful solutions (note: these previous solutions reflect ‘‘positive’’ cultural experiences, which suggests that such solutions are both the development of a culturally learned syntax for a musical grammar and a ‘‘personal’’ bias by Wolfgang in their use) and then selecting the solution whose emoting potential best satisfies Wolfgang’s current disposition. Keep in mind that a solution’s emoting potential is computed from the numeric weights stored in that solution’s K-line representation.
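The sketch below illustrates the facility's two variables and the feedback step in simplified form; the threshold and scoring are invented for the example.

```python
class DispositionFeedback:
    def __init__(self, disposition="happy", threshold=0.6):
        self.disposition_value = disposition      # *dispositionValue* on the RBB
        self.list_of_good_elements = []           # *listOfGoodElements* on the RBB
        self.threshold = threshold                # assumed cut-off for an "interesting" event

    def report_event(self, artifact, emotive_weights):
        """Feedback loop: when a posted event's potential for the current
        disposition is high, remember the artifact for reuse this session."""
        if emotive_weights.get(self.disposition_value, 0.0) >= self.threshold:
            self.list_of_good_elements.append(artifact)

    def best_solution(self, solutions):
        """Pick the previously successful solution whose emoting potential
        best satisfies the current disposition."""
        return max(solutions, key=lambda s: s["weights"][self.disposition_value])

fb = DispositionFeedback("sad")
fb.report_event("interval:minor_second", {"sad": 0.8, "happy": 0.1})
print(fb.list_of_good_elements)   # ['interval:minor_second']
```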
10.10 System Logfile
The logfile ensures that each composition is distinct; it provides information about previously composed works, so that Wolfgang can avoid the excessive repetition of musical artifacts by reviewing its previous composing habits during a session. The logfile consists of trace data from the last twenty sessions, stored in a hard disk file.
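A minimal sketch of such a rolling logfile, under an assumed file format and repetition test, might look like this:

```python
# Assumed format: the logfile is a JSON list of session traces, each trace being
# a list of artifact identifiers used in that session.
import json
from collections import deque
from pathlib import Path

LOGFILE = Path("wolfgang_sessions.json")   # hypothetical file name
MAX_SESSIONS = 20

def append_session(trace):
    """Append one session's trace, keeping only the last twenty sessions."""
    previous = json.loads(LOGFILE.read_text()) if LOGFILE.exists() else []
    sessions = deque(previous, maxlen=MAX_SESSIONS)
    sessions.append(trace)
    LOGFILE.write_text(json.dumps(list(sessions)))

def is_overused(artifact, limit=3):
    """Assumed repetition test: has the artifact appeared too often recently?"""
    sessions = json.loads(LOGFILE.read_text()) if LOGFILE.exists() else []
    return sum(trace.count(artifact) for trace in sessions) >= limit
```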
10.11 Closing Discussion
The second-generation Wolfgang system provided a working model of music composition in which ‘‘dispositions’’ are instrumental in deciding on steps in the elaboration of tonal monodies during composing. The research and design of Wolfgang have resulted from a subjective view of musical composing according to which the emoting potential of musical constructs is more important to the musical logic of a monody than are their syntactic features. A musical composition is thought to be an artifact that stimulates the senses and cognitive awareness of both its creator and any intended listener. I therefore view composing as a process that creates an artifact to communicate some cognitive ‘‘emotional’’ effect. The composing process necessitates the development of a set of musical skills and the application of these skills based on the disposition of the composer. We might consider Wolfgang’s compositional processing as constrained by its cultural grammar and
guided by its disposition to musically communicate some emoting quality.
Acknowledgments
The author is deeply indebted to Marvin Minsky, Ed Pednault, and Stacy Marsella for their helpful interest in and comments on this work.
References
Clynes, M. (1988): Generalised Emotion: How it is Produced, and Sentic Cycle Therapy. In M. Clynes and J. Panksepp, eds., Emotions and Psychopathology, 107–170. Plenum Press, New York.
Doyle, J. (1979): A Truth Maintenance System. AI memo no. 521. Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge.
Lenat, D. (1982): AM: An Artificial Intelligence Approach to Discovery in Mathematics as Heuristic Search. In R. Davis and D. Lenat, eds., Knowledge-Based Systems in Artificial Intelligence. McGraw-Hill, New York.
Michalski, R. S. (1983): A Theory and Methodology of Inductive Learning. In R. S. Michalski, J. G. Carbonell, and T. M. Mitchell, eds., Machine Learning. Morgan Kaufmann, San Mateo, Calif.
Minsky, M. (1975): A Framework for Representing Knowledge. In P. H. Winston, ed., The Psychology of Computer Vision. McGraw-Hill, New York.
Minsky, M. (1980): K-Lines: A Theory of Memory. Cogn. Sci. J. 4 (2): 117–133.
Minsky, M. (1982): Music, Mind, and Meaning. In M. Clynes, ed., Music, Mind, and Brain: The Neuropsychology of Music. Plenum Press, New York.
Minsky, M. (1985): The Society of Mind. Simon and Schuster, New York.
Minsky, M. (2001): The Emotion Machine. Pantheon Books, New York.
Nii, H. P. (1989): Introduction. In V. Jagannathan, R. Dodhiawala, and L. Baum, eds., Blackboard Architectures and Applications. Academic Press, New York.
Ortony, A., Clore, G., and Collins, A. (1988): The Cognitive Structure of Emotions. Cambridge University Press, Cambridge.
Picard, R. (1997): Affective Computing. MIT Press, Cambridge.
Riecken, D. (1989): Goal Formulation with Emotional Constraints: Musical Composition by Emotional Computation. In H. Schorr and A. Rappaport, eds., Proceedings of AAAI—First Annual Conference on Innovative Applications of Artificial Intelligence, Stanford University. AAAI/MIT Press, Cambridge.
Riecken, D. (1992): Wolfgang: A System Using Emoting Potentials to Manage Musical Design. In M. Balaban, K. Ebcioglu, and O. Laske, eds., Understanding Music with Artificial Intelligence: Perspectives on Music Cognition. AAAI/MIT Press, Cambridge.
Riecken, D. (1994): M: An Architecture of Integrated Agents. In D. Riecken, ed., special issue on intelligent agents, Commun. ACM 37 (7): 106–116. ACM, New York.
Riecken, D. (2000): We Must Re-Member to Re-Formulate: The M System. In A. Sloman, ed., Proceedings of the AISB 2000 Symposium How to Design a Functioning Mind, University of Birmingham, UK. Society for the Study of Artificial Intelligence and Simulation of Behaviour, Birmingham, UK.
11 A Bayesian Heart: Computer Recognition and Simulation of Emotion Eugene Ball
11.1 Why Does a Computer Need a Heart?
Computers are rapidly becoming a critical and pervasive piece of our societal infrastructure. Within a decade or two they are likely to be the constant companions of most people in the industrialized world. Many of our interactions with them will continue to be like those with simpler machines: We push a button and the machine takes a limited and well-defined action, like zapping our food or washing the dishes. But there will also be many situations wherein spoken conversation will be the preferred means of communicating with a computer: perhaps to ask what the weather is likely to be in Vienna next week, or to select a good book to read on the plane. There are huge technical challenges that must still be overcome before conversational computers will be competent and reliable enough to use in this fashion, but I have little doubt that we will get there in twenty years (possibly much sooner). A computer with which we engage in casual conversation (even if limited to narrow domains) will inevitably become a significant social presence (at least as noticeable as a human ticket agent with whom we carry out a brief transaction). I suspect that for many people, such a system will eventually become a long-term companion with which (whom?) we share much of our day-to-day activity. To be useful, conversational interfaces must be competent to provide some desired service; to be usable, they must be efficient and robust communicators; and to be comfortable, they will have to fulfill our deeply ingrained expectations about how human conversations take place. One subtle aspect of those expectations involves the emotional sensitivity of our conversational partners. We would be surprised if an outburst of anger toward someone produced a completely deadpan response, and might even be further angered by their lack of acknowledgment of our own emotional state. If we laugh in response to someone’s joke, we expect them to laugh (or at least smile) along with us; to do otherwise would be disconcerting.
My work (with my colleague Jack Breese) on the computational modeling of emotion and personality is intended as a first step toward an emotionally sensitive conversational interface: one that can recognize the emotional state of the human user and respond in a fashion that adds to the naturalness of the overall interaction.
Communicating with Computers
People frequently find spoken conversation to be the most efficient and comfortable way to conduct interactions with others. Particularly for tasks requiring many back-and-forth steps, written communication (even e-mail) can be tedious and suffers from the need to reacquire working context for each step of the interaction. Graphical computer interfaces have been quite successful as a medium for conducting many well-specified tasks of this sort. For example, common requests to a travel agent (long a favorite target application of the spoken language research community) can be carried out quite efficiently as an interaction with a Web server. However, if a request is unusual, it may be difficult to build a graphical user interface to handle it without unduly complicating the interface for more common cases. ‘‘Hidden commands’’ can provide less common capabilities without complicating simple ones, but they require that the user know both that they exist, and how to find them. One likely path for computer interface design is to gradually augment graphical user interfaces with linguistic capabilities. An ‘‘assistant’’ would accept flexible descriptions of commands or objects outside of the immediately visible workspace: ‘‘I’d like to reserve a group of fifty seats for travel from Minneapolis to Seattle next December’’ or ‘‘Check to see which movies are showing on flights from London to Seattle.’’ The ability to respond properly to powerful natural language requests (either typed or spoken) would be a welcome addition to current interfaces. While natural language may first be introduced as an ‘‘escape’’ for uncommon requests, its role is likely to steadily expand as it becomes more capable and as speech recognition becomes more reliable. Natural language requests of the sort suggested above are convenient and powerful, but often result in ambiguities that require clarification: ‘‘Do you want round-trip tickets?’’ Therefore, spoken interactions won’t usually consist of just isolated commands but will become conversational dialogues.
Social Aspects of Computer-Human Interaction
In many fundamental ways, people respond psychologically to interactive computers as if they were human. Reeves and Nass (1995) have demonstrated that strong social responses are evoked even if the computer does not use an explicitly anthropomorphic animated assistant. They suggest that humans have evolved to accord special significance to the movement and sounds produced by other people, and at some deeply ingrained level cannot avoid responding to the communication of modern technology as if it were coming from another person. As user interface designers, I believe we have a responsibility to try to understand the psychological reality and significance of these effects and to adapt our computer systems to the needs of our users. In addition, we should recognize that this social response is likely to become much stronger when the user is having a spoken conversation with a computer. It is clear that our emotional responses do not disappear while we are interacting with machines of all types: We get annoyed when they do not work properly, we respond with joy when a difficult task is completed smoothly, we even attribute human motives to their inanimate behaviors on occasion. Thus it will not be surprising to see even stronger emotional reactions to computers that talk, including expectations of appropriate emotional responses in the computer itself.
Emotionally Aware Computing
Explicit attention to the emotional aspects of computer interaction will be necessary in order to avoid degrading the user’s experience by generating unnatural and disconcerting behaviors. For example, early text-to-speech systems generated completely monotonic speech, which conveyed a distinctly depressed (and depressing) emotional tone. Therefore I would argue that the initial goal for emotional interfaces should be to simulate appropriate emotional reactivity by demonstrating an awareness of the emotional content of an interaction. The type of emotional reactivity that might be appropriately demonstrated by a conversational assistant can be illustrated by considering some imaginary computer responses to different situations. I’ve labeled each example with an emotional term or attitude description that could properly accompany the words,
giving them a more natural feel and a greater communicative potency.

The assistant is reporting the results of an assigned search task:
- I was unable to find anything meeting that description. (sadness)
- This is just what you were looking for. (pride)
- I’m not sure, but one of these might suit your needs. (uncertainty)
- Gee, and it only took me 12 minutes to find it! (embarrassment)

The assistant reacts to difficulties with the communication itself:
- I’m afraid I don’t understand what you mean by that. (confusion)
- I believe I just told you that I don’t know anything about that topic. (irritation)
- This doesn’t seem to be going so well ... could you try again? (embarrassment)
- I’m really sorry, but could you repeat that one more time? (solicitude)

The assistant detects a strong emotional reaction from the user:
- Whoa ... Can we calm down and try again? (calming)
- Gee, that was pretty frustrating, wasn’t it? (empathy)
- Great! Glad to be of help. (joy)

The assistant is trying to fulfill a user’s request to help modify the user’s behavior:
- Shouldn’t you get back to work now? (disapproval)
- If you don’t leave now, you’ll be late. (warning)
- Take a break now, if you want your wrists to get better! (commanding)
Discussion
Picard: ‘‘I was unable to find anything meeting that description.’’ When you assume goodness and sincerity and so forth, sadness could actually be expressed with no tone of voice. But of course we could all hear these said in other ways as well. And that’s where, I think, it’s interesting to consider not just the emotion of the system that is reading these sentences, but the emotion of the perceiver of these sentences. There are several interesting studies in which you present perceivers with a neutral stimulus, and a perceiver in a good mood will perceive the neutral stimulus as more positive, while the perceiver in a bad mood will perceive it as more negative.
Ball: In human interaction we see that distinction in the perceiver, and we react to that.
Picard: That’s right. E-mail that we send without tone really is more likely to be perceived ambiguously, so we may need to go to even greater efforts to verbally try to be clear, if we are positive, or just hurried, as opposed to angry: things that could easily be confused without the tone.
Bellman: Washington, D.C. had a big controversy over the voice that they were using for closing the doors inside a subway. Did you hear that?
Picard: Actually, there was a similar thing in Atlanta many years ago at the airport.
Bellman: They purposely made the voice a little bit brisk. They wanted to get people to get on: Doors are closing, get on! And there was actually a tremendous backfire against it. People found it was rude, it was a nasty voice. They finally had to get rid of it.
Picard: It’s funny: In Atlanta, they had started with a nice, human-sounding voice that sounded very friendly. And people didn’t pay much attention to it. So they went to a more computerized, synthetic-sounding voice that sounded sort of more high-tech and cold. The perceiver is influenced not just by their own emotions, but of course by what they think of that entity. They think, what is this computer that’s so stupid? Because so many people harbor these mixed feelings. We are very unusual in how we feel about computers compared to the rest of society.
Ball: And all these examples are imaginary. I think having the competence to generate the right one of these in the right situation is a huge goal. And getting it wrong is something that people can react very strongly to.
Ortony: There are huge individual differences. I have a friend from New York who has a reputation for being rather brusque. I find this little story illustrative. He finds Chicago intolerable compared to New York. One of the things that irritates him in Chicago is: people get off the bus and they thank the bus driver. He finds this absolutely incomprehensible behavior, like ‘‘The goddamned guy is paid to drive the bus. What’s the problem? You get off the bus and you go!’’ The point here is that people obviously have different personalities that require different interactional styles. Actually, some people will be upset by one style, while others will be satisfied. I mean, the environment includes the personality of the individual one is interacting with.
Picard: It’s going to be constantly changing in different situations. So, if the computer tries one of these lines on your friend, and your friend trashes the thing, then the computer had better not try any lines similar to that.

In these examples, the character’s linguistic expression is the clearest indicator of its emotional state, but if that expression is to seem natural and believable, it needs to be accompanied by appropriate nonverbal indications as well. Whether generated by preauthored scripts or from strong AI first principles, such utterances will seem false if the vocal prosody, hand gestures, facial expressions, and posture of the character do not match the emotional state expressed linguistically. In order to produce responses demonstrating as much emotional sensitivity as these examples suggest, a system must be able to recognize and/or predict the emotional state of the user, and then synthesize and communicate an appropriate emotional response from the computer. The next section describes a simple emotional model that can be used to adjust the emotional expression of a talking computer. While the motivation for this work is strongest for conversational systems, its application may be appropriate more generally. As computer use becomes ever more widespread in our culture, it is likely that we will see greatly increased attention to the subjective experiences of computer users, including the aesthetic and emotional impact of computer use. My expectation is that the experience gained from modeling the emotional impact of spoken interfaces will also be used to inform the design (and possibly the dynamic behavior) of conventional graphical interfaces, in order to improve user satisfaction.
11.2 A Bayesian Model of Emotion
Modeling Emotion
The understanding of emotion is the focus of an extensive psychology literature. Much of this work is based upon a deep understanding of an individual’s beliefs about how events will affect
him, and then modeling the way those beliefs lead to an emotional response (Scherer 1984; Ortony, Clore, and Collins 1988). While a few research efforts are attempting to build agents with sufficiently deep understanding that these models can be applied directly (Bates, Loyall, and Reilly 1994; Martinho and Paiva 1999), we have chosen to utilize a much simpler model of emotion, one that corresponds more directly to the universal responses (including physical responses) that people have to the events that affect them. Although this approach is unable to model many subtle emotional distinctions, it seems like a good match to conversational interfaces that communicate with people (within specific domains) using only a limited understanding of language and the user’s goals. The term emotion is used in psychology to describe short-term (often lasting only a few seconds) variations in internal mental state, including both physical responses, like fear, and cognitive responses, like jealousy. We focus on two basic dimensions of emotional response (Lang 1995) that can usefully characterize nearly any experience:

- Valence represents the positive or negative dimension of feeling.
- Arousal represents the degree of intensity of the emotional response.

Figure 11.1 shows the emotional space defined by these dimensions, and shows where a few named emotions fit within them. In our model, these two continuous dimensions are further simplified by encoding them as a small number of discrete values. Valence is considered to be either negative, neutral, or positive; similarly, arousal is judged to be excited, neutral, or calm. Psychologists also recognize that individuals have long-term traits that guide their attitudes and responses to events. The term personality is used to describe permanent (or very slowly changing) patterns of thought, emotion, and behavior associated with an individual. McCrae and Costa (1989) analyzed the five basic dimensions of personality (see Wiggins 1979), which form the basis of commonly used personality tests. They found that this interpersonal circumplex can be usefully characterized within a two-dimensional space. Taking an approach similar to our representation of emotion, we have incorporated into our model a representation of personality based upon the dimensions of:
Figure 11.1 The position of some named emotions within the Valence Arousal Space.
- Dominance, indicating an individual’s relative disposition toward controlling (or being controlled by) others
- Friendliness, measuring the tendency to be warm and sympathetic

Dominance is encoded in our model as dominant, neutral, or submissive; friendliness is represented as friendly, neutral, or unfriendly. Given this quite simple but highly descriptive model of an individual’s internal emotional state and personality type, we wish to relate it to behaviors that help to communicate that state to others. The behaviors to be considered can include any observable variable that could potentially be caused by these internal states. In laboratory settings, some of the most reliable measures of emotional state involve physiological sensing, such as galvanic skin response (GSR) and heart rate. For both emotion and personality, survey questions are often used to elicit accurate measures of internal state (with tests such as the Myers-Briggs Type Indicator; Myers and McCaulley 1985). However, in normal human interaction, we rely primarily on visual and auditory observation to judge the emotion and personality of others. A computer-based agent might be able to use direct sensors of physiological changes, but if those measures require the attachment
of unusual devices, they would be likely to have an adverse effect on the user’s perception of a natural interaction. For that reason, we have been most interested in observing behavior unobtrusively, either through audio and video channels, or possibly by using information (especially timing) that is available from traditional input devices like keyboards and mice, which might be a good indicator of the user’s internal state. More specialized devices like a GSR-sensing mouse or a pressure-sensitive keyboard might be worth investigating as well, although unless they turned out to be extraordinarily helpful, they are unlikely to make it into widespread use.
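For concreteness, the discrete state described above (two emotion dimensions and two personality dimensions, each with three values) might be encoded as follows; the type and field names are mine, not part of the model.

```python
from dataclasses import dataclass
from typing import Literal

Valence = Literal["negative", "neutral", "positive"]
Arousal = Literal["calm", "neutral", "excited"]
Dominance = Literal["submissive", "neutral", "dominant"]
Friendliness = Literal["unfriendly", "neutral", "friendly"]

@dataclass
class EmotionState:          # short-term state, changing over seconds
    valence: Valence = "neutral"
    arousal: Arousal = "neutral"

@dataclass
class Personality:           # long-term trait, effectively constant
    dominance: Dominance = "neutral"
    friendliness: Friendliness = "neutral"

user = EmotionState(valence="negative", arousal="excited")   # e.g., a frustrated user
agent = Personality(dominance="submissive", friendliness="friendly")
```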
Bayes Networks
Bayesian networks (Jensen 1996) are a formalism for representing networks of probabilistic causal interactions that have been effectively applied to medical diagnosis (Horvitz and Shwe 1995), troubleshooting tasks (Heckerman, Breese, and Rommelse 1995), and many other domains. Bayes nets have a number of properties that make them an especially attractive mechanism for modeling emotion. First, they deal explicitly with uncertainty at every stage, which is a necessity for modeling anything as inherently nondeterministic as the connections between emotion and behavior. For example, an approach using explicit rules (if behavior B is observed, then deduce the presence of emotional state E) would have great difficulty accounting for inconsistent reactions to the same events. However, a Bayes network will make predictions about the relative likelihood of different outcomes, which can naturally capture the inherent uncertainty in human emotional responses. Second, the links in a Bayes net are intuitively meaningful because they directly represent the connections between causes and their effects. For example, a link between emotional arousal and the base pitch of speech can be used to represent the theoretical effect that arousal (and the resulting increased muscular tension) has on the vocal tract. It is quite easy to encode the expectation that with increasing arousal, the base pitch of speech is likely to increase as well. The exact probabilities involved can still be difficult to determine, but if the network is designed so that the parameters represent relatively isolated effects, relevant quantitative information from psychological studies is sometimes
available. Moreover, any model with enough complexity to model even simple emotional responses is likely to have a large number of parameters that have to be determined, and in a Bayesian network these parameters at least have clearly understandable meanings. Finally, and especially relevant to the twin requirements of emotionally aware computing (recognizing emotion in the user, and simulating emotional response by the computer), Bayesian networks can be used both to calculate the likely consequences of changes to their causal nodes and also to diagnose the likely causes of a collection of observed values at the dependent nodes. This means that a single network (and all of its parameters) can be used for both the recognition and the simulation tasks. When used to simulate emotionally realistic behavior of the computer, the states of the internal nodes representing dimensions of emotion and personality can be set to the values that we wish the computer to portray. The evaluation of the Bayes net will then predict a probability distribution for each possible category of behavior. This has the extra advantage that by randomly sampling this distribution over time, we can very easily generate a sequence of computer behaviors that are consistent with the desired emotional state, but are not completely deterministic. Because excessively deterministic behavior is a strong indicator of mechanistic origins, observers frequently judge that such behavior appears unnatural. By introducing some random (but consistent) variability, that source of unnaturalness can be avoided. When the computer observes user behavior (through cameras, microphones, etc.) the observations can be used to set the values of the corresponding leaf (or dependent) nodes in the network. Evaluation of the network then results in estimated values for the internal dimensions of the user’s emotional state. The most probable value can be taken as the user’s state (as perceived by the computer). If multiple values have similar probabilities, the diagnosis can be treated as uncertain.
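The following toy example illustrates this dual use with a single two-node fragment (arousal causing base pitch), written in plain Python rather than any particular Bayes net library. The probabilities are invented for the example and are not taken from the chapter or from any study.

```python
P_AROUSAL = {"calm": 0.3, "neutral": 0.4, "excited": 0.3}          # prior over arousal
P_PITCH_GIVEN_AROUSAL = {                                          # conditional table
    "calm":    {"low": 0.6, "medium": 0.3, "high": 0.1},
    "neutral": {"low": 0.3, "medium": 0.4, "high": 0.3},
    "excited": {"low": 0.1, "medium": 0.3, "high": 0.6},
}

def predict_pitch(arousal):
    """Simulation direction: given an intended arousal level, the distribution
    over the observable base-pitch category (ready for sampling)."""
    return P_PITCH_GIVEN_AROUSAL[arousal]

def diagnose_arousal(observed_pitch):
    """Recognition direction: Bayes' rule over the same parameters."""
    joint = {a: P_AROUSAL[a] * P_PITCH_GIVEN_AROUSAL[a][observed_pitch]
             for a in P_AROUSAL}
    total = sum(joint.values())
    return {a: p / total for a, p in joint.items()}

print(predict_pitch("excited"))     # {'low': 0.1, 'medium': 0.3, 'high': 0.6}
print(diagnose_arousal("high"))     # 'excited' comes out as the most probable cause
```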
Emotion and Behavior
The Bayesian model that we have built (figure 11.2) contains internal states for emotional valence and arousal, and for the dominance and friendliness aspects of personality. These nodes are
treated as unobservable variables in the Bayesian formalism, with links connecting them to nodes representing aspects of behavior that are judged to be influenced by that hidden state. The behavior nodes currently represented include linguistic behavior (especially word selection), vocal expression (base pitch and pitch variability, speech speed and energy), posture, and facial expressions. Our Bayesian network therefore integrates information from a variety of observable linguistic and nonlinguistic behaviors. The static model described above can be extended to a version with temporal dependencies between current and previous values of the internal variables characterizing emotions. In this model, we assume that the values of the observable variables such as speech speed, wording, gesture, and so on are independent of emotions given the current emotional state. The variables describing emotions evolve over time, and in this model the interval between time slices is posited to be three seconds. Valence, modeled by the variable E-Valence(t) in the network, depends on valence in the previous time period, E-Valence(t-1), as well as the occurrence of a ValenceEvent(t-1) in the previous period. A valence event refers to an event in the interaction that affects valence. For example, in a troubleshooting application, a negative valence event might be a failed repair attempt or a misrecognized utterance. We have a similar structure for arousal, where the variable ArousalEvent(t-1) captures external events that may affect arousal in the current period, with discrete states calming, neutral, and exciting. The conditional probability distribution indicating the dynamic transition probabilities is shown in figure 11.3. The distribution does not admit a direct transition from a calm state of arousal to an excited state. (Note: this distribution is illustrative, and not based on a formal study or experiment.) Because personality, by definition, is a long-term trait, we treat these variables as not being time dependent in the model, hence the lack of a time index in the variable names for personality, P-Friendly and P-Dominant. Note that the existence of the personality variables in the model induces a dependency among the observables at all times, so the model is not strictly Markovian in the sense that observations are conditionally independent of the past, given the current unknown emotional state. However, this model can be converted to a Markovian representation for inference.
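To illustrate one time-slice update, the sketch below propagates a belief over arousal forward by one three-second slice. The transition table is invented for the example, but, like the distribution in figure 11.3, it assigns zero probability to a direct jump from calm to excited.

```python
TRANSITION = {   # P(Arousal_t | Arousal_{t-1}, ArousalEvent_{t-1}); values are invented
    ("calm", "calming"):     {"calm": 0.9, "neutral": 0.1, "excited": 0.0},
    ("calm", "neutral"):     {"calm": 0.8, "neutral": 0.2, "excited": 0.0},
    ("calm", "exciting"):    {"calm": 0.4, "neutral": 0.6, "excited": 0.0},
    ("neutral", "calming"):  {"calm": 0.5, "neutral": 0.5, "excited": 0.0},
    ("neutral", "neutral"):  {"calm": 0.1, "neutral": 0.8, "excited": 0.1},
    ("neutral", "exciting"): {"calm": 0.0, "neutral": 0.5, "excited": 0.5},
    ("excited", "calming"):  {"calm": 0.1, "neutral": 0.6, "excited": 0.3},
    ("excited", "neutral"):  {"calm": 0.0, "neutral": 0.3, "excited": 0.7},
    ("excited", "exciting"): {"calm": 0.0, "neutral": 0.1, "excited": 0.9},
}

def step_arousal(belief, event):
    """Propagate a belief distribution over arousal one 3-second slice forward,
    conditioned on the external ArousalEvent of the previous slice."""
    new_belief = {"calm": 0.0, "neutral": 0.0, "excited": 0.0}
    for prev, p_prev in belief.items():
        for nxt, p_trans in TRANSITION[(prev, event)].items():
            new_belief[nxt] += p_prev * p_trans
    return new_belief

belief = {"calm": 1.0, "neutral": 0.0, "excited": 0.0}
print(step_arousal(belief, "exciting"))   # mass moves to neutral, none to excited yet
```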
In the next section, we discuss the architecture of the emotional component of a complete interactive system. Following that, we present more detail on the model’s treatment of specific behaviors—particularly linguistic and vocal expression.
11.3 Emotional Interactive Systems
In an emotionally aware interactive system, the recognition and simulation of emotion will play an auxiliary and probably quite subtle role. The goal is to provide an additional channel of communication alongside the spoken or graphical exchanges that carry the main content of the interaction. If the emotional aspects of the system call attention to themselves, the primary motivation of producing natural interactions will have been defeated. In fact, users who get the feeling that the system is monitoring them too closely may begin to feel anxious or resentful (of course, the emotional system, recognizing that fact, could always turn itself off!). Because recognizing emotional behaviors is likely to require considerable effort and produce only a modest benefit, a single emotional component will probably have to be shared among many applications in order to be practical. Therefore, another attraction of adopting a simple noncognitive model of emotion is the ability to keep the emotional component independent of most of the domain-aware portions of the system. If we observe and simulate emotional behaviors that are expressed automatically and unconsciously, then the recognition and interpretation of those behaviors can take place in an independent subsystem. Thus we may well see the creation of just a few competing ‘‘emotion chips’’ that can be incorporated into many applications. These modules will be responsible for receiving sensory input and estimating the current emotional state of the user, selecting the emotional response from the system that will be most appropriate, and then modifying the speech and animated behavior of the system in order to express the selected response in a natural way.
System Structure
The system architecture we have experimented with is shown in figure 11.4. In our agent, we maintain two copies of the emotion/personality model: one is used to assess the user's emotional state, the other to generate behavior for the agent.
Figure 11.4 An architecture for speech and interaction interpretation and subsequent behavior generation by a character-based agent. [Diagram not reproduced: USER → Observation → Emotion and Personality Assessment → User's E&P → Policy → Agent's E&P → Emotion & Personality Simulation → Behavior → AGENT.]
The model operates in a cycle, continuously repeating the following steps. (A minimal code sketch of this cycle follows the list.)

1. Observation. First, the available sensory input is analyzed to identify the value of any relevant input nodes. For example, a phrase spoken by the user might be recognized as one possible paraphrase among a group of semantically equivalent, but emotionally distinct, ways of expressing a concept. (The modeling of such alternatives is discussed in the next section.) In parallel, the vision subsystem might report its analysis that the user is currently producing large and fast gestures along with their speech. For each such perception, the corresponding node in the diagnostic copy of the Bayesian network is set to the appropriate value.

2. Assessment. Next, we use a standard probabilistic inference algorithm (Jensen 1989; Jensen 1996) to update the emotion and personality nodes in the diagnostic network to reflect the new evidence.

3. Policy. The linkage between the models is captured in the policy component. This component makes the judgment of what emotional response from the computer is desirable, given the new estimate of the user's emotional state. Possible approaches to the policy component are discussed in the next section.

4. Simulation. Next, a probabilistic inference algorithm is applied to the second copy of the Bayes network. This time, the consequences of the new states of the emotion and personality nodes are propagated to generate probability distributions over the available behaviors of the agent. These distributions indicate which paraphrases, animations, speech characteristics, and so on would be most consistent with the agent's emotional state and personality (as determined by the policy module).

5. Behavior. Some agent behaviors can be expressed immediately; for example, instructions for changes in posture or facial expression can be transmitted directly to the animation routines and generate appropriate background movement. Other behavior nodes act as modifiers on application commands to the agent. At a given stage of the dialogue, the application may dictate that the agent should express a particular concept, such as a greeting or an apology. The current distribution for the node corresponding to that concept is then sampled to select a paraphrase to use in the spoken message.
Policy
The policy module has not been explored very thoroughly at this point. In a working system it would likely be quite complex, taking into account the history of the dialogue with the user thus far (or at least its emotional trajectory) and a model of the particular user's preferences, as well as the estimates from the personality and emotion nodes in the diagnostic network. The imagined responses shown in the section entitled Emotionally Aware Computing illustrate a few of the difficulties. For example, at what point should a computer agent express irritation toward a user? Conversational systems frequently encounter explicit attempts to "break the demo." The form of such an attack is sometimes sufficiently predictable that a clever response can be generated in an attempt to deflect it. If the user then persists in generating additional antagonistic input, perhaps an expression of irritation is the appropriate response.

Thus far, we have only considered two very simplistic policies. The empathetic agent tries to match the user's emotion and personality. There is some evidence that people prefer to deal with a computer agent that is similar to themselves (Reeves and Nass 1995), so this might be a good starting point.
Of course, it does lead to a possible positive feedback loop, particularly if the user becomes angry! We have also experimented briefly with a contrary agent, whose emotions and personality tend to be the exact opposite of the user's. While there are particular contexts in which this may produce interesting results (for example, when the user becomes bored or sad), it obviously is too simplistic to be a general policy.
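Stated over the valence/arousal pair, the two policies can be written in a few lines. This is illustrative only; the chapter does not specify the policies at this level of detail, and the numeric encoding is an assumption.

```python
# Minimal sketch of the empathetic and contrary policies discussed above.

from dataclasses import dataclass

@dataclass
class EmotionalState:          # redeclared so this snippet stands alone
    valence: float   # -1.0 .. +1.0
    arousal: float   #  0.0 .. 1.0

def empathetic_policy(user: EmotionalState) -> EmotionalState:
    """Match the user's state (risks a positive feedback loop if the user is angry)."""
    return EmotionalState(valence=user.valence, arousal=user.arousal)

def contrary_policy(user: EmotionalState) -> EmotionalState:
    """Adopt roughly the opposite state, e.g., become lively when the user seems bored."""
    return EmotionalState(valence=-user.valence, arousal=1.0 - user.arousal)
```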
Discussion
Bellman: So, is this policy box like the kind of thing in Eliza and other systems that we know, which would basically decide how friendly or how sympathetic the agent is?
Ball: Yes, it decides what its emotional state is. If the user is angry, then just being deadpan in response to that isn't, I think, the right choice.
Ortony: The social interaction rules, essentially.
Ball: But there is a complex choice about this, given the history of the interaction and the long-term assessment of this user and all kinds of complex issues.
Picard: A whim to apologize, for example.
Ball: If you decide to apologize, for example, then you want the behavior or expression of the agent's emotional state to be appropriate.
Bellman: The question I was asking actually was: It seemed to me that the policy you have represents the personality of the agent, and you could have your settings there. But then, you have your definition of the cultural interactions that are allowed, the conversational rules and other kinds of things that would be culturally determined. You wouldn't sell the same agents and policies in Japan as you would here.
Ball: Sure.
Sloman: But at the moment they are collapsed into a policy box.
Ball: Well, it's the policy box, and also outside the policy box. So, there's something that's controlling the overall interaction, the dialogue, deciding what to say, what choice to take.
Bellman: So, what I am saying is that it should have at least two boxes of that.
Ball: I am saying that this is just a little sideline to some whole other system which is doing the communication and the task and all of that. And so, this is just providing a little, subtle modulation of the style of the interaction in order to try to make it feel more believable.
11.4 Recognition and Simulation
Linguistic Behavior
A key method of communicating emotional state is choosing among semantically equivalent, but emotionally diverse, paraphrases: for example, the difference between responding to a request with "sure thing," "yes," or "if you insist." Similarly, an individual's personality type will frequently influence their choice of phrasing: for example, "you should definitely" versus "perhaps you might like to." Our approach to differentiating the emotional content of language is based on behavior nodes that represent "concepts," each including a set of alternative expressions or paraphrases. Some examples are shown in table 11.1. We model the influence of emotion and personality on wording choice in two stages, only the first of which is shown in the network of figure 11.1.
Table 11.1 Paraphrases for alternative concepts

greeting: Hello / Hi there / Howdy / Greetings / Hey
yes: Yes / Yeah / I think so / Absolutely / I guess so / For sure
suggest: I suggest that you / Perhaps you would like to / Maybe you could / You should / Let's
Because the choice of a phrase can have a complex relationship with both emotion and personality, the problem of directly assessing probabilities for each alternative depending on all four dimensions rapidly becomes burdensome. However, inspired by Osgood's work on meaning (Osgood, Suci, and Tannenbaum 1967), in which he identified several dimensions that can be used to characterize the connotations of most concepts, we first capture the relationship between emotion and several "expressive styles." The current model has nodes representing positive, strong, and active styles of expression (similar to Osgood's evaluative, potent, and active dimensions), as well as measures of terseness and formality (see figure 11.5). These nodes depend upon the emotion and personality nodes and capture the probability that individuals express themselves in a positive (judgmental), strong, active, terse, and/or formal manner. Each of these nodes is binary valued, true or false. Thus this stage captures the degree to which an individual with a given personality and in a particular emotional state will tend to communicate in a particular style.

The second stage captures the degree to which each paraphrase actually is positive, strong, active, terse, and formal. This stage says nothing about the individual, but rather reflects a general cultural interpretation of each paraphrase: that is, the degree to which that phrase will be interpreted as positive, active, and so on by a speaker of American English. A node such as "GreetPositive" is also binary valued, and is true if the paraphrase would be interpreted as "positive" and false otherwise.

Finally, a set of nodes evaluates whether the selected paraphrase of a concept actually matches the chosen value of the corresponding expressive style. A node such as "GreetMatchPositive" has value true if and only if the values of "GreetPositive" and "wdsPositive" are the same. The node "GreetMatch" is simply a Boolean that has value true when all of its parents (the match nodes for each expressive style) are true. When using the network, we set "GreetMatch" to have an observed value of true. This causes the Bayesian inference algorithm to force the values of the nodes in the concept and style stages to be consistent. For example, when simulating the behavior of an agent, each style node (like "wdsPositive") will have a value distribution implied by the agent's personality and emotional state. The likelihood of alternative phrasings of a concept node (like "Greet") will then be adjusted in order to produce the best possible match between its attributes and the style nodes. In this fashion, a negative emotional state will greatly increase the chance that the agent will select "Oh, you again" as a greeting. (A small code sketch of this construction appears after figure 11.5.)
[Figure 11.5 (diagram not reproduced): the Bayesian network fragment for the "Greet" concept, linking expressive-style nodes such as GreetPositive to greeting paraphrases including "Hi there," "Good to see you again," and "Oh, you again."]
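With the match nodes observed true, the posterior over paraphrases collapses to a simple product: for each style dimension, multiply in the probability that the style node agrees with the paraphrase's cultural attribute. The sketch below computes that product directly; the style probabilities and attribute scorings are invented for illustration and are not the chapter's actual assessments.

```python
# Hedged sketch of paraphrase selection via the match-node construction.
# Attribute scorings and style probabilities below are illustrative assumptions.

STYLE_DIMS = ["positive", "strong", "active", "terse", "formal"]

# Second-stage cultural interpretation of each greeting paraphrase:
# True means a speaker of American English would read the phrase as <dim>.
GREET_ATTRS = {
    "Hello":                  {"positive": True,  "strong": False, "active": False, "terse": True,  "formal": True},
    "Hi there":               {"positive": True,  "strong": False, "active": True,  "terse": True,  "formal": False},
    "Good to see you again.": {"positive": True,  "strong": True,  "active": False, "terse": False, "formal": True},
    "Oh, you again.":         {"positive": False, "strong": True,  "active": False, "terse": True,  "formal": False},
}

def paraphrase_distribution(style_probs: dict) -> dict:
    """style_probs[d] = P(speaker expresses themselves in style d | emotion, personality)."""
    scores = {}
    for phrase, attrs in GREET_ATTRS.items():
        score = 1.0
        for d in STYLE_DIMS:
            p_true = style_probs[d]
            score *= p_true if attrs[d] else (1.0 - p_true)
        scores[phrase] = score
    total = sum(scores.values())
    return {phrase: s / total for phrase, s in scores.items()}

if __name__ == "__main__":
    # A negative, aroused state might imply a low probability of a "positive" style,
    # pushing probability mass toward "Oh, you again."
    grumpy = {"positive": 0.1, "strong": 0.7, "active": 0.4, "terse": 0.7, "formal": 0.2}
    print(paraphrase_distribution(grumpy))
```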
In developing a version of this Bayes net for a particular application, we need to generate a network fragment, such as the one shown in figure 11.5, for each conceptual element for which we want emotional expression. These fragments are merged into a global Bayesian network capturing the dependencies between the emotional state, personality, natural language, and other behavioral components of the model. The various fragments differ only in the assessment of the paraphrase scorings: that is, the probability that each paraphrase will be interpreted as active, strong, and so on. There are five assessments needed for each alternative paraphrase of a concept (the ones mentioned earlier, plus a formality assessment). Note that the size of the belief network representation grows linearly in the number of paraphrases (the number of concepts modeled times the number of paraphrases per concept). In a previously proposed model structure, we had each of the expressive style nodes pointing directly into the concept node, creating a multistate node with five parents. The assessment burden in this structure was substantial, and a causal independence assumption such as noisy-or is not appropriate (Heckerman 1993). The current structure reduces this assessment burden and also allows modular addition of new expressive style nodes. If we add a new expressive style node to the network (such as cynical), then the only additional assessments we need to generate are the cynical interpretation nodes of each concept paraphrase. These features of the Bayes network structure make it easy to extend the model to new concepts and dimensions of expressive style. (A sketch of this fragment representation follows.)
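The modularity claim can be made concrete with a small data structure: each concept fragment stores one boolean scoring per paraphrase per style dimension, so storage grows linearly with paraphrases, and a new style such as "cynical" only needs new scorings for the existing paraphrases. The data below is hypothetical.

```python
# Illustrative sketch of concept fragments and of adding a new expressive-style
# dimension; all scorings are invented.

concept_fragments = {
    "greeting": {
        "Hello":          {"positive": True,  "strong": False, "active": False, "terse": True,  "formal": True},
        "Oh, you again.": {"positive": False, "strong": True,  "active": False, "terse": True,  "formal": False},
    },
    "yes": {
        "Absolutely":     {"positive": True,  "strong": True,  "active": True,  "terse": True,  "formal": False},
        "If you insist":  {"positive": False, "strong": False, "active": False, "terse": False, "formal": True},
    },
}

def add_style_dimension(fragments: dict, dim: str, scorings: dict) -> None:
    """Attach one new assessment per paraphrase; nothing else in the model changes."""
    for concept, paraphrases in fragments.items():
        for phrase in paraphrases:
            paraphrases[phrase][dim] = scorings[concept][phrase]

add_style_dimension(concept_fragments, "cynical", {
    "greeting": {"Hello": False, "Oh, you again.": True},
    "yes":      {"Absolutely": False, "If you insist": True},
})
```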
Vocal Expression
As summarized by Murray and Arnott (1993), there is a considerable (but fragmented) literature on the vocal expression of emotion. Research has been complicated by the lack of agreement on the fundamental question of what constitutes emotion, and how it should be measured. Most work is based upon either self-reporting of emotional state or upon an actor’s performance of a named emotion. In both cases, a short list of ‘‘basic emotions’’ is generally used; however, the categories used vary among studies.
A number of early studies demonstrated that vocal expression carries an emotional message independent of its verbal content, using very short fragments of speech, meaningless or constant carrier phrases, or speech modified to make it unintelligible. These studies generally found that listeners can recognize the intended emotional message, although confusions between emotions with a similar arousal level are relatively frequent. Using synthesized speech, Janet Cahn (1989) showed in her MIT master's thesis that the acoustic parameters of the vocal tract model in the DECtalk speech synthesizer could be modified to express emotion, and that listeners could correctly identify the intended emotional message in most cases. Studies done by the Geneva Emotion Research Group (Johnstone, Banse, and Scherer 1995; Banse and Scherer 1996) have looked at some of the emotional states that seem to be most confusable in vocal expression. They suggest, for example, that the communication of disgust may not depend on acoustic parameters of the speech itself, but on short sounds generated between utterances. In more recent work (Johnstone and Scherer 1999), they have collected both vocal and physiological data from computer users expressing authentic emotional responses to interactive tasks. The body of experimental work on vocal expression indicates that arousal, or emotional intensity, is encoded fairly reliably in the average pitch and energy level of speech. This is consistent with the theoretical expectation of increased muscle tension in high arousal situations. Pitch range and speech rate also show correlations with emotional arousal, but these are less reliable indicators. The communication of emotional valence through speech is a more complicated matter. While there are some interesting correlations with easily measured acoustic properties (particularly pitch range), complex variations in rhythm seem to play an important role in transmitting positive/negative distinctions. In spite of the widely recognized ability to "hear a smile," which Tartter (1980) related to formant shifts and speaker-dependent amplitude and duration changes, no reliable acoustic measurements of valence have been found. Roy and Pentland (1996) more recently performed a small study in which a discrimination network trained with samples from three speakers expressing imagined approval or disapproval was able to distinguish those cases with reliability
comparable to human listeners. Thus recognition of emotional valence from acoustic cues remains a possibility, but supplementary evidence from other modalities (especially observation of facial expression) will probably be necessary to achieve reliable results. Our preliminary Bayesian subnetwork representing the effects of emotional valence and arousal on vocal expression therefore reflects the trends reported in the literature cited above, as follows (a code sketch of this mapping appears after the list).

With increasing levels of emotional arousal, we expect to find:
- Higher average pitch
- Wider pitch range
- Faster speech
- Higher speech energy

As the speaker feels more positive emotional valence, their speech will tend toward:
- Higher average pitch
- A tendency toward a wider pitch range
- A bias toward higher speech energy
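For simulation, these qualitative trends can be mapped onto prosody controls for a speech synthesizer. The coefficients below are invented: the chapter reports only the direction of each effect, and notes that valence effects are weaker and less reliable than arousal effects.

```python
# Hedged sketch of mapping arousal and valence to relative prosody settings.
# All coefficients are illustrative assumptions.

from dataclasses import dataclass

@dataclass
class SpeechParams:
    base_pitch: float      # relative multiplier, 1.0 = speaker's neutral pitch
    pitch_range: float     # relative multiplier
    rate: float            # relative multiplier
    energy: float          # relative multiplier

def vocal_expression(arousal: float, valence: float) -> SpeechParams:
    """arousal in [0, 1], valence in [-1, 1]; returns relative prosody settings."""
    a = arousal - 0.5          # deviation from neutral arousal
    return SpeechParams(
        base_pitch  = 1.0 + 0.30 * a + 0.05 * valence,
        pitch_range = 1.0 + 0.40 * a + 0.10 * valence,
        rate        = 1.0 + 0.25 * a,                  # valence effect on rate not modeled
        energy      = 1.0 + 0.35 * a + 0.05 * valence,
    )
```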
Gesture and Posture
Humans communicate their emotional state constantly through a variety of nonverbal behaviors, ranging from explicit (and sometimes conscious) signals like smiles and frowns, to subtle (and unconscious) variations in speech rhythm or body posture. Moreover, people are correspondingly sensitive to the signals produced by others, and can frequently assess the emotional states of one another accurately even though they may be unaware of the observations that prompted their conclusions. The range of nonlinguistic behaviors that transmit information about personality and emotion is quite large. We have only begun to consider them carefully and list here just a few of the more obvious examples. Emotional arousal affects a number of (relatively) easily observed behaviors, including speech speed and amplitude, the size and speed of gestures, and some aspects of facial expression and posture. Emotional valence is signaled most clearly by facial expression, but can also be communicated by means of the pitch contour and rhythm of speech. Dominant personalities might be expected to generate characteristic rhythms and amplitude of speech, as well as assertive postures and gestures.
Friendliness will typically be demonstrated through facial expressions, speech prosody, gestures, and posture. The observation and classification of emotionally communicative behaviors raises many challenges, ranging from simple calibration issues (e.g., speech amplitude) to gaps in psychological understanding (e.g., the relationship between body posture and personality type). However, in many cases the existence of a causal connection is uncontroversial, and given an appropriate sensor (e.g., a gesture size estimator from camera input), the addition of a new source of information to our model will be fairly straightforward.

Within the framework of the Bayesian network of figure 11.1, it is a simple matter to introduce a new source of information to the emotional model. For example, suppose we obtained a new speech recognition engine that reported the pitch range of the fundamental frequencies in each utterance (normalized for a given speaker). We could add a new network node that represents PitchRange with a few discrete values and then construct causal links from any emotion or personality nodes that we expect to affect this aspect of expression. In this case, a single link from Arousal to PitchRange would capture the significant dependency. Then the model designer would estimate the distribution of pitch ranges for each level of emotional arousal, to capture the expectation that increased arousal leads to generally raised pitch. The augmented model would then be used both to recognize that increased pitch may indicate emotional arousal in the user and to add to the expressiveness of a computer character by enabling it to communicate heightened arousal by adjusting the base pitch of its synthesized speech. (A sketch of such an extension appears below.)
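The diagnostic half of such an extension reduces to Bayes' rule: given a designer-estimated distribution P(PitchRange | Arousal), an observed pitch range updates the belief over arousal. The numbers below are illustrative, not estimates from the chapter.

```python
# Hedged sketch of adding a PitchRange observation node with a single parent, Arousal.
# Prior and conditional probabilities are invented for illustration.

AROUSAL_PRIOR = {"calm": 0.3, "neutral": 0.5, "excited": 0.2}

# P(PitchRange | Arousal), with PitchRange discretized as narrow/medium/wide
PITCH_RANGE_CPT = {
    "calm":    {"narrow": 0.70, "medium": 0.25, "wide": 0.05},
    "neutral": {"narrow": 0.30, "medium": 0.50, "wide": 0.20},
    "excited": {"narrow": 0.05, "medium": 0.35, "wide": 0.60},
}

def update_arousal(observed_range: str, prior=AROUSAL_PRIOR) -> dict:
    """Posterior over Arousal after observing the normalized pitch range."""
    unnormalized = {a: prior[a] * PITCH_RANGE_CPT[a][observed_range] for a in prior}
    z = sum(unnormalized.values())
    return {a: p / z for a, p in unnormalized.items()}

if __name__ == "__main__":
    print(update_arousal("wide"))  # probability mass shifts toward "excited"
```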
11.5 Concerns
I think it is appropriate at this point to raise two areas of concern for research involving emotional response and computing.
Overhyping Emotional Computing
First, any mention of emotion and computers in the same breath gets an immediate startle reaction from members of the general public (and the media). Even if we were to avoid any discussion of the far-out questions (Will computers ever truly "feel" an emotion?),
I think we will be well advised to take exceptional care when explaining our work to others. The good news is that the idea of a computer with emotional sensitivity seems to be getting a lot of serious interest and discussion. Perhaps there is a shared (though unarticulated) appreciation that a device that is so capable of generating emotional responses should also know how to respond to them! However, the recent high level of interest may generate an unreasonably high level of expectation for technology that dramatically and reliably interprets emotional behavior. My personal expectation is that the significance of emotionally aware computing will be subtle, and will only reach fruition when spoken interaction with computers becomes commonplace. Moreover, as a technology that works best when it isn't noticed, emotional computing should probably try to avoid the limelight as much as possible.
Ethical Considerations
When considering the ethics of emotional computing, there are many pitfalls. Some people may find the idea of ascribing an attribute as distinctively human as the communication of emotion to a computer inherently objectionable. But even considering only more prosaic concerns, there are some potential uses of emotionally sensitive computing that clearly cross the boundaries of ethical behavior. Emotion could be a very powerful persuasive tool if used effectively, especially coming from the personification of your own computer, with which you may have a long and productive relationship. B. J. Fogg (1999) at Stanford University has begun to seriously consider these issues, under the name of captology (computers as persuasive technologies).

I believe it would be irresponsible of us to pretend that none of our ideas will ever be used to unethically persuade people. Con artists have always been quick to adopt any available technology to defraud unsuspecting people. The age-old defense of "if we don't do this, someone else will," while true, doesn't seem to me a sufficient response. I would be interested in hearing the thoughts of other workshop members on this topic. My best idea is to try to develop a clear consensus on what exactly would constitute an unethical use of emotionally aware computing. If we could agree on that, we could quickly and vocally
object to unethical behavior (particularly commercial uses) when it occurs, and by announcing that intention in advance, perhaps dissuade at least some people from misusing our ideas.
Discussion: The Ethics of Emotional Agents
Picard: Let me tell you a short story. We ran an experiment. We frustrated the test subjects, and we built an agent that tries to make them feel less frustrated by using socially acceptable strategies. And it looks like we succeeded: the people who worked with the agent showed behavior that was indicative of significantly less frustration. I was explaining this to some Sloan fellows, one of them from a very large computer company, and he said: No surprise to us. We found, when we surveyed our customers who had had our product, that those who had had a problem with the product, found it defective, and had gotten this empathy-sympathy-active-listening kind of response from customer service people (not from an agent!) were significantly more likely to buy our products again than those who bought a product and had no problems with it. And he said that they seriously confronted this as an ethical issue. Furthermore, all these years I have talked to visitors of our lab. Every one of them lately has raised the issue of how they have to get technology out faster, and they are now aiming not for it to have gone through alpha and beta cycles, and so forth, but they say, "60 percent ready, and then it goes out there." They are excited about the fact that getting something out that is defective and handling complaints about it better could actually lead to better sales.
Ortony: And the real ethical problem that I guess the guys are focusing on is that this is a sort of unknown property: when do you come in? If one were to explicitly say, "Whenever you encounter a bug in our software, some very nice agents are going to come and calm you down," then you would not have an ethical problem. Then you think the ethical problem would go away. It is the fact that this is "unsolicited" behavior from the system that's problematic, I presume.
Ball: I am not sure if it would go away.
Picard: I am not so sure if it is unsolicited, either.
Ortony: Well, it diminishes, because, after all, you design cars with features to make people comfortable, but they are visible and they are available for inspection prior to purchase, and so you don't feel bad that you have made power seats as opposed to manual seats. On the other hand: what does a power seat do? It makes you feel all easy about changing the position of your seat, and all kinds of things that just make you feel better as a user. We don't have a problem with that, presumably because it's explicit and openly available and inspectable rather than requiring actual interaction with a product.
Ball: Right. And so, I think there is a deep problem about the emotional component, because it needs to be hidden in order to make sense.
Bellman: And there is a privacy issue somewhere, just in terms of the modeling that you do about the user, and who you pass it to, and what other kinds of reasons it is used for. Let me just mention the example of school children who were being taught how they could get their parents' financial information back to companies.
Ball: If you have an agent that's observing your behavior for a long period of time, he is going to know a lot about you. I think it is relatively easy to draw a line and say: there is a lot of information, and it just never goes out of your machine.
Picard: In an office environment, the company owns your machine and what's on it. But in the future, you know, you might be much more comfortable with a mediator that you trust, operating between you and the office machine, if that mediator was as comfortable as your earring or your jewelry or something that you owned.
Sloman: Or it's your personal computer, that you bring into the office and plug into the main one, rather than an earring.
Bellman: What I was pointing out is: why should this be a secret from the patient or from the person who is being educated? Why can't they have control over it? And that fits in with your wearable devices. In some sense, yes, you have this wonderful computer-based technology that allows this kind of collection about you, but you have control over it. That's part of the solution. The other thing is that a lot of the virtual world work is still highly effective even when it's transparent to the user. Many of you seem to assume that it takes away from the mythology of the character if somehow people begin to lift the hood. In my experience we have found just the opposite. In really hundreds and hundreds of cases, letting people actually, for example, walk into your office, and they meet your characters, and they actually look under the lifted hood, see how it's set up, see the way in which it works, and they pick up issues about how responsive it is, or what it does, or what it collects on them. We have not found that this knowledge would actually take away the experience. It's very empowering, and it's an interesting way of thinking about these control issues.
Sloman: These points are all about how much the end user has access to information. And it's quite important that often they cannot absorb and evaluate the information if they get it themselves. You may have to have third parties, like consumer associations and other people who have the right to investigate these things, to evaluate them, and then to publicize. If you are not an expert, you go to someone you trust. That is not necessarily your earring or your personal computer, but it might be another person or an organization who has looked at this thing, and you will be in a better position. So, the information must be available.
Picard: We have to distinguish real-time, run-time algorithms from store-up-lots-of-data, do-it-slowly algorithms. If you allow accumulation, then we are getting to know somebody over a long period of time, and you can build up their goals, values, and expectations, all these other things that help predict the emotion, since it's not just what you see right now, but it's also what you know about the person. Whereas if you don't keep all that person-specific memory, you can only have "commonsense about prototypes of people," and then take what you observe at face value from this stranger, so to speak. So, in the latter case, I believe, we can do without any problems of privacy. We just build up really good models of what is typical, and they won't always be as good. But if you really want the system to get to know you intimately, to know your values, how you are likely to respond to the situation, there is going to have to be some memory. We can't do it all in the run-time. And that's an issue of privacy. But we can go a long way without hitting the privacy issue. And once we do hit it, there are a whole lot of possible solutions to it.
References
Ball, G., Ling, D., Kurlander, D., Miller, J., Pugh, D., Sally, T., Stankosky, A., Thiel, D., van Dantzich, M., and Wax, T. (1997): Lifelike Computer Characters: The Persona Project at Microsoft Research. In J. M. Bradshaw, ed., Software Agents, 191–222. AAAI Press/MIT Press, Menlo Park, Calif.
Banse, R., and Scherer, K. R. (1996): Acoustic profiles in vocal emotion expression. J. Personality Social Psychol. 70: 614–636.
Bates, J., Loyall, A. B., and Reilly, W. S. (1994): An Architecture for Action, Emotion, and Social Behavior. In C. Castelfranchi and E. Werner, eds., Artificial Social Systems: Fourth European Workshop on Modeling Autonomous Agents in a Multi-Agent World, MAAMAW '92, S. Martino al Cimino, Italy, July 29–31, 1992. Lecture Notes in Computer Science Vol. 830, Springer-Verlag, Berlin.
Cahn, J. E. (1989): Generating Expression in Synthesized Speech. Master's Thesis, Massachusetts Institute of Technology, May 1989.
Cialdini, R. B. (1993): Influence: The Psychology of Persuasion. Quill, William Morrow, New York.
Elliot, C. D. (1992): The Affective Reasoner: A Process Model of Emotions in a Multi-Agent System. Ph.D. diss., Northwestern University, Evanston, Ill.
Flanagan, J., Huang, T., Jones, P., and Kasif, S. (1997): Final Report of the NSF Workshop on Human-Centered Systems: Information, Interactivity, and Intelligence. National Science Foundation, Washington, D.C.
Fogg, B. J. (1999): Persuasive Technologies. Commun. ACM 42 (5): 26–29.
Heckerman, D. (1993): Causal Independence for Knowledge Acquisition and Inference. In D. Heckerman and A. Mamdani, eds., Proceedings of the Ninth Conference on Uncertainty in Artificial Intelligence, 122–127. Morgan Kaufmann, San Mateo, Calif.
Heckerman, D., Breese, J., and Rommelse, K. (1995): Troubleshooting under Uncertainty. Commun. ACM 38 (3): 49–57.
Horvitz, E., and Shwe, M. (1995): Melding Bayesian Inference, Speech Recognition, and User Models for Effective Handsfree Decision Support. In R. M. Gardner, ed., Proceedings of the Symposium on Computer Applications in Medical Care. IEEE Computer Society Press, Long Beach, Calif.
Huang, X., Acero, A., Alleva, F., Hwang, M. Y., Jiang, L., and Mahajan, M. (1995): Microsoft Windows's Highly Intelligent Speech Recognizer: Whisper. In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, Detroit, vol. 1, 93–96. IEEE.
Jensen, F. V. (1996): An Introduction to Bayesian Networks. Springer, Berlin, Heidelberg, New York.
Jensen, F. V., Lauritzen, S. L., and Olesen, K. G. (1989): Bayesian Updating in Recursive Graphical Models by Local Computations. TR R-89-15, Institute for Electronic Systems, Department of Mathematics and Computer Science, University of Aalborg, Denmark.
Johnstone, I. T., Banse, R., and Scherer, K. R. (1995): Acoustic Profiles from Prototypical Vocal Expressions of Emotion. In K. Elenius and P. Branderud, eds., Proceedings of the XIIIth International Congress of Phonetic Sciences, vol. 4, 2–5. KTH/Stockholm University, Stockholm.
Johnstone, I. T., and Scherer, K. R. (1999): The Effects of Emotions on Voice Quality. In J. J. Ohala, Y. Hasegawa, M. Ohala, D. Granville, and A. C. Bailey, eds., Proceedings of the XIVth International Congress of Phonetic Sciences, vol. 3, 2029–2032. American Institute of Physics, New York.
Klein, J., Moon, Y., and Picard, R. W. (1999): This Computer Responds to User Frustration: Theory, Design, Results, and Implications. TR-501, MIT Media Laboratory, Vision and Modeling Technical Group.
Lang, P. (1995): The Emotion Probe: Studies of Motivation and Attention. Am. Psychol. 50 (5): 372–385.
Martinho, C., and Paiva, A. (1999): Pathematic Agents: Rapid Development of Believable Emotional Agents in Intelligent Virtual Environments. In O. Etzioni, J. P. Müller, and J. M. Bradshaw, eds., Proceedings of the Third Annual Conference on Autonomous Agents, AGENTS '99, 1–8. ACM Press, New York.
McCrae, R., and Costa, P. T. (1989): The Structure of Interpersonal Traits: Wiggin's Circumplex and the Five Factor Model. J. Pers. Soc. Psychol. 56 (5): 586–595.
Murray, I. R., and Arnott, J. L. (1993): Toward the Simulation of Emotion in Synthetic Speech: A Review of the Literature on Human Vocal Emotion. J. Acoust. Soc. Am. 93 (2): 1097–1108.
Myers, I. B., and McCaulley, M. H. (1985): Manual: A Guide to the Development and Use of the Myers-Briggs Type Indicator. Consulting Psychologists Press, Palo Alto, Calif.
Ortony, A., Clore, G. L., and Collins, A. (1988): The Cognitive Structure of Emotions. Cambridge University Press, Cambridge.
Osgood, C. E., Suci, G. J., and Tannenbaum, P. H. (1967): The Measurement of Meaning. University of Illinois Press, Urbana.
Picard, R. W. (1995): Affective Computing. Perceptual Computing Section Technical Report 321, MIT Media Lab, Cambridge.
Picard, R. W. (1997): Affective Computing. MIT Press, Cambridge.
Reeves, B., and Nass, C. (1995): The Media Equation. CSLI Publications and Cambridge University Press, New York.
Roy, D., and Pentland, A. (1996): Automatic spoken affect analysis and classification.
Scherer, K. R. (1984): Emotion as a Multicomponent Process: A Model and Some Cross-Cultural Data. Rev. Pers. Soc. Psychol. (5): 37–63.
Sloman, A. (1992): Prolegomena to a Theory of Communication and Affect. In A. Ortony and J. Slack, eds., AI and Cognitive Science Perspectives on Communication. Springer, Berlin, Heidelberg, New York.
Tartter, V. C. (1980): Happy Talk: Perceptual and Acoustic Effects of Smiling on Speech. Percep. Psychophys. 27: 24–27.
Trower, T. (1997): Microsoft Agent. Microsoft Corporation, Redmond, Wash. On-line. Available: <http://www.microsoft.com/msagent/>. (Availability last checked 5 Nov 2002)
Wiggins, J. S. (1979): A psychological taxonomy of trait-descriptive terms: The interpersonal domain. J. Personality Social Psychol. 37: 395–412.
12 Creating Emotional Relationships with Virtual Characters Andrew Stern
During the workshop on Emotions in Humans and Artifacts, participants presented research addressing such questions as How do emotions work in the human mind? What are emotions? How would you build a computer program with emotions? Can we detect the emotional state of computer users? Yet as the discussions wore on, instead of providing some answers to these questions, they persistently raised new ones. When can we say a computer ‘‘has’’ emotions? What does it mean for a computer program to be ‘‘believable?’’ Do computers need emotions to be ‘‘intelligent,’’ or not? These scientific, engineering, and philosophical questions are fascinating and important research directions; however, as an artist with a background in computer science and filmmaking, I found my own perspective on emotions in humans and artifacts generally left out of the discussion. I find the burning question to be, What will humans actually do with artifacts that (at least seem to) have emotions? As a designer and engineer of some of the first fully realized, ‘‘believable’’ interactive virtual characters to reach a worldwide mass audience, Virtual Petz and Babyz (PF.Magic/ Mindscape 1995–99), I know this is a question no longer restricted to the domain of science fiction. Today, millions of people are already encountering and having to assimilate into their lives new interactive emotional ‘‘artifacts.’’ Emotions have been a salient feature of certain man-made artifacts—namely, stories and art—long before the scientific study of emotion began. Looking into the past, we see a tradition of humans creating objects in one form or another that display and communicate emotional content. From figurative painting and sculpture to hand puppets and animated characters in Disney films such as Snow White, man-made artifacts have induced emotional reactions in people as powerful and as meaningful as those generated between people themselves. Now we are at a point in the history of story and art where humans can create interactive artifacts, using the computer as a
new medium. We can now write software programs that can "listen" to a human, process what they just "heard," and "speak" back with synthetically generated imagery and sound, on machines that are already in the households of millions of families around the world. The new urgent challenge for artists and storytellers is to create interactive artifacts that can induce the same emotional reactions and communicate emotional content as traditional noninteractive stories and art have done. In fact, because the computer theoretically can be programmed to tailor the experience to an individual, it could become the most powerful medium of all for creating affective stories and art.

The question becomes, How do you do that? How do emotionally powerful stories and art "work" anyway? Alas, creating an artifact that produces a meaningful emotional reaction in a human is considered an "art" itself. Although techniques and advice on the artistic process have been published, in works such as The Art of Dramatic Writing by playwright Lajos Egri (1946), The Illusion of Life by Disney animators Thomas and Johnston (1981), and Letters to a Young Poet by poet Rainer Rilke (1934), the act of creating emotionally powerful artifacts is by and large considered elusive, mysterious, and unquantifiable. Even in art school the typical approach is to teach students to imitate ("master") traditional styles and techniques, after which it is hoped the students will be ready to "find their own style," which sometimes never happens. Naturally it is difficult to discuss the art of creating emotionally powerful artifacts in the context of a scientific workshop, in the way one would approach a computer science or engineering problem, which helps explain the general reluctance to research the topic, and the not-so-uncommon attitude among artificial intelligence researchers that the topic is "mushy," ill-formed, or worst of all, unimportant. Art and entertainment are considered fun, not serious pursuits. This view is shortsighted. On the contrary, stories and art are among the most serious and meaningful pursuits we have. We communicate ideas and experiences to each other in this way. The fact that the problem is, to a degree, mushy and unquantifiable makes it all the more challenging and difficult to undertake.

This chapter puts forth virtual characters as an emerging form of man-made artifact with emotional content. Virtual characters go by a few other names in the AI research community, such as believable agents, synthetic actors, and synthetic personalities. These are embodied autonomous agents that a user can interact with in some
fashion, animated as real-time graphics or built as physical robots, which appear to have personality, emotion, and motivation, designed to be used in art or entertainment. Over the past decade, several media labs have been exploring the issues involved in building virtual characters (such as Bates 1992; Blumberg 1997; Perlin 1995; Hayes-Roth et al. 1996; Goldberg 1997; Elliott et al. 1998). Some groups have designed architectures and implemented prototypes that have been demonstrated at academic conferences. But it should be made clear that in the business of creating emotional artifacts, in the final analysis, prototypes and demos are not enough. The point of creating emotionally powerful experiences, whether interactive or not, is to induce a reaction in an audience, in ‘‘users.’’ These creations must be experienced by the general public to serve the purpose for which they were created in the first place. Until this happens, the work created in closed-door media labs is ultimately incomplete. The public has been consuming interactive entertainment for two decades now, in the form of software products from the video game and computer game industry. Unfortunately, the experiences offered in these games are mostly juvenile, primarily focused on fighting, shooting, racing, and puzzle-solving, and the virtual characters offered in them are most often shallow, one-dimensional cardboard cutouts. Such games can be emotionally powerful experiences for those who play them, but they do not appeal to the majority of the population. Very few successful pieces of interactive entertainment or art have been made with emotional content for a mass audience—that is, the kind of ‘‘personal relationship’’ stories that books, theater, television, and movies offer, or the kind of ‘‘high art’’ exhibited at museums and art shows (Stern 1999a; Mateas 1999). This chapter suggests new ways to employ virtual characters to create emotionally powerful interactive experiences, using our Virtual Petz and Babyz projects as case studies. The techniques used to create these projects will be presented and discussed, with emphasis on the importance of design. Finally, we will attempt to address the question of what it could mean for a human to have an emotional relationship with a virtual character.
12.1 Do You Feel It? The Case for Emotional Relationships
Animation and artificial intelligence technologies for creating realtime interactive virtual characters are currently being researched
and developed in academic labs and industry companies. We are told that soon we will have virtual humans that look photorealistic, with behavior driven by some degree of AI. Many are working with the intention that these characters will become functional agents, at our command to perform a variety of complicated and menial tasks. They will learn our likes and dislikes and be able to autonomously communicate and negotiate with others. And they will become teachers in the virtual classroom, always ready and willing to answer our questions.

At first glance, it seems natural that adding "emotions" to these virtual characters should greatly enhance them. After all, real people have emotions, so virtual human characters should have them too. Some researchers in neuroscience and psychology point to emotion as an important factor in problem-solving capabilities and intelligence in general (Damasio 1994). As Marvin Minsky put it, "the question is not whether intelligent machines can have emotions, but whether machines can be intelligent without any emotions" (Minsky 1985). Virtual characters may very well need emotions to have the intelligence to be useful. But when thinking in terms of a virtual character actually interacting with a user, is emotional behavior really appropriate for these types of applications? In real life it is arguable that interactions with "functional agents" (e.g., waiters, butlers, secretaries, travel agents, librarians, salespeople) are often best when emotions are not involved. Emotional reactions can often be irrational, illogical, and time-consuming, which work against the efficient performance of tasks. Of course, in any transaction, politeness and courtesy are always appreciated, but they hardly qualify as emotional. We expect teachers to be a bit more personable and enthusiastic about their material than a travel agent, but do we want them to get angry or depressed at us?

Although emotions may be required for intelligence, I would argue that the most compelling interactions with virtual characters will not be in the area of functional agents. If a user encounters a virtual character that seems to be truly alive and have emotions, the user may instead want to befriend the character, not control them. Users and interactive virtual characters have the potential to form emotional relationships with each other: relationships that are more than a reader's or moviegoer's affinity for a fictional character in a traditional story, and perhaps as meaningful as a friendship between real people. By an emotional relationship, we mean a
set of long-term interactions wherein the two parties pay attention to the emotional state of the other, communicate their feelings, share a trust, feel empathetic, and establish a connection, a bond.
Virtual Friends
The recent success of several "virtual pet" products, popular among both kids and adults, offers some support for this idea. The most sophisticated of these characters are animated on the computer screen, such as Dogz and Catz (PF.Magic/Mindscape 1995–99) and Creatures (Grand, Cliff, and Malhotra 1997), but some are displayed on portable LCD keychain toys or even embodied as simple physical robots, such as Tamagotchi (Bandai 1996), Furby (Tiger Electronics 1998), and Aibo (Sony 1999). Users "nurture" and "play" with these pets, feeding them virtual food, petting them with a virtual hand, and generally giving them attention and care lest they run away or die. Although there can be some blurring into the domain of video games, in their purest form, virtual pets are not a game, because they are nongoal oriented; it is the process of having a relationship with a virtual pet that is enjoyable to the user, with no end goal of winning to aim for. As of this writing, there have been no completed formal studies of virtual pets; a study is currently underway by Turkle (1999).

In our experience with the Dogz and Catz products, our anecdotal evidence suggests that depending upon the sophistication of the virtual character, the emotional relationship that a user can form with it ranges anywhere from the attachment one has to a favorite plant to the bond between a master and their dog. Children are the most willing to suspend their disbelief and can become very attached to their virtual pets, playing with and feeding them every day. It is precisely those irrational, illogical, and time-consuming emotional interactions that may hamper a functional agent that are so engaging and entertaining here. (Please refer to the appendix of this chapter to read real customer letters we have received about Petz.) Only a few Petz users, mostly technology-oriented adult men, have requested that their Petz be able to perform functional tasks such as fetching e-mail.

What is most interesting about the phenomenon of virtual pets is not the toys and software themselves (some of which have minimal interactivity and little or no artificial intelligence driving them) but the fact that some people seem to want to form
emotional relationships with them. Some appear quite eager to forget that these characters are artificial and are ready and willing to engage in emotional relationships, even when some of the virtual pets offer little or no reward or "warmth" in return. This offers some promise for the public's acceptance of the concept of a more advanced virtual friend. As commercially successful as these virtual pets are, it seems likely that emotional relationships at the level of favorite plants or pets will be far easier to accomplish than the level of friendship between two adults. An owner-to-pet relationship dynamic is much simpler than a person-to-person one, and much less communication is required between the two parties. Most important, the relationship is inherently unequal. An owner of a real pet chooses (even purchases!) their real-life cat or dog. Therefore, the act of buying a virtual pet as a toy or piece of software does not violate the hierarchy of a real-world owner-to-pet relationship. As people, we do not get to choose which other people will be friends with us. Friends, by definition, choose to be friends with one another. Therefore, even if we create an interactive virtual character that can perform all the behaviors required for an emotional relationship between human adults, as a man-made artifact that can be bought and sold, could a "true" friendship be formed at all? This is an open question that invites exploration.
Interactive Stories
Stories have long been our primary way to observe and understand our emotional relationships. If stories could be made interactive— where users could immerse themselves in virtual worlds with characters they could talk to, form relationships with, touch and be touched by, and together alter the course of events, literally creating a new story in real time—then we would have a new form of interactive entertainment that eclipses video games. Like traditional stories from books, theater, television, and movies, an interactive story would be affecting and meaningful, but made all the more personal because the user helped shape it and create it (Stern 1998). Virtual characters programmed to simulate the dynamics of emotional relationships could be used as starting points for creating interactive stories. In her book, Hamlet on the Holodeck: The Future of Narrative in Cyberspace, Janet Murray (1997) suggests
that interactive virtual characters ‘‘may mark the beginning of a new narrative format.’’ As a first step in this direction, instead of relying heavily on planning to generate story plots, as some previous story researchers have done (such as Meehan 1976; Pemberton 1989; Turner 1994), a developing and ongoing emotional relationship itself could serve as a narrative. For example, a user and a virtual character could meet and get to know each other, begin to develop trust for one another, and perhaps (accidentally or not) violate that trust in some way, causing the relationship to take a downturn. The relationship could progress from there in many ways, perhaps recovering, ending, cycling between highs and lows—much like real-life relationships. There are several traditional stories that follow this pattern, such as boy meets girl, boy and girl fall in love, boy loses girl, and so on.
Interactive Art
Autonomous interactive virtual characters are only just beginning to make their way into installation and performance art. The goal of Simon Penny's 1995 "Petit Mal" was, according to the artist, "to produce a robotic artwork which is truly autonomous; which was nimble and had 'charm,'" to give "the impression of being sentient" (Penny 1997). Petit Mal was a tall, thin robot on bicycle wheels, which could quietly and gently move about in a room, able to sense when it was approaching walls or people. Celebrated video artists Lynn Hershman and Bill Viola have begun experimenting with combining video imagery of people and some simple interactivity. Mark Boehlen and Michael Mateas (1998) recently exhibited "Office Plant #1," a robot plant that will bloom, wither, and make sounds in response to the mood of your office environment. Other efforts include the large, destructive autonomous robots from the industrial performance art of Survival Research Labs, the RoboWoggles (Wurst and McCartney 1996), and the robot installations of Alan Rath. The potential for exploration and discovery in this area seems untapped and wide open.
12.2 I Need Your Love: Virtual Dogs, Cats, and Babies
Recognizing a dearth of consumer software with characters that displayed emotions or personality, the startup company PF.Magic was formed in 1992 with the mission to "bring life to entertainment."
We wanted to break out of the mold of traditional video games (e.g., flight simulators, sports games, running-jumping-climbing games, shooters, puzzle games) to create playful new interactive experiences with emotional, personality-rich characters.
Dogz and Catz
By 1995, the personal computer was powerful enough to support the real-time animation we felt was required for a convincing virtual character. The first Dogz program, originally conceived by company cofounder Rob Fulop and created by Adam Frank and Ben Resner, was a simple idea: an animated virtual dog that you could feed and play with. As a product, it was very risky; an emotional relationship as the sole basis of an interactive experience had never been done before. It was unknown at the time if anyone would pay money to interact with a virtual character in this way. The program quickly generated interest from a wide range of customers—male and female, kids and adults—which is typically unheard of in entertainment software. We followed up Dogz with a companion product, Catz, establishing the explicit design goal to create the strongest interactive illusion of life we could on a PC. To imbue the Petz characters with personality and emotion we began cherry-picking techniques from computer animation and artificial intelligence, to construct what eventually became a powerful realtime animation engine tightly integrated with a goal-based behavior architecture. The Virtual Petz characters are socially intelligent autonomous agents with real-time 3-D animation and sound. By using a mouse, the user moves a hand-shaped cursor to directly touch, pet, and pick up the characters, as well as use toys and objects. Petz grow up over time on the user’s computer desktop and strive to be the user’s friends and companions. The interaction experience is nongoal oriented; users are allowed to explore the characters and their toys in any order they like within an unstructured yet active play environment. This freedom allows users to socialize with the Petz in their own way and at their own pace. This also encourages users to come up with their own interpretation of their pet’s feelings and thoughts. To date, the Virtual Petz products (figure 12.1) have sold more than two million copies worldwide.
Figure 12.1 Virtual Petz.
The goal of the Petz characters is to build an emotional relationship with the user. Their behaviors are centered around receiving attention and affection. They feed off of this interaction. Without it they become lethargic, depressed, and if ignored for long enough, they will run away. The most direct way the user can show affection to the Petz is through petting. By holding down the left mouse button, users can pet, scratch, and stroke with a hand cursor; the Petz immediately react in a variety of ways depending on what spot on their body is being petted, how fast, and how they feel at the time. Users can also pick up the characters with the right mouse button and carry them around the environment. We found that being able to (virtually) touch and hold the characters was a very effective way of building emotional relationships and creating the illusion of life. The Petz have equal footing in their relationship with the user. The toys and objects in their environment have direct objectlike interaction for both the user and the characters. Petz have full access to the toy shelf, and if they really want something, they have the freedom to get it themselves. This helps express the unpredictability and autonomous nature of the Petz. It also requires users to share control of the environment with them. For example, by picking up and using a toy, the user can initiate play. Throwing a ball may initiate a game of fetch, or holding a tugtoy in
front of a pet may begin a game of tug-of-war. Similarly, a pet can get its own toy and bring it to the user to initiate play. The act of sharing control of the environment and cooperative decision making helps further strengthen the relationship. We have created a variety of personalities—playful terriers, grumpy bulldogs, hyper Chihuahuas, lazy Persian cats, aggressive hunter cats, timid scaredy cats, and so on. Each individual character has its own likes and dislikes, spots and body coloration, and personality quirks. Users get to play with individual Petz to see if they like them before deciding to adopt. Once adopted, the user gives them a name. This individual variation allows the user to develop a unique relationship with a particular character. Each owner-pet relationship has the potential to be different.
Babyz
Our newest virtual characters, Babyz, released in October 1999 (figure 12.2), have the same nongoal-oriented play and direct interaction interface as the Petz. The user adopts one or more cute, playful babies that live in a virtual house on the computer. The Babyz have a similar cartoony style as the original Petz, but have more sophisticated, emotive facial expressions, and some simple natural language capability. Babyz vary in size, shape, and personality, and in some ways appear to be smarter and more clever than real one-year-old babies would be. They want to be fed, clothed, held, and nurtured, but are also quite playful and mischievous. Users can think of themselves as their parent or babysitter, whichever they feel most comfortable with. The user can nurture a Babyz character by holding and rocking it, tickling it, feeding it milk and baby food, putting on a fresh
Figure 12.2 Babyz.
diaper, giving it a bubble bath, soothing it if it gets upset, giving it medicine if it gets sick, laying it down to sleep in a crib, and so on. Play activities include playing with a ball, blocks, baby toys, music, dancing and singing, and dress-up. Through voice recognition, Babyz can understand and respond to some basic spoken words (such as ‘‘mommy,’’ ‘‘baby,’’ ‘‘yes,’’ ‘‘no,’’ ‘‘stop that’’), and can be read simple picture books. The Babyz characters develop over time, appearing to learn how to use toys and objects, learning to walk, and to speak a baby-talk language. In this version of the product, they will always be babies—never progressing beyond a stumble walk and simple baby talk. (All behaviors are preauthored, with the user’s interaction unlocking them over time, to create the illusion that the Babyz are learning.) If the user has more than one baby adopted, they can interact and form relationships with one another. Babyz can be friends and play nicely together or engage in long-term sibling rivalries. The program becomes an especially entertaining and chaotic experience with three active Babyz all getting into mischief at the same time (Stern 1999b).
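Since the Babyz developmental arc is preauthored rather than learned, the apparent "learning" described above can be thought of as content gated behind interaction milestones. The sketch below illustrates that idea only; the class, milestone names, and thresholds are invented for this example and are not taken from the actual Babyz code.

```python
# Illustrative sketch of interaction-gated "development" (hypothetical names and
# thresholds; not the actual Babyz implementation): behaviors are preauthored and
# simply unlocked once the user has interacted enough, creating the appearance of learning.

class DevelopmentTracker:
    # hypothetical milestones: interactions required before the preauthored behavior unlocks
    MILESTONES = {"stumble_walk": 30, "baby_talk": 50, "use_spoon": 80}

    def __init__(self):
        self.counts = {name: 0 for name in self.MILESTONES}
        self.unlocked = set()

    def record_interaction(self, milestone):
        """Count a user interaction that nudges the baby toward a milestone."""
        if milestone in self.MILESTONES and milestone not in self.unlocked:
            self.counts[milestone] += 1
            if self.counts[milestone] >= self.MILESTONES[milestone]:
                self.unlocked.add(milestone)    # the preauthored behavior becomes available

    def available_behaviors(self, behaviors):
        """behaviors: dict mapping behavior name -> required milestone (or None)."""
        return [name for name, needed in behaviors.items()
                if needed is None or needed in self.unlocked]

tracker = DevelopmentTracker()
for _ in range(30):
    tracker.record_interaction("stumble_walk")      # e.g., the user helps the baby stand
print(tracker.available_behaviors({"crawl": None, "toddle": "stumble_walk", "babble": "baby_talk"}))
# ['crawl', 'toddle']
```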
Behaviors to Support Emotional Relationships
To allow for the formation of emotional relationships between the user and the Petz and Babyz characters, we built a broad base of interactive behaviors. These behaviors offer the characters and the user the means of communicating emotion (or the appearance of emotion) to each other. This section will detail these interactions and behaviors, specifying in each the emotions we intended to be perceived by the user. Our hope is that by having the characters express emotion in a convincing and lifelike way, the user will instinctively feel empathetic; and at the same time, if given the means to express their own emotions in return, users will feel like they are connecting to the characters on an emotional level.
AFFECTION
Users can express affection to the characters by touching them and holding them with their mouse-controlled hand cursor. For Petz, the touching is petting; for Babyz, it is tickling. Petz express affection to the user by sweetly barking, meowing, or purring; licking and nuzzling the hand cursor; and bringing the user a toy. Babyz will smile, giggle, and laugh, coo, act cute, and say ''mama'' or ''dada'' in a loving voice. When perceiving a lack of affection, Petz will howl and yowl, sounding lonely; Babyz will cry and say ''mama'' or ''dada'' in a sad tone of voice. The intent is for the user to perceive the feelings of love, warmth, happiness, and loneliness in the characters.
NURTURING
Users can feed, clothe, and give medicine to the characters. Petz express the need to be nurtured by acting excited when food is brought out; begging, acting satisfied and grateful after eating; or disgusted when they don't like the food. Babyz may hold out their arms and ask for food, whine and cry if hungry or need a diaper change, and may throw and spit up food they don't like. Users are meant to perceive feelings of craving, satisfaction, pleasure, gratefulness, dislike, and disgust.
PLAY
By picking up a toy, users can initiate play with one or more of the characters, such as a game of fetch or building blocks. Petz or Babyz may join the user's invitation to play, or get a toy of their own and begin playing by themselves or each other, waiting for the user to join them. A character may react if the user is ignoring them and instead playing with another character. Emotions intended to be perceived by the user include excitement, boredom, aggressiveness, timidity, laziness, and jealousy.
TRAINING
Users can give positive and negative reinforcement in the form of food treats, water squirts (for Petz), and verbal praise or discipline to teach the characters to do certain behaviors more or less often. Petz or Babyz are programmed to occasionally act naughty, to encourage users to train them. During these behaviors, users are meant to perceive the emotions of feeling rewarded, punished, pride, shame, guilt, and anger.
Effective Expression of Emotion
None of the aforementioned behaviors would seem believable to the user unless the characters effectively expressed convincing emotions. We found all of the following techniques to be critical for successful real-time emotion expression in virtual characters.
EMOTION EXPRESSION IN PARALLEL WITH ACTION
During any body action (such as walking, sitting, using objects, etc.) Petz and Babyz characters can display any facial expression or emotive body posture, purr or cry, make any vocalization, or say any word in any of several emotional tones. This allows a baby character to look sad and say what it wants while it crawls toward the user. Catz can lick their chops and narrow their eyes as they stalk a mouse. Characters can immediately sound joyful when the user tickles their toes. We found if a virtual character cannot immediately show an emotional reaction, it will not seem believable. Timing is very important.
EMOTION EXPRESSION AT REGULAR INTERVALS
Programming the characters to regularly pause during the execution of a behavior to express their current mood was a very effective technique. For example, while upset and crawling for a toy that it wants, Babyz may stop in place and throw a short tantrum. Or just before running after a ball in a game of fetch, Dogz may leap in the air with joy, barking ecstatically. These serve no functional purpose (in fact they slow down the execution of a behavior), but they contribute enormously to the communication of the emotional state of the character. Additionally, related to Phoebe Sengers's concept of behavior transitions (1998), when Petz or Babyz finish one behavior and are about to begin a new one, they pause for a moment, appearing to ''stop and think'' about it, look around, and express their current mood with a happy bark or timid cower.
EMOTION EXPRESSION THROUGH CUSTOMIZATION OF BEHAVIOR
Some behaviors have alternate ways to execute, depending on the emotional state of the character. Mood may influence a character's style of locomotion, such as trotting proudly, galloping madly in fear, or stalking menacingly. A hungry character may choose to beg for food if lazy, cry and whine for food if upset, whimper if afraid, explore and search for food if confident, or attack anything it sees if angry. The greater the number of alternate ways a character has to perform a particular behavior, the stronger and deeper the perceived illusion of life.
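The three techniques above share a simple pattern: the behavior layer periodically yields to a short, purely expressive action, and the same behavior is performed in different styles depending on mood. Here is a minimal sketch of that pattern, with invented mood names, styles, and interval; it is an illustration, not the Petz/Babyz engine.

```python
# A minimal sketch of the pattern described above (invented mood names, styles, and
# interval; not the Petz/Babyz engine): a behavior pauses at regular intervals to
# express the current mood, and the style of the behavior itself depends on that mood.

LOCOMOTION_STYLE = {
    "proud": "trot",      # alternate performances of the same "move" behavior
    "afraid": "gallop",
    "sneaky": "stalk",
}

def perform_behavior(steps, mood, expression_interval=3):
    """Run a behavior as a list of steps, inserting mood expressions at regular intervals."""
    style = LOCOMOTION_STYLE.get(mood, "walk")
    for i, step in enumerate(steps):
        if i > 0 and i % expression_interval == 0:
            # serves no functional purpose, but communicates the emotional state
            print(f"[expression] pauses and acts {mood}")
        print(f"[action] {step}, moving at a {style}")

perform_behavior(["approach ball", "sniff ball", "grab ball", "carry ball to user", "drop ball"],
                 mood="proud")
```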
PRIORITIZATION OF EMOTION EXPRESSION, AND AVOIDANCE OF DITHERING
It is possible for a character to have multiple competing emotions, such as extreme fear of danger simultaneous with extreme craving for food. We found it to be most believable if ''fear'' has the highest priority of all emotion expression, followed by ''craving'' for food and then extreme ''fatigue.'' All other emotions such as ''happiness'' or ''sadness'' are secondary to these three extreme emotional states. It is also important that characters do not flop back and forth between conflicting emotions, else their behaviors appear incoherent.
THEATRICAL TECHNIQUES
Our characters are programmed to obey several important theatrical techniques, such as facing outward as much as possible, looking directly outward into the eyes of the user, and carefully positioning themselves relative to each other (''stage blocking''). If the user places a character offscreen, behind an object, or at an odd angle, the characters quickly get to a visible position and turn to face the user. If two characters plan to interact with one another, such as licking each other's noses or giving each other an object, they try to do this from a side view, so the user can see as much of the action and emotion expression as possible in both characters. Similar techniques were identified in the context of virtual characters in Goldberg (1997).
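One way to read the prioritization and anti-dithering rules described above is as a fixed priority ordering over a few "extreme" emotional states, plus hysteresis so that the expressed emotion does not flip back and forth between near ties. The following sketch is an interpretation under that reading, with invented priority values and thresholds rather than the shipped implementation.

```python
# Interpretation of the prioritization / anti-dithering rules described above
# (invented priority values, margin, and names; not the shipped Petz/Babyz code).

PRIORITY = {"fear": 3, "craving": 2, "fatigue": 1}   # all other emotions default to 0

def choose_expressed_emotion(active, current=None, margin=0.25):
    """
    active:  dict mapping emotion name -> intensity in [0, 1]
    current: the emotion currently being expressed; it is kept unless clearly
             beaten, which avoids dithering between near-equal emotions
    """
    if not active:
        return None
    best = max(active, key=lambda e: (PRIORITY.get(e, 0), active[e]))
    if current in active and best != current:
        same_class = PRIORITY.get(best, 0) == PRIORITY.get(current, 0)
        if same_class and active[best] < active[current] + margin:
            return current            # hysteresis: challenger not clearly stronger, keep current
    return best

# A frightened, hungry character expresses fear first, and small fluctuations between
# happiness and sadness do not cause flip-flopping:
print(choose_expressed_emotion({"craving": 0.9, "fear": 0.6, "happiness": 1.0}))            # fear
print(choose_expressed_emotion({"happiness": 0.55, "sadness": 0.6}, current="happiness"))   # happiness
```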
Animation and Behavior Architecture
Animation and behavior are tightly integrated in the Petz and Babyz architecture. An attempt was made during software development to construct a clean modular code structure; however, both time constraints and practical considerations forced us to at times adopt a more ad hoc, hackish approach to implementation. The lowest level in the architecture is an animation script layer where frames of real-time rendered 3-D animation are sequenced and marked with timing for sound effects and action cues, such as when an object can be grabbed during a grasping motion. Above this is a finite-state machine that sequences and plays the animation scripts to perform generic but complicated low-level behaviors such as locomotion, picking up objects, expressing emotions, and so on. The next level up is a goal-and-plan layer that controls the
finite-state machine, containing high-level behaviors such as ''eat,'' ''hide,'' or ''play with object.'' Goals are typically spawned as reactions to user interaction, to other events in the environment, or to the character's own internal metabolism. Goals can also be spawned deliberately as a need to regularly express the character's particular personality or current mood. At any decision point, each goal's filter function is queried to compute how important it is for that goal to execute under the current circumstances. Filter functions are custom code in which the programmer can specify when a goal should execute. Part of the craft of authoring behaviors is balancing the output of these filter functions; it is easy to accidentally code a behavior to happen far too often or too seldom for believability. Alongside instantiated goals are instantiated emotion code objects such as ''happy,'' ''sad,'' and ''angry.'' Emotions have filter functions much like goals, but can also be spawned by the custom logic in states or goals. These emotion code objects themselves can in turn spawn new goals, or set values in the character's metabolism. For example, the filter function of a pet's ''observe'' goal may be activated in reaction to the user petting another pet. The ''observe'' goal is written to spawn a ''jealousy'' emotion if the other pet is a rival and not a friend. The ''jealousy'' emotion may in turn spawn a ''wrestle'' goal; any fighting that ensues could then spawn additional emotions, which may spawn additional goals, and so on. Goals are constantly monitoring what emotion objects are currently in existence to help decide which plans to choose and how they should be performed; states monitor emotions to determine which animations, facial expressions, and types of sound to use at any given moment. Note that by no means did we implement a ''complete'' model of emotion. Instead, we coded only what was needed for these particular characters. For example, the Babyz have no ''fear'' emotion, because acting scared was not necessary (or considered entertaining) for the baby characters we were making. Fear was necessary, however, for the Petz personalities such as the scaredy cat. The emotion lists for Petz and Babyz varied slightly; it would have been inefficient to try to have both use the same exact model.
At the highest level in the architecture is the ''free will'' and narrative intelligence layer. This is custom logic that can spontaneously (using constrained randomness) spawn new goals and emotions to convey the illusion that the character has intent of its
own. This code is also keeping track of what goals have occurred over time, making sure that entertaining behaviors are happening regularly. It keeps track of long-term narratives such as learning to walk, sibling rivalries, and mating courtship.
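Read as code, the goal-and-emotion layer described above amounts to a set of objects that each score their own relevance through a filter function every tick, with active emotions able to inject new goals (as in the jealousy example). The sketch below is a loose paraphrase of that structure; the class names, scores, and wiring are invented for illustration and do not reproduce the production architecture.

```python
# Loose paraphrase of the goal/emotion layer described above (invented classes and
# numbers; not the production Petz/Babyz architecture). Each goal and emotion exposes
# a filter function scoring its relevance; active emotions can spawn further goals.

class Goal:
    def __init__(self, name, filter_fn, plan):
        self.name, self.filter_fn, self.plan = name, filter_fn, plan

class Emotion:
    def __init__(self, name, filter_fn, spawns=None):
        self.name, self.filter_fn, self.spawns = name, filter_fn, spawns or []

def tick(world, goals, emotions):
    # 1. Let each emotion's filter function appraise the world; active emotions
    #    may spawn further goals (a fuller model would also adjust metabolism).
    active_emotions = [e for e in emotions if e.filter_fn(world) > 0.5]
    for e in active_emotions:
        for g in e.spawns:
            if g not in goals:
                goals.append(g)
    # 2. Query every goal's filter function and run the plan of the most relevant goal.
    score, winner = max(((g.filter_fn(world, active_emotions), g) for g in goals),
                        key=lambda pair: pair[0])
    if score > 0:
        winner.plan(world, active_emotions)

# Example wiring, echoing the jealousy chain in the text (values are arbitrary):
wrestle = Goal("wrestle",
               lambda w, em: 0.8 if any(e.name == "jealousy" for e in em) else 0.0,
               lambda w, em: print("wrestles the rival"))
jealousy = Emotion("jealousy",
                   lambda w: 1.0 if w.get("rival_being_petted") else 0.0,
                   spawns=[wrestle])
observe = Goal("observe",
               lambda w, em: 0.6 if w.get("rival_being_petted") else 0.1,
               lambda w, em: print("watches the user pet the rival"))

tick({"rival_being_petted": True}, goals=[observe], emotions=[jealousy])
```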
Discussion
Petta: How much context do the behaviors of the babies have? Is it just isolated behaviors that the babies display—now this kind of behavior is selected, and the baby will be able to perform that, and then it will switch to just a totally different behavior? Or is there some kind of coverage of context? How much history does each single baby ‘‘remember?’’ Stern: Each behavior is always suggesting to the action selector how important it is for this behavior to happen. And so, the more hungry you get, the higher the chance to eat gets. Petta: What I was trying to get at was the action expression problem. Is there any specific thing—sort of that babies try to display the reason why they behave as they are behaving now and sort of getting towards the direction of trying to report a story of what happened so far and the reason why? Stern: I have only talked about the general goals. But a goal also has a plan. This does not really map perfectly to the traditional definition of a goal, perhaps. A goal has a plan, a goal has multiple plans. And how the plans are authored, exactly which animations are chosen and why, can give you some idea of what is the motivation behind it. Petta: Ok, but still: Each plan is really self-contained and static, and it is not modified according to the previous history. Stern: No, it does keep track of the history. Sloman: For how long? I mean, is it just a lot of five time-slices or something? Or is it actually developing over time a rich memory of behavior? Stern: Yes, it is keeping a simple memory basically of what happened. It has what we call an association matrix, where it associates successes and failures, or rewards and punishments with the behaviors, objects, and characters involved at that moment. Sloman: Could it not actually use that to form new generalizations? One of the rules it might have as its condition could be: if you have
ever been stroked, then do this—or something like that. Or maybe there would be a recency effect? Stern: There are algorithms written to decide whether a goal should happen or not. But over time, as different associations are made, the likelihood that a goal can happen can change. Riecken: You have microworlds, and so there is a little set of operations over a little set of domain objects, and you can then provide a mapping, so that there is a reinforcement, positive or negative. You won’t touch that object again, because you are reinforced inside that microworld. If the user of this product does not provide any reinforcement—positive or negative—while the entity is working with it or playing with it or whatever, does the baby or the dog formulate an opinion of the object, and how? Stern: Yes. At first, a character doesn’t have any association to any objects. As they encounter an object, they start with a default, neutral association. As the character interacts with the toy, the toy has its own smartness to reward or discipline. For example, it could make a loud noise, which a character can interpret as a good or bad thing, depending on the character’s personality, and other factors. The strength of the association builds up over time, the more negative a reaction gets.
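The ''association matrix'' mentioned in this exchange can be pictured as a table of signed values, starting at a neutral default and nudged by each rewarding or punishing moment toward like or dislike for everything involved in that moment. The sketch below is one plausible reading, with a made-up learning rate and clamping; it is not the actual product code.

```python
from collections import defaultdict

# One plausible reading of the "association matrix" described in the discussion
# (made-up learning rate and clamping; not the actual product code): every object,
# character, and behavior involved in a rewarded or punished moment has its
# association nudged toward like or dislike, starting from a neutral default.

class AssociationMatrix:
    def __init__(self, learning_rate=0.2):
        self.values = defaultdict(float)   # entity -> association in [-1, 1], default neutral
        self.lr = learning_rate

    def reinforce(self, entities, reward):
        """reward > 0 for praise or treats, < 0 for discipline or a scary event."""
        for entity in entities:
            v = self.values[entity] + self.lr * reward
            self.values[entity] = max(-1.0, min(1.0, v))

    def attitude(self, entity):
        return self.values[entity]         # consulted by the goal filter functions

m = AssociationMatrix()
m.reinforce(["squeaky_toy", "user"], reward=+1.0)   # praised while playing with the toy
m.reinforce(["squeaky_toy"], reward=-1.0)           # the toy made a scary loud noise
print(round(m.attitude("squeaky_toy"), 2), round(m.attitude("user"), 2))  # 0.0 0.2
```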
12.3 Feeling Holistic: The Importance of Design
In creating an interactive emotional artifact, even the best animation and artificial intelligence technology will be lost and ineffective without a solid design. In this section, we discuss the importance of the overall design of an interactive experience to ensure that a virtual character’s emotions are effective and powerful to the user.
Concept and Context
The type of characters you choose and the context you present them in will have a great impact on how engaging and emotionally powerful the interactive experience is. Judging by the confusing and poorly thought-out concepts in many pieces of interactive entertainment today, we feel this is a design principle too often ignored. In our products, we were careful to choose characters that people immediately recognize—dogs, cats, and babies—which
allow users to come to the experience already knowing what to do. They immediately understand that they need to nurture and play with the characters. Even though Petz and Babyz are presented in a cartoony style, we made sure to keep their behavior in a careful balance between cartooniness and realism. This was important to maintain believability and the illusion of life; if the Petz stood up on their hind legs and began speaking English to one another, users would not have been able to project their own feelings and experiences with real pets onto the characters. One of our maxims was ''if Lassie could do it, our Petz can do it.'' That is, the Petz can do a bit more than a real dog or cat would normally do, but nothing that seems physically or mentally impossible.
A very important design principle in Petz and Babyz for supporting emotional relationships is that users play themselves. Users have no embodied avatar that is supposed to represent them to the characters; the hand cursor is meant to be an extension of their real hand. The characters seem to ''know'' they are in the computer, and they look out at the user as if they actually see them. There is no additional level of abstraction here; you are you, and the characters are the characters. This is akin to a first-person versus third-person perspective. If the user had an avatar that they viewed from a third-person perspective, the other characters would be required to look at that avatar, not at the user directly, thereby weakening the impact of their emotional expression.
Direct, Simple User Interface
Petz and Babyz are almost completely devoid of the typical user interface trappings of most interactive entertainment products. To interact with the characters, users operate the hand cursor in a ‘‘natural,’’ direct way to touch and pick up characters and objects. No keyboard commands are required. All of the objects in the virtual world are designed to be intuitively easy to use; you can throw a ball, press keys on a toy piano, open cabinets, and so on.
Discussion
Bellman: I look at your demonstration, and I find it actually frustrating, because I am used to worlds where I am sharing the space
with the robots and with other people. So, your program seems very outside, and you can't reach in. And then the avatar that you have, this disembodied hand, is a very impoverished version of an avatar where you are really there in the space. And that feels frustrating. Stern: Well, let me speak to that one. That is very intentional that you have only a hand. I think one of the reasons this works for people, and that people can form relationships with the characters is because you are yourself, you don't take on an avatar. You are yourself. The characters that live in the computer know that they are in the computer, and they look out at you, and the hand is just an extension of your own hand. So, you are not role playing—not that avatars necessarily involve role playing, but typically, in a computer game, an avatar you control is a puppet or it's a character that you must take on. Bellman: Yes. But you are speaking only of a certain kind of use of avatars, which is the bird's eye view. In fact, in most avatar use, you are there in the scene, but you see your arm extending in front of you. Stern: Ok. I would call that an avatar. Bellman: That is an avatar? Ok. Because you have a representation in the space, and as you move, as you pick up things, as you look at things, you represent. Sloman: But these are just different forms of interaction. There is nothing right or wrong or good about either of them. Bellman: I was not trying to say it is wrong. What I want to say is that my personal reaction, after coming out of a world in which I am usually more embodied and I actually share with other people, was a sense of frustration.
Of course this simplicity limits the amount of expressivity offered to the user. We cannot make objects and behaviors that require more complicated operation, such as holding a baby and a milk bottle at the same time. While we could program some obscure arbitrary keyboard command sequence to accomplish this, we have chosen not to in order to keep the interface as pure and simple as possible. To allow the user more expressivity we would be required to add more intuitive interface channels, such as a data glove or voice recognition. In fact, instead of typing words to your Petz and Babyz (which you would never do in real life of course),
the latest versions of the products allow you to speak some basic words to the characters. In general, we feel that user interface, not animation or artificial intelligence technology, is the largest impediment for creating more advanced virtual characters. With only a mouse and keyboard, users are very constrained in their ability to naturally express their emotions to virtual characters. When interacting with characters that speak out loud, users should be able to speak back with their own voice, not with typing. Unfortunately, voice recognition is still a bleeding-edge technology. In the future, we look forward to new interface devices such as video cameras on computer monitors that will allow for facial and gesture recognition (Picard 1997).
Natural Expression
When trying to achieve believability, we found it effective for characters to express themselves in a natural way, through action and behavior, rather than through traditional computer interface methods such as sliders, number values, bar graphs, or text. In Petz and Babyz, the only way the user can understand what the characters seem to be feeling is to interpret their actions and physical cues, in the same way an audience interprets an actor’s performance. We do not display bar graphs or text messages describing the characters’ internal variables, biorhythms, or emotional state. By forcing a natural interpretation of their behavior, we do not break the illusion of a relationship with something alive.
Favor Interactivity and Generativity over a High Resolution Image
In Petz and Babyz, we made a trade-off to allow our characters to be immediately responsive, reactive, and able to generate a variety of expressions, at the expense of a higher resolution image. Surprisingly, most game developers do not make this trade-off! From a product marketing perspective, a beautiful still frame is typically considered more important than the depth and quality of the interactive experience. Of course, there is a minimum level of visual quality any professional project needs, but we feel most developers place far too much emphasis on flashy effects such as lighting, shading, and visual detail (i.e., spectacle) and not enough emphasis on interactivity and generativity.
Purity versus ‘‘Faking It’’: Take Advantage of the Eliza Effect
The ‘‘Eliza effect’’—the tendency for people to treat programs that respond to them as if they had more intelligence than they really do (Weizenbaum 1966) is one of the most powerful tools available to the creators of virtual characters. As much as it may aggravate the hard-core computer scientists, we should not be afraid to take advantage of this. ‘‘Truly alive’’ versus ‘‘the illusion of life’’ may ultimately be a meaningless distinction to the audience. Ninetynine percent of users probably will not care how virtual characters are cognitively modeled—they just want to be engaged by the experience, to be enriched and entertained.
12.4 Conclusion
This chapter has put forth virtual characters as a new form of emotional artifact, and the arrival of emotional relationships between humans and virtual characters as a new social phenomenon and direction for story and art. The design and implementation techniques we found useful to support such emotional relationships in the Virtual Petz and Babyz projects have been presented. We will conclude with some final thoughts on what it could mean for a person to have an emotional relationship with a virtual character.
Are relationships between people and virtual characters somehow wrong, perverse, even dangerous—or just silly? Again, we can look to the past to help us answer this. Audiences that read about or see emotional characters in traditional media—painting, sculpture, books, theater, television, and movies—have been known to become very ''attached'' to the characters. Even though the characters are not real, they can feel real to the audience. People will often cry when the characters suffer and feel joy when they triumph. When the written novel first appeared it was considered dangerous by some; today we find that this is not the case. However, television, a more seductive medium than the novel, has certainly captured free time in the lives of many people. Some consider the effect of television and video games on children's development to be a serious problem; the media has even reported a few outrageous stories of people going to death-defying lengths to take care of their virtual pet Tamagotchis. Designers should be aware that man-made characters have the potential to have a powerful effect on people.
Why create artificial pets and humans? Isn’t it enough to interact with real animals and people? From our perspective on making Virtual Petz, this was not the point. Our intent was not to replace people’s relationships with real living things, but to create characters in the tradition of stuffed animals and cartoons. And while some people are forming emotional relationships with today’s virtual characters, by and large they are still thought of as sophisticated software toys that try to get you to suspend your disbelief and pretend they are alive. However, as we move toward virtual human characters such as Babyz, the stakes get higher. As of this writing, we have not yet received feedback from the general public on their feelings and concerns about Babyz. Also, the characters made so far have been ‘‘wholesome’’ ones, such as dogs, cats, and babies, but one could easily imagine someone using these techniques to create characters that could support other types of emotional relationships, from the manipulative to the pornographic. Inevitably this will happen. Of course, the promise and danger of artificial characters has long been an area of exploration in literature and science fiction, ranging from friendly, sympathetic characters such as Pinocchio and R2D2 to more threatening ones such as Frankenstein and HAL9000. As virtual characters continue to get more lifelike, we hope users keep in mind that someone (human) created these virtual characters. Just as an audience can feel a connection with the writer, director, or actor behind a compelling character on the written page or the movie screen, a user could potentially feel an even stronger connection to the designer, animator, and programmer of an interactive virtual character. For the artist, the act of creating a virtual character requires a deep understanding of the processes at work in the character’s mind and body. This has always been true in traditional art forms, from painting and sculpting realistic people to novels to photography and cinema, but it is taken to a new level with interactive virtual characters. As the artist, you are not just creating an ‘‘instantiation’’ of a character—a particular moment or story in the character’s life—you are creating the algorithms to generate potentially endless moments and stories in that character’s life. People need emotional artifacts. When the public gets excited about buzzwords like ‘‘artificial intelligence’’ or ‘‘artificial life,’’ what they are really asking for are experiences where they can interact with something that seems alive, that has feelings, that
they can connect with. Virtual characters are a promising and powerful new form of emotional artifact that we are only just beginning to discover.
Acknowledgments
Virtual Petz and Babyz were made possible by a passionate team of designers, programmers, animators, artists, producers, and testers at PF.Magic/Mindscape that includes Adam Frank, Rob Fulop, Ben Resner, John Scull, Andre Burgoyne, Alan Harrington, Peter Kemmer, Jeremy Cantor, Jonathan Shambroom, Brooke Boynton, David Feldman, Richard Lachman, Jared Sorenson, John Rines, Andrew Webster, Jan Sleeper, Mike Filippoff, Neeraj Murarka, Bruce Sherrod, Darren Atherton, and many more. Thanks to Robert Trappl and Paolo Petta for organizing such a fascinating and informative workshop.
Appendix: Real Customer Letters
I had a dog that was a chawawa and his name was Ramboo. Well he got old and was very sick and suffering so my parents put him to sleep. Ever since then I have begged my parents for a new dog. I have wanted one soo bad. So I heard about this dogz on the computer. I bought it and LOVE it!!! I have adopted 9 dogs. Sounds a bit to much to you ehhh? Well I have alot of free time on my hands. So far everyday I take each dog out one by one by them selves and play with them, feed them, and brush them, and spray them with the flee stuff. I love them all. They are all so differnant with differant personalitys. After I take them out indaviually then I take 2 out at a time and let them play with me with each other. Two of the dogs my great Dane and chawawa dont like to play with any of the other dogs but each other. This is a incrediable program. I had my parents thinking I was crazy the other night. I was sitting here playing with my scottie Ren and mutt stimpy and they where playing so well together I dont know why but I said good dog out loud to my computer. I think my parents wondered a little bit and then asked me what the heck I was doing. But thankz PF.Magic. Even though I cant have a real dog it is really nice to have some on my screen to play with. The only problem now is no one can get me away from this computer, and I think my on-line friendz are getting
a little mad cause im not chatting just playing fetch and have a great time with my new dogz. Thanks again PF.magic. I love this program and will recomend it to everyone I know!!!!!!!
I am a teacher and use the catz program on my classroom PC to teach children both computer skills and caring for an animal. One of the more disturbed children in my class repeatedly squirted the catz and she ran away. Now the other children are angry at this child. I promised to try and get the catz back. It has been a wonderful lesson for the children. (And no live animal was involved.) But if there is any way to get poor Lucky to come homze to our clazz, we would very much appreciate knowing how to do it. Thanks for your help, Ms. Shinnick's 4th grade, Boston, MA.
Dear PF.Magic, I am an incredible fan of your latest release,Petz 3,I have both programs and in Janurary 1999, my cherised Dogz Tupaw was born. He is the most wonderful dogz and I thank you from the bottom of my heart, because in Janurary through to the end of April I had Anorexia and i was very sick. I ate and recovered because i cared so much about Tupaw and i wanted to see him grow up. I would have starved without you bringing Petz 3 out. Please Reply to this, it would mean alot to me. Oh, and please visit my webpage, the url is http://www.homestead.com/wtk/pets.html. Thankyou for releasing petz 3,Give your boss my best wishes, Sincerily, Your Number One Fan, Faynine.
I just reciently aquired all your Petz programs and I think they are great! I really love the way the animals react. I raised show dogs and have had numerous pets of all kinds in my life and making something like this is great. I am a school bus driver and have introduced unfortunate kids to your program. Children who not only can they not afford a computer but they can't afford to keep a pet either. This has taught them a tremendous amount of responsibilty. I am trying to get the school to incorporate your programs so as to give all children a chance to see what it is like to take care of a pet. It might help to put a little more compassion in the world. Please keep me updated on your newest releases. Thanks for being such a great company. Nancy M. Gingrich.
Dear PF.Magic, Hello! My name is Caitlin, and I'm 10 years old. I have Dogz 1 and Catz 1, as well as Oddballz, and I enjoy them all very much. Just this morning was I playing with my Jester breed Catz, Lilly. But I know how much better Petz II is. For a while, I
thought I had a solution to my Petz II problem. I thought that if only I could get Soft Windows 95 for $200, that would work. Well, I took $100 out of my bank account (by the way, that's about half my bank account) and made the rest. I cat-sit, I sold my bike, and I got some money from my parents. Anyway, I really, really love animals (I'm a member of the ASPCA, Dog Lovers of America, and Cat Lovers of America) but I can't have one! That's why I love Petz so much! It's like having a dog or cat (or alien for that matter) only not. It's wonderful! I have a Scrappy named Scrappy (Dogz), Chip named Chip (Dogz), Bootz named Boots (Dogz), Cocker Spaniel named Oreo (Dogz), Jester named Lilly (Catz), and Jester named Callie (Catz). And then every single Oddballz breed made. =) I don't mean to bore you as I'm sure this letter is getting very boring. I would love SO MUCH to have Petz II. I really would. (At this point in the letter I'm really crying) I adopted 5 Catz II catz at my friend's house, but I go over to her house so little I'm sure they'll run away. I'd hate for them to run away. Is there anything I can do? I love my petz, and I'm sure they'd love Petz II. Thank you for reading this. Please reply soon. ~*~ Caitlin and her many petz ~*~.
My husband went downtown (to Manchester) and found Catz for sale, and having heard so much about it he bought it on the spot. He put it on his very small laptop and came back from one of his business trips saying, ''How many Dutchmen can watch Catz at once on a little laptop on a Dutch train?'' The answer was TEN. I asked if any of them said, ''Awww,'' the way we all did, but he said they all walked off saying it was silly. I bet they ran out to buy it anyway, though! Yours, Mrs. H. Meyer.
Dear Sirs, Just wanted to thank-you for the pleasure my petz have brought me. I am paralyzed from the neck down yet your program has allowed me too again enjoy the pleasure of raising my own dogz. I have adopted 5 so far. I love them equally as if they were real. Thanks again.
Discussion: The Role of Multiple Users
Elliott: You know, Microsoft is moving this field towards something like multi-user systems, maybe not at the same time, but systems where you can identify a user with a particular session. Do you see your software developing relationships, historical relationships with different users, recognized by their login names?
Stern: Well, it seems reasonable. One problem you have with multiple users—that's why we shy away from the idea—is that it goes against the illusion of life that these pets and babies give. I mean, basically we try to do nothing to artificially set up the situation. So, even the idea of logging in as yourself is not a ''natural'' thing to do, it does not work for the kind of idea we are trying to implement here. Sloman: But if the operating system is doing that anyway, then the information could be made available . . . Stern: I think the better way would be to recognize the user's emotion or recognize their actual face. Sloman: But that's much harder to implement. Ball: The cost of misidentifying someone is relatively high, because you break the illusion completely, if it does not act the way it is supposed to act with you. Picard: But we sometimes misidentify each other, or at least I misidentify people from time to time. And, you know, we have evolved ways to show our embarrassment, and social cues when we make mistakes like that, so we can remediate these errors. Trappl: We have made this experience with our exhibit in the Technical Museum. We expected to have always one person interact with it. But children just rush in together and interact. And the same could happen here, that three children are sitting in front of the computer. Picard: Right: and it misrecognizes that as one person. Sloman: I could well imagine that in a family with, say, two or three kids and one computer, they might enjoy the opportunity to identify themselves somehow. They click on the ''Joe,'' ''Fred,'' ''Mary,'' or whatever button when they start up, and then . . . Bellman: Actually, that's one of the advantages of having an avatar, because an avatar usually recognizes this, and we enter into the world through a body and things can look at you and see that it's you. Sloman: That's another way of doing it. The point is: There are many ways in which it could be done, and I could imagine some people would appreciate that. But I agree, it's artificial. . . .
learning) is the question, basically: What are they going to do with these babies? They can develop some kind of a bizarre pattern, in order to get a response, in order to feel that they have authored something in this environment, they will create a very disturbed baby, just for the effect of being able to see: I have done something here, and it’s different from your baby. Stern: The hope is that you can make many different positive—not disturbed—babies, but maybe different happy babies. Bellman: It just seems to me really important, because half of what I think what kids are going to do is presumably that negative stuff. Ortony: Exactly. They rip the arms and hands off their dolls! Bellman: And the software toy is something that is more powerful and more reinforcing. It allows manipulation on the next level up in an interaction. Your ripped-off Barbie doll just lies there with dismembered parts. It does not sit there and cry indefinitely. In the environments that I am talking about, in addition to the authorship, there are other people present. And no matter how weird the group is, there are still some social warnings about odd or destructive or too violent behavior even in games, that actually helps. You have human beings there who also interact. Picard: I think kids need a safe place to explore the things they don’t want to try in the real world, things you don’t want to do to a real baby. It would be neat if this was a good enough model that behaves like a real baby. I never wanted to run experiments with my son and his emotions when he was born. And I could bring myself to do the ones that induce positive emotions, but I could not bring myself to do the ones that instill morbid fear on my own child. Whereas if I had a virtual model of my son, then I could maybe do some of these. I think of the behaviors that a lot of people are trying in these on-line worlds, where they can play around and then come back with more confidence in the real world. Bellman: Yes. But what’s really interesting is to watch the feedback from other human beings about these behaviors. And one of the first experiences I had when I came into these worlds as a novice, was watching the consequences of a virtual rape. So, here was a young man who was being tried in a public MUD by his peers. About a couple of hundred people were gathered in their avatars inside a room to judge his virtual rape. What was fascinating was the whole range of debate about: Well, is this really real? Or, is this not obscene? It was really a fascinating discussion. Part of what
went on was that eventually, after people had concluded it was serious enough, they decided they were going to do things like label him with a scarlet letter, put him into a virtual jail every time he showed up, all these different kinds of punishment. What happened was that 90 percent of the community locked against him, which meant that when he walked into a room, they couldn't hear him, and they couldn't see him. They decided not to close the MUD, they decided not to change the culture and punishment like having jails etc., but just 90 percent of the population, which means a couple of thousand people, locked against him. Strangely enough, he stayed in this situation (it turned out to be a 19-year-old boy). Did he think of leaving and coming back as a different actor? No. He came back always as himself, and I watched his amazing transformation. I watched him go from total defense—saying ''oh, I was just playing around''—to eventually, about a year later, to saying: ''I never realized I could hurt somebody.'' And then, eventually, the virtual community, very much like the pygmies, began to allow him back in, in a process that took a couple of months. People started hearing that he was basically an ''ok guy'' now and that he was much better behaved, and they started to open up. I think he had a real learning experience. Part of that came about because there was a community of people there. So, you can have your emotional agents too, but something about having these other human beings there is very powerful.
References
Bandai (1996): Tamagotchi keychain toy. On-line. Available: http://www.bandai.com. (Availability last checked 5 Nov 2002)
Bates, J. (1992): The Nature of Characters in Interactive Worlds and the Oz Project. TR CMU-CS-92-200, School of Computer Science, Carnegie Mellon University, Pittsburgh.
Bates, J. (1994): The Role of Emotion in Believable Agents. TR CMU-CS-94-136, School of Computer Science, Carnegie Mellon University, Pittsburgh.
Blumberg, B. (1997): Multi-level Control for Animated Autonomous Agents: Do the Right Thing . . . Oh, Not That . . . In R. Trappl and P. Petta, eds., Creating Personalities for Synthetic Actors. Springer, Berlin, Heidelberg, New York.
Boehlen, M., and Mateas, M. (1998): Office Plant #1: Intimate Space and Contemplative Entertainment. Leonardo 31 (5): 345–348.
Damasio, A. (1994): Descartes' Error: Emotion, Reason, and the Human Brain. Putnam, New York.
Dautenhahn, K. (1997): Ants Don't Have Friends—Thoughts on Socially Intelligent Agents. In K. Dautenhahn, ed., Proceedings of the 1997 AAAI Fall Symposium, Socially Intelligent Agents. Technical Report FS-97-02. AAAI Press, Menlo Park, Calif.
Egri, L. (1946): The Art of Dramatic Writing. Simon and Schuster, New York.
Elliott, C., Brzezinski, J., Sheth, S., and Salvatoriello, R. (1998): Story-Morphing in the Affective Reasoning Paradigm: Generating Stories Semi-Automatically for Use With ''Emotionally Intelligent'' Multimedia Agents. In C. Sierra, M. Gini, and J. S. Rosenschein, eds., Proceedings of the Second International Conference on Autonomous Agents (Agents '98), ACM Press, New York, pp. 181–188.
Frank, A., and Stern, A. (1998): Multiple Character Interaction between Believable Characters. In Proceedings of the 1998 Computer Game Developers Conference, 215–224. Miller Freeman, San Francisco.
Frank, A., Stern, A., and Resner, B. (1997): Socially Intelligent Virtual Petz. In K. Dautenhahn, ed., Proceedings of the 1997 AAAI Fall Symposium, Socially Intelligent Agents, FS-97-02, pp. 43–45. AAAI Press, Menlo Park, Calif.
Goldberg, A. (1997): IMPROV: A System for Real-Time Animation of Behavior-Based Interactive Synthetic Actors. In R. Trappl and P. Petta, eds., Creating Personalities for Synthetic Actors. Springer, Berlin, Heidelberg, New York.
Grand, S., Cliff, D., and Malhotra, A. (1997): Creatures: Artificial Life Autonomous Software Agents for Home Entertainment. In W. L. Johnson, ed., Proceedings of the First International Conference on Autonomous Agents, 22–29. ACM Press, Minneapolis.
Hayes-Roth, B., van Gent, R., and Huber, D. (1997): Acting in Character. In R. Trappl and P. Petta, eds., Creating Personalities for Synthetic Actors. Springer, Berlin, Heidelberg, New York.
Kline, C., and Blumberg, B. (1999): The Art and Science of Synthetic Character Design. In F. Nack, ed., Proc. Symposium on AI and Creativity in Entertainment and Visual Art, 1999 Convention of the Society for the Study of Artificial Intelligence and the Simulation of Behavior (AISB), Edinburgh, Scotland. Univ. of Edinburgh, Edinburgh, Scotland.
Loyall, A. (1997): Believable Agents: Building Interactive Personalities. Ph.D. diss., Carnegie Mellon University, CMU-CS-97-123, Pittsburgh.
Mateas, M. (1999): Not Your Grandmother's Game: AI-Based Art and Entertainment. In D. Dobson and K. Forbus, eds., Proceedings of the 1999 AAAI Spring Symposium, Artificial Intelligence and Computer Games, Technical Report SS-99-02, pp. 64–68. AAAI Press, Menlo Park, Calif.
Meehan, J. (1976): The Metanovel: Writing Stories by Computer. Ph.D. diss., Department of Computer Science, Yale University.
Minsky, M. (1985): The Society of Mind. Simon and Schuster, New York.
Murray, J. (1997): Hamlet on the Holodeck: The Future of Narrative in Cyberspace. Free Press, New York.
Pemberton, L. (1989): A Modular Approach to Story Generation. In The Fourth Conference of the European Chapter of the Association for Computational Linguistics, 217–224. BPCC Wheatons, Exeter.
Penny, S. (1997): Embodied Cultural Agents: At the Intersection of Robotics, Cognitive Science, and Interactive Art. In K. Dautenhahn, ed., Proceedings of the 1997 AAAI Fall Symposium, Socially Intelligent Agents, FS-97-02, pp. 43–45. AAAI Press, Menlo Park, Calif.
Perlin, K. (1995): Real-Time Responsive Animation with Personality. IEEE Trans. Visualization Comput. Graphics 1 (1): 5–15.
PF.Magic/Mindscape. (1995–99): Dogz and Catz: Your Virtual Petz; Babyz. On-line. Available: http://www.petz.com and http://www.babyz.net. (Availability last checked 5 Nov 2002)
Picard, R. (1997): Affective Computing. MIT Press, Cambridge.
Reilly, W. (1996): Believable Social and Emotional Agents. Ph.D. thesis. TR CMU-CS-96-138, School of Computer Science, Carnegie Mellon University, Pittsburgh.
Rilke, R. (1934): Letters to a Young Poet. W. W. Norton, New York.
Sengers, P. (1996): Socially Situated AI: What It Means and Why It Matters. In H. Kitano, ed., Proceedings of the 1996 AAAI Symposium, Entertainment and AI/A-Life. Technical Report WS-96-03, pp. 69–75. AAAI Press, Menlo Park, Calif.
Sengers, P. (1998): Do the Thing Right: An Architecture for Action-Expression. In C. Sierra, M. Gini, and J. S. Rosenschein, eds., Proceedings of the Second International Conference on Autonomous Agents, 24–31. ACM Press, New York.
Sony Electronics. (1999): Aibo robot dog. On-line. Available: http://www.aibo.com. (Availability last checked 6 July 2002.)
Stern, A. (1998): Interactive Fiction: The Story Is Just Beginning. IEEE Intell. Syst. 13 (5): 16–18.
Stern, A. (1999a): AI Beyond Computer Games. In Proceedings of the 1999 AAAI Spring Symposium, Artificial Intelligence and Computer Games, SS-99-02, pp. 77–80. AAAI Press, Menlo Park, Calif.
Stern, A. (1999b): Virtual Babyz: Believable Agents with Narrative Intelligence. In M. Mateas and P. Sengers, eds., Proceedings of the 1999 AAAI Fall Symposium, Narrative Intelligence. Technical Report FS-99-01, pp. 52–58. AAAI Press, Menlo Park, Calif.
Stern, A., Frank, A., and Resner, B. (1998): Virtual Petz: A Hybrid Approach to Creating Autonomous, Lifelike Dogz and Catz. In C. Sierra, M. Gini, and J. S. Rosenschein, eds., Proceedings of the Second International Conference on Autonomous Agents, 334–335. ACM Press, New York.
Thomas, F., and Johnston, O. (1981): Disney Animation: The Illusion of Life. Abbeville Press, New York.
Tiger Electronics. (1998): Furby toy. On-line. Available: http://www.furby.com. (Availability last checked 6 July 2002.)
Turkle, S. (1999): Ongoing Virtual Pet Study. On-line. Available: http://web.mit.edu/sturkle/www/vpet.html. (Availability last checked 6 July 2002.)
Turner, S. (1994): The Creative Process: A Computer Model of Storytelling and Creativity. Lawrence Erlbaum Associates, Hillsdale, N.J.
Weizenbaum, J. (1966): Eliza. Commun. ACM 9: 36–45.
Wurst, K., and McCartney, R. (1996): Autonomous Robots as Performing Agents. In H. Kitano, ed., Proceedings of the 1996 AAAI Symposium, Entertainment and AI/A-Life. AAAI Press, Menlo Park, Calif.
Concluding Remarks
Robert Trappl
A diversity of fascinating topics has been covered in this book. If the reader wants to delve deeper, it is recommended to follow up the publications of the contributing scientists or to contact them directly. More information about these scientists can be found in the Contributors section following this chapter. The contributors and the editors also have compiled a list of Recommended Reading that provide more detailed information on specific aspects of emotions in humans, animals, and/or artifacts. Finally, three possible remaining questions and their answers (or vice versa) shall be mentioned: First, will emotions research have an impact on our self-image, especially on our view of the function of our consciousness? While some scientists (e.g., Steven Pinker 1997) assume that the mind of Homo sapiens lacks the cognitive equipment to solve the puzzle of consciousness because our minds evolved by natural selection to solve problems that were life-and-death matters to our ancestors, Antonio Damasio (1999, p. 285) offers an interesting alternative explanation: ‘‘Knowing a feeling requires a knower subject. In looking for a good reason for the endurance of consciousness in evolution, one might do worse than say that consciousness endured because organisms so endowed could ‘feel’ their feelings. I am suggesting that the mechanisms which permit consciousness may have prevailed because it was useful for organisms to know of their emotions.’’ Where is consciousness located in the brain? Numerous experiments and observations with extremely sensitive recording instruments show that consciousness is not vaguely distributed in the brain but located in a definite place, the associative cortex. The associative cortex, however, does not look very different from other cortical areas and looks even more similar to the cerebellum (Roth, 2001). Why, then, should consciousness be located in this area? The very likely reason is that the associative cortex is the only part of the cortex which is strongly interconnected with the limbic system, where the emotional evaluation system of the brain is located.
We therefore may conclude that both cognition and emotion are necessary prerequisites of consciousness.
Second, an even closer look at the cortex of the human brain reveals another fact related to the topic of this book: The number of afferent and efferent fibers to and from cortical neurons—the ''input/output channels''—amounts to at most 100 million. In contrast, the ca. 50 billion neurons in the cortex are strongly interconnected, the number of connections amounting to 5 × 10^14. Given that, we find a ratio of one afferent/efferent fiber to every 5 million intracortical fibers (Schütz 2000). What does this mean? It means that the input/output processes represent but a minute fraction of the processes going on within the cortical system! This is in total contrast to how the vast majority of researchers and developers construct emotional and intelligent artifacts. Much money and energy is invested in the ''surface''; much effort is also invested in the development of both sensory detection devices and means for very ''realistic'' outputs. Perhaps this insight about the human cortex should lead us to focus (again?) more on the ''deep'' structure, for instance, developing further more complex models of cognition and emotion and their interplay.
Finally, the main aim covered by most of the contributors to this book is the development of emotional, intelligent artifacts. With respect to computers, de Rosis (2002) describes this process as eventually leading to ''the kind of computer that is beautiful, warm, has a sense of humor, is able to distinguish good from evil, feels love and induces the user to fall in love with it, etc.'' Replace ''computer'' by ''robot'' or ''synthetic actor'' and there is a homunculus, as in Spielberg's movie Artificial Intelligence. The issue of homunculi/ae of this kind is quite old, especially with men falling in love with attractive ''women,'' ranging from Galatea to Olympia in ''The Tales of Hoffmann'' to the replicant in Ridley Scott's Blade Runner. There are already some speech systems available which sound quite natural, especially if fragments of ''canned speech'' are used. Synthetic actors increasingly make a humanlike impression—is it really desirable that humans are so impressed by them as to fall in love with them? The European Union requires that the labels on food packages inform the consumer if even only a small percentage of the food contained is genetically altered. Given the progress in computer animation and the slower but, nevertheless, existing progress in
synthetizing humanlike personalities, the time is ripe to consider the request for a mandatory declaration: Synthetic actors should declare that they are synthetic. This concern takes on alarming proportions when considering children. Sherry Turkle (1998, 2000) sees a special risk in, as she calls them, ‘‘relational toys’’: Until now, for example, girls could learn parental discipline when playing with a doll, attributing to this doll emotions they themselves experience when interacting with a parent. Now the relational dolls declare that they have emotions, they express them, and the child has to cope with these emotions. She is rewarded when she induces certain emotions in the doll through a specific behavior. But how would the behavior of a child be affected who abuses a doll and then is rewarded? The big U.S. toy company Hasbro has decided that its dolls will not respond if they are abused. Is this enough? How long will it be until another company does not stick to such a moral code? In conclusion, research on emotions in humans and artifacts is definitely not l’art pour l’art, but rather research with potentially far-reaching consequences. Therefore, researchers in this area especially have a moral obligation to bear these implications in mind.
References
Damasio, A. R. (1999): The Feeling of What Happens: Body and Emotion in the Making of Consciousness. Harcourt Brace, New York.
Pinker, S. (1997): How the Mind Works. Penguin, New York.
de Rosis, F., ed. (2002): Toward Merging Cognition and Affect in HCI. Special Double Issue of Applied Artificial Intelligence, 16 (7 & 8).
Roth, G. (2001): Fühlen, Denken, Handeln. Wie das Gehirn unser Verhalten steuert. Suhrkamp, Frankfurt am Main.
Schütz, A. (2000): What can the cerebral cortex do better than other parts of the brain? In G. Roth and M. F. Wulliman, eds., Brain Evolution and Cognition. Wiley-Spektrum Akademischer Verlag, New York, Heidelberg.
Turkle, S. (1998): Cyborg Babies and Cy-Dough-Plasm: Ideas about Self and Life in the Culture of Simulation. In R. Davis-Floyd and J. Dumit, eds., Cyborg Babies: From Techno-Sex to Techno-Tots. Routledge, New York.
Turkle, S. (2000): The Cultural Consequences of the Digital Economy. Invited lecture at the mobilcom austria Conference, 9 November 2000, Hofburg, Vienna, Austria.
Recommended Reading
Note: Some of the following books were recommended by a contributor, some by the editors. It therefore cannot be concluded that a book on this list is recommended by all contributors.
Arkin, R. C. (1998): Behavior-Based Robotics (Intelligent Robots and Autonomous Agents). MIT Press, Cambridge.
Oatley, K., and Jenkins, J. M. (1995): Understanding Emotions. Basil Blackwell, Oxford, Cambridge.
Cañamero, L., ed. (2002): Emotional and Intelligent II: The Tangled Knot of Social Cognition. Papers from the 2001 Fall Symposium, November 2–4, 2001, North Falmouth, Massachusetts. Technical Report FS-01-02, AAAI Press, Menlo Park, Calif.
Cañamero, L., and Petta, P., eds. (2001): Grounding Emotions in Adaptive Systems. Two Special Issues of Cybernetics and Systems, 32 (5) and (6).
Cassell, J., Sullivan, J., Prevost, S., and Churchill, E., eds. (2000): Embodied Conversational Agents. MIT Press, Cambridge.
Clancey, W. J. (1999): Conceptual Coordination: How the Mind Orders Experiences in Time. Lawrence Erlbaum Associates, Hillsdale, N.J.
Crawford, C. (2000): Understanding Interactivity. Self-published on-line. Available: http://www.erasmatazz.com. (Availability last checked 5 Nov 2002)
Dalgleish, T., and Power, M., eds. (1999): Handbook of Cognition and Emotion. Wiley, New York.
Damasio, A. R. (1994): Descartes' Error. Putnam, New York.
Damasio, A. R. (1999): The Feeling of What Happens: Body and Emotion in the Making of Consciousness. Harcourt Brace Jovanovich, New York.
Dautenhahn, K., Bond, H. A., Cañamero, L., and Edmonds, B., eds. (2002): Socially Intelligent Agents: Creating Relationships with Computers and Robots. Kluwer Academic Press.
Elliott, C. D. (1992): The Affective Reasoner: A Process Model of Emotions in a Multiagent System. Ph.D. thesis, Northwestern University, Illinois. On-line. Available: ftp://ftp.depaul.edu/pub/cs/ar/elliott-thesis.ps. (Availability last checked 5 Nov 2002)
Frijda, N. H. (1986): The Emotions. Cambridge University Press, Editions de la Maison des Sciences de l'Homme, Paris.
Hauser, M. (2000): Wild Minds: What Animals Really Think. Henry Holt, New York.
Laurel, B. (1991): Computers as Theatre. Addison-Wesley, Reading, Mass.
LeDoux, J. E. (1996): The Emotional Brain. Simon and Schuster, New York.
Lewis, M., and Haviland-Jones, J. M., eds. (2000): Handbook of Emotions. 2nd ed. Guilford Press, New York, London.
McKee, R. (1997): Story: Substance, Structure, Style, and the Principles of Screenwriting. Harper Collins, New York.
Murray, J. (1997): Hamlet on the Holodeck: The Future of Narrative in Cyberspace. Free Press, New York.
Ortony, A., Clore, G. L., and Collins, A. (1988): The Cognitive Structure of Emotions. Cambridge University Press, Cambridge.
Paiva, A., ed. (2000): Affective Interactions: Towards a New Generation of Computer Interfaces. Springer, Berlin, Heidelberg, New York.
Picard, R. W. (1997): Affective Computing. MIT Press, Cambridge.
Pinker, S. (1997): How the Mind Works. Penguin, New York.
Powers, W. T. (1973): Behavior: The Control of Perception. Aldine, Chicago.
Reeves, B., and Nass, C. (1998): The Media Equation. CSLI Publications, Stanford, Calif.
Roth, G. (2001): Fühlen, Denken, Handeln. Wie das Gehirn unser Verhalten steuert. Suhrkamp, Frankfurt am Main. (Feeling, Thinking, Acting: How the Brain Controls our Behavior. Unfortunately, this excellent book has not (yet) been translated into English. RT)
Rolls, E. T. (1999): The Brain and Emotion. Oxford University Press, Oxford, London, New York.
Rosis, F. de, ed. (2002): Toward Merging Cognition and Affect in HCI. Special Double Issue of Applied Artificial Intelligence, 16 (7 & 8).
Scherer, K. R., Schorr, A., and Johnstone, T., eds. (2001): Appraisal Processes in Emotion: Theory, Methods, Research. Oxford University Press, Oxford, London, New York.
Sloman, A., ed. (2000): Proceedings of the AISB 2000 Symposium on ''How to Design a Functioning Mind,'' The University of Birmingham, UK.
Sousa, R. de (1987): The Rationality of Emotion. MIT Press, Cambridge.
Thomas, F., and Johnston, O. (1981): Disney Animation: The Illusion of Life. Abbeville Press, New York.
Trappl, R., and Petta, P., eds. (1997): Creating Personalities for Synthetic Actors. Springer, Berlin, Heidelberg, New York.
Contributors
Gene Ball
E-mail: [email protected]
Gene Ball was a senior researcher at Microsoft Corporation until January 2001. He earned his bachelor's degree in Mathematics from the University of Oklahoma and his master's degree in Computer Science from the University of Rochester, where he also did his Ph.D. studies, which he completed in 1982. He worked as a research computer scientist at Carnegie Mellon University from 1979 to 1982 and as a software designer for the company Formative Technologies from 1983 to 1984. From 1985 to 1991, he was assistant professor in Computer and Information Sciences at the University of Delaware at Newark, before joining Microsoft Corporation in 1992, first as researcher and then, from 1995 onward, as senior researcher. He has been active in the Persona Project at Microsoft Research and, between 1994 and 1998, organized four ''Lifelike Computer Characters'' conferences.

Kirstie L. Bellman
E-mail: [email protected]
Kirstie L. Bellman is Principal Director of the Aerospace Integration Sciences Center at the Aerospace Corporation. She gained her bachelor's degree from the University of Southern California and her Ph.D. from the University of California, San Diego, both in Psychology. She was an NIH postdoctoral scholar and trainee in neuropsychology for three years. She worked as a researcher at the University of California, Los Angeles and for the Crump Institute for Medical Engineering. She joined The Aerospace Corporation in 1991 as a senior scientist. From 1993 to 1997, she started up the new Aerospace Integration Sciences Center for DARPA, which she is now heading. Her recent work focuses on the use of domain-specific languages and formally based architectural description languages to design and analyze information architectures. With a number of academic partners, she is also developing new mathematical approaches to the analysis of virtual worlds containing collaborating humans, artificial agents, and heterogeneous representations, models, and processing tools. Lately, she has been working on reflective architectures that use models of themselves to manage their own resources and to reason about appropriate behavior.

Lola Cañamero
E-mail: [email protected]
Lola (Dolores) Cañamero is Senior Lecturer in Computer Science at the University of Hertfordshire, UK. She received bachelor's and master's degrees in philosophy from the Complutense University of Madrid, and a Ph.D. in computer science from the University of Paris-XI. She worked as a postdoctoral associate at the MIT Artificial Intelligence Laboratory and at the VUB (Free University of Brussels) Artificial Intelligence Laboratory, and as a researcher at the Artificial Intelligence Institute of the Spanish Scientific Research Council. Her research lies in the areas of adaptive behavior and emotion modeling for autonomous and social agents (both robotic and synthetic). She has organized a number of symposia and workshops on this topic, and she is guest editor (with Paolo Petta) of the special issue of the Cybernetics and Systems Journal Grounding Emotions in Adaptive Systems, as well as coeditor of the book Socially Intelligent Agents: Creating Relationships with Computers and Robots.

Clark Elliott
E-mail: [email protected]
Clark Elliott, associate professor of computer science at DePaul University, has conducted research in both theoretical and applied computer applications of emotion reasoning since 1989. He was among the first graduates of Northwestern University's Institute for the Learning Sciences, receiving his degree there in 1992. Dr. Elliott was an early proponent of the use of emotion models in the design of believable agents in multi-agent systems, and in this capacity served on the program committees of numerous conferences that supported this area of research. His work on emotion representation has been applied to diverse subareas of AI such as intelligent tutoring systems, personality representation, story representation and generation, user modeling, and humor representation. His Affective Reasoner embodied real-time, interactive agents that used speech-generation, speech-recognition, music, and face-morphing techniques to communicate with users. Dr. Elliott founded DePaul's Distributed Systems division in 1997, and is currently serving as its associate director. At the time of publication he is on a temporary hiatus from his research while recovering from an accidental brain injury.

Andrew Ortony
E-mail: [email protected]
Andrew Ortony was educated in Britain, gaining his bachelor's degree from the University of Edinburgh, where he majored in philosophy, and then doing his Ph.D. in computer science at the University of London's Imperial College of Science and Technology. His Ph.D. dissertation was concerned with graphical interface design. In 1973, he joined the faculty at the University of Illinois at Urbana-Champaign. There, with appointments in education and in psychology, he started to investigate questions having to do with knowledge representation and language understanding, concentrating in particular on the communicative functions of, and the processes involved in, the production and comprehension of nonliteral (especially metaphorical) uses of language. His approach to research problems is strongly interdisciplinary, as is evident from the diverse perspectives on metaphor represented in his edited book, Metaphor and Thought. In 1981, he started a long collaboration with Gerald Clore working on the relationship between emotion and cognition, culminating in the publication of their 1988 book (with Allan Collins), The Cognitive Structure of Emotions. Since moving to Northwestern University in 1989, he has maintained his interest in research on metaphor. At the same time, he has become increasingly interested in emotion research as it relates to various aspects of artificial intelligence, including the design of intelligent, emotional autonomous agents.

Sabine Payr
E-mail: [email protected]
Sabine Payr holds a diploma as a conference interpreter from the University of Innsbruck and a doctorate in linguistics from the University of Klagenfurt. Her international experience includes one-year stays for studies (Paris), work (Brussels), and research (Berkeley). Her professional activities range from conference interpreting to regional development initiatives, and from IT training and consulting to research. Since 1987, she has been involved in interactive media in training and education, doing research and development in the field of educational technology in higher education, open and distance learning, and tele-learning in vocational training and further education. Sabine Payr has worked at the Institute for Interdisciplinary Research and Further Education (IFF), the Austrian Federal Ministry of Education, Science, and Culture, and the Research Center Information Technologies (FGI). She has been working at the Austrian Research Institute for Artificial Intelligence since spring 2000, in the framework of the project ''An Inquiry into the Cultural Context of the Design and Use of Synthetic Actors.'' She is currently also visiting professor at the University for Design (Linz, Austria).

Paolo Petta
E-mail: [email protected]
Paolo Petta earned his master's degree (1987) and doctorate (1994) in computer science from the Technical University of Vienna. Since 1987, he has been working at the Austrian Research Institute for Artificial Intelligence, where in 1997 he founded the research group Intelligent Software Agents and New Media, which he heads. In 1989, he also joined the Department of Medical Cybernetics and Artificial Intelligence of the University of Vienna as an assistant professor. He has led a number of research projects in the field of autonomous intelligent agents, among them the development of a life-size improvising synthetic character for an interactive exhibit at the Technical Museum of Vienna. In 1997, he edited, together with Robert Trappl, the book Creating Personalities for Synthetic Actors.

Rosalind W. Picard
E-mail: [email protected]
In 1984, Rosalind W. Picard earned a bachelor's degree in electrical engineering with highest honors from the Georgia Institute of Technology and was named a National Science Foundation graduate fellow. She worked as a member of the technical staff at AT&T Bell Laboratories from 1984 to 1987, designing VLSI chips for digital signal processing and developing new methods of image compression and analysis. Picard earned her master's and doctorate, both in electrical engineering and computer science, from the Massachusetts Institute of Technology (MIT) in 1986 and 1991, respectively. In 1991, she joined the MIT Media Laboratory as an assistant professor, and in 1992 was appointed to the NEC Development Chair in Computers and Communications. She was promoted to associate professor in 1995, and awarded tenure at MIT in 1998. Her award-winning book, Affective Computing (MIT Press, 1997), lays the groundwork for giving machines the skills of emotional intelligence. Rosalind W. Picard is founder and director of the Affective Computing Research Group at the MIT Media Laboratory.

Douglas Riecken
E-mail: [email protected] and [email protected]
Doug Riecken is a principal investigator and manager at the IBM T. J. Watson Research Center, where he manages the Common Sense Reasoning and e-Commerce Intelligence Research Department. He has also established the Center of Excellence for Common Sense Reasoning at IBM. Since 1987, Doug has worked with Marvin Minsky on theories of mind and common sense reasoning with a fundamental focus on the role of emotions and instincts in memory, reasoning, and learning. Riecken is also a member of the graduate faculty at Rutgers University. Prior to joining IBM Research in 1999, Riecken served for 17 years as a principal investigator and manager at AT&T Bell Laboratories Research. He received his Ph.D. from Rutgers University, working under Minsky.

Edmund T. Rolls
E-mail: [email protected]
Edmund T. Rolls read preclinical medicine at the University of Cambridge and performed research on brain function for a Ph.D. at the University of Oxford. He is now professor of experimental psychology at the University of Oxford, and a fellow and tutor at Corpus Christi College, Oxford. He is associate director of the Medical Research Council Oxford Interdisciplinary Research Centre for Cognitive Neuroscience. His research interests include the brain mechanisms of emotion and memory; the neurophysiology of vision; the neurophysiology of taste, olfaction, and feeding; the neurophysiology of the striatum; and the operation of real neuronal networks in the brain. He is author, with A. Treves, of Neural Networks and Brain Function (1998). In 1999, he published the much-noted book The Brain and Emotion. His website is <www.cns.ox.ac.uk>.

Aaron Sloman
E-mail: [email protected]
Aaron Sloman received a B.S. in mathematics and physics, first class, in 1956 in Cape Town, and a Ph.D. in philosophy from Oxford in 1962. He joined the faculty of the University of Birmingham, UK in 1991. He was Rhodes scholar at Balliol College, Oxford (1957–60), senior scholar at St Antony's College (1960–62), GEC professorial fellow (1984–86), and elected fellow of the American Association for AI in 1991. He was elected honorary life fellow of AISB in 1997, and fellow of ECCAI in 1999. Aaron Sloman is a philosopher and programmer trying to understand how minds evolved and what sorts of designs make them possible. Many papers on aspects of mind, emotions, representations, vision, architectures, evolution, and so on can be found at the Cognition and Affect website, <www.cs.bham.ac.uk/research/cogaff/>.

Andrew Stern
E-mail: [email protected]
Andrew Stern is a designer and programmer of the interactive characters Dogz, Catz, and Babyz from PF.Magic in San Francisco. Along with his fellow creators Adam Frank, Ben Resner, and Rob Fulop, he has presented these projects at the Siggraph Art Gallery 2000, Digital Arts and Culture '99, the AAAI Narrative Intelligence Symposium '99, Autonomous Agents '98, and Intelligent User Interfaces '98. The projects have received press coverage from the New York Times, Time Magazine, Wired, and AI Magazine. Babyz received a Silver Invision 2000 award for Best Overall Design for CD-ROM; Catz received a Design Distinction in the first annual I.D. Magazine Interactive Media Review, and along with Dogz and Babyz was part of the American Museum of the Moving Image's Computer Space exhibit in New York. Andrew Stern is currently collaborating with Michael Mateas on an interactive drama project, ''Facade.'' He holds a B.S. in computer engineering with a concentration in filmmaking from Carnegie Mellon University and a master's degree in computer science from the University of Southern California. His website can be found at <www.interactivestory.net>.

Robert Trappl
E-mail: [email protected]
Robert Trappl is professor and head of the Department of Medical Cybernetics and Artificial Intelligence, University of Vienna, Austria. He is director of the Austrian Research Institute for Artificial Intelligence in Vienna, which was founded in 1984. He holds a Ph.D. in psychology (minor in astronomy), a diploma in sociology (Institute for Advanced Studies, Vienna), and is an electrical engineer. He has published more than 130 articles, and he is coauthor, editor, or coeditor of 28 books, the most recent being Power, Autonomy, Utopia: New Approaches toward Complex Systems (Plenum, New York); Cybernetics and Systems 2002 (ÖSGK, Vienna); and Advanced Topics in Artificial Intelligence, Creating Personalities for Synthetic Actors, and Multi-Agent Systems and Applications (these three books by Springer, Heidelberg, New York). He is editor-in-chief of Applied Artificial Intelligence: An International Journal and Cybernetics and Systems: An International Journal, both published by Taylor and Francis, United States. His main research focus at present is the development and application of artificial intelligence methods to aid decision makers in preventing/ending wars, and the design of emotional personality agents for synthetic actors in films, television, and interactive computer programs. He has been giving lectures and has been working as a consultant for national and international companies and organizations (OECD, UNIDO, WHO).
Name Index
Agre, P. E., 251, 256, 269, 207, 271, 277 Albus, J. S., 40, 48, 51, 52, 89, 90, 95, 111 Alexander, R. D., 27, 33 Allen, S., 103 Allport, G. W., 202 Anderson, J.R., 267 Anderson, K. J., 204 Antoniou, A. A., 193, 257 Armon-Jones, C., 253 Arnott, J. L., 323 Ashby, W. R., 146 Atherton, D., 355 Aube´, M., 254 Averill, J. R., 192 Balkenius, C., 257 Ball, G., 8, 215, 219, 277, 280, 307, 319, 320, 328, 329, 358 Banse, R., 324 Barber, K. S., 170 Bargh, J. A., 251 Baron, R. M., 269 Bartle, R., 175 Bartlett, F. C., 258 Bates, J., 42, 111, 142, 262, 309, 335 Baumgartner, P., 1, 10 Beaudoin, L., 38, 44, 53, 67, 73, 74, 80, 103, 111, 112, 114 Beethoven, L. van, 290 Bellman, K., 5, 30, 64, 65, 105, 106, 143, 162, 164–166, 169–172, 174–175, 178, 180–181, 183, 185, 205–207, 215, 234, 278, 307, 319, 329, 350, 351, 358, 359–360 Bickhard, M. H., 277 Blumberg, B., 335 Boden, M., 103 Boehlen, M., 339 Bohr, N., 162 Bonasso, R. P., 261, 267 Booth, D. A., 23, 33 Boynton, B., 355 Bo¨sser, T., 255, 271 Braitenberg, V., 124, 146, 162, 220 Brand, P. W., 228 Breese, J., 304, 311 Brooks, R. A., 52, 53, 90, 89, 112, 116, 146, 260, 271
Bruner, J., 165 Brzezinski, J., 335 Burgess, P., 26, 34 Burgoyne, A., 355 Cahn, J. E., 324 Can˜amero, L. D., 4, 104, 108, 115, 117, 127, 123, 124, 129, 133, 137, 144, 147, 157, 161, 173, 182, 256, 262, 275, 277–278 Cantor, J., 355 Capra, F., 159, 167–168 Carpenter, G. A., 137, 147 Chandrasekaran, B., 44, 112 Chapman, D., 260 Chartrand, T. L., 251 Chesterto wn, G. K., 163 Chomsky, N., 270 Christopher, S. B., 28, 34 Churchland, P., 164, 169 Clancey, W. J., 256, 277 Cliff, D., 337 Clodius, J., 175, 177 Clore, G. L., 88, 113, 193, 132, 147, 222, 257, 262, 291, 309 Clynes, M., 227–228, 291 Colby, K. M., 190 Collins, A., 88, 113, 193, 132, 147, 222, 257, 262, 291, 309 Costa, P. T., 203, 309 Craik, K., 62, 112 Croucher, M., 73, 114, 261 Custodio, L., 257 Damasio, A. R., 1, 10, 77, 80, 84, 112, 122, 147, 146, 150, 153, 155–158, 163, 165–166, 168, 223, 251, 256, 280, 336, 363 Darrell, T., 272 Darwin, C., 192, 17, 22, 33 Dautenhahn, K., 175 Davies, D. N., 52, 112 Dawkins, R., 22, 34 Dennett, D. C., 39, 112, 40, 62, 70, 97, 103, 111 de Rosis, F., 364 Descartes, R., 159 de Sousa, R., 272 Donnart, J. Y., 116, 147
Doyle, J., 294 Drescher,G.L., 267 Dyer, M. G., 263 Earl, C., 267 Egri, L., 334 Ekman, P., 192, 196, 17, 146, 147 Elliott, C., 7, 8, 141, 160, 209, 215–216, 233, 254, 262–264, 267, 335, 357 Ellsworth, P.C., 257 Elsaesser, C., 261 Elster, J., 254 Endo, T., 89, 113 Eysenck, H. J., 204 Feldman, D., 355 Ferguson, I. A., 261, 267 Filippoff, M., 355 Firby, R. J., 261, 267, 269 Fiske, S. T., 266 Fodor, J., 47, 112, 48, 51, 52, 93, 94 Fogg, B.J., 327 Foner, L. N., 179 Foss, M. A., 193 Frank, A., 340, 355 Frank, R. H., 255 Franklin, S., 103 Freud, S., 97, 152 Fridlund, A. J., 18, 34 Frijda, N. H., 5, 8, 12, 34, 68, 72, 115, 120, 121, 127, 130, 132, 133, 146, 147, 254– 255, 257–260, 266–268, 272, 277, 279 Frisby, J. P., 49, 112 Fulop, R., 340, 355 Gat, E., 271, 278 Geneva Emotion Research Group, 324 Gent, R. van, 262, 335 Georgeff, M. P., 261 Gershon, M. D., 227 Gibson, J. J., 49, 112, 269 Glasgow, J., 44, 112 Goldberg, A., 335, 346 Goldman-Rakic, P. S., 26, 34 Goodale, M., 55, 112 Gordon, A., 176 Grand, S., 337 Grandin, T., 165 Granit, R., 169 Gratch, J., 264 Gray, J. A., 204, 13, 14, 34 Griffin, D. R., 153 Gross, J. J., 251 Grossberg, S., 137, 147 Haidt, J., 255 Hall, L., 176
Halperin, J., 275 Harrington, A., 355 Hayes, P., 103 Hayes-Ro th, B., 202, 262, 335, Heckerman, D., 311, 323 Hershman, L., 339 Hexmoor, H., 271, 278 Higgins, E. T., 203 Horswill, I., 251, 256, 270–271 Horvitz, E., 311 Huber, D., 262, 335 Humphreys, M. S., 204 Izard, C. E., 18, 34 James, W., 87, 154 Jenkins, J. M., 12, 85, 113 Jennings, N. R., 252, 260–261 Jensen, F. V., 311, 317 Jessel, T. M., 119, 147 Johnson, W. L., 179, 181 Johnson–Laird, P., 89, 112, 117, 147 Johnston, O., 334 Johnstone, B., 220 Johnstone, I. T., 324 Jose, P. E., 193, 257 Kacelnik, A., 25, 34 Kandel, E. R., 119, 147 Karmiloff–Smith, A., 44, 112 Keltner, D., 251, 255 Kemmer, P., 355 Kemper, T. D., 254 Kennedy, C., 103 Kim, J., 170 Ko¨hler, W., 78, 112 Krebs, J. R., 25, 34 Lachman, R., 355 Landauer, C., 164–165, 170–172, 174–176, 178, 180–181 Lang, P., 309 Lansky, A. L., 261 Laurtizen, S.L., 317 Lazarus, R. S., 12, 34, 257–260, 267–269 Leak, G. K., 28, 34 LeDoux, J. E., 33, 120, 123, 147, 221, 253, 259 Lee, D., 55, 112 Lenat, D., 292 Leong, L., 175 Lester, J., 262, 264 Levenson, R. W., 255 Leventhal, H., 258–259, 265–266 Lindsay, P. H., 150, 153–154, 158, 162 Lishman, J., 55, 112 Lloyd, A. T., 28, 34
Logan, B., 103 Loyall, A. B., 42, 111, 262, 309 Macintosh, N. J., 14, 34 Macmahon, M., 272 Maes, P., 116, 147, 164, 169, 262 Malhotra, A., 337 Mandler, G., 118, 147, 153 Marr, D., 48, 50, 93, 112 Marsella, S., 302 Martinho, C., 262, 309 Mateas, M., 335, 339 Maturana, H., 159 Mauldin, M. L., 272 McArthur, L. Z., 269 McCarthy, J., 103, 111 McCaulley, M. H., 310 McCrae, R. R., 203, 309 McDermott, D., 144, 146, 38, 112, 40 McFarland, D. J., 255, 271 McGinn, C., 153, 165 Meehan, J., 339 Meyer, J. A., 116, 147 Michalski, R. S., 292 Millenson, J. R., 13, 34 Miller, D. P., 261 Miller, R., 169 Millington, I., 103 Milner, A., 55, 112 Minsky, M. L., 40, 52, 89, 100, 103, 112, 162, 291, 293, 294, 296, 297, 298, 336 Moffat, D., 141, 152 More´n, J., 257 Murarka, N., 355 Murray, I. R., 323 Murray, J., 338–339 Myers, J. B., 310 Nagel, T., 99, 112 Narayanan, N. H., 44, 112 Nass, C., 1, 10, 214, 305, 318–319 Nesse, R. M., 28, 34 Newell, A., 40, 112 Nii, H.P., 299 Nilsson, N. J., 40, 51, 52, 56, 77, 89, 90, 113 Norman, D. A., 150, 153–154, 157–158, 162 Norvig, P., 40, 113, 256 O’Brien, M., 175 O’Rorke, P., 262 Oatley, K., 12, 34, 85, 113, 117, 147 Odbert, H. S., 202 Ogden, C. K., 150–151, 167, 172 Okada, N., 89, 113 Olesen, K. G., 317
Ortony, A., 6, 29–33, 64, 66, 88, 103, 104, 108, 110, 113, 132, 147, 142, 146, 191, 193, 199, 206, 208–210, 216, 218, 219, 222, 226, 234, 235, 254–255, 257, 262, 277, 280, 291, 307, 309, 319, 328, 359 Osgood, C. E., 321 Paiva, A., 262, 309 Payr, S., 1, 10 Pednault, E., 302 Pemberton, L., 339 Penny, S., 339 Pentland, A., 324 Perlin, K., 181, 335 Pert, C., 162, 164 Peterson, D., 44, 113 Petrides, M., 26, 34 Petta, P., 2, 8, 10, 103, 141, 207–208, 255, 262, 272, 278, 280, 291, 348 Pfeifer, R., 118, 124, 128, 131, 133, 147, 260–261, 263 Picard, R., 2, 5–7, 10, 36, 65, 77, 84, 103, 109, 113, 125, 126, 143, 144, 147, 151, 165–166, 214–216, 219, 226–227, 233–235, 262, 291, 306, 307, 308, 319, 328, 329, 352, 358, 359 Pinker, S., 363 Pinto-Ferreira, C. A., 257, 278 Poincare´, H., 156–157 Poli, R., 103 Polichar, V. E., 175–176 Popper, K., 55, 113, 62 Pribram, K. H., 119, 147 Pryor, L. M., 269 Rath, A., 339 Read, T., 103 Reekum, C. M. van, 258–259 Reeves, B., 1, 10, 214, 305, 318–319 Reilly, W. S., 42, 111, 309 Resner B., 340, 355 Revelle, W., 204 Richards, I. A., 150–151 Rickel, J., 262, 264 Riecken, D., 8, 146, 226, 278, 291, 292, 349 Rilke, R., 334 Riner, R., 175, 177 Rines, J., 355 Rolls, E., 3, 11–16, 19–21, 23, 24, 29–33, 44, 54, 64, 69, 70, 96, 105–107, 110, 111, 113, 121, 122, 132, 142, 145, 147, 150, 153–154, 158, 204, 205–206, 210, 226, 253, 255–257, 280, 291 Rommelse, K., 311 Rosch, E., 155 Roseman, I. J., 193, 257 Roth, G., 363
Rousseau, D., 202, 262 Roy, D., 324 Russell, S., 40, 113, 256 Ryle, G., 73, 113 Sacks, O., 150, 163, 165 Sagan, C., 162 Sakaguchi, H., 2 Salvatoriello, R., 335 Scherer, K. R., 124, 148, 193, 257–259, 261,266, 279, 309, 324 Scull, J., 355 Schu¨tz, A., 364 Schwartz, J. H., 119, 147, 176 Scott, R., 364 Searle, J., 183 Sengers, P., 345 Shakespeare, W., 87 Shambroom, J., 355 Shallice, T., 26, 34 Sherrod, B., 355 Sheth, S., 335 Shing, E., 103 Shneiderman, B., 143 Shwe, M., 311 Simon, H. A., 47, 73, 113, 117, 140, 147 Slack, M. G., 261 Sleeper, J., 355 Sloman, A., 4, 30–33, 35, 38, 43, 44, 46, 48, 49, 52, 53, 55, 56, 63–67, 69, 73, 80, 83, 90, 94, 98, 103–109, 113, 114, 123, 142, 144–146, 147, 161–162, 169, 191, 205–210, 216, 218–219, 226, 234, 256, 261–262, 291, 319, 329, 348, 351, 358 Smith, B., 165, 171 Smith, C. A., 257–259, 266, 268 Sorenson, J., 355 Staller, A., 255 Steels, L., 116, 148 Stern, A., 9, 146, 202, 214, 279–280, 335, 338, 343, 348, 349, 351, 358, 359 Strongman, K. T., 12, 34 Strawson, P. F., 98, 114 Suci, G.J., 321 Swaggart, J., 245 Sycara, K., 260–261 Tannenbaum, P.H., 321 Tartter, V.C., 324 Taylor, S. E., 266 Teasdale, J. D., 259 Thomas, F., 334 Thomas, L., 162 Thompson, E., 155 Tinbergen, N., 16, 34 Tomkins, S. S., 119, 133, 148
Trappl, R., 1, 2, 10, 103, 141, 235, 355, 358 Treves, A., 19, 34 Trivers, R. L., 28, 34, 255 Turkle, S., 176, 337, 365 Turner, S., 339 Turner, T. J., 146, 147 Varela, F., 155, 159 Velásquez, J. D., 117, 124, 148, 262 Ventura, R., 257, 278 Viola, W., 339 von Uexküll, J., 164 Walter, D. O., 162, 164 Walter, W. G., 220 Webster, A., 355 Wehrle, T., 124, 148 Weiskrantz, L., 13, 34 Weizenbaum, J., 353 Weizsäcker, C. F. von, 273 Wiggins, J. S., 309 Wilkins, D. E., 261 Wilson, S. W., 116, 148 Wittgenstein, L., 172 Wooldridge, M., 252, 260–261 Wright, I. P., 38, 44, 53, 73, 80, 103, 114, 262 Yancey, P., 228 Yu, S. T., 261 Zajonc, R. B., 258 Zukav, G., 183
Subject Index
3T
Architecture, 267, 270, 279. See also System architecture, layered trionic ACT*, 266 Action dual routes to, 3, 23, 25, 28 evasive, 41 (see also Action tendency) Action expression, 348 Action selection, 118–120, 140, 276–278. See also Activity selection, Behavior, selector of Action tendency, 254, 257, 259–260, 265, 267–269, 273, 277, 279–280. See also Behavioral inclination avoidance, 259 (see also Avoidance, active) Activation chemical (hormonal), 154 neural, 154 Activity selection, 115, 116, 123. See also Action selection Adaptation, 115, 257 Affection. See Emotions and feelings, affection Affective agent. See Agent, affective Affective art, 333–339, 349–350, 353–355 Affective artifact. See Artifact, affective Affective computing concerns, 326–330, 353–354, 364–365 ethics of, 327–330 impact, 326 Affective Reasoner, 8, 239, 241–246, 254, 262–264 Affective state. See State, emotional Affordance, 49, 56, 269–270 Agency, 127, 181, 258 Agent affective (see Agent, emotional) animated, 220 artificial, 115, 150–151, 157, 172–173, 183 (see also Artifact) autonomous, 4, 115, 157, 252, 255 (see also System, autonomous; Autonomy) believable, 6, 7, 238, 240–241, 246–249 (see also Believability) biological, 115, 157 cooperative, 182 embodied, 169, 173, 180–182, 260
emotional, 6, 120, 129, 189, 193 (see also Artifact, affective) functional, 9, 336 situated, 252, 256, 265, 269, 277, 280 synthetic (see Agent, artificial) Agent architecture, 252, 256, 260 deliberative, 260 (see also System, deliberative) design approaches, 8, 42–43, 55 hybrid, 260–261, 267 reactive, 76, 77, 260–261 (see also System, purely reactive) Agent societies, 120–121 Aibo (robot dog), 337 Alarm systems. See Control, alarm systems Alcohol, 27 AM (heuristic search system), 292 Amygdala, 3, 18, 19, 23, 24, 26, 33, 259 Animal cognition. See Cognition, animal Animat, 116, 261 Animism, 143 Anosognosia, 163 Anthropomorphism, 7, 141, 142, 146 Appraisal, 12, 208, 260, 266, 269–270 conceptual processing, 258–259 conscious, 221 (see also Consciousness) emotional, 223 schematic processing, 258–259 subconscious, 109 Appraisal criteria, 257–258, 266 Appraisal mechanism, 264 Appraisal process, 258–259, 265–266, 268–270, 279 Appraisal register, 259, 268 Appraisal theory, 252, 256–257, 266, 272 AR. See Affective Reasoner Architecture. See System architecture; Agent architecture Arousal, 9, 132, 154, 309, 312–314, 325 aesthetic, 85 sexual, 85 vocal encoding of, 324–325 Artifact, 152–153, 171, 184–185 affective, 192–194, 201, 204–207 (see also Agent, emotional) believable, 193, 201 (see also Believability)
Artificial intelligence, 1, 5, 7, 62, 63, 89, 90, 92, 95, 100, 116, 172 ‘‘alien,’’ 237–238 ‘‘human,’’ 237 embodied, 122, 123 situated, 260, 269 Artificial Intelligence (movie), 364 Artificial intelligence research paradigms, development of, 2 Artificial life, 252, 261, 271 Asperger’s syndrome, 165 Association, 158–159, 167. See also Cortex, association; Reward and punishment, association matrix of Associative network. See Spreading activation network Attention diversion of, 199 focus of, 166, 258 receiving, 341 Attention filters, 73, 74 Attention switching, 108 Autism, 223, 234 high-functioning, 165–166 Autonomic response, elicitation of, 16 Autonomy, 115, 170–171 ‘‘levels’’ of, 116 motivational, 116 Autopoiesis, 159 Avatar, 174 , 181, 220 Avoidance, active, 14. See also Action tendency, avoidance Awareness conscious, 222, 224–225 (see also Consciousness) of emotional content, 305 of self (see Self-awareness) Backward masking experiment, 33 Basal ganglia, 17, 24, 26, 27 Bayes Network, 8, 9, 311–315, 316–318, 321–323, 325, 326 BDI model. See Model, BDI Behavior altruistic, 255 coherence of, 190, 202 consistency of, 190, 192, 194–196, 200, 202–203, 206, 209–210, 278 consummatory, 116 emotional, 190, 197, 218–221, 312–316, 318–319, 320–326, 328 (see also Emotional state indication) expressive, 220, 278 (see also Emotional expression) generation of, 6, 189 goal-directed (see Goal-directed behavior)
interactive (see Interactive behavior) involuntary expressive, 197, 254 linguistic, 314, 320–323 predictability of, 190 reactive, 60, 116, 257 rewarded (see Reward) selector of, 17 (see also Action selection) self-interested, 255 sexual, 22 vocal, 314(see also Emotional expression, vocal) voluntary vs. involuntary, 197 Behavior adaptation, 225 Behavior node, 314, 318, 321, 322, 323, 325 Behavior transition, 345 Behavioral complexity, different ‘‘levels’’ of, 116 Behavioral dithering, avoidance of, 346 Behavioral inclination, 191–192, 196, 201, 205. See also Action tendency Behavioral response active, 14 flexibility of, 16 passive, 14 Behavioral schema, 274 Believability, 9, 189, 194, 202, 205, 238, 241, 246–249, 350, 352. See also Artifact, believable Bias. See Disposition; Goal formulation, bias of Bidirectionality, 21 Biomimetic approach, 115, 121 Blackboard harmonic, 299 melodic, 299 rhythmic, 299 root, 299 system, 294, 299–300 Blade Runner (movie), 364 Body language, 268, 275 Body perception, 138 Body state. See State, body Brain ‘‘triune,’’ 51 mammalian, 4 reptilian, 4 Brain damage, 1, 102 Brain design, types of, 20 Brain mechanism, 221 Brain research, 1 Brain subsystems, 138 Brain systems, 17 CAPTology. See Computer aided persuasive technology
Cartooniness vs. realism, 10, 350 Catastrophe theory, 159 Cholinergic pathways, 19 Chunking, 54 Circumplex, interpersonal, 309–310 Cogaff architecture, 82, 90, 92, 93, 98 Cognition, 29 animal, 151 social (see Social cognition) Cognition and Affect project, 50 Cognitive complexity, requirements for, 130 Cognitive ethology, 275 Cognitive processing, 15, 30, 154 Cognitive science, 5 Cognizer, situated, 8, 269–270 Communication nonverbal, 17, 182 (see also Emotional state indication, nonverbal) social (see Social communication) Communicative ability, 247–248 Communicative adaptability, 247 Computer aided persuasive technology, 327 Computer animation, 2 Computer game development, 1 Computer toys, 9 Concern satisfaction, 272 Conditioning, instrumental, 255 Confabulations, 26 Conscience, 28 Consciousness, 4, 7, 26, 27, 31, 35, 40, 44, 103, 152, 154, 183, 225–226, 231. See also Appraisal, conscious; Awareness, conscious; Control system, conscious; Emotions and consciousness; Self, conscious Consistency cross-individual, 191 in emotions, 190 in response tendency (see Response tendencies, consistency in) within-individual, 191 Constructed system. See Artifact Contingencies, 117 Continuity, sense of, 26 Control alarm systems, 56–58, 64, 77, 105, 106 (see also Metamanagement with alarms) direct, 91 global interrupt filter, 74 goal-driven, 51, 60 hypothesis-driven, 51 sharing of, 341–342 social (see Social control) Control states, complex, 84, 85
Control system conscious, 27 emotional ‘‘second-order,’’ 133, 140 hierarchical, 89 motivational, 137 Conversational dialog, 304 Conversational interface, 303–304, 305– 306, 316–318 comfortable, useful, 303 Coordination failure of emotional subsystem, 122 of distributed system, 170 Coping, 257, 259–260, 265, 267–269 Coping strategies, 210 emotion-oriented vs. problem-oriented, 200 Cortex, 155, 221, 224 association, 18, 29, 30, 33 frontal lobe, 106 language, 18 orbitofrontal, 3, 18, 19, 23, 26, 33 prefrontal, 25, 26 Cost-benefit analysis, 25 Creatures (See also Agent) complete autonomous, 123 simulated, 133 Creatures (computer game), 337 Credit assignment problem, 31, 32 Culture, 152, 255 and emotion, 292 Western, 152, 157, 161–162 Cybernetics, second-order, 159 Decision making, 27, 156, 158, 172 cooperative, 342 Declarative processing, 26 DECtalk (speech synthesizer), 324 Deep Blue, 233 Deliberative mechanisms, 61–64. See also System, deliberative Design space, 43 discontinuities in, 101 Design stance, 39, 40 Desires, unconscious, 27. See also Unconscious system Development. See Knowledge acquisition Dialog. See Conversational dialog Display rules, 196, 254–255, 260 Disposition, 70, 202, 294, 301 feedback facility of, 300–301 Drive, physiological, 253, 257 Drives, 5 Education, computer-aided, 180 Elegance, 7 Eliza effect, 353 Emergence, 95,111
Emotion definition of, 11, 12, 117, 118, 309 elicitation of, 195, 201 persistence of, 206 types of (see Emotion classes) understanding of, 240 Emotion and behavior. See Behavior, emotional; Control system Emotion and learning, 30, 291. See also Learning mechanisms Emotion attribution, 220 Emotion chip, 316 Emotion classes, 193, 262–263 Emotion code object, 347 angry, 347 (see also Emotions and feelings, anger) happy, 347 (see also Emotions and feelings, happiness) sad, 347 (see also Emotions and feelings, sadness) Emotion generation, 7, 221–224, 240, 252, 257 multilevel, 219, 223 Emotion intensity, 196 Emotion model, 237–238, 246, 249. See also Model, emotional valence Bayesian (see Bayes network) OCC, 193–194, 222, 239, 241–242, 262– 263 noncognitive, 316 Emotion node, 294, 295 Emotion process, 251 Emotion recognition, 216, 316 Emotion response tendencies. See Response tendencies Emotion sensing, physiological, 310. See also Heart rate; Galvanic skin response Emotion simulation, role of, 316 Emotion system, 119, 216, 230, 234, 255– 256 artificial, 129 human vs. artificial, 230, 232, 234 Emotion theory, 117, 192 functional, 264 perceptual-motor, 265 Emotional appearance, 7, 218, 220, 223 Emotional attitude, 71–73. See also Emotions and feelings calming, 306 commanding, 306 disapproval, 306 empathy, 306 solicitude, 306 warning, 306 Emotional behavior. See Behavior, emotional
Emotional competence, 264 Emotional experience, 7, 154, 224–225, 231, 233 Emotional expression. See also Linguistic expressive style; Locomotion style behavioral, 197, 344–349 (see also Behavior, expressive) communicative, 197, 199, 310, 320, 326 (see also Communication; Emotional state indication, nonverbal) effective, 344–346, 353 fatigue, 346 musical, 292, 294–296 prioritization of, 346 real-time, 345 somatic, 197 theatrical techniques of, 346 vocal, 323–325 Emotional filter function, 347 Emotional guidance, 276 Emotional information, 216 Emotional insect, 107 Emotional maturity, 234 Emotional reaction, to computers, 305 Emotional relationships, 335–338, 343– 344, 353–355 Emotional response, 309. See also Arousal; Emotional valence policy, 317, 318–320 tendency (see Response tendencies) triad, 251 Emotional sensitivity, 303 Emotional state. See State, emotional Emotional state indication, nonverbal, 308, 325. See also Communication, nonverbal; Behavior, emotional Emotional subsystems, improper synchronization of, 122 Emotional tone, depressed, 305 Emotional valence, 9, 132, 309, 312–316, 325. See also Model, emotional valence vocal encoding of, 324–325 Emotionality, 1 Emotionally intelligent system, 233–234 Emotion-behavior linkage, 192, 196, 201 Emotions animal, 153 architecture-based concepts of, 44 artificial, 120 basic, 119, 120 components of, 217 low-intensity, 71 maladaptive, 121, 122, 146 primary, 44, 58, 77, 85, 105, 109, 110, 119, 120 role of, 120, 127, 128, 151, 153–154, 158, 253–255
secondary, 44, 58, 80, 81, 85, 105, 110 socially important human (see Social emotions) tertiary, 44, 58, 86, 106, 109, 110 unaware, 32 Emotions and consciousness, 363–364. See also Consciousness Emotions and feelings. See also Feeling affection, 193, 341 (see also Interactive behavior, affection) anger, 120, 130, 136, 197, 200, 228, 295, 243, 344 (see also Emotion code object, angry) anxiety, 120, 130 confusion, 306 disgust, 344 embarrassment, 306 fear, 12, 14, 15, 33, 120, 130, 136, 193– 194, 346, 347 (see also Response, fear) frustration, 12, 213, 215–216 gloating, 243, 245 happiness, 12, 136, 295, 344, 346 (see also Emotion code object, happy) irritation, 306 joy, 306 love, 228, 344 meditativeness, 295 pain, 15, 70–71, 228–229, 246–247 pleasure, 70, 71, 246, 344 (see also Touch, pleasantness of) pride, 306, 344 relief, 12 sadness, 136, 295, 306, 346 (see also Emotion code object, sad) shame, 243–246 uncertainty, 306 Emotions research, impact, 363 Empathy, 161, 184 Endorphin, 138 Engineering design, 22 E-Nodes. See Emotion node Entertainment, 142, 36, 102 interactive, 338 (see also Interactive experience, design of) Environment, 127, 135, 145, 154, 159, 164, 166, 170–171, 173, 182, 207–208, 210 affordances of the, 194 dynamism of, 135 external and internal, 115, 140 rational reaction to, 72 Environmental needs, 35 Environments, classification of, 129–131 Escape, 14 Ethology, 116 cognitive (see Cognitive ethology) Evaluation mechanisms, 19, 68–70, 106 Evaluation system, 11
Evaluations, conflicting, 70 Evaluators, 70 Event interesting, 301 reinforcing (see Reinforcing events) Evolution, 3, 20, 22, 35, 47, 52, 89, 104, 119, 139 Evolution fitness, 3, 11, 17, 21–23 Expert system, 172 Explanations, reasonable, 26 Extinction, 14 Facial expression, 18, 192, 196, 199, 220, 246, 268, 275, 278, 314 Fast/slow dichotomy, 64, 65 Feedback, 171–173. See also Disposition, feedback facility of emotional, 216 Feeling, 155, 222–225, 231–232. See also Emotions and feelings aggressiveness, 344 boredom, 136, 344 craving, 344, 346 dislike, 344 excitement, 344 gratefulness, 344 guilt, 344 jealousy, 344, 347 laziness, 344 loneliness, 344 punished, 344, 348–349 (see also Punishment) rewarded, 344 (see also Reward) satisfaction, 344 shame, 344 timidity, 344 warmth, 344 of users, 6 Final Fantasy (movie), 2 Fixed action pattern (FAP), 275–276, 279– 280 Flexibility, 297–298 Frame-based system. See also K-line frame Frankenstein, 354 Functional differentiation, 90 Functional neutrality, 31 Fungus Eater, New, 128 Galvanic skin response, 223, 310 Gedankenexperiment, 7 Genes, 17, 145, 146 Geneva Emotion Research Group, 324 Genotype-phenotype relation, 132 Goal achievement, 264 Goal conflicts, resolution of, 5 Goal-directed behavior, 181
Goal formulation bias of, 291, 292 emotion and, 292 Goal hierarchy, 194 Goal importance, 264 Goal mechanisms, 59, 60 Goals, 208, 291 survival-related, 120 Gridland, 5, 133, 135 GSR. See Galvanic skin response Habit, 16 HAL (computer in 2001: A Space Odyssey), 218, 233, 354 HBB. See Blackboard, harmonic Heart rate, 310 Homeostasis, 104, 133. See also Hysteresis Homeostatic need states, 15 Homunculus, 169, 364 Hope, 239–240 Hormones, 133, 134, 137 Human architecture, 35. See also Information-processing architecture, human Humans, properties of, 127 Humor, 241–242 Hunger. See State, hunger as internal need Hypothetical future, 80 Hysteresis, 24. See also Homeostasis Indexicals, 256 Individuality, 256 Information-processing architecture, human, 35, 39, 45, 46 Information-processing perspective, 4 Instrumental response. See Response, instrumental Intention, 181, 191, 254. See also Model, BDI Interaction computer, 305 (see also Entertainment, interactive) human-agent, 173, 182, 185 social (see Social interaction) Interactive behavior affection, 343–344 (see also Emotions and feelings, affection) nurturing, 344 play, 344 training, 344 Interactive experience, design of, 349. See also Interaction, computer Interrupt filter, global. See Control, global interrupt filter Intuition, 152, 157 Invisible Person, 272–273, 276–278
K-line frame, 298–299 memory, 8, 298–299 theory, 293, 294, 296–297 K-lines, compound, 298 Knowledge musical, 296 Knowledge acquisition, 296 Knowledge representation, 293 Knowledge source, 299–300 Language generation, 248 Language system, 25, 26, 28, 30, 31, 33, 96, 97 Layer deliberative, 44, 53, 61, 104 (see also System, deliberative) narrative intelligence, 347 reactive, 44, 53, 58, 104 Layered architectures. See System architecture, layered Layers, concurrently active versus pipelined, 90 Learning. See also Emotion and Learning by trial and error, 156 inductive machine, 292 instrumental, 17 by stimulus-reinforcer association, 17, 23 Learning mechanisms, 69. See also Emotion and Learning; Knowledge acquisition; Motive comparators, learned; Reinforcer, learned; STAR; Stimulus reinforcement, learning of Lifeworld, 207, 251, 256 Linguistic expressive style, 321–323 Locomotion, 20 Locomotion style, 345 Man-machine interface, 151 Meaning, 152, 165–167, 184, 269, 321 Memories cognitive evaluation of, 19 recall of (see Recall, of memories) storage of, 19 Memory emotional, 131 human, 290 (see also Recall, of personalized habits) K-line (see K-line memory) long-term associative, 62, 64 short-term, 3, 25, 26, 62, 64 Memory process, 154 Mental disorders, 122 Mental ontologies, 40 Metamanagement, 32, 44, 53, 58, 74, 88, 98, 104, 105 with alarms, 81–84(see also Control, alarm systems)
Microworld, 135 Mimicry, 218–219 Mind everyday concepts of, 35 society of (see Society of Mind theory) Mind ecology, 4, 36, 55 Mind fluidity, 293 Mind-body interaction, 7, 223, 227–228 Model BDI, 267 component-based, 5, 125–127 contention-scheduling, 93 emotion (see Emotion model) emotional valence, 258, 314–316 functional, 5, 127, 128 information flow, 4 Markovian, 314 personality, 237–238, 246 phenomenon-based/black-box, 4, 124, 131 process/design-based, 5, 124, 125, 128, 131 triple-layer, 52–54(see also System architecture, layered) triple-tower, 50, 51, 53, 54 user, 241 Modular organization, 47 Mood expression, 345 Moods, 71–73, 195 Moral, 255 Motivation, 5, 17, 132, 154, 157–158, 255. See also Control system, motivational Motivation emotion amplified, 133 incentive, 24 persistent and continuing, 19 Motivational states. See State, motivational Motivational submechanisms, 67 Motive generators, 67 insistence of, 74 intensity of, 74 Motive comparators, learned, 69 Motives, competing, 93 Motives, intrinsic, 92 MUD. See Multi-user domain Multilevel processing, 103 Multi-user domain, 175–180 Multi-user domain robot, 179–180 Multi-user dungeon. See Multi-user domain Multi-user systems, 357–360. See also Multi-user domain; Multi-user virtual environment Multi-user virtual environment, 175, 180. See also Virtual world
Music, 8, 49. See also Knowledge, musical; Emotional expression, musical Music composition process, 8, 293 Musical artifact, 295 Musical component frame, 299 MUVE. See Multi-user virtual environment Naive realism, 167 Narrative intelligence. See Layer, narrative intelligence Natural language interaction, 180 Natural language interface, 304, 308. See also Text-to-speech systems Natural selection, 21 Nervous system, 290 Net reward. See Reward, net Neural net, 31, 45, 63 Neurological impairment, 223 Neurophysiology, 154 Neuroscience, 224 Neurotransmitter, 153 Noradrenergic pathways, 19 Norms. See Social norms OCC Model. See Emotion model, OCC Office Plant #1 (robot plant), 339 Ontologies, self-bootstrapped, 98, 99 Ontology, 46, 144 Ontology of mind, architecture-based, 87, 101 Organisms hybrid reactive and deliberative, 78–81 (see also System, deliberative) purely reactive, 76, 77 (see also System, purely reactive) PARETO, 269 Partial solutions, 293 Pattern of activation/deactivation, 224. See also Activation Perception, altered, 138 Personae, switching, 74, 75 Personal viewpoint. See Point of view, personal Personality, 6, 75, 196, 201–202, 204, 206– 208, 309 biological substrates of, 205, 207 context-dependency of, 209–210 generation, 240 Myers-Briggs typology of, 310 representation of, 309–310, 314–315 theory of, 192, 203 Personality dimension dominance, 9, 310, 312–314, 325 friendliness, 9, 310, 312–314, 326
Personality dimensions, 203, 205–206, 209–210, 310 Personality model. See Model, personality Personality traits, 193, 200, 202–203, 207 clusters of, 203 longer-term, 9 Persuasion, 327–328 Petit Mal (robot), 339 Phobics, 32, 33 Physiology, 85, 197 Pinocchio, 354 Plan formation, 61 Planning, 25, 62, 63 Plans, long-term, 17, 27 Point of view personal, 161, 165–166, 172–173, 182– 183 first-person, 5, 350 third-person, 350 Posture, 314 control mechanisms of, 46 Predictability, 202–203 Prediction, 154 Pregnant woman, 27 Primitives, choice of, 131, 136 Privacy, 329–330 Problem solving, 63. See also Partial solutions Psychology, comparative, 102 Psychology, developmental, 102 Punishment, 255. See also Reward and punishment Qualia, 226 Quantum mechanisms, 92 R2D2, 354 RAP system, 267–269, 279 Rationality, 1, 26. See also Environment, rational reaction to; Reasoning, rational; Response tendencies, rational bounded, 256 Realism. See Cartooniness vs. realism; Believability; Naive realism Reason, 26 Reasoning, 160, 167, 224 emotional, 157–158, 160–161, 184 multistrategy, 293 rational, 157 rule-based, 222, 224 Recall of memories, 19 of personalized habits, 297, 298 Reequilibration, 119 Reflex, 191, 253, 257 Regulatory focus, 203 prevention, 204
promotion, 204 Reinforcement, 69, 156, 205. See also Stimulus reinforcement Reinforcement contigency, 14 Reinforcer, 11 learned, 3, 13 instrumental, 3, 13 intensity of, 14 negative, 14 positive, 14 primary, 3, 14, 15, 29, 30, 33 secondary, 3, 14 , 15 unlearned, 3, 13 Reinforcing events, 15 Relational toys, 365 Reproductive success, 17 Response, behavioral. See Behavioral response Response bodily, 223 (see also Mind-body interaction; State, body) emotional, 158–159, 192, 253 fear, 221–222 instrumental, 16 internal, 191 selection of, 191 Response categories, 263 Response tendencies, 196, 201, 219. See also Action tendency consistency in, 196 constraints on, 197 coping, 199 (see also Coping strategies) expressive, 197 (see also Emotional expression) information processing, 199 rational, 200 types of, 6, 197 variability in, 196 Reward, 21, 255. See also Reward and punishment deferred, 25 net, 25 Reward and punishment, 3, 11–13, 16, 17, 20–23, 107, 204, 210. See also Feeling, punished; Feeling, rewarded association matrix of, 348–349 Robot, 151 Robot arm, 22 RoboWoggles, 339 Satiety mechanisms, 22, 119 Self, 160, 162, 165–169, 171, 173, 183– 185, 266 concept of, 152, 160, 165, 171 conscious, 251 (see also Consciousness) experience of, 162–163, 225 a sense of, 5
10 The Wolfgang System: A Role of ''Emotions'' to Bias Learning and Problem Solving when Learning to Compose Music
Douglas Riecken
For at best, the very aim of syntax oriented theories is misdirected; they aspire to describe the things that minds produce—without attempting to describe how they’re produced. —Marvin Minsky
10.1 An Emotional ‘‘Call to Arms’’
What are emotions? Words we use to discuss emotions are just that! They are words in a language such as English, German, or French used to characterize the observed behavior of a ''black box'' like a human, dog, or computer. Words dealing with emotions, such as love, hate, happy, and sad, do not tell us what is really occurring in the mind. What is required are good theories of emotions; perhaps that in turn requires good theories of mind. In the end, such theories would help us better understand what emotions are. There is a range of research areas on emotions that are useful to different researchers. For my work, I have been exploring theories of mind with a focus on memory, learning, and emergent ability. In doing so, my work has been deeply focused on the following:
1. The role of instincts and ''emotions''
2. Commonsense reasoning
3. Multistrategy reasoning
4. Representation
In essence, my studies explore cognitive architecture. Before entering into discussion of the Wolfgang system, it is important to identify several key questions that continue to motivate my thinking.
Question 1: Where Do Goals Come From?
Living systems (biological or silicon) have needs. If such needs are not addressed by the system or for the system, then the system will ‘‘die.’’ When such needs occur, the system will perform some
action that will address the need. Thus the system must be able to ''formulate'' a goal that it hopes will be satisfied by a ''solution'' in order to address that need. Are the first goals formulated by a living system's primitive instincts for survival? What about goals that are cognitively complex—like composing a musical composition—or even more complex—like learning to become a composer? It is interesting to consider what goal and personal experiences helped Beethoven to ''plan'' and decide that his Symphony no. 5 in C Minor would begin with and be based on a motif of four notes. How does a composer select the first two or three notes of a composition? What type of goal drives this?
Question 2: Do Goals and Solutions that Satisfy Goals Become Biased Based on ‘‘Learning’’ Experiences?
Some living systems ‘‘learn.’’ One of my working assumptions is that the nervous system is a fantastic multimodal encoding and pattern-matching machine. In humans, there are enormous relationships of many simple little pieces of ‘‘information’’ about a world and its culture that a human memory system is constantly encoding and reformulating. Memory is a fantastic representation of enormous biased partial orderings of ‘‘information’’ that reflect an individual’s experiences. What happens when a baby or a child uses a perfectly good (legal) solution for a goal, but as the solution is applied, there are certain ‘‘properties’’ associated with the solution that have a ‘‘strong’’ positive or negative impact on the baby’s/child’s instincts or cognitive perception? How might such an experience impact the learning and the reuse of such a goal or solution?
Question 3: When in a Particular Mental State, What Is the ‘‘State of the Nervous System?’’ What Is the Mind Doing?
Experiences affect humans in complex ways. One might consider that reflection on a previous experience could affect a human ‘‘toward’’ a particular mental state. Is the state of the nervous system in some innate instinctive mode? Are ‘‘emotional’’ states, such as when I am happy or sad or angry, really words from a language attempting to map some abstract convolution of my mind’s ‘‘life experience’’ onto specific innate instinctive states of the nervous system?
It is from a continued series of questions like these that my work has come to be based on one view: goals plus emotions enable learning. Without goal formulation, and without instinctive nervous system elements that bias goal formulation, learning would not be possible. This statement covers the full range from initial learning through cognitive development and beyond. As we learn, we continue to encode, and what we encode includes pieces of information that bias our memories to remember and motivate specific behaviors and biases of the learned information and knowledge for future experiences and goals. The Wolfgang system (Riecken 1989), begun in 1987, was the initial prototype software system of a model that I have continued to work on. Later work, from 1991 to 1996, continued in a system called the M system (Riecken 1994), and the M system work continues to evolve today. The focus of this work is a theory of mind, with attention to an architecture addressing multistrategy reasoning, emotions, and representation (Riecken 2000). Two principal influences on this work relating to architecture continue to be recent work by Marvin Minsky (2001) and Aaron Sloman (chapter 3 of this volume). Several key influences on my work specific to emotions continue to be Manfred Clynes (1988), Paolo Petta (chapter 9 of this volume), Edmund Rolls (chapter 2 of this volume), Andrew Ortony and colleagues (Ortony, Clore, and Collins 1988; chapter 6 of this volume), and Rosalind Picard (1997; chapter 7 of this volume). The key focus of the Wolfgang work was to understand a role that emotions might perform relating to goal formulation and learning. A key point here is that Wolfgang is a system that continues to learn! It learns to become a composer; it is not a system that learns one thing and then stops learning. While the work on Wolfgang resulted in a system that learned to compose, the Wolfgang architecture has since evolved in my later work on the M system. It is in this work that efforts by Minsky and Sloman have helped to influence a better design for exploring the role of ''emotions/instincts'' relating to reflection and reaction in a cognitive architecture.
10.2 A Problem with Learning to Compose
In this chapter, I reflect on the design motivations for a system called Wolfgang that composes tonal monodies. The investigated problem concerns the definition of the evaluation criteria guiding
Wolfgang’s compositional processing and learning. The thesis of this work is derived from the hypothesis that a system’s innate sense of (musical) sound strongly influences the development of its perception, as well as composing hab its. As the system develops its musical skills, it also develops a subjective use of a musical language biased by its sense of musical sounds and its adaptation to the cultural musical grammar of its environment. In 1987, I began a study in machine learning with a rather simple software system called Wolfgang. Wolfgang was a research project, which focused on the development of compositional performance. This initial system applied Michalski’s STAR methodology (1983) for inductive machine learning in a knowledge-based system. In this first generation of Wolfgang, the evaluation criteria for guiding the composition process was derived from an explicit grammar of Western music; Wolfgang learned to compose simple compositions based on learning simple rules of syntax. In essence, the Wolfgang system was ‘‘programmed’’ to learn the syntax rules of a cultural grammar. It did not seem clear that this form of learning captured the ‘‘true spirit’’ of learning to compose music. It would appear that when an ‘‘individual’’ first hears or plays a new (musical) idea, some type of physical/‘‘emotional’’ reaction should occur. If Wolfgang was going to learn, it must have fundamental biases that shape its behavior with each learning experience. In essence, Wolfgang should be ‘‘programmed’’ to formulate biases based on the emoting properties of an ‘‘auditory/musical’’ experience, not just on the rules of syntax. The motivation for this theory has similarities to learning in Lenat’s AM system (1982), insofar as the AM system was designed to be ‘‘curious’’ via a heuristic search of number theory concepts. Thus a second design of Wolfgang began. The second generation of Wolfgang introduces ‘‘emotional’’ criteria to constrain goal formulation both when Wolfgang learns and composes (Riecken 1989, 1992). Wolfgang was designed to guide its composition process biased by a cultural grammar of music, as well as by a disposition for crafting musical phrases such that they express a specific emotional characteristic. Specifically, the evaluation criteria guiding Wolfgang’s composing process consists of: a cultural grammar reflective of Wolfgang’s musical development, and an ability to realize the emotive potential of musical elements represented in the respective cultural grammar.
An abstract description of Wolfgang's composition process is as follows. Based on the grammatical context of a given compositional decision, Wolfgang defines a set of legal solutions from its domain knowledge of music, which satisfy the cultural grammar; then Wolfgang selects from this set of legal solutions the solution that best satisfies Wolfgang's current disposition in order to endow the current musical phrase with a specific emotive potential. A key enabler in the design and implementation of Wolfgang was an approach to knowledge representation (KR) based on Minsky's K-line theory (1980) and Trans-Frames, as presented in his society of mind (SOM) theory (1985). Wolfgang's architecture required that domain knowledge be represented in ''micro'' element structures so that they could be dynamically used to formulate and reformulate various partial solutions. The idea is not to represent whole ideas or facts, but to let the learning encode parts of ideas and facts as ''simple'' structures that can be chained together to form various partial orderings of knowledge. This is a core feature in SOM theory, where multiple agencies of reasoning are engaged in multistrategy reasoning. An important advantage in representing domain knowledge as dynamic orderings (in linked data structures) is that the diverse compositional levels of linked structures can be reused and adapted to represent changes in the composing behavior of Wolfgang as it continues to learn and evolve as a composer. This design approach was useful when compared to the classic problem of the plastic expert system. Consider the composer who lived for the first 25 years of his or her life in Brooklyn, New York, and then for the next 12 years in South America. Clearly, the ''fluidity of the human mind'' enables a composer's style to evolve due to new influences and experiences. Perhaps the diverse relationships of many partially ordered elements of knowledge acquired over time, in a computer composing ontology, are essential for fluid learning and development. The Wolfgang architecture, based on an interpretation of K-line theory, enabled the implementation of fundamental ''emoting'' biases representing Wolfgang's initial sensation of sound and later its learning and perception of ''musical sounds'' and musical knowledge along with their respective emoting potentials. It was necessary that the emoting potentials be represented in a dynamic K-line network so that they would assert Wolfgang's ''emotional'' disposition toward the current composing task and/or learning experience.
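To make the two-stage decision just described easier to picture, the following Python sketch shows one way such a step could be organized: first collect the solutions that the cultural grammar permits, then rank them against the current disposition. It is only an illustration; the data structures and helper names here (MusicalArtifact, grammar_allows, the weighted-fit scoring) are assumptions made for this sketch, not Wolfgang's actual code.

from dataclasses import dataclass
from typing import Callable, Dict, List

# The four emotional characteristics named in this chapter.
EMOTIONS = ("happiness", "sadness", "anger", "meditativeness")

@dataclass
class MusicalArtifact:
    name: str                            # e.g., "major third", "dominant seventh"
    emotive_potential: Dict[str, float]  # inherited from the artifact's E-node

def legal_solutions(context: List[MusicalArtifact],
                    candidates: List[MusicalArtifact],
                    grammar_allows: Callable[[List[MusicalArtifact], MusicalArtifact], bool]
                    ) -> List[MusicalArtifact]:
    # Stage 1: keep only the candidates that the cultural grammar permits
    # in the current grammatical context.
    return [c for c in candidates if grammar_allows(context, c)]

def choose_by_disposition(legal: List[MusicalArtifact],
                          disposition: Dict[str, float]) -> MusicalArtifact:
    # Stage 2: pick the legal solution whose emotive potential best matches
    # the current disposition (assumes at least one legal solution exists).
    def fit(artifact: MusicalArtifact) -> float:
        return sum(disposition.get(e, 0.0) * artifact.emotive_potential.get(e, 0.0)
                   for e in EMOTIONS)
    return max(legal, key=fit)

A weighted match between the disposition and an artifact's emotive potential is only one plausible way to rank the grammar-legal options; the chapter does not specify the actual scoring used.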
10.3 General Discussion of Wolfgang
The second generation of Wolfgang introduces ‘‘emotional’’ criteria to guide both the composing process and learning to compose. Wolfgang allows a user to request it to compose a sixty-four-measure monody that realizes a specific emotional characteristic. The set of emotional characteristics includes happiness, sadness, anger, and meditativeness. Wolfgang’s disposition during a composing session will guide goal formulation and decision making so as to produce a composition with the user-requested emoting quality. It is important to note that during a composing session with a user, Wolfgang will change its disposition in order to create a composition that reflects some ‘‘maturity.’’ That is to say, Wolfgang’s disposition does not constrain the composing process to only one type of emoting quality for the resulting composition. While the overall composition is realized to communicate a specific emoting quality, Wolfgang will integrate other emoting qualities to enhance the quality of the composition.
In this chapter, the term disposition is influenced by Minsky’s use of this term in his K-lines paper (1980). Minsky states, ‘‘I use ‘disposition’ to mean ‘a momentary range of possible behaviors’; technically it is the shorter term component of the state. In a computer program, a disposition might depend upon which items are currently active in a data base, e.g., as in Doyle’s (Doyle 1979) flagging of items that are ‘in’ and ‘out’ in regard to making decisions’’ (p. 131). Although the design of Wolfgang makes use of the concept of dispositions, the current work on Wolfgang is still an evolving attempt to integrate some of Minsky’s ideas; it is my view that considerable work is still required.
In the remainder of this chapter, I will provide a general discussion of Wolfgang’s architecture. The Wolfgang system architecture is composed of the following five fundamental system components: (1) a corpus of E-nodes, (2) a K-line network, (3) a set of blackboard systems, (4) a disposition feedback facility, and (5) logfiles.
10.4 E-Nodes
E-nodes (emotion-nodes) are the most fundamental system component. Informally, an E-node is a collection of information defining the emoting potential of a given primitive musical artifact. The
term primitive musical artifact refers to such elements as vertical 2-note harmonic structures (e.g., a major third, a dominant seventh, etc.), horizontal 2-note harmonic structures (e.g., harmonic progressions consisting of paired vertical harmonic structures), amplitude, tempo, and simple rhythmic elements. The use of the term emotive refers to a primitive musical artifact’s potential to express emotion or sentiment in a demonstrative manner. The information provided by each E-node allows Wolfgang to interpret the emoting potential of each primitive musical artifact. While the musical artifacts mentioned are actually quite complex structures (in terms of music theory), for the design of Wolfgang they are viewed as simple elements, and in that sense are taken for granted as elements of the musical imagination.
A critical design issue concerning E-nodes must now be reviewed. E-nodes are not instances of learned musical knowledge. An E-node is simply a qualification and quantification of the emotive potential of a given primitive musical artifact. The E-node functions to represent and simulate in the computer model Wolfgang’s sensation of each respective primitive musical artifact. In order for Wolfgang to be knowledgeable and perceive a given primitive musical artifact, a representation of the respective artifact must be encoded into its (system) memory, representing a learned experience. Once this representation has been encoded, it inherits its emoting potential from the respective E-node. Thus E-nodes serve as ‘‘innate’’ system properties. It is important to note that this type of learning is restricted to only primitive musical artifacts. In time, Wolfgang will begin to learn about compound musical artifacts. The term compound musical artifact refers to a musical artifact composed directly or indirectly (or both) of two or more primitive musical artifacts. The emoting potential for compound musical artifacts is computationally derived from the emoting potentials of the ‘‘simpler’’ musical artifacts that make up the respective compound musical artifact.
Now that we have an abstract statement of what an E-node is, let us review it in more detail. An E-node is defined as an identifier and a set of four numeric values. Each numeric value defines the emotive potential of a particular emotion, as realized by the primitive musical artifact represented by an E-node. The four distinct emotions represented by the four numeric values assigned to each E-node are happiness, sadness, anger, and meditativeness. The design of Wolfgang provides each primitive musical artifact with its
range of emoting potential over the defined set of emotions. (The number of individual emotions supported is currently restricted to four so as to minimize complexity.) The significance of E-nodes lies in their representation of emoting potentials for the four emotions over the complete set of primitive musical artifacts. These emotive representations perform a critical role during Wolfgang’s development; they serve as operands by which emoting values are computed and assigned to the learning of a new primitive musical artifact, or to a compound musical artifact. The term development refers to the acquisition of musical knowledge by Wolfgang to improve its performance as a composer. Thus E-nodes provide emotive primitives (initial properties) used to computationally derive the emoting potential for the combinatorial development of Wolfgang’s musical knowledge. The decision to specify the emotive potential of all musical artifacts is motivated by our task of developing evaluation criteria for guiding the composing process. As we have learned thus far, one of the metrics that guides Wolfgang’s composition process consists of composing musical phrases that satisfy some current compositional disposition. This means that Wolfgang will attempt to compose a musical phrase provoking a specific emotion that matches its current disposition; at any moment, Wolfgang’s disposition is at some level of happiness, sadness, anger, or meditativeness. Therefore the selection of musical artifacts is biased toward those musical artifacts that provide the highest emotional potential matching Wolfgang’s current disposition; consequently, the system design of Wolfgang requires a method to specify the emoting potential of all musical artifacts.
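To make the E-node description concrete, the following sketch pairs an identifier with four numeric values and derives a compound artifact’s potential from its parts. The averaging rule is an assumption for illustration only; the chapter says just that the compound value is ‘‘computationally derived.’’

from dataclasses import dataclass
from typing import Dict, List

EMOTIONS = ("happiness", "sadness", "anger", "meditativeness")

@dataclass
class ENode:
    # "Innate" emoting potential of one primitive musical artifact,
    # e.g., a major third, a tempo value, a simple rhythmic figure.
    artifact_id: str
    potential: Dict[str, float]   # one value per emotion in EMOTIONS

def learn_primitive(enode: ENode) -> Dict[str, float]:
    # A learned primitive artifact inherits its emoting potential
    # from the corresponding E-node.
    return dict(enode.potential)

def compound_potential(parts: List[Dict[str, float]]) -> Dict[str, float]:
    # Emoting potential of a compound artifact, derived from its parts.
    # A simple average is assumed here purely for illustration.
    return {e: sum(p[e] for p in parts) / len(parts) for e in EMOTIONS}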
10.5 Memory Based on the K-line Theory
The storage, access, and management of Wolfgang’s musical domain knowledge is supported by a network of interconnected musical artifacts; these artifacts reflect Wolfgang’s musical experiences and development. The principal influence in the design of Wolfgang’s memory is Minsky’s K-line theory of memory (Minsky 1980, 1985): ‘‘When you ‘get an idea,’ or ‘solve a problem,’ or have a ‘memorable experience,’ you create what we shall call a K-line. This K-line gets connected to those ‘‘mental agencies’’ that were actively involved in the memorable mental event. When that K-line is later ‘activated,’ it reactivates some of those mental agencies,
creating a ‘partial mental state’ resembling the original’’ (Minsky 1980, p. 118). The K-line memory theory describes the behavior of a system by the dynamic relationships of linked elements (called K-lines). The complexity of a K-line can range from a representation defined by a single instance of information to extremely complex representations of behavior defined by a K-line composed of a large set of linked K-lines (a.k.a. K-trees). Minsky’s theory explains that sets of K-lines form societies of resources providing specific mental functions, and that these societies can dynamically form multiple connections (K-lines) with each other to create new K-lines, thus increasing the body of knowledge. It is the activation of sets of K-lines that brings about partial mental states. A total mental state is composed of several partial mental states active at a single moment in time. Wolfgang’s development (a.k.a. learning) is realized by constructing new K-line connections within or between partial mental states.
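Read computationally, the passage above suggests a very small data model: a K-line records which resources were active during a memorable event, reactivating it reinstates a partial mental state, and the total state is the union of the partial states currently active. The sketch below is one possible reading with hypothetical names; it is not Minsky’s or Riecken’s code.

from dataclasses import dataclass, field
from typing import List, Set

@dataclass
class KLine:
    # Records the "agencies" (resources) active during a memorable event.
    name: str
    agencies: Set[str] = field(default_factory=set)

def partial_mental_state(kline: KLine) -> Set[str]:
    # Activating a K-line reactivates the agencies it recorded.
    return set(kline.agencies)

def total_mental_state(active_klines: List[KLine]) -> Set[str]:
    # A total mental state is the union of the partial states active now.
    state: Set[str] = set()
    for k in active_klines:
        state |= partial_mental_state(k)
    return state

def learn(event_name: str, currently_active: Set[str]) -> KLine:
    # Learning ("development") constructs a new K-line connecting the
    # agencies involved in the memorable event.
    return KLine(event_name, set(currently_active))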
10.6 Advantages from K-line Memory Features
Wolfgang’s memory is its fundamental system component. In the second generation of Wolfgang, great care was taken to design a system that demonstrated improvement in its ability to ‘‘develop.’’ If Wolfgang is to develop composing skills, these skills should demonstrate some characteristic of personalized style. The design of Wolfgang’s memory attempts to capture two powerful qualities found in the K-line theory: flexibility and recall of personalized habits.
The term flexibility refers to Wolfgang’s ability to store and use diverse knowledge during its development as a composing system. The issue of flexibility provided critical motivation for the evolution to the second generation of Wolfgang. The first-generation system consistently reached impassable stages of development. This problem stemmed from a limitation common to knowledge-based systems: such systems have historically performed quite well over restricted sets of problems, except that the acquisition and representation of a large body of knowledge quickly promotes conflicting assertions of facts and rules. The K-line model avoids conflicting assertions by allowing multiple patterns of diverse knowledge to represent different memorable experiences; musical experiences that provide good
results during a composing session are therefore encoded as individual K-lines in Wolfgang’s memory. This representation serves to minimize the number of rules that ‘‘govern’’ methods and facts, and thus to avoid the attendant problems of complexity. Wolfgang’s K-lines enable a range of diverse and reusable topologies of K-lines to be formulated and reformulated, both for building new K-lines that represent newly learned knowledge and abstractions, and for changing context by changing the activation of the K-line network that defines Wolfgang’s memory of musical methods and facts.
The concept recall of personalized habits refers to Wolfgang’s ability to develop and apply its composing skills in a subjective manner reflecting its musical learning experiences. This feature of system performance is a direct result of the explicit method by which musical knowledge is encoded in Wolfgang’s memory as individual K-line instances of successful musical methods and facts. Thus as Wolfgang develops, it also develops a distinct style of composing; this results from the frequent combinations of collaborating K-lines that, over time, are referenced as individual compound K-lines. By attempting to model features of Minsky’s K-line memory, Wolfgang develops personal composing habits; these habits become the system’s compositional signature.
10.7 Design and Implementation of K-line Memory
Wolfgang’s memory is implemented as a frame-based spreading-activation network; semantic relationships within the network form individual K-lines. Each K-line is represented by a discrete set of network links. These network links interconnect supporting K-lines to represent musical knowledge. All K-lines are implemented as frame structures (Minsky 1975), known as K-line frames (KF). A KF is a structure (a.k.a. schema) representing specific knowledge of some object or concept; the structure associates features that are descriptive of a given object or concept. These features are represented as attributes called slots. Slot values can be either some physical value (such as a symbol or numerical value) or some process/function to be invoked to perform some task. In Wolfgang, the slot values in each KF identify the respective KF and its supporting structures and characteristics. An important set of slot values in each KF is a set of four numeric weights. Each numeric weight represents an emotive potential for the respective K-line; each weight respectively represents one of the four emotive
types used in the current design of Wolfgang (happiness, sadness, anger, and meditativeness). These four numeric weights serve to support computations used during composing decisions to select solutions that provide specific emotive qualities. As given solutions provide repeated successes, Wolfgang will attenuate these values and formulate ‘‘abstractions’’ as new K-lines referencing the solution(s) so as to reflect their utility for a given context. Over time, this is how Wolfgang’s musical composing signature emerges. The overall K-line network is partitioned into two functional parts: method facts (e.g., methods of motivic development, methods of harmonization, etc.) and facts (specific instances of intervals, harmonic structures, etc.). Within a K-line network, K-lines are partitioned into distinct classes of musical artifacts, such as sets of melodic intervals, rhythmic patterns, harmonic progressions, methods for motivic development, and so on. This is done to provide effective management and efficient access of system memory; the management and access of K-line classes is supported via blackboard technologies (see Nii 1989 for discussion on blackboard technologies). Finally, each K-line class is implemented as a frame, called a musical-component-frame (MCF). Each MCF may contain an arbitrary number of slots; each slot in an MCF references a distinct KF contained within a K-line class. The MCFs are implemented as lists.
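The frame-based encoding can be pictured with a small sketch: each K-line frame (KF) carries identifying slots plus four emotive weights, and a musical-component-frame (MCF) is a list of the KFs in one K-line class. The field names and the update rule in reinforce are assumptions for illustration, not the chapter’s actual slot layout.

from dataclasses import dataclass, field
from typing import Dict, List

EMOTIONS = ("happiness", "sadness", "anger", "meditativeness")

@dataclass
class KLineFrame:                     # "KF"
    kf_id: str
    supports: List[str] = field(default_factory=list)      # supporting KFs
    weights: Dict[str, float] = field(                      # emotive potential
        default_factory=lambda: {e: 0.0 for e in EMOTIONS})

@dataclass
class MusicalComponentFrame:          # "MCF": one K-line class, e.g. melodic intervals
    class_name: str
    members: List[KLineFrame] = field(default_factory=list)

def reinforce(kf: KLineFrame, emotion: str, amount: float = 0.1) -> None:
    # Repeated successful use in a given emotional context adjusts the
    # frame's weight for that emotion; the additive update is an assumption.
    kf.weights[emotion] += amount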
10.8 Blackboards as Knowledge Negotiators
Wolfgang’s blackboard systems serve as knowledge negotiators. They manage the interactions of K-lines from different K-line classes as they collaborate to compose a musical work. The blackboard technologies are implemented in a distributed hierarchical model. The model consists of a primary blackboard system supported by (three) subordinate blackboard systems. The primary blackboard system, called the root blackboard (RBB), manages the overall composition process. The three subordinate blackboard systems include the melodic blackboard (MBB), the harmonic blackboard (HBB), and the rhythmic blackboard (RHBB). These three blackboard systems manage composing processes relating to melody, harmony, and rhythm, respectively. Also, these three blackboard systems serve as blackboard knowledge sources (KSs) to the RBB.
Each blackboard system is implemented as a shared memory allowing access to its respective K-line classes (e.g., K-line melody classes, K-line methods of motivic development classes, etc.). The activation of K-lines within these classes serves as KSs that attempt to assert information provided by each active K-line onto the blackboard. This information is evaluated by a blackboard scheduler. Each blackboard system comprises a scheduler for managing blackboard functions. In Wolfgang, the schedulers are implemented as inference engines composed of minimal sets of rules relating to the distinct tasks of each respective blackboard. The schedulers are the only system components within the Wolfgang architecture whose knowledge is defined explicitly by rules.
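A compressed sketch of the blackboard hierarchy follows, with the melodic, harmonic, and rhythmic blackboards acting as knowledge sources for the root blackboard. The scheduler is reduced here to a list of (condition, action) pairs; the chapter describes the real schedulers only as minimal rule-based inference engines, so this class and its methods are hypothetical.

class Blackboard:
    def __init__(self, name, rules=None):
        self.name = name
        self.entries = []          # assertions posted by active K-line KSs
        self.rules = rules or []   # scheduler: (condition, action) pairs

    def post(self, entry):
        # A knowledge source asserts information onto the blackboard.
        self.entries.append(entry)
        self.schedule(entry)

    def schedule(self, entry):
        # Minimal rule-based scheduler: fire every rule whose condition matches.
        for condition, action in self.rules:
            if condition(entry):
                action(entry)

# Distributed hierarchical model: the melodic, harmonic, and rhythmic
# blackboards serve as knowledge sources for the root blackboard (RBB).
rbb = Blackboard("RBB")
mbb, hbb, rhbb = (Blackboard(n) for n in ("MBB", "HBB", "RHBB"))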
10.9 Disposition Feedback Facility
The disposition feedback facility provides Wolfgang with the ability to evaluate decisions made during a composing session. Evaluations are based on the emoting potential derived from each possible decision, and a ranking of each possible decision with regard to an ordered list of previous decisions that demonstrate high emoting potentials. This facility allows Wolfgang to mark for future use, during a given composing session, specific musical artifacts that compute high emotional potentials, while satisfying the current disposition of the system. Implementation of the disposition feedback facility consists of: a feedback loop, a variable called the *dispositionValue*, and an ordered list of decisions called the *listOfGoodElements*. Both variables, *dispositionValue* and *listOfGoodElements*, are asserted and maintained on the RBB. The variable *dispositionValue* maintains the current disposition of Wolfgang as one of the four emotion types: happy, sad, angry, and meditative. The variable *listOfGoodElements* provides Wolfgang with a list of musical elements that have been applied previously during the current composing session, and that have provided high emoting potentials. This list is significant; it acts as a short-term store of musical artifacts applicable in the motivic development of remaining musical phrases. Thus, during an individual composing session, Wolfgang creates and appends to the variable *listOfGoodElements* musical features derived from decisions made during the session that provide high emoting potentials. The feedback loop is implemented as a background process that sends status messages
to the RBB scheduler when an interesting event has occurred and has been posted on the RBB, MBB, HBB, or RHBB; the scheduler then appends the musical artifact associated with this event to the *listOfGoodElements*. The Disposition Feedback Facility evaluates and resolves composing decisions by evaluating the activation of K-lines that represent previous successful solutions (note: these previous solutions are a reflection of ‘‘positive’’ cultural experiences, which suggests that such solutions are both the development of a culturally learned syntax for a musical grammar and a ‘‘personal’’ bias by Wolfgang in their use) and then selecting the solution whose emoting potential best satisfies Wolfgang’s current disposition—keep in mind that a solution’s emoting potential is computed via the numeric weights represented in that solution’s K-line representation.
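A sketch of the disposition feedback facility, using the two variable names given in the text: the feedback step appends high-potential artifacts to the short-term store, and the resolution step picks the candidate that best matches the current disposition. The numeric threshold and the attribute names are invented for illustration.

# State maintained on the root blackboard (names follow the text).
dispositionValue = "sad"            # one of: happy, sad, angry, meditative
listOfGoodElements = []             # short-term store of useful artifacts

def feedback(artifact, emoting_potential, threshold=0.7):
    # Background feedback loop: when a posted event shows a high emoting
    # potential for the current disposition, remember the artifact for
    # later motivic development.
    if emoting_potential[dispositionValue] >= threshold:
        listOfGoodElements.append(artifact)

def resolve(candidates):
    # Select the previously successful solution whose emoting potential
    # best satisfies Wolfgang's current disposition.
    return max(candidates, key=lambda c: c.emoting_potential[dispositionValue])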
10.10 System Logfile
The logfile ensures that each composition is distinct; it provides information about previously composed works, so that Wolfgang can avoid the excessive repetition of musical artifacts by reviewing its previous composing habits during a session. The logfile consists of trace data from the last twenty sessions stored in a hard disk file.
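A minimal sketch of the logfile’s role, assuming a simple on-disk trace of the last twenty sessions and a repetition check against artifacts used in earlier works; the file format, file name, and threshold are invented.

import json
from collections import deque
from pathlib import Path

LOGFILE = Path("wolfgang_log.json")   # hypothetical trace file

def load_traces(limit=20):
    # Trace data from the last `limit` composing sessions.
    traces = json.loads(LOGFILE.read_text()) if LOGFILE.exists() else []
    return deque(traces, maxlen=limit)

def overused(artifact_id, traces, max_uses=3):
    # Flag artifacts repeated excessively across previous compositions.
    return sum(t.count(artifact_id) for t in traces) >= max_uses

def save_trace(traces, session_artifacts):
    # Append the artifacts used in the finished session and persist.
    traces.append(session_artifacts)
    LOGFILE.write_text(json.dumps(list(traces)))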
10.11 Closing Discussion
The second-generation Wolfgang system provided a working model of music composition in which ‘‘dispositions’’ are instrumental in deciding about steps in the elaboration of tonal monodies during composing. The research and design of Wolfgang have resulted from a subjective view of musical composing according to which the emoting potential of musical constructs is more important to the musical logic of a monody than are their syntactic features. A musical composition is thought to be an artifact that stimulates the senses and cognitive awareness of both its creator and any intended listener. I therefore view composing as a process that creates an artifact to communicate some cognitive ‘‘emotional’’ effect. The composing process necessitates the development of a set of musical skills and the application of these skills based on the disposition of the composer. We might consider Wolfgang’s compositional processing as constrained by its cultural grammar and
guided by its disposition to musically communicate some emoting quality.
Acknowledgments
The author is deeply indebted to Marvin Minsky, Ed Pednault, and Stacy Marsella for their helpful interest and comments in this work.
References
Clynes, M. (1988): Generalised Emotion: How it is Produced, and Sentic Cycle Therapy. In M. Clynes and J. Panksepp, eds., Emotions and Psychopathology. Plenum Press, New York, pages 107–170.
Doyle, J. (1979): A Truth Maintenance System. AI memo no. 521. Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge.
Lenat, D. (1982): AM: An Artificial Intelligence Approach to Discovery in Mathematics as Heuristic Search. In R. Davis and D. Lenat, eds., Knowledge-Based Systems in Artificial Intelligence. McGraw-Hill, New York.
Michalski, R. S. (1983): A Theory and Methodology of Inductive Learning. In R. S. Michalski, J. G. Carbonell, and T. M. Mitchell, eds., Machine Learning. Morgan Kaufmann, San Mateo, Calif.
Minsky, M. (1975): A Framework for Representing Knowledge. In P. H. Winston, ed., The Psychology of Computer Vision. McGraw-Hill, New York.
Minsky, M. (1980): K-Lines: A Theory of Memory. Cogn. Sci. J. 4 (2): 117–133.
Minsky, M. (1982): Music, Mind, and Meaning. In M. Clynes, ed., Music, Mind, and Brain: The Neuropsychology of Music. Plenum Press, New York.
Minsky, M. (1985): The Society of Mind. Simon and Schuster, New York.
Minsky, M. (2001): The Emotion Machine. Pantheon Books, New York.
Nii, H. P. (1989): Introduction. In V. Jagannathan, R. Dodhiawala, and L. Baum, eds., Blackboard Architectures and Applications. Academic Press, New York.
Ortony, A., Clore, G., and Collins, A. (1988): The Cognitive Structure of Emotions. Cambridge University Press, Cambridge.
Picard, R. (1997): Affective Computing. MIT Press, Cambridge.
Riecken, D. (1989): Goal Formulation with Emotional Constraints: Musical Composition by Emotional Computation. In H. Schorr and A. Rappaport, eds., Proceedings of AAAI—First Annual Conference on Innovative Applications of Artificial Intelligence, Stanford University. AAAI/MIT Press, Cambridge.
Riecken, D. (1992): Wolfgang: A System Using Emoting Potentials to Manage Musical Design. In M. Balaban, K. Ebcioglu, and O. Laske, eds., Understanding Music with Artificial Intelligence: Perspectives on Music Cognition. AAAI/MIT Press, Cambridge.
Riecken, D. (1994): M: An Architecture of Integrated Agents. In D. Riecken, ed., special issue of the Commun. ACM Intell. Agents 37 (7): 106–116. ACM, New York.
Riecken, D. (2000): We Must Re-Member to Re-Formulate: The M System. In A. Sloman, ed., Proceedings of AISB 2000 Symposium How to Design a Functioning Mind, The University of Birmingham, UK. Society for the Study of Artificial Intelligence and Simulation of Behaviour, Univ. of Birmingham, UK.
11 A Bayesian Heart: Computer Recognition and Simulation of Emotion Eugene Ball
11.1 Why Does a Computer Need a Heart?
Computers are rapidly becoming a critical and pervasive piece of our societal infrastructure. Within a decade or two they are likely to be the constant companions of most people in the industrialized world. Many of our interactions with them will continue to be like those with simpler machines: We push a button and the machine takes a limited and well-defined action, like zapping our food or washing the dishes. But there will also be many situations wherein spoken conversation will be the preferred means of communicating with a computer: perhaps to ask what the weather is likely to be in Vienna next week, or to select a good book to read on the plane. There are huge technical challenges that must still be overcome before conversational computers will be competent and reliable enough to use in this fashion, but I have little doubt that we will get there in twenty years (possibly much sooner).
A computer with which we engage in casual conversation (even if limited to narrow domains) will inevitably become a significant social presence (at least as noticeable as a human ticket agent with whom we carry out a brief transaction). I suspect that for many people, such a system will eventually become a long-term companion with which (whom?) we share much of our day-to-day activity. To be useful, conversational interfaces must be competent to provide some desired service; to be usable, they must be efficient and robust communicators; and to be comfortable, they will have to fulfill our deeply ingrained expectations about how human conversations take place. One subtle aspect of those expectations involves the emotional sensitivity of our conversational partners. We would be surprised if an outburst of anger toward someone produced a completely deadpan response, and might even be further angered by their lack of acknowledgment of our own emotional state. If we laugh in response to someone’s joke, we expect them to laugh (or at least smile) along with us—to do otherwise would be disconcerting.
My work (with my colleague Jack Breese) on the computational modeling of emotion and personality is intended as a first step toward an emotionally sensitive conversational interface: one that can recognize the emotional state of the human user and respond in a fashion that adds to the naturalness of the overall interaction.
Communicating with Computers
People frequently find spoken conversation to be the most efficient and comfortable way to conduct interactions with others. Particularly for tasks requiring many back-and-forth steps, written communication (even e-mail) can be tedious and suffers from the need to reacquire working context for each step of the interaction. Graphical computer interfaces have been quite successful as a medium for conducting many well-specified tasks of this sort. For example, common requests to a travel agent (long a favorite target application of the spoken language research community) can be carried out quite efficiently as an interaction with a Web server. However, if a request is unusual, it may be difficult to build a graphical user interface to handle it without unduly complicating the interface for more common cases. ‘‘Hidden commands’’ can provide less common capabilities without complicating simple ones, but they require that the user know both that they exist, and how to find them. One likely path for computer interface design is to gradually augment graphical user interfaces with linguistic capabilities. An ‘‘assistant’’ would accept flexible descriptions of commands or objects outside of the immediately visible workspace: ‘‘I’d like to reserve a group of fifty seats for travel from Minneapolis to Seattle next December’’ or ‘‘Check to see which movies are showing on flights from London to Seattle.’’ The ability to respond properly to powerful natural language requests (either typed or spoken) would be a welcome addition to current interfaces. While natural language may first be introduced as an ‘‘escape’’ for uncommon requests, its role is likely to steadily expand as it becomes more capable and as speech recognition becomes more reliable. Natural language requests of the sort suggested above are convenient and powerful, but often result in ambiguities that require clarification: ‘‘Do you want round-trip tickets?’’ Therefore, spoken interactions won’t usually consist of just isolated commands but will become conversational dialogues.
Social Aspects of Computer-Human Interaction
In many fundamental ways, people respond psychologically to interactive computers as if they were human. Reeves and Nass (1995) have demonstrated that strong social responses are evoked even if the computer does not use an explicitly anthropomorphic animated assistant. They suggest that humans have evolved to accord special significance to the movement and sounds produced by other people, and at some deeply ingrained level cannot avoid responding to the communication of modern technology as if it were coming from another person. As user interface designers, I believe we have a responsibility to try to understand the psychological reality and significance of these effects and to adapt our computer systems to the needs of our users. In addition, we should recognize that this social response is likely to become much stronger when the user is having a spoken conversation with a computer. It is clear that our emotional responses do not disappear while we are interacting with machines of all types: We get annoyed when they do not work properly, we respond with joy when a difficult task is completed smoothly, we even attribute human motives to their inanimate behaviors on occasion. Thus it will not be surprising to see even stronger emotional reactions to computers that talk, including expectations of appropriate emotional responses in the computer itself.
Emotionally Aware Computing
Explicit attention to the emotional aspects of computer interaction will be necessary in order to avoid degrading the user’s experience by generating unnatural and disconcerting behaviors. For example, early text-to-speech systems generated completely monotonic speech, which conveyed a distinctly depressed (and depressing) emotional tone. Therefore I would argue that the initial goal for emotional interfaces should be to simulate appropriate emotional reactivity by demonstrating an awareness of the emotional content of an interaction. The type of emotional reactivity that might be appropriately demonstrated by a conversational assistant can be illustrated by considering some imaginary computer responses to different situations. I’ve labeled each example with an emotional term or attitude description that could properly accompany the words,
giving them a more natural feel and a greater communicative potency.
The assistant is reporting the results of an assigned search task:
- I was unable to find anything meeting that description. (sadness)
- This is just what you were looking for. (pride)
- I’m not sure, but one of these might suit your needs. (uncertainty)
- Gee, and it only took me 12 minutes to find it! (embarrassment)
The assistant reacts to difficulties with the communication itself:
- I’m afraid I don’t understand what you mean by that. (confusion)
- I believe I just told you that I don’t know anything about that topic. (irritation)
- This doesn’t seem to be going so well . . . could you try again? (embarrassment)
- I’m really sorry, but could you repeat that one more time? (solicitude)
The assistant detects a strong emotional reaction from the user:
- Whoa . . . Can we calm down and try again? (calming)
- Gee, that was pretty frustrating, wasn’t it? (empathy)
- Great! Glad to be of help. (joy)
The assistant is trying to fulfill a user’s request to help modify the user’s behavior:
- Shouldn’t you get back to work now? (disapproval)
- If you don’t leave now, you’ll be late. (warning)
- Take a break now, if you want your wrists to get better! (commanding)
Discussion
Picard: ‘‘I was unable to find anything meeting that description.’’ When you assume goodness and sincerity and so forth, sadness could actually be expressed with no tone of voice. But of course we could all hear these said in other ways as well. And that’s where, I think, it’s interesting to consider not just the emotion of the system that is reading these sentences, but the emotion of the perceiver of these sentences. There are several interesting studies in which you present perceivers with a neutral stimulus, and a perceiver in a good mood will perceive the neutral stimulus as more positive, while the perceiver in a bad mood will perceive it as more negative.
Ball: In human interaction we see that distinction in the perceiver, and we react to that.
Picard: That’s right. E-mail that we send without tone really is more likely to be perceived ambiguously, so we may need to go to even greater efforts to verbally try to be clear, if we are positive, or just hurried, as opposed to angry—things that could easily be confused without the tone.
Bellman: Washington, D.C. had a big controversy over the voice that they were using for closing the doors inside a subway. Did you hear that?
Picard: Actually, there was a similar thing in Atlanta many years ago at the airport.
Bellman: They purposely made the voice a little bit brisk. They wanted to get people to get on: Doors are closing—get on! And there was actually a tremendous backfire against it. People found it was rude, it was a nasty voice. They finally had to get rid of it.
Picard: It’s funny: In Atlanta, they had started with a nice, human-sounding voice that sounded very friendly. And people didn’t pay much attention to it. So they went to a more computerized, synthetic-sounding voice that sounded sort of more high-tech and cold. The perceiver is influenced not just by their own emotions, but of course by what they think of that entity. They think, what is this computer that’s so stupid? Because so many people harbor these mixed feelings. We are very unusual in how we feel about computers compared to the rest of society.
Ball: And all these examples are imaginary. I think having the competence to generate the right one of these in the right situation is a huge goal. And getting it wrong is something that people can react very strongly to.
Ortony: There are huge individual differences. I have a friend from New York who has a reputation of being rather brusque. I find this little story illustrative. He finds Chicago intolerable compared to New York. One of the things that irritate him in Chicago is: people get off the bus and they thank the bus driver. He finds this absolutely incomprehensible behavior, like ‘‘The goddamned guy is paid to drive the bus. What’s the problem? You get off the bus and you go!’’ The point here is that people obviously have different personalities that require different interactional styles. Actually, some people will be upset by one style, while others will be satisfied. I mean, the environment includes the personality of the individual one is interacting with.
Picard: It’s going to be constantly changing in different situations. So, if the computer tries one of these lines on your friend, and your friend trashes the thing, then the computer had better not try any lines similar to that.
In these examples, the character’s linguistic expression is the clearest indicator of its emotional state, but if that expression is to seem natural and believable, it needs to be accompanied by appropriate nonverbal indications as well. Whether generated by preauthored scripts or from strong AI first principles, such utterances will seem false if the vocal prosody, hand gestures, facial expressions, and posture of the character do not match the emotional state expressed linguistically. In order to produce responses demonstrating as much emotional sensitivity as these examples suggest, a system must be able to: recognize and/or predict the emotional state of the user, and then synthesize and communicate an appropriate emotional response from the computer.
The next section describes a simple emotional model that can be used to adjust the emotional expression of a talking computer. While the motivation for this work is strongest for conversational systems, its application may be appropriate more generally. As computer use becomes ever more widespread in our culture, it is likely that we will see greatly increased attention to the subjective experiences of computer users, including the aesthetic and emotional impact of computer use. My expectation is that the experience gained from modeling the emotional impact of spoken interfaces will also be used to inform the design (and possibly the dynamic behavior) of conventional graphical interfaces, in order to improve user satisfaction.
11.2 A Bayesian Model of Emotion
Modeling Emotion
The understanding of emotion is the focus of an extensive psychology literature. Much of this work is based upon a deep understanding of an individual’s beliefs about how events will affect
him, and then modeling the way those beliefs lead to an emotional response (Scherer 1984; Ortony, Clore, and Collins 1988). While a few research efforts are attempting to build agents with sufficiently deep understanding that these models can be applied directly (Bates, Loyall, and Reilly 1994; Martinho and Paiva 1999), we have chosen to utilize a much simpler model of emotion, one that corresponds more directly to the universal responses (including physical responses) that people have to the events that affect them. Although this approach is unable to model many subtle emotional distinctions, it seems like a good match to conversational interfaces that communicate with people (within specific domains) using only a limited understanding of language and the user’s goals.
The term emotion is used in psychology to describe short-term (often lasting only a few seconds) variations in internal mental state, including both physical responses, like fear, and cognitive responses, like jealousy. We focus on two basic dimensions of emotional response (Lang 1995) that can usefully characterize nearly any experience:
- Valence represents the positive or negative dimension of feeling.
- Arousal represents the degree of intensity of the emotional response.
Figure 11.1 shows the emotional space defined by these dimensions, and shows where a few named emotions fit within them.
Figure 11.1 The position of some named emotions within the Valence Arousal Space.
In our model, these two continuous dimensions are further simplified by encoding them as a small number of discrete values. Valence is considered to be either negative, neutral, or positive; similarly, arousal is judged to be excited, neutral, or calm.
Psychologists also recognize that individuals have long-term traits that guide their attitudes and responses to events. The term personality is used to describe permanent (or very slowly changing) patterns of thought, emotion, and behavior associated with an individual. McCrae and Costa (1989) analyzed the five basic dimensions of personality (see Wiggins 1979), which form the basis of commonly used personality tests. They found that this interpersonal circumplex can be usefully characterized within a two-dimensional space. Taking an approach similar to our representation of emotion, we have incorporated into our model a representation of personality based upon the dimensions of:
- Dominance, indicating an individual’s relative disposition toward controlling (or being controlled by) others
- Friendliness, measuring the tendency to be warm and sympathetic
Dominance is encoded in our model as dominant, neutral, or submissive; friendliness is represented as friendly, neutral, or unfriendly. Given this quite simple but highly descriptive model of an individual’s internal emotional state and personality type, we wish to relate it to behaviors that help to communicate that state to others. The behaviors to be considered can include any observable variable that could potentially be caused by these internal states. In laboratory settings, some of the most reliable measures of emotional state involve physiological sensing, such as galvanic skin response (GSR) and heart rate. For both emotion and personality, survey questions are often used to elicit accurate measures of internal state (with tests such as the Myers-Briggs Type Indicator; Myers and McCaulley 1985). However, in normal human interaction, we rely primarily on visual and auditory observation to judge the emotion and personality of others. A computer-based agent might be able to use direct sensors of physiological changes, but if those measures require the attachment of unusual devices, they would be likely to have an adverse effect on the user’s perception of a natural interaction. For that reason, we have been most interested in observing behavior unobtrusively, either through audio and video channels, or possibly by using information (especially timing) that is available from traditional input devices like keyboards and mice, which might be a good indicator of the user’s internal state. More specialized devices like a GSR-sensing mouse or a pressure-sensitive keyboard might be worth investigating as well—although unless they turned out to be extraordinarily helpful, they are unlikely to make it into widespread use.
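The discretization just described can be written down directly. This is only a restatement of the model’s state space with hypothetical Python names; it adds nothing beyond the three values per dimension given in the text.

from dataclasses import dataclass

VALENCE      = ("negative", "neutral", "positive")
AROUSAL      = ("calm", "neutral", "excited")
DOMINANCE    = ("submissive", "neutral", "dominant")
FRIENDLINESS = ("unfriendly", "neutral", "friendly")

@dataclass
class EmotionState:          # short-term: may change every few seconds
    valence: str = "neutral"
    arousal: str = "neutral"

@dataclass
class Personality:           # long-term trait: treated as fixed
    dominance: str = "neutral"
    friendliness: str = "neutral"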
Bayes Networks
Bayesian networks (Jensen 1996) are a formalism for representing networks of probabilistic causal interactions that have been effectively applied to medical diagnosis (Horvitz and Shwe 1995), troubleshooting tasks (Heckerman, Breese, and Rommelse 1995), and many other domains. Bayes nets have a number of properties that make them an especially attractive mechanism for modeling emotion. First, they deal explicitly with uncertainty at every stage, which is a necessity for modeling anything as inherently nondeterministic as the connections between emotion and behavior. For example, an approach using explicit rules (if behavior B is observed, then deduce the presence of emotional state E) would have great difficulty accounting for inconsistent reactions to the same events. However, a Bayes network will make predictions about the relative likelihood of different outcomes, which can naturally capture the inherent uncertainty in human emotional responses. Second, the links in a Bayes net are intuitively meaningful because they directly represent the connections between causes and their effects. For example, a link between emotional arousal and the base pitch of speech can be used to represent the theoretical effect that arousal (and the resulting increased muscular tension) has on the vocal tract. It is quite easy to encode the expectation that with increasing arousal, the base pitch of speech is likely to increase as well. The exact probabilities involved can still be difficult to determine, but if the network is designed so that the parameters represent relatively isolated effects, relevant quantitative information from psychological studies is sometimes
available. Moreover, any model with enough complexity to model even simple emotional responses is likely to have a large number of parameters that have to be determined, and in a Bayesian network these parameters at least have clearly understandable meanings. Finally, and especially relevant to the twin requirements of emotionally aware computing (recognizing emotion in the user, and simulating emotional response by the computer), Bayesian networks can be used both to calculate the likely consequences of changes to their causal nodes and also to diagnose the likely causes of a collection of observed values at the dependent nodes. This means that a single network (and all of its parameters) can be used for both the recognition and the simulation tasks. When used to simulate emotionally realistic behavior of the computer, the states of the internal nodes representing dimensions of emotion and personality can be set to the values that we wish the computer to portray. The evaluation of the Bayes net will then predict a probability distribution for each possible category of behavior. This has the extra advantage that by randomly sampling this distribution over time, we can very easily generate a sequence of computer behaviors that are consistent with the desired emotional state, but are not completely deterministic. Because excessively deterministic behavior is a strong indicator of mechanistic origins, observers frequently judge that such behavior appears unnatural. By introducing some random (but consistent) variability, that source of unnaturalness can be avoided. When the computer observes user behavior (through cameras, microphones, etc.) the observations can be used to set the values of the corresponding leaf (or dependent) nodes in the network. Evaluation of the network then results in estimated values for the internal dimensions of the user’s emotional state. The most probable value can be taken as the user’s state (as perceived by the computer). If multiple values have similar probabilities, the diagnosis can be treated as uncertain.
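The dual use of one network for recognition and for simulation can be seen in a toy example with a single hidden node (arousal) and a single observable (base pitch). The prior and conditional probabilities below are invented for illustration and are not taken from the authors’ model.

import random

AROUSAL = ("calm", "neutral", "excited")
PITCH   = ("low", "mid", "high")

# Prior over arousal and P(pitch | arousal); numbers are illustrative only.
prior = {"calm": 1/3, "neutral": 1/3, "excited": 1/3}
cpt = {
    "calm":    {"low": 0.7, "mid": 0.2, "high": 0.1},
    "neutral": {"low": 0.2, "mid": 0.6, "high": 0.2},
    "excited": {"low": 0.1, "mid": 0.2, "high": 0.7},
}

def recognize(observed_pitch):
    # Diagnosis: P(arousal | pitch) by Bayes' rule.
    joint = {a: prior[a] * cpt[a][observed_pitch] for a in AROUSAL}
    z = sum(joint.values())
    return {a: p / z for a, p in joint.items()}

def simulate(target_arousal):
    # Simulation: sample a pitch from P(pitch | arousal), so repeated calls
    # give behavior that is consistent but not fully deterministic.
    dist = cpt[target_arousal]
    return random.choices(PITCH, weights=[dist[p] for p in PITCH])[0]

print(recognize("high"))     # most probable arousal: excited
print(simulate("excited"))   # usually "high", occasionally "mid" or "low"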
Emotion and Behavior
The Bayesian model that we have built (figure 11.2) contains internal states for emotional valence and arousal, and for the dominance and friendliness aspects of personality. These nodes are
treated as unobservable variables in the Bayesian formalism, with links connecting them to nodes representing aspects of behavior that are judged to be influenced by that hidden state. The behavior nodes currently represented include linguistic behavior (especially word selection), vocal expression (base pitch and pitch variability, speech speed and energy), posture, and facial expressions. Our Bayesian network therefore integrates information from a variety of observable linguistic and nonlinguistic behaviors.
The static model described above can be extended to a version with temporal dependencies between current and previous values of the internal variables characterizing emotions. In this model, we assume that the values of the observable variables such as speech speed, wording, gesture, and so on are independent of emotions given the current emotional state. The variables describing emotions evolve over time, and in this model the interval between time slices is posited to be three seconds. Valence, modeled by the variable E-Valence(t) in the network, depends on valence in the previous time period, E-Valence(t - 1), as well as the occurrence of a ValenceEvent in the previous period. A valence event refers to an event in the interaction that affects valence. For example, in a troubleshooting application, a negative valence event might be a failed repair attempt or a misrecognized utterance. We have a similar structure for arousal, where the variable ArousalEvent(t - 1) captures external events that may affect arousal in the current period, with discrete states calming, neutral, and exciting. The conditional probability distribution indicating the dynamic transition probabilities is shown in figure 11.3. The distribution does not admit a direct transition from a calm state of arousal to an excited state. (Note: this distribution is illustrative, and not based on a formal study or experiment.)
Because personality, by definition, is a long-term trait, we treat these variables as not being time dependent in the model, hence the lack of a time index in the variable names for personality, P-Friendly and P-Dominant. Note that the existence of the personality variables in the model induces a dependency among the observables at all times, so the model is not strictly Markovian in the sense that observations are conditionally independent of the past, given the current unknown emotional state. However, this model can be converted to a Markovian representation for inference.
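The temporal structure can be sketched as a transition table indexed by the previous arousal state and the last arousal event, advanced once per three-second slice. The probabilities are placeholders (the chapter notes that its own figure 11.3 is merely illustrative); the one property carried over is that a calm state never jumps directly to an excited one.

import random

AROUSAL = ("calm", "neutral", "excited")

# P(arousal_t | arousal_{t-1}, arousal_event_{t-1}); placeholder numbers.
# From "calm" the probability of "excited" is zero in every row, mirroring
# the no-direct-transition property described in the text.
TRANSITION = {
    ("calm", "calming"):     {"calm": 0.90, "neutral": 0.10, "excited": 0.00},
    ("calm", "neutral"):     {"calm": 0.70, "neutral": 0.30, "excited": 0.00},
    ("calm", "exciting"):    {"calm": 0.30, "neutral": 0.70, "excited": 0.00},
    ("neutral", "calming"):  {"calm": 0.50, "neutral": 0.45, "excited": 0.05},
    ("neutral", "neutral"):  {"calm": 0.15, "neutral": 0.70, "excited": 0.15},
    ("neutral", "exciting"): {"calm": 0.05, "neutral": 0.45, "excited": 0.50},
    ("excited", "calming"):  {"calm": 0.10, "neutral": 0.60, "excited": 0.30},
    ("excited", "neutral"):  {"calm": 0.05, "neutral": 0.35, "excited": 0.60},
    ("excited", "exciting"): {"calm": 0.00, "neutral": 0.15, "excited": 0.85},
}

def step_arousal(prev_arousal, prev_event):
    # Advance one three-second time slice of the dynamic model.
    dist = TRANSITION[(prev_arousal, prev_event)]
    return random.choices(AROUSAL, weights=[dist[a] for a in AROUSAL])[0]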
In the next section, we discuss the architecture of the emotional component of a complete interactive system. Following that, we present more detail on the model’s treatment of specific behaviors—particularly linguistic and vocal expression.
11.3 Emotional Interactive Systems
In an emotionally aware interactive system, the recognition and simulation of emotion will play an auxiliary and probably quite subtle role. The goal is to provide an additional channel of communication alongside the spoken or graphical exchanges that carry the main content of the interaction. If the emotional aspects of the system call attention to themselves, the primary motivation of producing natural interactions will have been defeated. In fact, users who get the feeling that the system is monitoring them too closely may begin to feel anxious or resentful (of course, the emotional system, recognizing that fact, could always turn itself off!).
Because recognizing emotional behaviors is likely to require considerable effort and produce only a modest benefit, it will probably require that a single emotional component be shared among many applications in order to be practical. Therefore, another attraction of adopting a simple noncognitive model of emotion is the ability to keep the emotional component independent of most of the domain-aware portions of the system. If we observe and simulate emotional behaviors that are expressed automatically and unconsciously, then the recognition and interpretation of those behaviors can take place in an independent subsystem. Thus we may well see the creation of just a few competing ‘‘emotion chips’’ that can be incorporated into many applications. These modules will be responsible for receiving sensory input and estimating the current emotional state of the user, selecting the emotional response from the system that will be most appropriate, and then modifying the speech and animated behavior of the system in order to express the selected behavior in a natural way.
System Structure
The system architecture that we have experimented with is demonstrated in figure 11.4. In our agent, we maintain two copies of the emotion/personality model. One is used to assess the user’s emotional state, the other to generate behavior for the agent.
[Figure 11.4 depicts the two copies of the model: observations of the USER feed an Emotion and Personality Assessment that estimates the User’s E&P; a Policy component maps this to the Agent’s E&P, which drives Emotion and Personality Simulation and, finally, the AGENT’s behavior.]
Figure 11.4 An architecture for speech and interaction interpretation and subsequent behavior generation by a character-based agent.
The model operates in a cycle, continuously repeating the following steps (a minimal sketch of this cycle appears after the list).
1. Observation. First, the available sensory input is analyzed to identify the value of any relevant input nodes. For example, a phrase spoken by the user might be recognized as one possible paraphrase among a group of semantically equivalent, but emotionally distinct, ways of expressing a concept. (The modeling of such alternatives is discussed in the next section.) In parallel, the vision subsystem might report its analysis that the user is currently producing large and fast gestures along with their speech. For each such perception, the corresponding node in the diagnostic copy of the Bayesian network is set to the appropriate value.
2. Assessment. Next, we use a standard probabilistic inference (Jensen 1989; Jensen 1996) algorithm to update the emotion and personality nodes in the diagnostic network to reflect the new evidence.
3. Policy. The linkage between the models is captured in the policy component. This component makes the judgment of what emotional response from the computer is desirable, given the new estimate of the user’s emotional state. Possible approaches to the policy component are discussed in the next section.
4. Simulation. Next, a probabilistic inference algorithm is applied to the second copy of the Bayes network. This time, the consequences of the new states of the emotion and personality nodes are propagated to generate probability distributions over the available behaviors of the agent. These distributions indicate which paraphrases, animations, speech characteristics, and so on would be most consistent with the agent’s emotional state and personality (as determined by the policy module).
5. Behavior. Some agent behaviors can be expressed immediately; for example, instructions for changes in posture or facial expression can be transmitted directly to the animation routines, and generate appropriate background movement. Other behavior nodes act as modifiers on application commands to the agent. At a given stage of the dialogue, the application may dictate that the agent should express a particular concept, such as a greeting or an apology. The current distribution for the node corresponding to that concept is then sampled to select a paraphrase to use in the spoken message.
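Assuming the two-model architecture of figure 11.4, the cycle can be written as a small event loop. All object interfaces here (set_evidence, infer, respond_to, set_states, predict, sample, express) are placeholders, not the authors’ API.

def interaction_cycle(diagnostic_net, behavior_net, policy, sensors, agent):
    # One pass through the observe / assess / policy / simulate / behave cycle.

    # 1. Observation: set observed leaf nodes from sensory analysis.
    for node, value in sensors.read():
        diagnostic_net.set_evidence(node, value)

    # 2. Assessment: infer the user's emotion and personality nodes.
    user_state = diagnostic_net.infer(["valence", "arousal",
                                       "dominance", "friendliness"])

    # 3. Policy: decide what emotional state the agent should portray.
    agent_state = policy.respond_to(user_state)

    # 4. Simulation: propagate the chosen state through the second network
    #    to obtain distributions over the available agent behaviors.
    behavior_net.set_states(agent_state)
    behavior_dists = behavior_net.predict(["paraphrase", "posture",
                                           "facial_expression", "prosody"])

    # 5. Behavior: sample the distributions and express the result.
    agent.express({name: dist.sample() for name, dist in behavior_dists.items()})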
Policy
The policy module has not been explored very thoroughly at this point. In a working system it would likely be quite complex, taking into account the history of the dialogue with the user thus far (or at least its emotional trajectory), and a model of the particular user’s preferences, as well as the estimates from the personality and emotion nodes in the diagnostic network. The imagined responses shown in the section entitled Emotionally Aware Computing illustrate a few of the difficulties. For example, at what point should a computer agent express irritation toward a user? Conversational systems frequently encounter explicit attempts to ‘‘break the demo.’’ The form of such an attack is sometimes sufficiently predictable that a clever response can be generated in an attempt to deflect it. If the user then persists in generating additional antagonistic input, perhaps an expression of irritation is the appropriate response. Thus far, we have only considered two very simplistic policies. The empathetic agent tries to match the user’s emotion and personality. There is some evidence that people prefer to deal with a computer agent that is similar to themselves (Reeves and Nass
1995), so this might be a good starting point. Of course, it does lead to a possible positive feedback loop, particularly if the user becomes angry! We have also experimented briefly with a contrary agent, whose emotions and personality tend to be the exact opposite of the user. While there are particular contexts in which this may produce interesting results—for example, when the user becomes bored or sad—it obviously is too simplistic to be a general policy.
Discussion
Bellman: So, is this policy box like the kind of thing in Eliza and other systems that we know, which would basically decide how friendly or how sympathetic the agent is?
Ball: Yes, it decides what its emotional state is. If the user is angry, then just being deadpan in response to that isn’t, I think, the right choice.
Ortony: The social interaction rules, essentially.
Ball: But there is a complex choice about this, given the history of the interaction and the long-term assessment of this user and all kinds of complex issues.
Picard: A whim to apologize, for example.
Ball: If you decide to apologize, for example, then you want the behavior or expression of the agent’s emotional state to be appropriate.
Bellman: The question I was asking actually was: It seemed to me that the policy you have represents the personality of the agent, and you could have your settings there. But then, you have your definition of the cultural interactions that are allowed, the conversational rules and other kinds of things that would be culturally determined. You wouldn’t sell the same agents and policies in Japan as you would here.
Ball: Sure.
Sloman: But at the moment they are collapsed into a policy box.
Ball: Well, it’s the policy box, and also outside the policy box. So, there’s something that’s controlling the overall interaction, the dialogue, deciding what to say, what choice to take.
Bellman: So, what I am saying is that it should have at least two boxes of that.
Ball: I am saying that this is just a little sideline to some whole other system which is doing the communication and the task and all of that. And so, this is just providing little, subtle modulation on the style of the interaction in order to try to make it feel more believable.
11.4 Recognition and Simulation
Linguistic Behavior
A key method of communicating emotional state is by choosing among semantically equivalent, but emotionally diverse paraphrases—for example, the difference between responding to a request with ‘‘sure thing,’’ ‘‘yes,’’ or ‘‘if you insist.’’ Similarly, an individual’s personality type will frequently influence their choice of phrasing—for example, ‘‘you should definitely’’ versus ‘‘perhaps you might like to.’’ Our approach to differentiating the emotional content of language is based on behavior nodes that represent ‘‘concepts,’’ including a set of alternative expressions or paraphrases. Some examples are shown in table 11.1.
Table 11.1 Paraphrases for alternative concepts
CONCEPT     PARAPHRASES
greeting    Hello; Hi there; Howdy; Greetings; Hey
yes         Yes; Yeah; I think so; Absolutely; I guess so; For sure
suggest     I suggest that you; Perhaps you would like to; Maybe you could; You should; Let’s
We model the influence of emotion and personality on wording choice in two stages, only the first of which is shown in the network of figure 11.1. Because the choice of a phrase can have a complex relationship with both emotion and personality, the problem of directly assessing probabilities for each alternative depending on all four dimensions rapidly becomes burdensome. However, inspired by Osgood’s work on meaning (Osgood, Suci, and Tannenbaum 1967), in which he identified several dimensions that can be used to characterize the connotations of most concepts, we first capture the relationship between emotion and several ‘‘expressive styles.’’ The current model has nodes representing positive, strong, and active styles of expression (similar to Osgood’s evaluative, potent, and active), as well as measures of terseness and formality (see figure 11.5). These nodes depend upon the emotion and personality nodes and capture the probability that individuals express themselves in a positive (judgmental), strong, active, terse, and/or formal manner. Each of these nodes is binary valued, true or false. Thus this stage captures the degree to which an individual with a given personality and in a particular emotional state will tend to communicate in a particular style.
The second stage captures the degree to which each paraphrase actually is positive, strong, active, terse, and formal. This stage says nothing about the individual, but rather reflects a general cultural interpretation of each paraphrase: that is, the degree to which that phrase will be interpreted as positive, active, and so on by a speaker of American English. A node such as ‘‘GreetPositive’’ is also binary valued, and is true if the paraphrase would be interpreted as ‘‘positive’’ and false otherwise. Finally, a set of nodes evaluates whether the selected paraphrase of a concept actually matches the chosen value of the corresponding expressive style. A node such as ‘‘GreetMatchPositive’’ has value true if and only if the values of ‘‘GreetPositive’’ and ‘‘wdsPositive’’ are the same. The node ‘‘GreetMatch’’ is simply a Boolean that has value true when all of its parents (the match nodes for each expressive style) are true.
When using the network, we set ‘‘GreetMatch’’ to have an observed value of true. This causes the Bayesian inference algorithm to force the values of the nodes in the concept and style stages to be consistent. For example, when simulating the behavior of an agent, each style node (like ‘‘wdsPositive’’) will have a value distribution implied by the agent’s personality and emotional state. The likelihood of alternative phrasings of a concept node (like ‘‘Greet’’) will then be adjusted in order to produce the best possible match between its attributes and the style nodes. In this fashion, a negative emotional state will greatly increase the chance that the agent will select ‘‘Oh, you again’’ as a greeting.
Figure 11.5 Belief network fragment for the ‘‘greeting’’ concept, linking the expressive style nodes (positive, strong, active, terse, formal) to alternative paraphrases such as ‘‘Hi there,’’ ‘‘Oh, you again,’’ ‘‘Good day,’’ and ‘‘Good to see you again.’’
In this fashion, a negative emotional state will greatly increase the chance that the agent will select ‘‘Oh, you again’’ as a greeting. In developing a version of this Bayes net for a particular application, we need to generate a network fragment, such as shown in figure 11.5, for each conceptual element for which we want emotional expression. These fragments are merged into a global Bayesian network capturing the dependencies between the emotional state, personality, natural language, and other behavioral components of the model. The various fragments differ only in the assessment of the paraphrase scorings—that is, the probability that each paraphrase will be interpreted as active, strong, and so on. There are five assessments needed for each alternative paraphrase for a concept (the ones mentioned earlier, plus a formality assessment). Note that the size of the belief network representation grows linearly in the number of paraphrases (the number of concepts modeled times the number of paraphrases per concept). In a previously proposed model structure, we had each of the expressive style nodes pointing directly into the concept node, creating a multistated node with five parents. The assessment burden in this structure was substantial, and a causal independence assumption such as noisy-or is not appropriate (Heckerman 1993). The current structure reduces this assessment burden, and also allows modular addition of new expressive style nodes. If we add a new expressive style node to the network (such as cynical), then the only additional assessments we need to generate are the cynical interpretation nodes of each concept paraphrase. These features of the Bayes network structure make it easy to extend the model for new concepts and dimensions of expressive style.
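To make the two-stage structure concrete, the following is a minimal, self-contained Python sketch (not the system’s actual code) of how conditioning on the match nodes biases paraphrase selection toward phrases whose cultural attributes agree with the agent’s expressive style. Only two style dimensions are modeled, and all paraphrase attributes and probabilities are invented for illustration.

    # Cultural interpretation of each greeting paraphrase (invented for illustration):
    # does a speaker of American English read the phrase as "positive"? as "terse"?
    PARAPHRASES = {
        "Hi there":        {"positive": True,  "terse": True},
        "Good to see you": {"positive": True,  "terse": False},
        "Hello":           {"positive": True,  "terse": False},
        "Oh, you again":   {"positive": False, "terse": True},
    }

    def style_distribution(valence, arousal):
        """Stage one (assumed numbers): probability that the agent expresses
        itself in a positive or terse style, given its emotional state."""
        p_positive = {"negative": 0.15, "neutral": 0.60, "positive": 0.90}[valence]
        p_terse = {"low": 0.30, "high": 0.70}[arousal]
        return {"positive": p_positive, "terse": p_terse}

    def greeting_posterior(valence, arousal):
        """Posterior over paraphrases after observing that every match node
        (style value == phrase attribute) is true, as described in the text."""
        style = style_distribution(valence, arousal)
        prior = 1.0 / len(PARAPHRASES)              # uniform prior over paraphrases
        scores = {}
        for phrase, attrs in PARAPHRASES.items():
            p = prior
            for dim, p_true in style.items():
                # P(style_dim agrees with this phrase's cultural attribute)
                p *= p_true if attrs[dim] else (1.0 - p_true)
            scores[phrase] = p
        total = sum(scores.values())
        return {phrase: p / total for phrase, p in scores.items()}

    if __name__ == "__main__":
        for valence in ("negative", "positive"):
            print(valence, greeting_posterior(valence, arousal="high"))

With a negative valence and high arousal, most of the posterior mass falls on ‘‘Oh, you again,’’ mirroring the example above; a positive valence shifts the mass to the friendlier greetings.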
Vocal Expression
As summarized by Murray and Arnott (1993), there is a considerable (but fragmented) literature on the vocal expression of emotion. Research has been complicated by the lack of agreement on the fundamental question of what constitutes emotion, and how it should be measured. Most work is based upon either self-reporting of emotional state or upon an actor’s performance of a named emotion. In both cases, a short list of ‘‘basic emotions’’ is generally used; however, the categories used vary among studies.
A number of early studies demonstrated that vocal expression carries an emotional message independent of its verbal content, using very short fragments of speech, meaningless or constant carrier phrases, or speech modified to make it unintelligible. These studies generally found that listeners can recognize the intended emotional message, although confusions between emotions with a similar arousal level are relatively frequent. Using synthesized speech, Janet Cahn (1989) showed in her MIT master’s thesis that the acoustic parameters of the vocal tract model in the DECtalk speech synthesizer could be modified to express emotion, and that listeners could correctly identify the intended emotional message in most cases. Studies done by the Geneva Emotion Research Group (Johnstone, Banse, and Scherer 1995; Banse and Scherer 1996) have looked at some of the emotional states that seem to be most confusable in vocal expression. They suggest, for example, that the communication of disgust may not depend on acoustic parameters of the speech itself, but on short sounds generated between utterances. In more recent work (Johnstone and Scherer 1999), they have collected both vocal and physiological data from computer users expressing authentic emotional responses to interactive tasks. The body of experimental work on vocal expression indicates that arousal, or emotional intensity, is encoded fairly reliably in the average pitch and energy level of speech. This is consistent with the theoretical expectations of increased muscle tension in high-arousal situations. Pitch range and speech rate also show correlations with emotional arousal, but these are less reliable indicators. The communication of emotional valence through speech is a more complicated matter. While there are some interesting correlations with easily measured acoustic properties (particularly pitch range), complex variations in rhythm seem to play an important role in transmitting positive/negative distinctions. In spite of the widely recognized ability to ‘‘hear a smile,’’ which Tartter (1980) related to formant shifts and speaker-dependent amplitude and duration changes, no reliable acoustic measurements of valence have been found. Roy and Pentland (1996) more recently performed a small study in which a discrimination network trained with samples from three speakers expressing imagined approval or disapproval was able to distinguish those cases with reliability
comparable to human listeners. Thus recognition of emotional valence from acoustic cues remains a possibility, but supplementary evidence from other modalities (especially observation of facial expression) will probably be necessary to achieve reliable results. Our preliminary Bayesian subnetwork representing the effects of emotional valence and arousal on vocal expression therefore reflects the trends reported in the literature cited above, as follows:

With increasing levels of emotional arousal, we expect to find:
- Higher average pitch
- Wider pitch range
- Faster speech
- Higher speech energy

As the speaker feels more positive emotional valence, their speech will tend toward:
- Higher average pitch
- A tendency for a wider pitch range
- A bias toward higher speech energy
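The following Python sketch encodes these qualitative trends as simple deterministic adjustments, purely to make the direction of each effect explicit; the actual subnetwork is probabilistic, and every magnitude and parameter name below is an invented placeholder.

    def vocal_parameters(arousal, valence, base_pitch_hz=120.0,
                         base_rate_wpm=160.0, base_energy_db=60.0):
        """Map arousal and valence (assumed to lie in [-1.0, 1.0]) to speech
        parameters. Magnitudes are invented; only the direction of each
        trend follows the list above."""
        params = {
            # higher arousal -> higher pitch, wider range, faster, louder
            "avg_pitch_hz": base_pitch_hz * (1.0 + 0.25 * arousal),
            "pitch_range_semitones": 4.0 + 3.0 * arousal,
            "speech_rate_wpm": base_rate_wpm * (1.0 + 0.20 * arousal),
            "energy_db": base_energy_db + 6.0 * arousal,
        }
        # positive valence -> higher pitch, a tendency toward a wider range
        # and higher energy (weaker effects, applied only for positive valence)
        positive = max(valence, 0.0)
        params["avg_pitch_hz"] *= 1.0 + 0.10 * positive
        params["pitch_range_semitones"] += 1.0 * positive
        params["energy_db"] += 2.0 * positive
        return params

    print(vocal_parameters(arousal=0.8, valence=0.5))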
Gesture and Posture
Humans communicate their emotional state constantly through a variety of nonverbal behaviors, ranging from explicit (and sometimes conscious) signals like smiles and frowns, to subtle (and unconscious) variations in speech rhythm or body posture. Moreover, people are correspondingly sensitive to the signals produced by others, and can frequently assess the emotional states of one another accurately even though they may be unaware of the observations that prompted their conclusions. The range of nonlinguistic behaviors that transmit information about personality and emotion is quite large. We have only begun to consider them carefully and list here just a few of the more obvious examples. Emotional arousal affects a number of (relatively) easily observed behaviors, including speech speed and amplitude, the size and speed of gestures, and some aspects of facial expression and posture. Emotional valence is signaled most clearly by facial expression, but can also be communicated by means of the pitch contour and rhythm of speech. Dominant personalities might be expected to generate characteristic rhythms and amplitude of speech, as well as assertive postures and gestures.
Friendliness will typically be demonstrated through facial expressions, speech prosody, gestures, and posture. The observation and classification of emotionally communicative behaviors raises many challenges, ranging from simple calibration issues (e.g., speech amplitude) to gaps in psychological understanding (e.g., the relationship between body posture and personality type). However, in many cases the existence of a causal connection is uncontroversial, and given an appropriate sensor (e.g., a gesture size estimator from camera input), the addition of a new source of information to our model will be fairly straightforward. Within the framework of the Bayesian network of figure 11.1, it is a simple matter to introduce a new source of information to the emotional model. For example, suppose we had a new speech recognition engine that reported the pitch range of the fundamental frequencies in each utterance (normalized for a given speaker). We could add a new network node that represents PitchRange with a few discrete values and then construct causal links from any emotion or personality nodes that we expect to affect this aspect of expression. In this case, a single link from Arousal to PitchRange would capture the significant dependency. Then the model designer would estimate the distribution of pitch ranges for each level of emotional arousal, to capture the expectation that increased arousal leads to generally raised pitch. The augmented model would then be used both to recognize that increased pitch may indicate emotional arousal in the user and to add to the expressiveness of a computer character by enabling it to communicate heightened arousal by adjusting the base pitch of its synthesized speech.
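A minimal Python sketch of this extension, with an assumed set of discrete values and an invented conditional probability table: the same Arousal-to-PitchRange link is used forward to choose a pitch range for a character and backward, via Bayes’ rule, to update a belief about the user’s arousal from an observed pitch range.

    # Assumed discrete values and numbers; real assessments would come from
    # the model designer, as described in the text.
    AROUSAL_PRIOR = {"low": 0.4, "neutral": 0.4, "high": 0.2}

    # P(PitchRange | Arousal)
    PITCH_RANGE_CPT = {
        "low":     {"narrow": 0.6, "medium": 0.3, "wide": 0.1},
        "neutral": {"narrow": 0.3, "medium": 0.5, "wide": 0.2},
        "high":    {"narrow": 0.1, "medium": 0.3, "wide": 0.6},
    }

    def simulate_pitch_range(arousal):
        """Forward use: pick the most likely pitch range for a character's arousal."""
        return max(PITCH_RANGE_CPT[arousal], key=PITCH_RANGE_CPT[arousal].get)

    def infer_arousal(observed_range):
        """Diagnostic use: P(Arousal | PitchRange = observed_range) by Bayes' rule."""
        joint = {a: AROUSAL_PRIOR[a] * PITCH_RANGE_CPT[a][observed_range]
                 for a in AROUSAL_PRIOR}
        total = sum(joint.values())
        return {a: p / total for a, p in joint.items()}

    print(simulate_pitch_range("high"))   # 'wide'
    print(infer_arousal("wide"))          # roughly {'low': 0.17, 'neutral': 0.33, 'high': 0.50}

With these invented numbers, observing a wide pitch range shifts the arousal belief from (0.4, 0.4, 0.2) toward high arousal (about 0.17, 0.33, 0.50).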
11.5 Concerns
I think it is appropriate at this point to raise two areas of concern for research involving emotional response and computing.
Overhyping Emotional Computing
First, any mention of emotion and computers in the same breath gets an immediate startle reaction from members of the general public (and the media). Even if we were to avoid any discussion of the far-out questions (Will computers ever truly ‘‘feel’’ an emotion?), I think we will be well advised to take exceptional care when explaining our work to others.
The good news is that the idea of a computer with emotional sensitivity seems to be getting a lot of serious interest and discussion. Perhaps there is a shared (though unarticulated) appreciation that a device that can be so capable of generating emotional responses should also know how to respond to them! However, the recent high level of interest may generate an unreasonably high level of expectation for technology that dramatically and reliably interprets emotional behavior. My personal expectation is that the significance of emotionally aware computing will be subtle, and will only reach fruition when spoken interaction with computers becomes commonplace. Moreover, as a technology that works best when it isn’t noticed, emotional computing should probably try to avoid the limelight as much as possible.
Ethical Considerations
When considering the ethics of emotional computing, there are many pitfalls. Some people may find the idea of ascribing an attribute as distinctively human as the communication of emotion to a computer inherently objectionable. But even considering only more prosaic concerns, there are some potential uses of emotionally sensitive computing that clearly cross the boundaries of ethical behavior. Emotion could be a very powerful persuasive tool, if used effectively—especially if coming from the personification of your own computer, with which you may have a long and productive relationship. B. J. Fogg (1999) at Stanford University has begun to seriously consider these issues, under the name of captology (the study of computers as persuasive technologies). I believe it would be irresponsible of us to pretend that some of our ideas will never be used to unethically persuade people. Con artists have always been quick to adopt any available technology to defraud unsuspecting people. The age-old defense of ‘‘if we don’t do this, someone else will,’’ while true, doesn’t seem to me a sufficient response. I would be interested in hearing the thoughts of other workshop members on this topic. My best idea is to try to develop a clear consensus on what exactly would constitute an unethical use of emotionally aware computing. If we could agree on that, we could quickly and vocally
object to unethical behavior (particularly commercial uses) when it occurs, and by announcing that intention in advance, perhaps dissuade at least some people from misusing our ideas.
Discussion: The Ethics of Emotional Agents
Picard: Let me tell you a short story. We ran an experiment. We frustrated the test persons, and we built an agent that tries to make them feel less frustrated by using socially acceptable strategies. And it looks like we succeeded: The people who worked with the agent showed behavior that was indicative of significantly less frustration. I was explaining this to some Sloan Fellows, one of them from a very large computer company, and he said: No surprise to us. We found, when we surveyed our customers who had had our product, that those who had had a problem with the product, found it defective, and had gotten this empathy-sympathy-active-listening kind of response from customer service people—not from an agent!—were significantly more likely to buy our products again than those who bought a product and had no problems with it. And he said that they seriously confronted this as an ethical issue. Furthermore, all these years I have talked to visitors of our lab. Every one of them lately has raised the issue of how they have to get technology out faster, and they are now aiming not for it to have gone through alpha and beta cycles, and so forth, but they say, ‘‘60 per cent ready, and then it goes out there.’’ They are excited about the fact that getting something out that is defective and handling complaints about it better could actually lead to better sales.
Ortony: And the real ethical problem that I guess the guys are focusing on is that this is a sort of unknown property—when do you come in? If one were to explicitly say: ‘‘Whenever you encounter a bug in our software, some very nice agents are going to come and calm you down,’’ then you would not have an ethical problem. Then you think the ethical problem would go away. It is the fact that this is ‘‘unsolicited’’ behavior from the system that’s problematic, I presume.
Ball: I am not sure if it would go away.
Picard: I am not so sure if it is unsolicited, either.
Ortony: Well, it diminishes, because, after all, you design cars with features to make people comfortable, but they are visible and they
are available for inspection prior to purchase, and so you don’t feel bad that you have made power seats as opposed to manual seats. On the other hand: What does a power seat do? It makes you feel all easy about changing the position of your seat, and all kinds of things that just make you feel better as a user. We don’t have a problem with that, presumably because it’s explicit and openly available and inspectable rather than requiring actual interaction with a product.
Ball: Right. And so, I think there is a deep problem about the emotional component, because it needs to be hidden in order to make sense.
Bellman: And there is a privacy issue somewhere, just in terms of the modeling that you do about the user, and who you pass it to, and what other kinds of reasons it is used for. Let me just mention the example of school children who were being taught how they could get their parents’ financial information back to companies.
Ball: If you have an agent that’s observing your behavior for a long period of time, he is going to know a lot about you. I think it is relatively easy to draw a line and say: There is a lot of information, it just never goes out of your machine.
Picard: In an office environment, the company owns your machine and what’s on it. But in the future, you know, you might be much more comfortable with the mediator that you trust, operating between you and the office machine, if that mediator was as comfortable as your earring or your jewelry or something that you owned.
Sloman: Or it’s your personal computer, that you bring into the office and plug into the main one—rather than an earring.
Bellman: What I was pointing out is: Why should this be a secret from the patient or from the person who is being educated? Why can’t they have control over it? And that fits in with your wearable devices. In some sense, yes, you have this wonderful computer-based technology that allows this kind of collection about you, but you have control over it. That’s part of the solution. The other thing is that a lot of the virtual world work is still highly effective even when it’s transparent to the user. Many of you seem to assume that it takes away from the mythology of the character if somehow people begin to lift the hood. In my experience we have just found the opposite. In really hundreds and hundreds of cases, letting people actually, for example, walk into your office,
and they meet your characters, and they actually look on the lifted hood, see how it’s set up, see the way in which it works, and they pick up issues about how responsive it is, or what it does, or what it collects on them. We have not found that this knowledge would actually take away the experience. It’s very empowering, and it’s an interesting way of thinking about these control issues.
Sloman: These points are all about how much the end user has access to information. And it’s quite important that often they cannot absorb and evaluate the information if they get it themselves. You may have to have third parties, like consumer associations and other people who have the right to investigate these things, to evaluate them, and then to publicize. If you are not an expert, you go to someone you trust. That is not necessarily your earring or your personal computer, but it might be another person or an organization who has looked at this thing, and you will be in a better position. So, the information must be available.
Picard: We have to distinguish real-time, run-time algorithms from store-up-plots of do-it-slowly algorithms. If you allow accumulation, then we are getting to know somebody over a long period of time, and you can build up their goals, values, and expectations, all these other things that help predict the emotion, since it’s not just what you see right now, but it’s also what you know about the person. Whereas if you don’t keep all that person-specific memory, you can only have ‘‘commonsense about prototypes about people,’’ and then take what you observe at face value from this stranger, so to speak. So, in the latter case, I believe, we can do without any problems of privacy. We just build up really good models of what is typical, and they won’t always be as good. But if you really want the system to get to know you intimately, to know your values, how you are likely to respond to the situation, there is going to have to be some memory. We can’t do it all in the run-time. And that’s an issue of privacy. But we can go a long way without hitting the privacy issue. And once we do, there are a whole lot of possible solutions to it.
References
Ball, G., Ling, D., Kurlander, D., Miller, J., Pugh, D., Sally, T., Stankosky, A., Thiel, D., van Dantzich, M., and Wax, T. (1997): Lifelike Computer Characters: The Persona Project at Microsoft Research. In J. M. Bradshaw, ed., Software Agents, 191–222. AAAI Press/MIT Press, Menlo Park, Calif.
Banse, R., and Scherer, K. R. (1996): Acoustic profiles in vocal emotion expression. J. Personality Social Psychol. 70: 614–636.
Bates, J., Loyall, A. B., and Reilly, W. S. (1994): An Architecture for Action, Emotion, and Social Behavior. In C. Castelfranchi and E. Werner, eds., Artificial Social Systems: Fourth European Workshop on Modeling Autonomous Agents in a Multi-Agent World, MAAMAW ’92, S. Martino al Cimino, Italy, July 29–31, 1992. Lecture Notes in Computer Science Vol. 830, Springer-Verlag, Berlin.
Cahn, J. E. (1989): Generating Expression in Synthesized Speech. Master’s Thesis, Massachusetts Institute of Technology, May 1989.
Cialdini, R. B. (1993): Influence: The Psychology of Persuasion. Quill, William Morrow, New York.
Elliott, C. D. (1992): The Affective Reasoner: A Process Model of Emotions in a Multi-Agent System. Ph.D. diss., Northwestern University, Evanston, Ill.
Flanagan, J., Huang, T., Jones, P., and Kasif, S. (1997): Final Report of the NSF Workshop on Human-Centered Systems: Information, Interactivity, and Intelligence. National Science Foundation, Washington, D.C.
Fogg, B. J. (1999): Persuasive Technologies. Commun. ACM 42 (5): 26–29.
Heckerman, D. (1993): Causal Independence for Knowledge Acquisition and Inference. In D. Heckerman and A. Mamdani, eds., Proceedings of the Ninth Conference on Uncertainty in Artificial Intelligence, 122–127. Morgan Kaufmann, San Mateo, Calif.
Heckerman, D., Breese, J., and Rommelse, K. (1995): Troubleshooting under Uncertainty. Commun. ACM 38 (3): 49–57.
Horvitz, E., and Shwe, M. (1995): Melding Bayesian Inference, Speech Recognition, and User Models for Effective Handsfree Decision Support. In R. M. Gardner, ed., Proceedings of the Symposium on Computer Applications in Medical Care. IEEE Computer Society Press, Long Beach, Calif.
Huang, X., Acero, A., Alleva, F., Hwang, M. Y., Jiang, L., and Mahajan, M. (1995): Microsoft Windows Highly Intelligent Speech Recognizer: Whisper. In Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, Detroit. Vol. 1, pp. 93–96, IEEE.
Jensen, F. V. (1996): An Introduction to Bayesian Networks. Springer, Berlin, Heidelberg, New York.
Jensen, F. V., Lauritzen, S. L., and Olesen, K. G. (1989): Bayesian Updating in Recursive Graphical Models by Local Computations. TR R-89-15, Institute for Electronic Systems, Department of Mathematics and Computer Science, University of Aalborg, Denmark.
Johnstone, I. T., Banse, R., and Scherer, K. R. (1995): Acoustic Profiles from Prototypical Vocal Expressions of Emotion. In K. Elenius and P. Branderud, eds., Proceedings of the XIIIth International Congress of Phonetic Sciences, Vol. 4, pp. 2–5. KTH/Stockholm University, Stockholm.
Johnstone, I. T., and Scherer, K. R. (1999): The Effects of Emotions on Voice Quality. In J. J. Ohala, Y. Hasegawa, M. Ohala, D. Granville, and A. C. Bailey, eds., Proceedings of the XIVth International Congress of Phonetic Sciences, Vol. 3, pp. 2029–2032. American Institute of Physics, New York.
Klein, J., Moon, Y., and Picard, R. W. (1999): This Computer Responds to User Frustration: Theory, Design, Results, and Implications. TR-501, MIT Media Laboratory, Vision and Modeling Technical Group.
Lang, P. (1995): The Emotion Probe: Studies of Motivation and Attention. Am. Psychol. 50 (5): 372–385.
Martinho, C., and Paiva, A. (1999): Pathematic Agents: Rapid Development of Believable Emotional Agents in Intelligent Virtual Environments. In O. Etzioni, J. P. Müller, and J. M. Bradshaw, eds., Proceedings of the Third Annual Conference on Autonomous Agents, AGENTS ’99, 1–8. ACM Press, New York.
McCrae, R., and Costa, P. T. (1989): The Structure of Interpersonal Traits: Wiggins’s Circumplex and the Five Factor Model. J. Pers. Soc. Psychol. 56 (5): 586–595.
Murray, I. R., and Arnott, J. L. (1993): Toward the Simulation of Emotion in Synthetic Speech: A Review of the Literature on Human Vocal Emotion. J. Acoust. Soc. Am. 93 (2): 1097–1108.
Myers, I. B., and McCaulley, M. H. (1985): Manual: A Guide to the Development and Use of the Myers-Briggs Type Indicator. Consulting Psychologists Press, Palo Alto, Calif.
Ortony, A., Clore, G. L., and Collins, A. (1988): The Cognitive Structure of Emotions. Cambridge University Press, Cambridge.
Osgood, C. E., Suci, G. J., and Tannenbaum, P. H. (1967): The Measurement of Meaning. University of Illinois Press, Urbana.
Picard, R. W. (1995): Affective Computing. MIT Media Lab, Cambridge. Perceptual Computing Section Technical Report 321.
Picard, R. W. (1997): Affective Computing. MIT Press, Cambridge.
Reeves, B., and Nass, C. (1995): The Media Equation. CSLI Publications and Cambridge University Press, New York.
Roy, D., and Pentland, A. (1996): Automatic spoken affect analysis and classification.
Scherer, K. R. (1984): Emotion as a Multicomponent Process: A Model and Some Cross-Cultural Data. Rev. Pers. Soc. Psychol. 5: 37–63.
Sloman, A. (1992): Prolegomena to a Theory of Communication and Affect. In A. Ortony and J. Slack, eds., AI and Cognitive Science Perspectives on Communication. Springer, Berlin, Heidelberg, New York.
Tartter, V. C. (1980): Happy Talk: Perceptual and Acoustic Effects of Smiling on Speech. Percep. Psychophys. 27: 24–27.
Trower, T. (1997): Microsoft Agent. Microsoft Corporation, Redmond, Wash. On-line. Available: http://www.microsoft.com/msagent/. (Availability last checked 5 Nov 2002)
Wiggins, J. S. (1979): A psychological taxonomy of trait-descriptive terms: The interpersonal domain. J. Personality Social Psychol. 37: 395–412.
12 Creating Emotional Relationships with Virtual Characters Andrew Stern
During the workshop on Emotions in Humans and Artifacts, participants presented research addressing such questions as How do emotions work in the human mind? What are emotions? How would you build a computer program with emotions? Can we detect the emotional state of computer users? Yet as the discussions wore on, instead of providing some answers to these questions, they persistently raised new ones. When can we say a computer ‘‘has’’ emotions? What does it mean for a computer program to be ‘‘believable?’’ Do computers need emotions to be ‘‘intelligent,’’ or not? These scientific, engineering, and philosophical questions are fascinating and important research directions; however, as an artist with a background in computer science and filmmaking, I found my own perspective on emotions in humans and artifacts generally left out of the discussion. I find the burning question to be, What will humans actually do with artifacts that (at least seem to) have emotions? As a designer and engineer of some of the first fully realized, ‘‘believable’’ interactive virtual characters to reach a worldwide mass audience, Virtual Petz and Babyz (PF.Magic/ Mindscape 1995–99), I know this is a question no longer restricted to the domain of science fiction. Today, millions of people are already encountering and having to assimilate into their lives new interactive emotional ‘‘artifacts.’’ Emotions have been a salient feature of certain man-made artifacts—namely, stories and art—long before the scientific study of emotion began. Looking into the past, we see a tradition of humans creating objects in one form or another that display and communicate emotional content. From figurative painting and sculpture to hand puppets and animated characters in Disney films such as Snow White, man-made artifacts have induced emotional reactions in people as powerful and as meaningful as those generated between people themselves. Now we are at a point in the history of story and art where humans can create interactive artifacts, using the computer as a
new medium. We can now write software programs that can ‘‘listen’’ to a human, process what it just ‘‘heard,’’ and ‘‘speak’’ back with synthetically generated imagery and sound, on machines that are already in the households of millions of families around the world. The new urgent challenge for artists and storytellers is to create interactive artifacts that can induce the same emotional reactions and communicate emotional content as traditional noninteractive stories and art have done. In fact, because the computer theoretically can be programmed to tailor the experience to an individual, it could become the most powerful medium of all for creating affective stories and art. The question becomes, How do you do that? How do emotionally powerful stories and art ‘‘work’’ anyway? Alas, creating an artifact that produces a meaningful emotional reaction in a human is considered an ‘‘art’’ itself. Although techniques and advice on the artistic process have been published—works such as The Art of Dramatic Writing by playwright Lajos Egri (1946), The Illusion of Life by Disney animators Thomas and Johnston (1981), Letters to a Young Poet by poet Rainer Rilke (1934)—the act of creating emotionally powerful artifacts is by and large considered elusive, mysterious, and unquantifiable. Even in art school the typical approach is to teach students to imitate (‘‘master’’) traditional styles and techniques, after which it is hoped the students will be ready to ‘‘find their own style,’’ which sometimes never happens. Naturally it is difficult to discuss the art of creating emotionally powerful artifacts in the context of a scientific workshop, in the way one would approach a computer science or engineering problem—which helps explain the general reluctance to research the topic, and the not-so-uncommon attitude among artificial intelligence researchers that the topic is ‘‘mushy,’’ ill-formed, or worst of all, unimportant. Art and entertainment are considered as fun, not serious pursuits. This view is shortsighted. On the contrary, stories and art are among the most serious and meaningful pursuits we have. We communicate ideas and experiences to each other in this way. The fact that the problem is, to a degree, mushy and unquantifiable, makes it all the more challenging and difficult to undertake. This paper puts forth virtual characters as an emerging form of man-made artifact with emotional content. Virtual characters go by a few other names in the AI research community, such as believable agents, synthetic actors, and synthetic personalities. These are embodied autonomous agents that a user can interact with in some
fashion, animated as real-time graphics or built as physical robots, which appear to have personality, emotion, and motivation, designed to be used in art or entertainment. Over the past decade, several media labs have been exploring the issues involved in building virtual characters (such as Bates 1992; Blumberg 1997; Perlin 1995; Hayes-Roth et al. 1996; Goldberg 1997; Elliott et al. 1998). Some groups have designed architectures and implemented prototypes that have been demonstrated at academic conferences. But it should be made clear that in the business of creating emotional artifacts, in the final analysis, prototypes and demos are not enough. The point of creating emotionally powerful experiences, whether interactive or not, is to induce a reaction in an audience, in ‘‘users.’’ These creations must be experienced by the general public to serve the purpose for which they were created in the first place. Until this happens, the work created in closed-door media labs is ultimately incomplete. The public has been consuming interactive entertainment for two decades now, in the form of software products from the video game and computer game industry. Unfortunately, the experiences offered in these games are mostly juvenile, primarily focused on fighting, shooting, racing, and puzzle-solving, and the virtual characters offered in them are most often shallow, one-dimensional cardboard cutouts. Such games can be emotionally powerful experiences for those who play them, but they do not appeal to the majority of the population. Very few successful pieces of interactive entertainment or art have been made with emotional content for a mass audience—that is, the kind of ‘‘personal relationship’’ stories that books, theater, television, and movies offer, or the kind of ‘‘high art’’ exhibited at museums and art shows (Stern 1999a; Mateas 1999). This chapter suggests new ways to employ virtual characters to create emotionally powerful interactive experiences, using our Virtual Petz and Babyz projects as case studies. The techniques used to create these projects will be presented and discussed, with emphasis on the importance of design. Finally, we will attempt to address the question of what it could mean for a human to have an emotional relationship with a virtual character.
12.1 Do You Feel It? The Case for Emotional Relationships
Animation and artificial intelligence technologies for creating real-time interactive virtual characters are currently being researched
and developed in academic labs and industry companies. We are told that soon we will have virtual humans that look photorealistic, with behavior driven by some degree of AI. Many are working with the intention that these characters will become functional agents, at our command to perform a variety of complicated and menial tasks. They will learn our likes and dislikes and be able to autonomously communicate and negotiate with others. And they will become teachers in the virtual classroom, always ready and willing to answer our questions. At first glance, it seems natural that adding ‘‘emotions’’ to these virtual characters should greatly enhance them. After all, real people have emotions, so virtual human characters should have them too. Some researchers in neuroscience and psychology point to emotion as an important factor in problem-solving capabilities and intelligence in general (Damasio 1994). As Marvin Minsky put it, ‘‘the question is not whether intelligent machines can have emotions, but whether machines can be intelligent without any emotions’’ (Minsky 1985). Virtual characters may very well need emotions to have the intelligence to be useful. But when thinking in terms of a virtual character actually interacting with a user, is emotional behavior really appropriate for these types of applications? In real life it is arguable that interactions with ‘‘functional agents’’ (e.g., waiters, butlers, secretaries, travel agents, librarians, salespeople) are often best when emotions are not involved. Emotional reactions can often be irrational, illogical, and time consuming, which work against the efficient performance of tasks. Of course, in any transaction, politeness and courtesy are always appreciated, but they hardly qualify as emotional. We expect teachers to be a bit more personable and enthusiastic about their material than a travel agent, but do we want them to get angry or depressed at us? Although emotions may be required for intelligence, I would argue that the most compelling interactions with virtual characters will not be in the area of functional agents. If a user encounters a virtual character that seems to be truly alive and have emotions, the user may instead want to befriend the character, not control them. Users and interactive virtual characters have the potential to form emotional relationships with each other—relationships that are more than a reader’s or moviegoer’s affinity for a fictional character in a traditional story, and perhaps as meaningful as a friendship between real people. By an emotional relationship, we mean a
set of long-term interactions wherein the two parties pay attention to the emotional state of the other, communicate their feelings, share a trust, feel empathetic, and establish a connection, a bond.
Virtual Friends
The recent success of several ‘‘virtual pet’’ products, popular among both kids and adults, offers some support for this idea. The most sophisticated of these characters are animated on the computer screen, such as Dogz and Catz (PF.Magic/Mindscape 1995–99), and Creatures (Grand, Cliff, and Malhotra 1997), but some are displayed on portable LCD keychain toys or even embodied as simple physical robots, such as Tamagotchi (Bandai 1996), Furby (Tiger Electronics 1998), and Aibo (Sony 1999). Users ‘‘nurture’’ and ‘‘play’’ with these pets, feeding them virtual food, petting them with a virtual hand, and generally giving them attention and care lest they run away or die. Although there can be some blurring into the domain of video games, in their purest form, virtual pets are not a game, because they are nongoal-oriented; it is the process of having a relationship with a virtual pet that is enjoyable to the user, with no end goal of winning to aim for. As of this writing, there have been no completed formal studies of virtual pets; a study is currently underway by Turkle (1999). In our experience with the Dogz and Catz products, our anecdotal evidence suggests that depending upon the sophistication of the virtual character, the emotional relationship that a user can form with it ranges anywhere from the attachment one has to a favorite plant to the bond between a master and their dog. Children are the most willing to suspend their disbelief and can become very attached to their virtual pets, playing with and feeding them every day. It is precisely those irrational, illogical, and time-consuming emotional interactions that may hamper a functional agent that are so engaging and entertaining here. (Please refer to the appendix of this paper to read real customer letters we have received about Petz.) Only a few Petz users, mostly technology-oriented adult men, have requested that their Petz be able to perform functional tasks such as fetching e-mail. What is most interesting about the phenomenon of virtual pets are not the toys and software themselves—some of which have minimal interactivity and little or no artificial intelligence driving them—but the fact that some people seem to want to form
emotional relationships with them. Some appear quite eager to forget that these characters are artificial and are ready and willing to engage in emotional relationships, even when some of the virtual pets offer little or no reward or ‘‘warmth’’ in return. This offers some promise for the public’s acceptance of the concept of a more advanced virtual friend. As commercially successful as these virtual pets are, it seems likely that emotional relationships at the level of favorite plants or pets will be far easier to accomplish than the level of friendship between two adults. An owner-to-pet relationship dynamic is much simpler than a person-to-person one, and much less communication is required between the two parties. Most important, the relationship is inherently unequal. An owner of a real pet chooses (even purchases!) their real-life cat or dog. Therefore, the act of buying a virtual pet as a toy or piece of software does not violate the hierarchy of a real-world owner-to-pet relationship. As people, we do not get to choose which other people will be friends with us. Friends, by definition, choose to be friends with one another. Therefore, even if we create an interactive virtual character that can perform all the behaviors required for an emotional relationship between human adults, as a man-made artifact that can be bought and sold, could a ‘‘true’’ friendship be formed at all? This is an open question that invites exploration.
Interactive Stories
Stories have long been our primary way to observe and understand our emotional relationships. If stories could be made interactive— where users could immerse themselves in virtual worlds with characters they could talk to, form relationships with, touch and be touched by, and together alter the course of events, literally creating a new story in real time—then we would have a new form of interactive entertainment that eclipses video games. Like traditional stories from books, theater, television, and movies, an interactive story would be affecting and meaningful, but made all the more personal because the user helped shape it and create it (Stern 1998). Virtual characters programmed to simulate the dynamics of emotional relationships could be used as starting points for creating interactive stories. In her book, Hamlet on the Holodeck: The Future of Narrative in Cyberspace, Janet Murray (1997) suggests
that interactive virtual characters ‘‘may mark the beginning of a new narrative format.’’ As a first step in this direction, instead of relying heavily on planning to generate story plots, as some previous story researchers have done (such as Meehan 1976; Pemberton 1989; Turner 1994), a developing and ongoing emotional relationship itself could serve as a narrative. For example, a user and a virtual character could meet and get to know each other, begin to develop trust for one another, and perhaps (accidentally or not) violate that trust in some way, causing the relationship to take a downturn. The relationship could progress from there in many ways, perhaps recovering, ending, cycling between highs and lows—much like real-life relationships. There are several traditional stories that follow this pattern, such as boy meets girl, boy and girl fall in love, boy loses girl, and so on.
Interactive Art
Autonomous interactive virtual characters are only just beginning to make their way into installation and performance art. The goal of Simon Penny’s 1995 ‘‘Petit Mal’’ was, according to the artist, ‘‘to produce a robotic artwork which is truly autonomous; which was nimble and had ‘charm,’’’ to give ‘‘the impression of being sentient’’ (Penny 1997). Petit Mal was a tall, thin robot on bicycle wheels, which could quietly and gently move about in a room, able to sense when it was approaching walls or people. Celebrated video artists Lynn Hershman and Bill Viola have begun experimenting with combining video imagery of people and some simple interactivity. Mark Boehlen and Michael Mateas (1998) recently exhibited ‘‘Office Plant #1,’’ a robot plant that will bloom, wither, and make sounds in response to the mood of your office environment. Other efforts include the large, destructive autonomous robots from the industrial performance art of Survival Research Labs, the RoboWoggles (Wurst and McCartney 1996), and the robot installations of Alan Rath. The potential for exploration and discovery in this area seems untapped and wide open.
12.2 I Need Your Love: Virtual Dogs, Cats, and Babies
Recognizing a dearth of consumer software with characters that displayed emotions or personality, the startup company PF.Magic was formed in 1992 with the mission to ‘‘bring life to entertainment.’’
We wanted to break out of the mold of traditional video games (e.g., flight simulators, sports games, running-jumping-climbing games, shooters, puzzle games) to create playful new interactive experiences with emotional, personality-rich characters.
Dogz and Catz
By 1995, the personal computer was powerful enough to support the real-time animation we felt was required for a convincing virtual character. The first Dogz program, originally conceived by company cofounder Rob Fulop and created by Adam Frank and Ben Resner, was a simple idea: an animated virtual dog that you could feed and play with. As a product, it was very risky; an emotional relationship as the sole basis of an interactive experience had never been done before. It was unknown at the time if anyone would pay money to interact with a virtual character in this way. The program quickly generated interest from a wide range of customers—male and female, kids and adults—which is typically unheard of in entertainment software. We followed up Dogz with a companion product, Catz, establishing the explicit design goal to create the strongest interactive illusion of life we could on a PC. To imbue the Petz characters with personality and emotion we began cherry-picking techniques from computer animation and artificial intelligence, to construct what eventually became a powerful realtime animation engine tightly integrated with a goal-based behavior architecture. The Virtual Petz characters are socially intelligent autonomous agents with real-time 3-D animation and sound. By using a mouse, the user moves a hand-shaped cursor to directly touch, pet, and pick up the characters, as well as use toys and objects. Petz grow up over time on the user’s computer desktop and strive to be the user’s friends and companions. The interaction experience is nongoal oriented; users are allowed to explore the characters and their toys in any order they like within an unstructured yet active play environment. This freedom allows users to socialize with the Petz in their own way and at their own pace. This also encourages users to come up with their own interpretation of their pet’s feelings and thoughts. To date, the Virtual Petz products (figure 12.1) have sold more than two million copies worldwide.
Figure 12.1 Virtual Petz.
The goal of the Petz characters is to build an emotional relationship with the user. Their behaviors are centered around receiving attention and affection. They feed off of this interaction. Without it they become lethargic, depressed, and if ignored for long enough, they will run away. The most direct way the user can show affection to the Petz is through petting. By holding down the left mouse button, users can pet, scratch, and stroke with a hand cursor; the Petz immediately react in a variety of ways depending on what spot on their body is being petted, how fast, and how they feel at the time. Users can also pick up the characters with the right mouse button and carry them around the environment. We found that being able to (virtually) touch and hold the characters was a very effective way of building emotional relationships and creating the illusion of life. The Petz have equal footing in their relationship with the user. The toys and objects in their environment have direct objectlike interaction for both the user and the characters. Petz have full access to the toy shelf, and if they really want something, they have the freedom to get it themselves. This helps express the unpredictability and autonomous nature of the Petz. It also requires users to share control of the environment with them. For example, by picking up and using a toy, the user can initiate play. Throwing a ball may initiate a game of fetch, or holding a tugtoy in
front of a pet may begin a game of tug-of-war. Similarly, a pet can get its own toy and bring it to the user to initiate play. The act of sharing control of the environment and cooperative decision making helps further strengthen the relationship. We have created a variety of personalities—playful terriers, grumpy bulldogs, hyper Chihuahuas, lazy Persian cats, aggressive hunter cats, timid scaredy cats, and so on. Each individual character has its own likes and dislikes, spots and body coloration, and personality quirks. Users get to play with individual Petz to see if they like them before deciding to adopt. Once adopted, the user gives them a name. This individual variation allows the user to develop a unique relationship with a particular character. Each owner-pet relationship has the potential to be different.
Babyz
Our newest virtual characters, Babyz, released in October 1999 (figure 12.2), have the same nongoal-oriented play and direct interaction interface as the Petz. The user adopts one or more cute, playful babies that live in a virtual house on the computer. The Babyz have a similar cartoony style as the original Petz, but have more sophisticated, emotive facial expressions, and some simple natural language capability. Babyz vary in size, shape, and personality, and in some ways appear to be smarter and more clever than real one-year-old babies would be. They want to be fed, clothed, held, and nurtured, but are also quite playful and mischievous. Users can think of themselves as their parent or babysitter, whichever they feel most comfortable with. The user can nurture a Babyz character by holding and rocking it, tickling it, feeding it milk and baby food, putting on a fresh
Figure 12.2 Babyz.
diaper, giving it a bubble bath, soothing it if it gets upset, giving it medicine if it gets sick, laying it down to sleep in a crib, and so on. Play activities include playing with a ball, blocks, baby toys, music, dancing and singing, and dress-up. Through voice recognition, Babyz can understand and respond to some basic spoken words (such as ‘‘mommy,’’ ‘‘baby,’’ ‘‘yes,’’ ‘‘no,’’ ‘‘stop that’’), and can be read simple picture books. The Babyz characters develop over time, appearing to learn how to use toys and objects, learning to walk, and to speak a baby-talk language. In this version of the product, they will always be babies—never progressing beyond a stumble walk and simple baby talk. (All behaviors are preauthored, with the user’s interaction unlocking them over time, to create the illusion that the Babyz are learning.) If the user has more than one baby adopted, they can interact and form relationships with one another. Babyz can be friends and play nicely together or engage in long-term sibling rivalries. The program becomes an especially entertaining and chaotic experience with three active Babyz all getting into mischief at the same time (Stern 1999b).
Behaviors to Support Emotional Relationships
To allow for the formation of emotional relationships between the user and the Petz and Babyz characters, we built a broad base of interactive behaviors. These behaviors offer the characters and the user the means of communicating emotion (or the appearance of emotion) to each other. This section will detail these interactions and behaviors, specifying in each the emotions we intended to be perceived by the user. Our hope is that by having the characters express emotion in a convincing and lifelike way, the user will instinctively feel empathetic; and at the same time, if given the means to express their own emotions in return, users will feel like they are connecting to the characters on an emotional level.

AFFECTION
Users can express affection to the characters by touching them and holding them with their mouse-controlled hand cursor. For Petz, the touching is petting; for Babyz, it is tickling. Petz express affection to the user by sweetly barking, meowing, or purring; licking and nuzzling the hand cursor; and bringing the user a toy. Babyz
will smile, giggle, and laugh, coo, act cute, and say ‘‘mama’’ or ‘‘dada’’ in a loving voice. When perceiving a lack of affection, Petz will howl and yowl, sounding lonely; Babyz will cry and say ‘‘mama’’ or ‘‘dada’’ in a sad tone of voice. The intent is for the user to perceive the feelings of love, warmth, happiness, and loneliness in the characters.

NURTURING
Users can feed, clothe, and give medicine to the characters. Petz express the need to be nurtured by acting excited when food is brought out; begging, acting satisfied and grateful after eating; or disgusted when they don’t like the food. Babyz may hold out their arms and ask for food, whine and cry if hungry or need a diaper change, and may throw and spit up food they don’t like. Users are meant to perceive feelings of craving, satisfaction, pleasure, gratefulness, dislike, and disgust.

PLAY
By picking up a toy, users can initiate play with one or more of the characters, such as a game of fetch or building blocks. Petz or Babyz may join the user’s invitation to play, or get a toy of their own and begin playing by themselves or with each other, waiting for the user to join them. A character may react if the user is ignoring them and instead playing with another character. Emotions intended to be perceived by the user include excitement, boredom, aggressiveness, timidity, laziness, and jealousy.

TRAINING
Users can give positive and negative reinforcement in the form of food treats, water squirts (for Petz), and verbal praise or discipline to teach the characters to do certain behaviors more or less often. Petz or Babyz are programmed to occasionally act naughty, to encourage users to train them. During these behaviors, users are meant to perceive the emotions of feeling rewarded, punished, pride, shame, guilt, and anger.
Effective Expression of Emotion
None of the aforementioned behaviors would seem believable to the user unless the characters effectively expressed convincing emotions. We found all of the following techniques to be
critical for successful real-time emotion expression in virtual characters.

EMOTION EXPRESSION IN PARALLEL WITH ACTION
During any body action (such as walking, sitting, using objects, etc.) Petz and Babyz characters can display any facial expression or emotive body posture, purr or cry, make any vocalization, or say any word in any of several emotional tones. This allows a baby character to look sad and say what it wants while it crawls toward the user. Catz can lick their chops and narrow their eyes as they stalk a mouse. Characters can immediately sound joyful when the user tickles their toes. We found that if a virtual character cannot immediately show an emotional reaction, it will not seem believable. Timing is very important.

EMOTION EXPRESSION AT REGULAR INTERVALS
Programming the characters to regularly pause during the execution of a behavior to express their current mood was a very effective technique. For example, while upset and crawling for a toy that it wants, Babyz may stop in place and throw a short tantrum. Or just before running after a ball in a game of fetch, Dogz may leap in the air with joy, barking ecstatically. These serve no functional purpose (in fact they slow down the execution of a behavior), but they contribute enormously to the communication of the emotional state of the character. Additionally, related to Phoebe Sengers’s concept of behavior transitions (1998), when Petz or Babyz finish one behavior and are about to begin a new one, they pause for a moment, appearing to ‘‘stop and think’’ about it, look around, and express their current mood with a happy bark or timid cower.

EMOTION EXPRESSION THROUGH CUSTOMIZATION OF BEHAVIOR
Some behaviors have alternate ways to execute, depending on the emotional state of the character. Mood may influence a character’s style of locomotion, such as trotting proudly, galloping madly in fear, or stalking menacingly. A hungry character may choose to beg for food if lazy, cry and whine for food if upset, whimper if afraid, explore and search for food if confident, or attack anything it sees if angry. The greater the number of alternate ways a character has to perform a particular behavior, the stronger and deeper the perceived illusion of life.
PRIORITIZATION OF EMOTION EXPRESSION, AND AVOIDANCE OF DITHERING
It is possible for a character to have multiple competing emotions, such as extreme fear of danger simultaneous with extreme craving for food. We found it to be most believable if ‘‘fear’’ has the highest priority of all emotion expression, followed by ‘‘craving’’ for food and then extreme ‘‘fatigue.’’ All other emotions such as ‘‘happiness’’ or ‘‘sadness’’ are secondary to these three extreme emotional states. It is also important that characters do not flop back and forth between conflicting emotions, else their behaviors appear incoherent (a minimal sketch of this prioritization scheme appears below).

THEATRICAL TECHNIQUES
Our characters are programmed to obey several important theatrical techniques, such as facing outward as much as possible, looking directly outward into the eyes of the user, and carefully positioning themselves relative to each other (‘‘stage blocking’’). If the user places a character offscreen, behind an object, or at an odd angle, the characters quickly get to a visible position and turn to face the user. If two characters plan to interact with one another, such as licking each other’s noses or giving each other an object, they try to do this from a side view, so the user can see as much of the action and emotion expression as possible in both characters. Similar techniques were identified in the context of virtual characters in Goldberg (1997).
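The prioritization and anti-dithering rules above might be sketched as follows; this is an illustrative Python fragment, not the shipped Petz/Babyz code, and the priority table and switching margin are invented.

    # Fixed expression priorities described above; lower number = higher priority.
    PRIORITY = {"fear": 0, "craving": 1, "fatigue": 2}   # everything else: 3

    class EmotionExpresser:
        """Chooses one emotion to express, with a fixed priority ordering and
        simple hysteresis so the character does not dither between competing
        emotions of similar strength (the margin value is an invented parameter)."""

        def __init__(self, switch_margin=0.15):
            self.current = None
            self.switch_margin = switch_margin

        def select(self, intensities):
            # intensities: dict of emotion name -> value in [0, 1]
            candidates = {e: v for e, v in intensities.items() if v > 0.0}
            if not candidates:
                self.current = None
                return None
            # Highest-priority class first, strongest intensity within the class.
            best = min(candidates, key=lambda e: (PRIORITY.get(e, 3), -candidates[e]))
            if self.current in candidates:
                same_class = PRIORITY.get(best, 3) == PRIORITY.get(self.current, 3)
                # Within the same class, only switch if the newcomer is clearly
                # stronger than the emotion already being expressed.
                if same_class and candidates[best] < candidates[self.current] + self.switch_margin:
                    best = self.current
            self.current = best
            return best

    expresser = EmotionExpresser()
    print(expresser.select({"happiness": 0.5, "fear": 0.3}))      # 'fear' wins on priority
    print(expresser.select({"happiness": 0.55, "sadness": 0.6}))  # 'sadness' (strongest)
    print(expresser.select({"happiness": 0.6, "sadness": 0.55}))  # still 'sadness' (no dithering)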
Animation and Behavior Architecture
Animation and behavior are tightly integrated in the Petz and Babyz architecture. An attempt was made during software development to construct a clean modular code structure; however, both time constraints and practical considerations forced us to at times adopt a more ad hoc, hackish approach to implementation. The lowest level in the architecture is an animation script layer where frames of real-time rendered 3-D animation are sequenced and marked with timing for sound effects and action cues, such as when an object can be grabbed during a grasping motion. Above this is a finite-state machine that sequences and plays the animation scripts to perform generic but complicated low-level behaviors such as locomotion, picking up objects, expressing emotions, and so on. The next level up is a goal-and-plan layer that controls the
finite-state machine, containing high-level behaviors such as ‘‘eat,’’ ‘‘hide,’’ or ‘‘play with object.’’ Goals are typically spawned as reactions to user interaction, to other events in the environment, or to the character’s own internal metabolism. Goals can also be spawned deliberately as a need to regularly express the character’s particular personality or current mood. At any decision point, each goal’s filter function is queried to compute how important it is for that goal to execute under the current circumstances. Filter functions are custom code in which the programmer can specify when a goal should execute. Part of the craft of authoring behaviors is balancing the output of these filter functions; it is easy to accidentally code a behavior to happen far too often or too seldom for believability. Alongside instantiated goals are instantiated emotion code objects such as ‘‘happy,’’ ‘‘sad,’’ and ‘‘angry.’’ Emotions have filter functions much like goals, but can also be spawned by the custom logic in states or goals. These emotion code objects themselves can in turn spawn new goals, or set values in the character’s metabolism. For example, the filter function of a pet’s ‘‘observe’’ goal may be activated in reaction to the user petting another pet. The ‘‘observe’’ goal is written to spawn a ‘‘jealousy’’ emotion if the other pet is a rival and not a friend. The ‘‘jealousy’’ emotion may in turn spawn a ‘‘wrestle’’ goal; any fighting that ensues could then spawn additional emotions, which may spawn additional goals, and so on. Goals are constantly monitoring what emotion objects are currently in existence to help decide which plans to choose and how they should be performed; states monitor emotions to determine which animations, facial expressions, and types of sound to use at any given moment. Note that by no means did we implement a ‘‘complete’’ model of emotion. Instead, we coded only what was needed for these particular characters. For example, the Babyz have no ‘‘fear’’ emotion, because acting scared was not necessary (or considered entertaining) for the baby characters we were making. Fear was necessary, however, for the Petz personalities such as the scaredy cat. The emotion lists for Petz and Babyz varied slightly; it would have been inefficient to try to have both use the same exact model. At the highest level in the architecture is the ‘‘free will’’ and narrative intelligence layer. This is custom logic that can spontaneously (using constrained randomness) spawn new goals and emotions to convey the illusion that the character has intent of its
own. This code is also keeping track of what goals have occurred over time, making sure that entertaining behaviors are happening regularly. It keeps track of long-term narratives such as learning to walk, sibling rivalries, and mating courtship.
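As a rough illustration of the goal-and-plan layer just described, the following sketch shows goals with filter functions, emotion objects that goals can spawn, and emotions that in turn spawn further goals (the ‘‘observe’’/‘‘jealousy’’/‘‘wrestle’’ chain from the text). All class names, signatures, and numbers here are assumptions made for the example; this is not the actual Petz/Babyz source code.

```python
# Illustrative sketch of the goal/emotion layer (assumed names, not product code).

class Goal:
    def __init__(self, name, filter_fn, on_activate=None):
        self.name = name
        self.filter_fn = filter_fn      # (world, character) -> importance score
        self.on_activate = on_activate  # optional: (world, character) -> [Emotion]

class Emotion:
    def __init__(self, name, intensity, spawn_goals=()):
        self.name = name
        self.intensity = intensity
        self.spawn_goals = list(spawn_goals)  # goals this emotion pushes onto the agenda

class Character:
    def __init__(self, goals):
        self.goals = list(goals)
        self.emotions = []

    def decision_point(self, world):
        """Query every goal's filter function and activate the most important one."""
        if not self.goals:
            return None
        importance, goal = max(
            ((g.filter_fn(world, self), g) for g in self.goals), key=lambda x: x[0])
        if importance <= 0:
            return None
        if goal.on_activate:
            for emotion in goal.on_activate(world, self):
                self.emotions.append(emotion)
                self.goals.extend(emotion.spawn_goals)  # e.g. jealousy -> wrestle
        return goal

# The chain described in the text: "observe" fires when the user pets another pet,
# spawns "jealousy" if that pet is a rival, and jealousy spawns a "wrestle" goal.
def observe_filter(world, character):
    return 0.8 if world.get("user_petting_other_pet") else 0.0

def observe_activate(world, character):
    if world.get("other_pet_is_friend"):
        return []
    return [Emotion("jealousy", 0.7,
                    spawn_goals=[Goal("wrestle", lambda w, c: 0.6)])]

pet = Character([Goal("observe", observe_filter, observe_activate)])
chosen = pet.decision_point({"user_petting_other_pet": True, "other_pet_is_friend": False})
# chosen.name == "observe"; pet now holds a "jealousy" emotion and a "wrestle" goal.
```

In a fuller version, goals and the underlying states would also consult the character's current emotion objects when choosing plans, animations, and sounds, as the text describes.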
Discussion
Petta: How much context do the behaviors of the babies have? Is it just isolated behaviors that the babies display—now this kind of behavior is selected, and the baby will be able to perform that, and then it will switch to just a totally different behavior? Or is there some kind of coverage of context? How much history does each single baby ‘‘remember?’’ Stern: Each behavior is always suggesting to the action selector how important it is for this behavior to happen. And so, the more hungry you get, the higher the chance to eat gets. Petta: What I was trying to get at was the action expression problem. Is there any specific thing—sort of that babies try to display the reason why they behave as they are behaving now and sort of getting towards the direction of trying to report a story of what happened so far and the reason why? Stern: I have only talked about the general goals. But a goal also has a plan. This does not really map perfectly to the traditional definition of a goal, perhaps. A goal has a plan, a goal has multiple plans. And how the plans are authored, exactly which animations are chosen and why, can give you some idea of what is the motivation behind it. Petta: Ok, but still: Each plan is really self-contained and static, and it is not modified according to the previous history. Stern: No, it does keep track of the history. Sloman: For how long? I mean, is it just a lot of five time-slices or something? Or is it actually developing over time a rich memory of behavior? Stern: Yes, it is keeping a simple memory basically of what happened. It has what we call an association matrix, where it associates successes and failures, or rewards and punishments with the behaviors, objects, and characters involved at that moment. Sloman: Could it not actually use that to form new generalizations? One of the rules it might have as its condition could be: if you have
ever been stroked, then do this—or something like that. Or maybe there would be a recency effect? Stern: There are algorithms written to decide whether a goal should happen or not. But over time, as different associations are made, the likelihood that a goal can happen can change. Riecken: You have microworlds, and so there is a little set of operations over a little set of domain objects, and you can then provide a mapping, so that there is a reinforcement, positive or negative. You won’t touch that object again, because you are reinforced inside that microworld. If the user of this product does not provide any reinforcement—positive or negative—while the entity is working with it or playing with it or whatever, does the baby or the dog formulate an opinion of the object, and how? Stern: Yes. At first, a character doesn’t have any association to any objects. As they encounter an object, they start with a default, neutral association. As the character interacts with the toy, the toy has its own smartness to reward or discipline. For example, it could make a loud noise, which a character can interpret as a good or bad thing, depending on the character’s personality, and other factors. The strength of the association builds up over time, the more negative a reaction gets.
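The ‘‘association matrix’’ Stern mentions can be pictured as a table of running scores keyed by (behavior, object or character), nudged toward a reward or punishment each time an interaction succeeds or fails. The update rule and all names below are assumptions for illustration, not the product's actual code.

```python
# Illustrative association matrix (assumed design, not the shipped implementation).
from collections import defaultdict

class AssociationMatrix:
    def __init__(self, learning_rate=0.2):
        # (behavior, target) -> score in roughly [-1, 1]; new pairs start neutral (0.0).
        self.scores = defaultdict(float)
        self.learning_rate = learning_rate

    def reinforce(self, behavior, target, reward):
        """reward > 0 for success/praise, reward < 0 for failure/discipline."""
        key = (behavior, target)
        # Move the score a fraction of the way toward the reward, so the
        # association strengthens gradually with repeated experiences.
        self.scores[key] += self.learning_rate * (reward - self.scores[key])

    def attitude(self, behavior, target):
        return self.scores[(behavior, target)]

# Example: a toy that makes a frightening noise is punished a few times; a goal's
# filter function could then use the score to make "play with toy" less likely.
matrix = AssociationMatrix()
for _ in range(3):
    matrix.reinforce("play_with", "noisy_toy", reward=-1.0)
print(matrix.attitude("play_with", "noisy_toy"))   # about -0.49 and falling
```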
12.3
Feeling Holistic: The Importance of Design
In creating an interactive emotional artifact, even the best animation and artificial intelligence technology will be lost and ineffective without a solid design. In this section, we discuss the importance of the overall design of an interactive experience to ensure that a virtual character’s emotions are effective and powerful to the user.
Concept and Context
The type of characters you choose and the context you present them in will have a great impact on how engaging and emotionally powerful the interactive experience is. Judging by the confusing and poorly thought-out concepts in many pieces of interactive entertainment today, we feel this is a design principle too often ignored. In our products, we were careful to choose characters that people immediately recognize—dogs, cats, and babies—which
allow users to come to the experience already knowing what to do. They immediately understand that they need to nurture and play with the characters. Even though Petz and Babyz are presented in a cartoony style, we made sure to keep their behavior in a careful balance between cartooniness and realism. This was important to maintain believability and the illusion of life; if the Petz stood up on their hind legs and began speaking English to one another, users would not have been able to project their own feelings and experiences with real pets onto the characters. One of our maxims was ‘‘if Lassie could do it, our Petz can do it.’’ That is, the Petz can do a bit more than a real dog or cat would normally do, but nothing that seems physically or mentally impossible. A very important design principle in Petz and Babyz for supporting emotional relationships is that users play themselves. Users have no embodied avatar that is supposed to represent them to the characters; the hand cursor is meant to be an extension of their real hand. The characters seem to ‘‘know’’ they are in the computer, and they look out at the user as if they actually see them. There is no additional level of abstraction here; you are you, and the characters are the characters. This is akin to a first-person versus third-person perspective. If the user had an avatar that they viewed from a third-person perspective, the other characters would be required to look at that avatar, not at the user directly, thereby weakening the impact of their emotional expression.
Direct, Simple User Interface
Petz and Babyz are almost completely devoid of the typical user interface trappings of most interactive entertainment products. To interact with the characters, users operate the hand cursor in a ‘‘natural,’’ direct way to touch and pick up characters and objects. No keyboard commands are required. All of the objects in the virtual world are designed to be intuitively easy to use; you can throw a ball, press keys on a toy piano, open cabinets, and so on.
Discussion
Bellman: I look at your demonstration, and I find it actually frustrating, because I am used to worlds where I am sharing the space
with the robots and with other people. So, your program seems very outside, and you can’t reach in. And then the avatar that you have, this disembodied hand, is a very impoverished version of an avatar where you are really there in the space. And that feels frustrating. Stern: Well, let me speak to that one. That is very intentional that you have only a hand. I think one of the reasons this works for people, and that people can form relationships with the characters is because you are yourself, you don’t take on an avatar. You are yourself. The characters that live in the computer know that they are in the computer, and they look out at you, and the hand is just an extension of your own hand. So, you are not role playing—not that avatars necessarily involve role playing, but typically, in a computer game, an avatar you control is a puppet or it’s a character that you must take on. Bellman: Yes. But you are speaking only of a certain kind of use of avatars, which is the bird’s eye view. In fact, in most avatar use, you are there in the scene, but you see your arm extending in front of you. Stern: Ok. I would call that an avatar. Bellman: That is an avatar? Ok. Because you have a representation in the space, and as you move, as you pick up things, as you look at things, you represent. Sloman: But these are just different forms of interaction. There is nothing right or wrong or good about either of them. Bellman: I was not trying to say it is wrong. What I want to say is that my personal reaction, after coming out of a world in which I am usually more embodied and I actually share with other people, was a sense of frustration.
Of course this simplicity limits the amount of expressivity offered to the user. We cannot make objects and behaviors that require more complicated operation, such as holding a baby and a milk bottle at the same time. While we could program some obscure arbitrary keyboard command sequence to accomplish this, we have chosen not to in order to keep the interface as pure and simple as possible. To allow the user more expressivity we would be required to add more intuitive interface channels, such as a data glove or voice recognition. In fact, instead of typing words to your Petz and Babyz (which you would never do in real life, of course),
the latest versions of the products allow you to speak some basic words to the characters. In general, we feel that user interface, not animation or artificial intelligence technology, is the largest impediment for creating more advanced virtual characters. With only a mouse and keyboard, users are very constrained in their ability to naturally express their emotions to virtual characters. When interacting with characters that speak out loud, users should be able to speak back with their own voice, not with typing. Unfortunately, voice recognition is still a bleeding-edge technology. In the future, we look forward to new interface devices such as video cameras on computer monitors that will allow for facial and gesture recognition (Picard 1997).
Natural Expression
When trying to achieve believability, we found it effective for characters to express themselves in a natural way, through action and behavior, rather than through traditional computer interface methods such as sliders, number values, bar graphs, or text. In Petz and Babyz, the only way the user can understand what the characters seem to be feeling is to interpret their actions and physical cues, in the same way an audience interprets an actor’s performance. We do not display bar graphs or text messages describing the characters’ internal variables, biorhythms, or emotional state. By forcing a natural interpretation of their behavior, we do not break the illusion of a relationship with something alive.
Favor Interactivity and Generativity over a High Resolution Image
In Petz and Babyz, we made a trade-off to allow our characters to be immediately responsive, reactive, and able to generate a variety of expressions, at the expense of a higher resolution image. Surprisingly, most game developers do not make this trade-off! From a product marketing perspective, a beautiful still frame is typically considered more important than the depth and quality of the interactive experience. Of course, there is a minimum level of visual quality any professional project needs, but we feel most developers place far too much emphasis on flashy effects such as lighting, shading, and visual detail (i.e., spectacle) and not enough emphasis on interactivity and generativity.
Purity versus ‘‘Faking It’’: Take Advantage of the Eliza Effect
The ‘‘Eliza effect,’’ the tendency for people to treat programs that respond to them as if they had more intelligence than they really do (Weizenbaum 1966), is one of the most powerful tools available to the creators of virtual characters. As much as it may aggravate the hard-core computer scientists, we should not be afraid to take advantage of this. ‘‘Truly alive’’ versus ‘‘the illusion of life’’ may ultimately be a meaningless distinction to the audience. Ninety-nine percent of users probably will not care how virtual characters are cognitively modeled—they just want to be engaged by the experience, to be enriched and entertained.
12.4
Conclusion
This chapter has put forth virtual characters as a new form of emotional artifact, and the arrival of emotional relationships between humans and virtual characters as a new social phenomenon and direction for story and art. The design and implementation techniques we found useful to support such emotional relationships in the Virtual Petz and Babyz projects have been presented. We will conclude with some final thoughts on what it could mean for a person to have an emotional relationship with a virtual character. Are relationships between people and virtual characters somehow wrong, perverse, even dangerous—or just silly? Again, we can look to the past to help us answer this. Audiences that read about or see emotional characters in traditional media—painting, sculpture, books, theater, television, and movies—have been known to become very ‘‘attached’’ to the characters. Even though the characters are not real, they can feel real to the audience. People will often cry when the characters suffer and feel joy when they triumph. When the written novel first appeared it was considered dangerous by some; today we find that this is not the case. However, television, a more seductive medium than the novel, has certainly captured free time in the lives of many people. Some consider the effect of television and video games on children’s development to be a serious problem; the media has even reported a few outrageous stories of people going to death-defying lengths to take care of their virtual pet Tamagotchis. Designers should be aware that man-made characters have the potential to have a powerful effect on people.
Why create artificial pets and humans? Isn’t it enough to interact with real animals and people? From our perspective on making Virtual Petz, this was not the point. Our intent was not to replace people’s relationships with real living things, but to create characters in the tradition of stuffed animals and cartoons. And while some people are forming emotional relationships with today’s virtual characters, by and large they are still thought of as sophisticated software toys that try to get you to suspend your disbelief and pretend they are alive. However, as we move toward virtual human characters such as Babyz, the stakes get higher. As of this writing, we have not yet received feedback from the general public on their feelings and concerns about Babyz. Also, the characters made so far have been ‘‘wholesome’’ ones, such as dogs, cats, and babies, but one could easily imagine someone using these techniques to create characters that could support other types of emotional relationships, from the manipulative to the pornographic. Inevitably this will happen. Of course, the promise and danger of artificial characters has long been an area of exploration in literature and science fiction, ranging from friendly, sympathetic characters such as Pinocchio and R2D2 to more threatening ones such as Frankenstein and HAL9000. As virtual characters continue to get more lifelike, we hope users keep in mind that someone (human) created these virtual characters. Just as an audience can feel a connection with the writer, director, or actor behind a compelling character on the written page or the movie screen, a user could potentially feel an even stronger connection to the designer, animator, and programmer of an interactive virtual character. For the artist, the act of creating a virtual character requires a deep understanding of the processes at work in the character’s mind and body. This has always been true in traditional art forms, from painting and sculpting realistic people to novels to photography and cinema, but it is taken to a new level with interactive virtual characters. As the artist, you are not just creating an ‘‘instantiation’’ of a character—a particular moment or story in the character’s life—you are creating the algorithms to generate potentially endless moments and stories in that character’s life. People need emotional artifacts. When the public gets excited about buzzwords like ‘‘artificial intelligence’’ or ‘‘artificial life,’’ what they are really asking for are experiences where they can interact with something that seems alive, that has feelings, that
they can connect with. Virtual characters are a promising and powerful new form of emotional artifact that we are only just beginning to discover.
Acknowledgments
Virtual Petz and Babyz were made possible by a passionate team of designers, programmers, animators, artists, producers, and testers at PF.Magic/Mindscape that includes Adam Frank, Rob Fulop, Ben Resner, John Scull, Andre Burgoyne, Alan Harrington, Peter Kemmer, Jeremy Cantor, Jonathan Shambroom, Brooke Boynton, David Feldman, Richard Lachman, Jared Sorenson, John Rines, Andrew Webster, Jan Sleeper, Mike Filippoff, Neeraj Murarka, Bruce Sherrod, Darren Atherton, and many more. Thanks to Robert Trappl and Paolo Petta for organizing such a fascinating and informative workshop.
Appendix: Real Customer Letters
I had a dog that was a chawawa and his name was Ramboo. Well he got old and was very sick and suffering so my parents put him to sleep. Ever since then I have begged my parents for a new dog. I have wanted one soo bad. So I heard about this dogz on the computer. I bought it and LOVE it!!! I have adopted 9 dogs. Sounds a bit to much to you ehhh? Well I have alot of free time on my hands. So far everyday I take each dog out one by one by them selves and play with them, feed them, and brush them, and spray them with the flee stuff. I love them all. They are all so differnant with differant personalitys. After I take them out indaviually then I take 2 out at a time and let them play with me with each other. Two of the dogs my great Dane and chawawa dont like to play with any of the other dogs but each other. This is a incrediable program. I had my parents thinking I was crazy the other night. I was sitting here playing with my scottie Ren and mutt stimpy and they where playing so well together I dont know why but I said good dog out loud to my computer. I think my parents wondered a little bit and then asked me what the heck I was doing. But thankz PF.Magic. Even though I cant have a real dog it is really nice to have some on my screen to play with. The only problem now is no one can get me away from this computer, and I think my on-line friendz are getting
a little mad cause im not chatting just playing fetch and have a great time with my new dogz. Thanks again PF.magic. I love this program and will recomend it to everyone I know!!!!!!! I am a teacher and use the catz program on my classroom PC to teach children both computer skills and caring for an animal. One of the more disturbed children in my class repeatedly squirted the catz and she ran away. Now the other children are angry at this child. I promised to try and get the catz back. It has been a wonderful lesson for the children. (And no live animal was involved.) But if there is any way to get poor Lucky to come homze to our clazz, we would very much appreciate knowing how to do it. Thanks for your help, Ms. Shinnick’s 4th grade, Boston, MA. Dear PF.Magic, I am an incredible fan of your latest release, Petz 3, I have both programs and in Janurary 1999, my cherised Dogz Tupaw was born. He is the most wonderful dogz and I thank you from the bottom of my heart, because in Janurary through to the end of April I had Anorexia and i was very sick. I ate and recovered because i cared so much about Tupaw and i wanted to see him grow up. I would have starved without you bringing Petz 3 out. Please Reply to this, it would mean alot to me. Oh, and please visit my webpage, the url is http://www.homestead.com/wtk/pets.html. Thankyou for releasing petz 3, Give your boss my best wishes, Sincerily, Your Number One Fan, Faynine. I just reciently aquired all your Petz programs and I think they are great! I really love the way the animals react. I raised show dogs and have had numerous pets of all kinds in my life and making something like this is great. I am a school bus driver and have introduced unfortunate kids to your program. Children who not only can they not afford a computer but they can’t afford to keep a pet either. This has taught them a tremendous amount of responsibilty. I am trying to get the school to incorporate your programs so as to give all children a chance to see what it is like to take care of a pet. It might help to put a little more compassion in the world. Please keep me updated on your newest releases. Thanks for being such a great company. Nancy M. Gingrich. Dear PF.Magic, Hello! My name is Caitlin, and I’m 10 years old. I have Dogz 1 and Catz 1, as well as Oddballz, and I enjoy them all very much. Just this morning was I playing with my Jester breed Catz, Lilly. But I know how much better Petz II is. For a while, I
thought I had a solution to my Petz II problem. I thought that if only I could get Soft Windows 95 for $200, that would work. Well, I took $100 out of my bank account (by the way, that’s about half my bank account) and made the rest. I cat-sit, I sold my bike, and I got some money from my parents. Anyway, I really, really love animals (I’m a member of the ASPCA, Dog Lovers of America, and Cat Lovers of America) but I can’t have one! That’s why I love Petz so much! It’s like having a dog or cat (or alien for that matter) only not. It’s wonderful! I have a Scrappy named Scrappy (Dogz), Chip named Chip (Dogz), Bootz named Boots (Dogz), Cocker Spaniel named Oreo (Dogz), Jester named Lilly (Catz), and Jester named Callie (Catz). And then every single Oddballz breed made. =) I don’t mean to bore you as I’m sure this letter is getting very boring. I would love SO MUCH to have Petz II. I really would. (At this point in the letter I’m really crying) I adopted 5 Catz II catz at my friend’s house, but I go over to her house so little I’m sure they’ll run away. I’d hate for them to run away. Is there anything I can do? I love my petz, and I’m sure they’d love Petz II. Thank you for reading this. Please reply soon. ~*~ Caitlin and her many petz ~*~. My husband went downtown (to Manchester) and found Catz for sale, and having heard so much about it he bought it on the spot. He put it on his very small laptop and came back from one of his business trips saying, ‘‘How many Dutchmen can watch Catz at once on a little laptop on a Dutch train?’’ The answer was TEN. I asked if any of them said, ‘‘Awww,’’ the way we all did, but he said they all walked off saying it was silly. I bet they ran out to buy it anyway, though! Yours, Mrs. H. Meyer. Dear Sirs, Just wanted to thank-you for the pleasure my petz have brought me. I am paralyzed from the neck down yet your program has allowed me too again enjoy the pleasure of raising my own dogz. I have adopted 5 so far. I love them equally as if they were real. Thanks again.
Discussion: The Role of Multiple Users
Elliott: You know, Microsoft is moving this field towards something like multi-user systems, maybe not at the same time, but systems where you can identify a user with a particular session. Do you see your software developing relationships, historical relationships with different users, recognized by their login names?
Stern: Well, it seems reasonable. One problem you have with multiple users—that’s why we shy away from the idea—is that it goes against the illusion of life that these pets and babies give. I mean, basically we try to do nothing to artificially set up the situation. So, even the idea of logging in as yourself is not a ‘‘natural’’ thing to do, it does not work for the kind of idea we are trying to implement here. Sloman: But if the operating system is doing that anyway, then the information could be made available . . . Stern: I think the better way would be to recognize the user’s emotion or recognize their actual face. Sloman: But that’s much harder to implement. Ball: The cost of misidentifying someone is relatively high, because you break the illusion completely, if it does not act the way it is supposed to act with you. Picard: But we sometimes misidentify each other, or at least I misidentify people from time to time. And, you know, we have evolved ways to show our embarrassment, and social cues when we make mistakes like that, so we can remediate these errors. Trappl: We have made this experience with our exhibit in the Technical Museum. We expected to have always one person interact with it. But children just rush in together and interact. And the same could happen here, that three children are sitting in front of the computer. Picard: Right: and it misrecognizes that as one person. Sloman: I could well imagine that in a family with, say, two or three kids and one computer, they might enjoy the opportunity to identify themselves somehow. They click on the ‘‘Joe,’’ ‘‘Fred,’’ ‘‘Mary,’’ or whatever button when they start up, and then . . . Bellman: Actually, that’s one of the advantages of having an avatar, because an avatar usually recognizes this, and we enter into the world through a body and things can look at you and see that it’s you. Sloman: That’s another way of doing it. The point is: There are many ways in which it could be done, and I could imagine some people would appreciate that. But I agree, it’s artificial. . . . Bellman: What I am a little worried about here, especially for training kids, (because we do have studies now about some of the impacts on kids of being in these environments and what they are
learning) is the question, basically: What are they going to do with these babies? They can develop some kind of a bizarre pattern, in order to get a response, in order to feel that they have authored something in this environment, they will create a very disturbed baby, just for the effect of being able to see: I have done something here, and it’s different from your baby. Stern: The hope is that you can make many different positive—not disturbed—babies, but maybe different happy babies. Bellman: It just seems to me really important, because half of what I think what kids are going to do is presumably that negative stuff. Ortony: Exactly. They rip the arms and hands off their dolls! Bellman: And the software toy is something that is more powerful and more reinforcing. It allows manipulation on the next level up in an interaction. Your ripped-off Barbie doll just lies there with dismembered parts. It does not sit there and cry indefinitely. In the environments that I am talking about, in addition to the authorship, there are other people present. And no matter how weird the group is, there are still some social warnings about odd or destructive or too violent behavior even in games, that actually helps. You have human beings there who also interact. Picard: I think kids need a safe place to explore the things they don’t want to try in the real world, things you don’t want to do to a real baby. It would be neat if this was a good enough model that behaves like a real baby. I never wanted to run experiments with my son and his emotions when he was born. And I could bring myself to do the ones that induce positive emotions, but I could not bring myself to do the ones that instill morbid fear on my own child. Whereas if I had a virtual model of my son, then I could maybe do some of these. I think of the behaviors that a lot of people are trying in these on-line worlds, where they can play around and then come back with more confidence in the real world. Bellman: Yes. But what’s really interesting is to watch the feedback from other human beings about these behaviors. And one of the first experiences I had when I came into these worlds as a novice, was watching the consequences of a virtual rape. So, here was a young man who was being tried in a public MUD by his peers. About a couple of hundred people were gathered in their avatars inside a room to judge his virtual rape. What was fascinating was the whole range of debate about: Well, is this really real? Or, is this not obscene? It was really a fascinating discussion. Part of what
went on was that eventually, after people had concluded it was serious enough, they decided they were going to do things like label him with a scarlet letter, put him into a virtual jail every time he showed up, all these different kinds of punishment. What happened was that 90 percent of the community locked against him, which meant that when he walked into a room, they couldn’t hear him, and they couldn’t see him. They decided not to close the MUD, they decided not to change the culture and punishment like having jails etc., but just 90 percent of the population, which means a couple of thousand people, locked against him. Strangely enough, he stayed in this situation (it turned out to be a 19-year-old boy). Did he think of leaving and coming back as a different actor? No. He came back always as himself, and I watched his amazing transformation. I watched him go from total defense—saying ‘‘oh, I was just playing around’’—to eventually, about a year later, to saying: ‘‘I never realized I could hurt somebody.’’ And then, eventually, the virtual community, very much like the pygmies, began to allow him back in, in a process that took a couple of months. People started hearing that he was basically an ‘‘ok guy’’ now and that he was much better behaved, and they started to open up. I think he had a real learning experience. Part of that came about because there was a community of people there. So, you can have your emotional agents too, but something about having these other human beings there is very powerful.
References
Bandai (1996): Tamagotchi keychain toy. On-line. Available: <http://www.bandai.com>. (Availability last checked 5 Nov 2002)
Bates, J. (1992): The Nature of Characters in Interactive Worlds and the Oz Project. TR CMU-CS-92-200, School of Computer Science, Carnegie Mellon University, Pittsburgh.
Bates, J. (1994): The Role of Emotion in Believable Agents. TR CMU-CS-94-136, School of Computer Science, Carnegie Mellon University, Pittsburgh.
Blumberg, B. (1997): Multi-level Control for Animated Autonomous Agents: Do the Right Thing . . . Oh, Not That . . . In R. Trappl and P. Petta, eds., Creating Personalities for Synthetic Actors. Springer, Berlin, Heidelberg, New York.
Boehlen, M., and Mateas, M. (1998): Office Plant #1: Intimate Space and Contemplative Entertainment. Leonardo 31 (5): 345–348.
Damasio, A. (1994): Descartes’ Error: Emotion, Reason, and the Human Brain. Putnam, New York.
Dautenhahn, K. (1997): Ants Don’t Have Friends—Thoughts on Socially Intelligent Agents. In K. Dautenhahn, ed., Proceedings of the 1997 AAAI Fall Symposium, Socially Intelligent Agents. Technical Report FS-97-02. AAAI Press, Menlo Park, Calif.
Egri, L. (1946): The Art of Dramatic Writing. Simon and Schuster, New York.
Elliott, C., Brzezinski, J., Sheth, S., and Salvatoriello, R. (1998): Story-Morphing in the Affective Reasoning Paradigm: Generating Stories Semi-Automatically for Use With ‘‘Emotionally Intelligent’’ Multimedia Agents. In C. Sierra, M. Gini, and J. S. Rosenschein, eds., Proceedings of the Second International Conference on Autonomous Agents (Agents ’98). ACM Press, New York, pp. 181–188.
Frank, A., and Stern, A. (1998): Multiple Character Interaction between Believable Characters. In Proceedings of the 1998 Computer Game Developers Conference, 215–224. Miller Freeman, San Francisco.
Frank, A., Stern, A., and Resner, B. (1997): Socially Intelligent Virtual Petz. In K. Dautenhahn, ed., Proceedings of the 1997 AAAI Fall Symposium, Socially Intelligent Agents, FS-97-02, pp. 43–45. AAAI Press, Menlo Park, Calif.
Goldberg, A. (1997): IMPROV: A System for Real-Time Animation of Behavior-Based Interactive Synthetic Actors. In R. Trappl and P. Petta, eds., Creating Personalities for Synthetic Actors. Springer, Berlin, Heidelberg, New York.
Grand, S., Cliff, D., and Malhotra, A. (1997): Creatures: Artificial Life Autonomous Software Agents for Home Entertainment. In W. L. Johnson, ed., Proceedings of the First International Conference on Autonomous Agents, 22–29. ACM Press, Minneapolis.
Hayes-Roth, B., van Gent, R., and Huber, D. (1997): Acting in Character. In R. Trappl and P. Petta, eds., Creating Personalities for Synthetic Actors. Springer, Berlin, Heidelberg, New York.
Kline, C., and Blumberg, B. (1999): The Art and Science of Synthetic Character Design. In F. Nack, ed., Proc. Symposium on AI and Creativity in Entertainment and Visual Art, 1999 Convention of the Society for the Study of Artificial Intelligence and the Simulation of Behavior (AISB), Edinburgh, Scotland. Univ. of Edinburgh, Edinburgh, Scotland.
Loyall, A. (1997): Believable Agents: Building Interactive Personalities. Ph.D. diss., Carnegie Mellon University, CMU-CS-97-123, Pittsburgh.
Mateas, M. (1999): Not Your Grandmother’s Game: AI-Based Art and Entertainment. In D. Dobson and K. Forbus, eds., Proceedings of the 1999 AAAI Spring Symposium, Artificial Intelligence and Computer Games, Technical Report SS-99-02, pp. 64–68. AAAI Press, Menlo Park, Calif.
Meehan, J. (1976): The Metanovel: Writing Stories by Computer. Ph.D. diss., Department of Computer Science, Yale University.
Minsky, M. (1985): The Society of Mind. Simon and Schuster, New York.
Murray, J. (1997): Hamlet on the Holodeck: The Future of Narrative in Cyberspace. Free Press, New York.
Pemberton, L. (1989): A Modular Approach to Story Generation. In The Fourth Conference of the European Chapter of the Association for Computational Linguistics, 217–224. BPCC Wheatons, Exeter.
Penny, S. (1997): Embodied Cultural Agents: At the Intersection of Robotics, Cognitive Science, and Interactive Art. In K. Dautenhahn, ed., Proceedings of the 1997 AAAI Fall Symposium, Socially Intelligent Agents, FS-97-02, pp. 43–45. AAAI Press, Menlo Park, Calif.
Perlin, K. (1995): Real-Time Responsive Animation with Personality. IEEE Trans. Visualization Comput. Graphics. 1 (1): 5–15.
PF.Magic/Mindscape. (1995–99): Dogz and Catz: Your Virtual Petz; Babyz. On-line. Available: <http://www.petz.com> and <http://www.babyz.net>. (Availability last checked 5 Nov 2002)
Picard, R. (1997): Affective Computing. MIT Press, Cambridge.
Reilly, W. (1996): Believable Social and Emotional Agents. Ph.D. thesis. TR CMU-CS-96-138, School of Computer Science, Carnegie Mellon University, Pittsburgh.
Rilke, R. (1934): Letters to a Young Poet. W. W. Norton, New York.
Sengers, P. (1996): Socially Situated AI: What It Means and Why It Matters. In H. Kitano, ed., Proceedings of the 1996 AAAI Symposium, Entertainment and AI/A-Life. Technical Report WS-96-03, pp. 69–75. AAAI Press, Menlo Park, Calif.
Sengers, P. (1998): Do the Thing Right: An Architecture for Action-Expression. In C. Sierra, M. Gini, and J. S. Rosenschein, eds., Proceedings of the Second International Conference on Autonomous Agents, 24–31. ACM Press, New York.
Sony Electronics. (1999): Aibo robot dog. On-line. Available: <http://www.aibo.com>. (Availability last checked 6 July 2002.)
Stern, A. (1998): Interactive Fiction: The Story Is Just Beginning. IEEE Intell. Syst. 13 (5): 16–18.
Stern, A. (1999a): AI Beyond Computer Games. In Proceedings of the 1999 AAAI Spring Symposium, Artificial Intelligence and Computer Games, SS-99-02, pp. 77–80. AAAI Press, Menlo Park, Calif.
Stern, A. (1999b): Virtual Babyz: Believable Agents with Narrative Intelligence. In M. Mateas and P. Sengers, eds., Proceedings of the 1999 AAAI Fall Symposium, Narrative Intelligence. Technical Report FS-99-01, pp. 52–58. AAAI Press, Menlo Park, Calif.
Stern, A., Frank, A., and Resner, B. (1998): Virtual Petz: A Hybrid Approach to Creating Autonomous, Lifelike Dogz and Catz. In C. Sierra, M. Gini, and J. S. Rosenschein, eds., Proceedings of the Second International Conference on Autonomous Agents, 334–335. ACM Press, New York.
Thomas, F., and Johnston, O. (1981): Disney Animation: The Illusion of Life. Abbeville Press, New York.
Tiger Electronics. (1998): Furby toy. On-line. Available: <http://www.furby.com>. (Availability last checked 6 July 2002.)
Turkle, S. (1999): Ongoing Virtual Pet Study. On-line. Available: <http://web.mit.edu/sturkle/www/vpet.html>. (Availability last checked 6 July 2002.)
Turner, S. (1994): The Creative Process: A Computer Model of Storytelling and Creativity. Lawrence Erlbaum Associates, Hillsdale, N.J.
Weizenbaum, J. (1966): Eliza. Commun. ACM 9: 36–45.
Wurst, K., and McCartney, R. (1996): Autonomous Robots as Performing Agents. In H. Kitano, ed., Proceedings of the 1996 AAAI Symposium, Entertainment and AI/A-Life. AAAI Press, Menlo Park, Calif.
Concluding Remarks Robert Trappl
A diversity of fascinating topics has been covered in this book. If the reader wants to delve deeper, it is recommended to follow up on the publications of the contributing scientists or to contact them directly. More information about these scientists can be found in the Contributors section following this chapter. The contributors and the editors also have compiled a list of Recommended Readings that provides more detailed information on specific aspects of emotions in humans, animals, and/or artifacts. Finally, three possible remaining questions and their answers (or vice versa) shall be mentioned: First, will emotions research have an impact on our self-image, especially on our view of the function of our consciousness? While some scientists (e.g., Steven Pinker 1997) assume that the mind of Homo sapiens lacks the cognitive equipment to solve the puzzle of consciousness because our minds evolved by natural selection to solve problems that were life-and-death matters to our ancestors, Antonio Damasio (1999, p. 285) offers an interesting alternative explanation: ‘‘Knowing a feeling requires a knower subject. In looking for a good reason for the endurance of consciousness in evolution, one might do worse than say that consciousness endured because organisms so endowed could ‘feel’ their feelings. I am suggesting that the mechanisms which permit consciousness may have prevailed because it was useful for organisms to know of their emotions.’’ Where is consciousness located in the brain? Numerous experiments and observations with extremely sensitive recording instruments show that consciousness is not vaguely distributed in the brain but located in a definite place, the associative cortex. The associative cortex, however, does not look very different from other cortical areas and looks even more similar to the cerebellum (Roth 2001). Why, then, should consciousness be located in this area? The very likely reason is that the associative cortex is the only part of the cortex which is strongly interconnected with the limbic system, where the emotional evaluation system of the brain is located.
We therefore may conclude that both cognition and emotion are necessary prerequisites of consciousness. Second, an even closer look at the cortex of the human brain reveals another fact related to the topic of this book: The number of afferent and efferent fibers to and from cortical neurons—the ‘‘input/output channels’’—amounts to at most 100 million. In contrast, the ca. 50 billion neurons in the cortex are strongly interconnected, the number of connections amounting to 5 × 10^14. Given that, we find a ratio of one afferent/efferent fiber to every 5 million intracortical fibers, that is, 5 × 10^14 connections divided by 10^8 input/output fibers (Schütz 2000). What does this mean? It means that the input/output processes represent but a minute fraction of the processes going on within the cortical system! This is in total contrast to how the vast majority of researchers and developers construct emotional and intelligent artifacts. Much money and energy is invested in the ‘‘surface’’; much effort is also invested in the development of both sensory detection devices and means for very ‘‘realistic’’ outputs. Perhaps this insight about the human cortex should lead us to focus (again?) more on the ‘‘deep’’ structure, for instance, developing further, more complex models of cognition and emotion and their interplay. Finally, the main aim covered by most of the contributors to this book is the development of emotional, intelligent artifacts. With respect to computers, de Rosis (2002) describes this process as eventually leading to ‘‘the kind of computer that is beautiful, warm, has a sense of humor, is able to distinguish good from evil, feels love and induces the user to fall in love with it, etc.’’ Replace ‘‘computer’’ by ‘‘robot’’ or ‘‘synthetic actor’’ and there is a homunculus, as in Spielberg’s movie Artificial Intelligence. The issue of homunculi/ae of this kind is quite old, especially with men falling in love with attractive ‘‘women,’’ ranging from Galatea to Olympia in ‘‘The Tales of Hoffmann’’ to the replicant in Ridley Scott’s Blade Runner. There are already some speech systems available which sound quite natural, especially if fragments of ‘‘canned speech’’ are used. Synthetic actors increasingly make a humanlike impression—is it really desirable that humans are so impressed by them as to fall in love with them? The European Union requires that the labels on food packages inform the consumer if even only a small percentage of the food contained is genetically altered. Given the progress in computer animation and the slower but, nevertheless, existing progress in
synthesizing humanlike personalities, the time is ripe to consider the request for a mandatory declaration: Synthetic actors should declare that they are synthetic. This concern takes on alarming proportions when considering children. Sherry Turkle (1998, 2000) sees a special risk in, as she calls them, ‘‘relational toys’’: Until now, for example, girls could learn parental discipline when playing with a doll, attributing to this doll emotions they themselves experience when interacting with a parent. Now the relational dolls declare that they have emotions, they express them, and the child has to cope with these emotions. She is rewarded when she induces certain emotions in the doll through a specific behavior. But how would the behavior of a child be affected if the child abuses a doll and is then rewarded? The big U.S. toy company Hasbro has decided that its dolls will not respond if they are abused. Is this enough? How long will it be until another company does not stick to such a moral code? In conclusion, research on emotions in humans and artifacts is definitely not l’art pour l’art, but rather research with potentially far-reaching consequences. Therefore, researchers in this area especially have a moral obligation to bear these implications in mind.
References
Damasio, A. R. (1999): The Feeling of What Happens. Body and Emotion in the Making of Consciousness. Harcourt Brace, New York.
Pinker, S. (1997): How the Mind Works. Penguin, New York.
de Rosis, F., ed. (2002): Toward Merging Cognition and Affect in HCI. Special Double Issue of Applied Artificial Intelligence, 16 (7 & 8).
Roth, G. (2001): Fühlen, Denken, Handeln. Wie das Gehirn unser Verhalten steuert. Suhrkamp, Frankfurt am Main.
Schütz, A. (2000): What can the cerebral cortex do better than other parts of the brain? In G. Roth and M. F. Wulliman, eds., Brain Evolution and Cognition. Wiley-Spektrum Akademischer Verlag, New York, Heidelberg.
Turkle, S. (1998): Cyborg Babies and Cy-Dough-Plasm: Ideas about Self and Life in the Culture of Simulation. In R. Davis-Floyd and J. Dumit, eds., Cyborg Babies: From Techno-Sex to Techno-Tots. Routledge, New York.
Turkle, S. (2000): The Cultural Consequences of the Digital Economy. Invited lecture at the mobilcom austria Conference, 9 November 2000, Hofburg, Vienna, Austria.
Contributors
Gene Ball
E-mail:
[email protected] Gene Ball was a senior researcher at Microsoft Corporation until January 2001. He earned his bachelor’s degree in Mathematics from the University of Oklahoma and his master’s degree in Computer Science from the University of Rochester, where he also did his Ph.D. studies, which he completed in 1982. He worked as a research computer scientist at Carnegie Mellon University from 1979 to 1982 and as a software designer for the company Formative Technologies from 1983 to 1984. From 1985 to 1991, he was assistant professor in Computer and Information Sciences at the University of Delaware at Newark, before joining Microsoft Corporation in 1992, first as researcher and then, from 1995 onward, as senior researcher. He has been active in the Persona Project at Microsoft Research and, between 1994 and 1998, has organized four ‘‘Lifelike Computer Characters’’ conferences. Kirstie L. Bellman
E-mail:
[email protected] Kirstie L. Bellman is Principal Director of the Aerospace Integration Sciences Center at the Aerospace Corporation. She gained her bachelor’s degree from the University of Southern California and her Ph.D. from the University of California, San Diego—both in Psychology. She was NIH postdoctoral scholar and trainee in neuropsychology for three years. She worked as a researcher at the University of California, Los Angeles and for the Crump Institute for Medical Engineering. She joined The Aerospace Corporation in 1991 as a senior scientist. From 1993 to 1997, she started up the new Aerospace Integration Sciences Center for DARPA, which she is now heading. Her recent work focuses on the use of domain-specific languages and formally based architectural description languages to design and analyze information architectures. With a number of academic partners, she is also developing new mathematical approaches to the analysis of virtual worlds containing collaborating humans, artificial agents, and heterogeneous
representations, models, and processing tools. Lately, she has been working on reflective architectures that use models of themselves to manage their own resources and to reason about appropriate behavior. Lola Cañamero
E-mail:
[email protected] Lola (Dolores) Cañamero is Senior Lecturer in Computer Science at the University of Hertfordshire, UK. She received bachelor’s and master’s degrees in philosophy from the Complutense University of Madrid, and a Ph.D. in computer science from the University of Paris-XI. She worked as a postdoctoral associate at the MIT Artificial Intelligence Laboratory and at the VUB (Free University of Brussels) Artificial Intelligence Laboratory, and as a researcher at the Artificial Intelligence Institute of the Spanish Scientific Research Council. Her research lies in the areas of adaptive behavior and emotion modeling for autonomous and social agents (both robotic and synthetic). She has organized a number of symposia and workshops on this topic, and she is guest editor (with Paolo Petta) of the special issue of the Cybernetics and Systems Journal Grounding Emotions in Adaptive Systems, as well as coeditor of the book Socially Intelligent Agents: Creating Relationships with Computers and Robots. Clark Elliott
E-mail:
[email protected] Clark Elliott, associate professor of computer science at DePaul University, has conducted research in both theoretical and applied computer applications of emotion reasoning since 1989. He was among the first graduates of Northwestern University’s Institute for the Learning Sciences, receiving his degree there in 1992. Dr. Elliott was an early proponent of the use of emotion models in the design of believable agents in multi-agent systems, and in this capacity served on the program committees of numerous conferences that supported this area of research. His work on emotion representation has been applied to diverse subareas of AI such as intelligent tutoring systems, personality representation, story representation and generation, user modeling, and humor representation. His Affective Reasoner embodied real-time, interactive agents that used speech-generation, speech-recognition, music, and face-morphing techniques to communicate with users. Dr. Elliott founded DePaul’s Distributed Systems division in 1997, and is currently serving as its associate director. At the time of
publication he is on a temporary hiatus from his research while recovering from an accidental brain injury. Andrew Ortony
E-mail:
[email protected] Andrew Ortony was educated in Britain, gaining his bachelor’s degree from the University of Edinburgh, where he majored in philosophy, and then doing his Ph.D. in computer science at the University of London’s Imperial College of Science and Technology. His Ph.D. dissertation was concerned with graphical interface design. In 1973, he joined the faculty at the University of Illinois at Urbana-Champaign. There, with appointments in education and in psychology, he started to investigate questions having to do with knowledge representation and language understanding, concentrating in particular on the communicative functions of and the processes involved in the production and comprehension of nonliteral (especially metaphorical) uses of language. His approach to research problems is strongly interdisciplinary, as is evident from the diverse perspectives on metaphor represented in his edited book, Metaphor and Thought. In 1981, he started a long collaboration with Gerald Clore working on the relationship between emotion and cognition and culminating in the publication of their 1988 book (with Allan Collins), The Cognitive Structure of Emotions. Since moving to Northwestern University in 1989, he has maintained his interest in research on metaphor. At the same time, he has become increasingly interested in emotion research as it relates to various aspects of artificial intelligence, including the design of intelligent, emotional autonomous agents. Sabine Payr
E-mail:
[email protected] Sabine Payr holds a diploma as a conference interpreter from the University of Innsbruck, and a doctorate in linguistics from the University of Klagenfurt. Her international experience includes one year stays for studies (Paris), work (Brussels), and research (Berkeley). Professional activities range from conference interpreting to regional development initiatives, from IT training/consulting to research. Since 1987, she has been involved in interactive media in training and education, doing research and development in the field of educational technology in higher education, open and distance learning and tele-learning in vocational training and further education. Sabine Payr has worked at
the Institute for Interdisciplinary Research and Further Education IFF, the Austrian Federal Ministry of Education, Science, and Culture and the Research Center Information Technologies (FGI). She has been working at the Austrian Research Institute for Artificial Intelligence since spring 2000, in the framework of the project ‘‘An Inquiry into the Cultural Context of the Design and Use of Synthetic Actors.’’ She is currently also visiting professor at the University for Design (Linz/Austria). Paolo Petta
E-mail:
[email protected] Paolo Petta earned his master’s degree (1987) and doctorate (1994) in computer science from the Technical University of Vienna. Since 1987, he has been working at the Austrian Research Institute for Artificial Intelligence, where he founded, in 1997, the research group Intelligent Software Agents and New Media, of which he is head. In 1989, he also joined the Department of Medical Cybernetics and Artificial Intelligence of the University of Vienna as an assistant professor. He has led a number of research projects in the field of autonomous intelligent agents, among them the development of a life-size improvising synthetic character for an interactive exhibit at the Technical Museum of Vienna. In 1997, he edited, together with Robert Trappl, the book Creating Personalities for Synthetic Actors. Rosalind W. Picard
E-mail:
[email protected] In 1984, Rosalind W. Picard earned a bachelor’s degree in electrical engineering with highest honors from the Georgia Institute of Technology and was named a National Science Foundation graduate fellow. She worked as a member of the technical staff at AT&T Bell Laboratories from 1984–87, designing VLSI chips for digital signal processing and developing new methods of image compression and analysis. Picard earned her master’s and doctorate, both in electrical engineering and computer science, from the Massachusetts Institute of Technology (MIT) in 1986 and 1991, respectively. In 1991, she joined the MIT Media Laboratory as an assistant professor, and in 1992 was appointed to the NEC Development Chair in Computers and Communications. She was promoted to associate professor in 1995, and awarded tenure at MIT in 1998. Her award-winning book, Affective Computing (MIT Press, 1997), lays the groundwork for giving machines the skills of
emotional intelligence. Rosalind W. Picard is founder and director of the Affective Computing Research Group at the MIT Media Laboratory. Douglas Riecken
E-mail:
[email protected] and
[email protected] Doug Riecken is a principal investigator and manager at the IBM T. J. Watson Research Center. Doug manages the Common Sense Reasoning and e-Commerce Intelligence Research Department. He has also established the Center of Excellence for Common Sense Reasoning while at IBM. Since 1987, Doug continues to work with Marvin Minsky on theories of mind and common sense reasoning with a fundamental focus on the role of emotions and instincts in memory, reasoning, and learning. Riecken is also a member of the graduate faculty at Rutgers University. Prior to joining IBM Research in 1999, Riecken served for 17 years as a principal investigator and manager at AT&T Bell Laboratories Research. He received his Ph.D. from Rutgers University working under Minsky. Edmund T. Rolls
E-mail:
[email protected] Edmund T. Rolls read preclinical medicine at the University of Cambridge, and performed research on brain function for a Ph.D. at the University of Oxford. He is now professor of experimental psychology at the University of Oxford, and a fellow and tutor at Corpus Christi College, Oxford. He is associate director of the Medical Research Council Oxford Interdisciplinary Research Centre for Cognitive Neuroscience. His research interests include the brain mechanisms of emotion and memory; the neurophysiology of vision; the neurophysiology of taste, olfaction, and feeding; the neurophysiology of the striatum; and the operation of real neuronal networks in the brain. He is author, with A. Treves, of Neural Networks and Brain Function (1998). In 1999, he published the much noted book The Brain and Emotion. His website is: <www.cns.ox.ac.uk>. Aaron Sloman
E-mail:
[email protected] Aaron Sloman received a B.S. in mathematics and physics first class in 1956 in Cape Town, and a Ph.D. in philosophy, Oxford 1962. He joined the faculty of the University of Birmingham, UK in 1991. He was Rhodes scholar at Balliol College, Oxford (1957–60), senior
scholar at St Antony’s College 1960–62, GEC professorial fellow 1984–86, and elected fellow of the American Association for AI in 1991. He was elected honorary life fellow of AISB in 1997, and fellow of ECCAI in 1999. Aaron Sloman is a philosopher and programmer trying to understand how minds evolved and what sorts of designs make them possible. Many papers on aspects of mind, emotions, representations, vision, architectures, evolution, and so on can be found at the Cognition and Affect website, <www.cs.bham.ac.uk/research/cogaff/>. Andrew Stern
E-mail:
[email protected] Andrew Stern is a designer and programmer of the interactive characters Dogz, Catz, and Babyz from PF.Magic in San Francisco. Along with his fellow creators Adam Frank, Ben Resner, and Rob Fulop, he has presented these projects at the Siggraph Art Gallery 2000, Digital Arts and Culture ’99, AAAI Narrative Intelligence Symposium ’99, Autonomous Agents ’98, and Intelligent User Interfaces ’98. The projects have received press coverage from the New York Times, Time Magazine, Wired, and AI Magazine. Babyz received a Silver Invision 2000 award for Best Overall Design for CD-Rom; Catz received a Design Distinction in the first annual I.D. Magazine Interactive Media Review, and along with Dogz and Babyz was part of the American Museum of the Moving Image’s Computer Space exhibit in New York. Andrew Stern is currently collaborating with Michael Mateas on an interactive drama project, ‘‘Facade.’’ He holds a B.S. in computer engineering with a concentration in filmmaking from Carnegie Mellon University and a master’s degree in computer science from the University of Southern California. His website can be found at: <www.interactivestory.net>. Robert Trappl
E-mail: [email protected]
Robert Trappl is professor and head of the Department of Medical Cybernetics and Artificial Intelligence, University of Vienna, Austria. He is director of the Austrian Research Institute for Artificial Intelligence in Vienna, which was founded in 1984. He holds a Ph.D. in psychology (minor in astronomy) and a diploma in sociology (Institute for Advanced Studies, Vienna), and is an electrical engineer. He has published more than 130 articles and is coauthor, editor, or coeditor of 28 books, the most recent being Power, Autonomy, Utopia: New Approaches toward Complex Systems (Plenum, New York); Cybernetics and Systems 2002 (ÖSGK, Vienna); and Advanced Topics in Artificial Intelligence, Creating Personalities for Synthetic Actors, and Multi-Agent Systems and Applications (these three published by Springer, Heidelberg, New York). He is editor-in-chief of Applied Artificial Intelligence: An International Journal and Cybernetics and Systems: An International Journal, both published by Taylor and Francis, United States. His main research focus at present is the development and application of artificial intelligence methods to aid decision makers in preventing or ending wars, and the design of emotional personality agents for synthetic actors in films, television, and interactive computer programs. He lectures and works as a consultant for national and international companies and organizations (OECD, UNIDO, WHO).
Name Index
Agre, P. E., 251, 256, 269, 207, 271, 277 Albus, J. S., 40, 48, 51, 52, 89, 90, 95, 111 Alexander, R. D., 27, 33 Allen, S., 103 Allport, G. W., 202 Anderson, J. R., 267 Anderson, K. J., 204 Antoniou, A. A., 193, 257 Armon-Jones, C., 253 Arnott, J. L., 323 Ashby, W. R., 146 Atherton, D., 355 Aubé, M., 254 Averill, J. R., 192 Balkenius, C., 257 Ball, G., 8, 215, 219, 277, 280, 307, 319, 320, 328, 329, 358 Banse, R., 324 Barber, K. S., 170 Bargh, J. A., 251 Baron, R. M., 269 Bartle, R., 175 Bartlett, F. C., 258 Bates, J., 42, 111, 142, 262, 309, 335 Baumgartner, P., 1, 10 Beaudoin, L., 38, 44, 53, 67, 73, 74, 80, 103, 111, 112, 114 Beethoven, L. van, 290 Bellman, K., 5, 30, 64, 65, 105, 106, 143, 162, 164–166, 169–172, 174–175, 178, 180–181, 183, 185, 205–207, 215, 234, 278, 307, 319, 329, 350, 351, 358, 359–360 Bickhard, M. H., 277 Blumberg, B., 335 Boden, M., 103 Boehlen, M., 339 Bohr, N., 162 Bonasso, R. P., 261, 267 Booth, D. A., 23, 33 Boynton, B., 355 Bösser, T., 255, 271 Braitenberg, V., 124, 146, 162, 220 Brand, P. W., 228 Breese, J., 304, 311 Brooks, R. A., 52, 53, 90, 89, 112, 116, 146, 260, 271
Bruner, J., 165 Brzezinski, J., 335 Burgess, P., 26, 34 Burgoyne, A., 355 Cahn, J. E., 324 Cañamero, L. D., 4, 104, 108, 115, 117, 127, 123, 124, 129, 133, 137, 144, 147, 157, 161, 173, 182, 256, 262, 275, 277–278 Cantor, J., 355 Capra, F., 159, 167–168 Carpenter, G. A., 137, 147 Chandrasekaran, B., 44, 112 Chapman, D., 260 Chartrand, T. L., 251 Chesterton, G. K., 163 Chomsky, N., 270 Christopher, S. B., 28, 34 Churchland, P., 164, 169 Clancey, W. J., 256, 277 Cliff, D., 337 Clodius, J., 175, 177 Clore, G. L., 88, 113, 193, 132, 147, 222, 257, 262, 291, 309 Clynes, M., 227–228, 291 Colby, K. M., 190 Collins, A., 88, 113, 193, 132, 147, 222, 257, 262, 291, 309 Costa, P. T., 203, 309 Craik, K., 62, 112 Croucher, M., 73, 114, 261 Custodio, L., 257 Damasio, A. R., 1, 10, 77, 80, 84, 112, 122, 147, 146, 150, 153, 155–158, 163, 165–166, 168, 223, 251, 256, 280, 336, 363 Darrell, T., 272 Darwin, C., 192, 17, 22, 33 Dautenhahn, K., 175 Davies, D. N., 52, 112 Dawkins, R., 22, 34 Dennett, D. C., 39, 112, 40, 62, 70, 97, 103, 111 de Rosis, F., 364 Descartes, R., 159 de Sousa, R., 272 Donnart, J. Y., 116, 147
Doyle, J., 294 Drescher, G. L., 267 Dyer, M. G., 263 Earl, C., 267 Egri, L., 334 Ekman, P., 192, 196, 17, 146, 147 Elliott, C., 7, 8, 141, 160, 209, 215–216, 233, 254, 262–264, 267, 335, 357 Ellsworth, P. C., 257 Elsaesser, C., 261 Elster, J., 254 Endo, T., 89, 113 Eysenck, H. J., 204 Feldman, D., 355 Ferguson, I. A., 261, 267 Filippoff, M., 355 Firby, R. J., 261, 267, 269 Fiske, S. T., 266 Fodor, J., 47, 112, 48, 51, 52, 93, 94 Fogg, B. J., 327 Foner, L. N., 179 Foss, M. A., 193 Frank, A., 340, 355 Frank, R. H., 255 Franklin, S., 103 Freud, S., 97, 152 Fridlund, A. J., 18, 34 Frijda, N. H., 5, 8, 12, 34, 68, 72, 115, 120, 121, 127, 130, 132, 133, 146, 147, 254–255, 257–260, 266–268, 272, 277, 279 Frisby, J. P., 49, 112 Fulop, R., 340, 355 Gat, E., 271, 278 Geneva Emotion Research Group, 324 Gent, R. van, 262, 335 Georgeff, M. P., 261 Gershon, M. D., 227 Gibson, J. J., 49, 112, 269 Glasgow, J., 44, 112 Goldberg, A., 335, 346 Goldman-Rakic, P. S., 26, 34 Goodale, M., 55, 112 Gordon, A., 176 Grand, S., 337 Grandin, T., 165 Granit, R., 169 Gratch, J., 264 Gray, J. A., 204, 13, 14, 34 Griffin, D. R., 153 Gross, J. J., 251 Grossberg, S., 137, 147 Haidt, J., 255 Hall, L., 176
Halperin, J., 275 Harrington, A., 355 Hayes, P., 103 Hayes-Roth, B., 202, 262, 335 Heckerman, D., 311, 323 Hershman, L., 339 Hexmoor, H., 271, 278 Higgins, E. T., 203 Horswill, I., 251, 256, 270–271 Horvitz, E., 311 Huber, D., 262, 335 Humphreys, M. S., 204 Izard, C. E., 18, 34 James, W., 87, 154 Jenkins, J. M., 12, 85, 113 Jennings, N. R., 252, 260–261 Jensen, F. V., 311, 317 Jessel, T. M., 119, 147 Johnson, W. L., 179, 181 Johnson-Laird, P., 89, 112, 117, 147 Johnston, O., 334 Johnstone, B., 220 Johnstone, I. T., 324 Jose, P. E., 193, 257 Kacelnik, A., 25, 34 Kandel, E. R., 119, 147 Karmiloff-Smith, A., 44, 112 Keltner, D., 251, 255 Kemmer, P., 355 Kemper, T. D., 254 Kennedy, C., 103 Kim, J., 170 Köhler, W., 78, 112 Krebs, J. R., 25, 34 Lachman, R., 355 Landauer, C., 164–165, 170–172, 174–176, 178, 180–181 Lang, P., 309 Lansky, A. L., 261 Lauritzen, S. L., 317 Lazarus, R. S., 12, 34, 257–260, 267–269 Leak, G. K., 28, 34 LeDoux, J. E., 33, 120, 123, 147, 221, 253, 259 Lee, D., 55, 112 Lenat, D., 292 Leong, L., 175 Lester, J., 262, 264 Levenson, R. W., 255 Leventhal, H., 258–259, 265–266 Lindsay, P. H., 150, 153–154, 158, 162 Lishman, J., 55, 112 Lloyd, A. T., 28, 34
Logan, B., 103 Loyall, A. B., 42, 111, 262, 309 Macintosh, N. J., 14, 34 Macmahon, M., 272 Maes, P., 116, 147, 164, 169, 262 Malhotra, A., 337 Mandler, G., 118, 147, 153 Marr, D., 48, 50, 93, 112 Marsella, S., 302 Martinho, C., 262, 309 Mateas, M., 335, 339 Maturana, H., 159 Mauldin, M. L., 272 McArthur, L. Z., 269 McCarthy, J., 103, 111 McCaulley, M. H., 310 McCrae, R. R., 203, 309 McDermott, D., 144, 146, 38, 112, 40 McFarland, D. J., 255, 271 McGinn, C., 153, 165 Meehan, J., 339 Meyer, J. A., 116, 147 Michalski, R. S., 292 Millenson, J. R., 13, 34 Miller, D. P., 261 Miller, R., 169 Millington, I., 103 Milner, A., 55, 112 Minsky, M. L., 40, 52, 89, 100, 103, 112, 162, 291, 293, 294, 296, 297, 298, 336 Moffat, D., 141, 152 Morén, J., 257 Murarka, N., 355 Murray, I. R., 323 Murray, J., 338–339 Myers, J. B., 310 Nagel, T., 99, 112 Narayanan, N. H., 44, 112 Nass, C., 1, 10, 214, 305, 318–319 Nesse, R. M., 28, 34 Newell, A., 40, 112 Nii, H. P., 299 Nilsson, N. J., 40, 51, 52, 56, 77, 89, 90, 113 Norman, D. A., 150, 153–154, 157–158, 162 Norvig, P., 40, 113, 256 O’Brien, M., 175 O’Rorke, P., 262 Oatley, K., 12, 34, 85, 113, 117, 147 Odbert, H. S., 202 Ogden, C. K., 150–151, 167, 172 Okada, N., 89, 113 Olesen, K. G., 317
Ortony, A., 6, 29–33, 64, 66, 88, 103, 104, 108, 110, 113, 132, 147, 142, 146, 191, 193, 199, 206, 208–210, 216, 218, 219, 222, 226, 234, 235, 254–255, 257, 262, 277, 280, 291, 307, 309, 319, 328, 359 Osgood, C. E., 321 Paiva, A., 262, 309 Payr, S., 1, 10 Pednault, E., 302 Pemberton, L., 339 Penny, S., 339 Pentland, A., 324 Perlin, K., 181, 335 Pert, C., 162, 164 Peterson, D., 44, 113 Petrides, M., 26, 34 Petta, P., 2, 8, 10, 103, 141, 207–208, 255, 262, 272, 278, 280, 291, 348 Pfeifer, R., 118, 124, 128, 131, 133, 147, 260–261, 263 Picard, R., 2, 5–7, 10, 36, 65, 77, 84, 103, 109, 113, 125, 126, 143, 144, 147, 151, 165–166, 214–216, 219, 226–227, 233–235, 262, 291, 306, 307, 308, 319, 328, 329, 352, 358, 359 Pinker, S., 363 Pinto-Ferreira, C. A., 257, 278 Poincaré, H., 156–157 Poli, R., 103 Polichar, V. E., 175–176 Popper, K., 55, 113, 62 Pribram, K. H., 119, 147 Pryor, L. M., 269 Rath, A., 339 Read, T., 103 Reekum, C. M. van, 258–259 Reeves, B., 1, 10, 214, 305, 318–319 Reilly, W. S., 42, 111, 309 Resner, B., 340, 355 Revelle, W., 204 Richards, I. A., 150–151 Rickel, J., 262, 264 Riecken, D., 8, 146, 226, 278, 291, 292, 349 Rilke, R., 334 Riner, R., 175, 177 Rines, J., 355 Rolls, E., 3, 11–16, 19–21, 23, 24, 29–33, 44, 54, 64, 69, 70, 96, 105–107, 110, 111, 113, 121, 122, 132, 142, 145, 147, 150, 153–154, 158, 204, 205–206, 210, 226, 253, 255–257, 280, 291 Rommelse, K., 311 Rosch, E., 155 Roseman, I. J., 193, 257 Roth, G., 363
Rousseau, D., 202, 262 Roy, D., 324 Russell, S., 40, 113, 256 Ryle, G., 73, 113 Sacks, O., 150, 163, 165 Sagan, C., 162 Sakaguchi, H., 2 Salvatoriello, R., 335 Scherer, K. R., 124, 148, 193, 257–259, 261, 266, 279, 309, 324 Scull, J., 355 Schütz, A., 364 Schwartz, J. H., 119, 147, 176 Scott, R., 364 Searle, J., 183 Sengers, P., 345 Shakespeare, W., 87 Shambroom, J., 355 Shallice, T., 26, 34 Sherrod, B., 355 Sheth, S., 335 Shing, E., 103 Shneiderman, B., 143 Shwe, M., 311 Simon, H. A., 47, 73, 113, 117, 140, 147 Slack, M. G., 261 Sleeper, J., 355 Sloman, A., 4, 30–33, 35, 38, 43, 44, 46, 48, 49, 52, 53, 55, 56, 63–67, 69, 73, 80, 83, 90, 94, 98, 103–109, 113, 114, 123, 142, 144–146, 147, 161–162, 169, 191, 205–210, 216, 218–219, 226, 234, 256, 261–262, 291, 319, 329, 348, 351, 358 Smith, B., 165, 171 Smith, C. A., 257–259, 266, 268 Sorenson, J., 355 Staller, A., 255 Steels, L., 116, 148 Stern, A., 9, 146, 202, 214, 279–280, 335, 338, 343, 348, 349, 351, 358, 359 Strongman, K. T., 12, 34 Strawson, P. F., 98, 114 Suci, G. J., 321 Swaggart, J., 245 Sycara, K., 260–261 Tannenbaum, P. H., 321 Tartter, V. C., 324 Taylor, S. E., 266 Teasdale, J. D., 259 Thomas, F., 334 Thomas, L., 162 Thompson, E., 155 Tinbergen, N., 16, 34 Tomkins, S. S., 119, 133, 148
Trappl, R., 1, 2, 10, 103, 141, 235, 355, 358 Treves, A., 19, 34 Trivers, R. L., 28, 34, 255 Turkle, S., 176, 337, 365 Turner, S., 339 Turner, T. J., 146, 147 Varela, F., 155, 159 Velásquez, J. D., 117, 124, 148, 262 Ventura, R., 257, 278 Viola, W., 339 von Uexküll, J., 164 Walter, D. O., 162, 164 Walter, W. G., 220 Webster, A., 355 Wehrle, T., 124, 148 Weiskrantz, L., 13, 34 Weizenbaum, J., 353 Weizsäcker, C. F. von, 273 Wiggins, J. S., 309 Wilkins, D. E., 261 Wilson, S. W., 116, 148 Wittgenstein, L., 172 Wooldridge, M., 252, 260–261 Wright, I. P., 38, 44, 53, 73, 80, 103, 114, 262 Yancey, P., 228 Yu, S. T., 261 Zajonc, R. B., 258 Zukav, G., 183
Recommended Reading
Note: Some of the following books were recommended by a contributor, some by the editors. It therefore cannot be concluded that a book on this list is recommended by all contributors.
Arkin, R. C. (1998): Behavior-Based Robotics (Intelligent Robots and Autonomous Agents). MIT Press, Cambridge.
Cañamero, L., ed. (2002): Emotional and Intelligent II: The Tangled Knot of Social Cognition. Papers from the 2001 Fall Symposium, November 2–4, 2001, North Falmouth, Massachusetts. Technical Report FS-01-02, AAAI Press, Menlo Park, Calif.
Cañamero, L., and Petta, P., eds. (2001): Grounding Emotions in Adaptive Systems. Two Special Issues of Cybernetics and Systems, 32 (5) and (6).
Cassell, J., Sullivan, J., Prevost, S., and Churchill, E., eds. (2000): Embodied Conversational Agents. MIT Press, Cambridge.
Clancey, W. J. (1999): Conceptual Coordination: How the Mind Orders Experiences in Time. Lawrence Erlbaum Associates, Hillsdale, N.J.
Crawford, C. (2000): Understanding Interactivity. Self-published on-line. Available: http://www.erasmatazz.com. (Availability last checked 5 Nov 2002)
Dalgleish, T., and Power, M., eds. (1999): Handbook of Cognition and Emotion. Wiley, New York.
Damasio, A. R. (1994): Descartes’ Error. Putnam, New York.
Damasio, A. R. (1999): The Feeling of What Happens: Body and Emotion in the Making of Consciousness. Harcourt Brace Jovanovich, New York.
Dautenhahn, K., Bond, A. H., Cañamero, L., and Edmonds, B., eds. (2002): Socially Intelligent Agents: Creating Relationships with Computers and Robots. Kluwer Academic Press.
Elliott, C. D. (1992): The Affective Reasoner: A Process Model of Emotions in a Multiagent System. Ph.D. thesis, Northwestern University, Illinois. On-line. Available: ftp://ftp.depaul.edu/pub/cs/ar/elliott-thesis.ps. (Availability last checked 5 Nov 2002)
Frijda, N. H. (1986): The Emotions. Cambridge University Press, Editions de la Maison des Sciences de l’Homme, Paris.
Hauser, M. (2000): Wild Minds: What Animals Really Think. Henry Holt, New York.
Laurel, B. (1991): Computers as Theatre. Addison-Wesley, Reading, Mass.
LeDoux, J. E. (1996): The Emotional Brain. Simon and Schuster, New York.
Lewis, M., and Haviland-Jones, J. M., eds. (2000): Handbook of Emotions. 2nd ed. Guilford Press, New York, London.
McKee, R. (1997): Story: Substance, Structure, Style, and the Principles of Screenwriting. Harper Collins, New York.
Murray, J. (1997): Hamlet on the Holodeck: The Future of Narrative in Cyberspace. Free Press, New York.
Oatley, K., and Jenkins, J. M. (1995): Understanding Emotions. Basil Blackwell, Oxford, Cambridge.
Ortony, A., Clore, G. L., and Collins, A. (1988): The Cognitive Structure of Emotions. Cambridge University Press, Cambridge.
Paiva, A., ed. (2000): Affective Interactions: Towards a New Generation of Computer Interfaces. Springer, Berlin, Heidelberg, New York.
Picard, R. W. (1997): Affective Computing. MIT Press, Cambridge.
Pinker, S. (1997): How the Mind Works. Penguin, New York.
Powers, W. T. (1973): Behavior: The Control of Perception. Aldine, Chicago.
Reeves, B., and Nass, C. (1998): The Media Equation. CSLI Publications, Stanford, Calif.
Roth, G. (2001): Fühlen, Denken, Handeln. Wie das Gehirn unser Verhalten steuert. Suhrkamp, Frankfurt am Main. (Feeling, Thinking, Acting. How the Brain Controls our Behavior. Unfortunately, this excellent book has not (yet) been translated into English. RT)
Rolls, E. T. (1999): The Brain and Emotion. Oxford University Press, Oxford, London, New York.
Rosis, F. de, ed. (2002): Toward Merging Cognition and Affect in HCI. Special Double Issue of Applied Artificial Intelligence, 16 (7 & 8).
Scherer, K. R., Schorr, A., and Johnstone, T., eds. (2001): Appraisal Processes in Emotion: Theory, Methods, Research. Oxford University Press, Oxford, London, New York.
Sloman, A., ed. (2000): Proceedings of the AISB 2000 Symposium on ‘‘How to Design a Functioning Mind,’’ The University of Birmingham, UK.
Sousa, R. de (1987): The Rationality of Emotion. MIT Press, Cambridge.
Thomas, F., and Johnston, O. (1981): Disney Animation: The Illusion of Life. Abbeville Press, New York.
Trappl, R., and Petta, P., eds. (1997): Creating Personalities for Synthetic Actors. Springer, Berlin, Heidelberg, New York.
Subject Index
3T Architecture, 267, 270, 279. See also System architecture, layered trionic ACT*, 266 Action dual routes to, 3, 23, 25, 28 evasive, 41 (see also Action tendency) Action expression, 348 Action selection, 118–120, 140, 276–278. See also Activity selection, Behavior, selector of Action tendency, 254, 257, 259–260, 265, 267–269, 273, 277, 279–280. See also Behavioral inclination avoidance, 259 (see also Avoidance, active) Activation chemical (hormonal), 154 neural, 154 Activity selection, 115, 116, 123. See also Action selection Adaptation, 115, 257 Affection. See Emotions and feelings, affection Affective agent. See Agent, affective Affective art, 333–339, 349–350, 353–355 Affective artifact. See Artifact, affective Affective computing concerns, 326–330, 353–354, 364–365 ethics of, 327–330 impact, 326 Affective Reasoner, 8, 239, 241–246, 254, 262–264 Affective state. See State, emotional Affordance, 49, 56, 269–270 Agency, 127, 181, 258 Agent affective (see Agent, emotional) animated, 220 artificial, 115, 150–151, 157, 172–173, 183 (see also Artifact) autonomous, 4, 115, 157, 252, 255 (see also System, autonomous; Autonomy) believable, 6, 7, 238, 240–241, 246–249 (see also Believability) biological, 115, 157 cooperative, 182 embodied, 169, 173, 180–182, 260
emotional, 6, 120, 129, 189, 193 (see also Artifact, affective) functional, 9, 336 situated, 252, 256, 265, 269, 277, 280 synthetic (see Agent, artificial) Agent architecture, 252, 256, 260 deliberative, 260 (see also System, deliberative) design approaches, 8, 42–43, 55 hybrid, 260–261, 267 reactive, 76, 77, 260–261 (see also System, purely reactive) Agent societies, 120–121 Aibo (robot dog), 337 Alarm systems. See Control, alarm systems Alcohol, 27 AM (heuristic search system), 292 Amygdala, 3, 18, 19, 23, 24, 26, 33, 259 Animal cognition. See Cognition, animal Animat, 116, 261 Animism, 143 Anosognosia, 163 Anthropomorphism, 7, 141, 142, 146 Appraisal, 12, 208, 260, 266, 269–270 conceptual processing, 258–259 conscious, 221 (see also Consciousness) emotional, 223 schematic processing, 258–259 subconscious, 109 Appraisal criteria, 257–258, 266 Appraisal mechanism, 264 Appraisal process, 258–259, 265–266, 268–270, 279 Appraisal register, 259, 268 Appraisal theory, 252, 256–257, 266, 272 AR. See Affective Reasoner Architecture. See System architecture; Agent architecture Arousal, 9, 132, 154, 309, 312–314, 325 aesthetic, 85 sexual, 85 vocal encoding of, 324–325 Artifact, 152–153, 171, 184–185 affective, 192–194, 201, 204–207 (see also Agent, emotional) believable, 193, 201 (see also Believability)
Artificial intelligence, 1, 5, 7, 62, 63, 89, 90, 92, 95, 100, 116, 172 ‘‘alien,’’ 237–238 ‘‘human,’’ 237 embodied, 122, 123 situated, 260, 269 Artificial Intelligence (movie), 364 Artificial intelligence research paradigms, development of, 2 Artificial life, 252, 261, 271 Asperger’s syndrome, 165 Association, 158–159, 167. See also Cortex, association; Reward and punishment, association matrix of Associative network. See Spreading activation network Attention diversion of, 199 focus of, 166, 258 receiving, 341 Attention filters, 73, 74 Attention switching, 108 Autism, 223, 234 high-functioning, 165–166 Autonomic response, elicitation of, 16 Autonomy, 115, 170–171 ‘‘levels’’ of, 116 motivational, 116 Autopoiesis, 159 Avatar, 174 , 181, 220 Avoidance, active, 14. See also Action tendency, avoidance Awareness conscious, 222, 224–225 (see also Consciousness) of emotional content, 305 of self (see Self-awareness) Backward masking experiment, 33 Basal ganglia, 17, 24, 26, 27 Bayes Network, 8, 9, 311–315, 316–318, 321–323, 325, 326 BDI model. See Model, BDI Behavior altruistic, 255 coherence of, 190, 202 consistency of, 190, 192, 194–196, 200, 202–203, 206, 209–210, 278 consummatory, 116 emotional, 190, 197, 218–221, 312–316, 318–319, 320–326, 328 (see also Emotional state indication) expressive, 220, 278 (see also Emotional expression) generation of, 6, 189 goal-directed (see Goal-directed behavior)
interactive (see Interactive behavior) involuntary expressive, 197, 254 linguistic, 314, 320–323 predictability of, 190 reactive, 60, 116, 257 rewarded (see Reward) selector of, 17 (see also Action selection) self-interested, 255 sexual, 22 vocal, 314(see also Emotional expression, vocal) voluntary vs. involuntary, 197 Behavior adaptation, 225 Behavior node, 314, 318, 321, 322, 323, 325 Behavior transition, 345 Behavioral complexity, different ‘‘levels’’ of, 116 Behavioral dithering, avoidance of, 346 Behavioral inclination, 191–192, 196, 201, 205. See also Action tendency Behavioral response active, 14 flexibility of, 16 passive, 14 Behavioral schema, 274 Believability, 9, 189, 194, 202, 205, 238, 241, 246–249, 350, 352. See also Artifact, believable Bias. See Disposition; Goal formulation, bias of Bidirectionality, 21 Biomimetic approach, 115, 121 Blackboard harmonic, 299 melodic, 299 rhythmic, 299 root, 299 system, 294, 299–300 Blade Runner (movie), 364 Body language, 268, 275 Body perception, 138 Body state. See State, body Brain ‘‘triune,’’ 51 mammalian, 4 reptilian, 4 Brain damage, 1, 102 Brain design, types of, 20 Brain mechanism, 221 Brain research, 1 Brain subsystems, 138 Brain systems, 17 CAPTology. See Computer aided persuasive technology
Cartooniness vs. realism, 10, 350 Catastrophe theory, 159 Cholinergic pathways, 19 Chunking, 54 Circumplex, interpersonal, 309–310 Cogaff architecture, 82, 90, 92, 93, 98 Cognition, 29 animal, 151 social (see Social cognition) Cognition and Affect project, 50 Cognitive complexity, requirements for, 130 Cognitive ethology, 275 Cognitive processing, 15, 30, 154 Cognitive science, 5 Cognizer, situated, 8, 269–270 Communication nonverbal, 17, 182 (see also Emotional state indication, nonverbal) social (see Social communication) Communicative ability, 247–248 Communicative adaptability, 247 Computer aided persuasive technology, 327 Computer animation, 2 Computer game development, 1 Computer toys, 9 Concern satisfaction, 272 Conditioning, instrumental, 255 Confabulations, 26 Conscience, 28 Consciousness, 4, 7, 26, 27, 31, 35, 40, 44, 103, 152, 154, 183, 225–226, 231. See also Appraisal, conscious; Awareness, conscious; Control system, conscious; Emotions and consciousness; Self, conscious Consistency cross-individual, 191 in emotions, 190 in response tendency (see Response tendencies, consistency in) within-individual, 191 Constructed system. See Artifact Contingencies, 117 Continuity, sense of, 26 Control alarm systems, 56–58, 64, 77, 105, 106 (see also Metamanagement with alarms) direct, 91 global interrupt filter, 74 goal-driven, 51, 60 hypothesis-driven, 51 sharing of, 341–342 social (see Social control) Control states, complex, 84, 85
Control system conscious, 27 emotional ‘‘second-order,’’ 133, 140 hierarchical, 89 motivational, 137 Conversational dialog, 304 Conversational interface, 303–304, 305– 306, 316–318 comfortable, useful, 303 Coordination failure of emotional subsystem, 122 of distributed system, 170 Coping, 257, 259–260, 265, 267–269 Coping strategies, 210 emotion-oriented vs. problem-oriented, 200 Cortex, 155, 221, 224 association, 18, 29, 30, 33 frontal lobe, 106 language, 18 orbitofrontal, 3, 18, 19, 23, 26, 33 prefrontal, 25, 26 Cost-benefit analysis, 25 Creatures (See also Agent) complete autonomous, 123 simulated, 133 Creatures (computer game), 337 Credit assignment problem, 31, 32 Culture, 152, 255 and emotion, 292 Western, 152, 157, 161–162 Cybernetics, second-order, 159 Decision making, 27, 156, 158, 172 cooperative, 342 Declarative processing, 26 DECtalk (speech synthesizer), 324 Deep Blue, 233 Deliberative mechanisms, 61–64. See also System, deliberative Design space, 43 discontinuities in, 101 Design stance, 39, 40 Desires, unconscious, 27. See also Unconscious system Development. See Knowledge acquisition Dialog. See Conversational dialog Display rules, 196, 254–255, 260 Disposition, 70, 202, 294, 301 feedback facility of, 300–301 Drive, physiological, 253, 257 Drives, 5 Education, computer-aided, 180 Elegance, 7 Eliza effect, 353 Emergence, 95,111
Emotion definition of, 11, 12, 117, 118, 309 elicitation of, 195, 201 persistence of, 206 types of (see Emotion classes) understanding of, 240 Emotion and behavior. See Behavior, emotional; Control system Emotion and learning, 30, 291. See also Learning mechanisms Emotion attribution, 220 Emotion chip, 316 Emotion classes, 193, 262–263 Emotion code object, 347 angry, 347 (see also Emotions and feelings, anger) happy, 347 (see also Emotions and feelings, happiness) sad, 347 (see also Emotions and feelings, sadness) Emotion generation, 7, 221–224, 240, 252, 257 multilevel, 219, 223 Emotion intensity, 196 Emotion model, 237–238, 246, 249. See also Model, emotional valence Bayesian (see Bayes network) OCC, 193–194, 222, 239, 241–242, 262– 263 noncognitive, 316 Emotion node, 294, 295 Emotion process, 251 Emotion recognition, 216, 316 Emotion response tendencies. See Response tendencies Emotion sensing, physiological, 310. See also Heart rate; Galvanic skin response Emotion simulation, role of, 316 Emotion system, 119, 216, 230, 234, 255– 256 artificial, 129 human vs. artificial, 230, 232, 234 Emotion theory, 117, 192 functional, 264 perceptual-motor, 265 Emotional appearance, 7, 218, 220, 223 Emotional attitude, 71–73. See also Emotions and feelings calming, 306 commanding, 306 disapproval, 306 empathy, 306 solicitude, 306 warning, 306 Emotional behavior. See Behavior, emotional
Emotional competence, 264 Emotional experience, 7, 154, 224–225, 231, 233 Emotional expression. See also Linguistic expressive style; Locomotion style behavioral, 197, 344–349 (see also Behavior, expressive) communicative, 197, 199, 310, 320, 326 (see also Communication; Emotional state indication, nonverbal) effective, 344–346, 353 fatigue, 346 musical, 292, 294–296 prioritization of, 346 real-time, 345 somatic, 197 theatrical techniques of, 346 vocal, 323–325 Emotional filter function, 347 Emotional guidance, 276 Emotional information, 216 Emotional insect, 107 Emotional maturity, 234 Emotional reaction, to computers, 305 Emotional relationships, 335–338, 343– 344, 353–355 Emotional response, 309. See also Arousal; Emotional valence policy, 317, 318–320 tendency (see Response tendencies) triad, 251 Emotional sensitivity, 303 Emotional state. See State, emotional Emotional state indication, nonverbal, 308, 325. See also Communication, nonverbal; Behavior, emotional Emotional subsystems, improper synchronization of, 122 Emotional tone, depressed, 305 Emotional valence, 9, 132, 309, 312–316, 325. See also Model, emotional valence vocal encoding of, 324–325 Emotionality, 1 Emotionally intelligent system, 233–234 Emotion-behavior linkage, 192, 196, 201 Emotions animal, 153 architecture-based concepts of, 44 artificial, 120 basic, 119, 120 components of, 217 low-intensity, 71 maladaptive, 121, 122, 146 primary, 44, 58, 77, 85, 105, 109, 110, 119, 120 role of, 120, 127, 128, 151, 153–154, 158, 253–255
secondary, 44, 58, 80, 81, 85, 105, 110 socially important human (see Social emotions) tertiary, 44, 58, 86, 106, 109, 110 unaware, 32 Emotions and consciousness, 363–364. See also Consciousness Emotions and feelings. See also Feeling affection, 193, 341 (see also Interactive behavior, affection) anger, 120, 130, 136, 197, 200, 228, 295, 243, 344 (see also Emotion code object, angry) anxiety, 120, 130 confusion, 306 disgust, 344 embarrassment, 306 fear, 12, 14, 15, 33, 120, 130, 136, 193– 194, 346, 347 (see also Response, fear) frustration, 12, 213, 215–216 gloating, 243, 245 happiness, 12, 136, 295, 344, 346 (see also Emotion code object, happy) irritation, 306 joy, 306 love, 228, 344 meditativeness, 295 pain, 15, 70–71, 228–229, 246–247 pleasure, 70, 71, 246, 344 (see also Touch, pleasantness of) pride, 306, 344 relief, 12 sadness, 136, 295, 306, 346 (see also Emotion code object, sad) shame, 243–246 uncertainty, 306 Emotions research, impact, 363 Empathy, 161, 184 Endorphin, 138 Engineering design, 22 E-Nodes. See Emotion node Entertainment, 142, 36, 102 interactive, 338 (see also Interactive experience, design of) Environment, 127, 135, 145, 154, 159, 164, 166, 170–171, 173, 182, 207–208, 210 affordances of the, 194 dynamism of, 135 external and internal, 115, 140 rational reaction to, 72 Environmental needs, 35 Environments, classification of, 129–131 Escape, 14 Ethology, 116 cognitive (see Cognitive ethology) Evaluation mechanisms, 19, 68–70, 106 Evaluation system, 11
Evaluations, conflicting, 70 Evaluators, 70 Event interesting, 301 reinforcing (see Reinforcing events) Evolution, 3, 20, 22, 35, 47, 52, 89, 104, 119, 139 Evolution fitness, 3, 11, 17, 21–23 Expert system, 172 Explanations, reasonable, 26 Extinction, 14 Facial expression, 18, 192, 196, 199, 220, 246, 268, 275, 278, 314 Fast/slow dichotomy, 64, 65 Feedback, 171–173. See also Disposition, feedback facility of emotional, 216 Feeling, 155, 222–225, 231–232. See also Emotions and feelings aggressiveness, 344 boredom, 136, 344 craving, 344, 346 dislike, 344 excitement, 344 gratefulness, 344 guilt, 344 jealousy, 344, 347 laziness, 344 loneliness, 344 punished, 344, 348–349 (see also Punishment) rewarded, 344 (see also Reward) satisfaction, 344 shame, 344 timidity, 344 warmth, 344 of users, 6 Final Fantasy (movie), 2 Fixed action pattern (FAP), 275–276, 279– 280 Flexibility, 297–298 Frame-based system. See also K-line frame Frankenstein, 354 Functional differentiation, 90 Functional neutrality, 31 Fungus Eater, New, 128 Galvanic skin response, 223, 310 Gedankenexperiment, 7 Genes, 17, 145, 146 Geneva Emotion Research Group, 324 Genotype-phenotype relation, 132 Goal achievement, 264 Goal conflicts, resolution of, 5 Goal-directed behavior, 181
Goal formulation bias of, 291, 292 emotion and, 292 Goal hierarchy, 194 Goal importance, 264 Goal mechanisms, 59, 60 Goals, 208, 291 survival-related, 120 Gridland, 5, 133, 135 GSR. See Galvanic skin response Habit, 16 HAL (computer in 2001: A Space Odyssey), 218, 233, 354 HBB. See Blackboard, harmonic Heart rate, 310 Homeostasis, 104, 133. See also Hysteresis Homeostatic need states, 15 Homunculus, 169, 364 Hope, 239–240 Hormones, 133, 134, 137 Human architecture, 35. See also Information-processing architecture, human Humans, properties of, 127 Humor, 241–242 Hunger. See State, hunger as internal need Hypothetical future, 80 Hysteresis, 24. See also Homeostasis Indexicals, 256 Individuality, 256 Information-processing architecture, human, 35, 39, 45, 46 Information-processing perspective, 4 Instrumental response. See Response, instrumental Intention, 181, 191, 254. See also Model, BDI Interaction computer, 305 (see also Entertainment, interactive) human-agent, 173, 182, 185 social (see Social interaction) Interactive behavior affection, 343–344 (see also Emotions and feelings, affection) nurturing, 344 play, 344 training, 344 Interactive experience, design of, 349. See also Interaction, computer Interrupt filter, global. See Control, global interrupt filter Intuition, 152, 157 Invisible Person, 272–273, 276–278
K-line frame, 298–299 memory, 8, 298–299 theory, 293, 294, 296–297 K-lines, compound, 298 Knowledge musical, 296 Knowledge acquisition, 296 Knowledge representation, 293 Knowledge source, 299–300 Language generation, 248 Language system, 25, 26, 28, 30, 31, 33, 96, 97 Layer deliberative, 44, 53, 61, 104 (see also System, deliberative) narrative intelligence, 347 reactive, 44, 53, 58, 104 Layered architectures. See System architecture, layered Layers, concurrently active versus pipelined, 90 Learning. See also Emotion and Learning by trial and error, 156 inductive machine, 292 instrumental, 17 by stimulus-reinforcer association, 17, 23 Learning mechanisms, 69. See also Emotion and Learning; Knowledge acquisition; Motive comparators, learned; Reinforcer, learned; STAR; Stimulus reinforcement, learning of Lifeworld, 207, 251, 256 Linguistic expressive style, 321–323 Locomotion, 20 Locomotion style, 345 Man-machine interface, 151 Meaning, 152, 165–167, 184, 269, 321 Memories cognitive evaluation of, 19 recall of (see Recall, of memories) storage of, 19 Memory emotional, 131 human, 290 (see also Recall, of personalized habits) K-line (see K-line memory) long-term associative, 62, 64 short-term, 3, 25, 26, 62, 64 Memory process, 154 Mental disorders, 122 Mental ontologies, 40 Metamanagement, 32, 44, 53, 58, 74, 88, 98, 104, 105 with alarms, 81–84(see also Control, alarm systems)
Microworld, 135 Mimicry, 218–219 Mind everyday concepts of, 35 society of (see Society of Mind theory) Mind ecology, 4, 36, 55 Mind fluidity, 293 Mind-body interaction, 7, 223, 227–228 Model BDI, 267 component-based, 5, 125–127 contention-scheduling, 93 emotion (see Emotion model) emotional valence, 258, 314–316 functional, 5, 127, 128 information flow, 4 Markovian, 314 personality, 237–238, 246 phenomenon-based/black-box, 4, 124, 131 process/design-based, 5, 124, 125, 128, 131 triple-layer, 52–54(see also System architecture, layered) triple-tower, 50, 51, 53, 54 user, 241 Modular organization, 47 Mood expression, 345 Moods, 71–73, 195 Moral, 255 Motivation, 5, 17, 132, 154, 157–158, 255. See also Control system, motivational Motivation emotion amplified, 133 incentive, 24 persistent and continuing, 19 Motivational states. See State, motivational Motivational submechanisms, 67 Motive generators, 67 insistence of, 74 intensity of, 74 Motive comparators, learned, 69 Motives, competing, 93 Motives, intrinsic, 92 MUD. See Multi-user domain Multilevel processing, 103 Multi-user domain, 175–180 Multi-user domain robot, 179–180 Multi-user dungeon. See Multi-user domain Multi-user systems, 357–360. See also Multi-user domain; Multi-user virtual environment Multi-user virtual environment, 175, 180. See also Virtual world
Music, 8, 49. See also Knowledge, musical; Emotional expression, musical Music composition process, 8, 293 Musical artifact, 295 Musical component frame, 299 MUVE. See Multi-user virtual environment Naive realism, 167 Narrative intelligence. See Layer, narrative intelligence Natural language interaction, 180 Natural language interface, 304, 308. See also Text-to-speech systems Natural selection, 21 Nervous system, 290 Net reward. See Reward, net Neural net, 31, 45, 63 Neurological impairment, 223 Neurophysiology, 154 Neuroscience, 224 Neurotransmitter, 153 Noradrenergic pathways, 19 Norms. See Social norms OCC Model. See Emotion model, OCC Office Plant #1 (robot plant), 339 Ontologies, self-bootstrapped, 98, 99 Ontology, 46, 144 Ontology of mind, architecture-based, 87, 101 Organisms hybrid reactive and deliberative, 78–81 (see also System, deliberative) purely reactive, 76, 77 (see also System, purely reactive) PARETO, 269 Partial solutions, 293 Pattern of activation/deactivation, 224. See also Activation Perception, altered, 138 Personae, switching, 74, 75 Personal viewpoint. See Point of view, personal Personality, 6, 75, 196, 201–202, 204, 206– 208, 309 biological substrates of, 205, 207 context-dependency of, 209–210 generation, 240 Myers-Briggs typology of, 310 representation of, 309–310, 314–315 theory of, 192, 203 Personality dimension dominance, 9, 310, 312–314, 325 friendliness, 9, 310, 312–314, 326
Personality dimensions, 203, 205–206, 209–210, 310 Personality model. See Model, personality Personality traits, 193, 200, 202–203, 207 clusters of, 203 longer-term, 9 Persuasion, 327–328 Petit Mal (robot), 339 Phobics, 32, 33 Physiology, 85, 197 Pinocchio, 354 Plan formation, 61 Planning, 25, 62, 63 Plans, long-term, 17, 27 Point of view personal, 161, 165–166, 172–173, 182– 183 first-person, 5, 350 third-person, 350 Posture, 314 control mechanisms of, 46 Predictability, 202–203 Prediction, 154 Pregnant woman, 27 Primitives, choice of, 131, 136 Privacy, 329–330 Problem solving, 63. See also Partial solutions Psychology, comparative, 102 Psychology, developmental, 102 Punishment, 255. See also Reward and punishment Qualia, 226 Quantum mechanisms, 92 R2D2, 354 RAP system, 267–269, 279 Rationality, 1, 26. See also Environment, rational reaction to; Reasoning, rational; Response tendencies, rational bounded, 256 Realism. See Cartooniness vs. realism; Believability; Naive realism Reason, 26 Reasoning, 160, 167, 224 emotional, 157–158, 160–161, 184 multistrategy, 293 rational, 157 rule-based, 222, 224 Recall of memories, 19 of personalized habits, 297, 298 Reequilibration, 119 Reflex, 191, 253, 257 Regulatory focus, 203 prevention, 204
promotion, 204 Reinforcement, 69, 156, 205. See also Stimulus reinforcement Reinforcement contigency, 14 Reinforcer, 11 learned, 3, 13 instrumental, 3, 13 intensity of, 14 negative, 14 positive, 14 primary, 3, 14, 15, 29, 30, 33 secondary, 3, 14 , 15 unlearned, 3, 13 Reinforcing events, 15 Relational toys, 365 Reproductive success, 17 Response, behavioral. See Behavioral response Response bodily, 223 (see also Mind-body interaction; State, body) emotional, 158–159, 192, 253 fear, 221–222 instrumental, 16 internal, 191 selection of, 191 Response categories, 263 Response tendencies, 196, 201, 219. See also Action tendency consistency in, 196 constraints on, 197 coping, 199 (see also Coping strategies) expressive, 197 (see also Emotional expression) information processing, 199 rational, 200 types of, 6, 197 variability in, 196 Reward, 21, 255. See also Reward and punishment deferred, 25 net, 25 Reward and punishment, 3, 11–13, 16, 17, 20–23, 107, 204, 210. See also Feeling, punished; Feeling, rewarded association matrix of, 348–349 Robot, 151 Robot arm, 22 RoboWoggles, 339 Satiety mechanisms, 22, 119 Self, 160, 162, 165–169, 171, 173, 183– 185, 266 concept of, 152, 160, 165, 171 conscious, 251 (see also Consciousness) experience of, 162–163, 225 a sense of, 5
Self schemata, 266 Self-awareness, 152, 226. See also Awareness Self-evaluation, 83, 84 Self-knowledge, 170–171 Self-modification, 81, 82 Self-monitoring, 164–165, 169, 183 concurrent, 81 Self-observation, 83, 84 Self-perception, 164–165, 183 Self-reflection, 164–165, 169, 183 Semantic information, 183 Semantic information states, 84 Semantic network, 267 Semiotics, 167 Sensation simulation, 295 Sensitivity, emotional. See Emotional sensitivity Sensory processing, 265–266, 317 Sensory systems, 21 Sentic modulation, 7, 36, 227 Sentics, 227 Situatedness, 277, 280 Skill compilers, 66 Skin conductivity. See Galvanic skin response Social bonding, 18 Social cognition, 266 Social communication, 254 Social control, 75 Social emotional competence, 254 Social emotions, 86, 87, 253–255 Social interaction, 151–152, 157, 161, 248, 278, 319 illusion of, 8 Social norms, 195–196, 208, 254–255 Social presence, 303 Society of Mind theory, 100, 293 Software agent, 248. See also Agent, artificial SOM theory. See Society of Mind theory Somatic marker theory, 156, 163 Soul, 233 Speech energy level, 324–325 pitch, 324–325 rate, 324–325 Spreading activation network, 266, 298 Standards, 195, 208. See also Social norms STAR (inductive machine learning methodology), 292 State, affective. See State, emotional State body, 155 emotional, 15, 191–193, 222, 224, 254, 275–276, 290 (see also Emotional state indication, nonverbal)
hunger as internal need, 15 mental, 297 mental partial, 297 (see also Partial solutions) motivational, 118 short-term emotional, 9 thirst as internal need, 15 Stimulus, 252, 257, 278 environmental, 14, 16 external, 15 neutral, 15 sensory, 195 Stimulus reinforcement. See also Reinforcement association of, 14 learning of, 15–17 Story, affective, 7. See also Affective art Story generation, 241 Striatum, 18 Subsumption architecture. See System architecture, subsumption Survival, 119 Sweetness, 30 Symbol system, 180 Synthetic character, 2, 102, 220 Synthetic physiology, 123, 133, 141 System autonomous, 164, 170–171 (see also Agent, autonomous) biological, 150, 153, 163–164, 167 condition-action, 62 deliberative, 64, 73 (see also Agent architecture, deliberative; Deliberative mechanisms; Layer, deliberative; Organisms, hybrid reactive and deliberative) explicit, 3, 26–28, 32 purely reactive, 60 (see also Agent architecture, reactive; Behavior, reactive; Environment, rational reaction to; Layer, reactive; Organisms, hybrid reactive and deliberative; Organisms, purely reactive) System architecture, 205, 207, 316–318. See also Agent architecture; Cogaff architecture, System properties, innate dominance hierarchies, 90 functional differentiation, 90 goal-based behavior, 340–344, 346–349 humanlike, 99, 100 layered, 52, 347 layered trionic, 88, 90, 278–279 layered trionic with alarms, 82 (see also Control, alarm systems) subsumption, 53, 107, 260–261 System properties, innate, 295
TABASCO, 8, 252, 264–265, 268–272, 280 Tamagotchi, 337 Taxes, 20, 23 Teleo-reactive system, 56 Text-to-speech systems, 305, 324 Thirst. See State, thirst as internal need Thrashing, 96 Timing, 345 TinyMUD, 179 Touch, pleasantness of, 24 Tractability, 270 Trainability, 91 Training, 180–181 Traits, 240. See also Personality traits Trans-Frames, 293 Tropism, 20, 23 Unconscious system, 31. See also Desires, unconscious User Interfaces, 350–352. See also Conversational interface; Natural language interface Value system, 208 Viewpoint. See Point of view, personal Virtual Babyz, 333, 342–355 Virtual characters, 334–355. See also Agent, embodied; Avatar Virtual Petz, 333, 337, 340–342 Virtual reality, 174–175, 181, 272 Virtual world, 149, 152–153, 169, 173– 174, 176, 179–182, 184–185, 215 Vision, 161 Vocal expression, 220. See also Behavior, vocal; Emotional expression, vocal Will, 181 Wolfgang (musical composition system), 295 Wording choice, 320. See also Behavior, linguistic