Cognition on Cognition Edited by Jacques Mehler and Susana Franck
November 1995 · ISBN 0-262-63167-9 · 504 pp. · $55.00 / £35.95 (paper)

Preface: Building COGNITION

I Neuropsychology
1 Insensitivity to future consequences following damage to prefrontal cortex
2 Autism: beyond "theory of mind"
3 Developmental dyslexia and animal studies: at the interface between cognition and neurology
4 Foraging for brain stimulation: toward a neurobiology of computation
5 Beyond intuition and instinct blindness: toward an evolutionarily rigorous cognitive science

II Thinking
6 Why should we abandon the mental logic hypothesis?
7 Concepts: a potboiler
8 Young children's naive theory of biology
9 Mental models and probabilistic thinking
10 Pretending and believing: issues in the theory of ToMM
11 Extracting the coherent core of human probability judgment: a research program for cognitive psychology
12 Levels of causal understanding in chimpanzees and children
13 Uncertainty and the difficulty of thinking through disjunctions

III Language and Perception
14 The perception of rhythm in spoken and written language
15 Categorization in early infancy and the continuity of development
16 Do speakers have access to a mental syllabary?
17 On the internal structure of phonetic categories: a progress report
18 Perception and awareness in phonological processing: the case of the phoneme
19 Ever since language and learning: afterthoughts on the Piaget-Chomsky debate
20 Some primitive mechanisms of spatial attention
21 Language and connectionism: the developing interface
22 Initial knowledge: six suggestions
Preface: Building COGNITION

The human mind needs to acknowledge and celebrate anniversaries; however, some anniversaries are more salient than others. This book emanates from Volume 50 of the journal COGNITION. Why that volume of COGNITION was important to us perhaps becomes clear when we understand how the mind encodes numbers. Indeed, Dehaene et al. (1992) reported that the number 50 is psychologically more salient than, say, either 47 or 53. So, predictably, Volume 50 was a befitting occasion to celebrate an anniversary; it was a time to take stock of what was happening during the early years and a time to remember how we were long ago and how we have evolved as a journal. In our first editorial, we wanted to remember those who have provided us with so much help and the cultural climate that made the journal possible. In this introduction to COGNITION on Cognition we leave as much of the original introduction as possible so that the flavor initially conveyed remains.

COGNITION was envisioned by T. G. Bever and Jacques Mehler because we thought that the new and diffuse area of cognition had to be facilitated by overcoming the inflexibility of form and content that was characteristic of most earlier journals in psychology and linguistics. Moreover, cognition was a multidisciplinary domain while psychology and linguistics were too narrow and too attached to one school of thought or another. So too were most journals.

In the sixties, one could see the birth of the cognitive revolution in Cambridge, Massachusetts, where many of those who were to become the main actors were working on a project which was to become modern Cognitive Science. Was it possible to study intelligent behavior, in man and in machine, in the way that one studies chemistry, biology or even astronomy? We were sure the question should be answered affirmatively. Since then, the study of mind has become a part of the natural sciences.
Positivism and behaviorism, among others, had confined publishing to patterns that were ill-suited to our needs. Psychologists, linguists, neuropsychologists, and others would often voice their dismay. Authors knew that to enhance their chances of publication they had to avoid motivating their studies theoretically. "Make your introduction as short and vacuous as possible" seemed to be the unspoken guideline of most journals. Editors were often even more hostile towards discussions that had "too much theory," as they used to say in those days. That was not all. Psychology journals did not welcome articles from linguistics while neuropsychologists had to hassle with neurologists to see their findings published. For a psychologist to
publish in a linguistics journal was equally out of bounds. Readership was also broken down along lines of narrow professional affinity. Yet scientists from all these disciplines would meet and discuss a range of exciting new issues in the seminars held at the Harvard Center for Cognitive Studies, and at similar centers that were being created at MIT, Penn, and elsewhere. Those were the days when computer scientists and psychologists, neurologists and linguists were searching jointly for explanations of the phenomena that their predecessors had explored from much narrower perspectives. If perception continued to be important, learning was beginning to lose its grip on psychology. Neuropsychology and psycholinguistics were becoming very fashionable and so was the simulation of complex behavior. Studying infants and young children had once more become a central aspect of our concerns. Likewise, students of animal behavior were discovering all kinds of surprising aptitudes to which psychologists had been blinded by behaviorism. It was, however, in the fields of linguistics and computer science that the novel theoretical perspectives were being laid out with greatest clarity. What was wanted was a journal that could help students to become equally familiar with biological findings, advances in computer science, and psychological and linguistic discoveries, while allowing them to become philosophically sophisticated. So, some of us set out to create a journal which would encompass such a variegated domain. We also wanted a publication for which it would be fun to write and which would be great to read. These ideas were entertained at the end of the sixties, a difficult time.
France was still searching for itself in the midst of unrest, still searching for its soul after hesitating for so long about the need to face up to its contradictions, those that had plunged it into defeat, occupation and then collaboration on one side, suffering, persecution and resistance on the other. The United States, contending with internal and external violence, was trying to establish a multiracial society. At the same time it was fighting far from home for what, we were being told, was going to be a better world, though the reasons looked much less altruistic to our eyes. All these conflicts fostered our concerns. They also inspired the scientists of our generation to think about their role and responsibility as social beings. The nuclear era was a reminder that science was not as useless and abstruse as many had pretended it to be. Was it so desirable for us to be scientists during weekdays and citizens on Sundays and holidays, we asked ourselves. How could one justify indifference over educational matters, funding of universities, sexism, racism, and many other aspects of our daily existence? In thinking about a journal, questions like these were always present in our minds. COGNITION was born in France and we have edited the journal from its Paris office ever since. When Jacques Mehler moved from the United States to France, he worked in a laboratory located across from the Folies Bergeres, a neighborhood with many attractions for tourists but none of the scientific journals that were essential for keeping up with cognitive science. In 1969, the laboratory was moved to a modern building erected on the site at which the infamous Prison du Cherche-Midi had been located until its demolition at the end of the Second World War. This prison stood opposite the Gestapo Headquarters and resistance fighters and other personalities were tortured and then shot within its walls. A
few decades earlier in the prison, another French citizen had been locked up, namely, Captain Dreyfus. It was difficult to find oneself at such a place without reflecting on how the rational study of the mind might illuminate the ways in which humans go about their social business and also on how science and society had to coexist. The building shelters the Ecole des Hautes Etudes en Sciences Sociales (EHESS), an institution that played an important role in the development of the French School of History. F. Braudel presided over the Ecole for many years while being the editor of the prestigious Annales, a publication that had won acclaim in many countries after it was founded by M. Bloch and L. Febvre. It was obvious that the Annales played an important role at the Ecole, where M. Bloch, an Alsatian Jew who was eventually murdered for his leading role in the Resistance, was remembered as an important thinker. Bloch was a convinced European who preached a rational approach to the social sciences. He was persuaded of the importance of expanding communication between investigators from different countries and cultures. Today, M. Bloch and his itinerary help us understand the importance of moral issues and the role of the individual as an ultimate moral entity whose well-being does not rank below state, country, or religion. Our hope is that rational inquiry and cognitive science will help us escape from the bonds of nationalism, chauvinism, and exclusion. Cognitive scientists, like all other scientists and citizens, should be guided by moral reason, and moral issues must be one of our fields of concern. A Dutch publisher, Mouton, offered us the opportunity to launch the journal. In the late sixties, money seemed less important than it does today. Publishers were interested in ideas and the elegance with which they were presented.
We agreed to minimize formal constraints, and there was no opposition to the inclusion of a section to be used to air our political and social preoccupations. Opposition during those early planning stages came from a source that we had not at all foreseen as a trouble area. To our great surprise we discovered that publishing an English language journal in France was not an easy task. Some of our colleagues disapproved of what they perceived as a foreign-led venture. "Isn't it true," they argued, "that J. Piaget, one of the central players in the Cognitive Revolution, writes in French?" "A French intellectual ought to try and promote French culture through the language of Descartes, Racine and Flaubert," we were reminded time and again. For a while we had mixed feelings. We need no reminders of how important differences and contrasts are to the richness of intellectual life. Today politicians discuss ways in which the world is going to be able to open markets and promote business. The GATT discussions have concentrated partly on the diversity of cultural goods. We agree with those who would like to see some kind of protection against mass-produced television, ghost-written books, and movies conceived to anesthetize the development of good taste and intelligence. Unfortunately, nobody really knows how to protect us against these lamentable trends. Removing all cultural differences and catering only to the least demanding members of society, no matter how numerous, will promote the destruction of our intellectual creativity. So why did we favor making a journal in English, and why is it that even today we fight for a lingua franca of science? Science is a special case, we told ourselves then, as we do today. We all know that since the Second World War, practically
all the top-quality science has been published in English. It would be unthinkable for top European scientists to have won the Nobel prize or reached world renown if they had published their foremost papers in their own language. They didn't. Likewise, it is unthinkable today for serious scientists, regardless of where they study and work, to be unable to read English. Of course, novels, essays, and many disciplines in the humanities are more concerned with form than with truth. It is normal that these disciplines fight to preserve their tool of privilege, the language in which they need to express themselves. Thus we viewed the resistance to English during the planning stages of COGNITION as an ideological plot to keep the study of mind separate and antagonistic to science and closer to the arts and humanities. Our aim was just the opposite, namely, to show that there was a discipline, cognition, which was as concerned with truth as chemistry, biology, or physics. We were also aware that the fear of contact and communication among fellow scientists is the favorite weapon used by narrow-minded chauvinists and, in general, by authoritarian characters with whom, we still, unfortunately, have to cope in some parts of the European academic world. While COGNITION was trying to impose the same weights and measures for European (inter alia) and American science, some of our colleagues were pleading for a private turf, for special journals catering to their specific needs. We dismissed those pleas, and the journal took the form that the readership has come to expect. Today, we include in this volume a series of articles that were originally published in the Special Issue produced to celebrate the fiftieth volume of the journal. We present these articles in an order which we think brings out their thematic coherence. 
There are areas that deal with theoretical aspects which range from the status of explanations in cognitive science, and the evolutionary accounts offered to explain the stable faculties that are characteristic of Homo sapiens, to the way in which humans use general faculties to reason about their environment, and so forth. Another group of papers deals with the way in which humans process information and use language, the parts of cognitive science that are best understood so far. We also present a number of papers that deal with infants' initial abilities and their capacity to learn the distinctive behaviors of the species. We also include several papers that try to relate behaviors to their underlying neural structures. This form-to-function pairing may become particularly relevant to explaining development. Indeed, many of the changes in behavior that one observes in the growing organism may stem from neural changes and/or from learning. Understanding the neural structures underlying our capacities may help us understand how these are mastered. It is difficult to imagine what the contents of volume 100 of COGNITION will look like. Certainly the journal, publishing in general, and academic publishing in particular, will change in radical ways in the years to come. Not only will the contents evolve in ways that will seem transparent a posteriori, but the form will also change in ways that are hard to predict a priori. The ways in which science develops are hard to foresee because until one has taken the next step, vistas are occluded by the present. Fortunately, we do not need to worry about this for the time being. Our work is cut out: concentrating on what we are doing rather than on the ways in which we are doing what we are doing. On the
contrary, we must start thinking about how the changes in publishing will affect our ways of doing science. It is part of the scientist's duty to explore the changes to come so as to insure that the independence and responsibility of science is protected in the world of tomorrow as it is today. We cannot close this short introduction without thanking Amy Pierce for her help in preparing this special issue for publication with MIT Press. Jacques Mehler and Susana Franck
1 Insensitivity to future consequences following damage to human prefrontal cortex

Antoine Bechara, Antonio R. Damasio*, Hanna Damasio, Steven W. Anderson

Department of Neurology, Division of Behavioral Neurology and Cognitive Neuroscience, University of Iowa College of Medicine, Iowa City, IA 52242, USA
Abstract

Following damage to the ventromedial prefrontal cortex, humans develop a defect in real-life decision-making, which contrasts with otherwise normal intellectual functions. Currently, there is no neuropsychological probe to detect this defect in the laboratory, and the cognitive and neural mechanisms responsible for it have resisted explanation. Here, using a novel task which simulates real-life decision-making in the way it factors uncertainty of premises and outcomes, as well as reward and punishment, we find that prefrontal patients, unlike controls, are oblivious to the future consequences of their actions, and seem to be guided by immediate prospects only. This finding offers, for the first time, the possibility of detecting these patients' elusive impairment in the laboratory, measuring it, and investigating its possible causes.
Introduction

Patients with damage to the ventromedial sector of prefrontal cortices develop a severe impairment in real-life decision-making, in spite of otherwise preserved intellect. The impairments are especially marked in the personal and social realms (Damasio, Tranel, & Damasio, 1991). Patient E.V.R. is a prototypical example of this condition. He often decides against his best interest, and is unable to learn
* Corresponding author. Supported by NINDS POl NS19632 and the James S. McDonnell Foundation.
from his mistakes. His decisions repeatedly lead to negative consequences. In striking contrast to this real-life decision-making impairment, E.V.R.'s general intellect and problem-solving abilities in a laboratory setting remain intact. For instance, he produces perfect scores on the Wisconsin Card Sorting Test (Milner, 1963); his performances in paradigms requiring self-ordering (Petrides & Milner, 1982), cognitive estimations (Shallice & Evans, 1978), and judgements of recency and frequency (Milner, Petrides, & Smith, 1985) are flawless; he is not perseverative, nor is he impulsive; his knowledge base is intact and so are his short-term and working memory as tested to date; his solution of verbally posed social problems and ethical dilemmas is comparable to that of controls (Saver & Damasio, 1991). The condition has posed a double challenge, since there has been neither a satisfactory account of its physiopathology, nor a laboratory probe to detect and measure an impairment that is so obvious in its ecological niche. Here we describe an experimental neuropsychological task which simulates, in real time, personal real-life decision-making relative to the way it factors uncertainty of premises and outcomes, as well as reward and punishment. We show that, unlike controls, patients with prefrontal damage perform defectively and are seemingly insensitive to the future.
Materials and methods

The subjects sit in front of four decks of cards equal in appearance and size, and are given a $2000 loan of play money (a set of facsimile US bills). The subjects are told that the game requires a long series of card selections, one card at a time, from any of the four decks, until they are told to stop. After turning each card, the subjects receive some money (the amount is only announced after the turning, and varies with the deck). After turning some cards, the subjects are both given money and asked to pay a penalty (again, the amount is only announced after the card is turned, and varies with the deck and the position in the deck according to a schedule unknown to the subjects). The subjects are told that (1) the goal of the task is to maximize profit on the loan of play money, (2) they are free to switch from any deck to another, at any time, and as often as wished, but (3) they are not told ahead of time how many card selections must be made (the task is stopped after a series of 100 card selections). The preprogrammed schedules of reward and punishment are shown on the score cards (Fig. 1). Turning any card from deck A or deck B yields $100; turning any card from deck C or deck D yields $50. However, the ultimate future yield of each deck varies because the penalty amounts are higher in the high-paying decks (A and B), and lower in the low-paying decks (C and D). For example, after turning 10 cards from deck A, the subjects have earned $1000, but they have also encountered 5 unpredicted punishments bringing their total cost to $1250, thus
[Fig. 1. Preprogrammed schedules of reward and punishment, shown on score cards. The top score card represents that of a typical control subject, and the bottom one that of a typical target subject (DM 1336); the cards record the profiles of card selections from the first to the 100th card.]
incurring a net loss of $250. The same happens on deck B. On the other hand, after turning 10 cards from deck C or D, the subjects earn $500, but the total of their unpredicted punishments is only $250 (i.e., the subject nets $250). In summary, decks A and B are equivalent in terms of overall net loss over the trials. The difference is that in deck A the punishment is more frequent but of smaller magnitude, whereas in deck B the punishment is less frequent but of higher magnitude. Decks C and D are also equivalent in terms of overall net gain. In deck C the punishment is more frequent and of smaller magnitude, while in deck D the punishment is less frequent but of higher magnitude. Decks A and B are thus "disadvantageous" because they cost the most in the long run, while decks C and D are "advantageous" because they result in an overall gain in the long run. The performances of a group of normal control subjects (21 women and 23 men) in this task were compared to those of E.V.R. and other frontal lobe subjects (4 men and 2 women). The age range of normal controls was from 20 to 79 years; for E.V.R.-like subjects it was from 43 to 84 years. About half the subjects in each group had a high school education, and the other half had a college education. E.V.R.-like subjects were retrieved from the Patient Registry of the Division of Behavioral Neurology and Cognitive Neuroscience. Selection criteria were the documented presence of abnormal decision-making and the existence of lesions in the ventromedial prefrontal region. To determine whether the defective performance of E.V.R.-like subjects on the task is specific to ventromedial frontal lobe damage, and not merely caused by brain damage in general, we compared the performances of E.V.R.-like subjects and normal controls to an education-matched group of brain-damaged controls. There were 3 women and 6 men, ranging in age from 20 to 71 years.
These controls were retrieved from the same Patient Registry and were chosen so as to have lesions in occipital, temporal and dorsolateral frontal regions. Several of the brain-damaged controls had memory defects, as revealed by conventional neuropsychological tests. Finally, to determine what would happen to the performance if it were repeated over time, we retested the target subjects and a smaller sample of normal controls (4 women and 1 man between the ages of 20 and 55, matched to E.V.R. in level of education) after various time intervals (one month after the first test, 24 h later, and for the fourth time, six months later).
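The deck economics described above can be sketched in a few lines of code. This is an illustrative reconstruction, not the authors' materials: the per-block penalty schedules below are hypothetical placeholders chosen only to reproduce the stated per-10-card totals ($100 per card with $1250 in penalties per 10 cards for decks A and B; $50 per card with $250 in penalties per 10 cards for decks C and D).

```python
# Illustrative sketch of the four-deck payoff structure. The penalty
# schedules are hypothetical; only the per-10-card totals match the text.
REWARD = {"A": 100, "B": 100, "C": 50, "D": 50}  # gain per card turned

# Penalties encountered over each block of 10 cards.
# A and C: frequent but small; B and D: rare but large.
# Block totals: A, B -> $1250 in penalties; C, D -> $250 in penalties.
PENALTIES = {
    "A": [250] * 5,   # 5 penalties of $250 = $1250
    "B": [1250],      # 1 penalty of $1250
    "C": [50] * 5,    # 5 penalties of $50 = $250
    "D": [250],       # 1 penalty of $250
}

def net_per_10_cards(deck: str) -> int:
    """Net gain (or loss) after turning 10 cards from a single deck."""
    return 10 * REWARD[deck] - sum(PENALTIES[deck])

for deck in "ABCD":
    print(deck, net_per_10_cards(deck))
# A and B each net -$250 per block (disadvantageous);
# C and D each net +$250 per block (advantageous).
```

The sketch makes the task's central asymmetry explicit: the high-paying decks are the ones that lose money in the long run, so only sensitivity to delayed outcomes can steer choice toward C and D.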
Results

Fig. 2 (left) shows that normal controls make more selections from the good decks (C and D), and avoid the bad decks (A and B). In sharp contrast, E.V.R.-like subjects select fewer cards from the good decks (C and D), and choose more from the bad decks (A and B). The difference is significant. An analysis of
[Fig. 2. (Left panels) Total number of cards selected from each deck (A, B, C or D) by normal controls and by a typical target subject; bars represent means ± s.e.m. (Right panels) Profiles of card selections from the first to the 100th selection, for the advantageous and disadvantageous decks.]
variance comparing the number of cards from each deck chosen by normal controls and by target subjects revealed a significant interaction of group (controls vs. targets) with choice (A, B, C, D) (F(3,147) = 42.9, p < .001). Subsequent Newman-Keuls t-tests revealed that the numbers of cards selected by normal controls from decks A or B were significantly lower than the numbers of cards selected by target subjects from the same decks (ps < .001). Conversely, the numbers of cards selected by controls from decks C or D were significantly higher than the numbers selected by target subjects (ps < .001). Within each group, comparison of the performances among subjects of different age groups, gender and education yielded no statistically significant differences. Fig. 2 (right) shows that a comparison of card selection profiles revealed that controls initially sampled all decks and repeated selections from the bad decks A and B, probably because they pay more, but eventually switched to more and more selections from the good decks C and D, with only occasional returns to decks A and B. On the other hand, E.V.R. behaves like normal controls only in the first few selections. He does begin by sampling all decks and selecting from decks A and B, and he does make several selections from decks C and D, but then he returns more frequently and more systematically to decks A and B. The other target subjects behave similarly. Fig. 3 reveals that the performance of brain-damaged controls was no different
[Fig. 3. Total number of selections from the advantageous decks (C + D) minus the total number of selections from the disadvantageous decks (A + B) for a group of normal controls (n = 44), brain-damaged controls (n = 9), E.V.R., and E.V.R.-like subjects (n = 6). Bars represent means ± s.e.m. Positive scores reflect advantageous courses of action, and negative scores reflect disadvantageous courses of action.]
from that of normal controls, and quite the opposite of the performance of the prefrontal subjects. A one-way ANOVA on the difference between the total number of card selections from the advantageous decks and the total number of selections from the disadvantageous decks obtained from normal and brain-damaged controls did not reveal a significant difference between the two groups (F(1,52) = 0.1, p > .1), but the difference between the normal and E.V.R.-like groups was highly significant (F(1,50) = 74.8, p < .001). As a result of repeated testing, E.V.R.'s performance did not change, one way or the other, when tested one month after the first test, 24 h later, and, for the fourth time, six months later. This pattern of impaired performance was also seen in the other target subjects. By contrast, the performance of normal controls improved over time.
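The summary statistic plotted in Fig. 3 is a simple difference score. A minimal sketch, using made-up selection counts purely for illustration (not the study's data):

```python
def advantage_score(selections: dict) -> int:
    """(C + D) - (A + B) over 100 card selections.

    Positive scores indicate an advantageous course of action,
    negative scores a disadvantageous one.
    """
    return (selections["C"] + selections["D"]) - (selections["A"] + selections["B"])

# Hypothetical selection profiles over 100 cards, for illustration only.
control_like = {"A": 15, "B": 15, "C": 35, "D": 35}
target_like  = {"A": 35, "B": 35, "C": 15, "D": 15}

print(advantage_score(control_like))  # positive: advantageous pattern
print(advantage_score(target_like))   # negative: disadvantageous pattern
```

Because every subject makes exactly 100 selections, this single number captures the whole preference pattern: controls land well above zero, the prefrontal patients below it.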
Discussion

These results demonstrate that E.V.R. and comparable subjects perform defectively in this task, and that the defect is stable over time. Although the task involves a long series of gains and losses, it is not possible for subjects to perform an exact calculation of the net gains or losses generated from each deck as they play. Indeed, a group of normal control subjects with superior memory and IQ, whom we asked to think aloud while performing the task and to keep track of the magnitudes and frequencies of the various punishments, could not provide calculated figures of the net gains or losses from each deck. The subjects must rely on their ability to develop an estimate of which decks are risky and which are profitable in the long run. Thus, the patients' performance profile is comparable to their real-life inability to decide advantageously, especially in personal and social matters, a domain for which, in life as in the task, an exact calculation of the future outcomes is not possible and choices must be based on approximations. We believe this task offers, for the first time, the possibility of detecting these patients' elusive impairment in the laboratory, measuring it, and investigating its possible causes. Why do E.V.R.-like subjects make choices that have high immediate reward but severe delayed punishment? We considered three possibilities: (1) E.V.R.-like subjects are so sensitive to reward that the prospect of future (delayed) punishment is outweighed by that of immediate gain; (2) these subjects are insensitive to punishment, and thus the prospect of reward always prevails, even if they are not abnormally sensitive to reward; (3) these subjects are generally insensitive to future consequences, positive or negative, and thus their behavior is always guided by immediate prospects, whatever they may be.
To decide on the merit of these possibilities, we developed a variant of the basic task, in which the schedules of reward and punishment were reversed, so that the punishment is immediate and
the reward is delayed. The profiles of target subjects in that task suggest that they were influenced more by immediate punishment than by delayed reward (unpublished results). This indicates that neither insensitivity to punishment nor hypersensitivity to reward is an appropriate account of the defect. A qualitative aspect of the patients' performance also supports the idea that immediate consequences influence performance significantly. When they are faced with a significant money loss in a given deck, they refrain from picking cards from that same deck for a while, just as normals do, though unlike normals they then return to select from that deck after a few additional selections. When we combine the profiles of both the basic and the variant tasks, we are left with one reasonable possibility: that these subjects are unresponsive to future consequences, whatever they are, and are thus controlled mainly by immediate prospects. How can this "myopia" for the future be explained? Evidence from other studies suggests that these patients possess and can access the requisite knowledge to conjure up options of action and scenarios of future outcomes just as normal controls do (Saver & Damasio, 1991). Their defect seems to be at the level of acting on such knowledge. There are several plausible accounts of such a defect. For instance, it is possible that the representations of future outcomes that these patients evoke are unstable, that is, that they are not held in working memory long enough for attention to enhance them and for reasoning strategies to be applied to them. This account invokes a defect along the lines proposed for behavioral domains dependent on dorsolateral prefrontal cortex networks, and is possibly just as valid in the personal/social domain of decision-making (Goldman-Rakic, 1987). Defects in temporal integration and attention would fall under this account (Fuster, 1989; Posner, 1986).
Alternatively, the representations of future outcomes might be stable, but they would not be marked with a negative or positive value, and thus could not be easily rejected or accepted. This account invokes the somatic marker hypothesis which posits that the overt or covert processing of somatic states provides the value mark for a cognitive scenario (Damasio, 1994; Damasio et al., 1991). We have been attempting to distinguish between these two accounts in a series of subsequent experiments using this task along with psychophysiological measurements. Preliminary results favor the latter account, or a combination of the two accounts. Those results also suggest that the biasing effect of the value mark operates covertly, at least in the early stages of the task.
References

Damasio, A.R. (1994). Descartes' error: Emotion, reason, and the human brain. New York: Putnam (Grosset Books).
Damasio, A.R., Tranel, D., & Damasio, H. (1991). Somatic markers and the guidance of behavior. In H. Levin, H. Eisenberg, & A. Benton (Eds.), Frontal lobe function and dysfunction (pp. 217-228). New York: Oxford University Press.
Fuster, J.M. (1989). The prefrontal cortex (2nd edn.). New York: Raven Press.
Goldman-Rakic, P.S. (1987). Circuitry of primate prefrontal cortex and regulation of behavior by representational memory. In F. Plum (Ed.), Handbook of physiology: The nervous system (Vol. V, pp. 373-401). Bethesda, MD: American Physiological Society.
Milner, B. (1963). Effects of different brain lesions on card sorting. Archives of Neurology, 9, 90-100.
Milner, B., Petrides, M., & Smith, M.L. (1985). Frontal lobes and the temporal organization of memory. Human Neurobiology, 4, 137-142.
Petrides, M., & Milner, B. (1982). Deficits on subject-ordered tasks after frontal- and temporal-lobe lesions in man. Neuropsychologia, 20, 249-262.
Posner, M.I. (1986). Chronometric explorations of the mind. New York: Oxford University Press.
Saver, J.L., & Damasio, A.R. (1991). Preserved access and processing of social knowledge in a patient with acquired sociopathy due to ventromedial frontal damage. Neuropsychologia, 29, 1241-1249.
Shallice, T., & Evans, M.E. (1978). The involvement of the frontal lobes in cognitive estimation. Cortex, 14, 294-303.
2
Autism: beyond "theory of mind"

Uta Frith*, Francesca Happé

MRC Cognitive Development Unit, 4 Taviton Street, London WC1H 0BT, UK
Abstract

The theory of mind account of autism has been remarkably successful in making specific predictions about the impairments in socialization, imagination and communication shown by people with autism. It cannot, however, explain either the non-triad features of autism, or earlier experimental findings of abnormal assets and deficits on non-social tasks. These unexplained aspects of autism, and the existence of autistic individuals who consistently pass false belief tasks, suggest that it may be necessary to postulate an additional cognitive abnormality. One possible abnormality - weak central coherence - is discussed, and preliminary evidence for this theory is presented.
The theory of mind account of autism

In 1985 Cognition published an article by Baron-Cohen, Leslie, and Frith, entitled: Does the autistic child have a "theory of mind"? The perceptive reader would have recognized this as a reference to Premack and Woodruff's (1978) question: Does the chimpanzee have a theory of mind? The connection between these two was, however, an indirect one - the immediate precursor of the paper was Wimmer and Perner's (1983) article on the understanding of false beliefs by normally developing pre-school children. Each of these three papers has, in its way, triggered an explosion of research interest: in the social impairments of autism, the mind-reading capacities of non-human primates, and the development of social understanding in normal children. The connections which existed between the three papers have been mirrored in continuing connections between these three fields of research - developmental psychology (Astington, Harris, & Olson, 1989; Perner, 1991; Russell, 1992; Wellman, 1990), cognitive ethology
* Corresponding author
(Byrne & Whiten, 1988; Cheney & Seyfarth, 1990), and developmental psychopathology (Cicchetti & Cohen, in press; Rutter, 1987). There can be little doubt that these contacts have enriched work in each area. Perceptive readers would also have noticed the inverted commas surrounding the phrase "theory of mind" in the 1985 paper. Baron-Cohen, Leslie, and Frith followed Premack and Woodruff's definition of this "sexy" but misleading phrase: to have a theory of mind is to be able to attribute independent mental states to self and others in order to explain and predict behaviour. As might befit a "theory" ascribable to chimpanzees, this was not a conscious theory but an innately given cognitive mechanism allowing a special sort of representation - the representation of mental states. Leslie (1987, 1988) delivered the critical connection between social understanding and understanding of pretence, via this postulated mechanism; metarepresentation is necessary, in Leslie's theory, for representing pretence, belief and other mental states. From this connection, between the social world and the world of imaginative play, sprang the link to autistic children, who are markedly deficient in both areas. The idea that people with autism could be characterized as suffering from a type of "mind-blindness", or lack of theory of mind, has been useful to the study of child development - not because it was correct (that is still debatable) but because it was a causal account which was both specific and falsifiable. The clearest expression of this causal account is given in Frith, Morton, and Leslie (1991).

What is to be explained?

Autism is currently defined at the behavioural level, on the basis of impairments in socialization, communication and imagination, with stereotyped repetitive interests taking the place of creative play (DSM-III-R, American Psychiatric Association, 1987).
A causal account must link these behavioural symptoms to the presumed biological origins (Gillberg & Coleman, 1992; Schopler & Mesibov, 1987) of this disorder. Specificity is particularly important in any causal account of autism because autistic people themselves show a highly specific pattern of deficits and skills. The IQ profile alone serves to demonstrate this; autistic people in general show an unusually "spiky" profile across Wechsler subtests (Lockyer & Rutter, 1970; Tymchuk, Simmons, & Neafsey, 1977), excelling on Block Design (constructing a pattern with cubes), and failing on Picture Arrangement (ordering pictures in a cartoon strip). This puzzling discrepancy of functioning has caused many previous psychological theories of autism to fail. For example, high arousal, lack of motivation, language impairment, or perceptual problems are all too global to allow for both the assets and deficits of autism.
Fine cuts along a hidden seam

What are the specific predictions made by the hypothesis that people with autism lack a "theory of mind"? The hypothesis does not address the question of
the spiky IQ profile - it is silent on functioning in non-social areas - but it focuses on the critical triad of impairments (Wing & Gould, 1979). Not only does it make sense of this triad, but it also makes "fine cuts" within the triad of autistic impairments. Social and communicative behaviour is not all of one piece, when viewed from the cognitive level. Some, but not all, such behaviour requires the ability to "mentalize" (represent mental states). So, for example, social approach need not be built upon an understanding of others' thoughts - indeed Hermelin and O'Connor (1970) demonstrated to many people's initial surprise that autistic children prefer to be with other people, just like non-autistic children of the same mental age. However, sharing attention with someone else does require mentalizing - and is consistently reported by parents to be missing in the development of even able autistic children (Newson, Dawson, & Everard, 1984). The mentalizing-deficit account has allowed a systematic approach to the impaired and unimpaired social and communicative behaviour of people with autism. Table 1 shows some of the work exploring predictions from the hypothesis that autistic people lack mentalizing ability. The power of this hypothesis is to make fine cuts in the smooth continuum of behaviours, and in this it has been remarkably useful. It has sparked an enormous amount of research, both supporting and attacking the theory (reviewed by Baron-Cohen, Tager-Flusberg, & Cohen, 1993; Happe, 1994a; Happe & Frith, in press). The fine cuts method, as used in the laboratory, has also informed research
Table 1. Autistic assets and deficits as predicted by the "fine cuts" technique, between tasks which require mentalizing and those which do not

Assets | Deficits
Ordering behavioural pictures | Ordering mentalistic pictures (Baron-Cohen et al., 1986)
Understanding see | Understanding know (Perner et al., 1989)
Protoimperative pointing | Protodeclarative pointing (Baron-Cohen, 1989b)
Sabotage | Deception (Sodian & Frith, 1992)
False photographs | False beliefs (Leslie & Thaiss, 1992; Leekam & Perner, 1991)
Recognizing happiness and sadness | Recognizing surprise (Baron-Cohen et al., 1993)
Object occlusion | Information occlusion (Baron-Cohen, 1992)
Literal expression | Metaphorical expression (Happe, 1993)

References refer to Assets and Deficits.
Table 2. Autistic assets and deficits observed in real life

Assets | Deficits
Elicited structured play | Spontaneous pretend play (Wetherby & Prutting, 1984)
Instrumental gestures | Expressive gestures (Attwood, Frith, & Hermelin, 1988)
Talking about desires and emotions | Talking about beliefs and ideas (Tager-Flusberg, 1993)
Using person as tool | Using person as receiver of information (Phillips, 1993)
Showing "active" sociability | Showing "interactive" sociability (Frith et al., in press)

References refer to Assets and Deficits.
into the pattern of abilities and deficits in real life (Table 2), although this enterprise still has some way to go. This technique, which pits against each other two behaviours that differ only in the demands they make upon the ability to mentalize, pre-empts many potential criticisms. It is also peculiarly suitable for use in brain-imaging studies. Because performance is compared across tasks which are equivalent in every other way, except for the critical cognitive component, intellectual energy has been saved for the really interesting theoretical debates. Another key benefit of the specificity of this approach is its relevance for normal development. The fine cuts approach suits the current climate of increased interest in the modular nature of mental capacities (e.g., Cosmides, 1989; Fodor, 1983). It has allowed us to think about social and communicative behaviour in a new way. For this reason, autism has come to be a test case for many theories of normal development (e.g., Happe, 1993; Sperber & Wilson's 1986 Relevance theory).
Limitations of the theory of mind account

The hijacking of autism by those primarily interested in normal development has added greatly to the intellectual richness of autism research. But just how well does the theory of mind account explain autism? By the stringent standard that explanatory theories must give a full account of a disorder (Morton & Frith, in press), not that well. The mentalizing account has helped us to understand the nature of the autistic child's impairments in play, social interaction and verbal and non-verbal communication. But there is more to autism than the classic triad of impairments.
Non-triad features

Clinical impressions originating with Kanner (1943) and Asperger (1944; translated in Frith, 1991), and withstanding the test of time, include the following:
- Restricted repertoire of interests (necessary for diagnosis in DSM-III-R, American Psychiatric Association, 1987).
- Obsessive desire for sameness (one of two cardinal features for Kanner & Eisenberg, 1956).
- Islets of ability (an essential criterion in Kanner, 1943).
- Idiot savant abilities (striking in 1 in 10 autistic children, Rimland & Hill, 1984).
- Excellent rote memory (emphasized by Kanner, 1943).
- Preoccupation with parts of objects (a diagnostic feature in DSM-IV, forthcoming).
All of these non-triad aspects of autism are vividly documented in the many parental accounts of the development of autistic children (Hart, 1989; McDonnell, 1993; Park, 1967). None of these aspects can be well explained by a lack of mentalizing. Of course, clinically striking features shown by people with autism need not be specific features of the disorder. However, there is also a substantial body of experimental work, much of it predating the mentalizing theory, which demonstrates non-social abnormalities that are specific to autism. Hermelin and O'Connor were the first to introduce what was in effect a different "fine cuts" method (summarized in their 1970 monograph) - namely the comparison of closely matched groups of autistic and non-autistic handicapped children of the same mental age. Table 3 summarizes some of the relevant findings.
Table 3. Experimental findings not accounted for by mind-blindness. Surprising advantages and disadvantages on cognitive tasks, shown by autistic subjects relative to normally expected asymmetries

Unusual strength | Unusual weakness
Memory for word strings | Memory for sentences (e.g., Hermelin & O'Connor, 1967)
Memory for unrelated items | Memory for related items (e.g., Tager-Flusberg, 1991)
Echoing nonsense | Echoing with repair (e.g., Aurnhammer-Frith, 1969)
Pattern imposition | Pattern detection (e.g., Frith, 1970a, 1970b)
Jigsaw by shape | Jigsaw by picture (e.g., Frith & Hermelin, 1969)
Sorting faces by accessories | Sorting faces by person (e.g., Weeks & Hobson, 1987)
Recognizing faces upside-down | Recognizing faces right-way-up (e.g., Langdell, 1978)

References refer to Unusual strength and Unusual weakness.

The talented minority

The mentalizing deficit theory of autism, then, cannot explain all features of autism. It also cannot explain all people with autism. Even in the first test of the hypothesis (reported in the 1985 Cognition paper), some 20% of autistic children passed the Sally-Ann task. Most of these successful children also passed another test of mentalizing - ordering picture stories involving mental states (Baron-Cohen, Leslie, & Frith, 1986) - suggesting some real underlying competence in representing mental states. Baron-Cohen (1989a) tackled this apparent disconfirmation of the theory by showing that these talented children still did not pass a harder (second-order) theory of mind task (Perner & Wimmer, 1985). However, results from other studies focusing on high-functioning autistic subjects (Bowler, 1992; Ozonoff, Rogers, & Pennington, 1991) have shown that some autistic people can pass theory of mind tasks consistently, applying these skills across domains (Happe, 1993) and showing evidence of insightful social behaviour in everyday life (Frith, Happe, & Siddons, in press). One possible way of explaining the persisting autism of these successful subjects is to postulate an additional and continuing cognitive impairment. What could this impairment be? The recent interest in executive function deficits in autism (Hughes & Russell, 1993; Ozonoff, Pennington, & Rogers, 1991) can be seen as springing from some of the limitations of the theory of mind view discussed above. Ozonoff, Rogers, and Pennington (1991) found that while not all subjects with autism and/or Asperger's syndrome showed a theory of mind deficit, all were impaired on the Wisconsin Card Sorting Test and Tower of Hanoi (two typical tests of executive function). On the basis of this finding they suggest that executive function impairments are a primary causal factor in autism. However, the specificity, and hence the power, of this theory as a causal account has yet to be established by systematic comparison with other non-autistic groups who show impairments in executive functions (Bishop, 1993). While an additional impairment in executive functions may be able to explain certain (perhaps non-specific) features of autism (e.g., stereotypies, failure to plan, impulsiveness), it is not clear how it could explain the specific deficits and skills summarized in Table 3.
The central coherence theory

Motivated by the strong belief that both the assets and the deficits of autism spring from a single cause at the cognitive level, Frith (1989) proposed that autism is characterized by a specific imbalance in integration of information at different levels. A characteristic of normal information processing appears to be the tendency to draw together diverse information to construct higher-level meaning in context; "central coherence" in Frith's words. For example, the gist of a story is easily recalled, while the actual surface form is quickly lost, and is effortful to retain. Bartlett (1932), summarizing his famous series of experiments on remembering images and stories, concluded: "an individual does not normally take [such] a situation detail by detail... In all ordinary instances he has an overmastering tendency simply to get a general impression of the whole; and, on the basis of this, he constructs the probable detail" (p. 206). Another instance of central coherence is the ease with which we recognize the contextually appropriate sense of the many ambiguous words used in everyday speech (son-sun, meet-meat, sew-so, pear-pair). A similar tendency to process information in context for global meaning is also seen with non-verbal material - for example, our everyday tendency to misinterpret details in a jigsaw piece according to the expected position in the whole picture. It is likely that this preference for higher levels of meaning may characterize even mentally handicapped (non-autistic) individuals - who appear to be sensitive to the advantage of recalling organized versus jumbled material (e.g., Hermelin & O'Connor, 1967). Frith suggested that this universal feature of human information processing was disturbed in autism, and that a lack of central coherence could explain very parsimoniously the assets and deficits shown in Table 3.
On the basis of this theory, she predicted that autistic subjects would be relatively good at tasks where attention to local information - relatively piece-meal processing - is advantageous, but poor at tasks requiring the recognition of global meaning.
Empirical evidence: assets

A first striking signpost towards the theory appeared quite unexpectedly, when Amitta Shah set off to look at autistic children's putative perceptual impairments on the Embedded Figures Test. The children were almost better than the experimenter! Twenty autistic subjects with an average age of 13, and non-verbal mental age of 9.6, were compared with 20 learning disabled children of the same age and mental age, and 20 normal 9-year-olds. These children were given the Children's Embedded Figures Test (CEFT; Witkin, Oltman, Raskin, & Karp, 1971), with a slightly modified procedure including some pretraining with cut-out shapes. The test involved spotting a hidden figure (triangle or house shape)
among a larger meaningful drawing (e.g., a clock). During testing children were allowed to indicate the hidden figure either by pointing or by using a cut-out shape of the hidden figure. Out of a maximum score of 25, autistic children got a mean of 21 items correct, while the two control groups (which did not differ significantly in their scores) achieved 15 or less. Gottschaldt (1926) ascribed the difficulty of finding embedded figures to the overwhelming "predominance of the whole". The ease and speed with which autistic subjects picked out the hidden figure in Shah and Frith's (1983) study was reminiscent of their rapid style of locating tiny objects (e.g. thread on a patterned carpet) and their immediate discovery of minute changes in familiar lay-outs (e.g., arrangement of cleaning materials on bathroom shelf), as often described anecdotally. The study of embedded figures was introduced into experimental psychology by the Gestalt psychologists, who believed that an effort was needed to resist the tendency to see the forcefully created gestalt, at the expense of the constituent parts (Koffka, 1935). Perhaps this struggle to resist overall gestalt forces does not occur for autistic subjects. If people with autism, due to weak central coherence, have privileged access to the parts and details normally securely embedded in whole figures, then novel predictions could be made about the nature of their islets of ability. The Block Design subtest of the Wechsler Intelligence Scales (Wechsler, 1974, 1981) is consistently found to be a test on which autistic people show superior performance relative to other subtests, and often relative to other people of the same age. This test, first introduced by Kohs (1923), requires the breaking up of line drawings into logical units, so that individual blocks can be used to reconstruct the original design from separate parts. 
The designs are notable for their strong gestalt qualities, and the difficulty which most people experience with this task appears to relate to problems in breaking up the whole design into the constituent blocks. While many authors have recognized this subtest as an islet of ability in autism, this fact has generally been explained as due to intact or superior general spatial skills (Lockyer & Rutter, 1970; Prior, 1979). Shah and Frith (1993) suggested, on the basis of the central coherence theory, that the advantage shown by autistic subjects is due specifically to their ability to see parts over wholes. They predicted that normal, but not autistic, subjects would benefit from pre-segmentation of the designs. Twenty autistic, 33 normal and 12 learning disabled subjects took part in an experiment, where 40 different block designs had to be constructed from either whole or pre-segmented drawn models (Fig. 1). Autistic subjects with normal or near-normal non-verbal IQ were matched with normal children of 16 years. Autistic subjects with non-verbal IQ below 85 (and not lower than 57) were compared with learning disabled children of comparable IQ and chronological age (18 years), and normal children aged 10. The results showed that the autistic subjects' skill on this task resulted from a greater ability to segment the design. Autistic subjects showed superior performance compared to controls in one
Fig. 1. Examples of all types of design: "whole" versus "segmented" (1, 2, 3, 4 vs. 5, 6, 7, 8); "oblique" versus "non-oblique" (3, 4, 7, 8 vs. 1, 2, 5, 6); "unrotated" versus "rotated" (1, 3, 5, 7 vs. 2, 4, 6, 8).
condition only - when working from whole designs. The great advantage which the control subjects gained from using pre-segmented designs was significantly diminished in the autistic subjects, regardless of their IQ level. On the other hand, other conditions, which contrasted presence and absence of obliques, and rotated versus unrotated presentation, affected all groups equally. From these latter findings it can be concluded that general visuo-spatial factors show perfectly normal effects in autistic subjects, and that superior general spatial skill may not account for Block Design superiority.
Empirical evidence: deficits

While weak central coherence confers significant advantages in tasks where preferential processing of parts over wholes is useful, it would be expected to confer marked disadvantages in tasks which involve interpretation of individual
stimuli in terms of overall context and meaning. An interesting example is the processing of faces, which seems to involve both featural and configural processing (Tanaka & Farah, 1993). Of these two types of information, it appears to be configural processing which is disrupted by the inverted presentation of faces (Bartlett & Searcy, 1993; Rhodes, Brake, & Atkinson, 1993). This may explain the previously puzzling finding that autistic subjects show a diminished disadvantage in processing inverted faces (Hobson, Ouston, & Lee, 1988; Langdell, 1978). One case in which the meaning of individual stimuli is changed by their context is the disambiguation of homographs. In order to choose the correct (context-appropriate) pronunciation in the following sentences, one must process the final word as part of the whole sentence meaning: "He had a pink bow"; "He made a deep bow". Frith and Snowling (1983) predicted that this sort of contextual disambiguation would be problematic for people with autism. They tested 8 children with autism who had reading ages of 8-10 years, and compared them with 6 dyslexic children and 10 normal children of the same reading age. The number of words read with the contextually appropriate pronunciation ranged from 5 to 7 out of 10 for the autistic children, who tended to give the more frequent pronunciation regardless of sentence context. By contrast, the normal and dyslexic children read between 7 and 9 of the 10 homographs in a contextually determined manner. This finding suggested that autistic children, although excellent at decoding single words, were impaired when contextual cues had to be used. This was also demonstrated in their relative inability to answer comprehension questions and to fill in gaps in a story text. This work fits well with previous findings (Table 3) concerning failure to use meaning and redundancy in memory tasks.
The abnormality of excellence

The hypothesis that people with autism show weak central coherence aims to explain both the glaring impairments and the outstanding skills of autism as resulting from a single characteristic of information processing. One distinctive claim of this theory is that the islets of ability and savant skills are achieved through relatively abnormal processing, and it predicts that this may be revealed in abnormal error patterns. One example might be the type of error made in the Block Design test. The central coherence theory suggests that, where errors are made at all on Block Design, these will be errors which violate the overall pattern, rather than the details. Kramer, Kaplan, Blusewicz, and Preston (1991) found that in normal adult subjects there was a strong relation between the number of such configuration-breaking errors made on the Block Design test and the number of local (vs. global) choices made in a similarity-judgement task
(Kimchi & Palmer, 1982). Preliminary data from subjects with autism (Happe, in preparation) suggest that, in contrast to normal children, errors violating configuration are far more common than errors violating pattern details in autistic Block Design performance. A second example concerns idiot savant drawing ability. Excellent drawing ability may be characterized by a relatively piece-meal drawing style. Mottron and Belleville (1993) found in a case study of one autistic man with exceptional artistic ability that performance on three different types of tasks suggested an anomaly in the hierarchical organization of the local and global parts of figures. The authors observed that the subject "began his drawing by a secondary detail and then progressed by adding contiguous elements", and concluded that his drawings showed "no privileged status of the global form . . . but rather a construction by local progression". In contrast, a professional draughtsman who acted as a control started by constructing outlines and then proceeded to parts. It remains to be seen whether other savant abilities can be explained in terms of a similarly local and detail-observant processing style.
Central coherence and mentalizing

Central coherence, then, may be helpful in explaining some of the real-life features that have so far resisted explanation, as well as making sense of a body of experimental work not well accounted for by the mentalizing deficit theory. Can it also shed light on the continuing handicaps of those talented autistic subjects who show consistent evidence of some mentalizing ability? Happe (1991), in a first exploration of the links between central coherence and theory of mind, used Snowling and Frith's (1986) homograph reading task with a group of able autistic subjects. Autistic subjects were tested on a battery of theory of mind tasks at two levels of difficulty (first- and second-order theory of mind), and grouped according to their performance (Happe, 1993). Five subjects who failed all the theory of mind tasks, 5 subjects who passed all and only first-order tasks, and 6 subjects who passed both first- and second-order theory of mind tasks were compared with fourteen 7- to 8-year-olds. The autistic subjects were of mean age 18 years, and had a mean IQ of around 80. The three autistic groups and the control group obtained the same score for total number of words correctly read. As predicted, however, the young normal subjects, but not the autistic subjects, were sensitive to the relative position of target homograph and disambiguating context: "There was a big tear in her eye", versus "In her dress there was a big tear". The normal controls showed a significant advantage when sentence context occurred before the (rare-pronunciation) target word (scoring 5 out of 5, vs. 2 out of 5 where the target came first), while the autistic subjects (as in Frith and Snowling, 1983) tended to give the more frequent pronunciation regardless (3 out of 5 appropriate pronunciations in each case). The important point of this study was that this was true of all three autistic groups, irrespective of level of theory of mind performance. Even those subjects who consistently passed all the theory of mind tasks (mean VIQ 90) failed to use sentence context to disambiguate homograph pronunciation. It is possible, therefore, to think of weak central coherence as characteristic of even those autistic subjects who possess some mentalizing ability. Happe (submitted) explored this idea further by looking at WISC-R and WAIS subtest profiles. Twenty-seven children who failed standard first-order false belief tasks were compared with 21 subjects who passed. In both groups Block Design was a peak of non-verbal performance for the majority of subjects: 18/21 passers, and 23/27 failers. In contrast, performance on the Comprehension subtest (commonly thought of as requiring pragmatic and social skill) was a low point in verbal performance for 13/17 failers but only 6/20 passers. It seems, then, that while social reasoning difficulties (as shown by Wechsler tests) are striking only in those subjects who fail theory of mind tasks, skill on non-verbal tasks benefiting from weak central coherence is characteristic of both passers and failers. There is, then, preliminary evidence to suggest that the central coherence hypothesis is a good candidate for explaining the persisting handicaps of the talented minority. So, for example, when theory of mind tasks were embedded in slightly more naturalistic tasks, involving extracting information from a story context, even autistic subjects who passed standard second-order false belief tasks showed characteristic and striking errors of mental state attribution (Happe, 1994b). It may be that a theory of mind mechanism which is not fed by rich and integrated contextual information is of little use in everyday life.
The finding that weak central coherence may characterize autistic people at all levels of theory of mind ability goes against Frith's (1989) original suggestion that a weakness in central coherence could by itself account for theory of mind impairment. At present, all the evidence suggests that we should retain the idea of a modular and specific mentalizing deficit in our causal explanation of the triad of impairments in autism. It is still our belief that nothing captures the essence of autism so precisely as the idea of "mind-blindness". Nevertheless, for a full understanding of autism in all its forms, this explanation alone will not suffice. Therefore, our present conception is that there may be two rather different cognitive characteristics that underlie autism. Following Leslie (1987, 1988) we hold that the mentalizing deficit can be usefully conceptualized as the impairment of a single modular system. This system has a neurological basis - which may be damaged, leaving other functions intact (e.g., normal IQ). The ability to mentalize would appear to be of such evolutionary value (Byrne & Whiten, 1988; Whiten, 1991) that only insult to the brain can produce deficits in this area. By contrast, the processing characteristic of weak central coherence, as illustrated above, gives both advantages and disadvantages, as would strong central coherence. It is possible, then, to think of this balance (between preference for parts
Autism: beyond "theory of mind"
25
vs. wholes) as akin to a cognitive style, which may vary in the normal population. No doubt, this style would be subject to environmental influences, but, in addition, it may have a genetic component. It may be interesting, then, to focus on the strengths and weaknesses of autistic children's processing, in terms of weak central coherence, in looking for the extended phenotype of autism. Some initial evidence for this may be found in the report by Landa, Folstein, and Isaacs (1991) that the parents of children with autism tell rather less coherent spontaneous narratives than do controls.
Central coherence and executive function

With the speculative link to cognitive style rather than straightforward deficit, the central coherence hypothesis differs radically not only from the theory of mind account, but also from other recent theories of autism. In fact, every other current psychological theory claims that some significant and objectively harmful deficit is primary in autism. Perhaps the most influential of such general theories is the idea that autistic people have executive function deficits, which in turn cause social and non-social abnormalities. The umbrella term "executive functions" covers a multitude of higher cognitive functions, and so is likely to overlap to some degree with conceptions of both central coherence and theory of mind. However, the hypothesis that autistic people have relatively weak central coherence makes specific and distinct predictions even within the area of executive function. For example, the "inhibition of pre-potent but incorrect responses" may contain two separable elements: inhibition and recognition of the context-appropriate response. One factor which can make a pre-potent response incorrect is a change of context. If a stimulus is treated in the same way regardless of context, this may look like a failure of inhibition. However, autistic people may have no problem in inhibiting action where context is irrelevant. Of course it may be that some people with autism do have an additional impairment in inhibitory control, just as some have peripheral perceptual handicaps or specific language problems.
Future prospects

The central coherence account of autism is clearly still tentative and suffers from a certain degree of over-extension. It is not clear where the limits of this theory should be drawn - it is perhaps in danger of trying to take on the whole problem of meaning! One of the areas for future definition will be the level at which coherence is weak in autism. While Block Design and Embedded Figures tests appear to tap processing characteristics at a fairly low or perceptual level,
26
U. Frith, F. Happé
work on memory and verbal comprehension suggests higher-level coherence deficits. Coherence can be seen at many levels in normal subjects, from the global precedence effect in perception of hierarchical figures (Navon, 1977) to the synthesis of large amounts of information and extraction of inferences in narrative processing (e.g., Trabasso & Suh, 1993, in a special issue of Discourse Processes on inference generation during text comprehension).

One interesting way forward may be to contrast local coherence within modular systems with global coherence across these systems in central processing. So, for example, the calendrical calculating skills of some people with autism clearly show that information within a restricted domain can be integrated and processed together (O'Connor & Hermelin, 1984; Hermelin & O'Connor, 1986), but the failure of many such savants to apply their numerical skills more widely (some cannot multiply two given numbers) suggests a modular system specialized for a very narrow cognitive task. Similarly, Norris (1990) found that building a connectionist model of an "idiot savant date calculator" only succeeded when forced to take a modular approach.

Level of coherence may also be relative. So, for example, within text there is the word-to-word effect of local association, the effect of sentence context, and the larger effect of story structure. These three levels may be dissociable, and it may be that people with autism process the most local of the levels available in open-ended tasks.

The importance of testing central coherence with open-ended tasks is suggested by a number of findings. For example, Snowling and Frith (1986) demonstrated that it was possible to train subjects with autism to give the context-appropriate (but less frequent) pronunciation of ambiguous homographs.
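The narrowness of calendrical skill is easy to appreciate once one sees how little machinery day-of-week calculation actually requires. As a purely illustrative sketch (not a model of how savants compute, and not taken from the studies cited above), Zeller's congruence derives the weekday of any Gregorian date from a handful of integer operations:

```python
def day_of_week(year: int, month: int, day: int) -> str:
    """Weekday of a Gregorian date via Zeller's congruence.

    A self-contained arithmetic rule: no lookup tables beyond the
    weekday names, and no general numerical ability required.
    """
    # Zeller treats January and February as months 13 and 14
    # of the previous year.
    if month < 3:
        month += 12
        year -= 1
    k = year % 100   # year within the century
    j = year // 100  # zero-based century
    h = (day + (13 * (month + 1)) // 5 + k + k // 4 + j // 4 + 5 * j) % 7
    # In Zeller's convention, h = 0 corresponds to Saturday.
    names = ["Saturday", "Sunday", "Monday", "Tuesday",
             "Wednesday", "Thursday", "Friday"]
    return names[h]

print(day_of_week(2000, 1, 1))  # → Saturday
```

The point of the illustration is precisely the mismatch the text describes: mastering a closed rule like this implies nothing about wider arithmetic competence, which is consistent with the modular interpretation of savant skill.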
Weeks and Hobson (1987) found that autistic subjects sorted photographs of faces by type of hat when given a free choice, but, when asked again, were able to sort by facial expression. It seems likely, then, that autistic weak central coherence is most clearly shown in (non-conscious) processing preference, which may reflect the relative cost of two types of processing (relatively global and meaningful vs. relatively local and piecemeal). Just as the idea of a deficit in theory of mind has taken several years and considerable (and continuing) work to be empirically established, so the idea of a weakness in central coherence will require a systematic programme of research. It is to be hoped that, like the theory of mind account, the central coherence theory, whether right or wrong, will form a useful framework for thinking about autism in the future.
References

American Psychiatric Association (1987). Diagnostic and statistical manual of mental disorders (3rd ed., rev.; DSM-III-R). Washington, DC: American Psychiatric Association.
Asperger, H. (1944). Die "autistischen Psychopathen" im Kindesalter. Archiv fur Psychiatrie und Nervenkrankheiten, 117, 76-136. Astington, J.W., Harris, P.L., & Olson, D.R. (Eds.) (1989). Developing theories of mind. New York: Cambridge University Press. Attwood, A.H., Frith, U., & Hermelin, B. (1988). The understanding and use of interpersonal gestures by autistic and Down's syndrome children. Journal of Autism and Developmental Disorders, 18, 241-257. Aurnhammer-Frith, U. (1969). Emphasis and meaning in recall in normal and autistic children. Language and Speech, 12, 29-38. Baron-Cohen, S. (1989a). The autistic child's theory of mind: A case of specific developmental delay. Journal of Child Psychology and Psychiatry, 30, 285-297. Baron-Cohen, S. (1989b). Perceptual role taking and protodeclarative pointing in autism. British Journal of Developmental Psychology, 7, 113-127. Baron-Cohen, S. (1992). Out of sight or out of mind? Another look at deception in autism. Journal of Child Psychology and Psychiatry, 33, 1141-1155. Baron-Cohen, S., Leslie, A.M., & Frith, U. (1985). Does the autistic child have a "theory of mind"? Cognition, 21, 37-46. Baron-Cohen, S., Leslie, A.M., & Frith, U. (1986). Mechanical, behavioural and intentional understanding of picture stories in autistic children. British Journal of Developmental Psychology, 4, 113-125. Baron-Cohen, S., Spitz, A., & Cross, P. (1993). Can children with autism recognise surprise? Cognition and Emotion, 7, 507-516. Baron-Cohen, S., Tager-Flusberg, H., & Cohen, D.J. (Eds.) (1993). Understanding other minds: Perspectives from autism. Oxford: Oxford University Press. Bartlett, F.C. (1932). Remembering: A study in experimental and social psychology. Cambridge, UK: Cambridge University Press. Bartlett, J.C., & Searcy, J. (1993). Inversion and configuration of faces. Cognitive Psychology, 25, 281-316. Bishop, D.V.M. (1993). Annotation. Autism, executive functions and theory of mind: A neuropsychological perspective. 
Journal of Child Psychology and Psychiatry, 34, 279-293. Bowler, D.M. (1992). "Theory of mind" in Asperger's syndrome. Journal of Child Psychology and Psychiatry, 33, 877-893. Byrne, R., & Whiten, A. (Eds.) (1988). Machiavellian intelligence: Social expertise and the evolution of intellect in monkeys, apes, and humans. Oxford: Clarendon Press. Cheney, D.L., & Seyfarth, R.M. (1990). How monkeys see the world. Chicago: University of Chicago Press. Cicchetti, D., & Cohen, D.J. (Eds.) (in press). Manual of developmental psychopathology (Vol. 1). New York: Wiley. Cosmides, L. (1989). The logic of social exchange: Has natural selection shaped how humans reason? Studies with the Wason selection task. Cognition, 31, 187-276. Fodor, J.A. (1983). Modularity of mind. Cambridge, MA: MIT Press. Frith, U. (1970a). Studies in pattern detection in normal and autistic children: I. Immediate recall of auditory sequences. Journal of Abnormal Psychology, 76, 413-420. Frith, U. (1970b). Studies in pattern detection in normal and autistic children: II. Reproduction and production of color sequences. Journal of Experimental Child Psychology, 10, 120-135. Frith, U. (1989). Autism: Explaining the enigma. Oxford: Basil Blackwell. Frith, U. (1991). Translation and annotation of "Autistic psychopathy" in childhood, by H. Asperger. In U. Frith (Ed.), Autism and Asperger syndrome. Cambridge, UK: Cambridge University Press. Frith, U., Happe, F., & Siddons, F. (in press). Theory of mind and social adaptation in autistic, retarded and young normal children. Social Development. Frith, U., & Hermelin, B. (1969). The role of visual and motor cues for normal, subnormal and autistic children. Journal of Child Psychology and Psychiatry, 10, 153-163. Frith, U., Morton, J., & Leslie, A.M. (1991). The cognitive basis of a biological disorder: Autism. Trends in Neuroscience, 14, 433-438.
Frith, U., & Snowling, M. (1983). Reading for meaning and reading for sound in autistic and dyslexic children. British Journal of Developmental Psychology, 1, 329-342. Gillberg, C., & Coleman, M. (1992). The biology of the autistic syndromes. London: Mac Keith Press. Gottschaldt, K. (1926). Über den Einfluss der Erfahrung auf die Welt der Wahrnehmung von Figuren. Psychologische Forschung, 8, 261-317. Happé, F.G.E. (1991). Theory of mind and communication in autism. Unpublished Ph.D. thesis, University of London. Happé, F.G.E. (1993). Communicative competence and theory of mind in autism: A test of relevance theory. Cognition, 48, 101-119. Happé, F.G.E. (1994a). Annotation: Psychological theories of autism. Journal of Child Psychology and Psychiatry, 35, 215-229. Happé, F.G.E. (1994b). An advanced test of theory of mind: Understanding of story characters' thoughts and feelings by able autistic, mentally handicapped and normal children and adults. Journal of Autism and Developmental Disorders, 24, 1-24. Happé, F.G.E. (submitted). Theory of mind and IQ profiles in autism: A research note. Happé, F.G.E. (in preparation). Central coherence, block design errors, and global-local similarity judgement in autistic subjects. Happé, F., & Frith, U. (in press). Theory of mind in autism. In E. Schopler & G.B. Mesibov (Eds.), Learning and cognition in autism. New York: Plenum Press. Hart, C. (1989). Without reason: A family copes with two generations of autism. New York: Penguin Books. Hermelin, B., & O'Connor, N. (1967). Remembering of words by psychotic and subnormal children. British Journal of Psychology, 58, 213-218. Hermelin, B., & O'Connor, N. (1970). Psychological experiments with autistic children. Oxford: Pergamon. Hermelin, B., & O'Connor, N. (1986). Idiot savant calendrical calculators: Rules and regularities. Psychological Medicine, 16, 885-893. Hobson, R.P., Ouston, J., & Lee, T. (1988). What's in a face? The case of autism. British Journal of Psychology, 79, 441-453.
Hughes, C.H., & Russell, J. (1993). Autistic children's difficulty with mental disengagement from an object: Its implications for theories of autism. Developmental Psychology, 29, 498-510. Kanner, L. (1943). Autistic disturbances of affective contact. Nervous Child, 2, 217-250. Kanner, L., & Eisenberg, L. (1956). Early infantile autism 1943-1955. American Journal of Orthopsychiatry, 26, 55-65. Kimchi, R., & Palmer, S.E. (1982). Form and texture in hierarchically constructed patterns. Journal of Experimental Psychology: Human Perception and Performance, 8, 521-535. Koffka, K. (1935). Principles of Gestalt psychology. New York: Harcourt Brace. Kohs, S.C. (1923). Intelligence measurement. New York: Macmillan. Kramer, J.H., Kaplan, E., Blusewicz, M.J., & Preston, K.A. (1991). Visual hierarchical analysis of block design configural errors. Journal of Clinical and Experimental Neuropsychology, 13, 455-465. Landa, R., Folstein, S.E., & Isaacs, C. (1991). Spontaneous narrative-discourse performance of parents of autistic individuals. Journal of Speech and Hearing Research, 34, 1339-1345. Langdell, T. (1978). Recognition of faces: An approach to the study of autism. Journal of Child Psychology and Psychiatry, 19, 255-268. Leekam, S., & Perner, J. (1991). Does the autistic child have a metarepresentational deficit? Cognition, 40, 203-218. Leslie, A.M. (1987). Pretence and representation: The origins of "Theory of Mind". Psychological Review, 94, 412-426. Leslie, A.M. (1988). Some implications of pretence for mechanisms underlying the child's theory of mind. In J.W. Astington, P.L. Harris, & D.R. Olson (Eds.), Developing theories of mind. New York: Cambridge University Press. Leslie, A.M., & Thaiss, L. (1992). Domain specificity in conceptual development: Evidence from autism. Cognition, 43, 225-251.
Lockyer, L., & Rutter, M. (1970). A five to fifteen year follow-up study of infantile psychosis: IV. Patterns of cognitive ability. British Journal of Social and Clinical Psychology, 9, 152-163. McDonnell, J.T. (1993). News from the Border: A mother's memoir of her autistic son. New York: Ticknor & Fields. Morton, J., & Frith, U. (in press). Causal modelling: A structural approach to developmental psychopathology. In D. Cicchetti & D.J. Cohen (Eds.), Manual of Developmental Psychopathology (Vol. 1, Ch. 13). New York: Wiley. Mottron, L., & Belleville, S. (1993). A study of perceptual analysis in a high-level autistic subject with exceptional graphic abilities. Brain and Cognition, 23, 279-309. Navon, D. (1977). Forest before trees: The precedence of global features in visual perception. Cognitive Psychology, 9, 353-383. Newson, E., Dawson, M., & Everard, P. (1984). The natural history of able autistic people: Their management and functioning in social context. Summary of the report to DHSS in four parts. Communication, 18, 1-4; 19, 1-2. Norris, D. (1990). How to build a connectionist idiot (savant). Cognition, 35, 277-291. O'Connor, N., & Hermelin, B. (1984). Idiot savant calendrical calculators: Maths or memory. Psychological Medicine, 14, 801-806. Ozonoff, S., Pennington, B.F., & Rogers, S.J. (1991). Executive function deficits in high-functioning autistic children: Relationship to theory of mind. Journal of Child Psychology and Psychiatry, 32, 1081-1106. Ozonoff, S., Rogers, S.J., & Pennington, B.F. (1991). Asperger's syndrome: Evidence of an empirical distinction from high-functioning autism. Journal of Child Psychology and Psychiatry, 32, 1107-1122. Park, C.C. (1967). The siege: The battle for communication with an autistic child. Harmondsworth, UK: Penguin Books. Perner, J. (1991). Understanding the representational mind. Cambridge, MA: MIT Press. Perner, J., Frith, U., Leslie, A.M., & Leekam, S.R. (1989). 
Exploration of the autistic child's theory of mind: Knowledge, belief, and communication. Child Development, 60, 689-700. Perner, J., & Wimmer, H. (1985). "John thinks that Mary thinks that . . .": Attribution of second-order beliefs by 5-10 year old children. Journal of Experimental Child Psychology, 39, 437-471. Phillips, W. (1993). Understanding intention and desire by children with autism. Unpublished Ph.D. thesis, University of London. Premack, D., & Woodruff, G. (1978). Does the chimpanzee have a theory of mind? Behavioral and Brain Sciences, 1, 515-526. Prior, M.R. (1979). Cognitive abilities and disabilities in infantile autism: A review. Journal of Abnormal Child Psychology, 7, 357-380. Rhodes, G., Brake, S., & Atkinson, A.P. (1993). What's lost in inverted faces? Cognition, 47, 25-57. Rimland, B., & Hill, A.L. (1984). Idiot savants. In J. Wortis (Ed.), Mental retardation and developmental disabilities (Vol. 13, pp. 155-169). New York: Plenum Press. Russell, J. (1992). The theory-theory: So good they named it twice? Cognitive Development, 7, 485-519. Rutter, M. (1987). The role of cognition in child development and disorder. British Journal of Medical Psychology, 60, 1-16. Schopler, E., & Mesibov, G.B. (Eds.) (1987). Neurobiological issues in autism. New York: Plenum Press. Shah, A., & Frith, U. (1983). An islet of ability in autistic children: A research note. Journal of Child Psychology and Psychiatry, 24, 613-620. Shah, A., & Frith, U. (1993). Why do autistic individuals show superior performance on the Block Design task? Journal of Child Psychology and Psychiatry, 34, 1351-1364. Snowling, M., & Frith, U. (1986). Comprehension in "hyperlexic" readers. Journal of Experimental Child Psychology, 42, 392-415. Sodian, B., & Frith, U. (1992). Deception and sabotage in autistic, retarded and normal children. Journal of Child Psychology and Psychiatry, 33, 591-605. Sperber, D., & Wilson, D. (1986). Relevance: Communication and cognition. Oxford: Blackwell.
Tager-Flusberg, H. (1991). Semantic processing in the free recall of autistic children: Further evidence for a cognitive deficit. British Journal of Developmental Psychology, 9, 417-430. Tager-Flusberg, H. (1993). What language reveals about the understanding of minds in children with autism. In S. Baron-Cohen, H. Tager-Flusberg, & D.J. Cohen (Eds.), Understanding other minds: Perspectives from autism. Oxford: Oxford University Press. Tanaka, J.W., & Farah, M.J. (1993). Parts and wholes in face recognition. Quarterly Journal of Experimental Psychology, 46A, 225-245. Trabasso, T., & Suh, S. (1993). Understanding text: Achieving explanatory coherence through on-line inferences and mental operations in working memory. Discourse Processes, 16, 3-34. Tymchuk, A.J., Simmons, J.Q., & Neafsey, S. (1977). Intellectual characteristics of adolescent childhood psychotics with high verbal ability. Journal of Mental Deficiency Research, 21, 133-138. Wechsler, D. (1974). Wechsler Intelligence Scale for Children - Revised. New York: Psychological Corporation. Wechsler, D. (1981). Wechsler Adult Intelligence Scale - Revised. New York: Psychological Corporation. Weeks, S.J., & Hobson, R.P. (1987). The salience of facial expression for autistic children. Journal of Child Psychology and Psychiatry, 28, 137-152. Wellman, H.M. (1990). The child's theory of mind. Cambridge, MA: MIT Press. Wetherby, A.M., & Prutting, C.A. (1984). Profiles of communicative and cognitive-social abilities in autistic children. Journal of Speech and Hearing Research, 27, 364-377. Whiten, A. (Ed.) (1991). Natural theories of mind. Oxford: Basil Blackwell. Wimmer, H., & Perner, J. (1983). Beliefs about beliefs: Representation and the constraining function of wrong beliefs in young children's understanding of deception. Cognition, 13, 103-128. Wing, L., & Gould, J. (1979). Severe impairments of social interaction and associated abnormalities in children: Epidemiology and classification.
Journal of Autism and Developmental Disorders, 9, 11-29. Witkin, H.A., Oltman, P.K., Raskin, E., & Karp, S. (1971). A manual for the Embedded Figures Test. California: Consulting Psychologists Press.
3
Developmental dyslexia and animal studies: at the interface between cognition and neurology

Albert M. Galaburda
Department of Neurology, Beth Israel Hospital and Harvard Medical School, Boston, MA 02215, USA
Abstract

Recent findings in autopsy studies, neuroimaging, and neurophysiology indicate that dyslexia is accompanied by fundamental changes in brain anatomy and physiology, involving several anatomical and physiological stages in the processing stream, which can be attributed to anomalous prenatal and immediately postnatal brain development. Epidemiological evidence in dyslexic families led to the discovery of animal models with immune disease, comparable anatomical changes and learning disorders, which have added needed detail about mechanisms of injury and plasticity to indicate that substantial changes in neural networks concerned with perception and cognition are present. It is suggested that the disorder of language, which is the cardinal finding in dyslexic subjects, results from early perceptual anomalies that interfere with the establishment of normal cognitive-linguistic structures, coupled with primarily disordered cognitive processing associated with developmental anomalies of cortical structure and brain asymmetry. This notion is supported by electrophysiological data and by findings of anatomical involvement in subcortical structures close to the input as well as cortical structures involved in language and other cognitive functions. It is not possible at present to determine where the initial insult lies, whether near the input or in high-order cortex, or at both sites simultaneously.
The author thanks Dr. Lisa Schrott, presently at the Department of Psychiatry, University of Colorado Health Sciences Center, for help in the preparation of the report on behavioral studies in immune-defective mice. The author thanks his colleagues Drs. Gordon F. Sherman and Glenn D. Rosen for their collaboration. The preparation of this review was supported by grant 2P01 HD20806 from NIH/NICHD.
1. Introduction

An important aim of behavioral neurology in the clinics, and of its sister specialty concerned with normal biology, cognitive neuroscience, is the mapping of behavior onto neurophysiology and brain structure. Where it concerns complex behaviors such as the elements of language, high-level vision, biographical memory, consciousness and attention, and the clinical syndromes in which these functions fail, progress has been slow. This is partly because animal models have not, for the most part, been able to capture behaviors thought to be either purely human or to have achieved qualitative or quantitative uniqueness (or both) only in human beings. Developmental disorders offer a partial advantage in this regard in that they concern the early acquisition of cognitive capacities, in which relatively simpler biological and environmental factors may prove to be more important. Thus, for instance, sensory and perceptual deficits are more likely to play a role in language acquisition than they do after language has been implemented in the brain, and it is relatively easier to model sensory and perceptual abnormalities in animals than it is to model cognitive processes like language. Our recent work is driven by the hypothesis that the condition known as developmental dyslexia, also known as specific reading disability, originates as a disorder of perception affecting the brain at a vulnerable time (before the age of 1 year) when phonological structures relating to the native language are being organized in the developing brain. In this review, I will outline some of the findings in the brains of dyslexics and in suitable animal models which have led to this hypothesis.

Dyslexia remains a diagnosis that is made on educational grounds, that is, based on detection in the schools and on results of psychoeducational test batteries.
Some specific cognitive anomalies are thought to be present in nearly all dyslexics, for example phonological deficits (Fischer, Liberman, & Shankweiler, 1978; Morais, Luytens, & Alegria, 1984), but it is likely that many with similar deficits are not dyslexic and some dyslexics do not exhibit phonological deficits (see recent discussions in Cognition, Volume 48, Number 3, September 1993). Thus, there is no absolute cognitive marker, as yet, for the diagnosis of dyslexia. This should not be surprising in a skill for which many areas of mental capacity on the one hand and cultural influences on the other play such an important role. In terms of brain loci and etiology, the diagnosis of dyslexia probably encompasses several biological subtypes, as yet unidentified. Although it is theoretically possible to learn to be dyslexic, in which case the brain would be found to be normal, brains of individuals with this diagnosis, which have come to post-mortem examination, have exhibited fairly uniform neuropathological findings (see below). Autopsy studies have not disclosed differences between dyslexics of the visual and auditory types, either reflecting the fact that available methods are not capable of showing such differences or the fact that phonological cases,
being more common, have constituted the bulk of the anatomical studies. Findings so far, however, indicate that the brain is affected widely, including much of the perisylvian cortex containing both auditory and visual areas (Humphreys, Kaufmann, & Galaburda, 1990), as well as subcortical regions closer to the input (Galaburda & Livingstone, 1993; Livingstone, Rosen, Drislane, & Galaburda, 1991). The areas of anatomical abnormality are likely to interact with each other during development so that it is not possible to state with confidence at present whether the cortical changes are primary, causing developmental changes both upstream and downstream in the interconnected neural networks, or are in themselves secondary to changes occurring upstream or downstream first. Another possibility, more difficult to justify on the basis of available information, is that of injury to multiple stages of processing at the same developmental time. Our present research is directed in part toward answering the question of whether subcortical damage, or damage at early stages of processing, is the first step, which would support the hypothesis that cognitive deficits are secondary to early perceptual deficits.
2. Neuroanatomical characteristics

My colleagues and I have reported alterations in the pattern of brain asymmetry of language areas, as well as minor cortical malformations, in four male and three female dyslexic brains (Galaburda, Sherman, Rosen, Aboitiz, & Geschwind, 1985; Humphreys et al., 1990). Specifically, there is absence of the ordinary pattern of leftward asymmetry of the planum temporale, and the perisylvian cortex displays minor cortical malformations, including foci of ectopic neurons in the molecular layer and focal microgyria. The planum temporale is a part of the temporal lobe thought to make up a portion of Wernicke's speech area. The perisylvian cortices affected by the minor malformations include inferior frontal cortex, both in the region of and anterior to the classical Broca's area; cortex of the parietal operculum, often involved by lesions producing conduction aphasia; cortex of the inferior parietal lobule, often involved in anomic aphasia with writing disturbances; cortex of the superior temporal gyrus (part of Wernicke's area again); and cortex belonging to the "what" portion of the visual pathway in the middle and inferior temporal gyri.

A second set of observations concerns the human dyslexic thalamus. We have found that neurons both in the magnocellular layers of the lateral geniculate nucleus and in the left medial geniculate nucleus are smaller than expected (Galaburda & Livingstone, 1993; Livingstone et al., 1991). The former finding is associated with slowness in the early segments of the magnocellular pathway as
assessed by evoked response techniques addressing magnocellular function separately from parvocellular function (Livingstone et al., 1991). The latter may relate to the temporal processing abnormalities described in the auditory system of language-impaired children (Tallal & Piercy, 1973); similar temporal processing abnormalities have long been suspected to underlie deficits of aphasic patients (Efron, 1963). The relationship between the two elements of the first set of findings, that is, the anomaly in asymmetry and the cortical malformations, and the relationship between the first and second sets, that is, the anomaly in rapid temporal processing associated with thalamic changes, are further foci of research with possibilities for resolution in animal models.

Animal research has also aimed at answering the question of how minor malformations could lead to noticeable and even clinically persistent disorders of cognitive function. Some answers have come from the study of strains of mice that spontaneously develop autoimmune disease and may also display foci of neocortical ectopic neurons (Sherman, Galaburda, Behan, & Rosen, 1987; Sherman, Galaburda, & Geschwind, 1985). Thus, associated with these "ectopias", there may be significant changes in some cortical neuronal subtypes (Sherman, Stone, Rosen, & Galaburda, 1990), prominent alterations in corticocortical connectivity (Sherman, Rosen, Stone, Press, & Galaburda, 1992), and modification of behavior (see below). Furthermore, ectopic animals show alterations in the usual pattern of brain asymmetry, suggesting an interaction between very early developmental events and the expression of cortical asymmetry (Rosen, Sherman, Mehler, Emsbo, & Galaburda, 1989).
In the normal state, both in the mouse and human brains, increasing size asymmetry between homologous areas of the two hemispheres is associated with decreasing number of neurons and decreasing amount of cortex on the small side (Galaburda, Aboitiz, Rosen, & Sherman, 1986; Galaburda, Corsiglia, Rosen, & Sherman, 1987) (rather than increasing amount on the large side). This relationship breaks down in animals with ectopias, in which asymmetry and size and number of neurons interrelate randomly. Such a discovery was not possible in the dyslexic brain sample alone, since all of them displayed symmetry of the planum temporale. One could suggest, therefore, that variable degrees of asymmetry work well only when a specific relationship between degree of asymmetry and neuronal numbers is allowed to take place, and that the development of ectopias may interfere with this relationship. There is an additional formal relationship between asymmetry and callosal connectivity (Rosen, Sherman, & Galaburda, 1989), which we anticipate will also break down in the presence of ectopias (research in progress). This would add further support to the notion that symmetry in the presence of ectopias, as is the case in dyslexic brains, is likely to be associated with fundamental changes in the functional properties of networks participating in perceptual and cognitive activities. Early on there were reports of neuroimaging aberrations in the pattern of brain
asymmetry in dyslexic subjects (Hier, LeMay, Rosenberger, & Perlo, 1978). More recently, in a magnetic resonance imaging (MRI) study, Hynd and colleagues (Hynd, Semrud-Clikeman, Lorys, Novey, & Eliopulos, 1990) examined the specificity of the asymmetry changes reported in dyslexic brains. Their study found that symmetry of the planum was significantly more common in dyslexics as compared with normals and individuals with attention deficit disorder/hyperactivity syndrome (ADD/H). The study also found that both ADD/H subjects and dyslexics had a narrower right anterior temporal area, and that dyslexics alone had bilaterally smaller insular regions and a significantly reduced left planum temporale. A recent study by Leonard and colleagues, which parceled the region of the planum and adjacent parietal operculum on MRI, described a shift toward increased right parietal opercular tissue in the dyslexic sample, away from temporal tissue, as well as anomalies in folding of the cortex in that region (Leonard et al., 1993). This reduction in the size of the left planum is in contrast to research, again in normal experimental animals, showing that symmetry, as compared to asymmetry, reflects bilaterally large symmetrical regions rather than bilaterally small symmetrical regions (Galaburda et al., 1987), and relatively excessive numbers of neurons (Galaburda et al., 1986) and interhemispheric connections (Rosen et al., 1989; Witelson, 1985). However, as seen in the mice with ectopias, the normal interaction between asymmetry and size no longer exists in the latter (see above), and this may explain why different imaging studies find different effects on the planum temporale. Another explanation, in this author's experience, is that different investigators use somewhat different criteria and methods for outlining the planum temporale (Galaburda, 1993).
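Once the planum has been outlined, studies of this kind typically summarize left-right differences with a simple laterality index. The coefficient below is a conventional form of that index, and the symmetry cutoff is purely illustrative (neither the exact formula nor the 0.1 threshold is specified by the studies cited here):

```python
def asymmetry_coefficient(left: float, right: float) -> float:
    """Laterality index: (L - R) / (0.5 * (L + R)).

    Positive values indicate leftward asymmetry, negative values
    rightward; dividing by the mean size normalizes for overall
    brain or region size.
    """
    return (left - right) / (0.5 * (left + right))

def classify_planum(left: float, right: float,
                    threshold: float = 0.1) -> str:
    """Label a pair of planum areas; the 0.1 cutoff is illustrative."""
    delta = asymmetry_coefficient(left, right)
    if delta > threshold:
        return "leftward"
    if delta < -threshold:
        return "rightward"
    return "symmetric"

# A hypothetical pair of planum areas (mm^2): clearly leftward.
print(classify_planum(1200.0, 900.0))  # → leftward
```

Differences in where such a threshold is drawn, on top of differences in how the planum is outlined, are one concrete way the divergent symmetry rates reported across imaging studies could arise.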
Larsen and colleagues (Larsen, Hoien, Lundberg, & Odegaard, 1990) reconstructed and measured the planum temporale in 19 eighth-grade dyslexics and appropriate controls using MRI. In the dyslexics, 70% of the brains showed symmetry of the planum, as compared to only 30% of the control sample. Among the dyslexic subgroup exhibiting phonological deficits, 100% of the brains showed absence of asymmetry of the planum, leading the authors to suggest that asymmetry of the planum is necessary for normal phonological awareness. Another MRI study examined neuroanatomical differences between dyslexics and normals (Duara et al., 1991). This study found that a brain segment lying anterior to the occipital pole was larger on the right in dyslexics but not in controls. Relative area measurements of the midsagittal corpus callosum (CC) showed that female dyslexics had the largest CC, followed by male dyslexics, followed in turn by normal readers. The splenium of the callosum was also largest in female dyslexics, followed by male dyslexics, followed by normal males, followed by normal females. None of the brain segments corresponded to the planum temporale of other studies, so direct comparisons were not possible. The splenium, however, contains fibers from the posterior temporal and parietal cortices, which participate in language functions. Thus this report, too, supports the notion
36
A. Galaburda
that alterations in the CC may be characteristic of dyslexic brains and may underlie, in part, their reading difficulties. There have been several functional tomographic studies using positron emission tomography (PET) (Hagman et al., 1992; Rumsey et al., 1992). Rumsey and colleagues (1992) demonstrated anomalies in cerebral blood flow in the left temporoparietal region, an area important for language, in dyslexic men. Some studies have found increased left-handedness in dyslexics and increased dyslexia in left-handers (Geschwind & Behan, 1982; Schachter, Ransil, & Geschwind, 1987), and left-handers, like dyslexics, often show aberrant patterns of brain asymmetry (LeMay, 1977). A recent study (Steinmetz, Volkmann, Jancke, & Freund, 1991) confirmed this finding in MRI scans. The authors reconstructed and measured the planum temporale in 52 normal subjects, 26 right-handers and 26 left-handers, and found that left-handers had a lesser degree of leftward planum asymmetry than right-handers. This study produced particularly accurate reconstructions of the planum temporale from MRI scans, which corresponded most closely to studies done directly on post-mortem material.

3. Neurophysiological studies

Altered brain potentials have been described in dyslexics. One recent study (Landwehrmeyer, Gerling, & Wallesch, 1990) looked at auditory potentials evoked by a variety of linguistic and non-linguistic stimuli in dyslexics and non-dyslexics. Both groups showed increased right hemisphere surface negativity in response to a non-linguistic stimulus. Unlike normal readers, however, dyslexics failed to show increased left hemisphere negativity to linguistic stimuli, and instead showed increased right hemisphere negativity, suggesting that linguistic stimuli are treated in part as non-linguistic stimuli by this group. Another study (Chayo-Dichy, Ostrosky-Solis, Meneses, Harmony, & Guevara, 1990) looked at the contingent negative variation (CNV), or expectancy wave.
The resolution of this wave is called the postimperative negative variation (PINV). In a sample of nine right-handed preadolescent boys, both amplitude and latency differences at a left parietal site were documented in the PINV, and amplitude differences in the CNV, as compared to matched controls. This suggested to the authors that significant differences existed in expectancy, attention and the processing of brain activity signals. A third study (Stelmack & Miles, 1990) found that disabled readers seemed to fail to engage long-term semantic memory, as demonstrated by their N400 responses to visually presented primed and unprimed words. These are waves occurring about 400 ms after stimulus onset that presumably reflect activity in cognitively related cortex. Together, these results indicate that physiological differences exist between dyslexic and normal readers in cognitive processing, particularly affecting linguistic categories. What is less well recognized
is their difficulty at the more peripheral levels of sensory processing and perception. The flicker fusion rate, which is the fastest rate at which a contrast reversal of a stimulus can still be seen, is abnormally slow in dyslexic children at low spatial frequencies and low contrasts (Lovegrove, Garzia, & Nicholson, 1990). Another evoked potentials study (Livingstone et al., 1991) reported visual findings in dyslexics that can be attributed to perceptual anomalies and therefore do not primarily implicate language dysfunction. Flickering checkerboard patterns were presented to dyslexics and non-dyslexics at different contrasts and rates, and the transient and sustained visual evoked potentials were recorded. The parvocellular pathway, which is slow, relatively contrast insensitive, and color selective, appeared to function normally in the dyslexic group. On the other hand, the dyslexics showed abnormalities when fast, low-contrast stimuli were presented. These stimuli are handled by the magnocellular pathway of the visual system, which is segregated already in the retina and remains separate through the lateral geniculate nucleus (LGN), the primary visual cortex (V1) and higher-order visual cortices (Livingstone & Hubel, 1988). There is, however, an ongoing debate as to the extent to which the two subsystems remain segregated, beginning even in V1, and whether the separation changes in character altogether (Ferrera, Nealey, & Maunsell, 1992; Lachica, Beck, & Casagrande, 1993). The timing of the physiological abnormality in the dyslexics suggested a magnocellular deficit early in the pathway, implicating the retina, the LGN and/or V1, none of which is currently thought to mediate cognitive functions.
Moreover, in the same study, the neurons in the magno- and parvocellular layers of the LGN were measured in five dyslexic and five control brains, and it was found that only the magno cells were smaller in the dyslexic group, thus complementing the physiological findings of a slowed magnocellular system. The parvo cells were not changed in size, but in some specimens both magno and parvo layers displayed disorganization of their architecture. Earlier evidence suggested a similar defect in fast auditory processing (Tallal & Piercy, 1973) and proposed that fast processing may be abnormal in several modalities, including the visual and the auditory, and that such abnormalities may interfere with auditory and visual language acquisition and with efficient language processing at loci where fast processing is required for the extraction of meaning. Williams and Lecluyse (1990) have taken advantage of this possibility and have shown that image blurring, which reduces the contrast of high spatial frequencies, can re-establish normal temporal processing of words in disabled readers. We have also found preliminary evidence that the types of anatomical abnormalities described in the visual thalamus may extend to the auditory thalamus as well. We have measured the cell bodies of representative regions of the medial geniculate nuclei of dyslexic and control brains and have found a shift in the former toward smaller neurons, especially affecting the left
hemisphere (Galaburda & Livingstone, 1993). Such a shift may reflect a corruption of a hitherto poorly understood large-celled system, and lessened differences between this large-celled system and a slower small-celled system, which could again produce difficulties in the processing of rapidly changing auditory information (Tallal & Piercy, 1973). It is not possible to state at this stage in the research program whether the functional deficits demonstrated early in the pathway, associated with anatomical changes also relatively close to the sensory organs, represent a primary failure or the result of changes that began downstream and propagated toward the periphery (as well as further downstream). Although all of these possibilities are predicted by current notions of developmental plasticity, animal models are being developed in our laboratory to study this question specifically vis-à-vis the relationship between cortical anomalies and changes in the thalamus: which came first?
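As a purely illustrative aside, a shift toward smaller neurons of the sort reported for the dyslexic medial geniculate can be tested with a simple label-shuffling (permutation) test on mean cell-body size. This is not the authors' analysis, and the measurements below are synthetic stand-ins, not data from the studies cited.

```python
# Permutation test sketch: is group a's mean cell size smaller than group b's?
# All values are synthetic placeholders for cell cross-sectional areas (um^2).
import random

def perm_test_mean_diff(a, b, n_iter=10_000, seed=0):
    """One-sided p-value for mean(a) < mean(b) under random label shuffling."""
    rng = random.Random(seed)
    observed = sum(a) / len(a) - sum(b) / len(b)
    pooled = list(a) + list(b)
    count = 0
    for _ in range(n_iter):
        rng.shuffle(pooled)
        diff = sum(pooled[:len(a)]) / len(a) - sum(pooled[len(a):]) / len(b)
        if diff <= observed:   # shuffled difference at least as extreme
            count += 1
    return count / n_iter

dyslexic = [118, 120, 125, 119, 117, 121, 116, 122]   # synthetic, smaller cells
control  = [132, 128, 135, 130, 127, 133, 129, 131]   # synthetic, larger cells
print(f"one-sided p = {perm_test_mean_diff(dyslexic, control):.4f}")
```

A permutation test is attractive for small post-mortem samples (here, five brains per group in the actual study) because it makes no normality assumption.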
4. Behavioral studies in animals with anomalous cortex

As indicated above, NZB and BXSB mice are being used as an animal model of cortical malformations associated with the human dyslexic condition (Denenberg, Mobraaten et al., 1991; Denenberg et al., 1992; Denenberg, Sherman, Schrott, Rosen, & Galaburda, 1991; Sherman et al., 1987). Because dyslexia is a specific learning disorder and may be accompanied by enhancement of certain abilities (Geschwind & Galaburda, 1985a, 1985b, 1985c), the use of a behavioral battery is crucial. The battery of behavioral tasks that we have administered includes four measures of learning (discrimination, spatial, complex maze and avoidance learning), as well as tests of lateralization and activity. A priori, any behavioral abnormalities demonstrated in the immune-defective mice could be attributed to the presence of cortical malformations, to abnormalities in immune function making the animal sick, or to a combination of both. We have been able to separate the behavioral deficits into two types: ectopia-associated behaviors and autoimmune-related behaviors.
4.1. Ectopia-associated behaviors

In working with inbred strains, one of the most difficult problems encountered is the choice of an appropriate control group. Comparing across inbred strains is risky, since behaviors are known to be influenced by the vastly different genetics of each strain. Therefore, within-strain comparisons were used to examine the behavior of NZB and BXSB mice. In the investigation of ectopia-associated behaviors, this was easily accomplished since 40-50% of NZB and BXSB mice
develop ectopias. Three measures in the behavioral battery were found to be sensitive to the presence of ectopias: a non-spatial discrimination learning task, and two spatial measures, water escape and the Morris maze. Ectopias depressed performance on discrimination learning. This test utilized a two-arm swimming T-maze with a grey stem and a black and a white alley. An escape ladder hung at the end of the alley designated to be positive (Wimer & Weller, 1965). Reinforcement consisted of escape from the water plus placement in a dry box beneath a heat-lamp. The left-right location of the positive stimulus was altered in a semi-random sequence, so the animals had to use an associative, rather than a spatial or positional, strategy to solve the task. Measures included the number of correct choices and the time to reach the escape ladder. Mice received 10 trials a day for 5 days (Denenberg, Talgo, Schrott, & Kenner, 1990). In this task, NZB mice with ectopias made fewer correct choices and took longer to find the escape ladder over the first 4 days of testing, but caught up by day 5. In ectopic NZB mice, a significant improvement in both performance measures was seen when they were reared in an enriched environment as compared to standard cages. No such effects were seen in non-ectopic mice (Schrott, Denenberg, Sherman, Waters, Rosen, & Galaburda, 1992). A similar pattern of behavior is seen in the Morris maze. This is a complex spatial task requiring the animal to find a hidden escape platform using extra-maze cues. The starting point varies among four locations in a semi-random sequence. Time and distance to reach the escape platform were measured. In addition, the maze is divided into four quadrants and three annuli, and the percentage of time spent in each portion was also measured. Mice received 4 trials a day for 5 days (Denenberg et al., 1990).
Ectopic NZB mice were slower to find the escape platform and spent more time in the outermost annulus of the maze, reflecting a different pattern of learning from that of NZB mice without ectopias. Rearing in an enriched environment, however, compensated for the deficit in ectopic mice: enriched ectopic NZB mice performed similarly to their enriched non-ectopic littermates (Schrott, Denenberg, Sherman, Waters, et al., 1992). This is in itself interesting, since it supports the notion that early focal brain injury can be compensated for, at least in part, when enriched early environments make alternative strategies available. The presence of ectopias also interacts with an animal's paw preference. On the discrimination learning task, right-pawed BXSB male mice had better performance than their left-pawed ectopic littermates. The opposite relationship was seen for the spatial water escape task. In this task an animal was placed at one end of an oval tub and had to swim to find a hidden escape platform at the other end using extra-maze spatial cues. The mice received 5 trials on a single day of testing, and the time to reach the platform was recorded. For water escape learning, left-pawed NZB males and females and BXSB males had faster times than their right-pawed counterparts. No paw preference effects were seen in non-ectopic
mice of either strain (Denenberg, Talgo, Carroll, Freter, & Deni, 1991). Again, this is interesting in the light of claims that dyslexics are more likely than non-dyslexics to be left-handed (Geschwind & Behan, 1982), suggesting that ectopias could have more or less severe functional consequences according to laterality. The behavioral consequences of the presence of ectopias are task-specific. No main ectopia effects are seen for measures of lateralization, activity, the Lashley maze or avoidance conditioning. Ectopic mice are capable of learning, but often at a slower rate or with poorer scores. In addition, ectopias interact with other variables, such as paw preference and environmental enrichment. Across numerous studies it has been found that the presence of an ectopia, rather than its architectonic location, hemisphere or size, is the crucial characteristic for these associations. Most likely this is because the damage associated with an ectopia is more widespread than the focal lesion itself. The disruption of underlying fiber architecture, the alterations in neuronal circuitry and the neurotransmitter abnormalities that accompany an ectopia reflect a brain that has developed abnormally (Sherman, Morrison, Rosen, Behan, & Galaburda, 1990; Sherman, Rosen, Stone, Press, & Galaburda, 1992; Sherman, Stone, Press, Rosen, & Galaburda, 1990; Sherman, Stone, Rosen, & Galaburda, 1990). Whether the learning deficits are a direct consequence of the ectopia, or whether an ectopia is a marker for aberrant development in general, with concomitant learning deficits, is not known at present.
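The semi-random left-right sequences used in the discrimination and water tasks above can be sketched as follows. The balanced-sides, limited-run constraint is a Gellermann-style assumption for illustration; the exact scheme used in the cited studies is not specified in the text.

```python
# Sketch of a semi-random side sequence: equal numbers of left and right
# placements, with no run of the same side longer than max_run, so that a
# purely positional strategy cannot succeed. Gellermann-style constraint
# assumed; not necessarily the scheme used in the cited studies.
import random

def semi_random_sides(n_trials=10, max_run=3, seed=None):
    """Return a balanced L/R sequence with runs of at most max_run."""
    rng = random.Random(seed)
    while True:  # rejection sampling: reshuffle until the run constraint holds
        seq = ["L"] * (n_trials // 2) + ["R"] * (n_trials - n_trials // 2)
        rng.shuffle(seq)
        runs_ok = all(
            len(set(seq[i:i + max_run + 1])) > 1
            for i in range(len(seq) - max_run)
        )
        if runs_ok:
            return seq

print(semi_random_sides(10, seed=1))
```

Rejection sampling is wasteful for long sequences but is simple and unbiased for the 10-trial daily sessions described above.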
4.2. Autoimmune-related behavior
It should be remembered that ectopias arise from influences on brain development taking place as early as the 13th embryonic day, and that subsequently, with increasing age, many mice with or without ectopias acquire autoimmune disease consisting of humoral and cell-mediated injury to many organs other than the brain. In fact, after the formation of the blood-brain barrier soon after birth, the brain is relatively protected from autoimmunity. On the other hand, metabolic changes arising from the failure of organs such as the kidneys and liver, which are often involved in autoimmunity, could cross the blood-brain barrier and indirectly affect brain function. Poor performance in active and passive avoidance conditioning has been consistently associated with autoimmune mice (Denenberg, Mobraaten et al., 1991; Denenberg et al., 1992; Forster, Retz, & Lal, 1988; Nandy, Lal, Bennet, & Bennett, 1983; Schrott, Denenberg, Sherman, Waters et al., 1992; Spencer, Humphries, Mathis, & Lal, 1986). In the present set of studies, avoidance conditioning was conducted in a two-way shuttlebox. The box was separated into two compartments by a divider. Five seconds of a pulsed light served as the
conditioned stimulus, while up to 20 s of 0.4 mA footshock acted as the unconditioned stimulus. The number of avoidances, escapes (and the time to make them), and null responses were recorded. The most striking characteristic of NZB and BXSB mice is their very poor performance. Escape from the shock is their preferential response, with few avoidances made. It is interesting to note that failure to avoid or escape the shock (a null response) is not extinguished rapidly, as it typically would be in an animal learning this task. Instead, null responses and the latency to escape often increased across days. This response pattern is associated with a high degree of autoimmunity: the more autoimmune a mouse, the poorer its avoidance performance. The negative relationship between these two variables was more difficult to establish than the relationships between ectopias and behavior, because of the lack of a proper control. Comparing avoidance performance across strains (an autoimmune strain vs. a strain with normal immune functioning) is problematic, because an avoidance difference could result from any number of genetic differences unrelated to immune functioning. Comparing avoidance performance within a strain was not possible, because all mice within a strain develop an autoimmune condition and there is insufficient variability in avoidance performance (all mice perform poorly). This difficulty was solved in a rather complicated way. A set of embryo transfer studies permitted comparison of genetically identical mice that differed in immune status as a function of the uterine environment in which they developed. Transfer of a non-autoimmune DBA embryo to an autoimmune BXSB maternal host induced autoimmune disease in the adult animal, as well as impaired avoidance performance.
Conversely, when the severity of the disease was reduced by transferring an NZB embryo to a non-autoimmune hybrid mother, avoidance performance improved (Denenberg, Mobraaten et al., 1991). Further support for this association was provided by a study of BXSB-DBA reciprocal hybrids. The hybrid offspring were autosomally identical but differed in their degrees of immune reactivity. The DBA x BXSB cross yielded offspring with greater immune reactivity and poorer avoidance performance than the BXSB x DBA cross (Denenberg et al., 1992). Finally, in a group of genetically related mice, the NXRF recombinant inbred lines, the line with the greatest degree of autoimmunity had the poorest avoidance performance (Schrott, personal communication). Thus, in four groups of mice with vastly different rearing histories, the degree of autoimmunity was negatively related to performance on an active avoidance conditioning task. The degree of autoimmunity was not associated with any of the other tasks in the behavioral battery. Environmental enrichment had no effect on avoidance learning, nor were any ectopia interactions present (Denenberg, Mobraaten et al., 1991; Denenberg et al., 1992; Schrott, Denenberg, Sherman, Waters et al., 1992). In addition, pharmacological manipulations including cholinomimetics,
nootropics and antidepressants failed to improve performance (Schrott, Denenberg, Sherman, Waters et al., 1992). Although the negative neuroanatomical and neurochemical findings cannot conclusively prove that immune dysregulation mediates active avoidance deficits in autoimmune mice, they are consistent with this hypothesis. Possible immune mechanisms include (1) immune complex deposition on brain membranes and subsequent alterations in the permeability of the blood-brain barrier, (2) effects of circulating autoantibodies or other autoimmune factors, (3) cytokine effects, (4) altered stress responses and/or hormonal interactions, and (5) developmental aspects. These possible mechanisms are by no means mutually exclusive.
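The avoidance/escape/null taxonomy of shuttlebox responses described in this section can be made concrete with a small sketch. The timing conventions (a crossing during the 5 s light counts as an avoidance, a crossing during the up-to-20 s shock counts as an escape) follow the task description above; the exact boundary handling is an assumption.

```python
# Trial-outcome classifier for the two-way shuttlebox task described above.
# CS: 5 s of pulsed light; US: up to 20 s of 0.4 mA footshock.
# Boundary conventions are illustrative assumptions.
CS_DURATION = 5.0        # seconds of pulsed light (conditioned stimulus)
US_MAX_DURATION = 20.0   # maximum seconds of footshock (unconditioned stimulus)

def classify_trial(crossing_latency):
    """crossing_latency: seconds from CS onset to the shuttle crossing,
    or None if the animal never crossed (a null response)."""
    if crossing_latency is None or crossing_latency > CS_DURATION + US_MAX_DURATION:
        return "null"        # no response within the trial window
    if crossing_latency <= CS_DURATION:
        return "avoidance"   # crossed during the light, before shock onset
    return "escape"          # crossed only after the shock began

trials = [3.2, 11.0, None, 24.9, None]
print([classify_trial(t) for t in trials])
# ['avoidance', 'escape', 'null', 'escape', 'null']
```

The classifier makes the reported pattern easy to state: autoimmune NZB and BXSB mice produce mostly "escape" and "null" outcomes, with few "avoidance" responses.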
4.3. Other behaviors

One behavior that does not fit into either of these categories is the Lashley maze, a complex maze which can be solved using spatial and/or associative learning strategies. BXSB mice show excellent performance on this task, while NZB mice have great difficulty learning it, even when given additional trials and cues (Schrott, Denenberg, Sherman, Rosen, & Galaburda, 1992). NZB mice are known to have abnormalities of the hippocampus, including the formation of ectopias and a small infrapyramidal mossy fiber tract system (Anstatt, 1988; Fink, Zilles, & Schleicher, 1991; Nowakowski, 1988). These abnormalities are not seen in BXSB mice (Sherman et al., unpublished data) and may account for the poor performance of NZBs in the Lashley maze. In addition, NZB mice have a low incidence of callosal agenesis (approximately 7%). Certain recombinant inbred lines with NZB as one of the progenitors develop a higher incidence of callosal agenesis, and this abnormality affects spatial learning (Schrott, personal communication).
4.4. Temporal processing in rats

Language-impaired individuals exhibit severe deficits in the discrimination of rapidly presented auditory stimuli, including phonological and non-verbal stimuli such as sequential tones (Tallal, 1984; Tallal & Piercy, 1973; Fig. 1). In an effort to relate these results to the animal model, male rats with neonatally induced focal malformations of the cortex, similar to one of the forms found in dyslexic brains (Humphreys, Rosen, Press, Sherman, & Galaburda, 1991), were tested in an operant paradigm for auditory discrimination of stimuli consisting of two sequential tones. Subjects were shaped to perform a go-no-go target identification, using water reinforcement. Stimuli were reduced in duration from 750 to 375 ms across 24 days of testing. Results showed that all rats were able to
[Figure 1. Comparison of auditory temporal processing impairments in language-impaired (LI) and normal humans and in rats on a two-tone sequence task. The graph on the left illustrates the percentage correct of a two-tone discrimination in LI and normal individuals (see Tallal & Piercy, 1973) at different total stimulus times (in milliseconds); impairment in the rats is calculated as false alarm minus hit latency (in milliseconds), for sham-operated and lesioned animals.]
discriminate at longer stimulus durations. However, bilaterally lesioned rats showed a specific impairment at stimulus durations of 400 ms or less, performing significantly worse than shams. Right- and left-lesioned subjects performed significantly worse than shams at the shortest duration (250 ms; Fig. 1). Interestingly, the neonatal lesion did not substantially involve the auditory pathways, suggesting that the effects of a nearby lesion may propagate along connectionally related areas and produce changes in those areas that are incompatible with normal temporal processing capacity. The experiments could not address the question of whether the cortical lesion propagates upstream and results in temporal processing anomalies early in the pathway, or whether all the results could be explained by late slowing. These questions are, however, under further experimental investigation.
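The stimulus schedule described above (two-tone stimuli shortened from 750 to 375 ms across 24 days of testing) can be sketched as follows. The linear step-down and the tone labels are assumptions for illustration; the actual schedule and target are not specified in the text.

```python
# Sketch of the duration schedule and go/no-go rule for the two-tone task.
# Linear interpolation from 750 ms (day 1) to 375 ms (day 24) is an
# assumption; the cited study's exact schedule is not specified.
def stimulus_duration_ms(day, n_days=24, start=750.0, end=375.0):
    """Total two-tone stimulus duration on a given testing day (1-indexed)."""
    assert 1 <= day <= n_days
    frac = (day - 1) / (n_days - 1)
    return start + frac * (end - start)

def is_go_trial(sequence, target=("high", "low")):
    """Go/no-go rule: respond only when the two-tone sequence matches the
    target. The tone labels here are illustrative placeholders."""
    return tuple(sequence) == target

print(stimulus_duration_ms(1))        # 750.0
print(stimulus_duration_ms(24))       # 375.0
print(is_go_trial(["high", "low"]))   # True
```

Shortening the stimulus gradually, rather than starting at 375 ms, lets all animals first acquire the discrimination; the deficit then emerges only as durations fall below roughly 400 ms.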
References

Anstatt, T. (1988). Quantitative and cytoarchitectonic studies of the entorhinal region and the hippocampus of New Zealand Black mice. Journal of Neural Transmission, 73, 249-257.
Chayo-Dichy, R., Ostrosky-Solis, F., Meneses, S., Harmony, T., & Guevara, M.A. (1990). The late event related potentials CNV and PINV in normal and dyslexic subjects. International Journal of Neuroscience, 54, 347-357.
Denenberg, V.H., Mobraaten, L.E., Sherman, G.F., Morrison, L., Schrott, L.M., Waters, N.S., Rosen, G.D., Behan, P.O., & Galaburda, A.M. (1991). Effects of the autoimmune uterine/maternal environment upon cortical ectopias, behavior and autoimmunity. Brain Research, 563, 114-122.
Denenberg, V.H., Sherman, G.F., Morrison, L., Schrott, L.M., Waters, N.S., Rosen, G.D., Behan, P.O., & Galaburda, A.M. (1992). Behavior, ectopias and immunity in BD/DB reciprocal crosses. Brain Research, 571, 323-329.
Denenberg, V.H., Sherman, G.F., Schrott, L.M., Rosen, G.D., & Galaburda, A.M. (1991). Spatial learning, discrimination learning, paw preference and neocortical ectopias in two autoimmune strains of mice. Brain Research, 562, 98-104.
Denenberg, V.H., Talgo, N.W., Carroll, D.A., Freter, S., & Deni, R.A. (1991). A computer-aided procedure for measuring Lashley III maze performance. Physiology & Behavior, 50, 857-861.
Denenberg, V., Talgo, N., Schrott, L., & Kenner, G. (1990). A computer-aided procedure for measuring discrimination learning. Physiology & Behavior, 47, 1031-1034.
Duara, R., Kushch, A., Gross-Glenn, K., Barker, W.W., Jallad, B., Pascal, S., Loewenstein, D.A., Sheldon, J., Rabin, M., Levin, B., & Lubs, H. (1991). Neuroanatomic differences between dyslexic and normal readers on magnetic resonance imaging scans. Archives of Neurology, 48, 410-416.
Efron, R. (1963). Temporal perception, aphasia, and déjà vu. Brain, 86, 403-424.
Ferrera, V.P., Nealey, T.A., & Maunsell, J.H. (1992). Mixed parvocellular and magnocellular geniculate signals in visual area V4. Nature, 358, 756-761.
Fink, G.R., Zilles, K., & Schleicher, A. (1991). Postnatal development of forebrain regions in the autoimmune NZB-mouse: A model for degeneration in neuronal systems. Anatomy and Embryology, 183, 579-588.
Fischer, F.W., Liberman, I.Y., & Shankweiler, D. (1978). Reading reversals and developmental dyslexia: A further study. Cortex, 14, 496-510.
Forster, M.J., Retz, K.C., & Lal, H. (1988). Learning and memory deficits associated with autoimmunity: Significance in aging and Alzheimer's disease. Drug Development Research, 15, 253-273.
Galaburda, A.M. (1993). The planum temporale (Editorial). Archives of Neurology, 50, 457.
Galaburda, A.M., Aboitiz, F., Rosen, G.D., & Sherman, G.F. (1986). Histological asymmetry in the primary visual cortex of the rat: Implications for mechanisms of cerebral asymmetry. Cortex, 22, 151-160.
Galaburda, A.M., Corsiglia, J., Rosen, G.D., & Sherman, G.F. (1987). Planum temporale asymmetry: Reappraisal since Geschwind and Levitsky. Neuropsychologia, 25, 853-868.
Galaburda, A.M., & Livingstone, M.S. (1993). Evidence for a magnocellular defect in developmental dyslexia. Annals of the New York Academy of Sciences, 682, 70-82.
Galaburda, A.M., Sherman, G.F., Rosen, G.D., Aboitiz, F., & Geschwind, N. (1985). Developmental dyslexia: Four consecutive cases with cortical anomalies. Annals of Neurology, 18, 222-233.
Geschwind, N., & Behan, P.O. (1982). Left-handedness: Association with immune disease, migraine, and developmental disorder. Proceedings of the National Academy of Sciences, USA, 79, 5097-5100.
Geschwind, N., & Galaburda, A.M. (1985a). Cerebral lateralization. Biological mechanisms, associations, and pathology: I. A hypothesis and a program for research. Archives of Neurology, 42, 428-459.
Geschwind, N., & Galaburda, A.M. (1985b). Cerebral lateralization. Biological mechanisms, associations, and pathology: II. A hypothesis and a program for research. Archives of Neurology, 42, 521-552.
Geschwind, N., & Galaburda, A.M. (1985c). Cerebral lateralization. Biological mechanisms, associations, and pathology: III. A hypothesis and a program for research. Archives of Neurology, 42, 634-654.
Hagman, J.O., Wood, F., Buchsbaum, M.S., Tallal, P., Flowers, L., & Katz, W. (1992). Cerebral brain metabolism in adult dyslexic subjects assessed with positron emission tomography during performance of an auditory task. Archives of Neurology, 49, 734-739.
Hier, D.B., LeMay, M., Rosenberger, P.B., & Perlo, V. (1978). Developmental dyslexia: Evidence for a sub-group with reversed cerebral asymmetry. Archives of Neurology, 35, 90-92.
Humphreys, P., Kaufmann, W.E., & Galaburda, A.M. (1990). Developmental dyslexia in women: Neuropathological findings in three cases. Annals of Neurology, 28, 727-738.
Humphreys, P., Rosen, G.D., Press, D.M., Sherman, G.F., & Galaburda, A.M. (1991). Freezing lesions of the newborn rat brain: A model for cerebrocortical microgyria. Journal of Neuropathology and Experimental Neurology, 50, 145-160.
Hynd, G., Semrud-Clikeman, M., Lorys, A., Novey, E., & Eliopulos, R. (1990). Brain morphology in developmental dyslexia and attention deficit disorder/hyperactivity. Archives of Neurology, 47, 919-926.
Lachica, E.A., Beck, P.D., & Casagrande, V.A. (1993). Intrinsic connections of layer III of striate cortex in squirrel monkey and bush baby: Correlations with patterns of cytochrome oxidase. Journal of Comparative Neurology, 329, 163-187.
Landwehrmeyer, B., Gerling, J., & Wallesch, C.W. (1990). Patterns of task-related slow brain potentials in dyslexia. Archives of Neurology, 47, 791-797.
Larsen, J., Hoien, T., Lundberg, I., & Odegaard, H. (1990). MRI evaluation of the size and symmetry of the planum temporale in adolescents with developmental dyslexia. Brain and Language, 39, 289-301.
LeMay, M. (1977). Asymmetries of the skull and handedness: Phrenology revisited. Journal of Neurological Science, 32, 243-253.
Leonard, C.M., Voeller, K.K.S., Lombardino, L.J., Morris, M.K., Hynd, G.W., Alexander, A.W., Andersen, H.G., Garofalakis, M., Honeyman, J.C., Mao, J.T., Agee, O.F., & Staab, E.V. (1993). Anomalous cerebral structure in dyslexia revealed with magnetic resonance imaging. Archives of Neurology, 50, 461-469.
Livingstone, M., & Hubel, D. (1988). Segregation of form, color, movement, and depth: Anatomy, physiology and perception. Science, 240, 740-749.
Livingstone, M.S., Rosen, G.D., Drislane, F.W., & Galaburda, A.M. (1991). Physiological and anatomical evidence for a magnocellular defect in developmental dyslexia. Proceedings of the National Academy of Sciences USA, 88, 7943-7947.
Lovegrove, W., Garzia, R., & Nicholson, S. (1990). Experimental evidence for a transient system
deficit in specific reading disability. Journal of the American Optometric Association, 2, 137-146.
Morais, J., Luytens, M., & Alegria, J. (1984). Segmentation abilities of dyslexics and normal readers. Perceptual and Motor Skills, 58, 221-222.
Nandy, K., Lal, H., Bennet, M., & Bennett, D. (1983). Correlation between a learning disorder and elevated brain-reactive antibodies in aged C57BL/6 and young NZB mice. Life Sciences, 33, 1499-1503.
Nowakowski, R.S. (1988). Development of the hippocampal formation in mutant mice. Drug Development Research, 15, 315-336.
Rosen, G.D., Sherman, G.F., & Galaburda, A.M. (1989). Interhemispheric connections differ between symmetrical and asymmetrical brain regions. Neuroscience, 33, 525-533.
Rosen, G.D., Sherman, G.F., Mehler, C., Emsbo, K., & Galaburda, A.M. (1989). The effect of developmental neuropathology on neocortical asymmetry in New Zealand Black mice. International Journal of Neuroscience, 45, 247-254.
Rumsey, J.M., Andreason, P., Zametkin, A.J., Aquino, T., King, A.C., Hamburger, S.D., Pikus, A., Rapoport, J.L., & Cohen, R.M. (1992). Failure to activate the left temporoparietal cortex in dyslexia: An oxygen-15 positron emission tomographic study. Archives of Neurology, 49, 527-534.
Schachter, S.C., Ransil, B.J., & Geschwind, N. (1987). Associations of handedness with hair color and learning disabilities. Neuropsychologia, 25, 269-276.
Schrott, L.M., Denenberg, V.H., Sherman, G.F., Rosen, G.D., & Galaburda, A.M. (1992). Lashley maze deficits in NZB mice. Physiology & Behavior, 52, 1085-1089.
Schrott, L.M., Denenberg, V.H., Sherman, G.F., Waters, N.S., Rosen, G.D., & Galaburda, A.M. (1992). Environmental enrichment, neocortical ectopias, and behavior in the autoimmune NZB mouse. Developmental Brain Research, 67, 85-93.
Sherman, G.F., Galaburda, A.M., Behan, P.O., & Rosen, G.D. (1987). Neuroanatomical anomalies in autoimmune mice. Acta Neuropathologica (Berlin), 74, 239-242.
Sherman, G.F., Galaburda, A.M., & Geschwind, N. (1985). Cortical anomalies in brains of New Zealand mice: A neuropathologic model of dyslexia? Proceedings of the National Academy of Sciences USA, 82, 8072-8074.
Sherman, G.F., Morrison, L., Rosen, G.D., Behan, P.O., & Galaburda, A.M. (1990). Brain abnormalities in immune defective mice. Brain Research, 532, 25-33.
Sherman, G.F., Rosen, G.D., Stone, L.V., Press, D.M., & Galaburda, A.M. (1992). The organization of radial glial fibers in spontaneous neocortical ectopias of newborn New Zealand Black mice. Developmental Brain Research, 67, 279-283.
Sherman, G.F., Stone, J.S., Press, D.M., Rosen, G.D., & Galaburda, A.M. (1990). Abnormal architecture and connections disclosed by neurofilament staining in the cerebral cortex of autoimmune mice. Brain Research, 529, 202-207.
Sherman, G.F., Stone, J.S., Rosen, G.D., & Galaburda, A.M. (1990). Neocortical VIP neurons are increased in the hemisphere containing focal cerebrocortical microdysgenesis in New Zealand Black mice. Brain Research, 532, 232-236.
Spencer, D.G., Humphries, K., Mathis, D., & Lal, H. (1986). Behavioral impairments related to cognitive dysfunction in the autoimmune New Zealand Black mouse. Behavioral Neuroscience, 100, 353-358.
Steinmetz, H., Volkmann, J., Jancke, L., & Freund, H.J. (1991). Anatomical left-right asymmetry of language-related temporal cortex is different in left-handers and right-handers. Annals of Neurology, 29, 315-319.
Stelmack, R.M., & Miles, J. (1990). The effect of picture priming on event-related potentials of normal and disabled readers during a word recognition memory task. Journal of Clinical and Experimental Neuropsychology, 12, 887-903.
Tallal, P. (1984). Temporal or phonetic processing deficit in dyslexia? That is the question. Applied Psycholinguistics, 5, 167-169.
Tallal, P., & Piercy, M. (1973). Defects of non-verbal auditory perception in children with developmental aphasia. Nature, 241, 468-469.
Developmental dyslexia and animal studies
47
Williams, M., & Lecluyse, K. (1990). Perceptual consequences of a temporal processing deficit in reading disabled children. Journal of the American Optometry Association, 2, 111-121. Wimer, R., & Weller, S. (1965). Evaluation of a visual discrimination task for the analysis of the genetics of mouse behavior. Perceptual and Motor Skills, 20, 203-208. Witelson, S.F. (1985). The brain connection: The corpus callosum is larger in left handers. Science, 229, 665-668.
4
Foraging for brain stimulation: toward a neurobiology of computation

C.R. Gallistel*
Department of Psychology, University of California at Los Angeles, 405 Hilgard Ave., Los Angeles, CA 90024-1563, USA
Abstract The self-stimulating rat performs foraging tasks mediated by simple computations that use interreward intervals and subjective reward magnitudes to determine stay durations. This is a simplified preparation in which to study the neurobiology of the elementary computational operations that make cognition possible, because the neural signal specifying the value of a computationally relevant variable is produced by direct electrical stimulation of a neural pathway. Newly developed measurement methods yield functions relating the subjective reward magnitude to the parameters of the neural signal. These measurements also show that the decision process that governs foraging behavior divides the subjective reward magnitude by the most recent interreward interval to determine the preferability of an option (a foraging patch). The decision process sets the parameters that determine stay durations (durations of visits to foraging patches) so that the ratios of the stay durations match the ratios of the preferabilities.
Introduction

Cognitive psychology arose when psychological theorists became convinced that behavior was mediated by computational processes that could not readily be described in the language of reflex physiology. Thus, the rise of cognitive psychology widened the conceptual gap between neuroscientific theory and
*Tel. (310) 206 7932, fax (310) 206 5895, e-mail [email protected]
psychological theory. We need a conceptual scheme and research strategies to bridge the gap. One strategy is to assume that we already understand those aspects of neural functioning that enable neural tissue to carry out complex computations. In that case, we may hope to work from the bottom up, trying to build satisfactory models of psychological processes using what we know about how the nervous system works as the starting point of the modeling effort. However, there are reasons to doubt that we know the computationally relevant properties of neural tissue. We clearly do not know one such property. We do not know how neural tissue stores information; yet, storing and retrieving the values of variables are fundamental to computation. It is also far from clear that we know the neural processes that implement other elements of computation, such as adding and multiplying the values of the retrieved variables. In order to have a neurobiology of computation, we will have to discover how the elements of computation are realized in the central nervous system. If we do not already know the computationally relevant properties of neural tissue, then we must work from the top down. We must develop simplified preparations in which elementary computational operations demonstrably occur, derive from the behavioral study of those preparations properties of the neural processes that mediate the computations, then use our knowledge of those properties to discover the underlying cellular and molecular mechanisms. In pursuing this strategy, we follow the approach that proved successful in other areas where science has established the physical basis for phenomena that initially seemed refractory to physico-chemical explanation. Genes and their properties were inferred from the study of inheritance. Genes were then shown by the methods of classical genetics to be present in simple organisms (bacteria and yeast) that lent themselves to biochemical manipulation. 
This led in time to the elucidation of the molecular structure of the gene. The chemical identification of the gene revolutionized biochemistry. The mechanism of inheritance was not - and in retrospect could not have been - deduced by building models based on what biochemists understood about molecular structure before 1954, because crucial aspects of the structure of complex biological molecules were undreamed of in biochemistry up to that point. Genetic theory required molecules that could generate copies of themselves. How this was to be accomplished chemically was, to say the least, obscure until the revelation of the sequence of complementary base-pairs cross-linking the two strands of DNA. Computational models of psychological processes require the storing and retrieving of the values of the variables used in computation. How this is to be accomplished by neurophysiological processes is, to say the least, obscure. Thus, the discovery of the neural realization of the elements of computation may someday revolutionize neuroscience.
Developing a suitable preparation

The pursuit of a top-down strategy requires the development of suitable simple preparations. The preparations must be as simple as possible from two different perspectives: a psychological perspective and a neurobiological perspective. From a psychological perspective, they must exploit those behavioral phenomena for which we have computational models that are clear, well developed, relatively simple, and empirically well supported. Most work on bacterial genes focuses on a relatively few genes whose effects are simply expressed and readily measured. Work on the neurobiology of computation needs likewise to focus on cognitive phenomena that are relatively well understood at the computational level. A good case can be made that behavior based on comparison of remembered and currently elapsing temporal intervals is a suitably simple aspect of cognition. Extensive experimental and theoretical work by Gibbon and Church and their collaborators has established a detailed model of the process by which temporal intervals are measured and recorded in memory and, more importantly, detailed models of the decision processes. The decision processes that mediate responding in timing tasks use simple computations to translate remembered and currently elapsing intervals into behavior (see Church, 1984; Gibbon, Church, & Meck, 1984, for reviews). Models of timing behavior are among the simplest, most clearly worked out, and most quantitatively successful computational models in cognitive psychology. Moreover, it is becoming clear that they elucidate and integrate a wide range of phenomena in classical and instrumental conditioning (Gallistel, 1990; Gallistel, in press; Gibbon, 1992; Gibbon, Church, Fairhurst, & Kacelnik, 1988). Thus, timing tasks recommend themselves from a psychological perspective.
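As a toy illustration only (the ratio rule below is in the spirit of the timing models cited above, but the threshold and interval values are made up, not the cited models' actual parameters):

```python
# Simplified sketch of a ratio-comparison timing decision of the kind the
# Gibbon/Church models describe: respond once the currently elapsing interval
# is close enough, by ratio, to the remembered interval. The threshold value
# here is hypothetical.

def respond(elapsing_s, remembered_s, threshold=0.8):
    """Return True once elapsing/remembered exceeds the decision threshold."""
    return elapsing_s / remembered_s > threshold

remembered = 10.0                # remembered interreward interval, seconds
print(respond(5.0, remembered))  # False: too early
print(respond(9.5, remembered))  # True: near the remembered time
```

Comparing by ratio rather than by difference gives the timescale invariance these models emphasize.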
From a neurobiological perspective, there should be some apparent strategy by which one might try to identify the neural circuitry that carries out the computational aspects of the task. And, there should be behavioral methods by which we may derive quantitative properties of the neurophysiological processes that mediate the storage of the variable and/or the ensuing computations. We must determine quantitative properties of these computational processes by behavioral methods rather than by neurophysiological methods precisely because we do not yet know what neurophysiologically identified processes we should measure. Our goal is to discover which neurophysiological processes we should be examining. Until we have reached that goal, our knowledge of the quantitative properties of the relevant neurophysiological processes must come from measures based on their behavioral consequences. It is only by their behavioral consequences that we know them, just as genes were for many decades known only through their consequences for inheritance. Finally, there should be some idea of how the computationally relevant signals
might someday be generated in the neural circuit after it was subjected to the radical isolation required for serious work on cellular and molecular mechanisms. It is becoming commonplace to cut slices or slabs of living neural tissue containing interesting neural circuits, remove them from the brain, and keep them alive for hours or days in Petri dishes, where they may be subjected to experiments that could not be carried out in the intact brain. Suppose one had removed from the brain a slab containing a neural circuit that mediated the storage and retrieval of temporal intervals, reward magnitudes, and the computations involved in the decision processes that translate these stored variables into choice behavior. How could you generate in the isolated neural circuit the signals that served as the inputs to that circuit in the intact animal? You cannot deliver food rewards to isolated neural circuits.
Rationale for using self-stimulation

For some years, much work on electrical self-stimulation of the brain in the rat has been motivated by the belief that this phenomenon may provide us with a preparation in which to study the neurobiology of the elementary computational operations that make complex cognitive processes possible. Here, I explain this rationale before summarizing our recent experimental findings from the study of matching behavior in self-stimulating rats. Our findings emphasize the computational nature of the decision processes that mediate the choices a foraging animal makes, and they yield the function relating the magnitude of a stored variable to the parameters of the neural signal that specifies its value. Matching behavior is the very general tendency of animals - from pigeons to humans - to allocate the times they invest in competing choices in a manner that matches the relative returns obtained from those choices (Davison & McCarthy, 1988; Herrnstein, 1961, 1991). The return from the time invested in a given choice is the magnitude of the rewards received from that investment divided by the interreward interval. The relative return is the ratio of the return from a given choice to the sum of the returns from all the available choices. The to-be-reviewed experimental findings suggest that the decision process underlying matching behavior computes these subjective quantities - the return and the relative return - then adjusts the parameters of a stochastic "patch-leaving" process in such a way that ratios of the expected stay durations match ratios of the most recently obtained returns.
Thus, in matching behavior, as in other tasks that depend on the assessment and comparison of temporal intervals, the behavior provides an unusually direct reflection of an elementary computation performed by the central nervous system: the ratios of the times allocated to the alternatives match the computed ratios of the recently experienced returns from each alternative.
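A minimal sketch of the matching computation just described, in Python (the function names and the magnitudes and intervals are illustrative, not data from the experiments):

```python
# Sketch of the matching computation: each option (foraging patch or lever)
# has a subjective reward magnitude and a most recent interreward interval;
# its return is magnitude / interval, and time shares match relative returns.

def returns(options):
    """options: list of (subjective_magnitude, interreward_interval_s) pairs."""
    return [m / t for m, t in options]

def time_allocation(options):
    """Matching law: ratios of stay durations match ratios of returns."""
    r = returns(options)
    total = sum(r)
    return [ri / total for ri in r]

options = [(10.0, 8.0), (20.0, 16.0)]  # magnitude 10 every 8 s vs. 20 every 16 s
print(returns(options))          # [1.25, 1.25]: equal returns
print(time_allocation(options))  # [0.5, 0.5]: equal time on each option
```

Rewards twice as big delivered half as often yield equal returns, so the matching allocation is 50:50.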
In matching behavior as it is commonly studied in the laboratory, the remembered variables that enter into the computations of the decision process - the subjective durations of the interreward intervals and the subjective magnitudes of the rewards - are specified by external events - the onsets of tones and lights and the delivery of food rewards. The computationally relevant neural signals produced by these events are at present almost impossible to determine. However, in the self-stimulating rat, the quantities that enter into the computation - the subjective durations of the interreward intervals and the subjective magnitudes of the rewards - are specified by direct electrical stimulation of a pathway in the central nervous system itself. Generating the reward signal by direct electrical stimulation of neural tissue provides dramatic simplification from both the psychological and the neurobiological perspectives. The stimulus that produces the rewarding effect in self-stimulation is a brief train (0.1-1.0 s long) of 0.1 ms cathodal pulses delivered to the medial forebrain bundle, a neuroanatomically complex collection of diverse projections interconnecting the upper brain stem and the forebrain. The parameters of the rewarding neural signal - the number of axons fired by the pulses and the number of times per second they fire - are determined by the current and pulse frequency, respectively. Thus, in the self-stimulating rat, we bypass the complexities of the sensory perceptual circuits that translate natural events into the signals specifying the values of the psychological variables in computationally interesting decision processes. We generate the computationally relevant signal directly. The rat's desire for brain stimulation reward is intense and insatiable. It will work for hours on end for these rewards, even when it gets 40-60 large rewards per minute - a rate of return that would rapidly satiate its desire for any known natural reward.
The combination of direct control over the rewarding neural signal and a rewarding effect that sustains almost any amount of responding makes it possible to bring psychophysical methods to bear in analyzing the neural pathway that carries the signal (Shizgal & Murray, 1989; Yeomans, 1988). Psychophysical procedures may also be used to determine the function relating the psychological magnitude of the remembered reward to the strength and duration of the neural signal that produces it (Leon & Gallistel, 1992; Mark & Gallistel, 1993; Simmons & Gallistel, in press). Thus, we have a directly controllable neural signal related in a known way to a psychological magnitude that plays a central role in computationally interesting decision processes. From the neurobiological standpoint, the problem of identifying the relevant neural circuitry is greatly simplified by the fact that the search starts at a localized site in the central nervous system. The axons that carry the rewarding signal must pass within about a 0.5 mm radius of the electrode tip. This gives us the starting point for a line of experiments that combines neuroanatomical, psychophysical and electrophysiological methods in an attempt to identify the axons that carry the rewarding signal, describe the morphology and physiological characteristics of
the neurons from which these axons arise, and investigate the electrophysiological and neurochemical processes in the postsynaptic cells of the circuits in which these neurons are embedded (Shizgal & Murray, 1989; Stellar & Rice, 1989; Wise & Rompre, 1989; Yeomans, 1988). Finally, if we succeed in identifying the circuit and isolating it for cellular and molecular study, we can use electrical stimulation to activate it in isolation - the same stimulus used in the intact animal. Thus, the self-stimulation phenomenon has the potential to provide the kind of simplification at both the psychological and neurobiological levels of analysis that may make it possible to work from the psychological level of analysis down to the neurobiological mechanisms that mediate elementary computation.
Measuring the subjective magnitude of the reward

One of the early findings from psychophysical experiments on brain stimulation reward was the surprisingly simple form of the trade-off between the stimulating current and the pulse frequency required to produce a just acceptable reward from a train of fixed duration (Gallistel, Shizgal, & Yeomans, 1981). For many electrode placements, over most of the usable range of stimulating currents and pulse frequencies, the required pulse frequency varies as the reciprocal of the current. Thus, doubling the stimulating current halves the required pulse frequency. It is reasonably certain that the number of axonal firings produced by a train of very brief cathodal pulses is directly proportional to the pulse frequency. Thus, the finding that the required pulse frequency varies as the reciprocal of the current suggests two conclusions (Gallistel et al., 1981). (1) The number of reward-relevant axons fired by a stimulating pulse is directly proportional to the stimulating current; that is, doubling the current doubles the number of relevant axons fired by each pulse in the train. Thus, either doubling current or doubling pulse frequency doubles the strength of the reward signal, where strength is defined as the number of action potentials per unit time. (2) The subjective magnitude of the reward produced by the resulting neural signal is determined simply by the strength of that signal. Firing 1000 axons 10 times to produce 10,000 action potentials has the same rewarding effect as firing 500 axons 20 times. How does the subjective magnitude of the reward grow as we increase the strength of the neural signal? To determine this, we had to find a method for measuring the subjective magnitude of the reward. Gallistel and Leon (1991) reasoned that matching behavior could be used for this purpose, albeit in a somewhat indirect way.
From a normative or "rational-decision-maker" standpoint, the subjective magnitude of the rewards the rat receives when it holds down a lever ought to combine multiplicatively with the subjective rate of rewards generated by holding down that lever to determine the subjective return from that
Foraging for brain stimulation
55
lever (return = magnitude x rate). A rational decision-making process that obeyed the matching law (allocating equal amounts of time to choices that yield equal returns) would divide its time equally between a lever yielding rewards of magnitude 10 every 8 s and a lever yielding rewards of magnitude 20 every 16 s (rewards twice as big, delivered half as often). Gallistel and Leon (1991) had rats work for brain stimulation reward on concurrent variable interval schedules of reward. A variable interval schedule makes the next reward on a lever available after an interval determined by an approximation to a Poisson process. Once every second, the scheduling algorithm in effect flips a biased coin. If the coin comes up heads, it makes a reward available on the lever. If the rat has the lever down, it gets the reward then and there; otherwise it gets it when it next presses the lever. The bias on the coin determines the average interval between the delivery of a reward and the availability of the next reward. The lower the probability of heads, the longer the average interreward interval on the lever. The expected interval to the next reward is 1/p s, where p is the probability of heads in a flip made every 1 s. In the Gallistel and Leon (1991) experiment, there were two levers, each in its own alcove. When the rat held down one lever, it received rewards at a rate determined by one variable interval schedule, say, one in which the expected interval between rewards was 16 s (a VI 16 s schedule). When the rat held down the lever in the other alcove, it received rewards determined by a different schedule, say, one in which the expected interval was 4 s (VI 4 s). In this situation, rats move back and forth between the levers with sufficient frequency that the average rate of reward from each lever is approximately equal to the reciprocal of the expected interval in the schedule for that lever.
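The scheduling algorithm just described can be sketched as follows (the helper function is mine; the expected-interval arithmetic is the 1/p relation from the text):

```python
import random

def vi_rewards_armed(p_heads, seconds, seed=0):
    """Approximate VI schedule: once per second, flip a coin with
    P(heads) = p_heads; each heads arms a reward. The expected
    interreward interval is 1/p_heads seconds."""
    rng = random.Random(seed)
    return sum(rng.random() < p_heads for _ in range(seconds))

# VI 4 s schedule -> p = 1/4; VI 16 s schedule -> p = 1/16.
rate_vi4 = 60 * (1 / 4)    # ~15 rewards per minute with the lever held down
rate_vi16 = 60 * (1 / 16)  # ~3.75 rewards per minute

# Over a long session the armed-reward count approaches p * seconds:
n = vi_rewards_armed(1 / 4, 100_000)
print(n)  # close to 25_000 (p * seconds)
```

At equipreference with these schedules, a reward four times as large on the VI 16 s lever offsets its four-fold lower rate of delivery (10 x 15 = 40 x 3.75 per minute of return).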
Thus, the rat gets about 15 rewards per minute from a lever with a VI 4 s schedule and about 3.75 rewards per minute from a lever with a VI 16 s schedule. To make the subjective return from a lever with a VI 16 s schedule equal the subjective return from a lever with the VI 4 s schedule (assuming a rational decision maker), we would have to make the subjective magnitude of the reward on the first lever four times bigger than the subjective magnitude of the reward on the second lever. When the rewards given on the two sides are the same size, the rat spends most of its time on the lever that delivers these rewards four times as often. As we increase the strength of the rewarding signal on the other side, we increase the subjective magnitude of the reward the rat experiences from each stimulating train received from that lever. When the factor by which the subjective reward magnitudes on the two sides differ is the inverse of the factor by which the subjective rates of reward differ, the subjective return on the two sides will be equal. By the matching law, the rat should spend equal amounts of time on the two levers. To determine the function relating the subjective magnitude of the rewarding effect to the strength of the neural signal that produced it, Gallistel and Leon
(1991) varied the factor by which the rates of reward differed and determined the adjustments in pulse frequency or current required to offset this difference, that is, to make the rat divide its time 50:50 between the two levers (equipreference). Fig. 1 uses double logarithmic coordinates to display the experimentally determined trade-off between the difference in rate of reward (on the y axis) and either pulse frequency or current at equipreference (on the x axis). Increasing either pulse frequency or current by a given factor (i.e., by a given interval on a log scale) has the same effect on the subjective magnitude of the reward (Fig. 1). Thus, Gallistel and Leon's (1991) results strengthen the conclusions drawn from the trade-off between current and pulse frequency at the threshold for reward acceptability. The conclusions are, first, that the number of action potentials per second in the rewarding signal is directly proportional to both pulse frequency and current:
n_a = kIf, where n_a is the number of action potentials per second, I is the current, f is the pulse frequency, and k is a constant. Thus, the x axis in Fig. 1 can be read as the number of action potentials in the neural signal that produces the rewarding effect (times an unknown scaling factor k). Second, the subjective magnitude of the reward (M) is determined by the number of action potentials per second in a signal of fixed duration, M = f(n_a). Given two measurement assumptions, the y axis in Fig. 1 can be equated with the subjective magnitude of the reward (on a log scale). (1) The subjective rate of reward (r) is proportional to the objective rate of reward. (2) Subjective rate of reward combines multiplicatively with subjective reward magnitude to determine the subjective return from a lever, or the V (for value) of the lever: V = Mr. The V values for the available options are the quantities the decision process uses in computing the ratios of the returns from its most recent investments in those options. The ratio of these values determines the ratio of times allocated to the options. Gallistel and Leon (1991) did not test their measurement assumptions, but Mark and Gallistel (1993) did test them, in the course of an experiment that used matching behavior to determine the function relating the subjective magnitude of the reward to the duration of the neural signal that produced it. Like Gallistel and Leon (1991), Mark and Gallistel (1993) varied the ratio of the interreward intervals in the two concurrent schedules of reward. At each ratio of rates of reward, they varied the duration of the stimulating train delivered by one lever to find the factor by which the train durations had to differ in order to offset the difference in the rates of reward (the equipreference method). However, they
Fig. 1. The trade-off between the relative rates of reward on concurrent variable interval schedules of reward (left ordinate) and the pulse frequency or current to which one reward had to be adjusted to induce the rat to spend equal amounts of time on the two levers (bottom and top abscissas). The pulse frequency and current of the other reward were fixed at 126 pps and 400 µA. The train duration for both rewards was 0.5 s. The scales are logarithmic, so equal intervals on the current (top) and pulse frequency (bottom) axes correspond to changes by equal multiplicative factors. Under the measurement assumptions specified in the text, this graph may also be read as a graph of the subjective magnitude of the reward as a function of the number of action potentials per second in the 0.5 s rewarding signal. The logarithmic scale of subjective reward magnitude (right ordinate) was generated by setting the magnitude of the smallest measured reward equal to 1. Data replotted from Gallistel and Leon (1991).
also analyzed the data from the many trials on which the rats did not allocate equal amounts of time to the two levers (because the two levers did not yield equal returns). By the matching law, the ratio of the amounts of time allocated to the two levers on a trial is proportional to the ratio of the returns from those levers. When the schedules for the two levers deliver rewards at equal rates, a 4:1 ratio of time allocation in favor of one lever implies that the subjective reward from that lever is four times as big as the subjective reward from the other lever. Mark and Gallistel term this the direct method, because subjective reward magnitude is read directly from the time allocation ratio in a matching task. (The fixed magnitude of the reward on the competing lever establishes the unit of measurement.) In the direct method, the relative rate of reward need not be varied. A complete determination of the relation between a stimulation parameter, such as train duration, and the subjective magnitude of the reward produced can be made at a single setting of the schedules of reward on the two levers. Thus, measurements made by the direct method do not depend on the assumptions about how
the rat's subjective estimate of the rate of reward depends on the objective rate. On the other hand, they do depend on the assumption that the ratio of the times the rat allocates to the two levers matches the ratio of the subjective returns. The equipreference method does not depend on this latter assumption, only on the weaker assumption that when the returns are equal the rat has no preference between the levers. In comparing the measurements made by the equipreference and direct methods, we test the validity of these different measurement assumptions. If either assumption is wrong, the measurements will not agree. However, as shown in Fig. 2, they do agree. The reward magnitude versus train duration function obtained by the direct method is approximately superimposable on the function obtained by the equipreference method. The comparison in Fig. 2 tests the validity of the assumption that the subjective rate of reward is proportional to the objective rate of reward, one of the two assumptions on which the validity of the equipreference measurements rested. The second assumption was that the decision process underlying matching behavior multiplied the subjective magnitude of reward by the subjective rate of reward to determine the subjective return from a lever. Mark and Gallistel (1993) tested the validity of this assumption by comparing the measurements of subjective reward magnitude made at different relative rates of reward. If rate of
Fig. 2. The measurements of the subjective reward magnitude at different train durations made by the equipreference method are compared to the measurements made by the direct method (with the rates of reward the same on both levers). The curve was computed by a smoothing routine from the complete data set shown in Fig. 3. The approximate agreement between the two sets of measurements validates the assumption that the subjective rate of reward is proportional to the objective rate. This is a key assumption in the equipreference method, but this assumption is unnecessary in the direct method, because, in the direct method, the relative rate of reward is held constant. In the data shown here, the relative rate of reward was 1:1. Data from Mark and Gallistel (1993).
reward combines multiplicatively with subjective magnitude to determine return, then the relative rate of reward should act as a simple scaling factor when subjective reward magnitude is measured by the direct method. Suppose that when the rates of reward on the two levers are equal, the rat has a 4:1 preference for the reward generated by a 1 s train over the reward generated by a 0.5 s train. The direct method takes this to mean that the subjective reward from the 1 s train is four times bigger than the subjective reward from the 0.5 s train. Suppose we repeat the measurements with the schedules of reward adjusted so that the 0.5 s reward comes twice as often as the 1 s reward. The rat's preference (time allocation ratio) for the 1 s lever should be reduced by a factor of two. It should allocate only twice as much time to the 1 s lever. Suppose we repeat the measurements with the schedules adjusted so that the 0.5 s reward comes only half as often as the 1 s reward. The rat should now allocate eight times as much time to the 1 s lever (a factor of two increase in its preference). And so on. At a given setting of the relative rates of reward, Mark and Gallistel (1993) determined the rat's time allocation ratio as a function of the train duration on one lever, keeping the reward on the other lever constant. The time allocation ratios are the direct measures of the subjective magnitude of the variable reward. A set of these time allocation ratios, one for each duration of the variable reward, gives the function relating subjective reward magnitude to the duration of the train of stimulation. If the relative rate of reward acts as a scaling factor, then the functions we obtained at different relative rates of reward should be superimposable, provided we correct for the difference in the scale of measurement. 
To correct for differences in the scale of measurement, we multiplied the time allocation ratios from a given session by the inverse of the rate ratio in effect for that session. Fig. 3 shows that, after rescaling, the measurements made at different relative rates of reward superimpose. This validates the assumption that subjective rate of reward combines multiplicatively with subjective reward magnitude to determine the subjective return from a lever. The greatest value of these measurement experiments lies in what they reveal about the decision process, the computational process that uses these psychological variables (subjective magnitude and subjective rate) to determine how the animal will behave. In the course of validating the measurements, we have established that the decision process multiplies subjective reward magnitude by subjective rate of reward to determine the subjective value of the lever. Thus, we have isolated a simple computational process, where we can control the values of the variables that enter into the computation by direct electrical stimulation of a pathway in the central nervous system. Moreover, we have determined how those values depend on the strength and the duration of the barrage of action potentials produced by the stimulation. The subjective magnitude of the reward is a steep sigmoidal function of the strength of the neural signal (Gallistel & Leon, 1991;
[Fig. 3: plot of the time allocation ratio versus train duration (s), with separate symbols for relative rates of reward of 1:8, 1:4, 1:2, 1:1, and 2:1, and for the equipreference (eqp) measurements.]
Fig. 3.
Comparison of direct determinations of reward magnitude versus train duration function made at different relative rates of reward. The key gives the rate of delivery of the reward whose magnitude was measured, relative to the rate of delivery of the reward whose magnitude was held constant. A ratio of, for example, 1:8 means that the constant reward was delivered eight times as often as the reward whose magnitude was measured. The measurements of reward magnitude (the observed time allocation ratios) made at different relative rates of reward were multiplied by the inverse of these rate ratios. The fact that the rescaled sets of measurements superimpose implies that subjective rate of reward combines multiplicatively with subjective reward magnitude to determine subjective return (or value). Data from Mark and Gallistel (1993).
Leon & Gallistel, 1992; Simmons & Gallistel, in press) and a somewhat less steep sigmoidal function of its duration (Mark & Gallistel, 1993).
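The rescaling check described above can be sketched in a few lines of Python. The magnitude ratios and function below are hypothetical illustrations, not values from Mark and Gallistel (1993); the point is only that if subjective return is the product of subjective rate and subjective magnitude, then multiplying each measured time allocation ratio by the inverse of its rate ratio recovers the same magnitude function at every relative rate of reward:

```python
# Sketch of the rescaling check, with made-up numbers. Under the
# multiplicative-combination hypothesis, time allocation matches
# relative subjective return = (rate ratio) x (magnitude ratio).

def time_allocation_ratio(magnitude_ratio, rate_ratio):
    """Predicted T_variable / T_fixed under multiplicative combination."""
    return magnitude_ratio * rate_ratio

# Assumed subjective magnitude ratios (variable reward / fixed reward)
# at three train durations -- illustrative values only.
magnitude_ratios = {0.25: 0.5, 0.5: 1.0, 1.0: 4.0}

for rate_ratio in (1 / 8, 1 / 2, 1.0, 2.0):
    rescaled = {
        dur: time_allocation_ratio(m, rate_ratio) * (1 / rate_ratio)
        for dur, m in magnitude_ratios.items()
    }
    # After multiplying by the inverse rate ratio, every data set
    # recovers the same underlying magnitude function.
    assert rescaled == magnitude_ratios
```

Deviations from superposition after this rescaling would falsify the multiplicative-combination assumption; superposition, as in Fig. 3, supports it.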
A model of the decision process in matching

It has commonly been assumed that the subjective estimates of rate of reward that underlie matching behavior were based on a reward-averaging process of some kind (Killeen, 1981; Lea & Dow, 1984; Staddon, 1988; Vaughan, 1981). Recently, however, Gibbon et al. (1988) showed that matching behavior could result from rate estimates based on just two interreward intervals, one for each lever, sampled from the population of remembered interreward intervals on a lever. In the Gibbon et al. (1988) analysis, matching behavior was another example of behavior generated by a decision process that used remembered temporal intervals. However, Gibbon et al. (1988) did not specify the size of the populations of remembered interreward intervals from which the samples were drawn. Thus, neither reward-averaging models nor the timing model proposed by Gibbon et al. (1988) specified the interval of past history on which the subjective estimate of rate of reward was based. All of these models, however, implicitly or explicitly assume that the animal's current time allocation ratios are based on a lengthy sample of previous returns. Thus, they all predict that when the relative
rate of return changes, it should take the animal a long time to adjust its time allocation ratios to the new rates of return. Its time allocation ratios will change only when the averages over a large number of previous rewards have changed or only when populations that include a large number of previous interreward intervals have changed. They all predict that adjustments in time allocation ratios following a step change in the relative rate of reward should be sluggish. Dreyfus (1991), working with pigeons responding for food reward, obtained results strikingly at variance with this prediction. He ran sessions in which the relative rates of reward reversed in the middle of each session. He found that the pigeons reversed their time allocation ratio within the span of about one expected interreward interval on the leaner schedule. We obtained similar results in a similar experiment with rats responding for brain stimulation reward (Mark & Gallistel, 1994). These rapid shifts in time allocation ratio in response to changes in the relative rates of reward imply that, at least under some conditions, the animal's estimate of the rate of reward is based only on a small sample of the more recent rates of reward on the two levers. The minimum sample on which estimates of the latest relative rates of reward could in principle be based is the most recent interreward interval on each lever. The reciprocal of the interval between the last two rewards on a lever gives an unbiased, but very noisy, estimate of the current rate of reward on that lever. In both the Dreyfus (1991) experiment and our experiment, the reversal in relative rate of reward at mid-session was predictable. It happened at the same time in every session, and, in our experiment, its occurrence was signaled by the withdrawal and reappearance of the levers. Thus, it might be thought that the rapid adjustment to the reversal reflected higher-order learning.
However, we showed that the rats in our experiment adjusted to the totally unpredictable random changes in the apparent relative rates of reward due to the noise inherent in Poisson scheduling processes (Mark & Gallistel, 1994). We used sampling windows twice as long as the expected interreward interval on the leaner schedule. Within each successive window, we tabulated the time allocation ratio and the ratio of the numbers of rewards received. We plotted the logarithms of both ratios on the same graph (Fig. 4). Because we used such a narrow sampling window, the numbers of rewards received within a window were small. The expected number on the leaner lever was only two. Due to the random variations inherent in Poisson schedules, there were many windows in which this number was in fact zero, in which case the ratio of the numbers of rewards received was undefined. This accounts for the gaps in the solid lines in Fig. 4. A gap occurs wherever the reward ratio in a window was undefined. Ratios based on numbers derived from a small sampling of two Poisson scheduling processes show large variability - see solid lines in Fig. 4. This variability is random. The surprise is that the rat's window-to-window time allocation ratios show similar variability and that the variability in the time
[Fig. 4: four panels (rat DB13) plotting log ratios against time (min); panel conditions include VI 4 s vs. VI 16 s (about 15 rewards/min) and VI 16 s vs. VI 64 s (about 4.4 rewards/min).]
Fig. 4. Plots of log reward ratios (R1/R2) and log time allocation ratios (T1/T2) in successive windows equal to two expected interreward intervals on the leaner schedule, over sessions comprised of two trials, with a 16-fold reversal in the programmed relative rate of reward between the trials. Successive windows overlap by half a window. A gap in the solid line means that the ratio was undefined within that window because no reward occurred on one or both sides. The programmed reward ratios of 4:1 and 1:4 are indicated by the horizontal lines at +0.6 and -0.6, respectively. The lighter horizontal lines indicate the actually experienced reward ratio as calculated by aggregating over the trial. The actual numbers of rewards obtained, which yield these ratios, are given beside these lighter lines. The actually experienced combined rate of reward across the two trials in a given condition (combined reward density) is given at the lower right of each panel. (A) Programmed reward density = 19.2 rewards/min. (B) Programmed reward density = 4.8 rewards/min. (C) Programmed reward density = 2.4 rewards/min. (The density actually experienced, which is slightly greater than the programmed density, reflects the variability inherent in Poisson schedules.) (D) Programmed reward density = 1.2 rewards/min. Note that the time allocation ratio tracks wide random fluctuations in the reward ratio, regardless of the overall reward density. Reproduced from Mark and Gallistel (1994) by permission of the publisher.
allocation ratios tracks the variability in the reward ratios. The time allocation ratio tracks the random noise inherent in estimates of relative rates of reward based on very narrow samples from two Poisson scheduling processes. The tracking of the noise, which is evident in Fig. 4, was confirmed by quantitative analyses, which included cross-correlation functions and scatter plots of the time allocation ratio within a window as a function of the reward ratio within that same window. Moreover, as can be seen in Fig. 4, the tracking of the noise remains the same over large changes in the overall rate of reward, that is, over large changes in the time scale of reward delivery. Experimental sessions lasted for 60 expected rewards on the leaner schedule. When the leaner schedule was VI 16 s (upper left panel in Fig. 4), the rat got about 15 rewards per minute from the two schedules combined, and a session lasted only 15 min. When the leaner schedule was VI 256 s (lower right panel), the rat got about one reward per minute, and sessions lasted more than 4 h. If the rate-estimating process used averaging windows whose width was fixed and independent of the time scale of the reward schedules (cf. Lea & Dow, 1984), these changes in the overall rates of reward would change the number of rewards over which the rate estimation process averaged and, hence, the extent to which the time allocation ratio tracked the noise in the reward ratio. However, the tracking of the noise is the same regardless of the overall rate of reward. This implies that the statistical properties of the rate estimation process (e.g., the variability in the estimates) are scale invariant, a general feature of behavior on timing tasks (Gibbon, 1992). 
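The windowing procedure can be illustrated with a small simulation. Everything here is a sketch under assumed parameters (a VI 4 s and a VI 16 s schedule approximated as Poisson processes; `poisson_reward_times` is our own helper, not code from the study):

```python
import math
import random

random.seed(1)  # illustrative run; any seed shows the same qualitative picture

def poisson_reward_times(rate, t_end):
    """Reward times from a Poisson scheduling process (exponential gaps)."""
    t, times = 0.0, []
    while True:
        t += random.expovariate(rate)
        if t > t_end:
            return times
        times.append(t)

# Assumed schedules: richer lever ~ VI 4 s, leaner lever ~ VI 16 s.
rich_rate, lean_rate = 1 / 4, 1 / 16
t_end = 600.0  # a 10 min session
rich = poisson_reward_times(rich_rate, t_end)
lean = poisson_reward_times(lean_rate, t_end)

# Window width = two expected interreward intervals on the leaner schedule.
window = 2 / lean_rate  # 32 s; expected counts per window: rich ~8, lean ~2
log_ratios = []
for start in range(0, int(t_end), int(window)):
    n_rich = sum(start <= t < start + window for t in rich)
    n_lean = sum(start <= t < start + window for t in lean)
    if n_rich == 0 or n_lean == 0:
        log_ratios.append(None)  # undefined ratio -> a gap, as in Fig. 4
    else:
        log_ratios.append(math.log10(n_rich / n_lean))

print(sum(r is None for r in log_ratios), "undefined windows out of", len(log_ratios))
```

The study's windows overlapped by half a window; non-overlapping windows are used here for brevity. The small counts per window make the log reward ratio fluctuate widely from window to window, and this is the noise that the rats' time allocation ratios were found to track.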
Thus, under conditions where changes in the relative rate of reward occur frequently, the decision process that mediates matching behavior relies on rate estimates derived from a very narrow sample of the most recent rates of return experienced in the various foraging "patches" (levers or keys). And, the rate estimation process is time scale invariant. These findings led us to develop a timing model for the decision process in matching (Mark & Gallistel, 1994). The current rate estimate for a lever is assumed to be the reciprocal of the interval between the last two rewards received. This is the maximally localized estimate of the current rate. Also, estimating rate by taking the reciprocal of the interreward interval makes the statistical properties of the rate estimate time scale invariant. The current return from that lever is this rate estimate times the subjective magnitude of the most recent reward. The expected duration of the animal's stay "in a patch" (on a lever) is assumed to be determined by a Poisson patch-leaving parameter, λp, where the subscript designates the patch. The reciprocal of this parameter, 1/λp, is the expected duration of the stay. The tendency to leave a patch (λp) is inversely proportional to the current return from that patch. The greater the most recent return, the less the tendency to leave a patch. The expected durations of the stays in each of the available patches sum to a constant, c, which is the duration of a visit cycle consisting of one visit to each patch. The duration of a visit cycle is inversely
proportional to the overall rate of reward. That is, the higher the overall rate of reward, the more rapidly the animal cycles through the patches. This simple model of the decision process in matching predicts the rapid kinetics observed by Mark and Gallistel (1994) and Dreyfus (1991). Surprisingly, it also predicts a quite unrelated and rather counterintuitive result reported by Belke (1992). He trained pigeons to respond for food on concurrent variable interval schedules of reward, using two pairs of keys: a white-red pair and a green-yellow pair. When the white-red pair was illuminated, there was a VI 40 s schedule for the red key and a VI 20 s schedule for the white key. When the green-yellow pair was illuminated, there was a VI 40 s schedule on the green key and a VI 80 s schedule on the yellow key. In accord with the matching law, the pigeons had about a 2:1 preference for the white key over the red and a similar preference for the green key over the yellow. Belke then determined the pigeons' preference in short (unrewarded) tests that pitted the red and green keys against each other. The surprising result was that the pigeons had a 4:1 preference for the green key. When choosing between the red and green keys, they showed a twofold stronger preference for the green key than the preference they showed for that key when it was paired with the yellow key. This is surprising because the red and green keys were both associated with VI 40 s schedules, while the yellow key was associated with a schedule only half as rich. Our model of the decision process predicts the observed 4:1 preference for the green key, on the assumption that the patch-leaving tendencies for the two keys (λr and λg) had the same values during the short unrewarded tests as they did during the training portions of the experiment. We are now testing the implications of our model for the microstructure of matching behavior.
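The patch-leaving model just described can be sketched as follows. All numbers are illustrative and `expected_stays` is our own naming; the model's content is only that the expected stay 1/λp in each patch is proportional to the current return (the last reward's subjective magnitude times the reciprocal of the last interreward interval), with the stays summing to a fixed visit-cycle duration c:

```python
# Minimal sketch of the timing model of matching (after Mark &
# Gallistel, 1994), with illustrative values.

def expected_stays(returns, cycle_duration):
    """Expected stay durations (1/lambda_p), one per patch, summing to c."""
    total = sum(returns.values())
    return {p: cycle_duration * r / total for p, r in returns.items()}

# Current returns: subjective magnitude * (1 / last interreward interval).
returns = {
    "lever_1": 4.0 * (1 / 10.0),  # magnitude 4, last interreward interval 10 s
    "lever_2": 1.0 * (1 / 20.0),  # magnitude 1, last interreward interval 20 s
}
stays = expected_stays(returns, cycle_duration=30.0)

# Time allocation matches relative return: (4/10)/(1/20) gives 8:1.
assert abs(stays["lever_1"] / stays["lever_2"] - 8.0) < 1e-9
```

Because the stays are proportional to the returns, the ratio of expected stay durations reproduces matching, and because each return is recomputed from the single most recent interreward interval, the model also reproduces the rapid, noise-tracking kinetics of Fig. 4.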
We are also testing the possibility that the use the decision process makes of local information about rate of return is determined by the animal's global experience. There is reason to think that if the rates of return remain constant for many sessions, the animal may use a different decision process to determine the times it allocates to each patch, one that does not rely on maximally localized estimates of the rates of reward (Keller & Gollub, 1977; Mazur, 1992; Myerson & Miezin, 1980). However, for present purposes, the important point is that we have developed and are testing a fully specified computational model of the decision process that operates when there is a history of frequent changes in the rate of reward.
Conclusion

None of the models of psychological function propounded within contemporary cognitive psychology - including those models that take their inspiration from assumptions about how the nervous system works - is easily realized by neurophysiological processes that are faithful to well-established principles of neurobiology. This is not surprising, because the well-established principles of neurobiology do not say anything about how nervous tissue can store and retrieve the values of variables, nor how it can operate with those variables in accord with the elementary operations of arithmetic and logic, the operations that are the foundation of computation. Insofar as the behavioral evidence demands a computational account of psychological processes, there are clearly important neurophysiological mechanisms that remain to be discovered. What cognitive psychology has to offer neurobiology is a new conceptual scheme - a scheme rooted in computation rather than in reflexes. We should take the computational assumptions implicit in cognitive psychology seriously and use them to ask questions about the nervous system. The obvious question is, what are the elements of neurobiological computation - the "reduced instruction set" out of which all computations are compounded? How are these elementary operations implemented by cellular mechanisms and by basic neural circuits? Some progress along these lines has been made in early stages of sensory processing (e.g., Landy & Movshon, 1991). However, early sensory processing, or at least those aspects currently being modeled at the neural level, does not involve an operation crucial to many later, higher-level computations, namely, the storage and retrieval of the values of variables. We need to develop preparations that do involve this crucial element of post-sensory computation. Models of behavior based on remembered temporal intervals and other simple psychological quantities such as reward magnitude seem suited to the purpose. These are simple computational models, yet they involve the basic arithmetic operations that are the foundation of all computation. They have been exceptionally well specified and tested.
This testing has yielded some surprising findings that may offer lines of attack on the underlying cellular and molecular mechanisms. For example, when the mechanism that measures the subjective duration of an interval writes the measured interval into memory, it miswrites the value by an animal-specific (not task-specific) scalar factor (Church, 1984). As a result, the animal systematically misremembers intervals by some percentage. Depending on the animal, it may remember intervals as 10% shorter or 10% longer than they were. This scalar error in the recording of the values of intervals makes no functional or psychological sense. It is presumably a hardware error analogous to the hardware error that makes the period of an animal's internal clock deviate from 24 h by an animal-specific factor. The discovery of mutants whose circadian clock has a dramatically enhanced period error (Konopka, 1987; Ralph, Foster, Davis, & Menaker, 1990; Ralph & Menaker, 1988) has provided a line of attack on the molecular and cellular mechanisms of internal clocks. Similarly, one may hope to find mutants with a dramatically enhanced scalar write-to-memory error, which would provide an avenue of attack on the molecular basis of the write-to-memory operation.
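The scalar character of the miswrite can be made concrete with a tiny sketch (k = 1.1 is a made-up factor; real values are animal-specific):

```python
# Sketch of a scalar write-to-memory error (after Church, 1984).
# The remembered interval is the measured interval times an
# animal-specific constant k, so the percentage error is the same
# at every time scale. k = 1.1 is a hypothetical example (+10%).

K = 1.1

def remember(measured_interval_s, k=K):
    """Value written to memory for a measured interval (in seconds)."""
    return k * measured_interval_s

for interval in (2.0, 20.0, 200.0):
    # The relative error is constant regardless of the interval's scale.
    assert abs(remember(interval) / interval - K) < 1e-12
```

An additive error, by contrast, would loom large for short intervals and vanish for long ones; the constant percentage error is what marks the miswrite as scalar.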
Research in my own laboratory uses subjects in which the reward is generated by direct electrical stimulation of the central nervous system. This opens up another line of attack on the neural mechanisms that mediate the computations in timing tasks. It gives us direct control over the neural signal that plays two crucial roles. First, this neural signal specifies one of the psychological values that is used in the decision process that mediates matching behavior - the subjective magnitude of the reward. Second, this same neural signal specifies the beginning and end of interreward intervals. The specification of the subjective duration of those intervals requires a clock signal as well as a signal that marks off the interval. There is reason to believe, however, that the clock signal originates within the nervous system itself (Gallistel, 1990). No further stimulus is required to generate the time signals that enable the interval-measuring mechanism to measure an interval. Thus, the neural signal produced by direct electrical stimulation of the medial forebrain bundle provides the stimulus input necessary to define the two quantities that the decision process in matching uses to determine how the animal will allocate its time among patches - the subjective reward magnitude and the interreward interval. Experiments in many laboratories have used self-stimulation behavior to define quantitative and pharmacological characteristics of the neural pathway that carries the rewarding signal (Shizgal & Murray, 1989; Stellar & Rice, 1989; Wise & Rompre, 1989; Yeomans, 1988). More recently, in experiments reviewed here, we have been able to measure the subjective value of the reward and plot the functions relating this value to the strength and duration of the neural signal generated by the stimulation.
We have also been able to use results from matching experiments with self-stimulating rats to develop and test models of the computations used by the decision process in matching behavior. Our model of the decision process in matching is closely related to models for the decision processes in other timing tasks. All of them assume that the decisions that determine quantitative properties of the observed behavior are based on elementary computations involving the addition, subtraction, multiplication, division and ordering of scalar quantities. These scalar quantities represent simple aspects of the animal's experience - the durations of intervals and the magnitudes of rewards. These models are cognitive models, but they are simpler and more completely specified than most such models. Thus, they give us a good starting point for a program of research aimed at establishing the neurobiological foundations of the computational capability that makes cognition possible.
References

Belke, T.W. (1992). Stimulus preference and the transitivity of preference. Animal Learning and Behavior, 20, 401-406.
Church, R.M. (1984). Properties of the internal clock. In J. Gibbon & L. Allan (Eds.), Timing and time perception (pp. 567-582). New York: New York Academy of Sciences. Davison, M., & McCarthy, D. (1988). The matching law: A research review. Hillsdale, NJ: Erlbaum. Dreyfus, L.R. (1991). Local shifts in relative reinforcement rate and time allocation on concurrent schedules. Journal of Experimental Psychology: Animal Behavior Processes, 17, 486-502. Gallistel, C.R. (1990). The organization of learning. Cambridge, MA: Bradford Books/MIT Press. Gallistel, C.R. (in press). Space and time. In N.J. Mackintosh (Ed.), Handbook of perception and cognition. Vol. 9: Animal learning and cognition. New York: Academic Press. Gallistel, C.R., & Leon, M. (1991). Measuring the subjective magnitude of brain stimulation reward by titration with rate of reward. Behavioral Neuroscience, 105, 913-924. Gallistel, C.R., Shizgal, P., & Yeomans, J. (1981). A portrait of the substrate for self-stimulation. Psychological Review, 88, 228-273. Gibbon, J. (1992). Ubiquity of scalar timing with a Poisson clock. Journal of Mathematical Psychology, 36, 283-293. Gibbon, J., Church, R.M., Fairhurst, S., & Kacelnik, A. (1988). Scalar expectancy theory and choice between delayed rewards. Psychological Review, 95, 102-114. Gibbon, J., Church, R.M., & Meck, W.H. (1984). Scalar timing in memory. In J. Gibbon & L. Allan (Eds.), Timing and time perception (pp. 52-77). New York: New York Academy of Sciences. Herrnstein, R.J. (1961). Relative and absolute strength of response as a function of frequency of reinforcement. Journal of the Experimental Analysis of Behavior, 4, 267-272. Herrnstein, R.J. (1991). Experiments on stable sub-optimality in individual behavior. American Economic Review, 81, 360-364. Keller, J.V., & Gollub, L.R. (1977). Duration and rate of reinforcement as determinants of concurrent responding. Journal of the Experimental Analysis of Behavior, 28, 145-153. Killeen, P.R. (1981).
Averaging theory. In C.M. Bradshaw, E. Szabadi, & C.F. Lowe (Eds.), Quantification of steady-state operant behavior. Amsterdam: Elsevier/North-Holland. Konopka, R.J. (1987). Genetics of biological rhythms in Drosophila. Annual Review of Genetics, 21, 227-236. Landy, M.S., & Movshon, J.A. (1991). Computational models of visual processing. Cambridge, MA: Bradford Books/MIT Press. Lea, S.E.G., & Dow, S.M. (1984). The integration of reinforcements over time. In J. Gibbon & L. Allan (Eds.), Timing and time perception (pp. 269-277). New York: New York Academy of Sciences. Leon, M., & Gallistel, C.R. (1992). The function relating the subjective magnitude of brain stimulation reward to stimulation strength varies with site of stimulation. Behavioural Brain Research, 52, 183-193. Mark, T.A., & Gallistel, C.R. (1993). Subjective reward magnitude of MFB stimulation as a function of train duration and pulse frequency. Behavioral Neuroscience, 107, 389-401. Mark, T.A., & Gallistel, C.R. (1994). The kinetics of matching. Journal of Experimental Psychology: Animal Behavior Processes, 20, 79-95. Mazur, J.E. (1992). Choice behavior in transition: Development of preference with ratio and interval schedules. Journal of Experimental Psychology: Animal Behavior Processes, 18, 364-378. Myerson, J., & Miezin, F.M. (1980). The kinetics of choice: An operant systems analysis. Psychological Review, 87, 160-174. Ralph, M.R., Foster, R.G., Davis, F.C., & Menaker, M. (1990). Transplanted suprachiasmatic nucleus determines circadian period. Science, 247, 975-977. Ralph, M.R., & Menaker, M. (1988). A mutation of the circadian system in golden hamsters. Science, 241, 1225-1227. Shizgal, P., & Murray, B. (1989). Neuronal basis of intracranial self-stimulation. In J.M. Liebman & S.J. Cooper (Eds.), The neuropharmacological basis of reward (pp. 106-163). Oxford: Clarendon Press. Simmons, J., & Gallistel, C.R. (in press).
The saturation of subjective reward magnitude as a function of current and pulse frequency. Behavioral Neuroscience. Staddon, J.E.R. (1988). Quasi-dynamic choice models: Melioration and ratio invariance. Journal of the Experimental Analysis of Behavior, 49, 303-320.
Stellar, J.R., & Rice, M.B. (1989). Pharmacological basis of intracranial self-stimulation reward. In J.M. Liebman & S.J. Cooper (Eds.), The neuropharmacological basis of reward (pp. 14-65). Oxford: Clarendon Press. Vaughan, W.J. (1981). Choice and the Rescorla-Wagner model. In M.L. Commons, R.J. Herrnstein, & H. Rachlin (Eds.), Quantitative analyses of behavior: Matching and maximizing accounts (pp. 263-279). Cambridge, MA: Ballinger. Wise, R.A., & Rompre, P.-P. (1989). Brain dopamine and reward. Annual Review of Psychology, 40, 191-225. Yeomans, J.S. (1988). Mechanisms of brain stimulation reward. In A.E. Epstein & A.R. Morrison (Eds.), Progress in psychobiology and physiological psychology (pp. 227-266). New York: Academic Press.
5

Beyond intuition and instinct blindness: toward an evolutionarily rigorous cognitive science

Leda Cosmides a,*, John Tooby b

a Laboratory of Evolutionary Psychology, Department of Psychology, University of California, Santa Barbara, CA 93106-3210, USA
b Laboratory of Evolutionary Psychology, Department of Anthropology, University of California, Santa Barbara, CA 93106-3210, USA
Abstract

Cognitive psychology has an opportunity to turn itself into a theoretically rigorous discipline in which a powerful set of theories organize observations and suggest focused new hypotheses. This cannot happen, however, as long as intuition and folk psychology continue to set our research agenda. This is because intuition systematically blinds us to the full universe of problems our minds spontaneously solve, restricting our attention instead to a minute class of unrepresentative "high-level" problems. In contrast, evolutionarily rigorous theories of adaptive function are the logical foundation on which to build cognitive theories, because the architecture of the human mind acquired its functional organization through the evolutionary process. Theories of adaptive function specify what problems our cognitive mechanisms were designed by evolution to solve, thereby supplying critical information about what their design features are likely to be. This information can free cognitive scientists from the blinders of intuition and folk psychology, allowing them to construct experiments capable of detecting complex mechanisms they otherwise would not have thought to test for. The choice is not
* Corresponding author. For many illuminating discussions on these topics, we warmly thank Pascal Boyer, David Buss, Martin Daly, Mike Gazzaniga, Gerd Gigerenzer, Steve Pinker, Roger Shepard, Dan Sperber, Don Symons, Margo Wilson, and the members of the Laboratory of Evolutionary Psychology (UCSB). We also thank Don Symons for calling our attention to Gary Larson's "Stoppit" cartoon and, especially, Steve Pinker for his insightful comments on an earlier draft. We are grateful to the McDonnell Foundation and NSF Grant BNS9157-499 to John Tooby for their financial support.
between no-nonsense empiricism and evolutionary theory; it is between folk theory and evolutionary theory.

Nothing in biology makes sense except in the light of evolution.
Theodosius Dobzhansky

Is it not reasonable to anticipate that our understanding of the human mind would be aided greatly by knowing the purpose for which it was designed?
George C. Williams
The cognitive sciences have reached a pivotal point in their development. We now have the opportunity to take our place in the far larger and more exacting scientific landscape that includes the rest of the modern biological sciences. Every day, research of immediate and direct relevance to our own is being generated in evolutionary biology, behavioral ecology, developmental biology, genetics, paleontology, population biology, and neuroscience. In turn, many of these fields are finding it necessary to use concepts and research from the cognitive sciences. But to benefit from knowledge generated in these collateral fields, we will have to learn how to use biological facts and principles in theory formation and experimental design. This means shedding certain concepts and prejudices inherited from parochial parent traditions: the obsessive search for a cognitive architecture that is general purpose and initially content-free; the excessive reliance on results derived from artificial "intellectual" tasks; the idea that the field's scope is limited to the study of "higher" mental processes; and a long list of false dichotomies reflecting premodern biological thought: evolved/learned, evolved/developed, innate/learned, genetic/environmental, biological/social, biological/cultural, emotion/cognition, animal/human. Most importantly, cognitive scientists will have to abandon the functional agnosticism that is endemic to the field (Tooby & Cosmides, 1992). The biological and cognitive sciences dovetail elegantly because in evolved systems - such as the human brain - there is a causal relationship between the adaptive problems a species encountered during its evolution and the design of its phenotypic structures.
Indeed, a theoretical synthesis between the two fields seems inevitable, because evolutionary biologists investigate and inventory the set of adaptive information-processing problems the brain evolved to solve, and cognitive scientists investigate the design of the circuits or mechanisms that evolved to solve them. In fact, the cognitive subfields that already recognize and exploit this relationship between function and structure, such as visual perception, have made the most rapid empirical progress. These areas succeed because they are guided by (1) theories of adaptive function, (2) detailed analyses of the tasks each mechanism was designed by evolution to solve, and (3) the recognition that these tasks are usually solved by cognitive machinery that is highly functionally specialized. We believe the study of central processes can be revitalized by
applying the same adaptationist program. But for this to happen, cognitive scientists will have to replace the intuitive, folk psychological notions that now dominate the field with evolutionarily rigorous theories of function. It is exactly this reluctance to consider function that is the central impediment to the emergence of a biologically sophisticated cognitive science. Surprisingly, a few cognitive scientists have tried to ground their dismissal of functional reasoning in biology itself. The claim that natural selection is too constrained by other factors to organize organisms very functionally has indeed been made by a small number of biologists (e.g., Gould & Lewontin, 1979). However, this argument has been empirically falsified so regularly and comprehensively that it is now taken seriously only by research communities too far outside of evolutionary biology to be acquainted with its primary literature (Clutton-Brock & Harvey, 1979; Daly & Wilson, 1983; Dawkins, 1982, 1986; Krebs & Davies, 1987; Williams, 1966; Williams & Nesse, 1991).1 Other cognitive scientists take a less ideological, more agnostic stance; most never think about function at all. As a result, cognitive psychology has been conducted as if Darwin never lived. Most cognitive scientists proceed without any clear notion of what "function" means for biological structures like the brain, or what the explicit analysis of function could teach them. Indeed, many cognitive scientists think that theories of adaptive function are an explanatory luxury - fanciful, unfalsifiable speculations that one indulges in at the end of a project, after the hard work of experimentation has been done. But theories of adaptive function are not a luxury. They are an indispensable methodological tool, crucial to the future development of cognitive psychology. Atheoretical approaches will not suffice - a random stroll through hypothesis space will not allow you to distinguish figure from ground in a complex system.
To isolate a functionally organized mechanism within a complex system, you need a theory of what function that mechanism was designed to perform. This article is intended as an overview of the role we believe theories of adaptive function should play in cognitive psychology. We will briefly explain why they are important, where exactly they fit into a research program, how they bear Similar results emerge from the cognitive sciences. Although artificial intelligence researchers have been working for decades on computer vision, object recognition, color constancy, speech recognition and comprehension, and many other evolved competences of humans, naturally selected computational systems still far outperform artificial systems on the adaptive problems they evolved to s o l v e - o n those rare occasions when artificial systems can solve the assigned tasks at all. In short, natural selection is known to produce cognitive machinery of an intricate functionality as yet unmatched by the deliberate application of modern engineering. This is a far more definable standard than "optimality" - where many anti-adaptationist arguments go awry. There are an uncountable number of changes that could conceivably be introduced into the design of organisms and, consequently, the state space of potential organic designs is infinitely large and infinitely dimensioned. Thus, there is no way of defining an "optimal" point in it, much less "measuring" how closely evolution brings organisms to it. However, when definable engineering standards of functionality are applied, adaptations can be shown to be very functionally designed - for solving adaptive problems.
72
L. Cosmides, J. Tooby
on cognitive and neural theories, and what orthodoxies they call into question. (For a more complete and detailed argument, see Tooby & Cosmides, 1992.)
I. Function determines structure

Explanation and discovery in the cognitive sciences

. . . trying to understand perception by studying only neurons is like trying to understand bird flight by studying only feathers: it just cannot be done. In order to understand bird flight, we have to understand aerodynamics; only then do the structure of feathers and the different shapes of birds' wings make sense. (Marr, 1982, p. 27)
David Marr developed a general explanatory system for the cognitive sciences that is much cited but rarely applied. His three-level system applies to any device that processes information - a calculator, a cash register, a television, a computer, a brain. It is based on the following observations: (1) information-processing devices are designed to solve problems; (2) they solve problems by virtue of their structure; (3) hence, to explain the structure of a device, you need to know (a) what problem it was designed to solve, and (b) why it was designed to solve that problem and not some other one. In other words, you need to develop a task analysis of the problem, or what Marr called a computational theory (Marr, 1982).

Knowing the physical structure of a cognitive device and the information-processing program realized by that structure is not enough. For human-made artifacts and biological systems, form follows function. The physical structure is there because it embodies a set of programs; the programs are there because they solve a particular problem. A computational theory specifies what that problem is and why there is a device to solve it. It specifies the function of an information-processing device.

Marr felt that the computational theory is the most important and the most neglected level of explanation in the cognitive sciences. This functional level of explanation has not been neglected in the biological sciences, however, because it is essential for understanding how natural selection designs organisms. An organism's phenotypic structure can be thought of as a collection of "design features" - micro-machines, such as the functional components of the eye or liver. Over evolutionary time, new design features are added to or discarded from the species' design because of their consequences.
A design feature will cause its own spread over generations if it has the consequence of solving adaptive problems: cross-generationally recurrent problems whose solution promotes reproduction, such as detecting predators or detoxifying
poisons. Natural selection is a feedback process that "chooses" among alternative designs on the basis of how well they function. By selecting designs on the basis of how well they solve adaptive problems, this process engineers a tight fit between the function of a device and its structure.2 To understand this causal relationship, biologists had to develop a theoretical vocabulary that distinguishes between structure and function. Marr's computational theory is a functional level of explanation that corresponds roughly to what biologists refer to as the "ultimate" or "functional" explanation of a phenotypic structure. A computational theory defines what problem the device solves and why it solves it; theories about programs and their physical substrate specify how the device solves the problem.

"How" questions - questions about programs and hardware - currently dominate the research agenda in the cognitive sciences. Answering such questions is extremely difficult, and most cognitive scientists realize that groping in the dark is not a productive research strategy. Many see
Table 1. Three levels at which any machine carrying out an information-processing task must be understood (from Marr, 1982, p. 25)

1. Computational theory: What is the goal of the computation, why is it appropriate, and what is the logic of the strategy by which it can be carried out?
2. Representation and algorithm: How can this computational theory be implemented? In particular, what is the representation for the input and output, and what is the algorithm for the transformation?
3. Hardware implementation: How can the representation and algorithm be realized physically?

In evolutionary biology, explanations at the level of the computational theory are called ultimate level explanations. Explanations at the level of representation and algorithm, or at the level of hardware implementation, are called proximate levels of explanation.

2 All traits that comprise species-typical designs can be partitioned into adaptations, which are present because they were selected for; by-products, which are present because they are causally coupled to traits that were selected for; and noise, which was injected by the stochastic components of evolution. Like other machines, only narrowly defined aspects of organisms fit together into functional systems: most of the system is incidental to the functional properties. Unfortunately, some have misrepresented the well-supported claim that selection organizes organisms very functionally as the obviously false claim that all traits of organisms are functional - something no sensible evolutionary biologist would ever maintain. Nevertheless, cognitive scientists need to recognize that while not everything in the designs of organisms is the product of selection, all complex functional organization is (Dawkins, 1986; Pinker & Bloom, 1990; Tooby & Cosmides, 1990a, 1990b, 1992; Williams, 1966, 1985).
the need for a reliable source of theoretical guidance. The question is, what form should it take?
Why ask why? - or - how to ask how

It is currently fashionable to think that the findings of neuroscience will eventually place strong constraints on theory formation at the cognitive level. Undoubtedly they will. But extreme partisans of this position believe neural constraints will be sufficient for developing cognitive theories. In this view, once we know enough about the properties of neurons, neurotransmitters and cellular development, figuring out what cognitive programs the human mind contains will become a trivial task.

This cannot be true. Consider the fact that there are birds that migrate by the stars, bats that echolocate, bees that compute the variance of flower patches, spiders that spin webs, humans that speak, ants that farm, lions that hunt in teams, cheetahs that hunt alone, monogamous gibbons, polyandrous seahorses, polygynous gorillas . . . There are millions of animal species on earth, each with a different set of cognitive programs. The same basic neural tissue embodies all of these programs, and it could support many others as well. Facts about the properties of neurons, neurotransmitters, and cellular development cannot tell you which of these millions of programs the human mind contains. Even if all neural activity is the expression of a uniform process at the cellular level, it is the arrangement of neurons - into birdsong templates or web-spinning programs - that matters. The idea that low-level neuroscience will generate a self-sufficient cognitive theory is a physicalist expression of the ethologically naive associationist/empiricist doctrine that all animal brains are essentially the same. In fact, as David Marr put it, a program's structure "depends more upon the computational problems that have to be solved than upon the particular hardware in which their solutions are implemented" (1982, p. 27).
In other words, knowing what and why places strong constraints on theories of how. For this reason, a computational theory of function is not an explanatory luxury. It is an essential tool for discovery in the cognitive and neural sciences. A theory of function may not determine a program's structure uniquely, but it reduces the number of possibilities to an empirically manageable number. Task demands radically constrain the range of possible solutions; consequently, very few cognitive programs are capable of solving any given adaptive problem. By developing a careful task analysis of an information-processing problem, you can vastly simplify the empirical search for the cognitive program that solves it. And once that program has been identified, it becomes straightforward to develop clinical tests that will target its neural basis.
To figure out how the mind works, cognitive scientists will need to know what problems our cognitive and neural mechanisms were designed to solve.
Beyond intuition: how to build a computational theory

To illustrate the notion of a computational theory, Marr asks us to consider the what and why of a cash register at a check-out counter in a grocery store. We know the what of a cash register: it adds numbers. Addition is an operation that maps pairs of numbers onto single numbers, and it has certain abstract properties, such as commutativity and associativity (see Table 2). How the addition is accomplished is quite irrelevant: any set of representations and algorithms that satisfies these abstract constraints will do. The input to the cash register is prices, which are represented by numbers. To compute a final bill, the cash register adds these numbers together. That's the what.

But why was the cash register designed to add the prices of each item? Why not multiply them together, or subtract the price of each item from 100? According to Marr, "the reason is that the rules we intuitively feel to be appropriate for combining the individual prices in fact define the mathematical operation of addition" (p. 22, emphasis added). He formulates these intuitive rules as a series of constraints on how prices should be combined when people exchange money for goods, then shows that these constraints map directly onto those that define addition (see Table 2). On this view, cash registers were designed to add because addition is the mathematical operation that realizes the constraints on buying and selling that our intuitions deem appropriate. Other mathematical operations are inappropriate because they violate these intuitions; for example, if the cash register subtracted each price from 100, the more goods you chose the less you would pay - and whenever you chose more than $100 of goods, the store would pay you.

In this particular example, the buck stopped at intuition. But it shouldn't. Our intuitions are produced by the human brain, an information-processing device that was designed by the evolutionary process.
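Marr's argument can be checked mechanically. The sketch below is ours, not Marr's: it encodes his four constraints as assertions, verifies that addition satisfies them, and confirms that the subtract-from-100 rule produces exactly the absurd outcomes described above. The function names and the fold over a list of prices are illustrative assumptions.

```python
def total(prices, combine, identity):
    """Fold a list of prices into a final bill using a given combination rule."""
    bill = identity
    for price in prices:
        bill = combine(bill, price)
    return bill

def add(a, b):
    return a + b

# The four constraints on combining prices, applied to addition:
assert total([], add, 0) == 0                  # zero: buying nothing costs nothing
assert add(2, 3) == add(3, 2) == 5             # commutativity: order at the till is irrelevant
assert add(add(2, 3), 4) == add(2, add(3, 4))  # associativity: paying in piles is irrelevant
assert add(2, -2) == 0                         # inverses: a refund cancels the purchase

# An alternative rule - subtract each price from a running total that
# starts at 100 - violates the intuitive constraints on exchange:
def sub(a, b):
    return a - b

assert total([30, 40], sub, 100) == 30  # the more goods you choose, the less you pay
assert total([60, 70], sub, 100) < 0    # over $100 of goods, the store pays you
```

The block raises no assertion errors, which is Marr's point restated: among simple combination rules, addition is the one that respects the intuitive constraints on buying and selling.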
To discover the structure of the brain, you need to know what problems it was designed to solve and why it was designed to solve those problems rather than some other ones. In other words, you need to ask the same questions of the brain as you would of the cash register. Cognitive science is the study of the design of minds, regardless of their origin. Cognitive psychology is the study of the design of minds that were produced by the evolutionary process. Evolution produced the what, and evolutionary biology is the study of why. Most cognitive scientists know this. What they don't yet know is that understanding the evolutionary process can bring the architecture of the mind into sharper relief.

Table 2. Why cash registers add (adapted from Marr, 1982, pp. 22-23)

1. Rule defining addition: There is a unique element, "zero"; adding zero has no effect: 2 + 0 = 2.
   Rule governing social exchange in a supermarket: If you buy nothing, it should cost you nothing; and buying nothing and something should cost the same as buying just the something. (The rules for zero.)
2. Rule defining addition: Commutativity: (2 + 3) = (3 + 2) = 5.
   Rule governing social exchange in a supermarket: The order in which goods are presented to the cashier should not affect the total. (Commutativity.)
3. Rule defining addition: Associativity: (2 + 3) + 4 = 2 + (3 + 4).
   Rule governing social exchange in a supermarket: Arranging the goods into two piles and paying for each pile separately should not affect the total amount you pay. (Associativity; the basic operation for combining prices.)
4. Rule defining addition: Each number has a unique inverse that when added to the number gives zero: 2 + (-2) = 0.
   Rule governing social exchange in a supermarket: If you buy an item and then return it for a refund, your total expenditure should be zero. (Inverses.)

For biological systems, the nature of the designer carries implications for the nature of the design. The brain can process information because it contains complex neural circuits that are functionally organized. The only component of the evolutionary process that can build complex structures that are functionally organized is natural selection. And the only kind of problems that natural selection can build complexly organized structures for solving are adaptive problems, where "adaptive" has a very precise, narrow technical meaning (Dawkins, 1986; Pinker & Bloom, 1990; Tooby & Cosmides, 1990a, 1992; Williams, 1966).

Bearing this in mind, let's consider the source of Marr's intuitions about the cash register. Buying food at a grocery store is a form of social exchange - cooperation between two or more individuals for mutual benefit. The adaptive problems that arise when individuals engage in this form of cooperation have constituted a long-enduring selection pressure on the hominid line. Paleoanthropological evidence indicates that social exchange extends back at least 2 million years in the human line, and the fact that social exchange exists in some of our primate cousins suggests that it may be even more ancient than that. It is exactly the kind of problem that selection can build cognitive mechanisms for solving. Social exchange is not a recent cultural invention, like writing, yam cultivation, or computer programming; if it were, one would expect to find evidence of its having one or several points of origin, of its having spread by contact, and of its being extremely elaborated in some cultures and absent in others. But its distribution does not fit this pattern. Social exchange is both universal and highly elaborated across human cultures, presenting itself in many forms: reciprocal
gift-giving, food-sharing, market pricing, and so on (Cosmides & Tooby, 1992; Fiske, 1991). It is an ancient, pervasive and central part of human social life.

In evolutionary biology, researchers such as George Williams, Robert Trivers, W.D. Hamilton, and Robert Axelrod have explored constraints on the evolution of social exchange using game theory, modeling it as a repeated Prisoner's Dilemma. These analyses have turned up a number of important features of this adaptive problem, a crucial one being that social exchange cannot evolve in a species unless individuals have some means of detecting individuals who cheat and excluding them from future interactions (e.g., Axelrod, 1984; Axelrod & Hamilton, 1981; Boyd, 1988; Trivers, 1971). One can think of this as an evolvability constraint. Selection cannot construct mechanisms - in any species, including humans - that systematically violate such constraints. Behavior is generated by computational mechanisms. If a species engages in social exchange behavior, then it does so by virtue of computational mechanisms that satisfy the evolvability constraints that characterize this adaptive problem.

Behavioral ecologists have used these constraints on the evolution of social exchange to build computational theories of this adaptive problem - theories of what and why. These theories have provided a principled basis for generating hypotheses about the phenotypic design of mechanisms that generate social exchange in a variety of species. They spotlight design features that any cognitive program capable of solving this adaptive problem must have. By cataloging these design features, animal behavior researchers were able to look for - and discover - previously unknown aspects of the psychology of social exchange in species from chimpanzees, baboons and vervets to vampire bats and hermaphroditic coral-reef fish (e.g., Cheney & Seyfarth, 1990; de Waal & Luttrell, 1988; Fischer, 1988; Smuts, 1986; Wilkinson, 1988, 1990).
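The evolvability constraint can be illustrated with a toy repeated Prisoner's Dilemma. The payoff values and strategy names below are our illustrative assumptions (standard textbook payoffs, not figures from the authors cited above): an indiscriminate cooperator is exploited by a cheater on every round, while a design that detects defection and withholds further cooperation loses at most one round.

```python
# Standard repeated Prisoner's Dilemma payoffs (illustrative values):
# temptation > reward > punishment > sucker's payoff
T, R, P, S = 5, 3, 1, 0

def payoff(mine, theirs):
    """My payoff for one round, given both moves ('C' = cooperate, 'D' = defect)."""
    return {('C', 'C'): R, ('C', 'D'): S, ('D', 'C'): T, ('D', 'D'): P}[(mine, theirs)]

def play(strategy_a, strategy_b, rounds=10):
    """Play a repeated game; each strategy sees only the opponent's last move."""
    score_a = score_b = 0
    last_a = last_b = None
    for _ in range(rounds):
        move_a, move_b = strategy_a(last_b), strategy_b(last_a)
        score_a += payoff(move_a, move_b)
        score_b += payoff(move_b, move_a)
        last_a, last_b = move_a, move_b
    return score_a, score_b

def indiscriminate(last_opponent_move):
    return 'C'   # cooperates with everyone, including cheats

def cheater(last_opponent_move):
    return 'D'   # takes the benefit, never reciprocates

def cheater_detector(last_opponent_move):
    # Cooperate initially; refuse further cooperation after being cheated.
    return 'C' if last_opponent_move in (None, 'C') else 'D'

exploited, cheat_gain = play(indiscriminate, cheater)   # (0, 50): exploited every round
guarded, cheat_loss = play(cheater_detector, cheater)   # (9, 14): loses only round one
```

On this toy landscape, unconditional cooperation cannot persist against cheats, while conditional cooperation can - which is why, in the richer evolutionary models cited above, some means of detecting and excluding cheaters is a precondition for social exchange to evolve.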
This research strategy has been successful for a very simple reason: very few cognitive programs satisfy the evolvability constraints for social exchange. If a species engages in this behavior (and not all do), then its cognitive architecture must contain one of these programs. In our own species, social exchange is a universal, species-typical trait with a long evolutionary history. We have strong and cross-culturally reliable intuitions about how this form of cooperation should be conducted, which arise in the absence of any explicit instruction (Cosmides & Tooby, 1992; Fiske, 1991). In developing his computational theory of the cash register - a tool used in social exchange - David Marr was consulting these deep human intuitions. From these facts, we can deduce that the human cognitive architecture contains

3 Had Marr known about the importance of cheating in evolutionary analyses of social exchange, he might have been able to understand other features of the cash register as well. Most cash registers have anti-cheating devices. Cash drawers lock until a new set of prices is punched in; two rolls of tape keep track of transactions (one is for the customer; the other rolls into an inaccessible place in the cash register, preventing the clerk from altering the totals to match the amount of cash in the drawer); and so on.
programs that satisfy the evolvability constraints for social exchange. As cognitive scientists, we should be able to specify what rules govern human behavior in this domain, and why we humans reliably develop circuits that embody these rules rather than others. In other words, we should be able to develop a computational theory of the organic information-processing device that governs social exchange in humans.

Since Marr, cognitive scientists have become familiar with the notion of developing computational theories to study perception and language, but the notion that one can develop computational theories to study the information-processing devices that give rise to social behavior is still quite alien. Yet some of the most important adaptive problems our ancestors had to solve involved navigating the social world, and some of the best work in evolutionary biology is devoted to analyzing constraints on the evolution of mechanisms that solve these problems. In fact, these evolutionary analyses may be the only source of constraints available for developing computational theories of social cognition.
Principles of organic design

The field of evolutionary biology summarizes our knowledge of the engineering principles that govern the design of organisms. As a source of theoretical guidance about organic design, functionalism has an unparalleled historical track record. As Ernst Mayr notes, "The adaptationist question, 'What is the function of a given structure or organ?' has been for centuries the basis for every advance in physiology" (1983, p. 328). Attention to function can advance the cognitive sciences as well.

Aside from those properties acquired by chance or imposed by engineering constraint, the mind consists of a set of information-processing circuits that were designed by natural selection to solve adaptive problems that our hunter-gatherer ancestors faced generation after generation.4 If we know what these problems were, we can seek mechanisms that are well engineered for solving them. The exploration and definition of these adaptive problems is a major activity of evolutionary biologists. By combining results derived from mathematical modeling, comparative studies, behavioral ecology, paleoanthropology and other fields,

4 Our ancestors spent the last 2 million years as Pleistocene hunter-gatherers (and several hundred million years before that as one kind of forager or another). The few thousand years since the scattered appearance of agriculture is a short stretch, in evolutionary terms (less than 1% of the past 2 million years). Complex designs - ones requiring the coordinated assembly of many novel, functionally integrated features - are built up slowly, change by change, subject to the constraint that each new design feature must solve an adaptive problem better than the previous design (the vertebrate eye is an example). For these and other reasons, it is unlikely that our species evolved complex adaptations even to agriculture, let alone to post-industrial society (for discussion, see Dawkins, 1982; Tooby & Cosmides, 1990a, 1990b).
evolutionary biologists try to identify what problems the mind was designed to solve and why it was designed to solve those problems rather than some other ones. In other words, evolutionary biologists explore exactly those questions that Marr argued were essential for developing computational theories of adaptive information-processing problems.

Computational theories address what and why, but because there are multiple ways of achieving any solution, experiments are needed to establish how. But the more precisely you can define the goal of processing - the more tightly you can constrain what would count as a solution - the more clearly you can see what a mechanism capable of producing that solution would have to look like. The more constraints you can discover, the more the field of possible solutions is narrowed, and the more you can concentrate your experimental efforts on discriminating between viable hypotheses.

A technological analogy may make this clearer. It is difficult to figure out the design of the object I'm now thinking about if all you know is that it is a machine (toaster? airplane? supercollider?). But the answer becomes progressively clearer as I add functional constraints: (1) it is well designed for entertainment (movie projector, TV, CD player?); (2) it was not designed to project images (nothing with a screen); (3) it is well designed for playing taped music (stereo or Walkman); (4) it was designed to be easily portable during exercise (Walkman). Knowing the object is well engineered for solving these problems provides powerful clues about its functional design features that can guide research. Never having seen one, you would know that it must contain a device that converts magnetic patterns into sound waves; a place to insert the tape; an outer shell no smaller than a tape, but no larger than necessary to perform the transduction; and so on. Guessing at random would have taken forever.
Information about features that have no impact on the machine's function would not have helped much either (e.g., its color, the number of scratches). Because functionally neutral features are free to vary, information about them does little to narrow your search. Functional information helps because it narrowly specifies the outcome to be produced. The smaller the class of entities capable of producing that outcome, the more useful functional information is. This means (1) narrow definitions of outcomes are more useful than broad ones (tape player versus entertainment device), and (2) functional information is most useful when there are only a few ways of producing an outcome (Walkman versus paperweight; seeing versus scratching). Narrow definitions of function are a powerful methodological tool for discovering the design features of any complex problem-solving device, including the human mind. Yet the definition of function that guides most research on the mind (it "processes information") is so broad that it applies even to a Walkman. It is possible to create detailed theories of adaptive function. This is because
natural selection is only capable of producing certain kinds of designs: designs that promoted their own reproduction in past environments. This rule of organic design sounds too general to be of any help. But when it is applied to real species in actual environments, this deceptively simple constraint radically limits what counts as an adaptive problem and, therefore, narrows the field of possible solutions.

Table 3 lists some principles of organic design that cognitive psychologists could be using, but aren't. Doing experiments is like playing "20 questions" with nature, and evolutionary biology gives you an advantage in this game: it tells you what questions are most worth asking, and what the answer will probably look like. It provides constraints - functional and otherwise - from which computational theories of adaptive information-processing problems can be built.
Taking function seriously

We know the cognitive science that intuition has wrought. It is more difficult, however, to know how our intuitions might have blinded us. What cognitive systems, if any, are we not seeing? How would evolutionary functionalism transform the science of mind?

Table 3. Evolutionary biology provides constraints from which computational theories of adaptive information-processing problems can be built

To build a computational theory, you need to answer two questions:
1. What is the adaptive problem?
2. What information would have been available in ancestral environments for solving it?

Some sources of constraints:
1. More precise definition of Marr's "goal" of processing that is appropriate to evolved (as opposed to artificial) information-processing systems.
2. Game-theoretic models of the dynamics of natural selection (e.g., kin selection, Prisoner's Dilemma and cooperation) - particularly useful for analysis of cognitive mechanisms responsible for social behavior.
3. Evolvability constraints: can a design with properties X, Y, and Z evolve, or would it have been selected out by alternative designs with different properties? (i.e., does the design represent an evolutionarily stable strategy? - related to point 2.)
4. Hunter-gatherer studies and paleoanthropology - a source of information about the environmental background against which our cognitive architecture evolved. (Information that is present now may not have been present then, and vice versa.)
5. Studies of the algorithms and representations whereby other animals solve the same adaptive problem. (These will sometimes be the same, sometimes different.)
Textbooks in psychology are organized according to a folk psychological categorization of mechanisms: "attention", "memory", "reasoning", "learning". In contrast, textbooks in evolutionary biology and behavioral ecology are organized according to adaptive problems: foraging (hunting and gathering), kinship, predator defense, resource competition, cooperation, aggression, parental care, dominance and status, inbreeding avoidance, courtship, mateship maintenance, trade-offs between mating effort and parenting effort, mating system, sexual conflict, paternity uncertainty and sexual jealousy, signaling and communication, navigation, habitat selection, and so on. Textbooks in evolutionary biology are organized according to adaptive problems because these are the only problems that selection can build mechanisms for solving. Textbooks in behavioral ecology are organized according to adaptive problems because circuits that are functionally specialized for solving these problems have been found in species after species. No less should prove true of humans. Twenty-first-century textbooks on human cognition will probably be organized similarly.

Fortunately, behavioral ecologists and evolutionary biologists have already created a library of sophisticated models of the selection pressures, strategies and trade-offs that characterize these very fundamental adaptive problems, which they use in studying processes of attention, memory, reasoning and learning in non-humans. Which model is applicable for a given species depends on certain key life-history parameters. Findings from paleoanthropology, hunter-gatherer archaeology, and studies of living hunter-gatherer populations locate humans in this theoretical landscape by filling in the critical parameter values.
Ancestral hominids were ground-living primates; omnivores, exposed to a wide variety of plant toxins and having a sexual division of labor between hunting and gathering; mammals with altricial young, long periods of biparental investment in offspring, enduring male-female mateships, and an extended period of physiologically obligatory female investment in pregnancy and lactation. They were a long-lived, low-fecundity species in which variance in male reproductive success was higher than variance in female reproductive success. They lived in small nomadic kin-based bands of perhaps 20-100; they would rarely (if ever) have seen more than 1000 people at one time; they had little opportunity to store provisions for the future; they engaged in cooperative hunting, defense and aggressive coalitions; they made tools and engaged in extensive amounts of cooperative reciprocation; they were vulnerable to a large variety of parasites and pathogens. When these parameters are combined with formal models from evolutionary biology and behavioral ecology, a reasonably consistent picture of ancestral life begins to appear (e.g., Tooby & DeVore, 1987). In this picture, the adaptive problems posed by social life loom large. Most of these are characterized by strict evolvability constraints, which could only be satisfied by cognitive programs that are specialized for reasoning about the social
world. This suggests that our evolved mental architecture contains a large and intricate "faculty" of social cognition (Brothers, 1990; Cosmides & Tooby, 1992; Fiske, 1991; Jackendoff, 1992). Yet despite its importance, very little work in the cognitive sciences has been devoted to looking for cognitive mechanisms that are specialized for reasoning about the social world. Nor have cognitive neuroscientists been looking for dissociations among different forms of social reasoning, or between social reasoning and other cognitive functions. The work on autism as a neurological impairment of a "theory of mind" module is a very notable - and very successful - exception (e.g., Baron-Cohen, Leslie & Frith, 1985; Frith, 1989; Leslie, 1987).

There are many reasons for the neglect of these topics in the study of humans (see Tooby & Cosmides, 1992), but a primary one is that cognitive scientists have been relying on their intuitions for hypotheses rather than asking themselves what kind of problems the mind was designed by evolution to solve. By using evolutionary biology to remind ourselves of the types of problems hominids faced across hundreds of thousands of generations, we can escape the narrow conceptual cage imposed on us by our intuitions and folk psychology. This is not a minor point: if you don't think a thing exists, you won't take the steps necessary to find it. By having the preliminary map that an evolutionary perspective provides, we can find our way out into the vast, barely explored areas of the human cognitive architecture.
II. Computational theories derived from evolutionary biology suggest that the mind is riddled with functionally specialized circuits

During most of this century, research in psychology and the other biobehavioral and social sciences has been dominated by the assumptions of what we have elsewhere called the Standard Social Science Model (SSSM) (Tooby & Cosmides, 1992). This model's fundamental premise is that the evolved architecture of the human mind is composed mainly of cognitive processes that are content-free, few in number and general purpose. These general-purpose mechanisms fly under names such as "learning", "induction", "imitation", "reasoning" and "the capacity for culture", and are thought to explain nearly every human phenomenon. Their structure is rarely specified by more than a wave of the hand. In this view, the same mechanisms are thought to govern how one acquires a language and a gender identity, an aversion to incest and an appreciation for vistas, a desire for friends and a fear of spiders - indeed, nearly every thought and feeling of which humans are capable. By definition, these empiricist mechanisms have no inherent content built into their procedures, they are not designed to construct certain mental contents more readily than others, and they have no features specialized for processing particular kinds of content over others. In
Beyond intuition and instinct blindness
83
other words, they are assumed to operate uniformly, no matter what content, subject matter or domain of life experience they are operating on. (For this reason, such procedures are described as content-independent, domain-general or content-free.) The premise that these mechanisms have no content to impart is what leads to the doctrine central to the modern behavioral and social sciences: that all of our particular mental content originated in the social and physical world and entered through perception. As Aquinas put this empiricist tenet centuries ago, "There is nothing in the intellect that was not first in the senses." As we will discuss, this view of central processes is difficult to reconcile with modern evolutionary biology.
The weakness of content-independent architectures

To some it may seem as if an evolutionary perspective supports the case that our cognitive architecture consists primarily of powerful, general-purpose problem-solvers - inference engines that embody the content-free normative theories of mathematics and logic. After all, wouldn't an organism be better equipped and better adapted if it could solve a more general class of problems rather than a narrower one? This empiricist view is difficult to reconcile with evolutionary principles for a simple reason: content-free, general-purpose problem-solving mechanisms are extraordinarily weak - or even inert - compared to specialized ones. Every computational system - living or artificial - must somehow solve the frame problem (e.g., Pylyshyn, 1987). Most artificial intelligence programs have domain-specific knowledge and procedures that do this (even those that are called "general purpose"). A program equipped solely with domain-general procedures can do nothing unless the human programmer solves the frame problem for it: either by artificially constraining the problem space or by supplying the program - by fiat - with pre-existing knowledge bases ("innate" knowledge) that it could not have acquired on its own, with or without connections to a perceptual system.

However, to be a viable hypothesis about our cognitive architecture, a proposed design must pass a solvability test: it must, in principle, be able to solve problems humans are known to be able to solve. At a minimum, any proposed cognitive architecture must have been able to produce sufficiently self-reproductive behavior in ancestral environments - we know this because all living species have reproduced themselves in an unbroken chain up to the present.
While artificial intelligence programs struggle to recognize and manipulate coke cans, naturally intelligent programs situated in organisms successfully negotiate through lifetimes full of biotic antagonists - predators, conspecific competitors, self-defending food items, parasites, even siblings. At the same time, these naturally intelligent programs solve a large series of intricate problems in the project of assembling a
sufficient number of replacement individuals: offspring. Just as a hypothesized set of cognitive mechanisms underlying language must be able to account for the facts of human linguistic behavior, so too must any hypothetical domain-general cognitive architecture reliably generate solutions to all of the problems that were necessary for survival and reproduction in the Pleistocene. For humans and most other species, this is a remarkably diverse, highly structured and very complex set of problems. If it can be shown that there are essential adaptive problems that humans must have been able to solve in order to have propagated, and that domain-general mechanisms cannot solve them, then the domain-general hypothesis fails. We think there is a very large number of such problems, including inclusive fitness regulation, mate choice, nutritional regulation, foraging, navigation, incest avoidance, sexual jealousy, predator avoidance, social exchange - at a minimum, any kind of information-processing problem that involves motivation - and many others as well. We have developed this argument in detail elsewhere (Cosmides & Tooby, 1987, 1994; Tooby & Cosmides, 1990a, 1992), so we won't belabor it here. Instead, we will simply summarize a few of the relevant points.

(1) The "Stoppit" problem. There is a Gary Larson cartoon about an "all-purpose" product called "Stoppit". When sprayed from an aerosol can, Stoppit stops faucet drips, taxis, cigarette smoking, crying babies and charging elephants. An "all-purpose" cognitive program is no more feasible, for an analogous reason: what counts as adaptive behavior differs markedly from domain to domain. An architecture equipped only with content-independent mechanisms must succeed at survival and reproduction by applying the same procedures to every adaptive problem.
But there is no domain-general criterion of success or failure that correlates with fitness (e.g., what counts as a "good" mate has little in common with a "good" lunch or a "good" brother). Because what counts as the wrong thing to do differs from one class of problems to the next, there must be as many domain-specific subsystems as there are domains in which the definitions of successful behavioral outcomes are incommensurate.

(2) Combinatorial explosion. Combinatorial explosion paralyzes even moderately domain-general systems when they encounter real-world complexity. As generality is increased by adding new dimensions to a problem space or new branch points to a decision tree, the computational load increases with catastrophic rapidity. A content-independent, specialization-free architecture contains no rules of relevance, procedural knowledge or privileged hypotheses, and so could not solve any biological problem of routine complexity in the amount of time an organism has to solve it (for discussion see, for example, Carey, 1985; Gallistel, Brown, Carey, Gelman, & Keil, 1991; Keil, 1989; Markman, 1989; Tooby & Cosmides, 1992). The question is not "How much specialization does a general purpose system require?" but rather "How many degrees of freedom can
a system tolerate - even a specialized, highly targeted one - and still compute decisions in useful, real-world time?" Combinatorics guarantee that real systems can tolerate only a small number. (Hence this problem cannot be solved by placing a few "constraints" on a general system.)

(3) Clueless environments. Content-free architectures are limited to knowing what can be validly derived by general processes from perceptual information. This sharply limits the range of problems they can solve: when the environment is clueless, the mechanism will be too. Domain-specific mechanisms are not limited in this way. They can be constructed to embody clues that fill in the blanks when perceptual evidence is lacking or difficult to obtain.

Consider the following adaptive problem. All plant foods contain an array of toxins. Ones that your liver metabolizes with ease can sometimes harm a developing embryo. This subtle statistical relationship between the environment, eating behavior and fitness is ontogenetically "invisible": it cannot be observed or induced via general-purpose processes on the basis of perceptual evidence.5 It can, however, be "observed" phylogenetically, by natural selection, because selection does not work by inference or simulation. Natural selection "counts up" the actual results of alternative designs operating in the real world, over millions of individuals and thousands of generations, and weights these alternatives by the statistical distribution of their consequences: those design features that statistically lead to the best available outcome are retained. In this sense it is omniscient - it is not limited to what could be validly deduced by one individual based on a short period of experience, it is not limited to what is locally perceivable, and it is not confused by spurious local correlations.
As a result, it can build circuits - like those that regulate food choice during pregnancy - which embody privileged hypotheses that reflect and exploit these virtually unobservable relationships in the world. For example, the embryo/toxin problem is solved by a set of functionally specialized mechanisms that adjust the threshold on the mother's normal food aversion system (Profet, 1992). They lower it when the embryo is most at risk - thereby causing the food aversions, nausea and vomiting of early pregnancy - and raise it when caloric intake becomes a priority. As a result, the mother avoids ordinarily palatable foods when they would threaten the embryo: she responds adaptively to an ontogenetically invisible relationship. Functionally specialized designs allow organisms to solve a broad range of otherwise unsolvable adaptive problems. (For discussion of this design principle, see Cosmides & Tooby, 1987, 1994; Shepard, 1981, 1987; Tooby & Cosmides, 1990a.)
5 Women ingest thousands of plant toxins every day; embryos self-abort for many reasons; early term abortions are often undetectable; the best trade-off between calories consumed and risk of teratogenesis is obscure.
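The phylogenetic "counting up" described above can be illustrated with a toy simulation (a sketch only; the population size, fitness edge and noise level are arbitrary assumptions, not a model of any real trait). No single lifetime outcome reveals which design is better, yet selection, aggregating noisy outcomes across the whole population over many generations, reliably favors the design with the better statistical payoff:

```python
import random

def selection_sim(generations=300, pop_size=1000, edge=0.05, noise=0.5, seed=1):
    """Selection 'counts up' noisy lifetime outcomes across a population.
    The specialized design has a small mean fitness edge (`edge`) that is
    swamped by lifetime noise for any one individual, yet it spreads."""
    rng = random.Random(seed)
    # Start with the specialized design rare: 10% of the population.
    pop = [i < pop_size // 10 for i in range(pop_size)]
    for _ in range(generations):
        # Lifetime outcome = mean fitness of the design + large noise.
        weights = [max(0.0, rng.gauss(1.0 + (edge if specialized else 0.0), noise))
                   for specialized in pop]
        # Offspring are drawn in proportion to realized lifetime outcomes.
        pop = rng.choices(pop, weights=weights, k=pop_size)
    return sum(pop) / pop_size  # final frequency of the specialized design

final_freq = selection_sim()
```

Setting `edge=0.0` leaves the trait to drift near its starting frequency: that difference between the two runs is the statistical relationship that is invisible within a lifetime but visible to selection.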
In sum, architectures that do not come factory-equipped with sufficiently rich sets of content-specific machinery fail the solvability test. They could not have evolved, survived or propagated because they are incapable of solving even routine adaptive problems (Cosmides & Tooby, 1987, 1994; Tooby & Cosmides, 1992).
Natural selection, efficiency and functional specialization

Some researchers accept the conclusion that the human mind cannot consist solely of content-independent machinery, but nevertheless continue to believe that the mind needs very little content-specific organization to function. They believe that the preponderance of mental processes are content-independent and general purpose. Moreover, they believe that the correct null hypothesis - the parsimonious, prudent scientific stance - is to posit as few functionally specialized mechanisms as possible. This stance ignores what is now known about the nature of the evolutionary process and the types of functional organization that it produces. Natural selection is a relentlessly hill-climbing process which tends to replace relatively less efficient designs with ones that perform better. Hence, in deciding which of two alternative designs is more likely to have evolved, their comparative performance on ancestral adaptive problems is the appropriate standard to use. Given this standard, positing a preponderance of general-purpose machinery is neither prudent nor parsimonious.6 General-purpose mechanisms can't solve most adaptive problems at all, and in those few cases where one could solve a problem, a specialized mechanism is likely to solve it more efficiently.

The reason is straightforward. A general engineering principle is that the same machine is rarely capable of solving two different problems equally well. We have both cork-screws and cups because each solves a particular problem better than the other. It would be extremely difficult to open a bottle of wine with a cup or to drink from a cork-screw. This same principle applies to the design of the human body. The heart is elegantly designed for pumping blood, but it is not good at detoxifying poisons; the liver is specialized for detoxifying poisons, but it cannot function as a pump.
Pumping blood throughout the body and detoxifying poisons are two very different problems; consequently, the human body has a different machine for solving each of them. In biology, machines like these - ones that are specialized and functionally distinct - are called adaptive specializations (Rozin, 1976). Specialization of design is natural selection's signature and its most common result (Williams, 1966).7 In fact, the more important the adaptive problem, the more intensely natural selection tends to specialize and improve the performance of the mechanism for solving it. There is no reason to believe that the human brain and mind are any exception. The cognitive programs that govern how you choose a mate should differ from those that govern how you choose your dinner. Different information-processing problems usually have different solutions. Implementing different solutions requires different, functionally distinct mechanisms (Sherry & Schacter, 1987). Speed, reliability and efficiency can be engineered into specialized mechanisms because there is no need to engineer a compromise between mutually incompatible task demands: a jack of all trades - assuming one is possible at all - is necessarily a master of none. For this reason, one should expect the evolved architecture of the human mind to include many functionally distinct cognitive adaptive specializations.

And it does. For example, the learning mechanisms that govern language acquisition are different from those that govern the acquisition of food aversions, and both of these are different from the learning mechanisms that govern the acquisition of snake phobias (e.g., Cook, Hodes, & Lang, 1986; Cook & Mineka, 1989; Garcia, 1990; Mineka & Cook, 1988; Ohman, Dimberg, & Ost, 1985; Ohman, Eriksson, & Olofsson, 1975; Pinker, 1994). These adaptive specializations are domain-specific: the specialized design features that make them good at solving the problems that arise in one domain (avoiding venomous snakes) make them bad at solving the problems that arise in another (inducing a grammar).

6 Parsimony applies to number of principles, not number of entities - physicists posit a small number of laws, not a small number of elements, molecules or stellar bodies. Epicycle upon epicycle would have to be added on to evolutionary theory to create a model in which less efficient designs frequently outcompeted more efficient ones.
They are also content-dependent: they are activated by different kinds of content (speech versus screams), and their procedures are designed to accept different kinds of content as input (sentences versus snakes). A mind that applied relatively general-purpose reasoning circuits to all these problems, regardless of their content, would be a very clumsy problem-solver. But flexibility and efficiency of thought and action can be achieved by a mind that contains a battery of special-purpose circuits. The mind is probably more like a Swiss army knife than an all-purpose blade: competent in so many situations because it has a large number of components - bottle opener, cork-screw, knife, toothpick, scissors - each of which is well designed for solving a different problem.

The functional architecture of the mind was designed by natural selection; natural selection is a hill-climbing process which produces mechanisms that solve adaptive problems well; and a specialized design is usually able to solve a problem better than a more generalized one. It is unlikely that a process with these properties would design central processes that are general purpose and content-free. Consequently, one's default assumption should be that the architecture of the human mind is saturated with adaptive specializations.

7 There are strict standards of evidence that must be met before a design feature can be considered an adaptation for performing function X. (1) The design feature must be species-typical; (2) function X must be an adaptive problem (i.e., a cross-generationally recurrent problem whose solution would have promoted the design feature's own reproduction); (3) the design feature must reliably develop (in the appropriate morphs) given the developmental circumstances that characterized its environment of evolutionary adaptedness; and, most importantly, (4) it must be shown that the design feature is particularly well designed for performing function X, and that it cannot be better explained as a by-product of some other adaptation or physical law. Contrary to popular belief, the following forms of "evidence" are not relevant: (1) showing that the design feature has a high heritability; (2) showing that variations in the environment do not affect its development; (3) showing that "learning" plays no role in its development. (Criteria for frequency-dependent adaptations differ. For refinements and complications, see Dawkins, 1982, 1986; Symons, 1992; Tooby & Cosmides, 1990b, 1992; and, especially, Williams, 1966, 1985.)
How to find a needle in a haystack

The human brain is the most complex system scientists have ever tried to understand; identifying its components is enormously difficult. The more functionally integrated circuits it contains, the more difficult it will be to isolate and map any one of them. Looking for a functionally integrated mechanism within a multimodular mind is like looking for a needle in a haystack: the odds that you'll find one are low unless you can radically narrow the search space. Marr's central insight was that you can do this by developing computational theories of the problems these mechanisms were designed to solve - for the human brain, the adaptive problems our hunter-gatherer ancestors faced.

The only behavioral scientists who still derive their hypotheses from intuition and folk psychology, rather than from an evolutionarily based theory, are those who study humans.8 The empirical advantages of using evolutionary biology to develop computational theories of adaptive problems have already been amply demonstrated in the study of non-human minds (e.g., Gallistel, 1990; Gould, 1982; Krebs & Davies, 1987; Real, 1991). We wanted to demonstrate its utility in studying the human mind. We thought an effective way of doing this would be to use an evolutionarily derived computational theory to discover cognitive mechanisms whose existence no one had previously suspected. Because most cognitive scientists still think of central processes as content-independent, we thought it would be particularly interesting to demonstrate the existence of central processes that are functionally specialized and content-dependent: domain-specific reasoning mechanisms.

Toward this end, we have conducted an experimental research program over the last 10 years, exploring the hypothesis that the human mind contains specialized circuits designed for reasoning about adaptive problems posed by the social world of our ancestors: social exchange, threat, coalitional action, mate choice, and so on. We initially focused on social exchange because (1) the evolutionary theory is clear and well developed; (2) the relevant selection pressures are strong; (3) paleoanthropological evidence suggests that hominids have been engaging in it for millions of years - more than enough time for selection to shape specialized mechanisms; and (4) humans in all cultures engage in social exchange. By starting with an adaptive problem hunter-gatherers are known to have faced, we could proceed to design experiments to test for associated cognitive specializations.

The evolutionary analysis of social exchange parallels the economist's concept of trade. Sometimes known as "reciprocal altruism", social exchange is an "I'll scratch your back if you scratch mine" principle (for evolutionary analyses see, for example, Axelrod, 1984; Axelrod & Hamilton, 1981; Boyd, 1988; Trivers, 1971; Williams, 1966). Using evolvability constraints that biologists had already identified (some involving the Prisoner's Dilemma), we developed a computational theory of the information-processing problems that arise in this domain (Cosmides, 1985; Cosmides & Tooby, 1989). This gave us a principled basis for generating detailed hypotheses about the design of the circuits that generate social exchange in humans. Some of the design features we predicted are listed in Table 4. For example, mathematical analyses had established cheater detection as a crucial adaptive problem: circuits that generate social exchange will be selected out unless they allow individuals to detect those who fail to reciprocate favors - cheaters. This evolvability constraint led us directly to the hypothesis that humans might have evolved inference procedures that are specialized for detecting cheaters.

8 For a detailed analysis of the common arguments against the application of evolutionary biology to the study of the human mind, see Tooby and Cosmides (1992).
We tested this hypothesis using the Wason selection task, which had originally been developed as a test of logical reasoning (Wason, 1966; Wason & Johnson-Laird, 1972). A large literature already existed showing that people are not very good at detecting logical violations of "if-then" rules in Wason selection tasks, even when these rules deal with familiar content drawn from everyday life (e.g., Manktelow & Evans, 1979; Wason, 1983). For example, suppose you are skeptical when an astrologer tells you, "If a person is a Leo, then that person is brave," and you want to prove him wrong. In looking for exceptions to this rule, you will probably investigate people who you know are Leos, to see whether they are brave. Many people also have the impulse to investigate people who are brave, to see if they are Leos. Yet investigating brave people would be a waste of time; the astrologer said that all Leos are brave - not that all brave people are Leos - so finding a brave Virgo would prove nothing. And, if you are like most people, you probably won't realize that you need to investigate cowards. Yet a coward who turns out to be a Leo would represent a violation of the rule.
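The logical structure of the task can be sketched in a few lines of code (a toy illustration of the material conditional, not part of any experimental materials; the `must_check` helper and the card encoding are ours). A case can falsify "if P, then Q" only if P could be true while Q is false, which is why only the Leo and the coward need checking:

```python
def must_check(p, q):
    """True if some completion of the hidden information (None) could
    make the antecedent true and the consequent false, i.e. could
    falsify 'if P, then Q'."""
    ps = [p] if p is not None else [True, False]
    qs = [q] if q is not None else [True, False]
    return any(a and not b for a in ps for b in qs)

# Rule: "If a person is a Leo (P), then that person is brave (Q)."
# Each 'card' shows one fact; the other side (None) is hidden until checked.
people = {
    "a Leo":          (True,  None),   # hidden side could be 'coward'
    "a Virgo":        (False, None),   # cannot falsify, whatever is hidden
    "a brave person": (None,  True),   # cannot falsify, whatever is hidden
    "a coward":       (None,  False),  # hidden side could be 'Leo'
}

to_check = [name for name, (p, q) in people.items() if must_check(p, q)]
# to_check is ["a Leo", "a coward"]
```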
Table 4. Reasoning about social exchange: evidence of special design*

(a) The following design features were predicted and found:

1. The algorithms governing reasoning about social contracts operate even in unfamiliar situations.
2. The definition of cheating that they embody depends on one's perspective.
3. They are just as good at computing the cost-benefit representation of a social contract from the perspective of one party as from the perspective of another.
4. They embody implicational procedures specified by the computational theory.
5. They include inference procedures specialized for cheater detection.
6. Their cheater detection procedures cannot detect violations of social contracts that do not correspond to cheating.
7. They do not include altruist detection procedures.
8. They cannot operate so as to detect cheaters unless the rule has been assigned the cost-benefit representation of a social contract.

(b) The following by-product hypotheses were empirically eliminated:

1. Familiarity cannot explain the social contract effect.
2. It is not the case that social contract content merely facilitates the application of the rules of inference of the propositional calculus.
3. Social contract content does not merely "afford" clear thinking.
4. Permission schema theory cannot explain the social contract effect; in other words, application of a generalized deontic logic cannot explain the results.
5. It is not the case that any problem involving payoffs will elicit the detection of violations.

* To show that an aspect of the phenotype is an adaptation to perform a particular function, one must show that it is particularly well designed for performing that function, and that it cannot be better explained as a by-product of some other adaptation or physical law.
If your mind had reasoning circuits specialized for detecting logical violations of rules, it would be immediately obvious to you that you should investigate Leos and cowards. But it is not intuitively obvious to most subjects: in general, fewer than 10% of subjects spontaneously realize this. Despite claims for the power of culture and "learning", even formal training in logical reasoning does little to
boost performance (e.g., Cheng, Holyoak, Nisbett, & Oliver, 1986; Wason & Johnson-Laird, 1972). However, we found that people who ordinarily cannot detect violations of "if-then" rules can do so easily and accurately when the violation represents cheating in a situation of social exchange. This is a situation in which one is entitled to a benefit only if one has fulfilled a requirement (e.g., "If you are to eat these cookies, then you must first fix your bed" or "If you are to eat cassava root, then you must have a tattoo on your face"). In these situations, the adaptively correct answer is immediately obvious to almost all subjects, who commonly experience a "pop out" effect. No formal training is needed. Whenever the content of a problem asks subjects to look for cheaters on a social exchange - even when the situation described is culturally unfamiliar and even bizarre - subjects experience the problem as simple to solve, and their performance jumps dramatically. Seventy to ninety percent of subjects get it right: the highest performance ever found for a task of this kind.

From a domain-general, formal view, investigating people eating cassava root and people without tattoos is logically equivalent to investigating Leos and cowards. But everywhere it has been tested, people do not treat social exchange problems as equivalent to other kinds of reasoning problems. Their minds distinguish social exchange contents, and apply domain-specific, content-dependent rules of inference that are adaptively appropriate only to that task. (For a review of the relevant experiments, see Cosmides & Tooby, 1992. For more detailed descriptions, see Cosmides, 1985, 1989; Cosmides & Tooby, 1989; Gigerenzer & Hug, 1992.)

We think that the goal of cognitive research should be to recover, out of carefully designed experimental studies, high-resolution "maps" of the intricate mechanisms that collectively constitute the cognitive architecture.
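The formal equivalence of the two problems is easy to verify: under the material-conditional reading, the cassava/tattoo rule has exactly the same falsification conditions as the astrology rule. A short sketch (the encoding and labels are ours, purely illustrative) picks out the same two cases - the benefit-taker and the requirement-non-meeter:

```python
def possible_violators(cases):
    """Under 'if P, then Q', the only cases worth investigating are those
    that could turn out to have P true and Q false - the same formal test
    whether the content is astrological or a social contract."""
    suspects = []
    for name, (p, q) in cases.items():
        ps = [p] if p is not None else [True, False]
        qs = [q] if q is not None else [True, False]
        if any(a and not b for a in ps for b in qs):
            suspects.append(name)
    return suspects

# "If you eat cassava root (P = benefit taken), then you must have a
# facial tattoo (Q = requirement met)."  None = not yet observed.
villagers = {
    "eats cassava root": (True,  None),
    "eats other food":   (False, None),
    "has a tattoo":      (None,  True),
    "has no tattoo":     (None,  False),
}
cheater_candidates = possible_violators(villagers)
# Formally identical to checking the Leo and the coward; the experimental
# point is that only the social-contract framing makes this obvious.
```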
Our evolutionarily derived computational theory of social exchange allowed us to construct experiments capable of detecting, isolating and mapping out previously unknown cognitive procedures. It led us to predict a large number of design features in advance - features that no one was looking for and that most of our colleagues thought were outlandish (Cosmides & Tooby, 1989). Experimental tests have confirmed the presence of all the predicted design features that have been tested for so far. Those design features that have been tested and confirmed are listed in Table 4, along with the alternative by-product hypotheses that we and our colleagues have eliminated. So far, no known theory invoking general-purpose cognitive processes has been able to explain the very precise and unique pattern of data that experiments like these have generated. The data seem best explained by the hypothesis that humans reliably develop circuits that are complexly specialized for reasoning about reciprocal social interactions.

Parallel lines of investigation have already identified two other domain-specialized reasoning mechanisms: one for reasoning about aggressive threats and one
for reasoning about protection from hazards (e.g., Manktelow & Over, 1990; Tooby & Cosmides, 1989). We are now designing clinical tests to identify the neural basis for these mechanisms. By studying patient populations with autism and other neurological impairments of social cognition, we should be able to see whether dissociations occur along the fracture lines that our various computational theories suggest.
Reasoning instincts

In our view, a large range of reasoning problems (like the astrological one) are difficult because (1) their content is not drawn from a domain for which humans evolved functionally specialized reasoning circuits, and (2) we lack the content-independent circuits necessary for performing certain logical operations ("logical reasoning"). In contrast, social exchange problems are easy because we do have evolved circuits specialized for reasoning about that important, evolutionarily long-enduring problem in social cognition. The inferences necessary for detecting cheaters are obvious to humans for the same reason that the inferences necessary for echolocation are obvious to a bat.

Instincts are often thought of as the polar opposite of reasoning. Non-human animals are widely believed to act through "instinct", while humans "gave up instincts" to become "the rational animal". But the reasoning circuits we have been investigating are complexly structured for solving a specific type of adaptive problem, they reliably develop in all normal human beings, they develop without any conscious effort and in the absence of any formal instruction, they are applied without any conscious awareness of their underlying logic, and they are distinct from more general abilities to process information or to behave intelligently. In other words, they have all the hallmarks of what one usually thinks of as an "instinct" (Pinker, 1994). Consequently, one can think of these specialized circuits as reasoning instincts. They make certain kinds of inferences just as easy, effortless and "natural" for us as spinning a web is for a spider or dead-reckoning is for a desert ant.

Three decades of research in cognitive psychology, evolutionary biology and neuroscience have shown that the central premise of the SSSM - that the mind is general purpose and content-free - is fundamentally misconceived.
An alternative framework - sometimes called evolutionary psychology - is beginning to replace it (Tooby & Cosmides, 1992). According to this view, the evolved architecture of the human mind is full of specialized reasoning circuits and regulatory mechanisms that organize the way we interpret experience, construct knowledge and make decisions. These circuits inject certain recurrent concepts and motivations into our mental life, and they provide universal frames of meaning that allow us to understand the actions and intentions of others. Beneath the level of surface
variability, all humans share certain views and assumptions about the nature of the world and human action by virtue of these universal reasoning circuits (Atran, 1990; Boyer, 1994; Brown, 1991; Carey & Gelman, 1991; Gelman & Hirschfeld, 1994; Keil, 1989; Leslie, 1987; Markman, 1990; Spelke, 1990; Sperber, 1985, 1990, 1994; Symons, 1979; Tooby & Cosmides, 1992).
III. Intuition is a misleading source of hypotheses because functionally specialized mechanisms create "instinct blindness"; computational theories are lenses that correct for instinct blindness

Intuitions about cognition: the limitations of an atheoretical approach

The adaptationist view of a multimodular mind was common at the turn of the century. Early experimental psychologists, such as William James and William McDougall, thought the mind is a collection of "faculties" or "instincts" that direct learning, reasoning and action (James, 1890; McDougall, 1908). These faculties were thought to embody sophisticated information-processing procedures that were domain-specific. In James's view, human behavior is so much more flexibly intelligent than that of other animals because we have more instincts than they do - not fewer (James, 1890). The vocabulary may be archaic, but the model is modern.

With every new discovery, it becomes more apparent that the evolved architecture of the human mind is densely multimodular - that it consists of an enormous collection of circuits, each specialized for performing a particular adaptive function. The study of perception and language has provided the most conspicuous examples, but evidence for the existence of learning instincts (Marler, 1991) and reasoning instincts is pouring in from all corners of the cognitive sciences (for examples, see Atran, 1990; Barkow, Cosmides, & Tooby, 1992; Baron-Cohen, Leslie, & Frith, 1985; A. Brown, 1990; D.E. Brown, 1991; Carey & Gelman, 1991; Cosmides & Tooby, in press; Daly & Wilson, 1988, 1994; Frith, 1989; Gelman & Hirschfeld, 1994; Gigerenzer, Hoffrage, & Kleinbolting, 1991; Leslie, 1988; Pinker, 1994; Rozin, 1976; Spelke, 1988; Sperber, 1994; Symons, 1979; Wilson & Daly, 1992; Wynn, 1992). In spite of this consistent pattern, however, most cognitive scientists balk at the model of a brain crowded with specialized inference engines.
Even Fodor, who has championed the case for modular processes, takes the traditional view that "central" processes are general purpose (Fodor, 1983). The notion that learning and reasoning are like perception and language - the complex product of a large collection of functionally specialized circuits - is deeply at war with our intuitions. But so is the inherent indeterminacy in the position of electrons. It is uncomfortable but scientifically necessary to accept that common sense is the
L. Cosmides, J. Tooby
faculty that tells us the world is flat.9 Our intuitions may feel authoritative and irresistibly compelling, and they may lead us to dismiss many ideas as ridiculous. But they are, nevertheless, an untrustworthy guide to the reality of subatomic particles or the evolved structure of the human mind. In the case of central processes, we think human intuition is not merely untrustworthy: it is systematically misleading. Well-designed reasoning instincts should be invisible to our intuitions, even as they generate them - no more accessible to consciousness than retinal cells and line detectors, but just as important in creating our perception of the world. Intuitively, we are all naive realists, experiencing the world as already parsed into objects, relationships, goals, foods, dangers, humans, words, sentences, social groups, motives, artifacts, animals, smiles, glares, relevances and saliences, the known and the obvious. This automatically manufactured universe, input as toy worlds into computers, seems like it could almost be tractable by that perennially elusive collection of general-purpose algorithms cognitive scientists keep expecting to find. But to produce this simplified world that we effortlessly experience, a vast sea of computational problems is being silently solved, out of awareness, by a host of functionally integrated circuits. These reasoning instincts are powerful inference engines, whose automatic, non-conscious operation creates our seamless experience of the world. The sense of clarity and self-evidence they generate is so potent it is difficult to see that the computational problems they solve even exist. As a result, we incorrectly locate the computationally manufactured simplicity that we experience as a natural property of the external world - as the pristine state of nature, not requiring any explanation or research. Thus the "naturalness" of certain inferences acts to obstruct the discovery of the mechanisms that produced them. 
Cognitive instincts create problems for cognitive scientists. Precisely because they work so well - because they process information so effortlessly and automatically - we tend to be blind to their existence. Not suspecting they exist, we do not conduct research programs to find them. To see that they exist, you need to envision an alternative conceptual universe. But these dedicated circuits structure our thought so powerfully that it can be difficult to imagine how things could be otherwise. As William James wrote: It takes . . . a mind debauched by learning to carry the process of making the natural seem strange, so far as to ask for the why of any instinctive human act. To the metaphysician alone can such questions occur as: why do we smile, when pleased, and not scowl? Why are we unable to talk to a
9 This should not be surprising. Our intuitions were designed to generate adaptive behavior in Pleistocene hunter-gatherers, not useful theories for physicists and cognitive scientists.
Beyond intuition and instinct blindness
crowd as we talk to a single friend? Why does a particular maiden turn our wits so upside-down? The common man can only say, Of course we smile, of course our heart palpitates at the sight of the crowd, of course we love the maiden, that beautiful soul clad in that perfect form, so palpably and flagrantly made for all eternity to be loved! And so, probably, does each animal feel about the particular things it tends to do in the presence of particular objects. To the lion it is the lioness which is made to be loved; to the bear, the she-bear. To the broody hen the notion would probably seem monstrous that there should be a creature in the world to whom a nestful of eggs was not the utterly fascinating and precious and never-to-be-too-much-sat-upon object which it is to her. (James, 1890)
For exactly this reason, intuition is an unreliable guide to points of interest in the human mind. Functionally specialized reasoning circuits will make certain inferences intuitive - so "natural" that there doesn't seem to be any phenomenon that is in need of explanation. Consider, for example, sentences (1) and (2):

(1) If he's the victim of an unlucky tragedy, then we should pitch in to help him out.
(2) If he spends his time loafing and living off of others, then he doesn't deserve our help.

The inferences they express seem perfectly natural; there seems to be nothing to explain. They may not always be applicable, but they are perfectly intelligible. But consider sentences (3) and (4):

*(3) If he's the victim of an unlucky tragedy, then he doesn't deserve our help.
*(4) If he spends his time loafing and living off of others, then we should pitch in to help him out.

Sentences (3) and (4) sound eccentric in a way that (1) and (2) do not. Yet they involve no logical contradictions. The inferences they embody seem to violate a grammar of social reasoning - in much the same way that "Alice might slowly" violates the grammar of English but "Alice might come" does not (Cosmides, 1985; Cosmides & Tooby, 1989, 1992). If so, then one needs to look for a reasoning device that can reliably generate (1) and (2) without also generating (3) and (4). Realizing that not generating (3) and (4) is a design feature of the mechanism is tricky, however. Precisely because the device in question does not spontaneously generate inferences like (3) and (4), we rarely notice their absence or feel the need to explain it. And that is the root of the problem. There is a complex pattern to the inferences we generate, but seeing it requires a contrast between figure and
ground; the geometry of a snowflake disappears against a white background. "Unnatural" inferences form the high contrast background necessary to see the complex geometry of the inferences that we do spontaneously generate. Yet these "unnatural" inferences are exactly the ones we don't produce. Without this background, the pattern can't be seen. As a result, we look neither for the pattern, nor for the mechanisms that generate it. And no one guesses that our central processes instantiate domain-specific grammars every bit as rich as the grammar of a natural language (for more examples, see Table 5).
Hidden grammars

In the study of language, a grammar is defined as a finite set of rules that is capable of generating all the sentences of a language without generating any non-sentences; a sentence is defined as a string of words that members of a linguistic community would judge as well formed. In the study of reasoning, a grammar is a finite set of rules that can generate all appropriate inferences while not simultaneously generating inappropriate ones. If it is a grammar of social reasoning, then these inferences are about the domain of social motivation and behavior; an "inappropriate" inference is defined as one that members of a social community would judge as incomprehensible or nonsensical.10

Table 5. Inferences that violate a grammar of social reasoning
(a) I want to help him because he has helped me so often in the past.
    I don't want to help him because whenever I'm in trouble he refuses to help me.
    *I want to help him because whenever I'm in trouble he refuses to help me.
    *I don't want to help him because he has helped me so often in the past.
(b) I love my daughter. If you hurt her, I'll kill you.
    *I love my daughter. If you hurt her, I'll kiss you.
(c) If I help you now, then you must promise to help me.
    *If I help you now, then you must promise to never help me.
(d) He gave her something expecting nothing in return; she was touched.
    *He gave her something expecting nothing in return; she was enraged.
(e) She paid $5 for the book because the book was more valuable to her than $5.
    *She paid $5 for the book because the book was less valuable to her than $5.

The cornerstone of any computational theory of the problem of language acquisition is the specification of a grammar. Discovering the grammar of a human language is so difficult, however, that there is an entire field - linguistics - devoted to the task. The task is difficult precisely because our linguistic inferences are generated by a "language instinct" (Pinker, 1994). One thing this set of specialized circuits can do is distinguish grammatical from ungrammatical sentences. But the rules that generate sentences - the grammar itself - operate effortlessly and automatically, hidden from our conscious awareness. Indeed, these complex rules are so opaque that just 40 years ago most linguists thought each human language - English, Chinese, Setswana - had a completely different grammar. Only recently have these grammars been recognized as minor variants on a Universal Grammar (UG): an invariant set of rules embodied in the brains of all human beings who are not neurologically impaired (Chomsky, 1980; Pinker, 1994).11

Universal grammars of social reasoning are invisible to cognitive scientists now for the same reason that UG was invisible to linguists for such a long time. The fact that the internal operations of the computational machinery in question are automatic and unconscious is a contributing factor; but the causes of invisibility go even deeper.
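The "all appropriate inferences and no inappropriate ones" criterion can be made concrete with a small computational sketch. This is our toy illustration, not the authors' model; the category labels are invented for exposition:

```python
# Toy sketch of a "grammar" of social reasoning: a finite rule set that
# licenses some antecedent/consequent pairings and excludes others,
# mirroring sentences (1)-(4) in the text. Category names are hypothetical.

LICENSED = {
    ("unlucky_victim", "offer_help"),    # cf. sentence (1)
    ("freeloader", "withhold_help"),     # cf. sentence (2)
}

def grammatical(antecedent, consequent):
    """An inference is 'grammatical' iff the finite rule set licenses it."""
    return (antecedent, consequent) in LICENSED

grammatical("unlucky_victim", "offer_help")     # sentence (1): licensed
grammatical("unlucky_victim", "withhold_help")  # sentence (3): excluded
```

The point of the sketch is only the logical form: the same finite set that generates (1) and (2) fails, by design, to generate (3) and (4).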
10 The similarities between a grammar of language and a grammar of social reasoning run even deeper. Context can make a seemingly ungrammatical sentence grammatical. To pick a standard linguistic example, "The horse raced past the barn fell" seems ungrammatical when "raced" is categorized as the main verb of the sentence, but grammatical if the context indicates that there are two horses. "Fell" is then recategorized as the main verb, and "raced" as a passive verb within a prepositional phrase. Context can have the same effect on statements that seem socially ungrammatical. "I'll give you $1000 for your gum wrapper" seems eccentric - ungrammatical - because gum wrappers are considered worthless. It violates a grammatical constraint of social contract theory: that (benefit to offerer) > (cost to offerer) (Cosmides & Tooby, 1989). To become grammatical, the context must cause the violated constraint to be satisfied. For example, recategorizing the gum wrapper as something extremely valuable (potentially justifying the $1000 payment) would do this: the statement seems sensible if you are told that the speaker is a spy who knows the gum wrapper has a microdot with the key for breaking an enemy code.

11 The term "innate" means different things to different scientific communities, but no person who uses the term means "immune to every environmental perturbation". UG is innate in the following sense: its intricate internal organization is the product of our species' genetic endowment in the same way that the internal organization of the eye is. Its neurological development is buffered against most naturally occurring variations in the physical and social environment. Certain environmental conditions are necessary to trigger the development of UG, but these conditions are not the source of its internal organization. As a result, all normal human beings raised in reasonably normal environments develop the same UG (e.g., Pinker, 1994). 
For an extensive discussion of how natural selection structures the relationships among genotype, phenotype and environment in development, see Tooby and Cosmides (1992).
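The constraint in footnote 10 - (benefit to offerer) > (cost to offerer) - lends itself to the same kind of toy sketch. Again this is our illustration, and the numeric values are invented:

```python
# Toy check of the social-contract constraint from footnote 10: an offer
# reads as "grammatical" only when the benefit the offerer expects exceeds
# the cost of what is offered. All values are hypothetical.

def offer_grammatical(benefit_to_offerer, cost_to_offerer):
    return benefit_to_offerer > cost_to_offerer

# "$1000 for your gum wrapper": wrapper assumed worthless -> eccentric.
offer_grammatical(0, 1000)          # constraint violated
# Context recategorizes the wrapper (microdot with the enemy code key).
offer_grammatical(1_000_000, 1000)  # constraint satisfied
```

Context repairs the ungrammatical offer not by changing the rule but by changing the inputs to it, exactly as recategorizing "raced" repairs the garden-path sentence.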
Instinct blindness

UG is a small corner of hypothesis space; there are an indefinitely large number of grammars that are not variants of UG. To explain the fact that all natural languages fall within the bounds of UG, one must first realize that UG exists. To realize that it exists, one must realize that there are alternative grammars. But this last step is where our imagination stumbles. The language instinct structures our thought so powerfully that alternative grammars are difficult to imagine. This is not an incidental feature of the language instinct; it is the language acquisition device's (LAD) principal adaptive function.12 Any set of utterances a child hears is consistent with an infinite number of possible grammars, but only one of them is the grammar of its native language. A content-free learning mechanism would be forever lost in hypothesis space. The LAD is an adaptation to combinatorial explosion: by restricting the child's grammatical imagination to a very small subset of hypothesis space - hypotheses consistent with the principles of UG - it makes language acquisition possible. Its function is to generate grammatical inferences consistent with UG without simultaneously generating inconsistent ones. To do this, the LAD's structure must make alternative grammars literally unimaginable (at least by the language faculty). This is good for the child learning language, but bad for the cognitive scientist, who needs to imagine these unimaginable grammars. Forming the plural through mirror reversal - so that the plural of "cat" is "tac" - is a rule in an alternative grammar. No child considers this possibility; the LAD cannot generate this rule. The cognitive scientist needs to know this, however, in order to characterize UG and produce a correct theory of the LAD's cognitive structure. UG is what, an algorithm is how. 
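The logic of hypothesis-space restriction can be sketched computationally. This is a toy of our own, not a model of the LAD, and the rule names are invented:

```python
# Toy sketch: hypothesis restriction as a defense against combinatorial
# explosion. A constrained learner's candidate rules are fixed in advance;
# rules outside that set - such as pluralization by mirror reversal - are
# never even entertained.

def add_s(stem):
    return stem + "s"        # the familiar English-style rule

def mirror(stem):
    return stem[::-1]        # "cat" -> "tac": the unimaginable alternative

UG_CONSISTENT = [add_s]          # the constrained space the child searches
UNCONSTRAINED = [add_s, mirror]  # what a content-free learner must search

observed = [("cat", "cats"), ("dog", "dogs")]

def fits(rule):
    return all(rule(stem) == plural for stem, plural in observed)

[r.__name__ for r in UG_CONSISTENT if fits(r)]   # ['add_s']
[r.__name__ for r in UNCONSTRAINED if fits(r)]   # ['add_s'], after testing mirror too
```

With two candidate rules the cost of the unconstrained search is trivial; the text's point is that the real space of alternative grammars is infinite, so a learner without built-in restrictions would never converge.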
A proposed algorithm can be ruled out, for example, if formal analyses reveal that it produces both the mirror reverse rule and the "add 's' to a stem" rule. Alternative grammars - and hence Universal Grammar - were difficult to discover because circuits designed to generate only a small subset of all grammatical inferences in the child also do so in the linguist. This property of the language instinct is crucial to its adaptive function. But it caused a form of theoretical blindness in linguists, which obstructed the discovery of UG and of the language instinct itself. One can think of this phenomenon as instinct blindness. Discovering a grammar of social reasoning is likely to prove just as difficult as discovering the grammar of a language, and for exactly the same reasons. Yet

12 As a side-effect, it can also solve problems that played no causal role in its selective history. For example, the LAD was not designed to support writing, but its properties made the design and spread of this cultural invention possible.
there is no field, parallel to linguistics, that is devoted to this task; indeed, very few individuals even recognize the need for such a grammar, let alone such a field (for exceptions, see Cosmides, 1985; Cosmides & Tooby, 1989, 1992; Fiske, 1991; Jackendoff, 1992). Our intuitions blind us not only to the existence of instincts, but to their complexity. The phenomenal experience of an activity as "easy" or "natural" often leads scientists to assume that the processes that give rise to it are simple. Legend has it that in the early days of artificial intelligence, Marvin Minsky assigned the development of machine vision to a graduate student as a summer project. This illusion of simplicity hampered vision research for years: . . . in the 1960s almost no one realized that machine vision was difficult. The field had to go through [a series of fiascoes] before it was at last realized that here were some problems that had to be taken seriously. The reason for this misperception is that we humans are ourselves so good at vision. (Marr, 1982, p. 16)
Phenomenally, seeing seems simple. It is effortless, automatic, reliable, fast, unconscious and requires no explicit instruction. But seeing is effortless, automatic, reliable, fast, and unconscious precisely because there is a vast array of complex, dedicated computational machinery that makes this possible. Most cognitive scientists don't realize it, but they are grossly underestimating the complexity of our central processes. To find someone beautiful, to fall in love, to feel jealous, to experience moral outrage, to fear disease, to reciprocate a favor, to initiate an attack, to deduce a tool's function from its shape - and a myriad other cognitive accomplishments - can seem as simple and automatic and effortless as opening your eyes and seeing. But this apparent simplicity is possible only because there is a vast array of complex computational machinery supporting and regulating these activities. The human cognitive architecture probably embodies a large number of domain-specific "grammars", targeting not just the domain of social life, but also disease, botany, tool-making, animal behavior, foraging and many other situations that our hunter-gatherer ancestors had to cope with on a regular basis. Research on the computational machinery responsible for these kinds of inferences, choices and preferences - especially the social ones - is almost totally absent in the cognitive sciences. This is a remarkable omission, from an evolutionary point of view. Instinct blindness is one culprit; extreme and unfounded claims about cultural relativity are another (e.g., Brown, 1991; Sperber, 1982; Tooby & Cosmides, 1992).
Anthropological malpractice

As a result of the rhetoric of anthropologists, most cognitive researchers have, as part of their standard intellectual furniture, a confidence that cultural relativity
is an empirically established finding of wide applicability (see discussion of the Standard Social Science Model in Tooby & Cosmides, 1992). Consequently, most scientists harbor the incorrect impression that there is no "Universal Grammar" of social reasoning to be discovered. According to this view, a grammar of social reasoning might exist in each culture, but these grammars will differ dramatically and capriciously from one culture to the next. In its most extreme form, the relativist position holds that the grammars of different cultures are utterly incommensurate - that there is no transformation that can map the rules of one onto the rules of another. If so, then these rules cannot be expressions of an underlying UG of social reasoning. Among anthropologists, however, cultural relativism is an interpretation imposed as an article of faith - not a conclusion based on scientific data (Brown, 1991; Sperber, 1982; Tooby & Cosmides, 1992).13 Indeed, Maurice Bloch, a prominent member of the field, has complained that it is the "professional malpractice of anthropologists to exaggerate the exotic character of other cultures" (Bloch, 1977). To some degree, this is a self-legitimizing institutional pressure: why go long distances to study things that could be studied at home (Brown, 1991)? More importantly, however, anthropologists are just as oblivious to what is universally natural for the human mind as the rest of us. Their attention is drawn to what differs from culture to culture, not what is absent from all cultures or what differs from species to species. Drawing on their cognitive instincts, they understand, automatically and without reflection, much of what happens in other cultures. They know they can work out exchanges without language, or see a smile, a shared look, or an aggressive gesture and infer its meaning and its referent. 
Indeed, they operate within a huge set of implicit panhuman assumptions that allow them to decode the residue of human life that does differ from place to place (Sperber, 1982; Tooby & Cosmides, 1992). The notion of universal human reasoning instincts - including social reasoning instincts - is completely compatible with the ethnographic record. It is more than empirically reasonable; it is a logical necessity, for the reasons discussed above. Indeed, without universal reasoning instincts, the acquisition of one's "culture" would be literally impossible, because one wouldn't be able to infer which representations, out of the infinite universe of possibilities, existed in the minds of other members of the culture (Boyer, 1994; Chomsky, 1980; Sperber, 1985, 1990; Tooby & Cosmides, 1992). Instinct blindness is a side-effect of any instinct whose function is to generate some inferences or behaviors without simultaneously generating others. This is a
13 For a history and discussion of how unsupported relativist claims gained widespread acceptance in the social sciences, see Brown (1991) and Tooby and Cosmides (1992).
very general property of instincts, because combinatorial explosion is a very general selection pressure (for discussion, see Tooby & Cosmides, 1992). The fact that human instincts are difficult for human minds to discover is a side-effect of their adaptive function. Many aspects of the human mind can't be seen by the naked "I" - by intuition unaided by theory. A good theory rips away the veil of naturalness and familiarity that our own minds create, exposing computational problems whose existence we never even imagined. The cognitive sciences need theoretical guidance that is grounded in something beyond intuition. Otherwise, we're flying blind.
Corrective lenses

There are various ways of overcoming instinct blindness. One of the most common is the study of non-human minds that differ profoundly from our own - animal minds and electronic minds, broody hens and AI programs. Linguists were awakened to the existence of alternative grammars by the creation of computer "languages", which are not variants of UG. These languages "made the natural seem strange", inspiring linguists to generate even stranger grammars. To do this, they had to escape the confines of their intuitions, which they did through the use of mathematical logic and the theory of computation. In William James's terms, they debauched their minds with learning. The study of animal behavior is another time-honored method for debauching the mind - the one used by William James himself. Hermaphroditic worms, colonies of ant sisters who come in three "genders" (sterile workers, soldiers, queens), male langur monkeys who commit systematic infanticide when they join a troop, flies who are attracted to the smell of dung, polyandrous jacanas who mate with a male after breaking the eggs he was incubating for a rival female, fish who change sex when the composition of their social group changes, female praying mantises who eat their mate's head while copulating with him - other animals engage in behaviors that truly are exotic by human standards. Human cultural variation is trivial in comparison. Observing behaviors caused by alternative instincts jars us into recognizing the specificity and multiplicity of our own instincts. Observations like these tell us what we are not, but not what we are. That's why theoretical biology is so important. It provides positive theories of what kinds of cognitive programs we should expect to find in species that evolved under various ecological conditions: theories of what and why. Evolutionary biology's formal theories are powerful lenses that correct for instinct blindness. 
In their focus, the intricate outlines of the mind's design stand out in sharp relief.
References

Atran, S. (1990). The cognitive foundations of natural history. New York: Cambridge University Press. Axelrod, R. (1984). The evolution of cooperation. New York: Basic Books. Axelrod, R., & Hamilton, W.D. (1981). The evolution of cooperation. Science, 211, 1390-1396. Barkow, J., Cosmides, L., & Tooby, J. (Eds.) (1992). The adapted mind: Evolutionary psychology and the generation of culture. New York: Oxford University Press. Baron-Cohen, S., Leslie, A., & Frith, U. (1985). Does the autistic child have a "theory of mind"? Cognition, 21, 37-46. Bloch, M. (1977). The past and the present in the present. Man, 12, 278-292. Boyd, R. (1988). Is the repeated prisoner's dilemma a good model of reciprocal altruism? Ethology and Sociobiology, 9, 211-222. Boyer, P. (1994). The naturalness of religious ideas. Berkeley: University of California Press. Brothers, L. (1990). The social brain: A project for integrating primate behavior and neurophysiology in a new domain. Concepts in Neuroscience, 1, 27-51. Brown, A. (1990). Domain-specific principles affect learning and transfer in children. Cognitive Science, 14, 107-133. Brown, D.E. (1991). Human universals. New York: McGraw-Hill. Carey, S. (1985). Constraints on semantic development. In J. Mehler & R. Fox (Eds.), Neonate cognition (pp. 381-398). Hillsdale, NJ: Erlbaum. Carey, S., & Gelman, R. (Eds.) (1991). The epigenesis of mind. Hillsdale, NJ: Erlbaum. Cheney, D.L., & Seyfarth, R. (1990). How monkeys see the world. Chicago: University of Chicago Press. Cheng, P., Holyoak, K., Nisbett, R., & Oliver, L. (1986). Pragmatic versus syntactic approaches to training deductive reasoning. Cognitive Psychology, 18, 293-328. Chomsky, N. (1980). Rules and representations. New York: Columbia University Press. Clutton-Brock, T.H., & Harvey, P. (1979). Comparison and adaptation. Proceedings of the Royal Society, London B, 205, 547-565. Cook, E.W., III, Hodes, R.L., & Lang, P.J. (1986). 
Preparedness and phobia: Effects of stimulus content on human visceral conditioning. Journal of Abnormal Psychology, 95, 195-207. Cook, M., & Mineka, S. (1989). Observational conditioning of fear to fear-relevant versus fear-irrelevant stimuli in rhesus monkeys. Journal of Abnormal Psychology, 98, 448-459. Cosmides, L. (1985). Deduction or Darwinian algorithms? An explanation of the "elusive" content effect on the Wason selection task. Doctoral dissertation, Department of Psychology, Harvard University. University Microfilms #86-02206. Cosmides, L. (1989). The logic of social exchange: Has natural selection shaped how humans reason? Studies with the Wason selection task. Cognition, 31, 187-276. Cosmides, L., & Tooby, J. (1987). From evolution to behavior: Evolutionary psychology as the missing link. In J. Dupre (Ed.), The latest on the best: Essays on evolution and optimality. Cambridge, MA: MIT Press. Cosmides, L., & Tooby, J. (1989). Evolutionary psychology and the generation of culture, Part II. Case study: A computational theory of social exchange. Ethology and Sociobiology, 10, 51-97. Cosmides, L., & Tooby, J. (1992). Cognitive adaptations for social exchange. In J. Barkow, L. Cosmides, & J. Tooby (Eds.), The adapted mind: Evolutionary psychology and the generation of culture. New York: Oxford University Press. Cosmides, L., & Tooby, J. (1994). Origins of domain specificity: The evolution of functional organization. In S. Gelman & L. Hirschfeld (Eds.), Mapping the mind: Domain specificity in cognition and culture. New York: Cambridge University Press. Cosmides, L., & Tooby, J. (in press). Are humans good intuitive statisticians after all? Rethinking some conclusions of the literature on judgment under uncertainty. Cognition. Daly, M., & Wilson, M. (1983). Sex, evolution and behavior. Boston: Wadsworth. Daly, M., & Wilson, M. (1988). Homicide. Hawthorne, NY: Aldine de Gruyter. Daly, M., & Wilson, M. (1994). 
Discriminative parental solicitude and the relevance of evolutionary
models to the analysis of motivational systems. In M. Gazzaniga (Ed.), The cognitive neurosciences. Cambridge, MA: MIT Press. Dawkins, R. (1982). The extended phenotype. Oxford: W.H. Freeman. Dawkins, R. (1986). The blind watchmaker. New York: Norton. de Waal, F.B.M., & Luttrell, L.M. (1988). Mechanisms of social reciprocity in three primate species: Symmetrical relationship characteristics or cognition? Ethology and Sociobiology, 9, 101-118. Fischer, E.A. (1988). Simultaneous hermaphroditism, tit-for-tat, and the evolutionary stability of social systems. Ethology and Sociobiology, 9, 119-136. Fiske, A.P. (1991). Structures of social life: The four elementary forms of human relations. New York: Free Press. Fodor, J.A. (1983). The modularity of mind. Cambridge, MA: MIT Press. Frith, U. (1989). Autism: Explaining the enigma. Oxford: Blackwell. Gallistel, C.R. (1990). The organization of learning. Cambridge, MA: MIT Press. Gallistel, C.R., Brown, A.L., Carey, S., Gelman, R., & Keil, F.C. (1991). Lessons from animal learning for the study of cognitive development. In S. Carey & R. Gelman (Eds.), The epigenesis of mind. Hillsdale, NJ: Erlbaum. Garcia, J. (1990). Learning without memory. Journal of Cognitive Neuroscience, 2, 287-305. Gelman, S., & Hirschfeld, L. (Eds.) (1994). Mapping the mind: Domain specificity in cognition and culture. New York: Cambridge University Press. Gigerenzer, G., Hoffrage, U., & Kleinbolting, H. (1991). Probabilistic mental models: A Brunswikean theory of confidence. Psychological Review, 98, 506-528. Gigerenzer, G., & Hug, K. (1992). Domain-specific reasoning: Social contracts, cheating and perspective change. Cognition, 43, 127-171. Gould, J.L. (1982). Ethology: The mechanisms and evolution of behavior. New York: Norton. Gould, S.J., & Lewontin, R.C. (1979). The spandrels of San Marco and the Panglossian paradigm: A critique of the adaptationist programme. Proceedings of the Royal Society, London B, 205, 581-598. Jackendoff, R. (1992). 
Languages of the mind. Cambridge, MA: MIT Press. James, W. (1890). Principles of psychology. New York: Henry Holt. Keil, F.C. (1989). Concepts, kinds, and cognitive development. Cambridge, MA: MIT Press. Krebs, J.R., & Davies, N.B. (1987). An introduction to behavioural ecology. Oxford: Blackwell Scientific Publications. Leslie, A.M. (1987). Pretense and representation: The origins of "theory of mind". Psychological Review, 94, 412-426. Leslie, A.M. (1988). The necessity of illusion: Perception and thought in infancy. In L. Weiskrantz (Ed.), Thought without language (pp. 185-210). Oxford: Clarendon Press. Manktelow, K.I., & Evans, J. St.B.T. (1979). Facilitation of reasoning by realism: Effect or non-effect? British Journal of Psychology, 70, 477-488. Manktelow, K.I., & Over, D.E. (1990). Deontic thought and the selection task. In K.J. Gilhooly, M.T.G. Keane, R.H. Logie, & G. Erdos (Eds.), Lines of thinking (Vol. 1). Chichester: Wiley. Markman, E.M. (1989). Categorization and naming in children: Problems of induction. Cambridge, MA: MIT Press. Markman, E. (1990). Constraints children place on word meanings. Cognitive Science, 14, 57-77. Marler, P. (1991). The instinct to learn. In S. Carey & R. Gelman (Eds.), The epigenesis of mind. Hillsdale, NJ: Erlbaum. Marr, D. (1982). Vision: A computational investigation into the human representation and processing of visual information. San Francisco: Freeman. Mayr, E. (1983). How to carry out the adaptationist program? The American Naturalist, 121, 324-334. McDougall, W. (1908/1916). Introduction to social psychology. Boston: John W. Luce. Mineka, S., & Cook, M. (1988). Social learning and the acquisition of snake fear in monkeys. In T.R. Zentall & B.G. Galef (Eds.), Social learning: Psychological and biological perspectives, (pp. 51-73). Hillsdale, NJ: Erlbaum. Ohman, A., Dimberg, U., & Ost, L.G. (1985). Biological constraints on the fear response. In S. Reiss & R. 
Bootzin (Eds.), Theoretical issues in behavior therapy (pp. 123-175). New York: Academic Press.
Ohman, A., Eriksson, A., & Olofsson, A. (1975). One-trial learning and superior resistance to extinction of autonomic responses conditioned to potentially phobic stimuli. Journal of Comparative and Physiological Psychology, 88, 619-627. Pinker, S. (1994). The language instinct. New York: Morrow. Pinker, S., & Bloom, P. (1990). Natural language and natural selection. Behavioral and Brain Sciences, 13, 707-727. Profet, M. (1992). Pregnancy sickness as adaptation: A deterrent to maternal ingestion of teratogens. In J. Barkow, L. Cosmides, & J. Tooby (Eds.), The adapted mind: Evolutionary psychology and the generation of culture. New York: Oxford University Press. Pylyshyn, Z.W. (Ed.) (1987). The robot's dilemma: The frame problem in artificial intelligence. Norwood, NJ: Ablex. Real, L.A. (1991). Animal choice behavior and the evolution of cognitive architecture. Science, 253, 980-986. Rozin, P. (1976). The evolution of intelligence and access to the cognitive unconscious. In J.M. Sprague & A.N. Epstein (Eds.), Progress in psychobiology and physiological psychology. New York: Academic Press. Shepard, R.N. (1981). Psychophysical complementarity. In M. Kubovy & J. Pomerantz (Eds.), Perceptual organization. Hillsdale, NJ: Erlbaum. Shepard, R.N. (1987). Evolution of a mesh between principles of the mind and regularities of the world. In J. Dupre (Ed.), The latest on the best: Essays on evolution and optimality. Cambridge, MA: MIT Press. Sherry, D.F., & Schacter, D.L. (1987). The evolution of multiple memory systems. Psychological Review, 94, 439-454. Smuts, B. (1986). Sex and friendship in baboons. Hawthorne: Aldine. Spelke, E.S. (1988). The origins of physical knowledge. In L. Weiskrantz (Ed.), Thought without language (pp. 168-184). Oxford: Clarendon Press. Spelke, E. (1990). Principles of object perception. Cognitive Science, 14, 29-56. Sperber, D. (1975) Rethinking symbolism, transl. Alice Morton. Cambridge, UK: Cambridge University Press. Sperber, D. (1982). 
On anthropological knowledge. Cambridge, UK: Cambridge University Press. Sperber, D. (1985). Anthropology and psychology: Towards an epidemiology of representations. Man (N.S.), 20, 73-89. Sperber, D. (1990). The epidemiology of beliefs. In C. Fraser & G. Geskell (Eds.), Psychological studies of widespread beliefs. Oxford: Clarendon Press. Sperber, D. (1994). The modularity of thought and the epidemiology of representations. In S. Gelman & L. Hirschfeld, (Eds.), Mapping the mind: Domain specificity in cognition and culture. New York: Cambridge University Press. Symons, D. (1979). The evolution of human sexuality. New York: Oxford University Press. Symons, D. (1992). On the use and misuse of Darwinism in the study of human behavior. In J. Barkow, L. Cosmides, & J. Tooby, (Eds.), The adapted mind: Evolutionary psychology and the generation of culture. New York: Oxford University Press. Tooby, J., & Cosmides, L. (1989). The logic of threat: Evidence for another cognitive adaptation? Paper presented at the Human Behavior and Evolution Society, Evanston, IL. Tooby, J., & Cosmides, L. (1990a). The past explains the present: Emotional adaptations and the structure of ancestral environments. Ethology and Sociobiology, 11, 375-424. Tooby, J., & Cosmides, L. (1990b). On the universality of human nature and the uniqueness of the individual: The role of genetics and adaptation. Journal of Personality, 58, 17-67. Tooby, J., & Cosmides, L. (1992). The psychological foundations of culture. In J. Barkow, L. Cosmides, & J. Tooby (Eds.), The adapted mind: Evolutionary psychology and the generation of culture. New York: Oxford University Press. Tooby J., & DeVore, I. (1987). The reconstruction of hominid behavioral evolution through strategic modeling. In W. Kinzey (Ed.), Primate models of hominid behavior. New York: SUNY Press. Trivers, R.L. (1971). The evolution of reciprocal altruism. Quarterly Review of Biology, 46, 35-57. Wason, P. (1966). Reasoning. In B.M. 
Foss (Ed.), New horizons in psychology. Harmondsworth: Penguin.
Beyond intuition and instinct blindness
105
Wason, P. (1983). Realism and rationality in the selection task. In J. St.B.T. Evans (Ed.), Thinking and reasoning: Psychological approaches. London: Routledge & Kegan Paul. Wason, P., & Johnson-Laird, P.N. (1972). Psychology of reasoning: Structure and content. London: Batsford. Wilkinson, G.S. (1988). Reciprocal altruism in bats and other mammals. Ethology and Sociobiology, 9, 85—100. Wilkinson, G.S. (1990). Food sharing in vampire bats. Scientific American, February, 76-82. Williams, G.C. (1966). Adaptation and natural selection: A critique of some current evolutionary thought. Princeton: Princeton University Press. Williams, G.C. (1985). A defense of reductionism in evolutionary biology. Oxford surveys in evolutionary biology, 2, 1-27. Williams, G.C, & Nesse, R.M. (1991). The dawn of Darwinian medicine. Quarterly Review of Biology, 66, 1-22. Wilson, M., & Daly, M. (1992). The man who mistook his wife for a chattel. In J. Barkow, L. Cosmides, & J. Tooby (Eds.), The adapted mind: Evolutionary psychology and the generation of culture. New York: Oxford University Press. Wynn, K. (1992). Addition and subtraction by human infants. Nature, 358, 749-750.
6 Why should we abandon the mental logic hypothesis? Luca Bonatti* Laboratoire de Sciences Cognitives et Psycholinguistique, 54, Boulevard Raspail, 75006 Paris, France Philosophy Department, Rutgers University, PO Box 270, New Brunswick, NJ 08903-0270, USA
Abstract

Two hypotheses about deductive reasoning are under development: mental logic and mental models. It is often held that there are overwhelming arguments for rejecting the mental logic hypothesis. I review these arguments and claim that they are either not conclusive or point to problems which are troublesome for the mental models hypothesis as well.
1. Introduction

An old and venerable idea holds that logic is concerned with discovering or illuminating the laws of thought. Its psychological corollary is that a system of logic in the mind underlies our thinking processes. This thesis fits very well with representational views of the mind according to which cognitive processes are largely proof-theoretical. Within such a framework, it is a thesis about the structure of the vehicle of internal representations. In a nutshell, it holds that reasoning consists of operations on mental representations, according to logical rules implemented in procedures activated by the forms of those representations. Even though the thesis has been around for centuries, there is still little convincing psychological evidence for the existence of a mental logic. Such evidence has mostly accumulated in the last few years, and almost exclusively concerns propositional reasoning (Braine, Reiser & Rumain, 1984; Lea, O'Brien, Fisch, Noveck & Braine, 1990; Rips, 1983).

* Correspondence to: L. Bonatti, Laboratoire de Sciences Cognitives et Psycholinguistique, 54, Boulevard Raspail, 75006 Paris, France. The author is indebted to Martin Braine, Emmanuel Dupoux, Jerry Fodor, Jacques Mehler, and Christophe Pallier for comments on a first draft of this paper.
In the same years in which these results were beginning to appear, mental logic was seriously challenged by an alternative - mental models - due mostly to the work of Johnson-Laird and his collaborators. Both hypotheses share the basic geography of cognition: the mental models hypothesis, too, is (inter alia) a hypothesis about the nature of the internal representations involved in deductive processes. The two hypotheses differ, however, over the supposed nature of those representations. Roughly, the mental models hypothesis claims that understanding a text consists in the manipulation of tokens representing concrete samples of entities in the world, and that reasoning consists in the construction of alternative arrangements of those tokens. No abstract rules should be needed to accomplish deduction. Thus, at least at first blush, while mental logic seems naturally to require a language of thought to whose formulas abstract rules apply, mental models seem able to dispense with it and to substitute analog simulations for the discrete manipulation of proposition-like objects (McGinn, 1989). Originally, crucial aspects of the new hypothesis were left vague, and both its exact status and the feasibility of its claims were a puzzle (Boolos, 1984; Rips, 1986). What precisely a mental model is seemed a question of secondary importance compared to the big revolution introduced by the theory. Only recently has a substantial effort of formal clarification been undertaken (especially in Johnson-Laird & Byrne, 1991, and Johnson-Laird, Byrne, & Schaeken, 1992), but the task is still far from accomplished (Bonatti, in press; Hodges, 1993). Nevertheless, the hypothesis has had enormous success, to the point that the words "mental models" are probably second only to "generative grammar" in their consequences within the cognitive science community.
In a very short time, an almost unanimous consensus has been reached among psychologists on the death of mental logic and on the claim that reasoning is carried out by constructing mental models; nowadays, the group of psychologists who doubt the truth of the mental model theory is on the verge of extinction. A good part of this sweeping success, vagueness notwithstanding, is due to the impressive list of problems the new hypothesis promised to solve. Let me list them. Mental models would: (1) provide a general theory of deductive reasoning (Johnson-Laird, 1983a; Johnson-Laird & Bara, 1984, p. 3; Johnson-Laird & Byrne, 1991, p. x), and, in particular, (1a) explain propositional reasoning (Johnson-Laird & Byrne, 1991, 1993; Johnson-Laird, Byrne, & Schaeken, 1992); (1b) explain relational reasoning (Johnson-Laird, 1983b; Johnson-Laird & Byrne, 1989, 1991, 1993); (1c) explain the figural effect in reasoning (Johnson-Laird & Bara, 1984; Johnson-Laird & Byrne, 1991, Ch. 6); (1d) explain syllogistic reasoning (Johnson-Laird, 1983a, Ch. 5; Johnson-Laird & Bara, 1984; Johnson-Laird & Byrne, 1991), including individual differences
(Johnson-Laird, 1983a, pp. 117-121) and the belief bias effect (Johnson-Laird & Byrne, 1991, pp. 125-126; Oakhill, Johnson-Laird, & Garnham, 1989); (1e) explain reasoning with single and multiple quantifiers (Johnson-Laird, 1983a; Johnson-Laird & Byrne, 1991; Johnson-Laird, Byrne, & Tabossi, 1989); (2) explain how logical reasoning is performed without logic (Byrne, 1991; Johnson-Laird, 1983a, Ch. 6, 1983b; Johnson-Laird & Byrne, 1991); (3) account for a vast series of linguistic phenomena, such as anaphora, definite and indefinite descriptions, pronouns, and plausibility effects in language processing (Johnson-Laird, 1983a; Garnham, 1987); (4) offer a theory of the structure of discourse (Johnson-Laird, 1983a, pp. 370-371; Garnham, 1987); (5) explain the difference between implicit and explicit inferences (Johnson-Laird, 1983a, Ch. 6); (6) "solve the central paradox of how children learn to reason" (Johnson-Laird, 1983a, p. 45); (7) explain content effects in reasoning (Byrne, 1991, p. 77); (8) offer an explanation of meaning (Johnson-Laird, 1983a, p. 397; McGinn, 1989); (9) "readily cope with the semantics of propositional attitudes" (Johnson-Laird, 1983a, p. 430) and solve the problems presented by them (Johnson-Laird, 1983a, pp. 430-436); (10) provide a solution to the controversy over human rationality (Johnson-Laird & Byrne, 1993, p. 332); (11) solve the problem of how words relate to the world (Johnson-Laird, 1983a, p. 402, 1989, pp. 473-474, 489; Garnham, 1987; McGinn, 1989); (12) elucidate the nature of self-awareness and consciousness (Johnson-Laird, 1983a, p. xi, Ch. 16). Even the most benevolent reader, when confronted with a theory so rich in both philosophical consequences and empirical power, should at least have felt inclined to raise her critical eyebrows. Nevertheless, critical voices were confined to a "small chorus of dissenters", almost all tied to the "ardent advocates of rule theories" (Johnson-Laird & Byrne, 1991, p. ix).
In fact, with some patience and time, I think it can be shown that all the philosophical advantages claimed for mental models are unsupported propaganda, and that most of the psychological evidence is much less firm than generally admitted. But showing it is quite a long task. Another source of support for the mental model hypothesis came from a parallel series of arguments for the conclusion that the mental logic hypothesis is doomed to failure. In this paper, I will confine myself to a modest task. I will
plainly go through the list of this second class of arguments and show that either they are not conclusive or, when reasonable, they point to problems which are troublesome for the mental model theory as well. The arguments follow in no particular order of importance.
2. Mental logic doesn't have the machinery to deal with meaning and cannot explain the role of content and context in understanding and reasoning

This is one of the major complaints against a mental logic. How could a formal theory enlighten us on such a clearly content-driven process as reasoning? In fact, as mental logic theorists recognize, one should distinguish two separate processes involved in problem solving. The first is comprehension; the second is reasoning proper. Accordingly, in mental logic theories a comprehension mechanism sensitive to pragmatic information drives input analysis (Braine et al., 1984; Braine & O'Brien, 1991; O'Brien, 1993). Though the comprehension principles guiding it are only sketched, there is a hypothesis about their role in the time course of reasoning. After a first processing stage roughly delivering a syntactic analysis of the linguistic signal, the identification of its logical form and a first semantic analysis retrieving literal meaning, pragmatics and general knowledge help select a particular logical form for the input signal. Afterwards, representations possibly sharply different from the first semantic analysis are passed on to a processor blind to content and pragmatics. The general picture suggested, with some integration, looks like the diagram in Fig. 1.

[Figure 1. The place of pragmatics, comprehension mechanisms and reasoning proper in the mental logic hypothesis. (Reasoning components appear in double squares.) A comprehension mechanism - in which a first semantic analysis is elaborated and then, with the aid of pragmatic information and world knowledge, the logical form of the input sentence is selected - feeds a reasoning theory comprising rules (basic procedures plus heuristic strategies) and fallback strategies.]

So a theory of mental logic cannot, and does not intend to, explain the role of content in reasoning, though it may help to locate how and when content and pragmatics interact with reasoning proper. From this point of view, the complaint is correct. However, models are no improvement; the thesis that "in contrast [to mental logic], the model theory has the machinery to deal with meaning" (Byrne, 1991, p. 77) is false. Models are supposed to be constructed either directly from perception, or indirectly from language. In the first case, no detailed account of how perception should generate models has been given.[1] For linguistic models, a sketch of the construction procedures exists. According to it, models are constructed from propositional representations via a set of procedures sometimes called procedural semantics.

[1] Sometimes it looks as if perceptual models in Marr's sense are considered to be equivalent to mental models in Johnson-Laird's sense (see Johnson-Laird, 1983a; Johnson-Laird et al., 1992, p. 421), but there are structural differences between the two constructs which make the identification difficult to accept. To mention the most apparent one: perceptual models don't contain negation, but mental models do. For this reason, each perceptual model corresponds to an infinite number of mental models. A perceptual model of John scratching his head is a mental model of John scratching his head, but also of John not scratching his leg, of John not running the New York Marathon, of Mary being late for a date, and so on.

For example (Johnson-Laird & Byrne, 1991, p. 170 ff.), when given as input a sentence like

The circle is on the right of the triangle

a parser will start working, and after some crunching the following information will be placed on top of its stack:

(The-circle-is . . .) → Sentence ((1, 0, 0)(△)(○))

The output of the parser is a pair containing both the grammatical description of the input ("Sentence") and its semantic evaluation (in this case, an array containing numerical coordinates specifying the interpretation of the spatial relation, and the interpretations of the definite descriptions). Only at this point will procedural semantics take over and construct a model out of the propositional representation of the sentence; in this case, the model will be:

△   ○

that is, an image of a triangle to the left of the circle. Now, notice the following points. First, the procedures that construct models do not operate on natural language sentences proper, but on the logical forms of propositional representations. Thus procedural semantics presupposes logical forms. By the same token, procedural semantics presupposes the literal meaning of words and sentences, which it has to receive as input. As Johnson-Laird himself writes, "The reader should bear in mind that the present theory uses a procedural semantics to relate language, not to the world, but to mental models" (1983a, p. 248). Procedural semantics is essentially translation from mental representations to mental representations, not a function from mental representations to the world. But then, if procedural semantics does not explain literal meaning and logical forms, neither do mental models. Second, procedural semantics can work only if the output of the parser is not ambiguous: for example, scope relations must already be straightened out. The sentence

(1) Every man loves a woman

must be parsed to yield either

(2) For all men x there is some woman y such that (x loves y)

or
(3) For some woman y, all men x are such that (x loves y)

Only on the basis of one of them can procedural semantics yield a mental model. Thus the input to procedural semantics must be clear. Third, the possibility of constructing the appropriate models of a text strictly depends on the expressive power of the logical forms on which procedural semantics operates. To continue with the previous example, there are interpretations of (1) which correspond to neither (2) nor (3), involving a generic reading of the indefinite description. While it is not clear that a mental model can express the difference between a generic woman and a specific woman, this much is clear: if the logical form is not rich enough to articulate such a distinction, then mental models cannot represent it either, since they are constructed from expressible logical forms. Thus the input to procedural semantics must be rich. Fourth, while the programs implementing the mental model theory described in Johnson-Laird (1983a) and Johnson-Laird et al. (1992) assume that the syntactic analysis of the input sentence plus word meaning suffices to determine its propositional content and logical form, in a more natural setting the propositional content needed to construct the relevant mental models cannot be the first semantic analysis of the input; it must be the propositional content and logical form of the message the input conveys. Now, for standard Gricean reasons the message conveyed by "Luca is a nice guy" in an exchange like

Q: Is Luca a good philosopher?
A: Well, let's say that Luca is a nice guy

has something to do with my ability as a philosopher, and not with how much people like me. So if we take seriously the proposal that mental models are the kind of structure we build when comprehending a text, it is this contextual message that they must retain.
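Two of these constraints, that the procedures operate on already-disambiguated logical forms and that scope must be settled before model construction, can be caricatured in a few lines of code. Everything below is invented for illustration: the parser-output format, the token representation, and the two-man, two-woman domain are hypothetical stand-ins, not Johnson-Laird and Byrne's actual programs.

```python
from itertools import product

# --- A toy "procedural semantics" (hypothetical format, for illustration only).
# The relation vector (1, 0, 0) stands in for the parser output sketched in
# the text: "first argument is to the right of the second" on the x-axis.

def build_model(logical_form):
    vector, a, b = logical_form
    if vector == (1, 0, 0):        # a is to the right of b
        return (b, a)              # tokens arranged left-to-right
    if vector == (-1, 0, 0):       # a is to the left of b
        return (a, b)
    raise ValueError("unknown relation")

# "The circle is on the right of the triangle", already parsed and disambiguated:
print(build_model(((1, 0, 0), "circle", "triangle")))   # ('triangle', 'circle')

# --- Scope must be settled first: readings (2) and (3) of sentence (1)
# are satisfied by different situations over a tiny invented domain.

men, women = ["m1", "m2"], ["w1", "w2"]

def reading2(loves):   # (2): for every man x there is some woman y: x loves y
    return all(any((m, w) in loves for w in women) for m in men)

def reading3(loves):   # (3): some woman y is such that every man loves y
    return any(all((m, w) in loves for m in men) for w in women)

loves = {("m1", "w1"), ("m2", "w2")}      # each man loves a different woman
print(reading2(loves), reading3(loves))   # True False
```

The point of the sketch is that `build_model` consumes a logical form rather than a sentence, and that readings (2) and (3) pick out different sets of situations, so no model can be built until one of them has been selected.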
A similar point can be made for metaphors, analogies, and all the cases in which the hearer/reader gathers information from an utterance aided by her general world knowledge, her understanding of relevance in communication, and other pragmatic factors. Now, since procedural semantics is proposed as a set of procedures extracting models from propositional representations, clearly the propositional representations on which it has to act in order to build the right mental models are not the results of a first semantic analysis of input sentences retrieving their literal meaning, but the analysis of their message in context, which, therefore, has to be retrieved before models are constructed. Procedural semantics works once all the disambiguations due to context, scope phenomena and retrieval of the speaker's intentions have taken place. To sum up, the input to procedural semantics presupposes both the literal meaning of the text and its logical form, and must be rich, clear, free from
[Figure 2. The place of pragmatics and comprehension mechanisms in the mental model hypothesis. Input passes through parsing and a first and second semantic analysis - after a first semantic analysis is elaborated, pragmatic information and world knowledge help select the logical form of the input sentence - and only then through procedural semantics, which delivers mental models.]
structural ambiguities, and post-pragmatic. Thus when we begin to fill in the details, we come up with a sophisticated input analysis and we get the overall picture presented in Fig. 2, which, as far as the role of pragmatics and meaning is concerned, does not differ from the mental logic picture: just like mental logic, procedural semantics and mental models presuppose, and do not explain, a theory of how pragmatics affects the selection of the correct message that a set of utterances carries in the relevant situation. It could be objected that I am presenting a misleading picture, based on the algorithms implementing a small fraction of the mental model theory rather than on the theory itself. Algorithms are only a part of the story; with time, the rest will come. So Johnson-Laird et al. (1992) write:

The process of constructing models of the premises is, in theory, informed by any relevant general knowledge, but we have not implemented this assumption. (p. 425)
But such an "assumption" amounts to the solution of the frame problem, and the suspicion that it won't be implemented is more than warranted (Fodor, 1983). In any case, even if the problem were solvable, it would still be the case that the retrieval of the relevant message would occur in the pre-modelic construction processes
selecting the right logical forms and propositional contents which are input to procedural semantics. In fact, there is a litmus test for sensitivity to content. The natural understanding of entailment seems to require a connection in content between antecedent and consequent. But the paradoxes of material implication allow arbitrary false antecedents to imply arbitrary consequents, regardless of their contents and even of their truth values. So if a theory of reasoning licenses them, it surely can't be advertised as the model to imitate for sensitivity to content. Now, while in Braine and O'Brien's (1991) logical theory of implication the paradoxes are not available as theorems, mental models allow one to derive them as valid inferences (Johnson-Laird & Byrne, 1991). The reason is pretty clear. The mental model theory of connectives mainly consists of a variation on truth tables, and truth tables are sensitive only to truth values, not to content connections or relevance. Thus, their name notwithstanding, models have no advantage over mental logic in explaining the role of content in reasoning, in any of the relevant senses of "content". They cannot explain literal meaning, nor meaning in situation, nor how pragmatics and general knowledge affect interpretation, and they don't seem to have the adequate structure to do it.
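The content-blindness of a truth-functional treatment can be checked mechanically. The sketch below is a plain truth-table enumeration, not a reconstruction of either theory's machinery; it confirms that once "if" is read as material implication, the paradoxical inferences go through under every assignment of truth values, whatever the sentences are about.

```python
from itertools import product

# Material implication, as in a truth table: "p -> q" is false only
# when p is true and q is false.
def implies(p, q):
    return (not p) or q

# First paradox: a false antecedent makes the conditional true whatever
# the consequent says, with no connection in content required.
for p, q in product([True, False], repeat=2):
    if not p:
        assert implies(p, q)        # not-p guarantees p -> q

# Second paradox: a true consequent is implied by anything at all.
for p, q in product([True, False], repeat=2):
    if q:
        assert implies(p, q)        # q guarantees p -> q

print("both paradoxes of material implication hold truth-functionally")
```

Any account of the conditional built solely on such tables inherits these inferences, which is exactly the point made above.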
3. There is no mental logic because people make fallacious inferences

People often reach conclusions which, if judged according to the canons of standard logic, are fallacious. And this is supposed to be a problem for a mental logic:

The most glaring problem is that people make mistakes. They draw invalid conclusions, which should not occur if deduction is guided by a mental logic. (Johnson-Laird, 1983a, p. 25)
In less sophisticated versions, the argument notes that undergraduates make mistakes and, worst of all, show repeated resistance to the teacher's efforts to correct them (Bechtel & Abrahamsen, 1991, p. 168 ff.), or that they make more mistakes than the logical competence that mental logic attributes to people would predict (Churchland, 1990, p. 283). In fact, mistakes come in different classes. They may be due to cognitive components that do not engage reasoning proper, such as the comprehension stage or strategies of response selection; to performance failures; or to faulty competence. Errors due to pre-deductive comprehension mechanisms, or to post-deductive response-selection strategies, can be accommodated by the two hypotheses in roughly the same way: the existence of such errors doesn't count against mental logic any more than it counts against mental models. Performance mistakes are explained away by mental models by indicating how models are built and handled by mechanisms not proprietary to reasoning - mostly, mechanisms of working
memory storage and retrieval. A system based on mental logic can account for them in the same way. Errors of competence - errors, as it were, directly generated by how the reasoning box is built - are a more delicate matter. The question is to decide with respect to which point of reference they are errors. Does failure to apply excluded middle count as an error? Does the absence of reasoning schemata corresponding to material implication count? Classical logic - or, for that matter, any alternative logic - cannot be a favored point of reference without further justification. One major task of a psychological theory of deductive reasoning is to characterize what people take the right implications of certain premises to be, under ideal conditions. What could count as a systematic error in this context? Previous assumptions about the nature of rationality must be exploited. It can be argued, for example, that it is rational to proceed from truths to truths. On this basis, invalid reasoning processes could count as mistakes. If it could be shown that under ideal conditions people respond erratically to identical problems, or embody a rule which brings about a systematic loss of truths, then it may be said that subjects make mistakes in point of competence, regardless of the compliance of natural logical consequence with classical, or other, logics. But if this were the case, mental models would be in a worse position than mental logic. It is possible (though not desirable) to account for systematic errors within a mental logic framework by indicating which rules (if any) induce systematic violations of the selected normative model.
As of today, the algorithms proposed to implement logical reasoning by models are either psychologically useless or ill defined (Bonatti, in press; O'Brien, Braine & Yang, in press), so it is difficult to give a definite judgement on this issue; but the tentative set of rules proposed for model construction is meant to be truth preserving in principle. Thus it is puzzling how models might account for purported systematic violations: errors in point of competence would be an even deeper mystery for the mental model hypothesis.
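The benchmark of truth preservation invoked above is itself mechanically checkable for small propositional rule candidates. The helper below is an invented illustration, not part of either theory: on this criterion, a rule loses truth if some assignment makes its premises true and its conclusion false.

```python
from itertools import product

def truth_preserving(premises, conclusion, n_vars=2):
    """True iff no assignment makes every premise true and the conclusion false."""
    return all(conclusion(*v)
               for v in product([True, False], repeat=n_vars)
               if all(p(*v) for p in premises))

implies = lambda p, q: (not p) or q   # material implication

# Modus ponens (p -> q, p, therefore q) never loses truth:
print(truth_preserving([lambda p, q: implies(p, q), lambda p, q: p],
                       lambda p, q: q))        # True

# Affirming the consequent (p -> q, q, therefore p) does:
print(truth_preserving([lambda p, q: implies(p, q), lambda p, q: q],
                       lambda p, q: p))        # False
```

On the criterion sketched here, a reasoner who systematically applied the second rule would count as making an error of competence, regardless of which textbook logic is taken as the standard.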
4. There is no mental logic because higher-order quantifiers are not representable in first-order logic, and yet we reason with them

This argument has been considered "the final and decisive blow" to the doctrine of mental logic (Johnson-Laird, 1983a, p. 141). According to Barwise and Cooper (1981), expressions such as "More than half of" or "Most" denote sets of sets, and therefore an adequate logic for natural language needs to extend beyond first order. The argument from this proposal to the rejection of mental logic runs as follows:

[Higher-order calculus] is not complete. If there can be no formal logic that captures all the valid
deductions, then a fortiori there can be no mental logic that does either. It is a remarkable fact that natural language contains terms with an implicit "logic" that is so powerful that it cannot be completely encompassed by formal rules of inference. It follows, of course, that any theory that assumes that the logical properties of expressions derive directly from a mental logic cannot give an adequate account of those that call for a higher-order predicate calculus. This failure is a final and decisive blow to the doctrine of mental logic. (Johnson-Laird, 1983a, pp. 140-141, italics mine)
The argument has often been repeated (see, for example, Johnson-Laird & Bara, 1984, p. 6; Johnson-Laird & Byrne, 1990, p. 81; Johnson-Laird & Byrne, 1991, p. 15), so a certain importance must be attached to it. The question is to figure out why. The nature of the representational device in which mental processes are carried out is an empirical question, and if patterns of inference are required that can be better formalized in second-order logic, so be it. So what can possibly be wrong with using higher-order logic? We are told it is not complete. Such an objection makes sense only if one presupposes that a mental calculus must be complete. But an argument is needed to impose completeness as a constraint on a mental logic, and it is difficult to see what it would look like. We may impose constraints on a logical system by requiring that it possess certain logical properties, such as consistency or completeness, because we can decide what we want from it. But finding out how people reason is an empirical enterprise. It would be a very interesting empirical discovery to find out that, say, a subject's system for propositional reasoning is complete, but it is not enough that we want it to be so. Even more basic logical properties cannot be granted a priori. It would be desirable that subjects reason consistently, and everybody hopes to discover that under ideal conditions they do, but, again, to presuppose that our reasoning system is consistent requires an argument. Barring such arguments, the "final and decisive blow against mental logic" blows up. In fact, it may backfire. Johnson-Laird et al. blame the incompleteness of a higher-order mental logic system as if the mental model counterproposal were complete. But the only fragment for which a psychological implementation has been proposed - propositional reasoning - is not even valid. Models have no advantage over mental logic on the issue of completeness.
Neither should they: such an advantage, in the absence of evidence that natural reasoning is complete, would be irrelevant.
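Barwise and Cooper's observation is easy to illustrate on finite domains. The evaluator below (an illustrative sketch; the example sets are invented) treats "most" as a relation between two sets, in their generalized-quantifier spirit. Writing such a finite-domain checker is trivial; the incompleteness argument turns on the further, classical fact that no first-order formula expresses "most" uniformly over all domains, and, as argued above, that fact cuts against mental logic only if completeness is independently required.

```python
def most(A, B):
    """Evaluate "Most A are B" on a finite domain: true iff more than
    half of A lies in B. "Most" here relates two sets (a set-of-sets
    analysis), which is what takes it beyond first-order quantifiers."""
    A, B = set(A), set(B)
    return len(A & B) * 2 > len(A)

# Invented example: two of the three linguists swim.
linguists = {"ann", "bo", "cy"}
swimmers = {"ann", "bo", "dot"}
print(most(linguists, swimmers))   # True
print(most(swimmers, linguists))   # True: 2 of 3 swimmers are linguists
```

Note that the quantifier is evaluated by comparing cardinalities of whole sets, not by binding individual variables one at a time, which is the intuitive reason it escapes first-order resources.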
5. There is no evolutionary explanation of the origin of mental logic Another alleged argument against mental logic concerns its origin. A bland version of it simply claims that there is no evolutionary explanation of mental logic, and this is enough to reject the theory (Cosmides, 1989). A richer version runs as follows. To accept that there is a mental logic seems to lead to the
admission that most of our reasoning abilities are innate. Nativism, in general, cannot be a problem: everybody has to live with it, and the only issue is whether you like it weaker or stronger. But there should be something specifically wrong with nativism about mental logic: there is no evolutionary explanation of its origin:

By default, it seems that our logical apparatus must be inborn, though there is no account of how it could have become innately determined. (Johnson-Laird, 1983a, p. 40)

The moral that Fodor drew is an extreme version of nativism - no concept is invented; all concepts are innate. Alas, any argument that purports to explain the origins of all intellectual abilities by postulating that they are innate merely replaces one problem by another. No one knows how deductive competence could have evolved according to the principles of neo-Darwinism. (Johnson-Laird, 1983a, pp. 142-143)

So intractable is the problem for formal rules that many theorists suppose that deductive ability is not learned at all. It is innate. Fodor (1980) has even argued that, in principle, logic could not be learned. The difficulty with this argument is not that it is wrong, although it may be, but that it is too strong. It is hard to construct a case against the learning of logic that is not also a case against its evolution. If it could not be acquired by trial-and-error and reinforcement, then how could it be acquired by neo-Darwinian mechanisms? (Johnson-Laird & Byrne, 1991, p. 204)
It is first worth noticing that the argument is meant to apply to cognition, and only to very restricted kinds of cognitive abilities. If you try to generalize it beyond this domain, it becomes flatly absurd. For the given premise is that Darwinian mechanisms are a sort of trial-and-error and reinforcement mechanism applied to the species. Its generalization says: for any x, if x cannot be acquired by trial-and-error and reinforcement, then how could x be acquired by a neo-Darwinian mechanism? Now take a non-cognitive phenomenon and substitute it for x: breathing cannot be acquired by trial-and-error and reinforcement, so how did the species acquire the ability to breathe? That doesn't work. And neither does it work for most innate cognitive abilities. Try colors, or perceptual primitives: the ability to recognize colors (or any perceptual primitive) cannot be acquired by trial-and-error and reinforcement, so how could it be acquired by neo-Darwinian mechanisms? This doesn't work either. So I assume that the argument is really targeted at mental logic. Second, even restricting its field of application, notice that there are at least three different questions one may raise. What is the logical syntax of mental processes? What logical system underlies our reasoning abilities? What concepts is the mind able to entertain, whether innately or by experience? The above argument does not keep them separate, yet they may have radically different answers. For example, an organism may be innately endowed with the syntax of first-order logic, but may keep changing its logical system (for simplicity, the set of its axioms) by flip-flopping an axiom, and at the same time may need to learn every concept from experience. Such an organism would have an innate logical syntax, but no innate logic or innate concepts. Or else, an organism may be endowed with an
Why should we abandon the mental logic hypothesis?
121
innate logical syntax and an innate logic, but may need experience to acquire contentful concepts. The arguments for or against nativism are quite different in the three cases. I will assume that the above argument is really targeted against nativism about a system of logic. Then it can be reconstructed in the following way. If there is a mental logic, an account is due of how it is acquired. Since there is no theory of its acquisition, it must be assumed that the logical system - not just its syntax - is innate. But, alas, this claim is unsupported because there is no evolutionary story about how such a system gets fixated. Thus the doctrine of mental logic has to be rejected. The short answer to such an argument (in both its bland and its rich forms) is: too bad for evolutionary explanations. The long answer requires a reflection on the state of evolutionary explanations of cognitive mechanisms. The argument presupposes that there must be an evolutionary explanation of how deductive abilities are fixated. What would it look like? For the much clearer case of language, evolutionary explanations are uninformative. Whether a mutation endowing humans with linguistic abilities concerns the structure of the organism or its functions; whether language was a direct mutation, or a byproduct of another mutation; under what metric it turned out to be advantageous: these are unanswered questions. This is a general problem concerning the application of evolutionary concepts to cognition. The quest for a Darwinian explanation of cognitive evolution is founded at best on an analogy with biological evolution, and analogies may be misleading. Lewontin specifically makes this point for problem solving:

... generalized problem solving and linguistic competence might seem obviously to give a selective advantage to their possessors. But there are several difficulties. First, ... human cognition may have developed as the purely epiphenomenal consequence of the major increase in brain size, which, in turn, may have been selected for quite other reasons. ... Second, even if it were true that selection operated directly on cognition, we have no way of measuring the actual reproductive advantages. ... Fourth, the claim that greater rationality and linguistic ability lead to greater offspring production is largely a modern prejudice, culture- and history-bound. ... The problem is that we do not know and never will. We should not confuse plausible stories with demonstrated truth. There is no end to plausible story telling. (Lewontin, 1990, pp. 244-245)
And there is no reason to ask for mental logic what does not exist, and might never exist, for other, better-known cognitive domains. But let us suppose that one should seriously worry about the lack of a Darwinian explanation of how innate logic has been selected. Again, here one should ask what kind of comparative advantage the mental model hypothesis gains. The argument seems to presuppose that, as opposed to the case of mental logic, either (a) the ability to build mental models is not innate but learned, and thus Darwinian worries don't arise, or (b) if it is not learned, there is an evolutionary explanation of its origin. Alternative (a) is empty. There is no learning theory for models and it is
122
L. Bonatti
unlikely that any future such theory will bring about substantial economies in nativism, since most of the structures needed for problem solving are the same regardless of which theory turns out to be correct. Without (a), alternative (b) assumes the following form: an innate mechanism for building mental models gives an evolutionary advantage that an innate mental logic doesn't give. But evolutionary explanations are not so fine-grained as to discriminate between our capacity to construct models as opposed to derivations. If there is any such explanation, it will work for both; if there isn't one for mental logic, there isn't one for mental models either.
6. Mental logic cannot explain reasoning because people follow extra-logical heuristics

Often heuristics of various sorts guide human responses even in deduction. But it is unclear how this counts against mental logic. Models need heuristics as much as logical rules do. For example, if a premise has different possible interpretations, an order is needed to constrain the sequence of constructed models (Galotti, 1989). Such an order too may depend on heuristics having nothing to do with models proper, such as reliance on the most frequent interpretation, or on previous experience, or on previously held beliefs. But there may be something more to the argument. It may be argued that heuristics don't pose any special problem for model-based theories of reasoning, whereas they do for logic-based theories. Just like Dennett's queen moving out early, heuristics can be an epiphenomenon of the structure of models, whereas rule-based systems must express them explicitly. For example, a model for the sentences "a is to the right of b" and "b is to the right of c" allows us to derive "a is to the right of c" with no explicit rule to that effect (see Johnson-Laird, 1983a; Johnson-Laird & Byrne, 1991). In this case, transitivity is an emergent feature of the structure of the model. Analogously, it may be argued that other apparent rule-following behaviors, such as strategies, are emergent features of models. However, subjects often reason by following heuristics that they can perfectly well spell out and that are not accounted for by the structure of models (see, for example, Galotti, Baron & Sabini, 1986), and this squares very badly with a radical rule epiphenomenalism. At least in principle, models may help to solve the problem of implicitness: certain processes may be externally described by explicit rules which nevertheless are not explicitly represented in the mental life of an organism. Solution: the rules supervene on the structure of models.
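The spatial example can be made concrete with a small sketch (my illustration, not from the text; the function names are invented). The program only places tokens on a left-to-right line, one premise at a time; no transitivity rule appears anywhere, yet "a is to the right of c" can be read off the finished layout:

```python
# Minimal sketch of a spatial "mental model": transitivity emerges from
# the layout itself, with no explicit transitivity rule in the program.

def build_model(right_of_pairs):
    """Build a left-to-right layout from premises of the form 'x is to the right of y'.
    Handles simple chains only; assumes the premises are consistent."""
    line = []  # leftmost token first
    for x, y in right_of_pairs:
        if x not in line and y not in line:
            line.extend([y, x])                # start a fresh segment: y then x
        elif y not in line:
            line.insert(line.index(x), y)      # place y just left of x
        elif x not in line:
            line.insert(line.index(y) + 1, x)  # place x just right of y
    return line

def right_of(model, x, y):
    """Read a 'right of' relation directly off the layout."""
    return model.index(x) > model.index(y)

model = build_model([("a", "b"), ("b", "c")])  # a right of b; b right of c
print(right_of(model, "a", "c"))  # True, though no transitivity rule was stated
```

The point the sketch illustrates is precisely the epiphenomenalist one: the "rule" of transitivity supervenes on the structure of the layout rather than being represented in it.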
But the other side of the coin is the problem of explicitness: how could a system represent the information that is explicitly represented? This is no difficulty for mental logic, but how could a heuristic be explicitly represented within models? Tokens and possibly some of their logical relations are explicit in models, but not
performatives. Models don't contain information specifying the order in which certain operations have to be executed, but only the result of such operations. So while a propositional-like system doesn't have the problem of explicitness, models may have it.

7. Mental logic cannot offer a theory of meaning for connectives

In fact, the formal rules for propositional connectives are consistent with more than one possible semantics ... Hence, although it is sometimes suggested that the meaning of a term derives from, implicitly reflects, or is nothing more than the roles of inference for it, this idea is unworkable ... (Johnson-Laird et al., 1992, p. 420)
But truth tables (and thus models) don't have such a problem, since they "are merely a systematic way of spelling out a knowledge of the meanings of connectives" (Johnson-Laird et al., 1992, p. 420). Johnson-Laird et al. refer to an argument presented in Prior (1960, 1964). But an aspect of it has been forgotten. Prior argued that rules of inference cannot analytically define the meaning of the connectives they govern. If there were nothing more to the meaning of a connective than the inferences associated with it, then the connective tonk could be defined, with the meaning specified by the following rules:

(1) From P, derive P tonk Q
(2) From P tonk Q, derive Q

and with tonk we could obtain the following derivation:

2 and 2 are 4
Therefore, 2 and 2 are 4 tonk 2 and 2 are 5
Therefore, 2 and 2 are 5.

Prior's argument is a challenge to a conceptual role semantics. If meaning is inferential role, how is tonk to be avoided? According to Prior, tonk shows that explicit definitions cannot give the meaning to a term on the ground of the analytical tie between the definiens and the definiendum, but can at most correspond to a previously possessed meaning: we see that certain rules of inference are adequate for "and" because we know its meaning and judge the adequacy of the rules with respect to it. We can perfectly well introduce a sign for tonk governed by the above rules and have a purely symbolic game running. But games with rules and transformations of symbols don't generate meaning: "to believe that anything of this sort can take us beyond the symbols to their meaning, is to believe in magic" (Prior, 1964, p. 191). The difference between "and" and tonk is that in the first case the rules correspond to the (previously held) sense of the word "and": they
don't confer it its meaning, but are "indirect and informal ways" (Prior, 1964, p. 192) to clarify it. But in the second case there is no prior sense to appeal to. We can define a class of signs standing for conjunction, and a class of signs standing for contonktion, but the latter is an empty class. There are conjunction-forming signs, because there is a conjunction. There are no contonktion-forming signs, because there is no contonktion, and the explicit introduction of a sign for it does not give life to a new connective. So Prior's argument goes. One way to read it is that rules can't give a symbol its meaning, but something else can: namely, truth tables in a metalanguage. This seems to be the interpretation adopted by Johnson-Laird et al. (1992) when they claim that mental logic cannot explain the meaning of connectives, but truth tables can. In fact, Prior (1964) remarked that explicitly defining connectives in terms of truth tables did not change the point of his criticism. In his view, there was "no difference in principle between [rules of inference and truth tables]" (Prior, 1964, p. 192). Instead of using rules, he argued, we can define a conjunction-forming sign by using the familiar truth table, but this will not give conjunction its meaning; any formula of arbitrary length with the same truth table will turn out to be a conjunction-forming sign; so will formulas involving non-logical conceptions such as "P ett Q", which is the abbreviation for "Either P and Q, or Oxford is the capital of Scotland" (Prior, 1964, p. 194). The point of this further facet of the argument is that truth tables identify a much broader class of signs than conjunction, and moreover, signs that are understood on the basis of the understanding of conjunction (see Usberti, 1991). We might try to eliminate all the unwanted signs which would be defined by the truth table for conjunction by saying that the table defines the meaning of the shortest possible sign for conjunction.
We would probably be happy with this solution. But, Prior noticed, we would accept it because we understand that such a characterization captures the meaning of the conjunction, and not of the other signs. Thus, truth tables are in no better position than rules to generate meanings. If they apparently don't suffer from tonkitis, they suffer from another equally worrisome disease. And if we wanted to resort again to formal games, then tonkitis would reappear, since a (symbolic) truth table game defining a contonktion-forming sign is easy to find: tonk "is a sign such that when placed between two signs for the propositions P and Q, it forms a sign that is true if P is true and false if Q is false (and therefore, of course, both true and false if P is true and Q is false)" (Prior, 1964, p. 193). We can now leave Prior and touch on the real problem. If we grant that explicit rules, or truth tables, don't define the meaning of the logical symbols, but are accepted on the basis of their correspondence to some pre-existent meaning we
attach to connectives and quantifiers, we still have to explain what the source of our intuitions about the meaning of connectives and quantifiers is, because if thinking that a game of symbols can take us beyond the symbols to their meanings is magic, as Prior said, it is equally magic to think that the meaning of logical symbols comes from nowhere. For Johnson-Laird, Byrne, and Schaeken, truth tables are merely "a systematic way of spelling out a knowledge of the meanings of connectives". But in general this is false. There are 16 binary truth tables: only some of them do, or seem to, spell out the meaning of binary connectives; others clearly don't. Why is it so? Why do we feel that the truth table for conjunction reflects the meaning of the conjunction, whereas the classical truth table for implication doesn't reflect the meaning of natural implication, and the anomalous truth table for tonk can't reflect the meaning of a new connective? Nothing seems to block the following possibility. When I see somebody who reminds me of my brother, one of the possibilities is that it is my brother. So when I see a set of rules for the conjunction and I think that it adequately expresses what I mean by a conjunction, one of the possibilities is that I find that resemblance because the rules are the exact expression of the patterns of inference of a logical connective in the mind. In this case, there is nothing more to the meaning of the term than the rules themselves. At the same time, when I see the truth table of material implication I realize that it does not spell out the meaning of natural implication because the rules governing natural implication are not reflected in it, and when I see the rules of inference - or the truth table - for tonk, I have no intuition about their adequacy because there is no logical connective for tonk in the mind of which the explicit rules are a copy. Contonktion cured. Intuitions, however, are not good guides.
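The count of 16 binary truth tables just invoked can be checked mechanically. The sketch below is my illustration, not anything in the text: it enumerates all assignments of a truth value to the four input pairs, and verifies that the tables for "and" and for material implication are among them, while most of the 16 correspond to no natural-language connective at all:

```python
# There are 2**4 = 16 binary truth tables: one truth value for each of
# the four input pairs (T,T), (T,F), (F,T), (F,F).
from itertools import product

inputs = list(product([True, False], repeat=2))   # the four input pairs
tables = list(product([True, False], repeat=4))   # all 16 binary truth tables
assert len(tables) == 16

conjunction = tuple(p and q for p, q in inputs)            # table for "and"
material_implication = tuple((not p) or q for p, q in inputs)
print(conjunction in tables, material_implication in tables)  # both among the 16
```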
It is not enough to say that conjunctions have a meaning because they seem to correspond to rules in the mind but contonktions don't because they don't titillate our intuitions. There are lots of logical operators that may not have any straightforward correspondence with natural language, and yet are computed in retrieving the truth conditions of natural language sentences - consider, for example, focus, or quantifiers over events. If a semanticist presented us with a set of rules for them, we would probably not have the same immediate intuition we feel for conjunction. This is where a theory of mental logic comes in. A developed theory of mental logic offers empirical reasons to show that conjunctions are in the mind, while contonktions are not. If such a theory can be worked out (and a tiny part of it already exists), then mental logic can be the basis of a theory of meaning for natural connectives. For the moment, we are very far from having such a complete theory. The present point is simply that no argument exists to hamper its development.
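Prior's two rules for tonk, quoted in the section above, can also be mechanized in a few lines. This is a toy of my own, not anything in Prior or in the text, but it makes vivid why a connective defined purely by those rules trivializes the system: any conclusion becomes derivable from any premise in exactly two steps.

```python
# Prior's tonk rules, implemented literally:
#   Rule (1): from P, derive P tonk Q (for any Q whatsoever)
#   Rule (2): from P tonk Q, derive Q

def derive_with_tonk(premise, conclusion):
    """Return the two-step tonk 'proof' of an arbitrary conclusion."""
    step1 = ("tonk", premise, conclusion)  # Rule (1)
    step2 = conclusion                     # Rule (2)
    return [premise, step1, step2]

proof = derive_with_tonk("2 and 2 are 4", "2 and 2 are 5")
print(proof[-1])  # '2 and 2 are 5': an absurdity, 'proved' in two steps
```

Nothing in the program consults the meaning of the signs; that is Prior's point that a purely symbolic game, by itself, generates no meaning.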
8. There is no mental logic because valid inferences can be suppressed

This recent argument is based on the so-called "suppression of valid inferences" paradigm. By modifying a paradigm used by Rumain, Connell, and Braine (1983), Byrne (1989) set up an experiment in which to premises such as

If she meets her friend she will go to a play
She meets her friend

an extra premise was added, transforming the argument into

If she meets her friend she will go to a play
If she has enough money she will go to a play
She meets her friend

and she showed that in this case the percentage of subjects applying modus ponens drops from 96% to 38%. Mental model theorists attributed considerable importance to this result. It shows, they claimed, that even valid deductions as strong as modus ponens can be blocked:

Models can be interrelated by a common referent or by general knowledge. Byrne (1989) demonstrated that these relations in turn can block modus ponens. ... The suppression of the deduction shows that people do not have a secure intuition that modus ponens applies equally to any content. Yet, this intuition is a criterion for the existence of formal rules in the mind. (Johnson-Laird et al., 1992, p. 326)
and, as a consequence, that

by their own argument, rule theorists ought to claim that there cannot be inference rules for (valid deduction). (Johnson-Laird & Byrne, 1991, p. 83)
But no argument is offered to ensure that modus ponens is really violated, or to justify the claim that this result supports the mental models hypothesis. If we assume that deductive rules apply not to the surface form of a text, but to its integrated representation, then subjects may be led by pragmatic reasons to construe the two premises

If she meets her friend she will go to a play
If she has enough money she will go to a play

as a single

If (she meets her friend and she has enough money) she will go to a play
and therefore, when provided only with the premise "She meets her friend", they don't know about the truth of the conjunctive antecedent and correctly refuse to use modus ponens. In other cases, also studied by Byrne, when subjects are given arguments such as

If she meets her friend she will go to a play
If she meets her mother she will go to a play
She meets her friend

they do conclude that she will go to a play. This may be because subjects compose the premises "If A then B" and "If C then B" as a single "If (A or C), then B", and knowing one of the disjuncts of the composed antecedent suffices to correctly apply modus ponens. Thus, under this interpretation, there is no suppression of valid inferences: simply, people tend to construct a unified representation of a text, which may itself be governed by formal rules of composition. It may be replied that my response to the suppression argument puts the weight of the explanation on pre-logical comprehension processes, rather than on deduction proper, and that mental logic theorists have no account of such processes. This wouldn't be necessary for models, because they "have the machinery to deal with meaning". But I have shown that such a claim is false. Models too rely on pragmatic comprehension mechanisms, and don't explain them. If model theorists want to explain why people draw the inference in one case and not in the other, they have to say that in one case a model licensing the inference is constructed, and in the other a model not licensing the inference is constructed. But they offer no explanation of why this is so.
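The logic of the two readings can be verified by brute force over truth assignments. The sketch below is my own illustration, with an assumed propositional encoding (A = "she meets her friend", C = the extra antecedent, B = "she will go to a play"): under the conjunctive composition B does not follow from A alone, so withholding modus ponens is correct, while under the disjunctive composition it does follow.

```python
# Brute-force propositional entailment check over all valuations of A, B, C.
from itertools import product

def entails(premises, conclusion):
    """True iff every valuation making all premises true also makes the conclusion true."""
    for a, b, c in product([True, False], repeat=3):
        env = {"A": a, "B": b, "C": c}
        if all(p(env) for p in premises) and not conclusion(env):
            return False
    return True

implies = lambda p, q: (not p) or q

# Conjunctive composition: "If (A and C) then B", plus the premise A.
conj = [lambda e: implies(e["A"] and e["C"], e["B"]), lambda e: e["A"]]
# Disjunctive composition: "If (A or C) then B", plus the premise A.
disj = [lambda e: implies(e["A"] or e["C"], e["B"]), lambda e: e["A"]]

B = lambda e: e["B"]
print(entails(conj, B))  # False: B does not follow, modus ponens correctly withheld
print(entails(disj, B))  # True: one disjunct of the antecedent suffices
```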
9. Conclusions

"Yes, but mental logic has had its shot. It has been around for centuries and nothing good came out of it. It's time to change." Often the contrast between the long history of mental logic and its scarce psychological productivity is taken as a proof of its sterility. In fact, this impression derives from a mistake of historical perspective. The idea is very ancient, but the conceptual tools needed to transform it into the basis for testable empirical hypotheses are very recent. For centuries, logic too remained substantially unchanged, to the point that Kant considered it a completed discipline (1965, pp. 17-18). So there was no reason to change the conventional wisdom on the relations between logic and psychology: the former was stable because it was considered complete, and the latter was stable because it was non-existent. When, with Frege, Russell and the neopositivists, logic as we mean it started being developed, the routes of logic and psychology separated.
Well beyond the 1930s, among the large majority of philosophically minded logicians, showing interest in psychological processes became a sort of behavior that well-mannered people should avoid. No substantial argument against the psychological feasibility of mental logic motivated this change of view. Rather, its roots have to be looked for in the general spirit of rebellion against German and English idealism from which twentieth-century analytic philosophy stemmed. Nevertheless, for independent reasons, the same conclusion became popular among experimental psychologists and was generally held until the early 1960s, both by behaviorists and by the new-look psychologists. There was, indeed, the Piagetian exception, but it does not count: Piaget's flirting with mental logic was never clear enough to become a serious empirical program (Braine & Rumain, 1983), and recent Piagetian-oriented investigations of mental logic (see Overton, 1990) have not helped towards a clarification. It was again an impulse coming from logicians - not from psychologists - that put logic back in the psychological ballpark. Hilbert first directly expressed a connection between symbols and thought which could serve as a psychological underpinning for mental logic. For him, the fundamental idea of proof theory was "none other than to describe the activity of our understanding, to make a protocol of the rules according to which our thinking actually proceeds. Thinking, it so happens, parallels speaking and writing: we form statements and place them one behind another" (1927, p. 475). Yet Hilbert's intuition was not enough. Formal systems, as conceived by the axiomatic school, were the least attractive possible tool for investigating the psychology of reasoning.
What was still missing to render logic ready for psychological investigation was on the one side a more intuitive presentation of formal systems, and on the other side a model of how a physical structure can use a formal system to carry out derivations. The first was provided by Gentzen, and the second by Turing. However, once again, the distance between Gentzen's and Turing's ideas and a real psychological program should not be underestimated. Gentzen did introduce the systems of natural deduction with the aim to "set up a formal system which comes as close as possible to actual reasoning" (Gentzen, 1969, p. 68), but his reference to "actual reasoning" was merely intuitive. And Turing did offer the abstract model of how a physical mechanism could perform operations once considered mental along the lines suggested by Hilbert, but Turing's real breakthrough consisted of the realization that a computer can be a mind, namely, that certain kinds of properties once attributable only to humans can also be appropriately predicated of other physical configurations. Such insight, however, leaves the mechanisms and procedures by which the mind itself operates underspecified. It says that mental processes can be simulated, but it leaves it undetermined whether the simulandum and the simulans share the same psychology. The further step necessary to the formulation of a psychological notion of mental logic came when functionalism advanced the explicit thesis that the
psychological vocabulary is computational vocabulary, and that the natural kinds described by psychology are not organisms, but computational devices. The change leading to this second step was gradual, and required a lot of philosophical work to be digested. We are now beyond the 1960s, and not in Aristotle's age. Only then had logic and philosophy come to the right point of development for the mental logic hypothesis to be taken seriously. And another decade or more had to go by before experimental techniques were sufficiently developed to begin asking nature the right questions in the right way. The works by Braine, Rips and their collaborators are the first attempts at elaborating mental logic in a regimented psychological setting. Thus the psychological history of mental logic is very recent. It is, in fact, roughly contemporary with the psychological history of the mental model hypothesis. This shouldn't come as a surprise: both needed largely the same conceptual tools to be conceived. Mental models are not the inevitable revolution after millennia of mental logic domination. So, contrary to widespread assumptions, there are no good arguments against mental logic, be it in point of principle or in point of history. If a case against it and in favor of mental models can be made, it cannot rest on principled reasons, but on the formal and empirical development of the two theories. Indeed, extending the mental logic hypothesis beyond propositional reasoning engenders formidable problems connected with the choice of an appropriate language to express the logical forms of sentences on which rules apply, the choice of psychologically plausible rules to test, and the choice of appropriate means to test them. Approaching these problems requires the close collaboration of psychologists, natural language semanticists and syntacticians. But these are problems, however hard, and not mysteries.
Most psychologists have abandoned the program and married the mental models alternative, both for its supposed superiority in handling empirical data and for the supposedly overwhelming arguments against mental logic. In fact, the case for mental models has been overstated on both counts. Given how little we know about the mind and reasoning, conclusions on research programs that only began to be adequately developed a few years ago are premature. Psychologists should keep playing the mental logic game.
References

Barwise, J., & Cooper, R. (1981). Generalized quantifiers and natural language. Linguistics and Philosophy, 4, 159-219.
Bechtel, W., & Abrahamsen, A. (1991). Connectionism and the mind. Oxford: Basil Blackwell.
Bonatti, L. (in press). Propositional reasoning by model? Psychological Review.
Boolos, G. (1984). On 'Syllogistic inference'. Cognition, 17, 181-182.
Braine, M.D. (1979). If-then and strict implication: A response to Grandy's note. Psychological Review, 86, 158-160.
Braine, M.D., & O'Brien, D.P. (1991). A theory of if: Lexical entry, reasoning program, and pragmatic principles. Psychological Review, 98, 182-203.
Braine, M.D., Reiser, B.J., & Rumain, B. (1984). Some empirical justification for a theory of natural propositional logic. The psychology of learning and motivation (Vol. 18, pp. 313-371). San Diego, CA: Academic Press.
Braine, M.D., & Rumain, B. (1983). Logical reasoning. In J. Flavell (Ed.), Carmichael's handbook of child psychology (Vol. III, pp. 263-340). New York: Wiley.
Byrne, R. (1989). Suppressing valid inferences with conditionals. Cognition, 31, 61-83.
Byrne, R. (1991). Can valid inferences be suppressed? Cognition, 39, 71-78.
Churchland, P.M. (1990). On the nature of explanation: a PDP approach. Reprinted in S. Forrest (Ed.) (1991), Emergent computation (pp. 281-292). Cambridge, MA: MIT Press.
Cosmides, L. (1989). The logic of social exchange: Has natural selection shaped how humans reason? Studies with the Wason selection task. Cognition, 31, 187-276.
Ehrlich, K., & Johnson-Laird, P.N. (1982). Spatial descriptions and referential continuity. Journal of Verbal Learning and Verbal Behavior, 21, 296-306.
Fodor, J.A. (1980). On the impossibility of acquiring 'more powerful' structures. In M. Piattelli-Palmarini (Ed.), Language and learning: The debate between Jean Piaget and Noam Chomsky (pp. 142-163). Cambridge, MA: Harvard University Press.
Fodor, J.A. (1983). Modularity of mind. Cambridge, MA: MIT Press.
Galotti, K.M. (1989). Approaches to studying formal and everyday reasoning. Psychological Bulletin, 105, 331-351.
Galotti, K.M., Baron, J., & Sabini, J.P. (1986). Individual differences in syllogistic reasoning: deduction rules or mental models? Journal of Experimental Psychology: General, 115, 16-25.
Garnham, A. (1987). Mental models as representations of discourse and text. Chichester: Ellis Horwood.
Gentzen, G. (1969). Investigations into logical deduction. In M.E. Szabo (Ed.), The collected papers of Gerhard Gentzen (pp. 68-128). Amsterdam: North-Holland.
Hilbert, D. (1927). The foundations of mathematics. In J. van Heijenoort (Ed.), From Frege to Gödel (pp. 464-469). Cambridge, MA: Harvard University Press, 1967.
Hodges, W. (1993). The logical content of theories of deduction. Behavioral and Brain Sciences, 16, 353-354.
Johnson-Laird, P.N. (1983a). Mental models. Cambridge, MA: Harvard University Press.
Johnson-Laird, P.N. (1983b). Thinking as a skill. In J.B. Evans (Ed.), Thinking and reasoning: psychological approaches (pp. 44-75). London: Routledge & Kegan Paul.
Johnson-Laird, P.N. (1989). Mental models. In M. Posner (Ed.), Foundations of cognitive science (pp. 469-499). Cambridge, MA: MIT Press.
Johnson-Laird, P.N., & Bara, B. (1984). Syllogistic inference. Cognition, 16, 1-61.
Johnson-Laird, P.N., & Byrne, R. (1989). Spatial reasoning. Journal of Memory and Language, 28, 565-575.
Johnson-Laird, P.N., & Byrne, R. (1990). Meta-logical problems: Knights, knaves and Rips. Cognition, 36, 69-84.
Johnson-Laird, P.N., & Byrne, R. (1991). Deduction. Hillsdale, NJ: Erlbaum.
Johnson-Laird, P.N., & Byrne, R.M. (1993). Precis of Deduction. Behavioral and Brain Sciences, 16, 323-380.
Johnson-Laird, P.N., Byrne, R., & Schaeken, W. (1992). Propositional reasoning by model. Psychological Review, 99, 418-439.
Johnson-Laird, P.N., Byrne, R.M., & Tabossi, P. (1989). Reasoning by model: the case of multiple quantification. Psychological Review, 96, 658-673.
Kant, I. (1965). Critique of pure reason. New York: St. Martin's Press.
Lea, R.B., O'Brien, D.P., Fisch, S., Noveck, I., & Braine, M. (1990). Predicting propositional logic inferences in text comprehension. Journal of Memory and Language, 29, 361-387.
Lewontin, R.C. (1990). The evolution of cognition. In D. Osherson & E. Smith (Eds.), Thinking: an invitation to cognitive science (Vol. 3, pp. 229-246). Cambridge, MA: MIT Press.
McGinn, C. (1989). Mental content. Oxford: Basil Blackwell.
Oakhill, J., Johnson-Laird, P.N., & Garnham, A. (1989). Believability and syllogistic reasoning. Cognition, 31, 117-140.
O'Brien, D. (1993). Mental logic and irrationality: we can put a man on the moon, so why can't we solve those logical reasoning problems? In K.I. Manktelow & D.E. Over (Eds.), Rationality (pp. 110-135). London: Routledge.
O'Brien, D., Braine, M.D., & Yang, Y. (in press). Propositional reasoning by mental models? Simple to refute in principle and in practice. Psychological Review.
Overton, W. (Ed.) (1990). Reasoning, necessity and logic: Developmental perspectives. Hillsdale, NJ: Erlbaum.
Prior, A.N. (1960). The runabout inference ticket. Analysis, 21, 38-39.
Prior, A.N. (1964). Conjunction and contonktion revisited. Analysis, 24, 191-195.
Rips, L.J. (1983). Cognitive processes in propositional reasoning. Psychological Review, 90, 38-71.
Rips, L.J. (1986). Mental muddles. In M. Brand & R. Harnish (Eds.), The representation of knowledge and belief (pp. 258-286). Tucson: University of Arizona Press.
Rumain, B., Connell, J., & Braine, M.D. (1983). Conversational comprehension processes are responsible for reasoning fallacies in children as well as adults: It is not the biconditional. Developmental Psychology, 19, 471-481.
Usberti, G. (1991). Prior's disease. Teoria, 2, 131-138.
7
Concepts: a potboiler

Jerry Fodor*

Graduate Center, CUNY, 33 West 42nd Street, New York, NY 10036, USA
Center for Cognitive Science, Rutgers University, Psychology Building, Busch Campus, Piscataway, NJ 08855, USA
Abstract

An informal, but revisionist, discussion of the role that the concept of a concept plays in recent theories of the cognitive mind. It is argued that the practically universal assumption that concepts are (at least partially) individuated by their roles in inferences is probably mistaken. A revival of conceptual atomism appears to be the indicated alternative.
Introduction: the centrality of concepts

What's ubiquitous goes unremarked; nobody listens to the music of the spheres (or to me, for that matter). I think a certain account of concepts is ubiquitous in recent discussions about minds; not just in philosophy but also in psychology, linguistics, artificial intelligence, and the rest of the cognitive sciences; and not just this week, but for the last fifty years or so. And I think this ubiquitous theory is quite probably untrue. This paper aims at consciousness raising; I want to get you to see that there is this ubiquitous theory and that, very likely, you yourself are among its adherents. What to do about the theory's not being true (if it's not) - what our cognitive science would be like if we were to throw the theory overboard - is a long, hard question, and one that I'll mostly leave for another time. The nature of concepts is the pivotal theoretical issue in cognitive science; it's the one that all the others turn on. Here's why: Cognitive science is fundamentally concerned with a certain mind-world
* Correspondence to: J. Fodor, Center for Cognitive Science, Rutgers University, Psychology Building, Busch Campus, Piscataway, NJ 08855, USA.
relation; the goal is to understand how a creature's mental processes can cause it to behave in ways which, in normal circumstances, reliably comport with its utilities. There is, at present, almost[1] universal agreement that theories of this relation must posit mental states some of whose properties are representational, and some of whose properties are causal. The representational (or, as I'll often say, semantic) properties of a creature's mental states are supposed to be sensitive to, and hence to carry information about, the character of its environment.[2] The causal properties of a creature's mental states are supposed to determine the course of its mental processes, and, eventually, the character of its behavior. Mental entities that exhibit both semantic and causal properties are generically called "mental representations", and theories that propose to account for the adaptivity of behavior by reference to the semantic and causal properties of mental representations are called "representational theories of the mind".
Enter concepts. Concepts are the least complex mental entities that exhibit both representational and causal properties; all the others (including, particularly, beliefs, desires and the rest of the "propositional attitudes") are assumed to be complexes whose constituents are concepts, and whose representational and causal properties are determined, wholly or in part, by those of the concepts they're constructed from. This account subsumes even the connectionist tradition, which is, however, often unclear, or confused, or both about whether and in what sense it is committed to complex mental representations. There is a substantial literature on this issue, provoked by Fodor and Pylyshyn (1988). See, for example, Smolensky (1988) and Fodor and McLaughlin (1990).
Suffice it for present purposes that connectionists clearly assume that there are elementary mental representations (typically, labeled nodes), and that these have both semantic and causal properties. Roughly, the semantic properties of a node in a network are specified by the node's label, and its causal properties are determined by the character of its connectivity. So even connectionists think there are concepts as the present discussion understands that notion. On all hands, then, concepts serve both as the domains over which the most elementary mental processes are defined, and as the most primitive bearers of semantic properties. Hence their centrality in representational theories of mind.
1. The caveat is because it's moot how one should understand the relation between main-line cognitive science and the Gibsonian tradition. For discussion, see Fodor and Pylyshyn (1981).

2. There is no general agreement, either in cognitive science or in philosophy, about how the representational/semantic properties of mental states are to be analyzed; they are, in general, simply taken for granted by psychologists when empirical theories of cognitive processes are proposed. This paper will not be concerned, other than tangentially, with these issues in the metaphysical foundations of semantics. For recent discussion, however, see Fodor (1990) and references cited therein.
1. Ancient history: the classical background

The kind of concept-centered psychological theory I've just been sketching should seem familiar, not only from current work in cognitive science, but also from the philosophical tradition of classical British empiricism. I want to say a bit about classical versions of the representational theory of mind because, though their general architecture conforms quite closely to what I've just outlined, the account of concepts that they offered differs, in striking ways, from the ones that are now fashionable. Comparison illuminates both the classical and the current kinds of representational theories, and reveals important respects in which their story was closer to being right about the nature of concepts than ours. So, anyhow, I am going to argue.
Here's a stripped-down version of a classical representational theory of concepts. Concepts are mental images. They get their causal powers from their associative relations to one another, and they get their semantic properties from their resemblance to things in the world. So, for example, the concept DOG applies to dogs because dogs are what (tokens of) the concept looks like. Thinking about dogs often makes one think about cats because dogs and cats often turn up together in experience, and it's the patterns in one's experience, and only these, that determine the associations among one's ideas. Because association is the only causal power that ideas have, and because association is determined only by experience, any idea can, in principle, become associated to any other, depending on which experiences one happens to have. Classical ideas cannot, therefore, be defined by their relations to one another.
Though DOG-thoughts call up CAT-thoughts, LEASH-thoughts, BONE-thoughts, BARK-thoughts and the like in most actual mental lives, there are possible mental lives in which that very same concept reliably calls up, as it might be, PRIME NUMBER-thoughts or TUESDAY AFTERNOON-thoughts or KETCHUP-thoughts. It depends entirely on how often you've come across prime numbers of dogs covered with ketchup on Tuesday afternoons.
So much by way of a reminder of what classical theorists said about concepts. I don't want to claim much for the historical accuracy of my exegesis (though it may be that Hume held a view within hailing distance of the one I've sketched; for purposes of exposition, I'll assume he did). But I do want to call your attention to a certain point about the tactics of this kind of theory construction - a point that's essential but easy to overlook.
Generally speaking, if you know what an X is, then you also know what it is to have an X. And ditto the other way around. No doubt, this applies to concepts. If, for example, your theory is that concepts are pumpkins, then it has to be a part of your theory that having a concept is having a pumpkin; and if your theory is that having a concept is having a pumpkin, then it has to be a part of your theory that pumpkins are what concepts are. I take it that this is just truistic.
Sometimes it's clear in which direction the explanation should go, and sometimes it isn't. So, for example, one's theory about having a cat ought surely to be parasitic on one's theory about being a cat; first you say what a cat is, and then you say that having a cat is just: having one of those. With jobs, pains, and siblings, however, it goes the other way round. First you say what it is to have a job, or a pain, or a sibling, and then the story about what jobs, pains and siblings are is a spin-off. These examples are, I hope, untendentious. But decisions about the proper order of explanation can be unobvious, important, and extremely difficult. To cite a notorious case: ought one first to explain what the number three is and then explain what it is for a set to have three members? Or do you first explain what sets are, and then explain what numbers are in terms of them? Or are the properties of sets and of numbers both parasitic on those of something quite else (like counting, for example)? If I knew, and I was rich, I would be rich and famous.
Anyhow, classical representational theories uniformly took it for granted that the explanation of concept possession should be parasitic on the explanation of concept individuation. First you say what it is for something to be the concept X - you give the "identity conditions" for the concept - and then the story about concept possession follows without further fuss. Well, but how do you identify a concept? Answer: you identify a concept by saying what it is the concept of. The concept DOG, for example, is the concept of dogs; that's to say, it's the concept that you use to think about dogs with. Correspondingly, having the concept DOG is just having a concept to think about dogs with. Similarly, mutatis mutandis, for concepts of other than canine content: the concept X is the concept of Xs. Having the concept X is just having a concept to think about Xs with.
(More precisely, having the concept X is having a concept to think about Xs "as such" with. The context "thinks about ..." is intentional for the "..." position. We'll return to this presently.)
So much for the explanatory tactics of classical representational theories of mind. Without exception, however, current theorizing about concepts reverses the classical direction of analysis. The substance of current theories lies in what they say about the conditions for having the concept X. It's the story about being the concept X - the story about concept individuation - that they treat as parasitic: the concept X is just whatever it is that a creature has when it has that concept. (See, for example, Peacocke, 1992, which is illuminatingly explicit on this point.) This subtle, and largely inarticulate, difference between contemporary representational theories and their classical forebears has, so I'll argue, the most profound implications for our cognitive science. To a striking extent, it determines the kinds of problems we work on and the kinds of theories that we offer as solutions to our problems. I suspect that it was a wrong turn - on balance, a catastrophe - and that we shall have to go back and do it all again.
First, however, just a little about why the classical representational view was abandoned. There were, I think, three kinds of reasons: methodological, metaphysical and epistemological. We'll need to keep them in mind when we turn to discussing current accounts of concepts.
Methodology: Suppose you're a behaviorist of the kind who thinks there are no concepts. In that case, you will feel no need for a theory about what concepts are, classical or otherwise. Behaviorist views aren't widely prevalent now, but they used to be; one of the things that killed the classical theory of concepts was simply that concepts are mental entities,[3] and mentalism went out of fashion.
Metaphysics: A classical theory individuates concepts by specifying their contents; the concept X is the concept of Xs. This seemed OK - it seemed not to beg any principled questions - because classical theorists thought that they had of-ness under control; they thought the image theory of mental representation explained it. We now know that they were wrong to think this. Even if concepts are mental images (which they aren't), and even if the concept DOG looks like a dog (which it doesn't), still, it isn't because it looks like a dog that it's the concept of dogs. Of-ness ("content", "intentionality") does not reduce to resemblance, and it is now widely, and rightly, viewed as problematic. It doesn't follow either that classical theorists were wrong to hold that the story about concept possession should be parasitic on the story about concept identification, or that they were wrong to hold that concepts should be individuated by their contents. But it's true that if you want to defend the classical order of analysis, you need an alternative to the picture theory of meaning.
Epistemology: The third of the standard objections to the classical account of concepts, though at least as influential as the others, is distinctly harder to state. Roughly, it's that classical theories aren't adequately "ecological".
Used in this connection, the term has a Gibsonian ring; but I mean it to pick out a much broader critical tradition. (In fact, I suspect Dewey was the chief influence; see the next footnote.) Here's a rough formulation. What cognitive science is trying to understand is something that happens in the world; it's the interplay of environmental contingencies and behavioral adaptations. Viewing concepts primarily as the vehicles of thought puts the locus of this mind/world interaction (metaphorically and maybe literally) not in the world but in the head. Having put it in there, classical theorists are at a loss as to how to get it out again. So the ecological objection goes.
This kind of worry comes in many variants, the epistemological being, perhaps, the most familiar. If concepts are internal mental representations, and thought is conversant only with concepts, how does thought ever contact the external world that the mental representations are supposed to represent? If there is a "veil of ideas" between the mind and the world, how can the mind see the world through the veil? Isn't it, in fact, inevitable that the classical style of theorizing eventuates either in solipsism ("we never do connect with the world, only with our idea of it") or in idealism ("it's OK if we can never get outside of heads because the world is in there with us")?[4] And, surely, solipsism and idealism are both refutations of theories that entail them.
Notice that this ecological criticism of the classical story is different from the behaviorist's eschewal of intentionality as such. The present objection to "internal representations" is not that they are representations, but that they are internal. In fact, this sort of objection to the classical theory predates behaviorism by a lot. Reid used it against Hume, for example. Notice too that this objection survives the demise of the image theory of concepts; treating mental representation as, say, discursive rather than iconic doesn't help. What's wanted isn't either pictures of the world or stories about the world; what's wanted is what they call in Europe being in the world. (I'm told this sounds even better in German.) This is all, as I say, hard to formulate precisely; I think, in fact, that it is extremely confused. But even if the "ecological" diagnosis of what's wrong with classical concepts is a bit obscure, it's clear enough what cure was recommended, and this brings us back to our main topic. If what we want is to get thought out of the head and into the world, we need to reverse the classical direction of analysis, precisely as discussed above; we need to take having a concept as the fundamental notion and define concept individuation in terms of it. This is a true Copernican revolution in the theory of mind, and we are still living among the debris.

3. Terminological footnote: here and elsewhere in this paper, I follow the psychologist's usage rather than the philosopher's; for philosophers, concepts are generally abstract entities, hence, of course, not mental. The two ways of talking are compatible. The philosopher's concepts can be viewed as the types of which the psychologist's concepts are tokens.
Here, in the roughest outline, is the new theory about concept possession: having a concept is having certain epistemic capacities. To have the concept of X is to be able to recognize Xs, and/or to be able to reason about Xs in certain kinds of ways. (Compare the classical view discussed above: having the concept of X is just being able to have thoughts about Xs.) It is a paradigmatically pragmatist idea that having a concept is being able to do certain things rather than being able to think certain things. Accordingly, in the discussion that follows, I will contrast classical theories of concepts with "pragmatic" ones. I'll try to make it plausible that all the recent and current accounts of concepts in cognitive science really are just variations on the pragmatist legacy.

4. "Experience to them is not only something extraneous which is occasionally superimposed upon nature, but it forms a veil or screen which shuts us off from nature, unless in some way it can be 'transcended' (p. la)". "Other [philosophers' methods] begin with results of a reflection that has already torn in two the subject-matter and the operations and states of experiencing. The problem is then to get together again what has been sundered ..." (p. 9). Thus Dewey (1958). The remedy he recommends is resolutely to refuse to recognize the distinction between experience and its object: "[Experience] recognizes in its primary integrity no division between act and material, subject and object, but contains them both in an unanalyzed totality."
In particular, I propose to consider (briefly, you'll be pleased to hear) what I take to be five failed versions of pragmatism about concepts. Each evokes its proprietary nemesis; there is, for each, a deep fact about concepts by which it is undone. The resulting symmetry is gratifyingly Sophoclean. When we've finished with this catalogue of tragic flaws, we'll have exhausted all the versions of concept pragmatism I've heard of, or can think of, and we'll also have compiled a must-list for whatever theory of concepts pragmatism is eventually replaced by.
2.1. Behavioristic pragmatism (and the problem of intentionality)

I remarked above that behaviorism can be a reason for ruling all mentalistic notions out of psychology, concepts included. However, not all behaviorists were eliminativists; some were reductionists instead. Thus Ryle, and Hull (and even Skinner about half the time), are perfectly content to talk of concept possession, so long as the "criteria" for having a concept can be expressed in the vocabulary of behavior and/or in the vocabulary of dispositions to behave. Do not ask what criteria are; there are some things we're not meant to know. Suffice it that criterial relations are supposed to be sort-of-semantical rather than sort-of-empirical.
So, then, which behaviors are supposed to be criterial for concept possession? Short answer: sorting behaviors. Au fond, according to this tradition, having the concept X is being able to discriminate Xs from non-Xs; to sort things into the ones that are X and the ones that aren't. Though behaviorist in essence, this identification of possessing a concept with being able to discriminate the things it applies to survived well into the age of computer models (see, for example, "procedural" semanticists like Woods, 1975); and lots of philosophers still think there must be something to it (see, for example, Peacocke, 1992). This approach gets concepts into the world with a vengeance: having a concept is responding selectively, or being disposed to respond selectively, to the things in the world that the concept applies to; and paradigmatic responses are overt behaviors "under the control" of overt stimulations.
I don't want to bore you with ancient recent history, and I do want to turn to less primitive versions of pragmatism about concepts.
So let me just briefly remind you of what proved to be the decisive argument against the behavioristic version: concepts can't be just sorting capacities, for if they were, then coextensive concepts - concepts that apply to the same things - would have to be identical. And coextensive concepts aren't, in general, identical. Even necessarily coextensive concepts - like TRIANGULAR and TRILATERAL, for example - may perfectly well be different concepts. To put this point another way, sorting is something that happens under a description; it's always relative to some or other way of conceptualizing the things that are being sorted. Though their behaviors
may look exactly the same, and though they may end up with the very same things in their piles, the creature that is sorting triangles is in a different mental state, and is behaving in a different way, from the creature that is sorting trilaterals; and only the first is exercising the concept TRIANGLE. (For a clear statement of this objection to behaviorism, see Dennett, 1978.) Behaviorists had a bad case of mauvaise foi about this; they would dearly have liked to deny the intentionality of sorting outright. In this respect, articles like Kendler (1952), according to which "what is learned" [is] a pseudoproblem in psychology, make fascinating retrospective reading.
Suppose, however, that you accept the point that sorting is always relative to a concept, but you wish, nonetheless, to cleave to some kind of pragmatist reduction of concept individuation to concept possession and of concept possession to having epistemic capacities. The question then arises: what difference in their epistemic capacities could distinguish the creature that is sorting triangles from the creature that is sorting trilaterals? What could the difference between them be, if it isn't in the piles that they end up with? The universally popular answer has been that the difference between sorting under the concept TRIANGLE and sorting under the concept TRILATERAL lies in what the sorter is disposed to infer from the sorting he performs. To think of something as a triangle is to think of it as having angles; to think of something as a trilateral is to think of it as having sides. The guy who is collecting triangles must therefore accept that the things in his collection have angles (whether or not he has noticed that they have sides); and the guy who is collecting trilaterals must accept that the things in his collection have sides (even if he hasn't noticed that they have angles).
The long and short is: having concepts is having a mixture of abilities to sort and abilities to infer.[5] Since inferring is presumably neither a behavior nor a behavioral capacity, this formulation is, of course, not one that a behavioristic pragmatist can swallow. So much the worse for behaviorists, as usual. But notice that pragmatists as such are still OK: even if having a concept isn't just knowing how to sort things, it still may be that having a concept is some kind of knowing how, and that theories of concept possession are prior to theories of concept individuation.
We are now getting very close to the current scene. All non-behaviorist versions of pragmatism hold that concept possession is constituted, at least in part, by inferential dispositions and capacities. They are thus all required to decide which inferences constitute which concepts. Contemporary theories of concepts, though without exception pragmatist, are distinguished by the ways that they approach this question. Of non-behavioristic pragmatist theories of concepts there are, by my reckoning, exactly four. Of which the first is as follows.

5. The idea that concepts are (at least partially) constituted by inferential capacities receives what seems to be independent support from the success of logicist treatments of the "logical" concepts (AND, ALL, etc.). For many philosophers (though not for many psychologists) thinking of concepts as inferential capacities is a natural way of extending the logicist program from the logical vocabulary to TREE or TABLE. So, when these philosophers tell you what it's like to analyze a concept, they start with AND. (Here again, Peacocke, 1992, is paradigmatic.) It should, however, strike you as not obvious that the analysis of AND is a plausible model for the analysis of TREE or TABLE.
2.2. Anarchic pragmatism (and the realism problem)

Anarchic pragmatism is the doctrine that though concepts are constituted by inferential dispositions and capacities, there is no fact of the matter about which inferences constitute which concepts. California is, of course, the locus classicus of anarchic pragmatism; but no doubt there are those even on the East Coast who believe it in their hearts.
I'm not going to discuss the anarchist view. If there are no facts about which inferences constitute which concepts, then there are no facts about which concepts are which. And if there are no facts about which concepts are which, then there are no facts about which beliefs and desires are which (for, by assumption, beliefs and desires are complexes of which concepts are the constituents). And if there are no facts about which beliefs and desires are which, there is no intentional cognitive science, for cognitive science is just belief/desire explanation made systematic. And if there is no cognitive science, we might as well stop worrying about what concepts are and have a nice long soak in a nice hot tub.
I'm also not going to consider a doctrine that is closely related to anarchic pragmatism: namely, that while nothing systematic can be said about concept identity, it may be possible to provide a precise account of when, and to what degree, two concepts are similar. Some such thought is often voiced informally in the cognitive science literature, but there is, to my knowledge, not even a rough account of how such a similarity relation over concepts might be defined. I strongly suspect this is because a robust notion of similarity is possible only where there is a correspondingly robust notion of identity. For a discussion, see Fodor and Lepore (1992, Ch. 7).
2.3. Definitional pragmatism (and the analyticity problem)

Suppose the English word "bachelor" means the same as the English phrase "unmarried male". Synonymous terms presumably express the same concept (this is a main connection between theories about concepts and theories about language), so it follows that you couldn't have the concept BACHELOR and fail to have the concept UNMARRIED MALE. And from that, together with the
intentionality of sorting (see section 2.1), it follows that you couldn't be collecting bachelors so described unless you take yourself to be collecting unmarried males; that is, unless you accept the inference that if something belongs in your bachelor collection, then it is something that is male and unmarried. Maybe this treatment generalizes; maybe having the concept X just is being able to sort Xs and being disposed to draw the inferences that define X-ness.
The idea that it's defining inferences that count for concept possession is now almost as unfashionable as behaviorism. Still, the departed deserves a word or two of praise. The definition story offered a plausible (though partial) account of the acquisition of concepts. If BACHELOR is the concept UNMARRIED MALE, then it's not hard to imagine how a creature that has the concept UNMARRIED and has the concept MALE could put them together and thereby achieve the concept BACHELOR. (Of course, the theory that complex concepts are acquired by constructing them from their elements presupposes the availability of the elements. About the acquisition of these, definitional pragmatism tended to be hazy.) This process of assembling concepts can be - indeed, was - studied in the laboratory; see Bruner, Goodnow, & Austin (1956) and the large experimental literature that it inspired. Other significant virtues of the definition story will suggest themselves when we discuss concepts as prototypes in section 2.4.
But alas, despite its advantages, the definition theory doesn't work. Concepts can't be definitions because most concepts don't have definitions. At a minimum, to define a concept is to provide necessary and sufficient conditions for something to be in its extension (i.e., for being among the things that concept applies to). And, if the definition is to be informative, the vocabulary in which it is couched must not include either the concept itself or any of its synonyms.
As it turns out, for most concepts, this condition simply can't be met; more precisely, it can't be met unless the definition employs synonyms and near-synonyms of the concept to be defined. Maybe being male and unmarried is necessary and sufficient for being a bachelor; but try actually filling in the blanks in "x is a dog iff x is a ..." without using words like "dog" or "canine" or the like on the right-hand side. There is, to be sure, a way to do it; if you could make a list of all and only the dogs (Rover, Lassie, Spot ... etc.), then being on the list would be necessary and sufficient for being in the extension of DOG. That there is this option is, however, no comfort for the theory that concepts are definitions. Rather, what it shows is that being a necessary and sufficient condition for the application of a concept is not a sufficient condition for being a definition of the concept.
This point generalizes beyond the case of lists. Being a creature with a backbone is necessary and sufficient for being a creature with a heart (so they tell me). But it isn't the case that "creature with a backbone" defines "creature with a heart" or vice versa. Quite generally, it seems that Y doesn't define X unless Y applies to all and only the possible Xs (as well, of course, as all and only the
actual Xs). It is, then, the modal notion - possibility - that's at the heart of the idea that concepts are definitions. Correspondingly, what killed the definition theory of concepts, first in philosophy and then in cognitive psychology, is that nobody was able to explicate the relevant sense of "possible". It seems clear enough that even if Rover, Lassie and Spot are all the dogs that there actually are, it is possible, compatible with the concept of DOG, that there should be others; that's why you can't define DOG by just listing the dogs. But is it, in the same sense, possible, compatible with the concept DOG that some of these non-actual dogs are ten feet long? How about twenty feet long? How about twenty miles long? How about a light-year long? To be sure, it's not biologically possible that there should be a dog as big as a light-year; but presumably biology rules out a lot of options that the concept DOG, as such, allows. Probably biology rules out zebra-striped dogs; surely it rules out dogs that are striped red, white and blue. But I suppose that red, white and blue striped dogs are conceptually possible; somebody who thought that there might be such dogs wouldn't thereby show himself not to have the concept DOG - would he? So, again, are light-year-long dogs possible, compatible with the concept DOG? Suppose somebody thought that maybe there could be a dachshund a light-year long. Would that show that he has failed to master the concept DOG? Or the concept LIGHT-YEAR? Or both? To put the point in the standard philosophical jargon: even if light-year-long dogs aren't really possible, "shorter than a light-year" is part of the definition of DOG only if "some dogs are longer than a light-year" is analytically impossible; mere biological or physical (or even metaphysical) impossibility won't do. Well, is it analytically impossible that there should be such dogs? 
If you doubt that this kind of question has an answer, or that it matters a lot for any serious purpose what the answer is, you are thereby doubting that the notion of definition has an important role to play in the theory of concept possession. So much for definitions.
2.4. Stereotypes and prototypes (and the problem of compositionality)

Because it was pragmatist, the definition story treated having a concept as having a bundle of inferential capacities, and was faced with the usual problem about which inferences belong to which bundles. The notion of an analytic inference was supposed to bear the burden of answering this question, and the project foundered because nobody knows what makes an inference analytic, and nobody has any idea how to find out. "Well", an exasperated pragmatist might nonetheless reply, "even if I don't know what makes an inference analytic, I do know what makes an inference statistically reliable. So why couldn't the theory of concept possession be statistical rather than definitional? Why couldn't I exploit
the notion of a reliable inference to do what definitional pragmatism tried and failed to do with the notion of an analytic inference?" We arrive, at last, at modern times.
For lots of kinds of Xs, people are in striking agreement about what properties an arbitrarily chosen X is likely to have. (An arbitrarily chosen bird is likely to be able to fly; an arbitrarily chosen conservative is likely to be a Republican; an arbitrarily chosen dog is likely to be less than a light-year long.) Moreover, for lots of kinds of Xs, people are in striking agreement about which Xs are prototypic of the kind (diamonds for jewels; red for colors; not dachshunds for dogs). And, sure enough, the Xs that are judged to be prototypical are generally ones that have lots of the properties that an arbitrary X is judged likely to have; and the Xs that are judged to have lots of the properties that an arbitrary X is likely to have are generally the ones that are judged to be prototypical.
Notice, in passing, that stereotypes share one of the most agreeable features of definitions: they make the learning of (complex) concepts intelligible. If the concept of an X is the concept of something that is reliably Y and Z, then you can learn the concept X if you have the concepts Y and Z together with enough statistics to recognize reliability when you see it. It would be OK, for this purpose, if the available statistical procedures were analogically (rather than explicitly) represented in the learner. Qua learning models, "neural networks" are analog computers of statistical dependencies, so it's hardly surprising that prototype theories of concepts are popular among connectionists. (See, for example, McClelland & Rumelhart, 1986.)
So, then, why shouldn't having the concept of an X be having the ability to sort by X-ness, together with a disposition to infer from something's being X to its having the typical properties of Xs?
I think, in fact, that this is probably the view of concepts that the prototypical cognitive scientist holds these days. To see why it doesn't work, let's return one last time to the defunct idea that concepts are definitions. It was a virtue of that idea that it provided for the compositionality of concepts, and hence for the productivity and systematicity of thought. This, we're about to see, is no small matter. In the first instance, productivity and systematicity are best illustrated by reference to features (not of minds but) of natural languages. To say that languages are productive is to say that there is no upper bound to the number of well-formed formulas that they contain. To say that they are systematic is to say that if a language can express the proposition that P, then it will be able to express a variety of other propositions that are, in one way or another, semantically related to P. (So, if a language can say that P and that -Q, it will also be able to say that Q and that -P; if it can say that John loves Mary, it will be able to say that Mary loves John . . . and so forth.) As far as anybody knows, productivity and systematicity are universal features of human languages. Productivity and systematicity are also universal features of human thought
Concepts: a potboiler
(and, for all I know, of the thoughts of many infra-human creatures). There is no upper bound to the number of thoughts that a person can think. (I am assuming the usual distinctions between cognitive "competence" and cognitive "performance".) And also, if a mind can entertain the thought that P and any negative thoughts at all, it can also entertain the thought that -P; if it can entertain the thought that Mary loves John, it can entertain the thought that John loves Mary . . . and so on. It is extremely plausible that the productivity and the systematicity of language and thought are both to be explained by appeal to the systematicity and productivity of mental representations, and that mental representations are systematic and productive because they are compositional. The idea is that mental representations are constructed by the application of a finite number of combinatorial principles to a finite basis of (relatively or absolutely) primitive concepts. (So, the very same process that gets you from the concept MISSILE to the concept ANTIMISSILE also gets you from the concept ANTIMISSILE to the concept ANTIANTIMISSILE, and so on ad infinitum.) Productivity follows because the application of these constructive principles can iterate without bound. Systematicity follows because the concepts and principles you need to construct the thoughts that P and -Q are the very same ones that you need to construct the thoughts that Q and -P; and the concepts and principles you need to construct the thought that John loves Mary are the very same ones that you need to construct the thought that Mary loves John. This sort of treatment of compositionality is familiar, and I will assume that it is essentially correct. I want to emphasize that it places a heavy constraint on both theories of concept possession and theories of concept individuation. 
If you accept compositionality, then you are required to say that whatever the concept DOG is that occurs in the thought that Rover is a dog, that very same concept DOG also occurs in the thought that Rover is a brown dog; and, whatever the concept BROWN is that occurs in the thought that Rover is brown, the very same concept BROWN also occurs in the thought that Rover is a brown dog. It's on these assumptions that compositionality explains how being able to think that Rover is brown and that Rover is a dog is linked to being able to think that Rover is a brown dog. Compositionality requires, in effect, that constituent concepts must be insensitive to their host; a constituent concept contributes the same content to all the complex representations it occurs in. And compositionality further requires that the content of a complex representation is exhausted by the contributions that its constituents make. Whatever the content of the concept of BROWN DOG may be, it must be completely determined by the content of the constituent concepts BROWN and DOG, together with the combinatorial apparatus that sticks these constituents together; if this were not the case, your grasp of the concepts BROWN and DOG wouldn't explain your grasp of the concept BROWN DOG.
In short, when complex concepts are compositional, the whole must not be more than the sum of its parts; otherwise compositionality won't explain productivity and systematicity. And if compositionality doesn't, nothing will. If this account of compositionality strikes you as a bit austere, it may be some comfort that the systematicity and productivity of thought are compatible with compositionality failing in any finite number of cases. It allows, for example, that finitely many thoughts (hence, a fortiori, finitely many linguistic expressions) are idiomatic or metaphoric, so long as there are infinitely many that are neither. We can now see why, though concepts might have turned out to be definitions, they couldn't possibly turn out to be stereotypes or prototypes. Concepts do contribute their defining properties to the complexes of which they are constituents, and the defining properties of complex concepts are exhaustively determined by the defining properties that the constituents contribute. Since bachelors are, by definition, unmarried men, tall bachelors are, by the same definition, tall unmarried men; and very tall bachelors are very tall unmarried men, and very tall bachelors from Hoboken are very tall unmarried men from Hoboken . . . and so on. Correspondingly, there is nothing more to the definition of "very tall bachelor from Hoboken" than very tall unmarried man from Hoboken; that is, there is nothing more to the definition of the phrase than what the definitions of its constituents contribute. So, then, if concepts were definitions, we could see how thought could be compositional, and hence productive and systematic. Concepts aren't definitions, of course. It's just that, from the present perspective, it's rather a pity that they're not. For stereotypes, alas, don't work the way that definitions do. Stereotypes aren't compositional. Thus, "ADJECTIVE X" can be a perfectly good concept even if there is no adjective X stereotype. 
And even if there are stereotypic adjective Xs, they don't have to be stereotypic adjectives or stereotypic Xs. I doubt, for example, that there is a stereotype of very tall men from Hoboken; but, even if there were, there is no reason to suppose that it would be either a stereotype for tall men, or a stereotype for men from Hoboken, or a stereotype for men. On the contrary: often enough, the adjective in "ADJECTIVE X" is there precisely to mark a way that adjective Xs depart from stereotypic Xs. Fitzgerald made this point about stereotypes to Hemingway when he said, "The rich are different from the rest of us." Hemingway replied by making the corresponding point about definitions: "Yes", he said, "they have more money". In fact, this observation about the uncompositionality of stereotypes generalizes in a way that seems to me badly to undermine the whole pragmatist program of identifying concept possession with inferential dispositions. I've claimed that knowing what is typical of adjectives and what is typical of Xs doesn't, in the general case, tell you what is typical of adjective Xs. The reason it doesn't is perfectly clear: though some of your beliefs about adjective Xs are compositionally inherited from your beliefs about adjectives, and some are compositionally inherited from your beliefs about Xs, some are beliefs that you have acquired about adjective Xs as such, and these aren't compositional at all. The same applies, of course, to the inferences that your beliefs about adjective Xs dispose you to draw. Some of the inferences you are prepared to make about green apples follow just from their being green and from their being apples. That is to say: they derive entirely from the constituency and structure of your GREEN APPLE concept. But others depend on information (or misinformation) that you have picked up about green apples as such: that green apples go well in apple pie; that they are likely to taste sour; that there are kinds of green apples that you'd best not eat uncooked, and so forth. Patently, these inferences are not definitional and not compositional; they are not ones that GREEN APPLE inherits from its constituents. They belong to what you know about green apples, not to what you know about the corresponding words or concepts. You learned that "green apple" means green and apple when you learned English at your mother's knee. But probably you learned that green apples mean apple pies from the likes of Julia Child. The moral is this: the content of complex concepts has to be compositionally determined, so whatever about concepts is not compositionally determined is therefore not their content. But, as we've just been seeing, the inferential role of a concept is not, in general, determined by its structure together with the inferential roles of its constituents. That is, the inferential roles of concepts are not, in general, compositional; only defining inferences are. This puts your paradigmatic cognitive scientist in something of a pickle. On the one hand, he has (rightly, I think) rejected the idea that concepts are definitions. 
On the other hand, he cleaves (wrongly, I think) to the idea that having concepts is having certain inferential dispositions. But, on the third hand (as it were), only defining inferences are compositional; so if there are no definitions, then having concepts can't be having inferential capacities. I think that is very close to being a proof that the pragmatist notion of what it is to have a concept must be false. This line of argument was first set out in Fodor and Lepore (1992). Philosophical reaction has been mostly that if the price of the pragmatist account of concepts is reviving the notion that there are analytic/definitional inferences, then there must indeed be analytic/definitional inferences. My own view is that cognitive science is right about concepts not being definitions, and that it's the analysis of having concepts in terms of drawing inferences that is mistaken. Either way, it seems clear that the current situation is unstable. Something's gotta give. I return briefly to my enumeration of the varieties of pragmatist theories of concept possession. It should now seem unsurprising that none of them work. In light of the issues about compositionality that we've just discussed, it appears there are principled reasons why none of them could.
2.5. The "theory theory" of concepts (and the problem of holism) Pragmatists think that having a concept is having certain epistemic capacities; centrally, it's having the capacity to draw certain inferences. We've had trouble figuring out which inferences constitute which concepts; well, maybe that's because we haven't taken the epistemic bit sufficiently seriously. Concepts are typically parts of beliefs; but they are also, in a different sense of "part", typically parts of theories. This is clearly true of sophisticated concepts like ELECTRON, but perhaps it's always true. Even everyday concepts like HAND or TREE or TOOTHBRUSH figure in complex, largely inarticulate knowledge structures. To know about hands is to know, inter alia, about arms and fingers; to know about toothbrushes is, inter alia, to know about teeth and the brushing of them. Perhaps, then, concepts are just abstractions from such formal and informal knowledge structures. On this view, to have the concept ELECTRON is to know what physics has to say about electrons; and to have the concept TOOTHBRUSH is to know what dental folklore has to say about teeth. Here are some passages in which the developmental cognitive psychologist Susan Carey (1985) discusses the approach to concepts that she favors: "... [young] children represent only a few theory-like cognitive structures, in which their notions of causality are embedded and in terms of which their deep ontological commitments are explicated. Cognitive development consists, in part, in the emergence of new theories out of these older ones, with the concomitant reconstructing of the ontologically important concepts and emergence of new explanatory notions" (p. 14); "... successive theories differ in three related ways: in the domain of phenomena accounted for, the nature of explanations deemed acceptable, and even in the individual concepts at the center of each system. 
Change of one kind cannot be understood without reference to the changes of the other kinds" (pp. 4-5). The last two sentences are quoted from Carey's discussion of theory shifts in the history of science; her proposal is, in effect, that these are paradigms for conceptual changes in ontogeny. Cognitive science is where philosophy goes when it dies. The version of pragmatism according to which concepts are abstractions from knowledge structures corresponds exactly to the version of positivism according to which terms like "electron" are defined implicitly by reference to the theories they occur in. Both fail, and for the same reasons. Suppose you have a theory about electrons (viz. that they are X) and I have a different theory about electrons (viz. that they are Y). And suppose, in both cases, that our use of the term "electron" is implicitly defined by the theories we espouse. Well, the "theory theory" says that you have an essentially different concept of electrons from mine if (and only if?) you have an essentially different theory of electrons from mine. The problem of how to individuate concepts thus reduces to the problem of how to individuate theories, according to this view.
But, of course, nobody knows how to individuate theories. Roughly speaking, theories are bundles of inferences, just as concepts are according to the pragmatist treatment. The problem about which inferences constitute which concepts therefore has an exact analogue in the problem of which inferences constitute which theories. Unsurprisingly, these problems are equally intractable. Indeed, according to the pragmatist view, they are interdefined. Theories are essentially different if they exploit essentially different concepts; concepts are essentially different if they are exploited by essentially different theories. It's hard to believe it matters much which of these shells you keep the pea under. One thing does seem clear: if your way out of the shell game is to say that a concept is constituted by the whole of the theory it belongs to, you will pay the price of extravagant paradox. For example: it turns out that you and I can't disagree about dogs, or electrons, or toothbrushes, since we have no common conceptual apparatus in which to couch the disagreement. You utter "Some dogs have tails." "No dogs have tails," I reply. We seem to be contradicting one another, but in fact we're not. Since taillessness is part of my theory of dogs, it is also part of my concept DOG, according to the present, holist account of concept individuation. Since you and I have different concepts of dogs, we mean different things when we say "dog". So the disagreement between us is, as comfortable muddleheads like to put it, "just semantic". You might have thought that our disagreement was about the facts and that you could refute what I said by producing a dog with a tail. But it wasn't and you can't, so don't bother trying; you have your idea of dogs and I have mine. (What, one wonders, makes them both ideas of dogs?) First the pragmatist theory of concepts, then the theory theory of concepts, then holism, then relativism. So it goes. Or so, at least, it's often gone. 
I want to emphasize two caveats. The first is that I'm not accusing Carey of concept holism, still less of the slide from concept holism to relativism. Carey thinks that only the "central" principles of a theory individuate its concepts. The trouble is that she has no account of centrality, and the question "which of the inferences a theory licenses are central?" sounds suspiciously similar to the question "which of the inferences that a concept licenses are constitutive?" Carey cites with approval Kuhn's famous distinction between theory changes that amount to paradigm shifts and those that don't (Kuhn, 1962). If you have caught onto how this game is played, you won't be surprised to hear that nobody knows how to individuate paradigms either. Where is this buck going to stop? My second caveat is that holism about the acquisition of beliefs and about the confirmation of theories might well both be true even if holism about the individuation of concepts is, as I believe, hopeless. There is no contradiction between Quine's famous dictum that it's only as a totality that our beliefs "face the tribunal of experience", and Hume's refusal to construe the content of one's concepts as being determined by the character of one's theoretical commitments.
There is, to be sure, a deep, deep problem about how to get a theory of confirmation and belief fixation if you are an atomist about concepts. But there is also a deep, deep problem about how to get a theory of confirmation and belief fixation if you are not an atomist about concepts. So far as I know, there's no reason to suppose that the first of these problems is worse than the second. So much for caveats. It's worth noticing that the holistic account of concepts at which we've now dead-ended is diametrically opposite to the classical view that we started with. We saw that, for the likes of Hume, any concept could become associated to any other. This was a way of saying that the identity of a concept is independent of the theories one holds about the things that fall under it; it's independent, to put it in contemporary terms, of the concept's inferential role. In classical accounts, concepts are individuated by what they are concepts of, and not by what theories they belong to. Hume was thus a radical atomist just where contemporary cognitive scientists are tempted to be radically holist. In this respect, I think that Hume was closer to the truth than we are. Here's how the discussion has gone so far: modern representational theories of mind are devoted to the pragmatist idea that having concepts is having epistemic capacities. But not just sorting capacities, since sorting is itself relativized to concepts. Maybe, then, inferential capacities as well? So be it, but which inferential capacities? Well, at a minimum, inferential capacities that respect the compositionality of mental representations. Defining inferences are candidates, since they do respect the compositionality of mental representations. Or, rather, they would if there were any definitions, but there aren't any definitions to speak of. Statistical inferences aren't candidates because they aren't compositional. It follows that concepts can't be stereotypes. 
The "theory theory" merely begs the problem it is meant to solve since the individuation of theories presupposes the individuation of the concepts they contain. Holism would be a godsend and the perfect way out except that it's preposterous on the face of it. What's left, then, for a pragmatist to turn to? I suspect, in fact, that there is nothing left for a pragmatist to turn to and that our cognitive science is in deep trouble. Not that there aren't mental representations, or that mental representations aren't made of concepts. The problem is, rather, that Hume was right: concepts aren't individuated by the roles that they play in inferences, or, indeed, by their roles in any other mental processes. If, by stipulation, semantics is about what constitutes concepts and psychology is about the nature of mental processes, then the view I'm recommending is that semantics isn't part of psychology. If semantics isn't part of psychology, you don't need to have a sophisticated theory of mental processes in order to get it right about what concepts are. Hume, for example, did get it right about what concepts are, even though his theory of mental processes was associationistic and hence hopelessly primitive. Concepts are the constituents of thoughts; as such, they're the most elementary mental
objects that have both causal and representational properties. Since, however, concepts are individuated by their representational and not by their causal properties, all that has to be specified in order to identify a concept is what it is the concept of. The whole story about the individuation of the concept DOG is that it's the concept that represents dogs, as previously remarked. But if "What individuates concepts?" is easy, that's because it's the wrong question, according to the present view. The right questions are: "How do mental representations represent?" and "How are we to reconcile atomism about the individuation of concepts with the holism of such key cognitive processes as inductive inference and the fixation of belief?" Pretty much all we know about the first question is that here Hume was, for once, wrong; mental representation doesn't reduce to mental imaging. What we know about the second question is, as far as I can tell, pretty nearly nothing at all. The project of constructing a representational theory of the mind is among the most interesting that empirical science has ever proposed. But I'm afraid we've gone about it all wrong. At the very end of Portnoy's Complaint, the client's two hundred pages of tortured, non-directive self-analysis come to an end. In the last sentence of the book, the psychiatrist finally speaks: "So [said the doctor]. Now vee may perhaps to begin. Yes?"
References
Bruner, J., Goodnow, J., & Austin, G. (1956). A Study of Thinking. New York: Wiley.
Carey, S. (1985). Conceptual Change in Childhood. Cambridge, MA: MIT Press.
Dennett, D. (1978). Skinner skinned. In Brainstorms. Cambridge, MA: MIT Press.
Dewey, J. (1958). Experience and Nature. New York: Dover Publications.
Fodor, J. (1990). A Theory of Content and Other Essays. Cambridge, MA: MIT Press.
Fodor, J., & Lepore, E. (1991). Why meaning (probably) isn't conceptual role. Mind and Language, 6, 328-343.
Fodor, J., & Lepore, E. (1992). Holism: A Shopper's Guide. Oxford: Blackwell.
Fodor, J., & McLaughlin, B. (1990). Connectionism and the problem of systematicity: why Smolensky's solution doesn't work. Cognition, 35, 183-204.
Fodor, J., & Pylyshyn, Z. (1981). How direct is visual perception? Some reflections on Gibson's "ecological approach". Cognition, 9, 139-196.
Fodor, J., & Pylyshyn, Z. (1988). Connectionism and cognitive architecture: a critical analysis. Cognition, 28, 3-71.
Kendler, H. (1952). "What is learned?" A theoretical blind alley. Psychological Review, 59, 269-277.
Kuhn, T. (1962). The Structure of Scientific Revolutions. Chicago: University of Chicago Press.
McClelland, J., & Rumelhart, D. (1986). A distributed model of human learning and memory. In J. McClelland & D. Rumelhart (Eds.), Parallel Distributed Processing (Vol. 2). Cambridge, MA: MIT Press.
Peacocke, C. (1992). A Study of Concepts. Cambridge, MA: MIT Press.
Smolensky, P. (1988). On the proper treatment of connectionism. Behavioral and Brain Sciences, 11, 1-23.
Woods, W. (1975). What's in a link? In D. Bobrow & A. Collins (Eds.), Representation and Understanding. New York: Academic Press.
8
Young children's naive theory of biology
Giyoo Hatano*,a, Kayoko Inagakib
a Faculty of Liberal Arts, Dokkyo University, Soka, Saitama 340, Japan
b School of Education, Chiba University, Chiba 263, Japan
Abstract This article aimed at investigating the nature of young children's naive theory of biology by reviewing a large number of studies conducted in our laboratories. More specifically, we tried to answer the following five critical questions. What components does young children's knowledge system for biological phenomena (or naive biology) have? What functions does it have in children's lives? How is it acquired in ontogenesis? How does its early version change as children grow older? Is it universal across cultures and through history? We propose that young children's biological knowledge system has at least three components, that is, knowledge needed to specify the target objects of biology, ways of inferring attributes or behaviors of biological kinds, and a non-intentional causal explanatory framework, and that these three constitute a form of biology, which is adaptive in children's lives. We also claim that the core of naive biology is acquired based on specific cognitive constraints as well as the general mechanism of personification and the resultant vitalistic causality, but it is differently instantiated and elaborated through activity-based experiences in the surrounding culture.
Introduction A growing number of cognitive developmentalists have come to agree that young children possess "theories" about selected aspects of the world (Wellman & Gelman, 1992). This conceptualization is a distinct departure from the Piagetian position, which assumed young children to be preoperational and thus
* Corresponding author.
incapable of offering more or less plausible explanations in any domain, because the term "theories" means coherent bodies of knowledge that involve causal explanatory understanding. How similar young children's theories are to the theories scientists have is still an open question, but the former are certainly more than a collection of facts and/or procedures for obtaining desired results (Kuhn, 1989). An important qualification here is "selected aspects of". In other words, young children are assumed to possess theories only in a few selected domains, where innate or early cognitive constraints work. Carey (1987) suggests that there are a dozen or so such domains. It is generally agreed that naive physics and naive psychology are included among them. What else? Wellman and Gelman (1992) take biology as the third domain. As to whether young children have acquired a form of biology, however, there has been a debate in recent years. On one hand, Carey (1985) claimed that children before around age 10 make predictions and explanations for biological phenomena based on intuitive psychology (i.e., intentional causality). According to her, young children lack the mind-body distinction; more specifically, they recognize neither that our bodily functions are independent of our intentions nor that the biological processes which produce growth or death are autonomous. On the other hand, a number of recent studies have suggested that children possess biological knowledge at much earlier ages than Carey claimed. Some developmentalists (e.g., Hatano & Inagaki, 1987) have asserted that the differentiation between psychology and biology occurs, if it does, much earlier than Carey (1985) assumed. Others have proposed that biological phenomena are conceptualized differently from other phenomena from the beginning (e.g., Keil, 1992). 
A few other candidate theories young children may possess are a theory of matter (e.g., Smith, Carey, & Wiser, 1985), a theory of astronomy (e.g., Vosniadou & Brewer, 1992), and a theory of society (e.g., Furth, 1980). Nonetheless, none of these has been widely accepted as an important domain, nor researched extensively, at least compared with the "big three". Whichever aspects of the world young children have theories about, exact characterizations of these theories require further studies. Among others, the following questions seem critical. What components does each theory have? What functions does it have in children's lives? How is it acquired in ontogenesis? How does its early version change as children grow older? Is it universal across cultures and through history? In what follows, we would like to offer our tentative answers to these questions as they apply to naive biology, based on a large amount of data collected by our associates and ourselves. Because of the limited space available, we will refer to studies conducted in other laboratories only when they are highly relevant.
Components of young children's biological knowledge We are convinced that the body of knowledge which young children possess about biological phenomena (e.g., behavior of animals and plants needed for individual survival; bodily processes; reproduction and inheritance of properties to offspring) has at least three components, and believe that these three constitute a naive biology (Inagaki, 1993b). The first is knowledge enabling one to specify objects to which biology is applicable; in other words, knowledge about the living-non-living distinction, and also about the mind-body distinction. The second is a mode of inference which can produce consistent and reasonable predictions for attributes or behaviors of biological kinds. The third is a non-intentional causal explanatory framework for behaviors needed for individual survival and bodily processes. These components correspond respectively to the three features that Wellman (1990) lists in characterizing framework theories: ontological distinctions, coherence, and a causal-explanatory framework.
Animate-inanimate and mind-body distinctions An increasing number of recent studies have revealed that young children have the animate-inanimate distinction. More specifically, preschool children can distinguish animals from inanimate objects by attending to some salient distinctive features, for example, animals' ability to perform autonomous movements (e.g., Gelman, 1990). Though only a small number of studies have dealt with plants as living things, they have also indicated that young children recognize plants as distinct from non-living things in some respects. For example, children before age 6 distinguish plants and animals from non-living things in terms of growth, that is, changes in size as time goes by (Inagaki, 1993a). This study is an extension of that of Rosengren, Gelman, Kalish, and McCormick (1991), who investigated children's differentiation between animals and artifacts in terms of growth. Children of ages 4-6 were presented with a picture of a flower's bud (or a new artifact or a young animal) as the standard stimulus picture, and were then asked to choose which of two other pictures would represent the same plant (or artifact or animal) a few hours later and several months/years later. The children showed "invariance" patterns (i.e., no change in size both a few hours later and several months/years later for all the items) for artifacts, but "growth" patterns (i.e., changes in size at either or both times) for plants and animals. Backscheider, Shatz, and Gelman (1993) also reported that 4-year-olds recognize that, when damaged, both animals and plants can regrow, whereas artifacts can be mended only by human intervention. It is clear
that young children can distinguish typical animals and plants from typical non-living things in some attributes. That young children treat inanimate things differently from animals and plants is not sufficient for claiming that they have an integrated category of living things. Proof that they are aware of the commonalities between animals and plants is needed. By asking 5- and 6-year-olds whether a few examples of plants or of inanimate things would show phenomena similar to those we observe for animals, Hatano and Inagaki (1994) found that a great majority of them recognized commonalities between animals and plants in terms of feeding and growing in size over time, and thus distinguished them from inanimate things. Moreover, many of them justified their responses by mapping food for animals to water for plants, saying, for example, "A tulip or a pine tree dies if we do not water it". For growth, about one-third of them cited the phenomenon of plants getting bigger from a seed or a bud, and one-fifth did so by referring to watering as corresponding to feeding as a condition for growth. Based on this and other related studies, we can conclude that children are able to acquire the living-non-living distinction by age 6. Young children can also distinguish between the body and the mind, in other words, biological phenomena from social or psychological ones, both of which are observed among a subset of animate things. Springer and Keil (1989) reported that children of ages 4-7 consider those features leading to biologically functional consequences for animals to be inherited, while other sorts of features, such as those leading to social or psychological consequences, are not. Siegal (1988) indicated that children of ages 4-8 recognize that illness is caused not by moral but by medical factors; they have substantial knowledge of contagion and contamination as causes of illness. 
Inagaki and Hatano (1987) revealed that children of 5-6 years of age recognize that the growth of living things is beyond their intentional control; for example, a baby rabbit grows not because its owner wants it to but because it takes food. These findings all suggest that young children recognize the autonomous nature of biological processes. An even more systematic study of the mind-body distinction has recently been reported by Inagaki and Hatano (1993, Experiment 1). Interviewing children with a variety of questions, they showed that even children aged 4 and 5 already recognize the differential modifiability of characteristics: some are unmodifiable by any means (e.g., gender), some are bodily and modifiable by exercise or diet (e.g., running speed), and some are mental and modifiable by will or monitoring (e.g., forgetfulness). These children also recognize the independence of the activities of bodily organs (e.g., heartbeat) from a person's intention. Another important piece of evidence for this distinction is young children's use of non-intentional (or vitalistic) causality for bodily phenomena but not for social-psychological ones; this point is discussed in a later section.
Young children's naive theory of biology
157
Personification as a means to make educated guesses about living things

When children do not have enough knowledge about a target animate object, they can make an educated guess by using personification, or the person analogy, in a constrained way. Young children are so familiar with humans that they can use their knowledge about humans as a source for analogically attributing properties to less familiar animate objects or for predicting the reactions of such objects to novel situations, but they do not use knowledge about humans indiscriminately. Our studies (Inagaki & Hatano, 1987, 1991) confirmed such a process of constrained personification. In one of the studies (Inagaki & Hatano, 1991), we asked children of age 6 to predict a grasshopper's or tulip's reactions to three types of novel situations: (a) similar situations, in which a human being and the target object would behave similarly; (b) contradictory situations, in which the target object and a human would react differently, and predictions based on the person analogy contradict children's specific knowledge about the target; and (c) compatible situations, in which the object and a human being would in fact react differently, but predictions obtained through the person analogy do not seem implausible to the children. Example questions for each situation are as follows: "We usually feed a grasshopper once or twice a day when we raise it at home. What will happen with it if we feed it 10 times a day?" (similar situation); "Suppose a woman buys a grasshopper. On her way home she drops in at a store with this caged grasshopper. After shopping she is about to leave the store without the grasshopper. What will the grasshopper do?" (contradictory); "Does a grasshopper feel something if the person who has been taking care of it daily dies? [If the subject's answer is 'Yes'] How does it feel?" (compatible).
Results indicated that for the similar situations many of the children generated reasonable predictions, with some explanations, by using person analogies, whereas they did not give personified predictions for the contradictory situations. As expected, they produced unreasonable predictions for the compatible situations, where they were unable to check the plausibility of the products of person analogies because of the lack of adequate knowledge (e.g., about the relation between the brain and feeling). The following are example responses of a child aged 6 years 3 months: for the "too-much-eating" question of the "similar" situation, "The grasshopper will be dizzy and die, 'cause the grasshopper, though it is an insect, is like a person (in this point)"; for the "left-behind" question of the "contradictory" situation, "The grasshopper will be picked up by someone, 'cause it cannot open the cage." ["If someone does not pick up the cage, what will the grasshopper do?"] "The grasshopper will just stay there." ["Why doesn't the grasshopper do anything? Why does it just stay there?"] "It cannot (go out of the cage and) walk, unlike a
person"; for the "caretaker's death" question of the "compatible" situation, "The grasshopper will feel unhappy." This illustrates well how this child applied knowledge about humans differentially according to the type of situation. Generally speaking, children generate reasonable predictions by using person analogies in a constrained way, and the person analogy may be misleading only where they lack the biological knowledge needed to check analogy-based predictions.
Non-intentional causality

The experimental evidence presented so far indicates that young children have a coherently organized body of knowledge applicable to living things. This body of knowledge can be called a theory only when a causal explanatory framework is included in it. This concerns the third component of their biological knowledge. Here the type of causality, intentional or non-intentional, determines the nature of the theory. Carey (1985) claimed, as mentioned above, that children before age 10 base their explanations of biological phenomena on an intentional causality, because they are ignorant of the physiological mechanisms involved. In contrast, we claim that young children before schooling can apply a non-intentional causality in explaining biological phenomena, and thus have a form of biology that is differentiated from psychology. Young children cannot give articulated mechanical explanations when asked to explain biological phenomena (e.g., bodily processes mediating input-output relations) in an open-ended interview (e.g., Gellert, 1962); sometimes they try to explain them in the language of person-intentional causality (Carey, 1985). These findings apparently support the claim that young children do not yet have biology as an autonomous domain. It seems inevitable to accept this claim so long as we assume only two types of causality, intentional versus mechanical, as represented by Carey (1985). However, we propose an intermediate form of causality between these two. Children may be unwilling to use intentional causality for biological phenomena but not yet able to use mechanical causality. Such children may rely on this intermediate form of causality, which might be called "vitalistic causality". Intentional causality means that a person's intention causes the target phenomenon, whereas mechanical causality means that physiological mechanisms cause the target phenomenon.
For instance, a specific bodily system enables a person, irrespective of his or her intention, to exchange substances with the environment or to carry them to and from bodily parts. In contrast, vitalistic causality indicates that the target phenomenon is caused by the activity of an internal organ, which has, like a living thing, "agency" (i.e., a tendency to initiate and sustain behaviors). The activity is often described as a transmission or exchange of the "vital force",
which can be conceptualized as unspecified substance, energy, or information. Vitalistic causality is clearly different from person-intentional causality in the sense that the organ's activities inducing the phenomenon are independent of the intention of the person who possesses the organ. In Inagaki and Hatano (1990), when given novel questions about bodily processes, such as what the halt of blood circulation would cause, some of the children of ages 5-8 gave explanations referring to something like vital force as a mediator; for example, one child said, "If blood does not come to the hands, they will die, because the blood does not carry energies to them"; and another child, "We wouldn't be able to move our hands, because energies fade away if blood does not come there." However, as the number of these children was small, we did another experiment in which children were asked to choose a plausible explanation from among those presented. We (Inagaki & Hatano, 1993, Experiment 2) predicted that even if young children could not apply mechanical causality, and even if they could not generate vitalistic causal explanations for themselves, they would prefer vitalistic explanations to intentional ones for bodily processes when asked to choose among several possibilities. We asked 6-year-olds, 8-year-olds, and college students to choose one of three possible explanations for each of six biological phenomena, such as blood circulation and breathing. The three explanations represented intentional, vitalistic and mechanical causality, respectively. An example question (on breathing) with its three alternative explanations was as follows: "Why do we take in air? (a) Because we want to feel good [intentional]; (b) Because our chest takes in vital power from the air [vitalistic]; (c) Because the lungs take in oxygen and change it into useless carbon dioxide [mechanical]."
The 6-year-olds chose vitalistic explanations as most plausible most often: they chose them 54% of the time. With increasing age the subjects came to choose mechanical explanations most often. It should be noted that the 6-year-olds applied non-intentional (vitalistic plus mechanical) causalities 75% of the time, though they were more apt to adopt intentional causality than the 8-year-olds or adults. This vitalistic causality is probably derived from a general mechanism of personification. One who has no means of observing the opaque inside or details of a target object often tries to understand it in a global fashion, by assuming it or its components to be human-like (Ohmori, 1985). Hence, young children try to understand the workings of internal bodily organs by regarding them as human-like (but non-communicative) agents, and by assigning their activities global life-sustaining characters, which results in vitalistic causality for bodily processes. We can see a similar mode of explanation in the Japanese endogenous science before the Meiji restoration (and the beginning of Japan's rapid modernization),
which had evolved with medicine and agriculture as its core (Hatano & Inagaki, 1987). Young children seem to rely on vitalistic causality only for biological phenomena. They seldom attribute social-psychological behavior, which is optional and not needed for survival, to the agency of a bodily organ or part, as revealed by Inagaki and Hatano (1993, Experiments 3 and 3a). The following is an example question for such behavior used in the study: "When a pretty girl entered the room, Taro came near her. Why did he do so?" Eighty per cent of the 6-year-olds chose (a) "Because Taro wanted to become a friend of hers" [intentional explanation], whereas only 20% opted for (b) "Because Taro's heart urged him to go near her" [vitalistic]. For biological phenomenon questions - almost the same as those used in Experiment 2 of Inagaki and Hatano (1993) except that the mechanical causal explanation was excluded - they tended to choose vitalistic explanations rather than intentional ones. What, then, is the relationship between the vitalistic explanation for biological phenomena and the teleological-functional explanation for biological properties (Keil, 1992)? Both are certainly in between the intentional and the mechanical; both seem to afford valid perspectives on the biological world. One interpretation is that they are essentially the same idea with different emphases: the teleological concerns more the why or the cause, whereas the vitalistic is concerned more with the how or the process. Another interpretation is that, because the vitalistic explanation refers to the activity of the responsible organ or bodily part (implicitly for sustaining life), it is closer to mechanical causality than is the teleological one, which refers only to the necessity. In any case, it will be intriguing to examine these characterizations of young children's "biological" explanations in concrete experimental studies.
In sum, we can conclude from the above findings that children as young as 6 years of age possess three essential components of biology. In other words, contrary to Carey (1985), children before schooling have acquired a form of biology differentiated from psychology.
Functions of naive biology

The personifying and vitalistic biology that young children have is adaptive in nature in their everyday life. In other words, we believe that naive biology is formed, maintained, and elaborated because it is functional. First, it is useful in everyday biological problem solving, for example, in making predictions about the reactions of familiar animate entities to novel situations, and about the properties and behaviors of unfamiliar entities. This is to be expected, because naive biology includes constrained personification as a general mode of inference for biological phenomena and about biological kinds. Since a human being is a kind of living
entity, the use of the person analogy can often produce reasonable, though not necessarily accurate, predictions, especially for advanced animals. The person analogies in young children's biology are especially useful because they are constrained both by the perceived similarity of the target objects to humans and by specific knowledge regarding the target objects, as revealed by Inagaki and Hatano (1987, 1991). Second, naive biology also enables young children to make sense of biological phenomena they observe in their daily lives. Our favorite example is a 5-year-old girl's statement reported by Motoyoshi (1979). Based on accumulated experience with raising flowers, and relying on her naive biology, she concluded: "Flowers are like people. If flowers eat nothing (are not watered), they will fall down of hunger. If they eat too much (are watered too often), they will be taken ill." In this sense, young children's naive biology constitutes what Keil (1992) calls a mode of construal. Hatano and Inagaki (1991b) presented to 6-year-olds three bodily phenomena of a squirrel that can also be observed in humans (being constipated, having diarrhea, and getting older and weaker), and asked them to guess a cause for each phenomenon. About three-quarters of them on average could offer some reasonable causes, and could also judge the plausibility of causes suggested by the experimenter. About half of them explicitly referred to humans at least once in their causal attributions for the squirrel. At the same time, however, some of their expressions strongly suggest that they edited or adapted to this animal the responses obtained by the person analogy (e.g., "A squirrel became weaker because it did not eat chestnuts"). Naive biology provides young children with a conceptual tool for interpreting the bodily phenomena of other animals as well as humans.
Third, naive biology is useful because it helps children learn meaningfully, or even discover, procedures for taking care of animals and plants as well as themselves in everyday life. A global understanding of internal bodily functions is enough for such purposes. Inagaki and Kasetani (1994) examined whether inducing the person analogy, a critical component of naive biological knowledge, would enhance 5- and 6-year-olds' comprehension of procedures for raising a squirrel. The subjects heard a description of the procedures while watching pictures illustrating them. The description included several references to humans in the experimental condition but not in the control condition. For example, about the necessity of giving a variety of food to a squirrel, the experimenter indicated, "You do not eat favorite food only. You eat a variety of food, don't you?" After listening to the description of all the procedures, the children were asked to tell another person how to raise a squirrel. This person asked them questions, for example, "What kind of food might I give a squirrel? Favorite chestnuts only; chestnuts, seeds and vegetables mixed; or ice cream?" They were thus required to choose an alternative and to give the reason. The experimental group children, irrespective of age, more often gave adequate
reasons for their correct choices than the control children, though their superiority in the number of correct choices was significant only for the younger subjects. For instance, one 5-year-old child said, "Don't feed chestnuts only. You must give a squirrel plenty of seeds and carrots, because a person will die if he eats the same kind of food only, and so will it." An anecdotal but impressive example of the discovery of a procedure for coping with trouble with a raised animal is reported by Motoyoshi (1979). When children aged 5 in a day care center observed unusual excretion from a rabbit they were taking care of every day, they inferred that it might be suffering from diarrhea like a person, and after group discussion they produced the idea of making the rabbit take medicine for diarrhea, as a suffering person would. Young children's naive biology is functional partly because its components - pieces of knowledge, the mode of inference, and causality - are promptly and effortlessly retrieved and used to generate more or less plausible ideas. Their personifying and vitalistic biology seems to be triggered almost automatically whenever children come into contact with novel phenomena that they recognize as "biological" (Inagaki, 1990b). Generally speaking, making an educated guess by applying insufficient knowledge is often rewarded in everyday life, both in individual problem solving and in social interaction, so most everyday knowledge is readily used. Children's naive biology is not an exception, we believe. In fact, in our study described above (Inagaki & Hatano, 1987) it was very rare for the children to give no prediction or an "I don't know" answer to our somewhat unusual questions. It should also be noted that naive biological knowledge is seldom applied "mechanically". As mentioned earlier, children constrain their analogies by factual, procedural or conceptual knowledge about the target to generate a reasonable answer.
Acquisition of naive biology

As already mentioned, our experimental data strongly suggest that children as young as 6 years of age have acquired a form of biology. This early acquisition of biology is not surprising from the perspective of human evolution, because it has been essential for our species to have some knowledge about animals and plants as potential foods (Wellman & Gelman, 1992) and also knowledge about our bodily functions and health (Hatano, 1989; Inagaki & Hatano, 1993). When children acquire an autonomous domain of biology is still an open question for us, because we have not examined whether much younger subjects also possess a form of biology. However, we think that the acquisition of biology comes a little later than that of physics or psychology. Infants seldom need biological knowledge, since they neither have to take care of their health nor try to find food themselves. Moreover,
autonomous biology has to deal with entities that have agency (i.e., initiate and maintain activity without external forces) but can hardly communicate with us humans, and thus it has to apply an intermediate form of causality between the intentional and the mechanical. Autonomous biology also requires including animals and plants, which appear so different, in an integrated category of living things. Though there is some evidence that even infants can distinguish objects having a capacity for self-initiated movement from those lacking it (e.g., Golinkoff, Harding, Carlson, & Sexton, 1984), this cannot directly serve as the basis for the living-non-living distinction.
Cognitive bases of naive biology

Whether naive biology gradually emerges out of naive psychology (Carey, 1985) or is a distinct theory or mode of construal from the start (Keil, 1992) is still debatable. It is true that, as Keil argues, preschool children have some understanding of the distinction between the biological and the social-psychological. In Vera and Keil (1988), for example, 4-year-olds' inductions about animals, when given the biological context, resembled those previously found for 7-year-olds who were given the same attribution questions without context; giving the social-psychological context to 4-year-olds did not affect the inductions they made. However, young children may overestimate the controllability of bodily processes by will or intention. In fact, our modified replication study on the controllability of internal bodily functions suggests that 3-year-olds are not sure whether the workings of bodily organs are beyond their control (Inagaki & Suzuki, 1991). Our own speculation about how young children acquire personifying and vitalistic biology through everyday experience is as follows. Children notice through somatosensation that several "events", uncontrolled by their intention, are going on inside the body. Since children cannot see the inside of the body, they try to achieve a "global understanding" by personifying an organ or bodily part. Considering that young children use analogies in a selective, constrained way (Inagaki & Hatano, 1987, 1991; Vosniadou, 1989), it is plausible that they apply the person analogy to bodily organs in that way, too. More specifically, they attribute agency and some related human properties, but not others (e.g., the ability to communicate), to these organs. Through personification they also generalize this global understanding of the body to other living things. A set of specific innate or very early cognitive constraints is probably another important factor in the acquisition of naive biology.
It is likely that even very young children have tendencies to attribute a specific physical reaction to a specific class of events, such as that diarrhea is caused by eating something poisonous. These tendencies enhance not only their rejection of intentional
causality for bodily phenomena but also their construction of more specific beliefs about bodily processes. To sum up, we believe that young children's ability to make inferences about bodily processes, as well as about animals' and plants' properties and behaviors, is based on personification, but this does not mean it is purely psychological, because they understand the mind-body distinction to some extent. However, it is suggested that they sometimes overattribute human mental properties (not the ability to communicate, but working hard, being happy and others, in addition to agency) to bodily organs (Hatano & Inagaki, unpublished study) as well as to less advanced animals, plants or even inanimate objects (e.g., Inagaki & Sugiyama, 1988). In this sense, early naive biology is "psychological", though it is autonomous.
Activity-based experiences

We are willing to admit that, because of the above general mechanism of personification and the resultant vitalistic causality, which "fit nicely with biology" (Keil, 1992, p. 105), and because of specific cognitive constraints, there must be some core elements in naive biology that are shared among individuals within and between cultures, as suggested by Atran (1990). However, we would like to emphasize that this does not mean children's activity-based experiences make no contribution to the acquisition. Some such experiences are also universal in human ways of living, but others may vary and thus produce differently instantiated versions of naive biology. For example, if children are actively engaged in raising animals, they can acquire a rich body of knowledge about them, and can therefore use that body of knowledge, as well as their knowledge about humans, as a source for analogical predictions and explanations for other biological kinds. Our studies have in fact revealed that such an activity may produce a slightly different version of naive biology from the ordinary one. Inagaki (1990a) compared the biological knowledge of kindergarteners who had actively engaged in raising goldfish for an extended period at home with that of children of the same age who had never raised any animal. Although these two groups of children did not differ in factual knowledge about typical animals in general, the goldfish-raisers had much richer procedural, factual and conceptual knowledge about goldfish. More interestingly, the goldfish-raisers used their knowledge about goldfish as a source for analogies in predicting the reactions of an unfamiliar "aquatic" animal (i.e., a frog), one that they had never raised, and produced reasonable predictions, with some explanations, for it.
For example, when asked whether we could keep a baby frog the same size forever, one of the raisers answered, "No, we can't, because a frog will grow bigger as goldfish grew bigger.
My goldfish were small before, but now they are big." It might be added that the goldfish-raisers tended to use person analogies as well as goldfish analogies for the frog; in other words, they could draw on two sources for making analogical predictions. Moreover, in another study (Kondo & Inagaki, 1991; see also Hatano & Inagaki, 1992), goldfish-raising children tended to enlarge their previously narrow conception of animals: goldfish-raisers attributed animal properties shared by humans (e.g., having a heart, excreting) not only to goldfish but also to a majority of animals phylogenetically between humans and goldfish, at a higher rate than non-raisers. This suggests that the experience of raising goldfish modifies young children's preferred mode of biological inference.
Theory changes in biological understanding

So far we have emphasized the strengths of young children's naive biology. What weaknesses does it have? Its weaknesses are obvious even when it is compared with the intuitive biology of lay adults. Let us list some major ones: (a) limited factual knowledge; (b) lack of inferences based on complex, hierarchically organized biological categories; (c) lack of mechanical causality; and (d) lack of some conceptual devices (e.g., "evolution", "photosynthesis"). The use of inferences based on complex, hierarchically organized biological categories and of mechanical causality requires a theory change or conceptual change (i.e., a fundamental restructuring of knowledge), whereas the accumulation of more and more factual knowledge can be achieved by enrichment alone. Whether the acquisition of basic conceptual devices in scientific or school biology is accompanied by a theory change is not beyond dispute, but incorporating them meaningfully into the existing body of knowledge can usually be achieved only with the restructuring of that knowledge. It is expected that, as children grow older, their personifying and vitalistic biology will gradually change toward a truly "non-psychological" (if not scientific) biology by eliminating weaknesses (b) and (c) above, that is, toward a biology that relies on category-based inferences and rejects intentional causal explanations. We assume that this change is almost universal, at least among children growing up in highly technological societies, and that it can occur without systematic instruction in biology, though schooling may have some general facilitative effects on it. Inagaki and Sugiyama (1988) examined how young children's human-centered or "similarity-based" inference changes as they grow older. They gave attribution questions, such as "Does X have a property Y?", to children aged from 4 to 10 and to college students.
Results indicated that there was a progression from 4-year-olds' predominant reliance on similarity-based attribution (attributing
human properties in proportion to the perceived similarity between target objects and humans) to adults' predominant reliance on category-based attribution (attributing by relying on the higher-order category membership of the targets and on category-attribute associations). This shift is induced not only by an increased amount of knowledge but also by the development of metacognitive beliefs that evaluate more highly the usefulness of higher-order categories (Hatano & Inagaki, 1991a; Inagaki, 1989). In contrast to young children's vitalistic, and sometimes even intentional, biological explanations, older children reject intentional explanations for biological phenomena and are inclined to use mechanical causality exclusively. In Experiment 2 of Inagaki and Hatano's (1993) study, the difference between 6-year-olds and 8-year-olds was larger than the difference between 8-year-olds and adults in terms of preference for mechanical explanations and avoidance of intentional ones. These results suggest that young children's biology is qualitatively different from the biology that older children and adults have, and that, in accordance with Carey's claim, a conceptual change in biological understanding occurs between ages 4 and 10. However, contrary to her claim, this change is characterized not as the differentiation of biology from psychology but as a qualitative change within the autonomous domain of biology, because children as young as 6 years of age already possess a form of biology. Another important change may occur as a result of the learning of scientific biology at school. In order to be able to reason "scientifically" in biology one needs to know its basic concepts and principles - major conceptual devices which cannot be acquired without intervention.
For example, one who does not know about the phenomenon of photosynthesis will not be able to understand this difference between animals and plants (i.e., that plants can produce nutriment themselves), and thus may accept the false analogy mapping water for plants to food for animals. We assume that, unlike the first theory change, this change is hard to achieve and thus occurs only among a limited portion of older children or adolescents.
Universality of naive biology

Which aspects of naive biology are universal, and which are not? As suggested by Atran (1990), it may be possible to find the "common sense" or core beliefs shared by all forms of folk biology and even by scientific biology. However, what such core beliefs are is debatable. Much of the research inspired by Piaget has shown parallels among the biological understandings of children in different cultures. The distinction between animals and terrestrial inanimate objects is particularly robust. However,
the biological understanding observed in different cultures is not identical. The most striking of the differences reported thus far concerns Israeli children's ideas about plants. Stavy and Wax (1989) showed that about half of a sample of 6-12-year-olds, when asked to judge the life status of animals, plants and non-living things, classified plants either as non-living things or as falling within a third category: things that are neither living nor non-living. Beliefs about inanimate objects may also differ between cultures. Whereas recent studies conducted in North America indicate that young children seldom attribute life or other living-thing properties to any terrestrial inanimate objects (e.g., Dolgin & Behrend, 1984; Richards & Siegler, 1984), Inagaki and Sugiyama (1988) reported that some Japanese preschoolers extended mental properties even to inanimate objects without movement or function, such as stones. Hatano et al. (1993) tried to differentiate between universal and culturally specific aspects of children's conceptions of life and their understanding of the attributes of living things by comparing kindergarteners and 2nd- and 4th-graders from Israel, Japan and the United States. The children were asked whether two instances each of four object types (people, other animals, plants and inanimate objects) possessed each of 16 attributes, which included life status (being alive), unobservable animal attributes (e.g., has a heart), sensory attributes (e.g., feels pain), and attributes true of all living things (e.g., grows bigger). The results illustrate both similarities and differences across cultures in children's biological understanding. Children in all cultures knew that people, other animals, plants, and inanimate objects are different types of entities with different properties; they were extremely accurate regarding humans, somewhat less accurate regarding other animals and inanimate objects, and least accurate regarding plants.
At the same time, as predicted from cultural analyses, Israeli children were considerably more likely not to attribute to plants properties that are shared by all living things, whereas Japanese children, whose overall accuracy was comparable to that of the Israeli children, were considerably more likely to attribute to inanimate objects properties that are unique to living things. These differences are especially interesting because they suggest that children's naive biology is influenced by beliefs within the culture where they grow up. Consider why Japanese children might be more likely than children in the United States or Israel to view plants or inanimate objects as alive and having attributes of living things. Japanese culture includes a belief that plants are much like human beings. This attitude is represented by the Buddhist idea that even a tree or blade of grass has a mind. In Japanese folk psychology, even inanimate objects are sometimes considered to have minds. For example, it is at least not a silly idea for Japanese to assign life or divinity not only to plants but also to inanimate objects, especially big or old ones. In addition, linguistic factors seem to influence Japanese children's attributional judgements. The kanji (Chinese character) representing the Japanese word for "alive" has a prototypal meaning of "fresh" or "perishable" as well as "alive". Therefore, this kanji can be applied to cake, wine, sauce, and other perishable goods. Similar features of culture and language may account for Israeli children being less apt than American or Japanese children to attribute to plants life status and properties of living things. Stavy and Wax (1989) suggested that within the Israeli culture plants are regarded as very different from humans and other animals in their life status. This cultural attitude parallels that of a biblical passage (Genesis 1:30), well known to Israeli students, indicating that plants were created as food for living things including animals, birds and insects. Adding to, or perhaps reflecting, their cultural beliefs, the Hebrew word for "animal" is very close to that for "living" and "alive". In contrast, the word for "plant" has no obvious relation to such terms (Stavy & Wax, 1989). How culture influences the development of biological understanding has yet to be studied. Parents, schools and mass media may serve to transmit cultural beliefs. For example, Japanese parents may communicate the attitude through their actions toward plants and divine inanimate objects, though they do not usually tell their children this explicitly. Culture may provide children with opportunities to engage in activities that lead them to construct some particular biological understanding, as in the case of children raising goldfish (Hatano & Inagaki, 1992; Inagaki, 1990a).
Postscript

Since Carey (1985), young children's naive biology has been an exciting topic for research in cognitive development. As more and more ambitious researchers have joined in studying it, not only has a richer database been built and finer conceptualizations been offered about this specific issue, but also, through attempts to answer questions like the ones discussed so far in this article, a better understanding has been achieved of fundamental issues in developmental studies of cognition, such as the nature of domains, theories and constraints. Naive biology will probably remain a popular topic for the coming several years, as research questions about it come to be better answered and/or better rephrased. What is urgently needed now is (a) to integrate nativistic and cultural accounts of acquisition and change in naive biology, and (b) to find commonalities and differences between naive biology and other major theories of the world possessed by young children (Hatano, 1990).
References

Atran, S. (1990). Cognitive foundations of natural history: Towards an anthropology of science. Cambridge, UK: Cambridge University Press.
Backscheider, A.G., Shatz, M., & Gelman, S.A. (1993). Preschoolers' ability to distinguish living kinds as a function of regrowth. Child Development, 64, 1242-1257. Carey, S. (1985). Conceptual change in childhood. Cambridge, MA: MIT Press. Carey, S. (1987). Theory change in childhood. In B. Inhelder, D. de Caprona, & A. Cornu-Wells (Eds.), Piaget today (pp. 141-163). Hillsdale, NJ: Erlbaum. Dolgin, K.G., & Behrend, D.A. (1984). Children's knowledge about animates and inanimates. Child Development, 55, 1646-1650. Furth, H.G. (1980). The world of grown-ups: Children's conceptions of society. New York: Elsevier. Gellert, E. (1962). Children's conceptions of the content and functions of the human body. Genetic Psychology Monographs, 65, 291-411. Gelman, R. (1990). First principles organize attention to and learning about relevant data: Number and the animate-inanimate distinction as examples. Cognitive Science, 14, 79-106. Golinkoff, R.M., Harding, C.G., Carlson, V., & Sexton, M.E. (1984). The infant's perception of causal events: The distinction between animate and inanimate objects. In L.L. Lipsitt & C. Rovee-Collier (Eds.), Advances in Infancy Research (Vol. 3, pp. 145-165). Norwood, NJ: Ablex. Hatano, G. (1989). Language is not the only universal knowledge system: A view from "everyday cognition". Dokkyo Studies in Data Processing and Computer Science, 7, 69-76. Hatano, G. (1990). The nature of everyday science: A brief introduction. British Journal of Developmental Psychology, 8, 245-250. Hatano, G., & Inagaki, K. (1987). Everyday biology and school biology: How do they interact? Quarterly Newsletter of the Laboratory of Comparative Human Cognition, 9, 120-128. Hatano, G., & Inagaki, K. (1991a). Learning to trust higher-order categories in biology instruction. Paper presented at the meeting of the American Educational Research Association, Chicago. Hatano, G., & Inagaki, K. (1991b). Young children's causal reasoning through spontaneous personification. 
Paper presented at the 33rd meeting of the Japanese Educational Psychology Association, Nagano [in Japanese]. Hatano, G., & Inagaki, K. (1992). Desituating cognition through the construction of conceptual knowledge. In P. Light & G. Butterworth (Eds.), Context and cognition: Ways of learning and knowing (pp. 115-133). London: Harvester/Wheatsheaf. Hatano, G., & Inagaki, K. (1994). Recognizing commonalities between animals and plants. Paper to be presented at the meeting of the American Educational Research Association, New Orleans. Hatano, G., Siegler, R.S., Richards, D.D., Inagaki, K., Stavy, R., & Wax, N. (1993). The development of biological knowledge: A multi-national study. Cognitive Development, 8, 47-62. Inagaki, K. (1989). Developmental shift in biological inference processes: From similarity-based to category-based attribution. Human Development, 32, 79-87. Inagaki, K. (1990a). The effects of raising animals on children's biological knowledge. British Journal of Developmental Psychology, 8, 119-129. Inagaki, K. (1990b). Young children's use of knowledge in everyday biology. British Journal of Developmental Psychology, 8, 281-288. Inagaki, K. (1993a). Young children's differentiation of plants from non-living things in terms of growth. Paper presented at the 60th meeting of the Society for Research in Child Development, New Orleans. Inagaki, K. (1993b). The nature of young children's naive biology. Paper presented at the symposium, "Children's naive theories of the world", at the 12th meeting of the International Society for the Study of Behavioral Development, Recife, Brazil. Inagaki, K., & Hatano, G. (1987). Young children's spontaneous personification as analogy. Child Development, 58, 1013-1020. Inagaki, K., & Hatano, G. (1990). Development of explanations for bodily functions. Paper presented at the 32nd meeting of the Japanese Educational Psychology Association, Osaka [in Japanese]. Inagaki, K., & Hatano, G. (1991). 
Constrained person analogy in young children's biological inference. Cognitive Development, 6, 219-231. Inagaki, K., & Hatano, G. (1993). Young children's understanding of the mind-body distinction. Child Development, 64, 1534-1549. Inagaki, K., & Kasetani, M. (1994). Effects of hints to use knowledge about humans on young
children's understanding of biological phenomena. Paper to be presented at the 13th meeting of the International Society for the Study of Behavioral Development, Amsterdam. Inagaki, K., & Sugiyama, K. (1988). Attributing human characteristics: Developmental changes in over- and underattribution. Cognitive Development, 3, 55-70. Inagaki, K., & Suzuki, Y. (1991). The understanding of the mind-body distinction in children aged 3 to 5 years. Paper presented at the 33rd meeting of the Japanese Educational Psychology Association, Nagano [in Japanese]. Keil, F.C. (1992). The origins of an autonomous biology. In M.R. Gunnar & M. Maratsos (Eds.), Modularity and constraints in language and cognition. Minnesota Symposia on Child Psychology (Vol. 25, pp. 103-137). Hillsdale, NJ: Erlbaum. Kondo, H., & Inagaki, K. (1991). Effects of raising goldfish on the grasp of common characteristics of animals. Paper presented at the 44th Annual Meeting of Japanese Early Childhood Education and Care Association, Kobe [in Japanese]. Kuhn, D. (1989). Children and adults as intuitive scientists. Psychological Review, 96, 674-689. Massey, C.M., & Gelman, R. (1988). Preschooler's ability to decide whether a photographed unfamiliar object can move itself. Developmental Psychology, 24, 307-317. Motoyoshi, M. (1979). Watashino Seikatuhoikuron [Essays on education for day care children: Emphasizing daily life activities]. Tokyo: Froebel-kan [in Japanese]. Ohmori, S. (1985). Chishikito gakumonno kouzou [The structure of knowledge and science]. Tokyo: Nihon Hoso Shuppan Kyokai [in Japanese]. Richards, D.D., & Siegler, R.S. (1984). The effects of task requirements on children's life judgments. Child Development, 55, 1687-1696. Rosengren, K.S., Gelman, S.A., Kalish, C.W., & McCormick, M. (1991). As time goes by: Children's early understanding of growth. Child Development, 62, 1302-1320. Siegal, M. (1988). Children's knowledge of contagion and contamination as causes of illness.
Child Development, 59, 1353-1359. Smith, C., Carey, S., & Wiser, M. (1985). On differentiation: A case study of the development of the concepts of size, weight, and density. Cognition, 21, 177-237. Springer, K., & Keil, F.C. (1989). On the development of biologically specific beliefs: The case of inheritance. Child Development, 60, 637-648. Stavy, R., & Wax, N. (1989). Children's conceptions of plants as living things. Human Development, 32, 88-94. Vera, A.H., & Keil, F.C. (1988). The development of inductions about biological kinds: The nature of the conceptual base. Paper presented at the 29th meeting of the Psychonomic Society, Chicago. Vosniadou, S. (1989). Analogical reasoning as a mechanism in knowledge acquisition: A developmental perspective. In S. Vosniadou & A. Ortony (Eds.), Similarity and analogical reasoning (pp. 413-469). Cambridge, UK: Cambridge University Press. Vosniadou, S., & Brewer, W. (1992). Mental models of the earth: A study of conceptual change in childhood. Cognitive Psychology, 24, 535-585. Wellman, H.M. (1990). The child's theory of mind. Cambridge, MA: MIT Press. Wellman, H.M., & Gelman, S.A. (1992). Cognitive development: Foundational theories of core domains. Annual Review of Psychology, 43, 337-375.
9 Mental models and probabilistic thinking

Philip N. Johnson-Laird*
Department of Psychology, Princeton University, Green Hall, Princeton, NJ 08544, USA
Abstract

This paper outlines the theory of reasoning based on mental models, and then shows how this theory might be extended to deal with probabilistic thinking. The same explanatory framework accommodates deduction and induction: there are both deductive and inductive inferences that yield probabilistic conclusions. The framework yields a theoretical conception of strength of inference, that is, a theory of what the strength of an inference is objectively: it equals the proportion of possible states of affairs consistent with the premises in which the conclusion is true, that is, the probability that the conclusion is true given that the premises are true. Since there are infinitely many possible states of affairs consistent with any set of premises, the paper then characterizes how individuals estimate the strength of an argument. They construct mental models, which each correspond to an infinite set of possibilities (or, in some cases, a finite set of infinite sets of possibilities). The construction of models is guided by knowledge and beliefs, including lay conceptions of such matters as the "law of large numbers". The paper illustrates how this theory can account for phenomena of probabilistic reasoning.
1. Introduction

Everyone from Aristotle to aboriginals engages in probabilistic thinking, whether or not they know anything of the probability calculus. Someone tells you:

*Fax (609) 258 1113, e-mail
[email protected] The author is grateful to the James S. McDonnell Foundation for support. He thanks Jacques Mehler for soliciting this paper (and for all his work on 50 volumes of Cognition!). He also thanks Ruth Byrne for her help in developing the model theory of deduction, Eldar Shafir for many friendly discussions and arguments about the fundamental nature of probabilistic thinking, and for his critique of the present paper. Malcolm Bauer, Jonathan Evans and Alan Garnham also kindly criticized the paper. All these individuals have tried to correct the erroneous thoughts it embodies. Thanks also to many friends - too numerous to mention - for their work on mental models.
There was a severe frost last night.

and you are likely to infer:

The vines will probably not have survived it.

basing the inference on your knowledge of the effects of frost. These inferences are typical and ubiquitous. They are part of a universal human competence, which does not necessarily depend on any overt mastery of numbers or quantitative measures. Aristotle's notion of probability, for instance, amounts to the following two ideas: a probability is a thing that happens for the most part, and conclusions that state what is probable must be drawn from premises that do the same (see Rhetoric, I, 1357a). Such ideas are crude in comparison to Pascal's conception of probability, but they correspond to the level of competence a psychological theory should initially aspire to explain. Of course many people do encounter the probability calculus at school. Few master it, as a simple test with adults shows:

There are two events, which each have a probability of a half. What is the probability that both occur?

Many people respond: a quarter. The appropriate "therapy" for such errors is to invite the individual first to imagine that A is a coin landing heads and B is the same coin landing tails, that is, p(A & B) = 0, and then to imagine that A is a coin landing heads and B is a coin landing with the date uppermost, where date and head are on the same side, that is, p(A & B) = 0.5. At this point, most people begin to grasp that there is no definite answer to the question above - joint probabilities are a function of the dependence of one event on the other. Cognitive psychologists have discovered many phenomena of probabilistic thinking, principally that individuals do not follow the probability calculus in assessing probabilities, and that they appear to rely on a variety of heuristics in making judgements about probabilities.
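The coin "therapy" above can be made concrete with a short numerical sketch (mine, not the paper's): over equiprobable outcomes, p(A & B) is 0 when the events exclude one another, 0.5 when they always co-occur, and a quarter only when the events are independent.

```python
# A sketch (not from the paper) of why p(A & B) has no definite value
# given only p(A) = p(B) = 0.5: it depends on how the events are related.

def joint_probability(outcomes, a, b):
    """p(A & B) over a finite set of equiprobable outcomes."""
    return sum(1 for o in outcomes if a(o) and b(o)) / len(outcomes)

coin = ["heads", "tails"]

# A = the coin lands heads, B = the same coin lands tails: p(A & B) = 0.
p_exclusive = joint_probability(
    coin, lambda o: o == "heads", lambda o: o == "tails")

# A = heads, B = date uppermost, with the date on the head side:
# the events always co-occur, so p(A & B) = 0.5.
p_coincident = joint_probability(
    coin, lambda o: o == "heads", lambda o: o == "heads")

# Two independent fair coins: only here is the answer a quarter.
pairs = [(x, y) for x in coin for y in coin]
p_independent = joint_probability(
    pairs, lambda o: o[0] == "heads", lambda o: o[1] == "heads")

print(p_exclusive, p_coincident, p_independent)  # 0.0 0.5 0.25
```

Only the third, independent case licenses the popular answer "a quarter".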
A classic demonstration is Tversky and Kahneman's (1983) phenomenon of the "conjunction fallacy", that is, a violation of the elementary principle that p(A & B) ≤ p(B). For example, subjects judge that a woman who is described as 31 years old, liberal and outspoken, is more likely to be a feminist bank teller than a bank teller. Indeed, we are all likely to go wrong in thinking about probabilities: the calculus is a branch of mathematics that few people completely master. Theorists relate probability to induction, and they talk of both inductive inference and inductive argument. The two expressions bring out the point that the informal arguments of everyday life, which occur in conversation, newspaper
editorials and scientific papers, are often based on inductive inferences. The strength of such arguments depends on the relation between the premises and the conclusion. But the nature of this relation is deeply puzzling - so puzzling that many theorists have abandoned logic altogether in favor of other idiosyncratic methods of assessing informal arguments (see, for example, Toulmin, 1958; the movement for "informal logic and critical thinking", e.g. Fisher, 1988; and "neural net" models, e.g. Thagard, 1989). Cognitive psychologists do not know how people make probabilistic inferences: they have yet to develop a computable account of the mental processes underlying such reasoning. For this celebratory volume of Cognition, the editor solicited papers summarizing their authors' contributions to the field. The present paper, however, looks forward as much as it looks back. Its aim is to show how probabilistic thinking could be based on mental models - an approach that is unlikely to surprise assiduous readers of the journal (see, for example, Byrne, 1989; Johnson-Laird & Bara, 1984; Oakhill, Johnson-Laird, & Garnham, 1989). In pursuing the editor's instructions, part 2 of the paper reviews the theory of mental models in a self-contained way. Part 3 outlines a theoretical conception of strength of inference, that is, a theory of what objectively the strength of an inference or argument depends on. This abstract account provides the agenda for what the mind attempts to compute in thinking probabilistically (a theory at the "computational" level; Marr, 1982). However, as we shall see, it is impossible for a finite device, such as the human brain, to carry out a direct assessment of the strength of an inference except in certain limiting cases. Part 4 accordingly describes a theory of how the mind attempts to estimate the strength of inferences (a theory at the "algorithmic" level).
Part 5 shows how this algorithmic theory accounts for phenomena of probabilistic thinking and how it relates to the heuristic approach. Part 6 contrasts the model approach with theories based on rules of inference, and shows how one conception of rules can be reconciled with mental models.
2. Reasoning and mental models

Mental models were originally proposed as a programmatic basis for thinking (Craik, 1943). More recently, the theory was developed to account for verbal comprehension: understanding of discourse leads to a model of the situation under discussion, that is, a representation akin to the result of perceiving or imagining the situation. Such models are derived from syntactically structured expressions in a mental language, which are constructed as sentences are parsed (see Garnham, 1987; Johnson-Laird, 1983). Among the key properties of models is that their structure corresponds to the structure of what they represent (like a visual image), and thus that individual entities are represented just once in a model. The theory of mental models has also been developed to explain deductive
reasoning (Johnson-Laird, 1983; Johnson-Laird & Byrne, 1991). Here, the underlying idea is that reasoning depends on constructing a model (or set of models) based on the premises and general knowledge, formulating a conclusion that is true in the model(s) and that makes explicit something only implicit in the premises, and then checking the validity of the conclusion by searching for alternative models of the premises in which it is false. If there are no such counterexamples, then the conclusion is deductively valid, that is, it must be true given that the premises are true. Thus, the first stage of deduction corresponds to the normal process of verbal comprehension, the second stage corresponds to the normal process of formulating a useful and parsimonious description, and only the third stage is peculiar to reasoning. To characterize any particular domain of deduction, for example reasoning based on temporal relations such as "before", "after" and "while", or sentential connectives such as "not", "if", "and" and "or", it is necessary to account for how the meanings of the relevant terms give rise to models. The general reasoning principles, as outlined above, then automatically apply to the domain. In fact, the appropriate semantics has been outlined for temporal relations, spatial relations, sentential connectives and quantifiers (such as "all", "none" and "some"), and all of these domains can be handled according to five representational principles: (1) Each entity is represented by an individual token in a model, its properties are represented by properties of the token, and the relations between entities are represented by the relations between tokens. Thus, a model of the assertion "The circle is on the right of the triangle" has the following spatial structure:

△     ○
which may be experienced as a visual image, though what matters is not so much the subjective experience as the structure of the model. To the extent that individuals grasp the truth conditions of propositions containing abstract concepts, such as friendship, ownership and justice, they must be able to envisage situations that satisfy them, that is, to form mental models of these situations (see Johnson-Laird, 1983, Ch. 15). (2) Alternative possibilities can be represented by alternative models. Thus, the assertion "Either there is a triangle or there is a circle, but not both" requires two alternative models, which each correspond to separate possibilities:

△
      ○
(3) The negation of atomic propositions can be represented by a propositional annotation. Thus, the assertion "There is not a triangle" is represented by the following sort of model:
¬△
where "¬" is an annotation standing for negation (for a defence of such annotations, see Polk & Newell, 1988; and Johnson-Laird & Byrne, 1991, pp. 130-1). Of course, the nature of the mental symbol corresponding to negation is unknown. The principal purpose of the annotation is to ensure that models are not formed containing both an element and its negation. Thus, the only way to combine the disjunctive models above with the model of "There is not a triangle" is to eliminate the first model, leaving only the second model and its new negated element:

¬△     ○
It follows that there is a circle. As this example shows, deductions can be made without the need for formal rules of inference of the sort postulated in "natural deduction systems" (see, for example, Rips, 1983; Braine, Reiser & Rumain, 1984), such as, in this case, the formal rule for disjunction:

A or B
not A
∴ B

(4) Information can be represented implicitly in order to reduce the load on working memory. An explicit representation makes information immediately available to other processes, whereas an implicit representation encodes the information in a way that is not immediately accessible. Individuals and situations are represented implicitly by a propositional annotation that works in concert with an annotation for what has been represented exhaustively. Thus, the proper initial representation of the disjunction "Either there is a triangle or there is a circle, but not both" indicates that the cases in which triangles occur, and the cases in which circles occur, have been exhaustively represented, as shown by the square brackets:

[△]
        [○]
This set of models implicitly represents the fact that circles cannot occur in the first model and triangles cannot occur in the second model, because circles are exhaustively represented in the second model and triangles are exhaustively represented in the first model. Thus, a completely explicit set of models can be constructed by fleshing out the initial models to produce the set:

  △   ¬○
¬△     ○
where there is no longer any need for square brackets because all the elements in the models have been exhaustively represented. The key to understanding implicit information is accordingly the process of fleshing out models explicitly, which is governed by two principles: first, when an element has been exhaustively represented (as shown by square brackets) in one or more models, add its negation to any other models; second, when a proposition has not been exhaustively represented, then add both it and its negation to separate models formed by fleshing out any model in which it does not occur. (Only the first principle is needed to flesh out the models of the disjunction above.) (5) The epistemic status of a model can be represented by a propositional annotation; for example, a model represents a real possibility, a counterfactual state of affairs, or a deontic state. A model that does not contain propositional annotations, that is, a model based on the first two assumptions above, represents a set of possible states of affairs, which contains an infinite number of possibilities (Barwise, 1993). Hence, the model above of the assertion "The circle is on the right of the triangle" corresponds to infinitely many possibilities; for example, the model is not specific about the distance apart of the two shapes. Any potential counterexample to a conclusion must be consistent with the premises, but the model itself does not enable the premises to be uniquely reconstructed. Hence, in verbal reasoning, there must be an independent record of the premises, which is assumed to be the linguistic representation from which the models are constructed. This record also allows the inferential system to ascertain just which aspects of the world the model represents; for example, a given model may, or may not, represent the distances apart of objects, but inspection of the model alone does not determine whether it represents distance.
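The elimination step described under principle (3) - combining the model of "There is not a triangle" with the fully explicit disjunctive models - is mechanical enough to sketch in code. The encoding below, with signed atoms standing in for elements and their negation annotations, is an illustrative assumption of mine, not the authors' implementation:

```python
# An illustrative encoding (not the authors') of principle (3): a model
# is a set of signed atoms, with ("triangle", False) playing the role of
# the negation annotation written as a negated triangle in the text.

def consistent(model, fact):
    """A model tolerates a fact unless it contains the fact's negation."""
    atom, value = fact
    return (atom, not value) not in model

def combine(models, fact):
    """Add the fact to each model, eliminating contradicted models."""
    return [m | {fact} for m in models if consistent(m, fact)]

# Fully explicit models of "either a triangle or a circle, but not both":
disjunction = [
    {("triangle", True), ("circle", False)},
    {("triangle", False), ("circle", True)},
]

# Combining with "There is not a triangle" eliminates the first model;
# the survivor contains ("circle", True), so it follows that there is a
# circle - with no formal rule of disjunction ever invoked.
result = combine(disjunction, ("triangle", False))
print(result)
```

The conclusion simply falls out of the surviving model, which is the point of the example in the text.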
Experimental evidence bears out the psychological reality of both linguistic representations and mental models (see Johnson-Laird, 1983). Models with propositional annotations compress sets of states of affairs in a still more powerful way: a single model now represents a finite set of alternative sets of situations. This aspect of mental models plays a crucial role in the account of syllogistic reasoning and reasoning with multiple quantifiers. For example, syllogistic premises of the form:

All the A are B
All the B are C

call for one model in which the number of As is small but arbitrary:

[[a]   b]   c
[[a]   b]   c
   ...
As are exhaustively represented in relation to Bs, Bs are exhaustively represented in relation to Cs, Cs are not exhaustively represented, and the three dots designate implicit individuals of some other sort. This single model supports the conclusion:

All the A are C

and there are no counterexamples. The initial model, however, corresponds to eight distinct sets of possibilities depending on how the implicit individuals are fleshed out explicitly. There may, or may not, be individuals of each of the three following sorts:

individuals who are not-a, not-b, not-c
individuals who are not-a, not-b, but c
individuals who are not-a, but b and c

These three binary contrasts accordingly yield eight alternatives, and each of them is consistent with an indefinite number of possibilities depending on the actual numbers of individuals of the different sorts (see also Garnham, 1993). In short, eight distinct potentially infinite sets have been compressed into a single model, which is used for the inference. The theory of reasoning based on mental models makes three principal predictions. First, the greater the number of models that an inference calls for, the harder the task will be. This prediction calls for a theoretical account of the models postulated for a particular domain. Such accounts typically depend on independently motivated psycholinguistic principles; for example, negative assertions bring to mind the affirmative propositions that are denied (Wason, 1965). Second, erroneous conclusions will tend to be consistent with the premises rather than inconsistent with them. Reasoners will err because they construct some of the models of the premises - typically, just one model of them - and overlook other possible models. This prediction can be tested without knowing the detailed models postulated by the theory: it is necessary only to determine whether or not erroneous conclusions are consistent with the premises.
Third, knowledge can influence the process of deductive reasoning: subjects will search more assiduously for alternative models when a putative conclusion is unbelievable than when it is believable. The first two of these predictions have been corroborated experimentally for all the main domains of deduction (for a review, see Johnson-Laird & Byrne, 1991, and for a reply to commentators, see Johnson-Laird & Byrne, 1993). The third prediction has been corroborated in the only domain in which it has so far been tested, namely, syllogistic reasoning (see, for example, Oakhill, Johnson-Laird, & Garnham, 1989). In contrast, theories of deduction
based on formal rules of inference exist only for spatial reasoning and reasoning based on sentential connectives (e.g., Rips, 1983; Braine, Reiser, & Rumain, 1984). Where the model theory and the formal rule theories make opposing predictions, the evidence so far has corroborated the model theory.
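The combinatorics behind the syllogistic example above - three optional sorts of implicit individual yielding eight fleshed-out alternatives, none of which refutes "All the A are C" - can be checked by brute enumeration. The tuple encoding of individuals below is my assumption for illustration, not the theory's notation:

```python
from itertools import chain, combinations

# Each individual is an (a, b, c) triple of truth values - an assumed
# encoding. The initial model contains two exhaustively represented
# individuals who are a, b and c.
core = [(True, True, True), (True, True, True)]

# The three optional sorts of implicit individual:
optional = [
    (False, False, False),  # not-a, not-b, not-c
    (False, False, True),   # not-a, not-b, but c
    (False, True, True),    # not-a, but b and c
]

def subsets(items):
    """All subsets of a list (its powerset)."""
    return chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1))

# Fleshing out: each subset of the optional sorts yields one alternative.
fleshings = [core + list(extra) for extra in subsets(optional)]

def all_a_are_c(model):
    """The conclusion 'All the A are C' holds in the model."""
    return all(c for (a, b, c) in model if a)

# Three binary contrasts give 2**3 = 8 alternatives, and the conclusion
# survives in every one: there are no counterexamples.
print(len(fleshings), all(all_a_are_c(m) for m in fleshings))
```

The enumeration confirms the count of eight and the absence of counterexamples claimed in the text.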
3. The strength of an inference

By definition, inductive arguments are logically invalid; that is, their premises could be true but their conclusions false. Yet such arguments differ in their strength - some are highly convincing, others are not. These differences are an important clue to the psychology of inference. However, one needs to distinguish between the strength of an argument - the degree to which its premises, if true, support the conclusion - and the degree to which the conclusion is likely to be true in any case. An argument can be strong but its conclusion improbable because the argument is based on improbable premises. Hence, the probability of the premises is distinct from the strength of the argument. In principle, the probability of a conclusion should depend on both the probability of the premises and the strength of the argument. But, as we shall see, individuals are liable to neglect the second of these components. Osherson, Smith, and Shafir (1986), in a ground-breaking analysis of induction, explored a variety of accounts of inferential strength that boil down to three main hypotheses: (1) an inference is strong if, given an implicit assumption, schema or causal scenario, it is logically valid; that is, the inference is an enthymeme (cf. Aristotle); (2) an inference is strong if it corresponds to a deduction in reverse, such as an argument from specific facts to a generalization of them (cf. Hempel, 1965); and (3) an inference is strong if the predicates (or arguments) in premises and conclusion are similar (cf. Kahneman & Tversky, 1972). Each hypothesis has its advantages and disadvantages, but their strong points can be captured in the following analysis, which we will develop in two stages. First, the present section of the paper will specify an abstract characterization of the objective strength of an argument - what in theory has to be computed in order to determine the strength of an inference (the theory at the "computational" level).
Second, the next section of the paper will specify how in practice the mind attempts to assess the strength of an argument (the theory at the "algorithmic" level). The relation between premises and conclusion in inductive inference is a semantic one, and it can be characterized abstractly by adopting the semantic approach to logic (see, for example, Barwise & Etchemendy, 1989). An assertion such as "The circle is on the right of the triangle" is, as we have seen, true in infinitely many different situations; that is, the distance apart of the two shapes can differ, as can their respective sizes, shapes, textures and so on. Yet in all of
Mental models and probabilistic thinking
these different states the circle is on the right of the triangle. Philosophers sometimes refer to these different states as "possible worlds" and argue that an assertion is true in infinitely many possible worlds. We leave to one side the issue of whether or not possible worlds are countably infinite. The underlying theory has led to a powerful, though controversial, account of the semantics of natural language (see, for example, Montague, 1974). Armed with the notion of possible states of affairs, we can define the notion of the strength of an inference in the following terms: a set of premises, including implicit premises provided by general and contextual knowledge, lends strength to a conclusion according to two principles:

(1) The conclusion is true in at least one of the possible states of affairs in which the premises are true; that is, the conclusion is at least consistent with the premises. If there is no such state of affairs, then the conclusion is inconsistent with the premises: the inference has no strength whatsoever, and indeed there is a valid argument in favor of the negation of the conclusion.

(2) Possible states of affairs in which the premises are true but the conclusion false (i.e., counterexamples) weaken the argument. If there are no counterexamples, then the argument is maximally strong - the conclusion follows validly from the premises. If there are counterexamples, then the strength of the argument equals the proportion of states of affairs consistent with the premises in which the conclusion is also true.

This account has a number of advantages. First, it embraces deduction and induction within the same framework. What underlies deduction is the semantic principle of validity: an argument is valid if its conclusion is true in any state of affairs in which its premises are true. An induction increases semantic information, and so its conclusion must be false in some possible cases in which its premises are true.
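As an illustrative sketch (not part of the original account), the two principles can be mechanized over a finite toy set of states of affairs, with truth assignments over a few atoms standing in for the infinitely many possible worlds; the propositions and names here are hypothetical:

```python
from itertools import product

def strength(premises, conclusion, atoms):
    """Strength of an argument: the proportion of states of affairs
    (truth assignments) satisfying the premises in which the conclusion
    is also true; 1 for a valid deduction, 0 for inconsistency."""
    consistent = concl_true = 0
    for values in product([True, False], repeat=len(atoms)):
        world = dict(zip(atoms, values))
        if all(p(world) for p in premises):
            consistent += 1
            concl_true += conclusion(world)
    return concl_true / consistent if consistent else 0.0

# "If a then b; a; therefore b" is a valid deduction:
modus_ponens = [lambda w: (not w["a"]) or w["b"], lambda w: w["a"]]
print(strength(modus_ponens, lambda w: w["b"], ["a", "b"]))  # 1.0

# "b; therefore a" is an induction: b holds in two worlds, a in one of them.
print(strength([lambda w: w["b"]], lambda w: w["a"], ["a", "b"]))  # 0.5
```

The enumeration makes the second principle concrete: a counterexample is simply a premise-consistent world in which the conclusion comes out false.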
Hence, inductions are reverse deductions, but they are the reverse of deductions that throw semantic information away. Second, the probability of any one distinct possible state of affairs (possible world) is infinitesimal, and so it is reasonable to assume that possible states of affairs are close to equi-possible. It follows that a method of integrating the area of a subset of states of affairs provides an extensional foundation for probabilities. The strength of an inference is accordingly equivalent to the probability of the conclusion given the premises. It is 1 in the case of a valid deduction, 0 in the case of a conclusion that is inconsistent with the premises, and an intermediate value for inductions. The two abstract principles, however, are not equivalent to the probability calculus: as we shall see, the human inferential system can attempt to assess the relevant proportions without necessarily using the probability calculus. Likewise, the principles have no strong implications for the correct interpretation of probability, which is a matter for self-conscious philosophical reflection. The
P. Johnson-Laird
principles are compatible with interpretations in terms of actuarial frequencies of events, equi-possibilities based on physical symmetry, and subjective degrees of belief (cf. Ramsey, 1926; Hintikka's, 1962, analysis of beliefs in terms of possibility; and, for an alternative conception, see Shafer & Tversky's, 1985, discussion of "belief functions"). Hence, an argument (or a probability) may concern either a set of events or a unique event. Individuals who are innumerate may not assign a numerical degree of certainty to their conclusion, and even numerate individuals may not have a tacit mental number representing their degree of belief. Individuals' beliefs do differ in subjective strength, but it does not follow that such differences call for a mental representation of numerical probabilities. An alternative conception of "degrees of belief" might be based on analogue representations (cf. Hintzman, Nozawa, & Irmscher, 1982), or on a system that permitted only partial rankings of strengths, such as one that recorded the relative ease of constructing different classes of models. Third, the account is compatible with semantic information. The semantic information conveyed by a proposition, A, equals 1 - p(A), where p(A) denotes the probability of A (Bar-Hillel & Carnap, 1964; Johnson-Laird, 1983). If A is a complex proposition containing conjunctions, disjunctions, etc., its probability can be computed in the usual way according to the probability calculus. Hence, as argued elsewhere (Johnson-Laird, 1993), we can distinguish between deduction and induction on the basis of semantic information, that is, the possible states of affairs that a proposition rules out as false. Deduction does not increase semantic information; that is, the conclusion of a valid deduction rules out the same possibilities as the premises or else fewer possibilities, and so the conclusion must be true given that the premises are true.
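The Bar-Hillel and Carnap measure can be sketched in the same finite-worlds style; this is an illustrative toy that assumes equi-possible worlds over two invented atoms:

```python
from itertools import product

ATOMS = ["a", "b"]

def p(prop):
    """Probability of a proposition over equi-possible toy worlds."""
    worlds = [dict(zip(ATOMS, vals))
              for vals in product([True, False], repeat=len(ATOMS))]
    return sum(prop(w) for w in worlds) / len(worlds)

def semantic_info(prop):
    """Bar-Hillel & Carnap: info(A) = 1 - p(A)."""
    return 1 - p(prop)

a = lambda w: w["a"]
a_and_b = lambda w: w["a"] and w["b"]
print(semantic_info(a))        # 0.5: rules out half the worlds
print(semantic_info(a_and_b))  # 0.75: rules out three of four worlds

# A valid deduction (a & b entails a) never increases semantic information:
assert semantic_info(a) <= semantic_info(a_and_b)
```

The final assertion is the criterion in the text: the conclusion of a valid deduction rules out the same possibilities as its premises, or fewer, so its semantic information cannot exceed theirs.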
Induction increases semantic information; that is, the conclusion of an induction goes beyond the premises (including those tacit premises provided by general knowledge) by ruling out at least some additional possibility over and above the states of affairs that they rule out. This account captures all the standard cases of induction, such as the generalization from a finite set of observations to a universal claim (for a similar view, see Ramsey, 1926). Fourth, the account is compatible with everyday reasoning and argumentation. One feature of such informal argumentation is that it typically introduces both a case for a conclusion and a case against it - a procedure that is so unlike a logical proof that many theorists have supposed that logic is useless in the analysis of everyday reasoning (e.g., Toulmin, 1958). The strength of an argument, however, can be straightforwardly analyzed in the terms described above: informal argumentation is typically a species of induction, which may veer at one end into deduction and at the other end into a creative process in which one or more premises are abandoned. Thus, a case for a conclusion may depend on several inductive arguments of differing strength. The obvious disadvantage of the account is that it is completely impractical. No
one can consider all the infinitely many states of affairs consistent with a set of premises. No one can integrate all those states of affairs in which the conclusion is true and all those states of affairs in which it is false. Inference with quantifiers has no general decision procedure; that is, proofs for valid theorems can always be found in principle, but demonstrations of invalidity may get lost in the "space" of possible derivations. Inference with sentential connectives has a decision procedure, but the formulation of parsimonious conclusions that maintain semantic information is not computationally tractable; that is, as premises contain more atomic propositions, it takes exponentially longer to generate such conclusions (given that NP ≠ P). So how does this account translate into a psychological mechanism for assessing the strength of an argument? It is this problem that the theory of mental models is designed to solve.
4. Mental models and estimates of inferential strength

Philosophers have tried to relate probability and induction at a deep level (see, for example, Carnap, 1950), but as far as cognitive psychology is concerned they are overlapping rather than identical enterprises: there are probabilistic inferences that are not inductive, and there are inductive inferences that are not probabilistic. Here, for example, is a piece of probabilistic reasoning that is deductive:

The probability of heads is 0.5.
The probability of the date uppermost given heads is 1.
The probability of the date uppermost given tails is 0.
Hence, the probability of the date uppermost is 0.5.

This deduction makes explicit what is implicit in the premises, and it does not increase their semantic information. A more mundane example is as follows:

If you park illegally within the walls of Siena, you will probably have your car towed.
Phil has parked illegally within the walls of Siena.
Phil will probably have his car towed.

This inference is also a valid deduction. Conversely, many inductive inferences are not probabilistic; that is, they lead to conclusions that people hold to be valid. For example, the engineers in charge at Chernobyl inferred initially that the explosion had not destroyed the reactor (Medvedev, 1990). Such an event was unthinkable from their previous experience, and they had no evidence to suppose that it had occurred. They were certain that the reactor was intact, and their
conviction was one of the factors that led to the delay in evacuating the inhabitants of the nearby town. Of course people do make probabilistic inductions, and it is necessary to explain their basis as well as the basis for probabilistic deductions. To understand the application of the model theory to the assessment of strength, it will be helpful to consider first how it accounts for deductions based on probabilities. Critics sometimes claim that models can be used only to represent alternative states of affairs that are treated as equally likely. In fact, there is no reason to suppose that when individuals construct or compare models they take each model to be equally likely. To illustrate the point, consider an example of a deduction leading to a probabilistic conclusion:

Kropotkin is an anarchist.
Most anarchists are bourgeois.
∴ Probably, Kropotkin is bourgeois.

The quantifier "most" calls for a model that represents a proportion (see Johnson-Laird, 1983, p. 137). Thus, a model of the second premise takes the form:

  [a]   b
  [a]   b
  [a]   b
  [a]
  ...

where the set of anarchists is exhaustively represented; that is, anarchists cannot occur in fleshing out the implicit model designated by the three dots. When the information in the first premise is added to this model, one possible model is:

k [a]   b
  [a]   b
  [a]   b
  [a]
  ...

in which Kropotkin is bourgeois. Another possible model is:

  [a]   b
  [a]   b
  [a]   b
k [a]
  ...
in which Kropotkin is not bourgeois. Following Aristotle, assertions of the form "probably S" can be treated as equivalent to "in most possible states of affairs, S". And in most possible states of affairs as assessed from models of the premises, Kropotkin is bourgeois. Hence, the inferential system needs to keep track of the relative frequency with which the two sorts of models occur. It will detect the greater frequency of models in which Kropotkin is bourgeois, and so it will deduce: Probably, Kropotkin is bourgeois. Individuals who are capable of one-to-one mappings but who have no access to cardinal or ordinal numbers will still be able to make this inference. They have merely to map each model in which S occurs one-to-one with each model in which S does not occur, and, if there is a residue, it corresponds to the more probable category. Likewise, there are many ways in principle in which to estimate the relative frequencies of the two sorts of model - from random sampling with replacement to systematic explorations of the "space" of possible models. The only difference in induction is that information that goes beyond the premises (including those in tacit knowledge) is added to models on the basis of various constraints (see Johnson-Laird, 1983). The strength of an inference depends, as we have seen, on the relative proportions of two sorts of possible states of affairs consistent with the premises: those in which the conclusion is true and those in which it is false. Reasoners can estimate these proportions by constructing models of the premises and attending to the proportions with which the two sorts of models come to mind, and perhaps to the relative ease of constructing them. For example, given that Evelyn fell (without a parachute) from an airplane flying at a height of 2000 feet, most individuals have prior knowledge that Evelyn is likely to be killed, but naive individuals who encounter such a case for the first time can infer the conclusion.
The inference is strong, but not irrefutable. They may be able to imagine cases to the contrary; for example, Evelyn falls into a large haystack, or a deep snow drift. But, in constructing models (of sets of possibilities), those in which Evelyn is killed will occur much more often than those in which Evelyn survives - just as models in which Kropotkin is bourgeois outnumber those in which he is not. Insofar as individuals share available knowledge, their assessments of probabilities should be consistent. This account is compatible with the idea of estimating likelihoods in terms of scenarios, which was proposed by Tversky and Kahneman (1973, p. 229), and it forms a bridge between the model theory and the heuristic approach to judgements of probability. Estimates of the relative proportions of the two sorts of models - those in which a conclusion is true and those in which it is false - will
be rudimentary, biased and governed by heuristics. In assessing outcomes dependent on sequences of events, models must allow for alternative courses of events. They then resemble so-called "event trees", which Shafer (1993) argues provide a philosophical foundation to probability and its relations to causality. Disjunctive alternatives, however, are a source of difficulty both in deduction (see, for example, Johnson-Laird & Byrne, 1991) and in choice (see, for example, Shafir & Tversky, 1992).
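Two of the estimation procedures described in this section - the one-to-one mapping of models without counting, and random sampling with replacement from the space of models - can be sketched as follows; the model lists, the Evelyn scenario's survival rate, and all names are invented for illustration:

```python
import random

def more_probable(models_S, models_not_S):
    """One-to-one mapping without cardinal numbers: pair off models of
    each kind; the kind with a residue is the more probable category."""
    s, n = list(models_S), list(models_not_S)
    while s and n:          # map one model of S onto one model of not-S
        s.pop()
        n.pop()
    return "S" if s else ("not-S" if n else "equally probable")

# Kropotkin: three models in which he is bourgeois, one in which he is not.
print(more_probable(["m1", "m2", "m3"], ["m4"]))  # S

def estimate_strength(sample_model, conclusion_holds, n=10_000, seed=1):
    """Random sampling with replacement from models consistent with the
    premises; strength = proportion in which the conclusion holds."""
    rng = random.Random(seed)
    return sum(conclusion_holds(sample_model(rng)) for _ in range(n)) / n

# Hypothetical Evelyn scenario: a soft landing (haystack, snow drift)
# occurs in an assumed 2% of models.
def sample_model(rng):
    return {"soft_landing": rng.random() < 0.02}

print(estimate_strength(sample_model, lambda m: not m["soft_landing"]))
# roughly 0.98: "Evelyn is killed" is strong but not irrefutable
```

The first procedure needs no numbers at all, matching the claim about individuals without access to cardinal or ordinal numbers; the second trades exactness for tractability, as a mind with limited working memory must.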
5. Some empirical consequences of the theory

The strength of an argument depends on the relation between the premises and the conclusion, and, in particular, on the proportion of possibilities compatible with the premises in which the conclusion is true. This relation is not in general a formal or syntactic one, but a semantic one. It takes work to estimate the strength of the relation, and the theory yields a number of predictions about making and assessing inductive inferences. The main predictions of the theory are as follows: First, arguments - especially in daily life - do not wear their logical status on their sleeves, and so individuals will tend to approach deductive and inductive arguments alike. They will tend to confuse an inductive conclusion, that is, one that could be true given the premises, with a deductive conclusion, that is, one that must be true given the premises. They will tend to construct one or two models, draw a conclusion, and be uncertain about whether it follows of necessity. Second, envisioning models, each of which corresponds to a class of possibilities, is a crude method, and, because of the limited processing capacity of working memory, many models are likely never to be envisaged at all. The process will be affected by several constraints. In particular, individuals are likely to seek the most specific conclusion consistent with the premises (see Johnson-Laird, 1993), they are likely to seek parsimonious conclusions (see Johnson-Laird & Byrne, 1991), and they are likely to be constrained by the availability of relevant knowledge (Tversky & Kahneman, 1973). The model theory postulates a mechanism for making knowledge progressively available. Reasoners begin by trying to form a model of the current situation, and the retrieval of relevant knowledge is easier if they can form a single model containing all the relevant entities. Once they have formed an initial model, knowledge becomes available to them in a systematic way.
First, they manipulate the spatial or physical aspects of the situation; that is, they manipulate the model directly by procedures corresponding to such changes. Next, they make more abstract conceptual manipulations; for example, they consider the properties of superordinate concepts of entities in the model. Finally, they make still more abstract inferences based on introducing
relations retrieved from models of analogous situations (cf. Gentner, 1983). Consider the following illustration:

Arthur's wallet was stolen from him in the restaurant.
The person charged with the offense was outside the restaurant at the time of the robbery.
What follows?

Reasoners are likely to build an initial model of Arthur inside the restaurant when his wallet is stolen and the suspect outside the restaurant at that time. They will infer that the suspect is innocent. They may then be able to envisage the following sort of sequence of ideas from their knowledge about the kinds of things in the model:

(1) Physical and spatial manipulations: The suspect leant through the window to steal the wallet. The suspect stole the wallet as Arthur was entering the restaurant, or ran in and out of the restaurant very quickly (creative inferences that, in fact, are contrary to the premises).

(2) Conceptual manipulations: The suspect had an accomplice - a waiter, perhaps - who carried out the crime (theft is a crime, and many crimes are committed by accomplices).

(3) Analogical thinking: The suspect used a radio-controlled robot to sneak up behind Arthur to take the wallet (by analogy with the use of robots in other "hazardous" tasks).

In short, the model theory predicts that reasoners begin by focusing on the initial explicit properties of their model of a situation, and then they attempt to move away from them, first by conceptual operations, and then by introducing analogies from other domains. It is important to emphasize that the order of the three sorts of operations is not inflexible, and that particular problems may elicit a different order of operations. Nevertheless, there should be a general trend in moving away from explicit models to implicit possibilities. Third, reasoners are also likely to be guided by other heuristics, which have been extensively explored by Tversky and Kahneman, and their colleagues.
These heuristics can be traced back to Hume's seminal analysis of the connection between ideas: "there appear to be only three principles of connexion between ideas, namely, Resemblance, Contiguity in time or place, and Cause or Effect" (Hume, 1748, Sec. III). Hence, semantic similarity between the premises and the conclusion, and the causal cohesiveness between them, will influence probabilistic judgements. Such factors may even replace extensional estimates based on models.
Fourth, individuals should be inferential satisficers; that is, if they reach a credible (or desirable) conclusion, or succeed in constructing a model in which such a conclusion is true, they are likely to accept it, and to overlook models that are counterexamples. Conversely, if they reach an incredible (or undesirable) conclusion, they are likely to search harder for a model of the premises in which it is false. This propensity to satisfice will in turn lead them to be overconfident in their conclusions, especially in the case of arguments that do have alternative models in which the conclusion is false. Individuals are indeed often overconfident in their inductive judgements, and Gigerenzer, Hoffrage, and Kleinbolting (1991) have propounded a theory of "probabilistic mental models" to account for this phenomenon. These are long-term representations of probabilistic cues and their validities (represented in the form of conditional probabilities). These authors propose that individuals use the single cue with the strongest validity and do not aggregate multiple cues, and that their confidence derives from the validity of this cue. They report corroboratory evidence from their experiments on the phenomenon of overconfidence; that is, rated confidence tends to be higher than the actual percentage of correct answers. As Griffin and Tversky (1992) point out, however, overconfidence is greater with harder questions, and this factor provides an alternative account of Gigerenzer et al.'s results. In contrast, the model theory proposes that the propensity to satisfice should lead subjects to overlook models in the case of multiple-model problems, and so they should tend to be more confident than justified in the case of harder problems. Overconfidence in inductive inference occurred in an unpublished study by Johnson-Laird and Anderson, in which subjects were asked to draw initial conclusions from such premises as:

The old man was bitten by a poisonous snake.
There was no known antidote available.

They tended initially to infer that the old man died. Their confidence in such conclusions was moderately high. They were then asked whether there were any other possibilities and they usually succeeded in thinking of two or three. When they could go no further, they were asked to rate their initial conclusions again, and showed a reliable decline in confidence. Hence, by their own lights, they were initially overconfident, though by the end of the experiment they may have been underconfident as a result of bringing to mind remote scenarios. With easier one-model problems, the error and its correlated overconfidence cannot occur. But should subjects be underconfident in such cases, as is sometimes observed? One factor that may be responsible for the effect in repeated-measures designs is the subjects' uncertainty about whether or not there might be other models in a one-model case. Finally, individuals are likely to focus on what is explicit in their initial models and thus be susceptible to various "focusing effects" (see Legrenzi, Girotto, &
Johnson-Laird, 1993). These effects include difficulty in isolating genuinely diagnostic data (see, for example, Beyth-Marom & Fischhoff, 1983; Doherty, Mynatt, Tweney, & Schiavo, 1979), testing hypotheses in terms of their positive instances (Evans, 1989; Klayman & Ha, 1987), neglect of base rates in certain circumstances (Tversky & Kahneman, 1982), and effects of how problems in deductive and inductive reasoning are framed (e.g., Johnson-Laird & Byrne, 1989; Tversky & Kahneman, 1981). Focusing is also likely to lead to too great a reliance on the credibility of premises (and conclusion) and too little on the strength of the argument, that is, the relation between the premises and conclusion. Reasoners will build an initial model that makes explicit the case for a conclusion, and then fail to adjust their estimates of its likelihood by taking into account alternative models (see also Griffin & Tversky, 1992, for an analogous view). Conversely, any factor that makes it easier for individuals to flesh out explicit models of the premises should improve performance.
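Gigerenzer et al.'s single-cue proposal, discussed under the fourth prediction above, can be sketched as follows; the cues, their validities, and the cities are invented purely for illustration:

```python
def pmm_answer(cues, a, b):
    """Sketch of the probabilistic-mental-models idea: answer with the
    single highest-validity cue that discriminates between the two
    alternatives; confidence equals that cue's validity."""
    for validity, cue in sorted(cues, key=lambda vc: vc[0], reverse=True):
        va, vb = cue(a), cue(b)
        if va != vb:                      # the cue discriminates
            return (a if va else b), validity
    return a, 0.5                         # no cue discriminates: guess

# Invented city-size cues and cue validities (conditional probabilities
# of a correct answer given the cue discriminates).
cues = [
    (0.9, lambda city: city["capital"]),
    (0.7, lambda city: city["has_team"]),
]
x = {"name": "X", "capital": True,  "has_team": False}
y = {"name": "Y", "capital": False, "has_team": True}
choice, confidence = pmm_answer(cues, x, y)
print(choice["name"], confidence)  # X 0.9
```

Because only one cue is consulted, rated confidence tracks that cue's validity rather than the full evidence, which is how the theory explains overconfidence on items where the best cue happens to mislead.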
6. Rules for probabilistic thinking

An obvious potential basis for probabilistic reasoning is the use of rules of inference, such as:

If q & r then s (with probability p)
∴ If q then s (with probability p')

Numerous AI programs include rules of this sort (see, for example, Holland, Holyoak, Nisbett, & Thagard, 1986; Michalski, 1983; Winston, 1975). The most plausible psychological version of this idea is due to Collins and Michalski (1989). They argue that individuals construct mental models on the basis of rules of inference, and that these rules have numerical parameters for such matters as degree of certainty. They have not tried to formalize all patterns of plausible inference, but rather some patterns of inference that make up a core system of deductions, analogies and inductions. They admit that it is difficult to use standard psychological techniques to test their theory, which is intended to account only for people's answers to questions. It does not make any predictions about the differences in difficulty between various sorts of inference, and, as they point out (p. 7), it does not address the issue of whether people make systematic errors. Hence, their main proposed test consists in trying to match protocols of arguments against the proposed forms of rules. Pennington and Hastie (1993) report success in matching these patterns to informal inferences of subjects playing the part of trial jurors. But, as Collins and Michalski mention, one danger is that subjects' protocols are merely rationalizations for answers arrived at by other means. In sum, AI rule systems for induction have not yet received decisive corroboration.
In contrast, another sort of rule theory has much more empirical support. This theory appeals to the idea that individuals have a tacit knowledge of such rules as the "law of large numbers" (see Nisbett, 1993; Smith, Langston, & Nisbett, 1992). Individuals apply the rules to novel materials, mention them in justifying their responses, benefit from training with them, and sometimes overextend their use of them. The rules in AI programs are formal and can be applied to the representation of the abstract logical form of premises. The law of large numbers, however, is not a formal rule of inference. It can be paraphrased as follows:

The larger the sample from a population, the less its mean is likely to diverge from the population mean.

Aristotle would not have grasped such notions as sample, mean and population, but he would have been more surprised by a coin coming up heads ten times in a row than by a coin coming up heads three times in a row. He would thus have had a tacit grasp of the law that he could make use of in certain circumstances. The law has a rich semantic content that goes well beyond the language of logical constants, and it is doubtful whether it could be applied to the logical form of premises. On the contrary, it is likely to be applied only when one has grasped the content of a problem, that is, constructed a model that makes explicit that it calls for an estimate based on a sample. Individuals are likely to hold many other general principles as part of their beliefs about probability. For instance, certain devices produce different outcomes on the basis of chance, that is, at approximately equal rates and in unpredictable ways; if a sample from such a device is deviant, things are likely to even up in the long run (the gambler's fallacy). Such principles differ in generality and validity, but they underlie the construction of many probabilistic judgements.
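A minimal simulation of the law as paraphrased above, using a fair coin whose population mean is 0.5 (the seed is arbitrary):

```python
import random
import statistics

def mean_deviation(n, seed=42):
    """Absolute deviation of the mean of n fair-coin tosses (heads = 1)
    from the population mean of 0.5."""
    rng = random.Random(seed)
    sample = [rng.random() < 0.5 for _ in range(n)]
    return abs(statistics.mean(sample) - 0.5)

# Larger samples are likely to stray less from the population mean:
for n in (10, 100, 10_000):
    print(n, round(mean_deviation(n), 3))
```

A run of ten heads is thus a larger, and so more surprising, deviation than a run of three, which is the intuition attributed to Aristotle in the text.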
The fact that individuals can be taught correct laws and that they sometimes err in over-extending them tells us nothing about the mental format of the laws. They may take the form of schemas or content-specific rules of inference, but they could be represented declaratively. Likewise, how they enter into the process of thinking - the details of the computations themselves - is also unknown. There is, however, no reason to oppose them to mental models. They seem likely to work in tandem, just as conceptual knowledge must underlie the construction of models.
7. Conclusions

The principal thesis of the present paper is that general knowledge and beliefs, along with descriptions of situations, lead to mental models that are used to assess probabilities. Most cognitive scientists agree that humans construct mental
representations; many may suspect that the model theory merely uses the words "mental model" where "mental representation" would do. So, what force, if any, is there to the claim that individuals think probabilistically by manipulating models? The answer, which has been outlined here, is twofold. First, the representational principles of models allow sets of possibilities to be considered in a highly compressed way, and even in certain cases sets of sets of possibilities. Hence, it is feasible to assess probability by estimating possible states of affairs within a general framework that embraces deduction, induction and probabilistic thinking. This framework provides an extensional foundation of probability theory that is not committed a priori to either a frequency or degrees-of-belief interpretation, which are both equally feasible on this foundation. Second, the model theory makes a number of predictions based on the distinction between explicit and implicit information, and on the processing limitations of working memory. Such predictions, as the study of deduction has shown, are distinct from those made by theories that postulate only representations of the logical form of assertions.
References

Aristotle (1984). The complete works of Aristotle, edited by J. Barnes, 2 vols. Princeton: Princeton University Press.
Bar-Hillel, Y., & Carnap, R. (1964). An outline of a theory of semantic information. In Y. Bar-Hillel (Ed.), Language and information. Reading, MA: Addison-Wesley.
Barwise, J. (1993). Everyday reasoning and logical inference. Behavioral and Brain Sciences, 16, 337-338.
Barwise, J., & Etchemendy, J. (1989). Model-theoretic semantics. In M.I. Posner (Ed.), Foundations of cognitive science. Cambridge, MA: MIT Press.
Beyth-Marom, R., & Fischhoff, B. (1983). Diagnosticity and pseudodiagnosticity. Journal of Personality and Social Psychology, 45, 1185-1197.
Braine, M.D.S., Reiser, B.J., & Rumain, B. (1984). Some empirical justification for a theory of natural propositional logic. In The psychology of learning and motivation (Vol. 18). New York: Academic Press.
Byrne, R.M.J. (1989). Suppressing valid inferences with conditionals. Cognition, 31, 61-83.
Carnap, R. (1950). Logical foundations of probability. Chicago: Chicago University Press.
Collins, A.M., & Michalski, R. (1989). The logic of plausible reasoning: A core theory. Cognitive Science, 13, 1-49.
Craik, K. (1943). The nature of explanation. Cambridge, UK: Cambridge University Press.
Doherty, M.E., Mynatt, C.R., Tweney, R.D., & Schiavo, M.D. (1979). Pseudodiagnosticity. Acta Psychologica, 43, 11-21.
Evans, J.St.B.T. (1989). Bias in human reasoning: Causes and consequences. London: Erlbaum.
Fisher, A. (1988). The logic of real arguments. Cambridge, UK: Cambridge University Press.
Garnham, A. (1987). Mental models as representations of discourse and text. Chichester: Ellis Horwood.
Garnham, A. (1993). A number of questions about a question of number. Behavioral and Brain Sciences, 16, 350-351.
Gentner, D. (1983). Structure-mapping: A theoretical framework for analogy. Cognitive Science, 7, 155-170.
Gigerenzer, G., Hoffrage, U., & Kleinbolting, H. (1991). Probabilistic mental models: A Brunswikian theory of confidence. Psychological Review, 98, 506-528.
Griffin, D., & Tversky, A. (1992). The weighing of evidence and the determinants of confidence. Cognitive Psychology, 24, 411-435.
Hempel, C. (1965). Aspects of scientific explanation. New York: Macmillan.
Hintikka, J. (1962). Knowledge and belief: An introduction to the logic of the two notions. Ithaca: Cornell University Press.
Hintzman, D.L., Nozawa, G., & Irmscher, M. (1982). Frequency as a nonpropositional attribute of memory. Journal of Verbal Learning and Verbal Behavior, 21, 127-141.
Holland, J.H., Holyoak, K.J., Nisbett, R.E., & Thagard, P. (1986). Induction: Processes of inference, learning, and discovery. Cambridge, MA: MIT Press.
Hume, D. (1748/1988). An enquiry concerning human understanding. La Salle, IL: Open Court.
Johnson-Laird, P.N. (1983). Mental models: Towards a cognitive science of language, inference and consciousness. Cambridge, UK: Cambridge University Press.
Johnson-Laird, P.N. (1993). Human and machine thinking. Hillsdale, NJ: Erlbaum.
Johnson-Laird, P.N., & Bara, B. (1984). Syllogistic inference. Cognition, 16, 1-61.
Johnson-Laird, P.N., & Byrne, R.M.J. (1989). Only reasoning. Journal of Memory and Language, 28, 313-330.
Johnson-Laird, P.N., & Byrne, R.M.J. (1991). Deduction. Hillsdale, NJ: Erlbaum.
Johnson-Laird, P.N., & Byrne, R.M.J. (1993). Authors' response [to multiple commentaries on Deduction]: Mental models or formal rules? Behavioral and Brain Sciences, 16, 368-376.
Kahneman, D., & Tversky, A. (1972). Subjective probability: A judgment of representativeness. Cognitive Psychology, 3, 430-454.
Klayman, J., & Ha, Y.-W. (1987). Confirmation, disconfirmation and information in hypothesis testing. Psychological Review, 94, 211-228.
Legrenzi, P., Girotto, V., & Johnson-Laird, P.N. (1993). Focussing in reasoning and decision making. Cognition, 49, 37-66.
Marr, D. (1982). Vision: A computational investigation into the human representation and processing of visual information. San Francisco: W.H. Freeman.
Medvedev, Z.A. (1990). The legacy of Chernobyl. New York: Norton.
Michalski, R.S. (1983). A theory and methodology of inductive learning. In R.S. Michalski, J.G. Carbonell, & T.M. Mitchell (Eds.), Machine learning: An artificial intelligence approach. Los Altos, CA: Morgan Kaufmann.
Montague, R. (1974). Formal philosophy: Selected papers. New Haven: Yale University Press.
Nisbett, R.E. (Ed.) (1993). Rules for reasoning. Hillsdale, NJ: Erlbaum.
Oakhill, J.V., Johnson-Laird, P.N., & Garnham, A. (1989). Believability and syllogistic reasoning. Cognition, 31, 117-140.
Osherson, D.N., Smith, E.E., & Shafir, E. (1986). Some origins of belief. Cognition, 24, 197-224.
Pennington, N., & Hastie, R. (1993). Reasoning in explanation-based decision making. Cognition, 49, 123-163.
Polk, T.A., & Newell, A. (1988). Modeling human syllogistic reasoning in Soar. In Tenth Annual Conference of the Cognitive Science Society (pp. 181-187). Hillsdale, NJ: Erlbaum.
Ramsey, F.P. (1926/1990). Truth and probability. In D.H. Mellor (Ed.), F.P. Ramsey: Philosophical papers. Cambridge, UK: Cambridge University Press.
Rips, L.J. (1983). Cognitive processes in propositional reasoning. Psychological Review, 90, 38-71.
Shafer, G. (1993). Using probability to understand causality. Unpublished MS, Rutgers University.
Shafer, G., & Tversky, A. (1985). Languages and designs for probability judgment. Cognitive Science, 9, 309-339.
Shafir, E., & Tversky, A. (1992). Thinking through uncertainty: Nonconsequential reasoning and choice. Cognitive Psychology, 24, 449-474.
Smith, E.E., Langston, C., & Nisbett, R.E. (1992). The case for rules in reasoning. Cognitive Science, 16, 1-40.
Thagard, P. (1989). Explanatory coherence. Behavioral and Brain Sciences, 12, 435-502.
Toulmin, S.E. (1958). The uses of argument. Cambridge, UK: Cambridge University Press.
Tversky, A., & Kahneman, D. (1973). Availability: A heuristic for judging frequency and probability. Cognitive Psychology, 5, 207-232.
Tversky, A., & Kahneman, D. (1981). The framing of decisions and the psychology of choice. Science, 211, 453-458.
Tversky, A., & Kahneman, D. (1982). Evidential impact of base rates. In D. Kahneman, P. Slovic, & A. Tversky (Eds.), Judgment under uncertainty: Heuristics and biases. New York: Cambridge University Press.
Tversky, A., & Kahneman, D. (1983). Extensional versus intuitive reasoning: The conjunction fallacy in probability judgment. Psychological Review, 90, 293-315.
Wason, P.C. (1965). The contexts of plausible denial. Journal of Verbal Learning and Verbal Behavior, 4, 7-11.
Winston, P.H. (1975). Learning structural descriptions from examples. In P.H. Winston (Ed.), The psychology of computer vision. New York: McGraw-Hill.
10 Pretending and believing: issues in the theory of ToMM Alan M. Leslie Department of Psychology, Center for Cognitive Science, Rutgers University, Piscataway, NJ 08855-1179, USA
Abstract Commonsense notions of psychological causality emerge early and spontaneously in the child. What implications does this have for our understanding of the mind/brain and its development? In the light of available evidence, the child's "theory of mind" is plausibly the result of the growth and functioning of a specialized mechanism (ToMM) that produces domain-specific learning. The failure of early spontaneous development of "theory of mind" in childhood autism can be understood in terms of an impairment in the growth and functioning of this mechanism. ToMM constructs agent-centered descriptions of situations or "metarepresentations". Agent-centered descriptions place agents in relation to information. By relating behavior to the attitudes agents take to the truth of propositions, ToMM makes possible a commonsense causal interpretation of agents' behavior as the result of circumstances that are imaginary rather than physical. Two early attitude concepts, pretends and believes, are discussed in the light of some current findings. Dedication: This article is dedicated to the memory of Daniel Roth, my student, collaborator and friend who tragically lost his long struggle against cancer on April 17, 1993.
This paper has undergone a long gestation, various parts having been presented to the BPS Developmental Section Annual Conference, Coleg Harlech, September 1988, Trieste Encounters on Cognitive Science, Trieste, Italy, May 1989, International Workshop on Naturalized Epistemology, Cornell University, December 1989, International Conference on Cultural Knowledge and Domain Specificity, University of Michigan, Ann Arbor, October 1990, Society for Research on Child Development Biennial Meeting, Seattle, April 1991, and Inaugural Conference of the Rutgers University Center for Cognitive Science, Rutgers University, November 1991. I am grateful to participants and audiences at those meetings and also to colleagues and friends at the MRC Cognitive Development Unit for nurture and good nature.
[Fig. 1: four panels. Sally puts her marble in the basket; Sally goes away; Ann moves the marble to the box; "Where will Sally look for her marble?"]
Fig. 1. A standard false belief scenario that can be solved by 4-year-olds (after Baron-Cohen, Leslie, & Frith, 1985).
1. Introduction Consider the scenario in Fig. 1. Numerous studies (e.g. Baron-Cohen, Leslie, & Frith, 1985; Wimmer & Perner, 1983) have shown that, by about 4 years, children understand this scenario by attributing a (false) belief to Sally and predicting her behavior accordingly. Premack and Woodruff (1978) coined the term "theory of mind" for the ability, illustrated by this scenario, to predict, explain and interpret the behavior of agents in terms of mental states. Such findings raise the following question. How is the preschool child able to learn about mental states when these are unobservable, theoretical constructs? Or put another way: how is the young brain able to attend to mental states when they can be neither seen, heard nor felt? A general answer to the above question is that the brain attends to behavior and infers the mental state that the behavior issues from. For example, in the scenario in Fig. 2, Mother's behavior is talking to a banana. The task for a 2-year-old watching her is to infer that Mother PRETENDS (of) the banana (that) "it is a telephone". Mother's behavior described as a physical event-as one object in relation to another-is minimally interesting. The real significance of her behavior emerges only when mother is described as an agent in relation to information. As an agent, mother can adopt an attitude (of pretending) to the truth of a description ("it is a telephone") in regard to a given object (the banana). Entertaining this kind of intentional or agent-centered description requires computing a certain kind of internal representation. I have called this the "metarepresentation" or "M-representation" (Leslie, 1987; Leslie & Thaiss, 1992).
Mother's behaviour: talking to a banana!
Infer mental state: mother PRETENDS (of) the banana (that) "it is a telephone" Fig. 2. A pretend scenario that can be solved by 2-year-olds.
I shall explore the following assumption. Native to our mental architecture is a domain-specific processing stream adapted for understanding the behavior of agents. A major component of this system is a mechanism which computes the M-representation. I call this mechanism ToMM (theory of mind mechanism). Here are five guiding ideas in the theory of ToMM. (1) The key to understanding the origins of theory of mind lies in time-pressured, on-line processing to interpret an agent's behavior in terms of underlying intentions. Early in development, human beings undertake the information-processing task of understanding the behavior of agents, not simply as a sequence of events, but as instantiating intentions in the broad sense, that is, as issuing from mental states. This processing task is time-pressured because agent-centered descriptions must be arrived at fast enough to keep up with the flow of behavior in a conversation or other interaction. This pressure will constrain the amount and types of information that can be taken into account and has had an adaptive evolutionary influence on the architecture of theory of mind processing.
(2) Descriptions of intentional states are computed by a specialized theory of mind mechanism (ToMM) which is post-perceptual, operates spontaneously, is domain specific, and is subject to dissociable damage-in the limit, modular. Information about behavior arrives through a number of different sensory channels and includes verbal utterances, so ToMM should operate post-perceptually. ToMM should be able to function relatively spontaneously since it has the job of directing the child's attention to mental states which, unlike behavior, cannot be seen, heard or felt. ToMM should also be able to function as a source of intuitions in reasoning about agents and thus be addressable centrally. ToMM is specifically concerned with "cognitive" properties of agents and employs specialized notions for this task. Finally, ToMM can be damaged or impaired independently of other processing systems (see below). (3) ToMM employs a proprietary representational system which describes propositional attitudes. This property of ToMM is discussed in the theory of the M-representation to which I return below. (4) ToMM forms the specific innate basis for our capacity to acquire a theory of mind. Perhaps the most important job ToMM has to do is to produce development within its designated domain and to produce it early, rapidly and uniformly without benefit of formal instruction. To this end, ToMM introduces the basic attitude concepts and provides intuitive insight into mental states early in life while encyclopedic knowledge and general problem-solving resources are limited. (5) ToMM is damaged in childhood autism resulting in its core symptoms and impairing these children's capacity to acquire a theory of mind. Leslie and Roth (1993) have recently reviewed evidence supporting this idea (see also Frith, Morton, & Leslie, 1991; Leslie & Frith, 1990).
2. Pretending and ToMM One of the easily observed products of ToMM is the capacity to pretend. Spontaneous pretending emerges between 18 and 24 months of age. Around this time, the child begins to entertain deliberate suppositions about simple imaginary situations: for example, she pretends that a banana is a telephone or that an empty cup contains juice. The ability is productive and does not remain limited to a single or to a few special topics, is exercised playfully and communicatively without ulterior motive (e.g., to deceive), permits sharing of information about imaginary situations with other people, and encompasses the ability to understand other people's communicative pretence. Due regard must be paid to the question of distinguishing pretence from other phenomena which are superficially similar at a behavioral level (e.g., functional play, acting from a mistaken belief, play in animals). I discussed some of the more important of these distinctions in Leslie
(1987) and pointed out that the aim of previous workers to develop a behavioral definition of pretence was unattainable. I proposed instead a theoretical definition in terms of underlying cognitive processes. There are four critical features of early pretence that a cognitive model must capture. The first requirement is to account for the fundamental forms of pretence. There are three of these, one for each of the basic (external) semantic relations between a representation and what it represents (viz., reference, truth and existence). In object substitution pretence, a given real object, for example a banana, is pretended to be some other object, for example a telephone. Such pretence requires a decoupling of the internal representation for telephones from its normal reference so that it functions in context as if it referred to a member of some arbitrary class of object, in this case a banana. Second, in properties pretence, a given object or situation is pretended to have some property it typically does not have, for example a dry table is pretended to be wet. Here the pretence decouples the normal effects of predicating wetness in the internal representation. And thirdly, imaginary objects can be pretended to have existence, for example that there is a hat on teddy's head. Here the pretence affects the normal existence presuppositions in the internal representation. A cognitive model of pretence has to explain why there are exactly three fundamental forms and why there are exactly these three.1 Leslie (1987) argued that the fundamental forms of pretence reflect the semantic phenomena of opacity (Quine, 1961). Opacity may be roughly described as the result of the "suspension" of the semantic relations of reference, truth and existence that occurs when a representation is placed in an intentional context, such as a mental state report or counterfactual reasoning.
To explain the isomorphism between the three fundamental forms of pretence (behavioral phenomena) and the three aspects of opacity (semantic phenomena), I proposed the existence of a certain kind of internal representation. Representational structures, having whatever properties give rise to opacity phenomena, must be a part of the human mind/brain from its infancy onwards. The second critical feature of the development of pretence that a cognitive theory must account for is related to the first. Rather than appearing in three
1. For reasons which are not clear, Perner (1991) writes as if the fact that the fundamental forms can be combined into more complex forms- for example, pretending that teddy's imaginary hat has a hole in it-should be a source of embarrassment to my theory. In fact, the possibility of "complex" pretence springs readily from the assumed combinatorial properties of metarepresentation. A further misunderstanding is to suppose that the only way the child could possibly handle the three aspects of opacity is by explicitly theorizing about reference, truth and existence - that is, by theorizing about the general nature of representation. Leslie (1987) did not propose any such thing. Indeed the whole thrust of my proposals was to avoid such a commitment by describing processing machinery that would achieve a similar result implicitly.
discrete stages, the fundamental forms of pretence emerge together in a package. Given a single mechanism with the right properties, a cognitive model can capture both the character of the three fundamental forms of pretence and their emergence as a package (see Leslie, 1987, for discussion). The third crucial feature of pretence to be explained is this: when the child first becomes able to pretend herself (solitary pretence), why does she also gain the ability to understand pretence-in-others? Traditional investigations overlooked this startling fact. Understanding another person's behavior as a pretence can be studied as an information-processing task the child undertakes. For example, when mother says, "The telephone is ringing", and hands the child a banana, the 2-year-old, who is also undertaking a number of other complex information-processing tasks, such as building a catalogue of object kinds, analysing agents' goal-directed actions with instruments and acquiring a lexicon, is nonetheless neither confused about bananas, nor about mother's strange behavior, nor about the meaning of the word "telephone". Instead, in general, the child understands that mother's behavior-her gesturing and her use of language - relates to an imaginary situation which mother pretends is real. Again, we can account for the yoking in development between the capacity to pretend oneself and the capacity to understand pretence-in-others if we assume that a single mechanism is responsible for both. Finally, a cognitive account must address the fact that pretence is related to particular aspects of the here and now in specific ways. This is true both for solitary pretence and in understanding the pretence of other people. For example, it is this banana that mother pretends is a telephone, not bananas in general nor that banana over there. The pretended truth of the content, "it is a telephone", is anchored in a particular individual object in the here and now.
This is another critical feature of the early capacity for pretence that a cognitive model must capture. These four critical features of pretence - the three fundamental forms, their emergence as a package, the yoking of solitary pretence with the ability to understand pretence-in-others, and the anchoring of pretend content in the here and now - can be succinctly explained as consequences of the data structure called the "metarepresentation". This representational system provides precisely the framework that is needed to deploy another attitude concept closely related to pretending, namely, the concept of believing. Thus, the same representational system is required if the child is to interpret mother's behavior in terms of mother BELIEVES (of) the banana (that) "it is a telephone". Pretending and believing, though closely related attitude concepts, are, nevertheless, different concepts and their successful deployment can make rather different demands on problem-solving resources. I shall consider the emergence of the concept, believing, in the second part of this article.
2.1. The metarepresentation Leslie (1987, 1988c; Leslie & Frith, 1990) outlined some general ideas on how a mechanism like ToMM could account for the above. Three different types of representation were distinguished. "Primary" representations are literal, transparent descriptions of the world. "Decoupled" representations are opaque versions of primary representations. The decoupling of a representation allows a processor to treat the representation as a "report" of information instead of merely reacting to it. This in turn allows the (decoupled) representation to be placed within a larger relational structure in which an attitude to the truth of the "report" can be represented. This larger relational structure is built around a set of primitive relations - the attitude concepts or "informational relations". These relations tie together the other components. This entire relational structure is the third type of representation and is referred to as the "metarepresentation" (or, to distinguish it from Perner's later use of the term, the "M-representation"). ToMM employs the metarepresentation. Following Marr (1982), we can say that this system makes explicit four kinds of information. Descriptions in this system identify:
(1) an agent
(2) an informational relation (the attitude)
(3) an aspect of the real situation (the anchor)
(4) an "imaginary" situation (the description)
such that a given agent takes a given attitude to the truth of a given description in relation to a given anchor. The informational relation is the pivotal piece of information in the sense that it ties together the other three pieces of information into a relational structure and identifies the agent's attitude. The direct object of the identified attitude is (the truth of) a proposition or description (typically of an "imaginary" situation) in relation to a "real" object or state of affairs. Not counting the implicit truth value, informational relations are thus three-place relations (Leslie, 1987). What does an informational relation represent? Perner (1991) has made a great deal of the fact that I borrowed the term "metarepresentation" from Pylyshyn (1978) for whom it meant a "representation of the representational relation". This seemed an innocuous enough phrase to me then, and still does, as long as one leaves it as an empirical issue exactly how a given "representational relation" is represented. But for Perner the term can only mean that the child possesses a certain kind of "representational theory of mind" (RTM) in which mental states are individuated by form rather than by meaning. I see no reason to accept this stricture. In any case, Leslie (1987) simply assumed that very young children did not have access to an RTM in this sense. The model of metarepresentation I outlined was designed to account for the very young child's capacities by attributing more modest knowledge in which, for example, "representational relations", such as reference and truth, are handled implicitly, while "representational relations" such as pretending and believing are handled explicitly. As we shall see later, there is no evidence available to suggest that preschool children have an RTM in Perner's sense. The critical point about what informational relations represent is that they denote the kind of relation that can hold between an agent and the truth of a description (applied to a given state of affairs). This kind of relation immediately determines a class of notions different from the other kinds of relation that feature in early cognition, for example spatial and mechanical relations, and forms the conceptual core of commonsense theory of mind. My assumption is that there is a small set of primitive informational relations available early on, among them BELIEVE and PRETEND. These notions are primitive in the sense that they cannot be analyzed into more basic components such that the original notion is eliminated. While one can paraphrase "John believes that p is true" in a number of ways, one does not thereby eliminate the notion believes. For example, one can say "p is true for John", but that just gives another way (an alternate set of sounds) of saying "John believes that p is true". Perner (1991) adopts part of the above theory of pretence, namely the notion of decoupling, though he discusses it in terms of "models". According to this view, pretence emerges when the child can entertain multiple "models" of the world instead of just a single model that is possible during the first year. Representations of different times and places apparently constitute different "models".
It seems unlikely that infants during the first year cannot relate past states of affairs to present ones but, in any case, Perner's notion of "model" does not say much about pretence. The opacity properties of pretence are not illuminated by tense and location "models" because the content of pretence is opaque in the here and now. This kind of opacity is also what is relevant to believing. Consider now a "Zaitchik photograph" (one that has gone out of date). This photograph is only a representation of a past situation and not of the current situation. Contrast this with the case in which someone assumes (wrongly) that the photograph is a photograph of the current situation. This is now quite different. The distinguishing feature in this latter case is clearly not the representation itself which remains the same, but the fact that an agent believes that the photograph depicts a current situation. Perner's model notion fails to address the relationship of the agent to the "model". Perner (1988) says that for the child the agent is simply "associated" with the model. But an associative relationship can also hold between, for example, can-openers and kitchens without the child ever thinking that can-openers pretend or believe anything about kitchens. Perner (1991) at times
attributes a behaviorist notion of pretence to the young child such that the agent who pretends that p is understood as acting as if p were true. This proposal is only useful if we are also told how the child views the relation between p and the agent's behavior in the case in which p actually is true. If the relation between circumstances and behavior in the normal case is causal, is it also causal in the case of pretence? If so, how can imaginary circumstances be viewed as causal? How could the child learn about the causal powers of imaginary circumstances? The only solution to this dilemma, as far as I can see, involves some kind of mentalistic rather than behavioristic interpretation of the relation between the agent and p, that is, some kind of attitude notion. Finally, parity of argument demands that if we insist upon a behavioristic construal of pretence-understanding in the child, then we should also insist upon a behavioristic construal of false-belief-understanding. After all, falsely believing that p demands the interpretation acting as if p every bit as much (or every bit as little, depending upon point of view) as pretending that p. The fact that one child is a bit older than the other does not in itself constitute a compelling reason for treating the two cases in radically different ways.
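As an aside for computationally minded readers, the four-place structure described in section 2.1 can be pictured as a simple record type. The sketch below, in Python, is a toy rendering under my own field names and string encodings, not Leslie's notation:

```python
from dataclasses import dataclass

# A minimal sketch of the M-representation as a four-field record.
# The fields follow the four kinds of information in section 2.1;
# encoding the components as strings is an illustrative simplification.

@dataclass(frozen=True)
class MRepresentation:
    agent: str        # e.g. "Mother"
    attitude: str     # a primitive informational relation: PRETEND, BELIEVE
    anchor: str       # the real object or state of affairs: the banana
    description: str  # the decoupled "imaginary" content

# Mother PRETENDS (of) the banana (that) "it is a telephone"
pretend = MRepresentation("Mother", "PRETEND", "banana", "it is a telephone")

# The same anchor and content under a different attitude gives the
# belief reading discussed in the text:
believe = MRepresentation("Mother", "BELIEVE", "banana", "it is a telephone")
```

On this rendering, pretending and believing differ only in the attitude slot while sharing anchor and content, which is one way to picture the claim that a single representational system serves both concepts.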
2.2. Decoupling
The role of decoupling in the metarepresentation is to transform a primary, transparent internal representation into something that can function as the direct object of an informational relation. In the case of informational relations, as in the case of verbs of argument and attitude, the truth of the whole expression is not dependent upon the truth of its parts. This is a crucial part of the semantics of mental state notions and is what gives rise to the possibility of pretends and beliefs being false. The decoupling theory was an attempt to account for this feature without supposing that the child had to devise a theory of opacity. Leslie (1987) suggested that one way to think about the decoupling of an internal representation from its normal input/output relations vis-a-vis normal processing was as a report or copy in one processing system (the "expression raiser") of a primary representation in another (e.g., general cognitive systems). This suggestion drew upon the analogy between opacity phenomena in mental state reports and reported speech. Subsequently, Leslie (1988c) and Leslie and Frith (1990) developed this idea in terms of the relationship between decoupled representations and processes of inference. The basic idea is that decoupling introduces extra structure into a representation and that this extra structure affects how processes of inference operate, ensuring that the truth of the part does not determine the truth of the whole. The simplest illustration of this is that one cannot infer it is a telephone from "it is a telephone". Normally, the truth of a whole expression is determined by the truth of its
parts. For example, "Mary picked up the cup which was full" is true iff the cup Mary picked up was full. This same principle is involved in the detection of contradiction. Consider the following: the cup is empty and the empty cup is full. Suppose the whole-parts principle was implemented in a spontaneous inferencing device that carries out elementary deductions. The device will quickly produce the conclusion, "the cup is full and not full", revealing a contradiction because the whole and all of its parts cannot be simultaneously true. Despite the surface similarity to the foregoing, however, the device should not detect a contradiction in I pretend the empty cup is full. One might think at first that contradiction is blocked by the element, pretend, but contradiction returns in I pretend the cup is both empty and full, despite the presence of the element, pretend. We can think of decoupling as controlling the occurrence of contradiction:
(1) the cup is empty.
(2) the empty cup is full.
(3) I pretend the empty cup "it is full".
(4) I pretend the cup "it is both empty and full".
Decoupling creates extra structure in the representation - an extra level to which the inferencing device is sensitive. Thus, in (2), with no decoupling, there is a single level within which a contradiction is detectable. In (3), there are two levels. The inferencing device first examines the upper level where it encounters I pretend the empty cup X and registers no contradiction. Next, it examines the lower level where it sees X "it is full" and again detects no contradiction. On the lower level of (4), however, the device encounters X "it is both empty and full" and registers contradiction within the level as in (2). Contradiction is detected within but not across decoupled levels. This is exactly what is required by informational relations. Similar patterns can be seen in causal inferences. For example, I pretend this empty cup "it contains tea" can be elaborated by an inference such as: if a container filled with liquid is UPTURNED, then the liquid will pour out and make something wet. This same inference works in both real and pretend situations; it also works for both own pretence and for understanding other people's pretence (Leslie, 1987). Of course, in pretend situations, we do not conclude that pretend tea will really make the table wet. The consequent is decoupled because the antecedent was. So, if I upturn the cup which I am pretending contains tea, I conclude that I pretend the table "it is wet". The conclusions of the inference are again closed under decoupling; or we may say that the inference operates within the decoupled level. If pretend scenarios - both one's own and those one attributes to other people - unfold by means of inference, then we could predict another consequent
based on a variation of the above inference: if the liquid comes out of the container, then the container will be empty. This leads to pretending something that is true, namely, that the empty cup is empty. At first glance, this may seem ridiculous. But there is, of course, an important difference between the empty cup is empty and pretending (of) the empty cup "it is empty". Later I will present an empirical demonstration that young children routinely make this sort of inference in pretence.
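The level-sensitive contradiction check described for examples (1)-(4) can be made concrete in a short sketch. Here each decoupled level is modeled as a set of atomic predications with negation marked by a "not " prefix; the whole encoding is my own toy illustration, not a claim about the actual inferencing device:

```python
def contradicts(level):
    """Detect p together with not-p within a single level."""
    return any(("not " + p) in level for p in level)

def detect(levels):
    """Check each level separately: contradiction is registered within
    a level but never across decoupled levels."""
    return [contradicts(level) for level in levels]

# (2) "the empty cup is full": one undecoupled level; "full" entails
# "not empty", so the contradiction is found.
print(detect([{"cup is empty", "not cup is empty"}]))    # [True]

# (3) "I pretend the empty cup 'it is full'": the pretend content sits
# on a decoupled lower level, so no contradiction is registered anywhere.
print(detect([{"cup is empty"}, {"not cup is empty"}]))  # [False, False]

# (4) "I pretend the cup 'it is both empty and full'": the contradiction
# lies inside the decoupled level itself, and is detected there.
print(detect([set(), {"cup is empty", "not cup is empty"}]))  # [False, True]
```

The sketch reproduces the pattern in the text: a blind spot across levels, but normal contradiction detection within the decoupled content, as in (4).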
2.3. Yoking The emergence of solitary pretence is yoked to the emergence of the ability to understand pretence-in-others. The very young child can share with other people the pretend situations she creates herself and can comprehend the pretend situations created by other people. She is able to comprehend the behavior and the goals of other people not just in relation to the actual state of affairs she perceives, but also in relation to the imaginary situation communicated to her and which she must infer. We can illustrate this in two different ways: first in relation to behavior, and second in relation to language use. Mother's goal-directed behavior with objects will be an important source of information for the young child about the conventional functions of objects. Likewise, mother's use of language will be a major source of information about the meanings of the lexical items the child learns. But in pretence, the child will have to know how to interpret mother's actions and utterances with respect to mother's pretence rather than with respect to the primary facts of the situation. When mother says, for example, "The telephone is ringing" and hands the child the banana, it will not be enough for the child to compute linguistic meaning. She will have to calculate speaker's meaning as well (cf. Grice, 1957). This double computation is inherently tied to the agent as the source of the communication and is seamlessly accomplished through the metarepresentation. Interestingly, Baldwin (1993) has provided independent evidence that children from around 18 months of age begin to calculate speaker's meaning. In the circumstances studied by Baldwin, the 18-month-old calculates speaker's meaning, not in service of pretence, but in the service of calculating linguistic meaning. 
Baldwin showed that, from around 18 months, children do not simply take the utterance of a novel word to refer to the object they themselves are looking at but instead look round and check the gaze of the speaker. They then take the novel word to refer to the object that the speaker is looking at, even if this is different from the one they were looking at when they heard the utterance. This finding reinforces the idea that infants around this age are developing an interest in what might be called the "informational properties" of agents.
204
A. Leslie
3. Understanding pretence-in-others

In this section, I shall describe an experimental demonstration of a number of the phenomena discussed so far. This experiment was first presented in Leslie (1988a) and discussed briefly in Leslie (1988c).2 The following hypotheses are tested. First, early pretence can involve counterfactual inferencing. Second, such inferencing can be used to elaborate pretence. Third, inferencing within pretence can use real-world causal knowledge, and such knowledge is available to 2-year-olds in a form abstract enough to apply in imaginary situations and in counterfactual argument where perceptual support is minimal or contradictory. Fourth, 2-year-olds can infer the content of someone else's pretence and demonstrate this by making an inference appropriate to that person's pretence. Fifth, 2-year-old pretence is anchored in the here and now in specific ways. Sixth, pretend contents are not always counterfactual. Seventh, one can communicate through action, gesture and utterance a definite pretend content to a 2-year-old child, sufficient for the child to calculate speaker's meaning/pretender's meaning and to support a particular counterfactual inference based upon the communicated content.
3.1. Method

The child was engaged by the experimenter in pretend play. Toy animals and some other props were introduced to the child during a warm-up period. My assistants in this task were Sammy Seal, Mummy Bear, Lofty the Giraffe, Larry Lamb and Porky Pig. Other props included toy cups, plates, a bottle, some wooden bricks and a paper tissue. The experimenter pretended that it was Sammy's birthday that day, that Sammy was being awakened by Mummy Bear and was being told that there was going to be a party to which his friends were coming. This warm-up period served to convey that what was to happen was pretend play and to overcome the shyness children of this age often and quite rightly have with strangers who want to share their innermost thoughts with them. The general design was to share pretence, allowing the child to introduce whatever elements he or she wished or felt bold enough to advance, but to embed a number of critical test sub-plots as naturally as possible into the flow of play. These sub-plots allow testing of pretence-appropriate inferencing. Could the child make inferences which are appropriate to the pretend scenario he has internally represented, but which are not appropriate to the actual physical condition of the props?

2 Harris and Kavanaugh (in press) have recently replicated and extended this study, though they draw somewhat different conclusions in line with their "simulation theory". The simulationist view of theory of mind phenomena raises a number of complex issues which I do not discuss here (but see Leslie & German, in press).
Pretending and believing: issues in the theory of ToMM
205
3.2. The sub-plots

(1) CUP EMPTY/FULL. The child is encouraged to "fill" two toy cups with "juice" or "tea" or whatever the child designated the pretend contents of the bottle to be. The experimenter then says, "Watch this!", picks up one of the cups, turns it upside down, shakes it for a second, then replaces it alongside the other cup. The child is then asked to point at the "full cup" and at the "empty cup". (Both cups are, of course, really empty throughout.)

(2) UPTURN CUP. Experimenter "fills" a cup from the bottle and says, "Watch what happens!" Sammy Seal then picks up the cup and upturns it over Porky Pig's head, holding it there upside down. Experimenter asks, "What has happened? What has happened to Porky?"

(3) MUDDY PUDDLE. The child is told that it is time for the animals to go outside to play. An area of the table top is designated "outside". A sub-part of this area is pointed to and experimenter says, "Look, there's a muddy puddle here!" Experimenter then takes Larry Lamb and says, "Watch what happens!" Larry is then made to walk along until the "puddle" area is reached, whereupon he is rolled over and over upon this area. Experimenter asks, "What has happened? What has happened to Larry?"

(4) BATH-WATER SCOOP. Following the above, it is suggested that Larry should have a bath. Experimenter constructs a "bath" out of four toy bricks forming a cavity. Experimenter says, "I will take off Larry's clothes and give him a bath. Then it will be your turn to put his clothes back on. OK?" Experimenter then makes movements around the body and legs of Larry suggesting perhaps the removal of clothes and each time puts them down on the same part of the table top, making a "pile". Larry is then placed in the cavity formed by the bricks for a few seconds while finger movements are made over him. Larry is then removed and placed on the table. Experimenter then says, "Watch this!" and picks up a cup. The cup is placed into the cavity and a single scooping movement is made. The cup is then held out to the child and he or she is asked, "What's in here?" If the child does not answer, the scoop is repeated once with "Watch this!" and "What's in here?"

(5) CLOTHES PLACE. The child is told, "It's your turn to put Larry's clothes on again" and handed Larry. Where (if anywhere) the child reaches in order to get the "clothes" is noted.
3.3. Subjects

There were 10 children aged between 26 and 36 months, with a mean age of 32.6 months. Two further children were eliminated for being uncooperative or wholly inattentive.
Table 1. Number of subjects passing test sub-plots and the full range of responses obtained

Test               Subjects passing   Range of responses obtained indicating appropriate inference
CUP EMPTY/FULL     10/10              Points to or picks up correct cup
UPTURN CUP         9/10               Refills cup; says "I'll wipe it off him" and wipes with tissue; "threw water on him"; "he's spilling"; "he got wet"; "poured milk over him ... wet"; "tipped juice on head"
MUDDY PUDDLE       9/10               Dries animal with tissue; says "oh no, all the mud"; "covered in mud"; "got mud on"
BATH-WATER SCOOP   9/10               Says "water"; "water" and pours into other cup; "water" and upturns into bath; "bathwater"
CLOTHES PLACE      9/9                Picks up from correct place; points to correct place

Note. Failures were produced by two different children with "don't know" responses or no response after the test was repeated twice. One child was not asked the CLOTHES PLACE test through experimenter error.
4. Results

Table 1 shows the number of children passing each sub-plot plus the entire range of responses that occurred. The failures came from two children who answered "Don't know" or failed to respond despite the sub-plot being repeated for them. Statistical analysis seems mostly unnecessary. The CUP EMPTY/FULL sub-plot could be guessed correctly half the time, so all 10 children passing is significant (p = 0.001, binomial test). In the other cases it is difficult to estimate the probability of a correct answer by chance, but it is presumably low.
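The reported p-value is a one-line computation: under chance responding each child picks the correct cup with probability .5, so the probability that all 10 children pass is .5 to the tenth power. A minimal sketch (illustrative check, not the original analysis code):

```python
# Exact one-tailed binomial test: probability of k or more successes
# out of n trials when each trial succeeds with probability p.
from math import comb

def binomial_tail(n: int, k: int, p: float = 0.5) -> float:
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# All 10 of 10 children passing under 50% guessing:
p_value = binomial_tail(10, 10)   # equals 0.5 ** 10
print(round(p_value, 4))          # 0.001, matching the reported value
```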
5. Discussion

These results support a number of features of the metarepresentational model of pretence. They demonstrate counterfactual causal reasoning in 2-year-olds based on imaginary suppositions. For example, in the CUP EMPTY/FULL scenario the child works from the supposition, of the empty cups, "they contain juice"
and, upon seeing the experimenter upturn one of the cups, the child applies a "real-world" inference concerning the upturning of cups (see pp. 220-221). In this case the child was asked about the cups, so the conclusions generated were this empty cup "it contains juice" and that empty cup "it is empty". The last conclusion is, of course, an example of pretending something which is true and not counterfactual. Notice, however, that in terms of decoupling this is not the tautology, this empty cup is empty. A similar conclusion was generated by one of the children in the UPTURN CUP scenario and expressed by his pretending to refill the "empty cup" when asked what had happened. These examples help us realize that, far from being unusual and esoteric, cases of "non-counterfactual pretence", that is, pretending something is true when it is true, are ubiquitous in young children's pretence and indeed have an indispensable role in the child's ability to elaborate pretend scenarios. This is predicted by the Leslie (1987) model.3 One way to understand the above result is this: the logic of the concept, pretend, does not require that its direct object (i.e., its propositional content) be false. Our feeling that a "true pretend" is odd reflects the normativity of our concept of pretence. Having counterfactual contents is, as it were, what pretends are for; pretends "ought" to be false; but their falseness is not strictly required by the logic of the concept. In this regard, PRETEND shows the logic of the BELIEVE class of attitudes: the truth of the whole attitude expression is not dependent upon the truth of all of its parts; specifically, it is not a function of the truth of its direct object. As we saw earlier, we can understand this peculiarity of attitude expressions in terms of decoupling. Some attitudes, like KNOW, on the other hand, do require the truth of their direct object (though even here there are subtleties), but PRETEND and BELIEVE do not.
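The logical point can be put very compactly: the truth of "Sally pretends that p" (or "Sally believes that p") is independent of the truth of p, whereas "Sally knows that p" is factive. A minimal sketch of this contrast (my own illustrative truth-functions, not Leslie's formalism):

```python
# Truth-opacity of attitude reports: the truth of the whole report need not
# depend on the truth of the embedded ("decoupled") content.

def pretends(agent_is_pretending: bool, content_is_true: bool) -> bool:
    # "Agent pretends that p" holds whether p is true or false.
    return agent_is_pretending

def knows(agent_accepts: bool, content_is_true: bool) -> bool:
    # "Agent knows that p" is factive: it requires p to be true.
    return agent_accepts and content_is_true

# A "true pretend": the cup is empty and Sally pretends it is empty.
print(pretends(True, True))    # True
# A counterfactual pretend: the cup is empty, Sally pretends it is full.
print(pretends(True, False))   # True - falseness of the content is not required
# But one cannot know a falsehood:
print(knows(True, False))      # False
```

The falseness that "ought" to hold of pretends, like the truth that "ought" to hold of beliefs, is a normative default rather than part of the concept's logic.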
As we shall see later, BELIEVE shows the opposite normativity to PRETEND. Normally, beliefs are true. In the experiment, the children correctly inferred what the experimenter was pretending. The very possibility of "correctness" depends upon some definite pretend situation being communicated. The child calculates a construal of the agent's behavior, a construal which relates the agent to the imaginary situation that the agent communicates. The child is not simply socially excited into producing otherwise solitary pretend; the child can answer questions by making definite inferences about a definite imaginary situation communicated to him by the behavior of the agent. To achieve this, the child is required to calculate, in regard to utterances, speaker's meaning as well as linguistic meaning, and, in regard to action, the agent's pretend goals and pretend assumptions. The child is

3 Though I was perhaps the first to derive this as a prediction from a theoretical model, I am certainly not the first to point out that pretends "can be true". Vygotsky (1967) describes the case of two sisters whose favorite game was to pretend to be sisters.
also capable of intentionally communicating his own pretend ideas back to participating agents. One of the deep properties that we seem pre-adapted to attribute to agents is the power of the agent to take an attitude to imaginary situations (or, more accurately, to the truth of descriptions). This allows a rational construal of the role of non-existent affairs in the causation of real behavior. It is striking that this is done quite intuitively by very young children. The spontaneous processing of the agent's utterances, gestures and mechanical interactions with various physical objects to produce an interpretation of agent pretending this or that is surely one of the infant's more sublime accomplishments. However, there is no more need to regard the child as "theorizing" like a scientist when he does this than there is when the child acquires the grammatical structure of his language.
6. Believing and ToMM

One of the central problems in understanding the development of theory of mind is the relation between the concepts of pretending and believing. There are two broad possibilities. There may be no specific relationship between the two, and their development may reflect quite different cognitive mechanisms and quite different representational structures. Versions of this position have been held, for example, by Perner (1991), Flavell (1988) and Gopnik and Slaughter (1991). Alternatively, there may be a close psychological relationship between the concepts of pretending and believing: both may be introduced by the same cognitive mechanism; both may belong to the same pre-structured representational system.

Within this general scheme there are a number of more detailed options. For example, one concept, pretend, may develop first while the other, believe, may develop later, either because of maturational factors or because believe requires more difficult and less accessible information to spur its emergence (Leslie, 1988b). Or the two notions may differentiate out of a common ancestor concept. Or there may be a progressive strengthening or sharpening of the pre-structured metarepresentational system (Plaut & Karmiloff-Smith, 1993). Or there may be different performance demands associated with employing the two concepts - demands that can be met only at different times in development depending upon a variety of factors (Leslie & Thaiss, 1992). Whichever of these options, or whichever mixture of these options (they are not mutually exclusive), turns out to be correct, there is no reason to suppose that pretend and believe require radically different representational systems, any more than the concepts dog and cat, though undoubtedly different concepts, require radically different representational capacities.
It is often claimed in support of the special nature of believe, and in contradiction of the second set of positions above, that solving false belief tasks,
such as that in Fig. 1, requires the child to employ a radically different conceptualization of mental states from that required by understanding pretend (e.g., Perner, 1991). Specifically, it is claimed that false belief can only be conceptualized within a "representational theory of mind" (RTM). I have criticized the RTM view of the preschoolers' theory of mind at length elsewhere (e.g., Leslie & Thaiss, 1992; see also Leslie & German, in press; Leslie & Roth, 1993). Briefly, there are two ways in which one could speak of the child possessing a "representational theory of mind". One could use the term "representational" loosely to cover anything which might be considered true or false; that is, anything which can be semantically evaluated will count as a "representation". In this loose sense of representational theory of mind, mental states involve representations simply because their contents are semantically evaluable, because a mental state involves a relation to a proposition not because it involves a relation to an image, a sentence, or whatever. The second and stricter way of using the term is to denote only entities that can be semantically evaluated and have a physical form or a syntax. Insofar as cognitive science holds an RTM, it is in this second stricter sense of "representational". Thus, for example, psychologists might argue about whether a given piece of knowledge is represented in an image form or in a sentential form. The form of the representation is held to be critical to the individuation of the mental state. Indeed, mental states are individuated within this framework in terms of their form, not in terms of their content. From the point of view of psychology, an image of a cat on a mat counts as a different mental state from a sentential representation of the same cat on the same mat. 
Because it would be massively confusing to use the same term both for a theory of mind which individuates mental states in terms of their contents (semantics) and for a theory of mind which individuates mental states in terms of their form but not their content, we shall use different terms. We shall speak of a "propositional attitude" (PA), or semantically based, theory of mind for the first type of theory and a representational, or syntactically based, theory of mind (RTM) for the second. Now we can ask, does the child employ a PA-based (semantic) theory of mind or representational (syntactic) theory of mind? Perner's (1988, 1991) claim is that success on a variety of false belief tasks at age 4 reflects a radical theory shift from a PA-based theory of mind to an RTM. However, all of the evidence quoted in support of this claim (mainly passing various false belief tasks) only shows that the child individuates beliefs on semantic grounds. After all, the falseness of a belief is a quintessentially semantic property. To date, there are no demonstrations of preschoolers individuating beliefs on syntactic grounds in disregard of their content. All the available evidence supports the idea that preschoolers are developing a semantic theory of belief and other attitudes. What the theory of ToMM aims to account for is the specific basis for this early emerging semantic theory of the attitudes.
6.1. A task analytic approach to belief problems

How can we begin to investigate the claim that a prestructured representational system interacts with performance factors in producing the patterns seen in the preschool period? Specifically, how do we investigate the notion that performance limitations mask the preschooler's competence with the concept of belief? We can try to develop a task analysis. In carrying out this analysis, we must separate the various component demands made on conceptual organization from those made on general problem-solving resources in the course of tackling false belief problems. An important beginning has been made in this line of research by Zaitchik (1990). Zaitchik designed a version of the standard false belief task in which the protagonist Sally is replaced by a machine, namely, a Polaroid camera. The protagonist's seeing of the original situation of the marble is replaced by the camera's taking a photograph of it; after the marble is moved to a new location, the protagonist's out-of-date belief is replaced in the new task by the camera's out-of-date photograph. While the conceptual content of the task changes (from belief to photograph), the general task structure remains identical. This task then provides an intriguing control for the general problem-solving demands of the false belief task. Results from comparing these two tasks show two things: first, normal 3-year-olds fail both the false belief and the photographs tasks (Zaitchik, 1990), as would be expected on the basis of a general performance limitation; second, autistic children fail only the false belief task but pass the photographs version (Leekam & Perner, 1991; Leslie & Thaiss, 1992), consistent with autistic impairment in the domain-specific mechanism ToMM. The ToMM model can be extended to relate it to the performance limitations affecting young preschoolers.
Building on ideas proposed by Roth (Leslie & Roth, 1993; Roth, 1993; Roth & Leslie, in preparation), Leslie and Thaiss (1992) outlined the ToMM-SP model. Some false belief tasks, such as the Sally and Ann scenario in Fig. 1 and other standards such as "Smarties", make demands on at least two distinct mechanisms. Specific conceptual demands are made of ToMM to compute a belief metarepresentation, while, in the course of accurately calculating the content of the belief, more general problem-solving demands are made of a "selection processor" (SP). These latter demands require the child to interrogate memory for the specific information that is key to the belief content inference, disregarding other competing or confusing information. For example, to infer the correct content for Sally's belief, the situation that Sally was exposed to at t0 has to be identified from memory and the inference to Sally's belief based on that, resisting the pre-potent tendency simply to base the inference upon the (present) situation at t1. There is a conceptual basis for the existence of this pre-potent response. Normatively, beliefs are true: this is what beliefs are "for"; they are "for"
accurately describing the world; a belief is "useful" to an agent only to the extent it is true; in short, beliefs "ought" to be true.4 It makes sense, then, if, by default, inferences to belief contents are based upon current actuality. In the case of false belief, this normative design fails and in order to accurately compute the errant content the pre-potent assumption must be resisted. Similar considerations may apply, for example, to the case of Zaitchik photographs. (The same problem does not arise in the case of true pretends because the agent is always able to intentionally communicate the content of his pretend whereas, for obvious reasons, an agent is not in a position to intentionally communicate that he has a false belief. However, see Roth and Leslie, 1991, for a case in which communication does help the 3-year-old with false belief.) We can organize our thinking about the general demands made by some belief tasks by positing a general, or at least non-theory-of-mind-specific, processing mechanism. Roth and I have dubbed this the "selection processor" (SP). The SP performs a species of "executive" function, inhibiting a pre-potent inferential response and selecting the relevant substitute premise. Like many other "executive functions", SP shows a gradual increase in functionality during the preschool period. Some belief tasks do not require this general component or stress it less, by, for example, drawing attention to the relevant "selection" and/or by encouraging inhibition of the pre-potent response. In these cases, better performance on false belief tasks is seen in 3-year-olds (e.g., Mitchell & Lacohee, 1991; Roth & Leslie, 1991; Wellman & Bartsch, 1988; Zaitchik, 1991). According to this view, the 3-year-olds' difficulty with false belief is due to limitations in this general component. Meanwhile, in the normal 3-year-old, ToMM is intact. 
The autistic child, by contrast, shows poor performance on a wider range of belief reasoning tasks, even compared with Down's syndrome children and other handicapped groups (e.g., Baron-Cohen, 1991; Baron-Cohen, Leslie, & Frith, 1985; Leslie & Frith, 1988; Roth & Leslie, 1991). This disability is all the more striking alongside the excellent performance autistic children show on out-of-date photographs, maps and drawings tasks (Charman & Baron-Cohen, 1992; Leekam & Perner, 1991; Leslie & Thaiss, 1992). These tasks control for the general problem-solving demands of standard false belief tasks. Autistic impairment on false belief tasks cannot, therefore, be due to an inability to meet the general problem-solving demands of such tasks. So, although autistic children seem to be
4 Although this normative assumption is fundamental to the notion of belief, it is not part of the logic of the concept that belief contents must be true (compare the earlier parallel discussion on page 225 of pretends being normatively false). Notice that the normative assumption is a far cry from the "copy theory" of belief (Wellman, 1990). Incidentally, it has been my experience that there are many adults who are surprised, even dismayed, to discover that pretends can be true. In view of this, we should not be too hard on the preschooler if she takes a few months to discover that, contrary to design, the vicissitudes of the real world sometimes defeat beliefs, with dire consequences for the agent's goals.
impaired in certain kinds of "executive functioning" (Ozonoff, Pennington, & Rogers, 1991), this cannot be the cause of their failure on false belief tasks. This pattern can be succinctly explained on the assumption of a relatively intact SP together with an impaired ToMM, the mirror image of the normal 3-year-old. Fig. 3 summarizes the ToMM-SP model of normal and abnormal development. Roth and I have recently extended our approach of studying minimally different task structures in an effort to isolate general processing demands from specific conceptual demands (Roth & Leslie, in preparation). In one study, we compared the performance of young, middle and older 3-year-olds on a standard version of the Sally and Ann task with a "partial true belief" version (see Leslie & Frith, 1988, for details of the tasks used). This allowed us to assess the importance of the falseness of the belief (a conceptual factor) in generating difficulty for 3-year-olds while holding general task structure constant. The results are shown in Fig. 4. It can be readily seen that there was no difference in difficulty between the two tasks for 3-year-olds when task structure is equated. Taken together with previous findings that 3-year-olds can understand "knowing and not knowing" (e.g., Pratt & Bryant, 1990; Wellman & Bartsch, 1988), this shows that the conceptual factor of the falseness of the belief per se is not the source of difficulty for 3-year-olds. There must be something about the problem-solving structure of this standard belief task that stresses 3-year-olds. Another approach in the literature to the problem of isolating belief competence is to find simplified task structures that 3-year-olds perform better on. The theoretical assumption behind such work is that by finding simplified tasks one reduces the number of false negatives that standard tasks produce.
Fig. 3. The ToMM-SP model of development (after Leslie & Thaiss, 1992). [Figure: a table relating tasks (non-standard pretence; standard FB; standard false representation) to the status of the two mechanisms in each group; y = intact, X = impaired or limited:]

              ToMM    SP
4-year-old     y       y
3-year-old     y       X
Autistic       X       y
Fig. 4. Partial true belief versus false belief in younger, middle and older 3-year-olds. Performance on both a standard false belief task and a true belief analogue improves gradually during the fourth year. [Figure: performance plotted by age group (Younger, Middle, Older).]
Surian and Leslie (in preparation) point out a danger with this approach. Manipulations designed to simplify tasks may inadvertently allow children to pass for the wrong reasons, reasons which do not reflect the conceptual competence that the investigator is targeting. To avoid this, we need to introduce controls for false positives. A concrete example will help make the idea of controlling false positives clear. Suppose we run a group of children on the Sally and Ann task and find that 100% of the children pass. We then run another group of same-age children on a modified version of the task in which Sally does not go away for her walk but instead remains behind and watches Ann all the while. In this version, Sally sees the transfer of the marble and knows that it is in its new location. Imagine that, to our surprise, we find that 100% in this SEE control group fail! In this control condition, failing means that the child indicates the empty location when asked where Sally will look for the marble. Now we will say that the first finding, in the "NOT-SEE" test, of 100% indicating the empty location, consisted of false positives; it did not demonstrate false belief understanding. Suppose instead we had discovered that the false belief group were only 50% correct. We would have been tempted to describe this result as "chance", but we waited until we saw how many children passed the SEE control version of the task. Suppose that on this, 100% of the children succeed. Now we can be confident that in the NOT-SEE test the children really did take Sally's exposure history into account, because if they had responded like the controls no one would have
passed: therefore the 50% who did were not false positives. In other words, the second pattern of results says more about false belief understanding than the 100% "passing" in the paragraph above. One immediate use of this enhanced technique of balancing a NOT-SEE test with a SEE control is to allow us to look at 3-year-old performance with a more sensitive instrument. Robinson and Mitchell (1992) report a study that would benefit from the use of a SEE control. Three-year-olds were given a scenario in which Sally has two bags, each containing pieces of material. She places the bags, one in each of two drawers, and goes to the next room. Ann then enters, takes the two bags out of their drawers, plays with them, then replaces the bags. But, by accident, she swaps their locations. Sally then calls from the next room that she wants her bag of material to do some sewing and that it is important that she gets the correct bag. She tells Ann that it is the bag in the red drawer she wants but, of course, Sally does not know that Ann has swapped the bags. The child is asked to identify the bag that Sally really wants. In this interesting task, the child can correctly identify Sally's desire only if she first relates it to Sally's false belief. The results showed that 50% of the 3-year-olds passed, a higher proportion than that obtained with a standard FB task. Unfortunately, the possibility exists with this design that children were simply confused by the swapping and interpreted Sally's description of the bag either as referring to the bag that was in the red drawer or to the bag that is in the red drawer, half making one interpretation and half the other. This would yield 50% false positives without any of the children actually calculating Sally's false belief. If the children were using such a low-level, "dumb" strategy, it will show up again in a version of the Robinson and Mitchell task that implements the SEE control.
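The diagnostic logic of the SEE control can be sketched as a small simulation. This is a hypothetical illustration of the argument, not data or code from the study; the strategy names and the 50/50 guessing assumption are mine:

```python
# The "dumb" strategy ignores Sally's exposure history, so it answers the same
# way in the NOT-SEE (false belief) test and the SEE control. A belief-based
# strategy, by contrast, flips its answer when Sally watches the swap.
import random

def dumb_strategy(rng: random.Random) -> str:
    # Confused by the swap, the child picks "the bag that WAS in the red
    # drawer" or "the bag that IS NOW in the red drawer" at random.
    return rng.choice(["was_in_red", "now_in_red"])

def belief_strategy(sally_saw_swap: bool) -> str:
    # Compute Sally's belief: if she saw the swap she means the bag now in
    # the red drawer; if not, the bag that was there before the swap.
    return "now_in_red" if sally_saw_swap else "was_in_red"

rng = random.Random(0)
n = 10_000
# "was_in_red" counts as correct in NOT-SEE; "now_in_red" in the SEE control.
dumb_not_see = sum(dumb_strategy(rng) == "was_in_red" for _ in range(n)) / n
dumb_see     = sum(dumb_strategy(rng) == "now_in_red" for _ in range(n)) / n
print(round(dumb_not_see, 1), round(dumb_see, 1))  # both near 0.5
print(belief_strategy(False), belief_strategy(True))
```

On the dumb strategy, the proportion "correct" in the false belief test equals the proportion incorrect in the SEE control, which is exactly the signature Surian and Leslie report below.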
This time, Sally watches as Ann swaps the bags between the drawers. Now, when Sally asks for the bag in the red drawer, she must mean the bag that is in the red drawer. Yet, if the children were simply following a "dumb" strategy and not calculating at all what Sally believed, then it should make no difference that Sally had watched the proceedings. Half will still interpret her as wanting the bag that was in the red drawer: the confusion created by swapping will occur again, resulting again in 50% "correct". (Bear in mind that the indicated location counted as correct in the SEE control condition is the opposite of that counted correct in the NOT-SEE (false belief) condition.) Surian and Leslie (in preparation) examined the Robinson and Mitchell scenario in relation to the SEE control. They found that the proportion of "correct" locations indicated in the false belief condition was the same as the proportion of incorrect locations indicated in the SEE control. This shows that the children were not taking into account Sally's exposure and thus were providing false positives in the false belief task. According to these new results, swapping locations and asking about desire in relation to false belief does not, in the
Robinson and Mitchell task, produce a simplified false belief problem for 3-year-olds. Surian and Leslie then went on to test 3-year-olds in a study which combines the methods of minimally different task structures with the SEE control for false positives. The Robinson and Mitchell task was modified to introduce ambiguity into Sally's desire. Although this modification makes the scenario more complex as a story, we supposed that it would simplify the scenario as a false belief problem. Fodor (1992) has also recently proposed a model of a performance limitation in 3-year-olds' theory of mind reasoning. In this model, the 3-year-old typically predicts behavior from desire without calculating belief. According to the model, the young child will not calculate belief unless the prediction from desire yields an ambiguous result and the child is unable to specify a unique behavior. Whenever desire prediction results in ambiguity, however, the child will break the impasse by calculating belief. Standard false belief tasks allow unique predictions from desire, so the young child does not calculate belief and thus fails. Older children, however, routinely calculate both belief and desire because they have greater processing capacity available. Like the ToMM model, Fodor assumes that the 3-year-old possesses the conceptual competence for understanding false belief. Therefore, Fodor predicts that when the 3-year-old does calculate belief, she will succeed. One difficulty is to know what predictions the child will consider as ambiguous. For example, though Fodor (1992) suggests splitting and moving the object into two target locations as a way of creating ambiguity, it could be that the child will regard search in both locations as a single unambiguous action. In order to test Fodor's suggestion clearly, we need a scenario in which ambiguity in the object of desire is unavoidable. A modification of the Robinson and Mitchell scenario meets this requirement nicely.
Instead of having two bags of material, Sally has four pencils. Three of the pencils are sharp, while the fourth pencil is broken. Sally leaves the pencils and goes into the next room. Now Ann comes in and finds the pencils. First Ann sharpens the broken pencil, then she breaks each of the original three sharp pencils. Now there are three broken pencils and one sharp pencil. At this point, Sally calls from next door, "Ann, bring me my favorite pencil - you know, it's broken!" As before, the child is asked which pencil Sally really wants. The child has been given no information prior to this about which pencil is Sally's favorite. The only information the child has to go on is Sally's attached description, "it's broken". But now there are three pencils which are broken. This unavoidably produces ambiguity. According to Fodor's model, the ambiguity in the object of desire should trigger the child into consulting Sally's belief. When the child consults Sally's belief about which pencil is broken, he will realize that Sally thinks that the now sharp pencil is still broken. The sharp pencil is the one Sally really wants! Our results showed that 48% of our 3-year-olds correctly chose the pencil that
A. Leslie
Sally really meant. This performance was significantly better than performance on the unmodified Robinson and Mitchell scenario and better than on the standard false belief task. We thus obtained support for Fodor's ambiguity factor. Before reaching this conclusion, however, we should recognize that there are low-level "dumb" strategies that could have produced these results. Perhaps the passers were false positives. For example, the word "favorite" singles out a particular individual. The children may simply have latched onto the "odd-one-out", the uniquely sharp pencil. To control for dumb possibilities like this, we also ran a SEE control version of the pencils task. If children simply respond with this or some other dumb strategy, then they should use the same dumb strategy when Sally remains in the room watching Ann process the pencils. In the SEE control, as in the NO-SEE test, the child has no information on which pencil is Sally's favorite other than the description Sally gives of it as being broken. Again this is ambiguous, because, by the time Sally makes her request, there are three broken pencils. If the children follow a dumb strategy, about half the children should again respond by picking the odd-one-out - the uniquely sharp pencil. In fact, in the SEE control condition only about 20% of the children chose the sharp pencil, the rest choosing one of the broken pencils. This pattern was significantly different from that obtained in the NO-SEE test. Most of the passers were true positives. By combining a method of minimal task differences with the SEE control, Surian and Leslie obtained a sensitive measure of 3-year-old competence. We were able to isolate Fodor's ambiguity of desire factor by comparing the performance on Robinson and Mitchell's original task with the ambiguity modified version of it, while at the same time controlling for false positives by means of a SEE control. 
Although further studies under way may change the picture, it seems that ambiguity of desire can help 3-year-olds in solving a false belief problem. However, it is not clear that Fodor's model identifies all the performance factors limiting 3-year-olds' successes. Fodor's (1992) model focused on the prediction of behavior. Important though this is, the child is also concerned with the underlying mental states themselves. For example, in the ambiguity study above, the child did not calculate belief in order to predict behavior. She calculated belief in order to figure out the referent of Sally's desire. Furthermore, in standard false belief tasks, even when 3-year-olds presumably do consult belief - for example, when they are directly requested to do so - they still have difficulty calculating its content accurately. For example, in the standard Sally and Ann scenario it makes little difference if, rather than being asked to predict behavior, 3-year-olds are asked where Sally thinks the marble is. Despite being asked to consult belief, they are no better at calculating its content than when asked to predict behavior. And even when the ambiguity factor was apparently activated in the study above, still half the children did not calculate belief content correctly.
Fodor's model assumes that 3-year-olds do not ordinarily consult belief, but that when they do, they can easily calculate its content correctly (even in the case of false beliefs). The SP model, on the other hand, proposes that ToMM's routine calculation of belief normally assumes that content reflects relevant current facts. In light of the normative nature of the belief concept, this assumption is, in the general case, justified. But for false belief situations where belief does not operate as it ought to, in order to produce a correct answer about content, this assumption has to be inhibited or blocked and a specific alternative content identified. Both of these processes (the inhibition of the pre-potent response and the selection of the correct content) stretch the 3-year-olds' capabilities. Unless "help" is given by the form of the task, 3-year-olds will tend to assume beliefs reflect current facts or will fail to identify the correct content. Fodor's ambiguity of desire can be assimilated to the SP model as one factor which can inhibit the normal content assumption and lead to search for an alternative belief content. For example, when the child tries to infer which pencil Sally wants, the first hypothesis will simply be "a broken one". But which one is that, given there are three broken ones? Since it is not possible to reach a definite answer to the question of what Sally wants, the ordinary assumption about belief content is inhibited and an attempt made to calculate the content from Sally's exposure history. Recall that the control children simply had to live with an indefinite answer because in their case Sally in fact knew there were three broken pencils. If the experimental effect simply reflected the dumb strategy, "my first answer is going to be wrong, so I'll pick something else", and the only other different thing the child can pick is the sharp pencil, then this same dumb strategy should have been followed by the SEE control children as well. 
But it was not. The children were indeed calculating belief. Nevertheless, though the pencils story helped the children, it was not enough to help more than half of them to get their calculation right. Perhaps if task structure were made to help with the selection of the appropriate content as well as with inhibiting the pre-potent response, performance would improve further. In a final experiment, Surian and Leslie (in preparation) re-examined a modified standard scenario based on Siegal and Beattie (1991). In this otherwise standard Sally and Ann task, instead of asking the child "Where will Sally look for her marble?" the child is asked "Where will Sally look for her marble first?" Siegal and Beattie found that adding the word "first" dramatically improved 3-year-olds' success. This result has largely been ignored, however, because it is open to some obvious objections. For example, the word "first" may simply lead the child to point to the first location the marble occupied, in other words to follow a dumb associative strategy. Alternatively, the word "first" might cue the child that the experimenter expects the first look to fail. If there is to be a first look, presumably there is to be a second look; but why should there be a second look unless the first one fails? Therefore, point to the empty location for the first failing look. Again, put like this, the word "first" simply triggers a dumb strategy
in the child who then appears to succeed but who, in fact, does nothing to calculate belief. We simply added the necessary SEE control to examine the viability of such "dumb strategy" explanations. If the child is not attending at all to Sally's belief then it should make no difference that Sally watches Ann move the marble. In the control condition too, the word "first" should trigger the dumb response strategy. We found that 83% of the children in the false belief task passed, replicating Siegal and Beattie's finding. If this was the result of a dumb strategy, then we should expect to find a similar proportion failing the SEE control task, because in this condition a point to the first location is considered wrong. In fact, 88% were correct in this condition too. Therefore, the effect of the word "first" is specific to the status of Sally's belief. These last results vindicate Siegal and Beattie (1991) and suggest that they have been wrongly ignored. Siegal and Beattie argued that including the word "first" made the experimenter's intentions explicit for the 3-year-old. We are now in a position to suggest an account of how this manipulation makes the experimenter's intentions explicit for 3-year-olds and why they, but not 4-year-olds, need the help. Notice that the absolute level of success in the look first task is very high indeed. It is quite comparable to the level of success obtained by 4-year-olds in the standard task and higher than that obtained with 3-year-olds in the ambiguity task. The word "first" may give the child a "double" help. The child's attention is drawn to the possibility that Sally's first look may fail. If her look fails (to find the marble), Sally's behavior defeats her desire. Plausibly, behavior defeating a desire encourages the blocking of the normal assumption regarding belief content in the same way that ambiguity about what would satisfy the desire does. This is a variation on Fodor's factor. 
In addition, however, the word "first" directs the child's attention to the first location and this helps select the correct counterfactual content. The double help results in very good performance. Bear in mind, once again, that this hypothesized double help has specific effects depending upon the status of Sally's belief. If Sally indeed knows where the marble is, asking where she will look first obtains a contrasting answer from 3-year-olds: namely, "in the second location". Finally, the look-first question fails to help autistic children: we found only 28% of an autistic group passed, no different from an unmodified Sally and Ann task. In summary, there is increasing evidence that 3-year-olds have an underlying competence with the concept of belief but that this competence is not revealed in the tasks that are standardly used to tap it. It seems increasingly likely that their competence is masked by a number of "general" factors that create a performance squeeze. This squeeze gradually relaxes over the course of the fourth year (see Fig. 4), and probably beyond. Further support is provided for this view by the finding that false belief tasks show a difficulty gradient, that some false belief tasks are easier than others. The ToMM-SP model provides, to date, the
most wide-ranging model of the young child's normal theory of mind competence, of the performance factors that squeeze the child's success on false belief calculations, and of the abnormal development of this domain found in childhood autism.
References

Baldwin, D.A. (1993). Infants' ability to consult the speaker for clues to word reference. Journal of Child Language, 20, 395-418. Baron-Cohen, S. (1991). The development of a theory of mind in autism: Deviance and delay? In M. Konstantareas & J. Beitchman (Eds.), Psychiatric Clinics of North America, special issue on Pervasive developmental disorders (pp. 33-51). Philadelphia: Saunders. Baron-Cohen, S., Leslie, A.M., & Frith, U. (1985). Does the autistic child have a "theory of mind"? Cognition, 21, 37-46. Charman, T., & Baron-Cohen, S. (1992). Understanding drawings and beliefs: A further test of the metarepresentation theory of autism (Research Note). Journal of Child Psychology and Psychiatry, 33, 1105-1112. Flavell, J.H. (1988). The development of children's knowledge about the mind: From cognitive connections to mental representations. In J.W. Astington, P.L. Harris, & D.R. Olson (Eds.), Developing theories of mind (pp. 244-267). New York: Cambridge University Press. Fodor, J.A. (1992). A theory of the child's theory of mind. Cognition, 44, 283-296. Frith, U., Morton, J., & Leslie, A.M. (1991). The cognitive basis of a biological disorder: Autism. Trends in Neurosciences, 14, 433-438. Gopnik, A., & Slaughter, V. (1991). Young children's understanding of changes in their mental states. Child Development, 62, 98-110. Grice, H.P. (1957). Meaning. Philosophical Review, 66, 377-388. Harris, P.L., & Kavanaugh, R. (in press). The comprehension of pretense by young children. Society for Research in Child Development Monographs. Leekam, S., & Perner, J. (1991). Does the autistic child have a "metarepresentational" deficit? Cognition, 40, 203-218. Leslie, A.M. (1987). Pretense and representation: The origins of "theory of mind". Psychological Review, 94, 412-426. Leslie, A.M. (1988a). Causal inferences in shared pretence. Paper presented to BPS Developmental Conference, Coleg Harlech, Wales, September 16-19, 1988. Leslie, A.M. (1988b). 
Some implications of pretense for mechanisms underlying the child's theory of mind. In J. Astington, P. Harris, & D. Olson (Eds.), Developing theories of mind (pp. 19-46). Cambridge: Cambridge University Press. Leslie, A.M. (1988c). The necessity of illusion: Perception and thought in infancy. In L. Weiskrantz (Ed.), Thought without language (pp. 185-210). Oxford: Oxford Science Publications. Leslie, A.M., & Frith, U. (1988). Autistic children's understanding of seeing, knowing and believing. British Journal of Developmental Psychology, 6, 315-324. Leslie, A.M., & Frith, U. (1990). Prospects for a cognitive neuropsychology of autism: Hobson's choice. Psychological Review, 97, 122-131. Leslie, A.M., & German, T. (in press). Knowledge and ability in "theory of mind": One-eyed overview of a debate. In M. Davies & T. Stone (Eds.), Mental simulation: Philosophical and psychological essays. Oxford: Blackwell. Leslie, A.M., & Roth, D. (1993). What autism teaches us about metarepresentation. In S. Baron-Cohen, H. Tager-Flusberg, & D. Cohen (Eds.), Understanding other minds: Perspectives from autism (pp. 83-111). Oxford: Oxford University Press. Leslie, A.M., & Thaiss, L. (1992). Domain specificity in conceptual development: Neuropsychological evidence from autism. Cognition, 43, 225-251.
Marr, D. (1982). Vision. San Francisco: W.H. Freeman. Mitchell, P., & Lacohee, H. (1991). Children's early understanding of false belief. Cognition, 39, 107-127. Ozonoff, S., Pennington, B.F., & Rogers, S.J. (1991). Executive function deficits in high-functioning autistic individuals: Relationship to theory of mind. Journal of Child Psychology and Psychiatry, 32, 1081-1105. Perner, J. (1988). Developing semantics for theories of mind: From propositional attitudes to mental representation. In J. Astington, P.L. Harris, & D. Olson (Eds.), Developing theories of mind (pp. 141-172). Cambridge: Cambridge University Press. Perner, J. (1991). Understanding the representational mind. Cambridge, MA: MIT Press. Plaut, D.C., & Karmiloff-Smith, A. (1993). Representational development and theory-of-mind computations. Behavioral and Brain Sciences, 16, 70-71. Pratt, C., & Bryant, P. (1990). Young children understand that looking leads to knowing (so long as they are looking into a single barrel). Child Development, 61, 973-982. Premack, D., & Woodruff, G. (1978). Does the chimpanzee have a theory of mind? Behavioral and Brain Sciences, 4, 515-526. Pylyshyn, Z.W. (1978). When is attribution of beliefs justified? Behavioral and Brain Sciences, 1, 592-593. Quine, W.V. (1961). From a logical point of view. Cambridge, MA: Harvard University Press. Robinson, E.J., & Mitchell, P. (1992). Children's interpretation of messages from a speaker with a false belief. Child Development, 63, 639-652. Roth, D. (1993). Beliefs about false beliefs: Understanding mental states in normal and abnormal development. Ph.D. thesis, Tel Aviv University. Roth, D., & Leslie, A.M. (1991). The recognition of attitude conveyed by utterance: A study of preschool and autistic children. British Journal of Developmental Psychology, 9, 315-330. Reprinted in G.E. Butterworth, P.L. Harris, A.M. Leslie, & H.M. Wellman (Eds.), Perspectives on the child's theory of mind (pp. 315-330). Oxford: Oxford University Press. 
Siegal, M., & Beattie, K. (1991). Where to look first for children's knowledge of false beliefs. Cognition, 38, 1-12. Vygotsky, L.S. (1967). Play and its role in the mental development of the child. Soviet Psychology, 5, 6-18. Wellman, H.M., & Bartsch, K. (1988). Young children's reasoning about beliefs. Cognition, 30, 239-277. Wimmer, H., & Perner, J. (1983). Beliefs about beliefs: Representation and constraining function of wrong beliefs in young children's understanding of deception. Cognition, 13, 103-128. Zaitchik, D. (1990). When representations conflict with reality: The preschooler's problem with false beliefs and "false" photographs. Cognition, 35, 41-68. Zaitchik, D. (1991). Is only seeing really believing? Sources of true belief in the false belief task. Cognitive Development, 6, 91-103.
11 Extracting the coherent core of human probability judgement: a research program for cognitive psychology

Daniel Osherson, Eldar Shafir, Edward E. Smith
IDIAP, C.P. 609, CH-1920 Martigny, Valais, Switzerland; Department of Psychology, Green Hall, Princeton University, Princeton, NJ 08544-1010, USA; Department of Psychology, University of Michigan, 330 Packard Road, Ann Arbor, MI 48104, USA
Abstract Human intuition is a rich and useful guide to uncertain events in the environment but suffers from probabilistic incoherence in the technical sense. Developing methods for extracting a coherent body of judgement that is maximally consistent with a person's intuition is a challenging task for cognitive psychology, and also relevant to the construction of artificial expert systems. The present article motivates this problem, and outlines one approach to it.
1. Introduction Human assessment of chances provides a guide to objective probabilities in a wide variety of circumstances. The survival of the species in diverse and rapidly evolving environments is testimony to this fact, as is the adequacy of our choices and judgements in most contexts encountered in daily life. At the same time, our assessments of chance are subject to systematic errors and biases that render them incompatible with the elementary axioms of probability. The character and origin of these errors have been the topic of intense scrutiny over several decades (for a partial review, see Osherson, in press).
* Corresponding author; E-mail: [email protected]
Research support was provided by the Swiss National Science Foundation contract 21-32399.91 to Osherson, by US Public Health Service Grant 1-R29-MH46885 from NIMH to Shafir, and by Air Force Office of Scientific Research Contract No. AFOSR-92-0265 to Smith.
D. Osherson, E. Shafir, E. Smith
How can the power and wisdom of human judgement be exploited while avoiding its weaknesses? One approach is to formulate principles of reasoning that simulate human judgement in large measure but correct it in view of probabilistic coherence. Such is the research program advocated and elaborated in our recent work (Osherson, Biolsi, Smith, Shafir, & Gualtierotti, 1994), and proposed in this article as a research program for cognitive psychology. The fundamental idea of extracting elements of human judgement for use within a normatively correct system of reasoning has already been articulated by Judea Pearl (1986, 1988). However, as explained below, our approach is different from, and complementary to, the tradition spawned by Pearl's studies. In a word, earlier approaches are "extensional" in character, assigning probabilities to unanalyzed statements and their logical combinations; in contrast, our approach is "intensional" inasmuch as it relies on a representation of the semantic content of the statements to which probabilities must be attached. A specific implementation of our approach is presented in Osherson et al. (1994) but a somewhat different one will be summarized below. The goal of the present article is not to insist upon the details of any one approach but rather to highlight a research question that we find challenging and important. The question is: how can orderly and plausible judgement about uncertain events be extracted from the turmoil of human intuition? We begin with general remarks on the difficulty of reasoning about probabilities in coherent fashion.
2. Coherent reasoning

2.1. The attraction of probabilistic reasoning

A system that recommends action in the face of uncertainty should quantify its estimates of chance in conformity with the axioms of probability.[1]

[1] Presentation and discussion of the elementary axioms of probability is provided in Resnik (1987, section 3.2). We do not rely here on the additional axiom of countable additivity (Ross, 1988, section 1.3), which is more controversial (see Kelly, 1993, section 13.4).

Such is the burden of classical analyses of betting and prediction, which highlight the risks of reasoning non-probabilistically (see Cox, 1946; de Finetti, 1964, 1972; Jeffrey, 1983; Lindley, 1982; Resnik, 1987; Savage, 1972, for extended discussion). Alternative principles have been proposed to govern situations in which probabilities cannot generally be determined, as in Shafer (1976, 1986) and Shortliffe and Buchanan (1975). However, close examination of these principles (e.g., Fagin & Halpern, 1991; Heckerman, 1986) reinforces the conviction that
probabilistic reasoning remains the standard of rigorous thinking in the face of uncertainty. Although the foregoing conclusion remains the subject of lively debate and reflection (as in Dubois & Prade, 1991; Shafer & Pearl, 1990), it will be adopted without further justification in what follows.
2.2. The computational difficulty of probabilistic reasoning

Whatever its normative attractiveness, probabilistic reasoning poses difficult computational challenges. If probabilities must be distributed over the sentences of an expressive language, these difficulties can become insurmountable.[2] However, even when probabilities must be attributed to sentences without complex internal structure, certain manipulations are known to be intractable, for example updating Bayesian belief networks (see Cooper, 1987). The root difficulty is the large number of events that must be kept in view in order to ensure probabilistic coherence. To explain the problem we now review some standard terminology. The remaining discussion bears exclusively on finite event spaces (the domain for most probabilistic expert systems). Moreover, instead of the term "event" we use the equivalent terminology of "statements" and "propositions". (For additional discussion, see Neapolitan, 1990, section 5.1; Pearl, 1988, section 2.1.)
2.3. Probability over statements

To establish a domain of discourse for reasoning, a finite number of statements S1, S2, . . ., SN are fixed in advance. Each Si is a determinate claim whose truth value may not be known with certainty, for example:

(1) Tulsa will accumulate more than 10 inches of rain in 1995.

The N statements give rise to 2^N state descriptions, each of the form:

±S1 ∧ ··· ∧ ±SN

where ±Si means that Si may or may not be negated. A state description is the logically strongest claim that can be made about the domain since it consists of the conjunction of every statement or its negation. A proposition is a subset of state descriptions. Propositions are often denoted by the apparatus of propositional logic. For example, "S1 ∨ S2" denotes the set of state descriptions in which at least one of S1, S2 occurs positively. There are 2^(2^N) propositions. A distribution is an assignment of non-negative numbers to the state descriptions, whose sum is one. A given distribution induces a probability for each proposition, obtained by summing the numbers associated with the state descriptions that make it up. Let a collection C of propositions be given, and suppose that M is a mapping of C to real numbers. M is called coherent just in case there is some distribution P such that for all X ∈ C, M(X) = P(X). Otherwise, M is incoherent.

[2] This is shown for first-order arithmetical languages in Gaifman and Snir (1982, Theorem 3.7); the case of weak monadic second-order logic is discussed in Stockmeyer (1974, Theorem 6.1).
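These definitions are easy to make concrete. The following sketch is our own illustration (not the authors'), in Python: it builds the four state descriptions for N = 2, represents a proposition as a set of state descriptions, and induces its probability from a distribution by summation.

```python
from itertools import product

# Two statements S1, S2 give 2^2 = 4 state descriptions, each encoded as
# a tuple of truth values (True = affirmed, False = negated).
N = 2
state_descriptions = list(product([True, False], repeat=N))

# A proposition is a subset of state descriptions. "S1 or not S2" holds
# in every state description whose first value is True or whose second
# value is False.
prop = {sd for sd in state_descriptions if sd[0] or not sd[1]}

# A distribution assigns non-negative numbers summing to one to the
# state descriptions; here a uniform one, purely for illustration.
dist = {sd: 0.25 for sd in state_descriptions}

def prob(proposition, distribution):
    """Induced probability: sum over the proposition's state descriptions."""
    return sum(distribution[sd] for sd in proposition)

p = prob(prop, dist)  # 3 of the 4 state descriptions qualify, so 0.75
```

A mapping M over several propositions is coherent exactly when a single such `dist` reproduces all of M's values this way.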
2.4. Maintaining coherence in large domains

Now suppose that the domain in question embraces 500 statements S1, S2, . . ., S500 concerning, let us say, five meteorological events in 100 American cities (we assume the events to be non-exclusive, referring to different months, for example). Suppose as well that some human agent H is asked to assign probabilities to a growing list L of propositions built from those statements. Each proposition is relatively simple in form, but the same statement may occur across different propositions. Thus, L might start off like this:

(2) L = { S2 ∨ ¬S3, S14 → (S2 ∧ S1), S8 ∧ ¬S14, . . . }

Question: How can H assign coherent probabilities to the successive propositions in L? In other words, as H associates numbers with more and more propositions, what procedure can be employed to ensure that the numbers are always extendible to a genuine probability distribution? To achieve coherency, H could in principle proceed as follows.

Stage 1: Faced with the first proposition S2 ∨ ¬S3, H writes down the four state descriptions based on statements S2 and S3, namely: S2 ∧ S3, ¬S2 ∧ S3, S2 ∧ ¬S3, ¬S2 ∧ ¬S3. Then H chooses a distribution over these state descriptions that reflects her beliefs about their respective probabilities. By summing over the first, third and fourth state descriptions she arrives at her probability for S2 ∨ ¬S3. Her probabilities are coherent at the end of this stage.

Stage 2: Faced with the second proposition S14 → (S2 ∧ S1), she writes down the
16 state descriptions based on statements S1, S2, S3 and S14, that is, based on the four statements that appear in stages 1 and 2. Then H chooses a distribution over these state descriptions that meets two conditions, namely: (i) it is consistent with the distribution chosen for the four state descriptions of stage 1 (this is possible since her probabilities are coherent at the end of stage 1); (ii) it reflects her beliefs about the 16 state descriptions now in play. By taking the appropriate sum, H arrives at her probability for S14 → (S2 ∧ S1). Because of property (i) her probability for the first proposition S2 ∨ ¬S3 may be recovered by adding the relevant state descriptions from among the current 16; the probability assigned to S2 ∨ ¬S3 will not have changed from stage 1. Consequently, the totality of her probability attributions is still coherent at the end of this stage.

Stage 3: Faced with the third proposition S8 ∧ ¬S14, she writes down the 32 state descriptions based on statements S1, S2, S3, S8, S14, . . . etc.
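The staged procedure can be sketched in a few lines of Python. This is our own illustration, not the authors' implementation; the uniform stage-1 masses and the 50/50 split used at each refinement are invented stand-ins for an agent's actual beliefs.

```python
# A state description is a frozenset of (statement_index, truth_value)
# pairs. Refining every state description by a new statement splits its
# mass between the "true" and "false" extensions; splitting preserves
# all earlier marginals, hence all previously assigned probabilities.

def extend(dist, new_index, split=0.5):
    """Refine each state description by one new statement."""
    out = {}
    for sd, mass in dist.items():
        out[sd | {(new_index, True)}] = mass * split
        out[sd | {(new_index, False)}] = mass * (1 - split)
    return out

# Stage 1: four state descriptions over S2 and S3, uniform for simplicity.
dist = {
    frozenset({(2, a), (3, b)}): 0.25
    for a in (True, False)
    for b in (True, False)
}

# Stage 2: bring S1 and S14 into play, giving 16 state descriptions.
dist = extend(dist, 1)
dist = extend(dist, 14)

# The stage-1 probability of S2 v not-S3 is recoverable, unchanged:
p = sum(mass for sd, mass in dist.items()
        if (2, True) in sd or (3, False) in sd)
```

Because refinement only splits mass, the number previously assigned to S2 ∨ ¬S3 never changes; the cost, as the text goes on to observe, is that the table of state descriptions doubles with every new statement.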
The disadvantage of this procedure is that it soon requires excessive space to write down the list of needed state descriptions. Eventually, 2^500 state descriptions would need to be written down at once, an enormous number. It is not immediately obvious what procedure H can substitute for this one. Suppose, for example, that H attempts at each stage to limit attention to just the state descriptions needed in the evaluation of the current proposition. Thus, in the first stage, H would attribute probabilities only to the four state descriptions based on S2 and S3; in the second stage, H would attribute probabilities only to the eight state descriptions based on S1, S2 and S14; and so forth. Let us assume, furthermore, that H chooses her probabilities in coherent fashion at each stage. This procedure is nonetheless insufficient to guarantee the coherence of H's judgement since it ignores logical dependencies among propositions at different stages. To take the simplest example, suppose that the state descriptions of (I) show up in L at one stage, and those of (II) show up later:

(I) { S1 ∧ S2, ¬S1 ∧ S2, S1 ∧ ¬S2, ¬S1 ∧ ¬S2 }

(II) { S2 ∧ S3, ¬S2 ∧ S3, S2 ∧ ¬S3, ¬S2 ∧ ¬S3 }
Then overall coherence requires that the sum of the probabilities assigned to the first two state descriptions of (I) equal the sum of the probabilities assigned to the first and third state descriptions of (II). Otherwise, the two distributions imply different values for the statement S2, violating overall coherence. It is thus clear that the revised procedure suggested above is inadequate without some means of keeping track of the combinations of state descriptions seen in L up to the current point, and this entails the same combinatorial explosion as before. The difficulty of maintaining coherence over large domains is widely recognized. For example, Clark Glymour (1992, p. 361) summarizes the matter this way:

To represent an arbitrary probability distribution, we must specify the value of the probability function for each of the state descriptions. So with 50 atomic sentences, for many probability distributions we must store 2^50 numbers to represent the entire distribution. . . . We cannot keep 2^50 parameters in our heads, let alone 2 raised to the power of a few thousand, which is what would be required to represent a probability distribution over a realistic language. . . . For such a language, therefore, there will be cases in which our beliefs are inconsistent and our degrees of belief incoherent.
Finally, note that the problem is aggravated by the expressiveness of the language used for ordinary thought. For example, we can easily express 100 predicates that apply sensibly to any of 100 grammatical subjects. There result 10,000 statements concerning which an agent might wish to reason probabilistically. Is there any hope of carrying out such reasoning coherently?

2.5. Tractability via conditional independence

Some distributions have special properties that allow coherent reasoning within modest computational bounds. For example, consider a distribution P over statements S1, . . ., SN such that for every subset {S_r1, . . ., S_rk} of {S1, . . ., SN}, P(S_r1 ∧ ··· ∧ S_rk) = P(S_r1) × ··· × P(S_rk), that is, in which the underlying statements are mutually stochastically independent. Reasoning according to P does not require storing the probability of each state description. It suffices to store only the probabilities associated with S1, . . ., SN, since the probability of any needed state description can be recovered through multiplication (relying where needed on the fact that P(¬Si) = 1 − P(Si)). It follows that if the beliefs of our agent H are mutually independent in the foregoing sense then she can confront the list L with no fear of incoherence. For each proposition X of L, H need only carry out the following steps: (a) list the statement letters occurring in X; (b) decide which state descriptions over the foregoing list imply X; and (c) take the sum of the probabilities of the latter state descriptions as the final answer. Using this strategy, not only can H assign coherent probabilities to all the propositions that
might arise in L (assuming that each proposition remains reasonably short). In addition, her judgement exhibits "path independence" in the sense that reordering L will not change H's probability for any proposition. The mutual independence of S1, . . ., SN is an unrealistic assumption in most situations. Weaker but still useful forms of independence can be defined (see Whittaker, 1990, for extended discussion). For example, if P(S1 | S2 ∧ S3) = P(S1 | S3) then S1 is said to be "conditionally independent" of S2 given S3 (according to P). If P exhibits a felicitous pattern of conditional independence, then its underlying state descriptions can be factored in such a way as to render their manipulation computationally tractable. A variety of schemes for exploiting conditional independence have been devised (e.g., Geiger, Verma, & Pearl, 1990; Heckerman, 1990b; Lauritzen & Spiegelhalter, 1988; Andreassen, Woldbye, Falck, & Andersen, 1989; Olesen et al., 1989; Long, Naimi, Criscitiello, & Jayes, 1987). Unfortunately, even with the right configuration of conditional independencies, many of these systems require manual entry of an excessive number of probabilities and conditional probabilities, often minute in magnitude. Usually the probabilities cannot be assessed in actuarial fashion, so must be based on the judgement of experts (e.g., doctors). It is well known that experts are apt to provide incoherent estimates of probability (see Casscells, Schoenberger, & Grayboys, 1978; Kahneman & Tversky, 1972; Tversky & Kahneman, 1983; Winterfeld & Edwards, 1986, Ch. 4.5), and the numbing task of making thousands of judgements no doubt aggravates this tendency. Several responses may be envisioned to the foregoing problem. First, the interrogation of experts can be rationalized and simplified (as in Heckerman, 1990a). 
Second, procedures can be devised to reduce the effect of judgemental biases that lead to incoherence (as discussed in Kahneman, Slovic, & Tversky, 1982, Chs. 30-32; Winterfeld & Edwards, 1986). Third, techniques can be implemented for finding a probability distribution that is maximally close to a set of possibly incoherent judgements (see Osherson et al., 1994). Fourth, methods can be invented for constructing human-like distributions on the basis of judgements that are psychologically more natural than assessments of the probability of arbitrary propositions. The fourth response has been raised in Szolovits and Pauker (1978). It is the one advocated here, though not to the exclusion of the other three. The essential innovation of our approach is to attempt to derive probabilities on the basis of the "semantic" (really, "informational") content of the grammatical constituents that compose statements. The potential benefit is reduction in the amount of information that must be culled from a human informant (and later stored). The reduction is based on the combinatorial mechanisms of grammar, which allow a large number of statements to be generated from a small number of constituents. Let us now examine this idea.
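Before turning to statement semantics, the tractability point of section 2.5 can be made concrete. The sketch below is illustrative only: the statement letters and probabilities are invented, and the function simply executes steps (a)-(c) above under the assumption of mutual independence, storing only one probability per statement letter.

```python
from itertools import product

# Sketch of steps (a)-(c) under mutual independence: only P(S_i) is stored
# for each statement letter; state-description probabilities are recovered
# by multiplication. Letters and probabilities are invented illustrations.

def prob(proposition, p):
    """Sum the probabilities of the state descriptions that make the
    proposition true. `proposition` maps a dict {letter: bool} to bool;
    `p` maps each statement letter to its stored probability."""
    letters = sorted(p)
    total = 0.0
    for values in product([True, False], repeat=len(letters)):
        state = dict(zip(letters, values))   # one state description
        weight = 1.0
        for s in letters:                    # multiply, using P(not S) = 1 - P(S)
            weight *= p[s] if state[s] else 1 - p[s]
        if proposition(state):
            total += weight
    return total

p = {"S1": 0.7, "S2": 0.4}
# Analytically, P(S1 and not S2) = 0.7 * 0.6 = 0.42
print(prob(lambda st: st["S1"] and not st["S2"], p))
```

Because every answer is a sum of probabilities from one fixed distribution, the judgements produced this way are automatically coherent, whatever order the propositions arrive in.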
D. Osherson, E. Shafir, E. Smith
3. Statement semantics
How can the meaning of statements best be represented in view of deriving human-like probabilities from their representations? Surely we are far from having the answer to this question. In the hope of stimulating discussion, the present section describes a simple approach that analyzes grammatical constituents along a fixed stock of dimensions. We begin by specifying the kind of statement to be treated.
3.1. Subjects, predicates, objects
A statement like (3)
Lawyers seldom blush
decomposes into a grammatical subject (namely, "Lawyers") and a grammatical predicate (namely, "seldom blush"). Henceforth we employ the term "object" in place of "grammatical subject" in order to prevent confusion with the "subjects" participating in psychological experiments. Thus, "Lawyers" and "seldom blush" are the object and predicate, respectively, of statement (3). We limit attention to statements of this simple object-predicate form. Given object o and predicate p, we use [o, p] to denote the statement formed from them. A domain of reasoning is established by fixing a (finite) list obj of objects and a (finite) list pred of predicates and then considering the set of statements S = {[o, p] | o ∈ obj and p ∈ pred}. For simplicity in what follows we assume that all the statements in S are meaningful, neither contradictory nor analytic, and logically independent from each other.
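The cross-product construction of S can be written out directly. The object and predicate lists below are invented examples, not items from the paper's experiments; the point is only that a few constituents generate many statements.

```python
from itertools import product

# The statement set S of section 3.1: every pairing [o, p] of an object
# with a predicate. The two lists are invented for illustration.
obj = ["lawyers", "tigers"]
pred = ["seldom blush", "are easily startled"]

S = [(o, p) for o, p in product(obj, pred)]
# |S| = |obj| * |pred|, so the domain grows multiplicatively in the
# number of stored constituents.
```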
3.2. Vectorial representations
Our approach associates each object and predicate with a real vector in n dimensions, for some fixed value of n. Such a vector may be conceived as a rating of the object or predicate along n dimensions (e.g., for n = 3, a rating of the object TIGER along the dimensions size, speed and ferocity). The vector is intended to code a given person's knowledge (or mere belief) about the item in question (see Smith & Medin, 1981, for discussion). Vector representations might seem impoverished compared to "frames" or other elaborate schemes for knowledge representation (Bobrow & Winograd, 1976; Minsky, 1981, 1986). It is thus worth noting the considerable representational power of real vectors. Suppose that person P is chosen, and let us say that
predicate p "fits" object o just in case the probability of [o, p] according to P exceeds .5 (any other threshold would serve as well). This may be abbreviated to: P([o, p]) > .5. We would like to represent the fit-relation in terms of vectors. For this purpose, suppose that n-dimensional vectors are assigned to obj ∪ pred, one per object and one per predicate. Given such an assignment, let us say that o "dominates" p just in case the coordinates of o's vector are at least as great as the corresponding coordinates of p's vector. We have the following fact, proved in Doignon, Ducamp, and Falmagne (1984): (4) Let P embody any fit-relation whatsoever. Then for some n, there is an assignment of n-dimensional vectors to obj ∪ pred such that for all o ∈ obj and all p ∈ pred, P([o, p]) > .5 if and only if o dominates p. Moreover, n can be chosen to not exceed the smaller of: the cardinality of obj, the cardinality of pred. Intuitively, we can think of the vector assigned to a predicate as establishing criteria for membership in the associated category. (For example, the predicate "can learn a four choice-point maze in three trials" might have a requirement of .75 in the coordinate corresponding to intelligent.) For o to have greater than .5 probability of possessing p, o's values at each coordinate must exceed the criterion established by p. Fact (4) shows that such a scheme is perfectly general for representing probability thresholds and it renders plausible the idea that real vectors might also serve to predict continuous assessments of the probability of statements.
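The dominance relation in fact (4) is simple to state computationally. A minimal sketch follows; the objects, the predicate, and all the ratings are invented for illustration (they are assumptions, not data from the paper).

```python
def dominates(o_vec, p_vec):
    """o dominates p iff each coordinate of o's vector is at least as
    great as the corresponding coordinate of p's vector."""
    return all(o >= p for o, p in zip(o_vec, p_vec))

# Ratings along (size, speed, ferocity), each normalized to [0, 1];
# the numbers are invented for illustration.
objects = {"TIGER": (0.8, 0.9, 0.9), "RABBIT": (0.2, 0.8, 0.1)}

# A predicate's vector sets criteria for membership, e.g. a ferocity
# requirement of .75 (cf. the example in the text).
criteria = (0.0, 0.0, 0.75)   # "attacks when cornered" (hypothetical)

fit = {o: dominates(vec, criteria) for o, vec in objects.items()}
# fit["TIGER"] is True and fit["RABBIT"] is False: the fit-relation is
# captured by coordinatewise comparison, as fact (4) guarantees.
```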
3.3. To and from statement representations
Recall that our goal is to capture the coherent core of a person's judgement about chances. Call the person at issue ℋ. Having decided to use vectors to represent obj ∪ pred, two questions remain to be answered. These are: (a) Which particular object and predicate vectors are attributed to ℋ? (b) How are object and predicate vectors translated into a probability distribution? Once answers are offered to (a) and (b), a third question may be addressed, namely: (c) If ℋ's vectors are fixed in accordance with the answer to (a), and if
probabilities are subsequently assigned to propositions in accordance with the answer to (b), how well do the resulting probabilities accord with ℋ's judgement about chances? Does the processed and regimented output of our system retain any of the insight that characterizes ℋ's understanding about probabilities in the environment? Let us now briefly consider (a)-(c).
3.4. Fixing object and predicate vectors
One means of obtaining vectors is to request the needed information directly from ℋ via feature ratings (as in Osherson, Stern, Wilkie, Stob, & Smith, 1991). A less direct approach is to infer the vectors from similarity ratings among objects and predicates. In this case, we work backwards from a suitable vector-based model of similarity (e.g., those discussed in Osherson, 1987; Suppes, Krantz, Luce, & Tversky, 1989; Tversky, 1977), looking for vectors that best predict ℋ's similarity data. Another strategy is to postulate a model of simple probability judgements based on the needed vectorial representations, and then work backwards to vectors from such judgements. In this case, our system carries out "extrapolation", extending a small set of probability judgements to a more complete set (see Osherson, Smith, Meyers, Shafir, & Stob, in press).
3.5. Vectors to probabilities
Turning to question (b) above, we describe one procedure for synthesizing probabilities from the vectors underlying obj and pred. It rests upon a scheme for constructing three-dimensional Venn diagrams. Specifically, the pair of vectors associated with object o and predicate p is translated into a subregion ℛ of the unit cube.3 The volume of ℛ represents the probability of [o, p]. The position of ℛ determines its intersection with subregions assigned to other statements. The probability of a complex proposition (e.g., the intersection or the union of two statements) may then be determined by calculating the volume of the corresponding region. It is easy to see that use of the diagram guarantees probabilistic coherence.4
3 The unit cube has sides of length 1. It is used for convenience in what follows; various other kinds of solids would serve as well.
4 There is no mathematical reason to limit the diagram to three dimensions. Volumes in the n-dimensional unit cube for any positive n yield bona fide distributions. So far our experiments indicate that three dimensions are enough.
Let us now outline a simple scheme for selecting the particular region assigned
to a given statement [o, p]. Let O, P be the vectors underlying o, p, respectively, and suppose them to be suitably normalized so that all coordinates fall into the interval [0,1]. Define the "O, P-box" to be the (unique) rectangular solid ℬ with the following properties: (5) (a) O falls within ℬ; (b) for i ≤ 3 the length of the ith side of ℬ is 1 minus the absolute difference between Oi and Pi; (c) within the foregoing constraints, P is as close as possible to the geometrical center of ℬ. It may be seen that the volume of ℬ varies directly with the match of P's coordinates to those of O; statements based on compatible objects and predicates are accorded higher probability thereby. Moreover, ℬ's position in the cube represents aspects of the semantics of [o, p]. For example, if p and q are complementary predicates with contrasting vectors then (5) assigns [o, p] and [o, q] boxes with little or no intersection. This reflects the low probability that must sensibly be assigned to [o, p] ∧ [o, q] in view of the incompatible contents of p and q. Many alternatives to (5) are possible. To serve as a computationally tractable means of coherent reasoning in large domains it suffices to meet the following condition: (C) Given a point x in the unit cube, and vectors O, P underlying object o and predicate p, it must be computationally easy to determine whether x lies in the region associated with [o, p]. In this case it is straightforward to calculate the volumes associated with any Boolean combination of statements, hence with any proposition.5 It is clear that (5) satisfies C. Observe that within any Venn diagram scheme that conforms to C, coherent probabilistic reasoning can proceed without storing an entire distribution. It is enough to store the vectors underlying objects and predicates since the volume associated with any given proposition (of reasonable length) can be easily retrieved from the vectors.
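A hedged sketch of how (5) might be implemented, assuming three dimensions and coordinates already normalized to [0,1]. The construction below (choice of left endpoints, clamping toward P's center) is our reading of constraints (a)-(c), and all the vectors are invented. Because the boxes are axis-aligned, the intersection of two boxes is again a box, so conjunction volumes are trivial to compute, in the spirit of condition (C).

```python
def op_box(O, P):
    """The O, P-box of (5): side i has length 1 - |O_i - P_i|; the box
    contains O, stays inside the unit cube, and centers on P as nearly
    as those constraints allow."""
    box = []
    for o_i, p_i in zip(O, P):
        length = 1 - abs(o_i - p_i)
        lo_min = max(0.0, o_i - length)   # box must contain o_i ...
        lo_max = min(1.0 - length, o_i)   # ... and fit inside [0, 1]
        lo = min(max(p_i - length / 2, lo_min), lo_max)  # center near p_i
        box.append((lo, lo + length))
    return box

def volume(box):
    v = 1.0
    for lo, hi in box:
        v *= max(0.0, hi - lo)            # empty intervals give volume 0
    return v

def intersection(b1, b2):
    """Axis-aligned boxes intersect in a box, so the volume of a
    conjunction of statements is computed coordinatewise."""
    return [(max(l1, l2), min(h1, h2)) for (l1, h1), (l2, h2) in zip(b1, b2)]

O = (0.8, 0.6, 0.7)   # object vector (invented)
P = (0.7, 0.6, 0.9)   # compatible predicate: volume 0.9 * 1.0 * 0.8 = 0.72
Q = (0.1, 0.3, 0.2)   # contrasting predicate: a much smaller box
b_p, b_q = op_box(O, P), op_box(O, Q)
```

As the text notes, a mismatched predicate like Q yields a small box with little overlap with b_p, so the conjunction [o, p] ∧ [o, q] sensibly receives low probability.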
5 A more careful formulation of C would refer to ε-spheres in place of points x, etc.
Thus, given 10 objects and 10 predicates, only 20 vectors need be stored. This is easily achieved even for vectors of considerable size. In contrast, 10 objects and 10 predicates give rise to 100 statements and thus to a distribution with 2^100 state descriptions. A potential solution to the problem of coherent reasoning, posed in section 2.4, is offered thereby. It must be emphasized that not every distribution can be represented by a Venn
diagram that meets C (just as not every distribution manifests conditional independencies of a computationally convenient kind). The question thus arises: do distributions that conform to C approximate human intuition about chance in a wide variety of domains? We are thus led to question (c) above, namely, whether the distribution delivered by our method resembles the original intuitions of subject ℋ. Sticking with the simple scheme in (5) - henceforth called the "Venn model" - let us now address this matter.
3.6. Accuracy of the method
We summarize one experimental test of our method. By an elementary argument (over obj ∪ pred) is meant a non-empty set of statements, one of which is designated as "conclusion", the remainder (if any) as "premises". Statements are considered special cases of elementary arguments, in which the premise set is empty. An argument may be conceived as an invitation to evaluate the probability of its conclusion while assuming the truth of its premises. Thirty college students evaluated 80 elementary arguments based on four mammals (which served as objects) and two predicates (e.g., "are more likely to exhibit 'fight' than 'flight' posture when startled"). For each subject, an individually randomized selection of 30 arguments was used to fix vectors representing his objects and predicates. This was achieved by working backwards from the Venn model, seeking vectors that maximize its fit to the subject's judgement about the 30 input arguments. The Venn model was then applied to the resulting vectors to produce probabilities for the remaining 50 arguments. For each subject we calculated the average, absolute deviation between the Venn model's predictions for the 50 arguments and the probabilities offered directly by the subject. Pearson correlations between the two sets of numbers were also calculated. The median, average absolute deviation between the observed probabilities assigned to a subject's 50 predicted arguments and the probabilities generated by the Venn model is .11.6 The correlation between the two sets of numbers is .78. The results suggest that the Venn method can extrapolate a coherent set of probabilities from a small input set, and do this in such a way that the extrapolated distribution provides a reasonable approximation to the judgement of the person providing input. The input set of probabilities need not be coherent.
6 This deviation can be compared to the following statistic. Consider the mean of the probabilities assigned to the 30 arguments used to fix the object and predicate vectors of a given subject. We may use this single number as a predictor of the probabilities assigned to the remaining 50 arguments. In this case the median, average absolute deviation between the observed and predicted probabilities is .20.
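For concreteness, the two summary statistics used in this test, average absolute deviation and Pearson correlation between observed and predicted probabilities, can be computed as follows. The observed and predicted values below are invented, not the experiment's data.

```python
def avg_abs_dev(observed, predicted):
    """Mean of |observed - predicted| over the test arguments."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)

def pearson(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Invented probability judgements for five test arguments.
observed  = [0.10, 0.40, 0.65, 0.80, 0.30]
predicted = [0.20, 0.35, 0.70, 0.70, 0.25]
```

For these invented numbers the average absolute deviation is .07 and the correlation is high; the paper's .11 and .78 are medians of such per-subject statistics.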
4. Concluding remarks
The problem of recovering the coherent core of human probability judgement strikes us as an important project for cognitive psychology. It unites theorizing about the mental mechanisms of reasoning with a practical problem for expert systems, namely, finding an exploitable source of Bayesian priors. The system sketched above is preliminary in character, and serves merely to suggest the feasibility of the research program we advocate. Psychological research in recent years has produced considerable understanding of the character and causes of incoherent reasoning, even if debate continues about its scope and interpretation (see Gigerenzer & Murray, 1987; Osherson, 1990; Shafir & Tversky, 1992; Tversky & Shafir, 1992, and references cited there). It was noted in section 2.1 that probabilistic coherence has non-trivial justification as a standard - however incomplete - of normatively acceptable reasoning. We thus take there to be good empirical evidence, plus great computational plausibility, in favor of the thesis that human judgement is imperfect from the normative point of view. This thesis does not, however, impugn every aspect of ordinary reasoning. Indeed, the merits of human judgement have often been emphasized by the very researchers who investigate its drawbacks (e.g., Nisbett & Ross, 1980, p. 14). A challenge is posed thereby, namely, to devise methods that distill the rational component of human thought, isolating it from the faulty intuition that sometimes clouds our reason. Such appears to have been the goal of early inquiry into probability and utility (Gigerenzer et al., 1989, Ch. 1). It remains a worthy aim today.
References
Andreassen, S., Woldbye, M., Falck, B., & Andersen, S. (1989). Munin: A causal probabilistic network for interpretation of electromyographic findings. In Proceedings of the Tenth International Joint Conference on Artificial Intelligence.
Bobrow, D., & Winograd, T. (1976). An overview of KRL, a knowledge representation language. Cognitive Science, 1, 3-46.
Casscells, W., Schoenberger, A., & Grayboys, T. (1978). Interpretation by physicians of clinical laboratory results. New England Journal of Medicine, 299, 999-1000.
Cooper, G.F. (1987). Probabilistic inference using belief networks is NP-hard. Memo KSL-87-27, Knowledge Systems Laboratory, Stanford University, May 1987.
Cox, R. (1946). Probability, frequency, and reasonable expectation. American Journal of Physics, 14, 1-13.
de Finetti, B. (1964). La prévision: Ses lois logiques, ses sources subjectives [transl. into English]. In H. Kyburg & P. Smokler (Eds.), Studies in subjective probability. New York: Wiley.
de Finetti, B. (1972). Probability, induction and statistics. New York: Wiley.
Doignon, J.-P., Ducamp, A., & Falmagne, J.-C. (1984). On realizable biorders and the biorder dimension of a relation. Journal of Mathematical Psychology, 28, 73-109.
Dubois, D., & Prade, H. (1991). Updating with belief functions, ordinal conditional functions and
possibility measures. In P.P. Bonissone, M. Henrion, L.N. Kanal, & J.F. Lemmer (Eds.), Proceedings of the Sixth Workshop on Uncertainty in Artificial Intelligence (pp. 311-329). Amsterdam: Elsevier.
Fagin, R., & Halpern, J. (1991). A new approach to updating beliefs. In P.P. Bonissone, M. Henrion, L.N. Kanal, & J.F. Lemmer (Eds.), Proceedings of the Sixth Workshop on Uncertainty in Artificial Intelligence (pp. 347-374). Amsterdam: Elsevier.
Gaifman, H., & Snir, M. (1982). Probabilities over rich languages. Journal of Symbolic Logic, 47, 495-548.
Geiger, D., Verma, T., & Pearl, J. (1990). d-Separation: From theorems to algorithms. In R.D. Shachter, M. Henrion, L.N. Kanal, & J.F. Lemmer (Eds.), Uncertainty in artificial intelligence 5. Amsterdam: North-Holland.
Gigerenzer, G., & Murray, D.J. (1987). Cognition as intuitive statistics. Hillsdale, NJ: Erlbaum.
Gigerenzer, G., Swijtink, Z., Porter, T., Daston, L., Beatty, J., & Kruger, L. (1989). The empire of chance. Cambridge, UK: Cambridge University Press.
Glymour, C. (1992). Thinking things through. Cambridge, MA: MIT Press.
Heckerman, D. (1986). Probabilistic interpretations for MYCIN's certainty factors. In L.N. Kanal & J.F. Lemmer (Eds.), Uncertainty in artificial intelligence. Amsterdam: North-Holland.
Heckerman, D. (1990a). Probabilistic similarity networks.
Heckerman, D. (1990b). A tractable inference algorithm for diagnosing multiple diseases. In R.D. Shachter, M. Henrion, L.N. Kanal, & J.F. Lemmer (Eds.), Uncertainty in artificial intelligence 5. Amsterdam: North-Holland.
Jeffrey, R. (1983). The logic of decision (2nd edn.). Chicago: University of Chicago Press.
Kahneman, D., Slovic, P., & Tversky, A. (Eds.) (1982). Judgment under uncertainty: Heuristics and biases. Cambridge, UK: Cambridge University Press.
Kahneman, D., & Tversky, A. (1972). Subjective probability: A judgement of representativeness. Cognitive Psychology, 3, 430-454.
Kelly, K.T. (1993). The logic of reliable inquiry. MS, Department of Philosophy, Carnegie Mellon University.
Lauritzen, S., & Spiegelhalter, D. (1988). Local computations with probabilities on graphical structures and their applications to expert systems. Journal of the Royal Statistical Society, B, 50, 157-224.
Lindley, D. (1982). Scoring rules and the inevitability of probability. International Statistical Review, 50, 1-26.
Long, W., Naimi, S., Criscitiello, M., & Jayes, R. (1987). The development and use of a causal model for reasoning about heart failure. IEEE Symposium on Computer Applications in Medical Care (pp. 30-36).
Minsky, M. (1981). A framework for representing knowledge. In J. Haugeland (Ed.), Mind design. Cambridge, MA: MIT Press.
Minsky, M. (1986). The society of mind. New York: Simon & Schuster.
Neapolitan, R. (1990). Probabilistic reasoning in expert systems. New York: Wiley.
Nisbett, R., & Ross, L. (1980). Human inference: Strategies and shortcomings of social judgment. Englewood Cliffs, NJ: Prentice-Hall.
Olesen, K.G., Kjaerulff, U., Jensen, F., Jensen, F.V., Falck, B., Andreassen, S., & Andersen, S.K. (1989). A munin network for the median nerve: A case study on loops. Applied Artificial Intelligence, Special Issue: Towards Causal AI Models in Practice.
Osherson, D. (1987). New axioms for the contrast model of similarity. Journal of Mathematical Psychology, 31, 93-103.
Osherson, D. (1990). Judgment. In D. Osherson & E.E. Smith (Eds.), Invitation to cognitive science: Thinking (pp. 55-88). Cambridge, MA: MIT Press.
Osherson, D. (in press). Probability judgment. In E.E. Smith & D. Osherson (Eds.), Invitation to cognitive science: Thinking (2nd edn.). Cambridge, MA: MIT Press.
Osherson, D., Biolsi, K., Smith, E., Shafir, E., & Gualtierotti, A. (1994). A source of Bayesian priors. IDIAP Technical Report 94-03.
Osherson, D., Smith, E.E., Meyers, T.S., Shafir, E., & Stob, M. (in press). Extrapolating human probability judgment. Theory and Decision.
Osherson, D., Stern, J., Wilkie, O., Stob, M., & Smith, E. (1991). Default probability. Cognitive Science, 15, 251-270.
Pearl, J. (1986). Fusion, propagation, and structuring in belief networks. Artificial Intelligence, 29, 241-288.
Pearl, J. (1988). Probabilistic reasoning in intelligent systems: Networks of plausible inference. San Mateo, CA: Morgan Kaufmann.
Resnik, M. (1987). Choices: An introduction to decision theory. Minneapolis: University of Minnesota Press.
Ross, S. (1988). A first course in probability. New York: Macmillan.
Savage, L. (1972). The foundations of statistics. New York: Dover.
Shafer, G. (1976). A mathematical theory of evidence. Princeton, NJ: Princeton University Press.
Shafer, G. (1986). Constructive probability. Synthese, 48, 1-60.
Shafer, G., & Pearl, J. (Eds.) (1990). Readings in uncertain reasoning. San Mateo, CA: Morgan Kaufmann.
Shafir, E., & Tversky, A. (1992). Thinking through uncertainty: Nonconsequential reasoning and choice. Cognitive Psychology, 24, 449-474.
Shortliffe, E., & Buchanan, B. (1975). A model of inexact reasoning in medicine. Mathematical Biosciences, 23, 351-379.
Smith, E.E., & Medin, D. (1981). Categories and concepts. Cambridge, MA: Harvard University Press.
Stockmeyer, L. (1974). The complexity of decision problems in automata theory and logic. Ph.D. thesis, MIT.
Suppes, P., Krantz, D.H., Luce, R.D., & Tversky, A. (1989). Foundations of measurement (Vol. II). San Diego: Academic Press.
Szolovits, P., & Pauker, S. (1978). Categorical and probabilistic reasoning in medical diagnosis. Artificial Intelligence, 11, 115-144.
Tversky, A. (1977). Features of similarity. Psychological Review, 84, 327-362.
Tversky, A., & Kahneman, D. (1983). Extensional versus intuitive reasoning: The conjunction fallacy in probability judgement. Psychological Review, 90, 293-315.
Tversky, A., & Shafir, E. (1992). The disjunction effect in choice under uncertainty. Psychological Science, 3, 305-309.
Whittaker, J. (1990). Graphical models in applied multivariate statistics. New York: Wiley.
Winterfeld, D.V., & Edwards, W. (1986). Decision analysis and behavioral research. New York: Cambridge University Press.
12 Levels of causal understanding in chimpanzees and children
David Premack*, Ann James Premack
Laboratoire de Psycho-Biologie du Développement, CNRS, 41 rue Gay Lussac, 75005 Paris, France
Abstract We compare three levels of causal understanding in chimpanzees and children: (1) causal reasoning, (2) labelling the components (actor, object, and instrument) of a causal sequence, and (3) choosing the correct alternative for an incomplete representation of a causal sequence. We present two tests of causal reasoning, the first requiring chimpanzees to read and use as evidence the emotional state of a conspecific. Despite registering the emotion, they failed to use it as evidence. The second test, comparing children and chimpanzees, required them to infer the location of food eaten by a trainer. Children and, to a lesser extent, chimpanzees succeeded. When given information showing the inference to be unsound - physically impossible - 4-year-old children abandoned the inference but younger children and chimpanzees did not. Children and chimpanzees are both capable of labelling causal sequences and completing incomplete representations of them. The chimpanzee Sarah labelled the components of a causal sequence, and completed incomplete representations of actions involving multiple transformations. We conclude the article with a general discussion of the concept of cause, suggesting that the concept evolved far earlier in the psychological domain than in the physical.
*Corresponding author. The data reported here were collected at the University of California, Santa Barbara, and the University of Pennsylvania Primate Facility. We are greatly indebted to Guy Woodruff, who participated in all phases of the research and would be a co-author if we knew his whereabouts and could obtain his permission. We are also indebted to the many students, graduate and undergraduate, at both institutions who assisted in the care and testing of the chimpanzees.
D. Premack, A. Premack
Introduction
In this paper we compare three levels of causal understanding in chimpanzees and children. At the deepest level, the individual engages in causal reasoning, solving problems in which he sees the outcome of a process but not its cause, and must infer or reconstruct the missing events. At an intermediate level the individual analyses intact causal sequences into their components and labels them. He must label the actor, object, and instrument of the action. The ability to carry out this task is a prerequisite for causal reasoning; if one cannot identify the separate parts of an intact sequence, one cannot identify the missing part of an incomplete sequence. At the most superficial level, the individual must complete an incomplete representation of a causal action, by selecting the correct alternative. Since the alternatives are all visible this task is the least demanding. Chimpanzees have been shown to be capable of all but the deepest level (Premack, 1976, 1983), though they have been shown capable of analogical reasoning. Not only do they complete incomplete analogies and make same/different judgments about exemplars that are and are not analogies (Gillan, Premack, & Woodruff, 1981), they also construct analogies from scratch (Oden & Premack, unpublished data). There is evidence (Gillan, 1981), albeit inconclusive, that they can do transitive inference. But there is little indication that they are capable of "Sherlock Holmes" type reasoning, that is, causal reasoning. In causal reasoning, an outcome - a corpse on the floor - is presented, but the cause is not. A human confronted with this scene would ask immediately, "Who did it? How? When, where and why?" He would answer these questions by making inferences from the "evidence", which has two main sources: existing knowledge concerning the "corpse" and its past, and observations concerning the immediate circumstances.
As an astute observer, Sherlock Holmes was a good reasoner because he had an uncanny sense of what was "relevant", detecting implications in what others dismissed as neutral facts.
Causal reasoning
In this paper we present two tests of causal reasoning: one conducted with a group of chimpanzees, another in which we compare chimpanzees and young children.
Subjects
The chimpanzees (Pan troglodytes) were African born; four were 3½-4½ years old and Sarah was about 10 years old at the time of the study. Animals entered the laboratory as infants, were diapered and bottle-fed, trained extensively in match-to-sample, and (some) were taught an artificial language. They were maintained rather like middle-class children on three meals a day, snacks, and continuous access to water. When tested, some were "rewarded" with a preferred food while others were simply praised. Participating children came from Philadelphia schools, and varied in age from 3.8 to 4.5 years, with an average age of 4.1 years.
Reading emotional evidence
Four juvenile chimpanzees were tested in a simple two-step reasoning problem. We first trained them to run down a path to a consistently positive hidden object, then introduced them to occasional negative trials; that is, a rubber snake was substituted for the food on 15% of trials on a random schedule. The unpredictable negative trials profoundly changed the character of the chimpanzees' run. They no longer dashed full speed to the goal, but slowed midway, approaching the goal hesitantly. We next offered the animals an opportunity to play Holmes, to escape the uncertainty of the negative trial by using the emotional state of an informant to infer what object it had encountered on its run. Now, before starting a run, each animal was placed in a holding room with an informant that had just completed a run. The informant, having encountered either food or the rubber snake on its run, was in either a positive or negative emotional state, and this state, we have reason to believe, was successfully communicated to the recipient. Uninformed human judges shown videotapes of the informant could discriminate the state of the informant (ca. 98% correct). Beyond that, they could discriminate the recipient's state following its contact with the informant (ca. 70% correct). However, the "informed" chimpanzees seemed not to profit from this contact, for they accepted all opportunities to run, and did so in the same way whether: (1) the informant was in a positive state, (2) a negative state, or even when on control trials (3) had had no contact with an informant at all. The holding room was adjacent to the runway. Animals were taken there prior to, and immediately after, a run (to serve as an informant). Every chimpanzee played both roles, that of informant and recipient, and had the opportunity to observe that its conspecifics too played both roles. The use of four animals permitted 12 possible recipient-informant pairs, all of which were used.
Under these conditions, the animals should have been able to infer that an informant's emotional state was the result of what it had found at the end of a run. No chimpanzee ever encountered another in the holding room whose emotional state was not owed to a run. Nonetheless, it still could not use the emotional state (which it registered at some level) as evidence from which to infer what the informant had encountered on its run. Could the chimpanzee have made this inference but assumed that what the informant found was not a good prediction of what it would find - in other words, assumed it might be snakes for you but food for me? We cannot rule out this possibility, but a human in this circumstance would certainly explore the hypothesis that snakes for you means snakes for me as well. Before testing the chimpanzees, we had assumed this was a simple problem, one that could be easily solved, and therefore could be used as a base condition on which to impose variations that would permit our answering fundamental questions about reasoning. Perhaps at 3½-4½ years the chimpanzees were too young and could have solved the problem when older. The age at which children can solve this problem is not known, for one cannot test children with frightening objects.
Using location as evidence
In the next experiment, we tested both chimpanzees (the same group of four used in the previous problem), and two groups of children (10 in each group). The apes were tested in their outdoor compound, and the children in a classroom. For the chimpanzees, we placed two opaque containers about 30 feet apart. These formed the base of a triangle with the chimpanzee at its apex, midway between the containers and 30 feet from the base. The chimpanzee was accompanied by a trainer. As the two watched, a second trainer placed an apple in one container and a banana in the other. Following this action, the accompanying trainer placed a blind in front of the chimpanzee so that the containers were no longer visible to it. The trainer distracted the animal for approximately 2 min before removing the blind. What the subject now saw was the second trainer standing midway between the containers eating either an apple or a banana. Having eaten the fruit, the trainer left, and the chimpanzee was released. Each animal was given 10 trials, with an intertrial interval of about an hour. The two fruit were placed equally often in both containers, and the fruit eaten by the experimenter was counterbalanced over trials. We used apples and bananas because apes are fond of both and find them about equally desirable. Children were tested with a comparable procedure adjusted to suit a classroom and human food preferences.
Levels of causal understanding in chimpanzees and children
241
Results

Children in both groups were largely consistent, 18 of 20 choosing the container associated with an item different from the one the experimenter was eating.¹ That is, on trials on which the experimenter was seen eating a cookie, children selected the container which held the doughnut, and on trials on which he was seen eating the doughnut, the container which held the cookie. Chimpanzees, however, were inconsistent. Sadie, the oldest chimpanzee, responded as the children did, choosing the container associated with a fruit different from the one eaten by the experimenter. She made this selection on the first trial as well as on all subsequent ones. By contrast, Luvie did the opposite, choosing the container associated with the fruit that was the same as the one eaten by the trainer. For instance, upon seeing the trainer eat the apple, she went to the container which held the apple, and upon seeing him eat the banana, to the container which held the banana. Bert and Jessie responded in an intermediate fashion: after first choosing the opposite container for two and four trials, respectively, they chose the container associated with a food different from that which the trainer was seen to be eating.
Discussion

Causal reasoning is difficult because a missing item must be reconstructed. In simple learning, a monkey receives an electric shock when it presses a lever (and in observational learning observes that another receives a shock when it presses the lever); in causal reasoning, the observer does not see the model press the lever but sees only its emotional response to the shock. Because in both simple and observational learning the temporal contiguity between lever press and shock is either experienced or observed, the relation between them is readily learned. Causal reasoning offers no such temporally contiguous events. Whereas the monkey in observational learning sees the relation between the lever press and the model's painful state, the chimpanzee in our tests experienced only the informant's emotional state, and had to reconstruct from it the event that caused the state. Even though the chimpanzees had experienced the same emotional states as consequences of the same events, they were incapable of reconstructing those events from the emotional state of another chimpanzee. This helps clarify the striking difference in difficulty between

¹ The difference between these data and preliminary data reported earlier (Premack, 1983) comes from an unaccountable difference between our village and city children. Village children typically lagged city children by 6-12 months.
242
D. Premack, A. Premack
learning and reasoning, and suggests why the former is found in all species, the latter in exceedingly few. One might say that in the second experiment there is evidence for causal reasoning on the part of the children and perhaps one chimpanzee. This experiment can be seen as one of causal reasoning because here too there is a missing element to reconstruct. The subjects saw the trainer eating one or another food, but were never shown where he obtained it. Nevertheless, the children and perhaps one chimpanzee evidently did reconstruct the location of the food. What is most interesting about this outcome is that subjects "asked" the question of where the trainer got the food, and "answered" it quite specifically by going consistently to the container holding the food different from that eaten by the trainer. They assumed the food was the same as that which had been placed in the container, and in making this assumption believed the one container to be empty. It is not the specific content of the assumption alone that is of interest, but the additional fact that they made any assumption at all. Most species, we suggest, will make no assumptions. They will observe the trainer eating and never ask where he obtained the food. Such a question is asked only if one sees an event as part of a causal sequence in which there is a missing component. Could we induce our subjects to change their assumption? Suppose there is insufficient time for the trainer to recover the food placed in the containers. Would this affect choice? Keeping all other conditions constant, we tested this possibility by wrapping the fruit and pastries in elaborate packages before placing them in the containers. Now the trainer could not possibly have obtained the food from the containers - there was not sufficient time for him to unwrap these items. Children of 4 years and older were profoundly affected by this change. 
They no longer chose the container holding the pastry different from the one eaten by the trainer but chose at chance level between the two containers. By contrast, younger children and the chimpanzees were totally unaffected by the change in the availability of the item. They responded as before.
Labelling a causal action

A causal action can be analysed into three components: the actor who carries out the action, the object on which he acts, and the instrument with which he performs the action. For instance, John paints a wall with a brush, cuts an apple with a knife, and washes his dirty socks in soapy water. Can young children and chimpanzees analyse such causal sequences? We devised a non-verbal procedure to answer this question, showing simple actions on a television monitor and giving our subjects markers that adhered to
the screen and allowed them to identify each component of an action. A red square placed on John, for example, identified the actor; a green triangle on the apple, the recipient of the action; a blue circle on the knife, the instrument of the action. Sarah was given this problem. She was trained on three different actions: Bill cutting an orange, John marking a paper with a pencil, and Henry washing an apple with water. The trainer demonstrated the proper placement of each marker, handed the markers to Sarah, and corrected Sarah's errors. After reaching criterion on the three training tapes, she was given a transfer test in which all her responses were approved - our standard procedure in transfer tests. The tests were uniquely demanding, for the scenes were not merely new, but also decidedly more complex than those used in training. Where the training scenes had presented one person acting on one object with one instrument, the transfer scenes presented: two objects, only one of which was acted upon; two instruments, only one of which was used; and two persons, only one of whom carried out the action, the other being engaged in some scenes as an observer of the action of the first, and in other scenes as the recipient of action; for example, Bill brushed Bob's hair. Sarah passed the transfer tests, but at a relatively low level of accuracy. She was 85% correct in the use of the actor marker, 67% correct with the object marker, and 62% correct with the instrument marker. We attempted to improve her accuracy by training her on the transfer series, correcting her errors where previously we had approved all her responses. The attempt failed because she now placed the markers on the blank part of the screen (calling our attention to a fact we had previously overlooked - most of the screen is blank!). This tactic was entirely new, and brought the experiment to a halt. 
In retrospect, we recognize the experiment was needlessly difficult, and could have been simplified by dropping one of the categories, either object or instrument. Kim Dolgin (1981), as part of her doctoral research, applied the same procedures to young children from 3.8 to 4.4 years of age, with an average age of 4 years, using the same non-verbal approach used with Sarah. The children failed the transfer tests. They did not properly use the actor marker but drew a simpler distinction, animate/inanimate (or perhaps person/non-person). They reserved the actor marker for people, but without regard for whether the person was actor, observer, or recipient of the action. They made a similar simplification in the case of the object and instrument markers, reserving them for non-persons, but without regard for whether the object or instrument was in actual use or simply present. With the children, Dolgin took a further step not possible with Sarah: she told the children the meaning of each marker. For example, she presented the scene in which Bill cut the apple, and then, showing the child the actor marker, told her
"He's the one doing it", the object marker "This is what he's doing it to", and the instrument marker "This is what he's doing it with." The results were dramatic. The children passed the transfer tests at an average level of 93%, far higher than Sarah.
Causal sequences as transformations

An actual causal sequence is a dynamic one in which an agent acts on an object, typically with the use of an instrument, changing its state and/or location. But one can represent the causal sequence in a stylized way - an apple and a cut apple representing the transformation of the apple, a knife the object responsible for the transformation. To determine whether non-speaking creatures can recognize such transformations, we designed a test in which the subject was given an incomplete representation of this causal sequence and was required to choose an alternative that properly completed it (Premack, 1976). The main actions we tested were cutting, marking, wetting and, in special cases, actions that reversed the effect: joining, erasing and drying. The subjects had extensive experience with the actions on which they were tested, having carried them out in a play context. The problem was given to three chimpanzees (and numerous children) in two basic formats, one requiring that they choose the missing operator, the other the missing terminal state. In the former, they were given as a representative test sequence "apple ? cut apple" along with the alternatives knife, pencil, container of water. In the latter, they were given as a representative test sequence "apple knife ?" along with the alternatives cut apple, cut banana, wet apple. The chimpanzees were given not only novel object-operator combinations but also anomalous ones - for example, cut sponges, wet paper, fruit that had been written on - and performed as well on the anomalous material as on the other (see Premack, 1976, pp. 4-7, 249-261 for details). The tests were passed only by language-trained chimpanzees, which were given no training on the test but passed on their first exposure to it.
This test was only one of four that language-trained chimpanzees could do; the other three were analogies, same/different judgments on the relations between relations, and the matching of physically unlike proportions (e.g., part of a potato to an equivalent part of a glass of water) (Premack, 1983). Language training conferred an immediate ability to solve this complete set of four tasks. Only with protracted training could non-language-trained chimpanzees be taught to do these tests, and then only one test at a time, with no apparent saving from one to the other (Premack, 1988).
Mapping the directionality of action

The direction of the action was depicted in the test sequences by the left/right order of the objects. Thus, the object in its initial state was always presented on the left, the object in its transformed state on the right. But the standard tests did not require the subjects to read the sequences from left to right; they might have chosen an operator simply on the grounds that it belonged with a particular action - a knife, for example, as the operator for cutting - making this choice without regard to order. Whether the intact apple was to the left or right of the cut one, the animal could well have made the same choice. To obtain evidence that Sarah could discriminate the left-right order of the sequence and make use of it, she was first acquainted with pairs of actions that had reverse effects. For example, the trainer showed her how to mend broken pieces of an object with Scotch tape, how to erase pencil or crayon marks with a gum eraser, how to dry a wet object with a cloth - and then gave her the opportunity to carry out the actions. She adopted these new actions with enthusiasm. She was then given three test sessions with the original "cut, mark, wet", and four sessions with the new cases "tape, erase, dry". The tests took the standard form, for example, "apple ? cut apple", accompanied by the standard three operators, for example, knife, container of water, pencil. A total of 26 objects were used and 12 operators, two of each kind. Each of the three cases (tape, erase, dry) was presented four times per session in random order, with the position of the correct operator randomized across left, centre and right positions.

Results: Total = 40/60
Original cases = 12/18
New cases = 28/42
Zdiff between old and new: not significant.

These preliminary results simply established that Sarah understood the new actions and could deal with them correctly.
She was then required to use the left-right order of the sequence, and was presented with pairs of trials in which the same material appeared in reverse order - for example, "paper ? marked paper" and "marked paper ? paper". She was given pencil, container of water, and eraser as possible operators. Now, to choose correctly, Sarah had to take order into account, for while pencil was correct (and eraser incorrect) for one order, eraser was correct (and pencil incorrect) for the other. It takes a pencil to mark a blank sheet of paper, an eraser to remove the mark. Sarah was given 16 sessions, 12 trials per session, old cases being presented 24
times each, new cases 36 times each, in random order across sessions. Although the objects and operators used were the same as those in the previous tests, they were combined in new ways. All other details were the same as those of the preceding tests.

Results: Total = 110/180
Original cases = 47/72
New cases = 63/108

If we exclude trials in which Sarah chose the incorrect irrelevant alternative rather than the relevant one, the corresponding figures are:

Total = 110/148
Original cases = 47/62
New cases = 63/86
Zdiff between old and new: not significant.

The reversal pairs compared as follows:

Cut/tape: 28/60
Mark/erase: 37/60
Wet/dry: 45/60
Zdiff between cut/tape and wet/dry = 3.18.

Finally, Sarah was given an extensive transfer test involving new objects and operators. In five sessions of 12 trials per session, 30 new objects were used as well as 60 new operators, 10 of each kind. Each object appeared twice, once with one or another of the six new operators, and again with the reverse operator. Each operator appeared three times: as correct alternative, incorrect reverse alternative, and incorrect irrelevant alternative. Each case appeared twice per session in counterbalanced order. All other details were the same as those already reported.

Results: Total = 44/60
Original cases = 20/30
New cases = 24/30

Excluding trials in which Sarah chose the incorrect irrelevant alternative:

Total = 44/52
Original cases = 20/25
New cases = 24/27

Reversal pairs:
Cut/tape = 15/20
Mark/erase = 12/20 (p < .05 with three alternatives)
Wet/dry = 17/20
No significant Zdiff.

These data establish that Sarah could use left-right order to map the directionality of action as accurately on unfamiliar cases as on familiar ones.
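The Zdiff statistics quoted above are standard two-proportion z tests. As a check, the sketch below (ours, not part of the original study) recomputes them from the raw counts reported in the text:

```python
from math import sqrt

def two_proportion_z(k1, n1, k2, n2):
    """Pooled two-proportion z statistic for k1/n1 versus k2/n2."""
    p1, p2 = k1 / n1, k2 / n2
    p = (k1 + k2) / (n1 + n2)                   # pooled proportion
    se = sqrt(p * (1 - p) * (1 / n1 + 1 / n2))  # pooled standard error
    return (p1 - p2) / se

# Wet/dry vs. cut/tape reversal pairs: 45/60 vs. 28/60
z_reversal = two_proportion_z(45, 60, 28, 60)   # matches the reported 3.18

# Old vs. new cases in the order test: 47/72 vs. 63/108
z_old_new = two_proportion_z(47, 72, 63, 108)   # well below 1.96
```

The first value reproduces the reported 3.18; the second falls well short of the conventional 1.96 criterion, consistent with the "not significant" verdict.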
Multiple transformations

The basic consequence of a causal action is a transformation - a change from an initial state to a final one. Could Sarah understand causal action from this perspective, looking at the initial state of an object, comparing it with its terminal state, then selecting the operator(s) that explains or accounts for the difference? We can add to the interest of this question by removing the restrictions that had applied to the examples Sarah had been given. First, transformations now involved more than a single action - for example, paper could be both cut and marked. Second, the initial state could be an already-transformed object rather than one in an intact or canonical state. We not only lifted restrictions, but gave Sarah a special trash bin in which to discard incorrect or irrelevant operators. So, besides removing the interrogative particle and replacing it with the correct or relevant operator(s), she was required to select the incorrect or irrelevant operator(s) and place them in the trash. The test consisted of six 12-trial sessions, each consisting of single-action and double-action trials in equal number, counterbalanced over the session. The six actions and their combinations were presented in equal number in each session, counterbalanced over the session. The rest of the procedural details have already been reported.

Results: Single transformations = 25/36
Double transformations = 24/36
(p < .001, both cases)
Total = 49/72

These results add to the evidence of Sarah's ability to use the test sequences as representations of action. Her analyses answered these implicit questions: (1)
What operator changed the object from its initial to its terminal state? (2) In applying this operator to this object, what terminal state did one produce? (3) Which operators caused the difference between the initial and terminal state, and which did not? In answering these questions, Sarah had to attend to the order of the test sequences, "reading" them from left to right. We speculate that the representational ability required to pass these tests is that of a mind/brain which is capable of copying its own circuits. In carrying out an actual causal sequence, such as cutting an apple with a knife, an individual may form a neural circuit enabling him to carry out the act efficiently. But suppose he is not required to actually cut an apple, but is instead shown a representation of cutting - an incomplete depiction of the cutting sequence such as was given the chimpanzees - could he use the neural circuit to respond appropriately, that is, to complete the representation by choosing the missing element? Probably not, for the responses associated with the original circuit are those of actual cutting; they would not apply to repairing an incomplete representation of cutting. Moreover, the representation of a sequence can be distorted in a number of ways, not only by removing elements as in the chimpanzee test, but also by duplicating elements, misordering them, adding improper elements, or combinations of the above. To restore distorted sequences to their canonical form requires an ability to respond flexibly, for example, to remove elements, add others, restore order and the like. Flexible novel responding of this kind is not likely to be associated with the original circuitry (that concerned with actual cutting), but more likely with a copy of the circuit. Copies of circuits are not tied, as are the original circuits, to a fixed set of responses, and they may therefore allow for greater novelty and flexibility. 
For this reason, we suggest, flexible responding may depend on the ability of a mind/brain to be able to make copies of its own circuits.
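The p < .001 values reported for the single- and double-transformation scores can be checked with an exact binomial tail. The sketch below (ours, for illustration) assumes, as in the earlier operator tests, three alternatives per trial, so that chance is 1/3 - an assumption on our part, since the chance level for this test is not stated:

```python
from math import comb

def binom_tail(k, n, p):
    """Exact upper-tail probability P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

chance = 1 / 3                           # assumed: three alternatives per trial
p_single = binom_tail(25, 36, chance)    # single transformations, 25/36
p_double = binom_tail(24, 36, chance)    # double transformations, 24/36
```

Under that assumption, both tail probabilities fall below .001, agreeing with the reported significance levels.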
An attempt to combine three questions

Sarah was given a test that consisted of three questions: (1) What is the cause of this problem? (2) How can it be solved? (3) What is neither cause nor solution but merely an associate of the problem? These questions were not asked explicitly, with words, but implicitly, with visual exemplars. Similarly, her answers were given not in words but in visual exemplars. The problems about which she was queried were depicted by a brief videotape (the terminal image of which was put on hold). For instance, she was shown a trainer vigorously stamping out a small conflagration of burning paper. The questions asked her in this case were: What caused the fire? How could it be put out? What is neither cause nor solution but an associate of the fire? The correct answers were photographs of: matches (cause), a bucket of water
(solution), and a pencil (associate), the latter because she often used a pencil in scribbling on paper of exactly the kind that was shown in the videotape. The three questions were identified by different markers (like those used to identify the three components of an action), though the meaning of these markers was not determined by their location on the television image, but by the correct answers with which each marker was associated. The markers were introduced by presenting each of them with a videotape, offering three photographs, and teaching her which of them was correct. In the example concerning a fire, which served as one of three training cases, she was presented the marker for "cause" (red square) with three photographs - matches, clay, knife - and taught to choose the matches. When presented the marker for "solution" (green triangle), she was shown photographs of a bucket of water, Scotch tape, and an eraser, and taught to choose the bucket of water. When presented the marker for "associate" (blue circle), she was shown photographs of a pencil, an apple, and a blanket, and taught to choose the pencil. This procedure, repeated with two other training videotapes, was intended to teach her to view the problems depicted on the videotape according to the perspective indicated by the correct answer associated with a marker. Once she reached criterion on the three training cases, she was given a transfer test involving 20 new problems. When she failed, it was decided to train her on this material and to bring in a new trainer - one who no longer played an active role in her care or testing but who had been an early caretaker, was a favourite, and could be counted on to bring out her best effort. Sarah definitely "tried" harder with some trainers than with others. Ordinarily she looked only once at an alternative before choosing, but with a difficult question and a favourite trainer, several times.
Her looking behaviour was readily observable; after the trainer gave her the test material in a manila envelope, he left the cage area. Sarah could be observed on a television monitor to empty the envelope on the cage floor, spread out the alternatives, inspect them, choose, and then ring her bell (a period marker signalling an end and summoning the trainer). With this favourite trainer she not only looked at the alternatives with more than usual care but did several double-takes; that is, looked, looked away, and then quickly looked back. This did not help her cause; she made three consecutive errors. When the trainer entered to show her the fourth videotape, she lost sphincter control and ran screaming about the cage. Although a demanding test, it is not necessarily beyond the capacity of the chimpanzee. It must be taught more carefully than we taught it: not as a combination of three markers, but one marker at a time, then two and, only when there is success on two, all three presented together. We subsequently used this approach with 4½-6½-year-old children on a problem only slightly less demanding than the one given Sarah, and they succeeded nicely (Premack, 1988).
General discussion

There are two traditions in which causality has been studied in psychology: the natural causality of Michotte (1963), and the arbitrary causality of Hume (1952). Natural causality (Premack, 1991) concerns the relation between special pairs of events - one object launching another by colliding with it is the classic example; whereas arbitrary causality concerns the relation between any pair of temporally contiguous events - a lever press followed by the presentation of a Mars bar is one example. These two traditions have fostered conflicting interpretations of causality. The perception of natural causal relations requires only a single pairing of appropriate events, is demonstrable in 6-month-old infants (e.g., Leslie & Keeble, 1987), and is considered innate; whereas the perception of arbitrary causal relations requires repeated pairings in time of two events, and is learned. But are these differences real, or do they simply reflect differences in subject matter? Although Michotte's case is typically the only cited example of natural causality, it is essential to recognize that there is another, more basic example. This is the psychological case, where we perceive causality under two conditions: (1) when an object moves without being acted upon by another object, that is, is "self-propelled" (Premack, 1990); and (2) when one object affects another "at a distance", that is, affects another despite a lack of spatial/temporal contiguity. Humans unquestionably perceive causality under both these conditions. Yet this fact has received little comment - virtually none compared to the extensive comment on the Michotte case. Why? We suggest it is because the perception of causality in the psychological domain evolved far earlier than it did in the Michotte case, and belongs to a "part of the mind" that is less accessible to language.
The perception of causality of the Michotte variety probably evolved late, and even then only in a few tool-using species (Kummer, in press), for only humans, apes, and the exceptional monkey (e.g., Cebus) handle objects in a manner capable of producing collisions. Compare this to the perception of causality in the psychological domain, which is not restricted to a few tool-using species but is found in virtually all species. Intentional action which involves either a single individual or one individual acting upon another is part of the experience of all but a few invertebrates. Bar pressing that produces food, threats that produce withdrawal, collisions that launch objects, and so on, all fit neatly into either a physical or psychological category. These cases are important because they give the impression that the concept of cause has a content: psychological, physical, or both. However, infants may perceive a causal relationship when presented with totally arbitrary cases; for
example, a sharp sound followed by a temporally contiguous colour change in an object. If so, this example is important because it demonstrates that the concept of cause may be without content. Let us habituate one group of infants to a contiguous case, another to a delay case, and then apply the Leslie-Keeble paradigm by reversing the order of the two events: we present colour change followed by sound to both groups. The contiguous group, we predict, will show greater dishabituation (recovery in looking time) than the delay group. In other words, our outcome will be the same as that obtained by Leslie and Keeble in the Michotte example - a greater recovery for the group in which the events were presented contiguously. And how does one explain these results? Just as Leslie and Keeble do. While contiguous events certainly do lead to the perception of causality, is this perception confined to the Michotte case? Causality is not bound by content, we suggest, for the sequence "sound-colour change", which involves neither an intentional act nor the transfer of energy from one object to another, is an example of neither psychological nor physical causality. Perhaps the concept of causality at its most fundamental level is no more than a device that records the occurrence of contiguous events - associative learning - and is found in all species. All species may share such a primitive device, and evolution contributed two major additions to it: first, the capacity to act intentionally, which enabled certain species not only to register but also to produce contiguous events; second, the capacity, largely unique to the human, to explain or interpret events that have been both registered and produced.
While the basis of the primitive level has not been resolved by neuroscience (for this level operates on "events", and how the mind/brain binds items so as to construct events remains a challenge for neuroscience; e.g., Singer, 1990), the second level of causality is, fortunately, well represented by work on animal learning. Dickinson and his colleagues in particular (e.g., Dickinson & Shanks, in press) have considered the special subset of concurrent events - act-outcome pairs - brought about by intentional action. Are such pairs marked in some fashion, and thus represented differently in memory from other concurrent pairs? The third level of causality is to be found in recent work on domain-specific theories of naive physics (Spelke, Phillips, & Woodward, in press; Baillargeon, Kotovsky, & Needham, in press), psychology (Leslie, in press; Gelman, Durgin, & Kaufman, in press; Premack & Premack, in press) and, arguably, biology (Carey, in press; Keil, in press). These theories separate the concurrent events (registered by the primitive device) into special categories, and propose an explanatory mechanism for each of them. Explanation, embedded in naive theories about the world, is largely a human specialization.
References

Baillargeon, R., Kotovsky, L., & Needham, A. (in press). The acquisition of physical knowledge in infancy. In A.J. Premack, D. Premack, & D. Sperber (Eds.), Causal cognition: A multidisciplinary debate. Oxford: Clarendon Press.
Carey, S. (in press). On the origin of causal understanding. In A.J. Premack, D. Premack, & D. Sperber (Eds.), Causal cognition: A multidisciplinary debate. Oxford: Clarendon Press.
Dickinson, A., & Shanks, D. (in press). Instrumental action and causal representation. In A.J. Premack, D. Premack, & D. Sperber (Eds.), Causal cognition: A multidisciplinary debate. Oxford: Clarendon Press.
Dolgin, K.G. (1981). A developmental study of cognitive predisposition: A study of the relative salience of form and function in adult and four-year-old subjects. Dissertation, University of Pennsylvania.
Gelman, R., Durgin, F., & Kaufman, L. (in press). Distinguishing between animates and inanimates: Not by motion alone. In A.J. Premack, D. Premack, & D. Sperber (Eds.), Causal cognition: A multidisciplinary debate. Oxford: Clarendon Press.
Gillan, D.J. (1981). Reasoning in the chimpanzee. II. Transitive inference. Journal of Experimental Psychology: Animal Behavior Processes, 7, 150-164.
Gillan, D.J., Premack, D., & Woodruff, G. (1981). Reasoning in the chimpanzee. I. Analogical reasoning. Journal of Experimental Psychology: Animal Behavior Processes, 7, 1-17.
Hume, D. (1952). An enquiry concerning human understanding. In Great books of the western world (Vol. 35). Chicago: Benton.
Keil, F. (in press). The growth of causal understandings of natural kinds. In A.J. Premack, D. Premack, & D. Sperber (Eds.), Causal cognition: A multidisciplinary debate. Oxford: Clarendon Press.
Kummer, H. (in press). On causal knowledge in animals. In A.J. Premack, D. Premack, & D. Sperber (Eds.), Causal cognition: A multidisciplinary debate. Oxford: Clarendon Press.
Leslie, A., & Keeble, S. (1987). Do six-month-old infants perceive causality? Cognition, 25, 265-288.
Michotte, A. (1963). The perception of causality. London: Methuen.
Premack, D. (1976). Intelligence in ape and man. Hillsdale, NJ: Erlbaum.
Premack, D. (1983). The codes of man and beasts. Behavioral and Brain Sciences, 6, 125-167.
Premack, D. (1988). Minds with and without language. In L. Weiskrantz (Ed.), Thought without language. Oxford: Clarendon Press.
Premack, D. (1990). The infant's theory of self-propelled objects. Cognition, 36, 1-16.
Premack, D. (1991). Cause/induced motion: Intention/spontaneous motion. Talk at the Fyssen Foundation Conference on the origins of the human brain.
Premack, D., & Premack, A.J. (in press). Semantics of action. In A.J. Premack, D. Premack, & D. Sperber (Eds.), Causal cognition: A multidisciplinary debate. Oxford: Clarendon Press.
Singer, W. (1990). Search for coherence: A basic principle of cortical self-organization. Concepts in Neuroscience, 1, 1-26.
Spelke, E.S., Phillips, A., & Woodward, A.L. (in press). Infants' knowledge of object motion and human action. In A.J. Premack, D. Premack, & D. Sperber (Eds.), Causal cognition: A multidisciplinary debate. Oxford: Clarendon Press.
13 Uncertainty and the difficulty of thinking through disjunctions Eldar Shafir* Department of Psychology, Princeton University, Princeton, NJ 08544, USA
Abstract This paper considers the relationship between decision under uncertainty and thinking through disjunctions. Decision situations that lead to violations of Savage's sure-thing principle are examined, and a variety of simple reasoning problems that often generate confusion and error are reviewed. The common difficulty is attributed to people's reluctance to think through disjunctions. Instead of hypothetically traveling through the branches of a decision tree, it is suggested, people suspend judgement and remain at the node. This interpretation is applied to instances of decision making, information search, deductive and inductive reasoning, probabilistic judgement, games, puzzles and paradoxes. Some implications of the reluctance to think through disjunctions, as well as potential corrective procedures, are discussed.
Introduction Everyday thinking and decision making often occur in situations of uncertainty. A critical feature of thinking and deciding under uncertainty is the need to consider possible states of the world and their potential consequences for our beliefs and actions. Uncertain situations may be thought of as disjunctions of possible states: either one state will obtain, or another. In order to choose between alternative actions or solutions in situations of uncertainty, a person
* E-mail: eldar@clarity.princeton.edu This research was supported by US Public Health Service Grant No. 1-R29-MH46885 from the National Institute of Mental Health, and by a grant from the Russell Sage Foundation. The paper has benefited from long discussions with Amos Tversky, and from the comments of Philip Johnson-Laird, Daniel Osherson, and Edward Smith.
needs to consider the anticipated outcomes of each action or each solution pattern under each state. Thus, when planning a weekend's outing, a person may want to consider which of a number of activities she would prefer if the weekend is sunny and which she would prefer if it rains. Similarly, when contemplating the next move in a chess game, a player needs to consider what the best move would be if the opponent were to employ one strategy, and what may be the best move if the opponent were to follow an alternative plan. Special situations sometimes arise in which a particular action, or solution, yields a more desirable outcome no matter how the uncertainty is resolved. Thus, a person may prefer to go bowling rather than hiking regardless of whether it is sunny or it rains, and an exchange of queens may be the preferred move whatever the strategy chosen by the opponent. An analogous situation was described by Savage (1954) in the following passage: A businessman contemplates buying a certain piece of property. He considers the outcome of the next presidential election relevant to the attractiveness of the purchase. So, to clarify the matter for himself, he asks whether he would buy if he knew that the Republican candidate were going to win, and decides that he would do so. Similarly, he considers whether he would buy if he knew that the Democratic candidate were going to win, and again finds that he would do so. Seeing that he would buy in either event, he decides that he should buy, even though he does not know which event obtains . . .
Savage calls the principle that governs this decision the sure-thing principle (STP). According to STP, if a person would prefer a to b knowing that X obtained, and if he would also prefer a to b knowing that X did not obtain, then he definitely prefers a to b (Savage, 1954, p. 22). STP has a great deal of both normative and descriptive appeal, and is one of the simplest and least controversial principles of rational behavior. It is an important implication of "consequentialist" accounts of decision making, in that it captures a fundamental intuition about what it means for a decision to be determined by the anticipated consequences.1 It is a cornerstone of expected utility theory, and it holds in other models of choice which impose less stringent criteria of rationality (although see McClennen, 1983, for discussion). Despite its apparent simplicity, however, people's decisions do not always abide by STP. The present paper reviews recent experimental studies of decision under uncertainty that exhibit violations of STP in simple disjunctive situations. It is argued that a necessary condition for such violations is people's failure to see through the underlying disjunctions. In particular, it is suggested that in situations of uncertainty people tend to refrain from fully contemplating the consequences of potential outcomes and, instead, suspend judgement and remain, undecided, at the uncertain node. Studies in other areas, ranging from deduction and probability judgement to games and inductive inference, are then considered, and it is argued that a reluctance to think through disjunctions can be witnessed across these diverse domains. Part of the difficulty in thinking under uncertainty, it is suggested, derives from the fact that uncertainty requires thinking through disjunctive situations. Some implications and corrective procedures are considered in a concluding section.

1 The notion of consequentialism appears in the decision theoretic literature in a number of different senses. See, for example, Hammond (1988), Levi (1991), and Bacharach and Hurley (1991) for technical discussion. See also Shafir and Tversky (1992) for a discussion of nonconsequentialism.
Decisions

Risky choice

Imagine that you have just gambled on a toss of a coin in which you had an equal chance to win $200 or lose $100. Suppose that the coin has been tossed, but that you do not know whether you have won or lost. Would you like to gamble again, on a similar toss? Alternatively, how would you feel about taking the second gamble given that you have just lost $100 on the first (henceforth, the Lost version)? And finally, would you play again given that you have won $200 on the first toss (the Won version)? Tversky and Shafir (1992) presented subjects with the Won, Lost, and uncertain versions of this problem, each roughly a week apart. The problems were embedded among several others so the relation among the three versions would not be transparent, and subjects were instructed to treat each decision separately. The data were as follows: the majority of subjects accepted the second gamble after having won the first gamble, the majority accepted the second gamble after having lost the first gamble, but most subjects rejected the second gamble when the outcome of the first was not known. Among those subjects who accepted the second gamble both after a gain and after a loss on the first, 65% rejected the second gamble in the disjunctive condition, when the outcome of the first gamble was uncertain. In fact, this particular pattern - accept when you win, accept when you lose, but reject when you do not know - was the single most frequent pattern exhibited by our subjects (see Tversky & Shafir, 1992, for further detail and related data). A decision maker who would choose to accept the second gamble both after having won and after having lost the first should - in conformity with STP - choose to accept the second gamble even when the outcome of the first is uncertain. However, when it is not known whether they have won or lost, our subjects refrain from contemplating (and acting in accordance with) the consequences of winning or of losing.
Instead, they act as if they need the uncertainty about the first toss to be resolved. Elsewhere, we have suggested that people have
different reasons for accepting the second gamble following a gain and following a loss, and that a disjunction of different reasons ("I can no longer lose . . ." in case I won the first gamble, or "I need to recover my losses . . ." in case I lost) is often less compelling than either of these definite reasons alone (for further discussion of the role of reasons in choice, see Shafir, Simonson, & Tversky, 1993). Tversky and Shafir (1992) call the above pattern of decisions a disjunction effect. A disjunction effect occurs when a person prefers x over y when she knows that event A obtains, and she also prefers x over y when she knows that event A does not obtain, but she prefers y over x when it is unknown whether or not A obtains. The disjunction effect amounts to a violation of STP, and hence of consequentialism. While a reliance on reasons seems to play a significant role in the psychology that yields disjunction effects, there is nonetheless another important element that contributes to these paradoxical results: people do not see through the otherwise compelling logic that characterizes these situations. When confronting such disjunctive scenarios, which can be thought of as decision trees, people seem to remain at the uncertain nodes, rather than contemplate the - sometimes incontrovertible - consequences of the possible branches. The above pattern of nonconsequential reasoning may be illustrated with the aid of the value function from Kahneman and Tversky's (1979) prospect theory. The function, shown in Fig. 1, represents people's subjective value of losses and of gains, and captures common features of preference observed in numerous empirical studies. Its S-shape combines a concave segment to the right of the origin reflecting risk aversion in choices between gains, and a convex segment to the left of the origin reflecting risk seeking in choices between losses.
Furthermore, the slope of the function is steeper on the left of the origin than on the right, reflecting the common observation that "losses loom larger than gains" for most people. (For more on prospect theory, see Kahneman & Tversky, 1979, 1982, as well as Tversky & Kahneman, 1992, for recent extensions.) The function in Fig. 1 represents a typical decision maker who is indifferent between a 50% chance of winning $100 and a sure gain of roughly $35, and, similarly, is indifferent between a 50% chance of losing $100 and a sure loss of roughly $40. Such a pattern of preferences can be captured by a power function with an exponent of .65 for gains and .75 for losses. While prospect theory also incorporates a decision weight function, π, which maps stated probabilities into their subjective value for the decision maker, we will assume, for simplicity, that decision weights coincide with stated probabilities. While there is ample evidence to the contrary, this does not change the present analysis. Consider, then, a person P whose values for gains and losses are captured by the function of Fig. 1. Suppose that P is presented with the gamble problem above and is told that he has won the first toss. He now needs to decide whether to accept or reject the second. P needs to decide, in other words, whether to
Fig. 1. The value function v(x) = x^.65 for x ≥ 0 and v(x) = -(-x)^.75 for x ≤ 0.
maintain a sure gain of $200 or, instead, opt for an equal chance at either a $100 or a $400 gain. Given P's value function, his choice is between two options whose expected values are as follows:

Accept the second gamble: .50 × 400^.65 + .50 × 100^.65
Reject the second gamble: 1.0 × 200^.65
Because the value of the first option is greater than that of the second, P is predicted to accept the second gamble. Similarly, when P is told that he has lost the first gamble and needs to decide whether to accept or reject the second, P faces the following options:

Accept the second gamble: .50 × -(200^.75) + .50 × 100^.65
Reject the second gamble: 1.0 × -(100^.75)
Again, since the first quantity is larger than the second, P is predicted to accept the second gamble. Thus, once the outcome of the first gamble is known, the value function of Fig. 1 predicts that person P will accept the second gamble whether he has won or lost the first. But what is P expected to do when the outcome of the first gamble is not known? Because he does not know the outcome of the first gamble, P may momentarily assume that he is still where he began - that, for the moment, no changes have transpired. Not knowing whether he has won or lost, P remains for now at the status quo, at the origin of his value function. When presented with the decision to accept or reject the second gamble, P evaluates it from his original position, without incorporating the outcome of the first gamble, which remains unknown. Thus, P needs to choose between accepting or rejecting a gamble that offers an equal chance to win $200 or lose $100:

Accept the second gamble: .50 × -(100^.75) + .50 × 200^.65
Reject the second gamble: 0

Because the expected value of accepting is just below 0, P decides to reject the second gamble in this case. Thus, aided by prospect theory's value function, we see how a decision maker's "suspension of judgement" - his tendency to assume himself at the origin, or status quo, when it is not known whether he has won or lost - leads him to reject an option that he would accept no matter what his actual position may be. Situated at a chance node whose outcome is not known, P's reluctance to consider each of the hypothetical branches leads him to behave in a fashion that conflicts with his preferred behavior given either branch. People in these situations seem to confound their epistemic uncertainty - what they may or may not know - with uncertainty about the actual consequences - what may or may not have occurred. A greater focus on the consequences would have helped our subjects realize the implications for their preference of either of the outcomes.
Instead, not knowing which was the actual outcome, our subjects chose to evaluate the situation as if neither outcome had obtained. It is this reluctance to think through disjunctions that characterizes many of the phenomena considered below.
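The three comparisons above are easy to verify numerically. The following sketch simply encodes the value function of Fig. 1 and evaluates the second gamble from each of the three reference points discussed in the text (the function names are mine):

```python
def v(x):
    """Prospect-theory value function of Fig. 1:
    v(x) = x**.65 for gains, v(x) = -(-x)**.75 for losses."""
    return x ** 0.65 if x >= 0 else -((-x) ** 0.75)

def second_gamble(ref):
    """Value of accepting vs. rejecting the second toss (win $200,
    lose $100, equal chances), evaluated from reference point `ref`."""
    accept = 0.50 * v(ref + 200) + 0.50 * v(ref - 100)
    reject = 1.00 * v(ref)
    return accept, reject

# After a win (ref = +$200): accepting beats rejecting.
# After a loss (ref = -$100): accepting again beats rejecting.
# Outcome unknown (ref = $0, the status quo): accepting falls just below 0,
# so P rejects - the nonconsequential pattern described in the text.
for ref in (200, -100, 0):
    print(ref, second_gamble(ref))
```

Running the sketch reproduces the pattern: accept after a win, accept after a loss, reject at the origin.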
Search for noninstrumental information: the Hawaiian vacation

Imagine that you have just taken a tough qualifying exam. It is the end of the semester, you feel tired and run-down, and you are not sure that you passed the exam. In case you failed you have to take it again in a couple of months - after
the Christmas holidays. You now have an opportunity to buy a very attractive 5-day Christmas vacation package to Hawaii at an exceptionally low price. The special offer expires tomorrow, while the exam grade will not be available until the following day. Do you buy the vacation package? This question was presented by Tversky and Shafir (1992) to Stanford University undergraduate students. Notice that the outcome of the exam will be known long before the vacation begins. Thus, the uncertainty characterizes the present, disjunctive situation, not the eventual vacation. Additional, related versions were presented in which subjects were to assume that they had passed the exam, or that they had failed, before they had to decide about the vacation. We discovered that many subjects who would have bought the vacation to Hawaii if they were to pass the exam and if they were to fail, chose not to buy the vacation when the exam's outcome was not known. The data show that more than half of the students chose the vacation package when they knew that they passed the exam and an even larger percentage chose the vacation when they knew that they failed. However, when they did not know whether they had passed or failed, less than one-third of the students chose the vacation and the majority (61%) were willing to pay $5 to postpone the decision until the following day, when the results of the exam would be known.2 Note the similarity of this pattern to the foregoing gamble scenario: situated at a node whose outcome is uncertain, our students envision themselves at the status quo, as if no exam had been taken. This "suspension of judgement" - the reluctance to consider the possible branches (having either passed or failed the exam) - leads our subjects to behave in a manner that conflicts with their preferred option given either branch. 
The pattern observed in the context of this decision is partly attributed by Tversky and Shafir (1992) to the reasons that subjects summon for buying the vacation (see also Shafir, Simonson, & Tversky, 1993, for further discussion). Once the outcome of the exam is known, the student has good - albeit different - reasons for going to Hawaii: having passed the exam, the vacation can be seen as a reward following a successful semester; having failed the exam, the vacation becomes a consolation and time to recuperate before a re-examination. Not knowing the outcome of the exam, however, the student lacks a definite reason for going to Hawaii. The indeterminacy of reasons discourages many students from buying the vacation, even when both outcomes - passing or failing the exam - ultimately favor this course of action. Evidently, a disjunction of different
2 Another group of subjects were presented with both Fail and Pass versions, and asked whether they would buy the vacation package in each case. Two-thirds of the subjects made the same choice in the two conditions, indicating that the data for the disjunctive version cannot be explained by the hypothesis that those who buy the vacation in case they pass the exam do not buy it in case they fail, and vice versa. While only one-third of the subjects made different decisions depending on the outcome of the exam, more than 60% of the subjects chose to wait when the outcome was not known.
reasons (reward in case of success; consolation in case of failure) can be less compelling than either definite reason alone. A significant proportion of subjects were willing to pay, in effect, for information that was ultimately not going to alter their decision - they would choose to go to Hawaii in either case. Such willingness to pay for noninstrumental information is at variance with the classical model, in which the worth of information is determined by its potential to influence choice. People's reluctance to think through disjunctive situations, on the other hand, entails that noninstrumental information will sometimes be sought. (See Bastardi & Shafir, 1994, for additional studies of the search for noninstrumental information and its effects on choice.) While vaguely aware of the possible outcomes, people seem reluctant to fully entertain the consequences as long as the actual outcome is uncertain. When seemingly relevant information may become available, they often prefer to have the uncertainty resolved, rather than consider the consequences of each branch of the tree under the veil of uncertainty. A greater tendency to consider the potential consequences may sometimes help unveil the noninstrumental nature of missing information. In fact, when subjects were first asked to contemplate what they would do in case they failed the exam and in case they passed, almost no subject who had expressed the same preference for both outcomes then chose to wait to find out which outcome obtained (Tversky & Shafir, 1992). The decision of many subjects in the disjunctive scenario above was not guided by a simple evaluation of the consequences (for, then, they would have realized that they prefer to go to Hawaii in either case). An adequate account of this behavior needs to contend with the fact that the very simple and compelling disjunctive logic of STP does not play a decisive role in subjects' reasoning. 
A behavioral pattern which systematically violates a simple normative rule requires both a positive as well as a negative account (see Kahneman and Tversky, 1982, for discussion). We need to understand not only the factors that produce a particular response, but also why the correct response is not made. Work on the conjunction fallacy (Shafir, Smith, & Osherson, 1990; Tversky and Kahneman, 1983), for example, has addressed both the fact that people's probability judgement relies on the representativeness heuristic - a positive account - as well as the fact that people do not perceive the extensional logic of the conjunction rule as decisive - a negative account. The present work focuses on the negative facet of nonconsequential reasoning and STP violations. It argues that like other principles of reasoning and decision making, STP is very compelling when stated in a general and abstract form, but is often non-transparent, particularly because it applies to disjunctive situations. The following section briefly reviews studies of nonconsequential decision making in the context of games, and ensuing sections extend the analysis to other domains.
Games

Prisoner's dilemma

The theory of games explores the interaction between players acting according to specific rules. One kind of two-person game that has received much attention is the Prisoner's dilemma, or PD. (For an extensive treatment, see Rapoport & Chammah, 1965). A typical PD is presented in Fig. 2. The cell entries indicate the number of points each player receives contingent on the two players' choices. Thus, if both cooperate each receives 75 points but if, for example, the other cooperates and you compete, you receive 85 points while the other receives 25. What characterizes the PD is that no matter what the other does, each player fares better if he competes than if he cooperates; yet, if they both compete they do significantly less well than if they had both cooperated. Since each player is encountered at most once, there is no opportunity for conveying strategic messages, inducing reciprocity, or otherwise influencing the other player's choice of strategy. A player in a PD faces a disjunctive situation. The other chooses one of two strategies, either to compete or to cooperate. Not knowing the other's choice, the first player must decide on his own strategy. Whereas each player does better competing, their mutually preferred outcome results from mutual cooperation rather than competition. A player, therefore, experiences conflicting motivations. Regardless of what the other does, he is better off being selfish and competing; but assuming that the other acts very much like himself, they are better off both making the ethical decision to cooperate rather than the selfish choice to compete. How might this disjunctive situation influence people's choice of strategy?
                    OTHER cooperates       OTHER competes
YOU cooperate       You: 75, Other: 75     You: 25, Other: 85
YOU compete         You: 85, Other: 25     You: 30, Other: 30

Fig. 2. A typical prisoner's dilemma. The cell entries indicate the number of points that you and the other player receive contingent on your choices.
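The payoff structure of Fig. 2 can be checked mechanically: competing dominates cooperating for each player taken alone, yet mutual competition leaves both players worse off than mutual cooperation. A minimal sketch (the dictionary encoding is mine):

```python
# Payoffs from Fig. 2: (your points, other's points),
# indexed by (your move, other's move).
payoff = {
    ("cooperate", "cooperate"): (75, 75),
    ("cooperate", "compete"):   (25, 85),
    ("compete",   "cooperate"): (85, 25),
    ("compete",   "compete"):   (30, 30),
}

# Whatever the other does, you score strictly more by competing...
for other in ("cooperate", "compete"):
    assert payoff[("compete", other)][0] > payoff[("cooperate", other)][0]

# ...yet mutual competition (30, 30) is worse for both than
# mutual cooperation (75, 75) - the dilemma.
assert payoff[("compete", "compete")] < payoff[("cooperate", "cooperate")]
```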
Shafir and Tversky (1992) have documented disjunction effects in one-shot PD games played for real payoffs. Subjects (N = 80) played a series of PD games (as in Fig. 2) on a computer, each against a different unknown opponent supposedly selected at random from among the participants. Subjects were told that they had been randomly assigned to a "bonus group", and that occasionally they would be given information about the other player's already-chosen strategy before they had to choose their own. This information appeared on the screen next to the game, and subjects were free to take it into account in making their decision. (For details and the full instructions given to subjects, see Shafir & Tversky, 1992.) The rate of cooperation in this setting was 3% when subjects knew that the opponent had defected, and 16% when they knew that the opponent had cooperated. Now what should subjects do when the opponent's decision is not known? Since 3% cooperate when the other competes and 16% cooperate when the other cooperates, one would expect an intermediate rate of cooperation when the other's strategy is not known. Instead, when subjects did not know whether their opponent had cooperated or defected (as is normally the case in this game), the rate of cooperation rose to 37%. In violation of STP, a quarter of the subjects defected when they knew their opponent's choice - be it cooperation or defection - but cooperated when their opponent's choice was not known. Note the recurring pattern: situated at a disjunctive node whose outcome is uncertain, these subjects envision themselves at the status quo, as if, for the moment, the uncertain strategy selected by the opponent has no clear consequences.
These players seem to confound their epistemic uncertainty - what they may or may not know about the other's choice of strategy - with uncertainty about the actual consequences - the fact that the other is bound to be a cooperator or a defector, and that they, in turn, are bound to respond by defecting in either case. (For further analysis and a positive account of what may be driving subjects' tendency to cooperate under uncertainty, see Shafir & Tversky, 1992.)
Newcomb's problem and quasi-magical thinking

Upon completing the PD game described in the previous section, subjects (N = 40) were presented, on a computer screen, with the following scenario based on the celebrated Newcomb's problem (for more on Newcomb's problem, see Nozick, 1969; see Shafir & Tversky, 1992, for further detail and discussion of the experiment).

You now have one more chance to collect additional points. A program developed recently at MIT was applied during this entire session to analyze the pattern of your preferences. Based on that analysis, the program has predicted your preference in this final problem.
[Figure: two boxes - Box A, labeled "20 points", and Box B, labeled "?".]
Consider the two boxes above. Box A contains 20 points for sure. Box B may or may not contain 250 points. Your options are to:

(1) Choose both boxes (and collect the points that are in both).
(2) Choose Box B only (and collect only the points that are in Box B).

If the program predicted, based on observation of your previous preferences, that you will take both boxes, then it left Box B empty. On the other hand, if it predicted that you will take only Box B, then it put 250 points in that box. (So far, the program has been remarkably successful: 92% of the participants who chose only Box B found 250 points in it, as opposed to 17% of those who chose both boxes.) To insure that the program does not alter its guess after you have indicated your preference, please indicate to the person in charge whether you prefer both boxes or Box B only. After you indicate your preference, press any key to discover the allocation of points.
According to one rationale that arises in the context of this decision, if the person chooses both boxes, then the program, which is remarkably good at predicting preferences, is likely to have predicted this and will not have put the 250 points in the opaque box. Thus, the person will get only 20 points. If, on the other hand, the person takes only the opaque box, the program is likely to have predicted this and will have put the 250 points in that box, and so the person will get 250 points. A subject may thus be tempted to reason that if he takes both boxes he is likely to get only 20 points, but that if he takes just the opaque box he is likely to get 250 points. There is a compelling motivation to choose just the opaque box, and thereby resemble those who typically find 250 points in it. There is, of course, another rationale: the program has already made its prediction and has already either put the 250 points in the opaque box or has not. If it has already put the 250 points in the opaque box, and the person takes both boxes he gets 250 + 20 points, whereas if he takes only the opaque box, he gets only the 250 points. If the program has not put the 250 points in the opaque box and the person takes both boxes he gets 20 points, whereas if he takes only the opaque box he gets nothing. Therefore, whether the 250 points are there or not, the person gets 20 points more by taking both boxes rather than the opaque box only. The second rationale relies on consequentialist reasoning reminiscent of STP (namely, whatever the state of the boxes following the program's prediction, I will do better choosing both boxes rather than one only). The first rationale, on the other hand, while couched in terms of expected value, is partially based on the assumption that what the program will have predicted - although it has predicted this already - depends somehow on what the subject ultimately decides to do.
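Both rationales can be made concrete. In the sketch below, the second (dominance) rationale is computed directly from the payoffs, while the first rationale is rendered, as one plausible gloss, by treating the program's stated hit rates (92% and 17%) as conditional probabilities of finding Box B full:

```python
BOX_A = 20    # points in the transparent box, for sure
BOX_B = 250   # points in the opaque box, if the program filled it

# Dominance (the second rationale): the prediction is already fixed,
# so Box B is either full or empty; either way, taking both boxes
# yields exactly 20 points more than taking Box B only.
for b_content in (BOX_B, 0):
    assert (b_content + BOX_A) - b_content == BOX_A

# The first rationale, read as expected value with the stated hit
# rates as conditional probabilities (my interpretive gloss):
ev_one_box = 0.92 * BOX_B            # about 230 points
ev_two_boxes = 0.17 * BOX_B + BOX_A  # about 62.5 points
assert ev_one_box > ev_two_boxes
```

The tension is plain: the expected-value reading favors one box, while the state-by-state reading shows two boxes dominating regardless of the program's prediction.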
The results we obtained were as follows: 35% of the subjects chose both boxes, while 65% preferred to take Box B only. This proportion of choices is similar to that observed in other surveys concerning the original Newcomb's problem (see,
for example, Gardner, 1973, 1974; Hofstadter, 1983). What can be said about the majority who prefer to take just one box? Clearly, had they known for certain that there were 250 points in the opaque box (and could see 20 in the other), they would have taken both rather than just one. And certainly, if they knew that the 250 points were not in that box, they would have taken both rather than just the one that's empty. These subjects, in other words, would have taken both boxes had they known whether Box B was full or empty, but a majority preferred to take only Box B when its contents were not known. The conflicting intuitions that subjects experience in the disjunctive situation - when the program's prediction is not known - are obviously resolved in favor of both boxes once the program's decision has been announced: at that point, no matter what the program has predicted, taking both boxes brings more points. Subjects, therefore, should choose both boxes also when the program's decision is uncertain. Instead, many subjects fail to be moved by the foreseeable consequences of the program's predictions, and succumb to the strong motivation to choose just the opaque box and thereby resemble those who typically find 250 points in it.3 As Gibbard and Harper (1978) suggest in an attempt to explain people's choice of a single box, "a person may . . . want to bring about an indication of a desired state of the world, even if it is known that the act that brings about the indication in no way brings about the desired state itself." This form of magical thinking was demonstrated by Quattrone and Tversky (1984), whose subjects selected actions that were diagnostic of favorable outcomes even though the actions could not cause those outcomes. Note that such instances of magical thinking typically occur in disjunctive situations, before the exact outcome is known.
Once they are aware of the outcome, few people think they can reverse it by choosing an action that is diagnostic of an alternative event. Shafir and Tversky (1992) discuss various manifestations of "quasi-magical" thinking, related to phenomena of self-deception and illusory control. These include people's tendency to place larger bets before rather than after a coin has been tossed (Rothbart & Snyder, 1970; Strickland, Lewicki, & Katz, 1966), or to throw dice softly for low numbers and harder for high ones (Henslin, 1967). Similarly, Quattrone and Tversky (1984) note that Calvinists act as if their behavior will determine whether they will go to heaven or to hell, despite their belief in divine pre-determination, which entails that their fate has been determined at birth. The presence of uncertainty, it appears, is a major contributor to quasi-magical thinking; few people act as if they can undo an already certain event, but while facing a disjunction of events, people often behave as if they can exert some control over the outcome. Thus, many people who are eager to vote while the outcome is pending may no longer wish to do so once the outcome of the elections has been determined. In this vein, it is possible that Calvinists would do fewer good deeds if they knew that they had already been assigned to heaven, or to hell, than while their fate remains a mystery. Along similar lines, Jahoda (1969) discusses the close relationship between uncertainty and superstitious behavior, which is typically exhibited in the context of uncertain outcomes rather than in an attempt to alter events whose outcome is already known. As illustrated by the studies above, people are often reluctant to consider the possible outcomes of disjunctive situations, and instead suspend judgement and envision themselves at the uncertain node. Interestingly, it appears that decision under uncertainty is only one of numerous domains in which subjects exhibit a reluctance to think through disjunctive situations. The difficulties inherent in thinking through uncertainty and, in particular, people's reluctance to think through disjunctions manifest themselves in other reasoning and problem-solving domains, some of which are considered below.

3 The fact that subjects do not see through this disjunctive scenario seems indisputable. It is less clear, however, what conditions would serve to make the situation more transparent, and to what extent. Imagine, for example, that subjects were given a sealed copy of the program's decision to take home with them, and asked to inspect it that evening, after having made their choice. It seems likely that an emphasis on the fact that the program's decision has long been made would reduce the tendency to choose a single box.
Probabilistic judgement

Researchers into human intuitive judgement, as well as teachers of statistics, have commented on people's difficulties in judging the probabilities of disjunctive events (see, for example, Bar-Hillel, 1973; Carlson & Yates, 1989; Tversky & Kahneman, 1974). While some disjunctive predictions may in fact be quite complicated, others are simple, provided that one sees through their disjunctive character. Consider, for example, the following "guessing game", which consisted of two black boxes presented to Princeton undergraduates (N = 40) on a computer screen, along with the following instructions:

Under the black cover, each of the boxes above is equally likely to be either white, blue, or purple. You are now offered to play one of the following two games of chance:

Game 1: You guess the color of the left-hand box. You win 50 points if you were right, and nothing if you were wrong.

Game 2: You choose to uncover both boxes. You win 50 points if they are the same color, and nothing if they are different colors.
The chances of winning in Game 1 are 1/3; the chances of winning in Game 2 are also 1/3. To see that, one need only realize that the first box is bound to be either white, blue, or purple and that, in each case, the chances that the other will be the same color are 1/3. Notice that this reasoning incorporates the disjunctive
logic of STP. One enumerates the possible outcomes of the first box, considers the chances of winning conditional on each outcome, and realizes that the chances are the same no matter what the first outcome was. Subjects, therefore, are expected to find the two games roughly equally attractive, provided that they see through the disjunctive nature of Game 2. This disjunctive rationale, however, seems not to have been entirely transparent to our subjects, 70% of whom indicated a preference for Game 1 (significantly different from chance, Z = 2.53, p < .05). These subjects may have suspected an equal chance for both games, but a certain lack of clarity about the disjunctive case may have led them to prefer the unambiguous first game. Whereas this preference could also be attributed to subjects' beliefs about the computer set-up, the next version not only ensured the perceived independence of outcomes, but also emphasized the game's sequential character which, it was thought, might make its disjunctive nature more transparent. One hundred and three Stanford undergraduates listed their highest buying prices for the gambles below:

The following games of chance are played with a regular die that has two yellow sides, two green sides, and two red sides:

Game A: You roll the die once. You win $40 if it falls on green, and nothing otherwise. What is the largest amount of money that you would be willing to pay to participate in this game?

Game B: You roll the die twice. You win $40 if it falls on the same color both times (e.g., both red, or both green) and nothing otherwise. What is the largest amount of money that you would be willing to pay to participate in this game?
The probability of winning in Game A is 1/3. The probability of winning in Game B is also 1/3. To see this, one need only realize that for every outcome of the first roll, the probability of winning on the second roll is always 1/3. Forty-six percent of our subjects, however, did not list the same buying price for the two games. Eighty-five percent of these subjects offered a higher price for Game A than for Game B. In fact, over all subjects, Game A was valued at an average of $6.09, while Game B was worth an average of only $4.69 (t = 4.65, p < .001). We have investigated numerous scenarios of this kind, in all of which a large proportion of subjects prefer to gamble on a simple event over an equally likely or more likely disjunctive event.
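The equivalence of the simple and disjunctive games can be verified by enumeration; the following sketch (ours, not the original authors') computes both probabilities for the boxes game:

```python
from fractions import Fraction
from itertools import product

colors = ("white", "blue", "purple")  # equally likely, as in the boxes game

# Game 1 / Game A: guess a single outcome among three equally likely colors.
p_simple = Fraction(1, 3)

# Game 2 / Game B: two independent, uniform draws match in color.
pairs = list(product(colors, repeat=2))
p_match = Fraction(sum(a == b for a, b in pairs), len(pairs))

print(p_simple, p_match)  # both 1/3
```

The same enumeration covers the die version: conditioning on each first-roll color, the second roll matches with probability 1/3.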
Inductive inference

Inferential situations often involve uncertainty not only about the conclusion, but about the premises as well. The judged guilt of a defendant depends on the veracity of the witnesses; the diagnosis of a patient depends on the reliability of the tests; and the truth of a scientific hypothesis depends on the precision of
earlier observations. Reasoning from uncertain premises can be thought of as reasoning through disjunctions. How likely is the defendant to be guilty if the witness is telling the truth, and how likely if the witness is lying? What is the likelihood that the patient has the disease given that the test results are right, and what is the likelihood if the results are wrong? The aggregation of uncertainty is the topic of various theoretical proposals (see, for example, Shafer & Pearl, 1990), all of which agree on a general principle implied by the probability calculus. According to this principle, if I believe that event A is more probable than event B in light of some condition c, and if I also believe that event A is more probable than event B given the absence of c, then I believe that A is more probable than B regardless of whether c obtains. Similar to the failure of STP in the context of choice, however, this principle may not always describe people's actual judgements. In the following pilot study, 182 Stanford undergraduates were presented with the divorce scenario below, along with one of the three questions that follow:

Divorce problem
Tim and Julia, both school teachers, have been married for 12 years. They have a 10-year-old son, Eric, to whom they are very attached. During the last few years Tim and Julia have had recurring marital problems. They have consulted marriage counselors and have separated once for a couple of months, but decided to try again. Their marriage is presently at a new low.

Disjunctive question (N = 88): What do you think are the chances that both Tim and Julia will agree to a divorce settlement (that specifies whether Eric is to stay with his father or with his mother)? [59.8%]

Mother question (N = 46): What do you think are the chances that both Tim and Julia will agree to a divorce settlement if Eric is to stay with his mother?
[49.8%]

Father question (N = 48): What do you think are the chances that both Tim and Julia will agree to a divorce settlement if Eric is to stay with his father? [40.7%]
Next to each question is its mean probability rating. Subjects judged the probability that the parents will agree to a divorce settlement that specified that the child is to stay with his father to be less than 50%, and similarly if it specified that the child is to stay with his mother. However, they thought that there was a higher - almost 60% - chance that the parents would agree to a divorce settlement in the disjunctive case, when there was uncertainty about whether the settlement would specify that the child is to stay with the father or with the mother (z = 4.54 and 2.52 for the father and for the mother, respectively; p < .05 in both cases). Because the above effect is small, and there are potential ambiguities in the interpretation of the problem, more exploration of this kind of judgement is
required. It does appear, however, that people's reasoning through this disjunctive situation may be nonconsequential. In effect, the pattern above may capture a "disjunction effect" in judgement similar to that previously observed in choice. Either disjunct - each branch of the tree - leads one to assign a probability of less than one-half, but when facing the disjunction people estimate a probability greater than one-half. Disjunction effects in judgement are likely to arise in contexts similar to those which characterize these effects in choice. While either disjunct presents a clear scenario with compelling reasons for increasing or decreasing one's probability estimate, a disjunctive situation can be less compelling. Thus, people tend to suspend judgement in disjunctive situations, even if every disjunct would eventually affect their perceived likelihood in similar ways. As in choice, instead of contemplating the consequences of traversing each of the branches, people tend to remain nonconsequential at the uncertain node. In the above divorce scenario, it appears, people see a clear reason for lowering the probability estimate of a settlement once they know that the child is to stay with his father, namely, that the mother is likely to object. Similarly, if the child is to stay with his mother, the father will object. But what about when the fate of the child is not known? Rather than consider the potential objections of each parent, subjects evaluate the situation from a disjunctive perspective, wherein neither parent's objections come to mind. From this perspective the couple seems ready for divorce. Of course, people do not always refrain from considering the potential implications of disjunctive inferences. The pilot data above illustrate one kind of situation that may yield such effects due to the way uncertainty renders certain considerations less compelling.
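The incoherence of these ratings can be checked directly against the law of total probability; in this sketch (ours) the three numbers are the mean ratings reported above, written as proportions:

```python
# Law of total probability: P(settle) = w * P(settle | mother) + (1 - w) * P(settle | father)
# for some weight w in [0, 1], so P(settle) can never exceed the larger conditional.
p_mother, p_father, p_disjunctive = 0.498, 0.407, 0.598

for i in range(101):                       # sweep w over [0, 1]
    w = i / 100
    p_total = w * p_mother + (1 - w) * p_father
    assert min(p_father, p_mother) <= p_total <= max(p_father, p_mother)

# The mean disjunctive judgement lies outside the admissible interval:
assert p_disjunctive > max(p_mother, p_father)
print("disjunctive rating", p_disjunctive, "exceeds both conditional ratings")
```

No weighting of the two conditional judgements can reach 59.8%, which is what makes the pattern a candidate disjunction effect rather than mere imprecision.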
More generally, such patterns can emerge from a tendency towards "concrete thinking" (Slovic, 1972), wherein people rely heavily on information that is explicitly available, at the expense of other information which remains implicit. Numerous studies have shown that people often do not decompose categories into their relevant subcategories. For example, having been told that "robins have an ulnar artery", subjects rate it more likely that all birds have an ulnar artery than that ostriches have it (Osherson, Smith, Wilkie, Lopez, & Shafir, 1990; see also Shafir, Smith, & Osherson, 1990). A precondition for such judgement is the failure to take account of the fact that the category birds consists of subcategories, like robins, sparrows, and ostriches. Along similar lines, most subjects estimate the frequency, on a typical page, of seven-letter words that end in ing (_ _ _ _ i n g) to be greater than the frequency of seven-letter words that have the letter n in the sixth position (_ _ _ _ _ n _) (Tversky & Kahneman, 1983). When making these estimates, subjects focus on the particular category under consideration: because instances of the former category are more easily available than instances of the latter, subjects erroneously conclude that they must be more frequent. Evidently, subjects do not decompose the latter category into its constituent subcategories (i.e., seven-letter words that end in ing, seven-letter words that end in ent, seven-letter words that end in ine, etc.).
Various manifestations of the tendency for considerations that are out of sight to be out of mind have been documented by Fischhoff, Slovic, and Lichtenstein (1978) who, for example, asked car mechanics to assess the probabilities of different causes of a car's failure to start. The mean probability assigned to the hypothesis "the cause of failure is something other than the battery, the fuel system, or the engine" doubled when the unspecified disjunctive alternative was broken up into some of its specific disjuncts (e.g., the starting system, the ignition system, etc.). Along similar lines, Johnson, Hershey, Meszaros, and Kunreuther (1993) found that subjects were willing to pay more when offered health insurance that covers hospitalization for "any disease or accident" than when offered health insurance that covers hospitalization "for any reason". Evidently, subjects do not perceive the latter, implicit disjunction as encompassing the various disjuncts explicitly mentioned in the former. For an extensive treatment of the relationship between explicit and implicit disjunctions in probability judgement, see Tversky and Koehler (1993). Inferential disjunction effects may also occur in situations in which different rationales apply to the various disjuncts. Under uncertainty, people may be reluctant to contemplate the consequences, even if these would eventually affect judgement in similar ways. Shafir and Tversky (1992) have suggested that the financial markets' behavior during the 1988 US Presidential election had all the makings of a disjunction effect. In the weeks preceding the election, US financial markets remained relatively inactive and stable, "because of caution before the Presidential election" (The New York Times, November 5, 1988). "Investors were reluctant to make major moves early in a week full of economic uncertainty and seven days away from the Presidential election" (The Wall Street Journal, November 2, 1988).
Immediately following the election, a clear outlook emerged. The dollar plunged sharply to its lowest level in 10 months, stock and bond prices declined, and the Dow Jones industrial average fell a total of almost 78 points over the ensuing week. The dollar's decline, explained the analysts, "reflected continued worry about the US trade and budget deficits"; "economic reality has set back in" (WSJ, November 10). The financial markets, observed the NYT, "had generally favored the election of Mr. Bush and had expected his victory, but in the three days since the election they have registered their concern about where he goes from here". Of course, the financial markets were likely to have registered at least as much concern had Mr. Dukakis been elected. Most traders agree, wrote the WSJ, that the stock market would have dropped significantly had Dukakis staged a come-from-behind victory. "When I walked in and looked at the screen", explained one trader after the election, "I thought Dukakis had won" (NYT, November 10). After days of inactivity preceding the election, the market declined immediately following Bush's victory, and would have declined at least as much had Dukakis been the victor (and these reactions would be unlikely to come from disjoint sets of actors). Of course, a thorough analysis of the financial markets' behavior reveals
numerous complications but, at least on the surface, this incident has all the makings of a disjunction effect: the markets would decline if Bush was elected, they would decline if Dukakis was elected, but they resisted any change until after the elections. Being at the node of such a momentous disjunction seems to have stopped Wall Street from addressing the expected consequences. "Considering how Wall Street had rooted for Bush's election", said the NYT (November 11), "its reaction to his victory was hardly celebratory. Stocks fell, bonds fell and the dollar dropped. It makes one think of the woman in the New Yorker cartoon discussing a friend's failing marriage: 'She got what she wanted, but it wasn't what she expected'." Indeed, it is in the nature of nonconsequential thinking to encounter events that were bound to be, but were not expected.
Deductive inference

The Wason selection task

One of the most investigated tasks in research into human reasoning has been the selection task, first described by Wason (1966). In a typical version of the task, subjects are presented with four cards, each of which has a letter on one side and a number on the other. Only one side of each card is displayed. For example:
E     K     4     7
Subjects' task is to indicate those cards, and only those cards, that must be turned over to test the following rule: "If there is a vowel on one side of the card, then there is an even number on the other side of the card." The simple structure of the task is deceptive - the great majority of subjects fail to solve it. Most select the E card or the E and the 4 cards, whereas the correct choices are the E and the 7 cards. (The success rate of initial choices in dozens of studies employing the basic form of the selection task typically ranges between 0 and a little over 20%; see Evans, 1989, and Gilhooly, 1988, for reviews.) The difficulty of the Wason selection task is perplexing. Numerous variations of the task have been documented, and the findings generally agree that people have no trouble evaluating the relevance of the items that are hidden on the other side of each card. Wason and Johnson-Laird (1970; see also Wason, 1969) explicitly address the discrepancy between subjects' ability to evaluate the relevance of potential outcomes (i.e., to understand the truth conditions of the rule) and their inappropriate selection of the relevant cards. (Oakhill & Johnson-Laird, 1985, report related findings regarding subjects' selection of counterexamples when testing generalizations.) While the problem is logically quite simple, they conclude, "clearly it is the attempt to solve it which makes it difficult" (Wason & Johnson-Laird, 1972, p. 174). Thus, subjects understand that neither a vowel nor a consonant on the other
side of the 4 card contributes to the possible falsification of the rule, yet they choose to turn the 4 card when its other side is not known. Similarly, subjects understand that a consonant on the other side of the 7 card would not falsify the rule but that a vowel would falsify it, nevertheless they neglect to turn the 7 card in the disjunctive situation. As Evans (1984, p. 458) has noted, "this strongly confirms the view that card selections are not based upon any analysis of the consequences of turning the cards". Subjects are easily able to evaluate the logical consequences of potential outcomes in isolation, but they seem to act in ways that ignore these consequences when facing a disjunction. What exactly subjects do when performing the selection task remains outside the purview of the present paper, especially considering the numerous studies that have addressed this question. In general, a pattern of content effects has been observed in a number of variations on the task (see, for example, Griggs & Cox, 1982; Johnson-Laird, Legrenzi, & Legrenzi, 1972; and Wason, 1983, for a review; although see also Manktelow & Evans, 1979). It is likely that such content effects facilitate performance on the selection task by rendering it more natural for subjects to contemplate the possible outcomes, which tend to describe familiar situations. To explain the various effects, researchers have suggested verification biases (Johnson-Laird & Wason, 1970), matching biases (Evans, 1984; Evans & Lynch, 1973), memories of domain-specific experiences (Griggs & Cox, 1982; Manktelow & Evans, 1979), pragmatic reasoning schemas (Cheng & Holyoak, 1985, 1989), selective focusing (Legrenzi, Girotto, & Johnson-Laird, 1993), as well as an innate propensity to look out for cheaters (Cosmides, 1989). What these explanations have in common is an account of performance on the selection task that does not involve disjunctive reasoning per se. 
Instead, people are assumed to focus on items that have been explicitly mentioned, to apply prestored knowledge structures, or to remember relevant past experiences. While most people find it trivially easy to reason logically about each isolated disjunct, the disjunction leads them to withhold such reasoning, at least when the content is not familiar. Subjects confronted with the above four-card problem fail to consider the logical consequences of turning each card, and instead remain, judgement suspended, at the disjunctive node, the cards' hidden sides not having been adequately evaluated.
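The normative analysis of the selection task is itself a small disjunctive computation. The following sketch (ours; the consonant card is rendered here as K) enumerates, for each card, whether any possible hidden side could falsify the rule:

```python
# A card must be turned over iff some possible hidden side would falsify
# "if there is a vowel on one side, then there is an even number on the other".
vowels = set("AEIOU")
letters = [chr(c) for c in range(ord("A"), ord("Z") + 1)]
numbers = range(10)

def falsifies(letter, number):
    # The rule is broken only by a vowel paired with an odd number.
    return letter in vowels and number % 2 == 1

hidden_cases = {
    "E": [falsifies("E", n) for n in numbers],   # hidden side is a number
    "K": [falsifies("K", n) for n in numbers],
    "4": [falsifies(ltr, 4) for ltr in letters], # hidden side is a letter
    "7": [falsifies(ltr, 7) for ltr in letters],
}
must_turn = sorted(card for card, cases in hidden_cases.items() if any(cases))
print(must_turn)  # ['7', 'E']
```

Only the E card (an odd number may hide behind it) and the 7 card (a vowel may hide behind it) can falsify the rule; the 4 card, the popular extra choice, is logically irrelevant.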
The THOG problem

Another widely investigated reasoning problem whose disjunctive logic makes it difficult for most people to solve is the THOG problem (Wason & Brooks, 1979). The problem presents four designs: a black triangle, a white triangle, a black circle, and a white circle. Subjects are given an exclusive disjunction rule. They are told that the experimenter has chosen one of the shapes (triangle or
circle) and one of the colors (black or white), and that any design is a THOG if, and only if, it has either the chosen shape or color, but not both. Told that the black triangle is a THOG, subjects are asked to classify each of the remaining designs. The correct solution is that the white circle is a THOG and that the white triangle and the black circle are not. This is because the shape and color chosen by the experimenter can only be either a circle and black, or a triangle and white. In both cases the same conclusion follows: the black circle and white triangle are not THOGs and the white circle is. The majority of subjects, however, fail to follow this disjunctive logic and the most popular answer is the mirror image of the correct response. Reminiscent of the selection task, subjects appear to have no difficulty evaluating what is and what is not a THOG once they are told the particular shape and color chosen by the experimenter (Wason & Brooks, 1979). It is when they face a disjunction of possible choices that subjects appear not to work through the consequences. Uncertain about the correct shape and color, subjects fail to consider the consequences of the two options and reach a conclusion that contradicts their preferred solution given either alternative. Smyth and Clark (1986) and Girotto and Legrenzi (1993) also address the relationship between failure on the THOG problem and nonconsequential reasoning through disjunctions.
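The THOG classification follows mechanically once the two consistent experimenter choices are enumerated, as in this sketch (ours):

```python
from itertools import product

designs = [("black", "triangle"), ("white", "triangle"),
           ("black", "circle"), ("white", "circle")]

def is_thog(design, chosen_color, chosen_shape):
    color, shape = design
    # Exclusive disjunction: the chosen color or the chosen shape, but not both.
    return (color == chosen_color) != (shape == chosen_shape)

# Only two choices are consistent with the black triangle being a THOG:
choices = [(c, s) for c, s in product(("black", "white"), ("triangle", "circle"))
           if is_thog(("black", "triangle"), c, s)]
print(choices)  # [('black', 'circle'), ('white', 'triangle')]

# Either consistent choice yields the same classification of every design:
for design in designs:
    verdicts = {is_thog(design, c, s) for (c, s) in choices}
    assert len(verdicts) == 1
    print(design, "is a THOG" if verdicts.pop() else "is not a THOG")
```

Both branches of the disjunction classify the white circle as a THOG and the other two remaining designs as non-THOGs, which is exactly the case analysis most subjects fail to carry out.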
Double disjunctions

Further evidence of subjects' reluctance to think through inferential disjunctions comes from a recent study of propositional reasoning conducted by Johnson-Laird, Byrne, and Schaeken (1992; see also Johnson-Laird & Byrne, 1991, Chapter 3). These investigators presented subjects with various premises and asked them to write down what conclusion, if any, followed from those premises. They concluded that reasoning from conditional premises was easier for all subjects than reasoning from disjunctive premises. In one study subjects were presented with "double disjunctions" - two disjunctive premises such as the following:

June is in Wales or Charles is in Scotland, but not both.
Charles is in Scotland or Kate is in Ireland, but not both.

To see what follows from this double disjunction, one simply needs to assume, in turn, the separate disjuncts. If we assume that June is in Wales, then it is not the case that Charles is in Scotland and, therefore, we know that Kate is in Ireland. Similarly, if we assume that Charles is in Scotland, then it is not the case that June is in Wales or that Kate is in Ireland. It therefore follows from this double disjunction that either Charles is in Scotland, or June is in Wales and Kate is in
Ireland. It is clear, once separate disjuncts are entertained, that certain conclusions follow. Yet, nearly a quarter of Johnson-Laird et al.'s subjects (ages 18-59, all working at their own pace) concluded that nothing follows, and many others erred in their reasoning, to yield a total of 21% valid conclusions. (Other kinds of disjunctions - negative and inclusive - fared worse, yielding an average of 5% valid conclusions.) As in the previous studies, if subjects are provided with relevant facts they have no trouble arriving at valid conclusions. Thus, once subjects are told that, say, June is in Wales, they have no trouble concluding that Kate is in Ireland. Similarly, they reach a valid conclusion if told that Charles is in Scotland. But when facing the disjunctive proposition, people seem to confound their epistemic uncertainty, what they may or may not know, with uncertainty about the actual consequences, the fact that one or another of the disjuncts must obtain. Presented with a disjunction of simple alternatives most subjects refrain from assuming the respective disjuncts and arrive at no valid conclusions.4
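Enumerating the eight possible worlds makes the valid conclusion immediate; a sketch (ours) of the double disjunction above:

```python
from itertools import product

# Premises, reading "or ... but not both" as exclusive disjunction:
#   June is in Wales  XOR  Charles is in Scotland
#   Charles is in Scotland  XOR  Kate is in Ireland
worlds = [(june, charles, kate)
          for june, charles, kate in product((True, False), repeat=3)
          if june != charles and charles != kate]
print(worlds)  # [(True, False, True), (False, True, False)]
```

Only two worlds survive: June is in Wales and Kate is in Ireland (with Charles elsewhere), or Charles is in Scotland (with June and Kate elsewhere), exactly the conclusion derived by assuming each disjunct in turn.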
Puzzles and paradoxes

The impossible barber

Many well-known puzzles and semantic paradoxes have an essentially disjunctive character. Consider, for example, that famous, clean-shaven, small-village barber who shaves all and only the village men who do not shave themselves. The description of this barber seems perfectly legitimate - one almost feels like one may know the man. Behind this description, however, lurks an important disjunction: either this barber shaves himself or he does not (which, incidentally, still seems perfectly innocent). But, of course, once we contemplate the disjuncts we realize the problem: if the barber shaves himself, then he violates the stipulation that he only shaves those who do not shave themselves. And if he does not shave himself, then he violates the stipulation that he shaves all those who don't. The impossible barber is closely related to another of Bertrand Russell's paradoxes, namely, the set paradox. The set paradox, which had a profound influence on modern mathematical thinking, concerns the set of all sets that do not contain themselves as members. (Is this set a member of itself?) The logical solution to these paradoxes is beyond the scope of the present paper (see Russell, 1943), but their "paradoxical" nature is instructive. One definition of "paradox" is "a statement that appears true but which, in fact, involves a contradiction" (Falletta, 1983). What characterizes the paradoxes above is a logical impossibility that goes undetected partly due to their underlying disjunctive nature. Unless we delve into the appropriate disjuncts (which are themselves often not trivial to identify) and contemplate their logical consequences, these impossible disjunctions appear innocuous.

4. Johnson-Laird, Byrne, and Schaeken (1992) investigate these disjunctions in the context of their theory of propositional reasoning. In fact, a number of psychological theories of propositional reasoning have been advanced in recent years (e.g., Braine, Reiser, & Rumain, 1984; Osherson, 1974-6; Rips, 1983), and the relationship between reasoning about disjunctive propositions and reasoning through disjunctive situations merits further investigation. One issue that arises out of the aforementioned studies is worth mentioning. Rips (1983), in his theory of propositional reasoning, which he calls ANDS, finds reason to assume certain "backward" deduction rules that are triggered only in the presence of subgoals. This leads Braine et al. to make the following observation:

The conditionality of inferences on subgoals places ANDS on a very short leash that has counterintuitive consequences. For example, consider the following premises:

There is an F or an R
If there is an F then there is an L
If there is an R then there is an L

It seems intuitively obvious that there has to be an L. If ANDS is given the conclusion There is an L, then ANDS makes the deduction. But if the conclusion given is anything else (e.g., There is an X, or There is not an L), ANDS will not notice that there has to be an L. (1984, pp. 357-8)

The explicit availability of the premises in the example above may distinguish it from a standard disjunction effect, wherein the specific disjuncts are not explicitly considered. Apart from that, the phenomenon - that it should be "intuitively obvious" that there has to be an L, but may "not be noticed" - seems a good simulation of the disjunction effect.
Knights and knaves

Many puzzles also rely on the surprising complexity or lack of clarity that arises in simple disjunctive situations. A class of such puzzles concerns an island in which certain inhabitants called "knights" always tell the truth, and others called "knaves" always lie. Smullyan (1978) presents a variety of knight-knave puzzles, and Rips (1989) investigates the psychology of reasoning about them. Consider, for example, the following puzzle (which the reader is invited to solve before reading further):

There are three inhabitants, A, B, and C, each of whom is a knight or a knave. Two people are said to be of the same type if they are both knights or both knaves. A and B make the following statements:

A: B is a knave.
B: A and C are of the same type.

What is C? (Smullyan, 1978, p. 22; reprinted in Rips, 1989)
We know that A must be either a knight or a knave. If A is a knight, then his statement about B must be true, so B is a knave. If B is a knave, then his
statement that A and C are of the same type is false. Hence, since we are assuming A is a knight, C must be a knave. On the other hand, suppose A is a knave. Then his statement that B is a knave is false, and B is a knight. Hence, B's statement that A and C are of the same type is true, and since A is a knave, so is C. Thus, we have shown that C is a knave regardless of whether A is a knight or a knave. While each assumption about A leads straightforwardly to a conclusion about C, the disjunctive nature of the puzzle makes it quite difficult. And the difficulties are not negligible: about 30% of Rips' subjects stopped working on a set of such problems relatively quickly and scored at less than chance accuracy (which was 5%), and the remaining subjects averaged a low solution rate, answering only 26% of the problems correctly. In fact, after reviewing subjects' think-aloud protocols, Rips (1989, p. 89) concludes that "most of the subjects' difficulties involved conceptual bookkeeping rather than narrowly logical deficiencies" (although see Johnson-Laird & Byrne, 1991, for a discussion of possible difficulties involved in more complex cases). Rips proceeds to stipulate that while the required propositional (logical) rules are equally available to everyone, "subjects differ in the ease with which they hit upon a stable solution path" (p. 109). Thus, it is not the simple logical steps that seem to create the difficulties in this case, but rather the general, conceptual "solution path" required to reason through a disjunction.
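The same case analysis can be run exhaustively; a sketch (ours) of Smullyan's puzzle:

```python
from itertools import product

# True = knight (always truthful), False = knave (always lies).
solutions = []
for a, b, c in product((True, False), repeat=3):
    says_a = not b        # A: "B is a knave"
    says_b = (a == c)     # B: "A and C are of the same type"
    # Each speaker's status must match the truth value of his statement.
    if a == says_a and b == says_b:
        solutions.append((a, b, c))

print(solutions)  # [(True, False, False), (False, True, False)]
assert all(c is False for (_, _, c) in solutions)  # C is a knave either way
```

Both consistent assignments (A a knight with B a knave, or the reverse) make C a knave, which is the disjunctive conclusion the text derives by hand.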
Conclusion

In their seminal Study of Thinking, Bruner, Goodnow, and Austin (1956) observed "the dislike of and clumsiness with disjunctive concepts shown by human subjects" (p. 181). The studies reviewed above indicate that people's dislike of and clumsiness with disjunctions extend across numerous tasks and domains. While various factors may contribute to the clumsiness with disjunctions in different domains, it nonetheless appears that a consideration of people's reluctance to think through disjunctions may shed light on common difficulties experienced in reasoning and decision making under uncertainty. Decision difficulty is sometimes attributed to emotional factors having to do with conflict and indecision. Alternatively, it can result from the sheer complexity that characterizes many decision situations. In the context of the present paper, on the other hand, STP violations were observed in a number of simple contexts of decision and reasoning that do not seem readily attributable to either emotional factors or complexity considerations. The disjunctive scenarios reviewed in this paper were quite simple, most involving just a couple of possible disjuncts. In contrast to many complicated tasks that people perform with relative ease, these problems appear computationally very simple. They serve to highlight the discrepancy between logical complexity on the one hand and psychological
E. Shafir
difficulty on the other. Recall, for example, Rips' observation in the context of the knights/knaves problem, to the effect that most subjects' difficulties involved conceptual bookkeeping, or arriving at a stable solution path, rather than narrowly logical deficiencies (for a related discussion, see Goldman, 1993). While it is possible that subjects occasionally forget intermediate results obtained in their reasoning process, subjects in these experiments were allowed to write things down and, besides, there often were very few intermediate steps to remember. In general, subjects appear reluctant to travel through the branches of a decision tree. Indeed, numerous studies have shown that merely encouraging subjects to systematically consider the various disjuncts often allows them to avoid common errors. In the context of the THOG problem, for example, Griggs and Newstead (1982) have shown that simply spelling out for subjects the four disjunctive possibilities reliably improves their performance. Similar effects have been shown by Johnson-Laird and Byrne (1991) in the context of the double disjunctions, and by Tversky and Shafir (1992) in the context of various disjunction effects in decision problems. Merely mentioning the few possible disjuncts can hardly be considered a major facilitation from the point of view of computational or logical complexity, but it does appear to set subjects on the right solution path, namely that of systematically contemplating the decision tree's various branches. Typically, shortcomings in reasoning are attributed to quantitative limitations of human beings as processors of information. "Hard problems" tend to be characterized by reference to the "required amount of knowledge", the "memory load", or the "size of the search space" (cf. Kotovsky, Hayes, & Simon, 1985; Kotovsky & Simon, 1990). These limitations are bound to play a critical role in many situations. 
As discussed by Shafir and Tversky (1992), however, such limitations are not sufficient to account for all that is difficult about thinking. In contrast to the "frame problem" (Hayes, 1973; McCarthy & Hayes, 1969), for example, which is trivial for people but exceedingly difficult for AI, the task of thinking through disjunctions is trivial for AI (which routinely implements "tree search" and "path finding" algorithms) but is apparently quite unnatural for people. It appears that decision under uncertainty can be thought of as another domain in which subjects exhibit a reluctance to think through disjunctive situations. Thinking through an event tree requires people to assume momentarily as true something that may in fact be false. People may be reluctant to make this assumption, especially when competing alternatives (other branches of the tree) are readily available. It is apparently difficult to devote full attention to each of several branches of an event tree (cf. Slovic & Fischhoff, 1977), particularly when it is known that most will eventually prove to be false hypothetical assumptions. Often, subjects may lack the motivation to traverse the tree simply because they assume, as is often the case, that the problem will not be resolved by separately
Uncertainty and the difficulty of thinking through disjunctions
evaluating the branches. We usually try to formulate problems in ways that have sifted through the irrelevant disjunctions: those that are left are normally assumed to matter. It appears that part of what may be problematic in decision under uncertainty is more fundamental than the problems typically envisioned, concerning the difficulties involved in the estimation of likelihoods and their combination with the estimated utilities of outcomes. Situations of uncertainty, it is suggested, can be thought of as disjunctive situations: one event may occur, or another. The studies above indicate that the disjunctive logic of these uncertain situations often introduces an uncertainty of its own. Thus, even in situations in which there should be no uncertainty since the same action or outcome will eventually obtain in either case, people's reluctance to think through these scenarios often creates an uncertainty that, if it were not for this reluctance, would not be felt or observed. As with numerous other systematic behavioral errors, the fact that people routinely commit a mistake does not, of course, mean that they are not capable of realizing it once it is apparent. Many of the patterns observed above, we suggest, reflect a failure on the part of people to detect and apply the relevant principles rather than a lack of appreciation for their normative appeal (see Shafir, 1993, for related discussion). Subjects' violations of STP in a variety of decision contexts were attributed to their failure to think through the disjunctive situation. In fact, when Tversky and Shafir (1992) first asked subjects to indicate their preferred course of action under each outcome and only then to make a decision in the disjunctive condition, the majority of subjects who opted for the same option under every outcome chose that option also when the precise outcome was not known.
The frequency of disjunction effects, in other words, substantially diminishes when the logic of STP is made salient. Like other normative principles of decision making, STP is generally satisfied when its application is transparent, but is sometimes violated when it is not (Tversky & Kahneman, 1986). Because it is a general "solution path" that seems to be neglected, rather than a limitation in logical or computational skill, a proficiency in thinking through uncertain situations may be something that people can improve upon through deliberate planning and introspection. Further study of people's psychology in situations of uncertainty and in other disjunctive situations is likely to improve our understanding and implementation of reasoning in general, and of the decision making process in particular.
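The neglected "solution path" is mechanical: evaluate each branch of the event tree separately and, when the same option wins on every branch, apply STP without waiting for the uncertainty to be resolved. A minimal sketch (Python; the outcome labels and payoffs are invented for illustration, not taken from the cited studies):

```python
# Hypothetical decision: two uncertain outcomes (branches), two options.
payoffs = {
    "outcome X": {"act": 300, "decline": 0},
    "outcome Y": {"act": 250, "decline": 0},
}

# Step 1: traverse the tree, finding the best option under each branch.
best_per_branch = {
    branch: max(options, key=options.get)
    for branch, options in payoffs.items()
}

# Step 2: STP - if one option dominates on every branch, choose it now,
# without resolving which outcome actually obtains.
winners = set(best_per_branch.values())
decision = winners.pop() if len(winners) == 1 else "resolve uncertainty first"
print(decision)  # "act" wins on both branches
```

The point of the sketch is the chapter's: the computation is trivial once the branches are separately contemplated; what subjects neglect is the traversal itself, not the arithmetic.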
References Bacharach, M., & Hurley, S. (1991). Issues and advances in the foundations of decision theory. In M. Bacharach & S. Hurley (Eds.), Foundations of decision theory: Issues and advances (pp. 1-38). Oxford: Basil Blackwell.
Bar-Hillel, M. (1973). On the subjective probability of compound events. Organizational Behavior and Human Performance, 9, 396-406. Bastardi, A., & Shafir, E. (1994). On the search for and misuse of useless information. Manuscript, Princeton University. Braine, M.D.S., Reiser, B.J., & Rumain, B. (1984). Some empirical justification for a theory of natural propositional logic. In G.H. Bower (Ed.), The psychology of learning and motivation (Vol. 18, pp. 313-371). New York: Academic Press. Bruner, J.S., Goodnow, J.J., & Austin, G.A. (1956). A study of thinking. New York: Wiley. Carlson, B.W., & Yates, J.F. (1989). Disjunction errors in qualitative likelihood judgment. Organizational Behavior and Human Decision Processes, 44, 368-379. Cheng, P.W., & Holyoak, K.J. (1985). Pragmatic reasoning schemas. Cognitive Psychology, 17, 391-416. Cheng, P.W., & Holyoak, K.J. (1989). On the natural selection of reasoning theories. Cognition, 33, 285-313. Cosmides, L. (1989). The logic of social exchange: has natural selection shaped how humans reason? Cognition, 31, 187-276. Evans, J.St.B.T. (1984). Heuristic and analytic processes in reasoning. British Journal of Psychology, 75, 451-468. Evans, J.St.B.T. (1989). Bias in human reasoning: Causes and consequences. Hillsdale, NJ: Erlbaum. Evans, J.St.B.T., & Lynch, J.S. (1973). Matching bias in the selection task. British Journal of Psychology, 64, 391-397. Falletta, N.L. (1983). The paradoxicon. New York: Doubleday. Fischhoff, B., Slovic, P., & Lichtenstein, S. (1978). Fault trees: sensitivity of estimated failure probabilities to problem representation. Journal of Experimental Psychology: Human Perception and Performance, 4, 330-344. Gardner, M. (1973). Free will revisited, with a mind-bending prediction paradox by William Newcomb. Scientific American, 229(1), 104-108. Gardner, M. (1974). Reflections on Newcomb's problem: a prediction and free-will dilemma. Scientific American, 230(3), 102-109. Gibbard, A., & Harper, W.L. (1978).
Counterfactuals and two kinds of expected utility. In C.A. Hooker, J.J. Leach, & E.F. McClennen (Eds.), Foundations and applications of decision theory (Vol. 1, pp. 125-162). Dordrecht: Reidel. Gilhooly, K.J. (1988). Thinking: Directed, undirected, and creative, 2nd ed. San Diego, CA: Academic Press. Girotto, V., & Legrenzi, P. (1993). Naming the parents of the THOG: Mental representation and reasoning. Quarterly Journal of Experimental Psychology, forthcoming. Goldman, A.I. (1993). Philosophical applications of cognitive science. Boulder, CO: Westview Press. Griggs, R.A., & Cox, J.R. (1982). The elusive thematic-materials effect in Wason's selection task. British Journal of Psychology, 73, 407-420. Griggs, R.A., & Newstead, S.E. (1982). The role of problem structure in a deductive reasoning task. Journal of Experimental Psychology: Learning, Memory and Cognition, 8, 297-307. Hammond, P. (1988). Consequentialist foundations for expected utility. Theory and Decision, 25, 25-78. Hayes, P. (1973). The frame problem and related problems in artificial intelligence. In A. Elithorn & D. Jones (Eds.), Artificial and human thinking. San Francisco: Jossey-Bass. Henslin, J.M. (1976). Craps and magic. American Journal of Sociology, 73, 316-330. Hofstadter, D.R. (1983). Dilemmas for superrational thinkers, leading up to a luring lottery. Scientific American, June. Reprinted in D.R. Hofstadter (1985). Metamagical Themas: Questing for the essence of mind and pattern. New York: Basic Books. Jahoda, G. (1969). The psychology of superstition. Harmondsworth: Penguin Books. Johnson, E.J., Hershey, J., Meszaros, J., & Kunreuther, H. (1993). Framing, probability distortions, and insurance decisions. Journal of Risk and Uncertainty, 7, 35-51. Johnson-Laird, P.N., & Byrne, R.M.J. (1991). Deduction. Hillsdale, NJ: Erlbaum. Johnson-Laird, P.N., Byrne, R.M.J., & Schaeken, W. (1992). Propositional reasoning by model. Psychological Review, 99, 418-439.
Johnson-Laird, P.N., Legrenzi, P., & Legrenzi, S.M. (1972). Reasoning and a sense of reality. British Journal of Psychology, 63, 395-400. Johnson-Laird, P.N., & Wason, P.C. (1970). A theoretical analysis of insight into a reasoning task. Cognitive Psychology, 1, 134-148. Kahneman, D., & Tversky, A. (1979). Prospect theory: An analysis of decision under risk. Econometrica, 47, 263-291. Kahneman, D., & Tversky, A. (1982). On the study of statistical intuitions. Cognition, 11, 123-141. Kotovsky, K., Hayes, J.R., & Simon, H.A. (1985). Why are some problems hard? Evidence from tower of Hanoi. Cognitive Psychology, 17, 248-294. Kotovsky, K., & Simon, H.A. (1990). What makes some problems really hard: Explorations in the problem space of difficulty. Cognitive Psychology, 22, 143-183. Legrenzi, P., Girotto, V., & Johnson-Laird, P.N. (1993). Focussing in reasoning and decision making. Cognition, 49, 37-66. Levi, I. (1991). Consequentialism and sequential choice. In M. Bacharach & S. Hurley (Eds.), Foundations of decision theory: Issues and advances (pp. 92-122). Oxford: Basil Blackwell. Manktelow, K.I., & Evans, J.St.B.T. (1979). Facilitation of reasoning by realism: Effect or non-effect? British Journal of Psychology, 70, 477-488. McCarthy, J., & Hayes, P. (1969). Some philosophical problems from the standpoint of Artificial Intelligence. In B. Meltzer & D. Michie (Eds.), Machine intelligence. New York: American Elsevier. McClennen, E.F. (1983). Sure-thing doubts. In B.P. Stigum & F. Wenstop (Eds.), Foundations of utility and risk theory with applications (pp. 117-136). Dordrecht: Reidel. Nozick, R. (1969). Newcomb's problem and two principles of choice. In N. Rescher (Ed.), Essays in honor of Carl G. Hempel. Dordrecht: Reidel. Oakhill, J.V., & Johnson-Laird, P.N. (1985). Rationality, memory and the search for counterexamples. Cognition, 20, 79-94. Osherson, D.N. (1974-6). Logical abilities in children (Vols. 2-4). Hillsdale, NJ: Erlbaum.
Osherson, D.N., Smith, E.E., Wilkie, A., Lopez, A., & Shafir, E. (1990). Category-based induction. Psychological Review, 97, 185-200. Quattrone, G.A., & Tversky, A. (1984). Causal versus diagnostic contingencies: On self-deception and on the voter's illusion. Journal of Personality and Social Psychology, 46, 237-248. Rapoport, A., & Chammah, A. (1965). Prisoner's dilemma. Ann Arbor: University of Michigan Press. Rips, L.J. (1983). Cognitive processes in propositional reasoning. Psychological Review, 90, 38-71. Rips, L.J. (1989). The psychology of knights and knaves. Cognition, 31, 85-116. Rothbart, M., & Snyder, M. (1970). Confidence in the prediction and postdiction of an uncertain event. Canadian Journal of Behavioral Science, 2, 38-43. Russell, B. (1943). The principles of mathematics. 2nd ed. New York: Norton. Savage, L.J. (1954). The foundations of statistics. New York: Wiley. Shafer, G., & Pearl, J. (Eds.) (1990). Readings in uncertain reasoning. San Mateo, CA: Morgan Kaufmann. Shafir, E. (1993). Intuitions about rationality and cognition. In K.I. Manktelow & D.E. Over (Eds.), Rationality: Psychological and philosophical perspectives (pp. 260-283). New York: Routledge. Shafir, E., Simonson, I., & Tversky, A. (1993). Reason-based choice. Cognition, 49, 11-36. Shafir, E., Smith, E.E., & Osherson, D.N. (1990). Typicality and reasoning fallacies. Memory and Cognition, 18, 229-239. Shafir, E., & Tversky, A. (1992). Thinking through uncertainty: Nonconsequential reasoning and choice. Cognitive Psychology, 24, 449-474. Slovic, P. (1972). From Shakespeare to Simon: speculations - and some evidence - about man's ability to process information. Oregon Research Institute Research Monograph, 12(2). Slovic, P., & Fischhoff, B. (1977). On the psychology of experimental surprises. Journal of Experimental Psychology: Human Perception and Performance, 3, 544-551. Smullyan, R.M. (1978). What is the name of this book? The riddle of Dracula and other logical puzzles.
New York: Simon & Schuster.
Smyth, M.M., & Clark, S.E. (1986). My half-sister is a THOG: Strategic processes in a reasoning task. British Journal of Psychology, 77, 275-287. Strickland, L.H., Lewicki, R.J., & Katz, A.M. (1966). Temporal orientation and perceived control as determinants of risk-taking. Journal of Experimental Social Psychology, 2, 143-151. Tversky, A., & Kahneman, D. (1974). Judgment under uncertainty: Heuristics and biases. Science, 185, 1124-1131. Tversky, A., & Kahneman, D. (1983). Extensional versus intuitive reasoning: The conjunction fallacy in probability judgment. Psychological Review, 90, 293-315. Tversky, A., & Kahneman, D. (1986). Rational choice and the framing of decisions. Journal of Business, 59, 251-278. Tversky, A., & Kahneman, D. (1992). Advances in prospect theory: Cumulative representation of uncertainty. Journal of Risk and Uncertainty, 5, 297-323. Tversky, A., & Koehler, D.J. (1993). Support theory: A nonextensional representation of subjective probability. Manuscript, Stanford University. Tversky, A., & Shafir, E. (1992). The disjunction effect in choice under uncertainty. Psychological Science, 3, 305-309. Wason, P.C. (1966). Reasoning. In B.M. Foss (Ed.), New horizons in psychology (Vol. 1). Harmondsworth: Penguin. Wason, P.C. (1969). Structural simplicity and psychological complexity: Some thoughts on a novel problem. Bulletin of the British Psychological Society, 22, 281-284. Wason, P.C. (1983). Realism and rationality in the selection task. In J.St.B.T. Evans (Ed.), Thinking and reasoning: Psychological approaches. London: Routledge & Kegan Paul. Wason, P.C., & Brooks, P.G. (1979). THOG: The anatomy of a problem. Psychological Research, 41, 79-90. Wason, P.C., & Johnson-Laird, P.N. (1970). A conflict between selecting and evaluating information in an inferential task. British Journal of Psychology, 61, 509-515. Wason, P.C., & Johnson-Laird, P.N. (1972). Psychology of reasoning: Structure and content. Cambridge, MA: Harvard University Press.
14 The perception of rhythm in spoken and written language Anne Cutler Max-Planck-Institut für Psycholinguistik, Wundtlaan 1, 6525 XD Nijmegen, Netherlands; MRC Applied Psychology Unit, Cambridge, UK
Part I: Introduction Rhythm is perceptually salient to the listener. This claim is central to the research project briefly described below: a large-scale investigation of listening, in which the principal issue was how listeners segment continuous speech into words. Listeners must recognize spoken utterances as a sequence of individual words, because the utterances may never previously have been heard. Indeed, listeners' subjective impression of spoken language is that it may be effortlessly perceived as a sequence of words. Yet speech signals are continuous: there are only rarely reliable and robust cues to where one word ends and the next begins. Research on this issue in a number of languages prompted apparently differing proposals. In English, experimental evidence (Cutler & Norris, 1988; Cutler & Butterfield, 1992; McQueen, Norris & Cutler, 1994) suggested that lexical segmentation could be efficiently achieved via a procedure based on exploitation of stress rhythm: listeners assumed that each strong syllable was likely to be the beginning of a new lexical word, and hence segmented speech input at strong syllable onsets. Experimental evidence from French, in contrast (Mehler, Dommergues, Frauenfelder & Segui, 1981; Segui, Frauenfelder & Mehler, 1981) motivated a syllable-by-syllable segmentation procedure. Similar to the French findings were results from Catalan, and, under certain conditions, Spanish (Sebastian, Dupoux, Segui & Mehler, 1992). Cross-linguistic studies with French and English listeners (Cutler, Mehler, Norris & Segui, 1986, 1992) established that these contrasting results reflected differences between listeners rather than being effects of the input itself; non-native listeners did not use the segmentation procedures used by native listeners, but instead could apply their native procedure—sometimes inappropriately—to non-native input.
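The stress-based procedure described above can be stated as a toy algorithm (Python; the syllabification and strong/weak markings below are invented for illustration, not taken from the experimental materials): posit a word boundary at the onset of every strong syllable.

```python
def segment_at_strong(syllables):
    """Toy version of stress-based segmentation: start a new word
    at every strong ('s') syllable; weak ('w') syllables attach on."""
    words, current = [], []
    for syllable, strength in syllables:
        if strength == "s" and current:
            words.append(current)   # a strong onset closes the previous word
            current = []
        current.append(syllable)
    if current:
        words.append(current)
    return words

# Hypothetical input: "solid furniture" as a stream of marked syllables.
stream = [("sol", "s"), ("id", "w"), ("fur", "s"), ("ni", "w"), ("ture", "w")]
print(segment_at_strong(stream))  # [['sol', 'id'], ['fur', 'ni', 'ture']]
```

The sketch succeeds exactly where the chapter says the strategy pays off: for words whose initial syllable is strong; a word beginning with a weak syllable would be mis-attached to its predecessor, which is the kind of error the juncture-misperception studies document.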
Language rhythm offered a framework within which the English and French results could be interpreted as specific realizations of a single universal procedure. The rhythm of English is stress-based, while French is said to have a syllable-based rhythm. Thus in both these cases the segmentation procedure preferred by listeners could be viewed as exploitation of the characteristic rhythmic structure of the language. To test this hypothesis, experiments similar to those conducted in French and English were carried out in Japanese, a language with a rhythmic structure based on a subsyllabic unit, the mora (Otake, Hatano,
Cutler & Mehler, 1993; Cutler & Otake, 1994). These showed indeed that Japanese listeners could effectively use moraic structure to segment spoken input, thus supporting the hypothesis that listeners solve the segmentation problem in speech recognition by exploiting language rhythm—whatever form the rhythmic structure of their native language may take. The segmentation problem, however, is specific to listening. One of the many differences between listening and reading is that in most orthographies the segmentation problem does not arise—words are clearly demarcated in most printed texts. Therefore it is at least arguable that the reader, who is confronted with no segmentation problem, has no need of language rhythm. In other words: Rhythm will not be perceptually salient to the reader. The following section of this chapter was conceived as a (self-referential) test of this suggestion. The reader is invited to read the piece as it originally appeared (Part II). For those still unconvinced, Part III may serve as a useful comparison to Part II. Part II: The Perception of Rhythm in Language 1. The segmentation problem The orthography of English has a very simple basis for establishing where words in written texts begin and end: both before and also after every word are empty spaces and this demarcation surely helps the reader comprehend. In a spoken text, however, as presented to a hearer, such explicit segmentation cues are rarely to be found; little pauses after every single word might make things clearer, but the input is continuous, a running stream of sound. This implies that part of listening involves an operation whereby input is segmented, to be processed word by word, for we cannot hold in memory each total collocation, as most sentences we come across are previously unheard.
Yet we listeners experience no sense of some dramatic act of separating input into pieces that are known; as we listen to an utterance it seems unproblematic—words in sentences seem just as clear as words that stand alone. Just how listeners accomplish such an effortless division is a question that psychologists have now begun to solve, and this paper will describe (although with minimal precision) some experimental studies showing what it might involve. The findings, as this summary explains, at once can vindicate the order of the problem and the hearer's sense of ease, for though speech must be segmented, yet the data plainly indicate that rhythm in the input makes segmenting speech a breeze. 2. The language-specificity of rhythmic structure Now linguistic rhythmic structures have a noticeable feature in that language-universal they are definitely not. This fact is all too obvious to any hapless teacher who has tried to coax the prosody of French from, say, a Scot. Thus while English rhythmic structure features alternating stresses in which syllables contrast by being either strong or weak, this particular endowment is not one which French possesses, having rather one where syllables are equal, so to speak. These distinctions were expressed within traditional phonetics as uniquely based on timing (stress or syllable), though now we admit of more complexity in rhythmic exegetics and of other types of patterning that languages allow; thus in Japanese the mora is the (subsyllabic) unit which provides the root of rhythm, as phonologists maintain. An important source of evidence, and few would dare impugn it, can be found in verse and poetry: the metrical domain. So compare the English limerick, a form which thousands take up, with the haiku, a poetic form of note in Japanese; there are five lines in a limerick, and stress defines their makeup: the third and fourth are two-stress lines, the others all are threes; and analogously haiku have their composition reckoned by the mora computation, in a manner iron-cast: while the longest line in morae, having seven, is the second, there are five and only five in both the first line and the last. 3. The use of rhythm in listening Just those rhythms found in poetry are also those which function in perception, as the work referred to earlier suggests; thus for English there is evidence[1] involving a conjunction of spontaneous performance and experimental tests, which together show that listeners use stress in segmentation, by hypothesizing boundaries when syllables are strong. Since the lexicon has far more words with strong pronunciation of the word-initial syllable, this method can't go wrong. In comparison with English we should surely not ignore a set of studies[2] run quite recently on hearers in Japan, which produced results consistent with the story that the mora is the unit that these listeners segment by when they can; while those studies[3] that initiated all this lengthy series were performed on native listeners of French some years ago, and they demonstrated well beyond the range of any queries that these listeners used syllables for parsing speech en mots. More experiments were subsequently carried out in Spanish, and in Catalan and Portuguese and Québécois and Dutch, which in spite of minor variance did nothing that would banish the conclusion that for hearers rhythm matters very much.
So the picture that emerges is that rhythm as exhibited in verse forms of a language can effectively predict those procedures which, assuming that their use is not inhibited, allow us to declare the segmentation problem licked. 4. The non-use of rhythm in reading Unexpected complications to this neat account, however, are observed when we consider rhythms found in written text. Some preliminary findings, which this section will endeavor to elucidate, at first left their discoverer perplexed. For if rhythm is so integral a part of our audition, then it ought to be the case that it is hard to overlook; but the most pronounced of rhythms can escape our recognition when they're reproduced in printing in an article or book. Late in 1989 the present author wrote a letter, in which verse (or rather, doggerel) pretended to be prose, to at least a hundred friends, from whom responses showed the better part had not perceived the rhymes at all, wherever they arose. In a follow-up, a colleague[4] gave this ready-made material to subjects to read out, and his results were even worse: of the readers who produced the text, in strict progression serial, not one perceived the letter as a rhyming piece of verse. But the selfsame text, however, may be printed as a ballad (thus, with lines which end in rhymes), and any reader can descry where the rhythm is, which renders this interpretation valid: written rhythm's only noticed when it clearly hits the eye. But perhaps the readers' lack of use of rhythm, as conceded, if judiciously considered has a lesson it can teach: it arises just because no segmentation step is needed. Thus the role of language rhythm is in understanding speech.
[1] Cutler & Norris (1988); Cutler & Butterfield (1992).
[2] Otake et al. (1993); Cutler & Otake (1994).
[3] Mehler et al. (1981); Segui et al. (1981).
[4] Many thanks to Aki Fukushima and Bob Ladd for conducting this study and permitting me to describe it.
Part III: The Perception of Rhythm in Language 1. The segmentation problem The orthography of English has a very simple basis for establishing where words in written texts begin and end: both before and also after every word are empty spaces and this demarcation surely helps the reader comprehend. In a spoken text, however, as presented to a hearer, such explicit segmentation cues are rarely to be found; little pauses after every single word might make things clearer, but the input is continuous, a running stream of sound. This implies that part of listening involves an operation whereby input is segmented, to be processed word by word, for we cannot hold in memory each total collocation, as most sentences we come across are previously unheard. Yet we listeners experience no sense of some dramatic act of separating input into pieces that are known; as we listen to an utterance it seems unproblematic— words in sentences seem just as clear as words that stand alone. Just how listeners accomplish such an effortless division is a question that psychologists have now begun to solve, and this paper will describe (although with minimal precision) some experimental studies showing what it might involve. The findings, as this summary explains, at once can vindicate the order of the problem and the hearer's sense of ease, for though speech MUST be segmented, yet the data plainly indicate that rhythm in the input makes segmenting speech a breeze. 2. The language-specificity of rhythmic structure Now linguistic rhythmic structures have a noticeable feature in that language-universal they are definitely not. This fact is all too obvious to any hapless teacher who has tried to coax the prosody of French from, say, a Scot.
Thus while English rhythmic structure features alternating stresses in which syllables contrast by being either strong or weak, this particular endowment is not one which French possesses, having rather one where syllables are equal, so to speak. These distinctions were expressed within traditional phonetics as uniquely based on timing (stress or syllable), though now we admit of more complexity in rhythmic exegetics
and of other types of patterning that languages allow; thus in Japanese the mora is the (subsyllabic) unit which provides the root of rhythm, as phonologists maintain. An important source of evidence, and few would dare impugn it, can be found in verse and poetry: the metrical domain. So compare the English limerick, a form which thousands take up, with the haiku, a poetic form of note in Japanese; there are five lines in a limerick, and stress defines their makeup: the third and fourth are two-stress lines, the others all are threes; and analogously haiku have their composition reckoned by the mora computation, in a manner iron-cast: while the longest line in morae, having seven, is the second, there are five and only five in both the first line and the last. 3. The use of rhythm in listening Just those rhythms found in poetry are also those which function in perception, as the work referred to earlier suggests; thus for English there is evidence involving a conjunction of spontaneous performance and experimental tests, which together show that listeners use stress in segmentation, by hypothesizing boundaries when syllables are strong. Since the lexicon has far more words with strong pronunciation of the word-initial syllable, this method can't go wrong. In comparison with English we should surely not ignore a set of studies run quite recently on hearers in Japan, which produced results consistent with the story that the mora is the unit that these listeners segment by when they can; while those studies that initiated all this lengthy series were performed on native listeners of French some years ago, and they demonstrated well beyond the range of any queries that these listeners used syllables for parsing speech en mots.
More experiments were subsequently carried out in Spanish, and in Catalan and Portuguese and Quebecois and Dutch, which in spite of minor variance did nothing that would banish the conclusion that for hearers rhythm matters very much. So the picture that emerges is that rhythm as exhibited in verse forms of a language can effectively predict those procedures which, assuming that their use is not inhibited, allow us to declare the segmentation problem licked. 4. The non-use of rhythm in reading Unexpected complications to this neat account, however, are observed when we consider rhythms found in written text.
Some preliminary findings, which this section will endeavor to elucidate, at first left their discoverer perplexed. For if rhythm is so integral a part of our audition, then it ought to be the case that it is hard to overlook; but the most pronounced of rhythms can escape our recognition when they're reproduced in printing in an article or book. Late in 1989 the present author wrote a letter, in which verse (or rather, doggerel) pretended to be prose, to at least a hundred friends, from whom responses showed the better part had not perceived the rhymes at all, wherever they arose. In a follow-up, a colleague gave this ready-made material to subjects to read out, and his results were even worse: of the readers who produced the text, in strict progression serial, not one perceived the letter as a rhyming piece of verse. But the selfsame text, however, may be printed as a ballad (thus, with lines which end in rhymes), and any reader can descry where the rhythm is, which renders this interpretation valid: written rhythm's only noticed when it clearly hits the eye. But perhaps the readers' lack of use of rhythm, as conceded, if judiciously considered has a lesson it can teach: it arises just because no segmentation step is needed. Thus the role of language rhythm is in understanding speech. References Cutler, A., & Butterfield, S. (1992). Rhythmic cues to speech segmentation: Evidence from juncture misperception. Journal of Memory and Language, 31, 218-236. Cutler, A., Mehler, J., Norris, D.G., & Segui, J. (1986). The syllable's differing role in the segmentation of French and English. Journal of Memory and Language, 25, 385-400. Cutler, A., Mehler, J., Norris, D., & Segui, J. (1992). The monolingual nature of speech segmentation by bilinguals. Cognitive Psychology, 24, 381-410. Cutler, A., & Norris, D.G. (1988). The role of strong syllables in segmentation for lexical access. Journal of Experimental Psychology: Human Perception and Performance, 14, 113-121. Cutler, A.
and Otake, T. (1994). Mora or phoneme? Further evidence for language-specific listening. Journal of Memory and Language, 33, 824-844. McQueen, J.M., Norris, D.G. and Cutler, A. (1994). Competition in spoken word recognition: Spotting words in other words. Journal of Experimental Psychology: Learning, Memory and Cognition, 20, 621-638. Mehler, J., Dommergues, J.-Y., Frauenfelder, U. & Segui, J. (1981). The syllable's role in speech segmentation. Journal of Verbal Learning & Verbal Behavior, 20, 298-305. Otake, T, Hatano, G., Cutler, A. & Mehler, J. (1993). Mora or syllable? Speech segmentation in Japanese. Journal of Memory and Language, 32, 358-378. Sebastian-Galles, N., Dupoux, E., Segui, J. & Mehler, J. (1992). Contrasting syllabic effects in Catalan and Spanish. Journal of Memory & Language, 31, 18-32. Segui, J., Frauenfelder, U.H. & Mehler, J. (1981). Phoneme monitoring, syllable monitoring and lexical access. British Journal of Psychology, 72, 471-477.
15 Categorization in early infancy and the continuity of development
Peter D. Eimas*
Department of Cognitive and Linguistic Sciences, Brown University, Providence, RI 02912, USA
Abstract
Arguments and evidence are presented for the conclusion that the young infant's perceptually based categorical representations for natural kinds - animals in this case - are the basis for their mature conceptual counterparts. In addition, it is argued that conceptual development is continuous in nature and without the need for special developmental processes. A consideration of the development of the syllabic, segmental, and featural categories of phonology shows a more complex pattern of change - one marked by both continuities and discontinuities in the representations themselves and the processes that produce them.
1. Introduction
The standard wisdom that underlies developmental theories of human cognition includes the presumption that development is marked by discontinuities - by stages that differ in kind with respect to the medium of mental representations and often the processes or rules that operate on these representations. The foremost proponent of this general view of development in the twentieth century was certainly Piaget (1952, for example), but comprehensive theories of cognitive development akin to that of Piaget are found in the writings of Bruner, Olver, and Greenfield (1966) and Vygotsky (1934/1962), among others. They, like Piaget, hypothesized a variety of stages and substages to explain the substantive changes in perception and thought that were believed to mark our intellectual passage from birth to maturity.
* E-mail [email protected]
Preparation of this discussion and the author's research described herein were supported by Grants HD 05331 and HD 28606. The author thanks Gundeep Behl-Chadha, Joanne L. Miller, and Paul C. Quinn for their comments on earlier versions.
More typical of the writings of recent developmental theorists are explanations of (presumed) qualitative changes in quite restricted aspects of cognition. An earlier example of this approach is found in the Kendlers' description of the changes in discriminative learning and transfer that occur during the fifth and seventh years of life (e.g., Kendler & Kendler, 1962; but see Eimas, 1970). Other, more recent examples are offered by investigators concerned with developmentally correlated regressions in various facets of cognition from imitation to conservation and language comprehension. These relatively specific reversals in cognitive growth are typically conceived of (or can be conceived of) as reflecting the consequences of developmentally timed neuronal reorganizations that alter cognitive processes and strategies (Bever, 1982). The initial concerns of the present discussion pertain to the conceptual development of natural kinds, specifically animals, from early infancy to early childhood and how this development relates to the typically presumed discontinuities in development. In the second section, I consider development of phonology from the perspective of continuities and discontinuities across representations and the processes that yield these representations.
2. Conceptual development
At least since the time of Piaget (Piaget & Inhelder, 1964) and Vygotsky (1934/1962), this facet of cognitive development has been viewed as a process marked by a series of stages in which the categorical representations for objects and events that populate our environment differ in kind across the inevitable progression, ceteris paribus, from one developmental period to the next. For example, the representations hypothesized by Piaget in the earliest years of life were based on sensorimotor representations, whereas for Vygotsky the earliest representations were the idiosyncratic associations of the individual child among the things of the world - cognitive "heaps". Only later, at or near the time of puberty, do conceptual representations that are abstract, logically structured, and meaningful emerge. And only then, according to Piaget, are acts of cognition involving concepts able to support the logic of problem solving, for example, or, according to Vygotsky, to provide the conceptually based meanings that are conveyed by human language. For Bruner and his colleagues, only when concepts became symbolically represented, as opposed to enactively (motorically) or iconically represented, did thought attain mature levels of computational power. This stage-wise view of conceptual development is found today in the writings of Mandler (1992) and Karmiloff-Smith (1992), for example. It is important to note that these theorists posit not only a difference in kind between early and
later representations of the categories of our world, but they also posit specialized mechanisms that perform this transition - a transformation that is taken to make human cognition in its mature form possible. Mandler (1992) has assumed that the earliest categorical structures of infants, the earliest parsing of things and events in the world, are perceptual in nature (cf. Quinn & Eimas, 1986) and remain so until they undergo a process of perceptual analysis that yields the meaningful conceptual representations of older children and adults - representations that permit us to know the kind of thing being represented. In a similar vein, Karmiloff-Smith (1992) has posited a process of representational redescription that operates a number of times during development, with each redescription producing increasingly more abstract representations that eventually become available to consciousness.1 Quinn and Eimas (1986) also noted the qualitative-like differences between the earliest perceptually based representations of young infants and the conceptually driven representations of maturity - the latter having a core that was then presumed to include information that was not sensory or perceptual in nature. Quinn and Eimas were, however, silent about the causal factors that brought about these apparent changes, which I now attempt to describe, drawing on a discussion of conceptual development by Quinn, Eimas, and Behl-Chadha (in preparation). In this endeavor we depart from the classical view of conceptual development, offering instead the contention that development of conceptual representations and even of the naive theories in which they are ultimately embedded is a continuous process that does not require the application of special-purpose processes of development.
What is necessary instead is the application and re-application of processes that are available to all sentient beings (as far as we know), that are innately given, that are operative early in life, and that remain operative throughout the course of our existence. The initial function of these processes is to form perceptually driven categorical representations, whereas their later function is to enrich these initial representations informationally and to do so to an extent that they begin to take on the characteristics of concepts. In effect, this view posits continuity with respect to the cognitive operations by which conceptual representations develop and considers the nature of categorical representations to be unchanging across development, the apparent qualitative difference between perceptually and conceptually driven
1 Xu and Carey (1993) have presented evidence that 10-month-old infants use spatiotemporal but not property-kind information in forming representations of specific physical objects (sortal concepts) that specify their boundaries and numerical identity (i.e., whether an object is identical to one encountered at another time). Although this can be viewed as a discontinuity in conceptual development, it can also be described as part of a continuous process whereby different attributes of the physical world attain functional significance at different ages.
representations being in actuality for us one of degree of informational richness and complexity.2 It is important from the view taken here that definitions of mature concepts are vague and more than somewhat imprecise with respect to the information that is represented and the information that presumably distinguishes mature from immature representations (Quinn et al., in preparation). In a recent discussion of the current "Zeitgeist on the nature of concepts" Jones and Smith (1993) offer the following description of concepts that they take to summarize the prevailing view: ". . . our perceptual experiences as we encounter objects in the world are represented at the periphery of our concepts. At the center lies our nonperceptual knowledge: principally, beliefs about the origins and causes of category membership. Thus at the periphery of our concept kitten would be a description of its surface properties, fur, head and body shape, four-legged, and so forth. At the center would be understanding of the kitten as a living entity - as an immature cat, as having been born of a cat" (p. 114). In opposition to this "Zeitgeist on the nature of concepts", we offer the idea that the nonperceptual knowledge that is taken to mark concepts as opposed to perceptual categories finds its origins and basis in the same processes of perception and categorization that make possible the initial perceptually driven categorical representations; thus it too is perceptually based. For example, "living entity" in the example of kitten described above may be given by such information as the self-propelled and intended motion that is associated with living animals (sufficiently restricted initially to exclude mollusks and plants, at the very least) and that would seem to be readily perceived and distinguished from other kinds of motion by young infants (e.g., Bertenthal, Proffitt, & Cutting, 1984). 
To be represented as an "immature cat" could well be a consequence of such characteristics as size (conditional upon other feline attributes), playfulness, facial characteristics, and the kitten's sounds of communication - characteristics again likely to be perceivable by infants (cf. Quinn et al., in preparation). The acquisition of these attributes and undoubtedly others is presumed, as noted, to be gradual and most importantly to rest on the processes of categorization and possibly prototype extraction that yield the early categorical structures of natural and artifactual kinds. There is no question that knowledge about animals (and other aspects of our world) is also gained by means not directly mediated by our senses and perceptual systems in the classical sense. Rather, considerable biological knowledge about
2 It is important to note that this position should not be construed as an argument for an empiricist view of conceptual development. Given the precocity of the infant's abilities to categorize and their sophistication, it would seem obvious that these abilities have strong biological determinants. In addition, the information the infant is sensitive to in the course of parsing the world and in adding knowledge to these categorical representations must likewise be (in part) a function of our biology, as has been well argued by Edelman (1987), for example.
what distinguishes different kinds of animals, as well as such biological principles as inheritance, respiration, digestion, and the like, is gained in the formal and informal processes of education by means of language. These acquisitions, we likewise argue, do not require processes specially designed for cognitive growth. What is required is simply the association of language-based knowledge with mental structures that currently represent the animals in question. Presumably, acquisition of this knowledge is governed by general, if at first rudimentary, laws of language comprehension and learning. In support of this position, I note our recent research on the young infant's categorization of natural kinds, in this case species of animals. We have obtained considerable evidence showing that infants as young as 3 and 4 months of age can form representations for a variety of mammalian species that are quite exclusive, that is, at (or nearly at) what would be considered the basic level of representation (Eimas & Quinn, in press; Eimas, Quinn, & Cowan, submitted; Quinn, Eimas, & Rosenkrantz, 1993). These include, for example, a categorical representation for cats that excludes horses, tigers, and even female lions given appropriate experience that contrasts cats and lions. Infants of this age can also form a global representation for a number of individual mammalian species, a rudimentary superordinate-like representation (Behl-Chadha, 1993). In sum, quite young infants are starting with categorical representations that parse at least part of the world of animals in ways that will continue to have significance throughout the course of development. What comes with further experience, or so we believe, is a quantitative enrichment, and not a qualitative transformation of these early categorical representations. 
We have also argued that the processes of perceptually based categorization and association can in principle yield more abstract representations, for example, an independent representation for animate things (admittedly restricted initially to mammals and similar animals). The common aspects of the features for animate things, for example, biological motion, that are found in the categorical representations for different species are presumed to be recognized and abstracted (categorized) and form the basis for a representation for animate beings. These processes of recognition and abstraction operate across the considerable variation that exists in the individual representations of these features in different species. They are believed to be the same processes that permit very young infants to abstract prototypic values for a variety of (possibly simpler) stimulus attributes such as orientation (Quinn & Bomba, 1986). As a consequence, a number of properties, for example locomotion and the possession of faces, legs, elongated bodies and so forth, may be individuated and represented by some average or prototypic value or range of permissible values. Furthermore, inasmuch as these attributes are correlated in their representations for specific animals, their abstract prototypic values may be bound together and in this manner readily form a unified, if at first rudimentary, representation for animate
things. Given representations for animals at (or nearly at) the basic level as well as the beginnings of a global representation and an emerging representation for animate things, the infant has a number of the necessary representations for the beginnings of a naive theory of biology that will ultimately bring organization to increasingly complex biologically based representations and complete our story of conceptual development for the domain of biology (cf. Murphy & Medin, 1985). What is important is that the emergence of what is viewed as conceptually based categorical representations and even a naive theory can be viewed as a continuous developmental process and one that does not need special processes that transform simple, immature representations into complex, more abstract symbolic representations. While we have applied this line of thought only to biological entities, we see no reason that such a view cannot have wide application to conceptual development across many domains of natural kinds and artifacts. It is a view very much in accord with that recently offered by Spelke, Breinlinger, Macomber, and Jacobson (1992) for object perception. They theorized that development begins around a "constant core" that yields the perception and representation of coherent objects that adhere to (some) laws of physics. To this core we would add the ability to form perceptually based categorical representations for objects and events, which on gradually acquiring further knowledge become the conceptual representations that make human cognition possible.
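The abstraction process sketched above - extracting prototypic attribute values across exemplars and assigning new instances to the category with the nearest prototype - can be caricatured computationally. The following is a toy illustration only, not the authors' model; the attribute dimensions and all numeric values are invented for the sketch.

```python
import statistics

# Each exemplar is a vector of perceptual attribute values
# (e.g., body elongation, leg length, face size); numbers are invented.
cat_exemplars = [[0.9, 0.3, 0.5], [1.0, 0.35, 0.45], [0.85, 0.25, 0.55]]
horse_exemplars = [[2.0, 1.2, 0.9], [2.1, 1.1, 1.0], [1.9, 1.3, 0.95]]

def prototype(exemplars):
    """Abstract a prototype as the attribute-wise mean of the exemplars."""
    return [statistics.mean(values) for values in zip(*exemplars)]

def categorize(instance, prototypes):
    """Assign an instance to the category whose prototype is nearest."""
    def distance(proto):
        return sum((a - b) ** 2 for a, b in zip(instance, proto)) ** 0.5
    return min(prototypes, key=lambda name: distance(prototypes[name]))

protos = {"cat": prototype(cat_exemplars), "horse": prototype(horse_exemplars)}
print(categorize([0.95, 0.3, 0.5], protos))  # a new cat-like instance -> "cat"
```

The same averaging step, applied to attributes shared across species prototypes (biological motion, faces, legs), is what the text suggests could yield a more abstract representation for animate things.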
3. Phonological development
In this section, I discuss development of the categories of an emerging system of phonology in terms of the ideas used to describe the emergence of categorical representations for animals in infants and young children. I begin by describing research on the perception of speech, noting in particular the infant's abilities to categorize speech and the apparent constancy across development with respect to the processes of perception that yield categorical representations. I then note, however, that there is evidence for and against the continuity of phonological development with respect to the relation between the original categorical representations for speech and those necessary for a mature phonology and the processes that cause these changes. There are numerous studies showing that infants in the first weeks and months of life are not only sensitive to and attracted by human speech, but that they are able to represent the sounds of speech categorically (see Eimas, Miller, & Jusczyk, 1987, and Jusczyk, in press, for reviews). The latter is important in that it shows that infants are able to listen through the natural variation in speech that arises from instance-to-instance variation in production, the characteristics of the speaker, including the speaker's sex, emotional state, and rate of production, and phonetic context. This process is necessary if infants are to ultimately arrive at
structures that can support perceptual constancy and provide the basic constituents of language. Interestingly, these early representations are not only categorical in nature but also organized entities (Eimas & Miller, 1992). Thus, for example, Eimas, Siqueland, Jusczyk, and Vigorito (1971) showed that 1- and 4-month-old infants failed to discriminate small differences in the speech signal corresponding to moment-to-moment variations in production (voice onset time in this case) when the different exemplars were drawn from either of the two voicing categories of English in syllable-initial position. However, when the same acoustic difference signaled different voicing categories, the sounds were reliably discriminated. Similar findings have been obtained for voicing information for infants being raised in other language communities as well as for information corresponding to differences in place and manner of articulation, and for sufficiently brief vocalic distinctions. Further evidence for categorization comes from a series of experiments by Kuhl (e.g., 1979, 1983) using a quite different experimental procedure and somewhat older infants. She found that infants, approximately 6 months of age, form equivalence classes for a number of consonantal and vocalic categories whose exemplars varied considerably in their acoustic properties as a consequence of differences in speaker, intonation patterns, and phonetic environment and were readily discriminable from each other. Moreover, given that categorizations of this form occur on the first exposure to novel category exemplars and that the initial categories based on voicing, for example, exist without constraint by the parental language, the categorization of speech would appear to be a basic biologically given characteristic of human infants. The categorization process for speech in effect maps a potentially indefinite number of signals onto a single representation. 
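This many-to-one mapping, and its context sensitivity discussed next, can be caricatured as a boundary along the voice onset time (VOT) continuum. The sketch below is a toy illustration, not a model from the chapter; the 25 ms and 20 ms boundary values are invented for the example, standing in for the rate-dependent boundary shifts reported by Eimas and Miller (1980).

```python
def voicing_category(vot_ms, fast_speech=False):
    """Map a continuous VOT value (ms) onto one of two voicing categories.

    At a faster speech rate the category boundary shifts to a shorter VOT,
    mirroring the rate-dependent boundary shifts found with infants and
    adults. Boundary values here are illustrative only.
    """
    boundary = 20.0 if fast_speech else 25.0
    return "voiced" if vot_ms < boundary else "voiceless"

# Within-category acoustic differences collapse onto one representation...
assert voicing_category(5) == voicing_category(15) == "voiced"
# ...while the same 10 ms difference across the boundary is distinctive.
assert voicing_category(20) != voicing_category(30)
# The mapping itself is not invariant: 22 ms is "voiced" at a normal rate
# but "voiceless" when the surrounding speech is fast.
assert voicing_category(22) == "voiced"
assert voicing_category(22, fast_speech=True) == "voiceless"
```

The last two assertions capture the point developed in the next paragraph: the boundary between categories, and hence the mapping from signal to representation, moves with contextual factors such as rate of speech.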
What is particularly interesting is that this many-to-one mapping is itself not invariant. Studies with young infants, based on experiments performed originally with adult listeners, have shown that the boundary between categories can be altered as a consequence of systematically varying contextual factors such as rate of speech (Eimas & Miller, 1980; Miller & Eimas, 1983; and see Levitt, Jusczyk, Murray, & Carden, 1988, and Eimas & Miller, 1991). In a similar vein, research with adults and infants has shown that the multiple acoustic properties that are sufficient and available to signal phonetic contrasts enter into perceptual trading relations (e.g., Fitch, Halwes, Erickson, & Liberman, 1980, and Miller & Eimas, 1983, respectively). That is to say, the values along a given cue influence the values along a second cue that signal a categorical representation. The processes for the categorization of speech would thus appear to be precocious and stable across time - one form of continuity in development. However, it is important to note that experiments evidencing the categorical perception of speech in infancy do not inform us whether these early categorical representations are linguistic. Nor do they inform us whether the units of
categorization are the equivalent of features, segments, or syllables, whether they be auditory or linguistic. It is at this point that arguments for continuity become more complex. I take these initial representations to be linguistic in nature - a highly contentious assumption in the field of speech perception, especially infant speech perception, but one that is justified experimentally and biologically in my view (e.g., Eimas & Miller, 1991, 1992; Liberman & Mattingly, 1985; Mattingly & Liberman, 1990). By this I simply mean that the categorical representations of speech are a part of the mental structures that form a human language (cf. Liberman & Mattingly, 1985; Mattingly & Liberman, 1990). I also take the initial representations to be syllabic in structure - a decision for which there would appear to be greater empirical evidence (Jusczyk, 1993, in press; Mehler, 1981) than for segmental or featural representations (e.g., Hillenbrand, 1984). Now if this is indeed the case, then at some point in development these representations must change from being syllabic to being segmental and featural as well as syllabic, as is required for phonological knowledge (e.g., Kenstowicz & Kisseberth, 1979). Given these assumptions, a major issue to be considered is whether there is continuity between the initial categorical representations for speech and later phonological representations. A second issue is the basis for development. There would appear to be two views that are quite different with respect both to the nature of the relation between initial and mature representations and their presumed means of transformation. The first, articulated by Studdert-Kennedy (1991), takes the initial categorizations of infants to be little more than general, prelinguistic auditory constraints that must exist if developing systems of perception and production of speech are ultimately to act in concert.
They are, furthermore, not the forerunners of the phonetic categories of human languages; indeed, there is little relation between the two. The mature representations of human phonologies emerge from representations of larger linguistic units, words or word-like structures, and then only when the child begins to use language productively to convey meaning - a noncontinuity view of development of representations and the means by which they develop. This form of development is, according to Studdert-Kennedy, in agreement with a general principle of evolution that he views as being applicable to ontogeny, namely, "that complex structures evolve by differentiation of smaller structures from larger. Accordingly... we should expect phonemes to emerge from words" (Studdert-Kennedy, 1991, p. 12). The beginnings of an alternative, more complex, view of development with respect to issues of continuity and noncontinuity can be found in my earlier writings. I assumed in Eimas (1975), for example, that initial categories of speech were segmental and unchanging. More recently, however, having been convinced by the prevailing weight of evidence (e.g., Mehler, 1981)
and by the arguments of Jusczyk (e.g., Jusczyk, 1993), I have come to believe that syllabic representations provide the better starting position (Eimas et al., 1987). These initial holistic representations exist in a form from which segmental and featural representations may eventually be abstracted or differentiated (cf. Gibson, 1969) - representational forms that in my opinion differ in kind from syllabic units. However, it should be remembered that the final representations are not only segmental and featural, they are also syllabic. The mature syllabic structures are undoubtedly more complex, and more varied, than the original representations (cf. Eimas et al., 1987). Nonetheless, this aspect of development is one that is continuous in nature - there is a change in the complexity of syllabic representations, but not in the kind of unit, as is the case for segments and features. As noted, I continue to believe that the initial mental structures for speech are linguistic and, if true, it results in a situation, I would argue, that is cognitively more economical than nonlinguistic representations of speech - parity between perception and production is immediately given (cf. Liberman & Mattingly, 1985; Mattingly & Liberman, 1990). This is to say, the representations supporting perception and production have at least a common core and thus the relation between the two need not be a part of the early stages of language acquisition. As a result this aspect of phonological development is continuous. Differentiation is possibly aided later by the processes of speech production that are involved in later meaningful communication, but I would argue with Jusczyk that differentiation is more a consequence of the representation and encoding of similar syllables in nearby positions than of production. 
Of course, having a process of differentiation as a necessary means for the emergence of phonetic categories may be construed as a special process (over and above that which is involved in categorization per se) designed to further development, although it need not be, given the prevalence of differentiation in development (Gibson, 1969). A major point to be taken from this latter section is that acquiring the categories of a mature phonology, the syllables, segments and features, would seem to be indicants of both continuous and discontinuous forms of development. The process of featural and segmental differentiation may signify a discontinuity, that is, a special developmentally related process that transforms initial representations. Moreover, other processes, more probably discontinuous in nature, may be involved if the differentiation of segments and features is also a consequence of production. Similarly, continuity does not exist between the form of the initial and final representations of speech with respect to segments and features, although it does exist for syllables. There is further continuity if the initial and later representations of speech are both linguistic in nature. Finally, there is the apparent constancy across development in the processes of speech perception. Thus the picture for phonological development is quite different from
that for conceptual development. Moreover, there is no simple way to characterize the nature of phonological development, as may well be true for many of the varied facets of perception, cognition, and language. Attempts to understand and describe our intellectual and linguistic origins and their development thus find further justification for domain-specific theories.
References
Behl-Chadha, G. (1993). Perceptually driven formation of superordinate-like categorical structures in early infancy. Unpublished doctoral dissertation, Brown University.
Bertenthal, B.I., Proffitt, D.R., & Cutting, J.E. (1984). Infant sensitivity to figural coherence in biomechanical motions. Journal of Experimental Child Psychology, 37, 213-230.
Bever, T.G. (1982). Regressions in mental development: Basic phenomena and theories. Hillsdale, NJ: Erlbaum.
Bruner, J.S., Olver, R.R., & Greenfield, P.M. (1966). Studies in cognitive growth. New York: Wiley.
Edelman, G.M. (1987). Neural Darwinism. New York: Basic Books.
Eimas, P.D. (1970). Attentional processes. In H.W. Reese & L.P. Lipsitt (Eds.), Experimental child psychology (pp. 279-310). New York: Academic Press.
Eimas, P.D. (1975). Speech perception in early infancy. In L.B. Cohen & P. Salapatek (Eds.), Infant perception (Vol. 2, pp. 193-231). New York: Academic Press.
Eimas, P.D., & Miller, J.L. (1980). Contextual effects in infant speech perception. Science, 209, 1140-1141.
Eimas, P.D., & Miller, J.L. (1991). A constraint on the perception of speech by young infants. Language and Speech, 34, 251-263.
Eimas, P.D., & Miller, J.L. (1992). Organization in the perception of speech by young infants. Psychological Science, 3, 340-345.
Eimas, P.D., Miller, J.L., & Jusczyk, P.W. (1987). On infant speech perception and the acquisition of language. In S. Harnad (Ed.), Categorical perception: The groundwork of cognition (pp. 161-195). New York: Cambridge University Press.
Eimas, P.D., & Quinn, P.C. (in press).
Studies on the formation of basic-level categories in young infants. Child Development.
Eimas, P.D., Quinn, P.C., & Cowan, P. (submitted). Development of categorical exclusivity in perceptually based categories of young infants.
Eimas, P.D., Siqueland, E.R., Jusczyk, P., & Vigorito, J. (1971). Speech perception in infants. Science, 171, 303-306.
Fitch, H.L., Halwes, T., Erickson, D.M., & Liberman, A.M. (1980). Perceptual equivalence of two acoustic cues for stop-consonant manner. Perception and Psychophysics, 27, 343-350.
Gibson, E.J. (1969). Principles of perceptual learning and development. New York: Appleton-Century-Crofts.
Hillenbrand, J. (1984). Speech perception by infants: Categorization based on nasal consonant place of articulation. Journal of the Acoustical Society of America, 75, 1613-1622.
Jones, S.S., & Smith, L.B. (1993). The place of perception in children's concepts. Cognitive Development, 8, 113-139.
Jusczyk, P.W. (1993). From general to language-specific capacities: The WRAPSA Model of how speech perception develops. Journal of Phonetics, 21, 3-28.
Jusczyk, P.W. (in press). Language acquisition: Speech sounds and the beginnings of phonology. In J.L. Miller & P.D. Eimas (Eds.), Handbook of perception and cognition, Vol. 11: Speech, language, and communication. Orlando, FL: Academic Press.
Karmiloff-Smith, A. (1992). Beyond modularity. Cambridge, MA: MIT Press.
Kendler, H.H., & Kendler, T.S. (1962). Vertical and horizontal processes in problem solving. Psychological Review, 69, 1-16.
Kenstowicz, M., & Kisseberth, C. (1979). Generative phonology. New York: Academic Press.
Kuhl, P.K. (1979). Speech perception in early infancy: Perceptual constancy for spectrally dissimilar vowel categories. Journal of the Acoustical Society of America, 66, 1668-1679.
Kuhl, P.K. (1983). Perception of auditory equivalence classes for speech in early infancy. Infant Behavior and Development, 6, 263-285.
Levitt, A., Jusczyk, P.W., Murray, J., & Carden, G. (1988). Context effects in two-month-old infants' perception of labiodental/interdental fricative contrasts. Journal of Experimental Psychology: Human Perception and Performance, 14, 361-368.
Liberman, A.M., & Mattingly, I.G. (1985). The motor theory of speech perception revised. Cognition, 21, 1-36.
Mandler, J.M. (1992). How to build a baby: II. Conceptual primitives. Psychological Review, 99, 587-604.
Mattingly, I.G., & Liberman, A.M. (1990). Speech and other auditory modules. In G.M. Edelman, W.E. Gall, & W.M. Cowan (Eds.), Signal and sense: Local and global order in perceptual maps (pp. 501-520). New York: Wiley-Liss.
Mehler, J. (1981). The role of syllables in speech processing: Infant and adult data. Philosophical Transactions of the Royal Society of London, B295, 333-352.
Miller, J.L., & Eimas, P.D. (1983). Studies on the categorization of speech by infants. Cognition, 13, 135-165.
Murphy, G.L., & Medin, D.L. (1985). The role of theories in conceptual coherence. Psychological Review, 92, 289-316.
Piaget, J. (1952). The origins of intelligence in children. New York: International University Press.
Piaget, J., & Inhelder, B. (1964). The early growth of logic. London: Routledge & Kegan Paul.
Quinn, P.C., & Bomba, P.C. (1986). Evidence for a general category of oblique orientations in four-month-old infants. Journal of Experimental Child Psychology, 42, 345-354.
Quinn, P.C., & Eimas, P.D. (1986). On categorization in early infancy. Merrill-Palmer Quarterly, 32, 331-363.
Quinn, P.C, Eimas, P.D., & Rosenkrantz, S.L. (1993). Evidence for representations of perceptually similar natural categories by 3-month old and 4-month old infants. Perception, 22, 463-475. Spelke, E.S., Breinlinger, K. Macomber, J., & Jacobson, K. (1992). Origins of knowledge. Psychological Review, 99, 605-632. Studdert-Kennedy, M. (1991). Language development from an evolutionary perspective. In N. Krasnegor, D. Rumbaugh, R. Schiefelbusch, & M. Studdert-Kennedy (Eds.), Language acquisition: Biological and behavioral determinants (pp. 3-28). Hillsdale, NJ: Erlbaum. Vygotsky, L. (1934/1962). Thought and language (transl. E. Hanfmann & G. Vakar). Cambridge, MA: MIT Press. Xu, F., & Carey, S. (1993). Infants' metaphysics: The case of numerical identity. MIT Center for Cognitive Science, Occasional Paper (#48), Cambridge, MA.
16 Do speakers have access to a mental syllabary?

Willem J.M. Levelt*, Linda Wheeldon
Max Planck Institute for Psycholinguistics, Wundtlaan 1, 6525 XD Nijmegen, The Netherlands
Abstract

The first, theoretical part of this paper sketches a framework for phonological encoding in which the speaker successively generates phonological syllables in connected speech. The final stage of this process, phonetic encoding, consists of accessing articulatory gestural scores for each of these syllables in a "mental syllabary". The second, experimental part studies various predictions derived from this theory. The main finding is a syllable frequency effect: words ending in a high-frequent syllable are named faster than words ending in a low-frequent syllable. As predicted, this syllable frequency effect is independent of and additive to the effect of word frequency on naming latency. The effect, moreover, is not due to the complexity of the word-final syllable. In the General Discussion, the syllabary model is further elaborated with respect to phonological underspecification and activation spreading. Alternative accounts of the empirical findings in terms of core syllables and demisyllables are considered.
*Corresponding author. E-mail: [email protected]. The authors wish to thank Ger Desserjer for his assistance with the running and analysis of all of the experiments reported.

Introduction

The purpose of the present paper is to provide evidence for the notion that speakers have access to a mental syllabary, a repository of articulatory-phonetic syllable programs. The notion of stored syllable programs originates with Crompton (1982) and was further elaborated in Levelt (1989, 1992, 1993). The latter two papers introduced the terms "phonetic" and "mental syllabary" for this hypothetical mental store. Most current theories of speech production model the pre-articulatory form representation at a phonological level as consisting of
discrete segments or features (Dell, 1988; Shattuck-Hufnagel, 1979), and some models assume explicitly that this level of representation directly activates articulatory routines (MacKay, 1982). However, the actual phonetic realization of a phonological feature is determined by the context in which it is spoken. The fact that phonetic context effects can differ across languages means that they cannot all be due to the implementation of universal phonetic rules but must form part of a language-dependent phonetic representation (Keating, 1988). The mental syllabary was postulated as a mechanism for translating an abstract phonological representation of an utterance into a context-dependent phonetic representation that is detailed enough to guide articulation.

The present paper will provide experimental evidence that is consistent with the existence of a mental syllabary and that challenges theories which assume (tacitly or otherwise) that the phonetic forms of all syllables are generated anew each time they are produced. We present here a more detailed model of syllable retrieval processes than has previously been attempted, and while we readily admit that much further evidence is required in order to substantiate it, we propose it as a productive framework for the generation of empirical research questions and as a clear target for further empirical investigation.

In the following we will first discuss some of the theoretical reasons for assuming the existence of a syllabary in the speaker's mind. We will then sketch a provisional framework for the speaker's encoding of phonological words, a framework that incorporates access to a phonetic syllabary. This theoretical section will be followed by an empirical one in which we present the results of four experiments that address some of the temporal consequences of a speaker's retrieving stored syllable programs during the ultimate phase of phonological encoding.
In the final discussion section, we will return to a range of further theoretical issues that are worth considering, given the notion of a mental phonetic syllabary.
The syllabary in a theory of phonological encoding

Crompton's suggestion

As with so many notions in theories of speech production, the idea that a speaker retrieves whole phonetic syllable programs was originally proposed to account for the occurrence of particular speech errors. Crompton (1982) suggested the existence of a library of syllable-size articulatory routines to account for speech errors involving phonemes and syllable constituents. For example, an error like guinea hig pair (for guinea pig hair) arises when the mechanism of addressing syllable routines goes awry. The articulatory syllables [pɪg] and [hɛər] in the library are addressed via sets of phonemic search instructions such as:
onset = p        onset = h
nucleus = ɪ      nucleus = ɛə
coda = g         coda = r
If these search instructions get mixed up, leading to the exchange of the onset conditions, then instructions arise for the retrieval of two quite different articulatory syllables, namely [hɪg] and [pɛər]. This provides an elegant account of the phonetic "accommodation" that takes place: [hɪg] is pronounced with the correct allophone of [h], not with the one that would have been the realization of [h] in hair. This addressing mechanism, Crompton argues, is fully compatible with Shattuck-Hufnagel's (1979) scan copier mechanism of phonological encoding. According to that model, a word's phonological segments are spelled out from the word's lexical representation in memory, and inserted one by one into the slots of a syllabic frame for the word that is independently retrieved from the word's lexemic representation. This copier mechanism in fact specifies the search instructions for each of a word's successive syllables (i.e., onset of syllable 1, nucleus of syllable 1, coda of syllable 1, onset of syllable 2, etc.).

A functional paradox

However, in the same paper Crompton reminds us of a paradox, earlier formulated by Shattuck-Hufnagel (1979, p. 338), but not solved by either of them: "perhaps its [the scan copier's] most puzzling aspect is the question of why a mechanism is proposed for the one-at-a-time serial ordering of phonemes when their order is already specified in the lexicon". Levelt (1992) formulated this functional paradox as follows: Why would a speaker go through the trouble of first generating an empty skeleton for the word, and then filling it with segments? In some way or another both must proceed from a stored phonological representation, the word's phonological code in the lexicon. Isn't it wasteful of processing resources to pull these apart first, and then to combine them again (at the risk of creating a slip)?
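Crompton's addressing scheme amounts to a keyed table lookup, which makes it easy to see how an onset exchange yields the error. The following toy sketch is our illustration, not code from any of the papers cited; the mini-syllabary entries and the `retrieve` helper are invented for the example.

```python
# Toy sketch of Crompton's (1982) library of articulatory routines:
# syllable programs are addressed by phonemic search instructions
# (onset, nucleus, coda). Entries are invented for illustration.
syllabary = {
    ("p", "ɪ", "g"): "[pɪg]",    # routine for the syllable of "pig"
    ("h", "ɛə", "r"): "[hɛər]",  # routine for the syllable of "hair"
    ("h", "ɪ", "g"): "[hɪg]",    # routines for the error outcomes
    ("p", "ɛə", "r"): "[pɛər]",  # also exist in the library
}

def retrieve(onset, nucleus, coda):
    """Address the library via a set of phonemic search instructions."""
    return syllabary[(onset, nucleus, coda)]

# Intended retrievals for "... pig hair":
intended = (retrieve("p", "ɪ", "g"), retrieve("h", "ɛə", "r"))

# Exchange error: the onset conditions are swapped before lookup, so
# two quite different stored syllables are retrieved ("guinea hig pair"):
error = (retrieve("h", "ɪ", "g"), retrieve("p", "ɛə", "r"))
```

Because whole stored routines are retrieved, the error syllables come out with their own, correctly "accommodated" phonetic detail.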
And (following Levelt, 1989) he argued that the solution of the paradox should be sought in the generation of connected speech. In connected speech it is the exception rather than the rule that a word's canonical syllable skeleton is identical to the frame that will be filled. Instead, new frames are composed, not for lexical words (i.e., for words in their citation form), but for phonological words, which often involve more than a single lexical word. It is only at this level that syllabification takes place, not at any earlier "citation form" level. Let us now outline this framework in more detail (see Fig. 1).

Figure 1. A framework for phonological encoding. (The figure shows the successive stages word form retrieval, metrical spellout and segmental spellout, phonological word formation, segment-to-frame association, and retrieval of syllabic gestural scores [di-man-dit], feeding the articulatory network.)

Word form retrieval

A first step in phonological encoding is the activation of a selected word's
"lexeme" - the word's form information in the mental lexicon. In Fig. 1 this is exemplified for two words, demand and it, as they could appear in an utterance such as police demand it. Although terminologies differ, all theories of phonological encoding, among them Meringer and Mayer (1895), Shattuck-Hufnagel (1979), Dell (1988) and Levelt (1989), distinguish between two kinds of form information: a word's segmental and its metrical form. The segmental information relates to the word's phonemic structure: its composition of consonants, consonant clusters, vowels, diphthongs, glides, etc. Theories differ with respect to the degree of specification, ranging from minimal or underspecification (Stemberger, 1983) to full phonemic specification (Crompton, 1982), and with respect to the degree of linear ordering of segmental
information. Without prejudging these issues (but see Discussion below), we have represented segments in Fig. 1 by their IPA labels and as consonantal or vocalic (C or V). The metrical information is what Shattuck-Hufnagel (1979) called the word's "frame". It specifies at least the word's number of syllables (its "syllabicity") and its accent structure, that is, the lexical stress levels of successive syllables. Other metrical aspects represented in various theories are: onset versus rest of word (Shattuck-Hufnagel, 1992), the precise CV structure of syllables (Dell, 1988), the degree of reduction of syllables (Crompton, 1982) and (closely related) whether syllables are strong or weak (Levelt, 1993). Our representation in Fig. 1 follows Hayes (1989) in so far as a syllable's weight is represented in a moraic notation (one mora for a light syllable, two morae for a heavy one). This is not critical, though; weight could also be represented by branching (vs. not branching) the nucleus. But the mora representation simplifies the formulation of the association rules (see below). The representation of accent structure in Fig. 1 is no more than a primitive "stressed" (with ') versus unstressed (without ').

There is also general agreement that metrical information is, to some extent, independently retrieved. This is sometimes phenomenologically apparent when we are in a "tip-of-the-tongue" state, where we fail to retrieve an intended word, but feel pretty sure about its syllabicity and accent structure. This relative independence of segmental and metrical retrieval is depicted in Fig. 1 as two mechanisms: "segmental spellout" and "metrical spellout" (see Levelt, 1989, 1993, for more details). An important aspect of form retrieval, which will play an essential role in the experimental part of this paper, is that it is frequency sensitive.
Jescheniak and Levelt (in press) have shown that the word frequency effect in picture naming (naming latency is longer for pictures with a low-frequency name than for pictures with a high-frequency name) is entirely due to accessing the lexeme, that is, the word's form information.

Phonological word formation

A central issue for all theories of phonological encoding is how segments become associated to metrical frames. All classical theories, however, have restricted this issue to the phonological encoding of single words. When generating connected speech, speakers do not concatenate citation forms of words, but create rhythmic, pronounceable metrical structures that largely ignore lexical word boundaries. Phonologists call this the "prosodic hierarchy" (see, for instance, Nespor & Vogel, 1986). Relevant here is the level of phonological words (or clitic groups). In the utterance police demand it, the unstressed function word it cliticizes to the head word demand, resulting in the phonological word demandit. Of crucial importance here is that phonological words, not lexical
words, are the domain of syllabification. The phonological word demandit is syllabified as de-man-dit, where the last syllable straddles a lexical boundary. Linguists call this "resyllabification", but in a processing model this term is misleading. It presupposes that there was lexical syllabification to start with (i.e., de-mand + it). There is, in fact, good reason to assume that a word's syllables are not fully specified in the word form lexicon. If they were, they would regularly be broken up in connected speech. That is not only wasteful, but it also predicts the occurrence of syllabification speech errors such as de-mand-it. Such errors have never been reported to occur in fluent connected speech. In short, there must be a mechanism in phonological encoding that creates metrical frames for phonological words. This is depicted in Fig. 1 as "phonological word formation". Notice that this is an entirely metrical process. There are no known segmental conditions on the formation of phonological words (such as "a word beginning with segment y cannot cliticize to a word ending on segment x"). The conditions are syntactic and metrical. Essentially (and leaving details aside), a phonological word frame is created by blending the frames of its constituent words, as depicted in Fig. 1.

Segment-to-frame association

The next step in phonological encoding, then, is the association of spelled-out segments to the metrical frame of the corresponding phonological word. There is good evidence that this process runs "from left to right" (Dell, 1988; Meyer, 1990, 1991; Meyer & Schriefers, 1991; Shattuck-Hufnagel, 1979). But the mechanisms proposed still vary substantially. Whatever the mechanisms, however, they must adhere to a language's rules of syllabification. Levelt (1992) presented the following set of association rules for English, without any claim to completeness:

(1) A vowel only associates to μ, a diphthong to μμ.
(2) The default association of a consonant is to σ.
A consonant associates to μ if and only if any of the following conditions hold: (a) the next element is lower in sonority; (b) there is no σ to associate to; (c) associating to σ would leave a μ without an associated element.

In addition, there is a general convention that association to σ, the syllable node, can only occur on the left-hand side of the syllable, that is, to the left of any unfilled morae of that syllable. See Levelt (1992) for a motivation of these rules. On the assumption that spelled-out segments are ordered, and that association proceeds "from left to right", a phonological word's syllabification is created "on the fly" when these rules are followed. The reader can easily verify that for
demandit the syllabification becomes de-man-dit, where the last syllable straddles the lexical boundary.

It should be noticed that this is not an account of the mechanism of segment-to-frame association. It is doubtless possible to adapt Shattuck-Hufnagel's (1979) scan-copier mechanism or Dell's (1988) network model to produce the left-to-right association proposed here. The adaptations will mainly concern (i) the generation of phonological, not lexical, word frames, and (ii) the use of more global syllable frames, that is, frames only specified for weight, not for individual segmental slots.

Accessing the syllabary

The final step of phonological encoding (which is sometimes called phonetic encoding) is to compute or access the articulatory gestures that will realize a phonological word's syllables. It is at this point that the notion of a mental syllabary enters the picture. But before turning to that, we should first say a few words about what it is that has to be accessed or computed. We suggest that it is what Browman and Goldstein (1991) have called gestural scores. Gestural scores are, like choreographic or musical scores, specifications of tasks to be performed. Since there are five subsystems in articulation that can be independently controlled, a gestural score involves five "tiers". They are the glottal and the velar system, plus three tiers in the oral system: tongue body, tongue tip and lips. An example of a gestural task is to close the lips, as in the articulation of apple. The gestural score only specifies that the lips should be closed, but not how this should be done. The speaker can move the jaw, the lower lip, both lips, or all of these articulators to different degrees. But not every solution is equally good.
As Saltzman and Kelso (1987) have shown, there are least-effort solutions that take into account which other tasks are to be performed, what the prevailing physical conditions of the articulatory system are (does the speaker have a pipe in his mouth that wipes out jaw movement?), etc. These computations are done by what they called an "articulatory network" - a coordinative motor system that involves feedback from the articulators. Relevant here is that gestural scores are abstract. They specify the tasks to be performed, not the motor patterns to be executed. The gestural score for a phonological word involves scores for each of its syllables. The issue here is: how does a speaker generate these scores? There may well be a direct route here, as Browman and Goldstein have convincingly argued. A syllable's phonological specifications are, to some extent, already specifications of the gestural tasks that should be carried out in order to realize the syllable. One can present a reader with a phonotactically legal non-word that consists of non-existing syllables (such as fliltirp), and the reader will pronounce it all right. Still, there may be another route as well. After all, most syllables that a speaker uses are highly overlearned articulatory gestures. It has been argued time
and again that most (though not all) phenomena of allophonic variation, of coarticulation and of assimilation have the syllable as their domain (see, for instance, Fujimura & Lovins, 1978; Lindblom, 1983). In other words, if you know the syllable and its stress level, you know how to pronounce its segments. Or rather: phonetic segments have no independent existence; they are mere properties of a syllabic gesture, its onset, nucleus and offset. If these syllabic scores are overlearned, it is only natural to suppose that they are accessible as such, that is, that we have a store of syllabic gestures for syllables that are regularly used in speech. This is depicted in Fig. 1 as the syllabary.

According to this theory, the syllabary is a finite set of pairs consisting of, on the one hand, a phonological syllable specification and, on the other hand, a syllabic gestural score. The phonological specification is the input address; the gestural score is the output. As phonological syllables are, one by one, created during the association process, each will activate its gestural score in the syllabary. That score will be the input to the "articulatory network" (see above), which controls motor execution of the gesture. Crompton (1982) made the suggestion that articulatory routines for stressed and unstressed syllables are independently represented in the repository, and this was adopted in Levelt (1989). It should be noticed that the size of the syllabary will differ rather drastically between languages, ranging from a few hundred in Chinese or Japanese to several thousand in English or Dutch.

So much for the theoretical framework. It is obvious that many theoretical issues have not (yet) been raised. It is, in particular, not the intention of the present paper to go into much more detail about the initial processes of phonological encoding: segmental and metrical spellout, phonological word formation and segment-to-frame association.
We will, rather, focus on the final step in the theory: accessing the syllabary. It is important to notice that this step has a certain theoretical independence. Most theories of phonological encoding are not specific about phonetic encoding, and many of them would be compatible with the notion of a syllabary. Still, as will be taken up in the General Discussion, the syllabary theory may have interesting consequences for an underspecification approach to phonological encoding. It may provide an independent means of determining which segmental features should minimally be specified in the form lexicon. The following four experiments were inspired by the notion of a syllabary. Their results are compatible with that notion, but alternative explanations are by no means excluded. Still, they provide new evidence about the time course of phonetic encoding that has not been predicted by other theories.
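As an illustration of how left-to-right association yields syllabification "on the fly", here is a deliberately simplified sketch. It is our own toy code, not the authors' model: the `VOWELS` inventory is minimal, and a "keep one consonant for the next onset" heuristic stands in for the sonority and mora conditions of the rules given earlier.

```python
# Toy left-to-right syllabifier for a phonological word: one syllable
# per vowel; of each intervocalic consonant cluster, the last consonant
# starts the next syllable and any earlier consonants close the
# preceding one. A crude stand-in for the moraic association rules.
VOWELS = set("aeiouə")  # minimal toy inventory; IPA detail omitted

def syllabify(segments):
    """Return the syllables of a phonological word, left to right."""
    vowel_positions = [i for i, s in enumerate(segments) if s in VOWELS]
    syllables, start = [], 0
    for v, nxt in zip(vowel_positions, vowel_positions[1:]):
        # keep exactly one consonant (if any) as the next syllable's onset
        boundary = nxt - 1 if nxt - 1 > v else nxt
        syllables.append(segments[start:boundary])
        start = boundary
    syllables.append(segments[start:])
    return syllables

# The phonological word formed from "demand" + "it" in "police demand it":
print(syllabify("dəmandit"))  # -> ['də', 'man', 'dit']
```

Note that the last syllable, dit, straddles the lexical boundary, exactly as in the text's de-man-dit example.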
EXPERIMENT 1: WORD AND SYLLABLE FREQUENCY

According to the theory outlined above, there are two steps in phonological encoding where the speaker accesses stored information. The first one is in
retrieving word form information, that is, the lexeme. The second one is in retrieving the syllabic gestural score. The former involves the form part of the mental lexicon, the latter the syllabary. We have modelled these two steps as successive and independent.

It has long been known that word form access is sensitive to word frequency. Oldfield and Wingfield (1965) and Wingfield (1968) first showed that naming latencies for pictures with low-frequency names are substantially longer than latencies for pictures with high-frequency names. The effect is, moreover, not due to the process of recognizing the picture; it is a genuinely lexical one. Jescheniak and Levelt (in press) have further localized the effect in word form access. Accessing a low-frequent homophone (such as wee) turned out to be as fast as accessing non-homophone controls that are matched for frequency to the corresponding high-frequent homophone (in this case, we). Since homophones, by definition, share their word form information, but not their semantic/syntactic properties, the frequency effect must have a form-level locus: the low-frequent homophone inherits the form-accessing advantage of its high-frequent twin. It is, however, enough for the rationale of the experiment to know that there is a genuinely lexical frequency effect in word retrieval, and to assume that accessing the syllabary is a later and independent step in phonological encoding. Similar to word retrieval, accessing the store of syllables might also involve a frequency effect: accessing a syllable that is frequently used in the language may well be faster than accessing a syllable that is less frequently used.

The experiment was designed to look for an effect on word production latency of the frequency of occurrence of a word's constituent syllables. High- and low-frequency bisyllabic words were tested which comprised either two high-frequency syllables or two low-frequency syllables.
Whole-word frequency of occurrence was therefore crossed with syllable frequency, allowing us to test for any interaction. The syllabary theory predicts that the effects should be additive and independent.
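The additivity prediction follows directly from the two-stage architecture: if naming latency is the sum of an earlier lexeme-access stage (sensitive to word frequency) and a later syllabary-access stage (sensitive to syllable frequency), the 2 x 2 design should show two main effects and a zero interaction. A minimal numerical sketch, with invented stage durations:

```python
# Why two successive, independent stages predict additivity: latency is
# modelled as lexeme-access time plus syllabary-access time. All numbers
# here are invented for illustration, not data from the experiment.
lexeme_ms = {"HF_word": 580, "LF_word": 595}   # hypothetical stage durations
syllabary_ms = {"HF_syll": 12, "LF_syll": 27}

cells = {(w, s): lexeme_ms[w] + syllabary_ms[s]
         for w in lexeme_ms for s in syllabary_ms}

# Interaction contrast of the 2 x 2 cell means:
interaction = ((cells[("LF_word", "LF_syll")] - cells[("LF_word", "HF_syll")])
               - (cells[("HF_word", "LF_syll")] - cells[("HF_word", "HF_syll")]))
# Under strict additivity the contrast is exactly zero.
```

Any obligatory interaction between the two factors would instead point to a single, shared processing stage.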
Method

In the following experiments the linguistic restrictions on the selection of experimental materials were severe. It is, in particular, impossible to obtain the relevant naming latencies by means of a picture-naming experiment; there are simply not enough depictable target words in the language. We therefore designed another kind of naming task that would put minimal restrictions on the words we could test. In the preparation phase of the experiment, subjects learned to associate each of a small number of target words with an arbitrary symbol. During the experiments, these symbols were presented on the screen and the subjects produced the corresponding target words; their naming latencies were measured.
Notice that we decided against a word-reading task, which would always involve linguistic processing of the input word.
Frequency counts

All frequency counts were obtained from the computer database CELEX¹, which has a Dutch lexicon based on 42 million word tokens. The word frequency counts we used are two occurrences-per-million counts from this database: word form frequency, which includes every occurrence of that particular form, and lemma frequency, which includes the frequencies of all word forms with the same stem. Syllable frequencies were counted for phonetic syllables in Dutch. The phonetic script differentiates the reduced vowel schwa from full vowel forms, giving approximately 12,000 individual syllable forms. Syllable frequencies were calculated for the database from the word form occurrences-per-million count. Two syllable frequency counts were calculated: overall frequency of occurrence and the frequency of occurrence of the syllable in a particular word position (i.e., first or second syllable position). The syllable frequencies range from 0 to approximately 90,000 per million words, with a mean frequency of 121.

In all of the experiments reported the same criteria were used in assigning words to frequency conditions. All low-frequency words had a count of less than 10 for both word form and lemma counts. All high-frequency words had both counts over 10. Low-frequency syllables had counts of less than 300 in both overall and position-dependent counts; high-frequency syllables had both counts over 300. Most low-frequency syllables, therefore, had above-average frequency of occurrence in the language. This is important, as our model claims that very low-frequent syllables will be constructed on-line rather than retrieved from store. We are aware of the fact that we have been counting citation form syllables, not syllables as they occur in connected speech. But if the latter frequency distribution deviates from the one we used, this will most likely work against our hypothesis; our distinct HF and LF syllable classes will tend to be blurred in the "real" distribution.
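The two syllable-frequency counts can be sketched as follows. The corpus entries below are invented stand-ins, not CELEX values; the point is only the distinction between the overall and the position-dependent count.

```python
# Sketch of the two syllable-frequency counts: overall frequency and
# frequency in a particular word position, accumulated over syllabified
# word forms weighted by their occurrences per million. Data invented.
from collections import Counter

corpus = [
    # (syllables of the word form, occurrences per million) -- hypothetical
    (("də", "man"), 120.0),
    (("man", "də"), 3.5),
    (("dit",), 40.0),
]

overall = Counter()       # keyed by syllable
by_position = Counter()   # keyed by (position, syllable), positions from 1

for syllables, freq_pm in corpus:
    for pos, syl in enumerate(syllables, start=1):
        overall[syl] += freq_pm
        by_position[(pos, syl)] += freq_pm

# e.g. "man" overall vs. "man" restricted to second-syllable position:
print(overall["man"], by_position[(2, "man")])
```

With real CELEX data, both counts for each syllable would then be compared against the high/low cut-offs described above.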
Vocabulary

The experimental vocabulary comprised four groups of 16 bisyllabic Dutch words. These groups differed in the combination of whole-word frequency and the frequency of their constituent syllables. Average frequencies for each word group are given in Table 1. Each group contained 13 nouns and three adjectives. Groups

¹The Centre for Lexical Information (CELEX), Max Planck Institute, The Netherlands.
Table 1. Log syllable and word frequencies and number of phonemes of words in each of the Word x Syllable frequency groups of Experiment 1

  Log syllable frequency:             High   Low    High   Low
  Log word frequency:                 High   High   Low    Low

  Log word frequency
    Word form                         3.3    3.2    0.4    0.3
    Lemma                             3.6    3.6    0.6    0.4
  Log syllable frequency
    1st syllable, position dependent  7.4    4.7    7.5    4.2
    1st syllable, total               7.9    5.0    8.1    4.6
    2nd syllable, position dependent  7.3    4.1    7.3    3.3
    2nd syllable, total               8.2    4.1    8.0    3.6
  Number of phonemes                  5      6      5      6
were also matched for word onset phonemes and mean number of phonemes. Each group was divided into four matched subgroups which were recombined into four experimental vocabularies of 16 words (four from each condition; see Appendix 1). Within each vocabulary four groups of four words (one from each condition) were selected to be elicited in the same block. These groups contained words which were phonologically and semantically unrelated and each group contained at least one word with second syllable stress.
Symbols

Four groups of four symbol strings were constructed. Each symbol consisted of a string of six non-alphabetic characters. The four groups of symbols were roughly matched for gross characteristics as follows:

Set 1: ))))))  %%%%%%  >>>>>>
Set 2: \\\\\\  &&&&&&
Set 3: }}}}}}  ######
Set 4: [[[[[[  @@@@@@  ^^^^^^
Design

Subjects were assigned to one of the vocabularies. Their task was to learn to produce words in response to symbols. Subjects learned one block of four words at a time. The experiment consisted of 12 blocks of 24 naming trials: three blocks for each four-word set. Within a block subjects produced each word six times. The first production of each word in a block was a practice trial. Order of presentation
was random, with the condition that no symbol occurred twice in a row. This condition was included in order to eliminate the potentially large facilitation effect due to immediate repetition and to encourage subjects to clear their minds at the end of each trial. Within a vocabulary the order of presentation of block groups was rotated across subjects. Within a vocabulary each block group was assigned a symbol set. The assignment of symbols to words within sets was also rotated across subjects.
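The "no symbol twice in a row" constraint can be implemented by rejection sampling over within-block shuffles. A sketch (our illustration, not the original presentation software; the fixed seed is only for reproducibility):

```python
# Generate a within-block trial order in which no symbol occurs twice
# in a row, by reshuffling until the constraint is satisfied.
import random

def order_trials(symbols, repetitions, rng=random.Random(1)):
    """Return a constrained random order of symbols * repetitions trials."""
    trials = [s for s in symbols for _ in range(repetitions)]
    while True:
        rng.shuffle(trials)
        # accept only orders with no immediate repetition of a symbol
        if all(a != b for a, b in zip(trials, trials[1:])):
            return trials

# Four symbols, six productions each, as in one block of Experiment 1:
block = order_trials([")", "%", ">", "#"], repetitions=6)
```

Rejection sampling is wasteful but transparent; an incremental constrained shuffle would also do.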
Procedure

Subjects were tested individually. They were given a card on which four words with associated symbols were printed. They were asked to practise the relationship between the symbols and the words until they thought they could accurately produce the words in response to the symbols. When a subject was confident that they had learned the associations, they were shown each symbol once on the computer screen and asked to say the associated word. If they could do this correctly they then received three blocks of 24 trials.

The events on each trial were as follows. A fixation cross appeared on the screen for 300 ms. The screen then went blank for 500 ms, after which a symbol appeared on the screen and remained there for a further 500 ms. Subjects then had 2 s in which to respond, followed by a 3 s interval before the onset of the next trial. This procedure was repeated for all four groups of words. The printed order of the words from each frequency group was rotated across block groups. Both naming latencies and naming durations were recorded for each trial.
Subjects

Thirty-two subjects were tested, 24 women and 8 men. All were native speakers of Dutch. They were voluntary members of the Max Planck subject pool, between the ages of 18 and 34. They were paid for their participation.
Results

Exclusion of data

Data from two subjects were replaced due to high error rates. The first production of a word in each block was counted as a practice trial and excluded from the analysis. Correct naming latencies following error trials were also excluded from the latency analysis, as errors can often perturb subjects' responses on the
following trial. 3.3% of the data points were lost due to these criteria. Data points greater than two standard deviations from the mean were counted as outliers and were also excluded. This resulted in the loss of only 1.6% of the data points. Missing values in all experiments reported were substituted by a weighted mean based on subject and item statistics calculated following Winer (1971, p. 488).
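The two-standard-deviation trimming step can be sketched as follows. The latencies below are invented, and the paper does not state here over exactly which design cells the means were computed; the sketch trims relative to a single mean.

```python
# Trim latencies lying more than two standard deviations from the mean,
# as in the outlier criterion described above. Numbers are invented.
from statistics import mean, stdev

def trim_outliers(latencies, n_sd=2.0):
    """Drop values more than n_sd sample standard deviations from the mean."""
    m, sd = mean(latencies), stdev(latencies)
    return [x for x in latencies if abs(x - m) <= n_sd * sd]

latencies = [590, 600, 610, 605, 595, 1200]  # 1200 ms: a stray slow response
kept = trim_outliers(latencies)              # the 1200 ms value is excluded
```

In the actual analyses the excluded points were then replaced by weighted means based on subject and item statistics (Winer, 1971).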
Naming latency Collapsed across syllable frequency, high and low word frequency latencies were 592.0 ms and 607.2 ms respectively. The main effect of word frequency (15.2 ms) was significant, Fx(l, 28) = 14.9, / X . 0 0 1 , F 2 (l, 48) = 4.2, p < . 0 5 . Collapsed across word frequency, high and low syllable frequency latencies were 592.3 ms and 606.8 ms respectively. The main effect of syllable frequency (14.5 ms) was also significant, Fj(l, 28) = 17.7, p < . 0 0 1 , F 2 (l,48) = 3.8, p = .052. Mean naming latencies for words in each of the frequency groups are shown in Fig. 2. The size of the syllable frequency effect is similar in both word frequency groups and vice versa: the interaction of word and syllable frequency was insignificant, Fx and F2 < 1. There was a significant effect of vocabulary in the materials analysis, FY(3928) = 1.2, F2(3,48) = 7.8, p < .001, but no interactions of this variable with either syllable or word frequency. Effects of practice were evident in the significant decrease in naming latencies across the three blocks of a word group, Fx(2, 56) = 203.1, p < .001, F2(2, 96) = word onset latency in ms. oou
Figure 2. Naming latencies in Experiment 1. Syllable versus word frequency.
W. Levelt, L. Wheeldon
318.8, p < .001, and across the five repetitions of a word within a block, F1(4, 112) = 25.7, p < .001, F2(4, 192) = 23.8, p < .001. The effect of block did not interact with either word or syllable frequency effects (all Fs < 1). The effect of trial, however, showed an interaction with syllable frequency that approached significance by subjects, F1(4, 112) = 2.3, p < .06, F2(4, 192) = 1.5. However, this interaction was due to variation in the size of the priming effect over trials but not in the direction of the effect and does not qualify the main result.2
Percentage error rate
High and low word frequency error rates were 2.6% and 3.0% respectively. High and low syllable frequency error rates were 2.7% and 2.9% respectively. A similar analysis carried out on percentage error rate (arc sine transformed) yielded no significant effects.
Naming duration
A similar analysis was carried out on naming durations. High and low word frequency durations were 351.4 ms and 344.7 ms respectively. The 6.7 ms difference was significant over subjects, F1(1, 28) = 8.8, p < .01, F2 < 1. High and low syllable frequency durations were 326.8 ms and 369.3 ms respectively. The 42.5 ms difference was significant, F1(1, 28) = 253.7, p < .001, F2(1, 48) = 15.6, p < .001. Word and syllable frequency did not interact, F1 and F2 < 1.
Regression analyses
Regression analyses were carried out on the means data of the experimental words. In all regressions mean naming latency is the dependent variable. Simple regressions with both log word form frequency and log lemma frequency failed to reach significance (R = 0.2, p > .05). Of the syllable frequency counts only second syllable frequency counts yielded significant correlations: total log frequency (R = 0.3, p < .01) and position-dependent log frequency (R = 0.4, p < .001). Similarly, the number of phonemes in the second syllable and log second syllable CV structure frequency showed significant correlations with naming latency (both R = 0.3, p < .05). A multiple regression of naming latency with these three second syllable variables showed only a significant unique effect of log syllable frequency (p < .05). This pattern of results remained when only words with initial syllable stress were included in the regressions (n = 32).

2. Main effects of block and trial were observed in the analyses of all the dependent variables reported. These practice effects were always due to a decrease in naming latencies, durations and error rates as the experiment progressed. In no other analysis did they significantly interact with frequency effects and they will not be reported.
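The simple regressions above can be illustrated with a minimal correlation sketch; the item values below are invented for illustration and are not the experimental data:

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Invented items: log second syllable frequency and mean naming latency (ms).
log_syl2_freq = [8.7, 5.3, 8.9, 5.2, 7.8, 4.5]
latency_ms = [590, 612, 588, 615, 596, 618]

# Higher second syllable frequency should go with shorter latencies,
# i.e., a negative correlation, as in the reported regressions.
r = pearson_r(log_syl2_freq, latency_ms)
```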
Discussion
Apart from the expected word frequency effect, the experiment showed that there is a syllable frequency effect as well, amounting to about 15 ms. Bisyllabic words consisting of low-frequency syllables were consistently named more slowly than those consisting of high-frequency syllables. Moreover, this syllable frequency effect was independent of the word frequency effect, as predicted by the syllabary theory. The post hoc regression analyses suggest that second syllable frequency is a better predictor of naming latency than the frequency of the first syllable. Experiments 2 and 3 will explore this possibility in more detail. Not surprisingly, syllable complexity affected word durations, but there was also some evidence that complexity of the second syllable has an effect on naming latency. This issue will be taken up in Experiment 4.
EXPERIMENT 2: FIRST AND SECOND SYLLABLE FREQUENCY
There are theoretical reasons to expect that in bisyllabic word naming the frequency of the second syllable will affect naming latency more than the frequency of the first syllable. It is known that in picture naming bisyllabic target words are produced with longer naming latencies than monosyllabic target words. In a study by Klapp, Anderson, and Berrian (1973) the difference amounted to 14 ms. The effect cannot be due to response initiation, as the difference disappears in a delayed production task where subjects can prepare their response in advance of the "Go" signal to produce it. It must therefore have its origin in phonological encoding. Levelt (1989, p. 417) suggests that if in phonetic encoding syllable programs are addressed one by one, the encoding duration of a phonological word will be a function of its syllabicity. But the crucial point here is that, apparently, the speaker cannot or will not begin to articulate the word before its phonetic encoding is complete. If articulation were initiated following the phonetic encoding of the word's first syllable, no number-of-syllables effect should be found. Wheeldon and Lahiri (in preparation) provide further evidence that during the production of whole sentences articulation begins only when the first phonological word has been encoded. Making the same assumption for the present case - that is, that initiation of
articulation will wait till both syllables have been accessed in the syllabary - it is natural to expect a relatively strong second syllable effect. The association process (see Fig. 1) creates phonological syllables successively. Each new syllable triggers access to the syllabary and retrieval of the corresponding phonetic syllable. Although retrieving the first syllable will be relatively slow for a low-frequency syllable, that will not become apparent in the naming latency; the response can only be initiated after the second syllable is retrieved. Retrieving the second syllable is independent of retrieving the first one. It is initiated as soon as the second syllable appears as a phonological code, whether or not the first syllable's gestural score has been retrieved. And articulation is initiated as soon as the second syllable's gestural code is available. First syllable frequency will only have an effect if retrieval of the first syllable is completed only after retrieval of the second syllable. This, however, is a most unlikely state of affairs. Syllables are spoken at a rate of about one every 200 ms. Wheeldon and Levelt (1994) have shown that phonological syllables are generated at about twice that rate, one every 100 ms. Our syllable frequency effect, however, is of the order of only 15 ms. Hence it is implausible that phonetic encoding of the second syllable can "overtake" encoding of the first one due to advantageous frequency conditions. In this experiment we independently varied the frequency of the first and the second syllable in bisyllabic words. In one sub-experiment we did this for high-frequency words and in another one for low-frequency words.
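The "no overtaking" timing argument above can be made concrete with a few lines of arithmetic. The spell-out interval (~100 ms) and the frequency effect (~15 ms) come from the text; the base retrieval time is an invented placeholder, and only the differences matter:

```python
# Back-of-envelope version of the "no overtaking" argument.
SPELLOUT_INTERVAL = 100   # ms between successive phonological syllables (text)
FREQUENCY_PENALTY = 15    # extra ms to retrieve a low-frequency syllable (text)
BASE_RETRIEVAL = 50       # hypothetical retrieval time for a high-frequency syllable

# Worst case for the first syllable: low frequency, so retrieval is slow.
first_syllable_done = 0 + BASE_RETRIEVAL + FREQUENCY_PENALTY

# Best case for the second syllable: high frequency, but its phonological
# code only becomes available one spell-out interval later.
second_syllable_done = SPELLOUT_INTERVAL + BASE_RETRIEVAL

# The second syllable could only "overtake" the first if the frequency
# penalty exceeded the spell-out interval, i.e. 15 ms vs. 100 ms.
overtaken = second_syllable_done < first_syllable_done
```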
Method
The vocabulary consisted of 96 bisyllabic Dutch nouns: 48 high word frequency, 48 low word frequency. Within each word frequency group there were four syllable frequency conditions (12 words each) constructed by crossing first syllable frequency with second syllable frequency (i.e., high-high, high-low, low-high and low-low). The criteria for assigning words to frequency groups were the same as in Experiment 1. Mean log frequencies and number of phonemes for the high- and low-frequency words in each syllable condition are given in Table 2. Two high word frequency vocabularies and two low word frequency vocabularies were constructed, each with six words from each syllable frequency condition. Each vocabulary was then divided into six four-word groups with one word from each condition. As in Experiment 1, these groups contained words which were phonologically and semantically unrelated. Each group was assigned a symbol set with four rotations and each of the 48 subjects was assigned to one vocabulary and one symbol set. Each subject received 18 blocks of 24 trials: three blocks for each word group. In this experiment word frequency was a between-subjects variable. This was necessary because of the extra syllable frequency conditions and the limited
Table 2. Log syllable and word frequencies and mean number of phonemes for high- and low-frequency words in each of the First x Second syllable frequency groups of Experiment 2

Syllable freq.      No. phonemes      Syllable 1        Syllable 2        Word
1st x 2nd           Syl. 1   Syl. 2   POS      TOT      POS      TOT      WRD     LEM

High word frequency
High x High         2.8      2.7      7.3      7.8      7.8      8.7      3.8     4.0
High x Low          2.8      3.2      7.6      7.9      5.1      5.3      3.6     4.0
Low x High          3.1      2.8      4.9      5.2      8.3      8.9      3.8     4.0
Low x Low           3.0      3.6      4.9      5.3      4.8      5.2      3.7     4.0

Low word frequency
High x High         2.8      2.6      7.0      7.5      8.1      8.7      1.5     1.9
High x Low          2.7      3.2      7.5      8.1      4.0      4.5      1.2     1.5
Low x High          3.1      2.6      4.1      4.7      8.3      8.9      1.4     1.7
Low x Low           2.9      3.3      4.3      4.7      4.1      4.5      1.0     1.5
number of words a subject could accurately memorize and produce within an hour. Moreover, our major interest was in the pattern of results over the syllable frequency conditions for both high- and low-frequency words, rather than in the word frequency effect itself. In order to be able to compare baseline naming speed of subjects who received the high and low word frequency vocabularies, each subject received a calibration block of the same four words at the end of the experiment. The rest of the procedure was exactly the same as in Experiment 1. Forty-eight subjects were run; 24 received a high word frequency vocabulary (20 women and 4 men) and 24 received a low word frequency vocabulary (18 women and 6 men).

Results

Exclusion of data
Data from four subjects were replaced due to high error rates. Data points were excluded and substituted according to the same principles as in Experiment 1. The first production of a word in each block was again counted as a practice trial and excluded from the analysis. 2.8% of data points were correct naming latencies following error trials. 1.8% of the data points were greater than two standard deviations from the mean.

Naming latency
Mean naming latency for the high word frequency group was 641.6 ms, 5.7 ms
Table 3. Mean naming latency and percentage error (in parentheses) for words in the four syllable frequency conditions of Experiment 2. Means are shown for all words and for high- and low-frequency words separately. The effect of syllable frequency (low minus high) is also shown.

                          Syllable frequency
                          Low            High           Effect (low minus high)
All words
  1st syllable            637.4 (1.9)    640.1 (2.1)    -2.7 (-0.2)
  2nd syllable            644.5 (2.3)    633.0 (1.8)    11.5 (0.5)
High-frequency words
  1st syllable            641.8 (2.0)    641.1 (2.4)     0.7 (-0.4)
  2nd syllable            646.5 (2.5)    636.5 (2.0)    10.0 (0.5)
Low-frequency words
  1st syllable            632.8 (1.8)    638.9 (1.8)    -6.1 (0.0)
  2nd syllable            642.3 (2.1)    629.3 (1.5)    13.0 (0.6)
slower than the low word frequency group, 635.9 ms (see Table 3). This reverse effect of word frequency was insignificant, F1 and F2 < 1, and can be attributed to the random assignment of slower subjects to the high-frequency vocabularies. Mean naming latencies for the calibration block were: high word frequency, 659.3 ms; low word frequency, 624.5 ms. Subjects who received the high word frequency vocabularies were, therefore, on average 34.8 ms slower than the subjects who received the low word frequency vocabularies. This difference was only significant by words, F1(1, 46) = 1.9, F2(1, 3) = 52.1, p < .01. Mean naming latencies and error rates for the syllable frequency conditions are shown in Table 3; the latency data are summarized in Fig. 3. The -2.7 ms effect of first syllable frequency was, unsurprisingly, insignificant, F1(1, 44) = 1.1, F2 < 1. The 11.5 ms effect of second syllable frequency was significant by subjects, F1(1, 44) = 18.6, p < .001, and again marginally significant by words, F2(1, 80) = 3.8, p = .053. The interaction of first and second syllable frequency was not significant, F1 and F2 < 1. However, there was a significant three-way word frequency by first and second syllable frequency interaction, but only in the subject analysis, F1(1, 44) = 6.1, p < .05, F2(1, 80) = 1.3. This was due to a by-subjects only interaction of first and second syllable frequency in the low-frequency word set, F1(1, 22) = 5.6, p < .05, F2(1, 40) = 1.2; words with high-frequency first syllables showed a smaller effect of second syllable frequency than words with low-frequency first syllables (5 ms and 21 ms respectively). Words with high-frequency second syllables showed a reverse effect of first syllable frequency (-14 ms) compared to a 2 ms effect for words with low-frequency second syllables. There was no main effect of vocabulary, F1 and F2 < 1. However, there was a
Figure 3. Naming latencies in Experiment 2. Syllable position (word-initial, word-final) versus syllable frequency.
significant interaction of second syllable frequency with vocabulary in the by-subjects analysis, F1(1, 44) = 6.8, p < .05, F2(1, 80) = 1.4, due to differences in the size of the effect in the two vocabularies in both the high- and low-frequency word sets.
Naming duration
Naming durations for high- and low-frequency words were 346.8 ms and 316.6 ms respectively. The 30.2 ms effect was significant by words, F1(1, 44) = 3.5, p > .05, F2(1, 80) = 20.1, p < .001. There were also significant effects of first syllable frequency (high 329.1 ms, low 334.3 ms, F1(1, 44) = 12.7, p < .01, F2 < 1) and second syllable frequency (high 321.1 ms, low 342.3 ms, F1(1, 44) = 167.0, p < .001, F2(1, 80) = 9.8, p < .01). The interaction of first and second syllable frequency was only significant by subjects, F1(1, 44) = 12.0, p < .001, F2 < 1; the effect of frequency on second syllable durations was restricted to words with high first syllable frequencies.
Percentage error rate
Error rates are also shown in Table 3. They yielded only a significant effect of second syllable frequency over subjects, F1(1, 44) = 6.0, p < .05, F2(1, 80) = 2.4.
Discussion
Although not all vocabularies in this experiment yielded significant syllable frequency effects, the main findings were consistent with our expectations. Whatever there is in terms of syllable frequency effects was due to the second syllable only. The frequency of the first syllable had no effect on naming latencies. Although the average size of the frequency effect (12 ms) was of the order of magnitude obtained in Experiment 1 (15 ms), the complexity of the experiment apparently attenuated its statistical saliency. An interaction of first and second syllable frequency effects is not predicted by our model of syllable retrieval. This experiment did yield some indication of such an interaction. However, it was observed in one vocabulary only and never approached significance over items. While further investigation is necessary to rule out such an effect, we do not feel it necessary to amend our model on the basis of this result. The next experiment was designed to isolate the effect of second syllable frequency.
EXPERIMENT 3: SECOND SYLLABLE FREQUENCY
Method

Vocabulary
The experimental vocabulary consisted of 24 pairs of bisyllabic Dutch words. Members of a pair had identical first syllables but differed in their second syllable: one word had a high-frequency second syllable and one word had a low-frequency second syllable (e.g., ha-mer/ha-vik). High and low second syllable frequency words were matched for word frequency. No attempt was made to match second syllables for number of phonemes (see Table 4). Two matched vocabularies of 12 word pairs were constructed.
Design
Twelve pairs of abstract symbols of the form used in Experiment 1 were constructed. Each pair consisted of one simple symbol (e.g., ) and one more complex symbol (e.g., }}}}}}). The symbol pairs were assigned to one word pair in each vocabulary. Two sets for each vocabulary were constructed such that each word in a word pair was assigned to each symbol in its associated pair once.
Table 4. Log syllable and word frequencies for high- and low-frequency second syllable words in Experiment 3

                                    2nd syllable frequency
                                    High      Low
Log word form frequency             1.9       2.0
Log lemma frequency                 2.1       2.2
1st syllable position dependent     6.8       6.8
1st syllable total                  7.2       7.2
2nd syllable position dependent     7.8       4.0
2nd syllable total                  8.7       4.7
Number of phonemes                  2.8       3.3
Within a vocabulary, words were grouped into six blocks of four words. Only one member of a word pair occurred within a block. None of the words within a block had the same initial phoneme and they were semantically unrelated. The associated symbol groups in each set were the same in each vocabulary. Each subject received 24 blocks of 24 trials: three blocks for each word group.

Procedure and subjects
Each subject was assigned randomly to a vocabulary and a word set. Presentation of the blocks within a set was rotated. The procedure was the same as in Experiments 1 and 2. Twenty-four subjects were tested: 18 women and 6 men.

Results

Exclusion of data
2.2% of the data were trials following an error and 1.8% of the data were greater than 2 standard deviations from the mean. These data were again excluded from the analyses.

Naming latencies
Mean naming latency for words with high-frequency second syllables was
622.7 ms, and for words with low-frequency second syllables 634.5 ms. The 11.8 ms effect of syllable frequency was significant, F1(1, 22) = 12.6, p < .01, F2(1, 44) = 4.7, p < .05. There was a main effect of vocabulary by words, F1 < 1, F2(1, 44) = 18.0, p < .001, due to slower reaction times to vocabulary A (640.1 ms) compared to vocabulary B (617.1 ms). There was also a significant interaction between syllable frequency and vocabulary by subjects only, F1(1, 22) = 4.5, p < .05, F2(1, 44) = 1.7, due to a larger frequency effect in vocabulary A (high 630.7 ms, low 649.5 ms) than in vocabulary B (high 614.8 ms, low 619.5 ms).
Naming durations
Mean naming duration for words with high-frequency second syllables was 351.5 ms, and for words with low-frequency second syllables 370.0 ms. The 18.5 ms difference was significant, F1(1, 22) = 106.0, p < .001, F2(1, 44) = 4.5, p < .05. The effect of vocabulary was significant by words, F1(1, 22) = 2.8, F2(1, 44) = 26.0, p < .001 (vocabulary A, 338.4 ms; vocabulary B, 383.0 ms), but there was no interaction of vocabulary with syllable frequency, F1(1, 22) = 3.1, F2 < 1.
Percentage error rate
Mean percentage error rates were, for high-frequency second syllables, 1.2%, and for low-frequency second syllables, 1.6%. The only significant effect was of vocabulary (vocabulary A 1.8%, vocabulary B 1.0%), F1(1, 22) = 5.2, p < .05, F2(1, 44) = 5.1, p < .05.
Discussion
The present experiment reproduced the 12 ms second syllable effect obtained in Experiment 2, but now with satisfying statistical reliability. Together with the previous experiments, it supports the notion that the bulk, if not the whole, of the syllable frequency effect is due to the word-final syllable. Let us now turn to the other issue raised in the discussion of Experiment 1. Could it be the case that what we are measuring is not so much an effect of syllable frequency, but rather one of syllable complexity? In all of the previous experiments the second syllable frequency effect on naming latencies is accompanied by a similar effect on naming durations; that is, words with low-frequency second syllables have significantly longer naming durations than words with high-frequency second syllables. Moreover, the regression analyses of Experiment
1 showed that a syllable's frequency of occurrence correlates with the number of phonemes it contains. It is possible, therefore, that syllable complexity (defined in terms of number of phonemes to be encoded or in terms of articulation time) underlies the effects we have observed.
EXPERIMENT 4: SYLLABLE COMPLEXITY
The complexity issue is a rather crucial one. In the theoretical section of this paper we compared a direct route in phonetic encoding to a route via stored syllable programs. Of these, the former but not the latter would predict an effect of syllable complexity. The more complex a syllable's phonological structure, the more computation would be involved in generating its gestural score afresh from its phonological specifications. But no such thing is expected on the syllabary account. The syllabic gesture need not be composed; it is only retrieved. There is no reason to suppose that retrieving a more complex gestural score takes more time than retrieving a simpler one. There will, at most, be a mediated relation to complexity. There is a general tendency for more complex syllables to be less frequent in usage than simpler syllables. If indeed frequency is a determinant of accessing speed, then - even on the syllabary account - simple syllables will be faster than complex syllables. The present experiment was designed to test second syllable complexity as a potential determinant of phonetic encoding latency, but we controlled for syllable frequency in order to avoid the aforementioned confounding. We also controlled for word frequency.
Method

Vocabulary
The vocabulary consisted of 20 pairs of bisyllabic nouns. Each pair of words had the same initial syllable but differed in the number of phonemes in their second syllable (e.g., ge-mis [CVC]; ge-schreeuw [CCCVVC]). Word pairs were also matched for word and syllable frequency (see Table 5). The 20 pairs were divided into two vocabularies of 10 pairs matched on all the above variables.
Design
As in Experiment 3, pairs of abstract symbols were constructed and assigned to one word pair in each vocabulary. Two sets for each vocabulary were again
Table 5. Log syllable and word frequencies and mean number of phonemes for short and long words in Experiment 4

                                    2nd syllable
                                    Short     Long
Log word form frequency             1.9       2.0
Log lemma frequency                 2.3       2.4
1st syllable position dependent     9.3       9.3
1st syllable total                  9.4       9.4
2nd syllable position dependent     3.7       3.6
2nd syllable total                  5.0       5.3
Number of phonemes                  3         5
constructed such that each word in a word pair was assigned to each symbol in its associated pair once. Each vocabulary consisted of five blocks of four words. The rest of the design was the same as in Experiment 3, except that each subject received 15 blocks of 24 trials: three blocks for each word group.
Procedure and subjects
Each subject was again assigned randomly to a vocabulary and a word set. Presentation of the blocks within a set was rotated. The procedure was the same as in Experiments 1 and 2. Twenty subjects were tested: 13 women and 7 men.
Results

Exclusion of data
Two subjects were replaced due to high error rates. Exclusion of data resulted in the loss of 5.6% of the data: 4.1% were trials following an error and 1.5% were outliers.
Analyses
Naming latencies and percentage error rates were, for simple words, 681.3 ms (4.2%), and for complex words, 678.7 ms (3.3%). The effect of complexity on naming latency was insignificant, F1 and F2 < 1, as was the effect on error rates,
F1 = 1.0, F2 = 1.5. Clearly, the complexity (number of phonemes) of a word's second syllable does not affect its naming latency. Mean word duration for the simple words was 270.0 ms, compared to 313.0 ms for the complex words; this difference was significant, F1(1, 18) = 99.5, p < .0001, F2(1, 36) = 15.5, p < .001.
Discussion
When syllable frequency is controlled for, second syllable complexity does not affect naming latency. This shows that complexity cannot be an explanation for the syllable frequency effect obtained in the previous three experiments. In addition, the lack of a complexity effect shows that either the direct route in phonetic encoding (see above) is not a (co-)determinant of naming latencies in these experiments, or that the computational duration of gestural scores is, in some way, not complexity dependent.
GENERAL DISCUSSION
The main findings of the four experiments reported are these: (i) syllable frequency affects naming latency in bisyllabic words; (ii) the effect is independent of word frequency; (iii) the effect is due to the frequency of the word's ultimate syllable; (iv) second syllable complexity does not affect naming latency, and hence cannot be the cause of the frequency effect. What are the theoretical consequences of these findings? We will first consider this issue with respect to the theoretical framework of phonological encoding sketched above. We will then turn to alternative accounts that may be worth exploring.
The syllabary theory reconsidered
It needs no further discussion that the experimental findings are in seamless agreement with the syllabary theory as developed above. In fact, no other theory of phonological encoding ever predicted the non-trivial finding that word and syllable frequency have additive effects on naming latency. The theory, moreover, provides natural accounts of the dominant role of the word-final syllable and of the absence of a syllable complexity effect. These explanations hinge on the theoretical assumption that syllabification is a late process in phonological encoding (in particular that there is no syllabification in the word form lexicon) and that gestural scores for syllables are retrieved as whole entities.
It is, however, not the case that the findings are also directly supportive of other aspects of the theory, such as the details of segmental and metrical spellout, the metrical character of phonological word formation and the particulars of segment-to-frame association (except for the assumption that this proceeds on a syllable-by-syllable basis). These aspects require their own independent justification (for some of which see Levelt, 1989, 1993). But there is one issue in phonological encoding that may appear in a new light, given this framework and the present results. It is the issue of underspecification. As pointed out above, Stemberger (1983) was amongst the first to argue for underspecification in a theory of phonological encoding. It could provide a natural account for speech errors such as "in your really gruffy-scruffy clothes". Here the voicelessness of /k/ in scruffy is redundant. The lexicon might specify no more than the "archiphoneme" /K/, which can have both [k] and [g] as phonetic realizations; that is, the segment is unspecified on the voicing dimension. In the context of /s-r/, however, the realization has to be voiceless. But when, in a slip, the /s/ gets chopped off, the context disappears, and /K/ may become realized as [g]. The notion of underspecification was independently developed in phonological theory; Archangeli (1988), in particular, proposed a theory of "radical underspecification", which claims that only unpredictable features are specified in the lexicon. But a major problem for any underspecification theory is how a full specification gets computed from the underspecified base. The solutions need not be the same for a structural phonological theory and for a process theory of phonological encoding. Here we are only concerned with the latter, but the proposed solution may still be of some relevance to phonological theory. The syllabary theory may handle the completion problem in the following way.
There is no need to complete the specifications of successive segments in a word if one condition is met. It is that each phonological syllable arising in the process of segment-to-frame association (see Fig. 1) corresponds to one and only one gestural score in the syllabary. In other words, even if a syllable's segments are underspecified, their combination can still be unique. This condition puts empirical constraints on the degree and character of underspecification. Given a theory of underspecification, one can determine whether uniqueness is preserved, that is, whether each phonological syllable that can arise in phonological encoding corresponds to only one phonetic syllable in the syllabary. Or in other words, the domain of radical redundancy should be the syllable, not any other linguistic unit (such as the lexical word). Moreover, the domain should not be potential syllables, but syllables that occur with sufficient frequency in the speaker's language use as to have become "overlearned". Different cut-off frequency criteria should be considered here. Another variant would be to limit the domain to core syllables, ignoring syllable suffixes (see below).
The syllabary theory is, of course, not complete without a precise characterization of how the syllabary is accessed, given a phonological syllable. What we have said so far (following Crompton, 1982, and Levelt, 1989) is that a syllable gesture is selected and retrieved as soon as its phonological specification is complete. In a network model (such as in Roelofs, 1992, or Levelt, 1992, but also mutatis mutandis in Dell's, 1988, model), this would require the addition of a bottom layer of phonetic syllable nodes. A syllable node's frequency-dependent accessibility can then be modelled as its resting activation. A strict regime has to be built in, in order to select phonetic syllables in their correct order, that is, strictly following a phonological word's segment-to-frame association. Although a word's second syllable node may become activated before the first syllable has been selected, selection of syllable one must precede selection of syllable two (and so on for subsequent syllables). Unlike phonological encoding, which involves the slightly error-prone process of assigning activated phonemes to particular positions in a phonological word frame, there are no frames to be filled in phonetic encoding. It merely involves the concatenation of successively retrieved syllabic gestures. This difference accounts for the fact that exchanges of whole syllables are almost never observed. Modelling work along these lines is in progress. The successive selection of articulatory gestures does not exclude a certain overlap in their motor execution. Whatever between-syllable coarticulation there is may be due to such overlap. The articulatory network probably computes an articulatory gesture that is a weighted average of the two target gestures in the range of overlap.
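The strictly serial selection regime described above can be caricatured in a few lines. The threshold, input rate, activation values and 100-step spell-out lag are invented placeholders, not parameters of any implemented model:

```python
# Toy sketch of the network idea in the text: phonetic syllable nodes with
# frequency-dependent resting activation, selected strictly in serial order.

def retrieval_time(resting_activation, threshold=1.0, input_rate=0.05):
    """Time steps until a node reaches threshold; higher resting activation
    (i.e., higher syllable frequency) means faster retrieval."""
    t, act = 0, resting_activation
    while act < threshold:
        act += input_rate
        t += 1
    return t

def encode_word(syllable_resting_activations):
    """Select syllables strictly in order: syllable n+1 may already be active,
    but it is only selected after syllable n (so no syllable exchanges)."""
    selected, clock = [], 0
    for i, resting in enumerate(syllable_resting_activations):
        # Syllable i's phonological code arrives one spell-out lag later
        # than syllable i-1's; selection also waits for the previous syllable.
        clock = max(clock, i * 100) + retrieval_time(resting)
        selected.append((i, clock))
    return selected

# Low-frequency first syllable, high-frequency second syllable.
order = encode_word([0.2, 0.8])
```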
Alternative accounts
Let us now turn to possible alternative accounts of our data. They can best be cast as ranging over a dimension of "mixed models", which includes our own. The one extreme here is that all phonological encoding involves access to a syllabary. The other extreme is that a phonological word's and its syllables' gestural scores are always fully computed. Our own syllabary theory, as proposed above, is a mixed model in that we assume the computability of all syllables - new, low or high frequency. But there is always a race between full computation and access to stored syllable scores, where the latter process will normally win the race except for very low-frequency or new syllables. Hence, our theory predicts that there should be a syllable complexity effect for words that end in new or very low-frequency syllables. But the balance between computation and retrieval may be a different one. More computation will be involved when one assumes that only core syllables are stored, whereas syllable suffixes are always computed. What is a core syllable? One definition is that it is a syllable that obeys the sonority sequencing principle.
This states that syllable-initial segments should be monotonically increasing in sonority towards the syllable nucleus (usually the vowel), and that syllable-final segments should be monotonically decreasing from the nucleus (see Clements, 1990, for a historical and systematic review of "sonority sequencing"). Phonetically a segment's sonority is its perceptibility, vowels being more sonorant than consonants, nasals being more sonorant than stops, etc. But sonority can also be defined in terms of phonological principles (Clements, 1990). On either of these sonority accounts the syllable /plant/ is a core syllable, whereas /lpatn/ is not; the latter violates the sequencing principle both in its onset and its offset. Though /lpatn/ is not a syllable of English, violations of sonority sequencing do occur in English syllables, as in cats, task or apt. Fujimura and Lovins (1978) proposed to treat such and similar cases as combinations of a core syllable plus an "affix", such as /cat + s/, etc. Here the core obeys sonority sequencing, and the affix is added to it. The authors also gave other, more phonological reasons for distinguishing between core and affixes, not involving sonority. They proposed that English syllables can have only one place-specifying consonant following the nucleus. So, in a word like lens, s is a suffix, although the sonority principle is not violated here. A similar notion of "syllable appendix" was proposed by Halle and Vergnaud (1980). It is clear where such affixes can arise in the process of segment-to-frame association discussed earlier. This will most naturally occur in word-final position when there is a "left over" consonantal segment that cannot associate to a following syllable (Rule 2b). The present version of a mixed theory would then be that as soon as a phonological core syllable is created in left-to-right segment-to-frame association, its phonetic score is retrieved from the syllabary.
Any affixes will be computationally added to that score. An advantage of this theory is that the syllabary will drastically reduce in size. In the CELEX database for English (i.e., for citation forms of words) there are about 12,000 different syllables (counting both full and reduced syllables). But most of them have complex offset clusters. These will all be eliminated in a core syllabary. A disadvantage, however, is that the theory predicts the complexity effect that we did not find in Experiment 4. There we varied syllables' complexity precisely by varying the number of segments in their consonant clusters (onset or coda), and this should have computational consequences on the present theory. Still, the experiment was not explicitly designed to test the affix theory; it is therefore premature to reject it without further experimentation. Where Fujimura and Lovins (1978) only proposed to distinguish between syllable core and affix(es), Fujimura (1979) went a step further, namely to split up the core as well. In order to account for the different types of vowel affinity of the initial and final parts of the syllable (already observed in the earlier paper) he introduced the notion of demisyllable. The syllable core consists of an initial demisyllable consisting of initial consonant(s) plus vowel, and a final demisyllable consisting of vowel plus following consonants. Hence, these demisyllables hinge at the syllabic nucleus. In this model, demisyllables are the domains of allophonic variation, of sonority and of other relations between consonants and the vowels they attach to. Or, more precisely, as Fujimura (1990) puts it, demisyllables, not phonemes, are the "minimal integral units". Consonantal features are, in actuality, features of demisyllables. On this account "the complete inventory for segmental concatenation will contain at most 1000 entries and still reproduce natural allophonic variation" (Fujimura, 1976). We could call this inventory a demisyllabary, and we have another mixed model here. The speaker might access such a demisyllabary and retrieve syllable-initial and syllable-final gestures or gestural scores. Fujimura's model requires that, in addition, syllable affixes be computed. This latter part of the model will, of course, create the same complexity problem as discussed above. But as far as the demisyllable aspect is concerned, we can see no convincing arguments to reject such a model on the basis of our present results. It cannot be excluded a priori that our syllable frequency effect is, in actuality, a demisyllable frequency effect. In order to test this, new experiments will have to be designed in which demisyllable frequency is systematically varied. In conclusion, although we have certainly not yet proven that speakers do have access to a syllabary, our theory has been productive in making non-trivial predictions that found support in a series of experiments. Any alternative theory should be able to account for the syllable frequency effect, its independence of word frequency, and the absence of syllable complexity effects.
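The race between retrieval and computation that runs through all of these mixed models can be caricatured in a few lines of code. The stored entries and the "compute" step below are placeholders for illustration, not claims about actual gestural scores:

```python
# Caricature of the mixed race model: retrieval from a stored syllabary
# races against full computation, and retrieval wins whenever an entry
# exists. Entries and scores here are illustrative placeholders.
SYLLABARY = {"na": "<score:na>", "deel": "<score:deel>"}  # high-frequency entries

def compute_score(syllable):
    # Stand-in for full phonetic computation of a gestural score.
    return "<computed:" + syllable + ">"

def phonetic_score(syllable):
    # Retrieval normally wins the race; computation serves new or
    # very low-frequency syllables that lack a stored entry.
    return SYLLABARY.get(syllable, compute_score(syllable))

print(phonetic_score("na"))     # retrieved: <score:na>
print(phonetic_score("lpatn"))  # computed:  <computed:lpatn>
```

Under the core-syllabary variant, the dictionary would hold only core syllables and affixes would always go through the computed branch.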
References

Archangeli, D. (1988). Aspects of underspecification theory. Phonology, 5, 183-207.
Browman, C.P., & Goldstein, L. (1991). Representation and reality: Physical systems and phonological structure. Haskins Laboratory Status Report on Speech Research, SR-105/106, 83-92.
Clements, G.N. (1990). The role of the sonority cycle in core syllabification. In J. Kingston & M.E. Beckman (Eds.), Papers in laboratory phonology I: Between the grammar and physics of speech (pp. 58-71). Cambridge, UK: Cambridge University Press.
Crompton, A. (1982). Syllables and segments in speech production. In A. Cutler (Ed.), Slips of the tongue and language production (pp. 109-162). Berlin: Mouton.
Dell, G.S. (1988). The retrieval of phonological forms in production: Tests of predictions from a connectionist model. Journal of Memory and Language, 27, 124-142.
Fujimura, O. (1976). Syllables as concatenated demisyllables and affixes. Journal of the Acoustical Society of America, 59 (Suppl. 1), S55.
Fujimura, O. (1979). An analysis of English syllables as cores and suffixes. Zeitschrift für Phonetik, Sprachwissenschaft und Kommunikationsforschung, 32, 471-476.
Fujimura, O. (1990). Demisyllables as sets of features: Comments on Clements's paper. In J. Kingston & M.E. Beckman (Eds.), Papers in laboratory phonology I: Between the grammar and physics of speech (pp. 377-381). Cambridge, UK: Cambridge University Press.
Fujimura, O., & Lovins, J.B. (1978). Syllables as concatenative phonetic units. In A. Bell & J.B. Hooper (Eds.), Syllables and segments (pp. 107-120). Amsterdam: North-Holland.
Halle, M., & Vergnaud, J.-R. (1980). Three dimensional phonology. Journal of Linguistic Research, 1, 83-105.
Hayes, B. (1989). Compensatory lengthening in moraic phonology. Linguistic Inquiry, 20, 253-306.
Jescheniak, J.D., & Levelt, W.J.M. (in press). Word frequency effects in speech production: Retrieval of syntactic information and of phonological form. Journal of Experimental Psychology: Learning, Memory, and Cognition.
Keating, P.A. (1988). Underspecification in phonetics. Phonology, 5, 275-292.
Klapp, S.T., Anderson, W.G., & Berrian, R.W. (1973). Implicit speech in reading, reconsidered. Journal of Experimental Psychology, 100, 368-374.
Levelt, W.J.M. (1989). Speaking: From intention to articulation. Cambridge, MA: MIT Press.
Levelt, W.J.M. (1992). Accessing words in speech production: Stages, processes and representations. Cognition, 42, 1-22.
Levelt, W.J.M. (1993). Timing in speech production: With special reference to word form encoding. Annals of the New York Academy of Sciences, 682, 283-295.
Lindblom, B. (1983). Economy of speech gestures. In P.F. MacNeilage (Ed.), The production of speech (pp. 217-245). New York: Springer.
MacKay, D.G. (1982). The problems of flexibility, fluency and speed-accuracy tradeoff. Psychological Review, 89, 483-506.
Meringer, R., & Mayer, K. (1895). Versprechen und Verlesen. Stuttgart: Göschen. (Reissued, with introductory essay by A. Cutler and D.A. Fay, 1978, Amsterdam: John Benjamins.)
Meyer, A.S. (1990). The time course of phonological encoding in language production: The encoding of successive syllables of a word. Journal of Memory and Language, 29, 524-545.
Meyer, A.S. (1991). The time course of phonological encoding in language production: Phonological encoding inside a syllable. Journal of Memory and Language, 30, 69-89.
Meyer, A.S., & Schriefers, H. (1991). Phonological facilitation in picture-word interference experiments: Effects of stimulus onset asynchrony and types of interfering stimuli. Journal of Experimental Psychology: Human Perception and Performance, 17, 1146-1160.
Nespor, M., & Vogel, I. (1986). Prosodic phonology. Dordrecht: Foris.
Oldfield, R.C., & Wingfield, A. (1965). Response latencies in naming objects. Quarterly Journal of Experimental Psychology, 17, 273-281.
Roelofs, A. (1992). A spreading-activation theory of lemma retrieval in speaking. Cognition, 42, 107-142.
Saltzman, E., & Kelso, J.A.S. (1987). Skilled actions: A task-dynamic approach. Psychological Review, 94, 84-106.
Shattuck-Hufnagel, S. (1979). Speech errors as evidence for a serial order mechanism in sentence production. In W.E. Cooper & E.C.T. Walker (Eds.), Sentence processing: Psycholinguistic studies presented to Merrill Garrett (pp. 295-342). Hillsdale, NJ: Erlbaum.
Shattuck-Hufnagel, S. (1992). The role of word structure in segmental serial ordering. Cognition, 42, 213-259.
Stemberger, J.P. (1983). Speech errors and theoretical phonology: A review. Bloomington: Indiana University Linguistics Club.
Wheeldon, L.R., & Lahiri, A. (in preparation). Prosodic units in language production.
Wheeldon, L., & Levelt, W.J.M. (1994). Monitoring the time-course of phonological encoding. Manuscript submitted for publication.
Winer, B.J. (1971). Statistical principles in experimental design. New York: McGraw-Hill.
Wingfield, A. (1968). Effects of frequency on identification and naming of objects. American Journal of Psychology, 81, 226-234.
Appendix 1. Vocabularies in Experiment 1

The four experimental vocabularies, split into block groups containing one word from each of the four frequency groups (HH, LH, HL, LL). Within a block group, words are phonologically and semantically unrelated.

            VOCAB A     VOCAB B     VOCAB C     VOCAB D
GROUP 1
  (HH)      constant    nadeel      geding      roman
  (LH)      neutraal    gordijn     triomf      borrel
  (HL)      cider       takel       kakel       hoeder
  (LL)      tarbot      concaaf     neuraal     soldeer
GROUP 2
  (HH)      arme        koning      stilte      versie
  (LH)      client      sleutel     rapport     praktijk
  (HL)      nader       volte       bever       neder
  (LL)      vijzel      absint      horzel      causaal
GROUP 3
  (HH)      boter       toren       natuur      teder
  (LH)      heuvel      nerveus     gratis      advies
  (HL)      kandeel     gemaal      proper      combo
  (LL)      giraffe     berber      concours    geiser
GROUP 4
  (HH)      pater       heelal      kussen      gebaar
  (LH)      techniek    crisis      vijand      kasteel
  (HL)      gewei       reiger      adder       tegel
  (LL)      rantsoen    pingel      trofee      narcis
17 On the internal structure of phonetic categories: a progress report

Joanne L. Miller*
Department of Psychology, Northeastern University, Boston, MA 02115, USA
Abstract There is growing evidence that phonetic categories have a rich internal structure, with category members varying systematically in category goodness. Our recent findings on this issue, which are summarized in this paper, underscore the existence and robustness of this structure and indicate further that the mapping between acoustic signal and internal category structure is complex: just as in the case of category boundaries, the best exemplars of a given category are highly dependent on acoustic-phonetic context and are specified by multiple properties of the speech signal. These findings suggest that the listener's representation of phonetic form preserves not only categorical information, but also fine-grained information about the detailed acoustic-phonetic characteristics of the language.
*Tel. (617) 373 3766; fax (617) 373 8714; e-mail [email protected]. This paper and the research reported herein were supported by NIH Grant DC 00130 and NIH BRSG RR 07143. The author thanks Peter D. Eimas for valuable comments on an earlier version of the manuscript.

Introduction

A major goal of a theory of speech perception is to explicate the nature of the mapping between the acoustic signal of speech and the segmental structure of the utterance. The basic issue can be framed as follows: how do listeners internally represent the phonetic categories of their language, and how do they map the incoming speech signal onto these categorical representations during processing? Throughout the years, considerable emphasis has been placed on the discrete as opposed to continuous nature of phonetic categories and, in particular, on the boundaries between categories - where boundaries are located, why they are
located where they are, and what kinds of factors shift boundaries around (Repp & Liberman, 1987). This focus on boundaries has historical roots. Much of the early research on speech categorization explored the phenomenon of categorical perception (Studdert-Kennedy, Liberman, Harris, & Cooper, 1970), which emphasized the ease of discriminating stimuli that crossed a category boundary compared to the relative difficulty of discriminating stimuli that did not cross a category boundary, that is, that belonged to the same phonetic category. The basic idea was that during speech perception the listener maps the linguistically relevant acoustic properties onto discrete phonetic categories such that information about category identity is retained, but the details of the underlying acoustic form are largely lost (cf. Lahiri & Marslen-Wilson, 1991). It is now known, however, that the discrimination of stimuli from a given phonetic category is not all that limited; under certain experimental conditions listeners can discriminate stimuli within a category remarkably well (Carney, Widin, & Viemeister, 1977; Pisoni & Tash, 1974; van Hessen & Schouten, 1992). Thus the processes that map the acoustic signal onto the categorical representations of speech do not necessarily produce a loss of detailed information about particular speech tokens (cf. Pisoni, 1990). Moreover, there is now a growing body of evidence that stimuli within a given phonetic category are not only discriminable from one another, but that phonetic categories themselves have a rich internal structure. The fundamental idea is that phonetic categories, like perceptual/cognitive categories in general (Nosofsky, 1988; Rosch, 1978; Medin & Barsalou, 1987), have a graded structure, with some stimuli perceived as better exemplars of the category than others - it is far from the case that the stimuli within a phonetic category are perceptually equivalent. This graded internal structure is revealed both by tasks that assess the functional effectiveness of category members in such phenomena as dichotic competition (Miller, 1977; Repp, 1977), selective adaptation (Miller, 1977; Miller, Connine, Schermer, & Kluender, 1983; Samuel, 1982), and discrimination/generalization (Kuhl, 1991), and by tasks that assess overt judgements of category goodness (Davis & Kuhl, 1992; Kuhl, 1991; Miller & Volaitis, 1989; Samuel, 1982; Volaitis & Miller, 1992; cf. Li & Pastore, in press). The picture that is emerging, then, is one of a categorization process that maps acoustic information onto discrete, but highly structured, categorical representations. The challenge is to explicate the nature of these structured representations and to discover their role in processing. In this paper I summarize our progress to date in doing so.
Evidence for internal category structure: goodness judgements Although our early work on this issue examined the differential effectiveness of category members in dichotic listening and selective adaptation tasks (Miller,
1977; Miller et al., 1983), more recently we have focused on overt category goodness judgements. A typical experiment proceeds as follows. We first create a speech series in which a phonetically relevant acoustic property is varied so as to range from one phonetic segment to another. For example, we create a series ranging from /bi/ to /pi/, with the /b/-/p/ voicing distinction specified by a change in voice onset time (VOT), the delay between the release of the consonant and the onset of periodicity corresponding to vocal fold vibration. As in typical speech experiments, the VOT values of the /bi/ and /pi/ endpoints are informally selected to be good category exemplars. Next, we extend the series by incrementing the critical acoustic property, VOT, to values far beyond those typically associated with a good member of one of the categories, in this case /p/. This yields an extended series ranging from /bi/ through /pi/ to a breathy, exaggerated version of /pi/ (which we label */pi/). Note that for this particular series the phonetic category of interest, /p/, is bounded on one side by another phonetic category, /b/, and on the other by highly exaggerated instances of the target category, */p/. In other cases, the category of interest is bounded on both sides by other phonetic categories. In our goodness task, listeners are presented with randomized sequences of the extended series and asked to judge each exemplar for its goodness as a member of the /p/ category using a 1-10 rating scale; the better the exemplar, the higher the number. Representative data from such an experiment are shown in Fig. 1 (top panel). As VOT increases, the ratings first increase and then decrease, showing systematic variation in category goodness. In other words, the category has a fine-grained internal structure, with only a limited range of stimuli within the category obtaining the highest goodness ratings. Furthermore, we have found that the structure reflected by the goodness ratings is strongly correlated with the structure reflected in a categorization reaction time task: the higher the goodness rating, the less time it takes listeners to identify a within-category stimulus as a good exemplar of /p/ (r = -.93) (Wayland & Miller, 1992). This suggests that the fine gradations within the category may play an important role in on-line speech processing and, perhaps, lexical access - issues we are currently pursuing. We have now obtained goodness functions for a number of different phonetic contrasts, involving both consonants and vowels, specified by a variety of acoustic properties. Although the precise form of the goodness function does not remain invariant, all contrasts studied to date have yielded systematic variation in goodness judgements within the category. Two further examples are shown in Fig. 1 (middle and bottom panels). We tentatively conclude that graded internal structure is a general characteristic of phonetic categories. Interestingly, there is now evidence that such structure exists not only for adults but also for young infants. Kuhl and her colleagues (Kuhl, Williams, Lacerda, Stevens, & Lindblom, 1992) report structured, language-dependent categories for vowels in 6-month-old infants, and we have preliminary evidence for structured consonantal categories in
Fig. 1. Three examples of goodness functions for phonetic categories. Top panel: group goodness function for /pi/ along a /bi/-/pi/-*/pi/ series, specified by VOT (ms). Based on Wayland and Miller (1992). Middle panel: group goodness function for /stei/ along a /sei/-/stei/-*/stei/ series, specified by closure duration (ms). Based on Hodgson and Miller (1992). Bottom panel: group goodness function for /bæt/ along a /bɛt/-/bæt/-*/bæt/ series, specified by vowel duration (ms). Based on Donovan and Miller (unpublished data).
3-month-old infants (Eimas & Miller, unpublished data). Whether such graded category structure emerges through exposure to language over the first months of life, as suggested by Kuhl (1993), or whether some rudiments of language-independent internal category structure are innately given by the biological endowment of the infant, to become fine-tuned with specific language experience, remains to be determined. We should also point out that although the existence of internal category structure is now well established, the nature of the mental representation that underlies such structure is not known. Two major possibilities present themselves.
First, it could be that the listener stores an abstract summary description, or prototype, of the category (Rosch, 1978). Goodness ratings would be based on the perceptual similarity of the test stimulus to the stored prototype. Alternatively, the representation could be based on stored category exemplars, presumably weighted for frequency (Nosofsky, 1988). In this case, goodness ratings would be based on the perceptual similarity of the test stimulus to the set of stored exemplars. Differentiating between these accounts of phonetic category representation may well prove difficult (see Li & Pastore, in press), as it has in the case of non-speech perceptual and cognitive categorization (Medin & Barsalou, 1987).¹

The internal structure of phonetic categories is context dependent

There is substantial evidence that the acoustic information specifying a given phonetic segment varies extensively with a host of contextual factors, such as phonetic environment, speaker and speaking rate, and that listeners are sensitive to such variation during speech perception (Jusczyk, 1986; Perkell & Klatt, 1986; but see Stevens & Blumstein, 1981). Much of the evidence for context effects in perception comes from studies examining boundary locations. The basic finding is that boundary locations are flexible; variation in a relevant contextual factor results in a predictable, systematic change in the location of the listener's category boundaries (Repp & Liberman, 1987). With the shift of emphasis from category boundaries to category centers, a question that immediately arises is whether these pervasive context effects are limited to the region of the category boundary where, by definition, there is ambiguity in category membership, or whether the influence of contextual factors extends beyond the boundary region, such that the internal structure of the category is itself altered.
The latter would indicate that contextual effects in perception entail a more comprehensive remapping between acoustic property and phonetic structure than shifts in boundary location alone can reveal. Our initial investigation of this issue focused on speaking rate. When speakers talk they do not maintain a constant rate of speech (Crystal & House, 1990; Miller, Grosjean, & Lomanto, 1984). This poses a potential problem for perception in that many phonetically relevant acoustic properties are themselves temporal in nature, and change as speaking rate changes. A case in point is VOT. As speakers slow down such that overall syllable duration becomes longer, the

¹It should also be noted that the longstanding debate in the literature over the unit of representation underlying speech perception - for example, whether it is the linguistic feature (Stevens & Blumstein, 1981), the phonetic segment (Pisoni & Luce, 1987) or the syllable (Studdert-Kennedy, 1980) - is relevant here as well. That is to say, it is not currently known whether graded internal structure is properly characterized in terms of categories at a featural, segmental or syllabic level of representation.
VOT values associated with stop consonants at syllable onset also become longer (Miller, Green, & Reeves, 1986; Summerfield, 1975; Volaitis & Miller, 1992). This suggests that if listeners are to use VOT to distinguish voiced and voiceless stops most effectively, they should treat VOT not absolutely, but in relation to syllable duration. Many studies have shown evidence for such rate-dependent processing, measured at the category boundary: as syllable duration increases (reflecting a slowing of rate), the listener's voiced-voiceless boundary shifts toward longer VOTs. Although there is continuing controversy over the nature of the perceptual mechanism underlying rate effects of this type (Fowler, 1990; Miller & Liberman, 1979; Pisoni, Carrell, & Gans, 1983; Summerfield, 1981), the basic phenomenon has proved highly robust. We asked whether the listener's adjustment for rate is limited to the boundary region or affects the perception of stimuli well within the category (Miller & Volaitis, 1989; see also Volaitis & Miller, 1992). To do so, we created two extended /bi/-/pi/-*/pi/ series that differed from each other in overall syllable duration, reflecting different speaking rates. The stimuli in one series were short (125 ms), reflecting a fast rate of speech, and the stimuli in the other were longer (325 ms), reflecting a slower rate of speech. Stimuli from the two series were presented for goodness judgements using our rating task. For both series, we obtained systematic goodness functions, with only a limited range of stimuli receiving the highest ratings. In order to quantify the effect of rate, we designated a best-exemplar range for each series, defined as the range of stimuli receiving ratings that were at least 90% of the maximal rating given to any stimulus within the series. The effect of rate was clear: the best-exemplar range was shifted toward longer VOT values for the longer syllables. 
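The best-exemplar computation just described - the range of stimuli rated at least 90% of the series maximum - can be sketched directly. The VOT values and ratings below are invented for illustration only:

```python
# Sketch of the best-exemplar range: stimuli whose mean goodness
# rating reaches at least 90% of the maximal rating in the series.
# All VOT values (ms) and ratings (1-10 scale) are invented.
def best_exemplar_range(vots, ratings, criterion=0.9):
    cutoff = criterion * max(ratings)
    selected = [v for v, r in zip(vots, ratings) if r >= cutoff]
    return min(selected), max(selected)

vots    = [10, 30, 50, 70, 90, 110, 130]        # ms, hypothetical series
ratings = [1.5, 4.0, 8.2, 9.0, 8.5, 6.0, 3.0]   # mean /p/ goodness ratings
print(best_exemplar_range(vots, ratings))  # → (50, 90): ratings >= 8.1
```

Ranges computed this way are what the rate comparison contrasts across the fast and slow series.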
Thus, just as speakers produce longer VOT values for longer syllables, so too do listeners require longer VOT values for stimuli to be perceived as prototypic category exemplars. Interestingly, the perceptual data matched the acoustic production data in yet another characteristic. As speakers slow down, they produce not only longer VOTs, but tend to produce a wider range of VOTs (Miller et al., 1986; Volaitis & Miller, 1992). Our perceptual data mirrored this effect, in that the best exemplars spanned a wider range of VOT values for the longer syllables. There thus appears to be a close correspondence between acoustic alterations stemming from a change in rate during speech production and perceptual alterations in the internal structure of categorical representations. So far, we have discussed only rate information specified by the syllable itself, that is, syllable-internal rate. But it is known that, at least insofar as boundaries are concerned, a change in syllable-external rate also affects perception (Gordon, 1988; Summerfield, 1981). Indeed, other than the fact that the boundary shifts due to syllable-internal rate often appear to be larger than those due to syllable-external rate, the two rate effects appear similar in kind. However, by focusing on rate effects within the category rather than at the boundary, we have recently
found evidence for a qualitative difference between the two kinds of rate effects (Wayland, Miller, & Volaitis, in press) - evidence that supports the proposal in the literature that the two rate effects may derive from different underlying mechanisms (Port & Dalby, 1982; Summerfield, 1981). The main study was straightforward. We obtained goodness ratings for target syllables drawn from a single /bi/-/pi/-*/pi/ series embedded in fast and slow versions of a context sentence, "She said she heard here." Two main findings emerged. First, the change in sentential rate from fast to slow produced a shift in the best-exemplar range for /pi/ along the target series toward longer VOT values. Thus syllable-external rate, like syllable-internal rate, does alter the perception of stimuli well within the category. However, the change in sentential rate from fast to slow, unlike the change in syllable-internal rate, produced only a simple shift in the best-exemplar range, with no concomitant widening of the range. These findings support a dissociation between the two kinds of rate information. In addition, they raise the possibility that the shape of the phonetic category is determined by characteristics of the syllable itself, with syllable-external factors simply shifting the category, with its structure intact, along the acoustic dimension. Support for the notion that syllable-internal factors are important in determining the shape (as well as the location) of the category along an acoustic dimension comes from a study in which we directly examined the effect of syllable structure on goodness ratings (Volaitis, 1990; Volaitis & Miller, 1991). To set the stage for our perceptual study, we first conducted an acoustic analysis of consonant-vowel (CV) and consonant-vowel-consonant (CVC) syllables, obtained by asking speakers to produce tokens of /di, ti, dis, tis/ across a range of rates. As expected, for each of the four syllables, as speaking rate decreased such that overall syllable duration increased, so too did the VOT value of the initial consonant increase. However, VOT did not depend only on the duration of the syllable, but also on its structure (Weismer, 1979). In particular, for a given syllable duration, VOT was longer for /ti/ than /tis/, and tended to encompass a wider range of values. If listeners are sensitive to this acoustic pattern, then for a given syllable duration the best exemplars of /ti/ should have longer VOT values than those of /tis/. This is precisely what happened in the perceptual study, in which we compared goodness ratings along /di/-/ti/-*/ti/ and /dis/-/tis/-*/tis/ series whose syllables had the same overall duration. Moreover, the shapes of the goodness functions for /ti/ and /tis/ were quite different, with the best-exemplar range being narrower for /tis/ than /ti/, again mirroring the trend in the acoustic data. Yet further support for both the existence of context effects within the category and a close correspondence between these context effects and the acoustic consequences of speech production comes from a study in which we looked at the role of acoustic-phonetic context provided by the critical segment
itself in shaping the internal structure of a phonetic category (Volaitis & Miller, 1992). It has long been known that the VOT value of an initial stop consonant varies systematically with changes in the place of articulation of that stop, such that VOT values become longer as place moves from labial (/b,p/) to velar (/g,k/) (Lisker & Abramson, 1964). Moreover, it is known that listeners are sensitive to this pattern, insofar as they shift their voiced-voiceless category boundaries toward longer VOT values as place changes from labial to velar (Lisker & Abramson, 1970). Our question was whether this context sensitivity in perception extends to the best exemplars of the category. The answer was yes. When listeners were asked for /p/ and /k/ goodness judgements, respectively, for stimuli from /bi/-/pi/-*/pi/ and /gi/-/ki/-*/ki/ series, they required longer VOT values to perceive the best tokens of /ki/ than of /pi/. Taken together, the findings provide strong support for the proposal that the listener's sensitivity to contextual factors is not a boundary phenomenon; rather, context alters which stimuli are perceived to be the best category exemplars. Note that this finding is compatible with either an exemplar-based or a prototype-based representational structure, with the caveat that any abstracted "prototype" must itself be context dependent; that is, there is not a single prototype for a given linguistic category (cf. Oden & Massaro, 1978). Our data also show that the systematic way in which context alters the best exemplars of the perceptual category corresponds closely to the way in which context alters the relevant acoustic properties during production. In other words, even for stimuli within a category, listeners are finely tuned to the consequences of articulation. An important issue to be resolved is whether such attunement arises from a speech-specific mechanism that operates in terms of articulatory principles, as the motor theory of speech perception proposes (Liberman & Mattingly, 1985), from an articulatory system that has evolved to take account of auditory processes (Diehl & Kluender, 1989), or from a perceptual system that operates so as to be generally sensitive to the consequences of physical events in the world, with speech being just one of those events (Fowler, 1986). So far we have focused on acoustically based contextual variation, rooted in the acoustic consequences of speech production. But not all context effects in speech perception are acoustically based. Some derive instead from the influence of higher-order linguistic variables - variables that themselves do not systematically alter the acoustic structure of the utterance. An example is lexical status. Consider a series of speech syllables that vary in initial consonant from /b/ to /p/, specified by a change in VOT. The critical aspect of the series is that one endpoint constitutes a word of the language, for example BEEF, whereas the other constitutes a non-word, for example PEEF. In a number of experiments using such series, it has been shown that listeners will tend to identify stimuli with potentially ambiguous phonetic segments in the vicinity of the /b/-/p/ boundary so as to render the real word of the language rather than the non-word, in this
example BEEF rather than PEEF. In other words, lexical status can produce a shift in category boundary location (Ganong, 1980; Pitt & Samuel, 1993). Interestingly, however, the literature suggests that the effect of lexical status, unlike the acoustically based context effects described earlier, may be limited to the region of the category boundary: The strength of the lexical effect appears to decrease as stimuli along a series move away from the region of the category boundary, with little or no influence often seen for the endpoint stimuli of a series, where phonetic identity is presumably clearly specified by the acoustic information. In a recent set of experiments (Miller, Volaitis, & Burki-Cohen, unpublished data), we used extended VOT series, coupled with our goodness rating task, to test directly the influence of lexical status on internal category structure. In a preliminary experiment involving BEEF-PEEF and BEACEPEACE series that varied in VOT, we confirmed the basic lexical effect: The / b / - / p / category boundary was located at a longer VOT value on the BEEFPEEF compared to the BE ACE-PEACE series; that is, listeners tended to hear the stimuli near the boundary as BEEF in the first series and as PEACE in the second series. In the main experiment, we extended the series by increasing VOT to extreme values (as in Miller & Volaitis, 1989), such that listeners were presented with BEEF-PEEF-*PEEF and BEACE-PEACE-*PEACE series. They were asked to rate each exemplar for the goodness of /p/. Following the procedures used in Miller and Volaitis (1989; see above), we designated a best-exemplar range for /p/, for both series. The interesting finding was that the change in lexical status did not shift the entire best-exemplar range along the VOT series, as had the contextual factors of rate, syllable structure, and phonetic context, described above. 
Instead, there was only a marginally reliable shift for the peak of the function itself, and no reliable shift for the edge of the best-exemplar range beyond the peak (in the direction away from the boundary). Although only preliminary, these findings raise the possibility that higher-order contextual variables, unlike acoustically based contextual variables, do not substantially alter the mapping between acoustic signal and internal category structure.
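The boundary-limited character of the lexical effect can be pictured with a toy identification model. This sketch is illustrative only and not drawn from the reported experiments: the logistic form, the slope, and the boundary values (30 ms vs. 22 ms VOT) are invented parameters, chosen so that lexical status moves the 50% crossover point while both series converge at the endpoints, where acoustic information dominates.

```python
import math

def p_voiceless(vot_ms, boundary_ms, slope=0.5):
    """Logistic identification function: probability of a /p/ response."""
    return 1.0 / (1.0 + math.exp(-slope * (vot_ms - boundary_ms)))

# Hypothetical boundaries: the word endpoint attracts ambiguous stimuli,
# so BEEF-PEEF (word at the /b/ end) has its boundary at a longer VOT
# than BEACE-PEACE (word at the /p/ end).
for vot in range(0, 71, 10):
    beef_peef = p_voiceless(vot, boundary_ms=30.0)
    beace_peace = p_voiceless(vot, boundary_ms=22.0)
    print(f"VOT {vot:2d} ms: P(/p/) = {beef_peef:.2f} vs. {beace_peace:.2f}")
```

On this toy model the two series differ substantially only near the boundary (at 26 ms VOT one function sits below 0.5 and the other above it), while at 0 ms and 70 ms both are effectively at floor and ceiling, mirroring the finding that the lexical effect vanishes for endpoint stimuli.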
The internal structure of phonetic categories is specified by multiple acoustic properties

Discussions of speech perception typically consider two major kinds of variation that complicate the mapping between acoustic signal and phonetic category. One type is variation due to contextual factors, considered above. The other type has to do with the finding that phonetic contrasts are multiply specified. Listeners appear to be exquisitely sensitive to the multiple acoustic consequences that arise from any given articulatory act, and use these multiple
properties to identify a given phonetic segment (Bailey & Summerfield, 1980; but see Stevens & Blumstein, 1981). As in the case of context effects, much of the evidence for the use of multiple properties comes from the examination of category boundaries. As a case in point, consider the distinction between the presence and absence of a stop consonant in "say" (/sei/) versus "stay" (/stei/). Two primary properties underlying this contrast are the duration of the closure interval between the fricative and the vocalic portion of the syllable and the frequency of the first formant (F1) at its onset after closure. A short closure and high F1 onset specify "say", whereas a longer closure and lower F1 onset frequency specify "stay". Evidence that listeners use both properties in identifying "say" versus "stay" comes from trading relation studies, which reveal that manipulation of one variable can produce a shift in the category boundary along a continuum defined by the other variable. For example, Best, Morrongiello, and Robson (1981) created two "say"-"stay" series, in which the "say" versus "stay" distinction along each series was specified by a change in closure duration. The series differed from each other in F1 onset frequency. For each series, listeners identified stimuli with short closures as "say" and those with longer closures as "stay". However, the location of the "say"-"stay" boundary depended on the F1 onset frequency, such that for the series with the higher F1 onset the boundary was shifted toward longer closure durations. In other words, with a higher F1 onset frequency (which favors "say") listeners required more silence (which favors "stay") to hear the presence of the stop consonant. With respect to the nature of categories, the critical question is whether such trading relations are limited to the boundary region or whether properties also trade to define which stimuli are the best exemplars of the category.
We have examined this issue by pairing our goodness procedure with a trading relation paradigm, focusing on the "say"-"stay" distinction described above (Hodgson, 1993; Hodgson & Miller, 1992, in preparation). We began by creating two "say"-"stay" series closely patterned after those of Best et al. (1981). We then extended each series by increasing closure duration to very long values, such that each series ranged from "say" through "stay" to an exaggerated version of "stay" (*"stay"), which sounded like "s...tay". Stimuli from the two series were randomized and presented to listeners for goodness judgements. The main finding was that the best exemplars of "stay" were located at longer closure durations for the high F1 onset series than the low F1 onset series - clear evidence for a within-category trading relation. And we have replicated this basic finding for an intervocalic voicing category, specified by closure duration and preceding vowel duration (Hodgson & Miller, 1991). Note that the sensitivity of internal category structure to multiple properties, like the context-dependent nature of category structure described above, is compatible with either an exemplar-based or a prototype-based representational structure, with a similar caveat: any abstracted
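The within-category trading relation just described can likewise be sketched as a toy goodness model. Everything here is an invented illustration, not a fit to the Hodgson and Miller data: the 1-7 rating scale matches common practice, but the Gaussian shape, the F1 values, and the linear trading rule mapping F1 onset to the best closure duration are assumptions made for the demonstration.

```python
import math

def stay_goodness(closure_ms, f1_onset_hz):
    """Graded goodness of "stay" (1-7 scale) over closure duration.
    A higher F1 onset (favoring "say") demands a longer closure for an
    equally good "stay" - a hypothetical within-category trading rule."""
    best_closure = 80.0 + 0.1 * (f1_onset_hz - 230.0)  # invented trading rule
    return 1.0 + 6.0 * math.exp(-((closure_ms - best_closure) / 60.0) ** 2)

closures = range(0, 301, 20)
peak_low = max(closures, key=lambda c: stay_goodness(c, 230.0))   # low F1 series
peak_high = max(closures, key=lambda c: stay_goodness(c, 430.0))  # high F1 series
print(peak_low, peak_high)  # best exemplar lies at a longer closure for high F1
```

On this toy model the best exemplar of "stay" shifts from an 80 ms to a 100 ms closure as F1 onset rises - the analogue of a shift in the best-exemplar location itself, rather than a mere boundary shift.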
"prototype" must itself not only be context-dependent, but must also be multiply specified (cf. Massaro & Cohen, 1976).2 In three subsequent studies based on our extended "say"-"stay"-*"stay" series, we have examined the limits and robustness of the within-category trading relation and, in so doing, the robustness of category structure itself (Hodgson, 1993; Hodgson & Miller, in preparation). In the first of these studies, we explored the limits of the trading relation by systematically increasing the F1 onset value of the stimuli. We found that as F1 onset frequency becomes higher, the magnitude of the trading effect, measured as the shift in location of the best category exemplar along the series, also becomes larger, but at a price: the structure of the category begins to break down. That is, the rating function for a series with extreme F1 values is depressed, such that listeners no longer give very high ratings to any stimuli within the series - no stimuli are perceived as good exemplars of "stay". It will be of considerable interest to determine whether the F1 onset frequency at which category structure is lost is correlated with the decreasing probability of that F1 onset frequency value occurring in production. In the second study of the series, we used background noise to examine the robustness of category structure, as well as within-category trading relations. Specifically, we conducted our basic goodness experiment with our two original "say"-"stay"-*"stay" series, but presented the stimuli in a background of multitalker babble noise - noise that has been shown previously to produce changes in the way in which acoustic properties contribute to phonetic categorization (Miller & Wayland, 1993; Wardrip-Fruin, 1985). In the present case, however, the noise had no effect: both the within-category structure and the trading relation proved highly resilient, showing no signs of decline even at a relatively poor signal-to-noise ratio (0 dB).
Thus the fine-grained internal structure of phonetic categories is far from fragile. Finally, our third study in the series provides evidence that sinewave replicas of speech, which eliminate the rich harmonic structure of speech but preserve the basic time-varying patterns of the speech signal, can support the perception of graded category structure. Patterning our stimuli after sinewave stimuli used by Best et al. (1981), we created sinewave replicas of our two original extended "say"-"stay"-*"stay" series. We presented the stimuli for "stay" goodness judgements to "speech" listeners, that is, to listeners who heard the sinewave stimuli as speech. Although, as is typical in sinewave speech experiments, listeners gave a variety of response patterns, fully half of the listeners we tested provided highly systematic goodness functions that were remarkably similar to those we obtained for our original "say"-"stay"-*"stay" speech series, showing clear evidence of a within-category trading relation. This finding adds to the weight of evidence that graded category structure is an integral part of the representation of phonetic categories.

2. At first glance, our findings on within-category trading relations appear to disagree with those of Repp (1983), who did not find evidence for a within-category trading relation for the "say"-"stay" contrast (he did not test the intervocalic voicing contrast). However, unlike our goodness judgement task, his discrimination task did not require listeners to attend to the phonetic quality of the stimuli. Thus his listeners could have based their within-category discriminations on auditory information alone and, indeed, Repp assumed that they did just that. Taken together, the studies suggest that within-category trading relations do occur, but only when specifically phonetic processing, as required by our goodness judgement task, is invoked.
Conclusion

The research we have reviewed provides support for the claim that the stimuli within a phonetic category are far from perceptually equivalent. Rather, within-category stimuli vary systematically in category goodness, with the best exemplars of a given category themselves defined in terms of multiply specified, context-dependent properties. And this graded internal category structure is highly robust. It is resistant to noise and it is revealed even when the phonetic quality of the stimulus rests on a highly stylized "caricature" of speech that preserves critical phonetically relevant time-varying properties. These findings indicate that the listener's representation of the phonetic categories of language includes fine-grained detail about phonetic form. A major challenge before us is to determine the role played by this fine-grained knowledge in on-line speech processing and lexical access.
References

Bailey, P.J., & Summerfield, Q. (1980). Information in speech: Observations on the perception of [s]-stop clusters. Journal of Experimental Psychology: Human Perception and Performance, 6, 536-563.
Best, C.T., Morrongiello, B., & Robson, R. (1981). Perceptual equivalence of acoustic cues in speech and nonspeech perception. Perception and Psychophysics, 3, 191-211.
Carney, A.E., Widin, G.P., & Viemeister, N.F. (1977). Noncategorical perception of stop consonants differing in VOT. Journal of the Acoustical Society of America, 62, 961-970.
Crystal, T.H., & House, A.S. (1990). Articulation rate and the duration of syllables and stress groups in connected speech. Journal of the Acoustical Society of America, 88, 101-112.
Davis, K., & Kuhl, P.K. (1992). Best exemplars of English velar stops: A first report. In J.J. Ohala, T.M. Nearey, B.L. Derwing, M.M. Hodge, & G.E. Wiebe (Eds.), Proceedings of the International Conference on Spoken Language Processing (pp. 495-498). Alberta, Canada: University of Alberta.
Diehl, R.L., & Kluender, K.R. (1989). On the objects of speech perception. Ecological Psychology, 1, 121-144.
Fowler, C.A. (1986). An event approach to the study of speech perception from a direct-realist perspective. Journal of Phonetics, 14, 3-28.
Fowler, C.A. (1990). Sound-producing sources as objects of perception: Rate normalization and nonspeech perception. Journal of the Acoustical Society of America, 88, 1236-1249.
Ganong, W.F., III (1980). Phonetic categorization in auditory word perception. Journal of Experimental Psychology: Human Perception and Performance, 6, 110-125.
Gordon, P.C. (1988). Induction of rate-dependent processing by coarse-grained aspects of speech. Perception and Psychophysics, 43, 137-146.
Hodgson, P. (1993). Internal structure of phonetic categories: The role of multiple acoustic properties. Ph.D. thesis, Northeastern University.
Hodgson, P., & Miller, J.L. (1991). The role of multiple acoustic properties in specifying the internal structure of phonetic categories [Abstract]. Journal of the Acoustical Society of America, 89, 1997.
Hodgson, P., & Miller, J.L. (1992). Internal phonetic category structure depends on multiple acoustic properties: Evidence for within-category trading relations [Abstract]. Journal of the Acoustical Society of America, 92, 2464.
Hodgson, P., & Miller, J.L. (in preparation). Internal structure of phonetic categories: Evidence for within-category trading relations.
Jusczyk, P.W. (1986). Speech perception. In K.R. Boff, L. Kaufman, & J.P. Thomas (Eds.), Handbook of perception and human performance (pp. 27/1-27/57). New York: Wiley.
Kuhl, P.K. (1991). Human adults and human infants show a "perceptual magnet effect" for the prototypes of speech categories, monkeys do not. Perception and Psychophysics, 50, 93-107.
Kuhl, P.K. (1993). Innate predispositions and the effects of experience in speech perception: The native language magnet theory. In B. de Boysson-Bardies, S. de Schonen, P. Jusczyk, P. McNeilage, & J. Morton (Eds.), Developmental neurocognition: Speech and face processing in the first year of life (pp. 259-274). Dordrecht: Kluwer.
Kuhl, P.K., Williams, K.A., Lacerda, F., Stevens, K.N., & Lindblom, B. (1992). Linguistic experience alters phonetic perception in infants by 6 months of age. Science, 255, 606-608.
Lahiri, A., & Marslen-Wilson, W. (1991). The mental representation of lexical form: A phonological approach to the recognition lexicon. Cognition, 38, 245-294.
Li, X., & Pastore, R.E. (in press). Evaluation of prototypes and exemplars in perceptual space for place contrast. In M.E.H. Schouten (Ed.), Audition, speech and language. Berlin: Mouton-De Gruyter.
Liberman, A.M., & Mattingly, I.G. (1985). The motor theory of speech perception revised. Cognition, 21, 1-36.
Lisker, L., & Abramson, A.S. (1964). A cross-language study of voicing in initial stops: Acoustical measurements. Word, 20, 384-422.
Lisker, L., & Abramson, A.S. (1970). The voicing dimension: Some experiments in comparative phonetics. In Proceedings of the Sixth International Congress of Phonetic Sciences, Prague, 1967 (pp. 563-567). Prague: Academia.
Massaro, D.W., & Cohen, M.M. (1976). The contribution of fundamental frequency and voice onset time to the /zi/-/si/ distinction. Journal of the Acoustical Society of America, 60, 704-717.
Medin, D.L., & Barsalou, L.W. (1987). Categorization processes and categorical perception. In S. Harnad (Ed.), Categorical perception. New York: Cambridge University Press.
Miller, J.L. (1977). Properties of feature detectors for VOT: The voiceless channel of analysis. Journal of the Acoustical Society of America, 62, 641-648.
Miller, J.L., Connine, C.N., Schermer, T.M., & Kluender, K.R. (1983). A possible auditory basis for internal structure of phonetic categories. Journal of the Acoustical Society of America, 73, 2124-2133.
Miller, J.L., Green, K.P., & Reeves, A. (1986). Speaking rate and segments: A look at the relation between speech production and speech perception for the voicing contrast. Phonetica, 43, 106-115.
Miller, J.L., Grosjean, F., & Lomanto, C. (1984). Articulation rate and its variability in spontaneous speech: A reanalysis and some implications. Phonetica, 41, 215-225.
Miller, J.L., & Liberman, A.M. (1979). Some effects of later-occurring information on the perception of stop consonant and semivowel. Perception and Psychophysics, 25, 457-465.
Miller, J.L., & Volaitis, L.E. (1989). Effect of speaking rate on the perceptual structure of a phonetic category. Perception and Psychophysics, 46, 505-512.
Miller, J.L., & Wayland, S.C. (1993). Limits on the limitations of context conditioned effects in the perception of [b] and [w]. Perception and Psychophysics, 54, 205-210.
Nosofsky, R.M. (1988). Similarity, frequency, and category representations. Journal of Experimental Psychology: Learning, Memory, and Cognition, 14, 54-65.
Oden, G.C., & Massaro, D.W. (1978). Integration of featural information in speech perception. Psychological Review, 85, 172-191.
Perkell, J.S., & Klatt, D.H. (1986). Invariance and variability in speech processes. Hillsdale, NJ: Erlbaum.
Pisoni, D.B. (1990). Effects of talker variability on speech perception: Implications for current research and theory. In H. Fujisaki (Ed.), Proceedings of the International Conference on Spoken Language Processing. Kobe, Japan.
Pisoni, D.B., Carrell, T.D., & Gans, S.J. (1983). Perception of the duration of rapid spectrum changes in speech and nonspeech signals. Perception and Psychophysics, 34, 314-322.
Pisoni, D.B., & Luce, P.A. (1987). Acoustic-phonetic representations in word recognition. Cognition, 25, 21-52.
Pisoni, D.B., & Tash, J. (1974). Reaction times to comparisons within and across phonetic categories. Perception and Psychophysics, 15, 285-290.
Pitt, M.A., & Samuel, A.G. (1993). An empirical and meta-analytic evaluation of the phoneme identification task. Journal of Experimental Psychology: Human Perception and Performance, 19, 699-725.
Port, R.F., & Dalby, J. (1982). Consonant/vowel ratio as a cue for voicing in English. Perception and Psychophysics, 32, 141-152.
Repp, B.H. (1977). Dichotic competition of speech sounds: The role of acoustic stimulus structure. Journal of the Acoustical Society of America, 3, 37-50.
Repp, B.H. (1983). Trading relations among acoustic cues in speech perception are largely a result of phonetic categorization. Speech Communication, 2, 341-361.
Repp, B.H., & Liberman, A.M. (1987). Phonetic category boundaries are flexible. In S. Harnad (Ed.), Categorical perception. New York: Cambridge University Press.
Rosch, E. (1978). Principles of categorization. In E. Rosch & B.B. Lloyd (Eds.), Cognition and categorization. Hillsdale, NJ: Erlbaum.
Samuel, A.G. (1982). Phonetic prototypes. Perception and Psychophysics, 31, 307-314.
Stevens, K.N., & Blumstein, S.E. (1981). The search for invariant acoustic correlates of phonetic features. In P.D. Eimas and J.L. Miller (Eds.), Perspectives on the study of speech (pp. 1-38). Hillsdale, NJ: Erlbaum.
Studdert-Kennedy, M. (1980). Speech perception. Language and Speech, 23, 45-66.
Studdert-Kennedy, M., Liberman, A.M., Harris, K.S., & Cooper, F.S. (1970). Motor theory of speech perception: A reply to Lane's critical review. Psychological Review, 77, 234-249.
Summerfield, Q. (1975). Aerodynamics versus mechanics in the control of voicing onset in consonant-vowel syllables. Speech Perception: Series 2, Number 4, Spring. Department of Psychology, The Queen's University of Belfast.
Summerfield, Q. (1981). Articulatory rate and perceptual constancy in phonetic perception. Journal of Experimental Psychology: Human Perception and Performance, 7, 1074-1095.
van Hessen, A.J., & Schouten, M.E.H. (1992). Modeling phoneme perception. II: A model of stop consonant discrimination. Journal of the Acoustical Society of America, 92, 1856-1868.
Volaitis, L.E. (1990). Some context effects in the production and perception of stop consonants. Ph.D. thesis, Northeastern University.
Volaitis, L.E., & Miller, J.L. (1991). Influence of a syllable's form on the perceived internal structure of voicing categories [Abstract]. Journal of the Acoustical Society of America, 89, 1998.
Volaitis, L.E., & Miller, J.L. (1992). Phonetic prototypes: Influence of place of articulation and speaking rate on the internal structure of voicing categories. Journal of the Acoustical Society of America, 92, 723-735.
Wardrip-Fruin, C. (1985). The effect of signal degradation on the status of cues to voicing in utterance-final stop consonants. Journal of the Acoustical Society of America, 77, 1907-1912.
Wayland, S.C., & Miller, J.L. (1992). Influence of internal phonetic category structure in on-line speech processing. Paper presented at the 33rd annual meeting of the Psychonomic Society, St Louis, 1992.
Wayland, S.C., Miller, J.L., & Volaitis, L.E. (in press). The influence of sentential speaking rate on the internal structure of phonetic categories. Journal of the Acoustical Society of America.
Weismer, G. (1979). Sensitivity of voice-onset-time (VOT) measures to certain segmental features in speech production. Journal of Phonetics, 7, 197-204.
18 Perception and awareness in phonological processing: the case of the phoneme

José Morais*, Régine Kolinsky

Laboratoire de Psychologie expérimentale, Université Libre de Bruxelles, Av. Ad. Buyl 117, B-1050 Bruxelles, Belgium
Abstract

The necessity of a "levels-of-processing" approach in the study of mental representations is illustrated by the work on the psychological reality of the phoneme. On the basis of both experimental studies of human behavior and functional imaging data, it is argued that there are unconscious representations of phonemes in addition to conscious ones. These two sets of mental representations are functionally distinct: the former intervene in speech perception and (presumably) production; the latter are developed in the context of learning alphabetic literacy for both reading and writing purposes. Moreover, among phonological units and properties, phonemes may be the only ones to present a neural dissociation at the macro-anatomic level. Finally, it is argued that even if the representations used in speech perception and those used in assembling and in conscious operations are distinct, they may entertain dependency relations.
Cognitive psychology is concerned with what information is represented mentally and how it is represented. In these twenty years or so of Cognition's life the issue of where, that is, at what levels of processing, particular types of information are represented has become increasingly compelling. This issue is crucial both to track the mental itinerary of information and to draw a correct picture of mental structure. However, the "where" question may be even more difficult to answer than the "what" and "how" ones. In spite of the tremendous development of functional imaging technology, we are still unable to follow on a computer screen the multiple recodings of information accomplished in the brain. Thus, the experimental study of human behavior remains, up to now, the most powerful approach to the mind's microstructure. Sadly, we all know that what we register are intentional responses given at the request of the experimenter, so that the evidence arising from an experiment may be difficult to attribute to a particular stage of processing. No reader of Cognition doubts that he or she can represent phonemes mentally. Characters coming off the press or from the writer's hand are costumed phonemes, or at least may be described as such. But at how many processing levels, and how deeply, do phonemes live in our minds? We take the phoneme issue as a good illustration both of the necessity of pursuing a "levels-of-processing" inquiry in the study of mental representations and of the misunderstandings and pitfalls this difficult study may be confronted with. In the seventh volume of this journal, our group demonstrated (at least we believe so) that the notion that speech can be represented as a sequence of phonemes does not arise as a natural consequence of cognitive maturation and informal linguistic experience (Morais, Cary, Alegria, & Bertelson, 1979). This claim was based on the discovery that illiterate adults are unable to manipulate phonemes intentionally, as evidenced by their inability to delete "p" from "purso" or add "p" to "urso".

*Corresponding author. Fax 32 2 6502209, e-mail [email protected]
The authors' work discussed in the present paper was supported by the Human Frontier Science Program (project entitled Processing consequences of contrasting language phonologies) as well as the Belgian Fonds National de la Recherche Scientifique (FNRS)-Loterie Nationale (convention nos. 8.4527.90 and 8.4505.93) and the Belgian Ministère de l'Éducation de la Communauté française ("Action de Recherche concertée" entitled Le traitement du langage dans différentes modalités: approches comparatives). The second author is Research Associate of the Belgian FNRS. Special thanks are due to all our collaborators, and in particular to Mireille Cluytens.
SSDI 0010-0277(93)00601-3
In a subsequent volume of this journal, under the guest editorship of Paul Bertelson, it was reported that Chinese non-alphabetic readers share with illiterates the lack of phonemic awareness (Read, Zhang, Nie, & Ding, 1986) and that the metaphonological failure of illiterates seems to be restricted to the phoneme, since they can both manipulate syllables and appreciate rhyme (Morais, Bertelson, Cary, & Alegria, 1986). Later on, we observed that illiterates can compare short utterances for phonological length (Kolinsky, Cary, & Morais, 1987), which again suggests that conscious access to global phonological properties of speech utterances does not depend on literacy. We were happy that Cognition's reviewers had understood the interest of our 1979 paper. As a matter of fact, we had submitted an earlier version of it to another journal, which rejected it on the basis of the comments of one reviewer who could not believe our results given that, as Peter Eimas and others had shown (e.g., Eimas, Siqueland, Jusczyk, & Vigorito, 1971), American babies can perceive "phonemic" distinctions, such as those between "ba", "da" and "ga", fairly well. We are not complaining about reviewers - almost every paper by the present authors has been greatly improved following reviewers' criticisms - and, sincerely, we are almost grateful to the anonymous reviewer and presumably distinguished scholar who confounded perception and awareness. It was probably his or her reviewing that led us both to write: "the fact that illiterates are not aware of the phonetic
structure of speech does not imply, of course, that they do not use segmenting routines at this level when they listen to speech" (Morais et al., 1979, p. 330), and to conclude the paper stressing the need "to distinguish between the prevalence of such or such a unit in segmenting routines at an unconscious level and the ease of access to the same units at a conscious, metalinguistic level" (p. 331). In the following years, we progressively realized that, as far as our own work was concerned, the battle to distinguish between perceptual and post-perceptual representations had just begun. The work with illiterates and with non-alphabetic readers has contributed to nourish, if not to raise, the suspicion that the phoneme could be, after all, and despite our familiarity with it, a simple product of knowing an alphabet. Warren (1983) rightly called one's attention to the danger of introspection in this domain: "Our exposure to alphabetic writing since early childhood may encourage us to accept the analysis of speech into a sequence of sounds as simply the recognition of a fact of nature" (p. 287). However, he has erroneously taken observations from the conscious awareness level as evidence of perceptual reality or nonreality. In the same paper, he listed the "experimental evidence that phonemes are not perceptual units" (our italics). In this list, the fact that "illiterate adults cannot segment phonetically" (p. 289) appears at the top. Some linguists have reached the same conclusion as far as the role of phonemes in the formal description of phonology is concerned. Kaye (1989), for instance, announces "the death of the phoneme" (head of a section, p. 149), in the context of an attempt to demonstrate that "a phonology based on non-linear, multileveled representations is incompatible with the notion of a phoneme" (pp. 153-154). Is the phoneme dead? Did it ever exist otherwise than in the conscious thoughts of alphabetically literate minds?
Are phonemes the make-up of letters rather than letters the make-up of phonemes? Like Orfeo, we have to face the illusions that constantly assault the visitors of perception. One may use perceptual illusions to fight against experimenters' illusions. Fodor and Pylyshyn (1981) have convincingly argued for "the centrality, in perceptual psychology, of experiments which turn on the creation of perceptual illusions" (p. 161). Besides its ability to demonstrate the direction of causality between two correlated states, the production of an illusion implies that the perceiver has no full conscious control of the informational content of the illusion. Thus, information that is not consciously represented may, if it is represented at an unconscious perceptual level, influence the misperception. By looking at the informational content of the illusion, and having enough reasons to believe that part of this information cannot come from conscious representations, one is allowed to locate the representation of that part of the information in the unconscious perceptual system. Following the logic of illusory conjunctions (cf. Treisman & Schmidt, 1982), we took advantage of the dichotic listening technique to elicit word illusions which
should result from the erroneous combination of parts of information presented to one ear with parts of information presented to the opposite ear (see Kolinsky, 1992, and Kolinsky, Morais, & Cluytens, in press, for detailed description of the methodology). If speech attributes can be wrongly combined, they must have been separately registered as independent units at some earlier stage of processing. In our situation, since the subject is asked either to detect a word target previously specified or to identify the word presented in one particular ear, his or her attention is not drawn to any word constituent. This experimental situation can thus be used to test illiterate as well as literate people. Recent data that we have obtained on Portuguese-speaking literate subjects (either European or Brazilian) indicate that the initial consonant of CVCV utterances is the attribute that "migrates" the most, compared to migrations of the syllable and of either the voicing or the place of articulation of the initial consonant (Kolinsky & Morais, 1993). Subsequent testing of Portuguese-speaking illiterate subjects, again from both Portugal and Brazil, yielded the same pattern of results (Morais, Kolinsky, & Paiva, unpublished). This means that, at least for Portuguese, consonants have psychological reality at the perceptual level of processing, and that the role of consonants in speech perception can be demonstrated in a population that is unable to represent them consciously. Phonemes are not a mere product of alphabetic literacy, at least - we insist - in Portuguese native speakers. The very same populations which allowed us to show that conscious representations of phonemes are prompted by the learning of alphabetic literacy also provide a clear suggestion that unconscious perceptual representations of phonemes can develop prior to the onset of literacy. Portuguese and Brazilian subjects possess unconscious representations of phonemes without possessing conscious ones.
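The combinatorial logic behind the migration paradigm can be sketched in a few lines. This is a purely illustrative toy, not the authors' procedure: the stimulus pair, the restriction to initial-consonant migration, and the function name are invented for the example, and real percepts depend on phonetic features such as voicing and place rather than on orthographic splicing.

```python
def migration_percepts(left_ear, right_ear):
    """Illusory blends produced if the initial consonant "migrates" across ears.
    Inputs are CVCV items written orthographically; each blend pairs the onset
    of one ear's item with the remainder of the other ear's item."""
    return {right_ear[0] + left_ear[1:], left_ear[0] + right_ear[1:]}

# Hypothetical dichotic pair: blends of this kind, reported as confident
# identifications, would show that the initial consonant was registered as
# a separable unit at some earlier stage of processing.
print(sorted(migration_percepts("pato", "bola")))  # → ['bato', 'pola']
```

Because the task only asks the subject to detect or identify a whole word, such blend responses can be collected from illiterate and literate listeners alike, without ever directing attention to any word constituent.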
The reverse picture can also be found. Testing French native speakers on French material, we found low rates of initial consonant migration (Kolinsky, 1992; Kolinsky et al., in press). At the particular stage of processing tapped by the attribute migration phenomenon, French-speaking listeners do not seem to represent consonants (the possibility that consonants are represented at other perceptual stages cannot be excluded), or at least they do not do so as much as Portuguese-speaking listeners do. Yet, given that they were university students, we may be confident that they possess conscious representations of phonemes. This double dissociation supports the idea that conscious and unconscious representations of phonemes form two functionally distinct sets of mental representations. However, distinctiveness does not necessarily mean independence. Two functionally distinct systems may interact with each other. Thus, it is important to address both issues. Phonological processing intervenes in different functions. It intervenes in perceiving and producing speech, in reading and writing, and in the different forms of metaphonological behavior.
The perception and production of speech require distinct systems. Indeed, production can be affected by a lesion in Broca's area while leaving recognition of auditory words intact, and the reverse may be true for a lesion in Wernicke's area. Thus, separate representations are needed for input and output processing. As far as the independence issue is concerned, very little interference with reading words aloud was observed when subjects had to monitor, at the same time, for a target word in a list of auditorily presented words (Shallice, McLeod, & Lewis, 1985). Both Shallice et al.'s finding and the fact that auditory word input does not activate the area specifically activated by word repetition and word reading aloud, as shown by positron emission tomographic (PET) imagery (Petersen, Fox, Posner, Mintun, & Raichle, 1989), suggest that processing in the input system might be quite independent from processing in the output system. The two input functions, that is, the recognition of written and spoken words, also use distinct phonological systems. Moreover, written word pronunciation does not activate the region of Wernicke's area that is activated during word repetition (Howard et al., 1992). Since behavioral studies have provided evidence of automatic activation of phonological representations of word constituents, including phonemes, during orthographic processing (Ferrand & Grainger, 1992; Perfetti & Bell, 1991; Perfetti, Bell, & Delaney, 1988), these phonological representations would appear to be distinct from, and to a large extent independent of, those involved in spoken word recognition. However, let us note that the automatic phonological activation during skilled word reading probably has very little in common, except from a developmental point of view, with the intentional assignment of phonological values to letters and groups of letters that is mostly used in the reading of illegal sequences or of long and phonologically complex pseudowords.
We have argued above that conscious representations of phonemes are distinct from the unconscious representations of phonemes used in spoken word recognition; we may now consider whether they are also distinct from the representations of phonemes that are intentionally activated in reading. Is phonological dyslexia, which is characterized above all by a highly selective impairment in reading pseudowords and nonwords, concomitant with a severe deficiency in manipulating phonemes consciously? Based on the phonemic manipulations that a patient diagnosed as a phonological dyslexic was able to perform, Bisiacchi, Cipolotti, and Denes (1989) suggested that the representations involved in pseudoword reading and in phonemic manipulations are independent in skilled readers. However, the data are not convincing. In fact, the patient's impairment in pseudoword reading was slight: she knew the phonemic values of all the letters of the alphabet, and she could read and write a very large number of meaningless syllables (cf. discussion in Morais, 1993). More recently, Patterson and Marcel (1992) suggested that the nonword
J. Morais, R. Kolinsky
reading deficit "may be just one symptom of a more general disruption to phonological processing" (p. 259). They presented the results of six phonological dyslexics, all displaying a severe deficit in nonword reading, on both the intentional segmentation and the assembling of phonemes. An interesting dissociation was observed between these two tasks: all subjects were exceedingly poor at assembling three phonemes, but two of them could delete the initial consonant of a short utterance on about 80% of the trials. The nonword reading deficit of these patients might thus be due mainly to a deficiency in assembling. At least for the two patients who could delete initial phonemes, there might be no deficiency in accessing phonology from orthography; unfortunately, the authors give no indication of the patients' knowledge of grapheme-phoneme correspondences. In collaboration with Philippe Mousty, the first author has tested three phonological dyslexics (more exactly, two of them were deep dyslexics) and one surface dyslexic on different metaphonological tests (see Morais, 1993). The surface dyslexic (J.S.), who displayed no effect of lexicality in either reading or writing, but a large effect of regularity, was only slightly impaired in his conscious phonemic abilities (in reading, too, there were a number of single-consonant confusions, suggesting a slight impairment in his grapho-phonological conversion procedure, besides his impairment in the addressed procedure). All three phonological dyslexics (V.D., P.R. and R.V.) were extremely poor at nonword reading, and none of them displayed a regularity effect. Interestingly, all were also extremely poor at the phonemic tests, performing around chance level; two of them performed much better on rhyming judgement and on tests requiring a conscious analysis of utterances into syllables than on the phonemic tests.
Thus, when the reading deficit spares the mechanism of phonological assembling, phonemic awareness is still present; but in those cases where phonological assembling is dramatically damaged, phonemic awareness is not observed, the patients behaving in the metaphonological tests like illiterate people. Re-education of the assembling procedure reinstates phonemic awareness. P.S., the deep dyslexic whose re-education was described by de Partz (1986) and who, immediately before re-education, was unable to associate letters with phonemes, attained a very high level of performance on both pseudoword reading and phonemic analysis when we tested him a few years later. The assembling procedure in reading (which, we repeat, may not be the only mechanism involving phonological representations in word reading) thus seems to depend on the same phoneme representations that are evoked for the purpose of intentional, conscious manipulations of phonemes. More recently, we were able to test J.S., V.D., P.R. and one further phonological dyslexic (S.A.) on the dichotic speech test we designed to induce attribute migration errors (unpublished data). Their results were compared with those of control subjects of the same age and educational level. We wanted to
know whether people phonologically impaired both in conscious phonemic analysis and in phonological assembling in reading would show the same pattern of attribute migration in speech recognition as normal listeners. The results were very clear. J.S., the surface dyslexic, obtained an overall correct detection score as poor as the phonological dyslexics', but he was the only patient who showed the normal pattern of migrations for French, that is, a high rate of migrations for the syllable, followed by moderate rates for the first vowel and the voicing of the initial consonant, and a low rate for the initial consonant. All the phonological dyslexics failed to obtain migrations for syllables, and their migration rates for the initial consonant were even lower than those obtained by the normals. It should be noted that, with the exception of J.S., who was good at repeating both words and nonwords, all the other patients displayed good word repetition but relatively poor nonword repetition. Among the phonological dyslexics, V.D. and P.R. had been diagnosed as Broca's aphasics, but S.A. as a Wernicke's aphasic, thus precluding a clear association with one type of aphasia. What are we allowed to infer from these correlational data? The distinctiveness of conscious and unconscious representations of phonemes cannot be questioned, given that it was clearly supported by the dissociation observed in illiterate people. Thus, there are two ways to interpret the impairments observed in our phonological patients. One interpretation is that the cerebral damage they had undergone was wide enough to affect two relatively localized systems of representation, that is, the representations used in speech perception and those used in assembling and in conscious operations. The alternative is that these two systems of representation, though distinct, entertain dependency relations with each other.
We lack neuroanatomical data about our patients precise enough to match the damaged areas with the types of deficit. However, it may be useful to inspect the literature to evaluate how distant the areas supporting conscious and unconscious phonological representations could be from each other, if they are not coincident. Recently, Zatorre, Evans, Meyer, and Gjedde (1992) reported that phonetic decoding is accomplished in a part of Broca's area near the junction with the premotor cortex. The evidence comes from an increase of activation, measured with PET, in that area in a task requiring subjects to decide whether or not two syllables ended with the same consonant, in comparison with passive listening to the same speech material. Yet the task used involves much more than phonetic decoding. It is a rather sophisticated metaphonological task, one which illiterates would be unable to perform. Thus, the true implication of Zatorre et al.'s finding is that a part of Broca's area is involved in conscious phonemic analysis. Phonetic decoding is obligatorily and automatically triggered whenever people hear speech stimuli; thus, it occurs even under passive listening. Activation of temporoparietal structures posterior to the sylvian fissure occurs during passive
listening, whereas frontal activation anterior to this fissure occurs for articulation, as shown by other neuroimaging studies (cf. Petersen et al., 1989). As Petersen et al. comment, "the activation of the left temporoparietal focus during passive auditory word presentation, but not for auditory clicks or tones, makes this area a good candidate for phonological encoding of words" (p. 163). Since, on the other hand, the neural circuits that subserve articulation appear to host phonemic awareness processes in people who know an alphabet, it seems that conscious and unconscious representations of phonemes rely on different (though, as could be expected, relatively close) brain areas. It would be interesting to assess whether or not activation of an additional area is obtained with a discrimination task, that is, one which requires the subject to decide whether two speech stimuli are the same or different, in comparison with the passive listening situation. The discrimination task implies an intentional judgement (in this sense, it is metaphonological), but it requires only the global matching of two conscious percepts (in this last sense, it involves recognition, and it may therefore be much closer to perception). Our expectation, however, is that the activation elicited by speech discrimination would be similar to that observed in passive listening. This prediction is based on the fact that, as reported by Petersen et al. (1989), a rhyming task using visual input, thus a metaphonological but non-analytical task, implicated the temporoparietal cortex. Worth noting also is a case of word deafness in a patient with a left temporoparietal infarct, involving most of the superior temporal gyrus, and with a subsequent right-hemisphere infarct involving again the superior temporal gyrus (Praamstra, Hagoort, Maassen, & Crul, 1991).
The second stroke caused a specific deficit in an auditory lexical decision task, suggesting that auditory processes dependent on the previously intact right hemisphere had for some time compensated for the left-hemisphere damage. Comparison of discrimination and identification functions for vowels and consonants suggests that there was a phonetic deficit, presumably prior to the second stroke. Interestingly, despite the auditory and phonetic impairment, the patient was able, even after the second stroke, to perform a (in some sense) metaphonological task requiring him to judge whether or not two disyllabic words began (or ended) with the same syllable. Very precise phonetic decoding was probably not necessary in this syllable-matching task, so that the task may have been accomplished on the basis of the residual auditory and/or phonetic capacities. The task is formally similar to the one used by Zatorre et al. (1992), but it concerns syllables rather than phonemes. We predict that it would not yield the anterior activation that Zatorre et al. found. The neuroanatomical and neuropsychological data available up to now do not indicate a consistent dissociation between phonological and metaphonological representations as unitary ensembles. Among phonological units and properties, phonemes may be the only ones to present such a dissociation. The conscious
representations of phonemes appear to be selectively dissociated, on a neural basis, from the processes of phonetic decoding. Our results with illiterates, using the migration phenomenon on the one hand and the conscious phonemic manipulation tasks on the other, are consistent with the neural data. What functional dependency relations might the conscious and unconscious representations of phonemes entertain with each other? From unconscious to conscious representations, the dependency relation may be trivial. In languages in which perceptual processing at the phonemic level may be crucial for the quality of the global conscious speech percept, the activation of unconscious representations ultimately constrains the elaboration of conscious representations. The reverse dependency relation is theoretically more interesting. As we discuss elsewhere (Morais & Kolinsky, in press), the acquisition of phonemic awareness may elicit supplementary and perhaps more efficient procedures for coping with spoken words. We found evidence for a (sometimes useful) strategy of listening based on attention to the phonemic structure of words in the dichotic listening situation, and even when the stimuli are simply presented against a noise background (Castro, 1992; Castro & Morais, in preparation; Morais, Castro, Scliar-Cabral, Kolinsky, & Content, 1987). Other indications that attentional focusing on phonemes may lead to improved word recognition include Nusbaum, Walley, Carrell, and Ressler's (1982) observation that listeners may avoid the illusion of phoneme restoration by focusing attention on the critical phoneme. However, the stage of processing at which these influences occur remains an open question.
Remember that orthographic representations may also influence rhyming judgements on spoken words (Seidenberg & Tanenhaus, 1979), the detection of phonemes (Taft & Hambly, 1985), and the occurrence of phonological fusions in dichotic listening (see Morais, Castro, & Kolinsky, 1991). However, written word input does not seem to activate the areas devoted to the perceptual processing of spoken words. The effects of strategies based on conscious phonemic representations, as well as orthographic effects, may take place between perception and recognition - a land that remains unexplored and is perhaps unexplorable with our present techniques. To conclude, in spite of attempted murder, the phoneme is still alive. It seems that it is not a mere convention. It appears to have a perceptual as well as a postperceptual reality, and therefore it deserves further and more systematic exploration.
References

Bisiacchi, P.S., Cipolotti, L., & Denes, G. (1989). Impairment in processing meaningless verbal material in several modalities: The relationship between short-term memory and phonological skills. Quarterly Journal of Experimental Psychology, 41A, 293-319.
Castro, S.-L. (1992). Alfabetização da fala. Porto: Instituto Nacional de Investigação Científica.
Castro, S.-L., & Morais, J. (in preparation). Evidence for global and analytical strategies in spoken word recognition.
de Partz, M.P. (1986). Re-education of a deep dyslexic patient: Rationale of the method and results. Cognitive Neuropsychology, 3, 149-177.
Eimas, P., Siqueland, E.R., Jusczyk, P.W., & Vigorito, J. (1971). Speech perception in infants. Science, 171, 303-306.
Ferrand, L., & Grainger, J. (1992). Phonology and orthography in visual word recognition: Evidence from masked nonword priming. Quarterly Journal of Experimental Psychology, 45A, 353-372.
Fodor, J., & Pylyshyn, Z. (1981). How direct is visual perception? Some reflections on Gibson's "Ecological Approach". Cognition, 9, 139-196.
Howard, D., Patterson, K., Wise, R., Brown, W.D., Friston, K., Weiller, C., & Frackowiak, R. (1992). The cortical localization of the lexicons. Brain, 115, 1769-1782.
Kaye, J. (1989). Phonology: A cognitive view. Hillsdale, NJ: Erlbaum.
Kolinsky, R. (1992). Conjunction errors as a tool for the study of perceptual processes. In J. Alegria, D. Holender, J. Junça de Morais, & M. Radeau (Eds.), Analytic approaches to human cognition (pp. 133-149). Amsterdam: North-Holland.
Kolinsky, R., Cary, L., & Morais, J. (1987). Awareness of words as phonological entities: The role of literacy. Applied Psycholinguistics, 8, 223-232.
Kolinsky, R., & Morais, J. (1993). Intermediate representations in spoken word recognition: A cross-linguistic study of word illusions. Proceedings of the 3rd European Conference on Speech Communication and Technology: Eurospeech '93 (pp. 731-734). Berlin.
Kolinsky, R., Morais, J., & Cluytens, M. (in press). Intermediate representations in spoken word recognition: Evidence from word illusion. Journal of Memory and Language.
Morais, J. (1993). Phonemic awareness, language and literacy. In R.M. Joshi & C.K. Leong (Eds.), Reading disabilities: Diagnosis and component processes (pp. 175-184). Dordrecht: Kluwer.
Morais, J., Bertelson, P., Cary, L., & Alegria, J. (1986). Literacy training and speech segmentation. Cognition, 24, 45-64.
Morais, J., Cary, L., Alegria, J., & Bertelson, P. (1979). Does awareness of speech as a sequence of phones arise spontaneously? Cognition, 7, 323-331.
Morais, J., Castro, S.L., & Kolinsky, R. (1991). La reconnaissance des mots chez les adultes illettrés. In R. Kolinsky, J. Morais, & J. Segui (Eds.), La reconnaissance des mots dans les différentes modalités sensorielles: Etudes de psycholinguistique cognitive (pp. 59-80). Paris: Presses Universitaires de France.
Morais, J., Castro, S.L., Scliar-Cabral, L., Kolinsky, R., & Content, A. (1987). The effects of literacy on the recognition of dichotic words. Quarterly Journal of Experimental Psychology, 39A, 451-465.
Morais, J., & Kolinsky, R. (in press). The consequences of phonemic awareness. In B. de Gelder & J. Morais (Eds.), Speech and reading: Comparative approaches. London: Erlbaum.
Morais, J., Kolinsky, R., & Paiva, M. The phoneme's perceptual reality exhumed: Studies in Portuguese. Unpublished manuscript.
Nusbaum, H.C., Walley, A.C., Carrell, T.D., & Ressler, W.H. (1982). Controlled perceptual strategies in phonemic restoration. Research on Speech Perception Progress Report (Vol. 8, pp. 83-103). Bloomington, IN: Department of Psychology, Indiana University.
Patterson, K., & Marcel, A. (1992). Phonological ALEXIA or PHONOLOGICAL alexia? In J. Alegria, D. Holender, J. Junça de Morais, & M. Radeau (Eds.), Analytic approaches to human cognition (pp. 133-149). Amsterdam: North-Holland.
Perfetti, C.A., & Bell, L. (1991). Phonetic activation during the first 40 ms of word identification: Evidence from backward masking and priming. Journal of Memory and Language, 30, 473-485.
Perfetti, C.A., Bell, L.C., & Delaney, S.M. (1988). Automatic (prelexical) phonemic activation in silent word reading: Evidence from backward masking. Journal of Memory and Language, 27, 59-70.
Petersen, S.E., Fox, P.T., Posner, M.I., Mintun, M.A., & Raichle, M.E. (1989). Positron emission tomographic studies of the processing of single words. Journal of Cognitive Neuroscience, 1, 153-170.
Praamstra, P., Hagoort, P., Maassen, B., & Crul, T. (1991). Word deafness and auditory cortical function. Brain, 114, 1197-1225.
Read, C., Zhang, Y., Nie, H., & Ding, B. (1986). The ability to manipulate speech sounds depends on knowing alphabetic writing. Cognition, 24, 31-44.
Seidenberg, M.S., & Tanenhaus, M.K. (1979). Orthographic effects on rhyme monitoring. Journal of Experimental Psychology: Human Learning and Memory, 5, 546-554.
Shallice, T., McLeod, P., & Lewis, K. (1985). Isolating cognitive modules with the dual-task paradigm: Are speech perception and production separate processes? Quarterly Journal of Experimental Psychology, 37A, 507-532.
Taft, M., & Hambly, G. (1985). The influences of orthography on phonological representations in the lexicon. Journal of Memory and Language, 24, 320-335.
Treisman, A., & Schmidt, H. (1982). Illusory conjunctions in the perception of objects. Cognitive Psychology, 14, 107-141.
Warren, R.M. (1983). Multiple meanings of "phoneme" (articulatory, acoustic, perceptual, graphemic) and their confusions. In N.J. Lass (Ed.), Speech and language: Advances in basic research and practice (Vol. 9, pp. 285-311). New York: Academic Press.
Zatorre, R.J., Evans, A.C., Meyer, E., & Gjedde, A. (1992). Lateralization of phonetic and pitch discrimination in speech processing. Science, 256, 846-849.
19 Ever since language and learning: afterthoughts on the Piaget-Chomsky debate

Massimo Piattelli-Palmarini*
Dipartimento di Scienze Cognitive, Istituto San Raffaele, Via Olgettina 58, Milano 20132, Italy
Center for Cognitive Science, MIT, Cambridge MA 02139, USA
Abstract

The central arguments and counter-arguments presented by several participants during the debate between Piaget and Chomsky at the Royaumont Abbey in October 1975 are here reconstructed in a particularly concise chronological and "logical" sequence. Once the essential points of this important exchange are thus clearly laid out, it is easy to see that recent developments in generative grammar, as well as new data on language acquisition, especially on the acquisition of pronouns by the congenitally deaf child, corroborate the "language specificity" thesis defended by Chomsky. By the same token, these data and these new theoretical refinements refute the Piagetian hypothesis that language is constructed upon abstractions from sensorimotor schemata. Moreover, in the light of modern evolutionary theory, Piaget's basic assumptions on the biological roots of cognition, language and learning turn out to be unfounded. In hindsight, all this accrues to the validity of Fodor's seemingly "paradoxical" argument against "learning" as a transition from "less" powerful to "more" powerful conceptual systems.

Correspondence to: Massimo Piattelli-Palmarini, Dipartimento di Scienze Cognitive, Istituto San Raffaele, Via Olgettina 58, Milano, 20132, Italy. I am indebted to Thomas Roeper for his invitation to give a talk on the Piaget-Chomsky debate to the undergraduates in linguistics and psychology at the University of Massachusetts at Amherst, in April 1989. The idea of transforming it into a paper came from the good feedback I received during that talk, and from a suggestion by my friend and colleague Paul Horwich, a philosopher of science, who had attended. Steven Pinker reinforced that suggestion, assuming that such a paper could be of some use also to the undergraduates at MIT. Noam Chomsky carefully read the first draft, and made many useful suggestions in the letter from which I have quoted some passages here. Paul Horwich, Morris Halle and David Pesetsky also offered valuable comments and critiques. Jerry Fodor stressed the distance that has opened in the meantime between his present position and Chomsky's, inducing me to revise sections of the first draft (perhaps the revisions are not as extensive as he would have liked). The ideas expressed here owe a lot to a lot of people, and it shows. I wish to single out, however, my special indebtedness to Noam Chomsky, Jerry Fodor, Jacques Mehler, Jim Higginbotham, Luigi Rizzi, Ken Wexler, Laura-Ann Petitto, Lila Gleitman, Steve Gould and Dick Lewontin. The work I have done during these years has been generously supported by the Alfred P. Sloan Foundation, the Kapor Family Foundation, the MIT Center for Cognitive Science, Olivetti Italy and the Cognitive Science Society. I am especially indebted to Eric Wanner for initial funding.
1. Introduction

This issue of Cognition offers a rare and most welcome invitation to rethink the whole field in depth, and in perspective. A fresh reassessment of the important Royaumont debate (October 1975) between Piaget and Chomsky may be of interest in this context. After all, the book has by now been published in ten languages, and it has been stated (Gardner, 1980) that the debate is "certainly a strong contender... as the initial milestone in the emergence of this field" (i.e., cognitive science). It is not for the co-organizer of that meeting (with Jacques Monod), or for the editor of the proceedings (Piattelli-Palmarini, 1980), to say how strong the contender is. It is a fact, however, that many of us have witnessed over the years many impromptu re-enactments of arguments and counter-arguments presented in that debate, and that if one still wants to raise today the same kind of objections to the central ideas of generative grammar as Piaget, Cellerier, Papert, Inhelder, and Putnam raised at the time, one cannot possibly do a better job than the one they did. Moreover, the most effective counters to those objections are still basically the same ones that Chomsky and Fodor offered at Royaumont. That debate also foreshadowed, for reasons that I shall come back to, much of the later debate on the foundations of connectionism (Pinker & Mehler, 1988; Fodor & Pylyshyn, 1988). What I will attempt to do here is support the Chomsky-Fodor line with further evidence that has become available in the meantime. In fact, as time goes by, it is increasingly clear that the pendulum is presently swinging towards the innatist research program in linguistics presented at Royaumont by Chomsky (and endorsed by Mehler with data on acquisition), and away from even the basic, and allegedly most "innocent", assumptions of the constructivist Piagetian program.
Lifting, at long last, the self-imposed neutrality I considered it my duty to adopt while editing the book, I say here explicitly, and at times forcefully, what I studiously avoided saying there and then. I also wish to highlight some recent developments in linguistics and language acquisition that bear clear consequences for the main issues raised during the debate.
2. The debates within the debate

In hindsight, it is important to realize that there were at least four distinct Royaumont debates eventually collapsing into one, a bit like a swarm of virtual
particles collapsing into a single visible track in modern high-energy laboratories: the event that actually happened; the one which we, the organizers, thought would happen; the one Jean Piaget hoped would happen; and the one that Chomsky urged everyone not to let happen. Let me digress for a moment and sketch these other "virtual" debates. Piaget assumed that he and Chomsky were bound to agree on all important matters. It was his original wording that there had to be a "compromis" between him and Chomsky. In fact, this term recurs throughout the debate. During the preparatory phase, Piaget made it clear that it had been his long-standing desire to meet with Chomsky at great length, and witness the "inevitable" convergence of their respective views. As Piaget states in his "invitation" paper,1 he thought there were powerful reasons supporting his assumption. I will outline these reasons in a simple sketch.

Reasons for the "compromise": Piaget's assessment of the main points of convergence between him and Chomsky
- Anti-empiricism (in particular anti-behaviorism)
- Rationalism and uncompromising mentalism
- Constructivism and/or generativism (both assigning a central role to the subject's own internal activity)
- Emphasis on rules, principles and formal constraints
- Emphasis on logic and deductive algorithms
- Emphasis on actual experimentation (vs. armchair theorizing)
- A dynamic perspective (development and acquisition studied in real time, with real children)
Piaget's proposal was one of a "division of labour", he being mostly concerned with conceptual contents and semantics, Chomsky being (allegedly) mostly concerned with content-independent rules of syntactic well-formedness across different languages. Piaget considered that the potentially divisive issue of innatism was, at bottom, a non-issue (or at least not a divisive one) because he also agreed that there is a "fixed nucleus" (noyau fixe) underlying all mental activities, language included, and that this nucleus is accounted for by human biology. The only issue, therefore, was to assess the exact nature of this fixed nucleus and the degree of its specificity. The suggestion, voiced by Cellerier and Toulmin, was to consider two "complementary" strategies: the Piagetian one, which consisted of a minimization of the role of innate factors, and the Chomskian one, consisting of a maximization of these factors, once more a sort of division of labor. It was interesting for all participants, and certainly unexpected to Piaget, to witness that, during the debate proper, the constant focus of the discussions was on what Piaget considered perfectly "obvious" (allant de soi): the nature and origin of this "fixed nucleus". He was heading for severe criticism from the molecular biologists present at the debate (especially from Jacob and Changeux) concerning his views on the origins of the fixed nucleus. And he was heading for major disagreements with Chomsky concerning the specificity of this nucleus. It can safely be stated that, while Piaget hoped for a reconciliatory settlement with the Massachusetts Institute of Technology (MIT) contingent about particular hypotheses and particular mechanisms concerning language and learning (and, in particular, the learning of language), he found himself, unexpectedly, facing insuperable disagreement about those very assumptions he hardly considered worth discussing, and which he believed were the common starting point - more on these in a moment. Piaget's failure to perceive these fundamental differences was, in essence, responsible for the vast gap between the debate he actually participated in and the virtual debate he expected to be able to mastermind. One had the impression that, to the very end, Piaget was still convinced he had been misunderstood by Chomsky and Fodor. In Piaget's opinion, had they really understood his position, it would have been unthinkable that the disagreement could still persist. One of Piaget's secrets was his deep reliance on the intuitive, unshakeable truth of his hypothèses directrices (guiding hypotheses). These were such that no reasonable person could possibly reject them - not if he or she actually understood what they meant.

1 In Language and Learning: The Debate between Jean Piaget and Noam Chomsky (hereinafter abbreviated as LL), pp. 23-24.
One could single out the most fundamental of Piaget's assumptions (Piaget, 1974) in words that are not his own, but which may well reflect the essence of what he believed:
Piaget's guiding hypothesis (hypothèse directrice)
- Life is a continuum
- Cognition is an aspect of life
therefore
- Cognition is a continuum
This is a somewhat blunt rendition, but it is close enough to Piaget's core message. Some of his former collaborators in the Geneva group, in 1985, expressed basic agreement that this was "a fair rendition" of Piaget's hypothèse directrice (as expressed, for instance, in his 1967 book Biologie et Connaissance).2

2 Bärbel Inhelder, personal communication.
As any historian of medieval logic could testify, if taken literally this version is a well-known logical fallacy (compare with the following):
- New York is a major metropolis
- Central Park is part of New York
therefore
- Central Park is a major metropolis
Decidedly, one does not want to impute to Piaget and his co-workers assent to a logical fallacy. Thus stated, it cannot pass as a "fair" reformulation. That would be too devious. A better reformulation, one that passes the logical test, would be the following:
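The invalid form shared by the literal rendition and the Central Park example can be made explicit. The following schematic first-order rendering is added here for clarity and is not part of the original text:

```latex
% Invalid inference: a predicate true of a whole need not hold of its parts.
% Premise 1:  P(a)            (the whole a has property P)
% Premise 2:  PartOf(b, a)    (b is a part, or "aspect", of a)
% Conclusion: P(b)            (does NOT follow)
\[
\frac{P(a) \qquad \mathrm{PartOf}(b,a)}{P(b)}
\qquad \text{(invalid: } P \text{ need not distribute over parts)}
\]
```

A valid class-inclusion syllogism would instead require membership in a class over which the predicate is universally quantified ($\forall x\, (A(x) \rightarrow B(x)),\ A(c) \vdash B(c)$), which is exactly what "Cognition is an aspect of life" fails to supply.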
A better heuristic version of Piaget's core hypothesis
- Life is (basically) auto-organization and self-stabilization in the presence of novelty
- Cognition is one of life's signal devices to attain auto-organization and self-stabilization
therefore
- Cognition is best understood as auto-organization and self-stabilization in the presence of novelty
This much seemed to Piaget to be untendentious and uncontroversial, but also very important. He declared, in fact, that this central hypothesis had guided almost everything he had done in psychology. In order better to understand where the force of the hypothesis lies, one must remember that he unreservedly embraced other complementary hypotheses and other strictly related assumptions. Here they are (again in a succinct and clear-cut reformulation):
Piaget's additional assumptions
I. Auto-organization and self-stabilization are not just empty metaphors, but deep universal scientific principles captured by precise logico-mathematical schemes.
II. There is a necessary, universal and invariable sequence of stepwise transitions between qualitatively different, fixed stages of increasing self-stabilization.
III. The "logic" of these stages is captured by a progressive hierarchy of inclusion between ascending levels of abstraction and generalization (each stage contains the previous one as a sub-set).
IV. The necessary and invariant nature of these transitions cannot be captured by the Darwinian process of random mutation plus selection.
Corollary
V. Another theory of biological evolution is needed (Piaget's "third way", differing both from Darwin's and Lamarck's).
Piaget believed that there is a kind of evolution that is "unique to man", and which grants the "necessity" of the mental maturational stages.3 These are what they are, and could not be anything else; moreover, they follow one another in a strict, unalterable sequence. The random process of standard Darwinian evolution is unable in principle (not just as a temporary matter of fact, due to the present state of biology) to explain this strict "logical" necessity. On the last two points the biologists, obviously, had their say, as we will see in a moment. Within this grand framework, it is useful to emphasize what were Piaget's specific assumptions concerning learning and language:
Piaget's crucial assumptions about learning
The transitions (between one stage and the next) are formally constrained by "logical necessity" (fermeture logique) and actually, "dynamically", take place through the subject's active effort to generalize, equilibrate, unify and systematize a wide variety of different problem-solving activities. The transition is epitomized by the acquisition of more powerful concepts and schemes, which subsume as particular instances the concepts and schemes of the previous stage.
Piaget's crucial assumptions about language
The basic structure of language is continuous with, and is a generalization-abstraction from, various sensorimotor schemata. The sensorimotor schemata are a developmental pre-condition for the emergence of language, and also constitute the logical premise of linguistic
3. LL, p. 59.
Ever since language and learning: afterthoughts on the Piaget-Chomsky debate
structures (word order, the subject/verb/object construction, the agent/patient/instrument relation, and so on). Conceptual links and semantic relations are the prime movers of language acquisition. Syntax is derivative from (and a "mirror" of) these.
It was inevitable that Piaget should meet strong opposition on each of these assumptions, on their alleged joint force and on the overall structure of his argument. In a sense, the whole debate turned only on these assumptions, with Piaget growing increasingly impatient to pass on to more important and more technical matters, but failing to do so, on account of the insurmountable problems presented by his core tenets. Chomsky and Fodor kept mercilessly shooting down even the most "obvious" and the most "innocent" reformulations of the basic assumptions of the Piagetian scheme, notably in their many spirited exchanges with Seymour Papert, who boldly undertook the task of systematically defending Piaget against the onslaught. The debate was not the one Piaget had anticipated, and it became clear to everyone, except possibly to Piaget himself (see his "Afterthoughts"),4 that no compromise could possibly be found.
3. Another virtual debate: the one the organizers thought they were organizing
There was, as I said, another virtual debate, the one which the organizers - molecular biologists with a merely superficial acquaintance with cognitive psychology and linguistics - believed they were organizing. It was closer to what Piaget had in mind than to the debate that actually took place, because they too anticipated some kind of convergence. How could that be? How could we, the biologists in the group, believe for a moment that some form of compromise could be reached? The simple answer to this, in retrospect, is: ignorance. What we thought we knew about the two systems was simple and basic. I think I can faithfully reconstruct it in a few sentences:
What we (the biologists) thought we knew
About Piaget:
- There is a stepwise development of human thought, from infancy to adulthood, through fixed, qualitatively different stages that are common to all cultures, though some cultures may fail to attain the top stages.
- Not everything that appears logical and necessarily true to us adults is so
4. LL, pp. 278-284.
judged by a child, and vice versa. Suitable experiments show where the differences lie.
- Constructivism, a variant of structuralism, is the best theoretical framework to explain the precise patterns of cognitive development. Unlike behaviorism, constructivism stresses the active participation of the child and the role of logical deduction.
- Set theory and propositional calculus are (somehow) central components of the theory.
About Chomsky:
- There are linguistic universals, common to all the different languages the world over.
- These are not superficial, but constitute a "deep structure".1
- This deep structure is innate, not learned, and is unique to our species.
- Formal logic and species-specific computational rules are (somehow) involved in determining deep syntactic structures.
- Syntax is autonomous (independent of semantics and of generic conceptual contents).
- There are syntactic transformations (from active to passive, from declarative to interrogative, etc.) that "preserve" the deep structure of related sentences. Semantics "links up" with syntax essentially at this deep level.
- Behaviorism is bad, while innatism and mentalism are OK.
- The expression "mind/brain" is OK. Linguistics and psychology are, at bottom, part of biology.
The organizers, in fact, knew very little, but they liked what they knew, on both sides. There was every reason (in our opinion) to expect that these two schools of thought should find a compromise, and that this grand unified metatheory would fit well within modern molecular biology and the neurosciences. Both systems relied heavily on "deeper" structures, on universals, on precise logico-mathematical schemes, on general biological assumptions. This was music to a biologist's ears. All in all, it was assumed that the debate would catalyze a "natural" scientific merger, one potentially rich in interesting convergences and compromises.
4. Chomsky's plea for an exchange, not a "debate"
Commenting on a previous version of the present paper, Chomsky has insisted that he, for one, had always been adamant in not wanting a debate, but rather an
1. There was at the time some confusion among non-experts between the terms "deep structure" and "universal grammar".
open and frank discussion, devoid of pre-determined positions and pre-set frontiers: "I am a little uneasy about presenting the whole thing as a 'Chomsky-Piaget debate'. That's not the way I understood it, at least, and I thought that Piaget didn't either, though I may be wrong. As far as I understood, and the only way I would have even agreed to participate, there was a conference (not debate) on a range of controversial issues, which was opened by two papers, a paper by Piaget and my reaction to it, simply in order to put forward issues and to open the discussion."5 Chomsky then adds: "Debates are an utterly irrational institution, which shouldn't exist in a reasonable world. In a debate, the assumption is that each participant has a position, and must keep to this position whatever eventuates in the interchange. In a debate, it is an institutional impossibility (i.e., if it happened, it would no longer be a debate) for one person to say to the other: that's a good argument, I will have to change my views accordingly. But the latter option is the essence of any interchange among rational people. So calling it a debate is wrong to start with and contributes to ways of thinking and behaving that should be abandoned." After pointing out that, as is to be expected in any ongoing scientific activity, his views are constantly changing and are not frozen into any immutable position, Chomsky insists that neither he, nor Fodor, nor the enterprise of generative grammar as a whole, is in any sense an institution, in the sense in which in Europe Marxism, Freudianism and, to some extent, Piagetism are institutions. The following also deserves to be quoted verbatim from his letter: "There is, thank God, no 'Chomskyan' view of the world, or of psychology, or of language. Somehow, I think it should be made clear that as far as I was concerned at least, I was participating by helping open the discussion, not representing a world view."
These excerpts from Chomsky's letter should make it very clear what his attitude was. But it is well beyond anyone's powers now to un-debate that debate, partly because "debate" figures in the very subtitle of the book ("The debate between Jean Piaget and Noam Chomsky"), and partly because the community at large has been referring to the event in exactly those terms for almost two decades. So, after having made clear which kind of virtual non-debate Chomsky assumed one should have organized, let us finally return to what actually happened.
5. The real debate
From now on, let's faithfully attempt to reconstruct, from the published records, from the recorded tapes, and from the vivid memory of some of those
5. With Chomsky's permission, this, and the following, are verbatim quotes from a letter to M. Piattelli-Palmarini, dated May 8, 1989.
who were present, how all these imaginary, unlikely, virtual debates precipitated into the real one. Chomsky's written reply to Piaget,6 made available a couple of months before the debate, rightly stressed, among other things, the untenability of Piaget's conception of evolution. Not until the first session of the debate proper had anyone realized that Piaget was (Heaven forbid!) a Lamarckian. It was, however, already clear from his distributed "invitation" paper that he had a curious idea of how genes are assembled and of how evolution acts on gene assemblies. Chomsky clearly had got it right and Piaget had got it wrong. This was the first important point in favor of Chomsky. Moreover, Chomsky stressed the need for specificity, while Piaget stressed the need for generality. The concrete linguistic examples offered by Chomsky seemed indeed very, very remote from any generalization of sensorimotor schemata. Some participants already felt sympathetic to Chomsky's suggestion that one should not establish any dualism between body and mind, and that one should approach the study of "mental organs" exactly in the way we approach the study of the heart, the limbs, the kidneys, etc. Everything he said made perfect sense and the concrete linguistic examples (which Piaget and the others never even began to attempt to deal with) made it vastly implausible that syntactic rules could be accounted for in terms of sensorimotor schemata. Chomsky's arguments against learning by trial and error were compelling - very compelling. One clearly saw the case for syntax, but one may still have failed to see the far-reaching import of his arguments for learning in general. For this, the participants had to wait until Fodor made his big splash at the meeting. But let's proceed in chronological order. 
Most important, to some of the biologists, was the feeling, at first confused, but then more and more vivid, that the style of Chomsky's argumentation, his whole way of thinking, was deeply akin to the one we were accustomed to in molecular biology. By contrast, Piaget's biology sounded very much like the old nineteenth-century biology; it was the return of a nightmare, with his appeal to grand unifying theories, according to which life was "basically" this or that, instead of being what it, in fact, is. Chomsky's call for specificity and his reliance on concrete instances of language were infinitely more appealing. It became increasingly clear to the biologists at Royaumont that Chomsky was our true confrere in biology and that the case for syntax (perhaps only for syntax) was already lost by Piaget. As the debate unfolded, the participants were in for further surprises and much more startling revelations. In order not to repeat needlessly what is already given at full length in the book itself, let's recapitulate only the main turning points of the debate.
6. LL, pp. 35-52.
5.1. The mishaps of "phenocopies"
Upon deeper probing into his rather peculiar idea of "phenocopy",7 Piaget indeed turned out to be a Lamarckian. He actually believed in some feedback, however devious and indirect, from individual experience to the genetic make-up of the species. The biologists were aghast! Jacob did a marvelous job of politely and respectfully setting the record straight on phenocopies, aided by Changeux. (Monod was not present; had he been, he might have been carried away by the discussion, behaving slightly less courteously to Piaget than Jacob and Changeux did. Monod, haunted by the memory of the Lyssenko affair, always reacted to Lamarckism by drawing his gun!) Well, believe it or not, Piaget was unruffled. He had the stamina to declare himself "très surpris" by the reactions of the biologists, and to reject Jacob's rectifications, quoting a handful of pathetic heretics, obscure Lamarckian biologists who happened to agree with him. The alienation of Piaget from mainstream biology was consummated there and then; patently, he did not know what he was talking about. (The young molecular biologist Antoine Danchin undertook, after the meeting, the task of making this as evident as it had to be made.)8 Subsequent exchanges with Cellerier and Inhelder showed that they had no alternative explanation to provide for the linguistic material brought in by Chomsky. When they mentioned linguistic examples, these were of a very peculiar, generic kind, nowhere near the level of specificity of Chomsky's material. They pleaded for an attenuation of the "innateness hypothesis", so as to open the way to the desired compromise.
But Chomsky's counter was characteristically uncompromising: first of all, the high specificity of the language organ, and therefore its innateness, is not a hypothesis, it is a fact, and there is no way one may even try to maximize or minimize the role of the innate components, because the task of science is to discover what this role actually is, not to pre-judge in advance "how much of it" we are ready to countenance in our theories. Second, it is not true that Chomsky is only interested in syntax; he is interested in every scientifically approachable aspect of language, semantics and conceptual systems included. These too have their specificity, and there are also numerous and crucial aspects of semantics that owe nothing to sensorimotor schemata, or to generic logical necessity - no division of labor along these lines, and again no compromise. The salient moments of this point in the debate can be summarized as follows:
7. LL, pp. 61-64.
8. LL, pp. 356-360.
Counters to Piaget from the biologists
Jacob's counter:
- Autoregulation is carried out only by structures which are there already and which regulate minor variations within a heavily pre-determined range of possibilities.
- Regulation cannot precede the constitution of genetically determined regulatory structures.
- (Gentle reminder) Individual experience cannot be incorporated into the genes.
Piaget simply did not see the devastating effect of Jacob's counters on his private and idiosyncratic conception of evolution by means of autoregulation. Cellerier was visibly embarrassed by Piaget's anti-Darwinism and tried, I think unsuccessfully, to disentangle the personal attitudes of Piaget in matters of biological evolution from the objective implications of the Darwinian theory for psychology proper.9
5.2. The mishaps of "precursors"
During the next session, when Monod was also present, came another major counter, on which Fodor quickly and aptly capitalized:
Monod's counter10
- If sensorimotor schemata are crucial for language development, then children who are severely handicapped in motor control (quadriplegics, for instance) should be unable to develop language, but this is not the case.
- Inhelder's answer: Very little movement is needed, even just moving the eyes.
- Monod's and Fodor's punch-line: Then what is needed is a triggering experience and not a bona fide structured "precursor".
Once again, it was the impression of several participants that the weight of this counter was not properly registered by the Piagetians. Yet the Monod-Fodor argument was impeccable, and its conclusion inevitable. One thing is a triggering input, quite another a structured precursor that has to be assimilated as such, and
9. LL, pp. 70-72.
10. LL, p. 140.
on the basis of which a higher structure is actually built. A trigger need not be "isomorphic" with, nor even analogous to, the structure it sets in motion. Admitting that this precursor can be just anything you please (just moving your eyes once) is tantamount to admitting that it is nothing more than a "releasing factor", in accordance with the innatist model of growth and maturation and against the literal notion of learning. Papert, for instance, went on at great length extolling the virtues of "indirect", "implicit" learning and of the search for "primitives". These, he insisted, and only these, can be said to be innate, not the highly specific structures proposed by Chomsky. These "clearly" are derived from more fundamental, simpler primitives.11 For this illusion, Fodor had a radical cure up his sleeve, as we will see in a moment. (Healthy correctives to Papert's, and Piaget's, notion of implicit learning in the specific domain of lexical acquisition are to be found in Atkins, Kegl, & Levin, 1986; Berwick, 1985; Grimshaw, 1990; Jackendoff, 1983, 1990, 1992; Lederer, Gleitman, & Gleitman, 1989; Lightfoot, 1989; Piatelli-Palmarini, 1990a; Pinker, 1989.) Before Fodor's cold shower, a lot of the discussion turned, rather idly, around the existence, in language, of components which are not specific to it, but are also common to other mental activities and processes. Again, a division of labor was proposed along these lines. Chomsky had no hesitation in admitting that there are also factors in language that are common to other intelligent activities, but rightly insisted that there are many besides which are unique to language, and which cannot be explained on the basis of general intelligence, sensorimotor schemes, communicative efficacy, the laws of logic, problem-solving, etc. These language-specific traits, Chomsky insisted, are the most interesting ones, and those most amenable to serious scientific inquiry.
5.3. Chomsky's plea for specificity
Here is an essential summary of the line he defended:
Chomsky's argument for specificity12
The simplest and therefore (allegedly) most plausible rule for the formation of interrogatives
The man is here. Is the man here?
11. LL, pp. 90-105.
12. LL, pp. 39-43.
is the following (a "structure-independent" rule): "Move 'is' to the front". But look at
The man who is tall is here.
*Is the man who tall is here? (bad sentence, never occurring in the child's language)
Is the man who is tall here? (good sentence)
The "simple" rule is never even tried out by the child. Why? The correct rule, uniformly acquired by the child, is not "simple" (in this transparent and shallow sense of the word) and involves abstract, specifically linguistic notions such as "noun phrase". Therefore it is not learned by trial and error and is not derivative from sensorimotor schemata. (What could the motor equivalent of a noun phrase conceivably be?)
This is, somewhat bluntly put, the core of the argument. If the process were one of induction, of hypothesis formation and confirmation, we should expect to see the simplest and least language-specific rules being tried out first. But this is not what we observe. More specific data on language acquisition in a variety of languages and dialects (Berwick & Wexler, 1987; Chien & Wexler, 1990; Guasti, 1993; Jusczyk & Bertoncini, 1988; Lightfoot, 1989; Manzini & Wexler, 1987; Wexler, 1987; Wexler, 1990; Wexler & Manzini, 1987) by now make the case against learning syntax by induction truly definitive. We will come back to this point.
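The contrast between the two candidate rules can be made concrete with a small sketch. This is an editorial illustration, not part of the original debate materials: the toy word-splitting and the hand-supplied `subject_length` parameter are assumptions standing in for a real syntactic parse, and nothing here models actual child language acquisition.

```python
# Illustrative sketch of the structure-dependence contrast discussed above.
# "subject_length" is a hypothetical stand-in for a genuine parse of the
# subject noun phrase; real noun-phrase detection needs a grammar.

def naive_question(sentence: str) -> str:
    """Structure-independent rule: move the FIRST 'is' to the front."""
    words = sentence.rstrip(".").split()
    words[0] = words[0].lower()
    i = words.index("is")  # first occurrence, blind to phrase structure
    return " ".join(["Is"] + words[:i] + words[i + 1:]) + "?"

def structured_question(sentence: str, subject_length: int) -> str:
    """Structure-dependent rule: move the 'is' that follows the full
    subject noun phrase (whose length we supply by hand here)."""
    words = sentence.rstrip(".").split()
    words[0] = words[0].lower()
    subject, rest = words[:subject_length], words[subject_length:]
    assert rest[0] == "is", "expected the main-clause 'is' after the subject"
    return " ".join(["Is"] + subject + rest[1:]) + "?"

# Simple case: the two rules agree.
print(naive_question("The man is here."))              # Is the man here?
# Embedded relative clause: the naive rule moves the WRONG 'is',
# producing exactly the ill-formed string children never try.
print(naive_question("The man who is tall is here."))  # Is the man who tall is here?
# The structure-dependent rule, given the subject "the man who is tall":
print(structured_question("The man who is tall is here.", 5))  # Is the man who is tall here?
```

The point of the sketch is that the structure-independent rule is simpler by any surface measure (no parse needed), yet it is the structure-dependent one that children uniformly acquire.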
Chomsky's argument against any derivation of syntactic rules from generic constraints13
We like each other = each of us likes the others
We expect each other to win = each of us expects the others to win
Near-synonymous expressions: "each other" = "each ... the others"
BUT
*We expect John to like each other
13. LL, pp. 113-117.
is NOT well formed and is NOT synonymous with
Each of us expects John to like the others
WHY? There is no obvious logical or communication-theoretical explanation. (There aren't even non-obvious ones, at that.) The linguistic rule is of the following kind. In embedded structures of the form
...X...[...Y...]
where X and Y are explicit or understood components (names, pronouns, anaphoric elements, etc.), no rule can apply to X and Y if the phrase between brackets contains a subject distinct from Y. The nature of this rule is specifically linguistic: the rule has no conceivable sensorimotor counterpart, nor any justification in terms of general intelligence. Further confirming evidence (just apply the rule):
The men heard stories about each other. (OK)
*The men expect John to like each other. (bad)
Who did the men hear stories about? (OK)
*Who did the men hear John's stories about? (bad)
John seems to each of the men to like the others. (OK)
*John seems to the men to like each other. (bad)
Evidence from another language:
J'ai laissé Jean manger X. (OK)
J'ai laissé manger X à Jean. (OK)
These are apparently freely interchangeable constructions, but the symmetry is broken in the next example:
J'ai tout laissé manger à Jean. (OK)
*J'ai tout laissé Jean manger. (bad)
NB: Update. These phenomena have received much better and deeper explanations in recent linguistic work, in terms of "complete functional complexes" (for a summary, see Giorgi & Longobardi, 1991; Haegeman, 1991). The overall thrust of Chomsky's argument for specificity comes out further reinforced.
Conclusion: These rules are tacitly known by the speaker, but they are neither learned (by induction, problem-solving or trial-and-error), nor determined by some general necessity. General intelligence and sensorimotor schemata cannot even begin to explain what is happening. Chomsky's point failed to impress Piaget and the Piagetians. A lot of their counter-arguments turned on the possibility of explaining these facts "in some other way". One could not fail, I think, to be impressed, there and then, by the fact that no other way was actually proposed, but that it all turned around the sheer possibility that some other rule, at some other level, might explain all of the above. (Anthony Wilden even tried out Russellian logical types14 to no avail.) Wisely, and unflinchingly, Chomsky kept replying that this might well be the case, but that he did not expect it to be the case. (In fact, many years have gone by, and these alternative explanations are still sorely missing - for a precise account, firmly grounded in generative g