The emergence of the child as grammarian¹
LILA R. GLEITMAN HENRY GLEITMAN ELIZABETH F. SHIPLEY University of Pennsylvania
Abstract
Demonstrations of some young children’s awareness of syntactic and semantic properties of language are presented. Rudiments of such ‘meta-linguistic’ functioning are shown in two-year olds, who give judgments of grammaticalness in a role-modelling situation. The growth of these abilities is documented for a group of five to eight-year old children, who are asked explicitly to give judgments of deviant sentences. Adult-like behavior, in these talented subjects, is found to emerge in the period from five to eight years. Possible relations of meta-linguistic functioning to other ‘meta-cognitive’ processes are suggested.
1. The work reported here was supported by the National Institute of Health under grant number 20041-01. Thanks are due to Rachel Gelman and Francis W. Irwin for helpful comments on the manuscript, and to Harris Savin for many useful suggestions and criticisms in developing this work. We also wish to thank Betsy Alloway and Judy Buchanan who carried out some of the experiments.
What do we mean when we say a speaker ‘knows the rules’ of language? Transformational linguists have been guarded in explicating this claim, for surely there is a difference between what the speaker knows and what a professional grammarian knows. There is broad agreement that speakers ‘follow the rules’ and, in fact, have trouble not following them (as in memorizing deviant sentences and the like; e.g., Miller and Isard, 1963). But performances of this kind are hardly equivalent to our everyday understanding of what it means to know rules. Used in this ordinary sense, the term knowledge implies awareness of generality; in its strongest form it involves the capacity to articulate the rule system itself, as in a chess player who can readily recite the rules of the game that constrain his behavior. The elite linguistic informant is rather like the chess player: he follows the language rules, but on demand he can do much more. He can demonstrate some awareness of the existence of a rule system by performing the one task that provides the main data base for modern grammatical theories: he can indicate whether a sentence is or is not well-formed. A rule system may be followed and yet not be known in this sense. The spider weaves his web according to a well-defined set of arachnid principles, but we hardly expect him to note any deviance if he weaves under the influence of LSD. Rule following per se implies knowledge of a weaker sort than that which linguists have generally been interested in. The very tasks they impose upon their informants require more than mere obedience to the rule system: the rules themselves must be engaged in the service of a further cognitive act. In a way, the linguist assumes not only that the speaker knows the rules but that he knows something about his knowledge.
This paper is concerned with the development of this aspect of linguistic behavior, the ability of a speaker to reflect upon the rules that he follows. There is little doubt that this meta-linguistic skill has been a critical methodological prerequisite for the construction of linguistic theories during the last two decades. We are here concerned with the emergence of this skill in young children.
Developmental psycholinguists have shown us that the young child already honors the rules for English sentence formation, at least within very wide limits. Children of four and five speak the language fairly well, have trouble - like adults - in repeating and memorizing deviant sentences, and so forth (e.g., Labov, 1970). But this work tells us only that children follow the rules, ‘know how’ to speak English.² The question is whether they can also contemplate the structure of the language, whether they know that they know.
We will claim that at least some five, six and seven year olds possess this meta-linguistic ability to a remarkable degree, and that a germ of this capacity can already be seen in the two-year old. Thus in part we are pursuing a claim some of us have made elsewhere (Gleitman and Gleitman, 1970), a claim that is, we believe, implied - though usually cautiously - in the writings of most generative grammarians: it is the speaker’s potential abstract awareness of language structure, and not merely his orderly behavior in accord with these rules, that lies at the heart of the generative-transformational hypothesis.
2. We do not wish to overstate the case for the sophistication of children’s speech. Although many developmental psycholinguists state the language-learning process is essentially complete at age four or five (e.g., McNeill, 1966; Lenneberg, 1967), this is by no means clear. We do know that gross errors in speech have largely disappeared at this time, except where morphophonemic irregularities (bringed, etc.) are involved; but we have no firm data on the complexity and variety of structures in early child speech.
1. The child’s garden of syntax
At first glance, there seems to be something of a paradox for students of cognitive development in the pre-schooler’s linguistic precocity: if language is simply a tool of thought, then it is surprising that language abilities seem to emerge so much earlier than other cognitive skills. The child’s progress to logic, to a belief in the conservation of quantities, to concepts of number, seems painfully slow, but almost any mother can attest to leaps of apparently abstract thought in the particular areas of phonology and syntax. For example, no three-year old lisps out his syllables so poorly that he does not feel entitled to employ a self-conscious baby-talk to dolls and other social inferiors (Gelman and Shatz, 1972; Shipley and Shipley, 1969). Such aspects of juvenile competence are rarely studied, in part because of the widespread belief that they cannot be dealt with experimentally (e.g., Brown, Fraser, and Bellugi, 1964). Yet anecdotally it is easy to point to cases where young children manifest great sensitivity to identifiable subtle features of language. For example, here is a question about segmentation from a four-year old: Mommy, is it AN A-dult or A NUH-dult?, a query made doubly intriguing by the fact that this child did not make the a/an distinction in spontaneous speech until two years later. And a four-year old with a question about adverbial complements:
Mother (taking car around a sharp bend): Hold on tight!
Child: Isn’t it tightly?
A precocious first-grader, unacquainted with formal punctuation marks, delicately observes the distinction between use and mention:
Child (writing): They call Pennsylvania Pennsylvania because William Penn had a (Penn) in his name.
Mother: Why did you put those marks around the word Penn?
Child: Well, I wasn’t saying Penn, I was just talking about the word.
This child quite apparently does more than speak in accordance with the rules of grammar. She recognizes paraphrases, laughs at puns, and rejects deviant though meaningful sentences. We believe that these features of behavior, far from being the icing on the linguistic cake, represent our best clues to central aspects of language competence. We will show in this paper that the capacity to reflect on linguistic structure is available to some very young children. First, we demonstrate the rudiments of this abstract attitude in two-year olds. Second, we document evolution of this capacity in the young school-age child. No normative data are presented. We intend the work as an existential comment on linguistic creativity in
some young children. We cannot speculate on how widespread such talents may be in the population at large.
2. Judgments of grammaticalness from the two-year old
2.1 The problem of doing developmental psycholinguistics
Fruitful linguistic inquiry can hardly begin unless the speaker-listener can provide firm judgments on at least some sentences of his language. The primary data are not the subject’s utterances, but rather a set of sentences he judges, upon reflection, to be well-formed. The theory of grammar that emerges is an account of these judgments. Precisely how such an account is related to a description of language performances (other than judgment-giving performances) is currently something of a mystery. What is important for our purposes here is the fact that these judgmental data have not been available to the developmental psycholinguist. Even if we had a complete description of the child’s speech and a complete description of the adult’s linguistic judgments, this would be a dissatisfying state of affairs, for there is no obvious way to compare these accounts in the interest of describing language acquisition. Brown, Fraser, and Bellugi commented in 1964 on this methodological gap separating the linguist’s study of the adult from the psychologist’s study of the child:
‘The linguist working with an adult informant gets reactions to utterances as well as the utterances themselves . . . Can such data be obtained from very young children? With Abel [a two-year-old] we were not successful in eliciting judgments of grammaticality. Of course there was no point in asking him whether an utterance was ‘grammatical’ or ‘well-formed’. We experimented with some possible childhood equivalents. The first step was to formulate tentative grammatical rules, and the next to construct some utterances that ought to have been acceptable. For Abel ‘The cake’ should have been grammatical and ‘Cake the’ ungrammatical. How to ask? The experimenter said: ‘Some of the things I say will be silly and some are not. You tell me when I say something silly.’ Abel would not. If Abel had a sense of grammaticality, we were unable to find the words that would engage it . . .’ (Brown, 1970, pp. 72-73).
Given this outcome, psychologists have used various indirect methods in the study of the very young language learner. Almost all of the techniques represent attempts to get at the classificatory system. Large contributions to our knowledge of the emerging speaker have come from careful observation of spontaneous speech (e.g.,
Braine, 1963; Miller and Ervin, 1964; Brown, Cazden and Bellugi, 1969; Bloom, 1970) which must, in some admittedly cloudy way, reflect something of the speaker’s underlying organization of the language. Similarly, studies of repetition, memory, and the comprehension of various syntactic and semantic structures (e.g., Brown and Bellugi, 1964; Menyuk, 1969; Chomsky, 1969) are in many ways analogous to solicitation of judgments of well-formedness. These latter methods are to some extent validated by their success with adult informants for whom concordant judgmental data can be provided (e.g., Savin and Perchonock, 1965; Johnson, 1965; Bever, Lackner, and Kirk, 1969; and many others). At present, then, we are able to get a fairly coherent picture of the course of speech-acquisition and some hints about the mechanisms of language learning from the work of these investigators. But the fact remains that the insights incorporated into modern generative grammar could probably not have been achieved by the use of such indirect methods (see, for discussion, Chomsky, 1965). Judgments of grammaticalness have always been used to provide the primary data. Comparable data from child informants would obviously be very useful. They would enable the developmental psycholinguist to proceed just like a linguist who studies some exotic adult language. But so far no one has found a little child who gives stable judgments on his own primitive language. Why is the young child unable or unwilling to provide these judgments? Does he merely fail to understand the instruction? Is this failure orthogonal to his linguistic capacities - perhaps representing a general cognitive immaturity? The work of Shipley, Smith, and Gleitman (1969) was designed to examine this issue. Perhaps the child does make judgments of well-formedness, but simply cannot understand an instruction to report on them. 
If so, we might get classificatory data from the child by developing some behavioral indices of differential responsiveness to various language forms. An examination of this study will help set the problem toward which the present paper is directed: the growth of meta-linguistic reflection in the language learner. Shipley et al. had mothers give commands (mild imperative sentences) to children aged 18 to 30 months. Some commands were well-formed (e.g., Throw me the ball!) but some were ‘telegraphic’ or foreshortened, as in the children’s own speech (Ball!; Throw ball!). We found that children discriminate among these formats, as shown by their differential tendency to obey these commands. More specifically, the holophrastic children (who do not yet put two words together in speech) tended to obey foreshortened commands. In contrast, telegraphic speakers ignored, repeated, talked about, laughed about, telegraphic commands, but obeyed well-formed commands. Shipley et al. assumed that children fail to obey commands that they perceive as linguistically deviant; thus differential tendency to obey these commands was
interpreted as an implicit judgment on the ‘acceptability’ or ‘naturalness’ of their format.³ This study revealed that the spontaneous speech of children provides a limited data source for the study of their linguistic knowledge, in practice as well as in theory. Clearly, the telegraphic speech of these children did not reflect the fact that they could discriminate telegraphic syntax from the adult syntax. Perhaps the children ‘preferred’ the well-formed sentences that they themselves never produced at this stage, as indicated by their tendency to act on just these. But such behavioral indices can be at best crude indicators of the child’s ‘judgments’ on the sentences offered to him - if indeed he can make such judgments. These indices do not come to grips with the question of classification.
2.2 Soliciting judgments from two-year old children
Can the two-year old child be induced to give judgments of grammaticalness more directly? Some curious hints began to appear among the subjects of Shipley et al. Every once in a while, a child seemed to behave much like an adult when confronted with a linguistically bizarre stimulus:
Mother (delivering stimulus): Joseph: Gor ronta ball!
Joseph⁴: Wha’, Momma? ‘Gor ronta ball’?
We have punctuated this sentence advisedly. The child seemed to be querying the format directly - not asking whether or not his mother really wanted him to gor ronta ball. Other children would grab the list of stimulus sentences from their mothers (obviously they could not read) and say ‘Now I do one!’ Such behavior suggests that these children were regarding the sentences apart from their communicative function. In a longitudinal variant of this work, in which eight children were studied at successive stages of language development, these sophisticated responses became
3. A simpler hypothesis - that the child fails to obey only because there isn’t enough information in the shortened commands - is falsified by the outcome: a child is more apt to obey Ball! than Gor ronta ball! though the two contain identical intelligible information. Further, at these ages information outside the noun itself has no effect on the specific behavior of the child given that he looks at the stimulus object in the first place: If he does anything at all with the ball, he is equally likely to throw it if you say Ball, Throw ball, Throw me the ball, or Gor ronta ball. Only the likelihood of his coming into contact with the ball in the first place differs under these formats.
4. In all instances, names have been changed to protect the innocents.
too frequent to ignore.⁵ To be sure, in the first run of the experiment, the children behaved as we had anticipated. By and large (but with the usual enormous noise) the telegraphic speakers obeyed well-formed commands more often than telegraphic commands. In successive runs some months later with these same children, we expected to see the culmination of this development: the subjects would now more uniformly respond to well-formed sentences, and balk more often at deviance. But this was not the outcome. On successive runs, the distinction between well-formed and telegraphic commands became less potent in predicting the children’s tendency to obey. Obviously these subjects had not been unlearning English. On the contrary, they seemed to have learned to cope with anomaly. Our feeble operational techniques thus are well-foiled:
Mother: Allison: Mailbox fill!
Allison: We don’t have any mailbox fills here.
Assuming the subject is playing it straight, she has interpreted the stimulus as a compound noun (albeit one whose referent she cannot discern) and has responded accordingly. But in light of Allison’s ten-month-long experience as a subject with such sentences - and with her favorite ready-to-be-filled mailbox toy not two feet from her eyes - it is more likely that she is putting us on. Accordingly, we had to reopen the question of whether she had become a functioning linguistic informant at all of two years old. Abandoning the indirect route, we now performed an experiment designed to solicit judgments of grammaticalness directly. The subjects were three girls, all about two-and-a-half years old, who had participated in the longitudinal study. Two of the girls (Allison and Sarah) had responded preferentially to well-formed sentences in the first part of the longitudinal study (the expected result for the ‘telegraphic speaker’), and had become indifferent to this distinction (‘post-telegraphic’) by the last run, as we have just discussed.
The third (Ann) had responded preferentially to telegraphic sentences in her first run through the experiment (the expected result for the ‘holophrastic speaker’) and responded preferentially to well-formed sentences by the time of the final run (the telegraphic stage). Thus, by the behavioral measures, Allison and Sarah were a step ahead of Ann in language development when the present experiment was run.
5. The longitudinal study was undertaken, in part, to replace spontaneous speech measures with a better external criterion of each child’s development. Thus each child served as his own control. This experiment differed in various ways from the 1969 study, most relevantly by the inclusion of some word order reverses (e.g., Ball bring; Ball me the bring), which allow us to ask whether the child who rejects telegraphic sentences does so merely because they lack the intonation contour of well-formed sentences.
In designing the test situation, we exploited the children’s willingness to imitate adult roles. The child was told ‘Today we have a new game’. Mother, experimenter, and child would take turns being ‘teacher’. As a preliminary step, the experimenter (as teacher) read a list of sentences to the mother, who judged them ‘good’ or ‘silly’, and did so correctly in all cases. She also repeated the good sentences verbatim, and ‘fixed up’ the silly ones. If the child hadn’t quickly clamoured for her turn, she was now offered it. The mother became teacher and the child was asked to judge the sentences, repeating the good ones and fixing the silly ones. Finally, the child became teacher. She was handed the stimulus list (needless to say, she could not read), and was told to offer sentences to the experimenter, who gamely undertook to judge these (as correctly as he could) for grammaticalness.⁶
The stimuli were 60 sentences, all short imperatives. The noun object of each sentence was the name of a toy or other object known to the child. The sentences varied along two dimensions:
(a) Intonation contour: the sentences had either the contour of a well-formed imperative (Bring me the ball; Ball me the bring) or that of a telegraphic sentence of the kind these children sometimes still produced in speech (Bring ball; Ball bring).
(b) Order: the serial order of words might be correct (Bring me the ball; Bring ball) or the order of noun and verb might be reversed (Ball me the bring; Ball bring).
Thirty of the sentences were those used in the prior study, and thus were familiar to the child. The other thirty sentences were new. The familiar sentences were presented at one session, the new ones a week later. All three children undertook to judge the sentences offered to them. The mere fact that they did so, and that the results were non-random, suggests that these two-year olds could view language ‘as an object’. That their classificatory skills were quite feeble, nevertheless, can be seen from Table 1.
Each child tended to judge well-formed sentences (those that were full in contour and correct in serial order) as ‘good’, though this outcome is far from categorical (combined probabilities from chi-square test, p < .02). There were no differences between the familiar and the unfamiliar sentences. For all subjects, the reversed-order sentences result in more judgments of ‘silly’ (combined probabilities from chi-square test, p < .001). But only Allison judged telegraphic sentences with word order preserved as sillier than well-formed sentences. Recall that all three subjects distinguished telegraphic from well-formed sentences by the behavioral index (tendency to obey such commands) in earlier runs through that test.
6. We have frequently been asked how we succeeded in telling the children to deal with ‘sentences’ when these children could hardly be expected to understand the word sentence. Indeed that is a puzzle, but nevertheless the children acted as if they understood what to do.
Table 1. Sentences judged ‘Good’ by three two-year olds (in percent)
Subjects    Normal order                  Reversed order
            well-formed    telegraphic    well-formed    telegraphic
Sarah       92             100            75             58
Ann         80             82             50             58
Allison     80             58             58             58
Two of the three subjects (Allison and Sarah) were willing to repeat those sentences they had judged to be good. Sarah’s repetitions were with one exception verbatim; her single ‘error’ was a recognizable correction of a telegraphic sentence. Allison judged 20 of the 30 sentences of session 2 to be good, and she was asked to repeat these. Of these 20 sentences, 10 were actually well-formed, and she repeated all these verbatim. Of the 10 that were actually deviant, she repeated 5 verbatim, and changed 5 in some way. That is, she gave verbatim repetitions of well-formed sentences 10 times out of 10, and of deviant sentences 5 times out of 10 (Fisher test, p = .025). Four of the five non-verbatim ‘repetitions’ are partial corrections. This outcome is similar to that achieved by Labov (1970), whose 12-year old subjects unintentionally, and in spite of the task requirements, ‘corrected’ sentences offered for repetition to conform with their own dialect. The repetition data, coupled with Labov’s findings, give some independent evidence of these children’s language organization. More relevant to the issue of metalinguistic functions are Allison’s and Sarah’s attempts to correct those sentences they had judged to be silly (Ann refused to attempt this). The corrections are shown in Table 2. Inspection of the table shows that Allison usually (about 7 times in 10) and Sarah sometimes (about 4 times in 9) made a change in the direction of well-formedness. It appears that the children understood the question that was asked and made a conscious attempt to restructure the output. There is a further curiosity in Allison’s responses: some of them are non-paraphrastic (e.g., Bring book / Close book; Cup find / Fill the cup). We must admit that the instructions are vague; when we told the child to ‘fix it up if it is silly’, we did not stipulate that the correction had to mean the same thing. But it is worth noting that one need not so instruct an adult.
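The comparison of Allison’s verbatim repetitions reported above (10 of 10 well-formed sentences against 5 of 10 deviant ones) amounts to a 2 x 2 contingency table, and the arithmetic behind an exact test on such a table can be sketched briefly. The Python below is an illustration only, not the authors’ procedure: the function name `fisher_one_sided` is hypothetical, and the sketch assumes a one-sided test with all margins fixed. The text does not say which variant of the Fisher test produced the reported p = .025, so the value computed here need not coincide with it.

```python
from math import comb


def fisher_one_sided(a, b, c, d):
    """One-sided Fisher exact p-value for the 2 x 2 table [[a, b], [c, d]].

    With all margins held fixed, returns the probability of a top-left
    count at least as large as the observed `a` (the upper tail of the
    hypergeometric distribution). Assumed variant, for illustration.
    """
    row1, row2 = a + b, c + d      # row totals
    col1 = a + c                   # first column total
    total = row1 + row2
    denom = comb(total, col1)      # ways to choose the first column overall
    upper = min(row1, col1)        # largest possible top-left count
    tail = sum(comb(row1, x) * comb(row2, col1 - x) for x in range(a, upper + 1))
    return tail / denom


# Allison, session 2. Rows: well-formed, deviant; columns: verbatim, changed.
p = fisher_one_sided(10, 0, 5, 5)
print(f"one-sided p = {p:.4f}")
```

Because the well-formed row contains no changed repetitions, the observed table is already the most extreme one compatible with its margins, so the tail sum in this case has a single term.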
Table 2. Sentences judged ‘silly’ and corrected by two two-year olds*
Subjects    Sentences judged ‘silly’    Correction
Sarah       Pull wagon                  Pull the wagon me
            Chair look                  Chair look
            Clothes iron                Iron clothes
            Door shut                   Door shut
            Put on a coat               Put me in a coat
            Make up the bed             Make the bed
            Wash the dishes             Wash the dishes
            Song me a sing              Sing me a song
            Box the open                Get in the box
Allison     Find cup                    Drink out of the cup
            Push sweeper                The rug sweeped
            Bring book                  Close book
            Mailbox fill                Mailman, put the mail in the mailbox
            Cup find                    Fill the cup
            Pick up the iron            Iron the clothes
            Fill the mailbox            Put mail in the mailbox
            Ball me the throw           Throw the ball
            Sweeper the push            The sweeper push on the rug
            Iron up the pick            Iron the clothes
* The table shows all cases in which correction was attempted. Some other sentences were also judged to be ‘silly’ but no corrections were offered for them. One subject, Ann, refused to provide any corrections at all.
If we ask an adult to correct the expression The dog bit cat the, we expect the response The dog bit the cat. If informants responded The dog bit the rat with any measurable frequency, linguistic
theory would look a good deal different than it does. Nor would we ever expect as responses The cat was bitten by the dog, The domestic canine bit the cat, It was the dog who bit the cat, or any of a host of other paraphrastically related responses. The adult informant has a surprisingly precise notion of what ‘it’ means in the instruction ‘Fix it up’ in the context of a deviant sentence. We show later on that at least some six and seven-year olds interpret the task just as adults do. In the last experimental condition (the child as ‘teacher’), we get some further suggestive evidence from Allison, who invented 20 sentences for her mother to judge. Of these 20, 18 were well-formed imperatives of the type she had been tested on in this and the earlier experiment (e.g., Sit on the horsie; Put pants on yourself; Look at that chair). One was a reversed-order imperative (Rug put on the floor), and the last a peculiar declarative (Hair is on yourself). It is of some interest that all save one of these inventions are imperatives, reproducing in minute detail the syntactic structure of the sentences we had offered to her. It seems unlikely that a child asked simply to ‘say sentences’ will produce imperatives so exclusively; thus it is probable that Allison was capable of developing a set for a unique grammatical
structure in response to her perception of the requirements of the task. On the other hand, she showed no tendency to be rigid on semantic grounds, for her inventions varied over a wide range of topics. In sum, we have found that the method of Shipley et al. (1969) rapidly becomes unworkable as the child passes out of the telegraphic stage of speech. The sophisticated two-year old, like his seniors, seems to fiddle around with deviant material. He may somehow internally ‘correct’ it, and then respond to the corrected material (the general paradigm hypothesized for adults by Katz, 1964; Ziff, 1964; and Chomsky, 1964; see Gleitman and Gleitman, 1970, for some experimental evidence). Thus no simple behavioral index now gets close to his recognition of the distinction between well-formed and deviant sentences. At this stage, some of his knowledge can be tapped by direct query: Tell me if the following sentence is good or silly. Tenuously, but quite clearly, some two-year olds can follow this instruction in the role-modelling situation. Further, two of the three subjects studied give evidence of going beyond simple classification. Isolating the deviance, at least in some measure, they often provide partial corrected paraphrases of deviant material. Indirect data from their spontaneous speech and from corrected repetitions are also consistent with these interpretations. We believe that, with appropriate refinement of these elicitation procedures, it may be feasible to inquire quite directly into aspects of young children’s language organization. However, we do not know how far this judgmental capacity extends. In this study we dealt only with very simple sentence types. We do not know if these subjects could provide stable judgments if we edged closer to the limits of their knowledge (a matter which is after all in some doubt even for adults; see, e.g., Maclay and Sleator, 1960; Hill, 1961).
What has been demonstrated here is at least a minimal capacity in some children under three to contemplate the structure of language.
3. The child as informant
We will now show that some children from five to eight years old come up with intuitions about syntactic and semantic structure so subtle that they are often overlooked even by professional grammarians. We will not argue that most or even many children can perform such feats of reflection. Since extreme differences in linguistic creativity have already been demonstrated for adults (Gleitman and Gleitman, 1970; Geer, Gleitman, and Gleitman, 1972) it would be surprising if we did not find great differences among children. We have not, then, looked for subjects who are in any way representative of the dialect population. On the contrary, we have taken some pains to interview children we suspected were highly articulate,
either from personal knowledge or from aspects of their background. After all, the adult informants whose judgments provide the empirical basis for linguistic theory are at least as far from being a random sample of the population. Having granted the bias of our sample, we begin with a transcription of a dialogue between one of us (LRG) and one of her children.
3.1 An interview with a seven-year old child
At the time this dialogue was taped, Claire was seven years old, in her second year of grammar school. She had had a good deal of exposure to language games, and had participated when very young in pilot studies of the sort reported by Shipley et al. (1969). We are not suggesting, then, that Claire was average either in linguistic capacity or experience, although some of the results we will report below for children of less special background suggest that her approach to syntactic questions is by no means unique.⁷

7. All sentences presented to Claire and all of her initial responses appear in this transcript. A few tedious interchanges resulting from probes have been deleted.

LG: Are you ready to do some work?
CG: Yes.
LG: We’re going to talk about sentences this morning. And I want your opinion about these sentences.
CG: Yes, I know.
LG: Are they good sentences, are they bad sentences, do they mean something, are they silly, whatever your opinions are. Do you know that your opinions can’t really be wrong?
CG: I know because you told me.
LG: Do you believe me?
CG: Yes, I believe you, because everybody has his own opinion.
LG: You and I may disagree; would you like me to tell you when I disagree with you?
CG: Yes, but you won’t tell me!
LG: Okay, okay, I’ll tell you. The important thing is you should know it’s all right to disagree. Okay: John and Mary went home. (1)
CG: That’s okay.
LG: That’s an okay sentence?
CG: Yes.
LG: Does it mean the same thing as: John went home and Mary went home? (2)
CG: Yes, but it’s sort of a little different because they might be going to the same home - well, it’s okay, because they both might mean that, so it’s the same.
LG: Here’s another one: Two and two are four. (3)
CG: I think it sounds better is.
LG: Two and two is four?
CG: Am I right?
LG: Well, people say it both ways. How about this one: Claire and Eleanor is a sister. (4)
CG: (laugh) Claire and Eleanor are sisters.
LG: Well then, how come it’s all right to say Two and two is four?
CG: You can say different sentences different ways! (annoyed)
LG: I see, does this mean the same thing: Two is four and two is four?
CG: No, because two and two are two and two and two and two is four.
LG: Isn’t that a little funny?
CG: Two and two more is four, also you can say that.
LG: How’s this one: My sister plays golf. (5)
CG: That’s okay.
LG: How about this one: Golf plays my sister. (6)
CG: I think that sounds terrible, you know why?
LG: Why?
CG: Poor girl!
LG: Well, what does it mean?
CG: It means the golf stands up and picks up the thing and hits the girl at the goal.
LG: How about this one: Boy is at the door. (7)
CG: If his name is Boy. You should - the kid is named John, see? John is at the door or A boy is at the door or The boy is at the door or He’s knocking at the door.
LG: Okay, how about this one: I saw the queen and you saw one. (8)
CG: No, because you’re saying that one person saw a queen and one person saw a one - ha ha - what’s a one?
LG: How about this: I saw Mrs. Jones and you saw one. (9)
CG: It’s not okay - I saw - You saw Mrs. Jones and I saw one (ha ha). Besides there aren’t two Mrs. Jones.
LG: Is that the problem there? Is that why the sentence sounds so funny?
CG: No, the other problem is I saw - You saw Mrs. Jones and I saw one - a one.
LG: A one, you mean like a number one?
CG: No - a one, whatever a one - well, okay, a number one.
LG: How about this: Be good! (10)
CG: That sounds okay.
LG: How about this: Know the answer! (11)
CG: That’s the only way to say it, I think.
LG: The only way to say what?
CG: You better know the answer! (threatening tone)
LG: How about this one: I am eating dinner. (12)
CG: Yeah, that’s okay.
LG: How about this one: I am knowing your sister. (13)
CG: No: I know your sister.
LG: Why not I am knowing your sister - you can say I am eating your dinner.
CG: It’s different! (shouting) You say different sentences in different ways! Otherwise it wouldn’t make sense!
LG: I see, you mean you don’t understand what that means, I am knowing your sister.
CG: I don’t understand what it means.
LG: How would you say it?
CG: I know your sister. Do you disagree with me?
LG: It so happens I agree with you. How’s this one: I doubt that any snow will fall today. (14)
CG: I doubt that snow will fall today.
LG: How’s this: I think that any snow will fall today. (15)
CG: I think that some snow will fall today.
LG: That way it’s okay?
CG: I don’t think snow will fall today cause it’s nice out - ha ha.
LG: How about this: Claire loves Claire. (16)
CG: Claire loves herself sounds much better.
LG: Would you ever say Claire loves Claire?
CG: Well, if there’s somebody Claire knows named Claire. I know somebody named Claire and maybe I’m named Claire.
LG: And then you wouldn’t say Claire loves herself?
CG: No, because if it was another person named Claire - like if it was me and that other Claire I know, and somebody wanted to say that I loved that other Claire they’d say Claire loves Claire.
LG: Okay, I see. How about this: I do, too. (17)
CG: It sounds okay but only if you explain what you’re trying to say.
LG: How about: The color green frightens George. (18)
CG: Doesn’t frighten me, but it sounds okay.
LG: How about this one: George frightens the color green. (19)
CG: Sounds okay, but it’s stupid, it’s stupid!
LG: What’s wrong with it?
CG: The color green isn’t even alive, so how can it be afraid of George?
LG: Tell me, Claire, is this game getting boring to you?
CG: Never-rrrrrrrrrrrrrrr.
LG: Why do you like to play a game like this? What’s the difference how you say things as long as people understand you?
CG: It’s a difference because people would stare at you (titter). No, but I think it’s fun. Because I don’t want somebody coming around and saying - correcting me.
LG: Oh, so that’s why you want to learn how to speak properly?
CG: That’s not the only reason.
LG: Well, what is it?
CG: Well, there’s a lotta reasons, but I think this game is plain fun.
LG: You want to go on playing?
CG: Yeah, and after this let’s do some spelling; I love spelling.
3.2 Other subjects
As a further check on the incidence of skills apparent in Claire’s responses, we tested six more children with these same materials. Listed in ascending age, they were: S1 - female, 5 years; S2 - male, 5 years; S3 - male, 6 years; S4 - male, 7 years; S5 - male, 7 years; S7 - female, 8 years (S6 was Claire Gleitman, 7 years). All of the subjects were children of academic families. The interviewer for S1 and S5 was their mother, an undergraduate psychology student. The interviewer for S2, S3, S4, and S7 was an undergraduate linguistics student who had never met the children before the interview. All sessions were taped.

3.3 Results
Rather to our surprise, all of the children we interviewed were prepared to play the game; all classified the sentences in fair conformance with adult judgments; and all, including the youngest, gave interesting and relevant accounts of what is wrong with the deviant sentences, at least some of the time.

3.3.1 Classification of the sentences
Table 3 presents the conformance of the children’s judgments on these sentences with our own. There are many reasons to be embarrassed by so formal a presentation of these data. Most centrally, the accuracy of the child’s response was often dependent on the wit of the interviewer in making the correct probe. In particular, the youngest subjects would accept almost any sentence unless some further question was asked:
E: How about this one? Boy is at the door.
S1: Good.
E: Good? Is that the way you would say it?
S1: No. A boy is at the door. Boy is at the door isn’t a good sentence.

Table 3. Conformance of children’s judgments to those of adults *

       Sentence                                      Adult judgment
(1)    John and Mary went home                       wf
(2)    John went home and Mary went home             wf
(3)    Two and two are four                          wf
(4)    Claire and Eleanor is a sister                d
(5)    My sister plays golf                          wf
(6)    Golf plays my sister                          d
(7)    Boy is at the door                            d
(8)    I saw the queen and you saw one               d
(9)    I saw Mrs. Jones and you saw one              d
(10)   Be good!                                      wf
(11)   Know the answer!                              d
(12)   I am eating dinner                            wf
(13)   I am knowing your sister                      d
(14)   I doubt that any snow will fall today         wf
(15)   I think that any snow will fall today         d
(16)   Claire loves Claire                           wf/d
(17)   I do too                                      wf
(18)   The color green frightens George              wf
(19)   George frightens the color green              d

Subject:                                   S1   S2   S3   S4   S5   S6   S7
Age:                                        5    5    6    7    7    7    8
Total ‘+’ judgments for all sentences:     12   10   15   15   16   17   17

[The individual ‘+’ and ‘-’ entries for each sentence and subject are not legible in this copy.]

* Adult judgments were provided by three independent judges who indicated whether each sentence was well-formed (wf) or deviant (d). The children’s judgments are marked ‘+’ if they agreed with those of the adult and ‘-’ if they did not, regardless of their explanation. Sentence 16 cannot be scored in this manner; whether or not it is deviant depends upon whether the same referent is assumed for both nouns. The names in sentences 4 and 9 were chosen to be familiar; in sentence 16 the child’s own name was used.
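The scoring rule described in the note to Table 3 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure the authors actually used: the function name `score` and the hypothetical child who accepts every sentence are introduced here for the example, while the adult judgments are those listed in Table 3.

```python
# Adult judgments for sentences (1)-(19), as listed in Table 3.
# "wf" = well-formed, "d" = deviant, None = not scorable (sentence 16).
ADULT = {
    1: "wf", 2: "wf", 3: "wf", 4: "d", 5: "wf", 6: "d", 7: "d",
    8: "d", 9: "d", 10: "wf", 11: "d", 12: "wf", 13: "d", 14: "wf",
    15: "d", 16: None, 17: "wf", 18: "wf", 19: "d",
}


def score(child):
    """Mark each scorable sentence '+' (agrees with adults) or '-' (disagrees).

    `child` maps sentence number to "wf" (accepted) or "d" (rejected).
    Sentence 16 is skipped because its status depends on reference.
    Returns the per-sentence marks and the total number of '+' marks.
    """
    marks = {}
    for number, adult in ADULT.items():
        if adult is None or number not in child:
            continue
        marks[number] = "+" if child[number] == adult else "-"
    total = sum(1 for mark in marks.values() if mark == "+")
    return marks, total


# Hypothetical child (an assumption for illustration) who accepts everything.
accept_all = {number: "wf" for number in ADULT}
marks, total = score(accept_all)
print(f"{total} of {len(marks)} scorable sentences")
```

Under this rule a child who simply accepted every sentence would agree with the adults on exactly the well-formed items, which gives a rough baseline against which the totals in Table 3 can be read.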
More generally, it should be clear that this test was performed simply to ascertain whether children of these ages can in principle adopt the attitude of judging and classifying in a manner similar to that of adult informants: with respect to this point, the results are clear-cut. But since the choice of test sentences was haphazard in terms of any metric of well-formedness, complexity, and the like, we can make no general statement about the judgmental capacities of children as compared with adults; the percentage agreement with adults would almost certainly be changed
by varying the proportion of one or another kind of deviance. In the absence of normative data, these subjects’ responses are useful only to expand the picture drawn in the original interview. But the results leave little doubt that a variety of delicate questions of syntax and semantics are handled rather neatly by these children. Below we give a number of examples, organized according to several rough syntactic subcategories.

3.3.1.1 Stative verbs: Sentences 13 (I am knowing your brother) and 11 (Know the answer!) contained stative verbs in deviant environments. As can be seen from these examples, this verb class has no forms in the present progressive or in the imperative. As Table 3 indicates, the younger children fail to notice the problem. The older ones reject the deviant forms. S6 suggests the so-called ‘pseudo-imperative’ interpretation (an ellipsis for an if-clause), which is acceptable for such verbs (Know the answer = You better know the answer = If you don’t know the answer, *#!).

3.3.1.2 Collective versus distributive use of and: All children stated that (1) John and Mary went home and (2) John went home and Mary went home mean the same thing. Claire spontaneously brought up the collective/distributive issue in response to this comparison. She first tried to distinguish the two sentences on this basis (‘they might be going to the same home’) but then recognized that both forms share both construals (‘they might both mean that, so it’s the same’).

3.3.1.3 Pronominal referents: Sentences 8 and 9 display the anomaly that arises when a definite noun-phrase is the apparent antecedent of an indefinite pronoun:
(8) I saw the queen and you saw one.
(9) I saw Mrs. Jones and you saw one.
The oddity is clearer in (9) for while there may be only one Mrs. Jones in the world, one cannot have that same Mrs. Jones as its grammatical antecedent. The four younger children accept (8) without question, which is consistent with their tendency not to notice syntactic deviance when no semantic anomaly arises. On the other hand, all of the subjects rejected (9). The responses to these questions are displayed in full below. It is quite clear that the quality of the explanations improves with the age of the children; put another way, there is an increasing conformance of their judgments with our own. Note that the younger children give explanations that accord with their judgments: they reject only the case with the proper noun, and they explain by claiming that this structure is incorrect with a name. (The stimulus sentence is tagged by its number; the experimenter’s comments, somewhat abbreviated, are bracketed):

S1: (8) Good. (9) No, ‘cause there’s only one Mrs. Jones. [Then how would you say it?] I saw Mrs. Jones [and?] I did, too.
S2: (8) Yeah. (9) I would hate that ‘cause they’re not - I got two reasons. They’re not the same age and they don’t look the same. [So how would you say it?] I don’t know. It’s silly. Because it don’t say the name and - it don’t say the name - it’s a - I saw Richard Jamison, and you saw one. Don’t give no reason.
S3: (8) Good. (9) It sounds funny ‘cause You saw Mrs. Jones is okay, but I saw a one - it should mean something like I saw - You saw a tree and I saw one, too. You can’t say it with a name. [So what’s the problem?] Because you have to say something like You saw a tree and I saw one. But you can’t say something like You saw Mrs. Jones and I saw one. You have to say You saw Mrs. Jones and I saw her, too.
S4: (8) That’s a good sentence. (9) That’s silly. ‘cause there might not be two Mrs. Jones that I know. [So how would you say it?] I saw Mrs. Jones and so did you. Both of us saw Mrs. Jones.
S5: (8) No, I saw the queen and you saw the same queen that I saw - you and me saw the queen. (9) No, I saw a Mrs. Jones and so did you.
S6: (8) No, because you’re saying that one person saw a queen and one person saw a one - ha ha - what’s a one? (9) It’s not okay - I saw - You saw Mrs. Jones and I saw one - ha ha - besides, there aren’t two Mrs. Jones. [Is that the problem here?] No, the other problem is You saw Mrs. Jones and I saw one - a one. [Like a number one?] No - a one - whatever a one - well, okay, a number one.
S7: (8) That doesn’t really make sense. You saw a queen - no, I’ll say me - I saw a queen and you saw a queen, too. (9) That doesn’t make sense because there’s only one Mrs. Jones that you saw and you have to see the same one, probably. I saw Mrs. Jones and you saw her, too. [But if there were two Mrs. Jones?] You saw her - I don’t know. I guess if there were two you could say one. It would sound funny. [Suppose your grandmother and your mother are both Mrs. Smith, so you might be able to see two of them at the same time.] I saw Mrs. Smith and you saw them, too - ha ha - that sounds - and you saw them, too - I saw Mrs. Smiths . . . I don’t know.
3.3.2 Explanations of deviance
While our subjects very often rejected syntactically deviant but meaningful expressions, they ordinarily, and improbably, explained their rejection on semantic grounds; e.g.,
E: How about Kari and Kirsten is a sister.
S4: Funny.
E: Why is it funny?
S4: Because that doesn’t make sense.
E: How would you say it?
S4: Kari and Kirsten are sisters.
This happened with trying consistency. Since the subject easily provided a paraphrase, he had obviously grasped the sense of the sentence; but even so he often ‘explained’ the peculiarity of the sentence by denying its meaningfulness. Again, this confusion is not restricted to children; one has only to make the case a bit more difficult. Thus adults given the sentence I saw the queen of England and you saw one, too will often reject it on the grounds that there is only one queen of England. The fact that the sentence would sound just as odd if there were fifty queens of England entirely escapes their notice (Gleitman, 1961). Of course very often a semantic explanation is appropriate; here is an example from a five-year old:
E: George frightens the color green.
S1: No, because green is used to boys.
E: If there was a color that never saw children, it could be frightened?
S1: No. It couldn’t be frightened because - ‘cause - I’m thinking, okay, Mom? . . . No, ‘cause colors don’t have faces of paint. You talking about paint?
We have seen that semantic ‘explanations’ are common among our subjects, even where they are inappropriate. Yet there also are many instances in which the children, including the five-year olds, point quite precisely to a syntactic violation; e.g.:
E: I think that any rain will fall today.
S3: You can’t say any there.
E: I am knowing your brother.
S4: It’s not right English. It should be I know your brother, not I am knowing your brother.
E: Two and two are four.
S6: I think it sounds better is.
It is worth noting that these children, not yet exposed to grammar exercises in school, nonetheless have definite opinions that take the form ‘you can’t say . . . you have to say.’ To this extent, the children seem to adopt a frame of reference in answering these questions that is similar to our own. The non-paraphrastic responses often observed in the two-year old subjects (see Table 2) have disappeared. The quality of explanation changes markedly with age (whether it also changes with intelligence, schooling, and the like, is a question we cannot speculate on). Some further ticklish differences in the frame of reference for dealing with our experimental question are left unanswered here. As a final comment, however, the
following kind of response would probably be exceedingly rare in adult subjects, but it occurs more than once in our sample. (We were trying to find out whether is and are are both acceptable in sums): E: Two and two are four. S3: Yeah. E: Can you think of any other way to say that? S3: Three and one?
3.4 Discussion
We now consider the factors that determined the behavior of these subjects in responding to the question: Is the following sentence ‘good’ or ‘silly’? A number of tangled issues of truth, plausibility, meaningfulness, and syntactic patterning enter into the interpretation of these findings. Did our subjects distinguish between implausible or false expressions and linguistically anomalous ones? Even if they did, did they really contemplate the constraints on arrangements of words and phrases (syntax) or did they consider only the meanings of such arrays and the entities that comprise them (semantics)? Below we comment on our subjects’ approaches to these fine distinctions. These matters are presumably of some importance, perhaps especially to those psychologists who claim that ‘semantics is what is important’ about language and language learning, and that the transformational foray into syntax is in some ways uninteresting or not cogent for psychologists of language. In our view there are really two issues. One is the immediate question about the factors that determined our subjects’ judgments. The other concerns the general problem of distinguishing syntax from meaning.

3.4.1 What makes a sentence ‘silly’: Falsehood or ill-formedness?
There are many ways that a sentence can be ‘silly’. For example, there are quite different oddities involved in saying Mud makes me clean versus Mud drinks my ankle. Notice that the negative of the first is entirely unexceptional (Mud does not make me clean), while the negative of the second is precisely as odd as the positive (Mud does not drink my ankle). Stated more generally, Mud makes me clean is implausible to the extent that mud is rarely a cleansing agent, but it is a ‘good’ sentence of English. Is this true of Mud drinks my ankle? Again, some would say that this is a good sentence of English on the grounds that it is a case of a noun-phrase followed by a verb-phrase in which the right-completion of the verb drink (a noun-phrase) is also correct, given the gross patterns of English. But most linguists would respond that a description of the English language that fails to account for the oddity of
this sentence is primitive and incomplete (after all, if linguists disclaim responsibility for this phenomenon, who is to handle it?). In the standard transformational formulation, such oddities are described as violations of selectional restrictions that obtain among the words and phrases of the language (for discussion, see Chomsky, 1965): drink requires an animate subject while mud is not an animate noun. Knowledge of selectional restrictions on words is claimed to be part of the lexical information that speakers have internalized. (Whether this information is ‘semantic’ or ‘syntactic’ is a question to which we will return.) Linguists are less concerned (although not utterly unconcerned) to account for the implausibility of Mud makes me clean, which more clearly turns on the language user’s ‘knowledge of the world’ as opposed to ‘knowledge of language.’ How do speakers interpret the instruction to tell whether or not a sentence is ‘silly’? By and large, adults will accept Mud makes me clean with only mild waffling, and they will reject Mud drinks my ankle. They accept implausible sentences and reject violations of selectional restrictions. In contrast, two-year olds, as already mentioned, seem to reject implausible sentences. For example, Find book is ‘corrected’ as Close book. Our guess is that these non-paraphrastic responses are attempts to come up with more plausible expressions. Similarly, the five-year olds studied here sometimes reject sentences on these grounds:
E: I am eating dinner.
S2: I would hate that.
E: Why?
S2: I don’t eat dinner anymore.
E: How about I am eating breakfast?
S2: Yum, yum, good!
On the contrary, the older subjects rarely reject sentences on the basis of implausibility or falsehood. If the experimenter suggests that they do so, they consider this only a joke:
E: I think that any rain will fall today.
S3: Well, any is not the right word. You should say I don’t think that any rain is going to come down. Right?
E: Okay. It’s a pretty nice day anyway.
S3: So it’s not gonna rain, so that’s why I’m probably right (gales of laughter).
Similarly, Claire points up this distinction in one response:
E: The color green frightens George.
S6: Doesn’t frighten me, but it sounds O.K.
On the other hand, violated selectional restrictions - which indeed lead to a bizarre meaning - are uniformly rejected. For a simple case such as Golf plays my brother,
all subjects say that it is ‘backwards’ or that ‘it doesn’t make sense.’ They may also provide a reading for the deviant sentence:
S3: Ha ha. That doesn’t sound right. You should say My brother plays golf instead of golf plays brother - that would mean a golf ball or something bats the boy over the thing.
It is also worth noting that S1, who for mysterious reasons of his own rejects I am eating dinner, rejects Golf plays my brother on appropriate grounds:
S1: I hate it cause it’s backwards.
While the tendency to reject implausible but ‘correct’ sentences diminishes with the older subjects, it does not disappear. All subjects reject George frightens the color green, which violates a selectional restriction on frighten. But some also reject The color green frightens George on grounds of implausibility:
S7: No, because green is just still. It isn’t going to jump up and go BOO!
Nor should we expect categorical acceptance of implausible expressions in the light of the vagueness of these instructions. The point here is not whether categorical behavior is found: given these instructions, some adults will also reject such sentences. (After all, the idea is silly). Much more centrally, the plausibility dimension seems highly salient for two-year olds (see Table 2), is still sometimes apparent in five-year olds, and becomes much less salient as the determinant of judgments among the older children and adults. As we will now discuss, syntactic dimensions become more potent with age.

3.4.2 What makes a sentence ‘silly’: syntactic deviance or semantic anomaly?
We have so far seen that what differs among our subjects, and what differentiates them most clearly from adults, is the precise understanding of the question: Is the following a good sentence of English? Do they respond in terms of truth and plausibility or of form? One might ask in addition: do they respond in terms of meaning or form? (Of course this further question raises serious problems of definition, most of which we regretfully ignore, for they reach well beyond the scope of this paper.)
While syntactic structure appears to be the basis of many of the rejections of sentences we have cited thus far, it may be argued that semantic anomalies arose from the syntactic deviations, and that it was the semantic anomaly to which these subjects responded. In that event, the best test cases for sensitivity to syntax would be those sentences whose syntactic deviations have the least semantic force. These will usually be low-order violations of phrase-structure constraints. Examples among our stimuli are sentences in which number concord is violated (John and Jim is a brother) or in which determiners required by count nouns are missing (Boy is at the door). Similar to the last instance are the foreshortened forms without particles and determiners presented to two-year olds (Bring book). These cases can be contrasted with deviations from well-formedness that, at least intuitively, do more radical violence to meaning, such as Golf plays my brother, I think that any rain will fall today, and, for the two-year olds, Ball me the throw. If the children notice only semantic anomaly and ignore syntactic patterning, they should accept sentences of the first sort (John and Jim is a brother) and reject sentences of the second sort (I think that any rain will fall today). The youngest subjects were indeed most responsive to deformations which obscured or complicated semantic interpretation. Thus the two-year olds gave clear-cut data only for sentences with word-order reversals such as Ball me the bring whose meaning is obscure. Similarly, the five-year olds often accepted a deviant sentence if it was odd only in its word-arrangement but still clear in meaning (e.g., John and Jim is a brother). But nevertheless there are some indications that even the youngest children were sensitive to syntactic issues as such. For example, one of the two-year olds (Allison) was much more likely to judge a telegraphic sentence as ‘silly’ than a well-formed one. The sensitivity to syntax is more obvious in the five-year olds who sometimes noticed syntactic oddities that yielded no semantic problems. One of them rejected Boy is at the door, and spontaneously added, ‘Boy is at the door isn’t a good sentence’. From six years on, the salience of syntactic deviance is no longer in doubt. All children over five years of age rejected Boy is at the door and John and Jim is a brother. Each provided the appropriate paraphrase so they obviously understood these expressions. They were then rejected solely on grounds of syntactic nicety. Beyond the immediate issue we have just considered is the question of whether a relevant distinction can in general be drawn between syntax and semantics. Certain psycholinguists seem to believe that it can.
They seem to assume that constraints on word order and the like, insofar as these are not merely historical accidents, are relevant only to the nature of memory, sequencing of outputs, and other issues of linguistic information processing. What they fail to notice, or misinterpret, is that very much of what we mean by meaning is expressed through syntactic devices. Notions such as subject, predicate, noun, adverbial phrase, and the like, are the categories and functions described in the syntax of the language; but of course these are not semantically empty notions. The movement within transformational linguistics known as ‘generative semantics’ (e.g., Lakoff, 1972) is an attempt, as we understand it, to merge the semantic and syntactic descriptions of the language in a way more perspicuous than in Chomsky’s (1965) formulation. But whatever success this venture may have, it is bizarre that any version of transformational-generative grammar could be viewed as describing ‘merely’ the semantically empty syntax of the language. Clearly these theories have always been attempts to describe the complex interweave of form and meaning that natural languages represent.
This being our view, it is hard for us to argue any more strenuously than we have that our subjects are aware of English syntactic structure. If it can be shown that the features of syntactic structure that these children note and comment on always have some semantic content, that can come as no surprise to us, and cannot mitigate our interpretations of these findings.
4. Conclusions
4.1 The child as language knower
All of the children we have studied show at least a muddy capacity to be reflective about knowledge. Even the two-year olds provided nonrandom classifications of simple sentences: the fact that they undertook this task at all is evidence of at least the rudiments of a meta-linguistic skill. A child who can do this must already be said to know something about language that the spider does not know about web-weaving. The ability to reflect upon language dramatically increases with age. The older children were better not only in noting deviance but also in explaining where the deviance lies. By and large the five-year olds offered only paraphrastic corrections. They did not add much in the way of explanation, even though they indicated that there are ways ‘you have to say it’ and that some of the sentences are just ‘not right’. In contrast, the older children often referred to linguistic categories (e.g., ‘you can’t say it with a name’) and occasionally changed the lexical classification of a familiar word, thus rendering a deviant sentence well-formed (e.g., ‘you can’t say that unless you are “a Green” ’; ‘One person saw a queen and one person saw a one - whatever a “one” is’). This achievement is all the more impressive considering the fact that many adults have serious difficulty when required to change the categorial status of a word (Gleitman and Gleitman, 1970). But even where the subjects offered only example or paraphrase, the older ones sometimes came up with all of the data relevant to writing a rule of grammar. Claire’s response to Boy is at the door is a case in point: If his name is Boy. The kid is named John, see? John is at the door, or The boy is at the door, or A boy is at the door, or He’s knocking at the door. Most of the main distinctions among noun and noun-phrase types (count, proper, pronoun; definite versus indefinite noun-phrase) are neatly laid out in this response. Such manipulation of linguistic data is a not inconsiderable accomplishment.
It is after all the modus operandi of the practicing linguist. We should reiterate that the abilities we have demonstrated in some children will
not necessarily appear in very many. Our claims are simply existential. The lack of normative data is only one of the reasons for this caution. A number of studies (e.g., Pfafflin, 1961; Gleitman and Gleitman, 1970) have shown that there are substantial individual differences among adults in the ability to deal with classificatory problems. These differences in meta-linguistic skills are not attributable solely to differences in non-linguistic matters such as memorial capacity (Geer, Gleitman, and Gleitman, 1972). Under the circumstances, it is only reasonable to suppose that such differences already exist among young children. Chomsky’s demonstration (1969) of individual differences in the recognition of transformational features of verbs in six to ten-year olds gives further grounds for this belief.

4.2 Meta-linguistic functions compared to other meta-cognitive processes
At least in adults, there are some other ‘meta-cognitive’ processes which seem to be similar in some ways to the meta-linguistic functions we have just considered. We think and we sometimes know that we think; we remember and sometimes know that we remember. In such cases, the appropriate cognitive process is itself the object of a higher-order cognitive process, as if the homunculus perceived the operations of a lower-order system. But the lower-order process often proceeds without any meta-cognition. This is certainly true for language whose use (even in professional grammarians) is often unaccompanied by meta-linguistic reflection. Similarly for other cognitive processes such as memory: we see a friend and call him by name without any awareness that we have just recognized and recalled. The important point is that we can deal with memory in a meta-cognitive way, just as we can reflect meta-linguistically. Examples of meta-cognition in memory are recollection (when we know that we remember) and intentional learning (when we know that we must store the material for later retrieval). Another example is the phenomenon of knowing that one knows - that is, has stored in memory - some item of information even though one cannot recall it at the moment (e.g., the ‘tip-of-the-tongue’ phenomenon, Brown and McNeill, 1966; memory monitoring, Hart, 1967). Developmental evidence suggests that these various meta-cognitive processes may be closely related. In particular, their time of emergence seems suspiciously close to the five to seven year age range in which we found adult-like performance on meta-linguistic tasks. Whether Piaget’s stage-analysis can handle such findings is another matter, but it is interesting to note that the period from five to seven is just about the time when children begin to explain their judgments of space and number (Ginsburg and Opper, 1969). Similarly for monitoring processes in memory.
Several Russian investigators have shown that intentional strategies for remembering are
rarely adopted before five years of age but are increasingly utilized thereafter (Yendovitskaya, 1971). Whether these relations among the various meta-cognitive functions will turn out to be more than mere analogies remains to be seen.
The primary emphasis of the present paper has been on the meta-cognitive aspect of language behavior, for it is this that allows us to say that language is not only used but known. The results indicate that this kind of knowledge is found even in children. Consider a reaction to Chomsky’s paradigm (1969) in which the child is shown a blindfolded doll and is asked: ‘Is this doll easy or hard to see?’ (Claire, age 8):
CG: Easy to see - wow! That’s confusing.
LG: Why is it confusing?
CG: Because it’s hard for the doll to see but the doll is easy to see and that’s what’s confusing.
Or again, on the ambiguity of ask to:
LG: What would it mean: I asked the teacher to leave the room.
CG: It would mean I asked the teacher if I could leave the room to go to the bathroom or it would mean I asked the teacher to leave the room so I could go to the bathroom in privacy.
Here knowledge is explicit. The child has moved from mere language use to serious innovation and creativity, to contemplation of language as an object. Such skills are frequently manifest in six, seven, and eight-year olds. We believe it is this kind of language activity that is most intriguingly engaged and convincingly explained by transformational theory; this is hardly surprising because just such data are the methodological prerequisite for grammar construction.
REFERENCES
Bever, T. G., Lackner, J. R., and Kirk, R. (1969) The underlying structures of sentences are the primary units of immediate speech processing. Percept. Psychophys., 5, 225-234.
Bloom, L. (1970) Language development: Form and function in emerging grammars. Cambridge, Mass., M.I.T. Press.
Braine, M. D. S. (1963) The ontogeny of English phrase structure: The first phase. Language, 39, 1-13.
Brown, R., and Bellugi, U. (1964) Three processes in the child’s acquisition of syntax. In E. H. Lenneberg (Ed.), New directions in the study of language. Cambridge, Mass., M.I.T. Press.
Brown, R., Cazden, C., and Bellugi, U. (1969) The child’s grammar from I to III. In J. P. Hill (Ed.), Minnesota symposium on child psychology. Minneapolis, University of Minnesota Press.
Brown, R., Fraser, C., and Bellugi, U. (1963) Explorations in grammar evaluation. Monographs of the Society for Research in Child Development, 29 (1) Serial No. 92. Reprinted in R. Brown (1970) Psycholinguistics. New York, The Free Press.
Brown, R., and McNeill, D. (1966) The ‘tip of the tongue’ phenomenon. J. verb. Learn. verb. Behav., 5, 325-337.
Chomsky, C. (1969) The acquisition of syntax in children from 5 to 10. Cambridge, Mass., M.I.T. Press.
Chomsky, N. (1964) Degrees of grammaticalness. In J. A. Fodor and J. J. Katz (Eds.), The structure of language. Englewood Cliffs, N.J., Prentice-Hall.
Geer, S., Gleitman, H., and Gleitman, L. (1972) Paraphrasing and remembering compound words. J. verb. Learn. verb. Behav., 11, 348-355.
Gelman, R., and Shatz, M. (in preparation) The development of communication skills: Modification in the speech of young children as a function of listener.
Ginsburg, H., and Opper, S. (1969) Piaget’s theory of intellectual development. Englewood Cliffs, N.J., Prentice-Hall.
Gleitman, L. (1961) Pronominals and stress in English conjunction. Language Learning, University of Michigan, Ann Arbor, Michigan.
Gleitman, L., and Gleitman, H. (1970) Phrase and Paraphrase. New York, W. W. Norton.
Hart, J. T. (1967) Memory and the memory-monitoring process. J. verb. Learn. verb. Behav., 6, 385-391.
Hill, A. A. (1961) Grammaticality. Word, 17, 1-10.
Johnson, N. F. (1965) The psychological reality of phrase-structure rules. J. verb. Learn. verb. Behav., 4, 469-475.
Katz, J. (1964) Semisentences. In J. A. Fodor and J. J. Katz (Eds.), The structure of language. Englewood Cliffs, N.J., Prentice-Hall.
Labov, W. (1970) The logic of non-standard English. In F. Williams (Ed.), Language and Poverty. Chicago, Markham Press.
Lakoff, G. (1972) Generative Semantics. In D. Steinberg and L. Jakobovits (Eds.), Semantics: An interdisciplinary reader in philosophy, linguistics, anthropology and psychology. London, Cambridge University Press.
Lenneberg, E. (1967) Biological foundations of language. New York, J. Wiley and Sons.
Maclay, H., and Sleator, M. (1960) Responses to language: Judgments of grammaticalness. Intern. J. Amer. Ling., 26, 275-282.
McNeill, D. (1966) The creation of language by children. In J. Lyons and R. J. Wales (Eds.), Psycholinguistics papers. Edinburgh, Edinburgh University Press.
Menyuk, P. (1969) Sentences children use. Cambridge, Mass., M.I.T. Press.
Miller, G. A., and Isard, S. (1963) Some perceptual consequences of linguistic rules. J. verb. Learn. verb. Behav., 2, 217-228.
Miller, W., and Ervin, S. (1964) The development of grammar in child language. Monographs of the Society for Research in Child Development, 29, 2-34.
Pfafflin, S. M. (1961) Grammatical judgments of computer-generated word sequences. Murray Hill, N.J., Bell Telephone Laboratories, Mimeo.
Savin, H., and Perchonock, E. (1965) Grammatical structure and the immediate recall of English sentences. J. verb. Learn. verb. Behav., 4, 348-353.
Shipley, E. F., and Shipley, T. E., Jr. (1969) Quaker children’s use of thee: A relational analysis. J. verb. Learn. verb. Behav., 8, 112-117.
Shipley, E. F., Smith, C. S., and Gleitman, L. R. (1969) A study in the acquisition of language: Free responses to commands. Language, 45, 322-342.
Yendovitskaya, T. V. (1971) Development of memory. In A. V. Zaporozhets and D. B. Elkonin (Eds.), The psychology of preschool children. Cambridge, Mass., M.I.T. Press.
Ziff, P. (1964) On understanding understanding utterances. In J. A. Fodor and J. J. Katz (Eds.), The structure of language. Englewood Cliffs, N.J., Prentice-Hall.
Résumé
This article presents examples of children’s knowledge of the syntactic and semantic properties of language. The rudiments of ‘meta-linguistic’ functioning can be demonstrated in two-year-old children, who give judgments of grammaticality in a play situation. The authors examined the development of these capacities in a group of children aged 5 to 8, who were explicitly asked to give judgments of deviant sentences. The results suggest that adult-like behavior appears between 5 and 8 years. It is suggested that there are possible relations between ‘meta-linguistic’ functioning and other ‘meta-cognitive’ processes.
2
The effects of motor skill on object permanence
T. G. R. BOWER JENNIFER G. WISHART University of Edinburgh
Abstract Piaget found that infants in the first year of life will not remove a cloth or a cup that they have seen cover a toy. Part of the difficulty is a motor skill problem. However, deficits in motor skill are not sufficient to account for the failure in the situation. We cannot assume that out of sight is out of mind for such infants for they will reach out to obtain an object that has been made to go out of sight by switching off the room lights, leaving the baby in total darkness.
Piaget (1954) found that if one presents a six-eight month old infant with an attractive toy that is then covered with a cloth or a cup before the infant can grasp it, the infant will make no attempt to remove the obstacle to get at the toy. This observation has been interpreted as showing that such infants think that an object that is no longer visible no longer exists; out of sight is, allegedly, out of mind at this stage of development. Other investigators, by contrast, have found that infants of twenty weeks or less do seem to believe that objects that are out of sight still exist in a localizable place. Thus, Bower, Broughton and Moore (1971) found that infants of twenty weeks were able to anticipate the reappearance of an object that had moved out of sight behind a screen. Mundy-Castle and Anglin (1970) found that even younger infants were able to do this and in addition to follow the invisible path of the object while it was out of sight, even when that path was complex (Fig. 1). These experiments would thus seem to show that out of sight is not out of mind for infants of twenty weeks or more. The standard interpretation of Piaget’s classic observations is in contradiction with the more recent observations described above. Bower (1967) argued that the contradiction was more apparent than real. Bower et al. (1971) and Mundy-Castle and Anglin (1970) used eye movements as indicator behaviors to make manifest
the infants’ knowledge about objects. Piaget, by contrast, used hand movements, manual search behavior, as an indicator. Manual search behavior is a far more complex task for an infant than is visual search behavior. Control of the hand develops much later than control of the eyes. Bower (1967) thus argued that the apparent contradiction between Piaget and the other authors cited was simply a matter of experimental method: The infants in the Piagetian manual search situation knew that the toy was under the cup but they did not know how to remove the cup to get at the object. Bower presented data in support of this hypothesis gathered in an informal experiment in which 5-6 month old infants were presented with a toy that was covered, before they could reach it, with a transparent cup. Since an object under a transparent cup is fully visible all the time, a failure with it could not be put down to out of sight meaning out of mind. Failure with a transparent cup could only result from lack of manual skill, and the same lack of manual skill could explain failure with an opaque cup, without thereby suggesting that the infants did not know that an object they had seen hidden under a cup was under the cup.

Figure 1. Mundy-Castle and Anglin used the apparatus shown here. They found that, after the object disappears in one window, prior to its reappearance in the other, infants of 16 weeks will interpolate an eye-movement trajectory corresponding to the invisible path of the object.
The results gathered from what was an extremely unsystematic experiment favored the hypothesis that 5-6 month old infants fail the standard Piagetian test because they lack the motor skill to pick up a cup; the infants failed with both transparent and opaque cups. The data was gathered very unsystematically and, indeed, was not reported as an experiment. Yonas (pers. comm.) and Gratch (pers. comm.) have performed more systematic replications, finding the opposite result, that infants who failed with opaque cups could remove transparent cups. Since the result is important for theories of the development of the object concept, it seemed worthwhile performing the experiment with more control of conditions than had been employed before in any of these studies. The first experiment was thus a systematic replication of the observations reported by Bower (1967). Although at first sight there would seem to be few methodological problems involved in testing whether or not an infant knows there is an object under a cup, this first impression would be very misleading. Object permanence experiments are done under somewhat relaxed conditions with a heavy emphasis on rapport between subject and experimenter. Given the need for rapport, it is essential to have very precise definitions of what constitutes a passing response or a failing response. The mere fact that a child finishes up with the object that was hidden in his hands is not a scorable piece of information. The following considerations must be borne in mind in devising criteria for a pass or fail response in an object permanence situation. First of all it must be certain that the subject can pick up an object from a foreplane surface. If the baby cannot pick up any object at all, there is little point in checking whether or not the subject can pick up an occluder to get at another object. 
If the subject can pick up an object, then it follows that he can pick up an occluding object, provided the occluder is not too large. However, picking up an occluding object is not the same thing as picking up an occluding object in order to get at an object that had been hidden underneath the occluder, and it is the latter action that we wish to consider criterial in this situation. Piaget has certainly never denied that infants who do not have object permanence can nevertheless pick up objects. The special characteristic of picking up an occluder in the object permanence situation is that the occluder is not picked up for its own sake but is removed to allow the infant to get at the object that has been occluded. It is this ability to conjoin actions rather than the mere ability to pick up an occluder that Bower (1967) thought was lacking in the infants who failed the standard object permanence test. The problem is thus to decide whether an infant who picks up an occluder is picking it up for its own sake or in order to get at the object inside the occluder. The criterion was as follows. Prior to the beginning of object permanence testing the infants were presented with a toy placed on the table top before which they sat. The time from presentation up to successful capture of the object
was recorded; this time interval will be referred to as free capture time. It was determined that if an infant removed an occluder and then picked up the object that had been under the occluder, with the time from removal of the occluder to picking up the object less than that infant’s free capture time, the infant would be recorded as having picked up the occluder to get the object, a successful response. Picking up an occluder without getting the object that had been occluded or only getting it after an interval longer than free capture time was scored as a failure.
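The scoring criterion just described can be restated as a short Python sketch. This is only an illustration under stated assumptions, not the procedure's actual scoring sheet: the names `Trial` and `score_trial` and the use of seconds from trial onset are hypothetical, and a latency exactly equal to free capture time, which the text does not address, is treated here as a failure.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Trial:
    """Hypothetical record of one occluder trial (times in seconds from trial onset)."""
    occluder_removed_at: Optional[float]  # None if the occluder was never removed
    toy_captured_at: Optional[float]      # None if the toy was never picked up


def score_trial(trial: Trial, free_capture_time: float) -> str:
    """Return 'pass' only if the toy is captured after the occluder is removed,
    with a latency shorter than that infant's free capture time."""
    if trial.occluder_removed_at is None or trial.toy_captured_at is None:
        # Occluder left in place, or occluder removed without getting the toy.
        return "fail"
    latency = trial.toy_captured_at - trial.occluder_removed_at
    # Assumption: a latency equal to free capture time counts as a failure.
    return "pass" if latency < free_capture_time else "fail"
```

The comparison is made against each infant's own free capture time rather than a fixed cutoff, which is what lets the criterion separate removing the occluder for its own sake from removing it to get at the toy.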
Subjects 16 twenty-one week old infants served as subjects, 8 male, 8 female.
Procedure Subjects sat at a plain brown wooden table, facing the experimenter. A stylized manikin 4.0 cm high, painted day-glo pink, by 1.5 cm in diameter was used as a toy. Previous work had found this to be a desirable enough toy. The transparent occluder was a plastic cup 6 cm high by 3 cm in diameter, with a transmission ratio of .7, so that an object within it could be clearly seen while the cup itself was clearly visible. The opaque occluder was a white plastic cup 6 cm high by 3 cm in diameter, that was perfectly opaque. The infants were presented with the toy, placed within reach, and their free capture time recorded. After 15 seconds the toy was taken away from the baby. In its original location was placed one of the occluders, the opaque to 8 babies, the transparent to the remainder. Free capture time was recorded. The occluder was then taken away and the toy replaced in its original location. Before the baby could take the toy again the opaque occluder was placed over the toy. The baby was then given three minutes to remove the occluder before the trial was terminated. At the end of the trial the occluder was removed revealing the object which the infant was allowed to pick up and retain for 15 seconds. At the end of this time the toy was removed and replaced in its original location, this time being covered by the transparent occluder. Trial duration was again three minutes, save that if an infant had a hand on the occluder at the end of three minutes he was given a further two minutes to complete his response. At the end of this period, if the infant had the toy, it was taken away, replaced and recovered by the opaque occluder, with a trial duration equal to that given with the transparent occluder. If the infant did not have the toy at the end of the transparent occluder trial, the occluder was removed, and the infant allowed to take and retain the toy for 15 seconds before the second opaque occluder trial was begun.
Results
The results are summarized in Table 1. As can be seen there, the hypothesis that there is no difference between an opaque and a transparent occluder as obstacles in a manual search task can be clearly rejected. The opaque occluder was far more difficult than the transparent occluder. On the other hand, it cannot be concluded that the transparent occluder offered no difficulties at all. Only eight infants were clearly able to pick up the occluder to get at the toy. The latency of picking up the transparent occluder when there was a toy inside it was far greater than the latency to pick up the occluder alone, indicating that the conjoined response was more difficult.
Table 1.

Condition      N. picked up   Mean time to       N. picked   Mean time to   N. within free
               occluder       pick up occluder   up toy      pick up toy    capture time
Opaque 1       0              -                  0           -              -
Transparent    14             11.5 secs          10          40             8
Opaque 2       2              125 secs           2           35             2

Mean free capture time for object: 45 secs
Mean free capture time for occluder: 55 secs
Discussion
The result of this experiment reopens the issue of the apparent contradiction between the eye movement results cited above and the classic Piagetian manual search task. The transparent occluder did pose problems but not enough to account for the difficulties shown with the opaque occluder. Part of the problem in the classic situation is behavior sequencing but it is obvious that the opaque occluder, with the toy out of sight, produced yet more serious problems, seemingly implying that out of sight is out of mind in the manual search situation. Since out of sight is definitely not out of mind in eye tracking situations, this raises the theoretically interesting possibility that there is a process of decalage operating between the eye movement control system and the hand movement control system; with, at this stage of development, the eye movement control system knowing that out of sight objects still exist while the hand movement control system has not yet incorporated this information. As Piaget normally uses the term decalage it is applied to extension of information from one situation to another that resembles the initial one formally if not in detail. One can thus speak of decalage between conservation of volume and conservation of weight since the situations are formally similar and developmentally asynchronous. It would not be doing violence to the concept to use it in the context
of an asynchrony between an ability to find objects that have gone out of sight by
eye and the ability to do this with the hand. However, the statement ‘out of sight is out of mind’ is a very broad one. The data of the experiment described above do not unambiguously support such a statement. The transparent occluder did produce difficulties, so that the resulting non behavior might have resulted from a summation of occluder effects with effects of the out of sight condition, without the latter being so severe that the infant thought that the occluded toy no longer existed. To test this hypothesis it is necessary to have an out of sight condition that imposes no behavioral problems or only very minimal ones. If out of sight is out of mind for the hand the absence of behavioral problems will not help the infant. If, on the other hand, out of sight is simply a problem, then the absence of the additional behavioral problems posed by the classic situation might allow the babies to succeed. The next experiment was designed to test this hypothesis.
Subjects 12 twenty-week old infants, 6 male, 6 female, served as subjects.
Procedure The subjects were given a standard Piagetian object permanence test as described in Experiment 1. All of them failed to do anything with the occluder. They were then given a different out of sight condition. The table used before was removed. The manikin was presented on the end of a string, dangling in front of the baby. Before the baby could reach out for the toy, the room lights were extinguished. Since the room was light tight, this left the baby in total darkness. The toy was thus out of sight, as was everything else in the environment. The babies’ behavior was observed and recorded with an infra-red T.V. system, the vidicon of which was sensitive to light between 850 and 875 millimicrons. Illumination in this spectral band, which is totally invisible to the human eye, was provided by a specially constructed light source. The babies were left alone in darkness for three minutes. At the end of this time the standard object permanence test was repeated.
Results None of the infants passed the standard object permanence test on either presentation. All of them were able to reach out to obtain the object out of sight in darkness. The reaching in the dark was accurate. The hands went straight to the object locus - see Fig. 2 - even after initial periods of distress lasting as long as 90 seconds.
Discussion
It thus seems that out of sight is not necessarily out of mind, not even that part of the mind that controls hand movements, provided the transition to out of sight is accomplished by plunging the room into darkness. One could infer from this that out of sight is not out of mind in the standard test situation either, the difficulties of the motor task simply summing with difficulties created by the fact that the object is no longer visible. On the other hand, one should beware of equating all the changes in stimulation that result in disappearance of an object. The psychophysics of disappearance has been studied (Michotte, 1962; Bower, 1967; Gibson et al., 1969) and it is clear that some disappearance sequences result in perception of the continued existence of the object that has disappeared while others have just the opposite result, the vanished object seeming to no longer exist anywhere. It is possible that disappearance under a cup or a cloth is a disappearance transformation of the latter sort, for infants of 5-9 months. More careful psychophysical work will be required to decide the issue.
REFERENCES
Bower, T. G. R. (1967) The development of object permanence: Some studies of existence constancy. Percept. Psychophys., 2 (9), 411-418.
Bower, T. G. R., Broughton, J. M., and Moore, M. K. (1971) Development of the object concept as manifested in changes in the tracking behavior of infants between 7 and 20 weeks of age. J. exper. Child Psychol., 11, 182-193.
Gibson, J. J., Kaplan, G. A., Reynolds, H., and Wheeler, K. (1969) The change from visible to invisible: A study of optical transitions. Percept. Psychophys., 5, 113-116.
Michotte, A. (1962) Causalité, permanence et réalité phénoménales. Louvain, Publications Universitaires, Belgium.
Mundy-Castle, A. C., and Anglin, J. (1970) The development of looking in infancy. Paper read at Society for Research in Child Development, Santa Monica, California, April 1970.
Piaget, J. (1954) The construction of reality in the child. New York, Basic Books.
Résumé
Piaget found that children under one year of age do not remove a cover with which they have seen a toy being covered. Part of this difficulty is due to problems of motor skill. However, we consider that motor problems are not sufficient to account for the failures in this situation. One cannot be sure that ‘out of sight is out of mind’ for these children. Indeed, they try to grasp an object that has become invisible when the room light has been switched off, leaving them in darkness.
3
A comparison of sign language and spoken language¹
URSULA BELLUGI SUSAN FISCHER The Salk Institute for Biological Studies
Abstract Evidence is presented which suggests that a sign in the American Sign Language takes longer to produce than a spoken word, but that a proposition takes about the same amount of time to produce in either language, or either modality for some signers. Properties of American Sign Language which can account for both of these facts are then discussed.
As a way of investigating the biological foundations of language, a small research group at The Salk Institute has undertaken a series of studies of the sign language used by the deaf among themselves.2 We have been studying the structure and form of the language, and how it is acquired by deaf children of deaf parents. We are interested in the form of the language. It seems to us that the change in modality of perception (the ear to the eye) and of production (the vocal apparatus to the hands and body) may affect the form which the language takes. In this paper we shall compare the rate of articulation in the two languages and speculate on the consequences of this comparison.
1. This research was supported by the National Institutes of Health Grant No. 09811-01 to the Salk Institute for Biological Studies.
2. Our research group has included Dr. Edward Klima, Adele Abrahamson, Robbin Battison, Nancy Frishberg, Krystina Hooper, Richard Lacy, and Patricia Siple. In addition, Mrs. Bonnie Gough, who is deaf from birth and highly articulate in sign language, is a member of the project.

1. The parameters of American Sign Language (ASL)
Let us examine the act of signing quite superficially from the outside, to establish a few basic parameters. If an outsider watches deaf people engaged in conversation, he will undoubtedly not be able to tell even what the topic of conversation is between them.
He will see arms flying and hands moving rapidly. Let us consider just this aspect. The language is basically produced by the hands, although the face and bodily movements and eyes do play a role. What is the space used in signing? There is a base plane, about at the level of the hands when clasped together in front of the body. Very few signs are made below the waist. The base plane is the locus for certain phrasal cues, which are a part of signing. When viewed under slow motion, one can sometimes see the signer’s hands come to the base plane at the end of what might be equivalent to a clause or sentence. Very few signs are made above the level of the top of the head. So there is an area bounded by the top of the head and the waist. We could likewise define the usual area for signing in terms of the reach of the hands from side to side. Consider the relation of the hands in signing. Some signs are made with one hand only: RED, HOME, DUCK. Some signs are made with two hands and both hands move (usually in similar ways): SIGN, BOOK, ANIMAL. Some signs are made with two hands, where one hand moves and the other acts as a base: KEY, VOTE, BUTTER. Some signs are made without contacting the head, torso, or other part of the body; that is, they are made in the space in front of the body: YELLOW, MILK, ROOM. And some signs are made in contact with various parts of the body from the head to the waist: FATHER, is made on the forehead; INSECT is made on the nose; MOTHER is made on the chin; HOME is made on the cheek; CURIOUS is made on the throat; POLICEMAN is made just below the shoulder; LIKE is made starting from the chest; AUTUMN is made along the lower arm near the elbow; LAW is made on the palm of the hand; CHOCOLATE is made on the back of the hand, NAVY is made at the waist, etc. One can see that even for simple sentences there may be considerable changes of location involved in making the signs. 
As a hypothetical example, consider the content words in the sentence: ‘Father likes autumn at home’. This would involve a change in location of the hand from contact with the forehead, moving to the chest area, then to the lower arm near the elbow, and ending on the cheek. Now let us consider some of the types of movement that are involved in signing. Some are quite simple and direct: the hand, in a distinctive configuration, contacts some part of the body, as in WORD, TIME, and DEER. But very few signs have such simple motion. Some signs are made by brushing against the area of contact, often more than once, as in BREAD, DOLL, WEEK. Some are made by a movement based on twisting the wrist after contact is made as in APPLE, COW, ONION. Some signs make an initial contact, move away in a slight arc and make contact in another location as in QUEEN, PICTURE, and YESTERDAY. Some involve wiggling some or all of the fingers as in INSECT, COLOR, FIRE. Some are made by closing some parts of the hand and then flicking open fingers and/or thumb as in MARBLE, BAWL-OUT, and FROG. Some signs involve grasping some part of the body or clothing as in TICKET, CANADA, and CURIOUS. Some signs involve a circular motion with one hand: REASON, TEA, FOUNDATION; and some signs
involve circular motion with two hands: YEAR, WORLD, SIGN. This describes only some of the motions involved in making signs: Most are more complex and take up more time than a simple contact point. Consider some of the hand configurations involved in making signs. The hand with palm open and fingers spread apart is involved in some signs: FLIRT, AMERICA, TREE. The middle finger may be bent down from this open hand, and this is a part of the signs for FEEL, SICK, TOUCH. The hand may be in a fist with the index finger extended, and this is an aspect of the signs for MONTH, KNIFE, and SOCKS. The index and middle fingers may be extended from the fist and bent over as in HARD, STERN, and SQUIRREL. The hand may be held in a tapered ‘O’ as in TEACH, KISS, and NUMBER. There are many other hand configurations involved in the signs of American Sign Language; this is but a small sample. We have described superficially parameters of ASL: The space which is ordinarily used, the relation of the hands, some places of articulation, some types of movement of the hands, and some of the hand configurations. Perhaps this gives the noninitiated reader a glimpse of the means of production in sign language. It is certainly radically different from the familiar speech apparatus. For hearing speaking people the spoken language is basically a set of modulations of the stream of air which passes through the oral, nasal, and pharyngeal cavities. For deaf signing people, language is basically a set of modulations of the hands and fingers in relation to the top half of the body and the space in front of it. There are differences of very great interest which we can only touch on here. For one thing, each person has but one vocal apparatus. In contrast, we have two hands. We have already mentioned that some signs are made with one hand only. 
This allows for a possibility in sign language which does not physically exist in spoken language: One can theoretically make two different signs at the same time, and therefore the restriction on a strict linear ordering of elements of the message produced may be lifted in sign language. Thus the possibility exists of creating two different messages at the same time in sign language. The extent to which this is (or can be) used is something we have begun to investigate. Our vocal apparatus produces messages which have been analyzed at many different levels. One level is that of the phone. Consonantal and vowel phones can be combined and recombined into different syllables or morphemes: [pæt], [tæp] and [æpt], for example, are different combinations of the three phones [æ], [p], and [t] into three distinct syllables. Some words consist of one syllable, and that may involve rearrangements of sounds as in pat and tap and apt. Some words consist of more than one syllable. We find that there does not seem to be an analogue to this level in signing. Signs are not sequential arrangements of elements. Signs are better understood, it seems, as simultaneous combinations and recombinations of various hand configurations, types of movements, and places of articulation. If we hold one hand in a tapered ‘O’, touch
the cheek near the mouth, move away in a small arc and touch the upper cheek, it is the sign for HOME. If we keep the same hand configuration and touch on one side of the nose, move away slightly and touch on the other side of the nose, it is FLOWER. If we keep the motion and place of articulation of the sign for HOME constant, and change the hand configuration so that the hand is closed and the thumb and little finger are extended, it is the sign for YESTERDAY. If we return to the tapered ‘O’ and the cheek, and make the motion in a small circle, it is one sign for PEACH. So that signs are made by various combinations and recombinations of the basic parameters, not sequentially combined as with sounds, but simultaneously combined. There seems to be no direct analogue to the syllable. We have superficially described the means of production of sign language, at least in relation to the hands. It does seem very different from speech. What are some of the differences, and what are some of the consequences of the differences? Here we plan to make one rather straightforward kind of comparison. Signs involve movement of the hands in space. Perhaps this means that there is a difference in the rate at which signs can be produced. We shall attempt to compare the rate of articulation for sign language with the rate for speech.
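The idea that signs are simultaneous bundles of parameters, rather than sequences of segments, can be illustrated with a small data structure. The following Python sketch is not part of the study. It represents each of the four signs just described as a combination of hand configuration, place of articulation, and movement, and reports which parameter separates HOME from each of the others. The names `Sign` and `differing_parameters` and the coarse parameter labels are illustrative assumptions, not a notation used in this paper; in particular, giving FLOWER the same movement label as HOME is a simplification of the description above.

```python
from typing import List, NamedTuple


class Sign(NamedTuple):
    """One sign as a simultaneous bundle of three parameters (assumed labels)."""
    hand: str       # hand configuration
    place: str      # place of articulation
    movement: str   # type of movement


# Coarse, hypothetical encodings of the four signs described in the text.
SIGNS = {
    "HOME": Sign("tapered-O", "cheek", "touch-move-touch"),
    "FLOWER": Sign("tapered-O", "nose", "touch-move-touch"),
    "YESTERDAY": Sign("thumb-and-little-finger", "cheek", "touch-move-touch"),
    "PEACH": Sign("tapered-O", "cheek", "small-circle"),
}


def differing_parameters(a: str, b: str) -> List[str]:
    """Return the names of the parameters on which signs a and b differ."""
    sign_a, sign_b = SIGNS[a], SIGNS[b]
    return [field for field in Sign._fields
            if getattr(sign_a, field) != getattr(sign_b, field)]


for other in ("FLOWER", "YESTERDAY", "PEACH"):
    print("HOME vs", other, "->", differing_parameters("HOME", other))
```

Under these assumed labels each pair differs in exactly one parameter, which is the sense in which the signs are recombinations of the same basic elements.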
2. The rate of speech
What is known of the rate at which speech is produced? Goldman-Eisler reports a series of studies which are relevant to this question in her book Psycholinguistics (1968). She studied spontaneous speech under various conditions and in different situations. We tend to think of speech as an even flow, a stream of sound. But actual measurement of the amount of time spent in speaking and pausing shows this to be inaccurate. Goldman-Eisler found that spoken language is really very fragmented and that the ‘flow of sound’ is frequently interrupted by hesitations or pauses. In response to a request to describe picture stories, she found that most of her subjects paused between 40% and 50% of their total speaking time. Therefore, when we want to investigate the rate at which language is produced, it is important to separate out first of all the amount of time spent in pausing. In the studies reported above, the rate of articulation was measured in terms of the number of syllables per minute of the time spent in vocal activity (pauses subtracted out.) The studies found that there were significant individual differences in articulation, but that the rate of articulation within individuals was remarkably constant, even in very different types of situations. Goldman-Eisler suggests that what is experienced as a variation in the speed of talking within individuals turns out on careful analysis to be a variation in the amount of pausing. What is experienced as an increase
in the speed of talking may be largely due to closing the gaps. The important point here is that rate of articulation seems to be a personality constant of remarkable invariance. If we want to investigate the rate of articulation in the two languages then, we would need to compare signs and words as the units of measurement since there is no analogue to the syllable in sign. We need to try to eliminate pauses from both sign and speech. Since there are significant individual differences in rates of articulation for speech but the rate within one individual remains remarkably constant, we felt that the ideal subjects for this study would be people who are highly practiced and fluent in both languages. We decided to compare rates of articulation within ‘bilingual’ individuals.
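The distinction between overall speaking rate and rate of articulation comes down to simple arithmetic, which may be sketched as follows. This Python fragment is an illustration only; the function names and the numbers in the example are hypothetical and are not taken from Goldman-Eisler's data.

```python
def articulation_rate(units: int, total_seconds: float, pause_seconds: float) -> float:
    """Units (syllables, words, or signs) per second of actual production,
    with pause time subtracted out."""
    active_seconds = total_seconds - pause_seconds
    if active_seconds <= 0:
        raise ValueError("pause time must be shorter than total time")
    return units / active_seconds


def overall_rate(units: int, total_seconds: float) -> float:
    """Units per second of total time, pauses included."""
    return units / total_seconds


# Hypothetical speaker in two situations: same articulation rate,
# different amounts of pausing, hence different apparent speed.
relaxed = (300, 120.0, 60.0)   # units, total seconds, pause seconds
hurried = (300, 100.0, 40.0)

for label, (units, total, pause) in (("relaxed", relaxed), ("hurried", hurried)):
    print(label,
          "overall:", round(overall_rate(units, total), 2),
          "articulation:", round(articulation_rate(units, total, pause), 2))
```

In this made-up case the apparent increase in speed comes entirely from closing the gaps, while the articulation rate stays the same.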
3. A study in the comparison of rate
3.1 Subjects for the study
Fortunately, there is a special group of people we were able to locate to study the comparison between the rate of articulation of sign and speech. These are hearing people who are the sons and daughters of deaf parents. Two deaf parents often have hearing children, even though some of the children may be deaf. If the parents’ primary mode of communication with one another is in sign language, and that is the usual case, they may also use sign language primarily with their hearing children as well as their deaf children. It then may happen that the hearing child learns sign language as a native language from his parents. Occasionally, when the families are isolated from other people for some reason - illness in the family, geographic isolation - the hearing child may not learn spoken language during his early years. Otherwise the hearing child may learn spoken language from other older hearing children in the family, from other relatives, or from neighbors and children on the street. It is similar to the case of an immigrant family where the parents have not learned the language of the community around them. Deafness as a handicap means that the parents cannot use the telephone, and may mean that the parents have difficulty communicating with hearing people. The hearing child in such families may play a special role from a very early age in terms of interpreting for his parents into sign language what hearing people say to them and interpreting into spoken language what his parents sign. (There is a very sensitive portrayal of this in a book by Joanne Greenberg called In this sign.) The hearing child of deaf parents thus becomes bilingual and a fluent bilingual interpreter to an extent that is unusual. Of course, this is not always true, but for our study we looked for and found such subjects. We found three young hearing adults who had learned sign as a native language from deaf parents, and who had signed all their lives. All three were presently using
sign language as a part of their work, their studies, and their living situations. They were therefore extremely fluent in sign and in speech and highly practiced and accomplished in both modes. Using these people as subjects we can compare the rate of sign language and spoken language within individuals.
3.2 Materials
We asked each one of them individually to tell us some story from their childhood or some story they knew well. We asked for three different renditions of each story, in different orders, and not specifying at the start that the story would be repeated. We videotaped the entire proceedings, so that for each individual we had a videotaped version of:
1. A story told in American Sign Language
2. The same story told in spoken English
3. The same story simultaneously signed and spoken.
We made very careful transcriptions of these stories preserving exclamations, pause fillers, and expressions.
3.3 Transcription and scoring
To give a written form to the sign language versions is not an easy task. The language has no readily written form. There are some attempts at devising systems of notation for sign language; most worthy of notice is that developed by William Stokoe in a pamphlet called Sign language structure (1960) and used in his Dictionary of American Sign Language (1965). For the purposes of our research in the acquisition and structure of sign language, we are using transcriptions which consist of giving an English gloss to signs, adding in information on the number of times a sign is made, the direction of movement of signs, specification when something is finger-spelled, specification when something is an expression, pantomime, or indexical sign, specification of eye contact during an utterance, etc. For the purposes of this study, this is probably adequate, since we want to compare the rate of producing words with the rate of producing signs. We do not, however, indicate how signs are formed. We then had four transcriptions for each subject: One of the story in sign language alone; one of the story when it was spoken only; one of the signed part and one of the spoken part of the simultaneously signed-and-spoken versions of the story. Counting words or signs was not a major problem. We counted spoken contractions as one word (e.g., don’t and it’s were each counted as one word). And similarly we counted an item as a single sign even if it had other information incorporated into it. As an example, there is a basic sign for LOOK. That sign can be varied to mean ‘YOU-LOOK-AT-ME’ by changing the orientation of the hand. It can be varied to mean ‘EVERYBODY-IS-LOOKING-AT-ME’ by changing the orientation of the hand, making it with two hands instead of one, and using a rather sharp motion. It can be varied to mean ‘THEY-LOOK-AT-EACH-OTHER’ by using two hands and changing their orientation with respect to one another. It can be varied to mean ‘GAZE-AT-ONE-ANOTHER-LIKE-LOVERS’ by bending the fingers somewhat and imposing a slow circular motion on the preceding variation. Each of these we counted as a single sign. We had one additional problem to solve in counting signs. There is a manual alphabet - available in every boy scout manual. This is a representation for each letter of the alphabet on the hands. Some words in American Sign Language are frequently fingerspelled. These tend to be short words like o-f-f, b-y, d-o, s-o, t-o-o, i-s, w-a-s, etc. In the signed stories rendered by hearing bilinguals there was a small proportion of fingerspelled words which ranged from 2% to 12%. We decided to count the fingerspelled words as signs since they were short (an average of 3 letters) and often highly practiced condensed forms.3 Thus we obtained a count of the number of words and the number of signs per story. We timed each story with a stopwatch to get the total time from the start of the first utterance to the end of the final utterance. Let us consider just the information on total number of seconds per story, as presented in Table 1.
Table 1. Total time in seconds for stories in sign and speech

              Signed story               Spoken story
              Total number of seconds    Total number of seconds
Subject A:    154.5                      144.0
Subject B:    66.1                       87.0
Subject C:    38.8                       51.3
Our next problem was to try to get some measurement of the time spent in pausing. In order to get some measure of this, a scorer watched and listened to the videotapes and recorded the durations of all measurable pauses. This was done using a telegraph key signal attached to an Oscillomink equipped with a 100 Hertz signal. The pauses were so measured three times for each condition. The totals we shall present represent the totals for median measured time - this was considered the most accurate way to take the scorer’s possible error into account. Sign in all cases was measured at slow motion at the ratio of 3:2, and the results were then adjusted to normal speed. In addition, we measured
3. We counted frames on the videotape (60 frames = 1 second) for 30 fingerspelled words and found a mean rate of 2.7 words per second.
the duration of pantomimes, exclamations, and finger-spelling in the sign language stories only. There was one problem with the measurement of pauses which should be stated here, since it may affect our results somewhat. It is easy enough to distinguish between vocalization and silence, and we can judge easily when pauses are interrupted with non-speech sounds like ‘um’. It is less easy to distinguish signing from non-signing. We have had to think about where signs begin and end, and about transitions between signs. There seems to be an analogue to making a non-speech sound to fill a hesitation pause in sign, but it has a different character in this modality. We have seen deaf people hold one hand in a neutral position and wiggle the fingers; perhaps this is one equivalent to ‘umm’. In these stories the signers kept their hands in the position of the sign they had just finished or were about to make, and rested them in that position while pausing. This means that determining a pause may be quite a bit more difficult in sign than in speech. This suggests that our assignment of pausing time for sign may be somewhat underestimated, and this should be taken into account when evaluating the results.
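The pause-scoring procedure described above (three scoring passes per condition, the median total retained, and sign measured in slow motion and then adjusted) may be sketched as follows. This is a minimal illustration, not the scoring apparatus itself. It assumes that the 3:2 ratio means three seconds of slow-motion playback for every two seconds of real time, so that measured durations are divided by 1.5; the function name and the example durations are hypothetical.

```python
from statistics import median
from typing import Sequence

# Assumed reading of "3:2": playback time divided by real time.
SLOW_MOTION_RATIO = 3 / 2


def pause_total(pass_totals: Sequence[float], slow_motion: bool = False) -> float:
    """Median of the pause totals from repeated scoring passes, in seconds.

    If the passes were scored from slow-motion playback, the median is
    rescaled to normal speed.
    """
    total = median(pass_totals)
    if slow_motion:
        total = total / SLOW_MOTION_RATIO
    return total


# Hypothetical totals (in seconds) from three scoring passes of one story.
speech_passes = [10.0, 12.0, 11.0]   # scored at normal speed
sign_passes = [45.0, 48.0, 46.5]     # scored at slow motion

print("speech pause total:", pause_total(speech_passes))
print("sign pause total:", pause_total(sign_passes, slow_motion=True))
```

Taking the median of three passes discards one unusually high or low pass, which is one way of limiting the effect of scorer error.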
3.4 Comparison of rate of sign and speech
Now that we have some sort of measurement for the pauses that occurred in both conditions, we can begin to consider the rate of articulation for our three subjects, excluding pauses. Table 2 shows the rate of articulation for signed and spoken stories.
Table 2. Rate of articulation in producing sign or speech (excluding pauses)

              Signed story               Spoken story
              Average Signs Per Second   Average Words Per Second
Subject A:    2.3                        4.0
Subject B:    2.3                        4.9
Subject C:    2.5                        5.2
From this small amount of data, it looks as though there may be a striking difference between the rate of articulation for the two modalities. When pauses are excluded from the timing of a narrative, Subject A signed at a rate of 2.3 signs per second and spoke at a rate of 4.0 words per second. Subject B signed 2.3 signs per second and spoke 4.9 words per second. Subject C signed 2.5 signs per second and spoke 5.2 words per second. The rate of articulation for words is nearly double the rate for signs for each of our subjects. In addition the differences across modalities for each subject were greater than
the differences between subjects. The difference between the rates of articulation for the two modes seems to be striking. One should remember that these were all individuals who had learned sign language as a native language from deaf parents, who were extremely fluent in sign language, and had been using it all of their lives. These subjects use sign language in daily communications and are judged by others as fluent, highly practiced signers. The subjects were asked to tell the same story when signing and when speaking. But of course there are differences between the two versions, as there would be in repeating any story. There is a way of examining the relationship between sign and speech which conveys precisely the same propositional content, and we shall turn to this next.
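The two comparisons drawn from Table 2 (words per second relative to signs per second, and differences across modalities relative to differences between subjects) can be checked with a few lines of arithmetic. The Python sketch below uses only the rates reported in Table 2; the variable names are illustrative.

```python
# Rates of articulation from Table 2: (signs per second, words per second).
RATES = {"A": (2.3, 4.0), "B": (2.3, 4.9), "C": (2.5, 5.2)}

# Words per second divided by signs per second, for each subject.
ratios = {s: round(words / signs, 2) for s, (signs, words) in RATES.items()}

# Difference across modalities within each subject.
within_subject = {s: round(words - signs, 1) for s, (signs, words) in RATES.items()}

# Spread between subjects within each modality.
sign_rates = [signs for signs, _ in RATES.values()]
word_rates = [words for _, words in RATES.values()]
sign_spread = round(max(sign_rates) - min(sign_rates), 1)
word_spread = round(max(word_rates) - min(word_rates), 1)

print("word/sign ratio per subject:", ratios)
print("difference across modalities:", within_subject)
print("spread between subjects: sign", sign_spread, "speech", word_spread)
```

The ratios come out between roughly 1.7 and 2.1, and the smallest difference across modalities (1.7 units per second) exceeds the largest spread between subjects (1.2 words per second).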
3.5 Rate of simultaneous signing-and-speaking
When people are really bilingual in two languages like French and English we can never ask them to produce both languages at the same time. But when one is bilingual in a spoken language and in a language that uses the hands in gestures, the modalities are different enough so that there is this unusual possibility. Both languages can be produced simultaneously. The subjects we used for our study were very accomplished at this rather difficult feat, for it was common experience for them to interpret for a mixed group of hearing and deaf people. (It is not an easy task: We who have been learning sign as a language find it very difficult to sign and speak at the same time, but our subjects were remarkably adept at this.) Our three subjects, then, relayed their stories in simultaneous sign-and-speech. This gives us an interesting basis for comparison, since the propositional content of the two stories is perfectly matched. Again we measured total times (Table 3), subtracted out times spent in pauses, counted words and signs, and can examine the results. In addition, although not directly related to the main point of this paper, we may mention some of the findings which stem from the requirement to produce two different languages at the same time.

Table 3. Total time in seconds for simultaneous stories

Subject A:    152.4
Subject B:    65.3
Subject C:    69.0
We measured the pause times separately for the spoken and signed aspects of the stories. We can consider the proportion of time spent in pausing and signing or speech. There was an increase in percent of time pausing in speech over sign, (but recall
that there may be some problem with measuring length of pauses in sign.) Notice that there was an increase in percent of time spent in pausing for each subject when signing-and-speaking simultaneously as compared with producing either modality separately (Table 4). Perhaps this reflects the greater cognitive load involved in producing two languages simultaneously.

Table 4. Percent of time pausing in sign and speech

              Stories produced separately    Stories produced simultaneously
              Sign        Speech             Sign        Speech
Subject A:    20.9%       29.6%              28.1%       33.6%
Subject B:    10.6%       23.6%              25.0%       26.4%
Subject C:    12.4%       30.2%              16.8%       34.9%
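The pattern in Table 4 can be made explicit by computing, for each subject and modality, how far the percent of time pausing rose from the separate to the simultaneous condition. The following Python sketch uses only the percentages reported in Table 4; the function and variable names are illustrative.

```python
# Percent of time pausing from Table 4: (separate, simultaneous).
PAUSE_PCT = {
    "A": {"sign": (20.9, 28.1), "speech": (29.6, 33.6)},
    "B": {"sign": (10.6, 25.0), "speech": (23.6, 26.4)},
    "C": {"sign": (12.4, 16.8), "speech": (30.2, 34.9)},
}


def pause_increase(table):
    """Percentage-point increase in pausing, simultaneous minus separate."""
    return {
        subject: {mode: round(simultaneous - separate, 1)
                  for mode, (separate, simultaneous) in modes.items()}
        for subject, modes in table.items()
    }


INCREASE = pause_increase(PAUSE_PCT)
for subject, modes in INCREASE.items():
    print("Subject", subject, modes)
```

Every entry is positive, which is the increase noted in the text; the computation says nothing about whether the difference reflects cognitive load or the difficulty of scoring pauses in sign.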
There were other indications of the effect of producing two languages at the same time which we can mention here. We notice in some instances that the two languages influence one another and that one finds ‘errors’ in one language that are related to the other. One subject said, ‘I was going to cook cake’. In speaking she would not ordinarily use the word ‘cook’ in this context, but it is a translation of an appropriate sign which she used in signing the story. In another part of the story there was an error from spoken English which appeared in the signed version of the story. There is a sign for ‘THEN’ and an entirely distinct sign for ‘THAN’. The subject said ‘then’ but signed ‘THAN’. In addition, we found a number of anticipatory errors, equivalent to spoonerisms, in the simultaneous signed and spoken stories. Now we are ready to compare the rate of articulation for sign-and-speech under the special condition that both are produced simultaneously in telling a story. One might anticipate some constraints from the additional cognitive load of producing two languages in different modes simultaneously.

Table 5. Rate of articulation in simultaneous signed and spoken stories

              Signs per second    Words per second
Subject A:    2.2                 3.4
Subject B:    2.5                 4.4
Subject C:    2.5                 4.1

3.6 Discussion
We see that even when expressing precisely the same information and content our subjects are filling the temporal intervals with different numbers of basic units. The rate of articulation for words is at least one and a half times the rate of articulation for signs. So the differences are still consistent. Fewer signs are produced per unit of time than words. We can compare these results to those obtained when the subjects were producing either sign without speech or speech without sign. We find that the rate of signing is the same when signing only (2.3, 2.3, 2.5) or signing-and-speaking (2.2, 2.5, 2.5). The rate of speaking when signing and speaking at the same time is 3.4, 4.4, 4.1. This is somewhat less than the rate when speaking alone (4.0, 4.9, 5.2). It seems that under the special constraint of producing a narrative in sign and speech simultaneously, the rate of speech is somewhat slower than when speaking alone, although the rate of signing remains constant. The basic finding of our study so far is that signs seem to take longer to produce than words. In the introduction to this paper we described some aspects of signing that might lead one to anticipate this result. Signs are produced by movement and articulation of the hands in space in relation to the body. They may range over a fairly wide area (top of the head to the waist); some complex and time-consuming movements may be involved (circular motion of the hands with respect to one another; a brushing motion made more than once; a clasping action, etc.) The impression one has of an articulate signer - and our three subjects were highly articulate in sign - is that the hands are flying at a rapid pace. Still, it seems that the modality has an effect that makes a difference of some interest: The rate of articulation for speaking is considerably higher than the rate of articulation for signing, even when both languages are produced at the same time.
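The comparisons made in this discussion can be reproduced from Tables 2 and 5. The Python sketch below is offered only as a check on the arithmetic; it uses the published rates, and the variable names are illustrative.

```python
# (signs per second, words per second) for each subject.
ALONE = {"A": (2.3, 4.0), "B": (2.3, 4.9), "C": (2.5, 5.2)}          # Table 2
SIMULTANEOUS = {"A": (2.2, 3.4), "B": (2.5, 4.4), "C": (2.5, 4.1)}   # Table 5

# Ratio of word rate to sign rate when both are produced together.
simultaneous_ratio = {
    s: round(words / signs, 2) for s, (signs, words) in SIMULTANEOUS.items()
}

# Change in each rate from the separate to the simultaneous condition.
sign_change = {s: round(SIMULTANEOUS[s][0] - ALONE[s][0], 1) for s in ALONE}
speech_change = {s: round(SIMULTANEOUS[s][1] - ALONE[s][1], 1) for s in ALONE}

print("word/sign ratio, simultaneous:", simultaneous_ratio)
print("change in signs per second:", sign_change)
print("change in words per second:", speech_change)
```

The ratios are all above 1.5, the speech rates all fall (by 0.5 to 1.1 words per second), and the sign rates change by no more than 0.2 signs per second in either direction.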
4. Temporal processes underlying sentence production
Until now we have been discussing physiological aspects of language production in different modalities. Although we have not demonstrated this by our study, we would guess that by the nature of their production the signs of the American Sign Language of the deaf could not be produced at the same rate as spoken words of English. We might imagine, therefore, that signed sentences and their underlying units (propositions) would be stretched out in time periods longer than similar spoken propositions. But our use of language is not determined solely by physiological factors of the rate of articulation of words or signs. Goldman-Eisler points out that a considerable portion of time during speech is spent in hesitation pauses. Our three subjects paused from about 10% to about 35% of the time spent in telling or signing a familiar story. The most important limiting factor, one might expect, involves cognitive aspects of planning sentences and not the physical ability to perform the articulatory movements. We
can attempt to search for some clue to the temporal processes involved in creating sentences or propositions. Suppose we made an operational definition of a proposition, as something that can be considered equivalent to a simple underlying sentence. Thus an actual produced sentence may contain one or several propositions. We marked off propositions in our stories. We counted as underlying propositions all main verbs or predicates which had overt (or covert) subjects. We did not count semi-auxiliaries like try, continue, stop unless they occurred by themselves as in I tried it. We did not count repetitions of verbs (as in I ran and ran) or false starts, even if they included verbs. Consider first the stories that were simultaneously spoken-and-signed. The propositions are marked off at the same junctures for sign and for speech. Thus we have an equal number of such propositions and the propositions must be considered as linguistically parallel. If we then exclude pauses, we find that there is a range from 1.2 to 1.6 mean seconds per proposition. But we may have altered the natural flow of narrative by the requirement to produce two languages (in different modes) at the same time. It may be more revealing to examine the stories that were in spoken English only and the stories that were in American Sign Language only. Since these are freely told, the only requirement is that the same situation be described, but not necessarily in precisely the same way. We can again segment these stories into propositions, using the criteria previously established. Then, excluding the pauses, we can calculate the mean number of seconds per proposition for each subject and for each condition. These are presented in Table 6.

Table 6. Mean seconds per proposition

              Spoken            Signed              Simultaneous
              (without sign)    (without speech)    Sign      Speech
Subject A:    1.6               2.0                 1.6       1.6
Subject B:    1.2               1.4                 1.4       1.4
Subject C:    1.0               1.0                 1.2       1.4
We see that the rate of mean seconds per proposition varies slightly and consistently from individual to individual but varies less across language modalities. The range is from 1.0 seconds per proposition to 2.0 seconds per proposition in all conditions. We can, of course, base little on such meager data. But, it is not incompatible with the idea that there may be a common underlying temporal process governing the rate of ‘propositionalizing.’ The point of interest here is that while we found striking and consistent differences in the rate of articulation for sign and speech, we find similarities in the length of time per underlying proposition for the two modes.
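The measure in Table 6 is the time spent in actual production divided by the number of propositions. Since the proposition counts for the individual stories are not reported here, the sketch below illustrates the calculation with a hypothetical story, and then checks the stated range directly against the Table 6 values. The function name, the example figures, and the dictionary keys are assumptions made for illustration.

```python
def seconds_per_proposition(total_seconds: float, pause_fraction: float,
                            propositions: int) -> float:
    """Mean seconds of production time (pauses excluded) per proposition."""
    if propositions <= 0:
        raise ValueError("at least one proposition is required")
    return total_seconds * (1.0 - pause_fraction) / propositions


# Hypothetical story: 100 seconds long, 30% pausing, 50 propositions.
example = seconds_per_proposition(100.0, 0.30, 50)
print("hypothetical story:", round(example, 2), "seconds per proposition")

# Mean seconds per proposition as given in Table 6.
TABLE_6 = {
    "A": {"spoken": 1.6, "signed": 2.0, "simul_sign": 1.6, "simul_speech": 1.6},
    "B": {"spoken": 1.2, "signed": 1.4, "simul_sign": 1.4, "simul_speech": 1.4},
    "C": {"spoken": 1.0, "signed": 1.0, "simul_sign": 1.2, "simul_speech": 1.4},
}

all_values = [v for row in TABLE_6.values() for v in row.values()]
spread_by_subject = {
    s: round(max(row.values()) - min(row.values()), 1) for s, row in TABLE_6.items()
}

print("overall range:", min(all_values), "to", max(all_values))
print("spread across conditions per subject:", spread_by_subject)
```

No subject's values differ across conditions by more than 0.4 seconds, which is small next to the near doubling of articulation rate between sign and speech.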
McNeill, in a recent (1972) paper, hypothesizes that the basic encoding process in speech is one that produces elementary sentences. Pauses give time for the process of encoding underlying elementary sentences to catch up with the utterance of syllables, words or surface phrases. The function of such pauses, he argues, is to permit speech to proceed smoothly at the underlying level, even at the cost of interruptions at the surface level. McNeill claims that there is a constant amount of time taken to construct underlying elementary sentences, and that this is on the average 1.0 to 2.0 seconds regardless of the age of the speaker. The arguments he makes, while suggestive only, are rather intriguing. He links this rate for producing underlying sentences with the process of shift of attention. Each new elementary sentence encodes further information into some sort of semantic form, and perhaps thus requires a shift of attention. And he claims that such attentional shifts ordinarily occur every one or two seconds. The point to be made here is that the mean number of seconds per proposition we have found in signing alone, speaking alone, and signing and speaking simultaneously, are not consistently different from one another and are within the range posited by McNeill.
5. Appendix: Some properties of ASL (Susan Fischer)
So far we have done little to describe the language of signs. We have only given substance to the impression we had that signs take much longer to produce than words. And we have suggested that even so, the temporal processes underlying elementary sentoids or propositions may be similar in the two modalities, at least for bilingual individuals. What are some of the basic characteristics of the form of signs and sign language which may be related to these observations? Another way of posing this question is: Given the limitations imposed on sign by its modality of communication, namely that a sign takes longer to utter than a word, what are the mechanisms by which sign compensates for this limitation - what are the unique properties of sign on which the language capitalizes? Time is the crucial factor. How does sign save time and still communicate unambiguously? The answers we have found so far (and they are by no means complete) fall into three categories: Doing without, incorporation, and bodily or facial shifts. In discussing these mechanisms, while we shall refer to and use some of the same primary data which served as the basis for the previous sections of this paper, we shall also draw on other sources as well.
5.1 Doing without
Indian grammarians, particularly in the Sanskrit tradition (J.F. Staal, personal communication) had a proverb: ‘It is better to save half a syllable than to have a son’.
This proverb explains why, for example, Panini’s grammar, while extremely short in Western terms, is one of the most complete grammars ever written. It also gives us some insights into the sign language of the deaf. Many of our insights into the richness and diversity of sign have come from our deaf informant, Bonnie Gough. Often we ask her to translate some sentence, which we have written down in English, into her own language. She will almost invariably point to a large number of words in the sentence and tell us, ‘Get rid of that, get rid of this, get rid of that,’ etc. This happens so frequently that we have termed this phenomenon the ‘get-rid-of-it’ syndrome. What is it that our informant so earnestly wishes us to do without? Grammatical morphemes. One of the sentences which we gave to several informants to translate into ASL was (1): (1) It is against the law to drive on the left side of the road. This sentence has fourteen words. A speaker of English might say it in fewer words, for example by contracting it and is into it’s, and perhaps even leave out the last three words, leaving us with (2) which has ten words: (2) It’s against the law to drive on the left side. Of these ten words, it’s, to, on, and the (twice) are what have been termed function words, as opposed to content words, such as nouns, adjectives and verbs. Sign language, like Russian, does not use the copula, nor does it use articles. The translation into sign of sentences (1) or (2) is (3). (3) ILLEGAL DRIVE LEFT-SIDE. In just three signs, the information is preserved, but in a kind of condensed form, and all the elements which are not really essential to convey the message have been eliminated. To borrow an old term from information theory, sign lacks a great deal of
redundancy. Anaphora. In English,
a sentence like (4) is at least awkward, and to some, is unacceptable: (4) John likes Mary, so John goes and visits Mary a lot, and John often takes Mary out to dinner, though sometimes John cooks dinner for Mary. Certainly a sentence like (5) is far more acceptable: (5) John likes Mary, so he goes and visits her a lot, and he often takes her out to dinner, though sometimes he cooks dinner for her. Every language of the world has ways of reducing a sentence like (4) in some fashion. In English we have anaphoric pronouns,4 but this is not the only mechanism available to a language. If an item has been mentioned before, it can be deleted in some languages, including Papago, Japanese, and in our case, ASL. A fairly good translation of (5) into sign might be (6): (6) JOHN LIKE MARY, WELL, GO VISIT MUCH, OFTEN TAKE OUT EAT, BUT SOMETIMES COOK FOR.
4. We are not making a theoretical claim here that (5) is necessarily derived transformationally from (4), but there is an obvious relation between the two sentences.
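The word and sign counts for the numbered examples can be verified mechanically. The following Python sketch is an illustration only: it applies the counting conventions of Section 3.3 (a contraction counts as one word, and a hyphenated gloss counts as one sign) by splitting on whitespace and discarding punctuation. The function name `count_units` is hypothetical.

```python
def count_units(text: str) -> int:
    """Count whitespace-separated words or glosses.

    Contractions (it's) and hyphenated glosses (LEFT-SIDE) each count
    as a single unit; surrounding punctuation is ignored.
    """
    tokens = [token.strip(".,;:!?") for token in text.split()]
    return sum(1 for token in tokens if token)


EXAMPLES = {
    "(1)": "It is against the law to drive on the left side of the road.",
    "(2)": "It's against the law to drive on the left side.",
    "(3)": "ILLEGAL DRIVE LEFT-SIDE.",
    "(5)": ("John likes Mary, so he goes and visits her a lot, and he often "
            "takes her out to dinner, though sometimes he cooks dinner for her."),
    "(6)": ("JOHN LIKE MARY, WELL, GO VISIT MUCH, OFTEN TAKE OUT EAT, "
            "BUT SOMETIMES COOK FOR."),
}

COUNTS = {label: count_units(text) for label, text in EXAMPLES.items()}
for label, n in COUNTS.items():
    print(label, n)
```

On this count, sentence (1) has 14 words against 3 signs in (3), and sentence (5) has 26 words against 15 signs in (6).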
There are, of course, other ways of saying (5) in sign, some of which would be more idiomatic, and some of which use the kinds of mechanisms we shall discuss below, but for now, let us merely consider the differences between (5) and (6). Largely because anaphoric relations are expressed by pronominalization in English and deletion in sign, sentence (5) has 26 words, while sentence (6) has only 15 signs. Thus, the means by which sign expresses anaphora serves as another mechanism for shortening a sentence in sign. Information is preserved by location in space (see 5.2). General and specific verbs.5 English often spells out concepts, particularly verbal ones, by means of a periphrastic construction instead of denser lexical items. A speaker of English will thus often say go into instead of enter, get on, instead of mount, take a bath, instead of bathe, make knots in the rope instead of knot the rope. The verbs used to spread out the constructions, such as go, get, take, or make, are very general - their meanings depend largely on what follows them (e.g., the American Heritage Dictionary of the English Language lists 30 definitions for get, depending on the object, and 29, depending on the preposition). The denser constructions all depend on much more specific verbs. ASL tends to choose the denser construction - this is another way of making things shorter. Below are some examples from the stories elicited previously which illustrate the difference. To capsulize our notation system for sign, we give an English word gloss to each sign wherever possible. When words are fingerspelled, there is a hyphen between letters, and indexical items are in parentheses. When more than one English word is required to translate a single sign, the words are connected by hyphens. Repetitions are marked by t’s. The left-hand column here is the spoken version, the right-hand column is our rather inadequate English gloss6 for the signed sentences.
5. We are grateful to Adele Abrahamson for bringing this idea to our attention.
6. We have mentioned the notation for writing sign down, developed by Stokoe in his Sign language structure. It notes for any given sign the hand configuration involved, the number of hands involved, the point of articulation, the direction of movement, and to a certain extent, the manner of this movement. For us, our glosses serve the same purpose - we have developed a consistent code such that there is a one-to-one correspondence between our gloss and an entry in Stokoe notation - and are a convenient mnemonic. The problem is that hand, movement, and location - and indeed the signs made up from these - are not the only parameters involved in the language, as we shall show below. It is partly for this reason that the gloss is inadequate.
        English                                             Sign
(7)     and I went back into the kitchen                    RETURN TO KITCHEN
(8)     So they came in                                     ENTER
(9)     I turned on the gas.                                ME TURN-ON G-A-S
(10)    I pulled open the drawer.                           I PULL-OUT-DRAWER.
(11)    And I struck the match                              AND STRIKE-MATCH
(12)    Until I finally decided to go through the gate.     UNTIL DECIDE GO-THROUGH GATE.
(13)    OK, so they got off the streetcar                   AND ARRIVE GET-OFF TRAIN

Each pair of sentences is from the same individual, yet the English uses periphrasis and the sign uses the denser lexical item, even though some of these are from the simultaneous versions (9, 10, 11, 12, 13), which could conceivably constrain the sign output to match the English, but does not appear to. There are many other means which sign uses to pack information into one lexical item, but this is one which again serves to economize on time.
5.2 Incorporation
In a sense, the use of specific verbs instead of periphrasis with general verbs is a kind of incorporation. Sign utilizes, however, much more interesting types of incorporation, some of them radically different from any type known to spoken language.

Incorporation of location. A large number of verbal (and some nominal) signs can vary, to a lesser or greater extent, the direction or path of movement, sometimes accompanied by a change in the orientation of the hands. These changes in movement reflect the spatial layout, either actual or established, of persons or objects in relation to the speaker. The ability to refer to location by changing the direction of movement of a sign largely makes up for the fact that sign does not generally use anaphoric pronouns. This is a mechanism which is extremely pervasive in sign, and which has, as far as we know, no counterpart in any spoken language. One of the sentences used by one of our subjects in sign was the following:

(14) BOTH TWO-LOOK-AT-ME, THEN TWO-LOOK-AT-EACH-OTHER.

We shall return in the next section to the question of number-incorporation. Here we shall be concerned with variations on the sign for look at with respect to incorporation of location. There is a sign for ‘two people look at each other’. If we wanted to say ‘those two people look at each other,’ the hands would start from the respective locations already established. Verbs in sign can variously incorporate, depending on lexical restrictions in the verb, the location not only of subject and direct object, but also source, goal, and dative. Thus in translating a sentence like (15) into sign

(15) I will bring something down off that shelf for you
A comparison of sign language and spoken language
the verb BRING will incorporate the location of the source (a high shelf, hence the sign moves downward) and the dative (you). An interesting example of a rather unusual incorporation of location is the verb INVITE in sign. If an open hand with the palm up (fingers closed) moves toward the signer, the sign means ‘I invite someone.’ (The direction of the starting point of the sign can designate the object.) If the sign starts at or near the signer’s body and moves away from her, it means ‘someone invites me.’ (Again, the location of the end-point of the sign can designate the subject.) This change in direction of movement can, in the case of this verb at least, substitute for the lack of a passive in sign.7 On one occasion, our informant signed the following sentence:

(16) ME NOT INVITE [away from signer] HER PARTY.

This sentence does not mean ‘I didn’t invite myself to her party,’ but rather the best translation of (16) would be ‘I was not invited to her party.’ Thus, the ability to incorporate location can not only reduce the length of a signed sentence but also can change the foregrounding much as the passive can in English. The important thing to notice about the possibility in sign of incorporating location of various semantic relations is that it can shorten the signed sentence. That this is a true grammatical mechanism in sign is shown by the fact that, while there is often an unmarked or neutral verbal sign, most often the version with I as subject, when this form is used with a verb that lexically can change and I is supposed to be the object, the result (even if the personal pronouns are specified) is unacceptable to a native signer. Thus a sentence like (17) is ungrammatical.

(17) *YOU LOOK-AT [orientation away from signer] HE.8

One could think of this kind of process as incorporating case-marking pronoun copies into the verb.
There are a number of languages of the world which do this, so that sign would not seem so different after all from spoken languages. However, no spoken language incorporates case-markings in this way - vocal cords never change direction to show different grammatical relations - so that the mechanism which sign makes use of is unique to the modality.9 Incorporation of number. As in Chinese or Japanese, number, and in particular agreement in number between subject (or object) and verb, are not always specified in sign. Specification of number is, in sign, sometimes obligatory and quite often optional, and in numerous cases is reflected in the verb, often only the verb. When it is the verb which
7. This is not true for every verb. KILL, for example, cannot be used in this way. If the signer wishes to indicate the passive, she must fingerspell K-I-L-L-E-D.
8. A sentence preceded by an asterisk is an ungrammatical sentence.
9. This mechanism is not always restricted to verbs. The sign BOTH-OF-US can change and become BOTH-OF-YOU or BOTH-OF-THEM.
reflects number, the shape of the verbal sign can change in several possible ways. Let us return to sentence (14) (repeated here)

(14) BOTH TWO-LOOK-AT-ME, THEN TWO-LOOK-AT-EACH-OTHER.

Consider the verbal sign TWO-LOOK-AT-ME. As we mentioned above, this sign is made by moving two ‘V’ hands towards the signer. If only one hand is used, then it is one person, a singular subject. If on the other hand, one uses not two ‘V’ hands but rather ‘four’ hands, i.e., with the four fingers sticking out and the thumb on the palm, then the resultant sign means ‘many people look at me.’10 The verbal sign CLIMB works in a similar way. If it is one person climbing, two fingers of one hand are used. If it is two persons climbing, two hands are used, with two fingers on each hand. And if many people are climbing, four fingers on each hand are used. Thus, the number of hands and the number of fingers often can be varied to show singular, dual, and plural. Again, it should be noticed that while the variations on these signs add information, they do not take up any extra time.

One way of incorporating number in sign is, as we have seen, to vary the hand configuration. Another is to change the manner of movement. One of the examples of a dual in the previous stories is the sign inadequately represented in (18).

(18) TWO-GET-UP-ON-SIT-TOGETHER.

This sign is made by moving two crooked ‘V’ hands with palms down, moving them up, outward, and together simultaneously. The same sign, made such that after coming together, the two hands now move to the right and left respectively, means many people sitting in a row, and the sign can be varied to show many people sitting in a circle as well. Similarly, the sign PARK-A-CAR is made with one hand, TWO-CARS-PARKED-SIDE-BY-SIDE is made with two hands, and ROW-OF-PARKED-CARS is made just like two cars parked side by side, except that the hands then slowly move apart. Again, this does not take up very much time.
Still another way of incorporating plural is by repetition, generally of the verb, though some nouns by themselves can be pluralized in this way, e.g., FRIEND, ENEMY, STAR, STREET-LAMP. If a verb is repeated quickly (see Fischer, in press) and at the same time the hand or hands move around in a semicircle, this indicates that either the subject or object is plural. Coupled with the possibility of incorporation of location, a large variety is possible. Thus, for INVITE, one can differentiate between I-INVITE-MANY-PEOPLE and MANY-PEOPLE-INVITE-ME. For those verbs which do not change,
10. It seems that sign language has a very rich vocabulary of differential signs for certain aspects of the field of vision: Signs which incorporate or refer to objects; signs which incorporate affect, etc. It is an interesting reflection of cultural differentiation in language. The
absence of hearing and reliance on vision is the common component of users of the language and the language reflects this change even at the lexical level. There are a large number of distinct signs and modifications on signs for aspects of looking and seeing.
it is necessary to depend on context for disambiguation of which element is pluralized, but all verbs in sign can be pluralized in this way.

Incorporation of manner. The sign for ‘START’ involves inserting the first joint of the index finger of one hand between the first two fingers of the other hand and making a slight twist of the wrist. In fact, there is a continuum of meaning for this sign that can be associated with such variations as size of movement, tension, and so forth. Thus, for START, the index finger can be inserted at the very tip of the other fingers, and the wrist action may be relatively small, and this will mean ‘just barely starting.’ We find, not uncommonly, that what is expressed by a manner adverbial in speech is expressed by varying other parameters in sign. As another example, a deaf person was told that she was becoming famous for her teaching of sign. She signed something that conveyed the meaning ‘No, you are really famous; I am only becoming known to a couple of people.’ She conveyed this using only two variations on the sign which means ‘FAMOUS’ (aside from the signs for ‘YOU’ and ‘I’). The first time the sign was made enlarged, with exaggerated motion and with her face upward looking out; the second time the sign was miniaturized with very tiny movements, close to the head and with her chin down toward her chest. In the stories we have been discussing there were examples of this sort that account for a few of the differences in number of words as compared with numbers of signs. In one story, a person said, ‘and there was a terrific explosion.’ In the signed version, the sign for ‘EXPLODE’ incorporated the manner adverbial and indicated that it was indeed a ‘terrific’ explosion. In addition, the same subject said, ‘I burst out crying.’ What she signed was ‘ME BAWL.’ ‘BAWL’ is a variation on CRY which is much more intense and sudden.
As another example, a different subject said, ‘She was much bigger than me.’ She signed what we have glossed ‘BI-I-I-GGER THAN ME’ - i.e., there is a gross exaggeration of the sign ‘LARGE’ which incorporates the quantifier. The incorporation of manner into a sign is not necessarily unique to sign. After all, this is exactly one function of intonation in spoken language. In sign, however, this is sometimes the only way of indicating differences such as those on the continuum between, say, ‘big’ and ‘gargantuan.’ It is used not only for affect, as is largely the case in spoken language, but also for real differences in cognitive meaning.

Incorporation of size and shape. Another way in which information can be packed into sign without losing any time is by the way in which the sign can vary with respect to the physical size and shape of the referents of the grammatical elements. This is particularly true of verbs. The most pervasive and general kind of thing which is incorporated is height relative to the signer (or protagonist of a story - see below). Thus the sign REQUEST, which barely changes at all to incorporate location (the hands are slightly further out from the body for I-REQUEST-OF-YOU than for YOU-REQUEST-OF-ME)
is made relatively high if the indirect object is tall and relatively low if the indirect object is short. There are many signs which do not incorporate any aspect of size and shape other than relative height. For any grammatical element there is some verb which can incorporate the size and/or shape of its referent. Thus, there are verbs which can incorporate some or all of agent, direct object, indirect object, source, goal, and instrument. And various aspects of size and shape can be incorporated. Thus, take the verb RUN. The sign for a person or other two-legged animal running is made by placing the tips of the thumbs together, with index fingers parallel and pointing away from the signer, other fingers closed, and wiggling the index fingers while moving the hands away from the body. For a four-legged animal the sign is made rather differently. One hand is behind the other, with the first two fingers of each hand sticking down, and the fingers move in various ways depending on the animal involved. The sign RUN incorporates legs, but the sign BITE incorporates something like mouth. A large animal bite is different from a small animal bite, is different from a snakebite, is different from a scorpion bite, is different from a mosquito bite. So the sign BITE incorporates the size and shape of the agent referent’s mouth; it also incorporates the size and shape of the object, in another way. While the neutral sign BITE is made on the hand, it can be made on virtually any part of the body, visibility and decency permitting.11 The same is true for CUT, OPERATION, and BLEED (here BLEED incorporates the source).
A sign like REMOVE can incorporate the size and shape of the object, the source, the goal, and also the instrument. Thus, REMOVE-LARGE-PAINTING-FROM-WALL-WITH-HANDS is different from REMOVE-NAIL-FROM-WALL-WITH-CLAW-HAMMER, and REMOVE-SMALL-THING-FROM-WIDE-MOUTH-JAR-WITH-SPOON is different from REMOVE-SMALL-THING-FROM-NARROW-MOUTH-JAR-WITH-FORK, REMOVE-UPPER-DENTURES-FROM-MOUTH is different from REMOVE-LOWER-DENTURES-FROM-MOUTH. It is beginning to look as though the incorporation of size and shape involves some sort of feature system, since we need a method of cross-classification in order to attain maximum generality. Thus, EAT-ICE-CREAM-WITH-SPOON employs the same hand-shape for the signing hand as the sign for removing something with a spoon, while SCREW-ON-TOP-OF-JAR has the same base hand as the sign for removing something from a jar. While many signs which incorporate the size and/or shape of various elements are optional, and seem to be used somewhat to add color to a signed utterance, there are some verbs for which this type of incorporation is obligatory. Thus, the verb CLOSE
11. There is a sort of ‘strike zone’ in which signs are usually made. It includes the area in front of and immediately to the side of the face, trunk, and arms. There are a very few signs made below the waist. Thus, if one wished to say in sign that a mosquito bit one on the foot, a hand would substitute for the foot.
requires the signer to specify what is being closed. (A window that slides up and down will have a different sign for closing than a window that slides sideways.) Examples from our original stories which exemplify this type of incorporation are BANDAGE-ARM, SMOKE-CIGARETTE, PAT-ARM, SINGE-HAIR-ALONG-FACE, GRAB-HANDS, OPEN-GATE, CUT-ON-ELBOW-ABOUT-SIX-INCHES-LONG. Again, incorporation of size and shape of grammatical elements is not necessarily unique to sign - numerous American Indian languages have different affixes for verbs depending on the size, shape, or even degree of rigidity of the object - but the way in which this is done is unique to a sign modality. And again, this type of incorporation is a way of compacting a great deal of information into a small temporal space.

5.3 Body movements and facial expression
There have recently been numerous studies on what has been called ‘body language.’ It has been claimed (Birdwhistell, 1970) that one can tell a great deal about a person’s inner feeling by carefully observing the attitude of the body and the expression on the face - the body is, as it were, often communicating something quite different from what the voice is trying to get across. Since sign is much more a language of the body than spoken language, employing as it does the hands, which often must touch various parts of the body, one might think that a signer would be more attuned to body language than a hearing person, and that body language would be very much a part of sign language. While both of these things are true, to a certain extent - (our informant can get good or bad ‘vibes’ from a person without directly communicating with him, just by watching, and a signer who does not make use of facial and bodily expression looks like a zombie) - the way they are true is rather different from what an outsider might expect. The way in which a signer uses the face and body is different from the way a hearing person uses the face and body in one important respect: Facial expression and body attitude can be part of the grammar of sign where they are usually considered as part of paralinguistics for spoken languages.

Facial expression. Although there is a sign NOT, and various other negative signs, one very common way of negating a sentence, or part of a sentence, is with a shake of the head. Often this headshake is reduced to a slight frown - someone who is just learning sign can miss it. We gloss this headshake as NEG. It is a suprasegmental which can, as we mentioned, be superimposed over all or part of a sentence. Thus, one way of translating the sentences in (19) into sign is with the sentences in (20):

(19) a. I know that.
     b. I don’t know that.
(20) a. ME KNOW THAT.
     b. ME KNOW THAT.
        ←NEG→
The only difference is the headshake. Sometimes the headshake or frown can serve to disambiguate two lexical items. The sign FURNITURE is made by moving two ‘F’ hands (thumb and index forming a circle, other fingers sticking out) towards and away from each other repeatedly. The sign IT’S-NOTHING is made exactly the same way, but it is accompanied by a frown. Nodding the head (‘yes’), like the headshake, is also a suprasegmental, which can occur over all or part of a sentence. In a declarative, it gives an emphasis on affirmation. In a yes-no question (to our knowledge, it never occurs in WH-questions), it is the sign equivalent of tag formation. (One kind of negative tag formation uses NEG in the same way). Thus, the translations of the sentences in (21) into sign are those given in (22):

(21) a. Do you like this?
     b. You like this, don’t you?
     c. Don’t you like this?
(22) a. YOU LIKE THIS?
     b. YOU LIKE THIS?
        ←YES→
     c. YOU LIKE THIS?
        ←NEG→
The questions in (22) bring up another point about facial expression in sign. Yes-no questions are often marked only by a slight raising of the eyebrows, usually accompanied by a widening of the eyes, though occasionally also by a higher final position of the hands at the end of the sentence. Hence, this is another way in which facial expression can become part of the grammar of sign. We saw with example (22) above that the eyes play an important role in sign questions. Independent of this, however, the eyes may indicate other aspects of the language. The difference between I-GIVE-YOU and I-GIVE-HER can be the direction of the gaze. If the eyes are directed right at the addressee, the sentence means ‘I give you’. If the eyes are even slightly averted from the addressee, upward, downward, or sideways, it means something else. This use of the eyes can also be used to mark off direct quotation from the rest of a narrative. Direct quotation seems to be used far more extensively in sign than in English, some of the reasons for which we shall see in the next section. Consider this excerpt from one of our simultaneous stories:

Speech                                       Sign
And I says, ‘oooh, she hit me, she hit       ‘(SHE) HIT ME, (SHE) HIT ME, (SHE)
me, she’s chasing me she’s chasing           CHASE ME, CHASE ME, AWFUL!’
me, aaugh, she’s so awful!’
And I’m crying and I’m bleeding.             CRY + +, BLEED-FROM-ARM + +.
In the spoken version, there are two signals that cue the direct quotation. The first is ‘And I says,’ The second is the change in tone of voice, particularly at the end of the quotation. In the signed version, though there is a sort of facial analogue to the tone of voice in this case, it would not be necessary. It is simply (but cf. next section) that the eyes are not directed at the addressee during the quote, but change to face the addressee at the end. Since shifting eye contact can be performed even while signing, this is still another mechanism in sign which saves time. Body attitude. One way in which the modality of sign has large consequences in determining the shape of the language is in its use of space. The way a signer can set things up in space and move location-incorporating verbs between them, as we have discussed above, is one example of the utilization of space. Another is shaping space with the hands and body to show size and shape, and moving in space to show number. Still another crucial use of space becomes evident in connected discourse or narrative, a use which involves attitudes, and in particular, shifts in attitudes, in the body and head. The narrative does not have to be very long. One time, our informant had been trying to lose weight and was tempted to eat a piece of scrumptious pie. Above her head and to the right, she established a sort of balloon with her hands which symbolized her conscience. Then she turned her body to the left and bent her head downward, thus effectively becoming the conscience. While in that position, she, as the conscience, scolded herself. Then becoming herself again, she turned her body to the right and lifted up her head toward where she had established her conscience and signed ‘YES, DEAR.’ The whole sequence took a very short time. This use of bodily shifts to indicate various characters in a story, particularly to indicate who is speaking to whom, is evident in some of our original stories. 
Let us return to the signed version of the story of which (24) is a part. A lengthier excerpt is printed below.

COME MOTHER COME +
‘WHAT WRONG? HAPPEN?’
‘(SHE) HIT ME, (SHE) CHASE ME, CHASE ME, AWFUL.’
CRY + +, BLEED-FROM-ARM + +.
In addition to changes in eye contact and facial expression noted above, there are some important changes in the attitude of the body, which enable us to decipher this discourse. The first line is signed with the signer’s eyes directed at the viewer and the body facing straight ahead. In the second line, the angle of the body shifts to the left and the head bends downward - the mother addressing the child. In the transition to the third line, the informant indicates the child addressing the mother, so the body now shifts to the right and the head and eyes are directed upward. At the end of this quote, the eyes and body are again directly facing the viewer. Without these changes in body attitude, one might misunderstand, not knowing who was talking to whom about what.
Just as the other mechanisms we have discussed shape space, this delimits and concentrates it, and again, takes up very little time, as compared with the amount of time it would take to disambiguate the situation merely by using signs. These bodily means of indicating relations are used pervasively in sign; they take advantage of the possibilities inherent in a spatial modality, and, perhaps not so incidentally, take up very little time. They also help to make the language quite vivid and absorbing to watch.
6. A comparison of signed English and American Sign Language
All of the mechanisms discussed - doing without, incorporation of various aspects of various elements, and body and facial expression - have in common the fact that they exploit the possibilities inherent in a gesture language. However, the examples we have given were taken from situations which were not even specifically designed to bring these mechanisms out; the only instructions to the subjects were to tell a story as though the addressee were a native signer. When we specifically design a situation to bring these out, what we get is a distillation of all the mechanisms we have been discussing. What a signer often does when she utilizes the above mechanisms is to take on the role, however briefly, of a character in her narrative, and to identify herself with that character, so that all the aspects of the situation refer to the relative placement of that character. Sometimes, in a narrative, however, the point of view of the character changes - i.e., the story is seen from a different character’s eyes. In a recent paper, Kuroda (in press) describes the grammatical and stylistic consequences when such a change takes place. Kuroda has written a short, illustrative paragraph to indicate these changes, and we have adapted it for our research. The adaptation follows:

John was standing by Mary in the house when Bill hit him. Falling to the floor, he saw her slender ankle. Instantly, Bill grabbed her arm, and dragged her outside of the house. The two of them looked up. The night sky of winter was clear and innumerable stars were coldly shining.

It is the use of this particular paragraph with shift in point of view that makes for our designed situation. The instructions were the same as we had given our other subjects, with one exception. We asked our native informant, in this case a deaf person, to sign the paragraph as though to another deaf person; we asked her, in addition, to do the story in ‘Signed English,’ i.e., to use English word order and functors12
12. There are a number of versions of signed English being developed at present, primarily for use in schools. Signs are invented for English functors and also for morphemes such as -ed (or past), -ing, or -ment. Some examples are SEEing Essential English, Linguistics of Visual English, and Signing Exact English which are in the process of development.
filled in, and most importantly, without moving her body or changing her gaze. The signed English required 49 signs and 10 inflections; the ASL used 27 signs. For the ASL version, glosses for the signs themselves are printed in the left-hand column, and the simultaneous or subsequent suprasegmentals and bodily shifts are in the parallel right-hand column.

Signed English

1.  JOHN WAS STANDING NEAR THE GIRL IN THE HOUSE.
2.  WHEN BILL HIT PAST JOHN.
3.  JOHN FALL-DOWN PAST TO THE FLOOR.
4.  JOHN SEE PAST THE GIRL ’S PRETTY L-E-G-S.
5.  FAST BILL GRAB PAST THE GIRL AND DRAG PAST HER OUTSIDE THE HOUSE.
6.  BOTH LOOK-UP PAST AT THE SKY
7.  IT WAS REAL LY CLEAR AND MANY + + STAR + + WERE SHINE ING

American Sign Language

    Sign                                 Suprasegmentals and body shifts
1.  BOY GIRL STAND-TOGETHER,             Left hand for first boy; right hand;
    BOY-ARM-AROUND-GIRL IN HOUSE         all signed on left side of the body
2.  WHEN BOY HIT-HIM                     From right, right hand for second boy;
                                         to previous location of first boy on left
3.  FALL-DOWN                            Left hand
4.  TURN-TO-LOOK-UP [from prone          Body and eye shift slowly throughout
    position] ‘PRETTY L-E-G-S’           to gaze and orient right
5.  BOY GRAB GIRL OUTSIDE HOUSE.         Body at an angle, right hand signing;
                                         grabbing from previously established
                                         location of girl; action gradually
                                         devolving to extreme right side of body
6.  BOTH LOOK-UP                         Eyes directed upward, back to
                                         established location of house
7.  BEAUTIFUL SKY, CLEAR, WITH           Signs higher than normal, eyes remain
    MANY + + STAR + + TWINKLE.           directed upward.
One of the main differences in the number of signs between the English and the ASL is that the English, in order not to be ambiguous, must keep referring to the actors. The ASL version, by first setting the actors up in space, does not need to mention them after the first time. The body and hand changes reflect the changes in point of view that the English version spells out in signs. Particularly striking in this respect is sentence (4) in this sequence. In the Signed English version, the fact that sentence (4) follows sentence (3) is the only indication of the fact that John is lying down when he looks at Mary’s ankles. By contrast, in the ASL version, both the sign for LOOK and the direction of the eyes have been modified to show John’s vantage point. The use of direct rather than indirect discourse in the ASL version of this sentence serves to emphasize this vantage point. This device is used again in sentences (6) and (7), where the viewpoint is no longer John’s but rather Bill and Mary’s. The ASL version of this story uses fewer signs, is temporally shorter (and incidentally, perhaps more interesting to watch), and yet actually gives us more direct information, as opposed to information that has to be inferred, than the Signed English version. Changing the shape of the language with the kinds of mechanisms available to the modality of sign, which we have been discussing in this section, makes possible a good deal of economy. It is probably for these reasons that a proposition in sign seems to take about the same amount of time to express as a proposition in English. Exploiting the mechanisms available to the modality makes it possible to compensate for the problem apparently inherent to such a modality, namely that a sign made in space takes longer to articulate than a word made in the vocal tract. As a result of these mechanisms, the language looks very different even when written down, and in many ways, it is indeed different from spoken modality, but at the same time it can often convey the same amount of information in a given amount of time as a spoken language.
7. Summary
We began with a brief superficial description of some of the signs of the American Sign Language of the deaf. Signs involve movement of the hands in space, and we wondered if there might be a difference between the rate at which they are produced compared with the rate of articulation for words. We found three subjects who are completely bilingual in both languages (both modalities). They are hearing people of deaf parents who had learned sign as a native language and have continued to use sign in daily conversation with deaf people throughout their lives. We videotaped three versions of a story from each: one in spoken English, one in American Sign Language, and one in both languages produced simultaneously. Omitting the hesitation pauses in the stories, we calculated the rate of articulation for signs and for words. The rate of articulation for words was nearly double the rate of articulation for signs when the two languages were produced separately. We made an arbitrary definition of a proposition
(underlying elementary sentence) in the narratives, and calculated the mean length of time per proposition. The rates for producing propositions were similar for signing and speaking - despite the great differences in the rate of articulation. The subjects were producing underlying sentences at a comparable rate in the two languages, but filling them with nearly twice as many words as signs. We then speculated on the consequences of this difference in the means of production of two languages. It may be the case that a sign language would tend not to use some of the surface complications of a spoken language like English. We find, indeed, that there are no common signs for articles, inflections, copula, some prepositions, etc. There seems to be a strong tendency to condense the message in sign language: There seems to be a premium on economy of expression. When translating an English sentence into its equivalent in American Sign Language, non-essential elements are almost invariably eliminated. (This may be what has led some writers to claim, quite incorrectly, that sign language ‘has no grammar.’) It seems to us that this condensation may be a response to pressure when the rate of articulation of the language is so different from speech. It may also be that sign language has special ways of compacting and incorporating linguistic information that, because of its nature, are different from spoken language. We are just in the early stages of our studies of the linguistic mechanisms of sign language. We find that they are indeed different from speech and allow for addition of linguistic information without increasing the time required for signing.
REFERENCES
Anthony, David A. and Associates, Eds. (1971) Seeing essential English. Anaheim, Calif., Educational Services Division, Anaheim Union High School District.
Birdwhistell, Ray L. (1970) Kinesics and context. Philadelphia, University of Pennsylvania.
Fischer, Susan D. (1971, mimeograph) Two processes of reduplication in the American Sign Language. San Diego, Calif., The Salk Institute.
Goldman-Eisler, F. (1968) Spontaneous speech. London, Academic Press, Inc., Ltd.
Greenberg, Joanne (1970) In this sign. New York, Holt, Rinehart and Winston.
Gustason, Gerilee, Signing exact English. Silver Springs, Maryland, Publishing Division, National Association of the Deaf, 814 Thayer Avenue 20910.
Kuroda, Sige-Y. (In press) Where epistemology, style, and grammar meet. In S. Anderson and P. Kiparsky, Eds., Studies presented to Morris Halle.
McNeill, David (1971) Sentences as biological processes. Paper presented at the International Colloquium on Current Problems in Psycholinguistics, Centre National de la Recherche Scientifique, Paris, France.
Stokoe, William C., Jr. (1960) Sign language structure: An outline of the visual communication system of the deaf. Studies in Linguistics, Occasional Paper 8. Buffalo, New York, University of Buffalo. P. 78.
Stokoe, William C., Jr., Casterline, Dorothy, and Croneberg, Carl G. (1965) A dictionary of American Sign Language on linguistic principles. Washington, D. C., Gallaudet College Press.
Wampler, Dennis (Manuscript) Linguistics of visual English. 2322 Maher Drive, # 35, Santa Rosa, California 95405.
Résumé The data presented show that producing a sign in American Sign Language (ASL) takes more time than producing a spoken word, but that producing a proposition takes about the same time in the two languages, and takes the same time with either mode for bilingual subjects. A discussion follows of the properties of ASL that account for these facts.
4
An auditory illusion of depth¹
JAMES R. LACKNER²
Massachusetts Institute of Technology
Abstract The auditory perception of distance may be altered by systematically transforming the time and intensity ratios of the auditory cues at a subject’s ears. Two auditory ‘illusions’ of depth, opposite in sign, were generated by attenuating the signal in one ear from an ‘auditory pendulum’ in one case and masking it with white noise in the other. The change in perceived depth of the acoustic pendulum with attenuation of one ear was analogous to the Pulfrich phenomenon in vision. These auditory experiments suggested two additional ways of generating visual Pulfrich effects that were then demonstrated.
1. Introduction Auditory space perception shares with visual space perception the existence of mechanisms for indicating the direction and distance of signals; in both modalities, some of these mechanisms depend on the existence of paired sense organs (Békésy, 1960; Ogle, 1950). It is, therefore, to be expected that certain illusions of depth that can be produced in vision by differential stimulation of the two eyes should have their analogues in audition where they could result from differential input to the two ears. The cues which can be utilized in binaural localization of pure tones are differences in arrival times, sound intensities, and wave crests of the tone at the two ears (Békésy, 1960). Time cues are determined by the direction of the sound source
1. This work was supported by an NDEA fellowship held by the author and by a grant to Professor H.-L. Teuber from the John A. Hartford Foundation, Inc. of New York City.
I am grateful to Professor Richard Held and Professor H.-L. Teuber for valuable suggestions on the manuscript. 2. Also at Brandeis University.
with respect to the listener’s head. If the sound source is anywhere in the median plane of the head, the sound will arrive at the two ears simultaneously. If the sound source is displaced to one side, then the sound will arrive at the ear on that side first. Intensity cues are a result of the shadowing effect of the listener’s head and are less pronounced with low frequency tones whose wave fronts diffract as they pass by the head. Phase cues are generated by the difference in arrival times of the wave crests at the ears when a sound source is off to one side of the listener’s median plane and are effective (and unambiguous) primarily for sounds below 1000 hz. The differences in sound arrival times and intensities at the two ears have a curious property. If a sound source is moved away from the listener along a radius originating at the center of his interaural axis, then the arrival-time differences of the sound at his ears will remain constant. The geometry of this situation is illustrated in Figure 1; regardless of the position of the sound source on the extension of the radius, the interaural time difference, d(θ + sin θ), remains constant and its exact value depends only on θ, the angle between the projected radius and the listener’s median plane. However, the difference in sound intensity between the ears varies as a continuous function of the source intensity times the difference between the inverse squares of the distances of the sound source from the two ears. As the distance of the source from the listener is increased this sound intensity difference approaches zero.
Figure 1. As a sound source is moved along the radius from position R1 to R3 the interaural time difference, dθ + d sin θ, remains constant and depends only on the value of θ.
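The contrast between the two cues can be made concrete with a small numerical sketch. It is illustrative only and rests on assumptions that are not in the text: d is read as the head radius, the ears are treated as two points in free space (no head shadow), the speed of sound is taken as 343 m/s, and all function and parameter names are hypothetical.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed value, not given in the text


def path_difference(head_radius, theta):
    """Interaural path difference d(theta + sin theta) as quoted in the text.

    head_radius stands in for d (an assumed reading of the symbol) and
    theta is the angle from the median plane in radians. Source distance
    does not appear in the expression, so the value cannot vary with it.
    """
    return head_radius * (theta + math.sin(theta))


def time_difference(head_radius, theta):
    """Convert the path difference to seconds (assumed speed of sound)."""
    return path_difference(head_radius, theta) / SPEED_OF_SOUND


def ear_distances(source_range, theta, half_ear_separation):
    """Straight-line distances from a point source to the near and far ear.

    The ears sit at (-a, 0) and (+a, 0); the source lies at range
    source_range along a radius making angle theta with the median plane.
    """
    x = source_range * math.sin(theta)
    y = source_range * math.cos(theta)
    near = math.hypot(x - half_ear_separation, y)
    far = math.hypot(x + half_ear_separation, y)
    return near, far


def intensity_difference(source_intensity, source_range, theta, half_ear_separation):
    """Source intensity times the difference of the inverse squared distances."""
    near, far = ear_distances(source_range, theta, half_ear_separation)
    return source_intensity * (1.0 / near ** 2 - 1.0 / far ** 2)


if __name__ == "__main__":
    theta = math.radians(45)
    for source_range in (1.0, 2.0, 4.0, 8.0):
        print(source_range,
              time_difference(0.0875, theta),
              intensity_difference(1.0, source_range, theta, 0.0875))
```

Under these assumptions the time cue prints the same value at every range, while the intensity difference shrinks toward zero as the source recedes.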
It is generally recognized that for pure tones above 1000 hz, an intensity difference at the two ears is the only cue available to the listener for gauging the distance of a sound source. Thus one might expect that a listener, with his head stationary and the arrival times and phase of a tone at his ears constant, should perceive fluctuations of auditory depth if the intensity difference of the tone at his ears were varied systematically. A similar situation obtains in the Pulfrich stereo-phenomenon, a well known visual illusion of depth. If an observer, with one of his eyes covered by a neutral density filter, views binocularly a pendulum bob swinging in his frontal plane, he will see the bob move closer to him as it swings toward the eye covered by the filter and farther away from him on its return swing. The apparent path of the pendulum bob represented from above is an ellipse (Pulfrich, 1922). The Pulfrich effect is believed to depend on the alteration of stereoscopic depth cues which arises from the different neural latencies of the two eyes in the experimental situation. Registration of bob position is delayed in the eye covered by the filter because that eye has longer neural latencies; in binocular fusion the bob is seen in the position that is associated with the new pattern of retinal disparity cues in the two eyes. Only at the endpoints of its transit, where its velocity is zero, is the bob seen at its ‘correct’ depth. The greatest displacements in depth relative to the resting positions are seen in the center of the bob’s transit where its velocity is maximal (Lit, 1960). These changes in visual depth for altered light intensity ratios give rise to the expectation of similar changes in auditory depth under analogous conditions in which the intensity cues at the listener’s ears could be systematically distorted. Accordingly, an auditory analogue of the Pulfrich effect was devised. 
Its study suggested several additional ways of obtaining the Pulfrich effect which were also investigated.
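The dependence of the Pulfrich displacement on bob velocity can be sketched numerically. The sketch assumes simple sinusoidal motion and a fixed extra latency in the filtered eye, and it uses the first-order approximation that the lag in registered position equals velocity times delay. The amplitude, period, and delay values are illustrative and not taken from the paper, and the step from positional lag to perceived depth is not modelled.

```python
import math


def bob_position(t, amplitude, period):
    """Lateral position of the bob, x(t) = A sin(2*pi*t / T)."""
    return amplitude * math.sin(2.0 * math.pi * t / period)


def bob_velocity(t, amplitude, period):
    """Lateral velocity of the bob, the time derivative of bob_position."""
    omega = 2.0 * math.pi / period
    return amplitude * omega * math.cos(omega * t)


def registration_lag(t, amplitude, period, delay):
    """First-order lag between the positions registered by the two eyes.

    The delayed eye reports where the bob was `delay` seconds earlier, so
    the two registered positions differ by roughly velocity * delay.
    """
    return bob_velocity(t, amplitude, period) * delay


if __name__ == "__main__":
    amplitude, period, delay = 0.3, 2.0, 0.01  # illustrative values only
    for label, t in (("center of transit", 0.0), ("endpoint", period / 4.0)):
        print(label, bob_position(t, amplitude, period),
              registration_lag(t, amplitude, period, delay))
```

In this approximation the lag vanishes at the endpoints, where velocity is zero, and is largest at the center of the transit.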
2. Procedure An ‘auditory pendulum’ was constructed by mounting an earphone from a Grason-Stadler headset on one end of a three-foot metal rod; the electrical leads from the phone were run along the length of the rod. The other end of the rod was attached to a roller bearing supported by a shaft forming the center of rotation of the assembly. Upon being raised and released, the pendulum would swing back and forth in an arc determined by the release height. Eighteen inches in front of the arc’s vertical plane and 3 inches above its lowest point, two microphones were mounted with a horizontal separation of 7 inches (to mimic the listener’s interaural axis). The outputs of these microphones, after appropriate amplification, were fed into separate
phones of the listener’s stereophonic headset. During the experimental procedure the listener sat blindfolded with his head stabilized by a chin and forehead rest; a cloth drape separated him from the pickup microphones, preventing him from sensing air currents generated by the pendulum. Figure 2 is an illustration of the experimental situation. The entire apparatus was housed in an acoustic chamber.
Figure 2. The experimental situation (diagram label: ATTENUATOR).
During an experimental trial the pendulum was set in motion and a 2000, 3000, or 4000 hz signal was delivered to its speaker; pure tones were utilized to eliminate the possibility of the listener employing tonal complexity cues in his judgments. The output of either phone of the calibrated Grason-Stadler stereo headset could be attenuated, or masked with white noise. The listener’s task was to give a running report of the absence or presence of a depth change in the sound that he heard moving back and forth and, in the event of a change in depth, its direction. Subjects were M.I.T. undergraduates who had volunteered for paid participation. Each prospective subject’s hearing thresholds were measured with a Békésy audiometer; to qualify for inclusion in the experiment, the subject’s ears had to be matched within 5 db throughout a frequency range of 1800 to 4200 hz. All subjects used had participated in previous experiments involving binaural localization tasks.
2.1 Experiment 1 Twelve subjects were run individually, in approximately 60-minute sessions, under each of the following conditions: 50 db signal (relative to the subject’s detection threshold measured with the pendulum stationary) at 2000, 3000, or 4000 hz to the pendulum phone and white noise at 0, 10, or 20 db masking the signal in one phone of the subject’s headset. Half of the subjects heard the white noise in the left phone, and half in the right phone. At the end of each ten-second experimental trial, the subject had 20 seconds to expand the report he had given during the trial and to prepare for the next trial. Each subject received 5 trials in a block for each of the 9 experimental conditions. The order of experimental conditions was randomized for each subject. The results of Experiment 1 are presented in Table 1. No significant changes in depth were reported without the presence of the white-noise mask in one phone. For both the 10 and 20 db levels of white noise, the reports of changes in depth were significantly different (p < .001) from the control conditions in which the noise mask was absent.

Table 1. Combined reports of depth changes for twelve subjects in Experiment 1. Each subject had five trials under each of the nine experimental conditions. An entry represents the total number of depth changes reported in that condition for sixty trials.

          White noise mask (db)
hz        0       10      20
2000      2       38      46
3000      2       35      47
4000      3       43      51
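Because every entry in Table 1 is a count out of sixty trials, the counts convert directly to proportions. The short sketch below does this arithmetic and sums the counts for each mask level. It assumes the rows of the table are signal frequencies and the columns are mask levels; the variable and function names are hypothetical.

```python
TRIALS_PER_CONDITION = 60  # twelve subjects x five trials, as stated in the caption

# Depth-change reports from Table 1: {signal frequency (hz): {mask level (db): count}}
depth_reports = {
    2000: {0: 2, 10: 38, 20: 46},
    3000: {0: 2, 10: 35, 20: 47},
    4000: {0: 3, 10: 43, 20: 51},
}


def report_proportions(counts, trials=TRIALS_PER_CONDITION):
    """Fraction of trials in each condition on which a depth change was reported."""
    return {hz: {db: n / trials for db, n in row.items()}
            for hz, row in counts.items()}


def totals_by_mask_level(counts):
    """Sum the reports over the three frequencies for each mask level."""
    totals = {}
    for row in counts.values():
        for db, n in row.items():
            totals[db] = totals.get(db, 0) + n
    return totals


if __name__ == "__main__":
    for hz, row in report_proportions(depth_reports).items():
        print(hz, {db: round(p, 2) for db, p in row.items()})
    print(totals_by_mask_level(depth_reports))
```

The totals pool the three frequencies, so each mask level is summarized over 180 trials.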
As the experimental sessions progressed most subjects found that they could hear the sound either as localized within their head or as located externally in front of them. Moreover, they could voluntarily shift from one form of localization to the other. The direction of the change in depth depended on which ear received the white-noise mask. In all reports of changes in depth, as the pendulum swung toward the side with the white-noise mask, the subject heard it move farther away. By contrast, in the Pulfrich situation, as a pendulum bob swings toward the side of the eye covered by a neutral-density filter, it appears to move closer. This raised the possibility of reversing the direction of the depth effect in the auditory case by attenuating one channel rather than masking it.
2.2 Experiment 2 Experiment 2 has the same form as Experiment 1, except that the signal to one channel was attenuated 0, 3, or 6 db instead of being masked by white noise. Twelve new subjects were used. Table 2 summarizes the results. With the signal to one ear attenuated, significant changes in depth were reported (p < .001). Moreover, when the pendulum approached the attenuated side, the subject heard it as moving closer to him. Thus, the direction of the displacement in depth was the reverse of that found with a unilateral masking noise. In this experiment, as in Experiment 1, listeners tended to report more depth changes for the high frequency signals.

Table 2. Combined reports of depth changes for twelve subjects in Experiment 2. Each subject had five trials under each of the nine experimental conditions. An entry represents the total number of depth changes reported in that condition for sixty trials.

          Attenuation (db)
hz        0       3       6
2000      1       43      48
3000      2       50      52
4000      1       49      53

3. Discussion
The present experiments support the hypothesis that systematic alteration of the intensity ratios of a tone at the ears can evoke a perception of change in auditory depth. Attenuating the signal to one ear yielded a depth change analogous to the Pulfrich stereo-phenomenon in vision; masking the signal with white noise reversed the direction of the depth change. If similar physical cues are being utilized in the visual and auditory modalities to mediate perceived changes of depth, then it should be possible to reverse the direction of the Pulfrich effect by increasing instead of decreasing the illumination in one eye. This was easily corroborated by indirectly illuminating one of an observer’s eyes with a penlight while he viewed a moving bob. Under these conditions the observer saw the bob move farther away as it approached the side of his illuminated eye. By increasing the disparity in illumination between the two eyes, it was possible to increase the perceived displacement in depth of the bob as it swung toward the illuminated side.
An experiment by Lemmon and Geisinger (1936) suggested a method of setting off the mechanisms involved in the Pulfrich stereo-phenomenon in vacuo. These investigators found that the reaction time to a visual stimulus increases when the eye is dark adapted. They attributed this increase to differences in the neural latencies of the cones and rods of the retina. If their hypothesis is correct, dark-adapting the eye should be similar in its effect on neural latencies to putting a neutral density filter in front of the eye. Consequently, a Pulfrich effect should be obtained while viewing a moving bob with one eye light-adapted and the other eye dark-adapted. To test this prediction a patch was worn over one eye for 30 minutes to permit that eye to dark-adapt. The room illumination was lowered until a bob swinging in the frontal plane was just visible to the light-adapted eye; then, the dark-adapted eye was uncovered. Although with monocular viewing the bob appeared much brighter to the dark-adapted eye, during binocular viewing the bob appeared to move closer as it swung toward the side of that eye. As the adaptation state of the two eyes equalized, the depth disparity also diminished to zero. The direction of the displacement in depth was the same as that which occurs in the Pulfrich effect obtained using a neutral density filter in front of the eye. This observation has been repeated frequently with invariant results.
These auditory experiments and visual demonstrations suggest that, for both modalities, fluctuations in perceived depth can be induced by latency differences in the registration of positions. In the auditory situation, a white-noise mask fed into one ear presumably shortens the neural latencies and raises the level of receptor activity in that channel; as a result, in binaural fusion, the sound is displaced into the distance as it approaches the listener’s masked side. This hypothesis is illustrated in Figure 3. Figure 4 illustrates the same situation with the sound source moving in the opposite direction.
Figure 3. A - Position of sound source registered by non-masked ear. B - Position of sound source registered by masked ear. C - Perceived position of sound source after binaural fusion. (Diagram labels: direction of movement of sound source; white noise.)
Figure 4. A - Position of sound source registered by non-masked ear. B - Position of sound source registered by masked ear. C - Perceived position of sound source after binaural fusion. (Diagram labels: direction of movement of sound source; white noise.)
Conversely, with unilateral attenuation instead of masking, neural latencies would be lengthened in one channel relative to the other, and as a consequence the binaural fusion cues would be reversed; as a result, the sound should be heard to move closer as the sound source approaches the attenuated side. These considerations indicate that the ability to perceive auditory distance is more refined than one might expect from considering the usual stimulus arrangements in comparison with the immeasurably better ones for auditory direction. Moreover, the pursuit of analogies between vision and audition evidently continues to uncover interesting aspects of both modalities.
REFERENCES
Békésy, G. (1960) Experiments in hearing. New York, McGraw Hill.
Lemmon, V. W., and Geisinger, S. M. (1936) Reaction-time to retinal stimulation under light and dark adaptation. Amer. J. Psychol., 48, 140-142.
Lit, A. (1960) The magnitude of the Pulfrich stereophenomenon as a function of target velocity. J. exper. Psychol., 59, 165-175.
Ogle, K. N. (1950) Researches in binocular vision. Philadelphia, W. B. Saunders Company.
Pierce, A. H. (1901) Studies in auditory and visual space perception. New York, Longmans and Green.
Pulfrich, C. (1922) Die Stereoskopie im Dienste der isochromen und heterochromen Photometrie. Naturwissenschaften, 10, 533-564.
Résumé The auditory perception of distance can be altered by systematic transformations of the time and intensity ratios of the auditory stimuli presented to the subject. Two auditory ‘illusions’ of distance, opposite in sign, can be obtained either by masking with white noise or by attenuating a signal coming from an ‘auditory pendulum’. This signal is sent to one ear only. The change in the perceived distance of the acoustic pendulum when the signal to one ear is attenuated is analogous to the Pulfrich phenomenon in vision. These auditory experiments suggest two additional ways of producing visual Pulfrich effects.
5
The abstraction of linguistic ideas: A review
JOHN D. BRANSFORD
State University of New York at Stony Brook
JEFFERY J. FRANKS
Vanderbilt University
Abstract The present paper investigates the status of the individual sentence. Is the sentence the unit of memory, or is it primarily a unit for communicating ideas? A series of studies is presented demonstrating that Ss do not simply retain information expressed by individual input sentences. Instead Ss spontaneously integrate information communicated by sets of semantically related (and often non-consecutively presented) acquisition sentences to construct more wholistic semantic descriptions. These wholistic descriptions may contain more information than any particular input sentence expressed. Memory is primarily a function of these wholistic structures. Ss will recognize and recall many sentences never presented during acquisition but which are derivable from the semantic structures acquired. However, Ss will rarely recall or recognize information that represents a distortion of these integrated ideas. Semantic integration is investigated in a variety of experimental conditions. It is shown to occur within the context of specially designed ‘integration paradigms’ as well as in prose passages, and it is shown to occur for a wide variety of acquisition tasks. Some models attempting to account for the data are evaluated, and implications are discussed.
The relation between memory and language has been a focal point of research and theoretical speculation since psychology’s inception. Approaches to the study of this relationship have been widely divergent. The pioneering work of Ebbinghaus (1889), for example, emphasized rote memory for individual items. Through the use of nonsense syllables, Ebbinghaus attempted to eliminate the ‘confounds’ of meaning from his experiments, and the precise repetition of a previously experienced input was considered a prerequisite for correct recall. A very different set of emphases
were exemplified in Wundt’s approach to the study of language and memory (cf. Blumenthal, 1970). Wundt was concerned with sentences and their meanings. Sentences, according to his position, had both simultaneous and sequential structures. The general idea or cognition underlying the sentence was assumed to be a relatively wholistic semantic structure, and a sequential (syntactic-phonological) linguistic structure was imposed on this wholistic meaning in order for the latter to be expressed. In short, Wundt distinguished between cognitions and the particular sequential forms in which they were expressed. Although the study of rote memory for individual items continues to play a valuable role in formulating general theories of memory, recent accounts of memory for natural language materials have moved increasingly closer to Wundt’s viewpoint. These recent accounts have been influenced by modern transformational linguistic theories (e.g. Chomsky, 1965; Katz and Postal, 1964; Postal, 1964). These linguists distinguish between surface structures and deep structures of sentences, a distinction somewhat analogous to Wundt’s distinction between successive and simultaneous structures. The surface structure provides the sequential ordering of the lexical items and the basis of phonological specification of the sentence. The underlying or deep structure provides the basis for specification of the meaning of the sentence. An important consequence of this distinction between surface and deep structures is that sentences can be different at one level yet similar at another. For example, the sentences (1) The boy kissed the girl and (2) The girl was kissed by the boy differ at the level of surface structure, but their respective deep structures are very nearly the same (Katz and Postal, 1964).
Wundt would undoubtedly have argued that these two sentences lead to or are based on the construction of similar simultaneous cognitions despite the fact that they differ in their sequential, expressive form. Hypotheses about various levels of linguistic structure have motivated considerable research on sentence memory. It has been shown that Ss often remember sentence meanings despite forgetting their original wordings (e.g. Sachs, 1967), and many researchers have argued that a semantic interpretation of the deep structural relations specified by current transformational grammars characterizes the abstract information that is retained in memory (e.g. Blumenthal, 1967; Blumenthal and Boakes, 1967; Mehler, 1963; Miller, 1962; Rohrman, 1968). Like Wundt, current linguistic accounts of language assign special status to the individual sentence, and the psycholinguistic research tends to reflect this point of view. An important question is: Does the individual sentence deserve special psychological status, or are there other levels of structure that are at least as important psychologically? We believe that the Wundtian formulation distinguishing between cognitions and sequential modes for expressing them has a much broader scope of application than merely to individual sentences. This notion can also be used as a potential account of the nature of the memory for sets of semantically related sentences (e.g.
paragraphs or whole discourses) as well as to the relation between memory and language for individual sentences. Not only may successive words contribute to wholistic semantic structures representing the meaning of sentences, but various successively experienced sentences could contribute to common wholistic semantic representations of the general meanings of related discourses as a whole. Intuitively this type of constructive process operates continuously in everyday situations. People do not spontaneously store individual sentences (nor the individual semantic structures each generates) as separate, independent entities. Instead they use the information from various semantically related sentences to construct wholistic descriptions of events. The purpose of the present paper is to introduce some methodologies we have been using for the study of this more inclusive usage of Wundt’s notion of constructing wholistic semantic structures and to review a series of studies utilizing our experimental approach.
1. An experimental paradigm
In overview, the initial experimental paradigm we have used consists of presenting subjects (Ss) with a set of sentences each representing only a partial meaning of an arbitrarily chosen complete idea, and then we attempt to assess the nature of the information that Ss retain. More specifically we have attempted to contrast two general alternate views of what is stored. One position follows from the extension of the Wundtian position to include inputs of greater scope than sentences. This view holds that people will integrate the partial meanings of semantically related sentences and construct a more wholistic semantic representation of the complete idea. The second view is that people do not integrate semantic information, but instead ‘store’ information in smaller units. One obvious choice for these lesser units would be that Ss store individual semantic representations for each (or at least some subset) of the specific sentences that were actually presented. Or Ss may store a list of independent features that underlie the input events. The studies discussed below were designed to begin differentiating these alternative views. In the basic studies (Bransford and Franks, in press), we chose the complete ideas to be the semantic structures underlying certain complex, embedded sentences. For example, ‘The ants in the kitchen ate the sweet jelly which was on the table’. Each complete idea that was chosen could be considered to be composed of four basic propositions, e.g. the ants were in the kitchen. This breakdown into four propositions was intuitively based; no claims are made that these propositions are necessarily linguistically basic or unique. For terminology we refer to sentences expressing one of these basic propositions as ONES. Correspondingly, the complete ideas containing four interrelated propositions are termed FOURS. Other sentences related to a complete idea can be formed by
combining ONES into combinations of two propositions or three propositions (TWOS and THREES, respectively). Table 1 presents an example of a FOUR and a set of ONES, TWOS, and THREES related to it.

Table 1. Sentences comprising an idea set

FOUR
  The ants in the kitchen ate the sweet jelly which was on the table.
ONES
  The ants were in the kitchen.
  The jelly was on the table.
  The jelly was sweet.
  The ants ate the jelly.
TWOS
  The ants in the kitchen ate the jelly.
  The ants ate the sweet jelly.
  The sweet jelly was on the table.
  The ants ate the jelly which was on the table.
THREES
  The ants ate the sweet jelly which was on the table.
  The ants in the kitchen ate the jelly which was on the table.
  The ants in the kitchen ate the sweet jelly.
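The relation between the four basic propositions and the ONES, TWOS, THREES, and FOUR can be sketched as subsets of a four-element set. This is an illustration only: the paper does not describe the idea set in these terms, the names below are hypothetical, and the sketch enumerates every subset, whereas Table 1 lists only four TWOS and three THREES.

```python
from itertools import combinations

# The four basic propositions (the ONES) of the example idea set in Table 1.
PROPOSITIONS = (
    "The ants were in the kitchen.",
    "The jelly was on the table.",
    "The jelly was sweet.",
    "The ants ate the jelly.",
)

# Sentence-type labels keyed by the number of propositions combined.
LABELS = {1: "ONE", 2: "TWO", 3: "THREE", 4: "FOUR"}


def proposition_subsets(propositions):
    """Group every non-empty subset of the propositions by its size label."""
    grouped = {label: [] for label in LABELS.values()}
    for size in range(1, len(propositions) + 1):
        for subset in combinations(propositions, size):
            grouped[LABELS[size]].append(subset)
    return grouped


if __name__ == "__main__":
    for label, subsets in proposition_subsets(PROPOSITIONS).items():
        print(label, len(subsets))
```

Each subset stands for the propositional content that a sentence of that type would express; composing the surface sentence itself, such as an embedded FOUR, is left outside the sketch.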
In the initial studies four different complete ideas or FOURS were chosen to be communicated to the Ss. The basic experimental procedure in these studies consisted of an acquisition phase followed by a recognition test. During acquisition Ss were presented with sets of partial meanings (ONES, TWOS, and THREES) for each of the four complete ideas. The sentences relating to the different complete ideas were randomly intermixed in presentation. The acquisition task was an incidental learning procedure where Ss were asked to answer a question about a sentence after it was presented. They were not told that they were later going to be tested on the sentences nor that they were to integrate the meanings of related sentences. Following acquisition, Ss were given a recognition test. They were presented sentences one at a time and asked to judge whether they had actually heard a given sentence in acquisition and to give a confidence rating for each judgment on a five-point scale. Three general types of sentences were included on recognition: old clear-cases (OLDS), new clear-cases (NEWS), and NONCASES. OLDS are sentences (ONES, TWOS, and THREES) that were actually presented during acquisition. NEWS are sentences whose semantic structure form a part of one of the complete ideas (i.e. their
meaning is derivable from one of the complete ideas); however, they did not actually occur in acquisition. NEWS included ONES, TWOS, and THREES as well as the FOURS, i.e. those sentences expressing the complete ideas. A NONCASE is a sentence whose meaning is not derivable from one of the ideas being communicated. In the original studies all NONCASES contained four propositions, comparable in this respect to clear-case FOURS. Ss were given two recognition trials. Mean recognition confidence ratings were computed for the individual recognition sentences and for various types of sentences. Avoiding details, in order to assess the results presented below it is sufficient to note that increasing positive values, up to +5, indicate increasing confidence that a sentence actually had been presented in acquisition and, conversely, increasing negative values, down to -5, indicate increasing confidence that a sentence had not been previously presented. Before discussing the results of these initial studies, let us first consider the general pattern of results that would be expected if the Wundtian distinction between underlying cognitions and overt sequential expressions has psychological reality and can be extended to units greater than sentences. This position argues that on the basis of a sequential input people may construct a complete wholistic semantic representation of this input and store this meaning in memory. In the present study, it could be argued that this means that people will integrate the partial semantic representations underlying the set of acquisition sentences that are related to a particular idea. These partial meanings will be integrated into a complete wholistic semantic structure of the complete idea. This wholistic meaning then is what is remembered. 
In short, the complete idea can be the memory representation that is retained from a related set of acquisition sentences, and little or no information may be stored about the exact sentences which communicated the idea. If the above is true then the recognition ratings in the present study should be based on the complete ideas which were abstracted. This hypothesis leads to the expectations: 1. NEW sentences should receive generally positive recognition ratings because they are derivable from, or congruent with, the complete idea which has been abstracted and stored. 2. NONCASES should receive generally negative recognition ratings since their semantic structures are not derivable from the abstracted idea. 3. OLDS and NEWS should receive generally comparable ratings since recognition is based mainly on the abstracted complete idea and to a lesser extent on memory representations of actual input sentences. The first prediction is that Ss should think they ‘recognize’ many NEW sentences, in spite of the fact that they did not actually hear them before. In traditional terminology, such recognition responses would be called ‘false positives’. However, if one assumes that recognition ratings are a function of the total ideas acquired during acquisition, the notion of false positive is not appropriate. NEW sentences are recognized because
they are derivable from the total semantic structures acquired. Table 2 illustrates a typical set of ratings for NEW sentences (from Bransford and Franks, in press). Note that these are data for a set of sentences all related to a single idea. The three data points above each sentence represent: 1) the mean recognition ratings averaged over trials 1 and 2; 2) the mean ratings for trial 1; and 3) the mean ratings for trial 2. All the sentences in Table 2 are NEW with the exception of sentence 4 (OLD). Each sentence is identified as a FOUR, THREE, TWO or ONE.
Table 2. Recognition ratings for an idea set
1. The ants in the kitchen ate the sweet jelly which was on the table. (FOUR): 4.26 (4.26; 4.26)
2. [sentence and ratings missing in source]
3. The sweet jelly was on the table. (TWO): -0.73 (-0.46; -1.06)
4. The ants ate the sweet jelly. (TWO, OLD): 2.93 (3.00; 2.86)
5. The ants ate the jelly which was on the table. (TWO): 3.59 (3.86; 3.33)
6. The jelly was sweet. (ONE): -2.66 (-2.73; -2.60)
7. The ants ate the jelly. (ONE): [mean and trial 1 ratings illegible in source; trial 2: -1.20]
Predictions: 1 > 2 > 4 > 6; 1 > 3; 1 > 5 > 7; 4 > 7; 3 > 6; 2 > 6.
Note first the positive recognition ratings for the NEW sentences, at least the NEW FOURS, THREES, and most TWOS. Note especially the very high positive rating for the sentence expressing the complete idea (i.e. the FOUR). This sentence contained more semantic information than any sentence which occurred on acquisition, yet Ss were very sure they heard it before. These positive ratings, especially for the FOUR, suggest that Ss did indeed integrate information from various acquisition sentences to construct wholistic semantic structures and then based their recognition judgments on these structures. The results were even more orderly than is evident from the above description. The results presented in Table 2 illustrate a pattern of ratings that we find in all studies of this type that we have run. Not only do Ss think they recognize NEW sentences, but their recognition ratings order according to the semantic complexity of the sentences.
That is, FOURS receive higher ratings than THREES, THREES than TWOS, and TWOS than ONES. Figure 1 shows the same pattern of recognition orderings for results summed over two experiments reported in Bransford and Franks (in press).
Figure 1. Mean recognition ratings as a function of sentence complexity (FOURS, THREES, TWOS, ONES, NONCASES).
These ordering data suggest that most Ss felt that they had actually heard NEW FOUR and THREE sentences. This presents strong evidence for the integration position since the information underlying novel FOUR and THREE sentences could only have been acquired by integrating information from various sentences presented nonconsecutively on the acquisition list. The fact that ratings for TWOS and ONES were lower than for FOURS and THREES suggests that more Ss were less sure that they had actually heard these during acquisition. Of course, nearly all Ss knew that they had heard some short sentences and hence each individual S said ‘YES’ to some of these sentences (often with high confidence). Overall, however, fewer Ss said ‘YES’ to sentences as sentence complexity decreased from FOURS to ONES. The second prediction is that Ss would not think that they had previously heard NONCASES since these sentences were not derivable from the ideas which are abstracted. The results confirm this expectation. Figure 1 shows that NONCASES received lower ratings than CLEARCASE sentences. Ss were very confident that they had not heard these NONCASES before. NONCASE data are very important for interpreting the results of the
present studies. First, they show that Ss were not merely responding to pure sentence length or complexity. NONCASES were just as long and complex as the FOURS which received the highest CLEARCASE ratings. NONCASE results also suggest that Ss were not merely responding on the basis of key words found in acquisition sentences. Many NONCASES represented rather subtle distortions of complete semantic structures, yet were nevertheless rejected by Ss. For example, Ss accepted the NEW FOUR The scared cat running from the barking dog jumped on the table but rejected the NONCASE The scared cat was running from the barking dog which jumped on the table (from Bransford and Franks, in press). The ability to distinguish such sentences from one another shows that the semantic information acquired by Ss was quite precise. Additional kinds of NONCASE data will prove important for distinguishing between alternate theories of what is learned later in this paper.
The third prediction is that OLDS and NEWS should receive comparable ratings since ratings are based on the abstracted idea. An experiment reported in Bransford and Franks (in press) was explicitly designed to consider this question. Essentially this study involved 48 sentences, 12 related to each of four ideas. During acquisition 24 sentences were presented to one group of Ss and the other 24 to a second group of Ss, each group receiving 6 sentences related to each idea. Each group received ONES, TWOS, and THREES for each idea. In addition, in this study, Group 1 was also presented two of the FOURS and Group 2 the other two FOURS. For recognition, each group received all 48 sentences. Thus, for Group 1, half of the recognition sentences were NEW and the other half OLD. For Group 2, the opposite halves were OLD and NEW. The important feature of this design is that one can assign two recognition ratings to each sentence, one for when it was OLD and the other for when it was NEW.
Figure 2 shows results averaged over all ONES, TWOS, THREES, and FOURS for the same sentences when they are OLD versus NEW. Essentially there is no difference along the OLD-NEW dimension at the level of FOURS, THREES, and TWOS. There is, however, a slight but reliable difference favoring OLD ONES over NEW ONES. For ONES, OLDS received slightly higher ratings than NEWS. This specific memory effect accounts only for an extremely small amount of the total variance, however, since OLD ONES still receive lower recognition ratings than NEW TWOS. Note that the ordering effect mentioned above (FOURS > THREES > TWOS > ONES) is clearly found in this data for both OLDS and NEWS. The overall results of the OLD-NEW study support the hypothesis that Ss retain information about wholistic semantic structures and are much less likely to retain information about the particular sentences used to express the structure. To some extent, however, Ss may remember something about the nature of the information presented during the acquisition task. The general point that recognition ratings are based primarily on abstracted complete ideas and to a lesser extent on retention of particular sentences is also nicely
illustrated by a second result found in this OLD-NEW study. Groups 1 and 2 received non-overlapping sets of acquisition sentences yet the rank order correlation between recognition ratings for the two groups was .88. This high correlation indicates that both groups acquired essentially the same semantic structures despite non-overlapping acquisition experiences, and that recognition ratings were primarily a function of these overall ideas.
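The rank-order correlation mentioned above can be illustrated with a short sketch. The ratings below are hypothetical values chosen for illustration, not data from the study, and the helper names (`ranks`, `pearson`, `spearman`) are our own. The sketch assumes the usual definition of the rank-order (Spearman) coefficient as the product-moment correlation computed on ranks.

```python
def ranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        average_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            result[order[k]] = average_rank
        pos = end + 1
    return result


def pearson(x, y):
    """Product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Rank-order correlation: product-moment correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical mean recognition ratings for the same six sentences,
# as rated by two groups that heard non-overlapping acquisition lists.
group1 = [4.1, 3.2, 1.5, 0.4, -1.0, -2.6]
group2 = [3.8, 3.5, 0.9, 1.1, -1.4, -2.0]

print(round(spearman(group1, group2), 2))
```

A high coefficient here means only that the two groups order the sentences similarly; it says nothing about whether their absolute confidence levels agree.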
Figure 2. Mean recognition ratings for OLD and NEW sentences (FOURS, THREES, TWOS, ONES).
Ss’ inability to distinguish OLD from NEW sentences in the above study is in direct contrast with other research, for example Shepard (1967). His Ss showed an ability to recognize sentences from a set of over 450 experienced during acquisition with approximately 90% accuracy. The present study showed almost no ability to distinguish OLD from NEW sentences despite the fact that the acquisition list was only 24 sentences long. The difference between these two studies is readily understandable if one considers the types of materials used in the two experiments. In Shepard’s experiments, all sentences (including recognition foils) were semantically unrelated. In the present study, OLD and NEW sentences were derivable from common semantic ideas. Note that the recognition foils employed by Shepard are equivalent to NONCASES as described earlier. In the paradigm used in the present paper, Ss are also excellent at detecting distortions of previously acquired semantic structures (i.e. Ss can detect NONCASES); they simply have trouble determining which of several derivations from wholistic semantic structures they actually heard. This distinction between distortions and related derivations is very important for comparing various results dealing with sentence memory.
2. Towards a specification of what is learned
These initial studies demonstrate the general phenomenon of linguistic abstraction. The results are consistent with an extension of Wundt’s position which asserts that on the basis of a set of partial meanings people will construct wholistic semantic representations of complete ideas and these will be remembered. However, the results reported above are not sufficiently precise to rule out a number of plausible alternative explanations of the findings. The above results do cast extreme doubts on models assuming specific memories for particular sentences, but other models not based on integration of wholistic structures are also available. The studies reported below were aimed at providing a more detailed analysis of the nature of the acquired semantic representations. The fact that NONCASES were rejected in the above work indicated that Ss acquired some degree of semantic precision. However, these data in isolation still allow a number of alternate characterizations of what is learned. Consider, for example, the FOUR The scared cat running from the barking dog jumped on the table versus the NONCASE The scared cat was running from the barking dog which jumped on the table. What kinds of models could account for the fact that Ss are very confident of having heard the FOUR but confident of not having heard the NONCASE? One plausible alternative type of model would be to characterize what Ss acquire as a set or list of independent semantic features. Let us consider a specific version of a feature model in order to get some idea of the potential explanatory power of such models and to demonstrate the kinds of evidence that can contrast this formulation with the explanation based on abstracted complete ideas. Consider the following model: Assume that Ss analyze the acquisition sentences into their basic underlying features or propositions. For this model, assume that these features are equivalent to the semantic content of ONES.
Ss’ knowledge of acquisition experiences is characterized as a list of the features or propositions that were contained in the set of acquisition sentences. In short, Ss can be said to remember a list of ONES. Second, assume that recognition ratings positively covary with the number of propositions from the stored list that are contained in a given recognition sentence. This would account for the ordering data FOURS > THREES > TWOS > ONES. The more complex sentences contain more propositions from the list and therefore get higher ratings. Note that this model
also accounts for the lack of differentiation between OLDS and NEWS since only features are stored, not the sentences in which they occurred. Finally, assume that if a recognition sentence contains a proposition (ONE) that does not match one of those stored in memory during acquisition, that sentence receives a negative recognition rating. This assumption takes care of the NONCASE data above. As an example, compare the FOUR and NONCASE given above. If we assume that the semantic propositions are equivalent to ONES, the semantic features comprising the FOUR are: The cat was scared; the cat was running from the dog; the dog was barking; the cat jumped on the table. According to the feature model, acquisition experiences should result in a memory list of such independent features (plus features from the other acquisition sentences), and recognition confidence ratings should covary with the number of features each sentence contained. Since the above FOUR contains all 4 features, it should receive the highest rating of all. Now consider the basic propositions composing the above mentioned NONCASE: The cat was scared; the cat was running from the dog; the dog was barking; the dog jumped on the table. Note that this NONCASE contains three propositions that match those comprising the above FOUR and one proposition that deviates from those actually heard (i.e. the distorted proposition states that the dog jumped on the table rather than the cat). Assuming that the presence of a proposition (ONE) that does not match those acquired during acquisition automatically results in negative recognition ratings, this NONCASE should be rejected. In short, the set of assumptions comprising this simple analytic feature model can account for all the previous results. Note the types of claims made by the analytic feature model. It does not need to assume that Ss actually integrate information from nonconsecutively experienced acquisition sentences.
Instead, each input sentence is analyzed into its basic semantic features, and a whole set of independent features is stored as a list. The important aspect of this semantic feature model is that the individual features are assumed to be independent of one another. Information about relations among features is thrown away. If such information were remembered and one assumed that no semantic integration took place during acquisition, then Ss’ memory would be equivalent to a list of just those sentences heard on the acquisition list. This we have already argued against. The feature model is contrary to most current linguistic accounts of the nature of the semantic information communicated by sentences. Linguistic theories generally preserve information about relations among sentence constituents, and treat sentences as wholistic Gestalts rather than as sums of independent semantic events (see Chomsky, 1965; Katz and Postal, 1964; Neisser, 1967; Postal, 1964). Some variant of the feature model might nevertheless suffice to characterize what is learned in the present experimental paradigm. Although no single experiment can test all possible
variations of feature models, data from some of our studies allow some degree of differentiation between the analytic and more wholistic points of view (using the above specified feature model to illustrate the analytic view). Consider, for example, the following acquisition situation (which simulates portions of the general format used in our actual experiment). Two different acquisition lists are composed from the same set of underlying semantic propositions (i.e., from the same set of ONES). These lists are designated as lists UC (unconstrained) and c (constrained), and examples from them are provided in Table 3. Note that all the sentences in this table are composed from the same set of basic propositions; the differences between UC and c sentences lie in the constraints placed upon ways in which such propositions can be combined.
Table 3. Constrained and unconstrained sentences composed of the same underlying propositions
UNCONSTRAINED (UC)
The rich man riding in the car lives next door.
The old man wore a green hat.
The man who lives next door broke the window.
The old man riding in the car wore a green hat.
The rich man broke the window that was on the porch.
The old man riding in the car broke the window.
etc.
CONSTRAINED (C)
The rich man was riding in the car.
The old man who lives next door broke the window.
The man riding in the car wore a green hat.
The man broke the window on the porch.
The rich man wore a green hat.
The old man lives next door.
etc.
BASIC PROPOSITIONS UNDERLYING BOTH LISTS
The man was rich.
The man was old.
The man lives next door.
The hat was green.
The man was riding in the car.
The man wore a hat.
The man broke the window.
The window was on the porch.
List UC consists of semantically meaningful sentences formed by combining basic semantic propositions with no constraints on which propositions can be combined with which others. In the actual studies there were 8 additional propositions, and the actual acquisition list consisted of 6 ONES, 6 TWOS, 6 THREES and 6 FOURS. The recognition list contained all these 24 OLD sentences plus 12 NEW sentences. NEW sentences consisted of the same basic propositions combined in ways not found on the acquisition list. For example, a NEW sentence for group UC might be The old man who lives next door broke the window. Acquisition list c was constructed just like the acquisition lists in the studies reported above which demonstrated integration, except that the four complete ideas were formed by particular combinations of the same set of basic propositions that underlay list UC. That is, acquisition list c contains constraints on permissible relations among basic propositions. List c sentences in Table 3, for example, are all derivable from two wholistic semantic structures: The rich man riding in the car wore a green hat; the old man who lives next door broke the window on the porch. The acquisition list used in the actual experiment was composed from four wholistic ideas and contained 24 sentences, 6 related to each wholistic idea. All acquisition sentences were included on recognition as well as 12 NEW sentences and one NONCASE. NEW sentences were all derivable from the complete ideas presumably acquired during acquisition. The NONCASE was composed of the same set of basic propositions underlying acquisition sentences, but the constraints on permissible relations among propositions were violated. A NONCASE given the c sentences in Table 3 would be as follows: The rich man who lives next door wore a green hat.
Consider what the results of the above experiment should be according to the semantic feature model. Since acquisition lists UC and c are both analyzable into the same set of basic propositions (ONES), results of both groups should be the same. First, recognition ratings should covary with the number of semantic propositions comprising recognition sentences, hence ratings should order FOURS > THREES > TWOS > ONES. Second, since OLD and NEW sentences are all derivable from the same set of basic propositions, Ss should not be able to distinguish OLDS from NEWS.
Third, the NONCASE presented to group c was also composed of the same set of propositions underlying acquisition sentences, hence Ss should not be able to tell that they had not heard it before. Figures 3 and 4 show the recognition results of groups c and UC to be markedly different. Group c data (Figure 3) look like the results of previous experiments, including the rejection of the NONCASE. However, group UC results (for two different experiments) show a very different trend (Figure 4). First, recognition ratings do not covary with the number of semantic propositions underlying a sentence. More important, Ss discriminate OLDS from NEWS. These results clearly do not support the predictions based on the semantic feature model. The data argue that something besides a mere list of independent propositions is apparently being stored. How can the model which assumes that wholistic ideas are abstracted and stored account for these results? The data of group c present no problems. The results including rejection of NONCASE are accounted for as in the discussion of the experiments above. But what about group UC? The wholistic idea model assumes that information
about relations among propositions is stored as well as the propositional meanings themselves. If this is the case, the UC acquisition task should be very confusing. It consists of a basic set of propositions, but whereas in list c the interrelationships are highly regular, in list UC these propositions interrelate in an irregular variety of ways. Storage of the particular interrelations actually presented should be very difficult. Since most UC acquisition sentences represent a unique combination of semantic propositions, each such unique combination represents a unique idea to be acquired. Thus Ss in group UC were exposed to considerably more information or distinct ideas than Ss in group c (the latter presumably acquire only 4). Group UC Ss had a harder task and hence should show more uncertainty in their recognition ratings. Overall, however, they should still be able to differentiate between OLD and NEW sentences, since NEW sentences represent combinations of propositions never experienced during acquisition. In short, NEW sentences for group UC are actually equivalent to NONCASES. NEWS for group UC were generally not derivable from any of the numerous ideas acquired during acquisition, whereas NEWS for group c were always derivable from the complete ideas acquired. In this study, group UC NEWS did not receive ratings as low as those generally received by NONCASES in other studies. However, this presumably reflects the fact that considerable confusion took place. Thus, the wholistic idea model could attribute the generally low recognition ratings in the UC results to confusion due to the amount of information presented, and the differentiation between OLDS and NEWS to NEWS actually being NONCASES with respect to the information acquired during acquisition. This account is admittedly post hoc to a certain extent, but this does not alter the fact that the data are quite inconsistent with a feature model interpretation of what is learned but are not necessarily inconsistent with the wholistic idea model.
The reader may argue that the particular feature model chosen was too simplistic and that a reasonable feature model would include information encoding the interrelations among propositions. Our belief is that the more such information is incorporated into the model the closer it approaches to the wholistic idea model. It is the implications of this latter model that we wish to assess.
Figure 3. Mean recognition ratings for OLD and NEW sentences: Constrained condition (FOURS, THREES, TWOS, ONES, and the NONCASE FOUR).
Figure 4. Mean recognition ratings for OLD and NEW sentences: Unconstrained condition (for two different experiments).
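The contrast between the two accounts can be made concrete with a minimal sketch. It assumes that each sentence is represented only as a set of proposition labels; the labels, the function names, and the numeric scale are our own illustrative choices and are not part of the original studies. The two complete ideas and the NONCASE are the list c examples given with Table 3.

```python
# Short labels (hypothetical) for the eight basic propositions of Table 3.
IDEA_1 = frozenset({"man_rich", "man_riding_in_car", "man_wore_hat", "hat_green"})
IDEA_2 = frozenset({"man_old", "man_lives_next_door", "man_broke_window",
                    "window_on_porch"})
IDEAS = [IDEA_1, IDEA_2]

# Feature model: memory is one unstructured list of propositions.
STORED_FEATURES = IDEA_1 | IDEA_2


def feature_model_rating(sentence):
    """Count matching propositions; any unfamiliar proposition forces rejection.

    The scale is arbitrary: a positive count stands for acceptance and
    -1 stands for a negative recognition rating.
    """
    if not sentence <= STORED_FEATURES:
        return -1
    return len(sentence)


def wholistic_model_accepts(sentence):
    """Accept a sentence only if it is derivable from a single complete idea."""
    return any(sentence <= idea for idea in IDEAS)


# A CLEARCASE: "The old man broke the window on the porch."
clearcase = frozenset({"man_old", "man_broke_window", "window_on_porch"})

# The NONCASE from the text: "The rich man who lives next door wore a green hat."
noncase = frozenset({"man_rich", "man_lives_next_door", "man_wore_hat",
                     "hat_green"})

for name, sentence in [("clearcase", clearcase), ("noncase", noncase)]:
    print(name, feature_model_rating(sentence), wholistic_model_accepts(sentence))
```

Under these assumptions the feature model gives the NONCASE a positive score, because every one of its propositions is on the stored list, whereas the wholistic idea model rejects it, because no single idea contains all four propositions. This is the difference in prediction that the group c NONCASE was meant to test.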
3. Broadening the empirical base of the results
The above experiments lend strong support to the notion that, under certain experimental conditions, Ss integrate linguistic information to construct wholistic semantic structures. The present section reviews some studies that attempted to explore some of the boundary conditions surrounding this phenomenon by varying (a) the nature of the acquisition instructions; (b) the form in which various ideas are presented; (c) the form of testing; (d) the nature of the ideas to be acquired.
3.1. Varying the acquisition instructions
A Ph.D. thesis by Curnow (1969) tested the effects of varying acquisition instructions. This work is summarized in Curnow, Franks, Bransford and Jenkins (1971). The experimental design was identical to that used in the previous experiments, except that the complete ideas to be acquired were composed of 5 basic propositions rather than 4. Curnow tested the following four types of acquisition instructions: (1) Elliptical question instructions (Q), where Ss were simply asked to answer a question about each sentence after a 5-sec. delay (this task replicated the instructions used in all the previous studies); (2) semantic rating instructions (S), where Ss rated each acquisition sentence on three scales: good-bad, active-passive, strong-weak. This condition was also an incidental acquisition condition, since Ss were not told that there would be a later recognition task; (3) E-counting (E), where Ss were instructed to count the number of ‘e-s’ appearing in each acquisition sentence, and again were not informed about the later recognition task; (4) explicit recognition (R), where Ss were told to try to remember the acquisition sentences verbatim because they would later be asked to recognize them and were also told about the detailed differences between OLD and NEW sentences (e.g., that they should try not to confuse OLD sentences like The boy hit the big ball with NEW sentences like The boy hit the ball). Figure 5 illustrates the results obtained under these four conditions. The recognition ratings for all groups are quite similar. The usual ordering effect (FOURS > THREES > TWOS > ONES) is found with all groups. Product-moment correlation coefficients between all possible combinations of pairs of acquisition conditions also testify to the similarity of the recognition ratings. All correlations fall within a range of .87 to .93.
Note that the slope of the explicit recognition group (R) is less steep than that of the other groups, however, indicating that Ss in this group were more uncertain about which sentences they had and had not actually heard. These results are not surprising given the nature of the (R) instructions. Ss were more aware of the fineness of the discriminations they were being asked to make since these were pointed out in the (R) instructions. The fact that the recognition ratings still order, however, indicates that Ss in group (R) were not completely successful in remembering only those sentences actually heard during acquisition but instead integrated the semantic information.
Figure 5. Recognition ratings as a function of acquisition instructions (FIVES, FOURS, THREES, TWOS, ONES).
At first glance, the performance of the E-counting group may seem unexpected. Why should these Ss integrate the ideas? The E-counting group was included to destroy the phenomenon of integration and was modeled after experiments by Hyde and Jenkins (1969) showing that this kind of instruction has adverse effects on subsequent memory (and clustering scores) for words. Upon sitting through the E-counting task using sentences as inputs, however, we discovered that it was difficult if not impossible to perform the task without also processing the meaning of each auditorily presented sentence and holding it in short-term memory as the relatively slow process of counting took place (see Savin and Bever, 1970, for considerations of the ease of processing more wholistic phrases versus individual phonemes). This semantic processing should allow Ss to code the meanings of sentences, and hence information from semantically related sentences could still be integrated into wholistic representations. It should be noted that the E-counting experiment did not employ NONCASES on recognition, however, hence it is impossible to be sure that Ss really had access to precise semantic ideas. Overall, the phenomenon of abstracted integrated ideas appears to be general enough to operate under a number of different acquisition conditions. There should, of course, be conditions under which Ss will treat sentences, even related sentences, as individual entities and not integrate information; however, we have not yet run such a task.
3.2. Varying the form of idea transmission
Besides the effect of various acquisition instructions there are other boundary conditions to be considered. A very important condition concerns the general form of the sentences which communicate the ideas to be acquired. If all acquisition sentences were ONES, for example, one would not expect Ss to actually think they had heard sentences that were THREES and FOURS. Similarly, if all acquisition sentences were presented in the active voice Ss probably would not think they had actually heard passive sentences, even though the latter expressed the same basic ideas. It seems clear that any adequate account of the general phenomenon of linguistic abstraction will have to postulate at least two relatively independent aspects of memory representations of sentences or sets of sentences: (1) Ss will construct and remember wholistic semantic structures, and (2) Ss will retain some information about the general style in which the semantic information was originally expressed, that is, something about the form of the input. An acquisition list composed entirely of ONES may be sufficient to allow Ss to integrate complex semantic structures, but memory for the general style of acquisition sentences (i.e., that they were all extremely short and simple) would most likely cause Ss to reject recognition sentences that were THREES and FOURS (and maybe even TWOS). Curnow (1969, and see Curnow et al., 1971) tested the effects of some aspects of transmission style on subsequent recognition by varying the complexity of the sentences used to communicate various wholistic ideas (all complete ideas were FIVES). Each of the four different complete ideas was expressed by different ranges of sentential complexity during acquisition. For example, idea A was expressed in a set of acquisition sentences which contained the FIVE, as well as FOURS, THREES, TWOS, and ONES (Type A acquisition). Idea B acquisition sentences ranged from FOURS to ONES (Type B acquisition).
Idea C acquisition sentences were THREES, TWOS, and ONES (Type C) and idea D sentences were only TWOS and ONES on acquisition (Type D). Thus, within the complete experiment, Ss received examples of the full range of sentence complexities (ONES to FIVES) during acquisition; however, the range of complexity of the set of sentences communicating a particular idea varied from Type D to Type A. Figure 6 shows the results of these manipulations. Recognition ratings following acquisition types A, B, and C are very similar to those obtained in the basic studies. The ordering effect from FIVES to ONES is clearly present. Thus, with these ranges of acquisition complexity Ss appear to integrate the wholistic ideas and base their ratings on these. For acquisition type D, however, this integration appeared to break down. That is, for those ideas expressed only by ONES and TWOS during acquisition Ss were less sure of having seen FIVES and FOURS on the recognition test. As is explained in Curnow et al. (1971), the study was counterbalanced so that each particular idea was expressed by different acquisition types (over different groups of Ss). A closer analysis of the integration effect for acquisition Type D showed the effect to be idea specific. That is, for two of the ideas Ss were not confident that they had heard FIVES and FOURS, but for the other two ideas Ss’ confidence ratings ordered FIVES > FOURS > THREES > TWOS > ONES. The precise reasons for the specificity of these results are not clear at the moment, and additional research on this question needs to be pursued. The overall results of the Curnow experiment do show a strong tendency to integrate information under a variety of input conditions, however, although there do appear to be instances in which the tendency to ‘recognize’ complex sentences breaks down. We feel that these breakdowns occurred not because the semantic information was not integrated, but because Ss remembered additional information about the general style through which these ideas were originally expressed (for further discussion of the relation between input styles and the semantic integration, see the section on ‘Related Considerations’).
Figure 6. Recognition ratings as a function of level of complexity at acquisition (Types A, B, C and D; FIVES, FOURS, THREES, TWOS, ONES).
The Curnow et al results for type D acquisition suggest that the effects of transmission style on recognition ratings may be relative to the particular ideas being corn municated. Some data we have collected point toward a similar conclusion. Figure 7 illustrates recognition ratings for two different types of wholistic structures. Cohesive embedded structures were those like the ideas used in the previous experiments (e.g. one idea to be acquired was The warm breeze blowing from the sea stirred the heavy evening air). Two of the four ideas to be acquired by Ss were of this form. The other two structures to be communicated were more list-like in content (e.g. The man saw a cat, a cow, a dog, anda horse). We reasoned that the list-like characteristics of the latter type of sentence might make the number of items mentioned in any acquisition sentence very salient. This feature of the input, concerning number of items listed, could well be a factor of transmission style that would be remembered, in addition to the complete idea which is abstracted. If no FOURS were presented during acquisition (which they were not) Ss should remember this fact during recognition and reject FOURS, despite the fact that they integrated the complete semantic ideas. Results support this prediction; Ss thought they heard the FOURS expressing the more cohesive, embedded structures despite the fact that FOURS never occurred in acquisition. The same Ss were rather confident of not having heard FOURS communicating the list-like
Figure 7. Recognition ratings for embedded vs. list-like ideas. [Curves: Embedded Structures, List-Like Structures; horizontal axis: FOURS, THREES, TWOS, ONES.]
structures. Note that these Ss must have integrated the semantic information from the sentences expressing the list-like structures, however; otherwise they would not be confident of ‘recognizing’ novel (but appropriately derivable) THREES, TWOS and ONES. The notion that Ss store integrated semantic descriptions plus information about the general style by which this information was originally communicated can also be investigated by asking different kinds of recognition questions. For example, the question Did you actually hear a particular sentence? requires judgments about both the style and semantic information expressed by the sentence. The questions Could you have heard the information expressed by this sentence? or Is this information consonant with what you learned? require only judgments about the wholistic semantic structures acquired, and such questions may result in the acceptance of all CLEARCASE sentences derivable from the ideas acquired. The fact that in the above studies Ss’ ratings order as a function of the complete ideas even under the ‘did you actually hear the particular sentence’ instructions makes the obtained results all the more impressive. They indicate that under a variety of conditions, the integration and storage of complete ideas dominates any memory for factors specific to particular acquisition sentences. Note, incidentally, that the questions asked in the present experiments (i.e. did you actually hear these sentences) also assume the understanding of a particular context. If Ss were asked if they had heard a particular FOUR while walking down the hall on their way to the experiment they would most likely say ‘no.’ Wholistic semantic ideas are also assimilated into more general temporal-semantic contexts, and Ss remember these general contextual constraints in addition to the overall ideas.
3.3 Changing the test from recognition to free recall
So far we have used only recognition measures. In several pilot studies we investigated
the phenomenon of integration using free recall. In one study, sentences from Experiment I in Bransford and Franks (in press) were utilized, and the same acquisition procedure as that study employed was used. After acquisition, Ss were asked to recall the exact sentences they had heard during acquisition. Ss recalled an average of 11 sentences each (they had heard 24 acquisition sentences); 55% were OLD and 45% were CLEARCASE NEW. NONCASES were produced less than 5% of the time. At the level of FOURS and THREES, 21 OLDS were recalled and 26 NEWS (all FOURS were NEWS and a total of 11 FOURS were recalled). At the level of TWOS there was a 24 to 18 OLD-NEW ratio, and this ratio was 21 to 10 for ONES. Results show that Ss did not simply store individual acquisition sentences but instead based their recall on more integrated semantic structures. If Ss had not integrated information, they should not have recalled any FOURS or novel THREES. Results also showed that the OLD-NEW difference tended to increase with decreases in sentential complexity, a result which parallels
that found in the OLD-NEW study mentioned above (see Bransford and Franks, in press). The free recall data also indicated that all Ss knew that they had heard some shorter (i.e. TWOS and ONES) sentences. Ss tended to differ, however, in the particular sentences that they recalled.
3.4 The nature of the ideas to be acquired
In all the work that we have discussed, the ideas to be acquired have been semantic structures that are concrete and highly imageable. What would happen with abstract, difficult-to-image ideas? Begg and Paivio (1969), on the basis of demonstrated differential memory effects, have argued that concrete sentences are stored as images, but that abstract sentences are stored in some sequential verbal form. Without arguing about the validity of this particular hypothesis, the demonstrated differential memory effects may imply that different effects may be found in the present experimental paradigm. The main question is, will Ss integrate wholistic ideas when the information being communicated is abstract rather than concrete? A set of studies by Franks and Bransford (in press) investigated this question. The experimental procedures used were exactly like those of the original studies except abstract FOURS were used instead of concrete FOURS. An example of an abstract FOUR is: The arrogant attitude expressed in the speech led to immediate criticism. ONES, TWOS, and THREES related to this idea and each of three other abstract ideas were presented in acquisition. This was followed by the recognition test containing ONES through FOURS related to each idea and also four NONCASE sentences. The results are presented in Figure 8. As can be seen, these results are quite comparable to those obtained using concrete sentences and ideas, although in general the mean ratings are not as high in either the positive or negative range. The ratings order from FOURS down to ONES, and NONCASES received the lowest ratings. Although not apparent in these means, in general there appears to be somewhat more variability in the ordering of the ratings for these abstract sentences than in the orderings for ratings of the concrete sentences we have used. This may have something to do with abstract sentences being generally more difficult to understand. Further work is being planned for investigations with abstract sentences, but for present purposes, we think the evidence is clear that Ss integrate and store wholistic ideas of abstract semantic structures just as with concrete ideas.
Figure 8. Recognition ratings as a function of sentence complexity: Abstract sentences. [Vertical axis: ratings from 5 to -5; horizontal axis: FOURS, THREES, TWOS, ONES, NONCASES.]
4. Increasing the complexity of the semantic structures abstracted
In the previous experiments a complete idea to be acquired was always expressible by an easily comprehended single embedded sentence. Is it reasonable to assume that there exists some constraint on the complexity of the semantic structures that can be abstracted by Ss, or are the kinds of constraints that operate to keep sentences within reasonable bounds of complexity merely constraints on the form through which complex structures may be effectively communicated? Sentences that are too long and complex exceed our limited processing capacity. Yet this does not necessarily mean that the semantic structures which are integrated and stored must also be limited by this processing capacity. In fact it seems reasonable to suppose that the integration of sentential information into long-term memory representations might well form stored semantic structures too complex to ever be effectively expressed in a single sentence. We believe that this latter type of process occurs when comprehending connected discourse, for example. We are beginning to study such situations in which semantic structures contain more information than can be communicated by a FIVE or FOUR. An experiment to be considered demonstrates that Ss can abstract semantic
structures of much greater complexity than those discussed above, and at the same time shows how expectancies can affect the types of semantic descriptions that Ss construct. Consider an acquisition list composed of 24 sentences, where these sentences can be integrated to form three invariant wholistic ideas. One idea is The ants in the kitchen ate the sweet jelly which was on the table. This idea is essentially a filler idea for the present experiment. The other two ideas are quite similar to one another and their respective structures are represented in Figure 9. (The lines simply indicate the modifiers or predicates associated with the nouns in the semantic structure. No claims are made for the elegance or formal properties of this notation other than illustrative utility.) Some sentences derivable from these two semantic structures are represented in Table 4. Sentences like these (plus some about the ants) constituted the acquisition list. The same acquisition list was presented to three different groups of Ss. All were told to imagine they were at a cocktail party hearing a number of different conversations. In such situations sentences from different conversations are often intertwined with one another, yet a listener can still perceive the essence of what different groups of people are discussing. Ss were read a randomly intermixed set of sentences each relating to one or another of the ideas to be communicated and were told to construct the total ideas being described. The variable of interest concerned the general nature of the ideas Ss were led to expect. Ss in Group 1 were told that three different ideas could be discovered; one about the ants, one about the old man and one about the grey-haired man. Ss in Groups 2 and 3, however, were told that two different ideas were forthcoming; one about the ants and the other about a grey-haired old man. After hearing these instructions, Ss were read the 24 sentences on the acquisition list.
Figure 9. A schematic illustration of two simple semantic descriptions (Structure A and Structure B).
Table 4. Sentences derivable from structures A & B
Structure A
The old man was standing on the steep hill.
The man standing on the hill saw the huge rock.
The man saw the rock balanced on the edge of the hill.
The old man standing on the hill saw the rock balanced on the edge.
The man was standing on the hill.
The man saw the huge rock.
Structure B
The grey-haired man saw the rock roll down the hill.
The man saw the rock roll down the hill and crush the hut.
The rock that rolled down the hill crushed the hut.
The grey-haired man saw the jagged rock.
The grey-haired man saw the rock roll down the grassy hill and crush the hut.
The man saw the rock crush the hut.
After acquisition a recognition test was given. Ss were asked to remember which exact sentences they had heard during acquisition and to indicate their confidence in their answers. Before the recognition task began all Ss were reminded that the task would be difficult but that they were to do the best they could. In addition Ss in Group 3 were also told that they had actually heard three stories rather than two, and that this information might help in the recognition task. They were instructed that the three stories were about the ants, an old man and a grey-haired man, and were told to try to differentiate these stories if possible. If they could not do this, however, they were told just to respond to each individual recognition sentence as best they could. The recognition list contained OLD, NEW and NONCASE sentences. Six OLDS and six NEWS constituted the important data. The NEWS always combined information across the two different semantic structures outlined in Figure 9. The OLDS, of course, were always derivable from a single wholistic structure and had actually occurred on the acquisition list. Examples of OLD and NEW sentences are: OLD: The grey-haired man
saw the rock roll down the hill and crush the hut; NEW: The old man saw the rock roll down the hill and crush the hut. Table 5 shows the mean recognition ratings for OLD, NEW and NONCASE sentences for the three groups of Ss. Group 1, which had been led to expect three different ideas, differentiated between the OLD and NEW sentences. Presumably NEW sentences were not derivable from the overall semantic structures acquired by this group, hence for Group 1, it would be more accurate to term these NEW sentences NONCASES. Ss in Groups 2 and 3 did not differentiate between OLD and NEW sentences, however. This indicates that they did not construct two separate structures about an old man versus a grey-haired man. Instead, they constructed a single complex structure encompassing the information relating to both and stored an idea containing considerably more information than any single acquisition sentence expressed. NEW sentences were derivable from this integrated structure, as were OLD sentences, hence Ss could not determine which exact sentences they had actually heard.
Table 5. Recognition ratings for the three sentence types
Group    OLDS     NEWS     NONCASES
I        1.09     -0.24    -4.58
II       2.74     2.60     -4.75
III      1.58     1.48     -4.57
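The contrast among the three groups can be made explicit with a little arithmetic. The following is only an illustrative sketch: the ratings are copied from Table 5, while the OLD minus NEW difference and the function name old_new_gap are hypothetical conveniences introduced here, not measures reported in the studies.

```python
# Mean recognition ratings copied from Table 5 (rows: group; columns: sentence type).
TABLE_5 = {
    "I": {"OLD": 1.09, "NEW": -0.24, "NONCASE": -4.58},
    "II": {"OLD": 2.74, "NEW": 2.60, "NONCASE": -4.75},
    "III": {"OLD": 1.58, "NEW": 1.48, "NONCASE": -4.57},
}


def old_new_gap(ratings):
    """Return OLD minus NEW, a hypothetical index of OLD/NEW differentiation."""
    return round(ratings["OLD"] - ratings["NEW"], 2)


# One gap per group; a larger gap means OLD and NEW were told apart more clearly.
gaps = {group: old_new_gap(ratings) for group, ratings in TABLE_5.items()}

for group, gap in gaps.items():
    print(f"Group {group}: OLD - NEW = {gap:+.2f}")
```

On this index the gap for Group I is more than a full rating point, whereas the gaps for Groups II and III are close to zero, which is the pattern described in the text.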
Note that Ss in Group 1 could not discriminate perfectly between OLD and NEW sentences, since they were not as confident about rejecting NEW sentences as they were about rejecting NONCASES (which represented more obvious semantic distortions of the ideas acquired). The semantic structures constructed by group 1 Ss are thus not as distinct and separate as those sketched in Table 4, but the general characterization in terms of two separate structures still holds. Note also that Groups 1, 2, and 3 differed in their overall confidence ratings for OLD sentences. This probably reflects the varying extents to which Ss were explicitly informed about the difficulty of the task. Group 1 Ss knew the difficulty of the task from the beginning, (i.e. that they had to separate the old man story from that about the grey-haired man), and this seemed to be reflected in relatively low overall confidence (except for NONCASES). Group 3 was later informed about the difficulty of their task, but also told to do the task as best they could. Given that they apparently could not retrieve any cues allowing them to differentiate the two stories, they apparently based their confidence on what they felt they knew. Ss in Group 2 were most confident about their answers. They also were least informed about the difficulty of the task, and hence presumably were more sure of what they knew. The results of this experiment show that the same inputs can be used to construct
different semantic descriptions, and it is these latter constructions that determine recognition memory rather than a list of the exact sentences actually heard. Note also that those Ss who integrated sentences about the old man and grey-haired man into a single semantic description constructed and stored a semantic structure much more complex than those we have previously considered. This structure encompassed more information than any of the complete ideas used in the previous tasks. This type of task warrants further investigation, and additional research is currently being conducted. Note, for example, that we did not set up this experiment to look at orderings of ratings in terms of sentence complexity. When semantic structures get too complex it is clear that there will be a definite limit to the kinds of complexity Ss will think they recognize. We would hardly expect to find data showing TWENTY > NINETEEN > EIGHTEEN ... etc. Data comparing OLD and NEW sentences and NONCASES become more important for demonstrating abstraction of wholistic ideas in these kinds of tasks.
5. Memory for connected discourse
Our approach to paragraph memory follows directly from the procedures and discussion of the experiment we have just presented. Memory for paragraphs is not seen as equivalent to memory for a list of individual input sentences (or more precisely their underlying meanings), but rather is seen as memory for wholistic semantic structures constructed from linguistic input events. The wholistic structures encompass all the semantic information from all the sentences in the paragraph. Recognition and recall memory is primarily a function of these wholistic semantic structures. Sentences consonant with the whole idea of the paragraph are likely to be recognized as having occurred in the paragraph, and sentences not consonant with these structures are rejected as being NONCASES. As pointed out earlier, although much of the memory for paragraphs consists of memory for wholistic semantic structures, Ss also may retain information about the general style by which such information was actually expressed. Our work with paragraph memory is just beginning. Some initial (and very primitive) studies will serve to convey some aspects of the general approach. The initial work utilizes experimenter-generated materials which conform to some intuitively defined idea standards. We eventually want to be able to characterize paragraphs in vivo, but for the moment are content to find some artificial, relatively well-specified exemplars that we can gradually complicate as our understanding of paragraph memory improves. Consider, for example, the following paragraph (italics and numbers will be explained later): It was midnight. Only the brightness of the moon allowed one to see what was in front of his eyes. An old man with a long beard was walking slowly through the
dark woods. (1) He had grey hair. (2) The man walked quietly and looked up at the mountain. He saw a huge, jagged boulder at its top. Below the mountain was a straw hut with a pointed roof. The roof was green. (3) The tiny hut was at the edge of the dense woods. Suddenly a storm appeared from nowhere. Lightning struck the huge boulder. The man saw the jagged boulder. The huge boulder plunged down the mountain. He saw it crush the straw hut at the edge of the dark woods. (4) The tiny hut was the old man’s home. Only pieces of the crushed hut could be seen amidst the densely packed trees. A very rough approximation of the semantic structure generated by the main theme of this paragraph might be characterized as in Figure 10. Again the representation is more an illustrative heuristic than a formal characterization. We assume that Ss construct some such semantic structure when exposed to the above paragraph. In addition they also remember something about the general style of the input events. Ss should thus have a tendency to think they recognize many novel sentences derivable from this overall structure and should be quite confident of not having heard sentences representing distortions of this overall idea, just as in the above work with more limited structures. For example, Ss should have a difficult time deciding whether they heard the sentence He saw it crush the straw hut at the edge of the dark woods (OLD) or the stylistically similar NEW sentence He saw it crush the tiny hut at the edge of the dense woods. Preliminary results indicate that Ss do indeed have difficulty differentiating between OLD and stylistically similar NEW sentences. But Ss are very confident that they have not heard NONCASES before. By themselves, the above results are not all that surprising. Perhaps, for example, Ss are just confusing various words and no abstracted wholistic idea need be implicated.
With this criticism in mind, a second paragraph was constructed to demonstrate more precisely the role of the integrated semantic structure. The paragraph is identical to the first one, except for the following changes in the italicized, numbered phrases in the paragraph above: (1) He is changed to his friend; (2) The man is changed to the friend; (3) The tiny hut was at the edge... is changed to A tiny hut was also at the edge...; (4) The tiny hut was the old man’s home is changed to Luckily, the tiny hut was the old man’s home. Paragraphs I and II are thus very similar with respect to number of words to be remembered, types of adjectives used, temporal distance between uses of different adjectives, etc. According to the present position, however, these slight word changes should greatly alter the nature of the semantic descriptions that Ss presumably construct. Ss hearing paragraph II should construct a semantic structure different from that schematized in Figure 10. The second structure should characterize information about two people and two huts rather than one of each. One consequence of this is that Ss should no longer think they heard the previously mentioned NEW sentences: He saw it crush the tiny hut at the edge of the dense woods, for example. Results indicate
that Ss hearing paragraph II indeed reject such sentences. Since they are not derivable from the semantic structures presumably acquired, they are NONCASES. Failure to differentiate OLD from NEW sentences given in paragraph I is thus not due to some general, unspecifiable ‘confusion’, but is rather strongly predictable from considerations of the nature of the semantic structures represented in the paragraph and abstracted and stored in memory.
Figure 10. A schematic illustration of some information underlying a paragraph.
Memory for Paragraph I above has also been investigated in a free recall experiment. Ss listened to this paragraph (plus two filler paragraphs) and then were asked to attempt to recall it verbatim (the first two sentences were read to get them started). Various types of recall measures were then used to quantify their recall. The first type of measure used was percentage of actual verbatim recall. This turned out to be 0%. However, using a very strict criterion, 86% of all recalled sentences were derivable from the schematic structure outlined in Figure 10 (i.e. they were NEWS), and most other recalled sentences represented very slight variations on this structure (like the rock crushed the roof of the hut). These recall results provide further support for the notion
that Ss abstracted a complete idea of the paragraph and then based responding on this complete structure. For purposes of the present discussion, the most important recall measures were those based on a distinction we make between direct and indirect paraphrase. We make this distinction based on the sentence as the communication unit. A direct paraphrase is an expression whose semantic structure is directly derivable from, or a subset of, the semantic structure of a particular sentence. An indirect paraphrase is one which contains information derivable only from two or more original sentences, that is, it involves integration of information across sentences. For example, assume that Ss heard the following two sentences: The old man walked slowly through the woods. He had grey hair. Direct paraphrases would be expressions like The man walked through the woods; The man had hair that was grey, etc. (Somewhat arbitrarily the specification of the object referred to by a pronoun is considered a direct paraphrase. One may argue that this is an indirect paraphrase by definition. We will not argue the point here; for present purposes the general distinction is sufficient.) Examples of indirect paraphrases are: The grey haired man walked slowly through the woods; The old man had grey hair, etc. The CLEARCASE sentences (the 86%) recalled following exposure to paragraph I consisted of 56% direct paraphrases and a full 44% indirect paraphrases, which had to be formed from information integrated from two or more sentences in the acquisition paragraph. These results strongly indicate that acquisition sentences (or paraphrases of them) are not the units of paragraph memory. Rather, in paragraph memory (as with the other materials discussed) information from various sentences is integrated into more wholistic semantic structures, and these structures appear to be the most important factors in determining memory for events.
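The direct versus indirect distinction can be stated as a simple set relation. The sketch below is only an illustration under strong assumptions: each sentence is hand-coded as a set of propositions (tuples), pronouns are taken as already resolved (following the somewhat arbitrary convention noted above), and the function name classify_paraphrase and the proposition encoding are hypothetical, not the scoring procedure actually used.

```python
from typing import FrozenSet, List, Tuple

Proposition = Tuple[str, ...]


def classify_paraphrase(recalled: FrozenSet[Proposition],
                        sentences: List[FrozenSet[Proposition]]) -> str:
    """Classify a recalled expression against the acquisition sentences."""
    # Direct: everything recalled is contained in one original sentence.
    if any(recalled <= sentence for sentence in sentences):
        return "direct"
    # Indirect: derivable only by pooling two or more original sentences.
    pooled = frozenset().union(*sentences)
    if recalled <= pooled:
        return "indirect"
    # Otherwise the expression is not derivable from the input at all.
    return "noncase"


# Hand-coded propositions for: The old man walked slowly through the woods. He had grey hair.
s1 = frozenset({("old", "man"), ("walked", "man"),
                ("slowly", "walked"), ("through", "walked", "woods")})
s2 = frozenset({("had", "man", "hair"), ("grey", "hair")})  # "He" resolved to "man"
paragraph = [s1, s2]

# The man walked through the woods.
direct_example = frozenset({("walked", "man"), ("through", "walked", "woods")})
# The old man had grey hair.
indirect_example = frozenset({("old", "man"), ("had", "man", "hair"), ("grey", "hair")})
# The man crushed the hut (not derivable from either sentence).
noncase_example = frozenset({("crushed", "man", "hut")})

print(classify_paraphrase(direct_example, paragraph))
print(classify_paraphrase(indirect_example, paragraph))
print(classify_paraphrase(noncase_example, paragraph))
```

Under this encoding an indirect paraphrase is one whose propositions fit the pooled structure but no single sentence, which is the sense in which it requires integration across sentences.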
Note, incidentally, that paragraphs should differ widely in terms of the degree to which they integrate into single cohesive structures. Hence degree of direct versus indirect paraphrase should be affected by the nature of the semantic structures acquired. We have conducted an additional series of studies designed to investigate the status of the individual sentence in prose learning. These studies were motivated by results from Slobin (1968) indicating that certain types of sentences (i.e. ‘truncated’ passives) were stored in a manner that favored their verbatim recall. Slobin presented Ss with passages written in the passive voice, and later asked them to recall as accurately as they could. Some Ss received full passive sentences (e.g. On the first day of school Bob was introduced to his new teacher by the principal and was given a reading book by the teacher), and some Ss received ‘truncated’ passive sentences without mention of the actor (e.g. On the first day of school Bob was introduced to his new teacher and was given a reading book). At recall, Ss hearing full passive sentences transformed these sentences into active sentences 62% of the time. Ss hearing the truncated passive versions, however, showed only a 39% tendency to recall this information in the
active form. In short, truncated passive sentences resulted in much greater verbatim recall. We decided to investigate the effects of semantic context on memory for truncated passive sentences. Passages were constructed which contained full passive, full active, and some truncated passive sentences. For some truncated passive sentences (TP-alone) the actor was never supplied in the passage, as was the case in Slobin’s study. For other truncated passive sentences, however (TP plus semantics), information about the actor was supplied but it was always supplied elsewhere in the passage. The question was whether this intersentential information would affect memory for the TP form. Results showed that full passive sentences were remembered as full active sentences much more frequently than were the TP-alone sentences. These results replicate those Slobin found. The important result was that the TP plus semantics sentences were recalled as actives much more frequently than TP-alone sentences, indicating that the semantic context had a powerful effect on memory. These results were found for forced-choice recognition as well as recall.
6. Are semantic descriptions solely confined to the information represented in the total set of linguistic inputs?
So far we have argued that Ss do not spontaneously treat sets of individual semantically related sentences as independent objects for storage. Instead information from various sentences is integrated to form wholistic semantic structures containing more information than any input sentences expressed. But must these wholistic semantic structures be confined to that information integrable from sets of related sentences, or may such structures also be partially constructed from extra-linguistic information as well? We contend that Ss often spontaneously use linguistic information in conjunction with extra-linguistic information (either perceptual or from past experience) to construct the stored semantic descriptions of situations. Such descriptions are really ‘knowledge-enriched’ structures, since they encompass more information than is contained in the whole set of integrated input sentences about some topic. The development of this notion of knowledge-enriched ideas and the studies aimed at investigating it are to a great extent due to the influence of J. Richard Barclay, without whom this research would not have materialized. Our research into this notion is just beginning but a set of studies by Bransford, Barclay and Franks (1971) illustrates the importance of the notion of knowledge-enriched memory representations. Although these studies deal with memory for individual sentences, their general implications can be extended to paragraph memory as well. Consider the following two sentences:
(1) Three turtles rested beside a floating log and a fish swam beneath them.
(2) Three turtles rested on a floating log and a fish swam beneath them.
These sentences are identical except for two lexical items (beside versus on). Both sentences have nearly identical linguistic structures and are each composed of two main propositions: (a) Three turtles rested (beside/on) a floating log and (b) The fish swam beneath them (the turtles). Despite their linguistic similarity we argue that sentences (1) and (2) are very different psychologically. Comprehension of sentence (2) typically involves different information than does comprehension of sentence (1). As mentioned above, knowledge of sentence (1) specifies the location of the turtles with respect to the log and tells where the fish swam. Sentence (2) also specifies this information, but comprehension of (2) typically involves additional information as well. Since the turtles are on the log and the fish swam beneath them, the fish must have swum beneath the log as well. This latter information (that the fish swam beneath the log) is not supplied linguistically, but must be inferred on the basis of one’s general cognitive knowledge of the world (in this case, knowledge of spatial relations). We refer to sentences like (2) as potential inference sentences (PI) and term sentences like (1) noninference sentences (NI). Bransford, Barclay, and Franks tested memory for these two types of sentences (PI and NI) in a recognition paradigm. Specifically, Ss were exposed to a set of acquisition sentences that contained either PI or NI versions of a number of different ‘sentences’ (e.g. during acquisition Ss would receive either (1) or (2) above). Following acquisition, Ss were given a recognition test containing OLD sentences and NEW sentences, where NEWS were exactly like one of the acquisition sentences except that the final pronoun was changed.
For example, the NEWS corresponding to (1) and (2) above would be Three turtles rested beside a floating log and a fish swam beneath it and Three turtles rested on a floating log and a fish swam beneath it, respectively. If exposure to sentences like (1) or (2) above results in memory for only the linguistic inputs, then confusion as to which pronoun actually was presented in acquisition should be similar for these two sentence types. However, if exposure to PI sentences like (2) leads Ss to spontaneously construct and store semantic structures containing extralinguistic information (i.e. that the fish swam beneath the log as well as the turtles), then PI sentences should result in much greater confusion as to which particular pronoun was actually heard in acquisition. This follows since both it and them pronoun versions are consonant with the complete ‘enriched’ semantic descriptions which can be constructed from PI sentences, but only a single pronoun (the original input pronoun) is consonant with the structures generated by the NI sentence form (1) above. Table 6 shows that the recognition results strongly support the notion of knowledge-enriched memory structures. Ss did not distinguish between OLD and NEW versions of PI sentences but a clear differentiation was made between OLD and NEW versions of NI sentences. These results once again support the general thesis that people construct
wholistic semantic representations from input events and base responding on these stored structures. In this case, it was demonstrated that these stored structures can contain more information than that actually presented explicitly in the linguistic input.
Table 6. Mean recognition ratings for the six sentence categories
Sentence type          OLDS     NEWS
Potential inference    1.40     1.43
Non-inference          2.22     -0.19
Filler                 2.19     -4.15
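The logic of the PI versus NI contrast can be sketched as a toy inference procedure. This is only an illustration, not a model proposed in these studies: the relation encoding, the function name derivable_beneath, and the single spatial rule (if x is beneath a, and a is on b, then x is beneath b) are assumptions introduced here.

```python
def derivable_beneath(on_relations, beneath_relations):
    """Return every (x, y) pair, read as 'x is beneath y', stated or inferable.

    Assumed spatial rule: if x is beneath a, and a is on b, then x is beneath b.
    """
    derived = set(beneath_relations)
    changed = True
    while changed:  # repeat until no new relation can be added
        changed = False
        for (x, a) in list(derived):
            for (top, support) in on_relations:
                if top == a and (x, support) not in derived:
                    derived.add((x, support))
                    changed = True
    return derived


# PI sentence (2): turtles ON the log, fish beneath the turtles.
pi_structure = derivable_beneath({("turtles", "log")}, {("fish", "turtles")})
# NI sentence (1): turtles BESIDE the log, so no 'on' relation is available.
ni_structure = derivable_beneath(set(), {("fish", "turtles")})

# The NEW test sentences say the fish swam beneath "it" (the log).
new_probe = ("fish", "log")
print("PI accepts NEW probe:", new_probe in pi_structure)
print("NI accepts NEW probe:", new_probe in ni_structure)
```

In this sketch the NEW probe is consonant with the enriched structure built from the PI sentence but not with the structure built from the NI sentence, which parallels the pattern of ratings in Table 6.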
A similar point can be made with paragraphs. Consider, for example, the passage below: There is a pond with a car beside it. The car is to the right of the pond. A man sits on top of the car. The pond is crystal clear and one can see all the fish. Bransford, Barclay and Franks (in press) read Ss sets of such passages and then presented them with a forced-choice recognition procedure. For example, Ss might be asked to pick one of the four sentences below. 1. The car is to the right of the pond. 2. The car is to the left of the pond. 3. The man is to the right of the pond. 4. The man is to the left of the pond. Data showed that if Ss could not remember which sentence they had actually heard before (i.e. sentence 1) they tended to pick a sentence consonant with the overall description of the situation (i.e. sentence 3). Ss tended not to pick a sentence like (2) despite the fact that it is linguistically similar to (1). Note that the information that ‘the man was to the right of the pond’ was never provided linguistically. It could only have been inferred on the basis of one’s spatial knowledge of the world. We are currently investigating the notion of ‘knowledge-enriched memory’ structures in a number of other situations, both within the context of memory for individual sentences and within that of memory for connected discourse. It seems to us that one’s spontaneous reaction to linguistic information is not to treat it as an object of storage, but rather to treat it as information which, when combined with other knowledge, allows one to update one’s general cognitive knowledge of the world. The present constructive approach to memory thus applies equally well to memory for sets of sentences in some perceptual context as well as to memory for linguistic inputs in isolation. In addition, different people may bring different knowledge to any particular task. Ultimately the constructive approach to memory should help us to understand how different people
244
John D. Bransford and Jeffery J. Franks
listening to exactly the same set of linguistic inputs can come away remembering (and understanding) very different things.
7. Other considerations related to the present approach to linguistic abstraction and memory
The present approach to linguistic abstraction suggests a number of related problems that must be investigated. Of primary concern is the need for a well-developed, psychologically valid system for expressing semantic structures or representations. Some formulation with greater precision than our heuristic illustrations in Figures 7 and 10 is needed to express the wide variety of types of linguistic structures (and extra-linguistic structures) that might be factors affecting the abstraction and storage of ideas. Of course the structural characterizations developed in linguistics (e.g. Chomsky, 1965) provide initial characterizations of some structures. We have argued, however, that a characterization of individual sentences is not necessarily sufficient to characterize knowledge of a set of semantically related input events. A second question related to the present model of linguistic abstraction concerns optimal ways to communicate semantic information. For example, what kinds of syntactic structures are most efficient and effective for communicating a given semantic structure? It is our belief that a complete answer to this question can only be formulated relative to considerations of the complexity of the semantic structures to be communicated, and that research concerned, for example, with predicting and optimizing the comprehensibility of paragraphs must proceed from this relativistic point of view. Consider, for example, an implication of this view for questions concerning relative difficulty in the comprehending and learning of various syntactic sentence types. It is easy to convince oneself that, in general, syntactically simple sentences are easier to process than syntactically more complex sentences (but see Garrett and Fodor, 1968), but does it follow from this that paragraphs composed of syntactically simple sentences will be the easiest to comprehend?
If paragraph comprehension were equivalent to memory for just those sentences comprising it, one could argue for the communicative efficiency of using all syntactically simple sentences. However, if Ss spontaneously integrate information from various acquisition sentences, syntactic simplicity may not always correlate with ease of comprehension. If Ss have to integrate information anyway, it should help if syntax does this for them, at least partially. In fact, this is precisely what embedding transformations do. They integrate information that could be expressed in separate sentences. The present approach to memory thus suggests that a trade-off relation between syntax and semantics must be considered in dealing with efficiency of communication (see Bransford and Franks, 1970). Simple syntax may actually hinder comprehension by forcing Ss to do too much of the integration. And more complex structures (like some forms of embedding) may actually facilitate comprehension by explicitly expressing cohesive, easily codable semantic integrations of ideas. Indeed, if embedding transformations did not facilitate communication to some extent, it would be difficult to imagine why languages would make use of them at all. Research supporting the notion of the syntax-semantic relation has been reported by Pearson (1969). He found that embeddings helped Ss in recall compared to simpler syntactic strings. We are currently engaged in research aimed at clarifying this trade-off relationship.
Other syntactic transformations besides embeddings should also be viewed from the standpoint of their role in expressing overall semantic structures. Paragraph comprehension presumably requires a constant assimilation of new information into an ever-changing semantic structure, and different sentence types should be differentially effective in communication depending on the nature of the structures to be expressed. For example, given certain structures it should not be at all surprising to find that passive sentences facilitate comprehension to a greater extent than do active sentences, in spite of the fact that, as isolated entities, their relative ease of comprehension seems essentially reversed (e.g. Coleman, 1965).
One of the most important general considerations stemming from the present approach to linguistic abstraction and memory is that it allows one to begin differentiating between comprehension and rote memorization or learning. Remembering based on rote memories is the kind of response that one would expect if one assumed that representations of specific input events were stored. The present view of abstracted wholistic semantic structures provides a basis for discussing comprehension in more intuitively reasonable terms. Comprehension involves the construction and use of such abstract structures.
With further refinement the approach will hopefully provide insight into some of the many different levels at which comprehension can take place. We think that the above work demonstrates the appropriateness of Wundt’s distinction between overt sequential manifestations in language and the underlying cognitive structures that are constructed from the overt expressions. The work carries this distinction far beyond the sentence level. It provides a strong basis for the claim that in general, given a set of related linguistic inputs, people will integrate the semantic information into a wholistic idea representing the complete meaning being communicated and store this complete idea in memory. The semantic information contained in this stored structure will in general be a function both of the linguistic input and of general extra-linguistic ‘knowledge’ which relates to the linguistic input. Ss’ later use of this semantic information, say in responding, will be based to a great extent on the stored idea and not necessarily on stored representations of particular inputs, although memories for stylistic aspects of the inputs may also be retained.
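The contribution of extra-linguistic knowledge can be made concrete with the pond passage discussed in the preceding section. The Python fragment below is only a sketch under stated assumptions: the encoding of facts as triples, the single spatial rule, and the names `stated`, `enrich`, and `test_items` are hypothetical illustrations and are not part of the reported studies.

```python
# Hypothetical sketch: facts are (subject, relation, object) triples.
# Only these two facts are given linguistically in the pond passage.
stated = {
    ("car", "right_of", "pond"),
    ("man", "on_top_of", "car"),
}

def enrich(facts):
    """Add inferences licensed by one assumed piece of spatial knowledge:
    whatever sits on top of X shares X's left/right relations."""
    model = set(facts)
    for (a, rel1, b) in facts:
        if rel1 != "on_top_of":
            continue
        for (c, rel2, d) in facts:
            if c == b and rel2 in ("right_of", "left_of"):
                model.add((a, rel2, d))
    return model

# The four forced-choice test sentences from the passage example.
test_items = {
    1: ("car", "right_of", "pond"),
    2: ("car", "left_of", "pond"),
    3: ("man", "right_of", "pond"),
    4: ("man", "left_of", "pond"),
}

model = enrich(stated)
# Sentences consonant with the enriched description of the situation.
compatible = [n for n, s in test_items.items() if s in model]
print(compatible)
```

In this toy model sentences 1 and 3 are the only ones compatible with the enriched description, although only sentence 1 was stated; sentence 2, which is linguistically closest to sentence 1, is excluded.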
8. Overall summary
The present paper is concerned with the problem of linguistic abstraction and the relation between language and memory. The present approach extends investigations of the notion that the overt sequential expressions of a sentence should be considered distinct from the underlying semantic or meaning structure of the message being communicated. Wundt made this distinction in separating the overt sequential aspects of language from the wholistic cognition or idea which underlay or could be constructed from the overt sentences. This same distinction is made by modern transformational linguistics in distinguishing surface and deep structures of sentences. The present paper reviews a series of studies that demonstrate that this distinction can be carried beyond the sentence level. In general the work demonstrates that information is integrated from a number of different sentences related to some topic and formed into a complete idea of the information being communicated, and this complete idea is the memory representation of the information that is stored.
The first studies attempted to communicate information about four different complete ideas to Ss and to assess the nature of their memory representation of this information. The general procedure consisted of an acquisition task followed by a recognition test. During acquisition Ss were presented with a set of sentences each expressing only part of the meaning of one of the four complete ideas. Following acquisition, Ss were given a recognition test with confidence ratings.
The recognition test contained three general types of sentences: OLDS, which are sentences actually presented in acquisition; NEWS, which are sentences that are derivable from the complete ideas but were not presented in acquisition (four sentences, each encompassing the complete semantic structure of one of the four whole ideas, were included among the NEWS); and NONCASES, which are sentences that are not derivable from any of the complete ideas. In general the results were that (1) Ss were confident that they had not heard NONCASES during acquisition but in general thought that they did hear clear cases, both OLDS and NEWS; (2) Ss in general did not distinguish OLDS from NEWS, giving both types of sentences comparable recognition ratings; and (3) a clear relationship was found between the extent to which a sentence contained the complete idea and Ss’ recognition ratings of the sentence. That is, the more complex the sentence the higher the recognition rating it received (for both OLDS and NEWS). These results support the hypothesis that people integrate the meanings of related sentences and store the complete ideas in memory. Later tests (like recognition) are based on information about these complete ideas rather than, say, on memory for those particular sentences that occurred on the acquisition list. NONCASES are rejected because they are incompatible with the complete idea. OLDS and NEWS are both positively recognized and not readily distinguished from
The abstraction of linguistic ideas: A review
247
one another because both types of sentences are derivable from the complete idea. Further experiments were conducted to more precisely specify the semantic structure that is remembered. An hypothesis that what is stored is a list of independent semantic features was contrasted with the wholistic idea hypothesis. The results were contrary to the expectations of the semantic features model but were compatible with an account based on stored wholistic ideas. Other experiments were conducted to determine the generality of the above results and to further investigate the usefulness of the present technique as a tool for investigating linguistic abstraction. The above work used an incidental learning task for acquisition. Other acquisition tasks, including a variety of different incidental tasks and a task instructing Ss to explicitly remember the acquisition sentences (nonincidental), have been used, and the results were comparable to those obtained in the initial experiments. In other tasks the range of syntactic-semantic complexity of the acquisition sentences that are related to a given idea was varied, and in general comparable results were obtained. This indicates that the abstraction of complete ideas holds over a variety of input complexities. In still other work we considered the effects due to the general nature of the ideas to be acquired. In the above work the ideas being communicated were concrete and easily imaged. A set of experiments demonstrated that the same pattern of results holds when the complete ideas are abstract and difficult to image. Thus the results supporting the integration, abstraction, and memory for wholistic ideas appear to hold for a wide variety of tasks. In additional experiments we extended the scope of the hypothesis to deal with complete ideas that are too complex to be effectively expressed in single sentences. Examples of such structures are the semantic structures which underlie paragraphs and connected discourse in general.
Evidence for the same general phenomena was demonstrated. Ss integrated the information from a set of related sentences composing a paragraph and based their responses to later tests (recognition and recall) on abstracted and stored wholistic representations of the ideas being communicated. Finally, a set of experiments is discussed which demonstrates that in many cases the wholistic idea that is constructed and stored encompasses more information than is explicitly represented in the linguistic expressions that are used. It is shown that the information explicitly contained in the linguistic input is often supplemented or enriched by a person’s extra-linguistic knowledge about the world and that the memory representation of the idea being communicated includes such information. Overall the series of studies provides support for the hypothesis that on the basis of linguistic input, often supplemented by extra-linguistic knowledge, people spontaneously construct and retain wholistic semantic representations of the ideas being communicated. Ss do not necessarily retain information about those specific sentences from which these semantic representations were originally acquired.
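The logic of the recognition design summarized above can be sketched in a few lines of Python. This is a toy formalization under our own assumptions, not the procedure or scoring used in the studies: a complete idea is treated as a set of four atomic propositions with hypothetical labels `p1` to `p4`, a sentence is the set of propositions it expresses, and `predicted_rating` is an invented scale that merely mirrors the reported pattern of results.

```python
# Assumption: one complete idea = four atomic propositions (labels are hypothetical).
complete_idea = frozenset({"p1", "p2", "p3", "p4"})

# Assumption: acquisition sentences each express only part of the idea.
acquisition = [
    frozenset({"p1"}),
    frozenset({"p2", "p3"}),
    frozenset({"p1", "p4"}),
    frozenset({"p2", "p3", "p4"}),
]

def classify(sentence, idea, presented):
    """Label a test sentence as OLD, NEW, or NONCASE."""
    if sentence in presented:
        return "OLD"        # actually presented during acquisition
    if sentence <= idea:
        return "NEW"        # derivable from the complete idea, never presented
    return "NONCASE"        # contains something the idea does not license

def predicted_rating(sentence, idea):
    """Toy prediction on an invented scale: noncases are rejected; otherwise
    the rating grows with the share of the complete idea the sentence expresses.
    It ignores whether the sentence was presented, mirroring the finding that
    OLDS and NEWS received comparable ratings."""
    if not sentence <= idea:
        return -1.0
    return len(sentence) / len(idea)

full = frozenset({"p1", "p2", "p3", "p4"})   # sentence expressing the whole idea
print(classify(full, complete_idea, acquisition), predicted_rating(full, complete_idea))
```

Under these assumptions the sentence expressing the whole idea is a NEW item yet receives the highest predicted rating, which is the qualitative pattern the summary describes.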
REFERENCES
Begg, I., and Paivio, A. (1969) Concreteness and imagery in sentence meaning. J. verb. Learn. verb. Behav. 8, 821-827.
Bransford, J. D., Barclay, J. R., and Franks, J. J. (In press) Sentence memory: A constructive vs. interpretive approach. Cognitive Psychol.
Bransford, J. D., and Franks, J. J. (1970) Temporal integration in the acquisition of complex linguistic ideas. Symposium paper presented at the Midwestern Psychological Association.
Bransford, J. D., and Franks, J. J. (In press) The abstraction of linguistic ideas. Cognitive Psychol.
Blumenthal, A. L. (1967) Prompted recall of sentences. J. verb. Learn. verb. Behav. 6, 203-206.
Blumenthal, A. L. (1970) Language and Psychology. New York, J. Wiley and Sons.
Blumenthal, A. L., and Boakes, R. (1967) Prompted recall of sentences. J. verb. Learn. verb. Behav. 6, 674-676.
Chomsky, N. (1965) Aspects of the Theory of Syntax. Cambridge, Mass., M.I.T. Press.
Coleman, E. B. (1965) Learning prose written in four grammatical transformations. J. appl. Psychol. 49, 332-341.
Curnow, P. F. (1969) Integration of linguistic materials. Ph. D. Dissertation, University of Minnesota.
Curnow, P. F., Franks, J. J., Bransford, J. D., and Jenkins, J. J. (In preparation) The effects of acquisition complexity and instructions on linguistic integration.
Ebbinghaus, H. (1885) Ueber das Gedächtnis (translated by H. A. Ruger and C. E. Bussenius, 1913). New York: Teacher’s College.
Franks, J. J., and Bransford, J. D. (In preparation) Linguistic abstraction: Towards a specification of what is learned.
Franks, J. J., and Bransford, J. D. (In press) The acquisition of abstract linguistic ideas. J. verb. Learn. verb. Behav.
Garrett, M., and Fodor, J. A. (1968) Psychological theories and linguistic constructs. In T. R. Dixon and D. L. Horton (eds.) Verbal Behavior and General Behavior Theory. Englewood Cliffs, N. J.: Prentice-Hall, Inc.
Hyde, T. S., and Jenkins, J. J. (1969) Differential effects of incidental tasks on the organization of recall of a list of highly associated words. J. exp. Psychol. 82, 472-481.
Katz, J. J., and Postal, P. M. (1964) An Integrated Theory of Linguistic Descriptions. Cambridge, Mass., M.I.T. Press.
Mehler, J. (1963) Some effects of grammatical transformation on the recall of English sentences. J. verb. Learn. verb. Behav. 2, 346-351.
Miller, G. A. (1962) Some psychological studies of grammar. Amer. Psychol. 17, 748-762.
Neisser, U. (1966) Cognitive Psychology. New York, Appleton-Century-Crofts.
Pearson, P. D. (1969) The effects of grammatical complexity on children’s comprehension, recall, and conception of semantic relations. Unpublished Ph. D. thesis, University of Minnesota.
Postal, P. M. (1964) Underlying and superficial structure. Harvard educ. Rev. 34, 246-266.
Rohrman, N. L. (1968) The role of syntactic structure in the recall of English nominalizations. J. verb. Learn. verb. Behav. 7, 904-912.
Sachs, J. (1967) Recognition memory for syntactic and semantic aspects of connected discourse. Percept. Psychophys. 2, 437-442.
Savin, H. B., and Bever, T. G. (1970) The nonperceptual reality of the phoneme. J. verb. Learn. verb. Behav. 9, 295-302.
Shepard, R. N. (1967) Recognition memory for words, pictures, and sentences. J. verb. Learn. verb. Behav. 6, 156-163.
Slobin, D. I. (1968) Recall of full and truncated passive sentences in connected discourse. J. verb. Learn. verb. Behav. 7, 876-881.
Résumé
In this article we have examined the status of individual sentences. Is a sentence a unit of memory or, above all, a unit whose purpose is to communicate ideas? A series of experiments demonstrated that subjects do not retain only the information expressed by the stimulus sentence. Rather, they spontaneously integrate the information communicated by sets of semantically related sentences (often presented non-consecutively) in order to construct a more global semantic description. These descriptions may contain more information than any one of the particular inputs. Memory is above all a function of these global structures. Subjects will recognize and recall sentences that were never presented during acquisition but that are derivable from the acquired semantic structures. However, subjects will rarely recognize or recall sentences that distort the integrated ideas. We have studied semantic integration under varied experimental conditions. We have demonstrated that this integration takes place in the context of specially prepared ‘integration paradigms’ as well as for prose passages, and for a whole variety of acquisition tasks. We have evaluated several models that attempt to explain these data, and we discuss their implications.
6
Concordant preferences as a precondition for affective but not for symbolic communication (or How to do experimental anthropology)*
DAVID PREMACK University of California, Santa Barbara
Abstract An argument is shown that could provide a litmus paper-like test for determining whether or not a species communicates symbolically when appropriately constrained (as opposed to, is capable of acquiring language with human intervention). In affective communication a listener can predict his own emotional state from that of a speaker. If a speaker, having found food in one case and predators in another, returned in a high positive state in one case and a high negative state in another, listeners could venture forth and stay home respectively, in anticipation of objects that would induce comparable affective states in them. The effect could be comparable to that of the symbolic communication ‘there are apples out there’ and ‘there is a tiger out there’. However, the affective system depends upon a precondition which the symbolic one does not. A listener can predict his own affective state from that of a speaker only if both parties have concordant preferences. Therefore, induce a high need to communicate while at the same time destroying the normally concordant preferences of the species; if the species still communicates it is symbol positive. In addition, a model is presented suggesting that basic cause-effect relations, of a kind underlying human technologies, are discovered in nonutilitarian contexts under the aegis of an aesthetic disposition. We explore the experimental application of these ideas to chimpanzees in a simulated field situation.
* Paper read at Conference on Behavioral Basis of Mental Health, Galway, Ireland, 1972.
Consider two main ways in which you could benefit from my knowledge of the conditions next door. I could return and tell you, ‘the apples next door are ripe.’ Alternatively, I could come back from next door, chipper and smiling. On still
another occasion I could return and tell you, ‘a tiger is next door.’ Alternatively, I could return mute with fright, disclosing an ashen face and quaking limbs. The same dichotomy could be arranged on numerous occasions. I could say, ‘the peaches next door are ripe,’ or say nothing and manifest an intermediate amount of positive affect since I am only moderately fond of peaches. Likewise, I might report, ‘a snake is next door,’ or show an intermediate amount of negative affect since I am less shaken by snakes than by tigers. For simplicity, consider that everything of interest is next door, making locational information irrelevant. The question to be answered therefore is always, What is it? never, Where is it? Also, for simplicity, assume that everything I tell you is true, even as every affective state I display is genuine and not simulated. These qualifications are not necessary for the simple argument to be made here but they smooth the way. Information of the first kind consists of explicit properties of the world next door; information of the second kind of affective states that I will assume can be positive or negative, and can vary in degree. Since changes in the affective states are caused by changes in the conditions next door, the two kinds of information are obviously related. In the simplest case, we could arrange that exactly the condition referred to in the symbolic communication be the cause of the affective state. If ‘cause’ is too simple, then consider that the tiger, apple, snake, etc. is the dominant factor in the affective display; much of what I will say here could be given more sophisticated formulation but without contributing materially to the basic argument. The use you could make of my statements needs no comment; but the use you could make of my affective states is almost equally obvious. You could go next door when my state was positive, not go when it was negative. 
The speed and certainty with which you went could be proportional to the intensity of my positive state. The certainty with which you did not go, even perhaps the distance you went to the opposite side of the house or the number of doors you locked behind you, could be proportional to the intensity of my negative state. Or you could go on all occasions but carry a bucket on some occasions, a spear on others. All of this is foretold from a simple fact stated earlier: the two kinds of information are correlated; they are caused by the same factors. Conveniently locating everything of interest next door is a hypothetical arrangement, of course, but we can see the same affective system operate to good advantage in an actual experimental situation. Menzel (1971) has recently reported some ingenious experiments concerning the social behavior of a group of young chimpanzees in a one-acre compound. After hiding such objects as food or snakes in the compound, the experimenter took one animal into the compound and showed it the hidden object. He then returned the informed animal to the rest of the group,
Concordant preferences
as a precondition for affective communication
253
which was held in a restraining cage on the edge of the compound, and released all animals together (regrettable from the point of view of communication studies but compatible with Menzel’s objectives). The animals succeeded in finding the hidden object, snake no less than food, significantly more often than when they were released into the compound without the benefit of an informed animal. In other experiments, Menzel hid two caches of food, one larger than the other, showed one to animal A the other to animal B, and again released all animals at the same time. Typically they not only found both objects, but found the larger one first. In addition, Menzel noted that the quality and even perhaps quantity of the object hidden could be predicted by an uninformed observer from the exit behavior of the animals (personal communication). From the moment the animals were released there were detectable differences in posture, gait, and vocalizations. Why did the knowing animal not simply steal off and enjoy the food to itself? Immature chimps, and perhaps even caged adult ones, are apparently afraid to venture too far into the compound alone. Menzel’s work shows that they tend not to exceed a certain distance from one another, actually not a distance so much as a time needed to overcome a distance. The informed animal’s fear combined nicely with the group’s ignorance of the location of the hidden objects, giving rise to a high mutual need for communication. The informed animal was afraid to go alone, while the uninformed animals were unlikely to find the hidden object if they went alone. Each party needed the other. Man has both affective and symbolic communication. Indeed, conflict between the two - e.g., a father saying to his son through gritted teeth, ‘I do agree with you, more than you realize’ - has been proposed as a source of mental illness (Mehrabian, 1970).
All other species, except when tutored by man (Gardner & Gardner, 1969; Premack, 1970) have only the affective form. Even affective information alone could be of great value in a world where changes in location can be costly in terms of energy, risk or both. As a rule, the individual cannot simply venture forth each time a forager returned in a positive state. Instead he must weigh possible gains against possible losses, taking into account the valence and intensity of the speaker’s state on the one hand and his knowledge of predators or distance to be traveled on the other. If predators were abundant a leader’s positive state would have to be exceptionally high to induce a positive decision. Conversely, if predators were few a listener could indulge his curiosity, going forth unreservedly to learn what had occasioned a leader’s mildly negative state. In principle, an affective system would permit an animal’s choice behavior to accurately reflect the objective probabilities of his world. This is so despite the fact that a speaker’s affective state probably does not distinguish quality from quantity but is proportional to their resultant. For instance, a large amount of a + 4 item
would probably occasion the same affective state as a lesser amount of a + 7 item. Yet this need not be seen as a weakness of the affective system. If my preferences are the same as yours, I would be as indifferent as you in a choice between large + 4 and small + 7, and thus would be willing to pay the same price for both commodities
in terms of risk and energy. The affective system could not rely entirely on unconditional factors but for maximum efficiency would seem to require some amount of learning. For example, since all members of a group are not likely to be of the same temperament, listeners should not respond merely to a magnitude of affect but rather to magnitude plus source. Contextual sensitivity of this kind would enable the listener to react in the same way to 0.4 intensity from sluggish Henry as to 0.8 intensity from excitable George. On the other hand, in the field this problem may be reduced if not averted by the fact that a listener would not be guided by the affective state of any one speaker. Instead, the speaker’s state would elicit affective states in all his listeners, the intensities of which would vary with their respective temperaments. If each listener more or less integrated over the several affective states, responding, say, to the average intensity of the group, the source of the individual states could be ignored. Only in the laboratory where a listener was restricted to a single speaker would it be necessary to take the source of the affect into account. Yet if one belonged to a group that varied, either through occasionally leaving one group to join another or through changes in membership brought on by the movement of others, the desirability of responding contextually would arise again. Only this time the contextual factor to which a listener should be sensitive would be the group rather than the individual. Even when helped by learning, the affective system is capable only of answering what-questions and not where-questions. But this does not limit the applicability of the affective system as sharply as it might seem. There are at least three ways in which to circumvent the need for locational information while retaining the value of the what-information.
First, everything of interest can be next door, in a department store on 6th street, or across from the fire station, i.e., in some agreed upon location. Second, an informed leader through fear of venturing too far alone or for other reasons can lead his uninformed peers to the hidden objects. Third, the successful forager need not return to the group but can reveal his location by calls, manifesting his affective state on an auditory basis. The latter is undoubtedly the case most often found in the field; nonetheless all three cases share this property: the value of what-information is preserved despite the fact that the affective system does not code for directional information. These examples may help to underscore the uniqueness of the bee’s putative communication system (von Frisch, 1967). The proportionality between the rate at which the bee waggles and the quantity,
quality or distance of the food is merely another instance of the motivational system now turned to a communication purpose quite like the affective system discussed here. But the coding for direction which the bee’s system is said to contain is unique and to my knowledge could not be derived from the classic properties of motivational systems. We come now to the main point of the paper. Affective communication depends upon a simple precondition: All members of the group must have concordant preference orders for the items about which they communicate. When members of a group are agreed about what is positive and negative, and the order of their magnitudes, then, in effect, any member of the group can use the affective state of any other member to predict his own affective state. But if you and I do not order items comparably then neither of us can use the valence or intensity of the other one’s excitement to predict his own. Suppose, for example, you are very fond of strawberries but I detest them. You return in a high positive state. Knowing nothing of your peculiar tastes I become equally excited in anticipation of a highly positive item. On such an occasion I am especially likely to go next door or to follow you into the compound, only to suffer the disappointment of strawberries. Your excitement has not proved to be a good basis for predicting my excitement. Is this a problem which learning could resolve? Typically, when owing to some change in circumstances, the unlearned behavior of a species becomes maladaptive, we turn to learning as the most powerful corrective device for restoring adaptive behavior. Could the affective system be restored by learning in the case of disconcordant preferences? Only in those special cases where there was a systematic relation between the preferences of several individuals. 
For example, if the preferences of one party were the perfect inverse of those of another party, the two parties could learn, either swiftly through a rule induced on a few exemplars, or slowly by trial and error, to adjust to this difference. Comparable adjustments could be made in principle for any systematically related preference orders. But if the relations were not systematic it is not clear that learning could contribute substantively to the problem. Consider a successfully-communicating group in which nonsystematic preferences were introduced. Although the speaker’s affective response to snake or food may be largely unlearned, a listener’s response to a speaker should have both learned and unlearned components. Thus a listener who had rushed next door on the occasion of a speaker’s positive excitement, only to find food that he did not care for, or worse a snake, would learn to inhibit his response to that speaker’s positive excitement. But would that solve his problem? On a subsequent occasion he might learn belatedly that the same speaker’s positive excitement, which he chose to ignore, was the occasion for an encounter with bananas,
an item which he cared for greatly. He might also discover that some of the speaker’s negative states were occasioned by items of which he was quite fond. Yet it would not do simply to respond positively to all of the speaker’s negative states since at least some of them would be associated with negative conditions. And the listener would have no way of distinguishing ‘good’ occasions from ‘bad’ ones. In the long run, when preferences differ nonsystematically, acting on a speaker’s positive affective states would lead a listener to negative and positive stimuli with about equal frequency; acting on his negative affective states would have the same outcome. Similarly, the average intensity of the stimuli a listener encountered would be the same for all intensities of the speaker’s states. Could the problem of disconcordant preferences be resolved by a call system that was not restricted to affective states which were either simply positive or negative? Suppose the species had one call for food, another call for danger, etc. The functional effect of such calls is tantamount to an agreement between members of the species to call the same things food, the same things dangerous, etc. Yet calls of this kind do not differentiate one member of the food class from another member, nor one member of the dangerous class from another member. Thus the call system would protect an anticipation of, say, strawberries against the discovery of a tiger, or vice versa. But it would not protect anticipation of bananas against discovery of strawberries, nor anticipation of snake against discovery of tiger. To avoid within-class, as well as between-class confusion would require either concordant preferences or calls that were specific not only to classes but to members of the classes. 
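The argument about concordant, inverse, and nonsystematic preferences can be checked with a small numerical sketch. The Python fragment below is an illustration under our own assumptions, not a model proposed in the text: four hypothetical items carry the values -2, -1, +1, +2 for the speaker, the listener simply goes whenever the speaker's state has the chosen sign, and 'nonsystematic' preferences are approximated by averaging over every possible reassignment of those values to the items.

```python
from itertools import permutations

# Assumption: the speaker's values for four hypothetical items
# (negative = disliked or dangerous, positive = liked).
speaker = (-2, -1, 1, 2)

def mean_outcome(listener, act_on_positive=True):
    """Mean value, to the listener, of the items he reaches when he acts
    on the speaker's positive states (or on the negative ones)."""
    reached = [l for s, l in zip(speaker, listener) if (s > 0) == act_on_positive]
    return sum(reached) / len(reached)

# Concordant preferences: the listener values items exactly as the speaker does.
concordant = mean_outcome(speaker)

# Perfectly inverse preferences: a systematic relation that can be learned.
inverse = tuple(-s for s in speaker)
inverse_naive = mean_outcome(inverse)                           # before learning
inverse_learned = mean_outcome(inverse, act_on_positive=False)  # after learning the rule

# Nonsystematic preferences: average over all ways of reassigning the values.
orders = list(permutations(speaker))
nonsystematic_pos = sum(mean_outcome(p) for p in orders) / len(orders)
nonsystematic_neg = sum(mean_outcome(p, False) for p in orders) / len(orders)

print(concordant, inverse_naive, inverse_learned, nonsystematic_pos, nonsystematic_neg)
```

In this sketch the concordant listener gains on average, the inverse listener loses until he learns to act on the opposite sign, and the nonsystematic listener does no better by acting on positive states than on negative ones, which is the long-run outcome described above.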
The disastrous effect upon the affective system of disconcordant preferences would be comparable to that of a chaotic world the content of which changed before a listener had an opportunity to act upon a speaker’s message. Indeed, when first beset by the consequences of disconcordant preferences, a listener might well conclude that the world had changed, that it was no longer a reliable place. For often before he could get next door, apples would have turned into snakes and conversely; or so it might seem to the listener whose companion’s preferences had been altered without his knowledge.
Iconic and symbolic communication
Both symbolic and iconic communication escape the simple precondition upon which affective communication depends. Organisms that disagree radically about values can nevertheless guide one another through the world provided they communicate either symbolically or iconically. Communication in this case does not depend upon a unanimity of values but merely upon the consistent application of names or icons to the items in the world. For instance, if you tell me the turnips next door are
ready, your possible dislike of them would not detract from the information. If I want to try some I can, your finicky message notwithstanding. Symbolic communication escapes the precondition because the listener is presented not with (only) the speaker’s affective response to a condition but with a statement of the condition. In the iconic case he is presented with a piece of the condition. If not a tree full of ripe apples, then an apple core or even a leaf of the tree; or if not a tiger then perhaps a product of the tiger, droppings or claw marks. But the tiger might be constipated, the heroic speaker might die in an attempt to bring home a whisker, or with a more plebeian speaker listeners might die for lack of a warning. Icons are inefficient. Worse, concepts are not equally susceptible to iconic representation; some, such as the logical connectives, could not be represented in that manner at all. Admittedly, symbolic communication is pervaded by iconicity (e.g., Durbin, 1971; Wescott, 1971), but the ultimate unacceptability of the pure iconic approach is incontestable. All this is beside the main point, however, which is simply that organisms agreed about values can guide one another through the world with affective communication alone; whereas organisms disagreed about values can still guide one another through the world provided they communicate symbolically.
Experimental anthropology
The difference in preconditions for affective and symbolic communication can be used to do what might reasonably be called experimental anthropology. Years ago a moratorium was declared on speculating about the origin of language (Hewes, 1971). Learned societies sought to help man resist the temptation of speculating about the unknowable by prohibiting all such publication. Today I think we can go beyond idle speculation, not only about origins of language but of human milestones generally. We can test models concerning the origins of language, agriculture, religion, art, etc., and though perhaps we can never say how in fact they did originate, we can assign weights to the alternatives on an experimental basis. A model of the origin of language or of any other human activity will have two components. The first will state the cognitive skills that are a prerequisite for the activity. The second will state the selective pressures which, if imposed upon organisms with the prerequisite skills, will lead to the development of the activity in question. The first component deals with the problem solving ability of the species or its information processing capacity generally. The second deals with environmental pressures, problems that are posed a species by changes in the world. In the rest of this paper, I will take the conclusion from the first half of the paper, and show how it can be utilized as a selective pressure in experiments on the
origins of language. In addition, I will make some tentative proposals for a general model of the origin of human activities.
Laboratory-field combination
Consider a joint laboratory-field approach to the origin of language. Even though we cannot yet enumerate the cognitive skills that are prerequisites for language, laboratory studies have already shown that the chimpanzee can be taught some of the principal exemplars of language (Gardner & Gardner, 1969; Premack, 1970). We know, for example, that the chimp is capable of symbolization, of using one event to represent another; of responding differentially to different word orders; of concatenating and rearranging words in ways that are necessary for the production of sentences. A successful comparison of human and animal intelligence requires that we be able to state the cognitive preconditions for these linguistic performances and ultimately all basic human activities. Our progress in achieving this objective could be measured by our ability to predict, for example, that species with certain cognitive skills could be taught language whereas those without these skills could not. Since we know that the chimp can be taught symbolic communication, it is sensible to apply a selective pressure to this species which may lead it to develop symbolic communication on its own. In the field, using Menzel’s procedures, we could induce in a small group of chimps a high need to communicate. As we have seen, the need to communicate which this procedure induces is normally handled nicely on an affective basis. Followers can accurately anticipate both the valence and the intensity of their own future excitement from the current excitement of the informed animal. But we also know that we could undermine the affective system simply by introducing disconcordances in the preferences, thus leaving an unresolved need to communicate. In the laboratory we could deprive and satiate the animals on different foods. An animal normally keen on, but now satiated on, bananas would be disappointed to find the bananas to which it was led by a highly excited animal that was not satiated on banana. 
Preferences could also be manipulated by contingencies, by arranging that an animal be able to obtain a highly preferred food only by first eating a nonpreferred one. This would increase the animal’s preference for a normally nonpreferred food and lead it to bring back ‘false’ reports (false positive) concerning what was in the compound. With an appropriate combination of these procedures we could arrange that no animal’s preference order be a function of the preference order of any other animal. And if this were not enough, we could also change any animal’s preference order from time to time by changing the satiation and contingency procedures from time to time. In this way, in animals known to be capable
of being taught symbolic communication, we could produce a high need to communicate, while at the same time eliminating the normal mechanism for doing so. Could the chimpanzees then invent iconic or symbolic communication themselves, first when the hints from the experimenter were strong, later when they were made progressively weaker? Perhaps the first time this experiment is done, the animals should be left entirely to their own resources. If they proved incapable of developing a substitute as seems highly likely, or, as is also possible, developed one that we could not decipher, we could attempt to structure their problem solving, not only as an aid to them but also to make certain that we could follow their solution. For instance, we could allow the informed animal to return with a piece of the hidden object. This may duplicate the field situation in which the forager returns not only with its affective state, but also with vestiges of the source of its affective state, for example, with the smell of the food on its breath or body. (So an observer could use the forager’s breath to tell him what was next door and the intensity of the forager’s excitement to tell him how much was there.) We might sidetrack briefly to explore the chimp’s overall ability to use iconic representations. Starting with icons whose relation to the hidden object was that of part-whole, we could progressively weaken the relation, ending up with cases where the icon was merely an associate of the hidden object. Ultimately we could study metaphors. In addition, by giving the informed animal not one object to return with, but a number of alternatives from which to choose, we could study the informed animal’s ability to choose wisely, to pick items that its uninformed companions could use. When an informed animal returned with icons that were informative as its affective states no longer were, we could observe the possible transition from a reliance on affective states to a reliance on icons. 
In the course of this transition, we might observe a general degradation of responsivity to emotional cues, since these cues would no longer possess the functional significance they once did.
words
In a quite different approach we would provide the informed animal not with either icon, metaphor, or a choice among different possible ones, but with words taught both it and the other members of the group in the laboratory. In this case the animal could return to the uninformed group with the word naming the hidden object that it had been shown. Although the proportionality between a speaker’s affective state and the hidden object would no longer be an aid to its companions, with words the informed animal could tell the other animals exactly what was there. Possible differences in their evaluations of the hidden object should become irrelevant. Animals with a high preference for the object named by a given word on a particular trial should go forth with the informed leader; those with low preferences for the item should stay home.
Invented words
Once the animals succeeded in using icons or words that had been taught them in the laboratory, we could raise the critical question, Can they devise their own symbols? The invention of symbols would seem to involve a change more difficult than that involved in other kinds of innovation. Changes in food preparation, tool use and the like, well documented in primate groups (e.g., Marler, 1965), can be made by one animal and then transmitted by social modeling to other animals. But a symbol cannot be invented and transmitted in this way. At least two individuals must use a symbol in the same way in order for it to be effective. I may use a blue triangle to represent apple in my private thinking and problem solving while you use a red square for the same purpose, but we could not communicate about apples until we used the same symbol, or found a way to establish the equivalence of our different symbols. Symbols seem, therefore, more likely to be social inventions rather than individual ones. An alternative would be for one inventor to transmit his idea to other animals; an improbable alternative in light of the degree of instruction that would be demanded and the fact that didactic intervention of that kind is apparently totally unknown outside of man. Indeed, instruction is considered to play only a minor role in the child’s acquisition of language. How can we arrange for the kind of joint invention that the symbol seems likely to require? Two animals could be shown a hidden food at the same time and supplied an arbitrary object as the only possible item with which to represent the hidden food. These animals would be in a position to share the same potential symbol. In addition, animals that chanced to follow the first pair into the compound could associate the arbitrary object shown them with the food discovered in the compound. The association is more likely to develop if the interval was short, or perhaps merely if the food were new. 
In some species (Garcia, Ervin & Koelling, 1966) associations develop between avoidance responses and foods despite long intervals provided the food is new and it results in gastrointestinal upset. The sickness may be unnecessary, however, and the association between stimuli and food may develop over unusual intervals merely if the food is new. In another approach to the invention of symbols, words or icons would not be brought back from the compound by the informed animal but would be selected by
that animal from alternatives stored in the restraining cage. After being returned to the restraining cage, the informed animal would look over the alternatives available to him there and select the one he considered to best represent the object he had seen in the compound. If the alternatives were consistently stored in specific locations, the removal of either the words or icon need not prevent the informed animal from communicating with his peers. He could put his hand in the appropriate place, or merely point in a given direction, and in this way perhaps devise gestures that would substitute for the previous words or icons.
Aesthetics and the discovery of basic causal relations
In the other human milestones - art, religion, agriculture - comparable analyses would be made: cognitive skills on the one hand and selective pressures on the other. Since the logic of these cases is not different from that of language, I will not take them up but will turn to a slightly different problem. One precondition for certain human activities is a knowledge of basic cause-effect relations. In agriculture, for example, the most basic causal relation is that between the seed and the plant. Consider the nature of the circumstance in which relations of this kind are likely to be discovered and ultimately used. Agriculture is considered to have replaced hunting and gathering in areas where population density made earlier forms of provisioning untenable (e.g., Binford, 1971). Population density may have led to agriculture, but is this same pressure likely to have led to the discovery of the seed-plant relation? Bushmen of today are reported to know the seed-plant relation yet they continue to hunt and gather nonetheless. Though more efficient than hunting or gathering, agriculture is actually more arduous. People turn to it, I suspect, because they have to and not because they have just discovered the seed-plant relation. Discoveries of basic cause-effect relations such as the seed-plant relation seem more likely to occur under the aegis of aesthetic or exploratory dispositions than utilitarian ones, and thus to occur in contexts far removed from those in which the knowledge is ultimately used. If this is so and the causal knowledge is often not used directly, some functional repository would seem necessary, a system for preserving knowledge that a group carried but was not yet using. Finally, there is the terminal phase in which appropriate selective pressures operate upon existing knowledge to produce technologies representing solutions to practical problems.
This suggests a three state model in which the basic steps in the development of human technologies are discovery, retention, and use. A principal root of the aesthetic disposition is a preoccupation with the discontinuities of space and with the possibilities of their transformation. We need not go
to the human artist in whom these dispositions are institutionalized; in a minute way they can be seen even in a rat. Placed in a small box in which a lever projects from a wall, the rat rises on to its hind legs, sniffing and sweeping its vibrissae across the wall. In dropping back to the floor, its front legs contact the lever, which gives slightly under the pressure, causing the rat to stiffen and its hair to bristle; there is a momentary excitement. Having discovered this break in the texture of space, the rat is likely to return to reinstate it. The event in the rat is small but it can be magnified in the monkey and still more in the chimp. Consider a monkey that has pressed the same lever hundreds of times, producing no extrinsic consequence. One day the lever sticks before returning to resting position. The visibly excited monkey presses 30 times in the space of a few minutes trying presumably to restore the change in the visual transit of the lever. There is no end to this kind of event in the chimp; I will offer only one example. Sarah, a ten year old female chimp, who is the subject of a long-term language project, occasionally finds cuts on the hand of her trainer. She squeezes the cut expertly, not by opposing her thumb and forefinger in the human manner, but by placing her index fingers on opposite sides of the cut. The pressure accomplished in this manner can be very finely graded. As she squeezes her attention is rapt; she looks up from the cut only to peer into the eyes of the trainer (who looks back puzzled and a bit frightened, not of Sarah but of the intensity of her preoccupation). The chimp leaves off pressing just as a thin red line appears along the cut, outlining it against the rest of the skin. Presumably she would go on in this manner, raptly attentive, making subtle changes in space - if we had a device that could offer her multiple cuts.
But no one has been willing to inflict a series of even small cuts in his hand simply to confirm the obvious. Rather than elaborate examples from chimp behavior, I will provide one example from human behavior, one which, as you will see, applies directly to agriculture. Some months ago in discussing the present thesis with Dr. Barbara Partee (gifted UCLA linguist) she was reminded of an event from her childhood which she has kindly consented to have reported. Walking in the woods in the late fall, she found an unusual clump of small trees. Returning with a small saw, she cut down the trees in the center and stuck them in with the other trees to form ‘a fort’ as she recalls it. In the spring she rediscovered the clump of trees, finding not only the original trees in bloom but those she had sawed off and transplanted as well. In this way, under the aegis of a disposition to operate upon and ‘improve’ space, she discovered rooting, one of the oldest forms of horticulture. Fossilized seeds, recently discovered on the graves of Neanderthal Man, have proved to be the seeds of flowers, suggesting that 50 thousand years ago man was already placing flowers on the graves of his dead. Who can say but that in the
context of burial, seed may have fallen in fresh earth, sprouted, and led man to discover the seed-plant relation. The initial discovery of that relation is lost in prehistory; we cannot reasonably hope to recover it. My point is simply to note, first, the urgency of the aesthetic disposition in man and even chimp, and second that the disposition is of a kind to lead man into activities where he is likely to discover basic cause-effect relations. If knowledge of great utilitarian potential is first discovered in nonutilitarian contexts, it seems reasonable to provide a repository for it, such that it may be preserved for later use. The repository may be ritual, religion, or even art; I have no clear idea. The problem has clear aspects of psychological interest however. What factors make it likely that knowledge acquired in one context will be preserved in some other context; and what factors make it likely that knowledge will be used in contexts different from those in which it was discovered, preserved or both? The literature on problem solving suggests some of the difficulties that can arise in transporting an idea from one domain to another. Also we know the power of metaphor, a power by no means restricted to art but found also in scientific discovery. Can we systematize these matters and show, for example, how causal relations discovered in one context are more likely to be utilized than causal relations discovered or preserved in some other context? The last assumption in the model is simply that cause-effect relations which may have been a part of group knowledge for years will come to provide the basis of technology when activated by appropriate pressures. These three assumptions provide the tentative basis of a model as to how knowledge may be discovered, carried, and ultimately used. 
There are psychological issues of considerable interest locked in these assumptions and it is my hope to free them with the help of chimpanzees in a combined laboratory-field approach.
REFERENCES
Binford, L. R. (1971) Post-Pleistocene adaptations. In S. Struever (Ed.), Prehistoric agriculture. New York, The Natural History Press.
Durbin, M. (1971) Some non-arbitrary aspects of language. Paper presented at the meeting of the American Anthropological Association, New York.
Garcia, J., Ervin, F., and Koelling, R. (1966) Learning with prolonged delay of reinforcement. Psychonomic Science, 5, 121-122.
Gardner, R. A. and Gardner, B. T. (1969) Teaching sign language to a chimpanzee. Science, 165, 664-672.
Hewes, G. W. (1972) An explicit formulation of the relationship between tool-using, tool-making and the emergence of language. Unpublished manuscript. University of Colorado, Boulder.
Marler, P. (1965) Communication in monkeys and apes. In I. De Vore (Ed.), Primate Behavior. New York, Holt, Rinehart & Winston. Pp. 544-584.
Mehrabian, A. (1970) Tactics of social influence. Englewood Cliffs, N.J., Prentice-Hall.
Menzel, E. W. (1971) Social organization of a group of young chimpanzees. Paper read at the meeting of the American Anthropological Association, New York.
Premack, D. (1970) A functional analysis of language. J. exper. Anal. Behav., 14, 104-125.
von Frisch, K. (1967) The dance language and orientation of bees. Cambridge, Mass., Harvard University Press.
Wescott, R. (1971) Linguistic iconism. Language, 47, 416-428.
We try to show that it is possible to provide a ‘litmus paper’ kind of test for determining whether or not a species communicates symbolically when it is appropriately constrained to do so (rather than whether it is capable of acquiring a language with human intervention). In affective communication, the listener can predict his own emotional state from the state of the speaker. If a speaker has either found food or seen a predator, he is in a highly positive state in the first case and in a highly negative state in the second. Listeners can then either stay home or venture out, anticipating the objects that will put them in comparable affective states. The effect can be comparable to that of a symbolic communication of the type ‘there are apples outside’, or ‘there is a tiger outside’. However, the affective system depends on a precondition on which the symbolic system does not depend. A listener can predict his own affective state from that of the speaker only provided that both have the same preferences. Thus if a need to communicate is induced while the normal concordance of preferences within the species is destroyed, and if the species continues to communicate, this may be taken as a positive indication of symbolic communication. A model is also presented which suggests that the basic cause-effect relations underlying human technologies may be discovered in nonutilitarian contexts under the aegis of aesthetic dispositions. The author explores an experimental application of these ideas with chimpanzees placed in simulated field conditions.
7
Science or superstition?
(A physical scientist looks at the IQ controversy)*
DAVID LAYZER
Harvard College Observatory
That valid judgments about the biological significance of differences in tests of mental abilities are impossible has already been stressed in preceding sections. This is a point that cannot be overemphasized in view of the immediacy of the racial problems confronting the United States at this time. Various proposals have been advanced by probably well-intentioned people that suggest how meaningful investigation of genetic rather than phenotypic differences in intelligence and achievement might be carried out. Most of these individuals suffer from the obstinate inability to see the methodological difficulties and inherent biases of their schemes. Some anthropologists even opine that such studies are irrelevant or too vulnerable to misinterpretation and too fraught with political danger to be undertaken. This may or may not be true, but it is a fact that generations of discrimination have made direct comparisons of mental traits between Negroes and whites not biologically meaningful.
I. M. Lerner (1968, p. 234)
Abstract Jensen and others have applied the polygenic theory of inheritance to IQ measurements and have derived a value of about .8 for the ‘heritability of IQ.’ From this result they draw a number of far-reaching conclusions, in particular that children with low IQ’s cannot acquire higher cognitive skills (those involved in abstract reasoning and problem-solving) and that ethnic differences in average IQ probably have a significant genetic component. This paper analyzes the implicit assumptions underlying Jensen’s theoretical analysis and demonstrates that they are untenable. Like any other quantitative scientific theory, the theory of polygenic inheritance applies only to measurements that satisfy certain formal requirements. A detailed discussion of IQ measurements shows that they do not satisfy these requirements. Consequently, estimates of the ‘heritability of IQ’ are not merely unreliable but meaningless. Next, IQ correlation data and other relevant observations are examined in the light of current ideas concerning cognitive development. It is shown that the data provide no support for the view that children with low IQ’s, or children of parents with low IQ’s, have limited capacity for acquiring higher cognitive skills. Finally, the ‘hypothesis’ of genetic differences in intelligence between ethnic groups is shown to be untestable by existing or foreseeable methods. Hence it should not be regarded as a scientific hypothesis but as a metaphysical speculation.
* The present version of this paper has benefited greatly from criticisms (of an earlier version) and suggestions by Professor James F. Crow, Professor Irven De Vore, Professor Martin Deutsch, Professor I. Michael Lerner, Professor Lawrence Plotkin, Professor Sandra Scarr-Salapatek, Dr. Robert L. Trivers, Dr. Jean Carew Watts, and many other kind friends and critics. I acknowledge with gratitude their help and encouragement.
A number of years ago, when high school teachers in North Carolina were being paid a starting salary of $120 per month, I happened to ask a member of that state’s legislature whether he considered this to be an adequate salary. ‘Certainly,’ he said, ‘they’re not worth any more than that.’ ‘How do you know?’ I asked. ‘Why, just look at what they’re paid.’ Circular reasoning? I think not. Our views on salary and status reflect our basic assumptions concerning the individual and his relation to society. One possible assumption is that society should reward each of its members according to his needs and contributions. Another is that society has a fixed hierarchic structure and each individual gravitates inevitably toward the level where he belongs. My question was based on the first assumption, the legislator’s reply on the second. The idea that, by and large, we get what we deserve - that there is a pre-ordained harmony between what we are and what we achieve - was an essential ingredient in the Calvinist doctrine of New England’s Puritan settlers. What really mattered to them was not, of course, how well they did in this world but how well they would do in the next. The first was important only insofar as it provided a clue to the second. Although Calvinism’s other-worldly orientation has long since gone out of fashion, its underlying social attitudes persist and continue to play an important part in shaping our social, educational and political institutions. Because we still tend to interpret wealth and power as tokens of innate worth (and poverty and helplessness as tokens of innate worthlessness), we tend to believe that it is wicked to tamper with ‘natural’ processes of selection and rejection (Thou shalt not monkey with the Market), to erect artificial barriers against economic mobility (downward or upward), or to penalize the deserving rich in order to benefit the undeserving poor.
Not unnaturally, such attitudes have always appealed strongly to the upwardly mobile and those who already inhabit society’s upper strata. Besides, they offer a
convenient rationalization for our failure to cope with, or even to confront, our most urgent social problem: the emergence of a growing and self-perpetuating lower class, disproportionately Afro- and Latin-American in its ethnic composition, excluded from the mainstream of American life and alienated from its values, isolated in rural areas and urban ghettos, and dependent for the means of bare survival on an increasingly hostile and resentful majority. Faced with this problem, many people find it comforting to believe that human nature, not the System, is responsible for gross inequalities in the human condition. As Richard Nixon has said, ‘Government could provide health, housing, means, and clothing for all Americans. That would not make us a great country. What we have to remember is that this country is going to be great in the future to the extent that individuals have self-respect, pride and a determination to do better.’ Although such attitudes are deeply ingrained, increasing numbers of Americans are beginning to question their validity. The System may be based on eternal moral truths, but in practice it seems to be working less and less well; and one of the eternal moral truths does, after all, assert that practical success is inner virtue’s outward aspect. Yet the quality of life in America is deteriorating in many ways, not only for the downwardly mobile lower class (who, according to Mr. Nixon, are not trying hard enough) but also for the upwardly mobile middle class (who are already trying as hard as they can). In these circumstances any argument that lends support to the old, embattled attitudes is bound to arouse strong emotional responses both among those who recognize a need for basic social reform and among those who oppose it.
Jensenism
This may help to explain the furor generated by the publication, in a previously obscure educational journal, of a long scholarly article provocatively entitled, ‘How Much Can We Boost IQ and Scholastic Achievement?’ (Jensen, 1969). Very little, concludes the author - because differences in IQ largely reflect innate differences in intelligence. Children with low IQ’s, he argues, lack the capacity to acquire specific cognitive skills, namely, those involved in abstract reasoning and problem solving. Such children should be taught mainly by rote and should not be encouraged to aspire to occupations that call for higher cognitive skills. What is true of individuals could also well be true of groups, continues Jensen: differences between ethnic groups in average performance on IQ tests probably reflect average differences in innate intellectual capacity. Jensen does not shirk the unpleasant duty of pointing out that this conclusion has an important bearing on
fundamental questions of educational, social and political policy:
Since much of the current thinking behind civil rights, fair employment, and equality of educational opportunity appeals to the fact that there is a disproportionate representation of different racial groups in the various levels of educational, occupational and socioeconomic hierarchy, we are forced to examine all the possible reasons for the inequality among racial groups in the attainments and rewards generally valued by all groups within our society. To what extent can such inequalities be attributed to unfairness in society’s multiple selection processes? . . . And to what extent are these inequalities attributable to really relevant selection criteria which apply equally to all individuals but at the same time select disproportionately between some racial groups because there exist, in fact, real average differences among the groups - differences . . . indisputably relevant to educational and occupational performance?
The contention that IQ is an index of innate cognitive capacity is of course not new, but it has not been taken very seriously by most biologists and psychologists. Jensen’s article purports to put it on a sound scientific basis. In outline, his argument runs as follows. IQ test scores represent measurements of a human trait which we may call intelligence. It is irrelevant to the argument that we do not know what intelligence ‘really is.’ All that we need to know is that IQ tests are internally and mutually consistent and that IQ correlates strongly with scholastic success, income, occupational status, etc. We can then treat IQ as if it was a metric character like height or weight, and use techniques of population genetics to estimate its ‘heritability.’ In this way we can discover the relative importance of genetic and environmental differences as they contribute to differences in IQ. Such studies show, according to Jensen, that IQ differences are approximately 90 % genetic in ori
Science or superstition?
caste, its members doomed by their genetic incapacity to do well on IQ tests to remain forever unemployed and unemployable, a perpetual burden and a perpetual threat to the rest of society. Many of Jensen’s and Herrnstein’s critics have accused them of social irresponsibility. In reply, Jensen and Herrnstein have invoked the scholar’s right to pursue and publish the truth without fear or favor. Besides, they point out, we cannot escape the consequences of unpleasant truths either by shutting our eyes to them or by denouncing them on ideological grounds. But how firmly based are these ‘unpleasant truths’? The educational, social and political implications of Jensen’s doctrine justify a careful examination of this question. It is easy to react emotionally to Jensenism, but teachers and others who help to shape public attitudes toward education and social policy cannot allow themselves to be guided wholly by their emotional responses to this issue. There is another reason why Jensen’s technical argument repays analysis. It exemplifies - almost to the point of caricature - a research approach that is not uncommon in the social sciences. Taking the physical sciences as their putative model, the practitioners of this approach eschew metaphysical speculation and work exclusively with hard, preferably numerical, data, from which they seek to extract objective and quantitative laws. Thus Jensen deduces from statistical analyses of IQ test scores that 80 % of the variance in these scores is attributable to genetic differences. By exposing in some detail the logical and methodological fallacies underlying Jensen’s analysis, I hope to draw attention to the weaknesses inherent in the ‘operational’ approach that it exemplifies.
The irrelevance of heritability
Jensen’s central contention, and the basis for his and Herrnstein’s doctrines on education, race and society, is that the heritability of IQ is about .8. This means that about 80 % of the variation in IQ among (say) Americans of European descent is attributable to genetic factors. Other authors have made other estimates of the heritability of IQ - some higher, some considerably lower than .8. In the following pages I shall try to explain why all such estimates are unscientific and indeed meaningless. But before we embark on a discussion of heritability theory and its applicability to human intelligence, it is worth noticing that, even if Jensen’s central contention were meaningful and valid, it would not have the implications that he and others have drawn from it. Suppose for the sake of the argument that IQ was a measure of some metric trait like height, and that it had a high heritability. This
David Layzer
would mean that under prevailing developmental conditions, variations in IQ are
due largely to genetic differences between individuals. It would tell us nothing, however, about what might happen under different developmental conditions. Suppose - to take a more concrete example than IQ - that a hypothetical population of first-graders raised in identical environments has been taught to read by method A. Measured differences in their reading ability will then be attributable largely to genetic differences. If method B had been used instead of method A, the differences in reading ability would still have been attributable largely to genetic factors, but both the individual scores on a test of reading ability and even their rank order might have been quite different, since it is well known that different methods of teaching reading suit different children. Thus, the heritability of such scores tells us nothing about the educability of the children being tested. To conclude, as Jensen and Herrnstein have done, that children with low IQ’s have a relatively low capacity for acquiring certain cognitive skills is to assume either that these skills cannot be taught at all or that, insofar as they can be taught, they have been taught equally well to all children.1
What does the alleged high heritability of IQ imply about genetic differences between ethnic groups? The answer to this question is unequivocal: nothing. Geneticists have been pointing out for well over half a century that it is meaningless to try to separate genetic and environmental contributions to measured differences between different stocks bred under different developmental conditions.2 Between ethnic groups, as between socioeconomic groups, there are systematic differences in developmental conditions (physical, cultural, linguistic, etc.) known to influence performance on IQ tests substantially.
Since we have no way of correcting test scores for these differences, the only objectively correct statement that can be made on this subject is the following: - ‘The reported differences in average IQ tell us nothing whatever about any average genetic differences that may exist. On the data, black genetic superiority in intelligence (or whatever it is that IQ tests measure) is
1. Richard C. Lewontin (1970) has drawn attention to an ironical aspect of this assumption - ‘Jensen’s article puts the blame for the failure of his science [educational psychology] not on the scientists but on the children. According to him, it is not that his science and its practitioners have failed utterly to understand human motivation, behavior and development but simply that the damn kids are ineducable. . . . Jensen proposes . . . that, in the terms of his metaphor, fallen bridges be taken as evidence of the unbridgeability of rivers. The alternative explanation, that educational psychology is still in the seventeenth century, is apparently not part of his philosophy.’
2. A beautiful extended example illustrating this point is given by Lewontin (1970). See also Waddington (1957), pp. 92-94, who quotes an exceptionally clear argument by Hogben (1933).
neither more nor less likely than white superiority.’ 3 If we ultimately succeed in building a color-blind society, then and only then will we be able to estimate, in retrospect, how great the systematic effects of racial prejudice really were. As S. L. Washburn (quoted by Lerner [1968]) has said, I am sometimes surprised to hear it stated that if Negroes were given an equal opportunity, their IQ would be the same as the whites’. If one looks at the degree of social discrimination against Negroes and their lack of education, and also takes into account the tremendous amount of overlapping between the observed IQ’s of both, one can make an equally good case that, given a comparable chance to that of the whites, their IQ’s would test out ahead. Of course, it would be absolutely unimportant in a democratic society if this were to be true, because the vast majority of individuals of both groups would be of comparable intelligence, whatever the mean of these intelligence tests would show. To sum up, even if Jensen’s considerations of the heritability of IQ were meaningful and valid, they would have no direct bearing on the question of educability or on the issue of genetic differences between ethnic groups. Their apparent relevance is a result of semantic confusion. In ordinary usage, when we speak of a highly heritable trait we mean one that is largely inborn. In genetics, however, a trait can have high heritability either because its expression is insensitive to environmental variation or because the range of relevant environmental variation happens to be small. Jensen and Herrnstein apparently assume that the first of these alternatives is appropriate for IQ. But the available experimental evidence, some of which is cited later in this article, shows that IQ scores are in fact highly sensitive to variations in relevant developmental conditions.
3. Herrnstein appears to have misunderstood this point: he writes that the reported differences between ethnic groups could be ‘more genetic, less genetic, or precisely as genetic as implied by a heritability of .8’.
Science and scientism: A question of methodology
The theory of heritability, some elementary aspects of which are described below, was developed by geneticists within a well-defined biological context. The theory applies to metric characters of plants and animals - height, weight and the like. To apply this theory to human intelligence, Jensen and the authors whose work he summarizes must assimilate intelligence to a metric character and IQ to a measurement of that character. Most biologists would, I think, hesitate to take this conceptual leap. Jensen, however, justifies it on the following philosophical grounds: Disagreements and arguments can perhaps be forestalled if we take an operational stance. First of all, this means that probably the most important fact about intelligence is that we can measure it. Intelligence, like electricity, is easier to measure than to define. And if the measurements bear some systematic relationships to other data, it means we can make meaningful statements about the phenomenon we are measuring. There is no point in arguing the question to which there is no answer, the question of what intelligence really is. The best we can do is obtain measurements of certain kinds of behavior and look at their relationships to other phenomena and see if these relationships make any kind of sense and order. It is from these orderly relationships that we gain some understanding of the phenomena. The ‘operational stance’ recommended by Jensen is thought by many social scientists to be the key ingredient in the ‘scientific method’ as practised by physical scientists. This belief is mistaken. The first and most crucial step toward an understanding of any natural phenomenon is not measurement. One must begin by deciding which aspects of the phenomenon are worth examining. To do this intelligently, one needs to have, at the very outset, some kind of explanatory or interpretive framework. In the physical sciences this framework often takes the form of a mathematical theory. The quantities that enter into theory - mass, electric charge, force, and so on - are always much easier to define than to measure. They are, in fact, completely - if implicitly - defined through the equations that make up the theory. Once a mathematical theory has been formulated, its predictions can be compared with observation or experiment. This requires appropriate measurements.
The aspect of scientific measurements that non-scientists most often fail to appreciate is that they always presuppose a theoretical framework. Even exploratory measurements, carried out before one has a definite theory to test, always refer to quantities that are precisely defined within a broader theoretical context. (For example, although we do not yet have a theory for the origin of cosmic rays, we know that such a theory must involve the masses, energies, momenta and charges of cosmic-ray particles. In designing apparatus to measure these quantities, physicists use well-established mathematical theories that describe the behavior of fast particles under a wide variety of conditions.) The theoretical framework for a given set of measurements may be wrong, in which case the measurements will ultimately lead to inconsistencies, but it must not be vague. In short, significant measurements usually grow from theories, not vice versa. Jensen’s views on scientific method derive not from the practice of physical scientists but from the philosophical doctrine of Francis Bacon (1561-1626) who taught that meaningful generalizations emerge spontaneously from systematic measurements.
These considerations apply equally to biology, where mathematical theories do not yet occupy the commanding position they do in the physical sciences. The following criticism by C. H. Waddington (1957) of conventional applications of the heritability theory is illuminating: . . . There has been a tendency to regard a refined statistical analysis of incomplete experiments as obviating the necessity to carry the experiments further and to design them in more penetrating fashion. For instance, if one takes some particular phenotypic character such as body weight or milk yield, one of the first steps in an analysis of its genetic basis should be to try to break down the underlying physiological systems into a number of more or less independent factors. Are some genes affecting the milk yield by increasing the quantity of secreting tissue, others by affecting the efficiency of secretion, and others in still other ways? These views contrast sharply with those of Jensen and Herrnstein, who believe in the possibility of discovering meaningful relations between measurable aspects of human behavior without inquiring too closely into the biological or psychological significance of that behavior. In this way they hope to avoid ‘metaphysical’ speculation. This is an admirable objective. But it is not so easy to operate without a conceptual framework. As we shall see, what Jensen and Herrnstein have in fact done is not to dispense with metaphysical assumptions but to dispense with stating them. Such a policy is especially dangerous in the social sciences, where experimental verification of hypotheses is usually difficult or impossible. As Gunnar Myrdal has wisely pointed out, the failure of the social sciences to achieve the same degree of objectivity as the natural sciences can be attributed at least as much to a persistent neglect on the part of social scientists to state and examine their basic assumptions as to the complexity of the phenomena they deal with.
The operational approach not only spares Jensen the task of trying to understand the nature of intelligence. It also enables him to draw an extremely powerful conclusion from statistical analyses of IQ test scores: Regardless of what it is that our tests measure, the heritability tells us how much of the variance in these measurements is due to genetic factors. Because this assertion holds the key to Jensen’s entire argument, we shall analyze it in some detail.
Heritability
In the statement just quoted, Jensen uses the term heritability in a specific technical sense that must be elucidated before the statement can be analyzed. Suppose that
we have measured an individual character like height or weight within a given population. The two most fundamental statistical properties of a character are its mean and its variance. The mean is the average of the measurements; the variance is the average of the squared differences between the individual measurements and the mean. The variance is the most convenient single measure of the spread of individual measurements within a population. Now, this spread results partly from genetic and partly from nongenetic causes. But this does not mean, nor is it true in general, that a definite fraction of the spread, as measured by the variance, can be attributed to genetic factors and the rest to nongenetic factors. The variance splits up into separate genetic and nongenetic parts only if the variable part of each measurement can be expressed as the sum of statistically independent genetic and nongenetic contributions - that is, only if variations of the relevant genetic and nongenetic factors contribute additively and independently to the character in question. (A criterion for statistical independence will be given later.) In this case the genetic fraction or percentage of the variance is called the heritability. Characters like eye color and blood type, which are entirely genetically determined, have heritability 1. In general, however, the heritability of a character depends on the population considered and on the range of relevant nongenetic factors. Reducing this range always increases the heritability because it increases the relative importance of the genetic contribution to the variance. It is not easy to find realistic examples of metric characters affected independently by genetic and nongenetic factors. Human height is a possible, though not a proven, example, provided we restrict ourselves to ethnically homogeneous populations. 
Giraffe height, on the other hand, is a counterexample, since a giraffe’s nutritional opportunities may depend strongly on his genetic endowment. Human weight is another counterexample: on a given diet one person may gain weight while another loses weight. Let us suppose, however, that we have reason to believe that variations of a given character are in fact the sum of independent genetic and environmental contributions. To calculate the heritability we need to be able to estimate either the genetic or the environmental contribution to the variance. This can be done if, for example, the population contains a large number of split pairs of one-egg twins. By a split pair I mean one whose members have been separated since birth and reared in randomly selected, statistically uncorrelated environments. All observable differences between such twins are environmental in origin, and the environmental differences are, by assumption, representative of those between individuals selected at random from the reference population. If, in addition, the genotypes of the twins are representative of those in the population as a whole, then, using elementary statistical techniques, one can derive separate estimates for the genetic and environmental contributions to the variance
of any metric character that satisfies the assumptions of additivity and independence. The same calculations serve to check these assumptions. If a suitably representative population of split twin-pairs is not available, one can carry out a similar but slightly more complicated analysis using pairs of genetically related individuals. In this case, however, one needs to know what degree of statistical correlation between the genetic contributions to a given character results from a given degree of genetic relationship. This information is available only for relatively simple characters such as those studied by Mendel in his classic experiments. For most characters of interest to students of animal genetics, the necessary information must be supplied by admittedly oversimplified theoretical considerations. Where human characters are concerned, the fact that mating patterns are both uncontrolled and nonrandom introduces a further source of uncertainty into the calculation. Although geneticists can often carry out carefully controlled experiments involving known variations in genetic and environmental factors, the lack of reliable theoretical information concerning the genetic basis of complex characters makes the concept of heritability less useful than one might at first sight suppose. In poultry, for example, the heritabilities of such economically important characters as adult body weight, egg weight, shell thickness, etc., have been repeatedly estimated. Yet for most such characters the estimates span a considerable range - sometimes as great as 50 % (Lerner, 1968). Again, estimates of the heritability of milk yield in dairy cattle range from 25 % to 90 %. This spread does not result from random errors in individual estimates but from the fact that different methods, which in theory ought to be equivalent, yield systematically different heritability estimates.
As Waddington (1957) has remarked in a similar context, ‘The statistical techniques available [for the analysis of heritability], although imposing and indeed intimidating to most biologists, are in fact very weak and unhandy tools.’ The assumption that genetic and environmental factors contribute additively and independently to a phenotypic character is, on general grounds, highly suspect. From a purely mathematical point of view, additivity is an exceedingly special property. Moreover, a character that happens to have this property when measured on one scale would lose it under a nonlinear transformation to a different scale of measurement. Additivity is therefore a plausible postulate only when there exists some specific biological justification for it. For complex animal characters there is little reason to expect additivity and independence to prevail. On the contrary, such characters usually reflect a complicated developmental process in which genetic and environmental factors are inextricably mingled. It is easy enough to produce more general mathematical models in which genetic and environmental factors contribute nonadditively and nonindependently to the
expression of a character. The difficulty with such models is that they are too flexible to be useful. The available statistical data do not suffice to evaluate the parameters needed to specify the model. Thus in the absence of a deeper understanding of the genetic and developmental factors affecting complex animal characters, the theory of heritability must operate within a severely restricted range.
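The definition of heritability given in this section can be made concrete with a small simulation. What follows is only an illustrative sketch, not a method taken from the article: the function names, the Gaussian form of the contributions, the sample size and the multiplicative model used for the nonadditive case are all assumptions chosen for the example. It shows that, when the genetic and nongenetic contributions are additive and independent, the variance splits into two parts, that narrowing the range of nongenetic variation raises the genetic fraction, and that a nonadditive character leaves a remainder that belongs to neither part.

```python
import random
import statistics

def simulate(n, env_sd, seed=0):
    """Draw n hypothetical individuals with independent genetic (g) and
    environmental (e) contributions. All numbers are illustrative assumptions."""
    rng = random.Random(seed)
    g = [rng.gauss(0.0, 1.0) for _ in range(n)]     # genetic part, variance about 1
    e = [rng.gauss(0.0, env_sd) for _ in range(n)]  # environmental part
    return g, e

def heritability(g, e):
    """Genetic fraction of the variance for an additive character p = g + e."""
    p = [gi + ei for gi, ei in zip(g, e)]
    return statistics.pvariance(g) / statistics.pvariance(p)

def additivity_gap(p, g, e):
    """How far var(p) is from var(g) + var(e); near zero only if the
    contributions are additive and (nearly) uncorrelated."""
    return abs(statistics.pvariance(p)
               - statistics.pvariance(g) - statistics.pvariance(e))

# Wide range of environments: genetic and environmental spreads are equal.
g, e = simulate(20000, env_sd=1.0)
h_wide = heritability(g, e)

# Narrower range of environments, same genotypes: heritability goes up.
g2, e2 = simulate(20000, env_sd=0.5)
h_narrow = heritability(g2, e2)

# Additive character: the variance splits (gap close to zero).
p_add = [gi + ei for gi, ei in zip(g, e)]
gap_add = additivity_gap(p_add, g, e)

# Nonadditive (here multiplicative) character: the variance does not split.
p_mult = [(1.0 + gi) * (1.0 + ei) for gi, ei in zip(g, e)]
gap_mult = additivity_gap(p_mult, g, e)

print(round(h_wide, 2), round(h_narrow, 2), round(gap_add, 3), round(gap_mult, 2))
```

In the multiplicative case there is no single number that could be called the genetic fraction of the variance, which is the sense in which heritability is defined only for characters with the special additive structure.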
IQ as a measure of intelligence
We are now ready to analyze the key assertion quoted earlier: - ‘Regardless of what it is that our tests measure, the heritability tells us how much of the variance of these measurements is due to genetic factors.’ Implicit in this statement are two distinct assumptions: that IQ is a phenotypic character having the mathematical structure (additivity and independence of the genetic and environmental contributions) presupposed by the theory of heritability; and that - assuming this condition to be fulfilled - the heritability of IQ can be estimated from existing data. Now, the IQ data that Jensen and others have analyzed were gathered in eight countries and four continents, over a period of 50 years, by investigators using a wide variety of mental tests and testing procedures. Geneticists and other natural scientists who make conventional scientific measurements under controlled conditions know from bitter experience how wayward and recalcitrant, how insensitive to the needs and wishes of theoreticians, such measurements can be. Their experience hardly leads one to anticipate that the results of mental tests constructed in accordance with unformulated, subjective and largely arbitrary criteria possess the special mathematical structure needed to define heritability. It is difficult to imagine how this happy result could have been achieved except through the operation of collective serendipity on a scale unprecedented in the annals of science. Nevertheless, let us examine the case on its merits. At the very outset we have to ask, is IQ a valid measure of intelligence? Jensen and Herrnstein assure us that it is. ‘The most important fact about intelligence is that we can measure it,’ says Jensen, while Herrnstein remarks that the ‘objective measurement of intelligence’ is psychology’s ‘most telling accomplishment.’ I find these claims difficult to understand. 
To begin with, the ‘objective measurement’ does not belong to the same logical category as what it purports to measure. IQ does not measure an individual phenotypic character like height or weight; it is a measure of the rank order or relative standing of test scores in a given population. Thus the statement, ‘A has an IQ of 100’ means that half the members of a certain reference population scored lower than A on a certain set of tests and half scored higher. ‘B has an IQ of 115’ means that about 84 % of the reference population scored lower than B and about 16 % higher, and so on. (IQ tests are so constructed that the frequency distribution of test scores in the reference population conforms as closely as possible to a normal distribution - the familiar bell-shaped curve - centered on the value of 100 and having a half-width or standard deviation [the square root of the variance] of 15 points.) To call IQ a measure of intelligence conforms neither to ordinary educated usage nor to elementary logic. One might perhaps be tempted to dismiss this objection as a mere logical quibble. If IQ itself belongs to the wrong logical category to be a measure of intelligence, why not use actual test scores? One difficulty with this proposal is the multiplicity and diversity of mental tests, all with equally valid claims. (This is part of the price that must be paid for a strictly ‘operational’ definition of intelligence.) Even if one were to decide quite arbitrarily to subscribe to a particular brand of mental test, one would still need to administer different versions of it to different age groups. An appearance of uniformity is secured only by forcing the results of each test to fit the same Procrustean bed (the normal distribution). But this mathematical operation cannot convert an index of rank order on tests having an unspecified and largely arbitrary content into an ‘objective measure of intelligence.’ Even Burt (1956), a convinced hereditarian whose work forms the mainstay of Jensen’s technical argument, recognized this difficulty. ‘Differences in this hypothetical ability [intelligence],’ he wrote, ‘cannot be directly measured. We can, however, systematically observe relevant aspects of the child’s behavior and record his performances on standardized tests; and in this way we can usually arrive at a reasonably reliable and valid estimate of his “intelligence” in the sense defined.’ (Emphasis added.
Earlier in the paper cited above, Burt defines intelligence as ‘an innate, general, cognitive factor.’) Burt’s conviction that intelligence cannot be directly or objectively measured - a conviction bred by over half a century of active observation - profoundly influenced his practical approach to the problem. In assessing children’s intelligence, Burt and his assistants used group tests, but also relied heavily on the subjective impressions of teachers. When a discrepancy arose between a teacher’s assessment and the results of group tests, the child was retested individually, if necessary more than once. Burt’s final assessments may be ‘reliable and valid,’ as he claims, but they are certainly not objective, nor did he consider them to be so.
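Because IQ is constructed as a statement about relative standing, any IQ value can be translated back into the fraction of the reference population that scored lower. The sketch below is offered only as an illustration, assuming the construction described in this section (scores forced onto a normal curve with mean 100 and standard deviation 15); the function name `fraction_below` is hypothetical.

```python
import math

def fraction_below(iq, mean=100.0, sd=15.0):
    """Fraction of the reference population scoring below a given IQ,
    assuming scores were forced onto a normal curve (mean 100, SD 15)."""
    z = (iq - mean) / sd
    # Standard normal cumulative distribution, written with the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

for iq in (85, 100, 115, 130):
    print(iq, round(fraction_below(iq), 3))
```

The calculation uses nothing about the content of the tests: it reads off rank order and nothing else.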
TQ and tentacle length
The fact that IQ cannot, for purely logical reasons, be an objective measure of intelligence (or of any other individual characteristic) does not automatically invalidate Jensen’s arguments concerning heritability. Rank order on a mental test
could still be, as Burt suggested, an indirect measure of intelligence. To illustrate this point, suppose that members of a superintelligent race of octopuses, unable to construct rigid measuring rods but versed in statistical techniques, wished to measure tentacle length. Through appropriate tests of performance they might be able to establish rank order of tentacle length in individual age groups. By forcing the frequency distribution of rank order in each group to fit a normal distribution with mean 100 and standard deviation 15, they would arrive at a TQ (tentacle quotient) for each octopus. In all probability, differences in TQ would turn out to be closely proportional to differences in actual tentacle length within a given age group, though the factor of proportionality would vary in an unknown way from one group to another. Thus our hypothetical race of octopuses would be able to infer relative tentacle length within an age group from information about rank order. This inference evidently hinges on the assumption that tentacle length, which the octopuses cannot measure directly, is in reality normally distributed within each age group.
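The octopuses’ procedure can be sketched in a few lines. This is a hypothetical illustration rather than anything specified in the text: the function name `rank_quotients`, the mid-rank convention for turning ranks into percentiles, the assumption that there are no ties, and the sample numbers are all choices made for the example.

```python
from statistics import NormalDist

def rank_quotients(scores, mean=100.0, sd=15.0):
    """Turn raw performance scores into quotients by forcing their rank
    order onto a normal curve. Assumes no tied scores."""
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i])
    target = NormalDist(mean, sd)
    quotients = [0.0] * n
    for rank, i in enumerate(order):
        # Mid-rank percentile keeps the argument strictly between 0 and 1.
        quotients[i] = target.inv_cdf((rank + 0.5) / n)
    return quotients

# Hypothetical performance scores for five octopuses in one age group.
performance = [12.0, 30.0, 18.0, 25.0, 21.0]
tq = rank_quotients(performance)

# Any order-preserving change of the underlying scale gives the same TQs.
tq_squared = rank_quotients([x ** 2 for x in performance])

print([round(q, 1) for q in tq])
print(tq == tq_squared)
```

Since the quotients are unchanged by any order-preserving distortion of the scale, they carry information about actual tentacle length only to the extent that the length really is normally distributed within the age group.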
Some tacit assumptions unmasked and analyzed
Similarly, the inference that IQ is a measure of intelligence depends on certain assumptions, namely: (a) that there exists an underlying one-dimensional, metric character related to IQ in a one-to-one way, as tentacle length is related to TQ, and (b) that the values assumed by this character in a suitable reference population are normally distributed. If these assumptions do not in themselves constitute a theory of human intelligence, they severely restrict the range of possible theories. Once again we see that the ‘operational stance,’ though motivated by a laudable desire to avoid theoretical judgments, cannot in fact dispense with them. The choice between a theoretical approach and an empirical one is illusory; we can only choose between explicit theory and implicit theory. But let us examine the assumptions on their own merits. The first assumption is pure metaphysics. Assertions about the existence of unobservable properties cannot be proved or disproved; their acceptance demands an act of faith. Let us perform this act, however - at least provisionally - so that we can examine the second assumption, which asserts that the underlying metric character postulated in the first assumption is normally distributed in suitably chosen reference populations. Why normally distributed? A possible answer to this question is suggested by a remark quoted by the great French mathematician Henri Poincaré: ‘Everybody believes in the [normal distribution]: the experimenters because they think it can be proved by mathematics, the mathematicians because it has been established by observation.’ Nowadays both experimenters and mathematicians
know better. Generally speaking, we should expect to find a normal frequency distribution when the variable part of the measurements in question can be expressed as the sum of many individually small, mutually independent, variable contributions. This is thought to be the case for a number of metric characters of animals such as birth weight in cattle, staple length of wool, and (perhaps) tentacle length in octopuses. It is not the case, on the other hand, for measurements of most kinds of skill or proficiency. Golf scores, for example, are not likely to be normally distributed because proficiency in golf does not result from the combined action of a large number of individually small and mutually independent factors. What about mental ability? Jensen and Herrnstein believe that insight into its nature can be gained by studying the ways in which people have tried to measure it. Jensen argues that because different mental tests agree moderately well among themselves, they must be probing a common factor (Spearman’s g). Some tests, says Jensen, are ‘heavily loaded with g’, others not so heavily loaded. Thus g is something like the pork in cans labelled ‘pork and beans.’ Herrnstein takes a less metaphysical line. Since intelligence is what intelligence tests measure, he argues, what needs to be decided is what we want intelligence tests to measure. This is to be decided by ‘subjective judgment’ based on ‘common expectations’ as to the ‘instrument’. ‘In the case of intelligence, common expectations center around the common purposes of intelligence testing - predicting success in school, suitability for various occupations, intellectual achievement in life.’ Thus Herrnstein defines intelligence ‘instrumentally’ as the attribute that successfully predicts success in enterprises whose success is commonly believed to depend strongly on . . . intelligence.
That is, intelligence is what is measured by tests that successfully predict success in enterprises whose success is commonly believed to depend strongly on what is measured by tests that successfully predict success in enterprises whose success is commonly believed to depend strongly on . . . Whatever the philosophical merits of the definitions offered by Jensen and Herrnstein, they afford little insight into the question at hand: Does intelligence depend on genetic and environmental factors in the manner required by heritability theory? In other words, is the heritability of intelligence a meaningful concept? To pursue this question we must go outside the theoretical framework of Jensen’s discussion.
Intelligence defined; cognitive development
Many modern workers believe that intelligence can usefully be defined as information-processing ability. As a physical scientist, I find this definition irresistible. To begin with, it permits us to distinguish as many qualitatively different kinds of
information as we may find it useful to do. Moreover, because information is a precisely defined mathematical concept, there is no obvious reason why it should not be possible to devise practical methods for reliably measuring the ability to process it. (In its broadest sense information-processing involves problem-solving as well as the extraction and rearrangement of data.) Whether or not such tests would be accurate predictors of ‘success’ I do not know. They could, however, be usefully employed in assessing the effectiveness of teachers, educational procedures and curricula.
Information-processing skills, like other skills, are not innate, but develop over the course of time. What is the nature of this development? Consider such complex skills as skiing or playing the piano. In order to acquire an advanced technique one must acquire in succession a number of intermediate techniques. Each of these enables one to perform competently at a certain level of difficulty, and each must be thoroughly mastered before one can pass to the next level. The passage to a higher level always involves the mastery of qualitatively new techniques. Through systematic observations carried out over half a century with the help of numerous collaborators, Jean Piaget (1952) has demonstrated that basic cognitive structures also develop in this way, and he has traced the development of a great many of these structures in meticulous detail. Each new structure is always more highly organized and more differentiated than its predecessor. At the same time it is more adequate to a specific environmental challenge.
The intermediate stages in the development of a given structure are not rigidly predetermined (there are many different ways of learning to read or ski or play the piano), nor is the rate at which an individual passes through them, but in every case cognitive development follows two basic rules (Piaget 1967): ‘Every genesis emanates from a structure and culminates in another structure. Conversely, every structure has a genesis.’ Cognitive development may be compared to the building of a house. Logic and the laws of physics demand that the various stages be completed in a definite order: the foundations before the frame, the frame before the walls, the walls before the roof. The finished product will depend no doubt on the skill of the builder and on the available materials, but it will also reflect the builder’s intentions and the nature of the environmental challenge. Similarly, although cognitive development is undoubtedly strongly influenced by genetic factors, it represents an adaptation of the human organism to its environment and must therefore be strongly influenced by the nature of the environmental challenge. Thus we may expect cultural factors to play an important part in shaping all the higher cognitive skills, for the environmental challenges that are relevant to these skills are largely determined by cultural context.
Genetic-environmental interaction
If intelligence, or at least its potentially measurable aspects, can be identified with information-processing skills and if the preceding very rough account of how these skills develop is substantially correct, then it seems highly unlikely that scores achieved on mental tests can have the mathematical properties that we have been discussing - properties needed to make ‘heritability of IQ’ a meaningful concept. The information-processing skills assessed by mental tests result from developmental processes in which genetic and nongenetic factors interact continuously. The more relevant a given task is to an individual’s specific environmental challenges, the more important are the effects of this interaction. Thus a child growing up in circumstances that provide motivation, reward and opportunity for the acquisition of verbal skills will achieve a higher level of verbal proficiency than his twin reared in an environment hostile to this kind of development. Even if two genetically unlike individuals grow up in the same circumstances - for example, two-egg twins reared together - we cannot assume (as Jensen, Herrnstein and other hereditarians usually do) that the relevant nongenetic factors are the same for both. If one twin has greater verbal aptitude or is more strongly motivated to acquire verbal skills (usually the two factors go together), he will devote more time and effort to this kind of learning than his twin. Thus differences between scores on tests of verbal proficiency will not reflect genetic differences only, but also - perhaps predominantly - differences between the ways in which the genetic endowments of the twins have interacted with their common environment. One might be tempted to classify these interactive contributions to developed skills as genetic, on the grounds that they are not purely environmental and that the genetic factor in the interaction plays the active role. 
In technical discussions, however, common sense must accommodate itself to definitions and conventions laid down at the outset. If we redraw the line that separates genetic and nongenetic factors we must formulate a new theory of inheritance; if we wish to use the existing theory we must stick to the definitions that it presupposes.
Do the IQ data fit the theory of heritability?
Up till now we have concerned ourselves with the first of the two implicit assumptions underlying Jensen’s key assertion about the interpretation of heritability estimates, namely, the assumption that intelligence is a phenotypic character to which the theory of heritability can be applied. We have found no plausible grounds for supposing that genetic and environmental factors contribute to the development
of intelligence in the simple way required by heritability theory. To this objection Jensen and Herrnstein might reply as follows: ‘Discussions of “meaning”, “mathematical structure” and “logical categories” are irrelevant. IQ test scores and the statistical quantities that can be derived from them (means, variances and correlations) are hard data. There is nothing to prevent us from applying heritability theory to these data and seeing whether or not it fits. If the theory does fit, we may reasonably assume that it applies to these data, whatever their provenance.’
This argument has some validity. If the heritability theory did apply to IQ test scores and the statistics derived from them, these statistics would simultaneously satisfy a large number of numerical relations, like the steel girders composing a complex rigid structure. Conversely, if all these relations were indeed accurately satisfied by the statistical data, we would have good reason to suppose that the theory applied to them in spite of a priori arguments to the contrary. Yet such arguments do serve an important purpose: they help us to decide how good the evidence must be to convince us that the theory really does fit the data. Suppose that an astronomer, having made several observations of a newly discovered planet, tries to determine its orbit using Newton’s theory of gravitation. If he applies the theory correctly, he can be virtually certain that any discrepancy between the theory and his observations results from observational error: previous experience has firmly established Newton’s theory and its applicability to the motion of planets. When the validity or applicability of a theory is not so well established, however, it may not be easy to decide how much of the discrepancy between theory and observation to attribute to experimental error and how much to error or incompleteness in the theoretical description.
In such cases a competent scientist bases his judgment on all the relevant theoretical and observational information available to him. If that information strongly suggests that a given theory does not apply to given data, he will demand highly convincing evidence of internal consistency before taking seriously claims based on a statistical analysis of the data. Strangely enough, Jensen and Herrnstein offer little evidence of this kind, and what evidence they do offer is, as we shall see, specious.
The data that Jensen, Burt and other hereditarians have analyzed consist mainly of correlations between the measured IQ’s of more or less closely related persons living in more or less similar environments. Before examining these data, let us recall what statisticians mean by the term correlation. Suppose we have measured the heights and weights of twelve-year olds in a certain school. Height and weight are said to be strongly correlated if their rank orders are very similar. A perfect correlation is assigned the value +1, a null correlation - between two statistically independent sets of measurements - the value 0. (This property defines statistically independent measurements.) Two paired
sets of numbers have a large negative correlation if the rank order of one closely resembles the reversed rank order of the second.⁴ IQ correlations between genetically related individuals are measured in exactly the same way as the correlation between height and weight in our example. To measure the correlation between the IQ’s of grandparents and grandchildren in a given sample, one could follow the above recipe, substituting the grandchild’s IQ for (say) height and the grandparent’s IQ for weight. Such studies have usually shown that the IQ’s of more closely related people tend to be more highly correlated than those of less closely related people; also that the IQ’s of children growing up in similar circumstances tend to be more highly correlated than those of children growing up in dissimilar circumstances. For pairs of children reared together, the measured correlations increase systematically with increasing genetic similarity. Thus the IQ’s of one-egg (identical) twins tend to be more highly correlated than those of two-egg (fraternal) twins, which in turn tend to be more highly correlated than the IQ’s of unrelated children reared together. These findings show that IQ is strongly influenced by both genetic and environmental factors. Can we disentangle these factors? ‘By evaluating the total evidence,’ writes Herrnstein, ‘and by a procedure too technical to explain here, Jensen concluded (as have most of the other experts in the field) that the genetic factor is worth about 80 % and that only 20 % is left to everything else . . .’ As summarized by Herrnstein, the evidence on which this conclusion rests seems quite impressive. Herrnstein compares the ‘actual’ values of IQ correlations between relatives with ‘theoretical’ correlations calculated on the assumption that nongenetic effects on IQ are negligible.
In every case the agreement seems to be very close:
Uncle’s (or aunt’s) IQ should, by the genes alone, correlate with nephew’s (or niece’s) by a value of 31 %; the actual value is 34 %. The correlation between grandparent and grandchild should, on genetic grounds alone, also be 31 %, whereas the actual correlation is 27 %, again a small discrepancy. And finally for this brief survey, the predicted correlation between parent and child, by genes alone, is 49 %, whereas the actual correlation is 50 % using the parents’ adult IQ and 56 % using the parents’ childhood IQ’s - in either case too small a difference to quibble about.
But let us take a closer look. What does Herrnstein mean by the word ‘actual’ in the passage just quoted? ‘The foregoing figures’, he writes, ‘are lifted directly out of Jensen’s famous article, figures that he himself culled from the literature on intelligence testing.’ Referring to Jensen’s article, we do indeed find the figures quoted by Herrnstein, in a column headed ‘obtained median correlation.’ How is the ‘median’ correlation related to the ‘actual’ correlation? Can we assert that the actual value of a quantity lies close to the median of several measurements of that quantity? As every working scientist knows, the answer to this question is, No, not in general. All that can be said is that, in the absence of systematic errors, the actual value is likely to lie within a range of values comparable to the range spanned by the actual measurements (if there are enough of them). What ranges do the IQ correlations span? Jensen’s paper does not supply this important information. His table, however, is adapted from one given by Burt, who, in order to compare his own correlation measurements with those of previous investigators, tabulated the medians of correlation measurements collected by Erlenmeyer-Kimling and Jarvik (1964). Burt did not display the actual ranges of the measured correlations, but he did mention that several of them were large and gave one example: for siblings reared together, the correlations obtained in 55 studies range from .3 to .8 and are spread almost uniformly over the entire range. The correlations reported between parent and child in 11 studies - to give a second example - range from about .2 to about .8. But these figures still do not tell the whole story. What do the reported correlations actually mean? Each reported correlation refers to a particular sample and to a particular test or set of tests. How homogeneous are the samples with respect to nongenetic variables?
4. One calculates the numerical value of the correlation between height and weight from the following recipe. (1) Calculate the mean height and the variance in height, also the mean weight and the variance in weight. (2) Subtract from each child’s height the mean height and divide the result by the square root of the variance in height. The resulting number may be called the reduced deviation from the mean. Do the same thing for the weight. (3) Multiply corresponding reduced deviations in height and weight together, add up all the products and divide the result by the number of children in the sample. The resulting number is the height-weight correlation for this sample. Notice that it does not depend on what units (inches or centimeters, pounds or kilograms) we use to measure height and weight.
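The three-step recipe of footnote 4 can be written out directly. The sketch below follows the footnote step by step; the function name and the six pairs of heights and weights are hypothetical values invented for illustration, not data from any study.

```python
import math


def correlation(xs, ys):
    """Correlation computed by the recipe of footnote 4."""
    n = len(xs)
    # Step 1: means and variances of both sets of measurements.
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs) / n
    var_y = sum((y - mean_y) ** 2 for y in ys) / n
    # Step 2: reduced deviations from the mean.
    reduced_x = [(x - mean_x) / math.sqrt(var_x) for x in xs]
    reduced_y = [(y - mean_y) / math.sqrt(var_y) for y in ys]
    # Step 3: average of the products of corresponding reduced deviations.
    return sum(a * b for a, b in zip(reduced_x, reduced_y)) / n


# Hypothetical heights (inches) and weights (pounds) of six children.
heights_in = [56, 58, 59, 60, 61, 63]
weights_lb = [75, 84, 82, 90, 95, 104]

r_imperial = correlation(heights_in, weights_lb)

# The same children measured in centimeters and kilograms.
heights_cm = [h * 2.54 for h in heights_in]
weights_kg = [w * 0.4536 for w in weights_lb]
r_metric = correlation(heights_cm, weights_kg)

print(f"correlation (inches, pounds):  {r_imperial:.4f}")
print(f"correlation (cm, kilograms):   {r_metric:.4f}")
```

The two printed values agree, which is the unit-independence noted at the end of the footnote. A set of measurements correlated with itself gives +1, and with its own negative gives -1.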
How meaningful is it to combine correlations referring to population samples and tests differing in unspecified and unknown ways? If answers to these hard but important questions are to be found, it will be in the primary sources. As we move farther and farther away from these sources, the errors and uncertainties in the data become less and less noticeable, until at last, in the pages of The Atlantic Monthly, only the ‘actual values’ remain. Like the reputations of saints, scientific data often improve with transmission. The ‘theoretical correlations’ quoted by Herrnstein have undergone a similar transformation. The theory in question is not the one usually applied by geneticists to estimate the heritability of polygenic characteristics, but a modified version of that theory especially devised by Burt and Howard (1956) to improve the agreement with their correlation data - data based, incidentally, on Burt’s semi-objective assessments of intelligence.
What, then, can we infer from the data on IQ correlations? If genetic factors did not appreciably influence IQ, we would expect to find no appreciable differences between the IQ correlations of one-egg twins, two-egg twins and unrelated children reared in the same home. In fact, the measured correlations tend to be greater for one-egg twins than for two-egg twins, and greater for two-egg twins and siblings than for unrelated children reared in the same home - although the ranges overlap considerably. This indicates that genetic factors undoubtedly do influence IQ significantly - but not necessarily in the manner presupposed by the heritability theory. The internal consistency of the reported data is far too low to lend credence to claims that IQ measurements have the mathematical structure required by that theory.
Other checks
The internal consistency of the theory as applied to IQ correlation data can be checked in other ways. For example, Fred S. Fehr (1969) has derived heritability estimates from these data in a number of different ways. If the underlying premises of the theory were valid, the various estimates would be very similar. Actually they range from .4 to .8, and exhibit other internal inconsistencies. This suggests that the data do not have the structure needed to ensure consistency between distinct but equivalent methods of calculating heritability.
As mentioned above, IQ correlations between one-egg twins have usually been found to be greater than those between two-egg twins, as would be expected if genetic factors significantly influence IQ (since one-egg twins have all their genes in common while two-egg twins, on the average, have only half their genes in common). A recent study by Sandra Scarr-Salapatek (1971b) throws an interesting light on these findings. Scarr-Salapatek analyzed aptitude-test scores of twins enrolled in the Philadelphia public schools in April 1968. She divided her sample of 3042 twins according to ethnic background (‘black’ and ‘white’) and socioeconomic status (‘advantaged’ and ‘disadvantaged’) and calculated IQ correlations separately within each of the resulting four groups as well as for the entire sample. She also calculated the correlations separately for same-sex pairs and opposite-sex pairs. Since one-egg twins are always of the same sex while two-egg twins are as likely to be of the same sex as of opposite sexes, one can derive from such data separate estimates of the correlations for one-egg and two-egg twins - provided the sample is large enough, as it is in this case. Scarr-Salapatek found that for the sample as a whole, as well as for the ‘black’ and ‘white’ samples separately, the one-egg correlations were, as expected, higher on all tests than the two-egg correlations. The differences were
especially marked for the ‘advantaged’ groups. For the ‘disadvantaged’ groups, on the other hand, the differences between the two correlations were much smaller. Indeed on some tests the correlations between one-egg twins were not significantly different from those between two-egg twins. This finding affords direct evidence for strong interaction between environmental and genetic factors. It suggests that bad environments can inhibit the expression of relevant genetic differences, while good environments bring out and even amplify such differences. If this interpretation is correct, why should a poor environment inhibit the expression of these particular genetic differences? One possible reason, I suggest, is that the skills assessed by the tests in question are less relevant to the environmental challenges faced by ghetto-dwellers than to those faced by suburbanites. If intelligence represents an adaptation to specific environmental challenges, a suburban child will have stronger incentives to develop certain cognitive skills than a ghetto child. These considerations would also explain the observation that the IQ’s of children living in urban ghettos and depressed rural areas tend to decrease as the children grow older. If we view the growth of intelligence as an adaptive process we would expect to observe exactly such an effect.
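The step from same-sex and opposite-sex correlations to separate one-egg and two-egg correlations can be sketched as follows. This is one simple way of carrying out such a decomposition, not necessarily the procedure used in the study. It assumes that two-egg pairs are equally often same-sex and opposite-sex, that the same-sex correlation is approximately a mixture of the one-egg and two-egg correlations weighted by their proportions, and that opposite-sex pairs give the two-egg correlation directly. The function names and all numbers are hypothetical.

```python
def one_egg_fraction(n_same_sex, n_opposite_sex):
    """Estimated fraction of one-egg pairs among same-sex pairs.

    Assumes two-egg pairs split evenly between same-sex and opposite-sex,
    so the number of two-egg same-sex pairs is about n_opposite_sex.
    """
    if n_same_sex <= n_opposite_sex:
        raise ValueError("need more same-sex than opposite-sex pairs")
    return (n_same_sex - n_opposite_sex) / n_same_sex


def estimate_twin_correlations(r_same_sex, r_opposite_sex,
                               n_same_sex, n_opposite_sex):
    """Return (r_one_egg, r_two_egg) under the mixture assumption
    r_same_sex = p * r_one_egg + (1 - p) * r_two_egg."""
    p = one_egg_fraction(n_same_sex, n_opposite_sex)
    r_two_egg = r_opposite_sex
    r_one_egg = (r_same_sex - (1 - p) * r_two_egg) / p
    return r_one_egg, r_two_egg


# Hypothetical counts and correlations, chosen only to show the arithmetic.
r_one_egg, r_two_egg = estimate_twin_correlations(
    r_same_sex=0.65, r_opposite_sex=0.50,
    n_same_sex=600, n_opposite_sex=300,
)
print(f"estimated one-egg correlation: {r_one_egg:.2f}")
print(f"estimated two-egg correlation: {r_two_egg:.2f}")
```

Because the estimate divides by the one-egg fraction, small samples or a small excess of same-sex pairs make it unstable, which is why the text stresses that the sample must be large enough.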
IQ and culture
This brings us to the important question of how cultural factors affect IQ correlations. Jensen’s views on this question are instructive. They are summarized in a passage that begins with the sentence we have already analyzed in some detail:
Regardless of what it is that our tests measure, the heritability tells us how much of the variance in these measurements is due to genetic factors. If the test scores get at nothing genetic, the result will simply be that estimates of their heritability will not differ significantly from zero. The fact that heritability estimates based on IQ’s differ significantly from zero is proof that genetic factors play a part in individual differences in IQ. To the extent that a test is not ‘culture-free’ or ‘culture-fair’ it will result in a lower heritability measurement.
Jensen is here making two further assertions: only genetic factors can raise heritability estimates, and cultural bias in a test always lowers the heritability of the results. In discussing these assertions we can avoid unnecessary confusion by regarding ‘heritability estimates’ as purely formal mathematical quantities defined in terms of measured correlations. One such ‘heritability estimate’, for example, is provided by the IQ correlation between one-egg twins reared separately, another is given by twice the difference between the IQ correlations of one-egg and two-egg twins, and
so on. As we have seen, different ‘estimates’ of this kind do not agree among themselves, and none of them can be interpreted as measuring the genetic fraction of IQ variance in a given population. The data gathered by Scarr-Salapatek on IQ correlations between one-egg and two-egg twins bear directly on the assertion that cultural bias always reduces ‘heritability estimates.’ At first sight her data seem to support this claim, for cultural bias has apparently reduced ‘heritability estimates’ for the disadvantaged subpopulation. But it has also increased ‘heritability estimates’ in the advantaged subpopulation. Clearly the term ‘bias’ is misleading. Its use presupposes the possibility of constructing an unbiased intelligence test. But if intelligence is a result of adaptation, this is impossible, except under entirely unrealistic assumptions about cultural homogeneity. An adequate test of intelligence for one cultural context must be inadequate for another. The idea that each of us is born with a certain abstract ‘capacity’ for, say, pattern recognition, which can be measured by an appropriate culture-free test (if only it could be found), is pure metaphysics.
Do cultural factors systematically affect performance on IQ tests? Hereditarians usually argue that such effects are probably small, because the skills assessed by IQ tests are not systematically taught in school. Hence, they argue, all children have the same opportunity to learn them. It is (unfortunately) true that many of these skills are not taught effectively in schools, but no one who has actually visited classrooms in suburban and in ghetto schools would deny that they are usually taught more effectively in the former than in the latter. Moreover, they are deliberately and effectively taught in homes where learning is valued for its own sake.
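Treated as purely formal quantities, the two ‘heritability estimates’ named above are simple functions of measured correlations. The sketch below writes them out; the function names are illustrative, and the pairs of twin correlations for the two subpopulations are hypothetical numbers chosen only to show how the same formula can move in opposite directions. They are not Scarr-Salapatek’s results.

```python
def estimate_from_separated_one_egg_twins(r_one_egg_apart):
    """'Heritability estimate' taken directly as the IQ correlation
    between one-egg twins reared separately."""
    return r_one_egg_apart


def estimate_from_twin_difference(r_one_egg, r_two_egg):
    """'Heritability estimate' given by twice the difference between
    the one-egg and two-egg twin correlations."""
    return 2 * (r_one_egg - r_two_egg)


# Hypothetical correlations for two subpopulations (illustration only).
advantaged = estimate_from_twin_difference(r_one_egg=0.80, r_two_egg=0.50)
disadvantaged = estimate_from_twin_difference(r_one_egg=0.60, r_two_egg=0.55)

print(f"advantaged subpopulation:    {advantaged:.2f}")
print(f"disadvantaged subpopulation: {disadvantaged:.2f}")
```

Nothing in either function refers to genes. Each simply transforms correlations, so any factor that changes the correlations, cultural or otherwise, changes the ‘estimate’.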
Children who grow up in such homes learn to speak grammatically, to use words with precision, to get information and entertainment from books, to argue consequentially, to solve abstract problems, and to set a high value on these and similar activities. Such home environments tend strongly to run in families and to occur more frequently in some ethnic groups than in others. In her classic study of some leading American scientists, Anne Roe (1952) discovered that an unexpectedly large proportion of her subjects were the sons of professional men. She suggested as the most likely explanation for this finding the fact that her subjects grew up in homes where for one reason or another learning was valued for its own sake. The social and economic advantages associated with it were not scorned, but they were not the important factor. The interest of many of these men took an intellectual form at quite an early age. This would not be possible if they were not in contact of some sort with such interests and if these did not have value for them. This can be true even in homes where it is not taken for granted that the sons will go to college.
We do not choose our cultural background but are born into it. For this reason people related genetically are more likely to share a common cultural background and common cultural values than unrelated persons. Since parental values help to shape children’s intellectual development, it is clear that cultural factors affect IQ correlations between relatives in roughly the same way as genetic factors. The correspondence between cultural and genetic effects is not, of course, perfect. For example, if cultural factors alone were important, we would expect to find nearly the same correlations between the IQ’s of one-egg and two-egg twins (which is what Scarr-Salapatek actually did find for the ‘disadvantaged’ twins in her sample). On the other hand, heritability estimates based on IQ correlations between separated one-egg twins would probably be quite high, because in those cases where the separated twins are not actually reared by relatives, adoption agencies usually strive to match the cultural backgrounds of natural and adoptive parents. In short, cultural factors may be expected to increase some heritability estimates and to decrease others. The available observational evidence is at least consistent with the hypothesis that cultural factors contribute heavily to measured IQ correlations.
Culture is, of course, not the only mock-genetic environmental factor that systematically influences IQ. Because cultural values, wealth, social status, occupational and educational levels are all more or less strongly correlated with one another as well as with IQ, and because they are all transmitted from generation to generation in roughly the same way as genetic information, it would obviously be very difficult under the best of circumstances to disentangle the genetic factors affecting IQ from the nongenetic factors.
We have seen, however, that even if all these systematic nongenetic factors were absent, such a separation could not be accomplished through the application of heritability theory, because IQ measurements do not satisfy that theory’s requirements. Jensen and other hereditarians not only fail to take cultural and other environmental factors adequately into account, they also ignore genetic but noncognitive factors which interact in complex and as yet little understood ways with each other, with cognitive factors and with the environment. Among such factors are sex, color, temperament and physical appearance. Thus in societies where mathematical ability is considered unfeminine and femininity is prized, women tend not to develop mathematical ability. Again, skepticism and curiosity may help a middle-class child to develop scientific ability but only cause trouble for a ghetto child. And in societies like our own, where selection and advancement mechanisms often employ positive feedback (nothing succeeds like success), small ‘initial’ differences in genetic or environmental endowment often get greatly amplified. This is another way of saying that genetic-environmental interaction may well dominate the development of the skills that IQ tests assess.
Inferences from the twin studies
Studies of separated one-egg twins provide the most direct evidence bearing on genetic factors in intelligence. Although we have already seen that ‘heritability estimates’ based on these studies cannot bear the interpretation that Jensen puts on them, it is instructive to examine his analysis in some detail [assumptions not explicitly stated by Jensen will be enclosed in brackets]. Jensen asserts, to begin with, that the measured IQ correlation between unrelated foster children reared together is .24. This number represents the fraction of variance attributable to environmental variation [under the assumptions that genetic and environmental factors contribute additively and independently to IQ measurements and that the relevant genetic characteristics of foster children reared together are statistically independent]. The measured IQ correlation between one-egg twins reared apart is .75. This number represents the genetic fraction of the total variance in IQ [under the additional assumption that there is no significant correlation between nongenetic, or between genetic but noncognitive, characteristics of separated one-egg twins]. Thus genetic and environmental factors together account for 99 % of the total variance, leaving 1 % for genetic-environmental interaction.
Although Jensen’s sum is good arithmetic, it is bad science. The values .24 and .75 are not measured correlations but medians of several measurements. Strictly speaking, medians should not be used at all in quantitative work (they are sometimes used in rough qualitative work because they give less weight to the extreme values in a group of measurements than a straight average). One should use weighted mean values, the weight assigned to a single measurement being determined by its probable error. In any case, measurements unaccompanied by error estimates have no scientific value.
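The arithmetic of Jensen’s sum, and its sensitivity to the spread of the underlying measurements, can be laid out in a few lines. The sketch uses the two median correlations quoted above and the two ranges (.15 and .25) cited in the paragraph that follows; the variable names are illustrative, and treating the ranges as independent uncertainties combined in quadrature is the standard rule that paragraph invokes.

```python
import math

# Median correlations quoted in the text.
r_foster_together = 0.24   # unrelated foster children reared together
r_one_egg_apart = 0.75     # one-egg twins reared apart

# Jensen's remainder, attributed to genetic-environmental interaction.
interaction = 1.0 - (r_foster_together + r_one_egg_apart)

# Ranges spanned by the individual reported measurements (from the text).
spread_foster = 0.15
spread_one_egg = 0.25

# Independent uncertainties in a sum combine in quadrature.
spread_of_sum = math.hypot(spread_foster, spread_one_egg)

# Uncertainty of the remainder relative to the remainder itself.
relative_error = spread_of_sum / interaction

print(f"interaction remainder: {interaction:.2f}")
print(f"spread of the sum:     {spread_of_sum:.2f}")
print(f"relative error:        {relative_error:.0%}")
```

The remainder of .01 is far smaller than the uncertainty of the sum it is derived from, so the data do not determine it even as to order of magnitude.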
Now, we can be sure that the error associated with a reported median correlation is at least as great as the spread of the individual measurements contributing to the median. The fact that the spreads of the measurements reported by Erlenmeyer-Kimling and Jarvik tend to increase with the number of reported correlations strongly suggests that the actual errors may in fact substantially exceed the observed spreads. For unrelated foster-children, the five reported correlation measurements span a range of about .15, while the four reported correlation measurements for one-egg twins reared separately span a range of .25. The spread in the sum of the two correlations is thus $\sqrt{.15^2 + .25^2} = .29$, according to the standard rule for combining independent measurements. This figure represents the purely internal uncertainty in Jensen’s estimate (.01) of the contribution of genetic-environmental interaction to the variance. Thus the internal error of the estimate is about 3,000 %. More serious than Jensen’s and Herrnstein’s failure to distinguish between medians and ‘actual values’ is their failure to recognize that the interpretation of the
measurements rests on specific and highly questionable assumptions (bracketed in
the above summary). Even if we knew that the IQ correlation between randomly selected foster-children reared together was, say, 25 %, we could not conclude that 75 % of the total variance in IQ scores was purely genetic in origin. A substantial part of the variance could well be, and probably is, produced by genetic and nongenetic factors acting jointly. Studies of one-egg twins reared separately might be expected to throw light on genetic-environmental interaction. Four such studies have been published. The sample sizes range from 19 to 53 pairs, and the studies are very diverse with respect to methodology, selection of subjects, and scientific objectives. Nevertheless, they do permit certain broad conclusions to be drawn. The most striking result, common to all four, is that the tested IQ’s of separated one-egg twins tend to be very similar if the environmental differences are not too great. In these circumstances the measured correlations are usually higher than those between two-egg twins reared together. This result clearly indicates that genetic factors can play an important role in the development of cognitive skills. We must bear in mind, however, that among these factors we must include, for example, sex, color, temperament and various physical characteristics. So far as I know, there is no theoretical or observational justification for supposing that the effects of such noncognitive genetic factors are small compared with those of the specifically cognitive factors. We must therefore beware of drawing conclusions about the importance of cognitive genetic factors from studies of one-egg twins. But why bother to distinguish between cognitive and noncognitive genetic factors? The answer is that the ways in which some noncognitive factors influence the development of intelligence depend strongly on social and cultural attitudes that could change rapidly and drastically. 
Prevailing attitudes toward women and Afro-Americans are currently among the most attractive candidates for such a change.
Although there appears to be little doubt that one-egg twins reared separately tend to have similar IQ’s, the measured IQ correlations have no quantitative significance, because in none of the studies are the environmental differences between separated twins representative of those between randomly paired children of the same age. The environmental differences that matter in this connection are, of course, those between factors especially relevant to intellectual growth. What are these factors? Or, to put the same question in slightly different terms, what kinds of experience at what ages contribute most effectively to a child’s mental development? This question is still the subject of intensive research. Most students of cognitive development agree, however, that the usual indices of socioeconomic status (based chiefly on the occupational status of the head of the family) are not reliable indices of the environmental factors most relevant to cognitive development. Since crucial
stages of this development occur during the first years of a child’s life, the mother’s child-rearing practices, her intelligence, and the quality of the mother-child relationship, are perhaps more relevant than the father’s occupation.
Environmental differences in the twin studies
Relevant environmental differences between separated twins are most carefully assessed in the classic study of Newman, Freeman and Holzinger (1937). Summarizing their findings on the environmental differences between the separated twins in their study, these authors write: ‘. . . the majority of the ratings [of relevant environmental differences] were relatively low. For only a few cases were the environmental differences of the twin pairs judged to be large.’ These comments apply equally to Shields’s study (1962), in which, besides, a large fraction of the subjects not reared in their own homes were reared by relatives. In the third study (Juel-Nielsen and Mogensen, 1957) the sample - 12 pairs - is too small to permit statistical inferences to be drawn from it. The fourth study, that of Burt, is the largest (53 pairs) and the only one in which the subjects were children. Moreover, only five of the 53 pairs were reared by relatives. Finally - and this is a point on which Burt, Jensen and Herrnstein lay great stress - there was no significant correlation in occupational status between the families in which the separated twins were reared. Can we infer from this that the relevant environmental differences between separated twins in Burt’s study are likely to be representative of differences between randomly selected children in the same population? To begin with, it is worth noting that the vast majority of the children in Burt’s study attended school in London. Thus the range of educational experiences in this study was considerably narrower than it would have been for a representative sample of English schoolchildren. But the environmental differences between separated twins in Burt’s study are unrepresentative in other respects too. 
As Burt points out, the absence of significant correlation between the occupational categories of foster parents and natural parents does not reflect a deliberate policy on the part of adoption agencies to randomize the placement of foster children. On the contrary, efforts are normally made to match the social classes of foster and natural parents, though such efforts frequently fail. Thus half the foster parents in Burt’s sample belong to occupational category V (semi-skilled), but this category supplied only 28 % of the children reared in foster homes or residences. For category VI (unskilled) the imbalance is in the reverse direction. These and similar systematic asymmetries, which no doubt reflect systematic differences between the social and economic pressures experienced by families in different occupational categories, are presumably responsible for the failure of efforts to match the social classes of adoptive and natural parents. But is there any reason to suppose that the agencies failed to place children in families that they judged to be suitable in other respects? Or to suppose that their placement criteria, which presumably included emotional and intellectual qualities of the foster mother, were less relevant to the foster child’s intellectual development than the occupational status of the foster father? For Herrnstein (if I interpret his remarks on the subject correctly) the ‘breadwinner’s occupation’ is the most relevant environmental factor, but I think it is possible that not all psychologists share this view. Two other points should be mentioned in connection with Burt’s study. First, although his sample is relatively large, it is by no means what statisticians would call a ‘fair sample.’ For example, among the children reared in their own homes, the frequency distribution of IQ is not normal (i.e., bell-shaped) as it would be in a fair sample, but strongly skewed: although the mean and the variance are nearly equal to their values in the standard reference population, only 38 % of the children in the sample have IQ’s greater than 100. Second, Burt’s assigned IQ’s are not measurements in the Jensen-Herrnstein sense, but semi-objective assessments. Burt’s published figures show that the subjective assessments for separated twins correlate better and differ less than do group-test scores. To sum up, in none of the published twin studies can we consider the environmental differences between separated twins to be representative of those between children of the same age drawn at random from a standard reference population. 
Because the samples are small and unrepresentative and because data are wholly lacking on relevant environmental differences that might enable one to correct partially for the unrepresentative character of a sample, we cannot even guess what the IQ correlation between genetically identical children reared in randomly selected, statistically independent environments might turn out to be. This difficulty cannot of course be overcome by combining correlations measured in four methodologically disparate studies carried out in three different countries; four sieves will not hold more water than one.
Environmental differences and IQ differences in the twin studies
Although the twin studies do not provide usable information about theoretically significant correlations, they do tell us something about the effects of environment on IQ. For example, we can ask: do large environmental differences tend to produce large IQ differences? In his incisive critique of Jensen’s article, Martin Deutsch (1969) has summarized several independent studies that throw light on this question,
of which the following is representative:
Using the Newman et al. ratings of educational and social differences between pairs of twins, Stone and Church (1968) classified 10 pairs of twins as having ‘larger differences in educational and social advantages’ (DSEA), and 9 pairs of twins as having ‘smaller DSEA.’ They found that 7 pairs of the twins in the larger DSEA group had IQ differences of 10 or more points, while only 3 pairs of twins in this group had IQ differences of less than 10 points. In the group with the smaller DSEA, all pairs of twins showed IQ differences of less than 10 points. In the larger DSEA group, 4 pairs of twins showed differences of 15, 17, 19 and 24 IQ points . . . These analyses of twin data indicate greater differences in intelligence test scores between identical twins reared apart than Jensen acknowledges in his discussion; implied is a greater environmental contribution to the performance of even the most genetically similar individuals.
Interestingly enough, Burt’s own data (unpublished) reveal a similar statistical connection between large IQ differences and large differences in occupational status. For example, in every case where the IQ difference between separated twins is at least 10 points and the two families differ by at least two grades in occupational status - five cases in all - the IQ difference is in the same direction as the difference in occupational status. In view of the limited relevance of occupational class to cognitive development in this study (illustrated by the fact that the second largest IQ difference in the study, 15 points, occurs for twins reared in families assigned to the same occupational category) we could hardly expect to find a stronger connection between differences in IQ and socioeconomic status. 
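The Stone and Church counts quoted above form a 2 x 2 table: 10 pairs with larger DSEA (7 with IQ differences of 10 or more points, 3 with less) and 9 pairs with smaller DSEA (0 and 9). The sketch below is an illustration only and is not an analysis reported by any of the authors cited. It applies a one-sided Fisher exact test to those counts, assuming that the 19 pairs can be treated as independent and that the 10-point cut-off was not chosen after inspecting the data. The function names are hypothetical.

```python
from math import comb


def hypergeom_pmf(k: int, row1: int, row2: int, col1: int) -> float:
    """Probability that exactly k of the col1 'large IQ difference' pairs
    fall in the first row (larger DSEA) if pairs were assigned at random."""
    return comb(row1, k) * comb(row2, col1 - k) / comb(row1 + row2, col1)


def fisher_one_sided(a: int, b: int, c: int, d: int) -> float:
    """One-sided Fisher exact p-value for the table [[a, b], [c, d]]:
    probability of a count at least as large as a in the top-left cell."""
    row1, row2, col1 = a + b, c + d, a + c
    upper = min(row1, col1)
    return sum(hypergeom_pmf(k, row1, row2, col1) for k in range(a, upper + 1))


if __name__ == "__main__":
    # Rows: larger DSEA, smaller DSEA.
    # Columns: IQ difference >= 10 points, IQ difference < 10 points.
    p = fisher_one_sided(7, 3, 0, 9)
    print(f"one-sided p = {p:.4f}")
```

With so few pairs, a test of this kind can only indicate whether the association is unlikely to be a chance pattern; it says nothing about how representative the environmental differences are.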
Burt and his colleagues have themselves examined correlations between the IQ differences of the separated twins and differences in indices relating to ‘cultural conditions’ and ‘material conditions’ in their environments. They report a positive correlation of 43 % between differences in cultural conditions and differences in IQ as measured by group tests; and a correlation of 74 % between differences in cultural conditions and differences in scholastic attainments. No significant correlation was found between IQ differences and differences in material conditions, but differences in scholastic attainments did correlate with differences in material conditions (37 %). Thus Burt’s results, like those of other investigators, do support the thesis that cultural factors play an important part in the growth of cognitive skills.
The hypothesis of fixed mental capacity
So far we have been chiefly concerned with the arguments by which Jensen and other hereditarians have sought to establish the high heritability of IQ. We have
seen that these arguments do not hold water. In the first place, the ‘heritability of IQ’ is a pseudo-concept like ‘the sexuality of fractions’ or ‘the analyticity of the ocean.’ Assigning a numerical value to the ‘heritability of IQ’ does not, of course, make the concept more meaningful, any more than assigning a numerical value to the sexuality of fractions would make that concept more meaningful. In the second place, even if we had a theory of inheritance that could be applied to IQ test scores, we could not apply it to the correlation data employed by Jensen. A scientific theory, like a racing car, needs the right grade of fuel. Jensen’s data are to scientific data as unrefined petroleum is to high-test gasoline. Jensen and Herrnstein would have us believe that we can gain important insights into human intelligence and its inheritance by subjecting measurements that we do not understand to a mathematical analysis that we cannot justify. Unfortunately, many people appear to be susceptible to such beliefs, which have their roots in a widespread tendency to attribute magical efficacy to mathematics in almost any context. The perennial popularity of astrology is probably an expression of this tendency. Astrology is based, after all, on hard numerical data, and the success and internal consistency of its predictions are customarily offered as evidence for its validity. The most important difference between astrology and the Jensen-Herrnstein brand of intellectual Calvinism is not methodological but philosophical; one school believes that man’s fate is written in the stars, the other that it runs in his genes. Jensen’s and Herrnstein’s central thesis is that certain cognitive skills - those involving abstract reasoning and problem solving - cannot be taught effectively to children with low IQ’s. From this thesis and from it alone flow all the disturbing educational, social and political inferences drawn by these authors. 
If social and educational reforms could raise the general level of mental abilities to the point where people with IQ’s of 85 were able to solve calculus problems and read French, rank order on mental tests would no longer seem very important. It is precisely this possibility that Jensen’s argument seeks to rule out. For if only a small fraction of the difference in average IQ between children living in Scarsdale and in Bedford-Stuyvesant can be attributed to environmental differences, it seems unrealistic to expect environmental improvements to bring about substantial increases in the general level of intelligence. Now, even if Jensen’s theoretical considerations and his analysis of data were beyond reproach, they would afford a singularly indirect means of testing his key thesis. The question to be answered is whether appropriate forms of intervention can substantially raise (a) the rate at which children acquire the abilities tested by IQ tests and/or (b) final levels of achievement. This question can be answered experimentally, and it has been. Since we do not yet know precisely what forms of intervention are most effective for different children, negative results (such as the alleged
failure of compensatory education) carry little weight. On the other hand, all positive results are relevant. For if IQ can be substantially and consistently raised - by no matter what means - it obviously cannot reflect a fixed mental capacity. The professional literature abounds in reports of studies that have achieved striking positive results. Several of these are cited by Scarr-Salapatek (1971a) in a critical review of recent hereditarian literature. In one extended study, the Milwaukee Project, in which subjects are ghetto children whose mothers’ IQ’s are less than 70, intervention began soon after the children were born. Over a four-year period Heber has intensively tutored the children for several hours every day and has produced an enormous IQ difference between the experimental group (mean IQ 127) and a control group (mean IQ 90). Has intensive tutoring engendered in these ghetto children a previously absent ‘capacity’ for abstract reasoning and problem solving? In a study published in 1949 and frequently cited in the psychological literature, Skodak and Skeels compared the IQ’s of adopted children in a certain sample with those of their biological mothers, whose environments were systematically poorer than those of the adoptive mothers. They found a 20 point mean difference in favor of the children, although the rank order of the children’s IQ’s closely resembled that of their biological mothers. Many tests have shown that blacks living in the urban north score systematically higher on IQ tests than those living in the rural south. For many years hereditarians and environmentalists debated the interpretation of this finding. The environmentalists attributed the systematic IQ difference to environmental differences, the hereditarians to selective migration (they argued that the migrants could be expected to be more energetic and intelligent than the stay-at-homes). The environmental interpretation was decisively vindicated in 1935 by O. Klineberg, who showed that the IQ’s of migrant children increased systematically and substantially with length of residence in the north. In New York (in the early 1930’s) migrant black children with 8 years of schooling had approximately the same average IQ as whites. These important findings were fully confirmed by E. S. Lee (1951) who, 15 years later, repeated Klineberg’s experiment in Philadelphia. Additional studies bearing on IQ differences between ethnic groups are reviewed and analyzed by L. Plotkin (1971). Teachers and therapists who work with children suffering neuropsychiatric disorders (including emotional and perceptual disturbances) regularly report large increases in their tested IQ’s. One remedial reading teacher of my acquaintance works exclusively with ‘ineducable’ children. So far she has not had a single failure; every one of her pupils has learned to read. And reading, of course, provides the indispensable basis for acquiring most of the higher cognitive skills.
The hypothesis of unlimited educability
That the growth of intelligence is controlled in part by genetic factors seems beyond doubt. The significant questions are, ‘What are these factors?’ ‘How do they operate?’ ‘How do they interact with non-cognitive and environmental factors?’ Experience suggests that children differ in the ease with which they acquire specific kinds of cognitive skills as well as in the intensity of their cognitive drives or appetites. But cognitive appetites, like other appetites, can be whetted or dulled. Nor are aptitude and appetite the only relevant factors. Everyone can cite case histories in which motivation has more than compensated for a deficit in aptitude. There are excellent skiers, violinists and scientists who have little natural aptitude for any of these activities. None of them will win international acclaim, but few of them will mind. I know of no theoretical or experimental evidence to contradict the assumption that everyone in the normal range of intelligence could, if sufficiently motivated, and given sufficient time, acquire the basic cognitive abilities demanded by such professions as law, medicine and business administration. Once we stop thinking of human intelligence as static and predetermined, and instead focus our attention on the growth of cognitive skills and on how the interaction between cognitive, non-cognitive and environmental factors affects this growth, the systematic differences in test performance between ethnic groups appear in a new light. Because cognitive development is a cumulative process, it is strongly influenced by small systematic effects acting over an extended period. Information-processing ability grows roughly in the same way as money in a savings account: the rate of growth is proportional to the accumulated capital. Hence a small increase or decrease in the interest rate will ultimately make a very large difference in the amount accumulated. 
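The savings-account analogy can be made concrete with a short sketch. The code is an illustration and not part of the original argument. It assumes continuous compounding, that is, a growth rate exactly proportional to the accumulated capital, and the starting capital, the two ‘interest rates’ and the time spans are hypothetical numbers chosen only to show how a small difference in rate compounds.

```python
import math


def accumulated(capital: float, rate: float, years: float) -> float:
    """Solve dC/dt = rate * C, giving C(t) = C0 * exp(rate * t)."""
    return capital * math.exp(rate * years)


def ratio_after(rate_a: float, rate_b: float, years: float) -> float:
    """Ratio of two accounts that start with equal capital but grow at
    different rates; the starting capital cancels out of the ratio."""
    return math.exp((rate_a - rate_b) * years)


if __name__ == "__main__":
    start = 100.0                # hypothetical starting capital
    low, high = 0.10, 0.12       # hypothetical yearly growth rates
    for years in (1, 5, 10, 20, 40):
        a = accumulated(start, high, years)
        b = accumulated(start, low, years)
        print(f"{years:>2} years: {a:10.1f} vs {b:10.1f}  ratio {a / b:.2f}")
```

Because the ratio depends on the rate difference multiplied by the elapsed time, the gap between the two accounts keeps widening for as long as the difference in rates persists.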
Now, the ‘cognitive interest rate’ reflects genetic, cultural and social factors, all interacting in a complicated way. Membership in the Afro-American ethnic group is a social factor (based in part on non-cognitive genetic factors) that, in the prevailing social context, contributes negatively to the cognitive interest rate. The amount of the negative contribution varies from person to person, being generally greatest for the most disadvantaged. But there is no doubt that it is always present to some extent. In these circumstances we should expect to find exactly the kind of group differences that we do find. I think it is important to take note of these differences. They are valuable indices of our society’s persistent failure to eradicate the blight of racism. It may be that the assumption of unlimited educability will one day be shown to be false. But until then, it could usefully be adopted as a working hypothesis by educators, social scientists and politicians. We have seen that the widely held belief in fixed mental capacity as measured by IQ has no valid scientific basis. As a device
for predicting scholastic success (and thereby for helping to form the expectations of teachers, parents and students), as a criterion for deciding that certain children should be excluded from certain kinds of education, and as a lever for shifting the burden of scholastic failure from schools and teachers to students, the IQ test has indeed been, in Herrnstein’s words, ‘a potent instrument’ - potent and exceedingly mischievous. Admirers of IQ tests usually lay great stress on their predictive power. They marvel that a one-hour test administered to a child at the age of eight can predict with considerable accuracy whether he will finish college. But, as Burt and his associates have clearly demonstrated, teachers’ subjective assessments afford even more reliable predictors. This is almost a truism. If scholastic success is to be predictable, it must be reasonably consistent at different age levels (otherwise there is nothing to predict). But if it is consistent, then it is its own best predictor. Johnny’s second-grade teacher can do at least as well as the man from ETS. This does not mean that mental tests are useless. On the contrary, sound methods for measuring information-processing ability and the growth of specific cognitive skills could be extremely useful to psychologists and educators - not as instruments for predicting scholastic success but as tools for studying how children learn and as standards for assessing the effectiveness of teaching methods.
Conclusions
To what extent are differences in human intelligence caused by differences in environment, and to what extent by differences in genetic endowment? Are there systematic differences in native intelligence between races or ethnic groups? Jensen, Herrnstein, Eysenck, Shockley and others assure us that these questions are legitimate subjects for scientific investigation; that intelligence tests and statistical analyses of test results have already gone a long way toward answering them; that the same techniques can be used to reduce still further the remaining uncertainties; that the results so far obtained clearly establish that differences in genetic endowment are chiefly responsible for differences in performance on intelligence tests; that reported differences in mean IQ between Afro- and Europeo-Americans may well be genetically based; and that educational, social and political policy decisions should take these ‘scientific findings’ into account. We have seen, however, that the arguments put forward to support these claims are unsound. IQ scores and correlations are not measurements in any sense known to the natural sciences, and ‘heritability estimates’ based on them have as much scientific validity as horoscopes. Perhaps the single most important fact about human intelligence is its enormous and as yet ungauged
capacity for growth and adaptation. The more insight we gain into cognitive development, the less meaningful seems any attempt to isolate and measure differences in genetic endowment - and the less important. In every natural science there are certain questions that can profitably be asked at a given stage in the development of that science, and certain questions that cannot. Chemistry and astronomy grew out of attempts to answer the questions, How can base metals be transmuted into gold? How do the heavenly bodies control human destiny? Chemistry and astronomy never answered these questions; they outgrew them. Similarly, the development of psychology during the present century has made the questions posed at the beginning of this paragraph seem increasingly sterile and artificial. Why, then, are they now being revived? Earlier in this article I suggested that a combination of cultural, historical and political factors tempts us to seek easy ‘scientific’ solutions to hard social problems. But this explanation is incomplete. It leaves out a crucial psychological factor: once we have acquired a skill we find it hard to believe that it was not always ‘there,’ a latent image waiting to be developed by time and experience. The complex muscular responses of an expert skier to a difficult trail are, to him, as instinctive as a baby’s reaction to an unexpected loud noise. For this reason the doctrine of innate mental capacity exercises an intuitive appeal that developmental accounts can never quite match. This, however, makes it all the more important to scrutinize critically the logical, methodological and psychological underpinnings of that doctrine.
REFERENCES
Burt, C. (1966) The genetic determination of differences in intelligence: A study of monozygotic twins reared together and apart. Brit. J. Psychol., 57 (1 and 2), 137-153.
Burt, C. and Howard, M. (1956) The multifactorial theory of inheritance and its application to intelligence. Brit. J. stat. Psychol., 9, 95-131.
Deutsch, M. (1969) Happenings on the way back to the forum. Harvard educ. Rev., 39, 423-557.
Erlenmeyer-Kimling, L. and Jarvik, L. F. (1964) Genetics and intelligence. Science, 142, 1477-1479.
Eysenck, H. J. (1971) The IQ argument, race intelligence and education. New York, The Library Press.
Fehr, F. S. (1969) Critique of hereditarian accounts. Harvard educ. Rev., 39, 571-580.
Herrnstein, R. J. (1971) IQ. The Atlantic Monthly, 228, 43-64.
Jensen, A. R. (1969) How much can we boost IQ and scholastic achievement? Harvard educ. Rev., 39, 1-123.
Juel-Nielsen, N. and Mogensen, A. (1959) Uniovular twins brought up apart. Acta Genetica, 7, 430-433.
Klineberg, O. (1935) Negro intelligence and selective migration. New York, Columbia University Press.
Lee, E. S. (1951) Negro intelligence and selective migration: A Philadelphia test of the Klineberg hypothesis. Amer. soc. Rev., 16, 227-233.
Lerner, I. M. (1968) Heredity, evolution and society. San Francisco, W. H. Freeman and Company.
Lewontin, R. C. (1970) Race and intelligence. Bulletin of the Atomic Scientists, March, 2-8.
Newman, H. H., Freeman, F. N. and Holzinger, K. J. (1937) Twins: A study of heredity and environment. Chicago, Chicago University Press.
Piaget, J. (1952) The origins of intelligence in children. New York, International Universities Press.
Piaget, J. (1967) Six psychological studies. New York, Random House.
Plotkin, L. (1971) Negro intelligence and the Jensen hypothesis. The New York Statistician, 22, 3-7.
Roe (1953) The Making of a Scientist. New York, Dodd, Mead and Company.
Scarr-Salapatek, S. (1971) Unknowns in the IQ equation. Science, 174, 1223-1228.
Scarr-Salapatek, S. (1971) Race, social class and IQ. Science, 174, 1285-1295.
Shields, J. (1962) Monozygotic Twins. London, Oxford University Press.
Skodak, M. and Skeels, H. (1949) A final follow-up study of 100 adopted children. J. genet. Psychol., 75, 85.
Waddington, C. H. (1957) The strategy of the genes. London, Allen and Unwin.
Summary
Jensen, along with other authors, has applied the polygenic theory of inheritance to IQ measurements and has concluded that heredity accounts for 80 % of the variation. He has used these results to draw far-reaching conclusions, and he claims in particular that children with low IQ’s cannot acquire the higher cognitive skills, those involved in abstract reasoning and problem solving. He has also argued that ethnic differences in IQ have a significant genetic component. This article analyzes the implicit assumptions underlying Jensen’s theoretical analysis and shows that these assumptions are untenable. Like any other quantitative scientific theory, the theory of polygenic inheritance applies only to types of measurements that satisfy certain formal conditions. A detailed analysis of IQ measurements shows that they do not satisfy the required conditions. Consequently, estimates of the heritability of IQ are not merely unreliable; they are meaningless. Moreover, when the IQ correlation data and other relevant observations are examined in the light of current ideas about cognitive development, they are seen not to support the hypothesis that children whose parents have low IQ’s have a limited capacity for acquiring the higher cognitive skills. In short, the hypothesis of genetic differences between ethnic groups is not testable by existing or even foreseeable methods. This hypothesis should therefore be regarded not as a scientific hypothesis but as a metaphysical speculation.
Discussions
Whatever happened to vaudeville? A reply to Professor Chomsky
R. J. HERRNSTEIN Harvard University
Professor Chomsky winds up a discussion (1972)¹ of some of my views on social stratification (1971) by listing what he calls my assumptions: ‘that people labor only for material gain, for wealth and power, and that they do not seek interesting work suited to their abilities - that they would vegetate rather than do such work’ (p. 40). Quite sensibly in the face of such strange assumptions, he rejects any conclusions that follow, especially since I offer ‘no reason why we should believe any of this (and there is certainly some reason why we should not)’ (p. 40). I agree completely, except that I made no such assumptions, nor are they required by any of my conclusions. Nevertheless, to be misread by a person of Professor Chomsky’s quickness of wit impels me to try to restate, perhaps to clarify, the part of my discussion that has eluded him. I believe that he was led astray because my conclusions run contrary to his political suppositions rather than by my prose style. In particular, my argument wrecks the egalitarianism to which Professor Chomsky, like many other American intellectuals, pays homage.
1. Professor Chomsky’s discussion of my views appears, essentially word for word, in both Cognition (as part of a longer article) and Ramparts (by itself). Page references here refer to the Cognition article.
First, what is my argument? Concisely stated, it is that, (1) since people inherit their mental capacities (as indexed, for example, in intelligence tests) to some extent, and (2) since success in our society calls for those mental capacities, therefore, (3), it follows that success in our society reflects inherited differences between people. To readers who have not worked their way through the 15,000 or so words leading up to that conclusion, my syllogism may seem rash and unsubstantiated. To those readers, I can only commend my original article. Since Professor Chomsky did not question the facts behind the two premises, I will not bother reviewing them here. Nor did Professor Chomsky object to the logical form of my argument, to the inevitability of (3), given (1) and (2). He did not even object to my use of such hazardous expressions as ‘mental capacities’ or ‘success.’ Some readers may wonder, at this point, what other objection there can be to a syllogism, besides to the factual truth of its premises, the logical soundness of its steps, or its definition of terms. That, I will come to in a moment. But first, I note that the second premise - that success in our society depends in some significant way on mental capacities - means specifically that mental competence is necessary, but not sufficient, for getting ahead. To put it yet another way, each level of success in our society, as defined by the members of society themselves, permits a range of intelligence, but the higher the level, the narrower the range. At the top of the scale of occupations, the intellectual requirement is set relatively high, but high intelligence does not guarantee success. This, too, is a factual assertion, which I therefore offer here without further substantiation. I focus on premise 2 because Professor Chomsky’s main argument centers thereon. As he puts it: ‘This step in the argument embodies two assumptions: first, it is so in fact; and second, it must be so, for society to function effectively’ (p. 34). Professor Chomsky disputes only the inevitability of premise 2, not its current accuracy. But do I really assume that the second premise ‘must be so’? In fact, I find no such assumption in my article, nor can I see that I need it. For me, it is no disgrace if my argument holds merely for existing societies, not necessarily all possible ones. As regards societies about which we have some data, such as the American or the Japanese or the Western European or the Russian, Professor Chomsky apparently accepts premise 2, and so is inexorably carried on to my conclusion - that society will stratify itself increasingly by genetic factors as it divests itself of the barriers commonly held to be unfair - those of race, religion, family connections, inherited wealth, and so on. 
Or, at least, he has not at this time chosen to dispute my actual conclusion. Instead, his dispute is with a possibility I offer for scrutiny, which is that hereditary social classes may stratify not only existing societies but also any conceivable society in which merit is a factor in social status. Professor Chomsky challenges, then, not my conclusion but my extrapolation, which is a distinction that I would like to register even as he overlooks it. Now, having made the distinction, I contend that his criticism of my extrapolation has barely a shred of merit, which, it seems to me, strengthens my case, for it would be a subtle flaw indeed that would elude Professor Chomsky’s sharp eye. Professor Chomsky notes that my argument ‘would not apply in a society in which “income (economic, social, and political) is unaffected by success”.’ (p. 34) That is correct, for in such a society, should it come to pass, the second premise may no longer hold. If people’s accomplishments did not gain the rewards society has to dispense, then, indeed, success (as measured by social rewards) might not depend in any significant way on mental capacities, thereby violating premise 2. But now, Professor Chomsky does something odd. Instead of depicting a hypothetical society without
differential rewards, he instead postulates a society in which people are rewarded for their accomplishments ‘only by prestige.’ If they are differentially rewarded for their accomplishments, even if only by prestige, my syllogism applies, as Professor Chomsky admits. The second premise is back, albeit only for success defined by prestige - not by money or dachas or limousines or winter vacations in the Caribbean or on the Black Sea. In his idea of the ‘decent society,’ says Professor Chomsky, ‘It will only follow (granting his other assumptions [with which, I note, he registers no dispute at any point]) that children of people who are respected for their achievements will be more likely to be respected for their own achievements, an innocuous result even if true.’ (p. 35) Why does Professor Chomsky find it ‘innocuous’ for prestige to run in families? He does not say, but I believe it is because he thinks that if the sole status distinction between people is based on the prestige or respect they earn from their fellows, they will not suffer for their failures as much as they do in our society, in which the penalty for failure is poverty, or at least, relative poverty as compared to our society’s successful people. Let us, for argument’s sake, suppose that Professor Chomsky’s decent society could get its work done. It is, after all, possible (although rather unlikely, I wager) that a society using prestige as its only differential social reward could render prestige potent enough to sustain work no less well than do the rewards in our society, including money and power. But if prestige were that potent, then surely the lack of it would cause sadness and regret, just as the lack of money and power causes sadness and regret now. Perhaps it would be a kinder world than ours, if, in eliminating the suffering caused by relative poverty, it did not substitute equally, or more, painful psychic deprivations. 
But ‘innocuous’ hardly seems like the right word for a society stratified by a mortal competition for prestige. Professor Chomsky did not postulate any such competition, but that’s what he would find nevertheless, if prestige could be made potent enough to replace the material rewards of existing societies. Yet, if prestige were not that potent, could Professor Chomsky’s utopia get its work done? Does anyone doubt that the differential rewards granted in society function like the potential difference in an electrical circuit - as a kind of labor pump? By attaching different outcomes for different jobs, or for jobs done well or poorly, society directs the flow of labor one way or the other - as for example, out of vaudeville and into radio and motion pictures, which had captured its audience and the attendant multiple rewards. As a more timely example, consider the diminishing numbers of applicants for graduate schools and the lengthening queues for law and medical schools, precisely in tune with the shifting demands, and values, in society at large. Or remember that when the rewards for manufacturing spats disappeared, so did spats manufacturers. The inherent rewards of making spats, such as they were, could
not have changed, but the extrinsic ones evaporated and so did the industry. Now, this is not to suggest that society always distributes its rewards sensibly, humanely, or even attractively, merely that the distribution expresses something like a social consensus, which then gets converted into human effort. Sometimes, because of extraneous perturbations, or short term influences, or structural inadequacies, the consensus may be faulty, as Professor Chomsky notes. Thus, our society may be harming itself in the long run by paying public relations experts such high salaries (to the irritation of both Professor Chomsky and me), thereby attracting into the business the bright people who then sell us a bill of goods. No doubt, Professor Chomsky rightly notes that such high salaries often come from the wealthy few who have a stake in keeping the rest of us fooled. But all is not lost, for there is glory (if not also money) waiting for the fellow who sets the public straight, like Ralph Nader perhaps, showing that the system may have more resiliency than Professor Chomsky supposes. In any case, the merits of a given social consensus as compared to another is immaterial to the issue, except insofar as Professor Chomsky thinks I approve of the American consensus on the values of various occupations and therefore holds me accountable for what he considers its flaws. Whether I approve or not, whether the consensus is wise or not, the point is simply that labor flows towards the rewards, and that if a given reward successfully guides the flow of labor, then it is valuable enough to cause psychological pain by its absence. The relevant principle is one that Professor Chomsky, in his distaste for Professor Skinner’s psychology, has apparently never grasped. If a reward can sustain effort by its acquisition then it will punish by its deprivation. 
Or, to be concrete about it - if, in Professor Chomsky’s hypothetical world, prestige and respect are strong enough to direct labor in accordance with the social consensus, then they are strong enough to bring unhappiness to those who fail to get it. It does not matter a bit whether the consensus comes from a free market, monopoly capitalists, or the central government. In all likelihood, however, Professor Chomsky did not mean to allow prestige to be all that important in his decent society. Although a bit vague on this point, I believe that he intended the differential reward of prestige to be ineffective, which is to say, he intended that it have little or no effect on the distribution of labor. For the moment, let us grant that work would go on anyway, a supposition I firmly disbelieve and will later reconsider. With prestige unimportant, Professor Chomsky correctly infers that the lack of prestige would cause no pain, and that distinctions would be ‘innocuous.’ But, if so, he erred in concluding that prestige would then run in families. If prestige was an impotent reward (hence causing no suffering by its lack), then the better endowed people would not end up in the prestigious occupations. They would instead be randomly dispersed among the various occupational levels (except for another of Professor Chomsky’s implausible postulations, to which I also return
later), and their children would be randomly dispersed too. Premise 2 is either right or wrong as a factual matter. If it is right (assuming premise 1 is right, too), then there will be a hereditary meritocracy and some people will suffer the pain of having lost out in the competition for society’s rewards. If it is wrong, then we have no grounds for inferring a hereditary meritocracy and no reason to suppose anyone is hurting for the lack of society’s impotent rewards. Perhaps Professor Chomsky erred because he forgot that one reason we now find the highly rewarded occupations usually filled by the better endowed people surely is that they enjoy a competitive advantage in the contest to fill the slots. The correlation between occupational level and intellect has something to do with the greater desirability of the better jobs, on the one hand, and the greater competence (on the average) of the intelligent, on the other. Professor Chomsky accepts the fact of a correlation now 2, predicts that there would be a correlation in his hypothetical world, but disbelieves that the extrinsic social rewards, which he would ban, have anything to do with it. But why, if society did not create a gradient of potent lures towards the various occupations - by money, power, prestige, or whatever - would there be any sort of correlation between intellect and occupational level (defined by some secret consensus of social utility that was not being translated into differential rewards)? Professor Chomsky apparently believes that the inherent pleasures of labor would yield a good, strong correlation anyway. ‘In a decent society everyone would have the opportunity to find interesting work, and each person would be permitted the fullest possible scope for his talents. Would more be required, in particular, extrinsic reward in the form of wealth and power? 
Only if we assume that applying one’s talents in interesting and socially useful work is not rewarding in itself, that there is no intrinsic satisfaction in creative and productive work, suited to one’s abilities, or in helping others (say, one’s family, friends, associates, or simply fellow members of society).’ (p. 35f) In short, his hypothetical society would sort people roughly the way a beehive sorts bees, by a differentiation in the individuals rather than in their extrinsic rewards. Only steady-handed, nerveless intellectuals would yearn to be surgeons; only true-eared, sensitive artists would crave to sing in public; only masters of logical complexity would declare themselves chessplayers (and no one must keep score at matches, for the pleasures of victory are paid for by the pain of defeat). And, at the other end of the range, the under-endowed would cheerily and spontaneously designate themselves assistant clerks or plumber’s helpers. That is how Professor Chomsky must get his correlation, for there would be no differences in pay, no differences in power, and such differences in respect as he grants would be ‘innocuous,’ which is to say, that they would not much affect the competition for jobs. There would only be differences in ‘intrinsic satisfaction in creative and productive work.’ Clearly, Professor Chomsky feels I have seriously underrated the power of that intrinsic satisfaction. And I grant that I hold it to be less important than he does, for, while he wants the world to run on it, I believe that human society can no more transform itself into a beehive than vice versa. I would say that the burden falls squarely on him to prove otherwise. It is not necessary to prove that work is sustained by a mixture of intrinsic and extrinsic rewards, and that both contribute to its attractiveness. To that extent I agree with Professor Chomsky when he says that the extrinsic rewards do not solely determine the distribution of labor. However, I know of no one who would disagree with him, not even Professor Skinner. Neither Professor Skinner nor I have any trouble understanding why house painters get more money for painting the outsides of houses than for painting the insides. They require extra extrinsic rewards to offset the intrinsic disadvantages of clinging to the 20th or 30th story on windy days, as compared to the safety indoors. Throughout the scale of occupations something similar operates.
2. Here and there, Professor Chomsky muses about the importance of ruthlessness and the like, as opposed to intellect, in the struggle for achievement in our society. He apparently does not know that the data on the matter show intellect, as indexed in I.Q. tests or just by schooling, to be a far better predictor than any measure of personality that might identify those who are ‘ruthless, cunning, avaricious, self-seeking, lacking in sympathy and compassion, subservient to authority and willing to abandon principle for material gain,’ (p. 38) to quote Professor Chomsky’s formula for success in America. No doubt, given two people of equal intellect, personality and character may spell the difference, but even here, the data do not bear out Professor Chomsky’s gloomy vision. Instead, the ones who succeed tend, on the average, to be the buoyant, energetic, independent, healthy ones, although there are many interesting exceptions.
For a given level of social utility (not in any philosophical sense, but as measured by the prevailing consensus) the intrinsic and extrinsic benefits add up to something like a constant. Thus, Albert Schweitzer did not have to get paid a lot of money for his work in order to keep him going into old age, for the respect and the eternal reward he envisioned were apparently recompense enough. A society wisely praises such men ‘richly,’ if it is to have the fruits of great talent unstintingly dedicated. But such rich praise, as well as other sorts of riches, would have to be prohibited to inactivate my syllogism. The main issue between Professor Chomsky and me finally boils down to this. Suppose all extrinsic rewards for labor, from gratitude to cash, were somehow held constant over all occupations. Now, let only the intrinsic satisfactions vary as they will. Professor Chomsky supposes that in such a world, in which my syllogism would truly be innocuous, everything would go just fine. There would, he assumes, be no more clumsy surgeons, suicidal airplane pilots, inarticulate teachers, rude salesmen, out-of-tune singers in his ideal world than there are in the one we are living in.
He supposes, in other words, that the matching of talent to occupation
need owe nothing to society’s system of differential rewards. A remarkable supposition; to me utterly unbelievable. Moreover, Professor Chomsky supposes that in his world there would be no shortage of labor either, that somehow the intrinsic rewards of coal mining, ditch digging, schooling, garbage collecting, poodle clipping, even house painting, would keep the work at just the level society needs, neither too much nor too little. For he supposes that the extrinsic rewards contribute nothing essential to the monitoring of labor. The reason Professor Chomsky must be supposing these outlandish things is that, as he well knows, as soon as he grants a role to the extrinsic rewards, my syllogism starts cranking away. It changes my argument not at all if intrinsic satisfaction could account for some of the distribution of labor, for the distribution of external rewards will compensate for such complications as the social consensus dictates. An onerous but important job will draw rich rewards; a pleasurable, insignificant one will draw little if any. As long as some of the distribution of labor depends at all significantly upon differential extrinsic rewards, and as long as the likelihood of success depends upon inherited mental differences (which, please recall, Professor Chomsky grants, or at least does not challenge), then social standing will depend upon inherited differences to some degree. Professor Chomsky may find me lacking in imagination. Why, he may ask, can I not picture his revolutionary new man or woman, eager to serve the decent society for no differential rewards except the sense of a useful job well done. 
The answer is that I know what has happened before when the state has told its citizens henceforth to be good and productive for the sake of the state (usually in the name of the ‘people’), instead of for their own sakes, and then enforces its vision of ‘classlessness.’ History does not encourage further ventures of that sort, at least, it does not encourage me. Soon after the leaders discover that selflessness cannot be counted upon, they are most likely to impose a gradient of punishment, which may have about the same potential for producing labor as our society’s gradient of reward, except that it is bound to be more, rather than less, cruel. It hardly looks like an improvement to substitute imprisonment or forced labor camps for poverty (especially when the poverty persists). Not that I think Professor Chomsky favors any such reign of terror, but he might have little to say at that point, for the revolution’s visionaries are often among its first victims. Professor Chomsky surely knows that the persistent status differentials in all socialist states follow directly from individual differences in ability and the Skinnerian principles of reward and punishment that he so contemptuously, and repeatedly, keeps dismissing. He must therefore tell us how his decent society will steer its way between those venerable human limitations. So much for the main issue between Professor Chomsky and me. Unlike some of my more vehement critics, Professor Chomsky does not accuse me of racism, for
which I am grateful. However, he does hold me accountable for making an argument that ‘will surely be exploited by racists to justify discrimination.’ (p. 41) Unfortunately for the sake of further discussion, Professor Chomsky fails to say which of my arguments will have that unwholesome consequence, which makes his assertion difficult either to evaluate or refute. Professor Chomsky does, however, provide what he considers an analogous case to mine, with which I heartily disagree. Perhaps if I refute his analogy, I will have dealt with his complaint. First, the analogy: ‘Imagine a psychologist in Hitler’s Germany who thought he could show that Jews had a genetically determined tendency toward usury (like squirrels bred to collect too many nuts) or, a drive toward anti-social conspiracy and domination, and so on. If he were criticized for even undertaking these studies, could he merely respond that [Professor Chomsky is quoting me here] “a neutral commentator ... would have to say that the case is simply not settled” and that the “fundamental issue” is “whether inquiry shall (again) be shut off because someone thinks society is best left in ignorance”? I think not. Rather, I think that such a response would have been met with justifiable contempt. At best, he could claim that he is faced with a conflict of values. On the one hand, there is the alleged scientific importance of determining whether in fact Jews have a genetically determined tendency toward usury and domination (an empirical question, no doubt). On the other, there is the likelihood that even opening this question and regarding it as a subject for scientific inquiry would provide ammunition for Goebbels and Rosenberg and their henchmen.’ (p. 42) Presumably, then, because I did not deny the possibility of a racial difference in I.Q., I am like the scientist in the analogy, studying innate Jewish usury in Hitler’s Germany. One must make allowances for Professor Chomsky’s tendency towards hyperbole.
America is not Hitler’s Germany, and I was not proposing to study, nor was I asserting, a genetic flaw in any race. Let us, nevertheless, consider the analogy on its own merits. To begin with, I agree that in Hitler’s Germany, I might not study innate hoarding in Jews. But then, in Hitler’s Germany, would I do any science at all (disregarding for argument’s sake that Professor Chomsky and I would both be in concentration camps)? I hope that I would have had the strength to cease being a scientist in such a society and the good sense to have been among those who fled Germany in the 1930’s. I therefore share Professor Chomsky’s contempt for his hypothetical scientist, but not for Professor Chomsky’s reason. The scientist’s specialty hardly matters, compared to his willingness to stay and work at all. Wolfgang Köhler, who voluntarily vacated the prime academic chair for a psychologist in Germany - the professorship at Berlin - worked on the physiological basis of perception. Was his gesture any less admirable because his research had no clear relevance to Nazi
ideology? I think it was more admirable, for he could easily have used his irrelevance as an excuse for remaining indifferent to his country’s troubles and continuing to enjoy his eminent circumstances. Goebbels and Rosenberg did not need Köhler’s data; they needed Köhler’s acquiescence in German society, which he courageously and unstintingly withheld. And so it would be with Professor Chomsky’s hypothetical scientist. As a matter of fact, if Professor Chomsky’s scientist is honest, Goebbels and Rosenberg would probably stop him from carrying out his research anyway. Professor Chomsky forgets that in honest research, one does not always know the answer beforehand. Goebbels and Rosenberg would worry that Jews might not have an innate tendency towards usury, which would be quite embarrassing for the Party, if it got around. Instead, they would simply find some pseudo-scientist who would invent more convenient findings. Professor Chomsky, who wants me to subject scientific findings to the test of political suitability (see his pp. 41-44), should expect no less from Goebbels. Professor Chomsky’s analogy proves to be quite revealing, although it has little bearing on my article and hardly proves that my argument will be ‘exploited by racists.’ Contrary, I am sure, to his intention, I draw the lesson that we should encourage more research on people, not less. And all of it of passable quality should be published, not picked over for symptoms of apostasy. I trust that, as always before, the truth will turn out to be more complex and subtle than called for in anyone’s orthodoxy. Since society must cope with what people are really like, rather than with the fictions embodied in one political philosophy or the other, we would do well to learn as much as we can at every opportunity, limited, of course, by the rights of individuals to their privacy.
REFERENCES
Chomsky, N. (1972) Psychology and ideology. Cognition, 1, 11-46; also in Chomsky, N. (1972) I.Q. tests: Building blocks for the new class system. Ramparts, 11 (1), 24-30.
Herrnstein, R. J. (1971) I.Q. The Atlantic Monthly, September.
Some notes concerning Dr. Fodor’s ‘Reflections on L. S. Vygotsky’s Thought and language’
A. N. LEONTIEV A. R. LURIA Moscow University
After carefully studying Dr. J. A. Fodor’s ‘Reflections on L. S. Vygotsky’s Thought and language’, we decided to reply for two basic reasons: Firstly, as collaborators and followers of L. S. Vygotsky, we are naturally extremely interested in any article on this outstanding researcher, who still has a great influence on experimental psychology in our country. Thus we were very attentive to a paper on Vygotsky’s basic work, especially by such a gifted psychologist as Dr. Fodor. Secondly, it is quite apparent to us that although Dr. Fodor makes some very good points when he discusses some of the basic problems in the contemporary approach to psychology, nevertheless, he makes some statements with which we are in complete disaccord. Defending truths or discovering mistakes is a much easier task compared to dealing with half-truths, and these are the most dangerous in the history of science. Having briefly stated our reasons for replying, we would now like to discuss the contents of J. A. Fodor’s paper. Dr. Fodor begins his article by saying that towards the end of the last century, psychology came to an amicable divorce with philosophy, and began to lead an independent life; a page later, Dr. Fodor takes this statement up again saying that psychology, however, still remained under the influence of a bad philosophy, and that this led to a ‘deplorable state of affairs’. We entirely agree with both these statements. We have always believed that in order for psychology to make any real progress, it must develop as an independent field. But nonetheless, we think that a scientifically based philosophy, which deals with some of the basic concepts and general laws on the development of nature and society, has a decisive and positive influence on our research. It also provides a guarantee against certain misinterpretations of data and loss of progress in the field. When Dr. Fodor says ‘L. S.
Vygotsky started from a priori assumptions more than from real facts’ (which we do not believe), it seems to us that he himself has based his views on another set of a priori assumptions, sometimes much more superficial than the ideas of L. S. Vygotsky.
The question of the relation between thought and language was of basic interest to Vygotsky. (Unfortunately Dr. Fodor’s information is limited, since Thought and language is the only book of Vygotsky’s to have been translated into English.) Leaving aside the historical problem (i.e., between the late twenties, when the book was written, and the early thirties when it was published, the question of the relation between thought and language was a basic problem to cognitive psychologists at that time) Dr. Fodor claims that Vygotsky’s basic mistake lies in identifying language with speech, and thought with problem-solving. His own idea is that the relation between the ‘deep structures’ or ‘natural’ and ‘inborn’ language and the ‘superficial language codes’ and their place in thought, has to be the central problem in cognitive psychology today. At the same time, he assumes that the processes of thinking can have different relations with speech and problem solving (‘do we really solve a problem thinking “Sunday will perhaps be warm?”’). We quite agree that thinking can take many different forms and that speech is not at all identical to language. We also agree that the language in which we integrate visual and auditory information cannot either be the language of vision or the language of audition, although it must contain both. But we doubt that the theory of a ‘natural language’ and ‘innate language codes’ is valid, and we turn to the works of D. O. Hebb (1971), who in describing the most complex cognitive basis of language processes has rendered any hypothesis on its ‘innate’ origin untenable. We also know that the relation between language and thought can be extremely variable at different stages of the child’s development, and, consequently, any theory on the stability of this relation seems out of the question. An important factor in the works of L. S.
Vygotsky is that he totally differs in his approach to the nature of mental processes from that of classical psychology. Vygotsky presumed that conscious (or cognitive) processes have a socio-historical issue, and that language is narrowly related to every conscious reflection of reality: These forms of conscious reflection undergo a series of deep structural changes during the child’s development. Thus any theory of a ‘natural’ (ready made) or ‘innate’ language seemed unacceptable to Vygotsky, and it remains unacceptable to both of us. It seems to us that a contrast between the immediate (natural) evolution of animal behavior and the social (or language-based) development of the human mind (of man’s cognitive processes) has a much broader significance. This is why Vygotsky mentions the social origin of language and its influence on human thought as a central issue in scientific psychology. Vygotsky believed (as we do) that to think ‘Sunday will perhaps be warm’ is impossible without the participation of language, not because Sunday is a verbal concept, but because any conscious thought of the future is a mental process which needs language as its base, and it is impossible to deal with the future (as with the past) without the aid of inner speech as a derivation
of language. Thus Vygotsky accepted that even ‘practical intelligence’, i.e. constructing-tasks or problem-solving such as the Link blocks, etc. as well as the complex forms of active attention or memory are not ‘natural’ processes but can be realized only with the aid of inner speech - which is a specialized derivation of socially originated behavior. These statements are not mere philosophical speculations, or ‘a priori assumptions’. Vygotsky himself prepared a long series of articles on his experimental works which are only partly published in Russian (a six-volume collection of his articles is now in preparation), as well as publications such as The development of memory published by A. N. Leontiev in the thirties, and a series of experimental studies concerning the development of speech and its directive functions, published by A. R. Luria. Also, many experimental works by such pupils of Vygotsky as A. V. Zaporozhets, D. B. Elkonin, P. Ya. Galperin, and others give evidence that to apply the term ‘philosophical muddle’ to L. S. Vygotsky’s findings as does Dr. Fodor is entirely out of place when discussing the heritage of one of the most outstanding psychologists of our time. Let us now examine Dr. Fodor’s second ‘reflection’. He says that the idea ‘the meaning of the words evolve’ is of basic significance in Vygotsky’s theory. But he himself disagrees with the statement and gives a series of arguments which, according to him, render this idea untenable. His arguments can appear to be obvious and convincing at first glance, but a close examination shows how weak they really are: If the meaning of words were different for children and adults, then they would be talking different languages and no mutual understanding could be possible. ‘Vygotsky’s way of dealing with this objection is simply hopeless’, concludes Dr. Fodor. This statement can appear to be viable only if it is read at a superficial level without taking the context into consideration.
It is not true that a mutual understanding is possible only if word meanings are identical. It is well known and accepted by all psychologists and linguists that the word has a very complex structure - we need not repeat it to such an outstanding psychologist as Dr. Fodor. A word always designates an object (quality, action, or relation) and a common designation suffices for a mutual understanding. But every word is a complex matrix, in which the same object can form different systems of relations - these matrices of relations constitute the essence of word meanings. Thus when a child and an adult use the word ‘shop’, they are relating to the same object, but the child relates ‘shop’ to a set of empirical (emotional) impressions, whereas an adult has at his disposal many more potential systems and can therefore select from a much wider range of relations than can a child. It is obvious that the word ‘angle’ fundamentally differs in meaning for a pre-school child, a schoolboy, and a student in geometry - and it is not just a quantitative difference of images and associations. What is really important is that at each stage, the developing word meaning requires a different mental operation. Thus with a small child a basic role is played by immediate impressions (partly emotional), with a schoolchild this structure undergoes a deep change, and finally with an adult the mental operations required to process word meanings involve an extremely complex process of deep psychological changes - and to believe, as does Dr. Fodor, that these changes are only quantitative, i.e. that an adult ‘knows more’ than a child but uses the same psychological operations, is an assumption which brings us back to the old times of associationistic psychology but which hinders the further progress of psychological science. One can be a good psycholinguist, with excellent empirical works, but, as Dr. Fodor, one can come under the influence of a bad philosophy. We do not think it worthwhile to abandon the theories of Vygotsky and Piaget for those of Dr. Fodor. Doing so would be going right the way back to a kind of psychology which was abandoned at least two generations ago. In his third critical remark, Dr. Fodor argues that L. S. Vygotsky had a simplified notion of the basic essence of an adult’s thought, reducing it to Boolean Logics. Vygotsky understood the development of the child’s thought as a process of mastering abstract concepts and of selecting relevant criteria from a confused mass of sounds. We quite agree with Dr. Fodor’s remark that the processes of thought in an adult cannot be reduced to simply a process of abstraction and categorization, and that the flow of an adult’s thought depends to a great extent on his purposes, motives, and goals. Only a schizophrenic selects abstract criteria of a string, instead of using them for practical purposes: A normal adult will never do so.
But the fact is that a normal adult has various levels of logical thought, and he can use these levels differently according to his purposes and environmental requirements; a young child does not have these different levels of thought, and some theoretical operations are inaccessible to the child (if he does not acquire them through special forms of instruction). Thus a child who is capable of solving one kind of problem - by his own means - remains unable to solve other abstract problems, so that we have to find special ways of instructing (but under no circumstances simply of conditioning) so as to develop new forms of cognitive processes even in young school children. A series of works by Vygotsky’s followers in our country (D. B. Elkonin, V. V. Davydov, P. Ya. Galperin, et al.) has shown that the methods of education should not merely follow the steps of a child’s mental development: psychologically based instruction can strongly stimulate the mental development of the child, permitting younger children to acquire new forms of thinking. And this is the essence of our discussion with our friend Jean Piaget, which took place during the XVIII International Psychological Congress in Moscow, as well as in a series of publications. Vygotsky thought of his method of classifying blocks simply as a model to
demonstrate the qualitative stages of the basic forms of generalization, which change during the child’s mental development. This technique brings forward a psychological issue of great significance, namely that the acquisition of abstract operations opens new possibilities to thought and results in an immense enrichment in the possibility of finding new relations between concrete objects. This is why we do not believe in the separation of abstract and concrete thinking but - as in Marx’s philosophy - we suppose that a transition from the empirical to the categorical approach provides a new opening in dealing with concrete objects. There are different ways open to this development, and Vygotsky himself mentions that the acquisition of empirical concepts and that of scientific concepts have different psychological mechanisms. This is why we can hardly agree with an attempt to describe L. S. Vygotsky as a hard-headed defender of the idea of the development of thought as a linear approach to Boolean logic.
We would like to say a few words in connection with some of the points made by Fodor with which we agree. These have already been developed by Soviet psychologists during the last decade. We totally agree with Dr. Fodor’s statement that thinking depends heavily on the purposes which it serves and, we would like to add, on the form of activity (Tätigkeit) it is included in. Thus it would be too dogmatic to say that concepts become accessible to children at the age of 12 to 14 years. Since Vygotsky’s death nearly forty years ago, Soviet psychologists have not just been repeating his work: It has been a period of extremely intensive and creative work. Many of L. S. Vygotsky’s statements were enriched and elaborated, and a series of significant data were obtained at the same time that new ideas were formulated.
One of the basic steps in the development of psychology during these years was the elaboration of the general concepts of human actions and their psychological structures (see a series of publications by A. N. Leontiev, A. V. Zaporozhets, D. B. Elkonin, P. Ya. Galperin, L. I. Bozhovich, et al.). Another research project was the study of the relation between instruction and the mental development of a child, and the methods which could be used to intensify the course of his development, such as finding a method of teaching young school children of 7-8 years to master even complex concepts of algebra, linguistics, etc. (see the works of D. B. Elkonin, V. V. Davydov, P. Ya. Galperin, et al.). It was clearly shown how man’s motives and purposes result in new forms of activity, how actions and operations - different at various stages of development - are acquired, and how the most complex unity of a child’s personality is formed. These data were preceded by a series of theoretical hypotheses formulated by Vygotsky in the last period of his life. Later they formed part of a highly elaborated field of psychology. We are mentioning all this to say that Vygotsky’s contribution to psychology is much greater than what is contained in Thought and language (which,
by the way, appeared in a shortened version in English). So Dr. Fodor’s assumption that the development of the basic forms of thinking depends, to a great extent, on goals, purposes, and real tasks, fully accords with all that has been said by Soviet psychologists in the last decades. Thus we called Dr. Fodor’s ‘Reflections on L. S. Vygotsky’s Thought and language’ a ‘half-truth’. We wished to point out what seemed to us to be serious mistakes in interpretation, and to express our disagreement frankly. At the same time we wished to underline what we believe to be true in Dr. Fodor’s article, and the points with which we totally agree.
Some comments on Fodor’s ‘Reflections on L. S. Vygotsky’s Thought and language’
H. SINCLAIR University of Geneva
In his discussion of Vygotsky’s well-known work on language and thought, Fodor appears to equate Piaget’s views on cognitive development with those of Vygotsky. This is at least partly misleading. Though development in Piaget’s theory is considered to proceed by stages, and though Piagetians indeed assume that ‘operations of a specific computational power are either available or absent across the board at any given developmental stage’ (under certain conditions, at least), they do not think of development as an accumulation of isolated new operations. Nor is it correct to imply that Piagetians, like Vygotsky, label certain concepts as ‘concrete’ or ‘abstract’. Piaget’s stages are defined by systems: systems of actions (during the sensorimotor period, before the infant becomes capable of representation), systems of one-way mappings or semi-logical functions (during the period of intuitive thought, until the age of about six), and systems of mental operations (at what has been called the ‘concrete’ and ‘formal’ levels of reasoning). Piaget has always emphasized the isomorphism between what he calls the group-like structure of sensorimotor action-patterns and the mathematical group-structures of formal thought. Inside this framework there is no way to label the notion of ‘tableware’ (to take one of Fodor’s examples) as a sensorimotor, concrete or formal concept per se: It is only the operative system of which it can be the content that can be so defined. For example, a six-year old may agree that there are fewer spoons than tableware objects, but, generally, he will not understand that there are more non-spoons than non-tableware objects. A twelve-year old will accept and explain the latter statement, and may add that in a country where only spoons, no knives or forks, are used, it would not hold true.
We do not, ordinarily, converse about non-spoons, and young children are perfectly capable of talking intelligently with adults about knives and their dangers (from the operative point of view, ‘danger’ is no more ‘abstract’ than ‘knife’). Nonetheless, the operative reasoning system adults or older children can bring to bear on such notions is different from that of younger children.
The developmental view presented by Fodor has much in common with Piaget’s theory. Piaget has always stressed the fact that babies exhibit a number of action-patterns, each of which forms a small organized totality, and that development consists in ever wider coordinations and integrations of such patterns. Small infants can suck their fingers in a most accomplished way, they can also look at objects and follow their movements, but only later on can they combine both patterns and look at what they grasp or grasp what they are looking at. By contrast, when Fodor views development as a widening application of early installed (or maybe innate) special purpose computational apparatus, thus assuming a difference in quantity rather than in quality at different developmental levels, this cannot be reconciled with Piaget’s theory. In the first place, Piaget considers coordinations of special action-patterns as qualitative changes; and this is where his epistemological view repudiates empiricism, since Piaget emphasizes the capacity of the human mind to create genuine novelties (through the recombination of already existing elements, just as happens in biology). Secondly, though these action-patterns (and the later thought-patterns) form organized totalities, they are not rigidly pre-programmed; rather, their functioning creates disequilibria and conflicts which will lead to their integration with other patterns and thereby to their reorganization. This is where Piaget’s psychological theory departs both from the theories of learning by association and transfer, and from the purely maturational view of development. The ‘ethological plausibility’ of Piaget’s developmental theory has been explored by Etienne (1972) and its biological basis has been formulated by Piaget himself (1967).
REFERENCES
Etienne, A. (1972) Paper presented at the symposium on Constraints on learning, Cambridge, England, April.
Piaget, J. (1967) Biologie et connaissance, Paris, Gallimard.