Cognition, 8 (1980) 227-241
© Elsevier Sequoia S.A., Lausanne - Printed in the Netherlands
Classes and Collections: Principles of Organization in the Learning of Hierarchical Relations
ELLEN M. MARKMAN, MARJORIE S. HORTON, ALEXANDER G. McLANAHAN*
Stanford University
Abstract
This study demonstrates that children tend to distort class inclusion relations (e.g., the relation of oaks to trees) into the part-whole structure of collections (e.g., the relation of oaks to a forest). Children aged 6 to 17 were taught novel class inclusion hierarchies, analogous to the relation between oaks, pines, and trees. In one condition, the class inclusion relations were taught by ostensive definition alone, e.g., stating “These are trees” while pointing to trees and “These are oaks” while pointing to oaks. In the second condition, children were additionally told what would be analogous to “Oaks and pines are two kinds of trees”. With this additional information to constrain their interpretation, even the youngest children correctly interpreted the relation as class inclusion. In contrast, with limited information, children as old as 14 erroneously imposed a collection structure on the inclusion hierarchies. They would deny, for example, that any single tree was a tree (as they should if they thought “tree” meant “forest”), and would pick up several trees despite being asked for a tree. The results indicated that the part-whole structure of collections is simpler to establish and maintain than the structure of inclusion.
*We are grateful to H. H. Clark and J. H. Flavell for their helpful comments and to Mr. Richard D’Amelio and the staff of Bayside Middle School, Mr. Warren Edwards and the staff of Aragon High School, Mr. J. Patterson and the staff of Hoover Elementary School, Mrs. Inez Finkel and the staff of Gavello Glen Nursery School, Mrs. Marjorie Ward and the staff of Sunflower Nursery School, and the staff of Bing Nursery School for their participation in this study. This work was supported in part by PHS research grant MH 28154. Requests for reprints should be sent to Ellen M. Markman, Department of Psychology, Stanford University, Stanford CA 94305, U.S.A.

People routinely learn hierarchically organized class inclusion relations, e.g., that chairs are furniture, that poodles are dogs, that oaks are trees. In learning such relations, they must figure out how two words (e.g., “oaks” and “trees”) apply to the same objects. Since class inclusion relations are pervasive and readily acquired, they could be the most basic or simplest principle of hierarchical organization. On this assumption, inclusion should be the first hypothesis children would consider when confronted with a novel hierarchy.

Despite its prominence, class inclusion is not the only hierarchical organization possible. One alternative is the collection structure described by Markman and Seibert (1976), e.g., the relationship between trees and forest, soldiers and army. Although collective nouns (“forest”, “army”, etc.) are scarce relative to class terms, we will argue that in some ways collections reflect a psychologically simpler principle of hierarchical organization. In particular, it may be easier for children to construct a collection organization than a class organization in their initial acquisition of novel hierarchies.

Collections differ from classes in several ways (Markman & Seibert, 1976). First, collections are organized into something like part-whole relations, e.g., trees are part of a forest, they make up a forest, they are in a forest. In contrast, classes are organized into class inclusion relations, e.g., oaks are trees, they are kinds of trees, they are examples of trees. Second, members of a collection must be related to each other, e.g., for trees to form a forest they must exist in some type of spatial proximity to each other. Class membership does not require such interrelationship. For example, one can establish whether or not a given object is a tree by examining its properties independently of its relation to other trees. Third, because of these differences in internal structure, collections have greater psychological integrity or coherence than classes. It should be simpler to conceptualize collections as organized wholes or aggregates than to do so for classes. There are several lines of argument and evidence which lend support to this analysis (Markman, 1973, 1978, 1979; Markman & Seibert, 1976).
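The contrast between the two structures can be illustrated with a small sketch. It is only an illustration of the distinction drawn above, not anything used in the study; the names `Item`, `Collection`, `is_a`, and `is_instance_of_collection` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    """A single object (e.g., one tree) and the class labels that apply to it."""
    kinds: frozenset


@dataclass
class Collection:
    """A whole made up of related parts (e.g., a forest made up of trees)."""
    name: str
    parts: list = field(default_factory=list)


def is_a(item, label):
    # Class inclusion: membership is decided from the item's own properties,
    # independently of any other item.
    return label in item.kinds


def is_instance_of_collection(thing, collection):
    # Part-whole: only the aggregate itself is an instance of the collection;
    # a single part is "part of" it but is not "a" forest.
    return thing is collection


oak = Item(kinds=frozenset({"oak", "tree"}))
pine = Item(kinds=frozenset({"pine", "tree"}))
forest = Collection(name="forest", parts=[oak, pine])
```

Under the class structure every oak is itself a tree (`is_a(oak, "tree")` holds), whereas under the collection structure an oak is in `forest.parts` without being an instance of the forest.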
The most relevant to the present argument are the studies of Piagetian class inclusion (Markman, 1973; Markman & Seibert, 1976). In the standard Piagetian class inclusion problem, children are presented with a display of objects that form a small “zoological” hierarchy, that is, one superordinate set consisting of two mutually exclusive subordinate sets. One of the subordinate sets contains more members than the other and the child is asked to make a quantitative comparison between the larger of the two subordinate sets and the superordinate set. As a concrete example, a child shown four oaks and two pines would be asked “Are there more oaks or more trees?”. This task was used by Inhelder and Piaget (1964) as one index or test of concrete operations. They argued that to solve this problem a child must be able to simultaneously deal with class addition (oaks + pines = trees) and class subtraction (trees - pines = oaks). That is, the child must be able to subtract oaks out from the whole set, trees, and simultaneously include it in the whole set in order to make this “part-whole” (oaks versus trees) comparison. Note that failure to solve this problem does not mean that a
child does not know the inclusion relation between oaks and trees. Children who know that oaks are trees could still easily fail to solve this problem. In fact, for Piagetian theory, it is important that children know the inclusion relation at issue because otherwise they could fail the class inclusion question out of ignorance rather than lack of operativity. Thus, the task does not measure the child’s knowledge of the hierarchical relation of inclusion but rather his ability to operate upon that hierarchical relation, adding and subtracting classes and comparing parts and wholes.

Until they are about eight years old, children generally find the class inclusion problem very difficult. Instead of making the part-whole comparison asked for (“Are there more oaks or more trees?”), young children make part-part comparisons (oaks versus pines). Markman and Seibert (1976) suggested that it is because the superordinate class lacks sufficient psychological coherence that once it is divided into two parts children can no longer maintain it in mind. If collections have greater psychological coherence, then children should be better able to maintain the collection in mind even while attending to its subparts. In several studies, children have been shown to be able to make part-whole comparisons on collections that they cannot make on classes. In all of these studies, children hearing collection descriptions viewed identical displays and were asked identical questions to children hearing class descriptions. Simple substitution of collection terms for class terms markedly improved children’s ability to answer these questions. These results indicate that collections are more stable with respect to their hierarchical organization than classes. The part-whole structure of collections forms a hierarchical relationship that is easier for children to operate upon than a class inclusion hierarchy. If so, then the collection hierarchy may be simpler to form as well as to operate upon.
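The difference between the part-whole comparison that the question asks for and the part-part comparison that young children tend to make can be shown with the four-oaks, two-pines display. This is a minimal sketch; the function name `more_oaks_or_trees` and its `compare_with_whole` flag are hypothetical.

```python
from collections import Counter


def more_oaks_or_trees(display, compare_with_whole=True):
    """Answer "Are there more oaks or more trees?" for a list of item labels.

    compare_with_whole=True  -> part-whole comparison (oaks versus all trees).
    compare_with_whole=False -> part-part comparison (oaks versus pines),
                                the typical error of young children.
    """
    counts = Counter(display)
    oaks = counts["oak"]
    if compare_with_whole:
        other, other_name = sum(counts.values()), "trees"  # class addition
    else:
        other, other_name = counts["pine"], "pines"
    return "oaks" if oaks > other else other_name


display = ["oak"] * 4 + ["pine"] * 2
```

With this display the part-whole comparison yields "trees" (6 trees versus 4 oaks), while the part-part comparison yields "oaks" (4 oaks versus 2 pines).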
In particular, under circumstances where children are relatively free to impose their own structure on a novel hierarchy, they might prefer a collection to a class organization. To test this hypothesis we contrived a situation in which children were provided with minimal information about the hierarchical relation to see what their spontaneous interpretation would be. We taught children novel class inclusion hierarchies, analogous to learning the relation between oaks, pines, and trees. We used ostensive definition (pointing and labelling) in order to achieve a minimal specification of the relationship. As a concrete example, imagine that oaks and pines are presented in a row in front of the subject. While pointing to the oaks the experimenter would say, “These are oaks”; while pointing to the pines the experimenter would say, “These are pines”; and while pointing to both the oaks and pines, the experimenter would say, “These are trees”. The plural in “These are trees” accompanied by pointing to the trees means that each individual tree is a tree. The singular would have
to be used to properly refer to a collection, e.g., “This is a forest”. Thus, this minimal information indicates a class inclusion interpretation for the objects presented.¹

If children do impose a collection structure on the hierarchy then they should believe that all of the elements together form an instance of the concept at the higher level of the hierarchy (“trees” in the example) but not that any single element is an instance. To see why, suppose a child is questioned about a collection, e.g., a forest. If asked to point to the forest, the child should point to many trees but should deny that a pine or any other single tree is itself a forest. Thus if a child mistakenly imposed a collection structure on a class inclusion relation like oaks and trees, he should point to many trees even when asked to point to a tree, and should deny that any single tree was a tree. The ostensive definition specifies inclusion only minimally so it allows children to reveal their preferred organization.

Since, under ordinary circumstances, children obviously can learn class inclusion hierarchies, it is important to demonstrate that, with enough information constraining the interpretation of the relation, they could learn the inclusion hierarchies we are teaching them as well. Thus ostensive definitions will be compared with statements that more explicitly specify inclusion, e.g., “Oaks and pines are kinds of trees”.

In summary, when given enough information constraining their interpretation, children should be able to learn novel class inclusion hierarchies. When information about the relation is limited, children will be relatively free to impose their own structure on the hierarchy and may then reveal their preference for collection organizations.
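The predictions just described can be written out as a small sketch. It covers only the upper-level questions, where the two interpretations diverge; the function names and dictionary keys are hypothetical, and treating "many" as "all items" in the prediction is a simplifying assumption.

```python
def predicted_upper_level_responses(interpretation, n_items):
    """Predicted responses to "Show me a C" and "Is this a C?" (single item shown)."""
    if interpretation == "class":
        # Each single item is itself a C.
        return {"show_me_a_C": 1, "is_this_a_C": "yes"}
    if interpretation == "collection":
        # Only the items taken together form a C (assumption: the whole array).
        return {"show_me_a_C": n_items, "is_this_a_C": "no"}
    raise ValueError(f"unknown interpretation: {interpretation!r}")


def classify_behavioral_response(n_selected):
    """Label an observed "Show me a C" response by how many items were chosen."""
    if n_selected < 1:
        raise ValueError("at least one item must be selected")
    return "class" if n_selected == 1 else "collection"
```

For a six-item display, the collection interpretation predicts that the child selects several items and answers "no" when a single item is indicated.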
Methods
Materials
Four novel categories were constructed, each composed of two subcategories. All of the category exemplars were small construction paper figures. Two of the four categories were of animals. One of these was designed so that its two subcategories were relatively similar to each other, while the other was designed so that its two subcategories were relatively dissimilar. Two of the four categories were of shapes. One of these was designed so that its two subcategories were relatively similar to each other, while the other was designed so that its two subcategories were relatively dissimilar. Illustrations of category exemplars appear in Figure 1. Order of presentation of category type (animal or shape) and degree of similarity was counterbalanced over subjects. In each of the four displays, subjects saw four exemplars of one subcategory and two exemplars of the other. The numerosity of the subcategories was counterbalanced across category type and order of presentation. Twelve CVC nonsense syllables constituted the names for the novel figures. Twelve different assignments of these nonsense syllables were randomly assigned to the 24 subjects per age group with the restriction that each be used twice. The assignment ensured that individual terms were counterbalanced across category type, degree of similarity and hierarchical level.

¹ That is, all of the designated objects from the subclass are members of the class but not the converse. Though ostensive definition can establish this proper inclusion relation for the objects presented, it is possible that an overlap relation could exist for other class members.

Figure 1. Illustrations of the exemplars from each category (panels: Animate-Similar, Shape-Similar, Animate-Dissimilar, Shape-Dissimilar).

Subjects
Ninety-six children participated. There were 24 children from each of four grades: first and second (mean age = 6;8), sixth (mean age = 11;10), eighth (mean age = 14;1), and eleventh and twelfth (mean age = 17;3). There were equal numbers of males and females at each grade. Each child was seen individually. For the youngest age group, there was one experimenter. For the older age groups, there were two experimenters counterbalanced with sex and condition over subjects.

Procedure
Within each grade, children were randomly assigned to one of two learning conditions, ostensive or inclusion, with the restriction that the conditions be equated for sex. Each child was trained and then tested on one category at a time. Members of the category were lined in a row in front of the child with the members of each subcategory grouped together. The two subcategories were separated slightly. For both training conditions, the labels taught to the child were plural, analogous to “These are trees”. Testing always involved singular terms, analogous to “Is this a tree?”. The two conditions differed in the type of information presented during training. For the ostensive condition, the child saw the experimenter point to the row or part of the row while labelling the objects. In the inclusion condition, after this ostensive information was provided, subjects were also given inclusion information, analogous to “Oaks and pines are kinds of trees”. Details of the procedure are provided below. In this discussion, the two bottom nodes of the hierarchy, that is, the two subcategories, will be referred to as “A” and “B”. The top node will be referred to as “C”.

Ostensive condition
The experimenter first pointed to each of the two subcategories, As and Bs (analogous to oaks and pines) or the entire category Cs (analogous to trees) and simply labelled them. Whether the top node (C) or the bottom nodes (A and B) were mentioned first was counterbalanced over items. The procedure is analogous to pointing to oaks and saying, “These are oaks”, pointing to pines and saying, “These are pines”, and pointing to trees and saying, “These are trees”. The experimenter always pointed to each array from left to right, in an identical manner for each. This pointing and labelling was repeated with pauses after each label to allow the child to provide the correct label, i.e., “These are ___”. If the child erred in repeating the label, the experimenter stated the correct label and again prompted the child. After the subject correctly repeated the experimenter-provided labels, the experimenter pointed to the arrays and requested the subject to name each by saying, “These are ___” and having the child provide the name. The
correct answer was provided for children who did not give the correct label. This procedure was repeated until subjects were able to provide all three correct labels, one for the category and each of its subcategories.

The child was then asked eight questions about the array, all of the questions using singular terms. Table 1 presents examples of questions and the scoring of class versus collection responses. Four of the questions were about the entire category (C) and four were about the subcategories (A or B). For the entire category, two of the questions required behavioral responses: “Show me a C” and “Put a C in the envelope”. These are analogous to being asked, “Show me a tree” and “Put a tree in the envelope”. The other two questions required yes-no responses. While pointing to one exemplar of the category the experimenter asked, “Is this a C?”. This is analogous to the experimenter pointing to a pine and asking, “Is this a tree?”. For the subcategories, two of the questions required behavioral responses, “Show me an A” and “Put an A in the envelope”. These are analogous to “Show me an oak” and “Put an oak in the envelope”. The other two questions required “Yes-No” responses, “Is this a B?” (pointing to a B), and “Is this a B?” (pointing to an A). These are analogous to being asked, “Is this a pine?” (pointing to a pine), “Is this a pine?” (pointing to an oak). These eight questions were randomly ordered for the displays with the constraint that no more than three bottom node questions (A or B) or three top node questions (C) could occur consecutively. After every two questions, subjects were reminded of the labels for the three categories. These instructions were given as in training, e.g., “These are As, these are Bs, these are Cs”.

Table 1. Classification of the Eight Question Types and Possible Responses
(Class = response consistent with a class interpretation, e.g., oaks, pines, trees; Collection = response consistent with a collection interpretation, e.g., oaks, pines, a forest.)

Upper Level Questions
  Behavioral
    1. Show me a C. Class: Child points to a C. Collection: Child points to several or all of the Cs.
    2. Put a C in the envelope. Class: Child puts a C in the envelope. Collection: Child puts several or all of the Cs in the envelope.
  Yes-No
    3. Is this a C? (Experimenter points to a C.) Class: Yes. Collection: No.
    4. Is this a C? (Experimenter points to another C.) Class: Yes. Collection: No.
Lower Level Questions
  Behavioral
    5. Show me an A. Class: Child points to an A. Collection: Child points to an A.
    6. Put an A in the envelope. Class: Child puts an A in the envelope. Collection: Child puts an A in the envelope.
  Yes-No
    7. Is this a B? (Experimenter points to a B.) Class: Yes. Collection: Yes.
    8. Is this a B? (Experimenter points to an A.) Class: No. Collection: No.

Inclusion condition
The pointing and labelling procedure for each of the three nodes per category was identical to the procedure in the ostensive condition. The only difference was in the additional information provided about category membership: “As are a kind of C”, “Bs are a kind of C” and “As and Bs are two kinds of Cs”. This is analogous to, “Oaks and pines are two kinds of trees”. This information was given immediately after the labelling and pointing. When the experimenter provided these statements about the inclusion relationships, he or she did not point to any aspect of the array. The procedures for testing and correcting the child’s knowledge of the labels and the learning criterion were identical to those of the ostensive condition. The eight testing questions and the procedure for asking them were identical to those of the ostensive condition. The only difference is that when children were reminded of the labels, they were also reminded of the inclusion relation, e.g., “As and Bs are two kinds of Cs”.
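The ordering constraint on the eight test questions (no more than three upper-level or three lower-level questions in a row) can be sketched as follows. The paper does not say how the constrained random orders were generated; reshuffling until the constraint holds is an assumption made here for illustration, and `max_run`, `order_questions`, and `QUESTIONS` are hypothetical names.

```python
import random


def max_run(levels):
    """Length of the longest run of identical consecutive entries."""
    longest = run = 0
    previous = None
    for level in levels:
        run = run + 1 if level == previous else 1
        longest = max(longest, run)
        previous = level
    return longest


def order_questions(questions, rng, limit=3):
    """Shuffle (number, level) pairs until no level occurs more than `limit` times in a row."""
    order = list(questions)
    while True:
        rng.shuffle(order)
        if max_run([level for _, level in order]) <= limit:
            return order


# Questions 1-4 concern the top node (C); questions 5-8 concern the bottom nodes (A or B).
QUESTIONS = [(n, "upper") for n in range(1, 5)] + [(n, "lower") for n in range(5, 9)]
order = order_questions(QUESTIONS, random.Random(0))
```

Any order returned contains all eight questions and never places four questions about the same hierarchical level consecutively.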
Results
There are two types of errors that children could make that would suggest they have imposed a collection structure on the class inclusion hierarchy they were taught (see Table 1). Both of these errors were expected to occur on the upper level of the hierarchy, the entire category, C, and not on the two subcategories, A and B. For the items that required a behavioral response, “Show me a ___” and “Put a ___ in the envelope”, selecting more than one or all of the items is consistent with a collection but not a class interpretation.² Thus, these errors should be largely restricted to the upper level of the hierarchy. For the “Yes-No” questions, “Is this a ___?”, the collection error is to say “No”, to deny that a single instance is an instance of the category. The number of incorrect “No” responses here should be significantly greater than the number of incorrect Yes-No answers at the lower hierarchical levels. These latter errors should reflect memory problems, not collection interpretations.

Table 2 presents the mean number (out of a maximum 8 per cell) of collection errors by grade, learning condition, hierarchical level, and question type. Preliminary analyses indicated that there were no significant differences in errors as a function of sex, order of mention of the upper or lower levels, type of category (animate or shape) or degree of similarity of the subcategories. These factors do not appear in the subsequent analyses.

The error data were analyzed by grade (4) × learning condition (2) × hierarchical level (2) × question type (2) mixed ANOVA, with repeated measures on the hierarchical level and question type. The main effect of grade was significant, F(3,88) = 7.21, p < 0.001. Children in the eleventh grade made fewer errors than children in each of the other grades, all ts(46) > 2.40, ps < 0.05. None of the other grades differed significantly from each other. Children in the ostensive learning condition averaged 1.84 errors compared to children in the inclusion condition, who averaged 0.47 errors, F(1,88) = 32.78, p < 0.001. Thus, when children were given more explicit information, they had little trouble interpreting inclusion relations.

Table 2. Mean Number of Errors for Yes-No and Behavioral Questions by Grade, Learning Condition, and Hierarchical Level

Ostensive Learning Condition
         Upper Level           Lower Level
Grade    Yes-No   Behavioral   Yes-No   Behavioral   Total
2        5.83     4.25         0.33     0.25         2.61
6        4.11     3.33         0.33     0.00         1.96
8        4.17     4.58         0.25     0.00         2.25
11       0.83     0.75         0.33     0.00         0.48

Inclusion Learning Condition
         Upper Level           Lower Level
Grade    Yes-No   Behavioral   Yes-No   Behavioral   Total
2        2.17     0.67         0.50     0.08         0.85
6        1.17     2.00         0.33     0.17         0.92
8        0.08     0.08         0.08     0.00         0.06
11       0.17     0.00         0.00     0.08         0.06

² The majority of these errors was to pick up the entire array of objects. The only other common error was to select one A and one B. This error was counted as a collection error for the following reasons: (1) Selecting two objects is clearly inconsistent with the singular “Show me a C”. (2) The profile of children making this error was indistinguishable from children who selected all of the items. In fact they were sometimes the same children. In particular, children who selected two objects were just as likely to deny that a single object was a C as were children who selected all objects. Some commented that you need one of each to make a C.
The grade × learning condition interaction was significant, F(3,88) = 2.76, p < 0.05. Children in the second grade made significantly fewer errors in the inclusion condition than in the ostensive condition, t(23) = 4.55, p < 0.001, as did children in the eighth grade, t(23) = 4.50, p < 0.001. At sixth grade the difference in conditions was not significant, t(23) = 1.63, p > 0.10, though children were still frequently erring. At eleventh grade, children made very few errors and there was no significant difference in condition, t(23) = 1.28, p > 0.10.

The hypothesis that children treat a novel hierarchy as a collection structure rather than a class inclusion relation predicts that collection errors should predominate in the upper level of the hierarchy. This was strongly supported by the significant hierarchical level effect, F(1,88) = 63.93, p < 0.001. Overall children averaged 2.14 errors on the upper level while errors at the lower level were very infrequent, averaging only 0.172.

Table 3 presents the number of children at each grade giving collection responses to at least 50% of the questions about the upper level of the hierarchy. The comparable data for the questions about the lower levels can be quickly summarized: No child at any grade from either condition consistently gave a collection response to the questions about the lower nodes. In contrast, quite a few children gave collection responses to the upper level questions. In the ostensive learning condition, about half (56%) of the children in grades 2-8 answered the majority of the questions about the upper level consistent with a collection, not a class, interpretation. Only one of the eleventh graders did so.

Returning to the data in Table 2, the hierarchical level × grade interaction was significant, F(3,88) = 5.34, p < 0.005.
At all grades except the eleventh, children made more errors on the upper than the lower levels, all t-prs(23) > 3.21, p < 0.01. At eleventh grade children made very few errors at all and these do not differ significantly with level, t-pr(23) < 1.

Table 3. Number of Children Giving Collection Responses to at Least 50% Versus Less Than 50% of the Upper Level Questions, by Grade and Learning Condition

Percent of Collection Responses to Upper Level Questions
         Ostensive Learning Condition   Inclusion Learning Condition
Grade    ≥50%     <50%                  ≥50%     <50%
2        8        4                     1        11
6        5        7                     4        8
8        7        5                     0        12
11       1        11                    0        12
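The "about half (56%)" figure for grades 2-8 in the ostensive condition can be reproduced from the Table 3 counts. This is a worked check, not part of the original analysis; the counts are as read from Table 3 (12 children per grade in each condition), and the variable names are hypothetical.

```python
# Children giving collection responses to at least 50% of upper-level questions,
# ostensive condition, keyed by grade (counts as read from Table 3).
ostensive_majority = {2: 8, 6: 5, 8: 7, 11: 1}
children_per_cell = 12  # 24 children per grade, split evenly over two conditions

younger_grades = [2, 6, 8]
n_majority = sum(ostensive_majority[g] for g in younger_grades)  # 8 + 5 + 7 = 20
n_children = children_per_cell * len(younger_grades)             # 36
percent = round(100 * n_majority / n_children)                   # 20/36 is about 56%
```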
The hierarchical level × learning condition interaction was significant, F(1,88) = 29.32, p < 0.001. The difference in the frequency of errors at the upper and lower nodes is much greater for the ostensive condition (3.49 versus 0.187) than for the inclusion condition (0.792 versus 0.156). The number of errors at the lower levels does not differ with condition, 0.187 versus 0.156, t(94) < 1. This is as one would expect if these responses reflected noise and memory errors. In contrast, the number of errors at the higher level, which should reflect collection interpretations, did differ with condition (3.49 versus 0.792), t(94) = 5.08, p < 0.001.

The hierarchical level × learning condition × grade interaction was marginally significant, F(3,88) = 2.68, p < 0.10. At second grade the difference between upper and lower levels was significant for the inclusion condition, t-pr(11) = 3.04, p < 0.01, as well as for the ostensive condition, t-pr(11) = 6.92, p < 0.001. At no other grades was there a significant difference in the number of errors at the two levels for the inclusion condition, all t-prs(11) < 1.66, ps > 0.10. At sixth and eighth grade there were significant level differences for the ostensive condition, t-pr(11) = 3.43, p < 0.01 for sixth grade and t-pr(11) = 4.15, p < 0.01 for eighth grade. At eleventh grade there was no level difference even in the ostensive condition, t-pr(11) < 1.

There were more errors for the Yes-No questions than for the questions requiring a behavioral response, F(1,88) = 8.61, p < 0.005. The question type × grade interaction was significant, F(3,88) = 4.70, p < 0.005, as was the question type × grade × hierarchical level interaction, F(3,88) = 5.41, p < 0.002. For upper level questions, the eleventh graders were making fewer errors than each of the other grades for both the Yes-No and the behavioral questions, all ts(46) > 2.24, ps < 0.05.
For upper level questions involving a behavioral response there were no differences in the remaining three grades, all ts(46) < 1. For Yes-No responses at the upper level, the eighth graders outperformed the second graders, t(46) = 2.21, p < 0.05. At the lower levels there were no significant grade differences for either the Yes-No or the behavioral questions. However, there were marginally significant differences between eighth and second graders for the behavioral questions, t(46) = 1.70, p < 0.10, and between second graders and eleventh graders, t(46) = 1.76, p < 0.10, and eighth graders, t(46) = 1.76, p < 0.10, for the Yes-No questions. These results indicate that the developmental differences were considerably more marked for the upper level questions and appeared for both the Yes-No and the behavioral questions. At the lower levels, there were small developmental differences and they appeared somewhat greater for the Yes-No questions.

In summary, when told that “As are a kind of C”, analogous to being told “Oaks are a kind of tree”, even the youngest children in this study correctly
interpreted novel hierarchies as class inclusion relations. As long as the relation was explicitly specified, children interpreted it as inclusion. When relations were taught only by ostensive definition, until eleventh grade, children often erroneously treated the relations as collections. When asked what would be analogous to “Show me a pine”, children correctly picked up a single pine. When the experimenter, while pointing to a pine, asked “Is this a pine?”, children almost invariably responded correctly. The errors occurred only on the upper level of the hierarchy, analogous to trees. When asked the analogue to “Show me a tree”, children often scooped up a handful rather than just one. When the experimenter, while pointing to a tree, asked “Is this a tree?”, children often said “No”. This pattern of responding is exactly as one would expect if children were answering questions about a collection.
Discussion
Several factors worked against children imposing a collection structure on these novel hierarchies. First, collective nouns are relatively rare. Children have more exposure to and practice learning class terms. Thus, on the basis of word frequency, they should be biased toward class inclusion, not collection, organizations. Second, children had to erroneously impose a collection structure on the hierarchy. In the initial learning phase, they had to overlook the plural construction “These are Cs”, which means any one of the objects referred to is a C, and misinterpret the phrase to mean the entire array is a C as if it meant “This collection is a C”. In testing they had to pick up several objects despite hearing a request for a single object. Third, at least occasionally children must have begun to treat the novel terms correctly and then reinterpreted them as collections. When children were learning the concepts, the order in which they were taught the upper versus lower nodes had no effect on their interpretation of the hierarchies as class or collection structures. But when what children heard first was an upper level term, analogous to “These are trees”, they certainly would have treated the term as a class term, i.e., they would have assumed that each tree was a tree. It is only after the lower level terms were introduced that hierarchical learning began, and only then that children could have reinterpreted the upper level node as a collection. Despite these factors, children as old as fourteen still misinterpreted inclusion relations as collections.

Why was learning inclusion so difficult in the ostensive condition? One possibility is that because ostensive definition gives limited information, it is probably not typically used to convey hierarchical relations. Children may expect that when someone points to various objects and names them that he
is naming them at the same hierarchical level, often the basic level (Anglin, 1977; Rosch, Mervis, Gray, Johnson, and Boyes-Braem, 1976). If children are initially set to interpret labels at the same level, then they may be puzzled by hearing two labels for the same objects. It is as if they think “You just told me these are As and now you are telling me they are Cs”. Note that the problem arises in part because there are two is a relations. With only minimal linguistic cues to guide them, the children must assign meaning to terms in such a way as to resolve the apparent conflict between x is an A and x is a C. Establishing a hierarchical relation between the two terms will solve the problem. In the inclusion condition of this study, the hierarchical relation of inclusion was specified explicitly. Thus it would be expected to help the children, as it did. In the ostensive definition condition, the children were left to resolve the apparent contradiction between x is an A and x is a C without the clarification provided in the inclusion condition. The hierarchical collection structure resolves the conflict since one of the is a relations is replaced by a part-whole relation: x is an A but x is part of C. The problem with this solution, of course, is that it is incorrect. However, the only clue to its incorrectness comes from singular-plural usage of English. From “These are Cs”, it follows that any one of the objects referred to is a C. Interpreting the relation as a collection violates this rule. Older children may be more disturbed by such a violation than younger children. When some of the children who participated in this study were informally interviewed about their answers, they used singular terms in ways that are consistent with a collection, not a class, interpretation: e.g., “This is a whole C”, “You have to put these together to make a C”, “The whole thing is a C”. One child claimed that the experimenter had pointed to the array saying, “That’s a C”. 
Apparently children’s memory for “These are Cs” was distorted to “This is a C” to be consistent with a collection interpretation. Perhaps by sixteen this incorrect usage is too salient to permit a collection error, thus forcing children to seek other interpretations. Another reason why interpreting the hierarchy as one of class inclusion may be difficult is that the asymmetry of the inclusion relation can cause confusion and memory problems. That children have difficulty with many aspects of class inclusion has been documented by Inhelder and Piaget (1964). For example, children find questions such as “Are all the roses flowers?” and “Are all the flowers roses?” very difficult. They initially tend to answer these questions as if they referred to the total identity of the two classes, a symmetrical relation, rather than to the hierarchical relation of inclusion. By sixth and eighth grade, children would be able to answer such
questions about familiar classes, yet could still have difficulty with the type of problem we posed for them. Even adults become confused about class inclusion relations for unfamiliar or abstract material. In syllogistic reasoning tasks, for example, from premises such as “All As are Cs”, adults will erroneously infer “All Cs are As”; that is, when the information is abstract, adults fail to maintain the asymmetry of inclusion (Ceraso and Provitera, 1971; Chapman & Chapman, 1959; Revlis, 1975). Thus, young adolescents could be expected to have related problems in the ostensive learning condition of the present study. In particular, to establish class inclusion relations, one must keep track of the asymmetrical relations; one must keep the different levels of the hierarchy distinct. Collection structures are asymmetrical relations as well. Yet, the part-whole structure of collections seems to form a more stable asymmetrical relation than class inclusion. The whole formed by collections is more coherent than that of classes (Markman & Seibert, 1976) and intuitively the part-whole organization seems more concrete. In some ways the collection structure is analogous to the part-whole organization of objects. This asymmetrical relation might be less prone to be distorted into a symmetric relation than class inclusion. An arm, for example, is part of a person, but a person is not part of an arm. Similarly a tree is part of a forest, a forest is not part of a tree. If there is less confusion of part and whole, the asymmetrical relation of collections would have greater psychological stability and would not so readily degenerate into a symmetrical relation. At least under the conditions of the present study, children apparently found it easier to impose a collection structure on a novel hierarchy than to interpret it as inclusion. 
Admittedly, ostensive definition is not the normal way in which hierarchical relations are taught, so such collection errors may be unlikely in natural situations. However, there are some reports that children first acquiring superordinate terms distort some class inclusion relations into collections (Macnamara, Note 1; Valentine, 1942). Thus even in naturally occurring contexts, very young children may find it simpler to impose the part-whole structure of collections on inclusion hierarchies they are trying to learn.
References
Anglin, J. M. (1977) Word, Object, and Conceptual Development. Norton, New York.
Ceraso, J., and Provitera, A. (1971) Sources of error in syllogistic reasoning. Cog. Psychol., 2, 400-410.
Chapman, L. J., and Chapman, J. P. (1959) Atmosphere effect reexamined. J. exper. Psychol., 58, 220-226.
Inhelder, B., and Piaget, J. (1964) The Early Growth of Logic in the Child. Norton, New York.
Markman, E. M. (1973) The facilitation of part-whole comparisons by use of the collective noun “family”. Child Devel., 44, 837-840.
Markman, E. M. (1978) Empirical versus logical solutions to part-whole comparison problems concerning classes and collections. Child Devel., 49, 168-177.
Markman, E. M. (1979) Classes and collections: Conceptual organization and numerical abilities. Cog. Psychol., 11, 395-411.
Markman, E. M., and Seibert, J. (1976) Classes and collections: Internal organization and resulting holistic properties. Cog. Psychol., 8, 561-577.
Revlis, R. (1975) Syllogistic reasoning: Logical decisions from a complex data base. In R. J. Falmagne (ed.), Reasoning: Representation and Process. John Wiley, New York.
Rosch, E., Mervis, C. B., Gray, W. D., Johnson, D. M., and Boyes-Braem, P. (1976) Basic objects in natural categories. Cog. Psychol., 8, 382-439.
Valentine, C. S. (1942) The Psychology of Early Childhood. Methuen, London.
Reference Note
1. Macnamara, J. (1979) Unpublished book manuscript, McGill University.
This study shows that children tend to reduce class inclusion relations (for example, the relation of oaks to trees) to the part-whole structure of collections (for example, the relation of oaks to the forest). Children aged 6 to 17 were taught class inclusion hierarchies analogous to the relations among oaks, pines, and trees. In one condition, the class inclusion relation was taught solely by ostensive definition, for example by saying “These are oaks” while pointing to the oaks. In the other condition, the children were additionally told the analogue of “Oaks and pines are two kinds of trees”. This additional information constrains interpretation, and in this case even the youngest children correctly interpreted the relation as class inclusion. By contrast, with the limited information, children as old as 14 imposed a collection structure on the inclusion hierarchies. For example, the children denied that a single tree was a tree (as if they thought “tree” meant “forest”) or picked up several trees when asked for one tree. The results indicate that the part-whole structure of collections is simpler to establish and maintain than the structure of inclusion.
Cognition, 8 (1980) 243-262
2
© Elsevier Sequoia S.A., Lausanne - Printed in the Netherlands
Infant search tasks reveal early concepts of containment and canonical usage of objects*
N. H. FREEMAN
S. LLOYD
C. G. SINHA
Department of Psychology, University of Bristol
Abstract
It is difficult to gain unambiguous evidence on the use of concepts by infants. Many results can be accounted for in terms of action-based strategies. The evidence reported here fulfils the minimal criteria for the operation of working concepts in infants. Search tasks are used with a filled interval which forces memory-search, and the object is hidden in containers which fulfil their customary job or violate it. Infants treat an upright cup as a more reliable location marker than an inverted one. A series of experiments probes the phenomenon. The results indicate that the infants have a working concept of containment which can be triggered by the provision of containers in their canonical orientation. Even “object permanence tasks” lead infants to access their knowledge of the relationships into which things typically enter in the world outside the laboratory.
Relational terms like in and on are amongst the earliest developed by infants. When do they acquire concepts of containment and support? According to Piaget (1955, pp. 192-208) it is most unlikely that these underlying concepts are attained in the first year of life. The purpose of this report is to demonstrate that infants do develop an understanding of containment in the first year of life, and further, that this understanding has indeed attained the status of a conceptual entry in the memory system. So our claim is that the recent upgrading of Piaget’s account of the perceptual tuning demonstrable in young infants must now be accompanied by an upgrading of their conceptual development. Finally, it will be argued that there is a precedent for this in Piaget’s writings, but that he has wrongly analyzed the first important concepts developed by infants through suppressing a term in his own argument.
*Requests for reprints should be sent to the authors at: Department of Psychology, University of Bristol, 8-10, Berkeley Square, Bristol BS8 1HH, England.
Theoretical issues
Keith Nelson (1974) says that “the concept of object permanence encompasses broad knowledge of the predictable characteristics of objects”. This is, in effect, a definition of the minimal work that one has to do in order to demonstrate the existence and use of a concept. What instances of “predictable characteristics” of containers might infants encounter? A cup would possibly be the best exemplar: a moveable commodity whose function is predictably to maintain containment relations undisturbed by horizontal movement. Yet Bower (1977, p. 118) claims that this is precisely what infants cannot grasp in the first year of life. His evidence comes from transposition tasks: an object is hidden in one of two cups, they are then transposed, and even Stage V infants tend to go for the original location on the tabletop where now only an empty cup is to be found. But a glance at Bower’s diagram shows that he used inverted cups. This has three consequences for the argument (see Chapter 4 of Freeman, 1980). First, the relationship between an inverted cup and its contents is tagged not by in but by under. Since under is a later linguistic acquisition than either in or on (Brown, 1973) it is reasonable to suppose that conceptual development occurs in that order too. So Bower’s task could underestimate conceptual attainment relevant to containment. Secondly, even though the specification for under differs from in in several respects, one is particularly crucial for search tasks. An object under an inverted cup is supported by the same surface that supports the cup itself: the two share a contact relationship with a static surface. In contrast, when the object is in the cup, there is only one contact relationship (the bottom of the cup) with the stable surface.
Therefore, in a change-in-location-search-task, in and under differ in their relationship to the static surface which provides a continual frame of reference for the infant’s position and that of each of the containers (see Bremner, 1980, for an analysis of infant search in terms of self-context relations). These are empirically testable consequences of confusing under with in when constructing an experiment, which might lead to underestimation of infants’ responsiveness to location shifts. But there is a more exciting third possibility. Suppose that infants have a minimal intensional grasp of the concept of containment according to Keith Nelson’s criterion of a knowledge of the predictable consequences of using a cup. Studies of locative language comprehension in slightly older infants (Clark, 1973; Wilcox and Palermo, 1974-5; Grieve, Hoogenraad and Murray, 1977; Walkerdine and Sinha, 1978) yield evidence that non-linguistic rules based upon the predictable characteristics of objects in the array largely determine response-strategies. In this case younger infants might fail with an under task using cups, precisely because
they have already learned that it is in implicit contrast with an in relationship: in that one cannot trust a cup to retain its contents if it be inverted, especially when horizontal location-shift is added. Therefore a grasp of in might produce the error with under. The contrast being made with the first two points in the paragraph above is that success here with the under task would be viewed not as a less sensitive measure of containment-concepts, but as an index of a lack of attainment of a basic ecological rule specifying the condition under which containment relationships normally operate. This possibility can be investigated by the use of baseline data. In the literature, it has been proposed that a 50-50 split between success and failure is common enough, at least at Stage IV, to give a baseline measure that the task induces balanced conflict between two location codes (Butterworth, 1976). If an under relationship produces reliably worse than chance performance, this ecologically and semantically based argument might be worth future study.
Procedural problems
The form of the first experiment follows from all these considerations: a transposition task using upright and inverted cups. The prediction has to be that there will be greater success with the object hidden and transposed in one of two upright cups than when under one of two inverted cups. Such a demonstration will not, in itself, give a guide as to how to explain the effect, and to that end we report a series of studies. A repeated-measures design must be used to substitute each infant acting as his/her own control in place of comparability criteria based on Piagetian stage-diagnosis which may or may not be relevant. A technique is needed which leads towards a formal analysis of the components of the task-demands as well as conferring freedom to manipulate the ecological validity of the situation. The design has to produce a sufficiently high level of errors for comparing conditions, yet ensure that the infants do not become demoralized. The most common technique produces a high level of errors for any one condition whilst blunting comparisons between conditions. When the object is hidden, a three-second delay is introduced before permitting search (Harris, 1973; Gratch, Appel, Evans, LeCompte and Wright, 1974). But as Bremner (1978) has noted, people rarely observe what happens during this delay, and the few observations by Gratch and Landers (1971) should certainly be extended. The trouble with the delay technique is that what happens during the delay is uncontrolled. Some infants might still remain fixated on the last place of disappearance, and others not, but some of the latter might still have a hand extended ready to reach to the new place. There is an immense
amount of behavioral variation uncontrolled which militates against intercondition comparisons, especially in separate-groups designs. Studies of human information processing have demonstrated that filled intervals are essential at some stage of studying recall (see the sections in Baddeley, 1976, and Crowder, 1976, on the Brown-Peterson paradigm as a technique for probing access to secondary memory storage). They produce higher levels of error, and so give rise to interesting error patterns. From our suggestion that ecological factors be taken into account, the crucial question is the effect of a delay upon the infants’ relationship to the task. Not allowing them to search immediately certainly objectively stops them grabbing for the object, but what does it do subjectively? It might induce uncertainty about being allowed to retrieve the object, and that might be why the infants often seek eye-contact in this interval (Cornell, 1978). These considerations lead us to the following method. During the delay, the experimenter leans forward, makes eye-contact with the infant and encourages him/her to search. In practice, we find that this excites the infants enough to disrupt any gross prepared reaching, whilst maintaining the social dynamics of the test.
Experiment 1
Subjects
Thirty-eight infants were tested, 19 at 12 months and 19 at 15 months of age.
Apparatus
A circular white tray, 34 cm radius, on which were placed cylindrical plastic mugs with handles, measuring 10 cm high × 28 cm circumference, some coloured blue, some orange. A selection of small toys, maximum height 1½ cm.
Procedure
Testing was done in the infant’s home with the mother present. The experimenter spent up to 15 minutes talking to the mother and gradually establishing a rapport with the infant. By the end of this period, the infant was encouraged to play with the apparatus, and observations were made on the way in which the cups were spontaneously used. Then the tray was set facing the infant who sat in front of the mother. Two blue (or two orange) cups
were placed on the tray, with a separation of 8 cm between them, either both upright or both inverted. A toy in which the infant had previously shown interest was hidden at one of the cups (cup A) and the tray pushed forward. This was repeated until the infant had successfully retrieved the toy three times in succession from that cup. During this series of A trials, a delay was gradually introduced between hiding and permitting search. On the final A trial the procedure was always for the experimenter to hide the object, then to lean forward and make eye-contact with the infant and to call the infant’s name, saying “go on, you find it, you find it then”. Then the tray was pushed forward. On the B trial, the object was again placed in cup A, and this was moved round behind (from the infant’s point of view) the distractor cup, the distractor cup simultaneously being moved to the initial cup A position in order to transpose them. The whole operation took just 2 sec. Subsequently the procedure was as for the last A trial. After making eye-contact with the infant, the experimenter kept her/his eyes at the level of those of the infant. This enabled the experimenter to assess whether the infant could be guided by a glimpse of the object inside a cup in the upright condition. The mother, too, sitting behind the infant was asked to do the same. Obviously, for the first experiment, there is no way of proving whether or not the object could be seen and this is a possible factor biasing the results. The other experiments show that it is not. Following that test a break was taken, from 5 to 10 minutes, then the other condition was run in which the orientation of the cups was changed as well as their colour and the side on which the A trials were run. Counterbalancing of these three factors was implemented.
Results
The data in Table 1 are for the cup first searched manually on the B trial. The cup-orientation effect is seen by comparing the numbers in the last two columns: it appears with almost equal strength in both groups. Seventeen infants failed on inverted cups and passed on upright ones. There was not a single case where the inverted cups were easier than the upright. The uprightness effect is unidirectional and highly reliable by a McNemar test (P = 0.001 for the younger infants and P = 0.008 for the older). It is shown in nearly half the sample. So the uprightness effect spans quite a wide age-range. There was no effect of task order on the pattern of errors. The first two Table-columns give pass-fail data for the half of the sample who were not affected by the cup orientation, and either passed on both conditions or failed on both. There were 9 passes and 12 failures, figures which approach a desirable 50-50 baseline, and indeed attain it for the older infants (6-6).
Table 1. Number of infants successfully grasping baited cup after transposition, with numbers failing by returning to cup at original location
Cup orientation:
  Upright        Pass   Fail   Pass   Fail
  Inverted       Pass   Fail   Fail   Pass
Subjects
  12 months        3      6     10      0
  15 months        6      6      7      0
  Total            9     12     17      0
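The probabilities quoted for Experiment 1 can be recomputed from the Table 1 counts. The following is a minimal sketch, assuming that the McNemar and binomial tests were exact one-tailed tests against a 50-50 split; the text does not state which form was used, but this assumption reproduces the printed values. The function names and the dictionary layout are illustrative only and do not come from the original report.

```python
from math import comb

def binomial_tail(successes: int, n: int, p: float = 0.5) -> float:
    """Exact one-tailed probability of at least `successes` out of n trials."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(successes, n + 1))

def mcnemar_exact(b: int, c: int) -> float:
    """Exact one-tailed McNemar test: a sign test on the discordant pairs only."""
    return binomial_tail(max(b, c), b + c)

# Table 1 columns, per age group (assumed layout):
# (pass both, fail both, upright pass / inverted fail, upright fail / inverted pass)
TABLE_1 = {"12 months": (3, 6, 10, 0), "15 months": (6, 6, 7, 0)}

def summarize(counts):
    both, neither, upright_only, inverted_only = counts
    n = sum(counts)
    upright_pass = both + upright_only      # pooled passes with upright cups
    inverted_pass = both + inverted_only    # pooled passes with inverted cups
    return {
        "mcnemar": mcnemar_exact(upright_only, inverted_only),
        "upright_vs_chance": binomial_tail(max(upright_pass, n - upright_pass), n),
        "inverted_vs_chance": binomial_tail(max(inverted_pass, n - inverted_pass), n),
    }

for age, counts in TABLE_1.items():
    print(age, {name: round(p, 3) for name, p in summarize(counts).items()})
```

Under these assumptions the McNemar values round to 0.001 and 0.008 and the pooled binomial values to 0.084 and 0.002 for the younger infants and 0.084 in both directions for the older infants, matching the figures reported in this section. Only the discordant columns enter the McNemar test, which is why the pass-both and fail-both columns can serve separately as a baseline.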
Next, pooling the data from all rows, it will be seen that the younger infants’ performance was reliably worse than chance with inverted cups (3 passes and 16 failures, binomial P = 0.002) and only marginally above chance with upright cups (13 passes and 6 failures, binomial P = 0.084). So the major component of the cup-orientation effect in them is given by difficulty with the inverted cups. In contrast the older infants show an equal marginal bias in both directions (6 passes and 13 failures with inverted cups, and 13 passes and 6 failures with upright cups, binomial P = 0.084 in both cases) which interact to produce the reliable difference of P = 0.008 noted in the McNemar analysis. Thus to sum up, the younger infants find the inverted cups difficult. The older infants with a 50-50 baseline when cup orientation is not discriminated (no error and two-error group) show an equal bias around the baseline when orientation is discriminated. This provides the possibility of identifying a developmental trend which is taken up in the section which collates the data from the separate experiments. Could the results have been predicted from observations made during the A trials? It was certainly our impression that the infants were harder to engage on the inverted cups conditions, but not harder to train once their cooperation had been enlisted. However, although the proportion of errorless A sequences was indeed higher with upright than with inverted cups (0.81 and 0.59, respectively), a McNemar test failed to give a reliable outcome. Perhaps a more sensitive test would be to assess the association between errors on the A trials and on the B trials. Again, although a difference appeared, in that a phi coefficient for association was only reliable with upright cups, a gamma assessment of inequality of association failed to reach reliability.
Both approaches to the characteristics of the A trials, then, give intriguing indications that it may in future be possible to track the cup orientation effect more sensitively than at present; but since the B trial outcomes cannot reliably be reduced to A trial outcomes for these data, it
seemed better to take the conservative decision to study the Piagetian A-B transfer paradigm for the subsequent experiments in this paper. Finally, we noted what happened when an error was made. There was a surprisingly sharp discontinuity in behavior pattern here: either the infants would move the hand within 2 sec to the correct cup without any gross postural adjustment, or they would stop dead and sit back. With the inverted cups, the proportion of children showing a “stop dead” response was 0.89, whilst with the upright cup, the proportion was 0.43 (a reliable difference at the 5% level by the McNemar test for homogeneity of proportions). So apparently the inverted cups not only led to more errors, but induced less willingness to make an immediate correction in the sample as a whole: the numbers are too small for a reliable assessment to be made about the two age-groups.
Discussion
A reliable cup-orientation effect was found at both ages. So clearly the method of hiding the object is of great importance. To that extent, the data already indicate that it may be fruitless to think of the tasks as indexing the concept of a permanent object; since by definition, this is a context-free concept, applicable only to the appearance/disappearance of the object, not to the environmental relationships into which it enters. The virtue of the experiment is that it uses household materials in their customary manner, but its associated weakness is that it contains many confounding factors. The differential performance could be due to the perceptual difference between uprightness and inversion, or cavity-upwards and cavity-downwards containers, or the differential method of hiding the object by dropping in and placing under, or the differential ease of programming a motor programme for retrieving the object, or whatever. The next study controls for one of the potentially important factors.
Experiment 2
If a situation can be found in which the containers induce no higher success rate when their cavities are upwards than when they are downwards, this will weaken a large number of competing procedural interpretations of the data: differential methods of hiding, retrieval or possible visibility of the object before manual selection is finally made. It will also militate theoretically against the formal componential analysis in terms of number of contact relations with the static supporting surface.
The obvious step is to find a container which customarily has its cavity downwards. We chose model houses. It cannot be assumed that infants know the broad predictable characteristics of model houses; it is, however, safe to assume that they have not had much experience of model houses being inverted and used in the manner of cups, so the minimal criterion for a change in container-rules is met.
Subjects
Twenty-nine infants served as subjects, 14 at 12 months and 15 at 15 months of age.
Apparatus
Two wooden model houses, whose cavity was within 1 cm³ of that of the cups. The apex of the roof was flattened, by cutting off the top 1½ cm of the 6 cm roof, so that the house could be stably set upside down.
Procedure
This was exactly as for the first experiment.
Results
As can be seen from Table 2, the younger infants performed at chance with inverted houses even though they had the cavity upwards, and marginally worse than chance with upright houses though not reliably so. The older infants performed at about chance level with inverted houses. So the data contrast with those from the cups. Also of interest is the reliable difference found between the age-groups with the upright houses (χ²(1) = 4.35, P < 0.05). This contrasts with the lack of a reliable age-effect with the inverted houses. The important contrast with the cups data is that the inverted houses induced very poor performance even though they had the cavity pointing upwards; and the important similarity is that the performance of the older infants again falls to either side of a chance level, depending upon whether the object is upright or not instead of whether the cavity is upwards or downwards.
Table 2. Number of infants grasping baited toy house after transposition, with numbers failing by returning to house at original location (in brackets)
House orientation:    Inverted    Upright
Subjects
  12 months            7 (7)       4 (10)
  15 months            6 (9)      10 (5)
  Total               13 (16)     14 (15)
Discussion
Clearly there is more to the cup effect than a cavity-upwards factor. The form of the container matters, not solely its orientation. The findings lessen the worry that the upright cup effect is trivially due to a lack of efficient concealment of the object relative to the inverted cup condition, for there was a very low success rate with the cavity-upwards inverted houses. We can now set up a testable working hypothesis. The cup orientation effect is due to the infants having a grasp of containment rules which they can apply only when a familiar type of container is properly used, that is according to its customary manner. The question then to be investigated becomes one of identifying the components of customary usage which should be entered into an explanatory model. At the outset, a cup was referred to as being designed to be a trustworthy mobile container. The next step is to use a static-container search task. The cup orientation effect may well weaken, because a salient customary containment cue is abolished, but any residual effect will be useful for model-building.
Experiment 3
Subjects
These were the same infants as in Experiment 1. The testing was done at the same time as Experiment 1, with full counterbalancing.
Procedure
Using the same apparatus as in Experiment 1, the only change in procedure was on the B trial, whereby the object changed its location by being placed in or under the previous distractor cup (instead of via transposition).
Table 3. Number of infants grasping baited cup after change in location of placement, with number failing by returning to cup at original location
Cup orientation:
  Upright        Pass   Fail   Pass   Fail
  Inverted       Pass   Fail   Fail   Pass
Subjects
  12 months        6      7      6      0
  15 months       12      2      5      0
  Total           18      9     11      0
Results
The data set out in Table 3 show a reasonable orientation effect in the younger group. Six of them failed with the inverted cups whilst passing with the upright ones, with no cases of the reverse pass-fail pattern. This is a little weaker than for the transposition data, but the unidirectional 6-0 split in the one-error group is reliable (McNemar, P = 0.016). Again, the results fall either side of the 50-50 split level, with 6 passes to 13 failures on inverted cups, and 12 passes to 7 failures on upright cups. This time, the data for the older group show an age-related advance, from the upper level of the younger, to almost ceiling on upright cups (the 17-2 split is reliable, binomial P < 0.001). Again there was not a single case of a subject passing with inverted cups yet failing with upright ones, and the 5-0 split in the one-error group is reliable (McNemar, P = 0.031). The first two Table-columns give the pass-fail data for the two-thirds of the sample who were not affected by cup orientation. There were 18 passes and 9 failures, made up of a reliable contrast between the age-groups, from a chance level in the younger to better than chance (binomial P = 0.006) in the older. But the number in each age-group affected by the cups remains almost identical (6 younger and 5 older). Secondly, pooling the data from all rows, it will be seen that the younger infants’ performance was marginally worse than chance with inverted cups (6 passes and 13 failures, an unreliable difference) and marginally better than chance with upright cups (12 passes and 7 failures, again an unreliable difference) which interact to produce the reliable difference (P = 0.016) shown by the McNemar analysis. This is almost identical to the pattern shown by the older infants in the cup-transposition study. Therefore, one can equate the cup orientation effect found in older infants on the transposition task with that found in younger infants on this static-container task.
A question remains, which must be settled before a model can be suggested. The upright cup advantage in the one-error groups could be due to the orientation of the cup being in its customary direction or to the congruence between that and the methods of hiding and/or retrieval. These possibilities cannot be settled by considering the results of the houses study. The next experiment investigates them, extending the study to younger infants.
Experiment 4
A design is now needed in which the cup in its two orientations occludes the object in the same way. One solution is to use the cup as a screen, hiding the object behind it. That guarantees that the method of hiding is equalized as well as the degree of concealment. What it loses is ecological validity, for the cup is being misused in both its orientations. This is certainly an objection, but its empirical consequences may not be too great. They can be assessed by including two control conditions, one of which is a direct replication of the previous experiment, and the other which uses flat screens instead of cups. Thus one can extract a cup-interior versus screen effect, and assess the use of cups as screens in relation to that.
Subjects
Twenty-one infants aged between 8½ and 11½ months, median age 9½ months, acted as subjects.
Apparatus
Cups as previously, plus opaque perspex screens whose area matches the area of frontal projection of the cups.
Procedure
The static-container procedure of the previous experiment was used, under 5 conditions. A replication of the in and under conditions in that study was run; also conditions involving placing the object behind cups, either both inverted or both upright; and one condition involving placing the object behind one of the two opaque screens.
Table 4. Number of infants successfully searching at baited location, with failures given in brackets
Condition                 Number of infants
  Under inverted cup        11 (10)
  In upright cup            16 (5)
  Behind screen             15 (6)
  Behind inverted cup       11 (10)
  Behind upright cup        17 (4)
Results
The data set out in Table 4 almost speak for themselves. The cup orientation effect appears with equal strength whether the object is within the cup or behind it: the frequency of errors is halved (at P = 0.031 and P = 0.016 for the within and behind conditions respectively, by the McNemar test). There are two conflicting pieces of evidence on this effect. If the 50-50 split with the inverted cups represents a true baseline, then the effect is one of facilitation by the upright cups. If the high level of success with the flat screen is taken as a baseline for a screen effect, then the use of upright cups does not facilitate performance, whilst the use of inverted cups depresses it. This baseline issue has previously come up and will be discussed below. Before putting the data from all the studies together, we report the results of the final experiment. This is a transposition version of the behind task, to bring that into line with the basic within-cups studies.
Experiment 5
Subjects Ten infants aged 13 ± 1.5 months acted as subjects.
Apparatus
Screens as previously, plus cups with pieces of felt of the same color as the cups glued either to the tops or bottoms, protruding by 21 cm* to provide a support for a small object to be placed on and moved with the cup during transposition.
Procedure The basic transposition procedure was used with the object hidden behind one of two screens, one of two inverted cups or one of two upright cups.
Results The results were the clearest of the series: a 5-5 pass-fail split with screens, a 9-1 pass-fail ratio with upright cups and a 1-9 pass-fail ratio with inverted ones. Thus, at last, a reliable facilitation by upright cups and a reliable retardation of performance with inverted ones (binomial P = 0.011 for each) occurred around a 50-50 baseline with the screens.
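The binomial probability quoted for the 9-1 and 1-9 splits can be reproduced as follows. This is a minimal sketch, assuming a one-tailed test against a chance level of 0.5 with ten independent infants; the function name is hypothetical and the tail direction is an assumption, since the text does not state it.

```python
from math import comb


def binomial_upper_tail(k: int, n: int, p: float = 0.5) -> float:
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Upright cups: 9 of 10 infants passed.
upright_p = binomial_upper_tail(9, 10)
# Inverted cups: 9 of 10 infants failed, which is the mirror image at p = 0.5.
inverted_p = binomial_upper_tail(9, 10)
# Screens: 5 of 10 passed, which is no evidence of a departure from chance.
screen_p = binomial_upper_tail(5, 10)

print(round(upright_p, 3), round(inverted_p, 3))  # 0.011 0.011
```

With these assumptions each cup condition gives 11/1024, which rounds to the reported 0.011.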
Discussion The cup uprightness effect cannot be attributed to asymmetries in the method of hiding the object, nor to differential efficiency of concealment. An upright cup is treated as a better location marker than an inverted one, even when it is acting as a screen, as a simple landmark cue. It seems reasonable to suggest that the effect is developed by 9 or 10 months of age; and then the major development takes place in general location-shift ability, carrying the effect with it. In pilot studies we have found the effect with some 7 month infants. The effect is certainly detectable within the first year of life.
Discussion of collated results
Evidence on a cup orientation effect
There are three analytically distinct sources of evidence: direction of errors, level of errors and both combined against a baseline. We deal with them in order. All infants who made a single error always made it when cups were inverted, never when cups were upright. The total unidirectional errors come to 33 for the within-cups conditions and this 33-0 split is reliable far beyond P < 0.001 by a McNemar test. There were 93 pairs of observations in these studies, so the cup effect applies to about a third of the sample. In the behind-cups studies, the strength of the unidirectional effect was 14-0 (again reliable beyond P < 0.001). This was out of a total of 31 paired observations, so here the effect applies to about half the sample.
Secondly, we look at error-levels. Binomial tests showed that inverted cups led to reliably worse than chance performance in the 12 month subjects in Experiment 1, and the 13 month subjects of Experiment 5; whilst upright cups led to better than chance performance in the 15 month subjects of Experiment 3, the 9½ month subjects of Experiment 4, and the 13 month subjects of Experiment 5. By themselves then, both facilitation and retardation effects on performance can be found. The next step is to assess biasses in group performance against non-observed chance baselines, yielding a reliable interaction of two unreliable biasses in opposite directions for the 15 month subjects of Experiment 1, and the 12 month subjects of Experiment 3. So some evidence can be found for inverted cups depressing performance and upright cups enhancing it within whole samples. The final step is to work with observed chance-level baselines. These occurred in the last two experiments, yielding evidence for a facilitation effect in Experiment 4 and both facilitation and retardation in Experiment 5. Therefore it is safest to assume that both effects are operative, and that it must be left to future work to find conditions which produce only facilitation or only retardation. The critical question that will have to be dealt with is the choice of an appropriate baseline for each set of tasks. Only Experiment 5 gives evidence on facilitation and retardation around a baseline which is theoretically and empirically rigorously specified. One cannot know yet whether these particular results are completely generalizable to those of the other age-groups. For example, it could well be the case that the infants’ concept of containership does not alter over the span 9 to 15 months (as we argue in the next section) yet their concept of the way in which screening works might well do so. It would be an empiricist error to treat the use of screens as a constant baseline condition.
Accordingly, we now go on to an analysis of the data which is independent of this problem: regardless of whether the effects are composed of facilitation, retardation or both, what is the size of the contrast between the cup conditions across ages?
Evidence on age-related changes
In Experiment 3, a static search task was given. The proportions of correct responses at 12 months were 0.32 and 0.64 for inverted and upright cups respectively. At 15 months they were 0.64 and 0.89 respectively. The 15 month group have moved up relative to the 12 month group by one step, roughly maintaining the difference between conditions. In Experiment 1, a transposition version, the 15 month infants perform at 0.32 and 0.68 respectively: reduced to the level of 12 month infants on the easier static task. The 12 month infants in the transposition task then sink to 0.16 and 0.68 respectively. The relations amongst these figures may be simply expressed. Consider just the inverted cups: the proportion correct in the 12 month group, for a transposition task is 0.16, and this can be doubled to 0.32 by using a static task or increasing the age to 15 months, so that doing both at once again doubles the proportion correct to 0.64. Therefore the data are perfectly additive, and the age-related change can be accounted for entirely by an ability to cope with a transposition task. In contrast the upright cups yield essentially the same proportion correct at around 0.64, except for the 15 month static task at near ceiling: upright cups seem to leave less room for developmental change than inverted cups. This suggests new approaches to the question of what changes with development: a point which we now take up.
The collated data are set out in Figure 1 with two additions: the static within-cups condition from Experiment 4 on 9½ month infants, and an additional 10 infants of 10 months, from yet another of our within-cup experiments, using a transposition task. Two things are very clearly shown. First, for the static task, the difference between inverted and upright cups is impressively consistent. The age shifts increase the level of performance more than the underlying contrast between in and under. Therefore the conceptual contrast is fully acquired by 9½ months; only general-purpose contrast-insensitive increments in performance occur. However, the second point is that this does not happen in a linear fashion, for the ordering in terms of success is 15 months, 9½ months then 12 months. This suggestion of a U-shaped curve has been found before in infant search tasks (e.g. Bower and Wishart, 1972); and Butterworth (1976) discusses it in terms of a reorganisation of skilled performance around 12 months, which agrees well with the present data.
Figure 1. Proportion of sample of infants of different ages (in months) succeeding at finding an object within a cup after a change in hiding place. Solid line: upright cups; dotted line: inverted cups. [Plot not reproduced. Recoverable labels: static, transposition; ages 10, 12 and 15 months.]
In the transposition task, the picture of age-group performance is very different. They all come together for upright cups, with variation in performance for inverted cups. However, the inverted cups differences prove unreliable by chi-square tests even for a post-hoc comparison based on the age-ordering from the static task, between 12 months and the others combined (χ² < 1). It remains open whether further work will establish a fan-shape to the data; or collapse them all on to one developmentally-insensitive line (perhaps at the level of the 12 month static performance). Whichever happens, it seems clear that the static task yields a different aspect of development from the transposition task. At present it is safest to take a conservative position and say that transposition tasks bring out a developmentally insensitive cup effect in terms of slope and level, whereas a static task brings out a developmentally insensitive effect in terms of slope with a sensitive effect on level. Therefore, the basic agreement between both sets of data is that the basic contrast between in and under does not change over the age-range. It is not possible to formalise the difference between the static and transposition tasks without investigating how they interact with a separate independent variable (this we are currently doing); but it is possible to relate the difference to an issue raised in the Introduction.
The upright-cups transposition task is the most ecologically valid test for a concept of containment, and the inverted-cup transposition task the least ecologically valid. This extreme contrast generates no reliable age-related differences, whilst the intermediate static conditions do. If further research corroborated this, it should not be too difficult to make sense of this in terms of the information load on the infant. Finally, the houses study in Experiment 2 gave evidence of an age-related change whereby upright houses unreliably retarded 12 month performance and unreliably facilitated 15 month performance, the two interacting to yield a reliable effect. This may be evidence for a conceptual advance in dealing with object function, but needs much more study.
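The doubling pattern described earlier in this section for the inverted cups can be laid out as a small worked calculation. The sketch below uses only the four proportions quoted in the text; the dictionary layout and variable names are hypothetical. It treats the task and age effects as independent multiplicative factors, which is one reading of the claim that the data are additive (the factors add on a logarithmic scale).

```python
# Proportions correct with inverted cups, as quoted in the text,
# keyed by (task, age in months).
inverted = {
    ("transposition", 12): 0.16,
    ("static", 12): 0.32,
    ("transposition", 15): 0.32,
    ("static", 15): 0.64,
}

base = inverted[("transposition", 12)]

# Factor gained by switching to the static task at 12 months.
task_factor = inverted[("static", 12)] / base
# Factor gained by moving from 12 to 15 months on the transposition task.
age_factor = inverted[("transposition", 15)] / base

# If the two factors combine without interaction, applying both to the
# baseline should reproduce the 15 month static figure.
predicted_both = base * task_factor * age_factor
observed_both = inverted[("static", 15)]

print(task_factor, age_factor, predicted_both, observed_both)
```

On these four figures each factor is a doubling and the combined prediction matches the observed 15 month static proportion, so no interaction term is needed for the inverted cups. The upright-cup proportions (0.64, 0.89, 0.68, 0.68) do not follow the same pattern, as the text notes.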
Conclusions It is now clear that the way in which an object is concealed can play an important part in determining accuracy in a search task. Evidence was found for the proposition that if cups be used in their customary orientation, performance is better than if they are inverted. Various factors have been eliminated from the investigation, including an object-independent cavity-upwards bias, differential methods of placing the object, and differential efficiency of concealment. The final experiments have a crucial role to play. First they show that the cup uprightness effect can be obtained even when the method of retrieval of the object is equalized: it always has to be picked up off the supporting surface. So the effect cannot be reduced to an entirely action-based one, and this is important because of Piaget’s insistence on the priority of action-coding at this level of development of sensori-motor intelligence. This argument can be taken further. The cup-effect means that just the sight of a cup triggers an entry in the child’s conceptual system which deals with location coding. Surely this is what the last experiment shows: that an inverted cup can be worse than a plain screen as a location marker even when the cup is being used as a screen with the object behind it. This is remarkable. The orientation of the cup is irrelevant to the solution of this right-left task, so why should an inverted cup be a poorer landmark for the infants than an upright one? It would be perverse to assume that they suddenly forget about object permanence. Presumably they have learned that an inverted container is not to be regarded as a reliable cue to the location of its contents. This is a conceptual rule. 
Whether the cup effect acts basically as a facilitatory or inhibitory performance factor, it is clear that the effect must be due to learning about cups, because of the previous elimination of simple ongoing factors such as the cavity-upwards factor or motor patterns of retrieval. Infants know a minimal amount about the rules of containment, that is, about the rules of concealment and location of small objects when placed in relation to potential containers. What dominates their performance is not the nature of the small objects, but the nature of the location-relations they enter into, not the rules governing the disappearance of objects so much as the characteristics of the things which seem to make objects disappear. This may now be formalised. The effect may be named “the canonicality effect”: performance is affected by whether the experimenter uses the container for its canonical purpose or misuses it. This must index conceptual behaviour, for the infant is accessing an entry in memory which contains a knowledge of the predictable characteristics of an object in its canonical and non-canonical orientations. The last experiments show that registration of the orientation is enough
to trigger the knowledge. The experimenter knows that this is doubly irrelevant in the final experiment, in that the objects merely serve as screens and it is purely a left-right task, but the infants act as though they cannot edit out their knowledge. Further learning will be necessary to specify extensional rules for inhibiting accessing the concept in the behind-conditions. This is the sort of issue which must be investigated in order to study the semantically-based argument, set out in the introduction, that a grasp of canonical relationships might actually induce errors with non-canonical relationships. The critical finding would be a condition under which inverted cups led to reliably worse than chance performance, but both upright cups and a neutral baseline task led to reliably above-chance performance. Yet again, the issue of choice of an appropriate baseline stands in the way of rigorous model-building. The question is the extent to which this relates to other concepts that the infants may be developing. One line of argument is that search tasks such as these do not tap other types of concept; they are tests of conceptual rules dealing with the characteristics of hiding places. The other line is to say that concepts of canonical orientation relate to the only other concept which Piaget credits the infants with, namely the object concept. Piaget’s argument is that infantile egocentrism is manifested in a lack of search-flexibility since they enter the location of an object as a criterial attribute of the object’s existence. If it is not hidden at its original location, it does not have continued existence. If one believed that this was still an appropriate way to argue, then one would have to maintain that the location-term which enters into the infants’ object concept is not an egocentrically defined one, but one based on the canonicality of the hiding place for its purpose of hiding. The present data do not allow us to choose between these alternatives.
However, we suspect that Piaget is wrong. The reason why he puts forward a lack of grasp of object permanence as the explanation is that he had only two terms in his argument: the location and identity of the object. The present evidence indicates the role of a third term: the characteristics of the hiding place with reference to its canonical orientation. It is more economical to explain the data on the basis of this last term than to modify an account based on its original suppression. This pushes forward the argument of Butterworth (1977) and Gratch (1976): the infants’ representation of space is limited, but not, or not only, by their incomplete knowledge of the desirable object which the experimenter uses to induce representational activity. Probe tasks should be designed to take into account the ecology of the infants’ experience of spatial relationships. Finally, the evidence presented here accords neatly with one particular account of the cognitive prerequisites for the acquisition and development of language. Katharine Nelson’s (1974) “functional core
concept” hypothesis proposes that the earliest referential child utterances mark dynamic, functional, relational properties of objects. It may well be that the cognitive bases of language development are to be sought not in a generalised representational or semiotic function (thought by Piaget to emerge at Stage VI of sensori-motor development) but in specific encoding strategies for representing canonical relational information, whose emergence may be traced back to around 9 months of age.
The work reported forms part of the LARINCS research project into early linguistic and conceptual development, supported by the Social Science Research Council of Great Britain. Michael Pope and Iris Powell collected part of the data of the final study. Discussions with Peter French helped us initiate the work, and the incisive comments of two anonymous reviewers led us to deepen the data analysis.
References
Baddeley, A. D. (1976) The Psychology of Memory. Harper and Row, New York.
Bower, T. G. R. (1977) A Primer of Infant Development. Freeman, San Francisco.
Bower, T. G. R. and Wishart, J. G. (1972) The effects of motor skill on object permanence. Cog., 1, 28-35.
Bremner, J. G. (1978) Spatial errors made by infants: inadequate spatial cues or evidence of egocentrism? Br. J. Psychol., 69, 77-84.
Bremner, J. G. (1980) The infant’s view of space. In M. V. Cox (ed.), Are Young Children Egocentric? Concord/Batsford, London.
Brown, R. (1973) A First Language: the early stages. George Allen, London.
Butterworth, G. E. (1976) Asymmetrical search errors in infancy. Child Dev., 47, 864-867.
Butterworth, G. E. (1977) Object disappearance and error in Piaget’s stage IV task. J. Exp. Child Psychol., 23, 391-401.
Clark, E. V. (1973) Non-linguistic strategies in the acquisition of word meaning. Cog., 2, 161-182.
Cornell, E. H. (1978) Learning to find things: a reinterpretation of object permanence studies. In L. S. Siegel and C. J. Brainerd (eds.), Alternatives to Piaget. Academic Press, New York.
Crowder, R. G. (1976) The Principles of Learning and Memory. Lawrence Erlbaum, Hillsdale, N.J.
Freeman, N. H. (1980) Strategies of Representation in Young Children: analysis of spatial skills and drawing processes. Academic Press, London.
Gratch, G. (1976) On levels of awareness of objects in infants and students thereof. Merrill-Palmer Quart., 22, 157-176.
Gratch, G. and Landers, W. F. (1971) Stage IV of Piaget’s theory of infants’ object concepts: a longitudinal study. Child Dev., 42, 359-372.
Gratch, G., Appel, K. J., Evans, W. F., LeCompte, G. K. and Wright, N. A. (1974) Piaget’s stage IV object error: evidence of forgetting or object conception? Child Dev., 45, 71-77.
Grieve, R., Hoogenraad, R. and Murray, D. (1977) On the young child’s use of lexis and syntax in understanding locative instructions. Cog., 5, 235-250.
Harris, P. L. (1973) Perseverative errors in search by young infants. Child Dev., 44, 28-33.
Nelson, K. (1974) Concept, Word and Sentence: interrelations in acquisition and development. Psychol. Rev., 81, 267-285.
Nelson, K. E. (1974) Infants’ short-term progress toward one component of object permanence. Merrill-Palmer Quart., 20, 3-8.
Piaget, J. (1955) The Child’s Construction of Reality. Routledge and Kegan Paul, London.
Walkerdine, V. and Sinha, C. (1978) The Internal Triangle: language, reasoning and the social context. In I. Markova (ed.), The Social Context of Language. Wiley, London.
Wilcox, S. and Palermo, D. S. (1974/5) “In”, “on” and “under” revisited. Cog., 3, 245-254.
It is difficult to prove unambiguously that infants use concepts. Results can be accounted for in terms of action strategies. It is this minimal criterion that we have used here. The task consists of searching for an object hidden in different cups, and it involves memory for locations. Infants treat an upright cup as a better-marked location than the same cup inverted. A series of experiments demonstrates this phenomenon. These results indicate that infants have a concept of a container that is triggered by the very orientation of the objects. This knowledge of things in the environment, based on container-contents relations, operates even in object permanence tasks.
Cognition, 8 (1980) 263-361 3
© Elsevier Sequoia S.A., Lausanne - Printed in the Netherlands
Against definitions*
J. A. FODOR, M. F. GARRETT, E. C. T. WALKER, C. H. PARKES
Psychology Department, Massachusetts Institute of Technology
Epigraph There existed an adult male person who had lived a relatively short time, belonging or pertaining to St. John’s (a college of Cambridge University), who desired to commit sodomy with the large web-footed swimming birds of the genus Cygnus or subfamily Cygninae of the family Anatidae, characterized by a long and gracefully curved neck and a majestic motion when swimming. So he moved into the presence of the person employed to carry burdens, who declared: “Hold or possess as something at your disposal my female child! The large web-footed swimming birds of the genus Cygnus or subfamily Cygninae of the family Anatidae, characterized by a long and gracefully curved neck and a majestic motion when swimming, are set apart, specially retained for the Head, Fellows and Tutors of the College.” from ‘Two Semantic Limericks’ by Gavin Ewart
Abstract Definitional accounts of language structure are explored in this paper. Several classes of arguments for definitions are reviewed; those which connect to: classical theories of reference, theories of informal validity, theories of sentence comprehension, and theories of concept learning. We suggest that, for each of these areas, accounts which rely upon definition are, in fact, not to be preferred on evidential grounds to plausible non-definitional alternatives. We also present a series of experimental observations bearing on one of these areas - that of sentence comprehension. We show that one widely cited class of examples of definitional structures - that of “causative verbs” - fails to affect subject judgments of those relations among the words of causative sentences which depend upon the putative definitional structures. Such subject judgments are independently demonstrated to be sensitive to structural relations of comparable type for other linguistically non-problematic types.
*Reprint requests should be sent to Dr. Merrill Garrett, Psychology Department, M.I.T., Cambridge, Mass. 02139, U.S.A.
The idea that there are definitions - that the morphemes of a natural language typically have internal structure at the ‘semantic level’ - has fascinated philosophers and psychologists at least since Plato. Epistemologists, to be sure, have recently shown signs of disaffection (see footnote 1). But in the ‘cognitive sciences’ the notion of definition remains one of those ideas that hardly anybody ever considers giving up. Perhaps for that very reason, there have been relatively few attempts to provide direct empirical evidence for the psychological reality of definitions. Still rarer are discussions of theoretical alternatives to definitional treatments of language and mind. We think that a general reconsideration is long overdue; not only because the empirical basis of the definition construct is exiguous, but also because the whole theoretical superstructure in which it plays a central role has commenced to wobble.
This paper has three parts and an Appendix. In Part I, we discuss, in some detail, the way that the notion of definition has served to connect several aspects of classical theories of language with one another and with widely credited accounts of concept acquisition. We will call this complex of views ‘The Standard Picture’ (TSP). We are particularly interested in two issues: to what extent is TSP plausible independent of questions about the empirical status of the definition construct; and in what ways would TSP have to be revised if the definition construct were to be abandoned.
Part II presents informally an experimental investigation directed toward determining whether claims for the psychological reality of definitions can be sustained. This is not, of course, a ‘crucial experiment’; probably nothing could be. But we believe we can make a case that some important predictions which flow naturally from the view that definition is a basic notion in the theory of language are strikingly disconfirmed. The methods, materials, and statistical treatment of the results of the experiments are reserved for the Appendix, q.v. Finally, Part III returns briefly to The Standard Picture. We try to suggest in outline what cognitive psychology might look like in the post-definitional era.
Part I: The Standard Picture
Why do so many people think that there are definitions? Not, according to us, because there’s much direct evidence that there are. Still less because
there are many persuasive examples of the kind. Rather, there are several other theories that people hold about language and mind, and these other doctrines either rest upon, or anyhow closely comport with, the definitional account. There are, in fact, five such theories in which the notion of definition plays a significant - if not ineliminable - role. We will consider four of them.¹
I.a. Language and the world: the definition of a word determines its extension
According to TSP, the definition of a word makes explicit what is true of a thing if and only if the word applies to it. Consider the word “bachelor”. It’s often said that the definition of “bachelor” is “unmarried man”. Suppose that this is true. Then the intended consequence is that:
1 Every bachelor is a man.
2 Every bachelor is unmarried.
3 Whatever is both unmarried and a man is a bachelor.
Or, to put the same point slightly differently, the idea is that the set of bachelors and the set of unmarried men are coextensive in virtue of the definition of “bachelor”. In some sense or other, the fact that the definition of “bachelor” is “unmarried man” is supposed to explain the coextensivity of these sets. It’s important to see just how the explanation is supposed to go. Assume that the extension of the phrase “unmarried man” is determined somehow. Then, it’s a consequence of the intended interpretation of the notion definition that if “unmarried man” is the definition of “bachelor”, then “bachelor” has the same extension as “unmarried man”, whatever that extension may be. That is: the definition of “bachelor” as “unmarried man” fixes the extension of “bachelor” relative to the extension of “unmarried man”.²
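The three consequences listed above, and the claim that the definition fixes one extension relative to another, can be pictured with finite sets. The following is only a toy sketch: the individuals, the property assignments and the function name are all invented for illustration, and nothing in the text commits TSP to a set-theoretic implementation.

```python
def extension_of_bachelor(unmarried: set, men: set) -> set:
    """Extension of 'bachelor', derived from the extension of 'unmarried man'."""
    return unmarried & men


# Hypothetical toy domain.
men = {"alan", "bert", "carl"}
unmarried = {"alan", "carl", "dora"}

bachelors = extension_of_bachelor(unmarried, men)

# 1. Every bachelor is a man.
assert bachelors <= men
# 2. Every bachelor is unmarried.
assert bachelors <= unmarried
# 3. Whatever is both unmarried and a man is a bachelor.
assert all(x in bachelors for x in men if x in unmarried)

# The extension is fixed only relatively: change the facts about who is
# unmarried and the extension of 'bachelor' changes with them.
later_bachelors = extension_of_bachelor(unmarried | {"bert"}, men)

print(sorted(bachelors), sorted(later_bachelors))
```

The function says nothing about how the sets `men` and `unmarried` are themselves determined, which is the question the text goes on to raise.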
¹The fifth is the epistemological doctrine according to which definitions guarantee the necessity (or unrevisability, etc.) of certain general truths, e.g., that bachelors are unmarried or that F = MA. We put this view aside because (a) it leads further into the philosophy of science than we propose to go, and (b) it is pretty thoroughly discredited as the result of work by such philosophers as Duhem, Putnam, and especially, Quine. For an airing of these issues, see Katz (1975) and Putnam (1975).
²This situation is not materially altered if, for example, we think of definitions in the way that many linguists do: viz., as couched in a universal metalanguage. On such a view, a definition fixes the extension of an object language expression relative to an interpretation of the metalanguage. This is a nicety which we will henceforth ignore (but see J. A. Fodor (1975) and J. D. Fodor (1977) for extensive discussion).
If the definition of “bachelor” fixes its extension relative to the extension of “unmarried man”, what fixes the extension of “unmarried man”? Patently, there are two possibilities: (a) the extension of “unmarried man” is determined by the definition of its constituents, or (b) it is determined in some other way. It is also patent that we’ll have to get to option (b) eventually, since exploiting option (a) raises the question: what fixes the extensions of the expressions in terms of which “unmarried” and “man” are defined? Definitions have to stop somewhere. These considerations suggest the following view (which is itself part and parcel of TSP). The lexicon³ of a natural language can be partitioned into (1) definable terms, and (2) a primitive basis. Definitions relate the definable terms to expressions in the vocabulary of the primitive basis.⁴ They thereby fix the extensions of definable expressions relative to an interpretation of the primitive basis. Definition is thus viewed as an asymmetric relation in that there is a preferred direction of analysis for expressions in a language: analysis runs from definable expressions into the primitive vocabulary. Definitions typically apply in chains, and the further along a chain we go, the closer we get to expressions couched in the vocabulary of the primitive basis. The primitive basis is where definitions stop. It should be emphasized that this view would be vacuously satisfied if all (or practically all) of the morphemically simple expressions in a natural language belonged to the primitive basis of that language. This is, of course, not what TSP intends. For TSP, the primitive basis of a natural language is notably smaller than the lexicon. Moreover, TSP has it in mind that definitions should exhibit the systematic articulation of the lexicon into semantic subsystems.
In virtue of shared features of their definitions, morphemes should fall together into such families as action words, person words, color words, causatives, etc. (See, for example, Clark and Clark (1977)).
³Strictly, the morphemic inventory. We won’t distinguish between morphemes and lexical items; the former are always intended.
⁴Another pedantic footnote: Definitions relate definable terms to expressions in a vocabulary consisting of items in the primitive basis together with logico-syntactic vocabulary. So, for example, “bachelor” means “man and not married”, where “and” and “not” belong to the logico-syntactic apparatus. We’ll distinguish between primitive terms and logico-syntactic terms only where the distinction makes a difference.
Definitions fix the extensions of definable expressions relative to an interpretation of the primitive basis. What fixes the interpretation of the primitive basis? TSP provides no unequivocal answer to this question, but one version of the doctrine deserves special notice as historically venerable and currently influential. According to this (Empiricist) reading of TSP, items in the primitive vocabulary express sensory/motor properties. The extensions of these items are thus fixed by a causal account of the sensory/motor transducers. So, for example, the extension of “red” is that set of objects which do (or, more plausibly, would) appropriately activate the red-transducers; the extension of “angular” is that set of objects which do or would trigger appropriate motor-tracking responses; and so forth. According to this view, then, all non-primitive expressions are definable in a sensory/motor vocabulary whose items are, in turn, related to their extensions by a specifiable causal hook-up. Taken together, the definitions and the causal account of the transducers fix the interpretation of the entire lexicon. We believe that, insofar as the problem of interpreting the primitive basis has been faced at all by contemporary adherents of TSP (especially in AI and psychology) it has usually been something like the Empiricist version of the doctrine that they have had in mind. (Showing this would require more textual exegesis than we have space for here, but see J. A. Fodor, op. cit.). Suffice it for present purposes that any theory which appeals to definitions to answer the question ‘what relates words to the world?’ must cope with the problem of interpreting the primitive basis somehow. One hasn’t got a theory of language and the world unless that problem has been adequately addressed: all one has is a theory of a relation between uninterpreted linguistic forms. (Note that such formulae as “the definition of ‘bachelor’ is ‘unmarried man’” assert relations between forms of words, not between forms of words and their extensions.) We stress this point in aid of dispelling an illusion. It’s easy to suppose that, if one gives up the notion of definition - if, for example, one assumes that the entire lexicon is primitive - one thereby loses an account of the relation between language and the world; an account which exploitation of the definition relation would otherwise secure.
On the contrary: for purposes of specifying the relation between a word and its extension, there is no principled difference between a theory which says that “unmarried” and “man” are primitive while “bachelor” is defined, and a theory which says that they’re all primitive. It’s just that the former sort of theory delays the question of interpretation till it gets to the primitive basis, while the latter sort faces the question straight off.
One further point under this general head. If the Empiricist version of TSP were plausible, that would be a strong argument for the strategy of using definitions to delay the question of interpretation. For, as we’ve seen, the Empiricist does have a (schematic) account of how a sensory/motor basis is to be interpreted; viz., by reference to the causal structure of the sensory/motor transducers. So, if he can use definitions to provide sensory/motor equivalents for each of the non-primitive items in a language, he really will have a theory of how the expressions in the lexicon of that language are related to their extensions.
268 J. A. Fodor, M. F. Garrett, E. C. T. Walker and C. H. Parkes
It’s thus important to emphasize that the Empiricist version of TSP is not plausible. If there are few convincing examples of definitions, there are literally no convincing examples of definitions which take prima facie non-sensory/motor terms into a sensory/motor vocabulary. There is, indeed, no reason to believe even that the definitions we have examples of generally exhibit an epistemically interesting direction of analysis. On the contrary, the ‘right hand’ (definiens) side of standard examples of definitions does not tend, by and large, to be more sensory/motor than the ‘left hand’ (definiendum) side. So, for example, the conventional wisdom has it that “grandmother” means “mother of father or mother”; “ketch” means “two masted sailing vessel such that the aft (or ‘mizzen’) mast is stepped forward of the helm”; “bachelor” means “unmarried man”, “kill” means “cause to die”, etc. It’s hard to discern a tendency toward the sensory/motor in these examples, and these examples are quite typical. Surely, no one could seriously argue that words like “mother, father, mast, vessel, cause, die, married and man”, all of which occur in the definiens, are somehow closer to transducer language than, say, “grandmother”, “ketch”, “kill” or “bachelor” are.⁵
To summarize: definitions provide a useful part of a theory of language and the world only if they empty into a primitive basis which is independently interpreted. That is, definitions figure seriously in theories of language and the world only if: (a) all the expressions of a language are equivalent to expressions in the vocabulary of its primitive basis; (b) the primitive basis is notably smaller than the lexicon; and (c) the extensions of expressions in the primitive basis can be fixed without further appeal to the notion of definition. The only primitive basis which has so far been seriously alleged to satisfy (a)-(c) is sensory/motor, and it is morally certain that that allegation cannot be sustained. It may well be that definition plays no serious role in theories of language and the world, TSP to the contrary notwithstanding.
⁵We are not denying that natural languages like English contain a vocabulary of sensory/motor terms, where sensory/motor terms are those whose extensions can be specified solely by reference to the causal structure of transducer mechanisms. Perhaps, for example, “red” is in this sense a sensory/motor term (though recent work on color perception makes this seem unlikely). Our claim, in any event, is that even if there are sensory/motor terms, the lexicon is not reducible to expressions containing only such terms and logico-syntactic vocabulary.
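Conditions (a)-(c) can be made concrete with a toy model. The following is a minimal sketch, assuming an invented four-member world, an invented two-word primitive basis, and set-valued extensions; none of these names or values come from the text. It shows only the division of labor the section describes: definitions pass the question of extension down to the primitives, whose extensions have to be supplied from somewhere else.

```python
# Toy illustration only: the world, the lexicon and the stipulated
# extensions are assumptions made up for this sketch.
WORLD = {"john", "bill", "mary", "rex"}

# Condition (c): the primitive basis is interpreted without appeal to
# definitions. Here the extensions are simply stipulated, standing in for
# whatever account (causal or otherwise) is supposed to fix them.
PRIMITIVE_EXTENSIONS = {
    "man": {"john", "bill"},
    "married": {"bill", "mary"},
}

# Condition (a): every defined word is equivalent to an expression built
# from primitives plus logico-syntactic vocabulary ("and", "not").
DEFINITIONS = {
    "unmarried": ("not", "married"),
    "bachelor": ("and", "man", "unmarried"),
}


def extension(expr):
    """Return the set of individuals in WORLD that satisfy expr."""
    if isinstance(expr, str):
        if expr in PRIMITIVE_EXTENSIONS:
            return set(PRIMITIVE_EXTENSIONS[expr])
        if expr in DEFINITIONS:
            # The definition only delays the question of interpretation.
            return extension(DEFINITIONS[expr])
        raise KeyError(f"{expr!r} is neither primitive nor defined")
    op, *args = expr
    if op == "not":
        return WORLD - extension(args[0])
    if op == "and":
        result = set(WORLD)
        for arg in args:
            result &= extension(arg)
        return result
    raise ValueError(f"unknown logico-syntactic operator {op!r}")


print(extension("bachelor"))
```

Deleting the entry for “man” from `PRIMITIVE_EXTENSIONS` makes `extension("bachelor")` fail, which is the point at issue: the definitions relate forms to forms, and everything depends on the primitive basis being interpreted independently.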
I.b. Intersentential relations: definitions underwrite the validity of informally valid arguments
There are lots of ways of thinking about logic; here’s one. People have pretheoretic intuitions about the validity of arguments. These are intuitions to the effect that the conclusion of an argument does (or doesn’t) ‘follow from’ its premises; that the truth of the premises does (or doesn’t) guarantee the truth of the conclusions, etc. Logic provides a ‘rational reconstruction’ of these intuitions by systematizing, correcting, extrapolating and extending them. Or, at least, it does so insofar as validity intuitions turn upon the logical form of the sentences which constitute the premises and conclusions of an argument. Logical form is that representation of a sentence which remains invariant under substitution for items in its non-logical vocabulary; and the non-logical vocabulary is specified by enumeration. Roughly, it’s anything except such expressions as “all, some, not, or, equals, if then”, and “and”.
In effect, according to this view, logic provides an account of the validity of an argument insofar as validity is mediated by the behavior of expressions in the logical vocabulary. One might go further and say something like this: logic provides an account of the validity of an argument insofar as its validity turns upon the meanings of items in its logical vocabulary. On this view, one has said what there is to say about the meaning of a word like “and” when one has said (for example) that (P is true and Q is true) ≡ (P and Q is true) is valid.
If one does think about logic this way, one might well be led to ask: what’s so special about the logical vocabulary? Suppose that argument 4 is valid in virtue of the meaning of “and” (and “therefore”). Isn’t it equally plausible that argument 5 is valid in virtue of the meaning of “bachelor”?
If the goal of logic is to reconstruct pre-theoretic intuitions of validity, isn’t the second case as apt for treatment as the first?
4 John left and Mary wept, therefore Mary wept.
5 John is a bachelor, therefore John is unmarried.
In short, it’s possible to view standard logic as providing a reconstruction of validity for only a rather arbitrary selection of the intuitively valid arguments. It then becomes natural to seek a more extended treatment; one which provides an account of the informally valid arguments. Informally valid arguments are those whose validity turns, at least in part, upon the meaning of items in the non-logical vocabulary.
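The contrast between arguments 4 and 5 can be checked mechanically. The sketch below is an illustration under stated assumptions: it treats each clause as an unanalyzed propositional atom (the atom names are invented here) and tests validity by exhaustive truth table, which is one standard way of making "valid in virtue of logical form" precise for the propositional case.

```python
from itertools import product


def is_formally_valid(premises, conclusion, atoms):
    """True if no assignment of truth values to the atoms makes every
    premise true and the conclusion false."""
    for values in product([True, False], repeat=len(atoms)):
        row = dict(zip(atoms, values))
        if all(premise(row) for premise in premises) and not conclusion(row):
            return False
    return True


# Argument 4: "John left and Mary wept, therefore Mary wept."
# Its form is: P and Q, therefore Q.
arg4_valid = is_formally_valid(
    premises=[lambda r: r["john_left"] and r["mary_wept"]],
    conclusion=lambda r: r["mary_wept"],
    atoms=["john_left", "mary_wept"],
)

# Argument 5: "John is a bachelor, therefore John is unmarried."
# With "bachelor" and "unmarried" left unanalyzed, its form is just:
# P, therefore Q.
arg5_valid = is_formally_valid(
    premises=[lambda r: r["john_is_bachelor"]],
    conclusion=lambda r: r["john_is_unmarried"],
    atoms=["john_is_bachelor", "john_is_unmarried"],
)

print(arg4_valid, arg5_valid)
```

On this encoding argument 4 comes out valid and argument 5 does not, since nothing in the logical vocabulary connects the two atoms. Whatever makes argument 5 intuitively valid has to be supplied by something beyond the standard apparatus.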
The appeal to definitions comes in here. Suppose we assume that there is a ‘semantic level’ of linguistic representation and that at that level definable expressions are represented by their definitions. So, for example, the semantic-level representation of sentence 6 is something like formula 7.⁶
6 John is a bachelor.
7 John is a man and John is unmarried.
Assume further that principles of valid inference (including, for example, simplification of conjunction, the rule which allows us to infer P from P and Q) apply to the semantic representations of sentences rather than to their surface forms. On these assumptions, we need postulate no principled difference between arguments 4 and 5; in the extended sense of ‘formally valid’ where validity is a relation over semantic-level representations, both these arguments instantiate the formally valid scheme P & Q → P. Or, to put the same point slightly differently, given the present assumptions about semantic representations, we need not alter the standard logical apparatus in order to exhibit the validity of arguments which turn on the meaning of “bachelor”; for, on these assumptions, the word “bachelor” doesn’t even occur at the level of representation for which validity is formally defined. All that occurs there is the (conjunctive) definition of that word. Similarly, mutatis mutandis, for other definable expressions.
The idea that systematic exploitation of the notion of definition might provide for an account of intuitions of informal validity enters the modern linguistic literature with Katz and Fodor (1963) and has been widely influential in ‘linguistic semantics’ (for a review, see J. D. Fodor, op. cit.). It connects in obvious ways with the definitional theory of language and the world sketched in I.a. For example, if the argument from “bachelor” to “unmarried” is valid in virtue of the meaning of “bachelor”, it’s hardly surprising that the extension of the former is included in the extension of the latter.
There are, nevertheless, several reasons for viewing the definitional account of informal validity with considerable suspicion. We will consider three.
First, it’s by no means certain that all informally valid arguments will be revealed as formally valid (as subsumed by the inferential apparatus of standard logic) even if couched at the (putative) level of semantic representation. In this respect, the “bachelor → unmarried” case may be quite misleading. Consider, for example, a kind of case which we will presently
⁶There’s every reason to believe that, if there are such things as semantic representations, they must be syntactically analyzed formulae; perhaps they are something like tree structures, as practically all linguists have assumed. For present purposes we can ignore this, but it will be important further on.
return to at length: the informal validity of arguments like 8.
8 John killed Mary → Mary died.
According to the conventional wisdom, the definition of “kill” is “cause to die”, so that “John killed Mary” comes out as “John caused Mary to die” at the level of semantic representation. There is, however, no rule of standard logic which underwrites the validity of arguments like 9; this latter inference appears to turn essentially on the meaning of “cause”,⁷ which is not itself a logical word.
9 John caused Mary to die → Mary die(d)
Of course, it’s conceivable that we could make 9 formally valid by replacing “cause” by its definition. But, as things now stand, nobody knows whether “cause” has a definition;⁸ or, if it does, how it ought to be defined. And there is certainly no reason at all to believe that such a definition, if somebody could find it, would render arguments like 9 formally valid. In short, the idea that informally valid arguments will prove to be formally valid when couched at the semantic level is equivalent to the idea that only their logical form is relevant to determining the validity of semantic-level arguments; and, as things now stand, there is simply no reason to believe that this is true. If it is not, then a reconstruction of informal validity may require an enrichment of the inferential apparatus of standard logic (e.g., the incorporation of rules which govern the behavior of formulae containing words like “cause”) even if it also assumes the existence of definitions.⁹
The second consideration which militates against definitional accounts of informal validity is that there appear to be at least some informally valid arguments which cannot be reconstructed by appeal to the definition relation.¹⁰ The point turns upon the symmetry of the inferences which definitions license. Suppose that “bachelor → unmarried” is valid in virtue of the
⁷Compare the invalid argument “John wanted Mary to die → Mary die(d)”.
⁸If you just caught yourself thinking: ‘but, surely, every word has a definition’, that shows that you are in the grip of TSP. Words in the primitive basis are not definable, by assumption. Perhaps “cause” is one of these.
⁹There is, of course, nothing incoherent about the proposal that a theory of informal validity requires both the existence of a semantic level and the extension of the logic. It’s our impression, in fact, that most linguists who opt for definitions opt for an extended logic as well (barring, perhaps, Prof. Katz). We’ll see, however, that there’s a prima facie parsimony argument against such ‘mixed’ theories since it’s adequately clear that any argument whose validity can be expressed by an extended logic plus definitions can equally be expressed by an appropriately extended logic without definitions.
¹⁰For discussion, see J. D. Fodor (op. cit.) and also Geach (1957) where this point is made the basis of an argument against ‘abstractionist’ accounts of concept acquisition.
definition of “bachelor”. Then we can be sure that there will be some predicate P (in fact, the predicate “man”) such that “unmarried & P → bachelor” is also valid.¹¹ Quite generally, if an informally valid argument turns on a definition, then there will be some clause that we can conjoin to the consequent which will make the corresponding bi-conditional true. Any informally valid argument which does not meet this condition can’t be a definitional argument. The problem is that there appear to be informally valid arguments which don’t meet this condition. The standard examples are formulae like “if x is red then x is colored”. A moment’s reflection should serve to make clear that there is no predicate P such that “x is P and colored → x is red” is valid; hence that the validity of the first formula can’t follow from the definition of ‘red’.¹²
The moral here parallels the one we drew from the validity of arguments like 9. Even given the apparatus of definitional analysis, it looks as though some informally valid arguments can’t be captured within the inferential apparatus of standard logic. Rather, to get “red → colored” we will need a special rule of inference that does for “red” what the standard logical rules do for the operators, connectives and quantifiers. “Red” shows what “cause” suggests: assuming a semantic level won’t, all by itself, buy you a theory of informal validity.
This brings us to our third point, which is that there is a serious alternative to appealing to definitions as part of an account of informal validity. TSP proposes a theoretical apparatus that looks like Fig. 1: definitions apply to syntactically analyzed natural language formulae to provide domains for the inferential apparatus of standard logic. The alternative account looks like Fig.
2: syntactically analyzed representations provide domains for an enriched inferential apparatus; one which contains rules which govern the behavior (not just of the logical vocabulary but also) of such non-logical words as “bachelor”, “cause”, “red” and the rest. In point of terminology, such non-standard rules of inference are traditionally called “meaning
¹¹If the definition of “bachelor” were just “unmarried”, then the condition is satisfied vacuously; i.e. “bachelor → unmarried” and “unmarried → bachelor” would both be valid. It is not, by the way, required that P be an atomic predicate in “unmarried & P → bachelor”.
¹²Of course “x is red and colored → x is red” is valid, but it’s presumably not a candidate, since definable expressions (including, by assumption, “red”) are not available at the semantic level. Alternatively, we could take “colored” to be the defined term, so that the semantic representation of “x is colored” is something like “x is red, or green, or purple, or brown...” This treatment would give “red → colored” as valid, but it will not commend itself to anyone who wants a psychologically plausible semantics; e.g. who wants the semantic representation of a sentence to be what is internally displayed when tokens of the sentence are understood.
Figure 1. Sentence understanding systems in TSP. [Diagram not reproduced; legible labels: SYNTAX, DEFINITIONS, ASSESSMENTS OF ARGUMENTS.]
Figure 2. Sentence understanding systems on meaning postulates view. [Diagram not reproduced; legible labels: SYNTAX, MEAN. POST., LEXICALLY, LOGIC, ASSESSMENTS OF ARGUMENTS.]
postulates”. (See Carnap (1947), Kintsch (1974), Fodor, Fodor and Garrett (1975).) There’s an inclination to argue as follows: if your theory is willing to acknowledge an inference rule (meaning postulate) of the form “x is a bachelor iff x is unmarried and a man”, isn’t your theory really indistinguishable from one which acknowledges “‘bachelor’ means ‘unmarried man’” as a definition? The answer to this question is certainly “no”. Looked at from the linguist’s point of view, the two theories disagree on what levels of representation there are (levels of linguistic representation are individuated, inter alia, by their vocabulary. According to the definitional view, there is a level of description at which “bachelor” is unavailable for the representation of “John is a bachelor”; whereas, the meaning postulate account denies that this is so.) Looked at from the psychologist’s point of view, the theories disagree on what representation of tokens of “John is a bachelor” is recovered when they are understood. The definitional view holds that such representations have the form “...unmarried man...”; whereas the meaning postulate view holds that such representations have the form “...bachelor...”. In short, the two theories differ in just about every way that two theories
can, given that they both assume as a datum that “bachelor → unmarried man” is valid.
As things now stand, it’s hard to see any very decisive reason for preferring the TSP account to the meaning postulate alternative. It’s sufficiently obvious, for example, that if the validity of an argument can be captured by the former sort of theory, then it can also be captured by the latter. This is because, so far as questions of validity are concerned, definitions just are a special case of meaning postulates. Roughly, they’re the symmetrical ones.¹³ On the other hand, we’ve seen reason to suppose that even if you have definitions, you will have to have meaning postulates as well: arguments like “cause die → die” and “red → colored” probably aren’t formally valid even at the semantic level.¹⁴
So, the best that can be said for TSP, in the present context, is that definitions may form part of the theory of informal validity. On the other hand, there are alternative structures for such a theory and these may be able to do the job without appeal to definitions; for that matter, without appeal to any ‘semantic level’ as that notion is popularly understood. Moreover, the whole discussion proceeds modulo the uncertain assumption that there is such a thing as informal validity; that there’s some sense of “valid” such that both simplification of conjunction and “bachelor → unmarried” are usefully so stigmatized. If, in short, the phenomenon of informal validity is our best reason for believing that there are definitions, then we have no very compelling reason for that belief and TSP is in trouble.
I.c. Sentence comprehension: to understand a word is to recover its definition
If a definition gives the meaning of a word, then it’s natural to assume (a) that knowing what a word means is knowing its definition; and (b) that understanding a (token) utterance/inscription of a word is, or involves, having the definition ‘in mind’. We reserve (a) for section I.d.
¹³More precisely, they’re the ones that are both symmetrical and eliminative.
¹⁴This may well be a special case of the general principle that problems ‘solved’ by appeal to definitions tend to recur in the form of problems about the primitive basis. In the present case, it looks as though ‘red’ and ‘cause’ are good candidates for membership in the primitive basis of English. What, then, shall we do about the informal validity of arguments which turn on the meaning of “red” and “cause”?
The natural way to understand (b) is to situate it in the context of a ‘computational’ account of higher mental processes. According to such accounts, perception (including language perception) involves the assignment of ‘internal representations’ to distal stimuli. These representations specify salient and/or task relevant properties of the stimulus. A theory of perception for a given stimulus domain must say: what these properties are, what format is employed for their mental representation and what mental operations are involved in the assignment of the representations to the stimuli. Viewed in this context, TSP claims that understanding a sentence involves the recovery of its representation at the semantic level¹⁵ where, as we have seen, a semantic level representation is one in which definable expressions are replaced by their definitions. So, according to TSP, to understand a token of “John is a bachelor” involves representing the token as something like “John is an unmarried man”.¹⁶
There are several preliminary points to make about this claim. To begin with, like any interesting theory, it operates only modulo suitable idealizations. Nobody has to maintain that every case of understanding a sentence token involves recovering definitions, even where the sentence contains definable terms; there might be any number of heuristic procedures for avoiding the recovery of definitions in special circumstances. All that has to be claimed is that understanding a sentence token involves recovering the definitions in, as it were, the systematic cases. It follows that - here as elsewhere - bringing about circumstances relevant to testing the theory (in particular, by eliminating the possibility of reliance upon heuristic short-cuts) might well require elaborate experimental manipulation. We will return to this point presently.
Second, there’s the by now familiar point that the theory can at best be vacuously satisfied for sentences which draw their vocabulary entirely from the primitive basis. Perhaps understanding a sentence of the form “...bachelor...” involves computing an internal representation of the form “...man...”.
But if “man” belongs to the primitive basis, then all that the definitional theory of understanding can say is that “man” is its own internal representation; that is, so far as the definitional theory is concerned, to
¹⁵It also claims that the semantic level representation of a sentence is the one that the speaker has in mind and which primarily (causally) explains his producing the token. We’ll concentrate on the TSP model of the hearer for purposes of simplifying the exposition but cf. Fodor and Fodor (forthcoming).
¹⁶TSP doesn’t, of course, claim that understanding tokens of “John is a bachelor” involves hearing them as tokens (of the sentence) “John is an unmarried man”. TSP allows that sentence comprehension requires the recovery of morphological and syntactic as well as semantic representations, and the two sentences are morphologically and syntactically distinct, however much they may converge at the semantic level. The reader may, by the way, be getting rather tired of bachelors and unmarried men, and we apologize for the paucity of our examples. Practically all the plausible examples of definitions come from jargon vocabularies (“ketch”), kinship vocabularies (“grandmother”) and axiomatized systems (“triangle”). This rather limits one’s range of choices and is a fact which should cause adherents of TSP to ponder.
understand an utterance of a sentence of the form “...man...” is just to compute a token of a semantic representation of the form “...man...”. We stress this point because it’s part of the intuitive appeal of the definitional theory that it avoids the necessity of saying things like “bachelor” is defined as “bachelor”, or the internal representation of “bachelor” is “bachelor”. Progress appears to be made when tokens of “bachelor” are internally represented by tokens of some other formula (like “unmarried man”). Once again, however, this is only progress towards the primitive basis; once one gets there, some notion of understanding a word other than recovering its definition will have to be invoked.
As usual, then, reflection leads to the moral that nothing principled changes if we consider the entire vocabulary to be primitive, so that each word functions as its own internal representation for purposes of sentence comprehension. In fact, the account of the lexical aspect of sentence comprehension that emerges from such treatment is actually quite attractive; it’s a serious alternative to TSP.
Any theory of sentence comprehension must somehow license a distinction between processes involved in understanding a token and processes involved in exploiting the information which the token carries. According to TSP, this distinction is drawn at the semantic level. That is, the output of the sentence comprehension system is the semantic representation of the sentence. This output, in turn, provides a domain for such further transformations as logical and inductive inferences, comparison with information in memory, comparison with information simultaneously available from other perceptual channels, etc. Since these ‘extra-linguistic’ transformations are defined over the semantic representations, they have access to the definitions of the lexical items that the sentence contains.
Whereas, on the alternative theory, extra-linguistic transformations are defined directly over the grammatical form of the sentence, roughly, over its syntactic structural description (which, of course, includes a specification of its lexical items).¹⁷ To say that the output of the syntax provides the domain for extra-linguistic transformations is, to all intents and purposes, to suggest that understanding a sentence token is just assigning it to a sentence type. The specification of a sentence type requires an ambiguity-free notation, but
¹⁷We shan’t consider the possibility that understanding a sentence involves recovering a representation which specifies its logical form in the traditional sense of that notion; roughly, a specification which formally determines such properties of the sentence as quantifier and operator scope, variable binding, etc. but which provides no access to internal structure in lexical items. This, in fact, seems to us quite a plausible view, but it’s independent of the issues about definition which are our primary concern.
that is precisely what syntax (together with appropriate subscripting for ambiguous lexical items) ought to provide. Notice that, on either account, the sentence comprehension system functions to provide domains for the extra-linguistic transformations. Both accounts thus postulate a sharp distinction between the mechanisms of sentence comprehension and those which determine the consequences (inductive, logical, plausible, etc.) of information that sentence tokens convey. The theories differ only in respect of the character of the representations that the sentence comprehension system provides; one alleges, and the other denies, that the system has access to (putatively) definable expressions like “bachelor”.
There is, in fact, some rather tentative experimental evidence in favor of the non-definitional view. TSP entails that representations which specify definitions are internally displayed in the process of sentence comprehension; but experiments which have sought to test the psychological reality of such representations have not, thus far, met with notable success. In general, the tactics of such experiments involve (a) bringing about a situation in which it seems plausible to claim that S has understood a stimulus sentence; and then (b) attempting to show that some parameter of S’s response is sensitive to properties of the definitions of items in the sentence. For example, Kintsch (op. cit.) used the phoneme monitor task (and other experimental procedures) in an attempt to show that RT to lexical items in a sentence is a function of the relative complexity of their definitions; and Fodor, Fodor and Garrett (op. cit.) used a speed-of-inference task to determine whether words which contain “negative” in their definitions (such as “bachelor”, which means “not married man”) show typical chronometric effects of the presence of negative morphemes. Both experiments were unsuccessful as, indeed, untutored intuition might have predicted.
There is, for example, no intuitive support for the following entailment of TSP: given two otherwise identical sentences which are respectively of the form “...Li...” and “...Lj...” and such that the definition of Li is a proper part of the definition of Lj, the second sentence should be more complex than the first. For example, TSP predicts that “John is a bachelor” should be more complex than “John is unmarried” since the definition of “unmarried” is a proper part of the definition of “bachelor”. And it predicts that “John broke the glass” is more complex than “the glass broke” since, according to the conventional wisdom, “break (transitive)” is defined as “cause to break (intransitive)”. Neither prediction appears to have much intuitive warrant, and, as we remarked in the preceding paragraph, attempts at experimental vindication have, in general, not proved successful.
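The entailment just stated can be spelled out in a few lines. This is a minimal sketch under two assumptions that are not in the text: the toy definitions below (written as lists of parts), and the use of a simple count of primitive elements as a stand-in for "complexity". The experiments cited above used reaction-time measures, not this count.

```python
# Hypothetical definitions in the style of the examples above; a word
# absent from this table is treated as belonging to the primitive basis.
DEFINITIONS = {
    "bachelor": ["unmarried", "man"],
    "unmarried": ["not", "married"],
    "break_transitive": ["cause", "break_intransitive"],
}


def expand(word):
    """Replace a word by its definition, recursively, down to primitives."""
    if word not in DEFINITIONS:
        return [word]
    expansion = []
    for part in DEFINITIONS[word]:
        expansion.extend(expand(part))
    return expansion


def predicted_complexity(word):
    """Assumed complexity measure: number of primitives in the expansion."""
    return len(expand(word))


def is_proper_part(word_i, word_j):
    """True if the expansion of word_i is a proper part of that of word_j."""
    return set(expand(word_i)) < set(expand(word_j))


for simpler, richer in [("unmarried", "bachelor"),
                        ("break_intransitive", "break_transitive")]:
    print(simpler, predicted_complexity(simpler),
          richer, predicted_complexity(richer),
          is_proper_part(simpler, richer))
```

Whenever `is_proper_part(a, b)` holds, this measure ranks `b` as more complex than `a`; that ordering is the prediction which, according to the passage, neither intuition nor experiment has borne out.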
Such experimental studies have, however, been roundly criticized for their reliance upon chronometric measures. For example, Katz (1977) has argued that they show at most that subjects employ heuristic shortcuts to avoid the recovery of semantic representations when they are placed under time pressure. Katz doesn’t offer any direct evidence that this is true, nor does he offer any suggestions as to what sort of task might have construct validity for testing the psychological reality of definitions. We’ll return to this worry in section II where we consider evidence from some non-chronometric measures.
To summarize: In section I.b., we saw that there is a trade-off between, on the one hand, theories which propose to save the inferential apparatus of standard logic by constructing a level of linguistic representation at which definitional structure is displayed; and, on the other, theories which dispense with a semantic level but propose to capture informal validity by a suitable enrichment of the inferential apparatus of standard logic. Not surprisingly, we now see that there is the possibility of the same sort of trade-off in theories of sentence comprehension. ‘Deep’ theories of comprehension (like TSP) require definitional analysis as part of the process of decoding tokens. In effect, they say that some inferences (like “bachelor → unmarried man”) must be drawn on pain of failure to understand the sentence. Whereas, ‘shallow’ theories claim that the entire inferential apparatus is extra-linguistic (in the sense that there are no inferences which must be drawn in the course of understanding a sentence; including, NB, inferences which turn on definitions.) It’s an open, and interesting, question how to choose between these views. But what should be clear from our discussion is that this is entirely an empirical question; there’s no a priori reason why theories of sentence comprehension need to assume that definitions express important linguistic relations.
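The two architectures can be set side by side in a toy model. This is a minimal sketch, assuming that a predicate's internal representation can be modeled as a set of predicate names and that meaning postulates are one-way rules from a predicate to its consequents; the tables and function names are invented for the illustration.

```python
# Toy definition table (used only by the 'deep' comprehension route).
DEFINITIONS = {"bachelor": {"unmarried", "man"}}

# One-way meaning postulates: antecedent predicate -> consequent predicates.
# "red" has a postulate but, by assumption, no definition.
MEANING_POSTULATES = {
    "bachelor": {"unmarried", "man"},
    "red": {"colored"},
}


def comprehend_deep(predicate):
    """'Deep' comprehension: a definable word is replaced by its definition,
    so the word itself does not occur in the output."""
    return set(DEFINITIONS.get(predicate, {predicate}))


def comprehend_shallow(predicate):
    """'Shallow' comprehension: each word is its own internal representation."""
    return {predicate}


def infer(representation):
    """Extra-linguistic step: close a representation under the postulates."""
    closure = set(representation)
    frontier = list(representation)
    while frontier:
        predicate = frontier.pop()
        for consequent in MEANING_POSTULATES.get(predicate, ()):
            if consequent not in closure:
                closure.add(consequent)
                frontier.append(consequent)
    return closure


print(comprehend_deep("bachelor"))            # no "bachelor" in the output
print(infer(comprehend_shallow("bachelor")))  # "unmarried" arrives by inference
print(infer(comprehend_deep("red")))          # "colored" needs a postulate
```

Both routes end up licensing “unmarried” from “bachelor”; they differ in whether that happens inside comprehension or after it. The “red” case reaches “colored” only through `infer`, on either route, which mirrors the point that a semantic level does not by itself remove the need for meaning postulates.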
Theories of sentence comprehension - at least in modern cognitive psychology - are functions from sentences onto internal representations. Internal representations are themselves expressions in a formal language. Nothing principled hinges on the size of the primitive vocabularies that this formalism exploits; a fortiori, nothing principled requires that it do without “bachelor”.
I.d. Definitions and theories of concept learning: definitions express the decomposition of concepts into their elements
The discussion in the last three sections has tended to exhibit TSP as one approach among others to a variety of problems about language and mind. We haven’t yet found any very persuasive reason for preferring the definitional account to its alternatives, and we’ve suggested some reasons for supposing that TSP might be seriously flawed. If this is correct, it raises a question which has some interest from the point of view of intellectual history: Why has the definition story been taken so seriously by so many theorists? In this section, we seek to provide part of the answer.
It’s natural - perhaps it’s mandatory - to speak of words as expressing concepts. Two consequences are implied thereby. First, the child’s task of mastering the lexicon of his language is bifurcated into learning the concepts that lexical items express and learning which morpho-phonological forms are used to express those concepts in his language community. Second, the distinction between defined and primitive lexical items generates a corresponding distinction between complex and basic concepts.¹⁸ Given this presumptive parallelism between the lexical and conceptual systems, we can think of the definitions themselves as expressing not only the relations between definable expressions and the primitive vocabulary, but also the relations between complex and basic concepts. That is, we can say both: the meaning of “bachelor” is constructed from the meanings of “unmarried” and “man” and the concept BACHELOR is a construct out of the concepts UNMARRIED and MAN. The rule of definition supplies the principle of construction in both cases; definitions articulate both lexical decompositions and conceptual analyses.
This is a point of some significance since, so far as we can tell, practically all of the influential theories of concept learning in philosophy and psychology have relied heavily upon the possibility of constructing complex constructs from a primitive conceptual basis.¹⁹ There is, in this respect, a direct line which runs from Locke and Hume, through Vygotsky and Bruner, to Winston and Miller and Johnson-Laird. We’ll try to say, in outline, what it is that the views of such theorists have in common.
1. There is a presumed distinction between basic and complex concepts (paralleling, as we’ve seen, the presumed distinction between primitive and definable expressions.)
2. The potential conceptual repertoire of the organism is the closure of the primitive basis under some specified set of combinatorial operations.²⁰ In early versions of TSP, this combinatorial apparatus was implicit in the
18In fact, this puts the matter slightly backwards since TSP is wont to view the distinction between complex and basic concepts as fundamental (e.g., as epistemologically principled) and the distinction between definable and primitive terms as merely derived (primitive terms being the ones which express basic concepts... etc.).
19The glaring exception is the work of the Cartesians and their followers, for whom concept acquisition is not assumed to be a learning process at all. We’ll return to this presently.
20Just as, mutatis mutandis, the potential lexicon is the closure of the primitive vocabulary under whatever logico-syntactic apparatus definitions are assumed to have access to.
formal properties of the presumed associative principles. Recent versions (of which Miller and Johnson-Laird (1976) provide an unusually instructive example) typically assume much richer formalisms taken from logic, set theory or computer mathematics. In either case, however, the form of the theory is quite straightforward: the concepts you can have are the ones that can be constructed out of an inventory of basic concepts by the application of an inventory of combinatorial principles.
3. Given this account of the space of potentially available concepts, the theory of concept learning consists of a set of inductive procedures - to all intents and purposes, an inductive logic - which determines the availability of a complex concept relative to (a) the availability of appropriate basic concepts (in particular, the ones from which the complex concept is constructed) and (b) the experience of the organism. So, in its most familiar form, the theory has it that concept acquisition is a matter of framing and confirming hypotheses. In the Vygotsky/Bruner paradigm, for example, the concept BIK is said to have been learned when some such generalization as “x is Bik iff x is round and red” controls the subject’s sorting behavior. In such accounts, the inventory of basic concepts (together, as usual, with the logico-syntactic apparatus) provides the vocabulary in which the hypotheses are couched.21 And the presumed laws of learning (problem solving strategies, principles of association, or whatever) operate to determine the degree of subjective confirmation of each hypothesis (the extent to which S believes it) as a function of environmental inputs, possibly including error-signals. We are, in fact, inclined to believe that all standard theories of concept learning are variations on this model; that they all consist, fundamentally, of an inventory of basic concepts, a combinatorial apparatus and an inductive logic.
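The three-part model just described (basic concepts, combinatorial apparatus, inductive logic) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the three primitives, the restriction of the combinatorial apparatus to conjunction, the stimulus encoding, and the all-or-none elimination of hypotheses, which stands in for the graded "degree of subjective confirmation" mentioned in the text.

```python
from itertools import combinations

# Hypothetical primitive basis: each basic concept is a predicate over stimuli.
PRIMITIVES = {
    "RED": lambda x: x["color"] == "red",
    "ROUND": lambda x: x["shape"] == "round",
    "LARGE": lambda x: x["size"] == "large",
}

def hypothesis_space(primitives, max_size=2):
    """Closure of the basis under conjunction, up to max_size conjuncts."""
    names = sorted(primitives)
    space = []
    for k in range(1, max_size + 1):
        space.extend(combinations(names, k))
    return space

def holds(hypothesis, stimulus, primitives=PRIMITIVES):
    """True if the stimulus satisfies every conjunct of the hypothesis."""
    return all(primitives[name](stimulus) for name in hypothesis)

def surviving_hypotheses(trials, primitives=PRIMITIVES, max_size=2):
    """Keep each hypothesis 'x is Bik iff x is H' that fits every trial."""
    return [
        h for h in hypothesis_space(primitives, max_size)
        if all(holds(h, stim, primitives) == is_bik for stim, is_bik in trials)
    ]

# Invented sorting trials: (stimulus, whether calling it Bik was rewarded).
trials = [
    ({"color": "red", "shape": "round", "size": "large"}, True),
    ({"color": "red", "shape": "round", "size": "small"}, True),
    ({"color": "red", "shape": "square", "size": "large"}, False),
    ({"color": "blue", "shape": "round", "size": "small"}, False),
]
print(surviving_hypotheses(trials))  # [('RED', 'ROUND')]
```

Note that the sketch takes RED, ROUND and LARGE as given: nothing in the procedure could acquire them, which is the feature of such models that matters for what follows.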
21Strictly speaking, the operation of the inductive procedures requires not only a source of hypotheses, but also a canonical format for the representation of the (dis)confirming data. The inventory of basic concepts serves both functions: RED and ROUND, for example, occur both in the subject’s hypotheses (“x is Bik iff it’s red and round”) and also in his internal memory representation of the experimental trials and their outcomes (“on trial n, the distal object was red and round and the hypothesis that it was Bik was rewarded”).
We’d be pleased to hear if the reader can think of a counter-example; we cannot. (For further discussion, see J. A. Fodor (1975), especially Chapters 1 and 2.) The point of present concern is this: if our account of concept learning models is correct, then all such models are theories of the inductive acquisition of complex concepts relative to the presumed availability of a primitive basis. WHAT THEN OF THE ACQUISITION OF THE CONCEPTS IN THE PRIMITIVE BASIS? It seems to us that there is only one possible answer:
theories of concept learning presuppose the availability of the primitive conceptual basis; they don’t explain it. If, however, the primitive basis is presupposed in concept learning, then it cannot itself be learned. If it is not learned, then, presumably, it is innate. The claim, then, is that all standard theories of concept learning require the innateness of the primitive basis and explain at most the acquisition of complex concepts relative to the availability of that basis. This claim may seem quite radical, but if it does that is only because the logical structure of theories of concept learning has not been widely appreciated in the modern literature. In fact, the idea that everyone has always been committed to the innateness of the basic concepts was common ground in many of the early discussions. William James, for example, who can hardly be viewed as a wild-eyed Nativist, comments as follows: “The first thing I have to say is that all schools (however they otherwise differ) must allow that the elementary qualities of cold, heat, pleasure, pain, red, blue, sound, silence, etc. are original, innate or a priori properties of our subjective nature, even though they should require the touch of experience to waken them into actual consciousness, and should slumber to all eternity without it.” (Principles of Psychology, Vol. 2, p. 618). In fact, James is here quite close to Descartes; they both realize that concept acquisition presupposes a primitive basis, hence that the basic concepts cannot themselves be learned. On the other hand, neither thinks that the availability of basic concepts is causally independent of experience (of, for example, the activation of the sensorium). From the contemporary point of view, one might say that James is a triggering theorist vis-a-vis the primitive basis and a constructivist (of the Associationist variety) about complex concepts.
Whereas Descartes is a triggering theorist about practically everything: if the acquisition of concepts requires sensory stimulation, that is not because “ . . .these extraneous things [distal stimuli] transmitted the ideas themselves to our minds through the organs of sense, but because they transmitted something which gave the mind occasion to form these ideas, by means of an innate faculty, at this time rather than at another.... Hence it follows that the ideas of . . . movements and figures are themselves innate in us. So much the more must be the ideas of pain, colour, sound and the like be innate, that our mind may, on the occasion of certain corporeal movements envisage these ideas.” (Quoted by Adams, p. 770).22
22What, according to Descartes, shows that sensory stimulations are at best ‘occasions’ (viz. triggers) for the formation of concepts is the dissimilarity between our concepts and their distal causes. A modern way of making this (very perceptive) point is that our experience does not, in general, provide a good inductive sample for the concepts we acquire. (See Chomsky (1975) on the relation between the corpus which triggers language acquisition and the grammar thereby induced.)
These reflections put a rather significant twist on how one construes the traditional Rationalist-Empiricist debate. If all parties are committed to the innateness of the primitive basis, then the residual dispute must be over how much of the (potential or actual) conceptual repertoire is primitive. Here, surely, is one source of the widespread commitment to the existence of definitions. If there are no definitions then presumably the entire lexicon is primitive. If the entire lexicon is primitive, then presumably all the concepts that lexical items express are primitive. If all the concepts that lexical items express are primitive, then presumably all the concepts that lexical items express are innate. If that does not precisely amount to the innateness of all concepts, it is quite enough to give an Empiricist the willies.23 The moral, then, is this: what is not definable must be innate. Most of us are inclined to assume that it just can’t be the case that all concepts are innate; most of us have therefore thought that many - indeed, very many - concepts must be definable. This is a persuasive line of argument if Empiricism is true. But what if Empiricism isn’t true? In short, the argument cuts both ways; if there is evidence that there are no definitions, then that is evidence against the standard views of concept learning. Since the standard views are the only ones we’ve got and since the only alternative to concepts being learned is that concepts are innate, it appears that the substance of the Rationalist-Empiricist debate turns quite centrally on the empirical status of the definition construct. That makes the empirical status of the definition construct a matter of very considerable interest. We are about to turn to it.
I.e. Summary
We’ve sought to show, in the preceding discussion, how the assumption that there are definitions plays a variety of roles in the conglomerate of theories that we’ve called “The Standard Picture”. We’ve seen, in particular, that theories of concept learning are more heavily invested in the psychological reality of definitions than might at first appear; indeed, that the dispute
23There is, of course, an infinity of concepts if one uses “concept” to denote not just what is expressed by words but also what is expressed by phrases. So, in particular, even if one claims that all lexically encoded concepts are primitive - hence innate - there will be infinitely many phrasally encoded concepts that are not primitive, hence learnable; even if the concepts expressed by “tin” and “trumpet” are basic, the concept expressed by “tin trumpet” is complex. Much of what modern theories of language (since Frege) are about is exhibiting the mechanisms in virtue of which a finite, lexically encoded primitive basis is projected onto an infinity of phrasally encoded forms which express complex concepts.
between Empiricist and Rationalist accounts of the acquisition of concepts turns largely upon this issue. If TSP were clearly true, that would in itself be a conclusive argument for endorsing the definition construct; if you accept a theory, you must accept its entailments. But, as we hope we’ve convinced the reader, there are plausible alternatives to TSP, and these alternatives are not committed to definitions. It would obviously be desirable if the question could be brought to empirical test. Equally obviously, no single experiment could validate (or refute) the whole of TSP. The goal of Part II is to discuss experimental evidence relevant to assessing just one aspect of the standard picture: the doctrine (discussed in I.c.) that understanding a sentence token involves recovering (e.g., displaying in working memory) the definition of such lexical items as the sentence contains. This is part (though by no means all; see Part III) of what one might mean by saying that definitions are ‘psychologically real’24 for purposes of sentence comprehension.
Part II: The Psychological Reality of Causative Constructions: An Experimental Inquiry
As we remarked above, a test of I.c. would require first bringing about an experimental situation in which subjects understand a sentence token, and then determining whether their behavior in that situation is sensitive to properties of the definitions of the words in the sentence. Constructing such a test involves solving three methodological problems: (a) we need stimulus materials that contain words which we can be reasonably certain have definitions if any words do; (b) we need to choose as the manipulated variable a property which we can be reasonably certain that the definitions have if they exist at all; and (c) we need to find an experimental measure which we can be reasonably certain is sensitive to the presence or absence of that property in S’s internal representation of the stimulus sentences. We will discuss (a) and (b) under the head “materials” and (c) under the head “methods”.
24We assume throughout that “there are definitions” and “definitions are psychologically real” are just two ways of saying the same thing; more generally, that psychological states and processes provide the truth conditions on existential claims in linguistics. We realise that this view is sometimes denied, but we are quite unmoved thereby. For discussion, see the section entitled Psychological Reality of Grammars in Block (1980).
II.a. Materials
It’s hard to find good examples of definitions, and this fact prejudices any results in a test of I.c. Such results might always be ‘explained away’ by claiming that the stimulus items aren’t really among the definable expressions, or that, if they are, the definitions assumed by the experiment aren’t the right ones. One can’t test the entire lexicon, and random sampling would be meaningless without a criterion for distinguishing definable from primitive terms. The indicated strategy is therefore a ‘best case’ approach: test for definitions where the linguistic argument for definability is strongest. This is the strategy that we have followed. It seems clear that, in the present state of the literature, the examples which provide the best arguments for definitions are the ‘causative’ verbs. English causative verbs include, for example, “break, tear, smash, bounce” etc. and also “kill, feed, remind, grow”, etc. According to the proposed analysis, such verbs have two salient features: they’re transitive (in particular, they satisfy surface syntactic structures like Fig. 3) and they’re non-primitive (in particular, they satisfy semantic analyses like Fig. 4).25
Figure 3. Surface structure for causatives; e.g., “John broke the glass”. [Tree diagram not recoverable from the source; terminal string: John broke the glass.]
Figure 4. Semantic analysis for causatives; e.g., “John broke the glass”. [Tree diagram not recoverable from the source.]
25Notice that the morphological identity between the transitive and intransitive form of a verb like “break” is accidental according to the present analysis. That is, we have examples like “John broke the glass/The glass broke” but we also have examples like “John killed Mary/Mary died”. “Die” is thus taken to be the intransitive counterpart of “kill”, just as “break” is taken to be the intransitive counterpart of “break”.
This account of the causatives has several interesting features. First, as just mentioned, it makes causatives defined expressions; whereas their intransitive counterparts are primitive, at least so far as this analysis is concerned. Since causatives are defined, they do not appear in semantic-level representations, and this raises the question of how they manage to get into surface forms. Linguists who agree that causatives are defined tend to disagree on this latter issue; roughly, it divides the ‘generative’ from the ‘interpretive’ schools of linguistic semantics. According to the former, the mapping from semantic onto surface structures is transformational (raising and lexicalization are critically involved). According to the latter, it is nontransformational and accomplished by rules of a form that remain to be specified. This issue has been extensively reviewed elsewhere (see, in particular, J. D. Fodor (op. cit.)) and need not concern us here. It’s sufficient, for our purposes, that there is wide agreement on the structures despite the disagreement about the derivations.26 The second noteworthy point about the definitional account of causatives is that it makes sentences like “John broke the glass” both surface simplex and deep multiplex; that is, they’re one clause sentences at the level of surface structure, but they contain (at least) two clauses at the level of semantic representation. The fact that causative constructions are assumed to be deep
26It’s a good rule of thumb that a theorist accepts the definitional account of causatives iff he holds TSP. You’ll find versions of the analysis in sources as scattered as Katz (1972); Schank (1975); Lakoff (1970); McCawley (1971); Miller and Johnson-Laird (1976) etc. This rule must, however, be applied with caution; Katz, for example, accepts both a definitional view of the lexicon and a radically nativist view of the acquisition of concepts.
multiplex has a consequence which will be decisive for our experimental manipulation: there are critical pairs of items in causative constructions which are analyzed as grammatically related in the surface structure but NOT grammatically related at the level of definitional/semantic representation.
Consider, for example, “John” and “the glass” in “John broke the glass”. These items are related as subject and object of the verb “break” in the surface sentence (see Fig. 3); but they are not so related in the putative semantic representation (see Fig. 4). In fact, on the definitional analysis, there is no verb of which “John” and “the glass” are both arguments at the semantic level. Rather, “John” turns up as the subject of the underlying verb “cause”, and “the glass” turns up as subject of the underlying verb “break_intransitive”. Another way of putting this is as follows: given what is presumably the intended interpretation of the logical syntax, “John broke the glass” does not express a relation between John and a glass. On the contrary, it expresses a relation (of causing) between John and an event (viz., the glass breaking). Similarly, according to the definitional analysis, “John killed Mary” does not express a relation between John and Mary... etc.
We said that the definitional treatment of causatives represents a best-case example of definability. We are thus under some obligation to show that the analysis has face plausibility. There’s the following to be said in its favor:
1. “John broke the glass” does imply “the glass broke”, and this fact is predictable from the putative semantic representation of “break_transitive” given the appropriate meaning postulates for “cause”.
2. There exist languages (including, apparently, Japanese) in which verbs of causative import have an explicit surface morphological marking. If this doesn’t argue directly that surface verbs like “break_transitive” are represented at the semantic level by “...cause...”, at least it suggests that “causative” is a morpheme category which universal grammar will have to acknowledge.27
3. By far the most persuasive argument for the definitional treatment of causatives is that there exist types of sentential ambiguity for which it provides an elegant and appealing explanation.
Consider sentence 10:
(10) John almost killed Mary
It’s argued
27We are not, then, denying that the lexicon is cross-classified by features like “causative”. On the contrary, the existence of such a cross-classification is suggested by a variety of linguistic and psychological phenomena. What we do deny is that the cross-classification of the lexicon implies that lexical items are semantically represented by their definitions. That the mechanism of cross-classification must be definitional has been widely but gratuitously assumed by theorists in linguistics, psychology, and AI. Yet, alternative mechanisms are easily imagined, and we know of no substantive reasons why definitions are so widely preferred. For further discussion, see Fodor, Fodor and Garrett, op. cit.
that 10 can mean either that John almost brought about Mary’s death (e.g., he brought the poison but then he changed his mind) or that John brought it about that Mary almost died (he brought the poison and he fed it to her, but they saved her with a stomach pump.) The difference between the readings is the difference between attempted homicide in the latter case and mere premeditation in the former. The point of these observations is the following: if we accept the definitional analysis of “kill”, and if we accept that the scope of adverbs is defined over semantic-level representations, then we can predict the possible adverb scopes for sentences like 10 purely geometrically; viz. by the principle that an adverb can have scope over any clause in a semantic representation. The difference between the two readings of 10 is thus captured by the difference between Fig. 5 and Fig. 6.
Figure 5.
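Long scope analysis of adverb in “John almost killed Mary”. [Tree diagram not recoverable from the source; surviving node labels: ADV, NP, N, VP, almost, John, cause, Mary, die.]
Figure 6. Short scope analysis of adverb in “John almost killed Mary”. [Tree diagram not recoverable from the source; surviving node labels: N, John, cause, ADV, almost, S2.]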
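The geometrical principle at issue (an adverb can take scope over any clause in a semantic representation) can be stated as a short procedure. The sketch below is illustrative only: the nested-tuple encoding of clauses and the names KILL_DEFINITIONAL and KILL_PRIMITIVE are assumptions, not the notation of Figs. 5 and 6.

```python
# A clause is a tuple ("S", subject, verb, complement...); a complement may be
# a plain string (a noun phrase) or another clause (an embedded S).

def clauses(node):
    """Yield every clause in a representation, outermost first."""
    if isinstance(node, tuple) and node[0] == "S":
        yield node
        for child in node[1:]:
            yield from clauses(child)

def render(node):
    """Bracketed string for a clause; plain strings are returned unchanged."""
    if isinstance(node, tuple):
        return "(" + " ".join(render(child) for child in node[1:]) + ")"
    return node

def scope_readings(adverb, tree):
    """One predicted reading per clause the adverb could take scope over."""
    return [f"{adverb} {render(clause)}" for clause in clauses(tree)]

# "John killed Mary" under the definitional analysis (two clauses) ...
KILL_DEFINITIONAL = ("S", "John", "cause", ("S", "Mary", "die"))
# ... and with "kill" treated as primitive (one clause).
KILL_PRIMITIVE = ("S", "John", "kill", "Mary")

print(scope_readings("almost", KILL_DEFINITIONAL))
# ['almost (John cause (Mary die))', 'almost (Mary die)']
print(scope_readings("almost", KILL_PRIMITIVE))
# ['almost (John kill Mary)']
```

On this way of counting, the number of predicted readings is simply the number of clauses, so any analysis that adds or removes a clause changes the prediction accordingly.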
The force of the argument, then, is that accepting the causative analysis of “kill” allows us to preserve the following generalization about the scope of adverbs: any form with the deep geometry of (NP Verb (S))s will provide a source of scope ambiguities analogous to sentence 10. The generalization applies not only to 10 itself, but also to ambiguous sentences like “John cooked the meat slowly” (“cook_transitive” = df “cause to cook_intransitive”) etc. Lest this argument seem so good that it settles the case in favor of the definitional decomposition of causatives, a few caveats ought to be entered. First, it’s not all that obvious that sentences like 10 are ambiguous. It’s not clear how to decide between structural ambiguity (à la the definitional story) and mere disjointness of truth conditions. This is a general problem with the evaluation of ambiguity arguments that are proposed to motivate multiplicity of linguistic representations, and we don’t know how to solve it. Second, there are alternative accounts of the (putative) ambiguity of 10. According to these accounts, such ambiguities are handled either by appeal to the interpretative apparatus (see Dowty, 1976) or to principles which define adverb scope over surface structure (see Chomsky, 1972). Third, and most important, it appears that the generalization that adverb scope is definable over the geometry of semantic representations cannot, after all, be sustained. Notice that ‘try-verbs’ have the same (putative) definitional geometry as causatives. So, for example, “seek” = df “try to find” and “chase” = df “try to catch” precisely parallel to “kill” = df “cause to die” or “break” = df “cause to break”. That is, the underlying geometry of Fig. 7 is precisely congruent with the underlying geometry of Fig. 4; what distinguishes them is just that in 4 the embedding verb is “cause” whereas in 7 the embedding verb is “try”.
Nevertheless, the rule of adverb scope assignment that works for causatives doesn’t work for try-verbs. So, “Schliemann almost sought the site of Troy” means almost (Schliemann tried to find the site of Troy) but has no reading Schliemann tried (to almost find the site of Troy). Similarly, “John almost chased Mary” has no reading John try (to almost catch Mary); rather it means univocally almost (John try to catch Mary). In general, and despite the congruence of Figures 4 and 7, adverbs like “almost” yield only the long scope reading in sentences containing try-verbs. These observations strongly suggest that adverb scope assignment rules are not geometrical; viz., that they will have to be sensitive to particular verb classes (causatives versus try-verbs) even if definitional analyses are allowed. This means that ambiguities like “almost kill” provide no argument for a two clause analysis of causatives. The most they show is that scope
Figure 7. Schematic semantic analysis for “try” verbs; e.g., “John chased Mary”. [Tree diagram not recoverable from the source; surviving terminal labels: John, John, catch, Mary.]
rules apply (inter alia) in virtue of the causativity of verbs, and that that requirement could be met even by scope rules defined over surface structure, assuming that the lexicon is cross-classified by some such feature as “± causative”. The argument that the causative analysis allows us to preserve a generalization about the geometrical character of adverb scope assignment is unsound because, as it turns out, there is no such generalization to preserve. To summarize: the linguistic evidence for the causative analysis is at best inconclusive, but at least there is some evidence. Causatives are very nearly the only case for which serious distributional grounds for the existence of definitions have thus far been alleged. In practically every other case, the argument has been quasi-methodological; e.g., that definitions must be posited in order to preserve some or other tenet of TSP. It’s therefore the causative analysis that we have chosen for experimental scrutiny. The next question dictating the choice of experimental materials is: what property of the definitional representations ought we to test for? Here there are two main considerations: we want to choose a non-adventitious property of the definitions, and we want to choose a property which allows for the construction of appropriate controls. So, for example, it would be inadvisable to test the definitional account of causatives by appealing to the fact that “cause” is alleged to be the main verb in the deep representations. Nothing essential to the theory chooses between, say, “cause” and “bring about”, so if there were reason to believe that the definition of “break” contains a Verb + particle construction, the
analysis could accommodate that fact without essential revision. On the other hand, the analysis does seem to be essentially committed to a claim that we remarked upon above: the pattern of grammatical relations among phrases shifts as between the surface form and the semantic representation of a causative construction. In particular, it’s built into the geometry of the analysis that the surface subject and object of a causative verb are not in grammatical relation in the corresponding definition. This, then, is the strategy of the experiment: find a test which is sensitive to the distinction between constructions in which grammatically related surface phrases are not so related in underlying representations; then apply that test to the case of causatives. This raises the question of controls, in particular, of determining the construct validity of the test procedure. This issue is especially pressing if, like us, you suspect that the definitional analysis of causal verbs is false. To see this, consider the view on which verbs like “break” are primitive and undefined. On this account, the semantic representation of “John broke the glass” is “John broke the glass”; hence a test which is sensitive to shifts of grammatical relations as between surface and semantic representations would be predicted to fail when applied to causative verbs in general and to “break” in particular. This threatens to leave the experimenter in the nasty position of making a prediction of no-difference and, hence, of being unable to distinguish the truth of his theory from the insensitivity of his test instrument. There is, of course, a way of coping with this sort of problem. What one needs is independent validation of the test instrument.28 Fortunately, this is possible in the present case.
As it turns out, there exist several kinds of sentences (we will call them “shift sentences”) in which the pattern of grammatical relations among phrases changes as between surface and abstract representations, but where the existence of the shift does not depend upon the definitional decomposition of lexical items in the sentence. Moreover, in the case of each of these kinds of shift sentences, it is possible to provide close approximations to minimal pair controls; sentences which share the surface organization of the shift sentence, but where the key grammatical relations do not change as between surface and deep structures.
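The notion of a shift can be made concrete with a small sketch. It assumes, purely for illustration, that a representation is a nested tuple ("S", subject, verb, complement...) and that two phrases count as grammatically related when they are arguments of the same verb; the function names and the two example analyses are hypothetical, not part of the experimental materials.

```python
from itertools import combinations

def related_pairs(node):
    """Pairs of phrases that are arguments of the same verb somewhere in node."""
    pairs = set()
    if isinstance(node, tuple) and node[0] == "S":
        _, subject, _verb, *rest = node
        # Only plain phrases count as arguments here; embedded clauses do not.
        args = [a for a in (subject, *rest) if isinstance(a, str)]
        pairs.update(frozenset(p) for p in combinations(args, 2))
        for child in node[1:]:
            pairs |= related_pairs(child)
    return pairs

def is_shift(surface, underlying, a, b):
    """True if a and b are related in the surface form but not underlyingly."""
    pair = frozenset((a, b))
    return pair in related_pairs(surface) and pair not in related_pairs(underlying)

SURFACE = ("S", "John", "break", "the glass")
# Definitional analysis: two clauses, so John and the glass share no verb.
DEFINITIONAL = ("S", "John", "cause", ("S", "the glass", "break"))

print(is_shift(SURFACE, DEFINITIONAL, "John", "the glass"))  # True
# If "break" is primitive, the underlying form just is the surface form.
print(is_shift(SURFACE, SURFACE, "John", "the glass"))       # False
```

The two calls correspond to the two hypotheses being compared: the definitional analysis predicts that causatives pattern with shift sentences, while the view that “break” is primitive predicts that they pattern with the non-shift controls.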
28Some psychologists appear to believe that the confirmation of predictions of no-difference is somehow inherently uninteresting. This view is, of course, absurd; if it were true, it would be practically impossible ever to provide empirical evidence that the ontological commitments of a theory are false. It is also, by the way, bad history of science. To take just one example: modern estimates of the size of the universe were initially confirmed by the demonstration that the two hundred inch reflector telescope could not resolve Cepheid variables in the Andromeda Nebula.
The general strategy of the experimental design may now be characterized:
Phase 1: Validate the test instrument by showing that it is sensitive to the difference between an arbitrary shift sentence and its non-shift control.
Phase 2: Apply the test procedure to determine whether causatives behave like shift sentences (a positive result tends to validate the definitional analysis of causatives; a negative result tends to disconfirm it.)
We are almost ready to consider the problem of validating the test instrument. Before doing so, however, we need to call attention to a crucial feature of the experimental strategy just described: it depends upon the assumption that any plausible account of the causatives in which they are lexically decomposed will be one in which the subject and object of the surface causative verb are NOT co-clausal in underlying representation. It is thus essential to consider the possibility of alternative decompositional analyses; ones which do not share this feature. (This possibility was first pointed out to us by Professor Zenon Pylyshyn, to whom we wish to express our gratitude. Some day we are going to do him a favor.) Consider, then, a treatment of the causatives according to which the underlying structure of ‘John killed Mary’ is the one represented in Fig. 8.
Figure 8.
Semantic analysis schema for ‘John killed Mary’. [Tree diagram not recoverable from the source; surviving terminal string: John do something to Mary.]
Paraphrased into surface English, this structure says that ‘John killed Mary’ is synonymous with ‘There is something which John did to Mary which caused Mary to die.’ Notice that, according to this analysis, ‘John’ and ‘Mary’ are co-clausal in semantic representation; they occur as subject and indirect object of the abstract verb ‘do’. (Or, to put it more semantically, according to this analysis, ‘John killed Mary’ does express a relation between John and Mary; viz., the relation which holds between x and y just in case x does something to y which causes y’s death). We think that there are decisive difficulties with this suggestion. We are going to examine them in some detail, not only because undermining the analysis is essential to motivating our experimental program, but also because the analysis offers an interesting case study in the difficulty of
constructing defensible definitions. Prima facie plausible proposals are forever leading to unforeseen troubles when their consequences are seriously pursued. To begin with, it may seem that the analysis shown in Fig. 8 can be supported on philosophical grounds since, on certain metaphysical views (see, for example, Davidson (1970)) “cause” and other causatives express relations between events. If such views are written into the logical syntax, then the underlying subject of a causative verb ought to be a sentence, and that is indeed the case according to the analysis in Fig. 8. This is not, however, a persuasive argument in favor of the analysis, since we could get the same effect with Fig. 9, according to which ‘John killed Mary’ means ‘John did something which caused Mary to die’, and in which ‘John’ and ‘Mary’ are, once again, not co-clausal. It may be that there are constructable examples which decide between the analyses in Figs. 8 and 9, but the metaphysical considerations per se do not.
Figure 9.
Semantic analysis schema for ‘John killed Mary’. [Tree diagram not reproduced; recoverable terminal strings: (John do something) cause (Mary die)]
But lack of metaphysical motivation is hardly the major problem with the proposal. What’s more serious is that it is incompatible with the geometrical treatment of adverb scope discussed above; since we now have a three clause structure for ‘John killed Mary’, we presumably predict a three-way ambiguity for ‘John almost killed Mary’, and this prediction is not sustained by intuition. The following difficulties are more serious still. Consider the sentence A (= ‘the wind moved the leaves’); “move (transitive)” is a causative so, according to the present proposal, A is synonymous with (and hence entails) sentence B (= ‘the wind did something to the leaves which caused them to move’). It follows that whenever A is true, there will have to be a (true) answer to the question: ‘What was it that the wind did to the leaves such that it was the wind’s doing that to the leaves which caused them to move?’ Suppose that, in a given case, the answer is C (= ‘the wind exerted force upon the leaves’).
In this case, A is true because C is true. But now consider that C explains the leaves’ moving only if D is true: (D = ‘the wind’s exerting force upon the leaves moved them’). But D itself contains a causative, and is thus itself in need of analysis. According to the analysis proposed, D is synonymous with (and hence entails) E (= ‘the wind’s exerting force upon the leaves did something to the leaves which caused them to move’). Notice, however, that whenever E is true there will have to be a (true) answer to the question: ‘What was it that the wind’s exerting force upon the leaves did to the leaves which caused them to move?’ And, whatever the answer to that question is, it will give rise to a sentence with a gerundive subject and a causative main verb in just the way that C gave rise to D. Patently, the argument iterates indefinitely. Since this consequence is obviously unsatisfactory - there can’t be indefinitely many events between a cause and its effect - we have a reductio ad absurdum of the proposed analysis. Contrary to Fig. 8, causatives don’t contain an existentially quantified variable over events.
We can think of only two plausible replies to this point. First, one might argue that causatives with gerundive subjects don’t lexically decompose (though causatives with non-sentential subjects do). This suggestion will save the analysis, but it seems totally ad hoc. Notice that the same sort of semantic considerations that are taken to favor decomposition in the case of causatives with NP subjects also obtain in the case of causatives with gerundive subjects: ‘the wind’s exerting force on the leaves moved them’ entails ‘the wind’s exerting force on the leaves caused them to move’ just as ‘the wind moved the leaves’ entails ‘the wind caused the leaves to move’.
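The regress can be made concrete with a toy sketch. It is only an illustration of the argument's shape, not part of the original analysis: the function names and the placeholder event labels `e1`, `e2`, ... are assumptions introduced here, standing in for whatever event answers the question "what did the subject do to the object?" at each step.

```python
from itertools import islice

def analyze(subject: str, obj: str, effect: str) -> str:
    """Apply the Fig. 8-style paraphrase once to a causative sentence."""
    return f"{subject} did something to {obj} which caused {obj} to {effect}"

def regress(subject: str, obj: str, effect: str):
    """Yield one analysis after another; the generator never ends by itself."""
    n = 0
    while True:
        yield analyze(subject, obj, effect)
        # The 'something' just quantified over is an event (hypothetical label
        # e_n). It becomes the subject of a new causative sentence, which the
        # same rule is then obliged to analyze again.
        n += 1
        subject = f"event e{n}"

# Take only the first three steps; nothing in the rule itself stops the process.
steps = list(islice(regress("the wind", "the leaves", "move"), 3))
for line in steps:
    print(line)
```

Each step introduces a fresh event that the next step must analyze in turn, which is the sense in which the argument iterates indefinitely.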
Second, it might be argued that, though A can’t be true unless B is, still, the answer to the question that B invites (viz., ‘what was it that the wind did to the leaves which caused them to move?’) could just be ‘it moved them’. This avoids the problem of having to manufacture an indefinite string of effects which, as it were, intervene between the wind’s blowing and the leaves moving. But it invites troubles of its own. For, if we accept this move, then we accept the following dialogue as well-formed (i.e., we accept that A is an answer to Q):
Q) What did the wind do to the leaves which caused them to move?
A) The wind moved the leaves.
And if we accept that dialogue as well-formed, then we must be able to translate it at the semantic level preserving the question/answer relation. Notice, however, that what we get at the semantic level is:
Q’) What did the wind do to the leaves that caused them to move?
A’) The wind did something to the leaves that caused them to move.
We are prepared to stretch intuition to the point of believing that A is an answer to Q, but not to the point of believing that A’ is an answer to Q’. In the light of these considerations, we shall continue to assume that if a decompositional analysis of causatives can be sustained, it will have to be one of the ‘John killed Mary’ - ‘John caused Mary to die’ variety; i.e., one which differs from both Fig. 8 and Fig. 9 in avoiding existential quantification inside the analyzed verbs. It is, in particular, the former kind of analysis that is presupposed in the experiments presently to be reported. We turn now to a discussion of the sentence types used in the first phase of the experiment to validate the test instrument.29
Phase I, Comparison 1: ‘expect-verbs’ versus ‘persuade-verbs’
Consider sentence pair 11a, b.
11a John expected Mary to leave
11b John persuaded Mary to leave
It seems plausible to say that these sentences have the same surface analysis, perhaps the one shown in Fig. 10. Whereas it seems clear that the sentences must differ significantly in their semantic representations. This is because, as we saw in I.b, semantic representations are supposed to provide domains for inferential operations, and 11a and b differ strikingly in the sorts of (informally) valid arguments they enter into. Notice, for example, that 11b entails “John persuaded Mary”, whereas 11a does not entail “John expected Mary”. One might put it, very approximately, that 11a expresses a relation between John and an event (Mary’s leaving), whereas 11b expresses a relation between John and Mary.30 These sorts of considerations have led practically everybody to agree that 11a and b have different abstract representations.31 The standard proposal is that ‘expect-verbs’ enter into abstract configurations like the one in Fig. 11, whereas ‘persuade-verbs’ enter into abstract configurations like the one in Fig. 12.
29The actual stimulus sentences employed are given in the materials section of the Appendix; this discussion will treat examples of each sentence type that are simplified for purposes of clarity. The distinction between the two ‘phases’ of the experiment is likewise expository. In fact, barring one case, all the stimulus materials were tested together and on the same population of subjects; see the methods section of the Appendix.
30What makes this analysis very approximate is, of course, that both sentences are intensional for “Mary leave”.
31There is not, alas, perfect consensus that they have the same surface representations. We’ll return to this problem presently.
Figure 10. Surface structure for ‘expect’ and ‘persuade’ verbs. [Tree diagram not reproduced; recoverable terminal string: John {expected / persuaded} Mary to leave]
Figure 11. Underlying structure for ‘expect’ verbs; e.g., “John expected Mary to leave”. [Tree diagram not reproduced; recoverable terminal string: John expect (Mary leave)]
Figure 12. Underlying structure for ‘persuade’ verbs; e.g., “John persuaded Mary to leave”. [Tree diagram not reproduced]
Notice that if Fig. 10 is right, the present analysis has the consequence that ‘expect-verbs’ (but not ‘persuade-verbs’) are shifters. In particular, the abstract representation of 11b contains a verb (viz. “persuade”) of which “John” and “Mary” are respectively subject and object in both underlying
and surface representations. Whereas, in the case of ‘expect-verbs’, the grammatical relations shift in a manner precisely analogous to the shift that the definitional analysis posits in causal verbs. In particular, “John” and “Mary” are the subject and object of “expect” in Fig. 10, but they exhibit no grammatical relations to each other in the structure shown in Fig. 11; rather, while “John” is subject of “expect”, “Mary” is subject of “leave”. Notice, too, that though ‘expect-verbs’ are shifters according to this analysis, the occurrence of the shifts does not depend upon a presumption of definitional analysis; the present analysis is compatible with (though it does not demand)
the assumption that ‘expect-verbs’ are semantic primitives. Comparisons like expect/persuade thus dissociate the issue of shifting from the issue of definitional decomposition; they permit us to determine the construct validity of a test for shifting without prejudicing the question of the psychological reality of definitions. Analogous remarks apply for all the following comparison sets tested in Phase I.
Phase I, Comparison 2: ‘easy-adjectives’ versus ‘eager-adjectives’
Consider the sentence pair 12a and b.
12a John is easy to please
12b John is eager to please
Once again, it seems plausible that they share a surface structure (see Fig. 13). And, once again, the surface similarity masks a radical difference in logical form. Thus, while 12a attributes a property (easiness) to an event type (pleasing John), 12b attributes a property (being eager to please) to John. The standard analysis marks this distinction by assigning 12a the abstract representation in Fig. 14 while assigning to 12b the representation in Fig. 15.
Figure 13. Surface structure for ‘easy’ and ‘eager’ adjectives. [Tree diagram not reproduced]
If this analysis is right, then “John” is a shifter in 12a but not in 12b. That is, while “John” is the subject of “eager” in both the surface and the semantic representation of 12b, it is the object of “please” in the abstract representation of 12a. Hence, “John” and “easy” are grammatically unrelated in Fig. 14, while they are related as subject and predicate in Fig. 13. This pattern of assumptions is presumed to explain the fact that, while 12b entails “John is eager”, 12a does not entail “John is easy”. A test sensitive to shifting should, therefore, distinguish 12a from 12b.
Figure 14. Underlying structure for ‘easy’ adjectives. [Tree diagram not reproduced; recoverable terminal string: (please John) be easy]
Figure 15. Underlying structure for ‘eager’ adjectives. [Tree diagram not reproduced; recoverable terminal string: John be eager (John please)]
Phase I, Comparison 3: sluicing
Consider the sentence pair 13a and b.
13a John married somebody but we don’t know who
13b John married somebody but we don’t know her
Presumably they share the surface structure in Fig. 16. However, it’s plausible that “know” has different relations to its surface object in the corresponding deep representations. In particular, the sense of 13a is: John married somebody but we don’t know who John married, which in turn has the structure: John married somebody but we don’t know (John married wh + someone). The interrogative pronoun in 13a is thus, abstractly, not the object of “know” but the object of “married”. Whereas it’s plausible that “her” in 13b is the direct object of “know” at all relevant levels of representation.32 If this is correct, then a test for shifts should distinguish 13a from 13b; in particular, it should exhibit the know-who relation as shifted and the know-her relation as unshifted.
Figure 16. Surface structure for ‘sluicing’ sentences; e.g., “John wants to marry someone, but we don’t know ...”. [Tree diagram not reproduced]
32We are assuming both that the subordinated S is pruned in the derivation of 13b and that “who” ends up inside the matrix VP, yielding 13a and b as true minimal pairs. However, if the interrogative element is inside the VP in 13a, we would expect such passives as “John married somebody, but who isn’t known (by us)“, which strikes us as marginal. We are, to this extent, uncertain about the proposed analysis.
Phase I, Comparison 4: ‘there-quantifier’ versus ‘there/where-adverb’
Consider the sentence triplet 14a, b, c.
14a There is a man I want you to meet
14b There is the man I want you to meet
14c Where is the man you want me to meet?
Once again, it seems plausible that the surface structures are congruent, and once again it appears that the logical structures are quite different. In particular, 14a is the sort of formula that logicians represent with an existential quantifier and bound variables; something along the lines of 15.
15 ∃x (x is a man and I want you to meet x)
Whereas “there” in 14b is most naturally read as a locative adverbial. 14c is like 14b except that the locative adverbial is interrogative; (“where” = “wh + somewhere”). According to standard treatments, quantifiers are transformational constants; that is, they are introduced by insertion rules and are not elements of abstract representation. Whereas locative adverbs presumably are available to semantic structures. If these assumptions are right, then the abstract representation of 14a is something like 16, whereas plausible abstract representations for 14b and c are 17 and 18 respectively.
16 A man (I want you to meet a man) is
17 The man (I want you to meet the man) is there
18 The man (I want you to meet the man) is wh + somewhere
Notice that this means that the expressions “there” and “man” are grammatically related in the abstract representations of 14b and 14c (viz. as subject and predicate);33 whereas “there” and “man” are not so related in the abstract representation of 14a; “there” has no source in that representation at all. A valid test for shifting should therefore detect a shift in the relation ‘there-man’ in 14a, whereas ‘there-man’ should be unshifted in 14b and ‘where-man’ should be unshifted in 14c.34
33Strictly speaking, they form possibly proper parts - typically heads - of grammatically related constituents. We won’t bother observing this distinction in what follows since it turns out to be irrelevant to the empirical outcomes; subjects are apparently willing to regard heads of constituents as equivalent to constituents for purposes of the experimental task.
34It’s possible to read “there” as an adverb in 14a (viz. “There is a man I want you to meet”) and it’s also possible to read “there” as a quantifier in 14b (“There is the man I want you to meet”). Neither reading is preferred. Subjects who choose them would provide “spurious” disconfirming data since the experimental prediction is that results for the two types of sentence are asymmetric.
There are two possible objections to the test materials thus far discussed. One might claim that the analyses are wrong (hence that a test which produced the predicted asymmetries would not be detecting shift but something else); or one might claim that the abstract representations proposed are not semantic (hence that a test might distinguish all these cases and still not be sensitive to the kind of representations in which definitions occur). We need to expand briefly on both these worries.
It’s conceivable that the assumption that surface structures are shared in each of the pair-types just enumerated is wrong. In the case of expect-persuade, in particular, Chomsky has argued that there is a surface difference; in fact, that Figs. 11 and 12 would be appropriate as surface representations for 11a and b respectively. If this is correct, then sentences containing ‘expect-verbs’ are not shifters and a test could distinguish 11a from 11b even if it were insensitive to semantic relations; viz. by being sensitive (solely) to surface relations. What’s involved here is the status of transformations which raise NPs, and we don’t propose to commit ourselves on the issue which is, as it turns out, extremely complicated. Suffice it, for the moment, to make the following three remarks: (a) it’s extremely unlikely that this sort of problem infects all of the four cases; (b) even if 11a and b differ in surface structure, they surely also differ in semantic representation, so that a test which distinguishes them might be sensitive to surface structure, or semantic structure, or both; (c) generative semanticists form a substantial sub-population of linguists committed to definitions, and they do accept the traditional (raising) analysis of ‘expect-verbs’ which, indeed, they take as supplying a precedent for the operation of predicate raising (the latter occurs essentially in the generative semantic treatment of causatives).
What emerges is that we need to show that our test vehicle is sensitive at least to semantic relations, whatever else it may respond to. We’ll return to this presently. Some theorists (‘interpretive’ semanticists) argue that there is a principled distinction between semantic representations and deep syntactic ones. It’s open to such a theorist to claim that all the abstract representations thus far discussed are merely syntactic, hence that a test could be sensitive to shift in all these cases and still be insensitive to specifically semantic properties of sentences. Such a theorist could further claim that definitional analyses are displayed only at the semantic level (as opposed to the level of deep syntax). Hence, a test could be validated for all the types so far discussed and fail to show causatives to be shifters, even if the definitional analysis were true. (Notice that this line of argument is not available to a ‘generative’ semanticist, for whom the deep syntactic and semantic levels are identical.)
Once again, it appears that what’s needed to meet the objection is a demonstration that the test instrument is sensitive at least to semantic representations, even if it is sensitive to surface (and/or deep) syntactic relations as well. To meet this requirement, we have introduced two further types of validating materials. In both these cases it seems clear that there is a difference in semantic relations between superficially similar sentences, and that the difference turns crucially upon the meaning of one of the constituent words: just as the difference in semantic structure between, e.g., “John killed Mary” and “John bit Mary” is supposed to turn crucially upon differences in the meanings of “killed” and “bit”. Here again, though the logical differences seem patent, and though they turn precisely upon the meanings of lexical items, they do not involve definitional decompositions.
Phase I, Comparison 5: negative quantifiers
Consider the sentence pair 19a, b.
19a All of the men left.
19b None of the men left.
It seems plausible to treat such sentences as syntactic minimal pairs, but they differ crucially in their semantic properties in virtue of the meanings of “all” and “none”. In particular, in 19a, but not in 19b, the property left is attributed to the men. A test which is sensitive to semantic relations ought, therefore, to distinguish 19a from 19b in respect of the expression “men left”; conversely, it’s hard to believe that a test which does make that distinction could nevertheless be insensitive to semantic relations.
Phase I, Comparison 6: intensional verbs
Consider the sentence pairs 20a, b.
20a John {wanted / imagined / needed / etc.} an apple
20b John {ate / bit / had / etc.} an apple
It seems clear that the sentences differ in respect of whether a relation is asserted between John and an apple: all the sentences in 20a are intensional for the occurrence of “an apple”, whereas none of those in 20b are. In particular, you can ‘quantify in’ to the forms of 20b, yielding such entailments as “there is an apple that John ate”, but you can’t (validly) infer such existential conclusions from 20a. It’s accepted, therefore, that sentences like those in 20b express relations which implicate the surface subject and object, while those in 20a do not. The difference between intensional verbs and their relational counterparts is a paradigm of the sort of thing that is traditionally supposed to be captured at the semantic level. In all probability, it is not reducible to a structural difference, but is an intrinsic property of the verbs; rules of inference simply have to be ‘told’ whether a given verb is intensional. In any event, it seems exceedingly plausible that a test which distinguishes 20a from 20b is sensitive to semantic representations in any coherent sense of that notion.
Phase II: causative verbs
Assuming a test validated in Phase I, the next objective is to test for semantic relatedness between (e.g.) “John” and “Mary” in “John killed Mary”. (It will be recalled that, according to the definitional analysis, such sentences do not express semantic relations between the subject and object of the surface verb. On the contrary, “John killed Mary” expresses a relation between John and an event.) Equivalently, for our purposes, we want to use the test to determine whether surface subjects and objects of causative verbs are shifters. To do this, we must find minimal pair controls in which the surface verb does not shift (a fortiori is not causative) according to definitional treatments. In effect, this means finding verbs which can be plausibly viewed as expressing primitive relations even if it is assumed that some verbs (like causatives) are defined. Since nobody is very clear what the primitive verbs of English are, we used two rough tests. First, we chose surface transitive verbs for which we could not conjure up reasonable multiplex definitions; or where, if we could think of such a definition, it did not involve shifts of relations for the constituents which form the surface arguments of the verb. Second, we generally chose cases for which our dictionary (the complete Webster’s) gave synonyms rather than definitions. Finally, we ran versions of the experiment several times, varying both the selection of causatives and the selection of the (putative) primitive controls. As it turned out, these variations of the materials made no detectable difference to the outcome, suggesting that the results were not due to materials artifacts. (See the materials section of the Appendix.)
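The notion of a shifter used throughout Phases I and II can be sketched as a comparison of grammatical relations at two levels. The encoding below is a toy assumption made for illustration (relation triples, the `related` and `shifts` helpers, and the ‘cause ... die’ entry for “kill” are not the authors' formalism); the “kill” entry simply writes in the definitional analysis whose psychological reality is at issue.

```python
# Each analysis is a set of (relation, head, dependent) triples.
# These encodings are simplified, hypothetical stand-ins for the tree diagrams.
SURFACE = {
    "expect":   {("subject", "expect", "John"), ("object", "expect", "Mary")},
    "persuade": {("subject", "persuade", "John"), ("object", "persuade", "Mary")},
    "kill":     {("subject", "kill", "John"), ("object", "kill", "Mary")},
    "bite":     {("subject", "bite", "John"), ("object", "bite", "Mary")},
}
UNDERLYING = {
    "expect":   {("subject", "expect", "John"), ("subject", "leave", "Mary")},
    "persuade": {("subject", "persuade", "John"), ("object", "persuade", "Mary"),
                 ("subject", "leave", "Mary")},
    # Definitional analysis, assumed here only for the sake of the prediction:
    "kill":     {("subject", "cause", "John"), ("subject", "die", "Mary")},
    "bite":     {("subject", "bite", "John"), ("object", "bite", "Mary")},
}

def related(analysis, w1, w2):
    """True if w1 and w2 are arguments of at least one common head."""
    heads1 = {head for _, head, dep in analysis if dep == w1}
    heads2 = {head for _, head, dep in analysis if dep == w2}
    return bool(heads1 & heads2)

def shifts(verb, w1="John", w2="Mary"):
    """A pair shifts if its relatedness differs between the two levels."""
    return related(SURFACE[verb], w1, w2) != related(UNDERLYING[verb], w1, w2)

for verb in ("expect", "persuade", "kill", "bite"):
    print(verb, shifts(verb))
```

On this encoding “expect” and (definitionally analyzed) “kill” come out as shifters while “persuade” and “bite” do not, which is the pattern a validated test should detect if the definitional account were true.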
Paradigmatically, then, Phase II consists of comparing sentences like 21a and b.
21a John killed Mary
21b John bit Mary
Assuming the validity of the test instrument, “John” and “Mary” should be shifted in 21a and unshifted in 21b if, but also only if, the definitional account of causatives is true. We turn now to a description of the test instrument, pausing only to remark that the Phase I stimulus types we’ve discussed include literally all the kinds of constructions we have been able to think of which might plausibly be relevant to assessing the construct validity of a test of the definitional analysis of causatives. (We regard this work - indeed this entire paper - as exploratory, and we should be glad to hear from readers who think of other constructions that it might be useful to examine.)
II.b. Methods
Whatever else definitions are supposed to be, they must be linguistic constructs in good standing if TSP is true. It would therefore obviously be desirable to test the psychological effects of definitions by using an instrument that is relevantly similar to the manner in which linguists gather the primary data which control their theories: viz., the elicitation of intuitions about sentence acceptability and sentence structure. The least that could be said for such procedures is that, unlike chronometric measures, they place no explicit time constraints upon a subject and are correspondingly unlikely to tempt him to “heuristic short-cuts”. We stress this in light of the suspicions Katz and others have voiced about the use of “on line” tests in validating semantic theories. If the subject’s considered intuitions about sentence structure aren’t relevant to the confirmation of claims about the mental representations which mediate sentence comprehension, then most of the results in linguistics must be similarly beside the point.
Psycholinguists have, by and large, avoided considering intuitional data almost as single-mindedly as linguists have avoided chronometrics. Among the exceptions, however, is an important paper by Levelt (1970). Levelt presented his subjects with sentences and lists of word pairs. The word pairs consisted of the lexical items of the sentence taken regardless of order. So, a Levelt-stimulus might be the sentence “John went to the store” together with the list of word pairs “John went; store went; John to; the store; etc.”. S’s task was to assign numbers to each of the pairs indicating his judgment of the relative degree of relatedness of the pair in the sentence. Thus, if it’s S’s intuition that the pair “the store” is more intimately related
than any of the others, he assigns it number 1. If it’s his intuition that the pair “John went” is relatively less related, he assigns it a correspondingly lower rank, etc. Levelt subjected these scaling data to a statistical procedure called “Hierarchical Clustering Analysis”. HCA, in effect, uses S’s rankings to construct an analysis tree according to the principle: pairs with the closest rated relationship are dominated by the lowest nodes, and so on up. The result of primary interest to Levelt was that the trees which emerged from his data were in many respects congruent to linguistic surface structures. Since S’s intuitions apparently respect the grouping of words into surface constituents, the scaling procedure could be viewed as providing a demonstration of the ‘psychological reality’ of derived trees: the general features of surface constituency are not the products of a sophisticated linguistic sensitivity tutored in some particular theoretical tradition, but are rather the consequence of structure which guides even the naive contemplation of sentences.
However, Levelt also reports a further result that is closer to our present concerns. If S was presented a sentence like “John drove to the store and walked home”, he typically indicated the same level of intuitive relatedness for the pairs “John drove” and “John walked”. These pairs are not, of course, similarly represented in a standard surface tree. Rather, the subject appears to be responding to some more abstract (syntactic and/or semantic) relation of the kind that is represented at deeper levels in generative grammars. What determines such intuitions is either that “John” is the deep subject of “walk” or that “John” is the agent of “walk” (or both, assuming that these facts are indeed distinct).
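The tree-building principle just described (most closely related pairs are joined lowest) can be sketched as a small agglomerative clustering routine. This is a minimal illustration under stated assumptions, not Levelt's actual procedure: the relatedness scores are invented, higher scores are taken to mean closer relatedness, and clusters are compared by their single most related pair of members (a single-link rule chosen here for simplicity).

```python
from itertools import combinations

def build_tree(words, relatedness):
    """Agglomerative clustering over pairwise relatedness scores.

    `relatedness` maps frozenset({w1, w2}) to a score (higher = more related).
    Returns the analysis tree as nested tuples.
    """
    clusters = [(w, frozenset([w])) for w in words]  # (subtree, member words)

    def link(a, b):
        # Single-link assumption: two clusters are as related as their
        # most closely related pair of members.
        return max(relatedness[frozenset((x, y))] for x in a for y in b)

    while len(clusters) > 1:
        i, j = max(combinations(range(len(clusters)), 2),
                   key=lambda ij: link(clusters[ij[0]][1], clusters[ij[1]][1]))
        (tree_i, members_i), (tree_j, members_j) = clusters[i], clusters[j]
        merged = ((tree_i, tree_j), members_i | members_j)
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters[0][0]

# Hypothetical scores for "John went to the store" (not data from the study).
scores = [("the", "store", 10), ("to", "store", 8), ("to", "the", 7),
          ("John", "went", 6), ("went", "to", 5), ("went", "store", 4),
          ("went", "the", 3), ("John", "store", 2), ("John", "to", 1),
          ("John", "the", 0)]
relatedness = {frozenset((a, b)): s for a, b, s in scores}

tree = build_tree(["John", "went", "to", "the", "store"], relatedness)
print(tree)
```

With these invented scores the lowest node joins “the” and “store”, the next joins “to” to that constituent, and “John went” forms its own cluster, which is the kind of grouping the congruence with surface structure refers to.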
This suggests, in turn, that given a pair of sentences which have the same surface structures, but which differ in the abstract relations among their constituents, S’s scalings might well reflect the distribution of underlying relations. This is precisely what is required of a test for shifting. So, for example, if “expect” is a shift verb in “John expected Mary to leave”, and if “persuade” is not a shift verb in “John persuaded Mary to leave”, and if the Levelt procedure is sensitive to the underlying relations in these sentences, then we might reasonably predict subjects to scale “John” and “Mary” as more closely related in the ‘persuade’ sentences than in the ‘expect’ sentences. Precisely similar predictions apply in the case of all the other sentence types compared in Phase I. Moreover, if the test is demonstrably sensitive to shifting (as would be witnessed by the success of the Phase I predictions), then we ought also to expect that “John” and “Mary” will be scaled as more closely related in “John bit Mary” than in “John killed Mary” unless the definitional account of the causatives is false. If it is indeed false, then we should find no systematic difference for the “kill/bite” comparison.
These ratings procedures also permit a slightly more refined way of looking at the validating materials than we have thus far described. Consider a sentence pair like 22 (taken from actual materials of the experiment).
22 Even though a bad storm was predicted, the captain {expected / persuaded} the passengers to remain calm.
Not all the relations between constituents are affected by the difference in abstract structures between the two versions. For example, such pairs as “storm/predicted” are not implicated in this difference. If we choose such pairs throughout the range of Phase I comparisons, and if the Levelt test is, indeed, affected by shift, we can predict that the differences between the ratings for pairs like “storm/predicted” (referred to as “control pairs” hereafter) should be systematically smaller than those for pairs of words whose relations differ in the two versions (e.g., pairs like “captain/passengers” in 22, referred to as “experimental pairs” hereafter).
Our initial experimentation on Phase I and Phase II sentences was conducted with a (slightly modified) version of the Levelt paradigm (see Appendix, Ratings task). In the final set of experiments the procedure was further revised; a simpler format was used since, for our purposes, we do not require data for most of the possible comparisons of word pairs from the stimulus sentences (see Appendix, Forced Choice Task). In this latter procedure a subject is presented with both versions of a sentence, with a pair of words underlined in each (italicized words in this text were underlined in the stimulus sentences). Typically, it’s the same pair of words in both versions except for comparisons like ‘there-quantifier’ versus ‘where-interrogative adverb’. So, for example, the stimulus for 22 would be as follows:
22a Even though a bad storm was predicted, the captain expected the passengers to remain calm.
22b Even though a bad storm was predicted, the captain persuaded the passengers to remain calm.
The subject is forced to choose whether the designated pairs are more closely related in sentence A or sentence B. Judgments of no-difference are not allowed, but the subject is asked to indicate his degree of confidence in the judgment on a five-point scale. Most sentence types were tested with both forced choice and ratings procedures.
II.c. Results
The experimental findings are summarized in Table 1 and reported in detail in the Appendix. For present purposes, the outcomes can be stated quite briefly.
a) There are few differences between the pattern of results for the ratings procedures and that for the forced choice paradigm. Such differences as did appear are of some methodological interest, however, and are discussed in the Appendix.
b) The Phase I predictions are consistently confirmed.35 In Phase I comparison sets 1-4 in both tasks, shifted constituents are judged to be less related than their unshifted counterparts. In Phase I, comparison sets 5 and 6, forced choice task only, semantically unrelated constituents (subjects and predicates in the scope of negative quantifiers and the subjects and objects of intensional verbs) are judged to be less related than the corresponding constituents of their paired sentences (subjects and predicates in the scope of positive quantifiers, and the subjects and objects of non-intensional verbs). See Table 1 (Table 1 includes distinctions among sub-types not discussed here; see Appendix).
c) Across sentence types, the control pairs exhibit smaller differences in judged relatedness than the experimental pairs. That is, if a pair of constituents is not affected by the structural asymmetries between the two versions of a sentence in a Phase I comparison, then the difference between subjects’ responses for that pair of constituents is, in general, less than the corresponding difference for constituents which are implicated in the structural asymmetries.
d) In no case is there a detectable asymmetry between causative verbs and their (putatively) primitive counterparts. This finding of no difference holds in both test procedures (see Table 1, comparison set 5 for the ratings and comparison set 7 for the forced choice task). Variation in the choice of verbs used in the comparison sets for causatives did not affect the outcome.
Moreover, we stress, it is not the case that the failure of the causative predictions is a “statistical” one, e.g., smaller effects that just fail to be significant, perhaps because of a greater response variability. On the contrary, the responses to the causatives were quite consistent across subjects and items, more so perhaps than some of the validating cases. There were no meaningful trends even when the power of the tests was increased by collapsing across different stimulus sets. There is, in short, no hint in these data that “kill, break” etc. are shift verbs, or, put more generally, there is no indication that the intuitive relatedness between “John” and “Mary” in “John killed Mary” differs measurably from that between “John” and “Mary” in “John bit Mary”.
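The comparison between experimental and control pairs reported in (c) amounts to computing, for each word pair, the difference between its judged relatedness in the two sentence versions (B - A). The following is a minimal sketch with invented ratings; the numbers, the rating scale (higher = more closely related) and the helper name `b_minus_a` are assumptions for illustration and are not the study's data or scoring code.

```python
from statistics import mean

# Hypothetical ratings from five subjects for sentence 22, version A
# ("expected") and version B ("persuaded"). Higher = more closely related.
ratings = {
    ("captain", "passengers"): {"A": [3, 2, 3, 4, 3], "B": [5, 6, 5, 6, 6]},
    ("storm", "predicted"):    {"A": [6, 5, 6, 6, 5], "B": [6, 6, 5, 6, 5]},
}

def b_minus_a(pair):
    """Mean relatedness in version B minus mean relatedness in version A."""
    scores = ratings[pair]
    return mean(scores["B"]) - mean(scores["A"])

experimental = b_minus_a(("captain", "passengers"))  # relation differs across versions
control = b_minus_a(("storm", "predicted"))          # relation is the same in both

print(f"experimental pair difference: {experimental:.2f}")
print(f"control pair difference:      {control:.2f}")
```

If the test is sensitive to shift, the experimental difference should be reliably larger in magnitude than the control difference, as it is in this constructed example; a significance test over items and subjects would be needed for real data.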
35. Note: results for comparison set 6, negative quantifiers, are preliminary only; the full range of test comparisons used in other sentence types for Phase I has not been completed.
Table 1. Summary of results of Ratings and Forced Choice procedures for evaluation of syntactic and semantic relations

Column groups:
Differences between scores for sentence versions, experimental pairs (B-A): Phase I (Items, Ss); Phase II (Items, Ss)
Differences between scores for control pairs and experimental pairs: Phase I (Items, Ss); Phase II (Items, Ss)

Materials, Ratings task:
Set 1 (“expect-persuade”)
Set 2 (“eager-easy”)
Set 3 (“sluicing”)
Set 4A (existential “there”)
Set 4B (existential “there”)
Set 5A (causatives, marked)
Set 5B (causatives, unmarked)
Set 5C (causatives, mixed)

Materials, Forced choice task (each set scored for Forced choice and for Confidence):
Set 1 (“expect-persuade”)
Set 2 (“eager-easy”)
Set 3 (“sluicing”)
Set 4 (existential “there”)
Set 5 (quantifiers)
Set 6 (intensional verbs)
Set 7 (causatives)

[Table body: each cell is * or ns; the assignment of entries to rows and columns is not recoverable from this copy.]
*p < 0.05; t, one-tailed
II.d. Discussion
a) It appears from the Phase I results that our version of Levelt’s paradigm and its forced choice variant are sensitive (at least) to patterns of semantic relatedness among sentential constituents. At the very least, the findings offer a striking demonstration of effects of linguistic levels of representation distinct from surface constituent structures.
b) The Phase II results clearly indicated that “kill”, “break”, etc. are deep simplex verbs if any verbs are; viz. that the semantic representation of “John killed Mary” is something like “John killed Mary”.
c) This indicates, in turn, that causative verbs are undefined; psychological reality apparently cannot be claimed for the definitional structures that have been widely alleged to underlie such verbs.
d) Since causatives seem to be ‘best cases’ for definition, the results suggest that there may be few or no cases of psychologically real definitions.
e) Since the present results are quite compatible with those of Kintsch (op. cit.), Fodor, Fodor and Garrett (op. cit.), etc., it appears that previous negative findings on the psychological reality of definitions are quite probably not artifacts of the use of chronometric measures.

Part III. TSP revised
What does all this show? We review the situation in respect of each of the four aspects of TSP discussed in Part I.
III.a. Language and the world
TSP never did provide a plausible theory of the relation between terms and their extensions. It still doesn’t. As we saw in I.a, the appeal to definitions would provide for such a theory only modulo an account of the interpretation of the primitive basis. Only the Empiricist version of TSP does offer a reconstruction of the relation between primitive terms and their extensions, and it seems quite certain that the Empiricist version of TSP is indefensible. Even if there are definitions, it is wildly unlikely that they can be couched in a vocabulary of sensory/motor terms in any important number of cases. This leaves us without a theory of language and the world. The best current hope for such a theory is perhaps to accept that aspect of the Empiricist treatment of primitive terms which claims that the relation between words and their extensions is somehow mediated by causal chains, but to abandon the condition that the relevant chains are exhaustively specifiable by reference to the behavior of sensory/motor mechanisms. (For contemporary discussions of ‘causal theories’ of reference, see Schwartz (1977).) What is left is thus the very weak suggestion that the relation between, say, “Chicago” and Chicago, in virtue of which tokens of the one refer to the other, involves some sort of causal connection between the tokens and the city. This kind of view seems reasonably plausible for names, intriguing but underwhelming for some kinds of descriptions, and only possibly defensible for kind terms. It clearly has deep troubles with abstract reference, reference to fictions and the like. Nor will a psychologist find it really satisfying even where it works best. What a psychologist wants to understand is what kind of causal chains fix extensions, and what the nomologically necessary and sufficient conditions for the existence of such chains are. About these questions, nothing worth reporting is known.36
To summarize: psychologists have wanted very much to have a theory of language and the world. Many of them have thought that appeals to definitions contribute substantially to the development of such a theory, but that was largely - perhaps solely - an Empiricist illusion. There is, as things now stand, no theory of language and the world and it seems most unlikely that one will be forthcoming in the foreseeable future. A methodological principle first enunciated by the philosopher Frank Ramsey applies here: what can’t be said can’t be said, and it can’t be whistled either.

III.b. Informally valid arguments
In I.b we saw reason to believe that at least some informally valid inferences are inherently asymmetric; hence that meaning postulates will have to play a role in theories of informal validity even if definitions are endorsed. There is, however, a deeper point to be made in the light of such results as those in Part II. Definitional theories of informal validity start out as attempts to break down the distinction between the logical and non-logical vocabularies. Paradoxically, however, they end up by exalting it. For, according to such accounts, the difference between a form of argument like, say, P → not (not P) and a form of argument like bachelor → unmarried actually implicates a difference of linguistic levels; whereas the validity of the former turns on the
36. Though many false accounts are widely believed. For example, the kindest way of thinking about the Skinnerian account of language is perhaps to view it as an attempt to provide a model of just such causal connections. “Chicago” refers to Chicago because the latter is a discriminative stimulus for the production of tokens of the former; the laws of operant conditioning determine when, in general, a given discriminative stimulus controls a given discriminated response. This would appear to be the right kind of story to flesh out a causal theory of reference; all we have against it is its palpable untruth.
logical apparatus, the validity of the latter is determined by relations (e.g., of ‘containment’; see Katz (1972)) among semantic representations. No wonder theorists committed to definitions have claimed a basic intuitive distinction between ‘analytic’ (viz. definitional) truths and mere truths of logic. In fact, we doubt that the intuitions are actually there. Even if they are, however, the results of Part II suggest that they are not intuitions of relations specified over definitional representations. If Part II is right, subjects don’t compute definitional relations in situations where linguistic intuitions are elicited; not even when the intuitions elicited implicate semantic properties of the stimulus. If intuitions of informal validity aren’t intuitions of definitional relations, what are they intuitions of? There is a plethora of possibilities, all about equally plausible and all about equally unattractive. We mention a few by way of a shopping list.
1. Intuitions of informal validity are just reports of empirical beliefs. This view has the virtue of compatibility with a post-Quineian epistemology. It explains why we seem to be able to imagine rejecting putative informally valid arguments, given suitably bizarre contingencies. (Cats are animals is supposed to be informally valid; but suppose cats turned out to be robots manipulated by Martians; suppose they turned out to have a silicon-based biochemistry; etc.)
2. Intuitions of informal validity are not just reports of empirical beliefs; they’re intuitions of deductive relations determined by the logical apparatus. On this story, there will have to be standard logical rules and meaning postulates, and the distinction between the logic (which contains both) and the body of empirical generalizations (which contains neither) will have to be principled.
The putative counter-examples to informal validities will have to be explained away somehow (presumably by appeal to notions like change of meaning; to discover that cats are Martian robots would be to discover that there are no cats. To claim that cats are Martian robots would be implicitly to recommend redefining “cat”).
3. The distinction between empirical generalizations and informally valid ones is principled, and so is the distinction between informal validity and formal validity. This might be the case if, for example, the distinction between meaning postulates and standard logical rules is itself principled. Then informally valid arguments might be ones which involve only the meaning postulates (or only the meaning postulates together with some designated subset of the logical rules). This is apparently the view that Carnap held; but see Quine (1963).
4. The distinction between informally valid arguments and analytic arguments is also principled. Analytic arguments might, for example, be the ones which implicate no more than precisely n of the meaning postulates under some canonical formalization.
5. The distinction between informally valid arguments and analytic arguments is not principled; there are degrees of analyticity with very analytic arguments corresponding to very short routes through the meaning postulates (and, perhaps, designated logical rules as per 3 above). Etc....
The reader who finds himself not much caring which, if any, of 1-5 is true has all our sympathy. There is, however, one point we want to emphasize: what all the non-definitional approaches to informal validity have in common is that they assume that the domain for the logical apparatus (including meaning postulates) is the output of the syntax; there is no semantic level (no level of logical form) except what may be required for the representation of such relations as quantifier binding, operator scope, etc. In particular, there is no logical form inside lexical items. It seems to us that the weight of the current evidence is that this latter claim is plausible. If we had to bet, we’d bet on the following story and we’d stick to small sums:
a) The logical apparatus is defined over representations of logical form in something like the traditional sense (scope, binding, etc. are formally specified in the domain of the logical rules).
b) The logical apparatus contains standard rules and meaning postulates indifferently.
c) There is no semantic level in the sense of ‘linguistic semantics’; the logical apparatus has access to the surface morphological inventory of the language.
This picture comports nicely with the results of Part II; it’s compatible with the notion that logical form is determined solely or in large part by surface structure; it permits ambiguities of quantifier order (and other phenomena of traditional logical syntax) to be psychologically real; it provides room for a principled notion of informal validity in case somebody should happen to find a use for one; there appears to be no solid a priori or a posteriori reason for supposing that it is false. We are available for small wagers.

III.c. Sentence comprehension without definitions
Understanding a sentence is recovering a representation that provides a domain for relevant inferential processes. If there are no definitions, then understanding a sentence is recovering its logical form. If there are no logical
forms, then understanding a sentence is recovering its syntactic structural descriptions. There must be syntactic structural descriptions; the ambiguity arguments prove it. In short, you won’t be far wrong, on the present view, if you think of a sentence comprehension system as a function from tokens to types. Here are two possible objections:
1. How could understanding a sentence be recovering a type-individuating representation of that sentence? Such representations are just formulae in some other (e.g., meta-) language. Answer: we doubt that this objection buys much in this context.37 What’s certain is that it buys nothing in aid of the definitional account: DEFINITIONS ARE ALSO JUST REPRESENTATIONS IN SOME OTHER LANGUAGE! The disagreement over the psychological reality of definitions is a dispute within versions of the representational theory of mind.
2. Understanding is a graded notion; different performances count as understanding depending on the circumstances; understanding can’t be formally defined. Answer: if this is an argument at all, it’s an argument against both definitional and non-definitional accounts. Both claim that there is a level of representation whose recovery is constitutive of (or at least necessary for) sentence comprehension. They disagree only about which level it is. No doubt the ordinary notion of understanding is graded for all that. This is primarily because nobody (except academics) is ordinarily interested in understanding sentences; what we ordinarily want is to understand what people say and what they meant by saying it, and it’s perfectly clear that all sorts of contextual, background and inferential apparatus is brought to bear in this latter undertaking. This is not, however, an argument for studying understanding what people say and what they mean instead of studying understanding sentences.
On the contrary, you can’t do the former without doing the latter, since it’s patent that the computational apparatus involved in understanding sentences is normally used in understanding people. That’s why it is, in general, easier to understand somebody who’s talking a language you know than to understand somebody who’s talking a language you don’t. A theory of understanding sentences is thus part of a theory of understanding people and, for all we now know, it may be the only part that’s sufficiently systematic to reward specifically scientific scrutiny.
37. What it does is force a distinction between theories of sentence comprehension and semantic theories. Since the failure to grasp this distinction is epidemic among procedural semanticists, this is no small matter. See J. A. Fodor (1978).
Briefly: if the ordinary notion of understanding is graded, so much the worse for the ordinary notion of understanding. We don’t make physics out of the ordinary notion of energy.

III.d. The innateness controversy
Whatever is not definable must be innate. This is, however, weaker than: whatever is not internally represented by its definition must be innate. For example, it may be that while adults represent “kill” as kill, children learn “kill” as cause to die. After a while, one might imagine, cause to die consolidates and kill comes to act as a derived primitive. Derived primitives are representations which (a) have no computationally relevant internal structure, but (b) are introduced into the representational system by adding eliminative bi-conditionals to the logic. Rules which introduce derived primitives are, as it were, the diachronic equivalents of definitions. This suggests (what we believe to be correct) that the case for a rich, innate primitive conceptual system can’t be made just by demonstrating the psychological unreality of definitions in adults. The psychological reality of definitions in the adult provides a sufficient, but not a necessary, condition for the analyzability of concepts. If you want to show that a concept which is psychologically unanalyzed for the adult is, nevertheless, only a derived primitive (hence definable, hence presumably, not innate) there are at least three things you can try.
1. Show that the concept is, in principle, analyzable. The existence of a possible analysis is prima facie evidence that the concept actually is analyzed somewhere in ontogeny. This card is not, however, easy to play; there are, as we have several times remarked, very few examples of plausible definitions.
2. Show that the concept is internally complex for the child; e.g., show that the child represents “kill” as cause to die.
We think that the developmental literature which purports to demonstrate that the child’s concepts are typically learned by assembling complex arrays of primitives (e.g., of semantic features) is thus far unpersuasive; if one approaches the data without Empiricist preconceptions, the striking fact is the lack of evidence for ‘bottom up’ processes in concept acquisition. We won’t argue this here, however; our present concern is just to acknowledge the relevance of such data to the sorts of issues we have raised. 3. Show that the concept is expressed by a phrase (rather than a morphemically simple expression) in some natural language or other. We’ve argued that morphemically simple expressions are typically undefined, that undefined expressions typically express primitive concepts;
and that primitive concepts must be innate. The presumption that a concept expressed by a morpheme is primitive cannot, however, be right if there are actually languages in which that same concept is expressed by a phrase. For (a) if a concept can be expressed by a phrase, then it is ipso facto definable; and (b) if a concept is in fact primitive (hence innate) for any human, it must surely be primitive (hence innate) for all humans. It would thus be extremely interesting to know how much different languages agree as to which concepts are expressed by morphemically simple expressions. Given, however, the notorious difficulty of making sense out of the translation relation, we aren’t likely to find out by, say, next week. All this should suggest - what is clearly true - that if you don’t like Crumpet being innate, you still have plenty of room to wriggle. Dismantling TSP, if it is to be done at all, will surely be the work of generations, just as constructing it was. Since, however, TSP has been so widely endorsed, and since even the possibility that it is deeply wrong opens such startling vistas of speculation, it may be a good idea to end with the following considerations:
1. TSP has never worked. The appeal to definitions has been central to projects ranging from the theory of visual perception to axiomatic ethics; from linguistic semantics to the operational analysis of theoretical terms in science; from theories about how children might learn concepts to theories about how computers might understand newspapers. In each case, the underlying assumption has been that the primitive conceptual repertoire cannot be as rich as the available repertoire of categories; hence that many concepts must be analyzable. These assumptions have governed research in the Anglo-American tradition for some three hundred years; almost, in our view, totally without success.
The definitions and analyses haven’t been forthcoming and there is no prospect that they will turn up in the foreseeable future. Perhaps the world is trying to tell us something. Perhaps there is something wrong with our assumptions. 2. If we are finally forced to the view that people have a rich endowment of innate (e.g. triggered) concepts, that ought not to outrage intuition all that much; it would only be to accept for us a kind of doctrine that we take to be quite plausible for most of the rest of animate creation. 3. The true theory doesn’t have to be boring. The theoretical reach of physics stretches to embrace the possibility of worlds in which the time arrow points backwards. Surely a little nativism ought not be more than psychologists can bear.
References
Adams, Robert M. (1975) Where do our ideas come from? In Stich (ed.), Innate Ideas. University of California Press.
Block, Ned, in N. J. Block (ed.), Readings in Philosophy of Psychology, Vol. 2. Harvard Univ. Press, Cambridge, Mass.
Carnap, R. (1947) Meaning and Necessity. University of Chicago Press, Chicago.
Clark, E. and Clark, H. (1977) Psychology and Language: An Introduction to Psycholinguistics. Harcourt Brace Jovanovich, New York.
Chomsky, N. (1972) Studies on Semantics in Generative Grammar. Mouton, The Hague.
Chomsky, N. (1975) Reflections on Language. Pantheon Books, New York.
Dowty, D. (1976) Montague grammar and the lexical decomposition of causative verbs. In B. Partee (ed.), Montague Grammar. Academic Press, New York.
Davidson, D. (1970) Mental events. In Forster, L. and Swanson, J. (eds.), Experience and Theory. University of Massachusetts, Amherst.
Fodor, J. A. (1975) The Language of Thought. Crowell, New York.
Fodor, J. A. (1978) Tom Swift and his procedural grandmother. Cog., 6, 229-247.
Fodor, J. A., Bever, T. G. and Garrett, M. F. (1974) The Psychology of Language. McGraw-Hill, New York.
Fodor, J. A. and Fodor, J. D. (forthcoming) Language, Mind and Communication. Harvard University Press, Cambridge, Mass.
Fodor, J. D. (1977) Semantics: Theories of Meaning in Generative Grammar. Crowell, New York.
Fodor, J. D., Fodor, J. A. and Garrett, M. F. (1975) The psychological unreality of semantic representations. Linguistic Inquiry, VI, 515-531.
Geach, P. (1957) Mental Acts. Routledge & Kegan Paul, London.
Katz, J. J. (1972) Semantic Theory. Harper, New York.
Katz, J. J. (1977) The real status of semantic representations. Linguistic Inquiry, VIII, 559-584.
Katz, J. J. (1975) in Gunderson (ed.), Minnesota Studies in the Philosophy of Science. University of Minnesota Press, Minneapolis, Minnesota.
Katz, J. J. and Fodor, J. A. (1963) The Structure of a Semantic Theory. Language, 39, 176-210.
Kintsch, W. (1974) The Representation of Meaning in Memory. Wiley, New York.
Lakoff, G. (1970) Linguistics and Natural Logic. Studies in Generative Semantics, No. 1, Phonetics Laboratory, University of Michigan. Also Synthese (1970) 22, 151-271.
Levelt, W. M. (1970) A scaling approach to the study of syntactic relations. In G. B. Flores d’Arcais and W. J. M. Levelt (eds.), Advances in Psycholinguistics. North Holland, Amsterdam.
Levelt, W. M. (1974) Formal Grammars in Linguistics and Psycholinguistics. Mouton, The Hague.
McCawley, J. D. (1971) Prelexical Syntax. In R. S. O’Brien (ed.), Monograph Series on Language and Linguistics, No. 24, Georgetown University, Washington, D.C.
Miller, G. A. and Johnson-Laird, P. N. (1976) Perception and Language. Belknap, Cambridge, Mass.
Putnam, H. (1975) in Gunderson (ed.), Minnesota Studies in the Philosophy of Science. University of Minnesota Press, Minneapolis, Minnesota.
Quine, W. V. D. (1963) From a Logical Point of View. Harper and Row, New York.
Schwartz, S. (1977) Introduction. In B. Schwartz (ed.), Naming, Necessity, and Natural Kind. Cornell University Press, Ithaca, N.Y.
Schank, R. (1975) Conceptual Information Processing. North Holland, Amsterdam.
Walker, E. (1976) Some grammatical relations among words. In Wales, R. and Walker, E. (eds.), New Approaches to Language Mechanisms. North Holland, Amsterdam.
This article deals with definitional analyses of the structure of language. Several classes of arguments bearing on definitions are reviewed, among them those connected with classical theories of reference, theories of informal validity, theories of sentence comprehension and theories of concept learning. It is suggested that, in each of these domains, work that relies on definitions is no better supported by the evidence than a plausible non-definitional alternative. In addition, a series of experimental observations is presented bearing on one of these domains: sentence comprehension. The class of causative verbs, a class often cited as an example of definitional structure, is studied. This class of examples does not influence the judgments subjects make about those relations among the words of causative sentences that depend on the suggested definitional structures. Independently, it is shown that subjects’ judgments are sensitive to structural relations of a comparable type in linguistic forms.
Part IV: Appendix

Introduction
Levelt (1970, 1974) has shown that speakers’ ratings of the relatedness among the words of a sentence systematically reflect aspects of its surface and underlying syntactic structure. The experimental work we have discussed uses variants of Levelt’s procedures for tests of both syntactic and semantic relations. The efficacy of these procedures was assessed by measures of a number of relatively uncontroversial structural and semantic distinctions and they were then applied to sentence types whose analysis is at issue in the dispute over lexical decomposition. The general implications of these experiments were discussed at length in Parts I-III of this paper; in this section, we wish to make clear the nature of the test materials and test procedures we have used, and to describe the analyses made, with details of the results obtained. We carried out two related types of experimental procedures; in each, native speakers of English were queried for their judgments of the relations among words in a variety of types of sentences. Both tests employed very similar sets of sentences, and in both tests the criterion for subjects’ responses was their estimate of the degree of relatedness holding between a particular pair of words taken from a particular sentence or pair of sentences. In the first type of test, the Ratings task, subjects were asked to rate on a five point scale how related a particular pair of words was in a given test sentence; in the second type, the Forced choice task, the subjects were forced to choose in which of two sentences a pair of words seemed more related. The ratings tests were used primarily to examine the effect on perceived relatedness of two words being in the same simple underlying clause. The materials labelled “Stimulus Sentences Used in the Ratings Task” test the sensitivity of ratings of relatedness to the four structural contrasts detailed in Comparison Sets 1-4 (see discussion of Phase I in Part II).
In the ratings task, the materials in Comparison Set 5 test the putative contrast in the underlying structures of causative verbs and simple transitive verbs (see discussion of Phase II in Part II of the main body of the paper). The forced choice tests were used both for (modified) replication of the ratings task and for examining a more general interpretation of the effective relation between test word pairs. The relations subsumed by common clause membership in the materials for the ratings task might be construed in either syntactic or semantic terms (see the discussion in Part II). Therefore, to the stimulus sentences used for the forced choice task, we added sentence types
which can plausibly be claimed to contrast only in the semantic relatedness of the critical word pair; these are in Forced-choice Comparison Sets 5 and 6. For these sentences, the syntactic structure of the underlying clause containing the test pair is assumed to be virtually identical in both members of a pair. In the Forced Choice materials, Comparison Sets 1-4 and 7, we also included sentence pairs adapted from the corresponding types in the Ratings task to permit informal comparison of the results of the two procedures. In both lists of materials, we have underlined and labelled the words involved in the pairs we tested. These same labels are used in the discussion of results and in the Tables as indicated.
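For readers who want to see the shape of the comparison summarized in Table 1 (the difference between versions, B-A, evaluated by a t statistic), the following is a minimal sketch only. The ratings are invented for illustration, the function name `paired_t` is hypothetical, and the by-subjects pairing is an assumption; nothing here reproduces the actual data or analysis scripts.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(version_a, version_b):
    """Return (mean difference B-A, t statistic, degrees of freedom).

    Each list holds one relatedness rating per subject for the same
    word pair, in sentence version A and version B respectively.
    """
    if len(version_a) != len(version_b):
        raise ValueError("need one rating per subject in each version")
    diffs = [b - a for a, b in zip(version_a, version_b)]
    n = len(diffs)
    mean_diff = mean(diffs)
    # Standard error of the mean difference (sample SD over sqrt(n)).
    standard_error = stdev(diffs) / sqrt(n)
    return mean_diff, mean_diff / standard_error, n - 1


# Hypothetical ratings on a five-point scale, one value per subject.
ratings_a = [2, 3, 3, 3, 2]
ratings_b = [3, 4, 3, 5, 4]

mean_diff, t, df = paired_t(ratings_a, ratings_b)
# A positive t means the pair was rated as more related in version B.
print(f"mean B-A = {mean_diff:.2f}, t({df}) = {t:.2f}")
```

The resulting t would then be compared with the one-tailed critical value for the given degrees of freedom; the same computation can be run with items rather than subjects as the unit of analysis.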
Ratings task: Phase I, Validation

Construction of Materials
The sentences used for validation in the ratings experiments are listed here. The materials in sentence types 1, 2 and 3 were tested together (Ratings Experiment 1). The materials in 4A and 4B were tested subsequently (Ratings Experiment 2). In each experiment, subjects who rated validation sets rated causative sentence sets as well.

COMPARISON SET 1: RATINGS TASK EXPECT(A)/PERSUADE(B) SENTENCES
1. (A) During the stormy Atlantic crossing the captain expected the passengers to be frightened.
   (B) During the stormy Atlantic crossing the captain persuaded the passengers to remain calm.
2. (A) During the fire drill the principal discovered the boy to be missing.
   (B) During the fire drill the principal told the boy to be quiet in the hall.
3. (A) On the basis of the dress rehearsal the director announced the actress to be a success.
   (B) At the dress rehearsal the director reminded the actress to speak with feeling.
4. (A) According to a syndicated columnist, Congress believes the President to be an honest man.
   (B) According to a syndicated columnist, Congress advised the President to curtail military spending.
5. (A) Undoubtedly, many authors suppose editors to be capricious about accepting stories.
   (B) Undoubtedly, many authors convince editors to accept stories that have no substance.
6. (A) Before the NFL championship playoff the coach reported the players to be in top shape.
   (B) Before the NFL championship playoff the coach warned the players to be in top shape.

COMPARISON SET 2: RATINGS TASK EASY(A)/EAGER(B)
1. (A) Marriage counselors believe that wives are difficult to control in the home.
   (B) Marriage counselors believe that wives are eager to control in the home.
2. (A) According to managers, boxers are impossible to train too much.
   (B) According to managers, boxers are afraid to train too much.
3. (A) Most dogs know snakes are dangerous to bite when stretched out.
   (B) Most dogs know snakes are unable to bite when stretched out.
4. (A) After a defeat, the generals are easy to retire on a pension.
   (B) After a defeat, the generals are content to retire on a pension.
5. (A) African game wardens believe elephants are simple to move to new ranges.
   (B) African game wardens believe elephants are hesitant to move to new ranges.
6. (A) When the water is cold, gulls are delightful to watch from the shore.
   (B) When the water is cold, gulls are content to watch from the shore.

COMPARISON SET 3: RATINGS TASK WHO(A)/NOUN(B) SLUICING SENTENCES
1. (A) According to the gossip columns, the heiress married someone but only the family knows who.
   (B) According to the gossip columns, the heiress married someone but only the family knows him.
2. (A) The winner of the travel lottery will go somewhere for two weeks and a computer has determined where.
   (B) The winner of the travel lottery will go somewhere for two weeks and a computer has determined the place.
3. (A) Although the ticket agent knew the plane left from Gate 7, the traveler couldn’t find out when.
   (B) Although the ticket agent knew the plane left from Gate 7, the traveler couldn’t find out the time.
4. (A) The Bermuda Triangle is known to be dangerous and a recent book explains why.
   (B) The Bermuda Triangle is known to be dangerous and a recent book explains the reasons.
5. (A) Pottery and other ancient artifacts are known to be buried in the Northern Hemisphere and the archeologists are trying to discover where.
   (B) Pottery and other ancient artifacts are known to be buried in the Northern Hemisphere and the archeologists are trying to discover the site.
6. (A) San Francisco is supposed to slide into the sea and the seismologists are attempting to predict when.
   (B) San Francisco is supposed to slide into the sea and the seismologists are attempting to predict the date.

COMPARISON SET 4A: RATINGS TASK THERE(A)/WHERE(B) SENTENCES
1. (A) There is an island in the Caribbean which has a free port.
(B) Where is an island in the Caribbean which has a free port?
2. (A) There is a doctor who makes house calls on Sundays.
(B) He is a doctor who makes house calls on Sundays.
3. (A) There is this friend of hers who lives in New York and always has room for guests.
(B) Who is this friend of hers who lives in New York and always has room for guests?
4. (A) There was a person playing Chopin on the lobby piano for hours.
(B) Who was the person playing Chopin on the lobby piano for hours?
5. (A) There was a man who whistled when the voluptuous model entered.
(B) Who was the man who whistled when the voluptuous model entered?
6. (A) There were students who said the exam was too easy.
(B) Who were the students who said the exam was too easy?
7. (A) Is there a marina on the Charles that has a gas pump?
(B) Where is a marina on the Charles that has a gas pump?
8. (A) Is there an author who does not dream of instant success?
(B) Who is the author who does not dream of instant success?
9. (A) There is an inventor who designed a new burglar alarm.
(B) Who is the inventor who designed a new burglar alarm?
10. (A) There was an old woman who lived in a shoe.
(B) Who was the old woman who lived in a shoe?
COMPARISON SET 4B: RATINGS TASK THERE(A)/THERE(B) SENTENCES
1. (A) According to the angry tenants, there are some mice that live behind the kitchen walls.
(B) “Look,” said the angry tenants, “there are the mice that live behind the kitchen walls.”
2. (A) According to the Scientific American, there is a proof that you never need more than four colors to draw a map.
(B) Waving at the blackboard, the mathematician shouted, “There is the proof that you never need more than four colors to draw a map.”
3. (A) You may find it hard to believe, but there is a professor who is trying to recover solar energy from mushrooms.
(B) If you want to meet someone interesting, over there is the professor who is trying to recover solar energy from mushrooms.
4. (A) The secretary believes there is a copy of the letter somewhere, but she can’t find it.
(B) The secretary said, “There is the copy of the letter,” but her boss didn’t hear her.
5. (A) Looking worried, the professor said, “I doubt that there is a good book on relativity theory for laymen.”
(B) Pointing to the shelf, the professor said, “There is a good book on relativity theory for laymen.”
6. (A) On the top shelf of the refrigerator there is a kind of $z that all the children like.
(B) Smiling proudly, the cook remarked, “There is a kind of $ that every child will like.”
7. (A) The tourists think that there is a place in Florida where Blackbeard buried his treasure.
(B) The tourists think that this is the place in Florida where Blackbeard buried his treasure.
8. (A) “There is a most amazing statue in Rheims,” gushed Jack Parr.
(B) “There is the most amazing statue in Rheims,” gushed Jack Parr.
COMPARISON SET 5A: RATINGS TASK CAUSATIVE(A)/TRANSITIVE(B) - UNMARKED
1. (A) Despite protests from the manager the owner closed the theater.
(B) Despite protests from the manager the owner sold the theater.
2. (A) Sitting in his high chair the baby spilled the juice in his cup.
(B) Sitting in his high chair the baby drank the juice in his cup.
3. (A) In the orchard the children bent the branches to get the apples.
(B) In the orchard the children pulled the branches to get the apples.
4. (A) Despite the wind the firemen stopped the fire quickly.
(B) Despite the wind the firemen controlled the fire quickly.
5’. (A) At the end of the day the guard locked the gate in the yard.
(B) At the end of the day the guard bolted the gate in the yard.
6. (A) Nimbly the squirrel cracked the nut on a high branch.
(B) Nimbly the squirrel ate the nut on a high branch.
COMPARISON SET 5B: RATINGS TASK CAUSATIVE(A)/TRANSITIVE(B) - MORPHOLOGICALLY MARKED
1. (A) After the meat was sliced the chef thickened the sauce.
(B) After the meat was sliced the chef tasted the sauce.
2. (A) Many years ago the contractors deepened the channel in the Panama Canal.
(B) Many years ago the contractors planned the channel in the Panama Canal.
3. (A) While working in the shop the carpenter straightened the nails.
(B) While working in the shop the carpenter discarded the nails.
4. (A) In setting up the tent the scouts loosened the line on the center pole.
(B) In setting up the tent the scouts tied the line on the center pole.
5. (A) In colonial times blacksmiths hardened the steel to make an axe.
(B) In colonial times blacksmiths ,“t the steel to make an axe.
6. (A) After the operation the nurse darkened the room so the patient could sleep.
(B) After the operation the nurse left the room so the patient could sleep.
COMPARISON SET 5C: RATINGS TASK CAUSATIVE(A)/TRANSITIVE(B) SENTENCES
1. (B) While patching up the old house the workers found some paint in the basement.
(A) While patching up the old house the workers spilled some paint in the basement.
2. (B) When the police reached the demonstration, they told the people to go home.
(A) When the police stopped the demonstration, they told the people to go home.
3. (B) Nobody noticed that during the poker game the gambler used a card from his sleeve.
(A) Nobody noticed that during the poker game the gambler dropped a card from his sleeve.
4. (B) The cautious hunters heard the cougar that was hiding in the tree.
(A) The cautious hunters killed the cougar that was hiding in the tree.
5. (B) In the middle of a quiet afternoon the students found a fire in the chemistry laboratory.
(A) In the middle of a quiet afternoon the students started a fire in the chemistry laboratory.
6. (B) Against the recommendation of his advisors the mayor supported the project for redevelopment.
(A) Against the recommendation of his advisors the mayor ended the project for redevelopment.
7. (B) The Sunday paper repeated the story that the senator was about to resign.
(A) The Sunday paper spread the story that the senator was about to resign.
8. (B) The retiring chairman attended the meeting of the Board of Trustees for the last time.
(A) The retiring chairman convened the meeting of the Board of Trustees for the last time.
9. (B) Before going on camera, the chef added the ingredients according to the recipe.
(A) Before going on camera, the chef combined the ingredients according to the recipe.
10. (B) With only fifteen seconds left, the fullback got the football on the one-yard line.
(A) With only fifteen seconds left, the fullback put the football on the one-yard line.
To determine the sensitivity of the ratings task to the relevant structural features, we asked subjects to rate the relatedness of word pairs formed by using the items in italics in the materials lists; for the four words in italics, all six possible pairs were rated (though not all are relevant to our concerns). The relevant word pairs are determined by the structural analyses of each of the sentence types. Those analyses and their application to the ratings pairs are described in Part II of the main body of the paper.
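The count of six pairs follows from choosing two of the four marked words without regard to order. The sketch below illustrates this for the A version of sentence (1,1); the dictionary layout and the function name are illustrative assumptions, not part of the original materials.

```python
from itertools import combinations

# Four marked words from stimulus sentence (1,1), A version. The role labels
# follow the subject / verb / object / control scheme used in the materials;
# the data structure itself is a hypothetical convenience.
marked_words = {
    "control": "crossing",
    "subject": "captain",
    "verb": "expected",
    "object": "passengers",
}


def rating_pairs(words):
    """Return every unordered pair of marked words, tagged with role labels."""
    return [
        ((role_a, role_b), (words[role_a], words[role_b]))
        for role_a, role_b in combinations(words, 2)
    ]


pairs = rating_pairs(marked_words)
for roles, pair in pairs:
    print(roles, pair)
```

Of the six pairs produced, only some (for example subject-object and subject-control) enter the analyses reported for a given sentence type.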
The first two types of sentences (expect/persuade sentences: Set 1, and eager/easy sentences: Set 2) have been the object of earlier experimentation using other experimental techniques, both in our laboratory and in those of other researchers (see, e.g., Walker, 1976; Fodor, Bever and Garrett, 1974; Cooper, 1976). Those experimental enquiries all indicate that the structural contrasts of these sentences are effective in immediate memory, perception and production. We have no such prior experimental assurances for the types in Sets 3 and 4. In each of these four sentence types, there are word pairs in the A versions which do not share underlying clause membership, while in the B versions they do. Assuming the characterizations of underlying clause structure given in Part II, the judgments of the degree of relatedness in the several sentence types should show the relation A < B; for example, word pairs taken from stimulus sentences (1,1), (2,4), (3,1), (4A,1) and (4B,2) should display the following pattern:
(1,1) . . . captain expected passengers < captain persuaded passengers
(2,4) . . . generals are easy < generals are content
(3,1) . . . family knows who . . .
These comparisons involving the surface subject and object (or predicate adjective, in Set 2) are designated the primary pairs of our analysis; this is because they are the pairs in the sentences for Sets 1-4 which most precisely parallel the word pairs in Set 5, the causatives, for which the definitional analysis makes the clearest prediction. Note that, unlike the primary pairs, the prediction of an A-B difference for subject-verb pairs in the causatives is complicated by the fact that a “feature” of the putatively decomposed surface verb (namely, the deep verb “cause”) is associated with the subject noun in underlying structure (see Part II, Fig. 4). The verb and object-noun pairs in the causatives do not, of course, yield any prediction of differences on the grounds of underlying clausal membership. Hence, the subject-object pair provides the only clear test case for the causatives. However, in Sets 1 and 3, we can make an additional prediction: the verb and object-noun pairs (“secondary pairs”) should show the same pattern as the primary pairs. In both these sentence types, these words are assigned to distinct underlying clauses in the A versions and to common underlying clauses in the B versions. Note that sentence Sets 2 and 4 do not lend themselves to such a prediction; the sentences of Set 2 do not contain an explicit surface object, and those of Set 4 do not contain appropriate verbs. In addition to the words directly involved in the relevant structural contrast, we also obtained judgments involving a fourth unrelated word. This
was done in order to test for the specificity of the manipulated structures as the determinant of the judgments obtained. One might well imagine that the structural differences between the A and B versions could create some “whole sentence” effects on relatedness judgments; the structural change might affect all words in the sentences regardless of clausal membership. To evaluate this possibility we tested word pairs involving, e.g., the subject noun and a nearby extra-clausal word, such as “crossing” in (1,1) or “defeat” in (2,4), whose relations do not change from A to B versions. Such pairs should not differ in their ratings in the two versions of the sentences. The relevant words (labelled “C” for “control”) are marked in the stimulus lists and again in the tables of results. Wherever feasible, the control pairs use a neutral word plus the pivotal noun whose clausal relations change from A to B in the manipulated portions of the test sentences. (Note that the control pairs for sentence Set 2 are somewhat complicated by the involvement of the surface subject in different underlying grammatical relations.) One further control problem is relevant: the possibility that intrinsic relations between the particular words of a test pair may obscure or contribute to the structurally induced variations. On this point, note first that for the primary comparison (the subject-object pairs) identical words are tested in Sets 1 and 4 among the validating cases and also in Set 5, the causative test case; similarly for the control pairs. Thus, frequency effects, length effects, imageability, etc., as well as the intrinsic relation between the pair members, are controlled for that comparison. It was not always possible to maintain identity from version to version of the sentences, but where feasible, we adhered to a minimal pair principle in sentence construction. In the ideal case, only one word changes from version A to version B. 
In others, some additional changes made to material surrounding the critical contrast were motivated by a desire to render both sentence versions comparably plausible or natural expressions. Beyond the matching of compared items, we have taken one further step to assess the effects of variation from A to B versions: all the test word pairs which were rated in sentence contexts were also rated in isolation. A randomly ordered list of A version pairs and a matched list of the B version pairs were rated by independent groups. This provides some measure of the effect of intrinsic relations between the members of word pairs which do differ in their lexical content. Finally, we note that in both experiments a variety of sentences representing each of the structural types were used; we have not relied upon only two or three exemplars of each type. By so doing, we reduce the possibility that observed differences depend upon the idiosyncrasies of particular sentences. In the ratings experiment, the number of sentences representing a given type is smaller (n = 6) than that for the forced choice experiment (n = 20). The ratings task is more time consuming, hence the conservatism in numbers compared to the forced choice tests. This was, as the tests indicate, not the best accommodation, for it weakened item based tests for ratings. In both experiments an evaluation of effects based upon items is reported, as well as an evaluation based on subjects.
Procedures and Analysis
In the ratings task, a given subject saw only one version of any given sentence pair, and each subject saw equal numbers of A and B versions. Experiment 1 subjects saw a randomly ordered series consisting of A or B versions of sentences in Comparison Sets 1, 2, 3, 5A and 5B; experiment 2 subjects saw a randomly ordered series from Comparison Sets 4A and 5C; 4B sentences were separately run. A new randomization was used for each subject. Presentation and response were written (computer-printed test booklets), with a sentence typed above the list of six word pairs, and each pair associated with a five point rating scale. Subjects were instructed to rate the degree of relation holding between each pair of words, using their understanding of the printed sentence as their basis of judgment. Sixty MIT undergraduates who were paid for their voluntary participation were run as subjects in experiment 1, and 40 in experiment 2. The ratings obtained were adjusted for idiosyncratic subject and item variability. Item scores were expressed as deviations from each subject’s mean rating score across all his rating judgments, thus removing differences in subjects’ use of the rating scale from the comparison of A and B versions of given sentences. Similarly, subject scores were expressed as deviations from each (paired) item’s mean rating across all judgments. These scores were tested for A-B differences on experimental word pairs (those which differ in clausal relations in the two versions) and control word pairs (those whose structural relation is constant across versions). Separate analyses are reported for each structural type, for the primary test pair and its control; where available, results for secondary test pairs are reported.
Results
For sentence type 1, the primary pair is the subject-object pair (e.g. “captain” and “passengers” in the illustrative sentence pair in Table 2) and the control pair is the subject noun and an unrelated control word (e.g. “captain” and “crossing” in the illustrated pair). Table 2 provides results.
Table 2. Sentence Set 1, Rating Scales. Tests for common clause membership in VP complements (B versions) versus disjoint membership in NP complements (A versions).

Example sentence pair, with the label of each target word in brackets after it:
A. During the stormy Atlantic crossing [control] the captain [subject] expected [verb] the passengers [object] to be frightened.
B. During the stormy Atlantic crossing [control] the captain [subject] persuaded [verb] the passengers [object] to remain calm.

Analysis type | Items (n = 6) | Subjects (n = 60)
Mean B-A differences for primary word pairs (subject-object) | 0.333, t = 2.185 (df = 5), p < 0.040 | 0.332, t = 2.926 (df = 59), p < 0.002, min F’ ns
Mean B-A differences for control word pairs (subject-control) | 0.025, t = 0.109 (df = 5), p < 0.459 | 0.029, t = 0.203 (df = 59), p < 0.420, min F’ ns
Primary pairs versus control pairs | t = -1.119 (df = 5), p < 0.148 | t = -1.540 (df = 59), p < 0.064, min F’ ns
Mean B-A differences for secondary word pairs (verb-object) | 0.775, t = 2.747 (df = 5), p < 0.020 | 0.776, t = 5.994 (df = 59), p < 0.001, min F’ = 6.236, p < 0.05
Mean B-A differences for secondary control pairs (verb-control) | -0.067, t = -0.268 (df = 5), p < 0.399 | -0.069, t = -0.505 (df = 59), p < 0.307, min F’ ns
Secondary pairs versus control pairs | t = -2.239 (df = 5), p < 0.037 | t = -4.538 (df = 59), p < 0.001, min F’ ns
If common clausal membership is an effective determinant of relatedness judgments, we would expect the ratings for B versions to be higher than ratings for the corresponding A versions of these sentences: e.g., “captain” and “passengers” should be judged more closely related when they are connected by the verb “persuaded” than when connected by the verb “expected”. The relation between “captain” and “crossing” should, by contrast, remain constant; and similarly for the verb-object pair and its control.
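The score adjustment and the B-A test described under Procedures and Analysis can be sketched for the item based analysis as follows. This is a minimal illustration under stated assumptions: the ratings are invented, the function names are hypothetical, and a plain one-sample t statistic on per-item B-A differences is assumed, since the text does not spell out the exact computation.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean, stdev

# Hypothetical ratings: (subject, item, version, rating). As in the design
# described above, each subject sees only one version of any given item.
records = [
    ("s1", "item1", "A", 2), ("s1", "item2", "B", 4), ("s1", "item3", "A", 3),
    ("s2", "item1", "B", 5), ("s2", "item2", "A", 3), ("s2", "item3", "B", 4),
    ("s3", "item1", "A", 1), ("s3", "item2", "B", 3), ("s3", "item3", "A", 2),
    ("s4", "item1", "B", 4), ("s4", "item2", "A", 2), ("s4", "item3", "B", 3),
]


def center_by_subject(rows):
    """Express each rating as a deviation from that subject's mean rating."""
    by_subject = defaultdict(list)
    for subject, _item, _version, rating in rows:
        by_subject[subject].append(rating)
    subject_mean = {s: mean(r) for s, r in by_subject.items()}
    return [(s, i, v, r - subject_mean[s]) for s, i, v, r in rows]


def item_differences(centered):
    """Mean centered B rating minus mean centered A rating, per item."""
    scores = defaultdict(lambda: {"A": [], "B": []})
    for _subject, item, version, score in centered:
        scores[item][version].append(score)
    return {item: mean(v["B"]) - mean(v["A"]) for item, v in scores.items()}


def one_sample_t(differences):
    """t statistic and degrees of freedom for a mean difference of zero."""
    n = len(differences)
    return mean(differences) / (stdev(differences) / sqrt(n)), n - 1


diffs = item_differences(center_by_subject(records))
t_value, df = one_sample_t(list(diffs.values()))
print(diffs, round(t_value, 3), df)
```

Centering matters here because the A and B ratings of a given item come from different subjects; without it, a subject who habitually uses the high end of the scale would inflate whichever version he happened to see.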
As Table 2 indicates, this seems to be the case: B-A differences depart significantly from zero for both subject and item based analyses. The control pair for the subject-object comparison does not reach significance for either analysis. The direct comparison of the subject-object pair differences and the subject-control pair differences does not reach significance, though the difference is in the expected direction. The result for the verb-object pair and the verb-control pair shows exactly the same pattern, but in this case, the direct comparison of experimental and control pairs is significant. We note here that these and subsequent comparisons are, as the tables indicate, evaluated by the min F’ statistic as well. Rarely do the differences for the ratings experiment achieve significance by this test.

Table 3. Sentence Set 2, Rating Scales. Tests for predicate adjective constructions, common clause membership (B versions) versus disjoint membership (A versions).

Example sentence pair, with the label of each target word in brackets after it:
A. After a defeat, the generals [subject] are easy [adj.] to retire [control] on a pension.
B. After a defeat, the generals [subject] are content [adj.] to retire [control] on a pension.

Analysis type | Items (n = 6) | Subjects (n = 60)
Mean B-A differences for primary word pairs (subject-adjective) | 0.600, t = 1.223 (df = 5), p < 0.138 | 0.602, t = 3.321 (df = 59), p < 0.001, min F’ ns
Mean B-A differences for control word pairs (subject-control) | 0.200, t = 0.529 (df = 5), p < 0.310 | 0.201, t = 1.793 (df = 59), p < 0.039, min F’ ns
Primary pairs versus control pairs | t = -2.054 (df = 5), p < 0.240 | t = -3.529 (df = 59), p < 0.046, min F’ ns
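The min F’ entries in the tables can be related to the subject and item t values. The sketch below assumes the conventional definition of min F’ (the product of the subject and item F ratios divided by their sum, with F = t² for a single-degree-of-freedom contrast); the function name and argument layout are illustrative. Applied to the verb-object row of Table 2, it gives a value that agrees with the printed min F’ = 6.236.

```python
def min_f_prime(t_subjects, df_subjects, t_items, df_items):
    """Return (min F', approximate denominator df) from two t statistics.

    Assumes the conventional definition: with F1 = t_subjects**2 and
    F2 = t_items**2, min F' = F1 * F2 / (F1 + F2), and the denominator
    degrees of freedom are (F1 + F2)**2 / (F1**2 / df_items + F2**2 / df_subjects).
    """
    f1 = t_subjects ** 2
    f2 = t_items ** 2
    value = (f1 * f2) / (f1 + f2)
    denom_df = (f1 + f2) ** 2 / (f1 ** 2 / df_items + f2 ** 2 / df_subjects)
    return value, denom_df


# Verb-object (secondary) pairs of Table 2: subjects t = 5.994 (df = 59),
# items t = 2.747 (df = 5).
value, denom_df = min_f_prime(5.994, 59, 2.747, 5)
print(round(value, 3), round(denom_df, 1))
```

Because min F’ can never exceed the smaller of the two F ratios, a weak item based result (here based on only six items) is enough to keep it below significance, which is consistent with the many "ns" entries in the ratings tables.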
In the eager/easy sentences of Comparison Set 2, the primary test pair is the subject, adjective pair (e.g. “generals” and “content” versus “generals” and “easy” in the illustrative sentence pair of Table 3). As Table 3 shows, the means for B versions were significantly higher than those for A versions for the subject based analysis in both the experimental and control pairs.
Direct comparison of the differences shows the experimental pair differences to significantly exceed those for the control pairs. Item based analyses, though in the expected direction, do not reach significance. Note that for this case, we have used as a control pair two words which are in the same underlying clause in both versions (“generals” and “retire”), but which have differing grammatical roles in that clause - e.g., “generals” is an underlying subject in the A version and an underlying object in the B version. This difference may have compromised the control pair; a better choice for a control pair would perhaps have been “retire”, “pension”. For the “sluicing” sentences of sentence Set 3, the primary comparison is again the subject-object pair (e.g., “family” and “who” versus “family” and “him” in the illustrative pair). Table 4 presents the results.

Table 4. Sentence Set 3, Rating Scales. Tests for sluicing construction, common clause membership (B versions, direct object) versus disjoint membership (A versions, sluiced object).

Example sentence pair, with the label of each target word in brackets after it:
A. According to the gossip columns the heiress [control] married someone but only the family [subject] knows [verb] who [object].
B. According to the gossip columns the heiress [control] married someone but only the family [subject] knows [verb] him [object].

Analysis type | Items (n = 6) | Subjects (n = 60)
Mean B-A differences for primary word pairs (subject-object) | 0.398, t = 2.359 (df = 5), p < 0.033 | 0.409, t = 2.621 (df = 59), p < 0.005, min F’ ns
Mean B-A differences for control word pairs (subject-control) | -0.027, t = -0.096 (df = 5), p < 0.463 | -0.020, t = -0.100 (df = 59), p < 0.461, min F’ ns
Primary pairs versus control pairs | t = -1.293 (df = 5), p < 0.123 | t = -1.549 (df = 59), p < 0.063, min F’ ns
Mean B-A differences for secondary word pairs (verb-object) | 0.945, t = 0.072 (df = 5), p < 0.003 | 0.015, t = 0.113 (df = 59), p < 0.455, min F’ ns
Mean B-A differences for secondary control pairs (verb-control) | -0.311, t = -4.098 (df = 5), p < 0.004 | -0.299, t = -2.108 (df = 59), p < 0.019, min F’ ns
Secondary pairs versus control pairs | t = -2.828 (df = 5), p < 0.020 | t = -1.466 (df = 59), p < 0.074, min F’ ns
Again, we find that for both subject and item based analyses the B versions yield higher relatedness scores than do the A versions for the subject, object test pairs. By contrast, the control pairs show no indication of an A, B difference. As with the type 1 sentences, the direct comparison approaches, but does not reach significance. The secondary comparison (verb, object pairs) in these sentences does show a significant contrast (on the item based analysis), between experimental and control differences. This contrast should not be interpreted, however, because it arises from a significant negative difference between A and B versions for the control pair, as well as a significant positive difference for the experimental pair. The subject based analysis for the experimental pair differences is also non-significant. Given these facts, the secondary comparison, though significant for items, should be discounted. Sentence types 1, 2 and 3 all show, for the primary comparisons, the pattern of results expected on the assumption that common underlying clause membership increases judged degree of relatedness for the words of a sentence. Though the patterns in each case are the same, and for the most part statistically significant in themselves, direct comparisons with control pairs are usually not significant. However, if one collapses the three sentence types for a single test of the contrast between experimental and control pairs, one finds a significant difference between the primary pairs and control pairs for both subject and item based analyses though not by the min F’ test. The mean B-A difference across the three sets is 0.448 for subjects and 0.444 for items on the experimental pairs; by contrast, the mean differences for control pairs are 0.070 for subjects and 0.066 for items.
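The pooled figures quoted above can be checked against the per-set means in Tables 2, 3 and 4. The sketch below assumes that the pooled values are simple unweighted averages of the three per-set mean B-A differences; the text does not state how the pooling was done, so this is a consistency check rather than a reconstruction of the authors' procedure.

```python
from statistics import mean

# Mean B-A differences as printed in Tables 2, 3 and 4 (Sets 1, 2 and 3).
primary = {
    "items": [0.333, 0.600, 0.398],
    "subjects": [0.332, 0.602, 0.409],
}
control = {
    "items": [0.025, 0.200, -0.027],
    "subjects": [0.029, 0.201, -0.020],
}

# Unweighted average over the three sentence types (an assumption).
pooled_primary = {k: round(mean(v), 3) for k, v in primary.items()}
pooled_control = {k: round(mean(v), 3) for k, v in control.items()}

print(pooled_primary)
print(pooled_control)
```

Under that assumption the averages reproduce the values given in the text for both the experimental and the control pairs.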
Though ratings experiment 1 also included tests of causative verbs, we will postpone their report until after the report of a further validation of the test instrument using sentences of types 4A and 4B from experiment 2. These sentences, though they also contrast word pairs in terms of underlying clausal membership, do so in a different way from sentences in types 1, 2, and 3. In 4A and 4B, on the analyses we described for these sentences, the existential “there” of the primary test pairs is transformationally introduced, and simply is not present in underlying representations of the test sentences. We also changed from five point scales to seven point scales for these tests.

Table 5. Sentence Set 4A, Rating Scales. Tests for existential ‘there’ constructions, common clause membership (B versions, interrogative pronoun) versus disjoint membership (A versions, existential ‘there’).

Example sentence pair, with the label of each target word in brackets after it:
A. There [subject] is an island [object] in the Caribbean [control] which has a free port [control].
B. Where [subject] is an island [object] in the Caribbean [control] which has a free port [control]?

Analysis type | Items (n = 10) | Subjects (n = 40)
Mean B-A differences for primary word pairs (subject-object) | 1.670, t = 10.167 (df = 9), p < 0.001 | 1.670, t = 8.647 (df = 39), p < 0.001, min F’ = 43.387, p < 0.01
Mean B-A differences for primary control pairs (control-control) | 0.095, t = 0.561 (df = 9), p < 0.294 | 0.095, t = 0.556 (df = 39), p < 0.291, min F’ ns
Primary pairs versus control pairs | t = 6.143 (df = 9), p < 0.001 | t = 6.449 (df = 39), p < 0.001, min F’ = 19.785, p < 0.01
Tables 5 and 6 give the results for tests on these two sentence sets. The results are quite uniform and very robust. Differences between the A and B versions are large (and significant) for the experimental pairs in type 4A (“there”/“where”) sentences, while those for the control pairs are small and nonsignificant. Direct comparison of the experimental and control pair differences is significant for both subjects and items, and by the min F’ test as well. The pattern for the type 4B (“there”/“there”) sentences is essentially the same. The only pause one might take is over the negative difference between versions for the 4B control pairs. This makes the experimental/control comparison less useful. It seems likely that this effect arose as a contrast effect because of the strongly felt differences for the B versions of the primary pairs. The 4B sentences were run separately from the other sets and hence the “padding” effect of other pairs and other sentence types was lacking. However, given the significance of the A/B differences for the primary test pair, and the outcome of the tests for type 4A, there does not seem much room for doubt that the test is strongly sensitive to the relevant structural feature, even though there is some indication that it may be sensitive to other aspects of sentence organization as well.

Table 6. Sentence Set 4B, Rating Scales. Tests for existential ‘there’ constructions, common clause membership (B versions, demonstrative pronoun) versus disjoint membership (A versions, existential ‘there’).

Example sentence pair, with the label of each target word in brackets after it:
A. According to the Scientific American, there [subject] is a proof [object] that you never need more than four [control] colors [control] to draw a map.
B. Waving at the blackboard, the mathematician shouted, “There [subject] is the proof [object] that you never need more than four [control] colors [control] to draw a map.”

Analysis type | Items (n = 8) | Subjects (n = 40)
Mean B-A differences for primary word pairs (subject-object) | 0.639, t = 1.893 (df = 7), p < 0.050 | 0.643, t = 4.121 (df = 39), p < 0.001, min F’ ns
Mean B-A differences for primary control pairs (control-control) | -0.244, t = -1.973 (df = 7), p < 0.045 | -0.244, t = -1.288 (df = 39), p < 0.103, min F’ ns
Primary pairs versus control pairs | t = -2.629 (df = 7), p < 0.017 | t = -3.577 (df = 39), p < 0.001, min F’ = 4.488, p < 0.05
Phase 2: Tests of causatives with the Ratings Task
Results
We now turn our attention to the results for the tests on causative verbs in the two ratings experiments (Comparison Sets 5A, 5B and 5C). Here we find a quite different outcome from that for Phase 1. As Tables 7, 8 and 9 indicate, the contrast of causative verbs with simple transitive verbs in these sentences gives no evidence of an effect like that just described for Sets 1, 2, 3 and 4.

Table 7. Sentence Set 5A, Rating Scales. Tests for causatives, common clause membership (B versions, transitive verb) versus disjoint membership (A versions, unmarked causative verb).

Example sentence pair, with the label of each target word in brackets after it:
A. Despite protests from the manager [control] the owner [subject] closed [verb] the theater [object].
B. Despite protests from the manager [control] the owner [subject] sold [verb] the theater [object].

Analysis type | Items (n = 5) | Subjects (n = 60)
Mean B-A differences for primary word pairs (subject-object) | 0.002, t = 0.012 (df = 4), p < 0.496 | 0.091, t = 0.577 (df = 59), p < 0.283, min F’ ns
Mean B-A differences for primary control pairs (subject-control) | 0.172, t = 0.661 (df = 4), p < 0.272 | 0.199, t = 1.247 (df = 59), p < 0.108, min F’ ns
Primary pairs versus primary control pairs | t = 0.566 (df = 4), p < 0.170 | t = 0.454 (df = 59), p < 0.325, min F’ ns
Mean B-A differences for secondary word pairs (subject-verb) | 0.152, t = 0.880 (df = 4), p < 0.214 | 0.033, t = 0.244 (df = 59), p < 0.404, min F’ ns
Mean B-A differences for secondary control pairs (verb-control) | -0.198, t = -1.094 (df = 4), p < 0.167 | 0.074, t = 0.476 (df = 59), p < 0.318, min F’ ns
Secondary pairs versus secondary control pairs | t = -1.399 (df = 4), p < 0.078 | t = 0.208 (df = 59), p < 0.252, min F’ ns
Secondary pairs versus primary control pairs | t = 0.064 (df = 4), p < 0.480 | t = 0.748 (df = 59), p < 0.229, min F’ ns

Table 8. Sentence Set 5B, Rating Scales. Tests for causatives, common clause membership (B versions, transitive verb) versus disjoint membership (A versions, marked causative verb).

Example sentence pair, with the label of each target word in brackets after it:
A. After the meat [control] was sliced the chef [subject] thickened [verb] the sauce [object].
B. After the meat [control] was sliced the chef [subject] tasted [verb] the sauce [object].

Analysis type | Items (n = 6) | Subjects (n = 60)
Mean B-A differences for primary word pairs (subject-object) | -0.133, t = -1.576 (df = 5), p < 0.088 | -0.132, t = -0.745 (df = 59), p < 0.229, min F’ ns
Mean B-A differences for primary control pairs (subject-control) | 0.000, t = -0.001 (df = 5), p < 0.500 | 0.001, t = 0.006 (df = 59), p < 0.500, min F’ ns
Primary pairs versus primary control pairs | t = 0.526 (df = 5), p < 0.303 | t = 0.590 (df = 59), p < 0.279, min F’ ns
Mean B-A differences for secondary word pairs (subject-verb) | -0.042, t = -0.222 (df = 5), p < 0.426 | -0.040, t = -0.350 (df = 59), p < 0.364, min F’ ns
Mean B-A differences for secondary control pairs (verb-control) | 0.083, t = 0.285 (df = 5), p < 0.393 | 0.084, t = 0.551 (df = 59), p < 0.291, min F’ ns
Secondary pairs versus secondary control pairs | t = 0.726 (df = 5), p < 0.346 | t = 0.488 (df = 59), p < 0.244, min F’ ns
Secondary pairs versus primary control pairs | t = 0.894 (df = 5), p < 0.462 | t = 0.841 (df = 59), p < 0.420, min F’ ns
The sentences of type 5A are not morphologically marked (i.e., surface and deep verbs differ in form) whereas those of 5B are so marked (the form of the surface transitive corresponds to that of the underlying intransitive except for the presence of the participial ending in the surface verb). No differences that are relevant to the putative clausal analysis of the causatives were manifest. Neither sentence set revealed a significant contrast of A and B versions for experimental or control pairs. This was true for both the primary test pairs (subject, object) and for the secondam’ test pairs (subject, verb). For Comparison Set SC, the marked/unmarked causative distinction was eliminated, and the set of sentences enlarged (n = 10). Again, however, the primary test pairs show no hint of a difference, nor for that matter, do their controls. The secondary test pairs (subject, verb) did show a significant subject based effect, but it was in the direction opposite to that predicted by the causative analysis (e.g., “workers” and “spilled” was judged more related than “workers” and “found”). Item based analyses were not significant. The contrast of experimental and control pairs was significant because the direction of effects was opposite in the two cases. Whatever the reason for this effect, it provides no comfort for the causative analysis. Overall, the three tests of the causatives are remarkably uniform. They indicate no distinction between A and B versions or between experimental and control pairs. The final results we will note for the ratings task are those for test word pairs presented in isolation (i.e., with no sentence contexts). These ratings were obtained to help evaluate the effects of intrinsic relations between the members of the test pairs. For the pairs from Set 1, (“expect/persuade” sentences), and Set 4B (existential “there” sentences), Set 2 (“eager/easy” sentences, there are no significant differences between A and B version pairs. 
This is surely to be expected for Sets 1 and 4B, for the pairs are identical. For the Set 2 cases, however, this provides us with some assurance that the differences observed in the sentence experiments were not the consequence of different intrinsic relations between the predicate adjectives and their subject nouns. For the pairs from Sets 3 (“sluicing”) and 4A (existential “there” versus locatives and other pronouns), there are significant differences. For the Set 3 case, the difference is of the same magnitude as that observed in the sentence experiment (mean difference in sentences: 0.501; in isolation: 0.521). It should be recalled that the sluicing cases were somewhat less clear-cut on structural grounds as well. The results with ratings in isolation, plus the structural imponderables, indicate that we should not place much weight on the results with the sluicing sentences in the ratings task.

The existential “there” case is different, however. The magnitude of the differences observed for the ratings in isolation is about half that observed for the sentence ratings (a mean B-A difference of 0.81 scale units in isolation versus 1.67 scale units in sentences). In fact, if one subtracts the values for ratings in isolation from their corresponding values for ratings in sentence contexts, the residual differences for the sentences are still significantly different in A and B versions. Given that fact and the positive results with the sentences of Set 4B, the existential “there” results seem secure.

The results summarized in Tables 2-8 indicate that subjects in the ratings task are responsive to the manipulated variable of underlying clausal membership where there are reasonably clear syntactic grounds for such assignments. The results of Table 9 indicate no variation due to the putative decomposition of causative verbs. We turn now to a discussion of the procedures and results for the forced-choice task.

Table 9. Sentence Set 5C. Rating Scales. Tests for causatives, common clause membership (B versions, transitive verb) versus disjoint membership (A versions, causative verb).

Example sentence pair with target words in italics (labeled control, subject, verb, object in the original):
A. While patching up the old house the workers spilled some paint in the basement.
B. While patching up the old house the workers found some paint in the basement.

Analysis type, with results for Items (n = 10) and Subjects (n = 40):

Mean B-A differences (subject-object) for primary word pairs
  Items: 0.012, t = 0.109 (df = 9), p < 0.458
  Subjects: 0.009, t = 0.078 (df = 39), p < 0.469; min F’ ns

Mean B-A differences (subject-control) for primary control pairs
  Items: 0.044, t = 0.394 (df = 9), p < 0.352
  Subjects: 0.038, t = 0.301 (df = 39), p < 0.383; min F’ ns

Primary pairs versus primary control pairs
  Items: t = -0.214 (df = 9), p < 0.428
  Subjects: t = -0.169 (df = 39), p < 0.434; min F’ ns

Mean B-A differences (subject-verb) for secondary word pairs
  Items: -0.245, t = -1.556 (df = 9), p < 0.077
  Subjects: -0.245, t = -2.173 (df = 39), p < 0.018; min F’ ns

Mean B-A differences (verb-control) for secondary control pairs
  Items: 0.159, t = 1.077 (df = 9), p < 0.155
  Subjects: 0.153, t = 1.161 (df = 39), p < 0.127; min F’ ns

Secondary pairs versus secondary control pairs
  Items: t = -2.030 (df = 9), p < 0.037
  Subjects: t = -2.097 (df = 39), p < 0.022; min F’ ns

Secondary pairs versus primary pairs
  Items: t = -1.224 (df = 9), p < 0.126
  Subjects: t = -1.385 (df = 39), p < 0.067; min F’ ns
Forced Choice Experiments: Phase I

Construction of materials
The sentences used in the forced-choice experiments are listed here. Minor changes were made in some of the sentences that had been used in the ratings task, and these were included among those used in the forced-choice task. To these, a number of new sentences of each type were added to increase the number of test sentence pairs (n = 20 for most groups). In addition, two new structural types were added (see Comparison Sets 5 and 6), which provide a more general (semantically based) evaluation of the basis for subjects’ judgments of relatedness in these tasks. Recall that in the sentence types included for Comparison Sets 5 and 6, the contrast of A and B versions is not plausibly susceptible of a syntactic construal (see Part II of the paper for discussion). The experimental and control word pairs are in italics in the lists. We have confined our attention to the primary test pairs for the forced-choice task.
COMPARISON SET 1: FORCED CHOICE TASK
EXPECT(A)/PERSUADE(B) SENTENCES

1. (A) Even though a bad storm was predicted the captain expected the passengers to remain calm.
   (B) Even though a bad storm was predicted the captain persuaded the passengers to remain calm.
2. (A) During the fire drill the principal discovered the boy to be missing.
   (B) During the fire drill the principal told the boy to be quiet.
3. (A) After the dress rehearsal the director pronounced the understudy to be ready to take over for the leading lady.
   (B) After the dress rehearsal the director reminded the understudy to be ready to take over for the leading lady.
4. (A) According to a New York Times editorial, Congress believes the President to be too flexible in his foreign policy.
   (B) According to a New York Times editorial, Congress advised the President to be more flexible in his foreign policy.
5. (A) Union leaders rarely suppose management to provide adequate health care benefits for their members.
   (B) Union leaders rarely convince management to provide adequate health care benefits for their members.
6. (A) Before the NFL championship playoff the coach reported the players to be in top shape.
   (B) Before the NFL championship playoff the coach warned the players to be in top shape.
7. (A) Because of unrest in the Middle East the general required his regiment to be ready to move out immediately.
   (B) Because of unrest in the Middle East the general ordered his regiment to be ready to move out immediately.
8. (A) Since the bus was leaving at noon, the scoutmaster expected the patrol to assemble by 11:30.
   (B) Since the bus was leaving at noon, the scoutmaster told the patrol to assemble at 11:30.
9. (A) His sixth grade school teacher declared James to be outstanding in mathematics.
   (B) His sixth grade school teacher inspired James to be outstanding in mathematics.
10. (A) My great grandfather preferred his children to be quiet in public places.
    (B) My great grandfather encouraged his children to be quiet in public places.
11. (A) The old trainer thought the young athlete to be the best prepared runner in the hundred-yard dash.
    (B) The old trainer helped the young athlete to be the best prepared runner in the hundred-yard dash.
12. (A) At the beginning of the semester the professor assumed the students to be competent in English composition.
    (B) At the beginning of the semester the professor advised the students to be imaginative in their writing.
13. (A) According to one source, the graduate admissions committee intended Dr. Smith to be chairman.
    (B) According to one source, the graduate admissions committee elected Dr. Smith to be chairman.
14. (A) Do you think our music teacher imagines her sons to play well enough to be concert pianists?
    (B) Do you think our music teacher taught her sons to play well enough to be concert pianists?
15. (A) The new captain of the debating team wants his team to have an answer for any question that might arise.
    (B) The new captain of the debating team urges his team to have an answer for any question that might arise.
16. (A) The search party feared the children to be lost in the woods where the plane crashed.
    (B) The search party directed the children to lead them to the place in the woods where the plane crashed.
17. (A) The vacationers understood the motel owner to have reserved a suite for them.
    (B) The vacationers asked the motel owner to reserve a suite for them.
18. (A) Since the weatherman predicted heavy snow, some of the mothers wanted the principal to close the school early.
    (B) Since the weatherman predicted heavy snow, some of the mothers told the principal to close the school early.
19. (A) During survival training the sergeant felt his soldiers to be prepared for jungle combat.
    (B) During survival training the sergeant forced his soldiers to be prepared for jungle combat.
20. (A) In spite of the blizzard the conductor expects the guest violinist to be present at the recital tonight.
    (B) In spite of the blizzard the conductor convinced the guest violinist to be present at the recital tonight.
COMPARISON SET 2: FORCED CHOICE TASK
EASY(A)/EAGER(B) SENTENCES

1. (A) Marriage counselors believe that wives are difficult to dominate in the home.
   (B) Marriage counselors believe that wives are afraid to dominate in the home.
2. (A) According to managers, most young pitchers are impossible to overtrain.
   (B) According to managers, most young pitchers are afraid to overtrain.
3. (A) Most naturalists know that snakes are difficult to move in cold weather.
   (B) Most naturalists know that snakes are unable to move in cold weather.
4. (A) Especially after losing his ship, a captain is easy to retire on a pension.
   (B) Especially after losing his ship, a captain is content to retire on a pension.
5. (A) African game wardens believe that elephants are simple to shift to new ranges.
   (B) African game wardens believe that elephants are hesitant to shift to new ranges.
6. (A) Because of the press of people, the police are easy to fight in a crowd.
   (B) Because of the press of people, the police are afraid to fight in a crowd.
7. (A) Senators are hard to please in an election year.
   (B) Senators are anxious to please in an election year.
8. (A) Some people claim that universities are difficult to change when the circumstances require it.
   (B) Some people claim that universities are eager to change when the circumstances require it.
9. (A) The toxic red tide makes fish dangerous to eat for several days.
   (B) The toxic red tide makes fish unable to eat for several days.
10. (A) When the weather is warm the swimmers are delightful to watch from the shore.
    (B) When the weather is cold the swimmers are content to watch from the shore.
11. (A) Their enormous capital investment makes large corporations almost impossible to resist by financial measures.
    (B) Their enormous capital investment makes large corporations almost powerless to resist by financial measures.
12. (A) According to single women, bachelors are easy to entertain at home.
    (B) According to single women, bachelors are eager to entertain at home.
13. (A) Soldiers are hard to command on the battlefield without enough rank.
    (B) Soldiers are hesitant to command on the battlefield without enough rank.
14. (A) When I was in school, I always found my friends were tough to leave when summer vacation started.
    (B) When I was in school, I always found my friends were anxious to leave when summer vacation started.
15. (A) Following the 1968 conventions, the mobs were difficult to disperse.
    (B) Following the 1968 conventions, the mobs were reluctant to disperse.
16. (A) When it comes to asking for help, strangers are easy to offend by being too forward.
    (B) When it comes to asking for help, strangers are hesitant to offend by being too forward.
17. (A) In the summertime relatives are fun to visit on weekends.
    (B) In the summertime relatives are eager to visit on weekends.
18. (A) According to the students some politicians are impossible to debate.
    (B) According to the students some politicians are afraid to debate.
19. (A) Under severe circumstances the police are simple to resist forcibly.
    (B) Under severe circumstances the police are powerless to resist forcibly.
20. (A) Just before Christmas, children are simple to please.
    (B) Just before Christmas, children are eager to please.
COMPARISON SET 3: FORCED CHOICE TASK
WHO(A)/NOUN(B) SENTENCES

1. (B) I once read an interesting book on dodos, but I can’t recall it very precisely.
   (A) I once read an interesting book on dodos, but I can’t recall when very precisely.
2. (B) Pottery and other ancient artifacts are thought to be buried in the Northern Hemisphere and the archeologists are trying to recover them.
   (A) Pottery and other ancient artifacts are thought to be buried in the Northern Hemisphere and the archeologists are trying to discover where.
3. (B) Tom should file a claim for disaster relief, but he doesn’t know the procedure.
   (A) Tom should file a claim for disaster relief, but he doesn’t know how.
4. (B) The winner of the travel lottery will go somewhere for two weeks and a computer has determined the place.
   (A) The winner of the travel lottery will go somewhere for two weeks and a computer has determined where.
5. (B) Or&e is a bore though nobody is willing to tell him.
   (A) On%e is a bore though nobody is able to explain why.
6. (B) According to the gossip columns the heiress married someone but none of the reporters know him.
   (A) According to the gossip columns the heiress married someone but none of the reporters know who.
7. (B) There is only one good way to make souffles, and the chef is teaching David the method.
   (A) There is only one good way to make souffles, and the chef is teaching David how.
8. (B) The Bermuda Triangle is associated with mysterious phenomena and a recent book tries to explain them.
   (A) The Bermuda Triangle is associated with mysterious phenomena and a recent book tries to explain why.
9. (B) .l& used to know ways to get that sort of information, but he has forgotten them.
   (A) J,‘, used to know ways to get that sort of information, but he has forgotten how.
10. (B) A major earthquake is expected to occur in the San Andreas fault, and seismologists are trying to predict it.
    (A) A major earthquake is expected to occur in the San Andreas fault, and seismologists are trying to predict when.
11. (B) A piece is missing from this puzzle, and I can’t find it.
    (A) A piece is missing from this puzzle, and I can’t find where.
12. (B) The correct procedures for roping calves are tricky, but Ted is learning them.
    (A) The correct procedures for roping calves are tricky, but Ted is learning how.
13. (B) You will meet a tall, dark stranger on a bus, but I cannot predict the exact d&.
    (A) You will meet a tall, dark stranger on a bus, but I cannot predict exactly when.
14. (B) I am sure I used to know somebody who makes violins, but I can hardly recall him.
    (A) I am sure I used to know somebody who makes violins, but I can hardly recall who.
15. (B) There is a good place for cross country skiing in New Hampshire, but few people go there.
    (A) There is a good place for cross country skiing in New Hampshire, but few people know where.
16. (B) One of the spark plugs is dead, but I can’t find it.
    (A) One of the spark plugs is dead, but I can’t tell which.
17. (B) The suspect has just disappeared, and the detective can’t find him.
    (A) The suspect has just disappeared, and the detective can’t guess how.
18. (B) The main character is actually someone well known, and any reader should recognize him after the first chapter.
    (A) The main character is actually someone well known, and any reader should recognize who after the first chapter.
19. (B) It is thought that only two chemicals are attacking the ozone layer but scientists have not been able to discover them.
    (A) It is thought that only two chemicals are attacking the ozone layer, but scientists have not been able to discover how.
20. (B) The experiment didn’t work and the student’s problem is to redesign it.
    (A) The experiment didn’t work and the student’s problem is to explain why.
COMPARISON SET 4: FORCED CHOICE TASK
THERE(A)/WHERE(B) SENTENCES

1. (A) Is there an island in the Caribbean which has a free port?
   (B) Where is the island in the Caribbean which has a free port?
2. (A) There is a doctor who makes housecalls on Sundays.
   (B) ii: is the doctor who makes housecalls on Sundays.
3. (A) Is there a friend of Jane’s who lives in New York and always has room for guests?
   (B) Who is the friend of Jane’s who lives in New York and always has room for guests?
4. (A) There are two men who were playing Chopin on the piano in the lobby.
   (B) There are the two men who were playing Chopin on the piano in the lobby.
5. (A) Was there a man who whistled when the voluptuous blonde entered?
   (B) Who was the man who whistled when the voluptuous blonde entered?
6. (A) There were some students who said the exam was too easy.
   (B) There are the students who said the exam was too easy.
7. (A) Is there a marina on the Charles that has a gas pump?
   (B) Where is the marina on the Charles that has a gas pump?
8. (A) Is there a critic who says that Vonnegut is worth reading?
   (B) Who is the critic who says that Vonnegut is worth reading?
9. (A) Is there an inventor who designed a new burglar alarm?
   (B) Who is the inventor who designed a new burglar alarm?
10. (A) There was an old woman who lived in a shoe.
    (B) Who was the old woman who lived in a shoe?
11. (A) The tourists think that there is a place in Florida where Blackbeard buried his treasure.
    (B) The tourists think that this is the place in Florida where Blackbeard buried his treasure.
12. (A) “There is a most amazing stage in Washington,” the tour director said enthusiastically.
    (B) “Here is the most amazing stage in Washington,” the tour director said enthusiastically.
13. (A) According to the angry tenants, there are some rats that live behind the kitchen walls.
    (B) “Look,” said the angry tenants, “there are the rats that live behind the kitchen walls.”
14. (A) According to Scientific American, there is a proof that you never need more than four colors to draw a map.
    (B) Waving at the blackboard, the mathematician shouted, “There is the proof that you never need more than four colors to draw a map.”
15. (A) You may find it hard to believe, but there is a professor who is trying to collect solar energy with mushrooms.
    (B) If you want to meet someone interesting, over there is a professor who is trying to collect solar energy with mushrooms.
16. (A) The secretary believes there is a copy of the letter you thought was missing from the file.
    (B) The secretary said, “There is the copy of the letter you thought was missing from the file.”
17. (A) Looking worried, the professor said, “I doubt that there is a good book on relativity theory for laymen.”
    (B) Pointing to the shelf, the professor said, “There is a good book on relativity theory for laymen.”
18. (A) According to the cook, there is a kind of pie that all children like.
    (B) Pointing to the pantry, the cook remarked, “There is the kind of pie that all children like.”
19. (A) It smells like there is a rotten egg in the refrigerator.
    (B) There is the rotten egg which has been smelling up the refrigerator.
20. (A) My tennis coach says there is a new kind of racquet which is especially designed for beginners.
    (B) There is the new racquet which is especially designed for beginners.
COMPARISON SET 5: FORCED CHOICE TASK
NEGATIVE(A)/POSITIVE(B) QUANTIFIERS

1. (B) Everyone owns a copy of the book that is required for the Calculus course.
   (A) No one owns a copy of the book that is required for the Calculus course.
2. (B) Mary discovered that both of the settings on the blender were good for making frappes.
   (A) Mary discovered that neither of the settings on the blender were good for making frappes.
3. (B) I know that either of the pinch hitters will win this game.
   (A) I know that neither of the pinch hitters will win this game.
4. (B) There is some meat in the soup on the stove.
   (A) There is no meat in the soup on the stove.
5. (B) It is obvious that the boy has measles.
   (A) It is doubtful that the boy has measles.
6. (B) Either of these colors would look nice in this room.
   (A) Neither of these colors would look nice in this room.
7. (B) In the department store, I found everyone helpful.
   (A) In the department store, I found no one helpful.
8. (B) The reporter admitted that all of his sources had been paid for the information that they provided.
   (A) The reporter admitted that none of his sources had been paid for the information that they provided.
9. (B) The gourmet noticed that none of the snails were cooked to perfection.
   (A) The gourmet noticed that all of the snails were cooked to perfection.
10. (B) Both of these hints could be helpful in solving the problem.
    (A) Neither of these hints could be helpful in solving the problem.
11. (B) The foreman discovered that all of the work was done by the time the lunch whistle blew.
    (A) The foreman discovered that none of the work was done by the time the lunch whistle blew.
12. (B) Both of these spices would taste good in this stew.
    (A) None of these spices would taste good in this stew.
13. (B) There is always some money in Bertha’s checking account.
    (A) There is never any money in Bertha’s checking account.
COMPARISON SET 6: FORCED CHOICE TASK
INTENSIONAL(A)/TRANSITIVE(B) SENTENCES

1. (B) After the storm the Red Cross found snowmobiles to carry food to the elderly.
   (A) After the storm the Red Cross sought snowmobiles to carry food to the elderly.
2. (B) Because there was plenty of wind all of the sailors made the trip without having to use their engines.
   (A) Although there wasn’t much wind all of the sailors attempted the trip without using their engines.
3. (B) Since gasoline has become so expensive the commuters use a train to get to the city.
   (A) Since gasoline has become so expensive the commuters need a train to get to the city.
4. (B) For several weeks after her skiing accident Jane used a cane to help her walk.
   (A) For several weeks after her skiing accident Jane needed a cane to help her walk.
5. (B) The amateur naturalist suggested a new method of catching moths.
   (A) The amateur naturalist discovered a new method of catching moths.
6. (B) Everyone on the nominating committee asked one of their friends to be a candidate for president.
   (A) Everyone on the nominating committee wanted one of their friends to be a candidate for president.
7. (B) After the heavy rains the town council repeated the news that the dam was safe.
   (A) After the heavy rains the town council hoped for the news that the dam was safe.
8. (B) The board of law review students attended a conference on labor-management relations.
   (A) The board of law review students suggested a conference on labor-management relations.
9. (B) The mechanic assembled the parts to fix the car.
   (A) The mechanic ordered the parts to fix the car.
10. (B) In the last seconds of the game the quarterback fumbled the ball on the one-yard line.
    (A) In the last seconds of the game the quarterback expected the ball on the one-yard line.
11. (B) In cooking class each chef tested the sauce for the roast duck.
    (A) In cooking class each chef planned a sauce for the roast duck.
12. (B) The engineers found rock near the ridge where the road was to be laid.
    (A) The engineers predicted rock near the ridge where the road was to be laid.
13. (B) The anxious housewife had a lucky streak at Wednesday night bingo.
    (A) The anxious housewife needs a lucky streak at Wednesday night bingo.
14. (B) The tow truck pulled the car out of the snowdrift several days after the blizzard.
    (A) The tow truck looked for the car in the snowdrift several days after the blizzard.
15. (B) The workman used a chain saw to cut the logs for the fireplace.
    (A) The workman wanted a chain saw to cut the logs for the fireplace.
16. (B) The weary world traveller visited Paris in the springtime.
    (A) The weary world traveller imagined Paris in the springtime.
17. (B) After thirty years of service the postman enjoyed retirement with a special bonus.
    (A) After thirty years of service the postman requested retirement with a special bonus.
18. (B) The seniors attended a graduation party to celebrate the end of high school.
    (A) The seniors demanded a graduation party to celebrate the end of high school.
19. (B) The engineers at MIT have built a solar heating system that is more efficient than earlier models.
    (A) The engineers at MIT would like a solar heating system that is more efficient than earlier models.
20. (B) In the story of Uncle Whiskers, the cat ate rabbits with obvious enjoyment.
    (A) In the story of Uncle Whiskers, the cat hunted rabbits with obvious enjoyment.

COMPARISON SET 7: FORCED CHOICE TASK
CAUSATIVE(A)/TRANSITIVE(B) SENTENCES
1. (B) While cleaning up the old house the workers found some paint in the basement.
   (A) While cleaning up the old house the workers spilled some paint in the basement.
2. (B) When the police reached the student demonstration, they told the bystanders to go home.
   (A) When the police stopped the student demonstration, they told the bystanders to go home.
3. (B) Nobody noticed that during the poker game the gambler used a card from his sleeve.
   (A) Nobody noticed that during the poker game the gambler dropped a card from his sleeve.
4. (B) The cautious hunters noticed the cougar that was hiding in the tree.
   (A) The cautious hunters killed the cougar that was hiding in the tree.
5. (B) In the middle of the quiet afternoon the students discovered a fire in the chemistry laboratory.
   (A) In the middle of a quiet afternoon the students started a fire in the chemistry laboratory.
6. (B) Against the recommendation of his advisors the mayor supported the project for urban redevelopment.
   (A) Against the recommendation of his advisors the mayor ended the project for urban redevelopment.
7. (B) The Sunday paper repeated the story that the senator was about to resign.
   (A) The Sunday paper spread the story that the senator was about to resign.
8. (B) The retiring chairman attended the meeting of the Board of Trustees for the last time.
   (A) The retiring chairman convened the meeting of the Board of Trustees for the last time.
9. (B) Before going on camera, the chef assembled most of the ingredients needed for the dish.
   (A) Before going on camera, the chef combined most of the ingredients needed for the dish.
10. (B) With only fifteen seconds left, the fullback fumbled the football on the one-yard line.
    (A) With only fifteen seconds left, the fullback put the football on the one-yard line.
11. (B) After the meat was sliced the chef tried the sauce.
    (A) After the meat was sliced the chef warmed the sauce.
12. (B) The contractors blasted the channel for the Panama Canal with dynamite.
    (A) The contractors made the channel for the Panama Canal with dynamite.
13. (B) While working in the shop the carpenter hit the nails with a hammer.
    (A) While working in the shop the carpenter bent the nails with a hammer.
14. (B) In setting up the tent the scouts pulled the lines on the center pole.
    (A) In setting up the tent the scouts raised the lines on the center pole.
15. (B) Even in early colonial settlements they often used glass to make windows.
    (A) Even in early Colonial settlements they melted glass to make windows.
16. (B) The maid left the room so the guests could use it.
    (A) The maid cleaned the room so the guests could use it.
17. (B) The TV repairmen checked the color of the picture tube.
    (A) The TV repairmen changed the color of the picture tube.
18. (B) Before ironing them the housewife examined the clothes that were in the basket.
    (A) Before ironing them the housewife folded the clothes that were in the basket.
19. (B) The dressmaker sold the coat that the President’s wife wore.
    (A) The dressmaker made the coat that the President’s wife wore.
20. (B) Nimbly the squirrel ate the nut on a high branch.
    (A) Nimbly the squirrel cracked the nut on a high branch.

Procedures and Analysis
In this task, unlike the ratings task, a subject is presented with both versions of a test sentence and then required to choose which of the two indicates a closer relation between the members of a specified word pair. Each subject saw a given sentence pair only once and made a single judgment involving one word pair. Subjects were also required to indicate their confidence in their judgment by marking a 5-point confidence scale printed on the same form as the stimulus sentences. The presentation order of the sentences was randomly varied, with each subject receiving a new randomization.

In the ratings experiments, the item-based analyses were somewhat fragile because we did not have large numbers of sentence pairs representing each of the sentence types. For the forced-choice experiments, we improved that circumstance by increasing the number of sentence pairs in each structural type. However, because each subject saw any given sentence pair only once for a given word pair, the number of observations on each word pair condition from any given subject is small and confounded with sentence pair. Thus, while we were able to contrast experimental and control word pairs within subjects for a given test sentence pair in the ratings task, that same comparison is between subjects for the forced-choice task.

We have reported subject-based and item-based analyses for the forced-choice experiments. The scores used are, for items, the differences in the frequency with which the A and B versions of a given sentence pair were chosen, and their associated confidence levels. For subjects, the scores used are the differences in the frequency with which A and B versions were chosen, summing across the different sentences for which a given structural choice was presented to a subject, and associated confidence levels.

Results
There were two experiments run. One, however, is only preliminary; it used only three sentence types and did not require confidence judgments. All the results reported below, save one, are from the principal experiment, which included Sentence Sets 1, 2, 3, 4, 6, and 7. The results for Sentence Set 5 are from the preliminary experiment. The outcomes for Sentence Sets 1, 2, and 3 are reported in Tables 10, 11 and 12 (these Sets are of the same structural types as Sets 1, 2, and 3 in the ratings experiments). There are two quite consistent patterns. First, the frequency of B choices significantly exceeds the frequency of A choices for all
Table 10.
Sentence Set 1, Forced choice and confidence ratings results. Tests for complement constructions, common clause membership (B versions, VP complements) versus disjoint membership (A versions, NP complements).

Example sentence pair with target words in italics (labeled control, control, subject, object in the original):
A. Even though a bad storm was predicted, the captain expected the passengers to remain calm.
B. Even though a bad storm was predicted, the captain persuaded the passengers to remain calm.

Analysis type, with results for Items (n = 20) and Subjects (n = 80):

Forced choice mean B-A differences for primary word pairs (subject-object)
  Items: 6.650, t = 3.754 (df = 19), p < 0.001
  Subjects: 1.662, t = 6.496 (df = 79), p < 0.001; min F’ = 10.56, p < 0.01

Forced choice mean B-A differences for primary control pairs (control-control)
  Items: 4.100, t = 2.127 (df = 19), p < 0.023
  Subjects: 1.025, t = 4.054 (df = 79), p < 0.001; min F’ ns

Primary pairs versus controls
  Items: t = 2.191 (df = 19), p < 0.023
  Subjects: t = 2.213 (df = 79), p < 0.015; min F’ ns

Mean difference between confidence ratings for primary pairs and control pairs when B versions were selected (SO_B - CC_B)
  Items: 0.796, t = 7.623 (df = 19), p < 0.001
  Subjects: 0.316, t = 2.635 (df = 79), p < 0.005; min F’ = 6.202, p < 0.05

Mean difference between confidence ratings for B choices of primary pairs and ratings for A and B choices combined on control pairs
  Items: 0.643, t = 5.794 (df = 19), p < 0.001
  Subjects: 0.475, t = 5.294 (df = 79), p < 0.001; min F’ = 15.274, p < 0.01
Table 11. Sentence Set 2, Forced choice and confidence ratings. Tests for predicate adjective constructions, common clause membership (B versions) versus disjoint membership (A versions)

Example sentence pair (target words, italicized in the original, are labeled control, control, subject, adj.):
A. Especially after losing his ship, a captain is easy to retire on a pension.
B. Especially after losing his ship, a captain is content to retire on a pension.

Analysis Type, with results for Items (n = 20) and Subjects (n = 80):

forced choice mean B-A differences for primary word pairs (subject-adjective)
    Items: 6.000, t = 2.739 (df = 19), p < 0.006
    Subjects: 1.500, t = 5.764 (df = 79), p < 0.001
    min F’ = 6.12, p < 0.05

forced choice mean B-A differences for primary control pairs (control-control)
    Items: 4.250, t = 2.103 (df = 19), p < 0.025
    Subjects: 1.063, t = 4.937 (df = 79), p < 0.001
    min F’ ns

primary pairs versus controls
    Items: t = 1.984 (df = 19), p [value missing]
    Subjects: t = 1.324 (df = 79), p < 0.094
    min F’ ns

mean difference between confidence ratings for primary pairs and control pairs when B versions were selected (SAB - CCB)
    Items: 0.710, t = 4.076 (df = 19), p < 0.001
    Subjects: 0.154, t = 1.223 (df = 79), p < 0.112
    min F’ ns

mean difference between confidence ratings for B choices of primary pairs and ratings for A and B choices combined on control pairs
    Items: 0.757, t = 6.493 (df = 19), p < 0.001
    Subjects: 0.367, t = 3.504 (df = 79), p < 0.001
    min F’ = 9.509, p < 0.01
Table 12. Sentence Set 3, Forced choice and confidence ratings. Tests for sluicing constructions, common clause membership (B versions, direct object) versus disjoint membership (A versions, sluiced object)

Example sentence pair (target words, italicized in the original, are labeled control, control, subject, object):
A. According to the gossip columns the heiress married someone, but none of the reporters knew who.
B. According to the gossip columns the heiress married someone, but none of the reporters know him.

Analysis Type, with results for Items (n = 20) and Subjects (n = 80):

forced choice mean B-A differences for primary word pairs (subject-object)
    Items: 3.400, t = 1.919 (df = 19), p < 0.035
    Subjects: 0.850, t = 3.653 (df = 79), p < 0.001
    min F’ ns

forced choice mean B-A differences for primary control pairs (control-control)
    Items: 3.200, t = 1.637 (df = 19), p < 0.059
    Subjects: 0.800, t = 3.043 (df = 79), p < 0.001
    min F’ ns

primary pairs versus controls
    Items: t = 0.138 (df = 19), p < 0.486
    Subjects: t = 0.163 (df = 79), p < 0.436
    min F’ ns

mean difference between confidence ratings for primary pairs and control pairs when B versions were selected (SOB - CCB)
    Items: 0.352, t = 3.001 (df = 19), p < 0.003
    Subjects: -0.158, t = -1.143 (df = 79), p < 0.139
    min F’ ns

mean difference between confidence ratings for B choices of primary pairs and ratings for A and B choices combined on control pairs
    Items: 0.467, t = 3.966 (df = 19), p < 0.001
    Subjects: 0.235, t = 2.075 (df = 79), p < 0.020
    min F’ = 3.380, p < 0.06

three structural types when the primary (subject, object) word pairs are considered; this is true for both the subject and the item based analyses. In Sentence Sets 1 and 2, though not in Sentence Set 3, these comparisons are also significant by min F’ test. Rather surprisingly and problematically, however, the same comparisons are, for the most part, also significant for tests based on the control pairs. The differences here are not as large as those of the experimental pairs, and they are not significant by min F’, but with one exception, they are significant on both subject and item analyses. Given this effect in the control pairs, it would not be surprising if the direct comparisons of experimental and control pairs failed. However, for Sets 1 and 2, the primary test pair differences significantly exceed those of control pairs on item and subject analyses. If one collapses across the three sentence types and makes the
direct comparison, the differences are significant for both items and subjects, though not for min F’. There seem to be two possible accounts of these circumstances: either subjects are using as a decision base some feature of structural contrast other than the one we have focussed on, or the results in the control pairs reflect a halo effect from the manipulated contrast induced by the forced choice task. The latter conclusion seems more likely on several grounds. First, and most obvious, there is the demonstrated specific effect of the clausal variable on these same sentence types in the ratings task. Second, there is the indication of a significantly stronger effect for the experimental pairs than for the control pairs; that is, the words directly involved in the putatively effective structural relations show larger effects than those not directly involved. If some alternate basis of choice were, in fact, operative, that difference would be unexplained. Further, and most persuasive, is the evidence that the experimental and control pairs show different effects for the confidence ratings. We report two confidence results in the tables for the forced choice task. Both are tests of differences between confidence levels associated with experimental pairs and those associated with control pairs. The former choices are, by hypothesis, motivated by direct involvement of the test words in the structural contrast between sentence versions, the latter, at best, indirectly so. The first test is for only B choices of both experimentals and controls; the second pools A and B choices for controls. This pooling reflects the fact that A/B choices in controls are arbitrary. In either case, however, focus on B choices tests differences for the preferred responses. Note: though we do not include the results in the tables, direct contrasts of A and B choices for experimentals were made.
As one might expect, the less frequent A choices were also made with lower confidence for experimentals, though controls did not differ so consistently. Tables 10, 11 and 12 include results for the confidence ratings for Comparison Sets 1-3. The differences are uniformly higher for experimental pair decisions, and significantly so, than are the comparable decisions involving control pairs. Again, if some alternate basis of decision is postulated this is a puzzle. On the grounds that the same basis of decision is being used by subjects in both cases, the outcome is as one might expect, however. The words directly involved in the relevant structural contrast are judged with greater confidence than those not directly involved.

Table 13. Sentence Set 4, Forced choice and confidence ratings. Tests for existential ‘there’ constructions, common clause membership (B versions, pronoun subject) versus disjoint membership (A versions, existential ‘there’ subject)

Example sentence pair (target words, italicized in the original, are labeled subject, object, control, control):
A. Is there an island in the Caribbean which has a free port?
B. Where is the island in the Caribbean which has a free port?

Analysis Type, with results for Items (n = 20) and Subjects (n = 80):

forced choice mean B-A differences for primary word pairs (subject-object)
    Items: 6.800, t = 3.760 (df = 19), p < 0.001
    Subjects: 1.700, t = 5.748 (df = 79), p < 0.001
    min F’ = 9.90, p < 0.01

forced choice mean B-A differences for primary control pairs (control-control)
    Items: 3.800, t = 1.750 (df = 19), p < 0.048
    Subjects: 0.950, t = 3.338 (df = 79), p < 0.001
    min F’ ns

primary pairs versus controls
    Items: t = 3.099 (df = 19), p < 0.004
    Subjects: t = 1.812 (df = 79), p < 0.037
    min F’ ns

mean difference between confidence ratings for primary pairs and control pairs when B versions were selected (SOB - CCB)
    Items: 0.674, t = 5.456 (df = 19), p < 0.001
    Subjects: 0.153, t = 1.038 (df = 79), p < 0.151
    min F’ ns

mean difference between confidence ratings for B choices of primary pairs and ratings for A and B choices combined on control pairs
    Items: 0.739, t = 7.081 (df = 19), p < 0.001
    Subjects: 0.416, t = 3.625 (df = 19), p < 0.001
    min F’ = 10.412, p < 0.01

To these results we may add those for Sentence Set 4, the contrast of transformationally introduced “there” with locative “there” or other pronouns present in the underlying structure. Again we see the pattern of Sets 1-3. Subject and item based analysis for the experimental pairs show significant A-B differences for both forced choice and confidence ratings; so too do the control pairs, though again they are smaller differences. Direct comparison shows significant item and subject based differences between experimental and control pairs; min F’ is not significant for forced choice judgments, but is so for confidence ratings. These results all involve sentence types which were also tested in the ratings procedures, and essentially the same pattern of results has emerged: word pairs which share underlying clause membership are judged to be more
closely related than comparably surface-located words which do not co-occur in an underlying clause. The forced-choice results for contrast of experimental with control word pairs are in some measure less clear than for the ratings case since there are significant effects observed for the control pairs. Nonetheless, as we observed above, the effects for the words involved in the structural manipulation are both larger and associated with higher confidence levels. Thus, taken together, the results for the ratings and the forced choice procedures seem to indicate an effect dependent upon clausal membership of words in a sentence.

Before turning to the results for the causative sentences in the forced-choice procedure, we will consider the sentences of Sets 5 and 6. These were intended to test for effects of semantic relatedness in cases where such relations are not directly represented by differences of syntactic structure. The results for Sentence Set 5, it should be recalled, are preliminary; tests were for the primary pairs only and did not include confidence ratings. The negative quantifier versions of the sentences did yield a lower judgment of relatedness for the test pairs than did the positive quantifier versions. We have no basis for evaluating the specificity of this effect to the test pair itself, however. (This sentence type is the subject of further investigation now in progress.) The results for the intensional verb sentences of Set 6 are reported in Table 14. Again we find significant effects for the test of differences between A and B versions. This holds for both the subject and the item based analyses. And, again as with many of the tests of Sets 1-4, there are significant differences for the control pair, albeit of lesser magnitude than for the primary pair. The direct comparison of experimental and control pairs is not significant. The item based tests for the confidence ratings are significant, however, though the subject based tests are not.
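Where item and subject analyses disagree in this way, the min F’ entries in the tables give the combined verdict. As a rough illustration, and assuming the standard definition of min F’ (the text does not spell out the formula, though the tabled values are consistent with it), the statistic can be recovered from the two t values, since a t with df degrees of freedom squares to an F with (1, df). The function and argument names below are hypothetical, chosen only for this sketch.

```python
def min_f_prime(t_items, df_items, t_subjects, df_subjects):
    """Return (min F', approximate denominator df) from item and subject t tests."""
    # A t statistic squared is an F statistic with 1 numerator degree of freedom.
    f_items = t_items ** 2
    f_subjects = t_subjects ** 2
    # min F' is a lower bound on the quasi-F ratio that treats both
    # subjects and items as random effects.
    value = (f_items * f_subjects) / (f_items + f_subjects)
    # Usual approximation for the denominator degrees of freedom: each
    # squared F is divided by the error df of the other analysis.
    denominator_df = (f_items + f_subjects) ** 2 / (
        f_items ** 2 / df_subjects + f_subjects ** 2 / df_items
    )
    return value, denominator_df


# First row of Table 10: items t = 3.754 (df = 19), subjects t = 6.496 (df = 79).
value, denominator_df = min_f_prime(3.754, 19, 6.496, 79)
```

Applied to the first row of Table 10, this gives a value of roughly 10.56, in line with the tabled min F’. Because min F’ can never exceed the smaller of the two F values, a row can be significant by items and by subjects and still fail by min F’.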
We thus have some indication, though not a particularly strong one, that the procedures are sensitive to semantic relations as well as to syntactic relations like clausal membership. It seems entirely reasonable to expect that if the causative sentences differ from their transitive counterparts in the test sets, then our procedures would be able to detect that difference whether it be construed in semantic or syntactic terms. The results for the causative sentences of Set 7 are reported in Table 15; they are simple to describe for their bearing on the causative hypothesis: there is no indication of a difference in the A and B versions on the primary test pairs. The control pairs show larger differences than the primary pairs, and this seems to account for the significant values which do appear in the table. Note that there are twenty test sentence pairs, and recall that the forced choice procedures seem to be even more sensitive to differences in
Table 14. Sentence Set 6, Forced choice and confidence ratings. Tests for intensional verbs, common clause membership (B versions, transitive verb) versus disjoint membership (A versions, intensional verb)

Example sentence pair (target words, italicized in the original, are labeled control, control, subject, object):
A. Since gasoline has become so expensive, the commuters need a train to get to the city.
B. Since gasoline has become so expensive, the commuters use a train to get to the city.

Analysis Type, with results for Items (n = 20) and Subjects (n = 80):

forced choice mean B-A differences for primary word pairs (subject-object)
    Items: 5.300, t = 2.230 (df = 19), p < 0.014
    Subjects: 1.325, t = 5.296 (df = 79), p < 0.001
    min F’ = 4.22, p < 0.05

forced choice mean B-A differences for primary control pairs (control-control)
    Items: 3.700, t = 1.637 (df = 19), p < 0.045
    Subjects: 0.925, t = 3.043 (df = 79), p < 0.001
    min F’ ns

primary pairs versus controls
    Items: t = 0.782 (df = 19), p < 0.240
    Subjects: t = 1.210 (df = 79), p < 0.115
    min F’ ns

mean difference between confidence ratings for primary pairs and control pairs when B versions were selected (SOB - CCB)
    Items: 0.335, t = 2.338 (df = 19), p < 0.015
    Subjects: -0.077, t = -0.569 (df = 79), p < 0.266
    min F’ ns

mean difference between confidence ratings for B choices of primary pairs and ratings for A and B choices combined on control pairs
    Items: 0.284, t = 2.151 (df = 19), p < 0.022
    Subjects: 0.039, t = 0.358 (df = 79), p < 0.361
    min F’ ns
the sentences from the validating sets than were the ratings procedures. Thus, it seems fair to conclude that if there were any difference of the sort which the decompositional analysis of the causatives claims, it could have been detected in the forced choice tests.

Conclusion

Any number of methodological issues remain unsettled. Principal among these are perhaps the issue of the specificity of the subjects’ response to the
Table 15. Sentence Set 7, Forced choice and confidence ratings results. Tests for causatives, common clause membership (B versions, transitive verb) versus disjoint membership (A versions, causative verbs).

Example sentence pair (target words, italicized in the original, are labeled control, control, subject, object):
A. While cleaning up the old house, the workers spilled some paint in the basement.
B. While cleaning up the old house, the workers found some paint in the basement.

Analysis Type, with results for Items (n = 20) and Subjects (n = 80):

forced choice mean B-A differences for primary word pairs (subject-object)
    Items: 0.100, t = 0.052 (df = 19), p < 0.419
    Subjects: 0.025, t = 0.087 (df = 79), p < 0.465
    min F’ ns

forced choice mean B-A differences for primary control pairs (control-control)
    Items: 1.850, t = 0.818 (df = 19), p < 0.212
    Subjects: 0.462, t = 1.765 (df = 79), p < 0.045
    min F’ ns

primary pairs versus controls
    Items: t = -1.412 (df = 19), p < 0.076
    Subjects: t = -1.092 (df = 79), p < 0.139
    min F’ ns

mean difference between confidence ratings for primary pairs and control pairs when B versions were selected (SOB - CCB)
    Items: -0.073, t = -0.263 (df = 19), p < 0.397
    Subjects: 0.112, t = 0.711 (df = 79), p < 0.235
    min F’ ns

mean difference between confidence ratings for B choices of primary pairs and ratings for A and B choices combined on control pairs
    Items: 0.080, t = 0.512 (df = 19), p < 0.307
    Subjects: 0.318, t = 2.443 (df = 79), p < 0.009
    min F’ ns
manipulated structural variables and that of the type of relations, semantic or syntactic (or both), which in fact govern the responses of subjects in these experiments. We can say of the first point that there are some good indications that the manipulations of clausal structure are the focus of subjects’ judgments for the sentences of Sets 1-4. But what mechanism yields the ‘halo’ effects for control pairs observed in the forced choice procedure is not apparent. It is not strange that, faced with the necessity of choice, a subject casts about for a basis of choice and actually finds one; it is surprising, however,
that they so often seem to settle upon the quite abstract differences of clausal structure which are not in any way directly implicated in the relations of the particular word pair they must judge. The results for control word pairs thus seem to suggest that this structural variable is more salient than might have been supposed. Of the second issue, there is no doubt that the facts of clausal structure that we discussed in syntactic terms might be recast as claims about semantic relations. Whether they should be or not, we haven’t a clue. Finally, we may reiterate the principal point of our exercise: If lexical decomposition is a fact, and a psychological one at that, it is a fact which should yield structural consequences of the sort we have explored, with some success, in several types of sentences. That our examination of causative sentences was the sole consistent occasion for null results strongly indicates that, for causatives at least, lexical decomposition is not a psychological fact.