THE NATURE AND ORIGINS OF MATHEMATICAL SKILLS
ADVANCES IN PSYCHOLOGY 91

Editors: G. E. STELMACH, P. A. VROON

NORTH-HOLLAND: AMSTERDAM, LONDON, NEW YORK, TOKYO
THE NATURE AND ORIGINS OF MATHEMATICAL SKILLS
Edited by
Jamie I.D. Campbell
Department of Psychology
University of Saskatchewan
Saskatoon, Canada
1992
NORTH-HOLLAND: AMSTERDAM, LONDON, NEW YORK, TOKYO
NORTH-HOLLAND / ELSEVIER SCIENCE PUBLISHERS B.V., Sara Burgerhartstraat 25, P.O. Box 211, 1000 AE Amsterdam, The Netherlands
Library of Congress Cataloging-in-Publication Data

The Nature and origins of mathematical skills / edited by Jamie I.D. Campbell.
p. cm. -- (Advances in psychology ; 91)
Includes bibliographical references and index.
ISBN 0-444-89014-9
1. Mathematical ability. I. Campbell, Jamie I. D. II. Series: Advances in psychology (Amsterdam, Netherlands) ; 91.
QA11.N37 1992
370.15'651--dc20    92-17326 CIP
ISBN: 0 444 89014 9

© 1992 ELSEVIER SCIENCE PUBLISHERS B.V. All rights reserved. No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, without the prior written permission of the publisher, Elsevier Science Publishers B.V., Copyright & Permissions Department, P.O. Box 521, 1000 AM Amsterdam, The Netherlands.

Special regulations for readers in the U.S.A. - This publication has been registered with the Copyright Clearance Center Inc. (CCC), Salem, Massachusetts. Information can be obtained from the CCC about conditions under which photocopies of parts of this publication may be made in the U.S.A. All other copyright questions, including photocopying outside of the U.S.A., should be referred to the copyright owner, Elsevier Science Publishers B.V., unless otherwise specified.

No responsibility is assumed by the publisher for any injury and/or damage to persons or property as a matter of products liability, negligence or otherwise, or from any use or operation of any methods, products, instructions or ideas contained in the material herein.

This book is printed on acid-free paper.

Printed in The Netherlands
List of Contents

Introduction ... vii
Acknowledgements ... x

Part 1: Origins of Mathematical Abilities

Chapter 1. What a Number Is: Mathematical Foundations and Developing Number Concepts (Kevin F. Miller) ... 3
Chapter 2. Relationships Children Construct Among English Number Words, Multiunit Base-Ten Blocks, and Written Multidigit Addition (Karen C. Fuson, Judith L. Fraivillig, & Birch H. Burghardt) ... 39
Chapter 3. Understanding Elementary Mathematics (Jeffrey Bisanz & Jo-Anne LeFevre) ... 113
Chapter 4. Mathematical Misunderstandings: Qualitative Reasoning About Quantitative Problems (Richard E. Mayer, Anne Bovenmyer Lewis, & Mary Hegarty) ... 137
Chapter 5. The Role of Expertise in Solving Arithmetic and Algebra Word Problems by Analogy (Laura R. Novick) ... 155
Chapter 6. The Development of Skill in Mental Arithmetic: An Individual Differences Perspective (Keith F. Widaman & Todd D. Little) ... 189

Part 2: Numerical Cognition: Representation, Process, and Architecture

Chapter 7. A Theory of Enumeration That Grows Out of a General Theory of Vision: Subitizing, Counting, and FINSTs (Lana M. Trick) ... 257
Chapter 8. Working Memory, Automaticity, and Problem Difficulty (Mark H. Ashcraft, Rick D. Donley, Margaret A. Halas, & Mary Vakali) ... 301
Chapter 9. Representation and Retrieval of Arithmetic Facts: A Network-Interference Model and Simulation (Jamie I. D. Campbell & Michael Oliphant) ... 331
Chapter 10. MATHNET: Preliminary Results From a Distributed Model of Arithmetic Fact Retrieval (Michael McCloskey & A. Margrethe Lindemann) ... 365
Chapter 11. Inhibitory Mechanisms in Normal and Dysfunctional Number Processing (James M. Clark) ... 411
Chapter 12. Cognitive Number Processing: An Encoding-Complex Perspective (Jamie I. D. Campbell & James M. Clark) ... 457
Chapter 13. The Functional Architecture of Numerical Processing Mechanisms: Defending the Modular Model (Michael McCloskey, Paul Macaruso, & Tony Whetstone) ... 493
Chapter 14. In Defense of the Encoding-Complex Approach: Reply to McCloskey, Macaruso, & Whetstone (Jamie I. D. Campbell) ... 539

Author Index ... 551
Subject Index ... 567
Introduction

The last two decades have seen an explosion of theoretical and empirical analyses of the cognitive processes underlying the acquisition and application of mathematical skills. This research has been motivated both by the practical need for more effective methods for teaching mathematics, and by the realization that number skills provide an exceptionally rich experimental domain for studying cognitive processes. The range of theoretical questions addressed extends from the nature of very low-level perceptual and cognitive mechanisms (e.g., perceptual grouping processes in counting; the elementary feature structure of number representations), to more general theoretical issues of long-standing interest to educational, cognitive, and neuro-cognitive psychologists (e.g., the influences of biological, cultural, and pedagogical factors on learning and performance; the relation between linguistic and conceptual knowledge; the functional architecture of cognition). Although research into such questions is not, of course, restricted to the domain of numerical skills, many researchers have been attracted to cognitive number processing because the task environment is often well defined and relatively constrained. This makes it tractable to develop complete and detailed theoretical models. Furthermore, although the processes underlying even simple number tasks are often subtle and complex, compelling models of these processes can be derived from detailed analysis of error and solution-time patterns, through systematic manipulation of stimulus characteristics, and through analysis of verbal protocols. Much has also been learned from analysis of individual differences in normal children and adults, and from the complex ways that numerical skills can break down with brain injury.

This volume provides a broad overview of current experimental research on numerical cognition and the acquisition of mathematical skills. The individual chapters, however, provide in-depth analysis of specific issues, methodologies, phenomena, and theory. The book is divided into two parts. In Part 1, Origins of Mathematical Abilities, the focus is on the acquisition and development of numerical skills. Kevin Miller (Chapter 1) examines the relation between the formal and psychological origins of basic number concepts, and presents experimental analyses of cross-cultural differences in children's representation of number. Karen Fuson, Judith Fraivillig, and Birch Burghardt (Chapter 2) follow with a detailed analysis of young children's learning of base-ten structure, examining how the coordination of multiple representational schemes contributes to the learning process. Jeffrey Bisanz and Jo-Anne LeFevre (Chapter 3) develop
a new operational approach to the theoretical problem of understanding in basic mathematics, and, in Chapter 4, Richard Mayer, Anne Lewis, and Mary Hegarty show how specific forms of mathematical misunderstanding originate in qualitative reasoning skills rather than in quantitative skills. In Chapter 5, Laura Novick examines the relation between expertise and analogical transfer in mathematical problem solving, and compares analogical transfer in child and adult populations. Keith Widaman and Todd Little (Chapter 6) conclude Part 1 with a detailed, componential model of the development of cognitive addition skills, including an analysis of the sources of individual differences in such skills.

Part 2 of the book, Numerical Cognition: Representation, Process, and Architecture, contains theoretical analyses of the information-processing basis of numerical skills. Specifically, this research attempts to penetrate the microstructure of numerical cognition in order to identify the basic mechanisms of perception, attention, and memory that support number skills. In Chapter 7, Lana Trick examines the role played by elementary visuo-attentional routines in subitizing and counting. Mark Ashcraft, Rick Donley, Margaret Halas, and Mary Vakali (Chapter 8) present new findings on the effects of memory load and priming in mental arithmetic and examine the relation between these phenomena and traditional research on working memory and attention. Next, Jamie Campbell and Michael Oliphant (Chapter 9) describe a detailed computer simulation of retrieval of simple addition and multiplication facts. This model, which assumes a localist representational scheme, can be contrasted with the distributed model of multiplication fact retrieval presented in Chapter 10 by Michael McCloskey and Margrethe Lindemann. James Clark (Chapter 11) continues the theoretical analysis of elementary processes in cognitive arithmetic with an in-depth examination of the possible role of inhibitory mechanisms.

The final three chapters in the book continue the ongoing debate regarding two fundamentally different views of the cognitive architecture underlying numerical skills: Is number processing composed of independent subsystems specialized for comprehension, calculation, and production (the modular view), or, instead, are these seemingly distinct functions more accurately viewed as intimately intertwined skills that utilize common resources (the encoding-complex view)? In Chapter 12, Jamie Campbell and James Clark present new experimental evidence for the encoding-complex approach to numerical cognition. Michael McCloskey, Paul Macaruso, and Tony Whetstone follow in Chapter 13 with a critique of the encoding-complex view and a defense of their alternative modular architecture. In the final chapter, Jamie Campbell replies with a defense of the encoding-complex position.
The range of theoretical and methodological orientations represented in the volume captures both the diversity and coherence of contemporary research into mathematical skills. The research of educational psychologists, cognitive psychologists, and cognitive neuropsychologists mutually informs and reinforces theoretical developments within each area. The multidisciplinary interest in mathematics skills reflects the pervasiveness and importance of mathematics in education, technology, and science, and also indicates that questions about mathematical competence address important issues in diverse areas of psychology and cognitive science.

Jamie Campbell
April, 1992
Acknowledgements

Producing a volume of edited chapters such as this requires the efforts and cooperation of a large number of people. I want especially to thank the authors for the very fine quality of the work produced and for their sincere efforts to meet the various deadlines with which they were faced. The excellence of the work presented here also owes a great debt to the many people who provided careful, thorough, and timely reviews of chapters. These people are acknowledged individually at the end of each chapter, but their valuable contribution is evident throughout the volume. The excellent technical quality of the book is due to the meticulous and ingenious efforts of James Sawchyn. Finally, I want specially to thank Valerie Thompson for much valuable advice on both scientific and practical matters related to the book.

Jamie Campbell
April, 1992
Part One
Origins of Mathematical Abilities
The Nature and Origins of Mathematical Skills
J.I.D. Campbell (Editor)
© 1992 Elsevier Science Publishers B.V. All rights reserved.
Chapter 1

WHAT A NUMBER IS: MATHEMATICAL FOUNDATIONS AND DEVELOPING NUMBER CONCEPTS
Kevin F. Miller
University of Illinois at Urbana-Champaign
"Why do we call something a 'number'? Well, perhaps because it has a direct relationship with several things that have hitherto been called number; and this can be said to give it an indirect relationship to other things we call the same name. And we extend our concept of number as in spinning a thread we twist fiber on fiber. And the strength of the thread does not reside in the fact that some one fiber runs through its whole length, but in the overlapping of many fibers." Wittgenstein (1958/1953, p.32)
Overview

Descriptions of children's mathematical development aimed at teachers and parents contain, with regularity, assertions such as the following (Copeland, 1984, p.12): "Children at different stages cannot learn the same content. They cannot learn about number, for example, until they reach the concrete operational stage." Statements such as this ignore a growing body of research demonstrating impressive early numerical competence on the part of young children, much of which has been summarized in two books published ten years apart (Gelman & Gallistel, 1978; Fuson, 1988). I will discuss the philosophical foundation of the idea that there is a unitary number concept children might master at a specific developmental moment. Evidence for early mathematical knowledge in preschoolers obviously challenges this notion. Equally problematic for the view that number development rests on a single number concept is evidence that children's concepts of number continue to evolve well after the point they master tasks such as conservation of number (Piaget, 1965/1941) that might indicate that children understand what numbers are.
Background

Logical foundations for number: Is there a "number concept" to have?

This chapter is organized into three sections. The first section reviews philosophical descriptions of logical foundations for number, which have had a profound effect on psychological research on mathematical development. Particularly influential have been Peano's (1973/1889) derivation of numbers from an ordinal operation, and the demonstration of Frege (1968/1884) and Russell (1919) that numbers may be generated by a process of item-to-item correspondence between the members of sets. Several psychological researchers (Piaget, 1965/1941; Brainerd, 1973) have proposed that what is logically primitive is psychologically primitive as well, suggesting that children's number concepts are the consequence of their understanding of logical operations such as ordination or class-inclusion. Discussion of the foundational theories proposed by Peano, Russell and Frege will serve as a background for consideration of the psychological theory of Piaget, who attempted to establish a psychological foundation for mathematics to parallel its logical foundations.

A critique of foundational theories, and an alternative
The second section of the chapter presents a critique of foundational theories of number. The belief that number rests on some logical foundation has been questioned within philosophy, particularly by Wittgenstein (1956, 1958, 1974). Wittgenstein suggested that mathematics requires no logical foundation at all, and could instead be characterized as a "motley collection of language-games." As a guide for psychological research, Wittgenstein's theory implies that number need not stem from a single cognitive source, and that a premium should be placed on describing the existing numerical behavior of children. The concept that mathematics is a collection of language-games suggests that it may prove useful to characterize number psychologically as a set of overlapping games, with rules and requirements to be determined empirically.

Developing number concepts: Effects of expertise, language, and orthography

The final section of the chapter describes three studies on children's developing representations of numbers, focusing on changes beyond the point where children would be expected to have a basic understanding of the definition of numbers. All are based on analyses of children's number similarity judgments, as affected by development, language differences, expertise, and orthographic representations of
number. Taken together, these studies suggest that a) Wittgenstein's view of number as a thread comprised of many overlapping fibers provides an apt metaphor for the development of children's number concepts, and b) a number concept independent of external form is a relatively late accomplishment, rather than forming the foundation for children's mathematical development.
Logical foundations for number

Peano (1973/1889) asserted that the natural numbers can be developed from three primitive ideas (0, number, and successor) and the following five postulates:

1) 0 is a number.
2) The successor of any number is a number.
3) No two numbers have the same successor.
4) 0 is not the successor of any number.
5) Any property which belongs to 0, and also to the successor of every number which has the property, belongs to all numbers.

Peano went on to demonstrate that the arithmetical operations of addition, subtraction, multiplication and division could be defined within his system without additional postulates, and that it was also possible to derive the set of rational and irrational numbers within his system.

Russell (1919) criticized Peano's theory of number on the grounds that it is overly general and contains unnecessary postulates. He noted (Russell, 1919, pp.6-10) that the Peano postulates are satisfied not only by the set of natural numbers but by any ordered series (e.g., 0, 2, 4, 6, 8, ... or 1, 1/2, 1/4, ...). Russell asserted that "we want our numbers not merely to verify mathematical formulae, but to apply in the right way to common objects", and that nothing in Peano's axioms ensures that this will be the case (other than a pre-existing conception of number). Russell (1919) and Frege (1968/1884) suggested that it is unnecessary to postulate order and number as primitive entities, and proposed an alternative that has come to be known as the logicist definition of number. In the logicist definition, a number is a class which consists of all sets containing a given number of terms. Any two sets are defined as belonging to the same class (having the same number) if and only if each of the members of both sets can be placed in a one-one correspondence with a member of the other set (if this relation is intransitive, the relation is either one-many or many-one, depending on the direction of the intransitivity). Number is thus (Russell, 1919, p.19) "a set of classes such that any two are similar to each other, and none outside the set are similar to any inside the set."
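As a minimal illustration of the logicist definition (the sketch is mine, not anything from the chapter), sameness of number can be tested purely by pairing members off, without counting either collection:

```python
# Illustration of the Frege-Russell criterion: two collections have
# the same number iff their members can be paired one-to-one with
# nothing left over on either side. No counting is involved.

def equinumerous(xs, ys):
    xs, ys = list(xs), list(ys)
    while xs and ys:              # remove one member from each side
        xs.pop()
        ys.pop()
    return not xs and not ys      # equal number iff both exhausted

assert equinumerous({"a", "b", "c"}, {1, 2, 3})
assert not equinumerous({"a", "b"}, {1, 2, 3})
```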
Having defined number as a superordinate collection of all sets which can be placed in one-one correspondence with each other, Russell went on to define Peano's other primitive terms (0 and successor). "0" is defined as the class whose only member is the null-set, and successor is defined as follows: "the successor of the number of terms in the class a is the number of terms in the class consisting of a together with x, where x is any term not belonging to the class."

These two foundational theories of number have exerted a great influence over the psychological study of number development. This influence appears to have involved both a general belief that it is possible to find cognitive bases for number to parallel the logical foundations, and the direct influence of the specific foundations of number postulated by Peano, Russell and Frege. The psychological theories and number research of Piaget (1965/1941) reflect this influence most directly.
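Russell's definitions of "0" and successor, just quoted, can be rendered in the same illustrative style (again a sketch of mine, not Russell's formalism):

```python
# Sketch of Russell's definitions: "0" is the number of the empty
# set, and the successor of the number of terms in a class a is the
# number of terms in a together with one fresh element x.

def successor_of(a):
    """Return a class with one more member than the class a."""
    x = object()              # a term guaranteed not to belong to a
    return set(a) | {x}

zero_rep = set()                    # representative set for 0
one_rep = successor_of(zero_rep)    # representative set for 1
two_rep = successor_of(one_rep)     # representative set for 2

# Sanity check only; the definitions themselves never count.
assert len(two_rep) == 2
```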
Piaget's conception of number

The importance of number research to the general development of Piagetian theory has been emphasized by Flavell (1963, pp.6-7), who attributed to this work the beginnings of the "logico-mathematical" concept of groupings which pervaded all of Piaget's later work. Number development is also central to a consideration of Piagetian theory because of the vast body of research generated by this theory of number development. It is an area for which sufficient data are available to evaluate a number of Piaget's claims, and hence can be treated as a case study of the Piagetian approach to cognitive development. Piaget's number development work consists of both the development of a theory about the logical foundations of number and research aimed at demonstrating that this mathematical theory describes the course of number development in children. Piaget's theory of the logical foundations of number stems from his observation (Piaget, 1965/1941, pp.viii-ix; Beth and Piaget, 1966, pp.259-72) that it is impossible within Russell's system to differentiate the individual items within a set. Piaget accepts Russell's goal of reducing number to logical operations, but he asserts that such a reduction is logically impossible without the addition of order as a logically primitive concept. Piaget (Beth and Piaget, 1957/1964, pp.271-2) accuses Whitehead and Russell of introducing order "almost surreptitiously" in order to account for the fact that, given two sets containing at least some of the same items, the sum and the union of the two sets are not the same. Piaget (1968/1964, p.83) summarized his own theory of the logical foundations for number as follows:
"In short, the whole number is neither a simple system of class inclusions, nor a simple seriation, but an indissociable synthesis of inclusion and seriation. The synthesis derives from the fact that these two systems (classification and seriation), which are distinct when their qualities are conserved, become fused as soon as their qualities are abstracted."

Piaget's theory of the logical foundation of number had little impact on philosophical discussion about numbers (Brainerd, 1973, p.244), and it is accordingly difficult to find discussions of it in that literature. Several apparent shortcomings deserve comment, however. Piaget's observation that it is impossible to distinguish the individual members of a set within the Russell-Frege systems does not represent a shortcoming in that model unless lack of that feature is relevant to the goals of that system. It is not necessary to distinguish between the individual members of a set in order to perform logical and mathematical operations upon the set (e.g., Brainerd, 1975, p.894), which was the goal of Russell's analysis. Piaget never stated why the ability to distinguish individual members of a set (after its numerosity is known) is a necessary feature of a number system. Russell (1919, pp.29-41) did actually define order at some length within his system; he differed from Piaget in that he believed that order could be treated as a derivative characteristic of cardinal number as determined by one-one correspondence, rather than being an essential and primitive characteristic of number. The logical need for introducing order as a primitive element also seems far from established. It has been suggested by one critic (Brainerd, 1973, pp.244-5) that Piaget's theory of number contains all the faults of ordinal and cardinal theories of number without conferring any additional benefits. At any rate, the logical advantages of Piaget's system are far from obvious.

Whatever the logical status of Piaget's theory of number may be, its primary application has been to serve as the basis for a large body of psychological research in number development conducted by Piaget, his followers, and his critics. Thus it seems fair to evaluate Piaget's system primarily as a psychological theory rather than as a logical one, giving first consideration to its use as a description of number development in children.

Piaget's description of number development

Piaget's (1965/1941) description of children's number concepts was isomorphic to his logical derivation of number described above. In Piaget's view, children come to understand the nature of numbers by constructing an understanding of the
logical principles (classification and seriation) on which number is based. As Piaget (1965/1941, p.viii) asserted, "logical and arithmetical operations therefore constitute a single system that is psychologically natural, the second resulting from generalization and fusion of the first, under the two complementary headings of inclusion of classes and seriation of relations, quality being disregarded." Piaget's research methods strongly reflected this belief that the task of explaining number development is essentially one of explaining the development of the logical abilities involved in classification and seriation. His procedures for assessing number development consisted of testing the child's ability to make use of one-one correspondence in judging the equality of sets of unknown numerosity, understand hierarchical classification, and create and reason with ordered collections of objects. An understanding of one-one correspondence is assessed by his number conservation procedure, in which children are shown (or asked to create) two rows of objects placed in one-one alignment, and are questioned as to the effect on the numerosity of the sets of changes in the spatial arrangement of the objects. Hierarchical classification is assessed by means of a class-inclusion task, in which children are asked to compare the numerosity of a superordinate set (e.g., beads) with that of a subordinate set (e.g., brown beads). Seriation is measured by requiring the child to form serial orderings (e.g., arranging dolls and their walking sticks in order of length) and make use of one-one correspondences to make inferences concerning the pairings of items in the two seriated sets. The Piagetian model of number development resulting from the use of these procedures features three discrete stages of development. The first is a prenumerical stage, in which children fail to make the appropriate correspondences between items. In describing this stage, Piaget (1965/1941, pp.147-8) emphasizes the global nature of the judgments of these children (i.e., making judgments in terms of vague labels such as "a lot"), and "the fact that immediate perceptual experience prevails over operational, logical composition" in judgments involving numerosity. A second, intuitive or preoperational stage characterizes children who can make correct judgments with some stimulus support, but who tend to make errors when perceptual cues conflict with the correct judgments. Piaget places in this category children who are able to make correct predictions of the effects of numerically irrelevant transformations, but who change their minds after the transformation is actually effected.
The final stage of number development described by Piaget is an operational stage, in which children can make correct judgments under all the conditions presented, and are able to ignore conflicting perceptual cues. Piaget (1965/1941, p.149) described this stage as "the victory of operation over intuition", and suggested that there is a fundamental change in the child's perception of numerically irrelevant transformations in this stage. These stages of number development were believed by Piaget to be identical and parallel for both cardination and ordination operations, although he acknowledged (Piaget, 1965/1941, p.149) difficulty measuring the "pure state" of the second stage. He further extended these procedures to continuous quantities such as liquids, and found the same stages of number development in children's use of one-one correspondence in judging volume in these tasks as in judgments involving discrete objects.

Piaget drew a distinction between the number concepts he investigated and mundane skills he termed "merely verbal knowledge." This label was applied to the ability of children to perform numerical operations such as counting and calculating in the absence of a fully operational understanding of number. Piaget (1965/1941, pp.161, 190) suggested that children cannot learn to add or subtract until they reach this final stage, noting that "it is true that even children who are still at the earlier stages can be taught to repeat formulae such as 2+2=4; 2+3=5; 2+4=6, etc., but there is no true assimilation until the child is capable of seeing that six is a totality, containing two and four as parts, and of grouping the various possible combinations in additive compositions."

This de-emphasis on verbal knowledge was reflected in Piaget's treatment of the role of counting in the development of number. Counting was classified as "merely verbal knowledge" (Piaget, 1965/1941, p.29) unless it was accompanied by an operational level of performance on the tasks employed by Piaget to assess number development. Piaget did not concern himself with the child's ability to count the sets used in his tasks (except where the response required is the specific numerosity of a set). Piaget's attitude toward counting was further demonstrated by his interpretation of evidence that counting helps some children solve the conservation and correspondence tasks. He described a child (Piaget, 1965/1941, p.63) who gave correct answers to a correspondence task as long as she counted the elements involved, but noted that "when, however, she did not count the elements as they were being exchanged, she showed that she was still at the earlier level."
Piaget viewed number as a product of the developing logical operations by which a child makes sense of the world. These logical operations, featuring classification and seriation, were seen as general across a variety of domains (Piaget, 1965/1941, p.vii, mentioned space, time, and speed, as well as both continuous and discontinuous quantities). The stages and operations by which number is created are viewed as constant across these different applications, and they develop through three discrete stages, involving increasingly more adequate numerical judgments and an increasing ability to ignore perceptual miscues. When a child has reached the final, operational stage of number development, he or she has an understanding of the nature of number which provides the only sure psychological foundation for the development of practical numerical skills such as counting and calculating.

Evaluation of Piaget's theory of number development
The theoretical edifice that Piaget built to explain number development has inspired an enormous amount of research in the 50 years since it was first published. Researchers have replicated many of the basic phenomena demonstrated by Piaget (see Brainerd, 1978, pp.183-8), but experiments using variations of Piaget's methods have raised questions concerning his interpretation of many of these phenomena. Serious questions have been raised concerning Piaget's assertion that a variety of numerical abilities develop through the same stages in parallel, and that the development of these numerical abilities in turn parallels the development of logical abilities. Piaget's characterization of young children as "prenumerical" has also been challenged by research suggesting greater numerical competencies in younger children than had been revealed by Piagetian tasks. Recent research has also emphasized the importance to number development of skills such as counting that Piaget dismissed as merely verbal learning.

Perhaps the deepest effect of Piaget's theory, however, was in the basic characterization of number as a concept whose definition children might acquire at a definite point in development. Although the details of when and how children develop such a concept have been subjected to dramatic revision, the basic notion that children acquire a number concept at some point in time, which they then apply to mundane tasks such as counting and arithmetic, has received much less scrutiny.

Are young children prenumerical?
The origins of mathematical understanding have been shown to extend nearly to the moment of birth. Starkey and his colleagues (Starkey & Cooper, 1980; Starkey, Spelke, & Gelman, 1983, 1990) have shown that infants will habituate to small numerosities and dishabituate to changes in numerosity when extraneous variables are controlled. Antell & Keating (1983) replicated this basic result with infants aged 21 to 144 hours. Starkey, Spelke, & Gelman (1980, 1983, 1990) have shown that infants' ability to match numbers extends beyond particular modalities, with infants looking longer at dot displays associated with particular numbers of audible drumbeats, although Moore, Benenson, Reznick, Peterson, and Kagan (1987) failed to replicate this result with a roughly comparable procedure. Baillargeon, Miller, & Constantino (1992) have demonstrated that infants by 9.5 months of age also show a very rudimentary understanding of the effects of addition and subtraction on the numerosity of sets of objects.

These infant mathematical abilities are followed by some impressive albeit limited mathematical understanding on the part of preschool children before the time they begin to pass number conservation tasks. Gelman & Gallistel (1978) described some impressive early counting abilities, suggesting that children's mastery of counting is guided by an early but implicit understanding of principles (including one-to-one correspondence) that make counting possible. The relation between such principled knowledge and the mundane procedures of actual counting has proven to be complex. Children often do better at generating correct conventional counting sequences than at judging puppet counting performance (Briars & Siegler, 1984; Frye, Braisby, Lowe, Maroudas, & Nicholls, 1989; Fuson, 1988; but see Gelman & Meck, 1983, 1986), which suggests that interaction between conventional counting and children's understanding of the underlying rules may run in both directions.

What is the role of "merely verbal learning" in number development?

The basis for the numerical reasoning principles shown by young children appears to lie in the development of counting abilities (Gelman, 1978; Gelman and Gallistel, 1978). Gelman notes that young children are adept at using counting procedures to represent numerosity under a variety of conditions (e.g., in the face of spatial transformations, under directions to start counting in the middle of the set, etc.). In contrast to Piaget's description of counting as merely verbal learning with little effect on number development, Gelman suggested that it forms the cornerstone of a consistent and accurate, if limited, system of number representation and numerical reasoning.

Taken as a whole, these criticisms greatly complicate the description of number development proposed by Piaget. Children's performance on tasks requiring ordinal and cardinal judgments is not predicted by knowledge of their stage of
number development, for performance is variable on these different tasks. The ability to solve class-inclusion and one-one correspondence tasks appears to involve a variety of non-numerical factors, and these tasks are accordingly not solid indicators of a child's level of numerical development. Finally, young children have been shown to be capable of making the kinds of consistent numerical judgments that Piaget asserted they are incapable of making, and to do so at an age when they perform poorly on Piagetian tasks assessing the logical operations believed to underpin these numerical judgments.

Piaget's project of reducing number development to the development of the logical operations of classification and seriation must contend with an accumulating body of complicating counter-evidence. The extent of evidence running counter to the Piagetian description of number development is such, however, that it may be necessary to reconstruct the study of number development on a new foundation. Evidence that children's understanding of number is not built upon an understanding of the logical foundations for number suggests it may be worthwhile to look more closely at the premise that number ought to be derived from more fundamental, logical concepts. The next section examines this premise and describes an alternative.

Alternative philosophical foundations for the study of number development
The foundational view of children's mathematical development provides a poor description of the actual development of children's numerical abilities, as reviewed in the last section. Psychologists are not the only ones who find the foundational perspective wanting. Philosophers have also questioned the value of describing mathematics in terms of logical foundations. Russell's student Wittgenstein (1956, 1958, 1974) provided both a cogent critique of foundational theories of number and an alternative framework for the philosophical study of mathematics. Wittgenstein's model is particularly relevant to psychological research on mathematical ability, so it will be reviewed at some length in the next section.

Wittgenstein's critique of foundational approaches to number
Wittgenstein suggested that theories postulating foundations for mathematics are both unnecessary and misleading. He asserted that mathematics needs no foundation, characterizing the attempt to give it one (Wittgenstein, 1974, p.296) as defining "problems whose solution would at long last give us the right to do arithmetic as we do." Wittgenstein suggested that the relation in fact runs the other way: people evaluate the adequacy of the statements in Russell's (or any other) foundational theory in light of the known characteristics of mathematics, not the reverse.

Foundational approaches to mathematics are misleading because they suggest that mathematics is constrained to rules that exist within a particular logical system. Characteristics and techniques of mathematics that do not conform to a particular logical theory are no less legitimate than those that do, in Wittgenstein's view, but there is the danger that they will be considered erroneous or inferior because they don't fit within a particular foundational system. Wittgenstein (1956, p.171) summarized his attack on foundational theories of mathematics as follows:

"What does mathematics need a foundation for? It no more needs one, I believe, than propositions about physical objects--or about sense impressions--need an analysis. What mathematical propositions do stand in need of is a clarification of their grammar, just as do those other propositions. The mathematical problems of what is called foundations are no more the foundations of mathematics for us than the painted rock is the support of a painted tower."
Wittgenstein's theory of mathematics

In contrast to the foundational goal of generating mathematics from a small set of propositions, Wittgenstein (1956, p.84) suggested that mathematics is best viewed as a motley collection of techniques. These techniques may be compared to games, which have consistent if arbitrary rules within themselves, but need have no necessary relation to other games. In the same way that all the common games could not be generated from a few propositions, Wittgenstein suggested there is no unitary source for mathematics. The foundations of mathematics consist in the rules of the various games that comprise it, so Wittgenstein (1974, p.297) suggested: "teach it to us, and then you have laid its foundations."

The task of the philosopher considering mathematics is, in Wittgenstein's view, simply to describe the rules of the various language-games that are observed. Similarities in rules may be observed in different areas of mathematics, but these similarities are on the same order as those that exist between non-mathematical games (e.g., checkers and chess). He considered the problem of how to compare different games, writing (Wittgenstein, 1956, p.61), "How do we compare games? By describing them--by describing one as a variation of another--by describing them, and emphasizing their differences and analogies."
Wittgenstein's characterization of mathematics as a motley collection of language-games need not preclude the possibility that mathematical concepts could be the consequence of mastering a small set of logical operations, but whether that is the case is an empirical issue. This approach gives mathematical knowledge a psychological rather than a logical basis. It changes the task of describing mathematical development from one of deriving the set of fundamental concepts that might make mathematics possible to the empirical task of describing the development of mathematical concepts in human beings.

The application of a number-games approach is illustrated by Wittgenstein's (1956, p.44) consideration of a problem similar to Piaget's (1965/1941) conservation of quantity. Wittgenstein considered the case of a hypothetical society where people buy and sell timber as a function of the area that a pile of timber covers:

"How could I show them that--as I should say--you don't really buy more wood if you buy a pile covering a bigger area? I should, for instance, take a pile which was small by their ideas and, by laying the logs around, change it into a 'big' one. This might convince them--but perhaps they would say: 'Yes, now it's a lot of wood and costs more'--and that would be the end of the matter. We should presumably say in this case: they simply do not mean the same by 'a lot of wood' and 'a little wood' as we do; and they have a quite different system of payment from us."
It is certainly relevant that Wittgenstein was considering the case of a society rather than that of young children within this society, but this thought-experiment illustrates a difference between the two approaches. Wittgenstein concludes that these people are playing a different language-game than his own, in which terms have different meanings than the ones he would give them, but he does not suggest that they lack a fully operational conception of number. As applied to number development in children, Wittgenstein's approach emphasizes description of the rules or systematic techniques that children use in the number-games they actually play.
Symbolic structure and mathematical development

One promising domain for evaluating the relative utility of a number-games vs. a foundational perspective concerns the effects of external symbolic representations for number on children's mathematical development. If children acquire a unitary number concept that is then applied to a variety of mathematical concepts, one would expect a high degree of similarity in their judgments about numerical relations in different symbolic systems.

In the case of on-line symbolic processing (e.g., in number-naming and calculation), McCloskey and his colleagues (McCloskey, Sokol, & Goodman, 1986; Sokol, Goodman-Schulman, & McCloskey, 1989; McCloskey, Macaruso, & Whetstone, this volume) have argued that numerical processing is based on a modular architecture, in which the first step in processing numerical symbols is translation into an abstract numerical code which is then used for further processing. Supporting this view, McCloskey et al. (1986) present data from several neurologically-impaired subjects who show different types of difficulty in translating numbers between different representational codes. The claim of a modular architecture implies that effects of the structure of external symbolic systems should be limited to processes involved in translating those symbols to and from an abstract internal code.

Contrary to the view of a strictly modular architecture, however, is evidence that the structure of external symbols affects the numerical information processing of adults. Clark and Campbell (1991) review evidence indicating that calculation is affected by the specific modality in which numbers are presented, as well as by factors such as odd/even status not incorporated into McCloskey et al.'s proposed abstract code. Miller, Zhu, Zhang, & Yan (1992) looked at the role that orthography plays in the access of numerical structure. In one study, subjects were asked to reverse two-digit numbers (e.g., responding "24" when shown "42"). U.S. subjects showed both more difficulty overall and a different profile of number-reversal times when presented with stimuli written as English words (e.g., "Seventeen") than when shown Arabic numerals. For Arabic numerals, teens stimuli (e.g., "17") were quite fast to reverse, presumably because their reverses conform to the standard English rule for naming two-digit numbers. For English words, the same stimuli (e.g., "seventeen") were very hard to reverse. This indicates that American adults can take advantage of the base-ten structure of Arabic numerals to compensate for some of the idiosyncrasies of the English number-naming system.

Studies of on-line processing of numerical symbols challenge a strictly modular architecture for the representation of number. Campbell & Clark (1988; this volume) have argued that such data support an alternative representation, in which numbers are represented in a complex encoding based on a network of interconnected format- and modality-specific codes (e.g., imaginal, graphemic, etc.). Sokol et al. (1989) have argued that Campbell & Clark's model lacks the specificity of the modular architecture, but it is clearly better equipped to handle the accumulating evidence that the formats of external representations have continuing effects on calculation that extend beyond the limits of an early translation into an abstract mathematical code.
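Returning to the number-reversal task above, a toy sketch may help fix the idea (the rule table below is my simplification for two-digit numbers, not Miller et al.'s materials): the reverse of "17" is "71", whose name follows the regular decade-plus-unit pattern, but "seventeen" itself is named by the irregular teens rule, so the word form offers no comparable shortcut.

```python
# Toy illustration of why English teen names complicate reversal:
# "71" follows the regular decade+unit naming pattern, whereas
# "17" has an irregular teen name (there is no "onety-seven").

UNITS = ["", "one", "two", "three", "four", "five", "six",
         "seven", "eight", "nine"]
TEENS = ["ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
         "sixteen", "seventeen", "eighteen", "nineteen"]
DECADES = ["", "", "twenty", "thirty", "forty", "fifty",
           "sixty", "seventy", "eighty", "ninety"]

def name(n):
    """English name of a two-digit number in the range 10-99."""
    tens, units = divmod(n, 10)
    if tens == 1:
        return TEENS[units]               # irregular teens rule
    return DECADES[tens] + ("-" + UNITS[units] if units else "")

print(name(17), "->", name(71))  # seventeen -> seventy-one
```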
The structure of symbolic systems could also affect children's conceptual understanding of number. To the extent that children's developing understanding of number is best described as the acquisition of a set of overlapping number games, the structure of such systems should affect children's understanding of what it means to be a number in particular contexts. The final section will present three studies on children's developing representations of what numbers are. All are based on analyses of children's number similarity judgments, as affected variously by development, language differences, expertise, and orthographic representations of number. Across all studies, evidence for an integrated representation of number is found, but such an integrated system appears to be a late accomplishment of older and more accomplished children, rather than serving as the basis for children's understanding of mathematics.

Developmental changes in number representation

Miller & Gelman (1983) collected similarity judgments for single-digit integers from children in kindergarten, grade three, and grade six, as well as from adults. If children maintain a separation between defining features of numbers and the applications of numbers to tasks such as arithmetic, there should be little developmental change in number-similarity judgments beyond the point where children acquire an understanding of the defining features of number. If, on the other hand, mathematical development is characterized by Wittgenstein's model of an interwoven thread, then looking at developmental changes in judgments of relations between numbers might provide a window into the processes by which children weave a representation of the relations that define numbers out of their mathematical experience.

Figure 1 shows a re-analysis of these data using the individual differences multidimensional scaling model (INDSCAL) of Carroll & Chang (1970). This mathematical model entails a psychological assumption about how individual differences affect the representation of a stimulus domain: individuals differ by differentially weighting the dimensions of a common stimulus space. Because of the related way that INDSCAL treats subject and stimulus variation, programs to fit this model provide separate spatial representations of the overall stimulus relations and of the weights that subjects place on the dimensions of this stimulus space. High subject weights along a stimulus dimension indicate that the corresponding feature was emphasized by a particular subject or group of subjects.
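The model can be stated compactly (standard INDSCAL notation, supplied here for reference rather than reproduced from the chapter): the distance that subject or group k perceives between stimuli i and j is

```latex
% INDSCAL weighted Euclidean distance: group k stretches or shrinks
% each dimension d of the common stimulus space by a weight w_{kd} >= 0.
d_{ij}^{(k)} = \sqrt{\sum_{d=1}^{D} w_{kd}\,\bigl(x_{id} - x_{jd}\bigr)^{2}}
```

A large weight w_{kd} means that dimension d dominates group k's judgments, and a weight near zero means the group is effectively insensitive to that dimension, which is how the age differences described below are expressed.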
[Figure 1 appears here: "Developmental Changes in Number-Similarity Judgments (Data from Miller & Gelman, 1983)". Top panel: two-dimensional stimulus space. Lower panel: subject weight space showing the Kindergarten, 3rd grade, and Adult groups.]
Figure 1. Weighted MDS analysis of children's number-similarity judgments from Miller & Gelman (1983). Top panel shows the stimulus space. Lower panel shows the subject or weight space, which indicates the salience of each dimension for a particular age group.
An important distinction between the INDSCAL model and most other MDS models is that because subject variation is represented by weighting dimensions, INDSCAL solutions are not invariant under rotation through an arbitrary angle, thus providing a unique orientation for the solution up to reflections or permutations of the axes. In other words, the INDSCAL model uses information on variation among subjects to uniquely orient a multidimensional stimulus space. The dimensions of an INDSCAL solution correspond to those dimensions that captured variation in subjects' judgments, and one might expect that these dimensions of variation among subjects will correspond to important features of the stimuli that those subjects were judging.

The overall correlation between predicted values and obtained judgments was r=.78. The two dimensions of the stimulus solution shown in the top panel of Figure 1 quite clearly correspond to Magnitude (Dimension 1) and Odd vs. Even (Dimension 2). These dimensions resemble quite closely those found in Shepard, Kilpatric, & Cunningham's (1975) analysis of adult judgments of similarity among numbers judged as abstract concepts. The lower panel of Figure 1 shows how individual groups of subjects weighted these dimensions. The length of the vector between the origin and the subject's location is proportional to the amount of variance that was accounted for by the sum of these dimensions. The angle of this vector shows the relative significance of the two dimensions. Thus, the Kindergarten and third-grade subjects show a small weight for the odd/even dimension and a very large weighting on the magnitude dimension, corresponding to their nearly complete insensitivity to multiplicative relations such as Odd vs. Even.

As children learn new arithmetical operations, new mathematical relations become prominent in their judgments of relations between numbers. Kindergartners and third graders were consistent in making magnitude-based judgments, incorporating a defining feature of number into their judgments of numerical relations. This reliance on magnitude was supplemented by (and, in the case of adults, subordinated to) multiplicative relations with development. The process by which new mathematical relations are incorporated into judgments of relations between numbers is clearly not an immediate one, however. Third graders are familiar with simple multiplication and can characterize odd and even numbers, but they did not incorporate these features into their judgments of numerical similarity.

The prominence of odd/even relations in the judgments of older subjects is particularly interesting, because odd/even is not represented distinctively in the base-ten system of Arabic numerals and is not part of any of the philosophical number foundation proposals described above. Odd/even relations fall out of particular uses of numbers - counting by twos and multiplying - that become part of our understanding of what constitutes a number. Results of this study suggest that children's developing number concepts do not maintain a distinction between what might define numbers and the contexts in which those numbers are used. Instead, number concepts continue to evolve, as new applications for numbers affect one's understanding of what a number is.

Expertise and number representation: The case of abacus calculation

If children incorporate their understanding of multiplicative relations into their judgments of number similarity, does this mean that all numerical relations one acquires are incorporated into one's conception of number? An important case for looking at the limits of this expansion of number conceptions is provided by abacus calculation, explored in a study by Miller & Stigler (1991). Prior research had documented the impressive computational skills developed by adults and children who receive extended practice in abacus calculation (Hatano, Miyake, & Binks, 1977; Hatano & Osawa, 1983; Stigler, 1984). Perhaps the most intriguing aspect of this skill is the development of "mental abacus calculation", in which subjects calculate with reference to an internal image of the abacus.

The nature of abacus skill
Figure 2 shows how numbers are represented on the Japanese-style abacus used throughout Asia. Beads "count" as they are pushed toward the center (horizontal) bar by the thumb (lower beads) or forefinger (upper bead). The upper bead represents 5 times the column value, while the lower beads represent one unit each. The value represented by a column is the sum of the top bead (0 or 5) and the lower beads (0-4), with the total multiplied by the column value (as with standard place-value notation). Within a column, the abacus represents a modulo 5 number system, while remaining a base-ten system between columns. Thus numbers such as one and six that differ by ±5 have similar abacus representations, differing only in the placement of the upper bead.

Persons who develop a high level of skill at abacus calculation report calculating with reference to a "mental abacus," using an image of the abacus to perform mental arithmetic. Studies of the mental calculation of abacus experts support this claim. Stigler (1984) reported that for abacus-trained children the number of steps involved in an abacus calculation was associated with reaction time for mental calculation. He also reported that these abacus-trained children could distinguish true intermediate states of a calculation from foils. Perhaps the strongest evidence for the abacus-like nature of the mental calculation of these children came from analysis of their errors. Abacus calculators (but not American college students) were prone to make errors that could be accounted for by misrepresenting the location of one bead on the abacus. These included leaving out the value of one column, and errors in which the answer was off by 5 in some column from the correct sum. Because the abacus represents numbers that differ by 5 in a similar way, the finding of an increased incidence of these modulo-5 errors provides a convincing demonstration of the abacus-like nature of the calculation of those who have become experts at the skill of mental abacus calculation.
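The column arithmetic just described is compact enough to sketch (the encoding below is my illustration of the scheme, not code from the study): each digit decomposes as 5 times the upper-bead count plus the lower-bead count, so digits that differ by 5 share an identical lower-bead pattern.

```python
# Sketch of the abacus column code described above: each digit is
# represented by one upper bead worth 5 plus up to four unit beads,
# so digit = 5 * upper + lower.

def abacus_digit(d):
    """Return (upper, lower) bead counts for a digit 0-9."""
    return divmod(d, 5)   # upper bead count (0 or 1), lower beads (0-4)

# Digits that differ by 5 share the same lower-bead pattern and
# differ only in the upper bead -- the source of modulo-5 errors.
assert abacus_digit(1) == (0, 1)
assert abacus_digit(6) == (1, 1)

def encode(n):
    """Column-by-column abacus representation of a whole number."""
    return [abacus_digit(int(ch)) for ch in str(n)]

print(encode(1992))  # [(0, 1), (1, 4), (1, 4), (0, 2)]
```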
[Figure 2 appears here, showing abacus representations of the digits 0 through 9.]
Figure 2. Representation of digits on the Japanese-style abacus used in this study. Beads "count" as they are moved toward the center (horizontal) bar. The top bead represents 5; the lower beads each represent 1. Value within a column is the sum of the top bead (0 or 5) plus the number of lower beads (0-4) pushed toward the center bar.

Abacus skill and number concepts

Models of expertise permit two contradictory predictions about the effects of abacus expertise on experts' judgments of similarity between numbers. One view, stretching back to Bryan & Harter (1899), argues that domains of knowledge should be organized in terms of the functional expertise one has within those domains - a view Miller & Stigler (1991) called "conceptual determination by skill." In this view, abacus experts should view numbers in terms of the relations that are important in abacus calculation (such as the modulo-5 relations prominent on the abacus). The opposite prediction, which we termed "conceptual transparency of skill," has an equally distinguished ancestry, going back to Binet (1893/1966).
Binet argued that experts develop a representation that is much more abstract than the constraints of a particular expertise would imply. In this view, abacus experts should judge as salient only features that are meaningful in a larger mathematical context. The fact that 1 and 6 have similar representations on an abacus is of very limited mathematical significance. Conceptual transparency implies that, although abacus experts are prone to make abacus-specific mistakes in calculation (e.g., substituting a "1" for a "6"), abacus-specific factors should not be prominent in their judgments of similarity between numbers.

To evaluate the effect of abacus skill on number representation, we collected similarity judgments for pairs of numbers in the range 0-20 from three groups of sixth graders: U.S. children (with no abacus experience), Chinese children with only general abacus training (who knew how numbers were represented on the abacus, but had no special training in abacus calculation beyond brief exposure in the general school curriculum), and Chinese children who had participated in special after-school abacus training and were rated at one of the top three levels of skill in the Chinese Abacus Association rating system. Children judged the similarity of pairs of numbers, and each child saw stimuli presented either as Arabic numerals or as abacus figures (in a truncated display showing 2 columns).

Figures 3 and 4 present the results of a SINDSCAL analysis of the similarity judgments of these two kinds of stimuli (Numerals vs. Abacus depictions) by these three groups of subjects (Abacus Experts, Abacus Novices, and Americans). A three-dimensional solution is described, with an overall correlation of r = .638 between predicted and actual judgments. The first two dimensions, presented in Figure 3, correspond to the results of Miller & Gelman (1983) discussed previously. The first dimension corresponds to numerical magnitude, and is most stressed by U.S. children judging numerals and Novices judging the abacus. The abacus does not provide a direct representation of magnitude, as indicated by the relatively low weighting of this dimension by U.S. children judging the abacus. Therefore, Novices judging abacus stimuli are going beyond the relations explicitly encoded on the abacus when they emphasize numerical magnitude in judging the similarity of abacus figures.

Figure 4 presents the first dimension of the SINDSCAL solution paired with the third dimension, which corresponds to the number of beads used to present a particular number. It is essentially a modulo-5 dimension; thus the numbers 0, 5, 10, 15, and 20 (all of which are depicted with 0 lower beads) have a similar low value on this dimension. Novices' judgments of abacus and number stimuli were quite different from each other, however, indicating that not every mathematical feature known to these children was incorporated in their abacus judgments. Novices judging abacus stimuli placed little emphasis on Odd vs. Even, although their judgments of numerals emphasized this feature heavily.
[Figure 3 graphic: "Expertise & Number Concepts: Magnitude vs. Odd/Even." SINDSCAL solution for judgments of abacus and numeral stimuli by U.S. subjects, Abacus Novices, and Abacus Experts. Upper panel: common stimulus space for the numbers 0-20, Magnitude (x-axis) by Odd/Even (y-axis). Lower panel: subject weight space, with points labeled by group (U.S., Novice, Expert) and stimulus type (Numeral, Abacus).]
Figure 3. Weighted MDS analysis of abacus-numeral similarity judgments. This figure pairs the first two dimensions, which roughly correspond to Magnitude and Odd vs. Even. The top panel shows the weighting of stimuli in a common space derived from all subjects' judgments. The lower panel presents the subject or weight space corresponding to the dimensions presented in the upper panel. Thus, for example, U.S. subjects judging the abacus placed very little weight on the Odd/Even dimension relative to Magnitude.
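For orientation, here is a minimal sketch (mine, not part of the original analyses) of classical metric MDS, the unweighted core that SINDSCAL extends; SINDSCAL (Carroll & Chang, 1970) additionally fits, for each subject group, a weight on every dimension of the common stimulus space, which is what the lower (subject) panels of Figures 3 and 4 display. The toy dissimilarity matrix below is hypothetical.

    # Classical (Torgerson) metric MDS: embed items from a symmetric
    # dissimilarity matrix via double-centering and eigendecomposition.
    import numpy as np

    def classical_mds(dissim, n_dims=2):
        n = dissim.shape[0]
        J = np.eye(n) - np.ones((n, n)) / n          # centering matrix
        B = -0.5 * J @ (dissim ** 2) @ J             # double-centered Gram matrix
        eigvals, eigvecs = np.linalg.eigh(B)
        top = np.argsort(eigvals)[::-1][:n_dims]     # largest eigenvalues first
        return eigvecs[:, top] * np.sqrt(np.maximum(eigvals[top], 0.0))

    # Hypothetical input: if dissimilarity among the numbers 0-20 were driven
    # by magnitude alone, the first recovered dimension would be a magnitude axis.
    nums = np.arange(21)
    D = np.abs(nums[:, None] - nums[None, :]).astype(float)
    coords = classical_mds(D, n_dims=2)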
[Figure 4 graphic: "Expertise & Number Concepts: Magnitude vs. Modulo 5." SINDSCAL solution for judgments of abacus and numeral stimuli by U.S. subjects, Abacus Novices, and Abacus Experts. Upper panel: common stimulus space for the numbers 0-20, Magnitude (x-axis) by bead count/modulo-5 value (y-axis). Lower panel: subject weight space, labeled by group and stimulus type.]
Figure 4. Weighted MDS analysis of abacus-numeral similarity judgments (continued). This figure pairs the first and third dimensions, which roughly correspond to Magnitude and the number of beads that represent a given number on the abacus. The latter roughly corresponds to the value of the number modulo 5. Thus 0, 5, 10, 15, and 20 equal 0 modulo 5, and 4, 9, 14, and 19 equal 4 modulo 5. The lower panel presents the subject or weight space corresponding to the dimensions presented in the upper panel. Thus U.S. subjects viewing the abacus relied almost entirely on the bead representation, relative to magnitude.
numerals emphasized this feature heavily. In terms of the developmental progression from magnitude to multiplicative relations described above, Novices' judgments of abacus representations were less mature than their judgments of numerals. U.S. and Novice subjects judging abacus figures placed a greater emphasis than did experts on the modulo-5 relations used to represent numbers on the abacus. Expert judgments of the different types of stimuli were much more similar to each other than were those of Novices, and both resembled Novices' judgments of numerals more than Novices' judgments of abacus figures. This indicates that Experts' judgments tend to de-emphasize abacus-specific features, consistent with the conceptual transparency view of the conceptual consequences of expertise.

Figure 5 presents a multiple regression analysis of judgments by the three predictors that emerged from the SINDSCAL analysis (Magnitude, Odd/Even parity, and Modulo-5 value). For abacus stimuli, there is a drop with increasing expertise in the effects of Modulo-5 relations. There is an increase from U.S. subjects to Novices in the use of magnitude as a feature in judging abacus stimuli, and an increase from Novices to Experts in the use of Odd/Even parity as a basis for judgments. Judgments of numeral stimuli show a different pattern with expertise. In this case, U.S. subjects show the strongest influence of magnitude, consistent with other data (e.g., Stevenson, Lee, & Stigler, 1986) in suggesting that U.S. children are less mathematically sophisticated than their Chinese peers. Novice and Expert children show a correspondingly greater influence of Odd/Even parity than do U.S. children, while there is a small effect of Modulo-5 features that does not show a consistent change with expertise.
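As a concrete illustration of the three predictors, the following sketch (my reconstruction from the descriptions above, not the analysis code actually used) computes candidate features for a pair of numbers; the lower_beads function simply encodes the column representation shown in Figure 2.

    def lower_beads(n):
        # Lower (one-unit) beads pushed to the bar to show n: each digit
        # contributes (digit mod 5) lower beads, so 1 and 6 both contribute 1.
        return sum(int(d) % 5 for d in str(n))

    def pair_features(a, b):
        # Illustrative similarity predictors for a pair of numbers.
        return {
            "magnitude": abs(a - b),                           # difference in size
            "odd_even": int(a % 2 == b % 2),                   # shared parity
            "modulo5": int(lower_beads(a) == lower_beads(b)),  # shared bead pattern
        }

    # 0, 5, 10, 15, and 20 are all shown with 0 lower beads, so they match on
    # the modulo-5 feature despite large differences in magnitude:
    print([lower_beads(n) for n in (0, 5, 10, 15, 20)])  # [0, 0, 0, 0, 0]
    print(pair_features(1, 6))  # {'magnitude': 5, 'odd_even': 0, 'modulo5': 1}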
[Figure 5 graphic: "Expertise & Number Similarity Judgments: Abacus vs. Numeral Stimuli." Regression weights plotted by expertise group (U.S., Novice, Experts); top panel: abacus stimuli; bottom panel: numeral stimuli.]

Figure 5. Regression weights (β-weights) for number-similarity judgments. The full model (magnitude, beads, and odd/even) was fit separately to each task x group condition.

Conceptual consequences of expertise

Abacus experts show some evidence of a unified representation of number across different external representations. This representation does not, however, emphasize the features that are important in the skill they have acquired. Instead, features of number that are more significant in a broader mathematical context assume prominence in the judgments of abacus experts, at the expense of the features on which abacus calculation is based. The differences between Novices and U.S. subjects are interesting: even though Novices knew the numerical value of abacus representations, their results for abacus stimuli fell between those of U.S. subjects (who were perforce judging based on the features of the abacus depiction) and
Experts (who placed strong emphasis on non-abacus mathematical features). Novice subjects enriched their judgments of abacus stimuli by including the non-abacus feature of magnitude, yet their judgments of abacus figures failed to include multiplicative relations that were prominent when Novices judged numerals. This finding suggests that the conceptual transparency model provides a good description of the conceptual consequences of expertise, but it also indicates that
simply knowing the mapping between two systems is not enough to ensure that all relations known in one will transfer to the other. The problem of accessing knowledge across different but parallel representational systems extends beyond the domain of abacus calculation. The final study looks at the effects of different orthographies for writing numbers on the mathematical relations children perceive.

Orthography and number representation: Numerals and words in English, Chinese, and Korean
The system of numbers represented by the Hindu-Arabic numerals is used throughout the world. Languages that use these numerals often differ substantially in both the rules used to name those numbers and the orthographies used to write those number names. This combination of a universal orthography (numerals) coupled with alternative notations (such as words or characters) presents an opportunity to assess the role that orthographic structure plays in children's access to mathematical relations.
Effects of orthography in adult performance

Mathematical notation and adult performance. Some adult experiments on mathematical notations suggest that the features of particular symbol systems influence mathematical processing even into adulthood. Gonzalez & Kolers (1982, 1987) demonstrated effects of external symbolic representation (Hindu-Arabic numerals vs. Roman numerals) on calculation time. Miller and Zhu (1991) demonstrated that the English language's inconsistent rules for naming numbers in the "teens" affect adults' ability to perform a simple place-reversal task. Miller, Zhu, Zhang, & Yan (1992) reported that this difficulty was greatly exacerbated when numbers were presented as English words instead of as Hindu-Arabic numerals. To what extent does natural variation in linguistic representation of number influence children's ability to access mathematical relations?

External representations of number: Orthography and language in the U.S. and China

Figure 6 presents the number naming systems of English, Chinese, and Arabic numerals. Although both countries use Hindu-Arabic numerals, the names for those numbers are composed according to different rules.
[Figure 6 graphic: "Number Names and Orthography in Three Systems." Panel a) from 1 to 10; panel b) from 11 to 19; panel c) from 20 to 99.]
Figure 6. Number formation in 3 systems: Arabic numerals and the words for those numbers in English and Chinese.
Number names from 1 to 9. In all three systems, number names from 1 to 9 consist of an unsystematically organized list. Hindu-Arabic numerals write "10" according to a base principle, but this is not reflected in the name for "ten" in any of the three languages.

From 10 to 19. Above ten, the languages diverge in interesting ways, as Figure 6 demonstrates. The Chinese number naming system maps directly onto the Hindu-Arabic number system used to write numerals. For example, a word-for-word translation of "shi qi" (17) into English produces "ten-seven." English, on the other hand, has unpredictable names for "11" and "12," and names numbers in the teens in the opposite order of Hindu-Arabic numerals. Hurford (1975, 1987)
proposed that English 'teens could be generated from the canonical representation of two-digit numbers embodied in the Hindu-Arabic system by a process of a) one-deletion (in which "onety-seven" becomes "ty-seven" or "teen-seven"), and b) a switch in which the positions of unit and decade values are transposed ("teen-seven" becomes "seven-teen").

From 20 to 99. Above 20, the various number-naming rules converge. Chinese and Hindu-Arabic numerals are consistent in forming decade names by combining a unit value and the base (ten), while English names two-digit numbers by concatenating a decade name with a unit value.

Mathematical notation and children's mathematical competence

Developmental data indicate that the organization of number names in a language affects the ease with which children learn to count and to realize the base-ten principle that organizes nearly all number naming systems. Miller & Stigler (1987) found that young American subjects showed different patterns of difficulty in learning the sequence of number names than did Chinese children, who were learning a system characterized by a more consistent set of rules. Miura, Kim, Chang, & Okamoto (1988) reported that first graders in China, Japan, and Korea (all speaking languages that use the basic Chinese system for forming number names) showed a substantially greater ability than did American children to form object representations that incorporate base-10 principles. Finally, Fuson & Kwon (in press) have documented the difficulty American children have (compared with Korean children) in attempting to figure out strategies for adding multidigit numbers; they argue that many of these difficulties are due to the idiosyncrasies of the English system of number names.

The last study to be described explored the effect that orthography and language might have on the accessibility of mathematical relations in children's judgments of similarity among numbers. Two basic questions were of interest: 1) Do Hindu-Arabic numerals have a privileged role in thinking about numbers? That is, are children more sophisticated in reasoning about numerical relations represented in this notation? 2) To what extent is the organization of specific representational systems reflected in the judgments children make when confronted with these systems? Subjects were 40 children (20 per orthography x language combination) at each of grades 2, 4, and 6, recruited from public schools in Beijing, China, and Champaign-Urbana, Illinois.
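Before turning to the judgments themselves, the naming rules contrasted above can be made concrete in a short illustrative sketch (mine; the regularized output forms deliberately ignore irregular English stems such as "thir-" and the unpredictable names "eleven" and "twelve"):

    UNITS = ["", "one", "two", "three", "four", "five",
             "six", "seven", "eight", "nine"]

    def chinese_style(n):
        # Regular base-ten name for 10-99: 17 -> 'ten seven', 52 -> 'five ten two'.
        tens, units = divmod(n, 10)
        name = ("" if tens == 1 else UNITS[tens] + " ") + "ten"
        return name + (" " + UNITS[units] if units else "")

    def hurford_teen(n):
        # Hurford's derivation for the teens: canonical form, one-deletion, switch.
        unit = UNITS[n - 10]
        canonical = "onety-" + unit   # canonical two-digit form
        deleted = "teen-" + unit      # one-deletion: 'onety-' becomes 'teen-'
        switched = unit + "-teen"     # switch: unit and decade values transpose
        return canonical, deleted, switched

    print(chinese_style(17))  # ten seven
    print(chinese_style(52))  # five ten two
    print(hurford_teen(17))   # ('onety-seven', 'teen-seven', 'seven-teen')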
Matching booklets of triads of numbers in the range 1-21 were presented as either Hindu-Arabic numerals or as words/characters in a local orthography to groups of children in each of grades 2, 4, and 6 in China and the United States. Subjects were asked to pick the most and least similar pairs of numbers from each triad. Each subject saw a subset of 140 of the complete set of 1330 triads, balanced so that each pair of numbers appeared equally often (140 triads x 3 pairs per triad = 420 pairings, so each of the 210 possible pairs appeared twice).

Figure 7 presents the results of a SINDSCAL analysis of the resulting similarity judgments. A two-dimensional solution is described, with an overall correlation of r = .737. The two dimensions, presented in Figure 7, correspond to Magnitude and Odd vs. Even, the first two dimensions found in the studies discussed above. Subject weights for each country (U.S. or China) x grade (2, 4, 6) x orthography (Arabic or Word) combination are graphed below the stimulus space, with the U.S. groups marked in italics. The developmental pattern within each country for Arabic numerals corresponds to that found by Miller & Gelman (1983), with increasing emphasis on Odd/Even and decreasing emphasis on Magnitude. The rate of this change differs across countries: U.S. fourth graders look more like Chinese second graders in their emphasis on Magnitude, although by sixth grade all groups of subjects showed a greater emphasis on Odd/Even than on Magnitude. Judgments of "Word" orthographies showed a smaller, later, and less consistent shift from magnitude toward multiplicative relations. Both groups of second graders and U.S. fourth graders placed much more emphasis on Magnitude than on Odd/Even. As with Arabic numeral stimuli, U.S. subjects consistently placed more emphasis on magnitude relative to multiplicative features than did their Chinese peers.

Shifts with age and orthography in children's reliance on magnitude and odd/even in judging number similarity were also assessed by multiple regressions of judgments on these predictors. Figure 8 presents results for U.S. subjects, which show a strong effect of orthography. For Arabic numerals, there is a decrease in emphasis on magnitude and an increasing effect of odd/even parity, to the point where the relative weight of these predictors reverses for sixth graders. For English words, there is some indication of the same trend, but it is much weaker, and sixth graders still place substantially more emphasis on magnitude than on odd/even parity.
[Figure 7 graphic: "Orthography & Number Concepts: U.S./Chinese Data." SINDSCAL solution for judgments of Arabic and Word (Chinese characters or English words) stimuli by 2nd, 4th, and 6th graders in China and the U.S. Upper panel: common stimulus space for the numbers 1-21, Magnitude (x-axis) by Odd/Even (y-axis). Lower panel: subject weight space, labeled by country, grade, and orthography (e.g., US-6-A, Ch-4-W).]
Figure 7. Weighted MDS analysis of number similarity judgments in different languages and orthographies. The top panel shows the weighting of stimuli in a common space derived from all subjects' judgments. The lower panels present the subject or weight space corresponding to the dimensions presented in the upper panel. Plotted symbols in the lower panel represent a particular country (Ch = China, US = United States), grade (2, 4, or 6), and orthography (A = Arabic numerals, W = English words or Chinese characters).
[Figure 8 graphic: regression weights for U.S. subjects at grades 2, 4, and 6, with separate curves for Magnitude and Odd/Even; top panel: Arabic numerals; bottom panel: English words.]
Figure 8. Regression weights (β-weights) for number-similarity judgments for U.S. subjects. The full model (magnitude and odd/even) was fit separately to each task x group condition. The top panel presents results for judgments of Arabic numerals, and judgments of English words are presented in the bottom panel.
For Chinese subjects, presented in Figure 9, the results are less dramatic. For Arabic numerals, there is a large increase between grades 2 and 4 in the weight given to odd/even parity, and little change between grades 4 and 6. For Chinese character stimuli, there is also a large increase in odd/even parity between grades 2 and 4, followed by an unexpected and unexplained drop between fourth and sixth grade. Overall, however, the effect of orthography was less substantial for Chinese than for U.S. subjects.

[Figure 9 graphic: "Orthography & Number Similarity Judgments: Chinese subjects." Regression weights at grades 2, 4, and 6, with separate curves for Magnitude and Odd/Even; top panel: Arabic numerals; bottom panel: Chinese characters.]

Figure 9. Regression weights (β-weights) for number-similarity judgments for Chinese subjects. The full model (magnitude and odd/even) was fit separately to each task x group condition. The top panel presents results for judgments of Arabic numerals, and judgments of Chinese characters are presented in the bottom panel.

Although children are very familiar with the mapping between the various orthographies used to represent numbers, there are still substantial effects of orthographic variation on children's judgments of relations between numbers. Overall, it appears that Arabic numerals do have a privileged role in children's early numerical reasoning. This could be due to a number of factors, two of which are state-dependent learning (Arabic numerals are the preferred notation for teaching mathematics and performing arithmetic) and the consistency of Arabic numerals as a base-ten representational system. Support for the second explanation comes from the Chinese character condition: children were more likely to access multiplicative relations when judging an orthography that maps consistently onto the base-ten structure of Arabic numerals than when judging the English alphabetic orthography for writing number names.

Children also provided justifications for their judgments of a set of triads presented at the end of each packet. Children in all conditions saw the same four triads (in the appropriate orthography), and judgments of these triads were not incorporated in the scaling results. Coding of children's justifications is consistent with results from the analyses of judgments described above. U.S. children were less likely to cite odd/even relations, prime numbers, or other multiplicative features as a basis for their judgments when stimuli were presented as English words, although Chinese subjects showed no such effect. Judgments based on the sound or writing of numbers (e.g., picking "ten" and "thirteen" as most similar because "they both end in 'en'") were relatively rare (8% of U.S. children in the word orthography used this rationale at least once) and occurred only among U.S. children viewing word stimuli. The alphabetic nature of English words can be a distraction from accessing mathematical relations, but, as with abacus experts judging abacus stimuli, U.S. children generally do not incorporate idiosyncratic features of English words in their judgments of relations among numbers presented in this format.

General discussion
There is a compelling analogy between the child's efforts at understanding the nature of the quantitative world and the mathematician's or scientist's attempt to
formally state the rules that characterize quantity and quantitative transformations. As psychologists we have frequently been drawn to using these formal quantitative models to mediate our descriptions of human quantitative behavior. Such models provide a simplified description of the features and phenomena we may expect children to learn to understand, and thus form a basis for evaluating children's understanding of quantity. Furthermore, because formal mathematical models are the products of human minds, we may expect them to reflect in some way the manner in which human beings are prone to think about quantitative phenomena.

The analogy between child and mathematician has frequently been pressed further, however, with the suggestion (e.g., Elkind, 1969) that the child and the mathematician are going about the same task, although on "different levels." Developmental psychologists in particular have been strongly influenced by the demonstration that models for complex quantitative phenomena can be systematically developed from simple postulates. A tempting hypothesis has been that what is mathematically primitive is developmentally primitive as well. In this view, children begin with the most general and simplified models for quantitative phenomena, and proceed by adding further postulates and mastering new transformations.

Representational systems such as calendars, numbers, and written language have an internal structure that may serve to highlight some aspects of the underlying domains (time, mathematics, and language) they represent and obscure others. A "symbol system" is taken to mean any set of symbols whose meanings depend upon the system of symbols in which they occur. Thus the meaning of "April," for example, depends largely on its affiliation with a particular calendar system within which it designates a particular unit. "Symbolic structure effects" are any effects that the particular organization of a symbol system has on cognition about the domain that the system represents. Children learn to use symbolic systems before their concepts of number, time, and language are fully formed, and this fact raises the question of what role the structure of symbolic systems plays in the acquisition of a conceptual understanding of these domains. Not all of the difficulty children have in understanding mathematics, for example, is due to the tools their culture provides them for representing number. On the other hand, tools for representing number may facilitate or inhibit children's understanding of certain mathematical principles. A model based on the belief that there is a simple, unitary number concept for children to acquire provides a poor platform for understanding the development of children's number concepts in the context of the tools their culture provides for representing and manipulating numerical relations.
The evidence presented in this paper indicates that children's conceptions of number are more fluid than a simple foundational model would allow. As children learn new applications for numbers, some of the relations that underlie those applications are incorporated into their concepts of what numbers are. The expansion of children's number concepts is neither automatic nor mechanical: abacus expertise, for example, is associated with a decrease in emphasis on abacus-specific features in judging the similarity among numbers. In general, children's number similarity judgments become more similar across representational formats with development, but this is a slow, gradual process. These results suggest that an abstract, integrated concept of number may be the endpoint of development, but it cannot be the foundation on which children's developing mathematical understanding is based.
ACKNOWLEDGEMENTS

The research reported here was supported by NSF Grant BNS 89-09566 and a grant from the University of Illinois Research Foundation to the author. Tom Trabasso and Dan Keating provided early encouragement to explore the psychological consequences of philosophical foundations for number, which I gratefully acknowledge. Neal Cohen, Gail Heyman, and Jim Stigler provided helpful comments on earlier versions of this chapter. Address correspondence to: Kevin F. Miller, Department of Psychology, University of Illinois at Urbana-Champaign, 603 E. Daniel, Champaign, IL 61820-6267 (electronic mail:
[email protected]). REFERENCES Antell, S.R. & Keating, D. (1983). Perception of numerical invariance by neonates. Child Development, 54, 695-701. Baillargeon, R., Miller, K.F. & Constantino, J. (1992). ZOmonth-old infants’ intuitions about addition. Unpublished manuscript, University of Illinois at Urbana - Champaign. Beth, E. & Piaget, J. (1966). Mathematical epistemology and psychology. New York: Gordon & Breach. Binet, A. (1966). Mnemonic virtuosity: A study of chess players. Generic Psychology Monographs, 74, 127-162. (Original work published 1893) Brainerd, C. (1973). Mathematical and behavioral foundations of number. Journal of General Psychology, 88, 221-81.
Brainerd, C. (1978). Piaget's theory of intelligence. Englewood Cliffs, NJ: Prentice-Hall.
Bryan, W.L. & Harter, N. (1899). Studies on the telegraphic language: The acquisition of a hierarchy of habits. Psychological Review, 6, 345-375.
Campbell, J.I.D. & Clark, J.M. (1988). An encoding-complex view of cognitive number processing: Comment on McCloskey, Sokol, and Goodman (1986). Journal of Experimental Psychology: General, 117, 204-214.
Carroll, J.D. & Chang, J.J. (1970). Analysis of individual differences in multidimensional scaling via an N-way generalization of Eckart-Young decomposition. Psychometrika, 35, 283-319.
Clark, J.M. & Campbell, J.I.D. (1991). Integrated versus modular theories of number skills and acalculia. Brain and Cognition, 17, 204-239.
Copeland, R.W. (1984). How children learn mathematics (4th ed.). New York: Macmillan.
Elkind, D. (1969). Conservation and concept formation. In D. Elkind & J.H. Flavell (Eds.), Studies in cognitive development. New York: Oxford.
Flavell, J. (1963). The developmental psychology of Jean Piaget. Princeton, NJ: Van Nostrand.
Frege, G. (1968/1884). The foundations of arithmetic. Evanston, IL: Northwestern University Press.
Frye, D., Braisby, N., Lowe, J., Maroudas, C. & Nichols, J. (1989). Young children's understanding of counting and cardinality. Child Development, 60, 1158-1171.
Fuson, K.C. (1988). Children's counting and concepts of number. New York: Springer-Verlag.
Fuson, K.C. & Kwon, Y. (in press). Effects of the system of number words and other cultural tools on children's addition and subtraction. In J. Bideaud & C. Meljac (Eds.), Les chemins du nombre. Lille: Editions des Presses Universitaires de Lille.
Gelman, R. & Gallistel, C. (1978). The child's understanding of number. Cambridge, MA: Harvard University Press.
Gelman, R. (1978). Counting in the preschooler: What does and does not develop. In R. Siegler (Ed.), Children's thinking: What develops? Hillsdale, NJ: Lawrence Erlbaum Associates.
Gelman, R. & Meck, E. (1983). Preschoolers' counting: Principles before skill. Cognition, 13, 343-359.
Gonzalez, E.G. & Kolers, P.A. (1987). Notational constraints on mental operations. In G. Deloche & X. Seron (Eds.), Mathematical disabilities: A cognitive neuropsychological perspective (pp. 27-42). Hillsdale, NJ: Erlbaum.
Hatano, G. & Osawa, K. (1983). Digit memory of grand experts in abacus-derived mental calculation. Cognition, 15, 95-110.
Hatano, G., Miyake, Y. & Binks, M.G. (1977). Performance of expert abacus operators. Cognition, 5, 47-55.
Hurford, J.R. (1975). The linguistic theory of numerals. Cambridge, Eng.: Cambridge University Press.
Hurford, J.R. (1987). Language and number. Cambridge, Eng.: Cambridge University Press.
McCloskey, M., Sokol, S. & Goodman, R.A. (1986). Cognitive processes in verbal-number production: Inferences from the performance of brain-damaged subjects. Journal of Experimental Psychology: General, 115, 307-330.
Miller, K.F. & Gelman, R. (1983). The child's representation of number: A multidimensional scaling analysis. Child Development, 54, 1470-1479.
Miller, K.F. & Stigler, J. (1987). Counting in Chinese: Cultural variation in a basic cognitive skill. Cognitive Development, 2, 279-305.
Miller, K.F. & Stigler, J.W. (1991). Meanings of skill: Effects of abacus expertise on number representation. Cognition and Instruction, 8, 29-67.
Miller, K.F. & Zhu, J. (1991). The trouble with teens: Accessing the structure of number names. Journal of Memory and Language, 30, 48-68.
Miller, K.F., Zhu, J.J., Zhang, H.C. & Yan, G.G. (1992). Language, orthography and number: US/China comparisons. Unpublished manuscript, University of Illinois at Urbana-Champaign.
Miura, I.T., Kim, C.C., Chang, C.M. & Okamoto, Y. (1988). Effects of language characteristics on children's cognitive representation of number: Cross-national comparisons. Child Development, 59, 1445-1450.
Moore, D., Benenson, J., Reznick, J.S., Peterson, P. & Kagan, J. (1987). Effect of auditory numerical information on infants' looking behavior: Contradictory evidence. Developmental Psychology, 23, 665-670.
Peano, G. (1973/1889). Selected works of Giuseppe Peano (H. Kennedy, Trans.). Toronto: University of Toronto Press.
Piaget, J. (1965/1941). The child's conception of number. New York: Norton.
Piaget, J. (1968/1964). Six psychological studies. New York: Vintage Books.
Russell, B. (1919). Introduction to mathematical philosophy. London: George Allen and Unwin Ltd.
Shepard, R.N., Kilpatric, D. & Cunningham, J. (1975). The internal representation of numbers. Cognitive Psychology, 7, 82-138.
Sokol, S.M., Goodman-Schulman, R. & McCloskey, M. (1989). In defense of a modular architecture for the number-processing system: Reply to Campbell and Clark. Journal of Experimental Psychology: General, 118, 105-110.
Starkey, P. & Cooper, R.G. (1980). Perception of numbers by human infants. Science, 210, 1033-1035.
Starkey, P., Spelke, E.S. & Gelman, R. (1983). Detection of intermodal numerical correspondences by human infants. Science, 222, 179-181.
Starkey, P., Spelke, E.S. & Gelman, R. (1990). Numerical abstraction by human infants. Cognition, 36, 97-127.
Stevenson, H.W., Lee, S.Y. & Stigler, J.W. (1986). Mathematics achievement of Chinese, Japanese, and American children. Science, 231, 693-699.
Stigler, J.W. (1984). "Mental abacus": The effect of abacus training on Chinese children's mental calculation. Cognitive Psychology, 16, 145-176.
Vaid, J. & Frenck-Mestre, C.A. (1991). Incidental memory for format of presentation of number stimuli: Evidence from monolinguals and bilinguals. Brain and Cognition, 17, 272-284.
Wittgenstein, L. (1958/1953). Philosophical investigations (3rd ed.). New York: Macmillan.
Wittgenstein, L. (1956). Remarks on the foundations of arithmetic. Cambridge, MA: MIT Press.
Wittgenstein, L. (1974). Philosophical grammar. Oxford: Basil Blackwell.
The Nature and Origins of Mathematical Skills
J.I.D. Campbell (Editor)
© 1992 Elsevier Science Publishers B.V. All rights reserved.
Chapter 2

RELATIONSHIPS CHILDREN CONSTRUCT AMONG ENGLISH NUMBER WORDS, MULTIUNIT BASE-TEN BLOCKS, AND WRITTEN MULTIDIGIT ADDITION
Karen C. Fuson, Judith L. Fraivillig & Birch H. Burghardt Northwestern University
Introduction and background

Arithmetic has arisen in many different cultures as a way to solve problems concerning quantitative aspects of real-world situations. These quantitative aspects are described by words and, in many cultures, by marks that are written on some surface. In traditional cultures children learn arithmetic by observing and eventually using the quantitative words and written marks in their situations. In modern cultures, however, children are taught the arithmetic of single-digit whole numbers, multidigit whole numbers, integers (negative numbers), decimal fractions, and rational numbers. In much of this teaching, children do not learn to talk and write about quantitative aspects of real-world situations, but rather stay within the arithmetic marks world and memorize sequences of written marks steps (routines) to accomplish each operation for each kind of number.[1] For too many children, this approach results in a verbal superstructure of hierarchical routines unrelated to anything. As a result, there is massive interference among the routines. Children have no way to reconstruct or verify forgotten routines (and even no belief or expectation that one could do this), and they have poor ability to apply these routines to real-world situations. Problem solving is often conceptualized only as the need to select and carry out the correct routine, and children have few ways to estimate answers to decide if an answer is sensible (e.g., Fuson, in press-a, in press-b).

[1] The term written "marks" is used instead of the more usual term "symbol" in order to remind the reader continuously that for children, and any mathematical novices, the written mathematical squiggles are arbitrary and contain few cues to their referents in the real world. For adults and mathematical experts these meanings are so automatic that it is very difficult for us to remember and appreciate how arbitrary these written marks are.

Before machine calculators were invented, there was a considerable need
for human calculators, so the school emphasis on producing human calculators, with its restricted calculator-arithmetic focus, was understandable and probably even sensible. With the worldwide availability of inexpensive hand-held calculators, the need has shifted to humans who can apply and use these machines in real-world situations, and even in future situations not yet known. The need is now for meaningful arithmetic that can be related to real-world situations.

Meaningful arithmetic requires that quantitative operations on mathematical words and written marks be connected to real-world referents (objects or situations) in order for children to have an opportunity to understand the meanings of these operations (i.e., to see the attributes of their real-world referents). Various pedagogical objects have been invented and used in mathematics teaching for this purpose. For example, fraction pies, fraction bars (rectangles), or fraction strips divided into different numbers of units are used to consider various aspects of fractions. Studies have been carried out concerning the relative efficacy of learning with and without these pedagogical objects. However, the efficacy of a particular pedagogical object is limited by the extent to which its salient physical features actually present the mathematical domain for which it is used. There has been little serious analytical research that has attempted to define the mathematical attributes of the written marks and spoken words used in a particular mathematical domain or to suggest the kinds of pedagogical objects that would present these needed attributes. Such an analysis is required for the necessary next step: the empirical investigation of the ease with which children can make links among written marks, spoken words, and pedagogical objects and use these links to construct full and correct conceptual structures for these marks and spoken words and for operations on them.

Such analytical research is complicated by the fact that the words and the written marks used to describe the mathematical entities may have different structural characteristics as well as some structural characteristics that are alike. Full understanding of the mathematical concepts being symbolized in verbal and written form may only be possible if both the system of words and the system of written marks are understood. The words and marks each fall along a positive-to-negative continuum: the positive side ranges from cueing many to cueing no important features of the symbolized concepts, and the negative side ranges from containing no misleading features to containing many misleading features. For example, Chinese words for fractions convey more of the underlying fraction meaning than do English words: One says 3/5 as "out of five things three" in Chinese and "three fifths" in English. The word "fifths" conveys no sense of a fraction or of a ratio meaning and is even misleading because it is the same word used for the ordinal number meaning (fifth in the race, number five in an ordered
sequence). The Roman numeral VIII reflects the fiveness in a hand (the V-shape of the thumb and other fingers) plus three more fingers to make eight, whereas the Arabic numeral 8 contains no cues to its eightness. Multidigit numerals (e.g., 8625) are misleading because they look like adjacent single digits; no feature suggests that they tell how many tens, hundreds, and thousands, or even that they involve such multiunits (larger units formed from multiple smaller units). Fractions similarly just look like two single digits separated by a line (3/5), thus seducing children into adding, subtracting, multiplying, and dividing these single digits as a way to carry out these operations on fractions (they are actually correct for two of these operations, thus giving partial reinforcement to this single-digit approach).

Analyses of the structural characteristics of a mathematical domain and empirical investigations based on such analyses are needed. These can examine how the words and written marks used for a given mathematical domain fall on this continuum and explore how these supportive and misleading features may help or hinder children's learning. This chapter presents such an empirical investigation in the domain of multidigit addition and subtraction. This investigation and the analyses presented here are based on an analysis of this domain presented in Fuson (1990a). This analysis is briefly summarized in the next several paragraphs to provide a context for the chapter; the parts of the analysis concerning numbers larger than four digits and conceptual structures beyond those needed for multiunit addition are not summarized here because this investigation did not extend to those areas.

Conceptual analysis of multidigit number marks and number words

English multiunit words and the usual multidigit marks have some features in common. Any system of words or written marks expresses large numbers of single units by combining several different larger multiunits (chunks of single units). English words and base-ten written marks both use the same multiunits based on powers of ten: Each multiunit consists of ten of the next smaller multiunit. Each of these systems also uses nine different symbols (words or marks) to denote the first nine numbers and then uses these same nine symbols to tell how many of each multiunit there are. Words and marks also differ in important ways. Written marks require the perception of a visual layout of horizontal "slots" or positions into which any of the nine number marks can be placed, whereas English words require learning special multiunit words (thousand, hundred, ten) that are each prefaced by any one of the nine number words. English words immediately say the largest multiunit, but one has to look at the number of places in written marks to decide the value of the largest position.
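The asymmetry between the two systems can be made concrete in a minimal sketch (mine, assuming numerals of at most four digits, as in the analyses that follow): one association maps each digit to a number word, and a second maps each position to a multiunit word. The output uses regularized multiunit words; irregular English forms such as "twenty" or "seventeen" are deliberately ignored.

    NUMBER_WORDS = ["zero", "one", "two", "three", "four",
                    "five", "six", "seven", "eight", "nine"]
    MULTIUNITS = ["", "ten", "hundred", "thousand"]  # indexed from the right

    def marks_to_words(marks):
        # Say a written numeral (e.g., '8625') in English-like multiunit words.
        words = []
        for i, digit in enumerate(marks):
            place = len(marks) - 1 - i   # position counted from the right
            d = int(digit)
            if d == 0:
                continue                 # the vanished multiunit goes unmentioned
            words.append(NUMBER_WORDS[d])
            if MULTIUNITS[place]:
                words.append(MULTIUNITS[place])
        return " ".join(words)

    print(marks_to_words("8625"))  # eight thousand six hundred two ten five
    print(marks_to_words("5096"))  # five thousand nine ten six

Note how the 0 in 5096 must hold its position in the marks, while the corresponding multiunit simply disappears from the spoken words; this difference is taken up below.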
Associations between these two systems enable translation between them. The first association is between the nine number words and the nine written marks (e.g., one for 1, two for 2, etc.), and the second association is between the English multiunit words and particular mark positions. To carry out a translation of marks to words using the second association (i.e., to say written marks as English words), one must count or subitize (immediately recognize visually) the number of positions in a multidigit number to find out the value of the leftmost position (e.g., "one, two, three, four -- oh, the fourth place is thousands"), or use the multiunit English word list in increasing order (ones, tens, hundreds, thousands) to find the name of the leftmost position. In both cases, these procedures are opposite to the order children are used to: they read and usually count (Fuson, 1988) from left to right, and they say multiunit English words in order from largest to smallest. To read this written multidigit number, one must then use the first association between the written mark and one of the nine English number words, say that written number, and follow it with the English multiunit word even though there is no cue in the multidigit marks to say this multiunit name. This process continues until the whole multidigit number is said. Translating in the opposite direction (writing spoken English words in marks) is much easier: one just uses the first association to write marks corresponding to the nine number words in the order they are said and ignores the special multiunit words.

If one kind of multiunit does not appear in the number (e.g., 5096 has no hundreds multiunits), however, another difference between the two systems appears. In English words that multiunit just disappears and is not mentioned; in the written marks a special new mark, 0, must be used for that vanished multiunit so that all of the other marks will stay in their correct multiunit positions (they will move one position too far to the right if no mark is put into the position of the vanished multiunit). A final difference between the two systems is that one can easily say more than nine of a given multiunit, and such constructions have a quantitative meaning even though they are not in standard form (e.g., twenty twelve or five thousand thirteen hundred fifty two), but one cannot write such a number because doing so pushes the larger multiunits into the wrong positions (e.g., 212 is not twenty twelve and 51352 is not five thousand thirteen hundred fifty two).

The English words are concatenated in that they are independent and strung together successively. Young children and novices at learning the written marks (such as European adults used to concatenated Roman numerals first learning the new Arabic numerals in the Middle Ages) frequently learn written marks for each multiunit value and then concatenate rather than embed these multiunit values (e.g., one hundred sixty four is written 100604 instead of 164, or seventy five is written 705 instead of 75) (Bell & Burns, 1981; Ginsburg, 1977; Menninger, 1969). These errors result from using a cardinal
notion of written marks (one hundred is three marks because it is written 100) instead of using the correct ordinal position meaning of the 100 mark: a 1 written in the third position from the right.

For either of these systems to mean anything, they must be linked to a conceptual multiunit structure. All of the above learning can take place without the learner having any idea what the English multiunit words or marks positions mean, and the translations can be carried out completely as rote procedures. This is in fact what seems to happen to many children in the United States under usual school instruction. They do not have any quantitative multiunit referents for either the English words or the written marks, and their conception of both of these systems, but especially the written marks system, is a concatenated single-digit conception: multidigit numbers are viewed as concatenated single-digit numbers (see the literature reviewed in Fuson, 1990a).

Two abstract conceptual structures (ways of thinking) seem sufficient for understanding multiunit addition and subtraction. These are the "multiunit quantities" conceptual structure and the "regular ten-for-one and one-for-ten trades" conceptual structure. The "multiunit quantities" conceptual structure supplies multiunit meanings for the English multiunit words and the marks positions. Its construction therefore requires experiences with multiunit collections of single units (collections of ten units, a hundred units, and a thousand units) that can be referents for the English multiunit words and the marks positions. In such multiunit situations, a viewer must focus on the cardinality of the units and conceptually collect these units to form the required multiunit (e.g., see the ten units as one ten formed from the ten units). Because the presence of the ten units cannot ensure that the viewer actually collects them into a multiunit of ten (thinks of them as one ten), a distinction is needed between the potentially collectible multiunits presented in a situation and the conceptual collected multiunits formed by an individual seeing the multiunits presented in that situation.

Base-ten blocks were invented by Dienes (1960) to support children's construction of multiunit conceptual structures. There are four kinds of blocks: single-unit blocks (1 cc), ten-unit blocks (long blocks ten units long), hundred-unit blocks (flat blocks ten units by ten units by one unit), and thousand-unit blocks (large cubes ten units by ten units by ten units). Other physical referents and situations also can present such collectible multiunits, but base-ten blocks were used in the present study and so will be discussed as the exemplar collectible multiunits. The "regular ten-for-one and one-for-ten trades" conceptual structure is constructed from the multiunit conceptual structure, or from situations presenting collectible multiunits, by noticing that ten of one unit or multiunit makes one of the next larger multiunit and vice versa. This ten/one relationship
can be learned as a rote rule; the ten/one conceptual structures are based on multiunits.

Much of the above discussion has ignored the fact that the English words, and most European number words, actually have many irregularities for the multiunit of ten while being totally regular for the multiunits of hundred and thousand. These irregularities create problems for English-speaking children learning single-digit sums over ten and learning place value and multiunit addition and subtraction (Fuson, 1990a, 1990b, in press-b; Fuson & Kwon, 1991). In contrast, Asian languages based on Chinese are regular for the tens as well as for higher multiunits: 52 is said "five ten two." Naming the multiunit of ten in this regular way seems to facilitate children's construction and use of conceptual multiunits of ten (Miura, 1987; Miura, Kim, Chang, & Okamoto, 1988; Miura & Okamoto, 1989) and their use of conceptual multiunits in multidigit addition and subtraction (Fuson & Kwon, in press-b).

Addition of multiunit numbers involves two major components that differ from addition of single-digit numbers. The first component is adding like multiunits, that is, adding tens to tens, ones to ones, hundreds to hundreds, etc. The need for this component arises because one cannot combine different multiunits to make only one of these multiunits (5 hundreds plus 4 tens does not equal 9 hundreds or 9 tens). The second component is recognizing and solving the problem of having too many (≥ ten) of a given multiunit. This need arises only in the marks world, and not in the English words or with base-ten blocks, because it is only with marks that one cannot write more than nine of a given multiunit. Thus, when the sum of a given multiunit exceeds nine, one must trade ten of that multiunit for one of the next larger multiunit. The two multiunit conceptual structures, the "multiunit quantities" structure and the "regular one/ten trades" structure, can direct correct addition with respect to both of these components of multiunit addition and can help eliminate incorrect addition procedures. The physically salient size differences among base-ten blocks suggest combining same-sized blocks to find a multiunit sum and also support trading when a multiunit sum exceeds nine.

When teachers used base-ten blocks to model the standard United States algorithm (writing the traded 1 multiunit above that multiunit number in the top addend), second graders of all achievement levels and high-achieving first graders learned multidigit addition of four-digit numbers and gave conceptual multiunit explanations for their trading (Fuson, 1986; Fuson & Briars, 1990). These studies indicated that the use of base-ten blocks directly linked to written marks procedures and described in English words and block words could be a powerful instructional intervention. However, they provided little detailed information concerning the ways in which children formed multiunit conceptual structures, and they did not indicate the
extent to which children could use the blocks in a similar linked way to construct their own multiunit blocks addition and written marks addition procedures.

Purposes of this study
This investigation had three main purposes. The first was to examine how easy it is for children to construct the relationships among English number words, written multidigit marks, and base-ten blocks and to maintain these relationships while exploring multidigit addition with the blocks and the marks. An important part of this first purpose was to identify trouble spots in this construction process, for both practical and theoretical reasons. If teachers are to undertake this new approach, it would be quite helpful for them to have a roadmap marking typical roadblocks and error-prone routes as well as productive routes. Such knowledge can also contribute to theories about how to support such learning and about how this learning occurs. As part of this exploration, regular ten-based Asian word forms for the second position (e.g., "seven ten five" for 75) were taught to some groups.

The other purposes stem from the Curriculum and Evaluation Standards for School Mathematics (1989) and the emerging vision of new teaching methods described in the Professional Standards for Teaching Mathematics (1991). These suggest supporting children's construction of their own arithmetic methods and recommend that children work in small groups. The second purpose of this study was to examine the kinds of procedures children would invent for four-digit numbers when given the support of base-ten blocks and to explore the amount of generalized place-value knowledge gained in making these constructions. The existing data on children's invented multidigit addition methods are limited almost entirely to addition of two-digit numbers (see the review in Labinowicz, 1985). Thus, we do not know how easy it is for children to extend these two-digit procedures to three-digit and four-digit numbers (that is, to invent and see a general procedure across several digits) or to gain adequate generalized place-value understanding in such an approach. The third purpose was to explore the benefits and limitations of children working in small groups in this endeavor. We have almost no knowledge of the kinds of teacher support children need for multiunit addition approached in this way, and such knowledge would be extremely valuable for teachers attempting to support children's learning in the classroom. The analyses of these purposes are necessarily somewhat intertwined, but this report concentrates on the first purpose.

This initial study used children who were high-achieving in mathematics so that the mathematical work would not be limited by gaps in prerequisite mathematical
knowledge or by huge motivational problems that interfered with their mathematical functioning. A second study using average- and low-ability children has been done, but the results are not yet analyzed.

Method
Subjects

The 26 children participating in this study were all the children in the highest achieving of the three second-grade math classes in a Chicago-area school that grouped children for reading and for math. Children in the school came from a wide range of SES backgrounds, from meeting the federal standards for receiving free school lunches to high parental income and education levels, and were racially heterogeneous. Children were given a written and interview pretest that assessed conceptual and procedural competence in place value and multidigit addition and subtraction. Children ranged on the pretest from solving no two-digit or four-digit addition problems correctly (6 children) to solving all three vertical and both horizontal problems correctly (4 children); they showed a similar range in place-value knowledge and in conceptual explanations for two-digit and four-digit trading and for alignment of uneven problems. There were many different patterns of place-value and addition knowledge, with some children showing strength in one but not the other.

Children were formed into three initial knowledge levels (high, medium, low) on the basis of pretest performance. Each initial knowledge level was split into two groups of 4 or 5 children balanced by gender. Because of the varied patterns of performance on the pretest, the children in each group, especially in the middle and low initial knowledge levels, were quite heterogeneous with respect to the kind of domain knowledge they possessed. It emerged during the study that the first-grade teacher of about half of the children had used a different kind of base-ten blocks to teach place value. She taught addition and subtraction but did not use the blocks for this teaching. Every group contained at least one child who had seen the blocks before. Thus, the data concerning children's initial exploration of the blocks must be considered as that possible when at least one group member has had some initial exposure to the blocks.

Procedure

The initial orientation to the study was done with the whole class during one class period. The study was described, and the session then focused on
establishing norms for the cooperative group work and describing the group roles of leader and checker that were used to facilitate group work and participation by everyone. The group approach was adapted from Cohen (1986) and Johnson and Johnson (1989). The principles of groupwork were to be brief, to listen to others and reflect on what they have said, and to make sure that everybody gets a turn. The leader's roles were to enforce these principles and to choose whose turn it was to speak. The checker's role was to be sure that group members said whether they understood or agreed with procedures being used or conclusions drawn by the group. These roles rotated around the group on a daily basis; children wore a special large button to identify their roles.

Each group had an adult experimenter who monitored the group learning. One experimenter was a Ph.D. candidate who had designed the study and the groupwork approach and had considerable knowledge of the literature on children's mathematics learning. The other two experimenters were undergraduate honors students in psychology and education who had extensive experience with children. Each experimenter oversaw the videotaping of each group, took live notes of important mathematical discussions, and intervened when children's behavior became too rowdy or when the group became stuck on a mistaken procedure for too long. Math class was 40 minutes long, and about 35 minutes were effective working time as opposed to set-up or clean-up time.

An experimenter-intervention strategy was adopted that attempted to let children follow wrong paths until it did not seem likely that any child would bring the group back onto a productive path; the experimenter then intervened with hints to help the group, giving as little direction as necessary. This was done to provide maximal opportunities for the children to resolve conflicts and solve problems creatively. Because this necessarily involved a judgment call, this loose description was replaced in the second session (see below) by a criterion of letting children follow a nonproductive path or engage in incorrect mathematical thinking for the length of one class session but then intervening. This criterion was intended to reflect the reality of a classroom where a teacher monitoring six or more groups might not get to a given group for a whole class session but would be able to give support by the end of that time.

Space and videotape equipment constraints resulted in a need for two successive data-gathering sessions. Each session used three groups: one high, one medium, and one low initial knowledge group. In each 3-1/2 week session the teacher worked in the classroom on a different topic with the half of the class not participating in that data collection session. Each experimenter supervised the two groups at the same initial knowledge level in the two sessions.
For the initial experience with the base-ten blocks the experimenters followed a script that asked children to do several successive tasks: 1) choose their own names for each kind of block, 2) find the (ten-for-one) relationship between adjacent-sized blocks, 3) find any similarity between these relationships, 4) establish the English words for each kind of block, and 5) establish the relationships among block arrays, English words, and the standard four-digit marks. The groups varied in the time they spent on these tasks, taking between one and three class periods to finish them.

In the first session each group was given a file box of digit cards (small cards each with one numeral written on it) to show four-digit marks; these had been used successfully in the studies modelling the standard algorithm (Fuson, 1986; Fuson & Briars, 1990). When used for addition and subtraction by the children in the groups, the digit cards proved to be very time-consuming. It took children a long time to put all of the index cards away after a problem, and children frequently worked only in the blocks world or in the digit card world and did not link the two. Therefore, the digit cards were replaced in the second session (i.e., for the second three groups) by a "magic pad," an 11" by 14" pad of paper which was magic because it had to show everything that was done with the blocks as soon as it was done but could not show anything that was not done with the blocks. Children were encouraged to "beep" whenever these constraints were violated. In both sessions children also wrote on individual papers after the first few days of addition. Because big cubes are expensive, each group set had two wooden big cubes and five big cubes made of cardboard folded and taped into cube shape. Some groups also had for the addition phase a few hundreds blocks cut out of plain wood.

During the first session, the highest- and lowest-achieving groups received the language intervention in which they were taught to use "Asian" number words for the tens place (68 was said "six ten eight"). We had intended to manipulate this variable across achievement levels and use it only in the middle-achieving group in the second session. But during the first session, only some of the children regularly used the Asian tens words. We did not want to interrupt the flow of children's work and the establishment of their autonomy by continually reminding them of this use. Therefore, we abandoned the manipulation of this variable and introduced this terminology to all three groups in the second session, intending to watch its survival and use with little support from the experimenters.

During the addition phase of the study, children were given an addition problem written horizontally on a long strip of paper and asked to use the blocks and the digit cards (or magic pads) to do that addition problem. After they had agreed upon a solution to one problem, they were given the next problem in a prepared list of problems. The first several problems in the list had four-digit numbers as
both addends (e.g., 1629 + 3747). No commas were used to write four-digit numbers. All problems required trades in one, two, or three columns. The number of thousands was generally one through four in each addend; this was smaller than the other numbers (which ranged up to addends of 9) because we did not have as many thousands blocks as other blocks. The issue of adding like multiunits was raised by giving the children some four-digit problems plus three- or two-digit problems later in the problem list. After several days of addition, children wrote marks problems on individual papers as well as doing them with blocks and/or digit cards on the magic pad. The goal was to move from doing coordinated block and individual paper solutions to doing just individual paper solutions connected to mental multiunit quantities. We had hoped to have children work on multidigit addition until the group had agreed on one or more correct written procedures, and most children could add in written marks without using the blocks and could explain the addition in terms of multiunit quantities. Our agreement with the school regarding the dates for the study was based on the amount of time teachers in the earlier teacher-directed studies had spent with high-achieving second graders learning multidigit addition and subtraction with the blocks. Our original dates included at least 18 learning days, but math class was canceled on several days. Because we were obligated to teach both addition and subtraction, we moved on to subtraction before some of the children in the lower two groups displayed as much competence in addition as we desired (these subtraction results will be described elsewhere). Because most groups did not get to subtraction problems with zeroes, the posttests were combined with a teacher-directed phase intended to be more like the original teacher-directed studies. The high initial knowledge children who had done such problems were split between the other two groups, and these groups worked for three days on such problems with considerable direction by the teacher for each group. During this time the experimenters interviewed children from their own group.
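The screening constraint on the problem list--that every problem require trades in one, two, or three columns--can be checked mechanically. The following Python sketch is ours, purely to illustrate the rule; nothing like it was used in the study.

    def trades_needed(a, b):
        # Count the columns in which adding a and b forces a trade
        # (a carry), working from the ones column leftward.
        count, carry = 0, 0
        while a or b:
            column_sum = a % 10 + b % 10 + carry
            carry = 1 if column_sum >= 10 else 0
            count += carry
            a //= 10
            b //= 10
        return count

    # The example problem from the list: 1629 + 3747 trades in the
    # ones and hundreds columns, so it satisfies the rule.
    assert trades_needed(1629, 3747) == 2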
Analyses

All of the videotapes were transcribed by the experimenter for that group or by a work-study student. All mathematical conversations were transcribed verbatim and annotated with respect to actions with the blocks and marks (digit cards or magic pad), social-emotional interaction, and any other aspects of the group interaction not directly reflected in the verbal record. Off-topic digressions were to be summarized with an indication of their topic and length. All transcriptions were checked by a second transcriber.
To ascertain the relationships between actions on the blocks and actions on written marks, block and mark summaries were prepared for each day of addition. These showed in one column drawings of the successive block lay-outs and in another column the successive digit-card lay-outs or magic pad writings; all of these entries were numbered with a line of the transcript and lettered to identify the child doing the action. These summaries made it easy to ascertain key features of the block or mark addition procedure and to determine how parallel the two procedures were. In spite of the emphasis to the transcribers on including complete accounts of the actions on the blocks, digit cards, and magic pads, the transcriptions proved to be variable in the extent to which these block and mark summaries could be prepared from the drawings already in the transcriptions; some tapes had to be viewed again in order to prepare adequate summaries. A category scheme based on the analysis of multiunit knowledge in Fuson (1990a) was developed and used by one coder in a preliminary analysis of errors children made in the groups (Wallace, 1990). A major focus of these categories was associations between or among blocks, block words, English words, and written marks, because we were interested in the extent to which children were constructing these associations. This category system was used over a three-month period by three coders to code every utterance. However, we were unable to achieve acceptably high inter-rater reliabilities. Coders would agree about the first level of association. For example, two coders would agree that a certain utterance was an English word/block association (i.e., that it was an English word that referred to a block), but they would frequently disagree about further levels of association. For example, one coder would conclude that the child at that moment also had that English word and block associated with the written mark in the digit cards or on the original horizontal problem, while the other coder would not agree with this further association. This was not a simple problem of different coders having more or less inclusive criteria overall; rather, they differed in their interpretation of what was in the mind of a given child at a given moment for a given utterance. We finally concluded that it is very difficult, and perhaps inherently impossible, to conclude for a given utterance at a given moment in time just which referents in the mathematical environment are intended by, or within the attention of, the child giving that utterance. The whole goal of this teaching/learning environment is to support the construction of a tightly linked web of interconnections among words, visuo-spatial (actual or mental) objects, and written mathematical marks. Therefore, a child who possesses or is in the process of building such a web can potentially be accessing all of these meanings or referents attentionally in the real world or mentally. However, a child does not necessarily do so at any given moment even though all of these meanings are
available in the actual environment or mentally. Therefore, we abandoned the attempt to code individual utterances and moved to descriptive methods in order to capture the complexity of the relationships children were (or were not) constructing among these different worlds; these methods permitted us to summarize the evolution of these webs within individual groups. We were concerned about the reliability of these more descriptive case-study methods, so the following criteria were established. For the descriptions that are relatively "objective," such as whether a given blocks procedure was accurate or not, the descriptive summaries in this chapter were written by the first author alone. For group interaction or social/emotional issues or other more complex issues, any summaries were prepared and agreed on by at least two authors.

Establishing relationships among blocks, block words, English words, and written marks: Results

The amount of time it took to establish relationships among blocks, block words, English words, and written marks varied by group from 1 2/3 to 3 forty-minute class periods. The Session 1 high initial knowledge group (H1) took 2 days. The Session 2 high initial knowledge group (H2) took 1 2/3 days. The Session 1 middle initial knowledge group (M1) took 3 days (partly because of videotape failure that led to postponing the start of addition until the fourth day). The Session 2 middle initial knowledge group (M2) took 2 1/2 days. The Session 1 low initial knowledge group (L1) took 3 days. The Session 2 low initial knowledge group (L2) took 2 1/5 days. Each group first chose names for the blocks. They then found the ten-for-one equivalencies between adjacent block sizes, beginning with the little cubes and longs.* Children then ascertained the English words for the blocks by deciding how many little cubes were in each of the larger blocks. Some games were then played to practice the connections among the blocks, block words, and English words. Finally, each group worked on establishing relationships among blocks, English words, and marks. Results of each of these activities are described below. Some readers may wish to skip these detailed results and move straight to the discussion of this initial phase of establishing relationships. To facilitate this, and to provide an advance organizer for readers of these results, a brief summary of the major results is provided in the next paragraph.
* Except where we are describing particular group block discussions, we will use the following names for the blocks: little cube, long, flat, big cube.
The names of the blocks chosen by various groups mostly depended on size and shape, and food-related names were common. The naming process varied by group. Children easily found the ten-for-one equivalencies between the little cubes and longs, the longs and flats, and the flats and big cubes. They made few errors with respect to these equivalencies throughout this whole preaddition phase. Four of the 26 children did propose a four-for-one or six-for-one equivalency for the big cube and flat (i.e., they initially said that four or six flats made a big cube). Many verbal responses concerning equivalencies were not maximally helpful because they were so abbreviated. Children were very accurate in using English words and block words. They learned the Asian tens readily, though individuals varied in the extent to which they spontaneously used them in subsequent discussions. Children easily established relations among blocks, English words, and written marks. The need for zero arose in all groups and was successfully resolved. Some groups grappled with the issue of how to write block arrays that had more than ten of one kind of block (see Table 3).
Choosing names for the blocks

Experimenter direction. The experimenters gave the following directions to each group: "Choose a name for each kind of these blocks. Choose names that tell you something about the blocks so that you will be able to remember the names. You all need to agree on the names." Thereafter, the amount of experimenter involvement varied by group. Groups H1 and M1 (Session 1 groups with high and middle initial knowledge) nominated and chose their names without any further comment from the experimenter. In Session 2, in all three groups some child nominated a block name that contained a number. Because we wanted the block names to be distinctive from the English words, which contain numbers in two different roles (as the multiunit name and as the number of multiunits), each experimenter said that the block names could not contain numbers. Two of the experimenters then suggested choosing a name that tells something about what the blocks look like. The group H2 and M2 experimenters both acted to ensure that the group agreed with the final choices. When a child in group H2 announced that they had named everything, the experimenter responded to the lack of clear voting procedure and choices by asking what names they had chosen and then asking if everyone agreed with those stated choices. She then asked the checker to verify that everyone agreed with the choices. The experimenter for group M2 also asked the checker to verify the final choices with everyone. The name-choosing process for the L1 and L2 groups involved considerable interaction with the experimenter. Early in the choosing process for group L1 the experimenter began trying to get
the leader to establish an orderly process, perhaps because the group was nominating names at random for various blocks; the experimenter gradually took over the role of leader. In group L2 the experimenter initially suggested a procedure (beginning with the smallest block) and then took over the leader role early in the process. The choosing process in these two groups did include all children because of the effective adult leadership. To the suggestion of "heavy" for the big cube by group L1, the experimenter pointed out that the cardboard big cubes were not heavy, though the wood cube was. The experimenter questioned the children's choice of "rectangular" in group L1 and "big ice cube" in group L2 because they took so long to say, but the children refused to change these names.

The block names. The number of block name nominations varied across groups. Groups H1, H2, M1, M2, L1, and L2 nominated 13, 50, 18, 14, 23, and 12 names, respectively (see Table 1). All six groups chose names for individual blocks rather than explicitly deciding upon a particular series or overall group category. However, four of the groups nominated a series of block names that did reflect some overall relationship, such as baby block, sister block, Momma block, and Daddy block. None of these series was the final choice. Group H2 nominated several different series names (see Table 1). All six groups chose at least one block name that related to the shape of a block. Three of the groups (H1, M2, L2) used the shape of the block to designate names of food or food-related terms for all four blocks; the children in group L2 explicitly commented on the common theme of food in the names. The three remaining groups assigned two or three of the names based on the shape of the block, with only one of these being related to food ("pancake"); in these three groups the names for the flat and long blocks were based on shape, and the remaining choices, for the small cubes, were based on size. The members of group L1 observed and discussed the fact that three types of blocks had square shapes; this particular feature was therefore rejected as a possible naming criterion. Five of the six groups followed, at their own initiative, a consistent size order when choosing the names for three of the four blocks; the one block out of this order varied across groups. Two of the groups (H1 and M1) moved from the smallest to the largest block, and the other three groups (H2, M2, L1) moved from the largest to the smallest block. Group L2 moved from the smallest to the largest block at the direction of the experimenter.

The block naming process. Two different nominating and voting patterns emerged from the six groups. Three of the groups (H2, M1, M2) randomly nominated names for all the blocks, discussed the nominations, and postponed voting or choosing until the end of the discussion. Groups H1 and L2 discussed
nominations for a particular block and agreed on a final name before nominating names for other blocks. Group L1 began with the first pattern but moved to the second. The procedure for L2 and the shift for L1 were initiated by the experimenter.

Table 1. Names Nominated and Ultimately Chosen by Each Group

Group H1
  Thousands: N: master ice cube; N: Big Mac; L: loaf of bread; L: Mommy; E: meatloaf
  Hundreds: L: flat; L: iceberg; E: glacier; L: plate (chosen)
  Tens: L: carrot stick (chosen)
  Ones: N&L: six-sided block; L: baby; C: ice cube

Group H2
  Thousands: Z: blockhead; M: stegosaurus; D: thousand block; M: apple tree; Z: eight corner; M: city; Z: big square; M: Papa square; M: dunkey; M: Susie; M: Philip; M: math four; D: thousand math
  Hundreds: D: hundred block; D: birch tree; M: house; M: a hundred; M: Mama square; M: Christmas tree; M: dinkey; M: elm; Q: Mama bear; Z: redwood; D: flathead; M: math three; M: fatty; M: facemask; D: hundred math
  Tens: D: 10-block; M: maple tree; D: Pin; Q: skinny guy; M: child square; M: donkey; D: stick; M: math two; D: ten math; Q: skinny
  Ones: M: Pinocchio; Z: one-square; D: one-block; M: Pinocchio (Pin); D: small fry; Z: small square; M: baby square; M: dinkey; Z: eight ball; M: math one; Q: baby bear

Group M1
  Thousands: U: thousands; M: hundreds; V: big guy (chosen); O: fatty
  Hundreds: U: hundreds; O: kids; U: kid guy; U: baby; U: iceberg (chosen)
  Tens: U: tens; U: skinny guys; U: long legs (chosen); O: long guy
  Ones: U&M: ones; M: baby; O: little man (chosen*); U: baby T; U: little guy

Group M2
  Thousands: N: one hundred; T: square; U: cube; Dh: ice cube (chosen)
  Hundreds: T: square; U: pancake (chosen)
  Tens: U: rod; T: candy; N: stick; T: straw; T: orange straw; U: licorice (chosen)
  Ones: Dh: smallest cube in the world; U: tooth (chosen)

Group L1
  Thousands: N: big block (chosen); N: Daddy block; N: heavy block; X: ten blocks
  Hundreds: D: square blocks; D: rectangular square; D: flat block; D: Momma block; D: pancake block (chosen); X: medium block
  Tens: D: rectangular block (chosen); N: rectangle; N: Daddy Junior; N: carrot stick; X: skinny block; X: medium blocks; X: long block
  Ones: D: little block; D: small blocks; N: baby blocks; N: brother block; N: sister block; X: tiny block

Group L2
  Thousands: T&K: thousands; B: big ice cube (chosen)
  Hundreds: J: hundreds; N: bread (chosen); B: bread-bread
  Tens: N: tens; T&K: strange-strangey; T: breadstick; B: pretzel (chosen)
  Ones: B: ones; B: sugar cube (chosen); B: cheese

Note. The nominations are listed in chronological order and are prefaced by the nominating child's code; names marked (chosen) are the groups' final choices. *"Little man" is the chosen name, but "little guy" is the name actually used by the group. The change occurred without explicit group discussion.
The time devoted to the block naming task differed across the groups. Groups H1 and M1 completed the naming process in less than 3 minutes and 4 minutes, respectively. Groups M2 and L2 completed the task in 6 and 7 minutes,
respectively. Both groups L1 and H2 spent 14 minutes nominating and assigning names to the blocks. The block naming process varied considerably across groups. The personalities of the official leader and of the most dominant group member (these only sometimes coincided) strongly influenced this process. Some exercised their power with relatively little sharing, while others used their power to bring other group members into the choosing process. The children in group H1 quickly grasped the task of assigning block names and worked well together to accomplish it. Each child seemed to enjoy the notion of assigning names to the blocks. The child who initiated the nominating activity, L, emerged as the dominant group member in the naming process. She nominated the most names (8), won the most final choices (2), and successfully vetoed nominations for blocks that she did not win. Ultimately, however, the group engaged in a voting process that included all group members for each decision. Group H2 had difficulty in agreeing on what the block names should be. The process was meandering, with considerable disagreement between M and the official leader Q. M was a very dominant member in this process; she nominated 8 of the group's 50 nominations (many as a series of names) and gave 5 vetoes and no agreements to the other members. M monopolized the conversation and became frustrated when other members disagreed with her nominations without suggesting alternatives. Q was the least verbal member and nominated only 4 names. D, an involved member, won two of the final four block names. Although D was engaged for most of the activity, at times she would stray from the discussion and construct block buildings. Group member Z made seven nominations. He seemed frustrated with the prolonged inconclusive discussion and fairly early suggested that each person name one block. This suggestion was ignored for a long time but was followed at the end, when each person chose one block name and everyone agreed without voting. Group M1 passively followed the strong leadership of U, their official leader for the day. U suggested 12 of the 18 nominations, winning two of the four final names. U later changed the name for the unit block to her own original suggestion. The group expressed no direct vetoes and maintained a low level of engagement throughout the activity. One member, D, expressed his votes nonverbally by raising his hand and did not nominate any names. Under the capable leadership of the official leader, Da, group M2's voting process was expedient and smooth. Da initiated the voting process, objectively facilitated the discussions, and kept the group focused on the task at hand. Da's comments of "Does anyone have any suggestions for..." and "So who likes pancake for this?" propelled the voting process and invited involvement. Consequently, all
group members were engaged and agreeable. The group offered many statements of agreement and few vetoes. Da suggested that the group choose names that would reflect the blocks' relative size. From his five nominations, U won three of the four final choices and openly displayed his pride. Group L1 members contributed equally to their 23 nominations and maintained similar levels of involvement throughout the exercise. The experimenter acted early to elicit more leadership from the official leader and then began to function as the leader, eliciting these equal levels of participation and strongly guiding the name-choosing process. N, the official leader, offered tacit approval of D's and X's suggestions. D gained quiet dominance over the group by means of her parsimonious approval and frequent vetoes. X evaluated others' nominations with a balanced number of agreements and vetoes. Overall, the group was not strongly engaged and grew less interested over the 14-minute process. One group member, M, was absent during the naming process. Group L2's voting process was structured from the beginning by the experimenter, who suggested moving from the small to the large blocks. The official leader, T, relinquished her authority and strayed off task after her nomination of "breadstick" was not chosen by the group; the experimenter then took over the leader role. K, although nominating only two names, skillfully performed his job as checker. B suggested five of the overall 12 nominations and won three of the final four choices. The group at no time voiced vetoes of the few nominations but rather re-suggested their preferred choices.
Finding the ten-for-one equivalencies in the blocks

This activity was structured by a worksheet that presented three successive questions using block diagrams. The questions were all of the form: How many ___ equal a ___? The first blank showed a diagram of the smaller block, and the second showed the block that was ten times larger. The questions moved from the small cube/long equivalency to the long/flat equivalency to the flat/big cube equivalency. Table 2 presents the number of children giving verbal responses that stated "ten" as the equivalency or making block demonstrations showing that ten smaller blocks make one of the next larger blocks. Both kinds of responses are further classified into two types: a) simple block demonstrations or statements of a ten-for-one equivalency, for example, that ten longs make a flat, or b) complex demonstrations or statements, which involved the use of an already established ten-for-one equivalency. Most of the complex cases were arguments or demonstrations that used two adjacent equivalencies, for example, "Ten little guys in a long leg, ten
long legs in each iceberg means one hundred little guys in an iceberg." The table includes not only children's responses to the worksheet but also spontaneous demonstrations of equivalence that occurred before this task (many children spontaneously put small cubes on top of a long or longs on top of a flat during the block name-choosing process) and after this task while children went through the rest of the preaddition activities (most of the complex uses occurred in the phases after the equivalency task). For each entry in Table 2, the number of responses a child made ranged from one to four.
Table 2. The Number of Children in Each Group Who Demonstrated Ten-for-One Equivalencies with Blocks, with Words, and with Blocks and Words Simultaneously

                      Ten Hundreds =      Ten Tens =        Ten Ones =
                      One Thousand        One Hundred       One Ten
                      Simple   Complex    Simple   Complex  Simple   Complex
Blocks and Verbal       12       --          8       10       13        1
Blocks only             --       --          1        8        8        9
Verbal only             10        3          7       10       14        9

Note. Over the total equivalency time, a child could produce equivalencies in each category (verbal only, blocks only, blocks and verbal at the same time). A child is entered in each category once regardless of how many equivalencies of that type s/he produced, but a child may appear in more than one category.

Every child in the study demonstrated understanding of at least one ten-for-one equivalency. All but two children demonstrated the ten/one equivalency verbally or with blocks or in both ways. These two children did identify both the thousand/hundred and the hundred/ten equivalencies. Twenty-three of the 26 children demonstrated the hundred/ten equivalency, and twenty children correctly showed or stated the thousand/hundred equivalency. Most of the demonstrations with blocks consisted of putting ten of the smaller blocks on top of the larger block (on top of the long or the flat) or beside the larger block (beside the long or the big cube). Children in three groups spontaneously put ten flats inside the open cardboard version of the big cube as part of the discussion of how many small cubes make a big cube (see the next section). Verbal responses included simple answers of "ten" to the how-many question, counts of the blocks that ended in ten, equivalence statements that used indicating pronouns (e.g., "There's ten of these
in this"), and full equivalence statements using the block words (e.g., "There's ten little guys in a long leg."). The majority of the responses were of the first two types. Most of the equivalence statements did not name the multiunits. The lack of demonstration of an equivalence by a given child may reflect only the structure of the group discussion as not requiring action of any kind from every child in the group. The "flavor" of the equivalency discussions in all groups was one of most children already knowing (from the place-value work with blocks in the previous year) or readily seeing the ten-for-one equivalencies. A few children said that they did not understand a particular equivalency, and other children in the group immediately demonstrated with blocks and verbally told the child the answer. During the whole preaddition phase, there were very few errors concerning the ten-for-one equivalencies. There were two counting errors (final counts of nine and eleven instead of ten) that were immediately corrected by other children. In group L1, two children, N and D, answered "two" in response to questions like, "So, if this (single) is one, what's this (long)?" These responses of the ordinal number of the multiunit, the second multiunit, instead of the cardinal multiunit embodied by the blocks (ten) seemed to be misunderstandings of the question rather than lack of understanding of the multiunit value of the long block. Four children demonstrated confusion over what attribute of the big cube should be used to determine the equivalency. In group H1, E objected to determining the flat/big cube equivalency by stacking ten flats beside one cube, stating, "You can't do it by thickness." The other three group members finished their stacking and counted the flats to show ten. E then counted the blocks himself to verify the group's answer and agreed with it. There was a more prolonged confusion in group L1 that was initiated by N and X answering the flat/big cube equivalency question by focusing on the drawing of the big cube that was on the worksheet. The drawing directed them to the sides of the big cube, and they initially responded by saying four flats equal a big cube. The experimenter moved them from the drawing to the real blocks, where X immediately stacked ten flats beside the big cube and N put four flats on the four sides of the big cube. X reasserted that there were ten, but there was no discussion by N or the group to resolve this difference. During the task of finding the English word for the big cube, N answered "four hundred," demonstrating that he had not changed his view. D then counted the six sides of the big cube, so N changed his answer to "six hundred." X again made a stack of ten flats by the big cube. The experimenter then clarified that the question, "How many pancakes equal a big block?" means how many fill up the cardboard cube, not how many cover the sides. This seemed to resolve the issue because all three children spontaneously demonstrated this ten-for-one equivalency during the discussion of English names on the following day.
Within each of the six groups, at least one child identified the overall ten-for-one pattern that held across all three equivalencies. In all but one of the groups, this was a spontaneous observation that occurred during one of the particular equivalencies. In group L2 the observation that they were all ten came in response to the final worksheet question, "Is there anything the same about what you found in (worksheet questions) 1, 2, and 3?"

Ascertaining the English words for the blocks

The English words for the first three multiunit values used in base-ten multiunit numbers are, like most words, arbitrary. But the quantity named by each of these English words ("ten," "hundred," and "thousand") can be ascertained by establishing how many little cubes are in each of the larger blocks. For the long, this is the same task as the ten-for-one equivalency task. For the flat and big cube, children can establish that one hundred little cubes make the flat and one thousand little cubes make the big cube. The hollow cardboard big cubes could be opened up to facilitate the task of ascertaining that one thousand little cubes fill the big cube. Establishing the multiunit quantity of the flat and big cube in terms of the unit cubes, and providing the English names for these multiunits if children did not already know them, was the next task. Again, the process followed varied considerably by group. All groups established the hundred and thousand equivalencies fairly readily, but they did so with relatively little full verbalization. Three groups had at least one child who began to explore the little cube/flat or little cube/big cube equivalencies during the ten-for-one equivalency task. Two of these groups (H1 and M2) continued on with the little cube/flat equivalency by covering a flat with longs and arguing that one hundred little cubes made a flat because ten little cubes were in a long. In both groups the experimenter then asked how many little cubes would fill the cardboard big cube. Both groups began to fill the big cube with little cubes. Both experimenters, perhaps prematurely, then suggested a more efficient approach (M2: "Is there an easier way?" H1: "How many little cubes if they were all level at the bottom?"). Group M2 then put ten flats into the big cube, and Da gave a full explanation based on all three adjacent ten-for-one equivalencies. In H1, E answered "one hundred" and suggested putting hundreds blocks in the big cube. The experimenter then asked how many carrots (longs) were in the big cube. L and E said "one hundred," and they all started filling the big cube with carrots. E then said he didn't think they had one hundred carrots and that they should start putting in some plates (flats). They filled a new cube with plates, saying that ten plates filled the big cube. The group then wandered into a discussion of the leader and checker roles, and an
explanation of why there were one thousand little cubes in the big cube was never elicited or given. The third such group (H2) only asserted verbally that there were one hundred little cubes in a flat; the experimenter never asked them to explain or demonstrate this with blocks. In this group the same pattern was followed for the little cube/big cube equivalence. In the other three groups the experimenters specifically began a new phase in which they described the task as deciding what the English names were for each kind of block. Group M1 began like H2, in that they asserted the answer verbally. When asked to demonstrate their assertions on the second day, U began putting little cubes on a flat while D put ten longs on a flat. M then counted the longs by tens (10, 20, ..., 90, 100). O reconciled these two approaches by putting little cubes on top of the longs on a flat to show the ten rows of ten little cubes each (only some of the rows of little cubes were made, but all ten rows were described). These children repeated these same roles with respect to the big cube, except that O said they didn't have one thousand little guys, so he repeated the approach of counting by hundreds the ten flats stacked by the big cube. When U said she did not understand, O began to show her. But she then said she understood, stacked ten flats, and said that it was one thousand because the nine flats are nine hundred and the last one makes one thousand. Group L1 began by putting little cubes on the flat. N then put longs on a flat and counted them by tens to get one hundred. The group then moved spontaneously to the issue of the number of little cubes in a big cube and became involved in the ambiguity discussed above, with N and D asserting four hundred and then six hundred (based on four and then six flats to make the sides of a big cube) and X asserting that one thousand little cubes were in the big cube (because of a stack of ten flats beside a big cube). The experimenter clarified the meaning as filling up the big cube. On the next day two children verbalized several times the small cube/long and long/flat ten-for-one relationships to show that one hundred small cubes make one flat, but X said he did not understand. Most of these explanations used "these" rather than block words or English words. The experimenter elicited a counting by ten of the longs on a flat, which N had done spontaneously on the previous day. X then spontaneously moved from one hundred little cubes in a flat to saying there were one thousand little cubes in a big cube. M first stacked ten flats beside a big cube, and they then filled a cardboard big cube with the flats. In group L2, B and J responded that they had learned last year that one hundred little cubes made a flat. B showed and verbalized that "these (sugar cubes) are ten (shows sugar cubes stacked next to a pretzel) and ten of these (pretzels) make this (bread)." J started putting lots of sugar cubes on top of a
bread, and B pushed them into neat rows with a pretzel and covered the rest of the flat with pretzels. To show how many sugar cubes made a thousand, B stacked ten flats beside a big cube. To the task of filling the big cube, J recognized that they did not have one thousand little cubes to use and, when the experimenter said maybe they could fill it with other things, he excitedly said to fill it with flats. B wanted to put in pretzels, which the other children began to do (after some sugar cubes were in). B said that the pretzels had to be in rows (i.e., ten of them together made a flat across the bottom of the big cube). J made a joke by using the food block names to suggest that they make a sandwich with the sugar cubes (their little cubes) between two slices of bread (their flats). Time ran out before they finished filling the big cube.

Practicing labelling the blocks with English words and block words

Children played three kinds of games to practice the associations among the block words, English multiunit words, and blocks. A child would choose a block (or say a block word or an English multiunit word), and the other children had to say the English word and block word for it (or say the other kind of word and show the block). The experimenters were to continue the games until all children were able to produce these English word/block/block word associations quite rapidly and accurately. These games were played because we anticipated that frequent inaccurate or very slow uses of English multiunit words or block words would interfere with children's ability to communicate during the multidigit addition and subtraction phases. All groups used the English words and the block words quite accurately during the games, rarely making any errors. In some groups children also produced these words very rapidly from the beginning of the games; these groups moved on quickly to the next phase. No group spent more than half the period doing this practice. To assess how accurately children used the block words and the English multiunit words throughout the whole preaddition phase, all such utterances between the end of the block name-choosing phase and the beginning of multidigit addition were identified. The English words included only those uses of an English word as a unit value (one) or as a multiunit value (ten, hundred, and thousand); not included were uses of these English words as the number of a given unit or multiunit, because such uses are unitary cardinal meanings that tell how
many of some kind of unit rather than telling what kind of unit.* So, for example, "a long leg is a ten" and "there are four tens" would be included as an English multiunit ten, but "there's ten of those in this" would not be. Across all children, the words "thousand," "hundred," "ten," and "one" were used as multiunits 187, 161, 200, and 103 times, respectively, with the number of uses by each child ranging from 0 to 14, 1 to 16, 1 to 13, and 0 to 12 for these multiunits, respectively. There were no errors in the use of "thousand" or "one," and only four errors in the use of the words "hundred" and "ten" (all were in group L1). All but three children said each multiunit word at least once; the three exceptions were in group M1, where one child never said "thousand" and two children never said "one." These uses are pooled across examples of giving the English multiunit word for a block, for a block word, and for a written mark, so these children exhibited a very robust ability to give the correct English multiunit word for these various multiunit manifestations. Children used the block words chosen by their group for the big cube, flat, long, and little cube 112, 92, 89, and 83 times, respectively, with the number of uses by each child ranging from 0 to 9, 0 to 7, 0 to 7, and 0 to 6 for these blocks, respectively. There were no errors in the block words for the small cube or long, one error in the use of the block word for the big cube, and four errors by three children in the word for the flat; all errors were in group L1. All but one or two children said each block word at least once (all but one of these exceptions were in group M1); two-thirds of the children said a given block word at least three times. It thus seemed to be quite easy for these children to use the block words they had chosen in their group. Most of the English multiunit word and block errors in group L1 were confusions between the multiunits of ten and hundred. Three of the four errors in English multiunit words and block words for the hundred multiunit occurred at one point in a game where one child said pancake and three children said "tens" (this was quickly corrected to hundreds). One child used the block word "rectangular" instead of "pancake" for the flat and later grabbed a flat instead of a long block for the word "ten." The other errors were using a suggested but not chosen name for the big cube ("daddy" instead of "big") and giving the ordinal number of a multiunit (two, the second multiunit) instead of the multiunit value of ten.

* During the discussion of how many little cubes make a flat and how many little cubes make a big cube, it was sometimes difficult to tell whether the words "hundred" and "thousand" were used as a single collected multiunit of small cubes or as the cardinal number of that many small cubes. Because the task in this case was to ascertain the latter in order to form the conception of the former, these meanings may be ambiguous or even simultaneously intended. Such uses were included in the analysis.

Establishing relationships among blocks, English words, and written four-digit marks
The final two phases before multidigit addition focused on multiunit numbers composed of thousands, hundreds, tens, and ones. The first of these phases related block arrangements to English words, and the second established relationships among blocks, English words, and written 4-digit marks. A collection of blocks presents the same multiunit number no matter what order the blocks are arranged in, because each block contains its multiunit value and thus carries this value to any new location. Although English words are ordinarily said in a standard order from the largest multiunit down to single units, the value of a multiunit number will be conserved if the multiunits are reordered or even split up and reordered: Three hundred two thousand five ten four hundred eight is obviously two thousand seven hundred five ten eight. However, written marks cannot be reordered, because that will change their value. They do not carry their multiunit value within themselves in any feature except their relative left-to-right order; they are, after all, only ordinary single-digit numbers that tell how many there are of each multiunit, and which multiunit is numbered depends only on the position of that number. It is therefore easier to say written marks if one uses the standard larger-to-smaller order of English words that matches the larger-to-smaller left-to-right order of written marks. It is also much easier for the multidigit written marks to take on the multiunit quantities presented by the blocks if the order of the blocks matches the order of the written marks. Therefore, in the first phase children were told by the experimenter that it was easier to say the English words for the blocks if the blocks were arranged left-to-right from largest to smallest. Children then practiced making several numbers by putting out several of each kind of block and saying each such block array in English words. In the final preaddition phase they did this while also making the marks for these block numbers by using digit cards (Session 1 groups), writing on the magic pad (Session 2), or writing on individual papers (both sessions). Because the children during most of these phases made their own numbers by selecting some of each kind of block, not all of the issues discussed in this section arose equally in all groups.

Block arrays and English words. The initial phase of arranging block collections from the big cubes on the left through flats, longs, and little cubes on the right and saying the English words for such collections went smoothly in every group. The
only error or difficulty occurred in group L2 when one child said that the ones go on the left. Five of the six groups were told how to say Asian tens (the regular form of ten that parallels the English use of thousands and hundreds: 52 is said as "five ten two") after the first block arrangement.* They then practiced making several different block numbers and saying them in English words using the Asian tens. All groups learned the Asian tens readily, with no one making any errors for arrays having two or more tens. Teen numbers were not modeled by the experimenter, and most groups did not generate such words. One child in group L2 first said "one two" for twelve rather than "ten two." Groups L1 and L2 required some practice before the regular ten form replaced the usual English decade words reliably. Children in group M2 actually used the Asian ten form before they were told about it by the experimenter. They had just made a block arrangement and named it with block names (2 ice cubes 5 pancakes 4 licorice and 7 teeth), and they then produced the exact analogy with English words: 2 thousands 5 hundreds 4 tens and 7 ones.

* According to the original design, group M1 was not given the Asian tens.

In addition to the irregularities in how the multiunit of ten is said, English words for four-digit numbers have two other irregularities: omitting the multiunit word "ones" and omitting any mention of multiunits that do not exist in a given number rather than stating "zero tens." Although the multiunits thousand, hundred, and the various forms of ten are said, the word "ones" or "units" is not said. It would be more consistent to say the ones (2 thousand 5 hundred 4 ten 7 ones), because each number would then be followed by its unit. Children in all groups produced such forms spontaneously. In groups H1 and L2, the experimenter said that you don't have to say the ones. In group L2, N asked why, and K asked if they could say ones if they wanted to. M in group L1 ended a multiunit number word with "and eight ones" and then asked if you are supposed to say ones or just the eight. So children are sensitive to this irregularity and seem predisposed to regularize the English word form.

Zero. Written marks explicitly signal when a multiunit is missing by putting a 0 in that position, but such cases are not said as "zero hundreds" or "zero tens." Instead, that unit is not said at all but must be skipped over, thus interfering with the regular production of the ordered multiunits. Again, it would be easier for novices to learn English words if each multiunit were named each time, and the relationship to written marks would also be simpler. Children in two groups, H1 and M2, actually said such a zero form. Also, M in group L1, after the group had put in a 0 digit card for the hundreds, showed explicit awareness that the zero is not said in English: "You couldn't say like 2 thousand and zero. You couldn't say something like that so it would be better just to say two thousand four ten four." All of the groups spontaneously made at least one block array that omitted one kind of block, except for L1, in which each child was in charge of one kind of block so each kind was always used (the experimenter made a block array with no flats for this group). In groups H1, H2, and M2 a child correctly used a written zero for that block when making the marks for the block array with digit cards or writing them on the magic pad, and there was no discussion of whether or why a zero was needed. In groups M1, L1, and L2 the number was first written without a zero (e.g., 249 instead of 2049). In L1 and L2 one or more children then argued that there should be a zero. These arguments took two forms. One was the observation that there were none of a particular kind of block ("there are no ones"), so a zero needed to be written. The other was that the number without the zero is the wrong number (a block array of four big cubes five longs seven little cubes was written as 457, and K said, "It needs a zero because that'd be four hundred fifty seven."). The first argument addresses why you use a zero (to tell how many of that unit there are), and the second tells why you must use the zero: The marks show the wrong number if the zero is not there to push the single digits into their correct multiunit places, unlike the English words, where one could say zero ones but it is not necessary, or even common, to do so. In group M1 the experimenter precipitated this kind of argument by asking the group what the blocks said ("two thousand forty nine") and what the digit cards said (249, "two hundred forty nine"); the group decided that they needed a zero to show the zero hundreds blocks in order to move the 2 into the fourth (thousands) place. The forward and backward thinking and counting of places that is required to understand this argument was nicely demonstrated by a discussion led by the experimenter in group L2. English words are written down just as they are said, in order from left to right. But to read any given multidigit number, a child must do a reverse right-to-left process in order to decide the name of the farthest left place before beginning to say that number as an English word. This was described by a child answering the experimenter's question, "How do you know it is the thousands place?" as follows: "Cuz the four on the end--that's the one, and then the seven is the ten, and the three is the hundred, and the two is the thousand." The amplification of this response by another child beautifully captured the two reverse processes that must be gone through to read a number: "Because thousands is after--is before the hundreds." Thousands is after the hundreds in the initial right-to-left assignment of multiunits but is before the hundreds when the number is said as an English word (the numerals are read from left to right).
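The two reverse processes the children described can be made explicit in a short sketch. The Python below is ours and only illustrative: it assigns place names by a right-to-left pass (as in the child's "the four on the end--that's the one") and then says the number left to right, first in the fully regular form that names every multiunit including zeros, then in a standard-like form that skips zero multiunits and the word for ones.

    PLACE_NAMES = ["one", "ten", "hundred", "thousand"]  # assigned right to left
    DIGIT_WORDS = ["zero", "one", "two", "three", "four",
                   "five", "six", "seven", "eight", "nine"]

    def place_names(marks):
        # The right-to-left pass: the leftmost place is known only after
        # counting positions from the right end. Handles up to four digits,
        # the numbers used in the study.
        return [PLACE_NAMES[i] for i in range(len(marks) - 1, -1, -1)]

    def regular_words(marks):
        # Fully regular form: every digit names its multiunit.
        return " ".join(f"{DIGIT_WORDS[int(d)]} {p}"
                        for d, p in zip(marks, place_names(marks)))

    def standard_like_words(marks):
        # Standard-like form: skip zero multiunits and drop "one(s)".
        words = []
        for d, p in zip(marks, place_names(marks)):
            if d == "0":
                continue  # e.g., in 2049 the hundreds are simply not said
            words.append(DIGIT_WORDS[int(d)] if p == "one"
                         else f"{DIGIT_WORDS[int(d)]} {p}")
        return " ".join(words)

    print(regular_words("2049"))        # two thousand zero hundred four ten nine one
    print(standard_like_words("2049"))  # two thousand four ten nine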
One other issue concerning zero arose in two groups. This is the common attempt by children to make the written marks parallel the English words and explicitly name each multiunit by using zeros to show the multiunit, that is, to use cardinality (four positions show thousands) rather than ordinality (only the fourth position shows thousands) (see the discussion in the introduction). Because two thousand is written as 2000, or three hundred as 300, children want to write three zeros after a number to show that it is thousands or write two zeros to show that it is hundreds, yielding forms like 2000300405. A child in group M2 asked, "Why don't you put the zeros in for two thousand?" and a child in L2 similarly asked, "Why didn't I make zeros after my three hundred?" In the first case the child spontaneously then said, "I see now," and the issue was not pursued. In the second case the child asked again after the question was ignored. Another child responded that the zeroes were not necessary because "you can tell (it's a hundred) because you know how many numbers there are (pointing to the three places up to and including the hundred's place)."

Use of commas. In the United States, a comma is used to separate groups of three digits in a multidigit number. The comma is placed by counting off three places at a time from the right, so that each group of three digits composes one of the larger multiunits, based on a thousand, that constitute the large English words. In the United States these base-thousand multiunits are called thousand, million (one thousand thousands), billion (one thousand millions), trillion (one thousand billions), etc.* The comma may make it easier to identify a 4-digit number as beginning with the thousands to someone who knows how commas should be interpreted. But it is an unnecessary feature of the written base-ten marks, arising instead from the base-thousand structure of the English words and from a desire to simplify perceptual processing of many numbers (other countries use a period or a space for the same purpose). Commas were not used in any numbers or problems presented to the children in this study. The issue of commas arose as an extended topic of discussion in two groups. In group M1 two children articulated a comma rule, "Put a comma every three numbers," but they counted from the left and wrote a number as 204,9 rather than as 2,049. The third child, Dh, said that the comma should be on the other side, and they agreed. This difference arose later when they all wrote very large numbers on paper to show commas. The first two children wrote from left to right: they wrote three numbers, made a comma, wrote three more numbers, made a comma, and so on.
* In Great Britain a larger sub-base of a million is used instead of one thousand; the words are the same up to one million, but then a billion is a million millions, a trillion is a million billions, etc.
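The counting-from-the-right rule can be stated directly. This Python sketch (again ours, for illustration only) peels off groups of three digits from the right, which is in effect what Dh's curve-drawing procedure, described next, does.

    def with_commas(marks):
        # Place commas by counting off three digits at a time from the
        # right, so each group matches a base-thousand English multiunit.
        groups = []
        while len(marks) > 3:
            groups.append(marks[-3:])  # peel a group of three off the right end
            marks = marks[:-3]
        groups.append(marks)
        return ",".join(reversed(groups))

    print(with_commas("2049"))     # 2,049 (not 204,9, the left-to-right result)
    print(with_commas("1234567"))  # 1,234,567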
The first two children's left-to-right process works only if the number of digits is a multiple of three. Dh wrote his whole long number, started from the right and made curves over each group of three numbers, and then wrote in all the commas. Group L2 had an argument over whether you have to write a comma. Children in each group sometimes wrote a comma between the thousands and hundreds places in the addition phase and occasionally used commas this way in the preaddition phase.

Ten or more of a given multiunit. The final issue confronted by each group in ascertaining the relationships among the blocks, English words, and written marks was how to write block arrays that had ten or more of a given multiunit. This is a crucial issue in multidigit addition, for it arises whenever the sum of a given multiunit is ten or more. This issue could arise in the preaddition phase if a group made a block array with ten or more of a given kind of block and wanted to write that block array in marks. In fact, all groups very early made such a block array. In all groups but M2, the experimenter restricted the number of blocks for that case and several others by having children make that pile of blocks smaller. In groups H1, H2, and M1 the children later made such arrays, and then modified them in order to write the arrays in marks. Flats were stacked to make a big cube and the thousands were increased verbally by one, and other kinds of blocks were counted to make ten and the next larger multiunit was increased verbally, for example, finding the value of 21 longs as follows: "200 (there were two flats), 10, 20, ..., 90, 100 (counting the value of ten longs by counting each one as ten), 300 (incrementing the original 200), 1, 2, ..., 9, 10 (this time just counting ten longs as single units to get ten longs as another 100), 400 (incrementing for the second 100 made from longs)." Examples in group M1 included such large cases as these 21 longs and 30 small cubes. In group L1 this issue only arose in an array that had nine longs. M said, "If they (longs) were ten, they'd be like that (points to flat)." On the next block array, M then restricted the number of longs to less than ten, saying, "or else we'll be in the hundreds, and we don't want that to happen." Group L2 made an array with ten small cubes, and T wrote this as 3,3610. Three children said that you couldn't do a ten at the end because it has to go in the tens pile, you'd have to regroup. Regrouping was for them a procedure done with numerals; the discussion focused on "take the 1 and put the zero here." They did nothing with the blocks. The experimenter asked, "What are those (the small cubes) the same as?" B answered "one of those (a pretzel)," and N answered "a ten." B put down a pretzel, and they wrote 3,37 (and later added a 0 to make 3,370). B then stated a general rule about making block arrays and writing marks: "So you have to do any number under ten (her emphasis) cuz then you'd just put down one more of these (pretzel) and you wouldn't need the ten (ones)."
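B's rule is in effect the canonical-form condition on positional marks: a block array can be written directly only when each kind of block numbers under ten, and an array that violates the condition must be traded into one that satisfies it. A Python sketch of that trade, ours and purely illustrative:

    def canonicalize(array):
        # Trade ten of any block for one of the next larger block until
        # every count is under ten; only then can the array be written
        # directly as standard marks.
        sizes = ["one", "ten", "hundred", "thousand"]
        counts = {s: array.get(s, 0) for s in sizes}
        for smaller, larger in zip(sizes, sizes[1:]):
            trades = counts[smaller] // 10      # ten-for-one trades
            counts[smaller] -= 10 * trades
            counts[larger] += trades
        return counts

    # Group L2's array (3 big cubes, 3 flats, 6 longs, 10 small cubes),
    # first written 3,3610, becomes 3,370 after one ten-for-one trade.
    fixed = canonicalize({"thousand": 3, "hundred": 3, "ten": 6, "one": 10})
    assert fixed == {"thousand": 3, "hundred": 3, "ten": 7, "one": 0}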
Group M2 embarked on a long exploration of this issue that extended over substantial parts of the second and third days. On the second day this group of five children made their second block array of five big cubes, ten flats, eight longs, and twelve little cubes. The discussion went as follows: Several children: "Five thousand ten hundred." Da: "No, one thousand." U: "Wait, that makes six thousand." Several children: "Six thousand eight ten twelve." The next block array also had more than ten units, which the children again read as teens ("seven thousand five hundred six ten fourteen"). Three block arrays requiring no trading followed, and then U made an array of three pancakes, four licorice, and twenty-nine teeth. The group's pursuit of this problem over the rest of that day and part of the next is given in Table 3. Everyone in the group recognized that, although they could say twenty-nine teeth, they could not write that many teeth in the usual written marks. They generated several different interesting solutions to the problem they posed to themselves in this situation: conveying in written form the number of blocks they had. They eventually needed the help of the experimenter to solve their original problem (how to write that many blocks in standard marks) because their reformulation of the problem (writing down the blocks they had) did not solve it. Their version of the problem actually cannot be solved: standard marks cannot record the particular blocks they had; standard marks can only show a value equivalent to those blocks. The blocks can be traded to find this equivalent value that can be expressed in standard marks. When on the next day they got into the same issue with a new number, the experimenter focused them on the task of changing the blocks to match their stated English word value for the blocks (see Table 3: they said "five hundred" but wrote the 5 in the fourth position). They had traded flats for a big cube on their first block array, so they quickly saw the relevance of trading here. They might have been able to think of trading with less direct support than the experimenter actually gave. This group went on in the addition phase to use this solution for writing too many blocks in a two-part "add and then fix" addition method they invented.
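The "add and then fix" method can be stated in the same illustrative terms. The Python sketch below is our gloss on the idea, not the children's own notation, and it reuses canonicalize from the sketch above: first add like multiunits, which may leave ten or more of a kind, then fix the raw sums by trading.

    def add_then_fix(a, b):
        # "Add": combine like multiunits, possibly leaving "too many" of a kind.
        sizes = ["one", "ten", "hundred", "thousand"]
        raw = {s: a.get(s, 0) + b.get(s, 0) for s in sizes}
        # "Then fix": trade the raw sums down to standard, writable form.
        return canonicalize(raw)

    lhs = {"thousand": 1, "hundred": 6, "ten": 2, "one": 9}  # 1629 in blocks
    rhs = {"thousand": 3, "hundred": 7, "ten": 4, "one": 7}  # 3747 in blocks
    assert add_then_fix(lhs, rhs) == {"thousand": 5, "hundred": 3,
                                      "ten": 7, "one": 6}    # 5376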
Table 3. Identifying and Solving the Problem of Writing Ten or More of a Given Unit: Group M2

Day 2

All: Three hundred four ten twenty eight--twenty nine!
T: So, what's the number?
Dh: (writes 3429)
T: How come when Dh writes it on the pad it looks like three thousand?
Da: I think he should write a zero instead of three. Then they would know there's no ice cubes.
U: But then it would say 0 3
Dh: (writes 3429 more neatly)
Da: So, who cares?
U: Three thousand
Da: Three hundred - three hundred. To me it looks like three ten, four hundred.
Da & N: Three thousand, four hundred, two ten, nine
T: But that's not the number that we set up
T: (to experimenter) But this is one number (points to 3) and this is one number (points to 4) and this is one number (points to 29)

Experimenter's question ("How would I know that?") precipitates many marks solutions

(The children write 3, 4, and 29 in several different arrangements; these written solutions are not reproducible here.)

Discussion of proposed solutions

U: That's what we did with the boxes before.
N: That's messy.
U: No, it isn't. It's the same thing though.
Da: OK. That's it, we're writing it down.

(another written arrangement of 3, 4, and 29)

N: We're wasting paper
Da: T's idea is good. Write 3 4 29.
U: But that's the same thing as the boxes. We could have stopped a long time ago with boxes.
Da: OK, that's how we're doing it. How about we stop now?
Dh: (starts writing the number)
T: That's the wrong way.

More solutions

Da, T, N: Three hundred, four ten, twenty nine
T: So what's this number, guys?
U: But it still looks like three thousand.
Da: But see, this whole thing is underlined, so we can read it.
U: No we can't.
Da: (asks experimenter if she has any suggestions)
EXP: I want to see how you guys figure it out.
U: I have a different idea. I think I know what to do. (writes 300 4 ten 29 teeth)
N: I have an idea. (writes 3 4 29 with a bracket under the 3, the 4, and the 29)

No resolution

U: That's the same as boxes, again.
Da: All we're doing is wasting paper.
U: I know. We could have stopped a long time ago. This is the first day, too.

Day 3

New similar problem

The group made 5 flats, 3 longs, and 17 small cubes, which they wrote as 5317 and said as "five hundred three ten seventeen ones."

U: That's the same problem we had yesterday.

Experimenter questions support solution

EXP: Is there anything you can do with the blocks to make it look like five hundred? (shifting children's focus from trying to write the blocks in nonstandard marks to trying to change the blocks so they can be written in standard marks)
U: Yeah, um, should put like a thousand down, take these away and put five of those blocks (ice cubes) down
Da: What we could do with the magic pad is write hundreds and put an arrow up to the five and then write tens and put an arrow up to the three and then write ones and put an arrow up to the seventeen. That's what we could do and we can't do that with the blocks.
T: You put lines.
EXP: Let's concentrate on the blocks right now. What could you do to make the blocks look simpler?
Da: You could take away some teeth to make it less than ten.
EXP: Is there anything you could do with the ones you take away?
T: You can make another problem out of it.
EXP: Is there any exchanges you could make with the teeth and the licorice?
Da: Yeah, you take one of these (a licorice) and that will make ten (teeth). Or if you take ten of these (teeth), it would be the same as that (licorice).
U: You could take ten of these (picks up ten teeth) and then put them over here (in licorice pile) and then take one of these and just put them over here, or you could just take ten away (teeth) and put another licorice there.
Da: Here, take ten of these away. You need seven left, so count out seven.
U: You take ten away and put another licorice there.
Da: And that makes our problem easier: five hundred four ten seven.
(short discussion about writing a zero before the number)
U: But yesterday we used about twenty pieces of paper for one problem. That was my problem that I shouldn't have even thought up.
Da: That took forever.

Note. Group M2 chose the following names for the blocks, listed largest to smallest: big ice cubes, pancakes, licorice, and teeth.
Establishing relationships among blocks, block words, English words, and written marks: Discussion

The success of the block nominating and naming procedures in groups H2, M1, and M2, in which the names for all blocks were nominated simultaneously, indicates that teachers do not have to be concerned and intervene immediately in a messy choice process. In the groups in which the experimenter did not intervene much, children were able to select sensible block names. All of the block names chosen by the groups proved to be easy for the children to remember and use. We wanted to observe this naming process and so allowed each group to choose block names. In a classroom it is probably advisable for all children to agree on the same names in order to facilitate communication among the children in that classroom.

The level of spontaneous verbalizations about most activities was disappointingly low, given the high verbal ability of many of these children. The discussions of the ten-for-one equivalencies and of the number of small cubes in the flat and big cube contained some good thinking and some complex arguments. But in general these children did not spontaneously produce verbal responses that would be maximally helpful to group members who did not understand. For the ten-for-one equivalencies, the many statements using "these" and "those" rather than the block names or English words required listeners to understand the referents for "these" and "those" in a sometimes complex physical and social environment. The even more frequent simple responses of "ten" or counts to ten required each listener to know the question being answered by these responses.

The use of zero in written marks arose in all groups, and the issue of saying zero in English words arose in some groups.6 Children successfully used zero in all groups. However, there was no spontaneous discussion or clear articulation of why zero is needed in marks but not really needed in words. Group L2, when asked by the experimenter, came close to such a discussion, so it seems likely that children at this level can clearly articulate these reasons if the teacher initiates and supports such conversations. Similarly, having children use multiunit words and block names to give full statements of any mathematical relationships would increase the ability of weaker or momentarily distracted group members to follow the mathematical discourse and might increase the accessibility of these relationships to the speaker.

The children's discussions of the equivalencies and their use of English words and block words with the marks underscore a limitation of English in this domain. The English language does not clearly differentiate between the use of the word "ten" as a unitary cardinal number telling how many there are of some unit and its use as a single multiunit of ten collected units that serve as a new higher unit. French, Spanish, Russian, and many other European languages do differentiate between these two meanings by providing a special ending for the single collected new multiunit meaning.7 For example, "diez" is ten in Spanish, and "decena" means a group of ten. Some sense of the collected meaning supported by these special endings is provided by the English word "dozen," which means a collected group of twelve (a dozen eggs); "dozen" in fact sounds as if it comes from the French special collection ending added to the French word for twelve: "douze" (12) plus "aine." However, the existence of this differentiation in the language does not necessarily mean that all users of that language comprehend the collective meaning. Recent conversations with some teachers from Puerto Rico indicate that the functional use even by teachers of the word "decena" may be, at least in some cases, limited to a label for the tens position in a written multidigit mark and may carry little or no cardinal meaning as "a group of ten." The common use of the collected-ten meanings may depend on the use in a culture of the metric system and the consequent frequent packaging of items into groups of ten or measures of ten units, as is common in the Soviet Union, for example. Without experiences of such actual collections of ten, or special experiences in the classroom, these special ending forms may not have multiunit meanings.

6 Experimenters in fact were to make arrays that required a zero if the children did not.

7 I am grateful for helpful conversations with Robert Streit concerning this issue in several different languages.

When there is more than one ten, hundred, or thousand, the "s" in the plural forms (e.g., five tens, eight hundreds) in English does provide a minimal cue that one is talking about collected multiunits. Children in this study did frequently use the English plural form, just as they used the plural form for block words indicating multiunits (e.g., "four thousands six hundreds five tens two ones" or "four ice cubes six breads five pretzels two sugar cubes"). But the fact that standard multidigit English words drop the "s" and say instead "four thousand six hundred fifty two" muddles even this possible difference, and it is not always easy to hear this plural form even when it is said. This lack of differentiation in English between ten as the number of multiunits and ten as a kind of multiunit, combined with the tendency noted in this preaddition phase for children not to use the multiunit words, caused communication difficulties and some addition errors in the addition phase.

Children's behavior in this preaddition phase indicated that they are predisposed to regularize the irregularities in English and generate full English forms that parallel the block words and name each unit. Children used the Asian regular tens words quite easily, though certain children and certain groups used them more than others. Sometimes a child added the units word "ones" so that every unit would be named, and sometimes a child used "zero" to name a unit (zero tens) rather than just omitting that unit. It may be much easier for children, especially those not high achieving in mathematics, to see and use the multiunit structure of the English words and relate the words to multidigit numbers if they use full regular forms in the beginning of learning. Because none of the forms are "wrong" (just nonstandard or unnecessary), they could be dropped when children are older and understand the whole multiunit structure.
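As a concrete rendering of these full regular forms, the short sketch below generates a fully regular name for a numeral, naming every unit and saying zero units explicitly. The sketch is our illustration, not the study's; the exact wording conventions are assumptions guided by the chapter's examples ("four thousands six hundreds five tens two ones", "one hundred zero tens and zero ones").

    # Illustrative sketch of the full regular English forms discussed above.
    # Wording conventions are assumptions guided by the chapter's examples.
    UNITS = [(1000, "thousand"), (100, "hundred"), (10, "ten"), (1, "one")]
    DIGITS = ["zero", "one", "two", "three", "four",
              "five", "six", "seven", "eight", "nine"]

    def regular_words(n):
        """Name every unit of a number from 0 to 9999, keeping the plural -s."""
        parts = []
        for value, unit in UNITS:
            d, n = divmod(n, value)
            if d == 0 and not parts and value != 1:
                continue  # skip leading zero units, as in "one hundred zero tens"
            parts.append(f"{DIGITS[d]} {unit}" + ("s" if d != 1 else ""))
        return " ".join(parts)

    print(regular_words(4652))  # four thousands six hundreds five tens two ones
    print(regular_words(100))   # one hundred zero tens zero ones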
It may be particularly helpful to use the Asian regular ten forms because the English special words that hide the tens in two-digit numbers enable children to persevere erroneously in situations whose ten-structure would be much clearer with regular ten words. For example, the prolonged engagement of group M2 with writing the 29 teeth (see Table 3) seems much less likely to have occurred if their language said those teeth as "two ten nine" instead of as "twenty-nine." Saying the ten suggests trading the blocks to simplify the block display or adding in those two tens with the other tens. For saying words in the teens, the fact that one child in group L2 first said "one two" for twelve rather than "ten two" suggests that it might be better not to use the abbreviated form "ten two" (as the Chinese do) but to use the full equivalent of the later decade form for the teens: tell how many tens by saying "one ten two" or "one ten and two." Saying a full regular Asian form for numbers with zeroes (e.g., saying 100 as one hundred zero tens and zero ones) would also help to eliminate the cardinal/ordinal confusion that leads children to want to write forms such as 100406 instead of 146 for "one hundred forty six." It is the usual short-cut wording of 100 as one hundred, 10 as ten, and 40 as forty that suggests such errors.

The equivalency question for the flats and big cube actually is ambiguous, because six flats (or four if you ignore the top and bottom) do "equal" a big cube in the sense that one can make one big cube out of six flats (the dimensions are off by 0 to 2 cm, depending on how you assemble the six flats). Because one cannot tell just by looking whether the wooden big cube is solid (made of ten flats) or empty inside (made of six flats), the former feature needs to be clarified from the beginning. The phrases "make as much wood as is inside the big cube" or "fill up" (for the cardboard version that can be opened) might be better. The solid meaning also can be addressed by using weight, which some children did, for example, "See this (ten flats) is just as heavy as this (wooden big cube) except this (big cube) is much easier to carry around." One reason so few children had this possible confusion may be that, because the flat/big cube equivalency was last, children had an expectation that this relationship would be the same as the earlier ones (a ten-for-one relationship) and that one would show it in the same way: by stacking the smaller blocks next to or on top of the larger block.

A new version of the blocks that is recently available does avoid this ambiguity. The blocks are clear plastic and fit together, so the big cube is seen to be filled with little cubes and can be made by sticking together ten flats. The disadvantage of this version is that the blocks are somewhat difficult to put together and take apart, and the small extra parts that enable them to fit together may be distracting.

Our results concerning these preaddition experiences clearly are limited by the achievement level of these children and by our discovery that many children had
place-value experiences with a different version of base-ten blocks in first grade. Establishing these relationships would presumably take longer and need more teacher support if all children were new to the blocks or were of varying achievement levels. However, the ease with which many children handled these ideas, and the high level of some spontaneous discussions, indicate that establishing relationships among blocks, English words, and written four-digit marks is well within the zone of proximal development of high-achieving children working in groups at the beginning of second grade. If teachers support discussion of these relationships, these children are probably able to articulate and explain clearly all of these relationships. Without such teacher initiation and support, even these high-achieving children do not spontaneously discuss all of the important issues in these relationships or articulate them clearly enough for weaker children to understand.

Results of the addition experiences

Adding like multiunits
Every group immediately added the like multiunit blocks. After making each addend with blocks, they either pushed the addend blocks of each kind together and counted all of the blocks of a given kind, or counted the blocks in place, or used extra blocks to make as many sum blocks for each kind of block as were in both addends. Evidently the visually salient collectible multiunits in the blocks supported the correct definition of multiunit addition as adding like multiunits. There was only one exception to this uniform definition of adding like blocks: One child in group M2 suggested that the answer should be obtained by counting all of the blocks of all kinds (he thus ignored the collectible multiunits in the blocks and considered each block as one countable unit item). All groups also added two four-digit written marks addends by adding together the marks written in the same relative positions. In the groups that were clearly linking block addition and written marks addition, this carried the connotation of adding like multiunits. For some children, their written multidigit procedure already entailed the understanding that they were adding English multiunits (ones to ones, tens to tens, hundreds to hundreds, and thousands to thousands). For other children, multidigit addition was a procedure carried out on concatenated single digits, so these actions were based on a procedural rule and did not imply understanding of adding like multiunits. Evidence of these different bases for marks addition was not as clear as for multiunit understanding of the next component of multiunit addition, trading when one has too many of one kind of
multiunit, and it is linked to this trading knowledge. Therefore this issue will be discussed further in the section on trading.

The incorrect block arrays always involved making the first digit out of big cubes (for example, 287 would be made from 2 big cubes, 8 flats, and 7 longs or sometimes 7 units), and the incorrect alignments always aligned on the left. Although both of these resulted in a failure to add like multiunits, such errors seemed to stem from the left-to-right manner in which block arrays were made and marks were written. Although block arrays can be made in any order from written four-digit marks, almost all block arrays for all addends in all the addition problems were made in the same order in which marks are written and English words are said: from the big cubes to flats to longs to units. Only 5 out of 118 addends were made in any other order. Group H1 made two block arrays in the order longs, units, big cubes, flats, and one array from units to big cubes, and group M1 made two arrays from units to big cubes. The initial several problems worked by all groups, in which two four-digit numbers were added, seemed to induce a "set" towards making a block array by using the big cubes first. Thus, when seeing a written multidigit number, children who had done several four-digit plus four-digit problems had a predisposition towards making the first number on the left out of big cubes. Similarly, there was a predisposition for groups who were writing the addends vertically to start writing the 287 under the 3458 on the left, putting the 2 under the 3.

Table 4 shows the relative correctness of making block arrays and aligning written marks problems for the problems in which the two addends had different numbers of digits. The performance reflected in Table 4 is group competence, not individual competence. Some children in some groups also verbally suggested making the three-digit number using the big cubes or writing the numbers aligned on the left, but they were ignored or corrected by other children. All but two block arrays (18 out of 20) were made correctly initially or were immediately corrected by some group member, while a lower proportion of marks problems (15 out of 24) were written in correct alignment. When incorrect marks problems were corrected, children justified or explained their correction by using multiunit words: by saying the English words for the marks ("That's two hundred not two thousand") or by saying the block words ("That's two pancakes, not two icebergs"). Thus, thinking of the multiunit values by saying the marks in English words or block words may be an effective way to reduce such alignment errors. Failing to think of the multiunit values seems to be more of a problem with the written marks than with the blocks, so teachers might suggest that children read problems in English words and/or block words.
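The mapping at issue can be stated compactly: digits name multiunits by their distance from the right end of the numeral, so a three-digit addend must be "right aligned" with a four-digit one. The sketch below is ours (the kind names and representation are illustrative); it contrasts the correct mapping with the observed error of assigning multiunits from the left.

    # Illustrative sketch: digits name multiunits by place counted from the
    # RIGHT; the observed error pattern assigned them from the LEFT instead.
    KINDS = ["small cube", "long", "flat", "big cube"]  # values 1, 10, 100, 1000

    def blocks_right_aligned(numeral):
        """Correct: '287' -> 2 flats, 8 longs, 7 small cubes."""
        return {KINDS[place]: int(d)
                for place, d in enumerate(reversed(numeral)) if d != "0"}

    def blocks_left_aligned(numeral):
        """The error: '287' -> 2 big cubes, 8 flats, 7 longs (i.e., 2870)."""
        return {KINDS[len(KINDS) - 1 - i]: int(d)
                for i, d in enumerate(numeral) if d != "0"}

    print(blocks_right_aligned("287"))  # {'flat': 2, 'long': 8, 'small cube': 7}
    print(blocks_left_aligned("287"))   # {'big cube': 2, 'flat': 8, 'long': 7}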
Table 4. Correctness of Block Array and Marks Alignment for a Four-Digit and a Three- or Two-Digit Addend

                                                                 Block   Mark
Immediately made three-digit number correctly                      13     10
Made three-digit number incorrectly but soon changed by group       5      5
Made three-digit number incorrectly but eventually changed
  by group                                                          0      2
Made incorrectly and never changed                                  2      7
Note. The six groups worked a total of 24 problems in which one addend had four digits and the other addend had three or two digits. Children worked 20 of these problems with blocks and marks and worked 4 only in marks. The entries in the table reflect group, not individual, performance.

Trading or putting: Solving the problem of too many of a given multiunit

All of the groups recognized the problem of having too many of a given multiunit. This problem in fact arises only in the marks world: one can have as many blocks of a given kind as one wants, but one cannot write down a blocks display that has more than nine of a given kind of block. Thus, this problem really only arises when blocks addition is linked to written marks addition. The collectible multiunits in the blocks support the solution to the problem -- trading ten of the multiunit with too many for one of the next larger multiunit -- but the problem presents itself in the marks world, where writing down ten or more of a given multiunit pushes the other written marks too far to the left (see group M2 wrestling with this problem in Table 3). Group M2, because of their earlier extended experience, did indicate that they understood why they could not write two digits for a given multiunit. No other group clearly explained why having ten or more was a problem. Many children brought some awareness of this issue from
their knowledge of written multidigit addition, but for many of these children the knowledge seemed to be formulated as an arbitrary rule such as "You can't write two numbers" or "You have to regroup" (which meant writing little 1's in specified places) or "You can't have more than nine." These rule statements were never accompanied by any hint of why this might be so or even that any justification of the rule was necessary. These children clearly are capable of understanding and articulating this problem as group M2 did; they do not have to memorize an arbitrary rule. Every group in fact did present this problem to themselves when they first made block arrays: Every group made at least one array that had ten or more of a given kind of block. But in all groups other than M2, the experimenter did not allow them to consider this problem and constrained block arrays to those that did not have this problem. Thus, this problem could come up for initial consideration when first making block arrays, and the recognition that writing two digits for a given kind of block puts the other digits in the wrong place would then be helpful in the addition context. When this issue arose in the addition context, children did not spontaneously try to understand why writing two digits for a given multiunit is a problem or seek to explain rules they had memorized. They needed outside support to raise this issue and focus them on trying to explain why writing two digits is a problem in the marks world. That group M2 easily saw this problem indicates that it is an easy one to solve if it does get raised.

Most of the addition phase was directed toward solving the recognized problem of what to do when there were too many of a given kind of multiunit. Each group had a different experience with this issue, and the nature of the experience was crucially affected by the extent to which the blocks addition procedures were linked to the marks addition procedures by that group. An overview of the evolution of correct block trading and correct marks trading in the sequence of solution procedures for each group is provided in Table 5, and the addition experiences of each group are briefly summarized in the following sections. In Table 5, the nature of the trading in each successive block and written marks procedure is characterized, and each procedure is classified as linked or not. In linked block and mark procedures, children added or traded a given kind of block and marks position simultaneously (different children doing the blocks and marks step) or soon after each other, before any other block or position was added or traded. In unlinked procedures the children worked in separate, unconnected block and marks worlds. Either some children in a group worked on blocks while the others worked on marks -- and there was no communication or synchrony between these solutions -- or the whole group worked a problem in blocks and then in marks, and there was no connection made between the solutions.
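The difficulty the children needed outside support to articulate can be stated in one line: writing each multiunit's count side by side yields a correct numeral only when every count is a single digit. A small sketch of ours makes this concrete, using the values from group M2's episode in Table 3.

    # Illustrative sketch: why ten or more of a multiunit cannot simply be
    # written down. Concatenating the counts shifts the other digits left.
    def concatenate(counts):
        """Write per-multiunit counts side by side: [3, 4, 29] -> '3429'."""
        return "".join(str(c) for c in counts)

    counts = [3, 4, 29]           # 3 hundreds, 4 tens, 29 ones
    print(concatenate(counts))    # '3429' -- reads as three thousand ...
    print(3 * 100 + 4 * 10 + 29)  # 369, the value the blocks actually show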
Table 5. Accuracy and the Linked Status of Blocks and Marks Addition Procedures by Group

Block addition
  Accurate trade
    Linked to marks -- H2: 10, 11; M1: 2, 7, 7, 9; M2: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15; L2: 4, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14
    Not linked to marks -- H1: 2, 3, 4, 5, 6; M1: 3, 4, 5, 6; L1: 2, 8, 9, 10, 11; L2: 3, 4
  Correct sum but trade not shown with blocks
    Mentally added the trade -- H1: 1; L1: 2
    Copied traded answer from marks -- H1: 9
  Added like multiunits and left sums of ten or more (not linked to marks) -- H2: 1, 2, 3, 4, 5, 6, 7, 8, 9; L1: 1
  Traded inaccurately -- L1: 3, 4

Digit card/magic pad addition
  Accurate trade
    Linked to blocks -- H1: 1a; M1: 7, 7, 8, 8, 9, 9; M2: 10a, 11a, 12, 13b, 14, 15; L1: 2; L2: 4, 4, 5, 6, 7, 9, 10, 11, 12, 14, 15
    Not linked to blocks -- H1: 4, 4, 7, 8, 9, 10, 11; H2: 1, 2, 3, 4, 5, 6, 7, 8, 9
  Correct sum but trade not shown with marks
    Copied answer from blocks -- H1: 5, 6; M2: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11
    Copied answer from individual papers -- H1: 2, 3
    Mentally added the trade -- L1: 3, 4a, 6a, 9, 11
  Traded inaccurately
    Linked to blocks -- L1: 1, 4a, 6a
    Not linked to blocks -- M1: 3, 4, 5, 6; L2: 1, 2, 8, 10

Note. Numerical table entries are the ordinal numbers of the problems. Many problems had multiple solutions proposed and used, and different parts of a problem may have been solved differently; each partial or whole solution is entered in the table. Groups H1, M1, and L1 used digit cards, and groups H2, M2, and L2 used a magic pad. Block counting errors and mark single-digit addition errors are ignored for the classification of accurate procedures. In linked block and mark procedures, children added or traded a given kind of block and a given marks position before moving to another kind of block or position, and the actions in the blocks world and the marks world were connected.
a. Marks were on individual papers.
b. These solutions were on individual papers and had some "unfixed" sums.
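One of the procedures classified above, group M2's two-phase "add and then fix" method (described in the group summaries below), separates column addition from trading. A sketch of its logic follows; the sketch is ours (the children of course worked with blocks and a magic pad, not code), and it uses the 1947 + 4185 example reported in the group M2 summary.

    # Illustrative sketch of group M2's two-phase "add and then fix" method.
    def add_then_fix(a, b):
        """Add two numbers given as digit lists (thousands -> ones)."""
        # Phase 1: add like multiunits, allowing sums of ten or more.
        sums = [x + y for x, y in zip(a, b)]
        # Phase 2 ("fixing"): wherever a column holds ten or more, trade
        # ten of that multiunit for one of the next larger multiunit.
        # The trades may be made in any order, as the children's were.
        while any(s >= 10 for s in sums[1:]):
            for i in range(len(sums) - 1, 0, -1):
                if sums[i] >= 10:
                    sums[i] -= 10
                    sums[i - 1] += 1
        return sums  # assumes the leading column ends below ten

    # 1947 + 4185: unfixed sums 5 10 12 12, fixed to 6 1 3 2, as in the text.
    print(add_then_fix([1, 9, 4, 7], [4, 1, 8, 5]))  # [6, 1, 3, 2]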
Patterns over all the groups and important issues that arose in any group are discussed in the main section following these group summaries (the Discussion of the Addition Experiences). Some readers might wish to move directly to that section.

A brief overview of these group results is as follows. Children were much more accurate in their block than in their mark trading procedures. Across the six groups, only one group made any inaccurate block trades, while five groups made inaccurate mark trades. Three of these groups made a mean of 13 inaccurate marks trades. There is not space in this chapter to describe each of the solution procedures in detail; these are described in Fuson, Burghardt, and Fraivillig (1992). The focus here is on the relative accuracy and ease of addition with blocks and marks, and on the nature of the relationships children established between these two worlds. The amount of supportive or misleading verbal descriptions and explanations, and the quality of such verbalizations elicited by the experimenter, are also briefly summarized.

Each of the groups established a somewhat different relationship between working with the blocks and working with the marks. The groups also varied considerably in the accuracy of their marks procedures and in the number of different inaccurate marks procedures they used. These paths through addition were influenced by the extent to which the dominant and most socially skilled individuals in the group possessed and used conceptual multiunit knowledge versus rote marks rules.

Group H1. In group H1, L found the block sum for the first addition problem by looking at the two block addends and adding each kind of block, mentally trading over to the next larger multiunit when necessary, writing these sums on her paper under columns she made headed 1000, 100, 10, 1, and then making the block answer from her written marks answer. On the next and all successive problems on which block addition was done, whenever there were ten or more of a given kind of multiunit, children traded ten of these blocks for one block of the next larger size. There was relatively little discussion or justification of this block trading throughout all of these problems, and many trades were made without any verbalization. Each child at some time did make at least one comment articulating the ten-for-one nature of the trade. For example, E said for the very first trade, "There are twelve blocks (singles), so ten become one of the carrots (he makes this trade and puts the carrot with the carrots in the top addend)." Examples from the other children are: C: "Take one ten out of there"; L: "We took the ten pancakes and handed them in for a Big Mac"; and N: "Thirteen tens. And an extra hundred (adding a plate to the answer). And that's three tens." The children cooperated in physically making the block trades, indicating that they all understood these ten-for-one trades well.

Most of the time the blocks were not related to the digit-card addition method. Frequently the girls did addition with
the blocks while the boys did addition with the digit cards (or vice versa), and there was relatively little discussion or linking of the two during addition (they usually were on different parts of the procedure) or afterwards. These children did not devise a complete digit-card solution to the problem of trading until the sixth digit-card attempt (problem 7), when the experimenter said just to do the digit cards. In the earlier digit-card procedures children simply made digit-card addends and copied the sum from blocks or individual papers, or added the digit cards mentally, or devised only partial procedures (problem 4). For problems 7, 8, and 9, children showed the traditional algorithm by putting a card containing the digit 1 above any column that received a trade. There was little discussion or justification of the digit-card procedure or of written marks procedures done on individual papers. When the experimenter did ask for an explanation, these children did produce conceptual quantitative descriptions of their addition procedures that indicated that they had at least implicitly linked the block multiunit addition to written marks addition and that for them the traditional algorithm involved trades of ten of one kind of multiunit for one of the next larger multiunit, not just writing little 1's. Thus, they were capable of levels of discussion and explanation that were much higher than those generated spontaneously in the group. However, even these children would have benefitted from explicitly linking the blocks to their marks procedures, as when on problem 8 E changed the tens sum (which came from 7 + 3) from 0 to 1, saying, "I never can remember if you're supposed to put a 1 or 0." He soon changed it back, but this would have been a good opportunity for the experimenter to ask him to answer his question by thinking about the blocks.

Group H2. Group H2 immediately defined "adding blocks" as counting all the blocks for a given multiunit or pushing the blocks together, but, until the tenth problem, they did not do a full trade when they had ten or more of a given kind of block: The blocks answer for several days had at least one multiunit with ten or more blocks. On the magic pad, they used the traditional vertical algorithm, writing the 1's above the next left column. They all knew this algorithm before the addition phase began. They sporadically linked the blocks and magic pad addition for one or two columns, but they did not link or really even compare a whole blocks addition and answer to their magic pad written addition until the tenth problem, worked during the fifth day of addition. On that problem M, who had from the beginning focused more on the blocks and tried to link the blocks and magic pad more than any other group member, related her whole individual written marks problem to the blocks problem by describing blocks addition using block words and showing what marks she had written for each kind of block. The experimenter asked about the difference between the blocks answer and their
individual magic pad answers (these all showed traded answers for tens and hundreds while the blocks were not traded), and the group immediately traded ten skinnies for one flathead and ten flatheads for one fatty. The group then spontaneously traded blocks on the next problem. No one in this group ever spontaneously gave a full conceptual explanation of trading, and such explanations were not elicited by the experimenter. Their later subtraction work indicated that they were capable of high-level thinking and did have ten-for-one conceptions of trading.

Group M1. Group M1 quickly constructed accurate blocks addition, trading with the blocks on the second and all subsequent problems. They began with the blocks linked to the digit cards, and almost constructed a correct, fully linked addition procedure on the second problem, where discussion of the blocks hundreds/thousands trade led them to make a correct digit-card trade instead of writing the hundreds sum as two digits. They did not spontaneously make a similar blocks and digit-card link for the ones/tens trade, and so ended with a blocks answer of 5376 and a digit-card answer of 53616. On the next day they did a complete correct blocks addition with trading, and then moved to the digit-card world, where they never connected the digit cards to the blocks. This separate pattern continued for four more problems over three more days. During this time they generated multiple incorrect digit-card marks procedures, suggesting or carrying out as many as four different incorrect procedures on one problem. On three of these days they also wrote marks problems on their own individual papers. Much of the time these individual solutions were different from the digit-card solution showing at that time, and the incorrect procedures circulated among the group members like a virus, popping out on the digit cards or on individual papers with little predictability.

For most of this part of the addition phase, these children were operating in three different unrelated worlds: the blocks world, in which they carried out correct multiunit addition and seemed to understand their trading procedures; a digit-card marks world; and an individual paper marks world. Occasionally they even operated in a fourth world, the problem card on which the problem was written horizontally, because they would discuss a procedure by pointing to that card, and the discussion or proposed solution might differ from the solutions in the digit cards or on individual papers. On problems 5 and 6 the experimenter tried to have them connect the blocks to the marks procedures by having them talk about the marks procedures using block words; on problem 5 they actually used blocks, and on problem 6 they just used block words. But they did not carry out the linking consistently and continued to use wrong marks trading procedures. Finally on problem 7 the experimenter enforced block and digit-card links, making the children relate their
procedures column by column (e.g., as soon as they added and traded the unit blocks, they had to show that in the digit cards). They carried out correct linked methods in both worlds. For some parts of the problem, they even constructed two different correct linked methods (increasing the bottom addend by one, and the standard algorithm of putting the traded new multiunit block above its multiunit column). On the next day the experimenter asked them to do the digit-card addition while saying block words. As long as they said block words, they traded the digit cards correctly, beginning by using the traditional algorithm. But they stopped using block words, and then reverted to their most frequent wrong digit-card procedure (in which the ten from any two-digit sum was written above the tens column because it was a ten). The experimenter again forced them to use block words, and they solved the problem correctly in two ways, using the traditional algorithm and increasing the top number by one to show a trade. Each of these procedures was described by two different group members using block words.

On the next day, the experimenter continually had them describe the digit-card actions with block words (the child doing the digit cards was a "digit-card robot" that could only move after the blocks had moved), and they did a digit-card procedure tightly linked to their blocks solution. They traded correctly in blocks and digit cards and described their digit-card trades in block words, except for one long digression in which they used English words, leading two members to argue for the procedure in which the "ten" from any two-digit sum (e.g., the ten from 8 + 7 in the hundreds column) is written above the tens column. They then all worked the problem on their individual papers and all got the same answer; until they were forced by the experimenter to link the blocks to the digit cards, at least two children on every problem had done incorrect marks procedures on their individual papers. One child spontaneously used Asian tens to read his answer aloud: "four thousand two hundred one ten nine" (this was the one group that was not taught the Asian tens). The children then did another problem individually on paper, getting the same correct answer. They all used the traditional algorithm, but there was not time to discuss it using block words.

Group M2. The five members of group M2 cooperated very well throughout the addition phase, kept the blocks and magic pad procedures completely linked throughout this phase, and invented the only correct nonstandard marks procedure used for any length of time by any group. From the very first problem, they added one kind of multiunit block by pushing all of them together, wrote that sum on the magic pad, pushed together another kind of blocks, wrote that sum on the magic pad, and so on. This resulted in at least one two-digit answer for every problem. The group then "fixed" this answer, trading ten of any block for one of the next larger
block wherever possible. For ten problems solved over four days, they copied the fixed answer from the blocks onto the magic pad. Initially they did all the block trading and then copied the whole traded answer. But these group members were very conscious of keeping the blocks linked to the magic pad, and someone beeped whenever some action with the blocks was not immediately recorded on the magic pad (this was supposed to be the modus operandi for all the session 2 groups). This linking soon led them to write each intermediate answer after each trade. For example, for 1947 + 4185, they wrote each fixed answer as they changed it; they changed 5 10 12 12 to 5 11 2 12 to 6 1 2 12 to 6 1 3 2. All members clearly understood their two-phase addition method. They called the second phase "fixing" or "changing" the answer and realized that this phase was necessary in order for the marks not to show an answer that was too large (recall that this was the group that spent a long time in the preaddition phase figuring out how to write a block array with 29 units). However, they did not really reflect on the marks procedure itself, and the successive traded answers were often written unaligned below the problem, or scattered across the page, or even on a different page from earlier answers and from the problem, so that such reflection was not easy to do.

On problem 9 the answer was written after all the trades were made because there was no official writer. One child did not understand this answer, and other children explained the answer using English words to explain the fixing that had been done while pointing to the magic pad problem. On problems 10 and 11 they did the problems with blocks and recorded on individual magic pads (pieces of paper). The child making the blocks for problem 11 put out one too many flats and one too few longs. N added the columns on her individual paper rather than writing the announced block sum, as had been done on earlier problems and was done by other children for this problem. She noticed and stated that her sums were different from the block sums. The group decided that the block sums were wrong and corrected them, and then fixed the magic pad problem. Thus, at this point they could carry out the first phase of their addition procedure just in the marks world by adding each column of marks.

To support reflection on trading in the marks procedure, the experimenter began the sixth addition day by showing the group their unfixed and fixed marks magic pad answer from the day before and asked them how they could fix such an answer without actually doing the block trading. The subsequent conversation included many descriptions of imagined individual block trades that related to particular marks, and children could write the marks trades if they thought about the blocks for that particular trade. In response to the experimenter's repeated request for a marks procedure that did not involve thinking through the individual
block trades, two group members evolved a method in which the 1 to be traded was written above the sum to the left and a small x was put below the 1 in the sum. The fixed answer could then be written in one step by increasing each sum number that had a 1 above it (each 1 reflected a ten in the column to the right). This written procedure was described in block words, so it was clear that these children understood the traded 1 as ten coming from the right. For example, an explanation of the new procedure of writing the 1's above the sum to the left was: "I took ten of these (teeth) and put one licorice up. Then I took ten licorice and put one pancake up. Then I took ten pancakes and put one ice cube up."

Over the next two days the experimenter supported a fading procedure in which children increasingly worked in the marks world while still relating the marks to the blocks by describing whatever they did with the marks in block words. On problem 12 they described the blocks trades before they fixed their marks answer. For each of the final three problems they first solved the problem on their individual magic pads and then did the problem with blocks and discussed the problem. On the first such individual problem, solutions ranged from completely correctly fixed problems to partially correct fixes to sums not fixed at all (e.g., 2 15 10 8). The last two problems had almost completely correct fixed solutions by the three children (the weakest three) present on this last addition day; there were two errors (one sum with one too many and one sum with one too few) in the 18 fixing steps these three children did on these two problems. Again there were full explanations in block words of the marks trading, and both group magic pad solutions integrated the group's fixing procedure with the traditional algorithm: the two-digit sums were written below the problem, 1's were written above the problem in the columns to the left where necessary, as in the traditional algorithm, and the fixed answer reflecting the sum of these 1's and the unfixed sum in that column was written below the unfixed sum. In the final solution N also wrote little x's below the 1's in the unfixed sum because "I just wanted to show what I put up there when I carried."

Group L1. Group L1 immediately added like blocks, but they did not trade blocks on the first problem. For a long time, the whole group was driven by procedural rules and usually used concatenated single-digit words in describing their marks procedures (e.g., "Put the two from the twelve there and put the one at the top."). On the first problem they suggested or did with the digit cards two incorrect and two correct trading procedures, a correct one of which was linked to the blocks (looking at the twelve tinies led D to take away the 8 and 4 digit cards in the ones column, replace them with a digit card 2, and replace the digit card in the tens column of the top addend by a number one larger: change a 3 to a 4). On the second problem the two girls traded blocks from the ones to the
tens, used the traditional algorithm with the digit cards, and clearly described the trade both in English words and block words. They also traded the blocks accurately for the hundreds/thousands trade. D was the leader in the mathematical aspects of these activities, while M was more socially dominant but was the mathematical follower. Their physical trading procedure involved counting as many blocks as the number of ones in the sum and moving them aside to keep them for the block answer, removing the remaining ten blocks, and adding one of the next larger block. For example, for 7 + 6, they counted three pancakes and moved them aside, removed the remaining ten pancakes, and added in one big block. The two boys did not understand the hundreds/thousands trade, and the girls only described it in single-digit terms or described their counting actions literally. This led the boys, and the whole group thereafter, to describe trades as "take away three" instead of saying anything about the ten traded blocks. In the third problem one block trade was made and the other was described but not done because the boys were being silly. The digit-card answer was obtained by mentally trading and adding in the extra multiunit to that sum.

Then began a six-problem sequence over five days in which the most dominant member of the group, M, imposed an incorrect procedure on the group. M reacted to the correct digit-card solution of problem 3 by carrying out her new procedure instead. This procedure stemmed from the "take away" language used in the early block trading and a rule repeatedly stated by M: "You can't have ten in any column." In M's procedure, when the sum of any column was more than nine, the addend digit cards were taken away and replaced by a 9 digit card, nine blocks were left in that column, and the rest of the blocks were thrown away. This resulted in answers with many nines in them (e.g., 4995 and 6999). Everyone agreed with M's rule (you can't have ten in any column), but there were repeated rebellions over the five days as various group members objected to the nines procedure and tried to discuss alternative procedures. M was very domineering in her responses to this resistance, and usually won by stating her rule. But several times she expressed frustration and said that she did not understand what she was doing, and she also proposed substantive objections to her procedure (e.g., when someone else did her nines procedure with the blocks, she said, "You can't just throw some blocks away. You have to use them."). Almost all of the discussions of proposed correct procedures and of the nines procedure used single-digit procedural words, and the multiunits in the blocks were not used by the children in this discussion.

On problem 8, the experimenter focused them on the blocks, asked why they threw away one rectangle (they were making 7 + 3 = 9 with the rectangles), and asked if they could do something with ten of the rectangles instead of just throwing
one away. M immediately responded that that was what she had been trying to do earlier when "I was trying to explain that we should put the one in the other column" (i.e., do a traditional marks trade); she had proposed this earlier when they were trading. Over this and the next problem the experimenter supported the children's block trades, sometimes perhaps giving suggestions before it was necessary. On the final two problems the children did the block trading independently and spontaneously, but they never evolved a digit-card trading procedure to show the trades. They added in the trade mentally without showing it with a marks 1. This may have been partly because their digit-card procedure was to take away the two addend cards and replace them with the sum card. Thus, in both the blocks and the digit cards, only the answer showed at the end of addition. Three of the four children did the same marks procedure on their paper as with the digit cards -- they added in the trade mentally and did not show it with the traditional 1 mark written above the top addend. The children never spontaneously gave a full explanation of the marks procedure in English words or block words, and such explanations were not elicited by the experimenter.

Group L2. From the first problem the five children in group L2 added like multiunit blocks, though they continued throughout the addition phase to argue about whether they should show the sum with extra blocks or just push the addend blocks together. For the first three problems over two days the added blocks were not traded or linked to the magic pad written marks procedure. Various incorrect trading procedures as well as the correct traditional procedure (called regrouping by the children) were done on the magic pad. "Regrouping" for all the children involved writing a little 1 somewhere. They always referred to the regrouped number as a one (never as a ten or hundred or thousand) and never explained what they were doing or why. A typical such interchange is the following: N: "Well, then what's the one there for?" T: "It's just because you regrouped, and you keep the one there for a little mark."

On the third addition day (problem 4) the experimenter emphasized that they had to write on the magic pad each time they did something with the blocks and enforced those links throughout the problem solution. The experimenter also asked children for explanations several times during the problem solution; explanations of particular actions with the blocks or marks written on the magic pad were seldom given spontaneously. There were three trades required by this fourth problem, and each was discussed at length by the group. For each place in which a trade was needed, some children suggested or physically did a ten-for-one block trade or put ten of one block together in the column to the left. Discussion of these trades often used ten to describe the 1 written on the magic pad. However, for each trade at least one marks magic pad or blocks procedure
stemming from a regrouping notion of writing a little 1 somewhere was also suggested or done; these were always described by using procedural descriptions of writing single digits somewhere. The children were solving the problem from left to right, so some wanted to write the 1 above the column to the right (a correct procedural analog of the usual right-to-left solution: write the 1 above the next column to be added). The children who focused on the block values and block trades and those who focused on single-digit marks regrouping varied across the problem solution. For each place, children were eventually convinced by the blocks and by the explanations of the blocks trades, and they agreed to trade the blocks and agreed on the correct marks writing of these trades. Except for marks errors on problem 8, the children solved ten more problems over five more days in which the blocks were traded accurately and a correct written marks procedure linked to the blocks trade recorded these trades.

The group solved problems left-to-right and right-to-left and, at the suggestion of the experimenter, did some problems both ways in order to decide which way they thought was best. This issue for most of the addition phase was a boys-against-girls issue, with the boys wanting to add left to right. In this group the two boys happened to be the weakest mathematically; whether this was related to their preference is not clear. They decided by the end that right-to-left was easier because they did not have to cross out sums they had written and write new ones.

This part of the addition phase was not as smooth as indicated by the uniformity in Table 5. Although the group worked hard and well over the whole period, there were continuing controversies about who got to write what, who got to do each kind of block (five children and four kinds of blocks meant one person was without blocks), whether to begin on the left or on the right, and whether to use extra blocks or just push the blocks together. Some of these controversies were carried on simultaneously, and the discussion became quite confused. The experimenter needed to intervene at times to facilitate their resolution of these issues.

As in group L1, these children also rapidly moved toward a procedural take-away description of the block trading in which the trading was not described and ten was not mentioned. Instead, only the number of blocks remaining was stated: for twelve pretzels, "Leave two out, take away the rest." This was more efficient than counting ten and taking them away and was based on conceptual knowledge: twelve consists of ten and two, so if we count two and take the rest, we will be taking ten. But for the two weakest group members, the failure of the conceptually strong group members to give conceptual explanations, or even descriptions that included the word "ten," for the block trades as they were carried out meant that these members still retained their procedural single-digit marks regrouping orientation along with the new blocks trading procedure and that they
could be quite fuzzy about the ten involved in the trade. Once the experimenter asked the group how many were left after they had taken away the ones of the two-digit sum, and the answers "nine" and "eleven" were given. On the last day of the addition phase, the experimenter asked each group member to give block word and English word descriptions of the block trading when they were just doing marks problems. Three of the members gave several good explanations that indicated that by now the marks procedure was no longer just marks single-digit regrouping but was firmly grounded in conceptual multiunit quantities. The two weakest members could not consistently give such explanations. Their linking of the blocks to the marks would have been facilitated by conceptual multiunit descriptions from the stronger group members rather than by the short-cut procedural take-away description of block trades.

Other aspects of multiunit addition
The other component of multiunit addition, single-digit addition of each multiunit, and the technical aspects such as copying the problem, writing digits, and counting the blocks did not present much difficulty to these children. These children either knew single-digit facts or had fairly rapid solution procedures for finding them (such as sequence counting on). Thus, adding like multiunits in the marks world was fairly easy when they began to fade into just doing the marks problems without the blocks. One problem was miscopied from the problem card. Blocks were miscounted in making the block addends several times. These errors sometimes were caught quickly and at other times resulted in long derailments of a problem solution because the source of the difference between the blocks and the marks was not seen immediately. Such errors helped us to ascertain which children were working from the blocks to the marks and which were only in the marks world (the marks of the former reflected the incorrect number of blocks), but these derailments were frustrating to the groups. The block addends and sums also sometimes became incorrect during the solving of a problem because children played with blocks in the problem (and removed them in so doing) or because the block problem became quite messy and the blocks in the problem merged into the nonproblem blocks in the block bank reserve.

Total time of the addition phase
Groups H1 and H2 took five days for the addition phase, and all of the other groups took eight days. During this time Groups H1, H2, and L1 solved 11 problems, Group M1 solved 9, and Groups M2 and L2 solved 15 problems. Thus,
the high-achieving groups averaged about 2 problems per day, while all the others averaged from 1 to 2 problems a day. The two high groups did have fairly good conceptual understanding of marks addition at the end of that time, though the experimenter did not force full linked procedures or full conceptual explanations by everyone in both groups, partly because she felt that these children had good understanding. All of the other groups were moved into the subtraction phase earlier than was ideal because of our contract with the school, which required us to do subtraction as well as addition within the allotted time.

Discussion of the addition experiences

Linking the blocks and marks
The purpose of base-ten blocks is to enable children to construct conceptual multiunit structures as meanings for multidigit written marks and English number words. Their function in addition of multidigit numbers is to enable children to use their conceptual multiunit structures to understand how to add multiunit numbers and, eventually, how to carry out meaningful written marks addition without needing to use the blocks. To facilitate both of these goals, each experimenter was supposed to establish and enforce linked procedures in which blocks addition was tightly linked, for each kind of multiunit, to marks addition with the digit cards (first session) or on the magic pad (second session). At the beginning of the addition phase each experimenter gave such linking directions: as soon as children did something with the blocks, they were to do that same thing with the digit cards or on the magic pad. Because children in session 1 frequently violated this directive, the session 2 children had the further linking directive that each child was to beep whenever something was written on the magic pad that had not been done with the blocks or something was done with the blocks that was not written on the magic pad. In spite of these directions, all groups except group M2 failed to link marks addition to blocks addition. For some groups this separation continued for days. When children were functioning in a marks world separated from the blocks world, this unlinked marks world proved to be a fertile ground for generating many different incorrect addition procedures (the incorrect entries in Table 5 are described in Fuson, Burghardt, and Fraivillig, 1992). The group M1 children on some days were even functioning, at least briefly, in four different unlinked worlds: blocks addition, digit card addition, marks addition on their individual papers, and marks addition on the horizontal problem card. In all groups, when the experimenter imposed these links after children had devised and persisted in
incorrect marks addition, one simple directive was not enough. Children might record the marks for the block addition of one or two kinds of multiunits, but the blocks and marks would then become separated. The experimenter had to continue to monitor and support linking in order for children to execute a fully linked procedure in which each addition with blocks was immediately recorded. Such linking support was necessary for at least one day and, in some groups, for two days. Children then seemed to be able to carry out a linked addition solution without any support.

There were several identifiable sources of this difficulty in linking. First, spontaneous comments by children in some groups indicated that children constructed different interpretations of the chunks involved in the experimenter's linking directive. Group M2 wrote down each digit as a block array was made; the other groups usually wrote the whole multidigit number after the blocks were made. One child in another group argued that they should not write the first addend number after making it with blocks because they "had to make the whole problem (i.e., both addends)" before writing it. Thus, the crucial linking of writing the result of adding and trading (if necessary) one kind of block needs to be clearly articulated, emphasized, and monitored by the teacher.

Second, the practical division of work sometimes contributed to this lack of linking. Who got to do what, and when, was an extremely important and emotionally charged issue in every group (except perhaps group M2, which, under an effective and fair initial leader, quickly evolved an atmosphere of equal participation). There were long arguments in many groups about turns and fairness. In most groups the leader chose who got a turn at something. In some cases these decisions were based on who had not yet received a turn, but in many cases the choices seemed to reflect friendship or criteria other than equal turns. A time-consuming counting rhyme was chosen as a fair procedure in one group. Sometimes a whole problem would be worked by one child chosen by the leader; linking then depended upon that child and, to a lesser extent, on the rest of the group. Sometimes some children would do the blocks while the others did the digit cards (e.g., girls the blocks and boys the digit cards); this set-up proved to be quite difficult to link, with each subgroup having its own momentum. To facilitate participation by everyone, the experimenter for groups L1 and L2 instituted the agreement that each child had one kind of blocks. For group L2, with five members, this meant that the child without blocks wrote on the magic pad (a potential source of unlinking, as this child might move ahead of the blocks or lag behind the block solution). Distributing the blocks in this way did involve everyone on every problem and worked fairly well, but children sometimes went
out of turn or were so involved in the problem solution that they did another child's blocks (with consequent protests).

Third, at least initially, the direction of the link from the blocks to the written marks--and the consequent status of the digit cards or magic pad as the written record of the blocks procedure--was not emphasized enough. The fact that many of these children already had a procedure for adding written marks also interfered with establishing the link in this direction because there was some tendency to use the written marks procedure they already knew and make the backwards link from the marks to the blocks. Nor was the purpose of the blocks underscored sufficiently: to enable children to construct written marks procedures they could understand, explain, and defend conceptually in terms of attributes of the multiunit numbers they were adding. The learning task should have been presented from the beginning as one of using the blocks to help explain, in terms of the blocks and in English words, why one or more written marks procedures worked. Without these needed emphases, the children assimilated this task into their usual school mathematics set: Learn how to do something -- add with the blocks and add with the marks -- and these procedures do not need to be connected or explained except by rote rules. An emphasis on explaining why a marks procedure works might also have elicited much more discussion and explanation than these children generated spontaneously.

Fourth, the relatively small space shared by groups H1 and L1 meant that the blocks problem and the digit-card problem were crowded together, and sometimes children did not have room to lay out the digit cards exactly as they had laid out the blocks. Thus, sufficient space must be provided to support linking.

In all groups in which the experimenter imposed links between the blocks and the written marks, these links did prove to be sufficient to direct a correct marks procedure and to eliminate incorrect marks procedures that had been done before the links were made. The collectible multiunit quantities in the blocks were salient enough to direct and constrain correct block trading, and any block trading was easily recorded as a written marks procedure. Block trading always involved a ten-for-one trade, but the one next-larger block could be placed in various places in the block problem. No group ever made any incorrect block trades when they approached the trading problem within the block world. Some children needed their attention directed to the collectible multiunits as a potential source of the solution to their problem of having too many of a given multiunit, but once attention was focused on this feature of the blocks, all children saw the sense of block trading immediately. No one ever objected to a block trade, unlike the objections or reservations that were voiced about the incorrect marks procedures. Even the weakest members could think fruitfully about the block trades, as with one of the weakest members of L2, who said after the first block trade was made (this happened in a case in which the sum of the hundreds was exactly ten flats, so all ten flats were traded for a big cube), "I don't like this idea if we go put 'em all on." In this case the trade had been ambiguous and could be interpreted as trading ten or trading all of the blocks; this child was checking to be sure that they were doing the former and not the latter.
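The constraint the blocks imposed can be stated as a tiny simulation. The sketch below is ours, not part of the study materials; the block names and the counts-per-kind representation are illustrative assumptions.

```python
# Illustrative sketch: blocks addition with ten-for-one trades, where a
# number is a count of each kind of block, smallest first
# (e.g., 1968 -> [8, 6, 9, 1]).

BLOCKS = ["units", "longs", "flats", "big cubes"]

def add_with_blocks(a, b):
    pile = [x + y for x, y in zip(a, b)]   # push the two block piles together
    for i in range(len(pile) - 1):
        while pile[i] >= 10:               # too many of one kind of block:
            pile[i] -= 10                  # take ten of them away and
            pile[i + 1] += 1               # trade for one next-larger block
            print(f"trade ten {BLOCKS[i]} for one {BLOCKS[i + 1][:-1]}")
    return pile

print(add_with_blocks([8, 6, 9, 1], [4, 7, 3, 2]))   # 1968 + 2374 -> [2, 4, 3, 4]
```

Because a quantity here is literally a pile of a given kind of block, the only move available when a pile grows too large is the ten-for-one trade; the incorrect marks procedures simply cannot be expressed in the block world.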
The behavior of group M2, which most clearly exemplified the desired blocks-to-marks link approach, reveals another function of a teacher that might be necessary during the fading procedure to the marks. For several days, these children did not reflect on the marks procedure at all and often did not even record their blocks addition in such a way that they could really reflect on it; they wrote the successive fixing answers on different pages or scattered them in a disorganized fashion all over the problem page. Therefore a teacher should monitor the recording process to ensure that it is eventually done in such a way that children can reflect on what they are doing in the marks world. Children may also need to be helped to do this reflection in the marks world -- still strongly connected to the blocks world by blocks words -- to facilitate the fading process from the blocks to just the marks but with multiunit meanings attached to the marks.

The final step in the fading process is to think about the blocks while doing the marks procedure. For children who had instruction in which the blocks modelled the standard algorithm, this step proved to be very powerful in helping those who later started to make errors in the marks procedure self-correct these errors (Fuson, 1986). The collectible multiunits in children's mental images of the blocks were sufficient to direct them to a correct trading method (when they made an error, they were asked to think about the blocks), and they verbalized these corrections with block words or English words or mixtures of the two.

Although in no group was there a great deal of spontaneous verbal description of the block trades in block words or English words, most children who were asked by their experimenter were able to make such descriptions while looking only at the marks procedure. Weaker children were not always able to do so. This suggests that it would be very helpful if the task of using blocks included describing and explaining what one is doing with the blocks. This would mean that initially the abler children in a group would give full block multiunit descriptions of their block trades, enabling the weaker children to follow the block trade and link it to the written marks recording. With such modelling, the weaker children could become able to verbalize their block actions. Describing the marks addition in block words would help ensure that children were constructing and using multiunit conceptual quantities for those multidigit numbers instead of the inadequate single-digit conception of those numbers. Verbalization would facilitate all phases of the
linking process and support children's use of the blocks to monitor and self-correct errors that might otherwise creep into their marks procedures.

Aspects of helpful verbalization
The previous section described several aspects of verbalization that can help establish the initial links between blocks and marks addition and then support the reverse marks-to-blocks link mentally when the child is no longer using blocks. The importance of initially emphasizing explaining and justifying block and mark addition--not just saying what one did, but saying why one did it or could do it--was also discussed. This section focuses on other results concerning verbalization.

Children may initially need the teacher's support to say the multiunit word as well as the number of multiunits. As in the preaddition phase, many spontaneous descriptions of the block trades named how many one was trading but did not name the multiunits involved in the trade: "I'm taking ten and putting one here." Perhaps because English, unlike Spanish with its diez and decena, does not differentiate between these two uses, the failure to say the multiunit word after ten (ten whats?) led children to confuse the function of these two meanings of ten (the unitary ten telling how many of a multiunit and the multiunit ten in the second position). Such confusions led to the prolonged use in group M1 of the incorrect marks procedure of putting all trades in the tens column. Each 1 was written above the tens column because it was one ten coming from the sum of the two numbers in a column. Thus, the ten (the number of a particular multiunit) went to the tens column (a kind of multiunit). As soon as these children consistently focused on the kind of block involved or on the multiunit word (ten whats?), they saw that ten hundreds or ten tens would not go to the tens column (would not be one multiunit of ten). The confusions from not saying the multiunit word were briefer in other groups, but children produced other conflations of these two functions of the word ten. For example: "OK, six (carrots) plus ten (actually one traded carrot, a ten multiunit) is sixteen." Using block words is one powerful way to reduce these confusions, because the block words say the multiunit quantities and the ten then tells how many of a block there are.

For all mathematical quantities children learn about after small whole numbers, the small whole numbers in fact are always used in special new ways to tell how many of some particular new kind of quantity. In multidigit numbers it is how many of larger and larger multiunits. In fractions it is how many of a particular unit divided into how many parts. Multidigit numbers present a good opportunity for children to prepare for all of these new mathematical ideas by recognizing that
these small numbers are "how many" numbers that tell how many of something there is. Thus, they can begin the very useful practice of asking "how many whats?" for any number they see in these new uses.

The dysfunctional nature of the take-away descriptions groups L1 and L2 used to describe trading, with this language supporting and perhaps even suggesting the erroneous nines procedure, indicates that it would be helpful if teachers monitored the language used to describe trading. For this take-away description, and any other procedural short-cut descriptions that suggest wrong marks trading, teachers need to ensure that children instead give a full description using multiunit words and the numbers of these multiunits ("I'm trading ten of these pancakes for one ice cube." or "I'm putting these ten hundreds here together to make one thousand.").

These differing positive and negative effects of language indicate that future research needs to examine the effects of the kinds of verbal descriptions children produce. Resnick and Omanson (1987) reported that the amount a child verbalized when using base-ten blocks was positively related to that child's correction of written marks errors. In the present study some kinds of verbalization seem instead to have had negative effects. Furthermore, even though questions by the experimenter at the end of the addition phase indicated that most children could produce good verbal descriptions, these highly verbal children did not spontaneously produce large amounts of such descriptions. Therefore, both teachers in the classroom and researchers studying multiunit learning need to support children's production of the positive kinds of language (full multiunit descriptions of trading).

We are unable to draw any strong conclusions about the efficacy of using the regular Asian ten-structured words compared to the irregular English words because children used relatively few full multiunit verbalizations (English irregular or Asian regular). Children did learn the Asian words readily, and some children seemed to like their regularity. Other occasions on which they seemed advantageous have been discussed elsewhere. A preliminary report of a teaching experiment in which Asian tens words are used in a first- and third-grade class is in Fuson and Fraivillig (in press).

Time to build multiunit thinking
The four groups that spent eight days on addition all had agreed upon an accurate blocks procedure that was understood and could be carried out by all children. Each of these groups also had centered on some marks procedure that, for some children, was linked closely to the blocks procedure and was conceptual,
but that for other children in the group was less closely linked and not yet fully conceptual. None of these four groups had enough time in the reverse marks-to-blocks linked direction. They would have benefitted from two more days doing fully linked blocks-to-marks procedures with full multiunit descriptions to help the weakest group members, and two to four days doing faded reverse marks-to-blocks links in which children did marks procedures, described them in block words and English words, and used the blocks where necessary to clarify problematic points. If blocks-to-marks links and full multiunit descriptions had been supported earlier in the addition phase than occurred spontaneously in this study, these children might have reached the point they were at after eight days some three or four days earlier. They also would have avoided adding to their repertoire of solution procedures the several incorrect marks procedures they invented, procedures that then needed to be suppressed by thinking about the collectible multiunits in the blocks.

These results indicate that it takes a long time--days and even weeks--for high-achieving second graders to construct multiunit quantities and ten/one trade conceptual structures and to use these structures in devising, explaining, and justifying an accurate method to add multidigit marks. For children with weaker backgrounds this construction might take two or three times longer, and children might need even more support from the teacher or another expert to maintain links and produce full verbalizations in order to enable the blocks to function most effectively. In studies in which the standard algorithm was modelled with blocks (Fuson, 1986; Fuson & Briars, 1990), the amount of time spent on addition varied with the achievement level of the second-grade class from about a week to three weeks. When children are inventing their own addition method, the required time would seem to be greater, even with teacher support to curtail the long unlinked incorrect marks sidetrips taken by some groups here. Therefore the appearance of base-ten blocks in textbooks for three or four pages, as is becoming typical now (they do appear, but only for a short time: Fuson, in press-b), is not nearly long enough even for the most able children. Of course, the extent to which children need to move the blocks themselves, as opposed to seeing the collectible multiunits in pictures, is also unknown at this time, so this may be another limitation of the blocks in book pictures.

Some interventions with blocks (e.g., Resnick & Omanson, 1987), and with other physical materials used to support conceptual understanding (e.g., Byrnes & Wasik, 1991), consist of a single instructional session. When this single session fails to lead to full conceptual understanding or to accurate written computation, that physical material, or the whole approach of using physical materials to provide quantitative referents for mathematical symbols and operations, is judged to be a
failure (e.g., Byrnes & Wasik, 1991; Siegler, 1991). Instead, the real question should be why anyone would think that a single session would be sufficient for all but the brightest child to construct all of the necessary new conceptual structures in the mathematical domain in question and clearly link the quantitative features in the situation to the new operations on the mathematical marks. If a child already had all of the requisite knowledge, a single session might be sufficient to make new connections in this knowledge that would lead to an insight kind of new learning. But in most cases, children must build the requisite knowledge as well as make the connections. Such a single session is even more problematic when the subjects are not novices for whom the target written procedure is a new discovery but are instead children who have for months and even years carried out an incorrect procedure (e.g., Resnick & Omanson, 1987; Byrnes & Wasik, 1991). Finally, as we discussed above with respect to our children in their small groups, children in such a single session will bring to it their usual interpretation of the goals and purposes of mathematical activities--learning correct written procedures. Until these social norms can be renegotiated, and a new focus established on conceptual understanding and explanation, use of materials is likely to be assimilated into this expectation concerning mathematical activity. Children then are likely to see the session as having two separate components--learning to add with the blocks and learning to add with the marks--rather than trying to use the perceptual support of the materials to understand the marks procedure.

Children also took a long time to work through a problem, averaging only one to two problems per 35-minute working period. This is in contrast to the range of 7 to 12 three-digit subtraction problems worked in the single tutoring session in Resnick and Omanson (1987). Our groups' time was spent discussing various group issues and some off-task topics as well as proposing, carrying out, and arguing about various block or mark solution procedures. But this group pace is much closer to the slower pace, relatively fewer problems solved per class, and greater amount of discussion found in Japanese classrooms compared to classrooms in the United States (Stigler, 1988). Blocks can be quite effective when they are used over a longer period, both in the standard algorithm studies (Fuson, 1986; Fuson & Briars, 1990) and in the invention approach used here, but time is needed to work through a single example, and extended time over days and weeks is needed to build and connect all of the new conceptual structures.

Supporting multiunit thinking
Children in all groups were capable of much higher levels of thinking than they produced spontaneously. Group L2, with the support of the experimenter,
profitably compared the relative advantages of adding from the left and from the right. All groups could have discussed this issue. Most groups had at least one member who wanted to add the blocks from left to right, and several groups carried out full left-to-right blocks procedures. The exploration of left-to-right procedures was cut short in several groups by assertions of a rule (e.g., in reading we go from left to right and in math we go from right to left), so such an exploration might be especially natural for children who have not yet learned such rules.

We had hoped that all groups might construct at least one marks procedure that differed from the standard algorithm many of them already knew. However, some groups, especially H1 and H2, were so focused on the standard algorithm that they did not really do this. In such a case it would seem worthwhile to demonstrate one or more of the nonstandard procedures described here or in Fuson, Burghardt, and Fraivillig (1992) and ask the children to decide whether they are correct and what their relative advantages or disadvantages might be. Children also might pursue extensions of their four-digit experience, such as deciding what the sizes of the next three multiunit blocks would be (blocks for the fifth, sixth, and seventh positions) and how they would add two seven-digit numbers and why.

Role-playing activities, such as explaining an addition method to a new student who has just come into the class or helping that student figure out a method for herself, might also be interesting and help children try to articulate every step they were doing. Some children in this study showed remarkable ability to scaffold other children's learning, and even first graders can do so. During the first year of the block work reported in Fuson and Briars (1990), a new student came into the high-achieving first-grade class just as the children were completing the block work (they had been working in individual groups, each helped by a fourth grader). The teacher asked one of the strongest students to show this child how the blocks and marks procedure worked, and in one day the new child added correctly and could explain the marks addition in terms of the blocks.

If children are working in the same group for a prolonged period of time, it also may be helpful from time to time to have each group make a report to the whole class on their addition methods, discoveries, and current difficulties. The class then might discuss these alternative approaches and discoveries and suggest solutions to the difficulties. Having to make such a report may help to focus the work of the group. If any group member can be chosen at random to make the report, all group members may feel more need to support the understanding of all members of their group. Having occasional outside input might help groups whose socially dominant members have weaker conceptual understanding. These comparative discussions would require children to understand the thinking of other
groups and would seem likely to extend everyone's thinking. Of course, as with the groups here, such discussions may need to be supported by the teacher, at least initially.

Generalizing to other achievement levels
How well these results extend to second graders who are low- or average-achieving in mathematics is an open question. We collected similar data on a class of such children but have not yet completed the data analysis. This class, especially the low-achieving children, clearly needed more support from the experimenters than did the children in this study. In the studies modelling the standard algorithm (Fuson & Briars, 1990), even low-achieving second graders did learn to add four-digit numbers and explain their trading using ten/one multiunit concepts (they did this late in the year rather than at the beginning of the year, as in this study); in the lower-achieving classrooms the teachers initially modelled the blocks and marks procedures, and the children participated in justifying what was being done with the blocks. When classrooms or groups are more heterogeneous, it seems likely that the higher-achieving children might do more of the initial discovery of an addition method and then play the role of the teacher modelling this method for the lower-achieving children. The quality of the explanations given by the children in the present study, at least those elicited by the experimenters, seems sufficient to facilitate the learning of their lower-achieving peers.

Lower-achieving children may also have more trouble with the technical aspects such as copying the problem, counting the blocks, and doing single-digit addition. For the first two, it would be helpful if teachers emphasized that disagreements can sometimes be resolved by checking that the problem was copied correctly, checking that the number of blocks is correct, or starting the problem again with carefully checked blocks. To keep the problem blocks accurate and eliminate the frustrating digressions that occasionally occurred when blocks were played with or merged with nonproblem blocks, horizontal block trays for each addend and for the sum would be helpful. These would also provide perceptual support for the horizontal multiunit number versus the vertical columns in the written marks; this was a special problem in subtraction (Fuson & Burghardt, 1991). Finally, though the children in this study did not need the blocks to find single-digit sums, some lower-achieving children might need them, at least initially, for the larger sums.

Because the trading in addition requires a given multiunit sum to be put in the form of a ten and the left-over amount, and the block collectible multiunits display this tenness, multiunit addition with blocks can help children learn the efficient ten-structured up-over-ten method in which the second addend is split into a) the
part that makes ten with the first addend and b) the left-over amount. For example, 8 + 5 is done as 8 + 2 (to make ten) + 3 (the rest of 5) = ten plus three. With the blocks, 8 longs plus 2 of the 5 longs makes one flat, with 3 of the 5 original longs left. This is the addition method taught to Chinese, Japanese, and Korean children (Fuson & Kwon, in press-a, in press-b; Fuson, Stigler, & Bartsch, 1988), and it seems to be readily learned by these children, whose language supports these methods. Thus, work with base-ten blocks might support this more advanced single-digit addition procedure.
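A minimal sketch of this up-over-ten method (our illustration; the function name is invented, and a sum that crosses ten is assumed):

```python
def up_over_ten(first, second):
    """Single-digit addition via ten, e.g., 8 + 5 = 8 + 2 + 3 = ten plus three.
    Assumes first + second > 10, so that `second` covers the gap to ten."""
    to_ten = 10 - first       # the part of `second` that makes ten with `first`
    rest = second - to_ten    # the left-over amount
    print(f"{first} + {second} = {first} + {to_ten} + {rest} = ten plus {rest}")
    return 10 + rest

up_over_ten(8, 5)   # prints: 8 + 5 = 8 + 2 + 3 = ten plus 3; returns 13
```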
Conclusions

The base-ten blocks present key quantitative features of multidigit English words and marks. Empirical questions about these blocks are (a) whether, when, and how children use these features of the blocks to carry out correct blocks addition and (b) whether, when, and how children use these features of the blocks and of blocks addition to carry out correct addition with multidigit marks. Our results indicated that (a) was fairly straightforward: the blocks strongly directed children toward correct block addition procedures. For (b), we found that second graders could easily link the quantitative features of the blocks to the marks and English words. Such linking did enable them to invent addition methods for the written marks that were accurate and based on the quantitative features--methods which the children could explain and justify. Such linking also enabled them to self-correct incorrect marks addition procedures and justify these corrections based on quantitative features.

However, we identified two crucial roles of an adult in accomplishing (b). First, most children did not spontaneously make such links but instead worked within separate blocks and marks worlds. In the separate marks world, they primarily used concatenated single-digit meanings of the multidigit marks and made many different kinds of errors. When an adult supported links between blocks and marks over at least a class session, children did find it relatively easy to carry out linked procedures. It was these linked procedures that enabled them to correct incorrect marks procedures and carry out correct ones. Second, these high-achieving children spontaneously produced relatively few explanations, or even full descriptions, of blocks addition that could support the linking of the multiunit quantities to the marks procedure. This lack of explanations was sometimes detrimental to the mathematically less advanced children in a group. Again, when an adult elicited such verbalizations, most children could give them.

The ease with which children functioned within the linked setting once the adult had helped them create this setting, and the fact that many could give, when asked,
adequate verbal descriptions and justifications, suggest that these inadequacies may have at least partly stemmed from an inadequate communication of the goals of the block activity as including describing and justifying the steps in an addition procedure. It may take some time for a teacher to define the goals of a classroom as using physical materials and situations to enable children to carry out marks activities that are comprehensible to and justifiable by them, and to help children learn how to use materials to approach mathematics in this way. The power of the blocks to direct correct addition and constrain incorrect marks addition indicates that this can be a powerful approach to meaningful mathematics learning and will be well worth the extra class time and extra initial teacher preparation it takes.

These results illuminate several views of children's learning that have been or are currently rather widespread. These might be summarized as three different views or models of children's learning: (a) the Monkey See-Monkey Do imitation view of learning, (b) the Computer programmed view of learning, and (c) the Instamatic Camera view of learning (especially with pedagogical objects such as base-ten blocks). In a classroom using the Monkey See-Monkey Do view of learning, an expert (usually the teacher) models a mathematical activity or procedure, and children imitate that model. With the Computer view of learning, children are programmed to carry out the mathematical activity or procedure by being told the rule or procedure to carry out. With the Instamatic Camera view of learning, children are briefly shown pedagogical objects that present mathematical features; children are viewed as cameras that can instantly picture these objects and use them internally to direct their mathematical thinking.

Learning views (a) and (b) have dominated traditional school mathematics instruction. Learning view (c) has directed some uses of manipulative materials (pedagogical objects) for a long time; it recently appears in some textbooks and has marred some instructional research. None of these views of learning results in successful mathematics learning for most students, though all of them can be effective with some students (usually the most advanced, who can construct conceptual understanding with little support). There is considerable evidence that children do not imitate correctly (the Monkey See-Monkey Do view is not effective), they do not stay told (the Computer view is not effective), and they do not stay shown--or showing does not even work initially (the Instamatic Camera view does not work) (e.g., see Grouws, in press, for results concerning the first two views, and Byrnes & Wasik, 1991, and Resnick & Omanson, 1987, concerning inadequacies of the third view).

The view of children's learning that accounts for much of the current evidence (for example, concerning the many different computational errors children make)
is that children are goal-directed, active Meaning Makers and Rule Derivers. Children in any mathematical classroom are individually constructing meanings for mathematical symbols and deriving rules for mathematical procedures. They do this within the particular mathematical situations presented in that classroom, and their individual constructions are affected by the meanings and rules derived by the other children and the teacher.

A major reason that traditional mathematics calculation instruction fails for so many children is that much of this calculation (with multidigit numbers, decimal fractions, ordinary fractions) uses a meaning of written marks as single digits. Multidigit numbers and decimal fractions are concatenated single digits, and ordinary fractions are two single digits separated by a bar. This meaning is sufficient for the rules in standard calculation procedures; that it is sufficient is part of the power and simplicity of these systems of written marks. But the single-digit meanings of these systems are insufficient to constrain incorrect rules or direct correct ones because there are so many differences between calculation with single digits and calculation in these more complex systems.
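The two meanings of the same marks can be shown in a few lines; this sketch is our illustration, borrowing the numeral 5286 from an example later in this section.

```python
marks = "5286"

# Concatenated single-digit meaning: nothing but a list of digits.
digits = [int(ch) for ch in marks]                       # [5, 2, 8, 6]

# Multiunit meaning: each digit tells how many of a multiunit there are.
multiunits = ["thousands", "hundreds", "tens", "ones"]
quantities = dict(zip(multiunits, digits))
# {'thousands': 5, 'hundreds': 2, 'tens': 8, 'ones': 6}

# Only the multiunit meaning recovers the quantity the marks stand for.
value = sum(d * 10 ** i for i, d in enumerate(reversed(digits)))   # 5286
```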
Various mathematical pedagogical objects have been invented in order to address this issue of helping children learn adequate meanings for various mathematical systems. However, the research literature has been quite mixed concerning the success of various pedagogical objects. Many of the failures, we believe, are due to two sources: (1) use governed by inappropriate or inadequate learning theories and (2) an inadequate understanding of the mathematical system or procedure the object is intended to support, which results in an inherently inadequate pedagogical object.

When the efficacy of particular pedagogical objects has been tested using one of the three learning theories (a) through (c), children's understanding does not improve considerably, and they may not learn the targeted mathematical procedure as well as children not using the pedagogical objects. These failures occur because the teaching effort does not recognize children as active Meaning Makers and Rule Derivers and thus ignores where given children are at the beginning of the teaching effort. Use of pedagogical objects has to begin where children are. Such use has to recognize that these pedagogical objects will be viewed by a given child with the conceptual structures that child has at that moment. For most children, there will be some distance between the conceptual structures they possess for the targeted domain at the beginning of teaching and the desired conceptual structures that the pedagogical objects are designed to support. Our results here, and the results of successful use of pedagogical objects elsewhere (e.g., Wearne & Hiebert, 1989), all indicate that the successful use of pedagogical objects requires a process of interiorization of the features of and actions on the pedagogical objects. This
process of interiorization is neither rapid nor veridical. It depends on the conceptual structures the child already has, and for most children it will take days and, for some children, weeks or even months. These conceptual structures, and the amount of sensitively adapted conceptual support in the environment in the form of adults and other children, determine for a given child the rapidity of interiorization and the nature of the interiorized conceptions. The brief group summaries of addition given in this paper give some indication of individual differences in this process of interiorization; more detailed analyses of individual learning paths as affected by the conceptions of others in the group are given in Burghardt (1992).

Successful uses of pedagogical objects also depend, as we have seen in the study reported here, on children's making constant close links between the pedagogical objects and the mathematical symbols (here, multidigit marks). If these two worlds remain separate, the meanings potentially supportable by the pedagogical objects cannot and will not become linked to the mathematical symbols. Further, our results indicate that children do not naturally link these two worlds. If anything, strong social forces may continually seek to separate these worlds. Thus, successful use of pedagogical objects may depend upon a teacher's support of such linking. Unfortunately, other evidence indicates that teachers do not recognize the need for this linking (Hart, 1987). Their typical pattern is to use the pedagogical objects (base-ten blocks, in Hart, 1987) alone for some period and then to move to marks with little (one day) or no linking of the pedagogical objects to the marks. Our results indicate that this is just the opposite of what children need. The children had little difficulty in adding with the blocks. The blocks were successful pedagogical objects in that their features did direct children to correct multiunit addition and correct trading. But children did have considerable difficulty with addition with marks if this addition was not connected to the blocks. Therefore, children need much of the learning time spent on experiences in which the blocks world and marks world are tightly connected in order for the marks to take on the quantitative meanings supported by the blocks.

Furthermore, our results indicate that, after strong connections between the worlds are made, children may need time working just in the marks world while using interiorized blocks meanings for these marks. This can permit children to reflect on and connect various marks procedures (e.g., group M2 connecting their invented fixing method to the standard algorithm) and give them practice in using these interiorized meanings to direct and correct marks operations. The results of Fuson (1986) indicate that second graders of all achievement levels, and even high-achieving first graders, do interiorize base-ten blocks, and most of them can
use these interiorized blocks to self-correct errors that may arise over time in their marks procedures. A few children who began making marks errors needed to use the actual blocks to self-correct their errors, another indication of the individual variation in the process of interiorization.

The combination of the (usually) slow process of interiorization and the need for prolonged linking of the pedagogical objects with their verbal and written symbols results in a prolonged and complex learning experience with different phases. Initially there is a period of close linking of actions on the pedagogical objects and actions on spoken or written mathematical symbols. Then there may be a phase in which the marks are used without the objects but verbal descriptions of pedagogical object actions are given to keep the meanings linked to the marks. Finally, there may be a phase of use of the marks in a particular solution in which the meanings are not explicitly accessed during the solution (e.g., 5286 and 2749 are added without accessing multiunit meanings). However, the goal of the use of pedagogical objects is that the interiorized meanings be available at any time for the solution of nonroutine problems or for the justification of a particular solution. Unfortunately, many oversimplified interpretations of the use of pedagogical objects assume the Instamatic Camera theory of learning or take a simpler view than is indicated by this study (e.g., Byrnes and Wasik, 1991, in their reduction of pedagogical objects to their described "simultaneous activation" view).

The second limitation of pedagogical objects is not in the learning theory employed in their use but rests in the pedagogical objects themselves. An inadequate understanding of the mathematical domain can lead to the design of pedagogical objects that have features that are irrelevant or even misleading with respect to the targeted mathematical domain. For example, colored chips are widely used in place-value activities. However, they present neither the quantitative features of the English words and multidigit marks (red does not show tenness, green does not show hundredness, etc.) nor the use of position to show multiunit quantities as in the written marks (one needs to use chips all of one color to do this; e.g., Bell, Fuson, & Lesh, 1976). Thus, these pedagogical objects do not help children learn multiunit meanings, and they support incorrect responses (Labinowicz, 1985). Many pedagogical objects are designed and marketed without any accompanying analysis of the mathematical features of the English words and written symbols that would indicate how the pedagogical objects can support the desired learning. Nor is there empirical research indicating what conceptual structures children need to have already in order to use particular pedagogical objects successfully; i.e., there is no sense of the zone of proximal
development for particular pedagogical objects (Vygotsky, 1962). Both of these kinds of analyses are needed.

An inadequate mathematical analysis can also result in measures of conceptual understanding that are not the mathematical prerequisites for learning a particular mathematical operation. For example, Byrnes and Wasik (1991) used measures of conceptual knowledge in the domain of fractions that were much too simple for the targeted operations: addition and multiplication of fractions. Only the conceptual measure of order was even close to the difficulty of these operations; understanding order of fractions (i.e., is 1/3 > 1/5?) can inhibit the almost universal incorrect addition procedure used by Byrnes and Wasik's subjects (add the numerators and add the denominators). However, there was no measure of understanding of equivalence classes of fractions (e.g., 1/2 = 2/4 = 3/6 = 4/8 = ...) or of the ability to change one fraction into a targeted related member of the equivalence class. These are the conceptual prerequisites for addition of fractions. One can only add fractions that have the same fractional unit (e.g., eighths of a whole unit), just as one can only add like multiunits in multidigit addition. Therefore, one must change fractions that have different fractional units into fractions with the same fractional unit in order to add them.
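A short sketch (ours, not Byrnes and Wasik's materials; the function and example fractions are illustrative assumptions) makes this prerequisite concrete:

```python
from math import gcd

def add_fractions(n1, d1, n2, d2):
    """Rename both fractions with a common fractional unit, then add
    the like units -- just as like multiunits are added with blocks."""
    common = d1 * d2 // gcd(d1, d2)               # shared fractional unit (LCD)
    m1, m2 = n1 * (common // d1), n2 * (common // d2)
    print(f"{n1}/{d1} + {n2}/{d2} = {m1}/{common} + {m2}/{common} "
          f"= {m1 + m2}/{common}")
    return m1 + m2, common

add_fractions(1, 4, 1, 6)   # 1/4 + 1/6 = 3/12 + 2/12 = 5/12
# The near-universal error noted above would instead give (1+1)/(4+6) = 2/10.
```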
Byrnes and Wasik's use of pedagogical objects to support correct fraction addition also seemed (from the brief description available) to be limited by an inadequate linking of actions on the pedagogical objects (plastic wedges in which different fractional units were a different color) to operations on the written fraction symbols (using a least common denominator procedure).

In this paper we did not focus on the social aspects of learning. Many of our group summaries did convey some of these social results, however, because these aspects are inherently inseparable from the cognitive aspects whenever learning occurs in a social setting. Ways in which the personalities, initial knowledge, and knowledge-under-construction of individual children interacted with those of other group members are described in Burghardt (1992). A recent theoretical discussion of social/cultural aspects of the Meaning Maker theory of children's learning is given in Cobb, Yackel, and Wood (1992). That paper articulates these social/cultural aspects of a Meaning Maker theory of learning very sensitively and well. But, in our view, the paper confounds a "representational view of the mind" with the theory of learning used by the person espousing such a view of the mind, especially when the research used pedagogical objects. The paper therefore mislabels the target of its analyses and loses effectiveness as a result. The real targets of the arguments mounted against the "representational view of the mind" are the Monkey See-Monkey Do, Computer, and Instamatic Camera theories of learning.
The paper reads quite well if one simply substitutes these targets whenever the "representational view of the mind" is mentioned. A major problem with the mislabelling is that some representational view of the mind, i.e., some view of children's conceptual structures, is required by a constructivist Meaning Maker theory of learning. Such a theory requires, more than any other, considerable understanding of children's conceptual structures in a given mathematical domain. For researchers who have experience in watching children's learning and are keenly aware of the power of the conceptual structures possessed by a given child to affect what that child sees and hears in a given learning situation, it is absurd to imagine that most children could learn a complex mathematical procedure or system of concepts in one session of use of pedagogical objects, no matter how powerful (the Instamatic Camera view), or by simple imitation (the Monkey See-Monkey Do view), or by following rules (the Computer view). The discussion of these theories in Cobb, Yackel, and Wood (1992) is insightful. However, it would have been very helpful if these theories of learning had not been lumped together and mislabelled the "representational view of the mind."

Ohlsson and Rees (1991) raise the question of whether one can learn "why" and "how" at the same time. Their analysis of the function of conceptual understanding in the learning of arithmetic procedures emphasizes using knowledge about "why" to correct errors that arise in an already learned "how" procedure. In their analysis, conceptual understanding can support the self-correction of errors by constraining problem states so that errors can be detected and corrected. Our study provided support for this position. When blocks were linked to marks, the multiunits in the blocks led children to see the errors in their marks procedures. However, our study also indicates that the "why" can precede and direct the "how": conceptual understanding has an equally important role in directing the construction of initially correct problem states (i.e., here, correct multiunit addition). Our results also emphasize the critical role of the teacher in the classroom in ensuring that the pedagogical objects are linked to the written marks (i.e., that the available conceptual understanding is related to the marks) and in directing children's attention to critical features of these objects to facilitate their use.

Conceptual understanding can enable children to construct correct arithmetic procedures and to find and eliminate errors in incorrect procedures. However, children must understand and accept this conceptual approach to mathematics learning in order to carry it out, and they may need to be helped along the way in seeing and using critical quantitative aspects of the domain. Likewise, researchers and teachers using pedagogical objects need to use a theory of learning that is
consistent with the focus of pedagogical objects on meaning (i.e., they need to use a Meaning Maker and Rule Deriver theory of children's learning).

ACKNOWLEDGEMENTS

This research is supported by the National Center for Research in Mathematical Sciences Education, the Wisconsin Center for Education Research, University of Wisconsin-Madison. The Center is funded primarily by the Office of Educational Research and Improvement, U.S. Department of Education (OERI/ED). The opinions expressed in this publication do not necessarily reflect the position or policy of OERI/ED. No endorsement by the Department of Education or the Office of Educational Research and Improvement should be inferred.

REFERENCES

Bell, J. & Burns, J. (1981). Counting and numeration capabilities of primary school children: A preliminary report. In T.R. Post & M.P. Roberts (Eds.), Proceedings of the Third Annual Meeting of the North American Chapter of the International Group for the Psychology of Mathematics Education (pp. 17-23). Minneapolis: University of Minnesota.

Bell, M.S., Fuson, K.C. & Lesh, R.A. (1976). Algebraic and arithmetic structures: A concrete approach for elementary school teachers. New York: Free Press.

Byrnes, J.P. & Wasik, B.A. (1991). Role of conceptual knowledge in mathematical procedural learning. Developmental Psychology, 27, 777-786.

Cobb, P., Yackel, E. & Wood, T. (1992). A constructivist alternative to the representational view of mind in mathematics education. Journal for Research in Mathematics Education, 23, 2-33.

Cohen, E.G. (1986). Designing groupwork: Strategies for the heterogeneous classroom. New York: Teachers College Press.

Dienes, Z.P. (1960). Building up mathematics (4th ed.). London: Hutchinson Educational, Ltd.

Fuson, K.C. (1986). Roles of representation and verbalization in the teaching of multidigit addition and subtraction. European Journal of Psychology of Education, 1, 35-56.

Fuson, K.C. (1988). Children's counting and concepts of number. New York: Springer-Verlag.

Fuson, K.C. (1990a). Conceptual structures for multiunit numbers: Implications for learning and teaching multidigit addition, subtraction, and place value. Cognition and Instruction, 7, 343-403.
Fuson, K.C. (1990b). Issues in place-value and multidigit addition and subtraction learning and teaching. Journal for Research in Mathematics Education, 21, 273-280.

Fuson, K.C. (in press-a). Research on learning and teaching addition and subtraction of whole numbers. In G. Leinhardt, R.T. Putnam & R.A. Hattrup (Eds.), The analysis of arithmetic for mathematics teaching. Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.

Fuson, K.C. (in press-b). Research on whole number addition and subtraction. In D. Grouws (Ed.), Handbook of research on mathematics teaching and learning. New York: Macmillan.

Fuson, K.C. & Briars, D.J. (1990). Using a base-ten blocks learning/teaching approach for first- and second-grade place-value and multidigit addition and subtraction. Journal for Research in Mathematics Education, 21, 180-206.

Fuson, K.C. & Burghardt, B.H. (1991). Multidigit subtraction procedures invented by small groups of second graders. [Unpublished data.]

Fuson, K.C., Burghardt, B.H. & Fraivillig, J.L. (1992). Multidigit addition procedures invented by small groups of second graders. Manuscript in preparation.

Fuson, K.C. & Fraivillig, J.L. (in press). Supporting children's ten-structured thinking in the classroom. In G. Bell (Ed.), Asian perspectives on mathematics education. Melbourne: Australian Council for Educational Research.

Fuson, K.C. & Kwon, Y. (1991). Chinese-based regular and European irregular systems of number words: The disadvantages for English-speaking children. In K. Durkin & B. Shire (Eds.), Language and mathematical education (pp. 211-226). Milton Keynes, GB: Open University Press.

Fuson, K.C. & Kwon, Y. (in press-a). Korean children's single-digit addition and subtraction: Numbers structured by ten. Journal for Research in Mathematics Education.

Fuson, K.C. & Kwon, Y. (in press-b). Korean children's understanding of multidigit addition and subtraction. Child Development.

Fuson, K.C., Stigler, J.W. & Bartsch, K. (1988). Grade placement of addition and subtraction topics in China, Japan, the Soviet Union, Taiwan, and the United States. Journal for Research in Mathematics Education, 19, 449-458.

Ginsburg, H. (1977). Children's arithmetic: How they learn it and how you teach it. Austin, TX: Pro-Ed.

Grouws, D. (Ed.) (in press). Handbook of research on mathematics teaching and learning. New York: Macmillan.
Hart, K.M. (1987). Practical work and formalization, too great a gap. In J.C. Bergeron, N. Herscovics & C. Kieran (Eds.), Proceedings of the 11th International Conference for the Psychology of Mathematics Education (Vol. 2, pp. 408-415). Montreal.

Johnson, D.W. & Johnson, R.T. (1989). Cooperative learning in mathematics education. In P.R. Trafton & A.P. Shulte (Eds.), New directions for elementary school mathematics (pp. 234-245). Reston, VA: The National Council of Teachers of Mathematics.

Labinowicz, E. (1985). Learning from children: New beginnings for teaching numerical thinking. Menlo Park, CA: Addison-Wesley Publishing Company.

Menninger, K. (1969). Number words and number symbols: A cultural history of numbers (P. Broneer, Trans.). Cambridge, MA: MIT Press. (Original work published 1958.)

Miura, I.T. (1987). Mathematics achievement as a function of language. Journal of Educational Psychology, 79, 79-82.

Miura, I., Kim, C.C., Chang, C. & Okamoto, Y. (1988). Effects of language characteristics on children's cognitive representation of number: Cross-national comparisons. Child Development, 59, 1445-1450.

Miura, I. & Okamoto, Y. (1989). Comparisons of American and Japanese first graders' cognitive representation of number and understanding of place value. Journal of Educational Psychology, 82, 109-113.

Ohlsson, S. & Rees, E. (1991). The function of conceptual understanding in the learning of arithmetic procedures. Cognition and Instruction, 8, 103-179.

Resnick, L.B. & Omanson, S.F. (1987). Learning to understand arithmetic. In R. Glaser (Ed.), Advances in instructional psychology (Vol. 3, pp. 41-95). Hillsdale, NJ: Erlbaum.

Siegler, R.S. (1991). Children's thinking. Englewood Cliffs, NJ: Prentice-Hall, Inc.

Stigler, J.W. (1988). The use of verbal explanation in Japanese and American classrooms. Arithmetic Teacher, 36(2), 27-29.

Vygotsky, L.S. (1986). Thought and language (A. Kozulin, Trans.). Cambridge, MA: The MIT Press. (Original work published 1934.)

Wallace, J. (1990). Learning multidigit addition and subtraction in small cooperative groups: Errors that arise and the conceptual structures they violate. Honors thesis, Northwestern University. Unpublished manuscript.

Wearne, D. & Hiebert, J. (1989). Cognitive changes during conceptually based instruction on decimal fractions. Journal of Educational Psychology, 81, 507-513.
Working Groups of the Commission on Standards for School Mathematics of the National Council of Teachers of Mathematics (1989). Curriculum and evaluation standards for school mathematics. Reston, VA: National Council of Teachers of Mathematics.
Working Groups of the Commission on Teaching Standards for School Mathematics of the National Council of Teachers of Mathematics (1991). Professional standards for teaching mathematics. Reston, VA: National Council of Teachers of Mathematics.
Chapter 3

UNDERSTANDING ELEMENTARY MATHEMATICS
Jeffrey Bisanz
University of Alberta

Jo-Anne LeFevre
Carleton University
Summary

Understanding often is defined inconsistently, ambiguously, or narrowly, and consequently the relation between understanding and cognitive processing on mathematical tasks is not very clear. Many forms of behavior are related to understanding, and a framework is needed to describe these various forms in an integrated way. A "contextual space" is proposed for classifying different types of performance related to understanding elementary mathematical concepts. The two dimensions of this space are the type of activity involved (applying, justifying, and evaluating solution procedures) and the degree of generality with which these activities are exercised. These two aspects can be used to construct a "profile" that reflects the contexts in which an individual shows various forms of understanding. Acquisition of understanding can be described in terms of the sequences of profiles that emerge, and these sequences have implications for characterizing the mechanisms underlying changes in knowledge.

Introduction

Over the past 25 years, cognitive, developmental, and instructional psychologists have made a great deal of progress in determining how children and adults compute answers to simple arithmetic problems. Research focused on the processes that underlie arithmetic computation has been important for generating insights about remembering, attention, problem solving, and development, and these insights have led to hypotheses and conclusions that extend far beyond the domain of arithmetic (e.g., Ashcraft, 1987; Campbell & Clark, 1989; Kail, 1988;
Siegler, 1986; Siegler & Jenkins, 1989). This research also has considerable promise for improving instructional practice by increasing the sensitivity of assessment and by contributing to the development of better teaching methods (e.g., Grouws, in press; Leinhardt, Putnam & Hattrup, in press; Romberg & Carpenter, 1986).

Despite the success of research on identifying computational processes, we still lack an integrated and detailed account of the relation between computational processes and the common, if ambiguous, notion of understanding. Certainly the degree or level of understanding must be reflected in the solution processes used by children and adults. What remains unclear, however, are a number of questions about the precise relation between understanding and solution processes. How, for example, should understanding be inferred from behaviors, how might it be represented internally, how is it involved in the construction, selection, or modification of solution procedures, and how might it be modified by the use of solution procedures? Questions such as these are of critical importance for psychologists who seek to provide a full account of remembering and problem solving, and of how these processes develop. Answers to these questions are also of interest to teachers, who seek to help students achieve levels of understanding that transcend the low levels of performance often required in classrooms (Curriculum and Evaluation Standards for School Mathematics, 1989). Details about the relation between understanding and solution procedures are also very relevant for evaluation specialists, who are beginning to recognize the shortcomings of current achievement tests (Kulm, 1990). This confluence of interests between research psychologists, educational practitioners, and evaluation specialists has begun to provide a basis for a mutually beneficial interaction, and the opportunity for enhancing this interaction should not be missed.

In our view, research on the relation between understanding and solution procedures has progressed slowly because understanding is often defined poorly or inconsistently. In this chapter we therefore focus on the preliminary and important issue of how understanding in elementary mathematics might be conceived and operationalized optimally for developmental, cognitive, and instructional purposes. We begin by describing general difficulties in defining what it means to understand simple arithmetic, and we note the kinds of problems that can arise when understanding is defined operationally in terms of performance on a single type of task. We suggest, instead, that different forms of understanding must be recognized. Next we describe a general, two-dimensional framework for classifying the contexts in which forms of understanding can be evaluated, and we highlight the advantages of focusing on relations among tasks as opposed to single tasks. Finally we explore some implications of this classification scheme for the
development of understanding. Our intent is to provide a multifaceted view of understanding that facilitates research on the relation between understanding and solution processes and that contributes to improved assessment and instructional practices.

Problems in defining understanding

What does it mean to understand simple mathematics? Teachers, parents, researchers, and sometimes even students emphasize the importance of understanding in learning mathematics. Despite this consensus, a clear and comprehensive definition of understanding remains elusive. Consider, for example, the case of students who are presented with two problems: 85 + 29 and 29 + 85. Suppose that students answer both problems correctly, and then they perform with continued success on several similar pairs of problems. Do these students understand simple multidigit addition? If we focus on correctness of the answers, as is often the case in conventional methods of educational assessment, then we might conclude that these students indeed understand addition of two-digit numbers. If, however, we focus on the way in which students solve the problems or respond to verbal questions about the problems, our judgment might be quite different. For example, if some students solve the problems correctly but fail to realize that the 8 and 2 represent 80 and 20, respectively, then we might conclude that these students have only memorized a solution algorithm and do not understand the role of place value in addition (Fuson, 1990). If the students are incapable of decomposing the problem into a simpler form (e.g., 85 + 29 = 85 + 30 - 1), then we might conclude that they do not completely grasp part-whole relations (Putnam, DeBettencourt & Leinhardt, 1990). If the same left-to-right procedure is used on every single problem, then we might be concerned that the students do not understand commutativity; that is, the students may fail to grasp that 85 + 29 and 29 + 85 must yield the same answer, and so calculation on the second problem is unnecessary (Baroody, Ginsburg & Waxman, 1983). Finally, if some students solve two-digit addition problems but fail to solve three-digit problems or word problems with two-digit values, then we probably would conclude that their understanding of multidigit addition is illusory or, at best, quite limited.

Clearly different forms of understanding can be identified, and this example highlights two problems in defining what it means to understand in simple mathematics. First, our judgment about whether a person understands simple addition depends on the elements entailed in our definition of simple addition.
Depending on our definition, these elements can include various concepts and principles (e.g., place value, part-whole relations, commutativity, inversion, associativity, compensation). Moreover, determining whether our hypothetical students understand simple addition depends not only on the products of their solution processes (i.e., the answers provided by the students), but also on the processes that underlie performance (e.g., fact retrieval, counting, decomposition, and use of heuristics). Thus the first problem consists in defining the specific domain of what is to be understood, and it is addressed by developing theories about processes and/or concepts involved in performing tasks related to that domain. The vast bulk of research on mathematical cognition is consistent with this agenda, including research ranging from the identification of memory retrieval processes (Campbell & Oliphant, this volume) to classification of the knowledge structures necessary for solving arithmetic word problems (Kintsch & Greeno, 1985).
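To make the contrast between correct answers and underlying procedures concrete, consider the following sketch in Python. It is offered purely as an editorial illustration, not as part of the original argument: the function names and the two strategies encoded are our own rendering of the procedures described above.

    # Two correct procedures for 85 + 29; identical answers mask the difference.

    def rote_column_add(a, b):
        # Memorized right-to-left column algorithm with carries.
        digits_a = [int(d) for d in str(a)][::-1]
        digits_b = [int(d) for d in str(b)][::-1]
        result, carry = [], 0
        for i in range(max(len(digits_a), len(digits_b))):
            column = ((digits_a[i] if i < len(digits_a) else 0)
                      + (digits_b[i] if i < len(digits_b) else 0) + carry)
            result.append(column % 10)
            carry = column // 10
        if carry:
            result.append(carry)
        return int("".join(str(d) for d in reversed(result)))

    def decomposition_add(a, b):
        # Part-whole strategy: 85 + 29 = 85 + 30 - 1 (round b up, then compensate).
        rounded = 10 * ((b + 9) // 10)
        return a + rounded - (rounded - b)

    print(rote_column_add(85, 29))    # 114
    print(decomposition_add(85, 29))  # 114

Because both functions return 114, an assessment that records only answers cannot distinguish the rote algorithm from the conceptually richer strategy; only the intermediate steps (carries in one case, rounding and compensation in the other) reveal which form of knowledge was used.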
Also implicated in the example is a second, somewhat more subtle problem, and one on which we focus in this chapter. After researchers or teachers have identified the domain of understanding, the next step typically is to select or develop a task to elicit relevant behavioral evidence on whether the student understands the domain. In the example above, our judgment about a student's understanding could well depend on whether we examine solution processes or pose questions about the properties of addition. Similarly, our conclusion may depend greatly on whether we present symbolic arithmetic problems or word problems to the student. Thus this second problem amounts to defining the context in which understanding is being assessed. For present purposes, context refers broadly to task demands and materials that are used to evaluate understanding; we do not use the term to refer only to certain environments (e.g., academic versus nonacademic situations). Effects of context have been amply demonstrated, but there has been little effort to systematically examine the role of context and to explore and integrate the cognitive and developmental implications of these effects. We now turn to this issue.

Contexts of understanding

As illustrated in the example we have described, an individual may show evidence for some forms of understanding but not others, depending on context. We assume that no single context is definitive for assessing understanding. Moreover, we assume that discrepancies as a function of context reflect potentially important differences in underlying processes and/or representations. A narrow definition or criterion would tend to make understanding an all-or-none
phenomenon (Greeno, 1983) and would mask relations among the varieties of understanding that may emerge during the course of acquisition. We propose that evidence for different forms of understanding can be organized, at least initially, in terms of a few orthogonal aspects or dimensions, and that understanding and its development can be characterized in terms of the contextual space defined by these dimensions. Such an organizational scheme should be useful for determining a profile of understanding that reflects the contexts in which an individual evinces performance related to understanding. These profiles should be useful for identifying common patterns that reflect underlying individual differences or sequences in acquisition, as well as for providing insights for instructional intervention. As an initial step toward identifying different forms of understanding in simple mathematics, we propose the classification scheme in Table 1. The scheme consists of two aspects, activity and generality, that jointly define a contextual space for assessing understanding.
Table 1. Contexts for Assessing Understanding

                                             Generality
Activity                      Narrow                                Broad

Application of procedures     Using an appropriate procedure        Using a similar procedure on
                              spontaneously on one particular       a variety of related tasks.
                              task.

Justification of procedures   Describing the principle or           Generating a similar
                              concept that makes a procedure        explanation on a variety of
                              appropriate for one particular        related tasks.
                              task.

Evaluation of procedures      Recognizing the validity of a         Making a similar judgment on
                              procedure on one particular task.     a variety of related tasks.
Activity
Tasks used to assess understanding vary considerably in terms of the demands placed on an individual. Children and adults respond to these demands with a wide variety of problem-solving activities. Despite their diversity, the activities required to perform many tasks can be grouped into three general classes: application of procedures to solve problems; explicit justification of procedures; and evaluation of procedures. To illustrate these activities, consider the problem of determining whether a person understands the principle of inversion as it applies to addition and subtraction (Starkey & Gelman, 1982). When presented with a problem of the form a + b - b, a person who understands inversion presumably knows that the answer to this problem must be equal to a and that successive addition and subtraction is unnecessary. The critical issue is how this understanding, or lack of understanding, can be assessed.

Application of procedures. One approach is to observe whether an individual uses solution procedures that are consistent with the principle of inversion. When presented with a problem such as 4 + 9 - 9, some children engage in a laborious, left-to-right solution procedure that involves sequential addition and subtraction (i.e., 4 + 9 = 13, and then 13 - 9 = 4), whereas others answer quickly without adding or subtracting (Bisanz & LeFevre, 1990). Children in the latter group appear to use a shortcut based on the principle of inversion. Application of procedures simply refers to the use of a solution procedure that reflects, or is at least consistent with, a concept or principle appropriate for that problem. Knowledge of appropriate concepts or principles often enables children to use alternative solution procedures that are easier, more efficient, or more accurate than the standard algorithms that typically are taught. For example, a child who solves 5 + 6 by transforming the problem to (5 + 5) + 1 may be using a procedure based on the concept of part-whole relations (Putnam et al., 1990). Whether the child obtains the right answer is largely irrelevant; the important point is that the child activates a procedure that is consistent with the underlying concept.

Use of an appropriate procedure may appear to be compelling evidence that a child understands the underlying concept or principle. This type of evidence alone often can be insufficient, however, for two reasons. First, children may use a conceptually appropriate procedure for reasons unrelated to the underlying concept. In the inversion problem, for example, a child may simply respond with the first number because that type of response had been rewarded on previous problems, or because the child has been trained in a rote fashion to respond in
this way on problems having the form of inversion problems. In this case, attributing understanding to the child would be overestimating the child's competence. Second, children may be fully capable of using a conceptually appropriate procedure but simply may not do so, perhaps because they think that adult observers expect a more prosaic algorithm. In this case, attributing a lack of understanding to the child would be underestimating the child's competence. Thus conclusions based only on the application of a conceptually appropriate procedure can be misleading, and researchers often seek other forms of evidence (such as justification and evaluation of procedures) to confirm or refute their assessments.

Justification of procedures. Another way to assess children's knowledge of inversion is to ask them to provide a rationale for the shortcut procedure on inversion problems. For example, when a child uses a shortcut to solve 4 + 9 - 9, or when a shortcut is presented to the child, can he or she explain why the answer must be 4 and why adding and subtracting are not necessary? An explanation in which the child focuses on the logical necessity of the answer might be taken as evidence that the child understands inversion. Tasks requiring justification can be used to assess whether an individual can explicitly explain or describe the principles, concepts, or rules that account for the appropriateness or validity of a particular procedure. If the principle or concept in question is featured in a reasonable way, the child may be credited with having "explicit knowledge" of the underlying concepts (Greeno & Riley, 1987), and the possibility that the procedure was applied in a rote fashion may be regarded as minimal or negligible. Piaget, of course, relied heavily on the justifications of children and adolescents in determining whether their reasoning was characteristic of preoperational, concrete operational, or formal operational thought (Miller, 1982; Piaget, 1970).

In research on mathematical cognition, justifications are often used to assess understanding. Putnam et al. (1990), for example, sought to determine whether elementary school children were able to justify the use of derived-fact procedures to solve certain addition and subtraction problems. A derived-fact procedure involves the transformation of a problem so that known number facts can be used to solve the problem. For instance, a child might solve 4 + 5 by transforming the problem to (4 + 4) + 1, thus making it possible to use his or her knowledge that 4 + 4 = 8 to solve the more difficult problem. In this study, children watched as puppets used derived-fact procedures to solve arithmetic problems. The children were asked, by one of the puppets, to explain why the derived-fact procedure was valid. Putnam et al. evaluated the explanations as a means of assessing whether children understand part-whole relations (Resnick, 1983).
Understanding part-whole relations, according to their analysis, consists in the knowledge that the combined value of the parts (the augend and addend) must equal the value of the whole (sum) and that transformations of a part must be compensated by transformations to another part or to the whole. Children were credited with understanding part-whole relations if, for example, they recognized that a change in the value of one of the addends or parts (i.e., changing the 5 to a 4 in the example above) must be compensated by a change in the sum or whole (i.e., changing the 8 to 9). Similarly, Greeno (1983) analyzed verbal protocols to assess the extent to which high school students used a part-whole schema when solving geometry problems. In each case, the justifications provided by students were crucial for determining students' understanding of the conceptual basis for problem-solving activities.

The conditions under which children and adults acquire explicit knowledge of problem-solving procedures, as indexed by their verbal justifications, are of considerable interest. For example, Schliemann and Acioly (1989) studied the performance and knowledge of bookies for a lottery in Brazil and found that justifications can vary as a function of schooling. Although all the bookies showed similarly high performance, only those with some formal schooling were able to satisfactorily explain why their solutions worked. In contrast, young children's justifications on conservation problems appear to improve as a function of age, not schooling (Bisanz, Dunn & Morrison, 1991). Clearly the relation between schooling and the quality of justification is variable, and the boundary conditions on this relation require clarification.

When individuals provide an adequate justification for a procedure, it often is reasonable to assume that they have explicit, accessible knowledge about the underlying principles. Two points of caution should be noted, however. First, failure to provide an adequate justification does not imply that the person lacks the knowledge in question; he or she simply may have difficulty verbalizing that knowledge (e.g., Brainerd, 1973). Second, being able to provide adequate explanations or rationales does not necessarily imply that the person can use the corresponding procedure spontaneously. The procedure simply may be too difficult to implement. For example, a student may be able to justify the use of a multiplication procedure to solve a particular problem but be incapable of executing that procedure mentally because of the memory load involved. Alternatively, a person may be able to justify a solution algorithm but may fail to recognize situations in which that algorithm might be applied. Thus the criterion of justification can be important, but it need not be the sine qua non for evaluating understanding.
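The procedures discussed in the preceding two subsections can also be written out explicitly. The sketch below is again an editorial illustration under the assumptions noted in its comments; the chapter itself describes these procedures only in prose, and the function names are invented.

    # Conceptually based procedures for the examples above (invented encoding).

    def inversion_shortcut(a, b):
        # For a + b - b, the inversion principle licenses answering a directly,
        # without performing any addition or subtraction.
        return a

    def laborious_left_to_right(a, b):
        # The sequential procedure some children use: 4 + 9 = 13, then 13 - 9 = 4.
        return (a + b) - b

    KNOWN_DOUBLES = {n: n + n for n in range(1, 10)}  # e.g., the fact 4 + 4 = 8

    def derived_fact_add(a, b):
        # Transform a + b onto a known double: 4 + 5 = (4 + 4) + 1.
        small, large = min(a, b), max(a, b)
        return KNOWN_DOUBLES[small] + (large - small)

    assert inversion_shortcut(4, 9) == laborious_left_to_right(4, 9) == 4
    assert derived_fact_add(4, 5) == 9

As the text emphasizes, observing that a child behaves like inversion_shortcut rather than laborious_left_to_right is suggestive but not decisive evidence of understanding, because the same output could be produced by a rote rule; hence the need for justification and evaluation tasks.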
Evaluation of procedures. To assess a child's understanding of inversion, a third approach is to describe or present a shortcut procedure based on inversion and to ask the child simply to judge, without performing or justifying the solution procedure, whether the procedure is valid. In the case of inversion, a child who observes a puppet solving the problem without successive addition and subtraction could be asked whether an answer really can be found without adding and subtracting. An argument could be made that children who do not understand inversion would judge the shortcut to be invalid, whereas a child who understands inversion would find the procedure to be perfectly acceptable. Evaluation of procedures refers to a person's decision about the applicability and correctness of a particular solution to a problem (Greeno, Riley & Gelman, 1984). It reflects a potentially different type of knowledge than application of procedures, in much the same way that one's judgments of grammatical correctness are different than the production of grammatical speech.

Judgment tasks have been used to assess understanding in mathematical cognition, although less frequently than application and justification tasks. In a study of counting, for example, Gelman and Meck (1983) discovered that children could detect that a puppet had counted incorrectly even though they could not count accurately themselves (but see Briars & Siegler, 1984, and Wynn, 1990). In a study of estimation, Sowder and Wheeler (1989) found that more children could choose a best-case solution from among alternatives than could apply their knowledge to open-response problems.

Appropriate evaluations about the validity of solution procedures reflect knowledge about the conditions and/or constraints under which a concept applies. In contrast with application of procedures, evaluation does not require execution of the solution procedure and so processing demands presumably are minimized. In contrast with justification, evaluation does not require explicit verbalization about a principle and so assessment is not confounded with verbal skill. Moreover, evaluation tasks are relatively simple to perform because of the minimal demand placed on the subject. The simplicity of the task does not necessarily beget simplicity of interpretation, however, because the knowledge used to make the evaluation may be difficult to identify. If, for example, a child indicates that a puppet's count is inaccurate, it may be difficult to discern whether this judgment reflects sensitivity to a counting principle (e.g., double counting is invalid) or whether the child decided on the basis of some irrelevant feature of the observed procedure (e.g., a dislike of the object on which the count ended). This difficulty can be minimized by combining evaluation and justification activities, so that a child must justify his or her evaluation, but again the verbal demands of justification may lead to an
underestimation of the child's competence. Moreover, an evaluation situation may be so ambiguous or difficult that a child could fail to decide correctly, even if the child is fully capable of justification, application, and evaluation in a simpler situation.

Generality

Each of the activities in Table 1 can be used independently to assess understanding, but each has shortcomings. One way of minimizing this difficulty is to combine two or more activities, but another is to examine the generality of performance by varying the degree of similarity among the problems used. In the case of assessing knowledge of inversion, for example, a researcher could test a child's understanding of inversion by studying performance on a variety of problems in which an inversion-based shortcut could be used. Problems could vary in form (a + b - b, b + a - b, or b + c + a - c - b), size (small or large numbers), or format (symbolic equations or word problems). Children who solve a broader range of problems using shortcut procedures have, arguably, a better understanding of inversion.

The importance of generality for defining the contextual space in Table 1 for assessments of understanding can be illustrated with two examples. Carraher, Schliemann, and Carraher (1988) examined the use of ratio conversion processes in Brazilian foremen who were experienced in converting measurements from the scale of blueprints to the scale of the construction projects on which they worked. Carraher et al. found that all of the foremen were successful when they used blueprints drawn in familiar scales, but only 34% were successful when an unfamiliar scale was used. The fact that many of the men could not adapt the ratio-conversion process to unfamiliar scales was taken to imply that their understanding of ratio conversions was strikingly limited.

Whereas Carraher et al. (1988) focused on individual differences, Lawler (1981) examined changes in the generality of knowledge with development. Lawler carefully observed how his 6-year-old daughter's knowledge about arithmetic changed over several months. Initially his daughter's knowledge was highly particularistic. For example, her use of solution procedures for numerically identical problems depended on the specific domain or microworld of the problems: An addition problem presented verbally elicited a combination of derived-fact and counting procedures; a problem involving money elicited addition in coin-based units; and a problem in vertical, paper-and-pencil form elicited the usual right-to-left algorithm with carries. These three procedures, and the domains in which they were applied, were viewed by her as being entirely independent. The
generality of each procedure continued to be tightly constrained until, according to Lawler, these separate microworlds became integrated into a structure that enabled the use of any of these specific procedures in any of the formerly separate domains. In the case of the 6-year-old, as with the Brazilian foremen, greater generality was interpreted to imply a greater understanding of the underlying mathematical principles. Indeed, generality in problem solving is often taken to be the primary characteristic of true understanding. Wertheimer, for example, distinguished between rote learning and meaningful learning (see Greeno, 1983). According to this view, rote learning results in adequate performance only in situations identical to, or very similar to, the original context of learning. Meaningful learning, in contrast, involves the acquisition of a general principle that enables successful performance in a variety of situations where the principle is relevant. The flexible and general understanding that results from meaningful learning is characterized by demonstrations of transfer from original learning situations to new and different problems. The importance of generality is that successful performance is unlikely to be due to rote learning or to spurious factors if it can be observed on a variety of new or different problems. As Greeno et al. (1984) noted, "the case for understanding of principles is strongest if a child is required to generate a new procedure or a modification of a known procedure, and the procedure that is formed is consistent with the principles" (p. 105). Generality often is minimal in learning and development. As children develop, for example, they often fail to use mnemonic strategies even though they are fully capable of executing the strategies and have used them successfully in other situations (e.g., Brown, Bransford, Ferrara & Campione, 1983). In adults, knowledge obtained in one problem-solving situation often does not transfer to a second, analogous situation (e.g., Gick & Holyoak, 1980). Hatano (1988) found it useful to distinguish between two types of experts: Adaptive experts use their knowledge flexibly, whereas routine experts are proficient only in the specific domain in which they have perfected their performance. At a minimum, the implication of these findings is that thinking and problem-solving skills can be heavily influenced by the nature and organization of specific knowledge. Bransford, Vye, Kinzer, and Risko (1990) refer to the commonly observed lack of generality as one aspect of the "problem of inert knowledge" and clearly illustrate the pervasive importance of this problem for the study of cognition, development, and instruction. Some researchers (e.g., Greeno, 1983) have tried to identify the properties of knowledge structures that enable generality, as opposed to those that enable only very restricted application. Other researchers have focused on factors
that seem to decontextualize learning and enable increased generality (e.g., Bransford et al., 1990; Lehman, Lempert & Nisbett, 1988; Perfetto, Bransford & Franks, 1983), and these investigations should provide explicit clues to the question of how knowledge structures influence generality. We do not attempt here to solve the problem of inert knowledge, but we want to underscore the importance of generality in assessing understanding.

Profiles. The principal contribution of the scheme in Table 1 is that it helps to outline the kinds of research that need to be conducted to fully evaluate relations among different assessments of "understanding". No single definition or operationalization of understanding is sufficient for capturing the range of contexts in which understanding, in some form, may be demonstrated. Instead, we conceive of understanding in terms of a profile in the contextual space defined by activity and generality. Thus the issue for assessment of individuals is not one of deciding whether an individual has or does not have understanding, but rather one of determining the pattern of successes and failures in the contextual space. Similarly, the issue for studying acquisition is not how one group performs on a particular task as opposed to another group, but rather how understanding, or evidence for some form of understanding, "spreads" in this contextual space during the course of development or instruction. In the next section we speculate on some possible developmental sequences and the implications of each for mechanisms of knowledge acquisition.

Before proceeding to implications for development, we should note several qualifications that pertain to the contextual space represented in Table 1. First, the activities described in this organizational scheme are illustrative but not necessarily exhaustive. Many researchers have used one or more of the three types of activity to assess understanding (e.g., Greeno et al., 1984), and no doubt other activities (or variations on the three activities) could be used. Second, our characterization of generality is incomplete. A full account of this dimension would require a psychological analysis of the gradients of similarity among problems, a description that necessarily will vary with the particular types of problem solving being studied. Moreover, generality of performance can be assessed in a number of ways (see Greeno & Riley, 1987, for an illustrative list). Third, we have focused on aspects of task contexts that should apply across many domains, rather than on the knowledge structures that might be required for performance in these contexts. We highlight these three limitations because each represents an issue that must be addressed in research on understanding in a specific domain. That is, the scheme in Table 1 serves as a broad framework for investigating understanding in any domain, but the agenda for research in a particular domain consists largely of identifying and studying (a) the most
important activities relevant to that domain, (b) the dimension(s) of generality for problems in that domain, and (c) the knowledge structures required for performance.

In summary, we propose that the study of understanding is best conceived in terms of a space of contexts defined, at least initially, by activity and generality. One implication of this approach is that researchers need to focus not on single tasks but rather on relations among tasks. Given this orientation, the empirical goal becomes one of describing how profiles of understanding vary as a function of domain, development, and individual differences.

Development of understanding

When research is designed to assess profiles of understanding in a contextual space, then development can be represented as a sequence of profiles or a spread of understanding from an initial point to other parts of the space. This formulation leads to a number of pertinent questions. For example, are application, justification, and evaluation tasks always mastered in the same order? Is some generalization (horizontal spread) typically observed before a new activity is mastered (vertical spread)? How general are changes in profiles across different domains of problems? Answers to these questions should provide valuable insights into characteristics of the knowledge acquisition process that underlies cognitive development (see Klahr, 1976). In this section we briefly describe two plausible, partial sequences to illustrate how different sequences implicate different developmental mechanisms.
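One concrete way to picture a profile and its development, offered here as an editorial illustration rather than a scheme proposed in the chapter, is as a mapping from the cells of Table 1 to the evidence observed for a given child at a given time.

    # A profile as a mapping from (activity, generality) cells to evidence.
    # The encoding and the hypothetical child are invented for illustration.

    ACTIVITIES = ("application", "justification", "evaluation")
    GENERALITY = ("narrow", "broad")

    # Hypothetical child: narrow application and narrow evaluation of the
    # inversion shortcut, but no justification and no transfer to related tasks.
    profile = {
        ("application", "narrow"): True,    ("application", "broad"): False,
        ("justification", "narrow"): False, ("justification", "broad"): False,
        ("evaluation", "narrow"): True,     ("evaluation", "broad"): False,
    }

    for activity in ACTIVITIES:
        cells = "  ".join(f"{g}: {'+' if profile[(activity, g)] else '-'}"
                          for g in GENERALITY)
        print(f"{activity:<13} {cells}")

Development is then a sequence of such profiles, that is, evidence "spreading" across cells over time, rather than a single pass-or-fail judgment.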
Evaluation before application

One plausible sequential relation is that evaluation precedes application. Suppose, for example, that children are able to recognize that a shortcut on inversion problems is legitimate before they generally use the shortcut spontaneously. In such a case, evaluative judgment precedes spontaneous application in the course of acquisition. This sequence seems fairly prevalent in a variety of domains. Children often can judge the grammatical appropriateness of a sentence well before they spontaneously incorporate the underlying grammatical rule in their own speech, much as researchers and students often can evaluate the validity of using certain statistical procedures without being able to implement those procedures. Similarly, children and adults typically are better at recognizing features of a "good" experiment than they are at designing experiments (Bullock, 1991; Sodian, Zaitchik & Carey, 1991). In mathematical cognition,
Gelman and Gallistel (1978) have proposed that knowledge about the principles of counting precedes and guides acquisition of counting procedures, and Byrnes and Wasik (1991) have argued that children acquire conceptual knowledge about fractions before they add fractions accurately. The common theme is that knowledge about the underlying principles or concepts is acquired in some form before the corresponding procedures are used. An evaluation-before-application sequence would be expected if knowledge is represented initially in terms of general, relatively abstract principles, which in turn can be used to guide or constrain the construction of more specific procedures (Wynn, 1990). The assumption is that possession of abstract principles is sufficient for evaluating the validity of an observed procedure, even if the child has not yet used that procedure. If this hypothesis about the genesis of knowledge structures is correct, then two critical questions about development need to be addressed. First, how do abstract knowledge structures influence the construction or selection, and subsequently the implementation, of task-specific procedures? Second, what accounts for the lag between the acquisition of abstract principles and the use of procedures reflecting those principles?¹

¹ A third and more basic question concerns the origins of abstract principles. This question is of central importance to any developmental theory about understanding, but it is not particularly tractable, theoretically and methodologically, and hence it is rarely addressed thoroughly in research or theory. Basic concepts or capacities are often described as natural or innate. This proposal has practical value because it functionally defines the state of the cognitive system at the point of development where analysis begins. Ultimately, however, simple proposals about innateness have little explanatory value from a developmental-systems view (e.g., Gottlieb, 1991).

A survey of proposed answers to the first two questions is beyond the scope of this chapter, but a few examples serve to highlight the issues. Bisanz and LeFevre (1990) speculated about how knowledge about the principle of inversion might be adapted to enable children and adults to use shortcuts on problems of the form a + b - b. They proposed that conceptual and procedural knowledge about mental arithmetic could be represented in terms of productions, which are condition-action statements that can vary greatly in complexity and specificity (cf. Neches, Langley & Klahr, 1987; Newell & Simon, 1972). A production system can contain not only abstract, conceptual productions and task-specific procedural productions, but also knowledge-acquisition productions that, under specific conditions, can modify old productions or create new productions (Kail & Bisanz, 1982; Klahr, 1984; Neches et al., 1987). Bisanz and LeFevre proposed that conceptual productions combine with these knowledge-acquisition productions to create task-specific procedures that enable children to use shortcuts on certain kinds of arithmetic problems. Similar theoretical issues have been addressed with comprehensive models by a number of researchers (e.g., Anderson, 1983; Greeno, 1983; Ohlsson, 1987; Ohlsson & Rees, 1991), and in each case the focus is on the developmental mechanisms that account for the acquisition of task-specific representations of knowledge (procedures) from preceding, abstract representations of knowledge (principles and concepts).

The lag between knowledge of principles and use of procedures raises not only issues about representational changes but also about processing factors that are associated with the task or the problem solver and that may interfere with the expression of abstract knowledge (e.g., Flavell & Wohlwill, 1969; Pascual-Leone, 1970). For example, spontaneous application of a procedure may require more resources (e.g., short-term memory capacity) or processing speed than an individual can manage, whereas the evaluation process may be less demanding. Thus hypotheses about developmental changes in processing efficiency or speed (e.g., Case, 1985; Kail, 1991) also may be pertinent for explaining the transition from the availability of abstract principles to the spontaneous use of procedures based on those principles.
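The production-system idea can be made concrete with a toy example. The sketch below is emphatically not the Bisanz and LeFevre model; it is an editorial rendering of the condition-action formalism, with an invented problem encoding, showing how a conceptual production can pre-empt a default computational production.

    # A toy production system; a problem is encoded as the tuple (a, "+", b, "-", c).

    def matches_inversion(problem):
        # Condition: the problem has the form a + b - b.
        a, op1, b, op2, c = problem
        return op1 == "+" and op2 == "-" and b == c

    PRODUCTIONS = [
        # (condition, action); the first production whose condition matches fires.
        (matches_inversion, lambda p: p[0]),               # shortcut: the answer is a
        (lambda p: True, lambda p: (p[0] + p[2]) - p[4]),  # default computation
    ]

    def solve(problem):
        for condition, action in PRODUCTIONS:
            if condition(problem):
                return action(problem)

    print(solve((4, "+", 9, "-", 9)))  # 4, via the inversion production
    print(solve((4, "+", 9, "-", 3)))  # 10, via the default production

In the same spirit, a knowledge-acquisition production would be a further rule that inspects traces of solving and, under specific conditions, adds new task-specific entries to PRODUCTIONS.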
Application before evaluation

A second plausible sequence is the opposite of the first. Suppose, for example, that children are able to use shortcuts spontaneously before they are able to evaluate the validity of the procedure. This application-before-evaluation sequence might imply that an individual can activate a procedure initially without having knowledge of the underlying principles or concepts that make the procedure valid. This particular profile may not be flattering, but it appears to capture the capabilities of many school children who are capable of executing arithmetic procedures but have little sense of where those procedures are appropriate (e.g., Carpenter & Lindquist, 1989). Assuming that many children progress beyond this profile, it is possible that, after repeated applications, individuals may construct or infer the principles that underlie these procedures, thus enabling performance on evaluation and justification tasks. Such a sequence, ideal or not, also seems to characterize the way in which some adults learn to program computers or to analyze data statistically. Application of procedures without an appreciation for why the procedures are relevant is often referred to as rote or mechanical performance (e.g., Baroody & Ginsburg, 1986), and it is often viewed as the nemesis of true understanding. To the extent that this sequence represents the use of procedures before acquisition of principles, however, it may be important for acquiring some kinds of knowledge.
For example, several researchers (Briars & Siegler, 1984; Fuson, 1988; Fuson & Hall, 1983; Wynn, 1990) have proposed that children learn to count, at least with small numbers, before they learn at least some of the underlying principles of counting. Upon examining relations between preschoolers' use of counting procedures and their understanding of cardinality, Wynn (1990) found that children younger than approximately 3.5 years of age showed little evidence of comprehending cardinality, even for numerosities within their counting range. Older children, in contrast, often showed a generalized understanding of cardinality. Wynn concluded that counting is, initially, a relatively meaningless, recitation-like activity, and that children later infer principles such as cardinality on the basis of this activity. A similar scenario seems plausible for forms of arithmetic. If a child is able to execute, however mechanically, the appropriate procedures for adding 2 + 3 and 3 + 2, then he or she will have the opportunity to observe relations among answers and thus to infer underlying principles (e.g., commutativity). The concern, of course, is that children might learn procedures rotely but never gain the insights that are afforded (Baroody & Ginsburg, 1986).

An application-before-evaluation sequence would be expected if knowledge is represented first in terms of task-specific procedures, which subsequently enable inferences about more general and abstract principles. Observation of this sequence leads directly to two questions. First, how do task-specific procedures influence the process by which principles are inferred? Second, what accounts for the lag between use of procedures and acquisition of the corresponding principles?

One possible approach toward answering these two questions involves hypotheses about resource limitations and automatization. A common set of assumptions in developmental and cognitive research is that (a) the amount of mental resources that can be allocated to processing at any one time is limited, (b) execution of procedures requires resources, and (c) procedures can become increasingly resource efficient with repeated practice, even to the extent that they become automatized or completely independent of resources (see Bjorklund & Harnishfeger, 1990; Case, 1985; Kail & Bisanz, in press). It can be argued that individuals must construct or infer the concepts necessary for evaluation and justification, and that the process of construction or inference is extremely resource dependent. That is, an individual cannot construct or infer the concept necessary for evaluation or justification unless sufficient resources are available. By repeatedly applying the procedure on appropriate problems, the procedure may become increasingly efficient, that is, less dependent on resources. Greater procedural efficiency may thus liberate resources that can be allocated to the process of constructing or inferring the underlying concepts. Consequently, repeated applications and the resulting increase in efficiency could
be a necessary precursor for acquisition of the concept or principle that is, in turn, necessary for performance on evaluation or justification tasks. (For similar arguments in the domain of elementary arithmetic, see Kaye, 1986; Kaye, deWinstanley, Chen & Bonnefil, 1989; Kaye, Post, Hall & Dineen, 1986.)

This sort of resource-efficiency hypothesis has several appealing characteristics for explaining developmental phenomena, but many problems with this approach have been encountered in developmental research (Howe & Rabinowitz, 1990). One problem, in the present context, is that the hypothesis includes a description about necessary, but not sufficient, conditions for inference to occur. Specifically, the increasing efficiency of procedures is not sufficient, theoretically, for inferences of a concept or principle; some unspecified process would be required. One possibility is that productions exist that monitor and modify procedures to make them more efficient and more general (Kail & Bisanz, in press). A thorough description of methods by which this type of self-modification could be implemented is beyond the scope of this chapter (see Anderson, 1983; Klahr, Langley & Neches, 1987). The point is that a lag between application and evaluation highlights the need to understand the processes by which inferences about concepts and principles occur.

Other developmental sequences

The number of ways in which evidence for different forms of understanding might "spread" through the contextual space in Table 1 certainly is not limited to the two cases we have described. For example, the underlying relation between evaluation and application could be reciprocal in that primitive concepts lead to acquisition of simple procedures, use of these procedures could lead to acquisition of more complex or differentiated concepts, which in turn could guide the development of more advanced procedures, and so on (Case, 1985; Fuson, 1988, Chapter 10). This type of account is particularly appealing because it implies that mechanisms for constructing procedures and for inferring concepts are interactive and equally important in understanding cognitive development. Methodologically, repeated observations over time with a variety of evaluation and application tasks might be the most appropriate way to identify and assess reciprocal interactions of this type (cf. Siegler & Crowley, 1991).

To this point we have not considered how performance on justification tasks might fit into various sequences, but several possibilities seem plausible. If justification requires verbal skills that lag behind other means of performing (e.g., Brainerd, 1973; Miller, 1982), then performance on justification might occur last in acquisition. For some kinds of expertise, however, a different sequence might
be expected. An excellent figure skating coach, for example, may be quite capable of explaining the principles of performing a routine without being able to execute the routine, and a researcher who has described how an idiot savant performs calendrical calculations may not be able to perform those calculations. Thus the relation between justification and application need not be unidirectional.

To summarize, different sequences of development in the contextual space lead to different questions about the underlying nature of the knowledge-acquisition process. The evaluation-before-application sequence requires hypotheses about how a cognitive system is able to use general, abstract principles to generate specific procedures; the application-before-evaluation sequence requires hypotheses about how the system might progress from use of specific procedures to inferences about general principles; and more complex sequences may require hypotheses about reciprocal interactions among procedures and principles.
Conclusions
As we pondered what our contribution to this volume might be, we considered describing our work on memory and problem-solving processes in mental arithmetic (e.g., Bisanz & LeFevre, 1990; LeFevre, Kulak & Bisanz, 1991). We were struck, however, by the enormous gulf between two solitudes in the study of mathematical cognition. One solitude consists of cognitive and developmental psychologists who conduct experimental work on the ways in which children and adults solve arithmetic problems. For this group, the focus is on careful identification and measurement of the processes and representations that underlie performance, and the goal is to generate a model of cognition and its development that accounts for observed performance on carefully defined tasks. The other solitude consists of teachers and some educational researchers who are less interested in memory or problem-solving processes and more interested in the degree to which students understand mathematics. To the experimental psychologist, the educator's concept of understanding seems soft, fuzzy, and bereft of ties to rigorous theory or methodology. To educators, the psychologist's preoccupation with retrieval and decision processes seems misplaced. In our view, crossing the gulf between these two solitudes is critical for addressing fundamental questions about the acquisition of elementary mathematical skills. With some notable exceptions (e.g., Fuson, 1988, 1990; Greeno & Riley, 1987; Resnick, 1983), attempts at connecting research on procedures with research on various forms of understanding have been few and far between, and an integrative framework for this line of research is needed.
Our approach in this chapter has been to outline such a framework by exploring the kinds of methods that are useful for addressing questions about the development of understanding. This framework is based on three central propositions. The first proposition is that many forms of behavior are related to understanding and that defining understanding in terms of performance on a single task or measure is counterproductive. Narrow definitions promote an all-or-none view of understanding that simply is not very informative for developmental and instructional concerns, where the focus is on the sequence in which different behaviors and skills are acquired. The second proposition is that a contextual space can be defined to identify different types of performance related to understanding a particular concept or principle. We suggest that activity, as exemplified by three general classes of tasks, and generality are particularly appropriate for identifying a profile of performance in this contextual space, and that this profile provides the empirical basis for constructing theories about the relations among skills required for different tasks. The third proposition is that the acquisition of understanding be studied in terms of sequences of profiles that emerge during the course of development or instruction, as opposed to studying changes in performance on a single task. As we have illustrated, this approach should provide valuable insights into mechanisms underlying the acquisition processes.

Given this framework, the empirical goal is to describe relations among tasks, or profiles, that emerge during development and instruction. The theoretical goal, then, is to describe knowledge acquisition systems that can account for the observed changes in profiles. We expect that these theories will provide useful insights for both the assessment and instruction of mathematical understanding and, moreover, that the process of attempting to convert these insights into educational practice will contribute to richer and more revealing theories and methods of research.

ACKNOWLEDGEMENTS

Preparation of this chapter was supported with grants from the Natural Sciences and Engineering Research Council of Canada to both authors. We are grateful to Jamie Campbell, Karen Fuson, Alison Kulak, Don Mabbott, and Laura Novick for helpful comments on a previous draft. Correspondence may be addressed to J. Bisanz, Department of Psychology, University of Alberta, Edmonton, AB, Canada T6G 2E9, or to J. LeFevre, Department of Psychology, Carleton University, Ottawa, Ontario, Canada K1S 5B6.
REFERENCES

Anderson, J.R. (1983). The architecture of cognition. Cambridge, MA: Harvard University Press.
Ashcraft, M.H. (1987). Children's knowledge of simple arithmetic: A developmental model and simulation. In J. Bisanz, C.J. Brainerd & R. Kail (Eds.), Formal methods in developmental psychology: Progress in cognitive development research (pp. 302-338). New York: Springer-Verlag.
Baroody, A.J. & Ginsburg, H.P. (1986). The relationship between initial meaningful and mechanical knowledge of arithmetic. In J. Hiebert (Ed.), Conceptual and procedural knowledge: The case of mathematics (pp. 75-112). Hillsdale, NJ: Erlbaum.
Baroody, A.J., Ginsburg, H.P. & Waxman, B. (1983). Children's use of mathematical structure. Journal for Research in Mathematics Education, 14, 156-168.
Bisanz, J., Dunn, M. & Morrison, F.J. (1991, April). Effects of age and schooling on the acquisition of elementary number skills. Poster presented at the meeting of the Society for Research in Child Development, Seattle, WA.
Bisanz, J. & LeFevre, J. (1990). Strategic and nonstrategic processing in the development of mathematical cognition. In D.F. Bjorklund (Ed.), Children's strategies: Contemporary views of cognitive development (pp. 213-244). Hillsdale, NJ: Lawrence Erlbaum Associates.
Bjorklund, D.F. & Harnishfeger, K.K. (1990). The resources construct in cognitive development: Diverse sources of evidence and a theory of inefficient inhibition. Developmental Review, 10, 48-71.
Brainerd, C.J. (1973). Judgments and explanations as criteria for the presence of cognitive structure. Psychological Bulletin, 79, 172-179.
Bransford, J.D., Vye, N., Kinzer, C. & Risko, V. (1990). Teaching thinking and content knowledge: Toward an integrated approach. In B.F. Jones & L. Idol (Eds.), Dimensions of thinking and cognitive instruction (pp. 381-413). Hillsdale, NJ: Erlbaum.
Briars, D. & Siegler, R.S. (1984). A featural analysis of preschoolers' counting knowledge. Developmental Psychology, 20, 607-618.
Brown, A.L., Bransford, J.D., Ferrara, R.A. & Campione, J.C. (1983). Learning, remembering, and understanding. In J.H. Flavell & E.M. Markman (Eds.), P.H. Mussen (Series Ed.), Handbook of child psychology: Vol. 3. Cognitive development (pp. 77-166). New York: Wiley.
Bullock, M. (1991, April). Scientific reasoning in elementary school: Developmental and individual differences. In E. Amsel (Chair), Evidence evaluation, conceptual change, and the development of scientific reasoning. Symposium presented at the meeting of the Society for Research in Child Development, Seattle, WA.
Byrnes, J.P. & Wasik, B.A. (1991). Role of conceptual knowledge in mathematical procedural learning. Developmental Psychology, 27, 777-786.
Campbell, J.I.D. & Clark, J.M. (1989). Time course of error priming in number-fact retrieval: Evidence for excitatory and inhibitory mechanisms. Journal of Experimental Psychology: Learning, Memory, and Cognition, 15, 920-929.
Carpenter, T.P. & Lindquist, M.M. (1989). Summary and conclusions. In M.M. Lindquist (Ed.), Results from the Fourth Mathematics Assessment of the National Assessment of Educational Progress (pp. 160-169). Reston, VA: National Council of Teachers of Mathematics.
Carraher, T.N., Schliemann, A.D. & Carraher, D.W. (1988). Mathematical concepts in everyday life. In G.B. Saxe & M. Gearheart (Eds.), Children's mathematics: New directions for child development (Vol. 41, pp. 71-88). San Francisco, CA: Jossey-Bass.
Case, R. (1985). Intellectual development: Birth to adulthood. Orlando: Academic Press.
Curriculum and evaluation standards for school mathematics. (1989). Reston, VA: The National Council of Teachers of Mathematics.
Flavell, J.H. & Wohlwill, J.F. (1969). Formal and functional aspects of cognitive development. In D. Elkind & J.H. Flavell (Eds.), Studies in cognitive development (pp. 67-120). New York: Oxford University Press.
Fuson, K.C. (1988). Children's counting and concepts of number. New York: Springer-Verlag.
Fuson, K.C. (1990). Conceptual structures for multiunit numbers: Implications for learning and teaching multidigit addition, subtraction, and place value. Cognition and Instruction, 7, 343-404.
Fuson, K.C. & Hall, J.W. (1983). The acquisition of early number word meanings. In H. Ginsburg (Ed.), The development of children's mathematical thinking (pp. 49-107). New York: Academic Press.
Gelman, R. & Gallistel, C.R. (1978). The child's understanding of number. Cambridge, MA: Harvard University Press.
Gelman, R. & Meck, E. (1983). Preschoolers' counting: Principles before skill. Cognition, 13, 343-359.
Gick, M.L. & Holyoak, K.J. (1980). Analogical problem solving. Cognitive Psychology, 12, 306-365.
Gottlieb, G. (1991). Experiential canalization of behavioral development: Theory. Developmental Psychology, 27, 4-17.
Greeno, J.G. (1983). Forms of understanding in mathematical problem solving. In S.G. Paris, G.M. Olson & H.W. Stevenson (Eds.), Learning and motivation in the classroom (pp. 83-111). Hillsdale, NJ: Lawrence Erlbaum Associates.
Greeno, J.G. & Riley, M.S. (1987). Processes and development of understanding. In F.E. Weinert & R.H. Kluwe (Eds.), Metacognition, motivation, and understanding (pp. 289-313). Hillsdale, NJ: Lawrence Erlbaum Associates.
Greeno, J.G., Riley, M.S. & Gelman, R. (1984). Conceptual competence and children's counting. Cognitive Psychology, 16, 94-143.
Grouws, D. (Ed.). (in press). Handbook of research on mathematics teaching and learning. New York: Macmillan.
Hatano, G. (1988). Social and motivational bases for mathematical understanding. In G.B. Saxe & M. Gearheart (Eds.), Children's mathematics: New directions for child development (Vol. 41, pp. 55-77). San Francisco, CA: Jossey-Bass.
Howe, M.L. & Rabinowitz, F.M. (1990). Resource panacea? Or just another day in the developmental forest. Developmental Review, 10, 125-154.
Kail, R. (1988). Developmental functions for speeds of cognitive processes. Journal of Experimental Child Psychology, 45, 339-364.
Kail, R. (1991). Developmental change in speed of processing during childhood and adolescence. Psychological Bulletin, 109, 490-501.
Kail, R. & Bisanz, J. (1982). Information processing and cognitive development. In H.W. Reese (Ed.), Advances in child development and behavior (Vol. 17, pp. 45-81). New York: Academic Press.
Kail, R. & Bisanz, J. (in press). The information-processing perspective on cognitive development in childhood and adolescence. In C.A. Berg & R.J. Sternberg (Eds.), Intellectual development. New York: Cambridge University Press.
Kaye, D.B. (1986). The development of mathematical cognition. Cognitive Development, 1, 157-170.
Kaye, D.B., deWinstanley, P., Chen, Q. & Bonnefil, V. (1989). Development of efficient arithmetic computation. Journal of Educational Psychology, 81, 467-480.
Kaye, D.B., Post, T.A., Hall, V.C. & Dineen, J.T. (1986). Emergence of information-retrieval strategies in numerical cognition: A developmental study. Cognition and Instruction, 3, 127-150.
Kintsch, W. & Greeno, J.G. (1985). Understanding and solving word arithmetic problems. Psychological Review, 92, 109-129.
Klahr, D. (1976). Steps toward the simulation of intellectual development. In L.B. Resnick (Ed.), The nature of intelligence (pp. 99-133). Hillsdale, NJ: Erlbaum.
Klahr, D. (1984). Transition processes in quantitative development. In R.J. Sternberg (Ed.), Mechanisms of cognitive development (pp. 101-139). New York: W.H. Freeman.
Klahr, D., Langley, P. & Neches, R. (Eds.). (1987). Production system models of learning and development. Cambridge, MA: MIT Press.
Kulm, G. (Ed.). (1990). Assessing higher order thinking in mathematics. Washington, DC: American Association for the Advancement of Science.
Lawler, R.W. (1981). The progressive construction of mind. Cognitive Science, 5, 1-30.
LeFevre, J., Kulak, A.G. & Bisanz, J. (1991). Individual differences and developmental change in the associative relations among numbers. Journal of Experimental Child Psychology, 52, 256-274.
Lehman, D.R., Lempert, R.O. & Nisbett, R.E. (1988). The effects of graduate training on reasoning: Formal discipline and thinking about everyday-life events. American Psychologist, 43, 431-442.
Leinhardt, G., Putnam, R.T. & Hattrup, R.A. (Eds.). (in press). The analysis of arithmetic for mathematics teaching. Hillsdale, NJ: Lawrence Erlbaum Associates.
Miller, S.A. (1982). Cognitive development: A Piagetian perspective. In R. Vasta (Ed.), Strategies and techniques of child study (pp. 161-207). New York: Academic Press.
Neches, R., Langley, P. & Klahr, D. (1987). Learning, development, and production systems. In D. Klahr, P. Langley & R. Neches (Eds.), Production system models of learning and development (pp. 1-53). Cambridge, MA: MIT Press.
Newell, A. & Simon, H.A. (1972). Human problem solving. Englewood Cliffs, NJ: Prentice-Hall.
Ohlsson, S. (1987). Truth versus appropriateness: Relating declarative to procedural knowledge. In D. Klahr, P. Langley & R. Neches (Eds.), Production system models of learning and development (pp. 287-327). Cambridge, MA: MIT Press.
Ohlsson, S. & Rees, E. (1991). The function of conceptual understanding in the learning of arithmetic procedures. Cognition and Instruction, 8, 103-179.
Pascual-Leone, J. (1970). A mathematical model for the transition rule in Piaget's developmental stages. Acta Psychologica, 32, 301-345.
Perfetto, G.A., Bransford, J.D. & Franks, J.J. (1983). Constraints on access in a problem solving context. Memory & Cognition, 11, 24-31.
Piaget, J. (1970). Piaget's theory. In P.H. Mussen (Ed.), Manual of child psychology (3rd ed., Vol. 1, pp. 703-732). New York: Wiley.
Putnam, R.T., deBettencourt, L.U. & Leinhardt, G. (1990). Understanding of derived-fact strategies in addition and subtraction. Cognition and Instruction, 7, 245-285.
Resnick, L.B. (1983). Toward a cognitive theory of instruction. In S.G. Paris, G.M. Olson & H.W. Stevenson (Eds.), Learning and motivation in the classroom (pp. 5-38). Hillsdale, NJ: Lawrence Erlbaum Associates.
Romberg, T.A. & Carpenter, T.P. (1986). Research on teaching and learning mathematics: Two disciplines of scientific inquiry. In M.C. Wittrock (Ed.), Handbook of research on teaching (3rd ed., pp. 850-873). New York: Macmillan.
Schliemann, A.D. & Acioly, N.M. (1989). Mathematical knowledge developed at work: The contribution of practice versus the contribution of schooling. Cognition and Instruction, 6, 185-221.
Siegler, R.S. (1986). Unities across domains in children's strategy choices. In M. Perlmutter (Ed.), Minnesota Symposium on Child Development (pp. 1-48). Hillsdale, NJ: Lawrence Erlbaum Associates.
Siegler, R.S. & Crowley, K. (1991). The microgenetic method: A direct means for studying cognitive development. American Psychologist, 46, 606-620.
Siegler, R.S. & Jenkins, E. (1989). How children discover new strategies. Hillsdale, NJ: Lawrence Erlbaum Associates.
Sodian, B., Zaitchik, D. & Carey, S. (1991). Young children's differentiation of hypothetical beliefs from evidence. Child Development, 62, 753-766.
Sowder, J.T. & Wheeler, M.M. (1989). The development of concepts and strategies used in computational estimation. Journal for Research in Mathematics Education, 20, 130-146.
Starkey, P. & Gelman, R. (1982). The development of addition and subtraction abilities prior to formal schooling in arithmetic. In T.P. Carpenter, J.M. Moser & T.A. Romberg (Eds.), Addition and subtraction: A cognitive perspective (pp. 99-116). Hillsdale, NJ: Lawrence Erlbaum Associates.
Wynn, K. (1990). Children's understanding of counting. Cognition, 36, 155-193.
The Nature and Origins of Mathematical Skills
J.I.D. Campbell (Editor)
© 1992 Elsevier Science Publishers B.V. All rights reserved.
Chapter 4 MATHEMATICAL MISUNDERSTANDINGS: QUALITATIVE REASONING ABOUT QUANTITATIVE PROBLEMS
Richard E. Mayer, University of California, Santa Barbara
Anne Bovenmyer Lewis, University of Colorado, Boulder
Mary Hegarty, University of California, Santa Barbara
Summary
In this chapter, we summarize a research program on students' misunderstandings of mathematical story problems. A mathematical misunderstanding occurs when the problem solver constructs a mental model of the problem situation that conflicts with the information in the problem statement. In particular, we examine difficulties that students have in reasoning about problems containing relational statements, such as "Gas at Chevron is 5 cents more per gallon than gas at ARCO." We report a series of studies in which relational statements cause systematic errors in the recall of story problems, in the pattern of solution errors, in the pattern of solution times, in students' eye fixations, and in the effectiveness of remediation training. This research supports the idea that instruction in mathematical problem solving should focus on the development of qualitative reasoning skills--such as how to build a mental model of the problem situation--in addition to quantitative reasoning skills--such as executing computational procedures.
Introduction
Mathematical misunderstandings
In mathematical problem solving, a problem solver begins with a verbal statement of a situation describing quantitative relations among different variables, and must end with a solution, which is derived by combining the given quantities in a manner consistent with the described situation. The cognitive processes involved in mathematical problem solving include: translation--converting each statement in the problem into an internal mental representation; integration--combining the relevant information into a coherent mental representation; planning--devising and monitoring a plan that breaks the problem into parts (including metacognitive processes of monitoring one's solution attempts); and execution--carrying out the computations or operations for each step in the plan (Mayer, 1985, 1989, 1991; Mayer, Larkin & Kadane, 1984). Thus, successful mathematical problem solving involves both qualitative and quantitative reasoning. The first three cognitive processes, i.e., translation, integration, and planning, involve qualitative reasoning, which includes constructing a qualitative model of the situation described in the problem (Greeno, 1987, 1989; Kintsch & Greeno, 1985); in contrast, the process of execution involves quantitative reasoning, which includes combining the numerical values in the problem to derive the quantitative solution. Mathematical misunderstandings occur in qualitative reasoning when the problem solver's model of the problem situation conflicts with the information in the problem statement.
One of the major accomplishments of qualitative reasoning about quantitative problems is the production of a model of the problem situation (Greeno, 1987, 1989; Kintsch & Greeno, 1985). Kintsch and Greeno (1985) characterized this type of reasoning as a special case of text comprehension in which a propositional textbase is formed and organized into a problem model, a mathematical representation of the actions and relations expressed in the problem. In a later version of this problem-comprehension model, a distinction is made between situation models--which are qualitative representations of problem text in everyday terms, such as comprehending that a boat is travelling upstream--and corresponding problem models--which express these qualitative representations in formal mathematical terms such as an equation (Lewis & Nathan, 1991; Nathan, Kintsch & Young, in press). Thus, construction of a problem model involves formally quantifying the informal situation information. In terms of this problem-comprehension model, mathematical misunderstandings occur when the problem solver's problem model does not accurately reflect the situation described
in the problem, that is, when the problem solver is unable to construct a correct situation model of the problem, or is unable to make a correct correspondence between this situational understanding and the mathematical formalisms of a problem model. In this chapter, we explore the nature of high school and college students' mathematical misunderstandings of a particularly difficult type of word problem--compare problems (Hudson, 1983; Morales, Shute & Pellegrino, 1985; Riley & Greeno, 1988; Riley, Greeno & Heller, 1983). Compare problems are story problems that include a proposition describing a quantitative relation between two variables, such as, "Gas at Chevron is 5 cents less per gallon than gas at ARCO." In particular, we review the role of problem representation in mathematical problem solving, summarize five approaches that we have used in our research program to study how students understand and misunderstand mathematical problems, and conclude with suggestions for mathematics education.
An example of mathematical misunderstanding
A well-documented example of a mathematical misunderstanding is based on the following problem (Hegarty, Mayer & Green, in press; Lewis, 1989; Lewis & Mayer, 1987):
At ARCO gas sells for $1.13 per gallon. This is 5 cents less per gallon than gas at Chevron. How much do 5 gallons of gas cost at Chevron?
This is a two-step compare problem because two computations are required and the second sentence compares the cost of gas at ARCO to the cost of gas at Chevron using the relational term, "less...than." The college and junior college students we tested in our program of research found this to be a difficult problem (Hegarty, Mayer & Green, in press; Lewis, 1989; Lewis & Mayer, 1987). The most common incorrect answer was $5.40, based on computations such as these:
1.13 - .05 = 1.08
1.08 x 5 = 5.40
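(The computation the problem actually requires is the opposite: 1.13 + .05 = 1.18, and 1.18 x 5 = 5.90.)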
We refer to this mistake as a reversal error because the problem solver performs the opposite operation of what is actually required--e.g., subtraction instead of addition in the first step. In spite of many years of mathematics education, including practice in solving hundreds of word problems, approximately 15% to
35% of the college students that we tested in various studies produced reversal errors on two-step compare problems like the ARCO problem. De Corte and his colleagues have extended this research, showing that young children display the same pattern of errors on one-step compare problems (De Corte, Verschaffel & Pauwels, 1989).
Why is the ARCO problem so difficult? Computationally, the problem is straightforward, and our students rarely made arithmetic errors. Conceptually, however, the problem is demanding. Students who rely on key words to understand the problem will see the word "less" and therefore be inclined to subtract (rather than add) the first two numbers in our example. Briars & Larkin (1984) have shown that a key word approach to understanding word problems can be effective for some word problems commonly found in mathematics textbooks, in spite of the fact that reliance on key words is a superficial heuristic that does not work on problems such as the ARCO problem. Thus, reversal errors on problems such as the ARCO problem seem to signal mathematical misunderstandings rather than computational failures on the part of the problem solver.
The distinction between the key word approach to solving word problems and a model-based approach has also been explored in the artificial intelligence research literature. Bobrow (1968) developed a program, called STUDENT, that solves mathematical story problems by identifying key words in the problem statement and converting the quantities and key words into mathematical equations from which the problem can be solved. This program was successful in solving many problems that occur in mathematics textbooks, but had some limitations as a theory of problem solving. In particular, Paige and Simon (1966) found that in converting story problems into equations, some human subjects did not always translate directly from the problem statement, but first constructed an "auxiliary representation" of the problem situation and derived the equations from this. On the other hand, some less able problem solvers did translate directly, and these solvers failed to notice that the following problem describes an impossible situation (Paige & Simon, 1966, p. 87):
A board was sawed into two pieces. One piece was two-thirds as long as the whole board and was exceeded in length by the second piece by 4 feet. How long was the board before it was cut?
If a program like STUDENT solved the ARCO problem, relying on the key words to derive the appropriate operations, it might commit the type of reversal errors that we observed in some of our students. In contrast, students who first
build a qualitative model of the situation described in the problem should be better able to determine that gas at ARCO is less than gas at Chevron and therefore select the proper mathematical equation (Kintsch & Greeno, 1985). One way to correctly represent the cost per gallon of gas is on a number line (Lewis, 1989):

    ARCO ($1.13) ----- 5 cents ----- Chevron ($1.18)
After constructing this qualitative representation, students can recognize that to determine the price of gas at Chevron they must add the difference in price to the cost of gas at ARCO. We hypothesize that this external, spatial representation helps students create a link between their situational understanding of the problem and the proper mathematical representation of the situation. As suggested by Paige and Simon (1966), the difference between a key-word and a model-construction approach to problem representation exemplifies a possible difference between successful and unsuccessful problem-solvers.
The ARCO example suggests that errors occur when students focus on product--getting a numerical answer--rather than process--building a qualitative representation of the problem that makes sense and enables the creation of an appropriate solution plan. Put another way, the ARCO example suggests that errors occur when students view word problems primarily as an invitation to engage in quantitative reasoning--carrying out computations based on numbers in the problem--rather than qualitative reasoning--understanding the situation described in the words of the problem and then devising an appropriate plan. The theme of this chapter is that problem understanding should be viewed as a basic mathematics skill; that is, students need to acquire skill in qualitative reasoning about quantitative problems. The kind of instructional program we propose differs radically from current practice, in which mathematics instruction emphasizes quantitative computation as the primary goal in mathematical problem solving.
Understanding a problem has long been recognized as one of the premier skills required for successful mathematical problem solving (Cummins, Kintsch, Reusser & Weimer, 1988; Greeno, 1987; Kintsch & Greeno, 1985; Mayer, 1985, 1991; Mayer, Larkin & Kadane, 1984; Polya, 1965; Wertheimer, 1959). Problem understanding occurs when a problem solver converts the words of the problem into an internal mental representation of the problem situation. In the case of word problems involving numerical quantities, the problem solver must construct a qualitative model of the situation described in the problem. Nathan, Kintsch & Young (in press) refer to this process as the construction of a situation model, and
we argue in this chapter that a crucial skill for mathematical problem solving is the ability to construct a qualitative, situation model of the problem information and relate this model to mathematical formalisms. In the following sections we describe five approaches we have taken in trying to understand the nature of mathematical misunderstandings--recall, solution errors, solution times, eye fixations, and remediation.
Recall: Difficulties in remembering compare problems
Our study of mathematical misunderstanding began with an analysis of approximately 1100 word problems collected from mathematics textbooks commonly used in California's junior high schools (Mayer, 1981). We sorted the problems into "families" that each shared the same general situation, such as motion problems, which involved the formula distance = rate x time. In all, we identified approximately 24 families including motion, current, age, coin, work, part, dry mixture, wet mixture, percent, ratio, unit cost, markup/discount/profit, interest, direct variation, inverse variation, digit, rectangle, circle, triangle, series, consecutive integer, arithmetic, physics, and probability. This extends the list of 18 problem categories proposed by Hinsley, Hayes and Simon (1977). Within each family, there were several different formats, which we called "templates," yielding a total of approximately 100 templates. For example, there were 13 different templates for motion problems, including overtake (one vehicle overtakes another), closure (two vehicles converge on the same point from different directions), speed change (one vehicle changes speed during a trip), and round trip (one vehicle makes a round trip). In this chapter, we use the term "type" to refer to a problem template. The structure of any template can be expressed as a unique list of propositions, with each proposition consisting of an assignment proposition, a relational proposition, or a question proposition. An assignment gives a single numerical value for some variable, such as "the cost of candy is $1.70 per pound" or "one vegetable oil contains 6% saturated fats." A relation gives a single numerical relation between two variables, such as "the length is 2 1/2 times the width" or "the rate in still water is 12 mph more than the rate of the current." A question asks for a numerical value of some variable, such as "how much time will it take to empty the tanks" or "how many miles will the first car have gone before it is passed." In addition, problems could contain relevant or irrelevant facts.
The psychological reality of such problem categories has been demonstrated by Hinsley, Hayes and Simon (1977). In a series of verbal protocol studies, they found that students recognized each of the standard types of problems, often
within the first few words of the problem. Students were also able to sort problems into categories based on problem type, suggesting that students possessed schemas for basic problem types.
Our research focused on students' recall of story problems. In order to examine students' comprehension of problems, we asked college students to study a series of eight algebra story problems that were presented individually in printed form and then, as a recall test, we asked the students to write down the problems they had just read (Mayer, 1982). For example, one of the problems was the following current problem:
A river steamer travels 36 miles downstream in the same time that it travels 24 miles upstream. The steamer's engine drives it in still water at a rate 12 miles per hour more than the rate of the current. Find the rate of the current.
This problem consists of two assignments, two relations, and a question, which can be specified as:
miles downstream = 36
miles upstream = 24
time upstream = time downstream
rate of steamer in still water = rate of current + 12
rate of current = UNKNOWN
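For illustration, the same specification can be written as a small data structure. The following Python sketch is ours (the notation is not from the chapter); it simply tags each proposition with its type:

    # The current problem as a list of typed propositions (illustrative only).
    # An assignment gives one variable a numerical value, a relation links two
    # variables numerically, and a question asks for a variable's value.
    current_problem = [
        ("assignment", "miles downstream = 36"),
        ("assignment", "miles upstream = 24"),
        ("relation",   "time upstream = time downstream"),
        ("relation",   "rate of steamer in still water = rate of current + 12"),
        ("question",   "rate of current = UNKNOWN"),
    ]

In these terms, the recall error described next amounts to rewriting a "relation" entry as an "assignment" entry.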
A common error in recalling the current problem was to change a relation
proposition into an assignment proposition: A river boat goes 36 miles down the river and 24 miles up the river. If its engines push the boat 12 mph in still water, find the current speed. In this example, the problem is also converted from a version that rarely appeared in math books (twice per 1100 problems) to a more common version (nine times per 1100 problems). Uncommon problem types (or templates) occurred once or twice per 1100 problems in textbooks whereas common problem types occurred from nine to 40 times per 1100 problems. Overall, the results indicated a proposition-type effect: propositions about relations were approximately three times more likely to be misrecalled than propositions about assignments (29% and 9% errors, respectively), and there were 20 instances of students recalling a relation as an assignment versus only one instance of the reverse. These results suggest that students have difficulty in mentally representing propositions that express a numerical relation between two
variables--that is, relational propositions may require more cognitive processing than assignments. Our results also indicated a schema-familiarity effect: problem types commonly found in mathematics textbooks were more easily recalled than uncommon problem types (e.g., the correlation between probability of recall and the relative frequency with which a problem type occurred in textbooks was r = .942), and there were 17 instances of students converting a less common problem type into a more common one versus no cases of the reverse. Cummins, Kintsch, Reusser and Weimer (1988) also found that students tended to miscomprehend difficult word problems by converting them into simpler problems. These results support the view (Hinsley, Hayes & Simon, 1977) that problem categorization plays an important role in the representation of story problems.
Solution errors: Difficulties in solving compare problems
The next approach to our study of mathematical misunderstanding was an examination of the solution protocols of students who solved a series of story problems (Lewis & Mayer, 1987). We asked college students to show their work as they solved a series of 24 story problems, including problems that contained relational propositions. Some of the relational propositions were called consistent because the relational term primed the appropriate arithmetic operation (e.g., contained "less" when the appropriate operation was to subtract or contained "more" when the appropriate operation was to add); others were inconsistent because the relational term primed an inappropriate arithmetic operation (e.g., contained "less" when the appropriate operation was to add or contained "more" when the appropriate operation was to subtract). For example, a consistent problem was:
At ARCO gas sells for $1.13 per gallon. Gas at Chevron is 5 cents less per gallon than gas at ARCO. How much do 5 gallons of gas cost at Chevron?
In contrast, an inconsistent problem was:
At ARCO gas sells for $1.13 per gallon. This is 5 cents less per gallon than gas at Chevron. How much do 5 gallons of gas cost at Chevron?
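To make the contrast concrete, consider a toy key-word strategy in the spirit of Bobrow's STUDENT. The following Python sketch is ours, for illustration only, and is not Bobrow's actual program: it keys the arithmetic operation directly to the relational term, and therefore gets the consistent version right but commits a reversal error on the inconsistent version.

    # Toy key-word strategy (illustrative only): subtract when the problem says
    # "less," add when it says "more," then multiply by the number of gallons.
    def keyword_solve(arco_price, difference, gallons, relational_term):
        if relational_term == "less":
            chevron_price = arco_price - difference   # keyed to the word, not the situation
        else:  # "more"
            chevron_price = arco_price + difference
        return round(chevron_price * gallons, 2)

    keyword_solve(1.13, 0.05, 5, "less")   # consistent version: 5.40, correct
    keyword_solve(1.13, 0.05, 5, "less")   # inconsistent version: 5.40 again--a
                                           # reversal error; the answer is 5.90

A model-based solver would instead derive the operation from a number-line representation of the situation, which differs between the two versions even though the key word does not.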
Each problem contains an assignment (ARCO = 1.13 per gallon), a relation (ARCO - .05 = Chevron, or ARCO + .05 = Chevron), an assignment (number of gallons = 5), and a question (total cost = UNKNOWN). The results indicate an error-consistency effect: inconsistent problems, which contain a relational statement that primes an incorrect operation, lead to far more reversal errors (i.e., adding when subtraction is required or subtracting when addition is required) than consistent problems, which contain a relational statement that primes an appropriate operation. For example, over the course of several studies, college students made reversal errors on less than 2% of the consistent problems but on 15% to 35% of the inconsistent problems. This result suggests that students have difficulty in constructing and using a representation corresponding to the situation portrayed in relational statements.
Solution times: Difficulties in planning solutions for compare problems
The third approach in our research program was to pinpoint the locus of the error-consistency effect using eye-fixation methodologies (Hegarty, Mayer & Green, in press). We asked college students to view a series of problems presented on a computer monitor and to state a solution plan for each. For example, a solution plan for the inconsistent version of the ARCO problem is, "Add a dollar thirteen to five cents, then multiply by five." Thus, unlike previous studies, students did not engage in computing a numerical answer (execution); instead, they engaged solely in the qualitative aspects of mathematical problem solving (translation, integration, and planning). To better evaluate students' qualitative reasoning processes, we recorded their eye fixations as they read each problem until they produced a solution plan. Reflecting the error-consistency effect found by Lewis and Mayer (1987) and by Lewis (1989), students made more errors in devising plans for inconsistent compare problems than for consistent compare problems. This result suggests that the consistency effect occurs during the qualitative phases of problem solving rather than during the quantitative computations in the execution phase. A new aspect of this study is a more detailed picture of the problem-solvers' processing times. When we look only at highly accurate students, for example, we find a time-consistency effect in which more time is allocated to inconsistent compare problems (e.g., 27 seconds) than consistent compare problems (e.g., 23 seconds). Interestingly, the effect is not present during the initial reading of the problem (translation phase) but is seen in the period between initial reading and producing a solution plan (integration and planning phases). Thus, the
time-consistency effect seems to be localized in the integration and planning phases rather than the translation and execution phases of problem solving. The time-consistency effect was not found for less accurate students. Overall, students who produced many errors did not spend more time processing inconsistent problems than consistent problems. De Corte, Verschaffel, and Pauwels (1989) obtained a similar pattern of time effects with younger children. This suggests that less-accurate students may not be sensitive to the linguistic structure of problems. This lack of recognition that inconsistent problems require more careful processing can account for the higher error rate on inconsistent problems.
Eye fixations: Difficulties in processing words and numbers in compare problems
The eye-fixation data provide some preliminary new information concerning how highly accurate students process word problems. As an initial step, we developed an eye-fixation protocol for each student on the ARCO problem. The protocol listed each line that the student fixated, in order; each line on the protocol contained all of the words and/or numbers fixated before the student's eyes moved to another line. If students read our word problems as they would read familiar narrative prose, we could expect their eye movements to progress systematically from line to line, with few rereadings--that is, moving one's eyes from the current line to a line that had previously been fixated (Just & Carpenter, 1987). In contrast, our students reread a portion (or all) of a previously read line an average of 13.2 times, suggesting that they experienced difficulty in understanding the problem. An analysis of these rereadings revealed three interesting patterns--a fixation-funnel effect, a fixation-selection effect, and a fixation-consistency effect.
First, we found a fixation-funnel effect in which students tended to fixate fewer words and/or numbers when their eyes moved back to a line they had previously read. Whereas students fixated almost all of the words and numbers on each line during the initial reading, they most commonly fixated only one or two items on subsequent rereadings. For example, approximately 39% of the rereadings involved looking at one word or number and 26% of the rereadings involved looking at two words or numbers.
Second, we found a fixation-selection effect in which students tended to fixate on words and numbers during the initial reading of the problem but to focus disproportionately on numbers during subsequent rereadings. For example, students reread the numbers in the ARCO problem approximately three times while rereading other information approximately once. These patterns are
consistent with the idea that successful students begin by trying to build a qualitative model of the problem situation--using mainly the words in the problem statement--and then later fill in the model with specific numbers.
Third, we found a fixation-consistency effect in which students reread the numbers in inconsistent and consistent problems about the same number of times but reread the words more often in inconsistent than consistent problems. For example, in the second line of the ARCO problem, students given the inconsistent version reread "Chevron" an average of 1.4 times compared to .5 for students given the consistent version and reread "less" 2.6 times compared to 1.5 for the consistent version; however, students given the inconsistent version reread "5 cents" an average of 2.9 times compared to 3.5 times for students given the consistent version. This finding suggests that the additional processing time required to correctly process inconsistent problems can be accounted for by students building a situation model of the problem--based on the words rather than the numbers in the problem.
Remediation: Eliminating errors via representation training
The foregoing solution-time and eye-fixation results suggest that more-accurate students build a qualitative model of the situation described in the problem before planning their solution. Thus, the final approach we used in our research program was to provide less-accurate students with direct instruction in how to construct qualitative situation models for compare problems (Lewis, 1989). The instruction, called representation training, consisted of two sessions. In the first session, the instructor presented the definitions and examples of the three types of statements in word problems (i.e., assignments, relations, and questions), the instructor noted that problems with relations were called compare problems and that relational statements were generally more difficult to understand than assignments, and students learned to label statements in example problems with A for assignment, R for relation, or Q for question. In the second session, the instructor demonstrated how to use a diagram to represent the information in two-step compare problems, and provided feedback as students diagrammed several practice problems. A typical diagram procedure worksheet is shown in Figure 1. Students did not actually solve any word problems during the training.
Sample Problem
Megan has saved $420 for vacation. She has saved 1/5 as much as James has saved. James has been saving for his vacation for 6 months. How much has he saved each month?

Diagramming Steps
1. Draw a number line and place the variable and the value from the assignment statement in the middle of the line.

    ----------------[$420 Megan]----------------

2. Tentatively place the unknown variable (James's savings) on one side of the middle.

    ----[x James]----[$420 Megan]----------------

3. Compare your representation with the information in the relation statement, checking to see if your representation agrees with the meaning of the relation statement. If it does, then you can continue. If not, then try again with the other side.

    ----[x James]----[$420 Megan]----------------  (disagrees with the relation)
    ----------------[$420 Megan]----[x James]----  (agrees: James has saved more)

4. Translate your representation into an arithmetic operation. If the unknown variable is to the right of the center, then the operation is an increase, such as addition or multiplication. If the unknown variable is to the left of the center, then the operation is a decrease, such as subtraction or division.

    ----------------[$420 Megan]----[x James]----
                                    --INCREASE-->

Figure 1. Diagramming procedure for a sample problem.
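Read as an algorithm, the worksheet's four steps amount to a small decision procedure: the relation statement fixes which side of the number line the unknown occupies, and the side fixes whether the operation increases or decreases the known value. The following Python sketch is our own illustrative rendering of that procedure (the function name and arguments are ours, not part of Lewis's training materials), applied to the sample problem:

    # Illustrative rendering of the Figure 1 number-line procedure (ours, not
    # from the chapter). If the relation statement says the unknown quantity is
    # larger, it sits to the right of center and the operation is an increase;
    # otherwise it sits to the left and the operation is a decrease.
    def number_line_solve(known_value, unknown_is_larger, amount, multiplicative=False):
        if unknown_is_larger:    # right of center: add or multiply
            return known_value * amount if multiplicative else known_value + amount
        else:                    # left of center: subtract or divide
            return known_value / amount if multiplicative else known_value - amount

    # Sample problem: Megan's $420 is 1/5 as much as James has saved, so the
    # unknown (James) is larger and the operation is a multiplicative increase.
    james_total = number_line_solve(420, unknown_is_larger=True, amount=5, multiplicative=True)
    savings_per_month = james_total / 6    # 2100 / 6 = 350 dollars per month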
Does teaching students how to recognize and represent relational statements affect their problem-solving performance? To answer this question, Lewis gave counterbalanced pretests and post-tests to students who received the training (trained group) and those who did not (control group). The tests included eight two-step compare problems, such as described in the previous sections, and four three-step compare problems, such as: "Alfredo is 25 years old. He is 7 years younger than Pedro, who is 3 years older than Dennis. How old will Dennis be in 8 years?" Although two-step problems (i.e., requiring two computations) were presented in the training, no three-step problems (i.e., requiring three computations) were presented, so three-step problems may be viewed as transfer problems. The problems given during training were different from those given on the pretest or on the post-test.
On the pretest both groups made many errors on compare problems, especially on inconsistent problems--replicating the error-consistency effect found by Lewis and Mayer (1987). For example, in one study, the probability of making a reversal error was approximately .02 for consistent and .30 for inconsistent problems. However, almost all of the errors were eliminated on the post-test for the trained group whereas most of the errors remained for the control group. For example, on two-step problems, the error rate fell from 15% on the pretest to 1% on the post-test for the trained group and from 16% to 9% for the control group. Training also had a strong effect on students' performance on the transfer problems: On three-step problems, the error rate fell from 23% on the pretest to 7% on the post-test for the trained group and from 20% to 19% for the control group. These results can be described as a representation-training effect: training in how to recognize and represent relational statements improves students' problem-solving performance. These results are consistent with the idea that a major impediment to successful problem solving is the student's failure to understand the problem. Importantly, instruction aimed at improving students' qualitative reasoning about word problems also improved their quantitative answers.
Conclusion
In this chapter we presented evidence that understanding what a word problem says is a major source of difficulty for mathematical problem solvers. We also provided encouraging evidence that problem-representation skills can be taught successfully to students who possess adequate computational skills. Each of the major effects that we found (such as proposition-type, error-consistency, time-consistency, fixation-consistency, and representation-training effects) can be viewed as different measures of the same cognitive processes. Our program of research on mathematical problem solving suggests both theoretical and practical implications.
Theoretical implications
Our work is consistent with an emerging theme in the cognitive science literature concerning the place of qualitative and quantitative reasoning in mathematical problem solving (Hall, Kibler, Wenger & Truxaw, 1989). Our findings complement research on how students solve physics problems, in which expert problem solvers reason qualitatively about a problem before and while they begin making quantitative computations (Larkin, 1983; Larkin, McDermott, Simon & Simon, 1980; White & Frederiksen, 1987). Similarly, our results suggest that successful students actively attempt to construct a qualitative model of the problem situation before and while they engage in quantitative computations; in contrast, errors are more likely to occur when students engage in quantitative reasoning before they have constructed a qualitative representation of the problem.
Practical implications
Our work is also consistent with the educational pleas for emphasis on process rather than product (Bloom & Broder, 1950) and for emphasis on understanding rather than responding (Brownell, 1935). In particular, our work is consistent with the idea that students often need instruction and practice in representing and planning solutions for word problems. Greeno (1987, p. 70) has noted that "students need to recognize quantities and relations among quantities" and suggests that "explicit representations of the quantities and their relations" could be helpful in instruction. Lewis' (1989) use of number lines provides one method of improving students' skills in representing the situation in word problems and deriving a corresponding plan. Willis & Fuson (1988) provide another example of how practice in constructing visual representations of word problems can improve mathematical problem-solving performance.
In summary, our research points to the crucial role of qualitative reasoning about quantitative problems. Understanding the situation presented in the words of the problem is a fundamental skill required for solving word problems. Mathematical misunderstandings--in which students fail to construct appropriate qualitative models of the problem situation--are likely to contribute to incorrect answers in students who possess computational skill, and therefore, learning to "think situationally" (Nathan & Young, 1990) should be a central focus of mathematics instruction.
REFERENCES
Bloom, B.S. & Broder, L.J. (1950). Problem-solving processes of college students: An exploratory investigation. Chicago: University of Chicago Press.
Bobrow, D.G. (1968). Natural language input for a computer problem-solving system. In M. Minsky (Ed.), Semantic information processing. Cambridge, MA: MIT Press.
Briars, D.J. & Larkin, J.H. (1984). An integrated model of skill in solving elementary word problems. Cognition and Instruction, 1, 245-296.
Brownell, W.A. (1935). Psychological considerations in the learning and teaching of arithmetic. In The teaching of arithmetic: Tenth yearbook of the National Council of Teachers of Mathematics. New York: Columbia University Press.
Cummins, D., Kintsch, W., Reusser, K. & Weimer, R. (1988). The role of understanding in solving word problems. Cognitive Psychology, 20, 439-462.
De Corte, E., Verschaffel, L. & Pauwels, A. (1989). Third-graders' comprehension of compare problems: Empirical validation of Lewis and Mayer's consistency hypothesis. Unpublished manuscript, University of Leuven.
Greeno, J.G. (1987). Instructional representations based on research about understanding. In A.H. Schoenfeld (Ed.), Cognitive science and mathematics education (pp. 61-88). Hillsdale, NJ: Erlbaum.
Greeno, J.G. (1989). Situation models, mental models, and generative knowledge. In D. Klahr & K. Kotovsky (Eds.), Complex information processing: The impact of Herbert A. Simon. Hillsdale, NJ: Erlbaum.
Hall, R., Kibler, D., Wenger, E. & Truxaw, C. (1989). Exploring the episodic structure of algebra story problem solving. Cognition and Instruction, 6, 223-283.
Hegarty, M., Mayer, R.E. & Green, C.E. (in press). Comprehension of arithmetic word problems: Evidence from students' eye fixations. Journal of Educational Psychology, 84.
Hinsley, D., Hayes, J.R. & Simon, H.A. (1977). From words to equations. In P. Carpenter & M. Just (Eds.), Cognitive processes in comprehension (pp. 89-106). Hillsdale, NJ: Erlbaum.
Hudson, T. (1983). Correspondences and numerical differences between disjoint sets. Child Development, 54, 85-90.
Just, M.A. & Carpenter, P.A. (1987). The psychology of reading and language comprehension. Newton, MA: Allyn and Bacon.
Kintsch, W. (1988). The role of knowledge in discourse comprehension: A construction-integration model. Psychological Review, 95, 163-182.
Kintsch, W. & Greeno, J.G. (1985). Understanding and solving word arithmetic problems. Psychological Review, 92, 109-129.
Larkin, J.H. (1983). The role of problem representation in physics. In D. Gentner & A.L. Stevens (Eds.), Mental models (pp. 75-98). Hillsdale, NJ: Erlbaum.
Larkin, J.H., McDermott, J., Simon, D.P. & Simon, H.A. (1980). Expert and novice performance in solving physics problems. Science, 208, 1335-1342.
Lewis, A.B. (1989). Training students to represent arithmetic word problems. Journal of Educational Psychology, 81, 521-531.
Lewis, A.B. & Mayer, R.E. (1987). Students' miscomprehension of relational statements in arithmetic word problems. Journal of Educational Psychology, 79, 363-371.
Lewis, A.B. & Nathan, M.J. (1991). A framework for improving students' comprehension of word arithmetic and word algebra problems. In L. Birnbaum (Ed.), Proceedings of the International Conference on the Learning Sciences (pp. 305-314). Charlottesville, VA: Association for the Advancement of Computing in Education.
Mayer, R.E. (1981). Frequency norms and structural analysis of algebra story problems. Instructional Science, 10, 135-175.
Mayer, R.E. (1982). Memory for algebra story problems. Journal of Educational Psychology, 74, 199-216.
Mayer, R.E. (1985). Mathematical ability. In R.J. Sternberg (Ed.), Human abilities: An information processing approach (pp. 127-150). San Francisco: W.H. Freeman.
Mayer, R.E. (1989). Introduction to special section on cognition and instruction in mathematics. Journal of Educational Psychology, 81, 452-556.
Mayer, R.E. (1991). Thinking, problem solving, cognition (2nd ed.). New York: Freeman.
Mayer, R.E., Larkin, J.H. & Kadane, J. (1984). A cognitive analysis of mathematical problem solving ability. In R.J. Sternberg (Ed.), Advances in the psychology of human intelligence (Vol. 2, pp. 231-273). Hillsdale, NJ: Erlbaum.
Morales, R.V., Shute, V.J. & Pellegrino, J.W. (1985). Developmental differences in understanding and solving simple word problems. Cognition and Instruction, 2, 41-57.
Nathan, M.J., Kintsch, W. & Young, E. (in press). A theory of algebra word problem comprehension and its implications for the design of learning environments. Cognition and Instruction.
Nathan, M.J. & Young, E. (1990). Thinking situationally: Results with an unintelligent tutor for word algebra problems. In A. McDougall & C. Dowling (Eds.), Computers and education (pp. 187-216). New York: North-Holland.
Paige, J.M. & Simon, H.A. (1966). Cognitive processes in solving algebra word problems. In B. Kleinmuntz (Ed.), Problem solving: Research, method, and theory (pp. 51-119). New York: Wiley.
Polya, G. (1965). Mathematical discovery: On understanding, learning, and teaching problem solving. New York: Wiley.
Riley, M.S. & Greeno, J.G. (1988). Developmental analysis of understanding language about quantities and of solving problems. Cognition and Instruction, 5, 49-101.
Riley, M.S., Greeno, J.G. & Heller, J.I. (1983). Development of children's problem-solving ability. In H.P. Ginsburg (Ed.), The development of mathematical thinking (pp. 153-196). New York: Academic Press.
Wertheimer, M. (1959). Productive thinking. New York: Harper & Row.
White, B.Y. & Frederiksen, J.R. (1987). Qualitative models and intelligent learning environments. In R.W. Lawler & M. Yazdani (Eds.), Artificial intelligence and education (Vol. 1, pp. 281-306). Norwood, NJ: Ablex.
Willis, G.B. & Fuson, K.C. (1988). Teaching children to use schematic drawings to solve addition and subtraction word problems. Journal of Educational Psychology, 80, 192-201.
The Nature and Origins of Mathematical Skills
J.I.D. Campbell (Editor)
© 1992 Elsevier Science Publishers B.V. All rights reserved.
Chapter 5 THE ROLE OF EXPERTISE IN SOLVING ARITHMETIC AND ALGEBRA WORD PROBLEMS BY ANALOGY
Laura R. Novick, Vanderbilt University
Summary
This chapter considers the contribution of algebraic expertise to the solution of arithmetic and algebra story problems by analogy. Analogical transfer involves retrieving an appropriate example problem, constructing a mapping between elements in the example and test problems, and adapting the solution procedure from the example problem to fit the requirements of the test problem. Level of expertise is shown to influence the successful outcome of all three of these component processes of analogical transfer. In contrast, level of expertise does not affect the likelihood of schema induction, a type of learning that often results from successful transfer: Over a wide range of expertise, students at all levels of proficiency are equally likely to induce a more abstract knowledge structure encompassing the example and test problems as a result of successfully transferring the example solution procedure to the test problem. In the final sections of the chapter, I consider (a) the relation between these expertise effects from the adult literature on analogical problem solving and results from the developmental literature on analogical transfer in children and (b) implications of the experimental results for mathematics education.
Introduction
In solving mathematical word problems, as in many other situations, students rely on previous experience to guide their solution attempts. This reliance on prior learning can take several forms (Novick, 1988b): Students may retrieve an appropriate formula from memory, such as that for computing the area of a circle or the length of the hypotenuse of a right triangle, by matching their
representation of the new problem to a schema for a class of problems stored in memory. If an appropriate schema (and therefore an appropriate formula) cannot be found, students would seem to have two options (excluding the possibility of giving up). One option is to construct a solution on their own using their general knowledge of mathematical facts and heuristics. A second option is to attempt to relate the current problem to a specific problem encountered previously, in order to transfer some aspect of the prior problem-solving episode to the current situation. This latter option encompasses several different problem-solving behaviors, because it is possible to transfer a variety of aspects of an earlier problem-solving episode to the current one. Most research has considered transfer of the solution procedure from an example problem to an analogous target problem. Such analogical (or procedural) transfer is ubiquitous in mathematical problem solving. For example, most college students will say that they often try to solve their homework problems by finding appropriate worked-out example problems in the textbook. Although procedural transfer is possible only if the two problems have identical or nearly identical structures, even when this condition is not met (therefore the problems must be solved using quite different procedures) solvers may be able to transfer other aspects of an earlier problem-solving episode to the current one. For example, Catrambone and Holyoak (1990) have studied the importance of representing Poisson probability problems in terms of subgoals and transferring information at that level. Novick (1990) has studied transfer of a diagrammatic representation (e.g., a matrix) across problems that require conceptually-different procedures for solution.
The extensive literature on analogical problem solving, both in mathematical as well as non-mathematical domains, has delineated the component processes of procedural transfer and the stimulus factors affecting their execution. However, very little work has considered the contribution of expertise to transfer success. In the present chapter, I summarize what is known about the contribution of expertise to the analogical solution of arithmetic and algebra word problems.
This chapter contains eight major sections. First, I consider issues concerning the measurement of mathematical expertise, and I discuss the measure of expertise used in my own research. Next, I present a model of the component processes of analogical transfer. In the four subsequent sections, I consider the extent to which execution of each of these processes in the analogical solution of arithmetic and algebra word problems is affected by one's level of expertise in that domain: In the third section, I summarize some research on expertise differences in how word problems are represented, and I consider several potential implications of these differences for the outcome of the retrieval process of transfer. In the fourth and
fifth sections, I discuss the contribution of expertise to the execution of the post-retrieval processes of mapping and adaptation. In these sections, I provide a preliminary report of data from a new study. I also report some new analyses of data collected by Novick and Holyoak (1991), because the contribution of expertise to transfer was only a secondary concern in that paper. In the sixth section, I consider the contribution of expertise to schema induction, a type of learning that often results from successful analogical transfer. Having concluded my discussion of the processes associated with procedural transfer, I then consider the relation between expertise effects on such transfer and results from the developmental literature on analogical problem solving in children. Finally, in the discussion section, I consider implications of the experimental results for mathematics education.
Measuring mathematical expertise
Mathematics is a large domain, ranging from the simple arithmetic known by third graders, through the algebra, geometry, and calculus known by high school and college students, to numerical analysis, topology, and other subjects known only by those who have majored in math. It is important, therefore, to delineate clearly the type of mathematical problem solving that is required before selecting a measure of expertise. The work that I will discuss in this chapter concerns students' solutions of arithmetic and algebra story problems. For none of the problems is even pre-calculus-level knowledge required for solution.
How should expertise be assessed? Expertise often is measured by the amount of relevant experience one has. For mathematical expertise, then, one might classify students by their most advanced math class. I will refer to this measure as math level. Although students who have taken more advanced math classes certainly are more expert at math than students who stopped their math education at an earlier level, there are several problems with using math level as a measure of expertise in the present context. First, it is not clear that math levels of first-year calculus, second-year calculus, and advanced differential equations reflect increasing expertise at solving arithmetic and algebra story problems. Second, how does the expertise of someone who received an "A" in second-semester calculus compare to that of someone who received a "B" in that course and a "D" in third-semester calculus? It is unclear how to combine information about the most advanced course taken and the person's grades in all math courses into a single measure of expertise. Finally, math level often does not vary enough in a university population to make it a useful measure of expertise. For example, among 363 Vanderbilt University
undergraduates in a paid subject pool, 59% have first-year calculus as their most advanced math class. Only 20% of the students have a math level below that.
Given these problems with using math level as a measure of arithmetic and algebraic expertise, what other measure might better serve this purpose? The measure I have used in my own research is performance on the mathematics section of the Scholastic Aptitude Test (MSAT). As discussed in Novick (1988a), this measure has several advantages. A good measure of expertise should be reliable. MSAT scores are. The measure should reflect proficiency in the domain that has developed gradually over many years of experience. The MSAT has this property as well (College Entrance Examination Board (CEEB), 1986; Messick, 1980). Most importantly, the measure should provide information about expertise in the relevant domain. The MSAT measures expertise at solving arithmetic, algebra, and geometry problems (CEEB, 1986). Moreover, performance on the test will increase as a result of further instruction in these areas of mathematics (CEEB, 1986; Messick, 1980), as one would expect of a measure of expertise. In sum, students who have higher MSAT scores are more proficient, i.e., more expert, at solving arithmetic and algebra story problems than are students who have lower scores. In keeping with using MSAT as a measure of expertise in my work, I will refer to the domain in which expertise is being assessed as algebra rather than as mathematics. Although it would be somewhat more precise to refer to the domain as "arithmetic and algebra," that label is cumbersome, and arithmetic competence would seem to be a prerequisite for algebraic competence.
A final issue to consider is whether the research reviewed here on contributions of expertise to success at analogical transfer will simply show that more proficient algebraists perform better than less proficient algebraists on all tasks requiring arithmetic and algebra. This question is relevant to studies of expertise generally. For example, a skeptic might argue that the work on expertise in physics merely shows that physics experts understand physics better than do novices. This issue can be addressed in two ways for the research reviewed here. First, it is not a foregone conclusion that expertise will contribute to the success of executing each of the processes of analogical transfer. That is, more proficient algebraists may have an advantage in executing some transfer processes but not others. This turns out to be the case, and in ways that may be important for education. A second answer is particularly relevant to the complex arithmetic word problem I have used in some of my research (Novick, 1988a; Novick & Holyoak, 1991):
Members of the West High School Band were hard at work practicing for the annual Homecoming Parade. First they tried marching in rows of
twelve, but Andrew was left by himself to bring up the rear. The band director was annoyed because it didn't look good to have one row with only a single person in it, and of course Andrew wasn't very pleased either. To get rid of this problem, the director told the band members to march in columns of eight. But Andrew was still left to march alone. Even when the band marched in rows of three, Andrew was left out. Finally, in exasperation, Andrew told the band director that they should march in rows of five in order to have all the rows filled. He was right. This time all the rows were filled and Andrew wasn't alone any more. Given that there were at least 45 musicians on the field but fewer than 200 musicians, how many students were there in the West High School Band?

This problem can be solved in several ways, for example by examining all multiples of 12, of 5, or of 24, the lowest common multiple of 12, 8, and 3. When subjects do not receive a relevant example problem prior to attempting to solve the band problem (baseline condition), correct solution of that problem requires selecting and correctly executing one of these procedures. Importantly, baseline success at solving the band problem is unrelated to expertise. This finding is based on an analysis of the solution data from 64 subjects, with MSAT scores fairly densely distributed from 410 to 770, who attempted to solve the band problem without having previously seen a relevant example problem (Novick, 1988a; Novick & Holyoak, 1991). The mean MSAT score for the 40 solvers was 649, which is not significantly greater than the mean of 614 for the 24 non-solvers, t(62) = 1.56, p > .12. (The 35-point difference in observed means is less than the statistically reliable 50-60 point difference between the scores of males and females, a difference that many psychologists argue is not particularly important; e.g., Mayer, 1983.) It is also important to note that the solution procedure cued by the relevant example problem, when it is presented, is rarely used to solve the band problem in the baseline condition, regardless of expertise (Novick, 1988a). Thus it is meaningful and informative to consider the contribution of expertise to the success of executing the various processes associated with transferring the example problem's solution procedure to the band problem.
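To make the constraint structure of the band problem explicit, the brute-force approach just described can be written as a short search over the range stated in the problem. This is a minimal sketch for illustration only (in Python; it was not part of any experimental materials, and the function name is mine):

    # Band problem: rows of 12, columns of 8, and rows of 3 each leave
    # Andrew alone (remainder 1), while rows of 5 come out even.
    # The search range (at least 45, fewer than 200) comes from the
    # problem statement.
    def solve_band_problem():
        return [n for n in range(45, 200)
                if n % 12 == 1 and n % 8 == 1 and n % 3 == 1 and n % 5 == 0]

    print(solve_band_problem())    # [145]

Equivalently, because 24 is the lowest common multiple of 12, 8, and 3, it suffices to test only numbers of the form 24k + 1 for divisibility by 5.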
A process model of analogical transfer

Solving problems by analogy is not a single process, but rather involves the coordination and execution of several component processes. Although logical dependencies among the processes would seem to require that they be initiated sequentially, it is unlikely that each process must be completed before the next one
begins. In fact, the processes may very well interact once initiated. These interactions, however, will not be discussed here. Moreover, the purpose of this chapter is not to provide support for a model of analogical transfer, as such support is detailed elsewhere (e.g., Gick & Holyoak, 1980, 1983; Holyoak, Novick, & Melz, in press; Novick, 1988a; Novick & Holyoak, 1991; Ross, 1989; Ross & Kennedy, 1990). Rather, the goal is to consider the contribution of algebraic expertise to the successful execution of the various processes associated with the analogical solution of arithmetic and algebra story problems.

Solvers obviously must begin their solution attempt by encoding what they believe to be the solution-relevant features of the problem to be solved (henceforth referred to as the target problem). Because this representation process is required for all problem solving, it is not considered here to be a specific component of analogical transfer. Rather, the first component of transfer is the retrieval of an appropriate example (source) problem from memory (or from an external source such as a textbook). The features that solvers include in their representations of the target problem serve as retrieval cues for accessing the representation of a similar problem in memory. Once a potential source problem has been retrieved, solvers must construct a mapping between the source and target problems. This mapping specifies a set of one-to-one (or possibly many-to-one; see Holyoak & Thagard, 1989) correspondences between elements in the two problems. Only if a sufficiently detailed and coherent mapping can be constructed will it make sense to adapt the solution from the source problem into a similar procedure for solving the target problem. This adaptation process is important, because transfer of the solution from source to target is not an automatic consequence of successful mapping (Holyoak et al., in press; Novick & Holyoak, 1991). That is, it is possible to succeed at mapping yet fail to transfer the source solution procedure to the target problem.

In addition to the processes of retrieval, mapping, and adaptation, it is important to consider the process of schema induction, a type of learning that results from successful transfer (Bassok & Holyoak, 1989; Holyoak, 1984, 1985; Novick & Holyoak, 1991; Ross, 1989; Ross & Kennedy, 1990). In particular, successful use of the adapted source procedure to solve the target problem often leads to induction of a more general schema encompassing the source and target problems. Schema induction is important for two reasons. First, the induced schema facilitates solution of subsequent analogous problems (Gick & Holyoak, 1983; Novick & Holyoak, 1991; Pirolli & Anderson, 1985). Second, much of the expert's knowledge base is thought to be organized in terms of schemas representing classes of structurally-similar problems, and these schemas are believed to underlie experts' superior performance on a variety of tasks (e.g., Chase & Simon, 1973;
Chi, Feltovich, & Glaser, 1981; Holyoak, 1991; Schoenfeld & Herrmann, 1982). Because much of expert knowledge is schema-based, and analogical transfer leads to schema induction, analogical transfer also may contribute to increasing one's level of expertise (Ross, 1989).

Given the substantial benefits of schema induction, it is important to ask whether the likelihood of inducing a schema for the common structure underlying the source and target problems following successful transfer depends on the expertise of the problem solver. In other words, do solvers at different levels of expertise reap comparable benefits from their successful attempts at analogical transfer (at least as measured by schema induction), or is the link between analogical transfer and schema induction dependent upon one's entering level of expertise? Conclusions concerning the importance of analogical transfer as one mechanism for increasing solvers' expertise clearly will depend on the contribution of initial expertise to the likelihood of schema induction.

Representation and retrieval

Differences in problem representations as a function of expertise
Because retrieval of a source problem depends upon finding a match between the elements in one's representation of the target problem and those in representations of other problems in memory, it is appropriate to begin discussion of the retrieval process by considering expertise differences in how problems are represented. It has been well documented that the features attended to by solvers depend on their level of expertise in the problem domain (e.g., Aaronson & So, 1990; Chi et al., 1981; Schoenfeld & Herrmann, 1982; Silver, 1981). In particular, novices emphasize surface features in their representations -- for example, the specific objects and terms mentioned in the problem and the way in which the question is phrased. Experts, in contrast, emphasize structural features. These features reflect the relations among the objects in the problem rather than the specific objects themselves. Thus, they are most important for determining how to solve the problem. In the remainder of this section, I review three studies that provide evidence for these representational differences within the domain of mathematical word problems.

Silver (1981) asked seventh-graders to sort 16 arithmetic word problems into groups such that all of the problems in a group were "mathematically related." The 16 problems represented a factorial design in which 4 mathematical structures were crossed with 4 story contexts (e.g., coin collecting, farming). The basis for determining mathematical similarity was examined for students at three levels of
proficiency at solving arithmetic story problems, labelled good (scores of 11-12 on a 12-problem test), average (scores of 5-10), and poor (scores of 4 or less) problem solvers. Each student received two scores on the sorting task, indicating respectively the extent to which the student's sorting of the problems reflected the shared mathematical structure versus the shared story context of the problems. Each score was computed as the number of pairs of similar problems that were grouped together. Performance on each of these measures varied reliably as a function of expertise: Sorting of the problems according to mathematical structure increased across the poor, average, and good problem solvers, with means of 6.3, 12.0, and 17.8, respectively (out of 24). Correspondingly, sorting of the problems according to story context (i.e., surface features) decreased with increasing expertise, with means of 8.9, 3.4, and 0.6 (out of 24), respectively.

Do these results reflect the knowledge differences that are believed to accompany different levels of expertise, or was it simply the case that the good problem solvers were in some sense smarter than the poor problem solvers? One way to address this question is to conduct a training study in which students' representations of a set of problems are compared before and after instruction that increases their expertise. Schoenfeld and Herrmann (1982) conducted such a study. College students sorted a set of mathematical problems before and after completing an intensive one-month course in strategies for solving such problems. A group of math professors also sorted the problems to enable a comparison of the students' sortings to expert performance. As expected given Silver's (1981) results, with instruction the students' representations changed from reflecting primarily the surface features of the problems to reflecting primarily their structural features. Although the students' sortings of the problems were not identical to those of the professors after only a month of instruction, the sortings of the two groups were much more similar after instruction than before, with correlations of .72 and .54, respectively.

Aaronson and So's (1990) results from a solution-verification task are consistent with those just described from the sorting task. In their experiment, college students read a series of short arithmetic word problems, with each problem preceded by a yes/no question. For example, one of the stimuli was: "Did Sally have 4 quarts of soup? Sally cut her six quart minestrone recipe in half and then added a quart of left over tomato soup." Both the question and the problem were presented one word at a time, and reading times were collected for each word. The data of interest here concern a comparison of the reading times of the top and bottom thirds of the subjects, based on error rates (11.5% vs. 31.8% errors, respectively), for various classes of
words, the poorer problem solvers read much more quickly than the better problem solvers. More relevant to the issue of representational differences, the two groups differed in how they allocated their total reading time among the various categories of words. Compared to the good solvers, the poor solvers spent a greater percentage of their total reading time (49% vs. 43%, respectively) on words that reflected the surface features of the problems (e.g., the numbers, actors, and objects). Conversely, they spent a smaller percentage of their total reading time (51% vs. 57%, respectively) on aspects of the problems that were indicative of their structure: (a) the units attached to the numbers, (b) the operations that specified what to do with the numbers, (c) clause boundaries, where the relations among actors and objects would be determined, and (d) the end of the sentence, where the mathematical and semantic components of the problem would be integrated. In sum, consistent with the results of the sorting studies, poor solvers, in comparison to good solvers, preferentially allocated their attentional resources to the surface features of the problems, at the expense of the structural features.
Potential implications of expertise differences in problem representations for the retrieval process

Solvers must attend to the structural features of problems to be successful at analogical transfer, because those are the features that are responsible for determining how to solve a problem. Only when two problems are very similar structurally will it be appropriate to adapt the solution procedure from one for use in solving the other. Because the features emphasized in solvers' representations of the target problem determine the cues that are available for retrieving related information from memory, and because solvers at different levels of proficiency differ in whether they place greater emphasis on structural or surface features, it follows that expertise level may affect the likelihood of retrieving an appropriate source problem from memory. More generally, one's level of expertise may affect whether particular problems are retrieved from memory, as a function of the nature of the similarity between the target problem and those in memory. In the next three sub-sections, I consider several predictions concerning the implications of expertise differences in problem representations for the outcome of the retrieval process of analogical transfer (also see Novick, 1988a, b). Evidence bearing on these predictions is provided from the literature on the analogical solution of arithmetic and probability story problems.

Positive transfer as a function of surface similarity. One consequence of the expertise-based differences in problem representations is that novices should be
more successful at analogical transfer if two problems share surface as well as structural features than if they only share structural features, because the common surface features will facilitate retrieval of the structurally-appropriate source problem. Experts, on the other hand, should be less affected by such a manipulation, because they should be able to retrieve a structurally-similar problem even if it does not share surface features with the target problem. The results of an experiment by Ross (1984) support the prediction for novices. (To my knowledge, the prediction for experts has not been tested.) Ross taught undergraduates four probability principles (e.g., combinations, permutations), each of which was accompanied by a worked-out example from a different content domain. Subjects then solved a series of test problems requiring the use of these principles, and Ross compared their accuracy on problems for which the story line matched that for the appropriate worked-out example to problems for which the story line was novel (i.e., did not match the story line for any of the examples). As expected, accuracy was higher for the test problems with matching story lines than for those with novel story lines, with means of 77% and 43% correct, respectively.

Positive transfer in the absence of surface similarity. A second hypothesis is that when the source and target problems share structural features only, analogical transfer should be more common among experts than novices, because experts will be much more likely to retrieve the source problem. The results of a study by Novick (1988a, Experiment 1) using complex arithmetic story problems support this prediction. The more proficient subjects had MSAT scores between 690 and 770, inclusive, with a mean of 729. The less proficient subjects had scores between 500 and 650, inclusive, with a mean of 603. The two levels of expertise were crossed with presence or absence of a structurally-similar, but superficially-dissimilar, source problem: In the source-present condition, subjects received the source problem as the second problem in a set of three initial problems. The other problems were unrelated to the target problem. In the source-absent condition, the source problem was replaced by a filler problem that was unrelated to the target problem. The target problem was the marching band problem described earlier. The source problem was:

Mr. and Mrs. Renshaw were planning how to arrange vegetable plants in their new garden. They agreed on the total number of plants to buy, but not on how many of each kind to get. Mr. Renshaw wanted to have a few kinds of vegetables and ten of each kind. Mrs. Renshaw wanted more different kinds of vegetables, so she suggested having only four of each kind. Mr. Renshaw didn't like that because if some of the plants died, there wouldn't
be very many left of each kind. So they agreed to have five of each vegetable. But then their daughter pointed out that there was room in the garden for two more plants, although then there wouldn't be the same number of each kind of vegetable. To remedy this, she suggested buying six of each vegetable. Everyone was satisfied with this plan. Given this information, what is the fewest number of vegetable plants the Renshaws could have in their garden?

Subjects were taught to solve this problem using a procedure based on finding the lowest common multiple (LCM) of several numbers:

Since at the beginning Mr. and Mrs. Renshaw agree on the total number of plants to buy, 10, 4, and 5 must all go evenly into that number, whatever it is. Thus the first thing to do is to find the smallest number that is evenly divisible by those 3 numbers, which is 20. So the original number of vegetable plants the Renshaws were thinking of buying could be any multiple of 20 (that is, 20 or 40 or 60 or 80 etc.). But then they decide to buy 2 additional plants, that they hadn't been planning to buy originally, so the total number of plants they actually end up buying must be 2 more than the multiples of 20 listed above (that is, 22 or 42 or 62 or 82 etc.). This means that 10, 4, and 5 will now no longer go evenly into the total number of plants. Finally, the problem states that they agree to buy 6 of each vegetable, so the total number of plants must be evenly divisible by 6. The smallest total number of plants that is evenly divisible by 6 is 42, so that's the answer.

Because analogical transfer is defined as transferring the solution procedure from a source problem to a target problem, the dependent measure in this study was the rate of use of the LCM procedure to solve the marching band problem. (See Novick & Holyoak, 1991, for an in-depth discussion of the merits of using this measure as opposed to other measures such as solution time or accuracy.) In the absence of the source problem, only 6% of the subjects in each expertise group used the LCM procedure to solve the target problem. The less proficient subjects showed no evidence of analogical transfer, as their performance in the source-present condition was identical to their performance in the source-absent condition. In contrast, the more proficient subjects were quite likely to retrieve and use the source problem when it was present, with 56% of them using the LCM procedure to solve the target problem in that condition.
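The skeleton of the taught LCM procedure can be summarized in a few lines. The following sketch is my own illustration, not part of the instructional materials (the function and variable names are mine):

    from math import gcd
    from functools import reduce

    def lcm(*numbers):
        # Lowest common multiple of several numbers, via pairwise gcd.
        return reduce(lambda a, b: a * b // gcd(a, b), numbers)

    # Garden problem: totals divisible by 10, 4, and 5, plus 2 extra
    # plants, with the final total evenly divisible by 6.
    base = lcm(10, 4, 5)                                # 20
    candidates = (base * k + 2 for k in range(1, 100))  # 22, 42, 62, ...
    print(next(n for n in candidates if n % 6 == 0))    # 42

Transfer to the band problem amounts to re-instantiating this skeleton with base = lcm(12, 8, 3) = 24, an offset of 1, and a divisibility test by 5; that yields the candidates 25, 145, ..., of which only 145 falls within the band problem's stated range.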
The explanation given for this large expertise difference in transfer was an expertise difference in the likelihood of retrieving the structurally-similar but superficially-dissimilar source problem. Although this explanation is consistent with the results, and in fact was the basis for predicting the observed results, it is logically possible that the two expertise groups differed in their ability to use the source problem once retrieved rather than in their ability to retrieve that problem. That is, perhaps the two groups of subjects retrieved the source problem equally often, but the more proficient subjects were more successful than the less proficient subjects at executing the post-retrieval processes of mapping and adaptation.

This alternative explanation can be evaluated by comparing less proficient subjects' rates of using the LCM procedure to solve the target problem when (a) they must retrieve the source problem on their own (as in the source-present condition just described) and (b) spontaneous retrieval is not required because they are told to use the source problem to help them solve the target problem. In the first case, successful transfer requires that subjects execute correctly the retrieval, mapping, and adaptation processes. In the second case, subjects only have to execute the mapping and adaptation processes, because the correct source problem is retrieved for them. According to the alternative hypothesis, the transfer rate should be the same in these two situations because the less proficient subjects are able to retrieve the source problem on their own in the first situation; that is, in both situations the major factor limiting the performance of the less proficient subjects is difficulty in executing the mapping and/or adaptation processes.

Although I have not conducted a study to test this hypothesis directly, it is possible to get data relevant to evaluating it from the study just described and from the two studies in Novick and Holyoak (1991). In one condition of those later experiments, the source-present condition from the earlier experiment was replicated. In other conditions, subjects were told to use the source problem to help them solve the target problem. Subjects in Novick and Holyoak's experiments were not preselected on the basis of their MSAT scores, so I recomputed the rates of successful LCM use for the marching band problem for those subjects in the relevant conditions whose MSAT scores were in the same range as that exhibited by the less proficient subjects in the study reported above. Combining the data from appropriate conditions, the following results are obtained. In the absence of the garden source problem (N = 16), 6% of the subjects used the LCM procedure to solve the band target problem. When the source problem was present but subjects were not told to use it (N = 22), 9% of the subjects solved the band problem using the LCM procedure. Finally, when subjects were given a hint to use the source problem (N = 41), 39% were successful.

The fact that performance in the source-present, no-hint condition is comparable to that in the source-absent condition (G²(1, N = 38) = 0.11, p > .50) and significantly lower than that in the source-present, retrieval-hint condition
(G²(1, N = 63) = 7.13, p < .01) indicates that an important factor limiting analogical transfer among less proficient subjects in the no-hint condition is the inability to retrieve the source problem, as originally predicted. Put another way, the fact that telling less proficient subjects to use the source problem greatly improves their performance implies that the retrieval process is an important source of transfer difficulty for those problem solvers.

Negative transfer as a function of expertise. A third hypothesis is that novices should be more likely than experts to show negative transfer when the source and target problems share misleading surface similarity cues. I provided an initial test of this hypothesis using the band target problem (Novick, 1988a, Experiment 2). A new source problem, which I referred to as a distractor problem, was constructed that shared surface but not structural features with the band problem:

Two assistant deans were planning how to seat the recipients of the University Service Award on the auditorium stage. They couldn't figure out how many award recipients to put in each row. The first assistant dean wanted to put nine people in each row, but with that plan there would be one person left over. So the second assistant dean suggested seating the award recipients in columns of six. But the first dean remarked that this was the same as his seating arrangement so one person would still be left over. Their next idea was to put four people in each row, but that was no good either because then three people would be left over. At that point the dean walked in and told them to put five people in each row. This arrangement was good because there wouldn't be any award recipients left over. Given that there were at least 20 but fewer than 120 award recipients, how many people did the assistant deans have to seat?

Subjects were taught the following procedure for solving this problem:

In the first seating arrangement, each row has 9 award recipients in it. However, there is one person left over. In the second seating arrangement, each column has 6 award recipients in it. Since this is the same as the first seating arrangement, there must have been 9 x 6 = 54 award recipients in the regular part of the seating arrangement. To get the total number of award recipients, you just need to add 1 to this number for the one extra person who was left over. So the answer is 55. Note that it is between 20 and 120 and that 55 divided by 4 (rows of 4) leaves a remainder of 3. Finally, as suggested in the problem, 55 is evenly divisible by 5.

This row/column multiplication procedure can be applied to the band problem (e.g., by computing (12 x 8) + 1 = 97), but using this procedure will not yield the correct answer.
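The failure is easy to verify against the band problem's own constraints. The brief sketch below is my own check, not part of the experimental materials:

    # Adapting the award procedure to the band problem's numbers:
    candidate = 12 * 8 + 1                                # 97
    # Rows of 12, columns of 8, and rows of 3 do leave Andrew alone...
    print(candidate % 12, candidate % 8, candidate % 3)   # 1 1 1
    # ...but rows of 5 do not come out even, so 97 cannot be correct.
    print(candidate % 5)                                  # 2

Notably, 97 happens to satisfy the three remainder-of-one constraints, which may be part of what makes the adapted procedure so seductive; it fails only the rows-of-five constraint (the enumeration given earlier yields 145 instead).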
Similar to the positive-transfer study, expertise level was crossed with presence versus absence of the distractor problem. The more proficient subjects had MSAT scores from 680 to 770, inclusive, with a mean of 711. The less proficient subjects had scores from 500 to 580, inclusive, with a mean of 548. Baseline use of the incorrect row/column multiplication procedure in attempting to solve the target problem was 12% in the distractor-absent condition for both expertise groups. Use of this procedure jumped to 87% in the distractor-present condition, with no difference as a function of expertise, thereby indicating a large negative transfer effect for both the less and more proficient subjects. That is, subjects in both groups were equally likely to retrieve the distractor problem and transfer its solution procedure to the target problem.

Why did the results of this experiment fail to show differential rates of negative transfer as a function of expertise? One explanation rests on the observation that in the real world surface and structural features are correlated (Medin & Ortony, 1989). That is, two problems that seem similar on the surface are likely to be solved similarly. Moreover, it seems likely that more proficient algebraists would be aware of this correlation. Given the high degree of surface similarity between the distractor and target problems in the previous experiment, and given the ease of applying the distractor problem's solution procedure to the target problem, the more proficient subjects had little to lose, and potentially much to gain, by trying to use that procedure. This analysis suggests that a more challenging situation in which surface and structural similarity are pitted against each other might be required to discriminate subjects at different levels of expertise in terms of their susceptibility to negative transfer.

Accordingly, in a third experiment (Novick, 1988a) all subjects received both the structurally-similar (but superficially-dissimilar) garden source problem and the award distractor problem. Performance on the band target problem was scored for both positive transfer (successful use of the LCM procedure) and negative transfer (use of the row/column multiplication procedure). The more proficient subjects had MSAT scores from 700 to 770, inclusive, with a mean of 731. The less proficient subjects had scores from 460 to 620, inclusive, with a mean of 560. Replicating Experiment 1, positive transfer was greater for the more proficient than the less proficient subjects, with 54% versus 25% use of the LCM procedure, respectively. Supporting the negative transfer hypothesis, the less proficient subjects were much more likely than the more proficient subjects to use the row/column multiplication procedure (73% versus 46%, respectively).
Mapping
The results concerning positive transfer reviewed in the previous section indicate that expertise plays an important role in the retrieval of an appropriate source problem that shares few if any surface features with the target problem. This effect presumably is mediated by expertise differences in how problems are represented. The results of the two experiments on negative transfer, however, seem more consistent with an expertise effect on mapping. When the distractor problem was presented alone, both less and more proficient algebraists were highly (and equally) likely to retrieve and use that problem, presumably because of its high surface resemblance to the band problem. Given this finding, the negative transfer difference observed when surface and structural similarity were pitted against each other can be interpreted as indicating that both groups of subjects retrieved the award distractor problem, but the more proficient subjects were better able to reject the award/band mapping as insufficient for transfer because they could compare it to the mapping between the band problem and the structurally-similar but superficially-dissimilar garden problem.

Stronger evidence concerning the contribution of expertise to the successful execution of the mapping process requires a methodology in which success at mapping can be assessed directly. Accordingly, I used a modified version of a task devised by Reed (1987, Experiment 4) in which undergraduates were given elements from one algebra word problem, and their task was to identify the corresponding elements from a related word problem. The problems either were taken verbatim from Reed's work or were modified versions of his problems. The 31 subjects were preselected to fit a uniform distribution of MSAT scores from 480 to 760, inclusive. The average score was 634. Subjects received two pairs of mixture problems and two pairs of work problems, with the order of the pairs counterbalanced across subjects. For each problem type, the problems in one pair were structurally identical, whereas those in the other pair were only similar structurally. All pairs of problems were superficially dissimilar.

Subjects received a three-page booklet for each pair of problems. The first page showed how to solve the source problem, using an instructional format modelled after that devised by Reed (1987). The second page presented a mapping task: Subjects were given a list of elements from the target (test) problem, and their task was to identify the corresponding elements from the source (example) problem. The first corresponding element was provided for subjects. Table 1 shows the second page of the booklet for the isomorphic pair of mixture problems. The italicized elements in the column for the example problem are the answers to
be supplied by the subjects. The third page of the booklet presented an adaptation task that will be described in the next section.

Table 1. The Mapping Task (Booklet Page 2) for the Isomorphic Pair of Mixture Problems
Example Problem: A nurse mixes a 6% boric acid solution with a 12% boric acid solution. How many pints of each are needed to make 4.5 pints of an 8% boric acid solution? Let w represent the number of pints of the weaker (6%) boric acid solution that are used to make the combined (8%) boric acid solution.
This problem can be solved using the following equation: .06w + (.12)(4.5 - w) = (.08)(4.5)

Test Problem: A grocer mixes peanuts worth $1.65 a pound and almonds worth $2.10 a pound.
How many pounds of each are needed to make 30 pounds of a mixture worth $1.83 a pound? Let p represent the number of pounds of peanuts that are used to make the mixed nuts.

Matching Instructions: The table below lists several elements from the test problem. For each, your task is to indicate the matching element from the example problem. The first matching element is provided for you: the peanuts in the test problem correspond to the weaker (6%) boric acid solution in the example problem.

Test Problem           Example Problem
peanuts                weaker (6%) boric acid solution
1.83                   .08 (or, 8%)
pounds                 pints
mixed nuts             combined (8%) boric acid solution
2.10                   .12 (or, 12%)
30                     4.5 pints
dollars per pound      amount of acid per pint of solution (or, % acid)
almonds                stronger (12%) boric acid solution
1.65                   .06 (or, 6%)
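For the isomorphic pair, adaptation reduces to substituting the mapped test numbers into the example equation, since both equations share the same linear form. The sketch below is my own illustration of that substitution (the function name is mine):

    def solve_mixture(a, b, total, value):
        # Solves a*x + b*(total - x) = value for x.
        return (value - b * total) / (a - b)

    # Example equation: .06w + (.12)(4.5 - w) = (.08)(4.5)
    print(solve_mixture(0.06, 0.12, 4.5, 0.08 * 4.5))  # ~3.0 pints of 6% solution
    # Mapped test equation: 1.65p + (2.10)(30 - p) = (1.83)(30)
    print(solve_mixture(1.65, 2.10, 30, 1.83 * 30))    # ~18.0 pounds of peanuts

Every number in the test equation comes directly from the correspondences in Table 1, which is what makes the isomorphic pair a relatively pure test of substitution.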
Replicating Reed (1987), subjects were very good at computing the mapping between the pairs of example and test problems. Collapsed across the four pairs of problems, subjects correctly mapped an average of 91% of the elements. Nevertheless, mapping success was highly correlated with algebraic expertise, r = .60, p < .001. If the subjects are divided into two groups based on whether their score on the mapping measure was above or below the mean, the average MSAT scores of the two groups differ by 85 points (means of 667 and 582, respectively).

Adaptation
Overview

Once the correct mapping of source and target elements has been computed, solvers must determine the implications of those correspondences, particularly the numerical correspondences (for arithmetic and algebra problems), for how to apply the source procedure to the target problem. This application process is what I referred to earlier as adaptation. Novick and Holyoak (1991) have shown that mapping and adaptation are separate components of analogical transfer in the solution of complex arithmetic word problems. In that paper, we proposed that the adaptations required for successful transfer of a mathematical procedure from one word problem to another can be classified into (a minimum of) three categories. These categories are sufficient to classify the adaptations required for analogical solution of the LCM arithmetic problems, as well as those required for the algebra problems reported in the previous section. It is likely that additional adaptation categories will need to be created as transfer is examined with a broader range of problems.

The three categories of adaptations considered here are: (a) substitution of the target numbers into the source operators (or equation) based on the numerical mapping between the source and target problems (Holyoak et al., in press, consider substitution to be part of a separate, and prior, pattern completion process), (b) creation of new target-problem elements that were not described in that problem but that must be mapped to elements in the source problem or solution for successful transfer to occur, and (c) generalizations of the source procedure that nevertheless preserve the essential structure of that procedure. Evidence on the contribution of algebraic expertise to the successful execution of the adaptation process will be presented in the next two sub-sections, for arithmetic and algebra word problems, respectively.
Results from studies of arithmetic word problems

Descriptive data. In contrast to mapping, at least some categories of adaptation appear to be quite difficult for subjects to execute successfully. Averaged across five test problems (including the band problem) in two experiments, Novick and Holyoak (1991) found that two-thirds of their undergraduate subjects attempted to adapt the LCM solution procedure from the garden source problem for use with a test problem. Of these subjects, 54% made errors. We determined the relative difficulty of the three types of adaptations by computing the percentage of all errors that fell into each category. At one extreme, substitution of the target numbers into the source operators was quite easy, accounting for only 3% of all errors. The ease of substitution makes sense. Stated informally, analogical transfer of an arithmetic procedure means performing the same operations on the numbers in the target problem that were executed on the corresponding numbers in the source problem. If the numerical correspondences are not known, then solvers probably will not attempt to adapt the source procedure for use with a test problem because any transfer attempt would seem doomed to fail. In contrast, if solvers do know the numerical mapping, then substitution of the target numbers for the source numbers in the source operators should not be difficult (also see Holyoak et al., in press, on this point). At the other extreme, the postulation of new test-problem elements in the course of adapting the four source operators was very difficult, accounting for 86% of all errors. The structure-preserving generalizations were more mixed in terms of difficulty. Four such generalizations were required for analogical solution of our test problems, and overall they accounted for 11% of all errors. However, three of the four generalizations rarely if ever led to errors. The fourth type of generalization accounted for 21% of the errors for one of the test problems (see Novick & Holyoak, 1991, for details), which was on par with the difficulty of adapting the individual operators.

Expertise effects. Recall that Aaronson and So (1990) found that poor problem solvers, compared to good problem solvers, spent a smaller percentage of their total reading time attending to the operations that specified what to do with the numbers. Because successful adaptation would seem to require understanding the relations between the numbers and the operators, this finding suggests that expertise may affect the success with which the adaptation process is executed. An experiment by Novick and Holyoak (1991, Experiment 2) provides data relevant to addressing this issue. Subjects were taught the LCM solution to the garden source problem. Then they solved two target problems. One was the band
problem. The other was an analogous problem that involved determining how many cookies a woman brought to a bake sale given information about what happened when she tried putting various numbers of cookies in each bag. Each target problem was accompanied by a hint that specified either the conceptual correspondences (e.g., the band members are like plants, the bags of cookies are like kinds of plants) or the numerical correspondences (e.g., the 5 in the band problem is like the 6 in the garden problem) between it and the source problem. The same type of hint was given for each target problem. The written solution protocols for both target problems were coded for attempts to adapt the source solution procedure, as well as for any errors made during the adaptation attempts.

Because subjects monitored their own progress toward solution, were free to attempt adaptation of the source solution procedure or not as they wished, and were free to abandon an adaptation attempt at any point, this experiment can provide information only on the contribution of expertise to the likelihood of (a) attempting adaptation and (b) completing an adaptation attempt successfully. It cannot provide a meaningful assessment of whether particular types of errors are associated with different levels of expertise, because not all subjects attempted to adapt each of the source operators, and therefore not all subjects had the opportunity to make all types of errors.

To assess the contribution of expertise to the likelihood of attempting to adapt the source procedure for use with the target problems (regardless of the eventual success of that process), I conducted a multiple-regression analysis in which the dependent measure was the number of target problems (0-2) for which subjects attempted adaptation. Because the experimental manipulation of type of mapping hint reliably affected transfer success (the number-mapping hint produced greater transfer), that predictor variable was included in the regression equation along with MSAT (mean MSAT = 625, with a range of 410-780; N = 129). Both predictor variables were reliably associated with the number of adaptation attempts: More expert subjects (β = .24, p < .01) and subjects who were told the numerical correspondences (β = .23, p < .01) were more likely to attempt adaptation. The latter finding is consistent with the argument given earlier that substituting the target numbers into the source operators should depend on knowing the numerical correspondences between the source and target problems. Although knowledge of the numerical mapping would seem to be necessary to begin the process of adapting an arithmetic procedure, it is not sufficient for successful transfer. Considering just the subjects who received the number-mapping hint (N = 64), algebraic expertise still was reliably associated with transfer, r = .43, p < .001.
To assess the contribution of expertise to the likelihood of completing an adaptation attempt successfully, a subsequent multiple-regression analysis considered only subjects who attempted adaptation on at least one of the two target problems (N = 105) and tried to predict whether an adaptation error of any kind was made. The independent variables again were MSAT and type of mapping hint. The probability of making an error in adapting the source procedure for use with a target problem increased with decreasing expertise (β = -.25, p < .01) but was unrelated to the type of mapping hint subjects received (β = -.04, p > .70). (The same results are obtained if the two target problems are analyzed separately.) Thus, more proficient subjects, compared to their less proficient peers, are both more likely to attempt adaptation and less error-prone in their adaptation attempts, even when the subjects are equated for knowledge of the numerical mapping.

Results from a study of algebra word problems
Because the study just described did not allow an analysis of whether expertise level affects the likelihood of making particular types of adaptation errors, I recently conducted a study of algebra problem solving using a methodology that enabled consideration of this issue. I am referring to the study reported in the section on mapping. Because subjects were required to provide either source or target elements that corresponded to given target or source elements, respectively, all subjects attempted to perform each of the three types of adaptations described earlier: (a) Substitution of the target numbers into the source operators (or equation), (b) creation of new target-problem elements that were not described in that problem, and (c) generalizations of the source procedure that nevertheless preserve the essential structure of that procedure. Experimental materials. As indicated earlier, the subjects received two pairs of mixture problems and two pairs of work problems. For the isomorphic mixture problems, all of the components of the test-problem equation could be generated by substituting numbers from the test problem into the example equation, according to the numerical mapping specified in Table 1. For the isomorphic work problems as well, the entire process of adaptation consisted of substitution. The pairs of similar mixture and work problems also provided data on success at substitution. In addition, the similar mixture problems provided data on the adaptation category of postulating new elements, and the similar work problems provided data on structure-preserving generalizations. Subjects’ performance on page three of the booklet for each pair of problems provided data on adaptation
success: Subjects received a list of elements from the example-problem equation, and they had to provide the corresponding test-problem elements.

The example problem for the similar pair of mixture problems was very similar to the example shown in Table 1. The equation given for solving it was .20w + (.30)(10 - w) = (.22)(10). The test problem was "A wholesale spice dealer mixes ground cinnamon worth $1.36 an ounce and ground nutmeg worth $1.81 an ounce. How many ounces of each are needed to fill a 9 ounce jar of mixed spices that sells for $14.04?" Because the price of a 9 oz jar of the mixture is given, rather than the price per ounce of the mixture, an appropriate equation is: 1.36c + (1.81)(9 - c) = 14.04. Three elements of this equation can be derived by substitution, namely 1.36c, 9 - c, and (1.81)(9 - c). Completion of the equation, however, requires the creation of new elements. Subjects' success at this type of adaptation was tested by two items in the experimental booklet. On page three, subjects were given .22 from the example problem and had to produce 14.04/9 as the corresponding test-problem element. On page two, subjects were given 14.04 from the test problem and were required to produce (.22)(10) as the corresponding example-problem element.

The example for the similar pair of work problems was "A small hose can fill a swimming pool in 10 hours and a large hose can fill the pool in 6 hours. How long will it take to fill the pool if both hoses are used at the same time?" The equation given for this problem was (1/10)h + (1/6)h = 1. The test problem was "It takes Alex 56 minutes to mow the lawn and it takes his older brother Dan 40 minutes to mow the lawn. Dan mowed half the lawn on Saturday. On Sunday the two boys work together to mow the other half of the lawn, but Dan starts 4 minutes after Alex. How long will each boy work on Sunday?" A correct equation for this problem is (1/56)m + (1/40)(m - 4) = 1/2. Solving this problem by analogy to the example requires execution of two structure-preserving generalizations. First, subjects must realize that the right-hand side of the example equation refers to the amount of the task completed by the two "workers" together, which need not be the entire task. This generalization is reflected by the example/test correspondence 1 = 1/2. Second, subjects must realize that the workers need not work the same amount of time. If Dan corresponds to the large hose, this generalization is reflected by the correspondence (1/6)h = (1/40)(m - 4). The remaining elements of the test-problem equation (1/40, 1/56, and (1/56)m) can be derived by substitution.
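As a check that the generalized test equation behaves as intended, both equations can be solved symbolically. The sketch below is my own verification (using the sympy library), not part of the booklet materials:

    from sympy import Rational, symbols, solve

    h, m = symbols('h m')

    # Example (hose) equation: (1/10)h + (1/6)h = 1
    print(solve(Rational(1, 10) * h + Rational(1, 6) * h - 1, h))
    # [15/4] -> both hoses together fill the pool in 3.75 hours

    # Test (lawn) equation: (1/56)m + (1/40)(m - 4) = 1/2
    print(solve(Rational(1, 56) * m + Rational(1, 40) * (m - 4)
                - Rational(1, 2), m))
    # [14] -> Alex works 14 minutes on Sunday

Alex thus works 14 minutes and Dan works 14 - 4 = 10 minutes; together they mow 14/56 + 10/40 = 1/4 + 1/4 = 1/2 of the lawn, as the generalized right-hand side requires.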
Results. I will consider the results for each adaptation category in turn. Replicating the results of Novick and Holyoak (1991), substitution of mapped numbers was quite easy, with an average of only 17% errors for the four pairs of problems. Reed, Dempster, and Ettinger (1985, Experiment 3) similarly found that only 11% of solutions to target algebra word problems involved substitution errors (what they called quantity errors). As suggested earlier, success at substitution was reliably associated with success at mapping the numbers in the example and test problems, r = .44, p < .02. Success at substitution also was reliably associated with algebraic expertise, r = .42, p < .02.

In contrast to substitution, the postulation of new elements was quite difficult, again replicating the results of Novick and Holyoak (1991): Subjects averaged only 13% correct for the appropriate two correspondences for the similar mixture problems. These problems were adapted from a pair used by Reed (1987) in his mapping study. Consistent with my results, none of Reed's subjects produced the correct test-problem element for the correspondence that required a new-element adaptation. Despite the difficulty of this type of adaptation, adaptation success in my study was reliably related to subjects' level of algebraic expertise, r = .41, p < .03.

Finally, let me consider the structure-preserving generalizations required for successful adaptation of the hose equation for the lawn-mowing problem. These adaptations also were difficult, although not as difficult as the new-element adaptations, as subjects averaged 31% correct. In this case, however, success at adaptation was not reliably associated with expertise, r = -.01, p > .94. It seems sensible that expertise would be more highly related to success at executing structure-violating than structure-preserving generalizations, although this finding clearly must be replicated.

Schema induction

The results discussed so far indicate that algebraic expertise plays an important role in determining the success of analogical problem solving with arithmetic and algebra story problems. The success of each component process of analogical transfer is determined in part by the solver's level of expertise. Thus, more proficient subjects are more likely to retrieve an appropriate source problem, better able to compute the mapping between the elements in the source and target problems, and more successful at adapting the source procedure for use in solving the target problem (although they are not necessarily better at performing all types of adaptations).

A fourth component of analogy use, mentioned earlier, is learning. An important benefit of analogical transfer is that it teaches solvers about the existence of a class of problems not previously known to be related (Bassok & Holyoak, 1989; Novick & Holyoak, 1991; Ross & Kennedy, 1990). That is, subjects who succeed at transfer often induce a more abstract schema encompassing the
specific source and target problems. Such learning is beneficial for several reasons. First, because the induced schema emphasizes the features common to the source and target problems, it must throw away many of the surface features of those problems while preserving their common structural features (Holyoak, 1984). Thus transfer-mediated schema induction may be one mechanism through which the important categories of problems possessed by experts are learned (Ross, 1989). Second, schema induction facilitates solution of subsequent analogous problems (e.g., Gick & Holyoak, 1983; Novick & Holyoak, 1991; Pirolli & Anderson, 1985). That is, subjects who have induced a schema encompassing an initial set of analogs are more likely to transfer the common solution to a later analog than are subjects who have not induced a schema.

Given these benefits of schema induction, it is important to consider whether the likelihood of inducing a schema as a result of analogical transfer is related to one's level of expertise. That is, given a group of students who (a) do not possess the schema in question prior to their exposure to the experimental problems, (b) successfully solved the target problem by transferring the procedure learned for the source problem, and (c) vary in their level of expertise, does expertise level predict the likelihood of schema induction? Different answers to this question have different implications for instruction. If expertise is a significant contributor to the success of schema induction, as well as to the success of each of the components of analogical transfer, then less proficient solvers are at a considerable disadvantage: Not only are they less able to adapt solution procedures learned earlier to fit the requirements of current problems, but even when they succeed at transfer they do not reap all of the benefits that often accrue from such success. In contrast, if schema induction is unrelated to expertise, then instruction can focus on retrieval, mapping, and adaptation, secure in the knowledge that the ensuing successful transfer is likely to lead to the desired learning.

The studies of analogical transfer with LCM problems conducted by Novick and Holyoak (1991) provide data relevant to this issue. Schema induction was assessed by asking subjects to describe the solution procedure common to the source and target problems immediately after solving the target. These descriptions were scored on a 3-point scale, with higher scores reflecting higher quality schemas for the common LCM procedure. The hypothesis that analogical transfer leads to schema induction has two implications (Novick & Holyoak, 1991): Schema quality should be (a) positively related to the strength of analogical transfer but (b) unrelated to successful solution by non-analogical means. (As indicated earlier, the LCM problems used by Novick and Holyoak also can be solved by other, less efficient methods.) Accordingly, measures of both analogical transfer and accuracy on the target
problem(s) were included as predictors of schema quality in a multiple-regression analysis. In Experiment 1, these predictors were based on performance on a single target problem (the band problem), whereas in Experiment 2 they were based on performance on two target problems (the band and bake-sale problems). To assess the contribution of algebraic expertise to the likelihood of schema induction, MSAT scores also were included as a predictor.

The two experiments yielded comparable results. Supporting the schema induction hypothesis, schema quality was reliably predicted by target transfer but not target accuracy (βs of .49 and .03, p < .01 and p > .82, respectively, N = 75, for Experiment 1; βs of .40 and -.17, p < .01 and p > .22, respectively, N = 129, for Experiment 2). Unlike the results for the component processes of analogical transfer, however, schema induction was not reliably associated with expertise in either experiment (βs of .07 and .11, p > .58 and p > .24, respectively, for Experiments 1 and 2).

As suggested earlier, the absence of an expertise effect for schema induction is important. Although less proficient algebraists are not as likely as more proficient algebraists to succeed at analogical transfer with these problems, when they do succeed they are just as likely as the more proficient subjects to benefit from that problem solving experience by inducing a solution schema that encompasses the source and target problems.

The relation between expertise and age differences
Overview

Almost all of the work discussed above has concerned problem solving in college students. There is nothing in the results of those studies, however, to suggest that the expertise effects should be restricted to young adults. Do the results extend downward to children? Moreover, can age differences in analogical transfer be interpreted as expertise differences? In line with recent theorizing by a number of developmental psychologists, I have argued recently that the answer to both questions is likely to be "yes" (Novick, 1991).

Structural models of development suggest that the analogical reasoning mechanism of young children is less sophisticated than that of older children and adults (Brown, 1989; Goswami, 1991). Indeed, whether measured by performance on the types of proportional analogies found on mental tests or by performance on story problem analogies, the ability to reason analogically historically has been thought not to emerge until the formal operational period of development (i.e., until adolescence). More recently, however, evidence for analogical competence
among preschoolers has cast doubt on the stage view of analogical development. The large individual differences among college students, who years earlier supposedly entered the formal operational stage, also are inconsistent with the traditional view of development. Accordingly, Brown (1989), Goswami (1991), and Vosniadou (1989) have proposed that age differences in performance on analogical reasoning tasks, when they occur, are due to younger children's deficient knowledge concerning the causal relations about which they are asked to reason. In other words, what develops is not the ability to engage in analogical reasoning, but the conceptual system upon which such reasoning must operate. For example, Goswami and Brown (1990) found that the age difference in performance on a set of proportional analogies for 3-, 4-, and 6-year-olds mimicked the age difference in understanding of the causal relations underlying those analogies.

Studies of analogical transfer in children have focussed on non-mathematical domains. Nevertheless, where relevant data exist in the developmental literature, there seems to be a reasonably close mapping between the age differences observed in those domains and the expertise effects found for the transfer of arithmetic and algebraic procedures. Moreover, transfer differences within an age group can be understood by reference to expertise differences: When young children are tested in domains for which they have considerable knowledge concerning the underlying causal relations, their performance resembles that of more expert adults; and when they are tested in domains about which they are naive, their performance resembles that of less expert adults. In the next sub-section, I describe commonalities in the results from the two literatures. It is somewhat difficult to isolate the effects from the developmental literature to specific component processes of transfer. Nevertheless, the results seem largely to pertain to the retrieval process and to some extent the mapping process as well. Although there are very few data concerning execution of the adaptation process in children, Gholson, Morgan, and Dattel (1990) have speculated recently that as children increase their expertise in a domain, they may acquire more facile strategies for executing the adaptation process of analogical transfer.

A comparison of results in the two literatures
I argued earlier that when the source and target problems share structural features but not surface features, analogical transfer should be more common among experts than novices, because experts are more likely to include the features in their representations of the target problem that will enable them to
retrieve the source problem from memory. The results of two experiments by Novick (1988a) supported this hypothesis for more and less proficient algebraists. Several studies in the developmental literature can be interpreted as yielding similar results. Brown, Kane, and Echols (1986) had 4- to 5-year-olds recall the source story before solving the target problem. If children's recalls are taken as indicators of their source representations, then recalls that emphasize the story's goal structure reflect more expert encoding of the story than do recalls that focus on the narrative details. Mirroring my findings, the transfer rates for children who produced these two kinds of recalls were approximately 80% and 20%, respectively. Chen and Daehler (1989) similarly found that the percentage of 6-year-olds attempting to transfer a source solution to a target problem was higher for children whose representations of the two source problems included an abstract description of their common solution principle (100% vs. 50%, respectively). In fact, the children who possessed less sophisticated source representations failed to show transfer at all, as their performance on the target problem was indistinguishable from that of control subjects who did not receive the source problems. Finally, Gholson et al. (1990) reported similar results for preschoolers through sixth-graders solving variations of the missionaries and cannibals river-crossing problem. At each age level, what predicted the strength of transfer was the quality of understanding of the underlying structure of the source problems. I also hypothesized that novices should be more likely than experts to show negative transfer when the source problem is superficially-similar but structurally-dissimilar to the target problem. The results of an experiment in which more and less proficient algebraists received such a distractor source problem and a remote analog (a source that was structurally-similar but superficially-dissimilar to the target problem) supported this hypothesis (Novick, 1988a). If the results are interpreted as indicating that when appropriate structural information is present in a transfer situation, less expert subjects have greater difficulty than their more expert peers in ignoring misleading surface similarity, then a link might be made between my study and one conducted by Gentner and Toupin (1986). Those researchers found that when relevant structural information was provided (in what they called their "systematic" condition), 4- to 6-year-olds but not 8- to 10-year-olds were adversely affected by a cross-mapping manipulation in which similar characters played different roles (rather than the same roles) in the source and target stories. Because understanding the causal structure of the stories required understanding sophisticated emotions and motivations such as jealousy and greed (Goswami, 1991), it is possible that the younger children were
akin to novices in terms of their knowledge bases. This link between the results of the two studies must be considered tentative, because Gentner and Toupin did not test the children's knowledge of the causal relations used in the stories. Chen and Daehler (1989) also examined negative transfer in children. Unfortunately for the hypothesis considered here, however, they did not manipulate (or assess) either age or expertise. In one condition of their experiment, 6-year-olds received two source problems that had minimal structural and surface similarities with the target problem. Because the children either induced or were taught the common solution principle for these problems, it is reasonable to expect that they had a fairly good representation of them. A moderate amount of negative transfer was observed: 46% of the children tried to apply the ineffective procedure to the target problem, compared to 17% of children using that procedure in the control condition. In another condition, subjects received the two inappropriate source problems but were not taught the solution principle. Presumably some children induced the principle and some did not, as in the condition reported earlier in which the source problems were appropriate, although those data were not reported. Fifty-eight percent of the children in the inappropriate-source condition applied the ineffective procedure to the target problem. In order to relate these findings to those reported in the expertise literature, future research must assess either (a) the quality of 6-year-olds' representations of the source and target problems or (b) the performance of older and younger children. If the view being espoused here is correct, then the rate of negative transfer should vary inversely with representation quality and with age.

Discussion

Summary of findings
This chapter has considered the contribution of algebraic expertise to the successful transfer of an arithmetic or algebraic procedure from one word problem to another. To facilitate consideration of this issue, analogical transfer was divided into three component processes -- retrieval, mapping, and adaptation -- and one resulting process -- schema induction. Algebraic expertise was found to affect the success of each component process, but not the resulting process. The role of expertise in retrieving an appropriate source problem appears to be mediated by differences in the types of memory retrieval cues available to solvers at different levels of expertise due to differences in their problem representations. In particular, more proficient algebraists are much more likely than less proficient
algebraists to include the structural features in their representations of the target problem that are required for retrieving an analogous but superficially-dissimilar source problem from memory. Once an appropriate source problem has been retrieved, students at all levels of algebraic proficiency are quite good at determining the mapping between the elements of the source and target problems. Nevertheless, mapping success increases with increasing expertise. The mapping is used to determine how to substitute the target elements into the source operators and how to adapt those operators to account for the unique aspects of the target problem (i.e., those not shared with the source problem) -- in other words, how to apply the source solution to the target problem. Level of expertise is positively related to attempting to use the source solution in solving the target problem and negatively related to making an error during the attempt. More specifically, success at substitution increases with increasing expertise, despite subjects' generally high scores for that sub-process. Novick and Holyoak (1991) distinguished two additional types of adaptations required for the analogical solution of arithmetic and algebra story problems, both of which are considerably more difficult than substitution: (a) Creation of new target-problem elements that were not described in that problem but that must be mapped to elements in the source problem or solution for successful transfer to occur and (b) generalizations of the source procedure that preserve the essential structure of that procedure. The success of only the first of these types of adaptations is reliably predicted (in a positive manner) by level of expertise. Although this difference in the results for the two types of adaptations still needs to be replicated, it makes sense for expertise to be more highly related to success at executing structure-violating as opposed to structure-preserving generalizations. In contrast to the importance of domain expertise for executing the processes underlying analogical transfer of an arithmetic or algebraic procedure, it appears to have little if any effect on what is learned as a result of successful transfer. Although the likelihood of succeeding at analogical transfer in this domain increases with expertise, over a wide range of algebraic proficiency subjects seem equally able to benefit from successful transfer by inducing an abstract schema encompassing the source and target problems. This is important, because schema induction facilitates the solution of subsequent analogous problems and may also contribute to increasing one's level of expertise. Recently, many developmental psychologists have proposed an analogy between young children and (adult) novices. Consistent with this hypothesis, the expertise-based differences in the success of executing the component processes of analogical transfer are mirrored by the transfer differences observed for
children at different ages, with younger and older children behaving more like novices and experts, respectively. In addition, transfer differences within an age group can be understood by reference to expertise differences.

Directions for future research
The findings discussed in this chapter suggest four directions for future research on the contribution of expertise to analogical transfer. First, it is important to gain a better understanding of the role of expertise in the successful execution of the adaptation process, because most situations in which procedural transfer might be used probably involve source and target problems that are not completely analogous. An initial step in this direction should be to try to replicate the suggestion in the data reported here that expertise contributes more highly to predicting the successful execution of structure-violating than structure-preserving generalizations. Subsequent research might then be directed toward building a more sophisticated model of the types of adaptations required for arithmetic and algebra word problems and determining precisely how expertise fits into this model. Second, it will be important in terms of constructing a general theory of expertise to extend the present findings, and any new findings concerning the adaptation process, to other areas of mathematics, as well as to non-mathematical domains such as anthropology, literature, or history. Does domain expertise have the same kinds of effects on analogical transfer in other areas of mathematics and in non-mathematical domains as it does in the domain of arithmetic and algebra story problems? In what ways is the contribution of expertise to success at analogical transfer similar across domains, and in what ways does it differ? Third, the relation between expertise effects and developmental differences in analogical problem solving should be pursued, in mathematical as well as non-mathematical domains. The distinction between mapping and adaptation is new in the adult literature, and it is only beginning to be considered in the developmental literature. The process of schema induction as a result of analogical transfer also has not been studied extensively in children. The expertise results reported here for these processes enable specific predictions to be made concerning children's transfer performance. Tests of these predictions will provide evidence on the validity of, and the constraints on, the analogy between young children and novices. This line of research also is likely to yield important theoretical and practical insights into both expertise and mathematical problem solving.
Finally, Brown (1989) has argued that the age differences in analogical transfer that remain after knowledge levels have been equated might be attributable to greater metacognitive competence among the older children. Is it correspondingly the case that the advantage of adult experts over novices in procedural transfer is in part due to greater metacognitive competence of the experts? The metacognitive and self-regulatory superiority of experts, compared to novices, at non-analogical problem solving within their domain of expertise has been noted previously (e.g., Glaser, 1987). I am not aware of any research on metacognition as related to transfer. This would seem to be a fruitful direction for future research, both in adults and developmentally (also see Brown, 1989).
Implications for instruction

Obviously, it is impossible to solve a target arithmetic or algebra word problem by analogical transfer unless a useful source problem has been retrieved. Because the success of the retrieval component of transfer depends on the nature of the problem representations constructed for the target problem and for problems in memory, and because teachers are not always available to provide a correct analog for students, it would seem useful to devote a considerable amount of time to teaching representation skills. It would also seem desirable to teach the strategy of looking for analogous problems (see Brown & Kane, 1988). Although expertise also contributes to the successful execution of the mapping process, less effort probably can be devoted to instruction on this process because the overall accuracy level is quite high. For arithmetic and algebra story problems, perhaps the best advice a teacher can give is for students to concentrate on mapping the numbers in the source and target problems, because those are the target elements that must be substituted into the source operators (or equation) to generate an analogous solution for the target problem. In addition to problem representation, the skill that probably most needs to be addressed in instruction is adaptation. Less proficient algebraists often (a) do not attempt to adapt the source procedure to fit the requirements of the target problem, even when told to do so, (b) abandon their adaptation attempt prematurely, or (c) persevere but are unsuccessful. Instruction in executing this component of transfer might begin by considering the different types of adaptations, illustrated with clear examples, that might be encountered in the domain of arithmetic and algebra story problems. As with the mapping process, less effort probably can be expended on instruction pertaining to the schema induction process. Although performance is not at ceiling on this measure, it makes sense to begin a revision of the curriculum with
those skills in which overall performance is lowest and expertise-based differences are highest. In an ideal world, instruction would devote equal time to all components. In the real world, however, time and resources are limited -- hence the present emphasis on problem representation (because of its direct effect on the retrieval component of transfer) and adaptation, at the expense of mapping and schema induction.
ACKNOWLEDGEMENTS

Preparation of this manuscript was supported by a Spencer Fellowship from the National Academy of Education. The algebra study reported in the sections on mapping and adaptation was supported by a grant from the Dean's Office of Peabody College of Vanderbilt University. The section of the chapter on developmental differences is adapted from a symposium presentation made at the 1991 meeting of the Society for Research in Child Development. I would like to thank Jamie Campbell and Richard Mayer for their helpful comments on an early version of this chapter. Requests for reprints should be addressed to the author at Department of Psychology and Human Development, Box 512 Peabody, Vanderbilt University, Nashville, TN 37203 (or send e-mail to
[email protected]).

REFERENCES

Aaronson, D. & So, P.M. (1990, November). Reading and mathematical problem solving as interactive processes. Presented at the thirty-first meeting of the Psychonomic Society, New Orleans, LA.
Bassok, M. & Holyoak, K.J. (1989). Interdomain transfer between isomorphic topics in algebra and physics. Journal of Experimental Psychology: Learning, Memory, and Cognition, 15, 153-166.
Brown, A.L. (1989). Analogical learning and transfer: What develops? In S. Vosniadou & A. Ortony (Eds.), Similarity and analogical reasoning (pp. 369-412). NY: Cambridge University Press.
Brown, A.L. & Kane, M.J. (1988). Preschool children can learn to transfer: Learning to learn and learning from example. Cognitive Psychology, 20, 493-523.
Brown, A.L., Kane, M.J. & Echols, C.H. (1986). Young children's mental models determine analogical transfer across problems with a common goal structure. Cognitive Development, 1, 103-121.
Catrambone, R. & Holyoak, K.J. (1990). Learning subgoals and methods for solving probability problems. Memory & Cognition, 18, 593-603.
Chase, W.G. & Simon, H.A. (1973). Perception in chess. Cognitive Psychology, 4, 55-81.
Chen, Z. & Daehler, M.W. (1989). Positive and negative transfer in analogical problem solving by 6-year-old children. Cognitive Development, 4, 327-344.
Chi, M.T.H., Feltovich, P.J. & Glaser, R. (1981). Categorization and representation of physics problems by experts and novices. Cognitive Science, 5, 121-152.
College Entrance Examination Board. (1986). Taking the SAT. Princeton, NJ: Author.
Gentner, D. & Toupin, C. (1986). Systematicity and surface similarity in the development of analogy. Cognitive Science, 10, 277-300.
Gholson, B., Morgan, D. & Dattel, A.R. (1990). The development of analogical problem solving: Strategic processes in schema acquisition and transfer. In D.F. Bjorklund (Ed.), Children's strategies: Contemporary views of cognitive development. Hillsdale, NJ: Erlbaum.
Gick, M.L. & Holyoak, K.J. (1980). Analogical problem solving. Cognitive Psychology, 12, 306-355.
Gick, M.L. & Holyoak, K.J. (1983). Schema induction and analogical transfer. Cognitive Psychology, 15, 1-38.
Glaser, R. (1987). Thoughts on expertise. In C. Schooler & K.W. Schaie (Eds.), Cognitive functioning and social structure over the life course (pp. 81-94). Norwood, NJ: Ablex.
Goswami, U. (1991). Analogical reasoning: What develops? A review of research and theory. Child Development, 62, 1-22.
Goswami, U. & Brown, A.L. (1990). Melting chocolate and melting snowmen: Analogical reasoning and causal relations. Cognition, 35, 69-95.
Holyoak, K.J. (1984). Analogical thinking and human intelligence. In R.J. Sternberg (Ed.), Advances in the psychology of human intelligence (Vol. 2, pp. 199-230). Hillsdale, NJ: Erlbaum.
Holyoak, K.J. (1985). The pragmatics of analogical transfer. In G.H. Bower (Ed.), The psychology of learning and motivation (Vol. 19, pp. 59-87). NY: Academic Press.
Holyoak, K.J. (1991). Symbolic connectionism: Toward third-generation theories of expertise. In K.A. Ericsson & J. Smith (Eds.), Toward a general theory of expertise: Prospects and limits (pp. 301-335). NY: Cambridge University Press.
Holyoak, K.J., Novick, L.R. & Melz, E. (in press). Component processes in analogical transfer: Mapping, pattern completion, and adaptation. In K.J. Holyoak & J.A. Barnden (Eds.), Connectionist approaches to analogy, metaphor, and case-based reasoning. Norwood, NJ: Ablex.
Holyoak, K.J. & Thagard, P. (1989). Analogical mapping by constraint satisfaction. Cognitive Science, 13, 295-355.
Mayer, R.E. (1983). Thinking, problem solving, and cognition. NY: Freeman.
Medin, D.L. & Ortony, A. (1989). Psychological essentialism. In S. Vosniadou & A. Ortony (Eds.), Similarity and analogical reasoning (pp. 179-195). NY: Cambridge University Press.
Messick, S. (1980). The effectiveness of coaching for the SAT: Review and reanalysis of research from the fifties to the FTC. Princeton, NJ: Educational Testing Service.
Novick, L.R. (1988a). Analogical transfer, problem similarity, and expertise. Journal of Experimental Psychology: Learning, Memory, and Cognition, 14, 510-520.
Novick, L.R. (1988b). Analogical transfer: Processes and individual differences. In D.H. Helman (Ed.), Analogical reasoning (pp. 125-145). Dordrecht, The Netherlands: Kluwer Academic Publishers.
Novick, L.R. (1990). Representational transfer in problem solving. Psychological Science, 1, 128-132.
Novick, L.R. (1991). The role of expertise in analogical problem solving. In J. Gallini (Chair), Analogical problem solving: The mechanisms underlying what develops. Symposium conducted at the Annual Meeting of the Society for Research in Child Development, Seattle, WA.
Novick, L.R. & Holyoak, K.J. (1991). Mathematical problem solving by analogy. Journal of Experimental Psychology: Learning, Memory, and Cognition, 17, 398-415.
Pirolli, P.L. & Anderson, J.R. (1985). The role of learning from examples in the acquisition of recursive programming skills. Canadian Journal of Psychology, 39, 240-272.
Reed, S.K. (1987). A structure-mapping model for word problems. Journal of Experimental Psychology: Learning, Memory, and Cognition, 13, 124-139.
Reed, S.K., Dempster, A. & Ettinger, M. (1985). Usefulness of analogous solutions for solving algebra word problems. Journal of Experimental Psychology: Learning, Memory, and Cognition, 11, 106-125.
Ross, B.H. (1984). Remindings and their effects in learning a cognitive skill. Cognitive Psychology, 16, 371-416.
Ross, B.H. (1989). Remindings in learning and instruction. In S. Vosniadou & A. Ortony (Eds.), Similarity and analogical reasoning (pp. 438-469). Cambridge: Cambridge University Press.
Ross, B.H. & Kennedy, P.T. (1990). Generalizing from the use of earlier examples in problem solving. Journal of Experimental Psychology: Learning, Memory, and Cognition, 16, 42-55.
Schoenfeld, A.H. & Herrmann, D.J. (1982). Problem perception and knowledge structure in expert and novice mathematical problem solvers. Journal of Experimental Psychology: Learning, Memory, and Cognition, 8, 484-494.
Silver, E.A. (1981). Recall of mathematical problem information: Solving related problems. Journal for Research in Mathematics Education, 12, 54-64.
Vosniadou, S. (1989). Analogical reasoning as a mechanism in knowledge acquisition: A developmental perspective. In S. Vosniadou & A. Ortony (Eds.), Similarity and analogical reasoning (pp. 413-437). NY: Cambridge University Press.
Chapter 6

THE DEVELOPMENT OF SKILL IN MENTAL ARITHMETIC: AN INDIVIDUAL DIFFERENCES PERSPECTIVE
Keith F. Widaman University of California at Riverside
Todd D. Little Max Planck Institute for Human Development and Education, Berlin
Summary
In this chapter, we attempt first to describe the different emphases inherent in traditional experimental and individual differences approaches. Next, we discuss the types of cognitive models that have been proposed to account for mental arithmetic performance. In doing so, the nature of the basic theoretical propositions (e.g., underlying the retrieval of answers from a memory store) and the proper means of testing these propositions will be stressed. Then, we survey the findings of several studies, highlighting recently published findings on individual and developmental differences in the skills underlying proficiency in mental addition. One unique contribution of an individual differences approach is the external validation of proposed models specifying the elementary cognitive processes invoked during problem solution. External validation requires the fitting of statistical models to data from each individual, resulting in the estimation of parameters that reflect speed of executing each of various cognitive processes, or elementary cognitive components. Then, individual differences in these parameter estimates, or component scores, are related to individual differences on traditional measures of ability and achievement. Various implications of our findings for research on numerical facility and for applications in education are also discussed.
Introduction

The skills involved in simple forms of arithmetic--addition, subtraction, multiplication, and division--undergo profound changes during childhood and adolescence. This should come as no surprise to parents, to teachers, or to researchers of developmental processes. However, deciding how to portray these changes represents a major challenge to researchers and educators, and, perhaps, to parents as well. As will become apparent as this chapter unfolds, the development of simple forms of arithmetic takes on many forms--emergence of new skills, transitions from one form of problem solution to another, and the systematization, or automatization, of these and other skills. The key task of researchers of numerical processing is to provide a complete and coherent characterization of the manner in which individuals learn the various skills underlying arithmetic as well as the ways in which these skills change or develop as a function of age. The central thrust of the present chapter is to concentrate on the influence that an individual differences perspective has on the hypotheses tested and theories developed in a domain of inquiry, focusing here on mental arithmetic in general and mental addition in particular. Foreshadowing later sections, we claim that many questions of core theoretical issues in cognitive psychology are answerable only if a concern for individual differences is reflected in the formulation of hypotheses, the design of research studies, the analyses of data, and the framing of conclusions. Before proceeding further, we should consider first a relevant question: what is meant by an "individual differences perspective" when one is studying the development of psychological processes? One basic aspect of an individual differences approach is the importance of evaluating and representing the performance of individuals, rather than groups, on mental or cognitive tasks. Such an approach contrasts with the most common approach to investigating cognitive processes, which bases theory and analyses on group-level data. Using this more common approach, a set of problems is carefully designed; the problems are presented to a sample of subjects; the average performance of the sample of subjects (e.g., average reaction time [RT]) to each problem is obtained; and then models are fit to these average indices of performance to represent theoretical models and to contrast the fit of alternative models. This is a most useful exercise, and we have used such approaches in some of our own work. However, one may be misled in important ways about the manner in which individuals respond to problems if group-level data alone are analyzed. As Estes (1956) and others since (e.g., Underwood, 1975) have argued, certain models may fit average, or group,
data well, yet may misrepresent significantly the patterns in data from each individual comprising the group. An individual differences approach to theory and research advocates verifying at the level of individuals the appropriateness of models developed on average, or group, performance. A further way to explicate the meaning of an individual differences perspective is to identify what the term "individual differences" denotes. There are two basic types of individual differences: Inter-individual differences and intra-individual differences. Most commonly, the term "individual differences" refers to inter-individual differences, or differences between or among individuals. Inter-individual differences reflect variability that occurs among individuals, such as variability among persons on a single stimulus trial or at a single time of measurement, as well as across stimulus trials or measurement points. In contrast, intra-individual differences reflect variability that occurs within a single individual, such as variability across stimulus trials at a single time of measurement or variability (i.e., change) across times of measurement (Baltes & Nesselroade, 1979). One implication of the preceding distinctions is that inter-individual differences may be exhibited at a single time of measurement on a single measuring instrument, whereas some form of longitudinal or repeated measurements must be obtained to investigate intra-individual differences. That is, if individuals receive different scores on a single measuring instrument administered at a single point in time, then inter-individual differences are present. In contrast, studying intra-individual differences requires the collection of data across time (or situations, or problems), to represent the differences or changes in response by the individual as a function of the successive times (or situations, or problems). The time lags between measurements may be rather long, such as one- or five-year lags between measurements, or may be rather short, such as those required for tracking microanalytic changes that occur during practice on a task during a single session. Regardless, the focus of intra-individual difference studies is on differences or changes in individual performance. As will be made clear later, the development of mental addition skills subsumes changes in both strategy choice and in strategy or skill execution. Moreover, both of these types of change may be isolated at both the intra- and inter-individual levels. Foreshadowing our conclusions, we note here that a complete representation of the development of mental arithmetic, in terms of inter- and intra-individual differences in strategy choice and strategy execution, has not yet been achieved. In our own work, we have concentrated on certain aspects of this problem, primarily on inter-individual differences in strategy execution. Others have studied, for example, differences in strategy choice, while ignoring both inter-
and intra-individual differences. In the present chapter, we hope to present a framework for placing and interpreting previous work on mental arithmetic, a framework that will help organize previous research and provide a context for further research on the development of arithmetic skills. This framework should enable the identification of issues that previous research has failed to confront as well as unique and crucial research questions that arise only in situations in which data at the level of the individual subject are considered.
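To make the inter-/intra-individual distinction above concrete, the following sketch (Python; the data and all names are hypothetical) decomposes a subjects-by-occasions score matrix into the two kinds of variability:

    import numpy as np

    rng = np.random.default_rng(0)
    # Hypothetical scores: 30 subjects, each measured on 5 occasions.
    scores = rng.normal(loc=100.0, scale=15.0, size=(30, 5))

    # Inter-individual differences: variability among persons,
    # indexed here by the variance of the person means.
    inter_var = scores.mean(axis=1).var(ddof=1)

    # Intra-individual differences: variability within each person
    # across occasions, averaged over persons.
    intra_var = scores.var(axis=1, ddof=1).mean()

    print(inter_var, intra_var)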
What develops and who develops?

The focus of much experimental research on development is on cognitive processes that undergo change as a function of growth and aging. Indeed, the review of research on a given domain commonly is initiated with a title such as "What is the development of X?", where X may be replaced by constructs such as memory or intelligence, and the review consists of characterizing the development of the cognitive processes underlying behavioral phenomena within the given domain. We intend to show that there are at least two ways to construe the preceding generic question. One way of answering this question--the most typical way--embodies a focus on the psychological processes that emerge or change as a function of age. The second way of answering the question focuses instead on the individuals within whom the psychological processes are emerging or changing. The two ways of answering the generic question regarding "what develops" lead to systematically different research designs as well as to different forms of hypotheses tested, although there is useful overlap between the two approaches.

An orientation toward processes
The most common way in which the development of cognitive skills in a given domain is portrayed is to focus on the manner in which children of different ages perform on tasks in that domain. Research of this sort often is performed at the group level, averaging over subjects. Using this strategy, stimuli that require systematically varied cognitive processes are presented to subjects, and the responses by groups of subjects are represented by averages for each cell in a design. We have chosen to characterize this approach as an orientation toward processes. This approach reflects an orientation toward processes because the research questions pursued concern the cognitive processes required for problem solution and the way in which execution of these processes may change over time. In the
attempt to answer such questions, sets of stimuli or problems are carefully prepared so that contrasts between responses to different types of problems will reveal the effects of hypothesized cognitive processes. Consider two sets of problems, Set A and Set B. The contrast between Sets A and B may be of at least two basic types: (a) qualitative, or presence versus absence of hypothesized cognitive process x; and (b) quantitative, or the number of times that hypothesized cognitive process x is executed (e.g., problems in Set A require a single execution of process x, whereas problems in Set B require two executions). In the qualitative case, acquisition of cognitive process x will lead to systematic differences in responses to problem Sets A and B, whereas failure to acquire cognitive process x may lead to equivalent forms of response to Sets A and B. In the quantitative case, developmental change in performance (e.g., automatization of performance as a function of age) would lead to systematic changes in the difference in response to Sets A and B, such as a lessening of the difference in RT between the two sets of problems as a function of chronological age. In either case, the research questions are concerned with identifying the cognitive processes that are required for response to particular forms of problem, the order in which these processes are acquired, and the ways in which response execution changes as a function of development. To answer these questions, analyses of data in such studies often take one of two forms. First, researchers may obtain the average response (e.g., RT) to each problem for persons of each age. Analyses are performed separately by age group, with problems as the unit of analysis; contrasts across age levels are then made with regard to parameter estimates in the statistical models. Second, researchers may use a measure of average response (e.g., median RT) by each person in each condition. In the resulting analyses, individuals appear in the analyses, but are relegated to the error term. Using either of these two strategies, researchers, by dint of their experimental approaches, lose sight of the individuals in their studies, of the differences among individuals in the manner in which they respond to problems of particular types. To be sure, the questions posed at the group level are important ones, but the answers offered apply only to a loosely conceived "average" person and may not apply to any observable individual. Another, related observation on the orientation toward processes is the nomothetic stance (Brunswik, 1952) taken toward research and theory. The term nomothetic refers to the search for lawful relations that describe human behavior, lawful relations that are presumed to hold for all persons and that usually do not incorporate individual difference parameters that moderate the form of the relation for each person.
For example, a summary of research may conclude that human forgetting is best described by a power function, in which memory for learned facts is inversely and nonlinearly related to the time since the facts were learned (Anderson & Schooler, 1991; Wixted & Ebbesen, 1991). Although a power function may represent group-level data on forgetting better than do competing models, analyses are rarely, if ever, performed on individual-level data to determine whether individuals exhibit performance similar to that of the group. Rather, group-level trends are assumed to represent accurately lawful trends capturing the systematic behavior of persons, with individual differences representing small deviations about the group trend. The orientation toward processes therefore relegates concerns regarding individual differences to the status of untested presumptions, presumptions that become testable hypotheses using an individual differences approach.
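To fix notation for this example (the symbols here are ours, not those of the cited authors), the forgetting power function is commonly written as

    R(t) = a · t^(-b),    a, b > 0,

where R(t) is the amount retained after a delay of t. The nomothetic reading treats a and b as constants holding for everyone; an individual differences treatment would instead fit R_i(t) = a_i · t^(-b_i) to each person i and ask whether the person-specific parameters a_i and b_i in fact cluster tightly around the group values.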
An orientation toward persons

In contrast to the orientation toward processes, the developing individual may be placed at the focus of inquiry. We term such an approach an orientation toward persons. The design of such a study could well employ the same types of stimuli or problems, as well as the same form of sampling of persons at different age levels, as used for group-level investigations. Moreover, certain research questions pursued may be identical to those guiding process-oriented research: the identification of the cognitive processes that are required for response to particular forms of problems, the order in which these processes are acquired, and the ways in which response execution changes as a function of development. Hence, many of the ultimate goals of the two approaches are identical. However, the orientation toward persons has additional central goals: characterizing the cognitive processes underlying performance on a given task, and the acquisition and development of these processes, for each individual subject. These concerns are the hallmark of the orientation toward persons. Using data from individual subjects to estimate accurately the presence and effects of cognitive processes presumed to underlie response to different types of problems may require certain modifications in study design. The basic form of studies may remain the same (e.g., the number of cells in an experimental design), but a more extensive amount of data from each person might be required to ensure reliability or accuracy of the individual-level parameter estimates. Taking the usual orientation toward processes, a relatively small number of observations per cell per subject may be gathered, and the researcher may still have sufficient power to test theoretical propositions if the sample size is sufficiently large. In contrast, the orientation toward persons requires a sufficient amount of data from each individual so that cognitive models may be tested and contrasted using data
from each subject. This is possible, in most cases, only if fairly extensive data are obtained from each subject, although there are exceptions. For example, we (e.g., Widaman, Little, Geary, & Cormier, in press) have found that parameter estimates for the several cognitive operations underlying mental addition have adequate reliability when based on the amount of data on which most group-difference research is based. The key goal of research with an orientation toward persons is to understand or represent the lawfulness in the behavior or responses of each individual. In contrast to the nomothetic stance of the orientation toward processes, the orientation toward persons is often portrayed as using an idiographic approach to research (Brunswik, 1952). The term idiographic refers to research that attempts to describe or represent the behavior of the individual, with little regard for the necessity of generalizing across individuals. Biography is one exemplar of idiographic approaches, as the goal is to understand the person whose life story is being presented, knowing that both the story and the lessons learned from it may be unique. In a more scientific context, a researcher may find that different models of behavior hold for different persons; indeed, a model with a unique form may hold for each individual. An outcome of this sort would not be problematic for an investigator committed to the idiographic approach, as this would underscore the value of the approach. But, to the practicing scientist, attempting to understand behavior in terms of a relatively small set of organizing principles, such an outcome might be rather disconcerting. However, scientific study of human behavior that is oriented toward persons need not stop at the idiographic level. Rather, researchers should move on to a level that has been variously termed idionomothetic (Brunswik, 1952) or idiothetic (Lamiell, 1981). Such a term implies an interest in both the search for lawful behavioral relations and the concern for the ways in which these lawful relations hold for individuals. The start of scientific work using an idionomothetic approach may be at the group or the individual level, whichever leads to greatest progress. Perhaps the group level seems most likely to provide advances during initial forays into any content domain. However, later, more confirmatory research studies could well focus first at the individual level. Once the behavior of each individual is represented well, then the models of individual behavior may be aggregated in various ways to reflect the common lawfulness across persons in models for behavior. Ideally, the common lawfulness across persons will then exhibit interpretable patterns at the aggregated group level as well as showing meaningful patterns of change as a function of development.
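In code, the idionomothetic sequence just described--fit a model to each person, then aggregate the person-level parameters--might be sketched as follows (Python; the linear model, the simulated data, and all names are hypothetical placeholders for whatever task is under study):

    import numpy as np

    rng = np.random.default_rng(1)

    # Hypothetical trial-level data: for each of 20 subjects, a problem
    # characteristic x and a response time that is linear in x.
    subject_data = []
    for _ in range(20):
        x = rng.integers(0, 10, size=80)
        rt = 900 + 150 * x + rng.normal(0, 100, size=80)
        subject_data.append((x, rt))

    # Step 1 (idiographic): represent each individual's behavior with
    # that individual's own least-squares slope and intercept.
    params = [np.polyfit(x, rt, 1) for x, rt in subject_data]

    # Step 2 (nomothetic): aggregate across persons to recover the
    # common lawfulness, e.g., the mean and spread of the slopes.
    slopes = np.array([slope for slope, _ in params])
    print(slopes.mean(), slopes.std(ddof=1))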
Methodological implications of an individual differences approach
Several methodological concerns accompany individual differences approaches to research. First, investigations that focus on individual differences circumvent, at least in some senses, the need to present results of analyses conducted at the group level. As mentioned above, studies that highlight individual differences--employing an orientation toward persons--represent cognitive processes at the level of the individual subject. If individual-level results are acceptable (e.g., parameter estimates are reasonable), then group-level presentation would entail summaries of the individual-level results. That is, rather than performing analyses of group-level, or average, responses to each problem (e.g., Ashcraft & Battaglia, 1978; Geary, Widaman, & Little, 1986), results could be summarized by listing the mean and standard deviation (or median and inter-quartile range) across subjects of the estimates of each given parameter. The general pattern of findings would thus be conveyed quite adequately, and explicit evidence regarding the extent of individual differences on the parameters in question would be presented as well. On the other hand, some basic issues assessed at the group level may provide important, initial understanding of phenomena that deserve further attention at the individual level. For example, the types of processing model hypothesized for mental addition, originally substantiated on group-level data, represent the types of strategies that individuals are presumed to use when solving addition problems. Thus, the basic form of processing models for mental addition was based on group-level data. But, even with this admission, we argue that complete consideration of phenomena in any domain of research ultimately must involve analyses conducted at the level of the individual. Only in this fashion can one determine whether the models developed on group-level data reflect performance by individual subjects. Furthermore, an individual differences approach leads to a unique set of issues and problems that may be addressed. The concepts of internal and external validation (Sternberg, 1977) are useful for structuring our thinking about these matters. Internal validation involves verifying that one's theoretical model is consistent with data on a single experimental paradigm. For example, a researcher may posit a particular theoretical model of the manner in which persons respond to addition problems (see later sections); if the given theoretical model provides a better account of RT data than do competing models, then the study has provided differential internal validation for the best-fitting model. In contrast, external validation concerns the relations of individual differences in parameter estimates obtained from a particular paradigm with individual differences on
outside measures. In the realm of mental addition, correlations between parameters based on RT data to simple addition problems and scores on paper-and-pencil tests of similar abilities would provide evidence of external validity. Given the distinction between internal and external validity, typical group-level research on cognitive processes clearly reflects internal validation of proposed models. Experiments are designed to require the subject to engage in a determinate number of cognitive processes; the resulting data provide, or fail to provide, evidence of the hypothesized cognitive processes, evidence such as expected patterns in RT data. Presuming the presence of a well-replicated paradigm, let us say Paradigm A, researchers feel confident that the paradigm provides a useful look at one or more cognitive processes. But, there are always multiple ways to measure the cognitive processes hypothesized to underlie performance on a given paradigm. Assume that a researcher designs a new method of tapping into the processes invoked by Paradigm A, resulting in Paradigm A', a presumed alternative to Paradigm A. Now, to determine whether Paradigms A and A' provide interchangeable methods of assessing the effects of identical cognitive processes, researchers often attempt to determine whether Paradigms A and A' have similar effects on group-level performance, and whether identical or analogous modifications to Paradigms A and A' lead to parallel changes in group-level performance (cf. the notion of converging operations; Garner, Hake, & Eriksen, 1956). But, this is not a sufficient test. The two paradigms may indeed result in similar group trends in data, and similar modifications to the two paradigms may lead to similar changes in average performance. But, these essentially identical patterns of findings across paradigms may arise for different reasons. That is, identical group-level patterns across paradigms may arise from different mixtures of the use of several cognitive processes across individuals. One way of testing these ideas more directly is to use individual-level data and determine whether the modifications to Paradigms A and A' lead to consistent changes in parameter estimates across individuals. The preceding point is exemplified in a recent study by O'Neill (1992) of pop-out effects in a visual attention task. "Pop-out" refers to the experience of effortless and rapid identification of a visual target embedded within a field of visual distractors (Treisman & Gelade, 1980). Possession of a unique feature enables a target stimulus to "pop out" of the perceptual array of distractors; the set of stimuli with a "pop out" target is presumably processed in parallel and preattentively. In contrast, a target stimulus that does not pop out of the perceptual array lacks a feature that is unique from those of the distractors. Non-pop-out target stimuli are processed in a serial, attention-requiring fashion.
That is, attention must be directed in turn to each stimulus in the perceptual array, to determine if the target is present. Given the preceding distinctions between pop-out and non-pop-out targets, the RT to identify the presence of a pop-out target should be unaffected by the number of distractors in the perceptual array, whereas the RT to identify the presence of a non-pop-out target should be positively and linearly related to the number of distractors. The typical way of testing for pop-out effects is to employ an RT paradigm, presenting stimuli via tachistoscope or computer screen and contrasting the effects on RT of the number of distractors in pop-out and non-pop-out conditions. O'Neill (1992) employed a tachistoscope for this standard RT paradigm, but also developed a card-sort task that enabled a test of distractor effects across pop-out and non-pop-out card decks. The patterns in group-level data were quite similar across tasks: on both tasks, response times were relatively unaffected by the number of distractors in pop-out conditions, but were strongly and positively related to the number of distractors in the non-pop-out conditions. These results lead one to presume that the two tasks--tachistoscopic RT and card-sort versions--tap into common or isomorphic cognitive processes. However, when O'Neill computed parameter estimates at the individual level for both tasks, the across-task correlations between purportedly identical components were uniformly less than r = .20, despite high levels of reliability for the parameter estimates from each task. These results suggest either that rather different cognitive processes are tapped by the two tasks or that parameter estimates from the two paradigms are confounded by massive amounts of method variance (Campbell & Fiske, 1959). At the least, these findings imply that similarity across tasks in group-level trends in data is an insufficient basis for inferring that the tasks reflect common, isomorphic processes. To support the latter inference, strong patterns of convergent correlations across tasks using individual-level data would be required. The preceding study is a single example of the use of external validation, which involves the correlation of parameter estimates from a particular paradigm with outside variables. In the study of mental addition, few studies have explored external validation of parameter estimates associated with the cognitive processes underlying skill in mental addition (Cornet, Seron, Deloche, & Lories, 1988; but, see Geary & Widaman, 1987, 1992; Little & Widaman, 1992; Widaman et al., in press). However, such external validation will be a critical issue in future research on mental arithmetic, in general, as external validation provides evidence of commonalities across different types of mental arithmetic (e.g., addition, multiplication) as well as evidence regarding whether the hypothesized cognitive processes reflect the skills exhibited by school children in class and on achievement measures.
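The logic of this convergent-validity check can be made concrete with a short sketch (Python; the data generator and every name are hypothetical, and the simulation is deliberately built so that the two tasks do share a process, unlike O'Neill's result):

    import numpy as np

    rng = np.random.default_rng(2)

    def simulate_task(true_slopes, noise_sd):
        # Hypothetical trial data: (display sizes, RTs) per subject,
        # with RT linear in display size for non-pop-out search.
        data = []
        for s in true_slopes:
            n = rng.integers(1, 9, size=60)
            data.append((n, 500 + s * n + rng.normal(0, noise_sd, 60)))
        return data

    true_slopes = rng.normal(40.0, 10.0, size=25)  # 25 subjects
    tachistoscope_trials = simulate_task(true_slopes, 80.0)
    cardsort_trials = simulate_task(true_slopes, 80.0)

    def setsize_slope(n, rt):
        """Per-subject slope of RT on display size, the parameter
        presumed to index the serial search process in each task."""
        slope, _ = np.polyfit(n, rt, 1)
        return slope

    slopes_a = np.array([setsize_slope(n, rt) for n, rt in tachistoscope_trials])
    slopes_b = np.array([setsize_slope(n, rt) for n, rt in cardsort_trials])

    # Convergent validity: if the two tasks tap the same process, this
    # cross-task correlation should be high, not r < .20.
    print(np.corrcoef(slopes_a, slopes_b)[0, 1])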
We turn now to research on mental addition, discussing componential models for mental addition as well as the range of internal and external validational analyses that have been pursued in this domain. In the review that follows, we will attempt to emphasize the unique theoretical questions that may be posed and answered only through the consideration of individual differences in performance.

Componential models for mental addition

A central aspect of research on mental addition is the characterization of arithmetic skills within componential models for each type of mental operation, such as addition. A proposed componential model must specify the core processing components by which the correct answer to an arithmetic problem may be obtained. Further, a componential model must specify the full complement of cognitive operations that are presumed to underlie response to problems. That is, a componential model must place the various cognitive processes underlying performance within the context of an integrated model of the flow of processing as an individual responds to problems presented. To these issues we now turn.
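One common way to formalize such an integrated flow of processing--and the reading implicit in the regression models discussed below--is to write observed RT as the sum of the durations of sequentially executed component stages; in our notation (the stage labels are drawn from terms used later in this chapter, not from a specific published equation):

    RT = t(encode) + t(search/compute) + t(compare) + t(respond) + error,

where only t(search/compute) is assumed to vary with structural characteristics of the problem, the remaining components being absorbed into the intercept of the fitted model.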
Types of processing components for mental addition
The types of proposed cognitive components for arriving at the correct answer for mental addition problems can be divided into two broad categories: (a) computational or procedural-based components, characterized by subjects' reliance on accessing stored algorithms or rule-based strategies for solving addition problems, and (b) retrieval or declarative-based components, characterized by subjects' reliance on accessing stored facts (i.e., sums) associated with addition problems (Widaman, Geary, Cormier, & Little, 1989).

Procedural processes

Counting, or reconstructive, processes. The most well articulated computational process hypothesized for the search/compute stage involves a digital, counting process (Groen & Parkman, 1972; Parkman & Groen, 1971). The metaphor used in the conceptual model of Groen and Parkman is that of a mental counter. A mental counter operates much like the mechanical counter of a ticket taker: The counter increments in a unit-by-unit fashion, and the resulting sum of the increments is "read off" from a mental display. The mental counter proposed by Groen and Parkman also may be set to a specified value one time prior to the start of the incrementing process.
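The counter metaphor is simple enough to state directly in code; here is a minimal sketch (Python; the function name and the worked example are ours):

    def count_sum(start, increments):
        """Set-and-increment mental counter: set once to `start`, then
        increment unit by unit `increments` times; the sum is "read
        off" the final counter state."""
        counter = start                 # one-time setting of the counter
        for _ in range(increments):     # unit-by-unit incrementing
            counter += 1
        return counter

    # One of the five ways discussed below (the MIN model): to add
    # 3 + 8, set the counter to the larger addend and increment by the
    # smaller, so solution time grows with the number of loop passes.
    assert count_sum(start=8, increments=3) == 11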
Given the conceptual metaphor of a digital or counting-based solution process, Groen and Parkman stated five explicit ways in which a subject could use the "set and increment" mental counter to arrive at the correct answer for an addition problem; each model corresponds to a statistical model with a specific predictor variable that is linearly related to RT. Only one of these models, MIN, has received widespread empirical support (e.g., Cornet et al., 1988). The MIN model assumes that a subject selects the larger of the two addends to be the value to which the counter is initially set and that the subject then increments the counter a number of times equal to the smaller, minimum addend. For this model, RT is a linear function of the minimum addend (i.e., MIN). The estimated regression slope for the MIN structural variable is interpreted as reflecting the time associated with each increment of the mental counter. In the Groen and Parkman (1972) study, adults had an estimated regression slope (b = 20 ms) that was one-twentieth the size of the same estimate for first graders (b = 400 ms), indicating much greater proficiency in the adult subjects. These results are consistent with the basic notion that information processing speed increases dramatically during the years from childhood to early adulthood, presumably due to the automatization of skill that occurs as a function of practice on a task. This form of automatization could well explain the twenty-fold decrease in the MIN parameter estimate in adults as compared with first-grade students. Alternatively, the obvious developmental progression of the MIN parameter may be consistent with the operation of a rather different process, a direct memory access process in which correct sums are retrieved from long-term memory in a manner that is unrelated to problem size in adults. Groen and Parkman hypothesized that a direct memory access process of this type may be invoked by adults on most trials, but these subjects must resort to a slower, yet more reliable counting strategy (i.e., MIN) on a few trials because of retrieval failure. Specifically, Groen and Parkman suggested that 1 in 20 trials may result in retrieval failure and subsequent use of the MIN strategy. Further, because the one MIN trial for every 20 problems contributes the only problem-size related variability in the sample of RTs, RT would still be a linear function of MIN. This model implicitly assumes that when adults resort to the MIN strategy on those 5 percent of trials characterized by retrieval failure the adult incrementing rate must be the same as for children (e.g., 400 ms); this relation is necessary mathematically to counterweight the flat RT of the direct memory access trials and result in the typical slope of 20 ms found for adults.
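The arithmetic behind this counterweighting can be made explicit (the notation is ours). Writing p for the proportion of retrieval-failure trials, c for the flat direct-access time, and a + b·MIN for RT on counting trials, the expected RT for a problem with minimum addend m is

    E[RT | MIN = m] = (1 - p)·c + p·(a + b·m),

so the slope observed on MIN is p·b. With Groen and Parkman's figures, p = 1/20 and b = 400 ms give an observed slope of (1/20) × 400 = 20 ms, exactly the adult estimate.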
Accordingly, a simple scatter plot of the RTs as a function of MIN should show 1 in 20 outlying data points that, again, would offset the expected flat or horizontal spread of the RTs for the direct access trials. Of course, this ratio could be adjusted, for example, to 1 in 10 retrieval failure trials, to reflect the increased automaticity of counting speed in adulthood (e.g., 180 to 200 ms; Landauer, 1962). At any rate, the scatter plot of RTs would still appear to be bimodal at each value of MIN if the Groen and Parkman direct access model were a veridical representation of adult processing performance. Such bimodalities have commonly not been reported in addition research on adults (e.g., Ashcraft & Stazyk, 1981). The appropriateness of the MIN model as the primary strategy in the solution of addition problems has been investigated and supported in many studies with children (e.g., Ashcraft, 1982; Widaman et al., in press). In contrast, the alternative direct memory access model suggested by Groen and Parkman has not been fruitful; instead, models of memory network retrieval processing have proliferated in recent investigations, and these are discussed in a later section.

Rules and heuristics. First, however, the possibility that solution of addition problems relies on the use of rules and heuristics must be mentioned. In a number of recent papers, Baroody (1983, 1985, 1987) argued that procedural knowledge, rather than declarative knowledge, may frequently be used by highly proficient persons to solve addition problems. For example, frequency of practice cannot account for acquisition of all number facts (Baroody & Ginsburg, 1986), as acquisition and representation of addition combinations involving zero appear to involve a rule (Baroody, 1983), and there is clear, swift transfer to unpracticed problems once combinations involving zero are mastered (Baroody, 1985, 1987). The learning of "plus one" combinations, of the form "x + 1" and "1 + x," also appears to involve a rule relating the addition operation and knowledge of the basic number sequence (Baroody, 1985, 1987). Baroody reviewed these and other findings to support his argument that rules and procedures may underlie much addition performance. Incorporating such processes into our componential model, discussed below, is beyond the scope of the present paper; but further research on such efforts appears strongly merited.

Memory network retrieval processes

General form of memory network retrieval. Although two classes of memory retrieval model have been proposed (i.e., tabular vs. nontabular; see Widaman et al., 1989), each shares assumptions regarding the general characteristics of memory retrieval processes. In general, memory network retrieval reflects some form of search through a stored network of answers for the correct answer for a given problem and then retrieval of the correct answer once it is located. This search and retrieval process is commonly described as the process of activation. Anderson (1985, p. 142) defined the generic characteristics of the activation
concept as follows: "Information in long-term memory is normally in an inactive state. To use the information it is necessary to activate that knowledge. The process of retrieving information from long-term memory can be thought of as the process of activating that information." Further, once information has been activated, it becomes a part of short-term memory (Anderson, 1985). Thus, information is retrieved from long-term memory through activation into short-term memory, where the information can be acted upon (e.g., sum comparison, response selection and execution).

Although the concept of activation appears clear cut, obscurity arises as models of memory retrieval attempt to detail the manner by which activation results in specific, measurable temporal characteristics. One central assumption regarding activation is that it takes a measurable amount of time (Anderson, 1985). This assumption suggests that certain characteristics of the activation process can be used to predict RT and that these characteristics are related to the functional form of the network of connections between stored facts. That is, RT is directly related to characteristics of the functional nexus by which the facts are stored in long-term memory. For example, the duration of the retrieval process might be a function of the strength of association between answers and the problem or a function of the area, or distance, traversed through the network.

A second underlying assumption regarding a network of stored facts is that, as facts become "stored" in memory, the internal representation of them is not arbitrary or idiosyncratic; instead, storage in a network becomes characterized by systematic and lawful functional features. As more sums become stored in memory, the retrieval process begins to behave in a lawful and systematic manner. Moreover, network models of memory do not require assumptions regarding the structural substrate of memory; rather, these models need only assume that the processes of storage and retrieval entail lawful, functional characteristics for which a specified model may serve as metaphor.

The primary distinction between tabular and nontabular models of network retrieval is the metaphor invoked to represent the retrieval process. Typically, for tabular models, the RT associated with the activation process is linearly related to the distance or area in a table-like network in long-term memory that must be searched or traversed in order to retrieve an answer (e.g., Ashcraft & Battaglia, 1978). Tabular models assume that the natural seriation, or ordered relations, of numbers is mapped from external patterns to internal representations during the learning process; this correspondence between external patterns and internal representations leads to the tabular functional characteristics of retrieval times. For nontabular models, the durations associated with the activation process are typically viewed as being a function of the associative strength between the integers
comprising a problem and their correct sum, as well as competing associations among incorrect sums and related integers (e.g., Siegler & Shrager, 1984). Incorrect competing associations reduce the activation strength of a correct association. Correct associations are established, presumably, through frequency and/or duration of exposure to correct combinations. Nevertheless, in general, the nontabular models assume that the stronger the association between a problem and its correct sum, the greater will be its activation strength and the quicker will be the retrieval of the sum.

Tabular memory network retrieval processes. Ashcraft and Battaglia (1978) were perhaps the first to propose that table-related retrieval processes underlie the responses of proficient subjects to addition problems. Ashcraft and Battaglia proposed, as an initial working metaphor, a square, symmetric two-way table, with nodal values ranging from 0 through 9. These nodal values correspond to the addends of a simple addition problem. A final assumption is that the correct answer to an addition problem is stored at the intersection of the two nodal values representing a problem. Retrieval time under such a model should be related to the distance from the origin of the table, or the (0,0) intersection, to the intersection of the nodal values for a given problem. Presuming that retrieval search of such a table ensued according to a city-block metric via the shortest path from the origin to the intersection of the proper nodal values, RT to a given problem should be a linear function of the correct sum of the problem (Ashcraft & Battaglia, 1978). However, Ashcraft and Battaglia found that RT to simple addition problems was more strongly related to the square of the true sum, or SUM². As a result, they argued that the underlying tabular array in memory might be systematically stretched as a function of the size of the addends, leading to the finding that RT was a nonlinear function of the correct sum for the problem.

More recently, research (e.g., Miller, Perlmutter, & Keating, 1984; Widaman et al., 1989) has found that the simple product (PRODUCT) of the two addends of a simple addition problem provides a better table-related predictor of RT than does SUM². Widaman et al. (1989) presented a theoretical rationale for the PRODUCT predictor by referring back to the notion of the square, symmetric, printed addition table as the metaphor for addition fact organization. The Widaman et al. rationale assumes that (a) distances between the nodal values, 0 through 9, along each of two parent nodes are equal, (b) initial activation of each parent nodal value takes a constant amount of time regardless of the magnitude of the encoded number, (c) spread of activation through the memory network begins at the origin (i.e., the (0,0) intersection), and (d) activation proceeds at a constant rate and as a linear function of the area of the network that must be traversed during the fact retrieval process. Given these assumptions, PRODUCT
represents the area of the rectangle formed by the origin, the nodal values involved, and the intersection of the nodal values. Under this model, RT is hypothesized to be linearly related to the area of the memory network that must be activated to arrive at the correct sum (Widaman et al., 1989).

Few studies have contrasted SUM² and PRODUCT on the same sample of subjects. However, when contrasted, PRODUCT has emerged consistently as the better supported tabular predictor (e.g., Geary et al., 1986; Geary, Widaman, Little, & Cormier, 1987; Miller et al., 1984; Widaman et al., 1989; Widaman et al., in press). Moreover, the PRODUCT variable has been shown to be consistent with its conceptual model, a consistency that does not obtain for the SUM² variable (see Widaman et al., 1989). Further, PRODUCT appears to be the more appropriate structural variable for statistical models of both addition and multiplication processing, substantiating an important functional commonality for these two types of performance (e.g., Geary et al., 1986). That is, the PRODUCT model is consistent with the idea that addition and multiplication facts are stored in highly interrelated memory networks (see also Miller et al., 1984; Stazyk, Ashcraft, & Hamann, 1982), and the PRODUCT structural variable is the only tabular representation that is consistent with this cross-operation prediction.

Nontabular memory network retrieval processes. Several conceptual models that are not directly related to a tabular memory network have recently emerged as viable representations of the memory retrieval process. These models presuppose that number facts are stored in long-term memory in networks that resemble those proposed for semantic facts. The most thoroughly articulated of the nontabular models is the Distribution of Associations model (Siegler & Shrager, 1984). According to the Distribution of Associations model, the distribution of associative strengths (DA) of alternative potential answers for each problem varies systematically from very peaked to very flat as a function of problem size. Siegler and Shrager hypothesized that a problem with a more peaked DA would require less time to verify, because less effort is required to distinguish the correct sum from alternative, incorrect sums. The DA for a given problem is indexed by the frequency of correct and incorrect answers that are associated with the problem; the fewer the errors, the more peaked the distribution of associations, with the correct answer having clearly the highest strength of association. The manner in which associations of correct and incorrect answers are built up with a given problem as a function of practice has been described by Siegler and Shrager (1984).

A second nontabular model, tested by Ashcraft, Fierman, and Bartolotta (1984), used norms presented by Wheeler (1939), who assessed the percentage of second-grade pupils who mastered each of the 100 simple addition problems.
Ashcraft et al. (1984) found that problem difficulty was the best predictor of RT and that this measure (RPER, for the reverse-coded percent mastery measure) led to very similar equations for both verification and production formats of problem presentation (see also Hamann & Ashcraft, 1985). As mentioned above, nontabular models assume that interference from false associations results in greater RT and that, somewhat circularly, "the frequencies of specific errors estimate the strength of false associations" (Campbell, 1987a, p. 110). Both RPER and DA are measures of the frequency (or percent) of retrieval errors associated with each number combination; correspondingly, each index also represents the strength of false associations. Thus, the most parsimonious interpretation of the RPER index is that it is a type of DA measure for addition combinations.

Still other nontabular structural variables, such as the frequency and order of presentation of problems in arithmetic books (e.g., Hamann & Ashcraft, 1986), have been proposed as variables that underlie the problem size effect for addition (see Widaman et al., 1989, for a review). Moreover, detailed computational models of number fact retrieval employing nontabular processes at their core have recently been pursued, including efforts such as those by Campbell and Oliphant (this volume) and McCloskey and Lindemann (this volume). As a result, the notion that nontabular processes underlie number fact retrieval represents a robust and currently well-researched alternative to tabular retrieval processes as the basis for arithmetic performance.
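To make the logic of these model comparisons concrete, the following sketch fits separate one-predictor least-squares regressions of problem RT on MIN, SUM², and PRODUCT and compares their fit. The RT values are fabricated for illustration (generated here from a PRODUCT-driven process), not data from any study cited in this chapter; Python with numpy is used purely as a convenience.

import itertools
import numpy as np

rng = np.random.default_rng(0)

# All simple addition problems with addends 0 through 9.
problems = list(itertools.product(range(10), repeat=2))

# Structural variables discussed in the text.
MIN = np.array([min(a, b) for a, b in problems], dtype=float)
SUM2 = np.array([(a + b) ** 2 for a, b in problems], dtype=float)
PRODUCT = np.array([a * b for a, b in problems], dtype=float)

# Fabricated mean RTs (ms): a PRODUCT-driven process plus noise.
rt = 900.0 + 12.0 * PRODUCT + rng.normal(0.0, 40.0, len(problems))

def r_squared(x, y):
    # R^2 for a one-predictor least-squares regression of y on x.
    slope, intercept = np.polyfit(x, y, 1)
    resid = y - (intercept + slope * x)
    return 1.0 - resid.var() / y.var()

for name, x in [("MIN", MIN), ("SUM^2", SUM2), ("PRODUCT", PRODUCT)]:
    print(f"{name:8s} R^2 = {r_squared(x, rt):.3f}")

In data generated this way, PRODUCT recovers the highest R², but because the structural variables are highly correlated across the 100 problems, the inferior predictors still achieve substantial R² values, a point taken up in the next section.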
Relative utility of alternative structural variables

Predictive utility. One aspect of the relative utility of the alternative structural variables that have been proposed for the mental arithmetic domain is the relative predictive utility of the variables. That is, do certain structural variables consistently outperform others in modeling RT data? As mentioned above, MIN has been the best supported of the structural variables in samples of young children (e.g., students in second grade; Ashcraft, 1982; Groen & Parkman, 1972; Widaman et al., in press), and PRODUCT has been the best supported of the tabular network retrieval predictor variables in older subjects (e.g., Widaman et al., 1989). Of the nontabular retrieval variables, the RPER measure from Wheeler (1939) has been a better predictor than the DA measure (e.g., Siegler & Shrager, 1984) in comparative studies (e.g., Ashcraft et al., 1984). These predictors--MIN, PRODUCT, RPER, and DA--appear to constitute the most useful predictor variables for representing mental addition performance. Moreover, when placed in direct competition for explaining RT data in highly proficient subjects (i.e., college students), PRODUCT has consistently outperformed MIN, RPER, and DA
in the majority of studies (e.g., Miller et al., 1984; Geary et al., 1986; Widaman et al., 1989; Widaman et al., in press). Thus, there appear to be differences in the ability of alternative structural variables to represent RT data on addition problems, differences lending greater support to associated conceptual models of the ways in which persons respond to addition problems. However, firm conclusions on the relative predictive utility of the competing structural variables should await further research that continues to confirm the predictive power of certain variables (e.g., PRODUCT) over the competitors.

A related concern is the fact that the primary competitors for explaining RT to addition problems are highly correlated with one another (e.g., Ashcraft & Battaglia, 1978; Geary et al., 1987). This fact makes empirical differentiation between the predictions made by the several competing structural variables rather difficult. As a result, the search for the precise structural variable that provides ultimate goodness of fit to RT data may be a relatively unimportant matter. The high correlations among structural variables ensure that the predictions of "less good" structural variables will still be highly similar to those of "the best" structural variable. Moreover, which structural variable emerges as "the best" predictor in a given study may be determined more by sampling variability than by superior ultimate explanatory value. Hence, the search for "the best" structural variable may be rather less important than investigations of the form of the cognitive processing model within which the cognitive processes are executed, a topic discussed in a later section.

Theoretical issues. In addition to predictive or empirical concerns, several theoretical concerns emerge when evaluating the differential utility of the primary structural predictors of addition performance. One important issue concerning the structural variables is the presence, or lack, of correspondence between the statistical variable used to predict RT and the conceptual form of the hypothesized cognitive process or component (Widaman et al., 1989). Both MIN and PRODUCT are associated with detailed conceptual models; RPER, on the other hand, may or may not be related to the Distribution of Associations model (Siegler & Shrager, 1984), which lacks a clear-cut mechanism by which the systematic pattern of associations is established as a function of problem size. Finally, certain statistical variables proposed for the mental addition domain appear to be inconsistent with the conceptual models they are presumed to represent (e.g., SUM²; see Widaman et al., 1989).

A second theoretical issue is the coherence of the nomothetic versus idiographic nature of the structural variables employed and the cognitive processes modeled by the various structural variables. The MIN digital and PRODUCT tabular retrieval structural variables are clearly and explicitly nomothetic, as they reflect
the assumption that certain external structures provide the basis for common mental structures across persons. That is, MIN and PRODUCT invoke the computational structure of a digital (i.e., digits = fingers) counter and the natural seriation and ordered relations of number facts, respectively, as the external structures that are incorporated into mental structures for responding to addition problems. The nomothetic nature of a given model implies that individuals use the corresponding mental structure in the same way. For example, if a child uses a mental counter to solve problems, the presumption made is that the child increments the counter in a unit-by-unit fashion just as other children do, a reasonable assumption unless evidence to the contrary is presented. A similar case may be made for tabular structural variables, such as PRODUCT. If such structural variables represent the execution of the retrieval process, then each person is presumed to utilize a similar mental counter or to search through a similarly ordered network to access the correct answer to a problem.

This nomothetic nature of the digital and tabular retrieval structural variables can mesh easily and directly with an idionomothetic approach to research. That is, if a particular process is executed in the same fashion by each individual, then individual differences may be captured in an unambiguous fashion in the differential estimates of the speed with which individuals execute the given process. Moreover, a properly specified regression model may yield these individual differences in processing time in the form of the raw score regression weight for the relevant structural variable in a statistical model for RT data.

The nomothetic versus idiographic issue appears more problematic for nontabular retrieval models. Let us consider the DA structural variable in more detail. The DA structural variable reflects the distribution of associations of alternative answers to each addition problem. This structural variable was originally stated in an explicitly idiographic fashion. That is, Siegler and Shrager (1984) held that, as an individual works with, or responds to, addition problems during relatively early stages of learning about addition, both correct and incorrect answers are given to each addition problem. Each time a particular answer is offered for a given addition problem, the strength of association between that answer and the problem is increased. The more frequently the correct answer is given to a problem, the more peaked is the distribution of associations for that problem, with the peak at the correct answer. The problem size effect arises for this model under the assumption that problems with larger addends will have less peaked distributions of associations. The problem size effect may arise because (a) problems with larger addends require more computational operations by novice adders, offering a larger number of opportunities for mistaken incrementing of an explicit (i.e., fingers) or implicit (i.e., mental) counter when attempting to arrive
at the correct sum, (b) the cardinal value of larger addends may be mistakenly altered in short-term memory during the incrementing process, due to the attentional demands of counting, or (c) any of a number of other alternative processes. The important point is this: the problem size effect may be due to any process that leads to the flattening of the distribution of associations as a function of increasing addend size.

Now, the process underlying the DA structural variable was stated in an idiographic fashion, such that each individual develops his/her own pattern of distributions of associations during the learning of addition problems. But the DA model has never been tested in an idiographic fashion. To do so would require (a) ascertaining the distribution of associations of answers to each addition problem for each individual subject, and then (b) determining that the distribution of associations for a given subject was systematically (linearly, preferably, or nonlinearly) related to RT for that subject. Instead, the frequency of answers to a given problem across a sample of subjects has been used as the distribution of associations to that problem for each subject. This may reflect a problematic assumption. Informal observations of young children solving addition problems seem to suggest that children who are very good at arithmetic make relatively few errors on addition problems, whereas children who are poorer at arithmetic make many, many more errors across all addition problems. If this is true, then the "average" distribution of associations across all subjects may not reflect well the parameters of the long-term memory network of arithmetic facts for any individual subject or may do so for only a small proportion of subjects.

The DA structural variable may be salvaged by making less stringent, more reasonable assumptions, but these assumptions carry various costs. For example, one might assume that the distribution of associations for each individual is systematically (e.g., linearly) related to the "average" distribution of associations across individuals. This is a weaker, but more reasonable, assumption than the assumption of identity of distributions of associations across individuals. Moreover, the assumption of a systematic relation of each individual DA with an average, group DA represents a testable hypothesis. This assumption would, however, likely remain an untested one, given the difficulty in assessing the distribution of associations for each individual. However, if the assumption were tested and little systematic relationship were found between many individual DAs and the group DA, this would be a serious blow to research on the DA model that presumes a similar, nomothetic DA across individuals.

Let us assume that the DA model passes this hurdle; that is, that there is a systematic relationship between individual subject DAs and the standard, group-level DA. That is, assume that the distribution of associations for each individual i, DA_i, were
linearly related to the group-level distribution of associations, DA, by the following equation:

(1) DA_i = b_0i + b_1i DA,

where b_0i and b_1i are the intercept and slope, respectively, for predicting the individual distribution of associations, DA_i, from the group-level distribution of associations, DA. One could then model RT data for individual i, RT_i, using models such as:

(2) RT_i = c_0i + c_1i DA_i, or

(3) RT_i = d_0i + d_1i DA,

where c_0i and c_1i are the intercept and slope, respectively, for predicting an individual's RT data, RT_i, from his/her own distribution of associations, DA_i, and d_0i and d_1i are the intercept and slope, respectively, for predicting the individual's RT data from the group-level distribution of associations, DA. Importantly, the estimates of c_1i and d_1i would reflect estimates of retrieval time as a function of individual-level and group-level distributions of associations, respectively. Interestingly, Equations 2 and 3 would provide identical fit to RT data (e.g., identical variance explained) for a given individual if the distribution of associations for individual i were perfectly captured by Equation 1. Furthermore, there would be a systematic relationship between the regression weights estimated in Equations 2 and 3. That is, substituting Equation 1 into Equation 2 shows that d_0i = [c_0i + (c_1i b_0i)] and d_1i = (c_1i b_1i).

Even if the foregoing were granted, two further problems remain. First, there would remain an equivocal relationship between the statistical models estimated and tested and the conceptual cognitive processing models presumed to underlie performance. If individual-level DAs were used as predictors, then the raw score regression weights reflecting retrieval speed, c_1i, would not be comparable across individuals, because the structural variable (i.e., DA_i) used as a predictor differs across individuals. That is, there would be individual differences in both the structural predictors used and the regression weights obtained, leading to an inability to pin down the basis of the differences among individuals. On the other hand, if a common group-level DA (i.e., DA) were used when modeling data for each individual, then the raw score regression weights reflecting retrieval speed, d_1i (see Equation 3), may still be uninterpretable in any simple fashion. That is, retrieval speed, as indexed by the c_1i parameter of Equation 2, could be a constant across individuals, but the raw score regression weights d_1i may still vary across individuals unless each person's individual DA was related to the group-level DA using identical slope and intercept values in Equation 1, and this would be an unlikely outcome. Hence, an underlying ambiguity about the meaning of the
predictor variables, their raw score regression weights, and the nature of the retrieval process by which persons respond to addition problems would remain.

The second remaining problem is the utility of the DA notion in samples of highly proficient subjects. By the time a person gets to high school or, surely, to college, simple addition problems have been highly overlearned; such subjects rarely make mistakes on arithmetic problems. Given this, it is unclear why an index of distribution of associations, derived during the early learning of arithmetic, should still have predictive power. It seems obvious that the distribution of associations between alternative answers and addition problems may govern the pattern of responding to addition problems during rather early stages of experience with such problems. But, after large amounts of practice, the predictive utility of the DA measure might be expected to wane. Further theoretical development must be undertaken to justify the predictive power of the DA measure in highly proficient subjects.

A third and final theoretical concern involves the choice among computational, tabular retrieval, and nontabular retrieval cognitive components for a place in our models of the way in which persons respond to arithmetic problems. This choice is influenced by several factors, such as the relative predictive efficiency of structural variables across studies as well as the theoretical elegance and coherence of the models. However, this choice must be determined, ultimately, on bases such as which type of model leads to maximal goodness of fit with empirical data and to the confirmation of the greatest number of unique, testable hypotheses (Widaman et al., 1989). Only the accumulation of evidence over many years of research will enable a convincing answer to such questions.

Models for responding to addition problems
In the preceding section, we discussed several alternative componential processes that may be invoked to arrive at the correct answer to a simple addition problem. These alternative componential processes represent one of several stages in an overall model of the way in which persons respond to addition problems. In the present section, we describe the information processing models that have been proposed to represent the several processes underlying responding to arithmetic problems.

Models for responding to simple addition problems

Models for verification task paradigms. Most research studies on mental addition processes have used a true/false verification task paradigm for collecting the RT data upon which models of solution strategy are substantiated. In a verification
task paradigm, a subject is presented an addition problem for which a stated sum is provided (Ashcraft, 1982). For example, if given a problem such as 3 + 8 = 10, a subject must decide whether the stated sum of the problem is correct or incorrect and then respond by appropriately depressing a "yes" or "no" response button. The RT and accuracy (whether the response was correct or not) are both recorded as dependent measures. Models of the solution process are then tested against the measured durations for correct problems.

However, concerns have arisen about results obtained using the verification paradigm (e.g., Little & Widaman, 1992), as the presence of a stated answer may affect the solution process in various ways. These influences may take a number of forms, including (a) priming, (b) strategy alteration, (c) interference, and (d) global evaluation. Priming effects typically facilitate RT and can occur if successive problems contain the same addends or the same stated or correct sums, or if the stated answer for one problem is an addend or augend of the next (e.g., Campbell, 1987b). Strategy alteration effects also facilitate RT and can occur when a person uses a modified strategy for responding to a particular problem, such as by using an odd-even heuristic regarding the stated sum (Krueger & Hallford, 1984). Interference effects typically hinder RT and can occur when the stated answer for an addition problem is the correct answer under a different operation, such as multiplication (Geary et al., 1986). Lastly, global evaluation effects, which typically facilitate RT, can have sizeable influences on RT when a stated answer is substantially wrong. In such a case, the subject makes the evaluation that the stated answer is clearly and substantially wrong, circumvents completion of the processing stage, and jumps directly to the decision stage (see the split effect; Ashcraft, 1982; Restle, 1970). Following Zbrodoff and Logan (1990), we may assume that the global evaluation process is invoked consistently, regardless of the relative correctness of the stated sum, when addition problems are presented in verification format. Again, however, the facilitative effects due to global evaluation are likely to be positively related to the absolute or relative difference between the true and stated sums for a problem.

In order to understand the relationship between the task demands of a verification task and the form of the cognitive models presumed to underlie performance on the task, a flow diagram is presented in Figure 1. The diagram depicted in Figure 1 is a five-stage model of processing in a verification paradigm. This model explicitly delineates the flow of information between the stages of processing that occurs from the presentation of a stimulus to the execution of a response. The three figural conventions used for Figure 1 are as follows: (a) model components enclosed in boxes represent more controlled processes that are assumed to require nontrivial durations and therefore may be separately
estimated, (b) diamonds enclose branching operators that direct the flow of processing, but that, being quite overlearned and therefore rather automatic, likely require trivial amounts of time for execution, and (c) single-headed arrows signify the order of processing within the conceptual model. The boxes are often referred to as stages of processing or the components of processing, whereas the diamonds represent only conceptual clarifications regarding the nature of the flow of information between the stages of processing.
[Figure 1 appears here: a flow diagram with boxes for Encode (a + b = c), Search/Compute Correct Sum, Reactivate Stated Sum, and Select & Execute Response.]
Figure 1. A flow diagram for responding to simple addition problems presented within a verification paradigm.

The first stage of processing of the model shown in Figure 1 is termed the encoding stage. Encoding involves bringing the external stimuli into the internal cognitive system in a task-appropriate manner (e.g., Glass & Holyoak, 1986). Also, the encoding process is assumed to take a constant amount of time for like categories of problem (i.e., problems with the same number of presented digits). The next juncture in the verification model, depicted as a diamond, is an explicit
representation of the additional evaluative processing that occurs only in verification procedures. Specifically, a global evaluation of the approximate correctness of the stated sum is made; if the stated sum is grossly inaccurate, the "No" branch is taken, a "No" decision is made, and a "No" response is executed, representing the type of processing involved in the split effect (Ashcraft, 1982). Occasionally, if aspects of the problem lead to interference, the "Yes" branch is taken. For example, interference may occur if the stated answer is incorrect for addition but is correct for another operation, as in "2 + 4 = 8." Interference under such conditions typically leads to a longer RT (Campbell, 1987b).

The second stage of the model in Figure 1, termed the search/compute stage, has been a main focus of research on cognitive arithmetic (e.g., Ashcraft, 1982, 1983; Baroody, 1983; Siegler, 1987; Widaman et al., 1989). The search/compute stage depicts the point at which the correct answer to a problem is obtained. The principal RT phenomenon associated with this stage is the problem size effect. The problem size effect refers to the robust finding that as problem size increases (i.e., as the magnitudes of the addends increase), both RT and error rate also increase. Models of solution strategy are assessed against the problem-size related changes in RT associated with the search/compute stage. The various alternative strategies invoked during the search/compute stage, involving computational or memory network retrieval processes, were discussed in the preceding sections of this chapter.

The next two stages of the verification model presented in Figure 1 represent (a) the re-encoding or reactivation of the stated sum and (b) the subsequent decision process in short-term memory, stages that occur only during verification task responding. Evidence exists (see Widaman et al., 1989) that the digits comprising an incorrect sum must be reactivated or re-encoded, due perhaps to decay of the representation of these digits in working memory, before a decision may be made regarding the veridicality of the stated sum. Reactivation of the digits in correct sums may also be necessary; however, the extra processing step in Figure 1 reflects the finding that reactivation of digits in incorrect sums is more time-consuming than is reactivation of the digits in correct sums (see Widaman et al., 1989). These stages also are assumed to take a constant amount of time for like categories of problem size or complexity.

The last stage of the model in Figure 1 represents, straightforwardly, the process of selecting and executing a response. In a verification paradigm, selection involves choosing between a "yes" or "no" button and implementing the appropriate manual action. Again, this stage is assumed to take a constant, measurable amount of time across types of problem, so the response time associated with this stage is not estimable separately.
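As an illustration of how the five stages map onto a statistical model, the sketch below simulates verification trials and estimates one regression parameter per component that varies across problems: an intercept absorbing the constant encoding and response stages, a PRODUCT term for the search/compute stage, a term for the extra cost of a "no" response, and a split term for global evaluation. The data and all parameter values are fabricated, and the specification follows the logic of Figure 1 only loosely; it is a sketch, not a published model.

import itertools
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical verification trials: each problem appears once with a
# true stated sum and once with a false stated sum at a varying split.
trials = []
for a, b in itertools.product(range(10), repeat=2):
    truth = a + b
    trials.append((a, b, truth, truth))               # true version
    offset = int(rng.choice([1, 2, 4, 6]))
    trials.append((a, b, truth + offset, truth))      # false version

product = np.array([a * b for a, b, _, _ in trials], dtype=float)
is_false = np.array([float(s != t) for _, _, s, t in trials])
split = np.array([abs(s - t) for _, _, s, t in trials], dtype=float)

# Fabricated RTs: the intercept absorbs encoding and response stages;
# search/compute time grows with PRODUCT; a "no" response costs ~100 ms;
# larger splits let global evaluation short-circuit full processing.
rt = (850.0 + 11.0 * product + 100.0 * is_false - 25.0 * split
      + rng.normal(0.0, 35.0, len(trials)))

X = np.column_stack([np.ones(len(trials)), product, is_false, split])
coef, *_ = np.linalg.lstsq(X, rt, rcond=None)
for name, est in zip(["intercept", "PRODUCT", "no response", "split"], coef):
    print(f"{name:12s} {est:7.1f}")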
Models for production task paradigms. A production task paradigm requires the subject to produce an answer for a stimulus problem in the absence of a stated answer (Baroody, 1983). Given a stimulus problem, such as 6 + 2 = ?, a subject must determine the correct answer and state the answer to trigger the offset of an RT clock. Although certain biases may arise, responding in a production task format is likely a simpler and more representative expression of the mental addition process; this type of processing is, thus, less susceptible to undesirable influences that may affect the nature of the task. Arguably, models representing such data are also less biased.

As can be seen in Figure 2, a model representing the flow of processing in a production task paradigm is much simplified relative to the model for the verification task paradigm presented in Figure 1. Only three stages of processing are necessary to depict the nature of the task, and no evaluative/decision components are necessary for representing the processing characteristics. The three stages of processing are similar to their counterparts in the preceding model. First, the Encoding stage involves representing the stimulus internally in a task-appropriate manner. Second, the Search/Compute stage delineates the solution process and occupies the same position in the posited model of addition responding as the like-named stage in the model in Figure 1. Finally, the Response Selection and Execution stage encompasses the processes of response selection and execution. In a production paradigm, response selection involves choosing the appropriate number name for the derived sum, and response execution comprises implementing the appropriate vocal action. Given the rather different forms of response between the verification and production paradigms, this Response Selection and Execution stage may have different temporal characteristics across paradigms; however, this would likely represent only a constant difference between paradigms. Because the Encoding stage and the Response Selection and Execution stage are assumed to involve only a constant amount of time, the Search/Compute stage is the only stage in which RT will vary and, thus, is the stage represented by models of the solution process for the production task paradigm.

Although the three stages in the production task model in Figure 2 have the same name and characterization as comparable stages in the verification task model in Figure 1, the conclusion that the comparable stages across models reflect the operation of identical processes is not warranted. For example, Campbell (1987b) argued that the presence of a stated answer for addition problems in the verification task format led to greater facilitative priming for difficult problems than for easier problems. This conjecture was supported by the finding of a steeper problem-size-related RT slope for addition problems presented in
production task format relative to the verification task format. Campbell (1991) reported a similar finding for multiplication problems across the two task paradigms. These results suggest that we exercise caution when attempting to equate the processes involved when subjects respond to problems in different formats.
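A minimal sketch of the kind of cross-format comparison at issue appears below: problem-size slopes are estimated separately for simulated production and verification RTs. The steeper production slope is built into the fabricated data, so the sketch mirrors only the direction of Campbell's finding; none of the numbers are estimates from his studies.

import itertools
import numpy as np

rng = np.random.default_rng(4)

problems = list(itertools.product(range(1, 10), repeat=2))
product = np.array([a * b for a, b in problems], dtype=float)

# Fabricated RTs with a steeper problem-size slope in production format.
rt_production = 1000.0 + 16.0 * product + rng.normal(0, 50, len(problems))
rt_verification = 1050.0 + 10.0 * product + rng.normal(0, 50, len(problems))

for fmt, rt in [("production", rt_production),
                ("verification", rt_verification)]:
    slope, _ = np.polyfit(product, rt, 1)
    print(f"{fmt:12s} problem-size slope = {slope:5.1f} ms per unit PRODUCT")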
[Figure 2 appears here: a flow diagram with boxes for Encode, Search/Compute Correct Sum, and Select & Execute Response.]
Figure 2. A flow diagram for responding to simple addition problems presented within a production paradigm.

Models for responding to complex addition problems
The simple flow diagrams for mental addition presented in Figures 1 and 2 provide an adequate conceptualization for most previous research on simple addition. However, processes involved in more complex forms of addition, such as the time taken to carry from the units to the tens column, are not explicitly represented by the simple models. To fill this void, Widaman et al. (1989) introduced a general model of addition to account for the additional cognitive components involved in more complex forms of addition processing, while still accounting for simple forms of processing. This model builds on the simple model in Figure 1; thus, it represents processing for simple addition problems as well as complex addition problems with any number of addends and any number of digits per addend. The flow diagram of this general model for verification-task performance on mental addition is presented in Figure 3. Given space constraints, only the additional components of the model in Figure 3 will be discussed below;
this is not problematic, as all components in the simple models in Figures 1 and 2 are contained within the model in Figure 3.

In addition to the box, diamond, and arrow conventions used in Figures 1 and 2, a fourth figural convention is used in the model presented in Figure 3. Specifically, circles are used to represent control processing counters that assist in keeping track of the current column or digit being processed (e.g., units column, tens column, third digit, fourth digit). These control counters are presumed to take a constant amount of time per problem. When problems of different sizes and types are presented in an intermingled, random manner, a subject may need to initialize, or preset, certain counters or branching operators, given the number of addends and number of digits per addend, to allow optimally efficient processing of the presented problem. As an example of the initialization of counters, the general model assumes that subjects solve most simple and complex addition problems in columnar fashion, beginning with the units column. Therefore, the column indicator is initialized with a setting to the first, or units, column (i.e., c = 1).

The second step of processing involves a branching operator, the use of which is determined by the number of digits in column c to be summed. The units column of all simple and complex addition problems must contain at least two digits, so the initiation of processing of an addition problem will always result in selection of the ">1" branch the first time this branching operator is invoked. However, in complex problems with more than one digit per addend, only a single digit may remain in the final, or left-most, column of digits in the problem. In this case, and if no carry operation to the final column is required, the "1" branch may be taken, as no summing of digits is required.

In the next stage of processing, two digits from column c are encoded. Regardless of the number of digits in column c, the general model assumes that the solving of a given addition problem will commence with the encoding of two, and only two, digits in column c. If more than two digits are to be summed in column c, the person maintains in short-term memory an index variable indicating which digits are being encoded and which digits remain to be encoded and processed further. Given the frequency with which individuals in our culture encounter numerical stimuli, encoding of the addends of an addition problem is likely a very rapid and overlearned process. However, encoding an addend probably requires retrieving certain attributes of the addend from a long-term memory (LTM) store, such as the number of units represented by the addend. Such retrieval should take a consistent, but estimable, amount of time, perhaps on an order comparable to that found for access to name codes of letters (Posner, Boies, Eichelman, & Taylor, 1969; Hunt, Lunneborg, & Lewis, 1975).
[Figure 3 appears here: a flow diagram whose components include encoding digits in column c, a network of stored facts and knowledge of arithmetic rules feeding the search/compute stage, encoding the digit(s) in the stated sum, the "Is the sum correct?" decision, and response selection and execution.]
Figure 3. A flow diagram for responding to complex addition problems presented within a verification paradigm.
The next stage of the general model is the Search/Compute stage. Once again, the search/compute stage is identical, in terms of both componential composition and temporal operating characteristics, to the like-named stage in the simple models in Figures 1 and 2. Thus, the correct sum of the two encoded digits is either retrieved from a memory store or calculated anew, with a temporal duration determined by problem size.

Immediately after the correct sum of the two digits has been obtained, a branching operator is encountered. If more digits in column c are to be summed, the current sum is only provisional. Therefore, the current sum must be held in short-term memory while the "yes" branch is taken, which leads to the encoding of one more digit and the subsequent obtaining of a new sum. The preceding series of "encoding then summing" of digits continues until all digits in column c have been summed; at this point, the "no" branch is taken, as a correct column sum has been determined.

Upon completion of the summing of digits in column c, the digits in the stated sum must be encoded so that the obtained and stated sums may be compared. If column c is not the final column of digits to be summed, only the single digit in column c of the stated sum need be encoded for comparison with the units value of the obtained sum. On the other hand, if column c is the final column in the given problem, then all remaining digits in columns c, c + 1, etc., of the stated sum must be encoded for comparison with the obtained sum. The decision stage is the same as presented in the simple model for verification presented above. However, if one or more columns of digits remain to be summed, the "yes" branch is taken. Processing then continues with a unit increment of the column counter, shifting the locus of processing to the next column of digits.

The final branching operator in the general model governs whether the carry operation is performed. If the most recently obtained column sum was 9 or less, the "no" branch is followed, and processing of the given problem resumes with the encoding of one or two digits in column c. But, if the column sum in question was greater than 9, the "yes" branch must be taken, leading to the carrying to column c of information regarding the number of provisional tens. The latter information must be summed with one of the digits in column c before further digits from column c are encoded and summed.

The design of the general model presented in Figure 3 was based on the assumption of columnar processing of addition problems, in which the subject performs addition in an algorithmic fashion, by column, starting with the units column. Such an approach to problem solution is an implicit assumption of most previous conceptual models for mental addition. Although certain models (e.g.,
Restle, 1970; Baroody, 1987) suggest that addition of numbers may be performed in a global, rather than columnar, manner, recent research does not support such a hypothesis. Specifically, Widaman et al. (1989, in press) explicitly tested columnar models against noncolumnar models, finding clear and strong evidence for the columnar models.
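The columnar assumption can be stated algorithmically. The sketch below sums multi-digit addends column by column, as the general model assumes, tallying the component operations (digit encodings, fact retrievals or computations, carries) from which a predicted RT could be composed. The control flow follows Figure 3 only loosely, and the per-operation durations are hypothetical placeholders, not parameter estimates from Widaman et al.

def columnar_add(addends):
    # Sum multi-digit addends column by column, beginning with the
    # units column, tallying component operations along the way.
    ops = {"encode": 0, "retrieve": 0, "carry": 0}
    digits = [list(map(int, reversed(str(a)))) for a in addends]
    n_cols = max(len(d) for d in digits)
    carry, result = 0, 0
    for c in range(n_cols):                  # column counter, units first
        col = [d[c] for d in digits if c < len(d)]
        ops["encode"] += len(col)
        col_sum = col[0]
        if carry:                            # carried tens are summed
            col_sum += carry                 # into column c first
            ops["retrieve"] += 1
        for digit in col[1:]:                # "encode then sum" cycle
            col_sum += digit
            ops["retrieve"] += 1
        result += (col_sum % 10) * 10 ** c
        carry = col_sum // 10
        if carry:
            ops["carry"] += 1
    return result + carry * 10 ** n_cols, ops

# Hypothetical per-operation durations (ms) compose a predicted RT.
TIMES = {"encode": 50, "retrieve": 300, "carry": 250}
total, ops = columnar_add([47, 85])
predicted = sum(TIMES[k] * n for k, n in ops.items())
print(total, ops, f"predicted RT ~ {predicted} ms")

Counting operations in this fashion is what licenses separate regression parameters for retrieval, encoding, and carry durations in the general model, the topic of the next section.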
Specification and estimation of model components

For each of the models presented in Figures 1, 2, and 3, estimation of the components explicitly represented involves specifying predictor variables for each of the cognitive components. The simplest case is the production task model presented in Figure 2. The two stages of Encoding and Response Selection and Execution do not vary across problems; thus, the time associated with these two stages is a constant that would be estimated as part of the intercept term of a regression analysis. The Search/Compute stage, on the other hand, would evince differences in RT as a function of the strategy used in deriving the sum for the presented problem. Each of the counting strategies and the tabular and nontabular retrieval strategies may be represented by a statistical predictor that is based on the correspondence of the given strategy to a hypothesized type of cognitive process. The regression model having the highest level of explanatory power provides evidential support for the associated cognitive process.

A somewhat more complex case is the simple addition verification task model presented in Figure 1. Two additional predictors can be specified here, one to represent the variability across problems in encoding demands and the other to represent the difference in time associated with a "no" response. One of the most common findings in RT research is that a "no" response takes approximately 100 ms longer to execute than does a "yes" response. This parameter can be estimated and tested in the context of the model in Figure 1.

The most complex case is represented in the general model shown in Figure 3. In this model, additional parameters may be estimated and tested, specifically parameters related to (a) units-column and tens-column retrieval, (b) units-column and tens-column encoding, and (c) the time taken to carry to the next column. Greater detail regarding the specification of statistical models representing performance on addition problems is provided by Widaman et al. (1989, in press).

We wish to note here, however, that it is possible to translate a process model rather directly into a statistical model that should account for RT data on a task. Such a statistical model then becomes a rejectable model--if the model fails to explain RT data well, or if the model does not adequately represent all aspects of the RT data, then the model should be rejected in favor of a model that provides
a more adequate representation of the data. The task for the researcher interested in individual differences is to investigate the degree to which such statistical models represent the ways in which individuals respond to addition problems, as well as the ways in which such statistical models change as a function of development. To these topics we now turn.

Representing proficiency in mental addition: Individual differences and developmental change

Research on the manner in which individuals solve addition problems focuses on two related issues--which strategies do persons invoke to arrive at the correct answer to addition problems, and how quickly or efficiently do they execute these strategies? These issues have been considered in the literature under the rubrics of strategy choice and strategy execution, respectively.

Differences in strategy choice

Identifying strategy choices empirically
In the clear majority of studies, the fit of chronometric models is evaluated using the average RT for a group of individuals at a particular age (e.g., college students). This group-level approach to the testing of models neglects and confounds intra-individual and inter-individual differences in strategy choice and execution (e.g., Cooney, Swanson, & Ladd, 1988; Cornet et al., 1988; Siegler, 1987, 1988a, 1988b, 1989). The solutions to this problem lie in several directions, most particularly taking strategy choice on each problem into account as well as taking individual differences into account. These approaches are addressed next.

A problem-centered approach. One way of improving the explanatory power of chronometric models is to determine how subjects solve each addition problem. Siegler (1987, 1988a, 1988b, 1989) has been most consistent in admonishing researchers for analyzing data across problems regardless of the solution strategy employed, because subjects are likely to use different strategies on different problems. The resulting models of strategy execution based on all problems regardless of strategy used may accurately represent neither the strategies persons choose when responding to all or even many of the addition problems nor the temporal characteristics of the execution of the strategies modeled.

To remedy the confounding of strategy choice and execution, Siegler (1987, 1988a, 1988b, 1989) suggested that differences in strategy choice could be identified through examination of individual protocols. That is, suppose that a subject is
seated in front of a computer screen and is asked to respond as quickly as possible to each addition problem, indicating whether the problem is correct or incorrect as stated. Then, one or more ways of assessing strategy choice on each particular problem may be used. In an early study, Siegler and Shrager (1984) videotaped young children as they solved addition problems. Using these videotapes, Siegler and Shrager were able to designate each problem as having been solved via some explicit strategy (e.g., counting fingers) or an implicit strategy. In later research, Siegler (e.g., 1988a, 1988b, 1989) obtained self-report data, in which subjects indicated the strategy that they used on the particular problem.

Based on such collateral measures, Siegler (1987, 1988a, 1988b) argued that children exhibit a great deal of variability in strategy use across arithmetic problems and that a single regression analysis of RTs to problems that are solved using different strategies may lead to rather biased and uninterpretable results. Moreover, Siegler (1987) separated subjects' responses into those solved using counting processes and those solved using retrieval processes. For the problems solved using counting processes, counting-based predictor variables, specifically MIN, were better predictors of RT than were retrieval predictors; for problems solved using retrieval processes, retrieval predictors had stronger relations with RT than did counting predictors. This led Siegler to conclude that the use of regression analysis across all problems, without regard to the strategy used to solve each problem, would result in a misrepresentation of the ways in which children were responding to the problems.

These criticisms by Siegler give rise to at least two considerations. The first consideration involves the relation between variability in strategy use and level of expertise. Variability in strategy use is likely to be negatively related to age or expertise on a task (cf. Ashcraft, 1990). In turn, this would attenuate the force of Siegler's criticism for studies based on data from older children and adolescents. That is, Siegler and his associates have typically used rather young children as subjects in their studies on addition, children aged 3 to 6 years. The variability in strategy use reported by Siegler may be the rule in young children who are just learning how to add, and a failure to utilize collateral information when performing regression analyses of RT data may indeed lead to results that are difficult or impossible to interpret. However, variability in strategy use may be quite uncommon, or at least much more restricted, for older children and adolescents who have had much greater exposure to, and have greater facility with, addition problems (cf. Kaye, 1986; Kaye, deWinstanley, Chen, & Bonnefil, 1989). If this were the case, collateral information would have little influence on the modeling of RT data. This clearly is an issue that should be investigated empirically.
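A minimal simulation of the point at issue: trials generated under a mixture of strategies are fit within strategy classes and then across all trials. The strategy reports, RT parameters, and mixture rule are all fabricated for illustration.

import itertools
import numpy as np

rng = np.random.default_rng(2)

def fit_r2(x, y):
    # R^2 from a one-predictor least-squares regression.
    slope, intercept = np.polyfit(x, y, 1)
    return 1.0 - (y - intercept - slope * x).var() / y.var()

problems = list(itertools.product(range(1, 10), repeat=2))
minv = np.array([min(a, b) for a, b in problems], dtype=float)
prod = np.array([a * b for a, b in problems], dtype=float)

# Hypothetical strategy reports: counting is more likely on larger problems.
counting = rng.random(len(problems)) < minv / 9.0
rt = np.where(counting,
              1500.0 + 400.0 * minv,    # counting trials track MIN
              900.0 + 15.0 * prod)      # retrieval trials track PRODUCT
rt = rt + rng.normal(0.0, 60.0, len(problems))

for label, mask in [("counting", counting), ("retrieval", ~counting),
                    ("all trials", np.ones(len(problems), dtype=bool))]:
    print(f"{label:10s} R2(MIN) = {fit_r2(minv[mask], rt[mask]):.2f}   "
          f"R2(PRODUCT) = {fit_r2(prod[mask], rt[mask]):.2f}")

In data like these, MIN dominates within the counting trials, PRODUCT dominates within the retrieval trials, and fits computed across all trials blur the two--which is Siegler's point.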
The second consideration concerns the potential biasing effects that may arise when obtaining the collateral measures of strategy use. That is, probing a child after each trial for strategy use information may lead to different patterns of strategy use than would occur under nonprobing conditions or may lead to inaccurate reports of the strategies used (cf. Ashcraft, 1990). In a recent study, Russo, Johnson, and Stephens (1989) investigated the effects of obtaining verbal protocols on performance on each of four types of task, including mental addition. In considering the potential effects of obtaining verbal protocols, Russo et al. distinguished between reactivity and veridicality of protocols. A verbal protocol would be considered reactive if obtaining the verbal protocol changed the primary cognitive processes underlying performance; in contrast, a verbal protocol would be nonveridical if it did not reflect the underlying cognitive processes accurately. Russo et al. found evidence of reactivity with mental addition, as the obtaining of verbal protocols led to higher error rates when compared to a "no protocol" control condition. Moreover, Russo et al. reported evidence of nonveridicality of verbal protocols across all tasks, including mental addition. Cooney, Ladd, and Abernathy (1989) also found evidence of reactivity in a study of the effects of obtaining verbal reports of strategy use on multiplication performance by third-grade students, finding that error rate was affected relative to a "no report" control condition. Given such findings, the effects of obtaining collateral strategy use information on arithmetic performance should be investigated further, to ensure that this procedure does not contaminate RT data even as it presumably illuminates the array of strategies used across arithmetic problems.

A person-centered approach. As noted in several places in this chapter, the standard approach to regression modeling of RT data on addition problems is to obtain the mean RT across individuals for each problem and then to perform regression analyses on this group-level dependent variable. Fitting alternative regression models to the data, substituting digital and retrieval structural variables, provides a way of representing and testing the fit of different cognitive component models to data reflecting the "average subject," a convenient fictional construction. The standard statistical approach to assessing strategy choice can be easily and straightforwardly adapted to determine which of several competing component models has the primary predictive power in explaining RT variance for each individual subject. The result of such analyses would be the classification of individuals with respect to purported strategy use. This, in turn, would allow the researcher to establish homogeneous subgroups of individuals who use the same strategy or functionally similar strategies.
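In code, the person-centered adaptation amounts to fitting the competing one-predictor models to each subject's own RTs and classifying the subject by the best-fitting structural variable. The sketch below does this for fabricated "counter" and "retriever" subjects; only MIN and PRODUCT are compared, to keep the example short.

import itertools
import numpy as np

rng = np.random.default_rng(3)

problems = list(itertools.product(range(1, 10), repeat=2))
minv = np.array([min(a, b) for a, b in problems], dtype=float)
prod = np.array([a * b for a, b in problems], dtype=float)

def r2(x, y):
    slope, intercept = np.polyfit(x, y, 1)
    return 1.0 - (y - intercept - slope * x).var() / y.var()

# Fabricated subjects: three "counters" whose RTs track MIN with slow
# increments, and three "retrievers" whose RTs track PRODUCT.
for subj in range(6):
    if subj < 3:
        rt = 1400.0 + 350.0 * minv + rng.normal(0, 80, len(problems))
    else:
        rt = 950.0 + 12.0 * prod + rng.normal(0, 80, len(problems))
    fits = {"MIN": r2(minv, rt), "PRODUCT": r2(prod, rt)}
    best = max(fits, key=fits.get)
    print(f"subject {subj}: classified as {best} "
          f"(R2 MIN = {fits['MIN']:.2f}, PRODUCT = {fits['PRODUCT']:.2f})")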
Importantly, homogeneous subgroups formed using the preceding approach would be quite useful for an array of analyses designed to evaluate individual and developmental differences in strategy execution. For example, if based on ill-advised group averages, estimates of growth curves for the development of addition skill would confound inter-individual differences in strategy execution with intra- and inter-individual differences in strategy choice; the resulting growth curves might, therefore, be relatively uninformative. In contrast, growth curves estimated on only those individuals identified as using a particular strategy would reflect the population represented by the subgroup. That is, age trends based on a subgroup of individuals who use the same strategy would reveal more accurately the underlying developmental functions for the processes being modeled. Further, within identified subgroups, attempts at external validation of individual differences in strategy execution could be evaluated more easily and directly.

This extension of the standard approach to the individual level has certain drawbacks, including the fact that, at least in one sense, the researcher "averages over" problems regardless of the strategy an individual uses for each problem. That is, the analysis is typically performed on all addition problems for which the individual provided a correct response; some of these problems may have been solved using digital processes, whereas other problems may have been solved using retrieval. Fitting alternative regression models to such data and concluding that the individual used either digital processes or retrieval would thereby misrepresent the individual's data to a lesser or greater degree. That is, one would conclude that the individual used one strategy or the other, whereas, in fact, the individual used more than one strategy across the set of problems.

Our positive outlook on individual-level modeling of RT data to identify the strategy that each person uses is almost certainly colored by the fact that we have used this approach quite fruitfully in our studies of mental addition; we describe the results of several of these studies below. However, we also acknowledge that the criticisms that Siegler (e.g., 1987) has leveled against collapsing across problems appear to be legitimate concerns, especially in samples of subjects who are in relatively early stages of learning a given operation.

With regard to this issue, we would like to highlight the presence of several horns of one or more dilemmas. First, we note that modeling RT data at the level of the individual without regard to intra-individual differences in strategy choice may misrepresent the way in which that individual solved the addition problems, by "averaging across" problems solved using different strategies. But we would also like to note that some investigators (e.g., Siegler, 1987) have routinely averaged across individuals when demonstrating that strategy choice information leads to differential predictability of RT data. That is, these researchers divide RT
data for each individual into those problems solved using counting strategies and those solved using retrieval. Then, the RTs for each problem solved using counting are averaged across individuals, obtaining a mean "counting RT" for each addition problem; separately, the RTs for each problem solved using retrieval are averaged across individuals, obtaining a mean "retrieval RT" for each problem. Not surprisingly, such studies have found that the mean "counting RT" is better predicted by counting structural variables than by retrieval structural variables and that the reverse holds for the mean "retrieval RT." Now, on both theoretical and empirical bases, one may abhor the averaging across problems if these problems are solved using different strategies. But, on parallel bases, we abhor averaging across individuals if these individuals exhibit large intra- and inter-individual differences in strategy execution and strategy choice--and they do.

The compromise between these competing concerns would be to do what no one has yet reported in the literature: to gather problem-by-problem collateral information on strategy use, and then to model individual-level RT data taking strategy use on each problem into account. The results would be most interesting. One could model the speed with which each individual executed counting processes, the speed with which each individual retrieved answers from long term memory, and the proportions of problems on which the person invoked counting and retrieval processes. Each of the preceding types of information would represent useful data for characterizing individual differences in arithmetic skill, and each would represent a dimension along which developmental change should be seen. Empirical investigations of this sort would appear to be a most useful addition to the research literature, demonstrating the importance of the competing concerns regarding accurately representing the manner in which individual persons solve individual addition problems.
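If such problem-by-problem strategy reports were available, the modeling step itself would be straightforward. The sketch below, using invented trial records for a single subject, estimates a counting rate from reported counting trials, a retrieval rate from reported retrieval trials, and the proportion of trials solved by retrieval. All names and data are hypothetical; the sketch only illustrates the design proposed in the text.

    import numpy as np

    def slope(x, y):
        # Least-squares slope of y on x, with an intercept included.
        X = np.column_stack([np.ones_like(x), x])
        return np.linalg.lstsq(X, y, rcond=None)[0][1]

    # Hypothetical trial records: (MIN, PRODUCT, reported strategy, RT in ms).
    rng = np.random.default_rng(1)
    trials = []
    for a in range(2, 10):
        for b in range(2, 10):
            mn, prod = min(a, b), a * b
            if rng.random() < 0.75:    # retrieval on most trials...
                trials.append((mn, prod, "retrieval", 850 + 9 * prod + rng.normal(0, 50)))
            else:                      # ...with counting as a back-up strategy
                trials.append((mn, prod, "counting", 1200 + 350 * mn + rng.normal(0, 120)))

    counting = [t for t in trials if t[2] == "counting"]
    retrieval = [t for t in trials if t[2] == "retrieval"]

    counting_rate = slope(np.array([t[0] for t in counting], dtype=float),
                          np.array([t[3] for t in counting]))
    retrieval_rate = slope(np.array([t[1] for t in retrieval], dtype=float),
                           np.array([t[3] for t in retrieval]))
    print(f"counting: {counting_rate:.0f} ms per count (MIN)")
    print(f"retrieval: {retrieval_rate:.1f} ms per PRODUCT unit")
    print(f"proportion retrieval = {len(retrieval) / len(trials):.2f}")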
Developmental changes in strategy choice

Two models of the development of strategy choices in mental addition have been proposed (see Ashcraft, 1983; Baroody, 1983). Both models presume the presence of computational or procedural strategies and memory retrieval or declarative strategies. The Ashcraft model assumes that strategy choice in mental addition development proceeds from use of a rather slow, procedural strategy, such as counting, to use of a more efficient process, such as memory network retrieval, as the preferred and dominant strategy from the middle elementary school years and beyond. Commonly, as inferred from cross-sectional data, this developmental trend is hypothesized to emerge as an age-related transition in which young children use a computational strategy, adults use a memory retrieval strategy, and,
at around the fourth grade, approximately half the children are using either a procedural strategy or a declarative strategy (see Ashcraft, 1982; cf. Baroody, 1983; Kail, 1986). In contrast, the model hypothesized by Baroody (1983) depicts strategy choice in mental addition skill development as a movement from slower procedural processes, such as MIN, to more principled and much quicker procedural processes. Although Baroody admitted that some addition facts may be stored in long term memory (e.g., ties), he claimed that individuals employ rules (such as the identity element, n + 0 = n), principles (such as commutativity, or ties + 1), and heuristics (such as reorganization, recasting n + 9 as [(n + 10) - 1]) in solving simple addition problems throughout development. However, a statistical model that corresponds to the conceptual model of quick procedural processes has yet to be formulated, which diminishes the empirical utility of the proposed model. The underlying issue between the Ashcraft (1983) and Baroody (1983) models is the degree to which procedural and/or declarative strategies underlie the changes in arithmetic skills that occur during development.

Many factors, such as the individual's level of motivation, attention, and fatigue, as well as the individual's idiosyncratic processing of a minority of problems, are likely to introduce error variance in RT data. Despite these factors, however, individuals probably process problems of a similar type using a rather consistent cognitive strategy because of the lesser cognitive effort required to be consistent and systematic as opposed to inconsistent and arbitrary (e.g., Kaye, 1986; Kaye et al., 1989). For example, an individual who generates a unique procedure for each type of addition combination would likely expend considerable cognitive effort coordinating and implementing the numerous strategies. Rather less cognitive effort would be required by the individual who organizes the addition facts around a single, primary strategy, such as memory network retrieval. Although the latter individual may not utilize a network retrieval strategy for every single problem, he or she would likely use memory retrieval on a majority of problems; this type of strategy choice should lead to patterns in RT data that are consistent with a network retrieval model. The minority of problem combinations that are processed using an alternative strategy would introduce error variance, but presumably would not bias dramatically the accuracy of a model's general representation (e.g., in terms of both internal and external validity analyses). This same argument would hold for procedural-dominated processing of a consistent form, such as counting, but would not hold for an inconsistent and idiosyncratic approach to number facts. The question of whether skill development in mental addition is best characterized by a procedural or declarative processing model at different ages, grades, or transitional phases remains an important empirical concern.
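The flavor of the rule-and-heuristic account can be conveyed in a few lines of code. The sketch below answers simple addition problems with the identity rule, a memorized store of ties, the ties + 1 principle (via commutativity), and the n + 9 reorganization heuristic, falling back on MIN-style counting otherwise. The ordering of the rules is our own choice for illustration; as noted above, no statistical model corresponding to this conceptual account has been formulated.

    def add(a: int, b: int) -> int:
        # Solve a + b (single-digit addends) by rules and heuristics
        # before resorting to counting.
        ties = {n: n + n for n in range(10)}   # memorized doubles, e.g., 6 + 6 = 12

        if a == 0 or b == 0:                   # identity element: n + 0 = n
            return a + b
        if a == b:                             # ties retrieved from memory
            return ties[a]
        if abs(a - b) == 1:                    # ties + 1, using commutativity
            return ties[min(a, b)] + 1
        if 9 in (a, b):                        # reorganization: n + 9 = (n + 10) - 1
            n = a if b == 9 else b
            return (n + 10) - 1
        answer = max(a, b)                     # MIN counting: count on from the larger addend
        for _ in range(min(a, b)):
            answer += 1
        return answer

    assert all(add(a, b) == a + b for a in range(10) for b in range(10))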
Individual differences in strategy execution

In an earlier section, we described the goals of experimental approaches to mental addition in terms of establishing the internal and external validity of models and the parameter estimates in these models. In componential analyses of mental addition, RT tasks are used to test hypotheses about proposed componential models of the ways in which persons respond to addition problems. The testing of such hypotheses is one way to demonstrate the internal validity of the proposed componential processes. That is, internal validation is the determination that RTs to arithmetic problems are affected (i.e., increased or decreased) in accordance with specific manipulations in the hypothesized component processes. Sternberg described two general forms of internal validation, intensive and extensive. Intensive internal validation involves assessing the degree to which the hypothesized component processes represent RT performance for a single type of problem, such as simple addition. All of the primary process models of mental addition meet this simplest of validity criteria in that they all show high levels of fit with RT data, although certain structural variables (e.g., PRODUCT) have relatively higher predictive power than others. Regardless, other forms of validation are important analytic goals. Extensive internal validation involves demonstrating the accuracy of the component processes across different types of problem, such as finding that a specific memory retrieval model holds for both simple and complex addition or for both addition and multiplication problems. The degree of extensive internal validity depends upon the degree of theoretical similarity among the component processes for the types of problem examined. For example, the primary models of addition fact retrieval may be less appropriate for division and subtraction than for multiplication, to the degree that different component processes are involved in their solution. Determining the scope of extensive internal validation relies on both the theoretical underpinning and the subsequent empirical proof of the relationships.

A second domain of validation of component processes is external validation. External validation also takes on a number of forms. For example, one form of external validation is demonstrating that, within the constraints of a single empirical investigation of a single type of problem, the parameter estimates of the component processes show predicted patterns of convergent and discriminant relations with outside variables, such as achievement test scores. This might be termed intensive external validation. Another form of external validation, extensive external validation, involves demonstrating that the estimates of component processes across several RT tasks are related systematically to outside
variables, such as test scores and chronological age. These concepts of internal and external validation provide useful distinctions as we now consider empirical studies of mental addition.

Internal validation of processing models

Group-level analyses. As mentioned above, much of the early work on mental arithmetic was conducted at the group level. These analyses averaged the RTs for a group of individuals at a given age and then fit models of the solution process to the aggregate RT estimates. The models of arithmetic performance that emerged, such as MIN (e.g., Groen & Parkman, 1972), demonstrated internal validity, as these models predicted the increase in RT as a function of problem size. However, again as mentioned above, these studies have come under criticism because the analytic techniques confound both inter- and intra-individual differences in RT performance (e.g., Siegler, 1987). To the degree that individuals exhibit inter- and intra-individual differences on RT tasks, the accuracy, or validity, of these models is compromised. As a result, group-level research should be pursued with full knowledge of the potential shortcomings and should be followed by research at the individual level of analysis.

In a recent group-level analysis, Widaman et al. (1989) performed an ambitious chronometric analysis of the general model of addition processing presented in Figure 3 across four types of addition problem. The types of addition problem comprised problems with two single-digit addends (e.g., 4 + 5 = 9), three single-digit addends (e.g., 3 + 7 + 2 = 12), one single-digit and one double-digit addend (e.g., 34 + 7 = 41), and two double-digit addends (e.g., 86 + 43 = 129). The sample for this study consisted of 23 college-age subjects who responded to a total of 800 addition problems in a true/false verification paradigm. Within the overall set of 800 problems, 100 problems of each type were presented twice, once with a correct answer and once with an incorrect answer. One way of viewing this study is as the intensive internal validation of chronometric models for each of four types of mental addition, followed by an extensive internal validation of the processing models across all four types.

During the intensive internal validation analyses, Widaman et al. (1989) demonstrated that the estimates of the primary processing parameters for each of the four types of addition problem were rather similar in magnitude. For example, across the four problem types, the regression parameter estimate for the PRODUCT structural variable, which reflects time taken to retrieve answers from long term memory, varied from about 7.2 to 11.6 ms. The intercept also showed relative stability across problem types, varying from about 675 to 800 ms. Further,
the intercept difference between true and false problems also was rather similar across problem types, ranging from 100 to 120 ms. The only component that was common across problem types and that exhibited substantial variability in its parameter estimates was the digit-encoding component, with estimates that ranged from 50 to 160 ms. across problem types. In addition to the preceding component processes, Widaman et al. (1989) showed that certain components were required only in restricted types of problems, consistent with the general model in Figure 3. One such component was the carry operation, which was required only in two-column problems. The carry operation took a significant amount of time in both types of two-column problem, with estimates of 150 and 245 ms. for the two types of problem.

The four intensive internal validation analyses were followed up by an extensive internal validation of the models. Two analyses of this type were performed, one involving the 600 problems with two addends and the other based on all 800 problems. In these analyses, equality constraints across problem types for parallel mental components were invoked and tested for significance. For example, we constrained the retrieval time parameter estimate (i.e., the regression slope for the PRODUCT structural variable) to be invariant across problem types and then tested whether this constraint resulted in a worsened fit of the model to the data (i.e., a significant drop in R²). Constraints that led to a significant worsening of model fit were relaxed; constraints that had a nonsignificant effect on model fit were retained. In the first extensive internal validation analysis, a constrained model with only 6 parameters for the 600 problems explained over 90 percent of the RT variance across problems; similar results were found in the second analysis across all 800 problems. Interested readers are referred to the Widaman et al. (1989) paper for more detail on these analyses. The important point to note here is that this study demonstrated strong extensive internal validation of processing parameters for mental addition across an array of problem types at the group level for a sample of college subjects. However, the resulting models may not capture well the responding of each individual subject. Thus, in order to assess the internal validity of processing models more accurately, these models must be tested at the individual level.
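The constraint-testing logic just described can be sketched as a comparison of nested regression models. Below, an unconstrained model allows a separate PRODUCT slope for each of two problem types, a constrained model forces a common slope, and an F test evaluates whether the constraint produces a significant drop in explained variance. The data are simulated and the setup is reduced to a single component; the published analyses involved many more components and problem types.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(2)
    n = 100                                  # problems per type
    product = rng.uniform(4, 81, 2 * n)      # PRODUCT structural variable
    ptype = np.repeat([0.0, 1.0], n)         # 0 = simple, 1 = complex addition

    # Simulated RTs: a common retrieval slope but different intercepts.
    rt = 700 + 150 * ptype + 9.0 * product + rng.normal(0, 80, 2 * n)

    def rss(X, y):
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        r = y - X @ beta
        return float(r @ r)

    ones = np.ones_like(product)
    # Unconstrained: a separate PRODUCT slope for each problem type.
    X_u = np.column_stack([ones, ptype, product * (1 - ptype), product * ptype])
    # Constrained: one PRODUCT slope invariant across problem types.
    X_c = np.column_stack([ones, ptype, product])

    rss_u, rss_c = rss(X_u, rt), rss(X_c, rt)
    df1, df2 = 1, 2 * n - X_u.shape[1]       # one equality constraint tested
    F = ((rss_c - rss_u) / df1) / (rss_u / df2)
    p = 1.0 - stats.f.cdf(F, df1, df2)
    print(f"F({df1}, {df2}) = {F:.2f}, p = {p:.3f}: "
          f"{'relax' if p < .05 else 'retain'} the constraint")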
Individual-level analyses. A number of studies have taken the appropriate next step, proceeding to individual-level analyses. For example, Widaman et al. (in press) reported the results from two studies of mental arithmetic that involved individual-level analyses. In Study 1, Widaman et al. obtained RTs to a sample of 70 of the 100 basic simple addition problems in a verification task format from 123 students in second, fourth, and sixth grade. Then, in Study 2, Widaman et al. performed an extensive internal validation analysis of simple and complex addition problems in a sample of 163 subjects, including students in second, fourth, and sixth grade (N = 63) and in college (N = 100). The simple addition problems used were a sample of 40 of the 100 basic addition problems with two single-digit addends; the complex addition problems were a sample of 40 problems from Widaman et al. (1989) that had two double-digit addends.

In both studies, Widaman et al. (in press) fit componential models to the RT data from each of the individuals and then classified individuals as to whether they used a computational strategy (i.e., MIN was the best predictor of their RT data) or a retrieval strategy (i.e., PRODUCT was the better predictor). Study 2 offered an interesting further contrast to be investigated, as componential models were fit separately for each subject for the 137 persons who completed both the simple addition and complex addition problem sets. Across both simple and complex addition, Widaman et al. demonstrated, through the use of constrained regression analyses, that the same processes were invoked by the subgroups of individuals for both types of stimuli. That is, the strategy invoked by an individual to solve simple addition problems was the same strategy invoked by that individual for complex addition problems.

The preceding results do not imply, however, that students at any grade were a homogeneous lot with regard to the strategy used to solve problems. As shown in Table 1, subjects at each grade level in the Widaman et al. (in press) investigation represented a mix of persons classified as using digital processes (i.e., counting) and persons identified as using retrieval processes. Looking across both Studies 1 and 2, shown in the bottom section of Table 1, there was an approximate 50-50 split of second grade students into digital and retrieval subgroups. By fourth grade, approximately 70 percent of students were best fit by a componential model with a retrieval structural variable, and this increased to approximately 80 percent by sixth grade, a level that was maintained for our college-age subjects.

Given the work on differences in strategy choice across problems by Siegler (e.g., 1987, 1988a), we present the information in Table 1 cautiously. Because we did not gather strategy use information on a problem-by-problem basis, we may have mistakenly characterized 20 percent of our college-age subjects as using digital processes to solve addition problems. That is, these subjects may have used retrieval processes for many or most addition problems, but switched to a slower back-up counting process for some problems when retrieval failed to produce a satisfactory answer. Siegler (1987) demonstrated how such a pattern of responding could result in a counting structural variable explaining more variance than a retrieval structural variable if analyses are done across all problems regardless of strategy choice. At the least, the categorization of subjects presented in Table 1 should warn researchers strongly against averaging across all subjects regardless
of their strategy use, complementing the admonitions of Siegler against averaging over problems.

Table 1. Distribution of Subjects Using Digital and Retrieval Strategies for Simple Addition, by Grade and Study (from Widaman et al., in press)

                                 Digital         Network Retrieval
Grade                Total N     N     Pct         N     Pct

Study 1
  2                     37      16      43        21      57
  4                     45      16      36        29      64
  6                     41       8      19        33      81

Study 2
  2                     24      14      58        10      42
  4                     20       4      20        16      80
  6                     19       5      26        14      74
  College              100      20      20        80      80

Studies 1 and 2 Combined
  2                     61      30      49        31      51
  4                     65      20      31        45      69
  6                     60      13      22        47      78
  College              100      20      20        80      80
External validation of model parameters

A number of studies have investigated the relationship of individual differences in strategy execution to criterion measures of achievement or ability (see, for example, Geary & Widaman, 1987, 1992; Geary et al., 1987; Little & Widaman, 1992; Widaman et al., in press). This form of analysis addresses a very important and central question in the area of mental addition research. Specifically, the question confronted is this: Do the parameters identified within componential
models for mental addition skill have a substantial degree of criterion-related external validity?

External validation of arithmetic cognitive components alone. In their Study 1, Widaman et al. (in press) presented a structural equation model for mental addition parameters, a model shown in Figure 4, that was based on the 83 subjects who were identified as using retrieval processes for mental addition. In Figure 4, we use (a) boxes to denote measured or observed variables; (b) circles to represent latent variables or factors; (c) curved, double-headed arrows from a variable to itself to denote the variance or residual variance of the variable; (d) curved, double-headed arrows from one variable to another to represent the covariance between the variables; and (e) straight, single-headed arrows to denote unidirectional paths, or regression weights, from predictor variables to criteria. Associated with each arrow in Figure 4 is a parameter estimate reflecting the magnitude of the quantity. Asterisked parameters were fixed to identify the model. For example, the variance of the Age latent variable was unity, the Age latent variable was identical to the person's chronological age (i.e., observed chronological age, represented by Box 1, had a residual variance of zero), and Age had a regression weight of .27 when predicting Math Achievement. All parameter estimates in Figure 4 were statistically significant, except for the effects of Age on Verbal Comprehension and Reading Skills (see below), paths that were retained to avoid biasing other estimates in the model.

In the model in Figure 4, the linear and quadratic trends of chronological age, Age and Age², respectively, were equated with measured Variables 1 and 2, respectively, and were partialled from all appropriate latent variables. Two latent variables were specified for the RT parameter estimates: (a) Addition Efficiency, which had as indicators the PRODUCT parameter estimate and the root mean square error from the regression modeling of the individual's RT data (Variables 3 and 4, respectively), and (b) Speediness, with a single indicator (Variable 5) formed as the sum of the individual's intercept and encoding parameter estimates. The remaining eight observed measures were subtests of the Stanford Achievement Test and comprised three measures primarily related to math achievement (Variables 6-8), three primarily related to verbal achievement or comprehension (Variables 9-11), and two primarily related to reading or word skills (Variables 12 and 13; see figure caption for variable names). As shown in Figure 4, three latent variables were hypothesized to account for the relations among the Stanford Achievement Test subtests.

The final structural model for these data is shown in Figure 4; the model shows that individual differences in Addition Efficiency were strongly related to individual differences in Math Achievement, b = -.59. The Widaman et al. model provided
strong empirical validation of estimates from the componential model, given the sizeable relationship with the latent variable of Math Achievement, a form of criterion-related validity. The Widaman et al. model also showed moderate discriminant validity, as the Addition Efficiency latent variable was more strongly related to the Math Achievement latent variable than to the Verbal Comprehension and Reading Skills latent variables (see Widaman et al., in press). A final interesting aspect of the structural model in Figure 4 concerned the general Speediness latent variable, which failed to exhibit any direct predictive effects on any of the three achievement latent variables.
Figure 4. Structural relations among the information processing and achievement variables and chronological age for retrieval subjects in Study 1 of Widaman et al. (in press). (Observed variable codes: 1 = chronological age (in years), 2 = age squared, 3 = PRODUCT parameter estimate, 4 = root mean squared error, 5 = intercept + encoding speed, 6 = Mathematics Computations, 7 = Mathematics Concepts, 8 = Mathematics Applications, 9 = Listening Comprehension, 10 = Vocabulary, 11 = Reading Comprehension, 12 = Reading and Word Study Skills, 13 = Spelling.)
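For readers who wish to see how a model of this general shape is specified in practice, the sketch below writes a much-reduced version of the Figure 4 model in lavaan-style syntax and fits it with the Python semopy package. The column names, the data file, and the reduced set of paths are all hypothetical; the published model includes the quadratic age term, the Speediness latent variable, fixed identifying parameters, and the verbal and reading factors omitted here.

    import pandas as pd
    from semopy import Model

    # Much-reduced Figure 4-style model: an RT-based latent variable and
    # age predicting a math achievement factor. Column names are hypothetical.
    desc = """
    AdditionEfficiency =~ product_estimate + rmse
    MathAchievement    =~ math_computation + math_concepts + math_applications
    MathAchievement ~ AdditionEfficiency + age
    """

    df = pd.read_csv("retrieval_subjects.csv")   # hypothetical data file, one row per subject
    model = Model(desc)
    model.fit(df)
    print(model.inspect())                       # parameter estimates with standard errors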
Widaman et al. (in press) also evaluated the external validation of processing efficiency for a computational subgroup (i.e., subjects best fit by the MIN predictor). Using regression procedures, due to the relatively small sample size (N = 40), Widaman et al. found that counting speed, as measured by an aggregate of the MIN and RMSE parameters (a composite termed MINR), explained statistically significant amounts of variance in each of three math achievement subtests (Math Computations, Math Concepts, and Math Applications), even after the linear and quadratic effects of age were first partialled out. Further, the MINR predictor showed age-partialled relations with the three subtests related to reading skills (Reading Comprehension, Reading and Word Study Skills, and Spelling), but not with the two tests that were fairly pure indicators of verbal comprehension (Listening Comprehension and Vocabulary). These findings demonstrate at least moderate levels of convergent and discriminant external validation for the estimated information processing parameters related to the MIN and PRODUCT models of mental addition.

Across both the structural model and the regression analyses reported by Widaman et al. (in press), two important trends emerged. First, the direction of the relations between the MINR predictor and the achievement tests, as well as between the Addition Efficiency factor and the achievement factors, was in the hypothesized direction--negative--which indicates that the more highly automatized the counting process (as reflected in smaller values of MIN) or the retrieval process (as reflected in smaller values of PRODUCT), the higher is the level of achievement. Second, the speediness variable for the computational subjects and the Speediness factor for the memory retrieval subjects were never significantly related to measures of achievement.

Turning to Study 2 of Widaman et al. (in press), complex as well as simple addition stimuli were presented in a verification format, and a similar pattern of convergent and discriminant validity emerged using two criterion-related measures of ability, numerical facility and perceptual speed. Subjects were once again divided into retrieval and digital subgroups. Componential models including the PRODUCT structural variable were fit to data from retrieval subjects, and models including the MIN structural variable were fit to data from digital subjects. Paralleling results from Study 1, the results from Study 2 exhibited patterns of convergent and discriminant validity between the RT-based parameter estimates and the traditional ability measures. That is, for the retrieval subgroup, PRODUCT parameter estimates accounted for significant proportions of age-partialled variance in a numerical facility test composite, but no significant variance in a perceptual speed test composite. In turn, for the digital subgroup, MIN parameter estimates explained significant age-partialled variance in the
numerical facility test composite, but did not do so for the perceptual speed test composite.

A second major study of the external validity of parameter estimates based on the general model for addition presented in Figure 3 was that by Geary and Widaman (1987). In this study, a sample of 100 college students was administered several sets of arithmetic problems and a battery of paper-and-pencil tests of ability. The four arithmetic problem sets were simple addition, complex addition with two double-digit addends, simple multiplication, and complex multiplication with one double-digit number multiplied by a single-digit number. The test battery consisted of three tests for each of three ability factors, Numerical Facility, Perceptual Speed, and Spatial Ability. Only rather small modifications of the general model for addition, presented in Figure 3, were required in order to model responses to multiplication. Interestingly, for both addition and multiplication, PRODUCT has been found to be the best predictor of RT data (e.g., Miller et al., 1984; Geary et al., 1986). Indeed, Geary et al. (1986) found that identical regression models fit both simple addition and simple multiplication problems very well. Only one slight modification of the complex addition model was required for complex multiplication. This modification concerns the nature of the carry process in multiplication, which involves both the presence or absence of a carry operation and, in the former case, the addition of the cardinal value of the carried amount onto the tens-column product.

Geary and Widaman (1987) found that (a) the PRODUCT parameter estimates from the four different RT problem sets all loaded on a single Product, or Retrieval Efficiency, latent variable, (b) the two carry parameter estimates (from complex addition and complex multiplication) loaded on a second, Carry latent variable, (c) the four intercept terms loaded on a third, Intercept latent variable, and (d) the four parameter estimates representing an intercept difference between correct and incorrect problems loaded on a fourth, Truth latent variable. Then, the pattern of correlations among the preceding four latent variables was consistent with a higher-order latent variable structure. Specifically, the Product and Carry latent variables, which represent processes unique to arithmetic problems, were indicators for a second-order Arithmetic Efficiency latent variable; and the Intercept and Truth latent variables, reflecting processes that may be identified across a wide array of RT tasks, were indicators for a second-order Basic Speediness latent variable.

Once the preceding factor structure of the RT parameter estimates was specified, clear and strong patterns of convergent and discriminant validity between the RT latent variables and the ability latent variables were found by Geary and Widaman (1987). The second-order Arithmetic Efficiency latent variable had a
very strong relationship (.89) with the Numerical Facility factor based on test scores, and the second-order Basic Speediness latent variable had a strong relationship (.71) with the Perceptual Speed factor based on test scores. No other direct relationships between the RT latent variables and the traditional paper-and-pencil latent variables were required. These results, from both the Widaman et al. (in press) and the Geary and Widaman (1987) studies, demonstrate that the parameter estimates derived from componential models of arithmetic performance represent rather well the individual differences in arithmetic skill that underlie numerical skills as measured by traditional tests of mental ability or school achievement.

External validation of arithmetic and other cognitive components. Yet another aspect of external validity is the role that other cognitive processes, such as working memory, play in mediating, or predicting, mental addition performance. Working memory is assumed to be a limited capacity processing system (or set of sub-systems, e.g., Baddeley, 1983) that places constraints on cognitive activity; the exact nature of these constraints has been the focus of much theory and research over the last 25 years (Humphreys, Lynch, Revelle, & Hall, 1983). One of the more predominant indices of working memory has been the memory or digit span measure (Dempster, 1981; Humphreys et al., 1983). The memory span construct has far-reaching implications for understanding the nature of intelligence and the processes that underlie cognitive abilities because, as Dempster (1981, p. 95) pointed out, "There is considerable evidence that processing efficiency, a variable that appears to play an important role in span differences, is also an important factor in many other intellectual tasks."

In a replication and extension of the Geary and Widaman (1987) study, Geary and Widaman (1992) included chronometric and paper-and-pencil tests of working memory along with batteries of RT and paper-and-pencil indicators that were similar to those in their earlier study. Once again, clear patterns of convergent and discriminant validity were shown between the latent variables based on parameter estimates from the chronometric tasks and the latent variables representing the battery of paper-and-pencil tests. The Working Memory Capacity chronometric latent variable had direct relationships with a General Reasoning factor and with a Memory Span factor, but not with the factors of Numerical Facility or Perceptual Speed. Replicating the earlier study, the Arithmetic Efficiency chronometric latent variable was strongly related (.88) with the Numerical Facility factor; Arithmetic Efficiency was also related to the General Reasoning factor, but was not related to factors for Perceptual Speed or Memory Span. The Basic Speediness chronometric factor was related only to a single ability factor, Perceptual Speed. The results of this study suggest that Working
Memory Capacity influences certain types of performance (i.e., reasoning and memory span), but does not influence numerical facility directly, at least in samples of subjects who are rather proficient with numerical tasks.

In a second study, Little and Widaman (1992) assessed the role of working memory in numerical and perceptual tasks, using a sample of 156 grade school children from grades 2 through 8 and 111 college subjects. In this study, the basic 100 addition problems were presented in a production-task format, with subjects required to state the correct answer to each problem. As in previous studies, an Addition Efficiency chronometric latent variable was related directly to both Numerical Facility and Perceptual Speed factors, and a Basic Speediness chronometric latent variable was unrelated to either paper-and-pencil factor; these relationships held for both the grade school and college samples. Interestingly, Little and Widaman found that working memory capacity was positively related to both Numerical Facility and Perceptual Speed in the grade school sample. However, working memory capacity was not directly related to either Numerical Facility or Perceptual Speed for the college sample, replicating the findings of Geary and Widaman (1992). Thus, foreshadowing a later section, it appears that relations among processing components may change during development, with working memory capacity having an impact on ability performance at earlier ages, but not at later ages when these types of performance are more highly automatized.

Developmental changes in mental addition skills
Cross-sample and cross-study comparisons in the area of mental arithmetic research are made difficult by the fact that the models for representing the arithmetic process are not consistent across samples and studies. One criterion of external validation is consistency of parameter estimates across samples and studies. However, the direct comparison of parameter estimates is clouded by the lack of consensus about which models best represent the addition processes. One study described above (Widaman et al., in press) did provide some evidence for cross-sample, cross-study validation of model parameters. Given this, an intriguing question is whether there are consistent changes with development in the cognitive components that underlie skill in mental addition.

Improved skill in executing elementary components
Group-level analyses. Both Kail (1986, 1988, 1991a, 1991b) and Widaman et al. (in press) evaluated the developmental changes in parameter estimates for mental
addition, among other cognitive processing tasks. In each of several studies, Kail administered batteries of cognitive processing tasks to samples of subjects who varied from 8 to 22 years of age. One of the tasks that Kail assessed was mental addition presented using a verification format; the RT data from the mental addition task were averaged across subjects at each grade level. For each age level, Kail fit regression models to the average RT data, using SUM² as the structural variable representing retrieval processing. Kail then contrasted the fit of exponential and hyperbolic developmental functions to the group-level processing parameters. Using chronological age as the independent variable, Kail found that an exponential function provided a slightly better fit to these cross-sectional data than did a hyperbolic function (see Kail, 1986, 1988, 1991a, 1991b).

Widaman et al. (in press) modeled the developmental changes in parameter estimates separately for their two subgroups of subjects, the digital or computational subgroup identified by the MIN model and the retrieval subgroup identified by the PRODUCT model. In their analyses, Widaman et al. fitted a number of alternative statistical models to the mean parameter estimates for the subgroups of subjects identified in the two studies discussed above, which included students at four grade levels: 2nd grade, 4th grade, 6th grade, and college. In addition to the exponential and hyperbolic models tested by Kail (e.g., 1988), Widaman et al. fit a traditional linear model and a simple multiplicative function to their data. In a second departure from Kail, Widaman et al. estimated the fit of each statistical model twice, once using chronological age as the independent variable and once using grade in school as the independent variable.

For both the digital and retrieval subgroups, Widaman et al. (in press) concluded that the multiplicative function fit the data better than did the linear or exponential equations; this multiplicative function was also more parsimonious than the exponential model, requiring only two parameter estimates rather than three (see Widaman et al., in press). More importantly, Widaman et al. also found that grade in school provided better empirical fit to the data than did chronological age. The resulting multiplicative model results are presented in Figure 5 for the retrieval subjects and in Figure 6 for the digital subjects. As shown in Figure 5, the mean parameter estimates from the retrieval subgroup were well fit by the multiplicative model; the equation for this group was PRODUCT = 200 x Grade^(-1.16). For the individuals classified as relying on a counting-based algorithm (i.e., the MIN group), the multiplicative model results are shown in Figure 6; the equation for this group was MIN = 2536 x Grade^(-1.26). Of particular note is the fact that the exponent or decay parameter in each
Figure 5. Plot of the relation between mean PRODUCT parameter estimates and grade (based on Widaman et al., in press). The plotted curve is the predicted function PRODUCT = 200 x Grade^(-1.16); observed estimates are shown for simple addition (Studies 1 and 2) and complex addition (Study 2).
Figure 6. Plot of the relation between mean MIN parameter estimates and grade (based on Widaman et al., in press). The plotted curve is the predicted function MIN = 2536 x Grade^(-1.26); observed estimates are shown for simple addition (Studies 1 and 2) and complex addition (Study 2).
equation was approximately -1.00; that is, skilled performance in addition appears to improve roughly as a function of the inverse of grade. Furthermore, Widaman et al. had argued that grade in school may be a better index of the development of skill in mental addition than would chronological age, because grade in school is a rough index of the amount of systematic, formal practice that children and adolescents have had with numerical tasks. Numerical tasks, such as addition, fall within the crystallized domain in the theory of fluid and crystallized intelligence as propounded by Horn and his colleagues (Horn, 1988; Horn & Cattell, 1982). Horn and Cattell have argued that crystallized abilities develop as a function of systematic acculturation processes that are provided to individuals in a given culture through various standard experiences, including those in school. Widaman et al. (in press) concluded that the multiplicative relationships found between chronometric parameter estimates and grade in school support the hypothesis that developmental differences are tied to the systematic acculturation processes that individuals experience as a function of their years in school.

In summary, the analyses reported by Widaman et al. (in press) suggest that a multiplicative function was a better representation of the nonlinear growth trends for arithmetic skills during the years from childhood through adolescence than were alternative forms of model. In addition to having better empirical fit to the data, grade in school has intriguing theoretical ties to a mechanism that may underlie the developmental changes observed, specifically formal instruction in and practice with arithmetic during schooling. However, at least one important question arises when examining these results. Because the data that were analyzed by both Kail (e.g., 1988) and Widaman et al. were at the group or subgroup level, would the trends emerge in a similar manner if analyzed at the individual level?

Individual-level analyses. In the study by Little and Widaman (1992), the fitting of growth models to chronometric parameter estimates was extended to the individual level of analysis. To evaluate the form of mental addition skill development, analyses of the individual-level data contrasted the previously described linear and nonlinear functions (e.g., an exponential growth function, after Kail, 1988) as well as additional growth functions. Following Widaman et al. (in press), all models were fit once using chronological age and once using grade in school as the independent variable, and the above models were fit to the component process parameter estimates (e.g., PRODUCT and MIN) at the individual level separately for digital and retrieval subgroups. In these analyses, the exponential curve fit the data better for both subgroups than did any other model. The figural representations of the nonlinear, exponential trends for the retrieval and digital subgroups are presented in Figures
7 and 8, respectively. The retrieval subgroup, shown in Figure 7, exhibited a clear developmental trend: Mental addition skill development, indexed by PRODUCT parameter estimates, begins with a period of rapid acquisition, which decelerates exponentially through college (i.e., grade 14). The digital subgroup, shown in Figure 8, revealed a roughly similar pattern of development. The exponential function explained 77.3% of the grade-related variance in skill development for the retrieval subgroup, and 58.5% of the variance for the digital subgroup. Despite these fairly large estimates of explained variance, inspection of Figures 7 and 8 reveals the presence of important levels of individual differences in strategy execution that are not considered if only group-level data are analyzed.
Figure 7. Plot of the relation between individual PRODUCT parameter estimates and grade (based on Little & Widaman, 1992).
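The model comparison behind these growth-curve analyses can be illustrated with a short curve-fitting sketch. Below, a linear model, the two-parameter multiplicative (power) function favored by the group-level analyses, and a three-parameter exponential of the form used by Kail are each fit to a handful of invented grade-by-parameter points; all data and starting values are hypothetical.

    import numpy as np
    from scipy.optimize import curve_fit

    # Hypothetical mean PRODUCT estimates (ms) at grades 2, 4, 6, and college.
    grade = np.array([2.0, 4.0, 6.0, 14.0])
    prod = np.array([88.0, 40.0, 25.0, 9.0])

    def linear(g, a, b):
        return a + b * g

    def multiplicative(g, a, b):        # power function, e.g., 200 x Grade^(-1.16)
        return a * g ** (-b)

    def exponential(g, a, b, c):        # Kail-style three-parameter exponential
        return a + b * np.exp(-c * g)

    for name, f, p0 in [("linear", linear, (100.0, -5.0)),
                        ("multiplicative", multiplicative, (200.0, 1.0)),
                        ("exponential", exponential, (5.0, 150.0, 0.5))]:
        params, _ = curve_fit(f, grade, prod, p0=p0, maxfev=10000)
        resid = prod - f(grade, *params)
        r2 = 1.0 - np.sum(resid ** 2) / np.sum((prod - prod.mean()) ** 2)
        print(f"{name:14s} R2 = {r2:.3f}  parameters = {np.round(params, 2)}")

In the analyses described above, the same comparison was run twice, once with chronological age and once with grade in school as the independent variable.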
In summary, contrary to the findings of Widaman et al. (in press), the multiplicative function did not adequately capture the form of the developmental trends in mental addition skill for the data from Little and Widaman (1992). Rather, a variant of the three-parameter exponential function suggested by Kail (1986, 1988) appears to be most appropriate. Specifically, the exponential curve, when fit as a function of grade in school, best represented the developmental trend for the PRODUCT and MIN subgroups.
Figure 8. Plot of the relation between individual MIN parameter estimates and grade (based on Little & Widaman, 1992).

The developmental trends in Figures 7 and 8 exhibited nonlinear patterns that were similar to those presented by Kail (e.g., 1988) and Widaman et al. (in press). But, perhaps a more important finding that emerged from the individual-level data presented by Little and Widaman was that grade in school, and not chronological age, was once again the more appropriate index of skill development for mental addition, a finding replicating the group-level results presented by Widaman et al. As Widaman et al. noted, grade in school may be a more accurate measure of skill development than age, because grade in school reflects the amount of practice with, or exposure to formal instruction in, addition that students receive during their years in school. Therefore, grade in school may index the experiences underlying the developmental changes in the efficiency with which students respond to simple and complex addition. As noted above, the explanatory value of grade in school also supports the suggestion of Horn and Cattell (1982) that systematic influences of acculturation, such as schooling, affect the development of crystallized abilities, such as mental addition. These theoretical notions found support in both the study by Widaman et al. (in press) and that by Little and Widaman (1992).
Potential development in relations among components

Further questions regarding individual differences in the cognitive processes that underlie cognitive arithmetic concern the relations among component scores as a function of age. One theoretical notion from studies of the development of mental abilities over the past 60 years or more is the differentiation hypothesis, which states that mental abilities become more differentiated during the years from infancy through childhood and adolescence (Reinert, 1970). This hypothesis could be carried over to the chronometric study of arithmetic skills in several ways. For example, the general model for addition performance has resulted in the identification of several component skills that underlie RT performance on addition and multiplication problems (Geary & Widaman, 1987; Widaman et al., in press). Certain of these component skills are unique to arithmetic problems (e.g., retrieval efficiency and speed of executing the carry operation), whereas other components seem to be common to many types of RT task (e.g., the overall intercept and the intercept difference between correct and incorrect problems). This leads to several pertinent questions. For example, within each type of component (e.g., those unique to arithmetic), do the several components show a pattern of differentiation with age? Or, do the components unique to addition show a gradual differentiation from the components common to many RT tasks?

Other questions arise with regard to additional components or cognitive processing capacities as these relate to numerical processing. One example here is the work by Geary and Widaman (1992) and Little and Widaman (1992) on the role of working memory capacity in addition performance. Across these two studies, it appears that working memory capacity may place some limits on the addition performance of children during the elementary and junior high school years, but that performance by college students has become so automatized that working memory capacity no longer influences their performance. These findings deserve replication and extension. Moreover, the findings suggest that there may be other cognitive mediators of addition performance that await identification, an intriguing way of extending extant work in this domain.

However, we must admit that so little research has been pursued at the individual level that little in the way of research findings is available on the important topic of developmental changes in the relations among components. A good deal of research that goes under the rubric of individual differences (e.g., Siegler, 1988a) is not true individual differences research at all. Rather, groups of students differing in levels of ability or achievement are identified, and analyses of data are performed at these subgroup levels. This approach should be distinguished from the research by Widaman et al. (in press) or Little and
Widaman (1992), which also identified subgroups of individuals. In the latter studies, analyses were indeed performed at the subgroup level, but the estimates used for many analyses (e.g., external validation against measures of achievement or ability) were individual-level estimates, not subgroup-level estimates. Given the paucity of research on inter-individual and intra-individual differences in strategy choice and strategy execution, it is impossible to provide more than a sampling of questions and a few answers in this area. However, given the importance of such questions for an understanding of the development of arithmetic skills, it is clear that a focus on the changes in relations among processing components underlying arithmetic skill is a fertile avenue for future research.

Findings on mental arithmetic from an individual differences approach: Implications for research and practice

The prior research reviewed in this chapter is based partly on group-level results and partly on individual-level results. In this section, we will note several implications for research and practice that derive from our review, attempting to highlight those implications that are uniquely related to our individual differences perspective.

Research implications

Research on mental addition
Restricting our attention first to the domain of mental addition, the findings presented above answer certain questions, but raise others as well. One of the most central issues concerning mental addition is the nature of the process by which persons retrieve answers to addition facts from long term memory. One aspect of this question concerns the proper structural variable for representing the retrieval process. In our research, we have found that the PRODUCT structural variable is the best predictor of RTs to addition problems, but others have recently been touting other structural variables, such as the normative DA variable. Given the fairly strong relationship between these two structural variables, yet certain systematic differences between them, further research should be directed toward understanding the relationship between these two variables. To be most helpful, such research should not simply attempt to determine once again which is the better predictor of empirical RT data. Rather, the research should be directed toward understanding which structural variable is a more veridical representation
of the retrieval process, and which structural variable is merely a competitor for explained variance due to its correlation with the proper structural variable.

A second goal of future research should be to understand the mechanisms by which the nomothetic patterns of association are established in an individual's long term memory. The PRODUCT variable is based on a tabular conception; if this is the proper analog for the retrieval process, then an important quest is the search for how this obvious and explicit representation of number facts is incorporated into long term memory stores. In contrast, if the distribution of associations structural variable, DA, is a more correct representation of the manner in which search through long term memory ensues, then theory should specify how it is that persons who presumably differ in the degree to which they have produced errors for individual addition problems would come to have long term memory stores governed by a normative index of strength of association. Perhaps a broadly conceived replication of the Wheeler (1939) study, in light of contemporary theory, would be able to identify a mediating mechanism for either or both structural variables. Such a search would be a most worthwhile goal.

A related question regarding the nature of the retrieval process is the manner in which activation of arithmetic information occurs. This question concerns the issues of both direct and collateral activation of related information in long term memory. Currently, several researchers are making contributions in this area (e.g., Campbell & Oliphant, this volume). But, a good deal of future research should be undertaken on this problem, as it is a central issue with retrieval of addition information.

Yet another, rather crucial question for future research concerns the shift in strategies by individual subjects from primary or sole use of digital processes to primary or sole use of retrieval strategies when responding to addition problems. The group-level approach that has been used by several researchers (e.g., Ashcraft & Fierman, 1982) is obviously problematic, as it fails to contend with inter-individual differences in strategy choice within age or grade groups. The subgroup approach employed in our research (e.g., Widaman et al., in press) seems more appropriate, but this also has some shortcomings. One shortcoming of this latter approach is the fact that growth curves for each subgroup refer to differentially representative subsamples as a function of chronological age. That is, when a person becomes proficient enough to switch from reliance on digital processes to reliance on retrieval processes, it is not clear precisely where on the PRODUCT growth curve the person would appear: at the mean of his/her age or grade level, or at the mean of persons at the same level of experience with retrieval strategies? In addition to this problem, it seems crucial to identify the
processes or conditions that enable this shift. In order to understand how qualitative shifts in performance occur, these questions must be confronted.

A fifth interesting question calling for attention is the issue of intra-individual differences in strategy choice and the importance of this issue during the childhood and adolescent years. The work of Siegler demonstrates the importance of taking into account the manner in which persons respond to individual problems; our research demonstrates the importance of taking into account individual differences in strategy execution. A complete understanding of the manner in which individual subjects respond to individual addition problems has yet to be attained; this goal would be a most worthy one for considerable attention in the future.

A sixth and final issue is the effect of the nature of the task set when responding to addition problems. Most of our research has used verification task procedures, although Little and Widaman (1992) used a production task paradigm. Either paradigm seems sufficient for research on simple addition, but there are shortcomings of the production task paradigm when modeling performance on complex addition. Further research should be aimed at determining whether individuals use the same process when responding to addition problems in verification and production task formats; we presume that persons would use consistent strategies regardless of task format, but we know of no research that has attempted to answer this question at the level of the individual subject. If our presumptions were not confirmed (i.e., if persons use different strategies with addition problems presented in different formats), we may need to revise our thinking about the utility of chronometric studies in this domain in important ways.
Research on other forms of mental arithmetic
Our research also has implications for research on other forms of mental arithmetic. Several years ago, we reported results that indicated substantial similarity in the long term memory networks for addition and multiplication (Geary et al., 1986). That is, identical regression models fit RT data for simple addition and simple multiplication problems equally well, and highly similar models were able to characterize RTs to complex addition and complex multiplication problems. The cognitive bases for this similarity and for the interrelations between the memory networks for these two operations should be spelled out more clearly. The presence of confusion effects is an argument in favor of the close relationship between the memory networks for addition and multiplication facts; this close relationship may itself provide some bases for the similarity of the temporal
operating characteristics for the retrieval process that accesses addition and multiplication facts.

A second issue concerning other forms of mental arithmetic involves the influences of format. In most of our research on addition, we have presented addition problems in columnar form; others have used row formats (e.g., 2 + 4 = 6). Research on addition could well be guided toward determining the influence of columnar versus noncolumnar presentation formats on RT data; this research could then be extended to investigate similar effects on other forms of mental arithmetic. It would be interesting to find whether there are detectable effects on RT data for a given operation (e.g., addition) as a function of presentation format, whether there were differences across operations in the magnitude of format effects, and whether there were individual differences in the magnitude of format effects across operations. Such differences could well feed information back into our theories of the relative automatization of the several simple arithmetic operations.

A third issue is the relationship of models for addition and multiplication to appropriate models for the remaining operations, subtraction and division. This issue is a complex one, with several sides. First, it would be interesting to map out the relations between arithmetic facts for the several operations. Previous research has demonstrated strong relations between addition and multiplication facts; future research could well aim at determining whether strong relations exist between the long term memory networks for other pairs of operations, and whether the strength of these interoperation relations differs across individuals. Second, we have found that the general model for addition performance could easily be modified to represent responding to multiplication problems. However, modifying the model appropriately for subtraction or division problems may prove more difficult. Moreover, it is not clear how information processing models might have to be revised for problems under different presentation formats or paradigms. That is, we have found that similar models hold for verification and production task paradigms for addition. But, the same may not hold true for other operations, such as subtraction. Of course, the degree to which similar models hold across presentation format and task paradigm may well be an individual difference variable. If our goal is complete understanding of the manner in which individuals respond to different types of arithmetic problems, then such issues require research in order to verify the generalizability of our results across such seemingly unimportant aspects of the experimental situation as presentation format and task paradigm.
Educational Implications

The results reviewed in this chapter have at least two educational implications. One major implication of our results concerns the development of arithmetic skills during the early years of schooling. Our developmental growth curves (Widaman et al., in press; Little & Widaman, 1992), as well as those of others (e.g., Kail, 1986, 1988), suggest that addition processing skill develops quite rapidly during the first three or four grades of elementary school, after which improvement in basic adding skills is much less pronounced. From these findings, it is clear that educational influences will have their strongest impact during the first few years of school. However, our findings fail to provide definitive answers to other important and interesting educational questions. For example, does the most rapid improvement in arithmetic skills occur during the early elementary grades because of the particular readiness of children at those ages? Which teaching strategies may accelerate the automatization of arithmetic fact memorization (e.g., rote repetition, cognitive strategies, or others)? Which teaching strategies may be most effective with children who seem to be lagging behind their classmates in the automatization of number facts? Finally, educational efforts would be informed in important ways by a determination of the malleability of memory networks for arithmetic facts as a function of age. Our results, taken together with those of Kail, imply that the period of greatest change is during the early years of school. However, remedial efforts may still be quite successful if long term memory networks for arithmetic facts remain malleable beyond these early years of schooling.

A second educational implication of certain results discussed in this chapter is the need to consider at least the intra-individual context when a child is learning addition facts. Little and Widaman (1992) demonstrated that working memory capacity has an influence on addition performance for children in the elementary grades. This implies that effective teaching of arithmetic facts may require carefully tailoring instruction to the current capabilities of the individual child. If a child is having trouble with arithmetic problems, certain aspects of the problems may be capturing his/her attention. That is, certain aspects of the problem may require the child to expend a major share of his/her attentional resources on them. This might leave too little working memory capacity to focus on the remaining aspects of the problem, aspects to which one must attend fully in order to solve the problem. If this occurs, the most effective approach would be to fit the teaching of each problem to the capacities of the individual child. This might take several forms, including the stripping away of irrelevant details of problems so that the child is not distracted from the major
task at hand--the solution of the particular problem. After the child becomes able to handle such "stripped away" problems easily, other more difficult problem formats may be introduced. The key to this approach is to characterize well the capabilities of the individual child, including his/her capabilities in a number of related domains, in order to understand how the instructional goals may be most easily attained for that child. In closing, we trust that the present chapter has served its purpose, indicating the unique view on cognitive processes that accompanies an individual differences perspective. Further, we would second the view of Underwood (1975) that individual differences should be a crucible for theories of cognitive phenomena. If cognitive theories can begin to account for individual differences in addition to group-level phenomena, then our theories will have moved importantly along the path toward ultimate goodness of fit with the data to be explained.
ACKNOWLEDGEMENTS

The present work was supported in part by grants HD-21056 and HD-22953 from the National Institute of Child Health and Human Development and by intramural grants from the Academic Senate of the University of California, Riverside to the first author, by grants GOO8530208 and H023C80072 from the U.S. Office of Education (Donald MacMillan, Principal Investigator), and by computing grants from Computing and Communications, University of California, Riverside to both authors. The insightful comments by Jamie Campbell on a previous version of this paper are gratefully acknowledged. Many thanks as well are extended to Chris Strand for her invaluable help in preparing the final draft of this chapter. Parts of this chapter were written while the first author was on sabbatical leave at the Department of Psychology, University of Trier, Trier, Germany; the support, in various ways, provided there by Professor Leo Montada, Maria Haas, and Manfred Schmitt is also gratefully acknowledged. Requests for reprints should be sent to Keith Widaman, Department of Psychology, University of California, Riverside, CA 92521, or to Todd Little, Max Planck Institute for Human Development and Education, Lentzeallee 94, 1000 Berlin 33, Germany.
REFERENCES

Anderson, J.R. (1985). Cognitive psychology and its implications (2nd ed.). New York: W.H. Freeman.
Anderson, J.R., & Schooler, L.J. (1991). Reflections of the environment in memory. Psychological Science, 2, 396-408.
Ashcraft, M.H. (1982). The development of mental arithmetic: A chronometric approach. Developmental Review, 2, 213-236.
Ashcraft, M.H. (1983). Procedural knowledge versus fact retrieval in mental arithmetic: A reply to Baroody. Developmental Review, 4, 148-156.
Ashcraft, M.H. (1990). Strategic processing in children's mental arithmetic: A review and proposal. In D.F. Bjorklund (Ed.), Children's strategies: Contemporary views of cognitive development (pp. 185-211). Hillsdale, NJ: Erlbaum.
Ashcraft, M.H., & Battaglia, J. (1978). Cognitive arithmetic: Evidence for retrieval and decision processes in mental addition. Journal of Experimental Psychology: Human Learning and Memory, 4, 527-538.
Ashcraft, M.H., & Fierman, B.A. (1982). Mental addition in third, fourth, and sixth graders. Journal of Experimental Child Psychology, 33, 216-234.
Ashcraft, M.H., Fierman, B.A., & Bartolotta, R. (1984). The production and verification tasks in mental addition: An empirical comparison. Developmental Review, 4, 157-170.
Ashcraft, M.H., & Stazyk, E.H. (1981). Mental addition: A test of three verification models. Memory & Cognition, 9, 185-196.
Baddeley, A.D. (1983). Working memory. In D.E. Broadbent (Ed.), Functional aspects of human memory: Proceedings of a Royal Society discussion meeting held on 26 and 27 January 1983 (pp. 73-86). London: The Royal Society.
Baltes, P.B., & Nesselroade, J.R. (1979). History and rationale of longitudinal research. In J.R. Nesselroade & P.B. Baltes (Eds.), Longitudinal research in the study of behavior and development (pp. 1-39). New York: Academic.
Baroody, A.J. (1983). The development of procedural knowledge: An alternative explanation for chronometric trends of mental arithmetic. Developmental Review, 3, 225-230.
Baroody, A.J. (1985). Mastery of basic number combinations: Internalization of relationships or facts? Journal for Research in Mathematics Education, 16, 83-98.
Baroody, A.J. (1987). Children's mathematical thinking: A developmental framework for preschool, primary, and special education teachers. New York: Teachers College Press.
Baroody, A.J., & Ginsburg, H.P. (1986). The relationship between initial meaningful and mechanical knowledge of arithmetic. In J. Hiebert (Ed.), Conceptual and procedural knowledge: The case of mathematics (pp. 75-112). Hillsdale, NJ: Erlbaum.
Brunswik, E. (1952). The conceptual framework of psychology. Chicago: University of Chicago Press.
Campbell, D.T., & Fiske, D.W. (1959). Convergent and discriminant validation by the multitrait-multimethod matrix. Psychological Bulletin, 56, 81-105.
Campbell, J.I.D. (1987a). Network interference and mental multiplication. Journal of Experimental Psychology: Learning, Memory, and Cognition, 13, 109-123.
Campbell, J.I.D. (1987b). Production, verification, and priming of multiplication facts. Memory & Cognition, 15, 349-364.
Campbell, J.I.D. (1991). Conditions of error priming in number-fact retrieval. Memory & Cognition, 19, 197-209.
Cooney, J.B., Ladd, S.F., & Abernathy, S. (1989, April). The influence of verbal protocol methods on children's mental computation. Paper presented at the meeting of the American Educational Research Association, San Francisco.
Cooney, J.B., Swanson, H.L., & Ladd, S.F. (1988). Acquisition of mental addition skill: Evidence for the transition between counting and retrieval strategies. Cognition and Instruction, 5, 323-345.
Cornet, J., Seron, X., Deloche, G., & Lories, G. (1988). Cognitive models of simple mental arithmetic: A critical review. European Bulletin of Cognitive Psychology, 8, 551-571.
Dempster, F.N. (1981). Memory span: Sources of individual and developmental differences. Psychological Bulletin, 89, 63-100.
Estes, W.K. (1956). The problem of inference from curves based on group data. Psychological Bulletin, 53, 134-140.
Garner, W.R., Hake, H.W., & Eriksen, C.W. (1956). Operationism and the concept of perception. Psychological Review, 63, 149-159.
Geary, D.C., & Widaman, K.F. (1987). Individual differences in cognitive arithmetic. Journal of Experimental Psychology: General, 116, 154-171.
Geary, D.C., & Widaman, K.F. (1992). Numerical cognition: On the convergence of componential and psychometric models. Intelligence, 16, 47-80.
Geary, D.C., Widaman, K.F., & Little, T.D. (1986). Cognitive addition and multiplication: Evidence for a single memory network. Memory & Cognition, 14, 478-487.
Geary, D.C., Widaman, K.F., Little, T.D., & Cormier, P. (1987). Cognitive addition: Comparison of learning disabled and academically normal elementary school children. Cognitive Development, 2, 249-269.
Glass, A.L., & Holyoak, K.J. (1986). Cognition (2nd ed.). New York: Random House.
Groen, G.J., & Parkman, J.M. (1972). A chronometric analysis of simple addition. Psychological Review, 79, 329-343.
Hamann, M.S., & Ashcraft, M.H. (1985). Simple and complex mental addition across development. Journal of Experimental Child Psychology, 40, 49-72.
Hamann, M.S., & Ashcraft, M.H. (1986). Textbook presentations of the basic addition facts. Cognition and Instruction, 3, 173-192.
Horn, J.L. (1988). Thinking about human abilities. In J.R. Nesselroade & R.B. Cattell (Eds.), Handbook of multivariate experimental psychology (2nd ed., pp. 645-685). New York: Plenum.
Horn, J.L., & Cattell, R.B. (1982). Whimsey and misunderstandings of Gf-Gc theory: A comment on Guilford. Psychological Bulletin, 91, 623-633.
Humphreys, M.S., Lynch, M.J., Revelle, W., & Hall, J.W. (1983). Individual differences in short-term memory. In R.F. Dillon & R.R. Schmeck (Eds.), Individual differences in cognition (pp. 35-64). New York: Academic.
Hunt, E., Lunneborg, C., & Lewis, J. (1975). What does it mean to be high verbal? Cognitive Psychology, 7, 194-227.
Kail, R. (1986). Sources of age differences in speed of processing. Child Development, 57, 969-987.
Kail, R. (1988). Developmental functions for speed of cognitive processes. Journal of Experimental Child Psychology, 45, 339-364.
Kail, R. (1991a). Developmental change in speed of processing during childhood and adolescence. Psychological Bulletin, 109, 490-501.
Kail, R. (1991b). Processing time declines exponentially during childhood and adolescence. Developmental Psychology, 27, 259-266.
Kaye, D.B., deWinstanley, P., Chen, Q., & Bonnefil, V. (1989). Development of efficient arithmetic computation. Journal of Educational Psychology, 81, 467-480.
Krueger, L.E., & Hallford, E.W. (1984). Why 2 + 2 = 5 looks so wrong: On the odd-even rule in sum verification. Memory & Cognition, 12, 171-180.
Lamiell, J.T. (1981). Toward an idiothetic psychology of personality. American Psychologist, 36, 276-289.
Landauer, T.K. (1962). Rate of implicit speech. Perceptual and Motor Skills, 15, 646.
Little, T.D., & Widaman, K.F. (1992). A production task evaluation of individual differences in the development of mental addition skills: Internal and external validation of chronometric models. Manuscript under review.
Miller, K., Perlmutter, M., & Keating, D. (1984). Cognitive arithmetic: Comparison of operations. Journal of Experimental Psychology: Learning, Memory, and Cognition, 10, 46-60.
O'Neill, P.E. (1992). Parallel and serial search processes in visual attention: A multimethod evaluation. Unpublished manuscript, University of California at Riverside.
Parkman, J.M., & Groen, G.J. (1971). Temporal aspects of simple addition and comparison. Journal of Experimental Psychology, 89, 335-342.
Posner, M.I., Boies, S.J., Eichelman, W.H., & Taylor, R.L. (1969). Retention of visual and name codes of single letters. Journal of Experimental Psychology (Monograph), 79, 1-16.
Reinert, G. (1970). Comparative factor analytic studies of intelligence throughout the human life-span. In L.R. Goulet & P.B. Baltes (Eds.), Life-span developmental psychology: Research and theory (pp. 467-484). New York: Academic.
Restle, F. (1970). Speed of adding and comparing numbers. Journal of Experimental Psychology, 83, 274-278.
Russo, J.E., Johnson, E.J., & Stephens, D.L. (1989). The validity of verbal protocols. Memory & Cognition, 17, 759-769.
Siegler, R.S. (1987). The perils of averaging data over strategies: An example from children's addition. Journal of Experimental Psychology: General, 116, 1-15.
Siegler, R.S. (1988a). Individual differences in strategy choices: Good students, not-so-good students, and perfectionists. Child Development, 59, 833-851.
Siegler, R.S. (1988b). Strategy choice procedures and the development of multiplication skill. Journal of Experimental Psychology: General, 117, 258-275.
Siegler, R.S. (1989). Hazards of mental chronometry: An example from children's subtraction. Journal of Educational Psychology, 81, 497-506.
Siegler, R.S., & Shrager, J. (1984). Strategy choices in addition and subtraction: How do children know what to do? In C. Sophian (Ed.), Origins of cognitive skills (pp. 229-293). Hillsdale, NJ: Erlbaum.
Stazyk, E.H., Ashcraft, M.H., & Hamann, M.S. (1982). A network approach to simple multiplication. Journal of Experimental Psychology: Learning, Memory, and Cognition, 8, 320-335.
Sternberg, R.J. (1977). Intelligence, information processing, and analogical reasoning: The componential analysis of human abilities. Hillsdale, NJ: Erlbaum.
Treisman, A.M., & Gelade, G. (1980). A feature-integration theory of attention. Cognitive Psychology, 12, 97-136.
Underwood, B.J. (1975). Individual differences as a crucible in theory construction. American Psychologist, 30, 128-134.
Wheeler, L.R. (1939). A comparative study of the difficulty of the 100 addition combinations. Journal of Genetic Psychology, 54, 295-312.
Widaman, K.F., Geary, D.C., Cormier, P., & Little, T.D. (1989). A componential model for mental addition. Journal of Experimental Psychology: Learning, Memory, and Cognition, 15, 898-919.
Widaman, K.F., Little, T.D., Geary, D.C., & Cormier, P. (in press). Individual differences in the development of skill in mental addition: Internal and external validation of chronometric models. Learning and Individual Differences.
Wixted, J.T., & Ebbesen, E.B. (1991). On the form of forgetting. Psychological Science, 2, 409-415.
Zbrodoff, N.J., & Logan, G.D. (1990). On the relation between production and verification tasks in the psychology of simple arithmetic. Journal of Experimental Psychology: Learning, Memory, and Cognition, 16, 83-97.
Part Two
Numerical Cognition: Representation, Process, and Architecture
Chapter 7

A THEORY OF ENUMERATION THAT GROWS OUT OF A GENERAL THEORY OF VISION:
SUBITIZING, COUNTING, AND FINSTs
Lana M. Trick University of British Columbia
Summary
Enumeration is one of the most basic mathematical skills. Yet none of the theories of enumeration is adequate to explain human enumeration. Two types of enumeration must be explained. The first type, subitizing, is rapid (40-100 ms/item), effortless, and accurate, but specialized for 4 or fewer items. The second type, counting, can deal with more items but is slow (250-350 ms/item), effortful and error-prone. In this paper I will propose a theory that grows out of a general theory of vision, yet explains subitizing and counting. I argue that subitizing exploits a limited capacity preattentive mechanism for item individuation, Pylyshyn's FINST mechanism (Pylyshyn & Storm, 1988; Pylyshyn, 1989). FINSTs, for Fingers of INSTantiation, are tokens that are used to individuate a small number of items before the serial, area by area processing that characterizes spatial attention. Counting, in contrast, involves spatial attention. Two kinds of evidence support the claim that subitizing relies on a preattentive mechanism while counting requires spatial attention. First, whenever spatial attention is needed to compute a spatial relation (c.f., Ullman, 1984) or perform feature integration (c.f., Treisman & Gelade, 1980), subitizing does not occur, whereas it does occur in similar displays when the task does not require spatial attention (Trick & Pylyshyn, 1991a). This would be expected if subitizing relies on a preattentive mechanism, such as the FINST mechanism. Second, the position of the attentional focus affects counting latencies more than subitizing latencies (Trick & Pylyshyn, 1991b), as would be expected if counting involved moving the attentional focus from location to location in the image.
Introduction
Enumeration is one of the most basic mathematical skills, yet it is poorly understood. Consider a simple enumeration task, in which subjects have to say how many dots there are in a display, as fast as they can while remaining accurate. Reaction time is measured as a function of the number of items in the display. Typically, when this type of experiment is performed, a slope discontinuity in the enumeration function is evident. An idealized enumeration function is presented in Figure 1. When there are 4 or fewer items, the slope of the function is shallow. In most experiments, each additional item in the 1-4 range adds a constant 40-120 ms to enumeration latency. When there are more than 4 items, the slope jumps to 250-350 ms/item. Not only are there differences in the rate of enumeration between small and large numbers of items, but there are differences in accuracy. When there are 4 or fewer items, subjects rarely make mistakes. When there are more than 4 items, errors are much more common.
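This idealized pattern can be summarized, purely as an illustration, by a piecewise-linear latency function; the intercepts $a_s$ and $a_c$ are placeholders, and the slopes simply restate the ranges cited above:

\[
RT(n) =
\begin{cases}
a_s + b_s\, n, & 1 \le n \le 4 \quad (b_s \approx 40\text{--}120 \text{ ms/item})\\
a_c + b_c\, n, & n > 4 \quad (b_c \approx 250\text{--}350 \text{ ms/item})
\end{cases}
\]

The slope discontinuity at the boundary, here placed at 4 items, is the empirical signature of the two processes discussed below.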
Figure 1. Enumeration latency as a function of the number of items.

The differences in the subjective ease, accuracy and rate of enumeration have caused researchers to conclude that there are two enumeration processes. One process is specialized for small numbers of items and is effortless, fast and perfectly accurate. This process is called subitizing (e.g., Klahr, 1973a, after
Kaufman, Lord, & Reese, 1949). The second process can handle large numbers of items, but is slow, effortful and error-prone. This process is called counting. The "elbow" in the reaction time curve is taken to be the boundary between the subitizing and counting ranges. The point at which the function changes direction is thought to be the upper limit of the subitizing range. There is controversy about exactly how many items can be subitized, however; estimates of the subitizing range vary between 1-3 and 1-7. It is not surprising there is so little consensus. First, different researchers use different experimental paradigms, accuracy criteria, and, most important, different analytic procedures when calculating the subitizing range (e.g., Akin & Chase, 1978; Atkinson, Campbell, & Francis, 1976; Chi & Klahr, 1975; Jensen, Reese, & Reese, 1950; Mandler & Shebo, 1982; Oyama, Kikuchi & Ichihara, 1981). Second, there are developmental changes in how many items can be subitized. Svenson and Sjoberg (1978) suggest that the subitizing range extends from 1-3 in children to 1-4 in adults. In fact, there are individual differences even between adults (Akin & Chase, 1978). As a result, the enumeration function is seldom as clean as the one presented in Figure 1. There is no sharp break at 4, but rather a more rounded curve that breaks between 3 and 6. This rounding is the result of averaging data from subjects who subitize different ranges: some who subitize to 3, some to 4, some to 5, and some to 6. These disputes aside, almost every study shows discontinuities at some point in the enumeration function. Where that point falls depends on who the subjects are, how they are tested, and how the data are analyzed afterwards. For my purposes it does not matter exactly where the discontinuity falls; I am only concerned that there is a discontinuity. Throughout this paper I will refer to the subitizing range as 1-4, as have many others (e.g., Aoki, 1977; Klahr, 1973a; Oyama et al., 1981; Atkinson et al., 1976; Simons & Langheinrich, 1982). Although the idea that there are two enumeration processes is not new (e.g., Warren, 1897), no one has definitively explained why two enumeration processes are necessary.¹ The theories of enumeration are under-developed or flawed, and lack the generality to explain how we subitize and count all the things we subitize and count, as will be shown in the next sections. The goal of this chapter is to formulate a new theory of enumeration by considering enumeration in the context of a general theory of visual processing. In particular, I will argue that subitizing and counting are side effects of the way the visual system is built. The paper has three sections. In the first, I discuss the problems with the existing theories of enumeration. In the second, I talk about visual processing in general, and subitizing and counting as visual processes. I argue that subitizing exploits the FINST mechanism (Fingers of INSTantiation, Pylyshyn, 1989), a limited capacity mechanism for item individuation that operates before spatial attention--serial, area by area visual processing. Finally, I present experiments that demonstrate the relationship between subitizing, counting, and spatial attention.

¹ Van Oeffelen and Vos (1982a) have tried to explain subitizing in terms of Weber fractions and the "span of discriminability", appealing to the work of Thurstone (1929). This approach yields little insight into mechanism, however. What information permits the visual system to discriminate every instance of 3 items from every instance with any other number of items? How does the visual system pick up this information?

Three theories of enumeration
At the very least, a theory of enumeration must meet the following three criteria. First, it should explain why small and large numbers are treated differently: why can't we subitize any number of items? Second, the theory should have sufficient generality to explain how we subitize and count all the things we subitize and count. Finally, the theory should explain the slopes of the subitizing and counting functions. In the following sections the major theories of enumeration will be discussed in terms of these criteria.

Density based explanation

When there are large numbers of items in the display, items are on average closer together, or more densely packed, than when there are small numbers of items. Consequently, the spatial frequency of contours in the display is greater when there are large numbers of items than when there are small numbers of items. The density based explanation relies on this relationship between number and item density, or more accurately spatial frequency. Why can't we subitize any number of items? Atkinson, Campbell and Francis (1976) suggest that there are special neural units responsive to number when there are small numbers of items. They cite an article by Glazer, Ivanoff and Tscherbach (1973) in which cortical neurons in the cat responded selectively to low frequency gratings of a certain number of cycles. Atkinson et al. speculated that the reason we can only subitize small numbers is that there are no such number sensitive units for higher spatial frequencies. There is no mention of how higher numbers of items might be enumerated. Clearly, there is a relationship between number and contour density. Nonetheless, the density based account is inadequate to explain the flexibility of the enumeration process. In fact, from the Atkinson et al. article, it is difficult to
see how this account could handle anything but equisized, equispaced, linear arrangements of dots. There are two problems. First, contour density is not a good cue to number, given that most objects are composed of many contours and the number of contours per item varies. Nonetheless, we seem to be able to subitize and count objects as wholes; people aren't completely defeated by the task of enumerating Holstein cattle, for example. In fact, Akin and Chase (1978) demonstrated that subjects could not only enumerate multiple contour objects (drawings of three dimensional block figures, in this case), but subitized when these figures partly occluded one another. Second, enumeration is not limited to items made up of contours defined by simple illumination discontinuities. We can enumerate objects even though their boundaries are defined by motion, stereo, depth, and even illusory contours. Consequently, a more abstract formulation of the problem is necessary. The density based account gives no indication of where the subitizing slope comes from, let alone the counting slope.

Pattern based explanation
In dot displays, 1 dot forms a point, 2 dots can be connected by a line, and 3 randomly positioned dots often fall in a triangular configuration. The second explanation of subitizing exploits this relationship between number and geometry. According to Mandler and Shebo (1982), subjects use the "canonical pattern" of the items to determine how many items there are when there are small numbers of items in the display; we recognize the pattern of dots and respond with a number. For example, when we see a linear arrangement of items we say "two". Why can't we subitize any number of items? Pattern stops being a useful cue for number past 3. When there are large numbers of items, a different (and unspecified) process must be used. The pattern based account has intuitive appeal in that we can sometimes use pattern to avoid counting, when we read dice, for example. In addition, this account sidesteps the problem of multiple contour objects by assuming the scene has already been parsed into units; to "connect" items to form a pattern, the items have to be already defined.²

² In fact, this account also assumes that each item has been individuated. In order to form a polygon out of a number of different points, first the points have to be considered as separate entities from one another. You have to consider each point as an individual in order to form a polygon with each item as a vertex.
Nonetheless, the pattern based theory has problems as well. Pattern stops being a completely reliable cue for number at 2, and many argue that adults subitize to 4 (Aoki, 1977; Atkinson, Campbell & Francis, 1976; Klahr, 1973a; Oyama, Kikuchi & Ichihara, 1981; Simons & Langheinrich, 1982). Mandler and Shebo (1982) consider the subitizing range to be 1-3, but even at 3 occasional misleading linear arrangements occur. Linear arrangements of 3 items should be mistaken for 2 according to the pattern based theory, because the linear configuration is the configuration for 2. In fact, subjects have no particular difficulty with linear configurations of 3 (Trick, 1987). Subjects made no errors, and were non-significantly faster at enumerating these linear configurations than the "canonical" triangular patterns of 3. Mandler and Shebo (1982) also found no latency advantage for "canonical" triangular patterns of 3. The problem with the pattern based account of subitizing is that pattern is not a perfectly valid cue to number even in the 1-3 range. In some situations pattern will yield the wrong answer, and one of the defining features of subitizing is its accuracy. The pattern based account has a more general shortcoming, however. How do subjects prevent themselves from saying that any linear configuration of items is 2 and any triangular configuration is 3? Frick (1987) had his subjects enumerate linear arrangements of up to 20 items. Accuracy was high throughout the number range in this study, and subitizing was evident from the slope discontinuity in the latency function. In Frick's study pattern could not be a useful cue for number, because the items always fell in the (linear) configuration for 2. What causes the slopes in the subitizing and counting functions? Mandler and Shebo don't discuss this at length, but the association of subitizing with pattern recognition implies a linear function in recognition, in which pictures of a dot are identified more rapidly than pictures of a line, and pictures of a line are identified more rapidly than pictures of a triangle. There is no explanation of the counting slope.

Working memory explanation
A third explanation for why we need two enumeration processes has to do with memory limitations. People can only hold so many items in working memory at once. If the number of items in the display exceeds the memory capacity, the subject must make successive "trips" to the display. This idea emerged early, with philosophers such as Sir William Hamilton (1860), who stated it in terms of the amount of information that could be held simultaneously "in consciousness". Although the memory based account of enumeration is predicated on a rather crude structural model of working memory, it seems plausible. Limitations in the
number of items that can be kept in working memory are well documented (e.g., Miller, 1956). Though "the magical number 7" is a little distant from most estimates of the subitizing range, there is enough controversy about the precise range to make memory limitations a plausible account of subitizing, as discussed previously. In addition, the theory is certainly broad enough in scope: subitizing should always be evident no matter what is being enumerated. In fact, this theory is sufficient to encompass both spatial enumeration, with units such as items defined across space (e.g., dots in a display), and temporal enumeration, with units such as events defined across time (e.g., successive tones or flashes of light at one location). What produces the subitizing and counting slopes? Klahr (1973b) associated the subitizing slope with Sternberg's short term memory scanning rates of 40 ms/item. The subitizing slope came partly from the need to pair items with memory representations for number names in his model. Counting was assumed to involve a number of additional operations: marking off items as counted, reloading the memory buffer ("noticing"), and performing additions, which have been shown to take longer as the size of the addends increases (c.f., Parkman & Groen, 1972). The working memory based theory is the best articulated of the three theories of enumeration. It has limitations as well, however. First, it may not be an advantage to have a theory so general that it explains event and item enumeration in the same way. Although both event and item enumeration involve making a numeric response, there is evidence to suggest that the processes are different. The subitizing range seems to be different for event than for item enumeration, though this conclusion must be made with trepidation given that we don't know how temporal and spatial resolution correspond. Nonetheless, the subitizing range, as measured by the range of perfect accuracy, seems to be smaller for event enumeration than for item enumeration. For example, Taubman (1950a; c.f. 1950b), who was interested in both auditory (tone) and visual (light flash) Morse code, reported the difficulties that even experienced operators have mistaking the signal for I ("..") with the signal for S ("...") and mistaking the signal for D ("-..") with the signal for B ("-..."). This tendency to confuse 2 for 3 was resistant to corrective training. It is possible that people are capable of subitizing only 1 item with perfect accuracy in event counting, whereas in many spatial (visual) enumeration studies subjects seem capable of subitizing up to 4 with perfect accuracy (e.g., Atkinson et al., 1976). In addition, there are indications that certain factors influence item enumeration later in the number range than event enumeration. For example, Frick (1987) and
Mandler and Shebo (1982) found that the heterogeneity of the items had little effect within the 1-4 range for item enumeration. In contrast, heterogeneity may assume importance even within the 1-4 range for event enumeration. Taubman (1950a, 1950b) reported that Morse code operators had particular trouble with signals that had successive tokens of the same type, especially if the tokens were dots (e.g., "..", "-.."). Signals with alternating dots and dashes, e.g., ".-.", were less difficult.³ Similarly, memory load seems to have an effect later for item enumeration than for event enumeration. Logie and Baddeley (1987) investigated the effects of various types of memory load on enumeration accuracy. In particular, they were interested in the articulatory suppression task, in which subjects were required to pronounce "the the the" while enumerating. In one experiment subjects performed a spatial enumeration task, enumerating dots laid out across the screen at one time. In another, subjects performed a temporal enumeration task, enumerating successive flashes of light at one location. In both, subjects were required to make manual responses. For spatial enumeration, none of the distractor tasks had any effect on enumeration until there were more than 7 items. For temporal enumeration, distractor tasks, and in particular articulatory suppression, had a noticeable effect even at 1. Consequently, it may not be an advantage to have a theory so general that it explains item and event enumeration by appealing to the same mechanism. There are hints that the subitizing range is not the same for event and item enumeration, and factors like item heterogeneity and memory load have an effect earlier for event than for item enumeration. The second major problem with the working memory based account of subitizing is that there is no evidence that memory load has any effect on subitizing or the subitizing range. If subitizing occurs because there are more "slots" in memory than things to enumerate, increasing the memory load should make subitizing impossible. At the very least, memory load should affect the size of the subitizing range. It should not be possible to subitize as many items as usual if memory capacity is used up by extraneous information. There are two ways that the memory load might be increased: either by adding "memory filler" tasks or "memory filler" distractor items. Logie and Baddeley (1987) showed that adding a memory filler task had no effect on spatial subitizing, as discussed before. Trick and Pylyshyn (1991a) demonstrated that subitizing was still possible even if the "visual" memory load of the task was increased by adding
³ This difficulty with successive event tokens of the same type may be related to repetition blindness, although repetition blindness also occurs when there are intervening events of different types (c.f., Kanwisher, 1987).
distractor items. Subjects were perfectly capable of subitizing O's in a background of 4 X's, for example. (They were not able to subitize O's in Q's, however, though there is no reason that O's should produce more memory load than X's.) To this point, no one has been able to show that memory load has any effect on the ability to subitize. The counting process would be expected to require working memory, both for storing addition subtotals and for keeping track of already counted items. Nonetheless, memory limitations cannot explain why we can only subitize small numbers of items, because memory load seems to have no effect on the ability to subitize.

Subitizing and counting in the context of vision
How can subitizing be explained? I will be arguing that the ability to subitize is a side effect of the way that parallel (preattentive) and serial (attentive) stages of processing are coordinated in visual analysis. In particular, I will suggest that subitizing makes use of a limited capacity mechanism for individuating a small number of feature clusters from preattentive analysis for access by the attentional focus. To develop this argument I will talk briefly about a general framework for understanding vision that incorporates work from Marr (1982), Treisman et al. (e.g., Treisman & Gelade, 1980), Pylyshyn (1989), and Ullman (1984). Second, I will relate this framework to the phenomena of subitizing and counting. Finally, I will present a number of experiments that suggest a relationship between subitizing, counting and spatial attention.

Three stages of visual analysis

1. Preattentive Analysis. This first stage is primarily involved in finding edges or boundaries. This stage can be seen to have two parts: assigning place tokens to discontinuities in the image, which are often the diagnostics of edges, and grouping these tokens (c.f., Marr, 1982). A number of different types of discontinuity can define an edge. For example, discontinuities in the intensity, color, motion, or binocular disparity of adjacent areas in an image give rise to the impression of an edge. The process of deriving discontinuities from the image has been referred to as feature registration in the attention literature (e.g., Treisman & Gelade, 1980), where features are properties such as brightness, color, orientation, curvature, etc. Feature discontinuities often signal the presence of an edge. For example, a black area next to a white area defines an edge. Feature registration is a classic example of a preattentive process because it occurs before the operation of attention.
Once discontinuities are found, the tokens are organized into groupings called feature clusters, on the basis of proximity, similarity, good continuation and common fate. The tendency to group adjacent similar edges causes effortless texture segregation. Subjects see an implicit boundary surrounding items with different features than the rest of the display. As a result, subjects can easily indicate which corner of the display has different items than the others. These discrepant items can be seen to define a shape or area that segregates away from the rest of the items in the display (c.f., Beck, 1982; Julesz, 1984). Grouping can occur at many levels. For example, the place tokens associated with a number of tiny vertical bars may be grouped into a feature cluster, and assigned one token, in this case a larger horizontal bar. Similarly, tokens associated with a group of these horizontal bars may be grouped into a feature cluster, and assigned one token, a yet larger vertical bar. These image segmentation processes also occur before serial analysis, or the operation of attention. Thus, even at the preattentive level the visual system is forming units--feature clusters. These units are represented by place tokens at a given resolution in the representation. In many cases these feature clusters will correspond to "items", certainly in the simple displays used in most enumeration or search studies. In cases where there are camouflaging surface markings, or where items are in atypical configurations (e.g., concentric items, Saltman & Garner, 1948), one token may be assigned to a "unit" based on the edges from several different objects, however. To preattentive vision a unit is any grouping of contours, where contours are grouped by the Gestalt properties of proximity, similarity, good continuation and common fate. There is no guarantee that contour groupings derived by preattentive vision correspond to items, though in many cases they might. Preattentive processes have two important characteristics. First, the analyses are assumed to be spatially parallel. That is, the processes go on at every location in the image at the same time. For this reason, subjects can indicate the presence of an item that has different features from others (e.g., a red item in a field of blue items) in a time independent of the number of items in the display, as shown in Treisman's search studies (e.g., Treisman & Gelade, 1980). Similarly, the grouping processes responsible for texture segregation operate at every location in the display at the same time; there is no need to scan the display location by location to find the boundary between discrepant items. Second, low level processes are thought to be data-driven. Consequently, the output is determined by the retinal input, and is unaffected by the subject's goals and beliefs. The analyses will take place whether the subject intends them to or not.
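To make the idea of grouping place tokens into feature clusters concrete, here is a toy sketch in C of grouping by proximity alone. It is an illustration under stated assumptions, not a model of preattentive vision: the Token type, the THRESH grouping distance, and the single-link merging scheme are all invented for the example, and similarity, good continuation, and common fate are omitted.

    #include <math.h>
    #include <stdio.h>

    #define THRESH 10.0   /* hypothetical grouping distance, in pixels */

    typedef struct { double x, y; int cluster; } Token;

    static double dist(Token a, Token b) {
        return hypot(a.x - b.x, a.y - b.y);
    }

    /* Merge tokens into clusters: any two tokens closer than THRESH end
       up with the same cluster label (single-link connected components). */
    void group_by_proximity(Token t[], int n) {
        for (int i = 0; i < n; i++) t[i].cluster = i;  /* each token alone */
        int changed = 1;
        while (changed) {           /* repeat until no pair can be merged */
            changed = 0;
            for (int i = 0; i < n; i++)
                for (int j = i + 1; j < n; j++)
                    if (t[i].cluster != t[j].cluster &&
                        dist(t[i], t[j]) < THRESH) {
                        int from = t[j].cluster, to = t[i].cluster;
                        for (int k = 0; k < n; k++)
                            if (t[k].cluster == from) t[k].cluster = to;
                        changed = 1;
                    }
        }
    }

    int main(void) {
        Token t[] = { {0, 0, 0}, {3, 4, 0}, {50, 50, 0}, {52, 51, 0} };
        group_by_proximity(t, 4);
        for (int i = 0; i < 4; i++)      /* two clusters of two tokens */
            printf("token %d -> cluster %d\n", i, t[i].cluster);
        return 0;
    }

The sketch is serial where preattentive vision is claimed to be parallel; the point is only that clusters can emerge from local relations among tokens, with no attentional scan from location to location.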
2. Attentional Analysis / Visual Routines. The second stage of analysis involves computing properties defined over objects rather than retinal locations, and constructing structural descriptions of objects. This is the stage at which Ullman's (1984) Visual Routines operate to compute number and spatial relations such as inside, connected, etc. Ullman argues that number and spatial relations by their nature require spatially serial processing because they have an infinite support set; there is no one place in the image to look, or one configuration to check for, to determine if these properties exist in the image.⁴ Routines work by moving a processing focus through the image, location to location. The processor only works on one area at a time. Ullman associates this processing focus with spatial attention, or what has also been called the spotlight of attention (Posner, Snyder, & Davidson, 1980; Eriksen & Hoffman, 1972, 1973; Jonides, 1980; LaBerge, 1983) and the locus of feature integration (Treisman & Gelade, 1980). Ullman suggests five elementary operations that can be performed using the attentional focus: indexing, marking, boundary tracing, scanning and coloring. Visual routines are programs made from these operations. I will discuss indexing, scanning and marking in detail because they will be of importance in upcoming discussions. For Ullman, indexing is the process of moving the attentional focus toward a feature cluster or discontinuity in the image.⁵ Subjects index when they move their attentional focus towards a specific item, e.g., an item that differs from all the others in a search task. Similarly, indexing occurs in peripheral cuing tasks when subjects are required to move their attentional focus towards a flashing marker that is placed near where the target is to appear. Indexing can be contrasted with scanning, in which the attentional focus is moved, but there is no specific visible destination for the focus to move towards. Scanning would occur in central cuing tasks, in which a central arrow indicates the direction the attentional focus is to be moved, but there is no item near the desired destination of the attentional focus at the time of cuing.
⁴ In principle it is possible to build a parallel network to compute the number of dots in a display. In the general case, in which items may overlap, and have different sizes, shapes, colors and brightness, this sort of mechanism would not work (Minsky & Papert, 1969).
⁵ Pylyshyn and Ullman use the word "index" in different ways. Pylyshyn uses "index" to refer to the process of assigning an index (pointer) to a certain feature cluster, which would be a preattentive operation. Ullman's sense of "index" is clearly attentional; indexing refers to a particular style of moving the attentional focus. Ullman's "indexing" assumes the ability to index in Pylyshyn's sense, however. For the sake of clarity, I will stick with Ullman's definition of indexing throughout this chapter because the discussion began with Ullman's Visual Routines.
Marking is the process of indicating the items that have already been visited by the attentional focus. Without the ability to mark, people could go on forever, enumerating the same items again and again. Similarly, without the ability to mark, subjects could go on indefinitely looking for an item not present in the display in search. There are two important characteristics of visual routines and attentive processes in general. First, the processes are spatially serial; the attentional focus can only be at one place at a time. For this reason, latencies to determine spatial relations vary with the complexity of the display. For example, the time required to decide whether two items are connected varies with the contour distance between the items (Jolicoeur, 1988), because the focus can only be at one location on the contour at a time. The attentional focus must be moved along the connecting contour in order to determine if two items are connected. Similarly, in search tasks, deciding that an item is both green and vertical (in a display of green horizontal and white vertical items) requires going through the display, location to location, item by item. Treisman's feature integration processes are assumed to employ the same spatially serial attentional focus that Ullman's routines do, and the attentional focus can't be everywhere at once. Because different items occupy different locations, spatial seriality often amounts to a limitation in the number of items that can be processed at once. Processing is done one item at a time. Second, visual routines are largely goal driven. The output of this intermediate level of analysis depends on the intention of the viewer. For example, in a forest you don't automatically count trees. You only count when you want to count.

3. Visual Cognition. Object recognition and classification occur at this final stage, the processes involved in "seeing as". Structural descriptions created in the earlier stages of analysis are matched to memory representations for particular objects or classes of objects. If a match is found then the item can be named or categorized. This stage is not strictly necessary for vision or visual motor coordination. We can see and discriminate items from one another even if we cannot recognize or name them. We can touch objects and move around them even if they are unfamiliar. This final stage of visual analysis is heavily influenced by beliefs. Consequently, whether you recognize a particular object "as" a Holstein depends on what you believe a Holstein to be. Similarly, whether you see a display as having "four" depends on how many items you think there have to be in a display to correspond to the number name "four".
Subitizing, counting and visual routines
It makes sense that number might be computed by a serial, goal driven mechanism such as a visual routine, because enumeration is not performed automatically, and the process takes place in a time that is dependent on the number of items in the display. By augmenting Ullman's operations, indexing and marking, with memory functions entailing number name retrieval and addition, a visual routine for number can easily be created. First, a memory counter would have to be initialized to zero. Then a cycle of indexing, augmenting the counter by one, and marking the item as counted would be performed until all the items are marked. The final value of the memory counter would correspond to how many items there are in the display (the sketch following this passage renders this loop in code). This does not seem to be the way people enumerate, however. If it were, then the reaction time should increase linearly with the number of items in the display; the latency difference between 9 and 10 should be more or less the same as that between 1 and 2. The research on subitizing and counting shows that this is clearly not the case; the slope in the 1-4 range is typically only 40-120 ms/item whereas the slope for 5 or more is 250-350 ms/item. Given the discussion so far, and even given the research on mental arithmetic (e.g., Parkman & Groen, 1971), there is no reason that there should be a sudden discontinuity in the slope after 4.

How can the research on subitizing and counting be reconciled with Ullman's Visual Routines? To answer this question, it is necessary to consider what Ullman's indexing operation entails. Assume that there is an operation, INDEX, that takes as its argument information that would allow the attentional focus to be moved towards a particular item in the visual array. Suppose that the attentional focus is currently at the point denoted as a in Figure 2. The task is to move the attentional focus to the point denoted b. How could indexing be accomplished? One possibility is that the processor be sent to a location defined by the retinal features of the item, e.g., INDEX(small grey dot). This strategy would not work, however, if there were more than one token of the same type, i.e., if there were more than one small grey dot, as in Figure 2. Moreover, in the world, the retinal properties of an item might change from moment to moment as a result of changes in lighting or projection or changes in the item itself. A dot might of its own accord change brightness or size, for example. For this reason, the indexing operation cannot use properties as arguments. There needs to be a way to refer to an item, maintaining its identity even if it changes in its properties: it is the same "ONE" although it used to be little and grey and now it is big and black.
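The following is a minimal sketch, in C, of the serial counting routine described above. It is an illustration only: the Item type and the next_unmarked() helper are hypothetical stand-ins for Ullman's indexing and marking operations, not claims about the visual system.

    #include <stdio.h>

    typedef struct { int marked; } Item;

    /* Hypothetical stand-in for INDEX: move the "attentional focus"
       to some as yet unmarked item, returning its position. */
    static int next_unmarked(const Item items[], int n) {
        for (int i = 0; i < n; i++)
            if (!items[i].marked) return i;
        return -1;                   /* every item has been marked */
    }

    int count_items(Item items[], int n) {
        int counter = 0;             /* memory counter initialized to zero */
        int i;
        while ((i = next_unmarked(items, n)) != -1) {
            counter++;               /* augment the counter by one */
            items[i].marked = 1;     /* mark the item as counted */
        }
        return counter;              /* final value = how many items */
    }

    int main(void) {
        Item display[5] = { {0}, {0}, {0}, {0}, {0} };
        printf("%d items\n", count_items(display, 5));  /* prints "5 items" */
        return 0;
    }

Written this way, the routine visits items strictly one at a time, so its latency can only grow linearly with the number of items; this is precisely the prediction that the subitizing data contradict.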
Figure 2. Moving the attentional focus.
Another possibility is that retinal coordinates be used as arguments for the INDEX operation, e.g., INDEX(25,35). The problem with this strategy is that the retinal coordinates of the item change with eye movement. For example, item b might fall at the same retinal location as e if the eyes were moved to the left. Moreover, in the world, items move independently. Objects roll and fly and shift and bounce. Thus, the position of item b might change even if eye position is constant. If coordinates are used as arguments there is a danger of sending the processing focus to a retinal location that no longer houses an item, or houses a different item than the one intended. There needs to be a way to refer to an item independent of its location, so that its identity will be maintained even though its position changes; it is the same "ONE" despite the fact that it was once at (25,35) and now it is at (63,90). Granted, in visual displays, in which items are laid out in space, differences in location initially define an item as distinct from other identical items. Nonetheless, it would be a bad idea to refer to that item by its location, because items move. Specifically, there has to be a way of referring to items, individuating them, in the same way that I did when I labelled the points a, b, and c. What is more, these labels must stay with their respective items though the properties and positions of the items change. Thus, between the preattentive and attentive stages of visual analysis, there is need for a stage in which certain feature clusters are individuated, or named, in such a way that their retinal position can be accessed by the attentional focus. This is where Pylyshyn's (1989) FINST hypothesis fits in. FINSTs, for Fingers of INSTantiation, are reference tokens that permit access to certain places in the representation. FINSTs provide a way of referring to a feature cluster without explicitly specifying properties (e.g., the black one) or coordinates (e.g., the item at (25,35)), in the same way that a pointing finger allows you to specify "THAT" one without detailing properties or the specific location. Individuation is a necessary precursor to Ullman's indexing operation. In order to move the attentional focus to a particular item, there must be a way to specify WHICH item. Without this ability the intelligent movement of the attentional focus from item to item would not be possible. The attentional focus would not end up where it was intended to go. It is not enough to individuate and track one item, however. Computing spatial relations such as connected, inside, or ahead of requires that several items be individuated at once. For example, in Figure 2, in order to determine whether a certain line (X) connects two particular points (e.g., Y and Z), all three have to be individuated at once. Similarly, in order to determine if a certain runner is "ahead of" another runner in a race, both runners must be individuated and monitored at once. One study directly supports the contention that we can individuate and monitor a small number of items preattentively: the multiple target tracking experiment of Pylyshyn and Storm (1988). In it, subjects were faced with up to 10 identical objects. For a brief time a subset of these objects flashed. Subjects were required to treat this subset as target items, and the rest as distractors. The targets stopped flashing, and became identical to the distractors once more. Then, all the items were set into random motion. Each item, target or distractor, moved independently of the others. Each item would randomly change speed and direction every 100 to 150 ms. After an unpredictable amount of time, one of the items changed shape. The subjects' task was to decide whether the object that changed shape was a target, a distractor, or a new item in the display. The number of targets and distractors was varied, along with the tracking time and the average rate of motion. In this experiment subjects successfully tracked up to 4 or 5 target items among 5 distractors, even when the item movement was so rapid and erratic that attentive scanning from target to target was impossible, given current estimates of the speed at which the attentional focus can be moved (c.f., Posner, Snyder & Davidson, 1980).

The FINST hypothesis

The FINST hypothesis was initially motivated by work on a computer system designed to create diagrams from simple instructions, and then reason on the basis of information in the diagram (Pylyshyn, Elcock, Marmor, & Sander, 1978a, 1978b). As work progressed the need for an item individuation process became
apparent, and moreover it became evident that individuation would be best accomplished by internal reference tokens. FINSTs, for Fingers of INSTantiation, are mental reference tokens. Each token can be assigned to an "item", a feature cluster derived by preattentive vision. FINSTs do not encode the properties of the items they refer to; they just make it possible to examine the properties if needed. FINSTs are simply names, symbols that are bound to feature clusters. These "names" permit us to consider each item as a distinct unit even among others with the same properties. These "names" permit us to think of an item as the same "ONE" even though its properties and position change from moment to moment. Moreover, they enable our attentional operations and motor commands to refer to the cluster and access its location in the visual array. In fact, these "names" would function in much the same way as pointer variables do in computer languages such as Pascal or C. A pointer variable stores the memory location of a variable rather than the value of a variable. Assigning a FINST is simply the process of variable binding (a short illustration in C appears at the end of this section, below). Updating FINST pointer variables would involve no less than the moment to moment solution of the motion correspondence problem. FINSTs are assigned to a small number of feature clusters after edge detection and grouping, but before attentional processing (Pylyshyn, 1989). Consequently, FINSTs are assigned at a stage midway between attentive and preattentive analysis, as shown in Figure 3. This stage represents an intermediate step because it is spatially parallel, like preattentive analysis, yet limited capacity, like attentive analysis. There are only a small number of FINSTs; multiple target tracking and number discrimination experiments suggest that there are at least 4, however (Pylyshyn & Storm, 1988; Sagi & Julesz, 1984, respectively). This capacity limitation may serve to simplify the motion correspondence problem, the problem of determining which item at time 1 corresponds to which item at time 2 in a multi-item display in which items move independently (c.f., Ullman, 1981; Pylyshyn & Storm, 1988). The difficulty of solving the correspondence problem skyrockets as the number of items increases, because there is a combinatorial explosion in the number of potential matches that have to be considered. This combinatorial explosion could be prevented if only a small number of items are tracked at once. Moreover, the FINST stage represents an intermediate step because, like preattentive vision, it is primarily data driven, but nonetheless it is subject to the influence of goals and beliefs in certain situations. Goals could come into play in three ways. First, FINSTs may be assigned selectively to items with certain features. For example, at the beginning of each multiple target tracking trial subjects selectively assigned their reference tokens to the target items, the items that were flashing. Luminance transients (though not color transients) act as a
Subitizing & Counting
273
Visual cognition
0
i Serial attentive stage of visual processing (visual routines)
I FINST mechanism]
1
Parallel preattentive stage of visual processing
Figure 3. The FINST mechanism in visual processing.
kind of feature (Pylyshyn & Burkell, 1990), so the flashing items had different features than their non-flashing companions. Similarly, color is thought to be a feature (Treisman & Gelade, 1980). Subjects might assign FINSTs to items of particular color, e.g., the red items but not the green items. Such selective assignment of FINSTs would only be possible when the target items differed from distractors on the basis of a feature, i.e., FINSTs can only be assigned selectively when the targets and distractors differ on the basis of preattentive information. This is because FINSTs must be assigned before the operation of spatial attention to be useful for the tasks Pylyshyn (1989) suggests for them. Second, subjects may select what "resolution" they want to work at. Discontinuities can be defined at any number of levels; what constitutes a "unit" depends on what the subject decides the unit of interest will be, however. For example, we can decide to enumerate the lines, equal signs or boxes in Figure 4. Third, subjects must be able to assign and reassign FINSTs at will in order to deal with displays with large numbers of items. Although goals may influence how FINSTs are assigned in these three ways, the process of assigning FINSTs is data driven because FINSTs can only be assigned to discontinuities in the image.6
6 This may explain why people lose sense of where their eyes are focused in a Ganzfeld.
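The pointer analogy lends itself to a concrete illustration. The sketch in C below is mine, not part of the FINST model: the Cluster type, the scene array, and the particular bindings are invented stand-ins for whatever representation preattentive vision actually delivers.

```c
#include <stdio.h>

#define NUM_FINSTS 4   /* tracking and discrimination data suggest at least 4 tokens */

/* A feature cluster delivered by preattentive vision; its position and
   properties may change from moment to moment. */
typedef struct {
    double x, y;   /* current location in the visual array */
    char color;    /* 'r', 'g', ... (not stored in the FINST itself) */
} Cluster;

int main(void) {
    Cluster scene[6] = {
        {1.0, 2.0, 'r'}, {4.0, 1.0, 'r'}, {2.5, 3.0, 'g'},
        {5.0, 5.0, 'g'}, {0.5, 4.5, 'r'}, {3.0, 0.5, 'g'}
    };

    /* Assigning a FINST is variable binding: each token records WHICH
       cluster it designates, not what that cluster is like. */
    Cluster *finst[NUM_FINSTS] = { &scene[0], &scene[2], &scene[4], &scene[5] };

    /* The items move and change color; the bindings are untouched, so each
       token still designates the same "ONE".  (Keeping a token on the same
       physical item as it moves is the motion correspondence problem; with
       only NUM_FINSTS tokens the number of candidate matches stays small.) */
    scene[0].x += 1.5;
    scene[0].color = 'g';

    /* Attentional operations can reach an item's current location through
       the token, without the token ever having encoded it. */
    for (int i = 0; i < NUM_FINSTS; i++)
        printf("FINST %d -> item at (%.1f, %.1f)\n", i, finst[i]->x, finst[i]->y);
    return 0;
}
```

The point of the sketch is only that a bound pointer gives access to an item's current properties without storing any of them, which is the division of labor the FINST hypothesis proposes.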
[Figure 4 about here: the same display parsed at different levels, as lines, equal signs, or boxes.]
Figure 4. Enumerating at different levels of analysis.

Thus, FINSTs can be used to accomplish two functions. First, and most obviously, FINSTs can be used to individuate items from one another, even if the items are identical, even if the items move and change from moment to moment. Second, because FINSTs can be assigned selectively, FINSTs can be used to select a subset of the items with certain properties, at certain resolutions, or in certain locations. Recent studies of "guided search" (Egeth, Virzi, & Garbart, 1984; Wolfe, Cave, & Franzel, 1989; Treisman & Sato, 1990) suggest that subjects only search a relevant subset when looking for an item with a conjunction of features (e.g., red vertical) among distractors that have one but not both features (e.g., items that are red horizontal or green vertical). Subjects might only search the red items, for example. In order to perform guided search there needs to be some way to select the relevant subset before the attentional focus is moved from item to item, either by activating the relevant subset of items (cf. Wolfe, Cave, & Franzel, 1989), or by inhibiting an irrelevant subset (Treisman & Sato, 1990).

FINSTs are assigned to place tokens that correspond to discontinuities or feature clusters (groupings of discontinuities) in the image. These tokens are assigned to units that may not necessarily correspond to items in the world. Preattentive vision groups contours into clusters on the basis of the Gestalt grouping laws. In most simple enumeration and search displays these groupings would correspond to items. In some situations, however, contours from different items may be grouped into a single unit. Attentional processing would be required in such cases to ensure that all the contours originate from the same object (Ullman, 1984).

Although this discussion has focused on cases in which FINSTs are assigned to as yet unattended items, FINSTs can also be assigned to items that have already
been attended (marked) but are no longer the focus of analysis. Regardless, only FINSTed items can be further accessed by the attentive processes employed by visual routines. Furthermore, according to Pylyshyn, only FINSTed items can be further accessed by motor commands permitting eye or finger movements. FINSTs are thus predicted to be important for visual-motor coordination as well as spatial attention.7

The FINST hypothesis contributes to attention research by rendering explicit an assumption of a number of enterprises, including peripheral cuing research (Tsal, 1983), Feature Integration Theory8 (Treisman & Gelade, 1980), and Ullman's Visual Routines (1984; cf. Jolicoeur's (1986, 1988) boundary tracing experiments) and Structure from Motion theorem (1981). Specifically, these theories assume we can individuate, and treat as distinct, specific items in the display without attending to them first. Nonetheless, the importance of individuation, "distinguishing different tokens of the same type", has just begun to be appreciated in the visual attention literature. For example, Mozer (1989) discusses the problem of item individuation, as Kanwisher (1987) does for event individuation in RSVP (Rapid Serial Visual Presentation) of words. Only FINST theory, however, is capable of explaining how we can individuate and keep distinct a number of identical items even as they move and change their properties. Moreover, only FINST theory predicts individuation must occur before, not as a result of, the operation of spatial attention. Finally, only FINST theory predicts that a small number of items can be individuated and tracked at once.

In order to find evidence of a mechanism that individuates small numbers of items at once it is necessary to look to tasks where there is more than one target item. For this reason tasks such as multiple target tracking and enumeration are important for the substantiation of the FINST hypothesis. In particular, enumeration is important because in the subitizing and counting literature there is long-standing evidence of small numbers of items being processed differently than large.
7 Ultimately, the FINST model might be extended to encompass the integration of spatial information from different sensory modalities, e.g., vision, touch, and hearing. At this point, however, the FINST hypothesis has only been developed to cover the case of vision.
8 The FINST hypothesis can also be related to the idea of object files (Kahneman & Treisman, 1984; Kahneman, Treisman, & Gibbs, 1990) in that FINSTs allow an object file to be built; in order to start incorporating information from a number of different feature maps, using the attentional focus, a discontinuity on the master map must be individuated, and its position must be updated if the item moves. The FINST hypothesis represents a refinement on Treisman's idea in that it suggests that only a small number of object files can be started at once.
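The subset-selection function described above, assigning tokens only to clusters that carry a preattentively available feature and then searching just that subset, can be sketched as follows. This is a loose illustration of the guided-search idea under the FINST account as read here, not code from any of the cited studies; the Item type and the display contents are invented.

```c
#include <stdio.h>

#define NUM_FINSTS 4

typedef struct { double x, y; char color; int vertical; } Item;

int main(void) {
    /* A conjunction-search display: the target is red AND vertical. */
    Item display[8] = {
        {0,0,'r',1}, {1,0,'g',1}, {2,0,'r',0}, {3,0,'g',0},
        {0,1,'g',1}, {1,1,'r',1}, {2,1,'g',0}, {3,1,'r',0}
    };

    /* Selective assignment: bind tokens only to clusters carrying the
       preattentively available feature (here, the color red). */
    Item *finst[NUM_FINSTS];
    int bound = 0;
    for (int i = 0; i < 8 && bound < NUM_FINSTS; i++)
        if (display[i].color == 'r')
            finst[bound++] = &display[i];

    /* Serial attentive search then visits only the FINSTed subset rather
       than every item in the display. */
    for (int i = 0; i < bound; i++)
        if (finst[i]->vertical)
            printf("conjunction target at (%.0f, %.0f)\n", finst[i]->x, finst[i]->y);
    return 0;
}
```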
Subitizing and counting according to the FINST hypothesis
Why can we only subitize a small number of items? The system that individuates feature clusters by binding them to reference tokens has limited resources; there are only a few reference tokens (FINSTs). A different process must be employed if all the FINSTs are assigned. The following sections detail the processes which may be involved in subitizing and counting.

Subitizing. Subitizing occurs when people decide to enumerate, and the number of items is less than the number of internal reference tokens or FINSTs. Subitizing can be seen to have two stages. The first is variable binding. One FINST is assigned to each item in the display, where what constitutes an item is determined by low level feature detection and grouping processes. In most enumeration tasks FINSTs will have to be assigned selectively, to feature clusters with certain properties, for example, the white dots in the display and not the contours that correspond to the edge of the screen. This is because in all but the most austere images there will always be more feature clusters than FINSTs. "Top down" assignment of FINSTs is only possible as long as target locations differ from the others on the basis of a preattentive feature such as color, orientation, or curvature. If there are still tokens left over after all of the selected items have been assigned FINSTs, subitizing will occur.

This first stage of subitizing is prenumeric because the number name has not yet been accessed; number recognition has not occurred. The viewer is only conscious of "some" items in the display. Item individuation information of this sort must be available to the system before the attentional processor begins to move, if the attentional focus is to be moved intelligently. Otherwise the system would not "know" when to start indexing or "know" when to stop. Without individuation information, reaction time would not so neatly parallel display size in conjunction search. Subjects might stop short, processing only one or two of the items before responding, or might continue on, indefinitely, looking for items even if there were only one in the display. Moreover, subjects would miss targets more often because of failure to check all the locations. Subjects rarely make mistakes in search tasks, though they may be unaware of the number of items in the display (e.g., Treisman & Gelade, 1980). Individuation information is also necessary when subjects are required to decide whether a display is subitizable or not, without actually enumerating. Atkinson et al. (1976) found that the time required to make such a decision is more or less constant with the number of items in the display, save for a slight increase at 5, which may be a borderline case because it is sometimes counted and sometimes subitized. This individuation information is prenumeric because it is available
before the number name is accessed. It is nonetheless number information in a sense, because it can be used to distinguish zero from some, one from more than one, or a subitizable number from a more-than-subitizable number of items.

The second stage of the subitizing process, response choice, involves using information about the FINSTed place tokens to gain access to a number name stored in semantic memory, recognizing the number as "3", for example. The processes involved in number recognition must be somewhat different from the usual processes of recognition, because with number there is the problem of embedding. In every 4-item display there are 3-item displays, 2-item displays, and 1-item displays. The only thing that differentiates every 4-item display from every other type of display is that 4-item displays have 1 item, 2 items, 3 items, and 4 items implicit, and NOT 5 items, 6 items, and so on. There are two incontrovertible cues to how many items there are in the display: either the highest number supported by the display, or the lowest number NOT supported by the display minus 1. Given that we are not very good at looking for the absence of things (Treisman, 1985), the former is more likely. The second stage of subitizing must thus involve searching semantic memory for the highest number representation with units to match the items in the display. This would require pairing each item in the display with a number name in order of the number names, as Klahr suggested (1973b). If there are no particular expectations about how many items there are in the display, the most promising procedure would be to start the matching process from 1.

Enumeration studies that measure response latencies invariably show a subitizing slope. What causes the subitizing slope? According to this account there are two possibilities: the variable binding stage and the response choice stage, the stage at which a number name is chosen to match the number of items in the display. Information about the origin of the subitizing slope can be found by comparing subitizing slopes from studies that differ in terms of their eye movement control procedures, dependent measure, and number of response alternatives.

First, reaction time studies with eye movement controls tend to have smaller slopes than ones that do not. Klahr and Wallace (1976) explicitly compared subitizing slopes for trials with and without eye movements. The subitizing slope was 60 ms/item for trials with eye movements and 25 ms/item for trials without. Second, when a masking methodology is used, the subitizing slope decreases even more markedly. Masking studies not only control for eye movements, but exclude the response choice part of the task from the latencies, because the focus is on accuracy at a certain SOA rather than response time. Response choice occurs after timing is complete in this paradigm. The subitizing slope is much
lower when the masking paradigm is used, e.g., 4-10 ms/item in Oyama, Kikuchi and Ichihara (1981), as compared to 25-40 ms/item in comparable enumeration studies where reaction time is measured and eye movements are controlled (Klahr, 1973, and Oyama et al., respectively). Finally, in reaction time studies the subitizing slope seems slightly more pronounced when there is a wide range of responses, as in typical enumeration studies, than when there are only two responses, as in number discrimination and number matching studies. Folk, Egeth and Kwak (1988) performed a number discrimination study in which subjects had to discriminate n from n + 1, and found the slope to be 33 ms/item in the 1-4 range. Simons and Langheinrich (1982) performed a number matching experiment in which subjects were presented with a digit and then a display of items. Subjects were required to decide whether the number of items in the display matched the digit or not. In their study the slope was 31 ms/item in the 1-4 range. For comparable enumeration studies (without eye movement controls) in which the response range was large, the slope in the 1-4 range was 60 ms/item on average (Klahr, 1973a; Aoki, 1975). If a masking paradigm is used and the number of response alternatives is limited, the slope all but disappears (1.9 ms/item; Sagi & Julesz, 1984). Consequently, the subitizing slope seems to arise primarily from the need to choose a response from a range of responses, though eye movements exacerbate it. If the response range is limited, or if the response choice part of the latency is excluded by using a masking paradigm, the subitizing slope decreases.

Counting. According to the present view, counting starts once we learn that we can't subitize, i.e., when it becomes apparent that ALL the FINSTs are assigned. First, a strategy for moving the attentional focus is formed. For example, we may decide to work from left to right, top to bottom, in order to minimize the probability of "getting lost" in the display. Attentional processing then begins. A counting routine is performed: an index, mark, and add cycle is executed. The attentional focus is moved through the visual array, area by area. As children we progressed one item at a time, probably because we learned to count by ones before we learned to add. Most adults count by groups, however (Van Oeffelen & Vos, 1982b; cf. Klahr's (1973b) production system for group-and-add enumeration). Thus, counting involves grouping the items into clusters of 2-4 items, subitizing the group, adding the result into a running total, marking the cluster, and then moving the attentional focus to the next group, a process that roughly corresponds to our subjective impressions of what we are doing. This activity can be conceived as the process of FINSTing a place token at a low level of resolution (a large blob or group of items), moving the processing focus toward it, reassigning the FINSTs to place tokens associated with the elements within a
group (the small blobs representing individual black dots), and finally marking the group once subitizing is complete.

Counting involves many processes: discovering the display is non-subitizable, grouping the items, forming a strategy for moving the attentional focus, moving the focus from group to group, subitizing the number of items within a group, adding the result into a running total, and marking. A given counting latency is thus composed of the latencies from many processes. Some of the processes involved in counting may take place in a time independent of the number of items in the display. For example, the time to decide whether or not the display is subitizable, the time to group items, and the time to form a scanning strategy may take place approximately as fast when there are large numbers of items as when there are small numbers of items. Other processes may take longer as the number of items in the display increases. Thus, the more items in the display, the more times the attentional focus would have to be moved, the more addition operations would have to be performed, and the more marking would have to take place. In addition, number would affect the quantity of items within a group; the more items in a display, the larger the number of items within a cluster, and the longer to subitize the cluster. Similarly, group size would affect addition processes, because addition latencies increase with the magnitude of the addends (cf. Parkman & Groen, 1971, for early work in this field). Counting latencies are made up of the latencies from many different processes, some of which take longer the more items there are in the display.

To review, subitizing is fast because it is a simple process involving two stages, one of which may be parallel. Subitizing is accurate because there are few memory requirements. Counting is slow because it involves many sub-processes, some of which take longer the more items there are in the display. Counting is error prone because of memory requirements; it is possible to forget the subtotal, forget the addition table, or forget which items have already been counted. There are simply more things to go wrong when counting than subitizing.
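As a schematic rendering of the two routes just described, the sketch below treats subitizing as pairing bound tokens with number names in order from 1, and counting as a group, subitize, add, and mark cycle. It is my own gloss on the account, not an implementation from this chapter; the function names, the GROUP_SIZE constant, and the reduction of grouping and focus movement to trivial stand-ins are all invented for illustration.

```c
#include <stdio.h>

#define NUM_FINSTS 4
#define GROUP_SIZE 3   /* adults count in clusters of roughly 2-4 items */

/* Stage 2 of subitizing: pair each bound token with the next number name,
   starting from "one", and report the highest name supported. */
static int subitize(int bound_tokens) {
    int name = 0;
    for (int i = 0; i < bound_tokens; i++)
        name++;                /* "one", "two", "three", ... */
    return name;
}

/* Counting: form a subitizable group, subitize it, add the result into a
   running total, mark the group, and move the focus to the next group. */
static int count_display(int n_items) {
    int total = 0, marked = 0;
    while (marked < n_items) {
        int group = n_items - marked;
        if (group > GROUP_SIZE) group = GROUP_SIZE;   /* group the items */
        total += subitize(group);                     /* subitize, then add */
        marked += group;                              /* mark and move on */
    }
    return total;
}

int main(void) {
    int n = 7;
    if (n <= NUM_FINSTS)
        printf("subitized: %d\n", subitize(n));
    else
        printf("counted: %d\n", count_display(n));
    return 0;
}
```

Everything that can go wrong in counting, losing the running total or revisiting a marked group, lives in the loop's state; the subitizing route has no such state to corrupt, which is one way of restating why subitizing is accurate.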
FINST theory compared with other accounts of enumeration

The Density and Pattern based theories of enumeration share with the FINST based account a concern with the visual and spatial aspects of enumeration. Unlike these theories, however, the FINST based account does not explain subitizing by appealing to number cues that only work with certain types of display. We don't invent a different enumeration process for every enumeration task. Nor do we muddle through with unreliable cues, making mistakes. Subitizing is characterized
by its accuracy. The FINST based account of enumeration shares with Working Memory theory the idea of capacity limitations, but the limitation is in the type and number of things that can be individuated in a visual scene, rather than the amount of information that can be held in consciousness. Consequently, the visual aspects of the enumeration task are predicted to be important in determining whether subitizing will occur.

The advantage of the FINST based account of enumeration is that it explains subitizing and counting, yet it grows out of a general consideration of the requirements of visual processing. The FINST hypothesis wasn't invented simply to explain enumeration, or for that matter multiple target tracking (Pylyshyn & Storm, 1988). In general, we need to be able to individuate items if we are to reason about visual displays (Pylyshyn, Elcock, Marmor, & Sander, 1978a, 1978b). There needs to be a way to individuate a small number of items from one another to compute spatial relations. For that matter, there needs to be a way to individuate items if we are even to move the attentional focus from item to item, so the focus arrives at the destination intended. FINSTs are needed to explain how we perform some of the simple tasks we can perform (cf. Pylyshyn, 1981, the minimal mechanism strategy).

Experiments on subitizing, counting and spatial attention

In the following experiments I examine one assumption of the FINST based account of enumeration, namely the idea that subitizing exploits a mechanism that operates before the operation of spatial attention whereas counting requires spatial attention. Three conceptions of attention were explored: Ullman's visual routines, Treisman's attentional "glue", and Posner's "spotlight of attention". In Trick and Pylyshyn (1991a), Ullman's (1984) and Treisman and Gelade's (1980) operationalizations of attention were employed. The experiments shared a common strategy. We were trying to show that subitizing is not possible when the enumeration task is changed to force attentional processing. The reason that such manipulations preclude subitizing is that the FINST item individuation mechanism operates before the operation of spatial attention. FINSTs will not be useful for enumeration when spatial attention is required to parse contours into units that correspond to items, or to distinguish items to be counted from distractors. Consequently, subitizing should not occur in these situations. Of more importance, however, was showing that subjects were capable of subitizing multiple contour items, or target items in distractors, as long as there was no need for spatial attention.
The Density and Pattern based accounts of enumeration are inadequate to explain how subitizing could be performed at all in these tasks. At the very least they need to be augmented to incorporate feature detection and low level grouping processes, to explain how multiple contour items could be enumerated or how items could be enumerated among distractors. The Working Memory account predicts that subitizing should always occur, regardless of the visual characteristics of the display, as long as the memory load remains the same. None of the theories of enumeration predict that there should be a relationship between subitizing and spatial attention. We predict that subjects will be capable of subitizing multiple contour items, and target items in distractors, but only if preattentive analysis is adequate to the task of parsing contours into items, or distinguishing targets from distractors.

How will we know when subitizing occurs? The trademark of the change from subitizing to counting is the "elbow" in the enumeration function that results from the sudden increase in slope outside the subitizing range. Trend analysis registers this discontinuity as a deviation from linearity. If a manipulation eradicates this deviation, and moreover if the slopes are high throughout the range, there is evidence that subitizing has not occurred. Notice, the focus is primarily on the difference between slopes for small and large numbers of items given the same display conditions, rather than the absolute latencies or absolute slopes in different conditions. Many factors influence absolute enumeration latencies and slopes: the contrast between the items and the background (Hunter & Sigler, 1950), the complexity of the items to be enumerated (compare Akin & Chase, 1978, with more typical dot enumeration studies), the visual angle covered by the items (Klahr, 1973), the age of the subjects (Chi & Klahr, 1975; Svenson & Sjoberg, 1978), accuracy criteria, practice effects, and knowledge of the maximal number of items that will appear in the display (Mandler & Shebo, 1982; Saltzman & Garner, 1948). Moreover, it is to be expected that extraneous contours would slow enumeration. In search, even distractors that differ from target items by a feature inflate latencies a small amount (the "cost of filtering"; Treisman, Kahneman & Burkell, 1983). Distractors that differ from the target items by a particular combination of features add a large amount (the "cost of feature integration"; Treisman & Gelade, 1980). It is the difference between slopes for large and small numbers of items that is important in determining whether subitizing occurs, however, not the absolute latencies or absolute slopes.

Moreover, it is important that trend analysis reveal slope discontinuities in datasets for each subject, when each subject's data is considered separately from the rest of the subjects in the experiment. Otherwise there is a danger that
averaging across subjects might obscure discontinuities in slope, because there are individual differences in how high people subitize (Akin & Chase, 1978). Consequently, analyses involved several stages. First, trend analyses were performed on each subject's data. (Once a deviation from linearity was found, it was necessary to further ensure that it came about because of an increase in slope rather than a drop.) Second, as a further check, trend analyses were performed on the group dataset, made up of the average latencies for each subject. Finally, regressions were performed on these averaged latencies to establish the slopes in the subitizing and counting ranges. The following discussion will focus on the within-subject analyses, though the results were replicated in regressions and trend analyses performed on the averaged datasets.
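The slope comparison at the heart of these analyses can be sketched numerically. The code below fits ordinary least-squares slopes to invented latencies over the 1-3 and 5-7 ranges; a much steeper upper-range slope is the signature of the elbow. It is a crude stand-in for the trend analyses actually reported, and the latency values are fabricated for illustration only.

```c
#include <stdio.h>

/* Ordinary least-squares slope of latency against display size over
   sizes lo..hi (rt is indexed by display size; index 0 unused). */
static double slope(const double *rt, int lo, int hi) {
    double sx = 0, sy = 0, sxy = 0, sxx = 0;
    int n = hi - lo + 1;
    for (int x = lo; x <= hi; x++) {
        sx += x; sy += rt[x]; sxy += x * rt[x]; sxx += (double)x * x;
    }
    return (n * sxy - sx * sy) / (n * sxx - sx * sx);
}

int main(void) {
    /* Invented latencies (ms) for display sizes 1-8, with an elbow after 4. */
    double rt[9] = {0, 500, 540, 580, 700, 1000, 1320, 1640, 1960};
    double low = slope(rt, 1, 3), high = slope(rt, 5, 7);
    printf("slope 1-3: %.0f ms/item, slope 5-7: %.0f ms/item\n", low, high);
    if (high > 2 * low)
        printf("slope discontinuity: consistent with a subitizing elbow\n");
    return 0;
}
```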
Subitizing and visual routines

Ullman (1984) talks about attention as a processing focus that must be moved through the stimulus array, one area at a time, to compute spatial relations and form a global representation of an object from contour segments. If attention is required to bring together the segments of an object, how is it possible to subitize multiple contour objects? In Trick and Pylyshyn (1991a), we argue that it isn't always possible. When the configuration of items is such that low level processes group contours from different objects, subitizing is not possible, and attentional processing must be used to enumerate both small and large numbers of objects. Given this prediction it is interesting that one of the few studies that failed to produce strong evidence of subitizing had subjects enumerating concentric circles (Saltzman & Garner, 1948; cf. Woodworth & Schlosberg, 1954; Allport, 1975). For concentric items the nearest and most similar contours come from different items. Concentric circles necessarily have an inside and outside that are bounded by a contour. They are also necessarily of different sizes. Finally, concentric circles are one inside another and by definition share a common focus. We wanted to find out which factor caused concentric items to be so difficult to enumerate in Saltzman and Garner's (1948) study. There were three conditions. In the Same Size condition subjects were required to enumerate rectangles that were defined by a bounding contour, but were uniformly sized. In a given display all the rectangles might be small, medium, or large, but they were all the same. In the Different Size condition at least one of the rectangles was different in size from the others. Finally, in the Concentric condition subjects were required to enumerate concentric rectangles.
We predicted that subitizing would be evident in the Same and Different Size conditions, but not the Concentric. Ullman (1984) might predict that subitizing would occur in none of the three conditions. Visual Routines are proposed to explain enumeration, yet they cannot explain subitizing. There is no reason that small numbers of items should be treated differently than large in Ullman's model. None of his operations for moving the attentional focus can handle small numbers of items at once, and nothing in the addition literature predicts that addition by 1 suddenly becomes much more time consuming after 4 (e.g., Parkman & Groen, 1971). Therefore, according to Ullman's theory there is no reason that there should be a slope discontinuity between 1-4 and 5-8 in any condition.

There was clear evidence of subitizing in both the Same and Different Size conditions, however; all 12 subjects showed the appropriate deviations from linearity resulting from an increase in slope after 3 or 4. In contrast, only 2 of 12 subjects showed such deviations in the Concentric condition. Chi square analysis indicated a significant difference between the proportion of subjects showing deviations from linearity in the Concentric condition and the other two conditions. Moreover, for the Concentric condition the slope in the 1-3 range was approximately the same as the slope in the 5-7 range. Both of these slopes were comparable to the slope in the 5-7 range in the other two conditions. See Figure 5 for the average enumeration latencies.

There were two problems, however. There was not a great enough variety of item sizes in the Different Size condition to test whether the variety of item sizes was the source of difficulty in the Concentric condition. More seriously, on average, contours were closer together in the Concentric than in the Same and Different Size conditions. Thus, lateral masking between contours from different items might explain the absence of subitizing in the Concentric condition.9 For these reasons a control study was performed. Subjects were required to enumerate the straight lines and right angles that made up the sides and corners of the concentric rectangles. All subjects were able to subitize both corners and lines, even though the corners were of uniform size and the lines varied by a factor of 30. In fact, there was no significant difference between the latencies to count parallel lines and corners, most particularly not in the subitizing range, as shown in the bottom panel of Figure 5. Moreover, both subitizing slopes were within 2 ms of those for uniformly sized rectangles. Therefore, concentric items are difficult to subitize because the items have a common focus and are one inside another, rather than because of the variety of item sizes or the proximity of contours from different items.

9 There was no evidence of lateral masking between contours from the same item, because there was no significant difference between latencies for enumerating small and large rectangles (p > .1). In small rectangles the sides were 0.16 and 0.26 degrees apart, whereas for large items the sides were 0.78 and 1.01 degrees apart. Small rectangles were enumerated 11 msec faster than large, in fact. Enumeration does not seem to be impeded by the presence of nearby contours from the same object.
[Figure 5 about here: average enumeration latency (ms) plotted against number (1-8); the bottom panel shows the parallel lines and corners data.]
Figure 5. Average latencies for enumerating items in concentric item and parallel line studies.
In a third study we investigated the ability to subitize when the enumeration task required computing a spatial relation. According to Ullman (1984), visual routines are required to compute whether items are connected. Jolicoeur and his colleagues (Jolicoeur, 1988; Jolicoeur, Ullman, & Mackay, 1986) have performed a series of experiments showing that the time to decide whether two items are connected varies with the contour distance between the items when the retinal distance between items is held constant. These results were interpreted as evidence that computing the connected relation involves moving the processing focus along the connecting contour at a fixed rate. In the experiment, subjects were presented with a winding contour imposed over a matrix of orthogonal lines, as shown in Figure 6. The items were small blocks that could be green or purple. In any display subjects were required to enumerate 1-8 blocks that were designated as targets. There were also 2-8 blocks in the display that served as distractors. The number of distractors was varied to prevent subjects from attempting to use heuristics, such as the proportion of the screen covered by a particular color, to guess the number of items, thus avoiding the usual enumeration process. Because distractor effects were not the focus of the experiment, only trials with 6 distractors were analyzed.
Figure 6. Sample display for connected study.
In the Connected condition subjects were required to enumerate items on a particular contour. At the beginning of the trial subjects were provided a lateral fixation marker to tell them the starting point of the contour they were supposed to attend to. Subjects were required to visually trace the contour, enumerating blocks until they came to the end of the contour. Contours could be of three
different lengths. The length of the contour on which the items were placed would be expected to have an effect on enumeration latencies, because the time to decide whether two items are connected varies as a function of contour length (e.g., Jolicoeur, 1988). Distractors were defined as any block (regardless of its color) that occurred after the break in the contour or on the orthogonal contour. For example, in Figure 6, if contour tracing were started at the top left corner, there are 6 connected blocks (there were 7 distractors: 1 block after the break in the contour and 6 blocks on orthogonal contours). If attention is required to compute the connected relation, then subjects should not be able to subitize connected items.

In the Color condition subjects were shown the same displays, and given the same lateral fixation point, but their task was to enumerate items of a particular color (there were 6 dark boxes in Figure 6, for example). Attention is not required to detect an item of a different color from other items; color is assumed to be a feature (e.g., Treisman & Gelade, 1980). Because preattentive information distinguished target items from distractors, subitizing was predicted in the Color condition.

Despite the appearance of Figure 7, there was clear evidence of subitizing in the Color condition. All subjects had the appropriate deviations from linearity, though averaging across data from different subjects obscured the discontinuity in this graph. In contrast, there was little evidence of subitizing in the Connected condition; there was only one deviation from linearity, for one subject at the intermediate contour length only. (Chi square analysis revealed significant differences in the proportions of subjects showing deviations from linearity in the Color and Connected conditions.) Moreover, in the Connected condition latencies were affected by the length of the contour; in the Color condition they were not. This pattern of results suggests that the attentional focus was being moved along the contour in the Connected condition but not the Color condition. Thus, subitizing was not evident when a spatial relation was superimposed on the enumeration task that required the attentional focus to be moved along a contour to find out how many target items there were. Nonetheless, subjects could subitize the same displays if the task were to enumerate items of a particular color.
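The Connected task just described can be caricatured as walking a chain of contour segments from the fixation-marked start and tallying target blocks until the contour breaks, so traversal time grows with contour length as well as with the number of targets. The linked-list rendering below is my own schematic, not the experimental software; the Segment type and the sample chain are invented.

```c
#include <stdio.h>
#include <stddef.h>

/* One segment of the winding contour; some segments carry a target block. */
typedef struct Segment {
    int has_block;             /* 1 if a target block sits on this segment */
    struct Segment *next;      /* NULL at the break in the contour */
} Segment;

/* Trace the contour from its start, enumerating blocks until the contour
   ends; steps taken stand in for tracing time. */
static void trace_and_count(Segment *start) {
    int blocks = 0, steps = 0;
    for (Segment *s = start; s != NULL; s = s->next) {
        steps++;
        if (s->has_block) blocks++;
    }
    printf("connected blocks: %d (contour length: %d segments)\n", blocks, steps);
}

int main(void) {
    Segment segs[6] = {
        {1, &segs[1]}, {0, &segs[2]}, {1, &segs[3]},
        {0, &segs[4]}, {1, &segs[5]}, {1, NULL}      /* the contour breaks here */
    };
    trace_and_count(&segs[0]);
    return 0;
}
```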
Subitizing and Treisman's attentional "Glue"

According to Treisman (1985), spatial attention is required to form an integrated representation of an object from its features (e.g., color, orientation). Attention is not necessary to detect the presence of an item that differs from others by a feature, however.
[Figure 7 about here: two panels, "Counting coloured items" and "Counting connected items", plotting average latency (ms) against number (1-8) for 4-link, 5-link, and 6-link contours.]
Figure 7. Average enumeration latencies for connected study.
Typically, this research relies on the search paradigm. Subjects are required to indicate whether a given target item is present in a display. The number of distractor items is manipulated. When subjects are given the task of finding a target that differs from its neighbors by color or orientation, the time required to find the target does not increase appreciably with the number of items in the
display (Treisman & Gelade, 1980). For example, subjects can detect the presence of a white item among green almost as quickly when there are many green items as when there are 10. Similarly, subjects can detect the presence of a vertical item among horizontal in a time roughly independent of the number of horizontal items in the display. The target items are said to pop out in this case. Even if subjects are given the task of looking for a disjunction of two different features, e.g., look for a white item (among green) OR a vertical item (among horizontal), reaction time does not increase substantially with the number of distractors. This result is interpreted as evidence that color and orientation can both be derived without spatial attention. In contrast, if subjects are looking for a conjunction of features, white AND vertical lines among green vertical and white horizontal lines, latency increases markedly with the number of distractors. This result is interpreted as evidence that spatial attention is required to "glue" features such as color and orientation together into a representation for an object (Treisman & Gelade, 1980). Moreover, the latency increases half as fast for trials in which the target is present as for trials in which the target is absent, as would be expected if the attentional focus were moved from item to item until the target is found. Given that on average the target would be found halfway through the items on positive trials, the slope for positive trials is half that of the negative.

In Trick and Pylyshyn (1991a) a search task was superimposed on an enumeration task. Subjects were once again required to enumerate items in a field of distractors. There were two conditions. In the Disjunction condition subjects were required to enumerate white OR vertical lines in green horizontals. In the Conjunction condition subjects were required to enumerate lines that were both white AND vertical in a field of green vertical and white horizontal lines. There were 0-9 target items and 0-20 distractor items. Only trials with 1-8 targets, and 0, 12, or 20 distractors, were analyzed. Sample displays are presented in Figure 8.

Consider the predictions from Feature Integration Theory (Treisman & Gelade, 1980; Treisman & Sato, 1990). According to Feature Integration Theory, features come without detailed location information (Treisman & Schmidt, 1982). Subjects are only responding to activity in a feature map (e.g., the red map) when they perform feature search (Treisman & Gelade, 1980). As a result there would be no way to individuate different items with the same features at the preattentive level, certainly not in the presence of distractor items. At the attentive level, object files can only be created one item at a time with the attentional focus (Kahneman & Treisman, 1984). Thus, although items could be individuated by object files, these files must be produced one at a time. There is nothing in Feature Integration Theory that could explain why 1-4 items would receive
different processing than 5 or more items. As a result, subitizing and counting could only be explained by processes that operate after both preattentive analysis and the attentive processes that define object files, according to Feature Integration Theory. If this were true, the attentional requirements of the task should not affect whether subitizing occurs. Slope discontinuities in the enumeration function should occur whether feature integration is necessary or not, according to Feature Integration Theory, although enumeration should take longer on average when feature integration is required.
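The 2:1 ratio of target-absent to target-present search slopes mentioned above is just counting: a serial self-terminating search examines all n items when the target is absent but, on average, (n + 1)/2 items when it is present. A short sketch of that arithmetic follows; the base and per-item times are invented constants, not estimates from any study.

```c
#include <stdio.h>

int main(void) {
    const double base = 450.0, per_item = 50.0;   /* invented constants (ms) */
    printf("items   present RT   absent RT\n");
    for (int n = 4; n <= 20; n += 8) {
        double present = base + per_item * (n + 1) / 2.0;  /* found halfway, on average */
        double absent  = base + per_item * n;              /* exhaustive check */
        printf("%5d   %10.0f   %9.0f\n", n, present, absent);
    }
    return 0;
}
```

Reading the slopes off the printed values gives 25 ms/item for present trials against 50 ms/item for absent trials, the predicted 2:1 ratio.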
[Figure 8 about here: sample displays. Disjunction conditions: count the white lines; count the vertical lines. Conjunction condition: count the white vertical lines.]
Figure 8. Sample displays in colored lines study.

In contrast, we predicted that attentional manipulations would affect whether subitizing occurred. We predicted that subitizing would only be evident in the Disjunction condition, where the target items could be distinguished from the distractors on the basis of their features. This feature information would become
available before the use of spatial attention; consequently, FINSTs could be assigned selectively to the target items in this case. As we predicted, subjects were capable of subitizing even with 12 and 20 distractors in the Disjunction condition, 9 of 10 subjects in each case. In contrast, there was little evidence of subitizing in the presence of distractors in the Conjunction condition; 0/10 and 1/10 subjects showed deviations in the 12 and 20 distractor conditions, respectively. The Conjunction and Disjunction conditions had significantly different proportions of subjects with deviations from linearity for both 12 and 20 distractors. For averaged latencies, see Figure 9.
[Figure 9 about here: two panels, "Disjunction condition" and "Conjunction condition", plotting average latency (ms) against number (1-8) for 0, 12, and 20 distractors.]
Figure 9. Average enumeration latencies for colored lines study.
Therefore, whenever attentional processing was unnecessary, either because the property that distinguished targets from distractors was a pop out feature, or because low level grouping processes delivered contour clusters which each corresponded to an item, subitizing was evident from a trend analysis. Moreover, analysis of slopes revealed that the slopes in the 1-3 range were significantly different from those in the 5-7 range and, in addition, the slopes in the 1-3 range were low. Whenever attention was required, either to compute a spatial relation, to resolve the object as a whole, or to combine features into a unified object description, subitizing was not evident from trend analysis of latencies. Moreover, the slope in the 1-3 range was not significantly different from the slope in the 5-7 range, which suggests that the same process is being used for both ranges. Second, the slope was high; it is the counting process that is being employed for both small and large numbers of items. These findings are consistent with the idea that subitizing uses preattentive information. If this preattentive information is to be useful for enumeration, however, there must be some way to individuate the feature clusters derived from low level analysis. Pylyshyn's FINST mechanism provides a way to accomplish this task.

None of the theories of enumeration are adequate to explain this pattern of results. The Density and Pattern theories are inadequate to explain how complex enumeration tasks are performed at all. How do we enumerate multiple contour items? How do we enumerate targets in distractors? The Memory based theory of enumeration predicts that subitizing should occur regardless of the appearance and configuration of the items, as long as the memory load remains the same. For this reason, it is hard for Memory theory to explain why subjects were unable to subitize Concentric items. The theories of attention are also inadequate to explain the results from these experiments. Ullman's Visual Routines (1984) cannot explain subitizing, even without distractors, although the theory was proposed to explain enumeration. Feature Integration Theory (Treisman & Gelade, 1980) would predict that subitizing and counting would occur regardless of the attentional demands of the task, although attentional demands would add to the overall enumeration latencies. Only FINST theory predicts attentional manipulations would affect whether subitizing would occur.

In these studies we wanted to argue that subitizing does not occur when serial attentional analysis is needed to resolve and individuate items. Before we can be safe in making this conclusion we need to show that contracting the attentional focus, narrowing the spotlight or "zoom lens" (Eriksen & St. James, 1986), doesn't
prevent subitizing in itself. For this reason a third series of studies was performed using the cue validity paradigm.

Subitizing, counting and the spotlight of attention
In the cue validity paradigm subjects are required to make a perceptual decision (e.g., press one key if there is a "B" in the display and another if there is a "D"). Speed and accuracy are measured. A comparison is made between situations in which subjects know beforehand both when and where a stimulus will appear (Valid cuing) as opposed to situations in which they only know when the stimulus will appear (Neutral cuing). Finally, in the Invalid cuing condition subjects are given correct information about when the stimulus will appear, but incorrect information about where it will appear. Typically, subjects are faster and more accurate at making perceptual decisions if they are given correct information about where the stimulus will fall. Performance is thus best in the Valid condition, followed by the Neutral and Invalid conditions. This finding has been interpreted as evidence that a processing focus, the "spotlight of attention", is moved through the stimulus array in response to subjects' expectations about where the target item will appear. This processing focus performs perceptual analysis (Posner, Snyder, & Davidson, 1980). Although the focus moves rapidly (and independently of eye movements), it takes time to move the focus from one location to another (Posner et al., 1980; Laberge & Brown, 1986). Subjects are thus slower in the Invalid condition than the Valid because they have to move the focus from one location to another. It has been argued that processing resources are dispersed evenly throughout the display in the Neutral condition (Gawryszewski, Riggio, Rizzolatti, & Umilta, 1987). The difference between Valid and Neutral trials would thus reflect the fact that analyses are performed less efficiently everywhere when resources are shared over large areas of the display (cf. Eriksen & St. James, 1986).

In Trick and Pylyshyn (1991b) a cue validity paradigm was combined with an enumeration task. The goal was to show first that subitizing would be possible whether attention was focused on a small area, as in the Valid condition, or distributed throughout the display, as in the Neutral condition. If this were true then it would show that subitizing is not prevented when the attentional focus is narrowed. Second, we wanted to show that the position of the attentional focus would have a stronger effect on counting latencies than subitizing latencies. Specifically, the difference between Valid and Invalid conditions should be more pronounced in the counting range than the subitizing range. This result would make sense if the
counting process involves the attentional focus, and moving the attentional focus takes time. The position of the attentional focus should have a smaller effect in the subitizing range because subitizing doesn't require the attentional focus.

Two cue validity experiments were performed. In both, subjects were required to count 1-8 dots. In one study the cue was a central arrow that indicated which half of the display the dots would occupy, or a cross that indicated the items could appear on either side of the display. (The cue preceded the counting display by 192 ms.) In a second study the cues were rectangles that surrounded the area in which the dots were to appear. There was one rectangle in each corner of the display. In the Valid and Invalid conditions, one rectangle was a different color from the others. The position of this rectangle predicted the position of the dots with 80% accuracy. (Cuing rectangles preceded the dots by 128 ms.)10 In the Neutral condition all the rectangles were the same color. In both studies subjects were told to distribute their attention evenly throughout the display in the Neutral condition.

For all subjects, and in both experiments, subitizing was evident in the Valid, Invalid, and Neutral conditions, as indicated by trend analysis. The necessity of contracting the attentional focus in the Valid and Invalid conditions did not prevent subitizing, or even restrict the subitizing range. The position of the attentional focus had a significantly stronger effect in the counting range than the subitizing range in both studies. In fact, when cuing rectangles were used there was no significant effect of spatial cuing in the 1-4 range, as can be seen from Figure 10, although there were significant effects in the 5-8 range. Consequently, the position of the attentional focus, as manipulated by spatial pre-cues, seems to have a greater effect in the counting range than the subitizing range, as would be expected if counting requires spatial attention.

10 The time between cue and display is typically kept short in peripheral cuing studies, in which the cue appears near where the target will appear, and attention automatically shifts to the cued location (cf. Jonides, 1981). In central cuing studies, in which a symbolic cue such as an arrow or cross is presented in the center of the display, extra time is allotted between the cue and display to allow for interpretation of the cue.

[Figure 10 about here: average enumeration latency plotted against number for the Valid, Invalid, and Neutral conditions.]
Figure 10. Average enumeration latencies as a function of cue validity.

General Discussion

These studies show two things. First, they demonstrate that attentional manipulations affect whether subitizing occurs. If the particular enumeration task requires spatial attention to resolve items as wholes and distinguish target items from distractors, subitizing does not occur. When spatial attention is not necessary
to define and individuate target items, subjects can even subitize multiple contour items, or items among distractors. None of the pattern, memory, or density based theories of enumeration predict a relationship between subitizing and spatial attention. In fact, the Pattern and Density theories need to be augmented to explain how these complex enumeration tasks are performed at all. None of the theories of spatial attention, including Ullman's Visual Routines (1984) or Treisman's Feature Integration Theory (Treisman & Gelade, 1980), explain why there are slope discontinuities in the enumeration function when preattentive processes are adequate to resolve items as wholes or distinguish targets from distractors, but not when spatial attention is required for these tasks.

Second, these studies suggest that the position of the attentional focus has a stronger effect on counting latencies than subitizing latencies. This finding makes sense if it is assumed that moving the attentional focus is part of the counting process. I would like to stress that moving the attentional focus is only part of the counting process, however. Memory functions are also involved in the addition component of counting; memory load affects latencies in the counting range (e.g., Logie & Baddeley, 1987). Marking is part of the counting process; thus, any factor that influences the ease of "getting lost" and accidentally re-counting or missing items has an effect in the counting range. For example, item heterogeneity aids counting (Beckwith & Restle, 1966; Frick, 1987; Mozer, 1989;
Potter & Levy, 1968), and circular item configurations interfere (Aoki, 1977). The counting process has many subcomponents. Nonetheless, moving the attentional focus is one of them.

The goal of this paper was to provide a theory of visual enumeration that could be incorporated into a theory of vision, yet explain the data on subitizing and counting. I have tried to show that subitizing occurs because of an item individuation mechanism that can handle small numbers of items at once. Item individuation occurs after the preattentive processes of feature registration and grouping, but before the operation of spatial attention. This item individuation stage is necessary to explain how we enumerate, because accurate enumeration requires that each item be considered as a separate unit, even if it has the same properties as other items. Item individuation is also necessary to explain how we reason on the basis of visual information and compute spatial relations (Pylyshyn, Elcock, Marmor, & Sander, 1978a, 1978b), because several items must be selected and individuated at once if spatial relations are to be derived. In fact, item individuation is necessary to explain how the attentional focus ends up where it is intended to go when we deliberately move the attentional focus from item to item in complex displays. The FINST hypothesis suggests that individuation is accomplished by mental reference tokens, FINgers of INSTantiation. The FINST mechanism can be seen to accomplish two goals. First, FINSTs individuate items. Individuation allows items with the same properties, similar tokens of the same type, to be considered as distinct units. Individuation allows individual items to maintain their identity even as they move and change their properties. Second, FINSTs can be used to select a small number of items from a complex display, before attentional processing, a need that more and more attention researchers are beginning to appreciate (e.g., Egeth, Virzi, & Garbart, 1984; Treisman & Sato, 1990; Wolfe, Cave, & Franzel, 1989).
REFERENCES

Akin, O. & Chase, W. (1978). Quantification of three-dimensional structures. Journal of Experimental Psychology: Human Perception and Performance, 4(3), 397-410.
Allport, D. (1975). The state of cognitive psychology. Quarterly Journal of Experimental Psychology, 27, 141-152.
Aoki, T. (1977). On the counting process of patterned dots. Tohoku Psychologica Folia, 36, 15-22.
Atkinson, J., Campbell, F. & Francis, M. (1976). The magic number 4 ± 0: A new look at visual numerosity judgements. Perception, 5, 327-334.
Beck, J. (1982). Textural segmentation. In J. Beck (Ed.), Organization and representation in perception. Hillsdale, NJ: Erlbaum.
Beckwith, M. & Restle, F. (1966). Process of enumeration. Psychological Review, 73, 437-444.
Chi, M. & Klahr, D. (1975). Span and rate of apprehension in children and adults. Journal of Experimental Child Psychology, 19, 434-439.
Egeth, H., Virzi, R. & Garbart, H. (1984). Searching for conjunctively defined targets. Journal of Experimental Psychology: Human Perception and Performance, 10, 32-39.
Eriksen, C. & Hoffman, J. (1972). Temporal and spatial characteristics of selective encoding from visual displays. Perception and Psychophysics, 12(2b), 201-204.
Eriksen, C. & Hoffman, J. (1973). The extent of processing of noise elements during selective encoding of visual displays. Perception and Psychophysics, 14(1), 155-160.
Eriksen, C. & St. James, J. (1986). Visual attention within and around the field of focal attention: A zoom lens model. Perception and Psychophysics, 40(4), 225-240.
Folk, C., Egeth, H. & Kwak, H. (1988). Subitizing: Direct apprehension or serial processing? Perception and Psychophysics, 44(4), 313-320.
Frick, R. (1987). The homogeneity effect in counting. Perception and Psychophysics, 41(1), 8-16.
Gawryszewski, L., Riggio, L., Rizzolatti, G. & Umilta, C. (1987). Movement of attention in the three spatial dimensions and the meaning of "neutral" cues. Neuropsychologia, 25(1A), 19-21.
Hamilton, W. (1860). Lecture XIV, Consciousness: Attention in general. Lectures on metaphysics (Vol. 1). Boston: Gould and Lincoln.
Jensen, E., Reese, E. & Reese, T. (1950). The subitizing and counting of visually presented fields of dots. Journal of Psychology, 30, 363-392.
Jolicoeur, P. (1988). Curve tracing operation and the perception of spatial relations. In Z. Pylyshyn (Ed.), Computational processes in human vision: An interdisciplinary approach (pp. 133-168). Norwood, NJ: Ablex Publishing.
Jolicoeur, P., Ullman, S. & Mackay, L. (1986). Curve tracing operations and the perception of spatial relations. Memory and Cognition, 14, 129-140.
Jonides, J. (1980). Towards a model of the mind's eye. Canadian Journal of Psychology, 34, 103-112.
Julesz, B. (1984). Towards an axiomatic theory of preattentive vision. In G. Edelman, W. Gall & W. Cowan (Eds.), Dynamic aspects of neocortical function (pp. 585-612). Toronto: John Wiley & Sons.
Kahneman, D., Treisman, A. & Gibbs, B. (1990). The reviewing of object files: Object specific integration of information. Unpublished manuscript.
Kahneman, D. & Treisman, A. (1984). Changing views of attention and automaticity. In R. Parasuraman and D. Davies (Eds.), Varieties of attention (pp. 29-61). Orlando, FL: Academic.
Kanwisher, N. (1987). Repetition blindness: Type recognition without token individuation. Cognition, 27, 117-143.
Kaufman, E., Lord, M. & Reese, T. (1949). The discrimination of visual number. American Journal of Psychology, 62, 498-525.
Klahr, D. (1973a). Quantification processes. In W.G. Chase (Ed.), Visual information processing (pp. 3-34). New York: Academic Press.
Klahr, D. (1973b). A production system for counting, subitizing and adding. In W.G. Chase (Ed.), Visual information processing (pp. 527-546). New York: Academic Press.
Klahr, D. & Wallace, J. (1976). Cognitive development: An information processing view. Hillsdale, NJ: Lawrence Erlbaum.
Laberge, D. (1983). Spatial extent of attention to letters and words. Journal of Experimental Psychology: Human Perception and Performance, 9, 371-379.
Laberge, D. & Brown, V. (1986). Variations in size of the visual field in which targets are presented: An attentional range effect. Perception and Psychophysics, 40(3), 188-200.
Logie, R. & Baddeley, A. (1987). Cognitive processes in counting. Journal of Experimental Psychology: Learning, Memory, and Cognition, 13(2), 310-326.
Mandler, G. & Shebo, B. (1982). Subitizing: An analysis of its component processes. Journal of Experimental Psychology: General, 111(1), 1-22.
Marr, D. (1982). Vision. San Francisco: W.H. Freeman and Co.
Miller, G. (1956). The magical number 7 ± 2: Some limits on our capacity to store information. Psychological Review, 63, 81-97.
Minsky, M. & Papert, S. (1969). Perceptrons: An introduction to computational geometry. Cambridge, MA: MIT Press.
Mozer, M. (1989). Types and tokens in visual letter perception. Journal of Experimental Psychology: Human Perception and Performance, 15(2), 287-303.
Oyama, T., Kikuchi, T. & Ichihara, S. (1981). Span of attention, backward masking and reaction time. Perception and Psychophysics, 29(2), 106-112.
298
L.M. Trick
Parkman, J. & Groen, G. (1971). Temporal aspects of simple addition and comparison. Journal of Experimental Psychology, 89(2), 335-342. Posner, M.,Snyder, C. & Davidson, B. (1980). Attention and the detection of signals. Journal of Experimental Psychology: General, 109, 160-174. Potter, M. & Levy, E. (1968). Spatial enumeration without counting. Child development, 39, 265-272. Pylyshyn, Z.(1981). Psychologicalexplanation and knowledge dependent processes. Cognition, 10, 167-174. Pylyshyn, Z.(1989). The role of location indexes in spatial perception: A sketch of the FINST spatial-index model. Cognition, 32, 65-97. Pylyshyn, Z.& Burkell, J. (1990, November). Change: R primitivef e d r e ? Paper presented at the annual meeting of the Psychonomics Society, New Orleans. Pylyshyn, Z., Elcock, T., Marmor, M. & Sander, P. (197th). Explorations in visual-motor space. Proceedings of the Second International Conference of the Canadian Society for Computational Studies of Intelligence. University of Toronto. Pylyshyn, Z., Elcock, T., Marmor, M. & Sander, P. (1978b). A system for perceptual-motor based reasoning, (Technical Report 42) London, Ontario: Dept. of Computer Science. Pylyshyn, Z.& Storm, R. (1988). Tracking multiple independent targets: Evidence for both serial and parallel stages. Spatial vision, 3(3), 179-197. Sagi, D. & Julesz, B. (1984). Detection vs discrimination of visual orientation. Petreption, 13,619-628. Saltzman, I. & Garner, W.(1948). Reaction time as a measure of the span of attention. Joumal of Psychology, 25, 227-241. Simons, D. & Langheinrich, D. (1982). What is magic about the magical number four? Psychological Research, 4, 283-294. Svenson, 0 & Sjoberg, K. (1978). Subitiziig and counting processes in young children. Scandinavian Journal of Psychology, 19, 247-250. Taubman, R. (1950a). Studies in judged number: I. The judgement of auditory number. Journal of General Psychology, 43, 167-194. Taubman, R. (195Ob). Studies in judged number: 11. The judgement of visual number. Journal of General Psychology, 43, 195-219. Thurstone, L. (1929). Fecbner’s law and the method of equal appearing intervals. Journal of Experimental Psychology, 12, 214-224. Treisman, A. (1985). Preattentive processing in vision. Computer vision, graphics and image processing, 31, 156-177.
Subitizing & Counting
299
Treisman, A. & Gelade, G. (1980). A feature integration theory of attention. Cognitive Psychology, 12, 97-136. Treisman, A., Kahneman, D. & Burkell, J. (1983). Perceptual objects and the cost of filtering. Perception and Psychophysics, 33, 527-532. Treisman, A. & Sato, S. (1990). Conjunction search revisited. Journal of Experimental Psychology: Human Perception and Performance, 16(3), 459-478. Treisman, A. & Schmidt, H. (1982). Illusory conjunctions in the perception of objects. Cognitive Psychology, 14, 107-141. Trick, L. (1987). Subitizing and canonical pattern. Paper presented at the annual meeting of the Canadian Psychological Association. Vancouver, B.C. Trick, L. & Pylyshyn, Z. (1991a). Preattentive and attentive processes in visual enumeration: Subitizing, counting and FINSTs. Submitted. Trick, L. & Pylyshyn, Z. (1991b). Cuing and counting: Sometimes the spotlight isn’t much help. Submitted. Tsal, Y. (1983). Movements of attention across the Visual field. Journal of Experimental Psychology: Human perception and performance, 9, 523-530. Ullman, S. (1981). The interpretation of visual motion. Cambridge, Mass: MIT press. Ullman, S. (1984). Visual routines. Cognition, 18, 97-159. Van Oeffelen, M. & Vos, P. (1982a). A probabilistic model for the discrimination of visual number. Perception and Psychophysics, 32(2), 163-170. Van Oeffelen, M.& Vos, P. (1982b). Configurational effect on the enumeration of dots: Counting by groups. Memory and Cognition, 10(4), 396-404. Warren, H. (1897). The reaction time of counting. Psychological Review, 4, 569-591. Wolfe, J., Cave, K. & Franzel, S. (1989). Guided search: An alternative to the Feature Integration model for visual search. Journal of Experimental Psychology: Human perception and performance. 15(3), 419-433. Woodworth, R. & Schlosberg, H. (1954). Experimental Psychology. (Revised). New York: Holt.
This Page Intentionally Left Blank
m
J.I.D. Campbell (Editor) 0 1992 Elsevier Science Publishers B.V. All rights reserved.
301
Chapter 8
WORKING MEMORY, AUTOMATICITY, AND PROBLEM DIFFICULTY
Mark H. Ashcraft, Rick D. Donley, Margaret A. Halas
Cleveland State University

Mary Vakali
University of Thessaloniki
Summary
Two complementary topics are of special interest in the study of cognitive skills: first, the involvement of working memory resources in successful performance, and second, the role of automaticity in the component processes of such performance. While these questions figure prominently in contemporary cognitive research, they have only recently begun to receive direct investigation in the area of mental arithmetic. In this chapter, we review the existing research that bears on these issues, then present two experiments. Experiment 1 focused on the deployment of working memory resources during arithmetic processing using a standard dual-task method. Experiment 2 investigated automatic and conscious processing as revealed by a priming task. The results of both experiments are viewed in terms of the basic problem difficulty variable, and the relationship between this variable and manipulations that tap automatic and effortful aspects of performance. The chapter concludes with some remarks on the central construct of problem difficulty.

Introduction and background

The purpose of this chapter is largely empirical, rather than theoretical. Several papers in the literature have addressed the issues of attention and automaticity in mental arithmetic, from the standpoint of theories of arithmetic processing and from the standpoint of fluent performance and mastery of basic arithmetic and mathematics. Because these issues are relatively straightforward, and covered thoroughly elsewhere, they are only summarized briefly here. Instead, we focus
on the experimental attempts to assess working memory processes and automaticity in arithmetic, and present new evidence on each.

Because of the variety of ways in which the terms attention, working memory, automaticity, and the like are used, we must clarify our particular view here, and situate the two experiments being reported into an appropriate context. For the present paper, we intend the term attention to refer to the mental resources or effort allocated to a cognitive task; our particular interest, of course, is in a task that probes knowledge within the domain of simple arithmetic. While this view differs from some more specialized approaches to the concept of attention (e.g., see Trick, this volume, on the topic of visual attention), it does not necessarily contradict those approaches. Instead, it asserts the rather conventional meaning of the term attention, and takes essentially the same approach as the Working Memory view proposed by Baddeley and Hitch (1974). That is, we assume that: the pool of attentional resources that may be deployed for cognition is limited; deliberate assignment of resources to a demanding cognitive task enables that task to be completed; and competition for attentional resources from a second task will degrade performance to the extent that, one, the primary task requires resources for completion, and two, the total demand for resources exceeds the allocatable supply within working memory. Such a conception is largely the same as that connoted by the term "conscious" (Posner & Snyder, 1975) or "controlled processing" (Shiffrin & Schneider, 1977). That is to say, conscious processing is the effortful, deliberate performance that relies on attentional resources. Conversely, automatic or autonomous (Zbrodoff & Logan, 1986) processing is commonly viewed as the very rapid, skilled performance that is accomplished with few if any demands on working memory resources. Components of processing that are found to be automatic or autonomous involve elementary processes like memory retrieval, i.e., accessing a body of overlearned, stored knowledge (Logan & Klapp, 1991). On this view, performance can be influenced, either positively or negatively, via effortless, rapid, automatic activation of the underlying stored knowledge. To the extent that this occurs, we will conclude that the components of performance have achieved an appreciable level of automaticity. To the extent that there is no such influence on performance, or that the influence is delayed, we will conclude that the components operate at a more conscious, controlled level.

As an example of these distinctions, and as a prelude to the topic of interest, consider two contrasting problems, 2 + 3 and 14 + 27. The first is one of the 100 "basic facts" of addition, likely to have been stored strongly in an adult's memory due to overlearning. We would expect performance to this problem, i.e., either generating the answer or verifying that 5 is the correct answer, to be quite rapid,
effortless, and accurate. This amounts to a claim that the answer was retrieved automatically. If so, then this retrieval should not be degraded by a task that consumes additional working memory resources. Further, the retrieval might be facilitated if the memory representation for 2 + 3 had received prior activation. The second problem, however, requires not only retrieval of basic facts but also a carry operation from the units to the tens column. If executing the carrying rule in arithmetic is less than fully automatic, then overall performance should not only be slower, it should also suffer when an attention-demanding second task is performed simultaneously.

Having sketched these distinctions, it is now important to define and explain the most common and basic empirical effect found in the mental arithmetic literature, variously termed the problem size, problem strength, or problem difficulty effect. The phenomenon is as follows: When single-digit arithmetic problems (e.g., 4 + 3, 7 x 6) are tested, subjects' performance is slower and more error prone on problems with larger operands (addends, multipliers, etc.) and answers. The effect holds for simple addition (e.g., Ashcraft & Battaglia, 1978; Groen & Parkman, 1972), subtraction (e.g., Donley, 1991; Siegler, 1987b), multiplication (e.g., Parkman, 1972; Stazyk, Ashcraft, & Hamann, 1982), and division (Campbell, 1985). It is quite pronounced for children in early grade school (Ashcraft & Fierman, 1982; Cooney, Swanson, & Ladd, 1988), a result widely appreciated as an effect of children's reconstructive strategies used in dealing with larger problems (Siegler, 1987a, b). Though smaller in absolute terms, whether assessed by reaction time (RT) or error rates, the effect is just as reliable for college-aged adults as for children; furthermore, initial work (e.g., Allen, Ashcraft, & Weber, 1991; Geary & Wiley, 1991) confirms its presence among the elderly.

For adults, the effect is generally viewed as one of retrieval, rather than reconstructive processing. That is, adults do upon occasion rely on some reconstructive or strategic methods for solving simple problems (e.g., Geary & Wiley, 1991). Nonetheless, the problem size/difficulty effect is also apparent even in situations that minimize the possible contributions of non-retrieval processes (e.g., when slow RT trials were excluded in Ashcraft & Stazyk, 1981, Experiment 1; when related answers disrupted performance in Stazyk et al., 1982, Experiment 3, and Winkelman & Schmidt, 1974). Generally, explanations of the effect at the adult level appeal to a memory representation of simple arithmetic fact knowledge, in which the strength of the association or network connection between operands and answers governs the time necessary for successful retrieval (e.g., Ashcraft, 1982, in press; Campbell, 1987; see Siegler & Jenkins, 1989, for a related approach to children's performance).
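For concreteness, the strength account can be rendered as a toy retrieval model in which retrieval time is a decreasing function of a fact's associative strength. The following minimal Python sketch is purely illustrative: the strength values and timing parameters are hypothetical, not fitted to any norms.

```python
# Toy illustration of the strength account (not a fitted model): retrieval
# time decreases with a problem's associative strength, so weaker (typically
# larger) problems yield longer simulated RTs. All numbers are hypothetical.
strength = {(2, 3): 90.0, (4, 3): 75.0, (6, 9): 35.0, (8, 7): 30.0}

def predicted_rt(operands, intercept_ms=900.0, ms_per_unit=6.0):
    """Linear sketch: RT grows as associative strength declines."""
    return intercept_ms + ms_per_unit * (100.0 - strength[operands])

for prob in sorted(strength, key=strength.get, reverse=True):
    a, b = prob
    print(f"{a} + {b}: {predicted_rt(prob):.0f} ms")
```

On these toy numbers, 2 + 3 retrieves fastest and 8 + 7 slowest, reproducing the qualitative shape of the problem difficulty effect.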
Problem size was originally defined in a structural sense, i.e., as the numerical value of the smaller number in a simple addition problem, where this value specified the number of underlying counting operations needed to arrive at an answer (Groen & Parkman, 1972). Later research, however, has turned from the structural approach. Among other difficulties, it fails to predict a variety of subsequently reported empirical effects (e.g., the "confusion effect" in Stazyk et al., 1982; "error priming" in Campbell & Clark, 1989), and in fact is a less successful predictor of RTs than any of several normative measures of strength or difficulty (Campbell & Graham, 1985; Koshmider & Ashcraft, 1991). Thus, problem size per se is currently viewed as only a correlate of the more central variable, problem strength or difficulty.

The demonstrations of the problem difficulty effect in both reconstructive and retrieval performance raise questions of working memory resources and automaticity quite directly. That is, it would seem likely that a strategy-based reconstruction of the answer to 6 + 9 is an attention-demanding process, placing heavy demands on working memory, whereas a low difficulty problem like 2 + 3 should be retrieved rather than reconstructed, with accordingly low demands on working memory. Notice here that a common, though arguable (e.g., Ashcraft, 1990), connotation of the term "strategy" is that it occurs intentionally, is rather prolonged in its time for execution, and consumes mental resources. Thus, the attentional demands of mental arithmetic might be especially obvious in a setting that requires a second, attention-consuming task to be performed simultaneously. This reasoning formed the basis for Experiment 1 here. Similarly, to the extent that retrieval of simple arithmetic facts is an automatic or autonomous process, then that retrieval should be influenced by a priming manipulation, one that selectively activates related or irrelevant information in the memory representation. Experiment 2 was designed to assess automaticity of performance, hence low demand on working memory, within such a priming context.

Experiment 1: Working memory
What is the role of working memory in mental arithmetic? How dependent is arithmetic performance on conscious, deliberate, effortful processing, on the limited pool of mental resources? Research reviewed in Experiment 2 below indicates that for the simple, single-digit facts of addition and multiplication, much of adults' processing is quite automatic; according to Koshmider and Ashcraft (1991), even third graders show some automatic facilitation of retrieval on the smaller, less difficult multiplication problems. Such facts would presumably require little in the way of attentional processing, at least by adults. And yet, more
complex computations involving multi-digit problems (e.g., 27 + 15) would be expected to consume at least some attentional resources. To state the obvious, while we study and memorize the basic addition facts 0 + 0 up through 9 + 9 in elementary school, larger problems are intentionally computational. That is, the rules of arithmetic are generative, enabling any of the infinite number of sums (remainders, products, etc.) to be obtained, given knowledge of the rules and the basic facts. And furthermore, problems requiring carrying, borrowing, and the like would even more certainly require the efforts of an attention-consuming working memory system.

Hitch's (1978b) report on multi-column addition is the clearest evidence on the role of working memory and attentional processing in mental arithmetic. In this research, subjects were given multi-column problems orally, and had to write their answers in a timed task. In some conditions, subjects wrote answers in the "units, tens, hundreds" order, and in some conditions the reverse order was required (Experiments 2 and 3). In some conditions, one, both, or neither of the operands were presented on paper as the problem was read aloud (Experiment 4). In all cases, problems varied as to whether or not a carry operation was required, and whether the carry was from the units or tens column.

The results of these manipulations were quite straightforward. Requiring subjects to write answers in the reverse order, which of course imposed a delay on the reporting of some digits, led to an increase in errors. Conversely, errors decreased as the number of operands supplied on paper increased. Finally, both response latency and errors increased as the number of carry operations in the problem increased. In a related paper, Hitch (1978a) found similar effects in a mental multiplication task. Problems had either a 2-, 3-, or 4-digit "top number" as the multiplicand, and a 1-digit value as the multiplier. In the results, error rates were approximately 5% for values in the units column, regardless of the size of the first operand, but were 13%, 23%, and 31% for the tens column when the first operand was a 2-, 3-, or 4-digit multiplicand, respectively.

Hitch concluded in both papers that the results demonstrated an important role for working memory in the computation of arithmetic answers for complex problems. In particular, working memory was held to be the storage system for both initially presented operands as well as intermediate values computed during solution. Manipulations that prolonged the storage period for either type of information increased errors, because the manipulations served to place heavier storage loads on working memory. The increase in errors, and latencies in some studies, was attributed to a simple decay function in an overloaded working memory system.
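The storage-load account can be sketched as a simple column-by-column addition routine. The Python sketch below is our own illustrative bookkeeping, not Hitch's model: it merely counts carry operations and tallies a crude index of the digits that must be held until the answer can be reported.

```python
def column_addition_load(a: int, b: int):
    """Add a and b column by column (units first), counting carry operations
    and a crude 'storage load': operand digits plus answer digits that must
    be held until the whole answer can be reported."""
    da = [int(d) for d in str(a)][::-1]
    db = [int(d) for d in str(b)][::-1]
    carry, carries, result = 0, 0, []
    while da or db or carry:
        col = (da.pop(0) if da else 0) + (db.pop(0) if db else 0) + carry
        carry, digit = divmod(col, 10)
        carries += 1 if carry else 0
        result.append(digit)
    load = len(str(a)) + len(str(b)) + len(result)
    answer = int("".join(str(d) for d in reversed(result)))
    return answer, carries, load

print(column_addition_load(14, 27))  # (41, 1, 6): one carry to hold
print(column_addition_load(11, 10))  # (21, 0, 6): no carry
```

On this account, each additional carry adds a held value whose storage period can be prolonged by the reporting-order manipulations, which is where Hitch locates the increase in errors.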
There are two major limitations in these studies from the present perspective. First, Hitch's evidence of working memory involvement in mental arithmetic came from multi-column arithmetic problems, in which successive retrievals from the long-term memory "library of number facts" supplied the intermediate values necessary for problem solution. In short, the evidence is clear that holding multi-column operands in working memory, performing the carry operation, and holding intermediate values until they can be reported require working memory resources. This conclusion says nothing, however, about possible working memory involvement in the simple fact retrieval process itself. It is entirely possible, given the prominence of the problem difficulty effect, that retrieval of the basic facts, especially the more difficult ones, also consumes working memory resources. The second limitation is simply that no analysis was reported of the specific arithmetic facts that were tested, although it is clear that the basic facts vary in both latencies and errors.

The approach taken by Hitch thus needs to be repeated, testing both the basic arithmetic facts as well as larger, multi-column problems. Furthermore, it seemed desirable that an explicit manipulation of working memory load be employed, as in the classic dual-task paradigm reported in Baddeley and Hitch's (1974) initial research on working memory. To this end, we had subjects perform the addition task, using both basic addition facts and multi-column problems, while they simultaneously performed one of three competing, concurrent tasks. The nature of the disruption of arithmetic performance as a function of concurrent task, of course, was expected to shed light on the question of attentional processing in arithmetic.

Methods and procedures

A pilot study was conducted initially to identify a suitable control task to be used with addition, that is, a concurrent task that could be performed
simultaneously with addition without disrupting the ongoing addition performance. Because of the numerical nature of the primary task, we used letter-based tasks as a way to consume working memory resources without specifically interfering with the arithmetic stimuli. The pilot study revealed that simple letter repetition was a very adequate concurrent task to use as a control manipulation, placing demands on the articulatory subsystem while not disrupting the component, presumably the executive control system (Baddeley & Hitch, 1974), responsible for arithmetic processing. Thus, Experiment 1 used letter repetition as a control level of the concurrent task, and then used two modifications of the task for the manipulation of working memory load.
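The resulting design, described in detail in the Stimuli section below, crosses the full verification stimulus set with the three concurrent tasks. A minimal Python sketch of such a session builder follows; the two stimuli shown are hypothetical stand-ins for the 80 problems actually used.

```python
import random

# Hypothetical stand-ins for the 80 verification stimuli (each of 40
# problems appears once with a correct and once with an incorrect answer).
stimuli = [("4 + 3 = 7", True), ("4 + 3 = 9", False)]  # ...78 more in practice
concurrent_tasks = ["Repeat", "Word Generation", "Alphabetization"]

def build_session(seed=None):
    """One subject's session: the full stimulus set under each concurrent
    task, task order randomized across subjects, trials shuffled per block."""
    rng = random.Random(seed)
    session = []
    for task in rng.sample(concurrent_tasks, k=3):
        block = stimuli[:]
        rng.shuffle(block)
        session.extend((task, prob, truth) for prob, truth in block)
    return session

print(build_session(seed=1)[:2])
```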
Subjects. A total of 15 undergraduate students at Cleveland State University were tested, the only sampling restriction being normal or corrected-to-normal vision. Three subjects failed to attain 90% overall accuracy, so were excluded from further consideration; there was no apparent relationship between these errors and treatment condition.

Stimuli. A total of 40 addition problems were selected for testing: 20 of the basic addition facts, with single-digit addends, and 20 two-column addition problems. The 20 basic facts were selected from the entire range of basic fact problems, such that one- and two-digit answers were equally represented. The only further restriction on sampling was that once a problem was selected, its inverse was excluded from further consideration. For purposes of analysis, and to maintain consistency with the present Experiment 2, the 20 problems were categorized into low, medium, and high levels of problem difficulty, based on Siegler's (1987a) normative assessment of associative strength (see, for example, Koshmider & Ashcraft, 1991). Two-column problems were selected such that half required a carry, and half did not. Addends ranged from 10 to 34, and answers from 21 (for 11 + 10) to 47 (24 + 23). Each of these 40 problems appeared with its correct answer and also with an incorrect answer, yielding a set of 80 stimulus problems. For the basic fact problems, the incorrect answer differed from the correct value by +/- 1 or 2. For the two-column problems, the incorrect value appeared equally often in the units or tens column, and differed from that column's correct value by +/- 1 or 2. The entire problem set of 80 stimuli was presented three times to each subject, each time under a different concurrent task requirement. The order of concurrent tasks was randomized across subjects.

Apparatus. Subjects were tested individually at an IBM microcomputer with monochrome monitor, running the MEL (Schneider, 1988) software. Stimuli were centered on the screen, and presented in column fashion. Subjects made their "true" and "false" responses to the '1' and '2' keys on the lower right corner of the keyboard. The software presented all instructions to the subjects, sequenced the several blocks of trials, and recorded accuracy and RT in msec.

Conditions and instructions. Subjects were instructed in the tasks, given a 20-item practice set to familiarize them with the tasks and apparatus, and then were tested in all three concurrent task conditions. Speed and accuracy were given equal emphasis in the instructions. Subjects were specifically cautioned against slowing their verbalizations while answering the arithmetic problems. In all three conditions, a set of four letters appeared on the computer screen for 2 sec, followed by the 2 sec fixation point and then presentation of the addition problem. Subjects were told to begin the letter task immediately upon
presentation of the four letters, and to continue with this task until they finished responding to the arithmetic problem. In the control letter task, the Repeat condition, subjects saw a letter repeated four times on the screen. They were asked to repeat that letter out loud throughout the entire trial, at a rapid and constant rate. In the Word Generation condition, subjects saw the same letter as in Repeat, but were required to generate words beginning with that letter, naming them out loud throughout the duration of the trial, at a constant and rapid rate. In the Alphabetization condition, subjects saw a combination of four consonants. They were first required to name the four letters out loud, and then to name them in correct alphabetical order. Pilot testing confirmed that the Word and Alphabetize conditions were taxing enough to require vocalization throughout the duration of the trial.

Unfortunately, and despite the instructions and practice set of trials, subjects tended to slow down their verbalizations in these two conditions, especially when the addition problem appeared on the screen. We elected to remind subjects of their instructions only during the break between conditions, rather than to interrupt the ongoing trials within a set.

Results
We analyzed the data in our standard fashion, i.e., excluding errors and RTs identified as outliers by Dixon's test. The RTs as well as error percentages were subjected to repeated measures analysis of variance. Because the problem difficulty factor was not directly comparable for the basic facts and the two-column problems, these problem types were analyzed separately. For the basic fact analyses, the design evaluated the factors of concurrent task (Repeat, Alphabetization, Word Generation), true/false, and problem difficulty (low, medium, and high). For the two-column problems, the factors were concurrent task, true/false, and a two-level difficulty factor distinguishing problems with and without a carry operation. A separate analysis of RTs to false problems was also conducted on the two-column problems, to determine the effects of error location, i.e., whether the incorrect value was in the units or tens column of the problem.

Basic facts. After the three especially error prone subjects were excluded, the overall error rate was 3.53% (ranging from 0 to 6.8%). The only effect in the error analysis that approached significance was the problem difficulty effect, F (2, 22) = 2.63, p < 0.10 (all other ps > 0.40); error rates across increasing levels of problem difficulty were 2.35%, 2.69%, and 5.54%. An additional 4.2% of all trials were identified as outliers. The analysis of outliers by condition showed a significant problem difficulty effect, F (2, 22) = 5.32 (throughout the chapter, all effects described as significant attained at least the p < 0.05 level). Oddly,
extreme scores were nearly twice as common, 6.8%, on low difficulty problems than on the medium (2.3%) or high difficulty problems (3.5%). This seemed especially true for the Alphabetization task (mean extreme score rate of 9.1%), and to a lesser extent for the Word Generation condition (M = 6.4%), although the difficulty by task interaction was nonsignificant.¹

In the analysis of RT to the basic facts, all three main effects were significant. Briefly, RT to true problems was faster than to false problems, F (1, 11) = 5.75, with means of 1583 and 1693 msec, respectively. Performance slowed across the levels of problem difficulty, F (2, 22) = 15.70, with means for low, medium, and high difficulty problems of 1442, 1576, and 1896 msec, respectively. Overall, RT in the Repeat control condition was considerably faster (M = 1353 msec) than either the Alphabetization (M = 1763 msec) or Word Generation (M = 1798 msec) tasks.

Figure 1 depicts the problem difficulty effect as a function of concurrent task, separately for the true (left panel) and false (right panel) trials. Despite the apparent pattern of a more pronounced increase in true RTs for the Word Generation and Alphabetization tasks (approximately 400 and 600 msec, respectively) than for the Repeat control condition (approximately a 250 msec increase), the task x problem difficulty interaction was nonsignificant, F (2, 22) = 1.49, p < .25. Standard interpretation of this nonsignificance would hold that having to perform a task concurrent with mental addition merely added a constant amount of time to processing, but did not substantially alter the ongoing arithmetic performance. This would clearly be the simplest conclusion here, and the conclusion least troublesome for current views on the retrieval of basic arithmetic facts. It would suggest that retrieval of all basic addition facts is accomplished with little if any need for working memory resources, i.e., with virtually complete automaticity.
¹ The Dixon test for outliers forms a ratio of two ranges: in essence, the difference between a suspicious score and the next highest score, divided by the difference between the suspicious score and the lowest score (the comparison scores and the critical values for the ratio change as a function of the number of scores being considered, and whether the suspicious score is a large or small numerical value; for details, see Dixon, 1953). Critical to the test, therefore, is the numerical range among the "non-suspicious" scores, basically their variability. The unusual situation in our low difficulty condition was that the bulk of a subject's RTs would be quite low, with surprisingly low variability. Within such a set of scores, an occasional long RT is more likely to be identified as extreme by the test than a long score among slower yet more variable scores. Thus, it is unlikely that any special importance should be attached to this rather unusual finding, in our opinion.
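A minimal Python sketch of the simplest Dixon ratio (often called r10, or the Q statistic) illustrates the footnote's point. The critical values below are approximate two-tailed .05 values for small samples, included only for illustration; Dixon (1953) tabulates the full set.

```python
def dixon_r10(scores):
    """Dixon's simplest ratio (r10) for a suspiciously large score: the gap
    between the suspect and its nearest neighbor, divided by the full range.
    Dixon (1953) defines other ratios for larger samples and for suspect
    low scores; this is only the basic case."""
    xs = sorted(scores)
    return (xs[-1] - xs[-2]) / (xs[-1] - xs[0])

# Approximate two-tailed .05 critical values for r10 (illustrative only).
R10_CRIT = {4: 0.829, 5: 0.710, 6: 0.625, 7: 0.568, 8: 0.526}

rts = [610, 640, 655, 662, 1180]            # hypothetical RTs (msec), one cell
r = dixon_r10(rts)
print(round(r, 3), r > R10_CRIT[len(rts)])  # 0.909 True -> flag the 1180
```

Note how the tight, fast cluster makes the single long RT stand out; the same 1180 msec score embedded among slower, more variable scores would yield a smaller ratio and escape flagging.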
Figure 1. Mean RT to Basic Addition Facts of Low, Medium, and High Difficulty, for Repeat, Word Generation, and Alphabetize conditions (Experiment 1).

Further consideration, however, suggests that such an interpretation, i.e., accepting the null hypothesis, is not so straightforward here. First, while the 250 msec increase in RT across difficulty levels is well within the range normally obtained in studies of simple addition (e.g., Ashcraft & Battaglia, 1978; Geary & Wiley, 1991), increases in the 400 to 600 msec range, as in the Word Generation and Alphabetization tasks, are much longer than is customarily observed. Even by themselves, these increases suggest that the null hypothesis should be questioned. Second, although no documentation (e.g., tape recordings) can be presented to support this argument, subjects in fact experienced difficulties in maintaining the rate of overt verbalization. Typically, their verbalizations slowed noticeably during the Word Generation and Alphabetization tasks, despite careful warnings against slowing down during the practice session. A reasonable interpretation of this is that subjects sacrificed speaking rate for speed on the arithmetic task. Had they maintained a constant rate of verbalization, their RTs could easily have shown even more pronounced increases. Third, the analysis of variance interaction may not be the most precise way of testing for possible slope differences across problem difficulty. To permit a more detailed analysis, we computed the mean RT to each of the 20 basic facts included in the true stimulus set, separately for each concurrent task, and then regressed
these on Associative Strength (Siegler, 1987a), the normative measure used to categorize the problems into low, medium, and high difficulty sets. Table 1 shows the relevant values from these regression analyses. The intercept and slope values were clearly lower in the Repeat condition than in either the Word or Alphabetization tasks. While the intercept values differed significantly across groups, F (2, 22) = 5.52, the slope values did not, F < 1.40. Nonetheless, the slope values for the Word Generation and Alphabetization conditions did fall outside of the 90% confidence interval of the Repeat condition slope (-7.44 to -3.42), but just within the 95% confidence interval. This is equivalent to saying that the slope difference between Repeat and the two more demanding tasks was marginally significant at p < 0.10.

Table 1. Summary of Regression Analyses for Associative Strength to RT
Concurrent Task    intercept (s.e.)    slope (s.e.)    r
Repeat             1640 (89.2)         -5.43 (1.16)    -.741
Word               2244 (142.8)        -7.66 (1.86)    -.697
Alphabetize        2184 (298.7)        -7.75 (3.89)    -.425*

Note. Critical r = 0.444 (18 df); *p < 0.07
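The regressions in Table 1 are ordinary least-squares fits of per-problem mean RT on associative strength. A Python sketch of the slope and confidence-interval computation follows; the synthetic data stand in for the 20 per-problem means in one concurrent-task condition, with parameters loosely echoing the Repeat row (illustrative, not the actual data).

```python
import numpy as np
from scipy import stats

def slope_with_ci(strength, rt_ms, level=0.90):
    """OLS regression of per-problem mean RT on associative strength;
    returns the slope, its standard error, and a confidence interval."""
    fit = stats.linregress(strength, rt_ms)
    t = stats.t.ppf(0.5 + level / 2.0, df=len(strength) - 2)
    return fit.slope, fit.stderr, (fit.slope - t * fit.stderr,
                                   fit.slope + t * fit.stderr)

# Synthetic data shaped like the Repeat condition (20 problems):
rng = np.random.default_rng(0)
strength = rng.uniform(20.0, 90.0, size=20)
rt = 1640.0 - 5.43 * strength + rng.normal(0.0, 80.0, size=20)
print(slope_with_ci(strength, rt))
```

With 20 problems, the t critical value is evaluated on 18 degrees of freedom, matching the table note; widening `level` to 0.95 reproduces the wider interval within which the Word and Alphabetize slopes fell.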
We conclude from these three arguments that the evidence against the null hypothesis of no slope difference across conditions is stronger, though not at the conventional significance level, than is the evidence for the null hypothesis. Especially in light of the slower verbalization that characterized performance in the Word Generation and Alphabetization tasks, it would seem that the concurrent tasks indeed influenced arithmetic processing at least to some degree. To this degree, then, one would expect that more rigorous enforcement of a constant verbalization rate could rather clearly reveal disruption of basic fact retrieval processes. Such a result would be tantamount to saying that important elements of simple fact retrieval in addition rely on working memory resources for their completion, especially for difficult problems.

Two-digit addition
Accuracy varied more widely on the two-digit addition problems than on the basic facts. Overall error rate on the two-digit problems was 6.8%. Analysis of
variance indicated a significantly higher error rate on the carry problems (9.2%) than on the no-carry problems (4.4%), F (1, 11) = 13.30. The higher error rate on false problems (9.3%) than true problems (4.3%) approached significance, F (1, 11) = 4.30, p < 0.07. Within the false trials only, higher error percentages were associated with carry (12.8%) than with no-carry (5.8%) problems, F (1, 11) = 5.91. Problems with the error in the 10s column were marginally more error prone (12.5%) than those with a 1s column error (6.1%), F (1, 11) = 3.76, p < 0.08. The interaction of carry/no-carry and location of error also approached significance, F (1, 11) = 3.88, p < 0.08, showing approximately 18% errors for carry problems with the error in the 10s column, but a nearly constant 6% error rate for the other conditions.

In the reaction time analysis, the main effects of carry/no-carry and concurrent task were significant; respectively, F (1, 11) = 14.93, and F (2, 22) = 8.59. Carry problems required an average of 2588 msec for solution, compared to 2325 msec for no-carry problems. Mean RT in the Repeat condition was 2071 msec, 2595 msec for Word Generation, and 2704 msec for Alphabetization. The tendency was for false problems to show a greater increase in RT for carry problems than true problems did, although this interaction only approached significance in the overall analysis, F (1, 11) = 3.77, p < 0.08. Analysis of only the false trials, however, showed the important way this tendency depended on location of error. That is, both the main and interaction effects of carry/no-carry and location of error were significant in the false-only analysis; for the interaction, F (1, 11) = 21.18. This effect is displayed in Figure 2; we present these data separately for the three concurrent task conditions because of the significance of that main effect in both the overall analysis and the false-only analysis.

Several aspects of Figure 2 deserve special mention here. First, the functions across the problem type variable may be considered conceptually equivalent to the problem difficulty curves for the basic addition facts in Figure 1. That is, for 2-column problems, no-carry and carry represent the factor of problem difficulty. As such, the interaction of the carry/no-carry factor with error location, and the replication of this interaction at each level of concurrent task, is evidence for a significant involvement of working memory. Second, the locus of that involvement seems especially associated with the carry operation. In particular, RTs were most affected on the carry problems when the error was in the 10s column of the answer. This is of course quite reasonable; detection of an error in the 1s column permits the subject to self-terminate processing prior to the several 10s column steps (executing the carry, retrieving the 10s column sum, comparing the stated answer to the retrieved/computed sum; see Widaman, Geary, Cormier, & Little, 1989). Failing to find an error in the 1s column, however, necessitates further
processing to maintain accuracy.

Figure 2. Mean RT to Two-digit Addition Problems, separately for the Repeat, Word Generation, and Alphabetize conditions. Each panel displays RT to True problems, to False problems wrong in the 1's column (F-1's), and to False problems wrong in the 10's column (F-10's) (Experiment 1).

Interestingly, though the ultimate decision "false" on the 10s column error problems presumably requires only one more step beyond those necessary for true carry problems, the RTs and errors showed a marked increase in that condition. And although the interaction with concurrent task was nonsignificant, notice from Figure 2 that the RT increase across the carry factor
for problems wrong in the 10s column was fully 800 msec in the Alphabetization task, contrasted with approximately 500 msec in Word Generation, and 400 msec in the Repeat task.

Three general conclusions may be offered based on these results. It is apparently true that even retrieval of the basic addition facts relies somewhat on working memory resources. Performance was not only slower overall under the two demanding concurrent tasks, but aspects of the results indicated that this interference had a somewhat stronger effect on the more difficult problems (Figure 1, Table 1). Furthermore, two-digit addition problems showed strong interference from the concurrent tasks, indicating rather substantial reliance on working memory processes. Interestingly, working memory would appear to be especially taxed when the carry operation is required and an error is embedded in the 10s column of the answer (e.g., Widaman et al., 1989). This result suggests that, aside from the necessity of working memory resources for retrieval and computation, the comparison and decision-making processes necessary for the verification task also place sizeable demands on the working memory system.

In summary, the results argue for a working memory involvement at a rather subtle level for retrieval of difficult basic facts, and at a higher, more central level for complex addition with carrying. The experiment both replicates the general conclusions offered by Hitch (1978a, b) and extends them down to the level of the basic addition facts. Notice, however, that the precise nature of the underlying addition processes affected by the concurrent tasks is not addressed directly by these results. It seems likely from a variety of standpoints that performance to the basic addition facts, certainly for adults, is largely an issue of retrieval from organized memory (e.g., Ashcraft, in press). If so, then the present results suggest that such retrieval, perhaps especially for more difficult problems, remains at a level that requires some resources from working memory. Alternatively, more difficult problems may be prone to greater involvement of strategy-based, computational solution (e.g., Geary & Wiley, 1991). Such solutions would seem quite naturally to require working memory resources for their execution. Despite the apparent clarity of this distinction, definitive evidence on one or the other alternative has yet to be reported.

Experiment 2: Automaticity

As discussed at the outset, there is a close, almost inverse connection between studying the influence of working memory on performance and studying the role of automaticity in that performance: To the degree that component processes become automatic or autonomous, their reliance on working memory should
become negligible; to the degree that cognitive processes remain at a fairly conscious, controlled level, they should require substantial working memory resources for successful completion. In Experiment 2, we decided to examine automaticity directly, rather than rely on the absence of a working memory effect to infer automatic processing. To that end, we modeled Experiment 2 after Koshmider and Ashcraft's (1991) developmental study of multiplication. Because that paper reviews the literature and issues rather extensively, only a synopsis is provided here.

Researchers have speculated for several years that mental arithmetic, at least for the basic facts of addition and multiplication, should demonstrate clear evidence of automatic processing (e.g., Resnick & Ford, 1981), and that there should be a discernible progression across childhood from relatively conscious to relatively automatic processing of the basic facts (e.g., Ashcraft, 1982). Kaye (1986) presented an excellent discussion of these issues from the developmental standpoint of mastery of arithmetic. To paraphrase, Kaye notes that an individual who must expend conscious resources in computing answers to the basic facts will suffer when complex arithmetic processing is required. If laborious reconstruction is necessary for 9 x 6, for instance, then insufficient working memory capacity will remain for the further operations like carrying, holding intermediate values, etc., in the problem 439 x 6. Note that such thinking, although at an intuitive level, has traditionally characterized teaching methods for basic arithmetic; regardless of orientation, elementary school curricula seem invariably to stress the need for "memorization of the basic facts," so that more difficult problems can be approached successfully.

Aside from this practical need for some degree of automaticity in performance to the basic facts, current theoretical explanations of arithmetic processing also suggest an important role for automatic processing. That is, current theories (see Ashcraft, in press, for a review) generally claim that adults' performance to simple addition and multiplication facts is due to a memory retrieval process, operating on an organized, interconnected memory representation of the facts. Just as in theories of semantic representation, the retrieval process is viewed as one of spreading activation, which activates both target information as well as related information within the memory structure. This spreading activation process is normally viewed as highly automatic; i.e., spreading activation is an obligatory, non-optional process, triggered by the input regardless of a subject's intentions or goals. Indeed, Logan and Klapp (1991) have recently claimed that memory retrieval is the requisite process that underlies automaticity.

Several lines of research provide both direct and indirect evidence on the involvement of automatic processing in arithmetic. Koshmider and Ashcraft (1991)
discussed three studies in particular, reports by LeFevre, Bisanz, and Mrkonjic (1988), Zbrodoff and Logan (1986), and Campbell (1987). In all of these, the common element is that related information can alter the processing of an arithmetic stimulus. When this influence occurs rapidly, say within 250 or 300 msec of stimulus onset, then the effect is generally attributed to automatic processes, occurring well before more controlled, conscious factors can have an effect (e.g., Posner & Snyder, 1975).

In the LeFevre et al. study, subjects were tested with a simple digit matching task, saying "yes" when the prime, termed a "probe digit," matched either of two subsequently presented target digits. Evidence for automaticity was obtained in their "sum probe" condition, when the probe was in fact the sum of the two target digits. In this condition, deciding that the probe did not match either target digit was significantly slower than decisions in the "neutral probe" condition. Such a pattern would be expected if presentation of the target digits triggered an automatic retrieval of the addition fact representation in memory. Interestingly, LeFevre et al. found this pattern only at a Stimulus Onset Asynchrony (SOA, the time interval between prime and target onset) of 120 msec, suggesting that whatever activation process was triggered by the probe had a very short term effect, followed by an inhibitory or suppression effect.

More directly, Campbell (1987, Experiment 2) presented the basic multiplication facts in a production task, in which subjects name the answer to the problem out loud, as rapidly as possible. Each problem was preceded by one of four types of primes: (1) a pair of number signs (##), which served as the Neutral prime, (2) the Correct answer to the upcoming target, (3) a Related, or (4) an Unrelated value. As determined in Campbell's Experiment 1, Related primes were high frequency errors to the upcoming target (e.g., 24 for 4 x 8), and Unrelated primes were low frequency errors to their targets (e.g., 28 for 4 x 8). The results showed clearcut facilitation and inhibition of performance depending on prime type. With a constant 300 msec SOA, Correct primes speeded performance by 100 msec, relative to neutral baseline, whereas Related primes slowed performance by nearly 50 msec. In other words, 300 msec of advance information yielded a 100 msec benefit when the prime activated the correct answer to the target, but a 50 msec cost when the activated value was related but incorrect. Interestingly, the benefit under correct priming was 55 msec stronger for difficult problems than for easy problems. Campbell's (1991) recent extension of this work, with primes presented a constant 200 msec in advance of the problem, revealed a 40 msec stronger benefit for difficult than easy problems with Correct primes, and an approximately 30 msec greater cost for difficult than easy problems preceded by Related but incorrect primes.
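The cost/benefit logic in these priming studies is simple arithmetic against the Neutral baseline. A minimal Python sketch follows, attaching Campbell's (1987) approximate effect sizes to a hypothetical 1000-msec baseline (the baseline value is our invention, for illustration only).

```python
def priming_effects(rt_neutral, rt_correct, rt_related, rt_unrelated=None):
    """Benefits and costs relative to the Neutral baseline; positive benefit
    means faster than baseline, positive cost means slower."""
    out = {"benefit_correct": rt_neutral - rt_correct,
           "cost_related": rt_related - rt_neutral}
    if rt_unrelated is not None:
        out["cost_unrelated"] = rt_unrelated - rt_neutral
    return out

# Campbell's (1987) pattern at a 300-msec SOA, hung on a hypothetical
# 1000-msec baseline: ~100-msec Correct benefit, ~50-msec Related cost.
print(priming_effects(rt_neutral=1000, rt_correct=900, rt_related=1050))
```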
Finally, Zbrodoff and Logan (1986) replicated and extended the "confusion" effect originally reported by Winkelman and Schmidt (1974). In both studies, addition and multiplication problems were occasionally presented with an answer correct under the other operation, e.g., 5 + 4 = 20. Significant slowing or interference was observed on these problems, again indicative of an automatic spreading activation process that accesses related but misleading information, thus slowing the overall processing of the problem. Stazyk et al. (1982) found the same pattern using distractor answers taken from the same operation, e.g., 7 x 4 = 21, thus eliminating misperception of the plus or times sign as the reason for the confusion effect. On the other hand, Zbrodoff and Logan's results (Experiments 5 and 6) demonstrated that these problems were also influenced by a "stop signal" manipulation, in which subjects are interrupted on a portion of the trials. On a subsequent recognition task, subjects showed lower recognition rates for those problems that had been interrupted by the stop signal than for those not interrupted, especially when the stop signal had occurred earlier in the trial, i.e., with shorter SOAs between problem and signal onset. Whereas the confusion effect in the earlier studies demonstrated that the retrieval process was triggered quite automatically by the stimulus, these last two experiments also demonstrated that intention to complete processing was also a component of performance. In other words, arithmetic processing could not be viewed as totally automatic, because a manipulation of intention, presumably under conscious control, had an effect. Zbrodoff and Logan concluded, therefore, that mental arithmetic performance should be viewed as "partially autonomous," i.e., beginning automatically but requiring deliberate effort for completion. Such deliberate effort, presumably, would correspond to a load on working memory, as tested in Experiment 1 here.

In Experiment 2, we hoped to extend the results presented in the Koshmider and Ashcraft report. In that study, adults showed automatic costs and benefits in a standard priming task, but those effects depended on problem difficulty; aside from Campbell's two reports (1987, 1991), none of the reviewed studies provided detail on priming as a function of problem size or difficulty. Koshmider and Ashcraft found that multiplication problems in the easy and medium categories showed significant benefits under Correct answer priming, even at the short, 225 msec SOA level, but that high difficulty problems showed benefits only at the 450 msec and 1400 msec SOA levels. This pattern suggests strongly that degree of automaticity, hence level of benefits, depends on problem difficulty in a direct and important way. Likewise, the costs of irrelevant priming were limited to the easy and medium levels of difficulty, and for the most part to shorter SOAs. As in Campbell's (1987, 1991) procedure, but unlike Koshmider and Ashcraft's (1991),
we tested three types of number primes in the present study: Correct answer primes, primes that were Related yet not correct answers, and genuinely Unrelated primes, primes that are tabled answers in simple multiplication yet non-multiples of the target problem.

Methods and procedures

Subjects. Twenty-five undergraduates at Cleveland State University served as subjects, the only sampling restriction being normal or corrected-to-normal vision. Unfortunately, an error in one copy of the experimental software prevented us from using data from 6 of these subjects.

Stimuli. The stimulus set was an extended version of the set used by Koshmider and Ashcraft (1991). We excluded all problems with multipliers of 0 and 1, and all tie problems (e.g., 7 x 7) as well, leaving 28 unique problems m x n = p, and their 28 inverses (n x m = p). As a problem was randomly selected from this pool of 56 to appear as a true problem, its inverse was assigned to the false set; thus, the true and false sets included equal numbers of problems in which m > n, and vice versa. These problems were then categorized into either low, medium, or high levels of problem difficulty, based on Siegler's (1988) Associative Strength values, with 10, 10, and 8 problems in the categories, respectively. Finally, we assigned answers to the problems in the false set by selecting the nearest tabled multiplication answer that was not a multiple of the problem (e.g., either 20 or 25 could be selected as the false answer for 7 x 3, because both are tabled answers to the single digit multiplication facts and neither is a multiple of 7 or 3). Approximately half of the false answers were less than the correct answer to the problem, and half greater.

Each of the 28 true problems then had values assigned to be used as primes. In the true condition, Correct primes were simply the correct answers to the problems. Related primes were values that were multiples of one of the problem's operands, obtained by incrementing or decrementing the other operand by 1; e.g., for 5 x 7 = 35, the related prime was 30, the answer to 5 x 6. Unrelated primes were tabled multiplication answers near the correct answer that bore no relationship to either operand (e.g., 36 for 7 x 5 = 35). To ensure that problem-to-problem variations did not contaminate the RT comparisons across prime types, problems were randomly assigned to the Correct, Related, or Unrelated prime conditions, and then presented only in that condition. Thus, the problem 2 x 5 = 10 was assigned to the Correct prime condition, 3 x 2 = 6 to the Related condition, 7 x 2 = 14 to the Unrelated condition, and so forth. Of course,
each problem appeared with the Neutral prime, a pair of hyphens, at all three SOA levels.

Primes for false problems were designed to control for two unavoidable aspects of the true priming procedure. First, when true problems are primed with Correct answers, there is a match between prime and answer, a factor that should be controlled to eliminate the possibility of responding "yes" whenever the prime and the stated answer matched. Thus for false problems, Correct primes were in fact the answer stated in the target, as is the case in true Correct primes; for instance, 27 was the Correct prime for the false problem 6 x 4 = 27. This procedure controls for the answer-match artifact. Second, a Related prime on a true trial necessarily mismatches the answer but is multiplicatively related to the problem. Thus we used the genuinely correct answer to the problem as the Related prime on false trials, which provides the related, answer-mismatch control needed for the true condition; for example, 24 was the Related prime for 6 x 4 = 27. Finally, Unrelated primes on false trials were the same as in the true Unrelated condition, unrelated to the correct or stated answer, and mismatching the stated answers. Note that these manipulations were intended merely to equate the true and false problem sets on the frequency of the answer-match and answer-mismatch trials. Given recent evidence that a stated answer itself generates significant priming (e.g., Campbell, 1987; Zbrodoff & Logan, 1986), which would include trials with incorrect stated answers, no attempt to ascertain priming effects among false trials was planned.²

Apparatus and Conditions. All testing was conducted on the same microcomputers and software as in the first experiment. After receiving instructions and a set of 24 practice trials, subjects saw the total of 336 experimental trials; each subject saw a different randomized order of trials. Subjects were given two short rest periods during the sequence of trials. Each trial consisted of an initial fixation point, a 100 msec presentation of the prime, a second fixation point, and finally the target problem, which remained on the screen until a response was made. The second fixation point was presented for either 100, 300, or 900 msec, thus yielding the three SOA intervals of 200, 400, and 1000 msec between onset of the prime and onset of the target.
² This result might seem to imply that priming effects cannot be assessed in the true/false verification task at all, given that a stated answer seems to generate additional priming. We argue that this is not a difficulty in the true trials here, since they all presented the correct answer in the target stimulus. The verification stimulus may have the effect of reducing RT because of priming, relative to performance in a production task, but the effect should be constant across all true conditions here.
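The trial timeline reduces to a fixed 100-msec prime plus a variable second fixation. A minimal Python sketch of how the three SOAs were realized (the initial fixation duration is not specified in the text, so it is left open):

```python
def priming_trial(soa_ms):
    """Event list for one trial: a 100-msec prime plus a variable second
    fixation, so that prime onset leads target onset by exactly the SOA.
    (The initial fixation duration is not specified here.)"""
    assert soa_ms in (200, 400, 1000)
    return [("fixation", None),
            ("prime", 100),
            ("fixation", soa_ms - 100),   # 100, 300, or 900 msec
            ("target", "until response")]

for soa in (200, 400, 1000):
    print(soa, priming_trial(soa))
```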
Results
Overall, error rates were 4.9% and 4.6% for the true and false trials respectively, indicating that the prime-answer controls indeed prevented subjects from responding just on the basis of matching/mismatching primes and stated answers. For reasons explained above, no further reference is made to the false trial data. Within the true trials, extreme scores were identified as in the first experiment, accounting for 3.7% of the RTs. The analysis of error and extreme score percentages within the true trials revealed no significance, either for main or interaction effects.

Within the RT data, two main effects and two interactions were significant. Mean RT increased significantly across the low, medium, and high problem difficulty categories, F (2, 36) = 79.40; the means were 1043, 1135, and 1305 msec, respectively. Mean RT was 1134 msec under Neutral primes, 1105 msec under Correct primes, 1204 msec for Related primes, and 1201 msec for Unrelated primes, F (3, 54) = 6.73. Although these priming effects were in the same directions as found in Campbell (1987, 1991), their magnitudes were changed; i.e., the benefits of Correct answer primes, which averaged 30 msec here, were considerably smaller than those reported by Campbell, but the costs of Related and Unrelated primes, 70 msec, somewhat larger than Campbell reported. Furthermore, prime type interacted significantly with SOA, F (6, 108) = 5.38, and this two-way interaction was qualified by its three-way interaction with problem difficulty, F (12, 216) = 2.90. The simpler problem difficulty effects as a function of prime type are presented in Figure 3, followed by the cost/benefit functions of the interaction, in Figure 4.

Figure 3 shows rather clearly that Related and Unrelated primes exerted a slowing effect on processing, compared to the Neutral prime function. Notice that the Neutral and Unrelated functions were largely parallel here, showing an approximately 75 msec difference at each level of problem difficulty. Related primes, however, showed a slight tendency (F (1, 36) = 1.91, p < 0.17) to disrupt performance more at medium and high levels of problem difficulty; the differences between neutral and related conditions grew from 50 to 130 msec across levels of difficulty. Correct answer priming, finally, seems to have had a positive effect largely on the medium difficulty problems, at least as depicted in this figure.

Figure 4 presents further detail on these effects, as they changed across the levels of SOA. First, note that only the medium difficulty category (upper right panel) showed consistent evidence of facilitation across all SOA levels. Problems of low difficulty showed small to moderate facilitation at the two longer SOAs, but no benefits at all at the 200 msec SOA. Similarly, high difficulty problems
[Figure 3 here: mean reaction time (msec, roughly 900-1300) as a function of problem difficulty (Low, Medium, High) for the Related (R), Unrelated (U), Correct (C), and Neutral (N) prime conditions.]
Figure 3. Mean RT to True Basic Multiplication Facts of Low, Medium, and High difficulty, for Neutral, Correct, Related, and Unrelated prime conditions (Experiment 2).
demonstrated facilitation only at the longest SOA interval; the apparent 100 msec cost at the shortest SOA level here was nonsignificant, F(1,18) = 1.25, p > 0.20. The major discrepancy between these patterns and those obtained by Koshmider and Ashcraft (1991) involves the lower benefits in the present study under Correct priming; in Koshmider and Ashcraft's adult data, benefits under Correct priming were strong for all three difficulty levels, at all SOAs. The other functions displayed in Figure 4 concern the generally inhibitory effects of both Related and Unrelated primes. Uniformly, Unrelated primes showed either a small but constant or a growing inhibitory effect on performance. While the effects of Related priming were generally negative, as suggested by Figure 3, and as reported elsewhere (e.g., Campbell, 1991), two points departed from this pattern, showing apparent benefits for medium difficulty problems at the longest SOA, and for high difficulty problems at the shortest SOA.
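The cost/benefit scores plotted in Figure 4 are simply deviations from the Neutral baseline. A minimal sketch of that computation (hypothetical data structure; the sign convention follows the figure, with benefits positive):

    # Cost/benefit relative to the Neutral (double-hyphen) prime condition.
    # mean_rt maps (prime_type, difficulty, soa) to a mean RT in msec.
    def cost_benefit(mean_rt, prime_type, difficulty, soa):
        return (mean_rt[("Neutral", difficulty, soa)]
                - mean_rt[(prime_type, difficulty, soa)])  # positive = benefit

    # Illustrative values in the range reported above:
    mean_rt = {("Neutral", "medium", 400): 1135,
               ("Correct", "medium", 400): 1105}
    print(cost_benefit(mean_rt, "Correct", "medium", 400))  # 30 msec benefit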
[Figure 4 here: three panels (Low, Medium, and High difficulty) plotting cost/benefit (msec, roughly -400 to +400) against SOA (200, 400, and 1000 msec) for the Correct (C), Related (R), and Unrelated (U) prime conditions.]
Figure 4. Costs (negative values) and Benefits (positive values) to True Multiplication Facts of Low, Medium, and High Difficulty, with Correct, Related, and Unrelated Primes, across three levels of SOA (Experiment 2).
Upon closer inspection, these two rather anomalous points seemed due not to strong, consistent effects, but instead to unexpectedly high variability among problems within the levels of difficulty; that is, only some of the Related priming problems showed the unusual benefits that are apparent in mean performance. In an attempt to understand this tendency better, we conducted several reanalyses: we regrouped the problems into difficulty levels based on Campbell and Graham's (1985) RT norms, as Campbell (1987, 1991) has done; we recomputed separate
Neutral baseline RTs for each priming condition, such that each individual problem served as its own Neutral baseline; and we divided the subject sample into faster and slower groups, against the possibility that slower subjects were introducing a disproportionate amount of variability. None of these alternate schemes yielded appreciably more systematic functions for Related primes.

Several explanations exist for these rather unusual findings; because we are currently exploring some of these possibilities, we provide only a brief mention of them here. As stated, the obstacle appears to be inordinately large variability among the problems within a particular difficulty level. One reason for this lack of systematicity could be procedural factors related to the verification task; recall that Campbell's (1991) rather straightforward results came from the production task. Furthermore, the low incidence of genuinely relevant and helpful prime-target pairs (i.e., a Correct answer prime for a true target), and the high frequency of "misleading" primes (i.e., both Related and Unrelated primes for true targets, and all number-primed false trials) may have distorted or altered the costs and benefits of priming here. In fact, this possibility may have interacted with problem-specific factors. For instance, the slowest and second most error prone problem in Campbell and Graham's (1985) tabulation, 6 x 9 = 54, appeared here in the Correct prime condition, and took over 200 msec longer to verify at the shortest SOA than it did under Neutral priming. Within the present design, even relatively few such cases could easily distort the computed costs and benefits. Finally, it may also be that the priming effect is more fragile or variable than would be expected, especially when problems are tested repeatedly across multiple levels of SOA. For example, the relatively strong Correct priming effect in Koshmider and Ashcraft (1991) was obtained in an experiment where fully 75% of the true primed trials had Correct primes. In contrast, only 1/3 of the true primed trials here had Correct primes. The influence of such overall proportions has been reported elsewhere, for a letter matching task (Posner & Snyder, 1975); it may be just as important, or even more so, in studies of mental multiplication.
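The second reanalysis mentioned above, scoring each problem against its own Neutral baseline, amounts to the following sketch (hypothetical names, not the analysis code actually used; it assumes every problem has Neutral trials, as in this design):

    from collections import defaultdict

    # trials: (problem, prime_type, rt) tuples. Each problem's Neutral-prime
    # mean RT becomes its own baseline, replacing a baseline pooled over all
    # problems at a difficulty level.
    def per_problem_costs_benefits(trials):
        sums = defaultdict(lambda: [0.0, 0])
        for problem, prime_type, rt in trials:
            if prime_type == "Neutral":
                sums[problem][0] += rt
                sums[problem][1] += 1
        baseline = {p: total / n for p, (total, n) in sums.items()}
        return [(problem, prime_type, baseline[problem] - rt)  # positive = benefit
                for problem, prime_type, rt in trials
                if prime_type != "Neutral"]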
Discussion

The results of this experiment add modestly to the growing literature on automatic costs and benefits in mental arithmetic. In particular, Correct primes yielded a generally positive effect on RT, though high difficulty problems showed this effect only with 1000 msec of advance exposure to the prime. This would suggest a fairly high degree of automaticity for low and medium difficulty problems, but reliance on more controlled processing for problems of high difficulty. Related and Unrelated primes yielded generally negative effects on
processing, especially for more difficult problems. Thus, in both cases the degree of the priming influence depended on problem difficulty, in agreement with Campbell's (1991) recent report, and with arguments presented earlier; low and medium difficulty problems are influenced at short SOAs, suggestive of automatic processing, but high difficulty problems show effects well after conscious processing has begun, i.e., at the longest SOA.

Note, however, that it might be viewed as rather surprising that Related primes yield the same inhibitory effect as observed with Unrelated primes. Related primes are, after all, multiples of the problem's operands, i.e., are multiplicatively related to the upcoming target problem. To be sure, Campbell's manipulation of Related primes would be expected to yield inhibition; values that occurred frequently as errors were used to prime the problems in his report, generating substantial interference. On the other hand, from the standpoint of spreading activation, one might expect that related primes would activate relevant problem-answer representations in memory, thus facilitating retrieval because the target had already received some degree of activation. That is, 30 could be expected to activate not only 6 x 5, but several of the nearby "6 times" and "5 times" facts as well. If this prime is then followed by 7 x 5 = 35, one would expect clear, though possibly moderate, facilitation. Unrelated primes would be expected to activate information truly irrelevant to the upcoming target, e.g., 21 as a prime for 6 x 5 = 30. This is apparently not the case, however. It seems, instead, that any prime other than the correct answer yields a disruptive effect. This disruption is, of course, evidence that the prime is indeed activating information in memory, and exerting an influence on the retrieval process. The influence, however, is largely one of interference.

General discussion

We turn to one final theoretical issue that deserves discussion here as well, since it ties the present experiments to other recent work on the development of automaticity. Logan and Klapp (1991; see also Klapp, Boches, Trabert, & Logan, 1991) have investigated the growth of automaticity in an "alphabet arithmetic" task, i.e., a task in which specific facts such as A + 2 = C are learned and then responded to in a true/false RT task. In Logan and Klapp's first experiment, the evidence showed that a set of 40 such facts achieved automaticity after extended practice. Of greater interest, their Experiment 2 showed that a set of only six such facts achieved automaticity within a single session, thus ruling out extended practice as a necessary condition for automaticity. In fact, their overall conclusion
was, simply, that having an item stored in memory, hence available for retrieval, was the governing factor in achieving automaticity. Skill, on the other hand, would correspond to having a greater number of such items within the domain stored.

While we do not question Logan and Klapp's evidence, the question remains as to why greater evidence of automatic retrieval, especially as shown by facilitation, is not apparent in a multiplication task. To be sure, there are more multiplication facts than the sets tested by Logan and Klapp. On the other hand, their third experiment found that learning rate depended on the number of presentations of individual items, not the sheer number of items to be learned. As such, it seems extremely peculiar that the evidence for clearcut, automatic retrieval in the present data is relatively weak; one would imagine that by adulthood, each of the multiplication facts would have been presented far more frequently than is necessary to establish the item in accessible memory.

An important key to this conclusion, however, appears to involve the effects of problem difficulty, a factor not addressed by Zbrodoff and Logan. Evidence is now growing that smaller arithmetic facts, i.e., those with smaller operands, are "advantaged" at the early elementary school level. Hamann and Ashcraft (1986) found that addition problems with addends ranging from 2 to 5 occurred nearly twice as frequently as those with addends outside that range (e.g., 0, 1, 6, etc.) in textbooks for grades Kindergarten through Three; Ashcraft and Christy (1991) have found the same relationship in texts for grades Three through Six, in multiplication as well as addition. There is also evidence that smaller numbers are more frequent than larger numbers in naturally occurring settings (e.g., Dehaene & Mehler, 1991). Thus, smaller arithmetic facts, by virtue of their greater frequency, seem especially likely to become "low difficulty" facts, not only more rapid and accurate under standard experimental conditions, but also less affected by manipulations that compete for attentional resources in working memory. Interestingly, the rapid automatization effects in Logan and Klapp (1991; also Klapp et al., 1991) were all obtained on stimulus sets using addends in the 2 to 5 range. It may in fact be that automaticity will develop quickly only for such small addend problems.

We conclude, consistent with Campbell's (1987, 1991) view, that multiplication knowledge, and possibly all arithmetic knowledge, is especially prone to interference. When a misleading, irrelevant value is used as a prime, significant slowing of performance is obtained. When the prime is related to the target multiplicatively, this seems to generate at least the same degree of slowing, if not more. Regardless of the related prime's multiplicative relationship to the target, this spread of activation appears not to have any systematic facilitation effect; instead, the effect seems to be quite the opposite, generating interference. And
indeed, when a clearcut, helpful prime, the correct answer, is presented, facilitation depends on problem difficulty and, apparently, factors like the proportion of trials primed by values not equal to the correct answers. If retrieval of the basic multiplication facts were highly automatic, positive priming effects should be clearly and easily obtained. On the evidence from this experiment, as well as that of Koshmider and Ashcraft (1991), this is not the nature of the obtained results. Subject to further replication, then, we conclude that multiplication is performed at a level substantially less than fully automatic. In this respect, the evidence is similar to that reported by Zbrodoff and Logan (1986), and supports their notion of "partial autonomy" using a quite different paradigm. And like the addition performance described in the first experiment here, this conclusion further asserts that performance at a less than automatic level implicates the working memory system. Problems of high difficulty, in other words, show little or no facilitation at short SOA levels, and more serious disruption under concurrent task conditions, for the same reason: they require non-trivial levels of controlled processing, i.e., working memory resources, for adequate performance.

Generally, then, it would appear that autonomy, or at least the potential for autonomy, is not fully realized in simple arithmetic processing. This could be due to any number of factors, of course. Campbell and Graham (1985), for instance, suggested that simple arithmetic might be inherently prone to interference, given the large number of associations to be formed from a limited set of basic symbols. Strategic processing, especially on large, more difficult facts, may be more common for adults than is customarily realized, may compete successfully against memory retrieval when the information sought is of low strength in memory (e.g., Ashcraft, 1982; Logan & Klapp, 1991), or may be an underappreciated individual difference factor in adults' arithmetic performance. Regardless of which factor best explains the lack of complete automaticity or autonomy in arithmetic processing, one conclusion is quite clear: Further attempts to resolve these issues must, at a minimum, address the still central puzzle of the area, the problem difficulty effect.

ACKNOWLEDGEMENTS

I express my thanks to Jamie Campbell and Lana Trick for helpful comments on an earlier version of this paper. Correspondence should be addressed to Mark H. Ashcraft, Department of Psychology, Cleveland State University, Cleveland, OH 44115. bitnet: r0599@CSUOHIO
REFERENCES

Allen, P.A., Ashcraft, M.H. & Weber, T.A. (1991). On mental multiplication and age. Unpublished manuscript.

Ashcraft, M.H. (1982). The development of mental arithmetic: A chronometric approach. Developmental Review, 2, 213-236.

Ashcraft, M.H. (1990). Strategic processing in children's mental arithmetic: A review and proposal. In D.F. Bjorklund (Ed.), Children's strategies: Contemporary views of cognitive development (pp. 185-212). Hillsdale, NJ: Erlbaum.

Ashcraft, M.H. (in press). Cognitive arithmetic: A review of data and theory. Cognition: International Journal of Cognitive Psychology.

Ashcraft, M.H. & Battaglia, J. (1978). Cognitive arithmetic: Evidence for retrieval and decision processes in mental addition. Journal of Experimental Psychology: Human Learning and Memory, 4, 527-538.

Ashcraft, M.H. & Christy, K. (1991). Tabulation of problem frequency in elementary texts: Grades 3-6. Unpublished manuscript.

Ashcraft, M.H. & Fierman, B.A. (1982). Mental addition in third, fourth, and sixth graders. Journal of Experimental Child Psychology, 33, 216-234.

Ashcraft, M.H. & Stazyk, E.H. (1981). Mental addition: A test of three verification models. Memory & Cognition, 9, 185-196.

Baddeley, A.D. & Hitch, G. (1974). Working memory. In G.H. Bower (Ed.), The psychology of learning and motivation (Vol. 8, pp. 47-89). New York: Academic Press.

Campbell, J.I.D. (1985). Associative interference in mental computation. Unpublished doctoral dissertation, University of Waterloo.

Campbell, J.I.D. (1987). Production, verification, and priming of multiplication facts. Memory & Cognition, 15, 349-364.

Campbell, J.I.D. (1991). Conditions of error priming in number-fact retrieval. Memory & Cognition, 19, 197-209.

Campbell, J.I.D. & Clark, J.M. (1989). Time course of error priming in number-fact retrieval: Evidence for excitatory and inhibitory mechanisms. Journal of Experimental Psychology: Learning, Memory, and Cognition, 15, 920-929.

Campbell, J.I.D. & Graham, D.J. (1985). Mental multiplication skill: Structure, process, and acquisition. Canadian Journal of Psychology, 39, 338-366.
Cooney, J.B., Swanson, H.L. & Ladd, S.F. (1988). Acquisition of mental multiplication skill: Evidence for the transition between counting and retrieval strategies. Cognition and Instruction, 5, 323-345.

Dehaene, S. & Mehler, J. (1991). Cross-linguistic regularities in the frequency of number words. Manuscript submitted for publication.

Dixon, W.J. (1953). Processing data for outliers. Biometrics, 9, 74-89.

Donley, R.D. (1991). Arithmetic performance as a function of mathematics anxiety: An analysis of simple addition and subtraction problems. Unpublished master's thesis, Cleveland State University, Cleveland, OH.

Geary, D.C. & Wiley, J.G. (1991). Cognitive addition: Strategy choice and speed-of-processing differences in young and elderly adults. Psychology and Aging, 6, 474-483.

Groen, G.J. & Parkman, J.M. (1972). A chronometric analysis of simple addition. Psychological Review, 79, 329-343.

Hamann, M.S. & Ashcraft, M.H. (1986). Textbook presentations of the basic addition facts. Cognition and Instruction, 3, 173-192.

Hitch, G.J. (1978a). Mental arithmetic: Short-term storage and information processing in a cognitive skill. In A.M. Lesgold, J.W. Pellegrino, S.D. Fokkema & R. Glaser (Eds.), Cognitive psychology and instruction (pp. 331-338). New York: Plenum.

Hitch, G.J. (1978b). The role of short-term working memory in mental arithmetic. Cognitive Psychology, 10, 302-323.

Kaye, D.B. (1986). The development of mathematical cognition. Cognitive Development, 1, 157-170.

Klapp, S.T., Boches, C.A., Trabert, M.L. & Logan, G.D. (1991). Automatizing alphabet arithmetic: II. Are there practice effects after automaticity is achieved? Journal of Experimental Psychology: Learning, Memory, and Cognition, 17, 196-209.

Koshmider, J.W. III & Ashcraft, M.H. (1991). The development of children's mental multiplication skills. Journal of Experimental Child Psychology, 51, 53-89.

LeFevre, J.A., Bisanz, J. & Mrkonjic, L. (1988). Cognitive arithmetic: Evidence for obligatory activation of arithmetic facts. Memory & Cognition, 16, 45-53.

Logan, G.D. & Klapp, S.T. (1991). Automatizing alphabet arithmetic: I. Is extended practice necessary to produce automaticity? Journal of Experimental Psychology: Learning, Memory, and Cognition, 17, 179-195.

Parkman, J.M. (1972). Temporal aspects of simple multiplication and comparison. Journal of Experimental Psychology, 95, 437-444.
Posner, M.I. & Snyder, C.R.R. (1975). Facilitation and inhibition in the processing of signals. In P.M.A. Rabbitt & S. Dornic (Eds.), Attention and performance V (pp. 669-682). New York: Academic Press.

Resnick, L.B. & Ford, W.W. (1981). The psychology of mathematics for instruction. Hillsdale, NJ: Erlbaum.

Schneider, W. (1988). Micro Experimental Laboratory: An integrated system for IBM PC compatibles. Behavior Research Methods, Instruments, & Computers, 20, 206-217.

Shiffrin, R.M. & Schneider, W. (1977). Controlled and automatic human information processing: II. Perceptual learning, automatic attending, and a general theory. Psychological Review, 84, 127-190.

Siegler, R.S. (1987a). The perils of averaging data over strategies: An example from children's addition. Journal of Experimental Psychology: General, 116, 250-264.

Siegler, R.S. (1987b). Strategy choices in subtraction. In J. Sloboda & D. Rogers (Eds.), Cognitive processes in mathematics. Oxford: Oxford University Press.

Siegler, R.S. (1988). Individual differences in strategy choice: Good students, not-so-good students, and perfectionists. Child Development, 59, 833-851.

Siegler, R.S. & Jenkins, E.A. (1989). How children discover new strategies. Hillsdale, NJ: Erlbaum.

Stazyk, E.H., Ashcraft, M.H. & Hamann, M.S. (1982). A network approach to simple multiplication. Journal of Experimental Psychology: Learning, Memory, and Cognition, 8, 320-335.

Trick, L. (this volume). Perceptual and attentional mechanisms in counting and subitizing. In J.I.D. Campbell (Ed.), The nature and origins of mathematical skills. Amsterdam: Elsevier.

Widaman, K.F., Geary, D.C., Cormier, P. & Little, T.D. (1989). A componential model for mental addition. Journal of Experimental Psychology: Learning, Memory, and Cognition, 15, 898-919.

Winkelman, J.H. & Schmidt, J. (1974). Associative confusions in mental arithmetic. Journal of Experimental Psychology, 102, 734-736.

Zbrodoff, N.J. & Logan, G.D. (1986). On the autonomy of mental processes: A case study of arithmetic. Journal of Experimental Psychology: General, 115, 118-130.

Zbrodoff, N.J. & Logan, G.D. (1990). On the relation between production and verification tasks in the psychology of simple arithmetic. Journal of Experimental Psychology: Learning, Memory, and Cognition, 16, 83-97.
Chapter 9
REPRESENTATION AND RETRIEVAL OF ARITHMETIC FACTS: A NETWORK-INTERFERENCE MODEL AND SIMULATION
Jamie I. D. Campbell and Michael Oliphant
University of Saskatchewan
Summary
We present a computer model of a network-interference theory of memory for single-digit multiplication and addition facts. According to the model, a presented problem activates representations for a large number of related arithmetic facts, with strength of activation of specific facts determined by similarity to the presented problem. Similarity is assumed to be based on both physical codes (e.g., common visual or phonological features) and visuo-spatial magnitude codes. Nodes representing numerical facts that are related to the presented problem are continuously activated and compete by way of mutual inhibition until one reaches the critical activation threshold and triggers a response. The counteracting processes of excitation and inhibition in the model reproduce a large number of response time and error phenomena observed in skilled memory for number facts. The general form of the representational structures proposed in the simulation provides for a natural extension of the model to other areas of cognitive arithmetic and associated research.

Background and introduction
In this chapter, we describe a computer simulation of the micro-processes underlying retrieval of simple addition and multiplication facts (e.g., 5 + 7 = 12; 3 x 4 = 12). The simulation model represents an initial formalization of the network-interference theory of number-fact retrieval (Campbell, 1987, 1990b, 1991; Campbell & Graham, 1985; Campbell & Clark, 1989; Graham, 1987; Graham & Campbell, in press). Our goals for this preliminary version of the simulation were quite ambitious: To implement a well-specified version of the network-interference
model that reproduces the major response time (RT) and error patterns observed in skilled retrieval of multiplication and addition facts. Despite its popular status as "simple" arithmetic, number-fact retrieval presents a complex array of response time and error phenomena. Viewed as performance criteria, these phenomena provide many useful constraints on plausible retrieval models, and make constructing a precise and detailed model a substantial theoretical challenge.

As a preview of what will be described in detail later, the main assumptions of the current model can be summarized quite simply: Presentation of a simple multiplication or addition problem excites memory representations associated with three response categories: adding, multiplying, and naming (i.e., "reading" the problem's operands). Multiple items are activated within each category and this excitation converges on verbal production mechanisms. We assume in the model that arithmetic problems are represented by magnitude codes (i.e., a region on a visuo-spatial number-line) and by physical codes. Physical codes are visual and phonological records assumed to preserve features of perceptual experiences with problems. We refer to the representational complex corresponding to the combined physical and magnitude codes for a problem as a node. Retrieval in the model consists of a series of processing cycles, with a cycle representing a few tens of ms of processing. On each cycle, each node in the network receives an amount of excitatory input that is determined by its physical and magnitudinal similarity to the presented problem. This excitatory input is modulated on each cycle by local inhibition from all the other nodes within the same response category, and by global inhibition between response categories. A verbal response is produced when one of the nodes in the network reaches a critical threshold level of activation. The nominal input strength to the correct node is generally the same for all problems; thus, differences in retrieval difficulty among problems arise primarily from differences in interference due to the inhibition from other nodes.

As may be clear from the preceding overview, the network-interference model is a theory of number-fact retrieval specifically, and is not a general model of what people "know" about arithmetic. Arithmetic competence also encompasses knowledge of a large corpus of quantitative rules, procedures, and relations, as well as an understanding of the functional or conceptual relations among quantitative procedures. Efficient production of the simple addition and multiplication facts, however, is an important part of arithmetic competence and we assume that educated adults and older children normally can and usually do rely on a retrieval strategy. Thus, within a general model of arithmetic competence, the network-interference model of number-fact retrieval can be viewed as a detailed theory of a component memory skill.
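To make the preview concrete, here is a minimal sketch of the objects involved (Python; all names are ours, and the operand range follows a detail given later in the chapter, with the lower bound of 2 being our assumption):

    # A node bundles the physical and magnitude codes for one stored fact;
    # nodes fall into three competing response categories.
    from dataclasses import dataclass

    @dataclass(eq=False)
    class Node:
        operands: tuple       # e.g., (3, 4)
        operation: str        # "x", "+", or "name"
        answer: int           # e.g., 12
        activation: float = 0.0

    CATEGORIES = ("multiplying", "adding", "naming")

    def build_network(max_operand=12):
        # Naming candidates are built per trial from the presented operands,
        # so that category starts empty here.
        network = {cat: [] for cat in CATEGORIES}
        for a in range(2, max_operand + 1):
            for b in range(2, max_operand + 1):
                network["multiplying"].append(Node((a, b), "x", a * b))
                network["adding"].append(Node((a, b), "+", a + b))
        return network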
As there have been several recent articles extensively comparing and critiquing current models of number-fact retrieval, we will not present a review of the alternative theories here (see Ashcraft, 1991; Baroody, 1991; Cornet, Seron, Deloche & Lories, 1988; McCloskey, Harley & Sokol, 1991; see also McCloskey & Lindemann, this volume). Instead, we will begin by describing the major phenomena of number-fact retrieval that served as the performance criteria for the present version of the model. We then describe the model in detail, report analyses of the model's performance, and compare the model's performance to actual data. Finally, we identify some strengths and weaknesses of the model, and suggest directions for future development.

Performance characteristics of number-fact retrieval
Problem-size effect

Perhaps the most pervasive and extensively researched phenomenon of mental arithmetic is the problem-size effect, which refers to the fact that the difficulty of simple arithmetic combinations generally increases with numerical size. Correlations of approximately +.6 to +.8 are observed between indices of problem size (e.g., the sum or product of problem operands) and RT for both children and skilled adults on simple addition and multiplication problems (e.g., Ashcraft, 1987; Campbell & Graham, 1985; Geary, Widaman & Little, 1986; Miller, Perlmutter & Keating, 1984; Norem & Knight, 1930; Parkman & Groen, 1971). Error rate is positively correlated with RT for a correct response across problems (Campbell & Graham, 1985; Miller et al., 1984), suggesting that the factors that contribute to the relation between problem-size and RT also contribute to parallel effects on accuracy. The problem-size effect in young children appears to be due to the use of counting procedures or other reconstructive strategies (Siegler & Shrager, 1984; Siegler, 1988), and even experienced adults sometimes use such strategies to solve single-digit combinations (Svenson, 1985). Most adults and older children, however, appear to rely primarily on the use of fact-retrieval for single-digit arithmetic problems (e.g., Koshmider & Ashcraft, 1991). Since these individuals also show a substantial problem-size effect, there must be factors that are correlated with problem-size that contribute to retrieval difficulty. Campbell and Graham (1985; see also Graham & Campbell, in press; Hamann & Ashcraft, 1986) proposed that because the larger-number problems are generally introduced later in the learning sequence, cumulative proactive interference potentially makes the larger problems more difficult to memorize. Another possibility is that large-
number problems are more difficult to remember because they are rehearsed less often, and, indeed, larger-number problems do appear less frequently in elementary-school textbooks (Clapp, 1924; Hamann & Ashcraft, 1986; Siegler, 1988). Although differential frequency and proactive interference as explanations of the problem-size effect are consistent with established laboratory memory phenomena, there is no direct evidence that these factors operating in childhood would exert a substantial influence on the adult's performance. An alternative view is that numerical magnitude contributes directly to retrieval difficulty in some way (e.g., Beem, Ippel & Markusses, 1987; Gallistel & Gelman, 1991). This approach suggests a direct connection between the problem-size effect and other magnitude effects in arithmetic-verification and number-comparison tasks. For example, in verification tasks (e.g., 4 + 8 = 11, true or false?), false answers that are close numerically to the true answer are more difficult to reject than numerically distant false answers (e.g., Stazyk, Ashcraft & Hamann, 1982; Zbrodoff & Logan, 1990). In numerical-comparison tasks, time to make judgements of relative numerical magnitude (e.g., choose the larger of a pair of numbers) is sensitive both to the absolute and relative magnitude of the numbers being compared: decision time tends to increase as absolute magnitude increases, but decreases as the distance between the numbers increases (e.g., Banks, Fujii, & Kayra-Stuart, 1976; Dehaene, 1989; Foltz, Poltrock & Potts, 1984). A recurring theme in theoretical accounts of these observations (e.g., Restle, 1970; Dehaene, Dupoux & Mehler, 1990) is that numerical comparisons are mediated by a visuo-spatial representation of magnitude; in effect, a mental number line. To account for nonlinear effects of magnitude, it can be assumed that this psychophysical magnitude scale is compressed for larger numbers so that the magnitudes of larger numbers are less discriminable (e.g., 54 and 56 are closer in psychological magnitude than are 14 and 16). There have been several mathematical models advanced to describe the form of the mapping between numerical and psychophysical magnitude (e.g., Dehaene, 1989; Shepard, Kilpatric & Cunningham, 1975; Welford, 1960), with some form of logarithmic function being the dominant candidate. In the present model we combine the assumption of visuo-spatial magnitude codes representing numbers, as described by Welford's (1960) function, with assumptions about retrieval interference to account for the problem-size effect. Specifically, the problem-size effect arises in the model because larger-number problems are more similar in magnitude to neighboring problems than are smaller problems, and hence, activate the representations of neighboring problems more strongly. This phenomenon, which occurs because the presumed psychophysical magnitude scale is more compressed for larger numbers, causes larger-number
problems to encounter more inhibition from neighboring problems than do smaller-number problems. This increases the time required for the correct node to reach criterion, and makes the larger problems more susceptible to retrieval errors.

"Ties" and other exceptions to the problem-size effect
Although the problem-size effect is a robust phenomenon, and knowing the magnitude of a simple arithmetic problem provides reasonably good prediction of its difficulty, there are groups of problems that deviate from the general problem-size rule. For example, both children's and adults' performance on problems composed of a repeated operand (e.g., 3 x 3, 8 + 8), so-called "ties," shows accuracy and RT advantages relative to non-tie problems of about the same magnitude (e.g., Campbell & Graham, 1985; Miller et al., 1984). One account of the ties advantage is higher repetition frequency relative to non-ties (Siegler, 1988), although the evidence is mixed (cf. Hamann & Ashcraft, 1986). Others have suggested that ties are intrinsically easier, perhaps because being comprised of fewer different elements renders them less susceptible to retrieval interference (Graham & Campbell, in press), or because a single, repeated operand eliminates some encoding processes required for non-ties (e.g., Beem et al., 1987; Gallistel & Gelman, 1991). Consistent with the view that ties are intrinsically easier, Graham and Campbell (in press) found that with practice frequency controlled, performance on tie "alphaplication" problems (arithmetic-like memory items composed of letters rather than numbers) was better than on non-ties. They also found evidence suggesting that tie and non-tie problems form categorically distinct subclusters of items: The majority of errors on tie problems involved correct answers to other tie problems (52%), whereas tie answers were infrequent error responses to non-tie problems (4%). Similarly, the multiplication error matrix presented by Campbell and Graham (1985) shows the answer to 9 x 9 as the most common error to 8 x 8, along with other examples of inter-tie confusions, whereas ties answers such as 25, 49, 64, and 81 were infrequent error responses to non-ties. The tendency to confuse answers within but not between these subsets of problems suggests that ties and non-ties are partially dissociated in memory and thereby constitute distinct categories of problems. If it is assumed that activation will tend to be stronger within than across category boundaries, then ties may be easier because they produce weak activation of non-ties and thereby encounter less interference. Furthermore, ties are relatively distinct from each other (i.e., none of the ties share a common operand) and would not produce strong inter-tie interference.
The possibility of subclusters of problems that are partially insulated from extracluster interference could also explain why certain other subsets of arithmetic items are easier than their numerical size would predict. In simple multiplication, the five-times problems deviate from problem-size predictions (Aiken & Williams, 1973), and the majority of errors on five-times problems are other multiples of five, whereas five-times products are infrequent errors to non-five-times problems (see Campbell & Graham, 1985). Similarly, in simple addition, combinations that sum to 10 are fast and accurate relative to their magnitude (Aiken & Williams, 1973; Krueger & Hallford, 1984), and we noted in a recent experiment that 10 was a relatively low-frequency error response in single-digit addition. Thus, the five-times problems and problems summing to 10 may form conceptual subclusters of problems like that suggested previously for ties. Five-times problems may form a conceptual subcluster because people learn that multiples of five always have a 0 or 5 in the units position, whereas non-fives multiples never do. The sum-to-ten problems may be distinct because of the functional importance of 10 in counting and in multi-column addition and subtraction. The ease of retrieval observed for these three subsets of problems may be explained by assuming that each constitutes a subcluster of items that is partially isolated from interference by extra-cluster items. In the simulation, these three subcategories are defined, and their relative isolation produced, by providing stronger excitatory connections among intra-category items and relatively weaker excitatory activation of inter-category items. Because these subclusters contain relatively few items, the problems within the clusters tend to receive less inhibitory input compared to the average problem. The effect is to make retrieval faster and more accurate for ties, fives-multiples, and sum-to-ten problems, while at the same time making other items within each subcategory the strongest competitors for retrieval.
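A sketch of how such subclustering might be realized (the cluster definitions follow the text; the two weighting constants are placeholders, not the simulation's fitted values):

    # Conceptual subclusters: ties, five-times problems, and pairs summing
    # to ten. Excitatory input between two problems is scaled up when they
    # share a subcluster and scaled down across a cluster boundary.
    def subclusters(operands, operation):
        a, b = operands
        found = set()
        if a == b:
            found.add("tie")
        if operation == "x" and 5 in (a, b):
            found.add("five-times")
        if operation == "+" and a + b == 10:
            found.add("sum-to-ten")
        return found

    def cluster_scale(problem, other, within=1.5, across=0.5):
        c1 = subclusters(problem[0], problem[1])
        c2 = subclusters(other[0], other[1])
        if c1 & c2:
            return within   # stronger intra-cluster excitation
        if c1 or c2:
            return across   # weaker excitation across the boundary
        return 1.0          # no subcluster involved: unscaled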
Causes and characteristics of errors

It is a common assumption that the specific errors observed in mental arithmetic reveal aspects of the memory representations and processes that underlie retrieval (e.g., Campbell & Graham, 1985; Siegler & Shrager, 1984; McCloskey, Caramazza & Basili, 1985). Thus, an important goal for the model was to have the basic processes responsible for variation in problem difficulty also produce the patterns of specific errors observed. In the model, problems vary in difficulty because of how similar they are to other problems in terms of magnitude and shared physical or perceptual features. Specific retrieval errors of multiplication and addition also
appear to reflect the influence of these factors; that is, errors are often the correct answers to neighboring problems or to other problems that share features with the presented problem (e.g., share a common operand; have the same operands but a different operation sign; e.g., Campbell & Graham, 1985; Miller et al., 1984). While most retrieval errors involving simple multiplication and addition appear to be due to associative confusions, several experiments have demonstrated that errors also are influenced by the events on preceding trials, so-called error-priming effects (Campbell, 1987; 1990a; 1990b; 1991; Campbell & Clark, 1989). Specifically, the correct answer from the preceding trial appears to be inhibited and is an unlikely error response on the current trial (negative error priming), whereas answers generated previously over a range of about three to ten trials are promoted as errors (positive error priming). Overall, about 10% to 20% more errors match previous correct products than would be expected by chance, and positive error priming has a measurable range of about 60 s to 90 s. Error priming in mental arithmetic, as with a variety of other implicit memory phenomena (e.g., Blaxton, 1989; Graf & Ryan, 1990; Schacter & Graf, 1989), is apparently mediated by processes whose influence depends on reinstantiation of encoding conditions. Campbell (1990a) found that positive error priming (i.e., the tendency for answers to previous problems to appear as errors on later problems) in both simple addition and multiplication was stronger between problems presented in the same format (digits or number words) than between problems presented in different formats. In recognition of these and other phenomena (see Clark & Campbell, 1991), we adopted the so-called encoding-complex view (Campbell, 1990a; this volume; Campbell & Clark, 1988; this volume; Clark & Campbell, 1991) in the model. The encoding-complex position assumes that number-fact retrieval is mediated by modality-specific verbal and visuo-spatial codes, rather than only by abstract, modality-independent codes (e.g., Sokol, Goodman-Schulman & McCloskey, 1989). Thus, in addition to visuo-spatial magnitude codes that account for the problem-size effect and other influences of relative magnitude, we assumed that representation of number-facts is also based on physical codes; visual and phonological codes that preserve perceptual characteristics of stimulus encoding. As is explained below, physical codes are assumed to be the representational locus of error-priming effects in the model, and also are the source of other subtle error phenomena that are sensitive to the physical arrangement of stimuli.
The model in detail

Overview of the model's architecture
In the current version of the network-interference model, presentation of a problem excites three categories of responding: adding, multiplying, and naming. Other numerical categories of responding, such as subtraction, division, magnitude judgements, and so on, are not currently considered in the model. The inclusion of adding, multiplying, and naming was motivated primarily by errors observed in simple multiplication and addition. Specifically, cross-operational confusions (e.g., 2 + 3 = 6, 2 x 3 = 5) are relatively common errors, at least when both operations are tested in the same session (Campbell, 1990a; Miller et al., 1984), and errors sometimes correspond to the name of one or both of the problem's operands (e.g., 4 x 8 = 48, 2 + 5 = 5; Campbell, 1990a). Based on these observations, we assume that responses associated with these three categories of responding are activated and that successful performance of the intended operation requires inhibition of the alternative response categories. As the model is designed to simulate the arithmetic production task (as opposed to a true-false verification task, for example), we assume that the activation of representations within the three categories converges on articulatory mechanisms that support verbal production of number names.

On each trial, the model proceeds through a series of cycles, with a cycle representing a small unit of time (≈50 ms). On each cycle, addition, multiplication, and naming nodes receive an amount of excitatory input that is determined by physical and magnitudinal similarity to the presented problem. The excitatory input to an item representation is modulated by inhibitory input from all the other items in the same response category, and by global inhibition between response categories. A verbal response is generated when one of the activated representations reaches a critical threshold level of activation. Across cycles, the strengths of the excitatory and inhibitory inputs gradually come into equal opposition, and the system settles into a state of dynamic equilibrium. If the system reaches equilibrium before any node exceeds the critical level of activation, the model either generates the answer that is currently most strongly activated or halts without responding.

Assumptions about specific representations
Two main types of representation are proposed in the model: We assume that numbers are mapped on to a magnitude-scale that is based on visuo-spatial
representations (i.e., a mental number-line). These visuo-spatial magnitude codes provide one medium for representation and retrieval (cf. Dehaene & Cohen, in press). Physical codes, which theoretically correspond to modality-specific (i.e., visual and phonological) representations, constitute another medium of representation. The physical codes for problems are conceptualized as associative units comprised of the pair of operands, the operation symbol, and the correct answer (e.g., the visual sequence "4 x 8 = 32" and its phonological counterpart are assumed to be distinct, but closely associated, physical code structures). If a presented problem matches any component feature of the structure, the entire structure is activated. The strength of activation depends both on the strength of association among the components (i.e., the degree of "unitization"), and on a feature-matching process (described below) that computes the degree of similarity to the presented problem. We assume in the model that people have memorized problem-answer pairs involving combinations of operands up to 12 for both addition and multiplication. When a problem is presented, all the addition and multiplication facts in the network are activated to some degree. For the naming category of responding, we defined the set of candidates to include four possible naming responses based on the pair of operands presented. If the presented problem was 4 x 8 (or 4 + 8), for example, the responses "four", "eight", "forty", and "forty eight" received activation.

Activation of physical codes

Theoretically, physical codes correspond to visual and phonological representations for digits and number words. In this version of the model we did not attempt to model physical codes explicitly in terms of specific visual and phonological features. Instead, the function of physical codes was implemented using a generic scheme in which the operands, operation symbol, and answer were treated as an ordered string of characters (e.g., 6 9; x; 54), with each character representing a feature. If we assume that the visual codes for digits and number words and their corresponding phonological representations are strongly interassociated (cf. Clark & Campbell, 1991), modelling their correlated activities in terms of a single code should provide a reasonable approximation. We employed a simple feature-matching algorithm to compute the similarity between the physical codes for two problems. The algorithm assumes that the pair of operands, answer, and operation symbol are distinct subunits within the physical code representations, and that matching strength is determined by feature overlap both within and between the operand and answer subunits. An examination of the
matrix of multiplication errors presented by Campbell and Graham (1985) suggested three factors that influence the extent to which a shared numeral feature contributes to confusability (i.e., similarity) of problems: 1) A common feature within the same subunit (i.e., a common operand or a common numeral in the correct answer) produces a similarity response stronger than a common feature shared between subunits. 2) A common feature in the same relative position within a subunit yields a similarity increment stronger than when a common feature is out of position within a subunit (e.g., 56 is more similar to 54 than to 45). 3) Because the operands are presented while the answer is not, the influence of matches with the operands is assumed to be stronger than the influence of matches with features of the associated answer subunit. As an approximation based on the Campbell and Graham error matrix, and for simplicity, we decided that "stronger" would correspond to a 2 to 1 ratio in each case.
Table 1. Physical code feature matching values.

                          Matched Feature of Node
Feature of Problem    Left Op.   Right Op.   Ans. Decade   Ans. Unit
Left Op.                .5         .25          .25           .125
Right Op.               .25        .5           .125          .25
Ans. Decade             .0625      .03125       .125          .0625
Ans. Unit               .03125     .0625        .0625         .125

Note. Op = Operand. Ans = Answer. The physical code similarity of two problems was computed as the sum of the values for matched features, plus an increment for matching the operation symbol (a value of 1.75 in the current implementation).
Table 1 presents the actual physical-code feature-matching values that were adopted for the current version of the model, which are presented here to help make the role of physical codes more concrete. The table rows correspond to numeral components of the presented problem (i.e., left operand, right operand, decade of the answer, units of the answer), and the columns correspond to the same components in a to-be-matched item. The table entries show the contribution to similarity for a match between any two components. The similarity of the physical codes for two problems is computed as the sum of the weights for
matched numeral components, plus an increment (1.75) for a match on operation symbol. In the model, it is matching the physical code for the operation symbol that causes either the family of addition facts or the family of multiplication facts to be relatively more activated, and ultimately determines whether the model adds or multiplies. The value of 1.75 was determined by trial and error over successive tests of the model.
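Read as code, the matching scheme amounts to the following sketch (the weight matrix transcribes our reconstruction of Table 1; the treatment of single-digit answers, which receive a decade of 0 here, is a simplification the chapter does not spell out):

    # Physical-code similarity: rows index features of the presented problem,
    # columns index features of the stored node, in the order left operand,
    # right operand, answer decade, answer unit (values from Table 1).
    WEIGHTS = [
        [0.5,     0.25,    0.25,    0.125],
        [0.25,    0.5,     0.125,   0.25],
        [0.0625,  0.03125, 0.125,   0.0625],
        [0.03125, 0.0625,  0.0625,  0.125],
    ]
    OPERATION_MATCH = 1.75  # increment for a matching operation symbol

    def digit_features(problem):
        (a, b), op, answer = problem
        return [a, b, answer // 10, answer % 10], op

    def physical_similarity(problem, node):
        p_feats, p_op = digit_features(problem)
        n_feats, n_op = digit_features(node)
        sim = sum(WEIGHTS[i][j]
                  for i, pf in enumerate(p_feats)
                  for j, nf in enumerate(n_feats)
                  if pf == nf)
        return sim + (OPERATION_MATCH if p_op == n_op else 0.0)

    # e.g., physical_similarity(((6, 9), "x", 54), ((6, 7), "x", 42)) credits
    # the shared left operand (.5), the problem's units digit 4 matching the
    # node's answer decade (.0625), and the shared operation (1.75).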
Activation of magnitude codes

We assume that the physical codes for numbers are associated closely with visuo-spatial magnitude representations (cf. Paivio, 1986). When a problem is presented, its physical code representations activate a corresponding magnitude code, which we think of as a region along an imaginal number-line. Theoretically, the magnitude region associated with a specific physical code activates other physical codes with magnitude representations that overlap this region. Thus, via the associated magnitude codes, physical codes activate other physical codes that are numerical neighbors. To model this process we adopted a magnitude similarity metric based on Welford's function (Welford, 1960). The Welford function represents the psychophysical spacing of number magnitudes, and has been used to predict response times for judgements of relative magnitude. Welford's original function, in which RT is proportional to LOG[larger/(larger-smaller)], defines a scale that is compressed as number magnitude increases. This implies that the magnitudes of larger numbers are more similar (less discriminable) than the magnitudes of smaller numbers. We will refer to the expression LOG[larger/(larger-smaller)] as the Welford value. To compute the magnitude similarity between any two given problems we computed a Welford value between the operands (pairing smaller to smaller and larger to larger between problems) and between the corresponding answers, and summed these three values to obtain an overall magnitude-similarity weight. Because the Welford function is undefined for numbers of equal magnitude (i.e., the calculation would require division by zero), we determined by trial and error a magnitude similarity value for identical numbers (identical magnitudes = 3) that was functionally appropriate in the context of the model.
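As a sketch (the log base is not stated in the chapter, so base 10 is an assumption here; the value 3 for identical magnitudes is from the text):

    import math

    IDENTICAL_MAGNITUDE = 3.0  # value used when the two numbers are equal

    def welford(m, n):
        # Welford value: LOG[larger / (larger - smaller)]
        if m == n:
            return IDENTICAL_MAGNITUDE
        larger, smaller = max(m, n), min(m, n)
        return math.log10(larger / (larger - smaller))

    def magnitude_similarity(problem, node):
        # pair smaller operand with smaller, larger with larger, plus answers
        p_small, p_large = sorted(problem[0])
        n_small, n_large = sorted(node[0])
        return (welford(p_small, n_small) + welford(p_large, n_large)
                + welford(problem[2], node[2]))

    # Compression at the large end: welford(54, 56) ~ 1.45 exceeds
    # welford(14, 16) ~ 0.90, so larger neighbors are less discriminable.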
Excitation, inhibition, and residual effects

We conceptualize a node in the model as a representational complex corresponding to the combined physical and magnitude codes. Thus, for computational simplicity, we sum the physical and magnitude similarity values to
obtain a single, overall similarity value that defines the total excitatory input that a node receives on each cycle. Computationally, therefore, a node is an abstraction in the model; but, theoretically, we reserve the view that it is specific physical and magnitude codes that are the objects of retrieval, rather than abstract, modality-independent codes. On each cycle, the excitatory input is modulated by inhibitory and residual factors that determine the change in activation for each node on successive cycles. Specifically, the algorithm for the activation (A) of a given node on cycle (c) is:
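A plausible rendering of this equation, reconstructed from the parameter definitions that follow (the additive treatment of the residual term R, and the multiplicative application of the two inhibitory weights, are our assumptions rather than the printed form):

    A(c) = [A(c-1) + E + R] x IG x IL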
E is the constant excitatory input to a node defined by the sum of the physical and magnitude similarity values. The R parameter, which may be thought of as the "reactivation potential" of a node, implements residual effects of previous retrievals and is the source of error priming effects in the model. Campbell and Clark (1989) modelled the negative and positive components of error priming in terms of the sum of two exponential functions corresponding to counteracting inhibitory and excitatory factors (see Figure 1), and these functions were adapted to define the time course of R in the model. The excitatory component of R represents the assumption that retrieval of a node increases the unitization of its components (i.e., the strength of association among the operands, operation symbol, and answer), producing a temporary increase in the accessibility of the entire unit given activation of one of its components. Thus, when one of the operands in the current problem matches an operand in a previous problem, the activation of the previous problem is increased in proportion to the residual unitization. This excitatory residual effect, which decays exponentially, increases the probability that a previous problem's correct answer appears as an error (i.e., positive error priming). Beyond positive error priming, degree of unitization (i.e., the excitatory component of R) potentially provides the model with a locus for effects of frequency and recency on retrieval, although these effects are not currently implemented. For the tests of the model described later, it was assumed that unitization at the beginning of the block of trials was the same for all nodes. Theoretically, the inhibitory component of R represents a temporary reduction in the capacity for a unit to be reactivated by one of its components immediately after it has been retrieved (cf. MacKay, 1987). This inhibitory effect briefly masks the excitatory influence of unitization and produces negative error priming.
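As a sketch of this time course (the exponential coefficients printed in Figure 1 are illegible in this copy; the values below are placeholders chosen only to reproduce the described shape, i.e., net inhibition at lag 1 and a positive residual over roughly lags three to ten):

    import math

    # Reactivation potential R(lag): sum of a slowly decaying excitatory
    # component (residual unitization) and a fast-decaying, initially
    # stronger inhibitory component, after Campbell and Clark (1989).
    # All four coefficients are placeholders, not the model's values.
    def reactivation_potential(lag):
        excitation = 2.0 * math.exp(-0.08 * lag)
        inhibition = -10.0 * math.exp(-0.7 * lag)
        return excitation + inhibition

    # lag 1 -> about -3.1 (negative error priming);
    # lag 5 -> about +1.0 (positive error priming), decaying toward zero.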
Figure 1. Reactivation potential (in arbitrary units) of previously retrieved nodes as a function of lag in trials. [The plot shows the excitatory component, the inhibitory component, and their net effect across lags of 1 to 13 trials.]

IG implements the assumption of global inhibition between response categories (see Bjork, 1989, for general discussion of inhibitory memory mechanisms). IG takes on values ranging from 0 to 1, with low values representing more inhibition. The strength of category inhibition on a particular cycle is determined by the proportion of total activation (summing across the three categories) that is associated with the set of nodes within each category. The activation of each node within a category is multiplied by the category's inhibitory weight on each cycle. Thus, as the total strength of the representations within one category increases relative to the other categories, there is increased global inhibition of all the representations associated with the other response categories. Taking the nth root of the proportional weights (i.e., IG), in effect, controls the "efficiency" of global inhibition. Low values of n result in one category quickly dominating, whereas higher values result in the three categories maintaining a more equal balance, and increase the probability that a response from an incorrect category intrudes.

IL represents local inhibition among the representations within categories. IL is proportional to the current activation of a node relative to the current total activation of all the nodes within the same category. Thus, strength of local inhibition decreases for a node whose relative activation is increasing, whereas local inhibition increases for a node whose relative activation is decreasing.
Node activation levels are relatively similar initially, but, because of the IL parameter, the activation levels of nodes with higher E values rapidly "pull away" from nodes with lower similarity. As with category inhibition, the efficiency of this discrimination process is controlled by taking the nth root of the raw proportional weights. In effect, the larger the root, the poorer the discrimination as a function of differences in E.

As an illustration of this discrimination process, Figure 2 presents a series of surface plots showing the pattern of activation produced by 3 x 6 across the set of nodes corresponding to 2 x 2 through 9 x 9. On the arbitrary activation scale used in Figure 2, the top panel shows a nearly uniform pattern of activation after one retrieval cycle. After five cycles, some definite structure appears, with excitation concentrated around the correct node and around the node corresponding to the commuted problem (6 x 3 = 18). By cycle ten, there are prominent "ridges" that extend from the two points on the surface where 3 and 6 intersect, and which correspond to the families of problems in the three-times and six-times tables. These nodes are most strongly activated because components of the corresponding problems match the input problem both on the physical and magnitude dimensions. The strength of activation generally decreases with distance from each of the problem's operands, an effect resulting from use of Welford's (1960) function to represent activation due to similarity in magnitude. From cycle 10 to 19, when the correct node reaches the retrieval criterion, the rate of change of activation in the network slows rapidly as the excitatory and inhibitory inputs settle into equilibrium and the activation of each node approaches an asymptotic level. The initial differences in excitation across nodes in the network, as illustrated in the top panel of Figure 2, are small relative to the activation required for retrieval. Nonetheless, these small differences in the pattern of input weights across nodes ultimately are responsible for all the systematic performance differences across problems produced by the model.

Production mechanisms

The simulation of verbal-number production in the model is functionally equivalent to the syntactic-frame model proposed by McCloskey, Sokol, and Goodman (1986): Answers that are production candidates simultaneously feed their activation (equivalent to the node's current activation) to a common output "frame" comprised of hundreds-place, tens-place and ones-place "slots". In the McCloskey et al. theory, the frame receives abstract number representations (i.e., modality- and format-independent codes), and provides access to the appropriate
Production mechanisms

The simulation of verbal-number production in the model is functionally equivalent to the syntactic-frame model proposed by McCloskey, Sokol, and Goodman (1986): Answers that are production candidates simultaneously feed their activation (equivalent to the node's current activation) to a common output "frame" comprised of hundreds-place, tens-place and ones-place "slots". In the McCloskey et al. theory, the frame receives abstract number representations (i.e., modality- and format-independent codes), provides access to the appropriate number words, and determines their order of production. In the current model, the most important theoretical implication of adopting the frame notion is that verbal-articulatory mechanisms represent a convergent stage of processing that the node structures potentially influence. We do not assume, as do McCloskey et al. (see also Sokol et al., 1989), that input to the frame necessarily involves only abstract number codes. Instead, for example, input may be conceptualized in terms of modality-specific physical codes (e.g., visual codes for digits; Campbell & Clark, 1988; Clark & Campbell, 1991). Furthermore, it does not necessarily follow from adopting the notion of an output frame that lexical representations are involved only at production in number-fact retrieval. Our concept of physical-code representations assumes that a problem-answer unit can be stored and retrieved as a string of phonological (i.e., lexical) codes. The retrieved phonological representation of an answer may directly activate associated articulatory mechanisms (i.e., gain access to the verbal output frame), although this process presumably would not involve the syntactic mechanisms normally required to convert a digit sequence into a number-word sequence. As the current implementation does not model phonology explicitly, however, it is convenient to think of the input to production mechanisms in terms of digits.

We assume that there is a minimum threshold activation required for a node to influence verbal-production mechanisms and that, generally, there are multiple candidates that compete for output. In the current implementation, the answers associated with all nodes that are activated to within 5% of the most strongly activated node are candidates for production. On each cycle there is a single potential candidate for each slot, determined by summing the activation across candidates. For example, suppose the problem "6 x 8" has been presented and the current candidate set contains "twenty four", "twenty eight", and "forty eight". If the combined activation associated with both occurrences of "twenty" is greater than that for the single occurrence of "forty", then "twenty" will occupy the tens slot on that cycle. Similarly, if the summed activation for both occurrences of "eight" exceeds the activation of "four", then "eight" will occupy the ones position. If an overt response were triggered on this cycle, "twenty eight" would be stated. On a later cycle, the activation of "forty" may have increased sufficiently to be stronger than the combined weights for "twenty", at which point the dominant response would be "forty eight".
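A minimal sketch of this slot competition appears below. It treats answers as two-digit integers and takes the 5% candidate window from the text; everything else (the function name, the voting dictionaries) is an illustrative assumption, and the null-slot and teens conventions described next are omitted for brevity.

```python
def frame_response(candidates):
    """Pick tens and ones 'slot' winners from competing answer candidates.

    candidates: dict mapping answer (int, 0-99 here) -> activation.
    Only answers within 5% of the strongest candidate compete; each
    candidate contributes its activation to the digit holding its tens
    and ones positions, and the strongest digit wins each slot.
    """
    if not candidates:
        return None
    cutoff = 0.95 * max(candidates.values())
    tens_votes, ones_votes = {}, {}
    for answer, act in candidates.items():
        if act < cutoff:
            continue
        tens, ones = divmod(answer, 10)
        tens_votes[tens] = tens_votes.get(tens, 0.0) + act
        ones_votes[ones] = ones_votes.get(ones, 0.0) + act
    tens = max(tens_votes, key=tens_votes.get)
    ones = max(ones_votes, key=ones_votes.get)
    return tens * 10 + ones  # a tens digit of 0 means no tens word is spoken

# For 6 x 8 with candidates 24, 28, and 48, "twenty" can outweigh "forty"
# because the two twenty-something candidates are jointly stronger:
print(frame_response({24: 0.50, 28: 0.49, 48: 0.51}))  # -> 28
```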
When the maximally activated code for a slot is "0", this corresponds, theoretically, to a null value signifying the absence of an articulated number word. This situation arises for single-digit answers (i.e., no articulated tens word) and answers that are multiples of ten (i.e., no articulated ones word). If the maximally activated code in the tens slot is "1", this signifies that the current answer is in the teens, and the maximally activated code in the units slot determines which teens word is accessed. Although only "genuine" answers to problems (and the names of the problem operands) are potential sources of activation at production, the convergence of candidate activation at output can sometimes result in miscellaneous responses that are mixtures of lexical components of different answers.

Criteria for termination of processing

There are two conditions that terminate the processing of a problem: The model produces a stated response when any node reaches a specified threshold level of activation. When this occurs, the answer currently occupying the output frame is generated. If the response criterion is set appropriately high, and there is no significant residual activation from preceding trials, the response will always be the correct answer. If the criterion is reduced, or because of noise or priming by residual activation, an incorrect response can occupy the output frame when the criterion is reached. A second halting condition occurs when the model enters a state of equilibrium before any answer has reached criterion. If there is no significant change (±0.1%) in the total activation in the system over several cycles, and no node has reached criterion, the model halts and either produces no overt response or is set to generate the current response in the output frame. The model occasionally reaches equilibrium before criterion when residual priming from previous trials produces a pattern of activation in which mutual inhibition prevents any one node from dominating the network.
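The two halting conditions translate directly into a simple control loop. The sketch below is schematic: the step callback, the five-cycle settling window, and the cycle cap are assumptions; only the criterion test and the ±0.1% equilibrium test come from the text.

```python
def run_trial(step, criterion, tol=0.001, window=5, max_cycles=500):
    """Run retrieval cycles until a node reaches criterion or the network
    settles. step() -> (activations dict, current frame response).
    Returns (response, cycles, reached_criterion)."""
    totals = []
    for cycle in range(1, max_cycles + 1):
        activations, frame_response = step()
        if max(activations.values()) >= criterion:
            return frame_response, cycle, True    # stated response
        totals.append(sum(activations.values()))
        # Equilibrium: total activation changing by no more than ~0.1%
        # over the last `window` cycles, with no node at criterion; the
        # model may then emit the current frame response or nothing.
        if len(totals) >= window:
            recent = totals[-window:]
            if max(recent) - min(recent) <= tol * recent[-1]:
                return frame_response, cycle, False
    return None, max_cycles, False
```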
Tests of the model's performance

To evaluate the model, we compared the performance of 80 simulated subjects to the results produced by 64 university students. Stimuli were the 64 addition and 64 multiplication combinations composed from the numbers 2 to 9. For the students, problems were presented visually in horizontal orientation, either in digit format or in number-word format, but we will report results only for the digit format (see Campbell, 1990a, for a discussion of the format effects in these data). Students were instructed to state the correct answer to each problem as quickly and accurately as possible and received two blocks of multiplication problems and two blocks of addition problems. Operation (addition vs. multiplication) alternated across blocks, and the order of problems in each block was pseudo-random (see Campbell, 1990a, for additional details about the procedure).
Parameter set-up of the model
For the present test, performance across the 80 simulated subjects varied only as a result of randomized problem order within blocks, which produces variability in residual effects of previous trials, and from introducing random noise in the input to each node on each cycle. In the current implementation, the input strength was allowed to vary randomly by ±10% on each cycle. This relatively high noise value was combined with a relatively low criterion for retrieval (which was the same for multiplication and addition) in order to obtain a large number of errors from the model. Finally, while the model represents combinations of operands up to 12 + 12 and 12 x 12, nodes for problems with 1 or 10 as an operand were not implemented on the grounds that these problems probably are solved by rule-based strategies.
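These two sources of between-subject variability amount to only a few lines of code. The sketch below is illustrative; the function names and the use of Python's random module are assumptions.

```python
import random

def noisy_input(base_input, noise=0.10, rng=random):
    """Jitter each node's input strength by up to +/-10% on each cycle."""
    return {node: strength * (1.0 + rng.uniform(-noise, noise))
            for node, strength in base_input.items()}

def block_order(problems, rng=random):
    """Pseudo-randomize problem order within a block, which varies the
    residual (priming) effects of previous trials across subjects."""
    order = list(problems)
    rng.shuffle(order)
    return order
```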
Results and discussion

In the students' data, multiplication was slightly more difficult than addition, both with respect to RT (982 ms vs. 912 ms) and errors (7.6% vs. 5.8%). The model required an average of 21.3 cycles to produce a correct answer to a multiplication problem, compared with 20.7 cycles for addition. Thus, addition was 7.1% faster than multiplication in the students' data, whereas addition was 2.8% faster in this test of the model. In contrast to the students, the model made more errors of addition (32.5%) than of multiplication (21.3%). This occurs primarily because the answers to neighboring addition problems are, on average, more similar than the answers to neighboring multiplication problems (e.g., 4 + 6 = 10 is more similar to 5 + 6 = 11 than 4 x 6 = 24 is to 5 x 6 = 30). As a consequence, nodes neighboring the correct node in addition are more likely to gain access to the production frame and be generated as errors.

Variation in problem difficulty
Figure 3 shows mean RT for correct multiplication (left panel) and addition trials (right panel) as a function of operand size, with each point representing the average RT for all problems that contain the specified operand (e.g., the mean time for 2 x 9 contributed both to the two-times and to the nine-times points). As Figure 3 shows, both the students and the model produced the problem-size effect, with RTs generally increasing with operand size. In the present run of the model, we assumed that all problem nodes were unitized to the same degree prior to testing (i.e., that there were no significant performance effects due to pre-test
differences in problem frequency or recency); consequently, the observed problem-size effect is due entirely to interference (i.e., to inhibition of the correct node by competing nodes). Interference increases with problem size because larger problems are, in theory, more psychologically similar in magnitude to neighboring problems than small-number problems are to their neighbors. As a result, the correct nodes for the larger problems generally encounter more inhibition from neighboring problems.
Figure 3. Mean time (students' RT in ms; model's cycles) until a correct retrieval as a function of operand size (left panel: multiplication, by times table; right panel: addition, by sum table).
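The magnitude-similarity claim here rests on Welford's (1960) function. The chapter does not spell out the exact form used in the simulation, so the sketch below uses the standard form of that function, log(L/(L − S)) for a larger magnitude L and smaller magnitude S, simply to show why similarity grows with problem size; treating it as a similarity score is an illustrative assumption.

```python
import math

def welford_similarity(a, b):
    """Magnitude similarity of two numbers via Welford's (1960) function,
    log(L / (L - S)): for a fixed difference the value grows with overall
    magnitude, so large problems sit 'closer' to their neighbors."""
    if a == b:
        return math.inf   # identical magnitudes; handled separately
    larger, smaller = max(a, b), min(a, b)
    return math.log(larger / (larger - smaller))

print(round(welford_similarity(2, 3), 3))  # 1.099
print(round(welford_similarity(8, 9), 3))  # 2.197 -> more mutual interference
```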
The model also reproduced the deviation from linearity associated with five-times problems (albeit somewhat exaggerated), as well as the tendency for the slope or range of the problem-size effect to be larger for multiplication than addition. The steeper problem-size slope for multiplication than addition in the model results, at least in part, from a rather complex effect due to the global-inhibition factor. As explained above, addition problems are more similar to one another, on average, than are multiplication problems, and therefore activate each other more strongly. As a result, a given addition problem tends to generate more total within-operation activation than does the corresponding multiplication problem. Because the strength of global inhibition is directly related to total within-operation activation, multiplication nodes generally receive more inhibition from addition nodes than vice versa. Thus, the model produces slightly longer RTs for multiplication because of an asymmetry in inter-operation interference that favors addition. The magnitude similarity of neighboring answers increases more
as a function of problem size for addition problems than for the corresponding multiplication problems, whose answer magnitudes are distributed over a much larger range. Consequently, the asymmetry in inter-operation interference is larger for larger-number problems, which produces the interaction between operation and problem size.

Table 2 presents correlations that demonstrate several important similarities between the model's and the students' data. For the correlations, each of the 64 addition and multiplication problems contributed 1) a mean RT for correct trials, 2) a rate for within-operation errors (the rate of errors excluding naming and cross-operation errors), 3) a rate for cross-operation errors, and 4) an answer-error rate.¹ Answer-error rate refers to the frequency with which a problem's correct answer appeared as an incorrect answer to other problems within addition or multiplication. Table 2 shows correlations among the four dependent measures, their correlations with common problem-size indices (maximum operand, product, and sum), their correlations with ties vs. non-ties (coded 1 and 0 respectively in the regression), and their correlations with "fives" vs. "non-fives" (problems with a correct answer that is a multiple of five were coded as 1, all others were coded as 0).

For both multiplication and addition, both the students and model produced strong, positive correlations between correct RT and within-operation error rate. In the model, this relation reflects the central role of inhibition and associated retrieval interference: The more strongly a problem activates competing problem representations, the more inhibition the correct node receives, delaying retrieval, and the higher the probability that residual activation or noise will cause an incorrect node to exceed the retrieval threshold first. The positive relation between answer-error rate and retrieval difficulty (i.e., correct RT and error rate) in the model is an artifact of the influence of the magnitude codes. Specifically, the hypothesized magnitude codes produce the problem-size effect (i.e., larger problems are more error prone) and also tend to produce errors that are correct answers to neighboring problems (see below). As a result of these combined influences, the answers to the larger, more difficult problems also tend to be the most common errors. Campbell and Graham (1985; see also Graham & Campbell, in press; Norem & Knight, 1930) suggested that the positive correlation between answer-error rate and problem difficulty could indicate that competing associations with a problem's correct answer contribute to retrieval interference.
¹ The overall rates of the various error types are presented in Table 3 and are discussed later.
Table 2. Correlations with Measures of Retrieval Performance.

[The printed table reports, separately for multiplication and addition and for the students and the model, the intercorrelations of the four dependent measures (RT, WOE, XOE, AE) and the correlation of each measure with maximum operand, product, sum, ties vs. non-ties, and fives vs. non-fives; the individual correlation values are not legible in this copy.]

Note. RT = retrieval time (or cycles) for a correct trial. WOE = within-operation error (e.g., 4 x 7 = 21). XOE = cross-operation error (4 x 7 = 11). AE = answer-error rate. Ties = problems with a repeated operand (e.g., 8 + 8). Fives = problems that result in a multiple of 5 (e.g., 2 + 8, 6 x 5). For df = 62, r > .25 is significant at p < .05.
The model shows that it is not necessary to posit this additional interference factor to account for the effect in simple arithmetic, although such an account remains plausible (e.g., Graham & Campbell, in press).

As Table 2 shows, the model also reproduces the pattern of correlations among the dependent variables and the three problem-size indices (i.e., maximum operand, correct product, correct sum), although the correlations tend to be higher in the model: Mean RT, within-operation errors, and answer-error rates all tend to increase with problem size, with the maximum operand generally being a slightly better predictor than product or sum in both the model's and the students' data.² In contrast, as indicated by negative correlations, cross-operation errors tended to decrease with problem size for both multiplication and addition. Furthermore, cross-operation errors were more likely for ties than non-ties in multiplication (r = .35 and .31 for the students and model respectively), although this relation was observed only in the model for addition. The tendency for cross-operation errors to occur with ties and small-number problems arises in the model because these problems produce relatively weak activation of incorrect nodes. Thus, although the relative activation of the correct node is high (which makes ties and small-number problems relatively easy), the total within-operation activation generated is relatively low. Because the inter-operation inhibition a problem generates is proportional to the total strength of within-operation activation it produces, ties and small-number problems are less effective at inhibiting cross-operation interference.

The negative trend in the correlations between ties/non-ties and both RT and within-operation error rate corresponds to the standard finding that tie problems are somewhat easier than non-ties. The negative correlations between ties vs. non-ties and answer-error rate reflect the tendency for the correct answers to tie problems to be infrequent error responses. The corresponding correlations for fives vs. non-fives reveal a pattern similar to that for the ties in both multiplication and addition: Problems whose answers are a multiple of five tend to be easier and their answers infrequent error responses, although the model did not reproduce this latter feature of the students' addition data. As described previously, these effects arise in the model from treating ties and fives-multiples as distinct subcategories of problems (e.g., ties activate the more numerous non-ties relatively weakly and thereby encounter less interference, on average, relative to non-ties).
² When tie problems are excluded, or problems involving 0 or 1 are included, the product tends to be the better predictor (cf. Miller, Perlmutter, & Keating, 1984; Geary, Widaman, & Little, 1986).
It is worth noting that the model produces a ties advantage even without assumptions about subcategories of problems. One reason is that, because the two operands in a tie are the same, the matching of physical codes for ties results in the correct node yielding a higher average similarity value relative to non-ties. A second reason is that the model treats the two orders of non-ties as distinct with respect to their physical codes. As a result, the other member of a non-tie, which receives relatively strong activation, is a substantial source of interference. The model also produces an advantage for some of the five-times problems that is not due to the assumption that fives-multiples are a distinct subcategory of problems. Specifically, five-times problems with products that end in five (e.g., 3 x 5 = 15, 5 x 5 = 25, 7 x 5 = 35, 9 x 5 = 45, and their commuted counterparts) benefit from a physical-code match between the operand five and the five in the correct product.

Corresponding multiplication and addition problems (i.e., pairing 2 x 2 with 2 + 2, 2 x 3 with 2 + 3, etc.) were similar in difficulty in both the students' and model's data. RTs across the 64 corresponding multiplication and addition problems were correlated .65 in the model and .66 in the students' data. For within-operation errors, the correlation was .71 in the model and .47 in the experimental data. Commuted pairs (e.g., comparing 2 x 9 with 9 x 2) also were very similar in difficulty. In the students' data, mean RTs for the 28 commuted pairs (classified as small operand on left vs. small operand on right) were correlated .84 for multiplication and .79 for addition. The corresponding values for the model were .82 and .76. The students' error rates for commuted pairs were correlated .93 and .94 for multiplication and addition respectively. In the model's data, the corresponding correlations were .78 and .92.

Analyses of specific errors

The preceding analyses show that the model reproduced many subtle features observed in the pattern of correct RTs and error rates for both multiplication and addition. The specific errors produced by the model also represent important criteria for assessing the model's performance. Table 3 presents the percentages of multiplication and addition errors falling into several mutually exclusive categories. An error was classified as a cross-operation error if the response was the correct answer to the other operation (i.e., addition or multiplication). A naming error occurred if the error response comprised one or both of the problem's operands (2 + 8 = 8 or 2 x 8 = 28). For multiplication, errors not classified as cross-operation or naming errors were classified as table-related if the error was a correct answer to another single-digit multiplication problem in the same times table (4 x 8 = 36). A table-unrelated error was a correct answer to a single-digit multiplication problem in a different times table (4 x 8 = 42).
For addition, a non-cross-operation and non-naming error was table-related if it was a correct answer to another single-digit addition problem. Miscellaneous errors did not fall into any of the preceding categories (7 x 6 = 46; 7 + 9 = 19).
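These mutually exclusive categories can be stated precisely as a small classifier. The sketch below assumes the 2-9 operand range used in this test; the function name and the category strings are illustrative.

```python
def classify_error(a, b, response, op):
    """Classify an erroneous response to `a op b` (op is '*' or '+')
    into the mutually exclusive categories used in Table 3."""
    rng = range(2, 10)
    add_facts = {x + y for x in rng for y in rng}
    if response == (a + b if op == '*' else a * b):
        return 'cross-operation'           # e.g., 4 x 7 = 11
    if response in (a, b) or response == 10 * a + b:
        return 'naming'                    # e.g., 2 x 8 = 28
    if op == '*':
        if any(response == a * k or response == b * k for k in rng):
            return 'table-related'         # same times table, e.g., 4 x 8 = 36
        if response in {x * y for x in rng for y in rng}:
            return 'table-unrelated'       # e.g., 4 x 8 = 42
    elif response in add_facts:
        return 'table-related'             # another single-digit sum
    return 'miscellaneous'                 # e.g., 7 x 6 = 46

print(classify_error(4, 8, 36, '*'))  # table-related
print(classify_error(7, 9, 19, '+'))  # miscellaneous
```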
Table 3. Error Types as a Percentage of Errors

                     Multiplication           Addition
                    Students    Model     Students    Model
Cross-operation        6.4       10.0        23.1       0.9
Naming                 2.4        0.2         2.8       0.2
Table-related         72.5       76.6        74.1      92.3
Table-unrelated       12.1        8.9          --        --
Miscellaneous          7.7        4.6         0.5       7.2
Note. See text for explanations of error types.

For multiplication, the students and model produced similar percentages of the five error categories. Table-related errors accounted for the majority of errors (about 75%), and there was substantial overlap in the specific table-related errors produced by the students and model; that is, most table-related errors were the correct answer to the problem obtained by changing one operand by ±1 (see discussion of the "neighbor" effect below). Similarly, the specific table-unrelated errors were often the same in the students' and model's data. For example, 6 x 9 or 9 x 6 = 56 and 7 x 8 or 8 x 7 = 54 accounted for 47% and 38% of all the table-unrelated errors produced by the students and the model respectively. The answers to 6 x 9 and 7 x 8 are confused in the model because the problems are of similar magnitude and share a common feature in their correct answers. The model also produced some of the same miscellaneous errors as the students. For example, the errors 6 x 7 = 43, 8 x 6 = 46, 7 x 8 = 46, and 9 x 7 = 62 appeared among both the students' and the model's miscellaneous errors. In the model, these errors reflect the mixing of lexical components from at least two answers (e.g., "sixty two" in response to 9 x 7 includes "sixty" from the correct answer 63 and the "two" from 72, the correct answer to the neighboring problem 9 x 8).

For addition, the model produced instances of each category, and tabled errors were the dominant category in both the students' and model's errors. However, the model underestimated the observed rate of cross-operation errors in addition, relative to the rate observed for multiplication.
This discrepancy arises under the simplifying, but arbitrary, assumption that the efficiency of inter-operation inhibition is the same for multiplication and addition (see Miller & Paredes, 1990, for evidence that learning multiplication has a disruptive influence on memory for addition facts). Nonetheless, the specific cross-operation errors produced by the model also were, in many instances, the same ones observed in the experimental data: In the students' data, 66% of cross-operation errors occurred on the 20% of problems that include ties and small-number problems with both operands less than or equal to 4. Similarly, among the cross-operation errors produced by the model, 46% were accounted for by this small subset of problems.

The errors produced by the students and the model were also similar in other ways. The students' errors often involved the answer to a neighboring problem, that is, the correct answer if one operand was changed by ±1: Neighbors accounted for 46% and 54% of the students' multiplication and addition errors respectively. The percentages were higher in the errors generated from the model, with neighbors accounting for 74% of multiplication errors and 87% of addition errors. The tendency for errors to be answers to neighboring problems occurs in the model because neighboring problems often share physical features (i.e., a common operand or a common decade value in the correct answer) and are similar with respect to magnitude; consequently, neighbors often are the most strongly activated incorrect problems.

Operand-intrusion errors
Another important feature of errors produced by the model is associated with effects of operand order. These effects are important because they constitute evidence that number-fact retrieval is sensitive to perceptual or physical characteristics of encoding (i.e., the proposed function of the physical codes in the present model). Although commuted pairs are very similar with respect to RTs and error rates (see above), there are substantial effects of operand order on the frequencies of specific errors, particularly in simple multiplication (Campbell, 1990a; Campbell & Graham, 1985). Specifically, it is common for one of the problem's operands to appear in the error response (e.g., 4 x 8 = 28), and this effect is stronger when the left-right position of the operand in the problem maps on to the left-right position of the same number in an incorrect answer. For example, in both the students' and the model's data the error 9 x 6 = 56 was more likely than 6 x 9 = 56 (12 vs. 6 and 12 vs. 5 instances from the students and model respectively). Conversely, the error 6 x 9 = 63 was more likely than 9 x 6 = 63 (11 vs. 1 and 10 vs. 2 occurrences, respectively, in the students' and model's errors).
For multiplication, the rates of these operand-intrusion errors (i.e., non-naming errors in which one of the problem's operands appeared in the error response) were similar in the students' (37.1%) and model's (33.0%) data. The sensitivity of intrusions to operand order demonstrates that they are not simply chance events: Operand position was preserved (i.e., the left operand appeared in the decade of the error or the right operand appeared in the unit position in the error) in 67.9% and 65.5% of the students' and model's multiplication intrusion errors respectively. Relative to the expected proportion of .5, the proportions of intrusions with position preserved were significantly greater than expected (both z's > 3; ties and error responses in which the decade and unit are the same were not counted). Both the students and the model produced a lower rate of intrusions (6.1% and 13.3% respectively) for addition than multiplication. The lower rate for addition occurs largely because there are fewer different sums than products (e.g., there are 15 sums and 31 products obtained from the 64 combinations of the numbers 2 through 9); consequently there are fewer ways for operands to match components of addition answers. As in the multiplication data, operand position was preserved in a high proportion of addition intrusion errors (76.2% for the students, 66.7% for the model).

Operand-intrusion effects arise in the model from the assumption that characteristics of the surface structure of problems are preserved in the memory representation (i.e., via the proposed physical codes), and from the assumption that the relative position of internal features shared between two representations affects computed similarity (see Table 1): A higher matching response is obtained when the relative position of a feature within a representation is preserved in the matched representation. Thus, commuted pairs, which differ in the model only by virtue of the reversed positions of operands within the physical codes, yield very similar performance overall, but differentially activate other specific nodes and thereby produce different frequencies of specific errors.
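The position-preservation principle can be illustrated with a toy scoring rule. The 1.0/0.5 weights below are purely illustrative stand-ins for the matching values of the chapter's Table 1 (not reproduced here); only the idea that a position-preserving match scores higher comes from the text.

```python
def physical_match(problem, answer):
    """Score feature matches between a problem's operands and an answer,
    crediting position-preserving matches extra: the left operand
    matching the decade digit, or the right operand matching the units
    digit, scores 1.0; a match in the wrong position scores only 0.5."""
    left, right = problem
    decade, units = divmod(answer, 10)
    score = 0.0
    score += 1.0 if left == decade else (0.5 if left == units else 0.0)
    score += 1.0 if right == units else (0.5 if right == decade else 0.0)
    return score

# 9 x 6 = 56 preserves the right operand's position (6 in the units),
# while 6 x 9 = 56 does not, so the intrusion is favored for 9 x 6:
print(physical_match((9, 6), 56))  # 1.0
print(physical_match((6, 9), 56))  # 0.5
```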
Error priming

Error priming is the phenomenon whereby retrieval errors on simple arithmetic problems are influenced by the events on preceding trials. Campbell (1990a; see also Campbell & Clark, 1989) showed for both simple multiplication and addition that, relative to chance probabilities, the answer given one trial back has a low probability of matching an error response (negative error priming), whereas answers given 3 to 10 trials back show an increased probability of matching an error (positive error priming). Figure 4 depicts the error-priming effects produced by the model on the current run for both multiplication and addition. Specifically,
the figure shows the proportion of times that errors matched a previous correct answer as a function of lag in trials.³ The model produced the typical error-priming function (cf. Campbell, 1991; Campbell & Clark, 1989), with the error-matching rate low for the immediately preceding trial, and then increasing out to a range of approximately five trials before returning to baseline. The function for addition lies above that for multiplication because there are more opportunities to match addition errors relative to multiplication errors (i.e., several sums are correct answers to multiple problems within a trial block and therefore an addition error may be matched at several different lags within a block, whereas, in contrast, most multiplication products are correct for only one problem within a block).
Figure 4. Proportion of the model's errors matching a previous correct answer as a function of lag in trials.
³ The comparable error-priming analysis of the students' data is not reported in Figure 4 because number format (digits vs. words) alternated across trials in their experiment. Positive error priming was stronger within than between formats (see Campbell, 1990a) and produced an irregular pattern of error priming across lags. Overall, however, there was a statistically robust positive error-priming effect for both formats and operations.
Error-priming effects are associated with the physical codes in the model. The visual and phonological physical codes are assumed to be comprised of associated components (i.e., the operand pair, operation sign, and answer) that are activated and retrieved as a unit. The capacity for any component to activate the entire unit increases with the strength of association among the components, and this unitization is assumed to increase whenever the components are brought together under attention (e.g., when a correct retrieval takes place). Thus, unitization represents a type of learning in the model, and positive error priming is a side-effect of learning. Specifically, the capacity for the current problem to activate nodes corresponding to previously retrieved problems is increased in proportion to the increase in unitization due to the previous retrieval. As explained previously, this residual, excitatory effect of unitization decays exponentially. The negative priming at a lag of one is assumed to reflect an inhibitory or fatigue effect that reduces the capacity for a component to immediately reactivate a node (e.g., MacKay, 1987). This inhibitory effect also decays exponentially and, combined with the exponential decay of the excitatory effect, gives rise to the curvilinear error-priming function shown in Figure 4.
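Two exponentially decaying traces of this kind combine into the non-monotonic lag function of Figures 1 and 4. The sketch below reproduces that shape; all four parameter values are invented for illustration and are not the simulation's settings.

```python
import math

def reactivation_potential(lag, excit=0.5, inhib=1.5,
                           excit_decay=0.3, inhib_decay=1.2):
    """Net residual effect of a retrieval `lag` trials back: a slowly
    decaying excitatory trace from unitization minus a stronger but
    faster-decaying inhibitory (fatigue) trace."""
    return (excit * math.exp(-excit_decay * lag)
            - inhib * math.exp(-inhib_decay * lag))

for lag in range(1, 11):
    print(lag, round(reactivation_potential(lag), 3))
# Negative at lag 1 (about -0.08: inhibition dominates), peaking near
# lag 3 (about +0.16: residual excitation), then decaying toward zero.
```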
Concluding comments

The network-interference model in its current form captures a large number of prominent and subtle features of memory for simple multiplication and addition facts. Although the complexity of the model undoubtedly would allow us to fit the data more closely, we wanted to avoid the trap of endlessly adjusting parameters simply to improve the fit. The model produces a reasonable imitation of the students' performance under a wide variety of parameter settings. Furthermore, adjustments, for example, in the efficiency of inhibition or in the relative contribution of physical and magnitude codes, represent means for modeling individual differences or effects of manipulating presentation format. Viewed this way, there is no fixed, ideal value for many of the parameters in the model.

The version of the model we have described represents a "snapshot" taken of work in progress. We explored several different architectures before reaching the architecture we report here, and undoubtedly there will be further changes to the structure of the model. Indeed, the most recent version that we are studying does away entirely with global inhibition. Although the global-inhibition factor contributed to reproducing several important RT and error phenomena, as explained above, these benefits probably are not worth the cost in complexity that this factor introduces. The global-inhibition factor interacts in complex (and sometimes rather paradoxical) ways with manipulations of other factors, and makes
predicting and interpreting the model's behavior very difficult (e.g., consider the explanation given above for how global inhibition contributed to producing the steeper problem-size slope for multiplication). In contrast, when the global-inhibition factor is eliminated, there is a simple expression that accurately describes the activation of individual nodes. Specifically, a node's activation function across cycles (c) is governed by the relation:

A_c = E · R · (1 − I^c) / (1 − I)

E and R correspond, respectively, to the excitatory input and reactivation potential factors described previously. I is a general inhibition factor that is computed as the proportion of total activation associated with a given node. Our preliminary tests indicate that this simpler model will (given a small number of additional assumptions) be able to reproduce all of the phenomena generated by the present, more complex version, and it is very likely that global inhibition will not be a factor in future versions of the model.
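A few lines suffice to evaluate this expression and confirm the asymptotic behavior it implies: for 0 < I < 1, activation rises toward E·R/(1 − I), consistent with the settling behavior described earlier. The parameter values in the snippet are illustrative only.

```python
def node_activation(E, R, I, c):
    """Closed-form activation after c cycles: A_c = E*R*(1 - I**c)/(1 - I).
    With 0 < I < 1 this rises toward the asymptote E*R/(1 - I)."""
    return E * R * (1 - I ** c) / (1 - I)

E, R, I = 0.8, 1.0, 0.6
print([round(node_activation(E, R, I, c), 3) for c in (1, 5, 10, 20)])
# [0.8, 1.844, 1.988, 2.0]  -> approaches E*R/(1 - I) = 2.0
```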
Despite these anticipated modifications, the current version of the model does demonstrate that the network-interference approach can be given a formal implementation that accounts for the major phenomena of number-fact retrieval. Furthermore, despite the relative complexity of details of implementation, the model is built upon a foundation of a small number of basic assumptions that provide the major phenomena; specifically, that there are physical and magnitude codes that are excited as a function of similarity and that compete for retrieval by way of mutual inhibition. These are the core assumptions of the model that account for the problem-size effect, the major features of errors, and other basic features of performance (e.g., speed-accuracy trade-offs; the positive relation between speed and accuracy across items; selection of response categories and items within categories). The core mechanisms are not tied intrinsically to the domain of cognitive arithmetic, and thus the architectural and processing assumptions of the model are potentially relevant to other areas of cognitive research.

There are a number of directions in which to expand the scope of the model. We have already had some success simulating effects of number format (digits vs. words; e.g., Campbell, 1990a) and a variety of priming effects (e.g., Campbell, 1991) within the current architecture. Other directions will require more substantial modifications to the model. One important goal is to generalize the model to handle verification-task phenomena (e.g., Stazyk et al., 1982; Koshmider & Ashcraft, 1991; Zbrodoff & Logan, 1990), and another is to extend the model to simulate retrieval for all four basic arithmetic operations. The concept of unitized problem representations (i.e., a retrieval structure comprised of associated subunits representing the operands, operation symbol, and answer) directly provides for extending the simulation to model memory for simple division and subtraction. Under the assumption that the inverse operations (i.e., multiplication vs. division and addition vs. subtraction) rely heavily on common representational structures, the current simulation could easily be made to divide or subtract. For example, a component of the representation that functions as a multiplier in the context of multiplication (e.g., the 6 in 6 x 8 = 48) could be the quotient in the context of division (i.e., 48 ÷ 8 = 6). From this standpoint, the difference between retrieval of multiplication and division facts (or addition and subtraction) is substantially a matter of selecting which components of the retrieval structure control production processes.

Despite the generally good performance of the model, it is important to emphasize that we consider the model to be in a preliminary phase of development, with several implementation features to which we do not feel strongly committed. For example, the decisions we made about how to model the calculation of similarity for physical codes reflected an entirely "common sense" approach guided by characteristics of the specific errors that this factor was intended to promote. Similarly, Welford's (1960) function as a model of the role of magnitude codes was chosen primarily for its simplicity and familiarity, but represents only one of many possible approaches to the role of magnitude in the simulation. While we acknowledge the extent of arbitrariness as a legitimate criticism of the current version of the model, at the micro-level of representation and process that we were attempting to simulate, there often was no clear empirical or theoretically principled basis for design decisions. We made it our primary initial objective, therefore, to provide an in-principle demonstration that an architecture such as we have proposed can work reasonably well. Furthermore, the assumptions of the current model are, of course, open to direct empirical evaluation, and we believe that the model can be useful in helping to identify core questions for future research. Ultimately, the scientific value of the model will be determined by the extent to which it stimulates interesting research into cognitive arithmetic and the elementary mechanisms of mind that it reveals.
ACKNOWLEDGEMENTS

We thank Mark Ashcraft, Grethe Lindemann, Mike McCloskey, Paul Meagher, and Val Thompson for thoughtful comments on the model and the chapter. Thanks also to Jim Clark for many discussions on issues of representation in mental arithmetic, and to Jeff Graham for bringing to our attention the potential
usefulness of Welford's (1960) function in the model. This research was supported by Natural Sciences and Engineering Research Council of Canada grant OPG0001980 to Jamie Campbell. A demonstration version of the simulation for IBM machines running under MS-DOS is available from the first author (please include a floppy diskette). Correspondence to Jamie Campbell ([email protected]), Department of Psychology, University of Saskatchewan, Saskatoon, Canada, S7N 0W0.
REFERENCES

Aiken, L.R., & Williams, E.N. (1973). Response times in adding and multiplying single-digit numbers. Perceptual and Motor Skills, 37, 3-13.
Ashcraft, M.H. (1991). Cognitive arithmetic: A review of theory and data. Manuscript submitted for publication.
Ashcraft, M.H. (1987). Children's knowledge of simple arithmetic: A developmental model and simulation. In C.J. Brainerd, R. Kail, & J. Bisanz (Eds.), Formal models in developmental psychology (pp. 302-338). New York: Springer-Verlag.
Banks, W.P., Fujii, M., & Kayra-Stuart, F. (1976). Semantic congruity effects in comparative judgments of magnitudes of digits. Journal of Experimental Psychology: Human Perception and Performance, 2, 435-447.
Baroody, A.J. (1991). An evaluation of evidence supporting fact-retrieval models. Learning and Individual Differences.
Beem, A.L., Ippel, M.J., & Markusses, M.F. (1987). A structuring principle for the memory representation of simple arithmetic facts. Paper presented at the Second European Conference for Research on Learning and Instruction, Tübingen, F.R.G.
Bjork, R.A. (1989). Retrieval inhibition as an adaptive mechanism in human memory. In H.L. Roediger, III, & F.I.M. Craik (Eds.), Varieties of memory and consciousness (pp. 309-330). Hillsdale, NJ: Erlbaum.
Blaxton, T.A. (1989). Investigating dissociations among memory measures: Support for a transfer-appropriate processing framework. Journal of Experimental Psychology: Learning, Memory, and Cognition, 15, 657-668.
Campbell, J.I.D. (1991). Conditions of error priming in number-fact retrieval. Memory and Cognition, 19, 197-209.
Campbell, J.I.D. (1990a). Error priming in cognitive arithmetic: Effects of number format. Poster presented at the meetings of the Psychonomic Society, New Orleans.
Campbell, J.I.D. (1990b). Retrieval inhibition and interference in cognitive arithmetic. Canadian Journal of Psychology, 44, 445-464.
Campbell, J.I.D. (1987). Network interference and mental multiplication. Journal of Experimental Psychology: Learning, Memory, & Cognition, 13, 109-123.
Campbell, J.I.D., & Clark, J.M. (1989). Time course of error-priming in number-fact retrieval: Evidence for excitatory and inhibitory mechanisms. Journal of Experimental Psychology: Learning, Memory, and Cognition, 15(5), 920-929.
Campbell, J.I.D., & Clark, J.M. (1988). An encoding-complex view of cognitive number processing: Comment on McCloskey, Sokol, & Goodman (1986). Journal of Experimental Psychology: General, 117, 204-214.
Campbell, J.I.D., & Graham, D.J. (1985). Mental multiplication skill: Structure, process and acquisition. Canadian Journal of Psychology, 39, 338-366.
Clapp, F.L. (1924). The number combinations: Their relative difficulty and the frequency of their appearance in textbooks. University of Wisconsin Bureau of Educational Research, Bulletin No. 2.
Clark, J.M., & Campbell, J.I.D. (1991). Integrated versus modular theories of number skills and acalculia. Brain & Cognition, 17, 204-239.
Cornet, J., Seron, X., Deloche, G., & Lories, G. (1988). Cognitive models of simple mental arithmetic: A critical review. European Bulletin of Cognitive Psychology, 8, 551-571.
Dehaene, S. (1989). The psychophysics of numerical comparison: A reexamination of apparently incompatible data. Perception & Psychophysics, 45, 557-566.
Dehaene, S., & Cohen, L. (in press). Two mental calculation systems: A case study of severe acalculia with preserved approximation. Neuropsychologia.
Dehaene, S., Dupoux, E., & Mehler, J. (1990). Is numerical comparison digital? Analogical and symbolic effects in two-digit number comparison. Journal of Experimental Psychology: Human Perception and Performance, 16, 626-641.
Foltz, G.S., Poltrock, S.E., & Potts, G.R. (1984). Mental comparisons of size and magnitude: Size congruity effects. Journal of Experimental Psychology: Learning, Memory, and Cognition, 10, 442-453.
Gallistel, C.R., & Gelman, R. (1991). Preverbal and verbal counting and computation. Manuscript submitted for publication.
Geary, D.C., Widaman, K.F., & Little, T.D. (1986). Cognitive addition and multiplication: Evidence for a single memory network. Memory and Cognition, 14(6), 478-487.
Graf, P., & Ryan, L. (1990). Transfer-appropriate processing for implicit and explicit memory. Journal of Experimental Psychology: Learning, Memory, and Cognition, 16, 978-992.
Graham, D.J. (1987). An associative retrieval model of arithmetic memory: How children learn to multiply. In J. Sloboda and D. Rogers (Eds.), Cognitive processes in mathematics (pp. 123-141). Oxford, England: Oxford University Press.
Graham, D.J., & Campbell, J.I.D. (in press). Network interference and number-fact retrieval: Evidence from children's alphaplication. Canadian Journal of Psychology.
Hamann, M.S., & Ashcraft, M.H. (1986). Textbook presentations of the basic addition facts. Cognition and Instruction, 3, 173-192.
Koshmider, J.W., & Ashcraft, M.H. (1991). The development of children's mental multiplication skills. Journal of Experimental Child Psychology, 51, 53-89.
Krueger, L.E., & Hallford, E.W. (1984). Why 2 + 2 = 5 looks so wrong: On the odd-even rule in sum verification. Memory and Cognition, 12(2), 171-180.
MacKay, D.G. (1987). Self-inhibition and the disruptive effects of internal and external feedback in skilled behaviour. In H. Heuer & C. Fromm (Eds.), Generation and modulation of action patterns. Berlin: Springer-Verlag.
McCloskey, M., Caramazza, A., & Basili, A. (1985). Cognitive mechanisms in number processing and calculation: Evidence from dyscalculia. Brain and Cognition, 4, 171-196.
McCloskey, M., Harley, W., & Sokol, S.M. (1991). Models of arithmetic fact retrieval: An evaluation in light of findings from normal and brain-damaged subjects. Journal of Experimental Psychology: Learning, Memory and Cognition, 17, 377-397.
McCloskey, M., Sokol, S.M., & Goodman, R.A. (1986). Cognitive processes in verbal-number production: Inferences from the performance of brain-damaged subjects. Journal of Experimental Psychology: General, 115, 307-330.
Miller, K., & Paredes, D.R. (1990). Starting to add worse: Effects of learning to multiply on children's addition. Cognition, 37, 213-242.
Miller, K., Perlmutter, M., & Keating, D. (1984). Cognitive arithmetic: Comparison of operations. Journal of Experimental Psychology: Learning, Memory, and Cognition, 10(1), 46-60.
Norem, G.M., & Knight, F.B. (1930). The learning of the one hundred multiplication combinations. National Society for the Study of Education: Report on the Society's Committee on Arithmetic, 15, Yearbook 29, 551-567.
364
J.I.D. Campbell & M. Oliphanl
Paivio, A. (1986). Mental representations: A dual-coding approach. New York: Oxford University Press.
Parkman, J.M., & Groen, G.J. (1971). Temporal aspects of simple addition and comparison. Journal of Experimental Psychology, 89, 335-342.
Restle, F. (1970). Speed of adding and comparing numbers. Journal of Experimental Psychology, 83, 274-278.
Schacter, D.L., & Graf, P. (1989). Modality specificity of implicit memory for new associations. Journal of Experimental Psychology: Learning, Memory and Cognition, 15, 3-12.
Shepard, R.N., Kilpatric, D.W., & Cunningham, J.P. (1975). The internal representation of numbers. Cognitive Psychology, 7, 82-138.
Siegler, R.S. (1988). Strategy choice procedures and the development of multiplication skill. Journal of Experimental Psychology: General, 117, 258-275.
Siegler, R.S., & Shrager, J. (1984). Strategy choices in addition and subtraction: How do children know what to do? In C. Sophian (Ed.), Origins of cognitive skills (pp. 229-294). Hillsdale, NJ: Erlbaum.
Sokol, S.M., Goodman-Schulman, R., & McCloskey, M. (1989). In defense of a modular architecture for the number processing system: Reply to Campbell & Clark. Journal of Experimental Psychology: General, 118, 105-110.
Stazyk, E.H., Ashcraft, M.H., & Hamann, M.S. (1982). A network approach to simple multiplication. Journal of Experimental Psychology: Learning, Memory, and Cognition, 8, 320-335.
Svenson, O. (1985). Memory retrieval of answers of simple additions as reflected in response latencies. Acta Psychologica, 59, 285-304.
Welford, A.T. (1960). The measurement of sensory-motor performance: Survey and reappraisal of twelve years' progress. Ergonomics, 3, 189-230.
Zbrodoff, N.J., & Logan, G.D. (1990). On the relation between production and verification tasks in the psychology of simple arithmetic. Journal of Experimental Psychology: Learning, Memory, & Cognition, 16, 83-97.
Chapter 10

MATHNET: PRELIMINARY RESULTS FROM A DISTRIBUTED MODEL OF ARITHMETIC FACT RETRIEVAL
Michael McCloskey and A. Margrethe Lindemann
Johns Hopkins University
Summary
In this chapter we explore modeling of arithmetic fact retrieval within a connectionist or "neural net" framework. We first sketch briefly some of the major empirical phenomena emerging from research on arithmetic facts, and review previous attempts to develop distributed connectionist models of arithmetic fact retrieval. We then describe a new model, MATHNET, and present preliminary findings concerning the model's ability to simulate several phenomena of human fact retrieval. We conclude by discussing limitations of our initial modeling and directions to be pursued in subsequent work.

Background and introduction

Over the past two decades, considerable empirical and theoretical effort has been directed toward elucidating the cognitive mechanisms underlying retrieval of arithmetic "table" facts such as 4 x 8 = 32 (e.g., Ashcraft & Battaglia, 1978; Campbell & Graham, 1985; Groen & Parkman, 1972; McCloskey, Aliminosa, & Sokol, in press; McCloskey, Harley, & Sokol, 1991; Miller, Perlmutter, & Keating, 1984; Parkman & Groen, 1971; Siegler & Shrager, 1984; Siegler, 1988; Sokol, McCloskey, Cohen, & Aliminosa, 1991; Stazyk, Ashcraft, & Hamann, 1982). This interest in arithmetic fact retrieval is motivated by at least two considerations. First, the ability to retrieve basic arithmetic facts is central to most arithmetic problem-solving. Second, the arithmetic facts represent a well-defined and circumscribed set of facts that is learned by virtually every educated adult. This naturally-occurring but constrained set of facts offers the opportunity to explore
issues concerning representation and retrieval of stored information at levels of detail not readily attainable in the study of more complex knowledge domains. Advances in understanding of arithmetic fact retrieval may thus offer insights into memory representation and retrieval in general.

Empirical phenomena

A model of arithmetic fact retrieval must ultimately address a wide range of phenomena, including both everyday observations about arithmetic performance and findings from studies of normal and impaired arithmetic fact retrieval. In this discussion we focus on the basic phenomena toward which our modeling efforts have thus far been directed. Although we consider these results to be of central importance for models of arithmetic fact retrieval, we by no means intend to suggest that these are the only findings of interest.
Ability to learn

Learning of arithmetic facts is by no means trivially easy (see, e.g., Ashcraft, 1982). However, people are clearly capable of attaining a high level of proficiency with these facts. Thus, under non-speeded and otherwise non-stressed testing conditions, normally-educated adults show excellent performance on basic arithmetic table facts (e.g., McCloskey, Aliminosa, & Macaruso, in press). Capturing this fundamental, and obvious, phenomenon may be considered a basic requirement for any model of arithmetic fact retrieval.

Errors in retrieval of well-learned facts
Although educated adults typically know the arithmetic facts well, they nevertheless make occasional errors in retrieval of these facts, especially in tasks requiring speeded responses. For example, Campbell and Graham (1985) found an error rate of 7.7% in adults’ production of answers to single-digit multiplication problems. In considering the implications of this phenomenon it is important to distinguish between errors in retrieval of well-learned facts (e.g., responding 64 to 8 x 7 even though one knows full well that 8 x 7 = 56), and errors reflecting lack of the relevant knowledge (e.g., responding incorrectly to 8 x 7 because one does not know the correct answer). Presumably, most of adults’ arithmetic fact retrieval errors are of the former type. Thus, a model of arithmetic fact retrieval must be capable of explaining how a subject could respond incorrectly to a problem on, say, 1 of 10 presentations, while responding correctly on the remaining trials.
Error types
Errors in arithmetic fact retrieval are not random, but rather are highly systematic (e.g., Campbell & Graham, 1985; Miller et al., 1984). In multiplication most errors are operand errors, in which the erroneous answer is correct for a problem that shares an operand with the stimulus problem (e.g., 6 x 8 = 42, in which the answer is correct for 6 x 7). Operand errors show an interesting pattern, which Sokol et al. (1991) labelled the operand distance effect. Specifically, the erroneous responses are usually correct for a problem that not only shares an operand with the stimulus problem, but is also close in magnitude with respect to the other operand (Campbell & Graham, 1985; Miller et al., 1984). For example, in the error 5 x 8 = 35, the response is correct for 5 x 7, which shares the first operand (5) with the stimulus problem, and differs by one on the non-shared operand (8 vs. 7).

Somewhat less frequent than operand errors are table errors, operation errors, and non-table errors. In table errors the incorrect response is the answer to a single-digit multiplication problem that does not share an operand with the stimulus problem (e.g., 7 x 8 = 54). In operation errors the response is correct for a problem involving the same operands as the stimulus problem, but a different arithmetic operation (e.g., 9 x 8 = 17). Finally, non-table errors (e.g., 4 x 9 = 38) are erroneous responses not falling into the operand, table, or operation error categories.

Variation in performance across problems

Studies involving speeded responses to single-digit addition and multiplication problems have found systematic differences among problems in reaction time and error rate. The most well-established finding (e.g., Ashcraft & Battaglia, 1978; Campbell & Graham, 1985; Miller et al., 1984; Parkman, 1972; Parkman & Groen, 1971; Stazyk et al., 1982) is that problems with large operands (e.g., 8 x 7) in general show longer reaction times and higher error rates than problems with smaller operands (e.g., 2 x 3). This finding is referred to as the problem-size effect.
Effects of brain damage

Recent studies have reported selective deficits of arithmetic fact retrieval in brain-damaged patients with intact comprehension and production of numerals (e.g., McCloskey, Caramazza, & Basili, 1985; McCloskey, Aliminosa, & Sokol, in
press; Sokol et al., 1991; Warrington, 1982). Several interesting phenomena have emerged from these studies.

Non-uniform impairment across problems. Even within a particular arithmetic operation (e.g., multiplication) impairment is typically not uniform across problems. That is, performance is usually found to be much worse for some problems than for others (Sokol et al., 1991; McCloskey, Aliminosa, & Sokol, in press). This non-uniformity of impairment is illustrated in Table 1, which presents the error rate for patient IE (McCloskey, Aliminosa, & Sokol, in press) on each multiplication problem in the range 2 x 2 through 9 x 9. Thus, for example, IE's error rate was 100% for 7 x 8, but only 27% for 9 x 8, and 7% for 9 x 9.

Table 1. Percentage of errors for patient IE on 2-9's multiplication problems.
                          Second Operand
First Operand     2     3     4     5     6     7     8     9
      2          13     0     0     0    13     7    13     0
      3           7    20     7     0    40     7   100     7
      4           0     7     0     0    47    33    93     7
      5           0     0     0    20     7     7     0     7
      6           7    67    20    31    80   100    27    --
      7          13     7    13    20    86   100   100    23
      8           0    93    67     0   100    93    64    21
      9           0    20     7    27    33    20    27     7
Problem-size effect. McCloskey, Aliminosa, & Sokol (in press) recently reported results for 7 patients with impairments in retrieval of multiplication facts for problems in the range 2 x 2 through 9 x 9. Although each patient showed non-uniform impairment across problems, severely impaired and relatively intact problems were not scattered randomly over the set of problems. Rather, for each patient, problems with large operands in general showed greater impairment than problems with small operands. However, for virtually all patients there were
notable exceptions to the problem-size effect. For example, IE's error rate was 100% for 3 x 8, but only 27% for 9 x 8.

Error patterns. Although patients with arithmetic fact retrieval deficits make errors at a higher-than-normal rate, many of these patients show error patterns very similar to that of normal subjects. For instance, 5 of the 7 patients studied by McCloskey, Aliminosa, & Sokol (in press) showed a predominance of operand errors, with much lower proportions of table, operation, and non-table errors. However, some patients present with error patterns rather different from that reported for normal subjects. For example, non-table errors (e.g., 9 x 4 = 38) have been found to comprise roughly 10% of the multiplication errors made by normal subjects. However, for two of McCloskey, Aliminosa, and Sokol's (in press) patients nearly half of the errors fell into the non-table error category.

Distributed models of arithmetic fact retrieval

Among current models of arithmetic fact retrieval, the most well-developed are those adopting an associative network view of memory (e.g., Ashcraft, 1987; Ashcraft & Battaglia, 1978; Campbell & Graham, 1985; Siegler, 1988; for a recent review, see McCloskey et al., 1991). Recently, however, several researchers have begun to explore connectionist approaches in which arithmetic problems and answers are represented as distributed patterns of activation across simple processing units.
Viscuso, Anderson, & Spoehr (1989)

Viscuso, Anderson, and Spoehr (1989; see also Anderson, Rossen, Viscuso, & Sereno, 1990) formulated a model of arithmetic fact retrieval within the framework of Anderson's (1983; Anderson, Silverstein, Ritz, & Jones, 1977) autoassociative Brain-State-in-a-Box (BSB) algorithm. In a typical BSB network the units take on activation values ranging continuously from -1 to +1, and every unit is linked by weighted connections to every other unit (as well as to itself). By means of these connections the units send graded excitatory and inhibitory signals to one another. Retrieval involves a series of cycles in which each unit sends signals to all units in the network (including itself), and each unit adopts a new activation state on the basis of the signals it receives. The units then again send signals to each other, adjust their activation levels accordingly, and so forth.
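A generic BSB retrieval step, together with a coarse magnitude code of the kind described below, can be sketched as follows. This is not Viscuso et al.'s implementation: the update constant, the field width, and the band step are assumptions (their report leaves the magnitude-code dimensions unspecified).

```python
import numpy as np

def bsb_retrieve(weights, state, alpha=0.2, cycles=50):
    """Generic Brain-State-in-a-Box retrieval: on each cycle every unit
    receives weighted signals from all units (including itself) and the
    new state is clipped to the [-1, +1] 'box', so the network settles
    toward a corner representing a stored pattern."""
    x = state.astype(float).copy()
    for _ in range(cycles):
        x = np.clip(x + alpha * (weights @ x), -1.0, 1.0)
    return x

def magnitude_code(n, width=22, band=4, step=2):
    """Coarse-coded magnitude: a band of four +1s slides `step` units
    per number, so numbers close in magnitude have overlapping codes
    (e.g., n=0 -> ++++----..., n=1 -> --++++--...)."""
    code = -np.ones(width)
    code[n * step : n * step + band] = 1.0
    return code
```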
Viscuso et al. (1989) developed a 640-unit BSB network to model retrieval of multiplication facts. The network included 3 fields of units, one field representing the first operand in a problem, a second field representing the second operand,
and a third field representing the answer. Numbers were represented as distributed patterns of activation across the operand and answer fields. Thus, a problem (e.g., 6 x 4) was presented to the network by setting the units in the first- and second-operand fields to patterns of activation corresponding to the numbers in the problem (i.e., 6 and 4). The network's task was to produce over the field of answer units a pattern of activation corresponding to the correct answer.
Viscuso et al. (1989) adopted a hybrid coding scheme for representing problem operands and answers in the network. Each number was coded by concatenating a magnitude representation, and a representation of the number's name (e.g., one for the number 1). The coarse-coded magnitude representations were constructed in such a way that numbers close in magnitude had similar representations (e.g., [++++----------------] for 0, [--++++--------------] for 1, [----++++------------] for 2, and so forth).¹ The name representation for a number was a letter-by-letter coding of the spelling of the number name (e.g., a coding of the letter sequence o-n-e for the number 1). Each letter was represented by an essentially arbitrary 8-unit pattern (which was derived from the letter's ASCII code).
Due to limitations on network size, the network was trained and tested not on actual multiplication facts (e.g., 6 x 3 = 18, 6 x 4 = 24), but rather on "qualitative" multiplication facts, in which products were rounded to the nearest 10 (or, for products less than 20, to the nearest 5). In the qualitative multiplication scheme, for example, 6 x 1 is 5, 6 x 2 is 10, and 6 x 3 and 6 x 4 are both 20. In separate runs, the network was trained on either the 64 facts ranging from 0 x 0 through 7 x 7, or the 64 facts ranging from 2 x 2 through 9 x 9.
Although interesting in many respects, the Viscuso et al. (1989) network has several significant shortcomings as a model of arithmetic fact retrieval. The first concerns the choice of a representation for numbers in the network. Although the inclusion of a magnitude code seems reasonable, the number-name portion of the hybrid representation is unmotivated. It might be argued that because people usually encounter arithmetic facts in the form of Arabic numerals (e.g., 8 x 6 = 48) or spoken words (e.g., /eyt/ /taymz/ /sIks/ /iz/ /fordi/ /eyt/), stored arithmetic fact representations might include visual digit representations or phonological number-word representations (e.g., Campbell & Clark, 1988; Clark & Campbell, 1991). However, the Viscuso et al. (1989) number-name
¹ The precise nature of the magnitude representations is difficult to determine from the Viscuso et al. (1989) report. Viscuso et al. characterize the magnitude representation for a number as a band of four +'s in a field of units, but do not specify the width of the field, or make clear whether a + corresponds to a single unit in the network.
representation codes the spelling of number words. Given that arithmetic facts are very rarely encountered in the form of written words (e.g., eight times six is forty-eight) the basis for choosing such a representation is unclear.
A second limitation of the Viscuso et al. (1989) model is that, as described above, the network does not store and retrieve actual single-digit multiplication facts, but "qualitative multiplication" facts in which, for example, 2 x 4, 2 x 5, and 2 x 6 are all 10. It is unclear to what extent results obtained with this unusual set of "facts" apply to actual multiplication.
Third, the network did not attain the high performance levels of which human learners are capable. After training on the 0-7's problems the network was only 64% correct, and training on the 2-9's problems yielded a performance level of only 42% correct.
Fourth, the network fails to capture the phenomenon of occasional errors in retrieval of well-learned facts, because the retrieval process is deterministic. That is, for any given input, the network always generates the same output (as long as the connection weights remain unchanged). Consequently, for any given problem the network either always generates the correct response, or always makes an error. (And, in the latter case, the error is always the same.) Thus, whereas errors made by adult human subjects usually represent occasional mistakes in retrieval of well-learned facts, the network's errors represent failure to know the correct answer. Hence, Viscuso et al. (1989) are not entirely justified in claiming that their network's error pattern is similar to that observed for human subjects.
Finally, it should be mentioned that Viscuso et al. (1989) did not consider other phenomena of human arithmetic fact retrieval, such as the problem-size effect. Although further development of the Viscuso et al. model may well overcome these limitations, at present the model cannot be said to simulate closely the arithmetic fact retrieval performance of human subjects.
McCloskey & Cohen (1989) and Graham (1990)
In the course of investigating interference effects in connectionist networks, McCloskey and Cohen (1989) developed a simple feed-forward back-propagation network for arithmetic fact retrieval. As shown in Figure 1, the network comprised a layer of 28 input units connected to a layer of 50 hidden units, which in turn was connected to a layer of 24 output units. In the input layer two fields of 12 units represented the problem operands, and a third field of 4 units represented the arithmetic operation (e.g., addition). In the output layer one field of 12 units represented the tens quantity of the answer, and a second field represented the ones quantity.
[Figure 1 here: a layer diagram in which input units (first number: 6; second number: 7; operation: addition) feed hidden units, which in turn feed output units (tens; ones).]
Figure 1. McCloskey & Cohen's (1989) back-propagation arithmetic network.
Numbers were represented in the network in a coarse-coded form similar to the magnitude portion of the Viscuso et al. (1989) representations. As illustrated below for the numbers 0-2, each number was represented by activating three units, and for numbers close in magnitude some of the same units were activated:
0  [ 1 1 1 0 0 0 0 0 0 0 0 0 ]
1  [ 0 1 1 1 0 0 0 0 0 0 0 0 ]
2  [ 0 0 1 1 1 0 0 0 0 0 0 0 ]
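Procedurally, this coarse code is a sliding band of three active units. The short Python sketch below is our illustration of the scheme (the function name is ours, not McCloskey and Cohen's):

    import numpy as np

    def coarse_code(n):
        """12-unit coarse code for a digit n in 0..9: a band of three
        adjacent active units, so neighboring digits share two units."""
        field = np.zeros(12)
        field[n:n + 3] = 1.0
        return field

    print(coarse_code(0))   # units 0-2 active
    print(coarse_code(1))   # units 1-3 active: overlaps coarse_code(0) in two units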
A problem (e.g., 6 + 1) was presented to the network by setting the operand fields in the input layer to the patterns corresponding to the problem’s operands,
and the operation field to a pattern representing the appropriate arithmetic operation. Activation was then propagated from the input layer to the hidden layer, and from the hidden layer to the output layer.
McCloskey and Cohen (1989) reported that when the network was trained by presenting the 200 single-digit addition and multiplication facts repeatedly to the network, the network readily learned all of the facts. However, when the network was trained sequentially -- that is, first on the 1's addition facts, then the 2's facts, and so forth -- the training of new facts drastically degraded performance on previously-learned facts. For example, training on the 2's addition facts severely disrupted performance on the 1's facts. McCloskey and Cohen (1989) labelled this disruption catastrophic interference, and suggested that it may represent a significant problem for attempts to model human learning with many current forms of connectionist networks.²
Because McCloskey and Cohen (1989) were concerned with the general issue of interference effects in connectionist networks, and not with modeling of arithmetic fact retrieval per se, they did not discuss their network's ability to simulate phenomena such as the problem-size effect, or patterns of arithmetic errors. However, Graham (1990) explored some of these phenomena with networks similar to that of McCloskey and Cohen (1989), and also discussed the issue of catastrophic interference in learning of arithmetic facts.
With respect to interference, Graham (1990; see also Graham & Schneider, 1988, and Hetherington & Seidenberg, 1989) argued that disruption of previously-learned facts by new learning can be reduced by continuing training on the old facts when new facts are introduced. However, it is not clear whether the amount of "maintenance rehearsal" required by a network is comparable to the amount children obtain in the course of learning arithmetic facts. Nor is it clear whether the substantial temporary disruption of previously-learned facts occasioned by introduction of new facts even under maintenance rehearsal conditions occurs to the same extent in children (but see Miller & Paredes, 1990, for evidence that at least some such interference occurs).
At present, it is uncertain to what extent the disruption of previously-acquired facts by new learning represents a significant problem for connectionist modeling of arithmetic fact retrieval. Clarification of this matter will require a better understanding of the conditions under which children learn arithmetic facts (e.g.,
² The interference issue did not arise in the context of the Viscuso et al. (1989) model because the network was trained "concurrently" on all facts. That is, throughout the training period facts were sampled for training from the full set of to-be-learned facts.
the order and frequency of presentation of the various facts), and the extent to which children experience interference during learning. Also relevant are attempts to develop means of reducing interference in connectionist networks (see, e.g., French, 1991; Kruschke, in press; Lewandowsky, 1991).
In addition to interference effects, Graham (1990) also considered the problem-size effect. He divided the 64 2-9's multiplication problems into a set of 32 small problems with products less than or equal to 25, and a set of 32 large problems with products greater than 25. Noting that children typically encounter small problems before large problems in learning arithmetic, Graham (1990) trained networks first on the small facts and then on the large facts (with maintenance rehearsal on the former during training of the latter). Under these training conditions a problem-size effect was observed in measures of accuracy: performance was generally better for the small problems than for the large problems. When large problems were trained first, a reverse problem-size effect was observed.
These results seem to suggest that order of acquisition was an important determinant of the network's performance on individual facts, such that facts presented early in training had an advantage over facts presented later. However, in at least some of the simulations the facts in the set trained first were also presented more frequently overall (due to continued training of first-set facts during second-set learning). Hence, it is not entirely clear whether the obtained problem-size effects reflected order of presentation, frequency, or both. In theorizing about human arithmetic fact retrieval, both frequency of presentation (e.g., Ashcraft, 1987; Campbell & Graham, 1985) and order of acquisition (Campbell & Graham, 1985) have been suggested as determinants of the problem-size effect. Other results reported by Graham (1990) raise the possibility that the nature of the facts themselves (at least as represented in Graham's simulations) may also contribute to the occurrence of problem-size effects. For example, Graham found that when networks were trained on either the small problems or the large problems, the latter took longer to learn.
With respect to errors, Graham (1990) reported that networks trained on multiplication facts, like human subjects, made predominantly operand errors. However, as simulations of human errors in arithmetic fact retrieval, the errors made by Graham's networks suffer from the same problem noted in the discussion of the Viscuso et al. (1989) model. In standard feed-forward back-propagation networks of the sort used by Graham (1990), the retrieval process is deterministic, always generating the same output for any given input (as long as the connection weights remain constant). Thus at any given point in training, Graham's networks
were either consistently correct or consistently incorrect on a problem. The errors Graham describes therefore represent instances in which a network had not learned the correct response to a problem, as opposed to occasional mistakes in retrieval of well-learned facts.³
Cottrell & Tsung (1991)
Cottrell and Tsung (1991) trained modified back-propagation networks to solve multi-digit addition problems (e.g., 327 + 865). Problems were presented to a network one column at a time. For each column, the network was required to produce the ones digit of the sum, indicate whether there was a carry, and request input of the next column (or indicate that the problem was finished). Under certain training conditions the networks performed quite well.
Cottrell and Tsung (1991) were primarily concerned with training a network to carry out the sequence of steps needed to solve multi-digit addition problems. Although part of the network's task was to add single-digit numbers, Cottrell and Tsung had little to say about this aspect of the network's functioning. For example, no attempt was made to model phenomena apparent in human performance, such as the problem-size effect. It may also be noted that the network performed addition in base 4, which would complicate any attempts to compare its fact retrieval performance to that of humans.
MATHNET
Connectionist modeling of arithmetic fact retrieval is currently in early stages of development, and many issues have not yet been systematically explored (e.g., the ability of models to account for patterns of impaired performance observed in brain-damaged patients). However, one point emerging from our survey of initial efforts is that the models developed to date fail to capture the phenomenon of occasional errors in retrieval of well-learned facts. Although a variety of approaches could be taken toward resolving this problem (see, e.g., note 3),
³ The phenomenon of occasional errors in retrieval of well-learned facts could perhaps be simulated in a back-propagation (or brain-state-in-a-box) network by introducing random noise into the network activation process. The noise would presumably lead to variation over trials in the output generated in response to any given problem, and hence to at least occasional incorrect outputs. However, it is not entirely clear whether the introduction of noise would lead to errors resembling those made by human subjects (e.g., for the problem 7 x 6, an output activation pattern corresponding closely to the pattern representing 48), as opposed to noisy or degraded output patterns that do not correspond closely to either the correct answer or any particular incorrect answer.
perhaps the most straightforward is to adopt a modeling paradigm in which the process of retrieving information from a network is inherently stochastic. The MATHNET model presented in this chapter represents an application of this approach. However, before describing MATHNET we first outline the general model of numerical processing that guides our computational modeling of arithmetic fact retrieval, and set forth the specific aims of the work we report.
A general model of numerical processing
McCloskey et al. (1985; see also McCloskey, in press) have proposed a model that analyzes cognitive numerical processing mechanisms into several functionally autonomous components (see Figure 2). At the most general level a distinction is drawn between the numeral-processing system, which comprises mechanisms for comprehension and production of numerals, and the calculation system, which encompasses processing components required specifically for carrying out calculations.
Numeral-processing system. McCloskey et al. (1985) assume that the numeral-processing system is composed of independent numeral comprehension and numeral production subsystems. Within each of these subsystems a further distinction is drawn between components for processing Arabic numerals (i.e., numerals in digit form, such as 63), and components for processing verbal numerals (i.e., numerals in the form of spoken or written words, such as sixty-three). According to the model, the Arabic and verbal numeral comprehension mechanisms convert numerical inputs into semantic representations for use in subsequent cognitive processing, such as performing calculations. The numeral production mechanisms transform semantic representations of numbers into the appropriate form for Arabic or verbal output.
The internal semantic representations are assumed to specify in abstract form the basic quantities in a number, and the power of ten associated with each (e.g., 6 tens and 3 ones for 63). It is further assumed that the quantity representations reflect magnitude relations among numbers, such that quantities that are close in magnitude, such as 8 and 7, have more similar representations than quantities that are not, such as 8 and 2 (Macaruso, McCloskey, & Aliminosa, 1991; Sokol, Goodman-Schulman, & McCloskey, 1989).
Calculation system. Calculation requires, in addition to comprehension and production of numerals, a variety of calculation-specific processes. In particular, the McCloskey et al. (1985) model posits mechanisms for (a) comprehension of
operation symbols (e.g., +) and words (e.g., plus), (b) retrieval of arithmetic facts (e.g., 6 x 7 = 42), and (c) execution of calculation procedures that specify the sequence of steps to be carried out in solving multi-digit arithmetic problems (e.g., in multiplication, start at the right-most column, translate the digits in this column into semantic representations, use these representations to retrieve the product of the digits, translate the ones quantity of the product into a digit representation, write this digit at the bottom of the column, and so forth).
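To make the posited division of labor concrete, the sketch below (Python) separates a column-by-column calculation procedure from a fact retrieval component, here stubbed with a lookup table. It is our illustration of the architecture, not an implementation from McCloskey et al. (1985).

    # Stand-in for the arithmetic fact retrieval component.
    FACTS = {(a, b): a * b for a in range(10) for b in range(10)}

    def retrieve_product(a, b):
        return FACTS[(a, b)]

    def multiply_by_digit(digits, multiplier):
        """Calculation procedure: multiply a number (list of digits, most
        significant first) by a single digit, column by column."""
        result, carry = [], 0
        for d in reversed(digits):          # start at the right-most column
            product = retrieve_product(d, multiplier) + carry
            result.append(product % 10)     # write the ones quantity
            carry = product // 10           # carry the tens quantity
        if carry:
            result.append(carry)
        return list(reversed(result))

    print(multiply_by_digit([3, 4, 6], 7))  # 346 x 7 -> [2, 4, 2, 2]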
[Figure 2 here: Arabic and verbal numeral comprehension mechanisms feed an abstract semantic representation, which feeds both the calculation mechanisms (arithmetic facts and calculation procedures) and the Arabic and verbal numeral production mechanisms.]
Figure 2. A general model of cognitive numerical processing mechanisms.
Empirical evidence. Support for the model's principal assumptions comes from patterns of numeral-processing and calculation deficits observed in brain-damaged patients (for reviews see Caramazza & McCloskey, 1987; McCloskey et al., 1985; McCloskey & Caramazza, 1987; McCloskey, in press). For example, results indicating that retrieval of arithmetic facts may be disrupted while ability to execute calculation procedures remains intact (e.g., Sokol et al., 1991; McCloskey et al., 1985) provide support for the assumption that arithmetic fact retrieval mechanisms are separate from mechanisms for execution of calculation procedures.
Implications for modeling of arithmetic fact retrieval. The framework provided by the McCloskey et al. (1985) model motivates and directs our computational
modeling in several ways. In the first place we take the eventual goal of our efforts to be that of developing a computational implementation of the fact retrieval component posited by the model. The model's assumption that arithmetic fact retrieval represents a functionally autonomous process motivates the decision to consider fact retrieval separately from other numerical processes such as numeral comprehension or execution of calculation procedures.⁴
The McCloskey et al. (1985) model also guides our choice of a representation for problems and answers. In accord with the model we assume that regardless of the form in which a problem is presented (e.g., 6 x 7 versus "six times seven"), numeral comprehension processes deliver to the arithmetic fact retrieval component an abstract semantic representation of the problem. The fact retrieval process returns a semantic representation of the answer, which is then converted by numeral production processes to representations appropriate for the desired mode of output (e.g., 42 versus "forty-two"). Finally, with respect to the semantic representations of numbers, we follow the McCloskey et al. model in assuming that each basic quantity in a number is represented separately (e.g., 49 is represented as 4 tens and 9 ones, and not as a single entity 49), and that similar quantities have similar representations.
Although supported by some empirical evidence, these assumptions about the functional architecture of calculation mechanisms, and the representations processed by these mechanisms, are by no means uncontroversial. For example, several researchers (e.g., Campbell and Clark, 1988; Clark & Campbell, 1991; Gonzalez & Kolers, 1982, 1987) argue that arithmetic fact retrieval processes operate on representations tied to the form in which problems are presented or answers are given. Hence, our efforts to develop a computational model of arithmetic fact retrieval within the framework of the more general McCloskey et al. (1985) numerical processing model may be viewed in part as a means of evaluating the viability of the general model. To the extent that the McCloskey et al. assumptions are incorrect, a computational model based on these assumptions should ultimately prove inadequate.
⁴ It is important to distinguish our goal of modeling a modular arithmetic fact retrieval process from the goal of modeling the task of solving basic arithmetic "table" problems. On the one hand, our aim is narrower than that of modeling the task of solving table problems, in that we do not consider the numeral comprehension processes that translate stimulus problems into the internal representations that serve as input to the arithmetic fact retrieval component, or the numeral production processes that operate on the output generated by the retrieval process. On the other hand, our goal is broader than that of modeling a particular task, in that the arithmetic fact retrieval process is (we assume) implicated in a variety of specific tasks (e.g., solving multi-digit as well as single-digit problems).
Aims of the present modeling
Although our eventual goal is to develop a full-fledged model of arithmetic fact retrieval, the aims of the work reported in this chapter are considerably more modest. In the first place, we limit our attention to retrieval of multiplication facts. Second, we consider only the 64 facts involving the operands 2-9 (i.e., 2 x 2 = 4 through 9 x 9 = 81). Several researchers have argued that whereas generation of these 2-9's products involves retrieval of stored fact representations (e.g., 8 x 7 = 56), multiplication by 0 or 1 is accomplished by application of general rules (i.e., N x 0 = 0, N x 1 = N; see, e.g., Ashcraft, 1983, 1987; Ashcraft, Fierman, & Bartolotta, 1984; Baroody, 1983, 1984; Campbell & Graham, 1985; McCloskey, Aliminosa, & Sokol, in press; Miller et al., 1984; Parkman, 1972; Sokol et al., 1991; Stazyk et al., 1982). Although many interesting issues arise concerning the ability of connectionist networks to simulate phenomena appearing to involve rule application, we defer consideration of these issues to a subsequent report. Third, we focus on production as opposed to verification of arithmetic facts (i.e., generating the answer to a problem, as opposed to deciding whether a presented answer is correct or incorrect). Much of the research on arithmetic fact retrieval has involved verification tasks, and an adequate model must ultimately be capable of interpreting verification results. Given, however, that everyday use of arithmetic typically involves production of answers, we focus initially on production. Finally, we restrict our attention to (normal and impaired) adult performance. Although we attempt to train networks in a manner roughly simulating the way in which children learn arithmetic facts, we do not consider the networks' ability to exhibit phenomena observed in studies of arithmetic fact learning.
Network architecture and representations
MATHNET was developed within the mean field theory paradigm (Peterson & Anderson, 1987; Peterson & Hartman, 1989; Hinton, 1989). In the following discussion we do not attempt to provide a rigorous mathematical treatment of this paradigm; rather we present a brief, non-technical description of the mean field theory approach as implemented in MATHNET. For more detailed discussion the reader is referred to the above-cited articles, and also to Geszti (1990), and Hertz, Krogh, & Palmer (1991).
Figure 3 illustrates the architecture of the MATHNET networks.⁵ The networks include 26 "problem" units, 40 hidden units, and 24 answer units. Each unit can take on an activation value ranging continuously from -1 to +1. The problem units comprise two fields of 12 units representing the problem operands, and a third field of 2 units representing the arithmetic operation. Among the answer units one field of 12 units represents the tens quantity of the answer, and a second field represents the ones quantity. The distributed quantity representations are the same as in the McCloskey and Cohen (1989) networks, except that the 0's in the McCloskey and Cohen representations are replaced by -1's in the MATHNET representations, as illustrated below for the quantities 7-9:⁶
7  [ -1 -1 -1 -1 -1 -1 -1 +1 +1 +1 -1 -1 ]
8  [ -1 -1 -1 -1 -1 -1 -1 -1 +1 +1 +1 -1 ]
9  [ -1 -1 -1 -1 -1 -1 -1 -1 -1 +1 +1 +1 ]
All problem units are connected to all hidden units, and all hidden units are connected to all answer units. The answer units are also connected to one another.⁷ Each connection has a real-valued weight which can be positive or negative. The connections between units are bidirectional and symmetric. That is, given a connection between unit i and unit j, signals are transmitted from i to j, and also from j to i; further, the connection weight is the same for signals in either direction. Thus, for example, in the MATHNET networks a connection between a hidden unit and an output unit transmits signals not only from the hidden unit to the output unit, but also from the output unit back to the hidden unit. The signal sent from a unit i to a connected unit j is simply $a_i w_{ij}$, where $a_i$ is the activation level of unit i, and $w_{ij}$ is the weight on the i-j connection. Similarly, the signal from unit j to unit i is $a_j w_{ij}$.
⁵ As discussed below, we trained multiple networks, each having the same architecture, on multiplication facts. Hence, we refer to MATHNET networks, instead of the MATHNET network.
⁶ In both back-propagation and mean field theory networks, either 0 or a negative value (e.g., -1) may be used as the minimum unit activation level. In general, use of a negative value speeds learning. Thus, whereas we used a minimum of 0 in our earlier back-propagation modeling (McCloskey & Cohen, 1989), we chose a minimum of -1 in our current work with mean field theory networks. (For further discussion see Peterson & Hartman, 1989; Stornetta & Huberman, 1987.)
⁷ In training or testing on any given problem, the states of all problem units remained fixed. Hence, no purpose would have been served by connecting problem units to one another.
[Figure 3 here: problem units (first operand: 6; second operand: 7; operation: multiplication) connect to hidden units, which in turn connect to answer units (tens: 4; ones: 2).]
Figure 3. Architecture of the MATHNET networks.
In addition to receiving signals from connected units, each hidden and answer unit has a bias, which may be thought of as a tendency to adopt a positive or negative activation level independent of external input. Biases function exactly like signals from connected units, in that a unit's bias is summed together with the incoming signals in determining a unit's activation level. In fact, a unit's bias is implemented as the weight on a connection from a unit that is always fully on (i.e., a unit with an activation level of +1.0). In subsequent discussion we will not distinguish the bias weights from other connection weights.
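The connectivity pattern, the symmetry constraint, and the implementation of biases as connections from an always-on unit can be collected into a single masked weight matrix, as in the following Python sketch. The sketch is ours, not code from the original work; the unit counts follow the text, and the initial weight range anticipates the learning section below.

    import numpy as np

    rng = np.random.default_rng(0)

    N_PROB, N_HID, N_ANS = 26, 40, 24          # unit counts from the text
    N = N_PROB + N_HID + N_ANS + 1             # +1: the always-on bias unit

    prob = slice(0, N_PROB)
    hid = slice(N_PROB, N_PROB + N_HID)
    ans = slice(N_PROB + N_HID, N_PROB + N_HID + N_ANS)

    mask = np.zeros((N, N), dtype=bool)
    mask[prob, hid] = mask[hid, prob] = True   # problem <-> hidden
    mask[hid, ans] = mask[ans, hid] = True     # hidden <-> answer
    mask[ans, ans] = True                      # answer units interconnected
    for region in (hid, ans):                  # bias acts like a unit fixed at +1
        mask[-1, region] = mask[region, -1] = True

    W = rng.uniform(-0.5, 0.5, (N, N))         # small random initial weights
    W = np.where(mask, (W + W.T) / 2, 0.0)     # symmetric: w_ij = w_ji
    np.fill_diagonal(W, 0.0)                   # no self-connections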
The retrieval process
In MATHNET a problem (e.g., 8 x 6) is presented to a network by setting the operand fields of the problem units to the activation patterns corresponding to the problem's operands, and the operation field to the pattern [-1 +1], which arbitrarily represents the operation of multiplication. The states of the problem units remain "clamped" (i.e., fixed to the initially-set values) during the retrieval process. Activation levels of hidden and answer units are initially set to 0. The network is then allowed to settle to a stable state through "simulated annealing," an iterative process in which the activation levels of the unclamped hidden and answer units are repeatedly adjusted. The pattern of activation across the answer units at the end of the annealing process is taken as the network's answer. (The use of the term annealing is by analogy to a physical annealing process in which a molten substance is cooled gradually to a stable state.)
On each iteration of the simulated annealing process, the activation states of all unclamped units are updated asynchronously. Specifically, the units are updated one at a time in a random order. When a unit i is selected for updating, its activation level is determined by summing the signals it receives as input, and transforming the sum non-linearly:

$$a_i = \tanh\!\left(\frac{\sum_j a_j w_{ij}}{T}\right)$$
In this equation, $\sum_j a_j w_{ij}$ refers to the sum of the incoming signals from all units j to which unit i is connected (including the "bias" unit). T, a parameter referred to as "temperature," is discussed further below. Finally, tanh, the hyperbolic tangent, is a sigmoidal function which constrains the unit's activation level to the range -1.0 to +1.0. The more strongly positive the summed inputs divided by the temperature, the more closely the unit approaches the maximum activation level of +1.0; the more strongly negative the inputs divided by temperature, the more closely the unit approaches the minimum activation level of -1.0.
Once a unit's new activation level is determined, its signals to other units are adjusted accordingly. Then, another unit is selected for updating, and so forth, until all unclamped units have been updated. The procedure of updating the unclamped units in random order is repeated on each iteration of the annealing process. As the activation levels of the units are repeatedly adjusted, the network gradually settles to a stable state. The network may be thought of as moving across an "energy landscape" (where each location is defined by the activation states of the units) in search of a minimum energy state.
The asynchronous updating of unit activation levels is an important feature of the MATHNET networks, because it introduces a source of random variation into
the retrieval process. Updating a unit's activation level alters the signals it sends to other units. As a consequence, the inputs to a unit selected for updating will vary as a function of which other units have previously been updated on the current iteration of the simulated annealing process. Because the units are updated in a random order on each iteration, the updating order will vary across retrieval attempts for any given problem. Thus, on each retrieval attempt for a problem, the network may traverse a different route across the energy landscape en route to a stable state, and indeed the retrieval process may terminate in different states on different retrieval attempts. This feature of the networks is important for our aim of simulating the occasional errors in retrieval of well-learned facts observed for human subjects.⁸
One final aspect of the retrieval process requires comment. During annealing, the temperature parameter is initially set to a relatively high value, and gradually reduced (just as the temperature is gradually reduced in a physical annealing process). Given that the summed inputs to a unit are divided by the temperature, the use of initially-high temperatures prevents unit activation levels from being driven to extreme values too early in the settling process. The sequence of temperatures used over the iterations of the annealing process is referred to as the annealing schedule. In most of the network training and testing described in this chapter we used a 16-iteration annealing schedule beginning at a temperature of 30 and ending at a temperature of 0.5. As discussed in a later section, we also made use of a shorter annealing schedule to simulate speeded testing conditions.⁹
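In outline, the settling process might be sketched as follows (Python). The geometric annealing schedule is an illustrative stand-in: the text fixes only the schedule's length (16 iterations) and endpoints (temperatures 30 and 0.5), not its intermediate values.

    import numpy as np

    rng = np.random.default_rng(0)

    # Illustrative 16-step schedule running from T = 30 down to T = 0.5.
    SCHEDULE = 30.0 * (0.5 / 30.0) ** (np.arange(16) / 15)

    def settle(a, W, clamped, schedule=SCHEDULE):
        """Anneal to a stable state: at each temperature, update the
        unclamped units one at a time in a fresh random order. The
        clamped mask covers the problem units and the always-on bias unit."""
        a = a.copy()
        free = np.flatnonzero(~clamped)
        for T in schedule:
            rng.shuffle(free)                   # asynchronous, random order
            for i in free:
                a[i] = np.tanh(W[i] @ a / T)    # summed input, squashed by tanh
        return a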
The learning process Weights on connections between units in MATHNET are initially set to small random values distributed uniformly over the range -0.5 to +0.5. The networks are then trained to produce correct responses to problems through a process in
⁸ Strictly speaking, in asynchronous updating each unit has a constant probability of updating its activation state per unit time. To implement truly asynchronous updating we would sample units randomly with replacement for updating, so that some units might be updated more than once on a given iteration of the annealing schedule, and some not at all. However, our approach of sampling without replacement, and consequently updating each unit once per iteration, is a widely-adopted approximation to asynchronous updating (e.g., Peterson & Anderson, 1987).
⁹ Our annealing schedules were generated on the basis of general principles discussed by Peterson and Hartman (1989), coupled with limited trial-and-error experimentation. Hence, the schedules are to some extent arbitrary. A goal for future work is to explore the consequences of adopting different schedules.
which sets of training problems are presented repeatedly, and the connection weights are adjusted in small steps on the basis of a learning algorithm.¹⁰
For each presentation of each training problem, learning involves a two-phase process. In the "free" phase the problem units are clamped to the activation pattern representing a particular problem, and the network is allowed to settle to equilibrium as described in the discussion of retrieval. For each pair of connected units i and j, $a_i^{\text{free}} a_j^{\text{free}}$ (the product of the units' activation levels) is then computed. In the second, or "clamped," phase, both the problem and the answer units are clamped to the appropriate activation patterns. For example, in training on the problem 8 x 6, the problem units are clamped to the pattern for this problem, and the answer units are clamped to the pattern for the number 48. The network is then allowed once again to settle to equilibrium, and for each pair of connected units the product of the units' activation levels ($a_i^{\text{clamped}} a_j^{\text{clamped}}$) is again computed.
The free phase ascertains the state adopted by the network under normal retrieval conditions, whereas the clamped phase (where the answer units are clamped to the correct activation values) provides a target or desired state. The learning algorithm is aimed at changing the connection weights in such a way that the network will subsequently adopt under free-phase conditions a state more closely resembling the state achieved in the clamped phase. Specifically, for each connection the weight change is computed as

$$\Delta w_{ij} = \epsilon\left(a_i^{\text{clamped}} a_j^{\text{clamped}} - a_i^{\text{free}} a_j^{\text{free}}\right)$$

In this equation $\epsilon$ is a scaling parameter sometimes referred to as the learning rate. In the simulations reported in this chapter, $\epsilon$ was set at .003.
An intuitive grasp of the weight adjustment rule may be obtained by noting that the product $a_i a_j$ corresponds roughly to a measure of the extent to which the units i and j are in similar activation states. The product will be positive if both units have positive activation levels, or if both have negative activation values. On the other hand, the product will be negative if one unit has a positive activation level, and the other has a negative activation level. From this perspective, it can be seen that the effect of the learning rule is to decrease the difference between the network's free-phase and clamped-phase behavior. If two units are in similar states to a greater extent in the clamped phase than in the free phase, $a_i^{\text{clamped}} a_j^{\text{clamped}}$ will be greater than $a_i^{\text{free}} a_j^{\text{free}}$, and the connection weight will be changed in the positive direction. This change will increase the likelihood that the two units will adopt similar activation states under subsequent free-phase conditions, because the more
¹⁰ This learning process applies to the weights implementing unit biases, as well as to the other weights in the network.
positive the connection weight for two units, the more the units are constrained to be in similar activation states.
In contrast, if two units are in dissimilar activation states to a greater extent in the clamped phase than in the free phase, $a_i^{\text{clamped}} a_j^{\text{clamped}}$ will be less than $a_i^{\text{free}} a_j^{\text{free}}$, and the connection weight will be changed in the negative direction. This change will increase the likelihood that the two units will adopt different activation states under subsequent free-phase conditions.
Connection weights may be adjusted after each training problem, or information about activation levels of connected units in clamped and free phases may be accumulated over multiple problems before weights are adjusted. In the simulations reported in the present chapter, weights were adjusted every 10 problems when the set of training problems was small (i.e., 64 problems or fewer), and every 64 problems when the training set was larger.
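In code, one free/clamped training step might look like the following sketch, which reuses the settle routine sketched earlier. Accumulation of weight changes over 10 or 64 problems is omitted for brevity, and the argument names are ours.

    import numpy as np

    def train_step(start_free, start_clamped, W, clamp_free, clamp_both,
                   mask, eps=0.003):
        """One two-phase mean field weight update for a single problem."""
        a_free = settle(start_free, W, clamp_free)        # only problem units clamped
        a_clamped = settle(start_clamped, W, clamp_both)  # problem + answer clamped
        dW = eps * (np.outer(a_clamped, a_clamped) - np.outer(a_free, a_free))
        return W + np.where(mask, dW, 0.0)                # change existing connections only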
Training regimen
In the modeling reported in this chapter, the presentation of problems during training was designed to simulate very roughly the training experienced by children from initial introduction of multiplication facts to eventual mastery of the facts. In an initial ordered training period, the network was first presented with the 2's problems (2 x 2 through 2 x 9, and 3 x 2 through 9 x 2), then the 2's and 3's problems, then the 2's, 3's, and 4's problems, and so forth. Each problem set included two occurrences of each newly introduced problem, and one occurrence of each previously-trained problem. For example, in the set including 2's, 3's and 4's problems, the 2's and 3's problems each occurred once, and the not-previously-presented 4's problems (i.e., 4 x 4 through 4 x 9, and 5 x 4 through 9 x 4) each occurred twice. Each training set was presented for 5 learning cycles. On each learning cycle the problems in the training set were presented in a different random order. This portion of the training regimen was intended to simulate the ordered introduction of "times-tables" in the early school years.
Following the initial ordered training, the network was presented repeatedly with a single training set including all 64 2-9's problems. Within this training set, problem frequency was varied, such that "small" problems occurred more often than "large" problems. As shown in Table 2, the problems were divided into 7 size classes on the basis of the sum of the problem operands (a commonly-used index of problem size). For example, class A included the 6 problems with operand sums in the range 4-6 (i.e., 2 x 2, 2 x 3, 3 x 2, 2 x 4, 4 x 2, 3 x 3); these problems occurred 7 times each in the training set. The varied-frequency training set
included a total of 256 problems, which were presented in a different random order on each learning cycle. This portion of the training regimen was intended as a rough simulation of experience with multiplication facts following their initial introduction, and incorporates the assumption made by several researchers (e.g., Ashcraft, 1987) that small problems are encountered more frequently than large problems.
Table 2. Division of problems into size classes for manipulation of frequency of training.
Size Class    Sum of Operands    Number of Problems    Frequency in Training Set
    A              4-6                   6                        7
    B              7-8                   9                        6
    C              9-10                 13                        5
    D              11                    8                        4
    E              12-13                13                        3
    F              14-15                 9                        2
    G              16-18                 6                        1
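For concreteness, the varied-frequency training set can be generated as below (Python). The per-class frequencies of 7 (class A) down to 1 (class G) shown in Table 2 are the progression consistent with the stated class A frequency and the 256-problem total.

    SIZE_CLASSES = [       # (lowest sum, highest sum, presentations per problem)
        (4, 6, 7), (7, 8, 6), (9, 10, 5), (11, 11, 4),
        (12, 13, 3), (14, 15, 2), (16, 18, 1),
    ]

    def frequency(a, b):
        for lo, hi, freq in SIZE_CLASSES:
            if lo <= a + b <= hi:
                return freq
        raise ValueError((a, b))

    training_set = [(a, b) for a in range(2, 10) for b in range(2, 10)
                    for _ in range(frequency(a, b))]
    print(len(training_set))    # 256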
Two points about this complex training regimen require comment. First, our ability to provide a network with arithmetic fact training similar to that experienced by typical human learners is limited by a dearth of specific data on typical frequencies and patterns of exposure to arithmetic facts. For example, although the assumption that frequency of exposure varies with problem size appears reasonable, no strong evidence has been adduced in support of this assumption (see McCloskey et al., 1991). Thus, the extent to which our training regimen (or any training regimen) even roughly simulates human experience with arithmetic facts is somewhat uncertain. Second, the training regimen incorporates the two factors -- order and frequency of exposure to problems -- that have most often been hypothesized to contribute to the genesis of the problem-size effect (e.g., Ashcraft, 1987; Campbell & Graham, 1985; but see Campbell & Oliphant, this volume; Siegler, 1988). However, because frequency and order are confounded (i.e., small problems are encountered earlier and more often than large problems), the training regimen does not allow us to determine the extent to which each individual factor
contributes to problem-size effects in the networks. Hence, in reporting results we present additional simulations in which (a) order and frequency are both held constant across problems, and (b) one of the two factors is varied while the other is held constant. (Of course, the extent to which the determinants of problem-size effects in the networks are also the determinants of problem-size effects in people is a separate issue.)
Replication
To ensure that the obtained results were robust, three separate networks (which we will refer to as networks A, B, and C) were trained in the manner described above. Although the training procedure was the same for each network, the networks differed in the initial randomly-assigned connection weights, the random order in which the problems were trained on each learning cycle, and the random order in which the unit activation states were updated on each iteration of the simulated annealing schedule for each presentation of each problem.
Scoring
In evaluating a network's answer to a problem, the tens-quantity and ones-quantity answer fields were scored separately. For each field the pattern of activation across the individual units was compared to the activation pattern for each of the quantities 0-9, and the best-matching quantity was taken to be the network's output.¹¹ For example, if for a particular problem the activation pattern across the tens answer field corresponded most closely to the pattern for the quantity 7, and the pattern across the ones answer field was most similar to the pattern for the quantity 2, then the network's answer was taken to be 72. For "normal" (i.e., undamaged) networks most answer-field patterns were entirely unambiguous, matching the pattern for one particular quantity perfectly, or nearly so. Damaged networks, however, occasionally produced noisier activation patterns.
¹¹ The degree of match between an actual output pattern and the target pattern for a particular quantity was determined by computing for each unit the squared difference between the actual and target activation values, and summing these differences across units. The smaller the sum of squared differences, the better the match between the actual and target patterns.
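The matching rule in note 11 can be sketched in a few lines (Python); target_pattern reproduces the +1/-1 quantity codes given earlier, and the function names are ours.

    import numpy as np

    def target_pattern(n):
        """12-unit target for digit n: a band of three +1s among -1s."""
        field = -np.ones(12)
        field[n:n + 3] = 1.0
        return field

    def score_field(activations):
        """Return the digit whose target pattern best matches the field,
        by minimum summed squared difference."""
        errors = [np.sum((activations - target_pattern(n)) ** 2)
                  for n in range(10)]
        return int(np.argmin(errors))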
Results
Ability to learn
During training each network's performance was evaluated every 5 learning cycles by scoring the answer produced by the network for each training problem on the free phase of the training process (i.e., the phase in which the problem units are clamped, and the hidden and answer units are free to vary). Training continued until the network scored 100% correct on one of these tests.
Because the network retrieval process is stochastic (due to the random order in which unit activation levels are updated on each iteration of the annealing schedule), a correct response to a problem on a single trial does not necessarily indicate that the problem is well-learned in the sense that the correct response could be produced consistently. Hence, after each network reached the initial learning criterion, it was tested more thoroughly to determine whether all facts had indeed been learned well. In each of 10 test blocks each of the 64 problems was presented once. For each problem the operand and operation fields were clamped to the pattern representing the problem, and the network was allowed to settle to equilibrium through the 16-iteration annealing schedule used in training.
Network A reached the learning criterion after 110 learning cycles, network B after 140 learning cycles, and network C after 120 cycles. The subsequent testing revealed that all three networks had mastered all 64 problems: Networks A and C were 100% correct over the 10 test blocks, and network B was 99.8% correct (639/640). Thus, averaging over the three networks, the 64 2-9's multiplication facts were mastered in a mean of 123 learning cycles. For each network the first 40 learning cycles comprised the initial ordered training period, and the remaining cycles involved the 256-problem training set in which small problems were presented more frequently than large problems. The mean number of presentations for individual problems over the full training period ranged from 93 (for 9 x 9) to 626 (for 2 x 2, 2 x 3, and 3 x 2). Although it is difficult to estimate the approximate number of exposures to problems children receive in the course of learning multiplication facts, the amount of training required by the networks does not seem unreasonable. Thus, we conclude that the MATHNET networks were able to learn all 64 facts, and could do so in a reasonable number of trials.¹²
¹² Note that the average of 626 exposures to 2 x 2, 2 x 3, and 3 x 2 does not mean that this number of exposures was required to learn these facts, but simply that over the training required for learning of all 64 facts, 2 x 2, 2 x 3, and 3 x 2 were presented 626 times.
Performance under speeded testing conditions
In the 10 blocks of test trials discussed in the previous section, the networks were tested with the annealing schedule used during training. These test blocks may be thought of as simulations of non-speeded testing. However, many of the phenomena reported in studies of normal arithmetic fact retrieval (e.g., occasional errors in retrieval of well-learned facts, problem-size effects in reaction time and error data) involve speeded testing conditions. To assess the extent to which our networks exhibited these phenomena, we tested the networks under simulated speed pressure. Specifically, we shortened the annealing schedule from 16 iterations to 11 by omitting the first five iterations. In this way we forced the network to arrive at an answer more quickly than during training.
One other aspect of the speeded testing procedure allowed us to simulate reaction time for retrieval of arithmetic facts. In training and non-speeded testing, the retrieval process on each trial continued through the full set of iterations specified by the annealing schedule. In speeded testing, however, the iterative retrieval process continued only until the network had achieved a stable state. This procedure was intended to simulate the pressure human subjects presumably experience on a speeded test to produce an answer as soon as one has been retrieved. The number of iterations the network required to attain a stable state on a trial was taken as the network's reaction time for that trial. A stable network state was defined somewhat arbitrarily as a state in which all answer units had activation values greater than +0.9 or less than -0.9 (i.e., values within 0.1 of the maximum or minimum unit activation values).¹³ On (the few) trials in which the network did not meet this criterion prior to the end of the 11-iteration speeded annealing schedule, the retrieval process terminated at the completion of the schedule, and the reaction time was taken to be 11 iterations. Each network received 30 blocks of 64 speeded test trials.
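A sketch of the speeded testing procedure (Python), adapting the settle routine given earlier: settling runs on the shortened schedule and stops as soon as every answer unit clears the ±0.9 criterion, with the iteration count read off as the simulated reaction time.

    import numpy as np

    rng = np.random.default_rng(0)

    def speeded_trial(a, W, clamped, answer_idx, schedule):
        """Return (final state, reaction time in iterations)."""
        a = a.copy()
        free = np.flatnonzero(~clamped)
        for t, T in enumerate(schedule, start=1):    # e.g., 11 iterations
            rng.shuffle(free)
            for i in free:
                a[i] = np.tanh(W[i] @ a / T)
            if np.all(np.abs(a[answer_idx]) > 0.9):  # stable state: all answer
                return a, t                          # units within 0.1 of +/-1
        return a, len(schedule)                      # criterion never met: RT = 11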
Errors in retrieval of well-learned facts
We have noted that human subjects who know arithmetic facts well nevertheless make occasional fact retrieval errors, at least when under speed pressure. Hence, we may ask whether the MATHNET networks show an analogous phenomenon,
¹³ Consistent with the assumption that network states meeting this criterion are stable, preliminary results suggest that retrieval accuracy is virtually identical when the retrieval process continues for the full annealing schedule, as when the stability criterion is used to terminate retrieval.
and if so, whether the errors made by the networks resemble those observed for human subjects.
Error rates. Whereas the networks were virtually perfect in non-speeded testing (mean accuracy = 99.95%), the simulated speed pressure resulted in occasional errors. In the speeded testing the mean accuracy was 97.3%, representing an average of 52 errors per network over the 30 test runs. Thus, the networks, like human subjects, showed reduced accuracy under speed pressure.¹⁴
The networks' performance also resembled that of human subjects in another respect. Although speed pressure increased error rates, performance on all problems remained reasonably good. For each network accuracy was at least 65% for all problems, and no network had more than 3 problems with accuracies lower than 85%.
Error types. Table 3 presents the distribution of the networks' errors across error types. Of the 155 total errors, 122 (79%) were operand errors (e.g., 7 x 7 = 56), in which the answer produced by the network was correct for a multiplication problem sharing an operand with the stimulus problem.
Table 3. Error distribution for the MATHNET networks.
Error Type     Example       Number (%)
Operand        8 x 8 = 72    122 (79)
Operation      8 x 5 = 13      0  (0)
Table          9 x 7 = 64      8  (5)
Non-Table      9 x 5 = 55     25 (16)
¹⁴ Given that terminating retrieval on the basis of the stability criterion appeared to have no effect on retrieval accuracy (see note 13), the decreased accuracy in speeded testing presumably reflects the removal of the first 5 iterations from the 16-iteration annealing schedule used in training and non-speeded testing. By shortening the annealing schedule in this way, we eliminated an initial period in which the "temperature" was slowly reduced from a high starting value (30); with the shortened schedule the retrieval process began at a considerably lower temperature (i.e., 7.1). As expected, this procedure increased the likelihood that a network would fail to "find" the lowest energy state (corresponding, for a trained network, to the correct answer), but would settle instead into a local energy minimum in which the pattern of activation across the answer units was incorrect. In subsequent work we plan to assess the effects of other ways of defining "speeded" annealing schedules (e.g., starting at a high temperature, and reducing it rapidly).
Eight of the errors (5%) were table errors, in which the network's response was correct for a multiplication problem not sharing an operand with the stimulus problem (e.g., 9 x 7 = 64). Finally, 25 of the errors (16%) were non-table errors, in which the network's response was not the correct answer to any single-digit multiplication problem (e.g., 8 x 6 = 46, 7 x 4 = 22). This overall error distribution is representative of the distributions for each of the three individual networks.
The networks' error distribution corresponds well to the distributions observed for human subjects. For example, Campbell and Graham (1985) reported a distribution of 79% operand errors, 13% table errors, and 7% non-table errors for a group of 60 adults; and Harley (1990), in a study involving 42 adult subjects, found 71% operand errors, 13% table errors, and 10% non-table errors. Like human subjects, the networks produced various different incorrect answers for some problems. For example, on the problem 8 x 6 network A was correct on 23 of the 30 trials, but answered 36 four times, and 42 three times. Human subjects occasionally make errors in which the answer is correct for a different arithmetic operation (e.g., 8 x 5 = 13). Not surprisingly, the networks, which were trained only on multiplication, made no errors of this type.
Operand distance effect. Examination of the networks' operand errors revealed the operand distance effect observed for human subjects. That is, the erroneous responses were usually correct for a problem that not only shared an operand with the stimulus problem, but was also close in magnitude with respect to the other operand. For example, in 9 x 7 = 56, the answer is correct for 8 x 7, which shares the second operand (7) with the stimulus problem, and differs by 1 on the non-shared operand (9 vs. 8). In 111 of the 122 operand errors (91%), the non-shared operand was within ±1 of the correct operand.
Determinants of the error pattern. We have not yet carried out analyses of network functioning to probe the processes leading to errors of particular sorts. However, we may nevertheless offer a few plausible speculations. The predominance of operand errors presumably reflects the fact that the problem representation is similar for problems sharing an operand. For example, all 8 x N problems have the same pattern of activation across the first-operand field. If, then, we assume that the answer to problem X is more likely to occur as an error to problem Y the more similar the X and Y problem representations, we would expect operand errors to be common.
In the same vein, the operand distance effect presumably reflects our choice of a quantity-coding scheme in which similar quantities have similar representations. For example, the representation for 9 more closely resembles the representation for 8 than the representation for 3. Accordingly, the 9 x 7 problem representation
is more similar to the 8 x 7 representation than to the 3 x 7 representation, and as a consequence errors like 9 x 7 = 56 are much more common than errors like 9 x 7 = 21. Finally, inspection of the non-table errors suggests that most of these errors occurred when the network in some sense combined the tens quantity from the correct answer and the ones quantity from the answer to a related problem, or vice versa. For example, in 8 x 7 = 58, the erroneous answer may represent a combination of the tens quantity from the correct answer 56, and the ones quantity from 48, the answer to the related problem 8 x 6. The extent to which human non-table errors reflect this sort of tens/ones recombination is an issue that merits consideration.
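The error taxonomy can be stated compactly in code. The classifier below (Python) is our paraphrase of the definitions used above, restricted to the 2-9's multiplication problems on which the networks were trained:

    def classify_error(a, b, answer):
        """Classify a response to the problem a x b (operands 2-9)."""
        products = {(x, y): x * y for x in range(2, 10) for y in range(2, 10)}
        if answer == a * b:
            return "correct"
        # Operand error: correct for a problem sharing an operand.
        if any(answer == prod for (x, y), prod in products.items()
               if a in (x, y) or b in (x, y)):
            return "operand"
        # Table error: correct for some other single-digit problem.
        if answer in products.values():
            return "table"
        return "non-table"   # not the product of any 2-9's problem

    print(classify_error(9, 7, 56))  # operand (correct for 8 x 7)
    print(classify_error(9, 7, 64))  # table   (correct for 8 x 8)
    print(classify_error(8, 6, 46))  # non-table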
Variation in performance across problems
Problem-size effect. As we noted in discussing arithmetic fact retrieval phenomena, many studies of human subjects have found slower reaction times and/or higher error rates for problems with large operands (e.g., 9 x 7) than for problems with smaller operands (e.g., 2 x 3). Typically, problem-size effects have been characterized in terms of correlations between measures of performance and measures of problem size (e.g., the sum or product of the operands). Accordingly, we computed correlations between sum of problem operands, and the networks' reaction times and error rates for the individual problems (averaged over the three networks).
For reaction time, the correlation with sum of operands was .69 (p < .001), indicating that reaction time generally increased with sum. This correlation is comparable in magnitude to those reported in studies of human subjects -- the human studies have typically found correlations in the range of roughly .6 to .8 (e.g., Campbell & Graham, 1985; Miller et al., 1984; Stazyk et al., 1982). For the networks' error rates, the correlation with sum of operands was .52 (p < .001). The most directly comparable result available in the arithmetic literature is Campbell and Graham's (1985) report of a .63 correlation between error rate and sum of operands for 2-9's multiplication problems in a study where adults produced speeded responses to problems.
Tables 4 and 5 present the mean reaction time and error rate, respectively, for each of the individual problems. The data in both tables represent averages across networks A, B, and C. The reaction time reported for each problem in Table 4 is a normalized value obtained by dividing the mean reaction time for a problem (in number of iterations) by 11, the maximum number of iterations allowed by the annealing schedule used in the speeded testing. For example, on the problem
2 x 2 the networks required a mean of 6.8 iterations to reach a stable state; dividing this value by 11 yields the normalized reaction time of .618 shown in the table.
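These computations are simple to reproduce. The sketch below is illustrative only: the per-problem data are invented, and pearson implements the standard formula rather than anything from the original analyses.

    # Illustrative sketch: normalized reaction times and a problem-size correlation.
    import math

    MAX_ITERATIONS = 11  # ceiling imposed by the speeded annealing schedule

    def normalized_rt(mean_iterations):
        return mean_iterations / MAX_ITERATIONS

    def pearson(xs, ys):
        n = len(xs)
        mx, my = sum(xs) / n, sum(ys) / n
        cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
        sy = math.sqrt(sum((y - my) ** 2 for y in ys))
        return cov / (sx * sy)

    print(round(normalized_rt(6.8), 3))  # 0.618, the value reported for 2 x 2

    # Invented per-problem mean RTs that simply grow with operand sum:
    rts = {(a, b): 0.5 + 0.02 * (a + b)
           for a in range(2, 10) for b in range(2, 10)}
    sums = [a + b for (a, b) in rts]
    print(round(pearson(sums, list(rts.values())), 2))  # 1.0 for this toy data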
Table 4. Mean normalized reaction times for the MATHNET networks on 2-9's multiplication problems (rows: first operand, 2-9; columns: second operand, 2-9).
Table 5. Mean error percentages for the MATHNET networks on 2-9's multiplication problems.

                               Second Operand
    First Operand     2     3     4     5     6     7     8     9
         2            0     0     0     0     2     2     3     1
         3            0     0     0     0     0     0     1     1
         4            0     0     0     2     1     3     0     0
         5            0     0     0     1    12     3     9     8
         6            2     0     0     9     0     3     9     0
         7            3     0     0     3     1     7     2    11
         8            2     3     0     8     1     3    21     7
         9            2     6     1     0     0     2     8     6
It is apparent from the tables that whereas reaction times and error rates were uniformly low for problems with two small operands, many of the problems with larger operands showed longer reaction times and/or higher error rates. This is especially true for problems with two large operands. To be sure, reaction time and error rate did not increase uniformly with problem size -- some "large" problems had lower reaction times and/or error rates than some "smaller" problems. However, neither is the problem-size effect entirely uniform in human subjects (see, e.g., Campbell & Graham, 1985; McCloskey et al., 1991).15

15 In one respect the results reported in Tables 4 and 5 do not correspond closely to those reported for human subjects: the correlation between the reaction time and error rate measures for the individual problems was only .42. Although reliable (p < .001), this correlation is considerably lower than those typically reported in studies of arithmetic fact retrieval. For example, Campbell and Graham (1985) found a reaction time/error rate correlation of .85.

Determinants of the problem-size effect. In describing the training regimen used in the present modeling, we noted that the regimen confounds order and frequency of presentation. Small problems are introduced earlier, and presented more frequently, than large problems. To ascertain whether frequency, order, both, or neither contributed to the obtained problem-size effect, we trained additional networks with different procedures. First, we trained three control networks with a procedure that held constant both frequency and order of presentation. For each network the training set comprised 4 occurrences of each of the 64 problems. In this way the size of the training set was equated with that of the varied-frequency set used in the second phase of training for the primary networks (see pp. 21-22); other aspects of the control training procedure (e.g., learning rate) were also the same as for the second-phase training of the primary networks. The three control networks completed training in an average of 90 learning cycles. Speeded testing of the control networks revealed no problem-size effects for either reaction time or error rate measures. The correlation of mean reaction time and sum of operands was negligible (r = .13, p > .2), and the correlation of mean error rate with sum was not only negligible, but in the wrong direction (r = -.12, p > .2). Second, we trained networks with a procedure that manipulated order of presentation while holding frequency constant. In this procedure networks were trained for a fixed number of learning cycles on each of a series of training sets. In Min networks the problem sets were defined on the basis of the minimum operand. In the first training set for these networks each 2's problem occurred once. In the second set each (not previously presented) 3's problem occurred twice, and each 2's problem occurred once. In the third set each 4's problem occurred three times, and each 2's and 3's problem occurred once. And so forth, through the eighth and final set, in which each 2's through 8's problem occurred once, and the remaining 9's problem 9 x 9 occurred 8 times. When these training sets are presented in order for a fixed number of learning cycles each (e.g., 20 cycles on the first set, then 20 cycles on the second set, and so forth), the total presentation frequency is held constant across problems, but small problems are introduced earlier in training than large problems. The procedure was the same for Sum networks, except that the training sets were defined according to sum of operands. More specifically, the training sets were defined on the basis of the 7 problem-size classes established for training of the principal networks (see Table 2). These classes were formed by grouping problems according to operand sum. For the Sum networks the first training set comprised one occurrence of each Class A problem; the second set consisted of two occurrences of each Class B problem, and one occurrence of each Class A problem; and so forth. Although the number of learning cycles for each training set was fixed for a given network, we trained a variety of Min and Sum networks with numbers of learning cycles per training set ranging from 5 to 100 (with accuracies at completion of training ranging from about 50% to virtually 100%). None of these varied-order networks exhibited reliable problem-size effects. For example, a representative Min network trained with 30 learning cycles per training set showed a correlation of .14 between reaction time and sum of operands, and a correlation of .04 between error rate and sum. Similarly, a Sum network trained with the same number of learning cycles per training set showed correlations with operand sum of -.01 and -.03 for reaction time and error rate, respectively. Finally, three networks were trained with a procedure that varied frequency of presentation while holding order constant. Like the primary A, B, and C networks, these networks were trained with the 256-item training set in which small problems occurred more frequently than large problems, but without the initial ordered training period in which small problems were introduced earlier than large problems. The networks trained in this way completed training in an average of 130 learning cycles. These varied-frequency networks exhibited problem-size effects comparable in magnitude to those observed for the primary networks (r = .66, p < .001, for reaction time data, and r = .44, p < .001, for error rate data). Thus, no problem-size effect was observed when frequency and order of presentation were both held constant, or when frequency was held constant and order was varied. However, when frequency was varied and order of presentation
was held constant, large problems showed slower reaction times and higher error rates than small problems. Consequently, it appears that problem presentation frequency was primarily responsible for the problem-size effects obtained in the primary MATHNET networks.

Other variations in performance across problems. Whereas our networks show some form of problem-size effect, they do not exhibit two other phenomena involving variation in performance across problems. Tie problems (i.e., problems with two identical operands, such as 4 x 4) have often been found to show better performance than non-tie problems (e.g., Campbell & Graham, 1985; Miller et al., 1984). Also, Campbell and Graham (1985) found that 5's multiplication problems showed better performance than would be expected on the basis of problem-size measures. For the MATHNET networks, however, neither the ties effect nor the 5's effect is apparent in the accuracy or reaction time data. Why these effects occur in human subjects, and whether they can be simulated in networks like MATHNET, remain to be determined.

Effects of damage

We have recently begun "lesioning" studies to determine whether MATHNET networks, when damaged, exhibit phenomena observed in brain-damaged patients with deficits in arithmetic fact retrieval. There are many different ways in which networks may be damaged -- deletion of units, deletion of connections, weakening of connection weights, addition of noise to the signals sent from unit to unit, and so forth. We have not yet explored these various types of damage systematically. Here we consider one particular form of damage, reporting preliminary results to suggest that damaging MATHNET networks can yield performance patterns resembling those observed in brain-damaged humans. Networks A, B, and C were each subjected to damage in which each connection weight was reduced in magnitude by a percentage sampled randomly from a normal distribution with a mean of 40% and a standard deviation of 10%. Thus, the average reduction in magnitude was 40%, but some weights were weakened to a lesser extent, and some to a greater extent. This extent of damage was chosen by trial-and-error to produce overall error rates comparable to the error rates of the brain-damaged patients studied recently by Sokol et al. (1991) and McCloskey, Aliminosa, and Sokol (in press). We refer to the three "lesioned" networks as networks A-L, B-L, and C-L. After damage, each network received 30 blocks of test trials on the 64 2-9's problems. The damaged networks were tested with the 16-iteration annealing schedule used in training, roughly simulating the non-speeded testing procedure
used in studies of brain-damaged patients with arithmetic fact retrieval deficits (e.g., McCloskey, Aliminosa, & Sokol, in press). The following discussion accordingly focuses on accuracy data; reaction time measures were not collected.

Impaired performance. As expected, randomly weakening the connection weights impaired the networks' performance. Whereas networks A, B, and C were virtually 100% correct with the 16-iteration annealing schedule, networks A-L, B-L, and C-L were only 69%, 86%, and 81% correct, respectively.

Non-uniformity of impairment. Like brain-damaged patients, the damaged networks evidenced non-uniform impairment across problems. Tables 6, 7, and 8 present the individual-problem error rates for the three damaged networks. It is apparent from the tables that for each network performance was severely impaired for some problems while remaining intact, or nearly so, for other problems. This result may seem surprising, given that the networks' representation of each fact is distributed across many units and connections. However, some units and connections may be more important for some facts than for others. Thus, disruption of a connection weight may have more serious consequences for some problems than for others.
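The damage procedure can be stated concretely. The sketch below is our reconstruction under stated assumptions, not the original code: each weight is shrunk by a percentage drawn from a normal distribution with mean 40 and standard deviation 10, and clamping the sampled reduction to the range 0-100% is our assumption.

    # Sketch of "lesioning" by random weakening of connection weights.
    import random

    def lesion(weights, mean_pct=40.0, sd_pct=10.0, seed=None):
        """Reduce each weight's magnitude by a percentage sampled from a
        normal distribution (mean_pct, sd_pct); the sign is preserved."""
        rng = random.Random(seed)
        damaged = []
        for w in weights:
            reduction = rng.gauss(mean_pct, sd_pct) / 100.0
            reduction = min(max(reduction, 0.0), 1.0)  # assumed clamp
            damaged.append(w * (1.0 - reduction))
        return damaged

    print(lesion([0.8, -1.2, 0.3, -0.5], seed=1))
    # Magnitudes shrink by roughly 40% on average; repeated lesions of the
    # same weights (different seeds) damage different facts to different
    # degrees, as in the A-L, B-L, and C-L networks.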
Table 6. Mean error percentages for the damaged network A-L on 2-9's multiplication problems (rows: first operand, 2-9; columns: second operand, 2-9).
Table 7. Mean error percentages for the damaged network B-L on 2-9's multiplication problems (rows: first operand, 2-9; columns: second operand, 2-9).
Table 8. Mean error percentages for the damaged network C-L on 2-9's multiplication problems (rows: first operand, 2-9; columns: second operand, 2-9).
Problem-size effects. The error rate data for the damaged networks also showed problem-size effects, although these effects were somewhat weak (especially for network C-L). As can be seen from Tables 6-8, large problems were more likely
than small problems to show impairment. However, there were many exceptions. For example, network A-L’s error rate was 100% on the small problem 2 x 3, and 0% on the large problems 8 x 7 and 9 x 7. Although brain-damaged patients also show exceptions to the generalization of greater impairment for large than small problems (see McCloskey, Aliminosa, & Sokol, in press; Sokol et al., 1991), the exceptions appear to be more numerous for the networks, especially with respect to instances of severe impairment for small problems. Error types. The error patterns for the damaged networks resemble in interesting ways those observed in brain-damaged patients with deficits in arithmetic fact retrieval. Table 9 presents the error distribution for the 7 patients in the McCloskey, Aliminosa, and Sokol (in press) study who were impaired in retrieval of 2-9’s multiplication facts. It is evident from the table that most of the patients presented with an error pattern similar to that of normal subjects: a high proportion of operand errors, with much lower proportions of table and non-table errors. However, two of the patients (FW and TM) evidenced a much higher than normal proportion of non-table errors (e.g., 7 x 8 = 59).
Table 9. Percentage of errors on 2-9's multiplication problems falling into each error category for seven brain-damaged patients.a

    Patient    Operand    Operation    Table    Non-Table
    CM            61          0          25        14
    FW            45          0          10        45
    IE            59          2          22        17
    MD            74          0          13        13
    PS            80          5           7         8
    SB            62          7          15        16
    TM            35          4          15        46

a From McCloskey, Aliminosa, & Sokol (in press).
A similar phenomenon was observed for the damaged networks. As shown in Table 10, networks B-L and C-L exhibited error distributions similar to those observed for normal subjects, and for most of the brain-damaged patients studied
by McCloskey, Aliminosa, and Sokol (in press). Like patients FW and TM, however, network A-L presented with a high proportion of non-table errors.

Table 10. Percentage of errors on 2-9's multiplication problems falling into each error category for the damaged networks.

    Network    Operand    Operation    Table    Non-Table
    A-L           47          0          19        34
    B-L           85          0           2        13
    C-L           72          0          10        18
Why the error distributions differed across damaged networks is not clear. However, it does not appear that the answer lies in differences among networks prior to damage. When we damaged network A a second time (applying the same random weakening method to the "normal" connection weights), we found an error distribution quite similar to that of networks B-L and C-L -- 69% operand errors, 15% table errors, and only 16% non-table errors. Further, a second lesioning of network B yielded a distribution resembling that obtained in the initial damaging of network A -- 59% operand errors, 14% table errors, and 27% non-table errors. Thus, the same lesioning method applied on different occasions to the same network can lead to substantially different error distributions. This interesting result raises the possibility that the observed differences in error distributions among brain-damaged patients do not reflect either premorbid differences in arithmetic fact retrieval mechanisms, or differences in the nature of the damage to these mechanisms.

Discrepancies between networks' and patients' performance. Although the results from the damaged networks show some interesting similarities to the performance of brain-damaged patients with arithmetic fact retrieval deficits, the correspondence is considerably less than perfect. First, as we have already mentioned, the problem-size effects exhibited by the networks do not appear to be as strong as those evidenced by most of the brain-damaged patients studied by McCloskey, Aliminosa, and Sokol (in press). A second human/network discrepancy concerns performance on pairs of "complementary" problems such as 3 x 8 and 8 x 3. McCloskey, Aliminosa, and
Sokol's (in press) patients consistently showed a close correspondence in error rate between complementary problems. This phenomenon is clearly apparent in the data from patient IE presented in Table 1 (see also Sokol et al., 1991). Although the damaged networks show some tendency toward similar error rates for complementary problems (see Tables 6-8), the tendency is relatively weak. Further, for each network at least one pair of complementary problems showed a very high error rate on one member of the pair, and a very low error rate on the other. For example, network C-L's error rate was 100% on 7 x 6, but 0% on 6 x 7. The implications of this human/network discrepancy remain to be explored. On the one hand, it may reflect a fundamental difference between humans and MATHNET networks in the representation of complementary facts. For example, in humans a single stored fact representation may underlie performance on both facts in a complement pair, whereas in MATHNET the representations of two complementary facts are apparently at least partially separate. On the other hand, it may be that both humans and MATHNET have at least partially separate representations for complementary facts, but humans make use of a strategy not available to MATHNET. In particular, a brain-damaged patient who is unable to retrieve a particular fact (e.g., 8 x 3 = 24) may attempt to circumvent this difficulty by retrieving the complementary fact (e.g., 3 x 8 = 24). This strategy will yield similar performance for both facts in a pair, even if the facts have separate representations and can be disrupted independently. If both facts are impaired, performance will be poor on both; if one or both facts are intact, performance will be good on both (see Sokol et al., 1991, for further discussion).

A third difference between MATHNET and (some) brain-damaged patients concerns error types. All of the networks' errors, at least as we have scored them, are "commission" errors -- that is, errors in which an incorrect number is produced. However, two of the brain-damaged patients studied by McCloskey, Aliminosa, and Sokol (in press) made large numbers of "omission" errors, in which they failed to give an answer to a problem (on grounds that they did not know, or could not remember, the answer). (The error distributions in Table 9 are for commission errors only.) The failure to simulate omission errors may perhaps be remedied simply by altering the scoring of network response patterns. For example, a "noisy" pattern of activation across an answer field (i.e., a pattern that does not match very closely the pattern for any of the quantities 0-9) might be scored as a failure to respond, and not as a particular quantity. (Noisy activation patterns, although rare for the "normal" networks, were not uncommon for the damaged networks.) However, the consequences of adopting such a scoring procedure have not yet been explored.
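The proposed scoring change could be prototyped as follows. The prototype patterns, the cosine similarity measure, and the 0.8 threshold are all assumptions chosen for illustration, not details of the MATHNET answer fields.

    # Sketch: scoring an answer-field pattern, with omissions for noisy output.
    def cosine(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        norm = (sum(a * a for a in u) * sum(b * b for b in v)) ** 0.5
        return dot / norm if norm else 0.0

    def score_field(activation, prototypes, threshold=0.8):
        """Return the best-matching digit, or None (an omission) when no
        prototype pattern matches the activation closely enough."""
        best = max(prototypes, key=lambda d: cosine(activation, prototypes[d]))
        return best if cosine(activation, prototypes[best]) >= threshold else None

    # Toy one-hot prototypes over a 3-unit field (digits 0-2 only):
    protos = {0: [1, 0, 0], 1: [0, 1, 0], 2: [0, 0, 1]}
    print(score_field([0.9, 0.1, 0.0], protos))  # 0: a clean pattern
    print(score_field([0.5, 0.5, 0.4], protos))  # None: noisy, scored as omission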
Discussion
Our initial results with MATHNET are promising, and suggest that this type of connectionist network deserves further exploration as a basis for modeling arithmetic fact retrieval. The MATHNET networks were readily able to learn the 2-9's multiplication facts. Further, the networks occasionally erred in retrieval of well-learned facts, and the types of errors made by the networks corresponded well to those made by human subjects. The networks also evidenced a problem-size effect: When small problems were presented more frequently than large problems during training, error rates and reaction times in "speeded" testing were lower for the former than for the latter. Finally, when damaged by random weakening of connection weights, MATHNET networks exhibited performance similar to that of brain-damaged patients with arithmetic fact retrieval deficits (i.e., non-uniform impairment, problem-size effects, and particular error patterns). To be sure, our work to date is preliminary, and much remains to be done before MATHNET will merit serious consideration as a model of arithmetic fact retrieval. In the following discussion, we sketch briefly the principal goals we plan to pursue in future work with MATHNET.

Evaluating the adequacy of the simulation

One major aim is to carry forward our initial efforts at assessing the extent to which the networks can simulate the arithmetic fact retrieval performance of human subjects. First, we obviously need to address the human/network discrepancies turned up in the preliminary work (e.g., the absence of a ties effect in the networks, and the apparently weak problem-size effects in the damaged networks). More generally, there is a need to examine phenomena in greater detail, and to explore more systematically the various means of producing the phenomena in MATHNET networks. For example, we need to investigate more thoroughly the consequences of different ways of damaging networks, and the impact of different forms of quantity representations on error patterns and problem-size effects.

Extending the model. It will also be important to extend MATHNET to a wider range of empirical phenomena. For example, we plan in the near future to expand the scope of the modeling to encompass arithmetic operations other than multiplication, and sets of problems (e.g., 0's addition and multiplication problems) that have typically been assumed to involve application of rules (e.g., N + 0 = N, N x 0 = 0). Another goal is simulation of performance in verification tasks (e.g., Does 4 x 7 equal 32?). Finally, we plan to compare the performance of networks
over the course of training to that of children learning arithmetic facts. Given previous results concerning catastrophic interference in connectionist networks (e.g., McCloskey & Cohen, 1989), an important developmental question is whether the disruption of previously-learned facts (e.g., the 2's multiplication facts) resulting from introduction of new facts (e.g., the 3's facts), and the amount of "maintenance rehearsal" needed to overcome this interference, are comparable for networks and children.

Additional empirical research. Evaluating the extent to which MATHNET networks can simulate the performance of human subjects will require not only additional work with the networks, but also further research with human subjects. To give just one example, we clearly need to know more about the development of arithmetic fact knowledge. For most models (including MATHNET), interpretations for phenomena such as problem-size effects (e.g., Ashcraft, 1987) or error patterns (e.g., Siegler, 1988) depend not only upon basic assumptions about cognitive representations and processes, but also upon assumptions about children's experience with arithmetic facts during learning (e.g., frequency or order of exposure to problems, types of errors made during learning). As we noted in an earlier section, however, relatively little detailed and specific information about this experience is available. As a consequence, it is often difficult to assess the implications of a model's successes or failures in accounting for phenomena. For example, in weighing the significance of the problem-size effects exhibited by the MATHNET networks, we need to know (among other things) whether the frequency-of-exposure differences between small and large problems we created in training the networks were comparable to the differences typically experienced by children. Unfortunately, however, the available data do not permit a determination on this issue. Thus, it is certainly conceivable that the frequency differences between small and large problems were greater for our networks than for children learning arithmetic, and that a training regimen more accurately reflecting children's experience would have produced little or no problem-size effect. Clearly, a better understanding of children's experience with arithmetic facts is needed in evaluating MATHNET's performance. (And, of course, the same is true with respect to other models.)

Theory development
We have suggested that MATHNET networks can reproduce some phenomena of normal and impaired human arithmetic fact retrieval performance, and may have the potential for simulating other phenomena as well. However, this does not mean that MATHNET constitutes an explanatory theory of arithmetic fact
retrieval, a theory that interprets the simulated phenomena. In the first place, we are far from fully understanding how the networks generate the interesting behaviors they exhibit. For example, we do not know what internal representations for arithmetic problems a network develops in the course of training, or whether these representations differ in significant ways across networks; we do not know exactly how the networks map an input activation pattern onto the correct output, or what factors lead to the occurrence of particular types of errors (e.g., non-table versus operand errors); and we do not know how damage to a network alters its representations or retrieval processes. Similarly, we do not know what aspects of our networks are important in determining their performance, and what aspects are irrelevant. For example, we cannot say with any certainty whether the performance of the networks would have been different if we had used a different number of hidden units, a different pattern of connections among units, or a different representation for problems and answers. For these reasons, we suggest that MATHNET should be viewed not as a theory that explains the simulated phenomena, but rather as something akin to an animal model (McCloskey, 1991).16 In work with an animal model, some animal system thought to share crucial features with a human system of interest is studied with the aim of shedding light on the human system. By studying the animal model rather than working directly with the human system, one may be able to carry out manipulations that could not be performed on human subjects (e.g., lesions to particular brain structures). The model system may also be somewhat simpler than the human system, and therefore more amenable to analysis. Thus, an animal model is not a theory, but rather an object of study. In work with an animal model the goal is to elucidate the structure and functioning of the animal system, and on this basis to formulate a theory of the corresponding human system. Of course, one does not assume that insights gained through study of the animal system will necessarily apply without modification to the human system. Rather, it is simply assumed that because the animal system may be similar in relevant respects to the human system, studying the former may aid in developing a theory of the latter. Similarly, a connectionist network that exhibits some of the phenomena observed for a cognitive process such as arithmetic fact retrieval may perhaps resemble in relevant respects the mechanisms underlying the process in humans. If by studying the network we can gain some understanding of its structure and functioning at a level relevant for cognitive theorizing, this understanding might aid in developing a theory of the human cognitive process (see McCloskey, 1991, for further discussion). Thus, a central aim of our future work with MATHNET is to analyze the networks' functioning. Networks like MATHNET are complex non-linear systems the dynamics of which are difficult to analyze and apprehend (e.g., Dyer, 1988; Grossberg, 1987; McCloskey, 1991; Minsky & Papert, 1988; Pavel, 1990; Rager, 1990). Nevertheless, a variety of techniques may be applied to gain at least a partial understanding of a network's functioning (e.g., Gorman & Sejnowski, 1988; Hanson & Burr, 1990; Sejnowski & Rosenberg, 1987). Further, the MATHNET networks are relatively small, and operate upon a well-defined and constrained set of facts. Hence, it may prove possible to elucidate to a substantial extent the MATHNET representations and retrieval processes. It is our hope that the insights gained thereby will provide the basis for specific theoretical proposals.

16 Other connectionist models of human cognitive processes (e.g., Seidenberg & McClelland, 1989) should also, we suggest, be viewed from this perspective (see McCloskey, 1991, for further discussion).
ACKNOWLEDGMENTS

The research reported in this chapter was supported by NIH grant NS21047. Address correspondence to Michael McCloskey, Department of Cognitive Science, Johns Hopkins University, Baltimore, MD 21218.

REFERENCES

Anderson, J.A. (1983). Cognitive and psychological computation with neural networks. IEEE Transactions on Systems, Man, and Cybernetics, 13, 799-815.
Anderson, J.A., Silverstein, J.W., Ritz, S.A., & Jones, R.S. (1977). Distinctive features, categorical perception, and probability learning: Some applications of a neural model. Psychological Review, 84, 413-451.
Anderson, J.A., Rossen, M.L., Viscuso, S.R., & Sereno, M.E. (1990). Experiments with representation in neural networks: Object motion, speech, and arithmetic. In H. Haken & M. Stadler (Eds.), Synergetics of cognition (pp. 54-69). New York: Springer-Verlag.
Ashcraft, M.H. (1982). The development of mental arithmetic: A chronometric approach. Developmental Review, 2, 213-236.
Ashcraft, M.H. (1983). Procedural knowledge versus fact retrieval in mental arithmetic: A reply to Baroody. Developmental Review, 3, 231-235.
Ashcraft, M.H. (1987). Children's knowledge of simple arithmetic: A developmental model and simulation. In C.J. Brainerd, R. Kail, & J. Bisanz (Eds.), Formal methods in developmental research (pp. 302-338). New York: Springer-Verlag.
Ashcraft, M.H. & Battaglia, J. (1978). Cognitive arithmetic: Evidence for retrieval and decision processes in mental addition. Journal of Experimental Psychology: Human Learning and Memory, 4, 527-538.
Ashcraft, M.H., Fierman, B.A., & Bartolotta, R. (1984). The production and verification tasks in mental addition: An empirical comparison. Developmental Review, 4, 157-170.
Baroody, A.J. (1983). The development of procedural knowledge: An alternative explanation for chronometric trends of mental arithmetic. Developmental Review, 3, 225-230.
Baroody, A.J. (1984). A reexamination of mental arithmetic models and data: A reply to Ashcraft. Developmental Review, 4, 148-156.
Campbell, J.I.D. & Clark, J.M. (1988). An encoding complex view of cognitive number processing: Comment on McCloskey, Sokol, and Goodman (1986). Journal of Experimental Psychology: General, 117, 204-214.
Campbell, J.I.D. & Graham, D.J. (1985). Mental multiplication skill: Structure, process, and acquisition. Canadian Journal of Psychology, 39, 338-366.
Caramazza, A. & McCloskey, M. (1987). Dissociations of calculation processes. In G. Deloche & X. Seron (Eds.), Mathematical disabilities: A cognitive neuropsychological perspective (pp. 221-234). Hillsdale, NJ: Erlbaum.
Clark, J.M. & Campbell, J.I.D. (1991). Integrated versus modular theories of number skills and acalculia. Brain and Cognition, 17, 204-239.
Cottrell, G.W. & Tsung, F.S. (1991). Learning simple arithmetic procedures. In J.A. Barnden & J.B. Pollack (Eds.), Advances in connectionist and neural computation theory. Volume 1: High-level connectionist models (pp. 305-321). Norwood, NJ: Ablex.
Dyer, M.G. (1988). The promise and problems of connectionism. Behavioral and Brain Sciences, 11, 32-33.
French, R.M. (1991). Using semi-distributed representations to overcome catastrophic forgetting in connectionist networks. CRCC Technical Report 51-1991, Indiana University, Bloomington, Indiana.
Geszti, T. (1990). Physical models of neural networks. Singapore: World Scientific.
Gonzalez, E.G. & Kolers, P.A. (1982). Mental manipulation of arithmetic symbols. Journal of Experimental Psychology: Learning, Memory, & Cognition, 8, 308-319.
Gonzalez, E.G. & Kolers, P.A. (1987). Notational constraints on mental operations. In G. Deloche & X. Seron (Eds.), Mathematical disabilities: A cognitive neuropsychological perspective (pp. 27-42). Hillsdale, NJ: Erlbaum.
Gorman, R.P. & Sejnowski, T.J. (1988). Analysis of hidden units in a layered network trained to classify sonar targets. Neural Networks, 1, 75-89.
Graham, D.J. (1990). Connectionist networks of cognitive arithmetic: Explorations in backpropagation architectures and training sequencing. Artificial Intelligence Memo, University of Otago.
Graham, D.J. & Schneider, W. (1988). Sequential learning in a connectionist model of mental arithmetic. Paper presented at the meeting of the Psychonomic Society, Chicago, Illinois.
Groen, G.J. & Parkman, J.M. (1972). A chronometric analysis of simple addition. Psychological Review, 79, 329-343.
Grossberg, S. (1987). Competitive learning: From interactive activation to adaptive resonance. Cognitive Science, 11, 23-63.
Hanson, S.J. & Burr, D.J. (1990). What connectionist models learn: Learning and representation in connectionist networks. Behavioral and Brain Sciences, 13, 471-518.
Harley, W.S. (1990). Associative memory in mental arithmetic. Unpublished doctoral dissertation, Johns Hopkins University.
Hertz, J., Krogh, A. & Palmer, R.G. (1991). Introduction to the theory of neural computation. Redwood City, CA: Addison-Wesley.
Hetherington, P.A. & Seidenberg, M.S. (1989). Is there "catastrophic interference" in connectionist networks? Proceedings of the 11th Annual Conference of the Cognitive Science Society (pp. 26-33). Hillsdale, NJ: Erlbaum.
Hinton, G.E. (1989). Deterministic Boltzmann learning performs steepest descent in weight-space. Neural Computation, 1, 143-150.
Kruschke, J.K. (in press). ALCOVE: An exemplar-based connectionist model of category learning. Psychological Review.
Lewandowsky, S. (1991). Gradual unlearning and catastrophic interference: A comparison of distributed architectures. In W.E. Hockley & S. Lewandowsky (Eds.), Relating theory and data: Essays on human memory in honor of Bennet B. Murdock (pp. 509-527). Hillsdale, NJ: Erlbaum.
Macaruso, P., McCloskey, M. & Aliminosa, D. (1991). The functional architecture of the cognitive numerical-processing system: Evidence from a patient with multiple impairments. Manuscript submitted for publication.
McCloskey, M. (in press). Cognitive mechanisms in numerical processing: Evidence from acquired dyscalculia. Cognition.
McCloskey, M. (1991). Networks and theories: The place of connectionism in cognitive science. Psychological Science, 2, 387-395.
McCloskey, M., Aliminosa, D. & Macaruso, P. (in press). Theory-based assessment of acquired dyscalculia. Brain and Cognition.
McCloskey, M., Aliminosa, D. & Sokol, S.M. (in press). Facts, rules, and procedures in normal calculation: Evidence from multiple single-patient studies of impaired arithmetic fact retrieval. Brain and Cognition.
McCloskey, M. & Caramazza, A. (1987). Cognitive mechanisms in normal and impaired number processing. In G. Deloche & X. Seron (Eds.), Mathematical disabilities: A cognitive neuropsychological perspective (pp. 201-219). Hillsdale, NJ: Erlbaum.
McCloskey, M., Caramazza, A. & Basili, A.G. (1985). Cognitive mechanisms in number processing and calculation: Evidence from dyscalculia. Brain and Cognition, 4, 171-196.
McCloskey, M. & Cohen, N.J. (1989). Catastrophic interference in connectionist networks: The sequential learning problem. In G.H. Bower (Ed.), The psychology of learning and motivation: Volume 24 (pp. 109-165). San Diego: Academic Press.
McCloskey, M., Harley, W. & Sokol, S.M. (1991). Models of arithmetic fact retrieval: An evaluation in light of findings from normal and brain-damaged subjects. Journal of Experimental Psychology: Learning, Memory, & Cognition, 17, 377-397.
Miller, K.F. & Paredes, D.R. (1990). Starting to add worse: Effects of learning to multiply on children's addition. Cognition, 37, 213-242.
Miller, K., Perlmutter, M. & Keating, D. (1984). Cognitive arithmetic: Comparison of operations. Journal of Experimental Psychology: Learning, Memory, & Cognition, 10, 46-60.
Minsky, M.L. & Papert, S.A. (1988). Perceptrons: An introduction to computational geometry (Expanded edition). Cambridge, MA: MIT Press.
Parkman, J.M. (1972). Temporal aspects of simple multiplication and comparison. Journal of Experimental Psychology, 95, 437-444.
Parkman, J. & Groen, G. (1971). Temporal aspects of simple addition and comparison. Journal of Experimental Psychology, 89, 335-342.
Pavel, M. (1990). Learning from learned networks. Behavioral and Brain Sciences, 13, 503-504.
Peterson, C. & Anderson, J.R. (1987). A mean field theory learning algorithm for neural networks. Complex Systems, 1, 995-1019.
Peterson, C. & Hartman, E. (1989). Explorations of the mean field theory learning algorithm. Neural Networks, 2, 475-494.
Rager, J.E. (1990). The analysis of learning needs to be deeper. Behavioral and Brain Sciences, 13, 505-506.
Seidenberg, M.S. & McClelland, J.L. (1989). A distributed, developmental model of word recognition and naming. Psychological Review, 96, 523-568.
Sejnowski, T.J. & Rosenberg, C.R. (1987). Parallel networks that learn to pronounce English text. Complex Systems, 1, 145-168.
Siegler, R.S. (1988). Strategy choice procedures and the development of multiplication skill. Journal of Experimental Psychology: General, 117, 258-275.
Siegler, R.S. & Shrager, J. (1984). A model of strategy choice. In C. Sophian (Ed.), Origins of cognitive skills (pp. 229-293). Hillsdale, NJ: Erlbaum.
Sokol, S.M., Goodman-Schulman, R. & McCloskey, M. (1989). In defense of a modular architecture for the number-processing system: Reply to Campbell and Clark. Journal of Experimental Psychology: General, 118, 105-110.
Sokol, S.M., McCloskey, M., Cohen, N.J. & Aliminosa, D. (1991). Cognitive representations and processes in arithmetic: Inferences from the performance of brain-damaged patients. Journal of Experimental Psychology: Learning, Memory, & Cognition, 17, 355-376.
Stazyk, E.H., Ashcraft, M.H. & Hamann, M.S. (1982). A network approach to mental multiplication. Journal of Experimental Psychology: Learning, Memory, & Cognition, 8, 320-335.
Stornetta, W.S. & Huberman, B.A. (1987). An improved three-layer, back propagation algorithm. In E. Caudill & C. Butler (Eds.), Proceedings of the IEEE first international conference on neural networks, Vol. II (pp. 637-644). San Diego, CA: SOS Printing.
Viscuso, S.R., Anderson, J.A. & Spoehr, K.T. (1989). Representing simple arithmetic in neural networks. In G. Tiberghien (Ed.), Advances in cognitive science. Volume 2: Theory and applications (pp. 141-164). New York: Wiley.
Warrington, E.K. (1982). The fractionation of arithmetical skills: A single case study. Quarterly Journal of Experimental Psychology, 34A, 31-51.
Chapter 11
INHIBITORY MECHANISMS IN NORMAL AND DYSFUNCTIONAL NUMBER PROCESSING

James M. Clark
University of Winnipeg
Summary
Inhibitory mechanisms can play central roles in associative theories of number processing, including context-sensitive activation of overlapping associative networks and exaggeration of differences between competing nodes. I review conceptual and empirical considerations that implicate inhibitory mechanisms in these important functions, not only in number processing but also in other cognitive domains. Many interference and related phenomena are shown to be consistent with the proposed roles for inhibition. Central roles for inhibition have many implications for such correlates of normal and dysfunctional number processing as physiological and psychological indicators of inhibitory functioning, brain damage, childhood development, and aging. More generally, associative theories based on inhibition and excitation encourage the development of unified mechanistic explanations soundly rooted in diverse empirical phenomena.

Introduction and background
This paper examines empirical and rational support for the hypothesis that inhibition serves major roles in the associative mechanisms that underlie number processing, and proposes some novel implications of that hypothesis for correlates of normal and dysfunctional performance on numerical tasks. The major roles discussed are context-sensitive activation of problem answers and, more briefly, enhancing differences in activation between competing answers. The implications have to do with such inhibition-related correlates of number processing as age and brain damage.
In the main part of the paper, I show that inhibitory associative structures based purely on operand-driven activation can selectively activate correct answers, even for ambiguous problems whose operands are both connected to more than one answer (e.g., 4 and 6 are connected to 12, 24, and 36). Operand-driven selective activation has important implications for number processing phenomena and for the assumption that entire number problems are represented by distinct configural mental codes (e.g., Campbell & Graham, 1985). These configural representations, also called gestalt or holistic codes, are presumed by their proponents to exist in addition to separate representations for the components of the problem (i.e., operands, operators, and answers). A second section examines a related role for inhibition, to enhance existing differences in levels of activation. Inhibitory mechanisms can exaggerate even small differences in activation levels, often leading to the complete suppression of what would otherwise be strongly competing responses. I then review what the hypothesized prominence of inhibitory mechanisms implies about individual differences and other correlates of normal and dysfunctional number processing. Variables from quite diverse areas of psychology are shown to implicate inhibitory mechanisms, and hence to have potential relevance to arithmetic and other numerical tasks. Before focusing on these inhibitory processes, I briefly describe the basic characteristics of associative models, especially as they relate to number processing.

Associative models of number processing

In essence, associative models of number processing posit mental representations for digits, number-words, operators (e.g., x, +), and other number-related entities. These representations are interconnected in a complex network that permits codes to collectively influence each other's levels of activation. Excitation and inhibition spread in parallel among the individual units to produce the successive patterns of activation that underlie number-related experiences and behavior. To illustrate, stimulation of the codes for 9, x, and 3 would activate other relevant codes via excitatory connections, including verbal codes for "nine," "three," and "twenty-seven." In addition to these excitatory connections, some associative models include inhibitory associations that can reduce activation in connected codes. For example, the codes for such competing verbal responses as "eight," "twelve," and "eighteen" might be suppressed when the codes for 9, x, 3, and their positively related codes are activated.
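To make the spreading-activation idea concrete, here is a minimal update rule. This is a generic interactive-activation sketch, not a model from this chapter or the cited literature; the weights, decay rate, and clipping to the 0-1 range are illustrative assumptions.

    # Sketch: one parallel step of spreading excitation and inhibition.
    def step(activations, connections, decay=0.1):
        """activations: {node: level}; connections: {(src, dst): weight}.
        Positive weights excite the destination node; negative weights inhibit it."""
        net = {node: 0.0 for node in activations}
        for (src, dst), w in connections.items():
            net[dst] += w * activations[src]
        return {node: max(0.0, min(1.0, (1 - decay) * level + net[node]))
                for node, level in activations.items()}

    # Stimulating 9, x, and 3 excites "27" while inhibiting a competitor "12":
    acts = {"9": 1.0, "x": 1.0, "3": 1.0, "27": 0.0, "12": 0.2}
    conns = {("9", "27"): 0.3, ("3", "27"): 0.3, ("x", "27"): 0.2,
             ("9", "12"): -0.2, ("3", "12"): -0.2}
    print(step(acts, conns))  # "27" rises to 0.8; "12" is driven to 0.0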
The basic assumption of associative models is that spreading excitation and inhibition among a suitably conceptualized network of representations will reproduce relevant findings in human behavior and experience, for example, production of the correct answer on most trials, various types of errors, and latency data. Strong versions strive to explain number processing purely in terms of a limited set of mental representations, associative connections, and spreading excitation or inhibition. Associative models can be contrasted with models cast in terms of such high-level symbolic processes as strategies and rules (e.g., IF 3 and 7 THEN 21). Advocates of associative models recognize that much human behavior can be described at these higher levels of abstraction, but constrain their psychological explanations to purely mechanistic processes and try to avoid propositions and assumptions that cannot, at present anyway, be characterized in terms of identifiable concrete mechanisms. This emphasis on basic mechanisms also means that associative models involve domain-independent processes, as opposed to rules or procedures designed for specific domains, such as number processing. Finally, the associative perspective presented here emphasizes explicit connections among coherent representations for meaningful objects and events (e.g., number words, images of digits), as opposed to the distributed representations of some neural net approaches. This associative framework is the foundation for the specific functions and implications of inhibitory processes examined in this paper. In general, I argue on both rational and empirical grounds that inhibitory associative mechanisms are necessary to explain certain phenomena in number processing or, at the very least, explain the phenomena more parsimoniously and elegantly than models that ignore inhibition. Important roles for inhibition in theories of number processing are expected, of course, given the ubiquitous presence in the brain of such inhibitory neurotransmitters as GABA (Roberts, 1987), and the increasing use of inhibitory theoretical mechanisms to explain phenomena in selective attention (Tipper, 1985; Walley & Weiden, 1973), animal learning (Williams & Clark, 1992), aging (Hasher & Zacks, 1988), childhood development (Bjorklund & Harnishfeger, 1990), and diverse other areas of psychology including number processing (Campbell & Clark, 1989). The specific functions and implications of inhibitory mechanisms are considered here detached from other essential and often unresolved features of associative models; hence, the present chapter does not provide a complete associative model for number processing, and is sometimes quite speculative even about the isolated mechanisms that are discussed. More fully developed associative models are available for such numerical tasks as multiplication (e.g., Campbell & Oliphant,
this volume), and some of the general issues related to this and contrasting classes of theory have been examined elsewhere (Campbell & Clark, 1989; Clark & Campbell, 1991). The primary role of inhibition emphasized here is providing context-sensitive or selective activation of responses, a demanding task in the highly overlapping associative networks involved in number processing.

Inhibition and context-sensitive activation

A major challenge for associative models of number processing is to construct associative networks such that correct answers are activated primarily when the appropriate operands occur together. To achieve this specificity of activation, the associative networks that underlie number processing must be highly context sensitive or selective. The requirement that spreading activation be context sensitive is demonstrated most clearly in the case of ambiguous problems whose operands are both connected to more than one shared answer. Because 4 and 8 are each connected to 24 as well as to 32, for example, simple excitation from 4 and 8 would activate both answers equally. Operand-based activation must somehow be context sensitive in order to produce the correct answer; that is, 4 should activate 24 when 6 is the co-operand and 32 when 8 is the co-operand. Although ambiguous problems provide the starkest examples of context-sensitivity, the fact that all numerical operations involve associations among a small set of operands means that selective activation is important even for problems that are not completely ambiguous. The multiplication facts 3 x 5 = 15 and 3 x 6 = 18, for example, would be stored well by a network in which activation from 3 spreads primarily to 15 when 5 is the co-operand of 3 and to 18 when 6 is the co-operand of 3. Similar conditional activation would be required for the rest of the 3 times-table and for the times-tables for other multiplication facts. A third example of the need for context-sensitivity arises from different arithmetic operations that contain the same operands, and even the same operand combinations. To illustrate, 3 x 5 should activate 15, whereas 3 + 5 should activate 8. Thus the destination (i.e., answer) to which activation spreads from the operands and their combination depends in large part upon the presence of a specific arithmetic operator, which presumably acts as a contextual or modulating stimulus that steers activation from the operands to the appropriate response. These three examples demonstrate the necessity of context-sensitive activation in associative models for arithmetic facts. Alternative methods to provide such contingent activation are discussed later (e.g., unique composite representations for
problems, as in Campbell & Graham, 1985, or of problems and answers, as in Campbell & Oliphant, this volume), but I first want to examine whether and how pure associative networks that posit only operand and answer representations could achieve the requisite selective activation. Inhibitory mechanisms in fact provide a powerful tool for generating just the sort of context-sensitive activation required to model the preceding behaviors and other examples of conditional responding.

A disinhibition model for context-sensitive activation
Context sensitivity is a generic problem in many areas of psychology. One area in which the basic associative mechanisms underlying selective activation have been investigated is Pavlovian conditioning. There are a variety of conditioning procedures in which training and learning are highly dependent on context, and such tasks show surprising relevance to the associative processes of interest here. I describe briefly two types of conditional learning, occasion setting and patterning, and an inhibition-based mechanism to explain such learning (for further discussion, see Williams & Clark, 1992). In occasion setting, animals are exposed to a conditioned stimulus (CS) that is only followed by an unconditioned stimulus (US) when the CS itself is preceded by a third stimulus, called a feature (F). The US does not follow a CS that occurs without F. This conditional learning can be described notationally as F -> CS+ versus CS-, where -> indicates the temporal relation between F and the CS, and + and - indicate the presence and absence of the US. After training, animals emit the conditioned response in the presence of F -> CS and do not emit the response in the presence of CS alone. A variety of phenomena have demonstrated that F is qualifying the stimulus properties of the CS, and not simply controlling the response directly; for example, F can be a positive occasion setter for one CS and a negative occasion setter whose absence signals the US for a second CS. Thinking of the CS as one operand (e.g., 8), F as a second operand (e.g., 4), and the response as the correct answer (32) demonstrates that occasion setting involves exactly the kind of context-sensitive activation required to learn arithmetic facts. Patterning (or configural learning) is a second type of conditioning that also shows striking relevance to the associative structures that underlie individual number facts. In positive patterning, animals learn to respond when both elements in a compound stimulus occur and to suppress responding when individual elements in the compound occur alone. This form of learning is symbolized as AB+, A-, and B-. Animals readily learn such conditional responding. If we think of A and B as operands and the answer as the correct response, patterning again
shows just the kind of context-sensitive activation involved in ambiguous or other arithmetic problems. Although a variety of models have been proposed to explain feature setting and patterning (several relevant to analogous models for number processing are mentioned later), I focus here on an inhibition approach suggested recently by Williams and Clark (1992). Simple associative networks to correctly perform occasion setting and patterning are presented in Figures 1A and 1B. Two components of the highly similar models are critical for producing selective activation. First, the models assume that CSs acquire both excitatory and inhibitory connections to mental representations of the US (the mental code for the US is hypothesized to elicit the response). Theoretically, inhibitory connections develop because there are identifiable occasions when CSs are not followed by the US. The second component of the model is a modulating stimulus (F in occasion setting or the other stimulus in patterning) that inhibits the CS->US inhibitory pathway. This disinhibition reveals the excitatory connection that was previously masked by the now-suppressed inhibitory connection. Williams and Clark (1992) suggested that disinhibition models could explain various context effects in animal learning and reviewed some findings consistent with the proposed models.
Figure 1. Disinhibition models for occasion setting (panel A) and patterning (panel B).
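The circuits in Figure 1 can be given a simple numerical form. In the sketch below (with weights invented to make the conditional behavior visible), the CS has parallel excitatory and inhibitory routes to the US code, and the feature F gates off the inhibitory route:

    # Sketch of the Figure 1 disinhibition circuit for occasion setting.
    def us_activation(cs_on, f_on, excite=1.0, inhibit=1.0, gate=1.0):
        """CS -> US excitation runs in parallel with a CS -> US inhibitory
        pathway; F suppresses (gates off) the inhibitory pathway."""
        cs = 1.0 if cs_on else 0.0
        f = 1.0 if f_on else 0.0
        inhibition = inhibit * cs * max(0.0, 1.0 - gate * f)
        return max(0.0, excite * cs - inhibition)

    print(us_activation(cs_on=True, f_on=False))  # 0.0: CS alone, no response
    print(us_activation(cs_on=True, f_on=True))   # 1.0: F -> CS, response

Positive patterning follows from the same arrangement with the two compound elements in symmetrical roles, each gating the other's inhibitory pathway, so that only the AB compound drives the response.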
Disinhibition is an excellent way to conceptualize and model forms of context-sensitive activation in addition to those that occur in Pavlovian conditioning. To illustrate with an everyday example, people who regularly drive different cars must produce vehicle-specific acts to control the lights, the wipers, and other functions. Perfect responding would require highly selective associative mechanisms, and even minimal performance would involve considerable contextual control over responding. The ability to switch between appropriate sets of responses, including occasional errors, is modelled elegantly by the disinhibition hypothesis, which maintains for this example that vehicle-specific responses are inhibited unless released by cues that one is in the appropriate car. Clearly any inferred relation between human number processing and such basic behaviors as Pavlovian conditioning or even simple motor responses must be drawn carefully. Nonetheless, the conditional learning phenomena described here are isomorphic to context-sensitive effects in multiplication and other number facts, suggesting that disinhibition might help to explain selective activation in arithmetic.

Inhibition and disinhibition of arithmetic answers
Thinking about how calculation facts are learned suggests that inhibitory mechanisms may develop in precisely the ways just described to explain context effects in Pavlovian conditioning. Multiplication and other arithmetic operations are learned initially by exposure to pairs of operands and their answers, sometimes with prior production of the answer. That is, people study such problems as 3 x 5 = 15 or 3 x 5 = ?, and subsequently receive feedback on the correctness of their answers. It seems evident that one effect of repeated trials will be to strengthen excitatory associations between the individual operands and the correct answer. Presentation of the problem 3 x 5 = 15, for example, will strengthen the connections between 15 and the operands 3 and 5. Presentations of 3 and 5 thereafter result in spreading excitation from the presented operands to the answers with which the operands have been paired.

Inhibition of incorrect answers. What may be less evident than these positive connections is that such experiences also support learning something about the relation between each operand and the answers for multiplication problems that are not correct on the current trial. The problem 3 x 5 = 15, for example, informs the system not only that 3 should activate 15, but also that 3 should not activate other table-related answers (i.e., 6, 12, 18, ...), at least not when 5 is also present. Ignoring for the moment the conditional nature of the preceding statement, this analysis suggests that the mental network will acquire or strengthen negative connections between each operand and answers that are incorrect on the current
trial. This suppression is especially important for answers to table-related problems involving each of the operands (e.g., 3 x 6 = 18, 6 x 5 = a), because these related answers receive any unsuppressed excitation emitted by the individual operands. That is, it would be particularly useful if this irrelevant excitation were suppressed when the appropriate co-operands were not present. Negative associations between operands and incorrect responses directly implicate inhibitory processes. Figure 2A presents an idealized representation for part of a network in which each operand acquires both excitatory and inhibitory connections to table-related answers. The inhibitory connection depicted in Figure 2A seems intuitively reasonable. There are innumerable occasions on which 3 occurs in the absence of 15 and when production of 15 would be incorrect. Indeed the vast majority of times that individual operands such as 3 occur, they involve answers other than the correct answer for one particular problem (15 in the present example). These numerous "negative" trials provide many opportunities for inhibitory learning between operands and answers, and it would perhaps be surprising if such inhibition did not form, especially for incorrect answers from the same times-table. A . Parallel Excitation + Inhibition
Figure 2. Inhibition-based models for multiplication. (Diagram not reproduced; each panel links the operand 3 to the answer 15. A: parallel excitation + inhibition; B: excitation + two-step inhibition; C: co-operand disinhibition of answer.)
Although Figure 2A and most of the remaining discussion characterize inhibition in terms of a single connection from the operand to the answer, the detailed underlying mechanism is almost certainly more complex. In particular, there are probably one or more hidden layers between the operands and answers. Figure 2B demonstrates one alternative network that for present purposes is functionally equivalent to the simple network in 2A. Figure 2B shows a forward inhibition network in which excitation from the operand activates an intervening inhibitory node, which in turn depresses activity in the answer node. This forward inhibition version of the disinhibition model shows how a single excitatory source (an operand) could have both excitatory and inhibitory effects; this version is the basis of a later simulation. Other multi-node models are possible (e.g., reciprocal inhibition), and these different instantiations of the present theory might eventually lead to distinct predictions about number processing. Irrespective of the specific underlying connections and many other unresolved issues (e.g., the range of inhibited representations, variations in the strength of inhibition), the general hypothesis that inhibitory as well as excitatory functional connections develop between operands and answers has profound implications for the production of correct answers.

Disinhibition of correct answers. Given inhibitory connections between individual operands and same-table answers, it seems likely and perhaps essential that this inhibition be suppressed or overridden when the answer is correct; that is, the excitatory connection should predominate when the appropriate other operand is present. Under such circumstances, inhibition of inhibition (i.e., disinhibition) would play a major role in activating the correct answer. Figure 2C illustrates a disinhibition mechanism of the sort hypothesized here. The presentation of 5 inhibits the inhibition from 3 to 15, which results in the dominance of the excitatory connection. Analogous connections exist between 5 and 15; that is, 5 has both excitatory and inhibitory connections to 15, and the latter is in turn inhibited by the presence of 3. The net effect of an appropriate balance of excitation and inhibition would be that 3 and 5 would excite 15 when both operands were present, and would have less impact on 15 when either operand was absent.

Rational analysis of arithmetic problems thus leads to an associative network that is isomorphic to those proposed by Williams and Clark (1992) for context-sensitive conditioning, as seen by comparing Figures 1 and 2. The notion of disinhibition thus provides an elegant explanation for the fact that activation must be highly conditional (i.e., context-dependent) in associative networks for number facts. At least in principle, disinhibitory networks could approach the ideal of 5 activating 15 only when 3 is present, 20 only when 4 is present, and so on, and could do so without positing special configural representations for problems.
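Because the two-step network in Figure 2B is the basis of the later simulation, it may help to see why it is functionally equivalent to the signed direct connection in Figure 2A. The following sketch is illustrative only; all weights and names are assumptions. Routing operand excitation through an inhibitory node produces the same net input to the answer as a single negative weight.

    # Figure 2A: parallel excitatory and inhibitory connections to the answer
    def net_input_direct(a_op, w_exc=1.0, w_inh=-0.2):
        return a_op * w_exc + a_op * w_inh

    # Figure 2B: the operand excites an intervening inhibitory node,
    # which in turn depresses the answer node
    def net_input_forward(a_op, w_exc=1.0, w_to_inhib=1.0, w_inhib_to_ans=-0.2):
        a_inhib = a_op * w_to_inhib
        return a_op * w_exc + a_inhib * w_inhib_to_ans

    assert net_input_direct(10.0) == net_input_forward(10.0)   # both equal 8.0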
The plausibility of these mechanisms is further enhanced by their capacity to cope with such extreme context dependencies as ambiguous multiplication problems.

Ambiguous problems

As noted earlier, one challenge for simple associative models of arithmetic is
that multiplication, addition, and other number facts involve networks in which some pairs of operands are connected to more than one shared answer, which seems to render such problems ambiguous for models that assume all activation arises from separate operand and operator codes. For example, 4 x 8 is ambiguous because 4 and 8 share the incorrect answer 24, as well as the correct answer 32. There are 10 such shared-answer problems, and they result from the five multiplication answers that are correct for more than one problem (see Table 1A). Given these shared correct answers, pairs of operands can both be linked to more than one answer, albeit by different problems. Table 1B shows the 10 ambiguous problems (ignoring operand order and ties) and their incorrect answers. Note that ambiguous problems and shared-answer problems only partially overlap, having four problems in common. Problem ambiguity is even more acute in addition, where most pairs of operands are connected to more than one common answer.

The ambiguous multiplication problems in Table 1 cannot be solved by simple associative models that posit only excitatory connections between representations for single operands and their answers. Summation of excitation from the individual operands (e.g., 4 and 8) fails to produce the correct response because 24 and 32 are equally activated. Because people do learn ambiguous number facts, the simple model cannot be correct, and some additional mechanism must produce greater activation of the correct response in ambiguous problems. Interestingly, the disinhibitory associative networks presented in the previous section do differentially activate correct answers for ambiguous problems.

Disinhibition and ambiguous problems. Figure 3 shows disinhibition associative networks for the 4 and 8 times-tables, including competing responses connected to both operands. Presentation of 4 x 8 results in combined excitation and inhibition of the answers in the 4 and 8 times-tables (i.e., the answers connected to the individual operands 4 and 8). By themselves, these associations might result in minimal activation of particular answers, including 24 and 32, because the excitatory and inhibitory inputs would cancel one another, assuming comparable strengths. The critical mechanism for resolving problem ambiguity in Figure 3 is
disinhibition. Specifically, the operand 4 suppresses the inhibition between 8 and 32, and the operand 8 suppresses the inhibition between 4 and 32. These mutual disinhibitory connections result in 32 being activated more than 24 when 4 and 8 occur together. The question marks for tie problems (i.e., 4 x 4 and 8 x 8) reflect doubts about whether the proposed mechanisms apply to ties (see later discussion).

Table 1. Shared-Answer and Ambiguous Problem Sets
A. Shared-Answer Problem Set

Answer   Problems
12       2 x 6*, 3 x 4*
16       2 x 8,  4 x 4
18       2 x 9,  3 x 6*
24       3 x 8,  4 x 6*
36       4 x 9,  6 x 6

B. Ambiguous Problem Set (Correct Answer Marked by Underscores)

Problem   Multiple Answers
2 x 3     _6_, 12, 18
2 x 4     _8_, 12, 16
2 x 6*    _12_, 18
3 x 4*    _12_, 24
3 x 6*    12, _18_, 24
3 x 9     18, _27_
4 x 6*    12, _24_, 36
4 x 8     16, 24, _32_
6 x 8     24, _48_
6 x 9     18, 36, _54_
* Problems that appear in both sets.

To demonstrate that the disinhibition model does indeed work, a primitive simulation was prepared for the problems 4 x 6 and 4 x 8 using the two-stage model shown in Figure 2B. The program in essence computed activation levels of the answers 24 and 32 first as a direct function of the activation levels of the individual operands 4, 6, and 8. For this step, problem operands (e.g., 4 and 6 or 4 and 8) were simply set to a fixed 10 units of activation; operands in turn activated any answer connected to the individual operands by that amount on each cycle. These direct inputs activated 24 and 32 equally.
Figure 3. A disinhibition model for the ambiguous multiplication problem 4 x 8.

Each operand also produced an answer-specific negative component that depended on three factors: (a) the activation level of the primary operand (10 units as stated above), (b) a forward inhibition weighting factor (which was 0 or -.2 for the output shown here), and (c) a disinhibition weighting factor (also 0 or -.2 here) driven by the activation level of the appropriate co-operand for each answer (i.e., the operand 4 for the 8-to-32 connection and the operand 8 for the 4-to-32 connection). In essence, activation of the co-operand negated some of the inhibition from the operand to the answer. Activation of operands was simply "turned on" and additive growth in the activation of the answers 24 and 32 was measured over 15 cycles. For those interested in the details of this crude simulation, the short program used is presented in the Appendix.

The results of the simulation appear in Figure 4. The heavy solid line in Figure 4 shows the equivalent activation level achieved by the answers 24 and 32 when 4 and 8 were activated and the forward inhibition and disinhibition parameters both equalled zero (i.e., INH = 0). This curve simply demonstrates the ambiguity of such problems for simple associative models.
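The Appendix program is not reproduced in this excerpt, so the following Python sketch is a reconstruction from the description above rather than the original code. The exact form of the update rule (a multiplicative gate on the forward inhibition), the normalization by the 10-unit operand level, and all identifiers are assumptions, chosen so that the output matches the modest separation reported below; the original program may differ in detail. The li parameter anticipates the lateral inhibition introduced in a later section and is 0 for now.

    # Reconstruction (assumed update rule) of the two-stage simulation:
    # the answers 24 and 32 accumulate operand input over 15 cycles.
    def simulate(inh=0.0, dis=0.0, li=0.0, cycles=15):
        op_act = {4: 10.0, 8: 10.0, 3: 0.0, 6: 0.0}    # 4 x 8 presented
        # (operand, co-operand) links: 32 via 4 x 8 and 8 x 4;
        # 24 via 4 x 6 and 8 x 3 (co-operands 6 and 3 are absent)
        links = {32: [(4, 8), (8, 4)], 24: [(4, 6), (8, 3)]}
        act = {24: 0.0, 32: 0.0}
        for _ in range(cycles):
            new = dict(act)
            for ans, conns in links.items():
                for op, co in conns:
                    # forward inhibition (inh < 0) is itself suppressed in
                    # proportion to the co-operand's activation (dis < 0)
                    gate = 1.0 + inh * (1.0 + dis * op_act[co] / 10.0)
                    new[ans] += op_act[op] * gate
            # mutual inhibition between the competing answers (0 until later)
            new[24] += li * act[32]
            new[32] += li * act[24]
            act = new
        return act

    print(simulate())                        # {24: 300.0, 32: 300.0}: ambiguity
    print(simulate(inh=-0.2, dis=-0.2))      # approximately {24: 240, 32: 252}

Under these assumptions the correct answer 32 ends 12 units (about 5%) ahead of 24 after 15 cycles, in line with the separation described in a later section.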
The interesting results that demonstrate how forward inhibition and disinhibition can disambiguate such problems are the curves for INH = -.2. A small amount of forward inhibition (-.2), partly disinhibited by the other operand (-.2), somewhat separates the growth curves for 24 and 32, with the answer 32 increasing in activation faster than the answer 24 and achieving a slightly higher ultimate level of activation. An identical value (-.2) was used for both the inhibitory and disinhibitory connections for convenience; the particular value has no special importance. The slight amount of separation shown could be increased by manipulating inhibitory components of the disinhibition model, but modest separation was retained deliberately to emphasize the benefits of lateral inhibition presented later.
Figure 4. Simulated 4 x 8 activation of 32 and 24 by levels of answer inhibition and co-operand disinhibition (INH = 0 or -.2). (Plot not reproduced; axes were activation in arbitrary units, 0-250, by processing cycle, 0-15.)
That inhibitory mechanisms can produce differences in levels of activation for ambiguous problems supports the proposed inhibition and disinhibition
mechanisms. Analogous mechanisms can model other context effects in number processing, such as competition between different number operations.
Inter-operation competition

The concept of disinhibition as just described can be used to build associative networks that are sensitive to the arithmetic operator accompanying the operands. The numbers 4 and 8, for example, are associatively related to 12 in the addition associative network and to 32 in the multiplication network. Which of these answers is activated should depend on the operator that occurs explicitly or implicitly with the operands (plus, +; times, x). Presence of + should activate 12, and presence of x should activate 32. The disinhibition view encourages the inverse conceptualization of this issue; that is, + should inhibit the answer 32 and x should inhibit the answer 12. Such a network is readily realized by a disinhibition model. Figure 5 shows one simple disinhibitory network that would permit 4 to activate 12 when both 8 and + accompany 4, and to activate 32 when both 8 and x accompany 4. This particular network builds on the network shown in Figure 3. In essence, inhibition functions as an "AND" gate, permitting excitation to pass only when one or more conditions are met (i.e., the appropriate operator), in addition to the presence of the primary stimuli. A similar network would exist between the operand 8 and the two answers, but with 4 now acting as the modulatory context.
Figure 5. A disinhibition model for selective inter-operation activation.

The idealized model presented in Figure 5 is undoubtedly an oversimplification of actual arithmetic networks, ignoring as it does the relative strengths of the various connections and effects due to the order in which number operations are learned. Moreover, other inhibitory networks can probably be constructed to achieve this same specificity of activation. But the incompleteness and the specific details of the network are not important; primarily it demonstrates in principle that suitable inhibitory and disinhibitory associative connections can indeed produce
selective activation of different number facts depending on what operands and mathematical operators are present.
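The operator-gating idea can be made concrete with a toy calculation. This is an illustrative sketch, not the chapter's model, and the uniform weights are assumptions: each operator suppresses the answer belonging to the competing operation, so the summed operand excitation survives only for the answer appropriate to the presented sign.

    # Hypothetical "AND gate" for 4 and 8 with an explicit operator:
    # x inhibits the addition answer 12; + inhibits the multiplication answer 32.
    def answer_activation(operator):
        excite = {12: 2.0, 32: 2.0}          # 4 and 8 excite both answers equally
        inhibit = {12: -2.0 * (operator == "x"),
                   32: -2.0 * (operator == "+")}
        return {ans: excite[ans] + inhibit[ans] for ans in (12, 32)}

    print(answer_activation("x"))   # {12: 0.0, 32: 2.0} -> 32 dominates
    print(answer_activation("+"))   # {12: 2.0, 32: 0.0} -> 12 dominates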
Empirical considerations

Strong tests of the proposed mechanisms are complicated by the hidden nature of the inferred structure and the existence of alternative possible networks that could effect inhibition and disinhibition of answers (e.g., Figure 2). Nonetheless, the disinhibition model is consistent with some findings and suggests specific mechanisms that could underlie various phenomena in number processing.

Ambiguous problems. One obvious place to look for support for the disinhibition hypothesis is the set of ambiguous problems, since such problems provide a notable example of context-sensitive activation. Several observations suggest that ambiguous problems are particularly difficult, although many factors, including differential practice, complicate comparisons across problem types. Campbell and Graham (1985) reported that one of the most difficult multiplication problems in their set was 4 x 8, and they attributed this to the fact that both operands were connected to more than one answer; the most common error was 24, which is connected to both 4 and 8 via other problems. The results from the earlier simulation (Figure 4) show that 24 and 32 can both be highly activated in operand-driven associative networks. Another difficult problem was 3 x 9, again largely due to an error that is connected to both operands (i.e., 18).

The difficulty of ambiguous problems may also contribute to systematic deviations from the problem-size effect, the tendency for errors and RTs to increase with operand size and related measures. Campbell (1987c, Figure 6.1) plotted RTs for the 2 through 9 times-tables (excluding problems involving 0 or 1). The only nonmonotonic changes as operand size increased were for 5 and 7, with errors for 5 being exceptionally low (see also Campbell & Graham, 1985). One factor in these deviations may be the fact that 5 and 7 are the only operands not involved in any ambiguous problems (see Table 1 above). A regression analysis of RTs (estimated from Campbell's Figure) demonstrated that R2 increased 12% (p = .04), from R2 = .80 with just operand size (2 to 9) to R2 = .92 when the number of ambiguous problems in which each operand appears was included (values of 3, 4, 4, 0, 5, 0, 2, and 2 for operands from 2 to 9, respectively; see the sketch below). The advantage of unambiguous operands is difficult to interpret, however, because ambiguity is confounded with other beneficial factors (e.g., experience counting by fives; 5 problems end in 0 or 5).
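The per-operand ambiguity counts used as the second regression predictor follow directly from the ambiguous problem set in Table 1B; a minimal sketch is shown below. The regression itself would require Campbell's RT data, which are not reproduced here.

    # Count, for each operand 2-9, the ambiguous problems (Table 1B) it appears in
    ambiguous = [(2, 3), (2, 4), (2, 6), (3, 4), (3, 6),
                 (3, 9), (4, 6), (4, 8), (6, 8), (6, 9)]
    counts = {n: sum(n in prob for prob in ambiguous) for n in range(2, 10)}
    print(counts)   # {2: 3, 3: 4, 4: 4, 5: 0, 6: 5, 7: 0, 8: 2, 9: 2}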
A final hint that ambiguous problems are more difficult than expected given other properties of the problems comes from research on shared-answer problems. Campbell and Clark (this volume) report data showing that the shared-answer problems in Table 1A are more difficult than expected, given a variety of other strong correlates of problem difficulty (see also Campbell & Graham, 1985). They propose that these problems are analogous to the fan effect reported in other cognitive tasks (Pirolli & Anderson, 1985), but a related possibility is that problem ambiguity contributes to this effect; 4 of the 10 shared-answer problems are ambiguous. Indeed, the fan effect, which involves stimuli associated with varying numbers of facts, may be more similar to ambiguous problems (i.e., a problem connected to multiple answers) than to shared-answer problems (i.e., an answer connected to multiple problems).

One barrier to analyzing hypotheses about intercorrelated attributes is the limited number of standard arithmetic problems. Stronger tests of this and competing hypotheses might be possible, however, with the alphaplication task of Graham and Campbell (in press). In that task, subjects who learned arbitrary arithmetic-like associative networks among letters displayed many of the phenomena characteristic of number fact retrieval. Such analogue tasks permit experimental manipulation of ambiguity independent of other confounding variables.

Cross-operation interference. A second class of phenomena for which the proposed model shows strong promise is cross-operation interference. The disinhibition model hypothesizes that arithmetic operators suppress competing answers for alternative operators (Figure 5). Because activation is produced by the individual operands and operators connected in an integrated associative network, the theory therefore predicts that different mathematical operations should influence one another's performance. One obvious implication is that cross-operation errors should be prevalent in production tasks, because successful multiplication entails suppression of addition answers and vice versa. This is indeed the case, especially when operations are mixed (Miller & Paredes, 1990). The effect of mixed operations on errors is explained by extra challenges for cross-operation suppression when to-be-suppressed answers are occasionally activated in the testing situation.

The integrated nature of the proposed network is also consistent with research demonstrating that correct answers for other operations slow rejection of false verification problems; that is, such problems as 4 x 3 = 7 are more difficult to reject than problems such as 4 x 3 = 8 (Winkelman & Schmidt, 1974; Zbrodoff & Logan, 1986). More specific predictions about cross-operation errors can be derived from the disinhibition model in conjunction with supplementary assumptions that appear reasonable. For example, the model may explain the fact that multiplication
appears to interfere with addition more than addition interferes with multiplication (Miller & Paredes, 1990), at least as measured by cross-operation intrusions. The reasoning here is that most people learn addition before multiplication and other arithmetic operations. Therefore the network for addition could be acquired, at least initially, without suppression of the still-unlearned multiplication answers by the addition symbol; there is simply no need to suppress answers for interfering operations that have not yet been learned. Multiplication, on the other hand, is acquired after addition, and its initial learning requires suppression of the addition answers from the outset. As multiplication is learned, or shortly thereafter, the addition network will need to be revised to include suppression of the newly learned multiplication answers. The specific effects of these distinct learning histories are not known, but it seems possible that learning inhibition initially might be easier or more efficient than trying to superimpose it later on an already developed system. Hence, more interference is observed from multiplication to addition than vice versa.

The inhibition-disinhibition model for the contextual effects of operation signs also predicts that learning a new operation will affect already learned arithmetic facts, because the learner will need to develop cross-operation inhibitory connections. Newly learned multiplication answers, for example, must be suppressed for addition to function well. Just such effects of learning multiplication on addition have been demonstrated in studies of children learning arithmetic (Miller & Paredes, 1990). Specifically, learning multiplication temporarily disrupts addition, slowing it down and increasing errors. This phenomenon is consistent with the present hypothesis that inhibitory connections must be developed between the addition operands, the addition sign, and the new multiplication answers. These effects may also be related to developmental differences in inhibitory functioning discussed later. Experimentally, such phenomena, and perhaps more precise predictions, could be investigated by teaching novel arithmetic operations to the point of associative retrieval, or again by using the alphaplication task of Graham and Campbell (in press).

Another somewhat perplexing finding that might be explained by the present hypothesis is that addition is perhaps not as easy as expected relative to multiplication, which is learned at a later age and presumably receives less overall lifetime practice than addition. Differences in errors between multiplication and addition can be quite small and can even favor multiplication, and similar small or reversed effects have been reported for RT. Miller and Paredes (1990), for example, found that error rates for adults were only slightly higher for multiplication than addition (their Table 1, Mixed condition) and RTs were actually somewhat faster for multiplication (their Figure 1). One factor that may
contribute to similar difficulties for addition and multiplication is the asymmetry in cross-operation interference just discussed; that is, multiplication interferes more with addition than vice versa. A second contributing factor could be the greater ambiguity of addition problems than multiplication problems. Because the 36 addition problems involve just 15 distinct answers, few addition problems involve a unique combination of operands, and operands for the addition problems from 2 to 9 are connected to as many as 7 different shared problem answers (this maximum occurs for adjacent operands). Cross-operation errors and ambiguity both implicate inhibition.

Of course, problem and operation difficulty depend on many correlated variables, making it difficult to isolate the role of individual factors such as ambiguity. Addition involves closely related answers that might compete with one another more intensely than the more distributed answers for multiplication, although Campbell and Oliphant (this volume) describe a model based on compressed similarity for larger numbers. On the positive side, addition answers are lower numbers than multiplication answers, and addition is acquired before multiplication. Further complicating the investigation of such issues is the fact that people might differentially practice problems and operations that vary in initial difficulty, making comparisons with adults even more uncertain. At the very least, the disinhibition model suggests factors that must be studied if we hope to eventually understand the relative difficulty of various arithmetic operations in terms of percentage of ambiguous problems, order of acquisition, and diverse other properties.

Ties and other "special" problems. A third finding that benefits from a consideration of inhibitory retrieval mechanisms is the fact that tie problems (or twins) and problems involving 0 or 1 are particularly easy, both in terms of few errors and fast RTs (e.g., Miller, Perlmutter, & Keating, 1984). Moreover, such problems show less of a problem-size effect than non-tie problems (Groen & Parkman, 1972; Miller et al., 1984). The classical explanation for such findings is that people use a special "rule" or a different "procedure" to solve problems involving ties, 0, or 1, whereas semantic retrieval mechanisms are used for non-ties. This explanation is not very adequate, at least from an associative perspective, because rules and procedures must somehow be realized by an associative network based on spreading excitation and inhibition.

The inhibition and disinhibition networks shown earlier suggest a simple mechanistic explanation for some of these findings. Specifically, retrieval of answers for these special problems may be less conditional or context-dependent than for the remaining problems. In the case of zero, a simple network involving one operand (0), one sign (x), and one answer could easily be constructed. The value
of the co-operand is irrelevant, permitting a simpler network than those involved in other operations to produce the correct response. A similar analysis works for ties. The networks for ties are simpler than for non-ties because the presence of a single operand (albeit a duplicated one) is sufficient to elicit the response. That is, one might learn connections such as 2 -> 4, 3 -> 9, and so on, qualified only by the multiplication sign and not qualified by co-operands. Non-tie problems necessarily entail conditional activation of the answer by signs and both co-operands. Hence, tie problems, which involve a single operand, may require a less complex associative structure than problems involving two distinct operands, whose activation of answers depends strongly on the presence of specific co-operands.

It is less obvious that a similar associative analysis will work for ones problems, because the specific response still depends on the operand paired with one. But an associative network should be possible in which the presence of 1 cues the naming of the other operand, again subject only to the presence of the appropriate operator. Moreover, ones problems benefit from the fact that the answer is directly primed by the operand with which it is identical (cf. Campbell & Clark, this volume).

This analysis suggests that comparisons of tie, 0, and 1 problems with other problems might be better served by a distinction between types of associative structures than by a distinction between retrieval and rule-governed performance. That is, the unconditional associative networks involved in ties, 0, and perhaps 1 may result in faster RTs and fewer errors because they involve less selective inhibition and disinhibition than other problems, although it remains to be determined whether such associative structures can explain the incidence of "rule-like" errors in normals (e.g., 4 x 0 = 4 or 8 + 1 = 8) and the atypical behavior of these problems in some acalculic subjects (e.g., Sokol, McCloskey, Cohen, & Aliminosa, 1991).

Table-related errors. Finally, the present model is entirely consistent with numerous standard findings on problem difficulty. One robust effect is that many errors in multiplication and other arithmetic production tasks are table-related (i.e., they come from one of the operand's times-tables), and incorrect verification problems are more difficult to reject when the presented answer belongs to the times-table for one of the operands (Campbell, 1987b). Table-related errors follow directly from models that posit individual connections between operands and answers, such as the present one. Campbell and his colleagues have developed and extensively tested this class of associative model (e.g., Campbell, 1987a; Campbell & Graham, 1985), and in general their results could be accommodated
by the present associative model. Later, I note possible differences between the Campbell approach and the present mechanisms. In the variant proposed here, table-related answers are directly activated because their operand-answer associations are in fact sometimes correct for one of the individual operands. This residual activation produces potential interference, and coping with it requires greater dependence on the delicate balance of excitation and inhibition than do unrelated answers. The inhibitory connections were originally proposed to suppress inappropriate activation resulting from excitatory connections between individual operands and answers. Any inadequacy in the inhibitory connections or lack of specificity in the disinhibition would permit activation of operand-related incorrect answers, especially if incorrect answers had been recently primed (Campbell, 1990; Campbell & Clark, 1989). Presentation of competing answers in a verification task similarly increases demands on the inhibitory system. Thus, table-related errors follow from the operand-based associative model proposed here.

Operand-based inhibition and disinhibition are clearly effective mechanisms for context-sensitive activation, and some interesting findings encourage further consideration of the hypothesis. However, alternative mechanisms are possible for conditional activation, such as exaggerated excitation and various types of configural representations.

Exaggerated excitation models
Instead of suppressing an inhibitory connection, context might act by enhancing or exaggerating excitation. Thus, the presence of 4 might directly enhance the spread of excitation from 8 to 32, rather than suppress the spread of inhibition from 8 to 32. Exaggerated excitation based on the presence of the co-operand could in principle permit differential activation of the correct response. Although strong conclusions cannot be drawn about the relative merits of the exaggerated excitation versus the disinhibition hypotheses, several considerations presently favor disinhibition.

One practical problem for the exaggerated excitation hypothesis is that it is not obvious how, or even whether, exaggerated activation can be achieved in associative networks that incorporate mechanisms consistent with the known capacities of neural systems. The dilemma is that simple summation of excitation does not produce exaggeration of existing activation; instead, the various sources of activation simply sum. For example, if the inhibitory connections of the inhibition-disinhibition model simulated in Figure 4 are made excitatory, identical activation levels result for the competing correct and incorrect answers. Although both 24
and 32 achieve higher levels of activation because of the additional excitation, the levels for the two answers remain identical. The reason that summation of excitation does not work is that excitation has the same effect on the response in the absence of the other operand as in its presence. That is, contextual excitation elevates the level of activation in an associative system even if the contextual stimulus is the only stimulus present. Thus additive excitation from a context does not in fact modulate excitation from another stimulus. Disinhibition, on the other hand, has no effect unless it diminishes some existing excitatory input to the forward inhibition.

There are ways to overcome this difficulty, but they may entail relaxing such criteria as the physiological and psychological reality of the actual mechanisms used in the theory or in simulations of exaggerated excitation models. One simple way to exaggerate excitation, for example, is to literally multiply the activation levels of the operands; when either operand is missing, the product will be zero and that answer will not be activated. The challenge raised by this approach, however, is to discover a physiological or psychological mechanism (as opposed to a computational method) that can multiply activation levels without involving inhibitory mechanisms. This requirement entails a considerably more powerful "synaptic" associative process than the well-known capacity of neural systems to summate excitatory and inhibitory impulses. Plain summation suffices for the disinhibition model.
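The contrast between additive context and multiplicative gating can be demonstrated in a few lines; this illustration is assumed rather than taken from the chapter, and the weight is arbitrary.

    # Additive context contributes the same amount whether or not the
    # primary operand is present, so it cannot modulate; a multiplicative
    # gate is selective, but summation alone cannot implement it.
    def additive(a_op, a_context, w=0.5):
        return a_op + w * a_context    # context sums in unconditionally

    def multiplicative(a_op, a_context):
        return a_op * a_context        # zero whenever either input is absent

    print(additive(10, 10), additive(0, 10))              # 15.0 5.0: context alone still excites
    print(multiplicative(10, 10), multiplicative(0, 10))  # 100 0: fully conditional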
Another difficulty for exaggerated excitation as a general mechanism for context-sensitive activation is that problematic phenomena have been identified in at least one area to which such models have been applied, namely Pavlovian conditioning. Specifically, repeated presentation of an excitatory CS in the absence of the US (i.e., extinction) weakens the capacity of the CS to elicit a conditioned response, whereas repeated unreinforced presentation of a feature in occasion setting produces little decrease in the feature's capacity to modulate responding to a CS (Williams & Clark, 1992). That occasion-setting properties of a feature are undiminished by nonreinforced presentations suggests that feature-based excitation is not involved in occasion setting, one form of context-sensitive activation. Inasmuch as inhibitory connections are thought not to be weakened by unreinforced exposure to a CS, this criticism does not apply to models based on inhibition.

A final point about exaggerated excitation is that such networks become unstable as multiple levels of contextual modulation are added. An excitatory model to explain the contextual effects of operation signs, for example, would require the product of three or more connections, and could achieve excessively high levels of activation. Because inhibition-based systems maintain modest levels of activation, additional levels of context can be superimposed without disturbing the integrity of the system. This criticism of exaggerated excitation is less telling than the others, however, because lateral or other forms of global inhibition, as discussed later, could be introduced to control any tendency toward excessive excitation.

These considerations suggest that exaggerated excitation provides a less satisfactory and less mechanistic explanation for context effects than do inhibitory and disinhibitory mechanisms. A third class of theory used to explain context-sensitive activation consists of models that incorporate configural representations.

Configural models for selective activation
By definition, configural models include composite representations in addition to representations for individual operands, operators, and answers. Campbell and Graham (1985), for example, proposed distinct problem representations that could uniquely activate the correct answer, resulting in differentiation of the correct and incorrect answers to ambiguous problems. Although operand representations for 4 and 8 would activate 24 and 32 equally, for example, a distinct 4 x 8 problem representation could differentially activate 32. Campbell and his colleagues (e.g., Campbell & Oliphant, this volume) now hypothesize modality-specific composite representations that include operands, operators, and answers as unified mental entities, and have demonstrated that such representations work well in their sophisticated model of number processing. Despite this evidence for the sufficiency of configural representations, a number of conceptual grounds favor disinhibition models over certain types of configural models, or suggest that the two approaches may yet prove to be complementary rather than mutually exclusive. Comparison of configural and disinhibition models benefits from separating "pure" configural models from "integrative" configural models.

Pure configural representations. The most extreme view would be that configural representations are novel, unitary problem representations that do not originate with existing representations for the "components" of the eventual whole. For example, configural representations might consist of perceptual analogues for entire problems that emerge independently of existing representations for individual operands and operators. The hypothesis that configural representations do not derive from component parts entails several conceptual difficulties. Configural representations that are not aggregates of existing "parts" fail to explain how operands, signs, and other stimuli are able to activate the appropriate configural representations. Although it may be acceptable to propose that individual operands and signs automatically activate problems of which they are
component parts, rejection of the componential view leaves the associative mechanism in need of further specification. A related difficulty is that table-related errors, which presumably arise from operand-driven activation, become somewhat paradoxical for theories that posit unitary representations not based on operands. Holistic representations that are not dissectible into components also sacrifice one ready explanation for the high correlations across operand order for both errors and RTs, which are most easily explained by the equivalence of their parts.

Given the various ways in which arithmetic problems can be presented, it is clearly unsatisfactory to simply suppose that holistic representations for problems are activated by the entire pattern of perceptual activation (rather than by components such as 4, 6, x). Direct activation is somewhat plausible in certain cases (e.g., simultaneous visual presentation), but is strained for such cases as the sequential presentation of components in nonvisual, novel (e.g., "writing" problems on the skin by touch), or mixed modalities. Multi-stage arithmetic problems provide another common situation in which perceptual codes are not directly available; for example, the problem 4 x 2 x 4 would presumably have to activate the configural pattern 8 x 4 somehow. These various kinds of sequential presentation invite explanation in terms of component parts and associative mechanisms.

Given such limitations, the specification of the associative or other mechanisms by which components become connected to and activate configural representations is critical for evaluating the adequacy and completeness of configural models. Without specification of the actual mechanisms by which these processes operate, it is difficult to determine whether the configural models even differ from the disinhibition view, let alone which of the views is more correct. For example, a theory that includes unspecified perceptual processes by which problems activate configural representations may eventually require context-sensitive associative mechanisms equivalent to those proposed here. Overall then, the operand-based disinhibition model is more complete and mechanistic than pure configural models, while still providing a solution to problem ambiguity. Although awkward for extreme configural models that posit pure holistic representations constructed from scratch, the preceding difficulties are less problematic for more moderate configural views, called here integrative.

Integrated configural representations. An alternative to the pure configural view (i.e., a singular and novel holistic representation) is that configural representations evolve by the consolidation or integration of existing discrete components into a unified configural network. That is, problems exist initially as separate operands, operators, and answers. Experience strengthens the links among components, until collections of representations begin to act as integrated units rather than as
separate associated elements. Such models incorporate associative learning processes, explicit part-whole relations, and other features that avoid most of the problems faced by pure configural representations. Configural models such as Campbell and Oliphant's (this volume) clearly correspond more closely to this integrated associative view than to the extreme configural view just criticized. Their model includes such theoretical mechanisms as degree of unification and the strengthening of connections among the elements in the problem representation. Campbell and Oliphant assume that separate component representations become sufficiently integrated to function much as a single representation. That is, 4, x, 8, and 32 could start as discrete mental representations and subsequently become integrated into a "holistic" problem representation.

Such integrated configural models are in principle compatible with context-sensitive operand models. This potential correspondence raises the question, however, whether the associative structures that underlie such integrated networks involve simple direct connections or the kinds of context-sensitive inhibitory and disinhibitory connections proposed here. The benefits of the special networks presented here are essentially those enumerated earlier, in particular selective activation of other elements in the problem. Such networks can provide "configural representations" for problems without explicitly adding another level of holistic representation to the system, and can also explain transitional arithmetic performance prior to consolidation of the configural codes.

In a complementary fashion, Campbell's hypothesis that operands, operators, and answers constitute a single entity suggests possible modifications to the disinhibition mechanism. Specifically, configural networks should perhaps include answer-to-operand context-sensitive connections based on the presence of the appropriate co-operand. For example, 24 could activate 4 when 6 was present and 3 when 8 was present. Integrated associative networks in which all problem elements are linked in a selective way that depends critically on the presence of other elements should mimic configural representations, and could also permit a single network to accommodate inverse operations (e.g., multiplication and division).

Disinhibition and related mechanisms that permit the development of configural associative models for number processing could have far-reaching consequences for psychology, inasmuch as the distinction between associative and configural representations underlies several important unresolved issues. For example, configural or integrative representations have been proposed to explain phenomena in cued recall that seem to contradict standard associative theories (e.g., Marschark
& Paivio, 1977) and to explain such context effects as patterning in Pavlovian conditioning (Williams & Clark, 1992).

Although operand-driven and integrated configural models can be reconciled and can even complement one another, fundamental theoretical questions still remain unanswered. One basic question concerns the existence of configural representations independent of the configural associative networks that define them. That is, once 4 x 8 = 32 becomes a configural unit, does it then exist in any form independent of the individual elements (4, x, 8, 32) that participate in the configural associative network? If not independent, then the elements in this network will be shared with other configural networks. If the configural unit does come to exist independent of the associative network, however, then some of the questions raised by pure configural models might resurface (e.g., how do problem parts activate the whole?). Since disinhibitory associative networks explain phenomena for which separate configural representations were created, the present analysis challenges advocates of separate configural representations to find definitive evidence that holistic representations separate from associatively-related component elements are indeed required to explain number processing.

A related question asks whether there are single or multiple representations for each symbolic unit (e.g., operands, answers). In conditional associative networks of the sort proposed here, a single digit code could in principle participate in the multiple associative relations that underlie all number facts related to that operand. Activation spreading from the single digit code would be context-dependent because of appropriate inhibitory and disinhibitory connections. Without knowing full details of the implementation, such models as Campbell and Oliphant's suggest multiple representations for operands distributed throughout the system in combination with different problems (i.e., 4 in 4 x 6 = 24 seems distinct from 4 in 4 x 8 = 32). These replicates of 4 and other numbers may provide the tacit assumption that permits configural models to achieve context-sensitivity.

This primary section of the paper has examined the potential role of inhibition in the selective activation of problem answers; in the extreme case, to create differential activation where none existed. A related and similarly prevalent role for inhibitory mechanisms in number processing is to enhance existing differences in the activation levels of competing responses.
Inhibition enhances differences in activation

Enhancement of differential levels of activation is important in number processing because competition and interference effects are so ubiquitous; that is, interfering responses are often sufficiently activated to compete seriously with the
correct answer for dominance. This differentiation role for inhibitory mechanisms is more familiar and evident than context-sensitive activation; hence it is treated less extensively. I first document that number processing demonstrates high degrees of competition and interference among alternative response candidates, and then show, both in principle and empirically, that inhibitory mechanisms are ideally suited to contend with the intense competition that underlies such effects.

Interference phenomena in number processing

A priori, extreme competition is expected in associative networks for numbers because each number is associated with multiple alternative responses (e.g., 3 is connected to all answers in the 3 times-table, the 3 plus-table, and so on). These overlapping connections mean that associative processes can activate multiple representations, some of which will interfere with production of the target response. This assumption of "noisy" associative mechanisms is supported by much evidence for interference and response competition effects in number processing, including phenomena discussed previously.

Interference within operations. Research has clearly demonstrated that arithmetic errors and RTs are highly susceptible to interference from related problems, as attested to by numerous papers in this volume and elsewhere. The dominant errors in multiplication come from answers to nearby table-related problems (e.g., 7 x 8 = 48), suggesting that nearby operands and/or problems in the same times-table are indirectly activated (Campbell, 1987a). Table-related answers also interfere with rejection of false verification problems (Campbell, 1987b).

The degree of interference exerted by related answers depends on their current activation levels. Removing certain problems from a test session, for example, speeds responding and decreases errors for any remaining problems that are susceptible to interference from the answers to the removed problems (Campbell, 1987a), and priming with table-related incorrect answers disrupts arithmetic performance (e.g., Campbell, 1987b, 1991). Moreover, competing answers can be activated by operands that are part of the answer (Campbell & Clark, this volume), and subjects may perform more rapidly during the first few moments of arithmetic testing, when competing answers would not yet be active, than later in the session (Chapman, 1915). Close analysis of the relation between recent priming of an answer (e.g., by presentation of its problem) and the likelihood of that answer occurring as a subsequent error has demonstrated that moderately recent problems are a major source of error priming in multiplication (Campbell, 1991; Campbell & Clark, 1989). The assumption is that residual activation from recently presented
problems promotes errors. Very recent problems, especially immediately preceding ones, are an exception to this rule and are discussed later.

Interference between operations. Arithmetic is susceptible to interference from other arithmetic operations, as noted earlier. Miller and Paredes (1990) reported several effects attributable to confusion between operations, including a notable incidence of cross-operation errors. That learning multiplication slowed addition also suggests conflict between answers for addition and multiplication. On verification tasks, subjects are particularly slow to reject false number problems if they involve the correct answer for a different operation (e.g., 3 x 4 = 7), as demonstrated by Winkelman and Schmidt (1974), Zbrodoff and Logan (1986), and others.

Current activation levels of competing cross-operation responses again contribute to the magnitude of interference effects. For example, mixing operations (e.g., testing both addition and multiplication within a session) increases the probability of cross-operation errors and slows responding (Miller & Paredes, 1990). Active responses presumably interfere more with the correct answer than do dormant responses.

Other number-processing phenomena. Non-arithmetic number-processing tasks also show interference phenomena that are relevant to calculation tasks. Consistent with the idea that numbers indirectly activate numerically close numbers, den Heyer and Briand (1986) found that subjects were faster to identify digits as digits (versus letters) when targets were preceded by a close digit than by a distant digit. Numerical nearness as a fundamental basis for associative relations among numbers is also consistent with number similarity judgments (Shepard, Kilpatric, & Cunningham, 1975) and digit naming errors (Campbell & Clark, 1989). Superfluous activation of nearby numbers could increase interference from close table-related answers, contributing to some of the interference phenomena observed in arithmetic. For example, if 7 primes 6, then 7 x 8 could experience increased competition from 48 because of indirect activation of 6 x 8.

Although other evidence for high levels of interference in arithmetic tasks could be cited (see Campbell, 1987c, and other papers in this volume), these examples demonstrate clearly that associative interference both within and between specific arithmetic operations is widespread. The errors result from situational factors and associative connections between operands and either other operands or related answers. This associative interference may be automatic or at least difficult to control, inasmuch as it is evoked even in tasks that are impaired by it (e.g., LeFevre, Bisanz, & Mrkonjic, 1988).

The preceding findings demonstrate convincingly that number processing involves activation that spreads to related representations. Elevated activation of
competing answers is problematic for several related reasons. First, heightened activation of multiple responses means that such additional processes as thresholds, response criteria, and comparators are necessary to "select" the appropriate response, whereas an associative mechanism that resulted in a single response being highly activated could be said to have actually performed the selection. Second, similarly activated answers make it very difficult or even impossible to set a response criterion that is relatively error-free. Strongly differentiated responses therefore permit simpler and perhaps more mechanistic models of arithmetic. Observed interference phenomena presumably represent residual effects that elude whatever control mechanisms exist to suppress competing answers. Given the high degree of overlap in associative networks for numbers and evidence for pronounced interference in various number-processing dysfunctions (see later discussion), it seems reasonable to assume that number processing generally involves quite intense competition among correct and incorrect answers. A number-processing system that is so prone to interference would clearly benefit from some mechanism to reduce the competition; inhibition provides just such a mechanism.

Inhibitory mechanisms and interference
What is needed to reduce excessive interference and competition is an associative mechanism to exaggerate existing differences in activation levels of competing representations, and ideally to actually suppress the weaker candidates. Lateral inhibition among answers achieves just these results. The basic mechanism is quite simple but powerful; in essence, answers mutually inhibit activation of competing answers in proportion to their own levels of activation. The conclusion that interference might be reduced by such inhibitory mechanisms is supported by findings with analogous cognitive tasks, as well as by some simulation and empirical results in number processing.

Inhibition and analogous tasks. Indirect support for the role of lateral inhibition in number processing comes from research on analogous interference effects in diverse areas of psychology. Inhibition enjoys widespread use to suppress just the sort of noise that underlies number processing, with selective attention providing a prototypical example of the use of inhibition to control interference. In selective attention tasks, people attend to targets and ignore irrelevant stimuli or distracters. There is now much evidence that selective attention operates in part by suppression of distracters (e.g., Tipper, 1985; Walley & Weiden, 1973). For example, naming a color in the Stroop task is slower if the same name was a distracter on the preceding trial (i.e., had to be suppressed) than if a different
color name was the preceding trial's distracter (e.g., Dalrymple-Alford & Budayr, 1966). Negative priming and related suppression effects have now been observed in various attention tasks.

Selective attention provides an ideal model for the kinds of semantic retrieval operations involved in mental arithmetic. Whereas targets and irrelevant distracters are presented externally in selective attention, they are generated by internal associative processes in number processing. That is, arithmetic problems indirectly activate both "targets" (i.e., correct answers) and "distracters" (i.e., incorrect answers) to varying degrees. The resulting competition and interference in number processing are isomorphic to those found in selective attention. The parallel is even closer when distracting incorrect answers are presented directly, as in priming or verification studies. The analogy with selective attention suggests that inhibitory mechanisms could filter out or suppress competing representations in number processing, just as inhibition operates in selective attention.

This inference is further strengthened by the fact that inhibitory mechanisms suppress interfering stimuli and responses in many other domains besides selective attention and number processing. Inhibition sharpens differences in activation levels of sensory neurons so as to enhance edges (e.g., Von Bekesy, 1967), and has been proposed in the motor skills literature to resolve competition between alternative responses (e.g., Kornblum, 1965). Evidence also suggests that inhibitory mechanisms suppress competing responses in word identification (e.g., McClelland & Rumelhart, 1981) and in retrieval from semantic memory (e.g., McLeod & Walley, 1989), and contribute to such effects as retrieval inhibition in episodic memory (Roediger, 1974). In short, the problems of competition, interference, and noise are common ones in diverse areas of psychology, and inhibitory mechanisms have been widely proposed and accepted as an effective way to reduce such competition. It therefore seems likely that inhibitory mechanisms would be similarly used to manage the excessive activation of inappropriate codes that is so widespread in the domain of number processing.

Inhibition and simulations of number processing. Lateral or mutual inhibition plays a major role in at least one model of number processing. Campbell and Oliphant's (this volume) model incorporates lateral inhibition among problem representations both within and between arithmetic operations. The sophistication of the model, however, makes it difficult to appreciate the benefits of its individual components, such as the lateral inhibition of interest here. The earlier simulation of ambiguous problems (reproduced in Figure 6) permits a clear demonstration of the positive consequences of lateral inhibition in number processing, because the parameters chosen for the disinhibition model achieved only very slight separation
of the competing answers; a mere 12 units (5%) of activation separated the correct and incorrect answers after 15 cycles. This tight competition provides a challenging test case to evaluate the capacity of additional inhibitory mechanisms to separate the curves more sharply and to produce some levelling off or even decline in the activation of the incorrect answer.
Figure 6. Simulated 4 x 8 activation of 32 and 24 by levels of answer inhibition and co-operand disinhibition (INH = 0 or -.2), and by levels of lateral inhibition between answers (LI = 0 or -.2). (Plot not reproduced; axes were activation in arbitrary units, 0-250, by processing cycle, 0-15.)

The three curves discussed earlier were produced with the lateral inhibition parameter set to 0 (LI = 0); that is, the answers 24 and 32 did not inhibit one another. For the new pair of curves added to Figure 6 (INH = -.2, LI = -.2), lateral inhibition was set to -.2. That is, at each cycle the activation level of 24 was decreased by .2 times the activation level of 32, and the activation level of 32 was decreased by .2 times the activation level of 24. As shown in Figure 6, this modest lateral inhibition amplifies the difference between the activation levels of 24 and 32, and even produces a levelling off and an eventual decrease in the level of activation of the incorrect answer 24.
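In terms of the Python reconstruction given earlier (with the same caveat that the exact update rule is assumed), this lateral inhibition corresponds to the li parameter:

    # Extending the earlier sketch with mutual answer-to-answer inhibition
    print(simulate(inh=-0.2, dis=-0.2, li=-0.2))
    # approximately {24: 50.3, 32: 107.9}: under these assumptions 24 peaks
    # around cycle 10 and then declines while 32 continues to pull away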
These are exactly the effects that would facilitate differentiation of correct and incorrect answers. That the activation level of 24 actually decreases demonstrates how associative models with lateral inhibition can "select" answers from competing alternatives. The simulation shows that lateral inhibition can in principle resolve competition in number processing. Moreover, the pattern of growth and separation in activation provides an elegant explanation for errors that occur when subjects are forced to respond quickly. The lower the cycle at which the response is emitted, the greater the activation of competing responses, and/or the less their separation from the eventual correct response. Increases in errors would also be expected if the inhibitory processes were somehow weakened (see later discussion of correlates). This in-principle demonstration that inhibition can reduce competition among competing answers is consistent with several findings in number processing.

Inhibition and number-processing phenomena. One clear example of inhibition in number processing comes from the research on error priming described previously (Campbell, 1991; Campbell & Clark, 1989). Although recently presented answers tend to intrude as errors in multiplication and other arithmetic tasks, the answer to the immediately preceding problem is significantly less likely than chance to occur as an intrusion, and intrusions primarily come from several trials back. Campbell and Clark (1989) modelled this finding by the combined effects of differentially decaying excitatory and inhibitory influences on the preceding answer. The activation of the preceding answer is temporarily masked by rapidly decaying inhibition, so the answer is suppressed at first but then intrudes after several trials. Campbell (1991) went on to show that the inferred suppression of potentially interfering responses from the preceding trial actually produces fast RTs for immediately following related problems, in theory because one potent competitor has already been inhibited.

Interference effects themselves can be viewed as support for the notion of inhibitory mechanisms, inasmuch as interfering answers must exert their influence on targets somehow, and one plausible mechanism is inhibition. The negative effects of interfering problems in a testing set, for example, may occur because active competing responses inhibit target responses more than do inactive competing responses, or because the target response must suppress competing responses, which takes longer and is more difficult when competing responses are highly activated. Either process requires inhibitory mechanisms.

Several findings on acalculia also implicate inhibitory mechanisms. Arithmetic errors in acalculics are similar to those made by normals under speeded conditions (e.g., Campbell & Clark, this volume; Sokol et al., 1991). One explanation for such findings is that brain insults selectively damage inhibitory mechanisms, which
leaves noisy activation that is less constrained than normal. The selective vulnerability of inhibitory mechanisms is discussed later. Campbell and Clark (1988) used a similar model to explain errors in number naming by HY, a patient studied by McCloskey and his colleagues (McCloskey, Sokol, & Goodman, 1986). HY's errors were primarily number names that were visually similar, numerically close, and in agreement with the target in odd-even status. This is the expected outcome given damage to inhibitory mechanisms that normally filter out interfering responses similar to the target response.

Inhibitory mechanisms might also be reflected in the effects of response uncertainty on RT. In picture naming and other semantic retrieval tasks, RT increases as the number of alternative responses increases (Clark, 1991b; Lachman, 1973; Paivio, Clark, Digdon, & Bons, 1989). One explanation for uncertainty effects is that the multiple responses inhibit one another, slowing RT. A similar uncertainty effect occurs when the number of different stimuli (hence responses) is varied experimentally. Schvaneveldt and Staudenmayer (1970) found that such uncertainty effects were large for numerical tasks that required subjects to retrieve one or more responses from semantic memory (e.g., add 3 to the stimulus digit, subtract the stimulus digit from 9, or retrieve number words that had been randomly associated with the stimulus digits). Uncertainty effects were small and sometimes negligible for tasks less dependent on selective retrieval mechanisms (e.g., name the stimulus digit, add one, or subtract one). If uncertainty effects reflect inhibitory mechanisms, as suggested above, then these findings implicate inhibitory mechanisms in retrieval of arithmetic facts. As observed earlier, difficulties associated with ambiguous and shared-answer problems can also be conceptualized in terms of the fan effect, one type of uncertainty effect (e.g., Pirolli & Anderson, 1985).

In summary, inhibition can accentuate differences in activation of competing nodes in semantic memory. This role is consistent with the widespread use of inhibitory mechanisms for suppression of interfering events in other domains of psychology, with simulations of number processing, and with suggestive number processing phenomena. It is not clear that noninhibitory mechanisms could achieve similar differentiation of highly confusable nodes. Moreover, until alternative models are stated explicitly, it may even be difficult to determine which models actually entail inhibitory mechanisms and which do not. Models that raise or lower threshold levels to suppress errors, for example, might be characterized in terms of inhibitory associative mechanisms that change baseline activity levels of nodes. Reducing inhibition would lower the threshold (i.e., a weaker signal would be required to "fire") and adding inhibition would raise the threshold (i.e., a stronger signal would be required to "fire").
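This equivalence can be stated concretely in a few lines of BASIC; the particular values below are illustrative assumptions, not parameters from the chapter.

act = 6: REM momentary excitatory input to an answer node
t = 5: REM threshold account: the node "fires" when input exceeds t
IF act > t THEN PRINT "fires (threshold account)"
base = -t: REM equivalent account: tonic inhibition lowers the baseline
IF act + base > 0 THEN PRINT "fires (baseline-inhibition account)"
REM Strengthening the tonic inhibition (a more negative base) mimics
REM raising the threshold; weakening it mimics lowering the threshold.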
Inhibition-related correlates of normal and dysfunctional number processing
I have argued that inhibitory mechanisms serve at least two major functions in number processing, selective activation of answers and enhancement of differences in the activation of competing answers, and that these functions underlie performance on many numerical tasks. This hypothesis not only explains much available data, but also suggests future research on number processing and its relation to diverse areas of psychology. Here I examine characteristics of people that ought to correlate with their number processing competencies, given the assumption that inhibitory mechanisms play a central role in arithmetic. The basic approach is to examine human traits that can be demonstrated to reflect differential levels of inhibitory functioning. The areas examined are: direct indicators of inhibitory functioning, brain injury, aging, and childhood development.

Indicators of inhibitory functioning

One clear implication of the present proposals is that direct measures of inhibitory functioning ought to correlate with performance on numerical tasks. Several physiological and behavioral indicators of inhibitory functioning are available, although not enough is known at present about their correlations with number processing.

Physiological indicators. One physiological indicator of weakened inhibition is the occurrence of seizures. Much evidence is consistent with the hypothesis that seizures often result from and hence reflect inadequate physiological inhibition (Avoli, 1988). This evidence includes: analyses of seizures induced by kindling or other experimental procedures (e.g., Burnham, 1989); the use of agonists for GABA and other inhibitory transmitters in the treatment of epilepsy (e.g., Bartholini, Scatton, Zivkovic, Lloyd, Depoortere, Langer, & Morselli, 1985); neurotransmitter assays of epileptic brain tissue (Lloyd, Bossi, Morselli, Rougier, Loiseau, & Munari, 1985); and induction of seizure activity by antagonists of inhibitory neurotransmitters or by other manipulations (e.g., hypoxia, pyridoxine deficiency) that selectively damage inhibitory mechanisms (McCormick, 1989).

Epilepsy and seizures correlate with performance in a variety of psychological domains, including several relevant to inhibitory functioning. For example, Telzrow (1985) reviews evidence that epileptics have difficulty suppressing distracters in attention tasks, act impulsively and show other behaviors associated with hyperactivity, and show some evidence of aggressiveness (i.e., failure to suppress hostile behaviors).
With respect to numerical and related cognitive tasks, seizures tend to be associated with general cognitive dysfunctions, such as low scores on standardized intelligence tests, and with specific learning problems (Telzrow, 1985). Consistent with the hypothesized importance of inhibitory functioning in number processing, Green and Hartlage (1971; cited in Telzrow, 1985) reported particularly poor performance in arithmetic.

Various recording and imaging techniques can also be used to measure or infer levels of inhibitory functioning. Dustman and his colleagues (e.g., Dustman, Snyder, & Schlehuber, 1981) have argued that correlations between visual evoked potentials for patterned (checkerboards) and unpatterned (homogeneous) stimuli reflect variations in inhibitory functioning. The argument is that lateral inhibition enhances the contrast in the patterned stimuli, making them more distinct from the unpatterned stimuli, and hence reducing the correlation between the two. I later show that aging and childhood development, two variables associated with changes in the similarity of evoked potentials to patterned and unpatterned stimuli, are related as expected to changes in number processing performance. To my knowledge, direct tests of the theoretical relation between Dustman's measure and number processing are not available.

Behavioral measures. I have already mentioned several experimental tasks and phenomena that could serve as measures of inhibitory functioning; for example, susceptibility to negative priming effects in selective attention (e.g., Tipper, 1985) or to uncertainty effects. A more typical individual-difference measure that implicates inhibitory functioning is impulsivity, which has been measured by ratings of normal and pathological behavior (e.g., Goyette, Conners, & Ulrich, 1978), and by fast but error-prone performance on the Matching Familiar Figures Task (Kagan, 1965). These variables have been shown to correlate with cognitive functioning generally. Impulsivity and attentional problems are primary symptoms of Attention Deficit Hyperactivity Disorder, which has been strongly associated with learning disabilities and other school-related dysfunctions, and the slope relating individual RTs to uncertainty predicts performance on intelligence tests (Jensen, 1987). These general effects suggest that individual differences in negative priming, uncertainty effects, and impulsivity might very well correlate with performance on number processing tasks. Whether the relations will be precise enough to isolate unique contributions of inhibition and of number processing remains to be determined.
Brain damage and related conditions

The exceptional roles proposed here for inhibitory mechanisms also predict negative effects of brain insult on number processing, inasmuch as inhibitory brain mechanisms are more vulnerable to insult than are excitatory mechanisms. Early behavioral evidence suggested that inhibitory mechanisms were particularly vulnerable to hypoxia; for example, decreased oxygen at altitude produces exaggerated reflexes (e.g., van Harreveld, 1939; van Liere, 1942). Contemporary research has confirmed these early ideas. For example, inhibitory neurotransmitters such as GABA are particularly susceptible to hypoxia (Avoli, 1988), and one consequence of head injury can be seizures or abnormal brain activity (Mathieson, 1982). Brain injury also tends to produce other signs of weak inhibition, such as attentional difficulties (Posner, 1988), agitation, and impulsivity.

Selective damage to inhibitory mechanisms should produce number processing dysfunctions that resemble the behavior of associative networks in which inhibition is relatively weak; hence, exaggerated interference effects are expected. As noted earlier, McCloskey et al.'s (1986) patient HY produced errors that could be predicted by the numerical similarity (nearness and odd-even agreement) and visual similarity of the erroneous and correct responses (Campbell & Clark, 1988). The assumption is that in normals, digits activate number representations that are visually or numerically similar to themselves, but these competing representations are suppressed by inhibitory mechanisms that are deficient in HY. It was also noted earlier that mental arithmetic performance by brain-damaged patients demonstrates a number of effects consistent with the hypothesis that brain injuries disrupt people's capacity to suppress interfering responses. In particular, patterns of errors are exaggerations of the patterns produced by normals (e.g., table-related errors, operand intrusions), consistent with the idea that excitatory associations remain intact to activate the interfering responses but the inhibitory mechanisms that normally filter out such responses have been weakened or perhaps even eliminated.
Aging

The preceding analysis suggests further that old age ought to be associated with weakened inhibition. The reasoning is that the elderly are susceptible to various pulmonary, heart, and circulatory dysfunctions and diseases that disrupt delivery of oxygen to the brain. Prolonged or abrupt deficiencies in oxygen delivery should produce inhibitory dysfunctions in the elderly that parallel those seen with traumatic brain insult. Models of aging based on weakened inhibition have been
proposed by several authors (e.g., Clark, 1991a; Dustman, Emmerson, & Shearer, 1990; Hasher & Zacks, 1988). Supporting findings include increases with age in the correlation between evoked potentials for patterned and unpatterned stimuli (Dustman et al., 1990), and decreases with age in the capacity to suppress distracting stimuli in selective attention tasks (Tipper, 1991). Given such findings, the present hypothesis predicts that number processing will deteriorate with age and, more specifically, that the deterioration should demonstrate characteristics of a number-processing system with weak inhibition. Consistent with this prediction, Campbell and Charness (1990) found that elderly subjects showed elevated levels of substitution errors (operand intrusions from earlier stages) on a complex calculation task, and seemed to be particularly challenged by the need to maintain multiple intermediate values in working memory. Such findings are readily explained by a weakened capacity to suppress interfering events.

Childhood development

A relation between childhood development and inhibitory functioning would have major consequences because of the educational relevance of number processing. Number processing abilities that require inhibition may be especially difficult for children if inhibitory mechanisms mature late.

Childhood and inhibitory functioning. Much evidence supports the hypothesis that inhibitory mechanisms increase in strength relative to excitatory mechanisms during childhood (Bjorklund & Harnishfeger, 1990; Clark & Johnson, 1990). Childhood changes that indicate strengthened inhibition include: a dramatic decrease in susceptibility to seizures from infancy to the early teens (Hauser & Hesdorffer, 1990), decreased correlation between evoked potentials for patterned and nonpatterned stimuli (Dustman et al., 1990), improved capacity to suppress distracters on selective attention tasks (Tipper, Bourque, Anderson, & Brehaut, 1989), and decreases in impulsivity as measured by the Matching Familiar Figures Task and related tests (Kagan, 1965, 1966). Children also have difficulty suppressing competing responses in choice RT tasks (e.g., Jensen, 1987) and in picture naming tasks (Clark & Johnson, 1990).

Inhibitory processes are also suggested by several major developmental theories, including Piaget's. The constructs of centration and egocentrism, for example, suggest a failure to suppress or inhibit attention to some immediate and compelling property of the situation, such as height in a conservation-of-volume task or one's own point of view in a perspective-taking task. If inhibitory processes indeed develop later during childhood than excitatory processes, then the present
theory predicts that children will demonstrate forms of number processing that reflect weak inhibition. A number of observations in the domain of number processing support this hypothesis.

Number processing in childhood. The inhibitory analysis may shed light on why children use counting or other strategies to perform such arithmetic operations as addition, rather than retrieval from semantic memory (e.g., Groen & Parkman, 1972; Restle, 1970). Although elaborate theories have been developed to model the processes by which children select direct retrieval or various procedural approaches to arithmetic problems (Siegler, 1988), the fact that children persist in "counting" and other non-retrieval methods may actually reveal something fundamental about the state of children's inhibitory systems, and about the inhibitory mechanisms required to perform arithmetic by retrieval from semantic memory. Young children can certainly learn associations much earlier than they use them in arithmetic; for example, children can name thousands of pictures long before they are able to associatively retrieve the relatively few answers to basic arithmetic questions. Perhaps children's inhibitory systems have not yet developed to the point of permitting effective use of semantic retrieval for highly interfering sets of stimuli such as number facts. Continued use of counting and non-retrieval strategies may therefore indicate that inhibitory mechanisms are still inadequate to permit direct retrieval, given the ambiguity of some problems and the fact that all operand-answer associations are context-dependent and must therefore be suppressed on most trials.

Another developmental finding illuminated by the inhibition hypothesis is that of Miller and Paredes (1990) on inter-operation competition, specifically the effects on addition of learning multiplication. Certainly learning processes are essential for mastering such competing operations, but some of the temporal trends observed by Miller and Paredes may also reflect maturation in the basic inhibitory mechanisms posited here to underlie resolution of inter-operation competition. That is, young children may have great difficulty developing the inhibitory and disinhibitory connections that permit selection of appropriate answers given cross-operation interference. Weak inhibition can also readily explain why young children produce more errors at arithmetic than do adults (Miller & Paredes, 1990). Children simply lack the inhibitory mechanisms essential for the differentiation and suppression of highly interfering responses. Not until children's inhibitory systems have matured can they construct the special kinds of associations required to cope with highly interfering networks, such as those involved in arithmetic.
Direct indicators of inhibitory functioning, brain damage, aging, and childhood development illustrate the kinds of predictions that derive from the inhibition model; similar cases could have been made for weak inhibition in such other areas as psychopathology and alcohol consumption. For example, numbers are often involved in the repetitive behaviors characteristic of obsessive-compulsive disorder; indeed, an early case study referred to such behaviors as "arithmomania" (cited in Rapoport, 1989, p. 91). The hypothesis that inhibitory mechanisms underlie performance on computation and other number tasks thus predicts that number processing will correlate with diverse measures of inhibitory functioning and dysfunction. Moreover, some available results are consistent with this far-reaching prediction, although the inhibition model suggests much additional research.

General discussion and conclusions

Previous sections of the paper have established the specific capacities of inhibitory mechanisms to explain number processing phenomena. In closing, I briefly examine some broader issues raised by inhibitory associative approaches to number processing and to cognition in general.

Mechanistic explanations

Explaining phenomena in terms of excitatory and inhibitory connections ensures that explanations are mechanistic, rather than alternative verbal descriptions that sometime later must be translated into specific mechanisms. For example, "IF-THEN" and other high-level procedures do not readily suggest concrete mechanisms and may actually discourage a deep understanding of the cognitive mechanisms that permit organisms to act in conditional ways, especially if the higher-level description is wrongly perceived as an adequate explanation. The associative networks presented here provide a specific description of the disinhibitory mechanisms that effect conditional behavior.

Stated somewhat differently, the theoretical constructs of associations, excitation, and inhibition closely approximate the concrete mechanisms used in neuroscience, whereas elaborate and complex systems would be required to model the abstract, higher-level processes characteristic of some psychological theories. The correspondence between psychological and physiological levels of explanation ensures that cognitive theories are constrained to processes presently realizable in terms of known physiological mechanisms. A reciprocal benefit is that theoretical and empirical advances in the cognitive domain are more clearly relevant to neuroscience. For example, if a well-founded cognitive model holds that multiple
representations are activated and mutually inhibit one another, then neuro-imaging might be directed towards the identification of early global activity followed by more focal localized activity. It is less straightforward to identify localizable brain states associated with such high-level psychological constructs or general processes as "IF-THEN" and comprehension.
Unified theory

Inhibitory and associative explanations based on patterns of associations among a few kinds of representations provide unified explanations for diverse phenomena, with specific and concrete models for each. For example, appropriately organized inhibitory connections (specifically, disinhibition) eliminated the need for configural problem representations in number processing. This same mechanism can similarly explain context effects in animal learning (Williams & Clark, 1992) and perhaps a wide variety of other behaviors strongly dependent on context. General theoretical mechanisms and associated research ensure that the study of number processing has broad benefits for psychology, and is relevant to many scholars and important issues. Although the correlates-of-inhibition section focused on number processing, for example, the inhibition hypothesis actually maintains that diverse areas of psychology depend on inhibitory mechanisms of the sort described in this paper, and may therefore demonstrate similar correlations with inhibitory functioning. Even so, number processing may be particularly sensitive to variations in inhibitory functioning, perhaps because of the high levels of interference discussed earlier.
Empirically justified theory

A final strength is closely related to what some have perceived as a weakness
of associative approaches to cognitive theory, namely their relative lack of internal or formal constraints. There are indeed few axiomatic limitations on what can be connected to what or on the strength of the connections, other than those presented by often inadequate data. Moreover, alternative networks may be functionally equivalent at the level of description currently available. We saw earlier, for example, that operand-answer inhibition might be realized in various ways depending on the number and type of intervening or hidden nodes. There is also much latitude associated with the particular weights assigned to connections, such as the forward and lateral inhibition in the model described earlier.

Although challenging, the alternative structures and lack of formal constraints actually constitute a strength of associative models, inasmuch as the human brain
and associated cognitive processes seem to demonstrate the same flexibility. Moreover, the lack of formal constraints necessitates meticulous and detailed observations to realize specific associative networks that are faithful to reality. Alternative approaches that begin with unrealistic formal assumptions about the structure of the number processing system are in danger of building elaborate theories on weak foundations. Weak beginnings may be a particular problem if the assumptions are never challenged empirically. For example, no real-world constraint dictates that visual representations for digits cannot be directly connected to articulatory representations for number words, so it would be imprudent to build a theory on this assumption without first obtaining extensive empirical support. Because they avoid artificial restrictions and properties, associative theories almost by definition are based on rich empirical data, which are needed to delineate specific features of the theory.

In conclusion, empirical and rational considerations indicate that inhibitory mechanisms play major roles in number processing. Inhibitory mechanisms permit associative networks that show selective activation under highly ambiguous circumstances without the need for configural representations. Inhibition can also exaggerate even slight differences in levels of activation for distinct answers, permitting the mechanistic "selection" of responses by suitably arranged associative networks. Number processing thus implicates diverse indicators of inhibitory functioning, including physiological and behavioral indicators, brain damage, aging, and childhood development. More generally, explanations that are mechanistic, that unify distinct areas of psychology, and that are empirically well-founded recommend associative models based on excitation and inhibition as a useful approach to the study of normal and dysfunctional number processing.

ACKNOWLEDGEMENTS

I thank Jamie Campbell for comments on an earlier draft of this paper and for many constructive talks about the role of inhibition in number processing and cognition. The ideas on inhibition in Pavlovian conditioning were developed in collaboration with Doug Williams. This research was supported by Natural Sciences and Engineering Research Council of Canada grant OGP0042736 to James M. Clark, Department of Psychology, University of Winnipeg, Winnipeg, Manitoba, Canada, R3B 2E9.
REFERENCES

Avoli, M. (1988). GABAergic mechanisms and epileptic discharges. In M. Avoli, T.A. Reader, R.W. Dykes, & P. Gloor (Eds.), Neurotransmitters and cortical function: From molecules to mind (pp. 187-205). New York: Plenum.
Bartholini, G., Scatton, B., Zivkovic, B., Lloyd, K.G., Depoortere, H., Langer, S.Z., & Morselli, P.L. (1985). GABA receptor agonists as a new therapeutic class. In G. Bartholini, L. Bossi, K.G. Lloyd, & P.L. Morselli (Eds.), Epilepsy and GABA receptor agonists: Basic and therapeutic research (pp. 1-30). New York: Raven.
Bjorklund, D.F., & Harnishfeger, K.K. (1990). The resources construct in cognitive development: Diverse sources of evidence and a theory of inefficient inhibition. Developmental Review, 10, 48-71.
Burnham, W.M. (1989). The GABA hypothesis of kindling: Recent assay studies. Neuroscience & Biobehavioral Reviews, 13, 281-288.
Campbell, J.I.D. (1987a). Network interference and mental multiplication. Journal of Experimental Psychology: Learning, Memory, and Cognition, 13, 109-123.
Campbell, J.I.D. (1987b). Production, verification, and priming of multiplication facts. Memory & Cognition, 15, 349-364.
Campbell, J.I.D. (1987c). The role of associative interference in learning and retrieving arithmetic facts. In J. Sloboda & D. Rogers (Eds.), Cognitive processes in mathematics (pp. 107-122). Oxford, England: Oxford University Press.
Campbell, J.I.D. (1990). Retrieval inhibition and interference in cognitive arithmetic. Canadian Journal of Psychology, 44, 445-464.
Campbell, J.I.D. (1991). Conditions of error priming in number fact retrieval. Memory & Cognition, 19, 197-209.
Campbell, J.I.D., & Charness, N. (1990). Age-related declines in working-memory skills: Evidence from a complex calculation task. Developmental Psychology, 26, 879-888.
Campbell, J.I.D., & Clark, J.M. (1988). An encoding-complex view of cognitive number processing: Comment on McCloskey, Sokol, and Goodman (1986). Journal of Experimental Psychology: General, 117, 204-214.
Campbell, J.I.D., & Clark, J.M. (1989). The time course of error priming in number-fact retrieval: Evidence for excitatory and inhibitory mechanisms. Journal of Experimental Psychology: Learning, Memory, and Cognition, 15, 920-929.
Campbell, J.I.D., & Graham, D.J. (1985). Mental multiplication skill: Structure, process, and acquisition. Canadian Journal of Psychology, 39, 338-366.
Chapman, J.C. (1915). A study of initial spurt in the case of addition. Journal of Educational Psychology, 6, 419-426.
Clark, J.M. (1991a, June). Inhibitory mechanisms in childhood development and aging. Canadian Psychological Association, Calgary, Canada.
Clark, J.M. (1991b, November). Effects of response uncertainty on semantic retrieval. The Psychonomic Society 32nd Annual Meeting, San Francisco, California.
Clark, J.M., & Campbell, J.I.D. (1991). Integrated versus modular theories of number skills and acalculia. Brain and Cognition, 17, 204-239.
Clark, J.M., & Johnson, C.J. (1990, May). On the central role of inhibitory mechanisms in development. University of Waterloo Conference on Child Development, Waterloo, Canada.
Dalrymple-Alford, E.C., & Budayr, B. (1966). Examination of some aspects of the Stroop color-word test. Perceptual and Motor Skills, 23, 1211-1214.
den Heyer, K., & Briand, K. (1986). Priming single digit numbers: Automatic spreading activation dissipates as a function of semantic distance. American Journal of Psychology, 99, 315-340.
Dustman, R.E., Emmerson, R.Y., & Shearer, D.E. (1990). Electrophysiology and aging: Slowing, inhibition, and aerobic fitness. In M.L. Howe, M.J. Stones, & C.J. Brainerd (Eds.), Cognitive and behavioral performance factors in atypical aging (pp. 153-180). New York: Springer-Verlag.
Dustman, R.E., Snyder, E.W., & Schlehuber, C.J. (1981). Life-span alterations in visually evoked potentials and inhibitory function. Neurobiology of Aging, 2, 187-192.
Goyette, C.H., Conners, C.K., & Ulrich, R.F. (1978). Normative data on revised Conners parent and teacher rating scales. Journal of Abnormal Child Psychology, 6, 221-236.
Graham, D.J., & Campbell, J.I.D. (in press). Network interference and number-fact retrieval: Evidence from children's alphaplication. Canadian Journal of Psychology.
Groen, G., & Parkman, J.M. (1972). A chronometric analysis of simple addition. Psychological Review, 79, 329-343.
Hasher, L., & Zacks, R.T. (1988). Working memory, comprehension, and aging: A review and a new view. In G.H. Bower (Ed.), The psychology of learning and motivation (Vol. 22, pp. 193-225). San Diego, CA: Academic Press.
Hauser, W.A., & Hesdorffer, D.C. (1990). Epilepsy: Frequency, causes, and consequences. Epilepsy Foundation of America.
Jensen, A.R. (1987). Individual differences in the Hick paradigm. In P.A. Vernon (Ed.), Speed of information-processing and intelligence (pp. 101-175). Norwood, NJ: Ablex.
Kagan, J. (1965). Matching Familiar Figures Test. Cambridge, MA: Harvard University.
Kagan, J. (1966). Reflection-impulsivity: The generality and dynamics of conceptual tempo. Journal of Abnormal Psychology, 71, 17-24.
Kornblum, S. (1965). Response competition and/or inhibition in two-choice reaction time. Psychonomic Science, 2, 55-56.
Lachman, R. (1973). Uncertainty effects on time to access the internal lexicon. Journal of Experimental Psychology, 99, 199-208.
LeFevre, J.A., Bisanz, J., & Mrkonjic, L. (1988). Cognitive arithmetic: Evidence for obligatory activation of arithmetic facts. Memory & Cognition, 16, 45-53.
Lloyd, K.G., Bossi, L., Morselli, P.L., Rougier, M., Loiseau, P., & Munari, C. (1985). Biochemical evidence for dysfunction of GABA neurons in human epilepsy. In G. Bartholini, L. Bossi, K.G. Lloyd, & P.L. Morselli (Eds.), Epilepsy and GABA receptor agonists: Basic and therapeutic research (pp. 43-51). New York: Raven.
Marschark, M., & Paivio, A. (1977). Integrative processing of concrete and abstract sentences. Journal of Verbal Learning and Verbal Behavior, 16, 217-231.
Mathieson, G. (1982). Pathology and pathophysiology: Part 1 - Pathology. In J. Laidlaw & A. Richens (Eds.), A textbook of epilepsy (2nd ed., pp. 437-456). Edinburgh: Churchill Livingstone.
McCormick, D.A. (1989). GABA as an inhibitory neurotransmitter in human cerebral cortex. Journal of Neurophysiology, 62, 1018-1027.
McLeod, B.E., & Walley, R.E. (1989). Early interference in a priming task with brief masked targets. Canadian Journal of Psychology, 43, 444-470.
McClelland, J.L., & Rumelhart, D.E. (1981). An interactive activation model of context effects in letter perception: Part 1. An account of basic findings. Psychological Review, 88, 375-407.
McCloskey, M., Sokol, S.M., & Goodman, R.A. (1986). Cognitive processes in verbal-number production: Inferences from the performance of brain-damaged subjects. Journal of Experimental Psychology: General, 115, 307-330.
Miller, K.F., & Paredes, D.R. (1990). Starting to add worse: Effects of learning to multiply on children's addition. Cognition, 37, 213-242.
Miller, K.F., Perlmutter, M., & Keating, D. (1984). Cognitive arithmetic: Comparison of operations. Journal of Experimental Psychology: Learning, Memory, and Cognition, 10, 40-60.
Paivio, A., Clark, J.M., Digdon, N., & Bons, T. (1989). Referential processing: Reciprocity and correlates of naming and imaging. Memory & Cognition, 17, 163-174.
Pirolli, P.L., & Anderson, J.R. (1985). The role of practice in fact retrieval. Journal of Experimental Psychology: Learning, Memory, and Cognition, 11, 136-153.
Posner, M.I. (1988). Structures and functions of selective attention. In T. Boll & B.K. Bryant (Eds.), Clinical neuropsychology and brain function: Research, measurement, and practice (pp. 173-202). Washington, DC: American Psychological Association.
Rapoport, J.L. (1989). The boy who couldn't stop washing. New York: Penguin.
Restle, F. (1970). Speed of adding and comparing numbers. Journal of Experimental Psychology, 83, 274-278.
Roberts, E. (1987). What do GABA neurons really do? They make possible variability generation in relation to demand. The Journal of Mind and Behavior, 8, 591-604.
Roediger, H.L. (1974). Inhibiting effects of recall. Memory & Cognition, 2, 261-269.
Schvaneveldt, R.W., & Staudenmayer, H. (1970). Mental arithmetic and the uncertainty effect in choice reaction time. Journal of Experimental Psychology, 85, 111-117.
Shepard, R.N., Kilpatric, D.W., & Cunningham, J.P. (1975). The internal representation of numbers. Cognitive Psychology, 7, 82-138.
Siegler, R.S. (1988). Strategy choice procedures and the development of multiplication skill. Journal of Experimental Psychology: General, 117, 258-275.
Sokol, S.M., McCloskey, M., Cohen, N.J., & Aliminosa, D. (1991). Cognitive representations and processes in arithmetic: Inferences from the performance of brain-damaged patients. Journal of Experimental Psychology: Learning, Memory, and Cognition, 17, 355-376.
Telzrow, C.F. (1985). The science and speculation of rehabilitation in developmental neuropsychological disorders. In L.C. Hartlage & C.F. Telzrow (Eds.), Neuropsychological aspects of individual differences: A developmental perspective (pp. 271-307). New York: Plenum.
Tipper, S.P. (1985). The negative priming effect: Inhibitory priming by ignored objects. The Quarterly Journal of Experimental Psychology, 37, 571-590.
Tipper, S.P. (1991). Less attentional selectivity as a result of declining inhibition in older adults. Bulletin of the Psychonomic Society, 29, 45-47.
Tipper, S.P., Bourque, T.A., Anderson, S.H., & Brehaut, J.C. (1989). Mechanisms of attention: A developmental study. Journal of Experimental Child Psychology, 48, 353-378.
van Harreveld, A. (1939). The effect of asphyxia on reflex inhibition. American Journal of Physiology, 128, 13-18.
van Liere, E.J. (1942). Anoxia: Its effect on the body. Chicago: University of Chicago Press.
Von Bekesy, G. (1967). Sensory inhibition. Princeton, NJ: Princeton University Press.
Walley, R.E., & Weiden, T.D. (1973). Lateral inhibition and cognitive masking: A neuropsychological theory of attention. Psychological Review, 80, 284-302.
Williams, D., & Clark, J.M. (1992). Inhibitory associative mechanisms in Pavlovian occasion setting. Manuscript submitted for publication.
Winkelman, J.H., & Schmidt, J. (1974). Associative confusions in mental arithmetic. Journal of Experimental Psychology, 102, 734-736.
Zbrodoff, N.J., & Logan, G.D. (1986). On the autonomy of mental processes: A case study of arithmetic. Journal of Experimental Psychology: General, 115, 118-130.
Appendix
Simulation Program

fi = -.2: REM Op-Ans Forward Inh (-) / Exc Parameter
di = -.2: REM Co-op Disinhibition (-) / Exag Exc Param.
li = -.2: REM Ans-Ans Lateral Inhibition Parameter
a4 = 10: a8 = 10: REM Activation Levels for Operands
REM a3 and a6 (the other operands of 24 = 3 x 8 = 4 x 6) default to 0

DEF fnmax (x, y)
  IF x >= y THEN
    fnmax = x
  ELSE
    fnmax = y
  END IF
END DEF

REM Net input to an answer from operand o1 given co-operand o2:
REM excitation (o1) plus forward inhibition (fi), which is reduced,
REM i.e. disinhibited, in proportion to co-operand activation (di)
DEF fnin (o1, o2) = o1 + fnmax(o1 + o2 * di, 0) * fi

CLS : PRINT "FI ="; fi; " DI ="; di; " LI ="; li: PRINT
PRINT "Cy  Answ24  Answ32"
FOR c = 0 TO 15
  PRINT USING "##  ###.###  ###.###"; c; a24; a32
  a24 = a24 + fnin(a4, a6) + fnin(a6, a4) + fnin(a3, a8) + fnin(a8, a3)
  a32 = a32 + fnin(a4, a8) + fnin(a8, a4)
  a24pre = a24: REM 32 is inhibited by 24's pre-lateral-inhibition level
  a24 = fnmax(a24 + li * a32, 0)
  a32 = fnmax(a32 + li * a24pre, 0)
NEXT c
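Run under a QuickBASIC-style interpreter, the program prints the activation levels of 24 and 32 on each of 16 processing cycles. With the parameter values shown, 32 accrues slightly more net input than 24 on every cycle, because the forward inhibition on 32 is disinhibited by both of the presented operands; the lateral-inhibition step then amplifies this small advantage into the growing separation plotted in Figure 6.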
Chapter 12

COGNITIVE NUMBER PROCESSING: AN ENCODING-COMPLEX PERSPECTIVE
Jamie I. D. Campbell
University of Saskatchewan

James M. Clark
University of Winnipeg
Summary

According to the encoding-complex approach (Campbell & Clark, 1988; Clark & Campbell, 1991), numerical skills are based on a variety of modality-specific representations (e.g., visuo-spatial and verbal-auditory codes), and diverse number-processing tasks (e.g., numerical comparisons, calculation, reading numbers, etc.) generally involve common, rather than independent, cognitive mechanisms. In contrast, the abstract-modular theory (e.g., McCloskey, Caramazza, & Basili, 1985) assumes that number processing is comprised of separate comprehension, calculation, and production subsystems that communicate via a single type of abstract quantity code. We review evidence supporting the specific-integrated (encoding-complex) view of number processing over the abstract-modular view, and report new experimental evidence that one aspect of number processing, retrieval of simple multiplication facts, involves non-abstract, format-specific representations and processes. We also consider implications of the encoding-complex hypothesis for the modularity of number skills.

Introduction

In this paper we present evidence for an encoding-complex view of cognitive number processing (see also Campbell & Clark, 1988; Clark & Campbell, 1991). The central assumption of our approach is that number concepts and skills are based on modality- and format-specific mental codes that are interconnected in a
complex and highly integrated associative structure. Furthermore, the specific symbol manipulation skills reflected in different numerical tasks (e.g., magnitude comparison, number-fact retrieval, number naming, etc.) are generally based on activation of learned associative relations involving common representational structures. The encoding-complex view can be contrasted with abstract-modular theories, which view number skills as comprised of specialized processing modules that communicate by way of a single type of abstract, semantic number representation. The most developed version of the abstract-modular view of number processing is the theory proposed by M. McCloskey, A. Caramazza, S. Sokol, and their colleagues (e.g., McCloskey, Caramazza, & Basili, 1985; McCloskey, Sokol, & Goodman, 1986; Sokol, Goodman-Schulman, & McCloskey, 1989; Sokol, McCloskey, Cohen, & Aliminosa, 1991). We will argue that the two central assumptions of the abstract-modular theory -- that the meaning of numbers is based on a single type of abstract quantity code, and that number processing entails a small number of separate "modules" -- underestimate the complexity of number processing. Because these assumptions influence the direction of theoretical inquiry, determine and validate the empirical methods used, and are at the heart of the logic by which the modularity hypothesis permits functional diagnosis of acalculia disorders, they deserve careful scrutiny.

Specific-integrated versus abstract-modular views of number processing

The model of number processing presented by McCloskey et al. (1985) proposes three functionally distinct subsystems corresponding to number "comprehension," "calculation," and "production" modules (see also Sokol et al., 1989; Sokol et al., 1991). The primary function of the comprehension system is to recode number stimuli encountered in various formats (e.g., digits or number words) into an abstract, semantic code that is the same for all surface formats. These abstract codes provide the basis for subsequent processing. For example, the abstract codes provide input to a separate calculation system, which stores associative number facts and computes arithmetic procedures and relations. The abstract output of the comprehension and calculation systems can also be input to the verbal and Arabic production subsystems. The primary function of the production system is to translate the abstract representations of numbers into specific number codes, such as digits or written or spoken number words.

The basic assumptions of the encoding-complex view are incompatible with this strong form of modularity. Instead, we propose that the collections of tasks denoted by the terms comprehension, calculation, and production primarily utilize
common cognitive representations and resources (e.g., visual and verbal working-memory systems), although the tasks may differ to some extent in terms of specific processing and response requirements. In other words, number comprehension, calculation, and production are coextensive skills rather than separate processing modules.

The emphasis on abstract codes in the McCloskey et al. model also is incompatible with the encoding-complex approach. Specifically, McCloskey et al. (1986; Sokol et al., 1989) claim that different surface forms of numbers (e.g., digits and number words) are translated into a uniform modality-free semantic code that entails an abstract representation for each number, plus the power of ten appropriate to its place value. In their notation, the quantity ninety, for example, is represented semantically by the abstract code {9}10EXP1, and the quantity eight by {8}10EXP0, irrespective of the manner in which the quantity was encoded (e.g., spoken or visually presented) or computed. In contrast, the most basic assumption of the encoding-complex view is that numbers are represented in terms of modality-specific mental codes. Verbal or word codes include articulatory and auditory codes in most people, visual and written number-word codes in literate individuals, and unique codes in various specific groups (e.g., sign-language codes for numbers). In addition to verbal codes, number processing implicates such nonverbal codes as visual and written codes for digits, imaginal analogue codes for magnitude (e.g., number lines), and combined visual-motor representations (e.g., counting on fingers; using an abacus). Being associatively connected, the various specific codes can activate one another to produce a multi-component representational structure that we call an encoding complex.

In this paper, we review evidence supporting a role for modality-specific number representations (for a more thorough discussion see Clark & Campbell, 1991), and report a study of simple multiplication showing clearly that retrieval varies with the surface form in which a problem is presented. We then shift the focus to the hypothesis of modularity in number processing, and argue that the sharp separation of number comprehension, calculation, and production in the abstract-modular theory fails to recognize the strong interdependence and relatedness of these facets of number processing.
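As a purely illustrative sketch (our construction, not part of the McCloskey et al. model), the abstract-code notation can be mimicked by recoding a digit string into one quantity code per place value; the skipping of zero digits and the output format here are assumptions.

n$ = "90": REM surface form: a digit string
FOR i = 1 TO LEN(n$)
  d = VAL(MID$(n$, i, 1)): REM digit value at position i
  p = LEN(n$) - i: REM power of ten for that place
  IF d <> 0 THEN PRINT "{" + LTRIM$(STR$(d)) + "}10EXP" + LTRIM$(STR$(p))
NEXT i
REM Prints {9}10EXP1 for "90"; on the abstract-modular view the same
REM code would result from any surface form (spoken, written, or digits).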
Specific versus abstract number codes

A variety of internal codes appears to be necessary to explain how people represent number words, digits, and other quantitative stimuli. For example, many number-processing tasks require subjects to temporarily store information while performing some other mental operation (e.g., Campbell & Charness, 1990). This
storage of intermediate results implicates working-memory processes, which are assumed to involve temporary retention of verbal or visuo-spatial representations (Baddeley, 1986; Baddeley & Hitch, 1974). The disruptive effects of concurrent articulation in mental addition (Hitch, 1978) and counting tasks (Logie & Baddeley, 1987; see also Healy & Nairne, 1985) implicate verbal working-memory processes (i.e., phonological and articulatory codes) in mental arithmetic. Other research demonstrates the importance of visual-spatial codes in working memory (Baddeley, 1986), and also provides evidence of visuo-spatial codes in calculation. For example, Stigler (1984; see also Hatano, Miyake, & Binks, 1977) proposed that skilled users of the "mental abacus" employ a visual representation of the abacus to perform mental arithmetic, and Hayes (1973) obtained evidence for the use of visual imagery in a large number of simple calculation tasks. Beyond evidence for visual and verbal number codes in working memory, experimental research also implicates modality- or format-specific processes in a variety of basic number comprehension and calculation tasks.

Format-specific phenomena in number processing

Effects of surface format (e.g., digits vs. number words) on number processing potentially contradict the abstract-code hypothesis, because processing based on abstract codes should be the same irrespective of input format. In contrast, the encoding-complex view implies that different surface forms can vary in their capacity to activate the specific internal codes that mediate performance, and therefore the representational basis of performance can change with changes in surface form. Clark and Campbell (1991) describe a substantial body of evidence indicating that the internal mechanisms mediating performance in a variety of number tasks vary with the format of stimulus presentation. For example, Vaid and Corina (1989) demonstrated that both stimulus format and prior experience determine how numerical comparisons are performed (see also Foltz, Poltrock, & Potts, 1984; Tzeng & Wang, 1983). Vaid and Corina compared Stroop interference in speeded magnitude judgments between digits, English words, and number signs in American Sign Language (ASL)-English bilinguals. Stimuli were presented either in the left visual field (LVF) or the right visual field (RVF). Overall, there was greater interference when number words and signs were initially processed in the RVF, whereas interference was greater for digits presented to the LVF (see also Vaid, 1985). Moreover, for number words and signs, interference was greater in the RVF for the more skilled language (i.e., English or ASL), but was greater in the LVF for the less skilled language.
Vaid and Corina concluded that magnitude judgments about digits versus words involve internal representations that are differentially accessed in the right and left hemispheres respectively, and that degree of language skill affects lateralization of magnitude judgments. Specifically, magnitude judgments for number words appear to be more dependent on left-hemisphere functions in right-handers. Thus, processing of numerical magnitude appears to depend on format, apparently because left-hemisphere codes (e.g., possibly verbal representations) and right-hemisphere codes (e.g., visuo-spatial representations) may be differentially engaged in judgments about magnitude as a function of format (e.g., digits vs. number words). Clark and Campbell (1991) also reviewed evidence that processing of odd-even status varies with stimulus format (e.g., Hines, 1990; Klein & McInnes, 1988; Shepard & Gale, 1982), which supports the view that knowledge and use of the odd-even relationship is also mediated by multiple forms of mental codes.

Effects of number format are particularly relevant in the present context because proponents of the abstract-modular theory use format-based dissociations in brain-damaged patients as a primary method for deducing the architecture of number processing and for inferring the functional locus of a patient's deficit. The theory proposed by McCloskey et al. (1985) makes a strong distinction, for example, between comprehension deficits (i.e., failure to access the correct semantic-quantity code) and calculation deficits (i.e., failure to perform the correct computation given input of the appropriate abstract code). For example, if a calculation can be performed successfully given Arabic digits, but cannot be performed when the stimuli are presented using number words, then one can conclude that the comprehension subsystem for words must be impaired. This follows because unimpaired calculation for digits confirms that the abstract calculation process is intact (cf. McCloskey et al., 1985, p. 178). This logic is valid, however, only when it is assumed that processes "downstream" of stimulus encoding are constant irrespective of format. If different number formats can differentially activate internal number representations and associations that are functional in performance, as the encoding-complex hypothesis proposes, then mechanisms of numerical judgment, calculation, and production can vary as a function of number format. This would, at the very least, restrict the generality of using format-related dissociations to infer the functional locus of a numerical deficit. It is important to note that the critical issue is not whether abstract number codes exist, but rather whether calculation and production can be mediated by multiple codes whose roles vary with format (e.g., visual codes may be especially salient with digit stimuli, whereas activation of phonological codes may be more salient with number words). If such is the case, then the logic for interpreting
performance dissociations described above is invalid whether or not some form of abstract code also exists.
An experimental demonstration of format effects in basic arithmetic

In the following sections, we present experimental evidence that retrieval of simple multiplication facts (e.g., 3 x 6 = ?) is sensitive to number format (digit vs. number-word stimuli). These data have important implications for the abstract-code hypothesis because number-fact retrieval unambiguously implicates the abstract calculation module in the McCloskey et al. (1985; Sokol et al., 1989; Sokol et al., 1991) model and thus should not be sensitive to surface format. In contrast, the encoding-complex assumption of modality-specific codes implies that retrieval processes can be format specific. For example, digits and number words could differ in their capacity to activate visuo-spatial and verbal-code representations of problems. This multiple-code view is consistent with the findings of Kashiwagi, Kashiwagi, et al. (1987), who studied Japanese aphasics with impaired performance for simple multiplication. Despite extensive practice, the patients could not relearn multiplication with verbal presentation and responses. They did, however, learn to generate the multiplication facts given visual presentation combined with written responses. Such findings support the theory that the representations underlying retrieval of number facts can involve multiple codes that are differentially involved as a function of surface form.

Although there are performance effects due to format, some researchers have concluded that these effects could be due to differences in encoding stimuli in different formats (e.g., Marsh & Maki, 1976; McClain & Shih Huang, 1982). Based on such possibilities, proponents of the abstract-modular view have argued that results such as those of Kashiwagi et al. (see also Gonzalez & Kolers, 1982, 1987) are not compelling with respect to the issue of format-specific processes in calculation (Sokol et al., 1989; Sokol et al., 1991). To clarify this issue, the following study examined number-fact retrieval (simple multiplication) with digit and word stimuli. In particular, we examined whether surface format interacted with problem size and other indices of retrieval difficulty. Such interactions would support the specific-code hypothesis that number-fact retrieval can involve format-specific representations. We also examined error patterns as a function of presentation format. A previous study by Sokol et al. (1991) examined format effects in the simple-multiplication errors of a single acalculia patient, PS, and found similar error patterns when digits, number words, or dots were used to represent numbers. The current study, which involved a large number of normal adult subjects, may permit a more sensitive analysis. Finally,
we conducted reanalyses of the simple multiplication errors of Sokol et al.'s (1991) patients GE and PS, and further analyses of the present multiplication experiment, which suggest that number-reading and number-fact retrieval are integrated, interactive processes, rather than functionally independent processes as the abstract-modular theory assumes.
Method

The subjects were 80 undergraduate students (36 females, 44 males) at the University of Western Ontario who ranged in age from 18 to 28 years. Stimuli were the multiplication problems in the range from 2 x 2 through 9 x 9, presented in digit format or in English number-word format using either upper or lower case letters. Problems were presented horizontally with the two operands separated by a lower case "x" with flanking spaces. Digit-format problems were 2 centimeters in length, whereas word-format problems ranged in length from 2.5 to 3.5 centimeters.

Subjects received four blocks of 62 trials, with word format used for odd-numbered trials and digit format for even-numbered trials. The problems tested yield 31 different products, with five of the products being correct answers to two combinations of operands (i.e., 12, 16, 18, 24, and 36). The 31 digit and 31 word trials in each block included the same shared-product problems, and both sets included problems involving all 31 products. The five shared-product problems excluded in the first block were exchanged for their same-product counterparts in Block 2. Blocks 3 and 4 replicated the problems in Blocks 1 and 2, respectively. The specific set of shared-product problems tested in the first block was counterbalanced within each of two sets of 32 subjects, and was chosen arbitrarily for the remaining 16 subjects. For each subject the order of problems in each block was pseudo-random, with the constraint that the digit trial and the word trial involving the same correct product were separated by at least 20 trials within a block. The order of operands for non-tie problems was determined randomly and independently for word and digit problems in Block 1 and then alternated across blocks.

Subjects were instructed to state the correct answer to each problem as quickly and accurately as possible. A computer presented the stimuli and recorded response times (RTs) to ±1 ms. For each trial, the prompt "words" or "digits" appeared briefly at the center of the computer screen, and was followed by a fixation dot for 1.5 s. The problem then appeared with the multiplication sign at fixation. Timing began when the problem appeared and was stopped when the subject's spoken response triggered a voice-activated relay. An experimenter recorded the response given on each trial.
Effects of format on speed and accuracy of simple multiplication
RT summary. Table 1 presents mean correct RTs for "easy" and "difficult" problems presented in digit and word formats.¹ Problem difficulty was determined from the table of normative multiplication performance reported by Campbell and Graham (1985, Appendix B), and the assignment of problems to easy and difficult cells was the same as that used by Campbell (1987, 1991). A two-by-two repeated-measures analysis of variance (ANOVA) with factors of format and difficulty confirmed that the difficult problems produced longer RTs relative to the easy problems, F(1, 79) = 210.3, MSe = 24727.6, p < .001, and showed that word problems required longer RTs than digit problems, F(1, 79) = 513.3, MSe = 13261.3, p < .001. Although the pattern of RTs across the set of 64 problems was very similar for digit and word formats (r = .913), the interaction of format and difficulty was highly significant, F(1, 79) = 101.4, MSe = 2710, p < .001. The interaction shows that the increase in RT with words relative to digits was greater for the difficult problems.

Table 1. Mean RT and %E for Easy and Difficult Problems as a Function of Presentation Format

Problem Type        Words            Digits
RT (SD)
  Easy               950 (147)        717 (96)
  Difficult         1264 (289)        914 (220)
  Mean              1107              816
%E (SD)
  Easy               5.0 (6.2)        2.1 (2.1)
  Difficult         19.9 (12.5)      14.5 (10.1)
  Mean              12.5              8.3

Note. n = 80 subjects.

¹ There were no apparent effects of upper vs. lower case letters, and this factor is not considered in the following analyses.
Error summary. Subjects made a total of 861 errors with digits and 1329 errors with words, an increase of 54% from digit to word format. The difference between mean errors per subject for digits (10.8) and words (16.6) was highly significant, SE = 1.02, z = 5.71, p < .001. The majority of errors were commission errors for both words (1236 errors, or 93%) and digits (822 errors, or 95%). Although commission errors accounted for the bulk of errors, omission errors (i.e., failures to produce an intelligible response before triggering the voice key) were much more common with the word format (93 errors) than the digit format (39 errors), an increase of 138% versus the 50% increase in commission errors. The mean number of omission errors per subject was 1.16 for words and 0.49 for digits, SE = .23, z = 2.96, p < .001.

A two-by-two (Easy vs. Difficult by Words vs. Digits) repeated-measures ANOVA on the mean rates of commission errors (see Table 1) confirmed a higher error rate for word problems than digit problems, F(1, 79) = 28.4, MSe = 49.1, p < .001, and verified a higher error rate for difficult problems relative to the easy problems, F(1, 79) = 220.8, MSe = 68.1, p < .001. The interaction of format and difficulty, F(1, 79) = 8.3, MSe = 15.1, p = .005, indicated that the increase in errors due to the word format was greater for the more difficult problems, although error rates across problems in each format were highly correlated (r = .895).

The encoding-complex view accounts for format-related differences in RTs and errors in terms of the distinct numerical processing associated with digit and word formats, including differential access to number facts, format-specific generalization effects to related number codes, and distinct sorts of response priming effects. For example, multiplication facts may involve associative networks among digit-like codes, which are more readily accessed by the digit format than by the word format. Weaker word-format associations or increased word-format competition could readily produce increases in RTs and in both commission and omission errors. Several of these mechanisms are described more fully in the following sections of the paper. To begin, we examine the patterns of RT and errors across problems and format in more detail using multiple regression techniques. Based on these results, we argue that format directly influences associative retrieval processes. Subsequent to this, we show that format produces robust effects on the patterns of specific multiplication errors, and we argue that these effects further support the encoding-complex approach and challenge the abstract-modular theory.
Multiple regression analyses of RTs and errors

The dependent variables for the multiple regression analyses were the mean correct RT and the commission-error rate for each of the 64 problems in each format. Only a subject's first correct trial in each format for each problem contributed to the RT means. There were four predictor variables used in the regression: 1) problem Size (answers < 20 coded as -1, ≥20 and <40 coded 0, ≥40 coded +1); 2) Fan (shared-product problems [n = 18] coded +1 and unique-product problems [n = 46] coded 0); 3) Length (the number of character spaces occupied by a problem in the word format); and 4) Ties ("tie" problems such as 2 x 2 and 3 x 3 coded +1 and non-ties coded 0). Length and Ties were included to assess possible contributions of stimulus reading or encoding factors to between-format differences. Size provides an index of problem difficulty (e.g., Campbell & Oliphant, this volume) that is not confounded with the tie/non-tie factor (r = .03). In contrast, the breakdown into "easy" and "difficult" multiplication problems used by Campbell (1987, 1991) places all but one of the tie problems (8 x 8) into the easy set. Shared-product problems might show a fan effect, an interference phenomenon apparently due to the number of irrelevant associations "fanning out" from a concept in memory (e.g., Pirolli & Anderson, 1985); thus, Fan was assumed to potentially index a source of retrieval interference. In this case, problems with shared products should tend to be more difficult once other factors contributing to difficulty (i.e., factors estimated by product size) are taken into account.

Table 2 presents the results of separate multiple regression analyses of the word-format data, the digit-format data, and the differences between formats (word data minus digit data). A forward-stepwise procedure was used for entering the predictors (Size, Fan, Ties, and Length) into the equations. The criterion for a variable's entry was that beta differed from 0 with α < .05.

Word format. In the regression of word-format RTs, all four variables entered the equation and produced an R² of .710 [F(4, 59) = 39.57, MSe = 10070.90, p < .001].² In the analysis of word-format errors, Size, Fan, and Ties each accounted for significant variability [R² = .493, F(3, 60) = 19.47, MSe = 56.61, p < .001]. Problem length apparently did not affect errors in the word-format condition, suggesting that the relation between Length and word-format RT primarily reflects stimulus reading time rather than a general encoding difficulty for longer number words. The overall performance advantage for tie problems is a well-established effect in both multiplication and addition (e.g., see Campbell & Oliphant, this volume). Fan entered both the RT and error analyses with positive beta weights, indicating that the shared-product problems tended to be slightly more difficult once other difficulty factors were partialled out.

² All R²'s are adjusted for the number of predictors in the equation.
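The predictor coding is mechanical, so it can be made concrete with a short sketch. The following is our reconstruction in Python, not the original analysis code; in particular, the exact character-count convention assumed for Length (operand words separated by " x ") is our assumption.

```python
# Reconstruction of the four predictors for the 64 problems (2 x 2 ... 9 x 9).
import numpy as np

WORDS = {2: "two", 3: "three", 4: "four", 5: "five",
         6: "six", 7: "seven", 8: "eight", 9: "nine"}
problems = [(a, b) for a in range(2, 10) for b in range(2, 10)]

# Group problems by product to identify shared products (a tie counts once)
pairs_by_product = {}
for a, b in problems:
    pairs_by_product.setdefault(a * b, set()).add(frozenset((a, b)))

def predictors(a, b):
    p = a * b
    size = -1 if p < 20 else (0 if p < 40 else 1)       # Size: -1 / 0 / +1
    fan = int(len(pairs_by_product[p]) > 1)             # shared product -> 1
    ties = int(a == b)                                  # tie problem -> 1
    length = len(f"{WORDS[a]} x {WORDS[b]}")            # character spaces
    return size, fan, ties, length

X = np.array([predictors(a, b) for a, b in problems], dtype=float)
assert int(X[:, 1].sum()) == 18                         # 18 shared-product items
X = np.column_stack([np.ones(len(X)), X])               # intercept column
# Given the 64 mean RTs in a vector rt, raw weights like those in Table 2
# would follow from:  beta, *_ = np.linalg.lstsq(X, rt, rcond=None)
```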
Table 2. Raw Regression Weights from Problem-based Multiple Regression Analyses of RT and %E

                    Words         Digits        Words-Digits
RT
  Size              159.6***      124.8***       36.7**
  Fan                86.1**        46.2*          38.1*
  Ties             -268.0***      -93.1***      -175.1*
  Length             32.7*          ---           27.6***
  Intercept         942.5         873.7          111.7
%E
  Size                8.9***        6.4**          1.5*
  Fan                 3.9**         ---             ?
  Ties               -9.7***         ?             ---
  Length              ---           ---            ---
  Intercept          13.4          10.1            3.7

Note. *** = p < .001, ** = p < .01, * = p < .05. Variables that did not reach the .05 level of significance were not included in equations (---). R²'s adjusted for number of predictors. n = 64 problems. The two weights marked ? are not recoverable from the printed source.

Digit format. The regression indicated significant effects of Size, Fan, and Ties on digit-format RTs [R² = .660, F(3, 60) = 41.80, MSe = 4909.00, p < .001]. Length (i.e., the number of characters in the corresponding word-format problem) was not a significant predictor of digit-format RTs. In the analysis of digit-format errors, Size and Ties entered the equation [R² = .454, F(2, 61) = 27.18, MSe = 40.91, p < .001]. Neither Length nor Fan entered in the error analysis, although the partial correlation for Fan was in the expected direction (.163), despite a negative zero-order correlation (-.153).
Word format minus digit format. Across the 64 problems the RT and error differences as a function of format were significantly correlated (r = .383, p < .01), suggesting that format effects on RT and errors were mediated to some extent by common factors. In the analysis of RT differences, the coefficients for Size, Fan, Ties, and Length were all significant [R² = .624, F(4, 59) = 27.19, MSe = 3116.69, p < .001]. Specifically, the magnitude of the word-format deficit increased with problem size and tended to be larger for the shared-product problems. These factors emerged over the tendency for RT differences to increase with the length of word problems (Length) and to be reduced for problems with repeated multipliers (Ties). In the analysis of error differences, Size and Fan each accounted for independent variability [R² = .103, F(2, 61) = 4.63, MSe = 19.33, p = .013].³ Because Length and Ties were not significant predictors of error differences, the prediction of RT differences by Length and Ties likely reflects only differences in time to scan or read number words relative to digits, as opposed to processes related specifically to retrieval of multiplication facts.

³ In a more recent study comparing word and digit multiplication, effects associated with the Fan variable did not emerge as clearly as in the present study. This discrepancy may be due to procedural differences between the experiments; specifically, in the present study, problems sharing the same product were exchanged on alternating blocks of trials so that each product would be encountered only once within a block. In the more recent study, different problems with the same product were allowed to recur within the same block.

The preceding analyses show that format had substantial effects on performance, and the interactions of format with problem size or difficulty suggest, more specifically, that retrieval processes varied as a function of format. Before discussing the implications of these findings in more detail, we present results of detailed analyses of the specific errors produced in each format. Format effects on the patterns of specific errors also support our view that calculation processes in arithmetic depend on the format of the problem.

Effects of format on specific errors

Table 3 shows commission errors divided into several mutually exclusive categories, which are described in order of classification priority. An error was classified as a cross-operation error if the response was the correct answer for the corresponding addition problem. A naming error occurred if the response consisted entirely of one or both of the problem's operands (4 x 8 = 8 or 4 x 8 = 48). Errors not classified as cross-operation or naming errors were classified as table-related if the error was a correct answer to another single-digit multiplication
problem in the same times table (4 x 8 = 36). A table-unrelated error was a correct answer to a single-digit multiplication problem in a different times table (4 x 8 = 42). Miscellaneous errors were the remaining commission errors that did not fall into any of the preceding categories (4 x 8 = 34).
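Because the categories are applied in a fixed priority order, the taxonomy is effectively an algorithm. The following sketch (ours, not the authors' scoring code) makes the priorities explicit; the test for naming errors assumes that "entirely of one or both operands" means the response's digits are drawn only from the problem's operands.

```python
# Error taxonomy for a x b, applied in classification priority order.
def classify_error(a, b, response):
    if response == a + b:
        return "cross-operation"          # e.g., 4 x 8 = 12
    if set(str(response)) <= {str(a), str(b)}:
        return "naming"                   # e.g., 4 x 8 = 8 or 4 x 8 = 48
    same_table = {a * n for n in range(2, 10)} | {n * b for n in range(2, 10)}
    if response in same_table:
        return "table-related"            # e.g., 4 x 8 = 36
    if response in {m * n for m in range(2, 10) for n in range(2, 10)}:
        return "table-unrelated"          # e.g., 4 x 8 = 42
    return "miscellaneous"                # e.g., 4 x 8 = 34

# The worked examples from the text all classify as described:
assert classify_error(4, 8, 12) == "cross-operation"
assert classify_error(4, 8, 48) == "naming"
assert classify_error(4, 8, 36) == "table-related"
assert classify_error(4, 8, 42) == "table-unrelated"
assert classify_error(4, 8, 34) == "miscellaneous"
```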
Table 3. Rates of Simple-Multiplication Error Types for Number-Word and Arabic-Digit Presentation Formats

                        Words              Digits
Error Type           f        p         f        p
Cross operation      21      .02        22      .04
Naming               90      .05        16      .02
Table related       875      .72       621      .77
Table unrelated     145      .12       110      .12
Miscellaneous       105      .09        53      .05
Total              1236     1.00       822     1.00

Note. f = frequency. p = mean proportion of commission errors per subject. n = 80.
Cross-operation errors were relatively infrequent and occurred with approximately equal frequency in each format (22 vs. 21), although there was weak evidence that they accounted for a higher proportion of digit-format errors (mean proportion per subject of .02 for words versus .04 for digits, SE = .012, z = -1.92, p = .055). This latter effect presumably reflects the increased incidence of other errors with the word format. The most dramatic increase in error types occurred for naming errors, which were 463% higher for words (90 errors) than digits (16 errors). The mean number of naming errors was higher for words (1.13) than digits (.20), SE = .246, z = 3.76, p < .001, as was the mean proportion of naming errors (.06 versus .03), SE = .012, z = 2.40, p < .02. Naming errors are related to another phenomenon, operand intrusions, and are discussed in the next section. Both naming and operand-intrusion errors implicate format-sensitive interactions of number reading and number-fact retrieval processes that challenge the abstract-modular view.
Miscellaneous errors occurred 98% more often with the word than the digit format (105 vs. 53 errors). The mean frequency of these errors was higher for words (1.31) than digits (.66), SE = .178, z = 3.65, p < .001, as was the mean proportion (.09 vs. .05), SE = .012, z = 3.10, p < .001. Although we have no specific explanation for miscellaneous errors, their increased incidence supports the conclusion that word-format problems introduce elements into mental multiplication that are not present with the digit format.

The percentage increase in errors was higher for naming and miscellaneous errors than for errors from the times tables (table-related plus table-unrelated). Nonetheless, the table errors were the most common errors and increased a robust 40% from digits (731 errors) to words (1020 errors). Table-related products accounted for the largest part (61%) of the increase in errors for word relative to digit problems, and the mean number of table-related errors per subject was higher in the word format (10.9 with words vs. 7.8 with digits; SE = .663, z = 4.79, p < .001).

Further analyses of table-related errors pointed to additional specificity in the higher error rate for the word format, and suggested one factor that might contribute to the effect of format on table-related errors. In Table 4, table-related errors are classified according to whether the error was related to the minimum operand only (e.g., 3 x 7 = 24), to the maximum operand only (3 x 7 = 28), or to both operands (4 x 8 = 24). As Table 4 shows, the higher frequency of table-related errors was due mainly to an increase in errors related to the maximum operand (352 for words vs. 197 for digits), an increase of 79%. Errors related to the minimum operand demonstrated only a 16% increase. Indeed, the mean number of max-related errors was substantially higher with the word format (4.4 for words vs. 2.5 for digits, SE = .41, z = 4.74, p < .001), whereas the evidence was less clear that the mean number of min-related errors differed between formats (5.0 for words vs. 4.3 for digits, SE = .36, z = 1.88, p = .06).⁴ The mean proportion of table-related errors that were related to the maximum operand also was higher with the word-format condition than with digits (.43 vs. .32, SE = .034, z = 3.07, p < .001).
⁴ In their commentary following this chapter, McCloskey, Macaruso, and Whetstone correctly point out that the higher proportion of max-related errors with words is an artifact of the higher rate of intrusion errors observed with the word format (see below). This occurs because an operand intrusion in the units position (e.g., 9 x 6 = 36) can yield an error that is either min-related or max-related, whereas an intrusion into the decade position (e.g., 6 x 9 = 63) can only be max-related. Thus, the higher rate of intrusions for word-format problems, relative to digit problems, also results in relatively more max-related errors.
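For concreteness, here is our sketch of the two codings used in Tables 4 and 5: whether a table-related error is related to the minimum operand, the maximum operand, or both, and (anticipating Table 5 below) the operand "distance" that generates the error. The function names are ours.

```python
# Codings for a table-related error on the problem a x b.
def relatedness(a, b, err):
    lo, hi = min(a, b), max(a, b)
    min_rel = err % lo == 0 and 2 <= err // lo <= 9   # err in lo's times table
    max_rel = err % hi == 0 and 2 <= err // hi <= 9   # err in hi's times table
    if min_rel and max_rel:
        return "both"          # e.g., 4 x 8 = 24 (24 = 4 x 6 = 3 x 8)
    if min_rel:
        return "min"           # e.g., 3 x 7 = 24 (24 = 3 x 8)
    if max_rel:
        return "max"           # e.g., 3 x 7 = 28 (28 = 4 x 7)
    return None                # not a table-related error

def distance(a, b, err):
    """Smallest operand change that turns a x b into a problem whose correct
    answer is err; e.g., 3 x 7 = 28 has distance 1 (replace 3 with 4)."""
    d = [abs(m - a) for m in range(2, 10) if m * b == err]
    d += [abs(n - b) for n in range(2, 10) if a * n == err]
    return min(d)

assert relatedness(4, 8, 24) == "both"
assert relatedness(3, 7, 28) == "max" and distance(3, 7, 28) == 1
assert relatedness(4, 9, 28) == "min" and distance(4, 9, 28) == 2
```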
Table 4. Rates of Errors Related to the Minimum Operand Only, the Maximum Operand Only, or Both Operands

                        Words              Digits
Related Operand      f        p         f        p
Minimum             398      .44       344      .54
Maximum             352      .43       197      .32
Both                125      .13        80      .14
Total               875     1.00       621     1.00

Note. f = frequency. p = mean proportion of table-related errors per subject. n = 80.
Table 5. Rates of Table-Related Errors as a Function of Format and Distance of the Unrelated Operand

                        Words              Digits
Distance             f        p         f        p
±1                  480      .56       408      .68
±2                  293      .33       173      .24
>2                  102      .11        40      .08
Total               875     1.00       621     1.00

Note. f = frequency. p = mean proportion of table-related errors per subject. n = 80.
Errors related to both operands, an error type that cannot be unambiguously associated with the minimum or maximum operand, increased by 56%.

A second phenomenon that demonstrates the specificity of the increase in table-related errors concerns the distance between the correct and incorrect operands. Table 5 shows table-related errors as a function of the "distance" of the
unrelated operand (±1, ±2, >2). For example, the error 3 x 7 = 28 corresponds to a distance of 1, because replacing 3 with 4 (a difference of 1) yields the error response of 28. The error 4 x 9 = 28 represents a distance of 2, because replacing the 9 with 7 produces 28. The data in Table 5 show a systematic increase with distance in the magnitude of the format effect. Table-related errors increased only 18% for adjacent operands (480 vs. 408 errors), 69% for a distance of two (293 vs. 173 errors), and 155% for distances greater than two (102 vs. 40 errors). The mean proportion of related errors two or more operand units distant was substantially higher with the word problems (.44 for words vs. .32 for digits, SE = .031, z = 3.81, p < .001).

Interpretation of the preceding table-related effects is presently very tentative. One explanation is that number-fact retrieval with word stimuli is less sensitive to numerical magnitude or proximity than with digit stimuli. The reasoning is that max-related errors, on average, will be numerically more distant from the correct answer than will be min-related errors. A similar explanation could apply to the operand distance effect in Table 5, inasmuch as errors are determined by more remote digits in the same times table. A related possibility for the operand distance effect is that number words develop more remote associations than digits because of counting by twos and related verbal operations. Although our explanation for the preceding effects, many of which are new, is tentative, the preceding results show definitely that the pattern of multiplication errors varies substantially between the word and digit formats, including differences in the absolute and relative frequencies of overall errors, specific categories of errors, and numerical distance effects within categories. Such findings challenge the assumption that number-fact retrieval is mediated only by a single type of abstract code that is the same for digit and word-based stimuli, and instead suggest that the internal representations mediating retrieval vary with surface form. In the following section, we report further evidence that format affects performance in additional ways, and we argue that these findings support the encoding-complex hypothesis and challenge the abstract-modular view.

Operand intrusion errors and effects of operand order

As noted earlier, subjects were much more likely to produce naming errors (e.g., 2 x 9 = 9) for words than digits. Only answers that resulted entirely from operand naming were considered naming errors, but many other incorrect responses incorporated one of the operands. Campbell and Oliphant (this volume), for example, observed that about 37% of adults' errors on simple multiplication problems incorporated one of the problem's operands (e.g.,
9 x 6 = 56, 4 x 8 = 24, 7 x 9 = 72). Consistent with the naming-error data, the mean number of operand intrusions was substantially higher for word-format problems (7.5) than for digit-format problems (4.1), SE = .580, z = 5.84, p < .001. The mean proportion of errors involving an operand-answer match also was higher with words (.51) than with digits (.38), SE = .034, z = 3.77, p < .001. The high incidence and significant format effects for operand-intrusion errors suggest that such errors reflect systematic numerical processes, rather than random factors. This inference is further confirmed by examination of operand-order effects on intrusions. By chance, operand intrusions should match the left-right position of the operand only 50% of the time (see also Campbell & Graham, 1985). Campbell and Oliphant found, however, that 68% of intrusions matched the position of the "intruding" operand in the problem (e.g., 9 x 6 = 56 occurred more frequently than 6 x 9 = 56). In the present data, the mean proportion of intrusions with position matched was .58, which is significantly greater than the .50 expected by chance (SE = .023, z = 3.36, p < .001).⁵ The effects of operand order on intrusions confirm that matches between operands and components of error responses cannot be attributed to chance. Another interesting effect was that operand intrusions frequently co-occurred with associatively related error responses. Specifically, table-related errors constituted 77% and 79% of intrusion errors observed with the word and digit formats, respectively. Together these findings suggest that operand-intrusion errors result from a confluence of number-fact retrieval and number-reading processes.

⁵ Errors on tie problems were excluded from this analysis.

The encoding-complex hypothesis explains operand intrusions and the specific effects cited here in a straightforward manner. Operand intrusions occur because operand-reading processes compete with fact-retrieval processes. This explanation follows from the encoding-complex assumption that number-reading and number-fact retrieval are integrated functions, utilizing common, specific representations. For example, operand reading presumably activates a variety of visual and verbal codes that are also the primary media of number-fact representation. Operand reading therefore primes retrieval of number facts that match the problem's operands. The high proportion of tabled intrusions occurs simply because it is representations of multiplication facts that are being primed, as opposed to post-retrieval, verbal-output codes. The effects of operand order on intrusions may be attributed to feature-matching processes that activate memory representations for arithmetic facts. Given the encoding-complex assumption that
memory codes for number facts preserve perceptual or "physical" features, it is reasonable to propose that a problem representation will be activated more strongly when its features are matched in the same relative positions (cf. Campbell & Oliphant, this volume). Thus, under this encoding-complex view, operand intrusions are properly considered retrieval phenomena.

The standard abstract-modular view appears strained to explain operand-intrusion phenomena and, simultaneously, to maintain the integrity of the discrete modules and the fundamental role of the abstract codes. Although the basic fact of operand intrusions might be explained by abstract outputs of the number-comprehension system (i.e., operand codes) and abstract outputs of the calculation system (i.e., answer codes) being combined or summed in the verbal-number production module, this post-retrieval model fails to explain why operand intrusions generally involve table-related answers as opposed to random concatenations of operands with answers. That is, simple post-retrieval summation of answers and operands ought frequently to produce non-tabled answers.

The abstract-modular view also has serious difficulty explaining format and operand-order effects. Format effects on intrusion errors are awkward for the abstract-modular view because digit and word naming both involve comprehension and production processes mediated by the identical abstract code. That is, naming a digit involves translating the digit into an abstract code and then translating the abstract code into a number name. Naming a number word is also mediated by the same abstract codes. Since production in both cases is mediated by the same abstract code, the greater incidence of operand intrusions given words has no explanation within the standard components of the abstract-modular theory of number processing. Although the abstract-modular view could incorporate additional direct connections between number-word stimuli and their names, such connections would bypass the abstract quantity codes and considerably weaken the ideal modular structure that is central to the abstract-modular view. A second possibility is that abstract codes are activated for the individual operands and for the combined operands (e.g., 4 x 8 is encoded both as separate operands and as 48). It seems unusual, however, that number words would be more likely than digits to activate these composite abstract codes, if in fact abstract codes are neither word-like nor digit-like.

The tendency for operand intrusions to appear in the same relative position further complicates the abstract-modular view. From that perspective, operand-order effects suggest that the left operand is processed by the comprehension and production system as if it represented a tens value, whereas the right operand is processed as a units value (e.g., 4 x 8 would activate "forty eight" in the production
system). But tens and units values are in fact represented by the abstract quantity codes, which, we just noted, may need to be bypassed to explain the greater incidence of operand intrusions for word problems. If abstract quantity codes are not involved in intrusion errors, then some independent mechanism is necessary to explain operand-order effects. Although distinct from the abstract codes, the proposed mechanism must nonetheless be intimately related to calculation processes in order to explain why the majority of intrusions (about 80%) are answers in the correct times tables.

To further test the encoding-complex view that operand intrusions reflect convergent associative retrieval processes, we examined multiplication problems in which one of the operands occurs in the correct answer. By analogy to operand-intrusion errors, when the position of an operand matches the same number in the correct answer (e.g., 6 x 4 = 24), performance should be faster and more accurate relative to when there is a positional mismatch (e.g., 4 x 6 = 24). This prediction follows from our assumption that operand reading directly primes answer representations that are the objects of number-fact retrieval. Furthermore, since word stimuli produced a higher rate of operand intrusions, it follows under these assumptions that the facilitation due to a match between operand and correct answer should be stronger with words than digits. No such predictions follow from the abstract-modular hypothesis that operand intrusions reflect convergence of activation at a post-retrieval, verbal-production stage.

Table 6 presents mean RT and error rates based on the twelve multiplication problems tested in which an operand appears in the correct answer (i.e., 6 x 2 = 12, 2 x 6 = 12, 3 x 5 = 15, 5 x 3 = 15, 6 x 4 = 24, 4 x 6 = 24, 7 x 5 = 35, 5 x 7 = 35, 9 x 5 = 45, 5 x 9 = 45, 6 x 8 = 48, 8 x 6 = 48). The values in the table contrast performance on these problems when an operand matches the corresponding number in the same position in the correct answer (e.g., 6 x 4 = 24), versus when there was a positional mismatch (4 x 6 = 24). Repeated-measures ANOVAs indicated that for this subset of items, the word format produced longer RTs [MSe = 711, F(1, 5) = 960.57, p < .001] and more errors [MSe = 14.03, F(1, 5) = 8.19, p = .035] than the digit format. More importantly in the present context, when there was a positional match, correct RTs were faster [MSe = 1699.7, F(1, 5) = 9.27, p = .029], and there were fewer errors [MSe = 7.10, F(1, 5) = 13.96, p = .013]. The accuracy advantage with an operand-answer match relative to a mismatch appeared to be somewhat stronger for word stimuli [t(5) = 2.76, p < .05] than for digit stimuli [t(5) = 1.87, p < .10], although the interaction was not significant [MSe = 10.72, F(1, 5) = 1.03]. In the RT analysis, there was a 69 ms advantage for a positional match with the word stimuli [t(5) = 4.50, p < .005], compared to a 33 ms advantage in the digit
condition [t(5) = 1.58, p < .10], and the interaction effect approached conventional significance levels [MSe = 338.54, F(1, 5) = 5.69, p = .063]. Thus, there was strong evidence that operand position influenced the probability and speed of correct responding, and also weaker evidence that operand-answer priming tended to be stronger with the word format.
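The twelve-item subset can be derived mechanically. The following sketch (ours, not part of the original study) recovers the match and mismatch sets used in Table 6.

```python
# Find non-tie problems a x b whose two-digit answer contains an operand,
# split by whether the operand's side (left/right) matches the matching
# digit's position (decade/units) in the answer.
match, mismatch = [], []
for a in range(2, 10):
    for b in range(2, 10):
        ans = str(a * b)
        if a == b or len(ans) != 2:       # ties cannot disambiguate position
            continue
        if str(a) == ans[0] or str(b) == ans[1]:
            match.append(f"{a} x {b} = {ans}")     # e.g., 6 x 4 = 24
        elif str(b) == ans[0] or str(a) == ans[1]:
            mismatch.append(f"{a} x {b} = {ans}")  # e.g., 4 x 6 = 24

print(match)     # 6 x 2 = 12, 3 x 5 = 15, 6 x 4 = 24, 7 x 5 = 35, 9 x 5 = 45, 6 x 8 = 48
print(mismatch)  # 2 x 6 = 12, 5 x 3 = 15, 4 x 6 = 24, 5 x 7 = 35, 5 x 9 = 45, 8 x 6 = 48
```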
Table 6. Performance Differences as a Function of a Positional Match Between a Problem Operand and a Numeral in the Correct Answer (e.g., 6 x 4 = 24 vs. 4 x 6 = 24)

                    RT (ms)                        p(Error)
          Match   Mismatch   Diff.       Match   Mismatch   Diff.
Digits     859      892       33         0.07     0.10      0.03
Words     1178     1247       69         0.10     0.15      0.05
Mean      1018     1070       51         0.085    0.125     0.040

Note. RT = mean correct response time. p(Error) = mean proportion of trials that were errors. Means based on six problems.
Evidence for operand-answer priming phenomena in acalculia

The relevance of the preceding findings to the abstract-modular view would be further established by signs of operand-related phenomena in the performance of acalculia patients, inasmuch as acalculics have provided a substantial part of the evidence for the abstract-modular theory. The multiplication errors produced by two acalculic adults (PS and GE) described previously by Sokol et al. (1991) indeed contain clear evidence of operand-answer priming effects. We refer the reader to the original article for details of these patients' histories and testing.⁶ PS produced a total of 214 (11.5%) incorrect responses over 23 blocks of testing on the multiplication problems from 1 x 1 to 9 x 9. Of these errors, 84 (39%) were possible intrusion errors in which an operand appeared in the error response.
⁶ We express our thanks to Mike McCloskey for providing us with the lists of GE's and PS's simple-multiplication errors.
For the same set of problems, GE produced 150 (8.5%) commission errors over 22 blocks of testing. Among these, 35 (23%) involved an operand-error match. The apparently lower rate of operand-error matches for GE may be attributed to the higher rate of cross-operation errors (61 or 39% for GE vs. 14 or 5% for PS), because cross-operation errors generally will be incompatible with the possibility of an intrusion.

For PS, excluding 20 errors made on tie problems (for which the effects of operand position cannot be measured), 64 operand intrusions remained. Position was preserved in 43 cases (67.2%) and not preserved in 21 (32.8%); the observed proportion of .672 is significantly greater than .5 (SE = .0625, z = 2.75, p = .003, one-tailed). For GE, there were 35 operand-error matches on non-tie problems, with position preserved in 25 (71.4%) of the cases, which also is greater than the expected proportion of .50 (SE = .0845, z = 2.54, p = .006). Thus, for both PS and GE, operand order had a significant effect on the frequency of operand-error matches, which confirms that operand intrusions influenced their specific errors. The findings are consistent with the conclusion that one source of error for PS and GE, as well as normals, is interference from numbers directly activated by the problem.

Table 7. Mean Proportion of Errors for PS and GE as a Function of a Positional Match Between a Problem Operand and a Numeral in the Correct Answer (e.g., 6 x 4 = 24 vs. 4 x 6 = 24)

            PS                              GE
Match    Mismatch    Diff.        Match    Mismatch    Diff.
 .08       .12        .04          .09       .15        .06

Note. Means based on six problems. Data obtained from Sokol et al. (1991), Table 1 (PS) and Table 2 (GE).
We also tested for evidence of operand-answer priming effects comparable to those described previously (see Table 6). Error rates for each problem were reported by Sokol et al. (1991) in their Table 1 for PS (p. 359) and their Table 6 for GE (p. 370). Table 7 presents the mean error rates for PS and GE on the six problems for which an operand matches a number in the same position in the correct answer (e.g., 9 x 5 = 45), and for the six commuted problems that yield a positional mismatch (5 x 9 = 45). The rates of errors produced by PS and GE on these problems were approximately equal to those produced by the present group of normal subjects tested under instructions for speed. Again paralleling the results of the normal subjects, both PS and GE showed a tendency toward an accuracy advantage for operand orders in which an operand matches its position in the correct answer. The 6.7% advantage for GE was not significant by a paired-difference t-test [SE = 5.39, t(5) = 1.24, p > .05]. The 4.3% advantage for PS did reach significance [SE = 2.15, t(5) = 2.02, p = .05, one-tailed].

The evidence of operand intrusions and operand-answer priming in the performance of PS and GE confirms that operand-intrusion and operand-answer priming effects are not phenomena unique to speeded number-fact retrieval by normal adults. Furthermore, the possible influence of operand-reading processes on the performance of PS and GE raises the possibility that their apparent "calculation" deficit (cf. Sokol et al., 1991) may be due, in part, to a failure to control or inhibit number-reading processes. As we discuss in detail later, the possibility of such subtle interactions among components of number comprehension, calculation, and production tasks makes it difficult to localize deficits within a specific number-processing "module."

Discussion

The present results are not easily reconciled with the assumption that number-fact retrieval is mediated only by abstract, format-independent representations contained in a separate calculation module (e.g., McCloskey et al., 1985; Sokol et al., 1989, 1991). The results show that variables theoretically related to retrieval difficulty and interference (Size and Fan) predicted word-digit differences for both RT and errors. These effects emerged over and above encoding effects associated with word length and tie versus non-tie problems. Indeed, the magnitude by which errors increased for word problems (i.e., a 50% increase in commission errors) presents a major challenge to the abstract-code view of number processing. Because numerical and production processes subsequent to comprehension are equivalent for digit and word formats, the locus for the increase in errors must be the comprehension module. That is, comprehension of words must be sufficiently weaker than comprehension of digits to produce a 50% increase in commission errors and a dramatic 138% increase in omission errors. But it seems highly unlikely that comprehension of very familiar number words is in fact so inadequate as to produce such robust effects. Furthermore, the patterns of specific errors, including the numerical distance of errors and the influence of the operands on specific errors, varied substantially between formats. These results support the
encoding-complex view that arithmetic retrieval processes can differ as a function of presentation format.

The results also are consistent with the encoding-complex view that number-reading and number-fact retrieval processes are integrated, which explains operand-answer priming effects (i.e., Table 6), operand-intrusion errors, and naming errors. Moreover, competition among these various sources of activation may sometimes be unresolvable, leading to omission errors. The findings that operand-intrusion errors, pure naming errors, and perhaps operand-answer priming were stronger effects with word stimuli than with digit stimuli indicate that these effects are format dependent. One hypothesis is that arithmetic answers are represented as verbal-lexical codes, which are more strongly primed by number words than by digits. Under this view, operand-answer priming arises because reading a problem's operands activates verbal number-fact representations that contain the operands. Retrieval is facilitated when operand-answer priming activates the correct problem (as demonstrated in Table 6), or retrieval is disrupted when priming activates a competing problem representation, as evidenced by operand-intrusion errors. These effects are exaggerated with word stimuli because number words are more strongly associated with verbal representations than are digits. It is not clear how the abstract-modular view can explain such findings, particularly given the central assumption that number-reading mechanisms are "functionally independent" of calculation processes (e.g., number-fact representations) and that all communication between modules is based on abstract codes. If number-reading processes can directly contribute to specific calculation errors, and also systematically affect correct response times and error rates in arithmetic, then the assumption of functional independence is put in serious doubt.

Under the encoding-complex view, effects of format on retrieval occur because arithmetic depends on retrieval of specific codes (e.g., visual and phonological codes) that can be differentially associated with different surface forms. For example, the highly practiced digit format (e.g., 3 x 9 = ?) may support both visuo-spatial and verbal-auditory routes to problem representations. In contrast, number-word stimuli (three x nine = ?) are relatively novel visually, so retrieval of products from number words may be mediated primarily by verbal (e.g., lexical) codes. As mentioned previously, greater dependence on verbal representations of problems potentially explains why word-format retrieval is more susceptible to interference from operand-reading processes. In terms of the network-interference model of number-fact retrieval (Campbell & Oliphant, this volume), the strong visual associations activated by digit-format problems provide an additional basis
for discriminating related memory representations. Relative to word problems, therefore, performance on the digit-format problems suffers less from retrieval interference among verbal codes, resulting in faster RTs and fewer errors.

The theory that words and digits can be differentially associated with different internal representations of number facts also can explain why errors tended to be more numerically distant with word stimuli than digit stimuli. For example, numerical distance or magnitude effects in simple arithmetic may reflect involvement of codes that explicitly represent magnitude (e.g., visuo-spatial analog representations; cf. Dehaene & Cohen, in press; see Campbell & Oliphant, this volume, for a computational model of number-fact retrieval that implements this assumption). Digits may evoke greater involvement of such magnitude codes than number words because calculations and judgments of magnitude presumably are based more frequently on digit stimuli than on number-word stimuli. Thus, number-fact retrieval based on number words would be less constrained by magnitude information, which would promote errors that are more numerically distant relative to those observed with digits.

Speculatively, reduced involvement of magnitude codes with number-word stimuli potentially explains the observation that the word-format deficit tended to increase with problem size. Campbell and Oliphant (this volume) proposed that problem difficulty generally increases with magnitude because problem representations become less discriminable as absolute magnitude increases. This occurs because the psychophysical scale for magnitude is more compressed for larger values and, consequently, larger-number problems encounter more retrieval interference from neighbouring problems. This relatively poor discrimination of magnitude for larger problems would be exaggerated with number words, relative to digits, if number words activate generally weaker or more diffuse magnitude information.

In summary, multiplication performance varies with surface form in ways that are not easily reconciled with the assumption that number-fact retrieval is based only on abstract, format-independent processes. We have demonstrated a variety of phenomena that strongly suggest that format directly affects retrieval processes, rather than only encoding or production processes. In addition to overall effects on the speed and accuracy of performance, we demonstrated that format affected the relative frequencies of various types of errors, the numerical distance of errors, and the influence of operand-reading processes on number-fact retrieval. These findings support the encoding-complex view that number-fact retrieval is based on multiple types of specific codes (e.g., verbal-lexical or visuo-spatial codes), and that these specific codes can differentially affect retrieval performance as a function of presentation format.
Is the hypothesis of abstract number codes necessary?

The foregoing evidence for format-specific phenomena, and hence for specific codes, does not disprove the hypothesis of abstract codes, and that is not our intent. We simply have documented evidence that specific codes appear to be necessary to explain many aspects of number processing. Nonetheless, given the evidence that both elementary aspects of number meaning and more complex numerical tasks involve multiple, specific codes, the question arises as to whether a more abstract level of representation is also necessary.

One source of evidence against the assumption that number processing requires a special kind of abstract, semantic code, as proposed by McCloskey et al., comes from research on memory retrieval for non-numerical stimuli. Graham and Campbell (in press) had children in grades 3 and 4 memorize alphaplication facts: arithmetic-like memory items composed of letters instead of numbers (e.g., A,I = x; I,U = b). The children were quite successful at memorizing 25 such items in two training sessions (70% accuracy under instructions for speed), and memory for alphaplication problems paralleled memory for arithmetic facts in many respects. Among other similarities, the alphaplication results showed that a) there was a large performance advantage for tie (e.g., E,E = j) over non-tie problems (E,I = p), b) most errors involved answers from the correct "alpha-table," c) response times and error rates were strongly correlated across problems, d) performance on commuted pairs was highly correlated, and e) the correct answers to poorly learned problems tended to be the most common error responses. These parallels suggest that retrieval of alphaplication and multiplication facts is similar with respect to at least some of the basic memory processes involved. Unlike multiplication, however, the operands of alphaplication problems were not systematically or meaningfully related to answers; thus, learning alphaplication simply involved memorizing arbitrary associations among combinations of letters. The alphaplication data therefore provide a sufficiency proof that memory for arithmetic-like stimuli does not require an underlying abstract semantic code such as that proposed by McCloskey et al. Magnitude representations and knowledge of other numerical relations are, of course, semantically relevant in genuine arithmetic (cf. Campbell & Oliphant, this volume), but the alphaplication data demonstrate that arbitrary associations provide a sufficient basis for performance, and there seems no good reason to reject the assumption that such arbitrary associations among specific symbols (e.g., digits and words) also contribute to retrieval of arithmetic facts and to other numerical tasks.

One putative type of evidence supporting the assumption of abstract codes is similarity when numbers are processed in different stimulus formats and modalities
(e.g., Sokol et al., 1989; Sokol et al., 1991). Indeed, in the preceding study, performance on digit and word problems was highly correlated. Such findings, however, do not constitute direct evidence for abstract codes. One possible explanation for these similarities is that a common code is used in number processing, but that the code is specific rather than abstract. That is, subjects may translate all stimulus codes into number-word or visual digit format, so that calculations are produced predominantly via the same specific code. A related explanation is that the various specific numerical codes become so strongly associated as to activate one another with a high probability every time any one of the codes is activated, unless special conditions are implemented to interfere selectively with specific modalities. Still another possibility is that many features of the associative structure emerge from learning and contextual experiences that would develop and operate similarly in different stimulus modes. That is, interfering associations, effects of priming, and related mechanisms may operate similarly whether number words or visual digits constitute the nodes in the associative network. Thus, evidence for unique features of number processing with different surface forms or modalities provides positive evidence for specific codes, but the converse is not necessarily true.

Modular versus integrated views of cognitive number processing

In the abstract-modular theory, the presumed ability to localize the source of a number-processing impairment within one of the hypothetical comprehension, calculation, or production systems provides the primary evidence for the reality of these systems. We have pointed out, however, that this ability depends on the assumption that a single type of abstract number code mediates all computation and production tasks. Only this assumption permits performance on one task to be used as a control condition for the integrity of processes assumed to be shared with another task (e.g., sum verification may be assumed to involve the comprehension and calculation processes required for sum production). The abstract codes, in effect, define the boundaries that separate the three hypothetical systems: The comprehension system converts different surface forms of numbers into abstract semantic codes that then are passed on to the calculation or production systems, and the production system converts abstract codes received from the calculation or comprehension modules back into specific number formats (e.g., written digits or number words).

Once it is allowed that different facets of number processing can involve multiple, modality-specific codes, the boundaries among number comprehension, calculation, and production become blurred. For example, as pointed out earlier,
the apparent influence of operand intrusions on the multiplication errors produced by PS and GE (Sokol et al., 1991) suggests that their apparent calculation deficit could arise, in part, from abnormal interference by number-reading processes (a "production" deficit?). Conversely, Campbell and Clark (1988) demonstrated that the number-reading errors (e.g., stating "six" for "2") produced by McCloskey et al.'s (1986) patient, HY, were predicted by visual and arithmetic similarity (numerical nearness and odd-even agreement). Although such results can be reconciled with a pure production deficit, they are also consistent with the hypothesis that HY's deficit was due to a failure to inhibit visually similar codes (a "comprehension" deficit?) or a failure to inhibit arithmetically related responses (a "calculation" deficit?). That such factors collectively affect performance challenges not only the logic by which a deficit is localized within a particular subsystem, but, more generally, the view that number processing can be carved neatly into a small number of independent systems.

That the boundaries among number comprehension, calculation, and production are not drawn as sharply as McCloskey et al. (1985; Sokol et al., 1989) propose is illustrated by a consideration of counting skills, and of how counting is related to calculation and number production. Although nothing specific has been claimed about "where" counting occurs in the abstract-modular number-processing system, Sokol et al. (1989) state that they do not consider counting to be a normal function of the calculation system (p. 108). Furthermore, the verbal production subsystem of the McCloskey et al. model is implicated by much research indicating that counting relies on phonological and articulatory codes (e.g., Healy & Nairne, 1985; Logie & Baddeley, 1987; Seron & Deloche, 1987). But localizing counting processes somewhere "outside of" the calculation system raises a problem for the modularity assumption, because some of the tasks assumed by McCloskey et al. (1985) and McCloskey et al. (1986) to involve the calculation system (e.g., production and verification of arithmetic facts) can in fact be performed using counting strategies instead of retrieval. There is evidence that even normal, educated adults sometimes use procedural strategies such as counting for simple addition (Geary & Wiley, 1991). Thus, "calculation" tasks may be performed using strategies that do not involve the hypothetical calculation system. Similarly, there is evidence that arithmetic verification (e.g., 4 + 8 = 15, true or false?), which is used to assess the integrity of components of the calculation system (McCloskey et al., 1985; McCloskey et al., 1986), is sometimes performed by judging the magnitude (Stazyk, Ashcraft & Hamann, 1982) or odd-even status of the presented answer (e.g., Krueger, 1986; Krueger & Hallford, 1984). As mentioned previously, however, Sokol et al. (1989) suggested that the processing of magnitude and
odd-even status could be external to the calculation module. Thus, again, "calculation" performance may depend on resources external to the putative calculation system. Such considerations are one source of motivation for the encoding-complex view that comprehension, calculation, and production are integrated processes rather than separate modules.

Performance on the verification task further illustrates the integrated nature of comprehension, calculation, and verbal production. Sum verification, a presumed "calculation" task, can be solved by assessing the relative magnitude of the problem and the presented answer. Such magnitude judgments (e.g., identify the larger number), however, could be based on counting-string associations ("six" must be greater than "five" because it occurs later in counting), suggesting that the verbal production processes involved in counting potentially mediate verification. On some verification trials, subjects may generate an answer for the problem and check it against the presented answer (cf. Campbell, 1987; Widaman, Geary, Cormier & Little, 1989). This comparison process could involve verbal representations of the corresponding number words (although comparisons of visual representations are also possible), and an error in the comparison or number-word production process could lead to an incorrect verification response. Furthermore, Campbell (1987; 1991) showed that brief pre-exposure (200-300 ms) of the correct answer sharply reduces retrieval errors on simple arithmetic problems, and this facilitative priming effect may well operate when the correct answer is presented in a verification task. Given such considerations, successful verification may provide only limited information about the integrity of the arithmetic retrieval processes involved in generating an answer.

The various examples that we have considered illustrate not only that it is ambiguous which "modules" a given task implicates, but also that interpretation of dissociations between tasks will depend critically on understanding how different strategies and knowledge structures influence performance on those tasks. Naturally, many tasks will share some processes in common, but the potential for processing variability within and across tasks strongly recommends against uncritically applying the methodological strategy of using performance on one task as a control for components of another task (cf. Dunn & Kirsner, 1988).
The relation between arithmetic and number naming

Further difficulties associated with separating functions of number comprehension, calculation, and production arise from considering the role of arithmetic in number naming. In the abstract-modular framework proposed by McCloskey et al. (1985), arithmetic processes (e.g., knowledge of simple arithmetic
operations and facts) are functionally separated from the language processes (i.e., syntactic and lexical mechanisms) contained in the production system. This separation of arithmetic and linguistic processes seems to ignore the intimate relationship between number-naming systems and arithmetic relations. Although the names for "primitive numbers" (i.e., numbers that are directly named in a given language) are often arbitrary, in some languages the basic number names are related to objects or events that imply the corresponding quantity (e.g., the word for "five" may be based on the word for "hand"; Fuson & Kwon, 1990). Furthermore, in many, if not all, languages (cf. Boden, 1988), the production rules by which a given numeral (e.g., 2226) is converted to a number-word sequence (two thousand two hundred and twenty-six) correspond to arithmetic relationships (i.e., 2 x 1000 + 2 x 100 + 20 + 6). The intrinsic knowledge of arithmetic entailed by number-naming systems is also demonstrated by the various ways in which the same number can be named. For example, the number 2200 may be stated as "two thousand two hundred" or "twenty-two hundred," depending on which arithmetic relationships are used to parse and interpret the string of digits. These observations suggest that different linguistic number-naming conventions can involve different semantic specifications or, put another way, that linguistic form potentially determines the underlying semantic representation for numbers and how calculations are performed.

Indeed, there is empirical evidence that number representation and calculation skills are at least partially determined by linguistic structure. For example, there is a close connection between the arithmetic structure of number-naming systems and how children represent number. Fuson and Kwon (1990) review studies showing that children represent quantities using tokens that correspond to linguistic structure. Asian number-naming systems usually quantify the tens positions explicitly (e.g., 34 is named "three ten four" in Chinese), whereas European languages do not explicitly express the decade position as a multiple of ten. Children who learn a system that names the tens are more likely to represent 34, for example, with three tens tokens and four units tokens, whereas age-matched children learning a system that does not name the tens are more likely to count out 34 units. Furthermore, these differences in number-naming systems also affect how children learn to perform single-digit and multi-digit addition and subtraction (Fuson & Kwon, 1990). Such observations challenge a model that sharply separates functions of calculation from the grammatical processes of verbal number production.
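The arithmetic structure implicit in verbal number production can be made explicit with a small naming sketch. This is our simplified illustration (limited to numbers below 10,000, English only, and omitting the optional "and"), not the McCloskey et al. syntactic-frame model itself.

```python
# Number naming as arithmetic parsing:
# 2226 -> 2 x 1000 + 2 x 100 + 2 x 10 + 6 -> "two thousand two hundred twenty six".
ONES = ["", "one", "two", "three", "four", "five", "six", "seven", "eight", "nine"]
TEENS = ["ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
         "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]

def name(n):                                    # for 0 < n < 10000
    parts = []
    for value, unit in ((1000, "thousand"), (100, "hundred")):
        if n >= value:
            parts += [ONES[n // value], unit]   # e.g., 2 x 1000 -> "two thousand"
            n %= value
    if 10 <= n < 20:
        parts.append(TEENS[n - 10])             # teens are lexicalized wholes
    else:
        if n >= 20:
            parts.append(TENS[n // 10])         # decade term: 2 x 10 -> "twenty"
        if n % 10:
            parts.append(ONES[n % 10])
    return " ".join(parts)

print(name(2226))  # two thousand two hundred twenty six
# "twenty-two hundred" for 2200 would require a different parse: 22 x 100.
```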
Autonomous procedures versus modularity

The encoding-complex view rejects a strong form of modularity, but it does not follow that there are no differentiated aspects of number processing. For example, the "syntactic frame" model of verbal number production proposed by McCloskey et al. (1986) provides a plausible account of how numbers presented as digits are mapped onto verbal number names. Note, however, that this component of verbal number production requires neither the assumption of abstract codes nor the assumption that number-production mechanisms occupy a separate system. Despite this disclaimer, it is certainly possible that the semantic relations entailed in converting a string of digits to the appropriate number name become proceduralized with practice (Anderson, 1982), which implies that the underlying semantic structure would no longer need to be explicitly processed in the context of a simple naming task. This specific skill, once acquired, may therefore involve a relatively autonomous cognitive routine. Nonetheless, the encoding-complex view still permits number-production routines to vary across tasks, because different tasks emphasize different representational media. For example, when the digit string "1026" is presented for naming, the computation of place value (i.e., the function of the syntactic frame) may be based directly on an analysis of the visual form of the stimulus. If, instead, one generates the number name for the total quantity represented by one "thousands" token, two "tens" tokens, and six "ones" tokens, a visual frame is not directly available from the input structure, and a different process for generating the correct grammatical structure of the output may be required.

In summary, the assumption that number comprehension, calculation, and production involve independent cognitive subsystems, as proposed by McCloskey et al., appears to oversimplify the relationships among components of number processing. According to the encoding-complex view, number comprehension, calculation, and production are terms that refer to somewhat distinct collections of tasks that vary in information or response requirements, but which, nonetheless, often draw on common verbal and visual networks of associative relations. The encoding-complex view that number skills are based on multiple forms of internal representation and can be realized in many ways advises directly against an approach that tends to ignore this plasticity in favor of a relatively simple taxonomy of number-processing subsystems. The better part of understanding performance in number-processing tasks may lie in understanding its diversity and complexity.
Conclusions

The encoding-complex view maintains that number-processing tasks can be understood in terms of retrieval and comparison of elementary visual and verbal codes. That is, these elementary associative codes and processes are the building blocks out of which number-processing skills and routines are constructed. Relatively simple tasks, such as counting, comparison of magnitude and other elementary number features, generation and verification of basic number facts, as well as some components of written and spoken number production, are already amenable to a detailed analysis in terms of elementary associations. Although the modality-specific codes that provide the representational foundation for number processing do not in and of themselves explain the various numerical operations that people can perform (e.g., comparisons of magnitude, multiplication and other basic operations, complex calculations, sophisticated aspects of mathematical reasoning, and so on), these and other competencies must ultimately be explained in terms of basic associative mechanisms. Our ultimate objective is to identify the principles by which excitatory and inhibitory connections among various specific representations cause number-processing phenomena to emerge from the collective activity of associative networks (e.g., Campbell & Oliphant, this volume).

In our approach we also seek to identify common mechanisms that underlie numerical and non-numerical domains. We doubt that the human brain involves number-processing mechanisms that are completely independent of non-numerical cognitive functions, although some unique features may emerge from the particular ways in which mechanisms are combined in the area of number. Thus, we believe that number skills do not comprise a separable cognitive "system," and indeed there is a substantial research literature demonstrating continuity of numerical and non-numerical skills (see Clark & Campbell, 1991). It seems unlikely, for example, that the brain has a "power of ten" mechanism that is uniquely numerical in nature (cf. the abstract codes assumed by McCloskey et al., 1986). Instead, we suspect that the "power of ten" emerges from some as yet undetermined collection of basic quantity representations and associations, perhaps involving the use of place or position at some level. We have also argued that the quantity representations are themselves modality specific, which makes them not qualitatively distinct from non-number representations. Note that our position still allows some emergent aspects of number processing to be unique: Whatever the underlying mechanism for "powers of ten" may be, it is unlikely that the primitives behave collectively in an identical way in non-number domains.

The encoding-complex theory in its current form does not constitute a complete model of number processing, and we acknowledge that, at present, there is
uncertainty regarding the taxonomy of representational codes that will be necessary and sufficient for a complete encoding-complex theory of cognitive number skills. Furthermore, without detailed models of specific tasks, precise predictions often are not possible because of the complex associative processes by which the various interconnected codes potentially activate and inhibit one another. Despite such uncertainty, the encoding-complex approach does represent a substantive and clear alternative to the global assumptions of abstract codes and modularity of number functions. Although proponents of the abstract-modular view have dismissed the encoding-complex approach as "vacuous" (Sokol et al., 1989), our theory that semantic number representations may be modality- and format-specific is no more inherently vacuous than the assumption that number representations are wholly abstract and divorced from modality and surface characteristics (see Clark & Campbell, 1991, for a critique of the abstract codes proposed by McCloskey et al., 1986). Indeed, the issue of abstract versus format-specific semantic codes is far from resolved in research on picture and word processing and in research on semantic representation in multilinguals (e.g., Clark, 1987; Vaid, 1988; Glucksberg, 1984). Furthermore, as the present experimental findings have demonstrated, there is considerable evidence that calculation processes vary with surface form, and such findings represent a strong challenge to the abstract-code hypothesis.

ACKNOWLEDGEMENTS

We thank Mark Ashcraft, David Geary, Kevin Miller, Valerie Thompson, and an anonymous reviewer for useful feedback on a previous version of the paper. This research was supported by Natural Sciences and Engineering Research Council of Canada grant OPG0001980 to Jamie Campbell.

REFERENCES

Anderson, J.R. (1982). Acquisition of cognitive skill. Psychological Review, 89, 369-406.
Baddeley, A.D. (1986). Working memory. Oxford, England: Oxford University Press.
Baddeley, A.D., & Hitch, G.J. (1974). Working memory. In G.H. Bower (Ed.), The psychology of learning and motivation: Advances in research and theory (Vol. 8, pp. 47-90). New York: Academic Press.
Boden, M.A. (1988). Computer models of mind. Cambridge: Cambridge University Press.
Campbell, J.I.D. (1987). Production, verification, and priming of multiplication facts. Memory & Cognition, 15, 349-364.
Campbell, J.I.D. (1991). Conditions of error priming in number-fact retrieval. Memory & Cognition, 19, 197-209.
Campbell, J.I.D., & Charness, N. (1990). Age-related declines in working-memory skills: Evidence from a complex calculation task. Developmental Psychology, 26, 879-888.
Campbell, J.I.D., & Clark, J.M. (1988). An encoding-complex view of cognitive number processing: Comment on McCloskey, Sokol, & Goodman (1986). Journal of Experimental Psychology: General, 117, 204-214.
Campbell, J.I.D., & Graham, D.J. (1985). Mental multiplication skill: Structure, process and acquisition. Canadian Journal of Psychology, 39, 338-366.
Clark, J.M. (1987). Understanding pictures and words: A comment on Potter, Kroll, Yachzel, Carpenter, and Sherman (1986). Journal of Experimental Psychology: General, 116, 307-309.
Clark, J.M., & Campbell, J.I.D. (1991). Integrated versus modular theories of number skills and acalculia. Brain and Cognition, 17, 204-239.
Dehaene, S., & Cohen, L. (in press). Two mental calculation systems: A case study of severe acalculia with preserved approximation. Neuropsychologia.
Dunn, J.C., & Kirsner, K. (1988). Discovering functionally independent mental processes: The principle of reversed association. Psychological Review, 95, 91-101.
Foltz, G.S., Poltrock, S.E., & Potts, G.R. (1984). Mental comparisons of size and magnitude: Size congruity effects. Journal of Experimental Psychology: Learning, Memory, and Cognition, 10, 442-453.
Fuson, K.C., & Kwon, Y. (1990). Effects on children's addition and subtraction of the system of number words and other cultural tools. To appear in J. Bideaud & C. Meljac (Eds.), Les chemins du nombre (Pathways to number). Villeneuve d'Ascq, France: Presses Universitaires de Lille.
Geary, D.C., & Wiley, J.G. (1991). Cognitive addition: Strategy choice and speed-of-processing differences in young and elderly adults. Psychology and Aging, 6, 474-483.
Glucksberg, S. (1984). The functional equivalence of common and multiple codes. Journal of Verbal Learning and Verbal Behavior, 23, 100-104.
Gonzalez, E.G., & Kolers, P.A. (1982). Mental manipulation of arithmetic symbols. Journal of Experimental Psychology: Learning, Memory, and Cognition, 8, 308-319.
Gonzalez, E.G., & Kolers, P.A. (1987). Notational constraints on mental operations. In G. Deloche & X. Seron (Eds.), Mathematical disabilities: A cognitive neuropsychological perspective (pp. 27-42). Hillsdale, NJ: Erlbaum.
Graham, D.J., & Campbell, J.I.D. (in press). Network interference and number-fact retrieval: Evidence from children's alphaplication. Canadian Journal of Psychology.
Hatano, G., Miyake, Y., & Binks, M.G. (1977). Performance of expert abacus operators. Cognition, 5, 57-71.
Hayes, J.R. (1973). On the function of visual imagery in elementary mathematics. In W.G. Chase (Ed.), Visual information processing (pp. 177-214). New York: Academic Press.
Healy, A.F., & Nairne, J.S. (1985). Short-term memory processes in counting. Cognitive Psychology, 17, 417-444.
Hines, T.M. (1990). An odd effect: Lengthened reaction times for judgments about odd digits. Memory & Cognition, 18, 40-46.
Hitch, G.J. (1978). The role of short-term working memory in mental arithmetic. Cognitive Psychology, 10, 302-323.
Kashiwagi, A., Kashiwagi, T., & Hasegawa, T. (1987). Improvement of deficits in mnemonic rhyme for multiplication in Japanese aphasics. Neuropsychologia, 25, 443-447.
Klein, R., & McInnes, J. (1988). Visual field differences in the processing of numerical stimuli. Brain and Cognition, 7, 247-256.
Krueger, L.E. (1986). Why 2 x 2 = 5 looks so wrong: On the odd-even rule in product verification. Memory & Cognition, 14, 141-149.
Krueger, L.E., & Hallford, E.W. (1984). Why 2 + 2 = 5 looks so wrong: On the odd-even rule in sum verification. Memory & Cognition, 12, 171-180.
Logie, R.H., & Baddeley, A.D. (1987). Cognitive processes in counting. Journal of Experimental Psychology: Learning, Memory, and Cognition, 13, 310-326.
Marsh, L.G., & Maki, R.H. (1976). Efficiency of arithmetic operations in bilinguals as a function of language. Memory & Cognition, 4, 459-464.
McClain, L., & Shih Huang, J.Y. (1982). Speed of simple arithmetic in bilinguals. Memory & Cognition, 10, 591-596.
McCloskey, M., Caramazza, A., & Basili, A. (1985). Cognitive mechanisms in number processing and calculation: Evidence from dyscalculia. Brain and Cognition, 4, 171-196.
McCloskey, M., Sokol, S.M., & Goodman, R.A. (1986). Cognitive processes in verbal-number production: Inferences from the performance of brain-damaged subjects. Journal of Experimental Psychology: General, 115, 307-330.
Pirolli, P.L., & Anderson, J.R. (1985). The role of practice in fact retrieval. Journal of Experimental Psychology: Learning, Memory, and Cognition, 11, 136-153.
Seron, X., & Deloche, G. (1987). The production of counting sequences by aphasics and children: A matter of lexical processing? In G. Deloche & X. Seron (Eds.), Mathematical disabilities: A cognitive neuropsychological perspective (pp. 171-196). Hillsdale, NJ: Erlbaum.
Shepherd, R., & Gale, A. (1982). EEG correlates of hemisphere differences during a rapid calculation task. British Journal of Psychology, 73, 73-84.
Sokol, S.M., Goodman-Schulman, R., & McCloskey, M. (1989). In defense of a modular architecture for the number processing system: Reply to Campbell & Clark. Journal of Experimental Psychology: General, 118, 105-110.
Sokol, S.M., McCloskey, M., Cohen, N.J., & Aliminosa, D. (1991). Cognitive representations and processes in arithmetic: Inferences from the performance of brain-damaged patients. Journal of Experimental Psychology: Learning, Memory, and Cognition, 17, 355-376.
Stazyk, E.H., Ashcraft, M.H., & Hamann, M.S. (1982). A network approach to simple multiplication. Journal of Experimental Psychology: Learning, Memory, and Cognition, 8, 320-335.
Stigler, J.W. (1984). "Mental abacus": The effect of abacus training on Chinese children's mental calculation. Cognitive Psychology, 16, 145-176.
Tzeng, O.J.L., & Wang, W.S.Y. (1983). The first two R's. American Scientist, 71, 238-243.
Vaid, J. (1985). Numerical size comparisons in a phonologically transparent script. Perception and Psychophysics, 37, 592-595.
Vaid, J. (1988). Bilingual memory representation: A further test of dual coding theory. Canadian Journal of Psychology, 42, 84-90.
Vaid, J., & Corina, D. (1989). Visual field asymmetries in numerical size comparisons of digits, words, and signs. Brain and Language, 36, 117-126.
Widaman, K.F., Geary, D.C., Cormier, P., & Little, T.D. (1989). A componential model for mental addition. Journal of Experimental Psychology: Learning, Memory, and Cognition, 15, 898-919.
The Nature and Origins of Mathematical Skills
J.I.D. Campbell (Editor)
© 1992 Elsevier Science Publishers B.V. All rights reserved.
Chapter 13 THE FUNCTIONAL ARCHITECTURE OF NUMERICAL PROCESSING MECHANISMS: DEFENDING THE MODULAR MODEL
Michael McCloskey, Johns Hopkins University
Paul Macaruso, Massachusetts General Hospital
Tony Whetstone, Johns Hopkins University
Summary
In this chapter we address the logical and empirical arguments offered by Campbell and Clark (this volume) against our model of numerical processing. We argue that several of their objections are based upon misinterpretations of our model; that the empirical results cited by Campbell and Clark do not constitute clear evidence against the model; and that the approach to theory construction embodied in our model is likely to be more productive than the approach exemplified by Campbell and Clark's encoding complex theory.

Background and introduction
Recent years have seen increased interest in the cognitive mechanisms underlying the processing of numerical information, and the functional architecture of these mechanisms has been the subject of some controversy. We have proposed that cognitive numerical processing mechanisms have a modular organization (McCloskey, Caramazza, & Basili, 1985; McCloskey, Sokol, & Goodman, 1986; McCloskey, in press; Sokol, McCloskey, Cohen, & Aliminosa, 1991; Sokol,
Goodman-Schulman, & McCloskey, 1989).¹ In contrast, Campbell and Clark (1988, this volume; Clark & Campbell, 1991) have presented an encoding complex view that assumes a non-modular functional architecture. In the preceding chapter, Campbell and Clark (this volume) argue against our model on logical and empirical grounds, concluding that their encoding complex perspective is much to be preferred. In this chapter we offer a contrasting assessment. We first summarize the assumptions of our model, and attempt to clarify several points that have apparently led to confusion. In the course of this discussion we address a number of the Campbell and Clark objections. We then discuss the alternative approaches to theory development and evaluation exemplified by our model and the encoding complex perspective, suggesting that our approach is likely to be more productive. Turning next to empirical evidence, we examine in some detail the multiplication experiment reported by Campbell and Clark (this volume). We argue that contrary to Campbell and Clark's assertions, the experiment does not constitute evidence against our model. Further, we suggest that the same conclusion applies to the other empirical studies cited by Campbell and Clark. We conclude that various alternative theoretical perspectives merit continued development and evaluation in research on numerical processing.

The modular model

Our model considers at a general level the cognitive mechanisms mediating comprehension and production of Arabic and verbal numerals, and execution of simple calculations.²

Numeral-processing mechanisms
As illustrated in Figure 1, the model posits functionally independent numeral comprehension and numeral production mechanisms.
¹ Throughout this chapter we adopt as an expository convenience the use of the first person in discussing the work of McCloskey and his colleagues, even when the authors of the paper(s) under discussion are not co-extensive with the authors of this chapter. In this way we avoid the awkwardness of repeatedly referring to one or more of our number in the third person.

² We use the term numeral to refer to a symbol or set of symbols representing a number. Arabic numerals are numerals in digit form (e.g., 56), and verbal numerals are numerals in the form of words (e.g., fifty-six), whether the words are spoken or written. Finally, we denote verbal numerals in spoken and written form as spoken verbal numerals and written verbal numerals, respectively.
Numeral comprehension mechanisms convert numerical inputs into semantic representations for use in subsequent cognitive processing, such as performing calculations. Numeral production mechanisms translate internal semantic representations of numbers into the appropriate form for output.
[Figure 1. Schematic depiction of the McCloskey et al. (1985) model. The figure shows Arabic and verbal numeral comprehension mechanisms converging on an abstract semantic representation, which connects both to calculation mechanisms (arithmetic facts and calculation procedures) and to Arabic and verbal numeral production mechanisms.]
The semantic representations are assumed to specify in abstract form the basic quantities in a number, and the power of ten associated with each. For example, the Arabic numeral comprehension process is assumed to generate from the stimulus 5030 the semantic representation {5}10EXP3, {3}10EXP1. The digits in braces (e.g., {5}) indicate quantity representations, and 10EXPn indicates a power of 10 (e.g., 10EXP3 specifies 10 to the third power, or thousand). Thus, {5}10EXP3, {3}10EXP1 indicates a number made up of five thousands and three tens. This particular notation is adopted merely to avoid confusion between internal semantic representations of numbers, and Arabic or verbal numerals. The important assumption is that the internal representations specify basic quantities and their associated powers of 10.
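As a concrete illustration of this notation (a sketch of ours, not part of the model itself), the Arabic comprehension process can be caricatured as a function that maps a digit string onto quantity/power-of-ten pairs:

```python
def comprehend_arabic(numeral: str):
    """Map an Arabic numeral onto (quantity, power-of-ten) pairs, omitting
    zero quantities; e.g., '5030' -> [(5, 3), (3, 1)], i.e., {5}10EXP3, {3}10EXP1."""
    n = len(numeral)
    return [(int(d), n - 1 - i) for i, d in enumerate(numeral) if d != "0"]

assert comprehend_arabic("5030") == [(5, 3), (3, 1)]
```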
In addition to distinguishing numeral comprehension and production mechanisms, our model further divides these mechanisms into components for processing Arabic numerals (e.g., 362), and components for processing verbal numerals (e.g., three hundred sixty-two). For example, reading a price tag would implicate Arabic numeral comprehension processes, whereas writing a check would involve both Arabic and verbal numeral production processes.

Calculation mechanisms

Performing calculations requires, in addition to numeral comprehension and production, cognitive processes specific to arithmetic. In particular our model posits components for comprehension of operation symbols (e.g., +) and words (e.g., plus), retrieval of arithmetic "table" facts, and execution of calculation procedures that specify the steps required to solve multi-digit problems.

Clarification of the model's assumptions

Many of Campbell and Clark's arguments against our model are stated rather briefly. As a consequence, it is often difficult to determine just what claims Campbell and Clark are attributing to the model when raising objections. However, a substantial number of the objections appear to flow from misapprehensions concerning the model's stance on several fundamental issues. In the following discussion we attempt to clarify the assumptions of our model, and in so doing we address some of the Campbell and Clark arguments.

Nature of the representations and processes

In discussing our model Campbell and Clark refer several times to mechanisms in the brain. For example, they call into question the assumption that "the brain has a 'power of ten' mechanism that is uniquely numerical in nature" (Campbell and Clark, this volume, p. 487). Remarks such as these may create the impression that our model assumes "hard-wired" brain mechanisms for numerical processing. Although Campbell and Clark may not have intended this interpretation, it is important to make clear what the model actually assumes. In our view it is highly unlikely that evolution has endowed the human brain with genetically-programmed biological modules specialized for processing of numerical information. Accordingly, the various processing mechanisms postulated in our model are not assumed to be genetically-given modules (although some aspects of verbal numeral processing may draw upon biological language modules). Rather, our claim is that experience and training with numerical information lead to development of various cognitive processes that are functionally distinct in the sense that (1) the processes are represented separately from one another, and so can be disrupted selectively by brain damage; and (2) the processes may communicate by passing
representations to one another, but each process has no access to, or involvement in, the inner workings of the others.³

A related point may be made with regard to representations. The internal power-of-ten representations posited by our model (e.g., {8}10EXP3, {4}10EXP1 for 8040) are not assumed to be a genetically programmed feature of the cognitive system. Instead, we assume that the internal base-10 representational scheme is built up from more fundamental concepts of quantity, through experience with base-10 number systems. Different forms of experience would presumably eventuate in different forms of internal representations. For example, we assume that individuals in a culture with a base-5 number system would develop internal base-5 representations.

Campbell and Clark object to our characterizing as "abstract" the semantic representations of numbers. They seem to believe that our abstract semantic representations are somehow quite different from any of the forms of representation they postulate, and even suggest that it may not be necessary to assume an abstract level of representation for numbers (Campbell & Clark, this volume, pp. 481-482). We find Campbell and Clark's discussion on these points somewhat puzzling. In referring to abstract representations for numbers we mean simply representations that are not tied to any particular number-naming system (e.g., the Arabic numeral system or the English verbal numeral system). Thus, we assume that semantic representations of numbers are not representations of Arabic digit strings or number-word sequences. Rather, we assume, the semantic representations abstract away from these surface details to represent quantity or magnitude. We take it to be uncontroversial that any theory of numerical processing will need to posit some form of internal quantity or magnitude representation; indeed, Campbell and Clark themselves assume representations of this sort. Thus, our postulation of abstract semantic representations does not represent, as Campbell and Clark seem to suggest, a strange or radical proposal; and the answer is clearly "yes" to their question, "Is the hypothesis of abstract number codes necessary?"

Scope of the model
The cognitive processes implicated in numerical information processing are doubtless many and varied. In the face of this complexity we have adopted the strategy of restricting the scope of our work to a limited and therefore potentially tractable set of issues, with the expectation that this work will eventually provide a basis for expanding the scope of inquiry to address a broader range of questions.
³ The assumption that functionally independent processing mechanisms may develop through experience has also been made by other theorists (see, e.g., Ellis & Young, 1988, p. 15).
Accordingly, our model was not proposed as a comprehensive theory encompassing all numerical processing. At present the model is restricted in scope to comprehension and production of numerals in digit and word form, and simple arithmetic (i.e., basic addition, subtraction, multiplication, and division). Extension of the model to other aspects of numerical cognition (e.g., algebra problem-solving) would require postulation of processes falling outside the scope of the current model, although many of these processes would presumably draw upon the numeral processing and calculation mechanisms currently incorporated in the model.

The restricted scope of our model has two important implications. First, the model does not claim (as Campbell and Clark occasionally seem to assume) that any conceivable numerical process can be assigned to one of the posited processing components or, more generally, that all numerical processing is carried out by a small set of modules. Second, the model does not contend that all numerical processing is mediated by abstract semantic representations. For example, internal spatial representations may well be implicated in solving geometry problems. Thus, rather than sweeping generalizations about all numerical processing, our model makes specific claims about the processes falling within its scope.

Numerical and non-numerical processing mechanisms
In contrasting their views with ours, Campbell and Clark state, "We doubt that the human brain involves number-processing mechanisms that are completely independent of non-numerical cognitive functions" (Campbell and Clark, this volume, p. 487), and "we believe that number skills do not comprise a separable cognitive 'system'" (Campbell and Clark, this volume, p. 487). However, our model does not assume that numerical processing mechanisms constitute a cognitive system that is isolated from non-numerical processing mechanisms. For example, we have repeatedly raised as an unresolved question the relationship between mechanisms for processing verbal numerals, and the mechanisms for processing language in general (e.g., McCloskey et al., 1985; McCloskey et al., 1986; McCloskey, in press; McCloskey, Aliminosa, & Macaruso, 1991). McCloskey (in press) discusses the matter as follows:

A central question in this realm concerns whether numeral-processing mechanisms are incorporated within, or are separate from, the cognitive language-processing system. Presumably, lexical processing of verbal numerals
(i.e., comprehension and production of individual number words) implicates general lexical processing mechanisms. For example, in production of spoken verbal numerals phonological number-word representations are presumably retrieved from a general phonological output lexicon (although the number words may comprise a functional class within that lexicon). In the case of syntactic processing of verbal numerals, and processing of Arabic numerals in general, the situation is perhaps less clear. (McCloskey, in press, p. 46)

Further, various numeral processing operations are likely to draw upon general cognitive capacities such as attention and working memory (see, e.g., McCloskey et al., 1985; McCloskey, Aliminosa, & Macaruso, 1991). For example, calculation processes may invoke working memory to retain carry digits in the course of solving multi-digit problems, or even to retain the problem and intermediate results during attempts to solve problems in one's head. We have decided to focus initially on characterizing processes in terms of their roles in numerical tasks, postponing for subsequent consideration the relationships between numerical and non-numerical processing mechanisms (see, e.g., McCloskey et al., 1986; McCloskey, Aliminosa, & Macaruso, 1991). Regardless of the merits of this research strategy, it is clearly not equivalent to assuming that numerical processing mechanisms constitute an isolated system.

Interaction of processes
According to our model, numerical tasks are carried out by recruiting basic processes in various combinations. For example, reading aloud Arabic numerals is presumed to involve Arabic numeral comprehension processes and verbal numeral production processes, and these same processes are assumed to be involved in other tasks (i.e., tasks involving Arabic numeral stimuli in the case of Arabic numeral comprehension processes, and tasks involving spoken verbal numeral responses in the case of verbal numeral production processes). Thus, the functional architecture assumed by our model is modular not in the sense that each numerical task is performed by an isolated process dedicated solely to that task, but rather in the sense that processes for performing whole tasks are constructed from a limited set of more basic processes, each of which performs the same specific function in any task for which it is recruited. In many instances, carrying out a task may involve a complex interleaving of the component processes. For example, our model assumes that calculation procedures for solving multi-digit arithmetic problems fulfill their function by coordinating and sequencing various numeral comprehension, numeral production, and arithmetic fact retrieval operations. McCloskey (in press) describes in the
following way the application of the multiplication procedure to the problem 64 x 59:

[the procedure] would call first for processing of the digits in the rightmost column (i.e., 4 and 9). Thus, Arabic numeral comprehension processes would be recruited to translate the digits into abstract internal representations. These representations, along with a representation of the arithmetic operation, would then be taken as input by the arithmetic fact retrieval process, which would return an abstract internal representation of the product (i.e., {3}10EXP1, {6}10EXP0). The multiplication procedure would then call for the ones portion of the product to be written in Arabic form beneath the rightmost column of the problem. . . . (McCloskey, in press, pp. 9-10)
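The orchestration described in this passage can be sketched in code. The following toy program is our own gloss on the quoted procedure, with a Python dictionary standing in for the arithmetic fact store and simple functions standing in for the comprehension and production mechanisms:

```python
FACTS = {(a, b): a * b for a in range(10) for b in range(10)}  # stand-in fact store

def comprehend_digit(ch):   # Arabic numeral comprehension (single digit)
    return int(ch)

def produce_digit(q):       # Arabic numeral production (single digit)
    return str(q)

def partial_product(top_row, multiplier_digit):
    """One pass of the multi-digit procedure: work right to left, retrieving
    each single-digit fact, writing the ones portion, and carrying the tens."""
    out, carry = [], 0
    for ch in reversed(top_row):
        product = FACTS[(comprehend_digit(ch), multiplier_digit)] + carry
        out.append(produce_digit(product % 10))  # ones portion written beneath column
        carry = product // 10                    # tens portion held in working memory
    if carry:
        out.append(produce_digit(carry))
    return "".join(reversed(out))

print(partial_product("64", 9))  # '576', the first partial product of 64 x 59
```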
Thus, we view calculation procedures as orchestrating the application of several more basic processes, including not only the calculation-specific arithmetic fact retrieval process, but also numeral comprehension and numeral production processes that are involved in a variety of non-calculation tasks. As discussed in an earlier section, more general cognitive capacities such as working memory may also come into play.

These assumptions about interaction of processes in no way contradict our claim that the various processes are functionally distinct. For example, the assumption that calculation procedures orchestrate numeral comprehension, numeral production, and arithmetic fact retrieval processes is entirely consistent with the view that the calculation procedures (a) are represented separately from the various processes they coordinate, and (b) have no access to, or involvement in, the inner workings of the more basic processes, but simply pass inputs to, and accept outputs from, these processes. Thus, for example, the model's assumptions allow for the possibility that calculation procedures could be disrupted while the processes they coordinate continue to function normally, and continue to be available for other tasks.

Counting and calculation. Several of Campbell and Clark's (this volume, pp. 481-486) arguments against our model seem to reflect a misconstrual of the model's claims regarding interaction of processes. One such argument concerns the involvement of counting in calculation. Counting, Campbell and Clark note, seems to be related to verbal numeral production; hence, within the framework of the McCloskey et al. model one might want to localize counting within the verbal numeral production component. However, the argument continues, arithmetic operations may sometimes be carried out by counting (e.g., 7 and 4 can be added by counting up 4 units from 7). Thus, if counting is placed outside the calculation component, one is faced with the problem that calculation performance "may
depend on resources external to the putative calculation system" (Campbell & Clark, this volume, p. 484). Obviously, the assumption here is that according to our model, a calculation process could not invoke a counting process unless the latter were localized within the calculation component. However, we have seen that calculation processes can invoke numeral comprehension, numeral production, and working memory processes, and there is nothing in the model to prohibit other sorts of processes, including counting, from being recruited as well (as long as the invoked process could be fed inputs in an appropriate form, and the invoking process could operate on the resulting output).

More specifically, the involvement of counting in calculation may be accommodated simply by assuming (following Siegler, 1988; Siegler & Shrager, 1984) that when an arithmetic fact is not available by retrieval, a learned "backup" process may be invoked to compute the fact. In the case of an addition or subtraction fact, the backup process may involve computing the fact by counting up or down from a problem operand, and this process may therefore recruit basic counting processes that are also invoked in various non-calculation contexts. Of course, these assumptions would need to be fleshed out -- and tested -- with respect to such matters as the specific form taken by the backup process, the nature of the representations passed to and from the process, and so forth. For present purposes, however, the relevant point is that involvement of counting in calculation is in no sense troublesome for our model.⁴ This conclusion applies equally to Campbell and Clark's arguments that the McCloskey et al. model does not allow for involvement of magnitude judgments or assessment of odd-even status in arithmetic verification decisions (Campbell and Clark, this volume, p. 484).⁵
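The backup idea can be sketched as follows. This is our own toy rendering of the Siegler-style assumption, with a deliberately sparse retrieval store so that the counting route is exercised; none of the names or details come from the model itself:

```python
RETRIEVED_FACTS = {(2, 2): 4, (3, 3): 6}  # toy store of retrievable sums

def count_up(start, steps):
    """A basic counting process, also recruitable in non-calculation tasks."""
    value = start
    for _ in range(steps):
        value += 1
    return value

def add(a, b):
    answer = RETRIEVED_FACTS.get((a, b))
    if answer is None:           # retrieval fails: invoke the learned backup
        answer = count_up(a, b)  # e.g., 7 + 4 by counting up 4 units from 7
    return answer

assert add(7, 4) == 11  # computed by the counting backup, not by retrieval
```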
⁴ Campbell and Clark's discussion also seems to reflect the erroneous assumption that processes such as counting must, according to our model, be assigned to one of the postulated processing components. As we have already noted, however, the model does not assume that all numerical processes fall within one of these components. Thus, counting need not be localized within the verbal numeral production component, although some counting tasks may invoke verbal numeral production processes. It may also be noted that the term "counting" covers a substantial range of numerical tasks (e.g., saying the count words aloud in sequence, counting objects, solving subtraction problems by counting). Hence, it is unlikely that counting represents a unitary cognitive process, as the Campbell and Clark (this volume) discussion seems to assume.

⁵ In developing their arguments on counting, odd-even status, and magnitude judgments Campbell and Clark refer to an article by Sokol et al. (1989): "Sokol et al. (1989) suggested that processing of magnitude and odd-even status could be external to the calculation module" (Campbell & Clark, this volume, p. 483); and, "Sokol et al. (1989) state that they do not consider counting to be a normal function of the calculation system (p. 108)" (Campbell & Clark, this volume, p. 483). From these references it might appear that the Sokol et al. (1989) article provides a basis for Campbell and Clark's interpretation of our model's claims about interaction of processes. However, the Sokol et al. passage to which Campbell and Clark are referring bears little resemblance to the Campbell and Clark rendering and provides no foundation whatever for their construal of our model:

our model does not, as Campbell and Clark (1988) claimed, imply that odd-even status is an exclusively calculation-related variable. The role of odd-even status in calculation tasks is (at least in our view) quite uncertain, and the variable may well be related to performance in noncalculation tasks (e.g., counting by twos, which at least when well-practiced does not seem to involve calculation). Hence our model does not require that effects of odd-even status be limited to calculation tasks. For example, the effects of this variable could even be taken to suggest that the internal numerical representations posited by our model reflect odd-even status as well as numerical proximity (although we do not find this interpretation particularly appealing). (Sokol et al., 1989, p. 108)
Arithmetic and number naming. A different sort of problem attends Campbell and Clark's argument concerning relationships between arithmetic and number naming (Campbell & Clark, this volume, pp. 484-485; see also Clark & Campbell, 1991). Campbell and Clark note that arithmetic concepts and relations are implicit in number-naming systems (e.g., the verbal numeral two thousand three hundred corresponds to 2 x 1000 + 3 x 100). This involvement of arithmetic in number-naming systems, they argue, constitutes a problem for our model's assumption of separate numeral-processing and calculation mechanisms.

It is certainly true that number-naming systems may be described as involving arithmetic relations. However, this observation does not warrant the very strong claims about underlying cognitive processes implicit in -- and required by -- Campbell and Clark's argument. The observation that number names often involve additive and multiplicative relations by no means implies that numeral comprehension and production involve the same cognitive addition and multiplication processes implicated in solving arithmetic problems like 378 + 412 or 54 x 98. Hence, this observation does not constitute evidence against our assumptions about separation of numeral processing and calculation mechanisms.

Interpretation of dyscalculias

In discussing numerical processing deficits (dyscalculias) Campbell and Clark (this volume; see also Clark & Campbell, 1991) seriously misrepresent the methods we espouse and practice for interpreting patterns of impaired performance. According to Campbell and Clark, we take for granted the validity of our model when arriving at interpretations of deficits. For example, Campbell and Clark ascribe to us the view that "if a calculation can be performed successfully given arabic digits, but cannot be performed when the stimuli are presented using
number words, then one can conclude that the comprehension system for words must be impaired. This follows because unimpaired calculation for digits confirms that the abstract calculation process is intact" (Campbell & Clark, this volume, p. 461, emphasis added). Thus, in assessing the implications of this hypothetical dissociation we would, according to Campbell and Clark, accept without question our model's assumption that the calculation process is the same for problems presented in digit (e.g., 8 x 7) and word (e.g., eight times seven) format; therefore, we would conclude, solely from this dissociation, that (1) a calculation process shared by the two tasks is intact, and (2) the impaired performance on word-format problems reflects a deficit in comprehension of number words. In the words of Campbell and Clark, our approach would be one of "uncritically applying the methodological strategy of using performance on one task as a control for components of another task" (Campbell and Clark, this volume, p. 484; see also Clark & Campbell, 1991).

This portrayal of our views is a gross misrepresentation. To be sure, we have argued on a number of occasions that interpretation of deficits in cognitive neuropsychological research is necessarily a theory-laden enterprise (e.g., McCloskey et al., 1985; McCloskey, Aliminosa, & Macaruso, 1991). For one thing, interpretations are stated in terms of disruption to cognitive processes specified in a theory of normal processing (e.g., verbal numeral comprehension deficit, arithmetic fact retrieval deficit). Further, inferences about which of the hypothesized processes are intact, and which are impaired, are drawn on the basis of the patient's performance on cognitive tasks. This use of task performance obviously requires specific assumptions about the processes involved in the various tasks.

However, the point that interpretations of deficits are necessarily grounded in assumptions about normal cognitive processes and their involvement in particular tasks does not imply that the validity of these assumptions is taken for granted. It is precisely because the assumptions cannot be taken for granted that efforts are made to obtain evidence from multiple sources concerning the status of each hypothesized process of interest. If the various sources of evidence converge in the conclusion they suggest about a process -- for example, if all of the evidence suggests that the process is impaired in a particular way -- then one can have some confidence in the conclusion (and in the theoretical framework within which it was generated). On the other hand, if two or more pieces of evidence have inconsistent implications when interpreted within a particular theoretical framework, then one has cause to question the assumptions of that framework. Thus, one seeks multiple sources of evidence concerning each hypothesized process
of interest exactly because this method provides a check on the validity of the theoretical assumptions in which interpretations of deficits are grounded.

It is important to emphasize that we not only preach but practice this approach (see, e.g., McCloskey et al., 1986; McCloskey, Aliminosa, & Macaruso, 1991; McCloskey, Aliminosa, & Sokol, 1991; Sokol & McCloskey, 1988; Sokol et al., 1991). For example, Sokol et al. (1991) reported results from patient PS, who presented with impaired performance in tasks involving multiplication of single-digit numbers. According to our model, such tasks require (1) a comprehension process that generates a semantic representation of the problem; (2) an arithmetic fact retrieval process that retrieves a semantic representation of the answer; and (3) a production process that translates the retrieved answer into the appropriate form for output. Sokol et al. (1991) concluded that PS's impaired multiplication performance reflected a deficit in arithmetic fact retrieval. In support of this conclusion they presented the following results and arguments:

(1) PS's error rate was reliably lower for the 1's and 2's multiplication problems (1%) than for problems with both operands in the range 3-9 (24%). Because the 1's and 2's problems (e.g., 2 x 9, 8 x 1) collectively require comprehension of the full range of numbers occurring in single-digit multiplication problems (i.e., 1-9), this result argues against the possibility that the multiplication errors for the 3-9's problems reflect impaired comprehension of the problem operands.

(2) The vast majority of PS's erroneous responses were answers in the single-digit multiplication table, and most of these errors were multiples of one of the problem operands (e.g., 7 x 6 = 35). This result argues against the possibility that the errors occurred when impaired numeral production processes translated correct internal representations of answers into incorrect overt responses, because errors resulting from a production deficit should not be constrained to answers in the single-digit multiplication table or, more specifically, to multiples of problem operands. The error types are, however, consistent with a deficit in arithmetic fact retrieval.

(3) PS showed excellent performance on six "transcoding" tasks involving all translations among the various numerical formats in which problems were presented and responses were elicited (e.g., stimulus 27, response twenty-seven). This result suggests that comprehension and production of the various formats was intact, at least to the extent required by the multiplication tasks.

(4) In single-digit multiplication PS's error rate and distribution of errors across error types were not affected by variation in the form in which problems
were presented (e.g., 8 x 7 versus eight times seven), or by variation in the form in which responses were elicited (e.g., 56 versus fifty-six). This result points to a deficit affecting an arithmetic fact retrieval process that remains constant across variation in input and output format.

(5) In solving multi-digit multiplication problems PS's error rate and error distribution for individual multiplication facts were the same as in the single-digit multiplication tasks. This result supports the conclusion of a deficit affecting an arithmetic fact retrieval process implicated in solving both single- and multi-digit multiplication problems.

The inferences drawn from each of the above results are based upon assumptions about cognitive processes and their involvement in particular tasks. For example, the use of the transcoding tasks to assess the status of the numeral comprehension and production processes required by the multiplication tasks rests upon the assumption that the transcoding tasks collectively involve the same comprehension and production processes as the multiplication tasks. However, Sokol et al. (1991) clearly did not take the validity of these assumptions for granted, and did not draw conclusions about the status of cognitive processes from single results. For example, each hypothesized numeral comprehension and production process was tested in two different transcoding tasks. Further, conclusions about the status of these processes were not drawn solely from the transcoding data; results from the multiplication tasks themselves were also brought to bear. Thus, evidence was obtained from multiple sources, and this evidence converged on the conclusion that PS was impaired in retrieval of arithmetic facts. At the same time the convergence of results provided support for our model's assumption that arithmetic fact retrieval processes remain constant as stimulus and response format are varied.

Approaches to theory development and evaluation

It should be evident from the preceding discussion that the claims of our model are somewhat less extreme than Campbell and Clark have suggested. However, this is not to say that the differences between our theoretical position and that of Campbell and Clark evaporate when our model is interpreted as we intended; on any interpretation the two positions are radically different. It appears to us that the differences reflect not only a disparity in views about numerical processing, but also -- and perhaps even more importantly -- strongly contrasting views about theory development and evaluation.

Our model reflects the belief that in developing cognitive theories it is advantageous to begin by postulating a minimal set of processing mechanisms.
This approach allows the generation of testable predictions, and provides a well-defined anchor point from which further theory development may proceed by adding or modifying assumptions as dictated by specific empirical findings. Campbell and Clark's style of theorizing is quite different, apparently involving an attempt to incorporate any and all representations and processes that data or intuition suggest might be involved in numerical processing. The result is a theory that is comprehensive, but also vague and probably immune to disconfirmation.

These points may be illustrated by contrasting the two approaches with respect to their treatment of Arabic and verbal numeral processing. Campbell and Clark (1988, this volume; Clark & Campbell, 1991) assume that when a numeral is encountered, many different forms of representation may become activated via connections among codes in an associative network. For example, processing of an Arabic or verbal numeral may activate (directly or indirectly) "visual and written codes for digits, imaginal, analogue codes for magnitude (e.g., number lines), and combined visual-motor representations (e.g., counting on fingers; using an abacus)" (Campbell & Clark, this volume, p. 459), as well as verbal codes that include "articulatory and auditory codes in most people, visual and written number-word codes in literate individuals, and unique codes in various specific groups (e.g., sign-language codes for numbers)" (Campbell & Clark, this volume, p. 459). The associations among the various forms of representation are assumed to mediate processes such as numerical transcoding (i.e., translation of numerals from one form to another). Campbell and Clark also raise the possibility that some transcoding tasks (e.g., reading aloud Arabic numerals) may become "proceduralized," such that the transcoding could be accomplished by a "relatively autonomous" asemantic procedure.

These assumptions, although comprehensive, are rather vague. For example, Arabic-to-verbal and verbal-to-Arabic transcoding are not simple processes involving one-to-one translation of digits to words, or vice versa (see, e.g., McCloskey et al., 1986; McCloskey, in press). Thus, vague allusions to associations or autonomous asemantic procedures are not sufficient to explain how transcoding is carried out. Further, Campbell and Clark's proposals concerning numeral processing do not appear to be falsifiable; it is difficult to imagine any data that would disconfirm the proposals in whole or in part. This immunity to disconfirmation stems not only from the vagueness of the Campbell and Clark assumptions, but also from the multiplicity of representations and processes that, according to the encoding complex theory, may come into play in any numerical task. For example, in the case of Arabic-to-verbal numerical transcoding the model apparently allows for involvement of direct associations between the various different forms of Arabic
and verbal codes, indirect associations via any of the various other forms of representation (including semantic representations), and "proceduralized" routines. Because Campbell and Clark offer no specific claims about which of these representations and processes contribute significantly to transcoding performance, or how the contributing processes accomplish the task, the encoding complex theory could accommodate results suggesting that transcoding was performed by any one of the aforementioned processes, or any combination of two or more of these processes. Thus, empirical evidence could perhaps provide confirmation for certain assumptions of the encoding complex theory, but apparently has little if any role to play in identifying inadequacies of the theory, and suggesting appropriate modifications.

In contrast our model postulates a small set of numeral comprehension and production mechanisms, each of which is grounded in basic observations about numerical processing. For example, in circumstances where people encounter Arabic or verbal numerals (e.g., examining a price tag, or listening to the current temperature in a radio broadcast), the aim is often to determine the numeral's meaning. Accordingly, our model posits processes that compute semantic representations from Arabic and verbal numerals. Similarly, spontaneous production of numerals typically begins, we presume, with a to-be-expressed meaning. Hence, we also posit processes that translate semantic representations of numbers into Arabic or verbal form for output. For each of the postulated processes the model specifies the input and output representations, and offers some assumptions about the internal organization of the process (see, e.g., McCloskey et al., 1985; McCloskey & Caramazza, 1987; McCloskey, Sokol, Goodman-Schulman, & Caramazza, 1990). Further, McCloskey et al. (1986; see also McCloskey et al., 1990) have proposed a detailed model of verbal numeral production. Thus, while we freely acknowledge that many of our assumptions are in need of more explicit development, these assumptions are sufficiently well-specified to provide a basis for a variety of testable predictions.

Consider, for instance, the application of the model to numerical transcoding. The Arabic and verbal numeral comprehension and production processes are sufficient in principle for performing transcoding tasks. For example, an Arabic numeral (e.g., 80,012) may be read aloud by translating the numeral into a semantic representation through the Arabic numeral comprehension process, and then translating the semantic representation into a sequence of number-word representations (e.g., eighty thousand twelve) by means of the verbal numeral production process.
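A deliberately simplified sketch of this semantically mediated transcoding route follows. Real verbal numeral production (McCloskey et al., 1986) is far more elaborate, and for brevity the semantic representation is collapsed to an integer rather than the quantity/power-of-ten pairs discussed earlier; the functions are illustrative only:

```python
ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven", "eight",
        "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
        "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]

def comprehend(numeral: str) -> int:
    """Arabic numeral comprehension: recover the quantity."""
    return int(numeral.replace(",", ""))

def produce_verbal(n: int) -> str:
    """Verbal numeral production from the (simplified) semantic
    representation; handles numbers below one million."""
    if n < 20:
        return ONES[n]
    if n < 100:
        tens, ones = divmod(n, 10)
        return TENS[tens] + ("-" + ONES[ones] if ones else "")
    if n < 1000:
        hundreds, rest = divmod(n, 100)
        return ONES[hundreds] + " hundred" + (" " + produce_verbal(rest) if rest else "")
    thousands, rest = divmod(n, 1000)
    return produce_verbal(thousands) + " thousand" + (" " + produce_verbal(rest) if rest else "")

assert produce_verbal(comprehend("80,012")) == "eighty thousand twelve"
```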
Accordingly, we take as our initial hypothesis that the numeral comprehension and production processes -- required in any event to account for processing that clearly implicates the meaning of numerals -- also mediate numerical transcoding. This hypothesis may of course be wrong; indeed, falsifiability is one of its major virtues. If evidence contradicting the hypothesis were reported, we would then have clear motivation for postulating additional processing mechanisms.

Depending on the nature of the evidence, various additions to the model could be considered. At the simplest level one might entertain the possibility that representations of individual digits (e.g., 6), and phonological and orthographic representations of individual number words (e.g., /sIks/, SIX) may be mapped onto one another without an intervening semantic representation. Most models of reading and spelling postulate grapheme-phoneme conversion processes and/or direct orthography-phonology mapping processes for translation between orthographic and phonological representations of words, and these processes surely may be applied to number words (e.g., for translating between SIX and /sIks/). In failing to acknowledge these processes our model is clearly oversimplified. Conceivably, there may also be processes that map directly from a digit representation to the corresponding phonological or graphemic number-word representation (e.g., 6 <--> SIX or /sIks/).

At the other extreme, one might entertain the possibility of full-scale asemantic transcoding algorithms capable of handling complex numerals as well as single digits and words. This is apparently what Campbell and Clark (this volume) have in mind when referring to proceduralization (see also Deloche & Seron, 1987). In considering this possibility it should be borne in mind that given the complexity of the relationship between Arabic numerals and their verbal counterparts, such algorithms would have to be comparable in sophistication to the numeral comprehension and production processes postulated by our model. Indeed, the Arabic-to-verbal asemantic transcoding algorithm proposed by Deloche and Seron (1987) is similar in many respects to the verbal numeral production algorithm proposed by McCloskey et al. (1986).

Our aim in this discussion is not to propose additions to our model's assumptions about numerical transcoding. Except in the case of processes for translating between phonological and orthographic number-word representations, there is no clear evidence to suggest the need for additional transcoding mechanisms (although see Cohen & Dehaene, 1991, for some suggestive results). Rather, the point of the discussion is twofold. First, by starting with a minimal set of processing mechanisms, we can introduce additional mechanisms in a principled fashion -- that is, in such a way that each process incorporated into the model has a clear empirical motivation. Second, to accommodate (at least some forms of) evidence inconsistent with the model, specific limited modifications would suffice. Thus, contrary to what Campbell and Clark seem to assume, one is not limited in
theorizing about numerical processing to a choice between our model in its present form, and the vastly more complicated, and vastly less constrained, encoding complex perspective. To put it another way, evidence against our model is not necessarily evidence for the encoding complex theory. Of course, testing and consequent modification of our model could eventuate in a version comparable in complexity to Campbell and Clark's encoding complex theory. Thus, our objection to the encoding complex theory is not that it overestimates the complexity of numerical processing. Rather, our objection is that Campbell and Clark postulate enormous complexity without adequate motivation, and in a form not subject to empirical test.

Empirical evidence
In raising objections to our model, Campbell and Clark present not only logical arguments, but also empirical results.

The Campbell and Clark experiment
The results of greatest potential relevance to our model come from an experiment by Campbell and Clark (this volume) in which subjects gave spoken answers to single-digit multiplication problems. Problems were presented in the form of digits (e.g., 8 x 7) or written words (e.g., eight x seven), with digit- and word-format trials alternating. Our model assumes that for problems in either format an encoding or comprehension process first translates the problem into an internal semantic representation. An arithmetic fact retrieval process then retrieves from this problem representation a semantic representation of the answer, and finally a verbal numeral production process generates from the answer representation an overt response in the form of spoken words. According to the model, the encoding processes will differ as a function of problem format. For the digit format, encoding will implicate Arabic numeral comprehension processes, whereas for the word format verbal numeral comprehension processes will be required. However, in both formats the encoding process is assumed to convert the problem into the same semantic representation. Accordingly, the model predicts that, other things being equal, the arithmetic fact retrieval process (and response production process) will be the same for the two formats. Campbell and Clark report, however, that several results from their study demonstrate differences between the word and digit formats in arithmetic fact retrieval processes. These results, they suggest, argue against our model and in favor of the encoding complex theory.
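The format-invariance prediction at issue can be sketched as follows (our illustrative encoding, not the model's actual machinery): format-specific comprehension processes converge on a single semantic problem representation, so the subsequent fact retrieval step should not differ between formats:

```python
WORD_TO_QUANTITY = {"one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
                    "six": 6, "seven": 7, "eight": 8, "nine": 9}

def comprehend_digit_problem(stimulus):   # e.g., "8 x 7"
    a, _, b = stimulus.split()
    return (int(a), int(b))

def comprehend_word_problem(stimulus):    # e.g., "eight x seven"
    a, _, b = stimulus.split()
    return (WORD_TO_QUANTITY[a], WORD_TO_QUANTITY[b])

def retrieve_product(problem):            # shared, format-invariant retrieval
    a, b = problem
    return a * b                          # stand-in for the fact store

# Both formats yield the same semantic representation, hence identical retrieval:
assert comprehend_digit_problem("8 x 7") == comprehend_word_problem("eight x seven")
assert retrieve_product(comprehend_word_problem("eight x seven")) == 56
```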
In the following discussion we suggest that the Campbell and Clark results, although interesting, do not constitute clear evidence against our model. We first show that reaction time data taken by Campbell and Clark as evidence of format-dependent retrieval processes can be interpreted in terms of differences between formats in problem encoding processes. We then argue that although some results concerning errors do point to differences between formats in arithmetic fact retrieval processes, these differences are not of a sort that are inconsistent with our model. Specifically, we suggest that the differences occurred simply because fact retrieval was carried out under greater speed pressure for word-format problems than for digit-format problems.

As we consider the implications of the Campbell and Clark results it is important to bear in mind two points. First, the pattern of results was extremely similar for the word and digit formats. For example, the individual-problem reaction times correlated .913 across formats, and the correlation for error rates was .895. Thus, the experiment did not reveal gross differences in performance between the word and digit problems; the differences between formats were subtle. Therefore, relatively subtle factors may suffice to explain the differences. The second point is that the obtained differences between word and digit problems did not provide confirmation for specific predictions of the encoding complex theory. The Campbell and Clark theory does not specify how arithmetic fact retrieval processes vary as a function of problem format, and so does not make explicit predictions about differences between digit and word problems. To be sure, Campbell and Clark offer post hoc interpretations for some of the observed differences. However, effects opposite to those obtained would have been equally consistent with the encoding complex theory.⁶
⁶ In fact even a finding of no difference between formats in retrieval processes would apparently not be an embarrassment to the encoding complex position. In discussing the similarity in results between word and digit formats Campbell and Clark make the following points:

One possible explanation for these similarities is that a common code is used in number processing, but that the code is specific rather than abstract. That is, subjects may translate all stimulus codes into number word or visual digit format, so that calculations are produced predominantly via the same specific code. A related explanation is that the various specific numerical codes become so strongly associated as to activate one another with a high probability every time any one of the codes is activated, unless special conditions are implemented to interfere selectively with specific modalities. Still another possibility is that many features of the associative structure emerge from learning and contextual experiences that would develop and operate similarly in different stimulus modes. That is, interfering associations, effects of priming, and related mechanisms may operate similarly whether words or visual digits constitute the nodes in the associative network. Thus, evidence for unique features of number processing with different surface forms or modalities provides evidence for specific codes, but the converse is not necessarily true (Campbell & Clark, this volume, p. 482).

This argument would apply to results showing identical as well as merely similar fact retrieval processes across stimulus formats, and illustrates very nicely the immunity from disconfirmation conferred by the multiplicity and vagueness of the encoding complex theory's assumptions.
Hence, we are not in a position of attempting to accommodate results specifically predicted by a competing theoretical framework. Rather, we are proposing alternatives to the post hoc interpretations offered by Campbell and Clark. We do not assert that our interpretations are necessarily correct; rather, we simply claim that these interpretations are as plausible as (and more specific than) the accounts offered by Campbell and Clark, and hence that their data do not provide clear-cut evidence against our model.

Reaction time results

Two reaction time (RT) results from the Campbell and Clark experiment are potentially relevant to our model. First, the difference in RT between "easy" and "difficult" problems was greater for the word format than for the digit format. Second, variables related to arithmetic fact retrieval processes accounted for significant amounts of variance in a multiple regression analysis of word-digit RT differences, even when variables potentially related to encoding differences between formats were included in the analysis. Campbell and Clark take these results as evidence that fact retrieval processes differed between the word and digit formats. We suggest to the contrary that the findings can be interpreted in terms of encoding processes. Our interpretation asserts (a) that word problems (e.g., eight x three) generally took longer to encode than digit problems (e.g., 8 x 3), and (b) that the magnitude of this encoding disadvantage for the word format varied across problems as a function of several variables, including a variable Campbell and Clark failed to consider.
Magnitude of the word-digit RT difference

Encoding a word requires processing of several characters, whereas encoding a digit involves processing of only one. Hence, it seems plausible to assume that word-format multiplication problems (e.g., nine x seven) would take longer to encode than digit-format problems (e.g., 9 x 7). The finding of slower word than digit RT is of course consistent with this assumption. However, one might wonder whether a word-digit RT difference as large as that obtained by Campbell and
Clark (roughly 300 msec) could plausibly be interpreted in terms of differences between formats in encoding time. In our view this interpretation is entirely plausible, for several reasons. First, multiplication problems are frequently encountered in digit form, but virtually never in the form of written words. Thus, for the word problems but not the digit problems subjects were faced with an unfamiliar task. Furthermore, the subjects had little opportunity to gain familiarity with the word format -- subjects were apparently given no practice in the task, and each subject received a total of only 124 trials in each format. In addition, the procedure of alternating word and digit trials may have made it difficult for subjects to become comfortable with the word format. It may also be noted that the word problems, by virtue of their greater physical length, were spread out over a larger area of the display than the digit problems, perhaps requiring or at least encouraging eye movements during encoding. And, independent of issues concerning eye movements, it is conceivable that whereas the digits in a digit-format problem can be encoded in parallel, the words in a word-format problem must be processed serially (Noël & Seron, in press). Finally, analyses of word-digit RT differences for individual problems (discussed below) support the assumption of a substantial word-format encoding disadvantage, as well as the assumption that the disadvantage varies markedly across problems.7

In the following discussion we examine the RT phenomena reported by Campbell and Clark in light of our encoding hypothesis, addressing first the
multiple regression analyses of word-digit differences for individual problems, and then considering the easy-difficult by format interaction.

7 In a task where subjects read aloud individual digits (e.g., 6) and number words (e.g., six) Campbell (1990) found that mean RT was only 11 msec slower for the words than for the digits. He interpreted this result as evidence that word-format arithmetic problems do not take substantially longer to encode than digit-format problems. However, drawing inferences about the magnitude of encoding time differences between word and digit arithmetic problems from the word-digit reading time difference is, in our view, highly questionable. First, the word and digit reading stimuli (e.g., four versus 4) differed less in length than word and digit arithmetic problems (e.g., four x seven versus 4 x 7). As a consequence, potential word-digit processing differences related to differences in length between word and digit arithmetic problems (e.g., differences in eye movements) may have been attenuated or absent in the reading task. Further, if digit pairs can be encoded in parallel whereas word pairs must be processed sequentially, this word-digit difference would contribute to slower word than digit RTs in arithmetic tasks but not in the reading task. Also, issues concerning the unfamiliarity of word-format arithmetic problems do not arise in the reading task. In a subsequent section we present reading-time data as evidence of variation in encoding time among individual number words (and lack of variation among digits). However, our use of these data is very different from Campbell's use of his reading results. We do not attempt to make inferences about absolute differences in encoding time between digit and word arithmetic problems, or even about absolute differences in encoding time between digits and words presented in isolation. Accordingly, the above-mentioned problems do not arise.

Word-digit RT differences for individual problems

Table 1 presents the difference in RT between the word and digit formats for each of the 64 problems tested by Campbell and Clark.8

Table 1. Difference in RT (in msec) between Word Format and Digit Format Problems in the Campbell and Clark Experiment

                          Second Operand
First Operand     2     3     4     5     6     7     8     9
      2         126   283   292   286   278   202   328   260
      3         254   265   311   331   400   397   414   360
      4         299   261   153   295   311   491   371   404
      5         271   302   314   178   405   338   295   383
      6         316   421   389   254   145   376   372   326
      7         236   407   303   365   502   198   502   399
      8         262   512   426   344   305   452   186   356
      9         206   393   441   359   330   403   396   144
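For reference in the analyses that follow, the table can be transcribed and summarized directly. The following sketch (Python; a sketch of our own, with the matrix transcribed from Table 1) reproduces the tie and non-tie means discussed below:

```python
# Word-digit RT differences (msec) from Table 1; rows and columns
# correspond to first and second operands 2-9.
RT_DIFF = [
    [126, 283, 292, 286, 278, 202, 328, 260],
    [254, 265, 311, 331, 400, 397, 414, 360],
    [299, 261, 153, 295, 311, 491, 371, 404],
    [271, 302, 314, 178, 405, 338, 295, 383],
    [316, 421, 389, 254, 145, 376, 372, 326],
    [236, 407, 303, 365, 502, 198, 502, 399],
    [262, 512, 426, 344, 305, 452, 186, 356],
    [206, 393, 441, 359, 330, 403, 396, 144],
]

ties     = [RT_DIFF[i][i] for i in range(8)]
non_ties = [RT_DIFF[i][j] for i in range(8) for j in range(8) if i != j]
print(round(sum(ties) / len(ties)))          # 174 msec (tie problems)
print(round(sum(non_ties) / len(non_ties)))  # 348 msec (non-tie problems)
```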
Length and ties. Two results reported by Campbell and Clark in analyses of these RT differences bear on our assumptions concerning word-digit encoding differences. The first concerns problem length. Length was constant at 5 character spaces for digit-format problems (e.g., 7 x 8), but varied for word-format problems from 9 (e.g., two x two) to 13 (e.g., seven x eight). Thus, one might expect not only that word problems, being longer than digit problems, would take longer to encode, but also that the encoding disadvantage would increase with the length of the problem in word form.
8 We are grateful to Campbell and Clark for making the data from their experiment available to us.
In accord with this expectation Campbell and Clark's regression analysis revealed that problem length in word format accounted for a significant amount of the variance in the word-digit RT difference. The regression weight for the length variable -- 27.6 -- indicates that even when other variables were taken into account the word-digit RT difference increased by about 28 msec for each added character in the word form of a problem. Thus, due to problem length alone we would expect the word-digit RT difference to be about 110 msec greater (i.e., 4 x 27.6) for the longest problems (13 spaces) than for the shortest problems (9 spaces). These results are clearly consistent with the assumption of a word-format encoding disadvantage that is both substantial in magnitude, and varies considerably across problems.

This assumption finds further support in the results for tie problems (e.g., 2 x 2, three x three). The average word-digit RT difference was 348 msec for non-tie problems, but only 174 msec for tie problems. This phenomenon is readily apparent in Table 1. Not surprisingly, a dichotomous ties variable (coded +1 for tie problems and 0 for non-tie problems) accounted for a significant amount of variance in the regression analysis of word-digit RT differences. Campbell and Clark suggest, and we agree, that the ties effect probably reflects problem encoding processes. Recognizing early in processing that both operands of a problem are identical may reduce the encoding task from that of comprehending two separate operands to that of comprehending just one. Given the assumption that encoding is generally slower for words (e.g., eight) than for digits (e.g., 8), this facilitation of encoding should benefit word problems to a greater extent than digit problems. In this way the reduction in the word-digit RT difference for tie problems may be explained. The size of the reduction (from 348 msec to 174 msec) supports the assumption that the encoding disadvantage for word problems was quite substantial.

Size and fan. While conceding that encoding processes may have differed across problem formats, Campbell and Clark argue that the obtained word-digit RT differences cannot be explained solely in terms of encoding. This conclusion is based on the finding from the regression analysis that two variables presumed to index difficulty of arithmetic fact retrieval -- problem size and fan -- accounted for significant amounts of variance in the word-digit RT difference, even when variance attributable to length and ties was partialled out.9
9 Size refers to the magnitude of a problem's correct answer (or, virtually equivalently, to the size of its operands); fan refers to whether a problem's answer is unique to that pair of operands (as for 3 x 9), or shared by other problems with different operands (as for 3 x 6, because 18 is also the answer to 2 x 9).
Campbell and Clark interpret the size and fan effects as showing an influence of problem format on arithmetic fact retrieval processes. Implicit in this argument is the assumption that once variance attributable to length and ties is partialled out, any remaining variation must reflect processes other than encoding. However, this assumption is unsound, because Campbell and Clark failed to consider a third salient encoding-related variable: word frequency. There is a substantial body of evidence indicating that time to comprehend a written word is a function of the word's frequency (for a review see Monsell, 1991). Further, frequency varies substantially across the words two to nine, as shown in Table 2 (see also Dehaene & Mehler, in press). For example, two has a frequency 23 times that of nine.

Table 2. Word Frequency Counts for the Words Two through Nine

Word      Frequency
two           1893
three          847
four           463
five           351
six            239
seven          136
eight          127
nine            82

Note. Counts are from the Carroll, Davies, & Richman (1971) U distribution, and sum over instances in which a word occurred alone (e.g., five) or in combination with a tens word in a two-digit number (e.g., twenty-five).
Thus, there is reason to suspect that encoding time for word-format problems may vary in ways not captured by the length and ties variables. Specifically, encoding may be slower for problems with large (and hence low-frequency) operands, such as eight x six, than for problems with small (and hence high-frequency) operands, such as three x two.

However, our specific concern here is with differences in RT between word-format and digit-format problems. Therefore, we must also consider variation in frequency across the digits 2-9, and possible effects of frequency on digit encoding time. It is apparent from the digit frequency counts presented in Table 3 that the digits vary in frequency far less than their word counterparts. For example,
whereas the frequency of the word two is 23 times that of the word nine, the frequency of the digit 2 is only 4 times that of the digit 9. Thus, even if we assume that encoding time varies with frequency for digits as well as words (which is far from certain; see below), the difference in encoding time between small and large digits (e.g., 2 and 9) may well be less than the difference in encoding time between the corresponding words (e.g., two and nine).

Table 3. Frequency Counts for the Digits 2 through 9

Digit     Frequency
2              906
3              624
4              496
5              559
6              358
7              268
8              309
9              230

Note. Counts are from the Carroll et al. (1971) U distribution, and sum over instances in which a digit occurred alone or as part of a two-digit number.

As a consequence, differences in encoding time between problems with small operands and problems with large operands may have been greater for the word format than for the digit format. For example, the encoding time difference between three x two and eight x six may have been greater than the difference (if any) between 3 x 2 and 8 x 6. Furthermore, differences between word and digit formats in frequency-related encoding effects may have been responsible for the effects of size and perhaps even fan in the Campbell and Clark regression analysis. For example, the size effect indicates that the RT difference between small and large problems was greater for the word format than for the digit format. This is just what we expect if the encoding time difference between small and large problems is greater for word than for digit problems, even if arithmetic fact retrieval processes are the same for both formats.

Reanalyzing the word-digit RT differences. To evaluate this interpretation we repeated Campbell and Clark's regression analysis with an added independent variable reflecting frequency. We conducted analyses with several frequency variables, each of which yielded qualitatively identical results. Here we discuss only one of the variables: word-digit frequency difference. This variable indexed the difference in operand frequency between the word and digit format of a problem, and was calculated for each problem as the sum of the operands' word frequency minus the sum of the operands' digit frequency (e.g., for 8 x 7, the summed frequencies of eight and seven minus the summed frequencies of 8 and 7).
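To make the construction of this variable concrete, a minimal sketch (Python; our own illustration, using the counts from Tables 2 and 3):

```python
# Word and digit frequency counts from Tables 2 and 3.
WORD_FREQ  = {2: 1893, 3: 847, 4: 463, 5: 351, 6: 239, 7: 136, 8: 127, 9: 82}
DIGIT_FREQ = {2: 906, 3: 624, 4: 496, 5: 559, 6: 358, 7: 268, 8: 309, 9: 230}

def frequency_difference(a, b):
    """Summed word frequency of the operands minus their summed digit frequency."""
    return (WORD_FREQ[a] + WORD_FREQ[b]) - (DIGIT_FREQ[a] + DIGIT_FREQ[b])

# Example from the text: for 8 x 7, the summed frequencies of eight and
# seven minus the summed frequencies of 8 and 7.
print(frequency_difference(8, 7))   # (127 + 136) - (309 + 268) = -314
```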
In accord with our interpretation the frequency difference variable correlates -.394 (p < .001) with the word-digit RT difference. (The negative sign of the correlation indicates that the larger the frequency disadvantage for a problem's word format relative to its digit format, the larger the word-digit RT difference.) Furthermore, when frequency difference is entered into the regression equation predicting the word-digit RT differences, this variable and the other two encoding variables (length and ties) each account for a significant amount of variance. Notably, however, the effects of size and fan are no longer significant. With the effects of length, ties and frequency difference partialled out, the correlations of size and fan with the word-digit RT difference are only .143 and .149, respectively.

We do not suggest that frequency difference is a better predictor of word-digit RT differences than the combination of size and fan; the R² for our regression solution involving length, ties, and frequency difference (.604) is essentially the same as that for Campbell and Clark's solution involving length, ties, size, and fan (.624). Our point is simply that a solution incorporating only encoding variables is as satisfactory as the Campbell and Clark solution, which included the retrieval-related variables size and fan. Hence, we suggest, the word-digit RT differences cannot be taken to show that arithmetic fact retrieval processes vary across formats.

A potential counterargument. Word frequency, and the word-digit frequency difference, decrease systematically as operand size increases. Hence, it might be suggested that the frequency difference variable is simply an alternative problem size measure, and as such indexes retrieval difficulty in the same way as the Campbell and Clark size variable. If this were the case, partialling out effects of frequency difference in the analysis of word-digit RT differences could amount to partialling out variance due to differences between formats in arithmetic fact retrieval processes. However, it is easy to show that frequency difference is not interchangeable with size as an index of retrieval difficulty. For both the word-format RTs and the digit-format RTs (as opposed to the word-digit RT differences) the effect of the size variable remains strong and significant even when effects of frequency difference (and the other variables in Campbell and Clark's regression solutions) are partialled out. For the word RTs the partial correlation for size is a robust .532 when effects of ties, length, and frequency difference are partialled out; and for digit RTs the partial size correlation is .623 when effects of ties and frequency difference are removed. Thus, including frequency difference in the regression eliminates the effect of size only for the word-digit RT differences, and not for the word RTs alone, or for the digit RTs alone.
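The re-analysis described above can be sketched as follows (Python with NumPy; RT_DIFF and frequency_difference come from the earlier sketches, and the variable codings are our reconstruction, which may differ in detail from the published analysis):

```python
import numpy as np

WORDS = {2: 'two', 3: 'three', 4: 'four', 5: 'five',
         6: 'six', 7: 'seven', 8: 'eight', 9: 'nine'}

X, y = [], []
for a in range(2, 10):
    for b in range(2, 10):
        length = len(WORDS[a]) + len(WORDS[b]) + 3    # + 3 for ' x '
        tie = 1 if a == b else 0
        X.append([1.0, length, tie, frequency_difference(a, b)])
        y.append(RT_DIFF[a - 2][b - 2])

# Ordinary least squares: regress the word-digit RT difference on the
# three encoding variables (length, ties, frequency difference).
beta, *_ = np.linalg.lstsq(np.array(X), np.array(y, dtype=float), rcond=None)
print(dict(zip(['intercept', 'length', 'ties', 'freq_diff'], beta.round(2))))
```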
Measures of word and digit encoding time. Another potential objection is that although frequency may in general be related to encoding time for words, substantial effects of frequency seem rather unlikely in the Campbell and Clark experiment, where each word was presented many times. Fortunately, we can offer evidence that frequency is related to encoding time for the words two to nine under such circumstances, as well as evidence that encoding time is essentially constant across the digits 2-9.

As part of an ongoing study of an adult with a developmental dyscalculia, we have tested 5 normal adult control subjects in two tasks in which the 100 numerals from 0-99 were presented in random order, and reaction time was measured as the subject read each numeral aloud. The numerals were presented in Arabic form (e.g., 36) in one task, and in written verbal form (e.g., thirty-six) in the other task. Each subject received 4 blocks of 100 trials in each of the two tasks.

From the task with written verbal stimuli we can derive estimates of differences in comprehension time among the words two through nine. However, we cannot simply use for this purpose the reaction times for the stimuli two through nine. Because the reading task requires not only comprehension of the word in written form, but also production of the word in spoken form, any RT differences among the words two through nine could reflect production processes in addition to, or even instead of, comprehension processes. Also, words may differ in how readily their initial sounds trigger the voice key used to measure reading time; hence, RT differences between words are not necessarily a pure measure of processing time differences.

We can, however, at least partially circumvent these difficulties. Consider the numerals twenty-two through twenty-nine. With respect to comprehension processes these numerals differ only in requiring encoding of different ones words (i.e., two through nine); the tens word is the same in all cases. With respect to production, the initial word in the response is the same for all of the stimuli (i.e., "twenty"). Thus, effects of initial sounds on triggering of the voice key should not contribute to any RT differences. It is conceivable that response-preparation processes for the second word (i.e., the ones word) could influence time to initiate production of the tens word. Thus, RT differences among the numerals could in part reflect differences among the words two through nine in production processes. However, we will present evidence that this was not in fact the case. Consequently, the differences in reading time among the numerals twenty-two through twenty-nine may tentatively be taken as measures of the encoding time differences among the words two through nine. The same is true of differences in reading time among thirty-two through thirty-nine, forty-two through forty-nine, and so on. Thus, we computed the encoding time measure for two as the mean of the reading times for twenty-two, thirty-two, forty-two, . . ., ninety-two. The measure for three was computed as the mean of the reading times for twenty-three, thirty-three, forty-three, . . ., ninety-three; and so forth.
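In code the averaging step is simply the following (Python; reading_rt stands for a mapping from each written numeral to its mean reading RT -- a data structure we assume for illustration, since the raw times are not reported here):

```python
import statistics

ONES = ['two', 'three', 'four', 'five', 'six', 'seven', 'eight', 'nine']
TENS = ['twenty', 'thirty', 'forty', 'fifty', 'sixty', 'seventy', 'eighty', 'ninety']

def encoding_time_estimate(ones_word, reading_rt):
    """Mean reading RT over the eight numerals that end in this ones word
    (e.g., twenty-two, thirty-two, ..., ninety-two for 'two')."""
    return statistics.mean(reading_rt[f'{tens}-{ones_word}'] for tens in TENS)
```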
Differences among the measures computed for the various words provide estimates of encoding time differences between these words; obviously, however, the measure for a word in no sense provides an estimate of absolute encoding time for that word.10

Table 4. Encoding Time Estimates in msec for the Words Two through Nine

Word      Encoding Time Measure
two               625
three             653
four              631
five              635
six               639
seven             652
eight             662
nine              673
Table 4 presents the encoding time estimates for the words two through nine. The estimates vary considerably across words, from 625 msec for two to 673 msec for nine. Further, the encoding time estimates are correlated with both word length (r = .59) and word frequency (r = -.60). (When length is partialled out, the frequency-encoding time correlation remains a substantial -.48; and when frequency is partialled out, the correlation of encoding time with length is .46.) These results support the assumption that encoding time for number words varies substantially as a function of length and word frequency, even in a task where the same words are presented over and over.
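These simple correlations can be recomputed directly from Tables 2 and 4 (Python 3.10+ for statistics.correlation; word lengths are counted from the spellings):

```python
import statistics

est  = {'two': 625, 'three': 653, 'four': 631, 'five': 635,
        'six': 639, 'seven': 652, 'eight': 662, 'nine': 673}   # Table 4
freq = {'two': 1893, 'three': 847, 'four': 463, 'five': 351,
        'six': 239, 'seven': 136, 'eight': 127, 'nine': 82}    # Table 2

words = list(est)
e = [est[w] for w in words]
r_length    = statistics.correlation(e, [len(w) for w in words])
r_frequency = statistics.correlation(e, [freq[w] for w in words])
print(f'{r_length:.2f} {r_frequency:.2f}')   # 0.59 -0.60
```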
10 Even as estimates of encoding time differences between words, our measures are far from perfect. In the first place, production processes for the tens word, and perhaps even initiation of an overt response, may well begin before encoding of the ones word is completed; thus, our measures may underestimate actual encoding time differences among the ones words. Further, non-lexical grapheme-to-phoneme conversion processes may come into play in the reading task, further diluting effects of lexical comprehension processes. Finally, our data were of course collected from different subjects and under different conditions than the Campbell and Clark multiplication data; hence, we obviously cannot take the estimates of encoding time differences among number words in our numeral-reading task as measures of absolute encoding time differences in Campbell and Clark's experiment. Nevertheless, the encoding time estimates at least provide tentative evidence bearing on our assumptions about variation across number words in encoding time.
The procedure used to generate the word-encoding estimates may also be applied to the Arabic numeral-reading task in order to obtain estimates of differences in encoding time among the digits 2-9. Thus, the estimate for the digit 2 was calculated as the mean of the reading times for the numerals 22-92; the estimate for 3 was calculated as the mean for the numerals 23-93; and so forth. These estimates are presented in Table 5.

Table 5. Encoding Time Estimates in msec for the Digits 2-9

Digit     Encoding Time Measure
2                 525
3                 529
4                 519
5                 516
6                 523
7                 524
8                 529
9                 529
It is evident from Table 5 that the encoding time estimates for the digits show far less variation than the word estimates. Further, the digit estimates are essentially unrelated to either digit frequency (r = -.20) or word frequency (r = .03).11

The digit encoding-time estimates are important for two reasons. First, these data motivate our contention that the frequency-related variation for the word estimates reflected encoding processes, and not response production processes. Both the Arabic and verbal numeral-reading tasks involved responses in the form of spoken number words; hence, if the frequency effect in the word estimates reflected effects of ones-word response-preparation processes on time to initiate production of the tens word, a similar effect should presumably have shown up in the digit encoding estimates. Furthermore, the digit encoding estimates support our assumption that frequency-related variation in encoding time was greater for the words two through nine than for the digits 2-9.
11 It seems likely that the differences among digits are due simply to noise. Estimates computed using Arabic numeral-reading data from a study involving 42 subjects (Harley, 1990) revealed a difference of only 4 msec between the digits with the longest and shortest estimates. (Reading times for written verbal numerals were not collected in this study.)
Thus, the digit and word encoding time estimates together provide support, albeit tentative, for the encoding assumptions upon which our re-interpretation of Campbell and Clark's word-digit RT differences is founded.

Interaction of format with problem difficulty
Consider next the finding that the difference in RT between easy and difficult problems was greater for the word format than for the digit format. Campbell and Clark interpret problem difficulty as a variable related to arithmetic fact retrieval. However, the format by difficulty interaction clearly cannot be taken as evidence of a difference between formats in fact retrieval processes, because the difficulty variable is confounded with the encoding variables we have discussed. First, tie problems comprise 7 of the 29 easy problems (24%), but only 1 of the 35 difficult problems (3%). Second, mean word frequency summed across operands is much higher for the easy problems (1544) than for the difficult problems (614). (In contrast, the easy-difficult difference in digit frequency is substantially smaller: 1143 vs. 767.) Finally, mean length in the word format is slightly shorter for easy problems (10.9) than for difficult problems (11.6). For these reasons we expect the word-format encoding disadvantage to be substantially greater for difficult problems than for easy problems. Accordingly, we expect a larger easy-difficult RT difference for word problems than for digit problems due to encoding factors alone.
Error data

In addition to reaction time results Campbell and Clark report a number of interesting findings concerning errors. The error rate was substantially higher for word-format problems (13.4%) than for digit-format problems (8.7%); the difference in error rate between easy and difficult problems was greater for the word format (14.9%) than for the digit format (12.4%); and the size and fan variables accounted for significant variance in a regression analysis of word-digit error rate differences. Furthermore, analyses of specific errors revealed several word-digit differences.

Some of the error phenomena may conceivably be due in part to a higher rate of encoding errors for the word format than for the digit format, and to variation across problems in the word-digit difference in encoding accuracy. For various reasons, however, it is unlikely that the word-digit differences in error results are entirely due to encoding factors. Rather, we propose that the differences occurred because slower encoding of word than digit problems put greater speed pressure on the arithmetic fact retrieval process in the word format than in the digit format.
We suggest that subjects in speeded tasks strike a balance between speed and accuracy at least in part by setting an internal processing deadline. On each trial processing is allowed to run to completion as long as the deadline is not exceeded. If, however, the deadline is reached before processing is complete, a response is made on the basis of whatever information is available at that time. Cutting off processing prior to completion serves to avoid an unreasonably long response time, but obviously increases the likelihood of error.12

Because word- and digit-format trials were intermixed in the Campbell and Clark experiment, we assume that subjects applied a single processing deadline to problems in both formats. Coupling this assumption with the assumption of longer encoding times for word than for digit problems, we can interpret the observed error phenomena within the framework of our model. Specifically, we assume that the retrieval process, being constant across formats, needed for any given problem the same amount of time for completion on digit- and word-format trials. However, the word-format encoding disadvantage had the effect of putting greater speed pressure on the fact retrieval process in the word format than in the digit format. That is, because of the longer encoding times for the word format, total processing time exceeded the internal deadline more often for word problems than for digit problems.
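The logic of this account can be illustrated with a toy simulation (Python; our own illustration with arbitrary parameters rather than a fitted model -- for simplicity every trial that overruns the deadline is counted as an error, which overstates error rates but preserves their ordering):

```python
import random

random.seed(0)
DEADLINE = 1700.0                          # hypothetical shared deadline (msec)
ENCODE = {'digit': 300.0, 'word': 600.0}   # word encoding assumed ~300 msec slower

def overrun_rate(fmt, trials=100_000):
    """Proportion of trials on which encoding plus retrieval overruns the
    deadline, forcing a response from incomplete retrieval."""
    overruns = 0
    for _ in range(trials):
        retrieval = random.gauss(800.0, 300.0)   # hypothetical retrieval time
        if ENCODE[fmt] + retrieval > DEADLINE:
            overruns += 1
    return overruns / trials

for fmt in ('digit', 'word'):
    # Same retrieval distribution in both formats; only encoding differs.
    print(fmt, overrun_rate(fmt))
```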
Error rates

This speed-pressure hypothesis accounts straightforwardly for, and in fact is motivated by, the substantially higher error rate for the word format than for the digit format. Also readily explained is the interaction of format with problem difficulty. Difficult problems presumably require more time than easy problems for arithmetic fact retrieval. As a consequence, the former should be more likely than the latter to be pushed over the deadline by the increased encoding time in the word format.
12 The notion of a strict deadline is probably an oversimplification. For example, it is perhaps more plausible to assume that as processing time lengthens subjects gradually relax the criterion for responding (i.e., gradually decrease the amount or certainty of information required for a response). Like imposition of a strict deadline, gradual relaxation of a certainty criterion helps to avoid long response times, but at the cost of increasing the likelihood of error. With respect to the issues we will discuss, the deadline assumption leads to the same predictions as the assumption of gradual criterion relaxation. Hence, we couch our discussion in terms of the simpler deadline formulation.
Thus, we would expect a larger word-digit difference in error rate for the difficult problems than for the easy problems.13 This same interpretation applies to the size and fan effects in the regression analysis of word-digit error rate differences.
Error types

In several analyses of subjects' specific errors, Campbell and Clark found differences between word and digit formats. For example, although the distribution of errors across error types was extremely similar for the two formats, the proportion of errors classifiable as naming, miscellaneous and operand intrusion errors was higher for the word format than for the digit format. Campbell and Clark interpret these and other between-format differences in error pattern as strong evidence against our model. However, our model does not predict identical error patterns for conditions differing in the speed pressure placed upon fact retrieval processes; hence, word-digit differences in error pattern do not constitute evidence against the model.

Arithmetic fact retrieval may perhaps be thought of as involving a gradual accumulation of activation (e.g., Ashcraft, 1987; Campbell & Graham, 1985). Early in retrieval many different answers may become activated to some extent, and due to noise of various sorts the correct answer may be no more strongly activated than several incorrect answers. Presumably, however, as activation accumulates the set of candidate answers is progressively narrowed, until eventually (if the retrieval process runs to completion) a single answer emerges as the "winner," and is produced as the response. When retrieval is terminated prematurely, a response must be selected from the current set of candidate answers, none of which has yet emerged as a clear winner. This will often result in an error, and the nature of the errors will reflect the composition of the candidate-answer set at the time retrieval is terminated. Assuming that the range of candidate answers narrows as retrieval proceeds, the errors may be somewhat different when the retrieval process runs nearly to completion, than when retrieval is terminated earlier.
13 Also, as we argued earlier, the word-digit difference in encoding time was probably larger for difficult than for easy problems (see p. 521). For this reason too we would expect the increase from the digit to the word format in likelihood of exceeding the deadline, and hence in error rate, to be greater for difficult than for easy problems.
Consider now the word and digit format conditions in Campbell and Clark's experiment. We assume that in both conditions most errors occurred due to premature termination of the retrieval process when the processing deadline was exceeded. However, given the assumption of longer encoding times for the word problems, it follows that relative to the digit condition the premature terminations in the word condition occurred not only more often, but also earlier on average in the retrieval process. Thus, relative to the digit-format errors, the word-format errors may have been drawn from candidate answers active earlier on average in the retrieval process.

Without knowing exactly how sets of candidate answers change as retrieval progresses, we cannot make precise predictions about how the error pattern should vary with variation in speed pressure. Thus, for example, we cannot derive from the speed pressure hypothesis the prediction that relative frequency of naming and operand intrusion errors should be greater for the word format than for the digit format. Nevertheless, the point remains that given the possibility of word/digit differences in speed pressure, differences in error pattern do not clearly demonstrate differences between formats in the nature of arithmetic fact retrieval processes.

Operand distance. For one of the word/digit error pattern differences we can offer a slightly more specific interpretation in terms of the speed pressure hypothesis. This finding concerns table-related errors (i.e., errors in which the incorrect response is correct for a problem sharing an operand with the stimulus problem, such as 6 x 7 = 48). Table-related errors may be classified with respect to "operand distance." For example, the operand distance for 6 x 7 = 48 is 1, because 6 x 8 -- the problem for which the response is correct -- differs from the stimulus problem 6 x 7 by 1 on the non-shared operand. Campbell and Clark found that operand distance for table-related errors was, on average, greater for the word format than for the digit format. For instance, 44% of the word-format errors but only 32% of the digit-format errors had a distance of 2 or more.

The speed pressure hypothesis can interpret this finding in a straightforward fashion. It seems reasonable to suppose that as the set of candidate answers is gradually narrowed during retrieval, more distant answers (e.g., 27 for 3 x 6) are ruled out earlier than closer answers (e.g., 21). Given the assumption that word-format errors stem from earlier termination of retrieval than digit-format errors, we would expect table-related errors to have a greater operand distance in the word format than in the digit format.

Campbell and Clark tentatively interpret the operand distance results by assuming that "number-fact retrieval with word stimuli is less sensitive to numerical magnitude or proximity than with digit stimuli" (Campbell & Clark, this volume, p. 472). However, this assumption appears to be unmotivated -- why should retrieval be less sensitive to magnitude with word stimuli than with digit stimuli,
as opposed to more sensitive or equally sensitive? Our speed-pressure interpretation seems both more plausible and better motivated.

Miscellaneous errors. The higher relative frequency of miscellaneous errors for word than for digit problems may be interpreted similarly. A miscellaneous error (e.g., 7 x 3 = 26) is an incorrect response that is not an answer in the single-digit multiplication table (and does not fall into another error category). As in the case of distant operand errors, it seems reasonable to suppose that numbers not within the set of multiplication table answers would be ruled out early in retrieval, and hence would be less likely to occur as digit-format than word-format errors. Another possibility is that many of the miscellaneous errors reflect errors in response production (e.g., for 7 x 3 = 26, erroneously producing "six" in place of "one" after retrieving the correct answer). Assuming (as seems reasonable) that response production as well as fact retrieval processes were under more speed pressure for the word format than for the digit format, we would expect more production errors in the word condition.

Max- versus min-related errors. One other word-digit difference in error pattern requires comment. Campbell and Clark classified table-related errors according to whether the error was related to the minimum operand (e.g., 3 x 7 = 24, in which the answer is a multiple of the minimum operand 3), the maximum operand (e.g., 3 x 7 = 28), or both operands (4 x 8 = 24). The analysis revealed that the proportion of max-related errors was higher for the word format than for the digit format. Campbell and Clark take this result as yet another piece of evidence for word/digit differences in retrieval processes. They suggest that the phenomenon may reflect the (supposed) tendency of retrieval to be less sensitive to magnitude for the word format than for the digit format (on grounds that max-related errors tend to be farther from the correct response than min-related errors).

We suggest, however, that the difference between formats in proportion of max-related errors is simply an artifact of the higher relative frequency of operand intrusion errors in the word condition, and not a distinct phenomenon. An operand intrusion is an error that incorporates one of the problem's operands (e.g., 9 x 6 = 56, 7 x 9 = 72). Both max- and min-related errors can result from intrusion of an operand into the ones position of an answer (e.g., 4 x 6 = 36, 8 x 6 = 36). However, only max-related errors can result from an operand intrusion into the tens position (e.g., 3 x 8 = 32). Thus, overall we might expect that operand intrusion errors, relative to non-intrusion errors, would be biased toward max-related errors. Given that operand intrusions were relatively more frequent for word than for digit problems, such a bias could be responsible for the word/digit difference in proportion of max-related errors. In fact this interpretation is supported by a re-
analysis of the table-related errors. First, if we consider only the errors that are not operand intrusions, the proportion of max-related errors does not differ between the word format (.29) and the digit format (.25), χ²(1, N = 675) = 1.45, p > .10. Furthermore, for errors involving an operand intrusion into the ones position of the answer (which can lead to either a max- or min-related error), the proportion of max-related errors again does not differ between the word and digit formats (.30 vs. .28, respectively), χ²(1, N = 355) = 0.18, p > .10. The overall difference between formats in proportion of max-related errors is due entirely to intrusions into the tens position of the answer, which (like operand intrusions in general) occur more often for the word format (173 vs. 71 errors), and are uniformly max-related.

Operand intrusion errors
In presenting their error data, Campbell and Clark assert that our model is inconsistent not only with the various word/digit differences we have discussed, but also with the mere occurrence of operand intrusions. Their argument may be summarized as follows: Operand intrusions are not simply incorrect answers that happen by chance to include a quantity matching a problem operand. This conclusion follows from the finding that the position of the intruding operand matches its position in the answer significantly more than half of the time. (For example, 7 x 6 = 56 occurs more frequently than 6 x 7 = 56.) The McCloskey et al. model might attempt to interpret operand intrusions by assuming that semantic representations of problem operands interfere with processing of the semantic representation of the answer in the verbal numeral production process. However, a production interpretation proves inadequate because the vast majority of operand intrusions are table-related errors (i.e., multiples of a problem operand). This finding suggests that at least most of the operand intrusions have their genesis in arithmetic fact retrieval processes. Furthermore, the higher rate of operand intrusions for word than for digit problems is difficult to accommodate within the model.

We agree that operand intrusions occur more often than expected by chance, and that these errors often arise from arithmetic fact retrieval processes. However, it does not follow that the intrusion results are inconsistent with our model. We have already noted that the word/digit difference in rate of operand intrusions does not constitute clear evidence against our model. Further, various interpretations for the intrusion errors themselves can be suggested within the framework of our model. One possibility is that the errors reflect interference occurring not in response production, but in arithmetic fact retrieval. Following
the suggestion of Campbell and Clark (this volume, p. 474) we might assume that numeral comprehension processes not only encode the operands separately in developing an internal representation of the problem, but also attempt to treat the entire problem as a single numeral -- for example, generating a semantic representation of 76 (i.e., {7}10EXP1 {6}10EXP0) from 7 x 6 or seven x six. The numeral representation may then interfere with the arithmetic fact retrieval process, leading to activation of answer representations with which it shares one or both quantities (especially in matching positions). For example, the representation of 76 ({7}10EXP1 {6}10EXP0) may activate the stored answer representations for 72 ({7}10EXP1 {2}10EXP0) and 36 ({3}10EXP1 {6}10EXP0), and perhaps also -- but to a lesser extent -- the representations for 27 ({2}10EXP1 {7}10EXP0) and 63 ({6}10EXP1 {3}10EXP0). In this way the tendency of operands to intrude into matching answer positions may be explained.
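This position-matching account can be illustrated schematically (Python; a toy illustration of our own, not a simulation from the model):

```python
# Encoding 7 x 6 is assumed also to yield a representation of the
# two-digit numeral 76, which shares quantities -- especially in matching
# positions -- with some stored table answers and may boost their activation.
TABLE_ANSWERS = sorted({a * b for a in range(2, 10) for b in range(2, 10)})

def position_matches(numeral, answer):
    """Count digit positions where the interfering numeral matches the answer."""
    n, a = f'{numeral:02d}', f'{answer:02d}'
    return sum(nd == ad for nd, ad in zip(n, a))

interfering = 76   # from the problem 7 x 6
candidates = [ans for ans in TABLE_ANSWERS if position_matches(interfering, ans)]
print(candidates)  # [6, 16, 36, 56, 72] -- includes 72 and 36, as in the text
```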
The strong tendency for operand intrusions to be table-related errors may be explained by assuming that an incorrect answer receiving activation both from the problem encoding and from the interfering numeral encoding is especially likely to occur as an error. For example, when 7 x 6 is presented, the answer 56 may become activated both because it is a multiple of 7, and because it matches the interfering numeral 76 in the ones position. As a consequence, 56 may be quite likely to occur as an error. This interpretation applies not only to operand intrusions in normal subjects, but also to the intrusions noted by Campbell and Clark in the errors from our brain-damaged patients PS and GE. The facilitation observed in Campbell and Clark's experiment for a correct answer that includes one of the problem operands in a matching position (e.g., 6 x 4 = 24) may be interpreted in an analogous fashion.

The intrusion data appear entirely consistent with the assumption that the interference leading to operand intrusions occurs at a semantic level of representation. In particular, the intrusion effect is not limited to intrusion of an operand's name into the response. Rather, the intruding operand may occur in the response as a tens word or a teens word. For example, in 5 x 9 = 54 (which occurs more often than 9 x 5 = 54) the operand 5 intrudes not as the word "five" but rather as the word "fifty." Thus, the intrusions are apparently best characterized as operand quantity intrusions and not operand name intrusions.

The assumption that numeral comprehension processes attempt to encode the problem as a single numeral is clearly speculative, but in our view is not entirely implausible. It may also be noted that this assumption suggests another possible explanation for the higher rate of operand intrusions in the word condition than in the digit condition. In particular, the word problems may have appeared more numeral-like than the digit problems. For example, seven x six arguably resembles seventy-six more closely than 7 x 6 resembles 76. (Note that the multiplication operation was indicated by an x for both the digit and word formats.) Thus, the tendency to encode the problem as a single numeral may have been greater for the word format than for the digit format, and this may have contributed to the higher relative frequency of intrusions in the word condition.

Conclusions regarding the Campbell and Clark experiment
Our discussion obviously does not demonstrate that Campbell and Clark are incorrect in arguing for format-dependent arithmetic fact retrieval processes; nor was this our aim. We suggest simply that the alternative interpretations offered from the perspective of our model are as viable as the Campbell and Clark accounts. Hence, we conclude that contrary to the assertions of Campbell and Clark, their findings do not show "clearly that retrieval varies with the surface form in which a problem is presented" (Campbell & Clark, this volume, p. 459).

Results from other studies

In addition to their own experiment Campbell and Clark (this volume; see also Campbell & Clark, 1988; Clark & Campbell, 1991) cite a substantial number of other studies as evidence against our model. However, Campbell and Clark's interpretations of these studies do not withstand scrutiny. Although space does not permit an exhaustive critique, we illustrate this point by reviewing briefly the results on two of the topics Campbell and Clark discuss.

Numerical comparison

Campbell and Clark (this volume; see also Clark & Campbell, 1991) mention several studies showing that performance in numerical comparison (i.e., judging which of two numbers is larger) can vary with stimulus format. In most of these studies the relevant results concern the size incongruity effect. This is the finding that comparison RT increases when the relative physical sizes of the stimuli in a pair are incongruent with their relative numerical magnitudes (e.g., a physically large 2 paired with a physically small 7). In an early study Besner and Coltheart (1979) found a size incongruity effect for digits but not for number words. However, later studies have obtained incongruity effects with both words and digits (e.g., Besner, Davelaar, Alcott, & Parry, 1984; Foltz, Poltrock, & Potts, 1984). Nevertheless, some studies have found larger effects for certain formats, such as Arabic digits or Kanji characters, than for
others, such as an alphabetic script like English, or a syllabic script like Kana or Hindi (e.g., Takahashi & Green, 1983; Tzeng & Wang, 1983; Vaid, 1985). Campbell and Clark take these findings as evidence that the representations mediating numerical comparison vary with stimulus format.

This conclusion is unwarranted. First, size incongruity effects cannot meaningfully be compared across stimulus formats unless the difference between large and small stimuli in subjective size (i.e., size as it is perceived and coded internally) is equivalent for each format. However, it is unclear what objective size variable should be matched across formats to ensure that the subjective small-large difference is equated (see Holender & Peereman, 1987, pp. 54-55, for further discussion). For example, in a study involving digits and number words, if the large digits are twice as tall as the small digits, how should the size of large and small number words be chosen to ensure equivalent subjective differences between large and small stimuli in both conditions? Given the lack of a clear basis for determining whether size manipulations are equivalent across formats, comparisons of size incongruity effects across formats are highly questionable.

Even if we disregard this problem, differences among formats in the magnitude of size incongruity effects do not necessarily imply that the representations mediating numerical comparison vary with format. The extent to which physical size information interferes with the processing of numerical magnitude information may depend upon the relative speed with which the size and magnitude information are encoded and hence become available for internal processing. (For more detailed discussions see Holender & Peereman, 1987; Foltz et al., 1984.) We have already argued that time to encode magnitude information may vary across formats; and given the physical differences among formats such as digits and number words, time to encode stimulus size may also vary. Consequently, the extent of conflict between size and magnitude information -- and thus the size of the incongruity effect -- might be expected to vary across formats even if the numerical comparison process is invariant across formats.

In arguing that the representations underlying numerical comparison vary as a function of stimulus format, Campbell and Clark (this volume) highlight the results of a complex study by Vaid and Corina (1989). This study examined the size incongruity effect with left visual field (LVF) and right visual field (RVF) presentation of Arabic digits, English number words, and number signs from American Sign Language (ASL). Three groups of subjects were tested: deaf adults who use ASL as their primary language, hearing adults of deaf parents who acquired both ASL and English but for whom English is the dominant language, and hearing adults who acquired ASL during or after adolescence and use it regularly in their profession.
Campbell and Clark (this volume, pp. 460-461) summarize the Vaid and Corina (1989) results as follows: "Overall, there was greater interference [i.e., a larger incongruity effect] when number words and signs were initially processed in the RVF, whereas interference was greater for digits presented to the LVF. . . . Moreover, for number words and signs, interference was greater in the RVF for the more skilled language (i.e., English or ASL), but greater in the LVF for the less skilled language."14 From this summary Campbell and Clark (this volume, p. 460) argue that "processing of numerical magnitude appears to depend on format, apparently because left-hemisphere codes (e.g., possibly verbal representations) and right-hemisphere codes (e.g., visuo-spatial representations) may be differentially engaged in judgments about magnitude as a function of format." Following Vaid and Corina (1989) Campbell and Clark further suggest that laterality effects for a stimulus format may be affected by degree of skill with that format.

However, the Campbell and Clark interpretation is not at all straightforward. First, it is far from obvious how a hemifield difference in the magnitude of the size incongruity effect should be interpreted. For example, Campbell and Clark interpret the larger LVF than RVF incongruity effect for digit stimuli to mean that numerical comparisons for digits predominantly involve right-hemisphere representations. However, the logic underlying this interpretation is not entirely clear to us (from either Campbell and Clark's discussion or that of Vaid and Corina, 1989). It is easy to see that faster RT for LVF than RVF presentation could suggest preferential processing in the right hemisphere. However, it is not as easy to see how preferential right-hemisphere processing follows from a larger interference effect for the LVF than the RVF. This inference is especially questionable in light of the fact that Vaid and Corina did not find faster RT for LVF than RVF digit presentations. For all six conditions within the digit format the mean RT was slower for the left than for the right hemifield.
14 The Vaid and Corina results were even more complex than Campbell and Clark's summary suggests. For example, whereas both groups of hearing subjects showed a greater digit-format size incongruity effect for LVF than for RVF presentation, the deaf subjects showed no visual field difference in the magnitude of the effect. Also, for ASL stimuli presented to the RVF the hearing subjects with deaf parents showed a large (98-msec) reversal of the size incongruity effect (i.e., faster RT for incongruent than congruent stimuli). Finally, the magnitude of the incongruity effect varied across formats and subject groups in ways that are difficult to understand. For example, for both LVF and RVF presentations the incongruity effect in the digit format was roughly twice as large for the hearing subjects who use ASL professionally as for the hearing subjects with deaf parents, even though both groups were about equally fast overall in responding on digit-format trials.
Also, neither Campbell and Clark (this volume) nor Vaid and Corina (1989) have any specific interpretation to offer for the variation in hemifield effects across subject groups. The vague statement that "degree of language skill affects lateralization of magnitude judgments" (Campbell & Clark, this volume, p. 461) is at best a description of the results. Why, for example, did deaf subjects show a larger LVF than RVF incongruity effect for English number words? Is the claim that these subjects process number words preferentially in the right hemisphere? Thus, the pattern of results obtained by Vaid and Corina (1989) is quite complex, and neither Campbell and Clark nor Vaid and Corina have a clear, well-motivated interpretation.

Furthermore, the Vaid and Corina results are not inconsistent with our model. One might argue that the variation across formats in hemifield differences in the size incongruity effect implies differences among formats in numerical comparison processes. As we mentioned earlier, however, the magnitude of the incongruity effect may depend upon the relative speed with which the relative size and relative magnitude information are encoded. Obviously, varying hemifield and format could affect the time required to encode size and/or magnitude, and thus could influence the magnitude of the incongruity effect in ways that are difficult to anticipate, but in no way inconsistent with our model.

More generally, there is reason to exercise caution when interpreting laterality effects involving numbers in various formats. For one thing, studies involving lateralized presentation of numerical stimuli have produced inconsistent results (for reviews, see Holender & Peereman, 1987; Troup, Bradshaw, & Nettleton, 1983). For example, some studies have obtained an LVF advantage for digit stimuli in numerical comparison tasks (e.g., Katz, 1980, 1981; Troup, 1982, discussed in Troup et al., 1983), whereas others have found an RVF advantage for digits using a variety of tasks, including numerical comparison (e.g., Besner, Daniels, & Slade, 1982; Besner, Grimsell, & Davis, 1979; Besner, Snow, & Davelaar, 1986; Hatta, 1983; Peereman & Holender, 1985). Also, it appears that minor variations in methodology can have a significant impact on laterality effects. For instance, Katz (1980) argues that the sharp contrast between his magnitude comparison results and those of Besner et al. (1979) may reflect differences between the studies in stimulus sizes and presentation rates. In fact, Besner et al. (1986) conclude that attempts to compare Arabic and alphabetic stimuli in laterality studies are futile because it is nearly impossible to control for variables such as lateral masking and distance from fixation which may confound any obtained visual field differences.

In our view the factors that could have influenced performance in the Vaid and Corina (1989) study are so many and varied that no clear inferences can be drawn from the results. In particular, it seems obvious that the data do not provide an
adequate basis for concluding that the representations or processes underlying relative magnitude judgments vary as a function of stimulus format. In sum, the numerical comparison results cited by Campbell and Clark do not warrant the claim that "these various findings confirm that modality-specific number codes play a central role in magnitude comparison" (Clark & Campbell, 1991, p. 210).

Odd-even judgments

Campbell and Clark (this volume, p. 461; see also Clark & Campbell, 1991) briefly cite three studies as evidence that "knowledge and use of the odd-even relationship is also mediated by multiple forms of mental codes."

Hines (1990). Hines examined odd-even decisions for the numbers two through nine. Stimuli were digits in Hines' Experiment 2 and written words in Experiment 5. With word stimuli RT was 20 msec faster for even numbers than for odd numbers, and this difference was reliable. However, for digit stimuli RT was only 5 msec faster for even than odd numbers; this difference was not reliable. The 15 msec difference in the magnitude of the "even-number superiority effect" forms the basis for Campbell and Clark's (this volume) claim that the Hines study implies a difference between words and digits in the representations mediating odd-even judgments.

However, in addition to offering no demonstration that the 15 msec difference between experiments was reliable, Campbell and Clark entirely ignore Hines' (1990) error rate data. In the word-format experiment Hines found no significant difference in error rate between even and odd numbers. However, in the digit experiment the error rate was reliably lower for even than for odd numbers. In other words, an even-number superiority effect occurred in the RT data but not the error data in the word experiment, and in the error data but not the RT data in the digit experiment. Thus, the difference in RT (and error) results between experiments may reflect nothing more than a difference in the extent to which subjects stressed speed as opposed to accuracy (i.e., the greater the emphasis on speed the greater the extent to which effects will be expressed in error rate as opposed to RT). In any event the error data obviously undermine cross-experiment comparisons of RT results.

Even if the even-number superiority effect were in fact larger for words than for digits, this difference could reflect word-digit differences in encoding time. Examination of the encoding time estimates we reported in an earlier section (see Tables 4 and 5) reveals that for digits the mean estimate for even numbers (524 msec) does not differ from that for odd numbers (525 msec). However, for
words the mean for even numbers (639 msec) is 14 msec faster than the mean for odd numbers (653 msec). If, as these results suggest, there is an encoding time advantage for even numbers in the word but not the digit format, we might well expect for this reason alone a larger even-number superiority effect in the word format.

Klein and McInnes (1988). Klein and McInnes found a significant interaction between format and visual field in RT for odd-even judgments. Number words showed a 12 msec RVF advantage, whereas digits showed an LVF advantage. Clark and Campbell (1991, p. 211) interpret this finding as evidence that "odd-even judgments are based on multiple, format specific codes that may be differently lateralized." However, an obvious alternative interpretation is that the format by visual field interaction reflects format-specific stimulus encoding processes, and not the processes involved in making odd-even judgments. Also, as Besner and Bryden (1988) have pointed out, neither the RVF advantage for words nor the LVF advantage for digits was reliable in the Klein and McInnes (1988) study. Thus, the evidence for lateralization of any process(es) is weak at best. Finally, as we noted in discussing numerical comparison, comparisons of laterality effects across stimulus formats are fraught with complexities. Thus, Klein and McInnes (1988) note that physical differences between stimulus formats cannot be ruled out as the cause of their format by visual field interaction.

Shepherd and Gale (1982). In this study EEGs were recorded as subjects listened to sequences of four numbers and responded if (a) all numbers were odd, and (b) the sum of the numbers was greater than twenty. EEG measures for both hemispheres varied systematically with the number of odd stimuli in a sequence, and this effect was greater for the left hemisphere. Clark and Campbell (1991) interpret these results as evidence of a left-hemisphere advantage for odd-even judgments for verbal numeral stimuli. However, the task was complex, and the laterality effect may have been localized in stimulus encoding or calculation processes. Even if the laterality effect could be localized to the odd-even judgment process, the effect would in no way demonstrate that representations tied to the verbal numeral format were implicated in this process; rather, the RVF advantage would simply suggest that whatever representations are involved, these representations are processed preferentially in the left hemisphere. Also, it is important to note that Shepherd and Gale (1982) did not manipulate stimulus format, and hence their study provides no evidence for processing differences among formats.

We conclude that the odd-even data, like the numerical comparison results, fail to warrant the claim that stimulus format affects processing beyond an initial encoding stage.
Concluding remarks

Debates between proponents of alternative theoretical frameworks can often be extremely productive, forcing clarification of claims, highlighting central unresolved issues, and stimulating empirical research aimed at discriminating between the contrasting positions. As the present and preceding chapters illustrate (see also Campbell & Clark, 1988; Clark & Campbell, 1991; McCloskey, in press; Sokol et al., 1989), the ongoing interchange between Campbell and Clark and ourselves has already proved fruitful in these respects. We expect the productive contention to continue as we elaborate, test, and (no doubt) modify our model. We hope that Campbell and Clark will similarly be spurred to formulate explicit, testable proposals within their encoding-complex framework.

ACKNOWLEDGEMENTS

Preparation of this chapter was supported by NIH grants NS21047 and MH18215 to the Johns Hopkins University, and a grant from the James S. McDonnell Foundation to the Neurolinguistics Laboratory at the MGH Institute of Health Professions. Address correspondence to Michael McCloskey, Department of Cognitive Science, Johns Hopkins University, Baltimore, MD 21218.

REFERENCES

Ashcraft, M.H. (1987). Children's knowledge of simple arithmetic: A developmental model and simulation. In C.J. Brainerd, R. Kail, & J. Bisanz (Eds.), Formal methods in developmental research (pp. 302-338). New York: Springer-Verlag.

Besner, D. & Bryden, P. (1988). Rolling the dice: A comment on Klein & McInnes "Visual field differences in the processing of numerical stimuli." Brain and Cognition, 7, 381-387.

Besner, D. & Coltheart, M. (1979). Ideographic and alphabetic processing in skilled reading of English. Neuropsychologia, 17, 467-472.

Besner, D., Daniels, S. & Slade, C. (1982). Ideogram reading and right hemisphere language. British Journal of Psychology, 75, 21-28.

Besner, D., Davelaar, E., Alcott, D. & Parry, P. (1984). Wholistic reading of alphabetic print: Evidence from the FDM and the FBI. In L. Henderson
(Ed.), Orthographies and reading. Hillsdale, NJ: Lawrence Erlbaum Associates.

Besner, D., Grimsell, D. & Davis, R. (1979). The mind's eye and the comparative judgement of number. Neuropsychologia, 17, 373-380.

Besner, D., Snow, D. & Davelaar, E. (1986). Logographic reading: Is the right hemisphere special? Canadian Journal of Psychology, 40, 45-53.

Campbell, J.I.D. (1990). Error priming in cognitive arithmetic: Effects of number format. Paper presented at the meetings of the Psychonomic Society, New Orleans, Louisiana.

Campbell, J.I.D. & Clark, J.M. (1988). An encoding complex view of cognitive number processing: Comment on McCloskey, Sokol, and Goodman (1986). Journal of Experimental Psychology: General, 117, 204-214.

Campbell, J.I.D. & Graham, D.J. (1985). Mental multiplication skill: Structure, process, and acquisition. Canadian Journal of Psychology, 39, 338-366.

Carroll, J.B., Davies, P. & Richman, B. (1971). The American Heritage word frequency book. New York: Houghton Mifflin.

Clark, J.M. & Campbell, J.I.D. (1991). Integrated versus modular theories of number skills and acalculia. Brain and Cognition, 17, 204-239.

Cohen, L. & Dehaene, S. (1991). Neglect dyslexia for numbers? A case report. Cognitive Neuropsychology, 8, 39-58.

Dehaene, S. & Mehler, J. (in press). Cross-linguistic regularities in the frequency of number words. Cognition.

Deloche, G. & Seron, X. (1987). Numerical transcoding: A general production model. In G. Deloche & X. Seron (Eds.), Mathematical disabilities: A cognitive neuropsychological perspective (pp. 137-170). Hillsdale, NJ: Erlbaum.

Ellis, A.W., & Young, A.W. (1988). Human cognitive neuropsychology. Hillsdale, NJ: Erlbaum.

Foltz, G.S., Poltrock, S.E. & Potts, G.R. (1984). Mental comparison of size and magnitude: Size congruity effects. Journal of Experimental Psychology: Learning, Memory & Cognition, 10, 442-453.

Harley, W.S. (1990). Associative memory in mental arithmetic. Unpublished doctoral dissertation, Johns Hopkins University.

Hatta, T. (1983). Visual field differences in semantic comparative judgments with digits and Kanji stimulus materials. Neuropsychologia, 21, 669-678.

Hines, T.M. (1990). An odd effect: Lengthened reaction times for judgments about odd digits. Memory and Cognition, 18, 40-46.
Holender, D., & Peereman, R. (1987). Differential processing of phonographic and logographic single-digit numbers by the two hemispheres. In G. Deloche & X. Seron (Eds.), Mathematical disabilities: A cognitive neuropsychological perspective (pp. 43-85). Hillsdale, NJ: Erlbaum.

Katz, A.N. (1980). Cognitive arithmetic: Evidence for right hemispheric mediation in an elementary component stage. Quarterly Journal of Experimental Psychology, 32, 69-84.

Katz, A.N. (1981). Spatial compatibility effects with hemifield presentation in a unimanual two-finger task. Canadian Journal of Psychology, 35, 63-68.

Klein, R. & McInnes, J. (1988). Visual field differences in the processing of numerical stimuli. Brain and Cognition, 7, 247-256.

McCloskey, M. (in press). Cognitive mechanisms in numerical processing: Evidence from acquired dyscalculia. Cognition.

McCloskey, M., Aliminosa, D. & Macaruso, P. (1991). Theory-based assessment of acquired dyscalculia. Brain and Cognition, 17, 285-308.

McCloskey, M., Aliminosa, D. & Sokol, S.M. (1991). Facts, rules, and procedures in normal calculation: Evidence from multiple single-patient studies of impaired arithmetic fact retrieval. Brain and Cognition, 17, 154-203.

McCloskey, M. & Caramazza, A. (1987). Cognitive mechanisms in normal and impaired number processing. In G. Deloche & X. Seron (Eds.), Mathematical disabilities: A cognitive neuropsychological perspective (pp. 201-219). Hillsdale, NJ: Erlbaum.

McCloskey, M., Caramazza, A. & Basili, A.G. (1985). Cognitive mechanisms in number processing and calculation: Evidence from dyscalculia. Brain and Cognition, 4, 171-196.

McCloskey, M., Sokol, S.M. & Goodman, R.A. (1986). Cognitive processes in verbal-number production: Inferences from the performance of brain-damaged subjects. Journal of Experimental Psychology: General, 115, 307-330.

McCloskey, M., Sokol, S., Goodman-Schulman, R. & Caramazza, A. (1990). Cognitive representations and processes in number production: Evidence from cases of dyscalculia. In A. Caramazza (Ed.), Cognitive neuropsychology and neurolinguistics: Advances in models of cognitive function and impairment (pp. 1-32). Hillsdale, NJ: Erlbaum.

Monsell, S. (1991). The nature and locus of word frequency effects in reading. In D. Besner & G.W. Humphreys (Eds.), Basic processes in reading: Visual word recognition. Hillsdale, NJ: Erlbaum.
Noël, M.-P. & Seron, X. (in press). Notational constraints and number processing: Reappraisal of Gonzalez and Kolers' (1982) study. Quarterly Journal of Experimental Psychology.

Peereman, R. & Holender, D. (1985). Visual field differences for a number-non-number classification of alphabetic and ideographic stimuli. Quarterly Journal of Experimental Psychology, 36A, 197-216.

Shepherd, R., & Gale, A. (1982). EEG correlates of hemisphere differences during a rapid calculation task. British Journal of Psychology, 73, 73-84.

Siegler, R.S. (1988). Strategy choice procedures and the development of multiplication skill. Journal of Experimental Psychology: General, 117, 258-275.

Siegler, R.S. & Shrager, J. (1984). A model of strategy choice. In C. Sophian (Ed.), Origins of cognitive skills (pp. 229-293). Hillsdale, NJ: Erlbaum.

Sokol, S.M., Goodman-Schulman, R. & McCloskey, M. (1989). In defense of a modular architecture for the number-processing system: A reply to Campbell and Clark (1988). Journal of Experimental Psychology: General, 118, 105-110.

Sokol, S. & McCloskey, M. (1988). Levels of representation in verbal number production. Applied Psycholinguistics, 9, 267-281.

Sokol, S.M., McCloskey, M., Cohen, N.J. & Aliminosa, D. (1991). Cognitive representations and processes in arithmetic: Inferences from the performance of brain-damaged patients. Journal of Experimental Psychology: Learning, Memory & Cognition, 17, 355-376.

Takahashi, A. & Green, D. (1983). Numerical judgments with kanji and kana. Neuropsychologia, 21, 259-263.

Troup, G.A. (1982). Cerebral asymmetry in the processing of numerical information. Unpublished honours thesis, Monash University.

Troup, G.A., Bradshaw, J.L. & Nettleton, N.C. (1983). The lateralization of arithmetic and number processing: A review. International Journal of Neuroscience, 19, 231-242.

Tzeng, O.J.L. & Wang, W.S.-Y. (1983). The first two Rs. American Scientist, 71, 238-243.

Vaid, J. (1985). Numerical size comparisons in a phonologically transparent script. Perception and Psychophysics, 37, 592-595.

Vaid, J. & Corina, D. (1989). Visual field asymmetries in numerical size comparisons of digits, words, and signs. Brain and Language, 36, 117-126.
Chapter 14

IN DEFENSE OF THE ENCODING-COMPLEX APPROACH: REPLY TO McCLOSKEY, MACARUSO, & WHETSTONE
Jamie I. D. Campbell
University of Saskatchewan
Summary
In the preceding chapter, McCloskey, Macaruso, and Whetstone (henceforth MM&W) present a number of theoretical and empirical challenges to the encoding-complex view of number processing advanced by Campbell and Clark (this volume; henceforth C&C). In this reply I examine their explanations for the effects of number format on number-fact retrieval found by C&C. I argue that the main alternative accounts offered by MM&W -- the "internal deadline" hypothesis and the "encoding efficiency" hypothesis -- are not supported by the data and do not provide convincing alternatives to C&C's proposal that retrieval processes differ as a function of format. I also argue that aspects of the explanations offered by MM&W compromise the basic abstract-modular theory and, in fact, undermine MM&W's claim that the modular view is likely to be more productive than the encoding-complex approach. I propose further that the abstract-code theory of number meaning assumed within the modular framework is counterproductive, because it takes for granted complex, fundamental aspects of cognitive number processing. In contrast, it is a primary goal of the encoding-complex approach to provide explanatory mechanisms for these basic elements of numerical cognition.

Introduction

A pervasive theme in the preceding comment by MM&W is that there are clear and important advantages that follow from starting with relatively simple assumptions about the cognitive architecture underlying numerical skills. One presumed advantage that MM&W emphasize is that the simple, modular architecture they propose provides straightforward predictions and interpretations with respect to experimental data, thus making their model testable and falsifiable.
MM&W also argue that the encoding-complex approach is too underspecified to support the generation of testable hypotheses, and is therefore unfalsifiable. These considerations, they claim, indicate that the abstract-modular approach is likely to be more productive scientifically. In this reply I argue that the differences between the two approaches with respect to predictiveness and testability are not as clear-cut as MM&W claim.

It is true that C&C's analyses of format effects in multiplication were exploratory, and MM&W correctly point out that the encoding-complex view did not predict a specific pattern of format effects (p. 510). This is not an embarrassment for the encoding-complex position, however. We have stated clearly (Clark & Campbell, 1991) that we are still early in the process of developing a precise encoding-complex theory around our basic assumptions, and exploratory analyses are an essential part of that process. Furthermore, it is important to recognize that the abstract-modular theory similarly did not predict any specific format-related phenomena. This is not surprising, despite MM&W's claims of predictiveness and testability for the abstract-modular model. Although the model is specific in its assumptions about the skeletal structure of the proposed number-processing systems, very little has been specified about how the proposed comprehension, calculation, and production systems can interact in the context of different tasks. Indeed, in order to accommodate the various effects of number format demonstrated by C&C, MM&W introduced several new possible extensions and elaborations to the model. It is, of course, a natural part of scientific theory development to modify assumptions to accommodate new findings. But the question I raise here is, given the sorts of ad hoc modifications and possibilities introduced by MM&W, is the abstract-modular theory really more testable, constrained, or predictive than the encoding-complex view? Although MM&W characterize their modifications as "specific limited modifications" (p. 508), I will argue that the implications are more far-reaching than this. To begin, it will be worthwhile to briefly review how the abstract-modular and encoding-complex views differ with respect to predictions about format effects on number-fact retrieval.

Number format and number-fact retrieval

In the abstract-modular theory, numbers are represented by format- and modality-independent abstract codes, which are characterized by a digit in brackets, followed by an exponential term specifying the appropriate order of magnitude (e.g., 67 is (6)10EXP1 (7)10EXP0). Number facts are stored in abstract form in a "calculation system" and are activated by abstract encodings of problems that are input from a separate "comprehension system." In contrast,
according to the encoding-complex view (e.g., Clark & Campbell, 1991), number-fact retrieval is based on modality-specific representations, such as visual codes for digits, and visual and phonological codes for number words. Given these different assumptions, a natural approach to testing the abstract- versus specific-code views is to examine effects of number format (e.g., digits vs. words) on number-fact retrieval.

The abstract-code view implies that the calculation process for a given task (e.g., simple multiplication or addition) is the same regardless of the format in which problems are presented. Therefore, the abstract-code view seems to predict that there should be no substantial effect of format on performance, and, indeed, Sokol, McCloskey, Cohen and Aliminosa (1991) concluded that their failure to find format effects in the multiplication errors of an acalculic patient is consistent with the abstract-code hypothesis.

The contrasting view, that memory for number facts is modality specific, implies that retrieval processes can vary as a function of format. For example, it seems very likely that simple arithmetic problems are encountered more frequently in digit format (e.g., 2 + 6 = 8 and 4 x 6 = 24) than in written number-word format (two + six = eight and four x six = twenty-four). As a consequence, a digit problem would be more likely to activate a visual representation than a number-word problem. Retrieval via number words must depend more on auditory-phonological representations of problems. Retrieval via digits, therefore, should be easier because it is mediated both by well-established visual and phonological "routes," whereas retrieval via number-word format will not provide a strong, direct visual basis for retrieval.

Although the encoding-complex view leads to the general prediction that digit-based retrieval should normally be easier than word-based retrieval, more precise predictions at this early stage of theory development are not possible. Predictions are complicated by the possibility of indirect paths of activation due to strong associations between visual codes and the corresponding phonological codes. For example, retrieval via written number words presumably can include activation of visual codes for digit-based representations, because visual codes for number words potentially activate digit codes via a common phonological association. Furthermore, phonological representations of problems likely will be strongly activated whether problems are presented as digits or as number words. These considerations imply that there will be substantial overlap in the retrieval structures activated by problems presented in digit format or in number-word format. Consequently, effects of format are likely to be quite subtle, and there are bound to be many parallels in retrieval given number-word or digit formats. Nonetheless, in the C&C experiment, careful analysis uncovered several systematic effects of format.
Comments on MM&W's explanations for various effects of format
Although MM&W state that the experiment reported by C&C "did not reveal gross differences in performance between the word and digit problems" (p. 510), there were, in fact, very substantial differences in accuracy and speed for simple multiplication problems as a function of format. The mean rate of errors was 51% higher for word-format problems than for the digit format (12.5% vs. 8.3% of trials), and mean time for a correct response was 36% slower with words than digits (1107 ms vs. 816 ms).¹ C&C proposed that these and other phenomena suggest that retrieval processes can differ as a function of format. In contrast, MM&W argue that these overall differences could be due, directly or indirectly, to longer encoding times for the word stimuli relative to digits.
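For concreteness, these percentages follow directly from the reported cell means (a worked check of the arithmetic, using only the values quoted above):

\[
\frac{12.5 - 8.3}{8.3} \approx 0.51, \qquad \frac{1107 - 816}{816} \approx 0.36 .
\]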
The "internal deadline" hypothesis MM&W propose that longer encoding times could have contributed to the high error rate for word problems in the following way: Subjects may have been motivated to generate an answer prior to some "internal processing deadline" (p. 522). This deadline corresponds to the maximum amount of time a subject permits to elapse before generating a response. Total multiplication time includes encoding and calculation (i.e., retrieval) stages, and both stages run to completion unless the deadline is exceeded. Because of longer encoding times for words, subjects were more likely to encounter the deadline for word-format problems than digit-format problems. As a consequence, word problems were afforded less retrieval-stage processing on average relative to digit problems, making retrieval errors more likely for the word stimuli. Although the internal deadline theory is plausible, it is almost certainly not the correct explanation for the overall higher error rate for word-format multiplication in C&C's experiment. Pervasive use of a deadline in the manner proposed by MM&W would impose an upper bound or ceiling on word-format RTs. This effect would appear as a clustering of RTs for the more difficult word-format problems at the cutoff defined by the average deadline. The absence of such a ceiling for word-format RTs would indicate that subjects' responses generally were not constrained by a deadline as suggested by MM&W. Figure 1 presents the mean RTs from C&C's study for each problem in each format, with word-format
¹ Campbell (1990) also observed that word-based addition and multiplication were both about 300 ms slower compared to digit format.
Figure 1 presents the mean RTs from C&C's study for each problem in each format, with word-format RT on the y-axis plotted against digit-format RT for the corresponding problem on the x-axis. The deadline hypothesis predicts a rightward bend at the top of the function, corresponding to the ceiling imposed on word-format RTs by the putative deadline. As the Figure shows, there was no evidence that word-format RTs were constrained by a deadline; instead, the relation between word and digit RTs across the 64 problems was linear (r = .913) across the entire range of word-format RTs.
Figure 1. Mean RT for M x N multiplication problems presented in word or digit format (Campbell & Clark, this volume).
Despite the apparent linearity, a quadratic component in the relation between digit and word RTs might signal the curvilinear trend predicted by the deadline hypothesis. To test for evidence of a quadratic trend in Figure 1, a multiple regression on the 64 word-problem RTs was performed with the corresponding digit RTs and their squares used as predictors. The squared-RT variable provides a test for the quadratic component. Once the linear relation with digit RTs was factored out, however, the quadratic component did not enter the equation (α to enter = .05; partial r = -.22). When only word-format problems with RTs over 1000 ms were included in the analysis, the partial correlation for the quadratic component was -.16 (p > .25). This affirms that there was no curvilinear trend associated with the upper end of the function. Thus, far from there being clear visible evidence of a ceiling on word-format RTs, there was no statistical evidence of a curvilinear trend in Figure 1 as would be expected given the internal deadline idea. The data appear to directly disconfirm the deadline hypothesis as a plausible explanation for the 50% higher rate of errors in the word-format condition.
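The logic of this hierarchical test can be made explicit with a short sketch (Python; the variable names are illustrative and the data arrays are assumed, not C&C's actual code or data). The linear relation is fit first, and the question is whether the squared term accounts for any residual variance:

import numpy as np

def quadratic_trend_partial_r(digit_rt, word_rt):
    # Fit the linear model first: word RT ~ intercept + digit RT.
    lin = np.column_stack([np.ones_like(digit_rt), digit_rt])
    resid_word = word_rt - lin @ np.linalg.lstsq(lin, word_rt, rcond=None)[0]
    # Residualize the squared predictor on the same linear model.
    sq = digit_rt ** 2
    resid_sq = sq - lin @ np.linalg.lstsq(lin, sq, rcond=None)[0]
    # Partial r for the quadratic component: the correlation of the two
    # sets of residuals. A reliable negative value would reflect the
    # flattening (ceiling) of word-format RTs that a deadline predicts.
    return np.corrcoef(resid_word, resid_sq)[0, 1]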
The unfamiliarity of word-based multiplication

Apart from there being no evidence to support an RT ceiling for word problems, MM&W's deadline proposal encounters another major difficulty. The deadline proposal implies that differences in encoding times for the word and digit problems must have been substantially in excess of 300 ms (i.e., because the deadline proposal assumes that time in the retrieval stage was less for words than for digits, the overall 300 ms longer RTs for words must underestimate the differences in encoding time). A difference of this magnitude is difficult to reconcile, however, with evidence that naming times for individual number words and digits in the range from two through nine (i.e., the numbers used as multipliers in the C&C experiment) differ by as little as 11 ms on average (Campbell, 1990). The naming data suggest that words and digits do not differ greatly with respect to the amount of time required for encoding.

MM&W argue (p. 512), however, that the 300 ms differences can be explained in terms of differences in encoding times. They propose that such factors as stimulus width and frequency (discussed below), and the "unfamiliar task" (p. 512) of word multiplication, would have contributed to longer encoding times for words. The proposal that encoding times were long because of low familiarity with word-based multiplication seems to be at odds with the basic abstract-modular viewpoint, which places great emphasis on the functional independence of the comprehension and calculation processes (e.g., MM&W, pp. 496-497). Functional independence implies that the process of converting stimuli into abstract codes (i.e., the comprehension process) should be the same irrespective of the subsequent processing that will take place. Consequently, practice and experience at "comprehending" number words in any task that requires the putative abstract codes (e.g., reading numbers; cf. McCloskey, Sokol & Goodman, 1986) should transfer to the multiplication task. As people do have extensive experience with number words in non-calculation contexts, the claim that low familiarity would contribute to longer encoding times for word-based multiplication, per se, seems to contradict the assumption of functional independence.

Indeed, later in their comment, MM&W use number-word and digit frequencies that were estimated from book and magazine counts to predict differences in
multiplication RTs as a function of format (pp. 515-521). For the purposes of this analysis, which was to discount a retrieval-based explanation of the differences in favor of an encoding-based account, MM&W appear to assume that general experience with encoding number words should transfer directly to the multiplication task. Transfer seems to be predicted when it yields results potentially consistent with the abstract-modular position, but is not predicted when it would yield results that potentially contradict the abstract-modular view. MM&W's claim that the abstract-modular model affords simplicity, predictiveness, and testability ultimately is based on the assumption that there is a generic comprehension process that is common to most number-processing tasks; however, the actual criteria for determining when the model predicts generic comprehension processes, and when they could be format or task specific, are elusive.

There is also a more general point to be made about MM&W's suggestion that lack of familiarity with word-based multiplication accounts for the longer word-format RTs: This suggestion amounts to asserting that encoding-format x calculation-task interactions are to be expected within the abstract-modular framework. I will argue that by admitting such interactions, MM&W substantially weaken their claim that the assumption of functionally independent modules provides powerful predictive and interpretive constraints.

Frequency, encoding efficiency, and problem difficulty

C&C reported multiple regression analyses of speed and accuracy across problems in the digit and word formats (pp. 466-468). These analyses showed that the physical width of word-format problems was positively related to word-format RTs, but not to digit-format RTs, indicating that time to read or encode the word stimuli contributed to the pattern of differences between word and digit RTs. Once these effects were partialled out, however, variables theoretically related to retrieval difficulty (problem size and "fan") accounted for significant residual variance in the pattern of format-related differences in RTs and errors. In general, the word-format deficit tended to increase with problem difficulty, and C&C suggested that this finding was consistent with the conclusion that retrieval processes differed between the word and digit formats.

As an alternative account, MM&W proposed that the apparent interaction of format with problem difficulty might be due to differences in encoding difficulty between digits and number words (pp. 515-521). MM&W report data (Tables 2 and 3, pp. 515-516) indicating that number-word and digit frequencies are negatively correlated with numerical value over the range from two to nine, and that the slope of the relation is steeper for number words than digits. Thus, if it
is assumed that encoding efficiency will be directly related to frequency, then it follows that frequency could account for the interaction between format and problem size that appeared in C&C's multiple regression analyses. MM&W report new regression analyses showing that when frequency differences are taken into account, the problem size and fan variables no longer account for unique variability. There is good reason, however, to doubt the frequency-based account.

The plausibility of the frequency-based explanation proposed by MM&W is weakened, if not contradicted, by the results of the number encoding-time experiment they report (see Tables 4 and 5 in MM&W, pp. 519-520). The results showed that the estimated encoding times for digits (Table 5) were unrelated to frequency differences for digits (Table 3, p. 516), despite substantial variability in digit frequencies. If anything, digit frequency and digit encoding-time estimates tended to be negatively related (r = -.20, p. 520). Although number-word encoding times were positively correlated with number-word frequencies, estimates of digit encoding times were unaccountably not predictable from frequency differences. The data indicate, therefore, that frequency counts are not a good basis for estimating number-encoding times. Thus, the statistically significant prediction of RT differences by the frequency variable in MM&W's multiple regression analyses might have nothing to do with encoding processes. Instead, for example, number frequency may simply be correlated with problem frequency and, hence, with retrieval difficulty.
Effects of format on specific features of errors

The analyses performed by C&C also uncovered a number of format effects on the characteristics of specific errors. The most important format-related effects on specific errors concerned 1) the tendency for word-format errors to be more distant from the correct answer, and 2) the tendency for operand-intrusion errors to occur more frequently with the word format.

Error distance

C&C found that errors that were two or more operand-units distant from the correct answer (e.g., 4 x 6 = 32) were about 40% more common for words than for digits. C&C also noted that, relative to digit problems, a higher percentage of word errors were multiples of the larger operand than the smaller operand. C&C emphasized that the specific causes of these effects were uncertain (p. 472), but suggested that both effects might reflect weaker involvement of magnitude factors for word relative to digit multiplication. This interpretation is contrary to the
abstract-modular view because it implies that the involvement of magnitude representations can be format specific, whereas magnitude representation is associated with the abstract codes in the abstract-modular theory, and should presumably be a constant factor across different formats.

In their comment, MM&W correctly point out that the higher rate of max-related errors for words is actually an artifact of the higher rate of intrusion errors for words (see MM&W, p. 525) and probably has nothing to do with magnitude per se. Nonetheless, format effects on intrusion errors present a major challenge to the abstract-modular view (see below). Furthermore, whereas the higher incidence of intrusions can account for differences in the frequency of max-related errors, it does not account for the operand-distance effect. MM&W propose that the tendency for table-related errors to be more distant with word stimuli is more plausibly interpreted in terms of the processing-deadline hypothesis discussed above, rather than in terms of differential involvement of magnitude factors for digits versus words. They argue that, although the word format yielded much longer RTs overall, less time was spent on retrieval processes relative to digit problems. Consequently, processing in the retrieval system was terminated earlier on average for word problems than digit problems. In this case, digit errors would tend to be less remote than word errors, MM&W suggest, if it is assumed that the "set of candidates is gradually narrowed during retrieval" (p. 524). A similar explanation (p. 525) is offered for the higher rate of miscellaneous errors (i.e., errors involving non-products) in the word format.

Although this explanation is not implausible, it takes for granted the veracity of the deadline hypothesis, which, as demonstrated previously, is not supported by the data: Contrary to the deadline hypothesis, there was no evidence of a ceiling for word-format RTs. Furthermore, there is other experimental evidence consistent with C&C's proposal (see Clark & Campbell, 1991, for a more extensive discussion). For example, Foltz, Poltrock, and Potts (1984) found that format (words vs. digits) interacted with numerical distance in a magnitude judgement task; specifically, there was a strong magnitude-congruity effect with digit stimuli but not with number-word stimuli. These results suggest that magnitude factors were not as important in numerical processing of number words as digits. Similarly, the effects of format on the distance of multiplication errors may be plausibly interpreted as evidence that number words and digits can differ in the extent to which they involve the processing of magnitude information. Thus, given the apparent inadequacy of MM&W's internal-deadline explanation, the finding that retrieval-error distances vary as a function of format remains a substantial challenge to the view that number-fact retrieval is based only on format-independent, abstract codes.
Operand-priming as evidence of encoding-format x retrieval-process interactions

The operand-priming effects reported by C&C provide some of the most challenging phenomena for the abstract-modular theory, and therefore it is worth reviewing the phenomena and C&C's characterization of them in some detail. Many of the multiplication errors observed by C&C appeared to involve "intrusions" of the problem's operands (e.g., 8 x 4 = twenty four). Cases where the entire response corresponded to one or both of the operands were called naming errors (e.g., 2 x 9 = nine; 2 x 7 = twenty seven). Many other cases were simple intrusions, however, in which only one operand appeared in the error (e.g., 9 x 6 = thirty six; 6 x 9 = sixty three). Format had a very strong effect on intrusions: Naming errors were about 4.5 times more common with words than with digits (90 vs. 16 errors), and simple intrusions similarly were much more common in the word format than the digit format (597 vs. 326 errors), accounting on average for 51% and 38% of word and digit errors, respectively. Thus, the data showed that operand intrusions were a substantial factor in performance, and that the influence of the processes producing intrusions was much greater for word stimuli.

C&C argued that operand intrusions involve number-fact retrieval processes. If intrusions arose by priming of post-retrieval lexical codes, for example, intrusions ought to be associated frequently with miscellaneous answers (e.g., 7 x 4 = 34). Instead, about 80% of operand intrusions co-occurred with arithmetically associated products (7 x 4 = 24), suggesting that operand intrusions resulted from priming or activation of number-fact representations. Consistent with this interpretation, C&C also found that when the position of an operand matched the corresponding number in the correct answer (e.g., 6 x 4 = 24 vs. 4 x 6 = 24, 6 x 8 = 48 vs. 8 x 6 = 48, etc.), there were significantly fewer errors and correct RTs were faster relative to when there was not a positional match. These findings paralleled the observation that intrusion errors preserve the position of the matching operand more often than expected by chance (e.g., 8 x 4 = 24 tends to be a more common error than 4 x 8 = 24).

C&C proposed that intrusion errors and related effects arise because of direct priming of number-fact representations that contain features or components that match the operands. More specifically, they proposed that the effects result from an interaction of fact-retrieval processes and the number-reading processes engaged when the problem is encoded. Number-reading processes could activate number-word codes that match corresponding verbal-code representations of multiplication problems and answers. The effect of this activation would be facilitative when the correct problem-answer representation is primed, but produce interference when related, but irrelevant, representations are primed (cf. Campbell
& Oliphant, this volume). With respect to format effects on intrusions, C&C proposed that the higher rate of intrusions for the word format would occur if number words evoked stronger activation of verbal codes relative to digit stimuli, perhaps because general reading experience makes the reading response more automatic with number words than digits. Another possibility is that greater experience calculating with digits than with words allows individuals to develop digit-specific inhibitory strategies that reduce interference from irrelevant number-reading processes (see Clark, this volume, for an extensive discussion of inhibitory mechanisms in mental arithmetic).

C&C's emphasis on number-reading mechanisms is supported by the observation that intrusions tend to preserve the position of the intruding operand (e.g., 6 x 9 = sixty three; 9 x 6 = thirty six). This finding suggests that the pair of operands is encoded as if it were a two-digit number, with the left operand encoded as a tens word (sixty) and the right operand as a units word (six). Although other explanations may be possible, the most obvious explanation for why the pair of operands in a horizontally oriented multiplication problem (e.g., 3 x 6) would be encoded as a single numeral is that there is a strong tendency to read the stimulus as if it were a pair of numbers without a multiplication sign (i.e., 3 x 6 activates the verbal response thirty six, among others).
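The reading mechanism just described can be made concrete with a small sketch (Python; the word tables and function name are illustrative assumptions, not C&C's implementation). The point is simply that reading the operand pair as a single numeral makes a tens word available for the left operand and a units word for the right operand, exactly the codes that surface in position-preserving intrusions:

# Illustrative mapping from operands to the verbal codes activated when a
# horizontally presented pair is read as a single two-digit numeral.
UNITS = {2: "two", 3: "three", 4: "four", 5: "five",
         6: "six", 7: "seven", 8: "eight", 9: "nine"}
TENS = {2: "twenty", 3: "thirty", 4: "forty", 5: "fifty",
        6: "sixty", 7: "seventy", 8: "eighty", 9: "ninety"}

def verbal_codes_from_reading(left, right):
    # Reading "6 x 9" as the numeral 69 activates the tens word for the
    # left operand and the units word for the right operand.
    return {TENS[left], UNITS[right]}

# 6 x 9 activates {"sixty", "nine"}, consistent with the intrusion
# 6 x 9 = sixty three; 9 x 6 activates {"ninety", "six"}, consistent
# with 9 x 6 = thirty six (the units word "six" is preserved).
print(verbal_codes_from_reading(6, 9))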
One implication of this account is that number-fact retrieval and number-reading processes cannot truly be said to be functionally independent; rather, this view of operand-priming effects implies that number-reading and number-fact retrieval are integrated processes that are activated simultaneously and compete for common representational structures (e.g., verbal-code representations). Indeed, successful multiplication may normally require inhibition of the relatively automated number-reading response. Based on such considerations, C&C suggested (p. 474) that operand-priming effects provide evidence of format-specific interactions between number-reading and fact-retrieval mechanisms that at least complicate, and perhaps contradict, the simple modular view espoused by MM&W. According to the latter, number reading implicates the comprehension and production modules (e.g., MM&W, p. 499), but should have no direct involvement with processes in the calculation module (cf. MM&W, p. 497).

MM&W's alternative account of operand-priming phenomena

Surprisingly, MM&W's account of operand-priming effects very closely resembles C&C's account. They apparently accept the evidence that operand-intrusion errors "often arise from arithmetic fact retrieval processes" (p. 526), and that the same factors "may have contributed to the higher relative frequency of
intrusions in the word condition" (p. 528). Having acknowledged these points, however, MM&W argue that making such allowances does not necessarily contradict the abstract-modular theory.

To account for the various operand-priming phenomena, MM&W argue that the basic theory outlined by C&C can be construed to be consistent with the abstract-modular model. Specifically, MM&W propose (pp. 526-528) that the comprehension system not only encodes the operands individually, but also treats the entire problem as a single numeral. For example, 7 x 6 would give rise to abstract codes representing 7, 6, and 76 (e.g., (7)10EXP1 (6)10EXP0). This produces intrusions because "(t)he numeral representation may then interfere with the arithmetic fact retrieval process, leading to activation of answer representations with which it shares one or both quantities" (p. 527). For example, in the case of 7 x 6, abstract representations of answers in the calculation system that contain either (7)10EXP1 or (6)10EXP0 may be directly activated and promoted as intrusion errors. The tendency for intrusions to preserve position (e.g., 7 x 6 = 56) occurs because the exponents in MM&W's abstract codes correspond directly to position (i.e., 10EXP1 maps onto the tens position and 10EXP0 onto the units position). The higher rate of intrusions in the word format, MM&W suggest, might be due to a stronger tendency for the problem to be treated as a single numeral with word stimuli, because, "for example, seven x six arguably resembles seventy-six more closely than 7 x 6 resembles 76" (pp. 527-528).
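For comparison with the reading-based sketch above, MM&W's abstract-code version of the mechanism can be sketched in the same style (the encoding function and names are hypothetical illustrations of the notation, not MM&W's model):

# A numeral as an abstract quantity code: a set of (quantity, power-of-ten)
# pairs, so 76 -> {(7, 1), (6, 0)}, mirroring MM&W's (7)10EXP1 (6)10EXP0.
def abstract_code(n):
    digits = [int(d) for d in str(n)]
    return {(d, len(digits) - 1 - i) for i, d in enumerate(digits)}

# On MM&W's proposal, 7 x 6 also yields a code for the pair read as the
# single numeral 76; answer representations sharing a (quantity, exponent)
# pair with that code may be primed, producing position-preserving
# intrusions such as 7 x 6 = 56.
numeral_code = abstract_code(76)            # {(7, 1), (6, 0)}
print(numeral_code & abstract_code(56))     # {(6, 0)}: shared units quantity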
In Defense of the Encoding-Complex Approach
551
One important difference between MM&W's and C&C's explanations of operand-priming effects is that MM&W's account assumes that intrusions reflect the activation of abstract codes. In contrast, C&C's account proposes that intrusions arise from reading processes that activate verbal codes for number words (e.g., 7 x 6 activates phonological codes for seven, six, and seventy six). MM&W claim, however, that their abstract-code account is superior because "the intruding response may occur as a tens word or a teens word," and, therefore, "intrusions are apparently best characterized as operand quantity intrusions and not operand name intrusions" (p. 527). This claim, however, is based on an oversimplification of C&C's proposal. According to C&C's account, operand position is preserved in intrusions because position influences whether a tens or a units word is activated by number-reading processes. In other words, the intruding number names may be tens or units words depending on the position of the corresponding operand.

A second difference between the explanations is that MM&W attribute the tendency for the pair of operands to be encoded as a single numeral to generic "comprehension processes" (p. 527), rather than to "reading processes," per se. Nonetheless, it seems reasonable to propose that it is long experience with reading horizontally presented pairs of digits and number words that tends to promote encoding of the left operand as a tens item and the right operand as a units item. In the absence of the reading hypothesis there seems to be no basis within the abstract-modular theory to predict or explain why the pair of operands would be encoded as a single numeral. As pointed out previously, however, if it is allowed that intrusions are retrieval phenomena, and also allowed that intrusion phenomena arise because of format-specific interference from number-reading processes, then the modularity assumption that calculation mechanisms are functionally separated from basic number-reading processes (i.e., in the presumed comprehension and production systems) is weakened or contradicted. The following section examines this issue more closely.
Format-specific calculation effects and the merits of modular architecture

MM&W's proposed explanation of the operand-priming phenomena unambiguously entails format-specific interactions of comprehension and calculation processes. That is, MM&W allow that operand intrusions are a number-fact retrieval phenomenon (i.e., reflect priming of number-fact representations), and also that format differentially influences this retrieval phenomenon (i.e., some features of the priming effect are more probable with word stimuli; pp. 527-528). Therefore, within the abstract-modular theory, basic calculation phenomena can differ as a function of number format. Allowing such interactions, however, sharply reduces the ostensible scientific value of assuming a simple architecture with functionally independent comprehension and calculation modules.

The boundaries between MM&W's proposed modules are, in effect, defined by the abstract codes through which they communicate (pp. 496-497), and the abstract codes, in turn, are defined as being independent of surface form (p. 497). Allowing format-specific effects of comprehension processes that directly affect calculation processes obscures, both empirically and theoretically, the boundaries between the modules. The boundaries are obscured empirically because the assumption of abstract codes no longer places constraints on the sorts of effects that can be transmitted between modules. Allowing format-specific interactions with calculation mechanisms (and presumably, therefore, with the proposed production mechanisms as well; cf. p. 508) renders the model capable of accommodating practically any imaginable pattern of format-specific phenomena. Furthermore, allowing format-specific calculation effects comes close to violating the theoretical principle of functionally independent modules: If format-specific encoding processes in the proposed comprehension system (whose presumed function is to abstract over
formats) can have unique, task-specific consequences on retrieval processes in the calculation system (cf. MM&W's explanation for operand-priming effects, pp. 526-528), it is not clear what is gained by calling comprehension and calculation "functionally independent modules." Practically speaking, format-specific influences of comprehension processes on calculation processes mean that their functions are not independent. Instead, as the encoding-complex view espouses, such interactions strongly suggest a continuity of encoding and retrieval processes in which features of retrieval processes vary systematically with features of the encoding process.

The questionable necessity of the assumption of abstract-quantity codes
Given the evidence of format-specific retrieval effects and other phenomena demonstrating modality-specific representations, C&C raised the question of whether the assumption of abstract codes is necessary (p. 481). Among other points, C&C pointed out that similarity of performance across different number formats does not provide direct evidence for the existence of abstract codes. MM&W recast this observation to make the point that "even a finding of no differences between formats in retrieval processes would apparently not be an embarrassment to the encoding complex position" (footnote, p. 510). Although MM&W think that this constitutes an indictment of the encoding-complex view, it is a simple, logical fact that similarity does not demonstrate either the existence of abstract codes or the absence of modality-specific codes. Similarity across formats could mean that performance is based on a common (but not necessarily abstract) code, or that the different formats activate different internal codes that are processed similarly because of common functional or experiential factors. Thus, similarity of performance across number formats, in and of itself, does not necessitate the assumption of abstract codes.

MM&W also express puzzlement in their commentary (p. 497) over the view that it might not be necessary to posit any internal representations that are not based on modality-specific codes. The puzzlement, apparently, is over how C&C could not see the self-evident need to hypothesize abstract codes. MM&W state that they assume that "the semantic representations [of numbers] abstract away from... surface details to represent quantity or magnitude" and that it is "uncontroversial that any theory of numerical processing will need to posit some form of internal quantity or magnitude representation" (p. 497). Based on this pair of statements, MM&W conclude that "the answer is clearly 'yes' to [C&C's] question, 'Is the hypothesis of abstract number codes necessary?'" (p. 497).
This confident conclusion, however, seems to be based on a confusion of the issue of abstraction and the issue of magnitude representation. It certainly is necessary to posit internal processes that represent magnitude, but it is decidedly unclear that it is necessary to assume that they are abstracted processes. Under the encoding-complex view it is assumed that the representation of magnitude can take a variety of forms. For example, a visuo-spatial representation of distance or position (e.g., a number line) is one possible form of magnitude representation; but associative connections between successive number words and digits also can support relative magnitude judgements. Under this view, magnitude is not a unitary psychological construct, as assumed or implied by MM&W's abstract-code view; rather, the representation of magnitude corresponds to a set of specific learned relations and processes (e.g., labelling of perceptual groups or intensities, uses of counting and other basic arithmetic relations to represent changes in quantity). Furthermore, under the encoding-complex view, these magnitude skills generally are based on modality-specific codes (e.g., verbal or visuo-spatial representations). Viewed from this perspective, MM&W's conclusion that the hypothesis of abstract codes is necessary because of the need to posit a representation for magnitude seems to be only a dogmatic assertion of the abstract-code hypothesis. Magnitude representation does not require or imply the assumption of abstract (i.e., modality-independent) codes.
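To illustrate that the non-abstract alternatives mentioned above are coherent, the following toy sketch (Python; the sequence and names are illustrative assumptions) shows how an ordered counting string alone, with no abstract quantity code, can support a relative magnitude judgement:

# Relative magnitude from position in a well-learned counting string:
# order in the verbal sequence does the representational work.
COUNTING_STRING = ["one", "two", "three", "four", "five",
                   "six", "seven", "eight", "nine", "ten"]

def is_smaller(word_a, word_b):
    # "five" < "six" because "five" occurs earlier in the counting string.
    return COUNTING_STRING.index(word_a) < COUNTING_STRING.index(word_b)

print(is_smaller("five", "six"))   # True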
The dubious explanatory power of the abstract-quantity codes

In the abstract-modular theory, comprehension of a number is practically equated with activation of the appropriate abstract-quantity code. This special code is assumed to represent the "basic quantities in a number, and power of ten associated with each" (MM&W, p. 495). The abstract codes are represented using a notation in which quantity is specified by a digit in brackets, followed by an exponential term specifying the appropriate order of magnitude. For example, the abstract-code notation for 50 is (5)10EXP1. The information contained in the proposed abstract codes potentially contributes to many basic number-processing skills. For example, the information specified potentially accounts for the ability to encode or produce syntactically and semantically correct strings of numbers. The abstract codes also provide a basis for judgements of absolute and relative magnitude, as well as processing of relations associated with order of magnitude. Sokol, Goodman-Schulman and McCloskey (1989) hinted at the possibility that the abstract code might also specify the odd-even status of a number (p. 108), providing the explanatory basis for a whole variety of other basic numerical tasks and judgements.
However, a fundamental problem with the abstract-code hypothesis is that it is these "built in" assumptions about the properties of abstract codes that are doing much of the important explanatory work. MM&W state that "the internal base-10 representational scheme is built up from more fundamental concepts of quantity, through experience with base-10 number systems" (p. 497), but the nature of these more fundamental concepts is left unspecified. Furthermore, such statements leave completely unexplained how the abstract codes represent quantity, and therefore provide no concrete explanation of how they provide the capacity to decide, for example, that 5 is less than 6. In other words, the abstract-code hypothesis does not explain how people comprehend this relation; rather, it simply takes for granted that the capacity to process this relation is entailed in the abstract specification. The abstract-code theory of number representation is, in effect, only a reification of the abstract-code notation. In this sense the abstract-code theory seems to evade, rather than explain, how quantity and order of magnitude are represented and processed.

Number concepts and the encoding-complex hypothesis

Under the encoding-complex view, the "semantic representation of quantity" and the "comprehension of numbers" refer to a variety of specific number skills, rather than to a unitary, symbolic code.² Indeed, the representation of quantity and order of magnitude are complex components of number processing that need to be explained, rather than taken for granted. By focusing on how people use specific types of mental codes to process and represent numerical relations, the encoding-complex approach seeks to provide genuine explanation for these elementary features of numerical skill. For example, positing a visuo-spatial medium for magnitude (e.g., an imagistic line or area) entails positing representational structure that directly mediates the processing or understanding of magnitude (e.g., relative position on the imaginary line). Similarly, the series of verbal associations underlying well-learned counting strings represent magnitude explicitly in terms of the temporal order of elements in the verbal string. Individual number words and digits gain meaning in terms of their location within counting series, and by referring to specific perceptual representations of quantity or frequency.
² MM&W now allow that "the model does not contend that all numerical processing is mediated by abstract semantic representations. For example, internal spatial representations may well be implicated in solving geometry problems" (p. 498). It is still unclear, however, what principle allows spatial representations to be functional in geometry, but excludes them from more basic types of number skills.
Processing of base-ten structure may be realized in terms of direct associative mapping between position in a visual string of digits and related associative structures, such as counting by tens and ones, and other concrete representations of order of magnitude (cf. Fuson, Fraivillig & Burghardt, this volume). Approaching the difficult theoretical problems of quantity representation and knowledge of base-ten structure in such terms potentially provides concrete explanations of how these basic numerical skills are cognitively realized. In contrast, simply asserting that there is an abstract-quantity code that provides these abilities does not appear to explain these abilities in any substantial way.
Conclusions

Whereas the diagram in Figure 1 of MM&W's chapter does present a simple picture, the structures and processes outlined in that diagram obviously do not constitute the detailed models of specific tasks required to explain the complex and subtle phenomena that occur in number processing. When faced with these sorts of detailed experimental phenomena, the abstract-modular model requires the additional specification of a variety of ad hoc mechanisms and factors. At this level, the abstract-modular model appears to be as open to the criticism of underspecification as the encoding-complex approach, and also to be extremely flexible in its capacity to accommodate unpredicted phenomena. Although the abstract-modular theory provides a common-sense taxonomy of numerical skills organized in a plausible arrangement, this global level of architectural detail seems to provide few genuine constraints on the interpretation of data. Indeed, given the weak form of functional independence implied by MM&W's various elaborations of the model, it appears that the general assumptions of abstract codes and modularity are not open to disconfirmation. Based on these considerations, I conclude, contrary to MM&W's claims, that the abstract-modular approach is not inherently superior to the encoding-complex approach on the grounds of testability and predictiveness.

At present, the abstract-modular approach, like the encoding-complex approach, is perhaps best thought of as a metatheoretical framework, providing specific directions and guidelines for experimental research and for the development of precise theoretical models of components of numerical cognition. Whereas both approaches point to worthwhile theoretical and empirical goals, it remains true that they differ fundamentally with respect to basic assumptions, and the two approaches cannot both be correct with respect to these assumptions: Cognitive number processing is either primarily representationally abstract and modular or it is primarily specific and integrated. It seems very unlikely, however, that a
single, clear test of these alternative positions is possible, but this does not mean that there will never be a satisfactory resolution of these issues. As the two positions are developed and specified further, and faced with more empirical constraints, the theoretical and practical benefits of one approach versus the other will gradually become clear. Ultimately, theoretical coherence, plausibility, and utility in accounting for the accumulated weight of empirical evidence will settle the question.

ACKNOWLEDGEMENTS

I express my thanks to Paul Meagher, James Sawchyn, and especially to Valerie Thompson for very useful feedback on a previous draft of this chapter. This research was supported by Natural Sciences and Engineering Research Council of Canada grant OPG0001980 to Jamie Campbell.

REFERENCES

Campbell, J.I.D. (1990). Error priming in cognitive arithmetic: Effects of number format. Poster presented at the meetings of the Psychonomic Society, New Orleans.

Campbell, J.I.D., & Clark, J.M. (1988). An encoding complex view of cognitive number processing: Comment on McCloskey, Sokol, and Goodman (1986). Journal of Experimental Psychology: General, 117, 204-214.

Clark, J.M., & Campbell, J.I.D. (1991). Integrated versus modular theories of number skills and acalculia. Brain and Cognition, 17, 204-239.

Foltz, G.S., Poltrock, S.E., & Potts, G.R. (1984). Mental comparisons of size and magnitude: Size congruity effects. Journal of Experimental Psychology: Learning, Memory, and Cognition, 10, 442-453.

McCloskey, M., Sokol, S.M., & Goodman, R.A. (1986). Cognitive processes in verbal-number production: Inferences from the performance of brain-damaged subjects. Journal of Experimental Psychology: General, 115, 307-330.

Sokol, S.M., Goodman-Schulman, R., & McCloskey, M. (1989). In defense of a modular architecture for the number processing system: Reply to Campbell & Clark. Journal of Experimental Psychology: General, 118, 105-110.

Sokol, S.M., McCloskey, M., Cohen, N.J., & Aliminosa, D. (1991). Cognitive representations and processes in arithmetic: Inferences from the performance of brain-damaged patients. Journal of Experimental Psychology: Learning, Memory, and Cognition, 17, 355-376.
Author Index

Aaronson, D. 161, 162, 172
Abernathy, S. 224
Acioly, N.M. 120
Aiken, L.R. 336
Akin, O. 181, 259, 261, 281, 282, 404
Alcott, D. 528
Aliminosa, D. 365-369, 376, 379, 396, 397, 399-401, 429, 458, 493, 498, 499, 503, 504, 541
Allen, P.A. 303
Allport, D. 282
Anderson, J.R. 127, 129, 160, 177, 196, 203, 204, 369, 379, 383, 426, 442, 466, 486
Anderson, S.H. 446
Antell, S.E. 11
Aoki, T. 259, 262, 278, 295
Ashcraft, M.H. 113, 198, 203-208, 213, 215, 223, 224, 226, 227, 246, 301, 303, 304, 307, 310, 314, 315, 317, 318, 321, 323, 325, 326, 333-335, 359, 360, 365-367, 369, 374, 379, 386, 403, 483, 523
Atkinson, J. 259, 260, 262, 263, 276
Avoli, M. 443, 445
Baddeley, A.D. 237, 264, 294, 302, 306, 460, 483
Baillargeon, R. 11
Baltes, P.B. 193
Banks, W.P. 334
Baroody, A.J. 115, 127, 128, 203, 215, 216, 221, 226, 227, 333, 379
Bartholini, G. 443
Bartolotta, R. 206, 379
Bartsch, K. 102
Basili, A.G. 336, 367, 457, 458, 493
Bassok, M. 160, 176
Battaglia, J. 198, 204, 205, 208, 303, 310, 365, 367, 369
Beck, J. 266
Beckwith, M. 294
Beem, A.L. 334, 335
Bell, M.S. 42, 106
Benenson, J. 11
Besner, D. 528, 531, 533
Beth, E. 6
Binet, A. 21
Binks, M.G. 19, 460
Bisanz, J. 113, 118, 120, 126, 128-131, 316, 437
Bjork, R.A. 342
Bjorklund, D.F. 128, 413, 446
Blaxton, T.A. 337
Bloom, B.S. 150
Bobrow, D.G. 140
Boches, C.A. 324
Boden, M.A. 485
Boies, S.J. 218
Bonnefil, V. 129, 223
Bons, T. 442
Bossi, L. 443
Bourque, T.A. 446
Bradshaw, J.L. 531
Brainerd, C.J. 4, 7, 10, 120, 129
Braisby, N. 11
Bransford, J.D. 123, 124
Brehaut, J.C. 446
Briand, K. 437
Briars, D.J. 11, 44, 48, 98-101, 121, 127, 140
Broder, L.J. 150
Brown, A.L. 123, 178-180, 184
Brown, V. 292
Brownell, W.A. 150
Brunswik, E. 195, 197
Bryan, W.L. 21
Bryden, P. 533
Budayr, B. 439
Bullock, M. 125
Burghardt, B.H. 39, 82, 92, 100, 101, 105, 107, 555
Burkell, J. 273, 281
Burnham, W.M. 443
Burns, J. 42
Burr, D.J. 405
Byrnes, J.P. 98, 99, 103, 106, 107, 126
Campbell, F. 259, 260, 262
Campbell, D.T. 200
Campbell, J.I.D. 15, 16, 116, 131, 185, 207, 213, 215-217, 246, 250, 301, 303, 304, 316, 317, 319-326, 331, 333, 335-340, 342, 346, 347, 350, 352, 355-357, 359, 361, 365-367, 369, 370, 374, 378, 379, 386, 391, 392, 394, 396, 411-415, 425-430, 432, 434-437, 439, 441, 442, 445, 446, 450, 457, 459-461, 464, 466, 472-474, 479-481, 483, 484, 487, 488, 493, 494, 502, 503, 506, 523, 528, 532-534, 541-544, 547, 548
Campione, J.C. 123
Caramazza, A. 336, 367, 377, 457, 458, 493, 507
Carey, S. 125
Carpenter, P.A. 146
Carpenter, T.P. 114, 127
Carraher, D.W. 122
Carraher, T.N. 122
Carroll, J.B. 515, 516
Carroll, J.D. 16
Case, R. 127-129
Catrambone, R. 156
Cattell, R.B. 241, 243
Cave, K. 274, 295
Chang, C. 44
Chang, J.J. 16, 28
Chapman, J.C. 436
Charness, N. 446, 459
Chase, W. 160, 259, 261, 281, 282
Chen, Z. 180, 181
Chen, Q. 129, 223
Chi, M. 161, 259, 281, 283, 286
Christy, K. 325
Clapp, F.L. 334
Clark, J.M. 15, 16, 113, 304, 331, 337, 339, 342, 346, 356, 357, 360, 370, 378, 411, 413-416, 419, 426, 429-431, 435-437, 441, 442, 445, 446, 449, 450, 457, 459-461, 483, 487, 488, 493, 494, 502, 503, 506, 528, 532-534, 539, 541, 543, 547, 549
Cobb, P. 107
Cohen, E.G. 47
Cohen, L. 339, 480, 508
Cohen, N.J. 35, 365, 371-373, 380, 403, 429, 458, 493, 541
Coltheart, M. 528
Conners, C.K. 444
Constantino, J. 11
Cooney, J.B. 222, 224, 303
Cooper, R.G. 11
Copeland, R.W. 3
Corina, D. 460, 461, 529-531
Cormier, P. 197, 201, 206, 312, 484
Cornet, J. 200, 202, 222, 333
Cottrell, G.W. 375
Crowley, K. 129
Cummins, D. 141, 144
Cunningham, J.P. 18, 334, 437
Daehler, M.W. 180, 181
Dalrymple-Alford, E.C. 439
Daniels, S. 531
Dattel, A.R. 179
Davelaar, E. 528, 531
Davidson, B. 267, 271, 292
Davies, P. 515
Davis, R. 531
De Corte, E. 140, 146
deBettencourt, L.U. 115
Dehaene, S. 325, 334, 480, 508, 515
Deloche, G. 200, 333, 483, 508
Dempster, A. 175
Dempster, F.N. 237
den Heyer, K. 437
Depoortere, H. 443
deWinstanley, P. 223
Dienes, Z.P. 43
Digdon, N. 442
Dineen, J.T. 129
Dixon, W.J. 308, 309
Donley, R.D. 301, 303
Dunn, J.C. 484
Dunn, M. 120
Dupoux, E. 334
Dustman, R.E. 444, 446
Dyer, M.G. 405
Ebbesen, E.B. 196
Echols, C.H. 180
Egeth, H. 274, 278, 295
Eichelman, W.H. 218
Elcock, T. 271, 280, 295
Elkind, D. 34
Ellis, A.W. 497
Emmerson, R.Y. 446
Eriksen, C.W. 199, 267, 291, 292
Estes, W.K. 192
Ettinger, M. 175
Feltovich, P.J. 161
Ferrara, R.A. 123
Fierman, B.A. 206, 246, 303, 379
Fiske, D.W. 200
Flavell, J.H. 6, 127
Folk, C. 278
Foltz, G.S. 334, 460, 528, 529, 547
Ford, W.W. 315
Fraivillig, J.L. 39, 82, 92, 97, 100, 555
Francis, M. 259, 260, 262
Franks, J.J. 124
Franzel, S. 274, 295
Frederiksen, J.R. 150
Frege, G. 4-7
French, R.M. 374
Frick, R. 262, 263, 294
Frye, D. 11
Fujii, M. 334
Fuson, K.C. 3, 11, 28, 39, 41-44, 48, 50, 82, 92, 95, 97-102, 106, 115, 127, 129-131, 150, 485, 555
Gale, A. 461, 533
Gallistel, C.R. 3, 11, 126, 334, 335
Garbart, H. 274, 295
Garner, W.R. 199, 266, 281, 282
Gawryszewski, L. 292
Geary, D.C. 197, 198, 200, 201, 206, 208, 213, 232, 236-238, 244, 247, 303, 310, 312, 314, 333, 352, 483, 484
Gelade, G. 199, 257, 265-267, 273, 275, 276, 280, 281, 288, 291, 294
Gelman, R. 3, 11, 16, 17, 21, 29, 118, 121, 126, 334, 335
Gentner, D. 180, 181
Geszti, T. 379
Gholson, B. 179, 180
Gibbs, B. 275
Gick, M.L. 123, 160, 177
Ginsburg, H. 42, 115, 127, 128, 203
Glaser, R. 161, 184
Glass, A.L. 214
Gonzalez, E.G. 26, 378, 462
Goodman, R.A. 344, 376, 442, 493, 494, 507, 544
Goodman-Schulman, R. 15, 337, 376, 458, 494, 507, 553
Gorman, R.P. 405
Goswami, U. 178-180
Gottlieb, G. 126
Goyette, C.H. 444
Graf, P. 337
Graham, D.J. 304, 322, 323, 326, 331, 333, 335-337, 340, 350, 352, 355, 361, 365-367, 369, 371, 373, 374, 379, 386, 391, 392, 394, 396, 412, 415, 425-427, 429, 432, 464, 473, 481, 523
Green, C.E. 139, 145
Green, D. 529
Greeno, J.G. 116, 117, 119-121, 123, 124, 127, 130, 138, 139, 141, 150
Grimsell, D. 531
Groen, G.J. 201-203, 207, 229, 263, 269, 279, 283, 303, 304, 333, 365, 367, 428, 447
Grossberg, S. 405
Grouws, D. 103, 114
Hake, H.W. 199
Hall, J.W. 128, 237
Hall, R. 150
Hall, V.C. 129
Hallford, E.W. 213, 226, 483
Hamann, M.S. 206, 207, 303, 325, 331-335, 365, 483
Hamilton, W. 262
Hanson, S.J. 405
Harley, W.S. 333, 365, 391, 520
Harnishfeger, K.K. 128, 413, 446
Hart, K.M. 105
Harter, N. 21
Hartman, E. 379, 380, 383
Hasher, L. 413, 446
Hatano, G. 19, 123, 460
Hatta, T. 531
Hattrup, R.A. 114
Hayes, J.R. 142, 144, 460
Healy, A.F. 460, 483
Hegarty, M. 137, 139, 145
Heller, J.I. 139
Herrmann, D.J. 161, 162
Hertz, J. 379
Hetherington, P.A. 373
Hiebert, J. 104
Hines, T.M. 461, 532
Hinsley, D. 142, 144
Hinton, G.E. 379
Hitch, G.J. 302, 305, 306, 314, 460
Hoffman, J. 267
Holender, D. 529, 531
Holyoak, K.J. 123, 156-161, 165, 166, 171, 172, 175-177, 182, 214
Horn, J.L. 241, 243
Howe, M.L. 129
Huberman, B.A. 380
Hudson, T. 139
Humphreys, M.S. 237
Hunt, E. 218
Hurford, J.R. 28
Ichihara, S. 259, 262, 278
Ippel, M.J. 334
Jenkins, E. 114, 303
Jensen, A.R. 444, 446
Jensen, E. 259
Johnson, C.J. 446
Johnson, D.W. 47
Johnson, E.J. 224
Jolicoeur, P. 268, 275, 285, 286
Jones, R.S. 369
Jonides, J. 267, 293
Julesz, B. 266, 272, 278
Just, M.A. 146
Kadane, J. 138, 141
Kagan, J. 11, 444, 446
Kahneman, D. 275, 281, 288
Kail, R. 113, 126-129, 227, 238, 239, 241-243, 249
Kane, M.J. 180, 184
Kanwisher, N. 264, 275
Kashiwagi, A. 462
Kashiwagi, T. 462
Katz, A.N. 531
Kaufman, E. 259
Kaye, D.B. 129, 223, 227, 315
Kayra-Stuart, F. 334
Keating, D. 11, 35, 205, 333, 352, 365, 428
Kennedy, P.T. 160, 176
Kibler, D. 150
Kikuchi, T. 259, 262, 278
Kilpatric, D.W. 334, 437
Kilpatric, D. 18
Kim, C.C. 28, 44
Kintsch, W. 116, 138, 141, 144
Kinzer, C. 123
Kirsner, K. 484
Klahr, D. 125, 126, 129, 258, 259, 262, 263, 277, 278, 281
Klapp, S.T. 302, 315, 324-326
Klein, R. 461, 533
Knight, F.B. 333, 350
Kolers, P.A. 26, 378, 462
Kornblum, S. 439
Koshmider, J.W. 304, 307, 315, 317, 318, 321, 323, 326, 333, 359
Krogh, A. 379
Krueger, L.E. 213, 336, 483
Kruschke, J.K. 374
Kulak, A.G. 130, 131
Kulm, G. 114
Kwak, H. 278
Kwon, Y. 28, 44, 102, 485
LaBerge, D. 267, 292
Labinowicz, E. 45, 106
Lachman, R. 442
Ladd, S.F. 222, 224, 303
Lamiell, J.T. 197
Landauer, T.K. 203
Langer, S.Z. 443
Langheinrich, D. 259, 262, 278
Langley, P. 126, 129
Larkin, J.H. 138, 140, 141, 150
Lawler, R.W. 122, 123
Lee, S.Y. 24
LeFevre, J.A. 113, 118, 126, 130, 131, 316, 437
Lehman, D.R. 124
Leinhardt, G. 114, 115
Lempert, R.O. 124
Lesh, R.A. 106
Levy, E. 295
Lewandowsky, S. 374
Lewis, A.B. 137-139, 141, 144, 145, 147, 149, 150
Lewis, J. 218
Lindemann, A.M. 207, 333, 360, 365
Lindquist, M.M. 127
Little, T.D. 189, 195, 196, 198, 199, 204, 211, 230, 236, 239-242, 245, 247, 248, 312, 333, 484
Lloyd, K.G. 443
Logan, G.D. 213, 302, 315-317, 319, 324-326, 334, 359, 426, 437
Logie, R.H. 264, 294, 460, 483
Loiseau, P. 443
Lord, M. 259
Lories, G. 200, 333
Lowe, J. 11
Lunneborg, C. 218
Lynch, M.J. 237
Macaruso, P. 15, 366, 376, 470, 493, 498, 499, 503, 504, 539
MacKay, D.G. 342, 358
Mackay, L. 285
Maki, R.H. 462
Mandler, G. 259, 261, 262, 264, 281
Markusses, M.F. 334
Marmor, M. 271, 280, 295
Maroudas, C. 11
Marr, D. 265
Marschark, M. 434
Marsh, L.G. 462
Mathieson, G. 445
Mayer, R.E. 137-139, 141-145, 149, 159, 185
McClain, L. 462
McClelland, J.L. 404, 439
McCloskey, M. 15, 207, 333, 336, 337, 344, 346, 360, 365-369, 371-373, 376-380, 386, 394, 396, 397, 399-401, 403-405, 429, 442, 443, 445, 457-459, 461, 462, 470, 476, 478, 481, 483, 484, 486-488, 493-495, 498-501, 503, 504, 506-508, 526, 534, 539, 541, 544, 553
McDermott, J. 150
McInnes, J. 461, 533
McLeod, B.E. 439
Meck, E. 11, 121
Medin, D.L. 168
Mehler, J. 325, 334, 515
Melz, E. 160
Menninger, K. 42
Messick, S. 158
Miller, G. 263
Miller, K.F. 3, 11, 15-17, 19, 21, 26, 28, 29, 35, 205, 206, 208, 236, 333, 335, 337, 338, 352, 355, 365, 367, 373, 379, 392, 396, 426-428, 437, 447
Miller, S.A. 119, 129
Minsky, M.L. 267, 405
Miura, I.T. 28, 44
Miyake, Y. 19, 460
Monsell, S. 515
Moore, D. 11
Morales, R.V. 139
Morgan, D. 179
Morrison, F.J. 120
Morselli, P.L. 443
Moser, M. 275, 294
Mrkonjic, L. 316, 437
Munari, C. 443
Nairne, J.S. 460, 483
Nathan, M.J. 138, 141, 150
Neches, R. 126, 129
Nesselroade, J.R. 193
Nettleton, N.C. 531
Newell, A. 126
Nichols, J. 11
Nisbett, R.E. 124
Norem, G.M. 333
Novick, L.R. 131, 155-160, 163-168, 171, 172, 175-178, 180, 182
O'Neill, P.E. 199, 200
Ohlsson, S. 108, 127
Okamoto, Y. 28, 44
Oliphant, M. 116, 207, 246, 331, 386, 413, 415, 428, 432, 434, 435, 439, 466, 472-474, 479-481, 487, 549
Omanson, S.F. 97-99, 103
Ortony, A. 168
Osawa, K. 19
Oyama, T. 259, 262, 278
Paige, J.M. 140, 141
Paivio, A. 341, 435, 442
Palmer, R.G. 379
Papert, S.A. 267, 405
Paredes, D.R. 355, 373, 426, 427, 437, 447
Parkman, J.M. 201-203, 207, 229, 263, 269, 279, 283, 303, 304, 333, 365, 367, 379, 428, 447
Parry, P. 528
Pascual-Leone, J. 127
Pauwels, A. 140, 146
Pavel, M. 405
Peano, G. 4-6
Peereman, R. 529, 531
Pellegrino, J.W. 139
Perfetto, G.A. 124
Perlmutter, M. 205, 333, 352, 365, 428
Peterson, P. 11
Peterson, C. 379, 380, 383
Piaget, J. 3, 4, 6-12, 14, 119, 446
Pirolli, P.L. 160, 177, 426, 442, 466
Poltrock, S.E. 334, 460, 528, 547
Polya, G. 141
Posner, M.I. 218, 267, 271, 280, 292, 302, 316, 323, 445
Post, T.A. 129
Potter, M. 295
Potts, G.R. 334, 460, 528, 547
Putnam, R.T. 114, 115, 118, 119
Pylyshyn, Z. 257, 260, 264, 265, 267, 270-273, 275, 280, 282, 288, 291, 292, 295
Rabinowitz, F.M. 129
Rager, J.E. 405
Rapoport, J.L. 448
Reed, S.K. 169, 171, 175, 176
Rees, E. 108, 127
Reese, E. 259
Reese, T. 259
Reinert, G. 244
Resnick, L.B. 97-99, 103, 119, 130, 315
Restle, F. 213, 221, 294, 334, 447
Reusser, K. 141, 144
Revelle, W. 237
Reznick, J.S. 11
Richman, B. 515
Riggio, L. 292
Risko, V. 123
Ritz, S.A. 369
Rizzolatti, G. 292
Roberts, E. 413
Roediger, H.L. 439
Romberg, T.A. 114
Rosenberg, C.R. 405
Ross, B.H. 160, 161, 164, 176, 177
Rossen, M.L. 369
Rougier, M. 443
Rumelhart, D.E. 439
Russell, B. 4-7, 12, 13
Russo, J.E. 224
Ryan, L. 337
Sagi, D. 272, 278
Saltzman, I. 266, 281, 282
Sander, P. 271, 280, 295
Sato, S. 274, 288, 295
Scatton, B. 443
Schacter, D.L. 337
Schlehuber, C.J. 444
Schliemann, A.D. 120, 122
Schlosberg, H. 282
Schmidt, H. 288
Schmidt, J. 303, 317, 426, 437
Schneider, W. 302, 307, 373
Schoenfeld, A.H. 161, 162
Schooler, L.J. 196
Schvaneveldt, R.W. 442
Seidenberg, M.S. 373, 404
Sejnowski, T.J. 405
Sereno, M.E. 369
Seron, X. 200, 333, 483, 508, 512
Shearer, D.E. 446
Shebo, B. 259, 261, 262, 281
Shepard, R.N. 18, 334, 437, 461, 533
Shiffrin, R.M. 302
Shih Huang, J.Y. 462
Shrager, J. 205-209, 223, 333, 336, 365, 501
Shute, V.J. 139
Siegler, R.S. 11, 99, 114, 121, 127, 129, 205-209, 215, 222, 223, 225, 229, 231, 232, 244, 247, 303, 307, 311, 318, 333-336, 365, 366, 369, 403, 447, 501
Silver, E.A. 161, 162
Silverstein, J.W. 369
Simon, H.A. 126, 140-142, 144, 150, 160
Simon, D.P. 150
Simons, D. 259, 262, 278
Sjoberg, K. 259, 281
Slade, C. 531
Snow, D. 531
Snyder, C.R.R. 267, 271, 292, 302, 316, 323, 444
So, P.M. 161, 162, 172
Sodian, B. 125
Sokol, S.M. 15, 16, 333, 337, 344, 346, 365, 367-369, 376, 377, 379, 396, 397, 399-401, 429, 441, 442, 458, 459, 462, 463, 476-478, 482, 483, 488, 493, 501, 502, 504, 505, 507, 534, 541, 544, 553
Sowder, J.T. 121
Spelke, E.S. 11
Spoehr, K.T. 369
Starkey, P. 11, 118
Staudenmayer, H. 442
Stazyk, E.H. 203, 206, 303, 304, 317, 334, 359, 361, 365, 379, 392, 483
Stephens, D.L. 224
Sternberg, R.J. 198, 228, 263
Stevenson, H.W. 24
Stigler, J.W. 19-21, 24, 28, 35, 99, 102, 460
Storm, R.W. 257, 271, 272, 280
Stornetta, W.S. 380
Svenson, O. 259, 281, 333
Swanson, H.L. 222, 303
Takahashi, A. 529
Taubman, R. 263, 264
Taylor, R.L. 218
Teluow, C.F. 443, 444
Thagard, P. 160
Thurstone, L. 259
Tipper, S.P. 413, 438, 444, 446
Toupin, C. 180, 181
Trabert, M.L. 324
Treisman, A.M. 199, 257, 265-268, 273-277, 280, 281, 286, 288, 291, 294, 295
Trick, L. 257, 262, 264, 280, 282, 288, 292, 302
Troup, G.A. 531
Truxaw, C. 150
Tsal, Y. 275
Tsung, F.S. 375
Tzeng, O.J.L. 460, 529
Ullman, S. 257, 265, 267-269, 271, 272, 274, 275, 280, 282, 283, 285, 291, 294
Ulrich, R.F. 444
Umilta, C. 292
Underwood, B.J. 192, 250
Vaid, J. 460, 461, 488, 529-531
van Liere, E.J. 445
Van Oeffelen, M. 259, 278
Verschaffel, L. 140, 146
Virzi, R. 274, 295
Viscuso, S.R. 369-372, 374
von Békésy, G. 439
Vos, P. 259, 278
Vosniadou, S. 179
Vye, N. 123
Vygotsky, L.S. 107
Wallace, J. 50, 277
Walley, R.E. 413, 438, 439
Wang, W.S.Y. 460, 529
Warren, H. 259
Warrington, E.K. 368
Wasik, B.A. 98, 99, 103, 106, 107, 126
Waxman, B. 115
Wearne, D. 104
Weber, T.A. 259, 303
Weiden, T.D. 413, 438
Weimer, R. 141, 144
Welford, A.T. 334, 341, 344, 360, 361
Wenger, E. 150
Wertheimer, M. 123, 141
Wheeler, L.R. 206, 207, 246
Wheeler, M.M. 121
White, B.Y. 150
Widaman, K.F. 191, 197, 198, 200, 201, 203, 205-208, 212, 213, 215, 217, 221, 229-247, 249, 250, 312, 314, 333, 352, 484
Wiley, J.G. 303, 310, 314, 483
Williams, E.N. 336
Williams, D. 413, 415, 416, 419, 431, 435, 449, 450
Willis, G.B. 150
Winkelman, J.H. 303, 317, 426, 437
Wittgenstein, L. 3-5, 12-14, 16
Wixted, J.T. 196
Wohlwill, J.F. 127
Wolfe, J. 274, 295
Wood, T. 107
Woodworth, R. 282
Wynn, K. 121, 126, 128
Yackel, E. 107
Yan, G.G. 15, 26
Young, E. 138, 141, 150
Young, A.W. 497
Zacks, R.T. 413, 446
Zaitchik, D. 125
Zbrodoff, N.J. 213, 302, 316, 317, 319, 325, 326, 334, 359, 426, 437
Zhang, H.C. 15, 26
Zhu, J.J. 15, 26
Zivkovic, B. 443
Subject Index

Abacus skill 19-26, 459-460, 506
Acalculia (see Brain injury) 429, 443, 458, 462, 476, 502, 518, 541
- inhibition and, 441
Achievement levels 43, 75, 98, 101, 189, 198, 226, 230-235, 243
Addition 5, 11, 199, 210, 236, 263, 265, 269, 279, 283, 294, 420, 447
- componential models of, 199
- computational model of, 331
- multidigit, 39, 43, 76, 91, 115, 215, 217, 305, 311, 375
- performance compared to multiplication, 245, 348-358, 427
- simple, 116, 212, 308, 331, 349
- types of errors, 354
Analogical transfer 26, 149, 156, 158-161, 201
- adaptation, 171
- diagrammatic representations and, 156
- mapping, 169
- metacognition and, 184
Arithmetic word problems 155, 172
- visual representation of, 150
Attention 301-306, 325, 413, 499
- focal, 286
- inhibition and, 438-439, 443-446
- pop-out effect and, 197-198, 288
- spatial, 257, 262, 273, 280
- spotlight of, 262, 292
- visual routines and, 267
Automaticity 190, 193, 200-201, 212, 233, 236, 242, 246-247, 269, 301-305, 309, 314-317, 323-326, 437, 549
Brain injury (see Acalculia) 367, 375, 377, 396-397, 399-402, 411, 441, 443, 445, 448, 461, 496, 527
Carry operation 215-216, 218-219, 228, 234, 242, 303, 305-308, 312-315
Commutativity 115, 128, 225, 344, 353, 355-356, 400-401
Componential models 199, 219
Computer simulation 140, 338, 365, 456
- disinhibition model, 421
- MATHNET model, 375
- network-interference model, 331, 338
- neural net models, 365, 369, 371, 373-374, 379, 402-404
Context-sensitive activation 414-415
Counting (see Enumeration) 9, 11, 42, 66-68, 116, 121, 199-201, 208, 219, 221, 224-226, 229, 233, 237, 257, 259, 265, 276, 281, 292, 304, 333, 425, 459-460, 483-484, 500-501, 554-555
Cross-operation errors 338, 350-355, 367, 426-427, 437
Digit frequencies 516
Disinhibition 415-422
- ambiguous problems and, 420
Division 226, 246, 303, 360, 434
Dual-task method 301, 306
Educational implications 114-115, 130-131, 150, 184, 247, 446
Encoding-complex view of number processing 15, 337, 457-463, 465, 472-475, 478-480, 484, 486-488, 493, 554
- vs. modular model, 458-459, 482-483, 506-509, 539-541, 551-555
Enumeration (see Counting) 258, 260, 279
- density-based theory, 260
- pattern-based theory, 261
- working-memory theory, 262
Error priming 304, 337, 342, 356-357, 436, 441
Expertise 4, 16, 19-24, 155, 221
- age and, 178
- conceptual consequences of, 24
- development of, 178, 192
- measuring mathematical, 157
- number representation and, 19
- problem representation and, 161
- schema induction and, 157
- surface vs. structural features and, 21-24, 161
Eye fixations 146
FINSTs 257, 260, 270-280
Format effects 337, 457, 460, 509, 528-533, 540, 551
- in simple multiplication, 464-481, 509-528, 542-552
Functional architecture 378, 493-494, 499
GABA 413, 443, 445
Heuristics 156, 201
Individual differences 16, 189, 191, 194, 196, 226, 326, 412, 444
- developmental change and, 220, 242
- methodological implications, 196
- nomothetic stance and, 193, 207
- person-centered approach, 222
Inhibition 316, 332, 341, 359, 369, 414, 549
- acalculia and, 441
- aging and, 445
- attention and, 438-439, 443-446
- brain damage and, 445
- context-sensitive activation and, 414-415
- development and, 443, 446
- interference and, 349, 438
- mechanisms of, 411
Interference 211, 213, 306, 314, 317, 324-325, 331-336, 349-353, 358-359, 436, 526-527, 529-530, 548-551
- in connectionist networks, 371, 373, 403
- inhibitory mechanisms and, 349, 438
- inter-operation competition and, 349, 355, 424-425, 437
Inversion 116, 118-119, 121, 125-126
Mathematical understanding 40, 115
- activities of, 117-118
- application of procedures, 117-118
- evaluation of procedures, 117, 121
- generality of, 117, 171-178, 182
- justification of procedures, 117, 119
- qualitative, 138, 141
- situation models, 138
Mathematical misunderstandings 59, 138
Modular model of number processing 15, 457-459, 461-463, 469, 482, 494, 501, 551
- calculation system, 376, 458, 474, 482-484, 496, 501, 540, 550, 552
- comprehension system, 458, 474, 482, 503, 540, 550-551
- numeral processing mechanisms, 376, 494, 540, 551
- production system, 344, 458, 474-475, 482, 485
- vs. encoding-complex hypothesis, 458-459, 482-483, 506-509, 539-541, 551-555
Multidigit number marks 41
Multidimensional scaling 16, 18
Multiplication 5, 41, 107, 303-305, 315-318, 321, 324-326, 331, 349, 413-417, 462
- complex, 234, 245
- effects of digit vs. number-word format, 464-481, 509-528, 542-551
- error types, 336, 353, 367, 389, 429, 469, 523, 546
- models of, 204, 331, 375, 421
- performance compared to addition, 245, 348-358, 426-427
Number-fact retrieval
- associative models, 412
- configural models, 432
- Distribution of Associations model, 204, 206-210, 244
- MATHNET model, 375, 378
- network-interference model, 331
- network retrieval, 201
- priming of, 211, 214, 301, 304, 316-317, 319-324, 326, 436
- product model, 204, 237
- structural variables and, 205, 304
- tabular theory of, 203
Number words 41
- frequencies, 515
- orthography, 26, 29, 508
Number representation 4, 16, 34
- abstract, 7, 15, 18, 35, 376, 459, 481, 495, 552
- development of, 7, 107-108
- encoding-complex hypothesis, 459-461, 554
- external, 26
- language and, 4, 13-16, 26-28, 41-44, 484, 502
- magnitude, 18, 21, 24, 332, 341, 370
- physical codes, 332, 339
- prenumerical, 10
Numerical comparison 21, 460, 484, 528
Numerical understanding
- development of, 7, 12, 34, 125
- in infants, 11
Odd-even status 18, 24, 461, 483, 532
Operand-intrusion errors 355, 446, 469, 472-479, 483, 526, 548-551
Pedagogical objects 40, 103-106
Philosophical foundations 12
- foundational theories, 4
- logical foundations, 5
Piaget's theory 6, 10
Problem-size effect 231, 301, 303-304, 317, 333, 357, 392-394, 466, 468, 480
- deviations from, 335-336, 425, 428
- frequency/strength theory of, 334, 394
- network-interference theory of, 334-335
- proactive interference theory of, 373
- procedural strategies and, 199-201, 302-303, 326, 333
Production mechanisms 344, 376-377, 495
- proceduralization, 486, 508
- production task, 205, 214-215, 217, 316, 323
Recall of math problems 142
Remediation 147
Representation training 147
Rules 201
Schema induction 160, 176
Strategy choice 220, 224, 303-304, 314, 447
- Distribution of Associations model, 204, 206-210, 244
- individual differences and, 226
Subitizing 42, 257-258, 265, 269, 276, 280, 287
- spotlight of attention and, 292
- visual routines and, 282
Subtraction 41, 43, 48-49, 118, 226, 246, 303
Symbolic structure 14, 34
Think-aloud protocols 96
Tie problems 318, 335-336, 350-353, 355-356, 396, 402, 421, 428-429, 513-515
Verification task 210-215, 218-219, 227, 233, 237, 245, 314, 323, 482-484, 501
Visual analysis
- stages of, 265
Visual routines 267
- indexing, marking, scanning, 266-267
Working memory 213, 235-236, 242, 247, 262, 280-281, 301, 304-307, 309, 311-312, 314-315, 317, 325-326, 446, 459-460, 499-507