Psychology of Learning and Motivation Vol 16


Author: Gordon H. Bower


THE PSYCHOLOGY OF LEARNING AND MOTIVATION Advances in Research and Theory

VOLUME 16

CONTRIBUTORS TO THIS VOLUME

William G. Chase
K. Anders Ericsson
Arthur C. Graesser
Alice F. Healy
Werner K. Honig
Barbee T. Mynatt
Glenn V. Nakamura
Zehra F. Peynircioglu
Kirk H. Smith
Roger K. R. Thompson
Michael J. Watkins


EDITED BY GORDON H. BOWER
STANFORD UNIVERSITY, STANFORD, CALIFORNIA

Volume 16 1982

ACADEMIC PRESS A Subsidiary of Harcourt Brace Jovanovich, Publishers

New York London Paris San Diego San Francisco São Paulo Sydney Tokyo Toronto

COPYRIGHT © 1982, BY ACADEMIC PRESS, INC. ALL RIGHTS RESERVED. NO PART OF THIS PUBLICATION MAY BE REPRODUCED OR TRANSMITTED IN ANY FORM OR BY ANY MEANS, ELECTRONIC OR MECHANICAL, INCLUDING PHOTOCOPY, RECORDING, OR ANY INFORMATION STORAGE AND RETRIEVAL SYSTEM, WITHOUT PERMISSION IN WRITING FROM THE PUBLISHER.

ACADEMIC PRESS, INC.

111 Fifth Avenue, New York, New York 10003

United Kingdom Edition published by ACADEMIC PRESS, INC. (LONDON) LTD. 24/28 Oval Road, London NW1 7DX

LIBRARY OF CONGRESS CATALOG CARD NUMBER: 66-30104

ISBN 0-12-543316-6

PRINTED IN THE UNITED STATES OF AMERICA

82 83 84 85   9 8 7 6 5 4 3 2 1

CONTENTS

Contributors
Contents of Previous Volumes

SKILL AND WORKING MEMORY
William G. Chase and K. Anders Ericsson

I. The Skilled Memory Effect
II. Analysis of a Memory-Span Expert
III. A Theory of Skilled Memory
IV. Further Studies of Skilled Memory
V. Conclusion
References

THE IMPACT OF A SCHEMA ON COMPREHENSION AND MEMORY
Arthur C. Graesser and Glenn V. Nakamura

I. Introduction
II. Methods
III. A Schema Copy Plus Tag Model
IV. Some Issues Confronting the SC
V. The Fate of Four Alternative Mo
VI. The Process of Copying Schema o Specific Memory Traces
VII. Questions for Further Research
References

CONSTRUCTION AND REPRESENTATION OF ORDERINGS IN MEMORY
Kirk H. Smith and Barbee T. Mynatt

I. Introduction
II. Review of Previous Research
III. Overview of the Experiments
IV. Experiment 1: Retrieval from Pa
V. Experiment 2: The Role of Determinacy in Constructing Partia
VI. Experiment 3: Node Construction
VII. Experiment 4: Diverging and Conv
VIII. Experiment 5: The Role of the Schema
IX. Summary
References

A PERSPECTIVE ON REHEARSAL
Michael J. Watkins and Zehra F. Peynircioglu

I. Overview
II. Meaning of Rehearsal
III. Rehearsal for Free Recall Reconsidered
IV. Rehearsal of Nonverbal Stimuli
V. Summary and Conclusions
References

SHORT-TERM MEMORY FOR ORDER INFORMATION
Alice F. Healy

I. Introduction
II. Experiment 1
III.
IV.
V.
VI. Conclusions
References

RETROSPECTIVE AND PROSPECTIVE PROCESSING IN ANIMAL WORKING MEMORY
Werner K. Honig and Roger K. R. Thompson

I. Introduction: Retrospective and Prospective Remembering
II. Representations of Initial and Test Stimuli
III. Differentiation of Trial Outcomes
IV. Comparisons among Working Memory Paradigms
V. Memory for Multiple Items
VI. Discrimination and Memory of Stimulus Sequences
References

Index


CONTRIBUTORS Numbers in parentheses indicate the pages on which the authors' contributions begin.

William G. Chase, Department of Psychology, Carnegie-Mellon University, Pittsburgh, Pennsylvania 15213 (1)

K. Anders Ericsson, Department of Psychology, University of Colorado, Boulder, Colorado 80309 (1)

Arthur C. Graesser, Department of Psychology, California State University, Fullerton, California 92634 (59)

Alice F. Healy, Department of Psychology, University of Colorado, Boulder, Colorado 80309 (191)

Werner K. Honig, Department of Psychology, Dalhousie University, Halifax, Nova Scotia B3H 4J1, Canada (239)

Barbee T. Mynatt,¹ Department of Psychology, Bowling Green State University, Bowling Green, Ohio 43403 (111)

Glenn V. Nakamura, Department of Psychology, California State University, Fullerton, California 92634 (59)

Zehra F. Peynircioglu, Department of Psychology, Rice University, Houston, Texas 77251 (153)

Kirk H. Smith, Department of Psychology, Bowling Green State University, Bowling Green, Ohio 43403 (111)

Roger K. R. Thompson, Department of Psychology, Franklin and Marshall College, Lancaster, Pennsylvania 17604 (239)

Michael J. Watkins, Department of Psychology, Rice University, Houston, Texas 77251 (153)

¹Present address: Computer Science Department, Bowling Green State University, Bowling Green, Ohio 43403.

CONTENTS OF PREVIOUS VOLUMES

Volume 1

Partial Reinforcement on Vigor and Persistence
Abram Amsel
A Sequential Hypothesis of Instrumental Learning
E. J. Capaldi
Satiation and Curiosity
Harry Fowler
A Multicomponent Theory of the Memory Trace
Gordon Bower
Organization and Memory
George Mandler
Author Index-Subject Index

Volume 2

Incentive Theory and Changes in Reward
Frank A. Logan
Shift in Activity and the Concept of Persisting Tendency
David Birch
Human Memory: A Proposed System and Its Control Processes
R. C. Atkinson and R. M. Shiffrin
Mediation and Conceptual Behavior
Howard K. Kendler and Tracy S. Kendler
Author Index-Subject Index

Volume 3

Stimulus Selection and a “Modified Continuity Theory”
Allan R. Wagner
Abstraction and the Process of Recognition
Michael I. Posner
Neo-Noncontinuity Theory
Marvin Levine
Computer Simulation of Short-Term Memory: A Component-Decay Model
Kenneth R. Laughery
Replication Process in Human Memory and Learning
Harley A. Bernbach
Experimental Analysis of Learning to Learn
Leo Postman
Short-Term Memory in Binary Prediction by Children: Some Stochastic Information Processing Models
Richard S. Bogartz
Author Index-Subject Index

Volume 4

Learned Associations over Long Delays
Sam Revusky and John Garcia
On the Theory of Interresponse-Time Reinforcement
G. S. Reynolds and Alastair McLeod
Sequential Choice Behavior
Jerome L. Meyers
The Role of Chunking and Organization in the Process of Recall
Neal F. Johnson
Organization of Serial Pattern Learning
Frank Restle and Eric Brown
Author Index-Subject Index

Volume 5

Conditioning and a Decision Theory of Response Evocation
G. Robert Grice
Short-Term Memory
Bennet B. Murdock, Jr.
Storage Mechanisms in Recall
Murray Glanzer
By-products of Discriminative Learning
H. S. Terrace
Serial Learning and Dimensional Organization
Sheldon M. Ebenholtz
FRAN: A Simulation Model of Free Recall
John Robert Anderson
Author Index-Subject Index

Volume 6

Informational Variables in Pavlovian Conditioning
Robert A. Rescorla
The Operant Conditioning of Central Nervous System Electrical Activity
A. H. Black
The Avoidance Learning Problem
Robert C. Bolles
Mechanisms of Directed Forgetting
William Epstein
Toward a Theory of Redintegrative Memory: Adjective-Noun Phrases
Leonard M. Horowitz and Leon Manelis
Elaborative Strategies in Verbal Learning and Memory
William E. Montague
Author Index-Subject Index

Volume 7

Grammatical Word Classes: A Learning Process and Its Simulation
George R. Kiss
Reaction Time Measurements in the Study of Memory Processes: Theory and Data
John Theios
Individual Differences in Cognition: A New Approach to Intelligence
Earl Hunt, Nancy Frost, and Clifford Lunneborg
Stimulus Encoding Processes in Human Learning and Memory
Henry C. Ellis
Subproblem Analysis of Discrimination Learning
Thomas Tighe
Delayed Matching and Short-Term Memory in Monkeys
M. R. D’Amato
Percentile Reinforcement: Paradigms for Experimental Analysis of Response Shaping
John R. Platt
Prolonged Rewarding Brain Stimulation
J. A. Deutsch
Patterned Reinforcement
Stewart H. Hulse
Author Index-Subject Index

Volume 8

Semantic Memory and Psychological Semantics
Edward E. Smith, Lance J. Rips, and Edward J. Shoben
Working Memory
Alan D. Baddeley and Graham Hitch
The Role of Adaptation Level in Stimulus Generalization
David R. Thomas
Recent Developments in Choice
Edmund Fantino and Douglas Navarick
Reinforcing Properties of Escape from Frustration Aroused in Various Learning Situations
Helen B. Daly
Conceptual and Neurobiological Issues in Studies of Treatments Affecting Memory Storage
James L. McGaugh and Paul E. Gold
The Logic of Memory Representations
Endel Tulving and Gordon H. Bower
Subject Index

Volume 9

Prose Processing
Lawrence T. Frase
Analysis and Synthesis of Tutorial Dialogues
Allan Collins, Eleanor H. Warnock, and Joseph J. Passafiume
On Asking People Questions about What They Are Reading
Richard C. Anderson and W. Barry Biddle
The Analysis of Sentence Production
M. F. Garrett
Coding Distinctions and Repetition Effects in Memory
Allan Paivio
Pavlovian Conditioning and Directed Movement
Eliot Hearst
A Theory of Context in Discrimination Learning
Douglas L. Medin
Subject Index

Volume 10

Some Functions of Memory in Probability Learning and Choice Behavior
W. K. Estes
Repetition and Memory
Douglas L. Hintzman
Toward a Framework for Understanding Learning
John D. Bransford and Jeffrey J. Franks
Economic Demand Theory and Psychological Studies of Choice
Howard Rachlin, Leonard Green, John H. Kagel, and Raymond C. Battalio
Self-punitive Behavior
K. Edward Renner and Jeanne B. Tinsley
Reward Variables in Instrumental Conditioning: A Theory
Roger W. Black
Subject Index

Volume 11

Levels of Encoding and Retention of Prose
D. James Dooling and Robert E. Christiaansen
Mind Your p’s and q’s: The Role of Content and Context in Some Uses of And, Or, and If
Samuel Fillenbaum
Encoding and Processing of Symbolic Information in Comparative Judgments
William P. Banks
Memory for Problem Solutions
Stephen K. Reed and Jeffrey A. Johnson
Hybrid Theory of Classical Conditioning
Frank A. Logan
Internal Constructions of Spatial Patterns
Lloyd R. Peterson, Leslie Rawlings, and Carolyn Cohen
Attention and Preattention
Howard Egeth
Subject Index

Volume 12

Experimental Analysis of Imprinting and Its Behavioral Effects
Howard S. Hoffman
Memory, Temporal Discrimination, and Learned Structure in Behavior
Charles P. Shimp
The Relation between Stimulus Analyzability and Perceived Dimensional Structure
Barbara Burns, Bryan E. Shepp, Dorothy McDonough, and Willa K. Wiener-Ehrlich
Mental Comparison
Robert S. Moyer and Susan T. Dumais
The Simultaneous Acquisition of Multiple Memories
Benton J. Underwood and Robert A. Malmi
The Updating of Human Memory
Robert A. Bjork
Subject Index

Volume 13

Pavlovian Conditioning and the Mediation of Behavior
J. Bruce Overmier and Janice A. Lawry
A Conditioned Opponent Theory of Pavlovian Conditioning and Habituation
Jonathan Schull
Memory Storage Factors Leading to Infantile Amnesia
Norman E. Spear
Learned Helplessness: All of Us Were Right (and Wrong): Inescapable Shock Has Multiple Effects
Steven F. Maier and Raymond L. Jackson
On the Cognitive Component of Learned Helplessness and Depression
Lauren B. Alloy and Martin E. P. Seligman
A General Learning Theory and Its Application to Schema Abstraction
John R. Anderson, Paul J. Kline, and Charles M. Beasley, Jr.
Similarity and Order in Memory
Robert G. Crowder
Stimulus Classification: Partitioning Strategies and Use of Evidence
Patrick Rabbitt
Immediate Memory and Discourse Processing
Robert J. Jarvella
Subject Index

Volume 14

A Molar Equilibrium Theory of Learned Performance
William Timberlake
Fish as a Natural Category for People and Pigeons
R. J. Herrnstein and Peter A. de Villiers
Freedom of Choice: A Behavioral Analysis
A. Charles Catania
A Sketch of an Ecological Metatheory for Theories of Learning
Timothy D. Johnston and M. T. Turvey
SAM: A Theory of Probabilistic Search of Associative Memory
Jeroen G. W. Raaijmakers and Richard M. Shiffrin
Memory-Based Rehearsal
Ronald E. Johnson
Individual Differences in Free Recall: When Some People Remember Better Than Others
Marcia Ozier
Index

Volume 15

Conditioned Attention Theory
R. E. Lubow, I. Weiner, and Paul Schnur
A Classification and Analysis of Short-Term Retention Codes in Pigeons
Donald A. Riley, Robert G. Cook, and Marvin R. Lamb
Inferences in Information Processing
Richard J. Harris
Many Are Called but Few Are Chosen: The Influence of Context on the Effects of Category Size
Douglas L. Nelson
Frequency, Orthographic Regularity, and Lexical Status in Letter and Word Perception
Dominic W. Massaro, James E. Jastrzembski, and Peter A. Lucas
Self and Memory
Anthony G. Greenwald
Children’s Knowledge of Events: A Causal Analysis of Story Structure
Tom Trabasso, Nancy L. Stein, and Lucie R. Johnson
Index


SKILL AND WORKING MEMORY

William G. Chase
CARNEGIE-MELLON UNIVERSITY
PITTSBURGH, PENNSYLVANIA

K. Anders Ericsson
UNIVERSITY OF COLORADO
BOULDER, COLORADO

I. The Skilled Memory Effect
   A. Short-Term Memory Capacity
   B. Chess and Other Game Skills
   C. Nongame Skills
II. Analysis of a Memory-Span Expert
   A. The Effects of Practice on Digit Span
   B. Mechanisms of Skilled Memory
III. A Theory of Skilled Memory
   A. The Structure of Long-Term Memory
   B. Short-Term Memory and Attention
   C. Memory Operations
   D. Interference
   E. Working Memory
IV. Further Studies of Skilled Memory
   A. Analysis of a Mental Calculation Expert
   B. The Memory of a Waiter
   C. Sentence Memory
V. Conclusion
References

Why is memory so much better for skilled people in their domain of expertise? Our interest in this problem first began 3 years ago, when we started training a subject on the digit-span task. Over the course of 2 years of practice, our subject was able to increase his digit span from 7 digits to over 80 digits, and our analysis of this subject led us to our interest in memory performance of skilled individuals. In this article, we shall first review the literature on skilled memory, then we shall describe our analysis of skilled memory in the digit-span task, and finally we shall discuss our latest work with a mental calculation expert and with a waiter who memorizes food orders, as well as extensions of our work with normal subjects.

I. The Skilled Memory Effect

A. SHORT-TERM MEMORY CAPACITY

The capacity of short-term memory has long been accepted as one of the most fundamental limits in people’s ability to think, solve problems, and process information in general (Miller, 1956; Newell & Simon, 1972). The memory span (about 7 unrelated symbols) is the most accepted measure of short-term memory capacity (Miller, 1956), and this severe limit on readily accessible symbols is commonly taken as a fundamental limit on the working memory capacity of the human information-processing system (Baddeley, 1976; Klatzky, 1980). That is, recent events attended to in the environment, knowledge states activated from long-term memory, and intermediate computations necessary for performing complex information-processing tasks are assumed to be held in short-term memory for immediate access. Working memory is equated with short-term memory, and it is this severe constraint on the number of readily accessible symbols that limits our information-processing capacity. Memory span has even been taken by some people as a fundamental measure of intelligence (Bachelder & Denny, 1977a,b). The superior memory performance by experts in their area of expertise seems to fly in the face of these basic limits.

B. CHESS AND OTHER GAME SKILLS

The skilled memory effect has been in the literature for some time. de Groot (1966) discovered that chess masters have virtually perfect recall of a chess board after viewing it for only a few seconds (5-10 sec), whereas novices can recall only three or four pieces (Chase & Simon, 1973a). Chase and Simon (1973a) showed that this memory is specific to the master’s knowledge domain by presenting chess players with randomized chess positions and finding that recall was uniformly poor for all players, regardless of their skill level. In addition to the master’s superior memory for chess positions, Chase and Simon (1973b) also found that the master has greatly superior memory for sequences of moves. According to Chase and Simon (1973b), this memory performance is the result of a vast knowledge base that the master has acquired through years of practice. This knowledge includes procedures for generating moves, stereotyped sequences of moves, and stereotyped patterns of pieces. In order to explain the master’s superior memory for positions, Chase and Simon suggested that the master recognizes familiar patterns that he sees often in his study and play, whereas the novice is able to notice only rudimentary relations in the limited time allowed in the chess memory task. When Chase and Simon (1973a) measured memory performance in terms of patterns rather than individual pieces, master and novice memory performance were much more similar, and the absolute magnitude of memory performance was closer to seven. They concluded that the limit in performance in the chess memory task is due to the limited capacity of short-term memory. The master holds retrieval cues in short-term memory for seven patterns, located in long-term memory; and at recall, these cues are used to retrieve each pattern, one at a time, from long-term memory. The novice, on the other hand, must utilize all of his short-term memory capacity to store the identity, color, and location of three or four individual chess pieces.

There was one discrepant finding in the Chase and Simon (1973a) study which, in retrospect, seems critical to our analysis of skilled memory. They found that even when the master’s memory performance was scored in terms of patterns recalled, and the sophisticated guessing strategies of the master were discounted, the master’s recall still often exceeded the accepted limits of short-term memory capacity (7 ± 2). In short, the master’s recall of patterns even exceeded the capacity of short-term memory, and Chase and Simon (1973b) were unable to fully explain this phenomenon.
Charness (1976) later demonstrated that these chess patterns (i.e., their retrieval cues) are not retained in short-term memory because they are not susceptible to interference effects in short-term memory. Later we shall try to show that this result is perfectly compatible with our new conception of working memory. This skilled memory effect has been replicated many times (Charness, 1976; Chi, 1978; Ellis, 1973; Frey & Adesman, 1976; Goldin, 1978, 1979; Lane & Robertson, 1979), and the same effect has been found with expert players in the games of go, gomoku, and bridge. Reitman (1976) studied a professional-level go player whose perceptual memory for go patterns closely paralleled that of chess masters for chess positions. In another study, Eisenstadt and Kareev (1975) compared recall of go and gomoku patterns. They took advantage of the fact that go and gomoku are played on the same 19 × 19 board with the same black and white stones, but the objects of the games are different and the types of patterns are different. In go, the object of the game is to surround the opponent's stones, whereas in gomoku, the object is to place 5 items (stones) in a row. They trained subjects to play both games, and then, in one experiment, they asked subjects to recall a go position and a gomoku position. In fact, subjects were shown the same pattern, except that it had been rotated 90° and the color of the pieces had been reversed so that subjects were unaware of the structural identity of the positions. The interesting finding of this study was that when subjects thought they were recalling a go position, their recall of go patterns (i.e., stones crucial to the analysis of the position as a go game) was far superior to their recall of gomoku patterns (by a factor of almost 2 to 1), and when subjects thought they were recalling a gomoku position, their recall favored the gomoku patterns by almost a 2-to-1 margin.

Rayner (1958), in an interesting training study, was able to trace the development of gomoku patterns with practice. By studying a group of people over a 5-week period as they acquired skill in the game of gomoku, Rayner (1958) was able to describe the types of patterns that players gradually learned to look for, and the associated strategies for each pattern. The patterns are quite simple; the difficulty in learning them arises from the number of moves required to generate a win from a pattern. The most complicated strategy that Rayner described was an 11-move sequence starting from a fairly simple and innocuous-looking pattern of four stones. In his analysis of the acquisition of gomoku, Rayner (1958) described a process by which his subjects gradually switched from an analytic mode of working through the strategies to a perceptual mode in which they searched for familiar patterns for which they had already learned a winning strategy.
In short, Rayner (1958) analyzed in his laboratory, over a 5-week period in a microcosm, the perceptual learning process that is presumed to occur on a much larger scale, over the course of years of practice, as chess players gradually acquire master-level proficiency.

The skilled memory effect has also been found in the game of bridge, which has no obvious spatial component. Charness (1979) and Engle and Bukstel (1978) have both reported that high-level bridge experts can remember an organized bridge hand (arranged by suit and denomination) almost perfectly after viewing it for only a few seconds, whereas less experienced bridge players show much poorer recall. With unorganized hands, performance is uniformly poor for both experts and less experienced players. In addition, bridge experts were able to generate bids faster and more accurately, they planned the play of a hand faster and more accurately, and they had superior memory for hands they had played. Thus, it is our contention that bridge expertise, like chess, depends in
part on fast-access pattern recognition because patterns are associated with procedural knowledge about strategies and correct lines of play.

C. NONGAME SKILLS

The skilled memory effect has also been demonstrated in domains other than games, such as visual memory for music (Salis, 1977; Sloboda, 1976). An additional important property of skilled memory has emerged from several of these nongame skill studies: hierarchical knowledge structures. Akin (in press) has analyzed the recall of building plans by architects and found several interesting results. First, as with chess players, architects recall plans pattern by pattern. Second, architectural plans are recalled hierarchically. At the lowest level in the hierarchy, patterns are fairly small parts of functional spaces, such as wall segments, doors, and a table in a corner. The next higher level in the hierarchy contains rooms and other areas, and higher levels contain clusters of rooms or areas. The fairly localized property of architectural patterns at the lowest level in the hierarchy is reminiscent of the localized nature of chess patterns reported by Chase and Simon (1973a). Only at the next level in the hierarchy do architectural drawings take on the functional form of the architectural space: rooms, halls, and so on. Architectural patterns seem similar to chess patterns in that functional properties are more important at higher levels, while structural properties are more important at lower levels. Egan and Schwartz (1979) have found superior recall of circuit diagrams by expert electronics technicians after a brief exposure (5-15 sec) of the diagram. Egan and Schwartz have also found evidence of a higher level organization for the skilled electronics technician. At the lowest level, the basic patterns were very similar to the chess patterns and architectural patterns in terms of their localized nature. The skilled technicians, however, were faster and more accurate in their between-pattern recall than the novices, which is good evidence for the existence of higher level organization.
Egan and Schwartz concluded that expert technicians use their conceptual knowledge of the circuit’s function to aid in their recall. In the domain of computer programming, Shneiderman (1976) presented a print-out of a simple FORTRAN program or a scrambled print-out of a simple FORTRAN program to programmers with varying degrees of experience. The number of perfectly recalled lines of code from the real program increased dramatically with experience, whereas there was virtually no increase in recall with the scrambled program; for the most experienced programmers, there was a 3-to-1 difference in recall (6 vs 18 lines). McKeithen, Reitman, Rueter, and Hirtle (1981) have since replicated this result with ALGOL programs. Shneiderman (1976) further showed that the nature of the errors by the experienced programmers (replacing variable names and statement labels consistently, changing the order of lines when it did not affect the program’s result) provided evidence that the experienced programmers were using knowledge of the program’s function to organize their memory for lines of programming code.

The existence of higher level functional knowledge in the more experienced individuals has also been demonstrated in baseball fans. Chiesi, Spilich, and Voss (1979) have found that the differential recall of baseball events by individuals with high and low baseball knowledge can be traced to their differential ability to relate the events to the game’s goal structure. That is, high- and low-knowledge individuals were equally competent at recalling single sentences of baseball information. However, high-knowledge individuals were better at recalling sequences of baseball events, presumably because they were better able to relate each sequence to the game’s hierarchical goal structure of advancing runners, scoring runs, and winning. A very similar result on normal subjects has been demonstrated by Bransford and Johnson (1973) for recall of paragraphs. Bransford and Johnson showed that subjects were better at recalling ideas from a paragraph if they were given an organizing principle for the paragraph at the time of learning, such as a title, an illustration of the main idea of the paragraph, or the topic of the paragraph. We suggest that recall is facilitated by the use of some abstract hierarchical organizing structure for the paragraph. The same must be true of scripts and schemas as organizing structures for stories and scenes (Biederman, 1972; Bower, Black & Turner, 1979).
Although we will discuss this topic more fully in the analysis of our mental calculation expert, we briefly note here that mental calculation experts, as a side-effect of their computational skill, generally exhibit a digit span that is two or three times larger than normal (Hatano & Osawa, 1980; Hunter, 1962; Mitchell, 1907; Muller, 1911).

To sum up the analysis so far, the skilled memory effect has been demonstrated in a variety of game-playing and non-game-playing domains, although the bulk of the research has centered on exceptional memories of chess masters. In theory, this exceptional memory performance has been attributed to the existence of a vast long-term knowledge base built up by the expert with years of practice. In game-playing domains this knowledge takes the form, in part, of patterns which serve as retrieval aids for desirable courses of action. It was suggested that in other domains, hierarchical knowledge structures exist in the expert for the purpose of organizing knowledge. For architectural drawings, functional
areas (e.g., rooms) serve to organize lower level structures (walls, furniture, etc.); for circuit diagrams and computer programs, function is used to organize the components; and for baseball games, the hierarchical goal structure of the game is used to organize sequences of events. Although Chase and Simon (1973a,b) did not find much evidence for the existence of hierarchical structure in the master’s memory of chess positions, we suggest that there must indeed be some organizing principle to account for the fact that the master’s recall of patterns exceeds his short-term memory capacity. We shall come back to this problem again later. Finally, before we enter into the analysis of our digit-span expert, we should briefly mention a distinctly different but related type of memory expert: the mnemonist. Unlike the skill-based expert, the mnemonist does not achieve his exceptional memory performance in a particular area of expertise. Rather, the mnemonist has acquired a system or repertoire of techniques for memorizing nonsense material. Persons with trained memories can use mnemonic techniques to memorize long lists of words, names, numbers, and other arbitrary items. The most common mnemonic technique is the use of visual images as mediating devices, and the most powerful system is the method of loci, in which items to be remembered are imagined in a series of well-memorized locations interacting with objects in these locations. Mnemonists have generally made themselves known as stage performers, although the techniques have received a great deal of attention recently in the psychological literature. A cognitive theory of exceptional memory should deal with both the expertise-based memory performance and the mnemonics-based memory performance. We shall return to the cognitive principles underlying mnemonics in a later section. 
(See Bower, 1972, for a good scientific analysis of mnemonic techniques; Yates, 1966, for a good historical analysis; and Lorayne and Lucas, 1974, for the current best-selling system.)

II. Analysis of a Memory-Span Expert

In this section, we will describe the highlights of our previous analysis of digit-span experts (reported more fully in Chase & Ericsson, 1981; Ericsson, Chase, & Faloon, 1980), and in addition we report some new results of interest to our theory of skilled memory.

A. THE EFFECTS OF PRACTICE ON DIGIT SPAN

The basic procedure in the memory span task is to read digits to subjects at the rate of 1 digit per sec followed by ordered recall. If the sequence is reported correctly, the length of the next sequence is increased by one digit; otherwise the next sequence is decreased by one digit. Immediately after the recall of each trial, subjects are asked for a verbal report of their thought processes during the trial. At the end of each session, subjects are also asked to recall as much of the material as they can from the session. On some days, experimental sessions are run instead of practice sessions. Figure 1 shows the average digit span of two subjects as a function of practice. Both subjects demonstrate a steady, although somewhat irregular, increase in digit span with practice. It appears that 200-300 hours of practice is sufficient to yield performance that exceeds the normal memory span by a factor of 10. Our original subject, SF, began the experiment in May 1978 and continued for 2 years (a total of 264 sessions) before the experiment ended. The highest digit-span performance achieved by SF was 82 digits. We started training our second subject, DD, in February 1980 to see if it was possible to train another person to SF’s system, and now, after 286 sessions, the highest span achieved by DD is 68 digits. Until now, the highest digit spans reported in the literature have been around 20 digits, and these have generally been achieved by mental calculation experts (Hatano & Osawa, 1980; Hunter, 1962; Martin & Fernberger, 1929; Mitchell, 1907; Muller, 1911). How is this memory feat possible? To answer this question, we have resorted to an extensive analysis of our subjects’ verbal reports, we have conducted over 100 experimental procedures of various kinds on our two subjects, and we have even written a computer simulation of SF’s coding strategies. 
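The up-and-down rule for choosing list lengths can be sketched in a few lines of Python. This is only an illustrative simulation, not the authors' procedure or software: the function name `run_staircase`, the `recall_ok` callback, and the toy subject who always succeeds up to a fixed span are assumptions introduced here.

```python
from typing import Callable, List


def run_staircase(start_len: int, n_trials: int,
                  recall_ok: Callable[[int], bool]) -> List[int]:
    """Return the list length presented on each trial.

    After a correct ordered recall the next list is one digit longer;
    after an error it is one digit shorter (never below one digit).
    """
    lengths = []
    length = start_len
    for _ in range(n_trials):
        lengths.append(length)
        if recall_ok(length):
            length += 1
        else:
            length = max(1, length - 1)
    return lengths


# Hypothetical subject who recalls any list of up to 7 digits and fails beyond that.
trial_lengths = run_staircase(start_len=5, n_trials=8,
                              recall_ok=lambda n: n <= 7)
print(trial_lengths)
```

Under such a rule the presented length settles into an oscillation around the longest list the subject can just manage, which is why the average over a session serves as an estimate of span.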
In the process, we have discovered three principles of memory skill that we believe characterize the cognitive processes underlying this memory skill: (a) subjects use meaningful associations with material in long-term memory, (b) subjects store the order of items in another long-term memory structure that we have called a “retrieval structure,” and (c) subjects’ encoding and retrieval operations speed up with practice. We shall consider each of these in turn.

B. MECHANISMS OF SKILLED MEMORY

1. The Mnemonic System

When we first started this experiment, we simply wanted to run a subject for a couple of weeks to see if it was possible to increase the memory span with practice and, if so, whether we could use the subject’s retrospective reports to figure out how it happened. The verbal reports were very revealing of both the mnemonic system and the retrieval structure.

¹ Sadly, SF died of a chronic blood disorder in the spring of 1981.

Fig. 1. Average digit span for SF and DD as a function of practice. (Vertical axis: digit span; horizontal axis: practice in 5-day blocks.)

The first 4 hours of the experiment were fairly uneventful. SF started out like virtually all the naive subjects we have run. On the first day, he simply tried to hold everything in a rehearsal buffer, and this strategy resulted in a perfectly average span of 7 digits. The next three days, SF tried another common strategy: Separate one or two groups of three digits each in the beginning of the list, concentrate on these sets first and then set them “aside” somewhere, and then hold the last part of the list in the rehearsal buffer; at recall, retrieve and recall the initial sets while simultaneously concentrating on the rehearsal buffer, and then recall the rehearsal buffer. (This strategy represents the first rudimentary use of a retrieval structure, which is the second component of the skill, to be described later.) This simple grouping strategy seemed to produce a slight improvement in performance (to eight or nine digits), but by Day 4, SF


reported that he had reached his limit and no further improvements were possible. And then, on the fifth day, SF’s span suddenly jumped beyond 10 digits, and he began to report the use of a mnemonic aid. From then on, SF’s performance steadily increased, along with the reported use of his mnemonic system and accompanying retrieval system. It turned out that SF was a very good long-distance runner (a member of an NCAA championship cross-country junior-college team), and he was using his knowledge of running times as a mnemonic aid. For example, 3492 = “near world-record mile time.” He initially coded only 1- and 2-mile times, but he gradually expanded his mnemonic codes to include 11 major categories from 4 mile to marathon. In addition, he added years (e.g., 1943 = “near the end of World War II”), and later he added ages for digit groups that could not be coded as running times. For example, 896 cannot be a time because the second digit is too big, so SF coded this digit group as “eighty-nine point six years old, very old man.” Table I shows the major categories used by SF and the session number when they first appeared in the verbal protocols.

TABLE I
MAJOR CODING STRUCTURES

Coding structure            Example     First reported (session no.)
Three-digit groups
  Time                      8:05        5
  Age + decimal             49.7        70
Four-digit groups
  Time (3, 4, 5, 10 min)    13:20       20
  Time + decimal            4:10.6      26
  Digit + time              9-7:05      60
  Year                      1955        64
  Age + age                 46 76       64

By the end of 6 months (100 sessions), SF had essentially completed his mnemonic system and he was coding 95% of all digit sequences, of which the majority were running times (65%), a substantial minority were ages (25%), and the rest were either years or numerical patterns (5%). After 200 hours, SF coded virtually everything. Later, when we wanted to see if it was possible to train another subject to use SF’s mnemonic system, we were able to enlist another exceptional runner, DD, who was a College Division III All-American cross-country runner. DD was able to learn SF’s mnemonic system without any trouble, although the system he eventually developed is somewhat different, in

part because of the differences in the races he specializes in. DD also coded virtually everything after 200 hours of practice, and the relative proportions of running times, ages, years, and numerical patterns were similar to SF’s. It should be emphasized that the semantic memories of our two subjects are very rich. That is, SF and DD do not simply code digit groups as a member of a major category; there are many subcategories within each major category. For example, there are dozens of mile times: near world-record time, good work-out time for high school, training time for the marathon. Table II is a listing of the 1-mile categories around 4:00 derived from DD’s verbal protocol when he was recently asked to sort into categories a deck of 31 cards with running times ranging from 3:40 to 4:10. The left-hand column of Table II contains the categories derived from a different protocol taken from SF 3 years earlier, after SF had had about 3 months of practice on the digit-span task. In this early protocol, we asked SF to divide the running-time spectrum into categories, although we did not ask him to describe each category.

TABLE II
SF’s AND DD’s CATEGORIES FOR TIMES BETWEEN 3:40 AND 4:20

SF’s category   DD’s categories
times           Times     Description of semantic category
349             340-344   Slow M-mile times
                346-349   Coe and Ovett. I imagine a picture I saw in a magazine with Coe or Ovett and 348 on it.
                347       347 point something is the new world record
                349       John Walker. With a decimal time, I think of John Walker in a race. Without a decimal, I picture Coe or Ovett
350             350       New Barrier
351             351       Old World Record for a long time
352-358         352       Indoor World Record
                353       Darrell Waltrip
                354-356   Now middle of the pack in a great race
359             357-359   Breaking the 4-min mile
400             400       Still the Big Barrier
401-414         401-402   A sec or two off the 4-min mile
                403-412   Seems like everyone has run one of these. Every good college miler has done a 40-something
415             413-420   Teens. Usually associated with high school times
416-419
420

We were simply

interested in determining the size of SF’s semantic network of running times. In that early protocol, SF reported 210 distinct running-time categories, including 81 1-mile categories. When this protocol was taken (after 3 months of practice), SF was coding mostly 1-mile and 2-mile times, which together comprised two-thirds of the 210 categories reported by SF at the time. Despite the differences in procedures, different amounts of practice, and important changes in the running world, it is interesting to examine the two sets of categories side by side. Although there is little direct correspondence of categories, there are some striking similarities. There are 10 or more distinct categories for each subject over this small range, many of which contain only a single (nondecimal) time. Note in DD’s protocol that several times are associated with specific events or people. Table II illustrates an important point about these mnemonic codes: They are semantically rich and distinctive. On the basis of SF’s verbal protocols, we were able to figure out his coding rules and eventually to incorporate these rules into a computer simulation model that predicted how SF would code a string of digits, with 90% accuracy. We have also conducted many experiments to test our theory of SF’s coding system. The first two experiments we conducted (Days 42 and 47) were a direct test of SF’s mnemonic system. We hypothesized that if SF were using a mnemonic system and we presented him with digit sequences that he could not code with his mnemonic system, then his performance would decline. We therefore presented SF with digit sequences that could not be coded with running times or easy numerical patterns. At that time, SF had not yet invented other categories for digit sequences that were nontimes. As expected, SF’s performance dropped about 20% from his normal average of 16 digits. 
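To make the idea of explicit coding rules concrete, the following Python sketch assigns a digit group to a category using only the constraints mentioned in the text: a group can be a running time only if its second digit could be a tens-of-seconds digit, years begin with 19, and everything else falls back to ages. It is a deliberately simplified assumption for illustration, not SF's actual rule set and not the authors' simulation model; the function name `code_group` and the order in which the categories are tried are hypothetical, and four-digit times read as minutes and seconds (such as 13:20) are left out.

```python
def code_group(digits: str) -> str:
    """Assign a three- or four-digit group to a hypothetical mnemonic category."""
    if not digits.isdigit() or len(digits) not in (3, 4):
        raise ValueError("expected a group of three or four digits")

    # A group can be read as minutes:seconds only if the tens-of-seconds
    # digit (the second digit) is 0-5.
    readable_as_time = int(digits[1]) <= 5

    if len(digits) == 3:
        if readable_as_time:
            return f"time {digits[0]}:{digits[1:]}"
        return f"age {digits[:2]}.{digits[2]}"

    if readable_as_time:
        return f"time {digits[0]}:{digits[1:3]}.{digits[3]}"
    if digits.startswith("19"):
        return f"year {digits}"
    return f"ages {digits[:2]} and {digits[2:]}"


for group in ["3492", "896", "1943", "805", "4676"]:
    print(group, "->", code_group(group))
```

Sequences built so that no rule of this kind applies are the sort of material used in the first experiment described above.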
In our second experiment, we presented SF with digit sequences that could all be coded as running times; under these circumstances, SF’s performance jumped by over 25%. We have several other indications that our subjects are using long-term memory in the digit-span task. Perhaps the most straightforward evidence is that both our subjects can recall almost all the digit sequences that they have heard after an hour’s session, although they cannot remember the order. Both our subjects, when asked to recall everything from a session, systematically recall three- and four-digit sequences category by category, starting with the shortest times @-mile times in DD’s system, and 4mile times in SF’s system) and they work their way through to the longest times (marathon), followed by ages, years, and patterns. Furthermore, within each category, they generally also start with the shortest times and work their way through to the longest times. We believe that our subjects


are using a simple generate-and-test strategy to search their semantic memory categories for recently presented times. To give a concrete example of the generate-and-test strategy in another domain, suppose you asked subjects to name all the states in the Union that begin with the letter M. One common strategy is to generate initial consonant-vowel sounds beginning with /m/, systematically working through all the vowel sounds, and see if any states come to mind. By “come to mind” we mean that a retrieval cue is sufficiently similar to a node in long-term memory to cause its activation. In the subsequent recall task of our experiment, we believe that our subjects systematically think of running times within small ranges, such as those described in Table II, and if any such traces have been generated recently, there is a high probability that they will be reactivated. Figure 2 shows the average percentage of items recalled by each of our subjects as a function of practice. Although we did not think of running this experiment until several weeks of practice had elapsed, we suppose that our two subjects were like other naive subjects in the beginning,

Fig. 2. Average percentage of after-session recall for SF and DD as a function of practice. (Horizontal axis: practice in 5-day blocks.)

which is to say that virtually nothing is recalled from a digit-span task after an hour’s session. With practice, however, subsequent recall gradually approached 90% over the 200-300 hour range we studied. In another experiment (after about 4 months of practice), we tested SF’s recognition memory for digit sequences because recognition memory is a much more sensitive measure of retention than recall. On that occasion, SF not only recognized perfectly three- and four-digit sequences from the same day, he also showed substantial recognition of sequences from the same week. In another experiment (after about 4 months of practice), after an hour’s session we presented SF with three- and four-digit sequences but with the last digit missing, and he was asked to name the last digit. SF was able to recall the last digit 67% of the time after 4 months of practice; after 250 hours of practice, SF was virtually perfect at naming the last digit of a probe. Finally, we ran an extended recall session after Day 125 (Williams, 1976). At that time, SF was normally recalling about 80% of digit sequences from the session, and he generally took about 5 min to do it. We asked SF to try harder and keep trying until he could recall all the digit sequences from the session. After about an hour of extended recall, SF had recalled all but one four-digit sequence from the session. Every time we have asked for extended recall since then, SF has shown virtually perfect recall. We recently ran DD on extended recall after Session 286 and he too had virtually perfect recall (97%). Up to this point it seems clear that our subjects are making extensive use of semantic memory. We next address a question of theoretical importance concerning the role of short-term memory in this task.

2. Short-Term Memory

How much information is being processed in short-term memory? Has the extensive practice produced an increase in the capacity of short-term memory? In one experiment, we attempted to determine how much information is in short-term memory by asking SF. In this experiment, we interrupted SF at some random point during a trial while he was being presented with digits, and we asked for an immediate verbal protocol. We wanted to know what SF’s running short-term memory load was and how far behind the spoken sequence he lagged. That is, how many uncoded digits and how many coded groups are kept in short-term memory? From SF’s verbal reports, we found that he was actively coding the previous group of three or four digits while the digits for the current group were still coming in, a lag of about 4 to 7 sec in time. DD’s verbal reports show a similar pattern, although he reports more information about numerical


patterns within groups and semantic patterns between groups. For example, typical relations noticed by DD, given the sequence 415527 are “a four-fifteen mile time with a repeating digit for the decimal; the time was run by a twenty-seven year-old man.” The interesting fact from both subjects’ protocols is that very little except the most recent few seconds are in short-term memory at any moment in time. We conclude that the contents of short-term memory include: (1) the most recent one, two, or three uncoded digits; (2) the previous group of three or four digits; and (3) all the semantic information associated with the mnemonic coding of the previous group. In a series of rehearsal suppression experiments, we wanted to see how much of the digit series was retained if the rehearsal interval between presentation and recall were disrupted. In one experiment, immediately after the list was presented, SF recited the alphabet as quickly as possible for 20 sec before recall. This procedure resulted in the loss of only the rehearsal buffer at the end, that is, the last group of three to five digits at the end of the list. In two other experiments, we suppressed visual rehearsal by having SF either copy or rotate and copy geometric shapes for 20 sec in between presentation and recall. This procedure has been shown by Charness (1976) to interfere with short-term visual retention. However, in the digit-span task, this visual suppression procedure had no effect on performance. Two further experiments were designed to interfere with short-term memory processes during the presentation of digits. In one experiment, we introduced a concurrent chanting task (“Hya-Hya”) that has been used by Baddeley and his associates to suppress short-term memory (Baddeley & Hitch, 1974). In this task, SF said “Hya” after each presented digit. 
This procedure produced no decrement, and SF reported that he organized the chanting in a different phenomenal (spatial) location than his perception and coding of digits. In the second experiment, we produced a very substantial amount of interference with a concurrent shadowing task. We presented SF with a random letter of the alphabet between each digit-group boundary (every third or fourth digit), and his task was to say the presented letter as soon as he heard it. One experimenter read digits to SF at the rate of 1 digit per sec, and the other experimenter read a letter at the end of each group. Unlike the concurrent chanting task, this procedure produced a 35% drop in performance, even though there was only 1/4 to 1/3 as much verbalization required by the subject. It appears that the concurrent chanting task does not interfere with the phonemic short-term memory buffer, as Baddeley (1981) has also recently concluded. However, we believe that the shadowing task interferes with SF’s normal strategy of lagging behind the input of digits and using the


phonemic short-term memory buffer as a temporary storage for the incoming group while processing semantically the immediately preceding group. Finally, other evidence suggests that short-term memory capacity did not increase with practice. (1) SF’s and DD’s mnemonically coded groups were virtually always three and four digits. (2) Their rehearsal group virtually never exceeded six digits. (3) In their hierarchical organization of digit groups (to be described later), SF and DD never grouped together more than three or four digit groups. (4) There was no increase in SF’s or DD’s consonant letter span with practice on digits. (5) Without a single exception in the literature, expert mental calculators and other memory experts have digit groups of three to five digits (Hunter, 1962; Mitchell, 1907; Muller, 1911). These many converging lines of evidence led Chase and Ericsson (1981) to conclude that the reliable capacity of short-term memory is 3 or 4 units, independent of practice. The usual measure of short-term memory, the span, is the length of list that can be reported 50% of the time. However, the optimum group size for digits is three or four digits (Wickelgren, 1964), the running memory span is only about three digits, and long-term memory groups are also three or four items (Broadbent, 1975). Thus, the reliable capacity of short-term memory (the amount of material available almost all the time) is closer to three or four symbols. In speeded skills, three or four symbols is a more realistic estimate of short-term memory capacity. In the digit-span task, the evidence seems to uniformly suggest that only a very small portion of the list of digits is in short-term memory at any point in time. During presentation, only a few seconds’ worth of material is in short-term memory, and after presentation, only the last group of three to six digits is rehearsed. Almost everything seems to be mnemonically coded in long-term memory. 
This leads to our next problem: If these digit groups are in long-term memory, how do subjects retrieve them?

3. The Retrieval System

The simple model of retrieval in skilled memory proposed by Chase and Simon (1973a,b) is clearly inadequate to explain digit-span performance by our experts. They proposed that retrieval cues for chess patterns are stored in short-term memory and then used at recall to retrieve items from long-term memory. First, the rehearsal suppression experiments showed that very little coded information is retained in short-term memory. Second, both SF and DD recall too much (22 digit groups for SF and


19 digit groups for DD). Third, we ran a subject who used this simple strategy, and her digit span reached an asymptote of about 18 digits, or four mnemonically coded groups of digits. This subject developed a mnemonic system based on days, dates, and times of day (e.g., 9365342 = “September third, 1965, at 3:42 P.M.”). This subject never developed a retrieval system, and she tried to hold the retrieval cues for these mnemonic codes in short-term memory. Her performance improved about as rapidly as SF’s and DD’s in the beginning, but she could never improve her performance above four mnemonic groups and she eventually quit the experiment from loss of motivation after about 100 hours. We have several reasons for proposing that our subjects developed what we have termed a “retrieval structure.” A retrieval structure is a long-term memory structure for indexing material in long-term memory. It can be used to store and order information, but is more versatile because it can allow direct retrieval of any identifiable location. A good example of a retrieval structure is the mnemonic system known as the Method of Loci because it provides a mechanism for retrieving a series of concrete items associated with identifiable locations via interactive images. We suggest that our subjects have developed a retrieval structure, analogous in some respects to the Method of Loci, for retrieving mnemonically coded digit groups in the correct order. The verbal protocols are very revealing about the retrieval structures. Before every trial, SF and DD both explicitly decide how they are going to group the digits. Figure 3 illustrates the development of SF’s retrieval structure, as revealed in his verbal protocols. SF started out by relying only on the short-term phonemic buffer (R) as his retrieval mechanism until he hit on the idea of setting aside the initial groups of digits and holding only the last few digits in the rehearsal buffer. 
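A retrieval structure in the sense just defined can be pictured as a set of ordered locations, fixed before the trial, to which coded groups are attached, so that any group can be fetched by its location directly as well as recalled in order. The Python sketch below is an illustration only; the class `RetrievalStructure`, its method names, and the example codes attached to each location are hypothetical and do not reproduce the authors' simulation.

```python
from typing import Dict, List, Optional


class RetrievalStructure:
    """Ordered, named locations that index mnemonically coded groups."""

    def __init__(self, locations: List[str]) -> None:
        self.locations = list(locations)        # fixed order, decided before the trial
        self.contents: Dict[str, str] = {}      # location -> coded digit group

    def store(self, location: str, coded_group: str) -> None:
        if location not in self.locations:
            raise KeyError(f"unknown location: {location}")
        self.contents[location] = coded_group

    def retrieve(self, location: str) -> Optional[str]:
        """Direct access to one location, with no search through the others."""
        return self.contents.get(location)

    def recall_in_order(self) -> List[str]:
        """Serial recall: visit the locations in their fixed order."""
        return [self.contents[loc] for loc in self.locations if loc in self.contents]


structure = RetrievalStructure(["first", "second", "third"])
structure.store("second", "896: very old man")
structure.store("first", "3492: near world-record mile")
structure.store("third", "1943: near the end of World War II")

print(structure.retrieve("second"))
print(structure.recall_in_order())
```

The point of the design is that order is carried by the locations rather than by the items, so groups can be stored in any sequence and still come back in list order.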
This strategy is fairly common among subjects, however, and it is not unique to our skilled subjects. What makes the retrieval structure so powerful is that SF was able to store his mnemonically coded digit groups in these locations. Without the mnemonic, it is not clear how subjects would be able to associate many distinctive items with the different locations. Even so, SF experienced a great deal of difficulty keeping the order straight for more than three or four groups of digits. After about a month of practice, SF introduced a very important innovation in his retrieval structure: hierarchical organization. He began to separate groups of 4 digits followed by groups of 3 digits. We have termed these clusters of groups “supergroups.” Finally, when these supergroups became too large (more than four or five groups), SF introduced another level in his hierarchy (Day 109), and his performance improved continuously thereafter. DD’s hierarchical organization is very

Fig. 3. Development of SF’s retrieval structure. On the left is shown the session number in which the retrieval structure was first reported, and on the right is shown the range of digits over which the retrieval structure works. Squares linked together correspond to supergroups, and inside each square is the number of digits corresponding to that group. The circled R corresponds to the rehearsal group of 4 to 6 digits.

similar to SF’s, and Fig. 4 illustrates our best guess as to SF’s grouping structure for 80 digits, and DD’s grouping structure for 68 digits. At their current levels of practice, SF and DD use at least a three-level hierarchy: (1) Digits → Groups, (2) Groups → Supergroups, and (3) Supergroups → Clusters of Supergroups.

Fig. 4. SF’s retrieval structure for 80 digits and DD’s retrieval structure for 68 digits.

In another study, run separately on SF and DD, after an hour’s session we presented our subjects with 3- and 4-digit groups from the session and asked them to recall as much as they could about that group. Subjects invariably recalled the mnemonic code they used and they often recalled

the location of the group within the supergroup. On rare occasions when they were able to recall a preceding or following group, this recall was always associated with some relation between the groups, such as two adjacent 1-mile times. With the exception of this type of episodic information, retrieval of these mnemonic codes seems to be achieved via these hierarchically organized retrieval structures rather than through direct associations between digit groups. Another interesting aspect of our subjects is that they generally spend between 30 sec and 2 min rehearsing the list before they recall it, and their rehearsal pattern is revealing about the underlying retrieval structure. According to their verbal reports, both subjects rehearse the digit sequence in reverse, supergroup by supergroup, except the first supergroup. That is, both subjects rehearse the last supergroup, then the next-to-last supergroup, and so on, until they come to the first supergroup. Instead of rehearsing this initial supergroup, the subjects then go directly to the beginning of the list and start their recall. Within supergroups, SF generally rehearses in forward order and DD rehearses in reverse order. The interesting thing about these rehearsal patterns is that rehearsal is organized in supergroups. Besides the verbal protocols, there is a great deal of additional evidence that our subjects use retrieval structures. The best evidence comes from the speech patterns during recall. In the literature, pauses, intonation, and stress patterns are well-known indicators of linguistic structure (Halliday, 1967; Pike, 1945). The speech patterns of SF and DD typically follow the same pattern. Digit groups are recalled rapidly at a normal rate of speech (approximately 3 digits per sec), with pauses between groups (about 2 sec between groups, on average, with longer pauses when subjects experience difficulty remembering). At the end of a supergroup, however, there is falling intonation, generally followed by a longer pause. In another study, we conducted a memory search experiment with SF after about 100 days of practice. We presented SF with a list of digits, but, instead of asking for recall of the sequence, we presented SF with a group of digits from the list and asked him to name the preceding or following group of digits. It took SF more than twice as long to name the groups preceding or following the probe if he had to cross a hierarchical boundary (10.1 vs 4.4 sec). Up to this point, we have described the two most important mechanisms underlying our subjects’ memory performance: the mnemonic system and the retrieval structure. However, these mechanisms are not sufficient to fully explain the performance of our subjects. These systems were essentially completed within the first 100 hours of practice for both subjects. Yet the performance of both subjects showed continuous improvements through 250 hours of practice, and there is no sign of a limit. There must be another mechanism.

4. Encoding and Retrieval Speed

This aspect of memory skill has been the most elusive mechanism to track down. For one thing, our subjects’ verbal reports are of little use in analyzing changes in the speed of mental operations. For another, we have not been able to obtain a great amount of data supporting our theory of speedup. Nevertheless, we believe that the little evidence we have suggests that speedup is an important mechanism in skill acquisition in the memory span task. We have recorded latency data on both SF and DD in a self-paced presentation task several times over the past 3 years. In this task, we presented subjects with one digit at a time on a computer-controlled video display, the subject controlled the rate at which he received digits by pressing a button each time he wanted a digit, and we measured the time between button pushes. We also systematically manipulated the size of the list.
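The quantity of interest in this self-paced task is the pause at group boundaries. Assuming the raw record is a list of button-press times and that the subject's grouping of the list is known (for example, from the verbal report), the between-group pauses could be extracted roughly as follows. The function name `intergroup_pauses`, the timestamps, and the 3-3-4 grouping are invented for illustration; the chapter does not describe the authors' analysis software.

```python
from typing import List, Sequence


def intergroup_pauses(press_times: Sequence[float],
                      group_sizes: Sequence[int]) -> List[float]:
    """Return the pause preceding the first digit of each group after the first.

    press_times[i] is the time (sec) at which the subject requested digit i.
    group_sizes gives the number of digits in each successive group.
    """
    if sum(group_sizes) != len(press_times):
        raise ValueError("group sizes must account for every digit")

    pauses = []
    first_index = 0
    for size in group_sizes[:-1]:
        first_index += size  # index of the next group's first digit
        pauses.append(press_times[first_index] - press_times[first_index - 1])
    return pauses


# Hypothetical record for a 10-digit list grouped 3-3-4.
times = [0.0, 0.5, 1.0, 4.0, 4.5, 5.0, 9.0, 9.5, 10.0, 10.5]
print(intergroup_pauses(times, [3, 3, 4]))
```

Averaging such pauses separately for each list length and each stage of practice gives the kind of summary that a plot of intergroup time against list size requires.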


Figure 5 shows these latency data for both subjects as a function of the size of the list and practice. As one might expect, pauses tend to occur between groups, so we have displayed only the time between groups in Fig. 5. For both subjects, pause time increases with the size of the list. This result has been known for many years (Woodworth, 1938, p. 21), namely, there is more learning overhead for larger lists. The practice data are not as clear-cut for DD as for SF. Over a 2-year period, SF’s coding time has shown a very substantial decrease, and the decrease interacted with the size of the list so that there are bigger practice effects for larger lists. In SF’s case, the practice effect is so pronounced that there seems to be very little learning overhead for the larger lists after a couple of hundred hours of practice. In another experiment, we have several direct comparisons between our subjects and other memory experts in the literature on the speed to encode a 50-digit matrix (from Luria, 1968). Subjects in this task are shown a 50-digit matrix of 13 rows and 4 columns, and timed while they study it. Subjects are then timed while they recall the matrix, and then they are timed while they recall various subparts of the matrix (rows, columns, diagonals, and so on). These data are shown in Table III for DD, for SF (two trials spaced 1 year apart), for two well-known mnemonists in the literature (Hunt & Love, 1972; Luria, 1968), for our mental calculation expert AB, and for four unskilled subjects. A close examination of Table III reveals several interesting results. First, there is an enormous difference between memory experts and unskilled subjects in the time needed to memorize the list. Second, there is a large practice effect on learning time for SF. After a year’s practice, SF

Fig. 5. Intergroup times for SF and DD as a function of list size and practice. The dependent variable is the time that subjects paused between groups when they controlled the visual presentation of digits. (Horizontal axis: number of digits; curves are labeled by day of practice, e.g., Day 73 and Day 266.)

TABLE III
STUDY AND RECALL TIME ON LURIA’S (1968) 50-DIGIT MATRIX (a)

Skilled subjects

                      SF    SF (1 year later)    DD    AB    Luria’s S    Hunt and Love’s VP    Mean    SD
Study time           187           81           193   222       180              390             209    101
Recall time
  Entire matrix       43           57            38    51        40               42              45      7.3
  Third column        41           58            68    56        80               58              60     13.0
  Second column       41           46            28    40        25               39              36      8.2
  Second column up    47           30            27    54        30               40              38     10.9
  Zigzag              64           38            41    52        35                -              46     11.9

Unskilled subjects

                      S1     S2     S3     S4    Mean    SD
Study time           798   1240    685    715     860    258
Recall time
  Entire matrix       77     95     42     51      66     24
  Third column       125    117     42     78      90     38
  Second column       81    110     31     40      66     37
  Second column up   112     83     46     63      76     28
  Zigzag             123    107     78     94     101     19

(a) In seconds.
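As a quick arithmetic check on Table III, the group means and standard deviations for study time can be recomputed from the individual entries. The following is a minimal Python sketch, assuming that the tabled SD is the sample (n - 1) standard deviation; the variable and function names are chosen only for illustration.

```python
from statistics import mean, stdev

# Study times in seconds, copied from the "Study time" row of Table III.
skilled_study_times = [187, 81, 193, 222, 180, 390]
unskilled_study_times = [798, 1240, 685, 715]

def summarize(times):
    """Return (mean, sample SD), each rounded to the nearest second."""
    return round(mean(times)), round(stdev(times))

skilled_summary = summarize(skilled_study_times)
unskilled_summary = summarize(unskilled_study_times)

print(skilled_summary)    # (209, 101)
print(unskilled_summary)  # (860, 258)
```

Under that assumption the recomputed values agree with the tabled means and SDs, and the unskilled mean is roughly four times the skilled mean.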

Skill and Working Memory


was substantially faster than the other subjects on this task. Finally, there was very little difference in retrieval times among any of the subjects. This last result is unexpected, but interesting because it suggests that retrieval time depends upon how well learned the matrix is rather than on memory skill per se. That is, unskilled subjects can achieve almost as rapid retrieval as memory experts, provided that the former take the time to learn the digit matrix as well as the memory experts. In speeded tasks, of course, we would expect a deterioration in retrieval speed for the unskilled subjects because learning time would be severely limited.

It is possible to compare SF’s learning time on the 50-digit matrix with Ruckle’s data (Fig. 6), reported by Muller (1911). As far as we know, Ruckle’s data represent the fastest learning times ever reported in the literature for digits (Woodworth, 1938; p. 21), and SF’s times are comparable after 2 years of practice. The data of Fig. 6 are only for visually presented lists; Ruckle’s auditory digit span was only about 18 digits.

We mention one final experiment on encoding times. After about 50 hours of practice, we presented SF with digits at a rapid rate (3 digits/sec) and found that SF could not code digits presented at this rate and his

Fig. 6. A comparison between SF (○) and Professor Ruckle (□). Shown is the time required to memorize visually presented digits as a function of the number of digits. SF’s data are taken from the experiment on the Luria matrices (Table III) and Ruckle’s data are derived from Muller (1911).


William G. Chase and K. Anders Ericsson

performance dropped back to 8 or 9 digits. However, after 250 hours of practice, SF and DD were both able to code digits at these fast rates. They were both able to code one or two groups of 3 digits each and hold about 5 digits in their rehearsal buffer, to achieve a span of about 11 digits.

This concludes our review of the major mechanisms underlying skilled performance in the memory span task. We next present our current ideas for a theory of skilled memory, along with some additional theoretical issues and more data of interest.

III. A Theory of Skilled Memory

We perceive the central issues of a theory of skilled memory to be the following: First, what is the structure of long-term memory? Second, what storage and retrieval mechanisms operate on this semantic memory to produce skilled memory performance? Finally, what role do retrieval structures play in skilled memory performance; and, in general, what is the role of working memory in skilled performance?

A. THE STRUCTURE OF LONG-TERM MEMORY

1. Semantic Memory

We assume that our subjects’ knowledge of running times is stored as a hierarchical structure, which can be represented as a discrimination tree. In Fig. 7, we illustrate the portion of DD’s semantic network outlined in Table II. We assume that as digits are presented to DD, he searches his discrimination tree for these categories. When he searches to a terminal node, we assume that recognition has taken place and a link is established between the terminal node in the semantic network and the episodic trace of the current digit group in short-term memory.

Several aspects of our subjects’ behavior are consistent with this assumed structure. First, it explains the systematic generate-and-test characteristic of our subjects’ recall after the session. We assume that they simply search through this structure, activating each terminal node in turn, and from a terminal node they then activate any links between that terminal node and associated traces and report these traces. Second, there is evidence in the verbal protocols that subjects search a hierarchical structure. When we stopped subjects in the middle of a trial and asked for the contents of short-term memory (reported earlier), our subjects reported that when they are being presented with digits, they first notice the major category before making any finer categorizations. For example, given 357, they first notice that it is a 1-mile time before they notice that it is near the 4-min barrier. DD, in fact, explicitly reported that


Fig. 7. DD’s semantic network of 1-mile times over the range of 346 to 420, derived from Table II.

he waits until he hears the first two digits before he thinks about the category because one digit is too ambiguous. In our model of the semantic structure, two digits are sufficient to activate a nonterminal node in the tree, whereas one digit is not. After hearing two digits, DD says that he then makes a category decision (age, mile-time, etc.) and then the third digit is used to find a more meaningful category if possible.

Finally, we report some latency data on SF that support our hierarchical model. In this experiment, after a session SF was presented visually with a digit group with one digit missing, and the task was to name the missing digit. Figure 8 shows that both the mean latency and the variance decreased monotonically with the position of the missing digit in the probe from first to last position, corresponding in our model to depth in the hierarchy. Further, the mean latencies decreased over a fairly large range (approximately 8 sec to 1 sec); a mean latency of 8 sec indicates a considerable amount of memory search. SF’s verbal protocols indicated that the earlier the missing digit is in the probe, the more extensive is his memory search. When the missing digit was in the third and fourth positions, SF often reported having direct access to the memory trace without any conscious awareness of search.
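The digit-by-digit descent described above can be sketched as a small discrimination tree. The Python fragment below is only an illustration under stated assumptions: three-digit groups are read as minutes and seconds (357 as 3:57), the range 346 to 420 is read as 3:46 to 4:20, and every category label and cut point other than the 4-min barrier example is hypothetical rather than taken from DD's actual network.

```python
# Hypothetical fragment of a discrimination tree for three-digit groups read
# as 1-mile running times (e.g. "357" -> 3:57). All boundaries are illustrative.
MILE_RANGE = (346, 420)  # assumed reading of the range covered by Fig. 7

FINE_CATEGORIES = [  # (low, high, label); labels and cut points are assumed
    (346, 354, "VERY FAST MILE"),
    (355, 359, "NEAR 4-MIN BARRIER"),
    (400, 420, "JUST OVER 4 MIN"),
]

def major_category(prefix):
    """Return a nonterminal node once two digits are known, else None."""
    if len(prefix) < 2:
        return None  # one digit is too ambiguous to categorize
    if int(prefix[1]) > 5:
        return None  # seconds cannot start with 6-9, so not a running time
    lo, hi = MILE_RANGE
    # Could any completion of this two-digit prefix fall in the mile range?
    first, last = int(prefix[:2]) * 10, int(prefix[:2]) * 10 + 9
    return "1-MILE TIME" if first <= hi and last >= lo else None

def classify(group):
    """Descend the tree digit by digit and return the path of nodes reached."""
    path = []
    major = major_category(group[:2])
    if major is None:
        return path
    path.append(major)
    if len(group) >= 3:
        value = int(group[:3])
        for lo, hi, label in FINE_CATEGORIES:
            if lo <= value <= hi:
                path.append(label)  # third digit selects a finer category
                break
    return path

print(classify("3"))    # []
print(classify("35"))   # ['1-MILE TIME']
print(classify("357"))  # ['1-MILE TIME', 'NEAR 4-MIN BARRIER']
```

The design point is simply that the major category becomes available after two digits and the finer category only after the third, matching the order in which the subjects report noticing them.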


Fig. 8. Latency to name the missing digit as a function of the location of the missing digit in the probe. The location of the missing digit is indicated at the bottom of the figure. Open squares represent three-digit probes, and darkened squares represent four-digit probes. Brackets represent ±1 standard deviation, based on 10 or fewer observations.

2. The Retrieval Structures

The second type of long-term memory structure that is relevant to skilled performance in the digit-span task is the retrieval structure. We assume that SF’s and DD’s retrieval structures have the hierarchical


forms portrayed in Fig. 4, and that they can also be systematically searched. In the beginning, we assume that the nodes in this retrieval structure are minimally differentiated, but with practice, each node takes on a distinctive set of features. That is, we assume that it takes practice, extensive practice, to use this retrieval system, just like any mnemonic system, and that practice involves learning to generate a set of distinctive features to differentiate one location from another. As with any mnemonic system, the more distinctive the better. One important issue concerns how versatile are these retrieval locations: What exactly can be stored in these locations? We had assumed that these locations were specific to abstract numerical concepts-running times, ages, years, and patterns for our subjects-because our subjects’ letter span did not improve along with their digit span, although we did not give our subjects much practice with letters. In another experiment, SF was able to store and recall perfectly a list of 14 names using his retrieval structure, so we do have some tentative evidence that these retrieval structures can store information other than digits. Storage locations in mnemonic systems have a similar limitation, but they seem more versatile. For example, the locations in the Method of Loci are specialized for concrete items for which a visual image can be generated. Rhymes are specialized for phonemically similar patterns. As we will discuss later, we think of a retrieval structure as a featural description of a location that is generated during encoding of digit groups, and these features are stored as part of the memory trace of a digit group. Then, at recall, these features will serve as a mechanism for activating the trace, when the featural description is attended to. 
The idea of a retrieval structure as a set of features stored with the memory trace, we believe, explains a great deal about the types of confusion errors that we have observed (to be described later).

3. Context

Finally, a third type of long-term memory structure is relevant to the digit-span task: the context. We think that it is necessary to suppose that attended information is associated with the current context-the day, the trial number, the list length, the room and building, and probably much more. Furthermore, we think that attended information is automatically bound to the current context, unlike the retrieval structure, which requires control processes to bind information. We think that it is necessary to postulate the existence of current context because, otherwise, how is information not in short-term memory normally retrieved? That is, the everyday retrieval structure or working memory that people use all the


time to retrieve recent facts not in short-term memory but relevant to the ongoing task is the context. We do not have any concrete ideas about the form of the context, but it is probably not unreasonable to suppose that there is some type of hierarchical knowledge structure, analogous to a script, to which the current events are bound in some stereotypic fashion. In any case, we assume that in the digit-span task, memory traces are associated with the current context.

B. SHORT-TERM MEMORY AND ATTENTION

We simply assume that short-term memory is the set of knowledge structures that are currently active. Thus, short-term memory can contain graphic, phonemic, and semantic features. The rehearsal buffer, we assume, is a control structure, or retrieval structure if you will, for storing the order of a set of phonemic or articulatory features. We assume that for some basic unspecified reason, there is a limit to the number of knowledge structures that can be active at any moment.

Attention refers to a property of the information-processing system which limits processing. The contents of attention refer to that subset of information in short-term memory that is attended to, and by “attended to,” we mean that this information serves as input to a process that requires attention. There is a class of processes that interfere with each other, that slow each other down, that compete with each other for sensory input channels, for short-term memory space, and so on. These processes are said to be attention demanding or controlled. Without getting involved in a discussion of the nature of attention, we will simply state that short-term memory places a limit on the number of knowledge structures that can be held simultaneously as input to a control process. As we discussed earlier, this limit seems to be about three or four symbols for the chunking process. We will equate our binding operation in long-term memory with attention.

Our short-term memory and attention assumptions are of little consequence for the digit-span task, except that only one or two digit groups and their associated semantic information are in short-term memory at any point in time. The interesting assumptions concern storage and retrieval operations.

C. MEMORY OPERATIONS

1. Storage

Our storage assumption is very simple: Memory traces attended to at the same time as an active long-term memory node are bound to that node, provided that they fit the node’s range. For example, in Fig. 7,


DD’s node for a GOOD COLLEGE MILE TIME will fit any time from 4:03 to 4:12; this node is not a good mnemonic for any sequence of digits, but only for a sequence of digits in the range specified. We adopt a featural representation of binding in which the memory trace and the semantic features activated from long-term memory are chunked together by virtue of being attended together. In the digit-span task, we assume that a digit group is bound to three long-term knowledge structures: the mnemonic association, the retrieval structure, and the current context.

To take a concrete example, what happens when DD hears the digit string 4054? First, as the digit string is being perceived, he actively attends to the magnitude of the digits in order to classify it in his mnemonic system. As he perceives the first two digits, they are sufficient to activate two features in semantic memory corresponding to RUNNING TIME and 1 MILE. When he perceives the third digit, it is sufficient to activate the semantic feature of GOOD COLLEGE TIME, and when he hears the fourth digit, he notices that it is the same as the first digit, which activates a feature corresponding to SAME AS FIRST DIGIT. (We shall describe how our subjects parse decimals more fully later, when we discuss discrimination.) This set of features is simultaneously attended to along with the trace of 4054, and a new memory chunk is formed. The current context and the location in the retrieval structure are also bound to the memory trace. The subject, as he is decoding the mnemonic code, also simultaneously thinks of the location within the retrieval structure and the current context, and featural descriptions of these long-term memory structures are activated and attended to simultaneously along with the trace and its mnemonic code.
For example, suppose that DD notices that the previous group was also a 1-mile time, that it was faster, that these represent first and second place, respectively, in some imaginary race, and further that he had a similar time, 406.2, on the previous trial (a typical report). This information is also included as part of the context. Figure 9 depicts the final memory trace for 4054. We believe that this representation is consistent with a large number of observations. The additional links in Fig. 9 are included to illustrate the variety of associations that we have observed. The link between the location and the semantic code reflects the fact that subjects very often know any of several semantic features without knowing the actual digits. In fact, subjects’ verbal protocols indicate that the semantic code is invariably retrieved first, suggesting that the major link between the location and the trace is through the semantic code. However, the direct link between location and trace is necessary because subjects are able to recall digit groups without semantic codes. The links between context and location and between context and the


Fig. 9. DD’s memory trace for the 1-mile time 4:05.4. Stored with the trace are semantic features describing the trace as a running time, features describing its location in the retrieval structure, and features corresponding to the current context. Included in the context are local features describing the decimal point as well as noticed relationships between the trace and other nearby digit groups, and global features describing noticed relationships between the trace and earlier digit groups, the trial number, and other global contextual features.

semantic code are there because the local context can be used to disambiguate either or both. The dotted lines indicate that the context contains information about other digit groups; the existence of these links is clearly seen in the clustering that occurs in the aftersession recall. The direct link between the context and the trace is there because people can still recall small recently heard groups of digits, even though they are not in short-term memory, provided that there have not been too many such sequences. In our theory, context is virtually useless because it is not unique: If several digit sequences have been linked to the same context, then there are too many links to achieve activation.

Finally, one might ask why it is necessary to assume a trace at all. Why isn’t memory retrieval a reconstructive process in which the set of features represents a sufficient code to reconstruct the event (Neisser, 1967)? The answer is that the semantic code is not sufficient to uniquely specify the event. In our example, GOOD COLLEGE TIME only specifies a range; in DD’s semantic network, 100 possible times (including decimal times) could fit this category. What the semantic code does is narrow the search in long-term memory for the memory trace. A good mnemonic should narrow the search to a single trace. But there still must be a trace.
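The claim that the semantic code narrows the search without uniquely specifying the trace can be checked by enumeration. The following is a minimal Python sketch, assuming that four-digit groups are read as minutes, seconds, and tenths (4054 as 4:05.4) and taking the 4:03 to 4:12 range for GOOD COLLEGE MILE TIME from the text; the function and variable names are hypothetical.

```python
def fits_good_college_mile(group):
    """True if a four-digit group, read as m:ss.t, lies in 4:03.0 to 4:12.9."""
    minutes, seconds = int(group[0]), int(group[1:3])
    return minutes == 4 and 3 <= seconds <= 12

# Every four-digit group that the semantic category alone would admit.
all_groups = [f"{n:04d}" for n in range(10000)]
candidates = [g for g in all_groups if fits_good_college_mile(g)]

# Adding the decimal feature SAME AS FIRST DIGIT narrows the set further.
same_as_first = [g for g in candidates if g[3] == g[0]]

print(len(candidates))          # 100
print(len(same_as_first))       # 10
print("4054" in same_as_first)  # True
```

Even with both features the code still admits several groups in this toy enumeration, which is in line with the argument that location and context features, and ultimately a stored trace, are needed to pick out the one group that was actually presented.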


Our theory is consistent with two related observations in the digit-span task concerning the retrieval structure: the limited size of supergroups and the hierarchical organization of the retrieval structure. Why should this be true? After all, there doesn’t seem to be any such constraints with other mnemonic systems, such as the Method of Loci. We speculate that with the Method of Loci and other mnemonic systems, the locations are so rich and distinctive that subjects have no trouble differentiating them. However, in the digit-span task, the subjects face the problem of building retrieval structures from nothing but position information. How is the subject to do this? We suppose that the subjects build supergroups by chunking them. That is, at the end of a supergroup, the subject must, according to our encoding assumption, attend to all groups simultaneously in conjunction with the current context. In fact, subjects’ introspections suggest that they are able to attend to only a few semantic features while grouping. Thus, according to our theory, the short-term capacity places a limit on the size of supergroups, and the hierarchical structure occurs because subjects have only enough capacity to group together a few abstract features representing groups, rather than the groups themselves. Another interesting property of the memory representation is redundancy. It is very common in our subjects’ retrospective reports that they notice such things as repeating semantic codes (e.g., two 1-mile times in a row) and many other kinds of relations. These redundant relations are very important to our subjects because they help to disambiguate the memory code and they aid in error recovery. It is very common for our subjects to retrieve only a very few features associated with a trace, and, with a combination of inference and further search, eventually recover an error or retrieve a missing trace. 
Our subjects also are good at judging certainty of their answers, and they can virtually always indicate when a digit group is right or wrong. The redundancy of the memory trace is a possible mechanism for this judged certainty.

Before describing our retrieval assumptions, we should point out that our theory has focused on meaningful associations as the major mechanisms for building long-term memory structures, and we have said nothing about trace strength. This is in contrast to most memory theories, which focus on repetition as the major mechanism, and dwell time in short-term memory as the major determiner of strength (e.g., Anderson & Bower, 1974; Atkinson & Shiffrin, 1968; Raaijmakers & Shiffrin, 1981). We believe that both mechanisms operate, that attention time and number of redundant associations jointly determine the strength of meaningful associations, and this distinction between attention and meaningful associations vs short-term memory occupancy and rote repetition underlies the empirical distinction between elaborative and maintenance rehearsal (Bjork, 1975; Craik & Watkins, 1973). We believe that meaningful associations are much more powerful, useful, and pervasive; and rote rehearsal is the default mechanism that people use when they cannot think of any meaningful associations.

2. Retrieval

The process of retrieval during a trial, we assume, involves attending to a set of features in short-term memory, and this attention process will cause the activation of memory traces in long-term memory which contain the set of features. After a trial, with no information in short-term memory except an index to the current context, recall begins by activating the current context along with the first location of the retrieval structure. This should result in activation of the location information contained in the memory trace. From there, we assume that activation spreads jointly to the trace and to the semantic code, and spreading activation from the semantic code to the trace should normally be sufficient to activate the trace.

In the case of recall after the session, retrieval is achieved by activating links between semantic memory and the trace. However, it is commonly reported by both SF and DD that during a trial when they have trouble remembering a digit group, they use the alternate time-consuming strategy of searching for it in semantic memory. When they do not know the mnemonic category, SF and DD sometimes take several minutes to search the semantic network before they retrieve an item. Figure 10 illustrates the various retrieval routes to the memory trace. It is interesting to compare the retrieval times for semantic memory and

Fig. 10. Schematic representation of retrieval of the memory trace. The trace is accessible through its semantic code, its location in the retrieval structure, and the context.


working memory (i.e., the retrieval structure). In four memory search experiments (after about 100 hours of practice), we timed SF as he responded to a probe after being presented with a sequence of 30 digits. Two of the experiments involved accessing information via semantic memory: (1) Name the last digit of the probe, and (2) point to the location of the probe. In the first experiment, we assume that the first digits of the probe lead SF directly to the appropriate node in semantic memory, and SF uses the features of this node to activate semantic information in the memory trace. In the second experiment, SF is given the probe and must point to the location of the probe in a graphic representation of the retrieval structure. In this case, we assume that the probe activates the memory trace, which in turn activates the features corresponding to its location in the retrieval structure. In both cases, there is only a single direct link to activate, and the average latency was 1.6 sec (SD = .49 sec).

The other two experiments involved searching the retrieval structure for the trace: (1) Name the digit group pointed to in a graphic representation of the retrieval structure, and (2) name the group preceding or following the probe. In the first case, search begins with the retrieval structure, as in a normal recall trial; in the second case, the probe is first used to derive its location information, and from there, the retrieval structure is entered. Unlike the previous two tasks, retrieval is achieved via the retrieval structure. In both these cases, search time was much slower (average = 6.4 sec, SD = 2.9 sec). We interpret these results to mean that direct access in semantic memory is automatic and fast; access in working memory is controlled and relatively slow (Schneider & Shiffrin, 1977).
As a corollary, we assume that the bottleneck in skilled performance is access to working memory, and that practice has its greatest effect on the speed of storage and retrieval operations in working memory.

3. Differentiation

Differentiation refers to processes that produce unique memory traces. We describe two such processes that our subjects use: (1) updating semantic codes and (2) coding the decimal place. According to our theory, mnemonics and meaningful associations derive their power from their ability to narrow the search in long-term memory to a unique memory trace. We have already discussed the role of redundancy in search. We have evidence from our subjects’ protocols that another mechanism is operating, a mechanism we will call updating. The issue concerns what happens when the subject is presented with more


than one digit group within the same mnemonic category. In the example presented earlier, what happens when the subject hears 4054 after hearing 4062 on a previous trial, since they both belong to the same semantic category? If they are not differentiated, then the semantic category will no longer serve as a unique cue to the memory trace. According to our theory, when the subject perceives the current digit group and activates the semantic features for the mnemonic category, this automatically results in the activation of any previous memory traces from the same category, within the same context. Thus, in our example, upon categorizing 4054, 4062 (from the same category) is automatically reactivated, and this information is incorporated in the new memory trace. It is reasonable to suppose that a new hierarchical memory trace is formed from the combined memory traces, including any comparative information between the two traces, such as which is greater in magnitude.

We have some evidence that updating is, in fact, occurring with our subjects. First, updated items are invariably recalled together in the aftersession recall. The average pause times between these items clustered in the output, for a sample of updated items taken from DD’s protocols, was 1.6 sec (SD = .92 sec), compared to 3.2 sec (SD = 3.32 sec) for pause times between nearby items. Second, on several sessions, we asked SF in his verbal reports after each trial to tell us when a digit group had reminded him of an earlier group. Out of a sample of 276 digit groups from two sessions, SF noticed similarities in 47 groups, approximately 17%. Our other subject, DD, reports slightly fewer such instances of updating (about 13%). In one experiment, after a regular practice session of 60 digit groups presented in six sequences, we presented DD with probes with varying degrees of similarity to the groups from the session.
We presented these probes at the usual 1-sec per digit rate and we instructed DD to code them as he would in a normal session, but to indicate immediately when a probe reminded him of an earlier sequence. In this experiment, DD only recognized digit sequences in which the first three digits matched a previous group, and recognition occurred within a second after hearing the third digit. Thus, both subjects appear to be updating their memory traces. Finally, the speed of the process, ranging from 1 sec or less to as much as 2 sec, is suggestive of the fast-access automatic retrieval from semantic memory that we described earlier.

Both of our subjects report that they code decimal digits in terms of numerical patterns, although DD’s system is much more elaborate than SF’s. Figure 11, derived from SF’s verbal protocols, illustrates his coding system for decimals, which is basically designed around reference points. This simple system contains a total of only ten parsing rules, or ten features, which SF uses to code the decimal.
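The updating mechanism described above can be sketched in a few lines of Python. This is an illustrative toy structure, not the model itself: the class name, the dictionary layout, the context label, and the rule that a shared category and context define a match are all assumptions drawn from the verbal description.

```python
from collections import defaultdict

class TraceStore:
    """Toy store in which same-category traces are reactivated on encoding."""

    def __init__(self):
        # (context, category) -> digit groups already stored under that pair
        self._by_category = defaultdict(list)
        # digit group -> earlier groups incorporated into its trace
        self.links = {}

    def encode(self, group, category, context):
        """Store a group and return the earlier traces it reactivated."""
        key = (context, category)
        reminded_of = list(self._by_category[key])  # reactivated earlier traces
        self.links[group] = reminded_of             # bound into the new trace
        self._by_category[key].append(group)
        return reminded_of

store = TraceStore()
store.encode("4062", "GOOD COLLEGE MILE TIME", "session-1")
reactivated = store.encode("4054", "GOOD COLLEGE MILE TIME", "session-1")
print(reactivated)  # ['4062']
```

In this sketch, the link from 4054 back to 4062 plays the role of the comparative information that keeps two traces from the same category distinct; a group encoded under a different context reactivates nothing.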


Fig. 11. SF’s system for coding digits, derived from his verbal protocols.

In contrast, DD’s system is much more complicated. In one experiment, we asked DD to sort 181 running times (printed on cards) in the range 3400 to 4100 into equivalent piles. Within semantic categories, we counted 29 rules, all based on numerical relations, that DD used to code the decimal. Only 4 of these rules were similar to SF’s in that they assigned a feature to the decimal, based only on the magnitude of the decimal. These rules were, using DD’s terminology: (1) 0 = “flat,” (2) 5 = “half,” (3) 8 or 9 = “almost,” and (4) 1 or 2 = “just above.” The rest of the rules all involved numerical relations between the last digit and the preceding digits. These include such things as the last digit is the same as one of the preceding digits; the last digit is above the preceding digit by 1, 2, or 3; the digits are all odd or all even; and the last digit is some numerical combination of some of the previous digits. Furthermore, there is a rule hierarchy because the rules overlap. DD’s system is a very complex but rule-governed system for coding the last digit in terms of numerical patterns. The system is designed to discover a feature that can be used to uniquely code the decimal point relative to the semantically coded part of the trace.

Both SF’s and DD’s digit-coding systems seem to work extremely well. From an analysis of the errors, we found that the chance of making an error on the decimal, given that the semantic part of the trace is reported correctly, was less than 1% for both subjects. This error rate is quite low compared to the unconditional error rate per digit group of about 4%.

The two processes described in this section, we speculate, are instances of more general processes for differentiating semantic codes. Updating probably occurs all the time during normal cognitive processing; whenever more than one instance of an abstract category is noticed, it is important to keep each separate. The digit-coding system, on the other hand, is probably an instance of the more general process of generating elaborated, redundant memory codes in order to facilitate retrieval and disambiguation of memory traces.

D. INTERFERENCE

So far, we have said little about mechanisms of forgetting. However, we have some data on interference effects, most of which can be described within the theoretical framework we have outlined. Perhaps the most interesting data we have concern the buildup of proactive interference within a session. Figure 12 shows, for each subject over the last 100 sessions, the probability of recalling a sequence correctly as a function of the trial number within the session. Since we are using the up-and-down method, the average percentage correct is 50%. Both subjects have a substantial increase in the error rate as the session

Fig. 12. Percentage of correct recall for a trial as a function of the trial number, for both SF (□-□) and DD (△-△) for the last 100 sessions. The standard error for these percentages, based on 100 observations, is about 5%.

progresses. Further, for both subjects there is also a substantial increase in the rehearsal interval as the session progresses. Figure 13 shows the average latency to begin recall as a function of trial number (for correct trials only, although the data are similar but slightly longer for incorrect trials).

Fig. 13. Time between the last presented digit and the first recalled digit as a function of trial number for each subject. For the eight data points above, the average SD, based on the last 100 sessions, is 33 sec, and the average SE is 4.2 sec. (□-□), SF; (△-△), DD.

There is an important theoretical issue here: Is this forgetting due to a loss of order information, or are the semantic codes being weakened? In our theory, as memory fills up with traces, is there a loss of differentiation because the codes cannot be retrieved due to confusions among the similar locations in the retrieval structures, or are the semantic connections being lost? According to the Encoding Specificity Principle (Tulving, 1979), long-term memory traces are not lost; what is forgotten are the appropriate retrieval cues. We have analyzed some data bearing on this issue. First, we analyzed 275 errors over an 86-day period for DD and 213 errors over a 78-day period for SF. As one might expect, there are many types of errors, almost all of which are at the group level: failure to recall a group, transposition of groups, intrusion of similar groups from earlier trials, and so on. Table IV shows a breakdown of errors into order errors and item errors. Item errors are more common than order errors, and the most frequent type of item error is reporting a digit group in the appropriate semantic category, but failing to get the digits exactly right. The most common type of order error is transposing two digit groups, usually from the same location between two supergroups. Thus, there are clearly some order errors, but there are more (partial) retrieval failures.

The question still remains: What percentage of retrieval failures are caused by a loss of the connection between the location in the retrieval structure and the semantic code? Figure 14 presents some data bearing on this issue. These data show 10 days’ worth of data for both subjects on the aftersession recall task as a function of trial number. It is interesting to compare Fig. 14 with Figs. 12 and 13: the aftersession recall of digit groups is best for digit groups showing the poorest recall within the session. These data clearly suggest that the buildup of proactive interference over trials is due to a loss of connections between the location in the retrieval structure and the memory trace, because the memory trace is clearly accessible through the semantic code. Another interesting result in Fig. 14 is the significant loss on the early trials; there does appear to be a weakening of the memory trace in semantic memory.
According to the Encoding Specificity Principle, the difference between good and poor mnemonic codes should disappear with a recognition test because the aftersession recall task is really a generate-and-test recognition procedure. These results suggest that some amount of forgetting has occurred for memory traces from the early part of a session, contrary to predictions from the Encoding Specificity Principle.

TABLE IV
PERCENTAGE OF ITEM AND ORDER ERRORS

Subject   Item   Order
SF        82     18
DD        71     29

Skill and Working Memory


Fig. 14. Percentage of correct recall of digit groups after the session as a function of trial number for SF (sessions 99-108) and DD (sessions 111-120). The standard error for these percentages, based on slightly more than 100 observations, is about 5%.

It could still be argued, however, that the aftersession recall is not really a recognition procedure, and that much better performance was obtained with our recognition experiment (reported earlier). The alternative interpretation is that forgetting involves weakening of the connection between the semantic features and the memory trace. Finally, we report an interference experiment designed to see how fragile is the retrieval structure. Is a single schematic retrieval structure used over and over again, or are there multiple retrieval structures, one for each trial? We tested these possibilities by giving DD two trials in a row and then we asked for recall of both lists; DD first recalled the most recent list, and then he recalled the previous list. In this procedure, DD was presented with the first list and then given a normal amount of time to rehearse the list. However, instead of then asking for recall of that list, a second list was presented to DD, followed by rehearsal of the second list and then recall. Only when the second list had been recalled did DD attempt to recall the first list. In an hour’s session (on Day 195), we gave DD three pairs of lists of length 36 digits each. Although DD was unable


to achieve perfect recall of two lists in a row, on two of the three trials, he missed perfect recall by only a single error. On the third attempt, he missed about 30% of the previous list. In short, DD is able to differentiate trials well enough that we reject the idea of a schematic retrieval structure. We think the representation we have proposed in Fig. 8 is compatible with all the empirical results. It accounts for the present results by assuming that the context can be used to differentiate retrieval locations from previous trials. At the same time, it accounts for the confusion errors observed between different retrieval locations by assuming a partial loss of location features in the memory trace. Intrusion errors from previous trials, according to the theory, are caused by a loss of context features in the memory trace, and semantic errors are caused by loss of connections between location features and semantic features in the trace.

E. WORKING MEMORY

In this section, we want to expand on what we think is an important implication of our work for a theory of skilled memory: the concept of working memory. Working memory has traditionally been thought of as the part of the memory system where active information processing takes place (Baddeley, 1976; Klatzky, 1980). Working memory is not exactly synonymous with short-term memory because short-term memory is usually taken to mean a passive storage system for item information, whereas working memory also contains control processes because they also require memory capacity. Baddeley and Hitch (1974) and Baddeley (1981) include the articulatory loop, the “visuospatial scratch pad,” and a central executive as part of the structure of working memory. The concept of working memory alone is not adequate to explain the performance of our skilled subjects in the digit-span task, or the skilled memory effect in general. Our research suggests that experts make associations with information in semantic memory, and they do not have to keep the information active during the retention interval; they can rely on retrieval mechanisms to reactivate information at recall. In the digit-span task, our subjects developed an elaborate retrieval structure for storing digit sequences. In the chess research, the Chase and Simon (1973a,b) model underestimated the recall of chess masters because it assumed that information was retained in short-term memory. We argue that the idea of working memory should be reconceptualized to include retrieval mechanisms that provide direct access to recent memory traces not in active memory. Perhaps this is a semantic distinction, and perhaps another term such as intermediate-term memory (Hunt, 1972) should be used to refer to temporary knowledge structures relevant


to the ongoing task. Nevertheless, these retrieval structures have the properties associated with working memory. The important properties of the short-term memory (STM) component of working memory are direct access and fast access to knowledge structures for input into processes. Retrieval structures provide direct access to knowledge structures, and they provide relatively fast access (say, within the range of 1-5 sec), thus avoiding the difficulties normally associated with long-term memory retrieval (such searches take a lot of time and cause interference by activating competing knowledge structures). Perhaps we should call these retrieval structures the intermediate-term memory (ITM) component of working memory. An important point we want to make about skilled memory is that the size of the ITM component of working memory expands with skill acquisition, and the retrieval speed increases. We speculate that at high levels of skill, retrieval speed from ITM approaches that of STM, which is less than a second. Thus, the ITM can serve as a useful part of working memory, greatly expanding the available knowledge states as inputs to mental operations. We think this is one reason why skilled performance of experts in many domains seems vastly superior to novice performance. This reconception of working memory is helpful in interpreting the literature in other domains besides skilled performance. For example, Shiffrin (1976) has argued that short-term memory does not have enough capacity to sustain performance in many tasks, and that context-tagged information in long-term memory is used to perform complex tasks. In other words, context can also serve as a retrieval structure for knowledge in some ongoing task, and hence can also serve as an important component of working memory. 
One reason that Baddeley (1976, 1981) has argued for an expansion of the concept of working memory is because complex tasks such as reasoning, comprehension, mental calculation, and learning can proceed with very little decrement when subjects have to maintain a near-span digit load simultaneously in STM (Baddeley & Hitch, 1974). Kintsch (1981) has recently argued that the current concepts of STM and working memory do not adequately account either for people’s ability to retain and use the meaning of text during reading, or their ability to retrieve more detailed propositional memory from reading text. In a recent article, Daneman and Carpenter (1980) showed that a domain-specific measure of working memory capacity is a far better predictor of reading ability than the traditional short-term memory span. In this measure, subjects were required to read a series of sentences and then recall the last word of each sentence in order. Correlations between this measure of working memory and measures of reading comprehension were typically in the range of .7 to .9, whereas word span correlated only


about .35 with measures of reading comprehension. Daneman and Carpenter argued that the reading processes of good readers are faster, more efficient, and take up less capacity in working memory, thus releasing more storage capacity for knowledge structures in working memory, hence these readers’ higher sentence memory span. Good readers achieve better comprehension, according to Daneman and Carpenter, because they have more facts in working memory at any time for their comprehension processes to work on. Although we agree in principle that skill development is associated with automated processing, our theory of skilled memory requires a different interpretation of their result. The working memory of good readers is expanded, according to our theory, because they have developed better structures for organizing and retrieving information of various types relevant to the comprehension process from semantic memory during the reading process. Their larger sentence memory span, we argue, is the result of utilizing these structures for storing sentences (or some deep-structure representations of the sentences) in long-term memory. Nevertheless, we agree with the important point made by Daneman and Carpenter’s experiment, namely, that skill in some domain is associated with an expanded working memory. We want to make one more point about encoding and working memory. How well an item is retrieved depends upon how it is coded for later use. This idea has been in the literature for some time as the encoding-retrieval interaction principle derived from the levels-of-processing literature (Tulving, 1979), and the constructability principle in the information-processing literature (Norman & Bobrow, 1979). The idea is that a good encoding anticipates how it will be retrieved because it builds into the representation the retrieval cues that will arise at recall.
In other words, skilled individuals have learned how to code information in a useful way, so that when it is needed in some context, the retrieval cues will be appropriate to achieve recall. Novices typically do not know when a fact is relevant, and they often fail to retrieve knowledge in their long-term memory that is relevant to some task performance (Jeffries, Turner, Polson, & Atwood, 1981). This is perhaps the reason that mnemonic systems do not seem very useful in skills: The retrieval mechanisms have to be domain specific because retrieval must occur when a fact is useful.

IV. Further Studies of Skilled Memory

In this section we present subsequent work in which we have attempted to expand our theory of skilled memory into other domains. Our later


work has taken two courses. In one direction we have analyzed already existing exceptional skills. We have been fortunate to be able to study two skilled individuals, a mental calculation expert (Chase, Benjamin, & Peterson, in preparation) and a waiter who remembers large numbers of orders (Ericsson & Polson, in preparation). In another direction, we have studied normal people in a domain where most people are skilled: sentence memory (Ericsson & Karat, 1981).

A. ANALYSIS OF A MENTAL CALCULATION EXPERT

Our subject AB has a magic act that he terms “mathemagics” in which he does a variety of rapid mental calculation feats. For example, he can square a 2-digit number in 1 or 2 sec, he can square a 4-digit number in about 30 sec, and he can multiply two 2-digit numbers in about 5 sec. These mental calculation feats are far beyond the capacity of average people as well as mathematicians and engineers. AB claims that he is the only person in the United States with such a mathemagics act. AB’s digit span is about 13 digits, for which he uses a mnemonic system (described later); and his performance on Luria’s (1968) 50-digit matrix is also comparable to other memory experts (see Table III). There is a well-documented literature on mental calculation experts, or “lightning calculators,” most of whom lived in the last century, before the advent of mechanical calculating aids. A common misconception is that most lightning calculators are mentally retarded or “idiot savants.” Although there are a few documented cases of mentally retarded lightning calculators, most of the lightning calculators have been well-educated professionals. To take a few examples, Bidder was a very prominent British engineer, Ruckle was a German mathematics professor, and the great German mathematician and astronomer Gauss demonstrated his lightning calculating ability as a boy. (See Mitchell, 1907; and Scripture, 1891, for good reviews.) The only recent psychological study of a mental calculation expert is Hunter’s (1968) analysis of A. C. Aitken, a Cambridge mathematics professor and perhaps the most skilled lightning calculator reported in the literature. Aitken’s skill is based on two types of knowledge: (1) computational procedures and (2) properties of numbers. Aitken had gradually acquired a large variety of computational procedures designed to reduce memory load in mental computation.
With years of intensive practice, these computational procedures gradually became faster and more automatic, to the point where Aitken’s computational skills were astounding. In addition to his computational procedures, Aitken also possessed a tremendous amount of ‘‘lexical’’ knowledge about numbers.


For example, he could “instantly” name the factors of any number up to 1500. Thus, for Aitken, all the 3-digit numbers and a few 4-digit numbers were unique and semantically rich, whereas for most of us, this is true only for the digits and a few other numbers, such as one’s age. This knowledge also provides a very substantial reduction in the memory load during mental calculation. Our subject AB has a typical history for a mental calculator. His interest in numbers really began at about age 6 (he is now 20 years old), and from that time to the present AB estimates that he has averaged several hours of practice a day. During this extended period of continuous practice, AB has discovered many numerical concepts by trial and error. For example, at around age 12, AB discovered the algorithm he uses to square numbers; interestingly, Aitken was about the same age when he also discovered the same squaring algorithm. Our analysis of AB began with his ability to square numbers, which turned out to be a fairly complex procedure. We expected, on the basis of our theory of skilled memory, that AB would use some type of retrieval structure to store the results of intermediate computations, and then he would retrieve these computations at some later point when he needed them. Our analysis of AB’s squaring procedure is based on about 10 hours of protocols, from which we derived a model, and about 20 hours of latency and error data on squaring 2-to-5-digit numbers. The heart of AB’s squaring procedure is the algorithm that reduces squaring to easy multiplication, and it is based on the following equation:

$$A^2 = (A + d)(A - d) + d^2$$

For example:

$$9^2 = 10 \times 8 + 1^2$$

$$109^2 = 100 \times 118 + 9^2$$

In words, the algorithm involves finding a number, d , which, when added to or subtracted from the number to be squared, A, generates a new number comprised of a single digit with trailing zeros. This in effect reduces the computation from a difficult n-digit by n-digit multiplication to a much easier 1-digit by n-digit multiplication, plus an (n - 1)-digit square. Also notice that the algorithm is recursive: An n-digit square is reduced to an easy multiplication plus an (n - 1)-digit square, which in turn is reduced to an easy (n - 1)-digit multiplication plus an (n - 2)-digit


square, and so on. Recursion stops with two-digit numbers because all two-digit squares are either memorized or, in those few cases where AB claims to use the algorithm on two-digit squares, the computations are so rapid and so familiar that they are virtually long-term memory retrieval. To give a concrete example of how the algorithm works, consider the following four-digit problem:

$$\begin{aligned}
3456^2 &= 3000 \times 3912 + 456^2 \\
&= 11{,}736{,}000 + 500 \times 412 + 44^2 \\
&= 11{,}736{,}000 + 206{,}000 + 1{,}936 \\
&= 11{,}943{,}936
\end{aligned}$$
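The reduction just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a model of AB's actual procedure: the function name `square_by_rounding`, the rule of rounding to the nearest number with one leading digit and trailing zeros, and the treatment of every square below 100 as simply known are choices made for the sketch.

```python
def square_by_rounding(a):
    """Square a non-negative integer with A^2 = (A + d)(A - d) + d^2.

    Returns (square, partial_products), where partial_products lists the
    terms that must be held in memory and added together at the end.
    """
    partial_products = []
    while a >= 100:
        # Assumption: round A to the nearest number made of one leading
        # digit followed by zeros (3456 -> 3000, 456 -> 500).
        place = 10 ** (len(str(a)) - 1)
        rounded = ((a + place // 2) // place) * place
        d = abs(a - rounded)
        # The other factor is A + d when rounding down, A - d when rounding up.
        other = 2 * a - rounded
        partial_products.append(rounded * other)  # the "easy" multiplication
        a = d  # what remains is a square with fewer digits
    # Assumption: squares of numbers below 100 are retrieved, not computed.
    partial_products.append(a * a)
    return sum(partial_products), partial_products


total, parts = square_by_rounding(3456)
print(parts)  # [11736000, 206000, 1936]
print(total)  # 11943936
```

For the four-digit example the sketch produces the same three partial products as the worked problem, which is what makes the memory load of the procedure easy to see.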

Note that, as a result of the recursive process, three fairly large partial products accumulate in memory and must be added together. In general, for an n-digit square, there are n - 1 partial products. These types of mental arithmetic problems impose severe memory management problems, and this is what makes AB’s squaring procedure interesting for our theory of skilled memory. How can AB remember all of these numbers? One of the first things we discovered was that AB was using a mnemonic to store these partial products. AB had previously learned a standard mnemonic technique for converting digits to consonants and making a word out of the consonants. For example, the partial product in the above example, 736, can be converted to consonants: 7 = k, 3 = m, and 6 = g, and the consonants are then converted into words, such as 736 = “key mug.” Then at a later point in the problem, when AB needs to add partial products, he retrieves the mnemonic and decodes it. AB also uses his fingers as a mnemonic aid to store the hundred’s digit. In the above example, AB stores the digit 9 on his fingers. From AB’s verbal protocols we were able to derive a process model of his squaring algorithm. With the model, we were able then to make several predictions about how fast AB would be able to solve problems of varying degrees of difficulty; the model also gave us a way to objectively analyze the memory demands involved in squaring a number with the algorithm. The first analysis we did tried to account for the speed of problem solving as a function of problem size. Figure 15 shows the average time taken by AB to solve two-digit through five-digit squares, and Fig. 16 shows these same data replotted as a function of the model’s prediction of the number of symbols processed in working memory. Several structural variables from the model were regressed against solution time: (1) number of functions in the program, (2) number of

[Figure 15 plot. x-axis: Number of Digits (2 to 5); y-axis: solution time (ticks from 20 to 100).]

Fig. 15. AB’s average solution time for squaring numbers as a function of number of digits. Brackets are SDs of averages for 17 days. (SDs for two and three digits, 0.2 and 1.0 sec, respectively, are too small to be shown.) Each daily average is based on 7 or 8 observations and each total mean is based on about 130 observations.

arithmetic operations, (3) number of mental operations, (4) number of chunks processed in working memory, and (5) number of symbols processed in working memory. None of these variables was able to adequately account for the rapid increases in time with problem size, but the two measures that did the best were number of chunks and number of symbols processed in working memory, with the latter variable (shown in Fig. 16) predicting best (RMSD = 7.6 sec). The interesting fits to the data were 482 msec/symbol, 1082 msec/chunk, and 3222 msec/mental operation. The magnitude of these parameters seems well in line with what is generally known about the speed of mental operations (Chase, 1978). Our model, thus, seems to be a good first approximation of the speed of AB’s squaring algorithm. The model still does not predict a fast enough increase in solution time with problem complexity; however, we think that most of this complexity can be accounted for with further refinements of the model. Specifically, we think that we need to measure separately the speed of the various mental operations in our model, rather than simply assume that all operations take the same amount of time. We are currently analyzing, at a finer grain level, the basic processes of addition and multiplication, which are used in more complex procedures.


Fig. 16. Observed and predicted solution time as a function of number of symbols processed in working memory.

Our model also makes predictions about error rates. We found that error rate was linear with the number of arithmetic operations. According to our model, each arithmetic operation that AB performs has a 2.7% chance of error. The overall error rates in the squaring procedure ranged from approximately 7% for 2-digit squares to approximately 45% for 5-digit squares. The last, and perhaps the most interesting, analysis with respect to our theory of memory, is that of retrieval distance of various mental operations: How far back does AB have to go in memory to find inputs for his mental operations? This analysis has to be done within the framework of our model. That is, for problems of various sizes, we examined the trace of the model and computed the retrieval distance in terms of how far back in the trace were the inputs to the current operation. We generated a trace for three problems: $345^2$, $3456^2$, and $34567^2$, and the inputs for every mental operation were classified according to how many mental operations back they occurred in memory. Figure 17 shows the frequency of retrieval distance in operations; the distributions for the three problem sizes were combined because they were indistinguishable. Inputs that required decoding a mnemonic or other external memory aid are indicated in the figure with a dot. There are some interesting things to notice about these data. First, most of the inputs for mental operations come from very recent mental operations. In fact, over half the inputs for a mental operation come from the

[Figure 17 plot. x-axis: Retrieval Distance in Operations (ticks from 5 to 20); y-axis: frequency (ticks from 5 to 40).]

Fig. 17. Frequency distribution of predicted solution times as a function of retrieval distance for AB’s squaring algorithm. These frequencies are derived from three problems: $345^2$, $3456^2$, and $34567^2$. Xs are retrievals without mnemonic aids, and dots are retrievals with the aid of a mnemonic.

immediately preceding operation. Second, inputs that were stored and retrieved with the aid of a mnemonic are retrieved over much longer distances. From the analysis, we were surprised at how AB’s squaring procedure keeps the inputs for operations close in time. That is, AB’s squaring procedure seems to have been designed to minimize the working memory demands by deriving inputs to mental operations from immediately preceding operations. Even so, the squaring procedure is too complex to


keep everything in short-term memory. Partial products must be stored for fairly long periods (and with many intervening mental operations) before they are needed again. Under these circumstances, AB has resorted to mnemonics. Finally, we point out that even though the logic of AB’s squaring algorithm is recursive, recursion is very expensive in terms of memory load. AB has devised a complex procedure, the logic of which is iterative rather than recursive, to avoid the memory problems associated with recursion.

B. THE MEMORY OF A WAITER

Ericsson and Polson (in preparation) have studied a waiter (JC), who is able to take up to 17 menu orders without any form of memory aid. The main focus of this research has been to describe the performance of this waiter in an experimentally controlled environment and describe the cognitive processes and structures underlying this memory feat. The initial phase of this study was concerned with finding an experimental analog of the restaurant environment. The people at the table in the restaurant were simulated by pictures of faces, and the order was read by an experimenter as the waiter pointed to the corresponding picture. To mimic the restaurant situation JC was allowed to ask for repetitions of items. JC controlled the rate at which he took orders, and he was timed until he signaled the experimenter that he was ready to recall. Each order consisted of a main course of a meat dish (eight alternatives) cooked to a certain temperature (five alternatives), with a starch (three alternatives) and a choice of salad dressing (five alternatives) and during the first part of the experiment also a beverage (nine alternatives). The beverage item was later omitted because JC argued that beverage orders are taken separately for dinners. Orders were generated randomly by a computer program. According to our subject JC, the experimental situation is much harder than the restaurant situation because of the randomness of orders. In the restaurant situation a relatively small number of the possible combinations is frequent. The experimental sessions consisted of two blocks, each consisting of an order of three, five, or eight items (people) in random order. JC was instructed to proceed as rapidly as possible without making errors when recalling the collection of orders. During some sessions JC was instructed to “think aloud” while doing the same task. JC was also tested for his memory of the orders at the end of the experimental session. 
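As a rough illustration of how such random orders could be produced, the following Python sketch draws each category independently and uniformly. The category sizes come from the description above; the item labels, the function names, and the uniform independent sampling are assumptions made for the sketch, since the original computer program is not described in detail.

```python
import random

# Category sizes follow the text; the labels themselves are placeholders.
CATEGORIES = {
    "entree": [f"meat_{i}" for i in range(1, 9)],        # eight alternatives
    "temperature": [f"temp_{i}" for i in range(1, 6)],   # five alternatives
    "starch": [f"starch_{i}" for i in range(1, 4)],      # three alternatives
    "dressing": [f"dressing_{i}" for i in range(1, 6)],  # five alternatives
}


def random_order(rng):
    """Draw one order by picking one option per category (assumed uniform)."""
    return {category: rng.choice(options) for category, options in CATEGORIES.items()}


def random_table(size, rng):
    """Generate the orders for a table of `size` people."""
    return [random_order(rng) for _ in range(size)]


def count_possible_orders(categories=CATEGORIES):
    """Number of distinct orders: the product of the category sizes."""
    total = 1
    for options in categories.values():
        total *= len(options)
    return total


rng = random.Random(0)  # fixed seed so the sketch is repeatable
table = random_table(5, rng)
print(count_possible_orders())  # 600
```

Without the beverage there are 8 x 5 x 3 x 5 = 600 possible orders (5400 with the nine beverages), which gives a sense of why uniformly random orders are harder than restaurant orders, where a small number of combinations is frequent.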
Even though our experimental analysis of JC’s memory skill is not complete, we have found considerable evidence for the skilled memory


mechanisms described earlier. In our laboratory situation we were able to show that JC was able to perform the memory task with few, if any, errors in recall. The average presentation time of the first five sessions (5 items per order) is given in Fig. 18. In the same figure we have plotted the average times for Sessions 12-14 and Sessions 24-32, which are both based on 4 items per order. The presentation time is short and for the sessions with most practice it approaches the reading time for orders from a table of three people. We can also see a reliable decrease in presentation time as a function of practice. It may appear somewhat unexpected to find such a large speedup, given that JC had been taking orders without notes for several years prior to the experimental sessions. However, there appears to be little pressure for increasing encoding speed in the restaurant situation beyond the rate people are able to generate orders, and this rate is relatively slow. One of the difficulties in remembering dinner orders for normal people is the similarity between the orders. JC avoids interference by capitalizing on the redundancy created by similar items. From our thinking-aloud protocols it is clear that JC, at the time an order for a person is read to him, reorganizes this information into sublists with items of a given

[Figure 18 plot. x-axis: Table Size; y-axis: time (ticks at 100 and 200); separate curves for Sessions 1-5, 12-14, and 24-32.]

Fig. 18. Speed of taking orders for the skilled waiter. The SEs for the above points range from 1.5 sec to 13.8 sec and the average SE is 6.5 sec.


category. Each sublist contains four items or fewer. For salad dressings JC uses the initial letters and searches for patterns or meaningful abbreviations or words. For example, once JC encoded “Blue cheese, Oil and vinegar, Oil and vinegar, Thousand islands” as B-O-O-T or “boot.” For temperatures, JC is sensitive to the dimension of rareness, which ranges from rare to well done, and encodes progressions and other types of patterns as well. There are only three different kinds of starches, and therefore there is a high probability of some kind of pattern occurring. JC encodes many other kinds of information about “spatial” position of the person making the order and relationships between the ordered items and the person making the order. However, the within-category encoding appears to be his principal means of encoding. One piece of evidence for JC’s coding strategy comes from the order in which he gives his immediate recall. In recalling orders from tables with five and eight people he does not preserve the presentation order of the items. Instead, JC recalls all salad dressings first and then all entrees, temperatures and starches. For a table of three persons JC originally recalled the information as presented (i.e., entree, temperature, starch, and salad dressing) for each order before moving on to the next order. Recently, JC has changed to within-category recall even for three-person orders. We are now conducting experiments designed to demonstrate the priority of the within-category encoding more directly. We have also studied JC’s memory for orders after the session. After Session 1 we constructed the pictures corresponding to the first table of five people, and JC could accurately recall 10 items or 40% of the presented items. He recalled the encoding for the salad dressings (COOBB) and a few isolated items, but not a single complete order. Then we reconstructed the second and last table of five persons, and JC recalled the presented information perfectly.
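The within-category encoding described above, in which person-by-person orders are regrouped into one sublist per category and the salad dressings are reduced to their initial letters, can be sketched as follows. The function names and the dictionary representation of an order are hypothetical, chosen only for this illustration; the dressing sequence is the example quoted in the text.

```python
def encode_by_category(orders, categories):
    """Regroup person-by-person orders into one sublist per category."""
    return {c: [order[c] for order in orders] for c in categories}


def initials(items):
    """Join the first letters of the items, as in the salad-dressing code."""
    return "".join(item[0].upper() for item in items)


# The dressing sequence quoted in the text, one entry per person at the table.
orders = [
    {"dressing": "Blue cheese"},
    {"dressing": "Oil and vinegar"},
    {"dressing": "Oil and vinegar"},
    {"dressing": "Thousand islands"},
]

sublists = encode_by_category(orders, ["dressing"])
print(initials(sublists["dressing"]))  # BOOT
```

Recalling category by category, as JC does for larger tables, then amounts to reading out each sublist in turn rather than each person's order in turn.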
Suggestive evidence indicates that a subsequent encoding of an order from a table with the same number of people leads to reduction of memory for the initial encoding. After Session 3 we asked JC to recall as much as possible about salad dressings. From the most recent set of table sizes (Block 2) JC recalled 14 items (88%) without regard to order, or 11 items (69%) if the order within a table has to be exactly correct. From the first block of tables JC recalled 4 items (25%). It should be noted that a similar low level of recall might have been obtained for our digit-span experts if they had had to rely on episodically based recall.

C. SENTENCE MEMORY

Most of the above demonstrations of skilled memory refer to skills that only a small portion of the general population ever acquires. This raises


the issue of whether all adults are able to acquire and exhibit skilled memory. To address this concern Ericsson and Karat (1981) set out to search for evidence of skilled memory in a domain where all adults have developed a skill. The most obvious skill that all normal adults have is their ability to comprehend and generate meaningful language. In most respects we can compare the language skills of any human adult with other complex skills, such as chess. To make our study as comparable as possible to the earlier work of Chase and Ericsson, we decided to use the methodology of measuring memory spans. We read sequences of words to subjects for immediate verbatim recall. We wanted to demonstrate an analogous finding to the one by Chase and Simon (1973a), that for scrambled chess pieces on a chessboard the chess master is no better than a novice in immediate recall of chessboards. We thus compared subjects’ immediate recall for meaningful sentences with the same words presented in a random scrambled sequence. From an extensive literature we know that normal subjects’ memory span for unrelated words is on the average 6 words. Although we have not been able to find any attempts to measure people’s memory spans for meaningful sequences of words (i.e., sentences), it is clear from several studies and experiments that the span should be considerably higher, 10-12 words or more. The class of meaningful sentences is not well defined, so we did not attempt to generate the sequences. Like other investigators of skilled performance we collected instances (sentences) from real life. We sampled sentences of different length from two sources. The first source was two collections of short stories. The second source was three novels by John Steinbeck. We copied these sentences and only substituted pronouns for names. We generated scrambled word sequences by randomizing the order of words in these sentences.
The subjects were first given a series of sentences, and then a series of scrambled sequences. All sequences of words were read at a constant rate (1 word/sec) in a monotone voice except for the last word, which was stressed to signal the subjects to write the sequence verbatim. In Fig. 19 we have plotted the percent perfectly recalled sequences as a function of number of presented words. Each point corresponds to averages based on more than 15 subjects’ responses to five or more different sequences (for more details see Ericsson & Karat, 1981). A measure of memory span is the number of words an “average” subject will correctly recall half of the time. The memory span for scrambled sequences is between 6 and 7 words, whereas the memory span for meaningful sequences (i.e., sentences) is about 14 words. The difference is statistically reliable.
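The span measure defined above, the sequence length that is recalled correctly half of the time, can be read off a recall curve by linear interpolation. The sketch below shows one way to do this; the function name is hypothetical, and the percentages are made-up values for illustration only, not data from the experiment.

```python
def span_at_half(lengths, percent_correct):
    """Interpolate the sequence length at which perfect recall falls to 50%.

    Returns None if the curve never crosses 50% between two adjacent points.
    """
    points = list(zip(lengths, percent_correct))
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if y0 >= 50 >= y1 and y0 != y1:
            # Linear interpolation between the two points that bracket 50%.
            return x0 + (y0 - 50) * (x1 - x0) / (y0 - y1)
    return None


# Hypothetical percentages, for illustration only (not the reported data).
lengths = [5, 6, 7, 8, 9]
percent = [90, 70, 40, 20, 5]
span = span_at_half(lengths, percent)
print(round(span, 2))
```

With these illustrative values the crossing lies between 6 and 7 words, the same range the text reports for scrambled sequences, but the exact figure depends entirely on the invented inputs.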

Fig. 19. Percentage of correctly recalled scrambled sequences as a function of the number of words in each sequence.

1. Coding

Some interesting results support the hypothesis that the words are not encoded and stored as units, but encoded in some other form. The almost linear relationship between number of words in a sentence and percentage of recall was based on averages over many sentences. Among these sentences we can find individual sentences for which this relationship does not hold. We found exceptionally difficult shorter sentences, such as the following 12-word sentence that fewer than a third of our subjects recalled correctly: “He had won a few dollars from a guard on the flatcar.” (The italicized words were frequently altered, with the rest of the sequence recalled correctly.) On the other hand, several sentences with 20 words were recalled correctly by more than half our 15 subjects. In a subsequent experiment we included sentences of up to 30 words. One of the 26-word sentences was recalled correctly by 4 subjects, and the following 28-word sentence was recalled correctly by 2 subjects: “She brushed a cloud of hair out of her eyes with the back of her glove and left a smudge of earth on her cheek in doing it.” Further evidence is obtained from a preliminary analysis of errors. Subjects virtually always recall sentences that are semantically consistent with the presented sentence. Most errors concern lexical substitutions that do not affect meaning, such as exchanging definite and indefinite articles and exchanging prepositions. Sometimes modifiers, such as adjectives and adverbs, are omitted.

2. Postsession Recall

In one experiment we wanted to test subjects’ incidental long-term memory for presented sentences, for which substantial memory would be expected, versus scrambled words, for which little or no memory would be expected. We alternated sentences and scrambled sequences and asked for immediate written recall after each sequence. The major difference from earlier experiments was that we asked the subjects unexpectedly for cued recall of all the presented information afterward. A unique word from each sentence and each scrambled word sequence was presented in random order. Subjects were asked to recall as much as they could about the corresponding sequence. They were asked to underline parts of sequences they felt confident were verbatim. The main result from this experiment is that subjects’ cued recall of the sentences is remarkably high, but their recall of scrambled word sequences is essentially nil. In only 12% of the cases could subjects recall anything from the scrambled sequences, and in only 4% of the cases were they able to recall more than a single word. In contrast, sentences were recalled 79% of the time, with subjects mostly able to recall more than half of the presented words. This clearly suggests to us that a single cue word was able to access an integrated representation rather than just a single chunk or subunit. In a pilot study subjects were only given a free-recall instruction, and these subjects were only able to recall a few sentences. The superiority of cued recall indicates some interesting restrictions on when memory for the sentences can be accessed and used. Another aspect of skilled memory was demonstrated in this experiment, namely, the ability to monitor the correctness of one’s memory. Recall was almost 90% for words that subjects underlined to mark confidence that these words were verbatim. The corresponding percentage for words not underlined was only about 55%. This shows a highly reliable ability to assess correctness of recall. 
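The monitoring result amounts to comparing recall accuracy for words a subject underlined with accuracy for words left unmarked. A minimal sketch of that comparison follows; the function name, the data layout, and the word counts are hypothetical, chosen only so that the two proportions mirror the roughly 90% and 55% figures quoted above.

```python
def accuracy_by_confidence(words):
    """Return (accuracy for underlined words, accuracy for other words).

    `words` is a list of (underlined, correct) boolean pairs, one pair per
    recalled word. A proportion is None when no word falls in that group.
    """
    def proportion_correct(flagged):
        scored = [correct for underlined, correct in words if underlined == flagged]
        return sum(scored) / len(scored) if scored else None

    return proportion_correct(True), proportion_correct(False)


# Hypothetical scoring sheet: 10 underlined words (9 correct) and
# 20 words not underlined (11 correct).
recalled = (
    [(True, True)] * 9 + [(True, False)] * 1
    + [(False, True)] * 11 + [(False, False)] * 9
)
confident, unsure = accuracy_by_confidence(recalled)
```

A large gap between the two proportions is what indicates that subjects can assess the correctness of their own recall.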
In another experiment we had subjects underline verbatim parts of their immediate recalls. Underlined words were correct 96% of the time and words not underlined were correct 75% of the time.

3. Individual Differences

Our experiments have also consistently found systematic individual differences in ability to recall sentences. Using traditional methods for calculating span, we find span for words in sentences to range from about 11.0 to about 20.5 words for different subjects. When we analyze our data in terms of number of perfectly recalled sentences or percentage of recalled words we find reliable individual differences as well. In the last experiment we attempted to explore the source of the reliable individual differences in span or ability to recall. According to the skilled memory model, the best predictor of people’s ability to recall sentences verbatim is their level of language skill, which we attempted to assess by a test measuring correct language use and a test of verbal reasoning. To evaluate mediation of general achievement and intelligence, subjects were also given a test of numeric reasoning. Following our earlier procedure, we had subjects recall sentences and scrambled words. A regression analysis showed that the number of perfectly recalled sentences could best be predicted by a linear combination of language skill scores and the number of perfectly recalled scrambled sequences. It is interesting that language usage and verbal reasoning were unrelated to recall of scrambled sequences, which suggests that at least two independent factors underlie the ability to recall sentences: language skill and efficiency of rehearsal.
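The regression described above predicts sentence recall from a linear combination of a language skill score and scrambled-sequence recall. The sketch below shows what fitting such a model involves, using ordinary least squares solved through the normal equations. The helper `fit_linear`, the variable names, and all data values are hypothetical; the data are constructed from known coefficients so the fit can be checked, and they are not the authors' results.

```python
def fit_linear(rows, targets):
    """Ordinary least squares with an intercept, via the normal equations.

    Returns [intercept, coefficient_1, coefficient_2, ...].
    """
    design = [[1.0] + list(row) for row in rows]
    k = len(design[0])
    # Build X'X and X'y.
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(design, targets)) for i in range(k)]
    # Solve the k-by-k system by Gauss-Jordan elimination with partial pivoting.
    aug = [xtx[i] + [xty[i]] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(k):
            if row != col:
                factor = aug[row][col] / aug[col][col]
                aug[row] = [a - factor * b for a, b in zip(aug[row], aug[col])]
    return [aug[i][k] / aug[i][i] for i in range(k)]


# Hypothetical subjects: (language skill score, scrambled sequences recalled).
predictors = [(10, 3), (14, 2), (12, 5), (18, 4), (16, 1)]
# Sentences recalled, generated here as 2 + 0.5 * skill + 1.0 * scrambled.
sentences_recalled = [10.0, 11.0, 13.0, 15.0, 11.0]
intercept, skill_weight, scrambled_weight = fit_linear(predictors, sentences_recalled)
```

Two predictors with separate, nonzero weights correspond to the two independent factors the text proposes: language skill and efficiency of rehearsal.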

V. Conclusion

In our work over the past 3 years, we have tried to discover the cognitive mechanisms underlying skilled memory performance. We have shown that skilled individuals are able to associate information to be remembered with their large knowledge base in the domain of their expertise, and further, that they are able to index that information properly for later retrieval. In addition, practice storing and retrieving information causes these processes to speed up. The major theoretical point we wanted to make here is that one important component of skilled performance is the rapid access of a sizable set of knowledge structures that have been stored in directly retrievable locations in long-term memory. We have argued that these ingredients produce an effective increase in the working memory capacity for that knowledge base. The question arises: What exactly is working memory? In part, there is a problem of definition and, in part, there is still considerable doubt about the mechanisms of working memory. For the sake of terminology, we suggest that working memory has at least the following components: (1) short-term memory, which provides direct and virtually immediate access to very recent or attended knowledge states; (2) intermediate-term memory, the task-specific retrieval structure in long-term memory, which provides direct and relatively fast access to knowledge states; and (3) context, which contains structures for controlling the flow of processing within the current task and provides relatively fast and direct access to knowledge structures relevant to the current task and context. The auditory and visual-spatial buffers are also important components of working memory, although they are not the focus of this article. The focus of our article has been on the important role of retrieval structures as working memory states.

ACKNOWLEDGMENTS

This research was supported by contract number N00014-81-0335 from the Office of Naval Research. We are grateful to Arthur Benjamin, Dario Donatelli, and Steve Faloon for serving as expert subjects.

REFERENCES

Akin, O. The psychology of architectural design. London: Pion, in press.
Anderson, J. R., & Bower, G. H. Human associative memory. New York: Holt, 1974.
Atkinson, R. C., & Shiffrin, R. M. Human memory: A proposed system and its control processes. In K. W. Spence & J. T. Spence (Eds.), The psychology of learning and motivation: Advances in research and theory. New York: Academic Press, 1968.
Bachelder, B. L., & Denny, M. R. A theory of intelligence: I. Span and the complexity of stimulus control. Intelligence, 1977, 1, 127-150. (a)
Bachelder, B. L., & Denny, M. R. A theory of intelligence: II. The role of span in a variety of intellectual tasks. Intelligence, 1977, 1, 237-256. (b)
Baddeley, A. D. The psychology of memory. New York: Basic Books, 1976.
Baddeley, A. D. The concept of working memory: A view of its current state and probable future development. Cognition, 1981, 10, 17-23.
Baddeley, A. D., & Hitch, G. Working memory. In G. H. Bower (Ed.), The psychology of learning and motivation. New York: Academic Press, 1974.
Biederman, I. Perceiving real world scenes. Science, 1972, 177, 77-80.
Bjork, R. A. Short-term storage: The ordered output of a central processor. Hillsdale, New Jersey: Erlbaum, 1975.
Bower, G. H. Mental imagery and associative learning. In L. W. Gregg (Ed.), Cognition in learning and memory. New York: Wiley, 1972.
Bower, G. H., Black, J. B., & Turner, T. J. Scripts in memory for text. Cognitive Psychology, 1979, 11, 177-220.
Bransford, J. D., & Johnson, M. K. Considerations of some problems of comprehension. In W. G. Chase (Ed.), Visual information processing. New York: Academic Press, 1973.
Broadbent, D. A. The magical number seven after fifteen years. In A. Kennedy & A. Wilkes (Eds.), Studies in long-term memory. New York: Wiley, 1975.
Charness, N. Memory for chess positions: Resistance to interference. Journal of Experimental Psychology: Human Learning and Memory, 1976, 2, 641-653.
Charness, N. Components of skill in bridge. Canadian Journal of Psychology, 1979, 33, 1-50.
Chase, W. G. Elementary information processes. In W. K. Estes (Ed.), Handbook of learning and cognitive processes. Hillsdale, New Jersey: Erlbaum, 1978.
Chase, W. G., & Ericsson, K. A. Skilled memory. In J. R. Anderson (Ed.), Cognitive skills and their acquisition. Hillsdale, New Jersey: Erlbaum, 1981.
Chase, W. G., & Simon, H. A. Perception in chess. Cognitive Psychology, 1973, 4, 55-81. (a)
Chase, W. G., & Simon, H. A. The mind’s eye in chess. In W. G. Chase (Ed.), Visual information processing. New York: Academic Press, 1973. (b)
Chi, M. T. H. Knowledge structures and memory development. In R. S. Siegler (Ed.), Children’s thinking: What develops? Hillsdale, New Jersey: Erlbaum, 1978.
Chiesi, H. L., Spilich, G. J., & Voss, J. F. Acquisition of domain-related information in relation to high and low domain knowledge. Journal of Verbal Learning and Verbal Behavior, 1979, 18, 257-273.
Craik, F. I. M., & Watkins, M. J. The role of rehearsal in short-term memory. Journal of Verbal Learning and Verbal Behavior, 1973, 12, 599-607.
Daneman, M., & Carpenter, P. A. Individual differences in working memory and reading. Journal of Verbal Learning and Verbal Behavior, 1980, 19, 450-466.
de Groot, A. D. Perception and memory versus thought: Some old ideas and recent findings. In B. Kleinmuntz (Ed.), Problem solving: Research, method, and theory. New York: Wiley, 1966.
Egan, D. E., & Schwartz, B. J. Chunking in recall of symbolic drawings. Memory and Cognition, 1979, 7, 149-158.
Eisenstadt, M., & Kareev, Y. Aspects of human problem solving: The use of internal representations. In D. A. Norman & D. E. Rumelhart (Eds.), Explorations in cognition. San Francisco, California: Freeman, 1975.
Ellis, S. H. Structure and experience in the matching and reproduction of chess patterns. Unpublished doctoral dissertation, Carnegie-Mellon University, 1973.
Engle, R. W., & Bukstel, L. Memory processes among bridge players of differing expertise. American Journal of Psychology, 1978, 91, 673-690.
Ericsson, K. A., Chase, W. G., & Faloon, S. Acquisition of a memory skill. Science, 1980, 208, 1181-1182.
Ericsson, K. A., & Karat, J. Memory for words in sequences. Paper presented at the 22nd Annual Meeting of the Psychonomics Society, Philadelphia, Pennsylvania, 1981.
Frey, P. W., & Adesman, P. Recall memory for visually-presented chess positions. Memory and Cognition, 1976, 4, 541-547.
Goldin, S. E. Effects of orienting tasks on recognition of chess positions. American Journal of Psychology, 1978, 91, 659-672.
Goldin, S. E. Recognition memory for chess positions. American Journal of Psychology, 1979, 92, 19-31.
Halliday, M. A. K. Intonation and grammar in British English. The Hague: Mouton, 1967.
Hatano, G., & Osawa, K. Digit span of grand experts in abacus-derived mental computation. Paper presented at the 3rd Noda Conference on Cognitive Science, 1980.
Hunt, E., & Love, T. How good can memory be? In A. W. Melton & E. Martin (Eds.), Coding processes in human memory. New York: Holt, 1972.
Hunter, I. M. L. An exceptional talent for calculative thinking. British Journal of Psychology, 1962, 53, 243-258.
Hunter, I. M. L. Mental calculation. In P. C. Wason & P. N. Johnson-Laird (Eds.), Thinking and reasoning. Baltimore, Maryland: Penguin, 1968.
Jeffries, R., Turner, A. A., Polson, P. G., & Atwood, M. E. The processes involved in designing software. In J. R. Anderson (Ed.), Cognitive skills and their acquisition. Hillsdale, New Jersey: Erlbaum, 1981.
Kintsch, W. Comprehension and memory for text. Colloquium talk, Carnegie-Mellon University, October, 1981.
Klatzky, R. L. Human memory: Structures and processes (2nd ed.). San Francisco, California: Freeman, 1980.
Lane, D. M., & Robertson, L. The generality of levels of processing hypothesis: An application to memory for chess positions. Memory and Cognition, 1979, 7, 253-256.
Lorayne, H., & Lucas, J. The memory book. New York: Ballantine, 1974.
Luria, A. R. The mind of a mnemonist. New York: Avon, 1968.
Martin, P. R., & Fernberger, S. W. Improvement in memory span. American Journal of Psychology, 1929, 41, 91-94.
McKeithen, K. B., Reitman, J. S., Rueter, H. H., & Hirtle, S. C. Knowledge organization and skill differences in computer programmers. Cognitive Psychology, 1981, 13, 307-325.
Miller, G. A. The magical number seven, plus or minus two. Psychological Review, 1956, 63, 81-97.
Mitchell, F. D. Mathematical prodigies. American Journal of Psychology, 1907, 18, 61-143.
Müller, G. E. Zur Analyse der Gedächtnistätigkeit und des Vorstellungsverlaufes: Teil I. Zeitschrift für Psychologie, Ergänzungsband 5, 1911.
Neisser, U. Cognitive psychology. New York: Appleton, 1967.
Newell, A., & Simon, H. A. Human problem solving. New York: Prentice-Hall, 1972.
Norman, D. A., & Bobrow, D. G. An intermediate stage in memory retrieval. Cognitive Psychology, 1979, 11, 107-123.
Pike, K. The intonation of American English. Ann Arbor: University of Michigan Press, 1945.
Raaijmakers, J. G. W., & Shiffrin, R. M. Search of associative memory. Psychological Review, 1981, 88, 93-134.
Rayner, E. H. A study of evaluative problem solving. Part 1: Observations on adults. Quarterly Journal of Experimental Psychology, 1958, 10, 155-165.
Reitman, J. Skilled perception in Go: Deducing memory structures from inter-response times. Cognitive Psychology, 1976, 8, 336-356.
Salis, D. The identification and assessment of cognitive variables associated with reading of advanced music at the piano. Unpublished doctoral dissertation, University of Pittsburgh, Pittsburgh, Pennsylvania, 1977.
Schneider, W., & Shiffrin, R. M. Controlled and automatic human information processing. I. Detection, search and attention. Psychological Review, 1977, 84, 1-66.
Scripture, E. W. Arithmetical prodigies. Journal of Psychology, 1891, 4, 1-59.
Shiffrin, R. M. Capacity limitations in information processing, attention and memory. In W. K. Estes (Ed.), Handbook of learning and cognitive processes (Vol. 4). Hillsdale, New Jersey: Erlbaum, 1976.
Shneiderman, B. Exploratory experiments in programmer behavior. International Journal of Computer and Information Sciences, 1976, 5, 123-143.
Sloboda, J. Visual perception of musical notation: Registering pitch symbols in memory. Quarterly Journal of Experimental Psychology, 1976, 28, 1-16.
Tulving, E. Relation between encoding specificity and levels of processing. In L. S. Cermak & F. I. M. Craik (Eds.), Levels of processing in human memory. Hillsdale, New Jersey: Erlbaum, 1979.
Wickelgren, W. A. Size of rehearsal group and short-term memory. Journal of Experimental Psychology, 1964, 68, 413-419.
Williams, M. D. Retrieval from very long-term memory. Unpublished doctoral dissertation, University of California, San Diego, 1976.
Woodworth, R. S. Experimental psychology. New York: Holt, 1938.
Yates, F. A. The art of memory. London: Routledge & Kegan Paul, 1966.

THE IMPACT OF A SCHEMA ON COMPREHENSION AND MEMORY

Arthur C. Graesser and Glenn V. Nakamura
CALIFORNIA STATE UNIVERSITY
FULLERTON, CALIFORNIA

I. Introduction
   A. What Is a Schema?
   B. How Do Schemas Function?
   C. Memory for Schema-Relevant versus -Irrelevant Information
II. ...
   A. ...
   B. Measures of Memo...
III. A Schema Copy Plus Tag Model
   A. Representational Assumptions and Predictions
   B. Guessing Assumptions and Predictions
   C. Retrieval Assumptions for Recall and Recognition
   D. Retention Assumptions and Predictions
IV. Some Issues Confronting the SC+T Model
   A. Does the Typicality Effect Occur for Different Types of Schemas?
   B. Does the Typicality Effect Persist When More than One Schema Guides ...
   C. ... Effect Occur When Scripted Activities Are Videotaped?
   D. Is the Typicality Effect Influenced by Presentation Rate?
   E. Is the Typicality Effect Influenced by the Goals of the Comprehender?
   F. Does the Typicality Effect Occur in Ecologically Valid Settings?
   G. Are Unpresented Typical Items Inferred at Comprehension or at Retrieval?
V. The Fate of Four Alternative Models
   A. Problems with the Filtering Model
   B. Problems with the Attention-Elaboration Model
   C. Problems with the Partial Copy Model
   D. Problems with the Schema Pointer Plus Tag Model
VI. The Process of Copying Schema Nodes into Specific Memory Traces
   A. Activation of Inferred Actions via the Generic Script
   B. Activation of Inferred Actions via Stated Passage Actions
   C. Activation of Inferred Actions via the Activation of a Subchunk
   D. Predicting False Alarm Rates for Unstated Script Actions
VII. Questions for Further Research
References

THE PSYCHOLOGY OF LEARNING AND MOTIVATION, VOL. 16

Copyright © 1982 by Academic Press, Inc. All rights of reproduction in any form reserved. ISBN 0-12-543316-6


I. Introduction

How do generic knowledge structures (called schemas) influence the encoding and retrieval of meaningful stimulus input? This fundamental question has recently received a great deal of attention in several subareas of psychology, ranging from perception and memory to social psychology. As a consequence of this collective enthusiasm, there are literally volumes of data, speculations, and theorizing on the role of schemas in information processing (Adams & Collins, 1979; Anderson, 1977; Bobrow & Norman, 1975; Bregman, 1977; Flavell, 1963; J. Mandler, 1979, 1982; Minsky, 1975; Norman & Bobrow, 1976, 1979; Rumelhart & Ortony, 1977; Spiro, 1977; Taylor & Crocker, 1981; Thorndyke & Hayes-Roth, 1979; Thorndyke & Yekovich, 1980). This article will not provide an exhaustive summary of this literature. An entire article would be needed to discuss the historical roots and the current status of the schema construct in research and theory. We shall propose a schema-based model of encoding and retrieval in this article, but shall fall short of achieving a complete and satisfactory explanation. One question that a schema-based model must address is how the comprehender processes information that is typical versus atypical of a central organizing schema. There rarely is a perfect match between specific input and the schemas that the comprehender identifies and applies to the specific input. Some information is directly inconsistent with a schema, whereas other information is simply irrelevant. One of the highlights of our proposed schema-based model is that it explains memory for information that varies in typicality with respect to a schema.

A. WHAT IS A SCHEMA?

Researchers do not entirely agree on what schemas are. This fact has been unsettling to some researchers. It has also led to some debate and confusion as to whether it is appropriate to use the schema construct to guide theory and research. We acknowledge this problem, but at the same time we encourage psychologists to pursue their schema-based models and to clarify the essence of the theoretical construct. At some point, psychologists may converge on a “schema for schema” that is adequately developed, articulated, and shared by everyone. Until then, researchers should specify in some detail what they mean by a schema. For the purposes of this article, we shall adopt a relatively broad definition of a schema that would be accepted by most researchers. Schemas are generic knowledge structures that guide the comprehender’s interpretations, inferences, expectations, and attention. A schema is generic in that it is a summary of the components, attributes, and relationships that typically occur in specific exemplars. A schema for eating at a restaurant differs from the specific memory trace constructed when an individual eats at a specific restaurant at a specific time and place. A schema consists of knowledge in that its properties typically apply to its referent. Thus, its components, attributes, and relationships may normally apply to a specific referent, but need not apply out of necessity. The content of a schema is highly structured, rather than being simply a list of features or properties. It is convenient to view schemas as having variables which are eventually filled as a schema guides the comprehension of specific input (Minsky, 1975; Rumelhart & Ortony, 1977). For example, the variables of a restaurant schema include character variables (customer, waitress, cook, hostess); object variables (table, chairs, food, menus); and action, plan, or goal variables (the customer orders food, the waitress serves the food, the customer eats). These variables are filled with contextually specific referents when someone comprehends a specific restaurant experience. A schema is instantiated when variables have been filled and conceptually interrelated in a specific context. Sometimes a specific variable is filled by default. For example, we normally infer that a cook prepared the food, even though we never see this task. A passage about someone eating at a restaurant might not state that the customer pays the bill, but this would be inferred by default by the comprehender. When one variable of an instantiated schema is filled, there may be repercussions on the other variables. For example, if a restaurant has a hostess, we would expect the cuisine to be relatively impressive. The variables affect one another in a complex but systematic way. When a schema is instantiated, the constructed memory trace is a highly integrated structure.
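The idea of a schema as a set of variables that are filled from the input, or by default when the input is silent, can be illustrated with a small data structure. This is a loose sketch and not a claim about how any schema theory is implemented: the class `Schema`, its fields, the variable names, and the example passage (a customer called Jack ordering soup) are all hypothetical. It also omits the interactions among variables described above.

```python
from dataclasses import dataclass, field


@dataclass
class Schema:
    name: str
    variables: tuple  # names of the variables the schema expects
    defaults: dict = field(default_factory=dict)  # fillers assumed when unstated

    def instantiate(self, observed):
        """Fill each variable from the input, falling back on a default."""
        trace = {}
        for variable in self.variables:
            if variable in observed:
                trace[variable] = (observed[variable], "stated")
            elif variable in self.defaults:
                trace[variable] = (self.defaults[variable], "inferred by default")
            else:
                trace[variable] = (None, "unfilled")
        return trace


restaurant = Schema(
    name="restaurant",
    variables=("customer", "waitress", "cook", "food", "pays_bill"),
    defaults={
        "cook": "a cook prepared the food",
        "pays_bill": "the customer pays the bill",
    },
)

# A hypothetical passage that never mentions the cook or the bill.
trace = restaurant.instantiate(
    {"customer": "Jack", "waitress": "the waitress", "food": "soup"}
)
```

In the resulting trace, the cook and the payment appear even though the passage never stated them, which corresponds to the default inferences described in the text.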
We shall assume that schemas represent many different knowledge domains. There are schemas for person stereotypes and roles (Hamilton, 1981; Reeder & Brewer, 1979; Taylor & Crocker, 1981), goal-oriented action sequences (Nelson, 1977; Schank & Abelson, 1977), spatial scenarios (Biederman, in press; Brewer & Treyens, 1981; Goodman, 1980; Palmer, 1975), and other knowledge domains. Some schemas are very abstract, such as story schemas (Mandler & Johnson, 1977; Rumelhart, 1975, 1977; Stein & Glenn, 1979; Thorndyke, 1977). Other schemas are more concrete and embody world knowledge, such as an action-based schema for eating at a restaurant or a visual-spatial schema for an office. Several terms have been invented to capture the unique properties of different types of schemas, for example, scripts, stereotypes, themes, macrostructures, models, frames, and memory organization packages (MOPs). Scripts will receive a great deal of attention in this article. Scripts correspond to activities that one or more characters enact frequently (e.g., eating at a restaurant). The actions that characters perform in a script are ordered logically, conventionally, or in a manner constrained by the environment. Schank and Abelson (1977; Abelson, 1981) introduced the script construct and have specified the properties and functioning of scripts in detail.

B. HOW DO SCHEMAS FUNCTION?

A distinction will be made between two stages of schema utilization, called schema identification and schema application (see Norman & Bobrow, 1976; Schank & Abelson, 1977). During schema identification, the comprehender identifies a schema which provides a good fit to some aspects of the input. This is a process of pattern recognition. As information accrues in a data-driven fashion, the information matches the components, attributes, and relationships of a particular schema better than alternative schemas. Of course, ambiguities sometimes occur when the information matches more than one schema equally well. It is important to determine whether the comprehender has identified the appropriate schema in experiments. When a schema title is provided for a passage, it is safe to assume that the comprehender has identified the appropriate schema. When no schema title is provided, then comprehenders must induce a schema or theme. Dramatic differences in comprehension and memory performance may emerge between conditions in which the schema has been identified and conditions in which the schema has not been identified. For example, memory for passages substantially improves when the subject has identified the central schema for a passage that would otherwise be vague or ambiguous (Anderson, Spiro, & Anderson, 1978; Bransford & Johnson, 1973; Dooling & Lachman, 1971). Once a schema is identified during comprehension, the schema application stage begins. During schema application, the schema guides processing in a conceptually driven fashion. Several events and processes occur during schema application. First, the schema influences the perception and interpretation of presented information. Experiences would be ambiguous or difficult to interpret if no schemas provided background knowledge. Second, the schema governs the attention that is allocated to elements in the stimulus material. 
In most conditions, more attention is devoted to information that deviates from the schema than information that is relevant to the schema (Bellezza & Bower, 1981b, 1982; Friedman, 1979; Krueger, 1981; Loftus & Mackworth, 1978; Taylor & Crocker, 1981; den Uyl & van Oostendorp, 1980). For example, more attention would be allocated to an octopus in a farm scene than a tractor in a farm scene. However, when the comprehender is confronted with more than one activity simultaneously and the comprehender intends to concentrate on only one activity, the schema associated with the target activity guides attention to schema-relevant information (Neisser, 1976; Neisser & Becklen, 1975). Third, the schema provides the background knowledge needed to generate inferences. As we mentioned, inferences occur when schema variables are filled by default. Fourth, schemas provide the knowledge base for formulating expectations about subsequent events during comprehension. It should be apparent that schemas are very powerful and intelligent knowledge structures. During schema application, the schema imposes an interpretation on the input, guides attention, generates inferences, and formulates expectations.

C. MEMORY FOR SCHEMA-RELEVANT VERSUS -IRRELEVANT INFORMATION

When a specific passage or experience is comprehended, some information is relevant to a central organizing schema (typical), whereas other information is irrelevant or inconsistent (atypical). Tearing up a bill is inconsistent with a restaurant schema, and reading a letter is irrelevant. How well is typical versus atypical information retrieved from memory? We shall propose a model that attempts to explain the influence of typicality on memory. However, before turning to this model, we shall describe four alternative hypotheses or models that have predicted how typicality and memory are related. These are the filtering hypothesis, the attention-elaboration hypothesis, the partial copy model (Bower, Black, & Turner, 1979), and the schema pointer plus tag model (Graesser, 1981). There has been widespread disagreement on the effects of typicality on memory, as well as on an explanation of these effects. According to the filtering hypothesis, schema-relevant information tends to be preserved in memory, whereas irrelevant information tends to be discarded. The schema organizes the typical information in a cohesive manner, with irrelevant information being loosely associated or not acquired. Many researchers have implicitly adopted a filtering hypothesis when deriving predictions about memory. This filtering hypothesis can be readily detected in memory studies that involve person stereotypes (Cantor & Mischel, 1977; Hamilton, 1981; Rothbart, 1981; Taylor & Crocker, 1981), prose passages (Bransford, 1979; Kintsch & Van Dijk, 1978; Spilich, Vesonder, Chiesi, & Voss, 1979), perception of real-world scenes (Brewer & Treyens, 1981), and chess configurations (Goldin, 1978).


Researchers have not controlled for guessing in most studies that allegedly support the filtering hypothesis. Suppose that a student were asked to recall all the actions and gestures that a professor performed during a lecture on a specific day. The student would probably recall several lecture-related actions that the professor never performed. The student might erroneously recall that the professor opened a briefcase, yet the student would not erroneously recall an irrelevant action, such as taking an aspirin. Such guesses yield an unfair advantage to actions that are typical of a schema. Of presented actions that are recalled, some actions would be accurately retrieved from the specific memory trace, whereas others would be guesses. Of course, the problem of guessing also occurs on recognition tests. An accurate assessment of memory involves a discrimination between presented and unpresented information. In contrast to the filtering hypothesis, the attention-elaboration hypothesis predicts poorer memory for schema-relevant information than irrelevant or inconsistent information. According to the attention-elaboration hypothesis, more cognitive resources are allocated to information that deviates in some way from the schema. When an individual allocates more cognitive resources to an item, there is an increase in attention, rehearsal, depth or breadth of conceptual elaboration, or the number of associations with other items. More resources are allocated to the deviations because the latter are difficult to relate to the mass of schema-relevant information. Since the comprehender allocates many resources to deviations from the schema, this information is later remembered better than typical information. Some researchers have implicitly or explicitly adopted the attention-elaboration hypothesis (Bobrow & Norman, 1975; Hastie, 1980; Hastie & Kumar, 1979; Srull, 1981). The attention-elaboration hypothesis has received its share of support.
Experiments have confirmed that atypical information draws more cognitive resources than does typical information when resources are measured by study times during prose comprehension (Bellezza & Bower, 1981b; den Uyl & van Oostendorp, 1980) and by eye movements during the inspection of visual input (Friedman, 1979; Loftus & Mackworth, 1978). Memory experiments have also supported the prediction that memory discrimination is poorer for typical than atypical information. Schema-inconsistent information is remembered better than schema-consistent information when passages involve scripts (Bower et al., 1979) or stereotypes (Hastie, 1980; Hastie & Kumar, 1979; Srull, 1981). There is better recognition memory for faces that have unusual features than for faces that are prototypical (Going & Read, 1979; Light, Kayra-Stuart, & Hollander, 1979). Irrelevant information is remembered better than schema-relevant information when the stimulus material involves pictorial scenes

Schema, Comprehension, and Memory


(Friedman, 1979; Goodman, 1980), scripted passages (Graesser, 1981; Graesser, Gordon, & Sawyer, 1979; Graesser, Woll, Kowalski, & Smith, 1980), and stereotype-based descriptions of people (Woll & Graesser, 1982). However, the causal relationship between resource allocation and memory has generally been difficult to pin down (see Reynolds & Anderson, 1980), and there are reasons for doubting that resource allocation explains the effect of typicality on memory (Friedman, 1979; Graesser, 1981; Light et al., 1979). The partial copy model was recently introduced by Bower, Black, and Turner (1979) in the context of scripted passages. According to this model, when a scripted passage is comprehended, two different memory codes are established. The first code, called the episodic memory structure, is simply a record of actions explicitly mentioned in the passage. Access to this episodic memory structure is restricted to shorter retention intervals. This code fades quickly over time at an exponential rate; it would not be accessible after a long retention interval such as a day. The second code is established in what Bower et al. call the knowledge store. The knowledge store corresponds to the generic script. Actions in the script are activated by the passage information. Explicit actions in the passage produce a high level of activation in the corresponding actions of the script. Script-relevant actions not stated in the passage receive a lower degree of activation. Memory discrimination via the general knowledge store decays independently from that of the episodic memory structure. After a long retention interval, the different levels of activation are virtually indistinguishable. Therefore, subjects would find it difficult to distinguish presented from unpresented script actions. Bower et al. reported findings that support the partial copy model. 
First, unstated typical script actions had high false alarm rates on recognition tests and high intrusion rates on recall tests. False alarms are high for these items because they are activated in the generic script. Second, memory discrimination between stated versus unstated actions was higher for irrelevant actions than typical actions. This predicted outcome also follows from the fact that unstated typical actions receive some activation during encoding, but unstated irrelevant actions do not. Third, unstated script actions had higher false alarm rates (recognition ratings) when the subject read more and more versions of a given scripted activity, for example, listening to one versus three passages of a “visit health professional” script. An unstated typical action should receive higher activation when the subject reads three scripts rather than one script, because the action would receive several versus only one activation in the generic script. The schema pointer plus tag (SP+T) model also predicts better memory for information that is atypical than typical of the schema. However,


the explanation of this typicality effect differs from that of the attention-elaboration hypothesis and the partial copy model. The SP+T model bears some resemblance to the schema with correction hypothesis introduced decades ago (Woodworth, 1958; Woodworth & Schlosberg, 1954). According to Graesser’s (1981) SP+T model, a specific memory trace is constructed when a schema-based passage (or experience) is comprehended. The memory trace consists of (a) a pointer to the generic schema, which interrelates both the stated and unstated schema-relevant information as a whole, and (b) a set of tags for information that is atypical of the schema. A tag is constructed for each atypical item, but only some typical items receive tags, namely, marginally typical actions. The untagged typical information is represented by a pointer to the schema. This implies that all of the generic schema is copied into the specific memory trace. The generic schema constitutes one chunk of information containing many typical actions; each tagged item consists of an additional chunk of information. The other properties of the SP+T model need not be discussed at this point.

Several predictions of the SP+T model have been confirmed in memory experiments involving scripted passages (Graesser, 1981; Graesser et al., 1979, 1980; Smith & Graesser, 1981) and paragraphs organized around person stereotypes and roles (Woll & Graesser, 1982). First, memory discrimination between presented and unpresented items is better for atypical than typical information. Memory discrimination is poor for typical information because this information is often copied into the memory trace even when it is not stated in the passage. Second, there is no memory discrimination for very typical information. These very typical items would always be incorporated in the memory trace, even when unstated. Third, recognition false alarm rates and recall intrusions are higher for typical than atypical items.
The unstated typical items would be copied into the memory trace by virtue of the pointer to the generic schema.

The four hypotheses and models discussed in this section do not adequately account for the effects of typicality on memory. The shortcomings will be enumerated in this article after we have reported some pertinent studies. In a later section we shall propose a model that provides an impressive fit to available data.

II. Methods

This section summarizes the methods we have adopted when preparing acquisition materials and assessing memory. Specific methodological details and constraints were incorporated in virtually all of our studies on schemas and memory. Some of these methodological details proved critical for an adequate assessment of memory.

A. PREPARATION OF ACQUISITION AND TEST MATERIALS

The acquisition materials were constructed systematically and the acquisition items were scaled on a number of informative dimensions. In most of the studies, the acquisition materials were passages containing action sequences. Most of the actions were typical of the underlying schema, whereas some actions were irrelevant (atypical). In this section we assume that the acquisition materials are passages conveying scripted action sequences.

The typical actions in a scripted activity were a subset of actions drawn from a free generation set. Subjects in a free generation group were presented a script title, such as eating at a restaurant. The subjects wrote down actions that were typical of the script. An action was included in the free generation set if it was mentioned by at least two subjects in the free generation group. Each action of a script was scaled on generic recallability, which is the likelihood that an action is listed as a typical script action in a free generation task. An action was defined as typical of a script if it was a member of the script’s free generation set. Free generation tasks have been used by other researchers as an empirical method of exposing the content of scripts (Bower et al., 1979) and stereotypes (Cantor, 1978).

The investigators supplied the atypical actions in each scripted activity. It is important to point out that the atypical actions were not bizarre, weird, or emotionally salient. The atypical actions were mundane actions that were simply irrelevant to the script. For example, putting a pen in the pocket was an atypical action in the scripted activity of eating at a restaurant. For each scripted activity, the typical and atypical actions were rated on a typicality scale by a normative rating group of subjects.
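The free generation procedure described above can be illustrated with a minimal Python sketch. It assumes that generic recallability is estimated as the proportion of free generation subjects who list an action; the function name `free_generation_norms` and the four protocols are hypothetical and are not taken from the studies cited.

```python
from collections import Counter


def free_generation_norms(protocols, min_subjects=2):
    """Map each action in the free generation set to its generic recallability.

    protocols: one list of actions per subject in the free generation group.
    An action enters the set only if at least `min_subjects` subjects list it.
    Recallability is estimated here as the proportion of subjects listing it
    (an assumption of this sketch).
    """
    n_subjects = len(protocols)
    # Count each action at most once per subject.
    counts = Counter(action for listed in protocols for action in set(listed))
    return {
        action: count / n_subjects
        for action, count in counts.items()
        if count >= min_subjects
    }


# Hypothetical protocols from four subjects for "eating at a restaurant".
protocols = [
    ["order dinner", "eat meal", "pay bill"],
    ["order dinner", "pay bill", "leave tip"],
    ["order dinner", "eat meal", "read menu"],
    ["eat meal", "pay bill", "order dinner"],
]
norms = free_generation_norms(protocols)
```

In this made-up example, actions listed by a single subject are excluded from the free generation set, so they would not count as typical script actions under the definition given in the text.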
Each action was rated on the following 6-point typicality scale: (1) very atypical; (2) moderately atypical; (3) uncertain, but probably atypical; (4) uncertain, but probably typical; (5) moderately typical; and (6) very typical. For the scripted activities investigated, the ratings of the typical actions (i.e., in the free generation set) ranged from 4.5 to 6.0, with a mean of 5.41. The ratings of the atypical actions ranged from 1.2 to 4.4, with a mean of 2.84. The generic typicality of an item was defined as its mean rating on this 6-point typicality scale. A counterintuitive finding in previous studies has been the low correlation between generic recallability and generic typicality within the set of

typical actions. For scripts, we found a nonsignificant correlation, r = .10 (Graesser et al., 1980). The fact that an item is very typical of a schema does not ensure that the item will be articulated in a free generation task. A very typical action may be difficult to capture in words or may be inferred from other actions mentioned in a free generation protocol. In either case, the process of verbally articulating schematic knowledge is substantially different from the process of judging the typicality of information in the schema.

TABLE I
EXAMPLE PASSAGES AND TEST ACTIONS

Restaurant script (version A)
That evening Jack wanted to go out to dinner so he called a friend who recommended several good restaurants. Jack took a shower, went out to his car, picked up his girlfriend and gave his girlfriend a book. He stopped the car in front of the restaurant and had the valet park the car. They walked into the restaurant and sat for a few minutes in the waiting area until the hostess escorted them to their table. They sat down at the table, the waitress introduced herself, and they ordered cocktails. Jack talked to his girlfriend and asked her how her job was doing, and they decided on what to eat. Jack cleaned his glasses, paid the bill, and bought some mints. Then they left the restaurant and drove home.

Restaurant script (version B)
That evening Jack wanted to go out to dinner so he called a friend who recommended several good restaurants. Jack took a shower, put away his tennis racket, put on a jacket, and picked up his girlfriend. He stopped the car in front of the restaurant and had the valet park the car. Jack confirmed his reservations and they sat for a few minutes in the waiting area. Jack put a pen in his pocket and the hostess escorted them to their table. The waitress introduced herself and they ordered cocktails. Jack talked to his girlfriend and they decided on what to eat. They ordered dinner, ate their meal, and Jack picked up a napkin off the floor. Jack left a tip, left the restaurant, and they drove home.

Test actions for restaurant script
(1) Jack asked his girlfriend how her job was doing (Atypical-A)
(2) Jack ordered dinner (Typical-B)
(3) They sat down at the table (Typical-A)
(4) Jack gave his girlfriend a book (Atypical-A)
(5) Jack put on a jacket (Atypical-B)
(6) They walked into the restaurant (Typical-A)
(7) Jack left a tip (Typical-B)
(8) Jack cleaned his glasses (Atypical-A)
(9) Jack put a pen back in his pocket (Atypical-B)
(10) Jack confirmed his reservations (Typical-B)
(11) Jack picked up a napkin off the floor (Atypical-B)
(12) Jack bought some mints (Atypical-A)
(13) Jack went out to his car (Typical-A)
(14) Jack put away his tennis racket (Atypical-B)
(15) Jack paid the bill (Typical-A)
(16) They ate their meal (Typical-B)

After the actions were generated and scaled, scripted passages were prepared. There were always two versions (A and B) of each scripted activity. Version A contained a different sample of actions than version B. There were three sets of actions. Common typical actions were presented in versions A and B; these were context actions that were never analyzed or tested in later recall or recognition tasks. A set of A actions included typical and atypical actions presented in version A, but not in version B. Similarly, B actions were typical and atypical actions presented in version B, but not in version A. The A actions and B actions served as test actions for a scripted activity.

The rationale for having two versions of each scripted activity is important. This design feature permits an assessment of sophisticated guessing for each test action. In a recall task, a subject may recall a test action that was not presented. In a recognition task, a subject may decide that he or she experienced a test action that in fact was never presented. An estimate of guessing is essential for an assessment of what is remembered about a passage.

Table I shows the A and B versions of a restaurant script. Listed below these versions are the test actions associated with the scripted activity. Four typical test actions are presented in version A, but not in version B; four typical test actions are presented in B, but not in A; four atypical test actions are presented in A, but not in B; and four atypical test actions are presented in B, but not in A. For subjects who listened to version A, the typical and atypical A actions serve as target test actions and the typical and atypical B actions serve as nontarget actions.
The reverse holds true for subjects presented version B. These design and counterbalancing constraints have been imposed in all the studies conducted in our laboratory. The design features illustrated in Table I can be adopted for different types of acquisition materials. We have incorporated these design features for tape-recorded scripted activities, videotaped scripted activities, and personality descriptions organized around role and stereotype schemas. We have also constructed a Jack story, which involves a character named Jack who engages in several scripted activities.

B. MEASURES OF MEMORY

Both recall and recognition memory tests have been administered to subjects after they are presented the acquisition passages. When they receive recall tests, they are given the script title and they write down as many actions as they can remember. The experimenter points out that the acquisition passage included actions that are typical of the script and actions that are irrelevant. The subjects are encouraged to recall both types of actions.

When a recognition test is administered, the subjects rate each test action on the following 6-point scale: (1) positive that the item was not presented; (2) fairly sure that the item was not presented; (3) uncertain, but guess that the item was not presented; (4) uncertain, but guess that the item was presented; (5) fairly sure that the item was presented; and (6) positive that the item was presented. Decisions of 4, 5, and 6 constitute YES judgments, whereas 1, 2, and 3 are NO judgments.

An appropriate assessment of memory involves a discrimination between presented actions and actions that were not presented in an acquisition passage. Performance on a recognition test improves to the extent that subjects say YES to target actions and NO to nontarget actions. A subject’s hit rate is the likelihood of saying YES to presented (target) actions, whereas the false alarm rate is the likelihood of saying YES to unpresented (nontarget) actions. Recognition memory improves with an increase in the difference between hit rates and false alarm rates, that is, [p(hit) - p(false alarm)]. An alternative and more widely accepted measure of memory discrimination is a d’ score (Green & Swets, 1966; Kintsch, 1977). Sometimes we shall refer to a Memory Score, which is a measure of memory that corrects for guessing.

\[ \text{Memory score} = \frac{p(\text{hit}) - p(\text{false alarm})}{1 - p(\text{false alarm})} \tag{1} \]

When passages are tested by recall, analyses focus exclusively on the test actions (see Table I). A recall proportion consists of the likelihood of recalling a target test action. An intrusion proportion is the likelihood of recalling a nontarget test action. We do not score actions that are not test actions. Thus, when version A is presented, we score only intrusions that are B actions; when version B is presented, we score only intrusions that are A actions. Good memory discrimination consists of a high recall proportion and low intrusion proportion. A d’ score is normally not computed for recall, but Memory Scores may be computed. Analogous to formula 1, the Memory Score for recall is shown in formula 2.

\[ \text{Memory score} = \frac{p(\text{recall}) - p(\text{intrusion})}{1 - p(\text{intrusion})} \tag{2} \]
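The measures in this section can be made concrete with a short Python sketch. It assumes ratings are pooled over test actions for a single subject, and it computes d’ with the standard equal-variance signal detection formula, which is an assumption here rather than a procedure stated in the text. All ratings are invented for illustration, and the function names are hypothetical.

```python
from statistics import NormalDist


def is_yes(rating):
    """Ratings of 4, 5, or 6 on the 6-point scale count as YES judgments."""
    return rating >= 4


def proportion_yes(ratings):
    """Proportion of test actions that received a YES judgment."""
    return sum(is_yes(r) for r in ratings) / len(ratings)


def memory_score(p_target, p_nontarget):
    """Formulas 1 and 2: correct the target rate for guessing.

    For recognition pass (hit rate, false alarm rate); for recall pass
    (recall proportion, intrusion proportion). Undefined if p_nontarget is 1.
    """
    return (p_target - p_nontarget) / (1 - p_nontarget)


def d_prime(p_hit, p_false_alarm):
    """Equal-variance d' (assumed form); undefined for rates of exactly 0 or 1."""
    z = NormalDist().inv_cdf
    return z(p_hit) - z(p_false_alarm)


# Hypothetical ratings from one subject who heard version A.
target_ratings = [6, 5, 4, 2]      # A actions (presented)
nontarget_ratings = [1, 2, 4, 3]   # B actions (not presented)

p_hit = proportion_yes(target_ratings)
p_fa = proportion_yes(nontarget_ratings)
score = memory_score(p_hit, p_fa)
```

With these invented ratings the hit rate is .75 and the false alarm rate is .25, so the Memory Score is (.75 - .25)/(1 - .25). The same `memory_score` function serves for recall when it is given a recall proportion and an intrusion proportion.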


III. A Schema Copy Plus Tag Model

In this section we will describe the assumptions of our schema copy plus tag (SC+T) model and report some data that support it. This model is almost the same as the schema pointer plus tag (SP+T) model described in Section I,C. In fact, the two models are identical except for one property. In the SP+T model, the memory trace contained a pointer to the generic schema as a whole. This constraint implies that all information in the generic schema is passed to the specific memory trace. However, it seems more plausible that only a subset of the information in a schema is passed to the memory trace. The SC+T model reflects our change in attitude. According to the SC+T model, only a subset of the information in a generic schema is copied into the specific memory trace. Of course, an important issue is what schema information is copied into the memory trace. We shall address this issue later. For the present, we want to emphasize that much of the schema is copied into the memory trace, but usually not all of it.

A. REPRESENTATIONAL ASSUMPTIONS AND PREDICTIONS

Three assumptions of the SC+T model pertain to the representation and content of specific memory traces. According to the SC+T model, a specific memory trace is constructed whenever a specific passage (or experience) is comprehended at a specific time and place. We shall assume for the moment that the passage (or experience) is organized around one central schema. We shall also assume that passages serve as acquisition materials. Of course, the scope of our model would extend to experiences other than discourse comprehension.

Assumption 1

The memory representation for a passage contains a partial copy of the generic schema that best fits the input statements compared to the set of alternative schemas in memory. Very typical items are always copied into the trace, yielding a set of very typical items (1 to n).

Assumption 2

Some items in a passage are only moderately typical of a generic schema. Other items are relevant to a generic schema, but do not fit in with other typical items that are explicitly stated. These typical items are linked to the schema copy with tags (i.e., an associative relation that signifies a contrast). There is a unique tag for each of these typical items, yielding a set of tagged typical items (1 to m).

Assumption 3

Some items in a passage are atypical of the generic schema and are linked to the memory representation with tags. There is a distinct tag for each atypical item, yielding a set of tagged atypical items (1 to q).

With these three assumptions, we are ready to make some predictions about memory for specific passages. One prediction is that memory discrimination should be better for tagged typical actions (assumption 2) and tagged atypical actions (assumption 3) than for actions that are not tagged (assumption 1). Since many typical actions are inferred by virtue of the schema content being copied into the memory trace, the subject would be unable to remember whether a typical action was explicitly stated or merely inferred. On the other hand, the tagged actions would be distinct and salient in memory in the form of a separate organizational unit which contrasts with the mass of typical information. All the atypical actions are tagged, yet only a subset of the typical actions are tagged. Since tagging directly predicts memory discrimination, memory discrimination will be better for atypical than typical actions.

A second prediction is that there should be no memory discrimination for the very typical actions. These actions would be inferred in virtually any scripted activity. The very typical actions (with a 6.0 typicality rating) would be copied into the memory trace when they are stated and when they are not stated in a passage. Subjects should be unable to decide correctly whether these actions were presented in a passage.

A third prediction is that false alarm rates and intrusion proportions should be higher for typical actions than atypical actions. A subset of the typical schema actions would be copied into the memory trace, even though these actions were never stated in the passage. These typical actions should evoke false alarms on recognition tests and intrusions on recall tests. However, unpresented atypical actions would not be copied into the memory trace, so their false alarm rates and intrusion proportions should be lower.

A fourth prediction is that hit rates and recall proportions should not vary with typicality in any simple, elegant manner. On the one hand, hit rates and recall proportions should increase with typicality because the likelihood of being copied into the memory trace increases with typicality. On the other hand, hit rates and recall proportions should decrease with typicality because the likelihood of an action being tagged decreases with typicality.
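The three representational assumptions can be sketched as a small data structure in Python. This is only an illustration under simplifying assumptions that the model itself does not state: items absent from the schema dictionary are treated as atypical, only items rated 6.0 are copied when unstated, and a hypothetical cutoff (`tag_below`) decides which stated typical items receive tags. The ratings and names are invented.

```python
from dataclasses import dataclass, field

VERY_TYPICAL = 6.0  # typicality rating at which an item is always copied


@dataclass
class MemoryTrace:
    schema_copy: set = field(default_factory=set)      # untagged copied items
    tagged_typical: set = field(default_factory=set)   # assumption 2
    tagged_atypical: set = field(default_factory=set)  # assumption 3


def encode(stated, schema_typicality, tag_below=5.5):
    """Build a trace from stated actions and a schema {action: typicality}.

    tag_below is a hypothetical cutoff: stated typical items rated below it
    are tagged, the rest are stored untagged in the schema copy.
    """
    trace = MemoryTrace()
    # Assumption 1: very typical items are copied whether stated or not.
    for item, rating in schema_typicality.items():
        if rating >= VERY_TYPICAL:
            trace.schema_copy.add(item)
    for item in stated:
        rating = schema_typicality.get(item)
        if rating is None:
            trace.tagged_atypical.add(item)   # not in the schema: atypical
        elif rating < tag_below:
            trace.tagged_typical.add(item)    # moderately typical: tagged
        else:
            trace.schema_copy.add(item)       # typical: stored untagged
    return trace


def is_discriminable(trace, item):
    """Only tagged items can be told apart from inferred schema content."""
    return item in trace.tagged_typical or item in trace.tagged_atypical


# Hypothetical restaurant schema and a passage that never states the meal.
schema = {"ate the meal": 6.0, "paid the bill": 5.8, "confirmed reservations": 4.8}
stated = ["paid the bill", "confirmed reservations", "cleaned his glasses"]
trace_a = encode(stated, schema)
```

In this sketch the unstated action "ate the meal" ends up in the schema copy, which mirrors the second and third predictions: the trace gives no basis for deciding whether a very typical action was presented, and such an action invites a false alarm or intrusion.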


These four predictions of the SC+T model have been consistently supported in our previous experiments. We have confirmed all four predictions when the acquisition materials involved scripts (Graesser, 1981; Graesser et al., 1979; Graesser et al., 1980; Smith & Graesser, 1981) and roles or stereotypes (Woll & Graesser, 1982). The predictions were supported for recognition tests and for recall tests. Figure 1 summarizes the outcomes of the recognition studies we have conducted. Analogous trends would occur for recall by substituting recall proportions for hit rates, intrusion proportions for false alarms, and Memory Scores (see formula 2) for d’ scores. The predictions of the SC+T model are supported by these trends. First, there is better memory discrimination for atypical actions than typical actions (see d’ scores). Second, d’ scores are essentially zero for very typical actions (6.0 typicality ratings); the hit rates do not differ from false alarm rates for these actions.

[Figure 1: panel A plots hit rate and false alarm rate and panel B plots d’ score, each against Typicality of Information from Very Atypical to Very Typical.]

Fig. 1. Hit rates, false alarm rates (A), and d’ scores (B) as a function of the typicality of an item with respect to a schema.


Third, false alarm rates increase with typicality, particularly within the typicality interval of 4 to 6. Fourth, hit rates vary only modestly with typicality. In fact, hit rates have increased with typicality in some experiments, have decreased with typicality in other experiments, and have shown no change in other experiments. In summary, the four predictions of the SC+T model have been consistently supported in recognition and recall studies.

B. GUESSING ASSUMPTIONS AND PREDICTIONS

Loosely speaking, subjects are guessing when there are false alarms in a recognition test and intrusions in a recall test. Subjects often guess that typical actions were presented, but rarely guess that atypical actions were presented. Unfortunately, the term “guessing” has connotations that are inappropriate in the present context. The false alarms and intrusions are not products of coin tossing or random generations of responses. These guesses are based on schematic knowledge and reflect encoding processes (i.e., copying mechanisms), rather than processes invoked when there is a lack of information to serve as decision criteria. Despite these unfortunate connotations, we shall use “guessing” in this section for lack of a better word.

Correlational analyses have been performed on the typical actions (not the atypical) in order to examine guessing processes. Guessing on a recognition test can be predicted primarily by an action’s generic typicality and secondarily by an action’s generic recallability. For scripts, the correlation between false alarms (i.e., guessing on a recognition test) and generic typicality is substantial, r = .42, p < .05 (Graesser et al., 1980). The correlation between false alarms and generic recallability is low, but consistent, r = .19, p < .10 (Graesser et al., 1980). As we mentioned earlier, typical actions show a low or nonsignificant correlation between generic recallability and generic typicality.

Guessing on a recall test is predicted by an action’s generic recallability, but not by its generic typicality. Typical actions in scripts show a robust correlation between intrusion proportions (i.e., guessing on a recall test) and generic recallability, r = .67, p < .05 (Graesser et al., 1980). Not surprisingly, both recall and free generation tasks share similar mechanisms for the articulation of linguistic codes. These articulation mechanisms do not map directly onto the generic typicality for the set of typical actions.
Moreover, generic typicality does not significantly correlate with intrusion proportions, r = .03.

On recognition tests, subjects occasionally decide that very atypical nontarget actions were presented. These incorrect decisions suggest that subjects occasionally guess randomly on recognition tests. Obviously, an individual would not infer the occurrence of an irrelevant action in a scripted activity.

The earlier SP+T model introduced three guessing assumptions which capture the above trends in recognition and recall tests (Graesser, 1981). These three assumptions will be incorporated into the SC+T model:

Assumption 4

The likelihood of guessing a typical item at recall increases with its generic recallability.

Assumption 5

The likelihood of guessing YES to a typical item at recognition increases with its generic typicality and to a smaller extent with its generic recallability.

Assumption 6

Individuals sometimes guess YES randomly on a recognition test, but this is unrelated to either generic typicality or generic recallability.
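The qualitative content of assumptions 4 through 6 can be written as two small Python functions. The linear form and every weight below are hypothetical choices made only to show the direction of each effect; they are not estimates from the studies cited, and the function names are invented.

```python
def p_guess_recall(recallability, gain=0.5):
    """Assumption 4: intrusions rise with generic recallability alone.

    recallability is a proportion between 0 and 1; gain is hypothetical.
    """
    return gain * recallability


def p_guess_recognition(typicality, recallability, random_yes=0.05,
                        w_typicality=0.08, w_recallability=0.05):
    """Assumptions 5 and 6: schema-based guessing plus random YES responses.

    typicality is a rating on the 6-point scale. The weight on typicality is
    larger than the weight on recallability (assumption 5), and random_yes is
    a constant unrelated to either predictor (assumption 6).
    """
    schema_based = (w_typicality * (typicality - 1)
                    + w_recallability * recallability)
    return min(1.0, random_yes + schema_based)


# A very atypical nontarget is guessed only through the random component.
baseline = p_guess_recognition(typicality=1.0, recallability=0.0)
```

Under these made-up weights a very typical nontarget draws more YES guesses than a marginally typical one, whereas a very atypical nontarget is accepted only at the constant random rate.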

C. RETRIEVAL ASSUMPTIONS FOR RECALL AND RECOGNITION

The retrieval assumptions in the SP+T model were different for recall and recognition tests. Such differences are also assumed to exist in the SC+T model. When subjects recall a scripted activity, they are presented a script title (eating at a restaurant) and are asked to recall all actions that were mentioned in the activity. When subjects complete a recognition test they receive test actions in addition to the script title. The retrieval cues are quite different for recall and recognition. It is not surprising, therefore, that the retrieval processes would differ.

When subjects recall a scripted activity, their recall is guided by an organized retrieval strategy. This retrieval is said to be conceptually driven. Since the subjects are prompted by a script title, the script has a substantial impact on the conceptually driven retrieval strategy. Conceptually driven retrieval often requires effort to accomplish as the individual invokes some strategy to access and decode the contextually specific memory trace.

Assumption 7

Recall of an item is directed by an organized retrieval strategy that is conceptually driven and influenced by the generic schema.

Recognition tasks involve more than conceptually driven retrieval. Some test actions may be retrieved by conceptually driven retrieval, but others are accessed by data-driven retrieval. A test item on a recognition test contains information that provides a more direct access to the item in memory. The test item serves as a copy cue, that is, a rich configuration of information that has a close match to the original encoding. Data-driven retrieval is analogous to “detection of familiarity” (Atkinson & Juola, 1974) and “intraitem elaboration” (Mandler, 1980) in models of recognition memory. Data-driven retrieval is often accomplished quickly, is not always guided by a strategy, and does not always require a reinstatement of the original script context in which the item was embedded. Thus, individuals may correctly recognize that Jack put a pen in his pocket, but forget whether this action occurred in a restaurant script or in a packing-for-vacation script. Graesser (1981) has described the differences between data-driven retrieval and conceptually driven retrieval in more detail.

Assumption 8

Recognition of an item is accomplished either by data-driven retrieval through the test item as a copy cue, or by a conceptually driven retrieval strategy.

The SC+T model adopts a dual-process model of recognition memory. In this sense, the SC+T model is compatible with dual-process models of word recognition (Atkinson & Juola, 1974; Kintsch, 1977; Mandler, 1972, 1980). Whereas recall is guided by conceptually driven retrieval, recognition is guided by conceptually driven and data-driven processes.

D. RETENTION ASSUMPTIONS AND PREDICTIONS

According to the SC+T model, the retention functions differ for conceptually driven and data-driven retrieval. Moreover, the rate of decay is different for atypical and typical actions. In the present discussion, rate of decay is simply a description of the retention function. The decay rate is undoubtedly a product of interference mechanisms. Two retention assumptions are associated with conceptually driven retrieval. Assumption 9 addresses the impact of typicality on decay rate, and assumption 10 addresses the shape of the decay function. Assumption 9

As the retention interval increases, the schema plays a more important role in guiding conceptually driven retrieval. Thus, atypical items have a faster decay rate than typical items.

Schemas, Comprehension, and Memory

77

Assumption I0

The likelihood of accessing an item via conceptually driven retrieval decreases exponentially over the retention interval. Assumption 9 agrees with the hypothesis that memory shifts from being reproductive (faithfully close to what is stated) to being reconstructive (close to the schema) at longer retention intervals (Bartlett, 1932; Cofer, Chmielewski, & Brockway, 1976; D’Andrade, 1974; Kintsch & VanDijk, 1978; Spiro, 1977). The schema has a more central role in guiding conceptually driven retrieval as the retention interval increases. As the retention interval increases, there is a greater bias to retrieving typical tagged items than atypical tagged items. According to assumption 10, there is an exponential decay rate, which agrees with Ebbinghaus’s classical observations about recall and retention interval. If assumptions 9 and 10 are correct, then an interaction should occur between typicality and retention interval for conceptually driven retrieval. The interaction reflects the claim that atypical actions decay at a faster rate than typical actions. Such an interaction has in fact been reported in previous studies involving scripted activities (Graesser , 1981;Graesser et al., 1980; Smith & Graesser, 1981). The data actually show a crossover. Recall for atypical actions is better than recall for typical actions through 2 or 3 days; after a few days there is better recall for typical than for atypical actions. This crossover is depicted in Fig. 2. Two retention assumptions are associated with data-driven retrieval. Assumption 11 addresses the impact of typicality on decay rate, and assumption 12 addresses the shape of the decay rate. Assumption I 1

The decay rate is the same for typical and atypical items when tagged items are accessed via data-driven retrieval.

Assumption 12

The likelihood of accessing an item via data-driven retrieval approximates a linear decrease over time.

[Fig. 2. Panels: Conceptually driven retrieval; Data-driven retrieval. Curve label: Atypical.]

Data-driven retrieval does not always require a reinstatement of the original passage context and is not always substantially influenced by a strategy. Consequently, typical and atypical items have comparable decay rates. The linear decay rate probably is a simplification of the true decay function for data-driven retrieval. Data-driven retrieval is guided by a copy cue, which invokes several dimensions and levels of information to serve as retrieval cues. The retention function that corresponds to any one dimension may be exponential. However, the resolution of several exponential functions, with different decay rates, is a decay function that approaches linearity. If assumptions 11 and 12 correctly specify data-driven retrieval, then there should be no interaction between typicality and retention interval. When memory scores are computed from recognition tests, there are significant interactions (Graesser, 1981; Graesser et al., 1980; Smith & Graesser, 1981). However, both conceptually driven retrieval and data-driven retrieval contribute to the Memory Scores. An accurate assessment of data-driven retrieval would partial out contributions from conceptually driven retrieval. Graesser (1981) used recall memory scores as estimates


of conceptually driven retrieval at recognition. After these estimates were partialed out of the recognition memory scores, the resulting functions for data-driven retrieval were plotted. These corrected data-driven functions approached linearity and the lines were nearly parallel for typical and atypical script actions. As expected, retrieval was better for atypical than typical actions. The decay rate was roughly constant for the two types of actions; if anything, the decay rate was steeper for typical than atypical actions. The pattern of data-driven retrieval is depicted in Fig. 2. Smith and Graesser (1981) assessed memory for typical versus atypical script actions after retention intervals of ½ hour, 2 days, 1 week, and 3 weeks. The actions of a given scripted activity were tested at only one of the four retention intervals by either a recall or a recognition test (but never both). The Memory Scores for recall and recognition supported the assumptions of the SC+T model. A mathematical simulation of the data demonstrated that the proposed assumptions provided a better fit to the data than alternative assumptions. An exponential decay function for conceptually driven retrieval provided a better fit than a linear decay function; a linear decay function for data-driven retrieval was better than an exponential function. The decay rate for conceptually driven retrieval was steeper for atypical than typical actions; the data-driven decay rates were roughly the same for atypical and typical actions. Memory was initially better for atypical than typical actions in both data-driven and conceptually driven retrieval. A dual-process recognition mechanism showed a better fit to the Memory Scores than did a single-process recognition mechanism. In summary, the SC+T model explains memory for information that varies in typicality with respect to a central organizing schema.
The model can explain recall and recognition of typical versus atypical actions after varying retention intervals. Recognition is uniformly better for atypical than typical actions at all retention intervals. Recall is initially better for atypical than typical actions, but the opposite holds true after 3 or 4 days. Thus, typical actions are remembered best only when memory is assessed by recall after a long retention interval.
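The retention assumptions summarized above can be illustrated with a short numerical sketch. It is not the mathematical simulation reported by Smith and Graesser (1981): the function names and every parameter value below are hypothetical, chosen only so that the conceptually driven curves (exponential, with faster decay for atypical items) cross after a few days while the data-driven lines (linear, with equal slopes) stay parallel.

```python
import math


def conceptually_driven(t_days, p0, rate):
    """Exponential retention (assumption 10): p0 * exp(-rate * t)."""
    return p0 * math.exp(-rate * t_days)


def data_driven(t_days, p0, slope):
    """Linear retention (assumption 12), floored at zero."""
    return max(0.0, p0 - slope * t_days)


def crossover_day(p0_atyp, rate_atyp, p0_typ, rate_typ):
    """Day on which two exponential curves intersect (rates must differ)."""
    return math.log(p0_atyp / p0_typ) / (rate_atyp - rate_typ)


# Hypothetical parameters, NOT fitted to any data in the chapter.
CONCEPTUAL = {
    "atypical": {"p0": 0.60, "rate": 0.25},  # starts higher, decays faster
    "typical": {"p0": 0.40, "rate": 0.10},   # starts lower, decays slower
}
DATA_DRIVEN = {
    "atypical": {"p0": 0.50, "slope": 0.01},  # equal slopes (assumption 11)
    "typical": {"p0": 0.30, "slope": 0.01},
}

t_star = crossover_day(0.60, 0.25, 0.40, 0.10)
print(f"conceptually driven curves cross at about {t_star:.1f} days")

for day in (0, 2, 7, 21):
    c_a = conceptually_driven(day, **CONCEPTUAL["atypical"])
    c_t = conceptually_driven(day, **CONCEPTUAL["typical"])
    d_a = data_driven(day, **DATA_DRIVEN["atypical"])
    d_t = data_driven(day, **DATA_DRIVEN["typical"])
    print(f"day {day:2d}  conceptual A/T {c_a:.3f}/{c_t:.3f}  "
          f"data-driven A/T {d_a:.3f}/{d_t:.3f}")
```

With these illustrative values the atypical advantage in conceptually driven retrieval reverses between day 2 and day 7, whereas the data-driven advantage for atypical items stays constant across the retention interval.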

IV. Some Issues Confronting the SC+T Model

The data reported in the previous section provide encouraging support for the SC+T model. However, some questions have not been answered regarding the role of schemas in comprehension and memory. The purpose of this section is to address some questions that colleagues have raised and to report research that should help clarify these issues.

A. DOES THE TYPICALITY EFFECT OCCUR FOR DIFFERENT KINDS OF SCHEMAS?

We believe that the reported effects of typicality on memory generalize to knowledge domains other than scripted activities. The same effects should emerge for person schemas, visual scenarios, and schemas that correspond to other knowledge domains. In Section I,C we reported studies that confirm the typicality effect for picture memory. In a study by Woll and Graesser (1982) subjects listened to descriptions of fictitious people and were later given a recognition test on actions and traits. The actions and traits varied in typicality with respect to a person schema which was foregrounded at the beginning of the personality description. We examined person schemas that corresponded to roles (e.g., professor, cowboy) and stereotypes (e.g., macho male, aggressive female). As with the scripted activities, there were two versions of each personality description, so that the target items of version A were nontarget items in version B, and vice versa. The pattern of d' scores in three studies consistently supported the SC+T model. Memory was substantially better for atypical than typical actions and there was no memory discrimination for very typical information (d' = .10). A few studies in the social cognition area have reported better memory for information that is congruent (typical) with a person schema than information that is irrelevant (e.g., Hastie & Kumar, 1979; Hastie, 1980). However, the memory measures in these studies did not control for guessing, so the data are difficult to interpret. We have reexamined the means reported in the Hastie and Kumar study and have estimated a memory score based on a reasonable guessing likelihood (.10). These memory scores show better memory for irrelevant than typical items. It is important to control for guessing and response biases when assessing the impact of stereotypes on memory (see Bellezza & Bower, 1981a; Clark & Woll, 1981).

B. DOES THE TYPICALITY EFFECT PERSIST WHEN MORE THAN ONE SCHEMA GUIDES COMPREHENSION?

In our previous studies, passages were organized around a central schema. However, most passages and experiences foreground more than one schema. The schemas correspond to different knowledge domains and levels of structure. We have recently conducted some memory studies on passages that invoke both a script schema and a person schema. This research will be reported here. The acquisition passages contained actions that varied in typicality


with respect to a role schema and with respect to a script. At the beginning of the passage, the relevant script and role were identified, for example, “Bill was a professor and he decided to go horseback riding.” Here professor is the role schema and riding a horse is the script. After this introductory statement, the passage included a series of actions which varied in typicality. There were four categories of test actions: (1) script typical and role typical, (2) script typical and role atypical, (3) script atypical and role typical, and (4) script atypical and role atypical. In addition to these critical actions, which were later tested, each passage contained several context actions typical of the role or script. As in the previous studies, there were several passage versions so that a test item was a target in some versions and a nontarget in other versions. Free generation groups and normative rating groups were run in order to systematically prepare the acquisition passages. In addition, special versions and counterbalancing procedures were employed, so that a given test action was typical in some versions and atypical in others. For example, put on a hat would be atypical for a professor; the same action would be typical of a cowboy role in a passage introduced as “Bill was a cowboy and he decided to go horseback riding.” Therefore, any effects of typicality on memory could not be attributed to intrinsic properties of the items such as imagery, concreteness, or salience. Eighty subjects listened to eight experimental passages that were designed with the above constraints. Approximately 30 min later they completed a recognition test on the test actions using the 6-point rating scale. Altogether there were 128 test actions, with 16 per passage. Again the typicality of a test action varied, depending on the passage version that the subject received. The recognition data are shown in Table II.
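Because the analyses in this section rest on d' rather than on raw hit rates, a brief sketch of how a discrimination score can be derived from a hit rate and a false alarm rate may be useful. This passage does not state how d' was obtained from the 6-point ratings, so the code assumes the common equal-variance Gaussian formula, d' = z(hit rate) - z(false alarm rate); `d_prime` is a hypothetical helper and the rates are made-up illustration values, not data from the studies.

```python
from statistics import NormalDist


def d_prime(hit_rate, false_alarm_rate):
    """Equal-variance d': z(hit rate) - z(false alarm rate).

    Both rates must lie strictly between 0 and 1; NormalDist.inv_cdf
    raises StatisticsError at exactly 0 or 1.
    """
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(false_alarm_rate)


# Hypothetical rates for illustration only.
discriminating = d_prime(0.70, 0.30)  # old items accepted more often than new
guessing = d_prime(0.80, 0.80)        # high hit rate, but no discrimination

print(f"d' with hits .70, false alarms .30: {discriminating:.2f}")
print(f"d' with hits .80, false alarms .80: {guessing:.2f}")
```

The second case shows why a hit rate alone is hard to interpret: a subject who says "yes" to 80% of both presented and unpresented items has a high hit rate but no memory discrimination. A d' averaged over individual subjects need not equal the value computed from the group's mean hit and false alarm rates.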
Table II includes d' scores, hit rates, and false alarm rates for the four categories of actions.

TABLE II
RECOGNITION MEMORY AS A FUNCTION OF SCRIPT TYPICALITY AND ROLE TYPICALITY

                           Script typical                  Script atypical
Recognition measure    Role typical   Role atypical    Role typical   Role atypical
d' score                   .50            .71              .98           1.12
Hit rate                   .68            .73              .71            .63
False alarm rate           .55            .50              .36            .25

For present purposes, we shall focus on the d' scores because d' provides an accurate assessment of memory discrimination that controls for guessing. The pattern of d' scores supported the SC+T model. The d' scores were significantly higher for script-atypical than script-typical actions, 1.05 versus .61, respectively, F(1, 79) = 87.13, p < .05. The d' scores were significantly higher for role-atypical than role-typical actions, .92 versus .74, respectively, F(1, 79) = 6.91, p < .05. The interaction between role typicality and script typicality was not significant. The pattern of d' scores supported the claim that the scripts were more critical than the role schemas in guiding comprehension and memory for the passages. A difference score between atypical and typical items may be used as an index of the impact of a schema on comprehension processes. This difference score was .44 (1.05 - .61) for scripts and .18 (.92 - .74) for roles. Since script typicality predicted memory better than role typicality, it appears that there is a script bias in comprehension and memory. Some follow-up experiments were conducted in order to assess the robustness of the script bias. In one follow-up experiment we varied the instructions that subjects received before listening to the passages. In a script emphasis condition, subjects were instructed to pay careful attention to the actions that the fictitious characters performed. In a role emphasis condition, subjects were instructed to pay careful attention to the characters’ personalities. Forty subjects participated in each of these conditions. The acquisition passages, recognition test booklets, and recognition scale were identical to those in the previous study. Table III shows the recognition data for subjects in the script emphasis and role emphasis conditions. The instructions had no impact on memory for the passages.
The main effect of instructions was nonsignificant, and instructions did not significantly interact with role typicality or script typicality. The latter variables did significantly predict memory and consistently supported the SC+T model. Script-atypical actions had significantly higher d' scores than the script-typical actions, 1.09 versus .56, respectively, F(1, 78) = 64.02, p < .05. Role-atypical actions had significantly higher d' scores than did role-typical actions, .90 versus .75, respectively, F(1, 78) = 4.82, p < .05. The role typicality × script typicality interaction was not significant. The atypical-typical difference scores supported the idea of a script bias. The difference score was .54 for scripts and .15 for roles. Varying instructions did not have an impact on script bias. Another study was conducted to assess further the robustness of script bias. We modified the passages in order to attenuate any potential emphasis on the scripts. This was accomplished by deleting all the context script


TABLE III
RECOGNITION MEMORY AS A FUNCTION OF SCRIPT TYPICALITY, ROLE TYPICALITY, AND INSTRUCTIONAL SET

                                              Script typical                  Script atypical
Recognition measure   Instructional set   Role typical   Role atypical    Role typical   Role atypical
d' score              Script emphasis         .48            .66              .97           1.17
                      Role emphasis           .44            .67             1.12           1.08
Hit rate              Script emphasis         .71            .75              .70            .61
                      Role emphasis           .70            .75              .71            .61
False alarm rate      Script emphasis         .55            .54              .36            .23
                      Role emphasis           .56            .53              .33            .21

actions (i.e., typical script actions that were presented in all passage versions). The rewritten passages contained a higher proportion of role-relevant actions and very few script-relevant actions. Otherwise the passages were identical to the previous passages. The recognition booklets and the recognition scale were also identical to the previous two studies. Forty subjects participated in the experiment. Table IV shows the recognition data in the follow-up study. The pattern of d' scores supported the script bias idea as well as the SC+T model. The d' scores were significantly higher for script-atypical than script-typical actions, 1.04 versus .68, respectively, F(1, 39) = 22.96, p < .05. The d' scores were significantly higher for role-atypical than role-typical actions, .96 versus .76, respectively, F(1, 39) = 8.51, p < .05. There was a nonsignificant script typicality × role typicality interaction. The atypical-typical difference score was greater for scripts (.36) than for roles (.20), which supports the notion of a script bias. In summary, the three experiments support the SC+T model. The typicality effect persists when more than one schema is foregrounded during comprehension. Each schema has an independent effect on memory, with schema-relevant information being remembered less well than


TABLE IV
RECOGNITION MEMORY AS A FUNCTION OF SCRIPT TYPICALITY AND ROLE TYPICALITY WHEN PASSAGES EMPHASIZE ROLE PROCESSING

                           Script typical                  Script atypical
Recognition measure    Role typical   Role atypical    Role typical   Role atypical
d' score                   .55            .80              .96           1.12
Hit rate                   .63            .70              .72            .67
False alarm rate           .43            .42              .38            .26

schema-irrelevant information. We also found a script bias in these passages, which foregrounded both a script and a role. Compared to the role schema, the script had a more robust impact on comprehension and memory. This script bias was not influenced by the comprehenders’ goals (instructions) or by the ratio of role-relevant to script-relevant information. Scripts are apparently more central organizing schemas than are roles.

C. DOES THE TYPICALITY EFFECT OCCUR WHEN SCRIPTED ACTIVITIES ARE VIDEOTAPED?

The studies we have reported so far have one common property: The acquisition materials have involved passages. What happens when the acquisition material is nonverbal? In order to answer this question, we compared memory for actions in videotaped scripted activities and tape-recorded scripted activities. The actions and scripts were identical in the videotaped and the tape-recorded action sequences. There were four experimental scripts: setting the table, polishing shoes, fixing lunch, and typing a letter. There were two versions (A and B) of each scripted activity, so that a given test action was presented in one version, but not the other. The tape-recorded scripted activities were recorded at a medium rate of approximately 150 words per min. The videotaped scripted activities had no sound. In order to foreground the appropriate script, the script title was presented on the screen immediately prior to the videotaped action sequence. All four experimental scripted activities were enacted in approximately 10 min. Thirty subjects were assigned to the videotaped condition and 30 to the tape-recorded condition. Approximately 15 min after viewing or listening to the scripts, subjects completed a recognition test using the 6-point recognition scale. Half of the 48 test actions were typical and half were


atypical; half of the actions were target actions and half were nontarget actions for any given subject. Table V shows the recognition data for the videotaped and tape-recorded scripted activities. The d' scores were significantly higher for atypical actions than typical actions, 1.90 versus 1.27, respectively, F(1, 58) = 28.52, p < .05. The d' scores were slightly, but not significantly, higher in the videotaped condition than in the tape-recorded condition, 1.79 versus 1.38, respectively, F(1, 58) = 3.37, p < .07. There was no significant interaction between typicality and presentation mode, F(1, 58) = 2.42, .10 < p < .13. These data confirm the typicality effect and support the SC+T model. The atypical-typical difference score was .45 (2.01 - 1.56) for videotaped scripts and .81 (1.78 - .97) for tape-recorded scripts. Thus, the typicality effect emerges in both linguistic and nonlinguistic acquisition materials, and in both auditory and visual modalities.

D. IS THE TYPICALITY EFFECT INFLUENCED BY PRESENTATION RATE?

Several researchers have either assumed or asserted that the typicality effect is a product of the amount of attention, rehearsal, cognitive resources, or conceptual elaboration that items receive during encoding (see Section II,C). These “attention and elaboration” explanations are very popular among researchers. When colleagues are asked to explain why atypical actions are remembered better than typical actions, they frequently reply that “subjects pay more attention to the atypical items” or “subjects study the atypical input harder” because the atypical information does not fit in with the typical information. The attention-elaboration explanation differs substantially from the SC+T explanation of the typicality effect.

TABLE V
RECOGNITION MEMORY AS A FUNCTION OF SCRIPT TYPICALITY AND PRESENTATION MODALITY

                        Tape recorder (auditory)     Videotape (visual)
Recognition measure     Typical     Atypical         Typical     Atypical
d' score                  .97         1.78            1.56         2.01
Hit rate                  .82          .73             .83          .72
False alarm rate          .54          .22             .37          .15

The mechanism underlying an attention-elaboration explanation is simple and straightforward. During comprehension the comprehender identifies one or more schemas as relevant to the passage. For each incoming item, the comprehender assesses the typicality of the item with respect to the available foregrounded schema. If the item is evaluated as typical, then the item does not require additional processing and analysis. If the item is atypical, then the item draws additional cognitive resources and further elaboration at a conceptual or semantic level. Since atypical items receive additional resources and conceptual elaboration at comprehension, they would be remembered better than typical items. According to the SC+T explanation, the typicality effect is a product of the organizational processes that are invoked automatically. For the most part, the representational code and the typicality effect do not depend on the encoding strategies and the goals of the comprehender during comprehension. The magnitude of the typicality effect should remain essentially constant across different encoding contexts. We conducted an experiment that varied the rate at which scripted activities were presented to subjects auditorily. In a medium rate condition, the Jack story was presented at a normal conversational rate of 175 words per min. The Jack story contained 10 scripted activities. In a fast rate condition the scripted activities were presented as quickly as possible without sacrificing comprehensibility of the material (280 words per min). Forty subjects were assigned to each condition. Approximately 30 min after listening to the passage, subjects completed a recognition test. The purpose of varying presentation rate was to test between the SC+T explanation and the attention-elaboration explanation of the typicality effect.
The SC+T model predicts no interaction between presentation rate and typicality on recognition memory. However, such an interaction would support the attention-elaboration explanation. According to the attention-elaboration hypothesis, atypical actions are remembered better than typical actions because the atypical actions receive more cognitive resources during comprehension. Varying processing resources among actions is manageable at a medium presentation rate. However, at a very fast presentation rate it should become more difficult, if not impossible. Consequently, the attention-elaboration explanation predicts that the differences in memory between atypical and typical actions should be less pronounced in the fast rate condition than in the medium rate condition. Table VI shows the recognition memory data for typical and atypical actions in medium and fast presentation rate conditions.

TABLE VI
RECOGNITION MEMORY AS A FUNCTION OF SCRIPT TYPICALITY AND PRESENTATION RATE

                        Medium rate               Fast rate
Recognition measure     Typical    Atypical       Typical    Atypical
d' score                  .63        1.96           .37        1.60
Hit rate                  .80         .76           .77         .66
False alarm rate          .62         .16           .66         .17

The d' scores were significantly higher for atypical than for typical actions, 1.79 versus .50, respectively, F(1, 78) = 229.63, p < .05. The d' scores were also significantly higher in the medium than in the fast presentation rate condition, 1.29 and .99, respectively, F(1, 78) = 6.28, p < .05. The interaction between presentation rate and typicality was not significant, F(1, 78) = .34, p > .50. Thus, the influence of item typicality on memory persists at both presentation rates. Since the magnitude of the typicality effect was constant at both presentation rates, organizational properties of the memory trace appear to best explain the effects of typicality on memory. The construction of the memory tags for atypical actions seems to be achieved very quickly during comprehension. Variations in attention and elaboration have little or no impact on the typicality effect.

E. IS THE TYPICALITY EFFECT INFLUENCED BY THE GOALS OF THE COMPREHENDER?

To what extent is the typicality effect sensitive to the goals of the comprehender? Does memory vary for typical versus atypical information depending on the attention or emphasis that the comprehender gives to the two types of actions? We conducted an experiment to answer this question. We varied the comprehenders’ goals by varying the instructions that subjects received before listening to the scripted activities. The acquisition material consisted of the Jack story. The experiment included six instructional set conditions with 20 subjects assigned to each condition. The conditions are listed and described below.

1. Vague condition. The subjects did not receive specific instructions on how to process the Jack story. The subjects were told that they would later be asked questions about the Jack story. This vague condition will be regarded as a normal processing environment and a prototype against which to compare the other instructional set conditions.

2. Personality condition. The subjects were told that they would be given a test that assessed their perceptions of Jack’s personality. The subjects were instructed to pay careful attention to unusual actions that Jack performed, because unusual actions convey much information about a person’s personality. Subjects were expected to place more emphasis on the atypical actions and less emphasis on the typical actions, compared to the vague condition.

3. Global condition. The instructions were designed to encourage subjects to focus on the global levels of the Jack story rather than the details (i.e., individual actions). Subjects were told to monitor changes in spatial settings. A change in spatial setting occurred roughly at the junctures between scripts. The subjects wrote down a tally mark in a box on the instruction sheet whenever they detected a change in spatial scenario. For example, a scenario change would occur if Jack’s activities changed from the location of his home to that of a department store. Subjects were expected to place less emphasis on atypical actions, compared to the vague condition.

4. Specific condition. The instructions were designed to encourage subjects to focus on individual actions in the Jack story. The subjects were told to write a tally mark in a box on the instruction sheet whenever Jack executed a skilled action, which was defined as an action requiring extensive training or education to perform. The subjects were given examples of skilled and unskilled actions. For example, fixing a radio is a skilled action, whereas eating a sandwich is not a skilled action. The subjects were expected to place more emphasis on typical actions (and probably also the atypical actions), compared to the vague condition.

5. Recall condition. The subjects were told that they would later be asked to recall, in writing, the contents of the Jack story. Recall instructions presumably promote an increased concern with organizing and interrelating information in a cohesive fashion. Since typical actions form the cohesive core of a passage, more emphasis should be placed on typical actions and less emphasis should be placed on atypical actions, compared to the vague condition.

6. Recognition condition. The subjects were told that they would later be given a recognition test on the Jack story. The format of the recognition test was described to subjects. Subjects would presumably focus on details in this condition. Compared to the vague condition, subjects should place more emphasis on typical actions.

The above six instruction conditions were designed to manipulate the


emphasis on processing typical versus atypical actions. Of course, the instructions only indirectly control the allocation of processing resources, and there is no assurance that the instructions produce the intended manipulation. However, if the typicality effect is very sensitive to the goals of the comprehender and the allocation of resources to typical versus atypical actions, then a typicality × instructional set interaction should occur when the recognition data are analyzed. If, however, the typicality effect [d'(atypical) - d'(typical)] is relatively impervious to variations in the comprehenders’ goals, then the typicality × instructional set interaction would be nonsignificant. In all instructional set conditions, a recognition test was administered to subjects approximately 30 min after the Jack story was presented. Table VII shows the recognition data. The d' scores were higher for atypical than typical actions, 1.64 versus .52, respectively, F(1, 114) = 319.03, p < .05. There were significant differences in d' scores among the instructional set conditions, with means of .74, .85, .98, 1.02, 1.42, and 1.48 in the global, vague, specific, recall, recognition, and personality conditions, respectively, F(5, 114) = 9.05, p < .05. However, the interaction between typicality and instructional set was not statistically significant, F(5, 114) = 1.67, p > .10. A series of Newman-Keuls tests was performed on the d' scores of the six instructional set conditions in

TABLE VII
RECOGNITION MEMORY AS A FUNCTION OF SCRIPT TYPICALITY AND INSTRUCTIONAL SET

Instructional set   Typicality   d' score
Recall              Typical         .49
                    Atypical       1.54
Recognition         Typical         .74
                    Atypical       2.10
Global              Typical         .28
                    Atypical       1.20
Specific            Typical         .48
                    Atypical       1.47
Personality         Typical         .87
                    Atypical       2.09
Vague               Typical         .30
                    Atypical       1.41

[Hit rate and false alarm rate columns: the assignment of entries to rows is not recoverable. Entries as extracted: .71; .62 .ll .61; .62 .I8 SO .I1 .13 .22 .63 .I8; .19; .51; .16 .16 .12; .I4 .lo .26; .70 .14 .74 .81]


order to examine the source of the significant main effect of instructional set. The outcome of the Newman-Keuls tests supported the following ordering among means using a .05 level of significance: personality = recognition > recall = specific = vague = global. The fact that there was no typicality × instructional set interaction suggests that the typicality effect is rather impervious to the comprehenders’ goals. The typicality effect difference scores were 1.11, 1.22, .99, .92, 1.36, and 1.05 in the vague, personality, specific, global, recognition, and recall instruction conditions, respectively. These typicality effect scores are roughly constant and do not systematically vary with the amount of resources that we expected to be allocated to atypical versus typical actions at comprehension. The instructional set variables did influence memory, but these variations in the comprehenders’ goals had a constant effect over typical and atypical actions. In summary, an explanation of the typicality effect involves an organized representational code that is established automatically during comprehension. The typicality effect is a robust phenomenon that is not malleable by the comprehender’s goals and the allocation of cognitive resources.

F. DOES THE TYPICALITY EFFECT OCCUR IN ECOLOGICALLY VALID SETTINGS?

In all the experiments we have reported so far, the subjects have been aware that they were in an experiment. Does the typicality effect occur in more ecologically valid situations when an individual does not anticipate being tested in some form? Some social rules and pragmatic constraints are followed when individuals comprehend prose or experience events in an experimental setting. In the context of discourse, the speaker and listener assume that whatever is said is important and relevant to the goals of the interchange. In the context of an experiment, the subjects assume that all presented material is important and relevant to the goals of the experimental session. These pragmatic rules underlie speech acts, discourse, and social interaction (de Beaugrande, 1980; Grice, 1975; Searle, 1969). The studies supporting the typicality effect and the SC+T model may be restricted to contexts in which the above pragmatic rules operate. When subjects comprehend an atypical action in a scripted passage, they would assume that the atypical action is particularly important and a relevant part of the message. The subject would believe that the speaker had an important reason for including information which would otherwise be irrelevant to the topic. Consequently, the atypical information would receive more attention and elaboration. The same pragmatic rules might


not apply when the material is not prose and when the comprehender is not in an experimental setting. We conducted an experiment to assess the typicality effect in an ecologically valid setting. Students received a lecture and subsequently completed a surprise recognition test on the actions performed by the lecturer. The actions varied in typicality with respect to the lecture script. Since the students were not aware that they were in an experiment during the lecture, they would not assume that all input was relevant and important. Moreover, lecturers normally communicate important and relevant messages by sentences rather than by actions; the lecturer does not intend to communicate each action or gesture executed in a lecture. According to the SC+T model, memory should be better for atypical than typical actions. However, if the typicality effect is restricted to experimental materials and prose materials, then the atypical actions would not be remembered better than typical actions. The experiment included a lecture phase, an intervening task phase, and a test phase. During the lecture phase, the students received a 15-min lecture at the beginning of their scheduled laboratory section. During the lecture, the lecturer performed a number of actions that varied in typicality with respect to a lecture script. Some typical actions were the lecturer pointing to the blackboard and the lecturer handing a student a sheet of paper. Some atypical actions were the lecturer taking off a watch and the lecturer wiping off his glasses. Both typical and atypical actions were performed in a smooth, nonobvious manner. Before the lecturer delivered the lecture, he told the students that the material was review and that they did not need to take notes. The purpose of this comment was to encourage students to look at the lecturer rather than at their notes. There were two lectures, each of which included a different sample of 10 typical and 10 atypical actions.
There were 20 students in each lecture session. After the lecture, there was a 20-min intervening task. During this intervening task, the lecturer led the students to a different room in order to give a demonstration on the use of a computer. While the students and lecturer were in the computer room, a confederate cleaned up the laboratory room in order to destroy all clues as to what actions may have been performed during the lecture. After the intervening activity, the students were led back to the laboratory room and were given a recognition test on the lecture actions. The recognition data supported the SC+T model. The d’ scores were significantly higher for atypical than typical actions, 1.02 versus .15, respectively, F(1, 39) = 21.00, p < .01. The hit rates were .66 and .56 for typical versus atypical actions, whereas the false alarm rates were .62


Arthur C. Graesser and Glenn V. Nakamura

versus .23, respectively. These data again confirm the generality of the typicality effect. The typicality effect persists in ecologically valid settings when subjects are unaware they are in an experiment. The typicality effect is not an artifact of social rules, pragmatic constraints, or conversational postulates.

G. ARE UNPRESENTED TYPICAL ITEMS INFERRED AT COMPREHENSION OR AT RETRIEVAL?

The effects of typicality on memory are mainly determined by the false alarm rates. Hit rates do not vary systematically with typicality and usually show a flat function. Figure 1 illustrates the effect of typicality on false alarm rates. In the interval between moderately typical and very typical, there is a sharp increase in false alarm rates; the corresponding d’ scores show a dramatic decrease. In other words, memory varies antagonistically with the false alarm rates.

Are the false alarm rates a product of encoding mechanisms or retrieval mechanisms? The false alarms may correspond to the inferences that the comprehender generated during comprehension. These inferences would be copied into the memory trace, as specified by the SC+T model. There is an alternative possibility, however. Perhaps these false alarms are not comprehension-generated inferences. Instead, the inferences may have been retrieval generated, that is, derived only at test time. To what extent are the false alarms comprehension-generated inferences versus retrieval-generated inferences?

Researchers would probably not quibble with the claim that a subset of the false alarms reflects comprehension-generated inferences, whereas another subset reflects retrieval-generated inferences. In fact, there is evidence for both types of inferences. One finding suggests that some false alarms are a product of retrieval, but not of comprehension. Specifically, false alarm rates show a modest increase in the interval from very atypical to moderately atypical. None of these atypical actions would have been inferred at comprehension, yet there was a systematic change in false alarm rates.

Studies by Yekovich (Dunay, Balzer, & Yekovich, 1981; Yekovich & Yekovich, 1982) have supported the conclusion that many unpresented typical actions are generated at comprehension. Subjects were presented with several scripted activities.
Within 5 to 8 sec after listening to an excerpt, subjects received test words and decided as quickly as possible whether the test word was (a) explicit (i.e., mentioned in the passage), (b) implicit, or (c) unrelated to the passage. Some of the test nouns were explicitly mentioned in the passage. Other test nouns were part of plausible


inferences that were not explicitly mentioned. A third group of nouns was totally unrelated to the passage. Yekovich reported an extremely high false alarm rate for implicit, related words. The false alarm rates varied from .33 to .46. These false alarm rates are almost as high as the false alarm rates for typical actions after a 30-min or 3-week delay (.53 in Smith & Graesser, 1981). Thus, within a few seconds after comprehension, subjects judged that unpresented typical information had been presented. These data strongly indicate that a substantial number of inferences would be generated at comprehension.

A related question is whether subjects sometimes avoid searching memory when the test action is very typical of the script. We have reported that memory discrimination is low for moderately typical actions and zero for very typical actions. Perhaps memory is poor for these actions because the subjects prematurely conclude that these actions “must have been presented” and they avoid searching their memory. If subjects avoid search processes, there might be good memory for an item, but this would not be manifested in the recognition data.

A study was conducted to assess whether the poor memory for typical actions was an artifact of memory search avoidance. In this study (Graesser et al., 1980), two types of recognition tests were administered. One test had a YES/NO format, whereas the other had a two-alternative, forced-choice (2AFC) format. For the 2AFC test, the two test actions of each pair were matched on typicality. Consequently, typicality could not serve as a criterion for deciding whether an action was presented; subjects would be forced to search their memory traces in order to decide which of the two actions had been presented. Suppose that subjects sometimes avoided memory search on the YES/NO test. If they did, then memory would be better in the 2AFC than the YES/NO format.
If, however, memory search was not avoided in the YES/NO format, d’ scores would be the same for YES/NO and 2AFC formats. The Graesser et al. (1980) study did not support the idea that subjects sometimes avoid memory search when given a YES/NO recognition test. The d’ scores were the same in the YES/NO format and the 2AFC format. The YES/NO recognition format accurately taps the memory subjects have about actions in scripted activities.
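The d’ scores discussed throughout this section combine hit rates and false alarm rates into a single discrimination measure. The chapter does not spell out the scoring formula, so the following is only a minimal sketch that assumes the standard equal-variance Gaussian signal detection definition, d’ = z(hit rate) − z(false alarm rate). The function name `d_prime` is hypothetical; the rates are the group-level values reported for the lecture study.

```python
from statistics import NormalDist


def d_prime(hit_rate: float, false_alarm_rate: float) -> float:
    """Equal-variance Gaussian d': z(hit rate) - z(false alarm rate).

    Assumption: this textbook formula stands in for whatever exact scoring
    procedure the original studies used. Both rates must lie strictly
    between 0 and 1, because the inverse normal CDF is undefined at 0 and 1.
    """
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(false_alarm_rate)


# Group-level hit and false alarm rates reported for the lecture study.
typical = d_prime(hit_rate=0.66, false_alarm_rate=0.62)
atypical = d_prime(hit_rate=0.56, false_alarm_rate=0.23)

print(f"typical d' = {typical:.2f}, atypical d' = {atypical:.2f}")
```

Under this assumption the aggregate rates give roughly .11 for typical and .89 for atypical actions. These values need not equal the reported means of .15 and 1.02, which may have been computed per subject and then averaged, but they show the same pattern: the hit rates are similar, and the difference in discrimination comes almost entirely from the false alarm rates.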

V. The Fate of Four Alternative Models

In an earlier section we described four alternative models specifying the impact of schemas on comprehension and memory. How do the four alternative models compare to the SC+T model in explaining the available data? We believe that the SC+T model provides the best account of the data. The purpose of this section is to point out the weaknesses of the other four models.

A. PROBLEMS WITH THE FILTERING MODEL

The filtering model predicts that typical information will be encoded and retained in memory better than atypical information. This prediction was not supported in the studies reported in this article. There was better memory discrimination for atypical information than typical information. This outcome seriously challenges the filtering model and has led us to reject the model. It is interesting, however, that the intuitions and theories of many researchers adopt a filtering mechanism.

There is only one condition in which memory for typical information exceeds that of atypical information. This condition involves recall after a long retention interval. This outcome is accommodated by the SC+T model’s assumptions regarding recall and conceptually driven retrieval. Recall involves conceptually driven retrieval and the schema plays a more important role in guiding conceptually driven retrieval as the retention interval increases. For conceptually driven retrieval, the tagged typical items decay at a slower rate than the tagged atypical items (see Fig. 2). The obtained crossover effect for conceptually driven retrieval is accommodated by the SC+T model.

B. PROBLEMS WITH THE ATTENTION-ELABORATION MODEL

According to the attention-elaboration explanation of the typicality effect, atypical items receive more attention and conceptual elaboration than the typical items. This differential attention and elaboration are presumed to explain differences in memory. According to the SC+T model, there can be systematic differences in attention and elaboration devoted to items, but such differences do not explain the typicality effect. Instead, properties of the representational code explain the typicality effect, and the code is constructed automatically at comprehension. Several findings challenge the attention-elaboration explanation of the typicality effect. First, there is zero memory discrimination for very typical actions, yet these items must have received some attention and elaboration. Second, memory for typical and atypical actions is not influenced by a variety of script distortions that would presumably draw resources away from atypical actions and therefore attenuate memory for such items. Specifically, memory for script actions does not decrease when (a) more and more atypical actions are included in the scripted


activities, (b) a script is interrupted by another scripted activity, or (c) actions of one scripted activity are interleaved with actions of another scripted activity (Graesser, 1981; Graesser et al., 1979). Third, it is difficult for the attention-elaboration model to explain the fact that recall for atypical information is lower than that of typical actions after a long retention interval. Fourth, the typicality effect is not influenced by the presentation rate of the material. Fifth, the typicality effect is not influenced by variations in instructions that are designed to manipulate the relative amount of attention and elaboration that typical versus atypical actions receive. Thus, the typicality effect is clearly not a malleable phenomenon contingent on resource allocation during encoding. Rather, it is a robust phenomenon that is relatively impervious to the distribution of cognitive resources. Other researchers have arrived at similar conclusions in the context of picture memory (Friedman, 1979; Light et al., 1979).

We acknowledge that we have not completely eliminated an attention-elaboration explanation of the typicality effect. Perhaps the comprehender’s attentional resources can oscillate very quickly between typical and atypical actions. Future experiments need to test this possibility. As it stands, however, evidence for an attention-elaboration explanation is slim. Our attempts to manipulate the allocation of resources to items and thereby affect memory have consistently failed. The typicality effect has been particularly resilient to variations in resource allocation. Such variations may modestly influence memory in the form of a second-level code, but they do not explain the typicality effects reported in this article.

C. PROBLEMS WITH THE PARTIAL COPY MODEL

Bower et al. (1979) proposed a partial copy model as an explanation of how scripts influence comprehension and memory (see Section I,C). According to this model, two independent codes are formed when scripted activities are comprehended. One code is the episodic memory structure, which is a list of propositions explicitly stated in the passage. According to Bower et al., the likelihood of accessing a proposition in this episodic list decays quickly according to an exponential function. After a week or so there would be no memory for this code. The second type of code involves the generic script. When a scripted passage is comprehended, actions of the generic script are activated by script-relevant passage actions. For script-relevant actions stated in a passage, there is a corresponding action in the generic schema that receives a strong activation. There are also actions in the generic schema that receive a weak activation because they are inferred in the passage by


default. An action is later remembered if it meets or exceeds some critical level of activation. With time, the activation level for an action in the generic script decays, and is reactivated when the script is used again. After a long retention interval, there will be no discrimination between actions that received a weak versus a strong activation in the context of a specific scripted passage. Thus, after a long retention interval there should be little or no memory discrimination between presented and unpresented typical script actions.

There are two problems with the partial copy model. The first problem is the finding of zero memory discrimination for very typical actions after a 30-min retention interval. According to the partial copy model, there should be some memory discrimination for these items. These actions would have some likelihood of being retrieved from the episodic memory structure, that is, the list of explicit passage propositions. These actions should also show memory discrimination by virtue of the generic script; the very typical stated actions should have a high activation, whereas very typical unstated actions should have a weak activation.

The second problem with the partial copy model involves recognition memory after a long retention interval. According to the partial copy model, there should be no memory for irrelevant passage actions after a long retention interval. After 3 weeks, for example, the episodic memory structure would certainly be completely decayed and the generic script would provide no basis for remembering the irrelevant information. However, there is in fact substantial recognition memory for irrelevant actions after 3 weeks (Graesser, 1981; Smith & Graesser, 1981).

D. PROBLEMS WITH THE SCHEMA POINTER PLUS TAG MODEL

The SP+T model is quite similar to the SC+T model and provides a close fit to the data reported in this article. The major shortcoming of the SP+T model pertains to the amount of generic schema information that is incorporated into the specific memory trace for a passage. According to the SP+T model, the memory trace contains a pointer to the generic schema, so that the entire generic schema is copied into the specific memory trace. As we noted earlier, it is implausible that the entire generic schema is copied into the specific memory trace. For reasons discussed by Bower et al. (1979), only a subset of the generic schema content would be activated during the comprehension of a specific passage. Therefore, we abandoned this strong assumption of the SP+T model and adopted the SC+T model. The SC+T model assumes that only a subset of the nodes in a generic schema are copied into the memory trace. This is not a strong claim. The


assumption does not predict which subset of generic nodes is passed to the memory representation of a specific passage. In the next section, we shall examine whether there is a systematic relationship between (a) the subset of generic schema nodes that are copied into a passage representation and (b) the set of items that are explicitly mentioned in a given passage. The SC+T model would be strengthened if it could specify when a schema node is or is not copied into a specific memory trace.

VI. The Process of Copying Schema Nodes into Specific Memory Traces

When a schema-based passage is comprehended, many nodes in the generic schema are copied into the passage representation, even though the nodes were not explicitly mentioned. These nodes are inferences. Which of the generic nodes end up being passage inferences? In this section we will report a study designed to address this question. The study was part of a master's thesis completed by Donald Smith at California State University in Fullerton (Smith, 1981).

Smith's thesis focused on memory for scripted activities. Smith examined whether it is possible to predict which actions of a generic script are copied into the memory trace of a scripted passage. The false alarm rate of an unpresented script action served as an index of whether an action was an inference in a scripted passage. When an action had a high false alarm rate, it was regarded as a passage inference. Again, we should mention that the false alarm rates have been responsible for the pattern of memory discrimination scores (d' scores) in nearly all the experiments reported in this chapter. As shown in Fig. 1, typicality substantially influences false alarm rates and d' scores, but not hit rates. Within the typicality interval of 4 (moderately typical) to 6 (very typical) there is a dramatic increase in false alarm rates. What factors predict false alarm rates for typical actions?

Smith (1981) investigated the extent to which an unpresented action node is activated by three different knowledge sources. First, an inference may be activated by the generic script. When the script is identified at the beginning of the scripted activity (e.g., Jack decided to eat dinner at a restaurant), then a set of script-relevant actions are activated as inferences. Second, an inference may be activated by an action explicitly stated in the passage. For example, if the scripted activity stated that Jack took out his wallet, then a plausible inference would be that Jack paid the bill.
Third, an inference may be activated by a conceptual subchunk within a generic script. The significance of subchunks will be discussed shortly. In the following subsections, we shall describe the three knowledge sources and also how script actions were scaled with regard to these sources of activation.

A. ACTIVATION OF INFERRED ACTIONS VIA THE GENERIC SCRIPT

Some script actions are activated when the generic script is introduced at the beginning of the scripted activity. For example, when “Jack decided to eat dinner at a restaurant” is mentioned, then the restaurant script is identified and some actions are activated, even when they are not explicitly mentioned in the rest of the passage. Some plausible script-activated actions for the restaurant script are “Jack ordered food,” “someone gave Jack the food,” and “Jack paid for the food.” These script-activated actions would presumably include actions that are central, essential, or characteristic of the script.

The typicality ratings for script actions provide a reasonable index of the extent to which the actions are script activated. Subjects in a normative group scaled the actions on the 6-point typicality scale described earlier. The actions have also been scaled on a 6-point necessity scale. Subjects rated how necessary it is to execute an action when enacting a given script. For example, going to the restaurant would be a necessary action in the restaurant script. The necessity ratings are highly correlated with the typicality ratings, r = .91 (Graesser, 1981; Graesser et al., 1979). The mean typicality rating for an action served as a measure of script activation in Smith’s (1981) thesis.

B. ACTIVATION OF INFERRED ACTIONS VIA STATED PASSAGE ACTIONS

Some inferences are activated by an explicitly stated passage action together with the generic script context. For example, suppose the restaurant script is identified and the passage states that Jack sat down at the table. The comprehender would probably infer that Jack walked to the table. This inference would be activated by the passage action plus the restaurant script, or by the passage action alone. The inference may not have been activated by the restaurant script alone; in some restaurants the customers do not eat at tables.

In Smith’s thesis, necessity scores were computed from measures collected in a normative rating group of subjects. Necessity scores were computed for the test actions in the Jack story scripts (Graesser, 1981; Graesser et al., 1980). Subjects were given the script title and then rated


pairs of actions on a necessity scale. A pair of actions involved an activator and an activatee action and was placed in the following frame: Given that (activator), how necessary is it that (activatee)? For example, if the activator action is “the person ordered food” and the activatee action is “the person ate food,” then the action pair would be: Given that the person ordered food, how necessary is it that the person ate the food? The subjects rated these action pairs on a 6-point necessity scale: 1 = very unnecessary; 2 = somewhat unnecessary; 3 = uncertain, but probably unnecessary; 4 = uncertain, but probably necessary; 5 = somewhat necessary; and 6 = very necessary. Trabasso, Secca, and Brock (1982) have used a similar test for determining whether actions, events, and states in stories are causally related.

Of the 22 typical actions in each script, only 8 served as test actions in our previous memory studies. The same 8 actions of a script were the activatee items of the action pairs in the Smith thesis. Each of the 8 critical actions had 21 action pairs for which necessity ratings were collected; the activatee action was paired with the other 21 actions in a script. Since there were 8 critical actions and 21 pairs per critical action, 168 pairs were rated for each script. Smith computed a strength of activation, Pij, for each action pair, which corresponded to the proportion of subjects who rated an activatee action, Aj, as being necessary for a given activator action, Ai. The criterion for being necessary was a rating of 4, 5, or 6 on the necessity scale.

Once these activation scores were collected, the total activation was computed for an inference action in an acquisition passage. The total activation for an inference action obviously depended on what actions were presented in the acquisition passage. An inference action would tend to be activated when there were one or more passage actions that had high activation scores associated with the inference.
The total activation could be measured for inference Aj in a specific passage that contained a given subset of script actions.

\[
\text{Total activation}(A_j) = \sum_{i=1}^{22} \left( P_{ij} \text{ given that } A_i \text{ was presented in the passage} \right) \tag{3}
\]

The necessity score for action Aj in a specific passage was the average amount of activation per explicit action. If there were n explicit actions, then the necessity score for Aj would be

\[
\text{Necessity score}(A_j) = \text{Total activation}(A_j) / n \tag{4}
\]
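Equations 3 and 4 can be sketched in a few lines of code. This is only an illustration under stated assumptions: the function names, the dictionary layout keyed by (activator, activatee), and the activation values are all hypothetical and are not Smith's normative data. Pairs missing from the table are treated as contributing zero, which is also an assumption.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical activation strengths P_ij, keyed by (activator, activatee):
# the proportion of raters giving the pair a necessity rating of 4, 5, or 6.
Activation = Dict[Tuple[str, str], float]


def total_activation(p: Activation, activatee: str, presented: Iterable[str]) -> float:
    """Eq. 3: sum P_ij over the activators A_i presented in the passage."""
    # Pairs absent from the table count as 0 (assumption for this sketch).
    return sum(p.get((activator, activatee), 0.0) for activator in presented)


def necessity_score(p: Activation, activatee: str, presented: Iterable[str]) -> float:
    """Eq. 4: total activation divided by n, the number of explicit actions."""
    presented = list(presented)
    return total_activation(p, activatee, presented) / len(presented)


# Illustrative values only; these are not Smith's normative ratings.
p = {
    ("took out wallet", "paid the bill"): 0.9,
    ("ordered food", "paid the bill"): 0.5,
    ("sat down at the table", "paid the bill"): 0.1,
}
passage = ["took out wallet", "ordered food", "sat down at the table"]
score = necessity_score(p, "paid the bill", passage)
print(f"necessity score = {score:.2f}")
```

With these made-up values the total activation is about 1.5 across three stated actions, so the necessity score is about .50. Changing which actions appear in `passage` changes the score for the same inference, which is the dependence on passage content that the text describes.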


The false alarm rate for inference Aj is expected to increase as its necessity score increases.

C. ACTIVATION OF INFERRED ACTIONS VIA THE ACTIVATION OF A SUBCHUNK

It is quite plausible that the generic script is organized into subchunks. For example, the restaurant schema might have the following subunits: waiting, ordering, being served, eating, and paying. The idea that scripts, schemas, and passages are subdivided into subchunks has been proposed by a number of researchers (Bower et al., 1979; Black & Bower, 1979; Rumelhart & Ortony, 1977; Schank, 1980). In the context of scripts, Schank calls these subchunks “scenes.” Thus, there would be a waiting scene, ordering scene, and so on. It is possible that the content of a generic schema is copied into the memory trace subchunk by subchunk. When scripts are involved, the memory trace is constructed scene by scene. A scene-by-scene composition would provide some flexibility in the construction process. When an explicit action invokes a specific scene, then many nodes associated with the scene would be copied into the memory trace. If, however, there is no mention of any action associated with a given scene, then the memory trace would contain no nodes associated with that scene. For example, if the restaurant passage does not mention anything about paying, then the entire paying scene would be left out. Moreover, an inference action may be activated by virtue of a subchunk, but not by virtue of script activation or explicit action activation. Smith used a sorting task as a method for identifying subchunks in the eight scripted activities. Subjects were given a set of 22 cards, with each card containing an action in the script. Subjects sorted the actions into different piles. Subjects were told what the script title was before they sorted the actions. The number of chunks or piles that subjects decided to use was at their own discretion, but the experimenter emphasized that actions within a chunk should be conceptually related. The mean number of chunks per script ranged from 3.5 to 8.4, with a mean of 5.6. 
The mean number of actions per chunk ranged from 2.6 to 6.1, with a mean of 3.9. Smith computed a “chunk score” for the eight critical actions in each scripted activity. The chunk score for action Aj was an index of the extent to which the action Aj would be activated by virtue of being in the same chunk as actions explicitly stated in an acquisition passage. Suppose that Pij is the proportion of subjects who sort activatee action, Aj, in the same pile as action Ai. Then the total chunk activation for action Aj in a given


passage would be

\[
\text{Total chunk activation}(A_j) = \sum_{i=1}^{22} \left( P_{ij} \text{ given that } A_i \text{ is stated in the passage and } P_{ij} > .25 \right) \tag{5}
\]

There was a .25 minimal activation threshold in formula 5 because the subjects rarely placed only one action in a pile and we desired a sensitive assessment of chunk activation. The chunk score was an average chunk activation per explicit passage action. If there were n explicit actions in the scripted passage, then the chunk score for inference Aj would be:

\[
\text{Chunk score}(A_j) = \text{Total chunk activation}(A_j) / n \tag{6}
\]
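The chunk score in Equations 5 and 6 differs from the necessity score only in its input (sorting proportions instead of necessity ratings) and in the .25 threshold. A minimal sketch follows, again with hypothetical function names, a hypothetical dictionary layout, and made-up sorting proportions; treating missing pairs as zero is an assumption of the sketch.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical sorting proportions P_ij, keyed by (stated action, activatee):
# the proportion of sorters who placed the two actions in the same pile.
SortProportions = Dict[Tuple[str, str], float]


def total_chunk_activation(
    p: SortProportions, activatee: str, stated: Iterable[str], threshold: float = 0.25
) -> float:
    """Eq. 5: sum P_ij over stated actions A_i, keeping only P_ij > threshold."""
    total = 0.0
    for action in stated:
        p_ij = p.get((action, activatee), 0.0)  # missing pairs count as 0 (assumption)
        if p_ij > threshold:
            total += p_ij
    return total


def chunk_score(
    p: SortProportions, activatee: str, stated: Iterable[str], threshold: float = 0.25
) -> float:
    """Eq. 6: total chunk activation divided by n, the number of stated actions."""
    stated = list(stated)
    return total_chunk_activation(p, activatee, stated, threshold) / len(stated)


# Illustrative values only; these are not Smith's sorting data.
p = {
    ("ordered food", "ate the food"): 0.8,
    ("looked at the menu", "ate the food"): 0.4,
    ("paid the bill", "ate the food"): 0.2,  # below the .25 threshold, so ignored
}
stated = ["ordered food", "looked at the menu", "paid the bill"]
score = chunk_score(p, "ate the food", stated)
print(f"chunk score = {score:.2f}")
```

Note that n counts every stated action, including those whose proportion falls below the threshold; this follows the text's description of the chunk score as an average per explicit passage action. With the values above the score is about .40.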

The false alarm rate for inference Aj is expected to increase with the chunk score for Aj. Obviously, the chunk score for an inference should vary as a function of which actions are stated in the acquisition passage.

D. PREDICTING FALSE ALARM RATES FOR UNSTATED SCRIPT ACTIONS

To what extent can false alarm rates for script actions be predicted by the typicality ratings, necessity scores, and chunk scores? These three predictor variables were assumed to measure the activation of an action via the script, a stated action, or a chunk, respectively. Smith analyzed the false alarm data in the Smith and Graesser (1981) study involving memory for the Jack story. We should remind the reader that there were two versions of the Jack story (A and B) and that each scripted activity contained (a) 14 common typical actions presented in both versions A and B, (b) 4 typical actions presented in A but not in B, and (c) 4 typical actions presented in B but not in A. Smith attempted to predict the false alarm rates involved in the latter two sets of actions, that is, the B actions when subjects listened to version A, and the A actions when subjects listened to version B. Since there were 8 scripted activities and 8 critical test actions per script, Smith attempted to predict false alarms for 64 actions.

Smith used multiple regression techniques when assessing the extent to which the false alarm rates for the 64 actions could be predicted by the actions' typicality ratings, necessity scores, and chunk scores. Analyses revealed that only the typicality ratings significantly predicted the false alarm rates. The fact that the necessity scores and chunk scores failed to predict false alarm rates is not surprising in retrospect. The inference actions in a script may usually have been activated by several explicit actions. Perhaps the activation levels of the unstated actions always exceeded threshold levels because they were activated several times by


several knowledge sources. If there was extensive multiple activation, then the false alarm rates would be high and not sensitive to the chunk scores and necessity scores. In fact, the range of chunk scores was .08 to .42, and the range of necessity scores was .07 to .82. Since a score of .06 indicates activation from one activator (.06 × 18 stated typical actions = 1.08), all of the activatees (potential inferences) received at least one activation and usually several activations. Therefore, a more sensitive test was needed to assess the impact of necessity scores and chunk scores on false alarms.

In order to prevent multiple activation, Smith wrote new scripted activities that contained a small subset of the actions used in the Smith and Graesser study. There were several versions of each scripted activity with a different subset of eight actions in each version. The versions were composed and manipulated in order to assess the false alarm rate for a single inference action (activatee). The details of these contextual variations are described in Smith’s (1981) thesis. The upshot of Smith’s careful manipulation of script versions was that there were four context conditions, and the false alarm rate for a critical inference (activatee) was measured in each context condition. The four context conditions were (1) low necessity score and low chunk score, (2) low necessity score but high chunk score, (3) high necessity score but low chunk score, and (4) high necessity score and high chunk score. In the low necessity score conditions, none of the eight explicit actions in the passage would activate the critical inference by virtue of necessity; similarly, in the low chunk score conditions, none of the eight explicit actions in the passage would activate the critical inference by virtue of being in the same subchunk as a stated action. In the high necessity (or chunk) condition, one or two of the eight explicit actions activated the critical inference.
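The .06 figure used in the multiple-activation argument earlier in this passage can be checked against Equation 4. The following is a small worked example, assuming a story version with n = 18 stated typical actions (the 14 common actions plus the 4 version-specific actions) and a single activator whose activation strength is 1:

```latex
\[
\text{Necessity score}(A_j) = \frac{\text{Total activation}(A_j)}{n} = \frac{1}{18} \approx .056 \approx .06,
\qquad .06 \times 18 = 1.08 \approx 1 .
\]
```

Under these assumptions, the smallest observed scores (.07 for necessity and .08 for chunking) correspond to total activations of about 1.26 and 1.44, so every critical action received at least the equivalent of one full activator.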
Table VIII shows the mean false alarm rates for the critical inference actions as a function of necessity score and chunk score. An analysis of variance was performed using item variability in the error term. False alarms were somewhat higher in the high than low necessity score conditions, .40 versus .28, respectively, F(1, 14) = 7.59, p < .05. False alarms were higher in the high than low chunk score conditions, .37 versus .30, respectively, but not quite significantly. The most interesting outcome was the chunk score X necessity score interaction, F(1, 14) = 4.82, p < .05. The false alarm rate was very high (.47) when the chunk score and the necessity score were both high compared to the other three conditions, which had roughly the same false alarm rates (.30). The interaction suggests that a generic script node must satisfy dual criteria before it is copied into the memory trace as an inference. First, the generic node must be part of the same subchunk as a

TABLE VIII
FALSE ALARM RATES AS A FUNCTION OF ACTIVATION VIA CHUNKING AND NECESSITY

                             Activation via necessity
Activation via chunking      Low        High
Low                          .28        .33
High                         .28        .47

node that is explicitly stated in a passage. Second, the generic node must be a necessary antecedent or consequent of an action that is explicitly stated. A generic node does not tend to be copied into the memory trace if it satisfies none or only one of these two criteria. Stated differently, an explicit action tends to activate an inference if (a) the inference is part of the same subchunk as the stated action, and (b) the inference is a necessary antecedent or consequence of the stated action.

The findings reported in Smith’s (1981) thesis need to be replicated and are clearly not the final word on the question of which schema nodes are copied into specific memory traces in the form of inferences. However, Smith’s thesis findings are consistent with the following claims regarding scripts:

1. When the script is identified, the script activates some generic nodes and these nodes are copied into the specific memory trace. Nodes that are very typical of the script tend to be activated.
2. One or more actions in a passage activate a subchunk of information (i.e., a scene) within the generic script. Some of the generic nodes within the subchunk are activated and copied into the specific memory trace, namely, nodes that are necessary antecedents and consequences of the explicit actions.

VII. Questions for Further Research

We believe that the SC+T model is a strong competitor with alternative schema-based models of comprehension and memory. The SC+T model has provided a close fit to the data reported in this article and has been articulated in the form of a mathematical model (see Graesser, 1981; Smith & Graesser, 1981). The model also has a fairly broad scope. First, the model accounts for the effects of typicality on memory after different


Arthur C. Graesser and Glenn V. Nakamura

retention intervals. Second, the model isolates differences between recall and recognition processes. Third, the model applies to different knowledge domains, to different types of schemas, and to situations in which more than one schema guides the processing of incoming information. Fourth, the model’s predictions are confirmed under different encoding conditions and variations in comprehender goals. Fifth, the model provides a better fit to the data than some alternative schema-based models. The model also is easily expanded to explain which generic schema nodes are copied into specific memory traces. There are a number of questions for future research. Experiments need to be conducted to examine further which generic nodes are copied into a memory trace. We have isolated some factors that predict inference activation. The Smith (1981) study suggests that some inferences are activated in a top-down fashion by the generic schema. Other inferences are generated in a bottom-up fashion, so that explicit information activates a subchunk in the schema and inferences within the subchunk are activated if and only if they are a necessary implication of the explicit information. Other factors might predict inferencing. For example, structural properties of the generic schema might predict inference generation. It is plausible that the generic schema would activate nodes that are more superordinate in a hierarchical structure when the schema structure is hierarchical (Bower et al., 1979; Graesser, 1978). An inference may tend to be activated by a schema if the node is related directly to many other nodes in the generic schema (Graesser, 1978, 1981; Graesser, Robertson, & Anderson, 1981). It is beyond the scope of this article to address the structural dimensions of schemas in comprehension and memory. This is clearly an important issue for future research and for future development of the SC+T model. 
A second question for future research involves the temporal dimensions of schemas in comprehension and memory. For some schemas, events and actions unfold in a chronological order. In fact, temporality is critical in scripted action sequences. Actions unfold in either a logical order (e.g., the waitress must serve the food before the customer eats), a conventional order (e.g., the customer usually eats before leaving a tip), or an order that reflects certain environmental constraints. Researchers have examined story schemas and have specified temporal constraints that exist in stories (Thorndyke, 1977; Mandler & Johnson, 1977; Rumelhart, 1975, 1977; Stein & Glenn, 1979). When story episodes are presented out of order, the order of recalling episodes drifts toward the temporal constraints of story schemas (Mandler, 1978; Stein & Nezworski, 1978). These systematic errors in recall order also occur in scripted passages (Bower et al., 1979). In the future, the SC+T model


must address temporality, because this dimension has a central role in many knowledge domains. We have not addressed temporality in this article because we chose to concentrate on aspects of schema processing that would apply to any knowledge domain. For some types of schemas (e.g., stereotypes, roles, and spatial scenarios), the dimension of temporality is not particularly salient or important.

A third problem for future research involves the collection and explanation of reaction time data when memory is assessed by a recognition test. For example, the model should account for the latencies of hits, false alarms, correct rejections, and misses when typical and atypical actions are tested at different retention intervals. In fact, we are at present collecting and analyzing these data. Reaction time data should provide a richer data base for testing and extending the SC+T model.

Still other questions may be pursued within the framework of the SC+T model. The model will undoubtedly need to be modified and extended as new findings accumulate. With further efforts in research and theory, we hope to converge on a scientifically rigorous, detailed, general, and decisive schema for schema processing.

ACKNOWLEDGMENTS

The research reported in this article was supported by a National Institute of Mental Health grant (MH-33491) awarded to the first author. We would like to thank the following members of the Cognitive Research Group at California State University at Fullerton who conducted the experiments reported in Sections V and VII of this article: Lea Adams, Hank Bruflodt, Leslie Clark, Scott Elofson, Sharon Goodman, Tami Murachver, James Riha, Carol Rossi, Don Smith, Judy Zimmerman, and Professor Stanley Woll.

REFERENCES

Abelson, R. P. The psychological status of the script concept. American Psychologist, 1981, 36, 715-729.
Adams, M. J., & Collins, A. A. A schematic-theoretic view of reading. In R. O. Freedle (Ed.), New directions in discourse processing (Vol. 2). Norwood, New Jersey: Ablex, 1979.
Anderson, R. C. The notion of schemata and the educational enterprise: General discussion of the conference. In R. C. Anderson, R. J. Spiro, & W. E. Montague (Eds.), Schooling and the acquisition of knowledge. Hillsdale, New Jersey: Erlbaum, 1977.
Anderson, R. C., Spiro, R. J., & Anderson, M. C. Schemata as scaffolding for the representation of information in connected discourse. American Educational Research Journal, 1978, 15, 433-440.
Atkinson, R. C., & Juola, J. F. Search and decision processes in recognition memory. In D. H. Krantz, R. C. Atkinson, R. D. Luce, & P. Suppes (Eds.), Contemporary developments in mathematical psychology (Vol. 1). San Francisco, California: Freeman, 1974.
Bartlett, F. C. Remembering. Cambridge, Massachusetts: Cambridge University Press, 1932.
de Beaugrande, R. Text, discourse, and process. Norwood, New Jersey: Ablex, 1980.
Bellezza, F. S., & Bower, G. H. Person stereotypes and memory for people. Journal of Personality and Social Psychology, 1981, 41, 856-865. (a)
Bellezza, F. S., & Bower, G. H. The representation and processing characteristics of scripts. Bulletin of the Psychonomic Society, 1981, 18, 1-4. (b)
Bellezza, F. S., & Bower, G. H. Remembering script-based text. Poetics, 1982, in press.
Biederman, I. On the semantics of a glance at a scene. In M. Kubovy & J. R. Pomerantz (Eds.), Perceptual organization. Hillsdale, New Jersey: Erlbaum, 1982, in press.
Black, J. B., & Bower, G. H. Episodes as chunks in narrative memory. Journal of Verbal Learning and Verbal Behavior, 1979, 18, 309-318.
Bobrow, D. G., & Norman, D. A. Some principles of memory schemata. In D. G. Bobrow & A. Collins (Eds.), Representation and understanding. New York: Academic Press, 1975.
Bower, G. H., Black, J. B., & Turner, T. J. Scripts in memory for text. Cognitive Psychology, 1979, 11, 177-220.
Bransford, J. D. Human cognition: Learning, understanding, and remembering. Belmont, California: Wadsworth, 1979.
Bransford, J. D., & Johnson, M. K. Considerations of some problems on comprehension. In W. G. Chase (Ed.), Visual information processing. New York: Academic Press, 1973.
Bregman, A. S. Perception and behavior as compositions of ideals. Cognitive Psychology, 1977, 9, 250-292.
Brewer, W. F., & Treyens, J. C. Role of schemata in memory for places. Cognitive Psychology, 1981, 13, 207-230.
Cantor, N. Prototypicality and personality judgements. Unpublished doctoral dissertation, Stanford University, 1978.
Cantor, N., & Mischel, W. Traits as prototypes: Effects on recognition memory. Journal of Personality and Social Psychology, 1977, 35, 38-48.
Clark, L. F., & Woll, S. B. Stereotypes: A reconstructive analysis of reconstructive effects. Journal of Personality and Social Psychology, 1981, 41, 1064-1072.
Cofer, C. N., Chmielewski, D. L., & Brockway, J. F. Constructive processes and the structure of human memory. In C. N. Cofer (Ed.), The structure of human memory. San Francisco, California: Freeman, 1976.
D’Andrade, R. G. Memory and the assessment of behavior. In H. Blalock (Ed.), Measurement in the social sciences. Chicago, Illinois: Aldine, 1974.
Dooling, D. J., & Lachman, R. Effects of comprehension on the retention of prose. Journal of Experimental Psychology, 1971, 88, 216-222.
Dunay, P. K., Blazer, R. H., & Yekovich, F. R. Using memory schemata to comprehend scripted text. Paper presented at the meeting of the American Psychological Association, Los Angeles, California, 1981.
Flavell, J. H. The developmental psychology of Jean Piaget. Princeton, New Jersey: Van Nostrand-Reinhold, 1963.
Friedman, A. Framing pictures: The role of knowledge in automatized encoding and memory for gist. Journal of Experimental Psychology: General, 1979, 108, 316-355.
Going, M., & Read, J. D. Effects of uniqueness, sex of subject, and sex of photograph on facial recognition. Perceptual and Motor Skills, 1974, 39, 109-110.
Goldin, S. E. Memory for the ordinary: Typicality effects in chess memory. Journal of Experimental Psychology: Human Learning and Memory, 1978, 4, 605-616.
Goodman, G. S. Picture memory: How the action schema affects retention. Cognitive Psychology, 1980, 12, 473-495.
Graesser, A. C. How to catch a fish: The representation and memory of common procedures. Discourse Processes, 1978, 1, 72-89.
Graesser, A. C. Prose comprehension beyond the word. New York: Springer-Verlag, 1981.
Graesser, A. C., Gordon, S. E., & Sawyer, J. D. Memory for typical and atypical actions in scripted activities: Test of a script pointer + tag hypothesis. Journal of Verbal Learning and Verbal Behavior, 1979, 18, 319-332.
Graesser, A. C., Robertson, S. P., & Anderson, P. A. Incorporating inferences in narrative representations: A study of how and why. Cognitive Psychology, 1981, 13, 1-26.
Graesser, A. C., Woll, S. B., Kowalski, D. J., & Smith, D. A. Memory for typical and atypical actions in scripted activities. Journal of Experimental Psychology: Human Learning and Memory, 1980, 6, 503-513.
Green, D. M., & Swets, J. A. Signal detection theory and psychophysics. New York: Wiley, 1966.
Grice, H. P. Logic and conversation. In P. Cole & J. L. Morgan (Eds.), Syntax and semantics (Vol. 3): Speech acts. New York: Seminar Press, 1975.
Hamilton, D. L. Illusory correlations as a basis for stereotyping. In D. L. Hamilton (Ed.), Cognitive processes in stereotyping and intergroup behavior. Hillsdale, New Jersey: Erlbaum, 1981.
Hastie, R. Memory for behavioral information that confirms a personality impression. In R. Hastie, T. M. Ostrom, E. B. Ebbesen, R. S. Wyer, D. L. Hamilton, & D. E. Carlston (Eds.), Person memory: The cognitive basis of social perception. Hillsdale, New Jersey: Erlbaum, 1980.
Hastie, R., & Kumar, A. P. Person memory: Personality traits as organizing principles in memory for behaviors. Journal of Personality and Social Psychology, 1979, 37, 25-38.
Kintsch, W. Memory and cognition. New York: Wiley, 1977.
Kintsch, W., & Van Dijk, T. A. Toward a model of text comprehension and production. Psychological Review, 1978, 85, 363-394.
Krueger, L. E. Is identity or regularity more salient than difference or irregularity? Paper presented at the meeting of the American Psychological Association, Los Angeles, California, 1981.
Light, L. L., Kayra-Stuart, F., & Hollander, S. Recognition memory for typical and unusual faces. Journal of Experimental Psychology: Human Learning and Memory, 1979, 5, 212-228.
Loftus, G. R., & Mackworth, N. H. Cognitive determinants of fixation location during picture viewing. Journal of Experimental Psychology: Human Perception and Performance, 1978, 4, 565-572.
Mandler, G. Organization and recognition. In E. Tulving & Donaldson (Eds.), Organization and memory. New York: Academic Press, 1972.
Mandler, G. Recognizing: The judgement of previous occurrence. Psychological Review, 1980, 87, 252-271.
Mandler, J. M. A code in the node: The use of a story schema in retrieval. Discourse Processes, 1978, 1, 14-35.
Mandler, J. M. Categorical and schematic organization in memory. In C. R. Puff (Ed.), Memory organization and structure. New York: Academic Press, 1979.
Mandler, J. M. Representation. In J. H. Flavell and E. M. Markman (Eds.), Cognitive development. Vol. 2 of P. Mussen (Ed.), Manual of child psychology. New York: Wiley, 1982, in press.
Mandler, J. M., & Johnson, N. S. Remembrance of things parsed: Story structure and recall. Cognitive Psychology, 1977, 9, 111-151.
Minsky, M. A. A framework for representing knowledge. In P. H. Winston (Ed.), The psychology of computer vision. New York: McGraw-Hill, 1975.
Neisser, U. Cognition and reality. San Francisco, California: Freeman, 1976.
Neisser, U., & Becklen, R. Selective looking: Attending to visually specific events. Cognitive Psychology, 1975, 7, 480-494.
Nelson, K. Cognitive development and the acquisition of concepts. In R. C. Anderson, R. J. Spiro, & W. E. Montague (Eds.), Schooling and the acquisition of knowledge. Hillsdale, New Jersey: Erlbaum, 1977.
Norman, D. A., & Bobrow, D. G. On the role of active memory processes in perception and cognition. In C. N. Cofer (Ed.), The structure of human memory. San Francisco, California: Freeman, 1976.
Norman, D. A., & Bobrow, D. G. Descriptions: An intermediate stage in memory retrieval. Cognitive Psychology, 1979, 11, 107-123.
Palmer, S. E. The effect of conceptual scenes on the identification of objects. Memory and Cognition, 1975, 3, 519-526.
Reeder, G. D., & Brewer, M. B. A schematic model of dispositional attribution in interpersonal perception. Psychological Review, 1979, 86, 61-79.
Reynolds, R. E., & Anderson, R. C. The influence of questions on the allocation of attention during reading. Technical Report #183. Center for the Study of Reading, University of Illinois, Champaign-Urbana, Illinois, 1980.
Rothbart, M. Memory processes and social beliefs. In D. L. Hamilton (Ed.), Cognitive processes in stereotyping and intergroup behavior. Hillsdale, New Jersey: Erlbaum, 1981.
Rumelhart, D. E. Notes on a schema for stories. In D. G. Bobrow & A. Collins (Eds.), Representation and understanding. New York: Academic Press, 1975.
Rumelhart, D. E. Understanding and summarizing brief stories. In D. Laberge & S. J. Samuels (Eds.), Basic processes in reading: Perception and comprehension. Hillsdale, New Jersey: Erlbaum, 1977.
Rumelhart, D. E., & Ortony, A. The representation of knowledge in memory. In R. C. Anderson, R. J. Spiro, & W. E. Montague (Eds.), Schooling and the acquisition of knowledge. Hillsdale, New Jersey: Erlbaum, 1977.
Schank, R. C. Language and memory. Cognitive Science, 1980, 4, 243-284.
Schank, R. C., & Abelson, R. Scripts, plans, goals, and understanding. Hillsdale, New Jersey: Erlbaum, 1977.
Searle, J. R. Speech acts. London: Cambridge University Press, 1969.
Smith, D. A. What schema-relevant inferences are passed to the memory representation of text? Unpublished master’s thesis, California State University, Fullerton, 1981.
Smith, D. A., & Graesser, A. C. Memory for actions in scripted activities as a function of typicality, retention interval, and retrieval task. Memory and Cognition, 1981, 9, 550-559.
Spilich, G. J., Vesonder, G. T., Chiesi, H. L., & Voss, J. F. Text processing of domain related information for individuals with high and low domain knowledge. Journal of Verbal Learning and Verbal Behavior, 1979, 18, 275-290.
Spiro, R. J. Remembering information from text: Theoretical and empirical issues concerning the “state of schema” reconstruction hypothesis. In R. C. Anderson, R. J. Spiro, & W. E. Montague (Eds.), Schooling and the acquisition of knowledge. Hillsdale, New Jersey: Erlbaum, 1977.
Srull, T. K. Person memory: Some tests of associative storage and retrieval models. Journal of Experimental Psychology: Human Learning and Memory, 1981, 7, 440-463.
Stein, N. L., & Glenn, C. G. An analysis of story comprehension in elementary school children. In R. O. Freedle (Ed.), New directions in discourse processing (Vol. 2). Norwood, New Jersey: Ablex, 1979.
Stein, N. L., & Nezworski, T. The effects of organization and instructional set on story memory. Discourse Processes, 1978, 1, 177-193.
Taylor, S. E., & Crocker, J. Schematic bases of social information processing. In E. T. Higgins, P. Herman, & M. P. Zanna (Eds.), The Ontario Symposium on personality and social psychology. Hillsdale, New Jersey: Erlbaum, 1981.
Thorndyke, P. W. Cognitive structures in comprehension and memory for narrative discourse. Cognitive Psychology, 1977, 9, 77-110.
Thorndyke, P. W., & Hayes-Roth, B. The use of schemata in the acquisition and transfer of knowledge. Cognitive Psychology, 1979, 11, 82-106.
Thorndyke, P. W., & Yekovich, F. R. A critique of schema-based theories of human story memory. Poetics, 1980, 9, 23-49.
Trabasso, T., Secco, T., & Brock, P. V. D. Causal cohesion and story coherence. In H. Mandl, N. L. Stein, & T. Trabasso (Eds.), Learning and comprehension of text. Hillsdale, New Jersey: Erlbaum, 1982, in press.
den Uyl, M., & Van Oostendorp, H. The use of scripts in text comprehension. Poetics, 1980, 9, 275-294.
Woll, S. B., & Graesser, A. C. Memory discrimination for information typical or atypical of person schemata. Social Cognition, 1982, in press.
Woodworth, R. S. Dynamics of behavior. New York: Holt, 1958.
Woodworth, R. S., & Schlosberg, H. Experimental psychology. New York: Holt, 1954.
Yekovich, F. R., & Yekovich, C. W. The use of scripts in the study of knowledge-based comprehension of text. In U. Connor (Ed.), Discourse approaches to reading comprehension, 1982, in press.


CONSTRUCTION AND REPRESENTATION OF ORDERINGS IN MEMORY

Kirk H. Smith and Barbee T. Mynatt
BOWLING GREEN STATE UNIVERSITY
BOWLING GREEN, OHIO

I. Introduction
II. Review of Previous Research
III. Overview of the Experiments
IV. Experiment 1: Retrieval from Orderings
    A. Method
    B. Results and Discussion
    C. Conclusions and Implications
V. Experiment 2: The Role of Determinacy in Constructing Partial Orderings
    A. Method
    B. Results and Discussion
VI. Experiment 3
    A. Method
    B. Results and Discussion
VII. Experiment 4: Diverging and Converging Nodes
    A. Method
    B. Results and Discussion
VIII. Experiment 5: The Role of the Schema
    A. Method
    B. Results and Discussion
    C. Conclusions from Experiments on Presentation Orders
IX. Summary
References

I. Introduction

Implicit in much of the recent work in cognitive psychology on the acquisition, retention, and retrieval of information on ordered relationships is the assumption that the linear order is a “good figure” (De Soto, 1960; Henley, Horsfall & De Soto, 1969). In general, when people are confronted with a set of asymmetric, transitive relations such as “A is greater than B,” there is a strong tendency to represent it in a single, complete ordering. Experiments by Barclay (1973), De Soto (1960),

Potts (1972), and Trabasso, Riley, and Wilson (1975) provide a variety of demonstrations of the strength of this tendency. Unfortunately, the world is not a simple, well-ordered place. Much of our knowledge cannot fit neatly into the complete ordering schema. Important examples of partial orderings include family trees and causal relations. The present article is concerned with the conditions that facilitate construction of appropriate mental representations of partially ordered information. We begin with an examination of several domains of knowledge that are typically portrayed as networks of partially ordered concepts or events. A review of the experimental literature on networks and partial orderings follows. We then describe five experiments on partial orderings that have not been reported previously. These studies make two contributions. First, they point out the limitations of previously published studies and suggest the need for exploration of a greater variety of partial orderings. Second, the experiments were designed to determine whether the acquisition of partial orderings can be understood in the same terms as the acquisition of linear orderings. Our investigations of presentation order strongly suggest that the same theory is satisfactory for both partial and complete orderings.

What common kinds of information form networks of partially ordered objects or events? One instance that is very familiar to cognitive psychologists is the hierarchy (see Fig. 1A). Networks of this type have been used to represent the grammatical relationships among the words in a sentence (Chomsky, 1957; Johnson, 1968), as well as a person’s knowledge about the semantic relationships among words or concepts (Collins & Quillian, 1969). Two influential theories of semantic memory combine both types of information in complex networks (Anderson & Bower, 1973; Norman, Rumelhart, & the LNR Research Group, 1975).
These theoretical networks are relatively complex in that several different kinds of relationships between words (or lexical entries) are represented. Another example of a frequently encountered hierarchy of partially ordered entities is an organizational chart or chain of command. A second type of information that can form a partial ordering is a network of causal relations. Suppose, for example, that B, J, H, K, F, and D are events in a narrative. Several relations among two or more events are possible. In the simplest, B might be the sole cause of J. More complex possibilities are that J and K jointly cause H or that H is the cause of both F and D. Such causal networks are sometimes represented in graphs like those of Fig. 1. (The examples above describe the situation at the top of Fig. 1B.) Historical relationships are often portrayed in this way. A family tree represents a special case of a history described in causal terms. The flow of material through a manufacturing process often


Orderings in Memory

Fig. 1. Examples of recently investigated networks of partially ordered entities from the following studies: (A) Nelson and Smith (1972), (B) Hayes-Roth and Hayes-Roth (1975), and (C) Moeser and Tarrant (1977).

can be usefully represented as a partial ordering, and the planning of complex development projects is frequently characterized this way (as in a “PERT Chart,” Moder & Phillips, 1964). Underlying all the examples is an implicit time line. However, to the extent that the relationships are partially ordered, time need not be fully specified. In Fig. 1C, the precise time at which events H and M occur is unspecified. What is important is that both H and M precede N.

For a variety of reasons, information about a network of relationships is often acquired sequentially. We learn about a family tree by listening to relatives talk about the individuals who comprise it. We read about causal relations one at a time. History is most often recounted serially. But even when we are not constrained by the serial nature of language as a medium of communication, experience enforces sequential acquisition on us. Our unguided experience with the interactions of a group of people necessarily follows a time line, even though what we eventually come to understand about a group may be most accurately reflected in a sociogram or organizational chart (cf. De Soto, 1960). Perhaps the most striking example of a network of relationships that is not completely ordered but must be translated into a serial representation is found in a computer program. For some purposes, a program must be understood as a set of complex logical relationships among the operations the computer can execute. However, the program must also be realized as a single, rigorously ordered series of symbolic statements. Writing a program involves a translation from the first kind of representation (in the programmer’s head) to the second. Often of more practical importance is the translation in the other direction, as when a program with logical errors is debugged or when someone other than the original programmer


tries to correct or modify a program. (Sometimes even the original programmer has this problem after the passage of time.)

The preceding examples make it clear that people need to be able to understand networks of partially ordered entities. The question is how this is done. The writings of De Soto (1960; as well as Henley et al., 1969) seem to imply that even when sophisticated people give the relationships their most thoughtful consideration, they cannot handle certain kinds of networks. Yet some of our examples indicate that such networks cannot be impossible to understand and remember. The purpose of this article is to explore certain variables that affect people’s ability to understand and remember networks of partially ordered entities, and to show that they are the same variables that affect the apprehension of complete orderings.

II. Review of Previous Research

As indicated above, the hypothesis that the linear order is a “good figure” comes from the work of De Soto. The most relevant finding for present purposes comes from an experiment in which college students had to learn 12 relationships among four people (De Soto, 1960). The task was to report for each ordered pair of names whether or not the first influenced the second. Subjects required approximately nine repetitions, or trials, to learn all 12 relations when the latter formed a complete or linear ordering. Roughly 12 trials were required when the ordering was not complete, that is, when it formed a partial ordering and resembled a hierarchy or organization chart. A thorough treatment of De Soto’s work is beyond the scope of this article; however, several comments are needed to place our work in the proper perspective. From the beginning De Soto was concerned with how certain social relations are perceived. Thus, the 1960 experiment also contained groups of subjects that learned a set of statements identical to those described above, except that “influences” was replaced by “likes.” These groups had equal difficulty with complete and partial orderings, but performed better when the relationship formed a transitive symmetric structure. De Soto concluded that “influence” is understood to be an asymmetric relation, whereas “like” is symmetric. It should be noted that no logical inconsistency is involved in a situation in which Al likes Bill, but Bill does not like Al, whereas Al cannot be both older and younger than Bill, nor both the father and the son of Bill. The present study was concerned almost exclusively with the latter type of relationships. Thus,


our work should be interpreted as an exploration of the role of completeness in understanding asymmetric, transitive structures. Two other aspects of De Soto’s work deserve comment. First, we are not concerned with the observation that a set of items with two or more conflicting orderings (e.g., on two different dimensions) produces cognitive strain (De Soto, 1961). This “predilection for single orderings” is easily confused with people’s tendency to reduce a partially ordered set of items (on a single dimension) to one (incorrect) ordering. Second, the present work is not intended to be a definitive treatment of how people understand and remember systems of causal relations or the logical structure of computer programs. We recognize that intransitivities play an important role in these instances. (In fact, the loop, one of the most important structures in computer programs, corresponds formally to what Henley et al., 1969 call a cycle.) However, many of the issues raised here, especially the methodological ones, must be faced in future studies of how people deal with even more complex networks than the ones considered here. The experiments considered so far have all used a paired associate learning procedure. A paper by Nelson and Smith (1972), which raises a number of important questions about partial orderings, explored a graphical form of presentation. Graphs are popular devices for facilitating comprehension of networks. Indeed, the word “network” is applied to partial orderings by an analogy between the graphs that are used and things like fishing nets. Examples are called flow charts, family trees, PERT charts, and sociograms. The value of a graph lies in its accurate and economical representation of the important aspects of a partial ordering. For example, Fig. 1B expresses not only the determinate relationships between K and H and J and H, represented by lines, but also the indeterminate relationship between K and J. (Each line in Fig. 
1 represents an asymmetric, transitive relationship between two symbols. If the relationship is “greater than,” then the symbol with the higher location on the graph is the larger.) Nelson and Smith examined learning and retention of the 34 determinate relations represented in Fig. 1A. The relations were presented either as a set of 34 associations between letters (that is, C → M, G → M, B → M, . . . , D → K) or as a diagram like the one in the figure. Tests required the subjects to make checkmarks in a 14 × 14 matrix for which rows indicated first letters of the associations (letters lower in the hierarchy in Fig. 1A) and columns indicated second letters (or letters higher in the hierarchy). The letters heading the rows and columns were assigned randomly from trial to trial. Thus, Nelson and Smith’s subjects had to learn how to translate one representation (a set of pairs or a graph) into

another (a matrix). There is no special reason to assume that the matrix representation is unique or cognitively simpler than a set of pairs or a graph. Although all possible combinations are available in the matrix, the subject has to identify only the ones presented in the association condition and does not have to discriminate between logically incorrect combinations and those that are indeterminate. Nelson and Smith found that subjects learned the 34 pairings in fewer trials, retained more pairs, and required fewer trials to relearn them when they were presented graphically than when only the pairs were presented. The difference was particularly striking in the number of errors made in learning. These results indicate that the information conveyed by a graph enhances in some way a person’s knowledge about a set of partially ordered relationships. Of particular importance is the finding that learning was unaffected when the left-to-right ordering of the branches in the graphs was rearranged from trial to trial. Groups receiving graphs that changed in this way performed as well as groups that received identical graphs throughout learning. Apparently, the subjects were able to discriminate the essential features of the graphs they were shown from their nonessential aspects. Graphs of the kind shown in Fig. 1 contain a number of details extraneous to the information they represent. The order of the nodes from left to right in the drawing is irrelevant, and the length of the lines carries no meaning. The conventions for drawing graphs of this kind permit ordinal, and sometimes interval, information to be expressed under some circumstances-as when time is represented in a graph illustrating the political history of western Europe in the eighteenth century. The latter example also makes it clear that graphs representing networks are abbreviated and impoverished in important ways. 
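The observation that only the relations, and not the layout of the drawing, carry information can be illustrated with a short computation. The following Python sketch is offered only as an illustration under stated assumptions: the edge list encodes the causal example described earlier for the top of Fig. 1B (B causes J; J and K jointly cause H; H causes F and D), and the names `EDGES`, `determinate_relations`, and `indeterminate_pairs` are hypothetical choices made here, not part of any of the studies reviewed. It derives the determinate relations by transitivity, lists the pairs whose order is left unspecified, and checks that listing the same relations in a different order changes nothing.

```python
from itertools import combinations

# Assumed edge list for the causal example at the top of Fig. 1B:
# B causes J; J and K jointly cause H; H causes F and D.
EDGES = [("B", "J"), ("J", "H"), ("K", "H"), ("H", "F"), ("H", "D")]


def determinate_relations(edges):
    """Return every ordered pair (x, y) such that x precedes y by transitivity."""
    successors = {}
    for upper, lower in edges:
        successors.setdefault(upper, set()).add(lower)
        successors.setdefault(lower, set())
    closure = set()
    for start in successors:
        stack = list(successors[start])
        seen = set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            closure.add((start, node))
            stack.extend(successors[node])
    return closure


def indeterminate_pairs(edges):
    """Return unordered pairs of nodes whose relative order is not specified."""
    closure = determinate_relations(edges)
    nodes = sorted({node for edge in edges for node in edge})
    return {
        (a, b)
        for a, b in combinations(nodes, 2)
        if (a, b) not in closure and (b, a) not in closure
    }


determinate = determinate_relations(EDGES)
indeterminate = indeterminate_pairs(EDGES)
print(len(determinate), "determinate relations")
print(sorted(indeterminate), "are indeterminate")

# Presenting the same relations in a different order leaves the structure unchanged.
same_after_reordering = determinate_relations(list(reversed(EDGES))) == determinate
print("unchanged after reordering:", same_after_reordering)
```

In this sketch the pair K and J comes out as indeterminate, in agreement with the description of Fig. 1B given above, whereas a relation such as B before F is determinate even though it was never stated directly.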
The nodes or boxes and even the lines or arrows are presented in a symbolic shorthand; one must usually read an accompanying text to get the complete story. Although Nelson and Smith demonstrated that college students are able to master the information in a partial ordering (at least temporarily for the purpose of completing a laboratory experiment), a series of recently published papers seems to argue that the knowledge is qualitatively different from that acquired when a linear ordering is learned. This conclusion is based on the finding that the “distance effect,” universally found with linear orderings, has not been obtained with partial orderings. Briefly, the distance effect refers to the fact that in judging which of two items ranks higher on a linear ordering, people tend to be faster and more accurate as the number of intervening items on the scale increases. In a typical demonstration of this effect, subjects first learn a completely ordered set of relations, A > B, B > C, C > D, D > E, E > F. They are

then asked either to judge whether various test items of the same form are true or false or to pick out which of two things ranks higher. Reaction times tend to be shorter and error rates lower in verifying B > E than B > C, C > D, and D > E. Both Potts (1972) and Trabasso et al. (1975) found the distance effect even when relationships of greater distance (e.g., B > E) were not presented until testing. In contrast, when Hayes-Roth and Hayes-Roth (1975) tested subjects who had learned the 11 relationships represented by Fig. 1B, they obtained a reverse distance effect. That is, adjacent relations, involving two letters connected by a single line (e.g., B > J in Fig. 1B), were judged more quickly than remote relations involving letters connected by two or more lines (e.g., H > P). This is, of course, just the opposite of what had been found with linear orderings. Hayes-Roth and Hayes-Roth described their experiment within the context of the semantic memory literature, in which the typical material is composed of sentences about class inclusion (e.g., “Canaries are birds”), and reaction time increases with distance rather than decreases. They went on to demonstrate that repeated testing on remote relationships could change the observed effects of distance, a result that clearly has methodological implications for the verification-time procedure used to study semantic relations. This latter aspect of their paper has been largely ignored, and subsequent work has focused on the failure to find the appropriate “distance effect” (remote relations faster than adjacent ones) in partial orderings. Moeser and Tarrant (1977) pointed out that the procedure used by Hayes-Roth and Hayes-Roth (1975) encourages subjects to learn the individual relations between letters as isolated units in memory, rather than to integrate them into a network. 
Moeser and Tarrant argued that people do not spontaneously integrate information into holistic representations except under special circumstances, and that none of these conditions had occurred in the Hayes-Roth and Hayes-Roth experiment. Moeser and Tarrant therefore changed a number of aspects of the learning situation. They pointed out that in the Hayes-Roth and Hayes-Roth experiment, the relationships were presented as abstract inequalities involving meaningless letter pairs. Arguing that integration is more likely to occur with concrete and familiar material, Moeser and Tarrant used sentences that related a set of male names in terms of age (e.g., “Hugh is older than Bob”). They also required subjects to learn specific ages for some of the names. And in one condition, they showed subjects a network representation (along the lines of Fig. 1C) and encouraged subjects to store information in this format. They argued that these changes should lead to integrated storage and the usual distance effect observed with linear orders. In fact, judgment times for adjacents and remotes were equal.

It is not entirely clear how this last result should be interpreted. On the one hand, the finding of equally long times for adjacent and remote relationships can still be interpreted as evidence that partial orderings are not represented and accessed in the same way as linear orderings, even when special precautions are taken to ensure that all the information has been properly stored and integrated. On the other hand, both studies of complex, partial orderings (Hayes-Roth & Hayes-Roth, 1975; Moeser & Tarrant, 1977) differed from research on linear orders in several important ways. These differences might explain why the pattern of reaction times was different. One early hypothesis was that the experiments on partial orderings had used much larger structures. The number of elements in the partial orderings was 12, compared to at most 6 in Potts (1974) and Trabasso et al. (1975). Indirect evidence now suggests that the number of elements in the ordering is probably not responsible for the difference in results. Another line of investigation has looked more closely at the procedures used in this research. The results here are less clearcut; however, we argue in a subsequent section that these results tell us as much about the complex relationship between the distance effect and mental organization as they do about the difference between partial and complete orderings. There are two lines of evidence that the number of elements in an ordering cannot explain the difference between results from experiments on partial orderings and linear orderings. First, Pliske and Smith (1979) and Woocher, Glass, and Holyoak (1978) have reported distance effects using linear orderings of 12 and 16 terms, respectively. Second, Warner and Griggs (1980) had subjects learn a seven-term partial ordering under a variety of conditions designed to ensure that the information was correctly represented. In spite of these efforts, no distance effect was observed. 
Thus, it appears that the distance effect reflects the organization in memory of information from linear, but not partial, orderings. Unfortunately, no studies have directly compared distance effects in complete and partial orderings using the same procedures and testing the same relationships. The one study that has compared complete and partial orderings of the same size (14 elements in Moeser, 1979) failed to obtain a distance effect for both linear orderings and partial orderings. However, the procedure of this study was quite different from any discussed so far. Procedural variations seem to account for a good deal of the confusion in the literature on partial orderings. First, investigators have used a variety of methods to present the relationships that make up an ordering. Both Hayes-Roth and Hayes-Roth (1975) and Moeser and Tarrant (1977) used elaborate training sequences made up of many exposures to the

relationships. By contrast, Pliske and Smith (1979) and Woocher et al. (1978) gave people a list to learn before coming to the laboratory and tested the success of this procedure by requiring each subject to recite the list in order. Second, the partial order studies have analyzed the time required to judge whether an assertion, such as “Carl is older than Mike,” is true or false (sentence verification procedure). Many of the linear ordering studies have used a procedure in which the subject is shown two terms (e.g., “Carl Mike”) side by side on a display screen and required to press a response key under the older of the two (two-choice procedure). Polich and Potts (1977) compared the two procedures and found that the verification procedure produces interactions between the presence of an end-anchor (highest or lowest ranked element in the ordering) and whether the sentence is true or false. Not only were such interactions absent in the two-choice procedure, but Polich and Potts also reported that the overall variability of the response times was significantly less with this procedure. The importance of these procedural variations is dramatically illustrated by the two experiments that constitute a master’s thesis by Pliske (1978). In the first, unpublished experiment, subjects learned a 12-term linear order and were tested on series of adjacent relationships and a selected subset of the possible remote relationships. The method was designed to follow as closely as possible the one used by Moeser and Tarrant (1977), but without any special training on how to represent the ordering. Response times were highly variable; and the effect of distance, although evident and statistically significant, was not nearly as straightforward and compelling as that obtained in the second, published experiment (Pliske & Smith, 1979), in which subjects studied the list on their own and were tested with the two-choice procedure. 
Such variations in method may be more important than has been previously realized. For example, we cannot rule out the possibility that the distance effect is to some extent a reflection of the acquisition and testing procedures used in these larger structures. Two recent papers by Griggs and his students (Griggs, Keen, & Warner, 1980; Warner & Griggs, 1980) explored the procedural variations already discussed, along with several others. In no case were distance effects obtained for partial orderings. Elaborate preliminary instructions about the nature of partial orderings and their representation in graphical form did not lead to distance effects. Warner and Griggs (1980) found that without exposure to a graphical representation of the information, less than 60% of their subjects’ responses were consistent with the correct seven-term partial ordering. Only when subjects were required to draw

the correct graph from memory on two consecutive trials and were then trained on the adjacent comparisons to two consecutive correct trials did they respond correctly to remote comparisons on the test. Even this rigorous program did not lead to a distance effect, although this approach can be criticized because the extensive training on adjacent comparisons may have facilitated responses to them. Warner and Griggs’s third experiment comes the closest to matching the procedures of the earlier studies with large linear orderings. Following preliminary instructions on the nature of partial orderings and their graphical representation, subjects were given a graph of either a 7- or 12-item partial order and told to memorize it for a second experimental session. At the beginning of the second session, the subject had to draw the structure both before and after familiarization with the test procedure. Finally, testing made use of a modification of the two-choice procedure similar to that used by Pliske and Smith and Woocher et al. The modification involved the addition of a third button to be used when a pair of items was indeterminate, that is, not ordered by the information given to the subject. In spite of the procedural similarities, Warner and Griggs found a reverse distance effect for the 12-item partial order used by Moeser and Tarrant. (It should be noted that 2 of the 20 subjects in this condition of Warner and Griggs’s experiment were unable to learn the correct structure.) In summary, 6 years of research suggests that the information in partial orderings is more difficult to memorize than similar completely ordered information. 
However, after receiving special instructions about the nature of incomplete ordering, accompanied by graphs and practice in using them, most college students are able to judge whether or not a given pair of elements is ordered and to draw correct inferences about the relationship between pairs of elements that are ordered but not specifically presented. The only evidence that the partial orderings are represented in an inherently different fashion is the failure to find a distance effect for judgments requiring inferences. The correct interpretation of this difference depends on how the distance effect is interpreted for complete sets of elements (cf. Potts, Banks, Kosslyn, Moyer, Riley, & Smith, 1978). In any case, these conclusions are based on investigations of a remarkably small sample of different partial orderings. No arguments have been offered to support the claim that the sample is representative. If our survey is complete, exactly four partial orderings have been examined: the two 12-item orderings shown in Figs. 1B and 1C, a 7-item ordering studied by Griggs and his students, and a 14-item ordering that Moeser (1979) compared with a complete ordering.
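
The notion of distance used throughout this review can be made concrete with a short sketch. The Python fragment below is an illustration only and is not taken from any of the studies reviewed: it uses the six-term linear ordering A > B > C > D > E > F mentioned earlier and groups every pair by the number of intervening terms (0 for adjacent pairs, larger values for remote pairs). The function name `step_size` and the variable names are illustrative assumptions.

```python
from itertools import combinations

# Six-term linear ordering from the text: A > B > C > D > E > F.
order = ["A", "B", "C", "D", "E", "F"]
rank = {term: i for i, term in enumerate(order)}

def step_size(x, y):
    """Number of terms lying between x and y in the ordering (0 = adjacent)."""
    return abs(rank[x] - rank[y]) - 1

# Group every pair of terms by the number of intervening terms.
pairs_by_step = {}
for x, y in combinations(order, 2):
    pairs_by_step.setdefault(step_size(x, y), []).append(f"{x} > {y}")

for step, pairs in sorted(pairs_by_step.items()):
    print(step, pairs)
```

In these terms, the distance effect is the finding that judgments such as B > E (two intervening terms) tend to be faster and more accurate than judgments such as B > C (none), even though only the adjacent pairs were studied.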

III. Overview of the Experiments

As part of a research project dealing with the construction of linear orders (Foos, Smith, Sabol, & Mynatt, 1976; Mynatt & Smith, 1977; Smith & Foos, 1975), we became interested in partial orderings or networks because they seemed to be a rich domain into which we could extend our theory of constructive processes (see Foos et al. and Smith’s section of Potts et al.). Our work has focused on the construction of four-, five-, and six-element linear orderings. With such small sets of relationships, the construction of a branch or node from two relations (e.g., A > B, A > C) did not appear to be fundamentally different from the construction of a linear ordering (e.g., from A > B, B > C). Indeed, an early study in our laboratory (Smith & Mynatt, 1975) indicated that four- and five-term partial orderings were no more difficult to construct than similar-sized linear orderings. These preliminary observations were in sharp contrast to the previously published studies we have reviewed. In what follows, the first experiment we report was an investigation of the distance effect in retrieving information from partial orderings. One condition of the experiment was essentially a replication of the experiments by Moeser and Tarrant (1977) and Warner and Griggs (1980). It differed from the previous studies mainly in the additional procedures included prior to testing in order to guarantee that subjects understood the indeterminacy of partial orderings and had learned the specific adjacent relationships. A second condition tested another structure with the same number of elements (12) but a configuration similar to the hierarchical network studied by Nelson and Smith (1972). The second experiment reported below was designed to explore the diversity of structural configurations possible in 12-element partial orderings. 
A less elaborate testing procedure was used, and the focus was on whether subjects could answer questions and draw accurate diagrams on the basis of a set of sentences describing a partial ordering. The sentences were continuously in view in order to eliminate the effects of memory storage and retrieval. Results of the first two experiments were interpreted as evidence that, at least for college students, partial orderings can be learned and the resulting knowledge is not fundamentally different from what is learned in a linear ordering. The last three experiments, using four- and five-element orderings, were concerned with the process of construction. How are the relationships in individual sentences combined to form mental networks? The first of these experiments contrasted the process of extending a linear ordering with the process of building a node or branch (e.g., the structure

involving J, R, and D, at the top of Fig. 2). The second experiment investigated the construction of different types of nodes; the third was concerned with the effects of context on constructive processes. Different contexts (in this case the sentence frames used to express relationships) were expected to elicit more or less appropriate representational schemas from the subjects’ permanent memory. Throughout these last three experiments, the Foos et al. theory of constructive processes was extended and modified to apply to networks or partial orderings.

IV. Experiment 1: Retrieval from Partial Orderings

None of the recent studies of partial orderings has found a distance effect, in which responses to remote relationships are faster than to adjacent ones. Various attempts (Hayes-Roth & Hayes-Roth, 1975; Moeser & Tarrant, 1977; Warner & Griggs, 1980) suggest that this failure cannot be attributed to differences in training procedures, response measurement procedures, or to the number of elements in the structure. However, only four configurations have been investigated. Each of these four structures seems arbitrarily complex and unlike anything a college student might have encountered previously. The conclusion that distance effects cannot be obtained in any partial ordering is obviously premature. A more familiar and intuitively simpler configuration of relationships that form a partial ordering is a hierarchy or family tree. The present experiment compared retrieval time for information in the hierarchical network shown in Fig. 2 with comparable performance for the irregular network used by Moeser and Tarrant. The two structures have the same number of elements (12). The elements were one-syllable given names from Battig and Montague’s (1969) norms, and the relations between them were described as age relations.

A. METHOD

The subjects were 22 undergraduate students at Bowling Green State University. Their participation partially fulfilled a course requirement for introductory psychology. The 11 students in the “hierarchy” condition worked with the relations graphed in Fig. 2. The 11 students in the “irregular” condition worked with the relations that form the irregular partial ordering (see Fig. 1C) studied by Moeser and Tarrant, and Warner and Griggs. The training phase was based to some extent on the procedures used by Nelson and Smith. It had several steps and attempted to expose the

Fig. 2. The hierarchical network that subjects in the hierarchy condition of Experiment 1 learned.

subjects to the various properties of the order in a thorough but relatively unstructured way. The subjects were first given a sheet containing 11 sentences describing the age relation between adjacent elements from the order and were asked to draw a diagram representing the information presented in the sentences. The experimenter checked these drawings and discussed any inaccuracies with the subjects. The subjects were then asked to verify four other diagrams representing the same information in four somewhat different ways. These diagrams had lines that were longer or shorter or of varying length and had different arrangements of the branches. Two of the diagrams had minor errors in them, and two had no errors. The purpose of this phase was to allow the subjects to see that a variety of diagrams could be accurate. Again the subjects’ responses were immediately scored and discussed. The subjects then read 26 sentences describing possible age relations between the names and decided whether each sentence was true, false, or indeterminate based on the information presented in the original sentences. Feedback was also given to the subjects on these decisions. The next training task required the subjects to fill in a 12 X 12 matrix which had the 12 names printed along the left and top borders. Subjects were instructed to place a checkmark in every cell in which the name from the row was older than the name from the (intersecting) column. These responses were corrected and discussed. Up to this point, the subjects had the original sentences and all other materials available for reference. However, the subjects were told at the outset that eventual memorization of the relationships would be necessary. At this point, the subjects were asked to study any or all of the materials as long as they wished, until they felt they knew the material completely. They were then given one final test in which they placed check marks on another arrangement of the 12 X 12 matrix. 
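
The true/false/indeterminate judgments and the matrix task described above both amount to working out which relations follow from the adjacent sentences by transitivity. The sketch below illustrates that logic in Python under stated assumptions: the five-element ordering is hypothetical (it is not one of the structures used in the experiment), and the names `transitive_closure` and `judge` are illustrative choices, not procedures from the study.

```python
from itertools import product

# Hypothetical five-element partial ordering (not one of the chapter's figures).
# Each pair (older, younger) stands for one adjacent relation given in a sentence.
adjacent = [("J", "R"), ("J", "D"), ("R", "P"), ("D", "G")]
names = ["J", "R", "D", "P", "G"]

def transitive_closure(pairs):
    """Return every (older, younger) pair implied by the stated pairs."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        # Iterate over a snapshot so the set can grow inside the loop.
        for (a, b), (c, d) in product(list(closure), repeat=2):
            if b == c and (a, d) not in closure:
                closure.add((a, d))
                changed = True
    return closure

older_than = transitive_closure(adjacent)

def judge(x, y):
    """Classify the sentence 'x is older than y'."""
    if (x, y) in older_than:
        return "true"
    if (y, x) in older_than:
        return "false"
    return "indeterminate"

# Matrix task: mark a cell when the row name is older than the column name.
print("  " + " ".join(names))
for row in names:
    cells = ["x" if (row, col) in older_than else "." for col in names]
    print(row + " " + " ".join(cells))
```

With this toy ordering, `judge("J", "P")` is an inference from two adjacent sentences, whereas `judge("R", "D")` is indeterminate because no chain of sentences connects the two names.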
Following the training phase, subjects retrieved information from memory about the relative age of pairs of names in the partial ordering. Pairs of names appeared on an Owens-Illinois Digi-Vu screen, and response times were recorded as subjects pressed one of two marked keys

on a keyboard under the name of the older person or pressed the space bar to indicate an indeterminate relation. Presentation of the name pairs and response records were under the control of a Nova 1220 computer. Each block of trials consisted of 78 name-pair presentations. The composition of the trials depended on the condition. For the hierarchical structure, on each block of trials all 26 possible determinate relations were presented, including 11 adjacent relations, 9 remote relations with a step size of 1, and 6 remote relations with a step size of 2. (Step size is defined by the number of elements between the two test items.) Each of these was presented once in a left-to-right order on the screen, and again in a reversed order. A subset of 26 of the 80 possible indeterminate relations was also included in each block. To equate the hierarchy condition with the irregular condition as much as possible, subjects in the latter condition were likewise tested on 26 determinate relations presented in both forward and reverse orders and 26 indeterminate relations. However, these relations do not exhaust all the possible relations in either category. Of the determinate relations, 7 were adjacent pairs, 5 had a step size of 1, 5 a step size of 2, 4 a step size of 3, 3 a step size of 4, and 2 a step size of 5. Each subject was required to complete three blocks of trials in which their error rate did not exceed 5%.

B. RESULTS AND DISCUSSION

The mean number of trial blocks required to meet the criterion was 3.7 blocks for the hierarchy group and 4.4 for the irregular group. The difference was not significant, although it was in the direction of our original hypothesis that the irregular structure is more difficult to master than the hierarchy. Response times on error trials were replaced with the mean of response times for correct trials of the same type. The response times from the irregular condition parallel other reported results: There was no evidence of a distance effect like that found for linear orders. In fact, adjacent relations, with a mean response time of 2.80 sec, produced significantly faster responses than the nonadjacent relations, with a mean of 3.11 sec, F(1, 10) = 14.7, MSe = 2.16. However, response times to pairs from the hierarchical structure showed distance effects: Response time decreased as step size increased (see Table I). Analyses showed that response time to the adjacent pairs, with a mean of 3.51 sec, was slower than to the remote pairs, with a mean of 2.31 sec, F(1, 10) = 72.6, MSe = 8.25, and that 1-step pairs, with a mean of 2.91 sec, were slower than 2-step pairs, with a mean of 1.41 sec, F(1, 10) = 69.5, MSe = 7.74. A more detailed analysis of the data for the irregular condition failed to

TABLE I
MEAN RESPONSE TIME (SEC) TO DETERMINATE PAIRS AS A FUNCTION OF STEP SIZE FOR THE HIERARCHY CONDITION

                                  Step size
Older term in pair(a)         0        1        2
J                           1.34     1.49     1.41
R                           3.40     3.52      -
D                           3.60     3.74      -
P                           4.50      -        -
G                           3.66      -        -
N                           4.06      -        -
Mean(b) (with J):           3.51     2.91     1.41
Mean (without J):           3.99     3.63      -

(a) The actual elements used in the structure were one-syllable first names. For convenience, only the first letter of the name is used. Refer to Fig. 2 for the placement of each term in the structure.
(b) The last two rows present weighted means. Note that certain names have more than one relation of a certain type, so that the means given do not always equal the means of the corresponding column of table values.

reveal any obvious trends or patterns. However, the data from the hierarchy condition suggest that the obtained distance effect consisted of two components. First, any pair of names containing J, the element at the top of the hierarchy, led to faster responding than pairs not containing J. As can be seen in Table I, this was true both for adjacent pairs, F(1, 10) = 57.7, MSe = 13.09, and for pairs of step size 1, F(1, 10) = 40.0, MSe = 14.09. (All pairs of step size 2 involve J.) The fact that J is the correct response to any pair containing it appears to confer on it a special status. Subjects can store this specific information with the term in memory and use it in making a rapid, categorical decision in much the same way that Pliske and Smith’s (1979) subjects used the gender of names in a linear order to make rapid decisions when all names of one gender preceded names of the other gender in the ordering. However, even when pairs containing J were removed from the analysis, there remained a distance effect of the kind typical for linear orderings. The last row of Table I presents the means for adjacent and remote (step size 1) pairs that did not include J. The difference was significant when tested with a comparison that was not orthogonal to those given earlier, F(1, 10) = 24.7, MSe = 1.43. A second component of the distance effect shown in Table I is a significant increase in response times to adjacent pairs from the second

level of the hierarchy (R and D) compared to the third level (P, G, and N), F(1, 10) = 7.69, MSe = 8.01. (This planned comparison is orthogonal to all but the last one presented above.) This pattern of results suggests a search process operating like the spread of activation (cf. Collins & Loftus, 1975) from the top of the hierarchy (J) and directed downward. If such a search process is assumed to terminate only when both members of a pair have been located (as would be necessary to classify a pair as indeterminate), then most of the results in Table I fall into place. For example, responses to pairs of step size 1 containing R and D (3.52 and 3.74 sec, respectively) took longer than the adjacent pairs containing these letters (3.40 and 3.60 sec, respectively). The apparent distance effect is the result of averaging response times for adjacent pairs at different levels of the hierarchy (i.e., pairs involving P, G, and N). The results in Table I can also be seen to display an effect similar to that of a propositional fan (Anderson, 1976). Response times were shorter to pairs containing R, with only one subordinate, than to D, with two. And response times to pairs containing G, N, and P, with one, two, and three subordinates, respectively, increased as expected. (The mean response time to indeterminate pairs, 4.35 sec, was significantly longer than the mean to determinate pairs, 2.82 sec, t(10) = 8.21, SE = .375; however, we could find no easily interpretable trends in the times for indeterminates.)

C. CONCLUSIONS AND IMPLICATIONS

The pattern of results from the hierarchical structure displayed a traditional distance effect (when analyzed in the usual way), whereas for the irregular structure the pattern was reversed. What conclusions should be drawn from this outcome? Following the logic of other studies of partial orderings published within the last 8 years, we might conclude that the hierarchy was represented and stored like a linear ordering, but the irregular structure was not. It is interesting to speculate on the direction research on partial ordering might have taken if Hayes-Roth and Hayes-Roth (1975) had chosen to investigate a hierarchical structure. We contend that the search for a set of conditions that lead to a distance effect with partial orders has been misdirected. The presence of a distance effect can be the result of averaging the response time data for adjacents and remotes (or even for rank-ordered distances) in ways that obscure the effects of very different variables and processes. Our data for the hierarchical structure illustrate this kind of confusion quite clearly. One hypothesis that is consistent with all the data discussed so far is a two-process model such as Pliske and Smith (1979) have suggested to explain retrieval times for linear orderings. The general form of this

model contains two components, a rapid decision process based on specific categorical information about the elements of an ordering and a slower systematic serial search process that moves from element to element along the learned connections among them. For linear orderings, an example of categorical information is the gender of names used in an ordering. The parallel in the hierarchy would be responses to pairs of names containing the topmost element (J). An example of serial search processes in linear orderings is the proposal that subjects search the ordering from the ends inward (cf. also Woocher et al., 1978). The parallel for a hierarchy is the spreading activation notion discussed above. The irregular structure may have been learned in the same way as the hierarchy (or a linear ordering, for that matter), and the retrieval processes may have been basically the same. The difference is that irregular structures do not have a small number of elements that can be uniquely categorized by their structural properties; and sequences of the serial search (or the pathways of spreading activation) may be more idiosyncratic from subject to subject, or even from trial to trial for the same subject. In effect, irregular structures such as Figs. 1B and 1C may be interpreted as situations in which only the underlying search processes are manifest in the data because the effects of other retrieval processes, especially rapid categorical decisions, have been randomized. However, if this interpretation is correct, investigations of partial orderings should have used many more irregular structures. It also follows from this interpretation that patterns of retrieval time reflect a great deal more than the incorporation of a set of relations into an integrated memory representation. Because our goal was understanding how people construct and represent partial orderings, we drew an important lesson from our first experiment. 
The time taken to retrieve comparative information from an ordering is influenced by many factors other than the process of understanding and representing the information as an integrated whole. A more appropriate methodology is needed to investigate how people combine relationships and construct integrated representations of partial orderings. There is no doubt that the process is more difficult for partial orderings than for linear orderings, but it can be done. The question is how.
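
The averaging argument made in this section can be checked against Table I with a few lines of arithmetic. The Python sketch below is an illustration, not the authors' analysis. It assumes the hierarchy implied by the text (J above R and D; R above P; D above G and N; and P, G, and N with three, one, and two subordinates, respectively), uses placeholder labels for the six bottom-level names, and takes the step-size-1 entry for D as 3.74 sec, the value given in the running text. Each cell is weighted by the number of pairs it summarizes.

```python
# Immediate subordinates in the hierarchy, inferred from the text
# (R has one, D two, G one, N two, P three); leaf labels are placeholders.
children = {
    "J": ["R", "D"],
    "R": ["P"],
    "D": ["G", "N"],
    "P": ["p1", "p2", "p3"],
    "G": ["g1"],
    "N": ["n1", "n2"],
}

# Mean response times (sec) from Table I, by older term and step size.
# The D entry at step size 1 follows the running text (3.74 sec).
table = {
    "J": {0: 1.34, 1: 1.49, 2: 1.41},
    "R": {0: 3.40, 1: 3.52},
    "D": {0: 3.60, 1: 3.74},
    "P": {0: 4.50},
    "G": {0: 3.66},
    "N": {0: 4.06},
}

def descendants_at(term, depth):
    """Elements exactly `depth` links below `term`."""
    level = [term]
    for _ in range(depth):
        level = [c for node in level for c in children.get(node, [])]
    return level

def weighted_mean(step, exclude=()):
    """Mean over pairs at one step size, weighting each term by its pair count."""
    total, n = 0.0, 0
    for term, row in table.items():
        if term in exclude or step not in row:
            continue
        k = len(descendants_at(term, step + 1))
        total += k * row[step]
        n += k
    return total / n, n

for step in (0, 1, 2):
    mean, n = weighted_mean(step)
    print(f"step size {step}: {n} pairs, weighted mean {mean:.2f} sec")

# Pooling all remote pairs (step sizes 1 and 2) into a single mean.
mean1, count1 = weighted_mean(1)
mean2, count2 = weighted_mean(2)
remote = (mean1 * count1 + mean2 * count2) / (count1 + count2)
print(f"all remote pairs: {remote:.2f} sec")
```

Under these assumptions the pair counts come out as 11, 9, and 6, and the weighted means agree with the last two rows of Table I to within rounding. The pooled adjacent and remote means mix pairs involving J with pairs at lower levels, which is the kind of averaging the text cautions against.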

V. Experiment 2: The Role of Determinacy in Constructing Partial Orderings

In reviewing research on partial orderings, we noted that a very limited number of structures have been intensively investigated. Whatever conclusions have been reached cannot really be generalized to “partial orderings,” but must be confined to these few specific structures. The obvious remedy is a systematic exploration of the domain; however, the number of possible configurations of partial orderings is surprisingly large and diverse. A feeling for this diversity can be gotten from Fig. 3, which illustrates five possible partial orderings of 12 elements. The only configuration in Fig. 3 that has been investigated previously is Fig. 3B, the ordering devised by Moeser and Tarrant (1977). At present there is no way of guaranteeing that the structures shown are representative of partial orderings, even with the restriction that exactly 12 elements be ordered. The partial orderings in Fig. 3 were chosen according to several principles derived from our intuitions about the possible sources of difficulty people have in understanding and remembering the relations in such structures. Figure 3B was included as a reference point, since it has been investigated not only by Moeser and Tarrant, who introduced it, but also by Warner and Griggs (1980) and by us (reported in Section IV). It has been clearly established that with appropriate background and procedures of presentation, college students can learn the relationships involved in this ordering and can draw the correct inferences about remote relationships (even though the speed of their performance may not correspond to that of subjects who are retrieving information from a linear ordering). The question raised here is whether people have more or less difficulty with the other structures. If so, we wanted to isolate the reasons for these difficulties.

Fig. 3. The partial orderings investigated in Experiment 2 (panels A through E).


The fundamental difference between partial and linear orderings is that the former leave the relationship between some pairs of elements indeterminate. One obvious way in which partial orderings can differ is in the extent of this indeterminacy. For example, 12 elements have 66 possible pairwise relationships. A linear ordering specifies 11 of these, which in turn determine all of the remaining 55 relationships. Thus, a linear ordering is a complete ordering; there are no indeterminate relations. By comparison, Moeser and Tarrant’s partial ordering (Fig. 3B) leaves 25 of the 55 potentially determinable relations indeterminate. The remaining four structures in Fig. 3 were chosen for study in part on the basis of their level of indeterminacy. Figure 3A is in some ways a much simpler, more orderly configuration than Fig. 3B, which has received so much attention; yet Fig. 3A has the same degree of indeterminacy. In Fig. 3A, as in Fig. 3B, specifying 11 of the 66 possible relationships between pairs of elements leaves 25 of the remaining 55 relationships indeterminate. By contrast, Fig. 3C seems intuitively very similar to Fig. 3B, but in fact has more indeterminate relations: 32 out of 55, instead of 25. Figure 3E, which is a hierarchy similar to that investigated in Section IV, has 46 indeterminate relationships, a still higher level of indeterminacy. (The hierarchy in Fig. 2 has 45 indeterminate relations.) Figure 3D was constructed to have almost the same number of indeterminate relations (45) as Fig. 3E, but to look very different, at least superficially. The five structures selected for study can be seen to possess levels of indeterminacy similar to either Moeser and Tarrant’s irregular structure or our hierarchy, with the exception of Fig. 3C, which falls in between. In fact, Fig. 3C has a little more than half indeterminate relations (58%), although it “looks” a lot more like Fig. 3B than it does like Figs. 3A, 3D, or 3E. The only difference is that in Fig. 3B, H is greater than B, whereas in Fig. 3C, H is less than A.

One further dimension distinguishes the five structures. Figures 3A, 3B, and 3C all have one long linear ordering of 7 elements, ABCDEFG, whereas in Figs. 3D and 3E, the longest chain is 3 elements long. Although the differences outlined above may not correspond to the relevant cognitive dimensions of such configurations, they seem to reflect the diversity that exists within the domain of 12-element structures.

In order to find out whether these structures differ in difficulty, we gave subjects a set of 11 statements describing adjacent relationships among the 12 elements and tested whether they could draw a diagram representing the partial ordering and answer questions about the nonadjacent relations implied by the ordering. We selected a paper-and-pencil version of the task in which all 11 specified relations were available for inspection during testing and subjects could work at their own pace. Our purpose was to find out whether people could understand partial orderings of this size and complexity, independent of the demands made on memory to retain the 11 relationships.

The present experiment was designed to determine whether there are aspects of partial orderings for which people do not have a readily accessible schema. A more global source of difficulty may be the fact that comparisons of age per se, especially among several persons or objects, are most easily made in terms of numerical values rather than rankings. Thus, people may assume that a set of age comparisons form a complete ordering. Another familiar relationship, “parent of,” implies relative age but only within broad limits. A set of sentences such as “Mary is the mother of Ted” and “Sam is the father of Rita” might be expected to lead to better understanding of a partial ordering. Although the effect of changing the sentence frame seems minimal, De Soto (1960) found that the same structure differed in difficulty depending on the sentence frame used. In the present experiment, the partial orderings were presented as either parent-of or older-than sentences, although the test sentences were age comparisons (older than) in all cases.
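The indeterminacy counts quoted in this section (66 pairs among 12 elements, none indeterminate for a linear ordering, 46 for a hierarchy such as Fig. 3E) can be checked mechanically. The following is a minimal sketch under stated assumptions, not part of the original procedure: it forms the transitive closure of a set of adjacent relations and counts the pairs left undetermined. The function names are hypothetical. Because the diagrams of Fig. 3 are not reproduced here, the second structure is an assumed three-level hierarchy (one root, two children, nine grandchildren), chosen only because it yields the count of 46 reported for Fig. 3E.

```python
from itertools import combinations


def transitive_closure(relations):
    """Return every (greater, lesser) pair implied by the adjacent relations."""
    closure = set(relations)
    changed = True
    while changed:
        changed = False
        for a, b in list(closure):
            for c, d in list(closure):
                # a > b and b > d together imply a > d.
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


def count_relations(elements, relations):
    """Return (total pairs, determinate pairs, indeterminate pairs)."""
    closure = transitive_closure(relations)
    pairs = list(combinations(elements, 2))
    determinate = sum(
        1 for x, y in pairs if (x, y) in closure or (y, x) in closure
    )
    return len(pairs), determinate, len(pairs) - determinate


elements = list("ABCDEFGHIJKL")

# A linear ordering: the 11 adjacent relations A > B > ... > L.
linear = list(zip(elements, elements[1:]))

# Assumed hierarchy (not copied from Fig. 3): A above B and C,
# B above D through H, and C above I through L.
hierarchy = [("A", "B"), ("A", "C")]
hierarchy += [("B", x) for x in "DEFGH"]
hierarchy += [("C", x) for x in "IJKL"]

linear_counts = count_relations(elements, linear)        # (66, 66, 0)
hierarchy_counts = count_relations(elements, hierarchy)  # (66, 20, 46)
print(linear_counts, hierarchy_counts)
```

In both cases 11 relations are specified. The linear ordering determines all of the remaining 55, whereas the assumed hierarchy determines only 9 of them and leaves 46 indeterminate.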

A. METHOD

The subjects were 200 undergraduate students drawn from the same source and in the same manner as for the first experiment. Five groups of 40 students were each assigned to work with one of the five partial orderings. Within each group, half of the subjects drew the diagram first, then answered questions; the other half answered questions first. The subjects in the diagram-first conditions, when given the instructions for answering questions, were told that they could refer back to their diagrams. The subjects were run in small groups of 2 to 8. Verbal instruction asked them to follow a set of printed instructions and to ignore whatever a neighbor might be doing, because each person had different materials.

Each subject received a three-page booklet containing materials and instructions. For both tasks, 11 sentences appeared at the top of the page. For half the students, the sentence had the form “Beth is older than Dave”; for the other half, the sentence had the form “Beth is the mother of Dave.” The 11 sentences were the set of adjacent relations necessary to describe one of the 12-element networks shown in Fig. 3. The six male and six female names used as elements were again chosen from Battig and Montague’s (1969) norms. The order of the sentences on the page was randomly determined.

For the diagram-drawing task, the instructions asked the subject to draw a tree or diagram which would accurately represent the age relationships among the people described by the sentences. They were told to use arrows to connect the names, with the head of the arrow pointing toward the younger person. Space for the drawing was provided at the bottom of the sheet.

For the question-answering task, 33 statements were listed below the 11 sentences defining the order, and the subject was told to read each statement and decide whether it was true, false, or indeterminate based on the information in the 11 sentences at the top. Three columns of blanks were printed next to the statements with the headings “True,” “False,” and “Can’t Tell,” allowing the subject to put a check in the appropriate column to indicate an opinion about each statement. The types of statements used with each of the five partial orderings were randomly chosen from among the 132 possible statements in proportion to the occurrence of each type. For example, for Fig. 3E there are 20 possible true statements, 20 possible false statements, and 92 indeterminate relations. Of the 33 statements tested, approximately 68% (23) were indeterminate, 15% (5) were true, and 15% (5) were false. The indeterminate statements were randomly selected from all possible indeterminates. The determinate statements were randomly selected with the restriction that the proportions of adjacent and remote relations were matched to their proportion of occurrence in the set of all possible determinates.
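The proportional selection of test statements can be illustrated with a short sketch. It is a hedged illustration rather than the procedure actually used: the function name is hypothetical, and the largest-remainder rounding rule is an assumption, since the text does not say how fractional quotas were handled. For the Fig. 3E counts given above, the quotas happen to be whole numbers, so the rounding rule does not come into play.

```python
from fractions import Fraction


def proportional_allocation(counts, sample_size):
    """Split sample_size across categories in proportion to counts.

    Fractional quotas are rounded by the largest-remainder rule, which is
    an assumption made for this sketch.
    """
    total = sum(counts.values())
    quotas = {k: Fraction(v * sample_size, total) for k, v in counts.items()}
    allocation = {k: int(q) for k, q in quotas.items()}  # whole part of each quota
    leftover = sample_size - sum(allocation.values())
    # Hand any remaining slots to the categories with the largest fractional parts.
    by_remainder = sorted(
        quotas, key=lambda k: quotas[k] - allocation[k], reverse=True
    )
    for k in by_remainder[:leftover]:
        allocation[k] += 1
    return allocation


# Counts of possible statements for Fig. 3E, as given in the text.
fig_3e = {"true": 20, "false": 20, "indeterminate": 92}
allocation = proportional_allocation(fig_3e, 33)
print(allocation)  # {'true': 5, 'false': 5, 'indeterminate': 23}
```

Because 33 is exactly one quarter of 132, each category contributes one quarter of its statements, which reproduces the 5, 5, and 23 statements reported for Fig. 3E.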

B. RESULTS AND DISCUSSION

The number of subjects in each condition who drew completely correct diagrams on the basis of 11 relationships is shown in Table II. In general, subjects tended to be more successful when the sentences described the parent-child relationship than when the same elements were related in age, 83% vs 70%, respectively; χ²(1) = 5.28, p < .05. The differences among the five structures were not significant; χ²(4) = 7.75, p < .10. There was also no indication that answering a set of 33 questions about the implications of the 11 sentences had any impact on success in drawing a diagram, or vice versa. Of the 200 subjects, 8 drew linear orderings and 4 drew two separate orderings. These errors were not confined to any one condition, however.

TABLE II
NUMBER OF SUBJECTS WHOSE DIAGRAMS WERE COMPLETELY CORRECT IN EXPERIMENT 2

                              Structure
Condition(a)           A      B      C      D      E
Diagram first
  Older                9      7      6      6      8
  Parent              10      9      9      7      9(b)
Questions first
  Older                5      8      4      8      9(b)
  Parent               8      6      8      9      8

(a) A total of 10 subjects were tested in each condition.
(b) Due to an error, two subjects, one in each of the indicated conditions, were not requested to draw a diagram.

The errors made in answering questions were compiled separately for determinate and indeterminate relations and converted to percentages, which are shown in Table III. These values were submitted to an analysis of variance in which type of question, determinate or indeterminate, was a repeated measure. The .01 level of significance was used as a criterion for discussing any comparison. The findings are presented here in order, beginning with the effect that accounted for the greatest proportion of total variance attributable to experimental manipulations.

Subjects made significantly more errors on indeterminate than on determinate relations, F(1, 180) = 107.1, MSe = 324.6, and this difference accounted for 37% of the variance. The effect was consistent across all conditions of the experiment. The five structures differed significantly in difficulty, F(4, 180) = 16.2, MSe = 306.8, accounting for another 21% of the variance. However, examination of Table III suggests that the effect of structure was different for indeterminate and determinate relations. The interaction was significant, F(4, 180) = 10.5, MSe = 324.6, and accounted for 15% of the variance.

TABLE III
PERCENTAGE OF ERRORS ON QUESTIONS ABOUT RELATIONSHIPS IN FIVE STRUCTURES TESTED IN EXPERIMENT 2(a)

                               Figure 3 structure
Type of relationship      A      B      C      D      E      Overall
Determinate               7     17     12      5      8      10
Indeterminate            25     27     51     21     13      29
Overall                  16     22     31     16     11

(a) Percentages based on different numbers of questions for different structures of Fig. 3.

The percentages in Table III make clear that our original hypothesis that difficulty is affected by the amount of indeterminacy was incorrect. Subjects made the greatest number of errors on Fig. 3C, in which the level of indeterminacy is intermediate between Figs. 3A and 3B (with relatively few indeterminate relations) and Figs. 3D and 3E (with many). For Fig. 3C, questions about indeterminate relationships seemed to pose the greatest difficulty. The only subjects who failed to get any indeterminate questions correct were, with a single exception, in this condition (five subjects out of six). Figure 3D was more difficult than Fig. 3E because the indeterminate relations led to more errors, but Moeser and Tarrant’s structure (Fig. 3B) was more difficult than Fig. 3A because the determinate relationships produced more errors. Long-chain structures (Figs. 3A, 3B, and 3C) did not emerge as strikingly more difficult than short-chain structures (Figs. 3D, 3E). Structures with certain elements connected to many other elements (especially Fig. 3E and, to a lesser extent, Fig. 3D) were not different from structures with little multiple connectedness (e.g., Fig. 3A). (The average number of elements connected to a given element, a measure of the fan effect (Anderson, 1976), is the same for all the Fig. 3 structures.)

The effect of the sentence frame on the question-answering data was consistent with the subjects’ success in drawing correct diagrams. Fewer errors were made on questions about partial orderings based on “parent” sentences than on “older” sentences, F(1, 180) = 13.5, MSe = 306.8. This significant difference, which accounted for 4.4% of the total variance, was primarily due to the reduced number of errors on questions about indeterminate relations when “parent” sentences were used. This interaction was also significant, F(1, 180) = 9.4, MSe = 324.6, accounting for 3.3% of the variance.

Finally, the only effect of procedure was on determinate questions. Fewer errors were made on determinates when subjects had first drawn a diagram of the ordering (6%) than when they answered questions first (14%), whereas errors on indeterminates were unaffected (29 and 28% error rates for diagram-first and questions-first conditions, respectively). This interaction accounted for only 2.5% of the variance but was significant, F(1, 180) = 7.3, MSe = 324.6. One interpretation of this result is that a diagram does not aid the subjects’ understanding of indeterminate relations, although it helps in making inferences about remote relationships. However, subjects received no feedback on the diagrams they made. Thus, their errors on questions after drawing a diagram may have been a reflection of their initial misunderstandings in composing a diagram.

C. CONCLUSIONS

The results of this experiment show that partial orderings can differ considerably in difficulty. By difficulty we refer to the problems people have in understanding the implications of a set of relationships that does not assign each element a unique rank. The results for both diagram drawing and question answering support the contentions of earlier investigators that the necessary schema for representing partial orderings is not very salient.

Our subjects generally did better on both diagrams and questions when the sentences described parent-child relations than when they simply stated that one individual was older than another. We assume that parent-child relationships awaken a schema of age ordering that admits indeterminacies. That subjects in the parent-child condition did better on questions about the relative age of members of a family and that the principal gain in performance was in answering indeterminate questions are especially compelling evidence for the importance of the schema. These results also indicate that our subjects, who were college students, did not lack the appropriate schema altogether. Rather, they seemed to be much less likely to use it with pure age relations. Instead of a predilection for linear orderings in all situations, people appear to gravitate toward the linear ordering in situations where both (a) complete ranking is possible and (b) interval measurement makes sense.

Although our results indicate that much of the difficulty with partial orderings has to do with indeterminate relations, no conclusion about the importance of various structural variables seems possible. In particular, the amount of indeterminacy (the proportion of indeterminate relationships among the unspecified ones) in a partial ordering does not affect its difficulty in any simple, monotonic way. (The only hypothesis consistent with our data is that difficulty increases as the ratio of indeterminate to determinate relations approaches unity. However, even this idea is limited; structures with roughly the same ratios of indeterminacy showed differences in accuracy of question answering.) One important lesson about structural variables may be gleaned from our results. There is no reason to suppose that a haphazardly selected configuration of partially ordered relationships is representative of such structures. It is important to stop generalizing about “partial orderings” and begin a more systematic search for the processes people use in integrating and understanding this kind of information.

VI. Experiment 3: Node Construction

Presentation order is another factor that has been suggested as a possible explanation for the difficulty people have in correctly integrating the components of a partial ordering. Moeser (1979), for example, compared three groups of subjects (Experiment 3), one which learned a partial ordering and two which learned a linear ordering. One of the latter groups was exposed to the sentences describing relationships (specifically, “Todd is older than Herb”) using what Moeser labeled a “match” order, that is, A > B, B > C, C > D, . . . , M > N. The other linear order group received the sentences in a “nonmatch” order so that the ninth sentence introduced two names not previously mentioned (a nonmatch situation). Three more sentences were given before a sentence referred again to a name in the seventh sentence, and not until the twelfth sentence was there sufficient information to complete a 13-element ordering. All subjects received instructions on how to represent partial and linear orderings with diagrams and how to answer questions about them correctly. Moeser found that subjects in the linear match condition did better than those in the other two conditions, which did not differ. On the basis of this finding, she concluded that some of the difficulty in integrating and understanding a set of partially ordered relationships is due to the fact that partial orderings must always be presented in nonmatch orders.

In spite of its plausibility, Moeser’s argument does not really explain either her results or the difficulty of partial orderings. The terms “match” and “nonmatch” are taken from a theory of linear order construction proposed by Foos et al. (1976). Detailed application of this theory to the presentation orders Moeser used reveals that they are not of comparable difficulty. In the Foos et al. theory, match orders require the subject to add new elements to the ordering by one of two processes, both of which involve locating a single element in common between the ordering previously constructed and the new relationship. (This common element is the “match.”) The M1 process detects a match at the end of a previously constructed ordering. For example, given B > C > D, the relation D > E is added by the M1 process. The M2 process detects a match at the beginning of a previously constructed ordering, so that given B > C > D, the relation A > B is added by the M2 process, which has been shown to be slightly more difficult than the M1 process. A nonmatch order in the Foos et al. theory requires at least one nonmatch situation, for example, when A > B > C > D is established and E > F is to be added eventually (but must be held temporarily separate in memory).

There is no reason why a partial ordering has to be presented in a nonmatch order. For example, in Moeser’s 16-element ordering, once the central linear ordering has been presented, the remaining comparisons may be added by processes comparable to M1 and M2. For example, given the previously constructed chain B > C > D > E, branches can be added by relations such as C > Q (parallel to M1) and X > D (parallel to M2). While Foos et al. did not consider such node-construction processes, the process of locating a common element (the “match”) would be the same. Whether a node is more difficult to construct than a line is an empirical question, which the present experiment considered.

Moeser did, in fact, introduce a nonmatch situation in presenting the partial ordering. However, the nonmatch was not introduced at precisely the same point in the presentation order for partial and linear orderings. For the partial ordering, the first true nonmatch occurred on the eleventh comparison and was immediately followed by a resolution (or “double-match,” D1, process). A recent study by Foos and Sabol (1981) has reported that as the number of comparisons between the nonmatch and its resolution increases, performance on nonmatch orders declines. Whereas the nonmatch was immediately resolved in the partial ordering (no intervening comparisons), three comparisons intervened in Moeser’s linear nonmatch condition. The Foos and Sabol data were based on shorter, simpler orderings (strings of six letters to be constructed from five letter pairs), so the results may not be strictly comparable. However, the point is that Moeser’s three conditions confound a number of variables known to affect the difficulty of constructing linear orderings. The linear match condition will obviously be the easiest, not only because a match order was used but also because only the M1 process was required. The linear nonmatch condition has a substantial delay following the nonmatch, making it more difficult than the partial ordering; but the latter contains match processes that result in nodes, the difficulty of which is unknown. The difference between M1 and M2 processes suggests that other match processes are unlikely to be as easy as M1.

A good deal of the confusion about the difficulty of different presentation orders is probably due to the fact that subjects cannot master a 14-element ordering of any kind in one exposure. Beginning with the second repetition, the classifications of Foos et al. do not apply, strictly speaking. The subjects have heard all the elements once, so no comparison can pose a nonmatch of the kind envisioned by the theory. Moreover, a question such as the relative difficulty of adding a node is best posed in designs similar to those used in unraveling the effects of presentation order in constructing linear orderings. Therefore, in the remaining experiments, we limited our attention to orderings with only a few elements and explored a wide range of presentation orders. We have not been exhaustive, however, as were Foos et al. The reason is that partial orderings create a much more extensive range of possibilities, even with orderings of only 5 elements.

In the first experiment, we compared the difficulty of constructing either a linear order or one of the three partial orders shown in Fig. 4. The presentation orders are given in Table IV along with an analysis of the construction processes specified by the Foos et al. theory. All presentation orders begin with three sentences that can be combined to form a four-term linear ordering. Half the presentation orders in Table IV accomplished this with process M1 (orders 1, 2, 5, 7, and 9) and the other half with process M2 (orders 3, 4, 6, 8, and 10). The last sentence in each presentation order either completed a five-term linear ordering or resulted in one of the networks. Completing a linear ordering required either process M1 (orders 1 and 3) or process M2 (orders 2 and 4). In the remaining 6 presentation orders, the last sentence required the formation of a “node” or branching point in the mental representation. The process of adding a branch or creating a node was labeled process Nd.

Fig. 4. The three partial orderings constructed by subjects in Experiment 3. They differ in the location of the node or branch relative to the four-term linear ordering ABCD.

TABLE IV
MEAN PROPORTION OF CORRECT TRIALS AS A FUNCTION OF PRESENTATION ORDER IN EXPERIMENT 3

                            Constructive process(b) involved
Presentation           After second   After third   After fourth   Mean proportion
order(a)               sentence       sentence      sentence       correct

Linear orderings
 1. AB,BC,CD,DE        M1             M1            M1             .62
 2. BC,CD,DE,AB        M1             M1            M2             .43
 3. CD,BC,AB,DE        M2             M2            M1             .35
 4. DE,CD,BC,AB        M2             M2            M2             .42
Networks
 5. AB,BC,CD,AE        M1             M1            Nd             .42
 6. CD,BC,AB,AE        M2             M2            Nd             .45
 7. AB,BC,CD,BE        M1             M1            Nd             .33
 8. CD,BC,AB,BE        M2             M2            Nd             .29
 9. AB,BC,CD,CE        M1             M1            Nd             .51
10. CD,BC,AB,CE        M2             M2            Nd             .28

(a) For convenience, all presentation orders describe the linear ordering ABCDE or one of the three networks shown in Fig. 4.
(b) Constructive processes are identified by the abbreviations of Foos et al. (1976).

A. METHOD

A total of 24 sets of four sentences were tape-recorded and presented to subjects at a rate of one sentence every 6 sec followed by the spoken signal “Recall.” After a pause of approximately 25 sec for the subject to write down the constructed array, another set of sentences began. One block of 12 sets of sentences used the form “A is the father of B,” and a second block used sentences of the form “A is older than B,” where A and B were chosen from the following set of five names: Bob, John, Fred, Paul, Tom. Within each block, half of the sets described a five-term linear array and half described one of the three types of five-term networks shown in Fig. 4. Presentation orders 2 and 3 occurred twice in each block, and the remaining eight orders occurred once.

The subjects were 36 undergraduate students recruited in the same way as for the first two experiments. They were tested in small groups varying in size from 2 to 7. The instructions read to the subjects included examples of arrays and practice trials with feedback. Half of the subjects received the block of trials with father-of sentences first; the other half received the block with older-than sentences first. Two different randomizations of each set of trials were recorded on magnetic tape, and each group of subjects received a different combination of conditions and randomizations. Responses were written on sheets containing two unequal columns of eight empty rectangles with connecting lines. The subjects were told that the boxes represented up to five generations, with the possibility of one or two persons belonging to each generation.

B. RESULTS AND DISCUSSION

Each trial was scored on the basis of whether all five names were correctly placed on the response sheet. The proportion of completely correct orderings constructed was analyzed in an analysis of variance using the .05 level of significance unless otherwise indicated. The relational term used in describing the lines and networks had no effect. Subjects’ performance was also not influenced by the order in which they were tested on the two relational terms, and none of the interactions involving these effects of sentence content was significant.

The mean proportion of linear orderings correctly constructed was not significantly greater than the mean proportion of partial orderings (.38), although it cannot be concluded with certainty that there is no difference, F(1, 35) = 2.91, MSe = .23, p < .10. Type of structure did not interact with any other factors in the design. The failure to find significant semantic or structural effects may be due to the task and, in particular, to the response sheets used. For example, older-than sentences describing networks might not be particularly difficult to understand and process in the context of a diagram that specifically implies the possibility of a node or branch. The evidence does not rule out the conclusion that even with considerable procedural support, networks may be more difficult to construct than lines. An alternative interpretation of this outcome in terms of the constructive processes involved is given below.

The mean proportion of correct trials for the 10 orders of presentation is given in Table IV. These data were analyzed using a set of orthogonal planned comparisons plus several additional nonorthogonal comparisons. Overall, trials requiring mainly process M1 (orders 1, 2, 5, 7, and 9) resulted in more correct responses than trials requiring mainly process M2 (orders 3, 4, 6, 8, and 10). The mean proportion correct for the former was .46 and for the latter .37, F(1, 35) = 12.87, MSe = .174. This difference is completely consistent with earlier findings by Foos et al. (1976) and Mynatt and Smith (1977). However, detailed analysis of the 4 presentation orders describing a linear ordering suggests that difficulty did not increase in a perfectly linear fashion with number of M2 processes. For example, order 4, with three M2 processes, was easier than order 3, with only two. Similar discrepancies in orders 5 through 10 suggest that the sequence in which the processes are executed may be important, although neither Foos et al. nor Foos and Sabol reported any decrement in performance when the matching elements occur in relations separated in the presentation order by several other comparisons; for example, B > C, C > D, D > E, A > B was no more difficult than B > C, A > B, C > D, D > E. However, these studies did not examine partial orderings.

The results suggest that working from the top to the bottom of a line and then forming a node at the bottom (order 9) was one of the easiest presentation orders, whereas working in a reverse direction and forming the node at the bottom (order 10) was the most difficult. This pattern appeared to be reversed for orders 5 and 6. The most difficult sequence was one in which the node had to be formed in the middle of a previously constructed linear ordering (orders 7 and 8 compared to orders 5, 6, 9, and 10), F(1, 35) = 7.21, MSe = .135. One possible explanation for this pattern of results is based on the assumption that process Nd involves a search through previously stored elements in the representation to locate one that matches (a basic component of the match processes, M1 and M2). In such a search process, a middle element might be less salient than one on the end. However, on this line of reasoning, the salience of the “middle” term C in order 9 would have to be explained by the recency of its occurrence in the preceding sentence.

When the results for both linear and partial orderings are considered together, we conclude that process Nd is no more difficult than process M2; but, like M2, Nd is more difficult than M1. The mean proportion correct for orders involving M1 on the last sentence (orders 1 and 3) was .49 as compared to .42 for orders involving M2 on the last sentence (orders 2 and 4) and .38 for orders involving Nd on this sentence. Nonorthogonal comparisons support this conclusion. A comparison of all orders involving M1 on the last sentence to those involving M2 or Nd was significant, F(1, 35) = 6.69, MSe = .250. A comparison of orders involving M2 on the last sentence to those involving Nd was not significant, F(1, 35) = 1.48, MSe = .174.

The preceding account makes clear that process Nd, which is a fundamental process in integrating a partial ordering, is basically a match process in the terminology of Foos et al. (1976). That is, after a pair of elements, AB, has been stored, the processing of a new relationship, AC, requires that the subject find the two occurrences of the element A in order to form a node. Foos et al. argued that process M2 involves an additional rearrangement of the terms that process M1 does not. Similarly, process Nd may involve a mental rearrangement of comparable difficulty. The implication is that partial orderings may be more difficult to construct than linear orderings because all the possible orders of presentation require some degree of mental rearrangement, whereas linear orderings can be presented in an optimal order requiring few, if any, such rearrangements.
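The assignment of constructive processes to presentation orders in Table IV follows a simple rule, which can be sketched in code. The sketch is one possible reading of the descriptions in this section, not the procedure of Foos et al. (1976); the function name and the label “other” are hypothetical. Each sentence XY is read as “X is above Y,” and the label depends on which element is already stored and whether that element already has a neighbor on the side where the new element is to be attached.

```python
def classify_order(sentences):
    """Label the process needed for each sentence after the first.

    Each sentence is a two-letter string "XY", read as "X is above Y".
    """
    above = {}   # element -> set of elements directly above it
    below = {}   # element -> set of elements directly below it
    seen = set()
    labels = []
    for index, (upper, lower) in enumerate(sentences):
        if index > 0:
            if upper in seen and lower in seen:
                labels.append("other")   # joins stored pieces; not treated here
            elif upper in seen:
                # The new element goes beneath a stored one: this extends the
                # end of a line (M1) unless something is already beneath it,
                # in which case a diverging node is formed (Nd).
                labels.append("Nd" if below.get(upper) else "M1")
            elif lower in seen:
                # The new element goes on top of a stored one: this extends
                # the beginning of a line (M2); other cases are not treated.
                labels.append("other" if above.get(lower) else "M2")
            else:
                labels.append("N")       # nonmatch: no element in common
        below.setdefault(upper, set()).add(lower)
        above.setdefault(lower, set()).add(upper)
        seen.update((upper, lower))
    return labels


# The ten presentation orders of Table IV.
TABLE_IV_ORDERS = {
    1: "AB BC CD DE", 2: "BC CD DE AB", 3: "CD BC AB DE", 4: "DE CD BC AB",
    5: "AB BC CD AE", 6: "CD BC AB AE", 7: "AB BC CD BE", 8: "CD BC AB BE",
    9: "AB BC CD CE", 10: "CD BC AB CE",
}
labels = {n: classify_order(order.split()) for n, order in TABLE_IV_ORDERS.items()}
for n in sorted(labels):
    print(n, labels[n])
```

Applied to the ten orders, this rule gives the same labels as Table IV: the second and third sentences call for M1 or M2, and the fourth calls for M1, M2, or Nd. A sentence that shares no element with those already presented, such as CD following AB, is labeled N.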

VII. Experiment 4: Diverging and Converging Nodes

The third experiment, described in Section VI, was deliberately designed to avoid the full range of complexity that is possible with networks. Two important omissions are addressed in Experiment 4. First, one of the largest and most reliable effects found in the work on constructing linear orders has been the difference between match and nonmatch orders (see Foos et al., 1976; Mynatt & Smith, 1977; Smith & Foos, 1975). However, the orders of presentation in the third experiment were what Foos et al. designated “match” orders. In the fourth experiment, a limited number of nonmatch orders were introduced in order to explore the generality of this important effect.


The second omission in Experiment 3 has to do with the type of network structures considered. In all cases, they had nodes that opened downward in a diverging manner. For example, a father, A, could be described as having two sons, B and C. However, it is also possible to construct nodes in which two terms converge on a third term. For example, the information that A and B are the parents of C could be represented in a converging node. These two types of nodes convey somewhat different information, and the process of forming a converging node, which we shall now refer to as process Nc, may be psychologically different from the process of forming a diverging node (process Nd) investigated in the third experiment. In the present experiment, all possible orders of presentation were incorporated into a design investigating the four network structures shown in Fig. 5. As can be seen, all of the networks involve four elements and contain one node. The Fig. 5 structures have been classified as either convergent or divergent, and as symmetrical or asymmetrical. It is important to keep in mind that these labels refer to the “shape” or configuration of the network. In what follows, we discuss the relationship between shape and the order of presentation and argue that the effects of the latter on constructive processes are more important than the shape of the resulting network.
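The two classifications applied to the Fig. 5 structures can be expressed as a small check on the three relations that define a network. This is an illustrative sketch only. The function name is hypothetical, and the symmetry rule used here (the two branches that meet at the node are equally long) is an assumed reading of the labels attached to the presentation orders in Table V, not a definition given in the text.

```python
def describe_network(sentences):
    """Classify a four-term, one-node network given as strings "XY" (X above Y)."""
    above, below = {}, {}
    for upper, lower in sentences:
        below.setdefault(upper, []).append(lower)
        above.setdefault(lower, []).append(upper)

    def chain_length(start, links):
        # Count the further elements reached by following single links.
        length = 0
        while len(links.get(start, [])) == 1:
            start = links[start][0]
            length += 1
        return length

    diverging = [n for n, xs in below.items() if len(xs) == 2]
    converging = [n for n, xs in above.items() if len(xs) == 2]
    if diverging:       # one element with two elements directly below it
        kind, links, branches = "divergent", below, below[diverging[0]]
    elif converging:    # one element with two elements directly above it
        kind, links, branches = "convergent", above, above[converging[0]]
    else:
        raise ValueError("the relations contain no node")

    arms = [chain_length(x, links) for x in branches]
    shape = "symmetrical" if arms[0] == arms[1] else "asymmetrical"
    return kind, shape


# One presentation order from each group listed in Table V.
examples = {
    "AC BC CD": describe_network("AC BC CD".split()),
    "AB BD CD": describe_network("AB BD CD".split()),
    "BC BD AB": describe_network("BC BD AB".split()),
    "AC CD AB": describe_network("AC CD AB".split()),
}
for order, label in examples.items():
    print(order, label)
```

Under these assumptions, the four example orders are classified as convergent symmetrical, convergent asymmetrical, divergent symmetrical, and divergent asymmetrical, respectively, which agrees with the groupings in Table V.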

METHOD

A total of 48 sets of three sentences were tape-recorded and presented to subjects at a rate of one sentence every 6 sec followed by the signal “Recall.” After a pause of approximately 24 sec for the subject to write down the constructed array, another set of sentences began. The sets of sentences were presented in two blocks of 24. Within each block, half of the sentences described divergent networks and half described convergent networks. All possible presentation orders were used; these are shown in Table V. Within each block, the 24 sets of sentences were randomly ordered. Half of the subjects received one randomization first, and half received the other randomization first. The sentences were of the form “A is the mother of C” or “B is the father of C,” where elements were chosen from the following set of names: Ann, Kate, Sue, Bob, Mike, Jim. After elements had been randomly assigned to arrays, the correct relation, “mother of” or “father of,” was added and alterations were made in cases where two names of the same sex converged on a term.

The subjects were 30 students from the same source as in previous experiments. The subjects were tested in small groups varying in size from 2 to 6. They were read a set of instructions which included examples of arrays and were given practice trials with feedback. The answer sheets used by the subjects contained six rectangular boxes arranged in two columns of three for each array. The subjects were instructed to write the names in any space they wished (although they were encouraged verbally and through examples to use an oldest-to-youngest order) and to connect the names by arrows pointing from parents to children. An answer was scored as correct only if all four names were recalled and correctly connected by arrows.

[Figure 5: four network diagrams over the elements A, B, C, and D, with columns labeled Divergent and Convergent and rows labeled Symmetrical and Asymmetrical.]

Fig. 5. The four types of four-term partial orderings studied in Experiments 4 and 5. Divergent and convergent orderings differ in the orientation of the node (down and up, respectively). “Symmetrical” and “asymmetrical” refer to the configuration of the entire network.

TABLE V
MEAN PROPORTION OF CORRECTLY CONSTRUCTED NETWORKS AS A FUNCTION OF PRESENTATION ORDERS IN EXPERIMENTS 4 AND 5

                        Constructive processes involved (a)
Presentation order      After second sentence    After third sentence

Convergent, symmetrical networks (b)
 1. AC,BC,CD            Nc                       M1-1
 2. AC,CD,BC            M1                       Nc
 3. CD,AC,BC            M2                       Nc
Convergent, asymmetrical networks
 4. AB,BD,CD            M1                       Nc
 5. BD,AB,CD            M2                       Nc
 6. BD,CD,AB            Nc                       M2-2
 7. CD,BD,AB            Nc                       M2-2
 8. AB,CD,BD            N                        Nc
 9. CD,AB,BD            N                        Nc
Divergent, symmetrical networks
10. BC,BD,AB            Nd                       M2-1
11. AB,BC,BD            M1                       Nd
12. BC,AB,BD            M2                       Nd
Divergent, asymmetrical networks
13. AC,CD,AB            M1                       Nd
14. CD,AC,AB            M2                       Nd
15. AB,AC,CD            Nd                       M1-2
16. AC,AB,CD            Nd                       M1-2
17. AB,CD,AC            N                        Nd
18. CD,AB,AC            N                        Nd

Mean proportion correct (columns headed Experiment 4 and Experiment 5), given in the reading order of the source; the assignment of individual values to rows is not recoverable here:
.77 .75 .76
.84
.87 .70 .72 .77 ,123
.82 .69 .78 .76 .68 .69
.85 .77 .88
.85 .81 .79
.87 .75 .95
.91
.43
.92 .65 .63
.85 .79
.78 .85 .78 .74 .69

(a) See the text for a complete discussion.
(b) For symmetrical networks, each order of presentation consisted of two different presentation orders that cannot be logically distinguished, that is, order 1 could be either AC,BC,CD or BC,AC,CD. Both equivalent orders were presented to subjects, and the mean appears in the table.

B. RESULTS AND DISCUSSION

The data from the fourth experiment were analyzed in the same manner as those of the third. The mean proportion of correctly constructed family trees is shown in Table V. Overall, the mean proportion of correctly constructed divergent structures (.81) was significantly greater than the mean proportion of convergent structures (.69), F(1, 29) = 30.14, MSe = .195. Because the principal difference between the two types of structures is which of the two constructive processes, Nd or Nc, was required, it appears that converging nodes are more difficult to construct than diverging nodes. Although this conclusion seems intuitively correct, the explanation is by no means obvious. A converging node in a family tree constrains the entries, because the two parents cannot be of the same sex, whereas diverging nodes do not have this constraint. This additional constraint might appear to aid, rather than hinder, the subject in constructing a converging node. Perhaps the best explanation of the difference is that because family trees have a definite direction descending from older to younger generations, converging nodes in some sense extend “upwards” (or in the opposite direction). Process Nc appears to involve a mental rearrangement process similar to that involved in process M2. Experiment 5 was designed to deal, at least indirectly, with this issue.

Overall, the two asymmetrical structures, with a mean proportion correct of .70, were significantly more difficult to construct than the symmetrical ones, with a mean of .80, F(1, 29) = 29.09, MSe = .104. However, a careful consideration of the cognitive processes involved in constructing the two types of networks reveals a confounding. The center columns of Table V present an analysis of the constructive processes required in each order of presentation. Of interest at the moment is the presence of process N for the second sentence in orders 8, 9, 17, and 18. These orders result in a nonmatch situation following the presentation of the second sentence, according to Foos et al. (1976). That is, the first two sentences in these orders introduce all four elements of the network, but there are no matches between duplicate elements. The subject is forced to maintain two separate relationships and await the information in the third sentence. Foos et al. argued that the nonmatch situation increases the difficulty of construction. Note that a nonmatch situation can only occur in the asymmetrical structures of this experiment. A planned comparison revealed that orders requiring process N (orders 8, 9, 17, 18) were significantly more difficult than the other orders of presentation for asymmetrical structures (orders 4-7 and 13-16), F(1, 29) = 110.51, MSe = .169. We would, therefore, argue that the “symmetry” of constructed networks does not make them easier to construct. Rather, asymmetrical networks may appear to be more difficult when nonmatch orders of presentation are included in the design.

The difference between match and nonmatch orders of presentation also displayed an interaction with the type of node; the difference between match and nonmatch orders was greater for convergent than for divergent structures, F(1, 29) = 13.40, MSe = .151. The foregoing interaction appeared to be the result of the unique difficulty of order 9. The difference between orders 8 and 9 was significantly greater than the difference between 17 and 18, F(1, 29) = 6.43, MSe = .127. Although previous research with linear ordering has found that presentation orders like orders 9 and 18 are more difficult than orders resembling orders 8 and 17, the explanations offered appear to have no bearing on the extreme difficulty of order 9. This issue is further discussed in Section VIII.
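The nonmatch criterion behind the planned comparison can be stated mechanically: an order is nonmatch when its first two pairs share no element. The Python sketch below applies that test to the 18 presentation orders listed in Table V. The names ORDERS and is_nonmatch are illustrative and do not come from the original report; each two-letter pair is read as parent followed by child.

```python
# Presentation orders 1-18 as listed in Table V.
# Each pair "XY" is read as "X is a parent of Y".
ORDERS = {
    1: ["AC", "BC", "CD"], 2: ["AC", "CD", "BC"], 3: ["CD", "AC", "BC"],
    4: ["AB", "BD", "CD"], 5: ["BD", "AB", "CD"], 6: ["BD", "CD", "AB"],
    7: ["CD", "BD", "AB"], 8: ["AB", "CD", "BD"], 9: ["CD", "AB", "BD"],
    10: ["BC", "BD", "AB"], 11: ["AB", "BC", "BD"], 12: ["BC", "AB", "BD"],
    13: ["AC", "CD", "AB"], 14: ["CD", "AC", "AB"], 15: ["AB", "AC", "CD"],
    16: ["AC", "AB", "CD"], 17: ["AB", "CD", "AC"], 18: ["CD", "AB", "AC"],
}


def is_nonmatch(order):
    """True when the first two pairs have no element in common."""
    first, second = order[0], order[1]
    return not (set(first) & set(second))


nonmatch_orders = [number for number, order in ORDERS.items() if is_nonmatch(order)]
print(nonmatch_orders)  # [8, 9, 17, 18]
```

Under this reading, only orders 8, 9, 17, and 18 qualify, and all four describe asymmetrical networks, which agrees with the statement that a nonmatch situation cannot arise for the symmetrical structures of this experiment.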
The unusual difficulty of order 9 probably also accounted for a significant interaction between symmetrical-asymmetrical and convergent-divergent networks, F(1, 29) = 5.94, MSe = .187.

Within all match orders, a comparison of orders requiring process M1 to construct a line with those requiring process M2 showed the expected difference favoring M1, F(1, 29) = 6.32, MSe = .149. This result, of course, replicated previous findings with linear orders. The remaining match processes shown in Table V involve adding the last element of the network to a previously constructed node. This situation can be analyzed into two factors. The first factor is locus of the attachment point: The new element can be added to the top or the bottom of the three-element branching construction. This distinction is identical to that between processes M1 and M2, and in Table V, these processes retain the designations “M1” and “M2.” However, a qualifying numeral has been added to indicate the second factor, which is whether one or two points of attachment are available. To see more clearly what is at stake here, consider order 1 in Table V. A converging node is first formed from A > C and B > C. The third sentence, C > D, then requires attachment of D to the bottom of the network (paralleling process M1), but just one point of attachment is available at the bottom, namely, C. Now, consider orders 15 and 16. From A > B and A > C, the subject can form a diverging node. Again, the third sentence requires that the fourth element in the network, D, be attached to the bottom (paralleling process M1 again); however, two possible points of attachment, B and C, are available. The potentially different processes are tentatively labeled processes M1-1 and M1-2.

However, a contrast comparing orders involving processes M1-1 and M2-1 with those involving processes M1-2 and M2-2 was not significant, F(1, 29) < 1; the difference very slightly favored the latter orders, with two available points of attachment. A second contrast comparing orders involving processes M1-1 and M1-2 with orders involving processes M2-1 and M2-2 also failed to reach significance, F(1, 29) = 2.75, MSe = .128, although attaching an element to the bottom (M1) was slightly easier (mean proportion .85) than attaching an element to the top (M2) of a tree (.82). (The interaction of these two comparisons is a contrast of a subset of orders describing converging and diverging orders. This contrast, which is not orthogonal to similar ones described above, was significant, as would be expected.) The preceding indicates that converging and diverging networks differ subtly in ways other than the requirements of processes Nc and Nd. However, the relevant analyses suggested that the confounded variables account for little of the variance in the data collected in Experiment 4.
The failure to find a significant difference between processes M1 and M2 for partial orderings is somewhat disappointing. However, the difference between processes M1 and M2 for linear orders has proven statistically elusive and always accounts for a small fraction of the variance that is due to orders of presentation.
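The two factors used in Experiment 4 to label the attachment of a fourth element to an existing node, locus (bottom vs top) and number of available attachment points, can be made concrete with a short Python sketch. It is an illustration under one reading of Table V, not a procedure taken from the original study. The function name attachment and the dictionary CASES are hypothetical, and an "available point" is assumed to be any member of the node that has no child yet (for a bottom attachment) or no parent yet (for a top attachment).

```python
def attachment(node_pairs, new_pair):
    """Label the attachment of a fourth element to a three-element node.

    node_pairs: the two pairs that already form the node, e.g. ["AC", "BC"].
    new_pair:   the third pair, read as parent followed by child.
    Returns "M1-k" for a bottom attachment or "M2-k" for a top attachment,
    where k is the number of candidate attachment points at that end.
    """
    parents = {pair[0] for pair in node_pairs}
    children = {pair[1] for pair in node_pairs}
    members = parents | children
    if new_pair[0] in members:
        # The new element is the child, so it goes on the bottom;
        # candidates are members that have no child yet.
        candidates = members - parents
        return f"M1-{len(candidates)}"
    # Otherwise the new element is the parent, so it goes on top;
    # candidates are members that have no parent yet.
    candidates = members - children
    return f"M2-{len(candidates)}"


# The six orders in Table V whose third sentence attaches to a node.
CASES = {
    1: (["AC", "BC"], "CD"),
    6: (["BD", "CD"], "AB"),
    7: (["CD", "BD"], "AB"),
    10: (["BC", "BD"], "AB"),
    15: (["AB", "AC"], "CD"),
    16: (["AC", "AB"], "CD"),
}

labels = {order: attachment(node, new) for order, (node, new) in CASES.items()}
print(labels)
```

For order 1 the only member without a child is C, giving M1-1, whereas for orders 15 and 16 both B and C are open, giving M1-2, as in the worked example in the text.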

VIII. Experiment 5: The Role of the Schema

The two preceding experiments make clear that college students can readily learn simple network structures when the component relations are presented in the proper context (i.e., as a family tree) and in an order that does not make too great a demand on memory. Both experiments demonstrate the importance of presentation order and its effect on successful construction. Moreover, the constructive processes involved in constructing partial orderings appear to have much in common with those used in linear order construction. However, the family tree schema is not only well known but also has a conventional orientation. That is, most people find it natural to conceive of a family tree as a network with a top (usually headed by a significant ancestor) and a bottom (representing the most recent generation). Very infrequently, the network is conceptualized as a system of roots with the ancestors at the bottom. Whatever the specific representation, there is a specific, conventional format for organizing a family tree. In the last experiment, we explored the process of constructing a network with a much vaguer format and set of conventions for representation.

The fifth experiment used an adaptation of a task used by Hayes (1965) to study problem-solving strategies. During one phase of his procedure, Hayes had subjects learn configurations of code names forming “spy rings” in which not all spies could talk to each other. The subjects learned the configurations by studying a list of the pairs of names making up the spy ring. In the present experiment, a similar cover story was used to present the same four-term convergent and divergent structures used in the fourth experiment. Numbers were used instead of code names, and each spy could pass messages in only one direction. The stricture that spies could pass messages in only one direction was added to simulate the directional nature of the relations “older than,” “father of,” and “mother of.” The relations are directional in the sense that correct recall required that one term be marked or tagged as the older (vs the younger) person or source (vs the receiver) of messages. The term “direction” is here intended to be distinct from orientation. Orientation applies to the complete memory representation, whereas direction applies to the relation between any two terms.
Thus, by changing the context in which the relations were presented in the last experiment, we attempted to influence the subject’s selection of a representational schema while maintaining the directional character of individual relations.

A. METHOD

The method of the fifth experiment was the same as that of the fourth with the following exceptions: The four elements of the network to be presented on each trial were randomly selected from the set of 10 digits. The subjects were told that the numbers stood for spies and that their task was to figure out spy networks from the three pairs of digits which would be read. The digits were read simply as pairs, for example, “3, 8,” with no sentence frame. The instructions indicated that the first-named spy could pass messages to the second-named spy, but messages could not go in the reverse direction. When subjects heard the verbal signal “Recall,” they were to draw the networks on an answer sheet containing two columns of three small circles for each trial. The numbers were to be placed in the circles and arrows drawn between the circles indicating in which direction spies could pass messages (see Fig. 6). Each of the 34 subjects received a total of 48 networks presented at a rate of 4 sec per relation. (Pairs can be read at a faster rate than sentences and still be comfortably understood.)

B. RESULTS AND DISCUSSION

The mean proportion of trials on which the correct spy network was produced was analyzed in the same way as in Section VII,B, and the values for each presentation order are shown in the last column of Table V. In the interest of clarity, the outcome of the analysis will be presented in the same order as Section VII,B. In contrast to Experiment 4, the difference between convergent and divergent networks (processes Nc and Nd, respectively) was not significant, F(1, 33) = 1.08, MSe = .183. The mean proportions were .78 and .80 for convergent and divergent structures, respectively. This result could be due to the fact that the schematic framework of the spy ring differs from a family tree in lacking a well-defined orientation. No convention requires one spy to be in a certain place relative to another, so converging nodes and diverging nodes can be represented in a nearly equivalent fashion, as illustrated in Fig. 6. This finding is especially striking when one considers that the individual relations, as presented in this study, had a directional component. That is, subjects were told that “AB” meant that A could talk to B, but not vice versa.

[Figure 6: two response grids of six circles each, labeled Converging and Diverging.]

Fig. 6. Two examples of the response format used in Experiment 5. For each trial, the subject was provided with a set of six circles in the configuration shown and was to fill them in using arrows and the elements of the ordering (which were actually digits for the subjects). For purposes of illustrating the form of the responses, the two symmetrical networks in Fig. 5 have been represented.

Within asymmetrical structures, where nonmatch orders are possible, these orders proved more difficult than match orders, F(1, 33) = 10.96, MSe = .163. The mean proportion correct on nonmatch orders was .70, in contrast to .80 on match orders. Although this result is similar to what has been found in previous work on linear orders as well as in the fourth experiment, the effect was not so large in the present experiment, nor was there an interaction with convergent and divergent structures, F < 1. Moreover, order 9 was not strikingly more difficult than the other orders. These results are consistent with the notion that a spy network imposes no particular orientation on the representation.

The reasoning involved can be seen best with examples. First, consider the nonmatch situation in the family tree setting. The relationships “A is the father of B” and “C is the father of D” in that order might suggest that A and C are on different levels of the tree, with A above (or older than) C, whereas “C is the father of D” followed by “A is the father of B” should, according to the explanation, suggest that C is above A. Subjects might be expected to order the two relationships tentatively by level using the order of mention. This would be a correct assumption if A is the father of C, but not if D is the father of A. In the latter case, some mental rearrangement would be required. (The rearrangement is part of the process D2 described by Foos et al., 1976.) When “mother of” is permitted in the sentences, it becomes possible for A and C to be on the same level, but “A is the father of B” and “C is the mother of D” together seem to imply more complex relationships, for example, half-brothers and half-sisters. Now consider the same relationship in a spy network. “A talks to B” and “C talks to D” leave no implications about any third relationship. A could talk to C or C to A. Logically, there is even the possibility that the next relationship might be “C talks to B” or “A talks to D,” although such networks never occurred in the experiment.
The point is that the spy network schema involves many fewer presuppositions about possible configurations and probably encourages subjects to remain uncommitted in the nonmatch situation. This may also explain why the difference between match and nonmatch orders is not as large in Experiment 5.

The only remaining comparison of interest which was significant in Experiment 5 contrasted match orders requiring process M1 (mean proportion correct .86) with those requiring process M2 (.68), F(1, 34) = 5.75, MSe = .180. This difference may seem inconsistent with the argument about the lack of orientation of spy networks. However, in all cases, the presentation orders compared begin with the construction of a three-spy line, that is, A talks to B, B talks to C, or B talks to C, A talks to B. Because the construction of a linear ordering is involved, one might expect processes M1 and M2 to play a role. Note that the contrast did not involve presentation orders in which the fourth element was added to a previously constructed node. No significant differences among these orders were obtained.

C. CONCLUSIONS FROM EXPERIMENTS ON PRESENTATION ORDER

The results of the last three experiments suggest that at least some kinds of networks are no more difficult to construct than linear orderings under the right circumstances. Moreover, the difficulty does not appear to be due solely to the type of structure. Rather, the context and form in which relationships are presented is quite important, as well as the order in which the component relations are presented. For the latter variable, the results quite strongly indicate that what Foos et al. labeled nonmatch presentation orders increase the difficulty of constructing a network, just as they do in constructing a linear ordering. In other respects, constructing a node, the structure that makes partial orderings more complex than linear orderings, turns out to be much like combining two relations to form a line.

When people listen to sentences and attempt to represent the information they contain, a large number of contextual factors including the choice of words in the sentences determines the type of representation ultimately formed. With most previous studies of constructive processes, both instructions and the investigator’s semantic choices generally activated a single familiar schema of representation, a line or a ranking. Once this schema has been invoked, it is probably very difficult to represent indeterminacies correctly. We would argue that this has happened in previously published network studies. Subjects were expected to represent partial orderings (or networks) on the basis of relational information that seemed to be fully ordered. In our experiments, we have shown that this presupposition can be readily broken and that college students have an appropriate way of representing partially ordered relations. The last experiment demonstrates that at least two such representational schemas differ subtly in their presuppositions.
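One way to see why a schema without a fixed orientation might treat converging and diverging nodes as nearly equivalent is that reversing every arrow in a convergent network yields the matching divergent shape. The Python sketch below checks this for the four structures of Experiments 4 and 5, using the pairs listed in Table V. It is an illustration only, not part of the original analysis; the names NETWORKS, reverse, and same_shape are hypothetical, and shape equality is tested by brute-force relabeling of the four elements.

```python
from itertools import permutations


def edges(pairs):
    """Turn pairs such as "AC" into (parent, child) tuples."""
    return {(pair[0], pair[1]) for pair in pairs}


# The four structures, written with the pairs used in Table V.
NETWORKS = {
    "convergent symmetrical": edges(["AC", "BC", "CD"]),
    "convergent asymmetrical": edges(["AB", "BD", "CD"]),
    "divergent symmetrical": edges(["AB", "BC", "BD"]),
    "divergent asymmetrical": edges(["AB", "AC", "CD"]),
}


def reverse(network):
    """Flip the direction of every relation."""
    return {(child, parent) for parent, child in network}


def same_shape(first, second):
    """True if some relabeling of elements turns `first` into `second`."""
    names_first = sorted({name for edge in first for name in edge})
    names_second = sorted({name for edge in second for name in edge})
    if len(names_first) != len(names_second):
        return False
    for relabeled in permutations(names_second):
        mapping = dict(zip(names_first, relabeled))
        if {(mapping[p], mapping[c]) for p, c in first} == second:
            return True
    return False


for kind in ("symmetrical", "asymmetrical"):
    flipped = reverse(NETWORKS[f"convergent {kind}"])
    print(kind, same_shape(flipped, NETWORKS[f"divergent {kind}"]))
```

Both comparisons succeed, whereas the unreversed convergent and divergent networks do not match. In this sense the two node types differ only in the direction of the individual relations, which is the property the spy-network instructions preserved.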

IX. Summary

Our program of research began from two points: a model of the constructive processes used in integrating the information in a linear ordering and a review of the recent literature on partial orderings. The first experiment was designed to clarify an implicit assumption in recent studies of partial orderings. We believe the focus on retrieval times has been misdirected so far as understanding why people have difficulty mastering these structures. Retrieval times probably reveal more about search processes in partial orderings than about their representation. The second experiment we described also implied a criticism of recent studies. The use of partial orderings with a large number of elements and arbitrary structural characteristics may have obscured the similarities between partial and linear orderings when an integrated representation of them is being constructed. These two experiments set the stage for the last three, which verified the importance of activating an appropriate representational schema for the subject and demonstrated that many constructive processes are either the same as, or quite similar to, those described in earlier models of how linear orderings are constructed.

ACKNOWLEDGMENTS

The research described here was supported by grants BNS 75-19313 and BNS 77-16860 from the National Science Foundation to Bowling Green State University with the authors as coprincipal investigators. We would also like to acknowledge the many helpful discussions of this work we have had with Richard A. Griggs, Shannon D. Moeser, and Rebecca M. Pliske, although our views differ sharply with theirs in various ways. Finally, we would like to thank Mrs. Waltraud Vogel for her effort and patience in typing the manuscript of this contribution.

REFERENCES

Anderson, J. R. Language, memory, and thought. Hillsdale, New Jersey: Erlbaum, 1976.
Anderson, J. R., & Bower, G. H. Human associative memory. New York: Holt, 1973.
Barclay, J. R. The role of comprehension in remembering sentences. Cognitive Psychology, 1973, 4, 229-254.
Battig, W. F., & Montague, W. E. Category norms for verbal items in 56 categories: A replication and extension of the Connecticut category norms. Journal of Experimental Psychology Monograph, 1969, 80, No. 3, Pt. 2.
Chomsky, N. Syntactic structures. The Hague: Mouton, 1957.
Collins, A. M., & Loftus, E. F. A spreading-activation theory of semantic processing. Psychological Review, 1975, 82, 407-428.
Collins, A. M., & Quillian, M. R. Retrieval time from semantic memory. Journal of Verbal Learning and Verbal Behavior, 1969, 8, 240-247.
De Soto, C. B. Learning a social structure. Journal of Abnormal and Social Psychology, 1960, 60, 417-421.
De Soto, C. B. The predilection for single orderings. Journal of Abnormal and Social Psychology, 1961, 62, 16-23.
Foos, P. W., & Sabol, M. A. The role of memory in the construction of linear orderings. Memory and Cognition, 1981, 9, 371-377.
Foos, P. W., Smith, K. H., Sabol, M. A., & Mynatt, B. T. Constructive processes in simple linear order problems. Journal of Experimental Psychology: Human Learning and Memory, 1976, 2, 759-766.
Griggs, R. A., Keen, D. A., & Warner, S. A. Encoding partially ordered information. Bulletin of the Psychonomic Society, 1980, 15, 299-302.
Hayes, J. R. Problem typology and the solution process. Journal of Verbal Learning and Verbal Behavior, 1965, 4, 371-379.
Hayes-Roth, B., & Hayes-Roth, F. Plasticity in memorial networks. Journal of Verbal Learning and Verbal Behavior, 1975, 14, 506-522.
Henley, W. M., Horsfall, R. B., & De Soto, C. B. Goodness of figure and social structure. Psychological Review, 1969, 76, 194-204.
Johnson, N. F. Sequential verbal behavior. In T. R. Dixon & D. L. Horton (Eds.), Verbal behavior and general behavior theory. New York: Prentice-Hall, 1968. Pp. 421-450.
Moder, J. J., & Phillips, C. R. Project management with CPM and PERT. New York: Van Nostrand-Reinhold, 1964.
Moeser, S. D. Acquiring complex partial orderings in comparison with acquiring similar-sized linear orderings. Memory and Cognition, 1979, 7, 435-444.
Moeser, S. D., & Tarrant, B. L. Learning a network of comparisons. Journal of Experimental Psychology: Human Learning and Memory, 1977, 3, 643-659.
Mynatt, B. T., & Smith, K. H. Constructive processes in linear order problems revealed by sentence study times. Journal of Experimental Psychology: Human Learning and Memory, 1977, 3, 357-374.
Nelson, T. O., & Smith, E. E. Acquisition and forgetting of hierarchically organized information in long-term memory. Journal of Experimental Psychology, 1972, 95, 388-396.
Norman, D. A., Rumelhart, D. E., & the LNR Research Group. Explorations in cognition. San Francisco, California: Freeman, 1975.
Pliske, R. M. Distance effects in mental comparison studies. Unpublished master’s thesis, Bowling Green State University, 1978.
Pliske, R. M., & Smith, K. H. Semantic categorization in a linear order problem. Memory and Cognition, 1979, 7, 297-302.
Polich, J. M., & Potts, G. R. Retrieval strategies for linearly ordered information. Journal of Experimental Psychology: Human Learning and Memory, 1977, 3, 10-17.
Potts, G. R. Information processing strategies used in the encoding of linear orderings. Journal of Verbal Learning and Verbal Behavior, 1972, 11, 727-740.
Potts, G. R. Storing and retrieving information about ordered relationships. Journal of Experimental Psychology, 1974, 103, 431-439.
Potts, G. R., Banks, W. P., Kosslyn, S. M., Moyer, R. S., Riley, C. A., & Smith, K. H. Encoding and retrieving ordered relationships. In N. J. Castellan & F. Restle (Eds.), Cognitive theory (Vol. 3). Hillsdale, New Jersey: Erlbaum, 1978. Pp. 243-308.
Smith, K. H., & Foos, P. W. Effect of presentation order on the construction of linear orders. Memory and Cognition, 1975, 3, 614-618.
Smith, K. H., & Mynatt, B. T. Effects of presentation order on construction of complete and partial orders. Paper presented at the meeting of the Psychonomic Society, Denver, November, 1975.
Smith, K. H., & Mynatt, B. T. On the time required to construct a simple linear order. Bulletin of the Psychonomic Society, 1977, 9, 435-438.
Trabasso, T., Riley, C. A., & Wilson, E. G. The representation of linear order and spatial strategies in reasoning: A developmental study. In R. Falmagne (Ed.), Psychological studies of logic and its development. Hillsdale, New Jersey: Erlbaum, 1975. Pp. 201-230.
Warner, S. A., & Griggs, R. A. Processing partially ordered information. Journal of Experimental Psychology: Human Learning and Memory, 1980, 6, 741-753.
Woocher, F. S., Glass, A. L., & Holyoak, K. J. Positional discriminability in linear orders. Memory and Cognition, 1978, 6, 165-173.


A PERSPECTIVE ON REHEARSAL

Michael J. Watkins and Zehra F. Peynircioglu

RICE UNIVERSITY
HOUSTON, TEXAS

I. Overview
II. Meaning of Rehearsal
III. Rehearsal for Free Recall Reconsidered
    A. Formulating the Issue
    B. Overt Rehearsal Evidence
    C. Other Evidence
    D. Experiment 1
    E. Concluding Comments
IV. Rehearsal of Nonverbal Stimuli
    A. Experiment 2
    B. ...
    C. Experiment 4
    D. Concluding Comments
V. Summary and Conclusions
References

I. Overview

This article inquires into the role of rehearsal in learning and memory. It begins with a background discussion of what rehearsal is and of how and why its meaning for the psychologist differs from that for the layman. The main thesis of the article is that this discrepancy has led to the role of rehearsal being overplayed in some areas of research and neglected in others. It is argued that the widely held assumption that rehearsal plays a central role in the free recall paradigm, and by implication in other common memory paradigms, has not been as convincingly demonstrated as is sometimes believed, and we report an experiment that suggests the importance of rehearsal in free recall experiments may be minimal. We then consider what we think is a neglected issue: the rehearsal of nonverbal materials. After reviewing the research on this question, we describe some new experiments demonstrating effective rehearsal of nonverbal auditory information. It should be made clear that this article does not provide a comprehensive review of the research conducted under the heading of rehearsal; for that the reader must go elsewhere (e.g., Johnson, 1980). Our intention is to suggest that the importance attributed to rehearsal is uneven: exaggerated in some areas and virtually ignored in others.

II. Meaning of Rehearsal

There is a peculiar need to consider the meaning of the term rehearsal, peculiar because the way the term is used by psychologists is not the same as its everyday usage. In particular, there are many facets of rehearsal that psychologists tend to ignore. Although perhaps largely semantic, we believe this issue to be relevant to the understanding of learning and memory. For one thing, defining a key concept in a field of study in a way different from its everyday usage creates confusion and problems of communication. In the present case, these are felt by everyone from the college sophomore struggling with a learning and memory course to researchers trying to communicate in the pages of their journals. Also, it may be naive to assume that the domain of everyday usage is devoid of psychological coherence or significance. At the very least, consideration of the full everyday usage of the term might provide psychologists with a perspective (perhaps even a fruitful one) of their own sense of the term. To convey the narrowness of the psychologist’s usage of rehearsal, we consider four ways in which the concept can be categorized, and indicate how in each case restrictions have crept in.

The first form of categorization is according to whether rehearsal is overt or covert. That is, rehearsal can be manifest and public, or it can be mental and private. In the world outside the laboratory, overt rehearsal is probably no less important than covert rehearsal; yet in the world inside the laboratory, reference is largely confined to the latter. There are, of course, occasions when covert rehearsal is more appropriate: when one wants rapid and easy reassurance that one is capable of performing satisfactorily, or when one needs to go over one’s part in a play while on a crowded bus. But it is hard to see why covert rehearsal should be presumed to be more basic, or to be the rule rather than the exception. Certainly, rehearsing a ballet involves doing ballet; and rehearsing for a piano recital involves playing the piano. More generally, to the extent that the end one rehearses for is a public activity, overt rehearsal is likely to be a more valid form of preparation. It seems odd, then, that when contemporary psychologists use the term rehearsal they are nearly always referring to covert rehearsal.

We might note that different kinds of activities may vary in the extent to which they lend themselves to covert rather than overt rehearsal. Although this is a question yet to be explored by experiment, intuition suggests that covert rehearsal may be suited to the learning of verbal material, but of little value for a ballerina trying to perfect a pirouette or a pianist the playing of a lengthy trill.

The second distinction is that rehearsal may be stimulus based or memory based. Stimulus-based rehearsal involves the use of external prompts, as when reference is made to a score while rehearsing music or ballet. In memory-based rehearsal, no external prompts are provided, so that what has to be rehearsed first has to be remembered. In practice, these two modes of rehearsal may form the end points of a continuum, with the purpose of rehearsing being to move the performer along the continuum, and so eventually to dispense with the prompts. Thus, the pianist gradually comes to rely less and less on the score and more and more on memory. Once again the meaningfulness of this distinction will vary with the material being rehearsed, possibly with a greater significance where the prompts take the form of a well-developed system of symbols such as language or musical notation.

Although the existence of both stimulus- and memory-based rehearsal seems obvious enough, and although in practice rehearsal often involves a mixture of the two, psychologists have restricted the term to what we are calling memory-based rehearsal. For example, even though in his review Johnson (1980) uses (and indeed, so far as we know, introduces) the term “memory-based rehearsal,” he considers this to be the only form of rehearsal, and practice in the presence of the to-be-remembered material as something else:

Rehearsal also needs to be differentiated from external reexposures to the task. Such reexposures are simply additional learning trials, and there is no merit in dubbing such presentation trials as rehearsal. Even when the reexposure occurs after a delay interval, as in review, the reexposure does not constitute a rehearsal period. A review typically does include selective reexposures to the original content, and/or presentations of new facts and generalizations. As defined here, however, such external presentations would not constitute rehearsal even though such learning activities do prepare learners for future performances. All learning experiences may be viewed as preparation for future performances, but if rehearsal is to be a useful concept, the term should designate some particular subset of preparatory learning activities. (p. 265)

The third distinction is whether rehearsal is under the control of the rehearser or some external agent. A musician may control the rehearsal of his own part in a piece of orchestral music, but rehearsal of proper coordination with other instruments is under the control of the conductor. In the experimental study of learning, the presentation of the to-be-learned material is usually under the control of the experimenter, in which
case, to the extent that the rehearsal of an item is confined to the exposures of the item on successive trials, it is experimenter controlled; to the extent that subjects rely on memory and rehearse items other than the one currently being exposed, rehearsal is subject controlled. For psychologists rehearsal means subject-controlled rehearsal.

Finally, the material rehearsed can be verbal or nonverbal. Although perhaps not as a matter of policy, psychologists usually confine their attention to the rehearsal of verbal material. The rehearsal of nonverbal material is almost entirely neglected.

It is clear from these four distinctions that in contemporary psychology the term rehearsal is used in a remarkably restrictive way. In the vast majority of cases, rehearsal refers to silent thinking about previously presented items of a verbal list. We shall discuss some possible whys and wherefores of these restrictions, but first should note that in one respect psychologists veer in the opposite direction and extend the use of the term beyond the limits of everyday usage. To the layman, rehearsal implies a repetition, or doing over and over. At the risk of sounding scholarly, we might add that the word derives from the Middle French “herse,” meaning harrow or rake, so that rehearse is a metaphorical reference to the repetitive act of raking. In psychology this implication of repetition is often played down, if not ignored. For instance, creative thought about an item is sometimes characterized as “elaborative rehearsal,” even though there may be nothing to indicate any form of repetition. Once again Johnson (1980) has captured the contemporary viewpoint: “Repetition is a frequently used strategy in rehearsal, but the two concepts are not synonymous. . . . A definition equating rehearsal with repetition . . . is overly restrictive” (p. 265).
Unfortunately, beyond appealing to convention, neither Johnson nor anyone else we know of has attempted to justify using rehearsal in reference to situations that do not entail repetition. Our own opinion is that this extended usage may, ironically, have been fostered by the very restrictions psychologists have imposed on rehearsal. When restricted to conjectural activities (i.e., to covert, memory-based, and subject-controlled processing), rehearsal does not readily submit to an operational definition and therefore is apt to be lumped together with other aspects of covert mentation. There may be no easy solution to this difficulty and the appropriateness of usage may have to be decided on a case-by-case basis. But we think that to characterize a procedure that may call for deep or creative thought about each item as it is presented as one involving elaborative “rehearsal” is to needlessly bamboozle the uninitiated.

Let us consider now how the various restrictions psychologists have imposed on rehearsal came into being. At the heart of this question is the
way the theory and research on memory in general have evolved. Until about 20 years ago, research on learning followed the tradition established by Ebbinghaus (1885), according to which learning occurred through the passive association of contiguously occurring items. Learning was thought of as controlled strictly by the experimenter, through the explicit presentation of the to-be-associated items. The choice of to-be-associated items, the instructions, and the procedural details were all designed to reduce the opportunity for subjects to learn through intelligent strategies, and the possibility of subjects successfully using such strategies was ignored. Within this experimental tradition, it would, in our view, have been appropriate to characterize the successive trials used in the acquisition of lists as rehearsals. Indeed, the anticipation method of paired-associate learning (in which, on each trial and for each pair of to-be-associated items, the stimulus member is presented, the subject tries to recall the response member, and then both members are shown as a pair) appears to come as close to capturing the essence of the everyday meaning of rehearsal as any experimental procedure in the researcher’s arsenal. On the other hand, even if the failure to use the term rehearsal in these procedures was happenstance, the theoretical thrust of the research was on such matters as the effects of similarity on transfer and interference, and not directly on rehearsal itself.

Nowadays psychologists conceptualize learning and memory in a quite different way: They view the learner or rememberer as a computer. This perspective originated during World War II, came to the fore around 1960, and now reigns unchallenged. Its development is a story well known, and there is no need to recount its details here. It is enough to note that learning and remembering now involve not the passive association of stimuli, but the active and resourceful processing of information.
This means that the subject has become substantially freed from the control of the stimulus. Whereas learning was formerly seen as a simple consequence of mere stimulus exposure, it is now seen as depending in large measure on the extent and manner in which the stimuli are processed. And as important as any of the control processes a subject can bring to bear on a learning task is rehearsal. This point is captured by what has come to be known as the modal model of short-term memory, according to which the registration of the to-be-remembered items into the long-term store or memory proper is mediated by rehearsal of the items while they reside in a temporary, short-term store, so that the likelihood of recalling a given item in some later test increases with the number of times it is rehearsed (e.g., Atkinson & Shiffrin, 1968; Waugh & Norman, 1965).

This general movement to free the subject from stimulus control and, more particularly, the modal model of short-term memory both appear to have shaped the current usage of the term rehearsal in cognitive psychology. The point is obvious in the restriction of the term to subject-controlled rehearsal. No less obvious is the restriction to memory-based rehearsal, for if the subject’s rehearsal is to be free from stimulus control, then it must be memory based. And since subjects tend not to vocalize unless specifically instructed to do so, such rehearsals are a matter of conjecture and in any case covert. Clearly, these three restrictions are straightforward reflections of the new theoretical viewpoint. The reason for the restriction to verbal materials is more complex. To some extent it is due to a convention inherited from the associationists, and ultimately to the fact that verbal materials are extremely convenient to work with. On the other hand, it is probably no accident that verbal materials are, or at least are widely assumed to be, more amenable to mental manipulation than are nonverbal materials. So it is that in psychology rehearsal has, with only occasional exceptions, come to refer to covert, subject-controlled, memory-based processing of verbal material. That this usage is firmly established is well illustrated by the care taken to avoid extending it to discussions of research in which it might very well have suggested itself to the layman. Such research includes not only the verbal learning research of the associationists and the training research of industrial psychologists but also research on topics currently of considerable interest to cognitive psychologists, such as the effects of item repetition and of spacing of repeated items in a study list (e.g., Hintzman, 1974; Melton, 1970), and the effects of recall and other tests on subsequent memory performance (e.g., Darley & Murdock, 1971; Thompson, Wenger, & Bartling, 1978). 
As we noted earlier, the implications of such restrictive usage may be more than semantic. We believe it may be worthwhile to consider a more extended usage. Later we shall seek to illustrate this belief by describing existing evidence and presenting new evidence on rehearsal in a sense that transgresses two of the four restrictions we have noted: We shall look at experimenter-controlled rehearsal of nonverbal material. But first we shall suggest that the importance of rehearsal in the sense that psychologists use the term may have been substantially overplayed.

III. Rehearsal for Free Recall Reconsidered

Rehearsal has been assumed to play an important role in virtually all common short-term memory paradigms (see Johnson, 1980, p. 263), though for none is the assumption more widely held than for free recall.

In a free recall procedure the items can be recalled in any order, and there are strong theoretical reasons why this freedom should maximize the benefits of rehearsal. One reason is that in free recall the order in which items are rehearsed need not follow the order in which they were presented. For tests requiring memory for order of item presentation, rehearsal could be a risky strategy, since any accidental reordering of the items during rehearsal would cause subjects to learn incorrect information. For free recall, on the other hand, not only is it not detrimental for subjects to get the items out of order when rehearsing, but it may even be good strategy to use rehearsal to deliberately rearrange or organize them into meaningful sets (cf. Mandler, 1967; Tulving, 1962). Another reason why rehearsal may be of special importance in free recall is brought out in the encoding variability hypothesis, according to which the more contexts an item is encoded in, the greater the probability of its later recall. This hypothesis predicts that rehearsal would constitute an optimal strategy for free recall, since the functional reordering of items it achieves allows each item to be studied in a variety of contexts. The encoding variability hypothesis has been used to account for the finding that an item that is presented twice in a list is more likely to be recalled if the two presentations are distributed among presentations of the other list items than if they are consecutive (e.g., Hintzman, 1974; Melton, 1970). Regardless of whether this interpretation is appropriate, the finding suggests that rehearsal can enhance recall merely by distributing item study. That is, the study of an item should be more effective if the time devoted to it is divided and distributed through the list. Despite such powerful theoretical arguments, the case for covert, memory-based rehearsal in free recall must ultimately rest on evidence. 
Much supportive evidence of a variety of forms has been amassed but, in our opinion, this evidence (whether considered in its separate strands or taken as a whole) is not entirely conclusive. In order to present our case or to evaluate the evidence at all, we first need to formulate the issue with some precision.

A. FORMULATING THE ISSUE

Just how important rehearsal is in free recall will necessarily depend on how the question is put. Our own formulation of the question is guided by two requirements: first, that it capture the thrust of contemporary theorizing; second, that it be as tractable as possible. To this end we sharpen the question in the following respects. First, we shall be concerned only with effective rehearsal. Since we are interested in the role that rehearsal plays in learning, we shall not be
concerned with what is sometimes called maintenance rehearsal (see Craik & Lockhart, 1972), which keeps items in mind for immediate use, as between looking up and dialing a telephone number, but which is without long-term effects. Subjective reports indicating awareness of rehearsal during presentation would not, therefore, be directly relevant to our concern. The issue is whether an effect of rehearsal shows up in some nonimmediate test.

Second, we restrict consideration to situations in which each to-be-remembered item is presented just once. When items are presented more than once within a study list, interest is usually in recall of the items, not in the recall of their individual presentations. Thus, the repetition of an item constitutes experimenter-controlled, stimulus-based rehearsal of the required item information. The effectiveness of this kind of rehearsal would probably be judged intuitively obvious; in any event, its experimental documentation is as old as the experimental study of memory itself (Ebbinghaus, 1885). But the issue at hand concerns rehearsal that is covert, memory based, and subject controlled, and the evidence pertaining to such rehearsal is likely to be more difficult to evaluate if confounded by the effects of rehearsal of other kinds.

Third, we need to distinguish between rehearsing an item between the time of its presentation and that of the next item, and rehearsing it after the presentation of the next item. We restrict consideration to the latter case. By doing so, we not only make our question easier to address, but we avoid the danger, discussed earlier, of inappropriately applying the term rehearsal to a situation in which repetition does not occur. We can be sure that when an item other than the most recently presented is thought about, it is, in some sense at least, being repeated.

Finally, we need to be quite clear about what we mean by learning.
We define learning simply as the building up of a potential to recall at some later time; in other words, as an abbreviation for “enhancing the likelihood of recall in a subsequent test.” In short, this section addresses what we call the rehearsal hypothesis, which states: During the presentation phase of a typical free recall trial, subjects enhance the likelihood of their subsequently being able to report a once-presented item by rehearsing it after the presentation of at least the next item in the sequence. Formulated this way, the question of learning through rehearsal is not only tractable but also addresses a central assumption of contemporary theorizing.

B. OVERT REHEARSAL EVIDENCE

Probably the most influential source of evidence for learning through rehearsal, as we have defined the issue, comes from overt rehearsal
studies, and we consider these first. Other kinds of evidence are considered in the next subsection. The basic idea of the overt rehearsal procedure is to listen in on subjects’ thoughts. Subjects report their rehearsal activity to the experimenter, who is then able to test the hypothesis that rehearsal promotes learning by seeing how amount of rehearsal relates to probability of recall. The details of the procedure were developed by Corballis (1969) and by Rundus and Atkinson (1970) and are fairly straightforward. As in the “standard” free recall procedure, a list of items is presented to the subject and followed by a recall test. The only novel feature is that the subject is asked to provide the experimenter with a running commentary on his rehearsal activity. In practice, the procedure typically entails a slow rate of presentation, such as one item every 5 or 10 sec, in order to allow the subject adequate time to report. The subject’s rehearsals are recorded on tape for later analysis.

Consistent with current theorizing, overt rehearsal research (e.g., Rundus, 1971) has shown that during each rehearsal interval (i.e., between the presentation of any two consecutive list items), subjects usually report not merely the item just presented but also previous items. Such data are consistent with the modal model of short-term memory in that during any given interval, the reported items are predominantly from among those most recently presented and hence most likely to be still in primary memory. Above all, with rehearsal reports summed over all rehearsal intervals, the total number of times an item is rehearsed correlates strongly with the probability of its being produced in the recall test. The only exception concerns the items reported in the final rehearsal interval, which follows presentation of the last list item and typically directly precedes the recall signal.
Although these items tend to be from toward the end of the list, and hence somewhat lacking in total number of rehearsals, they are recalled with a high probability. As it happens, even this exception is entirely consistent with the modal model: Items reported in the final rehearsal interval are likely to be represented in, and recalled directly from, primary memory, so that with an immediate test their likelihood of recall will be high and their degree of rehearsal and hence registration in secondary memory will be irrelevant.

Clearly, the results obtained by the overt rehearsal procedure provide impressive support for the rehearsal hypothesis. But the evidence has two major problems. First, it is correlational, the probability of recall being correlated with amount of rehearsal. Consequently, rather than rehearsal causing recall, it is entirely possible that the relation between rehearsal and recall is indirect, the product of some other factor. The variation in recallability among items may control both degree of rehearsal and probability of recall: Some words are both more rehearsable and more recallable, others are both less rehearsable and less recallable. The force of this possibility becomes apparent when we realize that overt rehearsal intervals are really miniature free recall tests. True, the intervals are rather brief as free recall tests go, and the number of items subjects manage to report is small; but, to press the point, the duration of even the official recall test has a limit, and if given more time more items would be recalled (Roediger & Thorpe, 1978). It seems, then, that even overt rehearsal evidence for the rehearsal hypothesis, influential as it has been, degenerates under scrutiny to a correlation between the total number of times an item is produced in a series of miniature recall tests and the probability of its recall in an official test.

The second major problem with the overt rehearsal evidence is that it may not be a valid reflection of how learning normally proceeds. The very requirement that subjects report their rehearsal activity may change the way they study. It may rob them of a chance to use other learning strategies (Kellas, McCauley, & McFarland, 1975); and, more critical for our purposes, it may induce effective rehearsal where none would otherwise exist. Certainly, the procedure makes implicit demands on the subject. If subjects are told to provide a running commentary on their rehearsals, they probably feel obliged to rehearse. Consequently, even if learning were convincingly demonstrated to occur through rehearsal with the overt rehearsal procedure, it would not follow that the same is true with the standard procedure. An aspect of this validation problem concerns the need to use slower rates of presentation than are typical of free recall experiments.
This need implies that at conventional presentation rates (say, one item every 1 or 2 sec) either the rehearsal hypothesis does not hold, or the reporting cannot keep pace with the rehearsal, in which case overt rehearsal fails to provide a faithful picture of covert rehearsal.

There is now a fair amount of evidence regarding the validity of the overt rehearsal procedure, but opinions differ widely on how this evidence should be interpreted. In our opinion, the weight of evidence suggests that in a free recall procedure, a requirement for overt rehearsal alters the way subjects study. Several studies have compared recall in the standard free recall procedure with that obtained in a procedure that was identical, with the sole exception of a requirement for overt rehearsal. If the way the items are memorized is the same for the two procedures, then levels of recall should be about the same. If, on the other hand, subjects memorize in different ways in the two procedures, then there would be no reason to expect the levels of performance to be the same, although they might not be noticeably different. In this case, whether level of recall with overt rehearsal is higher than, lower than, or about the same as that for the standard procedure would depend on the nature of the items presented and a variety of procedural details. Although some research has found the two procedures to produce equivalent levels of recall (Horton, 1976; Murdock & Metcalfe, 1978; Roenker, 1974), more critical are findings of a difference, with sometimes the overt rehearsal procedure (Ashcraft, Kellas, & Needham, 1975; Whitten & Bjork, 1977) and sometimes the standard procedure (Fischler, Rundus, & Atkinson, 1970; Glanzer & Meinzer, 1967; Kellas et al., 1975; Madigan, 1973) showing the higher level of recall.

The case for the validity of the overt rehearsal procedure as a technique for uncovering the nature of learning in free recall has been argued in an important article by Murdock and Metcalfe (1978). Their study was elaborate, involving seven experimental sessions. The findings included virtually identical performance for the overt rehearsal and the standard procedures in terms of both overall level of recall and serial position functions. On the other hand, data from a final recall test (a test given at the end of an experimental session which called for recall of as many of the words as possible from all lists presented in the session) showed that the number of words recalled from lists presented under standard conditions was about half as great again as the number recalled from lists presented under overt rehearsal conditions. The highly similar levels of performance in initial recall notwithstanding, this rather large difference in final recall implies a difference between the two conditions in the way the items were memorized.

The primary concern of the Murdock and Metcalfe study was how recall in the overt rehearsal procedure compared, not with that in the standard procedure, but with that in a controlled rehearsal procedure. The controlled rehearsal procedure also involved overt rehearsal, but instead of relying on memory to rehearse, subjects were re-presented with the to-be-rehearsed items.
Such stimulus-based rehearsal enabled the experimenter to control allocation of rehearsal to the list items, and thereby avoid any item selection effects. Murdock and Metcalfe argued that if item selection effects were important, then performance should differ between the overt and the controlled rehearsal conditions. They compared performance with respect to (a) overall level of recall, (b) conventional serial position functions, and (c) what they called (following Brodie, 1975; see also Brodie & Murdock, 1977) functional serial position curves, which are obtained by plotting recall in terms of the relative positions of the items’ last rehearsals (such that for a 20-item list an item whose last rehearsal was followed by the rehearsal of 16 other, different items would be assigned to Position 4). In light of these comparisons, they concluded that “though item-selection effects can occur in the overt-rehearsal procedure, we find similar results in a controlled-rehearsal procedure where item-selection artifacts cannot occur” and hence that for the overt rehearsal procedure “any stigma due to item-selection artifacts has been removed” (Murdock & Metcalfe, 1978, p. 321).

In our view the comparisons between the controlled rehearsal and overt rehearsal conditions are less clear-cut than Murdock and Metcalfe imply. In particular, the functional serial position curves differ; moreover, they differ in a way consistent with the possibility of item selection effects: For the first half of the list (where item selection effects would be less likely to be masked by the effect of recency) recall shows a systematic increase across serial positions in the overt rehearsal condition (their Figs. 1 and 3) but not in the controlled rehearsal condition (their Figs. 2, 4, and 8). With regard to conventional serial position functions and overall levels of recall, it is not obvious why item selection effects should make a difference. In other words, it seems that of the comparisons Murdock and Metcalfe made between the overt rehearsal and controlled rehearsal conditions, only for that between the functional serial position curves should item selection effects be expected to make a difference; and for that comparison a difference did occur.

A surprising aspect of Murdock and Metcalfe’s (1978) study is that what is probably the most direct test of the item selection hypothesis was not made. If the correlation observed between number of overt rehearsals and probability of recall is the result of item selection effects, then one would expect this correlation to be greater than that between number of controlled rehearsals and probability of recall, since item selection effects would not occur with controlled rehearsal. Unfortunately, these correlations are not reported. As will be described shortly, there are a number of approaches to evaluating the rehearsal hypothesis in which the overt rehearsal procedure plays an auxiliary role.
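The functional serial position measure mentioned above can be made concrete with a short sketch. It is only an illustration built from the parenthetical definition in the text; the function name, the protocol format (a flat list of reported items, with each presentation counted as a rehearsal), and the sample data are assumptions, not details of Murdock and Metcalfe's analysis.

```python
def functional_positions(rehearsal_protocol, list_length):
    """Map each item to a functional serial position.

    rehearsal_protocol: items in the order they were rehearsed, with
    repeats (hypothetical format). An item's position is the list
    length minus the number of different items rehearsed after that
    item's last rehearsal.
    """
    last_index = {}
    for i, item in enumerate(rehearsal_protocol):
        last_index[item] = i  # later occurrences overwrite earlier ones

    positions = {}
    for item, idx in last_index.items():
        followers = set(rehearsal_protocol[idx + 1:])
        positions[item] = list_length - len(followers)
    return positions


# The example from the text: in a 20-item list, an item whose last
# rehearsal is followed by 16 other, different items gets Position 4.
protocol = ["X"] + [f"w{i}" for i in range(16)]
print(functional_positions(protocol, 20)["X"])  # 4
```

Note that an item presented early can still receive a late functional position if it is rehearsed again after later items: for the protocol A, B, A in a two-item list, A is assigned Position 2 and B Position 1.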
For the moment we conclude that the primary overt rehearsal findings we have described (namely, that when requested, subjects overtly rehearse in a way consistent with the modal model of short-term memory, and that the amount of rehearsal correlates strongly with likelihood of recall) do not prove that rehearsal is important in the learning that goes on in the standard free recall procedure.

¹ As far as overall level of recall is concerned, one could argue that when performance level is high, any damaging effect of reduced rehearsal of “difficult” items should exceed the beneficial effects of extra rehearsal of “easy” items, in which case recall should be lower in the overt rehearsal condition than in the controlled rehearsal condition; conversely, when performance is poor, the advantage should lie with the overt rehearsal condition. Unfortunately, overall level of performance turned out to be neither high nor low: 50% and 46% for the overt rehearsal and controlled rehearsal conditions, respectively.
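The item selection argument of this subsection can be illustrated with a toy simulation. The model below is entirely hypothetical (a uniformly distributed item "ease", five rehearsal intervals, and all function names are assumptions made for illustration). It builds in no causal effect of rehearsal on recall, yet a correlation between rehearsal count and recall still arises when subjects select what to rehearse, and it disappears when rehearsal counts are assigned without regard to the item.

```python
import random


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def simulate(n_items=20000, intervals=5, seed=0):
    """Return (r_overt, r_controlled) under a hypothetical model in
    which rehearsal has NO causal effect on recall."""
    rng = random.Random(seed)
    overt, controlled, recalled = [], [], []
    for _ in range(n_items):
        ease = rng.random()  # latent recallability of the item
        # Overt condition: each rehearsal interval acts as a miniature
        # recall test, so the item is reported with probability `ease`.
        overt.append(sum(rng.random() < ease for _ in range(intervals)))
        # Controlled condition: the number of rehearsals is assigned
        # without regard to the item.
        controlled.append(rng.randint(0, intervals))
        # Final recall depends only on `ease`, never on rehearsal count.
        recalled.append(1 if rng.random() < ease else 0)
    return pearson(overt, recalled), pearson(controlled, recalled)


r_overt, r_controlled = simulate()
print(f"overt: {r_overt:.2f}  controlled: {r_controlled:.2f}")
```

Under these assumptions the overt correlation is substantial and the controlled correlation is close to zero, which is the pattern the text says an item selection account would predict. The sketch says nothing about which account is correct for real data.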

C. OTHER EVIDENCE

The evidence for rehearsal as an effective means of learning in a standard free recall experiment does not rest on overt rehearsal research alone. In this subsection we briefly consider other major approaches to the question, and we suggest that the evidence they provide in support of the rehearsal hypothesis is also less than conclusive. One obvious approach is to replace the oral report of the overt rehearsal procedure with some less obtrusive means of monitoring the subject’s thoughts. To this end, a number of researchers have examined the effect of opportunity or instruction for covert rehearsal on one or more physiological measures such as electromyographic activity, pulse rate, breathing rate, and pupil size (e.g., Kahneman & Wright, 1971; Locke & Fehr, 1970; Ulich, 1967). Ingenious as much of this research may be, it has little relevance to our purposes. While it may reduce the demand characteristics inherent in the overt rehearsal procedure, the view it provides of the subject’s mental activity is murky. Even if we assume that these measures indicate rehearsal, they are too coarse to track the rehearsal of individual items. Thus, as a rule they do not tell us whether subjects think about each item only as it is presented or whether they covertly repeat it during the presentation intervals of later items. Moreover, even if we knew that items were being rehearsed, it would not follow that this rehearsal was the medium of learning. In another vein, the case for the rehearsal hypothesis has been made on the basis of organization occurring in the order in which the items are recalled at test. For instance, when a study list consists of words from each of a number of taxonomic categories in jumbled order, recall shows an appreciable grouping of the words into categories (Bousfield, 1953). Such organization is often attributed to a study strategy based on covert rehearsal (e.g., Allen, 1968; Greitzer, 1976; Weist & Crawford, 1973). 
To quote Greitzer (1976), “Subjects organize the material during acquisition by retrieving and rehearsing previously studied items when subsequent related items are presented” (p. 641). There are at least two problems with this argument. The first is that organization could be critically dependent upon the structure deliberately built into the list. In other words, organization in recall may not generalize to recall of nominally unstructured lists. Of course, organization does develop with unstructured lists during the course of a multitrial free recall procedure (e.g., Tulving, 1962), but there is no evidence that such organization occurs on the first trial and, thus, with single trial procedures. The second problem is that organization in recall does not prove rehearsal during presentation. Whatever organization is observed might occur even

under conditions in which we can be reasonably sure there is little or no rehearsal during presentation; for example, when subjects are not trying to memorize the items and the memory test is unexpected. Suggestive of this possibility is a finding by Ambler and Maples (1977) that degree of organization in the recall of a structured list did not depend on whether subjects were instructed to rehearse each item only as it was presented or whether they were left free to rehearse as they wished; this result led the authors to conclude that “organized rehearsal was not a necessary condition for organized recall” (p. 295). Other studies comparing the effects of different kinds of rehearsal instructions have been taken as supporting the rehearsal hypothesis. In some of these studies, instructions were varied between lists (Welch & Burnett, 1924); in others between portions of a list (Brodie & Prytulak, 1975); and in still others on an individual item basis, with subjects cued to rehearse or remember before (Geiselman, 1974), during (Roediger & Crowder, 1972), or after (Woodward, Bjork, & Jongeward, 1973) the presentation of each item. Although most of this research has shown that rehearsal instructions have an effect on recall, it is not directly relevant to the evaluation of the rehearsal hypothesis as defined here. One problem is that, at least in most cases, the observed effects need have nothing to do with rehearsal as such, but could be due to the rehearsal instructions inducing qualitatively different ways of studying the items. Another problem is that even if rehearsal does control learning in such experiments, this may not be representative of experiments that do not make strong demands for rehearsal in the instructions. It is sometimes said that the von Restorff effect provides evidence for the rehearsal hypothesis in that distinctive items are likely to be well rehearsed (e.g., Bellezza & Hofstetter, 1974). 
Certainly, evidence from the overt rehearsal procedure is consistent with this view (Rundus, 1971). It seems probable, however, that the von Restorff items are inherently more recallable and that their high level of recall would occur even in the absence of rehearsal. Indeed, it could be this ease of recall that underlies the intensive rehearsal of these items observed in the overt rehearsal procedure. Far less readily dismissed is the finding of a reduced level of recall of items immediately preceding the von Restorff items (Schultz, 1971; Tulving, 1969). This retrograde amnesia, as Tulving (1969) called it, implies that the study of a regular list item is cut short by the occurrence of a von Restorff item, which is entirely in keeping with the rehearsal hypothesis and clearly poses a problem for the contrary view that the effective study of an item terminates with the presentation of the next item in the list. The best we can do in arguing against the rehearsal hypothesis is to point out that the magnitude of the inhibitory effect is

small and its extent limited to a single item. It is therefore possible to emerge from contact with this effect sustaining the perhaps not too serious concession that in a standard free recall procedure, effective study of an item continues to some small degree during the presentation interval of just the immediately succeeding item. The rehearsal hypothesis has also gained support from the primacy effect, the relatively high level of recall for items presented at the very beginning of a list. In terms of the hypothesis, these items are well recalled because, relative to later items, they compete for rehearsal with fewer items and so are rehearsed more. This account of the primacy effect has been supported by a number of experiments. For instance, Bruce and Papay (1970) found that when subjects were cued during list presentation to ignore all items that had gone before and to prepare for recall of only those that followed, a primacy effect occurred for items presented after the cue. This result implies that the primacy effect depends on the subject’s intentions or strategies. This implication is reinforced by the finding that the primacy effect is eliminated when subjects encounter the list items unaware that their memory will be tested, so that their learning is entirely incidental (Marshall & Werder, 1972), or when subjects are instructed to restrict the learning of each item to its own rehearsal interval (Palmer & Ornstein, 1971; Welch & Burnett, 1924; for contrary findings see Darley & Glass, 1975; Leicht, 1968). On the other hand, we should note that the case for a rehearsal interpretation of the primacy effect is not completely cut and dried. Evidence that the effect depends upon the subject’s intentions and strategies is not proof that it is the product of enhanced rehearsal: It could be the product of a relatively effective study strategy that does not entail rehearsal. 
Alternatively, though a rehearsal strategy may be used to advantage for the first few items of a list, it may not be used (or may lose its effect) for subsequent items when memory span is approached and exceeded. A final piece of evidence in support of the rehearsal hypothesis comes from what might be conceived of as a complementary phenomenon to the primacy effect. This phenomenon occurs in a final free recall test in which, as mentioned in reference to the Murdock and Metcalfe (1978) study, the subject tries to recall words from all of the lists presented in the experimental session. When items recalled in this test are classified according to the within-list positions in which they originally occurred, probability of recall is found to decline over the last four or five positions (Craik, 1970). This “negative recency” effect is perfectly predictable from the rehearsal hypothesis. On the assumption that rehearsal stops when list presentation is complete, the last few items should be progressively less well rehearsed and hence progressively less well learned.

In fact, such reduced rehearsal of recency items has been observed with the overt rehearsal procedure (Rundus, Loftus, & Atkinson, 1970). As consistent with the rehearsal hypothesis as this evidence seems to be, it does have an alternative interpretation. The reduced recall of recency items in the final recall test may reflect the manner in which these items were memorized. Perhaps the subjects studied the last few items in a list in a way that was adequate for the immediate recall test, but quite inadequate for the final recall test. That is, whereas the rehearsal hypothesis attributes the negative recency effect to a reduced amount of study, this alternative interpretation attributes it to a different kind of study. These two interpretations have been evaluated experimentally and the evidence favors the alternative, quality-of-study interpretation (Maskarinec & Brown, 1974; Watkins & Watkins, 1974). For example, Watkins and Watkins found that when list length was varied unpredictably, so that subjects were unable to adopt a strategy of changing their mode of study for the end-of-list items, the negative recency effect was eliminated. Of course, this account of negative recency leaves open the question of why subjects should bother to change their strategy of memorizing when they believe the end of the list is near. However, a resolution of this enigma was suggested by the Watkins and Watkins (1974) data: Recall of recency items in the immediate test was at a higher level when subjects could anticipate the end of the list than when they could not. In short, it seems that the negative recency effect has nothing to do with rehearsal, but is a consequence of a change in study strategy for the recency items. This change increases the chances of the recency items being recalled in the immediate test, but decreases their chances of being recalled in the final test. 
We have looked at several different kinds of evidence for applying what we call the rehearsal hypothesis to the standard free recall paradigm, and we have argued that at least the bulk of the evidence is open to other interpretations. We now report an experiment that represents our own attempt to address the question.

D. EXPERIMENT 1

As just described, negative recency in final recall does not arise when precautions are taken to prevent subjects from changing their study strategies during list presentation (Maskarinec & Brown, 1974; Watkins & Watkins, 1974). This finding not only undercuts the support the rehearsal hypothesis derives from the negative recency effect but also raises a problem for the hypothesis. Given the assumption that rehearsal ceases when list presentation is complete, the rehearsal hypothesis is not merely

compatible with negative recency in final recall, but since it implies that end-of-list items should be underrehearsed, it would appear to predict such a result. How, then, is the hypothesis to be reconciled with the lack of negative recency? The reconciliation may be straightforward. In all of the negative recency research reported to date, performance in the final recall test is confounded with performance in the initial tests. More particularly, in the immediate tests the recency items are comparatively well recalled, and this higher level of recall could have increased the recall of these items in the final test. In this way, the effects of the predicted reduction in number of rehearsals for recency items could have been compensated for by the comparatively high level of recall of these items immediately following list presentation. That recall in one test enhances recall in a subsequent test is well established (Darley & Murdock, 1971; Thompson et al., 1978), even when the initial recall involves primary memory items (Bartlett & Tulving, 1974). Thus, whereas a finding of negative recency in the absence of any change in study strategy for recency items would have provided evidence in favor of the rehearsal hypothesis, the lack of such a finding is not very informative. The purpose of the present experiment was to see whether negative recency would occur under conditions that rule out not only the possibility of subjects switching the way they memorize when they suspect the end of the list is near but also the potentially confounding effects of initial recall.

1. Method

Since the experiment was rather elaborate, it may be helpful to begin its description by sketching its main features. As in the Watkins and Watkins (1974) experiment described above, subjects were presented with a series of lists of varying lengths, and the focus of interest was on final recall. The main difference was that the immediate recall tests were given for only some of the lists. For these signaled-recall lists, the recency items were clearly identified, giving the subjects a chance to change their study strategy for these items. Delayed recall of these lists was expected to reveal a negative recency effect. The remaining lists, for which there was no immediate recall test, were of two sorts: the signaled-forget lists and the unsignaled-forget lists. In the signaled-forget lists the recency items were identified just as for the signaled-recall lists, but they were followed immediately by the presentation of the next list rather than by a recall test. A negative recency effect in the delayed recall test was expected for these lists, too. Moreover, to the extent that the delayed

recall of the recency items is selectively enhanced by initial recall, the negative recency effect should be more pronounced for these lists than for the signaled-recall lists. Of more relevance to the present concern are the unsignaled-forget lists. For these lists the recency items were not identified as such, and no immediate recall test was given. Not only was there no indication of which items would turn out to be the last ones in the list but it also would make little sense for the subjects to guess, since they knew that there would be no test for lists that did not have the recency items identified, and, of course, they were not expecting the session to end with a final recall test. Thus, final recall for these lists should be uncontaminated by any effects of either initial recall or change in study strategy, and should provide an appropriate basis for evaluating the rehearsal hypothesis.

a. Materials and Design. The lists were constructed from eight sets of 22 familiar one- and two-syllable concrete nouns selected such that all the words within a given set had the same initial letter (B, C, D, F, G, M, P, or S). From each of these sets, four words were randomly chosen as guessing controls, words that were not presented but whose “recall” would allow degree of guessing to be estimated. The remaining 18 words in each set formed the presentation lists. For each subject, six lists included all of these 18 words, and two included just 12 words. Two additional lists of 30 concrete nouns (with initial letters A and T) were used as buffer lists, one being presented at the beginning of the experimental sequence and the other at the end. Each subject received two lists in the signaled-recall condition, two in the signaled-forget condition, and four in the more critical unsignaled-forget condition. The lists all consisted of 18 words except for two of the unsignaled-forget lists, which consisted of 12 words.
Thus, for counterbalancing purposes, the three conditions of theoretical interest were cast into four, with the unsignaled-forget condition subdivided according to list length. A single ordering of the lists was used for all subjects, but the conditions of the individual lists were varied between four groups of subjects according to a Latin square. Within each of these groups, the ordering of words within a list was varied between three equal subgroups by rotating each block of 6 words through successive thirds of the list; for the 12-item unsignaled-forget condition, the third block was not presented. In this way each list served equally often in each condition and each word within a list served equally often in each section of the list.

b. Subjects and Procedure. The subjects were 84 university undergraduates; they were tested individually or in small groups. They were led to believe that the purpose of the experiment was to determine whether recall of lists of varying lengths depends on probability of being tested,

and that in their case this probability was low and many of the lists would go untested. No mention was made of the final recall test. Ten successive lists were presented by means of a slide projector. Presentation was at a rate of 1 word every 3 sec. To reduce possible between-list confusion, the words had been chosen such that all those within a list began with the same letter. For lists not tested initially, the last word was followed immediately by the first word of the next list. To ensure that the subjects noticed the point at which a new list began and the preceding words could be forgotten, the first word of each list had its initial letter printed above it. It was explained to the subjects that for some lists the last 6 words would be underlined, and this would always be the case for the to-be-recalled lists. For the four lists (the two buffer lists and the two signaled-recall lists) for which an immediate recall test was given, the last word was followed by a blank slide and subjects were allowed 1 min to write down as many of the words as they could. They were told that the words could be written in any order, but that most people found it advantageous to begin with those toward the end of the list. To reduce the chances of the subjects being able to successfully anticipate the end of an unsignaled list, the first (buffer) list was made extra long (30 words) and the subjects were told that at least one list would be “very long indeed,” though in fact this was not the case. Following presentation of all 10 lists, a final recall test was given for the nonbuffer lists. The lists were cued in the order in which they had been presented, with the appropriate initial letters being presented as cues. The subjects were allowed as much recall time as they wanted.
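The counterbalancing scheme described under Materials and Design can be made concrete with a short sketch. The Python below is illustrative only: the report does not give the actual Latin square, so a cyclic square is assumed here, and the names `CONDITIONS`, `latin_square`, `rotate_blocks`, and `build_design` are hypothetical.

```python
from itertools import product

# The three conditions of interest, cast into four by list length.
CONDITIONS = [
    "signaled-recall",
    "signaled-forget",
    "unsignaled-forget-18",
    "unsignaled-forget-12",
]


def latin_square(items):
    """Cyclic Latin square (an assumption): row g is items rotated left by g."""
    n = len(items)
    return [[items[(g + i) % n] for i in range(n)] for g in range(n)]


def rotate_blocks(words, shift, block_size=6):
    """Rotate blocks of 6 words through successive thirds of an 18-word list."""
    blocks = [words[i:i + block_size] for i in range(0, len(words), block_size)]
    k = shift % len(blocks)
    rotated = blocks[-k:] + blocks[:-k] if k else blocks
    return [word for block in rotated for word in block]


def build_design(word_sets):
    """Map (group, subgroup) to a list of (condition, presented words) pairs.

    word_sets holds the eight 18-word experimental lists in their single
    fixed presentation order; buffer lists are left out of this sketch.
    """
    square = latin_square(CONDITIONS)
    design = {}
    for group, subgroup in product(range(4), range(3)):
        cell = []
        for index, words in enumerate(word_sets):
            condition = square[group][index % 4]
            ordered = rotate_blocks(words, subgroup)
            if condition == "unsignaled-forget-12":
                ordered = ordered[:12]  # the third block is not presented
            cell.append((condition, ordered))
        design[(group, subgroup)] = cell
    return design
```

Under these assumptions each list serves once in each condition across the four groups, and each block of 6 words occupies each third of its list once across the three subgroups. With 84 subjects spread over the 4 × 3 = 12 cells, each subgroup contains 7 subjects.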

2. Results and Discussion

The immediate free recall data are summarized in Fig. 1. The recency effect is strong, and its form appears to reflect the fact that the last six items were identified as such during presentation. Inspection of the response sheets revealed that these items were usually recalled in their presentation order. Of more interest are the results for the final recall test, which are summarized in Fig. 2. Before focusing on the critical recency data, three points should be made about the overall level of recall. First, the probability of a subject “recalling” a nonpresented control word (selected from the same population as the presented words) was less than 1%. The probability of producing correct words purely by guessing can therefore be considered negligible. Second, recall was substantially and uniformly higher for lists for which an immediate recall test was given. Thus, in line with previous findings (e.g., Darley & Murdock, 1971), there was a

Fig. 1. Probability of immediate recall as a function of serial position (Experiment 1). [Horizontal axis: serial position, in blocks 1-3, 4-6, 7-9, 10-12, 13-15, and 16-18.]

beneficial effect of the immediate test. Third, overall level of recall among conditions for which there was no immediate recall test did not differ significantly. In particular, there was no significant difference among the recall levels for the 12-item unsignaled-forget lists, the first 12 items of the 18-item unsignaled-forget lists, and the first 12 items of the signaled-forget lists, F(2, 22) = 1.27, p > .10.* Consider now the final recall data for the recency items. Perhaps the first point to note is that, consistent with the findings of Watkins and Watkins (1974), a negative recency effect was obtained in the signaled-recall condition. For statistical purposes, we defined negative recency as the negative linear component of trend across the last six serial positions; and for the signaled-recall condition this effect was significant, t(11) = 4.87, p < .01. A significant negative recency effect was also found in the signaled-forget condition, t(11) = 3.83, p < .01. Although interpreting relative magnitudes of trend when mean levels of performance differ is a hazardous business, there appeared to be no striking difference in the degree of negative recency between the two signaled conditions.

*Since each subject received only two lists in each condition, subject groups (actually the 7-subject subgroups referred to in describing the design of the experiment) rather than individual subjects served as the random variable.

We now turn to the final recall data for the critical unsignaled-forget lists. For neither list length was there a significant negative recency

Fig. 2. Probability of final recall as a function of serial position and condition (Experiment 1).

effect: For the 12-item unsignaled-forget lists, t(11) = .77, p > .10, and for the 18-item unsignaled-forget lists, t(11) = 1.26, p > .10. Even with data combined across list lengths, there was no significant negative recency effect, t(11) = 1.05, p > .10. Moreover, the degree of negative recency for the combined unsignaled-forget lists was significantly smaller than that for the signaled-forget lists, t(11) = 2.84, p < .01. The failure to find a negative recency effect in the unsignaled-forget lists (that is, with data uncontaminated by potential effects of previous recall or of a change in study strategy) is inconsistent with the idea that items continue to be learned after their own presentation intervals and, hence, is contrary to the rehearsal hypothesis.

E. CONCLUDING COMMENTS

As we have said already, covert, memory-based rehearsal has been assumed to play an important role in virtually all standard memory paradigms, and in none more so than in free recall. By contrast, our contention is that even for free recall neither the theoretical nor the evidential bases for this view are compelling. And we have reported an experiment that suggests the role of rehearsal in free recall to be minimal. It would be absurd to suppose that this single experiment will settle such an important issue; certainly, its conclusions need reinforcing with converging evidence. Furthermore, it is important that we be clear about the experiment’s inherent limitations. Some limitations stem from the precise formulation of the question addressed. In particular, the experiment asks whether items are rehearsed beyond their respective presentation intervals, and thus avoids the more obtuse question of whether items are rehearsed within their presentation intervals. Also the experiment does not allow us to assert that the subjects do not rehearse at all, but merely that they do not rehearse effectively, that is, in a way that increases the chances of subsequent recall. Even beyond restricting our conclusions to our formulation of the rehearsal hypothesis, our results may be specific to one or more of the procedural details. Although it is perhaps difficult to see how, it is possible that the subjects adopted a nonrehearsal strategy in this experiment for reasons having to do with the fact that all words within a list had the same initial letter, or that some lists would not have to be recalled, or that for some lists the last six words were underlined. At another level, we should note that even if these results are generalizable to most free recall experiments, one can readily imagine free recall procedures in which effective rehearsal is likely to occur. For instance, if the presentation rate were extremely slow, say, one word every minute, subjects would surely be driven to learn previous items. Also, if the presentation rate varied markedly within a list, it seems plausible that subjects would adopt a rehearsal strategy in an effort to equalize the time they devote to each word; indeed, there is evidence to this effect (cf. Hintzman, 1970; Waugh, 1970). Other conditions for which the rehearsal hypothesis may be valid have already been noted. Thus, further research may show that it applies to the first few items of a list, or possibly to explicitly categorized lists. Also the finding of “retroactive amnesia” for items immediately preceding high priority (or von Restorff) items implies that items continue to be learned, at least to a small extent, during the presentation interval of the immediately following item.
But all these qualifications and limitations notwithstanding, it seems to us that the role of rehearsal in the learning that goes on in standard free recall experiments is less substantial than usually assumed. And if there is doubt concerning the importance of rehearsal in the free recall paradigm, there would be strong reasons to doubt its importance in most other familiar paradigms of memory research.
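The trend test reported for Experiment 1 defines negative recency as the negative linear component of trend across the last six serial positions. A minimal sketch of one way to compute such a test follows. The chapter does not report its computational details, so the orthogonal-polynomial weights, the one-sample t test on contrast scores, and the function names `linear_trend_score` and `one_sample_t` are assumptions made for illustration.

```python
import math
from statistics import mean, stdev

# Standard linear orthogonal-polynomial weights for six equally spaced levels.
LINEAR_WEIGHTS = [-5, -3, -1, 1, 3, 5]


def linear_trend_score(recall_by_position):
    """Linear contrast over the last six serial positions.

    recall_by_position is a sequence of recall probabilities, one per
    serial position. A negative score means recall declines toward the
    end of the list, that is, negative recency.
    """
    last_six = recall_by_position[-6:]
    return sum(w * p for w, p in zip(LINEAR_WEIGHTS, last_six))


def one_sample_t(scores):
    """Return (t, df) for testing whether the mean contrast score is zero."""
    n = len(scores)
    standard_error = stdev(scores) / math.sqrt(n)
    return mean(scores) / standard_error, n - 1
```

In this sketch one contrast score would be computed per subject subgroup and the scores passed to `one_sample_t`. With the 12 subgroups serving as the random variable, the test has 11 degrees of freedom, which matches the t(11) values in the Results.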

IV. Rehearsal of Nonverbal Stimuli

As if to compensate for the importance they attach to rehearsal in one area of inquiry, psychologists tend, as we have discussed earlier, to ignore its role in other areas. In this section we take a tiny step toward redressing this imbalance by considering the role of rehearsal in the

learning of nonverbal material. Such consideration is not only of interest for its own sake, but it should also provide a valuable perspective for interpreting the verbal rehearsal findings. One basic question that immediately arises is whether and to what extent rehearsal can facilitate the learning of nonverbal information. The answer to this question will no doubt depend on a good many factors, not the least important of which is the kind of rehearsal being referred to. For instance, stimulus-based rehearsal is just as certainly effective in learning nonverbal material as it is in learning verbal material: We gradually but surely come to know a new face, voice, or piece of music. And just as certain is the effectiveness of overt rehearsal in the acquisition of manual skills. But the effectiveness of other forms of rehearsal in learning nonverbal materials and skills is not so obvious. Consider rehearsal in a sense closer to that used by contemporary psychologists, specifically, rehearsal that is memory based and covert. It would be rash to assume that just because such rehearsal can be effective for verbal material (and it surely can be, even though its role may have been somewhat overstated) it can also be effective for nonverbal material. Perhaps higher mental functioning can be used only for verbal information. For instance, recent research from our laboratory (Schiano & Watkins, 1981; see also Conrad, 1971) has suggested that the mental activity involved in the short-term retention of pictures (or at least of readily nameable pictures) is verbal in nature. Or if our introspections tell us that nonverbal material can be rehearsed, maybe such rehearsal is ineffective, just as under certain conditions rehearsal of verbal material is ineffective (e.g., Craik & Lockhart, 1972; Craik & Watkins, 1973). It seems, therefore, that whether nonverbal material can be learned through memory-based, covert rehearsal is an empirical question.
What little relevant research has been reported on this issue has involved pictures; and this research has shown that rehearsal of pictures, at least when controlled by the experimenter, can increase the likelihood of their subsequent identification. Rather than describing this research in detail, we will summarize it briefly and then report three new experiments in which its logic and procedures are applied to the question of whether memory-based, covert rehearsal is effective in learning nonverbal auditory information. Within the picture rehearsal research, it is convenient for our purposes to distinguish three paradigms. The first was introduced by Shaffer and Shiffrin (1972) and entails presenting subjects with a sequence of pictures and varying the interval between successive pictures. With few exceptions (Bird & Cook, 1979; Shaffer & Shiffrin, 1972), these studies have shown that the probability of a picture being identified in a subsequent

recognition test increases slightly with the duration of the interval separating its presentation in the study sequence from that of the next picture (Hines & Smith, 1977; Intraub, 1979; Lutz & Scheirer, 1974; Read, 1979; Tversky & Sherman, 1975; Weaver, 1974; Weaver & Stanny, 1978). This finding is consistent with the idea that subjects are able to rehearse to advantage during the interstimulus intervals. Our first experiment in the present series (Experiment 2) was a skeletal version of this same procedure, but with auditory stimuli: Subjects heard a series of auditory stimuli, some of which were followed by a lengthy unfilled interval and others directly by the next stimulus; they were then given a recognition test in which they tried to identify the presented stimuli. At issue was whether stimuli that were followed by an unfilled interval would be better identified than those that were not. The second paradigm in the picture rehearsal research asks whether an unfilled interval has its beneficial effect regardless of the subject’s intentions, or whether the benefit requires a deliberate effort. In a series of experiments, Graefe and Watkins (1980) presented pictures two at a time, with unfilled intervals between successive pairs. At the beginning of these intervals subjects were cued to rehearse one member of the pair just presented. There followed a test in which subjects had to discriminate the pictures they had seen, both cued and uncued, from new pictures. In all experiments, discrimination of cued pictures was superior to that of the uncued pictures, implying that subjects did have control over the effects of the unfilled interval. In Experiment 3, we applied Graefe and Watkins’ procedure to auditory stimuli. 
The third picture rehearsal paradigm was prompted by the question of whether a beneficial effect of rehearsal would be restricted to the most recently presented picture or whether it could be extended to pictures that were, to use James’s (1890) term, more genuinely of the psychological past. Specifically, Watkins and Graefe (1981) presented short sequences of pictures, and after each sequence cued for the rehearsal of one or more of the pictures. They found that the cued pictures had a greater chance of being identified in a later test, even when they had not been the last in their sequence. This result indicates that effective memorization of a picture can extend well beyond what could conceivably be regarded as its own presentation interval. The fourth experiment we report used this procedure to determine whether the selective rehearsal of auditory stimuli could similarly extend to stimuli other than the most recently presented. In all three experiments we report here, the stimuli were utterances of three-word phrases. Although the stimuli were verbal, the experiments were so designed that identification in the recognition test would have to be made solely on the basis of the stimuli’s nonverbal characteristics. Specifically, the utterances of the study list were presented in a variety of

voices; and in the recognition test each utterance had to be discriminated from another of the same phrase, but in a different voice. Presumably, therefore, performance was based on intonation, voice quality, or both. The reasons for using utterances as auditory stimuli may be worth noting: They are easily obtained; their verbal contents provide convenient and unambiguous labels; and it also seems likely that, having an obvious ecological validity, they would not be too difficult to remember (cf. Craik & Kirsner, 1974; Geiselman & Glenny, 1977; McGehee, 1937). This last point was important for our purposes because testing of the uncued items had to be unexpected; therefore, all stimuli had to be presented before testing could begin.

A. EXPERIMENT 2

Subjects heard a sequence of three-word phrases, each presented in one of many different voices. For each subject, half of the utterances were followed by a short (2-sec) interval and half by a long (20-sec) interval. Subjects were led to expect a recognition test of only the utterances that were followed by long intervals, and they were instructed to use these intervals for rehearsal. In fact, the test required recognition of all of the utterances of the study sequence; each had to be discriminated from a distractor utterance of the same phrase. Of interest was whether probability of correct identification would be higher for utterances followed by the longer interval.

1. Method

a. Stimuli and Design. The stimuli were utterances of 40 three-word phrases, such as “she should know,” “nasty traffic jam,” and “don’t touch me.” They were initially recorded on a master tape, from which the tapes for both this experiment and Experiment 3 were constructed. The recording voices were those of 20 university faculty, staff, and students (10 male and 10 female) who were selected without regard to their voice distinctiveness or accent. These volunteers, who did not serve as subjects in either experiment, each recorded 4 phrases, 2 to be presented in the study sequence and then as targets in the recognition test, and 2 to serve as distractors in the recognition test. Half of the phrases were recorded in two male voices and half in two female voices; within this constraint, assignment of phrases to voices was random. A tape for a two-alternative recognition test was prepared from the master tape simply by recording in turn the two utterances of each of the 40 phrases. For the study sequence, one utterance of each pair was

178

Michael J. Watkins and Zehra F. Peynircioglu

selected at random, subject to the constraint that each voice be represented in exactly two utterances. The ordering of these utterances was scrambled with respect to the ordering in the test sequence, and was the same for all subjects. There were, however, two versions of the study sequence. For one version a random half of the utterances were followed by a silent interval of 20 sec and the other half by an interval of only 2 sec; for the other version the complementary sequence of intervals was used. Half of the subjects were given one version of the study sequence and the remainder the other, so that across subjects each utterance occurred equally often in each rehearsal condition. Construction of the experimental tapes from the master tape involved the use of two identical tape recorders (Akai model 1722 II).

b. Subjects and Procedure. The subjects, 16 university students, were tested individually or in small groups. They heard a long sequence of phrases, each uttered in one of many different voices. Some of the utterances were followed by a lengthy silent interval, which the experimenter referred to as a rehearsal interval. Subjects were instructed to use each rehearsal interval “to imagine as vividly as possible hearing the just-presented utterance over and over again” in preparation for a recognition test to be taken immediately following the study sequence. Subjects were led to believe that utterances not followed by a rehearsal interval would not occur in the recognition test and could, therefore, be forgotten. Approximately 1 sec after the offset of each utterance the experimenter held up a card with either a cross or a check on it. A cross indicated that the next utterance would occur almost immediately, a check that a rehearsal interval was beginning. To encourage subjects to take the rehearsal instructions seriously, they were required to rate, toward the end of each rehearsal interval, how successful they judged their rehearsal to have been. This they did according to a 5-point rating scale, ranging from “not at all well” (1) to “very well” (5). Presentation of the study sequence was followed by a two-alternative forced-choice recognition test. Contrary to the subjects’ expectations, this test included all of the study sequence utterances, those that were not cued for rehearsal as well as those that were. The utterances of each test pair were of the same phrase, but in different voices, and the subjects’ task was to indicate the one they thought had occurred in the study sequence. In addition, they were to rate their confidence in each decision, using a 5-point scale that ranged from “pure guess” (1) to “completely certain” (5). To familiarize subjects with the study and test procedures, the experiment began with a short practice session. The study sequence comprised four utterances, two of which were followed by a rehearsal interval. The
details for both study and test were as for the main part of the experiment except that the two utterances that were not cued for rehearsal were not included in the test.

2. Results and Discussion

Responses were analyzed both in terms of the proportion correct and in a way that takes into account the confidence with which they were made. With respect to proportion correct, a chance score (i.e., one made in the absence of any memory for the target utterance) would be approximately .50. In fact, the proportion correct was .79 for cued utterances and .68 for uncued utterances, and the advantage in favor of the cued condition was statistically significant, t(15) = 2.95, p < .01. The same conclusion emerged when each response was scored according to a 10-point scale ranging from an incorrect response rated as “entirely certain” (1), through an incorrect and a correct response rated as “pure guess” (5 and 6, respectively), to a correct response rated as “entirely certain” (10). The mean such “recognition score” was reliably greater for cued utterances (7.39) than for uncued utterances (6.58), t(15) = 5.83, p < .01. It is therefore clear that, as in the case of picture memory (e.g., Tversky & Sherman, 1975; Weaver, 1974), memory for an utterance depends on the circumstances immediately following its occurrence. It may be inappropriate, however, to conclude on the basis of this finding alone that utterances can be effectively rehearsed. Rehearsal connotes a deliberate intention to learn, and it is possible that the benefit of providing a lengthy, silent interval observed in Experiment 2 accrued automatically, regardless of any effort to rehearse. Indeed, such an account of our findings is provided by a theory of echoic memory proposed by Morton (1970; see also Routh, 1971). According to this theory, information retained in an echoic buffer store is passively transferred into a long-term store. The echoic store holds a strictly limited amount of information which is displaced or erased with the occurrence of subsequent auditory information. Some of our own research suggests that, in the absence of subsequent speech, echoic information about spoken material can survive for at least the 20-sec interval used in the present experiment (Watkins & Todres, 1980; Watkins & Watkins, 1980), which means that the Morton-Routh theory can explain the finding of Experiment 2 by the mere existence of the silent intervals, and without appealing to any intentions to rehearse on the part of the subject. That in fact rehearsal underlies the effect of the silent interval is suggested by the finding that probability of identification for cued utterances
increased with the success-in-rehearsal rating given during the study phase. A correlation coefficient relating this rating to recognition score was computed for each subject, and the mean of these coefficients (.14) was reliably greater than zero, t(15) = 2.35, p < .025. Of course, this evidence is merely correlational, and its implications should be regarded as tentative. The next experiment addresses the question more directly.

B. EXPERIMENT 3

Experiment 3 is an auditory analog of the Graefe and Watkins (1980) experiments already described. Utterances were presented in simultaneous pairs with a lengthy silent interval between each pair; at the beginning of these intervals, subjects were cued to rehearse 1 of the 2 utterances just presented. There is evidence that, at least in the case of tones and of intervals up to 2 sec, subjects can make use of a postpresentation instruction in deploying their attention (Kinchla, 1973). At issue in the present experiment is whether this attention or rehearsal translates into more accurate memory at some later time. The study phase was therefore followed by a test phase, in which both the cued and uncued utterances had to be distinguished from new utterances, exactly as in the previous experiment.

1. Method

a. Materials and Design. The utterances used were the same as in the previous experiment. For the study sequence the 40 targets were recorded in simultaneous pairs, using two channels of a tape recorder, for presentation at a rate of one pair every 30 sec. The two utterances in each pair were always in different voices, and both were presented over each of two speakers. For any given subject, the choice of utterances cued for rehearsal was random, though with the use of two complementary randomizations and two equal groups of subjects each utterance was assigned to the cued and uncued conditions equally often. The recognition test involved a forced-choice decision between each of 40 pairs of utterances. The design of the test was exactly as for the previous experiment, so that the two utterances of a pair were of the same phrase and occurred in close succession.

b. Subjects and Procedure. The subjects were 16 undergraduates, and they were tested individually or in small groups. In the first, or study, phase of the experiment they heard a long sequence of utterances
presented literally 2 at a time, with a “rehearsal” interval between successive pairs. Pilot research had shown that as often as not, the utterances could not be understood with this form of presentation, and this distressed the subjects. The problem was resolved by showing the subjects a written version of the 2 phrases 3 sec prior to their auditory presentation. Approximately 1 sec after the auditory presentation the experimenter held up a card showing 1 of the 2 phrases just uttered. As in the previous experiment subjects had been told to use the unfilled intervals to rehearse the cued utterances, and as before two additional measures were taken to help subjects focus their attention on only these utterances. First, subjects were given a short practice sequence for which only the cued utterances were tested, thereby setting up an expectation that this would also be true for the main sequence. Second, at four points during the rehearsal interval subjects were required to rate their mental activities. Specifically, they indicated on continuous lines marked from “very negative” to “very positive” their responses to each of the following questions: (a) How successful were you in isolating the cued phrase? (b) How clearly can you hear the cued phrase? (c) How clearly can you still hear the phrase? (d) How successful were you in blocking out the unwanted phrase? Although not paced, subjects had been encouraged to respond to these questions at a roughly even rate. A tone, sounded 27 sec after the onset of the utterances, signaled the subject to stop rehearsing and to read the next pair of phrases. After 20 pairs of utterances had been presented this way, the subjects were given a recognition test in which each utterance in the study sequence, whether cued or uncued, had to be discriminated from another utterance of the same phrase. They rated their confidence in each decision according to a 5-point scale as in the previous experiment.

2. Results

The mean proportion of utterances correctly identified was reliably greater for cued (.65) than for uncued (.58) utterances, t(15) = 2.56, p < .02. The mean recognition score, defined as in Experiment 2, was also reliably greater for the cued (6.48) than for the uncued (5.90) utterances, t(15) = 3.43, p < .01. We might note incidentally that recognition scores correlated reliably with three of the four rating responses made during the rehearsal intervals, the exception being the last, which had to do with the blocking out of the uncued utterances. Clearly, the results of this experiment parallel those of the Graefe and Watkins (1980) picture study. They imply that the effect of a rehearsal interval observed in the previous experiment is at least partially under the subject’s control.
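The two performance measures used in Experiments 2 and 3 can be made concrete with a brief sketch. The Python below is an illustration only, not the authors' analysis procedure: the function names are hypothetical, the mapping assumes that a confidence rating of 1 (“pure guess”) to 5 is folded around the correct/incorrect boundary in the way the 10-point scale is described for Experiment 2, and the per-subject numbers are invented solely to show the form of a paired t computation with n - 1 degrees of freedom.

```python
import math
import statistics

def recognition_score(correct, confidence):
    """Fold correctness and a 1-5 confidence rating into the 10-point scale."""
    if confidence not in (1, 2, 3, 4, 5):
        raise ValueError("confidence must be an integer from 1 to 5")
    # Correct responses run from 6 (pure guess) to 10 (certain);
    # incorrect responses run from 5 (pure guess) down to 1 (certain).
    return 5 + confidence if correct else 6 - confidence

def paired_t(cued, uncued):
    """Paired t statistic and degrees of freedom for per-subject means."""
    diffs = [c - u for c, u in zip(cued, uncued)]
    n = len(diffs)
    t = statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))
    return t, n - 1

# Invented per-subject mean recognition scores (NOT data from the study).
cued_means = [7.0, 6.5, 8.0, 7.5]
uncued_means = [6.0, 6.0, 7.0, 7.0]
t_value, df = paired_t(cued_means, uncued_means)
```

With 16 subjects, as in Experiments 2 and 3, the same computation would yield 15 degrees of freedom, matching the t(15) values reported above.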

C. EXPERIMENT 4

The purpose of this final experiment was to see whether the control the subject has in selecting an utterance for rehearsal can extend to the temporal domain. Particularly, can the subject selectively rehearse an utterance if it is not the most recently presented? This is the question Watkins and Graefe (1981) asked about pictures, and the present experiment was fashioned after the experiments in their study. In brief, subjects heard three successive three-word utterances and were then cued to rehearse one of them. At issue was whether in a later recognition test cued utterances could be discriminated from new utterances more often than could uncued utterances; and more specifically whether this advantage would extend to the first- and second-presented utterances. One reason for doubting whether the effect of rehearsal extends to an utterance other than the one most recently presented is the possibility that the effect is mediated by echoic memory. It is generally agreed that echoic memory is limited to the most recently presented material: no more than, say, the last two or three words. If so, and if echoic memory does mediate the rehearsal effect, then it might be expected that only the most recent of our three-word utterances could be rehearsed to advantage.

1. Method

a. Materials and Design. A new set of 108 utterances was recorded. They were again of three-word phrases, and indeed they included most of the 40 phrases used in the previous two experiments. There were in fact 54 phrases, each recorded in two voices. Fifty-four university staff, faculty, and students (selected as before, without regard to voice distinctiveness or accent) each recorded two phrases such that overall half of the phrases were recorded in two different male voices and half in two different female voices; of course, none of these people served in the experiment as subjects. The 108 utterances were thus paired by verbal content, and these 54 pairs constituted the successive items of the two-alternative forced-choice recognition test. One member of each pair was chosen for presentation in the study phase of the experiment, the choice being random save that each voice was included exactly once. These 54 utterances were then formed into 18 three-utterance blocks, 9 of which were of female-voiced utterances and 9 of male-voiced utterances. Each subject was cued to rehearse one utterance from each block, such that six were from the first position, six from the second, and six from the third. The position of the cued utterance varied unpredictably from block to block. The within-block positions of the utterances and the position of the utterance cued for rehearsal were rotated across subjects so that each
utterance occurred in each position for 1/3 of the subjects and was cued in each position for 1/9 of the subjects.

b. Subjects and Procedure. The subjects, 36 undergraduates, were tested in groups of 2 to 4. They heard a long series of three-utterance blocks presented at an approximate rate of 1 block every 30 sec. The utterances within a block were separated by 1- to 2-sec pauses; and 2 sec after the third utterance the experimenter indicated which utterance to rehearse by holding up a card showing the number 1, 2, or 3. If they were uncertain about which utterance was being cued, subjects could refer to a list of the three utterances printed in order on the page of a booklet before them; 20 sec were allowed for rehearsal of the cued utterances. Rehearsal instructions were as in the previous two experiments. At the end of the 20-sec interval, a tone signaled the subjects to rate their success in rehearsal, and to turn to the next page of their booklets in preparation for the next block of utterances. The test phase of the experiment was precisely as in the previous two experiments, with each of the utterances presented in the study phase, whether cued or uncued, being paired with an utterance of the same phrase in a different voice. As before, subjects had to select from each pair the utterance that had occurred in the study sequence and for each choice to rate their confidence according to a 5-point scale. Also, as in the previous two experiments, subjects were led to expect that only the cued utterances would occur in the recognition test. This expectation was established by means of a practice session in which three blocks of three utterances were presented, one followed by a cue for rehearsal of the first utterance, one by a cue for the second, and one by a cue for the third. The test that followed involved the cued utterances (each paired with another of the same verbal content), but not the uncued utterances. The 12 utterances used in this practice involved none of the phrases or voices used in the main part of the experiment.

2. Results and Discussion

Recognition of cued and uncued utterances as a function of within-block position is summarized in Table I; the measures of performance shown are the proportion of correct choices and the recognition score obtained by taking the confidence ratings into account in the manner described for Experiment 2. The important finding is that, by either measure, there was an advantage for cued utterances even if at the time of cuing they were not the most recently presented. Indeed, position of occurrence within the utterance block had no obvious effect. The cued advantage was reliable at each position, both by the proportion correct measure, t(35) = 3.58, p < .01, t(35) = 2.40, p < .02, and t(35) = 4.10, p < .01, respectively; and by the recognition scores, t(35) = 4.23, 5.52, and 7.87, respectively (p < .01 in each case). It would seem therefore that the effect of rehearsal found in the previous two experiments was not mediated by echoic memory, or else the capacity of echoic memory is substantially greater than is usually assumed. A subsidiary finding, consistent with the previous two experiments, was that the rehearsal ratings correlated reliably with the recognition scores for the cued utterances, r = .22, t(35) = 6.23, p < .01.

TABLE I
RECOGNITION PERFORMANCE FOR CUED AND UNCUED UTTERANCES, SCORED ACCORDING TO TWO CRITERIA, AS A FUNCTION OF WITHIN-BLOCK POSITION (EXPERIMENT 4)

                        Proportion of correct choices      Recognition score
Position in block       Cued          Uncued               Cued       Uncued
First                   .79           .67                  7.56       6.67
Second                  .78           .70                  7.86       6.87
Third                   .80           .69                  7.71       6.60
Positions collapsed     .79           .69                  7.71       6.71

D. CONCLUDING COMMENTS

The three experiments reported in Section IV demonstrate that memory for nonverbal auditory information can be enhanced through rehearsal. Each experiment involved the presentation of a sequence of utterances of three-word phrases, followed by a test in which each utterance had to be discriminated from another utterance of the same phrase. The first of these experiments (Experiment 2) showed that discrimination was more accurate when the utterances had been followed by lengthy silent intervals. In Experiment 3, utterances were presented in simultaneous pairs with an unfilled interval between each pair, and subjects were cued at the beginning of each interval to rehearse one of the two utterances just presented. Subsequent identification was more probable for cued than for uncued utterances, indicating that the benefit of the unfilled interval depends upon the intention of the subjects and does not accrue automatically. Experiment 4 involved presenting the utterances in blocks of three and cuing just one from each block. Cuing for rehearsal of an utterance again enhanced its identification, regardless of whether it was presented in the first, second, or third position within a block. This set of experiments parallels previous research on picture rehearsal and thereby extends our knowledge of the rehearsability of nonverbal materials. Of course, the extension is a very modest one, and a great deal more research remains to be done. There is a need to explore the effect on utterance memory of study-to-rehearsal lag. It seems reasonable that with long lags the effect of rehearsal may decline (cf. Watkins & Graefe, 1981), though perhaps this relation may in turn depend on the rehearsal-to-test delay (cf. Glenberg & Lehmann, 1980). On a broader scale, there is a need to explore the role of rehearsal in the learning and remembering of other kinds of stimuli, both within the auditory modality and in other modalities. A fuller knowledge of the relative effects of rehearsal for different kinds of materials and skills is surely a prerequisite for a comprehensive understanding of the nature of rehearsal.

A central aspect of the nature of rehearsal concerns the appropriate description of its “code.” This question is of particular interest in the case of covert rehearsal. Does covert rehearsal take the same form, regardless of the material being rehearsed? It is sometimes assumed that rehearsal is nothing other than implicit speech, in which case the findings of our utterance rehearsal experiments and those of the picture rehearsal research would have to be explained in terms of subjects’ verbal descriptions. Many of these experiments were designed in such a way that the verbal description would have had to have been fairly elaborate, and our suspicion is that the code of rehearsal was a much more direct representation of the items as presented.
One way of addressing this issue would be to determine whether the rehearsal of verbal and of nonverbal items is differentially affected by a verbal and a nonverbal distractor task during the rehearsal interval (cf. Brooks, 1968).

V. Summary and Conclusions

Contemporary psychologists conceptualize rehearsal rather narrowly. Consistent with the Zeitgeist, they focus on rehearsal that is covert, memory based, and subject controlled. They do, however, consider this kind of rehearsal to be pervasive; and, in contrast to their predecessors, they attribute to it a significant explanatory role in a variety of experimental paradigms, above all in free recall. In this article the evidence for rehearsal in a typical free recall procedure was considered and found
inconclusive. We suggested that, to the contrary, the learning of an item is substantially complete by the presentation of the next item, and virtually entirely so by the presentation of the item after that. We also described an experiment that supported this suggestion. Although in some respects the role of rehearsal may be exaggerated, in others it has been virtually ignored. A case in point is the memory-based rehearsal of nonverbal material. Research on this topic is extremely sparse and essentially restricted to picture rehearsal. We have reported here a series of three experiments that extend this research by demonstrating effective experimenter-controlled rehearsal of nonverbal auditory information. We believe that a great deal more research involving a variety of materials and skills is needed. Certainly, any comprehensive theory of rehearsal will have to explain any variation in the extent to which different types of materials and skills lend themselves to effective rehearsal. We also need to compare for various materials and skills the relative effects of memory-based and stimulus-based rehearsal, and of covert and overt rehearsal. In addition, it also seems worth exploring the distinction between subject-controlled and experimenter-controlled rehearsal, that is, between the rehearsal subjects elect to do in a particular situation and that which the experimenter contrives for them to do. This distinction is not as straightforward as may first appear. We suggest that in some of the methods used for studying subject-controlled rehearsal, the control may well have inadvertently shifted from the subject to the experimenter. To take just one example, the overt rehearsal research has promoted the conclusion that subjects do rehearse, whereas the appropriate conclusion may be merely that subjects can rehearse.

ACKNOWLEDGMENTS

The research reported in this article was supported by National Institute of Mental Health Grant MH31674 to M. J. Watkins. The authors are grateful to Olga C. Watkins for her detailed comments on a draft of the article.

REFERENCES

Allen, M. Rehearsal strategies and response cueing as determinants of organization in free recall. Journal of Verbal Learning and Verbal Behavior, 1968, 7, 58-63.
Ambler, B., & Maples, W. Role of rehearsal in encoding and organization for free recall. Journal of Experimental Psychology: Human Learning and Memory, 1977, 3, 295-304.
Ashcraft, M. H., Kellas, G., & Needham, S. Rehearsal and retrieval processes in free recall of categorized lists. Memory and Cognition, 1975, 3, 506-512.
Atkinson, R. C., & Shiffrin, R. M. Human memory: A proposed system and its control processes. In K. W. Spence & J. T. Spence (Eds.), The psychology of learning and motivation: Advances in research and theory (Vol. 2). New York: Academic Press, 1968.
Bartlett, J. C., & Tulving, E. Effects of temporal and semantic encoding in immediate recall upon subsequent retrieval. Journal of Verbal Learning and Verbal Behavior, 1974, 13, 297-309.
Bellezza, F. S., & Hofstetter, G. P. Isolation, serial position, and rehearsal in free recall. Bulletin of the Psychonomic Society, 1974, 3, 362-364.
Bird, J. E., & Cook, M. Effects of stimulus duration and ISI on accuracy and transference errors in pictorial recognition. Memory and Cognition, 1979, 7, 469-475.
Bousfield, W. A. The occurrence of clustering in the recall of randomly arranged associates. Journal of General Psychology, 1953, 49, 229-240.
Brodie, D. A. Free recall measures of short-term store: Are rehearsal and order of recall data necessary? Memory and Cognition, 1975, 3, 653-662.
Brodie, D. A., & Murdock, B. B., Jr. Effect of presentation time on nominal and functional serial position curves of free recall. Journal of Verbal Learning and Verbal Behavior, 1977, 16, 185-200.
Brodie, D. A., & Prytulak, L. S. Free recall curves: Nothing but rehearsing some items more or recalling them sooner? Journal of Verbal Learning and Verbal Behavior, 1975, 14, 549-563.
Brooks, L. R. Spatial and verbal components of the act of recall. Canadian Journal of Psychology, 1968, 22, 349-368.
Bruce, D., & Papay, J. P. Primacy effect in single-trial free recall. Journal of Verbal Learning and Verbal Behavior, 1970, 9, 473-486.
Conrad, R. The chronology of the development of covert speech in children. Developmental Psychology, 1971, 5, 398-405.
Corballis, M. C. Patterns of rehearsal in immediate memory. British Journal of Psychology, 1969, 60, 41-49.
Craik, F. I. M. The fate of primary memory items in free recall. Journal of Verbal Learning and Verbal Behavior, 1970, 143-148.
Craik, F. I. M., & Kirsner, K. The effect of speaker’s voice on word recognition. Quarterly Journal of Experimental Psychology, 1974, 26, 274-284.
Craik, F. I. M., & Lockhart, R. S. Levels of processing: A framework for memory research. Journal of Verbal Learning and Verbal Behavior, 1972, 11, 671-684.
Craik, F. I. M., & Watkins, M. J. The role of rehearsal in short-term memory. Journal of Verbal Learning and Verbal Behavior, 1973, 12, 599-607.
Darley, C. F., & Glass, A. L. Effects of rehearsal and serial list position on recall. Journal of Experimental Psychology: Human Learning and Memory, 1975, 1, 453-458.
Darley, C. F., & Murdock, B. B., Jr. Effects of prior free recall on final recall and recognition. Journal of Experimental Psychology, 1971, 91, 66-73.
Ebbinghaus, H. Über das Gedächtnis. Leipzig: Ruyer & Bussenius, 1885. (Trans.), Memory. New York: Teachers College Press, 1913.
Fischler, I., Rundus, D., & Atkinson, R. C. Effects of overt rehearsal procedures on free recall. Psychonomic Science, 1970, 19, 249-250.
Geiselman, R. E. Positive forgetting of sentence material. Memory and Cognition, 1974, 2, 677-682.
Geiselman, R. E., & Glenny, J. Effects of imaging speakers’ voices on the retention of words presented visually. Memory and Cognition, 1977, 5, 499-504.
Glanzer, M., & Meinzer, A. The effects of intralist activity on free recall. Journal of Verbal Learning and Verbal Behavior, 1967, 6, 928-935.
Glenberg, A. M., & Lehmann, T. S. Spacing repetitions over 1 week. Memory and Cognition, 1980, 8, 528-538.
Graefe, T. M., & Watkins, M. J. Picture rehearsal: An effect of selectively attending to pictures no longer in view. Journal of Experimental Psychology: Human Learning and Memory, 1980, 6, 156-162.
Greitzer, F. L. Intracategory rehearsal in list learning. Journal of Verbal Learning and Verbal Behavior, 1976, 15, 641-654.
Hines, D., & Smith, S. Recognition of random shapes followed at varying delays by attended or unattended shapes, digits, and line grids. Journal of Experimental Psychology: Human Learning and Memory, 1977, 3, 29-36.
Hintzman, D. L. Effects of repetition and exposure duration on memory. Journal of Experimental Psychology, 1970, 83, 435-444.
Hintzman, D. L. Theoretical implications of the spacing effect. In R. L. Solso (Ed.), Theories in cognitive psychology: The Loyola Symposium. Hillsdale, New Jersey: Erlbaum, 1974.
Horton, K. D. Phonemic similarity, overt rehearsal and short-term store. Journal of Experimental Psychology: Human Learning and Memory, 1976, 2, 244-251.
Intraub, H. The role of implicit naming in pictorial encoding. Journal of Experimental Psychology: Human Learning and Memory, 1979, 5, 78-87.
James, W. The principles of psychology. New York: Holt, 1890.
Johnson, R. E. Memory-based rehearsal. In G. Bower (Ed.), The psychology of learning and motivation (Vol. 14). New York: Academic Press, 1980.
Kahneman, D., & Wright, P. Changes of pupil size and rehearsal strategies in a short-term memory task. Quarterly Journal of Experimental Psychology, 1971, 23, 187-196.
Kellas, G., McCauley, C., & McFarland, C. E. Reexamination of externalized rehearsal. Journal of Experimental Psychology: Human Learning and Memory, 1975, 104, 84-90.
Kinchla, R. A. Selective processes in sensory memory: A probe comparison procedure. In S. Kornblum (Ed.), Attention and performance IV. New York: Academic Press, 1973.
Leicht, K. L. Differential rehearsal and primacy effects. Journal of Verbal Learning and Verbal Behavior, 1968, 7, 1115-1117.
Locke, J. L., & Fehr, F. S. Subvocal rehearsal as a form of speech. Journal of Verbal Learning and Verbal Behavior, 1970, 9, 495-498.
Lutz, W. J., & Scheirer, C. J. Coding processes for pictures and words. Journal of Verbal Learning and Verbal Behavior, 1974, 13, 316-320.
McGehee, F. The reliability of the identification of the human voice. Journal of General Psychology, 1937, 17, 249-271.
Madigan, S. Effects of overt rehearsal on recall of paired associates. Bulletin of the Psychonomic Society, 1973, 1, 423-424.
Mandler, G. Organization and memory. In K. W. Spence & J. T. Spence (Eds.), The psychology of learning and motivation: Advances in research and theory (Vol. 1). New York: Academic Press, 1967.
Marshall, P. H., & Werder, P. R. The effects of the elimination of rehearsal on primacy and recency. Journal of Verbal Learning and Verbal Behavior, 1972, 11, 649-653.
Maskarinec, A. S., & Brown, S. C. Positive and negative recency effects in free recall learning. Journal of Verbal Learning and Verbal Behavior, 1974, 13, 328-334.
Melton, A. W. The situation with respect to the spacing of repetitions and memory. Journal of Verbal Learning and Verbal Behavior, 1970, 9, 596-606.
Morton, J. A functional model of memory. In D. A. Norman (Ed.), Models of human memory. New York: Academic Press, 1970.
Murdock, B. B., Jr., & Metcalfe, J. Controlled rehearsal in single-trial free recall. Journal of Verbal Learning and Verbal Behavior, 1978, 17, 309-324.
Palmer, S. E., & Ornstein, P. A. Role of rehearsal strategy in serial probed recall. Journal of Experimental Psychology, 1971, 88, 60-66.
Read, J. D. Rehearsal and recognition of faces. American Journal of Psychology, 1979, 92, 71-85.
Roediger, H. L., III, & Crowder, R. G. Instructed forgetting: Rehearsal control or retrieval inhibition (repression?). Cognitive Psychology, 1972, 3, 244-254.
Roediger, H. L., III, & Thorpe, L. A. The role of recall time in producing hyperamnesia. Memory and Cognition, 1978, 6, 296-305.
Roenker, D. L. Role of rehearsal in long-term retention. Journal of Experimental Psychology, 1974, 103, 368-371.
Routh, D. A. Independence of the modality effect and amount of silent rehearsal in immediate serial recall. Journal of Verbal Learning and Verbal Behavior, 1971, 10, 213-218.
Rundus, D. Analysis of rehearsal processes in free recall. Journal of Experimental Psychology, 1971, 89, 63-77.
Rundus, D., & Atkinson, R. C. Rehearsal processes in free recall: A procedure for direct observation. Journal of Verbal Learning and Verbal Behavior, 1970, 9, 99-105.
Rundus, D., Loftus, G. R., & Atkinson, R. C. Immediate free recall and three-week delayed recognition. Journal of Verbal Learning and Verbal Behavior, 1970, 9, 684-688.
Schiano, D. J., & Watkins, M. J. Speech-like coding of pictures in short-term memory. Memory and Cognition, 1981, 9, 110-114.
Schultz, L. S. Effects of high-priority events on recall and recognition of other events. Journal of Verbal Learning and Verbal Behavior, 1971, 10, 322-330.
Shaffer, W. O., & Shiffrin, R. M. Rehearsal and storage of visual information. Journal of Experimental Psychology, 1972, 92, 292-296.
Thompson, C. P., Wenger, S. K., & Bartling, C. A. How recall facilitates subsequent recall: A reappraisal. Journal of Experimental Psychology: Human Learning and Memory, 1978, 4, 210-221.
Tulving, E. Subjective organization in free recall of “unrelated” words. Psychological Review, 1962, 69, 344-354.
Tulving, E. Retrograde amnesia in free recall. Science, 1969, 164, 88-90.
Tversky, B., & Sherman, B. Picture memory improves with longer on time and off time. Journal of Experimental Psychology: Human Learning and Memory, 1975, 1, 114-118.
Ulich, E. Some experiments on the function of mental training in the acquisition of motor skills. Ergonomics, 1967, 10, 411-419.
Watkins, M. J., & Graefe, T. M. Delayed rehearsal of pictures. Journal of Verbal Learning and Verbal Behavior, 1981, 20, 276-288.
Watkins, M. J., & Todres, A. K. Suffix effects manifest and concealed: Further evidence for a 20-second echo. Journal of Verbal Learning and Verbal Behavior, 1980, 19, 46-53.
Watkins, M. J., & Watkins, O. C. Processing of recency items for free recall. Journal of Experimental Psychology, 1974, 102, 488-493.
Watkins, O. C., & Watkins, M. J. The modality effect and echoic persistence. Journal of Experimental Psychology: General, 1980, 109, 251-278.
Waugh, N. C. On the effective duration of a repeated word. Journal of Verbal Learning and Verbal Behavior, 1970, 9, 587-595.
Waugh, N. C., & Norman, D. A. Primary memory. Psychological Review, 1965, 72, 89-104.
Weaver, G. E. Effects of poststimulus study time on recognition of pictures. Journal of Experimental Psychology, 1974, 103, 799-801.
Weaver, G. E., & Stanny, C. J. Short-term retention of pictorial stimuli as assessed by a probe recognition technique. Journal of Experimental Psychology: Human Learning and Memory, 1978, 4, 55-65.
Weist, R. M., & Crawford, C. Sequential versus organized rehearsal. Journal of Experimental Psychology, 1973, 101, 237-241.
Welch, G. B., & Burnett, C. T. Is primacy a factor in association formation? American Journal of Psychology, 1924, 35, 396-401. Whitten, W. B., 11, & Bjork, R. A. Learning from tests: Effects of spacing. Journal of Verbal Learning and Verbal Behavior, 1977, 16, 465-478. Woodward, A. E., Jr., Bjork, R. A , , & Jongeward, R. H., Jr. Recall and recognition as a function of primary rehearsal. Journal of Verbal Learning and Verbal Behavior, 1973, 12, 608-617.

SHORT-TERM MEMORY FOR ORDER INFORMATION

Alice F. Healy
UNIVERSITY OF COLORADO
BOULDER, COLORADO

I. Introduction
   A. Item and Order Information
   B. Temporal Sequence and Spatial Location Information
II. Experiment 1
   A. Method
   B. Results
   C. Discussion
III. Experiment 2
   A. Method
   B. Results
   C. Discussion
IV. Experiment 3
   A. Method
   B. Results
   C. Discussion
V. Experiment 4
   A. Method
   B. Results
   C. Discussion
VI. Conclusions
   A. Independence of Item and Order Information
   B. Coding Strategies
   C. Mental Representations
References

I. Introduction

A. ITEM AND ORDER INFORMATION

In the development of models for short-term memory, one issue that has received an enormous amount of attention is the relationship between the retention of item information and the retention of order information. Actually, this issue can be divided into two distinct questions about the separation of item and order information. Are order and item information
separately represented in short-term memory? Are order and item information separately lost from short-term memory? Different models yield different answers to these questions, and a single model does not necessarily answer the two questions in the same way. For example, the model by Shiffrin and Cook (1978) includes separate and distinct mechanisms for the representation of item and order information (item bonds and relational bonds), but the loss of the two types of information is not independent; rather, in one version of their model, the loss of item bonds can cause the loss of relational bonds, and vice versa. Drewnowski’s (1980) attribute model also postulates distinct memory representations for item information and ordinal position information and includes different assumptions for the loss of these two types of information. However, some interaction between item and order loss is included in the model because it assumes that items that have been tagged for ordinal position are lost with one-third the probability of untagged items, although ordinal position information is assumed not to be forgotten. Furthermore, it is assumed that once a given item has been forgotten, its associated ordinal position tag cannot aid recall (Drewnowski, 1980, p. 211). Another link between item and order information occurs in the attribute model as well. Drewnowski postulates that information about the auditory (phonemic) properties of the items is used to reconstruct order information. In addition, Murdock (1976) has explored a Markov model which includes separate memory representations for item and order information, but postulates that forgetting order information precedes and increases the probability of forgetting item information (see Murdock, 1976, p. 195).
In contrast to these models, which postulate distinct but interdependent mechanisms for the loss of order and item information, the slot theory proposed by Conrad (1965) includes only a single forgetting mechanism, which is responsible for the interdependent loss of item and order information. According to this model, order errors are a by-product of item errors and completely dependent on them. Specifically, the memory codes for the to-be-remembered items are entered into an ordered series of slots, or positions, and retrieved from these slots at the time of recall. Forgetting occurs as a result of a decay process that causes the contents of the slots to become degraded, but does not cause any movement of information from one slot to another. In one sense, the perturbation model of Estes (1972) and Lee and Estes (1977) is the complement of Conrad’s slot model. The perturbation model also postulates only a single forgetting mechanism, but in this case the mechanism involves loss of positional information, namely perturbations in the timing of the cyclic reactivations of item representations. In addition, according to the recently augmented version of this model (Lee & Estes,
1981), the perturbation process is assumed to operate on different levels concurrently, thereby causing the independent loss of item and order information. Order errors result from the perturbation process for the encoding of the relative positions of the items within the trial, and item errors result from the perturbation process for the encoding of the relative position within the session of the trial on which the items appeared. This conception is sympathetic to that of Crowder (1979), who argues that there is no fundamental difference between item and order information, since each type of information locates a well-known item in a particular temporal-spatial position. In the case of item information, the location specified is “coarse grained,” referring to the list containing the item; whereas for order information, the location is “fine grained,” referring to the position of the item relative to other items in the same list.

One method I have used to help resolve this issue is to separate experimentally the retention of item and order information and determine whether the same pattern of results is found in the two cases (Healy, 1974). Specifically, I designed two experimental tasks that differed primarily in the amount and type of information provided in advance to subjects about the to-be-remembered material, and hence in the amount and type of information to be held in short-term memory. In one task (Order Only) subjects were told in advance the items that would be shown on a trial; they had to recall only their order. In the second task (Item Only) subjects were told in advance constraints on the order of the items that would be shown, so that they had to recall only item information. The patterns of results were different in these two situations, the most striking difference being in the serial position functions, which were bow shaped in the Order Only Experiment, but not in the Item Only Experiment.
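The idea behind a perturbation account of order loss, discussed above, can be made concrete with a small simulation. The sketch below is a deliberately simplified toy version and not the formulation of Estes (1972) or Lee and Estes (1977, 1981): it assumes that on each reactivation cycle every position has a fixed probability (the hypothetical parameter `theta`) of exchanging its item with a randomly chosen neighbor, and that a move past either end of the list simply does not occur. All names and parameter values are illustrative assumptions.

```python
import random


def simulate_perturbation(n_items=4, n_cycles=10, theta=0.1,
                          n_trials=5000, seed=0):
    """Toy perturbation process (illustrative assumptions only).

    Returns, for each serial position, the proportion of simulated trials
    on which that position still holds its original item.
    """
    rng = random.Random(seed)
    correct = [0] * n_items
    for _ in range(n_trials):
        order = list(range(n_items))  # order[pos] is the item now at pos
        for _ in range(n_cycles):
            for pos in range(n_items):
                if rng.random() < theta:
                    # Choose a neighboring position at random; a move
                    # beyond either end of the list is ignored.
                    neighbor = pos + rng.choice((-1, 1))
                    if 0 <= neighbor < n_items:
                        order[pos], order[neighbor] = order[neighbor], order[pos]
        for pos, item in enumerate(order):
            if item == pos:
                correct[pos] += 1
    return [c / n_trials for c in correct]


if __name__ == "__main__":
    for position, p in enumerate(simulate_perturbation(), start=1):
        print(f"position {position}: proportion correct = {p:.2f}")
```

Because end items can be displaced in only one direction under these assumptions, the simulated accuracy is higher at the first and last positions than in the middle, which gives a bow-shaped serial position function of the general kind described for order retention. Item information is never lost in this sketch, so it says nothing about item errors.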
These findings suggest that order and item errors are caused by two different processes; these results provide difficulty for Conrad’s (1965) model of a single forgetting mechanism. Nevertheless, these results are not able to discriminate among the remaining models for the retention of item and order information, which postulate either different (but interdependent) mechanisms for the representation of item and order information (Drewnowski, 1980; Murdock, 1976; Shiffrin & Cook, 1978) or similar mechanisms for representing item and order information but independent loss of the two representations (Lee & Estes, 1981). One empirical question which should help differentiate these models is whether the retention of item and order information draws on the same processing capacity or pool of resources (cf. Navon & Gopher, 1979). This question is the focus of the present study. In Experiment 1, I attempt to answer this question by determining whether the retention of temporal sequence information is
independent of the memory load for item information. In the subsequent three experiments, I try to answer this question by investigating whether the retention of spatial location information is affected by the properties of the to-be-remembered items. If temporal sequence and spatial location information are not affected by the amount and quality of the to-be-retained item information, the models postulating interdependencies in the loss of order and item information would seem to need some revision.

B. TEMPORAL SEQUENCE AND SPATIAL LOCATION INFORMATION

In most situations that involve visual information processing, the temporal sequence in which a set of characters is read corresponds to their spatial arrangement; typically, characters are read from left to right. However, the two types of order information can be unconfounded experimentally, and I have devised such an experimental situation for the short-term retention of letters. (See Healy, 1975, for a detailed discussion of the methods employed, and see Berch, 1979, for a comparison of my procedures to those used by other investigators.) Specifically, the standard task I have adopted is a modification of the distractor paradigm used, for example, by Bjork and Healy (1974), Conrad (1967), and Estes (1973). In earlier experiments with the distractor technique, characters occurred in only one spatial location of the display screen; but in my new technique, characters could occur in four different spatial locations arranged in a horizontal linear array. Four to-be-remembered letters were displayed successively on the screen, each letter occurring in only one of the four spatial locations, the remaining locations being left blank. The letters were not necessarily displayed from left to right, so that their spatial arrangement did not usually correspond to their temporal sequence. Following the presentation of the letters on a given trial and before their recall was a sequence of digits, with the length of the sequence varying from trial to trial and defining the retention interval for the trial. In order to isolate the retention of order information, as in the Order Only Experiment, the same four letters were shown on every trial of a subject’s session and the subject was told what these four letters would be in advance. All that the subject had to recall was the order of the letters, either their temporal sequence or their spatial location, depending upon the recall condition, which was varied between subjects.
Further, to isolate the retention of temporal sequence or spatial location information, the letters shown to subjects in the spatial location recall conditions were displayed in a fixed temporal sequence known by the subjects in advance of the trials; and the letters shown to subjects in the temporal sequence recall conditions were displayed in a fixed spatial arrangement.

The studies using this procedure revealed numerous striking differences between the retention of temporal sequence and spatial location information. In temporal sequence retention there was a preponderance of phonemic confusion errors, especially at the shortest retention interval; the retention function was quite steep; and recall level was dramatically affected by whether subjects responded aloud or silently to each of the interpolated digits. In contrast, in spatial location retention, the percentage of phonemic confusion errors was no greater than chance; the retention function was much flatter; and recall level was greatly affected by whether subjects responded with the name or the spatial location number of each of the interpolated digits, but not by the mode of response (aloud or silent). These and related observations (e.g., Healy, 1975, 1977) led me to suggest that the retention of temporal sequence information was based on phonemic coding, whereas the retention of spatial location information was based on the coding of the temporal-spatial pattern of letter presentations. For example, subjects asked to retain spatial location information would code the information that on a particular trial the first letter occurred in the right-most spatial location and the next three letters occurred successively from left to right, starting in the left-most location. A Markov model specifying the features of this coding process was developed (Healy, 1978) and tested by varying the relationship between the temporal-spatial pattern of the to-be-recalled letters and the temporal-spatial pattern of the digits interpolated during the retention interval. As predicted, the percentage of correctly recalled letters was greatly affected by the interpolated digit pattern in spatial location recall, but not in temporal sequence recall. Why do subjects use different coding strategies to retain the order of letters arranged in a temporal sequence and in a spatial array? 
The temporal-spatial pattern coding strategy should be available for temporal sequence recall as well as for spatial location recall, but the strategy may not be used if subjects ignore the spatial locations of the stimulus letters, as they may tend to do in temporal sequence recall. In support of this hypothesis, an earlier experiment (Healy, 1977, Experiment 3) demonstrated that subjects did employ pattern coding in temporal sequence recall when a new method of response forced them to attend to the spatial locations of the stimulus letters. Even in this situation, the subjects used phonemic coding in conjunction with the pattern-coding strategy. Since phonemic coding is so prevalent in the retention of temporal sequences (e.g., Healy, 1974; Sperling & Speelman, 1970), it may seem surprising that subjects did not use phonemic coding in spatial location recall. However, phonemic coding may not have been a reasonable coding strategy in the spatial location recall conditions employed in the earlier studies by Healy (1975, 1977, 1978), since the subjects said the same thing on every
trial of their session because the temporal sequence of letters was fixed. Hence, a record of what was said would not have been helpful for spatial location retention in that situation. Perhaps subjects choose a phonemic coding strategy whenever they can rely on a memory code that reflects what they heard themselves say. Experiment 2 tests the hypothesis that the coding strategy chosen will depend on what the subjects say during the presentation of the stimuli. Alternatively, subjects may not use phonemic coding in spatial location recall conditions because they attend to the visual properties of the letters, rather than their sound. In fact, unreported analyses of the errors in an earlier experiment (Healy, 1978, Experiment 2) indicated more visual confusions than phonemic confusions in spatial location retention, whereas more phonemic than visual confusions were found in temporal sequence retention. This phenomenon is examined more thoroughly in Experiments 3 and 4 in the present article.

II. Experiment 1

When asked to recall the temporal sequence of a short list of letters or words, subjects have been found to recode the information into a phonemic format, even when the to-be-remembered items are presented visually (e.g., Conrad, 1964). How flexible is this coding strategy? Can subjects develop a more efficient recoding scheme when given some information about the to-be-remembered items in advance of presentation? Such information available to the subject prior to a trial has been termed the “schema” by Lee and Estes (1981). Whereas some investigators have argued that subjects can use schema information to improve their guessing at the time of recall of the to-be-remembered items, but not to improve their coding and retention of the to-be-remembered items (Drewnowski, 1980; Healy, 1974), others have implied that subjects can use schema information to improve coding and retention (e.g., Brown, 1958; Crossman, 1961). In an earlier study (Healy, 1974), I compared two experimental conditions in which subjects were to recall the temporal sequence of a list of four consonants. In one (Bjork & Healy, 1974), the subjects knew nothing about which four to-be-remembered letters would be presented on a given trial. In the second (the Order Only Experiment described earlier), the subjects were told which four letters would occur and this information was made salient by the fact that the same four letters occurred on each of the 72 experimental trials in a given subject’s session. I found that the percentages of correct recall responses were higher when the identity of
the letters was known in advance, but there was no difference in the percentages of correct recall responses between the two experimental conditions when only memory for order information was considered. The percentages of correct responses in the Order Only Experiment were essentially equivalent to the conditional percentages of recalling the correct position of an item, given that the item was correctly recalled in the Bjork and Healy (1974) experiment. With a task similar to the Order Only Experiment but involving consonant-vowel-consonant trigrams, rather than letters, as to-be-remembered items, Drewnowski (1980) also demonstrated that there was no need to assume that the experimenter-supplied item information was actively represented in the subject’s memory during the trial. Drewnowski was able to fit his attribute model for short-term retention with a single set of parameter values to both his order-only condition and an analogous item-only condition, in which information about the order, but not the identity, of the items was given in advance to the subjects. One of the parameters in the attribute model (Pk) represents the probability of losing item information within a .5-sec interval when k items are currently in memory, and a second parameter (P,) represents the probability of storing order information in the form of position tags. Drewnowski did not find the best fit to the data from his order-only condition when Pk = 0, as would have been the case if subjects had been able to retain perfectly the redundant item information. Likewise, Drewnowski did not find a higher value for P , in his item-only condition, even though subjects in that condition were given in advance constraints on the order of the trigrams. Rather, the best fitting values for P , and P , were found to be the same in the two experimental conditions, suggesting that subjects used the same coding strategy in the two cases. 
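The distinction between unconditional percentages of correct responses and conditional percentages of correct positions, given correct items, can be illustrated with a short scoring sketch. The scoring rule used here is an assumption made for illustration, since the source does not spell out its exact procedure: a letter counts as an item-correct response if it appears anywhere in the written response, and as fully correct if it also appears in the position in which it was presented. The function name and the example trials are hypothetical.

```python
def score_trials(trials):
    """Score (presented, recalled) pairs of equal-length letter strings.

    Returns a tuple (unconditional, conditional), where
    unconditional = % of presented letters recalled in the correct position,
    conditional   = % of item-correct letters that are also position-correct.
    The scoring rule is an illustrative assumption, not the source's.
    """
    n_positions = 0       # total number of presented letters
    n_item_correct = 0    # presented letter appears somewhere in the response
    n_fully_correct = 0   # presented letter appears in its own position
    for presented, recalled in trials:
        for pos, letter in enumerate(presented):
            n_positions += 1
            if letter in recalled:
                n_item_correct += 1
                if recalled[pos] == letter:
                    n_fully_correct += 1
    unconditional = 100.0 * n_fully_correct / n_positions
    conditional = (100.0 * n_fully_correct / n_item_correct
                   if n_item_correct else float("nan"))
    return unconditional, conditional


if __name__ == "__main__":
    # Two made-up trials: one pure order error, one pure item error.
    example = [("BPKM", "BPMK"), ("FSHL", "FSHB")]
    print(score_trials(example))
```

In the made-up example, five of the eight presented letters are written in their correct positions (62.5% unconditional), but only seven letters were recalled as items at all, so the conditional percentage is 5/7, or about 71%. When the identity of the letters is supplied to the subject, every item is necessarily correct and the two measures coincide.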
Experiment 1 of the present study was designed to provide a direct test of the hypothesis that schema information about the identity of to-be-remembered items helps the subject to code or retain order information more efficiently. As in my earlier study (Healy, 1974), an order only condition, in which subjects knew that the same four letters would occur on all experimental trials, was compared to an item + order condition, in which the identity of the four letters shown varied from trial to trial. In addition, a third experimental condition (the item + order - item condition) was constructed, in which the identity of the four letters varied across trials and subjects were told nothing in advance of a trial about which four letters would be shown, but subjects were reminded at the time of test (after coding, retention, and forgetting had already occurred) which four letters had been shown. In the item + order - item condition, as in the order only condition, the information provided by the experimenter could improve the subjects’ guessing, but was given too late to have any effect on coding or retention. Therefore, if there was no advantage for subjects in the order only condition, compared to those in the item + order - item condition, then we could be assured that subjects were not able to use schema information about the identity of items to improve their coding and retention of the order of the items.

A. METHOD

1. Subjects

Seventy-two male and female undergraduates of Yale University participated as subjects in order to fulfill a course requirement. There were 24 subjects in each of three conditions: item + order, order only, and item + order - item. The order only condition included eight subconditions, with 3 subjects in each subcondition.

2. Apparatus

A Digital Equipment Corporation VT50 terminal, controlled by a PDP-11/40 computer, was used for the visual display of the stimuli. The alphanumeric characters were presented in a single location, 5 lines down and 5 spaces over in the screen, which was 12 lines by 80 spaces. Each trial began with the display of a single hyphen repeated twice in a row. The characters were .4 × .2 cm and were all upper case. The computer was programmed to display each character for approximately 400 msec, with an interstimulus interval of approximately 50 msec, and an intertrial interval of approximately 16 sec, with a warning clicking sound occurring in the intertrial interval after every 8 sec. Timing is approximate because a time-sharing system was employed.

3. Design and Materials

Nine different 72-trial experimental sequences were prepared; one was employed in the item + order and item + order - item conditions, and the other eight were used in the order only condition. A given subject was shown only one experimental sequence, and 3 subjects were shown each of the sequences for the order only condition. Each experimental trial consisted of four successively presented consonants followed by a retention interval including either 3, 8, or 18 intervening digits. The presentation order of the trials was quasi-random with the constraint that every block of 12 trials have four instances of each of the three retention intervals. Only digits appeared during two initial practice trials shown to all subjects. The interpolated digits displayed on each experimental and practice trial were randomly selected from the digits 1 to 9 with the constraint that no digit immediately succeed itself. The same digits occurred in each experimental sequence. Only the consonants differed across sequences.

For the eight sequences used in the order only condition, the same four consonants appeared on each trial of a given sequence. The 24 permutations of the four letters each appeared three times, once at each of three retention intervals. The letters were drawn from a population of 8 consonants: B, P, F, S, K, M, H, and L. The eight sequences of trials contained eight different four-letter subsets of these eight letters: BPKM, FSHL, BPHL, FSKM, BFKH, PSML, BSHM, and FPKL. The sequences were created from a master list by applying a set of mapping rules which ensured, for example, that every instance of the letter B in the first sequence corresponded to an instance of the letter F in the second sequence. Four of the sequences (BPKM, FSHL, BFKH, and PSML) were identical to those used in the Order Only Experiment by Healy (1974).

For the sequence used in the item + order and item + order - item conditions, all eight letters from the experimental population were employed. On a particular trial one of the eight four-letter subsets of letters was shown. Each four-letter subset occurred in 3 of the 24 trials with a given retention interval. The order of the letters shown on a trial was the same as that shown on the corresponding trial of the order only sequence that contained the same subset of letters. These constraints ensured that across sequences, the same letters occurred in each of the three conditions.
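The constraints on a single order only sequence described above can be sketched in code. This is only one way to satisfy the stated constraints, not a reconstruction of the original sequences: it crosses the 24 permutations of a four-letter subset with the three retention intervals to give 72 trials, arranges them so that every block of 12 trials has four instances of each interval, and draws interpolated digits from 1 to 9 with no digit immediately succeeding itself. The master list and mapping rules that linked the eight sequences, and the reuse of the same digits across sequences, are not modeled. The function names and the seed are hypothetical.

```python
import itertools
import random

RETENTION_INTERVALS = (3, 8, 18)  # number of interpolated digits


def make_digits(n, rng):
    """Draw n digits from 1-9 such that no digit immediately repeats."""
    digits = []
    while len(digits) < n:
        d = rng.randint(1, 9)
        if not digits or d != digits[-1]:
            digits.append(d)
    return digits


def make_order_only_sequence(letters="BPKM", seed=0):
    """Build 72 trials as (letter_order, retention_interval, digits) tuples."""
    rng = random.Random(seed)
    # Each of the 24 permutations is used once at each retention interval.
    pool = {}
    for interval in RETENTION_INTERVALS:
        perms = ["".join(p) for p in itertools.permutations(letters)]
        rng.shuffle(perms)
        pool[interval] = perms
    trials = []
    for _ in range(6):  # 6 blocks of 12 trials
        intervals = list(RETENTION_INTERVALS) * 4  # four of each per block
        rng.shuffle(intervals)
        for interval in intervals:
            order = pool[interval].pop()
            trials.append((order, interval, make_digits(interval, rng)))
    return trials


if __name__ == "__main__":
    for order, interval, digits in make_order_only_sequence()[:3]:
        print(order, interval, "".join(map(str, digits)))
```

A sequence for a different four-letter subset could be produced by passing another subset as `letters`, although in the source the eight sequences were derived from one master list by letter mapping rather than generated independently.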

4. Procedure

Subjects were tested individually in sessions of approximately 1 hour. Each subject was instructed to read aloud the letters and digits as they appeared. At the end of each list of letters and digits, the screen went blank and the subjects were given 16 sec to write the four letters seen on that trial. The subjects wrote their responses on 4 × 6-inch cards which included four boxes arranged in a horizontal linear array. The subjects were to write the first letter seen in the first box, the second in the second box, and so on. They were not required to fill in the boxes in any particular temporal sequence; hence, to use the terminology of Berch (1979), ordinal recall, rather than serial recall, was required. The subjects were forced to fill in all the boxes on the card; they were not allowed to leave a box blank; and they were told to guess if necessary.

At the start of the experiment, subjects in the order only condition were given full information concerning the subset of four letters that would be seen in varying orders during their session. In addition, as a reminder to the subjects, the subset of letters used in the session was printed in the upper left-hand corner of each response card. The four letters on the cards were always shown in the same arrangement, depending on the subset: BPKM, FSHL, BPHL, FSKM, BFKH, PSML, BSHM, and FPKL. No information concerning the letters that would be seen was provided to the subjects in the item + order condition either at the start of the experiment or at the time of test. Although subjects in the item + order - item condition were also not told anything at the start of the experiment about the letters that would be shown, they were provided with information about the subset of letters that occurred on a given trial at test time. This information was printed in the upper left-hand corner of the response cards in the same format employed in the order only condition. The subjects were told that the four letters were placed on a card in a random order, which did not necessarily correspond to the order in which the letters were shown on the trial. The subjects were told further that their task was to recall the proper order of the letters and to write them in that order in the boxes on the card. Subjects were also told to make sure that for every trial the letters written down as responses corresponded to those in the corner of the card, although the order might differ. In fact, the four letters in a subset were always printed in the same arrangement on the card, as in the order only condition. The response cards used by the subjects were given to them in a stack facing downward. The subjects in all conditions were told not to turn over the response card for a given trial until they had seen all the letters and digits for the trial.
This procedure ensured that subjects in the item + order - item condition would know the information about the identity of the four letters shown on a trial at the time of test, but not before.

B. RESULTS

The results are summarized in Table I for the three conditions of the experiment in terms of unconditional percentages of correct responses for all conditions and conditional percentages of correct positions, given correct items for the item + order.condition. The standard error of the values in Table I broken down by retention interval is 2%, as determined by analyses of variance. When only unconditional percentages are considered, subjects in the item + order - item condition performed as well as those in the order only condition; in fact, the overall recall level in the item + order - item condition was slightly better than that in the order

20 1

Short-Term Memory for Order Information

TABLE I TIMECOURSES OF FORGETTING IN CONDITIONS OF EXPERIMENT 1" Number of digits Condition

3

8

18

Totals

Order only Item + order - item Item + order Unconditional Conditional

84 87

68 72

52 49

68 69

83 86

63 71

39 52

62 70

UConditions+rder only, item + order, and item + order - item-in terms of (a) unconditional percentages of correct responses for all conditions and (b) conditional percentages of correct positions given correct items for item + order condition.

only condition. Both of these groups, who had to recall only order information, performed at a level that was superior to that of the item order group, who had to recall information about the identity of the items as well as information about their serial positions. The difference between groups was especially marked at the longer retention intervals. An analysis of variance performed on the unconditional percentages did not yield a significant main effect of condition, F ( 2 , 69) = 2.3, MSe = 1948, p = .104, but the interaction of condition and retention interval was statistically significant, F(4, 138) = 3.9, MS, = 265, p = .005, as was the main effect of retention interval, F(2, 138) = 398.0, MS, = 265, p < .001. In addition, a planned contrast comparing the item + order condition with the other two conditions was statistically reliable, F( 1, 69) = 4.4, MSe = 1948, p = .037, as was the interaction of that contrast with retention interval, F(2, 138) = 4.7, MSe = 265, p = .010. A comparison of the conditional percentages of correct order recall responses, given correct item recall responses for the three conditions, revealed no differences among conditions. (The item information was necessarily correct in the item + order - item and order only conditions, so the conditional percentages are equal to the unconditional percentages in these cases.) This finding indicates that the lower overall recall level of the subjects in the item + order condition cannot be attributed to poorer recall of order information.' An analysis of variance conducted on the


¹It is important to be aware of the possible problem of selective effects when comparing conditional percentages to absolute percentages. The conditional percentages in this case only involve recalled items, not the more difficult items that were forgotten; it may be easier to recall the order of the less difficult items.


conditional percentages did not yield either a main effect of condition, F(2, 69) < 1, or an interaction of condition and retention interval, F(4, 138) = 1.7, MSe = 250, p = .150, although the main effect of retention interval was significant, F(2, 138) = 350.7, MSe = 250, p < .001.

C. DISCUSSION

These results provide dramatic support for the hypothesis that subjects under these conditions do not develop a more efficient method for coding order information in short-term memory, even when they are told in advance which items will occur. Subjects performed no better in the order only condition, in which they knew in advance which four letters would be shown and the same four letters occurred on every trial of their session, than in the item + order - item condition, in which different subsets of four letters occurred on different trials and the subjects were reminded only at the time of test which four letters were shown on a particular trial. Subjects in both of these conditions did correctly recall more letters than in the item + order condition, in which the subjects were given no information about the to-be-recalled letters in advance or at the time of test. However, the poorer recall level in the item + order condition can be attributed solely to differences in item retention, not to differences in the retention of order information.

These results also provide a clear answer to the more general question posed initially concerning the independence of the processes used to retain item and order information. The order only condition differed from the item + order - item condition in terms of the memory load for item information. Nevertheless, the retention levels were essentially the same in the two conditions. Hence, the retention of temporal sequence information seems independent of the memory load for item information. Is the retention of spatial location information also independent of the amount and quality of item information? This question was addressed in Experiment 2 using a very different set of procedures from those used in Experiment 1.
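The distinction between unconditional and conditional percentages that carries this argument can be illustrated with a minimal sketch. The scoring rule below is an assumption for illustration (an item counts as recalled if the written letter was among those presented, and `?` marks an omission); the letter strings are hypothetical and are not data from the experiment.

```python
def recall_percentages(trials):
    """Return (unconditional, conditional) percentages of correct responses.

    trials: list of (presented, recalled) strings of equal length.
    Assumed scoring: an item is 'correct' if the recalled letter was among
    those presented; its order is correct if it is also in the right position.
    """
    positions = items_ok = order_ok = 0
    for presented, recalled in trials:
        for pos, letter in enumerate(recalled):
            positions += 1
            if letter in presented:            # item information retained
                items_ok += 1
                if presented[pos] == letter:   # order information retained too
                    order_ok += 1
    unconditional = 100 * order_ok / positions
    conditional = 100 * order_ok / items_ok if items_ok else float("nan")
    return unconditional, conditional


# Hypothetical trials: one order error, then one omission plus one intrusion.
example_trials = [("BKPM", "BKMP"), ("QRST", "QR?X")]
unconditional, conditional = recall_percentages(example_trials)
```

In this toy case the unconditional percentage (50) is lower than the conditional percentage (about 67) only because two items were lost, which is the pattern the discussion attributes to the item + order condition.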


III. Experiment 2

To determine whether the coding used to remember the spatial arrangement of a sequence of letters is influenced by the identity of the letters, on each trial of Experiment 2 a single stimulus, the capital letter X, was shown four times successively, each time in a different spatial location of the display screen, and the subject was asked to indicate the temporal


sequence of spatial locations that contained Xs. The aim was to assess whether the coding strategy used by subjects in this situation would be similar to that employed when subjects are required to remember the spatial locations of a set of different letters, as in earlier studies of spatial location retention (Healy, 1975, 1977, 1978). If subjects rely on a memory code that reflects what they heard themselves say whenever that code contains some information, then subjects’ performance on this task should depend largely on what they say during the presentation of the stimuli. Two different conditions were therefore included in this experiment. In one condition, as in the previous spatial location recall conditions involving letters, the subjects said the same thing on every trial of their session, so that remembering what they said would not be informative. Specifically, in the count condition, subjects counted from 1 to 4 as the four Xs were presented successively. In contrast, in the second condition of this experiment, as in the previous temporal sequence recall conditions with letters, remembering what they said would be a useful strategy for the subjects. Specifically, in the position condition, subjects said aloud the position number of the spatial location where each X occurred. If the primary factor differentiating temporal sequence recall and spatial location recall conditions in the previous studies of letter retention was the usefulness of what the subjects said during stimulus presentation, then the two conditions of the present experiment should show differences similar to those found between temporal sequence recall and spatial location recall. In particular, the count condition should be analogous to the previous spatial location recall conditions involving letters, and the position condition should be analogous to the previous temporal sequence recall conditions. 
Such a pattern of differences between the present experimental conditions would be extremely impressive because the subjects’ recall task was identical in the two conditions; the conditions differed only in terms of what the subjects said during stimulus presentation. Another possible outcome of this experiment was that the position condition as well as the count condition would resemble the previous spatial location recall conditions with letters. This outcome would be expected if subjects were simply coding the temporal sequence of spatial locations in the previous spatial location recall conditions, as suggested, for example, by Berch (1979). The “baseline” model, based on this assumption, was tested in the earlier study by Healy (1978) but rejected because it could not account for the consistent effect of temporal-spatial pattern on the percentages of correct responses in spatial location recall. For example, simple “straight-across” patterns, in which the letters were presented either entirely from left to right or entirely from right to left,


were recalled much better than other patterns; but the baseline model could not account for this observation. It is possible that certain sequences of position locations, such as 4321 or 1234, would be easier to retain than other sequences because of their familiarity, so that the notion that subjects rely on the retention of the sequence of position numbers in spatial location recall is worth further consideration.

The present experiment was designed to be as analogous as possible to the previous experiment of letter retention by Healy (1978, Experiment 2), so that comparisons of the two experiments would be facilitated. As in the previous experiment, a manipulation of the interpolated digit patterns was included in order to determine whether subjects were coding pattern information about the to-be-remembered stimuli. Four different types of interpolated digit patterns were employed: digit patterns that matched exactly the pattern of Xs (“same” patterns), those that matched the pattern of Xs in pattern class but differed in terms of the other two items of information included in the pattern model of Healy (1978) (“similar” patterns), those that differed from the pattern of Xs in terms of all three items of information included in the pattern model (“different” patterns), and those that had no regular relationship to the pattern of Xs because of repetitions of digit locations within a block of four successive digits (“random” patterns). If the basic unit of memory is the temporal-spatial pattern of X presentations, then recall should be greater for same digit patterns than for the other digit pattern types, and subjects should be likely to confuse the interpolated digit pattern with the pattern of Xs when they are similar.

A. METHOD

1. Subjects

Twenty male and female undergraduates of Yale University participated as subjects in order to fulfill a course requirement. There were 10 subjects in each of two conditions: position and count.

2. Apparatus

The same apparatus was employed as in Experiment 1, except that four spatial locations of the display screen, rather than just one, were used to present stimuli. The locations were arranged horizontally, with a single space separating adjacent locations. The first (left-most) location was the same as used in Experiment 1. Hyphens were used to represent blank locations. Each trial began with the display of four hyphens repeated twice in a row.


3. Design and Materials

A 192-trial experimental sequence was prepared. A trial consisted of four successively presented instances of the letter X, followed by a retention interval of either 4 or 16 successively presented interpolated digits. Each character was displayed in only one of the four locations on the screen, while the other three locations contained hyphens. The four Xs were presented in different locations, not necessarily from left to right. Each of the 24 possible temporal-spatial patterns of Xs was presented eight times, four times at each retention interval, once paired with the same digit pattern, once paired with a similar digit pattern, once paired with a random pattern of digits, and once paired with a digit pattern that differed from the pattern of Xs in terms of all three items of pattern information, as defined in the pattern model proposed by Healy (1978). The similar patterns matched the pattern of Xs shown on the given trial in pattern class but differed from it in the other two items of information. Each pattern of Xs was matched with a single digit pattern that met the criteria for a different pattern and a single digit pattern that met the criteria for a similar pattern; and across all 24 patterns of Xs, each of the 24 digit patterns was employed once as a different pattern and once as a similar pattern. The random patterns were constructed in the following manner: The location of a given digit was quasi-random with the constraint that every group of four successive digits have at least one location repetition, although a given location never occurred twice in immediate succession. Thus, random patterns could not be classified as any of the 24 regular temporal-spatial patterns. For same, different, and similar, but not for random patterns on a long retention interval with 16 interpolated digits, the digit pattern was repeated four times in a row.
These constraints on patterns are identical to those employed by Healy (1978, Experiment 2). The presentation order of the trials was quasi-random with the following constraints: In every block of 24 successive trials there were six instances of each of the four digit-pattern types, and across the entire 192-trial sequence, each of the 24 patterns of Xs occurred twice with each of the four digit-pattern types, once at each of the two retention intervals. As in the study by Healy (1978, Experiment 2), the intervening digits shown on each trial were selected with the constraints that each block of four digits show a given permutation of the digits 5, 6, 8, and 9, and no two successive digits be the same. Eight practice trials were shown to the subjects. Eight different temporal-spatial patterns of Xs were used, one with each of the eight combinations of retention interval and digit pattern type.
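The construction of the regular and random location patterns described above can be sketched as follows. This is an illustrative reconstruction, not the original materials-generation procedure: the function names are hypothetical, and "every group of four successive digits" is read here as every window of four successive digits, which is the stricter of the two possible readings.

```python
import itertools
import random

LOCATIONS = (1, 2, 3, 4)

# The 24 regular temporal-spatial patterns: each order in which the four
# screen locations can be visited exactly once.
REGULAR_PATTERNS = list(itertools.permutations(LOCATIONS))


def random_location_sequence(length, rng):
    """Quasi-random digit locations under the two stated constraints:
    no location twice in immediate succession, and (assumed reading) every
    window of four successive digits contains at least one repeated location."""
    seq = []
    for _ in range(length):
        options = [loc for loc in LOCATIONS if not seq or loc != seq[-1]]
        if len(seq) >= 3 and len(set(seq[-3:])) == 3:
            # The last three locations are all different, so the next digit
            # must reuse one of them to force a repetition in the window.
            options = [loc for loc in options if loc in seq[-3:]]
        seq.append(rng.choice(options))
    return seq


def is_valid(seq):
    """Check both constraints on a finished sequence."""
    no_immediate = all(a != b for a, b in zip(seq, seq[1:]))
    windows_repeat = all(len(set(seq[i:i + 4])) < 4
                         for i in range(len(seq) - 3))
    return no_immediate and windows_repeat


rng = random.Random(0)                       # seeded only for reproducibility
long_interval = random_location_sequence(16, rng)
```

Because every window of four contains a repetition, no stretch of a sequence built this way can coincide with one of the 24 regular patterns, which is the property the text requires of random patterns.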


4. Procedure

Subjects were tested individually in sessions that lasted approximately 2 hours. Each subject was instructed to say aloud the name of each digit as it appeared and either to say aloud the position number (from 1 to 4) of the location where each X appeared (position condition) or to count aloud the numbers 1 to 4 as each of the four Xs appeared (count condition). At the end of each sequence of Xs and digits, the screen became blank and the subject was given 16 sec to indicate the spatial positions of the four Xs seen on that trial on a 4 × 6-inch card which included four rows of four boxes. Each row of boxes was for a different one of the four Xs, and the four boxes in each row were for the four spatial locations of the display screen. For each of the four Xs, the subject was to place an X in the appropriate box. For example, if the subject decided that the first X occurred in the left-most location, he or she was to place an X in the first row and first column. The subjects were instructed to place one, and only one, X in each of the four rows. At the end of every 48 experimental trials there was a short rest break.

B. RESULTS

The results are summarized in Table II in terms of percentages of correct responses as a function of condition (position and count), retention interval (4 and 16 interpolated digits), and relation between the to-be-recalled pattern of Xs and the interpolated digit pattern. The results from the temporal sequence recall and spatial location recall conditions of the analogous letter recall experiment by Healy (1978, Experiment 2) are also included in Table II for comparison purposes. An analysis of variance performed on the combined data from both experiments yielded 2% as an estimate of the standard error of the entries in Table II. As Table II reveals, the count condition of the present experiment bears a strong resemblance to the spatial location recall condition of the earlier letter recall experiment, whereas the position condition is more similar to the earlier temporal sequence recall condition. Specifically, there was no overall difference between the count and position conditions, F(1, 18) < 1, but there was an overall decline in recall level as retention interval increased, F(1, 18) = 46.7, MSe = 778, p < .001, and this decline was much larger for the position condition than for the count condition, F(1, 18) = 6.3, MSe = 778, p = .021, just as it was much larger for the temporal sequence recall condition than for the spatial location recall condition in the earlier experiment. In addition, there was an overall advantage for same digit patterns, compared to the other types of interpolated digit patterns, F(3, 54) = 11.3, MSe = 183, p < .001, and the difference between digit pattern types was somewhat greater for the count condition than for the position condition, F(3, 54) = 2.5, MSe = 183, p = .069, just as the effect of digit pattern was larger in spatial location recall than in temporal sequence recall in the earlier experiment.

TABLE II
PERCENTAGES OF CORRECT RESPONSES IN EXPERIMENT 2 AND IN LETTER RECALL EXPERIMENT BY HEALY (1978, EXPERIMENT 2) BY CONDITION, RETENTION INTERVAL, AND RELATION BETWEEN THE TO-BE-RECALLED PATTERN AND THE INTERPOLATED DIGIT PATTERN

                       Retention interval 4                  Retention interval 16
Condition        Same  Different  Similar  Random      Same  Different  Similar  Random

Experiment 2
  Position        92      89        89       88         74      64        71       67
  Count           88      83        79       82         83      72        71       68
Healy (1978)
  Temporal        84      83        78       86         61      55        57       56
  Spatial         88      79        80       86         84      69        72       70

These initial analyses support the conclusion that what the subject said during the presentation of the stimuli was very influential in determining the recall strategy used. Subjects in the count condition, who said the same thing on every trial, showed a pattern of results similar to that found in earlier spatial location recall conditions, in which subjects also said the same thing on every trial. In contrast, subjects in the position condition, whose utterances during stimulus presentation contained useful information, showed a pattern of results like that in earlier conditions of temporal sequence recall, in which subjects’ utterances also provided useful information. The more detailed analyses that follow give further support for these initial observations.

1. Serial Position Functions

In previous studies of spatial location retention (e.g., Healy, 1975, 1977), large effects of temporal sequence position were obtained, and these effects were greater than the analogous effects of spatial location position. The temporal sequence position function for spatial location retention included a primacy advantage, but not a recency advantage, although both primacy and recency were characteristic of the function for


TABLE III
PERCENTAGES OF CORRECT RESPONSES IN EXPERIMENT 2 AND IN LETTER RECALL EXPERIMENT BY HEALY (1978, EXPERIMENT 2) BY CONDITION, TEMPORAL SEQUENCE POSITION, AND SPATIAL LOCATION POSITION

                    Sequence position         Location position
Condition          1     2     3     4        1     2     3     4

Experiment 2
  Position        82    77    77    81       80    79    79    79
  Count           82    80    76    75       79    77    77    79
Healy (1978)
  Temporal        76    67    66    71       74    70    69    67
  Spatial         84    79    74    77       80    78    77    79

temporal sequence retention. Similarly, in the present experiment, the effect of temporal sequence position, F(3, 54) = 12.2, MSe = 70, p < .001, was larger than the effect of spatial location position, F(3, 54) = 3.2, MSe = 37, p = .032, and the temporal sequence position function was somewhat different in the two conditions, F(3, 54) = 6.6, MSe = 70, p = .001. The position condition, like the earlier temporal sequence recall conditions, showed both strong primacy and recency advantages; whereas the count condition, like the earlier spatial location recall conditions, showed only a strong primacy advantage. Table III presents the temporal sequence and spatial location serial position functions for the present experiment in terms of percentages of correct responses by condition. The analogous functions for the letter recall experiment by Healy (1978, Experiment 2) are also provided in Table III for comparison purposes. Analyses of variance performed on the combined data from both experiments yielded 1% as an estimate of the standard error of the entries in Table III.

2. Distance Functions

In previous studies of temporal sequence and spatial location retention, the temporal distance functions were quite orderly. These functions reveal the distance in the temporal sequence between the correct letter and the one that replaced it on the subject’s response card. In general, it has been found that when subjects respond incorrectly, they replace the correct letter with one close to it in the temporal sequence. Another property of the observed temporal distance functions is their symmetry: The functions for serial positions 1 and 2 are mirror images of those for positions 4 and 3. There is one outstanding exception to this symmetrical pattern. In


spatial location recall, the number of interchanges involving letters in the last two temporal positions is exceptionally large, and, specifically, is larger than the number of interchanges involving letters in the first two positions. This asymmetry is typically less marked for temporal sequence recall. The temporal distance functions for the present experiment are shown in Table IV, where they are compared to the functions from the letter recall experiment by Healy (1978, Experiment 2). The functions for the present experiment show the features described above, with the count condition being more similar to spatial location recall, and the position condition being more similar to temporal sequence recall.

TABLE IV
OBSERVED TEMPORAL DISTANCE FUNCTIONS FOR EXPERIMENT 2 AND FOR LETTER RECALL EXPERIMENT BY HEALY (1978, EXPERIMENT 2)ᵃ

                                  Condition
                 Experiment 2          Healy (1978, Experiment 2)
Positionᵇ      Position    Count         Temporal    Spatial

1, 1              82         82             76          84
1, 2               7          8              9           7
1, 3               6          5              8           5
1, 4               5          6              7           3

2, 1               8          8              9           7
2, 2              77         80             67          79
2, 3               8          6             14           7
2, 4               7          6             10           6

3, 1               6          5              8           5
3, 2               9          6             13           7
3, 3              77         76             66          74
3, 4               7         13             12          14

4, 1               4          4              6           4
4, 2               7          7             11           6
4, 3               9         14             12          13
4, 4              81         75             71          77

ᵃBy condition.
ᵇThe position code i, j (where i is the superordinate position label and j is the subordinate position label) refers to instances in which the correct letter being scored was presented in the temporal sequence position i and the subject’s response was the letter presented in the temporal sequence position j.
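The scoring behind a temporal distance function of this kind can be sketched as follows. The sketch assumes the Experiment 2 response format (one marked location per temporal position) and uses hypothetical function names and a hypothetical trial; it is an illustration of the i, j coding defined in the table note, not the original analysis program.

```python
from collections import Counter


def temporal_distance_counts(trials):
    """Tally responses by position code (i, j).

    trials: list of (presented, response) pairs. Each element is a tuple
    giving, for temporal positions 1-4 in order, the spatial location (1-4)
    that was shown (presented) or marked on the card (response).
    """
    counts = Counter()
    for presented, response in trials:
        for i, marked in enumerate(response, start=1):
            # j is the temporal position of the stimulus that actually
            # occupied the location the subject marked for position i.
            j = presented.index(marked) + 1
            counts[(i, j)] += 1
    return counts


def as_percentages(counts):
    """Convert counts to percentages within each superordinate position i."""
    totals = Counter()
    for (i, _), n in counts.items():
        totals[i] += n
    return {(i, j): 100 * n / totals[i] for (i, j), n in counts.items()}


# Hypothetical trial: the last two temporal positions are interchanged.
example = [((2, 4, 1, 3), (2, 4, 3, 1))]
example_counts = temporal_distance_counts(example)
```

Entries with i equal to j are correct responses, so the diagonal of such a tally corresponds to the temporal sequence position function, and entries such as (3, 4) and (4, 3) capture the interchanges of the last two positions discussed in the text.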


3. Pattern Confusions

As in the letter recall experiment by Healy (1978), an analysis was made of the number of times the subject responded in complete accord with the interpolated digit pattern, rather than the to-be-recalled pattern of Xs, shown on a given trial. For this analysis, each trial was scored as a whole, in terms of the pattern with which the subject responded. In the earlier experiment by Healy (1978, Experiment 2), subjects were found to respond with the interpolated digit pattern, rather than the letter pattern, with a relatively high frequency when the interpolated digit pattern was similar to the letter pattern in the spatial location recall condition. The corresponding frequencies of confusions were considerably lower for the temporal sequence recall condition and for digit patterns that were different from the letter patterns, but not similar to them. As in the previous spatial location recall condition, subjects in the count condition of the present experiment responded with the interpolated digit pattern on 32 trials when the digit pattern was similar to the pattern of Xs (out of 163 trials of this type with errors). In contrast, subjects in the count condition responded with the interpolated digit pattern on only 7 trials when the digit pattern was different from the pattern of Xs (out of 154 trials of this type with errors). Once again, the position condition more closely resembled the temporal sequence recall condition than the spatial location recall condition. Subjects in the position condition responded with the interpolated digit pattern on only 8 trials when the digit pattern was similar to the pattern of Xs (out of 138 trials of this type with errors) and on only 6 trials when the digit pattern was different from the pattern of Xs (out of 157 trials of this type with errors).
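The confusion frequencies just reported are easier to compare as proportions of error trials. The short calculation below uses only the counts given in the text; the percentages it produces are derived here for illustration and are not figures reported in the chapter.

```python
# (trials answered with the interpolated digit pattern, error trials of that type)
confusions = {
    ("count", "similar"): (32, 163),
    ("count", "different"): (7, 154),
    ("position", "similar"): (8, 138),
    ("position", "different"): (6, 157),
}

# Percentage of error trials on which the whole response matched the digit pattern.
rates = {
    key: round(100 * hits / errors, 1)
    for key, (hits, errors) in confusions.items()
}
```

On these counts, roughly a fifth of the error trials in the count condition with similar digit patterns were complete pattern confusions, against about 4% to 6% in the other three cells.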
The large frequency of pattern confusions that occurred in the count condition for the similar digit patterns is especially impressive, given that none of the 24 patterns had a correspondence between the spatial location of a character in a given temporal sequence position for the pattern of Xs and for the matched similar digit pattern, although there was such a correspondence between the letter patterns and the matched different digit patterns in one temporal position in 11 of the 24 patterns and in two temporal positions in 1 pattern.

C. DISCUSSION

A number of striking differences between the count and position conditions of this experiment were analogous to those observed between the spatial location recall and temporal sequence recall conditions of the earlier experiments involving letter retention. The count and position conditions differed only in terms of what the subjects said as the stimuli were being presented: Useful information about the to-be-remembered sequence of spatial locations was contained in the subjects’ vocalizations in the position condition, but not in the count condition. These results suggest that subjects use a phonemic code that reflects what they said during stimulus presentation whenever that code contains useful information; otherwise, if available, subjects code information about the temporal-spatial pattern of stimulus presentations.

The strong similarity between the count and spatial location recall conditions also has implications for the more general question of the independence of item and order retention. The two conditions differed markedly in terms of the amount and quality of item information, but this difference did not seem to affect the coding strategy used to retain spatial location information.

The pattern information coded by subjects in the count condition of this experiment and in the spatial location recall conditions of earlier studies is not equivalent to the sequence of spatial location position numbers. If pattern coding were equivalent to such position coding, then the subjects in the position condition should have performed similarly to those in the count condition and should have shown strong evidence for pattern coding, but they did not. Rather, the effects of interpolated digit patterns were weaker in the position condition than in the count condition, and the results of the position condition more strongly resembled those of previous temporal sequence recall conditions. Instead of position numbers, the memory units used for pattern coding appear to be items of information referring to aspects of the entire temporal-spatial pattern of stimulus presentations, as specified in the pattern model proposed by Healy (1978).

IV. Experiment 3

Because of the similarity between the count condition of Experiment 2 and the spatial location recall condition of the earlier letter recall experiment by Healy (1978), it was concluded that the strategy used to retain spatial location information is not affected by the properties of the to-be-remembered items. Further support for this hypothesis derives from the fact that previous studies indicated no effects of the phonemic properties of letters that were to be recalled in their spatial arrangement, although phonemic characteristics did influence the recall of letters according to their temporal sequence of presentation (e.g., Healy, 1975, 1977). It is possible, though, that subjects attend to the visual features of the letters,


rather than the sound of their names, in spatial location retention. Earlier studies involving short-term memory for temporal sequences (e.g., Cimbalo & Laughery, 1967) did not find effects of visual similarity, but such effects might be found for spatial location retention. Just as differences in the phonemic properties of successively presented letters seem to aid in remembering their temporal sequence (cf. Drewnowski, 1980), differences in the visual properties of adjacent letters may aid in remembering their spatial arrangement. In fact, as indicated above, unreported analyses of the letter recall experiment by Healy (1978, Experiment 2) indicated more visual than phonemic confusion errors in the spatial location recall condition. Specifically, subjects were shown the letters BVSX on every trial of that experiment, and they tended to confuse the visually similar letters V and X or B and S in the spatial location recall condition, whereas they confused the phonemically similar letters V and B or X and S in the temporal sequence recall condition. Experiment 3 was aimed in part at determining to what extent subjects are influenced by the visual features of letters when they are asked to recall them in the spatial arrangement in which they were presented. Experiment 3 was also aimed at testing the generality of the previous findings concerning the spatial location retention of letters. There was strong evidence in the previous studies that subjects coded information about the temporal-spatial pattern of letter presentations in spatial location recall conditions (Healy, 1975, 1977, 1978); however, these conditions were constrained in several respects. Perhaps the most critical constraint was the number of to-be-remembered items shown on a particular trial. In every case studied by Healy, the subject had to remember the order of exactly four letters. Because there were four letters, there were 24 possible temporal-spatial patterns of letter presentation. 
Perhaps if more letters were shown, there would be less of a tendency to use pattern coding because of the larger number of possible temporal-spatial patterns that would result. For example, 120 possible temporal-spatial patterns result from the presentation of five letters; and it may be difficult for the subject to discriminate among so many different patterns. On the other hand, if fewer than four letters were presented to the subject for recall, then there would be fewer possible temporal-spatial patterns, so that pattern coding might be more likely to be used, even in temporal sequence recall. If three letters were to be recalled, then there would be only 6 possible temporal-spatial patterns, and these might be easy for the subject to discriminate. In contrast to earlier studies, in Experiment 3 sequences of three, four, and five to-be-remembered letters were presented to subjects. The pattern model proposed for the spatial location retention of four-letter sequences


(Healy, 1978) cannot be applied directly to sequences of three and five to-be-remembered letters. Nevertheless, the same general principles should apply for all three string lengths. Evidence for pattern coding was gleaned in this experiment by examining the effects on letter retention of the nature of the interpolated digit pattern. The temporal-spatial pattern of the intervening digits interpolated between the to-be-remembered letters and their recall on a given trial was either identical to the temporal-spatial pattern of the letters shown on that trial (“same” pattern) or did not bear any regular relationship to the letter pattern (“random” pattern). Fewer letter recall errors on same digit patterns than on random digit patterns have been taken as evidence for pattern coding (see Healy, 1978).

Other features that have characterized and differentiated temporal sequence and spatial location recall of four-letter sequences were also examined in this study with three, four, or five to-be-remembered letters. In particular, the time course of forgetting, the serial position functions for both temporal sequence positions and spatial location positions, and the temporal distance functions were examined. In previous studies, the time course of forgetting has been found to be considerably steeper for temporal sequence recall than for spatial location recall; the serial position functions for temporal sequence positions have yielded a larger primacy advantage than the functions for spatial location positions for both temporal sequence and spatial location recall; and the temporal distance functions have revealed a disproportionately large number of interchanges involving the last two temporal positions in spatial location recall.

A. METHOD

1. Subjects

Thirty-six young men and women, who were recruited from posters placed on the Yale University campus, participated as subjects and were paid at the rate of $2.50 per hour. There were 18 subjects in each of two conditions: temporal sequence recall and spatial location recall. The subjects were further subdivided into six subconditions of the two conditions with three subjects per subcondition.

2. Apparatus

The same apparatus was employed as in Experiments 1 and 2, except that five spatial locations of the display screen, rather than four, were used. The first four (left-most) locations were the same as those used in Experiment 2. A subject, depending on the subcondition, saw letters and digits occurring in three, four, or five different spatial locations. When


three or four locations were used, they were always the left-most locations. A given location that was not used in a particular subcondition was left blank; a hyphen did not occur in the location, but hyphens were used to represent locations among those employed in a given subcondition that were blank on a particular trial. Each trial began with the display of three, four, or five (depending on the subcondition) hyphens repeated twice in a row.

3. Design and Materials

Twelve different 168-trial experimental sequences were prepared, 1 for each subcondition. Each subject saw only 1 experimental sequence, and each sequence was shown to three subjects. A trial consisted of three, four, or five successively presented capital letters as stimuli, followed by a retention interval of 3, 4, 5, 15, or 16 successively presented interpolated digits. Four sequences, 2 for the temporal sequence recall condition and 2 for the spatial location recall condition, included three to-be-remembered letters; 4 included four letters; and 4 included five letters. Each character was displayed in only one location on the screen. The letters shown on a given trial were presented in different locations, not necessarily from left to right. For the sequences with five to-be-remembered letters (and five different spatial locations), 72 of the 120 possible temporal-spatial patterns of letters were presented on a single trial, paired with a random digit pattern either 3, 4, or 16 digits long, and the remaining 48 of the letter patterns were presented twice, once paired with the digit pattern that was the same as the letter pattern and once paired with a random digit pattern of the same length (either 5 or 15 digits long). The random patterns were constructed in the following manner: The spatial locations of the digits were selected at random from the locations available for a given subcondition with the constraint that no location occurred twice in immediate succession.
Likewise, the identity of the digits was selected at random from the numbers 1-9 with the constraint that no digit occur twice in immediate succession. For digit patterns that were the same as the letter patterns, on the longer retention interval, the digit pattern was repeated 3 times in a row. Seven different types of retention intervals occurred, 24 times each. Five of these retention interval types involved random digit patterns, including those with 3, 4, 5, 15, and 16 digits, and two involved digit patterns that were the same as the letter patterns, including those with 5 and 15 digits. The presentation order of the trials was quasi-random with the following constraint: In every block of 14 successive trials there were two retention intervals of each type. The assignment of letter patterns to

Short-Term Memory for Order Information


retention intervals was quasi-random with the following restriction: The 120 letter patterns were divided into 24 sets of 5 each. The 5 patterns in a set were identical, except for the temporal sequence position of the letter that occurred in the fifth (right-most) spatial location. Thus, for example, the pattern with the sequence of spatial locations 1, 3, 5, 4, 2 was in the same set as that with the sequence 1, 3, 4, 2, 5. Each one of the five types of random digit patterns (3, 4, 5, 15, and 16 digits) was matched with a different one of the 5 patterns in a set. Those 2 patterns in every set of 5 that were matched with the 5-digit and 15-digit retention intervals were the ones selected to occur twice, once with a random digit pattern and once with the digit pattern that was the same as the letter pattern. The sequences with four to-be-remembered letters were derived from those with five to-be-remembered letters in the following way: The letter pattern on a given trial in the sequence with four letters was identical to that on the corresponding trial in the sequence with five letters, except that the letter in the fifth spatial location was eliminated. For trials with random digit patterns, the identity of the digits on a given trial in the sequences with four letters was the same as that on the corresponding trial in the sequences with five letters. The spatial locations of the digits were the same, except that whenever a digit occurred in the fifth location in the sequences with five letters, a new location for that digit was chosen at random for the sequences with four letters, without violating the constraint that no location occur twice in immediate succession. For trials with same digit patterns, either 4 or 16 interpolated digits occurred (rather than 5 or 15), and on the longer retention interval, the digit pattern was repeated four times in a row (rather than only three). 
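The grouping of the 120 five-letter patterns into 24 sets of 5 can be illustrated with a short sketch. It assumes that a temporal-spatial pattern is represented as a tuple of spatial locations listed in temporal order, as in the example 1, 3, 5, 4, 2; the function name `group_patterns` is hypothetical and is not taken from the original study.

```python
from collections import defaultdict
from itertools import permutations


def group_patterns(n_locations=5):
    """Group temporal-spatial patterns that differ only in the temporal
    position of the right-most spatial location (hypothetical helper)."""
    sets = defaultdict(list)
    for pattern in permutations(range(1, n_locations + 1)):
        # Dropping the right-most location yields the shorter pattern
        # that all members of a set have in common.
        reduced = tuple(loc for loc in pattern if loc != n_locations)
        sets[reduced].append(pattern)
    return dict(sets)


pattern_sets = group_patterns(5)
print(len(pattern_sets))                             # number of sets
print({len(s) for s in pattern_sets.values()})       # sizes of the sets
print((1, 3, 5, 4, 2) in pattern_sets[(1, 3, 4, 2)])
print((1, 3, 4, 2, 5) in pattern_sets[(1, 3, 4, 2)])
```

Under this representation, the key of each set is also the four-letter pattern obtained when the letter in the fifth location is eliminated, which is how the four-letter sequences are derived from the five-letter ones.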
The identity of the digits on a given trial with the same digit and letter patterns in the sequences with four to-be-remembered letters was identical to that on the corresponding trial in the sequences with five to-be-remembered letters, except that the final digit was eliminated from the trials with the shorter retention interval and a new digit was added at the end of the trials with the longer retention interval. The presentation order of the trials in the sequences with four letters corresponded to that in the sequences with five letters. These constraints ensured that each of the 24 possible temporal-spatial patterns of letters occurred 7 times, once at each of the five retention intervals with random digit patterns (3, 4, 5, 15, and 16 digits), and once at each of the two retention intervals with digit patterns that were the same as the letter patterns (4 and 16 digits). The sequences with three to-be-remembered letters were derived from those with four letters in a manner analogous to that used to derive the sequences with four letters from those with five. For trials with same digit patterns, either 3 or 15 interpolated digits occurred (rather than 4 or 16),


and on the longer retention interval, the digit pattern was repeated 5 times in a row (rather than 4). Every one of the six possible temporal-spatial patterns of letters occurred 28 times, 4 times at each of the five retention intervals with random digit patterns, and 4 times at each of the two retention intervals with digit patterns that were the same as the letter patterns.

The same letters appeared on each experimental trial of a given subject’s session. The letters were drawn from a population of five consonants: BVSXH. Note that two pairs of these letters are phonemically similar (BV and SX) and two pairs are visually similar (VX and BS). There were four sequences with each of the following letter combinations: BVX, BVSX, BVSXH. The four sequences with a given letter combination included two sequences in the temporal sequence recall condition and two in the spatial location recall condition, which were identical except for the temporal sequence and spatial arrangement of the letters on a given trial. For the sequences in the temporal sequence recall condition, the spatial arrangement of the letters was held constant throughout the 168 trials, and for the sequences in the spatial location recall condition, the temporal order of the letters was held constant. The constant temporal order of the letters in a sequence for the spatial location recall condition was the same as the constant spatial arrangement of the letters in the corresponding sequence for the temporal sequence recall condition. The constant letter orders were BVX, XVB, BVSX, BSVX, BVSXH, and BSVXH for the six sequence pairs, respectively.

Seven practice trials were shown to the subjects before the experimental trials, one of each retention interval type. There were six different sequences of practice trials, one for each combination of string length and recall condition. The constant letter orders were ABC, ABCD, and ABCDE for the three-, four-, and five-letter sequences, respectively.

4. Procedure

Subjects were tested individually in sessions that lasted approximately 1 hour and 40 min. Each subject was instructed to read aloud the letters and digits as they appeared. At the end of each list of letters and digits, the screen became blank and the subject was to write down the letters shown on the trial in their temporal sequence in the temporal sequence recall condition, or in their spatial arrangement in the spatial location recall condition. The subjects wrote their responses on 4 × 6-inch index cards on which three, four, or five (depending on the subcondition) boxes had been printed in a horizontal linear array. The subjects were not required to fill in the boxes in any particular temporal sequence, but they were forced


to fill in all the boxes on the card and they were told to guess if necessary. Short rest breaks occurred after experimental trials 43, 85, and 127. Subjects were told which three, four, or five letters would be shown to them on every trial and were given the constant temporal sequence or constant spatial arrangement of the letters.

B. RESULTS

The results are summarized in Table V in terms of percentages of correct responses as a function of recall condition (temporal sequence recall or spatial location recall), string length (three, four, or five to-be-remembered letters), digit pattern type (same or random), and retention interval (short or long). Only retention intervals used for both same and random digit patterns are included in this summary (3 and 15 digits for the three-letter sequences, 4 and 16 digits for the four-letter sequences, and 5 and 15 digits for the five-letter sequences). An analysis of variance yielded 2% as an estimate of the standard error of the entries in Table V.

TABLE V
PERCENTAGES OF CORRECT RESPONSES AS A FUNCTION OF RECALL CONDITION, STRING LENGTH, DIGIT PATTERN TYPE, AND RETENTION INTERVAL (EXPERIMENT 3)

                        String length 3       String length 4       String length 5
Retention interval     Temporal  Spatial     Temporal  Spatial     Temporal  Spatial
Short
  Same                    95       97           80       95           62       61
  Random                  95       96           75       89           57       57
Long
  Same                    87       95           67       91           42       48
  Random                  91       89           63       83           41       45

All three string lengths showed the same general pattern of results, although overall recall level was greatest for three-letter sequences and poorest for five-letter sequences, F(2, 24) = 84.5, MSe = 514, p < .001. Specifically, for all string lengths, retention was better at the short interval than at the long interval, F(1, 24) = 105.3, MSe = 65, p < .001, and the difference between retention intervals was greater for temporal sequence recall than for spatial location recall, F(1, 24) = 5.5, MSe = 65, p = .026. Although the decline in recall level across retention intervals increased with increasing string length, F(2, 24) = 9.1, MSe = 65, p = .001, the three-way interaction between string length, recall condition, and retention interval was not significant, F(2, 24) = 1.0, MSe = 65, p = .378. In addition, for all string lengths, retention was better when the digit pattern matched the to-be-remembered letter pattern than when it was a random pattern, F(1, 24) = 14.5, MSe = 55, p = .001; the interaction of digit pattern type and string length was not significant, F(2, 24) = 2.6, MSe = 55, p = .090. As in earlier studies (Healy, 1978), the advantage for same digit patterns was somewhat more pronounced in spatial location recall than in temporal sequence recall for all string lengths; but neither the interaction of digit pattern type and recall condition, F(1, 24) = 2.2, MSe = 55, p = .144, nor the three-way interaction of string length, digit pattern type, and recall condition, F(2, 24) < 1, was significant.

Despite the overall similarity in the pattern of results for the three string lengths, the evidence for pattern coding in spatial location recall seemed to be greatest for the four-letter sequences. In particular, recall level on spatial location recall was superior to that on temporal sequence recall for all string lengths, F(1, 24) = 8.0, MSe = 514, p = .009, but the difference between recall conditions was most marked for the four-letter sequences, F(2, 24) = 3.9, MSe = 514, p = .034. Perhaps pattern coding was facilitated in the four-letter sequences relative to the other string lengths, because of the intermediate number (24) of possible temporal-spatial patterns in that case. Pattern coding may not be used to the same extent with five-letter sequences because of the very large number of possible patterns (120), and the high recall levels with three-letter sequences may not give room for differences between conditions to be seen.

1. Serial Position Functions

The serial position functions differed across string lengths, as one might expect, since serial position functions naturally depend on the number of serial positions. The serial position functions are presented in Table VI for both temporal sequence positions and spatial location positions, as a function of string length and recall condition. Only the five retention intervals with random digit patterns (3, 4, 5, 15, and 16 digits) were included in this analysis. Separate analyses of variance were conducted on these data for both temporal sequence and spatial location positions for each string length. The analyses of variance yielded 1%, 1%, and 2% as estimates of the standard errors of the entries in Table VI for string lengths 3, 4, and 5, respectively. In previous comparisons of temporal sequence and spatial location recall (e.g., Healy, 1975), the temporal sequence positions exhibited more of a bowed shape, or at least a greater primacy advantage, than the spatial location positions for both temporal sequence recall and spatial location recall. This same general pattern of results held for all of the string lengths in the present experiment, although the serial position functions were generally flattest for the three-letter sequences and least flat for the five-letter sequences.

TABLE VI
PERCENTAGES OF CORRECT RESPONSES AS A FUNCTION OF RECALL CONDITION, STRING LENGTH, TEMPORAL SEQUENCE POSITION, AND SPATIAL LOCATION POSITION FOR RANDOM DIGIT SEQUENCES ONLY (EXPERIMENT 3)

                        String length 3     String length 4          String length 5
Condition                 1    2    3        1    2    3    4        1    2    3    4    5
Temporal
  Sequence position      93   92   93       76   71   69   70       68   46   42   42   56
  Location position      94   91   92       73   71   71   70       57   49   48   46   54
Spatial
  Sequence position      96   92   92       92   89   87   88       69   52   45   43   56
  Location position      92   94   93       91   88   89   90       57   51   52   49   56

2. Phonemic and Visual Confusions

An analysis of phonemic and visual confusion errors was conducted to determine the form of coding used in temporal sequence recall and spatial location recall. In order to make the string-length conditions as analogous as possible, only errors made on the stimulus letters B, V, and X were considered. Separate tabulations of phonemic and visual confusion errors were prepared. For the tabulation of phonemic confusions, only responses to the stimulus letters B and V were included, and a confusion error was scored only if the response to the stimulus B was V, or the response to V was B; all other error responses were scored as nonconfusions. On the other hand, for the tabulation of visual confusions, only responses to the stimulus letters V and X were included, and a confusion error was scored only if the response to the stimulus V was X or the response to X was V. For example, consider a subject in the 4 string-length temporal sequence recall condition who saw the stimulus letter V in the first temporal sequence position and wrote a B in the first position of the response card. For the analysis of phonemic confusion errors, this subject’s response would be scored as a confusion error; and for the analysis of visual confusion errors, it would be scored as a nonconfusion error. Table VII summarizes the confusion error tabulation by providing the percentages of correct responses, confusion errors, and nonconfusion errors as a function of confusion error type, string length, recall condition, and retention interval. Only the five retention intervals with random digit patterns were included in this tabulation.

TABLE VII
PERCENTAGES OF CORRECT RESPONSES, CONFUSION ERRORS, AND NONCONFUSION ERRORS FOR RANDOM DIGIT PATTERNS (EXPERIMENT 3)a,b

                                                     Retention interval
                           3               4               5               15              16
                     Cor Conf Nonc   Cor Conf Nonc   Cor Conf Nonc   Cor Conf Nonc   Cor Conf Nonc
String length 3
  Temporal
    Phonemic          94    1    5    95    2    3    92    6    2    91    5    5    88    6    6
    Visual            95    1    3    94    5    1    94    2    4    90    6    4    88    6    6
  Spatial
    Phonemic          97    1    2    95    2    3    95    2    3    88    7    5    91    5    4
    Visual            95    1    4    94    3    2    93    5    2    89    7    5    91    5    4
String length 4
  Temporal
    Phonemic          83   16    1    76   20    4    75   20    6    63   17   20    65   16   19
    Visual            81    0   19    74    3   23    76    2   22    61   10   29    64   10   26
  Spatial
    Phonemic          92    2    6    89    5    7    95    1    4    89    2    8    85    4   10
    Visual            91    4    5    89    3    7    93    4    2    86    7    7    82    8   10
String length 5
  Temporal
    Phonemic          63   19   18    58   22   19    67   15   18    44   18   38    44   16   41
    Visual            60    6   35    52    8   40    56    7   37    42   14   44    38   13   49
  Spatial
    Phonemic          59   10   31    61   11   28    62    8   30    50   10   40    56    9   35
    Visual            48   19   33    44   20   35    53   14   34    35   22   43    44   18   38

a Involving stimulus letters B, V, and X and as a function of string length, recall condition, retention interval, and confusion type.
b Key: Cor = correct response, Conf = confusion error, Nonc = nonconfusion error.

Since it is difficult to make sense of absolute percentages of confusion errors across conditions with different overall error percentages, conditional percentages of confusion errors, given that an error was made, were computed for each subject. For example, assume the subject described above made 10 errors overall on the stimulus letter V, 6 which involved replacements of V by B, 1 which involved replacing V by X, and 3 which involved replacing V by S. Then the conditional percentage of phonemic confusion errors, given that an error was made on the stimulus letter V, would be 6/10 = 60%, and the corresponding conditional percentage of visual confusion errors would be 1/10 = 10%. The confusion error analysis is summarized in Table VIII in terms of mean conditional percentages of phonemic and visual confusion errors, given that an error was made on an instance of the stimulus letter B or V (for the phonemic confusion errors) or on an instance of the stimulus letter V or X (for the visual confusion errors). The conditional percentages are given as a function of string length, recall condition, and retention interval. Only the five retention intervals with random digit patterns were included in this analysis. An analysis of variance yielded 9% as an estimate of the standard error of the entries for Table VIII.

TABLE VIII
MEAN CONDITIONAL PERCENTAGES OF PHONEMIC AND VISUAL CONFUSION ERRORS INVOLVING THE STIMULUS LETTERS B, V, AND X (EXPERIMENT 3)a

                          Retention interval
                      3     4     5    15    16
String length 3
  Temporal
    Phonemic         16    42    69    51    54
    Visual           13    65    26    46    34
  Spatial
    Phonemic         32    34    40    50    48
    Visual           12    52    55    56    52
String length 4
  Temporal
    Phonemic         90    77    70    42    39
    Visual            1    13    12    24    25
  Spatial
    Phonemic         24    40    19    18    24
    Visual           27    29    61    39    49
String length 5
  Temporal
    Phonemic         50    52    45    32    27
    Visual           14    16    15    24    24
  Spatial
    Phonemic         25    26    22    20    21
    Visual           36    35    26    32    32

a As a function of string length, recall condition, and retention interval for random digit patterns.

Although errors on the same letters were considered for each of the three string lengths, one difference between string lengths remained in the analysis: The chance conditional percentage of a phonemic or of a visual confusion error was 50% for the three-letter sequences, since one out of every two incorrect letters would constitute a confusion error of a given type; but the chance conditional percentage was 33% for the four-letter sequences and 25% for the five-letter sequences. These differences in chance values presumably account for a significant main effect of string length found in the analysis of variance performed on these conditional percentages, F(2, 24) = 86.9, MSe = 140, p < .001. As in previous studies, for all three string lengths in the present experiment, the conditional percentages of phonemic confusion errors were higher than those of visual confusion errors (and higher than the chance value) for temporal sequence recall, but a smaller difference in the opposite direction was found for spatial location recall. The main effect of confusion error type (phonemic or visual) was significant, F(1, 24) = 5.4, MSe = 1210, p = .028, as was the interaction of recall condition and confusion error type, F(1, 24) = 37.5, MSe = 1210, p < .001, but neither the two-way interaction of string length and confusion error type, F(2, 24) = 1.2, MSe = 1210, p = .308, nor the three-way interaction of string length, recall condition, and confusion error type, F(2, 24) = 1.9, MSe = 1210, p = .174, was significant. In previous experiments the conditional percentages of phonemic confusion errors in temporal sequence recall declined toward the chance value as retention interval increased. 
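The conditional percentages and chance values described above can be checked with a small sketch. The function names and the input format (a stimulus letter plus a list of the erroneous response letters) are assumptions made for illustration only; the similarity pairs are those named in the text (BV and SX phonemic, VX and BS visual), and, unlike the tabulations restricted to B, V, and X, this sketch simply reports both percentages for whatever stimulus letter it is given.

```python
from collections import Counter

# Similarity pairs as stated in the text for the letter set BVSXH
PHONEMIC_PAIRS = {frozenset("BV"), frozenset("SX")}
VISUAL_PAIRS = {frozenset("VX"), frozenset("BS")}


def conditional_confusion_percentages(stimulus, error_responses):
    """Percentage of errors on `stimulus` that are phonemic or visual
    confusions, given that an error was made (hypothetical helper)."""
    if not error_responses:
        return None  # undefined when the subject made no errors
    counts = Counter()
    for response in error_responses:
        pair = frozenset((stimulus, response))
        if pair in PHONEMIC_PAIRS:
            counts["phonemic"] += 1
        elif pair in VISUAL_PAIRS:
            counts["visual"] += 1
    total = len(error_responses)
    return {kind: 100.0 * counts[kind] / total
            for kind in ("phonemic", "visual")}


def chance_percentage(string_length):
    """Chance level: one of the (string_length - 1) incorrect letters
    forms a confusion pair of a given type with the stimulus."""
    return 100.0 / (string_length - 1)


# Worked example from the text: 10 errors on V (6 as B, 1 as X, 3 as S)
print(conditional_confusion_percentages("V", list("BBBBBBXSSS")))
print([round(chance_percentage(n)) for n in (3, 4, 5)])
```

For the worked example the sketch gives 60% phonemic and 10% visual confusions, and the chance values round to 50%, 33%, and 25% for the three string lengths.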
Likewise, for all string lengths in the present experiment, there was a decline in the conditional percentages of phonemic confusion errors in temporal sequence recall, but not in spatial location recall. No comparable decline was found in the conditional percentages of visual confusion errors in temporal sequence recall or spatial location recall. These findings are reflected in a significant interaction of confusion error type and retention interval, F(4, 96) = 3.5, MSe = 490, p = .010, and a significant three-way interaction of confusion error type, retention interval, and recall condition, F(4, 96) = 6.2, MSe = 490, p < .001. Although this pattern of results obtained for each of the string lengths, it was not quite as distinct for the three-letter sequences. Consequently, there was a significant three-way interaction of string length, confusion error type, and retention interval, F(8, 96) = 3.1, MSe = 490, p = .004, but the four-way interaction of string length, confusion error type, retention interval, and recall condition was not significant, F(8, 96) = 1.4, MSe = 490, p = .197.


Whereas a preponderance of phonemic confusion errors was found for temporal sequence recall, more visual than phonemic confusion errors were found for spatial location recall in the analyses described above. To determine whether the preponderance of visual confusion errors in spatial location recall was a general phenomenon or was more specifically attributable to the subset of errors considered in the above analyses (errors involving the letters B, V, and X), a more complete analysis included all possible confusion errors: The phonemic confusion errors involved in this analysis were those of B and V or S and X, and the visual confusion errors were those of V and X or B and S. In order to determine whether both pairs of phonemically confusable letters (or both pairs of visually confusable letters) showed the same pattern of results, this analysis made a distinction between the first pair of confusable stimulus letters of a given type and the second pair of letters of that type. The relative positions of the letter pairs were determined with respect to the constant letter order of the sequence. For example, consider a subject in the spatial location recall condition who saw the letters BVSX on every trial, with the temporal sequence always B, then V, then S, then X. The first pair of phonemically similar letters was BV and the second was SX, whereas the first pair of visually similar letters was BS and the second was VX. Note that for the three-letter sequences, there was only one pair, not two pairs, of confusable stimulus letters of a given type, so that responses to the stimulus letter X did not enter into the calculations of the conditional percentages of phonemic confusion errors, and responses to the stimulus letter B did not enter into the calculations of the conditional percentages of visual confusion errors. Likewise, for the five-letter sequences, responses to the stimulus letter H did not enter into the calculations of the conditional percentages of either phonemic or visual confusion errors.

The confusion error analysis is summarized in Table IX in terms of mean conditional percentages of phonemic and visual confusion errors, given that an error was made on an instance of the stimulus letter B, V, S, or X, as a function of string length, recall condition, constant letter order (BVX or XVB for the three-letter sequences, BVSX or BSVX for the four-letter sequences, and BVSXH or BSVXH for the five-letter sequences), and letter pair position (first or second in constant letter order). Only the retention intervals with random digit patterns were included in this analysis. Analyses of variance yielded 12, 5, and 3% as estimates of the standard errors of the entries in Table IX for string lengths 3, 4, and 5, respectively.

TABLE IX
MEAN CONDITIONAL PERCENTAGES OF PHONEMIC AND VISUAL CONFUSION ERRORS (EXPERIMENT 3)a

                     String length 3     String length 4     String length 5
Recall condition      BVX     XVB         BVSX    BSVX        BVSXH   BSVXH
Temporal
  Phonemic
    First              52      65          59      69          42      40
    Second              -       -          64      70          45      46
  Visual
    First              35      39          14      16          17      16
    Second              -       -          14      16          18      19
Spatial
  Phonemic
    First              19      63          31      19          23      23
    Second              -       -          38      24          31      25
  Visual
    First              73      42          45      31          20      27
    Second              -       -          31      51          31      33

a As a function of string length, recall condition, letter pair position, and constant letter order for random digit sequences only.

For all three string lengths, for both constant letter orders of each string length, and for both letter pair positions of each constant letter order, the conditional percentage of phonemic confusion errors was greater than that of visual confusion errors in temporal sequence recall. In contrast, the pattern of confusion errors for spatial location recall was strongly influenced by string length, constant letter order, and letter pair position. In particular, when the letters next to each other in the constant temporal sequence were visually similar (as in the sequence BSVX), then more visual than phonemic confusions occurred. However, when the letters adjacent to each other in the constant temporal sequence were phonemically similar (as in the sequence BVSX), then more phonemic than visual confusion errors occurred, at least when the adjacent letters were the last two in the constant temporal sequence. These findings presumably do not reflect visual or phonemic similarity per se, but result from the large number of interchanges involving adjacent temporal positions, especially the last two, as was noted for four-letter sequences by Healy (1978).

3. Distance Functions

To determine whether a preponderance of interchanges involved the last two temporal positions in spatial location recall in this experiment, as had been the case in previous studies, an analysis of the temporal distance


functions was performed. Table X shows these distance functions in terms of response percentages which reveal the distance in the temporal sequence between the correct letters and the letters that replaced them on the subject’s response protocol, for each of the recall conditions and string lengths. These functions do indeed show a disproportionately large number of interchanges between positions 2 and 3 in the three-letter sequences, between positions 3 and 4 in the four-letter sequences, and between positions 4 and 5 in the five-letter sequences, especially in spatial location recall. Also, it is interesting to note that for the five-letter sequences, the greatest number of interchanges does not involve the last two positions, but positions 3 and 4.

C. DISCUSSION

These results provide support for the generality of the differences between temporal sequence and spatial location retention found in earlier studies by Healy (1975, 1977, 1978), which were restricted to sequences of four to-be-remembered letters. The same general pattern of results was found for all three string lengths used in the present experiment and is consistent with the earlier conclusion that phonemic coding is used in temporal sequence recall and pattern coding in spatial location recall (although pattern coding appears to be most effective for the four-letter sequences). It is particularly interesting that in the present experiment, recall level on spatial location retention was actually superior to that on temporal sequence retention, even at the shorter retention intervals of the four-letter sequences. Thus, despite the fact that pattern coding was an efficient strategy and despite the fact that pattern coding was available for temporal sequence retention, subjects chose the less efficient phonemic coding strategy in this situation. This finding is consistent with the hypothesis proposed above that subjects choose a memory code that reflects what they heard themselves say during stimulus presentation, whenever that code contains some useful information, as it does in temporal sequence recall but not in spatial location recall. Although the pattern model proposed for the spatial location retention of four-letter sequences (Healy, 1978) cannot be applied directly to sequences of three and five to-be-remembered letters, its general principles can be applied to all three string lengths. According to the pattern model, subjects code three different items of pattern information: the spatial location of the first letter, the pattern class, and the spatial arrangement of the last two letters in the temporal sequence. 
Presumably, no new principles would be needed to derive a pattern model for three-letter sequences: The primacy advantage found in the spatial location recall condition for the temporal sequence positions is consistent with the hypothesis that an


TABLE X
OBSERVED TEMPORAL DISTANCE FUNCTIONS AS A FUNCTION OF RECALL CONDITION AND STRING LENGTH FOR RANDOM DIGIT PATTERNS ONLY (EXPERIMENT 3)

String length

3 Positiona

4

5

Temporal

Spatial

Temporal

Spatial

Temporal

Spatial

93 4 3

96 1 3

92 4 2

68

-

-

76 9 9 6

1

1 2 3 4 5

2

9 6 6

69 13 7 6 6

3 89 4 3

13 46 15 17

13 52 15 12

n

8

6

6 14

-

10

2 1 2

92

3

4

5

3 92 5

4

9 71 10 10

5

3 1

2 3 4 5 4 1 2 3

4 5

3 4 93

-

1 7 92

7 11 69 13

3 3 87 7

-

in

42 19 14

45

17

7 12 20 43 18

6 11 13 15 56

5 8 14 16 56

n

1

7

9 12 70

4

14

6 88

20 42

22 13

5 1 2 3 4 5

a The position code i, j (where i is the superordinate position label and j is the subordinate position label) refers to instances in which the correct letter being scored was presented in the temporal sequence position i and the subject’s response was the letter presented in the temporal sequence position j.


item of pattern information coded by subjects is the spatial location of the first letter in the temporal sequence. Furthermore, the disproportionately large number of interchanges involving the last two sequence positions in spatial location recall (which was large in relation to the number of interchanges involving the first two sequence positions) is consistent with the hypothesis that another item of pattern information coded by subjects, but rapidly lost, is the spatial arrangement of the last two letters in the temporal sequence (see Healy, 1978). In contrast, at least one new item of pattern information would have to be added to a pattern model for the five-letter sequences. Possibly this item of information might reflect the large number of interchanges involving temporal positions 3 and 4, since numerous interchanges that involved intermediate temporal positions were not found in the four-letter sequences.
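The temporal distance functions referred to in this discussion tabulate, for each temporal position i of a correct letter, the percentage of responses that were the letter presented in temporal position j. A minimal sketch of such a tabulation follows. It assumes each trial is supplied as a pair of strings, the letters in their presented temporal order and the response letters aligned to the same positions, and that every response letter comes from the presented set; the function name and this input format are hypothetical and are not the scoring procedure documented for the experiment.

```python
from collections import defaultdict


def temporal_distance_function(trials):
    """Return {i: {j: percentage}} where i is the temporal position of the
    correct letter and j is the temporal position in which the letter
    actually given as the response had been presented."""
    counts = defaultdict(lambda: defaultdict(int))
    for presented, response in trials:
        for i, given in enumerate(response, start=1):
            # Temporal position in which the responded letter was presented
            j = presented.index(given) + 1
            counts[i][j] += 1
    table = {}
    for i, row in counts.items():
        total = sum(row.values())
        table[i] = {j: 100.0 * n / total for j, n in sorted(row.items())}
    return table


# Two illustrative (made-up) trials: one correct, one with the last two
# temporal positions interchanged.
example = [("BVSX", "BVSX"), ("BVSX", "BVXS")]
for position, row in sorted(temporal_distance_function(example).items()):
    print(position, row)
```

In this made-up example the entries for i = j give the percentage correct at each position, and the off-diagonal entries for positions 3 and 4 reflect the interchange of the last two letters.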

V. Experiment 4

In Experiment 3, more visual than phonemic confusion errors were found overall for spatial location recall, but this difference was not found consistently. Furthermore, there were some indications in the data that the visual confusions observed were not reflecting visual similarity per se, but were the natural consequence of temporal-spatial pattern coding. Experiment 4 was designed to provide a clearer answer to the question of whether visual letter coding is employed in spatial location recall. The subset of letters included in this experiment was expanded in comparison to the previous experiments, so that a more definitive answer to the question could be obtained. Three visually confusable pairs of letters were selected: VX, BS, and FP. As in the previous experiments, the determination of which letters were visually confusable was made informally on the basis of the visual characteristics of the stimuli used. Since large effects of visual confusability were found for four-letter sequences in Experiment 3, the present experiment was restricted to sequences of four to-be-remembered letters. The design of the experiment was strictly analogous to that of the earlier experiment by Healy (1978, Experiment 2). The only essential difference in method was the expansion of the set of letters shown to subjects.

A. METHOD

1. Subjects

Thirty-six male and female Yale University undergraduates participated as subjects in order to fulfill a course requirement. Two of the subjects received 1 hour of course credit and $2.50 for the second hour of


participation. There were 18 subjects in each of two conditions: temporal sequence recall and spatial location recall. The subjects were further subdivided into six subconditions of the two conditions, with 3 subjects per subcondition.

2. Apparatus

The same apparatus was employed as in previous experiments. The first four spatial locations of the display screen were used to present stimuli.

3. Design and Materials

Twelve different 192-trial experimental sequences were prepared, based on those used by Healy (1978, Experiment 2). Each subject saw only 1 experimental sequence, and each sequence was shown to 3 subjects. Six sequences were for temporal sequence recall and 6 were for spatial location recall. A trial consisted of four successively presented consonants, printed in capital letters, followed by a retention interval of either 4 or 16 successively presented digits. The temporal-spatial pattern of letters on a given trial corresponded to the temporal-spatial pattern of letters shown on the corresponding trial in Experiment 2 of the study by Healy (1978) and in Experiment 2 of the present study. Similarly, the digits shown on a given trial corresponded exactly in identity and location to those shown on the corresponding trial of Experiment 2 by Healy (1978) and Experiment 2 of the present study. The population of consonants employed differed from that used by Healy (1978); it included six letters: BVSXPF. There were four sequences with each of the following letter combinations: BVSX, BPSF, and PVFX. Note that each of these combinations includes two pairs of letters that are visually similar (from among BS, VX, and PF) and two pairs of letters that are phonemically similar (from among BV, BP, PV, SX, SF, and FX). The four sequences with a given four-letter combination included two sequences in the temporal sequence recall condition and two sequences in the spatial location recall condition, which were identical except for the temporal sequence and spatial arrangement of the letters on a given trial. The constant temporal order of the letters in a sequence for the spatial location recall condition was the same as the constant spatial arrangement of the letters in the corresponding sequence for the temporal sequence recall condition. The constant letter orders were BVSX, BSVX, BPSF, BSPF, PVFX, and PFVX for the six sequence pairs, respectively. 
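The statement that each four-letter combination contains two visually similar and two phonemically similar pairs can be verified mechanically. The sketch below assumes only the similarity pairs listed in the text; the helper name `similar_pairs` is hypothetical.

```python
from itertools import combinations

# Similarity pairs as listed in the text for the letter set BVSXPF
VISUAL = {frozenset(p) for p in ("VX", "BS", "FP")}
PHONEMIC = {frozenset(p) for p in ("BV", "BP", "PV", "SX", "SF", "FX")}


def similar_pairs(letters):
    """List the visually and phonemically similar pairs contained in a
    letter combination (hypothetical helper)."""
    pairs = [frozenset(p) for p in combinations(letters, 2)]
    return {
        "visual": sorted("".join(sorted(p)) for p in pairs if p in VISUAL),
        "phonemic": sorted("".join(sorted(p)) for p in pairs if p in PHONEMIC),
    }


for combination in ("BVSX", "BPSF", "PVFX"):
    print(combination, similar_pairs(combination))
```

Each of the three combinations yields exactly two pairs of each kind, so the visual and phonemic confusion analyses have the same number of opportunities within every letter set.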
Different permutations of the letters ABCD were shown on eight practice trials. Two different sequences of practice trials were used, one for the temporal sequence recall condition and one for the spatial location recall condition. For both sequences the constant letter order was ABCD. Eight different temporal-spatial patterns were used, one with each of the eight combinations of retention interval and digit pattern type.

4. Procedure

Subjects were tested individually in sessions that lasted approximately 2 hours. Each subject was instructed to read aloud each consonant and digit as it appeared on the display screen. At the end of each sequence, the subject’s task was to write down the four letters shown on that trial in their temporal sequence in the temporal sequence recall condition and in their spatial arrangement in the spatial location recall condition. The subjects wrote their responses on 3 × 5-inch index cards on which four boxes had been printed in a horizontal linear array. The subjects were not required to fill in the boxes in any particular temporal sequence, but they were forced to fill in all the boxes on the card, and they were told to guess if necessary. At the end of every block of 48 experimental trials, there was a short rest break. Subjects were told which four letters would be shown to them on every trial and were given the constant temporal sequence or constant spatial arrangement of the letters.

B. RESULTS

The results are summarized in Table XI in terms of percentages of correct responses for the two experimental conditions as a function of retention interval and the relation between the to-be-recalled letter pattern and the interpolated digit pattern. An analysis of variance yielded 2% as an estimate of the standard error of the entries of Table XI. The results replicate in all essential details those found in the earlier experiment by Healy (1978, Experiment 2), which differed from the present experiment only in terms of the particular letters shown to subjects. Although overall performance did not differ significantly between the two recall conditions, F(1, 34) = 2.5, MSe = 4641, p = .121, retention was much better at the short retention interval than at the long interval, F(1, 34) = 145.6, MSe = 540, p < .001, and the effect of retention interval was substantially larger in temporal sequence recall than in spatial location recall, F(1, 34) = 38.9, MSe = 540, p < .001. In addition, subjects made fewer errors on trials in which the interpolated digit pattern was identical to the pattern of to-be-remembered letters, F(3, 102) = 17.9, MSe = 127, p < .001, but this difference was greater at the longer retention interval than at the shorter interval, F(3, 102) = 2.7, MSe = 183, p = .047, and was greater in the spatial location recall condition than in the temporal sequence recall condition, F(3, 102) = 5.4, MSe = 127, p = .002.

TABLE XI
PERCENTAGES OF CORRECT RESPONSES BY RECALL CONDITION, RETENTION INTERVAL, AND RELATION BETWEEN THE TO-BE-RECALLED LETTER PATTERN AND THE INTERPOLATED DIGIT PATTERN (EXPERIMENT 4)

                   Retention interval 4                   Retention interval 16
Condition    Same  Different  Similar  Random       Same  Different  Similar  Random
Temporal      81      81        82       83          61      55        56       56
Spatial       85      76        77       80          79      69        69       69
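The two descriptive patterns noted above (a larger retention loss in temporal sequence recall, and a larger advantage for identical digit patterns in spatial location recall) can be read off the cell means of Table XI. The following is a minimal sketch using the tabled percentages; the function names and the choice of summary (the same-pattern cell minus the mean of the other three cells) are illustrative assumptions, not the analysis of variance reported in the text.

```python
# Percent correct from Table XI, keyed by (recall condition, retention interval).
TABLE_XI = {
    ("temporal", 4):  {"same": 81, "different": 81, "similar": 82, "random": 83},
    ("spatial", 4):   {"same": 85, "different": 76, "similar": 77, "random": 80},
    ("temporal", 16): {"same": 61, "different": 55, "similar": 56, "random": 56},
    ("spatial", 16):  {"same": 79, "different": 69, "similar": 69, "random": 69},
}


def cell_mean(cell):
    """Mean percent correct across the four digit-pattern relations."""
    return sum(cell.values()) / len(cell)


def retention_loss(condition):
    """Drop in mean percent correct from the 4-digit to the 16-digit interval."""
    return cell_mean(TABLE_XI[(condition, 4)]) - cell_mean(TABLE_XI[(condition, 16)])


def same_pattern_advantage(condition, interval):
    """Same-pattern percent correct minus the mean of the other three relations."""
    cell = TABLE_XI[(condition, interval)]
    others = [value for key, value in cell.items() if key != "same"]
    return cell["same"] - sum(others) / len(others)


for condition in ("temporal", "spatial"):
    print(condition, "retention loss:", retention_loss(condition))
    for interval in (4, 16):
        print("  interval", interval, "same-pattern advantage:",
              round(same_pattern_advantage(condition, interval), 1))
```

These differences are descriptive only; the statistical reliability of each effect rests on the F tests given in the text.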

1. Phonemic and Visual Confusions

An analysis of phonemic and visual confusion errors was conducted, analogous to that performed for Experiment 3. Phonemic confusion errors consisted of confusions of B and V, B and P, P and V, S and X, S and F, or F and X; visual confusion errors consisted of confusions of B and S, V and X, or P and F. The tabulation of confusion and nonconfusion errors is summarized in Table XII in terms of absolute percentages of correct responses, confusion errors, and nonconfusion errors as a function of confusion error type, recall condition, and retention interval. As in the preceding experiment, conditional percentages of confusion errors were also computed and are more readily interpreted. Table XIII presents the mean conditional percentages of phonemic and of visual confusion errors, given that an error was made on a particular letter, as a function of recall condition and retention interval. An analysis of variance yielded 2% as an estimate of the standard error of the entries in Table XIII. As previous research had indicated, there were more phonemic than visual confusion errors in temporal sequence recall; but in spatial location recall, visual confusion errors were as prevalent as phonemic confusion errors. These effects were reflected in an overall main effect of confusion error type (phonemic or visual), F(1, 24) = 38.1, MSe = 2240, p < .001, and an interaction of confusion error type and recall condition (temporal sequence or spatial location), F(1, 24) = 37.0, MSe = 2240, p < .001. In each case of visual and of phonemic confusion errors for temporal sequence and spatial location recall, the conditional percentages moved in the direction of the chance level (33%, since one out of every three incorrect letters would constitute a confusion error of a given type) as the retention interval increased from 4 to 16 interpolated digits. Consequently, the main effect of retention interval was significant, F(1, 24) = 17.8, MSe = 270, p < .001, as was the interaction of retention interval and confusion error type, F(1, 24) = 73.7, MSe = 550, p < .001, and the three-way interaction of retention interval, recall condition, and confusion error type, F(1, 24) = 75.1, MSe = 550, p < .001.

TABLE XII
PERCENTAGES OF CORRECT RESPONSES, CONFUSION ERRORS, AND NONCONFUSION ERRORS AS A FUNCTION OF CONFUSION TYPE, RECALL CONDITION, AND RETENTION INTERVAL (EXPERIMENT 4)

                         Retention interval 4                   Retention interval 16
Recall condition   Correct  Confusion  Nonconfusion       Correct  Confusion  Nonconfusion
Temporal
  Phonemic            82       13           5                57       17          26
  Visual              82        2          16                57       13          30
Spatial
  Phonemic            80        8          13                72       10          18
  Visual              80        9          12                72       10          18
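The relation between the absolute percentages of Table XII and the 33% chance level can be sketched as follows. The sketch pools over subjects by dividing each tabled confusion percentage by the total error percentage, so its output need not match the mean conditional percentages of Table XIII, which were presumably averaged over subjects before tabling; that difference, and all names used, are assumptions of this illustration.

```python
# (confusion %, nonconfusion %) from Table XII, keyed by
# (recall condition, confusion type, retention interval).
TABLE_XII_ERRORS = {
    ("temporal", "phonemic", 4): (13, 5),
    ("temporal", "visual", 4): (2, 16),
    ("temporal", "phonemic", 16): (17, 26),
    ("temporal", "visual", 16): (13, 30),
    ("spatial", "phonemic", 4): (8, 13),
    ("spatial", "visual", 4): (9, 12),
    ("spatial", "phonemic", 16): (10, 18),
    ("spatial", "visual", 16): (10, 18),
}

# One of every three incorrect letters is a confusion of a given type.
CHANCE = 100 / 3


def pooled_conditional_percentage(key):
    """Confusion errors as a percentage of all errors in one cell."""
    confusion, nonconfusion = TABLE_XII_ERRORS[key]
    return 100 * confusion / (confusion + nonconfusion)


def moves_toward_chance(condition, confusion_type):
    """True if the 16-digit value lies closer to chance than the 4-digit value."""
    short = pooled_conditional_percentage((condition, confusion_type, 4))
    long = pooled_conditional_percentage((condition, confusion_type, 16))
    return abs(long - CHANCE) < abs(short - CHANCE)


for condition in ("temporal", "spatial"):
    for confusion_type in ("phonemic", "visual"):
        values = [round(pooled_conditional_percentage((condition, confusion_type, i)), 1)
                  for i in (4, 16)]
        print(condition, confusion_type, values,
              "toward chance:", moves_toward_chance(condition, confusion_type))
```

Even with this cruder pooled measure, all four cells move toward the chance level as the retention interval lengthens, in line with the description in the text.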
TABLE XIII
MEAN CONDITIONAL PERCENTAGES OF PHONEMIC AND VISUAL CONFUSION ERRORS BY RETENTION INTERVAL AND RECALL CONDITION (EXPERIMENT 4)

                     Retention interval
Recall condition        4        16
Temporal
  Phonemic             69        40
  Visual               11        30
Spatial
  Phonemic             37        34
  Visual               37        34

Although the constant letter order did not affect temporal sequence recall, it did strongly affect spatial location recall, as in Experiment 3. More phonemic than visual confusion errors were found in spatial location recall when the letters that were phonemically similar were adjacent to each other in the constant temporal sequence (as they were in the sequences with constant order BVSX, BPSF, PVFX), and more visual than phonemic confusion errors were found in spatial location recall when the letters that were visually similar were adjacent to each other in the constant temporal sequence (BSVX, BSPF, PFVX). This effect was reflected in an interaction of constant letter order and confusion error type, F(5, 24) = 7.1, MSe = 2240, p < .001, and a three-way interaction of recall condition, constant letter order, and confusion error type, F(5, 24) = 5.3, MSe = 2240, p = .002. This pattern of results is evident in Table XIV, which provides mean conditional percentages of phonemic and visual confusion errors as a function of letter pair position (first or second, see Section IV,B,2 for the method used to determine letter pair position), recall condition, and constant letter order. An analysis of variance yielded 3% as an estimate of the standard error of the entries in Table XIV.

TABLE XIV
MEAN CONDITIONAL PERCENTAGES OF PHONEMIC AND VISUAL CONFUSION ERRORS BY LETTER PAIR POSITION, RECALL CONDITION, AND CONSTANT LETTER ORDER (EXPERIMENT 4)

                                     Confusion errors
                               Phonemic               Visual
Constant letter order     Temporal  Spatial     Temporal  Spatial
BVSX
  First                      56       40           19       15
  Second                     62       72           23       20
BSVX
  First                      65       24           12       47
  Second                     70       30           13       46
BPSF
  First                      54       35           14       20
  Second                     71       57           16       25
BSPF
  First                      45       28           20       39
  Second                     55       31           23       49
PVFX
  First                      35       30           24       18
  Second                     53       51           33       33
PFVX
  First                      38       11           24       35
  Second                     52       18           24       80

The finding in spatial location recall of many confusion errors involving letters in neighboring temporal positions can be understood in terms of the pattern model, which predicts a large number of interchanges of the last two letters as a result of the rapid loss of the item of pattern information representing the spatial arrangement of the last two letters. Indeed, more phonemic and visual confusion errors in spatial location recall involved the second pair of confusable letters rather than the first pair, especially when the letters in each pair were adjacent to each other in the temporal sequence. These findings are reflected in a significant main effect of letter pair position, F(1, 24) = 48.9, MSe = 660, p < .001, and significant interactions of letter pair position and recall condition, F(1, 24) = 4.7, MSe = 660, p = .038, letter pair position and confusion error type, F(1, 24) = 7.0, MSe = 270, p = .014, letter pair position, constant letter order, and confusion error type, F(5, 24) = 5.0, MSe = 270, p = .003, and letter pair position, recall condition, constant letter order, and confusion error type, F(5, 24) = 7.4, MSe = 270, p < .001. These effects of letter pair position may be due in part to the specific letters that enter into each pair, but the consistency of the pattern strongly suggests that the temporal positions of the letters in a pair are of major importance.

2. Pattern Confusions

An additional analysis was conducted to test more specifically the prediction of the pattern model that in spatial location recall the correct letter pattern will be confused with the letter pattern that contains the same items of information except for the spatial arrangement of the last two letters, especially when the last two letters in the correct letter pattern do not occur in the “normal” arrangement, but from right to left (see Healy, 1978). As predicted, on the trials with random digit patterns, this type of pattern confusion occurred in spatial location recall on 27 trials when the last two letters in the correct pattern were in the normal arrangement and on 63 trials when the last two letters in the correct pattern were in the reverse arrangement, out of 318 trials with errors. In contrast, in temporal sequence recall, this type of pattern confusion occurred somewhat less often: on 29 trials when the last two letters in the correct pattern were in the normal arrangement and on 32 trials when they were in the reverse arrangement, out of 414 trials with errors.
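A uniform-guessing baseline for these counts can be sketched briefly. The sketch assumes that, on a trial with an error, each of the erroneous arrangements of the four letters is equally likely; the counts are those given in this paragraph, and the variable and function names are hypothetical.

```python
from math import factorial

N_LETTERS = 4

# Four letters can be arranged in 4! ways, one of which is correct.
N_ERRONEOUS_PATTERNS = factorial(N_LETTERS) - 1

# Trials with errors (random digit patterns only), by recall condition.
ERROR_TRIALS = {"spatial": 318, "temporal": 414}

# Observed interchanges of the last two letters, by arrangement of the correct pattern.
LAST_PAIR_CONFUSIONS = {
    "spatial": {"normal": 27, "reverse": 63},
    "temporal": {"normal": 29, "reverse": 32},
}


def chance_count(condition):
    """Expected frequency of any one erroneous pattern under uniform guessing."""
    return ERROR_TRIALS[condition] / N_ERRONEOUS_PATTERNS


for condition, counts in LAST_PAIR_CONFUSIONS.items():
    print(condition,
          "chance per pattern:", round(chance_count(condition)),
          "observed:", counts)
```

The reverse-arrangement count in spatial location recall stands furthest above this baseline, which is the asymmetry the pattern model predicts.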

The frequency of these pattern confusions is especially high when compared to the chance values of 14 for spatial location recall and 18 for temporal sequence recall, which were calculated by dividing the total numbers of trials with errors by 23, the number of possible erroneous letter patterns. These results provide further support for the operation of the pattern model, especially in spatial location recall.

C. DISCUSSION

The primary goal of this experiment was to determine whether there was any evidence for visual letter coding in spatial location recall. In this experiment no more visual confusion errors were found than phonemic confusion errors in spatial location recall, and the pattern of errors observed could be explained entirely in terms of the coding of attributes of temporal-spatial patterns. It is possible that subjects in spatial location recall do not attend to features of the letters themselves, but attend exclusively to the sequence of spatial locations that include stimuli. This pattern of results is therefore consistent with the findings in Experiment 2 that pattern coding was employed even when the subjects were not shown a sequence of different letters, but were shown a single neutral stimulus letter that occurred repeatedly in different spatial locations.

VI. Conclusions

A. INDEPENDENCE OF ITEM AND ORDER INFORMATION

The question of central interest in this study was whether the retention of item and order information draws on the same processing capacity. A clear negative answer to this question was provided for both temporal sequence and spatial location information. In Experiment 1, the percentages of correct responses were just as high in the item + order - item condition as in the order only condition, despite the fact that the memory load for item information was considerably greater in the item + order - item condition. This result indicates that the amount of to-be-remembered item information does not influence the retention of temporal sequence information. In Experiment 2, the pattern of results for the count condition was strictly analogous to that for the previous spatial location recall conditions, although essentially no item information was to be remembered in the count condition, since only a single neutral stimulus occurred in different spatial locations. This result suggests that the coding strategy used to retain spatial location information is not affected by the amount of associated item information. Experiments 3 and 4 gave further indications that spatial location retention is not affected by the quality of the to-be-remembered items, because there were no consistent effects in these experiments of either the phonemic or the visual characteristics of the items.

What implications do these results have for the models of short-term memory that have been proposed? These findings provide devastating evidence against Conrad’s (1965) slot theory. According to that model, order errors are a by-product of item errors and completely dependent on them. Thus, the slot model cannot accommodate the finding that order errors were just as frequent in the item + order - item condition as in the item + order condition of Experiment 1, even though no item errors were made in the former condition. It is also hard to reconcile this model with the finding that order errors were just as common in the count condition of Experiment 2 as in the spatial location recall condition of the earlier experiment by Healy (1978), although all items were the same in the count condition. The independence between the retention of item and order information demonstrated in these experiments is also difficult to reconcile with the models proposed by Murdock (1976), Shiffrin and Cook (1978), and Drewnowski (1980), because each of these models postulates a certain degree of interdependence between the loss of item and order information. In contrast, the augmented version of the perturbation model of Lee and Estes (1981) can accommodate these findings quite easily, since order and item errors are due to perturbations at two different levels of this system.
Likewise, the more specific model proposed by Healy (1978) is consistent with the present findings because it postulates that subjects code information about the temporal-spatial pattern of letter presentations, rather than information about the letters themselves, when recalling the spatial arrangement of a sequence of letters.

B. CODING STRATEGIES

A second question of the present study was why subjects use different coding strategies for the retention of temporal sequence and spatial location information. This question was answered by referring to the striking differences found between the count and position conditions in Experiment 2. These two conditions differed only in what the subjects said during the presentation of the to-be-remembered stimuli; in the count condition, like the previous spatial location recall conditions, the subjects’ utterances contained no useful information, whereas in the position condition, like the previous temporal sequence recall conditions, the subjects’ utterances did contain useful information. Since the count condition showed a pattern of results analogous to that in the previous spatial location recall conditions, and the position condition showed results comparable to those in the previous temporal sequence recall conditions, the conclusion was reached that what the subjects say during stimulus presentation influences the recall strategy that they use. More specifically, subjects make use of a phonemic code that reflects what they said during stimulus presentation whenever that code contains some useful information.

C. MENTAL REPRESENTATIONS

The present findings also have implications for a much broader issue in cognitive psychology concerning the representation of information in memory (e.g., Anderson, 1978; Kosslyn & Pomerantz, 1977). The present results suggest that the pattern code used to retain spatial location information is analogical rather than verbal. Subjects in the position condition of Experiment 2, who were required to vocalize the sequence of spatial location position numbers of the stimuli, showed less evidence of pattern coding than those in the count condition, whose vocalizations contained no useful information. This result indicates that pattern coding is not equivalent to a verbal coding of spatial position numbers, which may not be the only possible verbal code for spatial location information, but is surely the most reasonable one available in this situation. Although the present results ruled out verbal coding as a likely basis for spatial location retention, a code based on a static visual image also seems unlikely for two reasons: First, visual confusion errors were not consistently found in Experiments 3 and 4, and those that were obtained could be explained in terms of factors that were irrelevant to visual similarity per se. Second, a static visual image could not be used to retain the spatial location information in Experiment 2, since only a single stimulus item (the letter X) occurred in different spatial locations. Although a static visual image revealing the spatial arrangement of the to-be-recalled items might have provided the basis for retention in the spatial location recall conditions with letters, such an image would be a redundant string of Xs in the conditions of Experiment 2.

What is the nature of the analogical pattern code used for spatial location retention? The modality in which pattern information is stored cannot be determined from the present findings. The pattern information could be stored in terms of a kinesthetic code, a visual code, or a more abstract code. Nevertheless, it is clear from the present results that this code, like that postulated by Healy (1978), makes reference to both temporal and spatial properties of the stimulus configuration.


ACKNOWLEDGMENTS

This research was supported in part by NSF Grants BNS77-00077 and BNS80-00263 to Yale University and BNS80-25020 to the Institute of Cognitive Science at the University of Colorado. The author was supported by a Senior Faculty Fellowship from Yale University during the preparation of this article. The author is indebted to William K. Estes for generously providing many helpful suggestions at numerous phases of this research, to Lorretta T. Polka for her careful help with the construction of the experimental materials and the conduct and analyses of all four experiments, to Maureen McNamara for her aid in constructing the materials used in Experiment 1, and to Mary LaRue for her aid in the data analyses of Experiment 4.

REFERENCES

Anderson, J. R. Arguments concerning representations for mental imagery. Psychological Review, 1978, 85, 249-277.
Berch, D. B. Coding of spatial and temporal information in episodic memory. In H. W. Reese & L. P. Lipsitt (Eds.), Advances in child development and behavior (Vol. 13). New York: Academic Press, 1979. Pp. 1-46.
Bjork, E. L., & Healy, A. F. Short-term order and item retention. Journal of Verbal Learning and Verbal Behavior, 1974, 13, 80-97.
Brown, J. Some tests of the decay theory of immediate memory. Quarterly Journal of Experimental Psychology, 1958, 10, 12-21.
Cimbalo, R. S., & Laughery, K. R. Short-term memory: Effects of auditory and visual similarity. Psychonomic Science, 1967, 8, 57-58.
Conrad, R. Acoustic confusions in immediate memory. British Journal of Psychology, 1964, 55, 75-84.
Conrad, R. Order error in immediate recall of sequences. Journal of Verbal Learning and Verbal Behavior, 1965, 4, 161-169.
Conrad, R. Interference or decay over short retention intervals? Journal of Verbal Learning and Verbal Behavior, 1967, 6, 49-54.
Crossman, E. R. F. W. Information and serial order in human immediate memory. In C. Cherry (Ed.), Information theory. London: Butterworth, 1961.
Crowder, R. G. Similarity and order in memory. In G. H. Bower (Ed.), Psychology of learning and motivation (Vol. 13). New York: Academic Press, 1979. Pp. 319-353.
Drewnowski, A. Attributes and priorities in short-term recall: A new model of memory span. Journal of Experimental Psychology: General, 1980, 109, 208-250.
Estes, W. K. An associative basis for coding and organization in memory. In A. W. Melton & E. Martin (Eds.), Coding processes in human memory. Washington, D.C.: Winston, 1972. Pp. 161-190.
Estes, W. K. Phonemic coding and rehearsal in short-term memory for letter strings. Journal of Verbal Learning and Verbal Behavior, 1973, 12, 360-372.
Healy, A. F. Separating item from order information in short-term memory. Journal of Verbal Learning and Verbal Behavior, 1974, 13, 644-655.
Healy, A. F. Coding of temporal-spatial patterns in short-term memory. Journal of Verbal Learning and Verbal Behavior, 1975, 14, 481-495.
Healy, A. F. Pattern coding of spatial order information in short-term memory. Journal of Verbal Learning and Verbal Behavior, 1977, 16, 419-437.
Healy, A. F. A Markov model for the short-term retention of spatial location information. Journal of Verbal Learning and Verbal Behavior, 1978, 17, 295-308.
Kosslyn, S. M., & Pomerantz, J. R. Imagery, propositions, and the form of internal representations. Cognitive Psychology, 1977, 9, 52-76.
Lee, C. L., & Estes, W. K. Order and position in primary memory for letter strings. Journal of Verbal Learning and Verbal Behavior, 1977, 16, 395-418.
Lee, C. L., & Estes, W. K. Item and order information in short-term memory: Evidence for multilevel perturbation processes. Journal of Experimental Psychology: Human Learning and Memory, 1981, 7, 149-169.
Murdock, B. B., Jr. Item and order information in short-term serial memory. Journal of Experimental Psychology: General, 1976, 105, 191-216.
Navon, D., & Gopher, D. On the economy of the human-processing system. Psychological Review, 1979, 86, 214-255.
Shiffrin, R. M., & Cook, J. R. Short-term forgetting of item and order information. Journal of Verbal Learning and Verbal Behavior, 1978, 17, 189-218.
Sperling, G., & Speelman, R. G. Acoustic similarity and auditory short-term memory: Experiments and a model. In D. A. Norman (Ed.), Models of human memory. New York: Academic Press, 1970. Pp. 151-202.

RETROSPECTIVE AND PROSPECTIVE PROCESSING IN ANIMAL WORKING MEMORY

Werner K. Honig
DALHOUSIE UNIVERSITY
HALIFAX, NOVA SCOTIA, CANADA

Roger K. R. Thompson
FRANKLIN AND MARSHALL COLLEGE
LANCASTER, PENNSYLVANIA

I. Introduction: Retrospective and Prospective Remembering
II. Representations of Initial and Test Stimuli
    A. Anticipation of Different Test Stimuli
    B. Directed Forgetting Experiments
    C. Changes in the Representation o ...
    D. Variations in Initial Stimuli
III. Differentiation of Trial Outcomes
IV. Comparisons among Working Memory Paradigms
    A. Delayed Discrimination and Delayed Matching
    B. Delayed Discrimination and Delayed Conditional Discrimination
    C. Mediating Behavior in the Dolphin
    D. Memory for Auditory and Visual Events by Rats
    E. Delayed Discriminations and Outcome Expectancies
V. Memory for Multiple Items
    A. Serial Probe Recognition
    B. Memory for Temporal Order
    C. The Radial Maze
VI. Discrimination and Memory of Stimulus Sequences
    A. Sequence Discriminations by Pigeons
    B. Auditory Sequence Discriminations by the Dolphin
VII. Conclusions
References

I. Introduction: Retrospective and Prospective Remembering

In this article we describe some current research in the area of “short-term,” or, as we prefer to call it, “working memory” in animals. We hope to contribute to the understanding of the memory processes in several experimental paradigms. We assume that the “long-term,” or reference, memory for any particular problem is stable. Our main interest is in the processes that govern remembering when a memory interval (MI) occurs between the initial stimulus (IS) and the test stimulus (TS) or test stimuli (TSs) in the course of a trial. Generally, performance deteriorates to some degree when such intervals are introduced; there is a clear relationship between loss of accuracy and the length of the MI. On the other hand, forgetting is not inevitable, nor is it always rapid. Our interest is in the nature of the processes that mediate good performance during the MI.

Working memory processes may be conceptualized in two ways. Traditionally, they are thought to be retrospective. The IS establishes a surrogate, or “representation,” which is maintained to some degree during the MI. This could take the form of a decaying trace (Roberts & Grant, 1976), or it could involve differential mediating behavior (Blough, 1959). If such a representation is maintained, it provides the information for the subject to make a correct response decision at the end of the MI, when the TS is presented. A third possibility is best understood as a recognition process: When the TSs are presented, the subject scans its recent past to determine which one occurred most recently, and responds to it (D’Amato, 1973). Again, the IS takes the form of a representation, which would be initiated at the time of the TS.

An alternative conceptualization is that the memory process is prospective, or anticipatory. Perhaps the subject can, in the course of training, develop schemas, or “temporal maps” (Honig, 1981), of the sequences of events in the training trials. He could then represent the probable trial outcome even at the time of IS. This would enable him to make an appropriate response decision at that time, and maintain the decision during the MI, rather than some representation of the IS.
In view of the fact that organized behavior is generally oriented to the future rather than the past, this is a plausible assumption and by no means a recent one (see Morgan, 1930, Ch. XI). However, a decision between these alternatives (retrospective and prospective remembering) is far from easy. Both processes may be within the competence of the subject, and the one that is used may be determined by the requirements of the experimental paradigm. Furthermore, the processes might operate concurrently. A simple example illustrates the nature of the problem: Suppose that you observe the animal’s behavior in a working memory situation, and find that it engages in different mediating behaviors, doing one thing after one IS, and making a different kind of movement after the other. The differential behaviors could be surrogates or representations of the two ISs. On the other hand, since different responses are required following each IS, the mediating behaviors could be maintaining different response decisions regarding the correct behavior at the end of the trial. Clearly, the analysis of working memory processes sometimes needs to be indirect, and sometimes involves rather complex experimental procedures.

In the area of working memory, paradigms differ in complexity. The simplest is the delayed discrimination (DD), in which a trial begins with either of two initial stimuli, which unequivocally signal different trial outcomes (such as reinforcement or nonreinforcement), or different response requirements for obtaining reinforcement. The test stimulus is (usually) the same for both kinds of trials, and responses to it reveal the discrimination between the ISs. In this situation, the subject can anticipate the trial outcome at the time of IS and therefore encode the appropriate response to the TS even before the beginning of the MI. This could take the form of a simple instruction: “Peck (don’t peck) the white key,” or “Push the left (right) lever when the light comes on.”

Working memory has generally been studied with more complex procedures. Delayed matching to sample (DMS) is the most familiar. Here, the IS indicates which TS will be correct at the end of MI. The subject cannot encode the response by location, since the location of the correct TS varies between trials. Therefore, he has to wait until the TSs appear to determine which will be correct. The same goes for delayed conditional matching (DCM), where the correct TS also depends upon the IS, but does not actually resemble the IS. If only one TS is presented after the MI, this turns conditional matching into a delayed conditional discrimination (DCD). In this procedure, the trial outcome is determined conjointly by the IS and the TS. In its “matching” form, the DCD requires that the subject respond to the TS only if it is the same as the IS. Finally, the DCD can be turned into a serial probe recognition (SPR) task by presenting several ISs successively on each trial.
These are followed (after the MI) by a single test item, or probe. The subject responds in one way if the test item was contained in the initial list, and in another way if it was not. Our analysis is based in part upon the amount of stimulus information that must be encoded at the time of the IS. For all the procedures just mentioned except the SPR task, the amount of information carried by the ISs is the same; there are two alternatives. It is reasonable to suppose, therefore, that memory functions based on retrospective processing should not differ. However, the amount of stimulus information that would have to be carried in prospective memory does differ among the procedures just described. In the DD, the instruction is simple, for example, “Respond” or “Don’t respond.” In the other paradigms, the prospective process would also have to carry stimulus content, such as “Respond if red, don’t respond if green” or the opposite. This stimulus content may lead to rapid memory loss, and it may be simpler for the


subject to remember the IS retrospectively. However, the information carried by the two memory processes would be the same, and whether the examination of forgetting curves will distinguish between them is not clear. Less direct approaches are required, and we shall review other experimental procedures in our attempt to analyze working memory.

II. Representations of Initial and Test Stimuli

The most direct approach to the distinction between prospective and retrospective processes in working memory is the analysis of the “content” of the memory: Is the subject governed by a representation of the prior IS, or of the appropriate course of action for the immediate future? This question is not a simple one. Representations can at best be studied indirectly. Roitblat (1982) reviews the question of representation in animal memory in detail. In this section we discuss experiments that seek to determine the nature of the representation during the MI in some working memory paradigms.

A. ANTICIPATION OF DIFFERENT TEST STIMULI

Stonebraker (1981), working with pigeons, used an ingenious method to distinguish between the retrospection of ISs and the anticipation of the correct TS. Alternative sets of TSs followed a fixed set of ISs, which were always red and green displayed on the response key. In the DMS task, the TSs were also red and green. In the DCM task, the TSs were vertical and horizontal lines. An instantial or “predictive” stimulus, correlated with the task, was presented during the MI. In Experiment 1 a circle preceded red and green TSs, while a triangle preceded the line orientations. (It should be assumed throughout this article that such conditions are normally counterbalanced. We omit the description of counterbalancing in order to simplify the exposition.) After acquisition, the subjects were occasionally confronted with test stimuli contrary to those predicted by the instantial stimuli. Thus, red and green would appear following the triangle, or vertical and horizontal lines after the circle. In a second experiment, Stonebraker used only single test stimuli. Red and green again served as ISs. When the TSs were also red or green, responses to the matching TS were reinforced (a DCD where the TSs matched the ISs). On other trials, the TS was white, and the subjects were rewarded for pecking at it following only one of the two ISs (a DD). Different instantial stimuli preceded the TSs in the two problems. On probe trials, these were followed by an inappropriate TS.


In both experiments, the accuracy of each bird was depressed significantly on the probe trials. The percentage correct on DMS fell from 90.6 to 74.3; on DCM it fell from 88.4 to 70.1. On the DCD and the DD, the discrimination ratios generally fell from .80 (or better) to .50 on probe trials. These results are consistent with the hypothesis that pigeons remembered a specific response instruction established by the information given conjointly by the sample and the instantial stimuli. If only red and green as sample attributes were encoded or represented during the delay, then which test stimuli are presented ought not to matter. If the pigeon cannot or does not anticipate particular test stimuli, then the instantial stimuli should be irrelevant to the memory process.

However, Stonebraker’s results are compatible with an alternative explanation based on retrospective processing. In principle, each discrimination could be based on a compound cue consisting of the memory of the IS and the current instantial cue (M. R. D’Amato, personal communication). For example, the red IS may be encoded and remembered; when this is followed by a circle, then the choice of red is correct. The response decision is made at the time of the test stimuli. Similarly, the memory of green, combined with the presence of the circle, would control choice of the green TS. Choices in the DCM procedure would be based on similar compounding of a memory of the IS with the presence of the triangle. Clearly, if the “wrong” TSs appear on probe trials, the subject would not have the appropriate cue available for a correct discrimination. Both models, the retrospective and the prospective, seem to be equally valid to explain Stonebraker’s data.

The performance of some subjects did not fall to chance on probe trials. This can be explained with a retrospective model on the basis of generalization decrement; the pigeon still has one cue (memory of the IS) available when test stimuli appear.
For a purely prospective model this finding would seem to create more of a difficulty, since the pigeon should not remember the IS at all. It is of some interest that performance did fall to chance on the delayed simple discrimination when that task was miscued. As we shall see when we compare performance on different paradigms, pigeons probably use prospective processing with this particular method. A further test would be of interest: Introduce occasional trials in which the instantial stimulus appears only briefly at the end of the MI, or in which the instantial stimulus is changed near the end of MI. This should disrupt prospective processing, which depends on the continuous presentation of the instantial cue; but it should not markedly affect retrospective processing, which depends only on the conjunction of a trace and the instantial cue at the end of the MI.


Recently, Roitblat (1980, Experiment 3) sought to identify the pigeon’s memory code in DCM by analyzing confusion errors. For two birds red, orange, and blue served as ISs, and two of three line orientations (0, 12.5, and 90 degrees from vertical) served as TS pairs. For the remaining subject, the roles of color and line orientation were reversed. Note that two of the colors (red and orange) and two of the lines (0 and 12.5 degrees) were quite similar and differed from the third stimulus (blue and 90 degrees, respectively). In training, the pair of similar colors was associated with a pair of dissimilar lines, and vice versa. Thus, following red, orange, and blue, choices of 90-, 12.5-, and 0-degree lines, respectively, were rewarded. Roitblat (1980) argued that “If the birds are remembering an elaborated code of the sample stimulus, then more similar samples will be confused with increasing delays than dissimilar samples. Similarly, if the bird is remembering the test stimuli, then the rate of confusion between similar test stimuli should increase more rapidly with increasing delays than should the rate of confusion between dissimilar test stimuli” (p. 348). Specifically, with orange as the IS, the pigeon should choose the 90-degree line (rather than the 12.5-degree line) with increasing delays if it confuses the memory of orange with a memory of red. On the other hand, the pigeon should choose the 0-degree line (rather than 12.5 degrees) if it confuses the anticipated choice of 12.5 degrees with the more similar test stimulus. The latter prediction was supported as Roitblat increased MI durations from 250 msec to 5.6 sec. Regrettably, Roitblat ran only three subjects. A set of systematic replications involving other colors and possibly other stimulus dimensions would be very desirable.

Results similar to those of Roitblat were obtained in an earlier study by Gaffan (1977, Experiment 2) on rhesus monkeys.
The monkeys were faced with a 3 × 2 array of response panels as shown in Table I. The IS was one of three colors (red, amber, or blue) presented on panel 5 (middle of the bottom row). After a 10-sec MI, one of panels 1, 3, or 6 was illuminated with white; and a black cross was shown on panel 4. Responses were rewarded to panel 1 following amber, panel 3 after blue, and panel 6 after red. A response to panel 4 was rewarded if the appropriate panel was not illuminated.

TABLE I
SPATIAL ARRANGEMENT OF RESPONSE PANELS^a

1 = “IS was amber”                                3 = “IS was blue”
4 = “No”                5: Sample (IS) color      6 = “IS was red”

^a Adapted from Gaffan (1977, Experiment 2).

In this experiment, the correct TSs associated with the dissimilar red and blue samples were spatially contiguous, while responses associated with the similar red and amber samples were spatially separated. Now consider the distribution of errors when panel 6 (associated with red) is lit at the end of a delay preceded by blue or amber. If the monkey confuses the traces of initial stimuli, then error rates should be higher following amber, because this stimulus is confused more readily with red than with blue. On the other hand, if the monkey confuses response instructions, then error rates should be higher after blue than amber, since panel 3 (correct after blue) is closer to panel 6 than is panel 1 (correct after amber). The results support the instructional notion. When panel 6 followed blue, percentages of errors were 23.4 and 16.8 for the two monkeys. The corresponding values following amber were 7.8 and 8.4 (p < .05).
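The logic of this comparison can be made concrete with a small sketch. It is only an illustration: the error percentages are those reported above for the two monkeys, whereas the function name `favored_hypothesis` and its decision rule (a plain comparison of mean error rates) are our own assumptions and are not Gaffan's statistical analysis.

```python
from statistics import mean

# Percentage of errors when panel 6 (correct after red) was lit,
# one value per monkey, as reported in the text.
errors_after_blue = [23.4, 16.8]
errors_after_amber = [7.8, 8.4]


def favored_hypothesis(after_blue, after_amber):
    """Return the account favored by the error pattern.

    Trace confusion predicts more errors after amber (amber resembles red).
    Instruction confusion predicts more errors after blue (panel 3 is
    closer to panel 6 than panel 1 is).
    The comparison of means is an illustrative assumption, not a test
    of statistical significance.
    """
    if mean(after_blue) > mean(after_amber):
        return "instruction confusion (prospective)"
    if mean(after_blue) < mean(after_amber):
        return "trace confusion (retrospective)"
    return "undecided"


print(favored_hypothesis(errors_after_blue, errors_after_amber))
```

With the reported values, errors after blue exceed errors after amber for both animals, which is the pattern expected if response instructions rather than stimulus traces are confused.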

B. DIRECTED FORGETTING EXPERIMENTS

Several recent experiments have been concerned with ‘‘directed forgetting” in working memory situations. Differential “remember” and “forget” cues presented during the MI systematically precede the presentation or omission of the test stimuli. The effects of the forget cue are assessed in probe test trials, where the TSs are presented “unexpectedly” at the end of the trial. The general result is that pigeons perform much more poorly after a forget cue than a remember cue, or after a forget cue followed within a short time by a remember cue in the same trial. It is tempting to conclude that pigeons fail to remember on forget trials because they anticipate the lack of the memory requirement. However, the interpretation of these studies is difficult because the omission of the TSs may involve omission of the reinforcer, the omission of any terminal stimulus, and the omission of any requirement to make a discrimination at the end of the trial. When these confounding features are eliminated, the “directed forgetting” effect becomes elusive (Maki, 1981). The area has been reviewed at length in three different articles (Grant, 1981; Maki, 1981; Rilling, Kendrick, & Stonebraker, 1982) and is the topic of many current research papers. We shall not review the topic in detail here. It is questionable whether the data from directed forgetting experiments contribute incisively to the distinction between prospective and retrospective memory processing. If the subject does “cancel” a memory process when the forget cue is presented, he could terminate the memory (or “trace”) of the IS, or he could terminate the anticipation of the correct TS. Directed forgetting studies indicate that working memory processes are not “automatic.” They can be terminated or continued in


anticipation of the end of the trial. But the experiments have not directly addressed the question of the “content” of working memory. They have provided a paradigm on which Stonebraker could base a more incisive approach to the content question, as described in the prior section. In his method, the instantial cues permitted the anticipation of specific test stimuli, rather than their presentation or omission.

C. CHANGES IN THE REPRESENTATION OF INITIAL STIMULI

If the subject represents the correct response in working memory, rather than the IS, there should be no systematic change in responding to the TSs which would be related to possible changes in the representation of the IS during the MI. For example, the subject might have to distinguish between the prior presentation of a loud or bright stimulus and a soft or dim stimulus. We could then detect a “fading” of the representation of the loud or bright stimulus in the course of an MI by observing an increase in (incorrect) choices of the TS appropriate to the soft or the dim IS. Very few experiments bear upon this important question. Those that are directly relevant have used duration as the discriminative stimulus.

Church (1980, Experiment 3B) studied the “internal clock.” He darkened the experimental chamber as the IS, and required a choice between left and right response levers as the test response. If the chamber was darkened for any of a range of “short” intervals (2.0, 2.5, or 3.2 sec), rats had to press the left (or right) lever to be reinforced after the end of the interval. If the interval was “long” (5.0, 6.4, or 8.0 sec), the right (or left) lever was correct. The rats mastered this discrimination quite accurately. Church then introduced delays of .5, 2, or 8 sec following the IS. There was no effect of the two shorter MIs; performance was very similar to the discrimination maintained with no MI. With an 8-sec MI, the rats made errors, which flattened the psychophysical function. However, the errors were not systematic. The rats had no general tendency to identify past stimuli as “short” more often than “long.” The point of subjective equality (PSE) did not change. Church concluded that the memory of duration became less accurate at the long interval, but that, on the average, the remembered length of the IS did not change.
There is evidence from psychophysical research with human subjects that apparent duration of a stimulus decreases following its termination. The rats gave no evidence of this, and they may well have made a response decision at the beginning of the MI. Such a decision would have been facilitated by the method; they had to choose between two levers inserted at the end of the MI. They could have moved to the anticipated


location of the correct lever during that interval. Such a strategy would be an instance of prospective remembering.

More recently, Spetch (1981) obtained evidence from pigeons that the representation of stimulus duration changes over the course of time. She used a DCM procedure. Either houselight illuminations or food presentations of varying durations served as the IS. The TSs were red and green presented on the side keys of the chamber. A number of initial experiments all demonstrated a “choose short” effect. That is, over the course of a memory interval, the accuracy of choice following the short IS (2 sec) declined more slowly than the accuracy of choice following the long (10 sec) IS. This does not reflect a simple constant error, as choices were accurate following very short MIs. It reflects a tendency to pick the test stimulus appropriate to the short IS as the MI increased. These results were replicated across several conditions, including one procedure that required a discrimination between short, medium, and long IS values.

One of Spetch’s experiments is parallel to Church’s work on the PSE in the memory of different stimulus durations. She trained pigeons with houselight durations of 2 and 10 sec as ISs. She then introduced MIs of 0, 5, and 20 sec in two sets of test sessions. In the first set, IS durations of 4, 6, and 8 sec were included as test values. In the second set, the test values were 6, 14, and 18 sec, of which the two longer stimuli exceed the longer training value. The psychophysical functions obtained with MIs of 0 and 5 sec were similar, and the PSE was displaced only slightly toward the longer values with the 5-sec MI. Testing with an MI of 20 sec yielded a rather flat function with a PSE of about 12 sec for one bird and about 17 for the other (see Fig. 1). Since the functions are quite flat, they are probably not very reliable. Nonetheless, the data are congruent with Spetch’s other findings.
Taken literally, they indicate that a test stimulus had to be considerably longer than the long training value (10 sec) to be remembered as “long” following a MI of 20 sec. The representation of the IS seems to change with time. To what can we attribute the contrasting findings of Church and Spetch? Church used a discrimination between the location of two levers, which may have encouraged anticipatory locating, whereas Spetch used two different colors as TSs. Church trained with the entire set of test stimuli, while Spetch used only two values. The longest MI used by Church was 8 sec, which may have been too short for the detection of a systematic error. Nonetheless, Spetch quite clearly demonstrated in several experiments (not all of which are described here) that pigeons make errors that reflect a systematic change in the memory of the IS, suggesting the operation of retrospective processing in the DCM procedure for these subjects.
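To make the notion of a shifting PSE concrete, the following sketch estimates the PSE as the sample duration at which the proportion of “long” choices crosses .5, using linear interpolation between tested durations. This is one simple way to locate a PSE and is not necessarily the procedure used by Church or Spetch. The function name `estimate_pse` and all of the proportions below are hypothetical values invented for illustration; they are not data from either study.

```python
def estimate_pse(durations, p_long):
    """Estimate the point of subjective equality by linear interpolation.

    durations: tested sample durations in seconds, in ascending order.
    p_long:    proportion of "long" choices at each duration.
    Returns the duration at which p("long") first rises through .5,
    or None if the function never crosses .5 in the tested range.
    """
    points = list(zip(durations, p_long))
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if y0 <= 0.5 <= y1 and y1 > y0:
            return x0 + (0.5 - y0) * (x1 - x0) / (y1 - y0)
    return None


# Hypothetical psychophysical functions (invented for illustration only).
short_mi = estimate_pse([2, 4, 6, 8, 10], [0.1, 0.3, 0.5, 0.8, 0.9])
long_mi = estimate_pse([2, 6, 10, 14, 18], [0.1, 0.2, 0.3, 0.6, 0.7])

print(short_mi, long_mi)
```

In this invented example the PSE lies at a longer duration after the long MI than after the short one, which is the direction of shift expected if the remembered duration shrinks during the MI. A flat function makes such an estimate unstable, since small changes in the proportions move the crossing point a long way.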

Fig. 1. Psychophysical functions obtained from two pigeons, showing the probability of judging an IS as long, plotted against sample duration (sec), following MIs (retention intervals) of 0, 5, and 20 sec. The point of subjective equality (PSE) shifted markedly toward longer test stimuli as the MI increased, indicating that the remembered duration decreased during the MI. (A) Bird 1, (B) Bird 2. Separate curves show the 0-, 5-, and 20-sec retention intervals. (From Spetch, 1981.)

D. VARIATIONS IN INITIAL STIMULI

In the light of the data just presented, one would surmise that deliberate changes in the initial stimuli would lead to reduced performance, since the representation of these stimuli would be altered in working memory. This is not necessarily the case. Honig (1978) obtained the opposite result in a rather unusual form of delayed discrimination. The initial and terminal stimuli were vertical and horizontal lines. Vertical was associated with a variable interval (VI) reinforcement schedule (VI 30 sec). Horizontal was associated with extinction. A trial that began with a vertical IS (S+) ended with horizontal (S-), and vice versa. During the MI a neutral interim stimulus was presented, which was associated with a schedule of intermediate value (VI 120 sec). At the end of the MI, a side key was illuminated as the TS. A response to this key procured the terminal stimulus. It was in the pigeon's interest to peck at this side key on any trial that began with horizontal (S-), since this would procure S+. On the other hand, the pigeon should not respond to the side key on a trial beginning with S+ and ending with S-.


The pigeons mastered this discrimination, and maintained a clear separation of response latencies to the side key in the face of considerable MIs, ranging up to 50 sec (cf. Honig, 1978). In the present context, one variant is of particular interest. Some pigeons were trained “off baseline” to discriminate between a blue S+ (VI 30 sec) and a green S- (extinction). These stimuli were then substituted for the line orientation as ISs in a transfer test. There was very little disruption in performance. The pigeons treated the blue S+ much as they had treated the vertical S+, and the green S- much as they had treated horizontal. Discrimination ratios reflecting memory performance at an MI of 20 sec were almost unchanged. Clearly, the pigeons did not depend upon a representation of the physical features of the original ISs. They may have been remembering differential rates of responding to the stimuli, the occurrence of reinforcement, or different associative values of the ISs. Indeed, when the first and second of these confounded aspects of the procedure were controlled for, there was some reduction in the discrimination, but it was never eliminated.

It is certainly possible that the pigeons were maintaining a representation of some features of the ISs during the MI, and these features were sufficient to mediate transfer between the original training value and the transfer stimuli. An alternative hypothesis is that the pigeons learned to make a response decision during the IS to respond or not respond to the test stimulus, and maintained this decision during the MI. The same decision could have been generated by stimuli sharing the same attributes: positive (or negative) associative value, occurrence of responding, and so forth. This prospective account provides a credible explanation of the transfer data, and fits in well with other aspects of the research (Honig, 1978).
Grant (1982) supported such an interpretation in an ingenious study of true and conditional matching in the pigeon with two sets of three orthogonal ISs. If the IS was either a green light on the key, or the presentation of food, or a single peck at a white key, then the pigeon was rewarded for choosing a green comparison stimulus. If the IS was either a red light, or the presentation of the magazine light without food, or 20 pecks at a white key, then the red comparison stimulus was correct. Each of these contingencies between the IS and the correct choice was trained as an independent problem. After acquisition, test trials with MIs ranging from 0 to 12 sec were introduced. Test trials with a single IS resembled training trials. Test trials with multiple ISs were of two kinds: Either the same IS was presented two or three times, or two or three different (but congruent) ISs were presented in the same trial. The pigeons performed


better with repeated ISs than they did with a single IS. Furthermore, it made no difference whether the same IS was repeated within a trial, or whether two or three different ISs were presented. These findings provide support for a hypothesis of instructional encoding. If the different congruent ISs all generate the same instruction (“Choose green”), then whether the ISs are physically the same or different ought not to matter. On the other hand, if each IS generates a specific trace, one would expect no facilitation of memory, and quite possibly a reduction in performance. The different traces would presumably interfere with each other. This study therefore supports an instructional model of working memory.

III. Differentiation of Trial Outcomes

In differential outcome (DO) experiments, the initial stimuli are differentially associated with two possible trial outcomes following a correct response. The ISs therefore permit the anticipation of different events at the end of the trial, quite possibly through the establishment of a representation of the events. Such anticipations facilitate working memory. Correct responding is more often obtained after an MI in a DO condition than in control conditions in which only one outcome is presented or different outcomes are randomly associated with the ISs. A possible mechanism for the effect is that the outcome expectancy mediates the correct response at the end of trial. It may be easier for the subject to maintain the outcome expectancy as a cue than to maintain either a retrospective memory of the IS or a prospective anticipation of the correct response.

There is direct evidence for the anticipation of trial outcomes. Spetch, Wilkie, and Skelton (1980) ran pigeons with food and water reward alternating on successive trials. A single reinforcement was obtained at the end of each trial on a modified fixed-interval (FI) 25-sec schedule. On some trials, the outcome was differentially signaled by a stimulus on the response key. On others, there was no such signal; the only cue was the outcome of the prior trial, which preceded the beginning of the next trial by an intertrial interval (ITI) of 5 sec. In a control condition, food and water trials were presented with no external cues and in a random sequence, so that the outcome was not predictable. Spetch et al. observed clear evidence that pigeons could anticipate the trial outcome. Latencies to the first peck differed markedly on food and water trials, as did peck duration. Some sessions were videotaped, and observers were asked to distinguish between the response topographies, which were related to food and water reward in the manner described by


Jenkins and Moore (1973). On all of their measures, the pattern of responding differed to the same degree, regardless of whether the outcome was signaled by a cue present during the trial or by the outcome of the prior trial. The differences disappeared when the food and water trials were presented in a random order with no differential cues on the key. Clearly, pigeons can anticipate a trial outcome on the basis of a prior cue. This is a form of prospective memory processing. The question remains whether this process will enhance working memory in discrimination procedures, where the IS provides the differential cue in advance of the MI. Peterson and his associates carried out a noteworthy series of experiments with differential outcomes in DCM procedures. In the experimental (DO) conditions, one initial stimulus signaled one trial outcome, and the other signaled a different one. In the control conditions, there was either a single outcome following both initial stimuli, or the two outcomes were equally and randomly associated with the initial stimuli. Generally, acquisition was quicker and retention was better in the DO conditions. Peterson, Wheeler, and Trapold (1980) used food and water as rewards, while Peterson, Wheeler, and Armstrong (1978) used food as the outcome for correct responses following one IS, and nonreinforcement as the outcome with the other. (Apparently the reward in the latter case was the advancement of the program to the next trial; errors resulted in a repetition of the trial.) The DO effect was shown to be reversible; performance following an MI improved following a switch from no-DO to DO conditions, and deteriorated after a change in the opposite direction. In DO studies with DCM procedures, the frequency of reward is confounded with the DO treatments. In the experimental condition, acquisition is generally faster and memory is better; thus, the subject can come to expect reward with greater certainty. 
In the control condition(s), the frequency of reward is less. This confound can be avoided with the successive delayed conditional discrimination (DCD), where the probability of reinforcement is set by the experimental program. The procedure is described in detail in the next section. Generally half of the trials end with a reinforcement. Michael Flynn (1981), working with W. K. Honig, replicated the DO effect using this paradigm with four pigeons. The ISs were red and green. The TSs, singly presented in this procedure, were different line orientations. For two pigeons, grain and water as outcomes were differentially associated with red and green; for two others the association was random. Initial training was carried out with no MI. All subjects reached a discrimination ratio of .80. A “short” MI of 1 sec was then introduced on half the trials of each session, and a “long”


MI on the remaining trials. The latter was increased in 5-sec increments up to 15 sec. Discrimination ratios for the last four sessions under each condition of delay are shown in Fig. 2. Both groups of subjects maintained good performance on the 1-sec delays. There was little forgetting with the longer delays for the DO subjects. But the nondifferential subjects showed a marked effect of the MI. These data provide a good systematic replication of the effect established by Peterson and his associates.

DeLong and Wasserman (1981) carried out an interesting set of experiments with DOs using the DCD procedure. Different probabilities of reward distinguished the trial outcomes. The DO group was rewarded on positive trials with a probability of 1 following one IS, and with a probability of .2 following the other. For the nondifferential group, positive trials ended with reinforcement with a probability of .6 following either IS. Negative trials invariably ended with time-out. Retention intervals of 0, 5, and 10 sec were presented concurrently from the beginning of training. In a replication of the experiment, responses to the ISs were occasionally reinforced directly. This was done in order to ensure that response rates to the ISs were equated; the anticipation of differential probabilities of reward might affect these rates and provide a cue for remembering that would be confounded with the main experimental conditions. The DO groups learned the discrimination more quickly, and the difference between the DO and the no-DO groups increased with the length of the MI. In discriminating trial outcomes, the pigeons had an easier time in the DO condition, even when the probability of one positive outcome was .2, than when the probability of both positive outcomes was .6.

Peterson, Trapold, and their associates have suggested that the DO paradigm establishes differential expectancies at the time of the IS. We assume that these expectancies are well maintained as a prospective memory process during the MI, better at least than the retrospective recall of the IS. The expectancies can then provide differential cues at the end of the trial as a basis for the response decision. Peterson and Trapold (1980) provide a detailed account of a similar theory and make several predictions. One is that a reversal of the anticipated differential outcome should be disruptive at first, but performance should recover to its original level after a number of sessions. They provide support for this prediction with their “same-original” group. Their pigeons were trained on a DCM procedure, with food and a tone as the DOs. The MI was only 2 sec. A reversal of the outcomes led to the expected decline and recovery of performance.

These predictions were further confirmed with the other procedures that we have just described. In a second phase of his experiment, Flynn trained all of his pigeons with food and water as DOs. The long MI was 10 sec. When each subject met criterion, the outcomes were reversed; that is, food and water were interchanged. For a few sessions, the birds retained their discrimination, but performance then dropped to a chance level for three of the four subjects, even with the 1-sec MI. Performance improved gradually in the course of extended further training. DeLong and Wasserman (1981) also reversed the probabilities of the anticipated rewards in their DO groups.
Performance declined following both ISs, even when the probability of reward increased from .2 to 1 on positive trials. Again, the discrimination recovered gradually as observed by Trapold and Peterson and by Flynn. It is rather remarkable that changing the expectancy of reward at a given probability should disrupt performance even if the likelihood of reward is increased by a factor of 5.

All these findings strongly suggest that the expectancies generated by DOs mediate performance in DCM and DCD procedures. The expectancy is maintained prospectively until the time of the test stimulus, when it serves as an effective cue for a correct response: the choice between two test stimuli, or the appropriate response rate to a single test stimulus. When the expectancy is changed, performance is disrupted until a new expectancy is established. Note that outcome expectancies are by no means required (in the formal sense) for good performance. The IS provides sufficient information (and establishes the expectancy on each trial). The outcome expectancy appears to support, or even replace, the memory process established by the IS alone. If the latter is retrospective, then we can conclude that the outcome expectancy, as a prospective process, is maintained more successfully. If the memory process itself is prospective, then it presumably also contains more information than does the expectancy. It would have stimulus content, such as “Choose the vertical line,” or “Respond to red, don’t respond to green.” The expectancy provides a simpler cue in the presence of which the correct differential response can be instantiated at the end of the trial.
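One feature of the DeLong and Wasserman design discussed above can be checked with simple arithmetic: the reward probabilities of 1 and .2 in the DO group average to the .6 used for the nondifferential group. The sketch below works this out. It assumes that positive trials follow each IS equally often, which the description above does not state explicitly; the function name `mean_reward_probability` and the parameter `share_is1` are hypothetical.

```python
def mean_reward_probability(p_after_is1, p_after_is2, share_is1=0.5):
    """Average probability of reward on positive trials.

    share_is1 is the assumed proportion of positive trials that follow
    the first IS (equal shares are an assumption, not a reported fact).
    """
    return share_is1 * p_after_is1 + (1 - share_is1) * p_after_is2


# Probabilities taken from the description of the two groups.
do_group = mean_reward_probability(1.0, 0.2)
nondifferential_group = mean_reward_probability(0.6, 0.6)

print(do_group, nondifferential_group)
```

Under that assumption the two groups receive the same overall rate of reward on positive trials, so the advantage of the DO group would reflect the differential association of outcomes with the ISs rather than a richer schedule.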

IV. Comparisons among Working Memory Paradigms

In this section, we review data from studies comparing performance in different working memory paradigms. Systematic differences may well reflect different memory processes. The research involves paradigms in which a single IS provides information that must be remembered to make a correct response at the time of the TS. In the simple delayed discrimination (DD), the IS provides all of the required information, and the subject can readily arrive at a response decision at the time of the IS. In other, conditional procedures, such as the DCD, the information provided by the IS has to be combined with information from the TS for a correct response. The comparisons provide no decisive answers about the memory process in the paradigms considered. However, they facilitate some educated guesses. For example, assume that all working memory processes are retrospective. Then the memory functions should be similar, since the subject has to remember the same single IS up to the time of making a response decision. The final discrimination will be more complex in conditional than in simple problems, and this may be reflected in a higher error rate in the former. However, this difference should be independent of the MI. It would not be a difference in the rate of forgetting, which has to be reflected in the slope of the memory function. Therefore, if slope differences are observed, the memory processes cannot both be retrospective. One may be prospective and the other retrospective. On the other hand, both may be prospective, but differ in complexity. For example, the DD would involve a simple instruction: “Respond” (or “Don’t respond”) to the TS. The DCD would involve a more complex instruction: “Respond to horizontal, not to vertical” after one IS, and “Respond to vertical, not to horizontal” after the other. The more complex instruction may be forgotten more rapidly.
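The distinction between a difference in overall level and a difference in the rate of forgetting can be illustrated with a short computational sketch. The discrimination scores below are invented for illustration and are not data from any study discussed here; the function name `fit_line` and the use of a straight-line fit are likewise our assumptions, chosen only to make the slope and intercept of a memory function explicit.

```python
def fit_line(xs, ys):
    """Ordinary least-squares straight line; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical memory intervals (sec) and hypothetical accuracy scores.
mi_sec = [0, 5, 10, 20]
dd = [0.95, 0.93, 0.91, 0.87]            # simple problem
dcd_parallel = [0.85, 0.83, 0.81, 0.77]  # harder problem, same forgetting rate
dcd_steeper = [0.90, 0.82, 0.74, 0.58]   # harder problem, faster forgetting

slope_dd, intercept_dd = fit_line(mi_sec, dd)
slope_parallel, intercept_parallel = fit_line(mi_sec, dcd_parallel)
slope_steeper, intercept_steeper = fit_line(mi_sec, dcd_steeper)

# Parallel functions differ only in intercept: a difference in difficulty
# that is independent of the MI.
print(slope_dd, slope_parallel)
# A steeper function is a difference in the rate of forgetting.
print(slope_dd, slope_steeper)
```

On the argument in the text, only the second pattern (unequal slopes) would count against the assumption that both memory processes are retrospective.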

A. DELAYED DISCRIMINATION AND DELAYED MATCHING

Ideally, comparisons among paradigms would involve training and testing conditions which differ only in the defining features of the different paradigms. Only recently have there been efforts to achieve this kind of design, so the literature is sparse. Since the traditional DMS procedure involves a choice between stimuli located at two different positions, a corresponding DD procedure also requires a discrimination of location. Smith (1967) had pigeons choose a key on the left or the right wall of a chamber after pecking at an IS on a center panel, which unconditionally indicated the location of the correct response. In the corresponding DMS procedure, the IS and the test stimuli were also presented at the same locations. He used a limited range of MIs from 1 to 7 sec. Retention functions were better with the DD, but the scope of comparison was limited. While the subject had to choose the right or the left stimulus in DMS, this simply reflected the fact that simultaneous test stimuli have to appear at two different locations. Functionally, the pigeons may well have been choosing a stimulus, and not “going right” or “going left.” The nature of the response decision could have been quite different in the two problems. Whalen (1979) compared performance on a DD and on DCM with a somewhat different procedure. The IS was red or green on both kinds of trials. In the DD, each trial ended with the illumination of a single blue key. If it began with red, the first peck at the blue key was reinforced after a fixed interval of 10 sec (FI 10 sec). If it began with green, the pigeon was reinforced as soon as 10 sec with the blue key had elapsed with no response (differential reinforcement of other behavior, DRO, 10 sec). Clearly, these requirements should produce differential rates of responding, and they did. Whalen’s DMS procedure was standard: The pigeon had to match the red or green sample by pecking at the side key of the same color after the MI.
Whalen compared several variants of these procedures, and consistently obtained better performance after MIs with the DD procedure. In general (but not always), the difference between DD and DCM conditions increased with the duration of the MI. A direct comparison is difficult, however, because the response measures are of necessity quite different: The DD involves a comparison of response rates in FI and DRO trials, while DMS is indicated by the proportion of correct responses in each session. Whalen resolved this problem in part by converting rates into “choices” in a manner which need not be described here. But he also strengthened his case by showing in one experiment that the two procedures were differentially sensitive to the application of a particular independent variable. He compared the abrupt and the gradual introduction of MIs in the two procedures.


Whalen used four groups of subjects, all trained concurrently on both discrimination problems with no MI. The same sample ISs (red and green) were used in both problems. He then gave one group “memory training” by introducing and gradually extending the MI on both problems concurrently. A second group received such memory training only with the DD, while DMS trials were still presented with no MI. A third group received the opposite procedure: memory training with DMS, but not with DD. A fourth group received no memory training; its members were simply given further training on both discriminations. All groups were then tested on both problems with an MI of 15 sec. The results were quite clear: Performance on the DD ranged from 70% to 75% correct for all groups. DMS performance was at 65% for the group that received memory training only on the DMS task, and at 70% for the group that received it on both tasks. The other two groups performed only at 55% correct. This last figure differed significantly from all the others, which did not differ among themselves. Apparently, Whalen’s procedure revealed an important difference between the DD and the DMS processes. Specific memory training is not required to bridge the MI in the former, but it is necessary in the latter. This supports the suggestion that the processes are different; perhaps the pigeon needs to be trained in retrospective, but not in prospective, remembering.

B. DELAYED DISCRIMINATION AND DELAYED CONDITIONAL DISCRIMINATION

We have seen that it is not easy to make the DD procedurally comparable to DMS. The problem can readily be overcome by using a single test stimulus in both problems. This results in a comparison of the DD with a delayed conditional discrimination (DCD). The subjects learn to respond to the TS if it precedes reward, and to withhold pecking if the outcome is a time-out. Nelson and Wasserman (1978) described a DCD procedure of this kind, which was based on suggestions by Konorski (1959). For example, responding to a green TS is rewarded only if it follows a green (and not a red) IS; with a red TS the opposite contingency holds. The measure of discrimination (and memory) is the discrimination ratio (DR) between the responses emitted to the test stimulus when it follows the sample that makes it positive, and the total responses obtained on all trials which end with that stimulus. Errors are usually due to a failure to withhold responding when the TS is negative. To the degree that the pigeon can withhold responding when the TS is negative, the ratio approaches 1. Absence of the discrimination (and forgetting of the sample) results in a ratio of .50.
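The discrimination ratio just defined is simple to compute, and a brief sketch may make its limiting values concrete. The function name `discrimination_ratio` and the response counts below are hypothetical and serve only as an illustration of the definition given above.

```python
def discrimination_ratio(responses_positive, responses_negative):
    """DR = responses on positive trials / responses on all trials.

    Both arguments are response counts to the same test stimulus:
    `responses_positive` when the TS followed the sample that makes it
    positive, `responses_negative` when it followed the other sample.
    """
    total = responses_positive + responses_negative
    if total == 0:
        raise ValueError("no responses recorded; the ratio is undefined")
    return responses_positive / total


# Hypothetical counts, for illustration only.
perfect = discrimination_ratio(40, 0)    # responding fully withheld on negative trials
partial = discrimination_ratio(30, 10)   # some failure to withhold
absent = discrimination_ratio(20, 20)    # equal responding on both trial types
print(perfect, partial, absent)
```

Because errors usually take the form of responding when the TS is negative, forgetting shows up mainly as growth in the second count, which pulls the ratio down toward .50.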


In this procedure, the subject must combine information from the initial and test stimuli to respond correctly to the latter. The general method can be turned nicely into a DD by making the trial outcome depend only on the initial stimulus. While the trials still end with either of two stimuli, these are irrelevant. Then the response decision can depend on the initial stimulus alone. All aspects of the method are identical except for the conditionality of the relationship between the IS and the TS with respect to trial outcome. Honig and Wasserman (1981) compared memory functions obtained from these two procedures in two independent experiments. In one, independent groups were trained with a DCD and a DD. In the other, the same subjects learned both discriminations. For example, in the latter study, red and blue would signal a DCD. A vertical line would precede reward if it followed red (but not blue), while a horizontal line would precede reward if it followed blue (but not red). Yellow and violet were the ISs for the DD; all trials starting with yellow ended in reinforcement whether the test stimulus was vertical or horizontal, while all trials starting with violet ended in time-out. The important findings were that the DD was acquired more quickly than the DCD, and the asymptotic retention functions were steeper for the DCD. Data are shown in Fig. 3. Figure 3A provides data from the between-subjects study with three different distributions of delay interval values. Figure 3B shows retention functions from the within-subjects procedure from two different distributions. Note that in the between-subjects procedure, there was little forgetting with intervals up to 20 sec with the DD. There was also a considerable effect of the distribution of MIs; when long MIs were introduced, performance declined generally. The reasons for this are not clear; for a detailed discussion, see Honig and Wasserman (1981).
It could be argued that the memory in the DD procedures was better because the simple discrimination is easier than the conditional one; the response decision is based on a single stimulus rather than the combination of two. Performance was better even with a minimal memory requirement (0 or 1 sec), and this may simply reflect the difference in complexity of the problems. Honig and Dodd (1980) devised a procedure in which pigeons learned two versions of the same conditional discrimination. In the combined-cues discrimination (CCD), the conditional cues were presented together as the ISs, so that the subject could make a response decision at that time and remember it. Horizontal or vertical lines were superimposed on a green or a red background as the ISs; the TS was a plain blue field. Only two combinations of line and color preceded reinforcement. In the distributed-cue discrimination (or DCD), the two cues were separated in the usual way; green or red appeared as the IS, while a horizontal or vertical line was combined with the blue field as the TS. All other contingencies were the same as in the CCD. Four pigeons learned this complex procedure with no MI. Three MIs were then introduced during each session, and the delays were gradually extended in the course of many sessions. While the problem was difficult, the outcome was quite clear, as shown in Fig. 4. With short MIs of 1 or 3 sec, performance on both versions of the conditional discrimination was the same. With longer delays, the CCD provided better discrimination ratios. The CCD provides the subject with the opportunity to anticipate the trial outcome and to make a response decision early in the trial. An anticipatory process with the DCD procedure would be much more complex. The subject may well remember the initial color as a stimulus and make the response decision at the time of the TS. If this is the case, then the results suggest that prospective memory is better retained within the trial under truly comparable training procedures.

Fig. 3. Discrimination ratios obtained from independent experiments involving DD and DCD procedures. Between-subject comparisons involve three different distributions of MIs. Within-subject data involve two different distributions of MIs. Memory was clearly superior in the DD procedure for all but the shortest MIs. (A) Between subjects, (B) within subjects. Separate symbols distinguish the DD and DCD functions. Abscissa: memory interval. (From Honig & Wasserman, 1981.)

Fig. 4. Discrimination ratios obtained from four pigeons on CCD trials, when information necessary for a conditional discrimination was combined in the ISs prior to the MI, and on DCD trials, when the same information was separated by the MI. The data are taken from the last block of training sessions with three different distributions of MIs: short, intermediate, and long. At longer MIs, performance was better in the CCD condition. Solid lines, CCD; dashed lines, DCD; separate symbols mark the short, intermediate, and long distributions. Abscissa: retention interval (sec). (From Honig & Dodd, 1982.)

C. MEDIATING BEHAVIOR IN THE DOLPHIN

Two studies with a dolphin (Herman & Thompson, 1982; Thompson & Herman, 1981) bear upon the question of the decision process and the anticipatory mediation of correct responding. The same auditory ISs were used in both studies, but they differed in their response requirements. In the Herman and Thompson (1982) study, the dolphin’s working memory was tested with auditory identity (DMS), symbolic (DCM), and probe (DCD) tasks. The same dolphin’s performance on two DD tasks was studied by Thompson and Herman (1981). A comparison of differential delay effects across the two studies implies the use of prospective memory processes by the dolphin. In the Herman and Thompson (1982) study, the same two sounds (an amplitude-modulated 4-kHz tone, labeled sound A, and a 25-kHz pure tone, labeled sound B) were used as ISs in DMS, DCM, and DCD tasks which differed only in the number and quality of post-MI TSs. The dolphin was run in a tank. Each trial in the three tasks began with a “call-tone,” which summoned the subject to a listening area located between two underwater side speakers and facing a center speaker. An underwater response paddle was placed next to each side speaker. One of the two ISs was then presented for 4 sec from the center speaker. In the DMS task two TSs identical to the two ISs were presented sequentially for 2.5 sec each from the two side speakers at the end of the MI. In the DCM task two TSs were presented also. However, these differed in quality from the two ISs with which they were respectively associated. In both the DMS and DCM tasks the sequential and spatial presentation of the two TSs was randomly determined. In the DCD task a single TS was presented. This was either identical to or different from the IS on that trial. After the TSs ended, an “exit” tone cued the dolphin to leave the listening area and press a response paddle. In the DMS and DCM tasks, pressing the paddle next to the side speaker that had projected the correct TS turned off the exit sound and produced a 2-sec playback of the matching IS from the side speaker, followed by a brief .5-sec conditioned reinforcer tone and a fish. Pressing the incorrect paddle also turned off the exit tone and produced playback of the nonmatching TS. If in the DCD task the TS matched the IS, pressing the paddle next to the side speaker that projected the TS was correct. If the TS and the IS did not match, pressing the paddle next to the “silent” speaker, which projected no TS sound, was correct. The results of the Herman and Thompson (1982) study revealed a near equivalence of performance across the DMS, DCM, and DCD tasks within the limits of the delays tested. In each task, performance levels were equal to or better than 90% correct responses up to delays of 50 sec and never fell below 80% correct responses up to delays of 90 sec and in some cases up to delays of 3 and 4 min. These delay limits were determined primarily by the dolphin’s reluctance to continue working at long delays following an error. Comparable similarities in final performance levels by monkeys in DMS, DCM, and DCD delayed matching tasks with visual initial and test stimuli were reported by D’Amato and Worsham (1974). Because the same ISs were used in each task, the lack of any significant performance differences could be interpreted as supporting the view that the dolphin and monkeys retrospectively remembered identical perceptual memory traces in all three delayed matching procedures.
Accordingly, we might reasonably expect to see comparable memory functions in any task in which the same initial stimuli are used, regardless of the nature of the TSs or response requirements. However, when the IS sounds employed by Herman and Thompson (1982) were used in a study of auditory delayed discriminations (Thompson & Herman, 1981), the same dolphin’s performance levels were not equivalent to those obtained with the three delayed matching procedures. In the delayed discrimination study (Thompson & Herman, 1981) each trial began with the call-tone, which summoned the same dolphin to the listening area. One of two auditory ISs was then projected from the center speaker facing the listening area and located between two pairs of paddles. One IS, the amplitude-modulated sound A, indicated that the dolphin would obtain a fish reward after pressing the inner paddle of one of the pairs. The other IS, the 25-kHz sound B, indicated that the outer paddle would be correct. After the MI, an exit sound served as the TS and indicated that the dolphin should leave the listening area and respond to a paddle. In the complex form of the problem, both pairs of paddles were in the water, and the exit sound emanated from a side speaker behind the pair of paddles operative on that trial. Thus, the subject could not tell until the time of the TS which paddle would provide reinforcement; she had to combine information from the IS regarding the location of the paddle within the pair, and from the TS regarding the correct pair of paddles. In the simple form of the problem, only one pair of paddles was in the tank (this varied between sessions) and the TS indicated only that the MI had ended. The complex (four-paddle) form of the problem was run first. In contrast to the DMS, DCM, and DCD tasks (Herman & Thompson, 1982), the dolphin maintained a performance level of 90% or better correct responses through to only a 16-sec MI. Her performance level then progressively decreased to about 80% correct responses as the MI increased to 30 sec (see Fig. 5). At longer delays the dolphin either self-terminated the sessions or failed to meet a warm-up criterion. In the simpler (two-paddle) discrimination, performance was generally better, and delays were extended to 66 sec before the dolphin became uncooperative.

Fig. 5. Accuracy of discrimination in the dolphin following MIs on three different problems: a simple discrimination of paddle location with one pair of submerged paddles, a complex discrimination of the same kind with two pairs of paddles, and a delayed conditional matching problem. The same complex sounds were used as ISs for all discriminations. Separate symbols distinguish DCM, simple DD, and complex DD. Abscissa: MI delay (sec). (From Herman & Thompson, 1982; Thompson & Herman, 1981.)

Mediating behaviors were recorded during the MIs of the simple discrimination. (No reliable, differential response patterns were discerned in the complex form.) During the longer delays, the dolphin tended to move forward in the listening area on trials in which the outer paddle would be correct, but remained in the back part of the area when the inner paddle was positive. This mediating behavior was seen at MIs longer than 4 sec; in fact, the percentage of trials on which a forward movement was observed correlated directly with the MI duration. This correlation was not simply an effect of continued training, since it was observed in several replications of the study. Despite the use of mediating behaviors by the dolphin, the delay limits of about 1 min in the simple form of the delayed discrimination task were considerably shorter than the 2- to 3-min delay limits obtained from the same subject in the DMS, DCM, and DCD tasks by Herman and Thompson. Clearly, the dolphin did not depend on retrospective memory for the same ISs in the two studies (Herman & Thompson, 1982; Thompson & Herman, 1981). If she had made a response decision after the MI, then performances should have been similar, regardless of the nature or number of response alternatives. Yet the results indicated that these features, not the IS sounds, determined the performance levels of the dolphin. The performance differences, together with the observation of the mediating behavior in the simpler DD, suggest that the dolphin prospectively remembered anticipatory responses to events signaled as the locus of reward by the IS at the beginning of each trial. The equivalent performances in the DMS, DCM, and DCD tasks can be explained by assuming that in each case the dolphin prospectively represented an instruction to respond to a sound source. The decrement in performance in the delayed discrimination tasks is attributable to the dolphin’s increased difficulty in prospectively maintaining a spatial response instruction, despite the use of overt coding in the direct delayed discrimination.
This instruction involves information outside the auditory sensory modality which is dominant for the species. Also, as might be expected from an instructional-prospective framework, increasing the amount of response information to be remembered by using four response paddles instead of two led to a further decrement in the dolphin’s performance.

D. MEMORY FOR AUDITORY AND VISUAL EVENTS BY RATS

The performance of rats in DCM and DCD tasks involving auditory and visual stimuli suggests that these animals remember retrospectively, making their response decision at the end of the MI (Wallace, Steinert, Scobie, & Spear, 1980). In the DCM task the subject produced either a visual or auditory IS for 4 sec by pressing a lever on the rear wall of the test chamber. The visual IS consisted of diffuse chamber illumination, and the auditory IS was white noise projected at 82 dB SPL from a speaker in the ceiling. After the MI of 0, 2, or 4 sec, one of two white light TSs located above response levers in the chamber’s front wall was turned on. Three rats had to press the lit lever following the auditory IS and the nonlit lever following the visual IS. For the remaining two subjects the response rules were reversed. Assuming that the rats remembered the response rules prospectively, similar overall memory functions would be expected. However, as the results in Fig. 6 indicate, this was not the case. Correct performance levels following auditory and visual ISs were the same at the 0-sec delay. At the two longer delays, however, there was a considerable decrement in response accuracy following the visual IS, although performance levels following the auditory IS did not change. The suggestion that the rats’ memory functions were determined by differential forgetting of the two ISs, not by the response rules, was strengthened by the results from the DCD task. In this experiment pressing a single lever was correct if the IS and TS were the same (A-A and V-V) and incorrect if they were in different modalities (A-V and V-A). Memory functions in the DCD task were determined with a discrimination ratio calculated by dividing the probability of a response on nonreinforced trials by the probability of a response on reinforced trials. This discrimination ratio approaches zero as the probability of a response on reinforced trials increasingly exceeds the probability of a response on nonreinforced trials (Wallace et al., 1980). As shown in Fig. 6, performance levels across the four trial types did not differ following either a 0- or 2-sec MI. However, regardless of TS modality, the rate of forgetting for all trials with a visual IS after the 5-sec MI was significantly greater than for all trials with an auditory IS. In both tasks the rats’ memory functions were associated with differences in the modality of the IS. The nature of the TS or content of the response instruction did not matter. This finding makes sense if we assume that the rats retrospectively remember the IS throughout the MI and that rats’ memory for visual events is inherently poorer than for sounds. It is interesting to compare the results of Wallace et al. (1980) with those obtained for the dolphin (Herman & Thompson, 1982; Thompson & Herman, 1981). In the latter studies, response instruction, and not the IS, was the critical factor determining different delay effects, which, it was argued, reflected the use of prospective memory processes. Until further experiments are conducted, we can only speculate whether the different memory strategies used by these animals were constrained by the procedures or reflect inherent species predispositions.

Fig. 6. Discrimination accuracy following auditory and visual ISs in rats. The delayed conditional discrimination was assessed by responses on nonreinforced trials. Forgetting is therefore indicated for this problem by an increase in the discrimination ratio. (A) Discrimination of location to two levers, (B) delayed conditional discrimination. Separate symbols distinguish auditory and visual ISs. Abscissa: delay (sec). (From Wallace et al., 1980.)

E. DELAYED DISCRIMINATIONS AND OUTCOME EXPECTANCIES

The superior performance that we described for several delayed discrimination procedures may be attributed in part to differential outcome expectancies. In the Honig and Wasserman (1981) study, one IS (in the DD) invariably preceded reinforcement, while the other signaled its absence. In the corresponding DCD, each IS preceded reinforcement on half the trials. The same holds for the Honig and Dodd procedure, except that it had two “positive” and two “negative” ISs. In these cases, the subject need not have generated a response decision at the time of the IS. The decision could have been mediated by differential outcome expectancies, which, as we have seen, greatly enhance performance on various conditional discriminations that incorporate a memory requirement. This explanation does not apply either to Whalen’s findings with pigeons, or to those of Thompson and Herman with the dolphin. In their procedures, all trials could in principle end with reinforcement. It is likely that behaviors during the MI mediated the correct response pattern (FI or DRO) in the pigeons, and the location of the correct response in the dolphin. Descriptions of overt mediational behavior used by pigeons to bridge the MI in a DD task (Thompson, Van Hemel, Winston, & Pappas, unpublished) provide strong evidence of anticipatory response decisions.


In this study either a red or green IS was presented on a left-hand side response key. After the 5-sec MI a white TS was presented for 1.5 sec on a right-hand side key. The pigeons were rewarded for pecking the TS after a green IS, and for not pecking the TS following a red IS. Each bird bridged the MI with overt mediational behavior that was highly predictive of the response decision. On correct trials following a red IS one bird either crouched in front of the feeder opening between the two keys or actually placed his head into it (p > .90). After a green IS, he moved away from the left-hand IS key and, during the MI, stood in front of the right-hand TS key, almost touching it with his beak (p > .95). The remaining two subjects displayed similar anticipatory responses during the MI. After a red IS, the second bird stood to the left of the TS key (p > .85). The third bird stood erect in front of the feeder opening facing the IS key (p > .90). After a green IS, both other birds, like the first, typically stood facing the TS key (p > .90). The nature of these overt mediational responses makes it reasonable to assume that the birds in the DD task were prospectively anticipating what to do at the end of the MI, rather than retrospectively remembering which IS they had seen prior to the MI.

V. Memory for Multiple Items

Prospective remembering by animals is indicated in several of the studies we have described. These always involve a single IS, which permits the subject immediately to anticipate responding to a predictable TS with which the IS is uniquely associated. Is prospective remembering made any less likely if multiple, rather than single, initial stimuli are used? Not necessarily so. Obviously, the complexity of a response instruction will be increased, but its use is not necessarily precluded simply by increasing the number of ISs. For example, consider the delayed matching to successive samples (DMSS) devised by Devine and his colleagues (Devine & Jones, 1974; Devine, Burke, & Rohack, 1979). In the DMSS task monkeys were exposed to sequentially presented lists of 1, 2, or 3 visual ISs. After the MI, the subject’s task was to respond to the 2 or 3 TSs in the order in which they appeared as ISs. Like the DMS task in which only a single IS is ever used, the list of ISs in the DMSS tasks provides all the necessary and sufficient information for a response decision prior to the opportunity for actually responding. Hence, the animal may prospectively remember an anticipated temporal response sequence (e.g., hit red, then blue, and then green). Admittedly, such an instruction is more complex than the use of only a single IS, but results from recent experiments on sequence discrimination learning by pigeons (Straub, Seidenberg, Bever, & Terrace, 1979) suggest that the encoding of complex response sequences is well within the capacities of nonverbal subjects.

A. SERIAL PROBE RECOGNITION

Neither retrospective nor prospective memory is precluded logically from operating in the multiple-item serial probe recognition (SPR) task used at first to study human short-term memory (Wickelgren & Norman, 1966) and recently adapted for animals. In the SPR task a list of discrete items is presented sequentially on each trial. After the MI, a single probe stimulus is presented, and the subject indicates by differential responding whether the probe is one of the items in the list or whether it differs from all of them. This is, in effect, a DCD with several ISs and one TS on each trial. MacPhail (1980) showed pigeons a series of color or simple visual pattern stimuli, whereas Sands and Wright (1980a,b) showed pictures of natural objects to rhesus monkeys. Thompson and Herman (1977) presented complex sounds to a dolphin. The list and probe test items in the latter two studies were drawn from a large pool of stimuli, making the SPR task a truly cognitive procedure, since the subject had to generalize recognition of a matching TS to new stimuli. Thompson and Herman (1977) presented a list of from one to six different computer-generated sounds on each trial; the “pool” consisted of 600 items. Each sound was presented for 2 sec from a center speaker in the testing area. The ISI was .5 sec. A MI of 1 or 4 sec followed the last sound in the list. The single probe was then presented for 2 sec from one of two side speakers. To identify the probe as an item from the list, the dolphin pressed a paddle adjacent to the speaker that had projected the probe on that trial. To indicate that the probe was not in the list, the dolphin pressed a paddle next to the other speaker. Thompson and Herman obtained excellent memory for the last two sounds in the list, but performance deteriorated for the less recent items in an orderly fashion. Thus, there was a marked recency effect, as seen in Fig. 7.
With six items in the list, the dolphin was responding at a chance level when the probe matched the first or second item in the list. Accuracy also decreased with length of the list when the probe sound did not match any items in the list. Thus, the performance was affected not only by recency (since nonmatching probes have no degree of recency) but also by the confusion generated with an increased number of items. The duration of the MI (1 or 4 sec) had no effect. This finding also indicates that recency was less important than the interference generated by multiple sample items. Results from subsequent experiments described in Herman (1980) further indicated that changes in the duration of list items or the rate at which they were presented had little effect on the dolphin’s recognition performance.

Fig. 7. Probability of correct identification by the dolphin of a test sound in the serial probe recognition task. Data from various list lengths are shown. Accuracy increased with the recency of the probe stimulus on matching trials (A), and it decreased generally as a function of list length both for matching probes summed across serial positions (B) and for nonmatching probes (C). Axis label: serial position (from end of list). List lengths 1, 2, 3, 4, and 6 are plotted with separate symbols. (From Thompson & Herman, 1977. Copyright 1977 by the American Association for the Advancement of Science.)

The procedure used by Sands and Wright (1980a,b) was similar in principle. On each trial, a list of up to 10 (or, in some experiments, even 20) visual stimuli was presented. These were colored slides, back-projected onto the upper of two screens facing the monkey. The monkey viewed these for 1 sec each, with an interval of .8 sec between items. After the 1-sec MI, the probe was presented on the lower screen, and the monkey pushed a lever in one direction if the probe matched any item in the list, in the other direction if it did not. Sands and Wright observed very good performance. Accuracy did not


decline to a chance level, even with the 20-item list. The monkey demonstrated both a recency and a primacy effect. The retention functions were very similar to those obtained from a human subject run with the same procedure, except that accuracy in the human was somewhat higher.

MacPhail (1980) showed pigeons lists of two to five visual stimuli projected on a center response key. Each list item was presented for the same duration, either 1 or 2 sec depending upon the experiment, and was followed immediately by the next item. After a brief MI of .125 or 1 sec, the probe was projected on the center key. The subject pecked the left-hand white key to indicate that the probe was in the list and pecked the right-hand white key to indicate it was not a list item. These keys were not lit until at least .5 sec after the onset of the probe, which remained on until the choice response was made. A marked recency effect, but no primacy effect, was obtained. The pigeons’ overall SPR performance was not impressive. For example, with a 3-item list the mean probabilities of correctly identifying a probe presented as the first, second, and third list item, respectively, were only .45, .53, and .58. The corresponding mean probability of incorrectly classifying a probe as a list item was .38. These depressed performance levels are likely attributable to interference generated by constructing lists from a pool of only 7 items. These were white, blue, green, red, and yellow key lights, and a black “X” and a vertical black line grid, both on a white background. Sands and Wright (1980b) found that both their monkey and human subjects were significantly less accurate with lists constructed from a 6-item pool instead of the usual 211-item pool. Both Sands and Wright and Thompson and Herman initially trained their subjects to perform the SPR task using a single IS before they introduced additional stimuli to form a list.
This prior experience may have predisposed the subjects to prospective remembering. That is, the animal may have encoded the items in the list in the form of an instruction, whose content would have specified the stimuli to which a “matching” response should be made. To distinguish such a possibility from a retrospective account is not easy. The subjects may have scanned “traces” of the initial stimuli at the end of the MI in order to classify the probe item. A resolution of this question is suggested by response reaction times (RT), which Sands and Wright (1982) obtained for their SPR task with a monkey and a human subject. The data are shown in Fig. 8. The similarity of the data obtained from the two species is remarkable. Reaction times increased with the length of the list. Each new item added about 11 msec to the RT when “same” judgments were made. It is reasonable to suppose that prospective memory would be subject to immediate access


Fig. 8. Reaction time (RT) of a monkey and a human on a serial probe recognition task, plotted against list length. RT increased as a function of list length, when the probe matched a stimulus in the list (“same”), and when it did not (“different”). The inserts show that RT was also determined by the serial position of a matching probe in the list. See text for further details. (From Sands & Wright, 1982. Copyright 1982 by the American Association for the Advancement of Science.)

and would not have to be scanned. This would result in a flat function relating RT to list length. Furthermore, if the subject remembered prospectively, access to the memory should be independent of the serial position of the item. But the monkey is clearly very sensitive to the relative recency of the stimulus that matches the probe (see the inserts in Fig. 8). The more recently this stimulus appeared, the more quickly the monkey could classify the probe as positive. Finally, some interesting data are provided by test items not on the list (the “different” items). Some of these were familiar, having been used in prior trials in the same session. Other “novel” items had never been presented in the session. The subjects seem to have scanned the memory of all items presented on each trial, in order to make the decision of “different” for the probe. Furthermore, the scanning took longer for familiar items, again indicating a retrospective process. Overall, the reaction time data are persuasive evidence that the monkey is using retrospective, and not prospective, memory in the SPR task.

The functional similarities between monkeys’ and dolphins’ working memory in single IS tasks such as DMS (Herman, 1980) suggest that dolphins also bring retrospective memory processes to bear in the auditory SPR task (Thompson & Herman, 1977). Given the evidence for prospective remembering by both monkeys and dolphins in single IS tasks (e.g., Gaffan, 1977; Thompson & Herman, 1981), we might ask what about the SPR task apparently encourages these animals to abandon prospective memory in favor of retrospective representations. We suspect that the answer lies in the relative response ambiguity or uncertainty in the SPR task, resulting from the use of a multiple-item list in conjunction with a single probe test stimulus. In the single IS task and multiple item tasks of the type used by Devine and Jones (1975), anticipatory responses are mapped unambiguously on to each IS. In the SPR task, the response is unpredictably determined by any one or none of the list items. In other words, the animal in a SPR task cannot identify on the basis of the list alone which specific response of a prescribed set will be rewarded after the MI. Increasing the list length only adds to the animal’s uncertainty; it is interesting to speculate that an animal’s preferred representational strategy might shift from prospection to retrospection, and vice versa, as a function of changing list length.

B. MEMORY FOR TEMPORAL ORDER

An even greater degree of terminal response ambiguity is inherent in the multiple-item task devised by Shimp (1976) to study memory for temporal order. In this study, pigeons are trained first to peck three successively illuminated randomly selected left or right keys. For example, on a given trial a pigeon might emit a response pattern beginning with a left key peck, followed by a right key peck, and ending with a left key peck. All possible three-peck response combinations occurred overall with equal probability. After the MI one of three probe test stimuli was presented on a center key. If the TS was red, the pigeon was rewarded for pecking the side key (left or right) it had pecked first in the response pattern preceding the MI. Similarly, if the TS was blue or white, the pigeon had to peck the key responded to second and third, respectively, before the MI. Events prior to the MI in this task establish the temporal order of left and right key pecks; this information is necessary but not sufficient for a correct response decision. The TS alone also provides necessary but insufficient information for a decision. Each TS specifies whether the


first, second, or third response is correct, but does not identify its spatial nature. Response ambiguity remains unresolved until the information provided by the TS can be conjoined with the retrospective memory for the sequence of left and right responses. Shimp’s results demonstrated that pigeons could remember the temporal order of the three responses at a better than chance level for at least a 4-sec MI. A recency effect was also obtained. Retrospective memory for the third, or last, response in a pattern was always better than for the first and second responses.
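The conjunction required in Shimp’s task can be made concrete with a small sketch. The Python below encodes the probe colors as described above (red, blue, and white for the first, second, and third peck) and enumerates all eight peck patterns. The names `PROBE_TO_INDEX` and `correct_key` are hypothetical, introduced only for illustration; nothing here is taken from Shimp’s apparatus or analysis.

```python
from itertools import product

# Probe color -> position of the pre-MI peck that must be repeated
# (red = first, blue = second, white = third, as described in the text).
PROBE_TO_INDEX = {"red": 0, "blue": 1, "white": 2}


def correct_key(pecks, probe):
    """Return the side key ('L' or 'R') that is rewarded on this trial."""
    return pecks[PROBE_TO_INDEX[probe]]


patterns = list(product("LR", repeat=3))  # all eight three-peck patterns

# The probe alone does not identify the rewarded side: across the equally
# likely patterns, each probe color is paired with L and R equally often.
for probe in PROBE_TO_INDEX:
    left = sum(correct_key(p, probe) == "L" for p in patterns)
    print(probe, "-> left correct on", left, "of", len(patterns), "patterns")

# The peck pattern alone does not identify it either: for the pattern
# left-right-left, the rewarded side depends on which probe appears.
example = ("L", "R", "L")
print({probe: correct_key(example, probe) for probe in PROBE_TO_INDEX})
```

Each probe color turns out to be paired with a left-correct outcome on exactly half of the patterns, and a single pattern yields different rewarded sides under different probes, which is the sense in which each source of information is necessary but not sufficient.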

C. THE RADIAL MAZE

Retrospective representation is implied also in the performance of rats tested in radial mazes. This procedure can also be used for studying memory for multiple events (see Olton, 1978). In this task the subject is placed on a central platform from which maze arms radiate uniformly like spokes in a wheel. Each arm is baited with one piece of food. Hence, the best strategy for the subject is to choose each arm once. After only a few exposures to this situation, rats learn not to revisit an arm. For example, in an 8-arm radial maze rats are practically errorless, typically running down 7 if not 8 different arms in their first 8 choices (Olton, 1978; Olton & Samuelson, 1976). If the number of available arms is increased to 17, rats still do extremely well, choosing 15 different arms in the first 17 runs (Olton, Collison, & Werz, 1977). The rats’ performance in the radial maze is not determined by intramaze cues, overt response chains, or consistent algorithms, but is presumably controlled by retention of a self-generated “list” of previously rewarded spatial responses (Olton & Collison, 1979). The central question here is how this information may be represented during a trial. Again, there are two logical alternatives. Subjects could remember an ever-decreasing list of unvisited arms, or they could remember an ever-increasing list of chosen arms. In other words, they could remember prospectively where they anticipate going, or they could remember retrospectively where they have been. According to the first alternative, the animal’s familiarity with the maze permits it to begin each trial by “loading” short-term memory with a list of arms to be visited. Following each response choice, the list length of anticipated responses to be remembered prospectively decreases by 1 item.
If this memory mode is adopted by a subject, its choice accuracy ought to increase as the number of previous choices increases, because the more choices the subject has made, the less there is to remember. The second representational model assumes that a new “go/no go” response decision is made each time the animal faces a specific maze


arm. The decision either to run or not to run down the arm is presumably based on retrospective scanning of memory traces for visited arms. According to this model, choice accuracy ought to decrease as the number of previous choices increases, because the more choices the subject has made, the more there is to remember. The data are overwhelmingly consistent with the latter retrospective model. The probability of a correct (nonrepetitive) response declines as the number of chosen arms increases. Errors are more likely to involve repetitions of arm choices made early in the trial than of arms visited later. A repetition of the immediately preceding chosen arm is most unlikely (Olton, 1978; Olton et al., 1977; Olton, Walker, Gage, & Johnson, 1977; Roberts & Smythe, 1979).

VI. Discrimination and Memory of Stimulus Sequences

Much of the research described up to this point requires memory of specific stimuli preceding the test stimulus. We now describe research in which a particular sequence of stimuli serves as a positive discriminative stimulus for responding to a test stimulus, while other sequences of the same stimuli are negative. The interval(s) between members of the sequence, and between the end of the sequence and test stimulus, can be varied independently. The resulting memory functions can be interpreted in terms of retrospective and prospective memory processing. Different sequences permit the subject to make a response decision at different times, and this allows us to compare the memory functions using the assumption that retrospective memory of specific stimuli is less robust than the prospective memory of a response decision.

A. SEQUENCE DISCRIMINATIONS BY PIGEONS

Sequence discriminations were first studied in pigeons by Weisman and his associates (Weisman & Dodd, 1979; Weisman, Wasserman, Dodd, & Larew, 1980; see also Rilling et al., 1982). A particular sequence of overhead lights, called AB, preceded reinforcement. This was contrasted to all other sequences involving the same stimuli, such as BB, AA, and AX (where X stands for no stimulus). The acquisition of such a discrimination was amply demonstrated. However, the memory intervals were brief. The interval between the stimuli in the sequence, or ISI, and the interval between the second stimulus and the test stimulus, or MI, were both held at .5 sec. In order to study memory in more detail, Weisman and DiFranco (1981) conducted another study, in which both


intervals ranged from 1 to 8 sec. The two sequential stimuli were yellow and red overhead lights, each presented for 5 sec. The test stimulus was a white light on the response key. Responding to the test stimulus was reinforced intermittently following the AB sequence, but not after the sequences AA, BB, BA, or on “catch trials” (called XX), which consisted only of a blank interval and the test stimulus. Discrimination ratios were calculated by dividing the response rate following the AB sequence by the sum of that rate and the rate following any particular negative sequence. A ratio of 1.00 indicates perfect discrimination (or memory), while .50 represents the absence of discrimination. The ISI assumed values of 1, 2, 4, and 8 sec, while the MI was kept constant at 1 sec. Likewise, the MI was varied in the same way, while the ISI was kept at 1 sec. Order of treatments was counterbalanced across birds. Data are shown in Fig. 9. The two 1-sec delay conditions represent identical treatments, since the “unvaried” delay intervals were also 1 sec. When the ISI was varied, the negative sequences ending with A (AA and BA) were almost as well discriminated as the absence of any stimulus (XX). Memory loss over 8 sec was much greater with the BB sequence. Performance was reliably poorer for BB than for any of the other sequences, which did not differ significantly among themselves. When the MI was varied, there was little forgetting with any sequence. There was a reliable reduction in accuracy for all sequences, compared to the XX control condition, but the sequences did not differ reliably among themselves, nor was the reduction related to the MI.
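The discrimination ratio defined in this paragraph is simple to compute. The sketch below is a minimal Python rendering of that definition; the function name is an illustrative choice, and the response rates in the example are invented for illustration and are not data from Weisman and DiFranco (1981).

```python
def discrimination_ratio(rate_positive, rate_negative):
    """Rate after the positive (AB) sequence divided by the sum of that
    rate and the rate after one particular negative sequence."""
    total = rate_positive + rate_negative
    if total == 0:
        raise ValueError("no responding after either sequence; ratio undefined")
    return rate_positive / total


# Hypothetical response rates (pecks per minute), for illustration only.
rate_ab = 60.0
for label, rate_neg in (("no responding", 0.0),
                        ("partial", 20.0),
                        ("equal responding", 60.0)):
    print(label, discrimination_ratio(rate_ab, rate_neg))
```

With no responding after the negative sequence the ratio is 1.00, and with equal responding after both sequences it is .50, which is why .50 marks the absence of discrimination.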

Fig. 9. Discrimination performance on negative sequences in a sequence discrimination task. The effect of the ISI is shown with the MI held at 1 sec; significant forgetting of the first member occurred only with the BB sequence. The effect of the MI is shown with the ISI held at 1 sec; there was little forgetting of any sequence. See text for further explanation. (A) ISI, (B) MI. Abscissa: delay (sec). Separate curves are shown for the BB, AA, BA, and XX sequences. (From Weisman & DiFranco, 1981.)


These data lend themselves readily to an interpretation in terms of memory processing and response decisions. In the course of mastering the discrimination, the subject can learn that any sequence ending with A precedes extinction. He can thus make a decision not to respond to the test stimulus when the second stimulus appears in the AA and BA sequences. If the prospective memory of that decision is good, then the MI should have little effect. Clearly, this is the case. Note that the ISI should also have little effect in these cases, since it does not matter whether the subject remembers the IS in sequences ending with A. All that matters is that the subject recognizes the A stimulus as the second in the sequence, makes the correct response decision, and remembers that decision. For sequences ending in B, the situation is quite different. Responding to the test stimulus will be correct only following AB; therefore, the response decision depends on the correct memory of the IS. Note that the BB sequence was sensitive to the ISI. The pigeons presumably forgot the occurrence of the first B in the course of 8 sec and responded to the test stimulus. (We cannot determine the rate of forgetting of A as the initial stimulus, since responding after the AB sequence provided the reference value for the DR.) However, once the pigeon made a response decision not to respond after the BB sequence, there was no further forgetting in the course of the RI. When that interval was varied, all the forgetting can be attributed to the 1-sec ISI preceding the second presentation of B. Weisman and DiFranco reach roughly the same conclusion in the course of evaluating several models in a more formal fashion.

The experiments of Weisman and DiFranco were preceded by a study by Larew (1978), who studied sequence discriminations with a more complex design. Two discriminative stimuli (green and orange) preceded a test stimulus of either a horizontal or a vertical line.
If two identical colors preceded either test stimulus, the sequence was always negative, and responding to either test line was not rewarded. If green-orange preceded, say, vertical, this sequence was positive, but the same sequence was negative if it preceded horizontal. The opposite contingency was in effect for the orange-green sequence. After teaching his subjects this conditional sequence discrimination, Larew introduced MIs, keeping the ISI at .5 sec. The results can be “postdicted” nicely from our previous considerations. The subject can make a response decision after the second stimulus of the AA and BB sequences, since these are always negative. There should be little forgetting over the MI, since decisions are well remembered. However, with AB and BA sequences, the trial outcome can be determined only at the time of the test stimulus, since this identifies the prior sequence as positive or negative. Thus, the pigeon must at least


remember the second sequential stimulus until the test stimulus. If stimulus memory is not as good as memory for a decision, forgetting should occur during the MI. Exactly this result was obtained: The discrimination ratio following AA and BB sequences was not affected by the MI (which ranged up to 4 sec for three pigeons and up to 12 sec for a fourth), but there was marked forgetting of the BA and AB sequences. In further experiments, Larew also introduced an ISI into this procedure. This led to a decline in performance on all types of negative trials. The subject had to remember the first stimulus in order to identify AA and BB as negative sequences, and to make a response decision at the time of the second stimulus. The subject could not rely on the second stimulus alone in order to anticipate a negative trial outcome, as he could in the Weisman and DiFranco experiment, where any sequence ending with A was negative. AB and BA sequences also had to be remembered as complete sequences in order to facilitate a response decision at the time of the test stimulus. Thus, the memory should have been affected by the ISI on all types of trials.

B. AUDITORY SEQUENCE DISCRIMINATIONS BY THE DOLPHIN

Auditory sequence discriminations of a complexity comparable to those used by Larew (1978) and Weisman and DiFranco (1981) were studied in the bottlenosed dolphin (Thompson, 1976). Four unique complex sounds, labeled A, B, C, and D, were used to construct four two-sound sequences. Either sound A or B was the first sound in a sequence, and either sound C or sound D was the second sound in the sequence. Each sound in a sequence was presented for 1 sec from a centrally located underwater speaker; and during training, a .03-sec interval separated the two sounds in a sequence. One response paddle was positioned to the left of the speaker and another was positioned to its right. The dolphin was trained to press the left-hand paddle after hearing either the AC or BD sequence and to press the right-hand paddle after hearing the sequence AD or BC. Thus, each sound in a sequence provided necessary but insufficient information for a correct response decision. As in the Weisman and DiFranco (1981) pigeon study, a correct response could be determined only after hearing the complete sequence. In the pigeon study the response following the second B of the IS was determined by whether it was preceded by A or B. Similarly, the correct spatial response by the dolphin following sounds C and D was conditional upon the preceding sound A or B. Hence, if a delay is interposed between the two sounds in a sequence, the only way a correct spatial response decision can be made is for the dolphin to retrospectively remember the first sound until the


second sound is presented. Prospective memory is precluded because the first sound, A or B, does not serve as a cue for either C or D, or for the correct spatial response. Learning to respond to the left-hand paddle after hearing sequences AC or BD, and to the right-hand paddle after hearing sequence AD or BC proved to be a very difficult task for the dolphin. Interestingly, the dolphin’s performance during acquisition training was characterized by her attempting to associate only a single sound within a sequence with a spatial response. For example, sounds C and D on AC and AD sequence trials were associated with left-hand and right-hand responses, respectively, and sound B on BC sequence trials was associated with right-hand responses. The net result of the one-on-one S-R (stimulus-response) mapping was a very high error rate on BD sequence trials. Nevertheless, after extensive training with a variety of procedures over a 2-month period and a total of 7500 trials, the dolphin learned to correctly associate (with few errors) the spatial responses with the appropriate auditory sequences per se.

In a series of four experiments, various procedures were used to increase the initial .03-sec interval separating the two sounds in a sequence. A new procedure was adopted when it became apparent that a delay limit had been reached under the prior method. For all methods used, there was an abrupt and marked increase in error rate whenever the delay interval between the two sounds in a sequence exceeded 2 to 3 sec. The abrupt increase in error rate was associated with the development of stereotyped responding, in which the dolphin’s response decision was determined by a single sound, rather than the sequence per se. For example, in the first experiment ISI increments were made in steps of .1 sec over 12-trial blocks. The dolphin made 95% correct responses overall, with ISI values ranging from .03 sec to 2.4 sec.
When the ISI was increased to between 2.5 sec and 2.7 sec, the dolphin continued to be correct on 95% of AD and BC sequence trials, but her performance level on AC and BD sequences was only 42% correct. In other words, she responded to the left-hand paddle following both sequences beginning with sound B, and to the right-hand paddle following both sequences beginning with sound A. In subsequent but no less abrupt breakdowns in performance, the dolphin typically based her response decision on the nature of the second sound alone, rather than the sequence. In view of the same dolphin’s demonstrated ability to prospectively remember single novel or highly familiar sounds for delays up to at least 2 min in DMS tasks (Herman & Gordon, 1974; Herman & Thompson, 1982), the 2- to 3-sec ISI limits obtained in the sequence discrimination task are somewhat anomalous. The results do not encourage the view that


the dolphin retrospectively remembered the first sound in a sequence as a conditional cue upon which a response decision could be made in conjunction with the second sound. There was no evidence of a “memory gradient.” In contrast to pigeons in sequence discrimination studies (Larew, 1978; Weisman & DiFranco, 1981), the characteristic form of the dolphin’s performance was errorless responding at successively increased delays, followed by a sudden increase in error rate at the next delay, even though the increment might be only a fraction of a sec. The sequence discrimination results are compatible with an explanation based on a stimulus-compounding or configurational discrimination. Thompson (1976) suggested that during training, the dolphin eventually encoded the four different sequences as four unique compound sounds, or configures (Razran, 1965), two of which were associated with the left-hand paddle, and two with the right-hand paddle, rather than a conjunction of two individual sounds conditionally related to each of two spatial responses. It was further argued that the 2- and 3-sec delay limits at which the sequence discrimination was maintained represented a perceptual threshold beyond which the “gestalt” of the configure was no longer perceived, so that the animal now heard two unique discrete sounds, neither of which was associated with either spatial response. Do the results from the auditory sequence discrimination study allow us to conclude that the dolphin is incapable of retrospectively remembering sounds? Perhaps not; after all, we implied earlier that the dolphin’s serial probe recognition capacities (Thompson & Herman, 1977) might be best interpreted as reflecting retrospective encoding.
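The difference between a single-sound mapping and a configurational one can be illustrated with the response contingencies given for this task (AC and BD to the left-hand paddle, AD and BC to the right-hand paddle). The Python sketch below is only an illustration: it assumes the four sequences occur equally often, and `acquisition_rule` is one possible reading of the one-to-one mapping described for acquisition training, not a model proposed by Thompson (1976).

```python
from itertools import product

# Correct paddle for each two-sound sequence, as given in the text.
CORRECT = {"AC": "left", "BD": "left", "AD": "right", "BC": "right"}
PADDLES = ("left", "right")


def best_single_sound_accuracy(position):
    """Best proportion correct for any rule that uses only one sound.

    position 0 is the first sound (A or B), position 1 the second (C or D).
    Assumes the four sequences occur equally often.
    """
    sounds = sorted({seq[position] for seq in CORRECT})
    best = 0.0
    for assignment in product(PADDLES, repeat=len(sounds)):
        rule = dict(zip(sounds, assignment))
        hits = sum(rule[seq[position]] == paddle
                   for seq, paddle in CORRECT.items())
        best = max(best, hits / len(CORRECT))
    return best


def acquisition_rule(seq):
    """Hypothetical one-to-one mapping: B -> right; else C -> left, D -> right."""
    if seq[0] == "B":
        return "right"
    return "left" if seq[1] == "C" else "right"


def configural_rule(seq):
    """Treat each whole sequence as one compound with its own response."""
    return CORRECT[seq]


print(best_single_sound_accuracy(0), best_single_sound_accuracy(1))
print([s for s in CORRECT if acquisition_rule(s) != CORRECT[s]])
print(all(configural_rule(s) == CORRECT[s] for s in CORRECT))
```

No rule based on the first sound alone, or on the second sound alone, exceeds 50% correct under these assumptions. The mixed element rule fails only on BD, in line with the high error rate reported for BD trials, whereas a lookup keyed on the whole sequence is correct on all four.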
At the very least, the results from the auditory sequence discrimination study indicate a very strong predisposition on the dolphin’s part, both in acquisition training and delay testing, to map stimulus elements onto responses in an unambiguous manner that would be conducive to prospective representations.

VII. Conclusions

In our view, the memory processes which we have proposed can provide a model for the actual functioning of working memory in the “real lives” of many animals. We assume that animals learn about temporal contingencies in their environment (cf. Honig, 1981) and make appropriate, adaptive response decisions. Often, these decisions are made on the basis of recent events: stimuli that must be discriminated from others that are irrelevant to the contingencies at hand. These recent events are remembered retrospectively. Only recent events will, and should, determine response decisions, since these events are most likely to be correlated with future events. Once made, the response decision is remembered prospectively until the appropriate circumstances arise for the execution of behavior represented in the decision. Thus, prospective memory will represent responses and will be relatively stable.

In this article, we have provided material to support this general view. We have compared memory functions from similar procedures, such as the DD and the DCD, or even within the same procedure, such as tests for the memory of stimulus sequences. When the same initial stimuli are used and the memory functions differ, they indicate different memory processes or prospective memory processing with differing degrees of stimulus specificity. We have reviewed other, more direct procedures as well. These involve cues presented in advance of alternative test conditions or trial outcomes, with the attendant use of the probe procedure; changes in the number of initial stimuli or differences among them; and differential opportunities to practice remembering. In some of these procedures, the indicators of memory processing do not directly involve the slope of the memory function. We have looked at response latencies, error rates for a given MI, and changes in psychophysical functions. Thus, it seems reasonable to maintain that our conclusions are based on converging operations. In order to account for the results from these diverse procedures by a single memory process, we would have to make many qualifications and ad hoc assumptions. Parsimony would be better served by the postulation of both kinds of processes. (We did show that, in any case, retrospective remembering alone cannot account for the data.) An obvious problem with this conclusion is that the assumption of two processes with different rates of forgetting appears to “explain everything.” But nature does not practice parsimony for the benefit of psychological theory.
We prefer to burden the area of working memory with the responsibility of specifying more clearly the conditions in which prospective and retrospective processing are likely to be active. This question could be resolved readily within our own theoretical framework if we could determine the time of the response decision in an experimental situation. We would hold that remembering up to this decision point is retrospective, with stimulus representation. Past this point it is prospective, with response representation. We have reviewed paradigms in which we believe that the response decision is made at the time of the IS (simple DDs and Honig’s 1978 procedure), at the time of the TS (DCDs and the SPR task), or at some time during the MI (memory for stimulus sequences). Thus, the concept of the response decision seems useful for the organization of much of our data.


It would be convenient if subjects raised a flag at the time of the response decision, but they do not. Indeed, the concept is supported largely by the same set of phenomena that support the distinction between the two types of memory processing; this is its weakness as an explanatory concept. The time of the response decision, which is crucial for our analysis, may well be determined by the same variables that control the nature of the memory process. The response decision may be useful as a conceptual tool, but as an explanatory device it fails because it does not transcend the data that we are trying to explain.

It may be more useful to assess the relative amount of information required for prospective and retrospective processing to ensure correct performance. This assessment does not depend upon experimental data. The process that requires less information than the other will operate. If the information load is the same, the subject will use the response-oriented, prospective process. Thus, in the DD, the subject, who needs to remember either of two initial stimuli retrospectively or one of two anticipated responses, will choose to remember the response. In the DCD, prospective encoding after a response decision will also involve stimulus content, and thus more prospective information. The memory of the IS requires the same amount of information, without the added burden of the anticipated response. In the SPR procedure, the encoding of all of the sample stimuli on each trial will require an “extra step” and necessitate a prospective memory at least as complex as the necessary retrospective recall. We have seen that, in this case, independent data (the response latencies) suggest a retrospective process. The differential outcome studies suggest that the prospective anticipation of a particular, discriminable trial outcome can serve as a good mediating cue throughout the MI.
This requires less information than the retrospective distinction between two different ISs. In this case, the response decision is not necessarily anticipated; instead, the subject relies on a simple expectancy that can also serve as a cue. In the discrimination of stimulus sequences, the forgetting function appears to “change course” at the point when the subject has enough information to anticipate a particular kind of trial outcome. The “information load” can then be reduced from having to remember a particular IS to the anticipation of a particular trial outcome. We base this analysis on the notion that animals remember anticipated outcomes and responses more readily than stimuli. After extended experience, the subject can determine the point in a trial at which the efficiency of remembering can be increased by switching from a retrospective process to the prospective process, since the latter is less vulnerable to interference or the simple passage of time. We have provided a lot of evidence supporting this notion indirectly. What is required now is a clearer analysis of memory content, or representation. The first major portion of the article addressed this question, but concluded that the data are not decisive. If we could determine more clearly what the animal is remembering, rather than how long, we would gain a more direct understanding of working memory processes.

ACKNOWLEDGMENTS

Preparation of this article was supported by research grant #A0102 from the Natural Sciences and Engineering Research Council of Canada to W. K. Honig, and by assistance to R. K. R. Thompson through a grant from the National Science Foundation, held by D. Premack at the University of Pennsylvania.

REFERENCES

Blough, D. S. Delayed matching in the pigeon. Journal of the Experimental Analysis of Behavior, 1959, 2, 151-160.
Church, R. M. Short term memory for time intervals. Learning and Motivation, 1980, 11, 208-219.
D’Amato, M. R. Delayed matching and short-term memory in monkeys. In G. H. Bower (Ed.), The psychology of learning and motivation: Advances in research and theory (Vol. 7). New York: Academic Press, 1973. Pp. 227-269.
D’Amato, M. R., & Worsham, R. W. Retrieval cues and short-term memory in Capuchin monkeys. Journal of Comparative and Physiological Psychology, 1974, 86, 274-282.
Delong, R. E., & Wasserman, E. A. Effects of differential reinforcement expectancies on successive matching-to-sample performance in pigeons. Journal of Experimental Psychology: Animal Behavior Processes, 1981, 7, 394-412.
Devine, J. V., Burke, M. W., & Rohack, J. J. Stimulus similarity and order as factors in visual short-term memory in nonhuman primates. Journal of Experimental Psychology: Animal Behavior Processes, 1979, 5, 335-354.
Devine, J. V., & Jones, L. C. Matching-to-successive samples: A multiple-unit memory task with rhesus monkeys. Behavior Research Methods and Instrumentation, 1975, 7, 438-440.
Flynn, M. S. Differential outcomes in a successive delayed conditional discrimination procedure. Honors thesis, Dalhousie University, 1981.
Gaffan, D. Response coding in recall of colors by monkeys. Quarterly Journal of Experimental Psychology, 1977, 29, 597-605.
Grant, D. S. Short-term memory in the pigeon. In N. E. Spear & R. R. Miller (Eds.), Information processing in animals: Memory mechanisms. Hillsdale, New Jersey: Erlbaum, 1981. Pp. 227-256.
Grant, D. S. Samples of stimuli, responses, and reinforcers: Effect of number, type, and mode of presentation. Learning and Motivation, 1982, 13, 265-280.
Herman, L. M. Cognitive characteristics of dolphins. In L. M. Herman (Ed.), Cetacean behavior: Mechanisms and functions. New York: Wiley (Interscience), 1980. Pp. 363-429.
Herman, L. M., & Gordon, J. A. Auditory delayed matching in the bottlenose dolphin. Journal of the Experimental Analysis of Behavior, 1974, 21, 19-26.


Herman, L. M., & Thompson, R. K. R. Symbolic, identity, and probe delayed matching of sounds by the bottlenosed dolphin. Animal Learning and Behavior, 1982, 10, 22-24.
Honig, W. K. Studies of working memory in the pigeon. In S. H. Hulse, H. Fowler, & W. K. Honig (Eds.), Cognitive processes in animal behavior. Hillsdale, New Jersey: Erlbaum, 1978. Pp. 211-248.
Honig, W. K. Working memory and the temporal map. In N. E. Spear & R. R. Miller (Eds.), Information processing in animals: Memory mechanisms. Hillsdale, New Jersey: Erlbaum, 1981. Pp. 167-197.
Honig, W. K., & Dodd, P. W. D. Delayed discriminations in the pigeon: The role of within-trial location of conditional cues. Animal Learning and Behavior, 1982, in press.
Honig, W. K., & Wasserman, E. A. Performance of pigeons on delayed simple and conditional discriminations under equivalent training procedures. Learning and Motivation, 1981, 12, 149-170.
Jenkins, H. M., & Moore, B. R. The form of the auto-shaped response with food or water reinforcers. Journal of the Experimental Analysis of Behavior, 1973, 20, 163-181.
Konorski, J. A new method of physiological investigation of recent memory in animals. Bulletin de l’Academie Polonaise des Sciences, Serie des Sciences Biologiques, 1959, 7, 115-117.
Larew, M. B. Discrimination and retention of stimulus order by pigeons. Honors thesis, University of Iowa, 1978.
MacPhail, E. M. Short-term visual recognition memory in pigeons. Quarterly Journal of Experimental Psychology, 1980, 32, 521-538.
Maki, W. S. Directed forgetting in animals. In N. E. Spear & R. R. Miller (Eds.), Information processing in animals: Memory mechanisms. Hillsdale, New Jersey: Erlbaum, 1981. Pp. 199-225.
Morgan, C. L. The animal mind. New York: Longmans, Green, 1930.
Nelson, K. R., & Wasserman, E. A. Temporal factors influencing the pigeon’s successive matching-to-sample performance: Sample duration, intertrial interval, and retention interval. Journal of the Experimental Analysis of Behavior, 1978, 30, 153-162.
Olton, D. S. Characteristics of spatial memory. In S. H. Hulse, H. Fowler, & W. K. Honig (Eds.), Cognitive processes in animal behavior. Hillsdale, New Jersey: Erlbaum, 1978. Pp. 341-373.
Olton, D. S., & Collison, C. Intramaze cues and “odor trails” fail to direct choice behavior on an elevated maze. Animal Learning and Behavior, 1979, 7, 221-223.
Olton, D. S., Collison, C., & Werz, M. A. Spatial memory and radial arm maze performance by rats. Learning and Motivation, 1977, 8, 289-314.
Olton, D. S., & Samuelson, R. J. Remembrance of places passed: Spatial memory in rats. Journal of Experimental Psychology: Animal Behavior Processes, 1976, 2, 97-116.
Olton, D. S., Walker, J. A., Gage, F. H., & Johnson, C. T. Choice behavior of rats searching for food. Learning and Motivation, 1977, 8, 315-331.
Peterson, G. B., & Trapold, M. A. Effects of outcome expectancies on pigeons’ delayed conditional discrimination performance. Learning and Motivation, 1980, 11, 267-288.
Peterson, G. B., Wheeler, R. L., & Armstrong, G. C. Expectancies and mediators in the differential-reward conditional discrimination performance of pigeons. Animal Learning and Behavior, 1978, 6, 279-285.
Peterson, G. B., Wheeler, R. L., & Trapold, M. A. Enhancement of pigeons’ conditional discrimination performance by expectancies of reinforcement and nonreinforcement. Animal Learning and Behavior, 1980, 8, 22-30.
Razran, G. H. S. Empirical codifications and specific theoretical implications of compound-stimulus conditioning: Perception. In W. F. Prokasy (Ed.), Classical conditioning. New York: Appleton, 1965. Pp. 226-248.


Rilling, M., & Howard, R. C. The analysis of memory for signals and food in a successive discrimination. In M. L. Commons & J. A. Nevin (Eds.), Quantitative analyses of behavior: Discriminative properties of reinforcement schedules. Cambridge, Massachusetts: Ballinger, 1981. Pp. 289-319.
Rilling, M., Kendrick, D., & Stonebraker, T. B. Stimulus control of forgetting: A behavioral analysis. In M. L. Commons, A. R. Wagner, & R. J. Herrnstein (Eds.), Quantitative studies in operant behavior: Acquisition. Cambridge, Massachusetts: Ballinger, 1982, in press.
Roberts, W. A., & Grant, D. S. Studies of short-term memory in the pigeon using the delayed matching-to-sample procedure. In D. L. Medin, W. A. Roberts, & R. T. Davis (Eds.), Processes of animal memory. Hillsdale, New Jersey: Erlbaum, 1976. Pp. 79-112.
Roberts, W. A., & Smythe, W. E. Memory for lists of spatial events in the rat. Learning and Motivation, 1979, 10, 313-336.
Roitblat, H. L. Codes and coding processes in pigeon short-term memory. Animal Learning and Behavior, 1980, 8, 341-351.
Roitblat, H. L. The meaning of representation in animal memory. The Behavioral and Brain Sciences, 1982, in press.
Sands, S. F., & Wright, A. A. Primate memory: Retention of serial list items by a rhesus monkey. Science, 1980, 209, 938-940. (a)
Sands, S. F., & Wright, A. A. Serial probe recognition performance by a rhesus monkey and a human with 10- and 20-item lists. Journal of Experimental Psychology: Animal Behavior Processes, 1980, 6, 386-396. (b)
Sands, S. F., & Wright, A. A. Monkey and human pictorial memory scanning. Science, 1982, 216, 1333-1334.
Shimp, C. P. Short-term memory in the pigeon: Relative recency. Journal of the Experimental Analysis of Behavior, 1976, 25, 55-61.
Smith, L. Delayed discrimination and delayed matching in pigeons. Journal of the Experimental Analysis of Behavior, 1967, 10, 529-533.
Spetch, M. L. Pigeon’s memory for event duration. Unpublished doctoral dissertation, University of British Columbia, 1981.
Spetch, M. L., Wilkie, D. M., & Skelton, R. W. Control of pigeons’ keypecking topography by a schedule of alternating food and water reward. Animal Learning and Behavior, 1981, 9, 223-229.
Stonebraker, T. B. Retrospective and prospective processes in delayed matching to sample. Unpublished doctoral dissertation, Michigan State University, 1981.
Straub, R. O., Seidenberg, M. S., Bever, T. G., & Terrace, H. S. Serial learning in the pigeon. Journal of the Experimental Analysis of Behavior, 1979, 32, 137-148.
Thompson, R. K. R. Performance of the bottlenose dolphin (Tursiops truncatus) on delayed auditory sequences and delayed auditory successive discriminations. Unpublished doctoral dissertation, University of Hawaii, 1976.
Thompson, R. K. R., & Herman, L. M. Memory for lists of sounds by the bottlenosed dolphin: Convergence of memory processes with humans? Science, 1977, 195, 501-503.
Thompson, R. K. R., & Herman, L. M. Auditory delayed discriminations by the dolphin: Nonequivalence with delayed matching performance. Animal Learning and Behavior, 1981, 9, 9-15.
Thompson, R. K. R., Van Hemel, R. E., Winston, K. M., & Pappas, N. Modality specific interference with overt mediation in a delayed discrimination task by pigeons. Unpublished.
Wallace, J., Steinert, P. A., Scobie, S. R., & Spear, N. E. Stimulus modality and short-term memory in rats. Animal Learning and Behavior, 1980, 8, 10-16.
Weisman, R. G., & DiFranco, M. P. Testing models of delayed sequence discrimination in pigeons: Delay intervals and stimulus durations. Journal of Experimental Psychology: Animal Behavior Processes, 1981, 7, 413-424.


Weisman, R. G., & Dodd, P. W. D. The study of associations: Methodology and basic phenomena. In A. Dickinson & R. A. Boakes (Eds.), Mechanisms of learning and motivation: A memorial volume for Jerzy Konorski. Hillsdale, New Jersey: Erlbaum, 1979.
Weisman, R. G., Wasserman, E. A., Dodd, P. W. D., & Larew, M. B. Representation and retention of two-event sequences in pigeons. Journal of Experimental Psychology: Animal Behavior Processes, 1980, 6, 312-325.
Whalen, T. E. Support for a dual encoding model of short-term memory in the pigeon. Unpublished doctoral dissertation, Dalhousie University, 1979.
Wickelgren, W. A., & Norman, D. A. Strength models and serial position in short-term recognition memory. Journal of Mathematical Psychology, 1966, 3, 316-347.

