THE BEHAVIORAL AND BRAIN SCIENCES
(1980) 3, 373-415
Printed in the United States of America
Against direct perception
S. Ullman
Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, Mass. 02139
Abstract: Central to contemporary cognitive science is the notion that mental processes involve computations defined over internal representations. This view stands in sharp contrast to the "direct approach" to visual perception and cognition, whose most prominent proponent has been J.J. Gibson. In the direct theory, perception does not involve computations of any sort; it is the result of the direct pickup of available information. The publication of Gibson's recent book (Gibson 1979) offers an opportunity to examine his approach, and, more generally, to contrast the theory of direct perception with the computational/representational view. In the first part of the present article (Sections 2-3) the notion of "direct perception" is examined from a theoretical standpoint, and a number of objections are raised. Section 4 is a "case study": the problem of perceiving the three-dimensional shape of moving objects is examined. This problem, which has been extensively studied within the immediate perception framework, serves to illustrate some of the inherent shortcomings of that approach. Finally, in Section 5, an attempt is made to place the theory of direct perception in perspective by embedding it in a more comprehensive framework.
Keywords: artificial intelligence; computational models; direct perception; ecological optics; Gibsonian theory; information pickup; visual representation
1. Introduction
Gibson's recent book (Gibson 1979) is his third in thirty years devoted to the development and exposition of the theory of direct perception. The interest in Gibson's influential theory has often transcended the interest in perception alone. One reason is that his approach to cognition in general stands in sharp contrast to another prevailing approach, the computational/representational one. According to the latter view (of which generative grammar, theories in cognitive psychology, and some of the work in artificial intelligence are current examples), mental processes involve computations defined over internal representations [see Pylyshyn: "Computation and Cognition" BBS 3(1) 1980]. In the direct theory of perception, mediating constructs are unnecessary, and in the early stages of his theory Gibson expressed the hope that the direct approach, if successful, would extend to other areas of psychology as well: [The theory of direct perception] "... if successful, will provide a basis for a stimulus-response psychology, which otherwise seems to be sinking in the swamp of intervening variables" (Gibson 1960, p. 701).
In the present paper the concept of direct visual perception (henceforth abbreviated as DVP) will be examined. The overall plan of the paper is as follows. First, a brief description of the concept will be given. This is only intended to state the main points of relevance to the ensuing discussion, not to summarize Gibson's theory. For a comprehensive presentation of the theory in different stages of its evolution, see Gibson (1950, 1966, 1979). These books describe different approaches to direct perception, not all of which (especially the 1950 formulation) are retained in the current formulation of the theory.
The notion of DVP is then examined primarily from a theoretical standpoint (for discussions of empirical evidence against direct perception see Epstein & Park 1964; Gyr 1972a,b; Epstein 1977). Section 2 examines what it means for perception to be direct, and Section 3 raises general arguments against the plausibility of direct perception. Section 4 is a "case study": the application of the theory to a particular problem, the perception of moving objects, is discussed to highlight some of the inherent shortcomings of the direct approach. Finally, Section 5 tries to put the DVP approach in the perspective of a more comprehensive framework, and to identify some of its missing ingredients.
© 1980 Cambridge University Press 0140-525X/80/030373-43$04.00
1.1 Direct visual perception
Visual perception and its relation to the structure of the environment are viewed by the theory of direct visual perception as a sequence of two direct and unambiguous mappings: "stimulation is a function of the environment, and perception is a function of stimulation" (Gibson 1959, p. 459). The first mapping is between various aspects of the environment and some spatiotemporal patterns of the visual array, sometimes called "higher order stimuli" (the more recent formulations of the theory emphasize the transformations and invariants in these patterns). The second mapping is between stimuli and percepts. When an observer moves in the environment, some aspects of the light array that reaches his eyes change, while others remain unchanged. The information in these transformations and invariances specifies the environment: its layout, changes of layout, and the occurrence
2. What does it mean for perception to be "immediate"?
The DVP theory contends that the relation between stimuli (or information in the array of light) and percepts is direct and immediate. To evaluate this claim we shall first examine what it means for percepts and stimuli to be "immediately
related." More specifically, we shall ask under what conditions the theory of perception can view stimuli and percepts as directly related, and what would be the criteria for abandoning this view in favor of a different kind of relation.
The term "immediate" has several meanings and connotations; in particular, the qualifications for being "immediate" may be relative to the system under investigation. If a system S is investigated, then any signal that reaches S from the outside can be considered "immediate." For the psychologist, for example, signals of heat or touch produced by peripheral receptors might be thought of as immediate in this sense, since they are external to the system under investigation. For the physiologist, on the other hand, who studies, say, the internal mechanisms of Meissner's corpuscle (a touch receptor), the relation between touch and the receptor output cannot be dismissed as immediate. In this sense, the term "immediate" does not serve to describe the signal or operation under consideration, but to express a point of view that places them outside one's domain of interest. Viewing the relation between stimuli and percepts as immediate in this sense would imply that regardless of how percepts are actually related to stimuli, we simply hold this relation to be outside the scope of the theory of perception, which is an unlikely position.
Let us accept, therefore, the view that the relation between stimuli and percepts does not lie outside the domain of the theory of perception, and is not immediate in this sense. Describing the stimulus-percept relation as "immediate" would still be justified if the relation had no meaningful decomposition into more elementary constituents. To clarify these notions of "immediate"
absorption functions of the retinal receptors may play a primitive role. Within the theory, certain regions of the light spectrum can be "immediately registered" by the retinal cones. This does not exclude, however, an explanation of these absorption curves, for instance, in molecular terms. Similarly, the theory of perception would be justified in claiming that the shape of an object is "directly picked up" if a further elaboration of this "picking up" operation would only be possible in physiological, but not in psychological terms. If, however, the perception of shape has a meaningful decomposition, if it can be further decomposed and explained in terms of more elementary concepts and operations, then such an explanation would be more satisfactory than the "immediate registration of shape."
Another example of interest where the notion of a "meaningful decomposition" plays an important role concerns the distinction between molar and molecular descriptions. The ideal gas law PV = NRT is an example of a molar description, stating the relation between the pressure, volume, and temperature of N moles of ideal gas. This molar equation can also be derived from more elementary phenomena, but a description in terms of the elementary phenomena would involve a shift from the domain of gas containers, their volume, pressure, temperature, etc., to the domain of molecules in random motion. Gibson employed the molar/molecular distinction to argue that "immediate perception" is justified since psychology studies phenomena at a molar level. In describing the movements of an animal, for example, we are interested in a "molar" description, not in the detailed contractions of individual muscles (Gibson 1960). Analogously, he argues that, on the molar level, stimuli and percepts should be described as immediately related.
This claim implies that a meaningful elaboration of the stimulus-percept relation, and the process of information pickup, would require a shift to the molecular level, or, in the case of perception, to the physiological and anatomical level. In other words, the relevance of the molar-molecular analogy hinges on the feasibility of decomposing the relation between stimuli and percepts in psychologically meaningful terms. This problem lies at the heart of the dispute between the theory of direct perception and the computational/representational approach: if the extraction of visual information can be expounded in terms of psychologically meaningful processes and structures, then it cannot be considered immediate. Much of the ensuing discussion will focus on problems pertaining to this controversy. For additional controversies related to direct perception that will not be emphasized here, see [1].
2.1 A note on direct perception and direct realism
Discussions of direct perception have often been related to the problem of realism in philosophy. It has been argued (Gibson 1967; Yolton 1968-9; Gibson 1968-9; Metzger 1972; Henle 1974; Turvey 1977) that the DVP theory has significant ramifications for the problem of realism in that it lends new and sophisticated support for direct realism. [See also Fodor: "Methodological Solipsism" BBS 3(1) 1980.] Both realism in general and direct realism in particular are claimed to be supported by the theory of direct information pickup. If we are endowed with mechanisms that can directly register aspects of the environment, then such an environment must exist (which is a case for realism), and we have a direct knowledge of it (which is a case for direct realism). A detailed examination of these issues would require too long a digression. I shall therefore make only two brief comments that bear on the issues at hand, one related to realism in general, the other to direct realism.
In voicing his skepticism, the non-realist does not have to
deny the self-consistency of the realist's position. The existence of external objects, and of perceptions that reflect them faithfully, is one possible state of affairs. It is not the only conceivable one, however, and the non-realist sees no compelling reason to accept it. I see no significant new argument in the theory of immediate perception that will force the non-realist to abandon his position. As far as the non-realist is concerned, the view that we possess mechanisms directly sensitive to patterns and invariances, and that these patterns in turn specify the external reality, is still not the sole, irrefutable position. The DVP theory is consistent with realism, but does not seem to offer a significantly "new and sophisticated support" for it.
To examine the relation between direct perception and direct realism, it would be useful to distinguish between two notions of directness. The first is the notion of the direct awareness of objects, as subscribed to by direct realists. The second notion, which has to do with direct perception, makes a claim about the psychological theory of perception. It implies that the perceptual process has no psychologically meaningful decomposition, in the sense defined in the previous section. Now it may be argued that direct realism implies that a perception theory of the direct kind should be preferred. Even if this argument holds, however, it would mean that direct realism lends support to the theory of direct perception, rather than the other way around. If the psychological theory of direct perception is to lend new support to direct realism, it has to be evaluated on its own, independent of direct realism. This brings us back to the problem raised in the last section, concerning the analysis of the perceptual process in psychologically meaningful terms, and in particular the adequacy of "direct information pickup" as a primitive construct in the theory of perception.
In the argument for direct perception it has often been suggested (Gibson 1966, 1967, 1972, 1979, p. 54, 60) that the alternative to direct perception is the indirect sense-data theory of the kind advocated by Locke. Sense-data theories view the perception of objects as composed of two stages. First, elementary stimuli such as homogeneous patches of color give rise to elementary sensations (or "ideas," as they were called by Locke) in the mind; then, the perception of objects is derived from composites of elementary sensations. The DVP theory rejects the "mental chemistry" of elementary sensations, and concludes that the perception of objects and events is the direct result of "higher order" stimuli (or, in a later formulation, the information in the visual array):
I argue that the seeing of an environment by an observer existing in that environment is direct in that it is not mediated by visual sensations or sense data. (Gibson 1972, p. 215).
[The direct theory] is therefore not obliged to postulate any kind of operation on the data of sense, neither a mental operation on units of consciousness nor a central nervous operation on the signals in nerves. Perception is taken to be a process of information pickup. (Gibson 1967, p. 162).
Gibson argues against theories of perception that rely on the mental chemistry of "units of consciousness." The implication from this argument is that since perception cannot be so decomposed, a direct theory of perception is required. But the argument that a Gibsonian theory of direct perception is required simply because the above sensation-based theories are considered untenable suffers the fallacy of "argument by selective refutation." That is, only one of the alternatives to "direct perception," not all of them, is refuted. Association of sensations is not the only conceivable form of a mediating perceptual process.
Rejecting the combination of sensations by the mind does not by itself justify, therefore, the conclusion that processes such as inference, interpretation, computation, categorization, assimilation, or stabilization (Gibson 1959, p. 460), or copying, storing, comparing, and matching (Gibson 1966, p. 39), have no place in the theory of perception.
3. Can perception have an "immediate" theory?
In the preceding section, certain aspects of "directness" in the theory of perception were examined. This discussion will now be applied to the question of the plausibility of direct visual perception. Section 3.1 raises the argument that the richness of stimuli and percepts prevents a satisfactory theory of a direct mapping between them. In Section 3.2 the notion of information pickup by the sense organs and its use as a primitive construct in the theory of perception are examined.
3.1 The richness of stimuli and percepts
The DVP theory describes perception in terms of a family of percepts coupled with their specific stimuli. When a stimulus (or even sufficient information) is present, it can be "directly registered" by an appropriate mechanism tuned for its detection, thereby giving rise to a specific percept. The registration of information is a primitive construct, one that has no elaboration within the theory. According to this view the perceptual system performs only the most elementary kind of computation (if it can be called computation at all). Direct registration is thus essentially equivalent to a basic "table lookup" operation in the sense that for the most part it relies functionally on a single construct whose further elaboration lies outside the scope of the theory. A direct registration is not the only sort of operation available, however, nor is it necessarily the most appropriate one. Some insight into the appropriateness of the "immediate" sort of theory can be gained by considering, in general terms, under what conditions one can expect a system to be adequately considered in "immediate" terms, and in what systems intermediate processes would be necessary.
Let us first return to the elementary example of integer addition. We have seen how the addition of any two integers can be based on a restricted lookup table, augmented by the right-to-left processing rule and handling of carry.
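The two modes of addition can be made concrete with a small sketch. It is an illustration only, not part of the original argument: the names `DIGIT_SUMS`, `add_by_rule`, and `build_direct_table` are hypothetical, and numerals are assumed to be non-negative decimal strings. The first function uses a restricted table of single-digit sums together with right-to-left processing and a carry; the second builds a table that pairs each input directly with its result.

```python
# Restricted lookup table: the sums of all pairs of single digits (100 entries).
DIGIT_SUMS = {(a, b): a + b for a in range(10) for b in range(10)}


def add_by_rule(x: str, y: str) -> str:
    """Add two non-negative decimal numerals (hypothetical helper).

    Uses only the restricted digit table, right-to-left processing,
    and a carry, so it handles numerals of any length.
    """
    width = max(len(x), len(y))
    x, y = x.zfill(width), y.zfill(width)  # pad the shorter numeral with zeros
    carry = 0
    digits = []
    for dx, dy in zip(reversed(x), reversed(y)):
        total = DIGIT_SUMS[(int(dx), int(dy))] + carry
        digits.append(str(total % 10))  # digit written in this column
        carry = total // 10             # carry passed to the next column
    if carry:
        digits.append(str(carry))
    return "".join(reversed(digits))


def build_direct_table(limit: int) -> dict:
    """Direct pairing: one stored entry for every input pair below `limit`."""
    return {(a, b): str(a + b) for a in range(limit) for b in range(limit)}
```

In this sketch the restricted table stays at 100 entries however long the numerals become, whereas the direct table needs `limit * limit` entries and answers nothing outside that range.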
Is this mode of addition better than a large-scale table that lists directly the results of adding pairs of integers? The large-scale table has an advantage: it does not require intermediate steps and therefore offers simplicity, and possibly speed. The indirect method offers a different advantage: employing only a restricted table it is able to handle an unbounded set of inputs. The question of whether direct pairing or indirect computation is preferable thus depends on the task at hand. The direct approach is advantageous when the set of input-output pairs is small (compared with the capacity of the system), and when speed is of the essence. For example, our inborn repertoire of reflexes can probably be thought of as a prewired, immediate coupling between stimuli and responses. When an exhaustive enumeration becomes prohibitive, processes and rules of formation would offer an advantage over the direct coupling of input-output pairs.
The production and recognition of the cricket's song is an elegant biological example of signal production and recognition that can be reasonably thought of as having an immediate nature (Zaretsky 1971; Bentley & Hoy 1974). Cricket song is a train of sound pulses of a fixed temporal pattern. The generation of such a predetermined pattern requires no formation rules, and can be explained directly in terms of the underlying physiology and anatomy. As Bentley and Hoy comment, "the correct pattern arises from the neural connection established during development" (p. 41). Their work is
therefore aimed at identifying the neural mechanisms responsible for song production, and their genetic origin. Similarly, the recognition of the song is carried out directly by a neural "song-responding mechanism" that can "resonate" to the appropriate pulse-sequence (Zaretsky 1971).
In contrast, the view raised by generative-grammar theories is that the production and recognition of grammatical sentences in a natural language do not have an "immediate" theory in this sense. Rules of formation and recognition are incorporated in the system in order to handle the unbounded set of possible sentences. Similarly, if we consider all distinguishable perceptions (such as the perception of all different shapes) as distinct percepts, the number of possible stimuli and percepts becomes too large to be amenable to direct pairing. [See Chomsky: "Rules and Representations" BBS 3(1) 1980.]
To reduce the number of possible percepts one might try to lump them into groups or families. For example, "three-dimensionality" may be suggested as a single percept. (Such percepts have been suggested, for instance by Wallach and O'Connell (1953) and by Braunstein (1962), though not in the context of supporting direct perception.) A percept of three-dimensionality would require a set of parameters associated with it, since we are not only able to distinguish whether an object is flat or three-dimensional, but we can also perceive its particular three-dimensional shape. The requisite associated parameters still have to be retrieved, and therefore the problem of immediate perception is only hidden, not solved, by introducing such percepts as "three-dimensionality." A plausible method for dealing effectively with problems that are too large and complex to be handled by direct pairing alone is to employ processes or rules of formation.
A system that incorporates such processes is therefore a more likely candidate for coping with the enormously complex tasks of visual perception.
3.2 The immediate registration of information and object properties
The basic operation performed by the visual system according to the DVP theory is the registration or detection of information. The information in the ambient light array constitutes the stimulus to the sense organ, which picks it up and thus produces the awareness of objects and events: "... there can be direct or immediate awareness of objects and events when the perceptual system resonates so as to pick up information" (Gibson 1967, p. 168). All the observer has to do in the process is "to pick up information by looking" (Gibson 1966, p. 3). The abstract information that the sense organ directly "resonates to" (Gibson 1966, p. 267) is conveyed primarily in the form of invariants and transformations in the array of light [2]. For example, we correctly perceive the unchanging shape of a rigidly moving object "... not because we have formed associations between the optical elements, not even because the brain has organized the optical elements, but because the retinal mosaic is sensitive to transformations as such." (Gibson 1957, p. 294, italics added.)
A general question raised by the above description is what sorts of stimuli can be registered directly, and what sorts of primitive operations can be assigned to the sense organs. Can information, transformations (as in the above paragraph), and invariants (Gibson 1979, p. 178) be considered the direct stimuli for the visual system, as proposed by the theory of information pickup? Physiology tells us that the retinal receptors register light energy in various regions of the visible spectrum. Gibson raises two arguments as to why we can nevertheless accept abstract information, rather than spatiotemporal distribution of light energy, as the direct stimulus
for the sense organs. The first argument relies on the distinction between sensation and perception, and the second on the availability of patterns for immediate pickup. I shall consider each in turn.
3.2.1. Sensation versus perception. DVP parallels the sensation-based theories of perception in distinguishing between sensation and perception. According to this view, physical stimulation by light causes sensations, not perceptions (Gibson 1966, 1979). What gives rise, then, to perceptions? The sensation-based theories suggest that they are produced from collections of sensations. Gibson rejects this idea and concludes that perceptions and sensations are produced along parallel tracks: stimulation at the receptor level gives rise to elementary sensations, while stimulation of the perceptual system by relevant information directly produces percepts of objects and events (Gibson 1966, 1967).
The above implication (that abstract information constitutes the stimulus for perception) depends on accepting the theory of immediate perception as the only alternative to the sensation-based view of perception. If percepts are indeed directly coupled with stimuli, then these stimuli are necessarily highly complex and abstract. But if direct perception is not admitted, the notion of information as stimulation does not follow. If the possible role of mediating processes is appreciated, then the light distribution at the receptors can be accepted as the input to the visual system. The gap between the physical stimulus and the perception of objects can be bridged, at least in part, not by associating sensations, but by an elaborate process that constructs a representation of the environment on the basis of the incoming light distribution. The key point is not whether the latter view is correct, but that the immediate registration of abstract information is not the only alternative to the sensation-based theories of perception.
To summarize the above point: the argument for abstract stimuli claimed that (a) the sensation-based view is false, and therefore (b) immediate perception and (c) abstract stimuli follow. But the implication is actually that (a) and (b) together imply (c). Hence, the notion of abstract information as the stimulus for perception is primarily implied, not by the rejection of the sensation-based view, but by the acceptance of the theory of immediate perception.
3.2.2. The availability of patterns for immediate pickup. A second argument that supports, according to the DVP approach, the existence of abstract stimuli and their registration, is that the distribution of patterns of light in space and time are directly available to the visual system. As far as I can see, this availability of patterns as stimuli is supported in the direct theory by two arguments: (i) the existence of neural interconnections and (ii) the locomotion of the observer. Neural interconnections create higher-order units in the nervous system that can register spatial patterns directly (Gibson 1967). When, in addition, the observer moves about in the environment, the interconnected network of photoreceptors and higher order "resonators" can register the information in the spatiotemporal patterns. Perception is therefore "not supposed to occur in the brain but to arise in the retina-neuro-muscular system as an activity of the whole system" that moves in the environment and resonates to the available information (Gibson 1972, p. 217).
Let us first clarify the point of contention in this argument. The controversy does not concern the relevance of spatiotemporal patterns to visual perception. It is granted that information about objects is carried by the distribution of patterns of light and their changes over time. The debate concerns the nature and complexity of the processes that "register" the
information in the spatiotemporal patterns, that is, whether the registration of information should be taken as a primitive construct, or should have an explanation within the theory.
The fact that spatiotemporal patterns of light carry sufficient information for visual perception does not by itself entail, however, the immediate registration of the information in these patterns. It has recently been shown, for example (Ullman 1979a, 1979b; Longuet-Higgins & Prazdny, in press), how the rigidity and three-dimensional shape of moving objects can in principle be recovered from their changing images. These results are applicable both to continuous and discrete (movie-like) stimuli, and to perspective as well as parallel projection. For simplicity, let the case of discrete presentation and parallel projection (such as the image of a distant object) serve as an example. As it turns out, the three-dimensional structure of an object containing at least four non-coplanar elements can be recovered completely if it is viewed from three distinct viewing points. This result guarantees that under simple restrictions there is indeed sufficient information in the changing image to specify the rigidity and shape uniquely [3]. The information is encoded in "high order patterns," in the sense that extended patterns in space and time are required.
The recovery of the rigidity and correct three-dimensional shape is possible in this scheme, but it is far from immediate, for two main reasons. First, the shape recovery cannot be broken down into a collection of percepts, each one associated with its specific, independent stimulus, invariant, or transformation. Second, the particular process by which the available information is utilized by the visual system has direct psychological implications. It is evident that in the recovery of structure from motion the visual system does not make full use of the information available to it.
For example, if the number of elements in view is small, or if the presentation time is short, humans will fail to perceive the correct three-dimensional structure, although sufficient information is in fact available. It seems, therefore, that for a satisfactory explanation of visual perception, the "pickup of available information" will have to be studied and analyzed, rather than taken as a primitive construct. In both the direct and indirect theories, then, visual perception relies on the information in spatiotemporal patterns of light. The underlying question on which they disagree is whether the information in these patterns is indeed picked up immediately.
The psychophysical investigation of frequency-tuned channels in human vision can illustrate some of the distinctions between immediate and non-immediate registration of information and patterns. Following the work of Campbell and Robson (1968), substantial evidence has been accumulated for the existence in human vision of a number of distinct channels, or mechanisms sensitive to different ranges of size and spatial frequency. It has been shown (e.g. in Richards & Polit 1974; Julesz & Miller 1975; Watson & Nachmias 1977; Wilson 1978; Marr & Poggio 1979; Wilson & Bergen 1979) that a variety of phenomena in pattern detection, pattern discrimination, and stereoscopic vision can be explained by the properties of the channels and nonlinear interactions among them. It also appears that the basic properties of the channels themselves are a direct reflection of the receptive field properties in the retina and the lateral geniculate nucleus. These encouraging results illustrate a number of points concerning the immediate registration of patterns and information. In general, the "directness" of perceptual mechanisms may be a matter of degree, with no absolute boundary distinguishing the direct from the indirect.
In the above example it appears that one can be comfortable with viewing the underlying channels as the basic mechanisms that register patterns of light more or less directly, since (a) the channels appear to be explicable in physiological terms, and (b) the detailed dissection of the channels does not appear to
have significant perceptual implications. More complex visual modules, such as stereopsis, can then be explained using the properties of the underlying channels and the interactions among them. The conclusion from this example is that a psychologically meaningful decomposition of, say, stereoscopic vision, seems possible. But if it is, then the explanation of stereoscopic vision as the immediate pickup of binocular information (Gibson 1979, Ch. 12) would not be justified.
The same argument is relevant for other perceptual and nonperceptual domains. If meaningful decompositions are possible, then the psycholinguist, for instance, should be dissatisfied with the suggestion that we comprehend utterances in natural language simply because our auditory system is tuned to directly pick up their meanings. Similarly, the perceptual psychologist should be dissatisfied with the claim that a property like rigidity is directly picked up. The underlying reason is that an attempt should be made to elaborate these processes rather than accept them as primitive constructs. If such an elaboration is possible, it would serve as an integral part of our understanding of the linguistic and perceptual processes. Even if such an elaboration should ultimately prove to be difficult or perhaps unattainable, the implication of the foregoing discussion is that the direct explanations should better be regarded as a "last resort," rather than a starting point, for cognitive theories. [See also Pylyshyn: "Computation and Cognition" BBS 3(1) 1980.]
4. Perceiving the three-dimensional structure of moving objects
This section will examine the approach of the DVP theory to the problem mentioned above of perceiving the three-dimensional structure of a changing environment. This problem was one of the most extensively studied within the immediate perception approach, and its examination can serve to illustrate some of the shortcomings inherent in this approach.
Changes in the structure of the environment relative to the observer can be caused by the movements of the observer, by motion of objects in the environment, and by non-rigid transformations of objects. In the case of object motion relative to the observer, the visual system has a remarkable capacity for correctly recovering the three-dimensional shape of the moving objects, even when the objects are unfamiliar, and when each static view of the scene contains no information about the three-dimensional structure of the objects. The first systematic study of this capacity was carried out by Wallach & O'Connell (1953) in their investigation of what they have termed the "kinetic depth effect (KDE)." In their experiments, an unfamiliar object was rotated behind a translucent screen, and the shadow of its projection was observed from the other side of the screen. In most cases the observers were able to give a correct description of the hidden object's structure and motion even when each static view of the object was unrecognizable and gave rise to no three-dimensional impression.
In the original study of the kinetic depth effect, as well as in later studies (Wallach et al. 1956; Jansson & Johansson 1973), the ability to perceive structure from motion was accounted for in terms of an "effect" produced by lines and contours that change simultaneously in both length and orientation [4]. This explanation, which proposes a direct coupling between a percept and a certain class of two-dimensional patterns, is, however, highly unlikely.
If only actual lines in the image are considered, the account is manifestly false, since the structure of unconnected dots can be recovered through their motion. Imaginary lines connecting identifiable points were therefore admitted as well (Wallach & O'Connell 1953). But the resulting condition (i.e. that the perception of three-dimensional structure is produced by lines, virtual lines, and contours that change in both length and orientation) is certainly insufficient. Consider, for example, the random motion of unconnected elements in the frontal plane. The virtual lines between them change constantly in both length and orientation, but no coherent three-dimensional structure is perceived. The above condition is also necessary in a trivial sense only: the only two-dimensional transformations of the image that violate Wallach and O'Connell's condition are rigid transformations (of the image, not of the three-dimensional objects) and uniform scaling. But if the structure of a three-dimensional object is not recoverable from a single projection, it is hardly surprising that a uniform displacement, rotation, or scaling of the image itself is insufficient for revealing the unknown structure [5].
The perception of structure from motion was also addressed by Gibson and his collaborators. The first solution proposed in their studies was that kinetic depth phenomena are induced by gradients of velocities. This hypothesis was not confirmed, however, by empirical investigations (see review by Epstein & Park 1964; Farber & McConkie 1979). A different hypothesis in later studies suggested that continuous perspective transformations are directly registered by the eye (Gibson 1954; 1957; 1965; 1968; Gibson & Gibson 1957; von Fieandt & Gibson 1959). But this hypothesis raises difficult problems: what singles out those two-dimensional transformations that originate from the motion of rigid objects, and how can those transformations be registered by the eye? Hay (1966), in an extension of Gibson's analysis, tried to provide some answers to these questions by using techniques from projective geometry.
A major difficulty with applying projective geometry to the problem at hand is that the transformations induced by the projections of a moving object are not equivalent to the group of projective transformations studied in projective geometry. (Projective transformations are the projection of nonsingular linear transformations. The motion of objects is not, in general, a linear transformation.) Hay tried to circumvent some of the difficulties by (a) restricting his analysis to planar objects, and (b) decomposing the problem, and treating the perception of moving objects as based on eight distinct stimuli that can be studied separately. It proved impossible, however, to extend the analysis to nonplanar objects, nor was it possible to identify the relation between the eight basic stimuli and the various motion percepts (Hay 1966; Gibson 1968). An additional problem with the hypothesis of continuous perspective transformations is that neither perspectivity nor continuity is required for the perception of structure from motion (Ullman 1979a).
A later attempt at identifying the immediate stimuli for the perception of moving objects concentrated on the notion of invariants (Gibson 1960; 1966; 1972; 1979). This programme states that in the transformations induced by moving objects some aspects of the patterns change while others remain invariant. It is hypothesized that the invariants are directly registered by the eye, giving rise to the perception of objects in motion. In this latter formulation the notion of invariances assumes a pivotal role in motion perception: "The perceptual system simply extracts the invariants from the flowing array; it resonates to the invariant structure or it is attuned to it" (Gibson 1979, p. 249). More generally, "The extracting and abstracting of invariants are what happens in both perceiving and knowing" (p. 258).
In evaluating the invariance-based programme it is worth noting that the question of whether a given system follows some rules of invariance is often merely a matter of convenience. For instance, the physical rules governing the motion of a free-falling object can be expressed in terms of invariant total energy (potential energy is transformed into kinetic energy). Alternatively, they can be expressed in terms of the effect of gravitational forces. The rules of mechanical motion can be expressed in yet another formalism (also favored by some theories of perception), the formalism of minimum principles. In Hamiltonian mechanics, motion is governed by de Maupertuis' principle of least action. For formulations of minimum principles in perception see e.g., Mach (1897), Hochberg & McAlister (1953), Attneave & Frost (1969), Attneave (1972), Restle (1979), and the Gestalt Prägnanz principle (Koffka 1935).
The question of which formalism is to be used, whether a minimum principle, an invariance, or otherwise, is of secondary concern to the theory of visual perception in its current stage. Since little is known about the rules governing perception, the primary concern is the discovery of these rules, rather than the feasibility of an invariance-based formulation. The definition of invariances in the theory of direct perception is in fact so broad that almost any rule, once discovered, can be reformulated in terms of invariances [6]: "A great many properties of the array are lawfully or regularly variant with changing observation point, and this means that in each case a property defined by the law is invariant" (Gibson 1972, p. 221). The relevant problem for the perception of structure from motion is therefore not whether the information in the visual array and the perception of moving objects are expressible in terms of invariances, but what the information is and how it is utilized by the visual system.
A formulation in terms of invariances would be advantageous for the theory of direct perception if invariances could be discovered in the changing visual array that would be (a) informative enough to specify the structure of the moving objects, and (b) simple enough so that it would be reasonable to suggest that they are picked up directly.
A hypothesis along these lines has been made (Gibson, Owsley, & Johnston 1978) by suggesting that the cross-ratio, which is known from projective geometry to be an invariant of projective transformations, underlies the perception of moving objects [7]. Whether or not the cross-ratio invariance is indeed utilized by the perceptual system is an open question. But since it requires four collinear points, and cannot reveal the structure of moving objects in general, it cannot even begin to answer the problem of recovering the structure from the changing projection. As has been mentioned above [Section 3.2.2], alternatives do exist: there are schemes that can recover unambiguously the structure of moving objects. But these schemes are neither direct nor based on invariances (Johansson 1964; 1970; Ullman 1979a, footnote 8).
In summary, several inherent shortcomings of the direct perception approach are manifest in the attempt to apply the theory to the perception of moving objects. The direct approach leads to viewing the perception of moving objects as a collection of percepts or "effects" produced by characteristic stimuli. The decomposition of perception into simple, distinct percepts, and the search for stimulus characteristics that can reasonably be registered directly, has not proved very fruitful (at least in the sense that no direct scheme exists that can describe the three-dimensional shape that will be perceived from the changing stimuli in the Kinetic Depth demonstrations). The more promising indirect schemes suggest that this may reflect inherent problems in the direct approach, not merely a temporary failure to identify the relevant stimulus invariances.
4.1 Mach's illusion and the possible role of internal representations
The perception of moving objects can serve to illustrate an additional source of dispute between the theory of immediate perception and current "indirect" theories. A well-known phenomenon in motion perception is the illusion named after Ernst Mach [9]. Mach's illusion can be demonstrated in the following way. Consider a sheet of paper folded to create a standing v-shaped figure. When viewed monocularly, this shape is ambiguous; the v-shape can reverse in depth (Eden 1962; Lindsay & Norman 1972). An observer views the v-shape
4.2 Empirical investigations of internal representations
The above discussion has considered the "internal states" of the perceptual system in the case of Mach's illusion. If, however, something like an internal representation of the environment exists in this case, it is unlikely that it is constructed in this case only; it is more likely to be a part of the perceptual process in general. In addition, in recent years there has been a growing body of evidence regarding the existence and nature of the internal representations in a variety of situations. Although the emphasis here is on a theoretical analysis, I shall briefly describe some of this evidence, as it bears directly on the problem of internal representation.
The current research into the nature of the internal representations in perception received much of its thrust from the experiment of Shepard and Metzler (1971). In this experiment, subjects were presented with 1600 images, each depicting a pair of three-dimensional objects. In all cases the two objects were separated by rotation in space, i.e., they had a different orientation with respect to the viewer. Half of the pairs depicted two objects of identical three-dimensional shape. In the other pairs the two objects were not identical, but a mirror image of each other. The subject's task was to decide as quickly as possible whether the two objects were identical in shape. The main finding of the experiment was that response time to the identical pairs varied linearly with the angular separation between the objects. Furthermore, it did not matter whether the portrayed objects were separated by rotation in the image plane or in depth. These findings were subsequently replicated and extended (see Shepard 1975; 1978, for a summary of results). One noteworthy variant of the experiment established that when the two objects are presented successively, and the subject is given sufficient advance information concerning the object to be presented and its orientation, then the response time becomes uniform, i.e., independent of the orientation difference.
Shepard and Metzler's interpretation of the data was that the perceived identity in the experimental situation required a transformation of internal representations. This transformation has an effect equivalent to rotating one representation in an attempt to bring it to registration with the other. The linear dependence is then explained in terms of a constant rate of this rotation-like operation. In the case of sufficient prior information the transformation can be performed prior to the presentation of the second object, thus reducing the response time in the observed manner.
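The interpretation just described amounts to a simple linear model: if the rotation-like operation proceeds at a constant rate, response time grows linearly with angular separation, and when the rotation can be carried out in advance only a constant base time remains. The following is a minimal sketch of that reading. The function name and the parameter values (`base_s`, `rate_deg_per_s`) are hypothetical illustrations, not figures reported by Shepard and Metzler.

```python
def predicted_response_time(angle_deg, base_s=1.0, rate_deg_per_s=60.0, prepared=False):
    """Predicted response time in seconds under a constant-rate rotation model.

    base_s and rate_deg_per_s are hypothetical illustration values.
    prepared=True models the variant in which the transformation is performed
    before the second object is presented, so only the base time remains.
    """
    if prepared:
        return base_s
    return base_s + angle_deg / rate_deg_per_s


# Unprepared times rise by equal steps with angle; prepared times stay flat.
for angle in (0, 60, 120, 180):
    print(angle, predicted_response_time(angle), predicted_response_time(angle, prepared=True))
```

Under these assumed values the slope of the unprepared times is the reciprocal of the rotation rate, which is how a constant rate would show up as a linear dependence in the data.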
The particular scheme suggested by Shepard and his co-workers has been the source of much debate, and alternative theories have been proposed (e.g., Pylyshyn 1976; Marr & Nishihara 1978; Hinton 1979; Sutherland 1979; I
5. From function to mechanisms
In the theory of direct visual perception, the visual process is to be understood at two levels, which can be roughly labeled "information content" and "mechanism." At the first level the information content of the visual array (e.g. the "ecologically valid" transformations and invariants, and the way they specify objects and events) is to be analyzed. The second level belongs primarily to the realm of physiology, and its task is to unravel the neural mechanisms that register the information explored at the first level.
A different approach, described by Marr and Poggio (1977), distinguishes three main levels in the understanding of information-handling systems: the levels of function, algorithm, and mechanism [12]. Although the borderlines between the levels are not always clear, the distinctions are useful in examining the relations between various aspects of information-handling systems. The first and last of these levels roughly correspond to the analysis of information content and mechanisms, respectively.
The intermediate algorithmic level is indispensable in bridging the gap between the levels of function and mechanism. A simple example may illustrate this role. Suppose that an investigator tries to unravel the internal workings of the electronic calculator we considered in Section 2. One possible approach would be to investigate the mechanism by probing the currents and voltages of the various components. If the function of the calculator is unknown to the investigator, he would face a difficult, perhaps impossible, task. Understanding the function of the system as performing arithmetic operations would facilitate the study of the mechanism, and would also serve an integral part in the theory of the system (cf. Ullman 1979b, pp. 1-4). The theory of arithmetic is, however, insufficient for the mapping of arithmetic operations onto the mechanisms within the system. For the theory of arithmetic, the particular representation of numbers, for instance, is immaterial. It can be binary, decimal, or any other representation. Knowledge of the particular representation employed would become instrumental, however, in the attempt to identify the roles of particular mechanisms within the system. This conclusion is not restricted to simple artificial devices. The general point is that if representations are employed, then a detailed study of the representations and the operating processes is required to relate the level of function to the level of the physical mechanisms.
The dismissal of the middle level - which includes processes, representations, and the integration of information - as consisting of immaterial "intervening variables" leads to three deficiencies in the theory of perception.
First, as we have seen, the algorithmic level plays an indispensable role in bringing together the studies of function and of mechanism.
Second, the elucidation of the participating representations and processes constitutes an integral part of the theory of perception. The behaviorist might object to this notion and question whether representations and processes "really exist." Thus Neff (1936), in a review of theories of motion perception, concludes that "the assumption of an active mind is one of the most primitive beliefs of mankind" (p. 39), and Gibson dismisses perceptual processes as "old fashioned mental acts" (1979, p. 238). But a distinction has to be drawn between "symbolic" and "mental" [13]. The mediating processes in the computational/representational theory do not operate on subjective experiences (Gibson 1979, p. 238), nor are they intended to account for their origin [see Fodor: "Methodological Solipsism" BBS 3(1) 1980]. Subjective experience remains for the computational/representational approach (as it does for the direct approach) a complete mystery. Gibson's objection to the computational approach on the grounds that "no one has suggested that a computer has the experience of being here" (Gibson 1972, p. 217) cannot serve, therefore, to refute the computational approach. In fact, the perceptual processes are not necessarily open to conscious introspection. Consequently, the introspective impression that the perception of objects is immediate and unanalyzable cannot be taken as evidence supporting the theory of immediate visual perception (cf. Gibson 1972, p. 222).
The calculator example examined above illustrates in what sense processes and representations are amenable to empirical investigation: certain events and components within the calculator can consistently be interpreted as having their meaning in the domain of numbers and operations on numbers [14].
There is nothing mysterious or mentalistic, then, in accepting and studying these intermediate representations and processes. Analogously, although the brain mechanisms may be very different from electronic ones, it is perfectly conceivable that certain events and components within the brain constitute (or can be consistently interpreted as) visual representations and processes that are amenable to empirical study, and are instrumental in explaining perception. The dismissal of the algorithmic level as "immaterial" is therefore unjustified in either sense of the word (i.e., "fictitious" on the one hand and "insignificant" on the other).
The third shortcoming of ignoring the algorithmic level is that it leads to oversimplifications of the theory. If processing is trivial or nonexistent, then one is led to search for "immediately registerable" information, such as the simple cross-ratio in the perception of three-dimensional structure in motion. If the role and complexity of the processes that "pick up" the information is appreciated, then it would be possible to realize that the information can assume less direct forms. The complexity of these underlying processes may be veiled by the subjective ease and immediacy of perception. But this subjective impression should not serve to underestimate their complexity. Schrödinger (1958) argued that as a process is perfected in the course of evolution, it "drops out of consciousness," and becomes inaccessible to introspection. If he is right, we can actually expect some of the most elaborate and perfected processes to be inaccessible to introspection. In any event, the possibility that perceptual processes may be highly complex has to be confronted.
The process of stereopsis (i.e., the combination of information from the two eyes) exemplifies this hidden complexity in visual perception. Subjectively, it seems that all we have to do is use both eyes, and binocular fusion occurs. We can "pick up information by looking" (Gibson 1966, p. 3), or so it seems. The actual process turns out to be highly complex, however. See Julesz (1971) for much of the empirical data, and Marr & Poggio (1979) for a recent theory of human stereopsis.
In one respect Marr and Poggio's analysis agrees with Gibson's: it capitalizes on "ecological" properties such as the opacity and continuity of objects. But the Gibsonian view that what remains to be done is to pick up the invariances in the inputs to the two eyes turns out to be too simplistic. The information is extracted by an intricate interplay of filtering, matching, and eye movements [15]. This process establishes that there is sufficient information in the visual arrays to allow for the reliable extraction of stereo disparity. I doubt, however, that the method by which the stereo information is encoded can be revealed by examining the two inputs to find the relevant immediate invariances, independent of the processes that extract this information (Gibson 1961; 1979, Ch. 12).
Recently, Neisser (1976) has expressed an uneasiness with what he called the information-processing view that describes cognition in terms of processing and "still more processing" [ibid., Figure 1]. He suggested that if Gibson is right in his information-content analysis, perhaps w
ing is not to create information, but to extract it, integrate it, make it explicit and usable (cf. Marr 1976; Ullman 1979b, Ch. 5).
In conclusion, it would be misleading to pose the problem as a trade-off between "ecological optics" on the one hand and "information processing" on the other, since they play largely distinct roles. At the top level, the functions of the visual system have to be understood. This level includes the information-content analysis of ecological optics. At the second level, the particular representations and processes employed by the visual system are to be explored. The third level includes physiological and anatomical studies of the neural mechanisms of the visual system, and the relation of these mechanisms to the representations and processes employed by the system.
I think that viewing the theory of immediate perception in terms of the above three levels helps to put it in a proper perspective. The parts of the theory dealing with the information content of the visual array and its relation to the "ecology" are likely to make a lasting contribution to the theory of perception. The immediate approach, on the other hand, would have to be extended by a more comprehensive theory, one that will draw an integrated picture of the perceptual systems at the levels of function, process, and mechanism.
ACKNOWLEDGMENT
I wish to thank E. Hildreth, W. Richards, and K. Stevens for their invaluable help.
NOTES
1. To avoid possible confusion, it may be helpful to list a number of related controversies that will not be in the focus of the discussion here, either because they have been discussed in detail in the past, or because they are not central to the arguments examined in this paper. These are: (1) The role of past experience in perception (e.g. Gibson 1972; Pittenger, Shaw & Mark 1979). (2) The interactions between non-visual modalities and visual perception (Gyr 1972a, 1979; Eriksson 1974; Turvey 1977). (3) The degree to which the environment is specified by static images, and by changes in the visual array (Gibson 1966; 1979; Neisser 1976; Turvey 1977). (4) The differences between continuous optical flow and discrete sampling of the visual array (Gibson 1972; Turvey 1977).
2. If the "resonator" or "tuning-fork" metaphor used to describe the process of information pickup is taken too literally, it raises an additional difficulty: a tuning-fork is basically a linear device, while our visual system incorporates essential non-linearities (see, e.g., Caelli & Julesz 1978; Julesz & Caelli 1979). The term "resonator" will be interpreted therefore in a broader sense, i.e., any mechanism that can register information directly, not necessarily linearly.
3. The analysis of visual motion described in these schemes applies equally well to continuous and to discrete presentation. I do not wish to suggest that the human visual system employs a discrete sampling (in time) of the visual array. These schemes stand in contrast, however, to the claim that the interpretation of visual motion is unattainable on the basis of discrete sampling, which is central to the position of Turvey (1977).
4. It should be noted that neither Wallach and O'Connell nor Johansson subscribes to the direct approach in general. The explanation of the KDE as an "effect" produced by the simultaneous change in length and orientation is, however, "direct" in nature.
5. Scaling can be used to indicate motion in depth (Marmolin 1973) and time-to-collision (Lee 1976), but not to recover structure from motion.
6. Similarly, Shaw, McIntyre, and Mace (1974) have emphasized the role of symmetries in direct perception. But the notion of symmetry in their formulation is broad enough to include, e.g., the rules of entropy, homeostasis, adaptation, and the attainment of knowledge.
7. The cross-ratio is defined in projective geometry for four collinear points (a,b,c,d) to be (ac · bd)/(bc · ad). The cross-ratio of four distinct points is invariant under projection.
8. The perceived structure is, of course, an invariance. But the registration of this invariant is simply equivalent to the original problem.
9. The depth reversal of Mach's figure, but not the motion effects, are described in Mach (1897).
10. Chomsky (1959) makes a similar argument against the mapping between stimuli and responses in behaviorism. For details see Chomsky (1959, p. 551).
11. See also the discussion of the integration of size and orientation information in Hochberg (1974).
12. In announcing the establishment of a Center for Cognitive Studies at MIT, the same three levels were described as the skeleton not only for the study of visual perception, but for the cognitive sciences in general. Tech Talk, 23(28), March 21, 1979.
13. While Neff, Gibson, and others view symbolic events as mental, others have committed the opposite error, reducing subjective experiences to symbolic processes. For example, E.R. John claims that "consciousness itself is a representational system" and can be explained in terms of information processing (Thatcher & John 1971), and G.J. Taylor contends that the study of conscious experience is a legitimate branch of natural science (Taylor 1962). For more discussion of this point see Griffin (1978) and Ullman (1978b). More generally, I do not wish to claim that the computational/representational theory is likely to encompass all aspects of perceptual phenomena, certainly not all aspects of the mind. The claim, however, is that it provides a more satisfactory psychological theory of perception than the DVP theory.
14. The interpretation is not necessarily unique, but this difficulty is not central to the argument here.
15. In Marr and Poggio's theory. But even if the theory is incomplete or incorrect, to fit the available data it seems likely that any competing theory would be at least as complex.
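As a numerical illustration of note 7, the sketch below computes the cross-ratio (ac · bd)/(bc · ad) for four collinear points given by one-dimensional coordinates, and checks that it is unchanged by a one-dimensional projective map x -> (p*x + q)/(r*x + s). The function names, the sample points, and the coefficients of the map are assumptions chosen for the example; exact rational arithmetic is used so that the comparison is not affected by rounding.

```python
from fractions import Fraction


def cross_ratio(a, b, c, d):
    """Cross-ratio (ac * bd) / (bc * ad) of four collinear points,
    each given by its coordinate along the common line."""
    ac, bd = c - a, d - b
    bc, ad = c - b, d - a
    return Fraction(ac * bd, bc * ad)


def project(x, p, q, r, s):
    """One-dimensional projective map x -> (p*x + q) / (r*x + s).
    The map is nonsingular when p*s - q*r != 0."""
    return Fraction(p * x + q, r * x + s)


points = [0, 1, 3, 7]                      # hypothetical sample points
before = cross_ratio(*points)
after = cross_ratio(*(project(x, 2, 1, 1, 3) for x in points))
print(before, after)                       # both print 9/7
```

The check covers only four collinear points, which is the limitation raised in the main text: the invariant by itself says nothing about the structure of a general moving object.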
Open Peer Commentary
Commentaries submitted by the qualified professional readership of this journal will be considered for publication in a later issue as Continuing Commentary on this article.
by O. J. Braddick
Department of Experimental Psychology, University of Cambridge, Cambridge CB2 3EB, England
Direct perception: an opponent and a precursor of computational theories
The theory of "direct visual perception" can be seen as a metatheory; that is, a theory of what would constitute an adequate theory of visual perception. In my view, the arguments marshaled against it in these terms by Ullman are quite compelling. A theory (however complete or accurate) as to which properties of the proximal stimulus are extracted to yield our perceptions is not good enough. It is not good enough because something better promises to be available, namely an account of such relationships plus an account of the mechanisms which permit the extraction. Furthermore, what is promised is not just (or perhaps not at all) a molecular account of the mechanisms at the level of single-unit physiology, but a distinctively psychological or computational account which would organize our understanding of them. Ullman's quotation "the retinal mosaic is sensitive to transformations as such" shows that even Gibson found it difficult to eschew an account that leant on physiological mediation in this instance, an account that flew in the face of retinal physiology even as it was known in 1957.
However, this statement does raise an issue that is important and far from resolved for representational theories, whether physiological or computational: that is, the information about transformations is clearly present in the pattern of retinal activity: representational theorists have to argue that it is not explicit at that level, but that it may become so in some higher-level representation that is derived from the retinal representation by computational processes. The idea of information explicit in a representation has not to my knowledge been rigorously defined. Certainly, it has not been established that information about a stimulus variable has to be explicit in any particular, obvious sense at some level of representation in order that this variable should be available to conscious perception, or that it should drive verbal or motor behaviour. Barlow (1972), for example, has argued that perceptually relevant variables are represented by the activation of ever smaller and more specific populations of neurones at higher levels in the visual pathway; but this proposal is frankly speculative. It is difficult to find arguments against the opposite view, that even uniquely significant stimuli could be represented at the highest levels of the visual mechanisms by patterns of activity that involve the entire population of neurones at these levels. These alternatives are roughly analogous to "engram" and "holographic" theories of memory representations. Is it possible to specify what information is "explicit" in a hologram? Until the idea of explicitness in representations is itself made explicit, "direct" theorists will be able to sustain scepticism about the status of representational theories.
We may argue that Gibson's approach is quite wrongheaded as a metatheory of perception. As an argument for a particular programme of perceptual research, it has much more force. Ullman's paper does not make clear what historical opponents Gibson was effectively taking on. One was traditional psychophysics: a programme of analysing laboratory discriminations of a deliberately highly simplified sort. While this has led to a coherent and cumulative body of scientific understanding, Gibson was surely right in arguing that a laboratory discrimination task may often have little to do with the task that the environment presents to a perceiving organism. Sensation-based theories of perception may never have had much going for them as overt, formal theories, but they are to some degree implied by a programme of
perceptual processes revealed by these laboratory demonstrations exist only to be activated in such rare and highly artificial circumstances.
Actually, there is a continuity between the perception of the Ames demonstrations or the Mach illusion and that of Gibson's moving observer in a richly textured environment. The completeness of the information in Gibson's optic array depends on what he calls "ecological laws" - that is, certain constraints on what could exist in the environment to produce the optic array (for example, that objects have continuous surfaces). In perceiving the Ames room, much more specific constraints are apparently implicitly accepted by the perceptual system (such as that artificial surfaces meet at right angles). Yet these specific constraints do not act to prevent us seeing trapezoidal enclosures correctly when we have richer stimulus information, as from motion parallax or binocular disparity. It seems that the perceptual system will always adopt a wor1 that are severe enough to disambiguate the stimulus information available, but not so severe as to require part of that information to be ignored. In the Mach illusion, for instance, we do not accept the constraint that inanimate objects cannot systematically move with our head positions, although in other circumstances adopting such a constraint would help solve the structure-from-motion problem.
The argument about how far our perceptions are determined by proximal stimulus information, and how far by the implicit adoption of constraints as to what the distal stimulus might be like, is one that is very much alive in the computational approach to perception, under the name of "top-down versus bottom-up processing." Here, as in older approaches to visual perception, strongly top-down schemes have sometimes been adopted when closer analysis of the input information would have revealed that it was unnecessary to adopt such narrow (even if actively modifiable) constraints.
The value of Gibson's insistence on the information in the stimulus is that it maintains a healthy pressure on the perceptual theorist. It is all too easy to suppose that we see what we see because that is the representation that we already have prepared. We need to understand
research that regards the measurement of luminance increment
the stimulus thoroughly belore we can determine how much room for
thresholds, say, as an essential early step in the understanding of
manoeuvre it leaves the subject in selecting his representation. When
has been said, however, there is no reason why we should accept
vision. A more sophisticated defence of such experiments is QUite
this
possible, one that admits that brighlness sensations are not elements
Gibson's assertion that problems of mechanism dissolve away, or that
of perceptiOn but instead takes brightness sensations as specially informative about a process of strnulus analysis that isalso integral to the early stages of perception. However, we can hardly claim that some elementary analytic process is a building block in the perception of our environment, without a prior analysis of what properties of the stimulus we need to analyse. The doctrine ol direct perception served to emphasise this need to llhink through what various stimulus proper ties could actually tell us. In las, Gibson's approach prefigured an Important part of the style, and somet imes the content, of modern computational approaches to vision. As Ullman points out, Gibson shares with the recent work of Marr an emphasis on how a functional
analysis of perception must rest on the recognitJon of ecological constraints. The second opponent Gibson took on was the d i ea of "unconscious
they can be swept into the lap of some other science.
by Bruce Bridgeman PlychoiDQY Boerd ofSlud/el, Unlverolty of Cel/romt.. S.nle Cruz, Celli. Q6064 Direct perception and a call for primary perception Ullman introduces Gibson's d i eas about direct perception In the context of the Gibsonian denial that "copying, storin�. comparing, and matching" (Gibson 1966, p. 39) have any role n i ecologically valid perception. The context In which Gibson introduced that claim is the old problem of the apparent stability of the visual world despite eye movements. It will be useful to examine this case, because It proVides a particularly clear example of both the strengths and weaknesses of the concept of direct percepon. it
inference" and its progeny. Experiments like the Ames room are taken
Gibson's solution of the problem of visual stability is disarmingly
to show that stimulus information is not enough to determine percep
simple. Instead of proposing a mechanism for the stabilization process,
tion: a knowledge of lhe WOI'Id is also required lo select the correct (or
he says of the visual world, "Why should it move?" The mobile retina
rather, most likely) interpretation ol stimulus information. Gibson stressed how impoverished these demonstrations were compared with
scans a stable optic array, and the properties of the optic array itself (Its large size and tack of relative motions within it) deftne it to be
the ordinary circumstances of perception (particularly the ordinary
stable. This is perhaps Gibson's most compaRing appHcation of direct
circumstance of a moving observer) in which stimulus information was
visual perception, because instead of solving the problem, he rede
available to specify the environment much more completely. This is an argument in which bolh sides seem to be right (or maybe, both wrong). Unconscious inference can be, and has been, invoked to avoid the scientific task of analysing what detailed Information Is present in real
fines it out of existence. The only remaining problem is how we can then see motion at all, and relative motion becomes the only specifier of motion of objects in the environment. This inversion of the problem has an Intuitive appeal because of its
stimuli, and what "low-level" processes are needed to extract II. On
simplicity: if there is no information about motion of the visual world
ltkJsion, given inadequate stimulus infOI'mation we do arrive at percep tions which are at any moment complete and unambiguous, even though different solutions to the perceptual problem may occur from moment to moment. As unman argues, it Is highly unlikely that
other complications. The theory also makes some predictions, for if
the other hand, as Ullman brings out In his discussion of the Mach
382
1HE BEHAVlOAAL A/IV BRAtl SCIENCES (1980), 3
impinging on the receptors, there is no need for extraretinal signals and exlraretinal signals are irrelevant to the perception of stabUity, then changes In those signals should not alter perception. The simplest test of the theory is the classic experiment ol pushing
Commentary/Ullman: Against direct perception on
the
outer canthus
of one eye, and observing a movement ot the
entire visual world. The movement of the world which is seen in this case ought to be impossible, according to Gibson, for movement of the image ought to specify movement of the eye rather than a disconcerting destabilization of the world. One might object that this situation is so far outside the visual system's "design specifications" that unpredictable consequences do not speak to the theory's vertdity under more natural conditions. but the more natural experiment of simulating the sensory consequences of a rapid eye movement with a tap or rapid press on the eyeball yields the same result.
ered as Perceptual Systems. is that senses should be considered as modes of perceptual attention rather than as channels for sense data. Perceptual attenlion must be an internal process rather then an affordance in the enVironment. Given this ambiguity in the Gibsonian formulation, and its other limitations which unman has so ably pointed out, how are we to circumvent the problems while retaining the value and richness of Gibson's approach? I suggest that the emphasis be changed from "direct" perception to "primary" perception, borrowing the term from John Locke's distinction between primary and secondary qualities. In
Other experiments yield results jUst as damaging to the Gibson
this refonmulation, primary perception is deflned by those processes
hypothesis. II appears that the compensation system. which Gibson claims does not exist, not only innuences perception but assumes that the mechanical adVantage of the extraocular muscles is the same
which lake place quickly and without special attention or cognitive
throughout the range of eye excursions. This can be seen by moving
effort. They are not bound by unman's requirement of simpficity like direct perception, but seem immediate and auto matic to the perceiver because they take place below the level of consciousness. The
the eye to the edge of Its excursion range, then making a rapid
mechanisms end algorithms constituting primary perception are the
movement to another point on the edge of the range (Helmholtz 1963). This time the jump of the world is not excused by unnatural reaffer
default mode of a sensory system; in vision these processes might
ence, as was the case in the first experiment. If perception at the edge
ol the fixation range is also interpreted as too extreme a condition, a third demonstration will give similar r.esults. Simply alternate fixation rapidly between two polnts In the wor1d, and aftet a lew movements the world wih begin sliding beck end forth even if the natural limit of about
include binocular vision, transformations of solid objects, constancies, and so on, while in hearing they might n i clude the several mechanisms of binaural tocaUzation, and the like. Perceptual tasks which require special effort, such as counting the numbet of objects in a large arrey, would require alternate nonprimery modes of perceptual processing, but the processes need not be qualitatively different . The task of
live rapid eye movements per second is not exceeded. If the world were stable by definition, none of these manipulations
ecological optics, and analogous concepts for other senses, is then to
should have resulted in apparent motion of the world. The Gibsonian
system, what Information in the world the system is buill to extract.
formulation appears not to be working even in situations very close to
those Which obtain in everyday visual exploration. The stability of the visual world turns out to be a surprisingly fragae affair, upset by the smenest variation in conditions. Some kind of COfollery discharge or
6l(fraretinat signal seems necessary to account for these failures of stabaization, and for several others not mentioned hete. But these theories have their own problems, for the Inverse reason - under n-ormal conditions the world doesn't seem to jump even slightly during an eye movement. This result is contradictory to the best pertormence which could be expected from a corollary discharge or similar mecha nism. for these models postulate a signal which arises from oculomotor control centers to
compensate for or take
into account the sensory effects of the movement. But this is a feed-forward process, and some drift or error would be expected in such a signal according to control
discover what properties of the world are primary for the perceptual
byJonathan F. Doner and Joseph S. Lappin OfiiU/Iment ofP•yt:/10/ogy, V•llderbiJ/IInlveraJty, N..llvlfle, r...,. 37240
The function and process of perception
1. Indirect repre1ent•t1on of direct lnformst/on'l Two propo· sitlons run through James Gibson's discussions of direct perception: .
1. AH of the infOfmalion specifying environmental structures and events is directly available to the perceiver in sensory stimulation. Perception is the unmedlated utilization (the "direct pici<-{Jp") of this information by en animal coorcfmting Its actiVities With the environ ment.
theory (unless one assumes that the system is perfect, built without
For Gibson, these two propositions are interdependent. The proposi
error). !See also Roland: "Sensory Feedback to the Cerebral Cortex
tion that the information lor perception is directly available in the sensory stimulation indicates that the function of perceptual processes
during Voluntary Movement in Men BBS 1(1) 1978.1 We are left with a dilemma - feed-forward systems are not accurate enough to do the job, but feed-back systems are too slow. The world doesn't seem to jump, and then jump back after each eye movement:
only the predictive feed-forward models have any hope of producing a zero-latency compensation.
There is now more direct evidence that any compensatory signal
is simply to acquire this information. Ullman's thesis, however, Is that these two propositions must be dislinguished: '"ecological optics' and 'infonmation processing' . . . play largely distinct roles. . . . the fact that reliable information exists in the light array does not entail that processing Is unnecessary." In the end, Ullman accepts the proposition that sensory information is sufff· cient for perception; his arguments against direct perception are
must be very rough; in psychophysical experiments, human subjects are insensitive to artificially induced jumps of the visual world during rapid eye movements if the jumps are less than one-fifth to one-third of
directed at the second proposition. According to Ullman, a character!·
the size of the eye movement (Mack 1970; Bridgeman, Hendry & Stark 1975). Perhaps, then, the compensation problem Is soli/ad by nature
lions is "indispensable In bringing the gap between . . . function and mechanism. "
using a dual mechanism , pertonming most of the compensation with a rough but fast feed-forward signal and laking care of the rest with a Gibsonian stability-by-deftnilion, perhaps with a qualtative extraretilal signal informing the system that small mismatches between retinal and
exlraretinal signals are to be Ignored. In more psychological language, attention is directed not to jumps of the retinal signal but to the new
zallon
of computational processes defined over Internal represents·
The aim of our comments below is, first, to evaluate Ullman's
rellonale for the necessary mediating role ol internal representations and processes, and second, to show how the sufficiency of sensory Information constrains the function of perceptual processes in a way consistent with Gibson's Ideas about direct perception.
2. Ullman'• el'/detJce. Ullman supports his case against direct
object of fixation. If attention is distracted, for instance by repeated
perception by giving the following illustrations a oomputational/repre·
fixation between two objects, apparent displacement will appear just
senla tional interpretellon: (a) an analogy with electronic calculators,
as It will if the mismatch between expectation and afference Is too
(b) Mach's Uluslon, (c) Shepard and Metzler's experiment
great.
rotation, and (d) the Marr and Poggio model of stereopsis. However, each of these illustrations may be interpreted es an operation on
The conclusion Is that nature is not taking sides in the Gibson versus neurophysiologlsts debate, but s i using both techniques. But the
on
mental
env'�ronmental information end thus consistent wi t h direct perception. Consider first Mach's Illusion. In this case, the two alternative
Gibsonian idea has emerged as somewhat different from the way he described it, becoming en attentional bias rather than a redefinition of the problem. Yet this top-down orientation has always been Implicit in
patterns that are perceived correspond to potential alternative physical configurations; the perceptions are defined on these physical configu
Gibson's analysis; the thesis of the 1966 book, The Senses Consld·
rations. Thus, the observer's knowledge of the true 3D configuration
does not eliminate the illusion. And, contrary to Ullman's suggestion, we have no independent, non-sensory evidence of the observer's internal state with which to predict the perception.

With respect to Shepard and Metzler's experiment, Ullman suggests that mental rotation requires a "transformation of internal representations." We note, however, that these hypothetical representations were defined on the 3D organization of the stimulus patterns, which contained all of the information necessary for making the identification. In addition, since the effect of angular disparity held regardless of direction of rotation (clockwise versus counterclockwise), subjects must have perceived the appropriate direction of rotation prior to determining the correspondence of the two patterns.

Although Ullman suggests that the Gibsonian approach of detecting invariances in the two monocular inputs is "too simplistic," the Marr and Poggio (1976) model of stereopsis seems to perform just that function. According to Marr, Palm, and Poggio (1978), "the central problem" solved by their cooperative stereo algorithm is "to find a correspondence" between "two primitive descriptions" of the left and right images, satisfying two general constraints on the physical structure of environmental objects. Marr and Poggio (1976) also emphasize that the specific details of their algorithm are "not fundamental to the nature of the information processing" since "the nature of a computation that is carried out by a machine or a nervous system depends only on the problem to be solved, not on the available hardware." Contrary to Ullman's suggestions, cooperative, self-organizing processes of the sort employed by Marr and Poggio seem consistent with (though not explicated by) Gibson's discussions of direct perception.

Finally, we consider the electronic calculator analogy. Ullman argues that the "theory of arithmetic is . . .
insufficient for the mapping of arithmetic operations onto the mechanisms within the system." If this were true, however, neurophysiology would be fruitless, since it is based on the assumption that function is recoverable from the structural form. With respect to the calculator, the internal structure must not violate the theory of arithmetic, and the patterning and functional significance of the electronic events inside the calculator is defined by reference to this theory.

3. The function of processing. We now examine the function of representations and the function of computation. Our goal is not to argue that processing is unnecessary, but to show how accepting the proposition that sensory information is sufficient alters the conception of the function of processing.

By definition, no representation can violate or arbitrarily recode the structure of the information it claims to represent. Neither can a representation contain more information than exists in the array. Such a representation would function like Maxwell's demon in violation of thermodynamic law. In general, whatever information is directly available in the representation(s) of an array must also be directly available in the array.

This also holds for computations over a representation. Any operation performed on an informational array constitutes a computation. However, no computation which arbitrarily scrambles the information can have any functional significance. Thus, the nature of the computation must be compatible with the information it computes, and must in some sense embody this information. For example, the internal logic of a calculator embodies the set of potential inputs. If this were not so, it would be a poorly designed device.

Calculators and computers are designed by man to fulfill his specific purposes. Organisms, on the other hand, are designed by natural selection, where there are no predetermined purposes.
How then is it possible for representational and computational devices to evolve? If the information in the array were only made functional by mediational processes then this evolution would be impossible, since the determinant for the construction of these processes would only be available as a result of their operation. Thus, the information directly available in the array must be a primary selective pressure in the evolution of the representations and computations employed by the system.

We can summarize the case this way: granting the existence of some sort of processing (and this seems necessary since we do have brains), the perception of sensory information is still immediate since
this information is directly available at all levels of representation and is a prime determinant in the evolution of these representations and computations. In other words, this information perfuses the entire system; it is never generated by the system. In light of this, it seems reasonable that processing functions to coordinate action and environment by maintaining the coherence, stability, and availability of the information directly given in the sensory array. This is, after all, the central problem the system faces; and it is by no means trivial. If at any point from sensation to action this information is distorted or confused, the result is a nonfunctional organism.
Acknowledgment
Preparation of this manuscript was supported by NSF Research Grant 78-05857.
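
The claim above that a representation cannot "contain more information than exists in the array" can be given one possible formal reading. The sketch below is an illustration only, under an assumption the commentary itself does not make: that "information" means Shannon mutual information. The symbols E (environmental state), A (sensory array) and R (internal representation computed from A alone) are introduced here for the purpose and do not appear in the original text. Under that reading the claim is the standard data-processing inequality:

```latex
% Assumption: "information" is read as Shannon mutual information I(.;.).
% E = environmental state, A = sensory array, R = internal representation.
% R is computed from A alone, so E -> A -> R forms a Markov chain.
\[
  E \longrightarrow A \longrightarrow R
  \quad\Longrightarrow\quad
  I(E;R) \;\le\; I(E;A).
\]
% The chain rule gives I(E;A) = I(E;R) + I(E;A \mid R), so equality holds
% exactly when the representation discards nothing relevant to E:
\[
  I(E;R) = I(E;A) \iff I(E;A \mid R) = 0 .
\]
```

On this reading, processing can at best preserve what the array carries about the environment. The inequality is silent on how easily that information can be used at each stage, which is the point on which the two positions continue to differ.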
by William Epstein
Department of Psychology, University of Wisconsin-Madison, Madison, Wis. 53706
Direct perception or mediated perception: a comparison of rival viewpoints
I share Ullman's reservations about the prospects for formulating a
theory of direct perception, and I have expressed these concerns elsewhere (Epstein 1977a; 1979; Epstein & Park 1963). In this commentary I wish to clarify the differences between a theory of direct perception and a theory of mediated perception and to suggest the type of work which is needed to allow a decision between the two views. The rival formulations differ chiefly with respect to the following matters:

1. Information and stimulation. It is generally asserted that the mediational approach (often called constructivism) regards stimulation per se to be uninformative while proponents of the stimulus theory (often called direct theory) aver that there is order and information in stimulation. Although this characterization has a measure of truth, a number of qualifications warrant notice. It is not to be supposed that for the mediational theorist optical input is totally uninformative, that is, that optical input does not reduce the number of acceptable perceptual representations or the number of likely environmental events. Even the fundamental rule of perceptual processing enunciated by Helmholtz, the archetypal constructivist, assumes important restraints on perceptual construction, which restraints originate in stimulation. "Such objects are always imagined as being present in the field of vision as would have to be there in order to produce the same impression on the nervous system (emphasis not in original), the eyes being used under normal conditions" (Helmholtz 1963 3:2). Among contemporary constructivists, Rock (Rock 1977; Rock & Sigman 1973; Sigman & Rock 1974) in his experiments purporting to show intelligence in perception, refers to the "sensory support" for particular perceptual solutions to imply that the sensory input limits the number of possible interpretations. Again, if we mean by information that which limits choices, or reduces uncertainty, then stimulation is informative.
On the foregoing reading of mediational theory the difference between mediational theory and direct theory on the question of information in stimulation is this: the direct theorist insists that any specific optical input will specify, and be uniquely associated with, a single environmental array or event, no matter how complex, and that each and every optical input will have only one possible perceptual correlate. Optical input reduces the uncertainty with regard to environmental properties or events to zero. The constructivist, on the other hand, insists that there is an irreducible uncertainty in optical input.

2. Mediational processes. Any reading of the mediational approach reveals the importance assigned to processes which mediate between stimulation and perception. In contrast, the direct theory allegedly eschews all reference to intervening processes. I think this description of the difference between the two views is overstated. The difference is not that one account posits intervening processes while the other does not. Both views posit intervening processes, but
the nature of the intervening processes differs. For the constructivist the mediating processes operate on incoming information to create new information. For the theorist advocating direct perception the process is one of extracting preexistent information. The visual system is said to have evolved mechanisms, for example, neuronal structures, which are specialized for the pick-up of variables of stimulation which are informative. But pick-up is not always automatic. This much is implied in the distinction between "potential" and "effective" stimuli (Gibson 1966) and in the claim that perceptual learning is a process of learning to respond to variables of stimulation previously neglected (E.J. Gibson 1977).

3. One stage or multistage processing. In the constructivist view, whether it adopts the classic metaphor of the natural, intuitive geometer or problem-solver (Gregory 1974; Rock 1977) or the more contemporary metaphor of the computer program (Minsky 1975), perceptual processing is a set of operations, sequential and parallel. Each operation yields a representation, propositional or analogic, which is transformed by the subsequent operation into a new representation, finally culminating in the experienced reportable perceptual representation. The theory of direct perception considers these internal operations and the representations that are alleged to be their products to be superfluous theoretical inventions. Perception is directly a function of information which is picked up from optical stimulation.

4. Recourse to memory, schema, anticipations. "Where little is enough, more is too much." This dictum, attributed to Darwin, could serve to explain the attitude of the adherents of stimulus theory toward the involvement of knowledge structures in perception. Inasmuch as perception can be shown to be controlled by information in stimulation,
or so it is alleged, recourse to extrastimulation knowledge is superfluous. Perception should be explainable "without reference to memories, concepts, or any other form of epistemic mediation" (Turvey 1977, p. 85). In contrast, the mediational formulation assigns a vital role to long-term memory, schema, and anticipations.

Much of the effort devoted to deciding between the two views has addressed the first and last of the four differences. Usually the case for direct theory has consisted of a mix of psychophysical data purporting to establish the hypothesized one-to-one relation between stimulation and perception, and logical polemic intended to undermine the claims of mediational theory on the fourth point. Turvey's (1977) analysis is a good illustration. Advocates of the mediational approach have tended to focus on the fourth point, adducing evidence of the vital contribution of schemata, memory or reasoninglike processes in perception. Rock's (1977) essay "In Defense of Unconscious Inference" is the best example of this emphasis.

Although these efforts have yielded significant returns, it is essential to supplement this work by directing attention to the other points of difference. Is the relationship between stimulation and perception a one-step relation or is the relationship constituted of a multistep chain of stages or operations? And if the latter is the preferred answer, as it is for mediational theory, how can evidence about the nature of the mediating operations be adduced? There is very little direct experimental evidence bearing on the former question, and little more than appeals to intuition have been brought to bear on the latter question.

In a series of experiments (Epstein, in prep.; Epstein & Hatfield 1978; Epstein, Hatfield & Muise 1977), based on the premise that backward masking arrests or interrupts perceptual processing at the point in time that the mask is introduced,
we have tried to determine if the perception of shape-at-a-slant can be meaningfully decomposed into the stages or operations implied by the "taking-into-account" (Epstein 1973; 1977b) model of perceived shape. The principal findings have been that (a) a poststimulus mask at ISI = 0 causes the subject to report that a shape which is rotated in depth appears to have a shape which corresponds to its projective dimensions, and (b) as the ISI increases, the perceived shape gradually approximates the objective shape. We take these observations to imply that (a) a perceptual representation of the iconic properties of the optical input precedes the distally oriented percept, and (b) that additional operations occurring in real time are necessary for the generation of the distally oriented, adaptive shape percept. The masking experiments have not proven useful in establishing the characteristics of these additional operations,
and we have turned to other experimental procedures for this purpose.

by Stephen Grossberg
Department of Mathematics, Boston University, Boston, Mass. 02216
Direct perception or adaptive resonance?
Dr. Ullman's article nicely describes an important dispute concerning the levels of analysis that are appropriate in a perceptual theory. In weighing Ullman's versus Gibson's position, I have also found it helpful to use several levels of analysis. One level is robust, and asks questions like: What are the authors driving at on an intuitive level? How much truth is there in each position? The other level is more precise, and asks questions like: does Ullman's particular view of "computations defined over internal representations" successfully explain Gibson's intuition that "the perceptual system resonates so as to pick up information" (Gibson 1967, p. 168)? Or Schrodinger's (1958) insight "that as a process is perfected in the course of evolution, it 'drops out of consciousness,' and becomes inaccessible to introspection"? Or Neisser's (1976) uneasiness with processing and "still more processing"? Is Gibson's two-level scheme that distinguishes "information content" and "mechanism" inadequate in one way, but Ullman's three-level scheme that distinguishes "function, algorithm, and mechanism" inadequate in a different, even complementary way?

On the robust level, I agree with Ullman that the study of internal representations and their functional properties is a crucial step toward understanding perceptual events. I also agree with Gibson, however, that the perceptual event is an activity of the whole system as it manifests resonant properties. Ullman's view of computation does not seem to embrace the idea of resonance, and Gibson's emphasis on the resonant event has caused him to underestimate the functional substrates from which resonances emerge. In making these statements, I have carefully avoided the words "algorithm" and "computation," because these words are often used in a way that deflects the study of internal representations from directions wherein Gibson's intuitions can be mechanistically understood.
Ullman forcefully argues that "the intermediate algorithmic level is indispensable in bridging the gap between the levels of function and mechanism." This is certainly true when the system under investigation is a computer. In studies of physical reality, the algorithmic level is less clear. The successful physical theorist uses his intuition to select an appropriate level of physical experience. On this level, the theorist defines a process, or mechanism, unfolding in real-time and analyzes the functional properties of this process, in particular entities that are measurable by experimental devices. If Ullman would call this analysis an "algorithm," then it is an algorithm quite different from some algorithms that have been used to analyze perception. In particular, the functional properties of the physical system are also expressed in real-time, and therefore questions like their stability through time become central issues in choosing one theory over another. Algorithms as they are often implemented on the computer totally ignore the stability question. Once the program has been executed, it simply shuts off. This stability question lurks, however, at the heart of Gibson's, Schrodinger's, and Neisser's remarks.

I have elsewhere derived a theory of code development (Grossberg 1976a; 1976b; 1978; 1980) which directly attacks the question of how a cognitive or perceptual code can develop which is stable enough to endure the erosive effect of irrelevant environmental fluctuations, but plastic enough to correct coding errors or adapt to relevant environmental changes. This theory is described in real-time, so the boundary between mechanism and algorithm is vague, which Ullman might dislike. The theory also exemplifies the fact that a variety of functional transformations work together to compute perceptual events, which Ullman might like. However, the perceptual events in the theory are, in a literal mathematical sense, resonances, which Gibson might like.
From the perspective of my theory of adaptive resonances, I can sympathize with Gibson's position. The "computations" that go on in my networks might never change, but in different environments, and as time goes by, the networks' resonances can change drastically. In this limited sense, the local computations are irrelevant. In a profound sense as well, many neural potentials and signals, which are derived by perfectly good local computations, are perceptually irrelevant until they are bound together by resonant feedback. Only then does a perceptual event occur. In this framework, one does not find processing and "still more processing." Rather, certain processing ends in resonance; other processing does not. How the system resets itself between resonances then becomes a central issue. Infinite regress is replaced by a functional rhythm between resonance and reset.
I strongly agree with the spirit of Ullman's remarks in that I believe that understanding how adaptive resonances are generated, and which resonances will occur in what situations, is a central issue for cognitive psychology. I also [...] naturally arise in the description of resonances. But I find Ullman's particular levels, notably the algorithmic level, unnatural for me, even misleading, in that they tend to point away from the truth in Gibson's insight that the perceptual system "resonates to the invariant structure or it is attuned to it" (Gibson 1979, p. 249).
by John W. Gyr
Mental Health Research Institute, Department of Psychiatry, University of Michigan,
Ann Arbor, Mich. 48109
Visual perception is underdetermined by stimulation
While a case against Gibson's theory of direct visual perception can be made, the present paper by Ullman does not fully come to grips with the problem. The task at hand for Ullman is to demonstrate that the relation between stimulus and percept is not one-to-one. That is, he must show that with the stimulus (optic array) constant, changes internal to perceivers have an effect on perception. Experimental or real life instances of this kind are actually not hard to provide. For example, an article by Gyr, Wiley and Henry (1979) in this journal discussed numerous cases in which, with visual input constant, changes in internal feedbacks of the motor system produced changes in perception. In fact, based on reported experimental findings, a theory of perception was outlined in which motor processes are directly involved in perception. More precisely, motor events - via efferent copies, and the like (Evarts 1971) - produce internal expectations about external visual afference before movements (of the eye, and the like) producing such afference are made. The theory argues that internally produced expectations are then matched with external afference, and that it is the nature of this relation which determines the nature of the percept. In such a model it is clear that the connection between the external stimulus and the percept cannot be one-to-one.
The logic of the above approach is, on the whole, not the logic employed by Ullman. Only in his brief review of the illusions first discovered by Ernst Mach does Ullman explicitly and empirically demonstrate that the perception of structure and motion might be a function of two (presumably relatively independent) variables: the incoming image and what Ullman calls the current interpretation of the observer. Here then is a critique of direct visual perception based on the fact that the relation between external visual stimulus and percept is not one-to-one. Nowhere else in Ullman's paper, however, is this particular logic explicitly developed further.
Instead, Ullman's primary tactic in criticizing direct perception seems to be to argue that the perceptual process is decomposable and that the physiology or psychology underlying perception is probably quite complex. To this approach he devotes a great many pages and examples. However, it seems to this commentator that such an approach to the problem can only lead to wasted effort. For regardless of how complex the physiological or psychological mechanism may be, the fundamental question remains whether this mechanism adds information to the perceptual process or is merely a highly sophisticated resonator to external and complete information residing in the optic array, which is Gibson's claim. Certainly Gibson himself recognized the physiology underlying perception as quite complex.
In this connection, Ullman goes into an example borrowed from the calculator. He suggests that if arithmetic were to be done with such a mechanism via table-lookup, the process from the input of an arithmetic problem to the output, or answer, might be described as direct. Should more complicated rule mechanisms be introduced for the machine, however, then the process would no longer be direct, according to Ullman. Here is a good illustration, it seems, of Ullman's failure to come to grips with the real logic involved in criticizing direct processing or direct perception: in either of the above two instances posited by Ullman, the output of the machine would be solely determined by the external input to the machine; it would provide the same answer to the same arithmetic problem. Both instances illustrate a direct process, regardless of their differences in terms of the complexity or decomposability of the processes intervening between input and output in the machine.
The necessary logic required to attack direct visual perception may of course concern itself with more than those processes internal to the perceiver which might independently contribute to perception. Of additional concern to a critic of direct perception might be a detailed examination of the optic array information itself. The question might be whether it has truly been demonstrated that optic array information could be self-sufficient with regard to evoking perception of depth, shape, space, the object, and the like, as Gibson suggests. On this score Ullman is perhaps more careful in his analysis. However, even here his treatment suffers from a tendency to jump to conclusions. He says, for example, that "the search for [Gibsonian] stimulus characteristics that can reasonably be registered directly" has not proved very fruitful. The above conclusion would seem to suffer from a too limited review of the literature. For example, there are attempts - admittedly speculative - by mathematicians which try to derive Gibsonian invariants from the hierarchy of known, or hypothesized, neurophysiological processes in the brain. (See, for example, Blaivas 1975; Hoffman 1966.)
by Frederick Hayes-Roth
The Rand Corporation, Santa Monica, Calif. 90406
Mediating the so-called immediate processes of perception
Ullman's reaction against direct perception theories rests on a substantial understanding of computation and vision. I would readily ally myself with his conclusion that perception requires representations and mediating processes; hence, so do theoretical explanations of perception. However, I believe Ullman has presented a weak argument against the Gibson position that somewhat confuses the issues. In particular, he attacks the idea of immediate perception by first assigning only two possible meanings to the concept of "immediate" and then rejecting "immediate" perception as implausible.
The weakness of his argument stems from the fact that his two suggested meanings do not exhaust the space of plausible meanings, so one could simply dismiss his argument as illogical. The first sense of "immediate" he considers would treat perception as immediate when all preperceptual processes lie outside the perceptual system itself. He rightly rejects such an extreme interpretation. In the second case, he suggests that "immediate" should describe any system with nondecomposable functions. Hence, by showing that several exemplary perceptual functions reveal intervening variables and subprocesses, he claims to have rejected the other plausible interpretation of "immediate" perception.
This approach to criticizing Gibson, while somewhat flawed, does reveal weaknesses in Gibson's theory as well as strengths in Ullman's conceptual framework. Ullman has attempted to define and operationalize "immediate perception," because Gibson has failed to do so himself. We should realize the difficulty this presents to a critic like Ullman: he must first add enough straw to the phantom to have a straw man worthy of destruction. Ullman, on the other hand, has placed his interpretation of direct perception (or "ecological optics") on the top of a three-level theory that also includes information processing and physiological levels of analysis.
A stronger attack on the theory of "immediate" perception would rest on the necessity of mediating processes. I have argued previously that perception requires specific kinds of mediations, in particular,
memory or computational "state." My own criticism of the immediate perception theory (Hayes-Roth 1977) was formulated in reaction to Turvey's version of the theory (Turvey 1977). Because both Gibson and Turvey adhere to similar positions, my previous comments bear repeating in this context.
Immediate perception theories eschew the use of mediating conceptual structures and associated recognition and classification algorithms. Computer scientists now commonly believe that the recognition and classification of even simple graph structures necessitate the representational power of the first-order predicate calculus, plus the computational power of serial pattern-matching procedures that successively test alternative interpretations of stimuli by matching perceived objects to variables in the general pattern templates. Thus, perception of even some simple line drawings requires just the sort of mediations that information-processing theories postulate. For a more detailed explanation, I refer the reader to my previous article (Hayes-Roth 1977), but I will summarize the situation in the following paragraph.
The need for mediation (memory, sequential algorithms) arises from the impossibility of directly detecting whether an arbitrary pattern occurs in the visual field. To detect such a pattern (for example, a triangle or a regular graph of n vertices), the visual system must consider alternative sets of vertices and lines to check whether all the necessary, pattern-defining conditions hold. A priori, the environment may exhibit an arbitrary pattern in myriad different configurations. To guarantee immediate perception of these would require effectively prewiring as many logic networks as possible configurations. Because the number of possible configurations increases exponentially with the number of lines in the pattern and in the visual field, no a priori bound on the complexity of such networks exists.
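The combinatorial point can be made concrete with a small sketch. The Python below is illustrative only and is not taken from Hayes-Roth (1977): the function names `find_triangles` and `candidate_bindings`, and the representation of a line drawing as a list of vertex pairs, are assumptions introduced here. It shows a serial matcher that binds candidate vertex triples to a triangle template one at a time, and it counts how many candidate vertex sets a template of a given size admits.

```python
from itertools import combinations
from math import comb


def find_triangles(edges):
    """Return every vertex triple whose three connecting lines are all present."""
    # Treat each line as an unordered pair of vertices.
    lines = {frozenset(edge) for edge in edges}
    vertices = sorted({v for edge in edges for v in edge})
    found = []
    # Serial matching: bind each candidate triple to the template in turn
    # and check every pattern-defining condition (all three sides exist).
    for a, b, c in combinations(vertices, 3):
        sides = {frozenset((a, b)), frozenset((b, c)), frozenset((a, c))}
        if sides <= lines:
            found.append((a, b, c))
    return found


def candidate_bindings(n_vertices, pattern_size):
    """Number of unordered vertex sets a matcher may have to test."""
    return comb(n_vertices, pattern_size)


# Hypothetical line drawing: one triangle (A, B, C) with a tail C-D-E.
drawing = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D"), ("D", "E")]
triangles = find_triangles(drawing)
small_case = candidate_bindings(5, 3)
large_case = candidate_bindings(20, 10)
```

With 5 vertices there are only 10 triples to test, but a 10-vertex template over 20 vertices already admits 184,756 candidate sets. A serial matcher with memory can work through such candidates one at a time, whereas one prewired detector per configuration would need as many networks as there are candidates.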
Moreover, the number of networks required grows so rapidly that immediate perception appears infeasible: neither natural nor artificial systems can immediately recognize patterns of arbitrary complexity. Ullman cited the prohibitively large number of stimulus patterns in a similar argument to support the need for processes and rules of formation over direct coupling of input-output pairs.
As an alternative to more and more networking, we indirect realists postulate more and more information processing. Many problems that appear extremely complex in one system become tractable when memory, appropriate representations, and matching procedures enter the picture. As a simple example, most natural languages do not conform to the very restricted syntactic structure of regular grammars and, as a consequence, these languages cannot be recognized by devices without pushdown stacks (or equivalent memory apparatus). Thus, if we should demand arbitrarily that theories of speech perception exhibit "immediate" recognition, or even if we permitted state memory but disallowed register memory, we could not model speech understanding. Gibson and his followers have, I believe, adopted a similarly untenable position in the domain of visual perception.
Since publishing my earlier critique, I have developed with the help of my colleagues at Rand a possible compromise between the two theoretical positions. Not all of the immediate perception theory can be salvaged, because it is untenable in the ways previously mentioned. The new theory retains its emphasis on immediate coupling between stimulus and interpretation, but it postulates a model-based feedback loop to mediate this process. Basically, an observer builds an internal model of his environment and perceives by fitting the model to the situation. Thus, from the point of reference of the viewer, the model simulates the features of the scene, which simply confirm the interpretation. From the environmental reference point, the objects in the scene adjust the parameters of the model to make it conform to the observable features. For this overall scheme to work, a person in a new situation would require some time to construct the initial model. The model would embed possible alternative values of as-yet-unspecified features (such as the sex of the observed person, the shape of his nose, and his exact orientation) as feasible parameter value ranges. Finally, the parameter tuning process would require a feedback loop that selected correct parameter values. (Similar analog-to-digital conversions typically accomplish this by successively testing the goodness-of-fit between the possible model parameters and incoming
signals.) The feedback loop could approach instantaneous reaction times in particular domains with suitable machine architectures.
In summary, the Gibsonians have emphasized correctly the value of information that derives from spatial and temporal organization in the environment. They have erred by asserting dogmatically that this information could directly support perception without mediation. Ullman has provided several empirical counterexamples to their theory. I have tried to elaborate his theoretical objections and have described briefly an opportunity for theoretical synthesis that we are currently attempting to develop.
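The parameter-tuning loop described above can be sketched as a successive-approximation search, by analogy with the analog-to-digital conversion the commentary mentions. This is a minimal sketch under stated assumptions, not the Rand model itself: `fit_parameter`, `predict`, and the single scalar parameter are hypothetical, and the model's prediction is assumed to increase monotonically with the parameter.

```python
def fit_parameter(observed, predict, low, high, steps=20):
    """Narrow a feasible parameter range [low, high] by successive approximation.

    Assumes predict(p) increases monotonically with p, as in a
    successive-approximation analog-to-digital converter.
    """
    for _ in range(steps):
        mid = (low + high) / 2.0
        # Goodness-of-fit test: compare the model's prediction with the signal.
        if predict(mid) < observed:
            low = mid   # model undershoots, so the parameter must be larger
        else:
            high = mid  # model overshoots or matches, so tighten the upper bound
    return (low + high) / 2.0


# Hypothetical model: the predicted image feature is twice the parameter.
estimate = fit_parameter(
    observed=3.0,
    predict=lambda scale: 2.0 * scale,
    low=0.0,
    high=4.0,
)
```

Each pass halves the feasible range, so the number of tests grows only with the desired precision. This is one way to read the suggestion that such a loop could approach instantaneous reaction times once an initial model with bounded parameter ranges is in place.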
by Geoffrey E. Hinton
Program in Cognitive Science, C-009, Center for Human Information Processing,
University of California, San Diego, La Jolla, Calif. 92093
Inferring the meaning of direct perception
Everything Ullman says is very sensible, but it is not entirely satisfying, because it fails to answer the most puzzling question about J. J. Gibson's views: how could someone who says so many sensible things about perception maintain that perception is direct and does not involve computation? Either Gibson is being very silly or there is a deep misunderstanding about what it means for perception to be direct. I shall try to show that although Gibson expressed his views in a confusing way, their main content is not really in conflict with the computational view that there is more to perception than meets the eye.
Ullman's example of doing simple arithmetic is very helpful in elucidating one aspect of what Gibson means by direct perception. There is a level of description at which doing arithmetic involves a number of steps, but recalling that 3 plus 4 makes 7 is a single step. This level corresponds quite well with the "naive" or "everyday" psychology that we use to get through the day. People can describe a sequence of steps that they went through in performing a complex addition, but they cannot generally give any decomposition of the step of recalling that 3 plus 4 makes 7. It just does. Similarly they cannot say how they saw the digit 3. This is an unanalysable single step in everyday psychology. One thing that Gibson means by saying perception is direct is that unlike doing arithmetic or following an argument, it does not involve a sequence of mental operations of the kind that we can introspect upon or instruct others to perform. This seems to be right, and it has important implications for any theory of perception. It suggests, for example, that Helmholtz's analogy between vision and conscious inference should not be taken too literally. Digital computers have influenced theories about human perception and cognition in two rather different ways. By providing a medium for building wori
because a distributed parallel system can afford to do a lot of computation to extract complex static and dynamic properties of the intensity array. Early computer vision researchers tried to reduce the input data to a much sparser, more manageable symbolic representation. They believed that they could overcome the inadequacies of this representation by using complex control structures to facilitate clever inferences, often based on knowledge of particular objects. More recently, the failure of this approach to yield fruitful results has led to a great deal of attention to the detailed structure of the intensity image. For example, gradual intensity changes that were previously regarded as noise for an edge-finder are now seen to provide the basis for shape from shading. Gibson was wrong, of course, in believing that there is no interesting level of description between everyday psychology and neurophysiology, and his approach has little to offer when it comes to higher-level representations of spatial structure, but he was right in believing that a person performing arithmetic computations or making inferences is a very bad model for perception.
by Gunnar Johansson, Claes von Hofsten, and Gunnar Jansson
Department of Psychology, University of Uppsala, S-751 04 Uppsala, Sweden
Direct perception and perceptual processes
From a theoretical position advocating a direct perception theory including perceptual processing or "computation" we would like to make some remarks on certain parts of Ullman's article. These comments concern (1) the definitions of direct and indirect perception, (2) the lack of distinction between perceptual processing and cognitive problem solving, (3) the three levels of theoretical problem settings. Recently, we have reviewed and analysed current positions and definitions concerning the relation between direct and indirect perception, and we refer to this article for a more detailed discussion (Johansson, von Hofsten & Jansson 1980).
1. The definitions of direct and indirect perception. Ullman's distinction between the direct and indirect type of theory differs from the conventional one, and probably this makes the title of his article misleading. Ordinarily, presence or absence of processing is not regarded as the critical feature separating these two types of theory; rather it is the putative type of processing that is the basis for the distinction (cf. our review and Rock 1977).
The indirect type of theory in its classical form was clearly formulated by Helmholtz. From his consistent empiricist position he argued that the raw material in the proximal stimulus, represented in the observer by the visual sensations, is not sufficient to give rise to perception. An unconscious problem-solving process dependent on earlier learning was supposed to determine the perceptual outcome. The direct perception theories, formulated by forerunners like Exner, Mach, and Hering, maintained that the cognitive problem solving was an inadequate and unnecessary construct, the visual system being able to produce percepts without cognitive inference.
In contrast to this traditional distinction Ullman evidently draws the demarcation line between direct and indirect theories, not with regard to the problem solving versus no problem solving characteristic, but instead between direct perception theory without assumptions about processing (J. J. Gibson) and theories assuming some type of processing, direct as well as indirect in traditional meaning. Ullman himself, in his earlier writings and in at least the main part of the present target article, stands out, for us, as a proponent of a variant of the direct-perception type of theory. He has convincingly demonstrated that theoretically there exists sufficient information in the proximal stimulus for valid perception, if a decoding principle about rigidity is presupposed. But if this interpretation is accepted, then the title of the paper is misleading for readers used to the conventional terminology. On the other hand, in his section 4.1, we also found statements that have made us hesitant about his real position (see below).
2. The distinction between perceptual processing and cognitive problem solving. As is well known, there exists in the
conceptual classification of general psychology a distinction between perception and cognition. In short, the term "perception" denotes the organism's recording of information through receptors of various types of physical energy, while "cognition" primarily stands for logical or semilogical problem solving. Visual perception is evident already in the most primitive species and is excellent in insects and mollusks as well as in most vertebrates, while the term "cognition" seems irrelevant and is seldom used in connection with the lower animals. If perception can work in these animals without internal representations during the animal's locomotion in the world, why should these constructs be needed in man? We are very doubtful about what Ullman means when he talks about the interpretation of the observer and about internal representations. Is this an acceptance of the classical indirect perception theory or is Ullman's use of these cognitive concepts just a reformulation of "perceptual processes" and "stages in processing"?
3. Functions, processing principles and mechanisms. We agree with Ullman that a comprehensive theory of perception will have to include the levels of function, process, and mechanism. Ullman points out the danger of not including the process level. We would also like to note the danger of neglecting the function level. Theories of perception have to a too large extent been occupied with just process and mechanism, ignoring why we perceive. Gibson's theory is partly a reaction against this onesidedness, which has sometimes tended to regard illusions as normal cases of perception. It is to be noted that Ullman gives Gibson full credit for his endeavors to open the eyes of perceptionists to the ecology of perception and the richness of available stimulation. However, we would also like to add to Ullman's emphasis on the importance of study at the algorithmic level that finding algorithms appropriate for computers is just a first step in this field. The general task requires finding a type of algorithm that reflects the perceptual type of processing. Finding this will essentially be achieved through systematic experimentation.
In general, we have found a strong tendency toward convergence in the theories of perception of today. The differences between theories are now more a question of aspects of focus, and less a matter of contradictory statements as to the points at issue.
by Rebecca K. Jones and Anne D. Pick
Institute of Child Development, University of Minnesota, Minneapolis, Minn. 55455
On the nature of information in behalf of direct perception
Ullman argues that direct and indirect theories are in agreement that "visual perception relies on the information in spatio-temporal patterns of light," and that "the underlying question on which they disagree is whether the information in these patterns is indeed picked up immediately." In other words, Ullman claims to accept Gibson's theory of information, while rejecting the theory of information pickup. However, Ullman does not accept Gibson's theory of information. If he did, he would realize that a theory of information pickup does not need to include computations to recover information, because in Gibson's theory, information is never lost in the first place. Herein we shall discuss three ideas inherent in Gibson's theory of direct perception. The first is that information is the basis of vision. The second is that information allows perception to be direct and immediate. The third is that the concept of information does not imply a simple 1:1 mapping of stimuli to percepts.
According to Gibson's theory of ecological optics, light reverberating in the air bounces off the stable and moving surfaces in the environment and is structured by them. The result is that at every point of observation there is an ambient optic array that contains information about both the persisting layout of the environment and the events that occur in that environment. At a moving point of observation the structure of the changing ambient optic array can be described in terms of invariants and transformations. For example, the fact that during rectilinear movement, surfaces go out of view and come into view as they are progressively occluded and disoccluded by nearer
surfaces is specified across the transformations in the optic array. The invariant property in the optic array is that when an optic texture element moving along a radial flow line catches up with a slower moving element, it replaces or occludes the slower one (Lee 1980). This type of information, which specifies its source in the environment, is the basis of vision.
Ullman is concerned whether the abstract stimuli Gibson describes could possibly be the "input" to the visual system. He has at least two objections to the hypothesis that it is. First, Gibson's argument for direct perception is an invalid "selective refutation," and since the notion of information as stimulation follows from the acceptance of direct perception, "if direct perception is not admitted, the notion of information as stimulation does not follow." The second objection concerns the fact that "physiology tells us that the retinal receptors register light energy in various regions of the visible spectrum." Ullman argues that if we acknowledge the role of mediating processes in vision, thereby abandoning direct perception, we can accept the fact that light energy is the input to the visual system.
A basic problem with Ullman's characterization of Gibson's position is that it is inaccurate. It is not Gibson's position that complex and abstract stimuli are the input to the visual system. Rather, he has argued that complex information is the basis of vision. The information is available in the ambient optic array, and the active perceptual system seeks it out, detecting it directly. Whereas input is imposed, information is obtained or extracted (Gibson 1966, p. 31).
In regard to Ullman's specific objections, Gibson's conception of information as the basis of vision does not follow exclusively from his acceptance of direct perception. Gibson has given several arguments for why the basis of vision must be considered to be complex information (see Gibson 1979, ch. 4). For example, Metzger's (1930) Ganzfeld experiments revealed that if the light coming to the nodal point of the eye carries no information (that is, has no structure), then it is impossible to accommodate. In other words, visual functioning requires structured light. Even if the retinal receptors register light energy, in the absence of information, perception will fail. Thus, vision is not based on input, be it light energy or complex stimuli; rather, vision is based on the active obtaining of information contained in the ambient optic array.
The second idea to be discussed herein is that the detection of information is direct and immediate, which means that there are no intervening constructive processes. Since completely specific information is available in the ambient optic array, there is nothing to "recover." According to Ullman, Gibson characterized information pickup as immediate because he held that the process of extracting information "has no psychologically meaningful decomposition." However, this is not true. Gibson has argued that perception is direct because of the nature of information, not because there are no psychological processes in perception. Indirect perception would be required only if information were not specific to its source. In fact, Gibson's theory of direct visual perception requires a number of processes for extracting information, some of which Gibson described. For example, in his most recent book, he sketched rules for the visual control of locomotion and manipulation (Gibson 1979, pp. 232-33). Thus, Gibson's is a theory of direct perception, not because he was reluctant to discuss processes of information pickup, but because of the nature of information. Furthermore, Gibson did specify some of the rules for the pickup of information. Although they are not rules for recovering information, they are surely psychologically meaningful processes.
The last idea to be discussed is that the concept of information does not imply a simple 1:1 mapping of stimuli to percepts. Ullman characterizes Gibson's theory as describing "perception in terms of a family of percepts coupled with their specific stimuli" and Ullman argues that such a theory is untenable because it cannot handle the unbounded number of distinct percepts. We agree that any theory hypothesizing a "table lookup operation" or the simple coupling of percepts with stimuli must be wrong. But Gibson's theory contains no such hypothesis. He has rejected the notion that there are discrete stimuli and percepts to be coupled (Gibson 1979, pp. 56-57; p. 238), and he long ago abandoned the psychophysical hypothesis that perception is a function of stimulation.
According to Gibson, the basis of vision is not discrete stimuli but optical structure in flow. Also, perceiving is not a process which results in discrete percepts that could be coupled with stimuli. Rather, perceiving is the detection of affordances by means of information. Information specifying the layout of the surfaces in the environment is available to all organisms. Which affordances are detected at any particular time depends on the organism's species and current psychological state. For example, a hungry human being or horse will detect the affordance of edibility from the information specifying a ripe apple. However, an angry person might attend to the affordance of rigid projectile and fling the apple at an opponent. In this way, information constrains perception, but by no means determines it (Gibson 1976, p. 236). Therefore, there can be no 1:1 mapping of stimuli to percepts.
In conclusion, Ullman is wrong about the basis of his controversy with Gibson. It does concern the relevance of information for visual perception. The arguments Ullman makes against the theory of direct perception reveal a fundamental misunderstanding about the nature of information. According to Gibson's theory, information contained in structured light that specifies its source is the basis of vision. Organisms detect the affordances of the environment directly, by means of this information. There is no question that it is important to describe the mechanisms by which information is obtained. However, by the very nature of information, these will, of necessity, be mechanisms of discovery, detection, and extraction rather than of recovery, construction, and computation.
Acknowledgments
The writing of this paper was supported by grants from the National Institute of Child Health and Human Development to the University of Minnesota, Institute of Child Development (HD 05027) and Center for Research in Human Learning (HD 01136), and by a grant from the National Science Foundation, also to the Center (NSF/BNS-77-22075). Requests for reprints should be sent to either author at the Institute of Child Development, 51 East River Road, University of Minnesota, Minneapolis, MN 55455.
by Samuel Jay Keyser and Steven Pinker
Center for Cognitive Science, Massachusetts Institute of Technology, Cambridge, Mass. 02139
Direct vs. representational views of cognition: A parallel between vision and phonology
Ullman discusses Gibson's assertion that visual percepts are characterizable in terms of abstract properties of the optic array, and that there are neural devices capable of detecting or "resonating to" these abstract properties directly. He shows quite convincingly that the basis for each of these assumptions is dubious. In this note, we would like to point out that the structural phonology which preceded generative phonology was based on the analogous assumption that phonemes were characterizable in terms of phonetic properties which can, in principle, be detected directly in the acoustic stream. Furthermore, the sort of arguments that Ullman directs against Gibson are precisely the sort that Chomsky (1957, 1964) directed against the structuralists. Both critics opted for a theoretical approach - which we will call the representational approach - according to which perception and thought consist of the application of rules which map sequences of symbolic representations onto one another [see Chomsky: "Rules and Representations" BBS 3(1) 1980].
Chomsky (1964) discusses four tenets of structuralist phonology concerning the relation between phonemes, the relevant psycholinguistic units, and phones, the units associated with the physical speech stream. Of these four (invariance, biuniqueness, linearity, and local determinacy), the most illustrative of the point we wish to make is invariance, which Chomsky defines as follows: "The invariance condition asserts that each phoneme P has associated with it a certain set f(P) of defining features (that is, P = Q if and only if f(P) = f(Q)) and that wherever P occurs in a phonemic representation there is an associated occurrence of f(P) in the corresponding phonetic representation." For the purposes of this discussion we shall make use of Bloch's
(1948; 1950) "absolute" version of invariance as the clearest analogy to Gibsonian theory (though it appears likely that a parallel case could be made for relative invariance as well). Absolute invariance consists of two additional conditions. First, partial overlapping of feature sets is excluded. This means, in effect, that if a given occurrence of a phone [P] is assigned to a phoneme /P/, then every other occurrence of [P] must be assigned to /P/. Second, the features in a feature set corresponding to a given phoneme are identified in auditory terms (though defined in articulatory terms). Thus, for any given phone, there is a single acoustic property or a conjunction of properties that defines it.

Let us now consider what absolute invariance entails. Recall that the perception of phonemes is what is under consideration. Absolute invariance asserts that there exists a physical property or conjunction of properties that is uniquely diagnostic of the phoneme /P/; namely, just those properties associated with the phones constituting the feature set of /P/. Clearly, for a phoneme /P/ resonator to exist, the absolute invariance condition must hold. This is because if a certain phone signaled the presence of /P/ on one occasion but the presence of /Q/ on another, then a device sensitive to the physical correlates of that phone could not tell us whether /P/ or /Q/ was present at a given time. It is for this reason that we believe that the absolute invariance condition in phonology is analogous to the Gibsonian assumption that "the information is in the light," ready to be "picked up" by a detection device of the appropriate sort. What counts as evidence against this view, both in vision and in speech?
In the case of vision, Ullman argues that (a) there is no physical invariant corresponding to such percepts as the three-dimensional structure of a moving pattern; (b) optical patterns can be ambiguous, their interpretation depending on the set of the observer as well as on the physical properties of the pattern itself; and (c) perceptual processes such as stereopsis can be decomposed into psychologically meaningful components which would remain undiscovered if the process were stipulated to be direct. Chomsky showed precisely the same catalogue of deficiencies to be true of structural phonology.

Corresponding to (a), Chomsky cited early studies of what was to become a large literature (see Liberman et al. 1967) showing that no invariant physical property is necessary for the perception of stop consonants. For example, Schatz (1954) showed that the physical signal corresponding to /k/ in the word ski is perceived as a /t/ when heard in the context s_ar, that is, as the word star, and as a /p/ when heard in the context s_ul, that is, as the word spool.

Corresponding to (b), consider the following sentences (brought to our attention by E. Walker):
1. I have a ladder and a pole.
2. The [laeDr] is against the house.
The occurrence of the flapped d, symbolized as [D] in (2) above, is identical in the word ladder and in the word latter, causing (2) to be ambiguous: the listener has no way of knowing whether it is the pole or the ladder which is against the house. Sentence (1), on the other hand, is not ambiguous, because the internal syntactic representation built up while hearing the first three words of the sentence allows [laeDr] to be interpreted in only one way.

Finally, corresponding to (c), Chomsky points out that natural language sound patterns can be shown to be lawful only if absolute invariance is discarded, and only if the mapping from phones to psychological entities is decomposed into the application of phonological rules. Since absolute invariance prohibits overlapping feature sets, we may not account for (1) and (2) above by saying that [D] is sometimes an instantiation of the phoneme /t/ and sometimes of the phoneme /d/; we must say that /t/ and /d/ are one and the same phoneme. This obscures many lawful properties of English sound patterns. For example, since [D] represents only one phoneme, we must account for the distinctness of [rayDir] ("writer") and [ra·yDir] ("rider") by positing that [a] and [a·] (lengthened [a]) are two different phonemes. But then it is a mystery why no other English phonemes differ only in length, and why /a/ and /a·/ do not distinguish any other English words. Similarly, absolute invariance as applied above would
require that, for the dialect whose speakers pronounce "throw" as [θDo], the word would have to be phonemicized as /θ/ /t/ /o/. As well as being contrary to our intuitions, this implies that many regularities of English consonant distribution must be attributed to sheer coincidence (cf. Chomsky 1964, p. 99). Facts like these can only be explained with what Ullman would call a "psychologically meaningful decomposition" of the phone-phoneme mapping into rules operating on representations. For example, the writer/rider distinction can be explained in a manner consistent with the rest of English phonology in the following way: the abstract representation underlying "writer" is /raytir/, and that underlying "rider" is /raydir/. English contains the following two rules which must be applied in sequence:
1. Vowels are automatically lengthened before voiced consonants.
2. Medial, poststress /t/ and /d/ become [D].

Theoretical phonologists, building on Chomsky's and Halle's insights, have gone on to develop a highly successful theory of English sound patterns using a rules-plus-representations approach. Ullman and his colleagues appear to be achieving comparable success in their studies of vision. It seems to be a generalization about humans that they are simply not the types of devices that can be understood within a nonrepresentational framework.

Acknowledgments
We are grateful to Sylvain Bromberger, Noam Chomsky, Alan Prince, and Edward Walker for helpful discussions and comments.

by J. J. Koenderink
Department of Medical and Physiological Physics, Physics Laboratory, State University Utrecht, Utrecht, The Netherlands
Why argue about direct perception?
It seems to me that the case for or against DVP (direct visual perception) gets more argument than it deserves. The problem is one of scientific methodology; it is not a question of what the visual system is like. There are at least two senses in which it seems natural to me to speak of DVP. When I have driven the few miles to the lab, immersed in some scientific problem, I often cannot recall at all how I arrived at my desk. Yet no doubt I reacted adequately to traffic signs and the like. It seems natural to speak of DVP here. Another instance where it is natural to speak of DVP occurs when you stand before Leonardo's "Mona Lisa." Because it seems ludicrous to have perceived, say, 10% of the smile, or to have perceived certain "primitives" (pieces of sine-wave gratings?) and from them by a devious computation to have derived the answer "smile," DVP is a good descriptive term here. You either see the smile or you do not. The first example should not be confused with absent-mindedness in a derogatory sense: it is a valid mode of perception and can be vital for survival (as in the DVP of the swordsman axiomatized in the Zen doctrine of "no-mind"). The second example depends on the obvious fact that it is natural to say that you perceive a concept when it is born, that is, at the outcome of some going-on in the mind. (The smile exists only in the mind or not at all.) Of course you need not doubt that if I looked at the Mona Lisa or drove to the lab with chronically implanted electrodes in my brain, then you would have an interesting case for the electrophysiologist. No doubt cells "selectively sensitive" to certain sine-wave gratings or maybe even "traffic light cells" would show hectic activity. Thus it can be natural to speak of DVP, whereas at the same time you need not doubt that many complex processes go on in the brain.
(I admit that Gibson - regrettably - uses the term DVP in a different sense, and I agree with Ullman that this sense is not conducive to scientific research, vide infra.) Thus one confronts a question of definition: explanation in science is the description of the facts in terms of other facts at another level of experience. Thus there is no real difference between mechanistic explanation and phenomenology. The gas law is explained in terms of molecules, but the molecules are not further explained. You are free to decide what you regard as an "explanation." Then you choose the
point at which to refrain from further analysis. Thus, you don't talk about quarks or gluons when you explain the gas law. In general, explanation must use concepts from another stratum of experience (for example, pressure explained in terms of velocities of molecules). Thus, if it is said that an explanation exists if a decomposition into meaningful parts is possible, then it must be stated what is meant by meaningful. The meaning of explanation differs for the psychologist, neurologist, electrophysiologist, biochemist, or what have you. It is largely a matter of taste whether you consider the chaotic motion of molecules as an explanation of pressure: at the molecular level there is no pressure.

Within a single stratum of experience things interact, that is, change quantitatively but not qualitatively. In different strata the concepts ("things") are qualitatively different. Explanation of a concept is explanation in terms of qualitatively different concepts. If the latter concepts are not further analysed (by my choice), the explanation is complete. Only if you still adhere to certain nineteenth-century prejudices can you hope to arrive at a complete explanation not by choice but because nature has nothing further to offer.

In perception the case is more intricate than in physics: if you talk about meaning or information you must specify whether these terms refer to the person having the percept ("me") or another person ("the scientist"). Thus DVP for me may be a complex phenomenon needing further analysis for the scientist. This dichotomy is regrettably played down by Ullman. It is a pity because he so often talks about "information" in a sense that is not clear. In nature there is structure (information in Shannon's sense), but no meaning. Meaning (the kind of information meant by Ullman) exists only relative to mechanisms receptive to it. Only if structure is able to change the state of the perceiver, that is, influence his future behaviour, can you speak of information in the sense of meaning. If you can perceive the solid shape of moving bodies, then it follows that you are receptive to the relevant structures. "Solid shape" is not present in nature but is a mutual property of perceiver and environment. This answers Neisser's question cited by Ullman: "If percepts are constructed, why are they usually accurate?" - the percepts are nature itself. (There is an obvious answer to a variant of Neisser's question: "If scientific concepts are constructed, why are they usually accurate?" For science is nothing but perception extended by different means - to pervert Clausewitz's famous dictum.)

The meaning of a physical measurement exists only because of our theory. It is not in nature (for example, the fact that the meniscus of a mercury column coincides with a certain mark may indicate barometric pressure, temperature, the height of the mercury in a communicating vessel, an amount of radon, and so on, ad infinitum). It is only theory which gives the fact its meaning. In a like fashion the meaning of percepts exists only in our "internal representations." Without such you cannot obtain meaning. Thus you do not "extract" what is already there: what is there depends on me. In this sense I do not become attuned to things: the things are what they are because I am what I am. In this sense the term DVP is a harmless (but also scientifically useless) tautology.

Of course any physical theory and also any "internal representation" is based on recurring experiences, that is on invariances. This also holds true for solid shape (as Ullman concedes in the eighth footnote). But there is no compelling reason for such invariances to be composed of other (simpler) invariances, as Ullman seems to imply. That solid shape cannot so be analysed does not count against the extraction of invariances as such. Also the fact that perception does not utilize all available information is no argument. In the last instance, the basis for any invariant is change, not other invariances. Identity arises out of the neglect of differences.

In summary, I think that there are circumstances in which it makes sense to speak of DVP. These are the instances in which you choose to refrain from further analysis. This is generally the case for the perceiver himself. But it is the object of science to push back the level of analysis as far as possible. This can only be done at the cost of the introduction of qualitatively new concepts. If I want to stop at the Mona Lisa's smile, then DVP is the theory for me. For the scientist a closer study of who knows what is compulsory. It makes the smile no less of an enigma. DVP is no scientific theory exactly because it refrains from explanation, that is from phenomenology on different levels. Thus it is a tautological truth.

by Geoffrey R. Loftus and Elizabeth F. Loftus
Department of Psychology, University of Washington, Seattle, Wash. 98195
Visual perception: the shifting domain of discourse

1. What is a "domain of discourse"? Ullman has launched an attack on the Gibsonian view by raising the critical question of exactly what it means for perception to be immediate. Essentially, Ullman's claim is that any process, including perception, can be considered immediate (direct) if that process cannot be broken down into constituents that are "meaningful within the domain of discourse." Ullman then goes on to argue persuasively that within a psychological domain of discourse, various interesting relations between stimuli and percepts can be broken down into more elementary constituents: therefore perception cannot be considered to be a direct process.

The immediate question that arises from this line of reasoning is how to define the "appropriate domain of discourse," not only for perception but for any research problem. There does not, it seems to us, appear to be an unequivocal answer to this question. Rather, the answer must depend on personal preference or some assessment of the common explanatory concepts currently extant in the field of concern. In some absolute sense, this weakens Ullman's case, since perception could be defined to be either direct or not direct simply by restricting or expanding what one takes to be the appropriate domain of discourse.

This difficulty could of course be resolved were researchers in a particular field to agree a priori on an "allowable" set of theoretical constructs, that is, an allowable domain of discourse. In practice, this does not seem to happen, at least not explicitly. But it does occur implicitly, and as a research endeavor evolves, one can, in general, at least detect boundaries on the explanatory concepts that come to be used. For example, it is unlikely that the magnetic structure of recording tape would be used to assess the difference between a Beethoven symphony and a Bach sonata or that the effectiveness of a football strategy would be explained in terms of nerve physiology.

2. The appropriate domain of discourse for perception. We would like to offer two comments about what seems to us to be a currently acceptable domain of discourse in the area of visual perception; both comments, we feel, would strengthen Ullman's position. The first concerns the use of physiological and anatomical terms as explanations for perceptual phenomena, and the second deals with the setting of perceptual research within the more general field of cognitive psychology.

2.1 Explanations based in anatomy and physiology. Ullman implies that anatomy and physiology are not within the domain of discourse that is appropriate for the discussion of perception. We find this a difficult proposition to accept; rather we would argue that the boundary between anatomy/physiology on the one hand and perception on the other is fuzzy and becoming fuzzier. In our view, there is abundant evidence of anatomical/physiological data being used as explanations for perceptual phenomena. Two examples will illustrate: one classic, and one more recent.

The classic example is that of dark adaptation. As shown early in this century (for example, by Hecht 1934), the function relating visual threshold to time in the dark is discontinuous, reaching one apparent asymptote after 4-5 minutes but then dropping to a second asymptote that occurs about 30 minutes later. The universal explanation for this result (see Kling & Riggs 1972, pp. 283-89) is in terms of two anatomically and functionally distinct sets of retinal photoreceptors, the rods and cones, which adapt at different rates.

The second example is that of visual masking. It has been known for some time that two stimuli presented in close spatial and temporal configuration will inhibit one another in various ways with respect to an observer's ability to detect them. Various explanations using a "perceptual" domain of discourse have been offered (for example, Kahneman, 1967). However, the most compelling accounts of masking rely heavily on explanation at an anatomical/physiological level. Breitmeyer
and Ganz (1976), for example, have offered a comprehensive theory of masking at the heart of which is the existence of two anatomically distinct (sustained and transient) visual channels.

To reiterate: these examples, as well as many others, represent instances of explanations of perceptual phenomena that are pitched at the level of neurons. If such explanations are permissible - which they certainly appear to be - then perception surely cannot be direct, because neurons must intervene between the environment and the percept.

2.2 Perception and cognitive psychology. Over the past two decades, the field of cognitive psychology has come into its own as a bona fide, well-recognized area within psychology. As we see it, research in cognitive psychology seeks to study the flow of information through the nervous system and subsumes the areas of attention, perception, memory, and mental representation. Any one of these research topics - perception is the case at hand - is rarely studied in isolation. Rather, within the framework of cognitive psychology, perception is viewed as one aspect of a larger cognitive system. Of interest are relations between the various components of the system. One major research endeavor concerns the interface between perception and memory, which in turn places heavy emphasis on an account of the mechanisms by which perception of one stimulus is affected by the perception of other stimuli presented nearby in space or time. The point we wish to stress is that an interest in these issues in and of itself precludes the notion that perception can be direct - that is, the question of how perception of stimulus A is affected by the prior perception of stimulus B presupposes that perception of stimulus A is not completely determined by the information in stimulus A. We will illustrate by considering once again the topic of visual masking, and in addition we will make some remarks about the highly related topic of subliminal perception.

Suppose a target stimulus such as the letter "G" is briefly presented to an observer. Under ordinary circumstances, this stimulus will be "perceived," in the sense that the observer will be able to report that the target occurred. But perception can be prevented (that is, the observer's ability to report the target can be driven to chance) by presenting a visual mask following the presentation of the target. Furthermore, it can be shown that different kinds of masks can halt the flow of information corresponding to the target at different points prior to where conscious perception (defined as the ability to report the letter) occurs. When, for instance, a random-noise mask (random dots, overlapping the target in space) or a homogeneous light flash is used, the information corresponding to the target appears to be obliterated early, probably at a retinal level (cf. Turvey 1973). In a metacontrast situation, on the other hand, the contours of the mask do not have any spatial overlap with the contours of the target. Here, the information corresponding to the target appears to be barred from consciousness at a much later level in the system, as indicated by the fact that the target can be "unmasked" by a second mask that masks the first (Dember & Purcell 1967); the target, unperceived though it is, can still initiate a reaction-time response (Fehrer & Raab 1962); and evoked potentials corresponding to the target are undeterred by the mask (Schiller & Chorover 1966). We emphasize that perception of the original target can hardly be direct if (a) it can be masked by a temporally nonoverlapping stimulus to begin with and (b) different types of masks can preclude perception of the target at different places in the nervous system.

The old issue of subliminal perception has recently received renewed attention, much of it deriving from the work of Marcel (in press). The main thrust of Marcel's research has been to show that a stimulus masked from consciousness (whose presence is reportable only at a chance level) can nonetheless exert considerable influence over other stimuli presented close in time. Perhaps the most dramatic of Marcel's results involves a lexical decision paradigm. In a lexical decision paradigm (see, for example, Meyer & Schvaneveldt 1971) reaction time to decide whether a letter string (for example, DOCTOR) is a word is reduced if the word is preceded by an associated word (NURSE) relative to when it is preceded by an unrelated word (FROG) or by no word at all. Marcel's contribution was to show that this result follows even when the preceding word has been masked out of conscious awareness. Surely the masked word must be said to have been perceived in the sense that it exerts many of the standard effects within the cognitive system that are exhibited by normally (consciously) perceived stimuli. This result is of interest from the present perspective for two reasons. First, like the masking example described above, Marcel's results demonstrate perceptual phenomena that can be explained only via recourse to a multistage processing system, thereby weighing against the notion of direct perception. Second, as alluded to by Ullman, a convincing demonstration of subliminal perception removes the percept itself from the realm of conscious experience, which is rather at odds with Gibson's (for example, 1972, p. 215) assertion that perception implies (presumably conscious) experience, and his dismissal of the computer metaphor (p. 217) on the grounds that a computer cannot have the experience of being "here."

Acknowledgments
The writing of this paper was supported by National Science Foundation grants BNS 79-06522 to Geoffrey Loftus and BNS 77-26856 to Elizabeth Loftus. Requests for reprints may be sent to Geoffrey Loftus, Department of Psychology, University of Washington, Seattle, Washington 98195.

by William M. Mace
Department of Psychology, Trinity College, Hartford, Conn. 06106
Perceptual activity and direct perception

Ullman's version of direct perception is not Gibson's. Indeed, Gibson would have disputed the view Ullman calls direct perception at least as vigorously as Ullman does. Gibson did not believe that perception was a matter of pairing stimuli with percepts, and he did not believe that there is no meaningful decomposition of the registration process. But understanding what Gibson was getting at requires a broader review of his system. The differences between Ullman and Gibson are far greater than Ullman seems to appreciate. These should be clarified.

Comparing representative cases. In comprehending and comparing scientific theories it is useful to notice what concrete cases lie at their core. One can ask what a thoroughly representative instance looks like. For Ullman a paradigmatic instance of perceiving would be a case of object or event identification in which one imagines some unknown presented to a perceiving system and the job of the perceiving system is to say what the unknown is or what some of its properties are. Perceiving is a kind of question-answering system. Thus Ullman identifies a class of problems as problems of the recovery of structure. For recovering structure from motion the problem is to show how a system might draw explicit conclusions about 3D arrangement when access to the real 3D arrangement can only be had through a changing 2D array. Where accomplished, one can say that the 3D structure was recovered from the sequence of 2D changes. Ullman understands the problem of perceptual theory to be that of designing systems which can bridge the "gap between the physical stimulus and the perception of objects." For vision, light distribution at the receptors is input, percepts are output. Perception is kept distinct from action. I hope this is a fair rendering of his position. I take it to be roughly the view shared by nearly everyone who works on perception except Gibson.

Gibson's paradigmatic case of perceiving is perceptually guided locomotion. Animal movement must be regulated with reference to the environment (Bernstein 1967; Turvey, Shaw & Mace 1978). Even in the limiting case of upright standing, an animal is oriented to the surface of support as the object of its activity. To think about perceiving in Gibson's way, one must think of specific animals and specific activities, then inquire as to what environmental support is required to perform those activities, and what perceptual information and abilities must be present for the adequate regulation of those activities. Over the years, Gibson became increasingly impressed with the tight link between perceiving and acting. As he developed his position that the changing optic array was far more informative about the environment than a nonchanging array (Gibson, Olum & Rosenblatt 1955; Gibson 1958; Gibson, Kaplan, Reynolds & Wheeler 1969), he saw that it was advantageous, if not absolutely necessary, for an animal to move about in order to satisfy conditions for adequate perceiving. "So we
must perceive in order to move, but we must also move in order to perceive" (Gibson 1979, p. 223). Exploratory locomotion is an example of perceptual activity for Gibson (1966). An exploring animal locomotes and adjusts the postures of its body and its members (including the head, eyes, and lens in the case of vision) partly according to the requirements of continued unobstructed activity and partly according to the requirements of acquiring more information. Much information is obtained by the organism rather than imposed on it. Information is used to guide the acquisition of more information. When Gibson spoke of registering or extracting information he meant to include all of the coordinated bodily movement as well as whatever neural events might be involved in the regulation. To properly compare his approach to Gibson's, Ullman might wish to explain the role of his computed percepts in ongoing activity.

Direct perception. Like Ullman, Gibson believed that one could establish a continuum from clear cases of direct perception to clear cases of indirect, mediated perception. To establish the dimension he explained, "Direct perception is what one gets from seeing Niagara Falls, say, as distinguished from seeing a picture of it. The latter kind of perception is mediated. So when I assert that perception of the environment is direct, I mean that it is not mediated by retinal pictures, neural pictures, or mental pictures" (1979, p. 147). Between the cases like Niagara Falls and the picture of Niagara Falls lie cases in which instruments such as telescopes may be used to enhance information (1979, p. 259). Farther out than pictures on the extreme of indirectness he placed knowledge acquired by description; that is, explicit knowledge (1979, p. 260).¹ This is clearly not the same as Ullman's continuum.
It is seemingly more concerned with what Ullman calls direct realism whereas Ullman claims to be interested in the processes of direct perception. But for Gibson, perceptual processes include coordinated activity. Continuing the paragraph I cited above, he said, "Direct perception is the activity of getting information from the ambient array of light. I call this a process of information pickup that involves the exploratory activity of looking around, getting around, and looking at things" (1979, p. 147). The crucial point for Gibson is that the possibilities of exploring the real Niagara Falls are very different from the possibilities of exploring the picture. There is information to specify these differences, and the information obtained from exploring these two different situations will also be different. Were Gibson to decompose the perceptual activity of a particular animal's exploring Niagara Falls, he would have talked about the overall posture and changes of posture of the body, the activities of the head on the body, the activities of the eyes within the head, and the activities of the pupil, lens, and retina (light and dark adaptation) in the eye. These adjustments do not occur sequentially or independently. They depend on one another. In short, they are coordinated. Now this coordination is a problem with complexity of truly heroic proportions but that still does not necessarily call for representations and computations (Turvey, Shaw & Mace, 1978).² The environment, on its side, may be decomposed as part of trying to understand its nested space-time structure. But in Gibson's framework the organism and the environment are the terms of the perceptual relation, and analysis of each does not destroy the terms or the directness of the relation (Shaw & Bransford 1977b).

Gibson's comprehensive system.
Throughout his career Gibson was intent on developing a realist theory of perceiving, one that did justice, in principle, to the adequacy of perceiving for the purposes of everyday animal activity. Everywhere he looked he found barriers to realism in psychology (Shaw, Turvey & Mace in press). In order to build theories that even had a chance of doing justice to his realist commitment, he had to redesign the framework for defining problems in addition to offering theories that addressed problems. He listed five major novelties of his approach: (1) a new notion of what perception is (experience of things rather than merely experience); (2) new assumptions about what there is to be perceived (the topic of most of his 1979 book); (3) a new conception of the information for perception; (4) a new approach to perceptual systems (the topic of his 1966 book); (5) recognition that a system registers both persistence and change in the flow of structured stimulation (1979, p. 239). Contrary to what Ullman implies, Gibson knew that a consistent realism is a very difficult position to construct. He revised and refined his ideas constantly, as can be seen in comparing his earlier and later published works. All of the pieces have to fit - the theory of the environment, the theory of information, the theory of the animal, and the theory of how they are related.3
Conclusion. Gibson never did get to the kind of theory of perceptual process that Ullman wants. Indeed Gibson had no role for such processes. Ullman has not gotten to theories of processes that capture animals exploring environments. How shall the two be reconciled?
Notes
1. Gibson identified explicit knowledge with verbal knowledge, knowing by means of words or symbols. He distinguished this from direct perception of an environment. Thus, whatever else it may be, perceiving definitely is not a kind of explicit knowledge in Gibson's system. Compare this to Ullman: "The role of the processing is not to create information but to extract it, integrate it, make it explicit and usable" (my emphasis).
2. In fact the role of computation, which is sequential and discrete, in explaining coordinated control will be very unclear until integrated with dynamics in some fashion (Bernstein 1967; Pattee 1971; 1974).
3. If I am right about realism being a requirement of Gibson's psychological theory, then of course Ullman is right in saying that Gibson's psychology could not offer inductive support for realism. But most philosophers only look to psychology for rhetorical support anyway. Philosophy is not science.
by Alan K. Mackworth
Department of Computer Science, University of British Columbia, Vancouver, B.C., Canada V6T 1W5
Are mediating representations the ghosts in the machine?
The immediate paradox facing a student of perception lies in the apparent conflict between the many-to-one nature of the world-to-retina transformation and the undeniable feeling that there is only one world out there when we open our eyes. That feeling of certainty diminishes not a whit if we sit motionless, or close one eye, or look through a monochrome filter, or see a movie of the world. A student of computational vision can express the paradox in terms of constraints. Human visual perception appears to be a richly overconstrained process with unambiguous results. An analysis of the constraints implicit in image formation alone, however, leaves us well short of such a desirable state of affairs.
Gibson has always argued that this is a false problem. He first attacked it by proposing that the analysis of the image formation process was incomplete, suggesting, for example, texture gradient as a determiner of surface slope in perspective projection. He further argued that the perception of static scenes by a static observer was unnatural and overlooked the information provided by motion which supplies, through optical flow patterns, a large class of additional constraints. Most recently (Gibson 1979), he argued that the assumptions underlying the paradox are false - that is, the premise that perception is based on the interpretation of images is, in his view, incorrect. He replaced the image by the ambient optic array surrounding the observing organism. This array has both temporal and spatial structure. He, moreover, explicitly rejects many of his earlier views in the new formulation. In particular, instead of viewing perception as a two-stage process, he is insistent that affordances (environmental attributes relevant to the organism's purposes) are picked up directly from the optic array.
Since this approach and that underlying computational vision are also in apparent conflict, Ullman has taken on the task of examining some of the underlying assumptions of each and determining if they can or should be reconciled. The paper is an elegant and convincing attack on the premises of direct perception, although I must declare an interest as one schooled in the paradigm rejected root-and-branch by Gibson. Not much would be gained if I, as a commentator, simply nodded assent to Ullman's attack, so I will argue, not with his conclusions, but with the reasoning that led him to them.
Much of Ullman's attack on Gibson's theory depends on an analysis of "immediate" which I found useful, but wrong. Ullman suggests that "immediate" in this context means "has no meaningful decompositions into elementary constituents," adding that such a notion is relative to the domain of discourse. However, his analysis applies not to the concept "immediate" but to the concept "primitive" in the sense of "nondecomposable." In my dictionaries "immediate" means, among other things, "without mediation or interposition": "with nothing coming between: without intermediary: direct." These are the meanings apparently intended by Gibson. This is confirmed by the fact that he overwhelmingly uses "direct perception" rather than "immediate perception" in his last book: "So when I assert that perception of the environment is direct, I mean that it is not mediated by retinal pictures, neural pictures, or mental pictures" (Gibson 1979, p. 147). Gibson's intention is to rule out such ghosts in the machine as "assumptions, preconceptions, expectations, mental images, or any of a dozen other hypothetical mediators" (p. 166). The correct sense of "immediate" does capture Gibson's meaning (from Latin in (not) + mediatus, past participle of mediare (to be in the middle)) as we can see from his continual contrasting of direct perception with mediated perception. Direct, in this sense, means "without intervening persons, conditions, or agencies." Gibson is merely insisting on a one-stage process.
This is not mere quibbling over dictionary definitions. A theory of direct perception might well have no intervening or mediating representations between the optic array and the perception of affordances and yet still be decomposable within the domain of discourse. For example, in Gibsonian terms, the perception of the affordance of the possibility of support for an organism by a surface in the environment might be an invariant composed of the invariant for the detection of surfaces, such as the disappearance behind an occluding edge of parts of the surface, and the invariant for horizontal surface orientation, such as the appropriate texture gradient. Such an explanation would be ruled out by Ullman's analysis of "immediate" as "nondecomposable" yet is allowable within the meaning of "immediate" or "direct" that insists on no mediating representations such as retinal images or primal sketches.
Because Ullman's analysis failed to preserve this distinction the later analysis based on it is somewhat flawed. However, most of Ullman's later criticism is in fact implicitly based on the "no mediating representation" view; to that extent it maintains its considerable force.
Ullman's argument could be repaired by showing that mediating representations are necessary for perception. But that we cannot now demonstrate, and, given the nature of scientific theories, we are unlikely ever to be able to demonstrate it. Ullman and the rest of the computational vision community have, however, shown that these representations are sufficient for many vision tasks; he nicely demonstrates it for shape from motion and motion from shape. No other theory, including Gibson's, has demonstrated sufficiency so convincingly. This fact, unfortunately, does not allow us to conclude they are necessary.
To show that a theory of perception must be mediated we should first understand "mediation." There can be mediating representations and mediating processes. In Gibson's characterization of "mediated theories of perception" it is not clear which is meant. For concreteness, let us assume mediating representations are meant. One suspects that Gibson wants to throw out all internal representations, not just intermediate ones, and Ullman assumes he means this too. Gibson also tends to confound the reasons for suggesting mediating representations; those reasons are unpacked to a great extent by Ullman. The justifications for them include:
1. Theories postulating such representations achieve higher levels of generative adequacy than other theories.
2. Specific rule-based algorithms acting on representations are the only adequate formal language we have for describing psychological processes.
3. By understanding visual perception as processes operating on a series of internal representations, one reduces the complexity and number of the processes.
4. Such representations act as a mediating ground in which the multiplicity of constraints arising from the image(s) (or ambient optic array) can be made consistent with each other and with the constraints corresponding to hierarchies of assumptions about the external world, the observer, and the imaging process.
5. Each representation and its set of associated constraints and processes may have its own descriptive apparatus. For example, image-based, observer-based, and world-based representations may all be necessary.
6. An appropriate representation allows partial knowledge of a situation to be represented without overcommitment.
7. The observer's mental state, purposes, and knowledge of the world may be represented and may mediate the interpretation process.
As a concrete example, I would claim that each of these has been demonstrated for a specific representation of local surface orientation, the gradient space (Mackworth 1976).
Ullman is careful to avoid claiming that he has shown the necessity of the representational view. There are, perhaps, two ways to try to do that. The first would argue that certain perceptual phenomena, such as Mach's illusion or the Ames's demonstrations, require mediating representations. The second would show that all the candidate nonrepresentational theories are inadequate and that at least one representational theory is adequate. Ullman follows the first line of reasoning at times but carefully stops short of concluding that representations are required. The danger in the second line is pointed out by Ullman. He attributes to Gibson the logical error of reasoning by a purportedly exhaustive but actually incomplete case analysis over the set of actual and possible theories. Ullman also follows the second line and comes dangerously close to committing the same error. His overall strategy is to use arguments against direct perception as evidence in favour of the representational view in general and, specifically, uses the same strategy in arguing the merits of their respective theories of perceiving the three-dimensional structure of moving objects.
Gibson, a psychological radical, makes us aware of the unwritten assumptions and philosophical baggage that our theories of perception carry with them. Ullman has successfully defended the representational/computational view and shown why we believe some of the things we do.

by K. Prazdny
England
How wrong is Gibson?
Ullman's article does not add very much new to the past controversy between the proponents and the opponents of Gibson's psychophysical theory of visual perception. The theory in its doctrinaire form is indefensible, but this does not mean that it is entirely wrong. It still has much to offer.
It is surprising that Gibson's theoretical and largely programmatic views have been attacked and defended (Shaw & Bransford 1977a) so viciously so often, especially in view of the existence of other, in my view far more radical and eccentric theoretical standpoints (Gregory 1972; 1979; Oatley 1978). Surely, these views, which define perception as (unconscious) hypothesis formation in which the perceptual experiences are constructions "from floating fragmentary scraps of data signalled by the senses and drawn from the brain memory banks, themselves constructions from the snippets of the past" (Gregory 1972), are much more radically wrong than Gibson's refusal to speculate about the "intervening" processes of perception. Until about ten years ago, Gibson's views were the major alternative for all those who were dissatisfied with the fashionable proposition that perception was a kind of detective work on the part of a visual system attempting to solve the gigantic jigsaw puzzle by patiently creating, testing, and rejecting hypotheses proposed mainly by an already established context, until one of them prevailed. The cue theory, the conceptual framework of most of these "constructivistic" views of perception, relies on a concept of perception as a sequence of static snapshots, and views space or depth perception as equivalent to static space
perception. Consequently, the classical cue theory has very little, if anything, to say about perception in kinetic contexts, a point regularly stressed by Gibson but overlooked by his critics. To see the crude way in which perception from the moving point of view was characterised in these theories, consider, for example, the view originated by Minsky (1975), the intellectual father of the "frames" idea, a concept still widely used in artificial intelligence and cognitive psychology: "Different frames correspond to different views, and the names of pointers between frames correspond to the motion or actions that change the viewpoint."
Gibson was a pioneer, and like all pioneers he oversimplified and disregarded pieces of evidence in order to accentuate his principal proposition: that visual perception is primarily a function of the structure of the ambient array and not of (acquired) knowledge. It is not accidental that his view shifted from what was called a psychophysical theory of visual perception to what he called ecological optics. He never ceased to emphasize that the analysis of the structure of information contained in the ambient light logically precedes any statement about how it is, or could be, processed. In this respect Ullman correctly points out that Gibson chose to disregard the processing and representational problems by calling perception "direct" and "immediate." One cannot do much better as a first approximation.
Gibson correctly pointed out that the excessive reliance of most theories of visual perception on various forms of cognitive or semilogical processes was brought about by their preoccupation with static images. These are inherently ambiguous. He argued that this ambiguity is reduced enormously, or even disappears, when the observer moves, as the spatiotemporal structure of the ambient array affords information specifying certain aspects of the environment uniquely.
He conjectured that perhaps all perceptual experience is determined uniquely in this way. This has proven almost correct in the case of frogs (Dodwell 1970), but almost certainly incorrect in the case of man.
This Gibsonian accent on the primacy of the dynamic aspects of visual stimulation is slowly becoming a commonplace nowadays, but it took nearly 30 years to realize such a simple fact. Some promising starts have been made, and Ullman lists a few (see also Koenderink & van Doorn 1975; 1976; 1977; 1979; Prazdny 1980). Gibson's argument that the eye is always in motion and that all vision is due to the dynamic aspects of the stimulation at the eye found dramatic support recently in the research of, for example, Kelly (1979a; 1979b), who found that the contrast threshold is elevated by a factor of about 20 if an image is stabilised, and that as ocular velocity (due, for example, to involuntary eye drift) decreases, the sensitivity of the eye decreases correspondingly, until it may even disappear at zero velocity (Kelly 1979b, p. 1348) if the stabilisation is precise enough (such a fine stabilisation has not been achieved yet, so the disappearance of the contrast sensitivity remains only a conjecture at present).
Ullman is at his best in section 4. I think he is right in asserting that the relevant problem is not whether the information in the optic array and the corresponding perceptions are expressible in terms of invariants or whatever other theoretical constructs, but rather in terms of what the information is, and how it is used and processed by the visual system. This is a nontrivial remark, for all too often the structure of "input" information has been taken to be determined by the requirements of the representations containing a priori cognitive (or high-level) information with which it was designed to interact.
Without any doubt, one has to find out what information is available, and what its structure is (for it is this structure which will codetermine the nature of further processing); and only then is it possible to ask whether, and how, it is used by a visual system. These are two separate stages of investigation, however. While one must sympathise, in general, with Ullman's theoretical position, typical of the elegant computational approach pioneered at MIT, one wonders why so much criticism has been made of Gibson's theory, apparently only because it overstressed the first, and deemed as psychologically (N.B., not computationally) irrelevant the second stage of investigation. Ullman is right; Gibson's theory regarding the precise analysis of the structure of the optic array and the role of certain invariant properties in it with respect to the "meaning" of the stimulation for the organism (the theory of affordances) will stand as a lasting contribution to the theory of visual perception. And rightly so.

by Sandra S. Prindle, Claudia Carello, and M. T. Turvey
Department of Psychology, University of Connecticut, Storrs, Conn. 06268 and Haskins Laboratories, New Haven, Conn. 06510
Animal-environment mutuality and direct perception
Perception is characterized by Ullman as a mapping from stimuli (defined at the receptor surface) to percepts. Gibson abandoned this characterization of perception sometime between 1957 and 1961 when he abandoned the psychophysical program and the more general causal formula in which stimuli are said to trigger responses. Nevertheless it is in the context of this stimuli-to-percepts construal that Ullman evaluates Gibson's claim of direct perception. The thrust of our commentary is that both Ullman's characterization of perception and his advocacy of perception as indirect or mediated are grounded in a view of (an) animal and (its) environment as constituting a dualism rather than a mutuality (or synergy); and that the understanding of perception as direct is grounded in animal-environment mutuality, as Gibson well respected.
Immediacy is indifferent to the bounded/unbounded distinction and to time. Let us begin, however, by assuming the propriety of Ullman's characterization of perception. Our purpose is to highlight Ullman's overevaluation of formal, symbol-manipulating systems as his inspiration for understanding perception and his underevaluation of physical systems (see Kugler et al. forthcoming, on Pylyshyn 1980). A formal system account of a physical system process (be it biological, physiological, or psychological) necessarily requires discrete, serial operations and an explicit representation of every aspect of the process, both steady-state and transformational. By contrast, in an actual physical system the operations are principally those of parallel and coordinated dynamics, and most changes (if not all) need no explicit description, since they are taken care of by the dynamic laws involving real space, time, and energy (for a discussion of the formal/physical contrast see Pattee 1977; Yates, Marsh & Iberall 1973; Yates 1979).
In the spirit of the above contrast we can recognize that nature abounds in "devices," quite distinct from calculating machines with rules or look-up tables, that map an unbounded set of "inputs" into a few singularities. The equilibria and steady-state phenomena of thermodynamics and hydrodynamics duly express the equifinality of open systems. Even the candle flame is an example: the flame moves under an indefinitely large number of conditions that threaten to extinguish it, and it does so in a way that maintains the flame. Artifacts also finesse the unbounded input "problem." A mass-spring system equilibrates at the same length over wide variations in the initial disequilibrating conditions. Polar and hatchet planimeters measure the area of any regular or irregular two-dimensional figure. In none of these cases involving unbounded sets of "inputs" (either natural or artifactual) is there an explicit calculation step or an explicit algorithm mediating between antecedent state and consequent state.
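The mass-spring case can be illustrated numerically. The following is a minimal sketch, not part of the commentary: it assumes a linearly damped spring, and the function name `settle`, the stiffness, damping, mass, and step values are all hypothetical choices made for illustration. It shows only that widely different initial conditions end at the same rest length. In keeping with the argument of the text, the loop is a description of the spring's behavior; it is not a claim that a spring carries out any such calculation.

```python
def settle(x0, v0, rest_length=1.0, k=4.0, c=1.5, m=1.0, dt=0.001, steps=40000):
    """Integrate a damped mass-spring system and return its final length.

    All parameter values are illustrative assumptions, not taken from the text.
    x0, v0: initial length and velocity (the "disequilibrating conditions").
    """
    x, v = x0, v0
    for _ in range(steps):
        # Hooke's law restoring force plus linear damping.
        a = (-k * (x - rest_length) - c * v) / m
        # Semi-implicit Euler step: update velocity first, then position.
        v += a * dt
        x += v * dt
    return x


# Widely different starting lengths and velocities (hypothetical values).
initial_conditions = [(0.2, 0.0), (3.0, 0.0), (1.0, 5.0), (2.0, -1.0)]
final_lengths = [settle(x0, v0) for x0, v0 in initial_conditions]

for (x0, v0), x_final in zip(initial_conditions, final_lengths):
    print(f"start length={x0:4.1f}, velocity={v0:4.1f} -> final length={x_final:.6f}")
```

Every run ends at the assumed rest length, whatever the starting state, which is the many-to-one character the text attributes to such systems.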
Immediacy in the sense of nonmediated has nothing to do with the number of antecedent states of affairs. Moreover, in these cases the process in question may occur over the short term, over the medium term, or over the long term. One can move the index of the polar planimeter quickly or slowly - it makes no difference. Hydrodynamic processes such as metabolism can take hours, days, or weeks; for example, water balance in mammals has a periodicity measured in days [see Toates: "Homeostasis and Drinking" BBS 2(1) 1979].
Immediacy in the sense of nonmediated has nothing to do with time. Our purpose in the two sections that follow is to motivate a third parallel conclusion which denies that immediacy has to do with input-output mappings. Before proceeding we pass comment on the status of algorithms as explanation. With regard to the natural and artifactual systems just referred to, an algorithm could describe the systematic behavior following perturbation but a search for the algorithm in the system would be futile, and an accrediting of the algorithm with causal responsibility for the behavior would be wrong. Polar and
hatchet planimeters are unequivocally measuring devices, but their measuring capability rests with their structural design relative to planar figures and human users. An algorithm might be a convenient description of a planimeter, but it would not be an explanation: in the realm of explaining the planimeter, the algorithm would be a fictitious entity.
Sensation-based theories and the definition of direct perception. Ullman believes - mistakenly - that there is a distinction between "sensation-based theory" and "representational/computational theory." The former is a catch-phrase of Gibson's for any theory which proposes:
1. that, in perceiving, "between things" interface or coordinate an animal with its environment - say, ideas, representations, sense-data, propositions, percepts, and so on. This reduces to saying that an animal is not directly acquainted with its environment as such but rather with a surrogate for that environment.
2. that the perception of any particular object or event is predicated on the logically prior perception of particulars of a more elementaristic nature; more precisely, in terms of the construal of perception subscribed to by Ullman, that a set of semantically impoverished predicates (those, say, of the senses) is translated into a set of semantically rich predicates (those of percepts).
Gibson's expression, "sensation-based theory," is commensurable with the popular usage of "sensa" in philosophy to refer to phenomenal objects. A sensum is a phenomenal individual, that is, an individual that exists when and only when it is experienced. Sensum is meant to cover a variety of individuals, including representations. With this definition at hand, direct and indirect perception can be crudely distinguished as follows: indirect perception is the claim that sensa (for example, representations) and only sensa are directly perceivable (since everybody assumes that there is something that is directly perceived - otherwise there would be a halting problem); direct perception is the claim that physical objects are directly perceivable. (The term physical is used intuitively and conventionally; its proper evaluation with respect to the knowings of animals is only now - pursuant to Gibson's efforts (see Gibson 1979, Part 1) - being undertaken in earnest.) To be redundant, direct perception is the claim that the perception of physical objects does not involve (is not mediated by) the perception of phenomenal individuals (for example, representations).
In sum, Gibson is quite justified in contrasting (as he does, for example, in his last book, pp. 251-53) direct perception with a broad variety of theories - including representational/computational ones - collected together under the umbrella expression "sensation-based theories." We doubt if anyone would have difficulty in seeing that (1), above, is the representational doctrine and (2), above, is the computational doctrine.
Toward an understanding of animal-environment mutuality. To reiterate, the intuitions of perception-as-direct and perception-as-indirect are linked to two very different conceptions of the animal-environment relationship. The contemporary version of indirect perception championed by Ullman is original only in its formal interpretation of mental states and the operations performed on them.
It is thematically continuous with a historical lineage that includes Greco-Roman references to eidola and phantasms, medieval theology and its emphasis on a creative soul, the Kantian unknowability of the thing-in-itself, and the attempts to force the facts of perception into the cause-and-effect chains of classical mechanics. As with many of its predecessors, contemporary indirect perception regards perception as involving: (1) a projection of the environment - in some form - into the perceiver; (2) reasoninglike processes implemented by the neural innards of the perceiver; and (3) memory or knowledge of facts and rules, both general and particular. Together these constituents yield an explicit understanding of the current environment. A minor thematic variation is that the imagelike resemblances said to be projected into the perceiver in earlier versions of indirect perception have been replaced in the contemporary version (by and large) by quasi-linguistic representations - structural descriptions.
The overarching presumption conditioning both past and present forms of (indirect) perception is that of a logical independence between (an) animal and (its) environment where matters of ontology
and epistemology are concerned (Turvey & Shaw 1979). It is principally this presumption that: (1) dictates the notion of detectivelike mental processes (given the logical independence, the animal-as-perceiver must "figure out" its environment), (2) fosters the definition of environment in conventional physical terms, that is, terms that are animal neutral (given the logical independence, there is no reason to consider the makeup of the animal in deciding on descriptors for the environment), and (3) discourages the physical and mathematical pursuit of ecologically scaled structured energy distributions that are specific to objects and events (given the logical independence, specificity is not necessarily to be expected). (As a necessary aside, contrary to Ullman's assertions, specification has been demonstrated to date, to a reasonable and promising degree; for example, Lee's [1980] time-to-contact variable and Shaw's transformational invariant for growth [see Pittenger, Shaw & Mark 1979].)
Aristotle did not subscribe to the logical independence of (an) animal and (its) environment (Smith 1971; 1974); nor did Dewey and Bentley (1949); nor did Woodbridge (1909); nor Kantor (1920); nor, most auspiciously, did Gibson (1979). These scholars - to greater or lesser degree - conceived of (an) animal and (its) environment as logically dependent. The key to grasping that perception is direct (and to ridding perceptual theory, more generally, of its theological trappings) rests ultimately, we believe, in a thoroughgoing understanding of animal-environment mutuality. Some preliminary steps have been taken (Gibson 1979; Patten 1979; Shaw and Turvey, in press; Shaw, Turvey & Mace, in press; Turvey & Shaw 1979; Turvey, Shaw & Mace 1978). What is being sought is a comprehension of (an) animal and (its) environment as complementary systems which act acausally as reciprocal contexts of mutual constraints.
Space does not permit a detailed discussion of the mutuality or synergy conception, but perhaps the following will provide an intuitive appreciation (see Shaw & Turvey, in press, for a fuller account). Consider the state of affairs termed "sourness." On the assumption of a logical independence of animal and environment, sourness might be ascribed to the object being tasted but, more likely, would be ascribed to the animal doing the tasting (probably to the activity of some of its neural fibers). Herein lies the perennially popular story of secondary qualities. Aristotle, assuming a logical dependence, told a quite different story: Object X has the "potential" to taste sour to animal Z (let us call this statement A) while animal Z has the "potential" to taste, as sour, object X (let us call this statement E) and in the mutuality of these two potentials, "sourness" is actualized. The object-focused statement and the animal-focused statement refer, respectively, to affordance (Gibson 1979) and to effectivity (Turvey & Shaw 1979). Consider an account of graspability in terms of the same formulation: object X has the "potential" to be grasped by animal Z while animal Z has the "potential" to grasp object X; in the mutuality of this affordance and this effectivity graspability is actualized.
The statements A and E above can be rewritten as problems A (how can X be grasped by Z?) and E (how can Z grasp X?), and then these problems are analogous to what mathematicians refer to as dual problems. Now with reference to the term "potential," it is readily appreciated that its referent in the affordance statement (A) is different from its referent in the effectivity statement (E). On the dual-problem analogy, "potential" can be likened to a mathematical basis, that is, a linearly independent set of vectors which span a problem's solution space. We identify, therefore, two nonidentical bases, SA and SE, termed dual bases, for problems A and E.
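For readers unfamiliar with the mathematical usage, one standard instance of dual problems is a pair of linear programs. The sketch below assumes that linear programming duality is the sense intended; the commentary does not say so, and the symbols M, b, c, x, and y are illustrative only and do not appear in the original.

```latex
\begin{aligned}
\text{Problem A (primal):}\quad & \max_{x \ge 0}\; c^{\top} x
  \quad \text{subject to}\quad M x \le b, \\
\text{Problem E (dual):}\quad & \min_{y \ge 0}\; b^{\top} y
  \quad \text{subject to}\quad M^{\top} y \ge c .
\end{aligned}
```

On this reading the two problems share one constraint matrix, taken as M in the first and as its transpose in the second, so the rows of one are the columns of the other. When both problems are feasible, their optimal values coincide, $c^{\top} x^{*} = b^{\top} y^{*}$, although the solutions $x^{*}$ and $y^{*}$ are in general different vectors living in different spaces.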
In problems in mathematics these bases underlie two matrices of constraint, termed dual matrices (one for problem A and one for problem E), whose properties can be useful in understanding the concept of mutuality in psychology. In particular, the term "mutuality" can now be expressed as the special relation that exists between dual matrices, namely: (1) the constraint columns in the matrix for one problem are the constraint rows for the dual problem, and vice versa; (2) the solution to one of the problems determines the solution to the other problem, and vice versa, although the solutions are not necessarily identical or equivalent. Rather, they are logically dependent, and their relationship is that of duality. Finally, with reference to the term "actualized," sourness or graspability entails more than just the solution to one or the other of the dual
problems and is, in fact, a state of affairs emergent from these problems when they are considered as duals.

A similar point can be made from a consideration of problems of relative stability in physiology and biology which are often conceived in terms of the logical independence of animal and environment. Thus, to explain the coordination between them (for example, of the kind that allows regulation of body temperature), special "coordinators" are introduced such as referent signals (set points), control algorithms, and the like. But regularities in living systems have not lent themselves to analysis in terms of these special coordinators, because the "set points" and "algorithms" of living systems actually reside in the design of the animal-environment system (cf. Yates 1979). "Set points" and "algorithms" are indexical and descriptive - not causal. What one finds are equilibrium operating points that are emergent properties of very distributed physical processes of a dynamic system (see Snellen 1973; Werner 1977; Yates 1980; Zavelishin & Tenenbaum 1968). [See also Toates: BBS 2(1) 1979.]

The point to be underscored by the mathematical analogy and the relative constancies of living systems is that the animal term and the environment term need not, and probably do not, relate as a projection: the environment is not projected into (duplicated by, represented in) the animal in any form - and this is exactly the point that Gibson understood with regard to perception. That is, the task of perceptual theory cannot be that advocated by Ullman, of detailing the stages by which the world comes to be represented, either formally or neurophysiologically, inside the perceiver. Perceptual theory's task is different and much broader; it is, roughly speaking, the development of mutually compatible theories of environments and organisms and the epistemic (informational) constraints which bind organism and environment as a synergistic system. And it is in this context that we complete the triad of assertions on the immediacy or directness of perception: immediacy in the sense of nonmediated has nothing to do with a mapping of inputs to outputs; rather, it has to do with the mutuality of potentials in Aristotle's terms or of affordances and effectivities in the terms Gibson initiated (see Gibson 1979; Michaels & Carello, in press; Shaw et al., in press).

by Edward S. Reed
Center for Research in Human Learning, University of Minnesota, Minneapolis, Minn. 55455
Information pickup is the activity of perceiving

Ullman characterizes his disagreement with Gibson's theory of direct visual perception as being concerned primarily with the issue of information pickup. According to Ullman, "the registration of information is a primitive construct that has no elaboration within [Gibson's] theory." Ullman doubts that a complete theory of perception can accept pickup as an unanalyzable concept, and he argues that Gibson's theory is seriously marred because it lacks anything beyond a physiological explication of pickup. I believe Ullman is right to argue that theories of perception should attempt to explain the psychological processes involved in perceiving. However, Ullman's arguments against Gibson's theory are incorrect: it is not the case that Gibson held that pickup was an unanalyzable primitive construct; on the contrary, Gibson (1966; 1968; 197?; 1979, ch. 14) analyzed and expounded his notion of pickup at great length.

According to Ullman, Gibson's theory of information pickup is a variant of S-R psychology. Ullman says that, for Gibson, perceiving is based on a "direct coupling" between stimuli and percepts directly analogous to reflexes which "can probably be thought of as a pre-wired, immediate coupling between stimuli and responses." Reflexes and information pickup are both primitive constructs as far as psychological theories are concerned, because they can only be analyzed meaningfully in physiological terms.

At one time, Gibson (1959) may have held just such an S-R view of perception. However, Gibson abandoned this S-R or "perceptual psychophysics" theory in the early 1960s (see Gibson 1963; 1966; 1979; Mace 1977). Moreover, he has been extremely explicit in his self-criticism, clearly anticipating Ullman's critique:

I thought I had discovered that there were stimuli for perceptions in much the same way that there were known to be stimuli for sensations. This now seems to me to be a mistake. . . . I should not have implied that a percept was an automatic response to a stimulus, as a sense impression is supposed to be. For even then I realized that perceiving is an act, not a response, an act of attention, not a triggered impression, an achievement, not a reflex (Gibson 1979, p. 149).

(Note that Gibson claims that he never held an S-R theory of perceiving, although, unfortunately, he feels he gave the impression that he did hold such a view).

Gibson's theory of perceptual activity is anything but the reflexive pairing of stimuli with percepts that Ullman claims it is. Ullman argues that, since indirect theories of perceptual activity hold that perceiving involves an indirect computation of percepts out of stimuli, a direct theory of perception must hold that percepts and stimuli are directly coupled. This is very misleading: DVP (direct visual perception) theory does not involve inverting indirect theories of perception, but rather involves rejecting their basic assumptions and hypotheses. There is no basis in Gibson's later writings for Ullman's assertion that Gibson held any sort of S-R account of perceptual activity, for Gibson (1979, p. 55 ff.) rejected the hypothesis that perception is caused by stimulation, "abandoned the very notion of stimulation as typically comprised of discrete stimuli" (p. 238), and questioned the hypothesis that perception results in percepts (p. 239). Ullman fails to realize that it is he who wants to explain the "mapping . . . between stimuli and percepts": Gibson gave up this goal before 1966. The goal of Gibson's theory of perceptual activity is not to argue against indirect theories by "selective refutation," as Ullman asserts. On the contrary, Gibson's theory of perceptual activity is an attempt to explain how ecological information is detected by mechanisms of purposive attention (Gibson 1963; 1966; 1976).

Gibson's theory of perceptual activity starts by rejecting the hypothesis that the stimulation of receptors is the basis of perceiving. Perceiving is not based on stimuli or even on the transmission of excitation along afferent nerves to sensory cortex; rather, perceiving starts with purposively functioning animals exploring their environment. Because ecological information exists within the environment in energy patterns which are relatively large in space-time, it cannot be registered by a receptor. Also, information is too densely structured to be registered by a passive receptor surface (Gibson 1979, p. 243). A pattern of peripheral stimulation cannot be information because it cannot specify its environmental source. Ullman takes this to imply that perception requires computations to "recover" meaningful information, whereas Gibson takes this to imply that proximal stimulation is not the basis of perception, although stimulus information, which is specific to its source, is the basis of vision. Perceptual systems pick up information by scanning its structure (running the perceptual organs over the environmental structure) and by purposively seeking to optimize afferent-efferent loops attuned to that information. Information does not flow along afferent-efferent channels the way excitation does: indeed, when information is being picked up, the mutual adjustments of a perceptual system's organs require that excitation be flowing centrifugally and centripetally, horizontally as well as vertically throughout the CNS. Gibson (1966) has shown that this theory of pickup is consistent with current physiology, and Runeson (1977) has demonstrated that information flow is not necessary in a pid

The visual system is not "a channel for transmitting signals from the retina to the brain"; rather, it "is a system for sampling the optic array" (Gibson 1970, p. 76). Ullman's and other traditional accounts of perceptual activity deal solely with the vicissitudes of sensory signals in the afferent channels of passive systems, ignoring the activity of looking and the psychological functions which this activity serves. In contrast to these traditional views, Gibson emphasizes that learning, attention, anticipation, motivation and other psychosomatic factors all modulate information pickup (although they do not affect the information itself). Thus Ullman's claim that Gibson's theory leads to "oversimplifications" concerning perceptual activity is much more appropriately applied to his own account of processing than to Gibson's.
In conclusion, Ullman's main argument is that Gibson's theory of perception is incomplete because it takes pickup to be an unanalyzable primitive construct. This argument fails because Gibson in fact offered a detailed analysis of pickup. Gibson's psychological analysis of pickup, as purposive attending, involving motivation, reminiscence, and expectation (Gibson 1966, pp. 275-80) led him to question the classical principles of sensory physiology, anticipating the later efforts of some neurophysiologists (Luria 1973; Masterton & Berkley 1974; Wall 1970). Hence, Gibson could not have held the view, ascribed to him by Ullman, that perceiving involves a reflexlike coupling of stimuli and percepts, because Gibson rejected the hypothesis that stimuli, percepts, and reflexes are components of perceiving. Perceptual theorists who hold that perceiving is based on receptor stimulation will have to hypothesize operations to "recover" the structure lost in the proximal stimuli. Perceptual theorists who hold that perceiving occurs when animals attend to ecological information hypothesize scanning and attentional activities to explain how organisms detect information relevant to their behaviors and needs. There is a "sharp contrast" between Gibson's theory of perception and other theories; it is just not the contrast Ullman describes.

Acknowledgment
This work was supported by grants to the Center for Research in Human Learning, University of Minnesota, from the National Science Foundation (NSF/BNS-77-22075) and the National Institute for Child Health and Human Development (T36-07151 and HD-01136).

by Irvin Rock
Institute for Cognitive Studies, Rutgers University, Newark, N.J. 07102
Difficulties with a direct theory of perception

It will be more productive to focus these comments on the essence of a direct theory of perception rather than on the exemplification of such a theory by any one individual¹ and on the essence of an indirect theory rather than on the particular alternative put forth in the article by Ullman.² While agreeing with many of Ullman's arguments, I will take the occasion to raise some others and to consider some evidence not stressed by him.

In my opinion, the essence of a direct theory is that stimulus information is available that uniquely correlates with each particular perception.³ Thus the specification of such information provides the necessary and sufficient explanation of perception.⁴ The essence of an indirect theory is that the stimulus information, while a necessary determinant, is not sufficient, because certain mediating processes must occur, once the stimulus information is registered or picked up, prior to the achievement of the percept. Such mediating processes can be described in psychological language and are a necessary part of the chain of events leading to the final perception. In my opinion these processes could be either interactive in nature, such as were stressed by the Gestaltists, or they could be cognitive or thoughtlike in character. Examples of such processes are variously referred to as "organizing" or "grouping," "interpreting," "taking account of," "computing," "inferring," "describing," "deciding," and the like.

Why do some investigators believe that such mediating processes are as necessary for perception as is the stimulus information? The answer lies in part in an analysis of the necessary and sufficient conditions for various phenomena of perception that investigators seek to explain. Consider the perceptual constancies, which are at the very core of, and are virtually synonymous with, the field of perception. Despite a valiant attempt to deal with constancy in terms of higher-order information directly "picked up," the fact is that constancy also occurs when this kind of information is not available. Thus an object's size can be perceived more or less veridically when it is the only object visible, so that only oculomotor "cues" to distance are available (Rock, Hill & Fineman 1968); an object's visual orientation in the environment is perceived, more or less veridically, by a tilted observer, when the object is the only thing visible such that only gravitationally derived information is available (Witkin & Asch 1948). An object's radial direction is perceived veridically despite variation in the locus of its retinal image when it is the only visible object such that only information from the direction of gaze of the eyes is available (Hill 1972). Moreover, in this last case, we now know from studies of prism adaptation that this relationship is relearnable, so that following exposure to displacing prisms, a foveal stimulus will be perceived as straight ahead when the eyes are in fact turned to the side (Kalil & Freedman 1966). Thus, in these cases - and there are many others - is there any alternative to the conclusion that the perceptual system "takes account" of one kind of stimulus information in "assessing" or "computing" or "inferring" the perceptual character of the retinal image of an object based upon another kind of stimulus information?

In these examples, it is implicit that the stimulus (retinal image) representing the object - for example, its visual angle, retinal orientation or locus - would be ambiguous were it not for other information - for example, its distance, or the body's orientation, or the direction of gaze of the eyes. Ambiguity is of course anathema to a direct theory. It can hardly be claimed that for every percept there is a unique stimulus and vice versa if in fact, given that stimulus, the percept need not occur. Therefore the effort by direct theorists has been to deny ambiguity by logical argument, postulation of (presumably) nonambiguous higher-order stimulus attributes, or by relegation of cases of undeniable ambiguity to artificial or impoverished laboratory conditions. But in this commentator's opinion, ambiguity remains a correct description of many stimulus arrays on the basis of both logical analysis and de facto outcome. I have in mind the stimulus that prevails in phenomena such as reversible figures of all kinds, induced movement, the kinetic and stereokinetic depth effect, illusory contours, the interposition depth effect, transparency effects, grouping and figure-ground organization, certain cases of stroboscopic motion and anorthoscopic perception, and the like. By "de facto" ambiguity I mean that more than one perception actually occurs in many of these cases. Often one perception occurs at the outset that is different from another, later perception of the same display. True, there is generally a strong preference for the final percept, for example, for depth in the stereokinetic display rather than a transforming two-dimensional percept (Musatti 1924). But this simply points out the fact that we need a theory to explain such preference.

Some of these examples also illustrate the fact of enrichment, another concept that is anathema to a direct theory. Enrichment means that past experience contributes something of itself to the perception, without which the same stimulus would clearly look different. I fail to see how this kind of effect can possibly be denied whether we consider laboratory demonstrations (for example, Wallach, O'Connell & Neisser 1953) or examples in daily life. A related effect occurs when there are features of the percept which, on logical grounds, simply cannot be said to be represented by any features in the stimulus information. There is an elaboration or supplementation in which there are dimensions or aspects of the percept that represent the constructive aspects of mind and that simply do not have any direct stimulus correlate. Illusory contour seems to me to be an example of this kind (Kanizsa 1955). Another is stroboscopic motion. In these cases the elaboration is not necessarily based on a contribution from past experience.

Advocates of a direct theory maintain that there are unique stimulus features for every kind of percept, although they may not yet have been discovered. Well, of course there are unique stimulus features for each unique kind of perception. Every theory of perception must acknowledge that perception is grounded in sensory stimulation or stimulus information. Yet that does not imply that there is always something given that could be said to be the sufficient correlate of the percept. That may or may not be the case. In some instances the argument is at least plausible. Thus, for example a luminance ratio across two regions could be the stimulus correlate of achromatic color and thus could explain constancy of achromatic color under varying illumination (Wallach 1948); or a texture gradient could be the stimulus correlate and thus the explanation of the phenomenal slope of a plane (Gibson 1950); but can discrete, alternating stimulation by two points provide the necessary and sufficient explanation of phenomenal motion between the two points? Thus, the reasoning in favor of direct theory is sometimes circular, in that the stimulus conditions for a phenomenon are analyzed and then simply claimed to be the necessary and sufficient correlate of it. In other words, sometimes a
higher-order stimulus attribute, such as a texture gradient, is uncovered; and because of an isomorphism between the stimulus attribute (for example, steepness of texture gradient) and percept (for example, slope of plane with respect to the line of sight) the correlation is at least a plausible theory of the perception. Predictions can be made and tested. But in other cases, no such factor is discovered. Instead, the stimulus conditions of a phenomenon are empirically investigated. For example, consider the spatial and temporal parameters essential for a stroboscopic motion percept. These were not predicted or predictable on the basis of a direct theory, nor is there a stimulus-perception isomorphism here. Thus it would be both empty and circular now to claim that specifying these stimulus factors, whatever they are, is a sufficient explanation of the percept.

Brief mention should be made of certain other difficulties with direct theory. Such a theory must successfully circumnavigate the problem of perceptual organization. The attempt to account for what goes with what, or what is figure and what is ground, by implying that in the conditions of daily life - in contrast to the impoverished pictures and line drawings of the laboratory - this information is directly available simply will not do. For example, it might be said that depth information of various kinds suffices to establish figure and ground because it tells us what is in front and what behind. But - to give just one argument against this claim - the depth relations among objects that are very far away, so as to be beyond the range of binocular information or information derived from observer motion, must be determined by pictorial information which, by definition, is no richer than that which can be given in pictures. If nonpictorial depth information cannot directly specify what is figure and what is ground, then certain internal organizational principles applied to the content of the scene's image must do so just as in the case of pictures and line drawings. In fact, certain organizational principles - for example those governing interposition - even determine what is in front of what under such conditions. I suspect that the "experience error" is often responsible for the belief that the phenomenal organization is directly given by the stimulus (Kohler 1947).

A direct theory is unable to deal with the many cases in which one perception depends upon another. This point is related to the foregoing one in that often a particular perceptual outcome depends upon how the incoming stimulus array is organized. Thus, for example, there is reason for believing that only if one region is figural will the deletion and accretion of elements of a neighboring one signify its occlusion and disocclusion (see Kaplan 1969, but also Rock & Gilchrist 1975); only if an illusory contour becomes a phenomenal reality will certain other perceptions that depend upon such a contoured region emerge or disappear as the case may be, and so forth. The example Ullman gives of the Mach book viewed by a moving observer following the achievement of a perspective reversal illustrates this same point. The perception of the book as moving depends upon the perspective depth reversal.

By way of conclusion, a few words about the problem of veridicality are in order. One of the appealing features of a direct theory is its potential for dealing with this problem. After all, if we can somehow directly pick up the information about what is "out there" because the information is so much richer than earlier philosophers and psychologists appreciated, then veridicality is no longer a problem. Conversely, theories that rely heavily on constructive events within the organism do have a problem accounting for veridicality. The point I would like to emphasize, however, is that, for all theories, perception must do justice to or account for the stimulus. There must be a good match between the constructed percept and the stimulus. That is what distinguishes perception from other kinds of cognition such as imagining or thinking. Thus the percept must be grounded in the stimulus, and this in itself serves as a constraint that assures that perception will have some degree of correlation with what is producing the stimulus. But, in addition, the requirement has led to the evolution of rules and principles of preference that will generally guarantee veridicality. One of the virtues of this way of looking at the problem is that illusions then become explicable in terms of the misapplication of the same rules and principles.

Notes
1. Of course the chief modern proponent of the direct kind of theory ... many of the findings of the Gestaltists (or of those trained by them ... for higher-order stimulus determinants of perception (see Wallach 1948; Brown 1931; Rock & Ebenholtz 1959). Whether one tends to agree or disagree with Gibson's formulations, no one studying perception can afford to ignore them; if one wishes to put forth an indirect theory of a given phenomenon one must first demonstrate why a direct theory is inadequate to explain it. Ullman's target article and this open peer commentary on it should be thought of as a tribute to Gibson, whose recent death we all mourn. He would surely have welcomed and thoroughly enjoyed the debate.
2. Ullman writes as if the computational/representational theory were the first or only cognitive approach to challenge or offer an alternative to the direct theory whereas it is in fact relatively new in the field of perception and does not represent experimental psychology. He thereby ignores alternatives that go back at least to Helmholtz, alternatives that still have various contemporary representatives within the field of psychology.
3. Since I am seeking to analyze any kind of direct theory and not only Gibson's, I do not feel obliged to be confined to his particular way of referring to the stimulus, which in any event changed in the course of his writings. Thus, I will use terms such as "stimulus," "stimulus information," "retinal image," or "information pick up," more or less interchangeably.
4. In recognition of the fact that we often need to learn to discriminate similar-appearing members of the same category or class from one another, James Gibson, Eleanor Gibson, and their associates tacitly acknowledge that the availability of stimulus information does not always provide a sufficient explanation of perception (see E.J. Gibson 1969). It is argued that learning here consists in differentiating among stimulus objects by extracting the relevant information that is already given. It seems to me that the necessity of such perceptual learning raises serious difficulties for a direct theory even if the learning is one of differentiation rather than of enrichment.

by Sverker Runeson
Psykologiska Institutionen, University of Uppsala, S-751 04 Uppsala, Sweden
There is more to psychological meaningfulness than computation and representation

Ullman's version of indirect perception is remarkably different from traditional ones in that it concedes two of the most important and controversial constituents of the direct perception approach: that the available information is rich and sufficient and that perception does not involve any junctions at which the information is consciously represented in a primitive or meaningless form. (I will assume that Ullman's "decomposing the relation between stimuli and percepts in psychologically meaningful terms" does not refer to conscious components.) Thus it seems that in Ullman's view, as well as in Gibson's, what we perceive is the world and not images of it.

The major point of disagreement put forward in the target article concerns the nature of the mechanisms or systems that enable us to perceive the world. Although I am not convinced that "direct versus indirect" are appropriate terms for it, the issue is a substantial one among theories of perception. Ullman claims that perceptual functioning is (necessarily?) representational/computational whereas Gibson suggests that concepts like resonance and tuning provide better analogies. Unfortunately, Ullman shares the common misapprehension that Gibson's terms "resonance" and "tuning" are meant to be psychologically empty and hence that perceptual mechanisms are of interest only to, and could be studied only by neurophysiology. True enough, there isn't much computation implied by these notions, but there might be other interesting things involved. Thus, in technology, questions about what makes a device resonate, or how it can be made tunable, reach far beyond simple checking out of the details of the hardware; they extend into advanced theory. And the principles behind a device that selectively picks up something delicate are certainly worth stu...

The commentary was written while the author was visiting assistant professor in the Department of Psychology, Cornell University, Ithaca, N.Y.
by Robert Shaw and James Todd
Department of Psychology, University of Connecticut, Storrs, Conn. 06268
Abstract machine theory and direct perception
Ullman, in his attempts to criticize the late James J. Gibson's theory of perception, has erred on two counts. First, he fails to offer anything more than a parody of Gibson's theory by attributing to him a view of "direct" perception and its entailed mechanism which is false and misleading. Perhaps such a straw man provides a more convenient target for criticism, but it is only a fancy of Ullman's imagination. In addition, or perhaps because of this failure to grasp the issue of directness in perception, Ullman then attempts to criticize the direct approach. But since his blows are aimed at a straw man, they have little relevance to the theory of perception held by Gibson and his followers. Finally, it will be worth noting, with inescapable irony, that the representational/computational account of perception championed by Ullman tends to trivialize the problem of perception by ignoring exactly those issues which Gibson made the central concerns of his ecological approach. Let us now consider the merits of these complaints.

Ullman's conception of Gibson's theory is not Gibson's. There are two Gibsons, the one of over twenty years ago who defined perception as a psychophysical problem of mapping stimuli to percepts (Gibson 1959), and the one of this past decade who defined perception as a function of stimulus information - more precisely, as an experiencing of things in the environment in terms of what actions they afford, rather than a having of experiences or explicit understandings (Gibson 1979). Ullman elects to attack the older characterization of perception and to ignore the more up-to-date one, perhaps because he feels the former fits more nicely than does the latter into the standard cognitive interpretation of abstract machine theory. Ullman claims that two mappings are involved in direct visual perception: "The first mapping is between various aspects of the environment and some spatio-temporal patterns of the visual array," while "the second mapping is between stimuli and perceptions." Since the first mapping is achieved by physical laws and specifies the domain of inputs to the perceptual system, the crux of the problem of perception must lie in how one interprets the second mapping.
But it must be duly noted that this conception of "direct" perception, as the mapping of stimuli into percepts, was such anathema to Gibson that he explicitly denied it. On the contrary, perception, for Gibson, implies a mutuality of animal as both a perceiver of and as an actor in its environment. This view requires a radically different conception of perception from the one proposed by Ullman. It requires a direct (single), bidirectional mapping of environmental information onto behaviors and vice versa - what in mathematics is called a duality (Turvey, Shaw & Mace 1978; Turvey & Shaw 1979; Shaw & Turvey, in press; Shaw, Turvey & Mace, in press). By ignoring this reciprocal relation between action and perception, Ullman utterly trivializes the most fundamental problem addressed by Gibson, overlooks the raison d'être of the ecological approach, and misses the basic theme of Gibson's last two decades of work. (For example, "How do we see how to do things?" Gibson 1979, p. 1.) Since we do not doubt that Ullman is a dedicated and competent scientist within his own field, one must be puzzled at his cavalier treatment of Gibson's nearly half a century of work in a related field. Perhaps the fault lies in Ullman's attempt to force Gibson's theory of perception into an unnecessarily narrow conception of machine theory.

Does perception require "internal" (cognitive) states? The typical cognitive rendition of perception in machine theory is as follows: R(t+1) = F(Q(t), S(t)), where R(t+1) is the perceptual response (output) which arises at time t+1 as a function F of some stimulus (input) S at time t and some "internal state" of the machine Q at time t. Ullman adopts such an interpretation when he argues that a proper understanding of what a person perceives on a given occasion depends not only upon the stimulus input but also upon the current state of affairs of the perceiver - an internal representation.
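The form R(t+1) = F(Q(t), S(t)) can be made concrete with a toy response function. What follows is a minimal sketch under stated assumptions, not a model proposed in the commentary or in the target article: the stimulus label, the state labels, and the response labels are all hypothetical, and the stimulus-only function is included purely as a contrast case.

```python
# Toy illustration of R(t+1) = F(Q(t), S(t)).
# All labels below are hypothetical and chosen only for the example.

AMBIGUOUS_STIMULUS = "folded_card"   # stands in for an ambiguous display


def respond_with_state(q: str, s: str) -> str:
    """Response as a function of internal state q and stimulus s."""
    if s == AMBIGUOUS_STIMULUS:
        # The same stimulus yields different responses in different states.
        return "convex" if q == "expects_convex" else "concave"
    return "unknown"


def respond_stimulus_only(s: str) -> str:
    """Contrast case: response as a function of the stimulus alone."""
    return "convex" if s == AMBIGUOUS_STIMULUS else "unknown"


states = ("expects_convex", "expects_concave")

# Collect the distinct responses each formulation can give to one stimulus.
responses_with_state = {respond_with_state(q, AMBIGUOUS_STIMULUS) for q in states}
responses_stimulus_only = {respond_stimulus_only(AMBIGUOUS_STIMULUS) for _ in states}

print(sorted(responses_with_state))
print(sorted(responses_stimulus_only))
```

With the extra argument the function can return two different responses to one and the same input; a function of the stimulus alone can return only one. That is the formal point at stake, whatever interpretation the extra argument is later given.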
He believes that "internal" states are required to disambiguate those cases in which the same input yields different perceptual effects (as in the case of the Mach illusion example). Ullman's major complaint against what he takes to be Gibson's theory of perception is that it omits the state variable and reduces simply to perception being a function of stimulation, that is, R(t+1) = F(S(t)). Two questions might be raised regarding the cognitive interpretation of machine theory: first, is the state variable Q(t) a necessary term in all machine descriptions, or might it not be replaced with some other term capable of performing the same formal duty? Second, allowing such a state term, or its formal equivalent, need it be given the same semantic duty as imputed to it by the cognitive approach, or might it not
be treated as something other than a reified "internal state" which causally mediates perceptual effects? If an affirmative answer can be given to either of the above questions, then Gibson's ecological approach to perception could be accommodated by abstract machine theory, and the cognitive theorists' complaints would be without merit.

Let us assume an animal (machine) A with some history of interaction with an environment E, then let H(t) represent this history of the state of affairs concerning A's transactions with E up to some time t. This means that H(t) includes all the effects of A's relationship with E, such as the inputs received, and the outputs afforded. Then, following Minsky (1967), assuming that the (perceptual) state of affairs in which A participates up to t constrains its next response R at t+1, there must be some relation, F, of the form R(t+1) = F(H(t), S(t)). Notice in the above formulation, no state term Q(t) is needed, in the sense of an "internal" state, which somehow imparts meaning or enriches the input. Rather the term H(t) refers to the entire history of transactions of the animal with its environment. The reason that this conception of abstract machines is not ordinarily used by computer scientists is that any relation involving an entire history of transaction "would be too hopelessly cumbersome to deal with directly" (Minsky 1967, p. 15). Nevertheless, the most general conception of a machine with a history in no way requires the notion of "internal states," Q(t), but invokes such variables only as a convenience for designing and programming man-made devices, such as computers. For this reason, a computational scheme over Q(t) (a program) is but a convenient means of providing a device, which has no history in a natural environment, with an artificial "history." The variable Q(t) has no meaning of its own, except what is derived from the history term H(t). However, even if one adopts this convenience, it is by no means necessary to reify Q(t) as an "internal state." For, as Minsky (1967) rightly observes, any "internal state" that has no external consequences is irrelevant to the description of a machine's behavior. Since a canonical definition of machine need not incorporate such irrelevant states, "it might be better to talk of our classes of histories [internal states] as 'external states'" (Minsky 1967, p. 16).

The fundamental insight suggested by Minsky's observation is that the variables Q(t) and H(t) have at least two possible semantic interpretations. Whereas the cognitive interpretation describes them as "internal states," the behavioral interpretation describes them as "external states." This implies that the two views are complementary and, therefore, there must exist commensurate formal characterizations under which the two views possess the same explanatory power. Of course, neither view alone may provide adequate theories of perception. In fact, the ecological approach to perception takes the limitations of both the behavioral and cognitive views as axiomatic, and proceeds upon the assumption that they must be treated jointly and that they entail a mutually defined, integral unit of analysis whose "states" are neither internal nor external. Although it may be useful for methodological reasons to focus temporarily on a single interpretation in isolation, one cannot lose sight of their reciprocal nature without losing something essential.

These issues lie at the very heart, not only of Gibson's theory of direct perception, but also of abstract machine theory. Ironically, cognitive science might come to a better understanding of Gibson through a careful reexamination of its own conceptual foundations.

Perception as a function of ecological "machines." From the above argument, it should be clear that what cognitive theorists take to be a necessary presupposition of perceptual theory, namely, the existence of so-called "internal states", Q(t), is nothing more than a convenient fiction of contemporary computer science methodology, which allows the programmer, in lieu of evolution and learning opportunities, to provide machines which have no natural histories, H(t), with artificial ones. Hence the apparent "indirectness" of perception is but an arbitrary feature, bestowed upon machine models by the semantics of the cognitive approach, which readily disappears under more naturalistic (evolutionary, developmental, and learning) interpretations of machine theory. Still it is, of course, quite fair if, for the sake of the convenience, "cognitive" or other theorists should choose to construct algorithmic models of perceptual phenomena; such programs may provide a useful summary of the complex histories of animal-environment transactions by which the perceptual systems under study might have become attuned. On the other hand, such theorists should be admonished to be circumspect and not take the "internal state" description fostered by this methodological tool as being a blueprint of the ghostly states of mind - a veritable deus ex machina.

If such cognitive (indirect) models of perception are neither formally nor theoretically necessary, then how should one conceptualize perceptual systems in terms of machine theory so as to capture their essential nature, namely, their ability to become attuned in design and function through evolution, development, and experience? Indeed, it is incumbent upon the ecological theorist to provide, so far as is possible, machine theoretic models which are at least as formally precise as those provided by cognitive theorists such as Ullman (1979b). Clearly, ecological theorists have no quarrel with abstract machine theory per se, if it is properly construed so as not to obscure or trivialize the fundamental problems of perceptual theory - for example, how perceptual systems become attuned by their histories.

In closing, to avoid ending on a negative note, let us take a tentative first step toward an abstract machine formulation of Gibson's theory of perception, one that captures the difference between the indirect (cognitive) approach and the direct (ecological) approach. Notice that in the traditional abstract machine conception as given by R(t+1) = F(H(t), S(t)), there is no necessary reciprocal relation between inputs, S, and outputs, R, to express the mutuality of constraint postulated by the ecological approach to exist between perception and action. To wit: the things that an animal perceives constrain what it does, and what an animal does constrains what it perceives. For example, seeing the portal through which I wish to pass guides my locomotion toward it and through it, while every step I take in this regard refashions the optical flow of perceptual information for distance, direction, and rate (Lee 1974). In short, perceptual information has determinate consequences for action, and action promotes informational changes of significant consequence to perception. Hence action enters as a variable into perception no less than perception enters as a variable into action. All of this suggests, moreover, that there is an intrinsic mutual compatibility between an animal and its environment, which is, after all, the fundamental premise of Gibson's theory. As a rough first pass, this mutuality of constraint between the animal, as actor and perceiver, and environment, as acted upon and perceived, minimally requires the following machine theory formulation (cf. Patten 1979): R(t+1) = F(H(t), S(t)) as before and additionally, S(t+1) = F(H(t), R(t)).

In accordance with the earlier discussion, there is no necessary sense in which any of the above variables should be taken as being "states" in an animal. Rather, the animal as actor/perceiver is more aptly thought of as being functionally defined over the constraints specified by these dual equations. Furthermore, since the environmental terms R(t+1) and S(t+1) (the action consequences of perceptual histories and the perceptual consequences of action histories, respectively) directly specify each other, then no "between" variables are causally or epistemically required to mediate this mutual relation. It is for this reason that both action and perception may be said to be direct (Shaw & Bransford 1977b). Indeed, animal and environment as physicalistically understood are both functionally defined, in a distributive fashion, over these equations. No animal construed as a psychological entity exists in the nether world between the equations; hence there are no formal hooks upon which to hang the ghostly garb of "internal states."

by Aaron Sloman
School of Social Sciences, University of Sussex, Brighton BN1 9QN, England

What kind of indirect process is visual perception?

Introduction: historical note. It is hard to disagree with the main points of Ullman's paper. Even Kant (1781) pointed out, in opposition to empiricist philosophers, that perception requires a "manifold" of sensory data to be segmented (to separate objects), grouped (to link parts of the same object), classified in accordance with flexible schemata (for example, dogs, trees, and polygons come in many
shapes and sizes), and related (spatially, temporally, and causally). He argued that perception required a process of synthesis not unlike what occurs in imagination. All of this, he claimed, required a massive contribution of the mind (or brain?), determining the general forms and limits of possible perceptual experiences. Gibson produces no serious rival explanation of these facts of common sense.

We certainly do not perceive things the way physicists tell us they "really" are. The properties and relations we perceive are not those described in quantum theory but abstractions useful for planning, executing, and monitoring actions, for recognising individuals and classes of individuals, for forming useful generalisations, and for invoking and monitoring higher-level perceptual processes satisfying these needs. It is hardly disputable that what is perceived is in part determined by inherited abilities (for example, seeing faces), in part by learning (for example, seeing your mother's face, the structure of a flower, or the nuances of a dance). We all know how what we see can also depend on circumstances, such as how tired we are, what we want or expect to see, and the like. Any claim that perception is direct therefore either implies that what physicists tell us about the world is false, or it uses a very peculiar sense of the word "direct," perhaps (as Ullman suggests) merely indicating what Gibson finds interesting and worthy of analysis. (I am not disputing that in other respects Gibson has made very useful contributions to the study of perception.)

In view of all this I find it hard to consider the claim that perception is direct as a serious contribution to psychology, and will therefore restrict myself to minor quibbles over details of Ullman's arguments and the MIT view of visual perception, after making a small point in partial support of one of Gibson's claims.

Perceptions and sensations produced in parallel. In section 3.2.1 Ullman mentions Gibson's theory that perceptions and sensations are produced in parallel by different processes. This could be true even if his claim that both are "direct" results of external stimulation is false. Processes of perception can be distinguished from processes of sensation, namely becoming aware of the sorts of things usually referred to by philosophers as "sense data" (features and relations in the two-dimensional visual field, such as coloured patches and the elliptical appearance of circular objects viewed obliquely). Gibson was in part reacting to philosophers who claimed that perception involves inferences or constructions based on conscious processes of sensation. But normally the latter processes do not occur during perception: for instance if we are not painters or philosophers we may never notice anything elliptical, nor discern acute and obtuse corners, when we see a penny on a table. It requires special training to become aware of the contents of the visual field, as opposed to the contents of the environment.

Thus Gibson was probably right in saying that perception and sensation (that is, awareness of sense data) are independent processes, even if he was wrong in denying that either of them requires complex constructive (but unconscious) processes. They are independent only in that each can occur without the other. Of course, granting Gibson this point, not acknowledged explicitly by Ullman, does not undermine Ullman's other criticisms. The independence of the two processes does not rule out their sharing many unconscious "low level" subprocesses of feature extraction, description, and interpretation. Thus they can be parallel without being "direct" in any interesting sense.

Beware of mathematically tractable special cases. In discussing the recovery of shape from motion (3.2.2), Ullman notes that when the human visual system is presented with a mathematically adequate though impoverished stimulus it will not always perceive the correct structure. He seems to interpret this as due to a failure of the visual system to pick up the available information, and he then launches into a discussion of physiological processes involved in registering properties of the optic array and producing binocular fusion. But it is possible that the failure of the human visual system to use available information may not be due to a failure to pick up the information. Ullman does not, for instance, consider the possibility that human perception of moving shapes primarily uses mechanisms and strategies appropriate for nonrigid motion, such as changing facial expressions, a closing fist, a peel being pulled off a banana. This generally requires more information than rigid motion: and a failure to cope in the situations mentioned may be due to the fact that the mechanisms (or algorithms) require more information, even though mathematically such information is not necessary for the perception of rigid motion. Of course, such a system would be able to cope with rigid motion as a special case, when provided with enough information, just as the ability to see curved lines and surfaces may enable straight and flat ones to be perceived as special cases. Notice how few points are required mathematically for "perception" of these special cases: the fact that two points define a straight line may be of no use to a visual system that has to be able to decide whether the line is straight or curved. So, an adequate analysis of the failure requires a fuller discussion of the difference between failing to pick up information and failing to use it.

This note of scepticism concerning Ullman's theory of motion perception does not undermine his discussion of processes by which visual information is picked up. However, I suspect that any account of perceptual processes which can readily be expressed in terms of physiological processes, without the need for higher level "virtual processors" (see below) would be regarded by a Gibsonian as a theory of "direct perception." Stronger anti-Gibson arguments are needed.

One of my favourite anti-direct-perception demonstrations is the well-known example shown in Figure 1. Many people (the exact percentage is irrelevant), when first confronted with this can stare at it for several minutes without seeing anything wrong, despite repeated exhortations to look carefully. The failure to perceive the printed words correctly does not imply that there is any failure in the lower levels of the visual system to pick up the relevant information. (It is interesting that some people discover what is wrong spontaneously if asked to shut their eyes and count the words in the triangle. They often cannot say thereafter which occurrence of "THE" they had previously seen.)

Figure 1 (Sloman). What is wrong with this figure? [A triangle enclosing the words "PARIS / IN THE / THE SPRING" on three lines.]

Common observation of human abilities and inabilities suggests that there are many different levels and subprocesses in which things can go wrong, and a study of different sorts of perceptual errors can help to show just how wrong Gibson's theory is. For instance, the "double take" phenomenon (thinking you've seen X, then quickly and spontaneously realising it was Y after it has moved out of view) lends support to the extended Kantian theory sketched in chapter 9 of Sloman (1978) and Sloman and Owen (1980) that perception involves processing many domains of structure in parallel, with partial results in each domain constraining searches in others. This organization partly accounts for flexibility and graceful degradation in difficult circumstances, such as occluding objects, poor lighting, fog, intervening bushes, eye defects, and the like. A theory of direct perception cannot explain such abilities except by vacuous invocation of unspecifiable invariants, and invariant detectors with a magical ability to cope with novel and difficult circumstances.

If the visual system jumps to conclusions on the basis of both partial information and (for the sake of speed) partial analysis at higher levels, this may normally work if the space of possible shapes is sparsely instantiated in the actual world: For example, not all shapes intermediate between a sheep and a horse, or a horse and a giraffe, are found. However, it requires the system to deploy knowledge about which shapes are instantiated, in order to use the redundancy in the optic array. Some sorts of perceptual mistakes suggest that we do
indeed deploy such knowledge. But that is inconsistent with any theory that perception is direct.

Dropping out of consciousness. It is curious that Ullman has to rely (in section 5) on Schrödinger's idea that processes perfected in the course of evolution drop out of consciousness. Isn't it a commonplace that many processes perfected through painful individual learning drop out of consciousness - for example, reading, playing a musical instrument, sight-reading music, following a spoor in a jungle, driving a car, perceiving botanical or geological structures? To a suitably experienced person, these processes have the same subjective ease and immediacy as the simplest perceptions. The same is true of looking through a peephole at a static scene, where the lack of stereopsis, parallax, and optical flow causes ambiguities about relations between objects which cannot be resolved without prior knowledge. It is quite remarkable how little the absence of these ambiguity resolvers affects our perception of scenes involving familiar objects. Try, for instance, covering and uncovering one eye, repeatedly, with your head quite still. A small peephole will help to eliminate information provided by accommodation and head movements. Ullman seems to grant too much to Gibson. For despite the fact that the optic array in such cases contains an enormous amount of information (if lighting is good, fog and smoke are absent, and so on), it is still inherently ambiguous about occluded parts of objects and relative depths of separate objects. So the fact that we see a specific scene implies that we go beyond available information, contrary to Ullman's claim that "the role of the processing is not to create information, but to extract it, integrate it, make it explicit and usable." Why this refusal to admit that creative inference plays a role in vision?
I suspect that it arises out of a desire for theories concerned with mathematically tractable, unambiguous, information extraction, which in turn is closely bound up with the methodological position Ullman derives from Marr and Poggio. I shall criticise this in the next section.

All this suggests that it is no accident that we find the interpretation of paintings and drawings so easy: infants require no specialised training, because the processing of inherently ambiguous and impoverished information in the light of prior knowledge is a normal part of perception. These facts seem to be more convincing than the example Ullman offers against Gibson, namely stereopsis (though his point about degrees of directness is a good one). As I've already suggested, Gibson might be happy to describe stereopsis as "direct" if based on the sorts of physiological mechanisms indicated by Ullman. Why doesn't Ullman use the more obvious and powerful arguments against direct perception? Is it related to his overall methodological position?

Are there three levels of understanding? In section 5, following Marr and Poggio, Ullman sketches the methodological assumption that it is important to distinguish three levels of understanding: function, algorithm, and mechanism. I think this assumption is confused and fails to acknowledge some important lessons from computer science and artificial intelligence. Moreover, it threatens to divert attention from difficult and messy problems in psychology to relatively simple mathematical problems.

First of all, the alleged top level cannot be usefully separated from the level of algorithms and the study of representations. For instance, consider the favoured example of pure number theory: for centuries the specification of algorithmic processes (for finding factors, solving equations, and so on) has been central to the theory.
That is the source of our concept of an algorithm! Further, the abstract properties of representations and operations on them have always been central to the theory of numbers, for instance the relation between representing a number as a sum of powers of 10, a product of powers of primes, a sequence of applications of the successor function, and so on. Even the relations between these abstract structures and algorithms and the more concrete notation-specific instantiations are very intimate. That is why some philosophers of mathematics have been tempted to analyse mathematics as concerned with nothing but formal manipulations of symbols. We see then that for number theory at least the distinction between the top level and the level of algorithms breaks down completely.
Further, the alleged distinction between algorithm and mechanism fails to take account of the important notion of a "virtual machine." A physical mechanism (a calculator, or computer, or brain, perhaps) may instantiate a particular virtual machine which can be used as a basis for implementing other virtual machines (using programs which define operating systems, compilers, interpreters, and so on). There can be many layers of different superimposed virtual machines, and the structure need not even be hierarchic (if, for example, a relatively high level program is called as a subroutine from inside the microcode of a computer). (Compare Sloman 1978, chs. 1, 6, 10.) Many of the most important issues in AI have been concerned with the study of trade-offs between different virtual machines for a particular function, such as trade-offs between space and time, efficiency and flexibility, efficiency and modularity, completeness and speed, clarity and robustness. It is possible that such computational trade-offs are the key to much of the complexity of human and animal psychology, and ultimately neurophysiology. If so, it may be a serious impediment to scientific progress to advocate an oversimple methodological stance. The calculator example of section 5, for instance, is dangerously misleading, because the rigidity of function of a typical calculator makes it unnecessary for our understanding of it to involve consideration of many layers of implementation or the kinds of trade-offs and mixtures of levels found in human psychology. By contrast, when we study human arithmetical expertise (acquired after many years of individual learning), most of the mathematical theory of numbers is an irrelevant digression. Instead we have to consider issues of storing many "partial results," indexing them, linking them to methods of recognising situations where they are applicable, associating them with monitoring processes for detecting slips and mistakes, and so on (Sloman 1978, ch.
7). Similar issues arise in the study of human expertise in producing and understanding a natural language: instead of a mathematically elegant formal grammar, a typical speaker seems to use a huge collection of not completely consistent partial rules and heuristics for deploying them. I believe that this is an inevitable consequence of the need for rapid performance, and reliability in circumstances with varying amounts of noise and degradation of sentences produced by other speakers. The same messy kind of complexity would characterise much of visual perception, for much the same reason, even if the lowest levels of the visual system, discussed by Ullman, are an exception, embodying knowledge which can be safely compiled into "hardware" because the physics and geometry of light and many sorts of surfaces are constant in all visual environments. Variable aspects of the environment will need to be dealt with in a different way, mediated by considerable individual learning.

In short, Neisser's unease with "processing and still more processing," quoted approvingly by Ullman (in section 5), may in fact turn out to be unease with a central feature of human psychology.

Is subjective experience a complete mystery? In section 5 Ullman claims that experience is a mystery, despite recent attempts to remove the mystery (for example, Dennett 1979 and ch. 10 of Sloman 1978), to say nothing of the much older paper by Minsky (1968). Important steps have been taken by work in AI showing how in principle internal processes can occur which reflect some of the phenomenological structure of visual subjective experiences - for example, the experience of certain things forming a totality, of one thing being above another, of an edge appearing convex or concave. Of course this work is in its infancy, but it is so far ahead of anything previously available that to say we are still faced with a "complete mystery" is misleading.
For instance, we can now begin to see how other aspects of subjective experiences can be accounted for within the computational/representational approach. The phenomenology of emotional states such as anger, terror, or embarrassment requires the use of additional computing concepts, such as priorities, resource allocation, and interrupts. To illustrate: a characteristic of heated emotional states, such as anger or embarrassment, is that attempts to think about something else constantly fail, suggesting that a process of resource allocation is using something like priorities and interrupts. I am currently engaged in a more detailed study of such experiences in collaboration with a research student, Monica Croucher. It is important that in a
Commentary/Ullman: Against direct perception journal such as this the claim that subjective experience remains a
on psychophysics and the opposition against "elementarism" in
complete mystery should not go unchallenged. However, this is not the time for a more detailed discussion. II is worth noting that there will always be a residual area of morel disagreement over whether the mystery has been removed, since for example the question whether a robot has subjective experiences is in part a question of how it ought to
psychology.
be treated. Disagreements of that sort. whether concerned with
It is difficult to discuss Gibson's opinions, because he has altered his point of view continuously during his series of publications. I had the opportunity - I would rather say the advantage - of working with him in
the decade 1956-1966 during which he prepared his book on the senses (Gibson 1966). I left Cornell by the end of 1957, and something
machines, animals, or people, cannot be eradicated by science or
of our discussions might have been reflected in that book. His new
logic.
work "The Ecological Approach to Visual Perception" has no t yet been available in bookstores and libraries in Finland. Some basic trends, however, do persist throughout all o f his scientific work. I doubt
byK. von Fleandt o..,.nmsnt ofCleneratP•rchorogy, UniverolfyorHelainlcl. (}()ITOHe/1/nkl, Finland
In defense of lnvarlances and higher-order stimuli Take a three-dimensional wire Necker cube with one edge extended so that one can grip it. Hold the cube steady against a homogeneous
white wall. Shine a light so the cube casts a focused shadow on th e wal. If you keep rooking at the shadow instead of the actual cube, you will see a two-dimensional pattern. Now start slowly rotating the cube.
At once you ge t a compelling experience: there is before you a three-dimensional object, a shadow-cube rotating inside a diffuse
whether we are justified in calling his approach a "theory of
direct perception." His claim, as was that of Gestalt psychology, for direct, immediate observing was originally raised agaiost the "elemen
tarism" of early European associationists. He rejected the idea of point to point correspondence between "proximal stimuli" as single isolated elements and the resulting perception. Instead of a "constructed stimulus correspondence" in the sense ot Muller ( 1924), he introduced the concept of "higher order stimuli." Gibson assumed that situations and experiences are immediately described instead of being inferred from mosaics of "sensations." In this latter aspect of his approach he was a lrue phenomenologist. Relying partly on subjective experience is an indispensable ingredient in perceptual research, and it is not' a
surrounding "space."
bigger "mystery" than are the assumptions of "computers," "compa
This simple demonstration clariHes the role of invariances involved in the stirrotus pattern, a central issue in the Gibsonian approach as examined by UUrnan in his target article. For the purpose of theoretical desc ription one can calculate as Johansson ( 1964) did, all the
rators," or "internal representations." The hypothesis of higher order stimuli enabled Gibson to return to the basic concepts of psychophys
ics. The higher order stimuli embodied the invariances characteristic of any situation. Tq come back to the Necker cube experiment: the
simultaneous linear horizontal and vertical transtormations which take
three-dimensional "solid" impression builds upon a sample of invariant
place in the moving shadow pattern and build up a spatial-geometrical
relationships inherent in the transforming pattern.
projection system to account for "what is programmed by the visual computer."' Or, alternatively, one can confine the description to what is "directly experienced." When the shadow starts moving, its straight
More dubious. in my opinion, is Gibson's idea that a retinal geometry
phases that seem merely chaotic. As soon as the person sees the
of the stimulation can ba "picked up" as such by "the organism." His conviction on this point led to some lively discussions between us. During our joint "shadow caster experiment" (v. Fieandt & Gibson 1959) I did not concur in interpreting perspective transformations as
shadow a s three-dimensional, the chaotic "component movements"
being directly registered by the eye, but I was enthusiastic about the
line components undergo a variety of stretching and condensing
become orderly. This alternative, an impression of a "soDd," implies
several sets of invariances characteristic of elastic versus rigid motion
less redundancy than the two-dimensional shadow pattern. Therefore
Gibson's emphasis on realism and on veridicality of stimulation made it
this is an example of invariances in the variation (Knudsen 1974; v.
difficult for him to discuss ambiguous figures. I often asked him:
Fieandt 1975).
"Would it not be likely that two or more representations of the physical
I will call these two descriptions "data processing" and "stimulus oriented" descriptions respectively. The former term seems justifiable since Johansson (1970) explicitly calls attention to "decoding principles" which he likens to the "programming of the visual computer." There have been various attempts to develop mathematical models to fit stimulus transformations. Among the recent ones is the Lie Transformation Group Model (Tondeur 1965). An example of the model's use for this purpose is found in W.C. Hoffman's (1977) work. The clearest
world compete, and that in most cases just one of the possibilities comes through because of some stronger invariances carried by that
pattern?" But he regarded ambiguous patterns as "extreme cases" of no interest. His theory of invariances was scarcely elaborated until my book (1966) came out (see also v. Fieandt & Moustgaard 1977). It is interesting that he later returns to this conception repeatedly. Gibson never mentioned eidetic imagery [see Haber: "Twenty Years of Haunting Eidetic Imagery" BBS 2(4) 1979], nor did he investigate brain
example of the "stimulus oriented" approach is perhaps found in the
injured patients with intact retinal images. He could easily have tested
work of J.J. Gibson (see, for example, 1950). In any case, we are
his explanations against data from both of these fields.
dealing with only two types of description. There is nothing in either
Ullman's point is therefore clearly in order when he concludes that
approach which could grant it epistemological priority in dealing with
the existence of reliable information in the light array does not entail
the phenomenon. The approach we prefer depends on our conceptual
that processing is unnecessary. We cannot dispense with the processing, the coding, and the decoding phases in the chain of visual perception. The real danger in stressing the role of "decoding principles" is one
background. We do not know enough about the rules of perceptual organization in our hierarchical visual system to aspire to anything
more than explanations by analogy. When it became popular, some twenty years ago, to incorporate the
of taking computer analogies too literally. The human organism is no
research on perception into what was vaguely called a "cognitive
computer, and it will never prove fully understandable in mechanistic
approach," there was reason to believe that computational models
terms. At a certain stage of our technical evolution, "computational
were being applied without sufficient integration of key findings and theories of perception. The paper by Ullman, written in opposition to
metaphors. When magnetism and electricity were discovered they
the Gibsonian stimulus-oriented approach, certainly does not fully
were for a while favored constructs in theories of mind. And at the time
descriptions" might sound appropriate, but they are no more than
support this belief. On the concluding pages he gently admits that both
of Helmholtz the closest analogy to the central nervous system was the
the "Gibsonian" and the "information processing" approaches have
telegraphic wire net. The latest thing in technology will always be
their distinct roles, and there is no talk of any trade-off between them. I am ready to subscribe to this part of his presentation. On the other hand, I should like to point out some concerns with the
reflected in perceptual theory.
terminology of Ullman's paper, especially in the first sections. Some of
at the cognitive and motivational levels of human performance (Kalla
The significance of invariances is quite apart from the trade-off indicated above. These play a decisive role, as has also been shown
Gibson's concepts can be understood only against the background of
1979, 276-85). The invariances in relationships are the objects of
European phenomenological psychology. I am thinking of the emphasis
modern natural sciences.
Note 1. This means carrying out an analysis of the vector field as represented by the moving pattern.
by Walter B. Weimer
Department of Psychology, The Pennsylvania State University, University Park, Penna. 16802
Logical atomism and computation do not refute Gibson
Ullman's criticism of some facets of Gibson's views represents a serious attempt to come to grips with direct realism and the ecological approach to perception. Unlike several recent critics, Ullman is fair and evenhanded, having made a sincere attempt to understand Gibson's views and their motivation. While I am in agreement with the conclusion that the ecological approach is a lasting contribution but that the "immediate" approach must be correctly located in an inclusive cognitive theory, I think Ullman's account is marred by a residual logical atomism and a vacuous notion of computation. But first consider a strong argument against Gibson that Ullman underemphasizes.
Evolutionary epistemology and the adaptive function of constructive processes. If an organism had completed its evolution, that is, were totally attuned to its econiche, the nonconstructive or direct pickup of information from the environmental array would exhaust perception, and there would be no need for cognition beyond perception. Such an organism would literally "resonate" to its environment (and its internal states) "directly." The information picked up would no doubt be extremely abstract, including components far beyond invariants and what has thus far been packed into "affordances." Such an organism would have only tacit knowledge, and explicit or conscious (constructive) processes would have been supplanted by nonconstructive, or tacit and direct ones. The problem for the Gibsonian approach is that organisms have not achieved such adaptation, and the "higher" ones seem to have evolved constructive and conscious processes to account for novel and unexpected occurrences that are the inevitable result of our ignorance. Evolution has built in an (admittedly fallible and imperfect) fail-safe device (cognition, and consciousness in humans) to cope with the unexpected. The problem of novelty is the problem of productivity or creativity, or the infinite utilization of finite means.
The Gibsonian account can be productively adequate only if an infinitude of richness and complexity (meaning and abstraction) is built into the system in advance - by packing whatever "affordances" are necessary into the system. But how did the organism come to know these meanings in its evolutionary development? The direct approach cannot answer - it is adequate to understanding radio receivers, which merely "resonate" to information in the electromagnetic array, but not to listeners who use radios. A radio is a receiver, not a perceiver - it is our cognition which gives meaning to the information the receiver "resonates" to. The pickup of information is one thing, its perception is another. That appears to be the strongest argument against Gibsonian "perception" that does not deny the utility of Gibsonian information pickup. Note, incidentally, that the Schrödinger (1967) argument puts the evolutionary cart before the horse, with a teleological notion that consciousness precedes tacit processing. This makes sense only if consciousness arose initially from tacit processes to correct and supplement the earlier, less adequate tacit processing. Then once consciousness had aided in the construction of new "affordances," they could become direct or tacit in subsequent generations of the species.
Computation versus construction or representation. I have referred to constructive processes rather than computation. This is not a matter of word preference among synonyms. I am a cognitive psychologist, not a cognitive science advocate. Computation has definite meaning only in mathematics (or logic), and is as yet only a vague and misleading metaphor in cognition. For instance, the Gibsonian "mapping" is legitimately a computation in a formal systems sense, but Ullman wants to deny that this kind of computation is a computation. Use of this vacuous term lulls one into a false sense of security,
since in psychology its use is nonexplanatory, representing only one possible description of data to be explained rather than constituting an explanation. Computation is a purely formal (or structural) notion, having to do with the possibility of representing an empirical domain via the syntax or calculus of one or another mathematical system. Thus, to assert that, say, cognition is computation is merely to assert that one or another logical calculus can represent the syntax of cognition [see Pylyshyn: "Computation and Cognition" BBS 3(1) 1980]. Such a claim is at best a promissory note for the distant future, and in the interim is misleading since it begs the semantic questions at issue concerning the nature of the system that is being "represented" by that syntactic process. Thus Ullman's definition of computations as elementary relations plus schemas for combining them ignores entirely the semantic specification of the "empty" symbols that can be structured in computation. Precisely what is at issue is what the "elementary relations" are, and Gibson at least acknowledges the problem of meaning before begging the question. The atomism implicit in the idea that "immediate perception" means "no meaningful decomposition into more elementary constituents" in Ullman's definition begs the question: it is not reductionism that is at issue, but rather the nature of perception. It is the semantics, not the size of the perceptual processes, that is the problem. Since this issue remains unresolved I speak of constructive processes that generate meaning, and representational realism, but not computation. Logical atomism does not refute Gibson.
Intervening variable or algorithm? Associated with computation is the related notion of "algorithm." To assert that the intermediate level between information pickup and pragmatic function is algorithmic adds hubris to question begging.
It is fine to regard this as a level of intervening variables, but not all intervening variables are algorithms. Algorithms are effective decision procedures, and much of this level is plausibly considered heuristic, dealing with novelty and creativity in less than effective fashion, unless one refers to algorithms for heuristics (thereby rendering the term pleonastic). This is an empirical issue that should not be resolved in advance by terminological convention. Gibson's rejection of the computer metaphors of computation and algorithmic effectivity is as correct as his rejection of "snippeling" and the construction of meaning out of nothing.
The abstract specification of information. The conclusion of Ullman's section 3.2.1 is not correct as it stands. The argument for the necessity of abstract specification of information is not the theory of immediate perception (which Ullman correctly notes is only one alternative to "sensation" views), but rather the empirical fact of an indefinitely richly extended domain of perceptual experience and behavior. The problems of creativity (or novelty or productivity) require that the rules of determination specifying the informational basis of perception range over abstract entities. It is the organism's inevitable ignorance - the inability to register (or comprehend) an infinitely rich environment - that determines the abstract nature of perceptual information, or what Hayek (1969) called the primacy of the abstract.
Against computation. My arguments do not vitiate Ullman's conclusion: rather, they attempt to show the inadequacy of the scientism implicit in the computer metaphor in cognition. It is the computation metaphor that renders Ullman's arguments less powerful and less to the point than they could be.
by Wally Welker
Department of Neurophysiology, University of Wisconsin, Madison, Wisc. 53706
Percepts, intervening variables, and neural mechanisms
I liked Ullman's essay. I found it useful and constructive. It provides a clear and reasoned discussion of the importance of seeking to explain perceptual phenomena in terms of more elementary processes and operations rather than in terms of the immediate informational content contained in stimuli. Yet, I am struck by the fact that these divergent viewpoints are still the subject of controversy. Clearly, the controversy continues to be a stimulus for further research concerning adequate valid theoretical frameworks.
On the surface, it might seem that modern brain research is on the side of a perceptual theory like Ullman's, which seeks to elaborate its intervening variables as computational processes in order to achieve increased explanatory power. (Ullman cites examples of neurobehavioral data in support of this aim.) But the language of neural mechanisms - of neuroanatomy and neurophysiology - lies outside the scope of perceptual theory, as Ullman admits. Nevertheless, the brain's machinery is seldom ignored anymore, despite its being viewed as more and more complex and precisely organized, and its interrelated, miniature microcircuits as multidimensionally transactive. There are far more apparently relevant minimechanisms now being proposed than a perceptual theory could usefully accommodate, or even want. I think Ullman's suggestion that visual neural mechanisms now being articulated can "explain" such perceptual phenomena as "stereopsis" is not yet justified. The problem is that over decades, learning, motivation, cognition, perception, and action have been chased about successively, but unsuccessfully, through dozens of structures, circuits, and systems of mammalian brains. Some neurobiologists, perhaps wiser, or tiring of the hunt, are coming to see, feel, cognize, and act as if they believed that perceptual-cognitive-motivational and behavioral circuits and their functions are not isolable or separable in waking animals. In fact, it is even suggested that the functions of a structure or circuit may never be evidenced as such in measurable thought or action, just because of its being enmeshed within complex systems having multiple, distributed polyfactorial transactions. In neural science, as in perceptual science - as Ullman points out - the main concern, even at this late date, now seems to be directed toward discovering just the basic rules of both perception and brain transactions.
Both fields are just at the beginning in their attempts to consolidate and make sense of the mounting mass of sophisticated, technical data and concepts. In both sciences, there are those "experts" who sense the grand synthesis is directly ahead (as perhaps does Gibson), as well as those, like Ullman, who see the tangled knot barely ready for unraveling. Despite the implied good sense and "obvious" rationality of each theoretical framework, I think it relevant to note that psychologists and sociologists tell us that intuition and cultural and educational biases, as well as inherent perceptual, cognitive, motivational, and behavioral biases, are known to affect not only the ways that rational issues such as these are dealt with, but also which viewpoints and methods are adopted and employed. Can we expect otherwise? Can we expect the ultimate complete and adequate theory - in times when science is viewed as a competitive, creative, instinctive art form; where theory is the necessary outcome of curious and analytic anthropoid brain functions, evolved over long periods of time, pressing to explain and rationalize and understand their very selves?
particular models described by Ullman for the kinetic depth effect. In the process we shall raise further arguments against Neisser's (1976) apparent belief in the "accuracy" of percepts, thereby further weakening the basis for his attraction to the Gibsonian position. Consider the Mueller-Lyer illusion. This is a classical demonstration of the effects of local context on the perceived lengths of lines, and is one of many situations in which our percepts are clearly inaccurate (with respect to physical reality); consider also Ullman's references to lightness effects. Helmholtz (1963) has suggested a most direct basis for this illusion - optical imperfections in the eye - a suggestion that has encouraged a number of psychophysical investigations. Briefly, these imperfections cause acute configurations of lines to be filled in more by blurring than obtuse ones. More recently, Ginzburg (1971) has implicated the band-limited channels early in the visual system as playing a part in the blurring. In either case, psychophysical investigations have shown that the perceived strength of the illusion increases
Figure 1a (Zucker). Display of the Mueller-Lyer illusion just out of focus. For purposes of reproduction the display has been quantized to 9 gray levels; the actual experiments were conducted on a display monitor with much higher resolution (256 gray levels).
by Steven W. Zucker
Computer Vision and Graphics Lab
The computational/representational paradigm as normal science: further support
As another advocate of the computational/representational approach, I would like to say at the outset that I agree with Ullman's argument. Yet, Gibson's study of how much information is available ecologically will almost certainly have a lasting impact on theories of visual perception, even though his attempts to extract it have not always been successful. And, from a historical point of view, even his difficulties can be interpreted as playing an important role - they provided what Kuhn (1962) has described as the anomalies (in a current theory) that are prerequisite to a real change in paradigm, or scientific point of view. The new computational/representational paradigm seems to be capable of providing solutions to many of the anomalies in the direct visual perception (DVP) position, by developing the notion of information processing much more deeply than that of direct "resonance." Because these models involve so much more structure than those in DVP, however, a measure of independent support for them seems warranted. The central purpose of this commentary is to provide one such measure of support for the
Figure 1b (Zucker). Display of the Mueller-Lyer illusion substantially out of focus.
substantially with blurring (Coren, Ward, Porac, & Fraser 1978). (See Figure 1.) The above explanations of the Mueller-Lyer illusion were selected
not because they necessarily tell the whole story, but because they are
as direct as (I believe) is possible. In fact, as I shall now argue, by supplementing them with an explicit notion of low-level visual representations and essentially the same computations as those developed by Ullman for the kinetic depth effect, a group of related phenomena can
be explained as well. And, perhaps more important, the absence of a
percept that is in principle directly available can be predicted.
We have described how the Mueller-Lyer illusion increases in blurry
presentations; now, consider that blurring taking place in real time.
That is, consider viewing a sequence of images rapidly enough for
Figure 3 (Zucker). The angle varied in time to produce motion of the
central vertex. The moving vertex is indicated by the arrow.
apparent motion. Do you expect to see the central vertex moving in
from a second experiment, in which the angle of the arrows, rather
you would, because this vertex can be detected immediately as the
marking a line termination, and it is perceived in motion (see Figure 3).
accord with these strength changes? DVP would probably predict that darkest point in the central arrow. (Such explanations are of the same
than the blur, is varied in time. In this case the vertex is explicitly As one paradigm supplants another, it must be shown how the new
type as those first presented for the kinetic depth effect.) This is not
solutions explain the old anomalies without immediately creating too
effects are visible. In particular, a common motion percept is that of the
in support of both representations in general (even early on in
the case psychophysically, however, although several other motion
entire illusion moving toward each observer in depth as it becomes
more blurry, and away as it becomes more focused. And, in the convex
portion of the illusion, the inside blur profile edges are usually perceived as approaching (with increased blur) or receding (with decreased blur)
many new ones. My purpose here was to provide additional evidence perception) and the computational explanation of the kinetic depth
effect in particular. Such computations were proposed essentially because various components of the description were made explicit, including, in particular, such (low-level) constructive entities as edge
from one another. Some observers see both of these effects concur
orientation and profile extent. Gibson has provided us wi t h the begin
the related material described below, see Zucker 1980.
can be obtained.
rently. For a detailed description of these experiments, together with An explanation of these effects is straightforward from the computational/representational
point of view. If we accept Marr's (1976) position
that early visual representations make image intensity changes explicit, then the early description of the blurry images is essentially one of edges surrounding smoothly varying regions (see
Figure 2). Furthermore, the positions and sharpness parameters (or
nings of a functional foundation for vision; now we must show how it
by Carl B. Zuckerman
Psychology Department, Brooklyn College, Brooklyn, N.Y. 11210
What are the contributions of the direct
their eQUivalents) associated with these edges are the quantities that
perception approach?
lion can be based, and schemes like Ullman's, as they were developed
direct visual perception
change in time. It is upon these descriptions
that the motion computa·
for the kinetic depth effect, can be adopted to function successfully
here as well. Moreover, it would appear that the vertex is not explicitly
available to the motion computation because, if it were, then one would
expect to see it in motion. Validation for the latter claim can be derived
Ullman has provided a comprehensive and Incisive critique of Gibson's
(DVP) theory. I would like to add
to Ullman's
presentation a brief discussion of two questions - what is really new in
DVP, and how successful has it been in enlarging our knowledge of perceptual experience?
From the publication of The Perception of the Visual World in 1950
until his last book (Gibson 1979) Gibson has promised a radically new conception of the nature of perceptual processes; where the novelty lies, however, is not immediately evident. Gibson's earlier stress on
perception as a function of stimulation was not new. The discovery of
stimulus properties for perception is a fundamental activity throughout
the history of perceptual science. Progress is made where the relevant
nature of the stimulus is described and demonstrated to be effective (Wheatstone's 1838 analysis of binocular disparity is a fine example).
Figure 2a (Zucker). The edges outlining the illusion in Fig. 1a. They were computed by convolving the edge operator, described in Marr (1976), with the image, and then displaying the edge positions in white (that is, the zero-crossings of the operator) together with the nonzero values (in black). Note how, in this representation, the exact placement of the vertex is difficult.
Nor was Gibson's stress on higher-order variables new. Stimulus relationships, ratios, invariants, and the like were basic factors for Gestalt psychology. Michotte (1963) investigated the stimulus conditions for experienced causality and the perception of animal locomotion; others have studied the impression of depth arising from moving configurations (Musatti 1924; Wallach 1976).
Gibson's more recent treatment of perception as the direct pick-up of information provided by the optic array is not a completely new
departure. For many aspects of perceptual experience an adequate explanation (at the psychological level) is given by a description of the
properties in the incident light which are correlated with the percept
(the color qualities of flowers or human skin are explained by the way
these surfaces act on the incident light - cf. Evans 1948). This stimulus
information is directly picked up - no further psychological analysis or "decomposition into more elementary constituents" (Ullman) seems
necessary.
Where Gibson is different is in his claim that the perception of the
spatial layout of the environment is also direct. Most, if not all,
perceptual theorists have taken for granted that stimulus information is
Figure 2b (Zucker). The edges outlining the Illusion in Fig. 1b. Note
how the edge profiles have moved outward. Such positional changes
are consistent with the two kinds of motion that are actually perceived.
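The outward shift of the edge positions noted in this caption can be illustrated with a minimal one-dimensional sketch. It assumes a unit-height bar as a stand-in for one stroke of the figure, a Gaussian as the blur, and the zero-crossing of the second derivative as the edge marker; these are illustrative assumptions and not the operator or display used in the experiments, and the function names are hypothetical.

```python
import math


def gaussian_derivative(x, sigma):
    """First derivative of a unit-area Gaussian with standard deviation sigma."""
    g = math.exp(-x * x / (2.0 * sigma * sigma)) / (sigma * math.sqrt(2.0 * math.pi))
    return -x / (sigma * sigma) * g


def second_derivative_of_blurred_bar(x, half_width, sigma):
    """Second derivative of a bar on [-half_width, half_width] after Gaussian blur.

    The blurred bar has first derivative G(x + a) - G(x - a), so its second
    derivative is G'(x + a) - G'(x - a), with a = half_width.
    """
    return (gaussian_derivative(x + half_width, sigma)
            - gaussian_derivative(x - half_width, sigma))


def right_edge_position(half_width, sigma, tol=1e-9):
    """Locate the zero-crossing on x > 0 by bisection.

    The second derivative is negative at the bar centre and positive far
    outside the bar, and it changes sign exactly once in between.
    """
    lo, hi = 0.0, half_width + 10.0 * sigma
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if second_derivative_of_blurred_bar(mid, half_width, sigma) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Edge position of a bar of half-width 1.0 for increasing amounts of blur.
for sigma in (0.25, 0.5, 1.0, 2.0):
    print(sigma, round(right_edge_position(1.0, sigma), 3))
```

Under these assumptions the marked edge sits at the physical edge of the bar when the blur is small and moves outward as the blur grows, which is the kind of positional change the caption describes.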
insufficient for the perception of a three-dimensional world in which
objects are experienced as having constant properties of size, shape,
color, and stability despite changing conditions of observation. The
stimulus has to be processed and supplemented - according to the particular theory - by spontaneous organization, memory, unconscious
inference, computational processes, and so on. For Gibson none of these mediating processes is needed. Another factor which differentiates Gibson is his concern for ecological validity. Just as the ethologist wants to study the behavior of an animal in its natural habitat, so Gibson is interested in perception in its normal "habitat" - the environment. Therefore many of the problems which have been studied in exhaustive detail - such as geometrical illusions, spectral colors - are of little interest to him. Since he is not satisfied with the traditional account of stimulus information in terms of cues or the explanation of the constancies on the basis of computationlike processes, Gibson needs to discover and describe stimulus properties that can be picked up directly. How successful has he been in this endeavor? In Gibson (1950), texture gradients were proposed as the basis for the perception of distance and for the immediate perception of size-at-a-distance and shape-at-a-slant. But empirical work has failed to demonstrate the validity of this proposal.
Recently, Rock et al. (1978) have shown that the texture gradient is
fied, there is no reason to limit theoretical possibilities by assuming that all perception is direct. Psychology has left behind the wreckage of many conceptual systems which attempted to force recalcitrant data into a narrow framework. In the past decade non-Gibsonian
approaches have greatly advanced our understanding of how we perceive a stable environment when we are moving about (for example, Wallach, in press). It remains to be seen whether DVP can compete.
Author's Response
by S. Ullman
Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, Mass. 02139
Perception, information, and computation
neither necessary nor sufficient for apparent depth in pictorial materials, and there is little evidence that it is the basis for obtaining distance impressions in a real scene.
Gibson (1979) refers to his open field experiment in which subjects judged the height of a stake placed in the ground as far as a half-mile away. Judgments of size became more variable but they did not decrease with distance. "Certain invariant ratios were picked up" - "no matter how far away the object was, it interrupted or occluded the same number of texture elements of the ground" (p.160). A second invariant ratio was "the proportion of the stake extending above the horizon to that extending below the horizon" (p.160). In another experiment, observers were able to bisect accurately a stretch of distance because "they might have been detecting, without knowing it, the amount of texture in a visual angle" (p. 161). But "might" or "must have been noticing information" are not evidence that such ratios are indeed the effective stimulus information for the percept. Usually the investigator does not stop with the perceptual performance. The situation is varied, information is eliminated or changed in order to isolate the property of the stimulus which is correlated with the perception. Perhaps Gibson's ecological attitude precludes such analysis.
The detection of invariants seems to imply for Gibson that the constancies are complete - when, for example, the observer notices the ratio between the extent of the stake above to that below the horizon, size perception should remain unchanged at different
distances. But everyday observation as well as experimental research shows that at great distances objects do look small. Does the pick-up
of invariants explain both complete and incomplete constancy? After the 1950 book, Gibson began to stress the importance of the moving observer. The traditional assumption that stimulation is ambiguous arises, Gibson argued, from the standard research methods where a momentary stimulus is given to a fixed eye. In this case the frozen optic array is impoverished stimulation. When the observer walks through the environment, turns his head to look at a particular surface, or approaches an object, a transforming flux of stimulation is created which is no longer ambiguous. The visual system extracts the invariants from the transforming perspective structure. In addition, object movement can provide information about the three-dimensional layout. There is much of value and promise in this program, but as yet neither the nature of the invariants nor the question of their direct
pick-up has been clarified. In some of the examples of changing arrays, it is not true that the transformation uniquely specifies a particular percept. An expanding array is not unambiguous information for an approaching object (1979, p.175) - it could lead to the perception of an expanding object at a constant distance. (According to Gibson, "The object did not seem to get larger." This is another case in which Gibson assumes perfect constancy. The expanding silhouette can be seen as approaching together with some awareness of expanding size.) Even if invariants in the flowing optic array are successfully identi-
It would be impossible to respond adequately to the full variety of issues discussed in the commentaries. I shall therefore concentrate on a number of general topics that were central to many of the arguments.
1. The "information content" of the visual array. In my discussion of direct perception I made a distinction between the analysis of the information contained in the visual array on the one hand, and the direct pickup of this information on the other, and argued that the two problems should not be identified (see also section 2 below). Since the information content problem was nevertheless central to a number of the commentaries, I would like to outline briefly my view on this issue. The information content problem encompasses several related questions. The first is whether the visual array contains sufficient information to specify its source uniquely. If it does not, then the second question is: what additional sources of information are available to the perceiver? Finally, there is an empirical question which one may call "information use." That is, what sources of information are actually used by the human visual system? To obtain a meaningful answer to the first question, the "possible sources" of the visual array have to be specified. It is clear that if arbitrary three-dimensional environments, changing in arbitrary ways, are considered, the visual array cannot possibly specify its source uniquely. This ambiguity holds even for a dynamic and binocular visual array. If, however, only physically plausible sources are considered, then the visual array is often sufficient to specify its source. Much of the ambiguity disappears by taking into consideration constraints imposed by properties of the physical world that govern the formation of the visual array. An illustrative example is the rigidity constraint in the perception of three-dimensional structure from motion, described in section 4 (of the target article). A transformation of the image is, in general, insufficient to specify its three-dimensional source uniquely.
It turns out, however, that if the transformation is induced by the motion of a rigid object (or a number of rigid objects), then the three-dimensional object and its motion in space are uniquely determined by the changing image. This uniqueness is the result of the following properties of rigid objects and their projections: some transformations of the visual array can be induced by rigid objects in motion; others cannot. If a given transformation is compatible with a moving rigid object, then the generating object is unique: it is impossible for distinct objects in motion to produce the same transformation of the image. Finally, the
Response/Ullman: Against direct perception
probability that a nonrigid motion will happen to produce a rigid transformation of the image is essentially zero. (For a proof of these properties, see Ullman 1979a; 1979b.) The result is that rigid objects in motion are uniquely determined by their changing projection (and, for a moving observer, so is the three-dimensional structure of the environment). An alternative formulation of the same conclusion is that the visual stimulus is insufficient by itself to determine its three-dimensional source, but when supplemented with the appropriate rigidity constraint, the interpretation (that is, the environmental source) becomes unique.

The information content problem has often been formulated in terms of two alternatives (see Johansson et al.): either the environment is completely specified by the visual array alone, or else the interpretation is not unique, and has to rely on reasoninglike activity using past experience. This formulation of the information-content problem would be misleading in the above example. The interpretation, although not strictly specified by the visual array alone, does not require prior familiarity with specific objects, or explicit reasoninglike activity. (See section 4 for more discussion of this point.) The "environmental source" is determined uniquely by incorporating what may be called "reflective constraints" - that is, constraints that reflect physical properties that are generally valid in the environment. (See also the commentaries by Braddick and Rock. For a detailed discussion of reflective constraints, see Ullman 1979c, sections
4.2-4.3.)
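The role of the rigidity constraint discussed above can be illustrated with a small sketch. It is not the structure-from-motion computation analyzed in Ullman (1979a; 1979b). It only checks, under illustrative assumptions that do not come from the text (orthographic projection, a hypothetical four-point object, rotation about the vertical axis), that two different three-dimensional interpretations can yield exactly the same images, and that a simple test for rigidity accepts one and rejects the other. All function and variable names are hypothetical.

```python
import math
from itertools import combinations

def rotate_y(point, angle):
    """Rotate a 3-D point about the vertical (y) axis."""
    x, y, z = point
    c, s = math.cos(angle), math.sin(angle)
    return (c * x + s * z, y, -s * x + c * z)

def project(point):
    """Orthographic projection (an assumption): keep x and y, drop depth."""
    return (point[0], point[1])

def pairwise_distances(points):
    """Distances between every pair of points in one frame."""
    return [math.dist(p, q) for p, q in combinations(points, 2)]

def is_rigid(frames, tol=1e-9):
    """True if every inter-point distance is the same in all frames."""
    reference = pairwise_distances(frames[0])
    return all(
        abs(d - r) <= tol
        for frame in frames[1:]
        for d, r in zip(pairwise_distances(frame), reference)
    )

# Hypothetical object: four non-coplanar points, seen in three frames.
obj = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
rigid_frames = [[rotate_y(p, a) for p in obj] for a in (0.0, 0.3, 0.6)]

# A second 3-D interpretation with identical images: same (x, y) for
# every point, but a different depth for one point in the last frame.
other_frames = [list(frame) for frame in rigid_frames]
x, y, z = other_frames[2][3]
other_frames[2][3] = (x, y, z + 0.5)

same_images = all(
    [project(p) for p in f] == [project(p) for p in g]
    for f, g in zip(rigid_frames, other_frames)
)
```

In this sketch `same_images` is true, so the images alone do not distinguish the two interpretations, while `is_rigid` holds for `rigid_frames` and fails for `other_frames`. The constraint, not the image, removes the second candidate. The sketch does not establish uniqueness; that is the content of the proofs cited above.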
Reflective constraints do not provide the full answer to the problems of information content and information use (see, for example, Rock and Sloman). They seem likely, however, to play a major role in what may be called "extensional perception." By this I mean the perception of the three-dimensional structure of the environment and the changes of this structure over time, but not, for example, the recognition of particular objects or the reading of English text. When reflective constraints are used, it may be misleading (though perhaps not strictly wrong) to say that the perceptual system drew unconscious conclusions, based on the visual stimulus as well as on implicit assumptions, or that it was engaged in problem-solving-like activity. These views are basically in agreement with at least parts of ecological optics. They do not imply, however, that the visual array is strictly sufficient to specify its source, even in the case of extensional perception.

2. Information content and information use. The problem of information content should be distinguished from that of information use. The former is a study of the optical array, while the latter is the study of the information used by the human visual system. The use of eye movement information (Bridgeman and Gyr) illustrates this distinction. Certain transformations of the visual array are more likely to be caused by the perceiver's own motion than by movement in the environment. Consequently, it has been argued (Gibson 1966; Turvey 1979) that changes in the visual array provide sufficient information to determine the perceiver's eye movement. Turvey (1979) has argued on the basis of this analysis that there was no "evolutionary pressure" for the use of ocular information (a term I shall use to denote nonvisual information concerning eye movements, including motor feedback as well as corollary discharge).
This argument may have some validity, but it cannot rule out the possibility that ocular information is in fact used by the visual system. A possible advantage of nonvisual information is that the transformations corresponding to eye movements may not be easy to "pick up" directly. In any event, as Bridgeman discusses in his commentary, there is strong empirical evidence for the use of ocular information, along with other processes, in visual perception during eye movements. These processes explain
various empirical findings described by Bridgeman and others (see, for example, Mack 1979).

Note that the main relevance of these findings to the DVP (direct visual perception) controversy is not that the additional processes create otherwise unattainable information. Rather, it is that the theory of perception would be lacking without "decomposing" the perceptual process in terms of the more primitive processes and the interplays among them.

The information content analysis is insufficient for another reason: there may be an enormous gap between the availability-in-principle of information in the visual array and its utilization. Maxwell's differential equations, for instance, contain all the essential information for the classical theory of electromagnetism, and Peano's postulates "contain" all the properties of the natural numbers. Yet, it would be inappropriate to describe mathematicians' work in number theory as a direct pickup of the information in Peano's postulates. I disagree, therefore, with the argument (Doner & Lappin, Jones & Pick) that if certain information is contained in the visual input, then all the rest can be adequately described as direct pickup of this information.

Analogous distinctions hold for the information in the visual array. In the case of stereopsis, for example, it has been realized since at least as early as J433 (Polyak 1957, p. 556) that the essence of binocular vision depends on the matching of corresponding elements in the left and right images. What remained to be understood was how the corresponding elements are actually identified by the visual system. The "ecological optics" aspect of binocular vision as discussed by Gibson (1979, ch. 12) has proven relatively straightforward. Its indirect aspects have been, and still are, major problems in the study of perception.

3. Characterizing the case against direct perception. A number of the commentaries centered around a discussion of what dire
perception. Perception plays a role in determining the behavior of animals, and animals' movements and locomotion affect the visual stimulus that reaches their eyes (Jones & Pick, Mace, Prindle et al., Reed, Shaw & Todd).

7. Cognitive problem-solving in perception. Johansson et al. sketch what they consider to be the accepted distinction between direct and indirect theories of perception. This characterization contrasts the "direct recording of information" with cognitive problem solving, or reasoning-like activity, that draws on past experience to supplement the raw material in the visual stimulus. (See Epstein, characterization 4, and Rock for similar definitions.)

Additional characterizations of the DVP controversy that have been offered in the past are listed in note 1 of the target article. They include the role of past experience in perception, the degree to which the environment is specified by a dynamic visual array (see also the commentary by Pradzny), and the difference between continuous optical flow and discrete sampling of the visual array.

In light of these widely different characterizations, is it possible to determine what the direct perception controversy is "really" about? The above characterizations were supported primarily by citations from Gibson's writings. Using this kind of support alone is not guaranteed to yield a unique definition of direct visual perception, since different characterizations may be proposed, corresponding to different aspects of the Gibsonian approach. To keep the discussion coherent, the characterizations that correspond to distinct aspects of the theory should not be confounded. In my paper, I drew a distinction between two separate aspects of the Gibsonian approach: ecological optics and the direct pickup of information. A number of the commentaries objected to the argument against DVP by supporting the Gibsonian approach to ecological optics.
A fundamental difficulty with these arguments is that ecological optics and DVP are not identical, and the former does not entail the latter (see section 2 above). The fact that the visual array is rich in information does not imply that constructs internal to the perceiver have no place in the theory of perception.

The various proposed characterizations can also be examined in light of their relevance to the theory of direct perception vis-à-vis alternative theories. The various characterizations are not all of equal relevance in this respect. If direct perception is defined, for example, as "unaided by instruments," then one can make a strong case for directness in perception. This distinction would not be essential, however, from the point of view of contrasting the Gibsonian approach with alternative theories of perception. The conclusion is that in putting forward a characterization of DVP it is important to evaluate its "discriminative value," that is, whether it captures an essential distinction between direct perception and alternative theories.

One way of assessing the discriminative value of a given distinction between DVP and alternative theories is to consider the following "modification test." Suppose that a proposed characterization is based on a proposition P which is held true in the indirect view but not in the DVP formulation. The question then is whether the DVP theory can be modified to admit the correctness of P without losing its essential character. If it can, then P is not an essential characterization of the DVP controversy.

The discriminative value criterion reveals a weakness in Gyr's emphasis on motor feedback. In Gibson's formulation motor feedback is indeed superfluous. But its role can be acknowledged with only a marginal change in the directness aspect of the theory. For example, one could replace the visual array by a "sensory array" that includes some nonvisual information about the posture and motion of the perceiver.
It can then be argued that the sensory array is only a slight
modification of the visual array, and that perception is still nothing beyond the "direct pickup of the information in the sensory array." (See, for example, the afferent-efferent loops in Reed's formulation of direct perception. They could, in principle, include motor feedback without drastically affecting the directness of the theory.)

It is also unclear whether motor feedback plays an essential role in all, or most, perceptual tasks. A variety of visual tasks, say stereoscopic vision, or the recognition of various objects, can be achieved under brief presentation, without the use of eye movements or other motor activities. If motor feedback is not essential in these tasks, then it does not provide an adequate characterization of indirectness in perception.

It should be noted that Gyr's argument is not invalid: there is good evidence that motor feedback plays at least some role in perception, and this fact does run contrary to the Gibsonian formulation. It is, however, a relatively weak characterization, which bears more on the information content problem than on the directness controversy.

For similar reasons, Pradzny's description of DVP as the study of dynamic perception is also a weak characterization of DVP. Dynamic transformations of the visual array are certainly relevant to ecological optics and the information content problem, and Gibson was a pioneer in stressing their importance. The use of dynamic patterns is not excluded, however, from the indirect theories, such as the computational/representational approach. It fails, therefore, to provide an essential characterization of the DVP controversy.

As to Weimer's definition, I find it difficult to agree with his characterization of cognition as a "fallible device to deal with the unexpected," or the speculation that "when evolution is completed" there would be "no need for cognition beyond perception." His main argument is aimed at establishing the necessity of conscious processes in cognition.
It does not seem to me, however, that the necessity of conscious cognition lies at the heart of the DVP controversy. First, at present this problem appears to lie outside the psychological theories of perception, direct or indirect. Second, the direct and indirect theories also disagree in their account of tacit, unconscious perceptual processes, and consequently Weimer's characterization is lacking in discriminative value.

Mackworth's characterization stresses the role of processes "in between" the initial registration of the visual input and the final percept. This emphasis does not seem to capture the essence of "directness" in perception. It is conceivable, for example, that a theory could explain the internal processes involved in perception, without describing perception as the "final product" of these processes. The relation between internal processes and perception may be more complex, and consequently the "in between" relation is not an indispensable aspect of the indirect theories.

In Mackworth's view the distinction between a one-stage and multistage process in perception is primarily an empirical question. This view is not entirely correct, since the decomposition of a process into substages depends on our description of the process, and thus on the chosen level of description. The main objection to viewing the perceptual process as a single stage is not that it is empirically wrong, but that it does not provide an adequate theory. Finally, it should be noted that perception is not necessarily viewed as a single-stage process in the DVP formulation, since it involves a continuous interplay between the perceiver and his environment. It seems, therefore, that the main distinction is not the "one-stagedness" of perception, but the adequacy of a theory that does not elaborate the decomposition of the internal perceptual process.
The same fundamental distinction applies when the theory of perception is extended to include perception-action interactions. The interrelations between perception and motor activities do not imply that the processes involved are direct.
The role of perception in controlling behavior in fact suggests a possible advantage of explicit internal representations of the environment. An action to be taken in response to visual input often depends on a configuration of the environment, say, that object A is in front of object B. The in-front-of relation may be determined by different kinds of visual information, such as stereo, motion, or interposition. It might be advantageous if "A in front of B" was represented by the same brain event, regardless of its visual source. More generally, the "convergence argument" for internal representations of the environment runs roughly as follows: the conditions that lead to an action can be "summarized" in a fixed internal form that is then used in executing the action. The conditions that determine a motor activity often depend on the configuration and properties of the external environment. Consequently, an explicit representation of some aspects of the environment and its properties may be an efficient "internal form."

The characterization of indirect perception described by Johansson et al. has a number of different components: (i) the information in the visual stimulus is insufficient, (ii) past experience plays a major role in perception, and (iii) the perceptual process is a logical reasoninglike activity that involves making assumptions and drawing inferences. These various components are in many respects independent. As one example: past experience can in principle act by modifying the structure of the system rather than adding to a bank of experiences. In this manner it can conceivably play a significant role in perception without requiring an explicit reasoninglike activity. Using the conjunction of the different components to characterize the indirect approaches fails, it seems to me, to characterize adequately the possible alternatives to the "direct recording of information."
A theory can reject the "direct recording" without necessarily relying only on past experience and cognitive problem solving. The more fundamental distinction (albeit not necessarily the generally accepted one) is that the direct recording of information can be replaced by a systematic theory that uses constructs internal to the perceiver. Such theories may agree with parts of Gibson's ecological optics, but not with the directness of the information pickup.

The objection to DVP that I have discussed, in terms of the meaningful decomposition of the perceptual process, seems to me to provide a strong characterization of the direct perception controversy. Such a decomposition is an essential aspect of the alternative theories, and the DVP theory cannot be modified to meet this objection without losing its essential character. If it is acknowledged that the main concern of the theory of perception is to explain the perceptual "information pickup" in terms of constructs internal to the perceiver (such as internal processes, computations, representations, and the like), then the DVP formulation no longer holds.
4. Meaningful decomposition, computation, and internal representations. In the indirect theories, constructs internal to the perceiver are used in the explanation of the perceptual process. This brings such concepts as computations, processes, and internal representations into the realm of the (indirect) theory of perception. The use of these concepts in the theory of perception has been examined in a number of the commentaries (Grossberg, Hinton, Johansson et al., Runeson, Weimer).

On the most general level, the very use of computational terms in psychological theories has been questioned. Weimer's objection to the standard notion of computation is based primarily on the argument that rule-governed symbol manipulation cannot account for conscious experience. A second objection to the computational approach (for example, Prindle et al.'s) is that algorithms cannot provide proper explanations.

My own view regarding Weimer's objection was stated
briefly in section 5 of the target article and in note 13. I see no strong reason to believe that the computational view in its present form is the ultimate approach to the problem of perception in all its aspects. But this is only marginally relevant to the direct perception controversy. The merit of the indirect approach is not that it provides the ultimate theory of perception, but that, in the current state of knowledge, it provides a more adequate theory than the direct theory (cf. von Fieandt and Welker).

As to the problem of an algorithm as an explanation, it appears to confuse algorithm-as-a-theory with the theory of a computation. An algorithm as the theory may indeed lack explanatory adequacy, even if it correctly describes the behavior of the system it purports to explain (cf. Chomsky 1965). But for a system that performs computation, there can be a theory of the computations it performs, and this theory should not be confused with an algorithm-as-an-explanation.

A number of the commentaries examined the form of the computations used in perception - in terms of whether they should be regarded as coordinated sequential steps, as distributed parallel processing (Hinton), or as dynamic patterns (Grossberg). Although this is not an immaterial problem, I would like to deemphasize its pertinence to the DVP controversy. For the argument against direct perception, the form of the internal computations is of secondary importance. The term "computation" was not intended to mean computerlike operations (Prindle et al.). I have in fact examined elsewhere (Ullman 1979b, ch. 3; 1979d) the possible use of distributed, uncoordinated computations, similar to those described by Hinton. Grossberg prefers the language of differential equations and dynamic patterns, and therefore finds the term "resonance" appealing (see also Runeson for a similar argument).
But his enterprise, if I understand it correctly, is aimed at deriving a theory of the computations underlying perception (as well as other activities), and in this crucial respect it is in accord with other computational theories.

5. Explanatory adequacy and truth value. The objections I have raised against direct perception were based primarily on the explanatory inadequacy of the DVP formulation (cf. Koenderink). A number of the commentaries sought to strengthen the case by attempting to disprove direct perception on either logical or empirical grounds. It seems to me, however, that no such strong case has been made. The additional arguments do not disprove DVP, but challenge it again on the basis of its explanatory inadequacy, relying primarily on what I have described as the psychologically meaningful decomposition of the perceptual process.

Hayes-Roth argues that the tasks performed by the visual system require a certain "representational and computational power." These requirements imply that the perceptual system must incorporate mediating processes and representations; hence direct perception is logically impossible.

There are a number of problems in applying this semiformal argument to the DVP controversy. I shall mention two
here.
First, according to the DVP formulation, it would be wrong to consider the patterns of light stimulation on the retina as the input to the visual process. It is claimed that the visual system can "resonate" directly to invariances in the patterns of light, in which case the recognition of triangles (considered in Hayes-Roth's commentary) could be performed directly. The crucial issue here is whether abstract properties and invariances can, in general, be picked up directly by the visual system. (See section 3.2 of the target article for a discussion of this issue.)

A second shortcoming of the "computational power" argument is that a system may employ some rule for associating inputs and outputs without having a meaningful decomposition. To stay with the semiformal line of argument, consider a Turing machine that instantiates, by virtue of its internal states and transition table, some mapping rule between inputs and outputs. It is possible, in principle, that the only description of this mapping rule would be to give essentially the complete "blueprint" of the machine. The same problem may conceivably arise in a complex networklike system, where the network may be entangled in a manner that prohibits meaningful decomposition. If this were the case for the perceptual system, there would be no significant middle ground between ecological optics and a detailed description at the mechanism level. The fundamental objection to direct perception therefore rests, not in the computational power argument, but in the arguments for a meaningful decomposition and adequate level of description.

Empirical evidence against direct perception is described by Loftus & Loftus. They argue that in metacontrast experiments, for example, the perception of a stimulus A is not completely determined by A alone, but depends also on a second stimulus B, that is presented at a different location or a different time (see also a similar argument by Rock). The acceptance of such empirical results as evidence against direct perception depends crucially on the argument supporting the explanatory adequacy of decomposing the perceptual process. Otherwise, one could, in principle, describe A and B together as a single event C, and argue that C alone determines perception "directly." An extreme example along this line would be the use of a "history function," similar to the one described by Shaw & Todd. In discussing the Mach phenomenon, I have argued that perception can sometimes be naturally described as a function of two "arguments": the visual array and the current interpretation of the observer. Shaw & Todd argue that this is not the only possible formulation.
The "current state" of the observer may itself be a function of earlier stimuli and earlier internal states. By a repetition of this argument, perception could be described without resorting to changing internal states. Perception at any given time t could be described as a function of a fixed initial state (say, the state at birth), and the stimulus history from the initial state up to time t. The main objection to the "history function" formulation is not that it is descriptively incorrect, but that it is unsatisfactory as a psychological theory of perception. The argument centers, again, around the notions of meaningful decomposition, internal constructs, and explanatory adequacy, rather than truth value.

A fundamental distinction between the direct and indirect theories, closely related to the use of "history functions," is the reliance of the direct approach on external constructs. It seeks to explain perception in terms of such constructs as "looking around, getting around, and looking at things," and overt movements of the body, head, eye, pupil, and lens (Mace, Reed). When perception is explained in these terms alone, concepts akin to the "history function" become unavoidable. At the heart of the disagreement here lies the problem of psychologically meaningful constructs. From a vantage point that denies the legitimacy of internal constructs in the theory of perception, DVP is implied. As in all the arguments considered above, the controversy centers primarily around what constitutes an adequate decomposition of the perceptual process in psychologically meaningful terms.
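The formal point behind the "history function" discussed above can be sketched in a few lines. The observer here is hypothetical and far simpler than anything in the Mach phenomenon: its only internal state is a current interpretation ("convex" or "concave"), an unambiguous stimulus replaces that interpretation, and an ambiguous stimulus leaves it unchanged. The update rule and all names are assumptions made for illustration. The sketch shows only that the description with a changing internal state and the description by a fixed initial state plus stimulus history yield the same percepts.

```python
from functools import reduce

def update(interpretation, stimulus):
    """Interpretation after one stimulus (hypothetical rule).

    An "ambiguous" stimulus keeps the current interpretation; any other
    stimulus ("convex" or "concave") replaces it.
    """
    return interpretation if stimulus == "ambiguous" else stimulus

def percepts_with_state(initial, stimuli):
    """Two-argument description: percept depends on stimulus and current state."""
    state, percepts = initial, []
    for stimulus in stimuli:
        state = update(state, stimulus)
        percepts.append(state)
    return percepts

def percept_from_history(initial, history):
    """History function: fixed initial state plus the entire stimulus history."""
    return reduce(update, history, initial)

stimuli = ["ambiguous", "concave", "ambiguous", "convex", "ambiguous"]
stepwise = percepts_with_state("convex", stimuli)
by_history = [
    percept_from_history("convex", stimuli[: t + 1])
    for t in range(len(stimuli))
]
```

Both `stepwise` and `by_history` come out as the same list of percepts. Since the two descriptions cannot be separated by their outputs, the choice between them rests on explanatory adequacy: the first names an internal construct (the current interpretation), while the second hides it inside an ever-growing history.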
References Attneave, F. (1972) Repre.�entatlon of physical space. In: Codlngprocmu tn h..man memory, ed. A. W. Melton & E. Martin, Washington, D.C.: V.Ji. Winston & Snns. [SUI A!lneave, F & Frost, R. ( 19601 The determination of p<m:eived tridimensional
412
THE BEHAVIOOAL AND BRAIN SCIENCES (1980), 3
orientation by minimum crlteriL Perceptll>n and P&ychbphystc� 6(6 B):391-96. [SUJ llnlow, fl. 8. (1972)Single units ond sensation: A neuron doctrine for perc:e(> tual �ychology. Perception 1:371-94. [OJB] Barrow. H. C. & Tenenbaum, J. M. (1978) Recovering intrinsic scene charac terblics from Images. In: Comptsfer t>Uion 1!/flems, eel. A. R. H•ruon & E. M. Riseman. New York: Academic Press. [CEH) Bentley, 0., 6 Hoy, l\. 1\. (1974) The neurobiology of cricket song. Sct�ttfic Arnerk:on 231:3-t-4... Aug. [SU) Betnlleln. N. (1967) Theco-ordination and regt.diJtion of mooement&. New York: Pergamon Preos. [WMM) Blalvas, A. S. (1975) Visual Analysis: theory uf lie group representations. Math· et114tlc41 Btosclcnus 28:4S-67. UWG] Bloclo, B. (1948) A lei of pc»tulat� for phonemic nnn)ysis. lAnguage 24:346. (SJK] (1950) Studies In colloquial Japanese IV: phonemics. lAnguage 26:86125. [SJKJ Brauruteln, M. L. (1962) Oepth perception In rotating dot patterns: effects of numerosity and perspective. journal of E:rpenmontal Psychology 64(4):415-70. [SUJ Breltmeyer, B. C. & Can�. L. (1976) Impllcallons of sustained and transient chAnnels for theories of visual pattern masking, saccadic suppression and Information processing. PaOJChologtcal Reolew 83:1-36. (GULl Btldgtomo.n. 8.: Hendry, 0.; & Stork, L. (1975) Failure to detect d!Jplacernenl o£ the visual world during saccadic eye movements. Vtsino R<#earcla 1!5:719-22. (BBJ Brown. J. F. (i931) The visual perception of velocity. PaOJChologtsche For �e/aunf 14:199-232. [IRI Caelli, T. M.• & Julen, B. (1978) On perceptual annl)'7ers underlying visual teature dilc:rimlnation: part I. Bio/ogfCQI Cybernetic:> 28:167-7!5. (SU) Campbell, F. W. It Robcon.j. C. (1968) Application of Fourier analysis to the visibility of grating... )tmrnal of Phlf'lologu - Lond�m 197:551..{">6. [SUI Chomsky. N. (1007) Review of R. jalcobson and M. Halle·s Fundam«ntal• of lAnguago. lnt.rnatloool )ourn41 of Amerlc<>n Linguutlcs 23:23441. 
(SJKI (19S9) A review of B. F. Skinners Verbal llehaoior. In: Thestrw:tureof lan guage, oo. J. A. Fodor 6 J. J. JCa� Englewood Cliffs, N.J.: Prentice-Hall (t96of). (SUJ ( 1964) Current Issues In linguistic theory. In: The stnu:ture of language: Readtng1 In the P/atloJOphiJ of Language, eel. ]. A. Fodor and ]. ]. KBtz; Englewood Cliffs, New jersey: Prentice-Hall. [SJK] (1965) Aspecll of tlae theory ofsynttJ:r. Cambridge, Mass.: MIT Press. [SUI Chomsky, N. & M. Halle (1968) The &ornad pattern of English. New York: Harper and Row. (SJKI Coren. S.; Ward, L.: Poroe, C.; & Fraser. ll. (1978) The effect of optical blur on vlsual·geomctrlc Illusions. Bulletin of the PsychonorntcSociety I 1(6):39092. (SWZ] Dember, W. N. & Purcell, D. C. ( 1967) llccovcry of masked visual targets by Inhibition of the ma.�king stimulus. SCience 157:1335-3!;. (GilL] Denncll, D. C. (1979) Braln•torms. Montgomery. Vermont: Bradford Books. (AS) Dewey. j. & Bentley, II. F. ( 1949) Knowing and the known. Boston: Dea con. (SSP) Dodwell, P. C (1970) Vuuol pottarn reccgnftlon. New York: Holt. Rlnhort & Winston. (KP] Eden, M. ( 1962) A three·dlmcnslonal optical illusion. Quarterly Progreu Re· port no. fU M.I.T.R.L.E. 2<>"7-74. [SU] Epstein, W. (1973) th� pr� of "taking-into-account"' in visual perception. Puceptlon 2:267-85. [WEI (ed.) (1977a) Stability and constanCIJ In wual perception: rnechaulsm& and prncene1. New Yor\: Wl)ey-lnter$Ciencc. (WE) (1077b) What arc the prosp«ts forn higher-order stimulus theory of per<:e(> tlon? Scondtnaolan journal of Ps�holo&l/ 18:164-11. [WE, SU] (1979) In Ihe eye of the beholder. Psf/Chologtcol Studfu 24:82-97. (WE] (In prep.) The relationship between texture gradient and perceived slanl-in· depth. (WEJ Epstein, W. & Hatfield. G. ( 1978) The !oct>< of masking shape·at-a·slant. Per· ceptton and PsOJChophlflle> 24(6):501-4. JWE) Epstein, W.; Uat6eld. G.; & Muise, G. (1977) Perceived shape ot a slant as a function of processing time and processing lond. 
journal of E:cperlmental Ptvchologu: Human Perception and Perfcmrumce3:473-83. [WE] Epstein, W. & !'ark, J. (1963) Shape constancy: functional relationships and theoretical formulations. Psychologtcol Bulletin 60:265-88. (WEJ (1964) Examination o£ Gibson·s psychophysical hypothesis. Ps�laologtcnl 8ullcttn 62:180-96. !SUI Eriksson, E. S. (1074) A theory of verldlen) space perception. SCQndtnavfan journalofP•ycho/ogu 15:225-35. [SU]
References/Ullman: Against direct perception Evoru, R. M. (1948) Introduction to color. New York: WUey. (CBZ) Evorts, E. V. (1971) Feedback as corollarydi�harge: a merging of the con cepts. Neurcuclencer f!eu4rch Progrom Bulletin 9(1):86-112. UWG] FArber, j. M. & MeConkl.,, A. B. (1979)0ptleal motion u informatlotl'for un· wlgnod depth. joumolofE�rlmeratol Pri}Chologv. Hurnon Perception and I'•I}ChophJJIIC8 :1(3):494-500.
(SU]
Fehrer, E. & Rub, 0. (1962) Reaction time to stimuli masked by rnetacon trasl. journal of E�rlnumtal P•fiCho/ogy 63:143-47. (GR.L) Gibson, E. J. (1969) Perceptuolleamlng and deoelopment. New Yorlc: Apple ton-C..ntury-Crohs. [1R] (1977) How perception really develops: n view from outside the networlc. In: Balle proceruu In r•adlng, ed. D. L#Berge & S. J. Samuels. Hillsdale, N.J.: EriW.um. [WE) Gibson, E. ).; Gibson, J. J.: Smith, 0. W.; 6 Flock, H.(1959) Motlon pArallal< as a determinant of per«lved depth. journal ofExperimental Psl}chology 8{1):40-51. (SU] Gibson, E. J: Owsley, C. J.; & juhnston, J. (1978) PerceptiOfl of Invariants by Rve-month-old Infants: differentiation of two types of motion. Oevelop mentnl P1vchologv 14(4):407-15. [SU) Gibson, j. J, (1950) The perceplfon of the ulsuol world. Boston: 'Houghton Mlf Oin. UR, SU, KvF, CBZJ (1954) The visual perception of objective motion andsubJective movement. P•ychologlcal Reutew 61(5):304-14. [SU) (1957) Optico.l motions �nd tran•formntlons u stimuli for visual perception. P:vchologlcol Jleutew 64(5):288-95. [SU) (l958) Visually controlled locomotion and visual orientation In animals. Brll llh journal of P•vchology 49:182-04. {WMM) (1959) Perception A U I unction of stimulation. In: P:ycllolo�:v: a study of a •c1ence, vol. 1 .. ed. S. Koch, New Yotk, Toronto. Loodon: McGraw· Hill [ESR. RS, SU) {1960) The eorl<:t!pt of the !tlrnulus In psychology. �merlc4n Psvcho/ogt&t 15:694-703. {SU) (1961) &-ologi""l optics. V(lfon Rueorch 1:25$-62. [SU) (1963) TM r�ful tuol svstenJS. Boston: Houghton Mif flin. [BB, WI?., KvJ?, WMM, ESR, IR, SU] (1967) New rcQsons for realism. Synthese 17:162-72. tSG. SU) (1968) WhAt gives rise to the perception of motion? f's!Jchologtcal lleolew 75(4):33.5-46. [ESH, SU) (1968-69) Are thorc sensory qualitiesof objects� Syatlrese 19:108-9. [SUJ (1970) On lloe<>ries fur visual space pet•-eptlon: a reply to joha.ns:son. Scandl naulan journal of P•uchology 11:73-79. 
[ESR] (1972) A theory of direct vlsual perception. In: The psychologv of knowing, ed. ). 1\. 1\oyce & W. W. Ro�ebonm. New York, l'arls. London: Gordon & Drcaeh. (GHL, SU] (1976)The rnylh o( plference. IEEE:2/!390. [SWZ) Gregory, R. ( 1972) Seeing as thinking. Timet LllerarySupPlemet�t Qune 2.3):707-i! [KI'] (1974) Choosing a j)(tradlgm for perception. In: Handbook of perception. Vol. I. flutorlcaland phllo.ophlcol root� ofperception, ed. E. C. Corter ctte and M. P. Friedman. New York: Academic Prest. [WEI (1979) Perceptual hypotheses. In: 1'he p>JIChology of vision, ed Longllet· i'liggiru, H. C & Sutherland, N. S. 1\oyal Society of London symp<>sium on vision (In pre58). [KP] Griffirr, D. R. (1978) Pro.•peels lor a cognitive ethology. Bral11 und DehatJioral Science• 1(4):1527-38. [SU) Grossberg, S. (1976•) Adoptive pullern closslficatiorl ond universal recudlng, I: porallel development and coding of neural feature detectors. 8io/ogical
Cvberttellct 23:121--34. (SG] (1976b) Adaptive pattern classiBcalion and universal recoding. ll: feedback. CJpectotion, olfo�tlon, and illusions. Biological Cyb.metla 23:187202. (SC] (1978) A theory of human memory: self-organization and perfomance of selUOry-motor cod.,s, maps. nnd pions. To: Progress In theoretical blologv. 110!. 5, ed. R. R�n & F. Snell. New York: Academic Press. [SG) (1980) 'How does o brain build o cognitive code? Pzychofogtcal Reolew81:1:Sl. [SC} Gyr, J. W. ( 1177�) Is a theory of direct visual perception adequate? Psvcholog tcaiBu/Jetln 77:24&--6 1. [SU) (1912b) Comments on Gibson's j)(tper. In: The p>ychology of knoWing. ed. J. n. Royce & W. W. l'lozeboom. New York, Pazis, London: Cordon & nreoch.
[SU]
Gyr, J.; Willey, R.; & Henry, A. (1979) Motor-sensory feedback and geometry of visual space: a replication. Behavioral and Brain Sciences 2:59-64. [JWG, SU]
Hay, J. C. (1966) Optical motions and space perception - an extension of Gibson's analysis. Psychological Review 73:550-65. [SU]
Hayek, F. A. (1969) The primacy of the abstract. In: Beyond reductionism, ed.
A. Koestler & ). II. Srnythles, pp. 309-33. New York: Macmillan. (WBW] Haye&-lloth, F. (1977) Critique of Turvey's "Contrasting orientations to the theory of visual Information prOCC$l$ ng." Psycltologteol Reolew 53135. [f'HRJ Hecht, S. (1934) VJ.osion lJ, the nature of the photoreceptor process. In: A hand book of gerrero/upertmental psychologv. eel. C. MurchinsorL Woroester: Clark Unlverslt'y PrOSI. [CRL] Hclmholb:. H. von, (1963) Trtalue on phflllo/oglcal optla, ed. j. P. C South :.U. New York: Dover. (80, WE, SWZ) Henle, 1\4. (1974) 011 naive realism. In: Pen:eplfon:e#Dy•ln honor of)amu j. Ctblon, ed. n. 0. Macleod & H. L. Piclt, Jr. Ithaca and London: Coroell University Preu. [SU) Hill. A. L. (11772) Direction constancy. Perception aru/ PzvchopluJ.U:S ll:l7578. [IRJ Hinton, G. ( 1979) SmJMO dcmonstrotlon of the effects uf stnreturol descriptions io 111ental lmagery. Cognllloc Science 3:231-50. [SUI Hochberg, j. (1974) lil&her-ord�r stimuli and Inter-response coupling in the perccptlol> of the visWll world. In: Perception: t:Mays In Jronor ofjamu j. Glbfon, cd. R. D. Mocle<>d & H.L. Pick, Jr. Ithaca and London: Cornell \Jntv.,tslty Pre$8. [SU] Hochberg, ]. & McAlister,/\. (19�) A quantltatlvcappronch to figural "gl){)(! ness." }ollrnol ofE%1�mcmtal P1vchology 46:361-64. SUJ [ Hnflmno. W. C. (1900) The lic-algcbra of vblUII perception. jouroalof Matlr· emoltca/ Psvchology 3:65-91!. [JWGJ (1977) An Informal, ltlslorlcol d�riptlnn (with bibliography) of the "L.T.G./N.I'." Cailters de Pt1JOI1ologlc 20:135-74. (KvF] )atiSSon, C. & johons.�on, C. (1973) VIsual perception of bending motion. l'er <:IIJlfon 2:321-2(1. [SUI jolwnnson C. (I Q&l) Perception of lll(ltion und changing form. Scat�dtunutan Joumal of l'svclrolosv .5:181-2011. (SU, KvFJ (1970) On theories for visual space perception - u letter to Cibson. ScarJdt naiJ/<111 journnl of l'svchologu 11:67-74. [SU, KvFJ )ohon!SOn, C.: Von Hof•ten, C.; & Ju,..son, C. (J!li!O) £vent pc.lrcepUon. 
AnnunI llevlcw of P•vcflologv 31:27-63. ICJI Julcsz. 11. ( 1971) Fot.mdnllontof Cyclopean perception. Chtcugcr. Univc•rsity nf Chicago Pr�. (SUI )ulosz. 11. & Cadll, T. (1979) Ou tha limits of fouriercii'C<>rnpositiun in vb1ral tc�ture perception. PcrccJ>IIon 8:69-73. (SU) jules?. n. /JI Miller, J. (10715) Jndl!pcn dcml 5()l>tiul-fr ... �)ll(!IICY·tllned chunne)$ in bin<>C'ular (LL,Ion aurl rlvalry. l'ttrcepllun 4:12S-43. [SUI Knhneman, 0. ( 19117) An onst.1-onsct lllw forunu CllSC of uppun>nt motlun onJ metac..•>nttast. Pcrco,nlun nnd P•vchophiJ•Ics 2:5n-114. [Gl\1.) Kal!., 'Eino (1979) Reality and expcrlctiCC!. ln: Four 11hilt�opillcal euatp ed., n. S. Cohen. Dnrdrecht, Boston, London: lleidel. (KvF] Kalil, R. £. & Freedman, S. J. (1900) Pc..Si•tcncc of ocular rotation follnwlng comperuatJ11n fnr dl$plaecd vision. Puceplnnl & MotorSkfUt 22:13539. [IR) l<.anlzsa, C. (lOSS) Marglnl qunsl-pcrcottlvl ln campi con stimolnzionc orno· genen. lllfll&ta di Pllcologtn 49:7-30. (In] ICQnt, lmmanuel (1781) Crlllqul' of Pure llca.tun, trnns. N. K. Smith (1929). Londo
References / Ullman: Against direct perception (lln'Ob) Motion and vision. 2. S�bili:ztd s�tl<>-tempoul threshold surface. Journalofth� Optical Socktv D/ Amtrlc4 69(10): 134()..49. (KP) Kling. j W. & Rigp. L... A. (l972) £qcrlmentol P•vcluJIDgy. New Ynrl: Holt. Rinehart 6 Winston. (GRLJ Kn�n. F. (1974)Ste.eoldn�tt. Copenl10,;en: Abdemisk Forlag (KvF) K�nderink, J. j. 6 wn Doom, A. ). (1975) lnvari4nt properties ofthemnlion parallu 6eld due to the movem�t o( rigid bodies relath� to an observer. OptleG Acta 22(9):773-91. [KP) (1976) Local slructureof movement parallax of the plane. joumnl of the Optwl S()C1etyof Arnericd 66(7):717-23. (KP) (1977) How an ambulant o�rver can corutruct a model of the environment from tbr goornetrlcal struc:ture of the visualin8ow. Kybernttllc 77:22447. (KPJ (1979) The internal representation of solid shape with respect to vision. BW· logical Cfll-nttiCI. 32:211-16. (KP)
Kolfl:a, K. (1935) Prlncfp/n of Cutolt ptiJCholoBIJ. New York: fhrcourt, Brace, and World. (JR. SUI KIShler, W. (1947) Cettalt PffiChoiDliV· New York: Llverlgjlt rubllshlng Corp. (IR) Kosslyn, S. M. (In press) Image and Mind. Cambridge. Moss: lurvud Unf. vcrslty Pre.u. [SU] Kugler, P. N.; Turvey, M. T.; & Shaw, R. (forthcoming) Is the "cognitive pene lrtblllty" criterion invalidated by contemporary ph)'$ics? Beltaoloraland Bnlln Scftncel. (SSP) Kuhn, T. (l96i) Thestructure ofICitni!Jie rec:olullons. Chicago: U nlvenlty of Chicago Pr-. [SWZ] Lee, D. N. (1974) Visual Information during locomotion. In: Perc#�)lion: t•••IP In honar ofjames). CthiDn, ed. 11. B. M11cLcod & H. 1.. Pick. ltlaoco and London: Cornell University rr... (liS) (1076) A theory of visual control of braking based on information about time-to-collision. PnceptWn 5:437·56. [SUI (1980) Vis1.10-motor eoordlnatron in space-time. I n: Tuton.u and motor be hoolor, ed. C. Slelmac:b & ). Requin. Ams1erdam: North-Holland PubliUI Ing. [RJJC. SSP) (In press) T1le optic Bow Belei: Thefoundation or vision. PAdosophiC41 Tramocllons Dfthe Ro,al Soc1tlflofCAndon. (RKJ) Llberm&n, /t.. M.; Cooper, F. S.; Sh11nkweiler. D. P.; & Studdert·Kcnnedy, M. ( 1967) Perception of the speech code. PsychDioglcal Rcolew 74:431GJ. (SJKI LindsAy, P. H. & Norman, D. A. (1972) Human Information proceutng. New York and London: Academic Preu. [SUI Longutt-Higgins, H. C. & Prazdny, I. (In preos) The inlerprttalton of a mov· ins retinal Image. Procec/4tngs of the Ro,al Society - London. [SUI Luria, A. (1973) Theworldng broln. llormondsworth: Penguin. (ES'RJ M•ch, Ernst. (1897) The <molym ofsenrat!Qm. New York: Dover. (SUI Mace, W. M. (1977) James Glb.on's strategy lor perceiving: ask not what's In your bend, but what your head h lnaldeof. In: PerroiDing, •cling and #m()wfng: toward> an ecological psvcliologv. ed. R. Shaw & J. Brnus£ord. Hillsdale, N. ).: Erlbaum. (ESR) Mack, lt.. 
( 1970) An investigAtion of the relationship betweeneye and retinal Image mcwement n i the perception ol movement. l'erceptton and Ptvcho Phlllfcs 8:291-98. [88) (1979) Nonvisual detennlnnnts of perttption. &lutui<>rt1l end Bratn Sclcnce6 2(1):75. (SUI Madcworth. A. K. (1976) Model-driven interpretation b1 intelligent vision •Y'· terns. Perceptwn 5:349-7n [AKM) Marcel. A. (In press) Con,cious and unconscious perception: vlstllll m11sklng, word recugnitiori and an approach to consciousness. Cognltluc Psuchnlo· gy. (Gl\L] Marmolln, R. (1973) Visually perceived n••tion in depth resuhlng from proxl· mal changes. Ptrcepllon and PlfiC}IopiiJIIIcl 14(1):133-48. (S'UJ Marr, D. (1976) Early proeess!ll& of vlsual lnf<><m•tion. Phi/Oipirlcal Tramoc· lions of the Royal Sot:Utv of L.tmdon 275(942)483--S34. (SU. SWZ) ( 1977) Artilklal intelligence' - e personal view. Artt/ickll lntelltgttnce 9:3748. (SUI Mnrr, D. & Ntshlharn. K. (1978) Representation nnd recognition of the spatial organizntlon of three-dimensional shape£. Proceedings of the RD11al $()(:1. ety - LondDn B 200:269-9-1. (SUI Marr. D.; Palm, C.; & l'oggio, T. (1978) Analysls of a cooperativestereo algo rithm. Btolog1C4/ Cybernellcr 28:223-39. UFDJ Marr 0. & Pogglo T. (1976) Cooperative computation or j�ereodisperfty. Science 194:283-8'7. UFD, CE.H) (1977) From understanding comput•tlon to unders!Jlnding neural circuitry. Neuroscience Rt•earch P10gsam Bulfelln 15(3):470-8& (SU) ( 1979) A computational theory of human stereo vision. Proceedlngt of the lloyal Socfcly-London B 204:301-28. (SU] Mast�rton, R. B. & Rerkley, M. A. (1974) Brain function: changing Ideas on the
role of sensory, motor and association cortex in behavior. Annual Review of Psychology 25:277-312. [ESR]
Metzger, W. (1930) Optische Untersuchungen am Ganzfeld II. Psychologische Forschung 13:6-29. [RKJ]
(1972)Critical remarks to j. J. Gibson's conception of "direct"visual percep tion. I.e.. or revived prephysinloglcal realism. In: The ,nychology ofkn01.0· lng, ed. J. R. Roy� & W. W. R07.eboom. New York. Parb, London: Gor don & Breach. (SU] Meyer, D. E. & Schvaneveldt, R. W. (1971) Focllltalion In recognizing palu of words: evidence of a dependc.onco between retrieval operations. ]oumdl of £xperlmental P1vchology 90:227-34. [GRL) Michaels, C. F. & Carella, C. (in press) The theory ofdirect perception. New York: Prentice-Hall. (SSP] Minsky, M. (1967) Comp��tcll<>n:finII• and ln/infle machines. Englewood Clift's. N.).: Prentice-Hall. (RS) (1968) Matter. mind and models. In: Sert14nllc Information procettlng. ('.ambridge, Mass: MIT Press. (AS) (1975) A framework £,or representing knowledge. In: The p.sycho/ogy Df compulltf ol.t1011, ed. P. H. Winston. New York: McGraw-Hill. (WE, KP) Mochoue, A. (1963) The p<�rcepllon of ootualllfl. London: Methuen. [CBZ] MOller. C. E. (1924) Abrlss der psychoiDgle. Gllttlngen. (KvF) Musattl, C. L. (1924) Sui fenomenl 4tereoclneticl. Archloio Italiano Ptlcologtc 3:105-20. (IR, CBZ) Neff, W. S. (1936) A critical investigation of the vb.W apprehension of move ment. Amm&ieWt of 8loph1JSICI 4:2S5-16. [WMM) (1974) Discrete and continUOU$ processes in computers and brains. In: Lee· lure notn In bio-mathematics 4: ph.,.rcs and mathematics Df the nervocu lljttem, ed. M. Conrad, W. Clittinger. & M. Dal Cln. New York: Springer· Verlag. (WMM) (1977) Dynamic and linguistic mndes of ''Omplex systems. lntemallonol joumal of GeneralS!JStlml$ H:25�66. (SSP] Patten, B. C. (1979) Environs: relativistic elementary particles for ecology. Pa· per pr1!$Cnted at Oak Ridge National Laboratory. Tenn. (SSP, n5) Pittenger, J. 8.; Shaw, R. E.; & Mark. l... S. (1979) Perceptual tnformntlon for the nge of faoesns higher order lnvnrlances ol growth. ]oumal of E:rperf rnental Ptuehologv: Human Performance and PsiJChoph!JSict ii(3):�7S93. (SSP, SU) Polyal<, S. L. 
(1957) The oortebrtlte Ollualsystem. Chicago: University of Chi· cago Press. [SUI Prazdny, K. (1980) The information In optical Dow. Artfjicfal lnte/ligence )Duma/ (submitted fO< publication). [K.P] Pylyshyn, Z. (1976) Imagery ond ortlftciol intelligence. In: Mtnnc&ola 1tud1111 In the philosophv ofscience, vol. 9, ed. W. Savage. Minneapolis: Unl· versity of Minnesota Press. (SU) ( 1978) Computational models ond empirical constraints. &lwolora( and Brain Scl�ce• 1(1):93-99. (SUJ (1980) ComputAon it and cognition: issues In the foundations of cognltlvo �Cience. Delwoioraland Brain Scfences 3:111-69. (CEH, SSP) Resile, F. (1979) Coding theory of lhe perception or motion configurations. Psvchologtcal Review 86(1):1-21. (SUI Richards, W. & Polil, A. (1974) Texture mulching. Kybemellc 16:12091302. (SUI Rock, I. (I 077) In defense of unconscious Inference. In: St<>btillJI and COrt· llancu In l)l.tual pnceptlon, cd. W. Epstein, pp. 321-73. New York: Wi ley. (WE. CJ) (in prus) Allernutive solutions tn kinetic stimulus transformations. journal of ElCJUrfmental Psychology: lluman PerceptiDn and Performance. [SUI Rock, I. & Ebenholtz, S. (1959) The relutionol deterrninatlon of perceived size. P•vchologtcal Reulew 66:387-401. IIIli Rock, I. & Cllchrlst A. (1975) The conditions lor the perception of the covering and uncovering of a line. American journal of PSIJChDiogv 88:57182. [In) Rock, 1.; Hill, A. l...; & Fineman, M. (1968) Speed constancy asa function of size constancy. Perception P11)Chtl11hflfiCI �:37-40. (IR) Rock, 1.: Shallo, J.; & Schw:utt. F. (1978) Pictorial depth and related constancy effects a.s a function of recognition. Percepllon 7:3-19. (CBZ) Rock, I. & Sigmon. E. (1973) Intelligence factors In the perception of form through a moving slit. PercepiiDn 2:a57-69. [WE] Runesnn, S. (1977) On the possibility ol "•mart" J)ereeptulll mechnnlsms. Scan·
References / Ullman: Against direct perception dlnal>lon journal of Psychology 18:172-79. (ESR. SRI Scb>t2, C. ( 1954) Th� role of context in thP. pert·epllon of stops. Lang11age
31>:47�56. !SJK I Schiller, P. H. & Chorovcr. S. L. (1966) MetaoontriiSt: Its relation to evoked po· lt!ntials. SciMco> 153:1398-1401. [CRLI SthrOdinger. E. ( 196'7) Mind and matler In: Wh
[KPI
(1977b) Psycholu�ticul approachcs to the problem of knowledge. In: Percclv· lng, acllng, and knowing, cd. n. Shaw & J. Brnruford. Hillsdale, N.J.: Erl baum. (WMM. RS) Shaw, R. & Mcintyre. M. (1974) Algorlstic foundations for cognitive psycholo gy. In: Cognition and the symbolic processes, ed. D. Palermo & W. Weimer. Hillidale, N.J.: Erlbaum. (SRI Shaw, R.; Mclntyr... M.; & Mace, W. (1974) The role of symmetry In event perception. Tn: Perctpllon: essavs In honor ofjames ). Ctbscm, ed. R. R.
Pick Jr. Ithaca and London: Cornell University [SU) Show, R. & Turvey, M. (in press) COlllitlons as models [or ecosystems: 11 renllst Macleod & H. 1-
Press.
perspective on per�eptuol organization. In: Perceptual organt.alfon, cd.
M ICubovy and J. Pomerantz, pp. 1-39. 1-lillsdole, N.j.: Erlba.um. RS)
[SSR,
�'haw, R.; Turvey, M.: & Mace, W. (in press) Ecological psychology: the conse quences of a t'Ommilmcnt to realism. In Cognlllon and 3�mboitcpro Ctl$se.r, vol. 2,
Press.
[GEHJ
Sloman, A. (1978) The computer revolution in philosophy: philosophy, science
and models of mind. Suss
Smith, N. W. (1971) Aristotle's dynamic approach to sensing and some current implications. Journal of the History of the Behavioral Sciences 7:375-77. [SSP]
(1974) The ancient background to Greek psychology and some implications for today. Psychological Record 24:309-24. [SSP]
Snellen, J. W. (1972) Set point and exercise. In: Essays on temperature regulation, ed. J. Bligh & R. E. Moore. Amsterdam: North-Holland Publishing. [SSP]
Sutherland, N. S. (1979) The representation of three-dimensional objects. Nature 278:395-98. [SU]
Taylor, J. G. (1962) The behavioral basis of perception. New Haven: Yale University Press. [SU]
Thatcher, R. W. & John, E. R. (1977) Foundations of cognitive processes. Hillsdale, N.J.: Erlbaum. [SU]
Titchener, E. B. (1926) A textbook of psychology. New York: Macmillan. [IR]
Tondeur, P. (1965) Introduction to Lie groups and transformation groups. Heidelberg: Springer. [KvF]
Turvey, M. T. (1976) On peripheral and central processes in vision: inferene
ull. Psychological Review 80:1-52. [GRL]
(1977) Contrasting orientations to the theory of visual information processing. Psychological Review 84(1):67-88. [WE, FHR, SU]
(1979) The thesis of efference-mediation of vision cannot be rationalized. Behavioral and Brain Sciences 2(1):81-83. [SU]
Turvey, M. T. & Shaw, R. (1979) The primacy of perceiving: an ecological reformulation for understanding memory. In: Perspectives on memory research: essays in honor of Uppsala University's 500th anniversary, ed. L.-G. Nilsson. Hillsdale, N.J.: Erlbaum. [SSP, RS]
Turvey, M.; Shaw, R.; & Mace, W. (1978) Issues in the theory of action: degrees of freedom, coordinative structures and coalitions. In: Attention and performance VII, ed. J. Requin. Hillsdale, N.J.: Erlbaum. [WMM, SSP, RS]
Ullman, S. (1978a) Artificial intelligence systems and human cognition: the missing link. Behavioral and Brain Sciences 1(1):117-19. [SU]
(1978b) Mental representations and mental experiences. Behavioral and Brain Sciences 1:605-6. [SU]
(1979a) The interpretation of structure from motion. Proceedings of the Royal Society - London B 203:405-26. [SU]
(1979b) The interpretation of visual motion. Cambridge and London: MIT Press. [RS, SU]
(1979c) Relaxation and constrained optimization by local processes. Computer Graphics and Image Processing 9(6):115-25. [SU]
von Fieandt, K. (1966) The world of perception. Chicago: Dorsey Press. [KvF]
Nor diU: P•ylco/ogl '.tT. [ICvF] von Fieondt, K. & Cib.son, J. j. (1959) The sensitivity of the eye to twn.kindsof continuous transformations of a shadow-pattern. journal of Experimental Psychology 57:344-•17. [l
Journal of Experimental Psychology 38:210-24. [IR]
(1976) On perception. New York: Quadrangle. [CBZ]
(In press) The perception of a stable environment. Scientific American. [CBZ]
Wallach, H. & O'Connell, D. N. (1953) The kinetic depth effect. Journal of Experimental Psychology 45(4):205-17. [SU]
Wallach, H.; O'Connell, D. N.; & Neisser, U. (1953) The memory effect of visual perception of three-
ehologv 45:360-68. [IRJ in the Wat.on, B. A. a Nachmlu, J. (1977) PatterM of temporal lnterncon it detection of JVttlnt!$- Vl#lon Re•earch 17:893-002. [SU] Werner, J. {1977) Mathc!malical treatment of structure and function of the hu· man thermoregulatory s)l'tcm. Biological Cyblrneltct 25:93-101. [SSI') Wheotltone, C. (1838) Contributions to tho physiology of vision. Pari f. On sotne remarkable, ond hitherto unobserved, phtmom�'flo of binocular vi· lion. Ro!l
94. [CBZ]
Wilson, H. R. (1978) Quantitative characterization of two types of line spread function near the fovea. Vision Research 18:493-96. [SU]
Wilson, H. R. & Bergen, J. R. (1979) A four mechanism model for spatial vision. Vision Research 19:19-32. [SU]
Witkin, H. A. & Asch, S. E. (1948) Studies in space orientation, III. Perception of the upright in the absen
[IR]
Woodbridge, F. J. E. (1909) Consciousness, the sense organs, and the nervous system. Journal of Philosophy, Psychology, and Scientific Method 7:449-55. [SSP]
Yates, F. E. (1979) Physical biology: a basis for modeling living systems. Journal of Cybernetics and Information Science. [SSP]
(1980) Systems analysis of hormone action: principles and strategies. In: Biological regulation and development. Vol. 8: Hormone action, ed. R. F. Goldberger. New York: Plenum. [SSP]
Yates, F. E.; Marsh, D. J.; & Iberall, A. S. (1973) Integration of the whole organism - a foundation for a theoretical biology. In: Challenging biological problems: directions toward their solution, ed. J. A. Behnke. New York: Oxford University Press. [SSP]
Yolton, J. W. (1968-69) Gibson's realism. Synthese 19:400-7. [SU]
Zaretsky, M. D. (1971) Patterned response to song in cricket central auditory neurone. Nature 229:195-96. [SU]
Zavalishin, N. V. & Tenenbaum, L. A. (1968) Control processes in the respiratory system. Avtomatika i Telemekhanika 9:106-22. [SSP]
Zucker, S. (1980) Motion and the Mueller-Lyer illusion. Technical Report 80-2R, Department of Electrical Engineering, McGill University, Montreal. [SWZ]