Preface

This book is the fifth in a series of volumes that try to define a theory of the brain by bringing together formal reasoning and experimental facts. This endeavour was started some ten years ago by a group of researchers, mostly of theoretical inclination, who were guided by the uneasy feeling that more facts were being produced than they had been able to understand. At the same time, there was an awareness that theoretical reasoning about the brain was perhaps not duly constrained by factual considerations. Thus, the first volumes were dedicated mainly to the task of making neural theories and neural facts more palatable to each other. This epistemological problem is no longer in the foreground. A new breed of brain scientists has learned to appreciate and to competently use both the methods of sophisticated experimentation and those of model building. Rather than striving to forge a marriage, we can now happily draw on a new generation, the offspring of a connubium already consummated.

The present collection of papers focuses on the subject of vision, and brings together new insights and facts from various branches of experimental and theoretical neuroscience. The experimental facts presented in this book stem from disparate fields, such as neuroanatomy, electrophysiology, optical imaging, and psychophysics. Some of the theoretical models are home-spun, but no less inspiring for that, while others judiciously apply sophisticated mathematical reasoning to results of experimental measurements. We trust that the reader will feel, as we do, that these various attempts may well present the prelude to a new kind of brain science, where facts and theory begin to blend in a manner reminiscent of the development of physics in the last centuries.
The starting point of this enterprise was the presentations and discussions at the Fifth International Meeting on Brain Theory, held at the Istituto per la Ricerca Scientifica e Tecnologica (IRST) in Trento (Italy) on April 5-7, 1994. This meeting, organized by Moshe Abeles, Ad Aertsen, Valentino Braitenberg and Luigi Stringa, was the fifth in a series that started in 1984 at the International Center for Theoretical Physics in Trieste [1], and continued in 1986 in Bad Homburg [2], in 1990 at Schloss Ringberg [3] and in 1992 at the IRST in Trento [4]. The meeting lasted three full days, providing a natural segmentation of the presentations and discussions according to three main headings: (1) Visual perception: psychophysics and physiology; (2) Cortical implementation: physiology and anatomy; (3) Cortical models and computational principles. We adopted the same tripartition to organize the material in the present book.

The Fifth International Meeting on Brain Theory was jointly sponsored by the European Commission (DG XII), the Fritz Thyssen Stiftung, the Istituto per la Ricerca
Scientifica e Tecnologica (Trento), as well as by contributions from the various institutions that financed the participation of their delegates. Splendid hospitality, together with most efficient organization (thank you, Nadia Oss!), was provided by IRST, Trento. The generous support by all these institutions is most gratefully acknowledged.

Ad Aertsen
Valentino Braitenberg

REFERENCES
1. G. Palm and A. Aertsen (eds.), Brain Theory, Springer Verlag, Berlin, 1986.
2. W. von Seelen, U. Leinhos and G. Shaw (eds.), Organization of Neural Networks: Structures and Models, VCH Verlag, Weinheim, 1987.
3. A. Aertsen and V. Braitenberg (eds.), Information Processing in the Cortex: Experiments and Theory, Springer Verlag, Berlin, 1992.
4. A. Aertsen (ed.), Brain Theory - Spatio-Temporal Aspects of Brain Function, Elsevier, Amsterdam, 1993.
Brain Theory - Biological Basis and Computational Principles A. Aertsen and V. Braitenberg (Editors) © 1996 Elsevier Science B.V. All rights reserved.
Early vision: Images, context and memory

Dov Sagi*

Department of Neurobiology, Brain Research, The Weizmann Institute of Science, Rehovot 76100, Israel

Visual perception involves image transformations, producing internal images with some useful features enhanced, possibly linked. Enhancement is a result of local transformations being context dependent, affected either by remote image regions or by visual memory. Psychophysical experiments show that early vision generates an image that, although dominated by the retinal output, contains a context dependent component. Context affects the response of oriented filters through lateral interactions. The efficacy of lateral connections may change with experience (perceptual learning) and can be modulated by higher levels of processing (visual attention and imagery). Psychophysical results suggest that long range interactions are achieved by activation spread through multiple connections. Some properties of this model of early vision are examined and applied to a variety of problems, such as lateral masking, texture discrimination and visual grouping.

1. Introduction

Our visual system transforms the apparently meaningless reality inflicted on our retinas into meaningful and behaviorally relevant objects. As the retinas are specialized in detecting picture points (pixels), we are faced with the problem of attributing these point data to objects. Embedding point measurements within the proper context of a perceivable object requires integration between remote image parts as well as with memory. This problem is best exemplified by the hidden figure puzzle [1] shown in Figure 1. The figure seems to be composed of black and white patches devoid of any semantic significance, though a face is present.
As this puzzle is difficult to solve by trial and error, the reader is advised to look at Figure 5 briefly and then to return to the hidden figure, only to be faced with the deep look of a face decorated with Christ-styled hair and beard. Perception is changed dramatically, with some 'illusory' contours added (as at the top of the right shoulder); yet these contours are supported by some minimal physical cues.

* Supported by the Israel Science Foundation administered by The Israel Academy of Sciences and Humanities - The Charles H. Revson Foundation.

It seems that the solution reached by our brains is based on integration
Figure 1. A hidden figure: the puzzle. From Porter P. [1], copyright by the Board of Trustees of the University of Illinois.
of the physical context with some memory cues (previous experience). At what level of processing does this integration take place? Recent findings [2-6] suggest that both spatial and memory dependent context affect visual perception at a very early stage of processing. It is assumed here that a stage of processing is characterized by integration of information from different sources (lower, as well as higher stages) followed by a decision (threshold). Accordingly, the first stage of visual processing is described as spatial (linear) integration of the retinal input followed by a threshold. This stage, encapsulated within the preattentive bottom-up frame, is a principal component of theories of early vision [7-9], though the formulation may differ between theories. Here the emphasis is shifted to the context dependency of early vision, suggesting that already the first stage of processing is affected by 'late' processes and is involved in the creation of a unified perception (Gestalt). It is suggested that first stage filters are arranged topographically with excitatory connections between neighbors. Activation may spread in this network, and remote filters may interact, depending on the efficacy of excitatory connections. Efficacy is determined by experience and is modulated by visual attention.

2. The primary visual network

2.1. Spatial filters

One of the basic findings in visual perception is that there exist many parallel pathways within the visual system, each specialized to carry information about a different stimulus aspect. These findings were demonstrated in physiology with the introduction of the 'receptive field' concept [10,11] and in visual psychophysics with the introduction of parallel processing of color by Young [12] and Helmholtz [13]. Later work
by Hubel and Wiesel [14] and Campbell and Robson [15] opened a period of extensive research into the properties of visual channels analyzing form. Although the term 'channel' is not well defined, there is general agreement on the existence of some mechanisms operating in parallel across the visual field and responding significantly to some stimuli but not others (selectivity). Some theories assume that these channels can be described as linear filters [15-18] while others elaborate nonlinear feature detectors [7-9,19]. Here we assume that the basic mechanisms are linear filters, selective for orientation, location and spatial frequency. It is possible to describe these filters as some function of retinal coordinates, orientation and spatial frequency, $F(x, y \mid \omega, \theta, x_c, y_c)$, with parameters representing filter location $(x_c, y_c)$, its preferred orientation ($\theta$), and its preferred spatial frequency ($\omega$). Popular functions used for spatial filters are Difference Of Gaussians (DOG) [18] and Gabor functions [20]. Both filter types were used to model receptive fields of cortical simple cells [8,20]. Filter response is obtained by convolution with an input image $L(x, y)$, followed by a nonlinear transducer function ($\mathrm{trf}[\cdot]$):

$$R_\theta(x_c, y_c \mid \omega, \theta) = \mathrm{trf}\Big[\sum_{x,y} F(x, y \mid \omega, \theta, x_c, y_c) \cdot L(x, y)\Big] \qquad (1)$$
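As a minimal sketch of Equation 1, the Python snippet below builds a Gabor filter, takes its weighted sum with an image patch of the same size, and passes the result through a transducer. The function names (`gabor`, `threshold_trf`, `filter_response`), the envelope width equal to one wavelength, and the simple threshold transducer are illustrative assumptions and are not taken from any specific model cited here.

```python
import math

def gabor(size, wavelength, theta):
    """Square Gabor patch: cosine grating under a Gaussian envelope.

    The envelope width sigma = wavelength is an assumption made for illustration.
    theta is the direction of luminance modulation, in radians.
    """
    half = size // 2
    sigma = wavelength
    patch = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            u = x * math.cos(theta) + y * math.sin(theta)  # position along the modulation axis
            envelope = math.exp(-(x * x + y * y) / (2.0 * sigma ** 2))
            row.append(envelope * math.cos(2.0 * math.pi * u / wavelength))
        patch.append(row)
    return patch

def threshold_trf(value, threshold=0.0):
    """Hypothetical threshold transducer: pass only what exceeds the threshold."""
    return max(0.0, value - threshold)

def filter_response(filt, image, trf=threshold_trf):
    """Equation 1 at a single filter location: trf of the sum of F * L over the patch."""
    total = sum(f * l for f_row, l_row in zip(filt, image) for f, l in zip(f_row, l_row))
    return trf(total)

# A filter responds strongly to a matching Gabor and weakly to an orthogonal one.
probe = gabor(size=21, wavelength=6.0, theta=0.0)
same_response = filter_response(probe, gabor(21, 6.0, 0.0))
orthogonal_response = filter_response(probe, gabor(21, 6.0, math.pi / 2))
```

Under these assumptions `same_response` is much larger than `orthogonal_response`, which is the orientation selectivity the text attributes to the first stage filters.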
This formulation is supported by psychophysical experiments indicating linear summation of subthreshold grating stimuli with parameters constrained to within a 'filter bandwidth' [16]. Contrast threshold for a sine wave grating is not affected by the presence of another grating that differs significantly in orientation or spatial frequency. Filter bandwidth can be estimated to be about 15° (half-width at half-height) in orientation [21,22], and between one and two octaves in spatial frequency [23-26]. Filter receptive field size varies according to its peak spatial frequency sensitivity, being about twice its optimal wavelength [27]. Some reasonable transducer functions are sigmoid functions [18] and threshold functions [28].

2.2. Excitatory interactions between filters

The filters described above integrate retinal input only. Next we consider excitatory inputs from neighboring units. Physiological studies indicate the existence of long range interactions in the primary visual cortex [29]. Psychophysical evidence for excitatory lateral interactions comes from experiments demonstrating that target detection is enhanced by the presence of high contrast masks within some neighborhood [3]. In these experiments, targets and masks were 2D oriented Gabor signals (i.e. cosine gratings with amplitude modulated by a Gaussian envelope with $\sigma = \lambda$, $\lambda$ being the grating wavelength; a parameter setting that serves as an efficient probe for spatial filters). Observers were asked to detect a low contrast Gabor signal (target) in the presence of two flanking high contrast but otherwise identical signals (masks). Figure 2 depicts observers' thresholds as a function of target to mask distance, showing a threshold elevation (suppression) at small distances and threshold reduction (enhancement) at longer distances. The enhancement range was found to increase with
[Figure 2 plot. Horizontal axis: Distance (wavelength), with ticks at 0.0, 4.0, 8.0 and 12.0.]
Figure 2. (a) Stimulus used for exploring lateral interactions, with two high contrast masks (Gabor signals) flanking a low contrast target. (b) Data from the lateral masking experiments, before and after practice (redrawn from [4]). Observers detected a Gabor target flanked by two high contrast collinear Gabor signals at different distances. Contrast detection thresholds relative to absolute threshold (no mask) are plotted as a function of target to mask distance, using Gabor wavelength units. Data are shown for two observers (RM and GH) and a model (see text). Observers show an increased range of enhancement after extensive practice. The model emulates practice by increasing synaptic efficacy and thus increasing the range of signal propagation in a network with excitatory lateral connections. Note that suppression is not affected by practice in these experiments and simulations.
practice [4]. We take the suppression region as an indication of lateral inhibition and the enhancement region as support for excitatory interactions. This effect is obtained only if target and masks have the same orientation and spatial frequency, indicating connectivity between filters with similarly shaped receptive fields. Moreover, target and masks have to be collinear in order for this effect to be significant [30], implying a connectivity pattern with higher density in the direction defined by the filter's preferred orientation. Recently, Polat and Norcia [5] have shown that Visual Evoked Potentials (VEPs) elicited by a Gabor signal are enhanced by the presence of collinear Gabor signals and that the enhancement is reduced at high target contrasts (> 20%). Following Polat and Sagi [4] we assume here that these excitatory connections are limited to close neighbors, but that signal transmission through multiple connections is possible. Due to (synaptic) transmission constraints, signal propagation through lateral connections involves attenuation (by a factor $0 < \kappa < 1$) and requires additional time steps (with a temporal attenuation by a factor $0 < \nu < 1$), resulting in the equation:

$$\frac{\partial}{\partial t} R_\theta(r_\theta, t) = -\nu R_\theta(r_\theta, t) + \kappa\,[R_\theta(r_\theta - \delta r_\theta, t) + R_\theta(r_\theta + \delta r_\theta, t)] + \frac{\partial}{\partial t}\Phi(r_\theta, t) \qquad (2)$$

with $r_\theta$ as the axis of propagation in direction $\theta$. $R_\theta$ is the response of a spatial filter with orientation $\theta$ and some spatial frequency parameter $\omega$, and $\Phi(r_\theta, t)$ is the sensory input to this filter. (Only filters with the same spatial frequency parameter are assumed to be laterally connected.)

And if sampling density is high enough:

$$\frac{\partial R}{\partial t} = (2\kappa - \nu) R + \kappa\,\delta r_\theta^2\,\frac{\partial^2 R}{\partial r_\theta^2} + \frac{\partial \Phi}{\partial t}. \qquad (3)$$
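The spread of activity through nearest-neighbor connections can be illustrated with a small simulation. The sketch below is a simplification and not the fitted model: it relaxes a one-dimensional chain of units under the decay and neighbor-coupling terms of Equation 2, but it replaces the input term with a constant unit drive at the central unit (an assumption made so that a steady state exists). The names `steady_state_profile`, `predicted_ratio` and `space_constant`, the chain length and the step size are all hypothetical choices.

```python
import math

def steady_state_profile(kappa, nu, n_units=61, dt=0.1, n_steps=5000):
    """Relax a 1D chain of filters to steady state under a constant unit drive at the centre."""
    if not 0.0 < 2.0 * kappa < nu:
        raise ValueError("need 0 < 2*kappa < nu for a decaying, non-oscillating profile")
    centre = n_units // 2
    r = [0.0] * n_units
    for _ in range(n_steps):
        padded = [0.0] + r + [0.0]  # silent units beyond both ends of the chain
        r = [
            r[i] + dt * (-nu * r[i]                              # temporal attenuation
                         + kappa * (padded[i] + padded[i + 2])   # left and right neighbours
                         + (1.0 if i == centre else 0.0))        # assumed constant drive
            for i in range(n_units)
        ]
    return r

def predicted_ratio(kappa, nu):
    """Steady-state attenuation from one unit to the next (decaying root of kappa*q^2 - nu*q + kappa = 0)."""
    return (nu - math.sqrt(nu * nu - 4.0 * kappa * kappa)) / (2.0 * kappa)

def space_constant(kappa, nu):
    """Distance, in units of filter spacing, over which steady-state activity falls by 1/e."""
    return -1.0 / math.log(predicted_ratio(kappa, nu))

profile = steady_state_profile(kappa=0.2, nu=0.5)
centre = len(profile) // 2
simulated_ratio = profile[centre + 1] / profile[centre]
```

In this toy setting, raising `kappa` from 0.2 to 0.24 lengthens the space constant, which mirrors the suggestion in the Figure 2 caption that practice increases synaptic efficacy and thereby the range of signal propagation.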
It is further assumed that excitatory weights are modulated by visual attention (see Section 4). This assumption allows for a selective increase (or decrease) in connectivity at selected image regions within a time frame defined by network integration time. On a slower time scale, excitatory weights are subject to perceptual learning [4]. Network response level is also controlled by a normalization process, that is, by divisive inhibition.

2.3. Inhibitory interactions

Response normalization is supported by data from masking experiments, where contrast thresholds increase with increasing background contrast. Psychophysical [28,31] and physiological [32] models of filter contrast response assume divisive inhibition, where filter response is divided by a 'local energy' measure reflecting the total activation at some small neighborhood around the normalized filter. These inhibitory filters seem to have an isotropic spatial weighting function [30] and probably operate on a slower time scale [33]. Thus, the normalization factor is considered here to be filter response averaged across time. The orientation selectivity of the detection suppression (see Figure 2) observed in lateral masking experiments [30] indicates short range (of about twice the filter size) orientation selective lateral inhibition. Thus the response of a given filter is inhibited by activity of filters with different orientations at the same location, and by spatially adjacent filters with the same orientation. Such a connectivity pattern may serve, in addition to local response normalization, detection of texture boundaries by reducing activity at regions with uniform orientation.

3. Application to some perceptual phenomena

3.1. Lateral masking: a probe on spatial interactions

The range of spatial interactions (i.e. the space constant) can be estimated by measuring detection thresholds for a target with high contrast signals (masks) positioned at some distance away from it [3,30].
Data from such an experiment are depicted in Figure 2. These experiments show detection suppression when masks are presented in close neighborhood to the target (less than twice the target wavelength, as predicted by the divisive inhibition). At larger distances an enhancement is observed, reflecting an excitatory input from a distance of up to ten times the signal wavelength, much beyond the first stage receptive field size. In order to derive model predictions for this case, we assume a narrow input at the origin with a space/time spread of one unit, $\Phi(r_\theta, t) = e^{-(t + r_\theta)}$ (in Equation 3). Taking a non-oscillating solution of Equation 3 one finds (for some $\mu$, $\xi$ with $\mu = \nu - \kappa(2 + \delta r_\theta^2 \xi^2) > 0$),

$$R = R_0\,[e^{-(\mu t + \xi r_\theta)} + C e^{-(t + r_\theta)}]. \qquad (4)$$
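To make the two spatial terms of Equation 4 concrete, the following sketch evaluates them at t = 0 and finds the distance beyond which the slowly decaying long range term exceeds the input driven term. It assumes that distance is measured in filter wavelengths, that the long range term decays at a rate of 0.16 or 0.08 per wavelength, and that the input driven term is 16 times stronger at the origin; these are the fitted values reported later in this section. The function names are hypothetical.

```python
import math

def long_range_term(r, xi):
    """Slowly decaying lateral term of Equation 4 at t = 0 (decay rate xi per wavelength)."""
    return math.exp(-xi * abs(r))

def input_term(r, c=16.0):
    """Input driven term of Equation 4 at t = 0, weighted by the ratio C."""
    return c * math.exp(-abs(r))

def crossover_distance(xi, c=16.0):
    """Distance at which both terms are equal: exp(-xi*r) = c*exp(-r), so r = ln(c) / (1 - xi)."""
    return math.log(c) / (1.0 - xi)

for xi in (0.16, 0.08):
    r_star = crossover_distance(xi)
    print(f"xi = {xi}: space constant = {1.0 / xi:.2f} wavelengths, crossover at r = {r_star:.2f}")
```

Under these assumptions the lateral term dominates only beyond roughly three wavelengths, which is consistent in spirit with enhancement appearing at distances larger than the suppression zone.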
Model predictions for this case, depicted in Figure 2, were calculated assuming temporal integration and using a Gaussian weighting function for the inhibitory filter with
Figure 3. Effortless texture segmentation in (a) is somewhat easier than in (b), though boundaries are the same. Orientation selective spatial filters predict larger response variability for textures composed of L's, as L's are more elongated than X's (when convolved with larger receptive fields). As a consequence, filter based models generate more boundaries in (b), making foreground localization difficult.
a standard deviation equal to the first stage filter wavelength. Excitation was calculated as a spatial convolution of filter responses with a sum of exponential weighting functions, as derived in Equation 4. The best data fitting parameters for Equation 4 yield an input dependent weighting function which is 16 times stronger than the weighting function accounting for long range interactions (i.e. $C = 16$). The same ratio was also used by Zenger and Sagi [28]. The space constant of the long range excitation depends on experience. The experimental data depicted in Figure 2 show an increased range of interactions with practice (perceptual learning), implying an increase of the corresponding space constant from 6 to 12 filter wavelengths ($\xi = 0.16$ to $\xi = 0.08$).

3.2. Texture segmentation

In texture segmentation tasks one has to detect or locate a texture region (foreground) embedded within another texture (background). A typical example is shown in Figure 3. As texture segmentation is sometimes 'effortless' [7] or attentiveless [47], it is believed to be carried out, and thus to be limited, by early visual mechanisms. Human performance on texture segmentation tasks can be accounted for by a two stage filtering model [41-44] with a similar design to the one described above. The key component in texture segmentation is the first stage filter, being local, orientation selective and spatial-frequency selective. Detection of texture boundaries can be modeled as detection of local activity differences (edge detection) within a filtered image. As a typical texture creates local activity differences across all image regions (e.g. the textures depicted in Figure 3 generate highly variable activity patterns when convolved with oriented filters, due to the random local orientation), the problem is one of detecting an activity edge in a noisy image [44]. Here, spurious activity variations are smoothed by excitatory lateral interactions
Figure 4. Visual grouping by similarity (a), proximity (b) and both (c).
and activity at regions of uniform response is reduced by (divisive) inhibition. A similar model was shown to successfully predict human performance on texture segmentation tasks [44]. To account for preattentive texture segmentation, where all image regions are processed in parallel and texture boundaries are to be detected, we assume excitatory weights to be low and constant across the processed image, allowing for good localization performance. However, performance is experience dependent. Observers' performance on texture segmentation improves with practice [2]. Improvement occurs over a period of a few days and lasts for years [45]. Learning is specific for texture orientation, location and eye (practicing with one eye does not produce improvement when testing with the other eye), implying plasticity at an early stage of visual processing. These results imply that the network weights used here should be modifiable with experience. It is possible that an increase in the efficacy of excitatory weights produces a more efficient smoothing of spurious local variations in filter activity. It is also possible that an increase of inhibitory gain further reduces activity at regions with low response variability, thus enhancing texture boundaries.
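The role of smoothing in boundary detection can be sketched in one dimension. In the example below, a row of filter activities has two regions with different mean activity plus an alternating jitter that stands in for the spurious variability caused by random local orientation. Edge detection on the raw activity is dominated by the jitter, while smoothing with exponentially decaying weights recovers the region boundary. The function names, the jitter pattern and the space constant of three units are assumptions chosen for illustration; divisive inhibition is left out of this sketch.

```python
import math

def smooth(activity, space_constant):
    """Average each unit with its neighbours using exponentially decaying weights."""
    n = len(activity)
    smoothed = []
    for i in range(n):
        weights = [math.exp(-abs(i - j) / space_constant) for j in range(n)]
        smoothed.append(sum(w * a for w, a in zip(weights, activity)) / sum(weights))
    return smoothed

def strongest_edge(activity):
    """Index i of the largest absolute difference between units i and i + 1."""
    diffs = [abs(b - a) for a, b in zip(activity, activity[1:])]
    return diffs.index(max(diffs))

def make_texture_activity(n_per_region=20, jitter=0.3):
    """Two regions with mean activity 1.0 and 2.0 plus alternating jitter (hypothetical data)."""
    means = [1.0] * n_per_region + [2.0] * n_per_region
    return [m + (jitter if i % 2 else -jitter) for i, m in enumerate(means)]

activity = make_texture_activity()
raw_edge = strongest_edge(activity)                    # misled by the jitter
smoothed_edge = strongest_edge(smooth(activity, 3.0))  # true boundary lies between units 19 and 20
```

The trade-off noted in the text shows up here as well: a larger space constant suppresses more of the jitter but also flattens the activity step, which is one reason to keep excitatory weights low when good localization is required.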
3.3. Perceptual grouping

Visual stimuli such as those in Figure 4 give rise to spontaneous perception of some global organization. Thus it seems that our visual system implements specific rules for the creation of structure in seemingly ambiguous situations. While texture segmentation marks object boundaries, grouping is a process involved in linking image regions into unified objects. The Gestalt laws of perceptual organization [48] assume that image parts group together when being similar in shape or when being in close proximity, or when creating a
'good form'. The Gestalt laws, though applicable in many situations, lack a quantitative formulation. It is not clear how to define shape and similarity or what entities to use for proximity measurements. Furthermore, images often contain multiple organization cues (e.g. proximity and similarity in Figure 4c), thus calling for some quantitative method to combine the different rules into a unified framework. The primary visual network described above provides a natural framework for handling internal image relations, as required in perceptual organization. Spatial filters can be used to construct a similarity metric (as in texture segmentation), and anisotropic excitatory lateral connections can be used to implement spatial relations. A recent theory of grouping accounts for human performance on proximity and similarity based grouping tasks by assuming an autocorrelation type of process [49]. Though the autocorrelation computations can be applied to filter responses, a simpler model is possible. In this model [49], the input image is represented as an intensity map, $I(x, y)$, thus skipping the filtering stage. The basic operation performed is an estimate of total weighted directional correlation at each image point:

$$R_\theta(x, y) = \sum_{r} I(x, y) \cdot I(x + r\cos\theta,\; y + r\sin\theta) \cdot e^{-|r|/\xi_0} \qquad (5)$$

Here, $R_\theta(x, y)$ represents the total weighted correlation of the image point $I(x, y)$ with all image points along a line with orientation $\theta$ passing through the point $(x, y)$. Long range correlations are given a smaller weight (exponentially decaying), thus introducing proximity effects. This equation was successfully applied to psychophysical experiments using stimuli (Figure 4) where global orientation was defined by element proximity and/or shape similarity and/or luminance similarity. As the psychophysical task in the experiments was a two alternative forced choice between vertical and horizontal global organizations, a model decision parameter ($\Lambda$) was constructed,

$$\Lambda = \frac{\sum_{x,y} R_{\theta=\mathrm{vertical}}(x, y)}{\sum_{x,y} R_{\theta=\mathrm{horizontal}}(x, y)}$$

so that $\Lambda > 1$ indicates vertical organization, $\Lambda < 1$ indicates horizontal organization, and an unbiased guess is forced with $\Lambda = 1$. Performance on the grouping tasks was found to be time dependent [49]. In the experiments, stimuli (as in Figure 4) were briefly presented (20 msec), followed by a noisy unorganized pattern to mask the grouping perception. For smaller stimulus to mask temporal separations ($t < 100$ msec), the perceived stimulus organization was dominated by proximity relations, while additional processing time allowed for similarity based grouping to take over. Equation 5 can account for this behavior if the space constant $\xi_0$ (the model free parameter) is allowed to increase with time. Thus, the model suggests an increasing integration range with time. Our network can account for human behavior by assuming that the weighted spatial integration is performed by the excitatory lateral connections. The exponential distance
weight applied to the correlation equation is predicted by signal propagation through lateral connections (Section 2.2). Dynamic control of connectivity can result in an increase of excitatory spread within network integration time (< 200 msec), predicting increasing space constants with time and thus accounting for the psychophysical findings [49]. These fast dynamics of excitatory transmission are assumed here to be controlled by visual attention. It is also possible that fast modulations of synaptic efficacies are generated by response correlations [50,51] (in filters or some other responding units); however, such a mechanism would not allow transmission of activity through unstimulated image regions. Perceptual organization involves not only enhancement of activity correlations, but also filling in gaps where information is missing (as in Figure 1). Psychophysical experiments indicate that perceptual grouping depends on the availability of visual attention [39,40], supporting the assumption of attentional control over excitatory efficacy.
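The directional correlation of Section 3.3 and the vertical/horizontal decision ratio can be sketched for a small binary image. This is an illustrative reading of the correlation equation, not the published implementation: it restricts orientation to the two axes, steps along integer pixel offsets in both directions, skips the zero offset, and uses a hypothetical dot lattice in which dots are closer together vertically than horizontally. All function names are assumptions.

```python
import math

def directional_correlation(image, x, y, dx, dy, space_constant):
    """Weighted correlation of pixel (x, y) with pixels along direction (dx, dy), both ways."""
    height, width = len(image), len(image[0])
    reach = max(height, width)
    total = 0.0
    for r in range(-reach, reach + 1):
        if r == 0:
            continue  # skip self-correlation (an illustrative choice)
        xx, yy = x + r * dx, y + r * dy
        if 0 <= xx < width and 0 <= yy < height:
            total += image[y][x] * image[yy][xx] * math.exp(-abs(r) / space_constant)
    return total

def decision_ratio(image, space_constant):
    """Summed vertical correlation divided by summed horizontal correlation."""
    height, width = len(image), len(image[0])
    vertical = sum(directional_correlation(image, x, y, 0, 1, space_constant)
                   for y in range(height) for x in range(width))
    horizontal = sum(directional_correlation(image, x, y, 1, 0, space_constant)
                     for y in range(height) for x in range(width))
    return vertical / horizontal

def dot_lattice(size=16, x_step=4, y_step=2):
    """Binary dot lattice; with the defaults, dots are denser vertically (proximity cue)."""
    return [[1.0 if (x % x_step == 0 and y % y_step == 0) else 0.0
             for x in range(size)] for y in range(size)]

vertical_bias = decision_ratio(dot_lattice(), space_constant=2.0)
horizontal_bias = decision_ratio(dot_lattice(x_step=2, y_step=4), space_constant=2.0)
```

With these settings the first lattice yields a ratio above 1 (vertical organization) and its transpose yields a ratio below 1. Recomputing the ratio with larger values of `space_constant` is one way to explore the suggestion that the integration range grows with processing time.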
4. Visual attention

Visual attention is assumed to link image features into objects [7,9], to integrate filter responses [34,35], or to select a region in the visual field for enhanced processing [36]. As image features (e.g. orientation, size, color) are not represented uniquely within the filter representation used here (assuming filters perform weighted linear broadband integration within a multidimensional space), attention is assumed here to derive image features from filter representations by integrating filter responses. On this account, attention drives visual processes that reconfigure filters by modulating their (excitatory) interactions to create ensembles that are better tuned to specific image features. Though these processes can be implemented in a feed-forward fashion by creating higher level filters using dynamic receptive fields [37], it is attractive to assume a single layer feedback network capable of handling global processes (as needed for shape analysis). Selective attention seems to be controlled by grouping processes [35,38], as the number of image items that can be processed in parallel within the 'spot-light' of attention depends on stimulus parameters affecting perceptual grouping. Two image items having the same spatial frequency or location (e.g. Gabor signals [35]) require the same processing time as a single item to be identified, implying some similarity and proximity based grouping processes preceding attention allocation. Excitatory interactions can underlie these grouping processes, with a similarity metric defined by filter parameters (orientation, spatial frequency and location). It is interesting to note that grouping processes (see Section 3.3) are critically dependent on the availability of attention [39,40], in agreement with the assumption of attentive control over the efficacy of excitatory interactions. Thus attention and grouping seem to be mutually dependent.
Increasing efficacy of excitatory interactions enables long range correlations to affect grouping and thus an optimization of attentive resources allocation. Duncan and Humphreys [38] suggested that hierarchical segmentation and grouping is a basic capacity-free stage in visual processing. This visual selection theory has three components: a parallel stage of perceptual description, a selection process and the entry
of selected information into visual-short-term-memory (VSTM) which allows control of access to awareness. According to this theory, the perceptual description is made by a process of hierarchical segmentation of the image into linked groups and subgroups (structural units). Each structural unit is described by its elementary sensory properties (relative position, color, size, motion etc.) and categorical properties (based on meaning). The above described process of segmentation and description is said to be parallel and resource free; selection starts when all parts of the display compete for access to the VSTM. Linked structural units tend to gain or lose resources (activity strength) together. Increased assignment of resources to any structural unit increases its speed and probability for access to VSTM. Once a structural unit emerges as the 'winner' in the visual selection process it accesses VSTM. Duncan and Humphreys [38] propose that "Structural units act as wholes, competing for and gaining access to VSTM with all their associated descriptions". It is possible that structural units are created by linking (grouping) processes within the early filter representation. If this is the case, attention allocation should be constrained by early vision architecture, that is, by excitatory connectivity. Filters that are connected by excitatory interactions can be linked to create a 'structural unit'. The linking process would involve 'attentive' fast synaptic modifications to strengthen the desired links, or to weaken the links between other filters. However, on this account, the linking process is not 'resource free', as the system is limited by connectivity, and to some extent by the flexibility of the attentive synaptic gating mechanism.
5. From attention to memory

Lateral interactions are probably modulated by attention on a fast time scale (msecs) and can undergo longer term (years) modifications as in perceptual learning. These two processes operate on different time scales but are probably linked through some intermediate memory structures operating on time scales of several minutes to a few hours. Experiments involving lateral masking learning show an increase of interaction range with practice [4]. In order for practice to be successful, observers had to practice all target to mask distances within each practice session. Learning was not observed when observers were practicing on long distances alone, without performing on intermediate distances. Increasing the difference between the distance samples used in the experiments also prevented learning [4]. It seems that different distances have to be experienced within a single session, probably with neighboring filters activated within a time window of a few minutes, enabling generation of associations between stimuli that are close in space and time. Further evidence for a low level memory operating on a time scale of a few minutes comes from visual imagery experiments. In these recent experiments [55] observers were asked to imagine the high contrast masks while detecting the low contrast target in the lateral masking experiment (see Figure 2). Results from these imagery experiments show enhanced detection following the perception effects, though at about half the magnitude. However, enhancement by imagery can be obtained only within a time window of a few
minutes after performing the 'real' perception task, and with the same targeted eye. Passive inspection of the mask prior to the imagery action, or running as few as ten trials before the imagery task, is not sufficient to obtain the imagery based enhancement. These results support the existence of a low level (monocular) memory, probably iconic in nature, that stores filter activity for a few minutes. This storage is activated by higher level processes (e.g. attention, as a task is necessary) and is accessible by higher level processes (such as imagery). Thus this memory seems to match the one that is used for establishing spatio-temporal associations. [Though spatio-temporal associations are established during a time period of a few minutes, a few hours may be required for their consolidation [45,54].]
6. Conclusion

Non-isotropic excitatory interactions between oriented spatial filters provide activity enhancement from neighboring filters, thus enabling some useful global image characteristics to spread within the network of early vision and to be detected (e.g. enhancement of long lines [3,53], closure [48], enhancement of closed contours [52]). Furthermore, it is assumed here that excitatory weights are modulated by a top-down process (i.e. visual attention), introducing memory dependent context. It is suggested that excitatory transmission efficacy is set to some low default value under nonattentive conditions and can increase with increasing levels of attention. Though a global control of excitatory efficacy is assumed here, it is also possible that some specific weights, or patterns of weights, can be modulated by selective attention, providing a fast space invariant object filter. Spatial selectivity is inherent in the concept of visual attention, as visual detection is enhanced within attended regions of the visual field [36,46]. Attention was also suggested to be involved in binding image features to objects [7,9], thus serving as the bridge between featural image representation and the perceived Gestalt. Within the context of the present framework, binding can be achieved by selective modulation of weights, probably determined also by memory stored patterns, with object parts linked by increased connectivity between their corresponding filters. Note that we do not allow for memory dependent context to be effective without any input dependent context available, as memory affects only connectivity. (Transmission efficacy may also be controlled by weak top-down signals providing subthreshold input to filters and thus improving their responsiveness to other inputs [56], sensory as well as lateral.) Visual inputs should assume some spatial correlations in order for connectivity modulations to be efficient.
However, a rich enough (or noisy) input can generate patterns of spurious correlations with sufficiently organized filter activity to enable modulation by memory. The effect of these spurious meaningful images on perception will depend then on the weighting factors, which may vary with time, or between brains.
REFERENCES

1. Porter, P. (1954). Another picture-puzzle. American Journal of Psychology 67, 550-551.
2. Karni, A. and D. Sagi (1991). Where practice makes perfect in texture discrimination: Evidence for primary visual cortex plasticity. Proceedings of the National Academy of Sciences, USA 88, 4966-4970.
3. Polat, U. and D. Sagi (1993). Lateral interactions between spatial channels: Suppression and facilitation revealed by lateral masking experiments. Vision Research 33, 993-999.
4. Polat, U. and D. Sagi (1994). Spatial interactions in human vision: from near to far via experience dependent cascades of connections. Proceedings of the National Academy of Sciences, USA 91, 1206-1209.
5. Polat, U. and A. M. Norcia (1996). Neurophysiological evidence for contrast dependent long range facilitation in the human visual cortex. Vision Research, in press.
6. Sagi, D. and D. Tanne (1994). Perceptual learning: learning to see. Current Opinion in Neurobiology 4, 195-199.
7. Julesz, B. (1981). Textons, the elements of texture perception and their interactions. Nature 290, 91-97.
8. Marr, D. (1982). Vision. Freeman, New York.
9. Treisman, A. (1985). Preattentive processing in vision. Computer Vision Graphics and Image Processing 31, 156-177.
10. Hartline, H. K. (1938). The response of single optic nerve fibers of the vertebrate eye to illumination of the retina. American Journal of Physiology 121, 400-415.
11. Barlow, H. B. (1953). Summation and inhibition in the frog's retina. Journal of Physiology, London 119, 69-88.
12. Young, T. (1802). On the theory of light and colors. Philosophical Transactions of the Royal Society 92, 12.
13. von Helmholtz, H. (1909). Physiological Optics. Optical Society of America, 1925. Republished by Dover, New York.
14. Hubel, D. and T. Wiesel (1962). Receptive fields, binocular interaction and functional architecture in the cat's striate cortex. Journal of Physiology, London 166, 106-154.
15. Campbell, F. and J. Robson (1968). Application of Fourier analysis to the visibility of gratings. Journal of Physiology, London 197, 551-566.
16. Graham, N. (1977). Visual detection of aperiodic spatial stimuli by probability summation among narrowband channels. Vision Research 17, 637-652.
17. Watson, A. B. (1983). Detection and recognition of simple spatial forms. In Physical and Biological Processing of Images. Eds. Braddick O.J. & Sleigh A.C., Berlin: Springer-Verlag.
18. Wilson, H. R. (1983). Psychophysical evidence for spatial channels. In Physical and Biological Processing of Images. Eds. Braddick O.J. & Sleigh A.C., Berlin: Springer-Verlag.
19. Barlow, H. B. (1972). Single units and sensation: a neuron doctrine for perceptual psychology? Perception 1, 371-394.
20. Marcelja, S. (1980). Mathematical description of the responses of simple cortical cells. Journal of the Optical Society of America 70, 1297-1300.
21. Campbell, F. and J. Kulikowski (1966). Orientational selectivity of the human visual system. Journal of Physiology, London 187, 437-445.
22. Phillips, G. C. and H. R. Wilson (1984). Orientation bandwidths of spatial mechanisms measured by masking. Journal of the Optical Society of America A 1, 226-232.
23. Blakemore, C. and F. Campbell (1969). On the existence of neurones in the human visual system selectively sensitive to the orientation and size of retinal images. Journal of Physiology, London 203, 237-260.
24. Stromeyer, C. F. and B. Julesz (1972). Spatial-frequency masking in vision: Critical bands and spread of masking. Journal of the Optical Society of America 62, 1221-1232.
25. Legge, G. E. (1979). Spatial frequency masking in human vision: binocular interactions. Journal of the Optical Society of America 69, 838-847.
26. Sagi, D. and S. Hochstein (1983). Discriminability of suprathreshold compound spatial frequency gratings. Vision Research 23, 1595-1606.
27. Watson, A. B., H. B. Barlow and J. G. Robson (1983). What does the eye see best? Nature 302, 419-422.
28. Zenger, B. and D. Sagi (1996). Isolating excitatory and inhibitory non-linear spatial interactions involved in contrast detection. Vision Research, in press.
29. Gilbert, C. D. (1993). Circuitry, architecture and functional dynamics of visual cortex. Cerebral Cortex 3, 373-386.
30. Polat, U. and D. Sagi (1994). The architecture of perceptual spatial interactions. Vision Research 34, 73-78.
31. Foley, J. M. (1994). Human luminance pattern-vision mechanisms: masking experiments require a new model. Journal of the Optical Society of America A 11, 1710-1719.
32. Heeger, D. (1992). Normalization of cell responses in cat striate cortex. Visual Neuroscience 9, 181-197.
33. Georgeson, M. A. and J. M. Georgeson (1987). Facilitation and masking of briefly presented gratings: Time-course and contrast dependence. Vision Research 27, 369-379.
34. Sagi, D. (1990). Detection of an orientation singularity in Gabor textures: Effect of signal density and spatial frequency. Vision Research 30, 1377-1388.
35. Adini, Y. and D. Sagi (1992). Parallel processes within the "spot-light" of attention. Spatial Vision 6, 61-77.
36. Posner, M. I. (1980). Orienting of attention. Quarterly Journal of Experimental Psychology 32, 3-25.
37. Moran, J. and R. Desimone (1985). Selective attention gates visual processing in the extrastriate cortex. Science 229, 782-784.
38. Duncan, J. and G. W. Humphreys (1989). Visual search and stimulus similarity. Psychological Review 96, 433-458.
39. Ben-Av, M. B., D. Sagi and J. Braun (1992). Visual attention and perceptual grouping. Perception & Psychophysics 52, 277-294.
40. Mack, A., B. Tang, R. Tuma, S. Kahn, and I. Rock (1992). Perceptual organization and attention. Cognitive Psychology 24, 475-501.
41. Fogel, I. and D. Sagi (1989). Gabor filters as texture discriminator. Biological Cybernetics 61, 103-113.
42. Landy, S. L. and J. R. Bergen (1991). Texture segregation and orientation gradient. Vision Research 31, 679-691.
43. Malik, J. and P. Perona (1990). Preattentive texture discrimination with early vision mechanisms. Journal of the Optical Society of America A 7(5), 923-932.
44. Rubenstein, B. S. and D. Sagi (1990). Spatial variability as a limiting factor in texture discrimination tasks: Implications for performance asymmetries. Journal of the Optical Society of America A 7, 1632-1643.
45. Karni, A. and D. Sagi (1993). The time course of learning a visual skill. Nature 365, 250-252.
46. Sagi, D. and B. Julesz (1986). Enhanced detection in the aperture of focal attention. Nature 321, 693-695.
47. Braun, J. and D. Sagi (1990). Vision outside the focus of attention. Perception & Psychophysics 48, 45-58.
48. Koffka, K. (1935). Principles of Gestalt Psychology. New York: Harcourt Brace.
49. Ben-Av, M. B. and D. Sagi (1995). Perceptual grouping by similarity and proximity: Experimental results can be predicted by intensity autocorrelations. Vision Research 35, 853-866.
50. Malsburg, C. von der (1985). Nervous structures with dynamical links. Physical Chemistry 95, 703-710.
51. Braun, J., E. Niebur, H.G. Schusler and C. Koch (1995). Perceptual contour completion: a model based on local, anisotropic, fast-adapting interactions between oriented filters. Society of Neurosciences Abstracts 20, 1665.
52. Kovacs, I. and B. Julesz (1993). A closed curve is much more than an incomplete one: Effect of closure in figure-ground segmentation. Proceedings of the National Academy of Sciences, USA 90, 7495-7497.
53. Field, D. J., A. Hayes, and R. F. Hess (1993). Good continuation and the association field: evidence for local feature integration in the visual system. Vision Research 33, 173-193.
54. Polat, U. and D. Sagi (1995). Plasticity of spatial interactions in early vision. In B. Julesz & I. Kovacs (Eds.), Maturational Windows and Adult Cortical Plasticity, SFI Studies in the Sciences of Complexity, Vol XXIV, Addison-Wesley, Reading MA.
55. Ishai, A. and D. Sagi (1995). Common mechanisms of visual imagery and perception. Science 268, 1772-1774.
56. Aertsen, A. M. H. J., G. Gerstein, and P. Johannesma (1986). From neuron to assembly: Neuronal organization and stimulus representation. In G. Palm & A. Aertsen (Eds.), Brain Theory, Springer-Verlag, Berlin.
Figure 5. A hidden figure: solution.
Brain Theory - Biological Basis and Computational Principles A. Aertsen and V. Braitenberg (Editors) 1996 Elsevier Science B.V.
Psychophysical Mapping of Orientation Sensitivity in the Human Cortex

JOHANNES M. ZANKER & VALENTINO BRAITENBERG

Max-Planck-Institut für biologische Kybernetik, D-72072 Tübingen, Germany
Istituto per la Ricerca Scientifica e Tecnologica, I-38050 Trento/Povo, Italia
Based on anatomical and physiological knowledge about the representation of oriented stimuli in primate visual cortex, two attempts were made to analyse cortical orientation sensitivity maps in humans by means of psychophysics. (i) The minimum requirement to detect the orientation of a line in various regions of the visual field is characterised in terms of stimulus contrast and size. It turns out that, at sufficiently high contrast levels, a line needs only to span about 0.2 mm on the cortical surface in order to be recognised as oriented, independent of the actual eccentricity at which the stimulus is presented. This indicates that human orientation detection approaches the physical limits, requiring hardly more than approximately four input elements for the orientation of a line to be detected. (ii) Starting from this observation, line segments were generated which were shorter in their cortical representation than the diameter of a cortical hypercolumn. They were used in a first attempt to investigate the spatial distribution of orientation preferences. Indeed, maps emerge from the data which resemble the orientation maps known from optical recordings of the cortical surface. However, due to technical limitations these preliminary results should as yet be treated with some caution.
* Present address: Department of Psychology, University College London, Gower Street, London WC1E 6BT, England

1. INTRODUCTION

Visual space is represented in the cortex in a highly ordered manner. Each position in the visual field (defined in angular co-ordinates of the eye) corresponds to a well defined region in the cortical area V1, and similarly - though less precisely - at several subsequent stages of the visual stream. Such retinotopic representations led to the notion of cortical maps which, of course, do not produce isometric images of the scenery, but are considerably distorted by the architecture of the cortical surface (Daniel & Whitteridge, 1961; Mallot, 1985; Fox, Miezin, Allman, Essen, & Raichle, 1987; Tootell, Switkes, Silverman, & Hamilton, 1988; Horton & Hoyt, 1991). Furthermore, there is a systematic change of scale when going from the centre of the visual field, the fovea, to the periphery. The so-called 'M-scaling' or cortical magnification factor indicates how the number of neural elements - from retinal ganglion cells to cortical neurones - responsible for one degree of visual field decreases with increasing eccentricity according to a power function (Tynan & Sekuler, 1982; Cowey & Rolls, 1974; Rovamo & Virsu, 1979; Pointer, 1986; Wässle, Grünert, Röhrenbeck, & Boycott, 1989). This reflects the well-known fact that the spatial grain decreases from the fovea (which in fact is defined as the region of highest spatial resolution) to the periphery: A 1 mm x 1 mm region of cortical surface corresponds to a small part of the visual field in the fovea (about 1/8 x 1/8 degree²) and to a large part in the periphery (about 1 x 1 degree² at 20°, estimated after Rovamo, Virsu, & Näsänen, 1978).

In addition to this, there is ample physiological and anatomical evidence that certain stimulus features at a given position, such as ocularity, colour, local spatial frequency, or orientation are not randomly distributed over the cortex, but organised in spatial maps as well (Orban, 1984; Hubel & Livingstone, 1987; Livingstone & Hubel, 1988). For the representation of the local orientation of visual stimuli it is widely agreed that vertical columns of identical preferred orientation are arranged in two-dimensional arrays in which orientation changes systematically over the cortical surface (for review, see Hubel & Wiesel, 1977). Originally, this cortical orientation map was assumed to be best described by the 'ice cube model' in which columns tuned to one orientation are arranged in rows with a continuous rotation of the preferred orientation. A set of orientation columns comprising a complete rotation in the orientation domain was labelled 'hypercolumn' in this model, and was understood as the comprehensive representation of all stimulus features at a given location in the visual field. This basic scheme was later challenged by the idea of a concentric arrangement of different orientations around regularly spaced singularities (Seelen, 1970; Braitenberg & Braitenberg, 1979).
Such an organisation may be realised in two specific versions (Götz, 1987; Götz, 1988; Braitenberg, 1992), depending on what organisational principle is proposed to determine the orientation specificity (see figure 1, top). These considerations about the regular arrangement of orientation in the cortex were originally based on linear tracks of extracellular recordings from neurones in cortical area V1 which revealed orientation constancy or systematic changes in orientation, depending on the angle of the track relative to the plane of the cortex. Later these experiments were nicely complemented by optical recordings in which the surface arrangement of orientation preference was monitored in parallel (Blasdel & Salama, 1986; Tso, Frostig, Lieke, & Grinvald, 1990; Blasdel, 1992; Bonhoeffer & Grinvald, 1993). Such data revealed orientation maps close to those expected on the basis of the more recent models. They even allowed the singularities emerging in these maps to be related to the cytochrome-oxidase blobs known from histology in which neurones appear to be only weakly orientation tuned (Livingstone & Hubel, 1988).

When several features are mapped on the same surface, the question arises as to how the different maps are kept apart, and whether they are functionally related to each other. One major question which seems not to be answered by the experiments published so far is how the representations of position and of orientation interfere with each other. Is a cortical unit which represents a certain orientation at a certain position in the visual field excited from the same region in space as its neighbours which represent different orientations, or do different orientations also imply different locations in the visual field?
[Figure 1. Panel labels: cortical orientation map (top); receptive field organization: input sheet, patchy cooperation (bottom).]
Figure 1: Basic properties of cortical orientation maps. Top left and right: electrophysiological recordings suggest concentric arrangements of the preferred orientation of cortical units. In one variant each orientation (indicated by horizontal, vertical, or oblique bars) is represented twice (left square, after Braitenberg), in the other variant only once (right square, after Götz) when moving along a full circle around singularities in which no orientation is preferred (indicated by dots). In general, the concentric arrangement corresponds to optical recording data (top centre, after Bonhoeffer & Grinvald 1993; the pseudo-colour orientation map is converted to a grey-scaled image, singularities are indicated by dots). However, realistic orientation maps differ from the idealised schemes by irregularities in the fine structure, as demonstrated by overlaying bars of the four basic orientations over the grey-scaled orientation map at the positions where the preferred orientation is expressed most strongly. Bottom: two alternatives to interpret the large receptive fields of local neurones which are tuned to a specific orientation. Left: Each element (for example, centre element indicated by shading in the upper layer) could receive input from a large sheet of small, but densely arranged elements with the same preferred orientation ('line detectors', indicated by shaded elements in the lower layer), leading to a large, continuous receptive field represented at the given position. Right: A number of line detectors tuned to the same orientation (indicated by shaded elements in the single layer), but distributed across a region much bigger than the receptive field of the single element, could operate in an associative network, thus forming a large - though patchy - effective receptive field for either element by co-operation.
At first, one may be tempted to dismiss the latter alternative since the receptive field as defined by electrophysiologists, i.e. the region in the visual field in which the element under consideration is excited by a visual stimulus, is much bigger than an orientation column (Hubel & Wiesel, 1977). In fact, receptive fields are bigger than a hypercolumn when their size is measured in terms of cortical co-ordinates (see figure 1, bottom left). Thus the overlap between the receptive fields of neurones with different preferred orientation is considerable. However, such a view would have to be challenged if line segments much smaller than the classic receptive field could also be perceived as having a certain orientation. This would imply that each receptive field is in reality composed of a set of smaller - or at least narrower - orientation sensitive elements (Zanker, in prep.), all tuned to the same orientation but with different locations in the visual field. For simplicity, such small oriented receptive fields will be called here 'line detectors', irrespective of whether they respond best to lines or edges.

How could large receptive fields be constructed from such line detectors? Given that the set of pyramidal cells acting as an associative network is a plausible theory of the cortex (Braitenberg & Schüz, 1991), one would not be surprised to find that elementary line detectors tuned to the same direction within one region of the cortex join up in a Hebbian cell-assembly through a process of associative learning. If this were so, what the electrophysiologist records from a single cell would be the collective activity of all neurones which excite the neurone under observation, with a correspondingly large collective receptive field. In consequence, a certain orientation is only represented at a certain position in space but not at those being represented by the neighbouring units representing different orientations (figure 1, bottom right). Such a 'patchy' cooperation being responsible for orientation mapping would not lead to serious consequences under normal viewing conditions because stimuli usually provide oriented edges or lines which are much longer than a hypercolumn and thus would excite several units of the same preferred orientation simultaneously.
Furthermore, involuntary but permanent eye movements (Ratliff & Riggs, 1950), such as slow fixational drifts or tremor-like instabilities, will prevent small stimuli from being constantly centred on a single line detector. For these reasons, one could easily mistake a patchy co-operation of line elements for a hard-wired receptive field organisation in which an input sheet densely packed with line detectors feeds into the orientation sensitive cortical unit under observation (figure 1, bottom left). It should be possible, however, to determine the functional properties of the small subfields of conventional receptive fields either with electrophysiological recording (Schiller, Finlay, & Volman, 1976; Camarda, Peterhans, & Bishop, 1985a; Camarda, Peterhans, & Bishop, 1985b) or with psychophysical experiments. If the elementary receptive fields of the line detectors turn out to be small enough, smaller than the diameter of the hypercolumn, a possible link between their orientations and their respective positions within the hypercolumn could be unravelled.

The experiments which we describe here were designed to examine such a fine-grain orientation mapping. There were two preliminary questions to be asked: (i) What is the minimal length of a line in order for it to be seen as oriented? By varying the retinal eccentricity of an oriented stimulus, this question will deal with the problem of how the visual field is scaled to the cortex. (ii) Can the unsteadiness of fixation, which would make a fine-grain analysis quite impossible in the fovea, be overcome by mapping orientation in the more peripheral portions of the visual field? There each hypercolumn corresponds to a large region of the visual field (more than 1° visual angle) and the random errors due to the rotational movements of the eye bulb (which are the same for the whole field, about 10 min.arc after Ratliff & Riggs, 1950) should be less disturbing.
2. MATERIAL AND METHODS

Human observers were seated in front of a computer screen (SUN workstation; 1200 x 1000 image pixels displayed on the 36 x 30 cm screen, 66 Hz frame rate, 82% phosphor decay in 4 ms) at a viewing distance of 32.5 cm. The head of the subject was fixed by means of a head-chin rest which prevented gross head movements and guaranteed a constant viewing distance (see figure 2 top). Eye movements were restricted by instructing the observer to look constantly at the fixation target which was permanently present on the screen. In order to stimulate the right eye monocularly, i.e. to investigate only one of the two possibly overlapping orientation maps originating from either eye in isolation, the subject's left eye was covered by a black eye patch. The experiments were carried out in a quiet room illuminated by dim light. The subjects were asked to settle in the most comfortable position, to concentrate on the screen, and to report their decisions about the stimuli by pressing the computer's mouse buttons. All subjects were male colleagues of the authors, most of them naive with respect to the purpose of the experiments, and their average age was 29 years (between 23 and 36 years). No subject had any obvious ophthalmic disorder; some of them were wearing their usual optical corrections.

The stimuli were produced on the computer screen within a window of 1100 x 500 pixels (corresponding to 33 x 15 cm) which was filled with dark pixels (average intensity Imin usually about 4 cd/m²). A fixation target was displayed at a constant position within the dark background throughout the whole experiment. It consisted of a frame of 16 x 16 bright pixels and a single bright pixel which was presented in its centre during the intervals between the stimulations (Imax usually about 100 cd/m²). After an acoustic signal, the test stimulus was presented for 45 ms.
During this period the central bright pixel of the fixation target was replaced by a bright line: a row of eight pixels, for instance. In synchrony with this 'reference bar' a second line of the same length, the 'test bar', appeared at another position, i.e. in the periphery of the visual field of the observer, who was fixating the reference bar (see figure 2 bottom). The length of the reference bar, and the length and brightness of the test bar were treated as parameters for a series of experiments. The experimental variable was the orientation of either bar which could be horizontal (0°), vertical (90°) or oblique (45° or 135°). Note that the number of bright pixels may change for oblique bars, as compared to bars parallel to the pixel raster axes, when the absolute length of the bar is kept constant. After the presentation interval the fixation target reappeared on an otherwise dark screen, and was displayed for at least one second before the next stimulus was presented. The 16 possible combinations of the orientations of the two bars for each test position, and the test positions, were presented in random order.

The subjects were asked simply to report whether the orientation of test and reference bar was the same or different. When all 4 x 4 combinations of the orientations of the two bars are displayed in random order, the same orientation is presented in one out of four cases, on average. This means that chance level - when the orientation is not perceived reliably - would be at 25% correct decisions when subjects always press the 'same' button, or at 75% when the 'different' button is selected instead. Intermediate or floating strategies would lead to intermediate values, and 50% correct decisions are reached for decisions taken at random under those conditions. Since under certain experimental conditions (see below) the test bars were hardly visible at all, the subjects were allowed to vote for a third option described to them as 'undecided or invisible'. This deviation from the standard two-alternative forced-choice paradigm extends the dynamic range for the psychophysical measurements to a baseline of 0% correct responses for honest and careful subjects, but still allows for higher scores reported by more self-confident subjects even when the percept is actually weak. Thus the results of this three-alternative-choice procedure with unbalanced presentation probabilities have to be regarded with appropriate care.
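The chance levels quoted above follow directly from the 4 x 4 design. The following minimal Python sketch reproduces them under one simplifying assumption that is not part of the original procedure: the observer ignores the stimulus and presses 'same' with a fixed probability, never using the third 'undecided or invisible' option. The function name `expected_correct` is hypothetical.

```python
from fractions import Fraction
from itertools import product

# The four bar orientations used in the experiments (degrees).
ORIENTATIONS = (0, 45, 90, 135)


def expected_correct(p_same: Fraction) -> Fraction:
    """Expected fraction of correct decisions for a guessing observer.

    Assumption (illustrative only): the observer answers 'same' with
    probability p_same regardless of the stimulus and never chooses the
    'undecided or invisible' option.
    """
    pairs = list(product(ORIENTATIONS, repeat=2))  # 16 reference/test pairs
    n_same = sum(1 for ref, test in pairs if ref == test)
    frac_same = Fraction(n_same, len(pairs))  # 4/16 = 1/4
    # Correct when 'same' is answered on a same-pair,
    # or 'different' is answered on a different-pair.
    return frac_same * p_same + (1 - frac_same) * (1 - p_same)


if __name__ == "__main__":
    for label, p in (("always 'same'", Fraction(1)),
                     ("always 'different'", Fraction(0)),
                     ("random", Fraction(1, 2))):
        print(f"{label}: {float(expected_correct(p)):.0%} correct")
```

Because the expected score depends on the guessing strategy, a raw percentage between 25% and 75% cannot by itself distinguish weak perception from a response bias, which is one reason the third response option matters.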
[Figure 2. Panel labels: display, observer, viewing distance, mouse (top); fixation square, reference bar, test bar, orientation, eccentricity, size (bottom).]

Figure 2: Experimental set-up and stimulus configuration. Top: The subject was seated in front of a computer monitor and the head position was fixed by means of a head-chin rest. The subject observed the display monocularly at a viewing distance of 32.5 cm and reported the decisions by pressing the buttons of the computer mouse. Bottom: Within a dark background a reference bar was presented in the fixation square, together with a test bar displayed at a variable eccentricity. The orientation and the size (length) of the target were treated as experimental variables.
In addition to calculating the average performance over all orientations, the preferred orientation was estimated for each tested position in the visual field. For this purpose the number of correct responses was registered separately for each orientation of the test bar. Then a regression was calculated between the percentages of correct responses to the four bar orientations (at angular intervals of 45°) and a rectified cosine function for which the phase was shifted in 15° steps relative to the data points. The angular position of the best fitting cosine, i.e. that leading to the largest regression coefficient, was taken as an estimate of the preferred orientation. In the corresponding figures the preferred orientation is symbolised by a bar plotted at the angle leading to the best fit, with its length indicating the regression coefficient as a measure of the reliability of this estimate. It should be noted that in some cases no preferred orientation can be estimated (dots in the graphs), namely when bars with orthogonal orientation are detected by the subject with the same success.
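The fitting procedure can be sketched in a few lines of Python. The text does not give the exact form of the rectified cosine or of the regression coefficient, so this sketch makes two assumptions: the template is the full-wave rectified cosine |cos(θ - φ)|, which has the 180° period of orientation, and the goodness of fit is the Pearson correlation coefficient. All function names are hypothetical.

```python
import math

# Orientations of the test bar (degrees), at 45 degree intervals.
TEST_ORIENTATIONS_DEG = (0, 45, 90, 135)


def rectified_cosine(theta_deg, phase_deg):
    """Assumed template: full-wave rectified cosine with 180 degree period."""
    return abs(math.cos(math.radians(theta_deg - phase_deg)))


def pearson(x, y):
    """Pearson correlation; returns None if either input has zero variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        return None
    return sxy / math.sqrt(sxx * syy)


def preferred_orientation(percent_correct, step_deg=15):
    """Return (phase_deg, r) of the best-fitting template, or None.

    percent_correct holds the scores for 0, 45, 90 and 135 degrees.
    None is returned when no estimate is possible; in this sketch that
    only covers the degenerate case of four identical scores.
    """
    best = None
    for phase in range(0, 180, step_deg):
        template = [rectified_cosine(t, phase) for t in TEST_ORIENTATIONS_DEG]
        r = pearson(percent_correct, template)
        if r is not None and (best is None or r > best[1]):
            best = (phase, r)
    return best


if __name__ == "__main__":
    scores = [90.0, 60.0, 20.0, 60.0]  # made-up example data
    print(preferred_orientation(scores))
```

The returned coefficient plays the role of the bar length in the figures: the closer it is to 1, the more reliable the orientation estimate. A different template (for example a half-wave rectified cosine of twice the angle) would be a one-line change in `rectified_cosine`.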
3. ORIENTATION DETECTION AND CORTICAL MAGNIFICATION

The first experiment was designed to determine how far out in the peripheral visual field the orientation of a small line segment can be detected. For this purpose the fixation target was centred in the stimulus window, and a test bar was presented along the horizontal meridian to the left or to the right of this spot, i.e. in the nasal or temporal part of the visual field. The eccentricity was varied between 150 and 500 pixels away from the fixation point, corresponding to 7.5°-25° visual angle. The size of the reference and the test bar was 8 screen pixels (2.4 mm, corresponding to 0.42° in the fovea), and the orientations and the position of the test bar were randomised as described before. This basic experiment was repeated with four different stimulus contrasts to vary the strength of a given stimulus without changing its geometry. The Michelson contrast, defined as (Imax-Imin)/(Imax+Imin), was set to 40%, 63%, 82% and 94%, respectively. Each orientation combination was tested twice in two successive blocks, and the correct responses of a subject were pooled for all tested combinations. This led to 32 decisions for each of 2 x 8 possible test bar positions in the two hemi-fields. The average number of correct decisions from 5 subjects, each deciding 512 times for one stimulus contrast, is plotted in figure 3 as a function of stimulus eccentricity.
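As a quick check on the numbers in this paragraph, the following Python sketch evaluates the Michelson contrast formula and the trial bookkeeping. The function and constant names are hypothetical, and the luminances in the usage example are the nominal values ("usually about" 4 and 100 cd/m²) quoted in the Methods rather than measured settings for any particular contrast condition.

```python
def michelson_contrast(i_max: float, i_min: float) -> float:
    """Michelson contrast (Imax - Imin) / (Imax + Imin)."""
    return (i_max - i_min) / (i_max + i_min)


# Trial bookkeeping for one subject and one stimulus contrast.
N_ORIENTATIONS = 4           # 0, 45, 90 and 135 degrees
N_REPEATS = 2                # each combination tested in two successive blocks
N_POSITIONS_PER_FIELD = 8    # test bar positions in each hemi-field
N_HEMIFIELDS = 2             # nasal and temporal

decisions_per_position = N_ORIENTATIONS * N_ORIENTATIONS * N_REPEATS
decisions_per_subject = (decisions_per_position
                         * N_POSITIONS_PER_FIELD * N_HEMIFIELDS)

if __name__ == "__main__":
    # Nominal luminances from the Methods section (cd/m^2).
    print(f"contrast at 100 vs 4 cd/m^2: {michelson_contrast(100.0, 4.0):.3f}")
    print(f"decisions per test position: {decisions_per_position}")
    print(f"decisions per subject and contrast: {decisions_per_subject}")
```

The counts agree with the 32 decisions per position and 512 decisions per subject stated in the text.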
[Figure 3 plot; horizontal axis: eccentricity [deg].]

Figure 3: Orientation detection for various line contrasts. The percentage of correct decisions (n=5 subjects), pooled for all test bar orientations, is plotted as function of stimulus eccentricity (angular position in the visual field) for both visual hemi-fields (nasal: negative eccentricity; temporal: positive eccentricity). The Michelson contrast of the test bar was set to 40% (circles), 63% (squares), 82% (up triangles), and 93% (down triangles). In general, performance is reduced with increasing eccentricity and decreasing contrast; in the temporal field performance drops to about zero in the blind spot (16°-18° eccentricity).
In the left hemi-field of the right eye, i.e. in the nasal visual field, the percentage of correct decisions depends on stimulus eccentricity in a very clear pattern. At high and medium contrasts the orientation is detected reliably when the bar is presented close to the fovea, and performance drops with increasing eccentricity. Higher contrasts always lead to a better performance, meaning that at lower contrast the curve deviates from 100% correct responses closer to the fovea, and approaches lower values in the farther periphery. For the lowest contrast the overall performance is reduced in general. In this case the percentage of correct responses remains below 100% even close to the fovea, and drops to zero already at about 20° in the periphery, whereas at high contrast the performance deviates from 100% correct responses only in the far periphery. Altogether, the curves resemble a set of psychometric functions, which either can be interpreted as having different slopes and saturation levels (cf. curves for 40% and 63% contrast in figure 3), or as being shifted along the abscissa (cf. curves for 63% and 93% contrast in figure 3). Which of the two alternatives is more plausible, or whether they both hold for different contrast ranges, cannot be decided based on this limited set of data.

In the right hemi-field of the right eye, i.e. in the temporal part of the visual field, this basic pattern of orientation detection performance is overruled by an additional peculiarity appearing at eccentricities between 15° and 18°. In this range, the performance is reduced to zero values for all stimulus contrasts. This behaviour was to be expected from the fact that the blind spot is located at this position in the visual field, and therefore the subjects should not be able to detect the stimuli at all in this region. This fact can be used to control for the reliability of the subjects' fixation by placing one test bar position in the blind spot.
In concentrated and honest subjects, the detection probability should always be zero in the blind spot when it is stimulated from time to time during each experiment. Two basic conclusions can be drawn from the experimental results presented so far. The performance for orientation detection of a bar of a given size decreases (i) with decreasing stimulus contrast, and (ii) with increasing eccentricity of its presentation within the visual field. The latter result could either be due to contrast sensitivity which might be reduced in the periphery, and therefore could lead to a lower percentage of correct decisions, or to the fact that performance is limited by the bar size which has to be scaled with the spatial grain of the visual system as discussed in the introduction. The projection of a test bar of given physical length is larger in cortical coordinates when it is presented near the fovea than farther out in the periphery. To test this cortical scaling option, the physical bar size has to be varied in such a way as to keep the size of its cortical projection constant. In a second series of experiments, performance for orientation detection was therefore measured at various eccentricities with bars which were 4, 8, or 16 screen pixels long (1.2 mm, 2.4 mm, and 4.8 mm on the screen correspond to approximately 0.21°, 0.42°, and 0.85° visual angle, respectively, in the fovea). In these experiments we concentrated on the nasal visual field where orientation detection is not disturbed by the blind spot and steady performance curves have to be expected. We also included one test position close to the fovea and one within the blind spot in the temporal visual field to make sure that the subjects did not move their gaze or focus of attention. Targets appearing from time to time in the blind spot could be detected by the subjects only if their gaze were shifted from the fixation point during the experiments. The
two measurements in the temporal visual field thus allowed a coarse assessment of the quality of fixation and of the subject's decision strategy, by providing one control position of maximum and one of minimum detection probability. The results for seven subjects are plotted in figure 4. Again the number of correct decisions was pooled for all target orientations, giving the average orientation detection probability as function of test bar eccentricity, for three different bar sizes and two contrast values (94% on the left, and 63% on the right side of fig. 4, respectively) treated as stimulus parameters.
[Figure 4 plot: panels "high contrast" (left) and "low contrast" (right); abscissa: eccentricity [deg].]
Figure 4: Orientation detection for various line lengths. The percentage of correct responses for orientation detection is plotted as function of stimulus eccentricity for high (left side: c=94%, n=7 subjects) and low contrast (right side: c=63%, n=5 subjects). The length of the bar was set to 1.2 mm (dots), 2.4 mm (squares), or 4.8 mm (triangles), corresponding to 0.21° - 0.85° visual angle in the fovea. At either contrast level, performance at a given eccentricity is poorer for smaller bars than for larger bars.

It is obvious for either contrast that performance decreases with increasing eccentricity, just as it was observed in the first experiment. In addition, the three curves are clearly arranged with respect to stimulus size. Orientation detection reaches a given level at smaller eccentricities when shorter test bars are used, as compared to longer bars. In other words, the orientation of a longer bar can be detected further out in the periphery than that of a shorter bar. This basic behaviour holds for both stimulus contrasts, but the performance is generally reduced for the low contrast stimuli. Again, it is not perfectly clear whether this is an effect of different slopes of the psychometric curves or of different positions along the abscissa. This combined dependence of the detection probability on stimulus eccentricity and size is to be expected from the knowledge about the spatial grain of cortical organisation mentioned in the introduction. According to the cortical magnification a given part of the visual field is represented by a certain number of ganglion cells in the retina and a certain area on the cortex (and a corresponding number of cortical neurones) which decreases according to a power function. This means that a line of constant length will excite more neural elements when presented in the fovea, as compared to peripheral stimulation. The M-scale factor indicates
what distance on the cortex corresponds to what angular distance in the visual field for a given position within the visual field. For the nasal meridian, for instance, 1 mm on the cortex surface covers a viewing angle of about 8' at 0° eccentricity (i.e., in the fovea) and about 1° at about 20° eccentricity. According to Rovamo and Virsu (1979), the magnification can be well approximated by the polynomial $MF = 7.99 / (1 + 0.33 \cdot E + 0.00007 \cdot E^{3})$ in the nasal visual field, with E being the eccentricity, and MF the cortical size of one degree linear distance within the visual field, given in mm/°. This formula can be utilised to estimate for our experiments the size of the test bar as it is projected on the cortex. Following this line of thought, the performance of orientation detection is replotted in figure 5, now showing the percentage of correct decisions about the orientation of a bar as function of its cortical size. It can be seen immediately that the curves for the three bar sizes collapse when the bars are presented at the high contrast (left side). Irrespective of the physical bar size, which was varied as experimental parameter, the data points seem to describe a single function, sharply increasing from a value of about 40% at a cortical size of roughly 0.1 mm to approach 100% saturation at about 0.5 mm cortical size. This indicates that at high contrast the length of the cortical representation of a stimulus is the decisive variable which determines orientation detection performance. Further inspection of the same graph reveals an important feature of this size dependence of orientation detection: A performance between the minimum and the maximum level, i.e. about 60% - 80% correct decisions, may be regarded as a reasonable criterion to estimate threshold cortical size which indicates a reliable detection of orientation. 
This level is reached for cortical sizes of 0.2 mm - 0.3 mm, which is definitely less than the typical diameter of a hypercolumn being roughly 1 mm at any region of the primate cortex. This means that there is a good chance to investigate orientation detection in small regions which only cover a part of a hypercolumn when short bars are used as stimuli. Using bars which span about 0.2 mm on the cortical plane will allow stimulation of local groups of orientation columns without touching neighbouring columns responding to different orientations, and therefore an important precondition for investigating orientation maps seems to be fulfilled. Another surprising conclusion can be drawn from inspecting figure 5 when the density of input fibres on a given cortical area is considered. A rough estimate of the average number of input lines feeding from the retina onto a patch of cortical surface yields a value of 380 optic nerve fibres per mm² striate cortex ($1.01 \cdot 10^{6}$ fibres project on 2613 mm² cortical surface of area 17; according to Blinkov & Glezer, 1968). This leads to a value of about 20 x 20 fibres from the eyes spreading into an average rectangular cortex patch of 1 mm side-length. Similar figures of 10 to 15 input elements covering a distance of about 1 mm in the cortex emerge from more elaborate considerations of visual functions and cortical anatomy in the monkey (Braitenberg, 1985). If 0.2 mm cortex are sufficient to detect the orientation of a bar, it turns out that no more than 4 input elements appear to be necessary to detect orientation, assuming that our crude approximation of input density is appropriate. This is close to the theoretical limit, regarding each input fibre as a 'cortical image pixel': If the receptive fields of retinal
ganglion cells, and hence of geniculo-cortical fibres, are circular and not elongated, the minimal set of inputs necessary to recognise the orientation of a line is obviously two.
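The input-density estimate above is simple arithmetic and can be retraced in a few lines. The following is a minimal sketch, not part of the original analysis: the function names are illustrative, the fibre count of 1.01 million is an assumption inferred from the quoted density of about 380 fibres per mm², and the cortex is treated as a homogeneous square lattice of inputs, as in the text.

```python
import math

# Values quoted in the text (after Blinkov & Glezer, 1968).
# The exponent (10^6) is an assumption inferred from the quoted density.
OPTIC_NERVE_FIBRES = 1.01e6   # fibres in the optic nerve
STRIATE_AREA_MM2 = 2613.0     # surface of area 17, in mm^2


def fibres_per_mm2(n_fibres=OPTIC_NERVE_FIBRES, area_mm2=STRIATE_AREA_MM2):
    """Average number of input fibres per mm^2 of striate cortex."""
    return n_fibres / area_mm2


def fibres_along_length(length_mm, density_per_mm2):
    """Number of input fibres lined up along a cortical distance,
    assuming a homogeneous square lattice of inputs."""
    return length_mm * math.sqrt(density_per_mm2)


density = fibres_per_mm2()                              # a little under 390 per mm^2
fibres_per_mm = math.sqrt(density)                      # just under 20 per mm
inputs_at_threshold = fibres_along_length(0.2, density) # about 4
```

Under these assumptions a cortical length of 0.2 mm spans roughly four input fibres, which is the figure used in the argument above.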
[Figure 5 plot: panels "high contrast" (left) and "low contrast" (right); legend: 1.2 mm bar, 2.4 mm bar, 4.8 mm bar; abscissa: cortical size [mm].]
Figure 5: Orientation detection and cortical size. The percentage of correct responses (same data and conventions as shown in figure 4, but note different scales on the ordinates) is plotted as function of cortical size by scaling the measurements from the nasal visual field with the magnification factor. At high contrast, the performance curves for the three physical test bar sizes collapse to a single psychometric function, indicating that under these circumstances cortical size is the critical variable for orientation detection.

When stimulating at low contrast (right side of fig. 5), the curves for the different bar sizes do not overlap as nicely as for high contrast. This may indicate that other factors besides stimulus size, such as local contrast enhancement features, may become relevant or even limiting for the detection of orientation when contrast is lowered and may approach the threshold for the detection of a target per se. Under these conditions it also has to be taken into account that not only line length but also its width (determined by the size of a single screen pixel) will vary in its cortical image size with eccentricity. Hence the cortical width may be larger for a physically smaller bar at a smaller eccentricity than for a longer bar which has the same cortical length due to larger eccentricity. According to this consideration the physiologically effective contrast may vary with the aspect ratio, leading to better detection of the bar (irrespective of its actual orientation) when the width-to-length fraction is larger. In fact, a better performance for smaller bars seems to emerge from the data shown on the right side of figure 5. Independent of this possible shift of the psychometric functions with cortical bar width, the performance at low contrast reaches half-maximum values (about 50%) in the region between 0.2 and 0.5 mm cortical size of the stimulus. 
Allowing for some amount of scatter in the data, this is close to the observation for high contrast values. In any case the saturation performance level is reached with a cortical stimulus size of 0.5 - 0.7 mm, still corresponding to less than the diameter of a hypercolumn (about 1 mm after Braitenberg, 1985). As reflected by these rather rough but conservative estimates, for low contrast bars the increase of performance with increasing cortical size (i.e. decreasing eccentricity) is less steep, and overall performance is worse. Because the base level of about 0% correct responses
(please note the different scales of the ordinates in left and right part of figure 5) corresponds to the subject's decision 'invisible', it may be suggested that at low contrast the detection of the stimulus per se could indeed be limiting the performance, instead of the detection of its orientation. At least, it cannot be excluded that different mechanisms might be responsible for orientation detection in the low and high contrast domains.
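The cortical scaling used to replot the data in figure 5 can be sketched as follows. This is a hedged illustration rather than the authors' analysis code: the function names are hypothetical, and the cubic term with coefficient 0.00007 is an assumption based on the commonly cited form of the Rovamo and Virsu (1979) approximation for the nasal visual field. With these values the sketch reproduces the figures quoted in the text (about 1° per mm of cortex at 20° eccentricity, and about 0.21 mm of cortex for a 0.32° bar at 29°).

```python
def magnification_nasal(ecc_deg, m0=7.99, a=0.33, b=0.00007):
    """Cortical magnification in mm of cortex per degree of visual angle
    at eccentricity ecc_deg (nasal visual field).

    Assumed form: M = m0 / (1 + a*E + b*E**3); m0 is the foveal value.
    """
    return m0 / (1.0 + a * ecc_deg + b * ecc_deg ** 3)


def cortical_size_mm(bar_deg, ecc_deg):
    """Length of the cortical projection of a bar subtending bar_deg
    degrees of visual angle, presented at eccentricity ecc_deg."""
    return bar_deg * magnification_nasal(ecc_deg)


# Degrees of visual angle covered by 1 mm of cortex at 20 deg eccentricity.
deg_per_mm_at_20 = 1.0 / magnification_nasal(20.0)

# Cortical length of a 0.32 deg bar at 29 deg eccentricity.
size_at_29 = cortical_size_mm(0.32, 29.0)
```

Note that the function takes the angular size of the bar at its actual screen position; on a flat screen a bar of fixed physical length subtends a smaller angle in the periphery than in the fovea, which has to be accounted for before the scaling is applied.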
4. IN SEARCH OF CORTICAL ORIENTATION MAPS

As mentioned before, the diameter of a hypercolumn is about 1 mm on the cortex (Braitenberg, 1985), independent of the actual position in the visual field as expressed by the cortical magnification. This means that a patch of cortex covering all orientations, corresponding to the average distance between two blobs in which no preferred orientation is observed, has the same size all over the striate cortex, but represents an increasing part of the visual field with increasing eccentricity. Knowing that we can detect the orientation of bars which in their cortical projection are much smaller than a hypercolumn, we felt motivated to measure the regional variation of orientation detection, looking for maps of preferred orientation. Using the same set-up and stimulus configuration as in the last experiment, we stimulated in the far periphery, around 29° within the nasal visual field. Since at this eccentricity a 1 x 1 mm² cortex patch (i.e. a hypercolumn) corresponds to more than 1 x 1 degree² within the visual field, we tested orientation detection at 5 x 5 positions within a region of about 1.6 x 1.6 degree². The subjects were stimulated with bars consisting of 8 pixels in a line (corresponding to 0.32° visual angle or 0.21 mm of cortex) at high contrast (c=94%). Each of the 25 positions and each of the 16 orientation combinations was tested twice in two successive blocks in a single experiment which demands up to two hours of full concentration from the subject. The results of this experiment run on two experienced subjects can be seen in figure 6. The proportion of correct responses was about 50% on average (34%-69%) for the first subject, and about 65% (43%-81%) for the second, and did not vary systematically within the area tested. 
Therefore overall performance is not shown in the figure, and only the preferred orientation and the reliability of the estimate is indicated by the orientation and the length of the bars plotted into the 25 test fields, as described in the methods section. In both examples shown in figure 6 the bars seem to be grouped in some order. First of all, there are no specific preferred orientations which are not represented in the test field, basically all orientations appearing at some position. At some locations, on the other hand, no preferred orientation can be estimated (indicated by dots in figure 6), or the orientation selectivity is very weak (short bars). Most spatial changes between the preferred orientations are smooth, in that orientations in neighbouring test positions resemble each other, whereas they may differ considerably when the test positions are further apart. In this respect, the psychophysical measurements seem to resemble an 'orientation map'. Furthermore, there happens to be a central position without clear orientation preference, a singularity, for which the preferred orientations of the neighbouring test positions are arranged very similar to a tangential circle (shaded test fields). The size of this vortex-like pattern even matches the expected hypercolumn size which has a diameter corresponding to about 1° viewing angle in
this part of the visual field. Of course, there are distortions which impede the basic pattern extracted from these first measurements, but this also resembles to some extent the physiological observations. In the cortical orientation maps visualised by means of optical recording techniques, the pattern of preferred orientations can have local irregularities, as can be seen in the example from the literature data sketched in figure 1.
[Figure 6 plot: abscissa: horizontal distance from fovea [°].]
Figure 6: Psychophysical orientation map in the peripheral visual field. The preferred orientations and the reliability of the estimate are shown for each test position as orientation and length of a bar plotted in a circle indicating roughly the size of the test bar. Orientation detection was tested in two subjects (JMZ and HV) monocularly at 25 positions around 30° eccentricity (abscissa: horizontal position in the visual field) close to the meridian (ordinate: vertical position) in the nasal visual field. Note that the hexagonal grid is slightly distorted (vertical distances are increased by roughly 10%) due to the flat screen geometry.
However, given the inevitable data scatter, there is a danger that the exemplary images of the orientation preference in the peripheral visual field which are shown in figure 6 may be misleading. Instead of depicting the cortical orientation map they could simply reflect a pattern which emerges by chance in the data analysis, from essentially irregular distributions of responses. Taking into account the duration of the experiment, there may be doubts about the stability of attention and fixation over a period of about two hours. This also makes it difficult to map bigger regions at the same resolution with this psychophysical method, searching for a clearer image of the vortex-like topological structure in a larger patch of the perceptual map. So the question arises how reliable such a map actually may be, beyond the first impression that it seems to meet the expectations. Thus the reproducibility of such measurements was investigated in a control experiment. A horizontal row of 7 test positions in the far nasal periphery was presented to the subjects five
times at about one hour intervals. From the 5 consecutive maps of one subject, which are shown as an example in figure 7, it is immediately clear that distortions of the data are considerable. Whereas the orientations in some test fields are well reproducible (leftmost position, for instance), in other positions they vary strongly (see switch between almost orthogonal preferred orientations between trial 2 and 3 in rightmost position, for instance). Thus it is not entirely clear whether the measured orientation specificity maps really reflect cortical orientation preferences or simply emerge from the imponderabilities of the test procedure. When the average orientation difference between consecutive trials at the same position is compared quantitatively to the average orientation difference between two positions separated by 1° in space, it turns out that these two measures are quite close for the two subjects who took part in several control experiments of this kind (32.5-39.5° and 37.8-44.3°, respectively).
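The comparison of orientation differences described above requires some care, because orientations are periodic with 180°: a bar at 10° and a bar at 170° differ by 20°, not by 160°. The following minimal sketch shows one way such a measure could be computed. It is an assumption about the procedure, not the authors' code; the function names and the example values are hypothetical and are not data from the experiment.

```python
def orientation_difference(a_deg, b_deg):
    """Smallest angle between two orientations in degrees.
    Orientations have a period of 180 deg, so the result lies in [0, 90]."""
    d = abs(a_deg - b_deg) % 180.0
    return min(d, 180.0 - d)


def mean_successive_difference(orientations):
    """Average orientation difference between successive entries.

    Applied to the preferred orientations of one test position across
    consecutive trials, this gives a measure of reproducibility; applied
    to neighbouring test positions within one trial, it gives a measure
    of spatial variation.
    """
    pairs = zip(orientations[:-1], orientations[1:])
    diffs = [orientation_difference(a, b) for a, b in pairs]
    return sum(diffs) / len(diffs)


# Hypothetical preferred orientations (deg) of one position in five trials.
example_trials = [10.0, 170.0, 80.0, 100.0, 95.0]
example_mean = mean_successive_difference(example_trials)
```

If the value obtained across trials is about as large as the value obtained across neighbouring positions, as reported above, the spatial structure of a single map cannot be distinguished from measurement scatter.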
[Figure 7 plot: one row of test positions for each of trials 1 to 5; abscissa: horizontal distance from fovea [°].]
Figure 7: Replication of orientation mapping. For a single row of test positions of the field shown in figure 6 (on the horizontal meridian around 29° eccentricity in the nasal visual field of the right eye) the experiment was repeated five times, in order to appreciate the reliability of the preferred orientation measurements (conventions as in figure 6). Whereas at some positions the preferred orientation is quite stable in this example (compare the bars in the leftmost test field for all trials), at other positions they may change considerably (see jump from trial 2 to trial 3 at rightmost test position).
In summary, from this limited set of data, the question cannot be finally settled whether a psychophysical orientation map such as depicted in figure 6 reliably reflects the cortical organisation of orientation preference in humans. Future research thus has to concentrate on the question of how psychophysical mapping can be made more reliable. It has to be considered whether the experimental setup and strategies can be optimised, and data analysis can be improved. Strongest emphasis should be placed on attempting to stabilise the visual stimulation on the retina, because involuntary eye movements could be a major source of
irregularities in the maps. Image stability could be controlled by measurements of eye movements throughout the experiments, and in advanced studies one could even try to generate fixed retinal images by elaborate optical methods (Kelly, 1979).

5. CONCLUSIONS

Detecting the orientation in the retinal image of the surroundings is a very basic task for visual systems. It is extremely important on various spatial scales, from the detection of object orientation (such as reading the hands of a watch), through outlining object contours (as is necessary for recognising specific shapes), to texture based image segmentation which requires fine grain analysis of local orientation. Thus one is not surprised that orientation sensitivity is one of the first and most general steps of visual processing in the cortex which is extensively described physiologically (Hubel & Wiesel, 1977; Orban, 1984; Bonhoeffer & Grinvald, 1993, for instance), and has attracted some theoretical attention (Braitenberg, 1985; Götz, 1988; Bauer & Dow, 1989; Malsburg, 1973; Swindale, 1991; Linsker, 1986, for instance). In the present context, we investigated psychophysically human performance for detecting the orientation of small bars, and related the geometrical constraints to cortical anatomy. It turned out that bars can be made surprisingly small while still being reliably detected in their orientation. Taking the cortical magnification into account, it can be shown that at high contrast not the physical size (i.e. the angular size in visual space), but the elongation of the stimulus in the cortical projection is limiting the performance of orientation detection. This suggests that a critical number of input lines - or 'cortical pixels' - are required to resolve the orientation of a local contour. 
Assuming a homogeneous cortical architecture, the literature values for the number of fibres in the optic nerve and for the surface area of the primary visual cortex lead to the notion that no more than about 4 input lines coming from the eye are needed to detect the local orientation. This is close to the theoretical limit of two cortical pixels which are the absolute minimum to discriminate orthogonal orientations. It is obvious that more pixels are required to achieve a higher differential orientation sensitivity. This performance limit has serious consequences for the mechanism which has to be proposed to underlie orientation detection. In particular, the data support the idea of a largely localised elementary mechanism. With large receptive fields - for instance of the Gabor type which is often suggested to describe V1 neurones appropriately (Daugman, 1985; Heitger, Rosenthaler, Heydt, Peterhans, & Kübler, 1992) - it would be difficult to detect reliably the orientation of stimuli which are small compared to the receptive field size (Zanker, in prep.). Knowing about the local character of orientation detection, we felt encouraged to put some effort into tackling psychophysically the question of the basic cortical organisation of orientation sensitivity. Although the results of these attempts to describe maps of preferred orientation are of limited significance on statistical grounds, the basic patterns of orientation preference seem to indicate map-like structures. This should be taken as a promising starting point for future work in which such maps are investigated with higher precision. Taken together, the experimental results presented here can be interpreted as supporting the ideas about cortical organisation outlined in the introduction. Such organisation would explain the electrophysiological and anatomical data on the basis of very simple inhibitory mechanisms
within the hypercolumns which would lead to the observed patterns (Braitenberg, 1985; Braitenberg & Schüz, 1991). Besides these speculations prompted immediately by our psychophysical results, in future work it has to be considered how to extend the coarse view on cortical organisation which we have had to adopt so far for reasons of simplicity. In order to characterise the elementary mechanism underlying orientation sensitivity, it would be valuable not only to test the detection of bars which are at least 45° apart in their orientation, but also to investigate orientation discrimination with smaller differences in the test bar angles. The orientation fine tuning would provide important information to analyse models with reference to their differential sensitivity. In this context, it also has to be noted that there is no general reason to restrict the observed efficiency of orientation processing to the first cortical stages, as was done here because it would be the most simple functional principle. Of course, higher processes which are not addressed by the experiments so far could be responsible for specific features of orientation detection. In a similar sense, different mechanisms in the fovea cannot be excluded, which for instance could improve orientation detection by special pooling mechanisms. Such a mechanism could be the reason for the casual observation that at very low contrasts the orientation of bars can still be seen in the fovea, although they would disappear totally in the periphery, even if presented with the same cortical size. In summary, the experiments presented here may be interpreted as a first step into a large field of experimental questions which relate orientation selectivity to cortical organisation, and therefore may turn out to be relevant for our understanding of basic mechanisms of brain function.

Acknowledgements: We want to thank A. Aertsen, A. Borst, B. Crespi, M. Fahle, C. Furlanello, and L. 
Stringa for many valuable discussions and careful reading of earlier versions of the manuscript, and C. Clifford for correcting the English.
REFERENCES

Bauer, R. & Dow, B.M. (1989). Complementary global maps for orientation coding in upper and lower layers of the monkey's foveal striate cortex. Experimental Brain Research, 76, 503-509.
Blasdel, G.G. (1992). Orientation selectivity, preference, and continuity in monkey striate cortex. The Journal of Neuroscience, 12, 3139-3161.
Blasdel, G.G. & Salama, G. (1986). Voltage-sensitive dyes reveal a modular organization in monkey striate cortex. Nature, 321, 579-585.
Blinkov, S.M. & Glezer, I.I. (1968). Das Zentralnervensystem in Zahlen und Tabellen. Jena: VEB Gustav Fischer Verlag.
Bonhoeffer, T. & Grinvald, A. (1993). The Layout of Iso-orientation Domains in Area 18 of Cat Visual Cortex: Optical Imaging Reveals a Pinwheel-like Organization. The Journal of Neuroscience, 13, 4157-4180.
Braitenberg, V. (1985). Charting the Visual Cortex. In A. Peters & E.G. Jones (Eds.), Cerebral Cortex (pp. 379-414). Plenum Publishing Corporation.
Braitenberg, V. (1992). How Ideas Survive Evidence to the Contrary: A Comment on Data Display and Modelling. In A. Aertsen & V. Braitenberg (Eds.), Information Processing in the Cortex (pp. 447-450). Berlin: Springer.
Braitenberg, V. & Braitenberg, C. (1979). Geometry of Orientation Columns in the Visual Cortex. Biological Cybernetics, 33, 179-186.
Braitenberg, V. & Schüz, A. (1991). Anatomy of the Cortex. Statistics and Geometry. Berlin: Springer Verlag.
Camarda, R.M., Peterhans, E., & Bishop, P.O. (1985a). Spatial organization of subregions in receptive fields of simple cells in cat striate cortex as revealed by stationary flashing bars and moving edges. Experimental Brain Research, 60, 136-150.
Camarda, R.M., Peterhans, E., & Bishop, P.O. (1985b). Simple cells in cat striate cortex: responses to stationary flashing and to moving light bars. Experimental Brain Research, 60, 151-158.
Cowey, A. & Rolls, E.T. (1974). Human Cortical Magnification Factor and its Relation to Visual Acuity. Experimental Brain Research, 21, 447-454.
Daniel, P.M. & Whitteridge, D. (1961). The representation of the visual field on the cerebral cortex in monkeys. Journal of Physiology, 159, 203-221.
Daugman, J.G. (1985). Uncertainty relation for resolution in space, spatial frequency, and orientation optimized by two-dimensional visual cortical filters. Journal of the Optical Society of America, A 2, 1160-1169.
Fox, P.T., Miezin, F.M., Allman, J.M., Essen, D.C.v., & Raichle, M.E. (1987). Retinotopic organization of human visual cortex mapped with positron-emission tomography. The Journal of Neuroscience, 7, 913-922.
Götz, K.G. (1987). Do "d-blob" and "l-blob" Hypercolumns Tesselate the Monkey Visual Cortex? Biological Cybernetics, 56, 107-109.
Götz, K.G. (1988). Cortical templates for the self-organization of orientation-specific d- and l-hypercolumns in monkey and cats. Biological Cybernetics, 58, 213-223.
Heitger, F., Rosenthaler, L., Heydt, R.v., Peterhans, E., & Kübler, O. (1992). Simulation of neural contour mechanisms: From simple to end-stopped cells. Vision Research, 32, 963-981.
Horton, J.C. & Hoyt, W.F. (1991). The Representation of the Visual Field in Human Striate Cortex. Archives of Ophthalmology, 109, 816-824.
Hubel, D.H. & Livingstone, M.S. (1987). Segregation of form, color, and stereopsis in primate area 18. The Journal of Neuroscience, 7, 3378-3415.
Hubel, D.H. & Wiesel, T.N. (1977). Functional architecture of macaque monkey visual cortex. Proceedings of the Royal Society London, B 198, 1-59.
Kelly, D.H. (1979). Motion and vision. I. Stabilized images of stationary gratings. Journal of the Optical Society of America, 69, 1266-1274.
Linsker, R. (1986). From basic network principles to neural architecture: Emergence of orientation columns. Proceedings of the National Academy of Sciences USA, 83, 8779-8783.
Livingstone, M.S. & Hubel, D.H. (1988). Segregation of Form, Color, Movement, and Depth: Anatomy, Physiology, and Perception. Science, 240, 740-749.
Mallot, H.A. (1985). An Overall Description of Retinotopic Mapping in the Cat's Visual Cortex Areas 17, 18, and 19. Biological Cybernetics, 52, 45-51.
Malsburg, C.v. (1973). Self-Organization of Orientation Sensitive Cells in the Striate Cortex. Kybernetik, 14, 85-100.
Orban, G.A. (1984). Neuronal Operations in the Visual Cortex. Berlin: Springer Verlag.
Pointer, J.S. (1986). The cortical magnification factor and photopic vision. Brain Research, 61, 97-119.
Ratliff, F. & Riggs, L.A. (1950). Involuntary motions of the eye during monocular fixation. Journal of Experimental Psychology, 40, 687-701.
Rovamo, J. & Virsu, V. (1979). An Estimation and Application of the Human Cortical Magnification Factor. Experimental Brain Research, 37, 495-510.
Rovamo, J., Virsu, V., & Näsänen, R. (1978). Cortical magnification factor predicts the photopic contrast sensitivity of peripheral vision. Nature, 271, 54-56.
Schiller, P.H., Finlay, B.L., & Volman, S.F. (1976). Quantitative Studies of Single-Cell Properties in the Monkey Striate Cortex. I. Spatiotemporal Organization of Receptive Fields. Journal of Neurophysiology, 39, 1288-1319.
Seelen, W.v. (1970). Zur Informationsverarbeitung im visuellen System der Wirbeltiere. Kybernetik, 1, 89-106.
Swindale, N.V. (1991). Coverage and the design of striate cortex. Biological Cybernetics, 65, 415-424.
Tootell, R.B., Switkes, E., Silverman, M.S., & Hamilton, S.L. (1988). Functional Anatomy of Macaque Striate Cortex. II. Retinotopic Organization. The Journal of Neuroscience, 8, 1531-1568.
Tso, D.Y., Frostig, R.D., Lieke, E.E., & Grinvald, A. (1990). Functional Organization of Primate Visual Cortex Revealed by High Resolution Optical Imaging. Science, 249, 417-420.
Tynan, P. & Sekuler, R. (1982). Motion processing in peripheral vision: reaction time and perceived velocity. Vision Research, 22, 61-68.
Wässle, H., Grünert, U., Röhrenbeck, J., & Boycott, B.B. (1989). Cortical magnification factor and the ganglion cell density of the primate retina. Nature, 341, 643-646.
Brain Theory - Biological Basis and Computational Principles A. Aertsen and V. Braitenberg (Editors) 1996 Elsevier Science B.V.
Multiple Parietal Representations of Space

Carol L. Colby, Jean-René Duhamel and Michael E. Goldberg

Laboratory of Sensorimotor Research, National Eye Institute, Building 49, Room 2AS0, National Institutes of Health, Bethesda, MD 20892

1. Introduction

We live in an ever-changing sensory world. As we move our eyes and move through the environment, new images are continuously presented to the brain. Given such constantly changing input, it is remarkable how easily we are able to keep track of where things are. We can reach for an object, or look at it, or even kick it without making a conscious effort to assess its location in space. The traditional view of spatial perception, strongly supported by subjective experience, is that we "know where things are" in some absolute, world-based frame of reference and use this spatial information to guide our movements. In this standard and intuitively plausible view, spatial perception is a monolithic process: the brain forms a single spatial representation of each object regardless of what action is going to be performed in relation to that object. A new and somewhat counterintuitive view is that the brain represents the spatial location of an object many times over in different cortical areas and each representation is suited to certain kinds of behavioral responses. Neurophysiological research carried out in monkeys indicates that in areas responsible for controlling head movements, visually sensitive neurons encode the location of an object relative to the head (21,6,8). In contrast, areas involved in limb movement have neurons which encode object location relative to limb position (17) and, in areas controlling eye movements, visually sensitive neurons encode the location of an object relative to the center of gaze (16,15,13). Two such areas have been discovered in the parietal cortex of rhesus monkeys (6). Neurons in each area are selective for particular stimulus dimensions and particular regions of space. 
They signal not only where an object is but how to act on it. The representation of space in a given area reflects a particular motor output by which a stimulus can be acquired or avoided. These results suggest that parietal cortex contains multiple action-based spatial representations. In the scheme emerging from these studies, spatial perception is a modular process. A single object may be multiply represented in terms of the actions that can be performed on it. While this new view seems implausible at first glance, it is exactly analogous to current views of visual perception. When we see a large red bouncing ball, we perceive a single object, even though its size, shape,
color and direction of motion are analyzed separately. Likewise, beneath the apparent unity of subjective spatial experience lies a diversity of spatial representations, each with specific knowledge of how to act on an object.

2. Parietal Visual Areas

Posterior parietal cortex is divided into a number of separate areas. Unlike lower level extrastriate visual areas, these parietal areas do not typically contain simple retinotopic maps and their borders cannot be defined with reference to vertical or horizontal meridian representations. Parietal areas have instead been initially identified on the basis of their connections with other cortical areas (1,2,9,19). Connections alone, however, are not sufficient to define an area. For instance, both the lateral and the ventral intraparietal areas (LIP and VIP) receive strong projections from area MT, in the superior temporal sulcus (23). The most reliable guide to areal boundaries in parietal cortex is the response properties of the neurons. Because many features of parietal neurons are observable only in alert animals (20,22,4,5) we have done a behavioral mapping of parietal cortex in which neural activity is examined in relation to a large set of tasks. Our standard protocol includes tests for visual and somatosensory responsiveness, attentional modulation of these responses, and for oculomotor and somatomotor activity. Previous efforts to map parietal cortex have also used alert animals and shown clear evidence for a regional distribution of distinct functional cell types (18). In the current experiments, we controlled the monkey's behavior by rewarding the animal for fixation or eye movements or selective attention to particular stimuli. A second advance over previous mapping experiments is the use of a recording grid which allows us to record from identified locations repeatedly and allows precise reconstruction of the location of recording sites (10). This reconstruction is especially important because recordings are carried out over a long period of time and because these mapping experiments have focused on cortex buried in the intraparietal sulcus. Accurate reconstruction of recording sites is critical for establishing the location of borders between physiologically defined areas.

[Figure 1 legend: SACCADE / TONIC / FOVEAL / FIXATION / ORIENTATION; VISUAL / NOT DIRECTION SELECTIVE; PASSIVE SOMATOSENSORY; ACTIVE SOMATOSENSORY / REACH; VISUAL / DIRECTION SELECTIVE. Other label: AREA 7.]

Figure 1. Distribution of neuronal response properties in rhesus monkey intraparietal sulcus. Each column represents a single 10 mm penetration along the lateral or medial bank of the sulcus. Two rows of penetrations spaced 1 mm apart are shown for each bank. The anterior part of the sulcus is shown at the top of the figure. The posterior portion of the sulcus is shown at the bottom of the figure, where the banks of the sulcus have been separated.

An overall view of the functional organization of intraparietal sulcus is presented in Figure 1. Of the many different kinds of cells observed, each has a restricted distribution within the sulcus. In moving from one region of the sulcus to the next, the probability of encountering any particular cell type changes systematically. In some regions of the sulcus, such as that near the fundus, the borders between areas are sharply defined by sudden changes in the predominant cell type.
In other regions, such as on the medial bank, the boundaries are apparently less sharp and there is a more gradual shift in the response characteristics of the cells. Two areas in the intraparietal sulcus, LIP and VIP, have been defined physiologically. The following sections will describe response properties of neurons in each area and how they contribute to a different spatial representation.

3. Spatial Representation in Area LIP

Neurons in the lateral intraparietal area are active in relation to both visual and oculomotor events (16,7,14). They discharge when a visual stimulus appears in the receptive field and, for about half the population, discharge again when the monkey executes a saccade to the location cued by the stimulus (Fig. 2). The strength of the visual response is modulated by the behavioral set induced by the task. When the monkey must attend to the stimulus, the amplitude of the on-response is enhanced, compared to the response seen when the stimulus is irrelevant for the monkey's behavior. In addition to these visual, attentional and motor signals, many LIP neurons also carry a memory signal (14,6). These tonically active neurons continue to respond to a visual stimulus in the remembered saccade task during the interval between the appearance of the stimulus and the onset of the saccade. There is no visual stimulus present in the receptive field during this interval. In order to perform the task accurately, the monkey must retain an image of the stimulus location during the delay interval. Tonic activity during the delay reflects a memory trace of stimulus location.

These results on multiple sources of activation in LIP underscore the importance of studying neurons in different behavioral tasks. If we had used only a saccade task we might have concluded that the neurons were driving eye movements. But they cannot just be guiding eye movements because they are active in tasks in which saccades are either irrelevant or forbidden. Conversely, if we had used only a fixation task we might have concluded that the neurons were simply visual. But they cannot be performing a purely visual analysis because they consistently respond in circumstances in which there is no stimulus. The results from multiple tasks indicate that the responses of LIP neurons do not depend exclusively on either vision or movement. The single point of intersection of the various activations observed is the receptive field itself.
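
Separating the visual, memory and motor components of a discharge amounts to comparing firing rates in different task epochs. The following Python sketch illustrates one way this could be done for a single remembered-saccade trial. The function names, window lengths and spike times are illustrative assumptions and are not taken from the experiments described here.

```python
from bisect import bisect_left


def rate_in_window(spike_times, start, end):
    """Mean firing rate (spikes/s) in the half-open window [start, end), in seconds."""
    if end <= start:
        raise ValueError("window must have positive duration")
    spikes = sorted(spike_times)
    count = bisect_left(spikes, end) - bisect_left(spikes, start)
    return count / (end - start)


def epoch_rates(spike_times, stim_on, saccade_on, visual_win=0.2, motor_win=0.1):
    """Firing rate in three epochs of one remembered-saccade trial.

    visual: stim_on to stim_on + visual_win (stimulus-evoked burst)
    memory: end of the visual window to the start of the motor window
            (delay period, no stimulus in the receptive field)
    motor:  the last motor_win seconds before saccade onset
    The window lengths are hypothetical defaults, not published values.
    """
    visual_end = stim_on + visual_win
    motor_start = saccade_on - motor_win
    if motor_start <= visual_end:
        raise ValueError("delay is too short for these windows")
    return {
        "visual": rate_in_window(spike_times, stim_on, visual_end),
        "memory": rate_in_window(spike_times, visual_end, motor_start),
        "motor": rate_in_window(spike_times, motor_start, saccade_on),
    }


# Made-up trial: stimulus appears at 0.0 s, saccade begins at 1.0 s.
trial = [0.07, 0.09, 0.11, 0.15, 0.40, 0.60, 0.80, 0.93, 0.95, 0.97, 0.99]
rates = epoch_rates(trial, stim_on=0.0, saccade_on=1.0)
```

For this invented trial the visual and motor epochs each contain four spikes and the delay contains three, so the neuron would show separate visual and presaccadic bursts with lower tonic activity in between. A real analysis would average over many trials and compare each epoch against a baseline.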
[Figure 2 panels: cartoons showing FP and RF locations, time lines for vertical eye position (V) and the stimulus, and raster displays with histograms aligned on Stimulus Appearance and on Saccade Onset. Unit 64151-14.]

Figure 2. Response of an LIP neuron in a remembered saccade task. The cartoon above each diagram shows the relative locations of the fixation point (FP) and the receptive field (RF). The time lines show vertical eye position (V) and the onset and offset of a stimulus in the RF. Each tick mark in the raster diagram signifies a single action potential. Successive trials are shown on successive lines, synchronized (vertical line) on the event indicated below the histogram. The calibration bar at left signifies a response rate of 100 spikes per second. In this task, the monkey must fixate while a stimulus is briefly presented in the RF. After a variable delay, the fixation point is extinguished and the monkey saccades to the location where the stimulus appeared. Separate visual and motor bursts are seen in each trial, as well as tonic activity during the memory period.
We hypothesize that LIP neurons encode spatial locations. Further, their activity is modulated by attention to a spatial locus that is defined not by a stimulus or by a movement, but by the spatial vector that could describe either. In essence, the activity of an LIP neuron encodes an attended spatial location. If LIP is encoding space, rather than visual perception or specific behaviors, it is necessary to understand the coordinate system in which it operates. There are three plausible coordinate frames for representing stimulus position. For a neuron operating in retinal coordinates, neural activity signals where the stimulus is on the retina. In a head-centered coordinate frame, neural activity signals where the stimulus is relative to the head, regardless of where the eyes are looking. In oculocentric coordinates, neural activity signals the saccade necessary to foveate the stimulus.

We have been able to discriminate among these three possibilities by observing what happens to memory-related activity in LIP when the monkey makes a saccade (13). Every time an eye movement occurs, the projection of the visual world changes on the retina and, by implication, in all the retinotopically mapped areas of the brain. If it is to be useful, visual information currently being processed in LIP must be remapped in conjunction with each saccade. We have discovered that LIP neurons remap the memory trace of a previous stimulus event (Fig. 3). While the monkey fixates, a stimulus is briefly presented (50 msec) at a location well outside the receptive field of the neuron (Fig. 3B). A new fixation point appears, and the monkey makes a saccade to it. Because the stimulus flash is so brief, the stimulus is no longer present at the time of the saccade. The effect of the saccade is to bring the receptive field to the location that was previously stimulated.
If the neuron had continued to encode events at the original receptive field location (i.e., in head-centered coordinates), it would remain silent after the saccade because there was never any stimulus at that location. Likewise, if the neuron had access only to retinal information (retinal coordinates) it would also remain silent, since no stimulus appeared in its retinal receptive field. The results show instead that LIP neurons do respond when the receptive field is brought to land on a previously stimulated location (Fig. 3B). There is no stimulus present on the screen, so the neuron can only be responding to a memory trace of the stimulus. Control experiments confirm that neither the stimulus alone (Fig. 3C) nor the saccade alone (Fig. 3D) can drive the neuron. We conclude that the neuron is responding to a remapped memory trace of the stimulus which is encoded in oculocentric coordinates. The specific spatial problem which LIP must solve is how to signal a spatial location with reference to the current fovea. By remapping memory traces, parietal cortex constructs a spatial representation that encodes stimulus location in terms of distance and direction from the current center of gaze. While neurons in LIP have retinotopic receptive fields, visual information in LIP is dynamically updated in conjunction with eye movements to produce an oculocentric representation: stimuli are coded in terms of their distance and direction from the fovea. LIP neurons maintain an oculocentric representation of target position by using a corollary discharge from the eye movement command to update retinotopic visual information. This remapping of stimulus location serves to maintain an alignment between the external world and the internal representation of it. Remapping also provides the oculomotor system with continuously accurate information about the vector of the saccadic eye movement necessary to acquire the stimulated location. 
Remapping is the means by which a coordinate transformation is effected from retinotopic to oculocentric coordinates.
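
The logic of the memory trace experiment can be restated as simple vector arithmetic. The Python sketch below is a minimal illustration, assuming two-dimensional screen positions in degrees, a fixed head, a circular receptive field and hypothetical function names; it is not the analysis used in the experiments. It asks whether each of three candidate coding schemes predicts a response after the saccade from the first fixation point to the second.

```python
def sub(a, b):
    """Component-wise difference of two 2-D vectors (degrees)."""
    return (a[0] - b[0], a[1] - b[1])


def add(a, b):
    """Component-wise sum of two 2-D vectors (degrees)."""
    return (a[0] + b[0], a[1] + b[1])


def in_field(point, center, radius=5.0):
    """True if point lies within a circular field; the radius is an assumed value."""
    dx, dy = sub(point, center)
    return dx * dx + dy * dy <= radius * radius


def predictions(stim, fp1, fp2, rf_offset, stim_on_after_saccade=False):
    """Predicted post-saccadic response under three coding schemes.

    stim, fp1, fp2: screen positions of the flashed stimulus and the two
    fixation points. rf_offset: receptive field center relative to the fovea.
    """
    saccade = sub(fp2, fp1)
    # Retinal: responds only if a stimulus is physically on the retinal RF.
    retinal = stim_on_after_saccade and in_field(sub(stim, fp2), rf_offset)
    # Head-centered: the field stays at the screen location it covered
    # during the first fixation (head fixed, so screen = head coordinates).
    head_centered = in_field(stim, add(fp1, rf_offset))
    # Oculocentric: the stored trace, expressed relative to the old fovea,
    # is shifted by the saccade vector (the corollary discharge).
    trace = sub(sub(stim, fp1), saccade)
    oculocentric = in_field(trace, rf_offset)
    return {"retinal": retinal, "head_centered": head_centered,
            "oculocentric": oculocentric}


# Made-up geometry: RF 10 deg above the fovea, 20 deg rightward saccade,
# stimulus flashed where the RF will land after the saccade.
result = predictions(stim=(20.0, 10.0), fp1=(0.0, 0.0), fp2=(20.0, 0.0),
                     rf_offset=(0.0, 10.0))
```

With this geometry only the oculocentric scheme with remapping predicts a response, which is the pattern reported for the memory trace condition; the retinal and head-centered schemes both predict silence.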
[Figure 3 panels: A. Stimulus in Receptive Field, No Saccade. B. Saccade to FP2, Stim flashed for 50 msec. C. Stimulus Outside Receptive Field, No Saccade. D. Saccade to FP2, No Stimulus. Rasters and histograms aligned on Stimulus Onset, Beginning of Saccade, or Saccade to FP2. Unit 64151.]

Figure 3. LIP neuron response to the memory trace of a stimulus. A. Fixation condition: neuron responds to a stimulus in the RF. B. Memory trace condition: the stimulus is flashed outside the RF and is gone before the saccade to FP2 begins. C. Stimulus control condition: a stimulus presented outside the RF does not drive the cell in the absence of a saccade. D. Saccade control condition: the saccade alone does not drive the cell in the absence of a stimulus.
In a further set of experiments, we found that the intention to make a saccade is itself sufficient to shift the receptive field of an LIP neuron, that is, LIP neurons can predict the sensory consequences of an impending saccade. They respond to a stimulus that will be brought into the receptive field as if the stimulus were already present in it. This remapping occurs with every saccade, whether or not the stimulus will be the target of a later saccade. An example of this phenomenon is shown in Figure 4. In the fixation task (Fig. 4A) this LIP neuron responds to the appearance of a stimulus in its receptive field with a latency of 70 ms. When the stimulus is presented outside of the receptive field, and the monkey is required to make the specific saccade that will bring the stimulus into the receptive field, the neuron begins to respond even before the saccade is initiated (Fig. 4B). Control experiments confirm that this activity is a predictive visual response. The first control condition shows that the stimulus is in fact outside the receptive field when the monkey looks at the original fixation point (Fig. 4C). The second control condition shows that the saccade to the new fixation point is not associated with neural activity by itself (Fig. 4D). Many LIP neurons with predictive responses, like the one illustrated here, have visual responses but no saccade-related activity in the remembered saccade task, indicating that the predictive response must be visual and not related to motor planning. In predictive remapping, the area of retina that is capable of stimulating the cell transiently shifts, so that the cell responds to stimuli that will be in its receptive field after the saccade. This phenomenon enables accurate localization of visual objects without the processing delay inherent in relying on reafferent information following a saccade. In summary, neurons in LIP encode events at specific spatial locations.
Their activity is not uniquely related to either sensory or motor events. Rather, they signal the location at which an event occurred. The spatial coordinate frame used by LIP neurons is oculocentric: locations are specified in terms of their distance and direction from the fovea. This representation is of prime usefulness for the oculomotor system, which must program movements not to a target in absolute space but relative to the current center of gaze.

[Figure 4 panels: A. Stimulus in Receptive Field, No Saccade. B. Saccade to FP2 Brings Stimulus Into RF. C. Stimulus Outside Receptive Field, No Saccade. D. Saccade to FP2, No Stimulus. Rasters and histograms aligned on Stimulus Appearance, Saccade Onset, or Onset of FP2. Scale bar: 100 ms. Unit 64156.]

Figure 4. Predictive remapping. A. Fixation condition: LIP neuron responds to a stimulus in the RF. B. Saccadic remapping condition: while the monkey fixates, a stimulus appears outside the RF as well as a new fixation point (FP2) to which the monkey must saccade. The neuron responds to the stimulus even before the saccade begins. C. Stimulus control condition: a stimulus presented outside the RF does not drive the neuron. D. Saccade control condition: the saccade alone does not drive the neuron.

Because covertly attended stimuli evoke the same response from LIP neurons as do the targets for saccades, we asked whether attention alone was sufficient to remap the internal representation of a stimulus. We have shown above that the intention to make a particular eye movement can transiently shift the receptive field of an LIP neuron and can remap the internal representation of a previous stimulus. Can a movement of attention alone accomplish the same thing? We tested this using a variant of the peripheral attention task in which the monkey attends to a peripheral "fixation point" without looking at it (Fig. 5). A stimulus is presented on the screen in a location that would be in the receptive field if the monkey were permitted to look at the new fixation point. If shifting attention to the new fixation point is equivalent to intending to move the eyes there, then the neuron should respond to the onset of the stimulus. Since the neuron does not respond, we conclude that a shift of attention alone cannot induce a shift in the LIP spatial representation the way that an intended eye movement does. This failure to shift the LIP representation in conjunction with an attentional shift suggests that the function of remapping is to maintain an accurate alignment between the visual world and its internal representation. With an attentional shift alone, nothing moves on the retina and there is no need to remap the internal representation. When a saccade is about to occur, however, LIP can make use of information about the intended eye movement to anticipate the retinal consequences of that saccade and update the stored representation of object locations.

[Figure 5 panel: Attention at FP2, Stimulus Outside Receptive Field, No Saccade. Raster and histogram aligned on Stimulus Onset.]

Figure 5. Attention shift experiment. The monkey foveates FP but must attend to FP2 in order to detect a slight dimming of the light and release a bar. The neuron does not respond to a stimulus presented outside the RF, as it did in the memory trace and predictive remapping experiments.

In parietal cortex, a distinction is made between attention and intention. Visual responses of neurons in LIP are modulated both by overt movements of the eyes and by covert shifts of attention. Quite different purposes are served by sensitivity to intended eye movements and to attentional shifts. Response modulation by attentional state permits enhanced processing of images within the focus of attention. In contrast, response modulation by intended eye movements makes it possible to maintain perceived spatial constancy of the visual world as images are displaced on the retina. Two mechanisms contribute to spatial constancy. First, LIP neurons respond to the memory
trace of a visual stimulus when an eye movement brings the spatial location of that stimulus into the receptive field. This memory trace response indicates that the LIP representation of the visual world is shifted in conjunction with eye movements. Second, some LIP neurons accomplish this shift in anticipation of the actual eye movement. This anticipatory shift may reflect the attentional shift that normally precedes eye movements. An attentional shift alone, however, cannot produce a shift in the stored representation. Only when an intended eye movement is about to occur do we see evidence for a remapped representation. These results suggest that while eye movements and attention normally coincide, the underlying neural mechanisms are distinct and subserve different cognitive functions. Further, the neurophysiological distinction between attention and intention indicates that these are separate cognitive processes.

4. Spatial Representation in Area VIP

The ventral intraparietal area is located in the fundus of the intraparietal sulcus and has been defined on the basis of its distinctive visual response properties (8).
[Figure 6 panels: responses at stimulus distances labeled 10 cm, 5 cm and 57 cm. Unit label ending in 07.]

Figure 6. Distance selectivity in a VIP neuron. Each panel shows the response to a stimulus presented at a different distance from the monkey. Stimuli were equated for size and luminance. In all conditions the monkey had to maintain fixation on a point on the tangent screen at 57 cm.
[Figure 7 panels: stimulus trajectories labeled From Above and Straight Toward, with raster displays and histograms.]

Figure 7. Trajectory selectivity in a VIP neuron. Top row: stimuli are moved toward the monkey's brow while the monkey fixates a central FP on the tangent screen. Bottom row: the same stimuli are moved toward the monkey's chin, evoking a much larger response. The projected point of contact of the stimulus is more strongly related to response rate than either the absolute direction of motion (straight toward vs. down and toward) or the portion of the visual field stimulated (upper vs. lower).
[Figure 8 panels: Toward Forehead from Above with Eyes Up; Toward Chin from Above with Eyes Up; Toward Chin with Eyes Down. Scale bar: 200 ms. Unit 64-062-07.]

Figure 8. Head-centered spatial coordinates in VIP. Changes in eye position do not change selectivity for stimuli moving towards the chin.
Cortex dorsal to VIP in the anterior portion of the medial bank is purely somatosensory with an emphasis on hand representation (Fig. 1). Near the fundus of the sulcus, there is a sudden transition to a region of strong visual responsiveness. This visual area extends from the medial bank across the fundus to the lateral bank. VIP neurons are well driven by moving visual stimuli and most are selective for direction of stimulus motion. Other properties similar to those found in areas MT and MST, such as speed tuning and responsiveness to whole-field motion, are also observed, consistent with its inputs from these areas (23,3). These visual response properties set VIP apart from the surrounding cortex.

Two special types of VIP neurons are of interest with regard to spatial representation. The first is the ultranear cells. These visual neurons respond only to stimuli presented very close to the animal, within a few centimeters of the face (Fig. 6). These neurons may signal the presence of a stimulus that can be acquired by reaching with the mouth. The second special type is trajectory neurons. These cells respond selectively to a stimulus moving towards or away from the animal. For these neurons, the absolute direction of stimulus motion is less important than the anticipated point of contact of the stimulus. In the example shown, a stimulus moved toward the chin elicited a much stronger response than the same stimulus moved toward the brow (Fig. 7). This result suggests that something other than a simple retinal coordinate frame is used to represent space in VIP. This suggestion was confirmed by having the monkey change its gaze direction (Fig. 8). The neuron continued to respond best to a stimulus moving towards the chin regardless of eye position. This insensitivity to eye position indicates that the stimulus is encoded in head-centered coordinates and not retinal coordinates.
A surprising feature of VIP is that most neurons can be independently driven by somatosensory stimulation (11,12). The somatosensory receptive fields are found primarily on the face and head. The visual and somatosensory receptive fields correspond to one another in location, in size and in directional selectivity, as illustrated in Figure 9. This neuron responds both to a peripheral visual stimulus moved toward the fovea and to a cutaneous stimulus moved across the face toward the mouth. VIP neurons with visual receptive fields in the upper hemifield have somatosensory receptive fields on the upper part of the face and brow, while neurons with lower field visual receptive fields have somatosensory receptive fields on the lower part of the face. Strikingly, VIP neurons with foveal visual receptive fields have somatosensory receptive fields around the mouth, as though the mouth were the "fovea" of the facial somatosensory system. For bimodal trajectory sensitive neurons, the visual response is tied to the location of the somatosensory receptive field. Finally, some VIP neurons with very large visual receptive fields have somatosensory receptive fields that include the hand and arm as well as the head.

These findings on VIP response properties are consistent with a spatial representation in head-centered coordinates. Both the ultranear and the trajectory sensitive neurons appear to encode stimulus location in a head-centered coordinate frame. The bimodal neurons may have a special role in hand, eye and mouth coordination. Overall, visual targets may be coded in VIP in terms of how they can be acquired by reaching with the head and mouth.
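
The reasoning behind the eye position test for trajectory neurons can be illustrated with a small geometric sketch in Python. It assumes a simplified two-dimensional geometry (distance in front of the face, height relative to eye level), an eye located in the plane of the face, made-up coordinates and hypothetical function names. The projected point of contact is computed in head-centered coordinates and involves no gaze term, whereas the retinal elevation of the same stimulus changes with eye position.

```python
import math


def contact_height(start, velocity):
    """Height (cm, relative to eye level) at which a straight path meets the face.

    start: (x, y) with x the distance in front of the face plane (x = 0)
    and y the height. velocity: (vx, vy) in cm/s, with vx < 0 for approach.
    """
    x, y = start
    vx, vy = velocity
    if vx >= 0:
        raise ValueError("stimulus is not approaching the face")
    time_to_contact = -x / vx
    return y + vy * time_to_contact


def retinal_elevation(start, gaze_elevation_deg):
    """Elevation of the stimulus relative to the line of sight, in degrees.

    The eye is placed at the origin, which is a simplifying assumption.
    """
    x, y = start
    return math.degrees(math.atan2(y, x)) - gaze_elevation_deg


# Made-up stimulus: starts 30 cm in front of the face, 10 cm above eye level.
start = (30.0, 10.0)
toward_chin = (-30.0, -20.0)   # reaches the face 10 cm below eye level
toward_brow = (-30.0, -5.0)    # reaches the face 5 cm above eye level

chin_contact = contact_height(start, toward_chin)
brow_contact = contact_height(start, toward_brow)
elevation_eyes_up = retinal_elevation(start, 15.0)
elevation_eyes_down = retinal_elevation(start, -15.0)
```

In this sketch the two trajectories differ in their head-centered point of contact whatever the gaze direction, while the retinal elevation of the starting position differs by 30 degrees between the two eye positions. A neuron whose selectivity follows the point of contact rather than the retinal position would therefore behave like the cell in Figure 8.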
[Figure 9 panels: top, visual RF location with raster displays and histograms; bottom, somatosensory RF location with raster displays and histograms.]

Figure 9. Bimodal sensory responsiveness in a VIP neuron. Top panel shows the location of the visual RF and a directionally selective response to a stimulus moved through the RF. Bottom panel shows the location of the somatosensory RF for the same single neuron and a directionally selective response to a somatosensory stimulus moved across the RF (tested with eyes closed).
5. Conclusions

Parietal cortex contains multiple spatial representations, two of which have been described here. Neurons in area LIP encode stimulus location in oculocentric coordinates, while some VIP neurons encode stimulus location relative to the head. These multiple representations are presumably tailored for guiding specific kinds of actions, namely eye movements and head movements. The function of parietal cortex is to signal the location of attended objects. It does so in order to allow the perceiver to act on its environment, and different kinds of actions are likely to be supported by different spatial representations.
References

1. Andersen, R. A., Asanuma, C., and Cowan, M. Callosal and prefrontal associational projecting cell populations in area 7a of the macaque monkey: a study using retrogradely transported fluorescent dyes. J. Comp. Neurol. 232: 443-455, 1985.
2. Andersen, R. A., Asanuma, C., Essick, G., and Siegel, R. M. Corticocortical connections of anatomically and physiologically defined subdivisions within the inferior parietal lobule. J. Comp. Neurol. 296: 65-113, 1990.
3. Boussaoud, D., Ungerleider, L. G., and Desimone, R. Pathways for motion analysis: Cortical connections of the medial superior temporal and fundus of the superior temporal visual areas in the macaque. J. Comp. Neurol. 296: 462-495, 1990.
4. Bushnell, M. C., Goldberg, M. E., and Robinson, D. L. Behavioral enhancement of visual responses in monkey cerebral cortex: I. Modulation in posterior parietal cortex related to selective visual attention. J. Neurophysiol. 46: 755-772, 1981.
5. Colby, C. L. The neuroanatomy and neurophysiology of attention. J. Child Neurol. 6: S88-118, 1991.
6. Colby, C. L. and Duhamel, J.-R. Heterogeneity of extrastriate visual areas and multiple parietal areas in the macaque monkey. Neuropsychologia 29: 497-515, 1991.
7. Colby, C. L., Duhamel, J.-R., and Goldberg, M. E. The analysis of visual space by the lateral intraparietal area of the monkey: the role of extraretinal signals. In: Progress in Brain Research, Vol. 95, edited by T. P. Hicks, S. Molotchnikoff, T. Ono, 1993, pp. 307-316.
8. Colby, C. L., Duhamel, J.-R., and Goldberg, M. E. Ventral intraparietal area of the macaque: Anatomic location and visual response properties. J. Neurophysiol. 69: 902-914, 1993.
9. Colby, C. L., Gattass, R., Olson, C. R., and Gross, C. G. Topographic organization of cortical afferents to extrastriate visual area PO in the macaque: a dual tracer study. J. Comp. Neurol. 238: 1257-1299, 1988.
10. Crist, C. F., Yamasaki, D. S. G., Komatsu, H., and Wurtz, R. H. A grid system and a microsyringe for single cell recording. J. Neurosci. Methods 26: 117-122, 1988.
11. Duhamel, J.-R., Colby, C. L., and Goldberg, M. E. Congruent visual and somatosensory response properties of neurons in the ventral intraparietal area (VIP) in the alert monkey. Soc. Neurosci. Abstr. 15: 162, 1989.
12. Duhamel, J.-R., Colby, C. L., and Goldberg, M. E. Congruent representations of visual and somatosensory space in single neurons of monkey ventral intraparietal cortex (area VIP). In: Brain and Space, edited by J. Paillard, Oxford: Oxford University Press, 1991, pp. 223-236.
13. Duhamel, J.-R., Colby, C. L., and Goldberg, M. E. The updating of the representation of visual space in parietal cortex by intended eye movements. Science 255: 90-92, 1992.
14. Gnadt, J. W. and Andersen, R. A. Memory related motor planning activity in posterior parietal cortex of macaque. Exp. Brain Res. 70: 216-220, 1988.
15. Goldberg, M. E. and Bruce, C. J. Primate frontal eye fields. III. Maintenance of a spatially accurate saccade signal. J. Neurophysiol. 64: 489-508, 1990.
16. Goldberg, M. E., Colby, C. L., and Duhamel, J.-R. The representation of visuomotor space in the parietal lobe of the monkey. Cold Spring Harbor Symp. Quant. Biol. 55: 729-739, 1990.
17. Graziano, M. S., Yap, G. S., and Gross, C. G. Coding of visual space by premotor areas. Science 266: 1054-7, 1994.
18. Hyvarinen, J. Regional distribution of functions in parietal association area 7 of the monkey. Brain Res. 206: 287-303, 1981.
19. Maunsell, J. H. R. and Van Essen, D. C. The connections of the middle temporal visual area (MT) and their relationship to a cortical hierarchy in the macaque monkey. J. Neurosci. 3: 2563-2586, 1983.
20. Mountcastle, V. B., Lynch, J. C., Georgopoulos, A., Sakata, H., and Acuña, C. Posterior parietal association cortex of the monkey: command functions for operations within extrapersonal space. J. Neurophysiol. 38: 871-908, 1975.
21. Rizzolatti, G., Gentilucci, M., Luppino, L., Matelli, M., and Ponzoni-Maggi, S. Neurons related to goal-directed motor acts in inferior area 6 of the macaque monkey. Exp. Brain Res. 67: 220-224, 1987.
22. Robinson, D. L., Goldberg, M. E., and Stanton, G. B. Parietal association cortex in the primate: Sensory mechanisms and behavioral modulations. J. Neurophysiol. 41: 910-932, 1978.
23. Ungerleider, L. G. and Desimone, R. Cortical connections of visual area MT in the macaque. J. Comp. Neurol. 248: 190-222, 1986.
Brain Theory - Biological Basis and Computational Principles A. Aertsen and V. Braitenberg (Editors) © 1996 Elsevier Science B.V. All rights reserved.
NEURAL MECHANISM OF FIGURE-GROUND SEGREGATION AT OCCLUDING CONTOURS IN MONKEY PRESTRIATE CORTEX
R. Baumann, X. M. Sauvan, and E. Peterhans*
Department of Neurology, University Hospital Zurich, Frauenklinikstr. 26, CH-8091 Zurich, Switzerland
1. ABSTRACT
An important aspect of figure-ground segregation is the detection of occluding contours and the discrimination of figure and ground at such contours. In this paper we investigate the neural processing and the representation of occluding contours defined by occlusion cues. We trained rhesus monkeys on a visual fixation task that reinforced foveal viewing. During the periods of active visual fixation we recorded the responses of single neurons in areas V1 and V2 of the visual cortex. The stimulus conditions that we used mimicked situations of spatial occlusion; usually an opaque, uniform rectangle (or tongue) which appeared to overlay a larger, rectangular grating texture. The neural responses were analyzed with respect to figure-ground direction and contrast polarity. Neurons in area V1 either failed to respond to such stimuli, or they were sensitive to contrast polarity only; others responded unselectively to all types of contour. By contrast, more than one third of the neurons studied in area V2 were sensitive to the direction of figure and ground at such contours. The majority of these neurons were not sensitive to the contrast polarity of these contours. Some rare neurons preferred a certain combination of figure-ground direction and contrast polarity. We explain these results in terms of a model that had been developed to explain neural signals of illusory contours, contours that often coincide with occluding contours. In conclusion, the results of the present paper suggest that in monkey visual cortex occluding contours and mechanisms for the segregation of figure and ground at such contours are first represented at the level of area V2.
*This research was supported by SNF 31-31970.91 (Esprit Insight-II 6019) and HFSP RG-31/93. X.S. was supported by SNF 31.31963.91 (Esprit Mucom-II 6615). We thank R. van der Zwan for comments on the manuscript. Please send correspondence to E. Peterhans at the address given above.
2. INTRODUCTION
Object recognition requires consistent representations of object borders independent of the background structures and viewing conditions. In the retinal image borders can be defined by a number of cues, most typically by discontinuities of luminance, color, texture, and motion, or by binocular disparity (1-3). In this paper we investigate the neural processing of borders defined by occlusion cues (4, 5). Figure 1 shows examples of artificial visual scenes illustrating the problem. This figure induces the perception of occluding contours and surfaces, Figs. 1A and C a triangle, and Figs. 1B and D a peanut shaped object, both of which appear to overlay objects of other shapes in the background. It is the terminations of these background structures (line-ends, corners) which produce illusory contours at sites of fading, or missing contrast. These contours complete the gaps of the occluding contours and thus facilitate the perception of occluding objects. This mechanism may contribute to the segregation of figure and ground, particularly in cluttered visual scenes where objects overlay one another. In this light, illusory contours contribute to perceptual stability. A similar mechanism may also enhance the segregation of figure and ground in natural visual scenes (see Ref. 6 for further discussion). Figure 1 also shows that it is the spatial alignment and the direction of the terminating background structures which determine the shape of the occluding contours and the depth order of the associated surfaces. This mechanism is independent of contrast polarity. Figures with dark (A, B) and light terminations (C, D) induce identical shapes of the occluding objects and the same depth relations as long as they have the same spatial arrangement.

We have investigated this process in the visual cortex of the alert monkey during behaviorally induced visual fixation. The responses of single neurons to stimuli akin to those of single elements of Figs. 1B and D were studied in areas V1 and V2. The results confirm our previous findings that illusory contours are represented in area V2, but not in area V1 (see also Refs. 7 and 8). In addition, they indicate that not only illusory contours, but also information about the direction of figure and ground at such contours is represented at this early level of processing. A report on the very first neurons found of this type has been published (9).
Figure 1. Illusory figures that mimic situations of spatial occlusion. It is the spatial arrangement of the occlusion cues (line-ends, corners) which determines the shape of the illusory contours, and the depth order of the associated surfaces. This mechanism is independent of contrast polarity; both dark (A, B) and light terminations (C, D) induce identical percepts of form and depth. (Adapted from Ref. 4).

3. METHODS
3.1 Animal preparation
Rhesus monkeys (Macaca mulatta) were trained on a visual fixation task that reinforced foveal viewing. The fixation target consisted of two vertical lines that were 7 min arc long, 1 min arc wide, and separated by 5 min arc from center to center. The animals could initiate a trial by pulling a lever. After a variable time interval (0.5-5 sec) the orientation of the fixation target was turned by 90 deg and the animal had to release the key within 0.4 sec. Correct responses were rewarded with a small amount of fruit juice or water. When the animals reached a performance rate greater than 90% they were prepared for recording. In successive operations, under general anesthesia, a head-holder and two recording chambers (one for each hemisphere) were mounted to the skull. The chambers were centered on the operculum over the area representing the central 1-6 deg of the lower visual
field. For a detailed description of our method of accessing areas V1 and V2 see Ref. 10.
3.2 Visual stimulation and recording
We characterized the response properties of single neurons using conventional stimuli like bars, edges and square-wave gratings. Subsequently we used illusory-contour stimuli as shown in Fig. 2B, C (a detailed description of the responses to such stimuli is given in Refs. 7 and 8), and occluding-contour stimuli as shown in Fig. 3A-D. In human observers the occluding-contour stimuli induced the perception of opaque, bright (or dark) rectangles bounded by illusory contours. Typically, we used moving stimuli which were perceived as rectangular tongues sliding back and forth over a grating texture. The width of the grating lines was 1-9 min arc, the line spacing 6-28 min arc. As shown in Fig. 3, the edge of the tongue was presented at the neuron's preferred orientation and the grating texture perpendicular to it. It was oscillated with constant speed and frequency over the neuron's response field (ellipse) which had been determined with a bar or an edge. These stimuli and the fixation target were generated by means of analog and digital circuits and displayed on a specially designed high resolution, flat faced oscilloscope equipped with a fast decaying phosphor (Ferranti A5, peak at 555 nm). By means of a half silvered mirror we added a uniformly illuminated background to the display, such that the final luminance of the background was 10 or 36 cd/m², and that of the stimuli 20 or 72 cd/m². Two images, one for each eye, were generated side by side on the oscilloscope screen at a rate of 100 Hz. They were presented to the animal via a stereoscope, for each neuron at its preferred depth. The responses of single neurons were recorded during the periods of active visual fixation. We used glass-coated platinum-iridium microelectrodes prepared according to Wolbarsht et al. (11), but without platinum-black coating. The signals were amplified, fed to a Schmitt-trigger, recorded on instant film in the form of a dot display, and stored by computer for immediate display and off-line analysis.
4. RESULTS
In perception, the borders of objects can be perceived independent of the cue by which they are defined in the retinal image. For example, we perceive a triangle when its borders are defined by color or contrast edges, and similarly when these borders are defined by occlusion cues (see Fig. 1). This generalization of contours which implies conversion of cues is an important step toward perceptual constancy. Earlier studies from our laboratory on the processing of contour in monkey visual cortex suggested that the process of contour generalization begins in area V2 (7, 8, 12). Those studies showed that neurons of area V2, but not of area V1, signalled the orientation of contours independent of whether these contours were defined by luminance contrast or occlusion cues. The responses of such a "contour neuron" of area V2 are reproduced in Fig. 2. This neuron preferred long dark bars with oblique orientations (A), but it also responded to two types of illusory-contour stimuli (B, C). In human observers these stimuli induced the perception of an illusory bar overlaying two bright rectangles in the background (B), and an illusory contour between abutting line-gratings (C). The ellipse illustrates the dimensions of the minimum response field as plotted with a dark bar. In each stimulus condition (A-C) the "contours" were moved back and forth over the neuron's response field. The dot displays on the right show the responses in the two directions of stimulus movement. The bottom display (D) shows spontaneous activity. The two illusory-contour stimuli evoked similar, though weaker responses than the solid bar. On the average, the response strength evoked by illusory-contour stimuli was about 60% of that evoked by solid bars or edges (7).

[Figure 2: Unit 3GD5; stimulus insets and dot displays for panels A-D. Values printed beside the displays, top to bottom: 27.7, 4.6 (B), 4.0, 0.0 (D). Scale bar: 0.5 s (2°).]
Figure 2. Responses of a "contour neuron" of area V2. (A) shows the responses to a dark bar at optimal orientation, (B) and (C) the responses to illusory-contour stimuli. The ellipses indicate the neuron's response field as determined with a bar stimulus; the cross marks the fixation point of the monkey. In each stimulus condition (A-C) the contour was moved over the response field. The dot-displays show the responses to 24 motion cycles (frequency: 1 Hz), those in the forth sweep in the left, and those in the back sweep in the right half of the display. The responses were recorded in blocks of 8 motion cycles in an interleaved, pseudorandom order. Each dot marks an action potential, the figures represent mean numbers per motion cycle. The spontaneous activity was zero (D). (Reproduced with permission from Ref. 29).

[Figure 3: Unit 7BD1; stimulus insets and dot displays for panels A-D and for spontaneous activity. Values printed beside the displays, top to bottom: 15.7, 16.2 (B), 10.0, 3.0, 2.3.]
Figure 3. Sensitivity to figure-ground direction at occluding contours. Responses of a neuron of area V2. The line terminations were presented perpendicular to the neuron's preferred orientation, and moved back and forth over the neuron's response field as plotted with a bar stimulus (ellipses). Note that the position of the contour represents the starting point of the forward sweep. The dot displays (A-D) show the responses recorded during 24 motion cycles (frequency: 1 Hz), the figures on the right indicate mean numbers of spikes per motion cycle. The bottom display shows spontaneous activity for a corresponding time interval.

Since illusory contours often coincide with occluding contours (see above), we asked ourselves whether neurons of area V2 that are sensitive to the orientation of illusory contours also carry information about the direction of figure and ground at such contours. Thus, we studied the responses of neurons of area V2, and for control also of neurons of area V1, in stimulus conditions that were akin to the single elements of Figs. 1B and D. We used pairs of stimuli that had the same figure-ground direction at the contour, but opposite contrast polarity. They are shown in the stimulus insets of Fig. 3 (A-D). The combination of these two pairs of stimuli (A, B and C, D) allowed us to analyze separately the effects of figure-ground direction and contrast polarity. The types of responses that we predicted from these stimuli are listed in Table 1.
Neurons sensitive to figure-ground direction were expected to respond preferentially to one type of contour pair, either (A and B) or (C and D), independent of the contrast polarity at the contour. Neurons sensitive to a certain combination of figure-ground direction and contrast polarity were expected to prefer one type of contour, either (A), (B), (C), or (D). Neurons not sensitive to figure-ground direction were expected to be sensitive to the contrast polarity at the contour (dark/light or light/dark) and thus to prefer contours (A and D) or (B and C), or to be unselective giving similar responses to all four types of contour. This paper is based on the responses of 146 neurons, 46 recorded in area V1, 100 in area V2.
Table 1. Responses predicted from occluding-contour stimuli

Neural sensitivity to                            Effective contours*
Figure-ground direction                          (A and B) or (C and D)
Figure-ground direction and contrast polarity    A or B or C or D
Contrast polarity                                (A and D) or (B and C)
Unselective                                      A and B and C and D

*For an illustration of these contours see Fig. 3 A-D. Note that contour pairs (A and B) and (C and D) have opposite contrast polarity.
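The predictions of Table 1 can be written as a small lookup, which may help when checking a neuron's response pattern against the four categories. The following is a minimal sketch only: the function name `predicted_type` and the category strings are hypothetical labels introduced here for illustration, and the decision of which contours count as "effective" for a given neuron is assumed to have been made beforehand.

```python
# Illustrative sketch, not the authors' analysis code.
# Contours are labelled A-D as in Fig. 3; the function name and the
# returned strings are hypothetical labels chosen for this example.

def predicted_type(effective):
    """Map the set of effective contours onto the categories of Table 1."""
    s = frozenset(effective)
    if s in (frozenset("AB"), frozenset("CD")):
        # Same figure-ground direction, both contrast polarities
        return "figure-ground direction"
    if len(s) == 1 and s <= frozenset("ABCD"):
        # Exactly one combination of direction and polarity
        return "figure-ground direction and contrast polarity"
    if s in (frozenset("AD"), frozenset("BC")):
        # Same contrast polarity, both figure-ground directions
        return "contrast polarity"
    if s == frozenset("ABCD"):
        return "unselective"
    return "not predicted by Table 1"


if __name__ == "__main__":
    for contours in ("AB", "C", "AD", "ABCD", "AC"):
        print(contours, "->", predicted_type(contours))
```

Any other combination of contours, such as (A and C), falls outside the table and is reported as such rather than forced into a category.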
4.1 Sensitivity to figure-ground direction
Figure 3 shows the responses of a neuron of area V2 that was sensitive to figure-ground direction. It gave similar responses to the first stimulus pair (A, B) for which the line-terminations indicated an occluding surface to the right of the contour. Much weaker responses were evoked by the second stimulus pair (C, D) with the opposite figure-ground arrangement. Of course, the four stimuli are highly abstract conditions which reduce the problem of figure-ground direction to a single contour. At this simplified level, human perception can be ambiguous. This is evident from the stimulus insets of Fig. 3. Two percepts are possible, either the opaque tongue covering a grating texture, or a grating with a cut-out notch of the size of the tongue. In the figure the second percept can dominate due to the disturbing effect of the ellipse. However, in the actual stimulus, particularly when it was moving, the first percept dominated. Also, if one looks at a single element of Figs. 1B or D the illusory contour persists, though perhaps somewhat weaker than in the complete figure (cf. DISCUSSION). In any case, independent of the actual percept, the information conveyed by the neuronal response is unambiguous - it indicates the local position and the direction of line-terminations. The signals of these neurons can provide the very first, local information necessary for mechanisms of figure-ground segregation from occlusion cues.

This interpretation predicts two types of neurons at each orientation preferring opposite figure-ground directions. Figure 4 shows the responses of two neurons that were, relative to their preferred orientation, sensitive to opposite figure-ground directions. For clarity, the preferred orientation of each neuron has been plotted vertical (orientation of the ellipse), and the mean responses are shown in the form of histograms. Besides the responses to occluding-contour stimuli (open bars), Fig. 4 also shows the responses to the corresponding solid edges (filled bars). The first neuron (6CB1) gave stronger responses to contours (A) and (B), i.e. to the left stimulus pair, and weaker responses to contours (C) and (D), i.e. to the right stimulus pair. This result was independent of contrast polarity. Note that the contrast polarity at contour (A) was opposite to that of contour (B), and similarly with contours (C) and (D). By contrast, all solid edges evoked similar responses (E-H). Also, Fig. 4 shows that the response strength to occluding-contour stimuli varied from neuron to neuron - in some this response was stronger (6CQ4), in others (6CB1) it was weaker than the responses to solid edges.

[Figure 4: histograms of mean responses for Unit 6CB1 and Unit 6CQ4; panels A-D (occluding-contour stimuli) and E-H (solid edges) for each unit.]
Figure 4. Sensitivity to opposite figure-ground directions of a pair of neurons of area V2. The first neuron (6CB1) gave stronger responses to contours having the occluding surface to the right (A, B), the second neuron (6CQ4) to contours having the occluding surface to the left (C, D). This result was independent of contrast polarity; all solid edges evoked similar responses (E-H). For each contour (A-H) the mean responses of 24 motion cycles are plotted in the form of histograms (frequency: 1.5 Hz, unit 6CB1; 1 Hz, unit 6CQ4). The vertical bars indicate standard deviations; the ellipses show the response field of unit 6CQ4 as defined with a bar stimulus.

4.2 Quantification of sensitivity to figure-ground direction
Of the 146 neurons studied with occluding-contour stimuli, we obtained complete quantitative records for 63 neurons. For these we determined an index of sensitivity to figure-ground direction ($I_s$) from the mean responses to the four types of contour as shown in Fig. 3 ($R_A$ to $R_D$):

$$I_s = \left| \frac{(R_A + R_B) - (R_C + R_D)}{R_A + R_B + R_C + R_D} \right|$$
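The index can be computed directly from the four mean responses. The sketch below is illustrative rather than the authors' analysis code: the function names are hypothetical, the example response values are invented, and the criterion of 0.2 is the one adopted in this section for calling a neuron sensitive to figure-ground direction.

```python
# Minimal sketch of the sensitivity index defined above.
# Function names and example values are assumptions made for illustration.

def sensitivity_index(r_a, r_b, r_c, r_d):
    """Index of sensitivity to figure-ground direction.

    r_a..r_d are the mean responses (e.g. spikes per motion cycle) to the
    four contours A-D of Fig. 3. The index ranges from 0 to 1.
    """
    total = r_a + r_b + r_c + r_d
    if total <= 0:
        raise ValueError("index is undefined when the summed response is not positive")
    return abs((r_a + r_b) - (r_c + r_d)) / total


def is_direction_sensitive(index, criterion=0.2):
    """Apply the criterion of this section: only indices greater than 0.2 count."""
    return index > criterion


if __name__ == "__main__":
    # Invented responses: strong for pair (A, B), weak for pair (C, D).
    idx = sensitivity_index(10.0, 10.0, 5.0, 5.0)
    print(round(idx, 3), is_direction_sensitive(idx))
```

Because the two members of each pair have opposite contrast polarity, summing within pairs cancels a pure polarity preference, so the index isolates the dependence on figure-ground direction.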
Figure 5 shows the result separately for areas V1 and V2. One can see that this index was never greater than 0.2 for neurons of area V1, thus indicating low sensitivity to figure-ground direction (Fig. 5A). In contrast to neurons of area V1, many neurons of area V2 had indices greater than 0.2 (Fig. 5B). During qualitative testing (i.e. by simply listening to the responses) neurons for which the subsequent analysis revealed an index lower than 0.2 were not recognized as being sensitive to figure-ground direction. Furthermore, the histogram of Fig. 5B suggests a bimodal distribution for the indices of neurons of area V2 with a trough at 0.2. Therefore, only neurons with indices greater than 0.2 were called sensitive to figure-ground direction. Neurons with indices equal to or lower than 0.2 were called unselective. By this criterion, none of the neurons of area V1 for which this index was available (0/18) was sensitive to figure-ground direction, whereas more than 40% of the neurons of area V2 (20/45) showed this property. After pooling quantitative and qualitative results, none of the neurons of area V1 (0/46), and 36% of the neurons of area V2 (36/100) were classified as being sensitive to figure-ground direction.

[Figure 5: histograms of the number of neurons as a function of the index; (A) area V1, N=18; (B) area V2, N=45.]
Figure 5. Quantification of sensitivity to figure-ground direction. (A) shows the distribution of the sensitivity index ($I_s$) for neurons of area V1, (B) for neurons of area V2. Neurons with indices greater than 0.2 were called sensitive to figure-ground direction (open bars), neurons with indices equal to or lower than 0.2 were called unselective (filled bars). For a definition of the index see text.

4.3 Sensitivity to figure-ground direction and contrast polarity
We also found neurons that were selective for a certain combination of figure-ground direction and contrast polarity. These neurons clearly preferred one type of occluding contour, for example in Fig. 3 either contour (A), (B), (C), or (D). These neurons preferred the same contrast polarity at the occluding contour as they did for solid edges, either light/dark or dark/light. So far, only 4 of the 100 neurons studied in area V2 were of this type. They gave strong responses to one type of occluding-contour stimulus and comparatively weak responses or none to the other types. However, more experiments are needed to determine the details of the functional properties of these neurons.

4.4 Effect of stimulus movement
Since we typically used moving stimuli to study sensitivity to figure-ground direction one might argue that the result was affected by stimulus movement. In order to separate the two stimulus parameters, that is figure-ground direction and direction of movement, we studied some neurons with both stationary and moving stimuli. An example of the responses of a neuron studied in both stimulus conditions is shown in Fig. 6. The neuron was recorded in area V2. It gave stronger responses when the occluding surface was to the left of the contour (B) and weaker responses when this surface was to the right of the contour (A). This selectivity was independent of contrast polarity. (The two contours (A and B) evoked 2.8 and 5.8 spikes/sec for the stimulus pair with dark occluding surfaces as shown in Fig. 6 (open bars), and 3.0 and 5.5 spikes/sec for the corresponding stimulus pair with bright occluding surfaces, not shown). We used the stimulus pair with the dark occluding surface for comparing the responses to moving and stationary stimuli. Note that the sweep amplitude in the moving condition was much larger (2 deg) than the fixational eye movements to be expected in our stimulus conditions (cf. DISCUSSION). As shown in Fig. 6, this neuron had the same preferred direction of figure and ground both in the moving (open bars) and stationary stimulus condition (dotted bars). We also determined the preferred direction of stimulus movement for conventional stimuli for 19 of the 20 neurons studied quantitatively that showed sensitivity to figure-ground direction. This analysis revealed that more than half of these neurons (12/19) were not direction selective. They gave similar responses in both directions of stimulus movement. In the remaining 7 neurons, the preferred figure-ground direction and the preferred direction of movement were either the same (N=2) or opposite (N=5). (For an illustration of the relationship between the two types of selectivity see Fig. 7). This suggests that the sensitivity of neurons of area V2 to figure-ground direction is independent of the neurons' sensitivity to motion direction.
5. DISCUSSION
The results described in the present paper suggest that mechanisms for the segregation of figure and ground at occluding contours are implemented relatively early in visual processing. To our knowledge these are the first data to show that single neurons in monkey visual cortex can be sensitive to the direction of line-terminations at occluding contours, and thus indicate the direction of figure and ground at such contours. These neurons were found in area V2, but not in area V1. With very few exceptions, this sensitivity was independent of the contrast polarity at the contours. Furthermore, the preferred figure-ground direction of these neurons was not related to the preferred direction of stimulus movement, which suggests that it is the spatial arrangement and the direction of the occlusion cues (line-terminations) which combine to produce the responses of these neurons.
[Figure 6: Unit 6CM1; histograms of the mean responses to contours A and B in the two conditions labelled "Moving" and "Stationary".]
Figure 6. Sensitivity to figure-ground direction in moving and stationary stimulus conditions. Responses of a neuron of area V2 that preferred the occluding surface to the left of the contour (B). This selectivity was similar in the moving (open bars) and in the stationary stimulus condition (dotted bars). In both conditions the neuron gave stronger responses to contour (B) than to contour (A). The contours were either moved back and forth over the neuron's response field (ellipse) at a frequency of 1.5 Hz, or were kept stationary in the center of this field during a corresponding time interval. The bars represent mean responses recorded during 24 motion cycles or during a corresponding time interval. Vertical bars show standard deviations.
Figure 7. Possible relationships between figure-ground direction and preferred direction of movement (arrow) of cortical neurons. In (A) the two directions are the same, that is from right to left, in (B) they are opposite.

5.1 Interpretation of the neuronal responses
The results of the present paper suggest that contours defined by occlusion cues, and the depth order at such contours are first encoded in area V2. In perception, such contours are often perceived as illusory contours (13, for a review see Ref. 5), and it has been shown that at least two such cues (line-terminations) are needed in order to perceive an illusory contour (14). Increasing the number of lines strengthens the percept. A similar effect has been found in neurons of area V2 that were sensitive to illusory contours (7). A single line-end usually failed to evoke a response. Adding more lines strengthened the response. A stimulus of 8-13 lines, as shown in Fig. 2C, usually evoked maximum response. In the present study we were careful always to present several line-ends within the response field (as determined with conventional stimuli) in area V2, but also in area V1 where the receptive fields can be very small. Thus, the occluding-contour stimuli which we used in these experiments were all adequate to induce the perception of illusory contours. Of course, the four stimulus conditions that we used (see inset of Fig. 3) were highly abstract, designed to reduce the problem of figure-ground direction to a single, straight contour suitable for a study of single neurons. The perception of these stimuli was sometimes ambiguous. Human observers could perceive a notch cut out of a grating pattern or an opaque tongue overlaying a grating texture. These two percepts could alternate, especially in the stationary stimulus condition. However, the neuronal signals as such were unambiguous - they carried specific local information reflecting the spatial arrangement and the direction of line-terminations at a particular contour. This neuronal information may be one of the very first to be used by mechanisms of figure-ground segregation from occlusion cues.

In neurons sensitive to figure-ground direction we found no relationship between stimulus movement and the preferred figure-ground direction. First, the majority of these neurons were only weakly or not selective for motion direction. This result agrees with an earlier study from this laboratory which showed that direction selectivity was not a critical stimulus parameter for neurons of area V2 (10). Second, in neurons that did show direction selectivity, the preferred direction for stimulus movement was more often opposite to the preferred figure-ground direction than the same.

Similarly, it seems unlikely that fixational eye movements accounted for these results. Motter and Poggio (15) have measured the scatter of eye positions in rhesus monkeys performing a binocular fixation task which was virtually identical to the task of our monkeys. The result revealed a standard deviation of 6-8 min arc for the horizontal and 7-13 min arc for the vertical components. Their study showed that the size of the fixation target and the orientation discrimination involved in this task reinforced foveal viewing. A similar result has been obtained by Snodderly and Kurtz (16). The scatter of the responses as shown in the dot displays of Figs. 2 and 3 allows a rough estimate of the fixational eye movements in our stimulus conditions. One can see that they were, on the average, much smaller than the width of the response field, and they appeared to be randomly scattered (for further discussion see Ref. 8 page 1753). Because fixational eye movements occur both in the moving and in the stationary stimulus conditions, and because they vary randomly, they should not affect the comparison between the results recorded in these two conditions.
5.2 Hypothetical mechanism producing sensitivity to figure-ground direction
We explain the sensitivity of cortical neurons to figure-ground direction in terms of a model that has been proposed previously for neural mechanisms of illusory contours (for a review see Ref. 6). The scheme of the model is shown in Fig. 8. It assumes two input pathways for neurons of area V2. The first input (open symbols) produces sensitivity to occluding contours and figure-ground direction, the second (filled symbol) produces sensitivity to contrast edges. The first pathway uses the signals of end-stopped cells with asymmetrical receptive fields. The excitatory parts of their receptive fields are thought to be of the complex type which implies that they respond similarly well to dark and to light line-terminations. The preferred orientation of the end-stopped cells is assumed to be perpendicular to the occluding contour and their alignment parallel to it. For simplicity, only four end-stopped cells are shown, but they are thought to cover the receptive field densely. In the example of Fig. 8 completely asymmetrical fields are assumed (the hatched discs indicate inhibitory end-zones). However, in visual cortex, end-stopped cells with all degrees of field asymmetry can be found (see below). The signals of distant pairs are combined by a multiplication (×) because at least two line-ends are needed to perceive such a contour (14). Furthermore, a single line-end usually also failed to activate cortical neurons (7). The sum of the output of such pairs (Σ) produces the responses of contour neurons of area V2. As discussed above, this summation was also found in the responses of cortical neurons (7). The second pathway (filled symbol) adds to the first. It uses the signals of neurons with simple or complex type receptive fields (vertical ellipse) which respond to solid bars or edges.

Figure 8. Hypothetical mechanism producing sensitivity to illusory contours and figure-ground direction. We assume a dual input for neurons of area V2. The first input (open symbols) produces the signals of illusory contours and figure-ground direction. It groups the signals of end-stopped cells with asymmetrical receptive fields (horizontal ellipses; the hatched discs represent inhibitory end-zones). These neurons prefer orientations perpendicular to the illusory contour and respond to line-terminations as shown in A-C. The signals of distant pairs of such cells are combined by multiplications (×) and the summed output (Σ) produces the signal for illusory contours. Thus, it is the field asymmetry of the end-stopped cells which conveys information about the direction of occlusion cues to these neurons. The second input (filled symbol) adds the signals of neurons with simple or complex type receptive fields (vertical ellipse), and thus the signals for solid lines or edges. This model predicts three types of neurons in area V2: Neurons not sensitive to figure-ground direction (A), and neurons sensitive to either one (B) or the other figure-ground direction (C). It is assumed that all three types of neuron also receive the second input and thus respond to bars or edges.

5.3 Simulation of the model
The model outlined in Fig. 8 has been simulated keeping at all stages as closely as possible to physiology (Refs. 17-19). The simulation involved convolutions of the image with different types of filters which simulated the functions of simple, complex and end-stopped cells. The result paralleled perception: It reproduced the perceived illusory contours and the figure-ground direction at such contours. These results were obtained with binary and with gray-valued images (for examples see Refs. 18 and 19). Alternative models for mechanisms of illusory contours and the segregation of figure and ground at such contours have been proposed (Refs. 20-24). The main differences between these models and ours are twofold. First, a basic step of our model is the detection of the position and direction of terminations (occlusion cues) which are explicitly encoded by filters simulating the function of end-stopped cells with asymmetrical receptive fields. Second, our model implies a one-directional, feed forward process including sets of different filters at different stages that are convolved with the image; no feedback control is required (for a more detailed discussion of the various models see Refs. 19, 20 and 25).
Considering its simplicity, the model produces stable results, also in gray-valued images. More complex neuronal interactions such as dynamic synaptic changes and the temporal coherence of groups of neurons are not required (26-28).

5.4 End-stopped cells

We have studied the receptive fields of end-stopped cells in the light of this model (29, 30). About half of the neurons recorded in areas V1 and V2 responded to lines terminating in the receptive field when this line covered one half of the field, and gave only half of that response or less when it covered the other half. Such responses indicated asymmetrical receptive fields with a strong inhibitory zone at one end of the field and a weak one or none at the other end. The remainder of the end-stopped cells gave similar responses to line terminations covering either one or the other half of the receptive field. These responses indicated symmetrical receptive fields with inhibitory zones of about equal strength at both ends. The results of these experiments suggest that end-stopped cells do respond to line-terminations, even those with symmetrical receptive fields. However, only end-stopped cells with asymmetrical fields carry information about the directions of such terminations.

Little is known about the role of end-stopped cells in visual processing. Hubel and Wiesel, who first described these neurons in cat and monkey visual cortex (31, 32), proposed that those with symmetrical receptive fields could be involved in curvature detection, and models simulating such cells have been proposed (33). Indeed, it has been found recently that end-stopped cells in cat visual cortex can be sensitive to curvature (34, 35). However, these neurons responded about equally well to long lines of optimal curvature and to short straight lines of optimal length. Thus, the signals of these neurons seem to be ambiguous; they are not exclusively related to curvature. The function of end-stopped cells with asymmetrical fields is less clear. Hubel and Wiesel showed that they respond to tongues and to corners of various angles (31, 32). The studies from our laboratory suggest that these neurons are sensitive to terminations (line-ends and corners) and thus may be involved in the detection of occlusion cues (29, 30). Thus, in the simulation of the mechanism producing illusory contours and sensitivity to figure-ground direction, only end-stopped cells with asymmetrical fields have been invoked (18, 19).
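The distinction between asymmetrical and symmetrical end-stopped fields described in this section can be expressed as a simple decision rule. The sketch below is hypothetical: the function name and the use of the "half of that response or less" criterion as a sharp cut-off are assumptions, since the text gives no numerical boundary for "similar responses".

```python
def classify_end_stopped(response_half_a, response_half_b, ratio=0.5):
    """Classify an end-stopped cell from its responses to a line-end
    covering one half (a) or the other half (b) of the receptive field.

    ratio=0.5 mirrors the 'half of that response or less' criterion;
    treating it as a hard threshold is an assumption of this sketch.
    """
    stronger = max(response_half_a, response_half_b)
    weaker = min(response_half_a, response_half_b)
    if stronger <= 0:
        return "no response"
    if weaker <= ratio * stronger:
        return "asymmetrical"   # carries the direction of the termination
    return "symmetrical"        # responds, but direction is not signalled


print(classify_end_stopped(20.0, 8.0))
print(classify_end_stopped(20.0, 18.0))
```

With these made-up response values the first cell is classed as asymmetrical and the second as symmetrical.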
5.5 Psychophysics

In a study of the phenomenon of da Vinci stereopsis, Nakayama & Shimojo (36) discovered that unpaired, monocular cues can induce the perception of illusory contours. They showed that the figure-ground direction at such contours is determined by the eye of origin of the unpaired stimulus element. Since the signals of the two eyes converge early in visual processing (mostly in area V1), they concluded that neural mechanisms for the perception of illusory contours and the detection of figure and ground at such contours should be represented early in visual processing, in area V1 or its immediate targets of projection. Our findings fit with this hypothesis and show that in monkey visual cortex this representation begins in area V2. Furthermore, it has been shown that motion perception at occluding contours, as for example in the barber pole illusion, depends on the depth order invoked by occlusion (37), and that interaction between depth from occlusion and depth from binocular disparity can produce conflicting percepts (38, 39). This also suggests that mechanisms for figure-ground segregation from occlusion cues are implemented just as early as mechanisms contributing to the perception of depth from motion or stereoscopic cues. In synergy, these mechanisms may enhance the perception of occluding contours and stabilize the perceived depth order of the associated surfaces.
6. CONCLUSIONS

The present paper proposes a neural mechanism for the segregation of figure and ground at occluding contours. Evidence for such a mechanism is provided by the neurophysiological data showing that neurons in area V2 of the monkey visual cortex can be sensitive to illusory contours and to the depth order associated with such contours. We explain the responses of these neurons in terms of a model that uses a feed forward mechanism based on the signals of end-stopped cells sensitive to the position and the direction of occlusion cues (line-ends, corners). The validity of this model has been tested by simulation and comparison of the results with perception. These results fit with evidence from psychophysics which suggests that grouping mechanisms for figure-ground segregation from occlusion cues are implemented early in the visual pathway, and that these mechanisms may coexist with mechanisms contributing to the perception of depth from motion or stereoscopic cues.
REFERENCES

1. P. Cavanagh, In: Neural Mechanisms of Visual Perception, D. M. K. Lam and C. D. Gilbert (eds), Portfolio Publishing Company, The Woodlands, TX, (1989) 261-279.
2. B. Julesz, Foundations of Cyclopean Perception, University of Chicago Press, Chicago, (1971).
3. K. Nakayama, S. Shimojo and G. H. Silverman, Perception 18 (1989) 55-68.
4. G. Kanizsa, Organization in Vision, Praeger, New York, (1979).
5. S. Petry and G. L. Meyer, The Perception of Illusory Contours, Springer, Berlin, (1987).
6. E. Peterhans and R. von der Heydt, Trends Neurosci. 14 (1991) 112-119.
7. R. von der Heydt and E. Peterhans, J. Neurosci. 9 (1989) 1731-1748.
8. E. Peterhans and R. von der Heydt, J. Neurosci. 9 (1989) 1749-1763.
9. R. von der Heydt, F. Heitger and E. Peterhans, Biomedical Research 14 (Suppl. 4) (1993) 1-6.
10. E. Peterhans and R. von der Heydt, Eur. J. Neurosci. 5 (1993) 509-524.
11. M. L. Wolbarsht, J. E. F. MacNichol and H. G. Wagner, Science 132 (1960) 1309-1310.
12. R. von der Heydt, E. Peterhans and G. Baumgartner, Science 224 (1984) 1260-1262.
13. S. Coren, Psychol. Rev. 79 (1972) 359-367.
14. F. Schumann, Zeitschrift für Psychologie 23 (1900) 1-32.
15. B. C. Motter and G. F. Poggio, Exp. Brain Res. 54 (1984) 304-314.
16. D. M. Snodderly and D. Kurtz, Vision Res. 25 (1985) 83-98.
17. F. Heitger, L. Rosenthaler, R. von der Heydt, E. Peterhans and O. Kübler, Vision Res. 32 (1992) 963-981.
18. F. Heitger and R. von der Heydt, In: Proc. 4th Int. Conf. Computer Vision, Berlin, Germany. IEEE Computer Society Press (1993) 32-40.
19. F. Heitger, R. von der Heydt, E. Peterhans, L. Rosenthaler and O. Kübler, Image & Vision Computing (1996) in press.
20. L.H. Finkel and G.M. Edelman, J. Neurosci. 9 (1989) 3188-3208.
21. J. Skrzypek and B. Ringer, In: Proc. 3rd Int. Conf. Computer Vision, Champaign, IL. IEEE Computer Society Press (1992) 681-683.
22. S. Grossberg and E. Mingolla, Percept. Psychophys. 38 (1985) 141-171.
23. S. Grossberg, Percept. Psychophys. 41 (1987) 117-158.
24. P. Sajda and L.F. Finkel, In: Proc. 3rd Int. Conf. Computer Vision, Champaign, IL. IEEE Computer Society Press (1992) 688-691.
25. L. Finkel and P. Sajda, Neural Computation 4 (1992) 901-921.
26. O. Sporns, G. Tononi and G. M. Edelman, Proceedings of the National Academy of Sciences of the United States of America 88 (1991) 129-133.
27. R. Eckhorn, R. Bauer, W. Jordan, M. Brosch, W. Kruse, M. Munk and J. H. Reitboeck, Biol. Cybern. 60 (1988) 121-130.
28. C. M. Gray, P. König, A. K. Engel and W. Singer, Nature 338 (1989) 334-337.
29. E. Peterhans and R. von der Heydt, In: Representations of Vision. Trends and Tacit Assumptions, A. Gorea, Y. Fregnac, Z. Kapoulis and J. Findlay (eds), Cambridge University Press, Cambridge, (1991) 111-124.
30. E. Peterhans and R. von der Heydt, Soc. Neurosci. Abstract 16 (1990) 293.
31. D. H. Hubel and T. N. Wiesel, J. Neurophysiol. 28 (1965) 229-289.
32. D. H. Hubel and T. N. Wiesel, J. Physiol. 195 (1968) 215-243.
33. A. Dobbins, S. W. Zucker and M. S. Cynader, Vision Res. 29 (1989) 1371-1387.
34. A. Dobbins, S. W. Zucker and M. S. Cynader, Nature 329 (1987) 438-441.
35. M. Versavel, G. A. Orban and L. Lagae, Vision Res. 30 (1990) 235-248.
36. K. Nakayama and S. Shimojo, Vision Res. 30 (1990) 1811-1825.
37. S. Shimojo, G.H. Silverman and K. Nakayama, Vision Res. 29 (1989) 619-626.
38. J. P. Harris and R. L. Gregory, Perception 2 (1973) 235-247.
39. V. S. Ramachandran and P. Cavanagh, Nature 317 (1985) 527-530.
Brain Theory - Biological Basis and Computational Principles
A. Aertsen and V. Braitenberg (Editors)
© 1996 Elsevier Science B.V. All rights reserved.
Microarchitecture of Neocortical Columns

Rodney J. Douglas, Misha A. Mahowald and Kevan A.C. Martin*

MRC Anatomical Neuropharmacology Unit, Mansfield Road, Oxford OX1 3TH, United Kingdom

The neocortical system, with its exquisite variety of function, is built on a series of column-like structures that aggregate to form slabs and pinwheel patterns. The basic unit of the column is a vertical chain of neurons where later stages of the chain reconnect with earlier stages to form a series of recurrent circuits. We present a simple electrical circuit analogy to represent this recurrent chain and show how stability in the circuit can be achieved through the known biophysical mechanisms of the neuron and synapses. The possible role of recurrent excitation and inhibition is then explored in the context of extracting a signal embedded in noise. The example demonstrates how the recurrent circuits of the neocortex, with neurons connecting on a nearest neighbour basis, provide a means of representing the signal in a relatively noise-free neural code and of allowing the restored signal to scale with the magnitude of the input from the periphery.
1. MAPS, AREAS AND COLUMNS

A brief history of neurophysiological research on the neocortex would reveal three interrelated strands that dominated the research over many decades. The first strand is the work that established the existence of topographic maps of the sensory and motor world on the surface of the neocortex. The best known example is the topographic map described for primate area 17 by Daniel and Whitteridge [15]. From data derived from electrophysiological mapping of area 17 they were able to develop a simple mathematical model that predicted both the form of the representation of the visual field upon the striate visual cortex of the monkey and the unfolded shape of area 17. Their map revealed that neighbouring regions in visual space were represented in neighbouring regions of the visual cortex, and this principle remains true for all sensory and motor maps in the cortex. Through the concept of magnification factor, i.e. the factor that gives the extent of cortical surface devoted to a unit size of the sensory space, they were able to suggest a direct relationship between visual acuity and the amount of cortex devoted to the representation of the fovea. This relationship was a necessary precursor of the notion of a cortical 'hypercolumn' (see third strand below).

*We thank Bashir Ahmed, John Anderson and Charmaine Nelson for their contributions to the work described in this chapter. We acknowledge the financial support of The Royal Society, the Medical Research Council, the US Office of Naval Research, the EC, and the Wellcome Trust.
The second dominant strand is the existence of multiple cortical areas devoted to a single sensory domain, such as vision or audition. These areas began to be mapped physiologically in some detail in the early part of this century [12,1]. Many of these areas were originally defined by the fact that they have a complete or partial topographic map of the sensory surface. In the visual system, for example, Cowey [13] was able to demonstrate the existence of a topographic map in area 18, and showed that the border of areas 17 and 18 followed the same principle of nearest neighbour mapping. Obviously, to achieve this, the visual field representation was mirrored along the border between the two areas. All sensorimotor systems have multiple representations in cortex. The size of the individual cortical areas varies, as does the grain of the maps in the different areas. The largest single area, area 17 in the primate, for example, has the most precise retinotopic representation of the visual field, whereas area MT, a much smaller anterior visual area in the primate, has a much coarser retinotopic representation of visual space. In addition, within a single area there may be repeated segments of the same basic representation. The visual field is doubly represented in layer 4 of area 17, where each eye has a full representation of the whole visual field [34]. Even in this instance, local neighbourhood relations are maintained because the maps are retinotopic and interleaved with each other in a series of alternating left and right eye zebra-like stripes [35]. These stripes appear as patches, blobs, or columns when cut in cross section. These local mapping strategies form the third dominant strand of investigation. The columnar systems in the primary visual cortex have been a major preoccupation of neurobiologists of all hues.
With the development of sophisticated 2-D and 3-D methods of functionally mapping these columns using metabolic markers [35] and optical recording methods [8,24], and of anatomical methods of mapping the columns [35], the two-dimensional nature of the columnar map has become clearer. The actual form of the columns varies and depends on the function being mapped. For example, in the primary visual cortex of both the cat and monkey at least three varieties of columns have been described. One is the slab-like arrangement found in the ocular dominance columns [34]. A second is the circular 'pinwheel' arrangement that forms part of the orientation map [10]. The third is the arrangement of the cytochrome oxidase 'blob' [28], a column that lies at the centre of the ocular dominance columns and which also aggregates neurons that have functional properties in common [41]. These columnar structures are all superimposed on the retinotopic map of the visual field, and these mappings are united in the inspired invention of the notion of hypercolumns [33]. A hypercolumn is the minimal unit that contains all the machinery necessary to process all the values of a particular variable for each part of the visual field. In the case of the orientation system it consists of a full set of slabs subserving the 180 degree cycle; for ocular dominance it consists of a left eye slab and a right eye slab. Since the size of the visual receptive fields and their scatter scales with the magnification factor, it transpires that a move of 2-3 mm along the surface of the primate area 17 in any direction will lead out from one region of the visual field to an entirely new, but neighbouring region of the field. Thus, independently of which part of the visual field is being represented, a region of visual cortex 2 mm by 2 mm in surface area and extending from pia to white matter will contain the neuronal machinery to analyse that small region of visual field.
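The relation between a step across the cortical surface and the corresponding shift in the visual field can be illustrated with a short Python sketch. The function name and the magnification values are hypothetical and chosen only for illustration; the only relation assumed is that the magnification factor is expressed in millimetres of cortex per degree of visual field.

```python
def visual_shift_deg(cortical_distance_mm, magnification_mm_per_deg):
    """Visual-field shift (degrees) produced by a move across cortex.

    magnification_mm_per_deg: cortical distance devoted to one degree of
    visual field at the location considered (hypothetical values below).
    """
    if magnification_mm_per_deg <= 0:
        raise ValueError("magnification factor must be positive")
    return cortical_distance_mm / magnification_mm_per_deg


# The same 2 mm step at three made-up magnification factors.
for magnification in (10.0, 2.0, 0.5):
    shift = visual_shift_deg(2.0, magnification)
    print(f"M = {magnification} mm/deg -> shift of {shift} deg")
```

The same 2 mm step spans a small visual angle where magnification is high and a large one where it is low. For a fixed cortical step to reach an entirely new region of the field everywhere, receptive field size and scatter, measured in degrees, must therefore grow as magnification falls.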
Although the visual cortex was the arena in which the major development of this key
concept of columns took place, it now appears that the cortical column is ubiquitous. In all species that have been examined, anatomical, histochemical, or functional columns can be found throughout the neocortex. Although the simplest illustration is the spatial neighbourhood relations preserved in the retinotopic representation of visual space in the cortex, the same nearest neighbour relations also run along other dimensions, like orientation and ocular dominance, i.e. any given neuron is likely to be tuned to similar parameters to its neighbours. Columns form a fundamental unit of cortical organization. It is no surprise then that they have received close attention from theoreticians, who have for the most part produced models in which the afferents self-organize into the columnar pattern through activity-dependent competitive mechanisms (e.g. [66]). These theories provide a description of the development of the columns in all their forms (blobs, slabs, pinwheels etc.), but they do not express a view about why one pattern rather than another should be associated with a particular cortical function. The functional 'usefulness' of the mapping is not really addressed, neither is the issue of whether the cortical circuits have any role in determining the form of the afferent mapping. The single principle that unites these three strands is the preservation of neighbourhood relations. The fundamental organization of neocortex is that aggregates of neurons with common connectivity and functional properties are organized in coherent, repeated patterns.
2. MAP FORMATIONS

Experimentally it is very difficult to decide whether the pre- or postsynaptic elements are pre-eminent in determining the form of the map. Generally, the line has been taken that it is the presynaptic elements alone that determine the form of the projection. For example, the retinotopic map of the visual field on the striate cortex is thought to be due to self-ordering of the thalamic projection to layer 4. The arbors of individual thalamic afferents in the sensory areas form non-uniform, patchy projections, which underlie the ocular dominance columns seen with bulk tracing techniques or optical recordings in the visual cortex. In these instances, it seems, the cortical circuits need have only a relatively uniform structure, since the functional ocular dominance columns are in fact imposed by the relative right and left eye afferent arbors arising from subcortical neuron populations in the lateral geniculate nucleus. Even in cases where the cortical neurons are clearly involved, as in the whisker barrel fields of the mouse somatosensory cortex, in which the aggregates of neurons that respond selectively to the activation of one particular whisker can be seen in a simple Nissl-stained section [65], the problem is of course that viewing these aggregates in the adult says nothing about the mechanism of their development. In the case of the barrel fields, there is actually strong evidence that the barrels are induced by the afferents. Changes in whisker number, for example, will be reflected in a congruent change in the barrel field map in the cortex [65]. This view of cortex was stated in a strong form by Hubel and Wiesel [33], who proposed that the whole visual cortex contained repeated units of the same basic machinery and that the local differences in function were provided by differences in the pattern and function of the afferents supplying any local piece of the cortical machinery.
The great attraction of this view is that the genetic instructions that build neocortex,
which after all constitutes over 80% of the brain volume in humans, need not specify many thousands of unique modules. However, we cannot assume that the neocortex is simply a tabula rasa upon which the subcortical projections scratch their idiosyncratic graffiti during development. The possibility cannot be excluded that the cortex has some protomap that guides the development of the particular form of afferent mapping [59,37]. The protomap hypothesis can itself be considered on at least two scales. One is the cortical mantle itself: how does it divide itself up into the 100 or so distinct areas? There is now convincing experimental evidence that there is some predisposition to form areas in the absence of subcortical input, but that this predisposition can be strongly influenced by the presence of the thalamic afferents. Thus the most distinctive cortical area, area 17 of the primate, develops highly abnormally in utero if the eyes are removed [60,17,38]. Nevertheless, islands of histologically normal looking area 17, with its distinctive laminar pattern, still develop. At another level is the possible protomap within a single area. It is simply not known whether there are protomaps for the columnar patterns, such as ocular dominance stripes, cytochrome blobs, or the orientation system, but longitudinal studies of the plasticity of these systems in the same animal suggest that there is some basic framework that guides the particular organization in that particular animal. It is as though a 'fingerprint gene' was determining a basic form within which individual variations were possible through epigenetic interactions.

3. ELEMENTS OF CORTICAL MICROCIRCUITS

Considerations like this raise the question of what actually are the cortical circuits that make up these different functional aggregates of neurons.
Does the same circuit simply get repeated through a cortical area, as Hubel and Wiesel suggested, or does the precise circuit vary according to the afferent innervation? In the case of ocular dominance columns the simplest explanation would be that a neuron's ocular dominance is determined by the relative number of synapses formed on that neuron that derive from right or left-eye driven LGN afferents. The form of the local cortical circuit at any point within the ocular dominance map would be the same. A similar argument would of course hold for the orientation columns - they could reasonably be set up by the specific geometric arrangement of the thalamic afferents converging on the target cells. If the cortical microcircuits were examined at any position in the map of orientation columns, the prediction is that they would be the same. If the cortical circuits are the same for these two cardinal functions, then it is reasonable to suppose that a multidimensional mapping of various attributes could operate on the same principle. This would leave the Hubel and Wiesel notion of a basic uniformity in the cortical machinery intact. But what is the evidence that the cortical machinery is repeated over and over, like a crystal? Experimentally it is clear that single neurons respond to a variety of stimulus attributes, including orientation, motion, contrast, depth and velocity. Somehow the circuits of the cortex are arranged to permit such multidimensional function. And what is the possible function of the cortical circuits if the pattern of thalamic afferent input is so important in determining the basic functional properties? The data for the uniformity of the cortical machinery can be approached at several
different scales and levels of sophistication. One line of evidence has come from simple counts of Nissl-stained sections of different cortical areas in different species. Rockel, Hiorns and Powell [61] reported that the number of neurons under a millimetre of surface of various areas of cortex sampled from mouse, rat, cat, monkey and man was approximately constant. The number was about 100 000 neurons, with the exception of the primate visual cortex, which had about double the number. The claim was that the absolute number of neurons under a unit area of cortical surface was genetically determined and that this genetic instruction had been preserved through mammalian evolution [58]. This is a bold claim and, not surprisingly, several dissenting voices have been heard [55]. But even the counter-claims that there are differences between areas in the absolute number of neurons per unit surface area do not offer figures that differ by more than about 2-fold from those suggested by Powell and his coworkers. And even were there exact agreement on this particular point, no-one has offered any hypothesis as to why evolution might have arrived at this or any other number. Can it simply be a number arrived at through some serendipity of the evolutionary process, or does the number, whatever it is exactly, have a functional significance? To get any hint at this answer we need to explore another level of organization. Powell's hypothesis was not based simply on counts of neurons in Nissl-stained sections. A second strand to his argument was that the composition of neuronal types in the different areas of neocortex was also conserved through evolution. That is, when examined in the electron microscope, about two thirds of the neurons appear to be pyramidal cells and about one third are non-pyramidal interneurons [58].
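The two figures quoted above can be combined into rough absolute numbers. The following Python sketch simply applies the stated proportions to the stated counts; the function name is hypothetical, and the results are only as good as the approximate values in the text.

```python
def split_by_type(total_neurons, pyramidal_fraction=2 / 3):
    """Split a neuron count into (pyramidal, non-pyramidal) cells.

    pyramidal_fraction defaults to the 'about two thirds' quoted in the text.
    """
    pyramidal = round(total_neurons * pyramidal_fraction)
    return pyramidal, total_neurons - pyramidal


# About 100 000 neurons under the unit of surface in most areas,
# and about double that number in primate visual cortex.
for label, total in (("typical area", 100_000), ("primate visual cortex", 200_000)):
    pyramidal, non_pyramidal = split_by_type(total)
    print(f"{label}: {pyramidal} pyramidal, {non_pyramidal} non-pyramidal")
```

On these figures a typical column of tissue holds roughly 67 000 pyramidal cells and 33 000 non-pyramidal neurons. The counts for primate visual cortex are about twice as large, with the same proportions.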
The proportions produced by Powell and his coworkers have never been seriously challenged, although they ran counter to the dogma of Ramon y Cajal who, on the basis of Golgi-stained sections, supposed that the number of non-pyramidal neurons increased greatly from mouse to man (see discussion by DeFelipe and Jones [16], pp. 590-599). With the advent of immunochemical methods Ramon y Cajal's view has had to be modified. It has been shown that smooth 'non-pyramidal' neurons contain the synthesizing enzyme for gamma amino butyric acid (GABA) and are immunopositive for antibodies directed against GABA itself. Both in rat and in primate, the GABA-positive neurons form about 20% of the total in all cortical areas [29,26,55]. These neurons include the basket cells, the chandelier cells, double bouquet cells and various other subclasses. An additional population of non-pyramidal spiny neurons is found in the primary sensory areas. These are the spiny stellate cells that are found exclusively in layer 4, which form about 5-10% of the neurons. They do not contain GABA. Thus, even in area 17 of the primate, the relative proportions of the different cell types seem to have remained approximately constant. However, what Ramon y Cajal certainly saw was an increasing elaboration of the dendritic or axonal arborization of the population of smooth neurons, i.e., their morphology and connectivity were evolving, but modern studies have shown that their proportion remains constant.

4. VERTICAL COLUMNAR MICROCIRCUITS

It was Lorente de No [42] who, on the basis of his Golgi studies, emphasized the vertical organization of chains of neurons, which he saw as the functional unit of cortical organization. This theme of verticality was taken up by the physiologists whose discovery
of topographic maps in the sensory areas gave a teleological reason for this arrangement: neighbouring neurons processed signals arising from neighbouring regions of the sensory space [53,15]. Another dimension was added when Hubel and Wiesel [32] showed that within a single column there were different receptive field types that were aggregated in different layers of the column. To them this suggested a chain of processing in the vertical dimension, i.e. a hierarchical tier of processing within a local column of grey matter extending from the white matter to the pial surface. This was a concept of great synthetic power and subsequent anatomical work gave further clarity to this view of the organization. Tracing methods showed that the projections to other cortical and subcortical regions were provided by neurons in different layers, e.g. corticothalamic projections arose from layer 6 pyramidal cells, and corticocollicular projections arose from layer 5 neurons (e.g. [44]). The columnar principle remained, however. In a given column, the neurons that project to the thalamus are activated by much the same stimuli as those projecting to the colliculus. Essentially all the projections to the areas involved in motor control (tectum, striatum, pons, medulla, spinal cord) arise from a small percentage (10% in cat visual cortex) of neurons located mainly in layer 5. Through the layer 6 projection back to the thalamus, the cortex can influence the pattern of sensory activity it receives. Since the transmission times from cortex to thalamus and back are only about twice as long on average as between cortical neurons themselves, the thalamic relay cells could almost be considered a sublayer of neocortex itself. The pattern of projection within the column has been studied mainly using the Golgi technique applied in immature animals. In adults, degeneration and tract tracing techniques have also been exploited.
The most vivid and complete picture of the 3-dimensional structure of cortical neurons in both immature and adult neocortex has, however, come from intracellular filling of single neurons in vivo [23,51]. If the enzyme horseradish peroxidase (HRP) is injected intracellularly, it is transported through the finest dendritic and axonal processes of that neuron. Subsequent histochemical processing reveals the complete dendritic and axonal arborization of a single neuron, without all the problems of incompleteness or immaturity that plague the Golgi technique. Furthermore, the axons of the different types of neurons are not labelled together as in the tract-tracing methods. Clearly there must be some agreement between the conventional neuroanatomical and electrophysiological methods and the complete structure-function picture of single neurons provided by the intracellular HRP techniques. The different techniques do in fact agree (see [48,20,7,40,43]). The main thalamic projections are to the middle layers of the cortex, principally, but not exclusively, to layer 4. The spiny neurons of layer 4 project vertically to the superficial layers, which in turn project to the deep layers. The pyramidal cells in the deep layers project to each other and upwards to layers 1-4. The basic vertical pattern of interlaminar connections of the spiny neurons is conserved through all cortical areas. The pattern of projection of the smooth neurons has been less intensively studied. In general, the axonal arbors of the smooth GABAergic cells appear to be more compact than those of the spiny cells, but many of the interlaminar projection patterns of smooth neurons are equivalent and congruent to those for spiny neurons. This is interesting because the targets of the smooth cells are their neighbours, i.e. the smooth neurons lie in the same column as the neurons they inhibit.
5. LATERAL ORGANIZATION OF MICROCIRCUITS

Lest the impression be given that there are no connections between columns, it should be emphasized that the physiology and the anatomy show clearly that there are lateral connections. For example, the monocular fields of the layer 4 neurons lying in segregated left and right eye columns become binocular in the upper and lower layers of the visual cortex [32]. This mixing of left and right eye can only occur through some lateral interaction. Similarly, the work of Powell and coworkers, in which they made microelectrode lesions in different cortical areas, showed that terminal degeneration was dense for a few hundred microns from the lesion and thereafter became moderate, extending, usually asymmetrically, for 2-3 mm, depending on the lamina. This pattern was seen in all cortical areas tested in cat and monkey [19,22]. Significantly, the pattern remained the same even when the lesion was placed next to an architectonic boundary between two different cortical areas. The efferent fibres form a tight bundle running perpendicularly to the surface of the cortex. The quantitative distribution of synapses in these patchy projections has yet to be determined, but a first approximation was given by Fisken et al [19], who made minimal lesions in area 17 of the monkey to produce degenerating terminals of axons of neurons in the lesion area. They found that nearly 40% of the asymmetric (excitatory) synapses and 30% of the symmetric (inhibitory) synapses were found less than 500 µm from the site of the lesion. Nearly 70% of the degenerating asymmetric synapses and 60% of the symmetric synapses were found within 1 mm of the lesion. The symmetric synapses formed about 11% of the degenerating synapses and did not fall off so rapidly with distance.
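The cumulative percentages reported by Fisken et al can be converted into the share of synapses falling in each distance band, which makes the slower fall-off of the symmetric synapses explicit. The Python sketch below is illustrative only: the function name is hypothetical, and the figures quoted as "nearly 40%" and "nearly 70%" are rounded to 40 and 70.

```python
def band_percentages(cumulative_percent):
    """Convert cumulative percentages at increasing distances into the
    percentage found in each successive band, plus the remainder beyond
    the last distance."""
    bands = []
    previous = 0
    for value in cumulative_percent:
        bands.append(value - previous)
        previous = value
    bands.append(100 - previous)
    return bands


# Cumulative share within 0.5 mm and within 1 mm of the lesion.
asymmetric = band_percentages([40, 70])  # excitatory synapses
symmetric = band_percentages([30, 60])   # inhibitory synapses
print("bands: 0-0.5 mm, 0.5-1 mm, beyond 1 mm")
print("asymmetric:", asymmetric)
print("symmetric: ", symmetric)
```

On these rounded values the asymmetric synapses split 40/30/30 across the three bands and the symmetric synapses 30/30/40, so a larger share of the symmetric synapses lies beyond 1 mm.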
The distribution of degenerating synapses might include those of boutons from fibres of passage damaged by the lesion, but nevertheless the point is that the major connections were local. Similar qualitative observations have been made by Lund, Yoshioka & Levitt [45], and Malach ([46], this volume) in their experiments, which used biocytin rather than degeneration to label the terminals. More recent studies in which chemical tracers have been used rather than degeneration have added little to the vertical dimension of this picture but have given a clearer picture of the pattern of the lateral projections. These lateral projections are not uniformly distributed but form patchy projections. Comparative studies [45,3] in the macaque monkey revealed that the patchy lateral projections in the superficial cortical layers were similar in dimensions and 'patchiness' in areas as diverse as visual (areas 17, 18, 19), somatosensory (areas 1, 2, 3b), motor area 4, and areas 9 and 46 in the frontal cortex. The dimensions and spacing of the lateral patches were within a factor of two in species as diverse as monkeys, tree shrews and cats. The intrinsic pattern of connectivity revealed by this technique does not match precisely the patterns produced by afferents such as those arising from the thalamus or from other cortical areas. The intrinsic mosaic of connections is slightly smaller in scale than those of the extrinsic systems, a device which, Lund et al (1993) suggest, might allow for more heterogeneous sampling of inputs. A similar argument has been advanced by Malach (1992; this volume) to account for the equivalence in the size of the dendritic arbors and the size of the patches formed by neurons in area 17 of marmosets and squirrel monkeys. Both Malach [46,47] and Lund et al [45] found a positive correlation between the size of the dendritic spread and the size
of the patches. Malach ([46,47], this volume) pointed out that this comparability in size allows a maximization of the spread of sampling of different proportions of inputs from the different functional compartments delineated by the patches.

6. FEEDFORWARD HIERARCHIES

The pattern of connectivity and the proportions of the different component neurons and synapses outlined above lead inexorably to the conclusion that, in terms of anatomical connectivity, the activity of any single neuron in a column is dominated by excitatory synapses provided by monosynaptic or disynaptic activation from neighbouring neurons. This raises the interesting question of how this excitation is organized. Based on his view of the anatomical connections, Lorente de No [42] concluded that the chains of neurons were connected so that they could repeatedly re-excite the same neurons. An alternative view was taken by Hubel and Wiesel [30,34] in their early formulation of the cortical circuits for vision. In their scheme the interconnections within the columns were organized in a feedforward hierarchical fashion, so that the chains of neurons never reconnected to form a re-excitatory loop. A similar plan was suggested by Gilbert and Wiesel [23] in their seminal study of the structure and function of cat visual cortex. Their schema closely followed that of Hubel and Wiesel [30,31] except that an inhibitory feedback loop was incorporated from layer 6 to layer 4 (see below). Indeed, traditionally it has been supposed that the recurrent collaterals of pyramidal cells are involved in a recurrent inhibitory pathway to control excitation within the cortex [57,58]. In the excitatory feedforward case originally proposed by Hubel and Wiesel [30], the activity of the cortical circuit was dependent on the pattern of activity of the thalamic afferents. By contrast, Lorente de No's circuits consisted of recurrently connected excitatory neurons and contained no inhibitory neurons [42].
His view was that the effect of impulses entering the cortex depended entirely on the state of the existing activity of the chains of cortical neurons. Although the two models, one of feedforward excitation, the other of recurrent excitation, are diametrically opposed, it has been difficult to distinguish either experimentally or theoretically between these two versions of the basic cortical circuit. Generally the feedforward version has been preferred over the recurrent excitatory model of Lorente de No for the obvious reasons of simplicity and functional stability. However, recent work from our laboratory has suggested that Lorente de No's view needs to be reconsidered. Both within and between laminae we have found recurrently connected excitatory neurons, which may contribute greatly to the effect of activity entering the cortex from the thalamic afferents.

7. RECURRENT EXCITATION AND INHIBITION IN LAYER 4

The new experimental evidence turns on the projection of layer 6 pyramidal cells to layer 4 and on the connections between the spiny stellate cells of layer 4 itself. The question of the organization of the layer 6 recurrent pathway to layer 4 had originally been addressed by Gilbert and Wiesel and coworkers. Their suggestion that the layer 6 pyramidal cells were involved in a recurrent inhibitory pathway to layer 4 was supported by two independent strands of evidence. The first was their detailed ultrastructural examination of the synapses formed by the layer 6 pyramids in layer 4 [52], in which they found that most of the layer 6 pyramidal cell boutons form asymmetric synapses on dendritic shafts. This is a very unusual arrangement, since most pyramidal cells form their synapses with dendritic spines. By serial electron microscopic reconstructions they discovered that most of the target dendrites were sparsely spiny, which they supposed to originate from GABAergic, inhibitory neurons. Their model of end-inhibition was essentially that proposed by Hubel and Wiesel [31]. This hypothesis they followed up by physiological experiments in which they examined the effect of blocking activity of the layer 6 pyramids on the activity of layer 3 and 4 neurons [9]. They found that end-inhibition was considerably reduced when layer 6 was blocked. Their conclusion was that the basic function of the layer 6 pyramidal cells was to provide a recurrent inhibition to layer 4.

Our new observations were obtained from single neurons that had been filled with horseradish peroxidase by intracellular recordings and injections in vivo. We began by filling various afferents of layer 4 and studying them at the light microscopic level. The neurons that have axonal arbors in layer 4 include the relay cells of the thalamus, the layer 6 pyramidal cells, and the spiny stellate cells themselves, which are only found in layer 4. The axons of each of these types distribute in a characteristic way. The thalamic afferents form dense clumps of terminals, about 0.5 mm in diameter [21,36]. The layer 6 pyramidal cells have a rich innervation of the region of layer 4 radially above the soma of the pyramidal cells, i.e. around the apical dendrite of the layer 6 pyramid as it passes through layer 4, and a collateral innervation of adjacent areas [23,51]. Viewed tangentially, the layer 6 pyramidal axonal arbor is less clumped and more diffuse than that of the thalamic afferents.
The spiny stellates have a rich innervation of the area within and above their dendritic tree as well as laterally-directed branches that form clusters in layer 4 and layer 3 [23,51]. These clusters are of similar dimensions to those of the thalamic afferents and also form clumps spaced about 1 mm apart. Thus the basic picture of a columnar innervation is in accordance with the columnar principle that nearest neighbours connect. But to what do they connect?

The details of the circuit were discovered through a detailed ultrastructural analysis of the pre- and postsynaptic elements. Before our study, it was not known whether the spiny stellate cells in layer 4 of cat cortex form synapses with all the possible presynaptic elements in layer 4. Peters & Feldman [56] and Peters [54] proposed a 'rule' that geniculate afferents contacted dendrites in layer 4 according to the statistical probability of occurrence of pre- and postsynaptic elements. Braitenberg and Schüz [11] generalized Peters' 'rule' for all pre- and postsynaptic elements. We adopted their generalization and hypothesized that all types of boutons in layer 4 would form synapses with spiny stellate cells. In order to demonstrate that this polysynaptic innervation did occur, we defined the ultrastructural signature of the presynaptic elements based on the type of synapse, its location (spine or dendritic shaft), and the size of the presynaptic bouton [4]. After detailed comparison with the synapses formed on the spiny stellate cells, we were able to show that the dendrites of the spiny stellate neurons are polysynaptically innervated by all the presynaptic elements we identified in layer 4. The spiny stellates form most of their asymmetric (excitatory) synapses with the layer 6 pyramidal neurons (45%) and other spiny stellate neurons (30%) and only about 6% with the thalamic afferents [2]. The remainder of the synapses could not be identified with certainty, but other minor sources like the claustrum could be involved.
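The proportions quoted above leave a sizeable unidentified fraction, which is easy to overlook. The short Python sketch below makes the arithmetic explicit. The three percentages come from the text; applying them to a total of 5,000 or 10,000 excitatory synapses per neuron (the range quoted later in the chapter) is an illustrative assumption, and the names `SOURCES` and `synapse_counts` are hypothetical.

```python
# Fractions of asymmetric (excitatory) synapses on layer 4 spiny stellates,
# as quoted in the text from [2].
SOURCES = {
    "layer 6 pyramidal": 0.45,
    "spiny stellate": 0.30,
    "thalamic afferent": 0.06,
}

# Whatever is left over could not be attributed to a source with certainty.
UNIDENTIFIED = 1.0 - sum(SOURCES.values())


def synapse_counts(total):
    """Split a hypothetical total number of excitatory synapses by source."""
    counts = {name: round(frac * total) for name, frac in SOURCES.items()}
    counts["unidentified"] = round(UNIDENTIFIED * total)
    return counts


if __name__ == "__main__":
    print(f"unidentified fraction: {UNIDENTIFIED:.2f}")
    # Assumed totals, taken from the 5,000 to 10,000 range given in the chapter.
    for total in (5000, 10000):
        print(total, synapse_counts(total))
```

On these assumptions the thalamic afferents would account for only a few hundred of the several thousand excitatory synapses on a spiny stellate cell, which is the quantitative core of the argument for intracortical amplification.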
The small basket cells of layer 4 appeared to provide the majority (90%) of the symmetric (inhibitory) synapses
[Figure 1: circuit diagram. Legible labels: L4 smooth, stellates, thalamus, 6%.]
Figure 1. Schematic of some of the elements of the layer 4 circuits of neocortex. The inhibitory neurons (small basket cells) are indicated in shaded profiles, the excitatory neurons in open profiles. The percentages refer to the proportion of synapses formed between the various elements and the spiny stellate neurons. Inhibitory and excitatory percentages calculated separately.
[2]. These connections are summarized in Fig. 1.

8. MICROCIRCUITS OF LAYER 4

A microcircuit could now be assembled from these elements and their interconnections. This circuit is similar in concept to the 'elementary unit' of Lorente de No [42], which contained all the necessary elements in cortex for transmitting impulses from the afferent fibre to the efferent axon. Although only a subcircuit is being considered here, the same principle applies: we have identified the chain of neurons that are required to excite and inhibit the spiny stellate cells within a single column. From our electrophysiological work in the cat we know that the thalamic afferents monosynaptically excite the spiny stellate cells, the layer 6 pyramidal cells and the small layer 4 basket cells [50,51]. Our anatomical work showed that the layer 6 pyramids and the spiny stellates form excitatory synapses
with other spiny stellate cells, with layer 6 pyramidal cells, and with small basket cells. Translated into function, this elementary subcircuit has all the components required for significant recurrent excitation and inhibition.

This intrinsic organization of the column is intriguing. It suggests that the majority of excitatory and inhibitory synapses on any neuron in the column are provided by neurons whose functional properties are very similar to those of the target neuron, i.e. they will be excited from the same part of visual space, have similar orientation selectivity and similar ocular dominance. Each neuron receives between 5,000 and 10,000 excitatory synapses [4]. On average, each excitatory synapse is provided by a different neuron [49,2]. For the inhibitory synapses, more than one, but no more than 10 on average, are provided by the same neuron [39]. Thus, if the majority of the synapses formed with any one neuron are provided by neurons in the same column, a stimulus that activates any one neuron in a column must activate most of the neurons, both inhibitory and excitatory, in that column.

This prediction runs counter to general notions of how inhibition must work. It seems to make no sense to connect together inhibitory and excitatory neurons that are activated by the same stimulus, since they will be competing against each other. Nonetheless, neither the anatomy nor the physiology provides an easy alternative conclusion. For example, the columnar model indicates that neurons of similar specificity are grouped together, regardless of their morphological type. We cannot suppose that the clutch cells are differentially active, at least with the set of stimulus parameters that optimally activate a given column. In confirmation, intracellular marking experiments [23,50] show that the receptive field properties of the smooth inhibitory neurons are indistinguishable from those of their neighbouring spiny neurons.
The light and electron microscopic analyses of these same smooth neurons show that they have a dense meshwork of local axonal collaterals and that they form synapses with somata and proximal dendrites of neighbouring spiny and smooth neurons [39]. Similar considerations apply to the spiny neurons; they too excite their neighbours through local collateral arborizations. Thus, both the inhibitory and the excitatory neurons would be activated by similar stimuli.

A review of the receptive field structure of the layer 4 neurons in the cat's visual cortex, however, indicates that such a relationship between inhibition and excitation must exist. The vast majority of these neurons have 'simple' receptive fields, and a cardinal characteristic of simple receptive fields is that they have separate subfields that are activated by stationary flashed ON or OFF stimuli of the same orientation [30]. As with centre-surround receptive fields of the retina and thalamus, the ON and OFF subfields of simple cells are mutually inhibitory. For this structure to exist, the inhibitory and excitatory elements must have the same selectivity of orientation and eye preference, i.e. they have to lie within the same column.

9. ANALYTICS OF RECURRENT CIRCUITS

The dynamics of such a cortical network and the usefulness of its functional organization are poorly understood. Nevertheless, it seems important to explore the basic principles of organization and function of this portion of the columnar circuit, particularly because the same organization seems to be replicated in other layers of cortex. Our approach has been to take the detailed biological results described above and use them to construct a
simple circuit that could be used to analyse two fundamental issues that arise from the view of the column presented by the biology. Firstly, how can a population of recurrently connected excitatory neurons be prevented from going into a catastrophic positive feedback instability? Secondly, how are the inhibitory neurons employed within this recurrent circuit?

These two aspects are drawn together in the simplified circuit illustrated in Fig. 2a. The circuit consists of a population of identical spiny stellate neurons that receive the same thalamic input and are connected to each other with the same synaptic strength. They also excite a population of inhibitory interneurons that in turn inhibit the spiny stellate neurons. Since the outputs of all the spiny stellates are identical, the network can be reduced to the circuit shown (Fig. 2a). In this circuit, the thalamic synapses provide the input current $I_{in}$. The spiny stellates form synapses with each other and provide a recurrent current $I_{rec}$. Both $I_{in}$ and $I_{rec}$ are inward currents that depolarize the neuron. An inhibitory current, $I_{inh}$, is an outward current, which is provided by the small basket cell synapses.

Every cortical neuron has a number of active and passive conductances, some of which are intrinsic properties of the neuron itself, others of which are extrinsic and arise from the inhibitory and excitatory synapses. These conductances and the synaptic currents operating within the simplified circuit shown in Fig. 2a can be expressed in a simple equivalent electric circuit (Fig. 2b). In this abstraction, the intrinsic conductances of the neuron are collected together in a single input conductance, $G$. The action potential discharge of the neurons is proportional to the net excitatory current delivered at the soma/axon hillock. The rate of firing ($F$) is effectively equal to the voltage $I_g/G$.
Conservation of current requires that the current entering the neuron is exactly equalled by the current leaving the neuron via the passive membrane conductances and, more importantly, the spike conductances, which are an order of magnitude larger than the passive membrane conductances. Thus the spikes themselves are a significant current sink in the circuit. The outward currents, which act to decrease excitation within the circuit, include the voltage- and ligand-gated potassium conductances, the spike conductances, and the GABA-mediated synaptic conductances, which give rise to the inhibitory current ($I_{inh}$).

In the circuit, the synaptic inward (excitatory) current to a given neuron is the sum of the thalamic input current ($I_{in}$) and the recurrent excitatory current ($I_{rec}$). Each neuron contributes a fraction of the recurrent current that each other spiny stellate receives (Fig. 2b). Because the spiny stellates are all identical, the feedback current to each neuron is proportional to its own output, $I_{rec} = \alpha F$. $\alpha$ is an effective conductance, which we call a network conductance. Since the excitatory feedback is a positive feedback, the excitatory network conductance is negative. Conversely, the inhibitory neurons provide an outward feedback current, $I_{inh} = \beta F$, which contributes a positive network conductance $\beta$. The effective conductance of the neuron is $G_{eff} = G + \beta - \alpha$. Since the output of all the neurons is proportional to the thalamic synaptic current ($I_{in}$), the discharge rate ($F$) of any neuron is given by $F = I_{in}/G_{eff}$. Therefore as $G_{eff}$ decreases, the rate of firing of the neurons increases, and as $G_{eff}$ increases, the firing rate decreases. The recurrent circuit remains stable and gives an output that is proportional to $I_{in}$, provided that the excitatory network conductance ($\alpha$) does not exceed the sum $G + \beta$, i.e. the feedback loop gain does not exceed one. Under these conditions the output of the network will
relax to zero if there is no thalamic excitatory current. The thalamic synaptic current is amplified by the recurrent excitatory network with a gain that is expressed as $(I_{in} + I_{rec})/I_{in}$, which can alternatively be expressed as $(G + \beta)/G_{eff}$. This gain can be much greater than one, and as the value of the excitatory network conductance ($\alpha$) approaches $G + \beta$, the output is largely due to the current delivered by the spiny stellate network of excitatory synapses ($I_{rec}$) rather than the thalamic synapses.

Thus, this circuit encapsulates and solves analytically the issue raised 45 years ago by Lorente de No: how does the activity of the cortical column influence the impulses entering the column from afferent systems like the thalamus? In this modern formulation, the output of the spiny stellates is always proportional to the thalamic excitation, but the magnitude of the effect of the thalamic synapses on the spiny stellates, and hence on the columnar circuit, depends on the gain of the cortical network at that point in time, i.e. the factor by which the thalamic input is amplified. This gain factor is affected by the activity existing in the network. The gain is highest when all the neurons in the column are above threshold and is zero when all the neurons are below threshold.

Many pre- and postsynaptic factors determine the state of activity of the network. Presynaptically, amongst the many different factors that need to be considered is the issue of synaptic efficacy. With repeated stimulation a synapse may potentiate or depress. This process also depends on the rate at which the synapse is stimulated by action potentials. Postsynaptically, issues of receptor saturation, the concentration of ions in small compartments like the spine, and the processes of adaptation will all have an effect on the gain of the circuit. The adaptive processes may be especially significant.
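The steady-state relations of the reduced circuit (effective conductance G + beta - alpha, firing rate equal to the thalamic current divided by that conductance, and a current gain of (G + beta) divided by the effective conductance) can be checked numerically. The following is a minimal Python sketch, not the authors' implementation. The function names, the parameter values, and in particular the first-order relaxation used in `relax` are assumptions introduced here for illustration; the chapter itself gives only the static analysis.

```python
def effective_conductance(G, alpha, beta):
    """G_eff = G + beta - alpha: inhibition adds, recurrent excitation subtracts."""
    return G + beta - alpha


def steady_state_rate(I_in, G, alpha, beta):
    """Steady-state output F = I_in / G_eff, defined only for a stable circuit."""
    G_eff = effective_conductance(G, alpha, beta)
    if G_eff <= 0:
        raise ValueError("unstable: alpha must stay below G + beta")
    return I_in / G_eff


def current_gain(G, alpha, beta):
    """(I_in + I_rec) / I_in, which equals (G + beta) / G_eff."""
    G_eff = effective_conductance(G, alpha, beta)
    if G_eff <= 0:
        raise ValueError("unstable: alpha must stay below G + beta")
    return (G + beta) / G_eff


def relax(I_in, G, alpha, beta, steps=5000, dt=0.01):
    """Assumed first-order dynamics: dF/dt = (I_in + alpha*F - beta*F)/G - F."""
    F = 0.0
    for _ in range(steps):
        net_current = I_in + alpha * F - beta * F  # inward minus outward current
        F += dt * (net_current / G - F)
    return F


if __name__ == "__main__":
    # Illustrative values only (arbitrary units); alpha < G + beta, so stable.
    I_in, G, alpha, beta = 1.0, 1.0, 1.5, 1.0
    F = steady_state_rate(I_in, G, alpha, beta)
    print("G_eff        :", effective_conductance(G, alpha, beta))
    print("F (analytic) :", F)
    print("F (relaxed)  :", relax(I_in, G, alpha, beta))
    print("gain         :", current_gain(G, alpha, beta))
    print("(I_in+I_rec)/I_in:", (I_in + alpha * F) / I_in)
```

With these assumed values the two expressions for the gain agree, and letting alpha approach G + beta makes the gain grow without bound, which is the instability the inhibitory term is there to prevent.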
The probability that a given spiny stellate cell will produce an action potential will depend on when it last produced an action potential. The action potential discharge of the spiny stellate neurons adapts rapidly, and this adaptation is due largely to a calcium-dependent potassium current that has a time constant of about 20 ms. Thus, the production of just one action potential by a spiny stellate will affect its response to the next volley from the thalamic synapses. The number of active synapses on the spiny stellate will also have a considerable effect on the input conductance of the neuron [5]. These numerous factors are changing dynamically and their cumulative effects need to be assessed through more detailed models than that presented here.

10. WHAT RECURRENCE IS GOOD FOR: A NEW ORIENTATION

The possible role of the columnar recurrent circuit can now be considered in the context of the attribute of orientation selectivity in the visual cortex. The property of orientation selectivity has been studied in much detail at the level of the receptive fields of single cells and in the geometric arrangement of columns described above. Orientation selectivity of single neurons is quite robust in the face of changes in spatial and temporal frequency and in stimulus contrast. Bars and gratings give similar tuning curves, and the orientation selectivity of binocular neurons is the same when tested through either eye. It is clear that these properties do not reside in the physiological properties of the thalamic afferents; nevertheless, the geometry of the thalamic afferent synapses is thought to be a necessary condition in setting up orientation selectivity [30,6]. The issue of what the role of the intracortical circuitry is in this system is rather contentious. One view is that of Hubel and Wiesel [30], which is that the intracortical circuitry does not
[Figure 2, panels a and b: circuit diagrams. Legible labels: $I_{in}$, $I_{inh}$; current gain = $G/G_{eff}$; $G_{eff} = G + \beta - \alpha$.]
Figure 2. Reduced spiny stellate microcircuit for layer 4. a. Spiny stellate neurons form synapses with thalamic afferents and other spiny stellates and with inhibitory basket cells (shown in black). The excitatory synapses provide an inward current, $I_{in}$ from thalamic afferents and $I_{rec}$ from recurrent spiny neurons. The basket cells provide an outward current $I_{inh}$. $I_g$ is the current flowing across the total conductance of the spiny stellate and the output is given by the frequency of discharge $F$. b. Equivalent electrical circuit. The spiny stellate net conductance is $G$. Currents as in a. $\alpha$ is the network conductance of the excitatory portion of the circuit and $\beta$ is the network conductance of the inhibitory portion of the circuit.
contribute to the basic receptive field structure of simple and complex cells. A contrary view is that the thalamic afferents provide a non-oriented or weakly oriented excitation that is shaped by inhibitory neurons in the cortex [62,63,25,6]. In this view the inhibitory neurons provide a powerful 'cross-orientation' inhibition that is the critical functional component producing orientation selectivity. We have previously reviewed the evidence for both mechanisms and will not repeat that review here; our conclusion was that neither of these extremes gives a coherent account of the cortical mechanisms of orientation selectivity [49]. Instead, we start from an acceptance of the existence of recurrently connected columnar circuits and attempt to understand the manner in which these circuits might interact with the thalamic input to produce orientation selectivity in columns in cat visual cortex.

Our tool for exploring these interactions is a simple model of layer 4. In this model (Fig. 3), 40 spiny stellate cells were connected together in a ring: these could be considered to be components of an orientation 'pin-wheel'. All the spiny stellate cells received monosynaptic excitation from a group of thalamic afferents. The receptive fields of the group of thalamic afferents forming synapses with any single spiny stellate neuron were roughly arranged along an axis in visual space (Fig. 3b). The preferred axis of each array of thalamic neurons shifted in an orderly fashion so that the full 180 degrees of the orientation domain was spread across the 40 neurons. The intracortical connections of the spiny stellates were arranged so that nearest neighbours had the strongest connections with each other and more distant neurons were weakly interconnected. These connections were distributed according to a simple Gaussian function (Fig. 3a). The spiny stellates were recurrently connected to a pool of inhibitory neurons, i.e.
they provided a convergent excitatory input to the inhibitory neuron pool, which provided a divergent and equal strength inhibitory connection to all the spiny stellates. For simplicity this pool was considered as a single neuron (Fig. 3b, grey neuron). This provided for an interesting analysis of the role of intracortical inhibition in orientation specificity. Since we were not studying the dynamics of the circuit, we did not provide for a feedforward inhibitory pathway driven by the thalamic afferents.

The orientation tuning of the population of spiny stellate neurons was tested under various conditions of connectivity. The 'recordings' are the results that would be obtained if the net activity of the whole ring of spiny stellates could be seen simultaneously as they were being stimulated with one orientation. This recording is in effect a one-dimensional optical recording of the voltage of the array of 40 neurons. In the first condition, the spiny stellate ring was connected only to the geniculate afferents and the afferents were stimulated with a weak stimulus at one orientation. The resultant activity profile showed that the orientation tuning of the array was very broad and that the signal-to-noise ratio was poor (Fig. 3c, dotted line). This is what would be expected from the 'jitter' in the thalamic afferent connectivity. A very different profile was obtained when the intracortical circuitry was engaged (Fig. 3c, solid curve). Here the same weak, noisy stimulus gave a well-tuned and robust response. The explanation of this result derives directly from the analysis of the recurrent circuitry of the column (Fig. 2).

The process is as follows: the oriented stimulus activates all the thalamic afferents. Those converging on the cells with a receptive field biased along the principal axis of the stimulus will be slightly more excited than those tuned to other orientations. The
neurons reaching threshold will produce action potentials and excite their neighbouring spiny stellate cells, which in turn will excite the inhibitory neuron pool. The inhibitory neuron pool, because it connects to all the spiny stellates, will apply the same inhibition to all neurons (Fig. 3c; inhibition 'strength' at t = 0 and t = ∞). Weakly driven spiny stellates will be completely inhibited, but more strongly activated spiny stellates will continue to fire and provide positive feedback to their neighbours. Neurons that are non-optimally activated will become more inhibited and fall silent, while the positive excitatory feedback between the optimally activated neurons will amplify the weak and noisy thalamic afferent signal. The result is a relatively noise-free and robust signal.

The mechanism of action of the inhibitory neuron in this process is very interesting. It acts in at least two modes, depending on the state of the network. Initially, it acts as a thresholding device to extract the best estimate of the noisy input signal. As the network converges to the optimal solution, the inhibitory neuron pool will be strongly activated and will therefore be orientation tuned. In the final state, the inhibition is proportional to the degree of excitation of the active population of spiny stellates. This proportional inhibition stabilizes the co-operative excitation established within the ring.

The neurons in the model circuit act co-operatively [27,64,18] to vote on their best decision as to the orientation of the stimulus. Although this co-operative action is in some senses a democratic one, it is not the democracy of the ballot box, where each neuron makes its own independent decision before adding its individual secret vote to the box. Instead the voting is done on the town hall model, where a show of hands decides the issue. Here each member is subject to the influence of its fellows' votes.
A member (in this case a neuron) intending to vote differently from its immediate neighbours will be influenced by them to change its vote to agree with theirs. This peer pressure is not the only factor. Unlike the town hall, in this cortical model there is active suppression of members whose local support is small. Nevertheless, as in most democracies, the winners take all.
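The ring model described in this section can be sketched in a few lines of Python with NumPy. This is an illustrative rate model written under stated assumptions, not the authors' simulation: the threshold-linear units, the first-order update, the normalisation of the Gaussian weights (which here include a self term for simplicity), the cosine-shaped thalamic bias, and every parameter value (`sigma`, `w_exc`, `w_inh`, `mod`, `noise_sd`) are choices made here. Only the overall architecture (40 cells on a ring, Gaussian nearest-neighbour excitation, one inhibitory pool applying equal inhibition to all cells, weakly tuned and noisy thalamic drive) is taken from the text.

```python
import numpy as np


def ring_weights(n_cells=40, sigma=3.0, w_exc=0.95):
    """Gaussian excitatory weights on a ring; each row sums to w_exc (assumed)."""
    idx = np.arange(n_cells)
    d = np.abs(idx[:, None] - idx[None, :])
    d = np.minimum(d, n_cells - d)  # circular distance in cell numbers
    w = np.exp(-d.astype(float) ** 2 / (2.0 * sigma ** 2))
    return w_exc * w / w.sum(axis=1, keepdims=True)


def thalamic_input(n_cells=40, stim_deg=90.0, base=1.0, mod=0.2,
                   noise_sd=0.0, seed=0):
    """Weakly orientation-biased drive, optionally with additive noise."""
    pref = np.arange(n_cells) * 180.0 / n_cells  # preferred orientations
    h = base * (1.0 + mod * np.cos(2.0 * np.deg2rad(pref - stim_deg)))
    if noise_sd > 0.0:
        rng = np.random.default_rng(seed)
        h = h + noise_sd * rng.standard_normal(n_cells)
    return h


def simulate(h, w=None, w_inh=1.5, steps=2000, dt=0.1):
    """Relax threshold-linear rates; w=None disables the intracortical circuit."""
    rate = np.zeros(h.size)
    for _ in range(steps):
        if w is None:
            drive = h  # feedforward only
        else:
            inhibition = w_inh * rate.mean()  # one pool, equal to all cells
            drive = h + w @ rate - inhibition
        rate += dt * (-rate + np.maximum(drive, 0.0))
    return rate


if __name__ == "__main__":
    h = thalamic_input(noise_sd=0.05)
    feedforward = simulate(h)
    recurrent = simulate(h, ring_weights())
    print("active cells, thalamic input only:", int((feedforward > 1e-6).sum()))
    print("active cells, circuit engaged    :", int((recurrent > 1e-6).sum()))
    print("most active cell, circuit engaged:", int(np.argmax(recurrent)))
```

With the recurrent weights kept below one, as assumed here, the update is a contraction and settles to a single activity profile. The feedforward run leaves every cell active, while the run with the circuit engaged silences the weakly driven cells, which is the qualitative behaviour described for Fig. 3c.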
11. COLLECTIVE MEMORY AND MODELS

It is important to note that the connectivity of the model circuit predisposes it to behave in the selective way described. It acts as a correlation detector for a predetermined set of patterns, amplifies the correlated signal and suppresses the noisy uncorrelated signal. Thus, even before the weight of the synapses is considered, the 'weight' of the specific connections is having a powerful influence on the result. This embedding of an expectation of the nature of the stimulus in the hardware of the neocortex is not too long a march from Craik's view that the brain constructs a working model of reality [14]. Thus, the principle of organization and function of the cortical columnar systems outlined here could apply equally well to most of the other processes we know about in the cortex, whether they be sensory or motor, hardwired or plastic. The same architecture could be used to generate coherent action: to take the noisy and ambiguous individual signals arising from the sense organs, shape them into some coherent form according to previous experience, and generate an appropriate response.
[Figure 3, panels a-c. Legible axis and panel labels: proximity; current; cell number; 45 deg; inhibition at t = 0 and t = ∞.]
Figure 3. Reduced model of orientation map. a. Distribution of excitatory connections of a given spiny stellate neuron. b. 'Ring' of 40 spiny stellate cells interconnected according to the distribution given in a. The shaded symbol in the centre is an inhibitory neuron to which all spiny stellates are recurrently connected. Boxes indicate the topographical distribution of receptive fields of thalamic afferents connecting to the spiny stellates indicated. All spiny stellates were 'stimulated' with the bar indicated by the shaded rectangle. c. Activity profile of 40 spiny stellate neurons when connected only to thalamic afferents (dotted line) and with spiny stellate interconnections engaged (solid line). Magnitude of inhibition shown by horizontal lines at time t = 0 and in steady state t = ∞.
REFERENCES

1. E.D. Adrian. Afferent discharges to the cerebral cortex from peripheral sense organs. J. Physiol. (London), 100:159-191, 1941.
2. B. Ahmed, J.C. Anderson, R.J. Douglas, K.A.C. Martin, and C. Nelson. Polyneuronal innervation of spiny stellate neurons in cat visual cortex. J. Comp. Neurol., 341:39-49, 1994.
3. Y. Amir, M. Harel, and A. Grinvald. Cortical hierarchy reflected in the organization of intrinsic connections in macaque monkey visual cortex. J. Comp. Neurol., 334:19-46, 1993.
4. J.C. Anderson, R.J. Douglas, K.A.C. Martin, C. Nelson, and D. Whitteridge. Synaptic output of physiologically identified spiny neurons in cat visual cortex. J. Comp. Neurol., 341:16-24, 1994.
5. O. Bernander, R.J. Douglas, K.A.C. Martin, and C. Koch. Synaptic background activity influences spatiotemporal integration in single pyramidal cells. Proc. Natl. Acad. Sci. USA, 88:11569-11573, 1991.
6. P.O. Bishop, J.S. Coombs, and G.H. Henry. Receptive fields of simple cells in the cat striate cortex. J. Physiol. (London), 231:31-60, 1973.
7. G.G. Blasdel, J.S. Lund, and D. Fitzpatrick. Intrinsic connections of macaque striate cortex: axonal projections of neurons outside lamina 4c. J. Neurosci., 5:3350-3369, 1985.
8. G.G. Blasdel and G. Salama. Voltage sensitive dyes reveal a modular organization in the monkey striate cortex. Nature, 218:438-441, 1986.
9. J. Bolz and C.D. Gilbert. Generation of end-inhibition in the visual cortex via interlaminar connections. Nature, 320:362-365, 1986.
10. T. Bonhoeffer and A. Grinvald. The layout of iso-orientation domains in area 18 of cat visual cortex: optical imaging reveals a pinwheel-like organization. J. Neurosci., 13:4157-4180, 1993.
11. V. Braitenberg and A. Schüz. Anatomy of the Cortex. Springer-Verlag, Berlin, Germany, 1991.
12. T.G. Brown and C.S. Sherrington. Observations on the localization in the motor cortex of the baboon (Papio anubis). J. Physiol. (London), 43:209-218, 1911.
13. A. Cowey. Projection of the retina on to striate and prestriate cortex in the squirrel monkey (Saimiri sciureus). J. Neurophysiol., 27:366-396, 1964.
14. K.J.W. Craik. The Nature of Explanation. Cambridge University Press, Cambridge, UK, 1943.
15. P.M. Daniel and D. Whitteridge. The representation of the visual field on the cerebral cortex in monkeys. J. Physiol. (London), 159:203-221, 1961.
16. J. DeFelipe and E.G. Jones. Cajal on the Cerebral Cortex. An annotated translation of the complete writings. Oxford University Press, New York, NY, 1988.
17. C. Dehay, G. Horsburgh, M. Berland, H. Killackey, and H. Kennedy. Maturation and connectivity of the visual cortex in the monkey is altered by removal of retinal input. Nature, 337:265-267, 1989.
18. R.J. Douglas, M.A. Mahowald, and K.A.C. Martin. Hybrid analog-digital architectures for neuromorphic systems. In IEEE International Conference on Neural Networks, pages 1848-1853, Orlando, 1994.
19. R.A. Fisken, L.J. Garey, and T.P.S. Powell. The intrinsic and commissural connections of area 17 of the visual cortex. Proc. Roy. Soc. Lond. B, 272:487-536, 1975.
20. D. Fitzpatrick, J.S. Lund, and G.G. Blasdel. Intrinsic connections of macaque striate cortex: afferent and efferent connections of lamina 4c. J. Neurosci., 5:3329-3349, 1985.
21. T.F. Freund, K.A.C. Martin, P. Somogyi, and D. Whitteridge. Innervation of cat visual areas 17 and 18 by physiologically identified X- and Y-type thalamic afferents. II. Identification of postsynaptic targets by GABA immunocytochemistry and Golgi impregnation. J. Comp. Neurol., 291:242-275, 1985.
22. K.C. Gatter and T.P.S. Powell. The intrinsic connections of the cortex of Area 4 of the monkey. Brain, 101:513-541, 1978.
23. C.D. Gilbert and T.N. Wiesel. Morphology and intracortical projections of functionally characterised neurons in the cat visual cortex. Nature, 280:120-125, 1979.
24. A. Grinvald, E. Lieke, R.P. Frostig, C. Gilbert, and T.N. Wiesel. Functional architecture of cortex revealed by optical imaging of signals. Nature, 324:361-364, 1986.
25. P. Heggelund. Receptive field organization of simple cells in cat striate cortex. Exp. Brain Res., 42:89-98, 1981.
26. S.H.C. Hendry, E.G. Jones, H.D. Schwark, and J. Yan. Numbers and proportions of GABA immunoreactive neurons in different areas of monkey cerebral cortex. J. Neurosci., 7:1503-1519, 1987.
27. J.J. Hopfield. Neural networks and physical systems with emergent collective computational abilities. Proc. Natl. Acad. Sci. USA, 79:2554-2558, 1982.
28. J. Horton and D.H. Hubel. Regular patchy distribution of cytochrome oxidase staining in primary visual cortex of macaque monkey. Nature, 292:762-764, 1980.
29. C.R. Houser, S.H.C. Hendry, E.G. Jones, and J.E. Vaughn. Morphological density of immunocytochemically identified GABA neurons in monkey sensory motor cortex. J. Neurocytol., 12:617-638, 1983.
30. D.H. Hubel and T.N. Wiesel. Receptive fields, binocular interaction, and functional architecture in the cat's visual cortex. J. Physiol. (London), 160:106-154, 1962.
31. D.H. Hubel and T.N. Wiesel. Receptive fields and functional architecture in two non-striate visual areas (18, 19) of the cat. J. Neurophysiol., 28:229-289, 1965.
32. D.H. Hubel and T.N. Wiesel. Receptive fields and functional architecture of monkey striate cortex. J. Physiol. (London), 195:215-243, 1968.
33. D.H. Hubel and T.N. Wiesel. Uniformity of monkey striate cortex: A parallel relationship between field size, scatter, and magnification factor. J. Comp. Neurol., 158:295-306, 1974.
34. D.H. Hubel and T.N. Wiesel. The functional architecture of the macaque visual cortex, the Ferrier lecture. Proc. Roy. Soc. Lond. B, 198:1-59, 1977.
35. D.H. Hubel, T.N. Wiesel, and M.P. Stryker. Anatomical demonstration of orientation columns in macaque monkey. J. Comp. Neurol., 177:361-380, 1978.
36. A.L. Humphrey, M. Sur, D.J. Ulrich, and S.M. Sherman. Projection pattern of individual X- and Y-cell axons from the lateral geniculate nucleus to cortical area 17 in the cat. J. Comp. Neurol., 233:159-189, 1985.
37. H. Kennedy and C. Dehay. Cortical specification of mice and men. Cerebral Cortex, 3:171-186, 1993.
38. H. Kennedy, C. Dehay, and G. Horsburgh. Striate cortex periodicity. Nature, 348:494, 1990.
39. Z.F. Kisvarday, K.A.C. Martin, D. Whitteridge, and P. Somogyi. Synaptic connections of intracellularly filled clutch neurons, a type of small basket neuron in the visual cortex of the cat. J. Comp. Neurol., 241:111-137, 1985.
40. E.A. Lachica, P.B. Beck, and V.A. Casagrande. Parallel pathways in macaque monkey striate cortex: anatomically defined columns in layer III. Proc. Natl. Acad. Sci. USA, 89:3566-3570, 1992.
41. M.S. Livingstone and D.H. Hubel. Anatomy and physiology of a colour system in the primate visual cortex. J. Neurosci., 4:309-356, 1984.
42. R. Lorente de No. Cerebral cortex: architecture, intracortical connections, motor projections. In J.F. Fulton, editor, Physiology of the Nervous System, chapter 15, pages 288-315. Oxford University Press, New York, NY, 1949.
43. J.S. Lund, G.H. Henry, C.L. Macqueen, and A.R. Harvey. Anatomical organization of the primary visual cortex (Area 17) of the cat. A comparison with Area 17 of the macaque monkey. J. Comp. Neurol., 184:599-618, 1979.
44. J.S. Lund, R.D. Lund, A.E. Bunt Hendrickson, and A. Fuchs. The origin of efferent pathways from the primary visual cortex, area 17, of the macaque monkey. J. Comp. Neurol., 164:265-285, 1975.
45. J.S. Lund, T. Yoshioka, and J.B. Levitt. Comparison of intrinsic connectivity in different areas of macaque monkey cerebral cortex. Cerebral Cortex, 3:148-162, 1993.
46. R. Malach. Dendritic sampling across processing streams in monkey striate cortex. J. Comp. Neurol., 315:303-312, 1992.
47. R. Malach. Cortical columns as devices for maximising neuronal diversity. TINS, 17:101-104, 1994.
48. K.A.C. Martin. Neuronal circuits in cat striate cortex. In E.G. Jones and A. Peters, editors, Cerebral Cortex, Vol. 2, Functional Properties of Cortical Cells, volume 2, pages 241-284.
Plenum Press, New York, 1984. 49. K.A.C. Martin. The Wellcome Prize Lecture, from single cells to simple circuits in the cerebral cortex. Q. J. Exptl Physiol, 73:637-702, 1988. 50. K.A.C. Martin, Somogyi P., and D. Whitteridge. Physiological and morphological properties of identified basket cells in the cat's visual cortex. Exp. Brain Res., 50:193200, 1983. 51. K.A.C. Martin and D. Whitteridge. Form, function and intracortical projection of spiny neurones in the striate visual cortex of the cat. J. Physiol. (London), 353:463504, 1984. 52. B.A. McGuire, J.-P. Hornung, C. Gilbert, and T.N. Wiesel. Patterns of synaptic input to layer 4 of cat striate cortex. J. Neurosci., 4:3021-3033, 1984. 53. V.B. Mount castle. Modality and topographic properties of single neurons of the cat's somatic sensory cortex. J. Neurophysiol, 20:408-434, 1957. 54. A. Peters. Thalamic input to the cerebral cortex. TINS, 2:183-185, 1979. 55. A. Peters. Number of neurons and synapses in primary visual cortex. In E.G. Jones and A. Peters, editors, Cerebral Cortex 6: Further aspects of cortical function including
95 hippocampus, volume 6, pages 267-294. Plenum Press, New York, NY, 1987. 56. A. Peters and Feldman M.L. The projection of the lateral geniculate nucleus to area 17 of the rat cerebral cortex. I. general description. J. NeurocytoL, 5:63-84, 1976. 57. C.G. Phillips. Actions of antidromic pyramidal volleys on single betz cells in the cat. Q. J. Exptl Physiol, 44:1-25, 1959. 58. T.P.S. Powell. Certain aspects of the intrinsic organisation of the cerebral cortex. In 0 . Pompeiano and C. Ajmone Marsan, editors. Brain mechanism and perceptual awareness, pages 1-19. Raven Press, New York, NY, 1981. 59. P. Rakic. Specification of cerebral cortical areas. Science, 241:170-176, 1988. 60. P. Rakic, I. Suner, and R.W. Williams. A novel cytoarchitectonic area induced experimentally within the primate visual cortex. Proc. Natl. Acad. Sci. USA, 88:20083-2087, 1991. 61. A.J. Rockel, R.W. Hiorns, and T.P.S. Powell. The basic uniformity in structure of the neocortex. Brain, 103:221-244, 1980. 62. A.M. Sillito. The contribution of inhibitory mechanisms to the receptive field properties of neurones in the striate cortex of the cat. J. Physiol. (London), 250:305-329, 1975. 63. A.M. Sillito. Inhibitory processes underlying direction specificity of simple, complex, and hypercomplex cells in cat's striate cortex. J. Physiol. (London), 271:699-720, 1977. 64. H. Sompolinsky, D. Golomb, and D. Kleinfeld. Cooperative dynamics in visual processing. Physics Review A, 43(12):6990-7011, 1991. 65. H. Van der Loos and T.A. Woolsey. Somatosensory cortex: structural alteration following early injury to sense organs. Science, 175:395-398, 1973. 66. D.J. Willshaw and C. von der Malsburg. How patterned neural conenctions can be set up by self organization. Proc. Roy. Soc. Lond. B, 194:431-455, 1976.
Brain Theory - Biological Basis and Computational Principles A. Aertsen and V. Braitenberg (Editors) © 1996 Elsevier Science B.V. All rights reserved.
FUNCTIONAL TOPOGRAPHY OF HORIZONTAL NEURONAL NETWORKS IN CAT VISUAL CORTEX (AREA 18)

Z.F. Kisvarday, T. Bonhoeffer, D.-S. Kim, U.T. Eysel
Abteilung für Neurophysiologie, Medizinische Fakultät, Ruhr-Universität Bochum, Universitätsstrasse 150, 44801 Bochum, Germany

Max-Planck Institute für Psychiatry, Am Klopferspitz 18A, 82152 München-Martinsried, Germany

Max-Planck Institute für Hirnforschung, Deutschordenstrasse 46, 60528 Frankfurt, Germany
* Acknowledgement: The authors thank Ms E. Toth and Mr F. Brinkmann for their excellent technical assistance, and Ms D. Strehler for photography. This work was supported by the Deutsche Forschungsgemeinschaft (Ey8/17-1) and the European Communities (SCI 0329-C). We dedicate this work to Janos Szentagothai who died in September 1994.

1. INTRODUCTION

An intriguing and general feature of the cerebral cortex is that it contains functional units of cell assemblies arranged in a columnar manner. The "columnarity" of the cortex was first observed in frontal and sensorimotor areas (Mountcastle, 1957; Szentagothai, 1965; Szentagothai and Arbib, 1974; Goldman and Nauta, 1977). Nevertheless, it is fair to say that since those early observations the most thoroughly studied regions in this respect have been the visual cortices, in particular the primary and secondary visual areas of cat and monkey. Although anatomical studies revealed that corticocortical connections link distinct neuronal groups, they could not reveal the functional role of these connections. Physiological experiments, on the other hand, showed that visual cortical neurons selective for, e.g., orientation are regularly distributed: within the same cortical column each cell has a similar orientation preference, and neighbouring columns show a gradual shift in preferred orientation (Hubel and Wiesel, 1962, 1963; Albus, 1975), resulting in a probable spatial mosaic of orientation selectivity (Braitenberg and Braitenberg, 1979).

A causal link between the anatomical and physiological observations had been anticipated well before direct evidence was available. In a theoretical study, Mitchison and Crick (1982) suggested that corticocortical connections run between groups of neurons sharing similar physiological attributes; in terms of orientation selectivity, iso-orientation columns should be linked (iso-model). During the ensuing years this concept became so widely favoured that the report of Matsubara et al. (1985) came as a surprise. They decided to test the model proposed by Mitchison and Crick using a combination of physiological mapping and anatomical tracing in the same cortical region, area 18 of the cat. What they found was anything but supportive of the iso-model. Notably, the corticocortical connections they visualized linked sites whose orientation preferences simply did not match. A few years later, Gilbert and Wiesel (1989) revisited the same issue, this time in area 17 of the cat. Their findings, however, rather favoured the iso-model. To date, these two studies have represented the only genuine attempts in which anatomy and physiology were combined to address the central question of whether corticocortical connections prefer to link sites of similar or dissimilar orientations.

Clearly, at best, the above results establish a balance between the two options. It is thus all the more surprising that an almost unanimous consensus exists in the literature favouring the iso-model. One explanation may be that the iso-model better suits the current view of parallel processing of visual information (Stone et al., 1979; Lennie, 1980; Livingstone and Hubel, 1988; for review see Merigan and Maunsell, 1993). Further support for the iso-model stems from studies measuring correlated neuronal activity, in which positive cross-correlation could be detected between laterally displaced cells of similar orientation preference (Michalski et al., 1983; Nelson and Frost, 1985; Ts'o et al., 1986). Without anatomical back-up, however, this technique has serious inherent limitations when it comes to interpreting the results (Aertsen and Gerstein, 1985). An additional contribution to the broad acceptance of the iso-model probably comes from observations in primates on the specificity of lateral connections. Notably, the cytochrome oxidase compartments, which house cells with particular physiological properties, showed a qualitative tendency, although not strict selectivity, to connect with each other (Livingstone and Hubel, 1984).
Given the available evidence for and against, we think that a quantitative analysis is needed to reveal whether lateral connections abide by the iso-model in the visual system. We therefore carried out experiments using a combination of detailed physiological and anatomical approaches in area 18 of the cat. The present chapter attempts to give an account of the results obtained and to highlight some of their implications.
2. CELLULAR COMPONENTS OF THE HORIZONTAL SYSTEM

2.1. Lateral excitatory connections

Although the main neuronal composition of corticocortical connections can be inferred from Golgi studies showing that certain pyramidal cells possess long intracortical axons, full details about the types of neurons and their axonal arborization patterns could not be obtained until modern extra- and intracellular labelling techniques became available. Using bulk injections of retrograde and anterograde tracers, it has been demonstrated in a number of cortical regions and mammalian species that the population of corticocortical connections originating from any given location forms a lattice-like pattern of clusters, often called patches, distributed over several mm laterally around the injection site (Rockland and Lund, 1982; Matsubara et al., 1985, 1987; Gilbert and Wiesel, 1989; Luhmann et al., 1989; Burkhalter et al., 1989, 1990; Yoshioka et al., 1992; Levitt et al., 1994). The neuronal elements constituting these connections could best be studied using the intracellular labelling technique, whereby pyramidal cells in the superficial and deep layers were found to emit horizontal axons and bursts of collaterals at quasi-regular intervals (Gilbert and Wiesel, 1979, 1983; Martin and Whitteridge, 1984).

A direct link between the population and the single-cell results was demonstrated in a three-dimensional reconstruction of ten pyramidal cells in area 17 of the cat (Kisvarday and Eysel, 1992). Interestingly, in this study, the patchy axons of the pyramidal cells made up a network that was itself patchy, covering an area of 6.5 x 3.5 mm elongated in the antero-posterior direction. Bouton counts of the reconstructed neurons revealed that a single pyramidal cell may provide an average of 80 boutons per patch out of its 300-1200 boutons in total. These numbers should be considered lower estimates. Using these and adequate quantitative data from the literature, we estimated that a single pyramidal cell receives less than 0.1% of its total excitatory input from the same patchy axon (Kisvarday and Eysel, 1992). Furthermore, we found that each patchy axon innervates only 1-3% of all neurons in a given patch. These values are very much in line with the probability values obtained in dual intracellular recordings for monosynaptic connections between remote pyramidal cells (Thomson et al., 1988; Mason et al., 1991). They also agree with previous anatomical observations demonstrating one or only a few direct contacts from one pyramidal cell to another, strongly supporting the view that any pyramidal cell is under the influence of a vast number of other cortical neurons (Szentagothai, 1975).

Thorough examination of individually labelled pyramidal cells revealed an additional interesting connectivity feature. When their projection patterns were compared with each other, many of them were found to be reciprocally connected. The reciprocity was not, however, mutual between pairs of individual pyramidal cells, which could obviously lead to abnormal hyperactivity in such a network. Instead, this kind of relationship existed at the population level; that is, a pyramidal cell could receive input from neurons of the very same patch it innervates, but only from those neurons to which it is not itself presynaptic.
Although reciprocity seems to be a fundamental connectivity rule within the patchy excitatory network, there is strong evidence that this is not always the case. Some of the patches labelled from the very same injection site clearly contained either labelled somata or labelled boutons but not both (Boyd and Matsubara, 1991; Kisvarday and Eysel, 1992). Such a pattern could emerge only if the neighbourhood relationship between columns contains some degree of discontinuity (Amir et al., 1993).

Although pyramidal cells with patchy axons represent the chief component of horizontal connections, morphological analyses have unambiguously shown that other excitatory cell types also have to be taken into account. For example, in the cat striate cortex, some spiny stellate neurons of layer IV were shown to establish axonal patches in layer III, while others had patchy axonal arborization within layer IV (Martin and Whitteridge, 1984). In the monkey striate cortex, spiny stellate neurons in layer IVCβ could provide a periodic projection pattern into the superficial layers and to the IVCα and β subdivisions (see Lund et al., 1994). Unfortunately, the amount of projection by layer IV cells contributing to the patchy system of the superficial layers is not known in either species.

2.2. Lateral inhibitory connections

Examination of Golgi preparations and intracellularly labelled cells revealed a broad variety of non-pyramidal local circuit neurons (Cajal, 1899; Szentagothai, 1973; Jones, 1975; Lund et al., 1979, 1988; Peters and Regidor, 1981) which are thought to be GABAergic and inhibitory in function. Fine structural studies showed that, of the many types of GABA neurons, only a few had lateral axons comparable in extent to those of the horizontal pyramidal cell network. These neurons were invariably identified as large basket cells in layers III-V (for review see Kisvarday, 1992).
Commonly they have smooth dendrites and 3-5 thick myelinated axon collaterals running up to 1-2 mm parallel to the cortical surface. A very important characteristic of large basket cells is that they provide lateral input only to certain regions within their reach. Notably, the 3-5 main axon collaterals follow a relatively straight course and en route regularly give off radial segments laden with boutons. This axonal arborization pattern resembles narrow vertical slabs or wedge-shaped fields that can best be appreciated in reconstructions viewed from the cortical surface, as shown in figures 2a and 3. Combined light and electron microscopy proved that large basket cells chiefly target the perisomatic region of pyramidal cells (Somogyi et al., 1983) and non-pyramidal cells (Kisvarday et al., 1993), including other large basket cells, at a ratio of approximately 8-13:1. Recent estimates put the minimum number of boutons per basket cell between 2000 and 3000, implying that each basket cell could directly influence 300-600 other neurons (Martin, 1988; Kisvarday et al., 1993). In this respect large basket cells resemble pyramidal neurons: each member of both groups could contact a similar number of other neurons.

At this point it is relevant to ask whether there are enough large basket cells to play a significant role in lateral networks. Excitatory cells are known to outnumber inhibitory neurons by a ratio of 4:1 (Gabbott and Somogyi, 1986). Although exact measures are still not available on this issue, a recent calculation puts the minimum occurrence of basket cells, including clutch cells and columnar basket cells, at around 17% of all GABAergic neurons (Kisvarday, 1992). From this it follows that the proportion of large basket cells could be at least a few percent of all cortical neurons. While this seemingly sparse lateral inhibition suggests weak functional relevance, it should be remembered that lateral excitatory connections originate from only an estimated 3-11% of cortical cells (Albus and Wahle, 1994).
With this in mind, and given that large basket cells contact a functionally very potent region of their target cells, proximal to the site of action potential generation, the actual significance of lateral inhibition could be as high as that of lateral excitation. Indeed, preliminary intracellular data revealed that in the hippocampus stimulation of a basket cell could delay or even suppress the repetitive discharge of action potentials of its targeted pyramidal cell (Miles et al., 1994). Furthermore, experiments using GABA-induced inactivation of sites remote from the recording site showed a strong influence of lateral inhibition, most likely mediated by large basket cells, on orientation selectivity (Eysel et al., 1990; Crook et al., 1991; Crook and Eysel, 1992) and direction selectivity (Eysel et al., 1988; Crook, Kisvarday and Eysel, unpublished results). These findings suggest that basket cells in the neocortex could be extremely influential in determining the output of their target neurons.
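The proportion argument of Section 2.2 can be retraced with a few lines of arithmetic. The sketch below is only an illustration: it takes the 4:1 excitatory-to-inhibitory ratio and the 17% minimum share of basket cells among GABAergic neurons from the text, and the function and variable names are assumptions introduced here. Note that the 17% figure covers all basket cell types (including clutch and columnar basket cells), so the result is a ceiling for the large basket cell subset under these minimum estimates, not a measured value.

```python
# Back-of-the-envelope check of the proportions quoted in Section 2.2.
# Inputs are the figures cited in the text; all names are illustrative.

EXCITATORY_TO_INHIBITORY = 4.0  # 4:1 ratio (Gabbott and Somogyi, 1986)
BASKET_SHARE_OF_GABA = 0.17     # minimum share of basket cells among GABAergic neurons
LATERAL_EXCITATORY_RANGE = (0.03, 0.11)  # 3-11% of cortical cells (Albus and Wahle, 1994)


def inhibitory_fraction(e_to_i_ratio: float) -> float:
    """Fraction of all neurons that are inhibitory for a given E:I ratio."""
    return 1.0 / (e_to_i_ratio + 1.0)


def basket_fraction_of_all_neurons(e_to_i_ratio: float, basket_share: float) -> float:
    """Basket cells (all types) as a fraction of all cortical neurons."""
    return inhibitory_fraction(e_to_i_ratio) * basket_share


basket = basket_fraction_of_all_neurons(EXCITATORY_TO_INHIBITORY, BASKET_SHARE_OF_GABA)
low, high = LATERAL_EXCITATORY_RANGE
print(f"Inhibitory neurons: {inhibitory_fraction(EXCITATORY_TO_INHIBITORY):.1%} of all neurons")
print(f"Basket cells (all types): {basket:.1%} of all neurons")
print(f"Cells of origin of lateral excitation: {low:.0%}-{high:.0%} of all neurons")
```

Under these inputs the basket cell share falls at the lower end of the 3-11% range quoted for the cells of origin of lateral excitatory connections, which is the comparison the text draws.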
3. FUNCTIONAL TOPOGRAPHY OF LATERAL CONNECTIONS

In this section, we examine the topographical relationship between orientation selectivity and lateral connectivity. In contrast to previous studies, we used methods that allowed us to quantify both the physiological and the anatomical results. In one set of experiments, orientation maps were obtained by recording multiunit activity in layer III at intervals of 100-300 µm in a region of 1-2 mm². At each penetration, the orientation selectivity of a small cluster of cells was determined for the dominant eye using computer-controlled stimuli (see details in Kisvarday and Eysel, 1993). The location of each penetration was marked on an enlarged photograph of the exposed area, using surface blood vessels as landmarks. In addition, the precise stereotaxic position of each penetration was registered with a resolution of 10 µm. Provided that the exact location of the recording electrode can be identified in the tissue, this method gives reliable information about the orientation preference of neurons in the close neighbourhood of the electrode. It has to be mentioned, however, that this kind of mapping suffers from two major limitations. First, due to the relatively small number of penetrations, the resulting orientation map has a relatively low spatial resolution. Secondly, collecting sufficient amounts of quantitative data even from a small region may take several days.

Hence, in another set of experiments we employed optical imaging of intrinsic signals to map orientation selectivity (Kisvarday et al., 1994). The major advantage of this technique is that it provides information from large cortical areas at relatively high spatial resolution. We routinely obtained orientation maps from regions of 8-19 mm², a roughly tenfold increase compared to our physiological mapping with electrodes. The image resolution of our maps was kept at 500-2500 µm² per pixel, resulting in a confidence range of ±(20-50) µm in the lateral direction from any given point in the map. It should not be forgotten, however, that although optical imaging is faster and provides better resolution than mapping with electrodes, it relies on averaged signals originating from the upper 600-900 µm of cortical tissue. Furthermore, since this method detects changes in oxygen metabolism and not the electrical activity of neurons directly, the results need to be treated with caution. Other problems may arise in interpreting the results of optical imaging if, for example, the spatial architecture of orientation columns is perturbed, most likely in convoluted regions. To minimize such errors we carried out our experiments in largely flat zones of area 18.

After completion of either electrophysiological or optical mapping of orientation selectivity, the underlying anatomical connections were revealed by iontophoretic delivery of biocytin (see details in Kisvarday and Eysel, 1993). Because the orientation maps were obtained in a plane tangential to the cortical surface, we sectioned the injected cortical regions in a plane as parallel to the cortical surface as possible.
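The relation between the per-pixel area of the optical maps and the quoted lateral confidence range can be checked numerically. The sketch below assumes square pixels and reads the per-pixel area range as 500-2500 µm²; both are assumptions made here for illustration, as are the function names. It also compares the midpoints of the two mapped-area ranges given in the text.

```python
import math

# Figures taken from the text; the square-pixel geometry is an assumption.
PIXEL_AREA_UM2 = (500.0, 2500.0)   # per-pixel area range, in square micrometres
ELECTRODE_MAP_MM2 = (1.0, 2.0)     # region mapped with electrodes
OPTICAL_MAP_MM2 = (8.0, 19.0)      # region mapped with optical imaging


def pixel_side_um(area_um2: float) -> float:
    """Side length (µm) of a square pixel with the given area (µm²)."""
    return math.sqrt(area_um2)


def midpoint(value_range: tuple) -> float:
    """Midpoint of a (low, high) range."""
    return (value_range[0] + value_range[1]) / 2.0


for area in PIXEL_AREA_UM2:
    print(f"{area:.0f} um^2 per pixel corresponds to a side of {pixel_side_um(area):.1f} um")

area_gain = midpoint(OPTICAL_MAP_MM2) / midpoint(ELECTRODE_MAP_MM2)
print(f"Mapped area, optical vs. electrode (range midpoints): about {area_gain:.0f}x")
```

Pixel sides of roughly 22 to 50 µm agree with the stated confidence range of ±(20-50) µm, and the ratio of the range midpoints is close to the tenfold increase mentioned for the mapped area.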
A critical issue in this study was to find the best match between the physiological and the anatomical images. To this end, each section was osmium treated and resin embedded to retain the full three-dimensional structure of the tissue. We also found that this procedure produced less, and more even, tissue shrinkage than other histological treatments. Another important advantage of osmication is that any tissue damage caused by electrode penetrations that would otherwise go unnoticed can be detected in the light microscope. This proved advantageous in correlating the layout of recording electrode holes marked on the physiological charts with their actual location in the tissue. In this way, we were able to find a precise match between our orientation maps and the histological sections, with an overall error of 30-40 µm.

Figure 1 shows a typical example of a biocytin injection site and its surrounding labelling. In general, our injection parameters were set to produce small injections with a core diameter of less than 200 µm. Although the reaction deposit in the core region was often opaque, rendering any tracing of labelled elements impossible, individual axons in the surrounding tissue could be readily traced for several mm. The labelled axons often formed clusters or patches, which occasionally contained retrogradely labelled neurons of various labelling intensities. Interestingly, not all retrogradely labelled cells were found in patches; they could also be found in inter-patch zones. On the basis of dendritic morphology, two main types of retrogradely labelled cells could be distinguished: (i) non-pyramidal cells with smooth dendrites, and (ii) pyramidal cells with spiny dendrites. We found that non-pyramidal cells constituted a smaller proportion of labelled somata than would be anticipated from their known proportion in the cortex (Gabbott and Somogyi, 1987).
A similar observation was reported recently using bulk injection of retrograde tracers combined with GABA-immunostaining (Matsubara and Boyd, 1992; Albus and Wahle, 1994). Since biocytin often reveals the recurrent axon collaterals of retrogradely labelled neurons, in a number of instances we were able to determine the type to which they belong. Much to our surprise, virtually all retrogradely labelled non-pyramidal neurons with traceable axons showed characteristics of the family of large basket cells. This phenomenon suggests that, at least in the cat, biocytin is selective in revealing inhibitory connections of the long-range type. Unfortunately, tracing the fine axon collaterals of individual pyramidal neurons proved to be more difficult in this material. Their thin axons were either too faint or arborized in regions which were too dense for detailed tracing.

Figure 1. Tangential view of a biocytin injection into layer III of area 18 of the cat. The dark area at the bottom represents the injection site, showing a core diameter of 150 µm. Arrowheads indicate labelled main axons whose terminal boutons form a patch in the upper part of the picture. Bar: 100 µm.

3.1 Analysis of single large basket cells

Axons

In this section, we present anatomical data on the axonal distribution of individual large basket cells obtained in area 18 that had been mapped for orientation using the optical imaging technique. For the analysis presented here, two large basket cells in layer III were selected on the basis of the completeness of their axonal arbors, one of which is shown in figure 2a. It is well established that the perisomatic termination of basket cell boutons strongly indicates the actual physical location of their target neurons. We utilized this phenomenon together with the notion that signal transduction between neurons takes place at synaptic junctions. Thus, by plotting the spatial distribution of basket cell boutons according to orientation, a quantitative estimate could be made of their inhibitory contribution across the orientation map. To do this, we divided the orientation map into 16 orientation divisions, each representing 11.25 degrees along the orientation axis. The relative occurrence of boutons in each of these divisions was then displayed (Fig. 2c).

The results raise a few interesting points worth considering. Firstly, boutons of the same basket cell provide input to all orientations, a finding that is very much in line with previous assumptions about the broad range of lateral inhibition provided by large basket cells to orientation sites (Martin, 1988; Kisvarday and Eysel, 1993). Secondly, although the overall shape of the distribution is rather symmetrical, it is nevertheless skewed with respect to the preferred orientation at the soma location. We do not yet know whether this phenomenon is due to a genuine bias of the axons to provide input to these orientations or to incomplete reconstruction of the axonal field, most likely at the densely labelled injection site. Thirdly, the above quantitative analysis allowed us to compare our findings directly with previous results on the same issue. In a recent analysis of area 17, we compared the distribution of individually traced large basket cells with orientation maps obtained by electrophysiological mapping (Kisvarday and Eysel, 1993). Applying a semi-quantitative method, basket cells were found to provide approximately the same amount of input to orientation sites defined as iso- (±30°), oblique- (±[30-60]°) and cross-orientation (±[60-90]°) with respect to the basket cell's own orientation preference. When we used a similar scheme in area 18, the results averaged for two large basket cells resembled those of area 17: 42.6% of all boutons occupied iso-orientation sites, while oblique- and cross-orientation sites were represented by 35.4% and 22.0%, respectively.

In essence, the above findings provide direct evidence that in the visual cortex of the cat individual large basket cells could mediate lateral inhibition to virtually all orientations. A direct implication of these results is that, in addition to iso-orientation inhibition, there is now functional-topographical evidence for non-iso-orientation inhibition in the strictest sense.
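The binning scheme used for the axonal analysis can be sketched in code. The example below is a minimal illustration, not the procedure actually used in the study: it assumes that each bouton has already been assigned the preferred orientation of the map pixel in which it lies, and the bouton values, the function names and the treatment of the category boundaries (30° counted as iso, 60° as oblique) are all hypothetical choices made here.

```python
from collections import Counter

N_BINS = 16
BIN_WIDTH = 180.0 / N_BINS  # 11.25 degrees per orientation division


def orientation_bin(theta_deg: float) -> int:
    """Index (0..15) of the 11.25-degree division containing an orientation."""
    index = int((theta_deg % 180.0) // BIN_WIDTH)
    return min(index, N_BINS - 1)  # guard against floating-point edge cases


def orientation_difference(a_deg: float, b_deg: float) -> float:
    """Smallest difference between two orientations, in the range 0..90 degrees."""
    d = abs(a_deg - b_deg) % 180.0
    return min(d, 180.0 - d)


def category(diff_deg: float) -> str:
    """Classify an orientation difference as iso, oblique or cross."""
    if diff_deg <= 30.0:
        return "iso"
    if diff_deg <= 60.0:
        return "oblique"
    return "cross"


def summarize(bouton_orientations, soma_orientation):
    """Relative bouton frequency per division and percentage per category."""
    n = len(bouton_orientations)
    bins = Counter(orientation_bin(t) for t in bouton_orientations)
    cats = Counter(
        category(orientation_difference(t, soma_orientation))
        for t in bouton_orientations
    )
    histogram = [bins.get(i, 0) / n for i in range(N_BINS)]
    percentages = {k: 100.0 * cats.get(k, 0) / n for k in ("iso", "oblique", "cross")}
    return histogram, percentages


# Hypothetical bouton orientations in degrees (not data from the study).
boutons = [5.0, 12.0, 20.0, 40.0, 50.0, 100.0, 170.0, 95.0]
histogram, percentages = summarize(boutons, soma_orientation=10.0)
print(percentages)
```

Because orientation is periodic over 180°, the difference between, for example, 170° and 10° is 20° and not 160°; handling this wrap-around is what places such a bouton in the iso category.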
In this context, one should not forget that the results underestimate the real contribution of lateral inhibition to non-iso-orientation sites, because our categories of iso-, oblique- and cross-orientation preference are artificial. Recordings from single cells show that, in most cases, a 30° offset of the visual stimulus from the preferred orientation of a neuron already represents a non-optimal orientation. With this in mind, our results are consistent with extracellular experiments demonstrating a strong inhibitory influence from non-optimal orientations (Sillito, 1975, 1979; Morrone et al., 1982; Ramoa et al., 1986; Eysel et al., 1990) and with some intracellular experiments showing the broad-band nature of inhibitory postsynaptic potentials (Benevento et al., 1972; Douglas et al., 1991; Volgushev et al., 1993).

Dendrites

The method that we applied for unravelling the relationship between the axonal topography of basket cells and orientation maps can also be used for a similar analysis of their dendritic fields. Assuming that each point in the orientation map defines an orientation-specific input to the underlying structures, e.g. somata and dendrites, a logical question is whether the dendritic field of a basket cell receives input from a narrow or a broad range of orientations. We therefore used the same approach as for the axons and counted the occurrence of basket cell dendrites in orientation pixels. The results showed that the "input" to the dendrites is more narrowly tuned than the boutonal "output" of the same basket cell. Expressed in the same categories as above, 86% of the dendrites were found to branch at iso-orientation sites, while oblique- and cross-orientation sites were represented by 11% and 3%, respectively. It might be argued that these values follow from the fact that dendrites have a smaller lateral extent than axons, although they occasionally reach distances of up to 300 µm from the parent soma.
It would be interesting to see how the orientation tuning of physiologically characterized basket cells correlates with the "dendritic input" calculated with the above method. In any case, our
Figure 2. Reconstructions of axon collaterals emitted by a single basket cell (in a) and by a population of eight basket cell collaterals (in b), viewed from the cortical surface. The parent soma (in a) and the injection site (in b) are marked by asterisk. Notice that the single basket cell and the eight basket axons innervate only certain regions. In (c) and (d), frequency distributions of boutons provided by the basket cell shown in (a) and the population of basket axons shown in (b) are displayed as a function of orientation preferences. It is evident that the boutonal distribution of the single basket cell is normal-like and skewed to the soma location (arrowed) and the distribution of the pooled axons is highly irregular. In (d), the injection site is indicated by asterisk. Bar: (a) and (b), 250 µm.

Figure 3. Tangential view of reconstructed axonal fields of two large basket cells, one in layer III (in black) and one in layer V (in white), in the same cortical column. Somata of the two cells are indicated by arrows. Notice that both axonal fields tend to overlap; nonetheless, they terminate in different laminae. Bar: 500 µm.
finding of a narrow orientation distribution of basket cell dendrites matches well with electrophysiological data (Martin et al, 1983).

3.2 Population of basket cell axons

While the analysis of single basket cells could disclose subtle details about the functional topography of lateral inhibition, it could by no means completely reveal the situation at the population level. Accordingly, we studied how the population of large basket axons originating from a circumscribed region of the cortex is distributed along the orientation map revealed with optical imaging. To this end, in one animal, all labelled basket axons (8) which projected more than 400 µm laterally and originated from the centre of the injection site were reconstructed in area 18. Laminar analysis showed that each of them arborized mainly in layer III with occasional branching into upper layer IV. Figure 2b demonstrates the boutonal distribution of the eight basket axons in a tangential plane. Interestingly, we found that their composite field occupied only certain sectors and did not project into other regions. In this respect the population of basket axons resembled the individual basket cells shown above. An implication of such a topography could be that the population of basket cells of a given cortical column may provide input only to certain visuotopic locations in the cortex. This assumption is strengthened by the observation of two basket cells whose somata were found in the same cortical column, one in layer III and the other in layer V (Fig. 3). When superimposed, their reconstructed axons revealed that both basket cells were in quasi-register, innervating corresponding sectors of different laminae. A similar, spatially specific arborization pattern has been described for layer III pyramidal cells providing patchy input to layers III and V (Gilbert and Wiesel, 1983; Martin and Whitteridge, 1984; Kisvarday et al, 1986).
It thus seems reasonable to speculate that lateral inhibition may also be selectively distributed along the visual cortical map, although the governing rules may well be very different from those of the excitatory system. The similar arborization pattern of individual basket cells and the population of basket axons suggests that their orientation distributions might also be similar. Indeed, when we applied the same quantitative method that was used for single basket cells, the results showed that the boutons of pooled basket axons also covered the entire range of orientations (Fig. 2d). However, while single basket cells showed a preference for one or two orientations, the orientation distribution for the population of basket axons was rather irregular. Again, it is possible that we underestimated the proportion of orientation preferences at the core of the injection site; hence the data need to be viewed critically.

3.3 Lateral patchy connections in area 18 of the cat

Connectivity of orientation domains

The topography of lateral intracortical connections has commonly been visualized with bulk injection of anterograde and retrograde tracers. While purely anatomical data on this issue are widely available, there are few attempts in the literature, mainly due to technical difficulties, in which the anatomical findings could be directly correlated with function. As pointed out above, the patchy character of iso-orientation domains is thought to be directly linked to the patchiness of lateral connections. Experimental data obtained in areas 17 and 18, however, led to results showing an apparent discrepancy. Here we examine how the distribution of lateral connections in area 18 relates to the topography of orientation. To explore the underlying connectivity rules we used a combination of optical imaging of intrinsic signals and focal injection of biocytin (see details in Kisvarday et al, 1994).
Figure 4. Anterogradely labelled boutons reconstructed from adjoining tangential sections. The biocytin injection was made into layer III in area 18, resulting in labelled patches. The labelled area is elongated in the antero-posterior direction. Right = anterior, down = medial. Bar: 1 mm.

While previous attempts relied on qualitative evaluation of the results in determining topographical relationships, we decided to apply quantitative approaches as follows. We chose the location of labelled boutons as the most informative measure for a quantitative interpretation of the anatomical signals. Accordingly, each labelled bouton in large consecutive sections was registered and digitized using the computer software Neurolucida (MicroBrightField) and a personal computer attached to a light microscope (Leitz) with a motorized stage. A typical example of such a reconstruction is shown in figure 4, where the labelled boutons were compiled from the entire cortical thickness. For further analysis, boutonal distributions were converted into boutonal density maps using the same pixel resolution as that of the corresponding orientation maps. In this way the actual distribution structure of the connections could be treated quantitatively. A direct comparison between an orientation map and the underlying boutonal density map can be seen in figure 5. In this example, biocytin was injected into the centre of an orientation domain showing a clear preference for horizontal stimulus orientation. As expected, the injection site and its immediate neighbourhood showed the highest density of labelling, and patches of labelled boutons occurred at a lateral distance of 1.3-1.4 mm from the core of the injection site. When the topography of the patches was analysed in terms of orientation specificity, an obvious tendency emerged: the preferred orientations of the patches were similar to that of the injection site, i.e. the patches and the injection site covered mainly blue zones in the orientation map in figure 5. We wanted to know exactly to what orientations the patches corresponded. Therefore we calculated the average orientation preference in a circular region (200 µm in diameter) of each patch centred on its density maximum. The actual differences between the averaged orientation preferences of the patches and that of the injection site were then determined (Fig. 5b). It is clear that none of the patches differed by more than ±10 degrees from the average orientation preference of the injection site. The simplest implication of this result is that patchy connections in area 18 of the cat predominantly link sites of similar orientations. From this it might follow that the connectivity rule in area 18 is the same as in area 17.

However, as is often the case, the situation regarding lateral connectivity turned out to be much more complex. Detailed inspection of the orientation map and the anatomical reconstruction shown in figure 5 revealed two important facts. First, in this particular experiment, the injection site was confined to the centre of an orientation domain, where orientation gradients are obviously low compared to orientation centres. It is tempting to speculate that lateral connections in regions of high orientation gradient, i.e. orientation centres, may be different. Thus we carried out a critical test on this issue by injecting biocytin into regions containing orientation centres or "pinwheels", and the results are discussed in the following section. A second important fact is the presence of labelled boutons in interpatch zones irrespective of the topographical location of the injection site. It was noticed that, although these boutons constituted a relatively small proportion of the total labelling, they provided a continuum between the patches so that no region within the labelled area was exempt from boutonal labelling (see brown regions in figure 5b). At present little is known about the exact origin of these connections. Clearly their majority is provided by a mixture of pyramidal type of axons and, as we have seen above, to some extent by basket cell collaterals.
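The patch analysis described above (a boutonal density map on the pixel grid of the orientation map, an average orientation in a 200 µm circle around a density maximum, and the difference from the injection site) can be illustrated with a minimal sketch. All function and parameter names are hypothetical, and the averaging of orientations via doubled angles is an assumption made here for the 180° periodicity of orientation; the text does not state how the average was actually computed.

```python
import numpy as np

def bouton_density_map(xy_um, shape, pixel_um):
    """Count boutons per pixel on a grid matching the orientation map.

    xy_um: (n, 2) array of bouton x, y positions in micrometres (hypothetical input).
    """
    xy_um = np.asarray(xy_um, dtype=float)
    rows = np.floor(xy_um[:, 1] / pixel_um).astype(int)
    cols = np.floor(xy_um[:, 0] / pixel_um).astype(int)
    inside = (rows >= 0) & (rows < shape[0]) & (cols >= 0) & (cols < shape[1])
    density = np.zeros(shape, dtype=int)
    np.add.at(density, (rows[inside], cols[inside]), 1)  # handles repeated pixels
    return density

def mean_orientation(ori_map_deg, centre_rc, diameter_um, pixel_um):
    """Average orientation (0-180 deg) inside a circular region of the map.

    Assumption: angles are doubled before averaging so that 179 deg and
    1 deg are treated as neighbours.
    """
    rr, cc = np.indices(ori_map_deg.shape)
    radius_px = 0.5 * diameter_um / pixel_um
    mask = (rr - centre_rc[0]) ** 2 + (cc - centre_rc[1]) ** 2 <= radius_px ** 2
    doubled = np.deg2rad(2.0 * ori_map_deg[mask])
    mean = 0.5 * np.rad2deg(np.arctan2(np.sin(doubled).mean(), np.cos(doubled).mean()))
    return mean % 180.0

def orientation_difference(a_deg, b_deg):
    """Signed difference a - b, wrapped into [-90, 90) degrees."""
    return (a_deg - b_deg + 90.0) % 180.0 - 90.0

# Illustrative use with synthetic data (not the experimental values).
density = bouton_density_map([[10, 10], [10, 12], [260, 30]], shape=(2, 6), pixel_um=50)
patch_rc = np.unravel_index(np.argmax(density), density.shape)  # density maximum
ori_map = np.full((2, 6), 40.0)
patch_ori = mean_orientation(ori_map, patch_rc, diameter_um=200, pixel_um=50)
print(orientation_difference(patch_ori, 35.0))
```

Wrapping the difference into a ±90° range mirrors the way the patch-to-injection-site differences are reported in degrees in figure 5b.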
Figure 5. Orientation map obtained with optical imaging of intrinsic signals (a) and boutonal density map (b) of the same region of area 18. The orientation map was computed from single condition maps (Bonhoeffer and Grinvald, 1993) according to the color scheme on the right-hand side. Asterisks indicate the site of iontophoretically delivered biocytin in layer III. In (b), the boutonal density distribution is shown in the very same area as in (a). Note the patchy distribution of the labelled boutons around the injection site. Comparison between the optical and the anatomical images revealed that the labelled patches occupy regions possessing like-orientations to that of the injection site. The difference in orientation preferences between each patch and the injection site was numerically estimated by calculating the average orientation preference in areas of 200 µm in diameter centred on each patch and the injection site. The actual differences in orientation preference between the injection site and the patches are indicated in degrees. Bar: 1 mm.

Figure 6. Shaded surface representations of the density distribution of labelled boutons in cat visual cortex area 18. In the upper panel (orientation domain-injection), the core of the injection site is situated in the middle of an orientation domain. In the lower panel (orientation centre-injection), the core of the injection site is situated in an orientation centre, a so-called pinwheel, where many orientations meet in a small region. Note the obvious patchy distribution of labelling in the upper distribution as opposed to the less obvious patchy labelling in the lower distribution. In both distributions, notice the preponderance of local connections up to a lateral distance of 300-400 µm from the core of the injection site.

Figure 7. Boutonal distribution (in %) of the injection site and of the labelled patches following an injection into an orientation domain (upper panel), shown in the upper panel of figure 6, and into an orientation centre (lower panel), shown in the lower panel of figure 6, with respect to 16 orientations. Notice that orientation domain-injection produced labelled patches showing a similar frequency distribution to that of the injection site, and orientation centre-injection produced labelled patches showing a radically different frequency distribution to that of the injection site.

Connectivity of orientation centres

In the previous paragraph, we have shown that injections into orientation domains of area 18 label regions which on average have similar orientation preferences. In this paragraph, we provide comparative data on injections made into zones including orientation centres. Qualitatively, injecting into an orientation centre resulted in patchy labelling, although the patches were less remarkable than after injecting into orientation domains. This feature can be appreciated in 3-dimensional representations of boutonal density maps as shown in figure 6. Apparently the orientation centre-injection in the lower panel of figure 6 resulted in strong labelling of interpatch regions, which makes it difficult to recognize individual patches. Nevertheless, on the basis of contour plots, whereby the contours of iso-density locations were displayed, we were able to differentiate 5 distinct patches around the injection site. Comparing the distribution of the labelled patches resulting from orientation domain- (Fig. 6, upper panel) and from orientation centre-injections (Fig. 6, lower panel), we found no appreciable difference between their overall number and centre-to-centre spacing.

Again, we asked whether the orientation preferences of the labelled patches were similar or dissimilar to that of the injection. Hence we dissected the orientation maps into 16 zones and calculated what proportion of the labelled boutons fell into each zone. Only the central region (200 µm in diameter) of each patch was subjected to this analysis. Figure 7 compares the outcome between an injection made into an orientation domain (upper panel, also shown in figure 5 and in the upper panel of figure 6) and an injection made into an orientation centre (lower panel, also shown in the lower panel of figure 6). Apparently there are striking differences between the two sets of distributions. Orientation domain-injection produced patches each of which had a rather similar distribution pattern to that of the injection site. Their preferred orientation range was relatively narrow and comprised only 3-5 bins, corresponding to up to ±25 degrees centred on the orientation preference at the injection site. Contrary to this, orientation centre-injection produced boutonal patches whose distribution along the orientation scale was radically different from that of the injection site and also from each other. Their preferred orientation width was broader than that after orientation domain-injection. As expected on the basis of the orientation map at the injection site, the labelling represented the entire scale of orientations. These observations may suggest that the actual orientations represented at the injection site are proportional to those represented in different patches. In other words, the sum of boutons in corresponding bins of the patches should provide a distribution similar to that of the injection site. Indeed, combining the graphs of patches 1-5 in the lower panel of figure 7 yielded a distribution (not shown) similar to the one of the injection site. This kind of relationship was found applicable to other cases regardless of whether orientation centre- or orientation domain-injections were taken into account.
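The 16-zone analysis and the summation across patches can be sketched as follows. This is a minimal illustration under stated assumptions: the 16 zones are taken to be equal-width bins over 0-180°, each bouton is represented by the orientation preference of the map pixel it lies on, and the overlap score is a hypothetical similarity measure introduced here only for illustration (the text reports the similarity qualitatively).

```python
import numpy as np

N_BINS = 16  # assumption: 16 equal zones of 11.25 degrees covering 0-180 degrees

def orientation_histogram(bouton_ori_deg, n_bins=N_BINS):
    """Percentage of boutons falling into each orientation zone."""
    ori = np.asarray(bouton_ori_deg, dtype=float) % 180.0
    if ori.size == 0:
        raise ValueError("no boutons in the analysed region")
    counts, _ = np.histogram(ori, bins=n_bins, range=(0.0, 180.0))
    return 100.0 * counts / counts.sum()

def pooled_histogram(patches_ori_deg, n_bins=N_BINS):
    """Sum the boutons of all patches bin by bin, then normalise to percent."""
    pooled = np.concatenate([np.asarray(p, dtype=float) for p in patches_ori_deg])
    return orientation_histogram(pooled, n_bins)

def histogram_overlap(h1, h2):
    """Hypothetical similarity score: shared area of two percentage histograms (0-1)."""
    return float(np.minimum(h1, h2).sum() / 100.0)

# Synthetic example: three patches that each prefer a different orientation.
patches = [[10.0, 12.0, 15.0], [70.0, 75.0], [130.0, 140.0, 135.0]]
injection_site = [11.0, 72.0, 133.0, 138.0]
print(histogram_overlap(pooled_histogram(patches), orientation_histogram(injection_site)))
```

Pooling the raw bouton counts before normalising weights each patch by its number of boutons, which corresponds to summing the boutons in corresponding bins as described above.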
The fact that after orientation centre-injection different patches had boutonal density maxima at different orientations raises a number of questions. One possibility is that lateral connections of orientation centres might be radically different from those of orientation domains. If so, a direct implication is that the very same pyramidal neuron in an orientation centre may send collaterals to regions representing many different orientations. Hence these pyramidal cells could be expected to show broad orientation tuning, being the source of input to many orientation sites. Preliminary data on this issue showed, however, no indication that neurons at orientation centres possess significantly broader orientation tuning than those in orientation domains (Bonhoeffer and Grinvald, 1991). Another possibility could be that each pyramidal cell provides input only to patches of corresponding orientation preferences. In this instance, however, to maintain a distribution pattern that is similar to that of orientation domain-injections, each pyramidal cell should send axon collaterals only to a few patches. Obviously, further experiments are required for reconstructing individual pyramidal cells in specific regions of orientation maps.

Comparison of the results with previous studies

The intriguing topography of connections following injections into orientation domains and orientation centres may explain the existing controversy in the literature between areas 17 (Gilbert and Wiesel, 1989) and 18 (Matsubara et al, 1985, 1987). It logically follows from the intrinsic nature of orientation maps that increasing the size of an injection site results in labelling that derives from both orientation domains and orientation centres. We believe that this reasoning is consistent with the results of Matsubara and her colleagues, who used extremely large injections of up to 1 mm in diameter.
Obviously, they simultaneously revealed connections originating from orientation domains and orientation centres. Hence the average orientation preference in their patches differed considerably from the one recorded at the injection site. On the other hand, critically viewing the study of Gilbert and Wiesel (1989), one must admit that in their examples no absolute specificity of connections between iso-orientation sites could be seen. In fact the orientation preference of a number of patches, for example in their figure 5C and F, does not match that of the injection site. Nevertheless there is a tendency for the majority of connections to run between sites of similar orientation preferences. A further undecided point in the same study was whether the injections were placed into orientation domains or orientation centres. Clearly this is a key issue, according to our findings, in detecting one or the other type of lateral connections with regard to orientation. Reportedly, in area 17, every square mm of cortex contains 1.8 orientation centres (Rao et al, 1994); thus, using injections of 200 µm across, the chance of hitting an orientation centre is inevitably quite low (about 6%). It is unlikely that any of the 14 injections in the report of Gilbert and Wiesel (1989) coincided with an orientation centre, suggesting that their study was biased to reveal connections of orientation domains. Apparently, future experiments in area 17 should address these questions and provide answers to the above uncertainties.
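The figure of about 6% can be reproduced with a short calculation. The sketch below assumes that the injection site is a disk of 200 µm diameter and that orientation centres are point-like and scattered uniformly at the quoted density; these simplifications are made here and are not stated in the text.

```python
import math

centre_density_per_mm2 = 1.8  # orientation centres per mm^2 in area 17 (Rao et al, 1994)
injection_diameter_mm = 0.2   # injection of 200 µm across

# Area covered by one injection, modelled as a disk (assumption).
injection_area_mm2 = math.pi * (injection_diameter_mm / 2.0) ** 2

# Expected number of orientation centres falling inside one injection site;
# for such a small value it approximates the chance of hitting a centre.
expected_centres = centre_density_per_mm2 * injection_area_mm2

print(f"injection area: {injection_area_mm2:.4f} mm^2")            # 0.0314 mm^2
print(f"chance of hitting a centre: {100 * expected_centres:.1f} %")  # 5.7 %
```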
4. RELATIONSHIP BETWEEN LATERAL EXCITATORY AND LATERAL INHIBITORY CONNECTIONS

We have shown above that lateral inhibitory connections are not suited for establishing patchy connections. A similar conclusion can be drawn from studies using retrograde labelling of GABAergic corticocortical connections (Kritzer et al, 1992; Matsubara and Boyd, 1992; Albus and Wahle, 1994). This does not mean, however, that they cannot make important contributions to patches. Obviously, this idea could only be tested if excitatory and inhibitory connections were distinguishable. Therefore we developed a set of morphological criteria by which we were able to differentiate between putative excitatory and putative inhibitory boutons. To this end we utilized the peculiar property of biocytin of revealing axons and dendrites of large basket cells in addition to pyramidal neurons. Representative light microscopic examples of excitatory and inhibitory axon collaterals are shown in figure 8. Briefly, putative excitatory axons emitted smaller boutons than inhibitory basket axons. Excitatory boutons were often connected via a short stalk to the axon stem, an uncommon feature of basket boutons. On the other hand, basket axons frequently established pericellular nests around the target neurons, a feature that has never been seen for pyramidal and spiny stellate cells. These delicate features were taken into account for all boutons, together with the global pattern of their parent axons, in deciding to which of the two groups, the putative excitatory or the putative inhibitory, they belong. It needs to be admitted that not all preparations were suitable for distinguishing between these two main types of labelling. The critical factors for obtaining high quality labelling were as follows: no brain movement during iontophoretic injection of biocytin, good fixation of the tissue, and excellent quality of histochemical reactions.
So far, we have collected data from experiments in which the orientation map was obtained in area 18 using electrophysiological recordings with glass pipettes. Having completed the reconstruction of excitatory and inhibitory boutons, their digitized density distribution was overlaid with the map of preferred orientations. An example of this is shown in figure 9a and b. As could be expected, the excitatory boutons formed numerous patches around the injection site with a centre-to-centre distance of 1.1 mm. Of these patches, only those within the recorded area are shown in figure 9a. Looking at the orientation sites in the excitatory patches, there is no doubt that they represent similar orientations to that of the injection site. Notice, however, that as in our previous examples, a fair number of excitatory boutons are found in regions possessing dissimilar orientation preferences with regard to the injection site's own orientation preference (Fig. 9a). An entirely different picture was obtained for the inhibitory boutons (Fig. 9b). Although the majority of them were found in the close vicinity of the injection site, those at more remote locations occupied regions whose orientation preference did not match the one at the injection site. Again it should be stressed that inhibitory boutons did not form patches in the same sense as their excitatory counterparts. Nonetheless their distribution was not even either, showing accumulation at certain locations. The exact topographical relationship between the excitatory and inhibitory boutons can be better appreciated when the layout of their density maxima is compared using crossbars. This is demonstrated in figure 9c and d. One can readily see that, apart from the immediate neighbourhood of the injection site, the high density locations of excitatory and inhibitory boutons do not lie in register. Instead the two systems are spatially offset in such a way that inhibitory boutons tend to terminate in interpatch
Figure 8. Light-microscopic examples of presumed lateral excitatory (in a) and presumed lateral inhibitory (in b and c) axon types. Note the marked difference in bouton size between (a) and (b). Basket axons give off mainly en passant boutons (arrows in b) as opposed to excitatory axons. The latter emit club-like (arrows in a) and en passant boutons (not shown). Characteristically, basket boutons often establish perisomatic contacts (arrows in c). The nucleolus of the targeted cell in (c) is marked by asterisk. Bar: (a)-(c), 10 µm.

Figure 9. Relationship between the overall horizontal topography of excitatory and inhibitory connections in a region mapped for orientation selectivity using recording electrodes. All four panels show digitized images of density distributions of boutons labelled from the same injection site. In the middle, color-coding refers to the number of boutons per 50 × 50 µm² in each panel. Preferred orientations are indicated by small blue bars (in a and b), and at the injection site the preferred orientation is indicated by a large bar in a disk (in a-d). The highest density of excitatory boutons (patches) occurs at sites showing similar orientation preferences to that of the injection site. Contrary to this, inhibitory boutons prefer regions that possess dissimilar orientation preferences to that of the injection site. In c and d, crossbars indicate the centre of regions containing a high density of excitatory boutons. Note that these regions are largely avoided by inhibitory boutons. Bars: a-b and c-d, 1 mm.
zones and marginally to the excitatory patches. In this respect our findings are very much in harmony with those of Matsubara and Boyd (1992), who found most retrogradely labelled inhibitory cells on the edges or outside of labelled patches in area 18. However, our results differ from those obtained in area 17 showing that lateral inhibitory connections are rather uniformly distributed; about half of them reside within labelled patches of non-GABAergic
Figure 10. The relative impact (in %) of long-range inhibitory (upper diagrams) and excitatory connections (lower diagrams) at iso-, oblique- and cross-orientation sites as a function of lateral distance (in mm). Method: optical imaging of orientation selectivity was obtained in area 18, followed by iontophoretic injection of biocytin into layer III. Boutons of anterogradely labelled fibres were reconstructed in layers I-VI. Using morphological criteria (see Fig. 8) it was possible to distinguish between putative GABAergic boutons emitted by labelled layer III large basket cell axons (a total of eight axons) and putative excitatory boutons emitted by labelled spiny cells. The 2-dimensional distribution of inhibitory (basket cell) boutons and of excitatory boutons along the orientation map was then overlaid with concentric rings of 100 µm width centred on the core of the injection site. The number of boutons falling into zones representing iso- (±30°), oblique- (±[30-60]°) and cross-orientation (±[60-90]°) was counted and expressed as the percentage of the total number of inhibitory and excitatory boutons, respectively. Asterisk indicates the core zone of the injection site, where distinction between inhibitory and excitatory boutons was not possible due to the high density of labelled elements. As a result of this, the inhibitory and the excitatory connections are under-represented at the core zone of the injection site. These results can be summarized as follows: iso-orientation inhibition is strongest up to a lateral distance of 400 µm from the core of the injection site, cross-orientation inhibition has its maximum at half a hypercolumn distance, and oblique-orientation inhibition is reasonably strong between distances unoccupied by iso- and cross-orientation inhibition. Contrary to lateral inhibition, lateral excitation follows a different distribution: for iso-orientation excitation the highest impact was found up to a distance of 400 µm from the core of the injection site and at a hypercolumn distance. Cross- and oblique-orientation excitation were very weak over a broad range centred at about half a hypercolumn distance. Notice that in this animal a hypercolumn distance was exceptionally large (1.5 mm).

Figure 11. Concept diagram showing the functional topography of intracortical long-range excitatory and long-range inhibitory systems. There are at least two types of lateral excitatory connections: one that originates from orientation domains and one that originates from orientation centres. The former has links to regions that show similar orientation preferences (upper scheme); the latter is less specific for orientation (lower scheme). Long-range inhibition by large basket cells (B) provides input to many orientations.
cells (Albus and Wahle, 1994). These differences might be explained by the different techniques used, or they may simply derive from the different connectivity schemes of lateral inhibition in the two areas.

4.1 The relative impact of inhibitory and excitatory connections

The quantitative aspects of the above results were also examined using corresponding optical and anatomical data. We asked specifically what the relative impact of excitatory and inhibitory connections is in relation to orientation selectivity on the cortical surface. Conceivably the results could be interpreted as indirect measures of the impact of lateral connections on target cells with orientation preferences similar or dissimilar to their own. For the sake of simplicity we divided the orientation palette into three zones representing iso-, oblique- and cross-orientation relative to the averaged orientation preference at the injection site, which is the source of the connections. Obviously an intrinsic property of orientation maps is that, with increasing lateral distance from any given point in the map, the relative occurrence of orientation preferences gradually changes. Since lateral connections meet this change along their course, their impacts predominantly depend on their tangential distribution pattern. The calculated relative impacts for lateral excitatory and lateral inhibitory connections are shown in figure 10. Apparently both types of inputs have a strong influence on iso-orientation sites up to a lateral distance of 400 µm and, in the case of excitatory connections, at a hypercolumn distance (see graphs on the left-hand side of figure 10). More laterally, where inhibition displays a very strong non-iso-orientation impact, in particular at half a hypercolumn distance, non-iso-orientation excitation remains continuously weak. It should be added here that all the above calculations are concerned with connections of orientation domains. We do not yet have comparable calculations on connections of orientation centres.
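The counting scheme behind figure 10 (concentric rings of 100 µm width, three orientation zones, counts expressed as a percentage of all boutons of one type) can be sketched as below. The function names and the input format are hypothetical: each bouton is assumed to be given by its lateral distance from the core of the injection site and by the difference between the orientation preference at its location and that of the injection site.

```python
import numpy as np

def classify_zone(delta_deg):
    """Return 0 for iso- (<=30 deg), 1 for oblique- (30-60 deg), 2 for cross-orientation (60-90 deg)."""
    # Wrap the orientation difference into [-90, 90) and take its magnitude.
    d = np.abs((np.asarray(delta_deg, dtype=float) + 90.0) % 180.0 - 90.0)
    return np.digitize(d, [30.0, 60.0], right=True)

def relative_impact(dist_um, delta_deg, ring_um=100.0, n_rings=30):
    """Percentage of all boutons found in each (zone, ring) combination.

    Rows: iso, oblique, cross. Columns: concentric rings of ring_um width.
    """
    dist = np.asarray(dist_um, dtype=float)
    if dist.size == 0:
        raise ValueError("no boutons supplied")
    ring = np.floor(dist / ring_um).astype(int)
    zone = classify_zone(delta_deg)
    keep = (ring >= 0) & (ring < n_rings)
    counts = np.zeros((3, n_rings), dtype=int)
    np.add.at(counts, (zone[keep], ring[keep]), 1)
    return 100.0 * counts / dist.size

# Synthetic example with four boutons (not experimental data).
impact = relative_impact([50, 150, 150, 950], [0, 45, -80, 100])
print(impact[:, :2])
```

Running the same function separately on the inhibitory and on the excitatory boutons would give the two sets of curves, each normalised to the total number of boutons of its own type.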
5. CONCLUSIONS

The present review attempted to provide some insight into the theme of lateral connections with special reference to their functional topography. The main conclusion of our results is that corticocortical connections establish a complex network in relation to the topography of orientation specificity. Our findings are summarized in figure 11, demonstrating that the underlying connectivity rules of orientation domains and orientation centres are likely to be different, which could explain much of the controversy between previous studies on this issue. Furthermore, we observed a clear topographical difference between excitatory and inhibitory corticocortical connections. Our findings could also be put forward as evidence for interactions between functionally different compartments. Consistent with this idea is the fact that, as in the cat, lateral connections in primate visual areas V1 and V2 seem to allow for a considerable amount of cross-talk between specific compartments of orientation selectivity (Malach et al, 1993, 1994). On the other hand, corticocortical connections need to be viewed as the substratum of a multitude of physiological operations rather than only orientation selectivity (for review see Gilbert, 1992). While we currently have the technical means of studying orientation specificity in large cortical regions, comparative data for other attributes such as direction selectivity may soon become generally available (Grinvald et al, 1994).
6. REFERENCES
Aertsen A.M.H.J. and Gerstein G.L. (1985) Evaluation of neuronal connectivity: sensitivity of cross correlation. Brain Res., 340:341-354.
Albus K. (1975) A quantitative study of the projection area of the central and the paracentral visual field in area 17 of the cat. II. The spatial organization of the orientation domain. Exp.Brain Res., 24:181-202.
Albus K. and Wahle P. (1994) The topography of tangential inhibitory connections in the postnatally developing and mature striate cortex of the cat. Eur.J.Neurosci., 6:779-792.
Amir Y., Harel M. and Malach R. (1993) Cortical hierarchy reflected in the organization of intrinsic connections in Macaque monkey visual cortex. J.Comp.Neurol., 334:19-46.
Benevento L.A., Creutzfeldt O.D. and Kuhnt U. (1972) Significance of intracortical inhibition in the visual cortex. Nature, 238:124-126.
Bonhoeffer T. and Grinvald A. (1991) Iso-orientation domains in cat visual cortex are arranged in pinwheel-like patterns. Nature, 353:429-431.
Bonhoeffer T. and Grinvald A. (1993) The layout of iso-orientation domains in area 18 of cat visual cortex: Optical imaging reveals a pinwheel-like organization. J.Neurosci., 13:4157-4180.
Boyd J. and Matsubara J.A. (1991) Intrinsic connections in cat visual cortex: a combined anterograde and retrograde tracing study. Brain Res., 560:207-215.
Braitenberg V. and Braitenberg C. (1979) Geometry of orientation columns in the visual cortex. Biol.Cybernetics, 33:179-186.
Burkhalter A. and Bernardo K.L. (1989) Organization of corticocortical connections in human visual cortex. Proc.Natl.Acad.Sci.(USA), 86:1071-1075.
Burkhalter A. and Charles V. (1990) Organization of local axon collaterals of efferent projection neurons in rat visual cortex. J.Comp.Neurol., 302:920-934.
Crook J.M. and Eysel U.T. (1992) GABA-induced inactivation of functionally characterized sites in cat visual cortex (area 18): Effects on orientation tuning. J.Neurosci., 12:1816-1825.
Crook J.M., Eysel U.T. and Machemer H.F. (1991) Influence of GABA-induced remote inactivation on the orientation tuning of cells in area 18 of feline visual cortex: A comparison with area 17. Neurosci., 40:1-12.
Douglas R.J., Martin K.A.C. and Whitteridge D. (1991) Selective responses of visual cortical cells do not depend on shunting inhibition. J.Physiol.(London), 440:659-696.
Eysel U.T., Crook J.M. and Machemer H.F. (1990) GABA-induced remote inactivation reveals cross-orientation inhibition in the cat striate cortex. Exp.Brain Res., 80:626-630.
Eysel U.T., Muche T. and Worgotter F. (1988) Lateral interactions at direction-selective striate neurones in the cat demonstrated by local cortical inactivation. J.Physiol.(London), 399:657-675.
Gabbott P.L.A. and Somogyi P. (1986) Quantitative distribution of GABA-immunoreactive neurons in the visual cortex (area 17) of the cat. Exp.Brain Res., 61:323-331.
Gilbert C.D. (1992) Horizontal integration and cortical dynamics. Neuron, 9:1-13.
Gilbert C.D. and Wiesel T.N. (1979) Morphology and intracortical projections of functionally characterised neurones in the cat visual cortex. Nature, 280:120-125.
Gilbert C.D. and Wiesel T.N. (1983) Clustered intrinsic connections in cat visual cortex. J.Neurosci., 3:1116-1133.
Gilbert C.D. and Wiesel T.N. (1989) Columnar specificity of intrinsic horizontal and corticocortical connections in cat visual cortex. J.Neurosci., 9:2432-2442.
Goldman P. and Nauta W.J.H. (1977) Columnar distribution of cortico-cortical fibers in the frontal association, limbic and motor cortex of the developing Rhesus monkey. Brain Res., 122:393-413.
Grinvald A., Lieke E.E., Frostig R.D. and Hildesheim R. (1994) Cortical point-spread function and long-range lateral interactions revealed by real-time optical imaging of Macaque monkey primary visual cortex. J.Neurosci., 14:2545-2568.
Hubel D.H. and Wiesel T.N. (1962) Receptive fields, binocular interaction and functional architecture in the cat's visual cortex. J.Physiol.(London), 160:106-154.
Hubel D.H. and Wiesel T.N. (1963) Shape and arrangement of columns in cat's striate cortex. J.Physiol.(London), 165:559-568.
Jones E.G. (1975) Varieties and distribution of non-pyramidal cells in the somatic sensory cortex of the squirrel monkey. J.Comp.Neurol., 160:205-268.
Kisvarday Z.F. (1992) GABAergic networks of basket cells in the visual cortex. In: Prog.Brain Res., R.R. Mize, R.E. Marc and A.M. Sillito (Eds.), 90:385-405.
Kisvarday Z.F. and Eysel U.T. (1992) Cellular organization of reciprocal patchy networks in layer III of cat visual cortex (area 17). Neurosci., 46:275-286.
Kisvarday Z.F. and Eysel U.T. (1993) Functional and structural topography of horizontal inhibitory connections in cat visual cortex. Eur.J.Neurosci., 5:1558-1572.
Kisvarday Z.F., Beaulieu C. and Eysel U.T. (1993) Network of GABAergic large basket cells in cat visual cortex (area 18): Implication for lateral disinhibition. J.Comp.Neurol., 327:398-415.
Kisvarday Z.F., Martin K.A.C., Freund T.F., Magloczky Zs., Whitteridge D. and Somogyi P. (1986) Synaptic targets of HRP-filled layer III pyramidal cells in the cat striate cortex. Exp.Brain Res., 64:541-552.
Kisvarday Z.F., Kim D.-S., Eysel U.T. and Bonhoeffer T. (1994) Relationship between lateral inhibitory connections and the topography of the orientation map in cat visual cortex. Eur.J.Neurosci., 6:1619-1632.
Kritzer M.F., Cowey A. and Somogyi P. (1992) Patterns of inter- and intralaminar GABAergic connections distinguish striate (V1) and extrastriate (V2, V4) visual cortices and their functionally specialized subdivisions in the Rhesus monkey. J.Neurosci., 12:4545-4564.
Lennie P. (1980) Parallel visual pathways: A review. Vision Res., 20:561-594.
Levitt J.B., Yoshioka T. and Lund J.S. (1994) Intrinsic cortical connections in macaque visual area V2: evidence for interaction between different functional streams. J.Comp.Neurol., 342:551-570.
Livingstone M.S. and Hubel D.H. (1984) Specificity of intrinsic connections in primate primary visual cortex. J.Neurosci., 4:2830-2835.
Livingstone M.S. and Hubel D.H. (1988) Segregation of form, color, movement, and depth: anatomy, physiology, and perception. Science, 240:740-749.
Luhmann H.J., Singer W. and Martinez-Millan L. (1989) Horizontal interactions in cat striate cortex: I. Anatomical substrate and postnatal development. Eur.J.Neurosci., 2:344-357.
Lund J.S., Hawken M.J. and Parker A.J. (1988) Local circuit neurons of Macaque monkey striate cortex: II. Neurons of laminae 5B and 6. J.Comp.Neurol., 276:1-29.
Lund J.S., Henry G.H., Macqueen C.L. and Harvey A.R. (1979) Anatomical organization of the primary visual cortex (area 17) of the cat. A comparison with area 17 of the macaque monkey. J.Comp.Neurol., 184:599-618.
Lund J.S., Yoshioka T. and Levitt J.B. (1994) Substrates for interlaminar connections in area V1 of macaque monkey cerebral cortex. In: Cerebral Cortex, A. Peters and K.S. Rockland (Eds.), Plenum Press, New York, Vol.10, 37-60.
Malach R., Amir Y., Harel M. and Grinvald A. (1993) Relationship between intrinsic connections and functional architecture revealed by optical imaging and in vivo targeted biocytin injections in primate striate cortex. Proc.Natl.Acad.Sci.(USA), 90:10469-10473.
Malach R., Tootell R.B.H. and Malonek D. (1994) Relationship between orientation domains, cytochrome oxidase stripes, and intrinsic horizontal connections in Squirrel monkey area V2. Cerebral Cortex, 4:151-165.
Martin K.A.C. (1988) From single cells to simple circuits in the cerebral cortex. Q.J.Exp.Physiol., 73:637-702.
Martin K.A.C. and Whitteridge D. (1984) Form, function and intracortical projections of spiny neurones in the striate visual cortex of the cat. J.Physiol.(London), 353:463-504.
Martin K.A.C., Somogyi P. and Whitteridge D. (1983) Physiological and morphological properties of identified basket cells in the cat's visual cortex. Exp.Brain Res., 50:193-200.
Mason A., Nicoll A. and Stratford K. (1991) Synaptic transmission between individual pyramidal neurons of the rat visual cortex in vitro. J.Neurosci., 11:72-84.
Matsubara J. and Boyd J. (1992) Presence of GABA-immunoreactive neurons within intracortical patches in area 18 of the cat. Brain Res., 583:161-170.
Matsubara J.A., Cynader M. and Swindale N.V. (1987) Anatomical properties and physiological correlates of the intrinsic connections in cat area 18. J.Neurosci., 7:1428-1446.
Matsubara J.A., Cynader M., Swindale N.V. and Stryker M.P. (1985) Intrinsic projections within visual cortex: evidence for orientation specific local connections. Proc.Natl.Acad.Sci.(USA), 82:935-939.
Merigan W.H. and Maunsell J.H.R. (1993) How parallel are the primate visual pathways? Ann.Rev.Neurosci., 16:369-402.
Michalski A., Gerstein G.L., Czarkowska J. and Tarnecki R. (1983) Interactions between cat striate cortex neurons. Exp.Brain Res., 51:97-107.
Miles R., Toth K., Gulyas A.I., Hajos N. and Freund T.F. (1994) Functional differences between dendritic and perisomatic hippocampal inhibition. Soc.Neurosci., 20:71 OP.
Mitchison G. and Crick F. (1982) Long axons within the striate cortex: Their distribution, orientation, and patterns of connection. Proc.Natl.Acad.Sci.(USA), 79:3661-3665.
Morrone M.C., Burr D.C. and Maffei L. (1982) Functional implications of cross-orientation inhibition of cortical visual cells. I. Neurophysiological evidence. Proc.R.Soc.Lond.B., 216:335-354.
Mountcastle V.B. (1957) Modality and topographic properties of single neurons of cat's somatic sensory cortex. J.Neurophysiol., 20:408-434.
Nelson J.I. and Frost B.J. (1985) Intracortical facilitation among co-oriented, co-axially aligned simple cells in cat striate cortex. Exp.Brain Res., 61:54-61.
Peters A. and Regidor J. (1981) A reassessment of the forms of nonpyramidal neurons in area 17 of cat visual cortex. J.Comp.Neurol., 203:685-716.
Ramoa A.S., Shadlen M., Skottun B.C. and Freeman R.D. (1986) A comparison of inhibition in orientation and spatial frequency selectivity of cat visual cortex. Nature, 321:237-239.
Ramon y Cajal S. (1899) Estudios sobre la corteza cerebral humana. Revista Trimestral Micrographica, 4:1-63.
Rao S.C., Toth L.J., Sheth B.R. and Sur M. (1994) Invariance of orientation domains in area 17 of cat visual cortex revealed by intrinsic signal imaging. Soc.Neurosci., 20:836P.
Rockland K.S. and Lund J.S. (1982) Widespread periodic intrinsic connections in the tree shrew visual cortex. Science, 215:1532-1534.
Sillito A.M. (1975) The contribution of inhibitory mechanisms to the receptive field properties of neurones in the striate cortex of the cat. J.Physiol.(London), 250:305-322.
Sillito A.M. (1979) Inhibitory mechanisms influencing complex cell orientation selectivity and their modification at high resting discharge levels. J.Physiol.(London), 289:33-53.
Somogyi P., Kisvarday Z.F., Martin K.A.C. and Whitteridge D. (1983) Synaptic connections of morphologically identified and physiologically characterized large basket cells in the striate cortex of cat. Neurosci., 10:261-294.
Stone J., Dreher B. and Leventhal A. (1979) Hierarchical and parallel mechanisms in the organization of visual cortex. Brain Res.Rev., 1:345-394.
Szentagothai J. (1965) The use of degeneration methods in the investigation of short neuronal connections. Progr.Brain Res., 14:1-32.
Szentagothai J. (1973) Synaptology of the visual cortex. In: Handbook of Sensory Physiology, R. Jung (ed.), VII/3B, Heidelberg: Springer-Verlag, pp. 269-324.
Szentagothai J. (1975) The "module-concept" in cerebral cortex architecture. Brain Res., 95:475-496.
Szentagothai J. and Arbib M.A. (1974) Conceptual models of neuronal organization. Neurosciences Res.Prog.Bull., 12:307-510.
Thomson A.M., Girdlestone D. and West D.C. (1988) Voltage-dependent currents prolong single-axon postsynaptic potentials in layer III pyramidal neurons in rat neocortical slices. J.Neurophysiol., 60:1896-1907.
Ts'o D.Y., Gilbert C.D. and Wiesel T.N. (1986) Relationships between horizontal interactions and functional architecture in cat striate cortex as revealed by cross-correlation analysis. J.Neurosci., 6:1160-1170.
Volgushev M., Pei X., Vidyasagar T.R. and Creutzfeldt O.D. (1993) Excitation and inhibition in cortical orientation selectivity revealed by whole cell recordings in vivo. Vis.Neurosci., 10:1151-1155.
Yoshioka T., Levitt J.B. and Lund J.S. (1992) Intrinsic lattice connections of macaque monkey visual cortical area V4. J.Neurosci., 12:2785-2802.
Brain Theory - Biological Basis and Computational Principles A. Aertsen and V. Braitenberg (Editors) © 1996 Elsevier Science B.V. All rights reserved.
Fast cortical dynamics: Receptive field plasticity, synaptic mechanisms and perceptual consequences*
Charles D. Gilbert and Aniruddha Das
The Rockefeller University, 1230 York Avenue, New York, NY 10021-6399
A new view of the cortical mechanisms of sensory processing has emerged, in which even in adult cortex, and even at early stages in the visual pathway, the response properties of cells are dynamic, influenced by experience and by the context within which features are presented. In this view the receptive field characteristics of individual cells, and the topography of cortex, are mutable. The functional dynamics of cortex can be documented at many levels, including alteration of cortical connections, receptive field structure, cortical functional architecture and perception. This plasticity has been seen to take place over a wide range of time scales, from seconds to months, and its time course has obvious implications for its functional role.

1. LONG TERM PLASTICITY

The largest scale changes in the topography of adult visual cortex have been observed following the placement of binocular retinal lesions. After these lesions, a large area of cortex is silenced, becoming unresponsive to visual stimulation. This silenced area is referred to as a cortical scotoma. Over a period of roughly two months, a scotoma that is initially 6 to 8 mm in diameter shrinks progressively and finally disappears. The visual input in the recovered region comes from the part of the retina outside the lesion, such that there is an increase in the cortical representation of the perilesion retina, and a disappearance of the representation of the lesioned retina (Gilbert et al., 1990; Gilbert and Wiesel, 1992; Kaas et al., 1990; Heinen and Skavenski, 1991). The mechanism of the cortical rearrangement lies in the cortex: at a time when the recovery of the cortical scotoma is complete, there is still a large visually unresponsive area in the lateral geniculate nucleus (Gilbert and Wiesel, 1992; Darian-Smith and Gilbert, 1995).
Within the cortex, the connections responsible for providing visual input to the original cortical scotoma are likely to be the long range horizontal connections formed by cortical pyramidal cells, as opposed to thalamocortical connections, since the extent of the cortical rearrangement approximates that of the horizontal arbors, and is several times greater than that of the geniculocortical arbors. The geniculocortical connections do not sprout beyond their normal territories to mediate the topographic rearrangement (Darian-Smith and Gilbert, 1995). The long term changes in cortical topography appear to involve sprouting of the collaterals of the horizontally projecting axons and synaptogenesis (Darian-Smith and Gilbert, 1994). Although the horizontal connections are of sufficient breadth normally to extend from outside a 6 to 8 mm diameter cortical scotoma to its center, their influence is greatly strengthened by adding collaterals to existing collateral clusters, rather than by extending their overall range.

Though the complete recovery of a cortical scotoma takes a few months, rather striking changes in receptive field size are seen even within minutes after making the retinal lesions. Cells whose receptive fields originate within and near the boundary of the retinal lesion show a remarkable change in receptive field structure. Their fields increase by an order of magnitude in area, and in addition shift to positions just outside the lesioned area (Gilbert and Wiesel, 1992; Chino et al., 1992). The magnitude of the shift is considerably smaller than that observed after long periods post-lesion, perhaps 1 mm in terms of cortical distance, as compared to 3-4 mm in the long term. The increase in receptive field size, however, is larger than that seen with longer survivals, suggesting a possible sequence of expansion followed by consolidation and shrinkage.

* This work was supported by National Institutes of Health Grant EY07968 and a McKnight Development Award. This work is taken in part from the following publications: Gilbert 1992, 1993; Das and Gilbert 1995.

2. SHORT TERM PLASTICITY: RECEPTIVE FIELD EXPANSION

The tale of the effects of retinal lesions raised suspicions that equivalent mutability of receptive field structure and position might be operating under normal circumstances of visual experience. The lesion itself does not lead to degeneration of cortical inputs; in fact, the retinal ganglion cell layer is left intact by the diode laser that makes the lesion.
Rather, it is a cessation of visual activity coming from anatomically maintained inputs. In addition, the relatively short time course of the immediate term changes in receptive field size and position suggests that changes might occur with short periods of visual stimulation. We therefore attempted to mimic the effects of retinal lesion by using a pattern of visual stimulation consisting of a dynamic random dot display or moving lines, within which is an occluded area, an "artificial scotoma". When the display is placed such that a cell's receptive field is centered within the artificial scotoma, after a few minutes of conditioning the receptive field expands to several times its original diameter (Pettet and Gilbert, 1992). As shown in the example in Figure 1, one sees responses beyond the boundary of the original receptive field. The expansion of receptive fields is seen frequently in striate cortex. In a study where pairs of neurons were studied in order to measure changes in the strength of connections between them, approximately 75% of the pairs showed receptive field expansion. Expansion was observed early in experiments, when the cells appeared to be the most healthy in terms of the briskness of response. Where expansion was not observed, the receptive fields were initially aberrantly large, suggesting that they had already expanded and were fixed in their expanded state.
[Figure 1, plot panels: response histograms plotted against angular distance from center (axis ticks -4 to 3). Left panel legend: before, during artificial scotoma. Right panel legend: during artificial scotoma, during center stimulation.]
Figure 1. Effect of an artificial scotoma on receptive field size. Top: the conditioning stimulus, consisting of a pattern of lines (or a random dot array; see Figure 2) moving outside the receptive field (the square with a solid outline). The lines disappear when they move within the masked area (dotted line, though not drawn in the actual stimulus; the lines presented in the stimulus are shown as blackened rectangles, and their disappearance indicated by the open rectangles). Bottom left: After 10 minutes of conditioning, the receptive field size is again measured, and shows expansion. The histograms are response profiles of the cell along the orientation axis. The boundaries obtained by hand mapping are shown below. Bottom right: Stimulating the center of the receptive field causes it to collapse in size. The receptive field can be caused to alternately expand and contract by a sequence of surround followed by center stimulation. (Adapted from Pettet and Gilbert, 1992, Figs. 1 and 3.)
The effective stimulus for inducing expansion includes a variety of patterns in the receptive field surround, including both randomly placed drifting bars and a dynamic pattern of random dots. The random-dot conditioning stimulus leads to receptive field expansion as effectively as the moving-bar conditioning stimuli. The expansion is labile, with receptive field borders shrinking back to their original sizes when neurons are stimulated through their receptive field centers after the first few minutes of conditioning. After prolonged conditioning, however, receptive fields tend to become locked in an expanded state.

Expansion is asymmetric about the receptive field centers. Though it typically stops at the boundaries of the artificial scotoma, for roughly a third of the sample in which expansion is observed, one receptive field or both expand far beyond the scotoma boundaries. There is no consistent bias in this asymmetry, with some cells expanding more towards the center of the scotoma and others expanding more towards its boundary. The orientation preferences and orientation selectivities of cells with expanded receptive fields do not change significantly relative to the pre-expanded state. Although neuron firing rates typically increase with receptive field expansion, this is not invariably the case. In roughly one third of the cases tabulated, the firing rate of one or both neurons either remained the same or decreased after conditioning. In these instances the expansion of receptive field size is clearly not a multiplication of the response profile of the cell by a constant, but a selective increase in sensitivity over a restricted portion of the receptive field.

The salient features of the receptive field expansion are:

1) Following expansion, one can activate the cell with a stimulus consisting of a single line segment at visual field locations where there was no response previously. Though most cells also show an increased response in locations corresponding to the center of the original receptive field, roughly one third of the cells show no change or a decrease in responsiveness of the original receptive field. Furthermore, cells which have inhibitory flanks surrounding the receptive field often become excitatory at these locations. Therefore a simple "gain control" is not an appropriate model of the expansion. One may consider the alteration in receptive field structure as a "plastic" change, because the identical stimulus which before conditioning does not elicit a response does give a response after conditioning. This is in distinction to the modulatory nature of "contextual" effects, where the simultaneous presentation of stimuli outside and inside the receptive field gives a different response than when the stimulus within the receptive field is presented alone. Such a contextual effect could be mediated by non-linear interactions between different inputs to a cell, without involving any change in effective connection strengths.

2) Clearly, the expansion involves interactions between disparate visual field locations: stimulation well outside the receptive field induces responses at unstimulated locations. This "action-at-a-distance" is unlike traditional concepts of contrast gain control, which are strictly local and opposite in sign to the effects described here.

3) Various synaptic mechanisms may account for the observed receptive field changes, though they must account for the fact that the receptive field expansion and the changes in effective connection strength are sustained, generally only reversed by stimulation within the receptive field.
3. MECHANISMS UNDERLYING RECEPTIVE FIELD PLASTICITY

The fact that cells are capable of expanding their receptive fields suggests that they have access to input from a larger part of the visual field than one would expect from measuring their original receptive fields. There are several classes of connections that have the potential of propagating visual information between disparate visuotopic locations on the cortical map. Prominent among these is the class of connections known as the long range horizontal connections formed by cortical pyramidal cells. The axon collaterals of these cells run for distances of 6 to 8 mm parallel to the cortical surface within area 17 (Gilbert and Wiesel, 1979, 1983, 1989; Rockland and Lund, 1982, 1983; Martin and Whitteridge, 1984). These connections allow individual cells to integrate information from a wide area of cortex, and as a consequence of the topographical architecture of cortex, from a large part of the visual field, including loci outside the receptive field. In any one column of the primary visual cortex, cells have overlapping receptive fields. Taking together the receptive field area and the scatter in receptive field position, the receptive fields of all the cells in the column will cover a tiny fraction of the visual field. A rough rule of thumb governing topographic order in this area is that over a distance of about 1.5 mm there is a shift in receptive field position such that there is no overlap in the receptive fields of cells separated by this distance (also corresponding to two complete cycles of orientation columns, or two "hypercolumns," Figure 3, Hubel and Wiesel, 1974). Thus horizontal connections spanning 6 to 8 mm allow communication between cells with widely separated receptive fields. This raises the puzzling finding that cells integrate information over a larger part of visual space than that covered by their receptive fields, and calls into question the very definition of receptive field.
The explanation for this seeming contradiction between cortical topography and receptive field structure is that the definition of the receptive field is stimulus dependent, and that a cell's response can be modulated by stimuli lying outside of the classical receptive field. Put another way, a cell's response to a complex visual stimulus cannot be fully predicted from its response to a simple stimulus, such as a single short line segment.

Though the horizontal connections are very widespread, they are quite specific in terms of the functional properties of the target cells. Rather than contacting all cells within a certain radius, the axon collaterals of the horizontally projecting cells are distributed in discrete clusters. The clustering implies a possible relationship to the functional architecture of the cortex: the tendency for cells with similar functional properties to be grouped into columns of similar functional specificity. In the primary visual cortex, cells with common orientation selectivity and eye preference are distributed in this fashion. Furthermore, as one moves across the cortical surface there is a systematic clockwise or counterclockwise shift in orientation preference and a progressive shift from left to right eye dominance. Several lines of evidence show that the clustering of the horizontal connections allows them to mediate communication between columns of similar orientation preference. The spacing between clusters is roughly the same distance as that required to run through a full cycle of orientation columns (or one hypercolumn for orientation, which is about 750 µm wide). A physiological technique that demonstrated the functional relationship of cells communicating via the horizontal connections is that of cross-correlation analysis. A cross-correlogram is a histogram of differences in the spike times for a pair of cells. Cells that are connected, or that share a common input, will show a peak in this histogram at a particular delay. Looking across a population of cells, cross-correlation analysis shows that cells in columns of similar functional specificity showed correlated firing, even when separated by distances as great as 2 mm (Ts'o et al., 1986; Ts'o and Gilbert, 1988).

To explore the changes responsible for the receptive field expansion at the level of cortical circuitry, we used the same technique of cross-correlation analysis. This approach allows one to obtain a measure of the effective connection strength between pairs of neurons recorded extracellularly. We used this approach to look at the changes in connection strength accompanying receptive field expansion. The work described below is taken from the study of Das and Gilbert (1995). When a neuron's receptive field expanded with conditioning, we generally saw an increase in the peak height and area of the associated shift-corrected cross-correlogram, whereas there was no increase in the peak when its receptive field failed to expand. The same pattern of changes in the cross-correlogram could also have been due, however, to the observed increases in spike rate with receptive field expansion. To correct for this influence of spike rate on correlogram peaks, we utilized a combination of approaches. First, we adjusted the contrast of the stimuli before and during expansion to keep the neurons' firing rates relatively constant. Second, we normalized the correlograms by the level of the flanks, or baselines, of the correlograms. Calculations over our data set indicated that flank-normalization could adequately compensate for variations of ± 50% in the average neuron firing rate.
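As an illustration of the definitions above, the following Python sketch builds a cross-correlogram as a histogram of spike-time differences and then divides it by its flank level. It is not the procedure of Das and Gilbert (1995): the function names, the 1 ms bin width, the ±50 ms lag window and the choice of flank bins (lags of 30 ms or more from zero) are all assumptions made for this example, and the shift correction mentioned in the text is omitted.

```python
def cross_correlogram(spikes_a_ms, spikes_b_ms, max_lag_ms=50.0, bin_ms=1.0):
    """Histogram of spike-time differences (t_b - t_a), in milliseconds.

    Returns (bin_centers, counts). Window and bin width are illustrative
    assumptions, not values taken from the text.
    """
    n_bins = int(round(2 * max_lag_ms / bin_ms))
    counts = [0] * n_bins
    for ta in spikes_a_ms:
        for tb in spikes_b_ms:
            lag = tb - ta
            idx = int((lag + max_lag_ms) // bin_ms)
            if 0 <= idx < n_bins:  # keep only lags inside the window
                counts[idx] += 1
    centers = [-max_lag_ms + (i + 0.5) * bin_ms for i in range(n_bins)]
    return centers, counts


def flank_normalize(centers_ms, counts, flank_start_ms=30.0):
    """Divide every bin by the mean count in the flanks of the correlogram.

    The flanks are taken here (an assumption) to be all bins whose lag is
    at least flank_start_ms away from zero.
    """
    flank = [c for t, c in zip(centers_ms, counts) if abs(t) >= flank_start_ms]
    if not flank:
        raise ValueError("no flank bins inside the lag window")
    baseline = sum(flank) / len(flank)
    if baseline == 0:
        raise ValueError("flank baseline is zero; cannot normalize")
    return [c / baseline for c in counts]


# Hypothetical spike trains: cell B fires 5.5 ms after each spike of cell A.
spikes_a = [100.0, 300.0, 500.0, 700.0]
spikes_b = [t + 5.5 for t in spikes_a]
centers, counts = cross_correlogram(spikes_a, spikes_b)
```

The motivation for dividing by the flanks is that the flank level tracks the rate of chance coincidences, which grows with the firing rates of both cells; a normalized correlogram that is flat at 1 therefore indicates no correlation beyond chance.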
Flank-normalization was thus used to correct for the fluctuations in firing rate that remained despite modifying test bar contrast. In those cases where the average neuronal firing rate changed by more than a factor of 2 on conditioning, results were discarded since it was not clear that normalization could adequately compensate for such a large change. All results reported in the following section are derived from flank-normalized cross-correlation histograms, i.e. NCCH. The process of normalization is quite important to validate the results we present. For a detailed treatment of the normalization technique, see Das and Gilbert (1995).

3.1 Change in effective connectivity with expansion

Using the cross-correlation technique one can establish that effective connectivity consistently increases with successful conditioning while remaining stationary or decreasing when receptive fields fail to expand. As described above, the cross-correlation technique involves recording from a pair of cells. In the experiments investigating changes in effective connection strength with receptive field expansion, we positioned the artificial scotoma over the receptive fields of both cells, as illustrated in Figure 2. Cells were visually driven by one test bar moving through the region of highest receptive field overlap (the "common region") or by a pair of bars in the distal nonoverlapping regions of the receptive fields (the "distal regions"), and the visually driven responses were cross-correlated.

Figure 2. Test and conditioning stimuli in relation to the receptive fields. Top: Layout of stimuli in relation to receptive fields. The two black rectangles represent minimum response field borders for a pair of receptive fields before conditioning, with the adjacent pair of oblique lines next to each indicating the corresponding orientation limits. The small filled bar represents the test stimulus traversing the "common" receptive field region while the pair of clear bars represent the stimulus pair traversing the "distal" regions. Dotted lines mark the limits of each test bar trajectory (1 sweep/2 sec, speed: 0.75 deg/sec). Test bar size, traverse and speed were chosen close to the optimum values determined earlier using a hand-held projector. Bottom: Conditioning stimulus in relation to receptive fields. The neuron pair (depicted in the upper figure) was conditioned by filling the monitor screen with a pattern of random dots everywhere except for an "artificial scotoma" indicated by the large dark gray rectangular area enclosing the receptive fields. Luminosity inside the artificial scotoma was equal to the average luminosity of the random dot field. After conditioning, the receptive fields expanded to the boundaries indicated by the light gray rectangular outlines. Pairs of oblique dashed lines indicate the orientation limits, essentially unaltered after conditioning. In this example the receptive fields expanded away from each other. (Taken from Das and Gilbert, 1995.)

An example of the change in correlation strength observed when the fields undergo expansion is shown in Figure 3. Here, the receptive field areas of the two neurons increased by a factor of 4.4 in one cell and 7.1 in the other. Correspondingly, the area enclosed by the peak of the NCCH increased by a factor of 2.4. When receptive fields can be collapsed to their original size by stimulating the field center, the correlogram also returns to a distribution remarkably similar to its original state. As shown with the neuron pair treated in Figure 4, with receptive field expansion the area of the cross-correlogram peak increased by a factor of 1.7, and after the receptive field collapsed, the peak area and shape also returned to their values before conditioning. In contrast, in cases where conditioning did not lead to an increase in receptive field size, we found no increase in the effective connectivity. A subset of the neuron pairs that did increase receptive field size on conditioning did not show the corresponding change in effective connectivity. Invariably, in such cases, the tips of the recording electrodes were close to each other (<250 µm where measured), thereby accounting for a significant overlap of the receptive fields before expansion and, consequently, even greater overlap after expansion.

Results for the entire population of 37 neuron pairs are shown in Figure 5. A number of cell pairs have more than one data point associated with them, either because correlograms were obtained at both the common and distal stimulus locations, or because we could successfully compare NCCHs for the receptive fields in their original, expanded and subsequently collapsed states. The values of p, the ratio of peak expansion, are seen in Figure 5 to divide along the same categories of receptive field expansion and overlap alluded to earlier. Neurons that expanded their receptive fields with conditioning increase, on average, the size of their NCCH peak, giving higher values of p than neuron pairs whose receptive fields did not expand.
Over the entire population of pairs showing expanded receptive fields the p values range from 0.84 to 3.5. Within this group, the tendency to show increased connection strength depended on the extent of receptive field overlap before conditioning: The subset of neuron pairs with low initial receptive field overlap had p values ranging from 1.1 to 3. Those neuron pairs with a high initial receptive field overlap did not change their NCCHs despite substantial receptive field expansion, giving values of p clustered around 1. The neuron pairs that did not expand on conditioning showed no increase, and often a decrease, in p. The range of p values for this group, as well as for the preexpansion and postcollapse NCCH peak areas in the group showing reversible expansion, was 0.51 to 1.19 with one value at 1.54.

3.2 Effective connectivity as a function of the receptive field region swept by test bar(s)

By comparing cross-correlograms obtained using the common and distal test bar stimuli on each pair of neurons, we were able to make a number of local comparisons of effective connectivity, and the conditioning-induced changes in this quantity, between sub-regions of receptive fields. Two notable observations emerged from our comparisons. First, the connection strength measured in the overlapping regions of receptive fields was not uniformly higher or lower than that measured from the non-overlapping regions of the same pair of receptive fields. The average ratio of (connection strength measured using common bar)/(connection strength measured using distal bars) was 1.15 ± 0.33 (N = 22 cell pairs) before conditioning and 1.19 ± 0.76 (N = 22 cell pairs) after expansion.
[Figure 5 plot. Vertical axis: p, the ratio of normalized peak areas after conditioning to before conditioning (tick marks at 1.0, 2.0 and 3.0). Column labels: RF Expansion, No or Partial RF Overlap; RF Expansion, Full Overlap; No Expansion.]
Figure 5. Ratio p of peak areas for shift-corrected, flank-normalized cross-correlograms (NCCH) over the total population of neuron pairs. The first two columns show neuron pairs whose receptive fields expanded with conditioning, with p values ranging from 0.84 to 3.5 (mean = 1.57 ± 0.47, N = 44). All neuron pairs in the first column started with non-overlapping or weakly overlapping receptive fields. Values of p range from 1.1 to 3.5 (mean = 1.72 ± 0.53, N = 31). The second column shows neuron pairs starting with overlapping receptive fields. Here, values of p range from 0.8 to 1.8 (mean = 1.2 ± 0.23, N = 13). The third column shows p values for receptive field pairs that did not expand with conditioning. Values of p range from 0.5 to 1.54 (mean = 0.81 ± 0.24, N = 21). For each column the receptive field response to conditioning is indicated by a sketch of a representative receptive field pair. (Taken from Das and Gilbert, 1995)
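As a rough illustration of how a ratio such as p could be computed, the following Python sketch takes a raw cross-correlogram and its shift predictor, subtracts the predictor, divides by the mean height of the raw correlogram flanks, and sums the bins under the central peak. The normalization recipe, the window parameters (`peak_half_width`, `flank_min_lag`), the synthetic data and all function names are assumptions made for this example; the actual procedure is the one described by Das and Gilbert (1995), which is not reproduced here.

```python
import numpy as np


def normalized_peak_area(raw_cch, shift_predictor, lags,
                         peak_half_width=15.0, flank_min_lag=30.0):
    """Peak area of a shift-corrected, flank-normalized correlogram.

    Assumed recipe (illustrative only): subtract the shift predictor,
    divide by the mean raw count in the flanks, sum bins in the peak window.
    """
    raw = np.asarray(raw_cch, dtype=float)
    shift = np.asarray(shift_predictor, dtype=float)
    lags = np.asarray(lags, dtype=float)

    flank = np.abs(lags) >= flank_min_lag      # bins far from zero lag
    peak = np.abs(lags) <= peak_half_width     # bins under the central peak

    flank_level = raw[flank].mean()            # scales with firing rates
    ncch = (raw - shift) / flank_level         # normalized correlogram
    return ncch[peak].sum()


def peak_ratio(before, after, lags, **window):
    """p = normalized peak area after conditioning / before conditioning.

    `before` and `after` are (raw_cch, shift_predictor) pairs.
    """
    area_before = normalized_peak_area(*before, lags, **window)
    area_after = normalized_peak_area(*after, lags, **window)
    return area_after / area_before


# Synthetic example (made-up numbers): firing rates double, so the flanks go
# from 100 to 200 counts, and the correlated peak grows by a further factor
# of 1.5 beyond what the rate change alone would produce.
lags = np.arange(-50, 51, dtype=float)           # ms, 1 ms bins
bump = np.exp(-0.5 * (lags / 5.0) ** 2)          # peak centered on zero lag
before = (100.0 + 20.0 * bump, np.full(lags.shape, 100.0))
after = (200.0 + 60.0 * bump, np.full(lags.shape, 200.0))

p = peak_ratio(before, after, lags)
print(round(p, 3))  # prints 1.5
```

In this toy setting, a doubling of firing rate with no change in the relative strength of the correlated peak would give p = 1, so the value of 1.5 reflects only the additional growth of the peak.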
There were, however, distinct hot spots where connection strength increased with expansion. For a given neuron pair, the value of p obtained from one receptive field region could be as much as three times larger than the value obtained from the other. These regions of high p appeared to coincide with the direction of greater receptive field expansion, even for those neuron pairs where receptive field expansion was predominantly away from the region of high overlap. In general, when receptive field centroids moved away from each other after conditioning, the region of no receptive field overlap gave a p value significantly higher than the region of high overlap; the reverse was true when receptive field centroids moved towards each other. With most neuron pairs we obtained CCHs with a shape and peak position indicating that the measured correlation was mainly due to common input, with no significant contribution from direct monosynaptic connections linking the given neuron pair. Moreover, for most neuron pairs conditioning failed to produce any change in peak position or in the measured Full Width at Half Maximum (FWHM), independent of receptive field expansion. In only 2 neuron pairs did we find correlogram peaks shifted significantly away from t=0, and in one pair we observed a sharpening and shift of the peak with receptive field expansion. The conclusion one can draw from these results is that one can observe, over a short time scale, changes in receptive field structure, and that accompanying these changes in the functional properties of cortical cells is an alteration in the strength of connections within the cortical circuit. Existing connections, which under normal circumstances may exert a modulatory influence on cortical cells, change their effective strength to a suprathreshold, driving influence.
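The Full Width at Half Maximum referred to above can be estimated from a binned correlogram along the lines sketched below. This is a minimal illustration rather than the procedure used in the experiments: the function name `fwhm`, the linear interpolation between neighboring bins, and the requirement that the correlogram be baseline-corrected so that its peak is positive are all assumptions made for this example.

```python
import numpy as np


def fwhm(lags, values):
    """Full width at half maximum of a single-peaked, baseline-corrected curve.

    Walks outward from the highest bin on each side until the curve falls to
    half of the peak height, then interpolates linearly between the two
    neighboring bins. Returns NaN if either side never reaches half maximum.
    """
    lags = np.asarray(lags, dtype=float)
    values = np.asarray(values, dtype=float)
    i_peak = int(np.argmax(values))
    half = values[i_peak] / 2.0

    def crossing(step):
        i = i_peak
        # Advance while the next bin is still above half maximum.
        while 0 <= i + step < len(values) and values[i + step] > half:
            i += step
        j = i + step
        if j < 0 or j >= len(values):
            return np.nan  # ran off the end of the correlogram
        # Bin i is above half maximum, bin j is at or below it.
        frac = (values[i] - half) / (values[i] - values[j])
        return lags[i] + frac * (lags[j] - lags[i])

    return crossing(+1) - crossing(-1)


# Synthetic triangular peak: height 10 at zero lag, falling by 2 per ms.
lags = np.arange(-10, 11, dtype=float)
tri = np.maximum(0.0, 10.0 - 2.0 * np.abs(lags))
print(float(fwhm(lags, tri)))  # prints 5.0
```

Comparing this width before and after conditioning is one simple way to check whether a correlogram peak has broadened or sharpened.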
3.3 Synaptic mechanisms of receptive field plasticity

Our finding that receptive field expansion after conditioning is accompanied by an increase in effective intracortical connection strengths is inferred from an increase in the peak areas of flank-normalized cross-correlograms (NCCH) of neuron pairs in superficial layers of cat primary visual cortex. The effective connection strength obtained using this measure does not increase with conditioning if the receptive fields of the neuron pair under study fail to expand; connection strengths also return to their values before conditioning, dynamically, with the collapse of the neuronal receptive field. The receptive field expansion and associated increase in connection strength are considered a plastic change since they persist after the conditioning stimulus has been removed. This also distinguishes them from the class of phenomena that can be broadly categorized as context-dependent influences on the receptive field, where the responses of a cell to stimuli inside its classical receptive field can be modulated by concurrent stimuli outside the receptive field (Nelson and Frost, 1978, 1985; Gulyas et al., 1987; Gilbert and Wiesel, 1990) and which may be due to the non-linear summation of converging inputs to a neuron. Our results imply that receptive field expansion is linked to selective and local changes in inputs to conditioned neurons, and not just a generalized increase in neuron excitability.
Although receptive field expansion was often accompanied by increased neuronal firing rates, significant increases in receptive field area were observed in many cases with no associated increase in firing rate. Moreover, receptive field expansion was typically asymmetric, being attracted to, but generally stopping at, the borders of the artificial scotoma. The associated increase in effective connectivity was also quite inhomogeneous over the receptive fields, tending to be significantly higher in the subfields of greater receptive field expansion. As shown by Das and Gilbert (1995), flank normalization effectively compensates for changes in CCHs that are solely due to changes in neuron firing rate; thus it is a useful tool for studying these more selective mechanisms underlying receptive field expansion. At the same time, because it ignores any generalized increase in the effectiveness of all inputs to the conditioned neurons, it probably provides a lower bound on the total increase in connection strength with receptive field expansion. Inhomogeneity in the effect of conditioning may also explain the high variance in values of p over our data set, particularly for the neuron pairs that had low receptive field overlap before conditioning; the magnitude of p is possibly related to the proximity of the test stimulus to "hot spots" of change in synaptic input to the corresponding receptive fields. An increase in the value of p implies an increase in the proportion of common input to the total input driving a pair of neurons. This could underlie our observation that the subset of neuron pairs with significant receptive field overlap before expansion did not increase NCCH peak area despite large increases in receptive field area. When a pair of receptive fields is well separated before conditioning, an increase in the effective overlap could lead to a large increase in the fractional contribution of common input to each neuron's total input.
On the other hand, when two receptive fields are largely overlapping to start with, even a large increase in receptive field area does not significantly change the degree of overlap and is thus unlikely to seriously alter the proportion of common input to the neurons' total input. Several characteristics of the receptive field expansion and the associated change in NCCH peak area suggest that horizontal intracortical connections, with their specificity for receptive field properties, form the major substrate for the dynamic changes observed. The receptive field properties of orientation-preference, orientation-specificity and ocularity were invariably left unaffected by expansion. Pairs of neurons whose receptive fields differed by more than 40° in orientation, or had opposite ocularities, showed no cross-correlation before or after conditioning despite large increases in receptive field size. Our results suggest that the observed increase of effective connectivity for a given neuron is essentially restricted to this "scaffolding" of horizontal intracortical connections, in effect recruiting more input from within this network without any increase in input either from cortical neurons of different orientations, or from sub-cortical sources. Our results also suggest that the intracortical connections of primary importance to the observed plasticity are primarily in area V1. Although feedback connections from other visual areas also preferentially link cortical regions with similar receptive field properties, CCHs from pairs of neurons in V2 have significantly wider peaks (on the order of 50 ms)
that are also largely centered about t=0 (Ts'o et al., 1993). This makes it plausible that connections between neurons in V1 and V2 would also have wider CCHs of the common input type than those observed by us. Thus a significant increase in the contribution from feedback connections after conditioning should have led to consistent widening of the cross-correlogram peaks. We found cross-correlogram peak widths of 5 to 15 ms, consistent with neuron pairs within area V1; and our observation that the peak widths generally did not change with conditioning suggests that the role of feedback connections in receptive field expansion is not very large. Our results leave open, as yet, questions of whether the plasticity involves alteration of inhibitory or excitatory connections, and whether the effect is directly on the synapses of the recorded neurons or the inputs of antecedent cells. Horizontal axonal collaterals provide both direct intracortical excitatory input, and inhibitory input through local interneurons (Hirsch and Gilbert, 1991; McGuire et al., 1991). Thus, if conditioning affects the input to the neurons being studied, this may take the form of facilitation of the excitatory input from cells with receptive fields in the expanded part of the conditioned cell's receptive field, or adaptation of the local inhibitory input, arising from cells with receptive fields either within or outside of the area of the scotoma. Facilitation of the horizontal connections has been observed in slice preparations of primary visual cortex (Hirsch and Gilbert, 1993), but the connection between the use-dependent changes observed in vitro and the facilitation between neurons observed here has yet to be established. One important feature of the slice study is that facilitation is most readily observed where there is relatively little disynaptic inhibition in the horizontally evoked synaptic potential.
It is therefore possible that adaptation of inhibition may be an important preliminary step to facilitation of the excitatory synapses, and both mechanisms may be operating to cause the increase in effective connectivity observed here. Alternatively, the increase in connectivity may occur only on those pathways where there is relatively little inhibition in the interaction between cells. Results from a theoretical model that explores possible cortical mechanisms underlying dynamic receptive field expansion suggest that the balance of lateral inhibition and excitation in cortex is a critical determinant of the observed phenomenon (Xing and Gerstein, 1994). The Xing and Gerstein model indicates that receptive field behavior similar to that observed with artificial scotomata is possible only when cortical inhibition dominates but does not entirely overwhelm excitation, and where synaptic connections are allowed to adapt over a period of minutes. Whatever analogy one may draw between the current situation and known mechanisms of potentiation of connections, the mechanism must account for the action-at-a-distance characteristic of the expansion, where cells not directly activated by conditioning expand into their immediate surrounds when regions remote from their receptive field borders are stimulated. Any proposed mechanism of receptive field expansion must also allow for the sustained time course of this phenomenon, with receptive fields remaining in their expanded state until the receptive field center is stimulated. As mentioned earlier, receptive field expansion is initially very labile and the receptive field rapidly collapses on stimulation. Prolonged cycles of conditioning and testing lead, however, to the receptive fields remaining locked in an expanded state that is not easily reversible. This could be due to a
variety of factors. The irreversibility could be an artifact of the anesthesia in our animals, a possibility consistent with our observation that neurons firing less briskly tended to show greater irreversibility. Alternatively, we may be observing a plastic phenomenon that has both a labile, immediate component and a more long-term component. The immediate component may be relevant in short-term modification of perception, while the long-term one may be relevant to perceptual learning, which is overdriven by subjecting area V1 neurons to the abnormally long-lasting and unchanging conditioning stimulus.

4. PERCEPTUAL CONSEQUENCES OF RECEPTIVE FIELD PLASTICITY

The potential perceptual consequences of this mutability of receptive fields are several. First, it may mediate a process of normalization or calibration of the visual system that continues through adulthood, though importantly one involving a comparison of visual attributes appearing at different visual field locations. The presence of stimuli in one part of the visual field and not another induces the cortex to alter the amount of territory dedicated to positions within and outside the scotoma. Secondly, the mutability may play a role in perceptual constancies such as color constancy, which may involve a comparison of the responses of wavelength selective cells across the visual field in order to allow for changes in the spectral characteristics of the illuminating light. Thirdly, the expansion of receptive fields could be involved in segmenting the visual image, which is manifest in psychophysical studies as the host of perceptual fill-in phenomena ranging from illusory contours to fill-in of brightness, colors and textures (Yarbus, 1957; Krauskopf, 1961; Crane and Piantanida, 1983; Paradiso and Nakayama, 1991; Ramachandran and Gregory, 1991). A specific example of a perceptual consequence of receptive field expansion has been seen in local distortions of spatial perception (Kapadia et al., 1994).
The experiments by Kapadia et al. studied spatial localization around an artificial scotoma, and found that the ability to determine the position of short line segments was strongly biased toward the interior of the scotoma. This "shift" or misassignment of position can be attributed to receptive-field expansion within the artificial scotoma. The bias induced by an artificial scotoma is a separate phenomenon from the positional alterations of concurrently presented contextual stimuli (Badcock and Westheimer, 1985). An additional insight from the psychophysical studies is that the perceptual shifts begin within one second of stimulus presentation, suggesting that receptive fields are constantly altered by their local context and that these dynamics may be part of normal vision. Furthermore, it has become increasingly evident that visual perception can be modified by repeated performance of discrimination tasks, known as "perceptual learning." The characteristics of this learning suggest involvement of primary visual cortex. Until recently, it was commonly believed that within the early stages of sensory processing, the functional properties of neurons and the circuitry of sensory cortex are subject to experience early in cortical development, but are fixed in adulthood. It is obvious, however, that some form of neural plasticity must exist well into adulthood, because we continue to be capable of adapting to experience and of learning to recognize new objects.
One usually associates learning with the acquisition and storage of complex percepts, such as faces, which is generally believed to be an attribute of advanced stages of cortical processing. There is an accumulating body of evidence indicating that, quite to the contrary, even at the earliest stages of sensory processing neuronal functional specificity is mutable and subject to experience. The improvement in performance results from repeating a perceptual discrimination task many times, which involves exposure to a stimulus and evaluation of a particular visual attribute. Improvement in visual performance with repeated trials has been observed for a number of visual submodalities, including acuity, stereoacuity, texture, motion and orientation. There is every reason to believe that learning can be seen for any visual attribute. Hyperacuity, defined as a spatial threshold that is smaller than the grain of the sensory mosaic (often by an order of magnitude or more), shows improvement with training. One way of demonstrating hyperacuity is to ask an observer to judge an offset in the positions of two points or lines. McKee and Westheimer (1978) observed improvement in hyperacuity from 10" to 5" of arc that occurs over several days of training, and Poggio et al. (1992) found learning in hyperacuity over a few minutes. Stereoacuity, the ability to discriminate the relative depth of two points placed at different distances from the observer, shows an even greater improvement with practice (Westheimer and Truong, 1988). For a patterned stimulus, such as a compound sinusoidal grating, one can improve in the ability to discriminate the spatial phase or offset of the components of the grating (Fiorentini and Berardi, 1980). Texture discrimination shows improvement, as evidenced by the ability to determine, within a set period of time, the orientation of an array of oriented lines in a background array of differently oriented lines (Karni and Sagi, 1992).
Finally, motion discrimination, the ability to see differences in the direction of motion, also shows improvement with repeated trials (Ball and Sekuler, 1982, 1987). The key evidence for the idea that the neural substrate of the learning effects is found in primary sensory cortices comes from their specificity. If one obtains an improvement in discrimination at one location in the visual field, that improvement tends to be restricted to that position, and does not transfer to other locations. This is thought to involve early stages in the visual pathway where receptive fields are small, relative to those seen at higher levels. Other evidence that early stages are involved is lack of interocular transfer. Certainly, input from the two eyes is integrated at the single cell level very early in the visual pathway, within primary visual cortex. Whereas some learning effects do not show interocular transfer, some do. Cells at many stages along the visual pathway show selectivity for the orientation of bars or edges, and the most sharply orientation tuned cells are found within striate cortex. Many of the learning effects are quite selective for the orientation of the line elements used (Fiorentini and Berardi, 1980, 1981; Karni and Sagi, 1992; Poggio et al., 1992), or for the direction of movement about which the discrimination is made (Ball and Sekuler, 1982). Though it is tempting to use the experiments described above as evidence for involvement of early stages in learning effects, one should be aware of several caveats. One problem, for example, is how learning can be both orientation and eye specific, since the monocular cells of striate cortex lack orientation specificity (Hubel and Wiesel, 1977;
Livingstone and Hubel, 1984). Though as one goes higher in the hierarchy of visual processing, cells tend to lose specificity (increasing receptive field size, for example), it is possible to effectively reduce receptive field size in some visual cortical areas by attention or expectation (Moran and Desimone, 1985). Even if one normally associates higher visual cortical areas, such as inferotemporal cortex, with storage of more complex percepts, training can give cells in inferotemporal cortex a specificity for very simple features. For example, animals trained to discriminate the orientation of single lines will have cells in inferotemporal cortex that are selective for such stimuli (Vogels and Orban, 1991). Conceivably, therefore, one might confer greater specificity to cells at higher levels by training. Learning effects appear to depend on the precise configuration of the stimulus used in training, including the context surrounding the feature being discriminated. When stereoacuity is done in the presence of an array of surrounding dots, the learning in stereoacuity of a central dot is specific for the separation between central dot and surrounding array that is used during the training period (Coutant and Westheimer, 1993). Hyperacuity learning done with lines as the conditioning stimulus does not transfer to a hyperacuity task composed of dots (Poggio et al., 1992). It is difficult to imagine how the specificity to the more complex patterns could be based in primary visual cortex alone. There may be a distributed representation of the learned information, with the complex form represented in one cortical area and the information about its attributes, such as depth, represented in another. In the final analysis, the idea that perceptual learning has characteristics suggestive of the involvement of early visual processing is further supported by the recent findings of cortical plasticity with peripheral lesions.
Conversely, the work on perceptual learning suggests that the plasticity observed with retinal lesions may utilize mechanisms that are available for normal sensory processes, and were not exclusively developed for recovery of function after CNS damage.

5. CONCLUSIONS

The change in effective connectivity in striate cortex associated with receptive field expansion points towards a continuing experience-dependent modifiability of cortical circuits. The presence of exuberant connections as exemplified by the long range horizontal axon collaterals of cortical pyramidal cells had originally seemed to violate the principles of cortical functional architecture and receptive field structure. But the possibility that the physiological strength of the connections within an axonal field may be locally modifiable suggests a possible role for this exuberance, wherein the target cells at any one time only manifest a subset of the inputs and consequent receptive field characteristics from a potentially larger range. As seen in these experiments, the dynamic changes in "synaptic weight" can occur over very short time scales, and therefore can be useful in the ongoing analysis of visual scenes.
REFERENCES

Badcock DR, Westheimer G (1985) Spatial location and hyperacuity: the centre-surround localization function has two substrates. Vision Res 25:1259-1269.
Ball K, Sekuler R (1982) A specific and enduring improvement in visual motion discrimination. Science 218:697-698.
Ball K, Sekuler R (1987) Direction-specific improvement in motion discrimination. Vision Res 27:953-965.
Chino YM, Kaas JH, Smith III EL, Langston AL, Cheng H (1992) Rapid reorganization of cortical maps in adult cats following restricted deafferentation in retina. Vision Res 32:789-796.
Coutant BE, Westheimer G (1993) Population distribution of stereoscopic ability. Ophthalmic and Physiolog Optics 13:3-7.
Crane HD, Piantanida TP (1983) On seeing reddish green and yellowish blue. Science 221:1078-1079.
Darian-Smith C, Gilbert CD (1994) Axonal sprouting accompanies functional reorganization in adult cat striate cortex. Nature 368:737-740.
Darian-Smith C, Gilbert CD (1995) Topographic reorganization in the striate cortex of the adult cat and monkey is cortically mediated. J Neurosci, in press.
Das A, Gilbert CD (1995) Receptive field expansion in adult visual cortex is linked to dynamic changes in strength of intrinsic cortical connections. Submitted for publication.
Fiorentini A, Berardi N (1980) Perceptual learning specific for orientation and spatial frequency. Nature 287:43-44.
Fiorentini A, Berardi N (1981) Learning in grating waveform discrimination: Specificity for orientation and spatial frequency. Vision Res 21:1149-1158.
Gilbert CD, Hirsch JA, Wiesel TN (1990) Lateral interactions in visual cortex. Cold Spring Harbor Symposia on Quantitative Biology 55:663-677.
Gilbert CD, Wiesel TN (1979) Morphology and intracortical projections of functionally identified neurons in cat visual cortex. Nature 280:120-125.
Gilbert CD, Wiesel TN (1983) Clustered intrinsic connections in cat visual cortex. J Neurosci 3:1116-1133.
Gilbert CD, Wiesel TN (1989) Columnar specificity of intrinsic horizontal and corticocortical connections in cat visual cortex. J Neurosci 9:2432-2442.
Gilbert CD, Wiesel TN (1990) The influence of contextual stimuli on the orientation selectivity of cells in primary visual cortex of the cat. Vision Res 30:1689-1701.
Gilbert CD, Wiesel TN (1992) Receptive field dynamics in adult primary visual cortex. Nature 356:150-152.
Gulyas B, Orban GA, Duysens J, Maes H (1987) The suppressive influence of moving textured backgrounds on responses of cat striate neurons to moving bars. J Neurophysiol 57:1767-1791.
Heinen SJ, Skavenski AA (1991) Recovery of visual responses in foveal V1 neurons following bilateral foveal lesions in adult monkey. Exp Brain Res 83:670-674.
Hirsch JA, Gilbert CD (1991) Synaptic physiology of horizontal connections in the cat's visual cortex. J Neurosci 11:1800-1809.
Hirsch JA, Gilbert CD (1993) Long-term changes in synaptic strength along specific intrinsic pathways in the cat visual cortex. J Physiol 461:247-262.
Hubel DH, Wiesel TN (1974) Uniformity of monkey striate cortex: a parallel relationship between field size, scatter and magnification factor. J Comp Neurol 158:295-306.
Hubel DH, Wiesel TN (1977) Functional architecture of macaque striate cortex. Proc R Soc Lond (Biol) 198:1-59.
Kaas JH, Krubitzer LA, Chino YM, Langston AL, Polley EH, Blair N (1990) Reorganization of retinotopic cortical maps in adult mammals after lesions of the retina. Science 248:229-231.
Kapadia MK, Gilbert CD, Westheimer G (1994) A quantitative measure for short-term cortical plasticity in human vision. J Neurosci 14:451-457.
Karni A, Sagi D (1992) Where practice makes perfect in texture discrimination: evidence for primary visual cortex plasticity. Proc Natl Acad Sci 88:4966-4970.
Krauskopf J (1961) Heterochromatic stabilized images: A classroom demonstration. Am J Psychol 80:632-637.
Livingstone MS, Hubel DH (1984) Anatomy and physiology of a color system in the primate visual cortex. J Neurosci 4:309-356.
Martin KAC, Whitteridge D (1984) Form, function and intracortical projections of spiny neurones in the striate visual cortex of the cat. J Physiol 353:463-504.
McGuire BA, Gilbert CD, Rivlin PK, Wiesel TN (1991) Targets of horizontal connections in macaque primary visual cortex. J Comp Neurol 305:370-392.
McKee SP, Westheimer G (1978) Improvement in vernier acuity with practice. Vision Res 24:258-262.
Moran J, Desimone R (1985) Selective attention gates visual processing in the extrastriate cortex. Science 229:782-784.
Nelson JI, Frost BJ (1978) Orientation-selective inhibition from beyond the classic visual receptive field. Brain Res 139:359-365.
Nelson JI, Frost BJ (1985) Intracortical facilitation among co-oriented, co-axially aligned simple cells in cat striate cortex. Exp Brain Res 61:54-61.
Paradiso MA, Nakayama K (1991) Brightness perception and filling-in. Vision Res 31:1221-1236.
Pettet MW, Gilbert CD (1992) Dynamic changes in receptive-field size in cat primary visual cortex. Proc Natl Acad Sci 89:8366-8370.
Poggio T, Fahle M, Edelman S (1992) Fast perceptual learning in visual hyperacuity. Science 256:1018-1021.
Ramachandran VS, Gregory RL (1991) Perceptual filling in of artificially induced scotomas in human vision. Nature 350:699-702.
Rockland KS, Lund JS (1982) Widespread periodic intrinsic connections in the tree shrew visual cortex. Brain Res 169:19-40.
Rockland KS, Lund JS (1983) Intrinsic laminar lattice connections in primate visual cortex. J Comp Neurol 216:303-318.
Ts'o DY, Gilbert CD, Wiesel TN (1986) Relationships between horizontal and functional architecture in cat striate cortex as revealed by cross-correlation analysis. J Neurosci 6:1160-1170.
Ts'o DY, Gilbert CD (1988) The organization of chromatic and spatial interactions in the primate striate cortex. J Neurosci 8:1712-1727.
Ts'o DY, Roe AW, Shey J (1993) Functional connectivity within V1 and V2: patterns and dynamics. Abstr Soc Neurosci 19:1499.
Vogels R, Orban GA (1991) Training affects task-related properties of inferotemporal neurons. Abstr Soc Neurosci 17:1283.
Westheimer G, Truong T (1988) Target Crowding in Foveal and Peripheral Stereoacuity. Amer J Optom & Physiol Optics 65(5):395-399.
Xing J, Gerstein GL (1994) Simulation of dynamic receptive fields in primary visual cortex. Vision Res 34:1901-1911.
Yarbus AL (1957) The perception of an image fixed with respect to the retina. Biophysics 2:683-690.
Brain Theory - Biological Basis and Computational Principles A. Aertsen and V. Braitenberg (Editors) © 1996 Elsevier Science B.V. All rights reserved.
SPATIO-TEMPORAL DYNAMICS OF SYNAPTIC INTEGRATION IN CAT VISUAL CORTICAL RECEPTIVE FIELDS
Yves Fregnac and Vincent Bringuier
Equipe Cognisciences, Institut Alfred Fessard, CNRS, Gif sur Yvette, France
Two major constraints in connectivity determine the spatial extent of visual cortical receptive fields, both during development and in adult functioning. 1) Visual projections extrinsic to visual cortex are organized in an orderly manner to form a point to point mapping of the retina onto the cortical mantle. The retinal origin and the spatial weighting of the stream of retinogeniculo-cortical synapses feeding any mammalian area 17 cell are such that at the adult age the average size of the minimal discharge field (MDF) of first-order cortical neurons, where presentation of a stimulus can elicit suprathreshold firing, is of the order of 1° of solid angle (for the central visual field representation, review in Orban, 1984). 2) A second type of connectivity consists of long distance horizontal cortico-cortical connections and feedback cortico-geniculate projections which violate the retinotopy rules expressed by feedforward inputs both at the cortical and thalamic levels (review in Gilbert, 1992; Salin and Bullier, 1995). The functional role of these diffuse non-topographic "lateral" pathways is less well known: they are thought to exert a synchronizing action within and between each successive stage of integration, thus favoring the binding of local visual operations occurring simultaneously in different parts of the visual field (review in Singer and Gray, 1995). These constraints in wiring specificity, which may appear contradictory in terms of topological matching, represent the two faces of sensory integration in the retinogeniculo-cortical pathway. One is a funneling mechanism carrying out processing through local feedforward pathways and re-entering short-range connections; the second, mediated by the activation of long-range connections and feedback projections from higher centers, might be required to signal a global perceptual coherence during figure/background segregation (Crick, 1984; Treisman, 1977).
This neural search for enlightening pattern coherence and feature contrast could be based on relational processes which self-organize on the basis of optimization constraints imposed by the collective state of the output network
(Koechlin and Burnod, 1995; MacKay, 1995; Whitehead and Ballard, 1990). Recent experimental evidence, part of which will be reviewed below, suggests that the functional wiring of thalamo-cortical and cortico-cortical circuits (determined on the basis of "effective" links between cells) indeed is not fixed and is modulated in a dynamic manner by input features. Specific properties of the sensory image (e.g. co-linearity, iso-orientation, motion) may affect the balance between feedforward and intrinsic reafferent input, lateral input (Somers et al., 1994; Stemmler et al., 1995), and the feedback supervision exerted by higher centers. This latter supervising retroaction would be provided by the primary visual cortex in the case of thalamic cells (Cudeiro and Sillito, 1996; McClurkin et al., 1994; Sillito et al., 1994), and associative areas (Movshon et al., 1986) or subcortical structures (Casanova, 1993) in the case of area 17 cells. Other factors, more intrinsic to the considered cell, may also control the effectiveness of post-synaptic integration and depend on the spatial and temporal distribution of active inputs onto the target cell dendrite and on their interplay with membrane non-linearities. Furthermore, the synaptic connections themselves are subject to activity-dependent regulation during the life-time of the animal. A direct consequence of these different considerations is that the receptive field (RF) of neurons in visual pathways should not be considered as a static hardwired window probing the outer environment, but as an active filter which may continuously adapt and be updated as a function of global context and past experience. The activity dependency of synaptic connectivity may be manifest on different time scales: - Forced regimes of activity have been used to induce adaptive changes in the size and location of receptive fields and consequently in the conformation of the neural map representing the sensory space at the cortical level.
If one considers the "slow" time scale of epigenetic effects (hours/days), functional adaptation of RF properties (e.g. orientation selectivity, ocular dominance) has been shown to be dependent on long-lasting changes in synaptic efficiency (review in Fregnac et al., 1994c). The dominant phenomenology, which has been observed mostly in cortical networks, is very reminiscent of the Hebbian scheme of plasticity where temporal correlation between pre- and post-synaptic activity strengthens active synapses (Fregnac, 1995; Hebb, 1949). Such changes, demonstrated conclusively in somatosensory and visual cortex, might be considered as elementary memory traces acquired through sensorimotor exploration of the environment during critical period(s) of postnatal life (Merzenich and Sameshima, 1993). In addition to the high level of functional plasticity expressed in the developing cortex, the mature cortex retains a remarkable capacity to undergo functional reorganization. Cortical maps can be reconfigured following localized feedforward input deafferentation (Gilbert, 1992), or during active learning of a new
behavior of particular relevance for the organism (review in Barnes et al., 1994). Both types of data support the hypothesis that cortical receptive fields retain the capacity to re-express subsets of inputs which are silent during normal functioning, throughout postnatal life. - The second time scale on which previous activity might influence the synaptic integration process of sensory information is much faster, and occurs over a time-span on the order of hundreds of milliseconds, compatible with the construction of a percept. Milner and Von der Malsburg proposed hypothetical schemes where functional links could be reinforced as a function of feature associations, and lead to the transient formation of cellular assemblies through synchronization of activity (Milner, 1974; Von der Malsburg, 1981; Von der Malsburg and Bienenstock, 1986). When compared with more classical rate coding schemes (Barlow, 1972), the assumption made on theoretical grounds of "fast" reversible modulation of synaptic links has the advantage of keeping explicit relational representations of both the "parts" and the "whole", and therefore of being able to deal with compositionality of highly structured representations. Experimental support for the latter hypothesis has been obtained in the visual system where stimulus-induced neural oscillations were shown to synchronize across large cortical distances as a function of global features of the stimulus (Eckhorn et al., 1988; Gray et al., 1989). This latter description of recurrent waves of synchrony, already predicted by Abeles (1982), does not necessarily imply short-term synaptic plasticity. In a network as complex and intrinsically richly connected as visual cortex other types of modulation of coupling between cells can be achieved.
The theoretical work of Aertsen and colleagues has led to the concept that "effective" connectivity could be up- and down-regulated by the mean activity level of the rest of the network during the time taken by a visual stimulus to sweep across the RF, without requiring biophysical changes at the pre- or postsynaptic sites of the activated synapse (Aertsen et al., 1989; Boven and Aertsen, 1990). Although recent data further support the view that phase locking and synchrony of activity between cells are more relevant relational variables in our understanding of perceptual binding than oscillatory behavior of single neurons (review in Fregnac et al., 1994a), the cellular mechanisms which underlie the dynamics of collective behavior in cortical cell assemblies remain largely unknown. In contrast with previous knowledge based on extracellular recordings, the recent development of intracellular techniques in vivo (sharp electrode or "blind patch") would ideally allow experimenters to analyze and dissect the contribution of feedforward and lateral connectivity in the functional expression of a synaptic "integration field". We will present recent data which demonstrate that the visual receptive field of cortical neurons described at
the subthreshold level extends over much larger regions of the visual field than previously thought, and that the capacity of cells to amplify subthreshold responses could depend on the past history of membrane potential. Our results predict that associative presentation of peripheral and central input could lead to non-linear integrative processes which may account for fast modulation of synaptic input. Such a mechanism could be one of the diverse ways by which correlation of events can influence coupling between neuronal elements on a short time scale.
1. REEXAMINATION OF THE RECEPTIVE FIELD CONCEPT The concept of receptive field was defined initially in the somatosensory and visual systems: it represents a spatial dimension of our immediate environment or of its projection onto the receptive surface (Adrian, 1941; Hartline, 1938; Hartline, 1940; Kuffler, 1953; Mountcastle, 1957). The excitatory center of the receptive field (ECRF) is, according to Adrian, the cutaneous area from which a single mechanoreceptor can be stimulated (Adrian, 1935). In 1938, Hartline extended this concept in the visual system to the case of retinal ganglion cells: their receptive field (RF) is "the region of the retina which must be illuminated to obtain a response in a given fiber. The location of the RF is fixed; its extent however depends upon the intensity and size of the spot of light used to explore it, and upon the conditions of adaptation. These factors must, therefore, be specified in identifying it" (Hartline, 1938). Typically, a small spot of light (on the order of 50 µm) was projected on the frog or alligator retina to map the extent of the zone from which one could elicit at least one action potential in the ganglion cell. The size of the set of sensory receptors whose activation significantly increased the rate of firing of the studied neuron was shown to extend over 1 mm of eccentricity, therefore exceeding by far the diameter of a rod or a cone. This suggests that the receptive field core is produced by the functional convergence of several photoreceptors onto a single ganglion cell. After the discovery by Kuffler of ON and OFF antagonist center/surround organization (Kuffler, 1953) it was further confirmed in rabbit retina that the retinal location and the extent of a ganglion cell's RF measured by what engineers would call the impulsional method are in exact superposition with the location and horizontal spread of the dendritic tree of the recorded and intracellularly labeled cell (Yang and Masland, 1992).
A summary conclusion that can be drawn from these empirical definitions of retinal cell's receptive fields is that the RF core is primarily defined by feedforward activation of ganglion cells from the photoreceptor stage, an integrative process which mostly involves parallel "vertical" polysynaptic pathways via bipolar cells and ignores asymmetric lateral interactions mediated by horizontal and amacrine cells. The number of
individual photoreceptors converging onto the same ganglion cell has been shown to depend on the species and the retinal eccentricity from gaze axis representation (area centralis in the cat, fovea in monkey) and drops in that region to a minimal convergence ratio of 4:1 in the cat or 1:1 in the macaque monkey. The work of Wassle and Boycott showed that if one restricts the analysis to ganglion cells of the same morphology (e.g. beta) and center organization (e.g. ON-center), their somata belong to a hexagonal array in such a way that their mean dendritic extent would ensure an optimal coverage of the photoreceptor mosaic (review in Wassle and Boycott, 1991). Under these restricted test conditions (impulsional spot or line flashed ON and OFF across the retina) the concept of the minimal discharge field (Barlow et al., 1967) arises where RF connectivity is equated with the spatial convergence of parallel feedforward input lines and a linear integrate-and-fire summation process in the postsynaptic cell (MDF in Figure 1). This concept has been generalized to other visual relays, and the study of RFs along the retino-geniculo-cortical pathway, led by Hubel and Wiesel in the 1960s, rests entirely on a description of the RF extent based on the convergence pattern of input lines arising from the previous stage of synaptic integration (review in Hubel, 1988). In reference to the "platonic" neurons of Rodolfo Llinas (Llinas, 1988), we will call this functional representation of connectivity the "platonic receptive field". Obviously, such a definition has the drawbacks of its simplicity. It does not take into account the complex intrinsic connectivity and feedback circuits which further amplify the recruitment of more distant input sources in the retinal space. It is also based on the validation of the sole input which is able to produce a change in the output signal that will be propagated beyond that local level of processing, i.e. the action potential.
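The distinction between the region that can drive spikes and the wider region that only depolarizes the cell can be illustrated with a toy calculation. The sketch below is not taken from the chapter: it assumes a Gaussian profile of impulse-evoked depolarization across one spatial axis, and the peak amplitude, width, spike threshold and detection floor are all hypothetical values chosen only for illustration.

```python
import math

# All numerical values below are illustrative assumptions, not measured data.
PEAK_MV = 12.0        # assumed peak depolarization evoked at the RF center
SIGMA_DEG = 1.5       # assumed spatial spread of the synaptic input profile
THRESHOLD_MV = 10.0   # assumed depolarization needed to reach spike threshold
FLOOR_MV = 0.5        # assumed smallest depolarization detectable intracellularly


def evoked_depolarization(position_deg):
    """Depolarization (mV) evoked by an impulse stimulus at one position."""
    return PEAK_MV * math.exp(-(position_deg ** 2) / (2 * SIGMA_DEG ** 2))


def field_extent(positions, criterion_mv):
    """Positions whose evoked depolarization reaches the given criterion."""
    return [x for x in positions if evoked_depolarization(x) >= criterion_mv]


def width(region):
    """Spatial extent (deg) spanned by a list of responsive positions."""
    return max(region) - min(region) if region else 0.0


# Sample the visual field every 0.5 deg from -5 to +5 deg.
positions = [i * 0.5 for i in range(-10, 11)]

# "Platonic" field: only positions that bring the cell to spike threshold.
mdf = field_extent(positions, THRESHOLD_MV)
# Subthreshold field: every position that depolarizes the cell detectably.
subthreshold = field_extent(positions, FLOOR_MV)

mdf_width = width(mdf)
subthreshold_width = width(subthreshold)
```

Under these assumptions the suprathreshold criterion selects only the tip of the input profile, so any definition based on spiking alone necessarily underestimates the spatial extent of the synaptic input.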
Bishop and Henry long ago observed that "since region for discharge generally lie toward the center of the receptive field and inhibitory regions are farther out, the peripheral regions have nearly always been neglected" (Bishop and Henry, 1972). Although remarkably operational in defining the tiling of retinotopic mapping of the visual space, the platonic RF is a poor descriptor of the single cell sensitivity to the external world. How the potential connectional envelope resulting from the high level of anatomical convergence and divergence of fibers carrying sensory message from one level of integration to the next can be compared with the classical functional assessment of a receptive field remains largely unknown beyond the retinal level. At the beginning of the fifties, experimenters were already aware of lateral disinhibition processes in the retina and possible interactions between stimulation of central
[Figure 1 diagram. Legend: MDF: Minimal Discharge Field; SLF: Subliminal Receptive Field; AF: Association Field]
Figure 1. Reexamination of the Receptive Field concept. This schematic diagram represents three functional sets of afferent connections that could account respectively for the minimal discharge field (MDF), the subliminal receptive field (SLF) and the association field (AF) of a visual cortical neuron. Some connections might be common between sets of inputs. The MDF is defined as the region of the receptive surface where incoming impulsional stimuli give rise to neural suprathreshold activation of mono- and poly-synaptic excitatory and inhibitory pathways (schematized by filled symbols) and induce a significant change in the spiking frequency of the recorded neuron. The SLF is the retinal region where impulsional stimulation induces only subthreshold EPSPs or IPSPs in the target cell. The AF is defined in a higher dimensional space than that formed by the receptive sheet itself, since it requires spatial summation and temporal coactivation of several subliminal inputs originating from regions which are not necessarily contiguous. The association itself occurs either directly at the recorded cell or at an intermediate stage of integration along the afferent pathway (where round-shaped soma may symbolize populations of afferent neurons), and still leads to detectable subthreshold events or modulation of the firing rate of the recorded neuron. The localization of synapses (filled circle for inhibitory, empty and filled triangles for excitatory) on the dendritic tree of the target cell is arbitrary.
part of the receptive field and of distal regions of the visual field. The initial work of Barlow led to the definition of "unresponsive" regions of integration, surrounding the receptive field, where no direct responses could be elicited by the presentation of a stimulus, but which could modulate responses to the concomitant activation of the central part of the receptive field (Barlow, 1953). The most classical interactions are the "periphery effect" (McIlwain, 1964) and the so-called "shift effect" in the retina and the thalamus (Cleland et al., 1971; Fischer and Kruger, 1974; Fischer et al., 1975; Ikeda and Wright, 1972). This latter effect corresponds to an increase in the sensory response in the central part of the receptive field, due to the presentation or the rapid displacement of a luminous object outside the border of the receptive field. A second type of interaction is the suppressive influence of the immediate periphery of the RF described mostly at the LGN level, and which has been shown to correspond to a negative feedback controlled by higher level cortical centers (Marrocco et al., 1982; Murphy and Sillito, 1987; but see Jones and Sillito, 1994). Other authors have also described ectopic areas of integration in the thalamus which may be either islands of excitatory or inhibitory influences of collicular origin (Molotchnikoff and Cerat, 1992). The importance of such effects has often been de-emphasized at the cortical level, and some studies indeed failed to reproduce periphery effects in primary visual cortex (Rizzolatti and Camarda, 1977). However, growing evidence supports the existence of peripheral modulation of cortico-cortical origin, thus defining an integration field surrounding the classical MDF much wider than classically thought. One of the first descriptions of an inhibitory surround was given by Hubel and Wiesel for lower order hypercomplex cells in areas 18 and 19 in the cat (Hubel and Wiesel, 1965).
Other studies contributed to the definition of lateral interaction effects which could be mediated by intracortical processes, and were suppressive (Bishop et al., 1971; Bishop et al., 1973; Blakemore and Tobin, 1972; Creutzfeldt et al., 1974; De Angelis et al., 1994; Orban et al., 1987) facilitatory (Fiorani et al., 1992; Nelson et al., 1985; Kapadia et al., 1995) or mixed (Henry et al., 1978; Maffei and Fiorentini, 1976; Nelson and Frost, 1978). They could also be the end-result of feedback from larger integration units situated in non-primary cortical areas or from the pulvinar (Casanova et al., 1991). A recent study, using a bi-partition in the visual field centered on the MDF, showed that both facilitatory and suppressive influences can be elicited from unexpectedly large interaction zones in area 17 (Li and Li, 1994), which raises the question of the spatial extent of the corresponding association field of primary visual cortical receptive fields. These modulatory influences originate at least in part at the cortical level, since they are orientation specific (Blakemore and Tobin, 1972; De Angelis, et
al., 1994) and can be produced when using dichoptic stimulation protocols (De Angelis et al., 1994). It is nevertheless remarkable that, in contrast with retinal RFs, the extent of the "platonic cortical receptive field" defined on the basis of the dimensions of MDF or ECRF remains limited to a few degrees of visual angle along the retino-geniculo-cortical pathway in photopic or mesopic conditions of adaptation (Orban, 1984). The description of large subfields of associative interaction should not be surprising since they agree with the anatomical observations of long distance intracortical horizontal axons and with the demonstration, using cross-correlation techniques, of functional links between cells whose receptive field centers occupy distant positions in the visual field (Schwarz and Bolz, 1991; Ts'o et al., 1986).
2. EVIDENCE FOR MINIMAL DISCHARGE FIELD PLASTICITY A variety of neurophysiological postulates have been proposed in order to explain how activity-dependent changes in synaptic efficacy might result in the emergence of visual cortical receptive fields (Bienenstock et al., 1982; Linsker, 1986a-c; Miller, 1990; Stent, 1973). They all share the common assumption that on-going changes in temporal correlation determine the sign (i.e. potentiation or depression) and the amplitude with which synaptic gains will subsequently be altered (Fregnac, 1995; Hebb, 1949; Sejnowski, 1977a-b; Sejnowski and Tesauro, 1989). Repeated pairing between dynamic visual stimuli and exogenous control of the level of firing of the recorded neuron has been shown to induce up- and down-regulations of visual responses analogous to those produced during monocular deprivation, strabismus or restricted visual experience (Fregnac et al., 1988; Fregnac et al., 1992; Shulz and Fregnac, 1992), which suggests that the spatial extent and organization of visual cortical receptive fields might also have been modified. In collaboration with Dominique Debanne and Daniel Shulz, we recently determined how supervised Hebbian mechanisms could be used to regulate the relative dominance of ON and OFF responses across Simple and Complex receptive fields. Two types of protocols were used to study the associative effects of pairing the presentation (ON) or extinction (OFF) of the stimulus in a given position of the receptive field (paired position), with either a depolarized (S+) or a hyperpolarized (S-) state of the recorded neuron. In the first protocol extracellular KCl (3M) electrodes were used both to record and to control postsynaptic activity (Debanne et al., submitted). Passing brief negative current
pulses during juxtacellular recording (spike amplitude around 20 mV) made it possible to suppress the preferred response to the stimulus via field-effect, whereas ejection of positive current, in conjunction with potassium ions, led to an increase in the firing frequency of the neuron. Fifty associations at low temporal frequency (1 every 8 seconds) induced significant changes in the ON and OFF ratio of visual responses in localized regions of cortical receptive fields in both kitten and adult cat area 17 (Figure 2A). The second, more difficult, type of protocol was carried out at the subthreshold level in collaboration with Attila Baranyi (Debanne et al., 1995), and consisted of pairing ON or OFF postsynaptic composite potentials with somatic current injection in the intracellularly recorded cell (Figure 2B). The general phenomenology of the observed results is that most changes in ON and OFF responses agree with Hebbian schemes of plasticity. Long-lasting modifications of the ratio of ON/OFF responses induced by the extracellular iontophoretic method, measured in the paired position or integrated across the whole extent of the RF, were found in 44% of the conditioned neurons. In most cases they favored the characteristic which had been paired with the "high" level of imposed activity. They consisted mostly of potentiation or depression of long-latency responses (100-800 msec), and the amplitude change was on average half of that imposed during pairing. In a few cells the de novo expression of a suprathreshold response could be induced for an initially ineffective characteristic of the visual stimulus. The spatial selectivity of the pairing effects was further studied by stimulating alternately both paired and spatially distinct unpaired positions within the RF. Most modifications were observed in the paired position, and restricted in two thirds of cases to that region of the RF alone.
However, interpretations of these results in terms of synaptic plasticity were hindered by the fact that iontophoresis used to control postsynaptic activity possibly modified extracellular ionic concentrations as well as presynaptic release in an unwanted way. Furthermore only the rate of firing was accessible to the experimenter. Consequently we used intracellular recording techniques to directly monitor and control the membrane potential of the target neuron by intrasomatic current injection, and in addition to measure changes in synaptic potentials during imposed coactivity protocols in vivo (Baranyi, et al., 1991). Figure 2B illustrates the long-term potentiation of a composite postsynaptic potential (PSP) in a Simple receptive field, induced by a synchronous pairing of the ON-stimulation in the ON-zone of the test stimulus with a depolarizing current pulse (+2 nA) in a 10 week-old kitten. The potentiation of the ON-response was still present 35 minutes later, and was observed only in the ON-field. The OFF-response in the OFF-field was unchanged, which demonstrates that changes were input specific and restricted to the paired region of the
receptive field. The results suggest that up- and down-regulation of ON- and OFF-responses could result from selective changes in the transmission gain of the synapses which were activated during associative pairing. In order to further establish the synaptic nature of the observed changes, one of us (Y.F.), in collaboration with Michael Friedlander and Jim Burke, performed associative Hebbian protocols in vitro in 4-5 week-old kitten and guinea-pig visual cortical slices (Fregnac et al., 1994b). No pharmacological blockade of intracortical inhibition was used (bicuculline-free ACSF) in order to leave the cortical tissue in a situation as close as possible to that occurring in vivo. Recordings were made with 2M potassium methyl-sulfate (80
Figure 2: Long lasting changes of ON- and OFF-responses in the MDF induced by associative pairing protocols in Simple cells. (a): Extracellular recordings in area 17 of a 5 week old kitten: this cell initially exhibited a strong dominant ON response over the whole extent of the receptive field (shaded rectangle). The control visual response for one (solid bar in upper diagram) of the two positions which were quantitatively studied is represented in the first row and was dominant ON with a small OFF-response (C). The pairing procedure consisted of 50 associations of a negative current pulse (- 4 nA, - 100 ms delay, 2000 ms duration) in phase with the presentation of the visual stimulus, and of a positive current (+ 4 nA, - 100 ms delay, 2000 ms duration) in phase with the extinction of the visual stimulus. A significant increase of the OFF response was imposed during pairing whereas the negative current was ineffective in reducing the ON response. Five minutes after the end of the pairing, the OFF response was significantly potentiated (Kolmogorov-Smirnov, p<0.008). This effect was still present forty-five minutes after pairing and was selective for the paired position. The ratio of complexity - given by OFF/(OFF+ON) - was calculated in each position using a moving average technique (Fregnac et al., 1988), and plotted in the inset as a function of time (pointing downwards). Calibration bars: horizontal 1 sec, vertical 10 action potentials/second. (b): Intracellular recordings of a Simple cell in area 17 of a 10 week old kitten. The ON and OFF zones, symbolized respectively by empty and filled rectangles, were stimulated alternately. The ON presentation of the stimulus (solid empty bar) in the ON zone was paired 30 times with a concomitant intracellular depolarizing current pulse (+2 nA, 200 ms duration). Left, control PSPs (in the absence of current through the recording electrode), averaged over 20 successive trials.
For each pair of curves, the thin trace represents the template waveform before pairing, whereas the thick one corresponds to the PSP (arrows) observed at various delays following pairing. Right, trial by trial plot of the peak amplitude of the PSP (ordinate in mV, with time downwards). The dotted line represents the mean of the PSP peak values observed before pairing, whereas the continuous segments indicate that averaged over successive blocks of trials following pairing (individual events larger than 6 mV which are out of scale are included in the means). The doubling in the size of the paired PSP was still present 35 minutes later (Kolmogorov-Smirnov, p<0.0175). The resting membrane potential (- 67 mV) was unchanged by the pairing procedure. No significant change of response was observed for the unpaired stimulus (OFF response to stimulation of the OFF zone, data not shown).
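The ratio of complexity quoted in the caption, OFF/(OFF+ON), can be tracked across trials as sketched below. This is a minimal illustration only: the caption does not specify the moving average used in Fregnac et al. (1988), so the plain unweighted sliding window, the window length and the function names here are assumptions.

```python
def complexity_ratio(on_response, off_response):
    """OFF/(OFF+ON): 0 for a pure ON response, 1 for a pure OFF response.

    Returns None when the cell gives no response at all, since the ratio
    is then undefined.
    """
    total = on_response + off_response
    if total == 0:
        return None
    return off_response / total


def moving_average(values, window):
    """Unweighted sliding mean (assumed form of the smoothing)."""
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


def ratio_over_time(on_trials, off_trials, window):
    """Smooth ON and OFF responses over trials, then take the ratio."""
    on_smooth = moving_average(on_trials, window)
    off_smooth = moving_average(off_trials, window)
    return [complexity_ratio(on, off) for on, off in zip(on_smooth, off_smooth)]


# Hypothetical firing rates (spikes/s) per trial: the OFF response grows
# after pairing while the ON response stays constant.
on_trials = [10.0, 10.0, 10.0, 10.0]
off_trials = [0.0, 0.0, 10.0, 10.0]
ratios = ratio_over_time(on_trials, off_trials, window=2)
```

With this definition a shift of the ratio toward 0.5 corresponds to a receptive-field position that responds equally to stimulus onset and offset, which is the sense in which the cell becomes more "Complex-like" at that position.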
MΩ) or with 0.5 M KCl and biocytin in cortical layers II to IV. Controls of the synaptic efficacy of the test pathway consisted of single constant current stimuli (50-150 mA) delivered to the white matter or at an intracortical site, at 0.2 to 1.0 Hz during a 10-15 minute observation period. During pairing, the polarizing pulse onset occurred 5-10 ms before the test-stimulus, and its duration (50 to 80 ms) was adjusted to overlap the time-course of the test-PSP. The same polarity was used throughout pairing, corresponding either to depolarizing (S+) or hyperpolarizing current injection (S-), with a mean value of 2 to 3 nA. Input specificity was tested using dual stimulation protocols where one bipolar stimulating electrode was positioned in the white matter and a second one was positioned laterally to the recording site in layers II/III. During differential pairing, stimulation of one pathway was paired with postsynaptic depolarizing current, while the other pathway was unpaired but activated the same number of times in an alternating fashion. Interstimulus intervals (ISI) between the activation of the paired and unpaired input varied between 500 ms and 2500 ms, depending on the cell. Two findings were observed: 1) the synchronous association between presynaptic input and postsynaptic depolarization resulted in homosynaptic potentiation of functionally identified postsynaptic potentials even in the absence of blockade of local inhibition; 2) the synchronous pairing of afferent activity with hyperpolarization of the postsynaptic cell resulted in homosynaptic depression in visual cortex in vitro as already observed in vivo. These effects were shown to be input specific and required temporal association between afferent activation and the change in membrane potential.
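The qualitative logic of the differential pairing protocol (sign of the change set by the imposed polarization, change restricted to the pathway active during the pulse) can be written as a toy update rule. The sketch below is not the authors' model: the multiplicative form, the learning rate and the function names are hypothetical, and it captures only the sign and input specificity of the two findings, not their amplitude or time course.

```python
def pairing_update(weight, coincident_polarization, rate):
    """One stimulus presentation on one pathway.

    coincident_polarization is +1 if the afferent volley coincides with a
    depolarizing pulse (S+), -1 for a hyperpolarizing pulse (S-), and 0 if
    the pathway is activated outside the pulse (unpaired pathway).
    The multiplicative update is an illustrative assumption.
    """
    return max(0.0, weight + rate * coincident_polarization * weight)


def run_differential_pairing(n_pairings, polarity, rate=0.01):
    """Alternate paired and unpaired pathways the same number of times."""
    w_paired, w_unpaired = 1.0, 1.0
    for _ in range(n_pairings):
        # Paired pathway: afferent activation overlaps the current pulse.
        w_paired = pairing_update(w_paired, polarity, rate)
        # Unpaired pathway: activated in alternation, with no coincident pulse.
        w_unpaired = pairing_update(w_unpaired, 0, rate)
    return w_paired, w_unpaired


# Hypothetical runs of 50 pairings with each polarity.
potentiated, control_a = run_differential_pairing(50, polarity=+1)
depressed, control_b = run_differential_pairing(50, polarity=-1)
```

The unpaired pathway serves as the within-cell control: because it is activated as often as the paired pathway but never during the pulse, any change confined to the paired pathway cannot be attributed to afferent activity or to postsynaptic polarization alone.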
We conclude from both in vitro and in vivo sets of experiments that changes in the spatial organization of ON- and OFF-responses across the RF may be the result of Hebbian changes of composite depolarizing ON and OFF PSPs. The additional observation in vivo of the induction of suprathreshold "de novo" responses outside the classical RF (Shulz and Fregnac, 1992; Debanne, et al., submitted) led us to suggest that Simple and Complex RF organizations might be two stable functional states derived from an envelope of common excitatory subthreshold connectivity wider than the classical minimal discharge field. The main functional difference between the two types of RF organization appears to be that in Simple cells the spatial profile of the RF expressed at the suprathreshold level is, in contrast to that observed for Complex cells, sculptured by intracortical inhibitory processes. This connectivity scheme predicts that local blockade of GABA-ergic inhibition should reveal co-extensive ON- and OFF-responses across the RF. To test this, in collaboration with Daniel Shulz, combined electrodes were assembled for intracellular recording and simultaneous juxtacellular iontophoresis in order to record intracellularly from kitten or cat visual cortical neurons and at the same time apply GABA agonists or
antagonists in the vicinity of the recorded cell (less than 50 µm in Shulz et al., 1993b). When bicuculline was applied with an iontophoretic current set to antagonize the effects of exogenous GABA, the antagonist responses in subfields of Simple cells disappeared, revealing the expression of a previously masked excitatory input. The pattern of suprathreshold visual responses thus became Complex, and the area of the MDF enlarged by 50-200%, revealing a connectivity which up to the iontophoretic application was silenced by intracortical inhibition. These effects were fully reversible within 15 to 20 minutes following the end of the drug application. This experimentally induced transformation of a Simple RF into a Complex-like one, and the loss of the spatial separation of antagonist ON and OFF subfields by the iontophoresis of bicuculline methiodide are not compatible with the original scheme of Hubel and Wiesel, who proposed that the spatially separate elongated ON and OFF subfields of Simple RFs result from the convergence of excitatory inputs from respectively ON-center and OFF-center principal geniculate cells, co-aligned in the visual field (Hubel and Wiesel, 1962). In contrast, our results agree with the extracellular records of Sillito (Figure 1 in Sillito, 1975), who noted that, during iontophoresis of bicuculline, overlapping ON and OFF responses were apparent throughout the RF of Simple cells. In these experiments, however, it was not possible to conclude if the suppression of activity during the antagonist response resulted from a direct postsynaptic inhibition (which we now show to be the case) or if it was due to a more complex indirect interaction of the network in which the recorded cell was embedded.
They also agree with a recent report by Nelson and collaborators (Nelson et al., 1994) showing that during whole-cell patch recordings of supposedly Simple cells, intracellular dialysis with chloride-channel blockers, such as picrotoxin or 4,4'-diisothiocyanatostilbene-2,2'-disulfonic acid (DIDS), reveals depolarizing potentials at both the onset and the offset of the light in two positions of the RF. Together, these results strongly suggest that a direct inhibitory input onto the recorded cell contributes to the spatial structure of the RF. Moreover, the demonstration of conditioning-induced modifications of the RF profile, and the unmasking of excitatory responses during blockade of GABA-ergic inhibition strongly indicate that the recorded cell receives direct ON and OFF excitatory afferents over the whole extent of the RF. We conclude that ON and OFF responses in Simple receptive fields can be up- and down-regulated by activity and that a Complex subthreshold depolarizing input feeds the cortical receptive fields over a spatial region which is much larger than the classical minimal discharge field.
3. A SUBTHRESHOLD VIEW OF THE PLATONIC RECEPTIVE FIELD
As reviewed earlier (section 1), extracellular analysis of cortical receptive fields (RFs) has revealed side-band and end-zone inhibitory influence and led to the description of more spatially diffuse "unresponsive" modulatory regions surrounding the minimal discharge field (MDF). However, studies of these regions remain indirect since they are mostly based on the modulation of the cell's response to a central test stimulus by a concomitant peripheral activation of the RF. Obtaining direct evidence for subthreshold excitatory and inhibitory regions extending beyond the classical MDF requires intracellular methods. For technical reasons, and in spite of continuous efforts over the past 25 years, only a limited number of such recordings has been obtained using fine-tipped micropipettes (Berman et al., 1991; Creutzfeldt and Ito, 1968; Creutzfeldt et al., 1974; Douglas et al., 1988; Douglas et al., 1991; Dreifuss et al., 1969; Ferster, 1986; Ferster and Lindstrom, 1983; Innocenti and Fiore, 1974; Toyama et al., 1974). Whole-cell recordings with patch electrodes have also been attempted with reasonable success (Hirsch et al., 1995; Jagadeesh et al., 1992; Pei et al., 1994; Pei et al., 1991; Volgushev et al., 1993), thus allowing a reduction in the leak conductance usually associated with sharp electrodes, an increase in input resistance and a better control of membrane potential. In collaboration with Daniel Shulz, Dominique Debanne, and Attila Baranyi we have concentrated our efforts during the last 5 years on optimizing recording stability with sharp electrodes for periods lasting from 30 minutes up to 4 hours in more than 120 cells (Bringuier et al., 1992; Frégnac et al., 1994a-c).
Intracellular recordings have been made in area 17 of cats ranging from 4 weeks of age to adulthood, with sharp electrodes (70 MΩ) filled either with KCl (3 M), or potassium methylsulfate (2 M) with the addition of KCl (4 mM), and in some cases QX314 (100 mM). More recently, in collaboration with Lyle Borg-Graham and Cyril Monier, patch recordings in whole-cell configuration were made with 2-5 MΩ glass micropipettes, filled with standard K-gluconate solution (140 mM K-gluconate, 10 mM Hepes, 4 mM ATP, 2 mM MgCl2, 0.4 mM GTP, 0.5 mM EGTA; see Borg-Graham et al., 1995). Whole-cell recordings were obtained following giga-seals (2-8 GΩ), with Rin = 20-350 MΩ, Raccess = 3-25 MΩ and Erest = -75 to -50 mV. Whatever the type of electrode (sharp or patch), three types of neurons were identified on the basis of their responses to intracellular current pulses according to criteria defined in vitro (Mc Cormick et al., 1985). The three classes of regular spiking, bursting, and fast spiking cells were recorded at every age. In order to study membrane potential fluctuations independently of action potential occurrence, an algorithm for spike suppression was used which detects
derivative amplitude threshold trespassing and interpolates the signal to the next value when the signal comes back within preset limits of dynamic range. Different techniques were used to provide a separate analysis of the spatial domains of origin of the different classes of synaptic events revealed at the subthreshold level. Intracellular current injection combined with visual stimulation makes it possible to record evoked activity at membrane potentials where either EPSPs or IPSPs will be dominant to the exclusion of the other. For instance, the cell can be clamped near the chloride equilibrium potential to cancel fast inhibitory events or, in contrast, near the EPSP reversal potential when blocking spike activity with the use of QX314 (Glaeser et al., 1995). In some cases it was also possible to achieve voltage-clamp recordings, either in single-electrode discontinuous mode with sharp electrodes (Frégnac et al., 1994a) or in continuous mode with patch electrodes (Borg-Graham et al., 1995), and to characterize inward or outward currents, thus limiting the contribution of voltage-dependent conductances. Such methods allow the classification of Simple and Complex receptive fields based on the spatial segregation or overlap of ON and OFF firing to be compared with that based on subthreshold responses, and confirm the observation of antagonist hyperpolarizing events in Simple receptive fields (Ferster, 1988). In contrast, depolarizing waves are observed with concomitant firing for both ON and OFF transitions of the stimulus in Complex receptive fields. The study of the voltage-dependency of the synaptic response profiles shows that antagonist responses in Simple receptive fields correspond to an inhibitory potential which inverts at around -75 mV (Shulz et al., 1993b), and is probably mediated via GABAA receptors. GABAB-like slow inhibitory potentials which prolong the tail following the initial phasic hyperpolarization have also been observed.
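The spike-suppression step mentioned at the start of this passage can be sketched as follows. This is only a minimal illustration under stated assumptions, not the authors' implementation: the function name `suppress_spikes`, the use of linear interpolation, and the parameters `dvdt_threshold`, `v_min` and `v_max` are all hypothetical choices made for the example.

```python
def suppress_spikes(vm, dt_ms, dvdt_threshold, v_min, v_max):
    """Return a copy of vm (mV) with spikes replaced by linear interpolation.

    Assumed rule (hypothetical): a spike starts when the first difference of
    the trace exceeds dvdt_threshold (mV/ms); it ends at the next sample that
    lies back inside the preset range [v_min, v_max].
    """
    out = list(vm)
    n = len(vm)
    i = 1
    while i < n:
        if (vm[i] - vm[i - 1]) / dt_ms > dvdt_threshold:
            start = i - 1          # last sample before the fast upstroke
            end = i
            # advance until the trace is back within the preset limits
            # (or until the last sample if it never returns)
            while end < n - 1 and not (v_min <= vm[end] <= v_max):
                end += 1
            span = end - start
            for k in range(start + 1, end):
                out[k] = vm[start] + (vm[end] - vm[start]) * (k - start) / span
            i = end + 1
        else:
            i += 1
    return out


# Toy trace sampled at 1 ms: one spike riding on a -60 mV baseline.
trace = [-60, -60, -20, 20, -30, -62, -61]
cleaned = suppress_spikes(trace, dt_ms=1.0, dvdt_threshold=20.0,
                          v_min=-80.0, v_max=-45.0)
```

In this toy trace the three samples belonging to the spike are replaced by values lying on the straight line between the last pre-spike sample and the first sample that has returned to the permitted range.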
This description of receptive field organization agrees with the ON-OFF antagonism model initially proposed by Palmer and Davis (1981) and confirmed with quasi-intracellular recordings by Ferster (1988). However, careful reading of some of our intracellular records shows the existence of mixed subthreshold depolarizing and hyperpolarizing antagonist responses in some Simple receptive fields. These observations confirm that for the antagonist characteristic of a given subfield in a Simple cell, excitatory input may be present which is short-tailed by dominant inhibitory events, as predicted on the basis of plasticity experiments (see section 2). This led us to suggest that subthreshold complex depolarizing input could be present in Simple receptive fields, but masked at the suprathreshold level. In order to further quantify these observations, in collaboration with Larry Glaeser, Cyril Monier and Lyle Borg-Graham, we developed quantitative methods for mapping the
spatial origin of synaptic input in visual cortical receptive fields (Borg-Graham et al., 1995; Glaeser et al., 1995). Two types of analysis can be distinguished to define the synaptic integration field of cortical cells.
3.1. Reverse correlation analysis
The first method is derived from linear system analysis, and is based on the reverse correlation technique. This technique, developed initially in the auditory system (Eggermont et al., 1983), was first applied to visual cortex by Jones and Palmer (Jones and Palmer, 1987a) and more recently by Freeman's team (review in De Angelis et al., 1995). It consists of measuring the impulse response of cortical cells using quasi-punctual stimuli (light or dark squares of opposite contrast as an approximation of a Dirac distribution) flashed ON (for 50 ms) in random positions of the visual field (partitioned in a preset grid of 80-400 locations). Each position in the visual field is tested at least twice (once at each contrast). The use of opposite contrasts makes it possible to separate and extract excitatory ON and OFF responses in linear Simple cells. The intracellular signal was fed through an event detector which transforms the membrane potential into a discrete time interval series by applying one of the following algorithms: (a) threshold-crossing of instantaneous deviations (e.g. +3 mV / -3 mV in Figure 3) of the membrane potential from a moving average (established over the last 2 seconds of activity), (b) threshold-crossing of waveform energy calculated with respect to the average baseline within a moving fixed-duration integration window, and (c) prototypical and waveform-specific template matching to discriminate between fast and slow IPSPs and EPSPs, Ca²⁺ and Na⁺ spikes (see also Hirsch et al., 1995). In this latter case discrimination of events was made assuming fixed polarity and fixed durations of rising and falling phases independently of the amplitude of the selected waveform.
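Algorithm (a) above can be illustrated with a short sketch. It is a simplified reading of the description rather than the original detector: the function name `detect_events`, the trailing-window baseline, and the choice of the largest deviation within each supra-threshold episode as the event time are assumptions made for this example.

```python
def detect_events(vm, dt_ms, threshold_mv, window_ms):
    """Return event times (ms) from a membrane potential trace vm (mV).

    The baseline is a trailing moving average over window_ms. The sign of
    threshold_mv selects the polarity: positive for depolarizing events,
    negative for hyperpolarizing ones. Each episode above threshold yields
    one event, timed at its largest deviation (an assumed convention).
    """
    n_win = max(1, int(round(window_ms / dt_ms)))
    sign = 1.0 if threshold_mv >= 0 else -1.0
    limit = abs(threshold_mv)
    events = []
    in_event = False
    best_i, best_dev = 0, 0.0
    for i, v in enumerate(vm):
        lo = max(0, i - n_win)
        # mean of the preceding samples; the first sample is its own baseline
        baseline = sum(vm[lo:i]) / (i - lo) if i > 0 else v
        dev = sign * (v - baseline)
        if dev > limit:
            if not in_event or dev > best_dev:
                best_i, best_dev = i, dev
            in_event = True
        elif in_event:
            events.append(best_i * dt_ms)
            in_event = False
    if in_event:
        events.append(best_i * dt_ms)
    return events


# Toy trace at 1 ms sampling: a small depolarizing bump on a -70 mV baseline.
trace = [-70] * 10 + [-68, -65, -66, -69] + [-70] * 6
epsp_times = detect_events(trace, dt_ms=1.0, threshold_mv=3.0, window_ms=5.0)
ipsp_times = detect_events(trace, dt_ms=1.0, threshold_mv=-3.0, window_ms=5.0)
```

Running the detector once per threshold, as here, gives one point process per event class, which is the form of input that a reverse correlation analysis operates on.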
The number of maps that can be finally obtained repeating the reverse correlation analysis procedure is given by the product of the number of stimulus configurations (used for the spatial convolution) and the number of waveform templates or amplitude thresholds used for event detection. Spatial maps of correlated events are constructed by finding which pixel of the visual field exploration was ON, at a fixed (reverse correlation) delay preceding each time arrival of a postsynaptic event. The content of the pixel where the stimulus was present at that delay (i.e. correlated with the postsynaptic event) is then incremented. The "stationary map" is defined as the cumulative count matrix integrated over reverse correlation delays ranging from -30 to -200 ms. In order to assess the statistical significance of the maps, a similar
procedure of correlation counts is carried out in the forward direction, thus establishing for each pixel an empirical distribution of counts corresponding to the null hypothesis of independent output and input processes. For each pixel, 0.01 confidence thresholds are computed above or below which its content becomes significant of a suppressive or excitatory evoked activity. The proportion of "significant counts" related to the stimuli presented at the retina is given by the sum of the contents of the thresholded significant pixels divided by the total number of detected events. This index, chosen to quantify the contribution of detected events to the visual response, is in fact equivalent to the "signal/noise" ratio of the map. Results illustrated in Figure 3 correspond to the application of a +3 mV (EPSP RF), -3 mV (IPSP RF), +30 mV (MDF) threshold amplitude discrimination method in a unimodal (ON) Simple cell recorded in area 17 of a 6 week old kitten. The dates of occurrence of the local extrema, detected between successive positive and negative crossings of the threshold by synaptic potentials, are extracted from the intracellular record. They define a point-process that has been analyzed separately for each threshold by classical reverse correlation techniques. A preliminary study in kittens during the critical period has shown that, at least at that age, reverse correlation maps in Simple cells exhibit aggregate excitatory and inhibitory regions extending up to nine times the area of the classical MDF (Glaeser et al., 1995). All excitatory maps were consistent with those constructed on the basis of stimulus-locked waveforms averaged at each spatial location. Moreover, the reverse correlation technique revealed inhibitory regions which were not apparent using stimulus-locked averaging but were detected as excess correlations when allowing a jitter in the event detection (of the order of the duration of the stimulus presentation).
Amplitude thresholding techniques indicate that the lower the threshold, the larger is the extent of the RF, and that depolarizing subthreshold events as small as 500 µV are still correlated with the retinal origin of the stimulus. The largest "signal/noise" ratio of the visual map is generally observed for events ranging between two and ten millivolts in amplitude. Furthermore, the expansion obtained by lowering the threshold recruits depolarizing composite PSPs originating from neighboring regions, often in a non-isotropic way.
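The map construction and the "significant counts" index described in this section can be sketched in a few lines. This is a toy illustration under explicit assumptions: the names `lagged_counts` and `significant_fraction` are hypothetical, stimuli are represented as (onset time, pixel index) pairs, and the per-pixel significance thresholds are simply passed in rather than derived from the empirical forward-direction distribution as in the original procedure.

```python
def lagged_counts(stimuli, events, n_pixels, lag_min_ms, lag_max_ms):
    """Count stimulus/event coincidences per pixel within a lag window.

    stimuli: list of (onset_ms, pixel_index); events: list of event times (ms).
    lag = event time - stimulus onset. Positive lags mean the stimulus
    preceded the event (reverse correlation); negative lags give the
    forward-direction control, where no causal relation is expected.
    """
    counts = [0] * n_pixels
    for t_ev in events:
        for t_on, pix in stimuli:
            if lag_min_ms <= t_ev - t_on <= lag_max_ms:
                counts[pix] += 1
    return counts


def significant_fraction(counts, upper_thresholds, n_events):
    """Sum of pixel contents exceeding their threshold, per detected event."""
    kept = sum(c for c, th in zip(counts, upper_thresholds) if c > th)
    return kept / n_events if n_events else 0.0


# Toy data: three pixels flashed in sequence every 50 ms, two detected events.
stimuli = [(0, 0), (50, 1), (100, 2), (150, 0), (200, 1), (250, 2)]
events = [60, 210]

# "Stationary map": stimuli that occurred 30 to 200 ms before each event.
reverse_map = lagged_counts(stimuli, events, 3, 30, 200)
# Control: the same window taken after each event.
forward_map = lagged_counts(stimuli, events, 3, -200, -30)
# Crude stand-in: the forward counts are used directly as thresholds here.
index = significant_fraction(reverse_map, forward_map, len(events))
```

With real data the forward-direction counts would be accumulated over many lags to build a per-pixel null distribution, from which the confidence thresholds are read off; the shortcut of using a single forward map as the threshold is only meant to keep the example small.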
3.2. Measure of changes in total synaptic conductance
The success of the previous method depends on the detectability of synaptic events which may be masked depending on the actual value of the membrane potential at the time of recording. In addition, balance in excitatory and inhibitory inputs might result in the absence of significant variation of membrane potential during current clamp recordings, and
Figure 3. Reverse correlation maps of visual cortical receptive fields (a): This example is taken from a unimodal ON Simple cell, recorded in adult cat area 17 at a resting potential of -51 mV. A 12 x 12 position matrix centered on the hand-plotted receptive field was explored by a random sequence of light and dark squares, each shown one at a time for 50 ms, and the complete exploration of the field for both contrasts was repeated 23 times. The reverse correlation algorithm was then run with a retrograde delay of 60-100 ms with respect to the events detected in the biological signal (60 ms for MDF and EPSP, 90 ms for IPSP, see b). In order to assess the statistical significance of the maps, a similar procedure of correlation counts was made in the forward direction, to establish an empirical distribution of amplitude for each pixel, corresponding to the null hypothesis of independent output and input processes. The final maps for the light contrast showed significant excess correlations (p<0.001) for each type of selected event. The EPSP field area was found to be 9 times larger than that of the spike field (MDF) and its shape appeared to be unrelated to the preferred orientation of the cell (arbitrarily set as the horizontal axis). (b): Discretization process of the intracellular signal (record sample taken from the cell shown in a). A moving average of the membrane potential (given at any time by the mean value measured during the previous second of recording) was subtracted from the intracellular signal. Three time series were extracted from the quasi d.c. signal (ΔVm) using a threshold crossing algorithm for three detection levels: -3 mV ("IPSP map"), +20 mV ("Minimal Discharge Field") and +3 mV ("EPSP Field"). The selection process is illustrated for this latter case.
The dates of occurrence of the detected events are chosen at half time between successive pairs of threshold crossings of opposite slopes, the earliest one being in that case always positive.
thus synaptic events become undetected. Another method is to analyze impulse responses of cortical cells in voltage clamp mode at two or more holding potentials and extract a direct measure of conductance changes from the comparison of the records. Previous attempts using sharp electrodes in current clamp mode did not find evidence for conductance changes larger than 10% in response to moving stimuli, even when prominent phase-locked synaptic inhibition was recorded at the somatic level (Berman et al., 1989; Berman et al., 1991; Douglas et al., 1988). In order to re-address this question using static stimuli during blind patch recordings, specific methods to monitor synaptic conductance changes were developed in collaboration with Lyle Borg-Graham and Cyril Monier (Borg-Graham et al., 1995). Synaptic input in response to flashing dark and light bars was studied in several positions of the RF. An estimation of the synaptic conductance waveform Gsyn(t) as seen by the soma was computed from two voltage clamp current records I1(t) and I2(t) taken at two holding potentials (V1 and V2), using an experimental approach applied in the retina by Borg-Graham (Borg-Graham, 1991), and according to the following formula:
$$G_{syn}(t) + G_{rest} = \frac{I_1(t) - I_2(t)}{V_1 - V_2 - R_{access}\,[I_1(t) - I_2(t)]}$$
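As a numerical illustration of this formula, the following sketch computes the total conductance sample by sample from two current records. The function name `total_conductance_ns`, the unit conventions (pA, mV, MΩ, nS) and the toy cell parameters are assumptions introduced for the example; they are not taken from the cited studies.

```python
def total_conductance_ns(i1_pa, i2_pa, v1_mv, v2_mv, r_access_mohm):
    """Total conductance Gsyn(t) + Grest in nS from two voltage-clamp records.

    i1_pa, i2_pa: current records (pA) at holding potentials v1_mv and v2_mv.
    r_access_mohm: electrode access resistance (MOhm), which makes the true
    membrane potential differ from the command potential by R_access * I.
    """
    g = []
    for i1, i2 in zip(i1_pa, i2_pa):
        di_na = (i1 - i2) * 1e-3                          # pA -> nA
        dv_mv = (v1_mv - v2_mv) - r_access_mohm * di_na   # MOhm * nA = mV
        g.append(di_na / dv_mv * 1e3)                     # nA/mV = uS -> nS
    return g


# Toy cell (hypothetical values): 10 nS membrane (100 MOhm), reversal at
# -70 mV, recorded through a 10 MOhm access resistance at -50 and -90 mV.
i_hi = 20.0 / 110.0 * 1e3     # pA, command +20 mV from reversal, 110 MOhm total
i_lo = -20.0 / 110.0 * 1e3    # pA, command -20 mV from reversal
corrected = total_conductance_ns([i_hi], [i_lo], -50.0, -90.0, 10.0)
uncorrected = total_conductance_ns([i_hi], [i_lo], -50.0, -90.0, 0.0)
```

In this toy case the corrected estimate recovers the 10 nS that was put in, while ignoring the access resistance (setting it to zero) gives about 9.1 nS, so an uncompensated series resistance leads to an underestimate of the conductance.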
Preliminary results show that in Simple cells transient increases in conductances (100-300%) are observed in response to both ON and OFF transitions of a stationary bar flashed in a fixed position of the RF, in spite of the asymmetry that may exist between the respective amplitude in current clamp mode of two types of antagonist responses (ON vs. OFF). This suggests that, contrary to the results of Douglas et al. (1995) and Jagadeesh et al. (1993), large increases in conductance can be seen using blind patch and continuous voltage clamp in the linear dynamic range of the cell (Borg-Graham et al., 1995). Furthermore, it is possible that these might correspond to the activation of shunting inhibition which might reduce the depolarization of the recorded cell. A critical parameter in assessing the measurement of conductance increases at the soma is the access resistance Raccess of the recording electrode, which might dramatically damp the visibility of this phenomenon.
4. SPATIAL ORGANIZATION OF THE ASSOCIATION FIELD
In contrast to the impulse method, "interaction" terms between different regions of space can be revealed by using coherent spatial and temporal sinusoidal gratings of luminance shown at different eccentricities from the center of the RF. It is expected that the
actual extent of the receptive field will depend on the stimulus used to elicit a postsynaptic response, i.e. on the spatial and temporal summation processes participating in the test response. In collaboration with Frederic Chavane and Jean Lorenceau, we studied intracellular responses of area 17 cells to sine wave and rectangular gratings (of optimal speed, spatial and temporal frequency). The stimuli were flashed or drifted along an axis orthogonal to their orientation for 400 ms, and shown in annular portions of the visual field, otherwise of the same mean luminance (Bringuier et al., 1995). Once the MDF had been characterized using hand-held stimuli, the visual field was partitioned arbitrarily in three zones co-centered on the geometrical center of the RF: MDF (120% of the plotted RF), NEAR Periphery (9 times larger than MDF), FAR Periphery (the extent of which was limited by the borders of the 21" screen positioned at a distance of 57 cm). In all cells recorded so far, significant depolarizing responses could be elicited by stimulation of the NEAR periphery surrounding the MDF. In most cases they could still be observed in the FAR periphery (up to 10.5°, angular distance measured from the center of the MDF to the inner border of the annulus, see Figure 4). Even at such eccentricity, large amplitude responses were recruited (mean value between 5 and 10 mV), indicating that, as for stimulation in the central part of the RF, drifting gratings were more "efficient" in revealing subthreshold responses than briefly flashed patches of light (see reverse correlation section). In addition, a significant hyperpolarizing component in the peripheral responses was unmasked in almost half of the cells using depolarizing intracellular current injection. Both hyperpolarizing and depolarizing responses became increasingly phasic the larger the distance from the center of the MDF.
Several pathways could theoretically mediate these responses: a) lateral connectivity at the retinal or geniculate level, b) divergent thalamo-cortical axons, c) lateral chains of mono or polysynaptic connections within area 17, d) feedback projections from other cortical areas, such as A18, PMLS, or from subcortical structures such as superior colliculus, claustrum and LP-pulvinar complex. In these latter cases, RFs are usually larger than those of area 17 and the retinotopic precision of the reciprocal feedback pathways is less than that achieved by the feedforward projection (review in Rosenquist, 1985 and Salin and Bullier, 1995). An observation which could help trace back the origin of peripheral responses is the strong dependency of the latency of depolarizing responses on the eccentricity from the center of the MDF. An example is illustrated in Figure 5a. This Simple cell was stimulated with a stationary sine-wave grating of optimal orientation, phase
Figure 4. Voltage dependency of intracellular responses to the stimulation of the minimal discharge field (left column), the near periphery (middle column) and the far periphery (right column). Simple cell recorded intracellularly in a 10 week old kitten (resting potential -70 mV). The MDF is symbolized by a filled dark square in the upper diagrams. The shortest distance from the center of the MDF to the inner border of the stimulus was 3.25° for the NEAR periphery and 7.5° for the FAR periphery. The spatial frequency and the phase of the grating (alternated in phase at 1 Hz) were adjusted so that they matched the ON-OFF organization of the MDF. In the NEAR and FAR periphery, averaged response profiles observed for the two opposite phases (transitions indicated below) were more symmetrical than within the MDF, suggesting a Complex organization of the surround region of this Simple cell.
and spatial frequency, in the three different partitions of the visual field already mentioned: MDF, NEAR periphery and FAR periphery. Responses to MDF preceded on average those originating from the FAR periphery by about 30 ms. If one takes into account the angular distance separating the center of the receptive field from the inner border of the grating and the retinotopic magnification factor in area 17 (Albus, 1975), an "apparent speed of propagation" (ASP) can be computed and expressed as the ratio of the distance in the cortical projection map divided by the difference in response latency between the central and peripheral loci of activation. In the present case (Figure 5a), the ASP was in the order of 0.24 m/s. A systematic lag of peripheral responses compared with central ones was observed using a variety of stimuli (flashed squares, flashed bars), with ASP values ranging from 0.02 to 0.9 m/s depending on the tested cell. This rather large spread in latency was not correlated with the type of stimulus used and could reflect the implication of different mechanisms. These electrophysiological results are strikingly similar, even at the quantitative level, to observations by Grinvald and colleagues derived from real-time optical imaging in supragranular layers of Macaque area V1 (Figure 5b). The only noticeable difference is an intrinsic delay between the onset of subthreshold membrane potential depolarization and the detection of an emitted fluorescence signal, probably due to spatial summation (Grinvald et al., 1994). The speeds of propagation measured in these optical measurements range from 0.1 to 0.25 m/s, and are therefore within the range of our own observations. These ASP values are remarkably slow, especially when compared with the conduction velocity of thalamo-cortical fibers: 8 to 20 m/s for X fibers, 15 to 40 m/s for Y fibers in the cat according to Hoffmann and Stone (1971).
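The ASP defined above is a simple ratio, and a small worked example may make the orders of magnitude concrete. The function name `apparent_speed_m_per_s` is hypothetical, and the magnification factor of 0.96 mm per degree is an illustrative assumption chosen only so that the numbers quoted in the text (7.5° inner border, about 30 ms lag, about 0.24 m/s) fit together; it is not a value taken from Albus (1975).

```python
def apparent_speed_m_per_s(eccentricity_deg, magnification_mm_per_deg,
                           latency_diff_ms):
    """Apparent speed of propagation: cortical distance / latency difference.

    eccentricity_deg: angular distance from the RF center to the stimulus.
    magnification_mm_per_deg: cortical magnification factor (assumed value).
    latency_diff_ms: peripheral minus central response latency.
    """
    cortical_distance_mm = eccentricity_deg * magnification_mm_per_deg
    return cortical_distance_mm / latency_diff_ms   # mm/ms is the same as m/s


# FAR periphery example: 7.5 deg, 30 ms lag, hypothetical 0.96 mm/deg.
asp = apparent_speed_m_per_s(7.5, 0.96, 30.0)

# For comparison, the delay expected over the same 7.2 mm at the slowest
# thalamo-cortical conduction velocity quoted above (8 m/s), in ms.
thalamic_delay_ms = 7.5 * 0.96 / 8.0
```

Under these assumptions the ASP comes out near 0.24 m/s, whereas a fiber conducting at 8 m/s would cover the same cortical distance in under 1 ms, far short of the roughly 30 ms lag that was observed.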
This excludes the possibility that peripheral responses are mediated via diverging thalamocortical axonal arbors. It is also unlikely that the graded changes in response latency as a function of eccentricity from the MDF center result from correlated increases in propagation delays of feedback projections from higher order areas, since the latter are mediated by larger RFs, thus unable to account for a strong spatial dependency. The observed latencies are in fact most compatible with the conduction velocity along horizontal axons in slices of rat and cat visual cortex, as measured electrophysiologically (0.15 to 0.60 m/s at 31-34°C in Murakoshi et al., 1993; Nowak, 1995; 0.35 m/s inferred from Hirsch and Gilbert, 1991) or by optical methods (0.16 m/s at 22°C in Nelson and Katz, 1995). Another possibility remains, namely that intracortical poly-synaptic transmission is responsible for the horizontal propagation of waves of activity (Grinvald et al., 1994), and that intracolumnar vertical reverberation might slow down to a certain extent (depending on the level of intracortical inhibition) the tangential spread of activity (Tanifuji et al., 1993). However, this latter scenario seems less plausible, like all the other schemes of facilitation or suppression relying on polysynaptic propagation (lateral interactions within
Figure 5. Differences in latencies between central and peripheral stimulation of the receptive field (a): Same cell as in Figure 4. The averaged responses to the stimulation of the MDF (triangle) and FAR periphery (square) are superimposed on the same graph. The response to peripheral input started to rise about 30 ms after the response to a stimulus restricted to the MDF. The absolute latency of the earliest subthreshold response was in this case 30 ms. (b): Data in area V1 of the anesthetized macaque monkey, adapted from Grinvald et al. (1994). The optical measurement of fluorescence emitted by supragranular cells during incorporation of voltage sensitive dyes was made on a 6 x 6 mm² cortex area and studied before and during the presentation of a stimulus equivalent to the "FAR periphery" annulus used in our experiments. The two traces are the superimposed optical responses sampled in two pixels of the retinotopic cortical map of the stimulated area, one corresponding to the center of the imaged area and receiving indirect activation (triangle), the other one situated 3.6 mm away and receiving a direct feedforward input from the annular grating (square). A lag of 40 ms was apparent between the respective onsets of the two optical responses, compatible with the direct subthreshold measurements made at the intracellular level and presented in a.
the retina or the LGN), because it requires presynaptic spiking activity from intermediate neurons and therefore the existence of much larger MDFs in these units than experimentally reported. Concerning inhibitory interaction, apart from the extrinsic inhibitory connectivity (described mostly in the rat) linking different cortical areas, either between areas 18 and 17 (Mc Donald and Burkhalter, 1993) or between the areas 17 of the two hemispheres (Hughes and Peters, 1992), the best candidates to mediate peripheral hyperpolarizing responses are inhibitory interneurons of area 17 with long range axons (Somogyi et al., 1983). A remaining possibility is that a fraction of the delay between central and peripheral responses results from conduction along the post-synaptic dendrite. For example, for an area 17 pyramidal neuron of layer V, the dendritic path from the site of synaptic contact made by extrastriate originating axons (which run in layer I) to the soma extends over 1 to 1.5 mm. Agmon-Snir and Segev have studied analytically and simulated the integration time due to passive propagation in dendritic trees (Agmon-Snir and Segev, 1993). The center of gravity ("centroid") of an EPSP waveform, depending on the set of membrane parameters, was found to be shifted in time from zero to more than 20 ms. However, this effect reflects more the filtering of the EPSP by the cable properties of the membrane (which affects the rising slope and the abscissa of the peak), than a global translation of the whole response curve along the time axis. Although dendritic propagation might not account for the differences in latencies measured at the point of initiation of the response, it might still be responsible for the smaller amplitudes and slopes of depolarization observed in peripheral responses.
In this respect, an attractive possibility would be that the average distance of the activated synapses from the soma reflects the eccentricity of the visual stimulation with respect to the center of the receptive field. For example, it is remarkable that the layer VI pyramids, whose apical dendrites extend all the way through layers II-III (containing the largest plexus of intrinsic tangential connections), have the largest receptive fields, even though a quantitative relationship between the morphology of the cell and its RF length needs to be re-ascertained (Bolz and Gilbert, 1986; Gilbert, 1977; Grieve and Sillito, 1991). While most retinal models of direction selectivity assume conservation of a retinotopic grain in the mapping of synaptic input onto the dendritic structure of amacrine and retinal ganglion cells (Borg-Graham and Grzywacz, 1992), similar hypotheses have not been explored at the level of a single cortical neuron. Although no correlation has been observed between the tangential organization of the dendritic arbor and the spatial characteristics of the receptive field (size and orientation: Martin and Whitteridge, 1984), to our knowledge no study has investigated a possible proximal/distal mapping of synaptic input onto the somato-dendritic structure in terms of retinal coordinates "ego"-centered on the RF center.
Our study of the voltage-dependency of peripheral excitatory responses showed that in most cases they behaved like classical AMPA-mediated and more rarely like NMDA-receptor activated synaptic inputs. Stimulation of the NEAR periphery revealed a subthreshold response which remained below spiking threshold when the cell was kept at rest (middle row in Figure 5) in spite of the increased spatial summation linked with the use of the coherent grating stimulus. Spiking responses could be revealed in most cases when depolarizing the postsynaptic cell above a given threshold (top row in Figure 5). The level of the required depolarization depended on the recorded cell, and was in most cases higher the further away the peripheral stimulus was from the border of the MDF. One of the dominant aspects of using fields of stimulation larger than the MDF itself is the suppressive interaction produced between concomitant stimulations of different parts of the visual field. For instance, integration along the long axis of a hypercomplex RF, corresponding to its preferred orientation, is highly non-linear. Our own data show at the subthreshold level that localized stimulation with the preferred orientation restricted to the polar end-zone of hypercomplex cells is excitatory, whereas a reduction in firing is observed as expected when a long slit of light co-stimulates these regions in spatial continuity with the MDF. A general rule seems to be that above a certain degree of spatial summation a divisive effect predominates, i.e. a general decrease is found for both excitatory and inhibitory potentials. These conclusions hold when the same orientation is presented at the same time in the NEAR and FAR periphery, or in the MDF and the NEAR periphery, and are in agreement with extracellular observations already made by DeAngelis and collaborators (De Angelis et al., 1994).
Recent extracellular data suggest however that facilitatory effects might be observed even in hypercomplex cells when the co-aligned central and peripheral stimuli are separated by sharp contrast discontinuities (Kapadia et al., 1995). These data can be compared to the models of hypercomplexity initially proposed by Hubel and Wiesel, who hypothesized that suppression of firing observed when increasing the length of an optimally oriented stimulus across the end zones of the RF might stem from the recruitment of inhibitory interneurons, the RF of which might be either located next to the RF of the main excitatory afferents, or partially overlapping it (Hubel and Wiesel, 1965). The work of Sillito and Orban and collaborators later favored that second possibility (Orban et al., 1979; Sillito, 1977) and implied a selective recruitment of inhibitory inputs with long stimuli bypassing the length of the MDF. Sillito imagined a "veto" mechanism, where excitatory inputs coming from different parts of the visual field converge both on the postsynaptic cell and on inhibitory interneurons which above a certain level of presynaptic activation would shunt or suppress the excitatory effect produced by the direct feedforward
inputs (Sillito, 1977; Sillito and Versiani, 1977). A partial correlate can be found at the cellular level in the in vitro study of lateral intracortical connectivity made 15 years later by Hirsch and Gilbert (1991). These authors found that increasing stimulation intensity of the horizontal plexus in slices of cat area 17 revealed progressively an inhibitory component forbidding the suprathreshold activation of presumably excitatory cells (Hirsch and Gilbert, 1991). This negative regulation of postsynaptic firing was absent in the same cells when stimulating white matter or local afferent circuits. Furthermore, fast spiking cells, identified as inhibitory on the basis of their non-adapting discharge in response to current injection, tended to respond in a graded fashion (although with a higher threshold) when increasing the level of drive of long-distance lateral input. This differential behavior of excitatory and inhibitory elements has been used by Somers et al. (1994) and Stemmler et al. (1995) to model the non-linearity of lateral visual interactions (see also Figure 11). The different extracellular and intracellular data reviewed here lead us to conclude that the subthreshold integration field of visual cortical cells is mostly excitatory (see SLF in Figure 1), but that suppressive interaction between different parts of the visual field stimulated in spatial and temporal contiguity tends to appear and antagonize excitatory responses, more efficiently the larger the compound size of visual field being co-activated (see Association Field in Figure 1).
5. TEMPORAL STRUCTURE OF VISUALLY EVOKED SYNAPTIC ACTIVITY

The binding problem, which is a central issue in sensory perception, addresses the question of how the distributed neural information about elementary features of an image can be integrated in order to generate a coherent perception (Ballard, 1986; Damasio, 1989; Treisman and Gelade, 1980). It is in the auditory system that the first model of "coding-by-correlation" was proposed. In this theoretical approach, the segmentation of an acoustic input flow was achieved by patterns of synchronization and de-synchronization between detectors of spectral components (Von der Malsburg and Schneider, 1986). A particular assumption was that all the elementary units forming the neural network used for the simulation had a "burst" type of firing, i.e. a rhythmical activity at an almost fixed periodicity. The phase-locking of the oscillators was generated both by the temporal structure of the input and by the patterns learned by the network on a slower time scale. Related electrophysiological observations were made a few years later in visual cortex simultaneously by Eckhorn et al. (1988) and Gray et al. (1989) who reported the
existence of stimulus-induced oscillations of neural activity that could synchronize across large cortical distances as a function of global features of the stimulus. Although the periodic burster activity assumed in the model of Von der Malsburg and Schneider had not been shown to be critical in order to obtain temporal segregation, Singer and Von der Malsburg further assumed that cortical gamma oscillations (30-70 Hz) induced during visual activation were instrumental in building up synchrony (Singer, 1990; Singer, 1993; Von der Malsburg and Singer, 1988). A literal prediction that can be derived from such a hypothesis is that oscillatory activity should be more manifest for features which are critical in binding assemblies (Stryker, 1989). A simplified view is that, if synchrony acts as a "glue" between specialized feature detectors (Engel et al., 1992), these should oscillate for the stimulus values they are tagged for, i.e. oscillatory behavior should follow the stimulus-preference of cells. A possible scenario is that intrinsic bursting neurons at the cortical level set the rhythm and the orientation selectivity of the oscillations and maintain some phase locking in the reverberation of activity within or across cortical columns. Whereas in vitro literature has shown that neocortical cells possess persistent sodium, and transient potassium and calcium conductances able to generate subthreshold and spiking periodic activity from 30 to 50 Hz (Llinas et al., 1991; Silva et al., 1991; Wang, 1993), there is up to now no direct evidence showing how these voltage-dependent currents contribute to visually induced oscillations (Bringuier et al., 1992; Jagadeesh et al., 1992; but see McCormick et al., 1993).
In contrast, recent pharmacological dissection of excitatory/inhibitory networks and simulation studies using small assemblies made by formal or simple integrate-and-fire neurons demonstrate that periodic dynamics emerges readily from the recurrent structure of the network, whether it is formed by feed-back inhibitory connections (Schuster and Wagner, 1990; Whittington et al., 1995), purely excitatory feedback (Deppisch et al., 1993) or both of these (Cotterill and Nielsen, 1991; Hansel and Sompolinsky, 1992; Schillen and Konig, 1991; Traub et al., in press; Wilson and Cowan, 1972). The intrinsic vs. extrinsic origin of cortical oscillations can be addressed experimentally using intracellular recordings in vivo. An early report, based on a small number of cells, described periodic inhibitory potentials during visual stimulation (Ferster, 1986). The same laboratory, using blind patch clamp recordings, later reported rhythmic EPSPs, although the involvement of IPSPs was not excluded (Jagadeesh et al., 1992). These data can be compared with our own intracellular database, mostly recorded with sharp electrodes. We developed quantitative methods to study the oscillatory behavior in single cells, based on periodicity detection in autocorrelation functions or histograms (Figure 6),
and applied them to extracellular and intracellular recordings in the kitten and cat visual cortex (Bringuier et al., 1992; Fregnac et al., 1994a; Bringuier et al., submitted). Our main findings were that there is no strict relationship between the current-induced activity pattern and the synaptically induced activity, and that subthreshold intrinsic oscillations are rarely seen in vivo. While in most cells the frequency of the current-induced repetitive firing increases with the amount of current, visually evoked oscillations when present have a stable frequency whatever the level of polarization. Figure 7 illustrates periodic inward currents or phase-locked sequences of inward/outward current for cells recorded in single electrode discontinuous voltage clamp mode. A simple examination of subthreshold oscillations provides a better indicator of synchronicity arising from the convergence of inputs on a single cortical cell. Our intracellular data show that a single period within an oscillatory visual response consists of a composite depolarizing potential, generally predominantly excitatory, with a typical amplitude of 5-15 mV. These values can be compared with in vitro measures made using double intracellular recordings, where amplitudes of unitary EPSPs ranged between 0.05 and 2.3 mV at resting potential (Fetz et al., 1991; Mason et al., 1991; Nicoll and Blakemore, 1993; Thomson and Deuchars, 1994; Thomson and Radpour, 1991). If we assume this range to be valid for cat visual cortex in vivo, it follows that i) each periodic potential results from many presynaptic spikes (possibly conveyed by many presynaptic fibers) and ii) this presynaptic activity is segmented in temporal clusters of action potentials. Ideally, voltage clamp techniques can be used to directly monitor postsynaptic currents and by-pass the filtering effect of the membrane time constant.
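The idea of detecting periodicity in an autocorrelation function can be illustrated with a minimal sketch. It is not the procedure used for the recordings described here: the function names, the acceptance threshold `min_peak`, and the synthetic 40 Hz test signal are all assumptions chosen for illustration. The sketch computes a normalized autocorrelation of a sampled membrane potential (or binned spike count), skips the central peak, and reports the lag of the first sufficiently large satellite peak as the oscillation period.

```python
import math

def autocorrelation(signal, max_lag):
    """Normalized autocorrelation of the mean-subtracted signal, lags 0..max_lag."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [x - mean for x in signal]
    var = sum(c * c for c in centered)
    if var == 0:
        # A flat signal carries no temporal structure.
        return [0.0] * (max_lag + 1)
    return [
        sum(centered[i] * centered[i + lag] for i in range(n - lag)) / var
        for lag in range(max_lag + 1)
    ]

def dominant_period(signal, dt_ms, max_lag, min_peak=0.3):
    """Lag (ms) of the first satellite peak of the autocorrelation, or None.

    min_peak is a hypothetical acceptance threshold, not a published criterion.
    """
    ac = autocorrelation(signal, max_lag)
    # Skip the central peak: start searching after the first negative value.
    start = next((k for k, v in enumerate(ac) if v < 0), None)
    if start is None:
        return None
    for k in range(start + 1, max_lag):
        if ac[k] >= min_peak and ac[k] >= ac[k - 1] and ac[k] > ac[k + 1]:
            return k * dt_ms
    return None

# Synthetic example: a 40 Hz oscillation (25 ms period) sampled every 1 ms.
synthetic = [math.sin(2 * math.pi * t / 25.0) for t in range(500)]
period_ms = dominant_period(synthetic, dt_ms=1.0, max_lag=60)
```

A response such as the one in Figure 6a, with depolarizations at a variety of frequencies, would be expected to show no satellite peak above threshold, in which case the function returns `None`.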
In the cell illustrated in Figure 8, both the variability and the "bumpiness" of the periodic inward current peaks suggest that they result from the action of many presynaptic spikes. In the chosen example, the average value of the periodic postsynaptic currents was close to 300 pA at -66 mV. In rat visual cortex in vitro, unitary EPSC amplitudes have been found to range from 5 to 90 pA at resting potential, with a mode at 20 pA (Stern et al., 1992). If these numbers hold for cat visual cortex in vivo, and under the assumption of linearity, 15 unitary EPSCs are required to trigger an action potential, given that in current clamp recording at -66 mV of the cell shown in Figure 8, each periodic potential generated one spike on average (data not shown). Because of the limitations in space clamping, exacerbated by a tonic shunt due to the high level of activity often found in vivo, it is most likely that the various assumptions made in our calculations give a lower estimate; a range from 15 to 100 synchronously active synaptic sites seems more reasonable. This number remains small in comparison with the thousands of excitatory synapses received by each pyramidal cell.
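The estimate of 15 unitary EPSCs is a simple division under the linearity assumption, and it can be reproduced as a worked example. The amplitudes are those quoted in the text; the function name is hypothetical.

```python
def unitary_inputs_needed(compound_pa, unitary_pa):
    """Number of unitary EPSCs summing linearly to the compound current."""
    if unitary_pa <= 0:
        raise ValueError("unitary amplitude must be positive")
    return compound_pa / unitary_pa

COMPOUND_PA = 300.0                       # mean periodic current at -66 mV (text)
UNITARY_MODE_PA = 20.0                    # mode of unitary EPSCs (Stern et al., 1992)
UNITARY_MIN_PA, UNITARY_MAX_PA = 5.0, 90.0

at_mode = unitary_inputs_needed(COMPOUND_PA, UNITARY_MODE_PA)      # 15.0
with_smallest = unitary_inputs_needed(COMPOUND_PA, UNITARY_MIN_PA)  # 60.0
with_largest = unitary_inputs_needed(COMPOUND_PA, UNITARY_MAX_PA)   # about 3.3
```

Taking the full 5 to 90 pA spread of unitary amplitudes gives roughly 3 to 60 inputs by division alone. The upper figure of 100 quoted in the text does not come from this arithmetic; it reflects the additional allowance made for imperfect space clamp and tonic shunting.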
Figure 6. Visually evoked periodic and non-periodic postsynaptic potentials. Responses from two different cells, one non-periodic (a, left panel) and the other periodic (b, right panel), to a single sweep of a light bar across the RF. (a): Area 17 neuron recorded in a 14 week old kitten. The visual stimulation elicited sequences of large and fast depolarizations, occurring at a variety of frequencies and resulting in an irregular spike train. A detailed view of the cell's response, showing an absence of strict periodicity, is presented at the bottom on a faster time scale. (b): Area 17 neuron recorded in a 9 week old kitten. The response to a moving bar (top) was characterized by large, fast and highly periodic depolarizations, eliciting a rhythmic discharge. The detailed view presented at the bottom shows that periodic potentials had rather burst-like shapes on which spikes rode at high frequency, and were separated by silent episodes.
Figure 7. Synaptic events associated with visually evoked periodic activity. Records from two cells (left and right panels) stimulated with drifting bars, showing two forms of oscillatory behavior analyzed in current clamp at rest (upper half) and voltage clamp held close to the resting potential (lower half). Left: in this cell, recorded in a 9 week old kitten, periodic events observed in current clamp consisted of depolarizing potentials separated by pauses during which the membrane potential was stabilized close to the resting level (-66 mV). A similar pattern was observed when the cell was voltage clamped at -66 mV (bottom), demonstrating further that periodic potentials stemmed from bursts of synchronized inward synaptic currents. Right: in this complex cell, recorded in an 8 week old kitten, periodic events in current clamp consisted of alternate depolarizing and hyperpolarizing potentials shifted in phase by 45° (top), corresponding to a fixed sequence of inward and outward synaptic currents when the cell was voltage clamped at -60 mV (bottom).
Another characteristic pattern present in EPSC recordings is that despite variability, a clear asymmetry between the rising and the falling phase is preserved. In most cases, the falling phase follows the rising phase immediately, without any apparent plateau. The rising phases last on average 17.7 ± 12.2 ms (20%-80% rise time, n=196 EPSCs). A theoretical development (to be published elsewhere) shows that the rise time probably gives an upper bound (because of imperfect space clamp) on the duration of overlap of presynaptic activity, i.e. on the degree to which presynaptic spikes are synchronized (Bringuier, 1995). The shorter rise times are 7-10 ms, which corresponds to about one tenth of the period of the oscillation, suggesting a highly structured temporal activity. The possibility that presynaptic spikes are not exactly synchronized but might be slightly spread in time has been explored by several theoretical studies: Diesmann and colleagues have developed a formalism to characterize the efficiency of the transmission of such "pulse packets" through synaptic connections (Diesmann et al., 1995). Bernander and collaborators showed that for a postsynaptic neuron showing a refractory period, presynaptic spikes will tend to be more efficient in eliciting several spikes in a row from the common target cell if their arrival times show some temporal jitter (Bernander et al., 1994). If clustering of synaptic events appears optimal in terms of transfer of information in the network, episodes of synchrony do not need to reoccur periodically, as illustrated in Figure 6a. This conclusion is supported by reports based on cross-correlation studies showing non-oscillatory synchronization (Krüger and Aiple, 1988; Nelson et al., 1992; Schwarz and Bolz, 1991; Toyama, 1988; Ts'o et al., 1986).
It strengthens the view that multiple sudden depolarizations and repolarizations can be observed at a variety of frequencies resulting in a propagation of pulse packets in a non-periodic manner. The packets of 15-100 presynaptic spikes as indicated above are probably conveyed by a comparable number of fibers for the following reasons: i) synchronization of input results in a barrage of activity impinging on the postsynaptic cell within a temporal window lasting for 10-20 ms; ii) in our recordings visually evoked bursts rarely contained more than 3 action potentials, and iii) morphological reconstruction of identified geniculo-cortical or intracortical axons shows that each of these makes only a few contacts per postsynaptic cell (Anderson et al., 1994; Thomson and Deuchars, 1994). In our view, measuring the rise times of large EPSCs is a procedure somewhat analogous to cross-correlation. Although, as we mentioned earlier, it gives an image of presynaptic synchronization that might not be extremely sharp, it has the advantage over classical cross-correlation measurements of being more functional, in the sense that the monitored activity is indeed postsynaptically integrated.
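The 20%-80% rise time used above as an index of presynaptic synchronization can be sketched as follows. This is an illustrative sketch rather than the analysis actually applied to the recordings: it assumes the EPSC has already been baseline-subtracted and sign-inverted so that the inward current is positive, and the function names and the synthetic trace are hypothetical.

```python
def crossing_time(trace, dt_ms, level, end_index):
    """Time (ms) of the first upward crossing of `level` at or before end_index.

    The crossing is linearly interpolated between the two bracketing samples.
    """
    for i in range(1, end_index + 1):
        lo, hi = trace[i - 1], trace[i]
        if lo < level <= hi:
            return (i - 1 + (level - lo) / (hi - lo)) * dt_ms
    return None

def rise_time_20_80(trace, dt_ms):
    """20%-80% rise time (ms) of a positive-going, baseline-subtracted current."""
    peak_index = max(range(len(trace)), key=lambda i: trace[i])
    peak = trace[peak_index]
    t20 = crossing_time(trace, dt_ms, 0.2 * peak, peak_index)
    t80 = crossing_time(trace, dt_ms, 0.8 * peak, peak_index)
    if t20 is None or t80 is None:
        return None
    return t80 - t20

# Synthetic asymmetric EPSC: 20 ms linear rise to 300 pA, then a slower 60 ms decay.
rising = [15.0 * i for i in range(21)]
falling = [300.0 - 5.0 * j for j in range(1, 61)]
synthetic_epsc = rising + falling
rise_ms = rise_time_20_80(synthetic_epsc, dt_ms=1.0)  # close to 12 ms
```

For a linear rise, the 20%-80% interval covers 60% of the total rise duration, which is why the 20 ms ramp above yields about 12 ms.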
Figure 8. Profile and variability of periodic visually evoked inward synaptic currents. The analysis of synaptic currents recorded in a complex cell of a nine week old kitten was restricted to responses evoked by repeated sweeps of an optimally oriented stimulus at a fixed velocity across the RF, while the cell was held in voltage clamp at -80 mV. (a): Superposition of 28 periodic EPSCs extracted from four trials synchronized with the maximum of the inward peak. A large variability is apparent in both the rising and falling phases of the compound currents. In spite of this variability, a clear asymmetry differentiates the slopes of the rising and the falling phases. (b): The twenty-eight EPSCs have been clustered according to the trial from which they were extracted. Variability seen between individual synaptic events appears to be present both within and between trials.
6. SPATIO-TEMPORAL INTERACTIONS IN THE ASSOCIATION FIELD

As mentioned briefly earlier, theories of perceptual binding based on correlational schemes presuppose the existence of "Hebbian-like" plasticity in order to account for cell assembly formation. However, changes in functional links between coactive cells are hypothesized to operate reversibly on a much faster time scale (ms) than synaptic changes described during development, i.e. they are compatible with the minimal integration time necessary to separate a figure from its background or for recognizing objects as parts of a more complex scene (Von der Malsburg, 1981; Von der Malsburg and Bienenstock, 1986). It was assumed, in particular, that "synfire chains" or graphs grow and wane during recognition because of rapid activity-dependent changes in synaptic efficacy (Abeles, 1991; Bienenstock and Doursat, 1989). In spite of this declared quest for "fast Hebbian" synapses, the Hebbian term should not be taken too literally: reversible modulation of functional links, if it exists, does not in theory require a recapitulation of the elementary subcellular processes responsible for Hebbian associative LTP and heterosynaptic LTD, in order to share a similar phenomenology accelerated on a 10^ to 10^ faster time-scale. An obvious non-Hebbian form of "fast" changes in synaptic efficacy may be found in the rather general observation that the evoked response in cortical cells elicited by presynaptic spikes activating the same synapse less than 50 ms apart can undergo fast up- and down-regulations in amplitude. It is indeed well established, at least in vitro, that two successive stimulations of the same presynaptic fiber induce paired pulse depression (Wilcox and Dichter, 1994) or potentiation (Thomson et al., 1993b) of the test response (review in Magleby, 1987; Thomson et al., 1993a).
Such fast changes in synaptic gain are thought to be predominantly of presynaptic origin, and linked to free calcium accumulation or depletion and alterations in vesicular recruitment which will affect subsequent liberation of neurotransmitter (Zucker, 1989). Recent evidence obtained in identified synapses between pairs of neurons recorded intracellularly suggests that the sign of the change might be dependent on the initial synaptic efficacy (Debanne et al., 1996): the weaker the synapse, the more readily it will be susceptible to be facilitated and thus help to propagate the chain of synchronous activity. Nevertheless the implication of postsynaptic regulatory mechanisms cannot be excluded at both excitatory and inhibitory synapses and might involve receptor desensitization or fast-acting regulation of second messengers (Marty and Llano, 1995). Another class of regulatory processes which cause strong non-linearities in the final summation and integration of synaptic events carried out by the postsynaptic neuron is linked to the activation of voltage-dependent and temporally gated conductances allowing
the cell to boost or damp the response to a test synaptic input during a fixed temporal window. For instance the in vitro work of Deisz et al. (1991) in somatosensory cortex shows a strong APV-resistant voltage-dependency of synaptic potentials evoked by white matter activation. These authors demonstrated that a low threshold calcium current is responsible for the increase in size and duration of the initial EPSP appearing for more depolarized membrane potential values (Figure 9a). The most remarkable feature of this process is that the amplification gain depends on the past history of the membrane potential, and acts over a critical temporal window, in the order of 100 to 200 ms following the onset of depolarization, during which the boosting of the postsynaptic response can be expressed independently of an actual change in synaptic transmission. When the cell has been kept depolarized for a longer period, the arrival of synaptic input, out of phase with the onset of the change in membrane potential, no longer benefits from the calcium inward current which is now inactivated (Figure 9b). How general is this process in cortical cells? It is now well established, at least in the pyramidal neurons of CA1, that apical dendrites possess voltage sensitive calcium conductances (Magee and Johnston, 1995a-b). The currents are mainly of the T type (Christie et al., 1995), as found by Deisz et al. in neocortical neurons (Deisz et al., 1991). At the level of dendritic spines and shafts, back-propagating action potentials elicit intracellular calcium transients that can cooperatively enhance the calcium accumulation resulting from synaptic activity (Yuste and Denk, 1995). In the apical dendrites of pyramidal cells of the primary visual cortex, voltage dependent calcium currents are also present and activated by synaptic input (Markram et al., 1995; Yuste et al., 1994).
These have even been implicated in the induction of LTP following repeated activation of the same afferent pathway in the presence of APV (Komatsu et al., 1991). Other mechanisms of fast regulation of EPSPs should be considered. In principle, a major source of voltage dependent amplification should be observed following NMDA receptor activation (Stern et al., 1992; Thomson, 1986), and visual responses elicited at rest have been shown to contain an APV-sensitive component (Miller et al., 1989). Another type of conductance which might prolong the depolarization evoked by an excitatory input is that linked to persistent sodium channels and has been described in vitro in somatosensory and visual cortical neurons (Hirsch and Gilbert, 1991; Stafstrom et al., 1984). A local iontophoretic application of glutamate on the apical dendrites of somatosensory cortex layer 5 cells induces an inward current, the amplitude of which increases with depolarization (Schwindt and Crill, 1995). Because this facilitation is not impaired by voltage clamping the soma, these authors conclude that the site of interaction must be rather distal. However,
Stuart and Sakmann, using local control and simulation of the postsynaptic depolarization while recording simultaneously from the soma and dendrite of the same layer 5 pyramidal cell, have shown in visual cortex that the TTX sensitivity of the depolarization-induced facilitation of the EPSPs is spatially restricted to a region close to the soma (Stuart and Sakmann, 1995). In spite of this discrepancy, one cannot exclude that very distal dendrites have enough sodium channels to amplify significantly local synaptic inputs. As far as a graded regulation of synaptic gain is concerned, the conductance dependent gain control hypothesis presents a definite advantage over a scenario based on pure synaptic plasticity: it does not require reversibility of the established changes, whereas fast Hebbian schemes have to be accompanied by complementary depression rules to reset connectivity back to its initial state, i.e. before the start of the recognition process. However the diversity of conductances participating in the postsynaptic machinery offers symmetrical associative ways of damping afferent input, thus providing a normalization of synaptic weights, certain synapses being boosted whereas others might be transiently weakened. An example can be found in non-cortical networks, such as the solitary complex (Figure 9 c-d). In this structure, which is a caudal bulbar relay of the brainstem involved in respiratory activity (Bianchi et al., 1988; Ezure, 1990), it has been shown that low threshold calcium currents present before birth (which in neocortical neurons are responsible for the associative boosting) are replaced during the first postnatal week of life by potassium currents at the same time as GABA-ergic innervation appears (Schweitzer et al., 1992). Associative protocols similar to those applied in neocortical neurons reveal a transient damping of synaptic input at that age (Fortin, 1993).
This opposite change in the modulation of the apparent synaptic gain is due to a potassium outward current, probably of the IA type, which shortens the time course of the EPSP if the depolarization has been applied within the last 100 ms, but is inactivated for longer delay periods (Figure 9d). In order to test for the existence of boosting or damping of subthreshold inputs in visual cortical neurons, we decided in collaboration with F. Chavane and J. Lorenceau to replay the protocol devised by Deisz et al. in a more functional framework. The aim of this work was several fold: i) to study the spatial and voltage dependency properties of subthreshold responses in the periphery of the MDF, and iii) to determine how synchronization of one input with somatic current depolarization, or of different sources of subthreshold activity, could facilitate their postsynaptic integration. The experimental protocol was devised in the following way: the same visual stimulus was shown twice, one second apart, and the two test responses of the cell were recorded and compared at the same membrane potential (bottom part in Figure 10). The holding potential was varied from one
trial to the next to reveal a possible voltage dependency. For a given trial, the only distinction between the two successive visual stimulation epochs was their relationship with the recent history of the membrane potential: at the beginning of the trial, the cell was held at rest; then, the onset of the first presentation of the visual stimulus was paired with the intracellular injection of a constant subthreshold current, the application of which was maintained throughout the second presentation of the stimulus (Bringuier et al., 1995). The complete "Current-Vision association" protocol consisted of randomly interleaved pairings of dual sequential visual stimulations in different parts of the visual field (MDF, NEAR, FAR) with an intracellular subthreshold current step. The value of the current for a given trial was a fraction of the threshold intensity needed to reach spike initiation. Both polarities of current and spatial locations of the test stimuli were randomly chosen in order to avoid long-lasting effects of the pairing procedure. The onset of the current pulse was matched in phase with the ON response to the first visual stimulus such that the
Figure 9. Role of slowly inactivating conductances in the modulation of the effective synaptic gain in vitro. a and b: data adapted from Deisz et al., 1991, with permission. c and d: data adapted from Fortin, 1993, with permission. (a): Voltage dependency of EPSPs recorded in a rat sensorimotor cortex layer 2-3 cell and evoked by stimulation of white matter. The superimposed traces correspond to recordings with and without orthodromic stimulation at different levels of depolarization. EPSPs evoked at a depolarized level, 50 to 100 ms after the current onset, are markedly larger than control EPSPs evoked at resting potential or evoked when the cell has been kept depolarized for a long time. The current step value is chosen just below the level of activation of the low threshold calcium current. (b): Superimposed traces of EPSPs evoked in independent trials and positioned at different delays after the onset of the depolarizing current pulse. A selective enhancement of the EPSP is observed only for that evoked a short delay after the onset of the current pulse. (c): Voltage dependency of EPSPs recorded in a cell of the solitary complex evoked by stimulation of the solitary tract. The superimposed traces correspond to recordings with and without orthodromic stimulation at different levels of depolarization. EPSPs evoked at a depolarized level, 50 to 100 ms after the current onset, are markedly shorter in duration than control EPSPs evoked at resting potential or evoked when the cell has been kept depolarized for a long time. (d): Superimposed traces of EPSPs evoked in independent trials and positioned at different delays after the onset of the depolarizing current pulse. The comparison of the shape of all recorded EPSPs, shown in the lower inset, demonstrates the selective shortening of the falling phase of the EPSP, due to a potassium current (see text) activatable only for a short period following the onset of the current pulse.
[Figure 9 panel labels: CORTEX, SYNAPTIC BOOSTING (a, b); SOLITARY COMPLEX, SYNAPTIC DAMPING (c, d).]
membrane potential depolarization would start concomitantly with the first evoked visual response, i.e. would be applied at a delay precisely adjusted to compensate for the latency of visual cortical responses. The second visual stimulus was presented out of phase with respect to the pulse onset, i.e. 1 second later, at a delay where temporal interaction no longer occurs between successive stimuli (Nelson, 1991). Three types of effects were observed:
- i) One concerned the firing frequency of the neuron: in certain cells the response of the cell to the first stimulus became larger and more tightly tuned temporally than the response to the second stimulus. This observation might appear surprising since increases in the duration of the postsynaptic potential under the influence of calcium conductances do not predict an improvement in the synchronization of temporally overlapping inputs.
- ii) A second finding was that in some of the cells, peripheral input led to a synaptic potential whose amplitude and duration were favored during the first visual stimulus presentation, thus reproducing in a functional context the observations made by Deisz and collaborators (1991).
- iii) A third observation was that occasionally IPSPs could also benefit from such boosting.
In summary, cells in visual cortex seem to possess the non-linear conductances needed to selectively boost either excitatory or inhibitory synaptic input. These conductances could be turned on during specific functional situations, such as the contiguous presentation of a visual stimulus in the MDF of the recorded cell and of a contextual pattern in the periphery of the RF.
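The structure of the "Current-Vision association" protocol can be summarized as a trial schedule. The sketch below is a reading of the description given in the text and in the legend of Figure 10, not the software actually used: the function and field names are hypothetical, the five current fractions are an assumption, the response latency is a placeholder value, and the end of the current step is assumed to coincide with the end of the second visual response.

```python
import random
from fractions import Fraction

LOCATIONS = ("MDF", "NEAR", "FAR")
# Five current levels as fractions of the threshold intensity T, both polarities.
# These specific values are an assumption made for illustration.
CURRENT_FRACTIONS = (
    Fraction(1), Fraction(2, 3), Fraction(1, 3), Fraction(0), Fraction(-1, 3)
)

def build_trials(repeats, latency_ms, seed=0, stim_ms=400, gap_ms=600):
    """Randomly interleaved list of trials, one dict per trial (times in ms)."""
    rng = random.Random(seed)
    trials = []
    for _ in range(repeats):
        for location in LOCATIONS:
            for fraction in CURRENT_FRACTIONS:
                second_onset = stim_ms + gap_ms  # 1 s after the first onset
                trials.append({
                    "location": location,
                    "current_fraction": fraction,
                    "stim1_on_ms": 0,
                    "stim1_off_ms": stim_ms,
                    "stim2_on_ms": second_onset,
                    "stim2_off_ms": second_onset + stim_ms,
                    # Current onset is delayed by the response latency so that
                    # it coincides with the first visually evoked response.
                    "current_on_ms": latency_ms,
                    # Assumed: the step ends with the second visual response.
                    "current_off_ms": second_onset + stim_ms + latency_ms,
                })
    rng.shuffle(trials)  # interleave locations and current polarities
    return trials

# Example: 16 repetitions of each pairing, with a placeholder 50 ms latency.
schedule = build_trials(repeats=16, latency_ms=50)
```

In each trial the first stimulus is in phase with the current onset and the second is not, so the two responses within a trial differ only in their delay relative to the onset of the depolarization.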
However the associative requirements between the test stimulus onset and the somatic intracellular depolarization used in the current-vision protocol impose peculiar timing relationships during natural visuo-visual associations: in order to benefit from the temporal selectivity in inactivation kinetics of postsynaptic conductances, non-homogeneities in the latencies of responses originating from different parts of the visual field (see Figure 5) have to be accounted for and precisely balanced. For instance, we predict a more efficient boosting when the stimulation of the central part of the RF by the inner grating of a cross-oriented bipartite stimulus is preceded by the peripheral appearance of the outer grating by a few tens of milliseconds (see also Zipser, 1995). A possible test situation would be the use of a collapsing annulus (centripetal motion) which should be more efficient than expanding circles (centrifugal motion) in favoring postsynaptic summation. In that respect, it would be interesting to compare the spatial distribution in the latencies of subthreshold responses found in primary cortex with those observed in non-primary visual areas, such as the Clare-Bishop area of the cat, where a
[Figure 10 labels: scale bars 20 ap/s, 10 mV, 500 ms; traces VISUAL and CURRENT; current levels T, 2/3 T, 1/3 T, 0, -1/3 T.]
Figure 10. Boosting of visually evoked lateral EPSPs by temporal coincidence with postsynaptic depolarization. Complex cell, recorded in area 17 of an adult cat at a resting potential of -73 mV. The association protocol is schematized in the lower part of the figure: an identical visual stimulus was shown twice in the same trial, each time for a duration of 400 ms, separated by a 600 ms delay. Meanwhile, a step of current was injected so that its onset coincided with the beginning of the visual response (which required an initial measurement of its latency). The two successive presentations of the stimuli differed only by their delay with respect to the current pulse. From trial to trial, the parameters which were varied were the amount of current injected (five levels, expressed as a fraction of the intensity (T) needed to elicit a significant change in the spontaneous level of spiking) and the retinal location of the visual stimulus (MDF, NEAR, FAR, see text). The two upper graphs represent the response to the association of the largest intracellular current (+0.27 nA) with a NEAR stimulation, using a moving grating restricted to the side bands of the MDF. The top histogram is the PSTH averaged over 16 trials and the lower trace corresponds to the averaged membrane potential after spike filtering. The second presentation of the test grating evoked almost no response, whereas the first one elicited a large membrane depolarization and a significant increase in firing, which lasted for the duration of the stimulation.
large proportion of lateral suprasylvian sulcus cells prefer centrifugal motion away from the area centralis (Rauschecker et al., 1987; but see Toyama et al., 1990). A possible research line in these different cortical areas could be to look for optimal temporal association sequences of spatially distributed patches of elementary visual patterns derived from the knowledge of the distribution of subthreshold input latencies across the visual field.
7. THE MAGIC RING

In summary our data suggest that the receptive field of a visual cortical cell should not be considered as a fixed entity but more as a dynamic field of integration and association. Two types of dynamics can be argued for:
- i) The central core of the receptive field (MDF) can be profoundly reorganized at least during development and most probably during selective phases of learning under the control of activity-dependent mechanisms. Adaptive changes in visual responses are thought to reflect long-lasting potentiation and/or depression of synaptic efficacies.
- ii) During sensory processing, reconfiguration of synaptic weights may be achieved on a much faster time-scale and linked to non-linear properties of the postsynaptic membrane as well as that of recruited networks. Association of information available in the central part of the RF and of input coming from the reputed "unresponsive" regions surrounding it, or arising simultaneously from different parts of the visual field, might be suppressive in certain cases and capable of boosting hidden responses in other cases, depending on the global stimulus configuration.
We conclude from these two lines of evidence that the spatial extent of cortical receptive fields could vary on at least two distinct time scales. The slower one, compatible with time constants of LTP and LTD, would reflect memory traces of the past activity of the network in which the cell is embedded. The faster one, compatible with recognition processes, would depend on the kinetics of inactivation of specific conductances and the recent history of the postsynaptic membrane potential. A remarkable observation is that the SLF of cortical cells assessed intracellularly is much larger than that expected from classical extracellular methods.
The unmasking of peripheral responses at the suprathreshold level is often observed when the postsynaptic cell is depolarized and/or when certain combinations of inputs distributed across the visual field are used. This process could require synchronization of converging subliminal inputs evoked by the simultaneous presentation of co-linear stimuli or stimuli sharing the same direction or orientation across the visual field, which might increase the gain of horizontal
excitatory connectivity. A puzzling question is whether the intrinsic oscillating behavior or the resonance of recurrent inputs that have been described in cortical networks participates in building up synchrony. Our own data suggest that a cortical cell does not need to undergo a rhythmic pattern of activity to be able to synchronize itself with other cells. Rhythmicity is predominantly observed for a restricted set of visual inputs, which are not necessarily optimal for the cell's firing. Periodic composite synaptic potentials most probably reflect the working signature of recurrent circuits prone to resonate. However, the detailed study of the temporal profiles of visually evoked PSPs and PSCs leads us to conclude that even at the single cell level it is not possible to separate oscillatory behavior from network synchrony. On one hand, no convincing evidence has been presented in the literature that oscillations participate in the postsynaptic elaboration of synchronous firing. Complex stimulus dependency does not argue for hierarchical models in which in-phase depolarizations evoked by various features of the same complex object summate preferentially on hypothetical grandmother cells. On the other hand, detailed analysis of temporal patterns of membrane potential trajectories indicates that oscillations seen at the single unit level are often composed of synchronous barrages of EPSPs, or of phase-locked EPSPs and IPSPs separated by pauses of synaptic activity. Thus, the oscillatory behavior of cortical cells results predominantly from synchronized packets of afferent input.
If rhythmicity in oscillations does not appear to be instrumental in building synchrony, the alternation of pauses at the resting membrane potential with abrupt changes in the polarization level of the neuron might be enough to do the job: the onset of a depolarization plateau could allow the activation of slowly inactivating conductances which might help to boost transiently incoming input. The end of the burst and the associated AHP would de-inactivate the conductances and reset the input gain control mechanism before the next burst. To summarize our view, it is likely that repetition, as Von der Malsburg proposed, might be necessary for the synchrony to be reliably distinguished from the single episodes of correlated activity that are likely to occur by chance, but no strong evidence has so far been provided showing unambiguously that periodicity in the recurring synchronizing wave is also required. As stated above, an attractive cellular process that might reinforce synchronous transmission of converging inputs consists in the selective amplification of sensory responses when the stimulus appearing in the central part of the receptive field differs strongly from that presented in the periphery ("pop-out effect") and already evokes a subthreshold change
in the polarization state of the neuron, favorable for the activation of boosting conductances. According to the literature reviewed in section 6, this would allow the convergence and the synchronization of various inputs during a temporal window of 100 to 300 ms. As alluded to earlier, particular predictions can be applied to the filtering ability of visual cortical cells in response to (normally subliminal) contextual input presented in the periphery at the same time as they are depolarized, for instance by the presentation of the sensory stimulus in a central part of the receptive field. Our intracellular subthreshold data can be compared to the recent observations of "pop-out" effects in visual cortical receptive fields where two types of protocols have been used: Knierim and Van Essen (1992) varied the orientation of the texture of the background while stimulating the MDF with a fixed orientation, and Sillito and colleagues used bipartite stimuli where the outer grating had an orientation orthogonal to that of the inner disk covering the center of the receptive field (Sillito et al., 1995). The latter group recorded from cells whose level of firing became selectively enhanced when the two orientations (central and peripheral) differed by 90° independently of their absolute values (configuration D in Figure 11). The fact that the level of firing for the bipartite stimulus became higher than that evoked by the presentation of the optimal inner grating alone (configuration A in Figure 11) suggests that the optimal feature encoded by some cortical cells is not the orientation per se shown in the MDF but the contrast in orientation between center and periphery.
A plausible regulatory mechanism derived from our experiments could be that the transmission of the peripheral input, which is by itself mainly excitatory (when remaining within the tuning subthreshold preference of the cell), benefits from the sudden concomitant presentation of the stimulus in the central part of the receptive field. This boosting effect would especially predominate when the inner stimulus is cross-oriented or distinct from the preferred orientation of the cell, i.e. when by itself it evokes only a subliminal depolarization, equivalent in our case to subthreshold somatic current injection (Figure 10). In contrast, stimulus co-alignment across the visual field when the central input has the same (preferred) orientation as that of the peripheral stimulus, might trigger (as we have observed) a nonlinear suppressive interaction, responsible for hypercomplexity, and mediated by intracortical inhibitory interneurons. The reduction in postsynaptic firing of hypercomplex cells would be expressed only when the total excitatory drive, provided by the sum of the feedforward and lateral inputs, exceeds the threshold needed to fire the inhibitory interneuron (Si, third column in Figure 11). The spiking threshold is assumed to require a higher level of convergence in inhibitory cells (Si) than in excitatory cells (Se). These two simple mechanisms (inhibitory interneuron, difference in spiking thresholds), which were in part already suggested by Sillito (1977), would result in a differential increase in the
response, thus allowing a cell to detect local orientation contrast between its preferred orientation and that shown in the peripheral field (Figure 11). Related models of a biphasic control of the cortical response gain as a function of the global drive (feedforward, recurrent, lateral) have been proposed (Somers et al., 1994; Stemmler et al., 1995). However, their predictions do not reproduce some of Sillito's findings, in particular where higher levels of discharge may be observed in response to high contrast cross-oriented bipartite stimuli as compared to that evoked by the optimal stimulation of the MDF alone. Figure 11 illustrates a hypothesis of a cortical circuit for the detection of orientation contrast. The orientation contrast selective cell is assumed here to be a second-order neuron. This cell receives a much stronger excitation from the feedforward input than from the lateral input, because of initial amplification of feedforward input by first-order cells. The lateral connectivity agrees with our observation of combined excitatory and inhibitory subthreshold input originating from the periphery of the MDF. All cortical cells are assumed to amplify their own response through recurrent excitatory and inhibitory local circuits of the type described by Douglas and colleagues (1995). As in previous models (Somers et al., 1994; Stemmler et al., 1995) the convergence of feedforward and lateral excitatory drive is required to fire a specific class of inhibitory interneurons. Their suprathreshold activation (which occurs mostly for stimulus C in Figure 11) will induce a down-regulation of the cortical cell's response to an optimally oriented grating covering the full extent of the visual field.
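The threshold logic of this circuit can be illustrated with a toy calculation. The sketch below is not the authors' simulation: the drive values, the inhibitory weight `W_INH` and the function names are hypothetical, and only the ordering Se < Si and the dominance of feedforward over lateral drive are taken from the text. Under these assumptions the interneuron stays silent for center-only (A) and surround-only (B) stimulation and fires only when both drives converge (C), which lowers the output of the excitatory cell below its response to the center alone.

```python
# Toy threshold model of the circuit in Figure 11 (arbitrary units).
# All numerical values are hypothetical; only S_E < S_I comes from the text.
S_E = 1.0    # firing threshold of the excitatory cell (Se)
S_I = 3.0    # firing threshold of the inhibitory interneuron (Si)
W_INH = 4.0  # assumed strength of inhibition onto the excitatory cell


def rectify(x):
    """Threshold-linear output: no firing below zero net drive."""
    return max(0.0, x)


def circuit_response(feedforward, lateral):
    """Return (excitatory_output, interneuron_output) for one stimulus."""
    drive = feedforward + lateral
    # The interneuron acts as an AND-gate: it fires only if the summed
    # feedforward and lateral drive exceeds the higher threshold S_I.
    inhibitory_out = rectify(drive - S_I)
    excitatory_out = rectify(drive - S_E - W_INH * inhibitory_out)
    return excitatory_out, inhibitory_out


# Optimal orientation shown in the center (A), the surround (B), or both (C).
configs = {
    "A (center only)": (2.5, 0.0),
    "B (surround only)": (0.0, 0.8),
    "C (iso-oriented center + surround)": (2.5, 0.8),
}
for name, (ff, lat) in configs.items():
    e_out, i_out = circuit_response(ff, lat)
    print(f"{name}: E = {e_out:.2f}, I = {i_out:.2f}")
```

With these particular numbers the surround alone remains subthreshold for the excitatory cell, in line with its description as an "unresponsive" region; other parameter choices would shift the quantitative outcome.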
In order to account for the facilitating effect of cross-oriented stimuli, we make the specific assumption that the AND-gate interneurons are activated below threshold for only one type of input (feedforward in A, and lateral in B), and below or near threshold for orthogonal orientations shown simultaneously in the center and the periphery of the RF (cross-oriented bipartite stimulus in D, see Figure 11). A possible way of achieving this requirement independently of the absolute orientations of the gratings is to further assume that, in contrast to most cortical cells, feedforward and lateral inputs provide a similar subthreshold orientation tuned contribution to the AND-gate interneuron. The sum of these two similar tuning curves shifted from each other by 90° (cross-oriented stimulation) may be considered as poorly oriented (with an appropriate scaling in their width of tuning) and kept below or near the threshold of activation of the interneuron. Consequently, as long as the threshold of the inhibitory interneuron is not exceeded, the recurrent circuitry would provide a higher level of response to the bipartite configuration in D than that evoked in the same cell by the preferred stimulus restricted to the MDF (configuration C), and this for three reasons:
- i) the AND-gate circuit is poorly activated by the cross-oriented stimuli,
- ii) the orientation contrast selective cell receives in addition to the feedforward input an excitatory lateral drive,
- iii) the temporal association of the inner / outer stimuli recruits the boosting of synaptic gain suggested by our own experiments.
Comparison between response levels evoked by configurations C and D still predicts a larger response for D than for C for nonoptimal stimuli, but quantitative simulations have to be made taking into account the dependency of the boosting mechanism itself on pre- and postsynaptic parameters. In addition, in the case of cross-oriented stimuli where an optimal orientation is shown in the MDF, it is conceivable that the addition of a lateral drive will overcome the Si threshold and trigger modest activity in the interneuron, thus providing a negative regulation of the peak response. This, in turn, would result in making the output of the cell less dependent on the absolute orientation of that shown in the central part of the RF. Although such models put a strong emphasis on an apparent dichotomy between horizontal (lateral) and vertical (feedforward) cortical connectivity, it remains to be ascertained to which degree such a clear cut separation is indeed present. A first series of arguments is based on neuroanatomical evidence: the pattern of terminals formed by axons of spiny stellate first-order cells and of layer II/III pyramidal cells shows a spatial spread
Figure 11. A cortical circuit for the detection of orientation contrast. Top, schematic view of the microcircuit. Excitatory cells (E-cells) are symbolized by empty triangles, and inhibitory cells (I-cells) by filled black circles. The recorded cell (top) is selective to orientation contrast. Bottom, input and output activities corresponding to the E- and I-cells in response to different configurations of stimulation centered on the aggregate RF of the cortical column. In A, only feedforward pathways (and consequently recurrent local intracortical loops) are activated by restricting the presentation of an oriented stimulus to the MDF of the recorded E-cell. In B, only lateral activation is provided to the same cell by presenting an oriented stimulus in the "unresponsive" periphery of its RF. In C, both central and surround stimuli of the same orientation are presented simultaneously and coaligned without spatial offset or intermediate gap. In D, central and surround stimuli still cover the whole visual field, but are cross-oriented, i.e. their respective orientations differ by 90°. The two top rows of diagrams correspond to the orientation tuning curves of inputs (synaptic conductance) to the recorded E-cell (upper row) and to the inhibitory interneuron (lower row). Both lateral and feedforward inputs are assumed to show the same orientation bias (OPT: optimal orientation). Se and Si stand for the postsynaptic firing threshold of the excitatory and inhibitory neurons, respectively, the former being reached for lower absolute global input levels than the latter. The two bottom row diagrams correspond to the orientation tuning curves of the outputs (mean spike frequency) of the same two cells. See text for details.
[Figure 11 diagram labels: RECURRENT; LATERAL INPUT; FEEDFORWARD INPUT; panel letters (B and D legible); tuning-curve axis ORIENTATION (-90°, OPT, +90°); stimulus conditions ISO-ORIENTED and CROSS-ORIENTED.]
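The tuning-curve argument made in the text and in the caption of Figure 11 can be checked numerically. The sketch below is only an illustration under stated assumptions: Gaussian tuning on a 180° periodic axis, equal peak and width for the feedforward and lateral contributions to the AND-gate interneuron, and hypothetical values for the width (`width_deg`) and for the threshold `S_I`. It compares the summed drive for iso-oriented and cross-oriented center/surround stimuli as the center orientation is varied.

```python
import math

S_I = 1.2  # hypothetical firing threshold of the interneuron (Si)


def tuning(theta_deg, preferred_deg, width_deg=40.0, peak=1.0):
    """Gaussian orientation tuning on a 180-degree periodic axis (assumed shape)."""
    d = (theta_deg - preferred_deg + 90.0) % 180.0 - 90.0
    return peak * math.exp(-0.5 * (d / width_deg) ** 2)


def interneuron_drive(center_deg, surround_deg, preferred_deg=0.0):
    """Summed subthreshold input: feedforward (center) plus lateral (surround)."""
    return tuning(center_deg, preferred_deg) + tuning(surround_deg, preferred_deg)


def modulation(values):
    """Depth of orientation tuning: 0 for a flat curve, 1 for full modulation."""
    return (max(values) - min(values)) / (max(values) + min(values))


orientations = range(-90, 90, 10)
# Iso-oriented: center and surround share the same orientation (configuration C).
iso = [interneuron_drive(t, t) for t in orientations]
# Cross-oriented: surround rotated by 90 degrees (configuration D).
cross = [interneuron_drive(t, t + 90) for t in orientations]

print("iso:   peak drive", round(max(iso), 2), "modulation", round(modulation(iso), 2))
print("cross: peak drive", round(max(cross), 2), "modulation", round(modulation(cross), 2))
print("cross drive stays below S_I:", max(cross) < S_I)
```

With this choice of width the cross-oriented sum is nearly flat across orientations and never reaches the assumed threshold, whereas the iso-oriented sum exceeds it around the optimal orientation. A much narrower tuning width would leave residual peaks in the cross-oriented sum, which is presumably why the text calls for an appropriate scaling of the width of tuning.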
contained within a hypercolumn, on the order of 500 µm (Douglas, et al., 1995). But the distribution is clearly bimodal with one set of boutons forming proximal intracolumnar contacts (within 300 µm) and a second one connecting more distant columns possibly analyzing the same part of the visual field. In contrast, the disinhibitory field appears more continuously distributed over larger distances on the order of one millimeter, at least in area 18 (Kisvarday, et al., 1993). This pattern of connectivity is further complicated by the dendritic coverage of supragranular pyramids, which extends over 200 µm and whose overlap increases diversity in sampling (Malach, 1994). Another type of evidence is gained from the electrophysiological comparison of the postsynaptic effects evoked respectively by intracolumnar (or radial) and lateral activation. A first issue concerns the EPSP/IPSP balance in the evoked response. Supragranular pyramids show systematic IPSPs in response to intracolumnar or radial feedforward stimulation when the cell is sufficiently depolarized, in vitro (Jones and Baughman, 1988; Sutor and Hablitz, 1989a) and in vivo (Douglas, et al., 1991). This effect is not as pronounced when the stimulation is lateral, either located in layer 1 (Cauller and Connors, 1994; but see Nakajima, et al., 1988) or in layer 2/3 (Hirsch and Gilbert, 1991): IPSPs are sometimes absent even when applying strong stimulation intensity, but when present, are of polysynaptic origin (Hirsch and Gilbert, 1991). It has also been observed that the intensity threshold required to evoke an action potential is higher for lateral stimulations than for radial ones. It is not yet clear if these conclusions also apply to layer 5 cells (Cauller and Connors, 1994; Nakajima, et al., 1988). A third issue concerns putative excitatory transmitters and receptors.
Both AMPA and NMDA types of receptors mediate responses to glutamate after radial stimulation (Jones and Baughman, 1988; Sutor and Hablitz, 1989b). However, lateral depolarizing responses seem to be largely dominated by AMPA receptors and almost devoid of NMDA-mediated responses (Hirsch and Gilbert, 1991; Murakoshi et al., 1993; Cauller and Connors, 1994; but see Nakajima, et al., 1988). Cauller and Connors have, however, observed APV-sensitive, long-latency EPSPs in response to lateral stimulation. A last issue concerns the enhancement of EPSPs through the activation of inward rectification. This phenomenon, which is probably due to a persistent sodium channel, has been observed both for radial EPSPs (Sutor and Hablitz, 1989b) and laterally evoked EPSPs (Hirsch and Gilbert, 1991). Indeed, the probable perisomatic location of the responsible sodium current (Stuart and Sakmann, 1995) would tend to make this type of enhancement rather unspecific with respect to the input.
In conclusion, in spite of an unclear status, a trend in computational neuroscience is to attribute distinct roles to lateral and feedforward input and to include non-linearities brought about by local reverberating activity into a Meta-neuron replacing the considered cell and its local environment. Apparently further work is needed to justify a dissection of associative properties in neuronal integration by compartmentalizing the incoming input into two or three distinct classes of afferent connections. Nevertheless, a new Receptive Field concept seems to emerge from an increasing wealth of data gathered at the cellular and neuroanatomical level, which suggest the existence of a critical frontier separating the Platonic RF center from its peripheral association shrine. Discontinuities in contrast, orientation, or collinearity of contours shown on each side of this still hypothetical boundary would trigger non-linear discrimination between what the "inner eye" of the cortical cell sees and the contextual prediction established by the rest of the network and transmitted by lateral connections.
Acknowledgments: We thank Ralph Freeman for providing stimulus generation algorithms, and Ad Aertsen for sharing his expertise on reverse correlation techniques. The electrophysiological in vivo work reviewed in this chapter was done in collaboration with Cyril Monier, Frederic Chavane, Larry Glaeser, Jean Lorenceau, Attila Baranyi, Dominique Debanne and Daniel Shulz. Research was funded by grants to Y.F. from the CNRS (ATIPE Cognisciences), HFSP (RG-69/93) and the Conseil de l'Essonne. V.B. was supported during his PhD by MRES, Fondation Fouassier and Fondation des Aveugles de France fellowships. Off-line data analysis was performed with a specialized home-made program Acquis1 (developed by Gerard Sadoc and commercialized by ANVAR CNRS, Dipsi) and further processed using MatLab Signal Processing Library. Thanks are due to Dr. Lyle Borg-Graham and Cyril Monier for helpful comments on the manuscript, and to Dr. Kirsty Grant for help with the English.
REFERENCES:
Abeles, M. (1982). "Local cortical circuits. An electrophysiological study." (New-York: Springer-Verlag), 101.
Abeles, M. (1991). "Corticonics: neuronal circuits of the cerebral cortex." (Cambridge: Cambridge University Press), 280.
Adrian, E.D. (1935). "The mechanism of nervous action." (Philadelphia: University of Pennsylvania Press).
Adrian, E.D. (1941). Afferent discharges to the cerebral cortex from peripheral sense organs. J. Physiol. (Lond.). 100, 159-191.
Aertsen, A.M.H.J., Gerstein, G.L., Habib, M.K. and Palm, G. (1989). Dynamics of neuronal firing correlation: modulation of "effective connectivity". J. Neurophysiol. 61, 900-917.
Agmon-Snir, H. and Segev, I. (1993). Signal delay and input synchronization in passive dendritic structures. J. Neurophysiol. 70, 2066-2085.
Albus, K. (1975). A quantitative study of the projection area of the central and the paracentral visual field in area 17 of the cat. I. The precision of the topography. Exp. Brain Res. 24, 181-202.
Anderson, J.C., Douglas, R.J., Martin, K.A.C. and Nelson, J.C. (1994). Map of the synapses formed with the dendrites of spiny stellate neurons of cat visual cortex. J. Comp. Neurol. 341, 25-38.
Ballard, D.H. (1986). Cortical connections and parallel processing: structure and function. Behav. Brain Sci. 9, 67-120.
Baranyi, A., Debanne, D., Shulz, D. and Frégnac, Y. (1991). Postsynaptic membrane potential regulates potentiation and depression of visually evoked synaptic potentials in kitten cortical neurons recorded in vivo. Soc. Neurosci. Abstr. 17, 1470.
Barlow, H.B. (1953). Summation and inhibition in the frog's retina. J. Physiol. (Lond.). 119, 69-88.
Barlow, H.B. (1972). Single units and sensation: a neurone doctrine for perceptual psychology? Perception. 1, 371-394.
Barlow, H.B., Blakemore, C. and Pettigrew, J.D. (1967). The neural mechanism of binocular depth discrimination. J. Physiol. (Lond.). 193, 327-342.
Barnes, C., Baranyi, A., Bindman, L., Dudai, Y., Frégnac, Y., Ito, Y., Knopfel, T., Lisberger, S.G., Moulins, M., Morris, R.G.M., Movshon, J.A., Singer, W. and Squire, L.R. (1994). Group report: relating activity-dependent modifications of neuronal function to changes in neural systems and behavior. In "Cellular and Molecular Mechanisms Underlying Higher Neural Functions", eds. A. Selverston and P. Ascher (J. Wiley and Sons), 81-110.
Berman, N.J., Douglas, R.J. and Martin, K.A.C. (1989). The conductances associated with inhibitory postsynaptic potentials are larger in visual cortical neurones in vitro than in similar neurones in intact, anaesthetized rats. J. Physiol. (London). 418, 107.
Berman, N.J., Douglas, R.J., Martin, K.A.C. and Whitteridge, D. (1991). Mechanisms of inhibition in cat visual cortex. J. Physiol. (Lond.). 440, 697-722.
Bernander, O., Koch, C. and Douglas, R.J. (1994). Amplification and linearization of distal synaptic input to cortical pyramidal cells. J. Neurophysiol. 72, 2743-2753.
Bianchi, A.L., Grelot, L., Iscoe, S. and Remmers, J.E. (1988). Electrophysiological properties of rostral medullary respiratory neurones in the cat: an intracellular study. J. Physiol. (London). 407, 293-310.
Bienenstock, E., Cooper, L.N. and Munro, P. (1982). Theory for the development of neuron selectivity: orientation specificity and binocular interaction in visual cortex. J. Neurosci. 2, 32-48.
Bienenstock, E. and Doursat, R. (1989). Elastic matching and pattern recognition in neural networks. In "Neural networks: from models to applications.", eds. L. Personnaz and G. Dreyfus (Paris: IDSET), 472-482.
Bishop, P.O., Coombs, J.S. and Henry, G.H. (1971). Response to visual contours: spatio-temporal aspects of excitation in the receptive fields of single striate neurons. J. Physiol. (Lond.). 219, 625-657.
Bishop, P.O., Coombs, J.S. and Henry, G.H. (1973). Receptive fields of simple cells in the cat striate cortex. J. Physiol. 221, 31-60.
Bishop, P.O. and Henry, G.H. (1972). Striate neurons: Receptive field concepts. Invest. Ophthalmol. 11, 346-354.
Blakemore, C. and Tobin, E.A. (1972). Lateral inhibition between orientation detectors in the cat's visual cortex. Exp. Brain Res. 15, 439-440.
Bolz, J. and Gilbert, C.D. (1986). Generation of end-inhibition in the visual cortex via interlaminar connections. Nature. 320, 362-365.
Borg-Graham, L. (1991). Modelling the non-linear conductances of excitable membranes. In "Cellular Neurobiology.", eds. J. Chad and H. Wheal (IRL Press at Oxford University Press), 247-275.
Borg-Graham, L. and Grzywacz, N.M. (1992). A model of the direction selectivity circuit in retina: transformations by neurons singly and in concert. In "Single Neuron Computation.", eds. T. McKenna, J. Davis and S.F. Zornetzer (Academic Press), 347-375.
Borg-Graham, L., Monier, C., Bringuier, V. and Frégnac, Y. (1995). Analysis of simple and complex synaptic input in cat area 17 with in-vivo whole-cell patch recordings. Soc. Neurosc. Abstr. 21, 17.1, p. 21.
Boven, K.H. and Aertsen, A. (1990). Dynamics of activity in neuronal networks give rise to fast modulations of functional connectivity. In "Parallel Processing in Neural Systems and Computers", eds. R. Eckmiller, G. Hartmann and G. Hauske (Elsevier Science Publishers), 53-56.
Bringuier, V. (1995). Oscillations et intégration neuronale dans le cortex visuel primaire. Doctorat d'Université, Université Paris VI, Sciences Cognitives.
Bringuier, V., Chavane, F., Monier, C., Glaeser, L., Frégnac, Y. and Lorenceau, J. (1995).
Role of voltage-dependent inactivating conductances in the control of the visual integration field profile in cat area 17. Soc. Neurosc. Abstr. 21, 1647.
Bringuier, V., Frégnac, Y., Baranyi, A., Debanne, D. and Shulz, D. (submitted). Synaptic origin and stimulus dependency of neuronal oscillatory activity in the primary visual cortex of the cat.
Bringuier, V., Frégnac, Y., Debanne, D., Shulz, D. and Baranyi, A. (1992). Synaptic origin of rhythmic visually evoked activity in kitten area 17 neurons. NeuroReport. 3, 1065-1068.
Casanova, C. (1993). Responses of cells in cat's area 17 to random dot patterns: influence of stimulus size. NeuroReport. 4, 1011-1014.
Casanova, C., Nordman, J.P. and Molotchnikoff, S. (1991). Le complexe noyau latéral postérieur. Pulvinar des mammifères et la fonction visuelle. J. Physiol. (Paris). 85, 44-57.
Cauller, L.J. and Connors, B.W. (1994). Synaptic physiology of horizontal afferents to layer I in slices of rat SI neocortex. J. Neurosci. 14, 751-762.
Christie, B.R., Eliot, L.C., Ito, K.-I., Miyakawa, H. and Johnston, D. (1995). Different Ca2+ channels in soma and dendrites of hippocampal pyramidal neurons mediate spike-induced Ca2+ influx. J. Neurophysiol. 73, 2553-2557.
Cleland, B.G., Dubin, M.W. and Levick, W.R. (1971). Sustained and transient neurones in the cat's retina and lateral geniculate nucleus. J. Physiol. (Lond.). 217, 473-496.
Cotterill, R.M.J. and Nielsen, C. (1991). A model for cortical 40 Hz oscillations invokes inter-area interactions. NeuroReport. 2, 289-292.
Creutzfeldt, O., Innocenti, G.M. and Brooks, D. (1974). Vertical organization in the visual cortex (area 17) in the cat. Exp. Brain Res. 21, 315-336.
Creutzfeldt, O. and Ito, M. (1968). Functional synaptic organization of primary visual cortex neurones in the cat. Exp. Brain Res. 4, 324-352.
Creutzfeldt, O.D., Kuhnt, V. and Benevento, L.A. (1974). An intracellular analysis of visual cortical neurons to moving stimuli: responses in a co-operative neuronal network. Exp. Brain Res. 21, 251-274.
Crick, F.H.C. (1984). Function of the thalamic reticular complex: the searchlight hypothesis. Proc. Natl. Acad. Sci. USA. 81, 4586-4590.
Cudeiro, J. and Sillito, A.M. (1996). Spatial frequency tuning of orientation-discontinuity-sensitive corticofugal feedback to the cat lateral geniculate nucleus. J. Physiol. (London). 490, 481-492.
Damasio, A.R. (1989). The brain binds entities and events by multiregional activation from convergence zones. Neural Comput. 1, 123-132.
De Angelis, G.C., Freeman, R.D. and Ohzawa, I. (1994). Length and width tuning of neurons in the cat's primary visual cortex. J. Neurophysiol. 71, 347-374.
De Angelis, G.C., Ohzawa, I. and Freeman, R.D. (1995). Receptive-field dynamics in the central visual pathways. T.I.N.S. 18, 451-458.
Debanne, D., Guerineau, N.C., Gahwiler, B.H. and Thompson, S.M. (1996). Paired-pulse facilitation and depression at unitary synapses in rat hippocampus: quantal fluctuation affects subsequent release. J. Physiol. (London). 490, 713-727.
Debanne, D., Shulz, D. and Frégnac, Y. (1995). Temporal constraints in associative synaptic plasticity in hippocampus and neocortex. Canad. J. Physiol. Pharmacol. 73, 1295-1311.
Debanne, D., Shulz, D. and Frégnac, Y. (submitted). Activity-dependent regulation of ON and OFF responses in cat visual cortical neurons.
Deisz, R.A., Fortin, G. and Zieglgansberger, W. (1991). Voltage dependence of excitatory postsynaptic potentials of rat neocortical neurons. J. Neurophysiol. 65, 371-382.
Deppisch, J., Bauer, H.U., Schillen, T., Konig, P., Pawelzik, K. and Geise, T. (1993). Alternating oscillatory and stochastic states in a network of spiking neurons. Network. 4, 243-257.
Diesmann, M., Gewaltig, M. and Aertsen, A. (1995). Characterization of synfire activity by propagating 'pulse packets'.
Caltech: 1-6.
Douglas, R.J., Koch, C., Mahowald, M.A., Martin, K.A.C. and Suarez, H.H. (1995). Recurrent excitation in neocortical circuits. Science. 269, 981-985.
Douglas, R.J., Martin, K.A.C. and Whitteridge, D. (1988). Selective responses of visual cortical cells do not depend on shunting inhibition. Nature. 332, 642-644.
Douglas, R.J., Martin, K.A.C. and Whitteridge, D. (1991). An intracellular analysis of the visual responses of neurones in cat visual cortex. J. Physiol. (Lond.). 440, 659-696.
Dreifuss, J.J., Kelly, J.S. and Krnjevic, K. (1969). Cortical inhibition and gamma-aminobutyric acid. Exp. Brain Res. 9, 137-154.
Eckhorn, R., Bauer, R., Jordan, W., Brosch, M., Kruse, W., Munk, M. and Reitbock, H.J. (1988). Coherent oscillations: A mechanism of feature linking in the visual cortex? Biol. Cyb. 60, 121-130.
Eggermont, J.J., Johannesma, P.I.M. and Aertsen, A.M.H.J. (1983). Reverse-correlation methods in auditory research. Quart. Rev. Biophys. 16, 341-414.
Engel, A.K., Konig, P., Kreiter, A.K., Schillen, T.B. and Singer, W. (1992). Temporal coding in the visual cortex: new vistas on integration in the nervous system. Trends Neurosci. 15, 218-226.
Ezure, K. (1990). Synaptic connections between medullary respiratory neurons and considerations on the genesis of respiratory rhythm. Prog. Neurobiol. 25, 429-450.
Ferster, D. (1986). Orientation selectivity of synaptic potentials in neurons of cat primary visual cortex. J. Neurosci. 6, 1284-1301.
Ferster, D. (1988). Spatially opponent excitation and inhibition in simple cells of the cat visual cortex. J. Neurosci. 8, 1172-1180.
Ferster, D. and Lindstrom, S. (1983). An intracellular analysis of geniculo-cortical connectivity in area 17 of the cat. J. Physiol. (London). 342, 181-215.
Fetz, E., Toyama, K. and Smith, W. (1991). Synaptic interaction between cortical neurons. In "Cerebral Cortex", eds. A. Peters and E. G. Jones (New-York: Plenum Press), 9, 147.
Fiorani, M.J., Rosa, M.G.P., Gattass, R. and Rocha-Miranda, C.E. (1992). Dynamic surrounds of receptive fields in primate striate cortex: a physiological basis for perceptual completion? Proc. Natl. Acad. Sci. USA. 89, 8547-8551.
Fischer, B. and Krüger, J. (1974). The shift-effect in the cat's lateral geniculate neurons. Exp. Brain Res. 21, 225-227.
Fischer, B., Krüger, J. and Droll, W. (1975). Quantitative aspects of shift effect in cat retinal ganglion cells. Brain Res. 83, 391-403.
Fortin, G. (1993). Le réseau neuronal du complexe solitaire: rôles respectifs des connexions synaptiques, des propriétés membranaires et du métabolisme intracellulaire. Thèse de Doctorat de l'Université Paris VI, Université Paris VI.
Frégnac, Y. (1995). Comparative and developmental aspects of hebbian synaptic plasticity. In "Handbook of Brain Theory and Neural networks", eds. M. Arbib (MIT Press), 459-464.
Frégnac, Y., Bringuier, V. and Baranyi, A. (1994a). Oscillatory neuronal activity in visual cortex: a critical re-evaluation. In "Temporal Coding in the Brain, Research and Perspectives in Neurosciences", eds. G. Buzsaki, R. Llinas, W. Singer, A. Berthoz and Y. Christen (Berlin: Springer-Verlag), 81-102.
Frégnac, Y., Burke, J., Smith, D. and Friedlander, M.J. (1994b). Temporal covariance of pre and postsynaptic activity regulates functional connectivity in the visual cortex. J. Neurophysiol. 71, 1403-1421.
Frégnac, Y. and Debanne, D. (1993). Potentiation and depression in visual cortical neurons: a functional approach to synaptic plasticity. In "Brain Mechanisms of Perception and Memory: from Neuron to Behavior", eds. T. Ono, L. Squire, M. E. Raichle, D. I. Perrett and M. Fukuda (Oxford University Press), 533-561.
Frégnac, Y., Debanne, D., Shulz, D. and Baranyi, A. (1994c). Does membrane potential regulate functional plasticity in kitten visual cortex? In "Long-term Potentiation", eds. M. Baudry and J. L. Davis (Cambridge: MIT Press), 2, 227-264.
Frégnac, Y. and Shulz, D. (1994).
Models of synaptic plasticity and cellular analogs of learning in the developing and adult vertebrate visual cortex. In "Advances in Neural and Behavioral Development", eds. V. Casagrande and P. Shinkman (New Jersey: Neural Ablex Publ), 4, 149-235.
Frégnac, Y., Shulz, D., Thorpe, S. and Bienenstock, E. (1988). A cellular analogue of visual cortical plasticity. Nature. 333, 367-370.
Frégnac, Y., Shulz, D., Thorpe, S. and Bienenstock, E. (1992). Cellular analogs of visual cortical epigenesis: I. Plasticity of orientation selectivity. J. Neurosci. 12, 1280-1300.
Gilbert, C.D. (1977). Laminar differences in receptive field properties of cells in cat primary visual cortex. J. Physiol. (London). 268, 391-421.
Gilbert, C.D. (1992). Horizontal integration and cortical dynamics. Neuron. 9, 1-13.
Glaeser, L., Bringuier, V., Frégnac, Y., Borg-Graham, L., Monier, C. and Fleury, G. (1995). Reverse correlation mapping of subthreshold synaptic potentials in cat primary visual cortex. Soc. Neurosc. Abstr. 21, 648.1, p. 1647.
Gray, C.M., Konig, P., Engel, A.K. and Singer, W. (1989). Oscillatory responses in cat visual cortex exhibit inter-columnar synchronization which reflects global stimulus properties. Nature. 338, 334-337.
Grieve, K.L. and Sillito, A.M. (1991). A re-appraisal of the role of layer VI of the visual cortex in the generation of cortical end inhibition. Exp. Brain Res. 87, 521-529.
Grinvald, A., Lieke, E.E., Frostig, R.D. and Hildesheim, R. (1994). Cortical point-spread function and long-range lateral interactions revealed by real-time optical imaging of macaque monkey primary visual cortex. J. Neurosci. 14, 2545-2568.
Hansel, D. and Sompolinsky, H. (1992). Synchronization and computation in a chaotic neural network. Phys. Rev. Lett. 68, 718-721.
Hartline, H.K. (1938). The responses of single optic nerve fibers of the vertebrate eye to illumination of the retina. Am. J. Physiol. 121, 400-415.
Hartline, H.K. (1940). The receptive fields of the optic nerve fibers. Amer. J.
Physiol. 130, 690-699.
Hebb, D.O. (1949). The organization of behavior. (New-York: J. Wiley and Sons), 337 p.
Henry, G.H., Goodwin, A.W. and Bishop, P.O. (1978). Spatial summation of responses in receptive fields of single cells in cat striate cortex. Exp. Brain Res. 32, 245-266.
Hirsch, J.A., Alonso, J.M. and Reid, R.C. (1995). Visually evoked calcium action potentials in cat striate cortex. Nature. 378, 612-616.
Hirsch, J.A. and Gilbert, C.D. (1991). Synaptic physiology of horizontal connections in the cat's visual cortex. J. Neurosci. 11, 1800-1809.
Hoffmann, K.P. and Stone, J. (1971). Conduction velocity of afferents to cat visual cortex: a correlation with cortical receptive field properties. Brain Res. 32, 460-466.
Hubel, D.H. (1988). Eye, brain and vision. (Scientific American Library).
Hubel, D.H. and Wiesel, T.N. (1962). Receptive fields, binocular interaction and functional architecture in the cat's visual cortex. J. Physiol. (London). 160, 106-154.
Hubel, D.H. and Wiesel, T.N. (1965). Receptive field and functional architecture in two nonstriate visual areas (18 and 19) of the cat. J. Neurophysiol. 28, 229-289.
Hughes, C.M. and Peters, A. (1992). Symmetric synapses formed by callosal afferents in rat visual cortex. Brain Res. 583, 271-278.
Ikeda, H. and Wright, M.J. (1972). Functional organization of the periphery effect in retinal ganglion cells. Vis. Res. 12, 1857-1879.
Innocenti, G.M. and Fiore, L. (1974). Post-synaptic inhibitory components of the responses to moving stimuli in area 17. Brain Res. 80, 122-126.
Jagadeesh, B., Gray, C.M. and Ferster, D. (1992). Visually evoked oscillations of membrane potential in cells of cat visual cortex. Science. 257, 552-554.
Jagadeesh, B., Wheat, H.S. and Ferster, D. (1993). Linearity of summation of synaptic potentials underlying direction selectivity in simple cells of the cat visual cortex. Science. 262, 1901-1904.
Jones, H.E. and Sillito, A.M. (1994). Directional asymmetries in the length-response profiles of cells in the feline dorsal lateral geniculate nucleus. J. Physiol. (London). 479, 475-486.
Jones, J.P. and Palmer, L.A. (1987). The two-dimensional spatial structure of simple receptive fields in cat striate cortex. J. Neurophysiol. 58, 1187-1211.
Jones, K.A. and Baughman, R.W. (1988). NMDA- and non-NMDA-receptor components of excitatory synaptic potentials recorded from cells in layer V of rat visual cortex. J. Neurosci. 8, 3522-3534.
Kapadia, M.K., Ito, M., Gilbert, C.D. and Westheimer, G. (1995). Improvement in visual sensitivity by changes in local context: Parallel studies in human observers and in V1 of alert monkeys. Neuron. 15, 843-856.
Kisvarday, Z.F., Beaulieu, C. and Eysel, U.T. (1993). Network of GABAergic large basket cells in cat visual cortex (area 18): implication for lateral disinhibition. J. Comp. Neurol. 327, 398-415.
Knierim, J.J. and Van Essen, D.C. (1992). Neuronal responses to static texture patterns in area V1 of the alert macaque monkey. J. Neurophysiol. 67, 961-980.
Koechlin, E. and Burnod, Y. (1995). Dual population coding in the neocortex: a model of interaction between representation and attention in the visual cortex. J. Cogn. Neurosc. in press.
Komatsu, Y., Nakajima, S. and Toyama, K. (1991). Induction of long-term potentiation without participation of N-methyl-D-aspartate receptors in kitten visual cortex. J. Neurophysiol. 65, 20-32.
Krüger, J. and Aiple, F. (1988). Multimicroelectrode investigation of monkey striate cortex: spike train correlations in the infragranular layers. J. Neurophysiol. 60, 798-828.
Kuffler, S.W. (1953). Discharge patterns and functional organization of the mammalian retina. J. Neurophysiol. 16, 37-68.
Li, C.Y. and Li, W. (1994). Extensive integration field beyond the classical receptive field of cat's striate cortical neurons: Classification and tuning properties. Vision Res. 34, 2337-2355.
Linsker, R. (1986a). From basic network principles to neural architecture: emergence of spatial opponent cells. Proc. Natl. Acad. Sci. USA. 83, 7508-7512.
Linsker, R. (1986b). From basic network principles to neural architecture: emergence of orientation selective cells. Proc. Natl. Acad. Sci. USA. 82, 8390-8394.
Linsker, R. (1986c). From basic network principles to neural architecture: emergence of orientation columns. Proc. Natl. Acad. Sci. USA. 83, 8779-8783.
Llinas, R. (1988). The intrinsic electrophysiological properties of mammalian neurons: insights into central nervous system function. Science. 242, 1654-1664.
Llinas, R.R., Grace, A.A. and Yarom, Y. (1991). In vitro neurons in mammalian cortical layer 4 exhibit intrinsic oscillatory activity in the 10- to 50-Hz frequency range. Proc. Natl. Acad. Sci. USA. 88, 897-901.
Mac Kay, D.J.C. (1995). Bayesian methods for supervised neural networks. In "The Handbook of Brain Theory and Neural Networks", ed. M. A. Arbib (Cambridge: The MIT Press), 144-149.
Maffei, L. and Fiorentini, A. (1976). The unresponsive regions of visual cortical receptive fields. Vision Res. 16, 1131-1139.
Magee, J.C. and Johnston, D. (1995a). Characterization of single voltage-gated Na+ and Ca2+ channels in apical dendrites of rat CA1 pyramidal neurons. J. Physiol. (London). 487, 67-90.
Magee, J.C. and Johnston, D. (1995b). Synaptic activation of voltage-gated channels in the dendrites of hippocampal pyramidal neurons. Science. 268, 301-304.
Magleby, K.L. (1987). Short-term changes in synaptic efficacy. In "Synaptic Function", eds. G. M. Edelman, W. E. Gall and W. M. Cowan (New-York: J. Wiley), 21-56.
Malach, R. (1994). Cortical columns as devices for maximizing neuronal diversity. Trends Neurosc. 12, 101-104.
Malach, R., Tootell, R.B.H. and Malonek, D. (1994). Relationship between orientation domains, cytochrome oxidase stripes, and intrinsic horizontal connections in squirrel monkey area V2. Cereb. Cortex. 4, 151-165.
Markram, H., Helm, P.J. and Sakmann, B. (1995). Dendritic calcium transients evoked by single back-propagating action potentials in rat neocortical pyramidal neurons. J. Physiol. 485, 1-20.
Marrocco, R.T., Mc Clurkin, J.W. and Young, R.A. (1982). Modulation of lateral geniculate nucleus cell responsiveness by visual activation of the corticogeniculate pathway. J. Neurosci. 2, 256-263.
Martin, K.A.C. and Whitteridge, D. (1984). The relationship of receptive field properties to the dendritic shape of neurones in the cat striate cortex. J. Physiol. (London). 356, 291-302.
Marty, A. and Llano, I. (1995). Modulation of inhibitory synapses in the mammalian brain. Curr. Op. Biol. 5, 335-341.
Mason, A., Nicoll, A. and Stratford, K. (1991). Synaptic transmission between individual pyramidal neurons of rat visual cortex in vitro. J. Neurosci. 11, 72-84.
Mc Clurkin, J.W., Optican, L.M. and Richmond, B.J. (1994). Cortical feedback increases visual information transmitted by monkey parvocellular lateral geniculate neurons. Vis. Neurosc. 11, 601-607.
Mc Cormick, D.A., Connors, B.W., Lighthall, J.W. and Prince, D.A. (1985). Comparative electrophysiology of pyramidal and sparsely spiny stellate neurons of the neocortex. J. Neurophysiol. 54, 782-806.
Mc Cormick, D.A., Gray, C. and Wang, Z. (1993). Chattering cells: a new physiological subtype which may contribute to 20-60 Hz oscillations in cat visual cortex. Soc. Neurosci. Abstr. 12, 869.
Mc Donald, C.T. and Burkhalter, A. (1993). Organization of long-range inhibitory connections within rat visual cortex. J. Neurosc. 13, 768-781.
Mc Ilwain, J.T. (1964). Receptive field of optic tract axons and lateral geniculate cells: peripheral extent and barbiturate sensitivity. J. Neurophysiol. 22, 1154-1173.
Merzenich, M.M. and Sameshima, K. (1993). Cortical plasticity and memory. Curr. Opin. Neurobiol. 2, 187-196.
Miller, K.D. (1990). Correlation based models of neural development. In "Neuroscience and Connectionist", eds. M. A. Gluck and D. E. Rumelhart (Hillsdale: L. Erlbaum), 267-353.
Miller, K.D., Chapman, B. and Stryker, M.P. (1989). Visual responses in adult cat visual cortex depend on N-methyl-D-aspartate receptors. Proc. Natl. Acad. Sci. USA. 86, 5183-5187.
Milner, P.M. (1974). A model for visual shape recognition. Psychol. Rev. 81, 521-535.
Molotchnikoff, S. and Cerat, A. (1992). Responses from outside classical receptive fields of dorsal lateral geniculate cells in rabbits. Exp. Br. Res. 92, 94-104.
Mountcastle, V.B. (1957). Modality and topographic properties of single neurons of cat's somatic sensory cortex. J. Neurophysiol. 20, 408-434.
Movshon, A.J., Adelson, E.H., Gizzi, M.S. and Newsome, W.T. (1986). The analysis of moving visual patterns. Pont. Acad. Sci. Scrip. Var. 54, 117-152.
Movshon, J.A., Thompson, J.D. and Tolhurst, D.J. (1978). Spatial summation in the receptive fields of simple cells in the cat's striate cortex. J. Physiol. (Lond.). 283, 53-77.
Murakoshi, T., Quo, J.-Z. and Ichonose, T. (1993). Electrophysiological identification of horizontal synaptic connections in rat visual cortex in vitro. Neurosc. Lett. 163, 211-214.
Murphy, P.C. and Sillito, A.M. (1987). Corticofugal feedback influences the generation of length tuning in the visual pathway. Nature. 329, 727-729.
Nakajima, S., Komatsu, Y. and Toyama, K. (1988). Synaptic action of layer I fibers on cells in cat striate cortex. Brain Res. 457, 176-180.
Nelson, D.A. and Katz, L.C. (1995). Emergence of functional circuits in ferret visual cortex visualized by optical imaging. Neuron. 15, 23-34.
Nelson, J.I. and Frost, B.J. (1978). Orientation-selective inhibition from beyond the classic visual receptive field. Brain Res. 139, 359-365.
Nelson, J.I. and Frost, B.J. (1985). Intracortical facilitation among co-oriented, co-axially aligned single cells in cat striate cortex. Exp. Brain Res. 61, 54-61.
Nelson, J.I., Salin, P.A., Munk, M.H.J., Arzi, M. and Bullier, J. (1992). Spatial and temporal coherence in cortico-cortical connections: a cross-correlation study in areas 17 and 18 in the cat. Vis. Neurosci. 9, 21-37.
Nelson, S., Toth, L., Sheth, B. and Sur, M. (1994). Orientation selectivity of cortical neurons during intracellular blockade of inhibition. Science. 265, 774-777.
Nelson, S.B. (1991). Temporal interactions in the cat visual system: I. Orientation-selective suppression in the visual cortex. J. Neuroscience. 11, 344-356.
Nicoll, A. and Blakemore, C. (1993). Single fibre EPSPs in layer 5 of rat visual cortex in vitro. NeuroReport. 4, 167-170.
Nowak, G. (1995). Etude électrophysiologique des aspects temporels du traitement de l'information dans le néocortex visuel. Doctorat de l'Université, University Claude Bernard - Lyon I.
Orban, G.A. (1984). Neuronal operations in the visual cortex. (Berlin Heidelberg: Springer-Verlag), 365 p.
Orban, G.A., Gulyas, B. and Vogels, R. (1987). Influence of a moving textured background on direction selectivity of cat striate neurons. J. Neurophysiol. 52, 1792-1812.
Orban, G.A., Kato, H. and Bishop, P.O. (1979). End-zone region in receptive fields of hypercomplex and other striate neurons in the cat. J. Physiol. (London). 42, 818-832.
Palmer, L.A. and Davis, T.A. (1981). Receptive field structure in cat striate cortex. J. Neurophysiol. 46, 260-295.
Pei, X., Vidyasagar, T.R., Volgushev, M. and Creutzfeldt, O.D. (1994). Receptive field analysis and orientation selectivity of postsynaptic potentials of simple cells in cat visual cortex. J. Neurosci. 14, 7130-7140.
Pei, X., Volgushev, M., Vidyasagar, T.R. and Creutzfeldt, O.D. (1991). Whole-cell recording and conductance measurements in cat visual cortex in vivo. NeuroReport. 2, 485-488.
Rauschecker, J.P., Grunau, M.W. and von Poulin, C. (1987). Centrifugal organization of direction preferences in the cat's lateral suprasylvian cortex and its relation to flow processing. J. Neurosc. 7, 943-958.
Rizzolati, G. and Camarda, R. (1977). Influence of the presentation of remote visual stimuli on visual responses of cat area 17 and lateral suprasylvian area. Exp. Brain Res. 29, 107-122.
Rosenquist, A.C. (1985). Connections of visual cortical areas in the cat. In "Cerebral Cortex", eds. A. Peters and E. G. Jones (New York: Plenum), 3, 81-117.
Salin, P.A. and Bullier, J. (1995). Corticocortical connections in the visual system: structure and function. Physiol. Rev. 75, 107-154.
Schillen, T.B. and König, P. (1991). Stimulus-dependent assembly formation of oscillatory responses: desynchronization. Neur. Comput. 3, 167-178.
Schuster, H.G. and Wagner, P. (1990). A model for neuronal oscillations in the visual cortex. 1. Mean-field theory and derivation of the phase equations. Biol. Cyb. 64, 77-82.
Schwarz, C. and Bolz, J. (1991). Functional specificity of a long-range horizontal connection in cat visual cortex: a cross-correlation study. J. Neurosci. 11, 2995-3007.
Schweitzer, P., Fortin, G., Beloeil, J.C. and Champagnat, J. (1992). In vitro study of newborn rat brain maturation. Neurochem. Interntl. 20, 109-112.
Schwindt, P.C. and Crill, W.E. (1995). Amplification of synaptic current by persistent sodium conductance in apical dendrite of neocortical neurons. J. Neurophysiol. 74, 2220-2224.
Sejnowski, T.J. (1977a). Statistical constraints on synaptic plasticity. J. Theor. Biol. 69, 387-389.
Sejnowski, T.J. (1977b). Storing covariance with non-linearly interacting neurons. J. Math. Biol. 4, 303-321.
Sejnowski, T.J. and Tesauro, G. (1989). The Hebb rule for synaptic plasticity: algorithms and implementations. In "Neural models of plasticity", eds. J. H. Byrne and W. O. Berry (Academic Press), 94-103.
Shulz, D., Bringuier, V. and Frégnac, Y. (1993b). Complex-like structure of simple visual cortical receptive fields is masked by GABA-A intracortical inhibition. Soc. Neurosci. Abstr. 19, 638.
Shulz, D. and Frégnac, Y. (1992). Cellular analogs of visual cortical epigenesis: II. Plasticity of binocular integration. J. Neurosci. 12, 1301-1318.
Sillito, A.M. (1975). The contribution of inhibitory mechanisms to the receptive field properties of neurones in striate cortex of the cat. J. Physiol. (London). 250, 305-329.
Sillito, A.M. (1977). The spatial extent of excitatory and inhibitory zones in the receptive field of superficial layer hypercomplex cells. J. Physiol. (London). 273, 791-803.
Sillito, A.M., Grieve, K.L., Jones, H.E., Cudeiro, J. and Davis, J. (1995). Visual cortical mechanisms detecting focal orientation discontinuities. Nature. 378, 492-496.
Sillito, A.M., Jones, H.E., Gerstein, G.L. and West, D.C. (1994). Feature-linked synchronization of thalamic relay cell firing induced by feedback from the visual cortex. Nature. 369, 479-482.
Sillito, A.M. and Versiani, V. (1977). The contribution of excitatory and inhibitory inputs to the length preference of hypercomplex cells in layers II and III of the cat's striate cortex. J. Physiol. (Lond.). 271, 775-790.
Silva, L.R., Chagnac-Amitai, Y. and Connors, B.W. (1991). Intrinsic oscillations of neocortex generated by layer V pyramidal neurons. Science. 251, 432-435.
Singer, W. (1990). Search for coherence: a basic principle of cortical self-organization. Concepts in Neurosci. 1, 1-26.
Singer, W. (1993). Synchronization of cortical activity and its putative role in information processing and learning. Annu. Rev. Physiol. 55, 349-374.
Singer, W. and Gray, C.M. (1995). Visual feature integration and the temporal correlation hypothesis. Annu. Rev. Neurosci. 18, 555-586.
Somers, D., Nelson, S.B. and Sur, M. (1994). Effects of long-range connections on gain control in an emergent model of visual cortical orientation selectivity. Soc. Neurosc. Abstr. 20, 1577 (646.7).
Somogyi, P., Kisvarday, Z.F., Martin, K.A.C. and Whitteridge, D. (1983). Synaptic connections of morphologically identified and physiologically characterized large basket cells in the striate cortex of cat. Neurosci. 10, 261-294.
Stafstrom, C.E., Schwindt, P.C. and Crill, W.E. (1984). Repetitive firing in layer V neurons from cat neocortex in vitro. J. Neurophysiol. 52, 264-277.
Stemmler, M., Usher, M. and Niebur, E. (1995). Lateral interactions in primary visual cortex: a model bridging physiology and psychophysics. Science. 269, 1877-1880.
Stent, G. (1973). A physiological mechanism for Hebb's postulate of learning. Proc. Natl. Acad. Sci. USA. 70, 997-1001.
Stern, P., Edwards, F.A. and Sakmann, B. (1992). Fast and slow components of unitary EPSCs on stellate cells elicited by focal stimulation in slices of rat visual cortex. J. Physiol. (London). 449, 247-278.
Stryker, M.P. (1989). Cortical physiology: is grandmother an oscillation? (News and Views). Nature. 338, 297-298.
Stuart, G.J. and Sakmann, B. (1995). Amplification of EPSPs by axosomatic sodium channels in neocortical pyramidal neurons. Neuron. 15, 1065-1076.
Sutor, B. and Hablitz, J.J. (1989a). EPSP's in rat neocortical neurons in vitro. I. Electrophysiological evidence for two distinct EPSP's. J. Neurophysiol. 61, 607-620.
Sutor, B. and Hablitz, J.J. (1989b). EPSP's in rat neocortical neurons in vitro. II. Involvement of N-methyl-D-aspartate receptors in the generation of EPSPs. J. Neurophysiol. 61, 621-634.
Tanifuji, M., Yamanaka, A., Sunaba, R. and Toyama, K. (1993). Propagation of excitation in the visual cortex studied by the optical recording. Jpn. J. Physiol. 43, S57-S59.
Thomson, A.M. (1986). A magnesium-sensitive post-synaptic potential in the rat cerebral cortex resembles neuronal response to N-methylaspartate. J. Physiol. (London). 370, 531-549.
Thomson, A.M. and Deuchars, J. (1994). Temporal and spatial properties of local circuits in neocortex. Trends Neurosci. 17, 119-126.
Thomson, A.M., Deuchars, J. and West, D.C. (1993a). Large, deep layer pyramid-pyramid single axon EPSPs in slices of rat motor cortex display paired pulse and frequency-dependent depression, mediated presynaptically and self-facilitation, mediated postsynaptically. J. Neurophysiol. 70, 2354-2369.
Thomson, A.M., Deuchars, J. and West, D.C. (1993b). Single axon excitatory postsynaptic potentials in neocortical interneurons exhibit pronounced paired pulse facilitation. Neuroscience. 54, 347-360.
Thomson, A.M. and Radpour, S. (1991). Excitatory connections between CA1 pyramidal cells revealed by spike triggered averaging in slices of rat hippocampus are partially NMDA receptor mediated. Eur. J. Neurosci. 3, 587-601.
Toyama, K. (1988). Functional connections of the visual cortex studied by cross-correlation techniques. In "Neurobiology of Neocortex", eds. P. Rakic and W. Singer (New-York: J. Wiley and Sons), 203-217.
Toyama, K., Fujii, K. and Umetani, K. (1990). Functional differentiation between the anterior and posterior Clare-Bishop cortex of the cat. Exp. Brain Res. 81, 221-233.
Toyama, K., Matsunami, K., Ohno, T. and Takashiki, S. (1974). An intracellular study of neuronal organization in the visual cortex. Exp. Brain Res. 21, 45-66.
Traub, R.D., Whittington, M.A., Colling, S.B., Buszaki, G. and Jefferys, J.G.R. (in press). Analysis of gamma rhythms in the rat hippocampus in vitro and in vivo. J. Physiol. (London).
Treisman, A. (1977). Focused attention in the perception and retrieval of multidimensional stimuli. Percept. and Psychophys. 22, 1-11.
Treisman, A.M. and Gelade, G. (1980). A feature-integration theory of attention. Cogn. Psychol. 12, 97-136.
Ts'o, D.Y., Gilbert, C.D. and Wiesel, T.N. (1986). Relationships between horizontal interactions and functional architecture in cat striate cortex as revealed by cross-correlation analysis. J. Neurosci. 6, 1160-1170.
Volgushev, M., Pei, X., Vidyasagar, T.R. and Creutzfeldt, O.D. (1993). Excitation and inhibition in orientation selectivity of cat visual cortex neurons revealed by whole-cell recordings in vivo. Vis. Neurosci. 10, 1151-1155.
Von der Malsburg, C. (1981). The correlation theory of brain function. Internal report, Max-Planck Institute for Biophysical Chemistry Goettingen RFA.
Von der Malsburg, C. and Bienenstock, E. (1986). Statistical coding and short-term synaptic plasticity: a scheme for knowledge representation in the brain. In "Disordered Systems and Biological Organization.", eds. E. Bienenstock, F. Fogelman-Soulié and G. Weisbuch. (Berlin: Springer-Verlag), 247-272.
Von der Malsburg, C. and Schneider, W. (1986). A neural cocktail-party processor. Biol. Cyb. 54, 29-40.
Von der Malsburg, C. and Singer, W. (1988). Principles of cortical network organization. In "Neurobiology of Neocortex", eds. P. Rakic and W. Singer (New-York: J. Wiley and Sons), 69-99.
Wang, X.J. (1993). Ionic basis for intrinsic 40 Hz neuronal oscillations. NeuroReport. 5, 221-224.
Wassle, H. and Boycott, B.B. (1991). Functional architecture of the mammalian retina. Physiological Reviews. 71, 447-480.
Whitehead, S.D. and Ballard, D.H. (1990). Active perception and reinforcement learning. Neural Comp. 2, 409-419.
Wilcox, K.S. and Dichter, M.A. (1994). Paired pulse depression in cultured hippocampal neurons is due to a presynaptic mechanism independent of GABA-B autoreceptor activation. J. Neurosci. 14, 1175-1788.
Wilson, H.R. and Cowan, J.D. (1972). Excitatory and inhibitory interactions in localized populations of model neurons. Biophys J. 12, 1-24.
Whittington, M.A., Traub, R.D. and Jefferys, J.G.R. (1995). Synchronized oscillations in interneuron networks driven by metabotropic glutamate receptor activation. Nature. 373, 612-615.
Yang, G. and Masland, R.H. (1992). Direct visualization of the dendritic and receptive fields of directionally selective retinal ganglion cells. Science. 258, 1949-1992.
Yuste, R. and Denk, W. (1995). Dendritic spines as basic functional units of neuronal integration. Nature. 225, 682-684.
Yuste, R., Gutnick, M.J., Saar, D., Delaney, K.R. and Tank, D.W. (1994). Ca2+ accumulations in dendrites of neocortical pyramidal neurons: an apical band and evidence for two functional compartments. Neuron. 13, 23-43.
Zipser, K. (1995). Measuring the delay of the onset of extra-receptive field modulation in V1 neurons. Soc. Neurosci. Abstr. 21, 1751 (689-2).
Zucker, R.S. (1989). Short-term synaptic plasticity. Annu. Rev. Neurosci. 12, 13-31.
Brain Theory - Biological Basis and Computational Principles A. Aertsen and V. Braitenberg (Editors) © 1996 Elsevier Science B.V. All rights reserved.
On the Role of Neural Synchrony in the Primate Visual Cortex
Andreas K. Kreiter and Wolf Singer
Max-Planck-Institut für Hirnforschung, Deutschordenstraße 46, D-60528 Frankfurt, Germany
Neuronal representation of visual stimuli - The role of single neurons

The question of how sensory signals are processed and represented is still unresolved. Traditional approaches assume that information is contained mainly in the response rate of individual neurons. Elevated firing of feature-selective neurons is thought to signal the presence of particular stimulus configurations. There is indeed ample evidence from all sensory systems that stimulus features selectively modulate the response rate of single cells. Well-documented examples are found in the primate visual system. In the retina, the lateral geniculate nucleus, and primary visual cortex the spike rate of individual cells reflects simple stimulus properties like spatial location, spatial extent, orientation, spectral composition, the direction of motion, binocular disparity and several others (Hubel and Wiesel, 1959; Hubel and Wiesel, 1962; Zeki, 1975; Orban, 1984; Desimone et al. 1985; Desimone, 1991; Henry, 1985; Maunsell and Newsome, 1987; Livingstone and Hubel, 1988). Within extrastriate areas of the macaque visual cortex, rate-coded responses also occur to more complex stimuli such as optical flow fields, non-cartesian gratings, illusory contours, specific arrangements of simple geometric forms and characteristic features of faces and body parts (Gross et al. 1972; Bruce et al. 1981; Perrett et al. 1985; Saito et al. 1986; Desimone, 1991; Sakai and Miyashita, 1991; Rolls, 1992; Gallant et al. 1993; Miyashita, 1993; Tanaka, 1993).

Further support for the functional significance of rate codes comes from the evidence that the sensitivity of single cortical neurons for certain stimulus properties reaches the psychophysical threshold. This suggests a close relation between the activity level of single neurons and perception. Thus, in the monkey the grating acuity and contrast sensitivity of the most sensitive neurons in V1 can reach the threshold found for human observers (Parker and Hawken, 1985; Hawken and Parker, 1990). Other studies concluded that contrast sensitivity (Tolhurst et al. 1983) and orientation selectivity in V1 come close to psychophysically observed thresholds but do not reach them (Vogels and Orban, 1990). Within the motion sensitive medio-temporal area (MT, V5) good
correspondence was found between single unit activity evoked by dynamic random dot stimuli and the simultaneously estimated psychophysical performance of monkeys engaged in a motion discrimination task (Newsome et al. 1989; Britten et al. 1992). Furthermore, it was shown that electrical stimulation of local groups of neurons in MT can influence the decision of monkeys judging the direction of motion in random dot displays (Salzman et al. 1990; Salzman et al. 1992; Murasugi et al. 1993). These and many other results clearly indicate that the response rate of single neurons contains information which contributes to the representation of perceptual objects.

However, theoretical considerations and experimental findings suggest that in general the full description of a visual object cannot be provided by the activity of a single neuron. Object representation by single units would require, at a final processing stage, a dedicated neuron for every distinguishable pattern (Fig. 1, left column). Because the number of possible configurations is exceedingly large, the limited reservoir of cortical neurons seems to rule out this strategy as the only way to represent patterns. The combinatorial problem is somewhat reduced in the model suggested by Horace Barlow in 1972, in which patterns are thought to be represented by generalizing "cardinal" cells at the top of the processing hierarchy. These cells are assumed to represent objects as complex as an apple or a mouse, objects that can be described with a single word (Fig. 1, middle column). Their responses are thought to be context independent and invariant to changes in size, position and minor modifications of appearance of the respective objects. For the response of such generalizing cells it would, for example, not matter whether an apple is green or red, with or without brown patches, as long as it can be perceived as an apple. According to this model, the complete perceptual situation is represented exclusively by the small set of vigorously activated cardinal cells. Such generalization reduces the combinatorial problem to some extent.

However, the cardinal cell model cannot easily cope with the requirement to express relations between different attributes of the represented objects. This is because the cardinal cells represent, by definition, abstractions which lack details. If a cardinal cell responds, for example, to a mouse, the response does not signal whether the mouse is patchy or uniformly coloured, since a "mouse" cardinal cell would have to respond to any mouse, even if it is partially hidden, pinkish and striped. Therefore, the mouse would be recognized but it would lack any specific property. The existence of a 'patchy'-cell would not help either, since this attribute could also be related to the apple just eaten by the mouse. Of course a 'patchy mouse'-cell would resolve the problem, but this leads back to the combinatorial explosion mentioned above. Thus, the main advantage of cardinal cells, their ability to generalize, turns out to be their major problem and prevents effective object representation by such individual neurons.

Stimulus representation by single, highly specific object detector cells (or gnostic units) is not only problematic from a theoretical point of view but has also
[Figure 1: schematic with three columns labelled Pontifical Cells, Cardinal Cells and Neural Assembly]
Fig. 1. Three different models for stimulus representation in the cerebral cortex. Pontifical cells (left column) are at the top level of a hierarchical feed forward network which extracts at each level progressively more complex stimulus constellations from the sensory input. At the top level a single activated neuron out of a huge set of pontifical cells represents the complete scene. Cardinal cells (middle column) are also built from a feed forward network. They are not specific for an entire scene but detect certain objects. A scene is represented by the small group of activated cardinal cells which code for individual objects. In both models the lower level neurons in the feed forward network do not participate in the representation of perceived objects. Neural assemblies (right column) are formed in feedback networks of neurons which behave like coarse filters rather than precise detectors. An object is represented by a population of cooperatively interacting neurons.
not received much experimental support. Neurons selective for the most complex stimuli are found in the inferior temporal cortex and the anterior superior temporal sulcus. They respond to hands, faces or behaviourally relevant objects (Gross et al. 1972; Bruce et al. 1981; Perrett et al. 1985; Saito et al. 1986; Desimone, 1991; Sakai and Miyashita, 1991; Rolls, 1992; Gallant et al. 1993; Miyashita, 1993). However, the results obtained by Tanaka and colleagues (Tanaka, 1993) show that most of these cells are tuned to certain geometrical arrangements of features that are indeed more complex than bars or gratings but much simpler than what is needed to distinguish unambiguously between different objects that share certain figural aspects. Therefore, these cells cannot be regarded as gnostic units. Even face cells are not selective for individual faces, nor do they respond indiscriminately to every face, as would be required of a general face detector. Rather, they are tuned to certain properties of faces, like geometrical relations between features of faces (Yamane et al. 1988), from which the identity, the direction of gaze (Perrett et al. 1985), or the emotional expression of a face (Hasselmo et al. 1986; Hasselmo et al. 1989) can be inferred. (For review see Desimone (1991), Young and Yamane (1992), Rolls (1992)). Thus, even the cells with the most complex stimulus requirements described so far in the visual cortex do not have the properties required for stimulus representations based on single units.

Problems with single cell codes do not only occur at the level where complex objects have to be represented. Even quite simple stimulus properties such as orientation or direction of motion are difficult to extract from the responses of single cells tuned to these properties. The reason is that the responses of neurons in the visual cortex are broadly tuned and influenced by features belonging to different dimensions.
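The ambiguity created by broad tuning across several stimulus dimensions can be made concrete with a toy calculation. The following sketch is purely illustrative and is not taken from the studies cited here: it assumes a hypothetical model neuron whose firing rate is a Gaussian function of orientation, scaled linearly by contrast, with invented parameters (`preferred_deg`, `width_deg`, `max_rate`). Under these assumptions two different stimuli drive the same neuron equally, whereas the joint response of two neurons with different preferred orientations distinguishes them.

```python
import math


def rate(orientation_deg, contrast, preferred_deg=90.0,
         width_deg=30.0, max_rate=50.0):
    """Firing rate (spikes/s) of a hypothetical broadly tuned neuron.

    Assumed model: Gaussian orientation tuning multiplied by contrast.
    All parameter values are illustrative, not measured.
    """
    # Orientation is circular with a period of 180 degrees,
    # so use the smallest angular distance to the preferred value.
    d = abs(orientation_deg - preferred_deg) % 180.0
    d = min(d, 180.0 - d)
    return max_rate * contrast * math.exp(-0.5 * (d / width_deg) ** 2)


# Stimulus A: preferred orientation at moderate contrast.
ori_a, con_a = 90.0, 0.5
# Stimulus B: 30 degrees off the preferred orientation, at a contrast
# chosen so that the loss in tuning is exactly compensated.
ori_b, con_b = 120.0, 0.5 * math.exp(0.5)

# A single neuron (preferred orientation 90 deg) cannot tell A from B.
r_a = rate(ori_a, con_a)
r_b = rate(ori_b, con_b)

# Two neurons with different preferences (90 and 120 deg) can:
# the pattern of rates across the pair differs between A and B.
pair_a = (rate(ori_a, con_a, 90.0), rate(ori_a, con_a, 120.0))
pair_b = (rate(ori_b, con_b, 90.0), rate(ori_b, con_b, 120.0))

print("single neuron:", round(r_a, 3), round(r_b, 3))
print("pair for A:", [round(x, 3) for x in pair_a])
print("pair for B:", [round(x, 3) for x in pair_b])
```

The design choice here is deliberately minimal: one confounding dimension (contrast) is enough to show that a single rate value leaves the stimulus underdetermined.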
Different stimuli can elicit responses of equal strength in the same neuron. The cells are smoothly tuned to a region in the high dimensional stimulus space, and their responses exhibit no unequivocal relation between firing rate and stimulus properties (Gerstein and Gochin, 1992). Therefore single neurons cannot signal reliably the presence of a particular feature. Taken together, theoretical considerations and experimental results suggest that representations of stimulus properties cannot be achieved solely by single cell codes. How can this conclusion be reconciled with the close relation between single unit activity and psychophysical performance? The experiments mentioned above have in common that they did not require to exploit the frill coding capacity of the system. First, comparisons between single neurons and psychophysical performance were based on tasks requiring only decisions on very simple stimulus properties for which specifically tuned neurons exist. This circumvents the problem that complex constellations cannot be encoded selectively by single units. Second, the psychophysical measurements were done with detection or discrimination tasks tailored to one specific aspect of the stimuli. This eliminates potential difficulties with the complete representation of stimuli by single units
because the task can be solved without a full description of the stimulus. Third, in some of the studies the stimuli which had to be distinguished were very different, e.g. motion to the left versus motion to the right, or grating present versus absent. The subjects' task was not to differentiate between fine differences within the same stimulus dimension, as for example between two similar directions of motion. It was sufficient simply to detect the stimuli, and therefore the difficulty of deriving parametric judgements from the responses of broadly tuned single units was excluded. Fourth, the stimuli were presented in very reduced displays containing only the basic feature which had to be judged. Under such conditions there is no problem in identifying cells which deliver the relevant information, since most of the activated neurons are directly related to the behaviourally relevant aspect of the stimulus. Fifth, the stimuli were well known to the subjects due to prior instruction or overtraining. Learning could thus have facilitated the identification of the neurons with the most discriminative responses. This reduces the problem of ambiguities in the response rates of single neurons. Thus, tasks based on restricted and well known sets of simple, qualitatively different stimulus conditions do not necessarily require full descriptions of the stimuli. It is sufficient to know the neurons which respond differentially in the few stimulus conditions to be discriminated, and this ability can in all likelihood be acquired by learning. Even under such well defined conditions it is unlikely that a decision is based on a single cortical neuron. The information content of single cell responses as assessed by signal detection techniques which assume an optimal observer is rather impressive.
Natural decision-making structures have certainly a much more limited computational precision and are in addition affected by noise from other sources than the neurons tuned optimally to the respective stimulus. The finding that psychophysically determined motion detection thresholds were often above the threshold of single MT neurons (Britten et al. 1992) supports the assumption that the decision-making mechanisms of psychophysical observers have only suboptimal access to individual, highly discriminative neurons. Since the behavioural performance reveals a comparable amount of information as the average single unit activity, the suboptimal access to this information must be compensated by pooling responses from multiple neurons. This increases the signal to noise ratio even though the possible improvements are limited by correlations between the response amplitudes of different neurons (Gawne and Richmond, 1993; Zohary et al. 1994; Kreiter and Singer, 1995b). Simulations confirmed that the motion discrimination performance of monkeys can be explained by the opposing effects of pooling on the one hand, and response rate correlation and noise from suboptimally driven neurons on the other (Britten et al. 1992). Thus, even under conditions in which task design and informational content of single unit activity would permit decisions based on the responses of a single cell, the limitations of neural information processing suggest that groups of neurons are required to extract the necessary information.
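The trade-off between pooling and response correlation described above can be made concrete with a minimal sketch. It assumes, purely for illustration, that all pooled neurons carry the same signal, have the same noise variance, and share one uniform pairwise noise correlation; the function name and the correlation value of 0.1 are hypothetical and not taken from the cited studies.

```python
import math

def pooled_snr_gain(n_neurons: int, noise_corr: float) -> float:
    """Signal-to-noise ratio of the mean of n neurons relative to a single neuron.

    Idealised assumptions (not a fit to data): identical signal and noise
    variance in every neuron, uniform pairwise noise correlation noise_corr.
    Variance of the mean is sigma^2 * (1 + (n - 1) * r) / n.
    """
    if n_neurons < 1:
        raise ValueError("n_neurons must be at least 1")
    if not 0.0 <= noise_corr <= 1.0:
        raise ValueError("noise_corr must lie between 0 and 1")
    return math.sqrt(n_neurons / (1.0 + (n_neurons - 1) * noise_corr))

# Compare independent noise (r = 0) with weakly correlated noise (r = 0.1).
for n in (1, 10, 100, 1000):
    print(n, round(pooled_snr_gain(n, 0.0), 2), round(pooled_snr_gain(n, 0.1), 2))
```

Under these assumptions the gain grows as the square root of the pool size for independent noise, but with correlated noise it saturates near 1/sqrt(r), which is one way to read the statement that correlations limit the benefit of pooling.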
Distributed processing and population coding

The perceptually and behaviourally restricted and predetermined conditions in psychophysical experiments represent an extreme case in a rich spectrum of perceptual situations to which the visual system is usually exposed. More often it has to cope with complex visual scenes containing many, partially overlapping objects in front of a varying background. The specific constellation, identity and behavioural relevance of objects contained in the scene are typically not known in advance, and usually there is a huge set of different possibilities. Under such conditions the ambiguous message of an individual neuron cannot be disambiguated by precise expectations and a restricted set of possible stimulus constellations. Rather, the meaning of an individual neuronal response has to be inferred from relations with the responses of other neurons, especially of those activated by the same stimulus feature or perceptual object. Unambiguous information about stimuli is only contained in the pattern of responses of multiple neurons. Therefore, it is commonly held that population coding is an indispensable principle for cortical representation of sensory information (Hebb, 1949; Braitenberg, 1978; Edelman and Mountcastle, 1978; Grossberg, 1980; Abeles, 1982; Abeles, 1991; Hopfield, 1982; Aertsen et al. 1986; Gerstein et al. 1989; Georgopoulos, 1990; Palm, 1990; Singer, 1990a; Singer, 1990b; Zipser and Andersen, 1988; Rolls, 1992; Young and Yamane, 1992). Theoretical considerations suggest many advantages of population over single unit codes. Because of the combinatorial nature of population codes their representational capacity is much higher than that of the sum of the involved single cells. The loss of specific information in the generalizing responses of cardinal units is prevented by the flexible association of many different cells responding to different aspects of the same stimulus.
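The combinatorial argument can be illustrated with a small counting sketch. It simplifies the situation considerably by treating units as binary (active or silent) and by fixing the number of active units per assembly; both are assumptions made here for illustration, whereas the text speaks of graded activity. The function names are hypothetical.

```python
from math import comb

def single_unit_capacity(n_units: int) -> int:
    """One dedicated detector per item: capacity equals the number of units."""
    return n_units

def population_capacity(n_units: int, n_active: int) -> int:
    """Number of distinct assemblies with exactly n_active of n_units binary units active."""
    return comb(n_units, n_active)

# 100 units: 100 items with dedicated detectors versus all 5-unit assemblies.
print(single_unit_capacity(100), population_capacity(100, 5))  # 100 75287520
```

Not all of these patterns would be usable in practice, since overlapping assemblies interfere with each other, but the count shows why the capacity of a population code is not bounded by the number of cells.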
A further advantage of population codes is the flexibility with which new representations can be generated. If representations consisted of individual detector neurons the formation of a new representation would require that a formerly unused neuron is newly connected to the already existing circuitry. This is much more demanding than inducing gradual changes of synaptic weights in a network of distributed but cooperating neurons. Also other associative capabilities such as generalization, pattern completion and fault tolerance are advantageous properties associated typically with distributed representations (Palm, 1982; Rumelhart and McClelland, 1986; Hopfield and Tank, 1991).
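The claim that unambiguous information is contained only in the pattern of responses across several neurons can be sketched with a toy model. The tuning function below (Gaussian orientation tuning scaled by contrast, ignoring the circular nature of orientation) and all parameter values are hypothetical choices for illustration, not a model proposed in the text.

```python
import math

def rate(orientation_deg, contrast, preferred_deg, sigma_deg=30.0, r_max=100.0):
    """Hypothetical broadly tuned neuron: Gaussian orientation tuning scaled by contrast."""
    d = orientation_deg - preferred_deg
    return r_max * contrast * math.exp(-d * d / (2.0 * sigma_deg ** 2))

# Orientation offset at which the Gaussian tuning curve falls to half height.
half_height_offset = 30.0 * math.sqrt(2.0 * math.log(2.0))  # about 35.3 deg

stim_1 = (0.0, 0.5)                 # optimal orientation, half contrast
stim_2 = (half_height_offset, 1.0)  # off-optimal orientation, full contrast

# Neuron A (preferred 0 deg) gives the same rate for both stimuli ...
print(rate(*stim_1, preferred_deg=0.0), rate(*stim_2, preferred_deg=0.0))
# ... whereas neuron B (preferred 45 deg) responds differently to them.
print(rate(*stim_1, preferred_deg=45.0), rate(*stim_2, preferred_deg=45.0))
```

In this sketch neither neuron alone specifies orientation or contrast, but the pair of rates taken together separates the two stimuli.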
The problem of concurrently activated representations

Distributed or population codes represent information about a certain stimulus or content by the pattern of graded activity in many different units. If only one stimulus is represented at a time, then the distributed information can be interpreted unambiguously since all responses are related to the same content. If, however, several distinct stimuli are present simultaneously, their representations may become ambiguous since there is no explicit information which identifies the neurons activated by the same stimulus (Fig. 2). Because of the high degree of divergence and convergence in the cortical circuitry each neuron will in almost all complex stimulus configurations receive synaptic inputs evoked by different stimuli in an unpredictable and unresolvable mixture. Thus, the requirement for joint processing of signals related to a particular stimulus could become compromised by inclusion of responses evoked by other stimuli. In the visual system this problem arises if several objects are present within the same scene at the same time, and in particular when they are spatially contiguous or overlapping. In this case the patterns of activated cells evoked by different stimuli will fuse into an undifferentiated compound pattern which needs to be decomposed into components related to the various individual objects.
Fig. 2. Different parts of a visual scene (A) will activate different populations of neurons (B) which represent their properties. If these populations reside in overlapping cortical regions there is no possibility to distinguish responses related to different populations (C). Thus, the population coded stimulus descriptions cannot be accessed if several populations are active simultaneously.
Therefore, mechanisms are required to identify responses evoked by the same stimulus and to distinguish them from those related to other stimuli. In addition, effective interactions between neurons have to be restricted to those processing related contents to prevent unintended disturbances by unrelated signals. Retinotopic mapping has been considered as a solution to this problem because it permits the segregation of responses evoked from different locations. In the same way ordered maps for stimulus features other than location can contribute to the association of similar and the separation of different features. This segmentation mechanism may be sufficient if stimuli are widely separated in space and the efferent projections of the activated populations have no common target. Then processing within the respective populations is largely independent and false conjunctions are excluded. In general, however, such anatomical mapping is not a sufficient mechanism to segregate responses related to different stimuli and to link those related to the same content. The broad tuning of the neurons in stimulus space and the scatter within maps cause populations related to different stimuli to overlap in the map if the respective properties of the stimuli are similar. In the case of retinotopic maps, for example, the spatial extent of the receptive fields and their scatter between different cells at the same cortical site (Van Essen et al. 1984) predict that the same cells will often be activated by different but adjacent or overlapping stimuli. In addition, size and scatter of receptive fields rise quickly in the ascending visual pathway until the receptive fields cover most of the visual field and retinotopy is almost absent. Therefore, it becomes progressively more difficult to relate active cells to certain parts of the scene on the basis of retinotopic mapping. Similar problems arise in the maps for other features.
Furthermore, the location of neurons in a feature map and their connections are fixed and cannot be changed at the relevant time scale. The set of neurons related to a certain stimulus is, however, continually changing when stimulus configurations change. Two neurons which are related to the same stimulus may become activated by different stimuli in another scene. Such changing associations of different cells can of course not be expressed by the invariant spatial positions of the neurons in a feature map. While mapping is certainly useful to attenuate binding problems by concentrating signals which are possibly but not necessarily related because of their proximity along certain coordinates of stimulus space, it cannot be used to represent the actual relations between responses. This clearly requires a dynamic mechanism that binds responses in changing constellations as stimulus configurations change. A more flexible mechanism is suggested by models of distributed processing which assume cooperative interactions in neuronal assemblies and competition between them (Fig. 1, right column). The relation of neurons to an activated assembly is expressed by their enhanced firing level. The stronger synaptic couplings between neurons of the same assembly mediate this cooperative enhancement, while the activity of other neurons which do not belong to the active assembly is reduced by an inhibitory mechanism which limits the global excitation of the network. The responses of neurons representing a single stimulus can then be differentiated because their enhanced activity separates them from responses arising in neurons which are not part of the activated assembly. However, if two assemblies get activated simultaneously their neurons are again indistinguishable because their enhanced activities permit segregation only from neurons not participating in any active assembly but not from each other. Theoretically it might be possible to distinguish different assemblies by different activation levels. A disadvantage of this strategy is that it destroys the information contained in the response amplitudes because these would have to be the same for all members of a particular activated assembly.
Temporal synchronization as a possible solution

It has been suggested on the basis of theoretical considerations that relations between the responses of different neurons could be expressed by temporal patterning rather than by activity levels (Milner, 1974; Grossberg, 1980; Abeles, 1982; Abeles, 1991; von der Malsburg, 1985; von der Malsburg, 1986; von der Malsburg and Schneider, 1986; von der Malsburg and Singer, 1988; Gerstein et al. 1989; Singer, 1990b; Singer, 1993; Singer and Gray, 1995). The hypothesis discussed here assumes that responses which need to be bound together for further joint processing become organized in time by synchronization. Neurons engaging in episodes of synchronous discharge would thereby signal their participation in the representation of the same stimulus. Consequently, several distinguishable populations can coexist at the same time if synchronization between the neurons related to different populations is avoided (Fig. 3). This can be achieved most easily if synchronous firing is defined in the millisecond range, since this requires no more than adjusting the precise time of discharge in relation to the discharge pattern of other neurons and makes rate changes unnecessary. The main effect of this mechanism is that synchronous EPSPs elicited by neurons of the same population tend to summate more effectively in target cells (Abeles, 1991). Thus, the saliency of the distributed signals related to the same stimulus would be enhanced and their joint processing favoured. Due to the lack of synchronization between the discharges of cells belonging to different populations the processing of different contents becomes separated in time, and this reduces the chance of false conjunctions and unwanted signal exchange between unrelated populations.

Fig. 3. Labelling of populations by synchronous discharge. The superposition of different populations can be resolved if neurons related to the same population discharge synchronously and avoid synchronization with other populations. Different populations may therefore be active simultaneously without confounding their stimulus descriptions.
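The idea of labelling populations by synchronous discharge can be sketched in a few lines. The spike times, the 2 ms precision and the simple chaining rule used for grouping are all hypothetical choices made for illustration; the text does not specify a readout algorithm.

```python
# Hypothetical spike times (ms) of six neurons within one short time window.
# Neurons 0-2 belong to one population, neurons 3-5 to another; each fires once,
# so firing rate alone does not separate the two populations.
spike_times_ms = {0: 3.0, 1: 3.4, 2: 2.8, 3: 14.9, 4: 15.2, 5: 15.5}

def active_set(spikes):
    """Rate-only readout: which neurons fired in the window, ignoring timing."""
    return set(spikes)

def group_by_synchrony(spikes, precision_ms=2.0):
    """Chain neurons into one group while successive spikes lie within precision_ms."""
    groups = []
    last_time = None
    for neuron, t in sorted(spikes.items(), key=lambda item: item[1]):
        if last_time is None or t - last_time > precision_ms:
            groups.append(set())  # gap larger than the precision: start a new group
        groups[-1].add(neuron)
        last_time = t
    return groups

print(active_set(spike_times_ms))          # one undifferentiated compound pattern
print(group_by_synchrony(spike_times_ms))  # two groups, recovered from timing alone
```

The rate-only readout returns all six neurons as a single set, whereas the timing-based grouping separates them into the two populations without any change in firing rate.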
Experimental evidence for temporal synchrony in primate visual cortex

If neuronal assemblies are distinguished by the temporal coherence of activity evoked in their constituent neurons, then episodes of synchronized discharges should be observable in response to coherent stimuli. Even though only a few studies have investigated cross-correlations between different neurons in the primate visual system, there is clear evidence that different neurons can synchronize their discharges with a precision in the range of a few milliseconds. In the striate cortex of anaesthetized monkeys synchronization was found preferentially but not exclusively between cells with similar selectivity for ocular dominance, orientation, color and spatial location (Krüger and Aiple, 1988; Ts'o and Gilbert, 1988; Livingstone, 1991). Under certain circumstances larger groups of neurons also become organized into a synchronous pattern of activity. This is suggested by the observation of oscillatory local field potentials (Fig. 4) in V1 of awake fixating monkeys (Kreiter, 1992; Eckhorn et al. 1993). Correlations have also been found across area boundaries between areas V1 and V2 (Bullier et al. 1992; Munk et al. 1993; Frien et al. 1994). Correlation analysis has also been performed in visual areas beyond V1 and V2, such as the inferotemporal cortex (IT) and the middle temporal area (MT) of the superior temporal sulcus (STS) of awake macaque monkeys. Within area MT strong synchronization between adjacent neurons has been observed. Different neurons recorded simultaneously with the same electrode were found to synchronize their discharges with a precision of a few milliseconds over short epochs of 100 to 300 ms duration (Fig. 5). These grouped discharges are separated by silent intervals with almost no activity and repeat every 15 to 35 ms. Neither the grouped discharges nor the episodes in which they occur are time locked to the stimulus (Kreiter and Singer, 1992).
(Fig. 4 graphic: panels A to D; axes Time [s], Frequency [Hz], Time [ms].)
Fig. 4. Local field potentials in V1 of an awake fixating macaque monkey. The stimulus was a small striped (3 cycles/°) square moving over the RF (1°/s). (A) Time course of the power between 40 and 80 Hz contained in the field potential. Note the parallel time course of the PSTH (C), which indicates that the LFP must be generated by neurons with RFs extending no further in space than those of the simultaneously recorded cells. (B) Power spectrum and (D) auto-correlogram of the LFP in the stimulated epoch (continuous line), marked by the vertical lines in (A), and a nonstimulated epoch (dashed line).
(Fig. 5 graphic: panels A and B; axis Time [ms].)
Fig. 5. Two examples of locally synchronized activity evoked with a moving bar in area MT of an awake fixating macaque monkey. Spikes of different cells recorded with the same electrode tend to occur in short clusters separated by silent intervals. The auto-correlograms below reflect these clusters by the broad central peak and the initial trough. Additional side peaks as in (A) indicate that the synchronous discharges occurred at regular intervals. The dashed line indicates the trigger level.
Neuronal synchronization within area MT is not confined to directly adjacent neurons but has also been observed between spatially separate recording sites (Figs. 6, 7, 8), whereby the direction of motion preferred by the neurons did not have to be similar. Cases with more than 90° difference between the preferred directions of motion could show substantial synchronization if they were activated with a single moving bar stimulus. The temporal precision of coincident firing was in the range of 3 to 5 ms, as indicated by the width of the peaks in the cross-correlograms, which almost always straddled the origin. Within inferotemporal cortex Gochin et al. (1991) observed synchronization between directly adjacent neurons recorded with the same electrode. The width of the peaks varied between 1 and more than 400 ms and straddled the origin in about 60% of the cases. Taken together, neurons in various areas of the primate visual cortex are able to synchronize their discharges. The temporal precision of synchronization is often in the millisecond range and thus sufficient for the purpose of dynamic grouping of neurons into distinguishable populations.
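For readers unfamiliar with the measure, a cross-correlogram can be sketched as a histogram of time differences between the spikes of two trains. The following is a minimal illustration, not the analysis procedure of the cited studies: the function name, the 1 ms bins, the default window of 63 ms and the synthetic spike trains are assumptions made here.

```python
def cross_correlogram(spikes_a_ms, spikes_b_ms, max_lag_ms=63, bin_ms=1):
    """Count spike pairs by time difference t_b - t_a, in bins centred on multiples of bin_ms."""
    n_side = int(max_lag_ms // bin_ms)
    lags = [k * bin_ms for k in range(-n_side, n_side + 1)]
    counts = [0] * len(lags)
    for t_a in spikes_a_ms:
        for t_b in spikes_b_ms:
            k = round((t_b - t_a) / bin_ms)  # index of the nearest bin centre
            if -n_side <= k <= n_side:
                counts[k + n_side] += 1
    return lags, counts

# Synthetic example: train A fires every 25 ms, train B follows each spike by 1 ms.
train_a = [10 + 25 * i for i in range(40)]
train_b = [t + 1 for t in train_a]

lags, counts = cross_correlogram(train_a, train_b)
peak_lag = lags[counts.index(max(counts))]
print(peak_lag, max(counts))  # 1 40
```

In this synthetic case the central peak lies at a lag of 1 ms, and the regular 25 ms firing produces smaller side peaks at lags of about plus and minus 25 and 50 ms, analogous to the side peaks that indicate discharges at regular intervals.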
Dependence of synchrony on stimulus configuration

One of the important features of population coding is that a particular neuron can participate in the representation of different contents by joining different assemblies at different times. If synchronization is used to bind neurons into assemblies, this predicts that neurons must be able to change the partners with which they synchronize when changing stimulus configurations require them to join different assemblies. This prediction has been tested in area MT of awake fixating monkeys with simple bar stimuli (Kreiter and Singer, 1995a). The assumption is that neurons activated by a single moving bar stimulus contribute to the neuronal representation of this stimulus and synchronize their responses. If, however, the same group of neurons is activated by two independently moving bars, neurons should regroup themselves into two different assemblies according to their respective preferences for the two bars, and cells belonging to different assemblies should no longer exhibit synchronized discharges despite being simultaneously active. A condition where neurons at two recording sites either join the same assembly or split into two different assemblies can be created by recording from sites which have overlapping receptive fields but different preferences for the direction of stimulus motion (e.g. Fig. 6A). Strong synchronization between the responses of neurons at both sites was observed if they were evoked with a single bar moving in a direction intermediate between the two preferred directions of motion (Fig. 6C). In contrast, simultaneous stimulation with two different bars, moving in different directions close to the respective preferred directions of the two recording sites (dual bar condition, Fig. 6B), resulted in a disruption or strong attenuation of synchronization (Fig. 6D). This effect was robust since it occurred in all tested pairs which exhibited correlations.
The average reduction of synchronization in the dual bar condition as compared to the single bar condition was by a factor of six (n=19, Fig. 6E). The change of synchronization strength could not be explained by changes in activation levels. Despite the better fit of the stimuli to the preferred directions of motion in the dual bar condition, the average spike rate in the single bar condition was only 15% less and was in several cases even higher (Fig. 6F). So far the result is in agreement with the predictions made by the correlation hypothesis and confirms similar experiments in anaesthetized cats (Gray et al. 1989; Engel et al. 1991a; Engel et al. 1991b). It is conceivable that the breakdown of synchrony in the dual bar condition is unrelated to assembly coding and simply a consequence of the interference caused by the second bar. This possibility was ruled out by an experiment in which two moving bars were crossing each other as in the dual bar condition, but in such a way that the neurons at both recording sites were activated only by one of the two stimuli (crossing bar condition; Fig. 7B). In this situation the neurons at both recording sites are expected to synchronize with each other despite the second bar since they are all activated by the same stimulus and should be part of the same assembly. As predicted, comparison of the correlations observed for the crossing
(Fig. 6 graphic: panels A to F; axes Time [ms] for the cross-correlograms and PSTHs, Dual Bar Condition NC [%] (n=19), Dual Bar Condition Spike Frequency [Hz] (n=38).)
Fig. 6. Dependence of synchronization on single and dual bar configuration: (A,B) plots of the receptive fields and stimulus configurations. The dot marked 'F' corresponds to the fixation point. Arrows within the RF plots indicate the preferred direction of motion for the neurons at the respective recording sites and arrows at stimulus bars the direction of motion. (A) depicts the single and (B) the dual bar configuration. Cross-correlogram and PSTHs obtained for the single bar condition are shown in (C) and for the dual bar condition in (D). The thin vertical lines in the PSTHs mark the window over which the cross-correlograms were computed. The scale bars correspond to 40 spikes/s. Note the pronounced synchronization in the single bar condition and the absence of synchronization in the dual bar condition. Scatter plots of normalized correlation values (NC, defined as the peak amplitude above offset divided by the offset) and firing rates obtained for the single bar condition (ordinate) against those obtained for the dual bar configuration (abscissa) in 19 cases are shown in (E) and (F), respectively. The dashed line indicates the region of equal values for both conditions. In all cases synchronization is considerably stronger for the single bar condition while response rates are similar.
(Fig. 7 graphic: panels A to F; axes Time [ms] for the cross-correlograms and PSTHs, Single Bar Condition NC [%] (n=11), Single Bar Condition Spike Frequency [Hz] (n=22).)
Fig. 7. Comparison of synchronization in the single and crossing bar configuration: (A,B) plots of RFs and stimulus configurations. (A) shows the single and (B) the crossing bar condition. Cross-correlogram and PSTHs obtained for the single bar condition are shown in (C) and for the crossing bar condition in (D). The scale bars for the PSTHs correspond to 40 spikes/s. In this case the normalized correlation (NC) was 56.5% for the single bar configuration, 58.8% for the crossing bar configuration and 4.4% for the dual bar configuration (data not shown). Scatter plots of normalized correlation (NC) and firing rates obtained for the crossing bar configuration (ordinate) against those obtained for the single bar condition (abscissa) in 11 cases are shown in (E) and (F), respectively. Note that the additional bar in the crossing bar configuration causes no major reduction of synchronization as compared to the reduction found for the dual bar configuration. Conventions as in Fig. 6.
and the single bar condition revealed no significant differences (Fig. 7). The second stimulus did not disturb the synchronization that occurred when only the single bar was presented. Desynchronization in the dual bar condition can therefore not be attributed to unspecific effects such as the mere number of simultaneously present stimuli. Stimulus configurations associated with synchronization have in common that the bar used in both the single and the crossing bar configuration activates the neurons at both recording sites. It might thus be that synchronization is critically dependent on the presence of a bar whose orientation and direction of motion are just intermediate between the preferences of the neurons at both recording sites. The synchronization hypothesis predicts that synchronization should not depend on the precise parameters of the stimulus but should occur as long as the neurons respond to the same stimulus and contribute to its representation. This prediction was tested with single bars moving in different directions over the receptive fields (Fig. 8). Synchronization remained similar despite considerable changes of the direction of motion as long as both recording sites remained activated. Even if one of the two bars of the dual bar condition was presented alone, synchronization was often as strong as in the original single bar condition. Thus, the orientation and direction of motion of an individual bar are not critical for the synchronization of the spike trains at both sites, as long as both sites are sufficiently activated by the same bar. Furthermore, the reduction of synchronization in the dual bar condition cannot be attributed to the particular parameters of the two bars since each of them, when presented alone, can induce synchronization.
The conclusion from these control experiments is therefore that the critical variable determining synchronization is whether the responses at the two recording sites are evoked by a common stimulus or by two different stimuli. Further support for this conclusion is provided by the quantitative analysis of synchronization in the dual bar condition. In some of the cases correlation was only attenuated but not abolished completely (Fig. 6E). Such a residual correlation is expected if at least one of the simultaneously present stimuli causes residual activation of neurons at the respective non-optimal site. In this case the neurons at both sites contribute in a graded way to the representation of the same stimulus and should therefore exhibit some residual correlations. The correlation hypothesis therefore predicts a relation between the strength of the residual correlations and the extent to which each of the two bars also coactivates neurons at the respective non-optimal site. To describe this relation quantitatively, coactivation was estimated as the ratio of the activity evoked at a site by the non-optimal bar of the dual bar stimulus to the activity evoked by the dual bar stimulus at the same site. These coactivation values from both sites were subsequently averaged and compared with the correlation strength in the dual bar condition, the latter being expressed as a fraction of the correlation strength measured in the single bar condition. The analysis revealed that the residual correlations in the dual bar condition are positively correlated
(Fig. 8 graphic: panels A to F; axes Time [ms] for the cross-correlograms and PSTHs, Single Bar Condition NC [%] for the scatter plot.)
Fig. 8. Dependence of synchronization on different directions of motion: Stimulus configurations are indicated in the plots above the respective cross-correlograms and PSTHs. Other conventions as in Fig. 6. The single bar configuration (A, repeated in E) resulted in a strong correlation (NC = 73.9% in A and 52.1% in E) which essentially disappeared in the dual bar configuration (B, NC = 3.8%). Changing orientation and direction of motion of the single bar by 15° resulted only in a minor change of correlation (C, NC = 60.6%). In (D) the orientation and direction of motion of the single bar are changed further so that it equals the right bar of the dual bar configuration shown in (B) and stimulates site 1 only poorly. Note that the NC value remains similar to that in the original single bar condition (NC = 56.8%). The scale bars for the PSTHs correspond to 40 spikes/s. The scatter plot (F) of NC values obtained for presentation of one of the bars of the dual bar configuration (ordinate) versus those obtained for the single bar configuration (n=15) indicates that only in a few cases the single bar condition resulted in a stronger correlation.
Fig. 9. Relation between the residual correlation found in part of the cases for the dual bar condition and the extent of coactivation of the two sites by the respective nondominant bar. Coactivation is expressed as the average of the two ratios between the respective nondominant responses and the responses evoked at corresponding sites by the dual bar stimulus. Residual correlation is expressed as the ratio of the NC values measured in the dual bar condition over those measured in the single bar condition. The residual correlation increases with increasing coactivation of both sites by the same bar (r=0.846, p<0.001).
(Fig. 9 graphic: scatter plot; abscissa Coactivation Index.)
with the extent of coactivation. As shown in Fig. 9 the dependence is highly significant (r=0.846, p<0.001). The more the cells at both sites are driven by the same bar in the dual bar configuration, the stronger is the observed correlation or, in other terms, the more exclusively each bar activates only the site with matching preference, the smaller is the correlation. Thus, the amount of synchrony reflects the relative contribution that a stimulus makes to the activation of a distributed population of neurons. This suggests the possibility that the extent to which the responses of different neurons are associated with a particular stimulus is expressed by graded synchronization. The conclusion from these experiments is that neural synchronization is a dynamic phenomenon which depends on global properties of the stimulus configuration and is not merely a reflection of the static anatomical connectivity. The synchronization of directionally selective neurons in MT reflects their activation by the same stimulus and could therefore serve to define transiently the population of neurons representing a particular moving contour and to distinguish this population from other simultaneously active neurons which represent other stimuli.
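The quantities used in this analysis can be written down compactly. The sketch below follows the verbal definitions given in the text and in the legends of Figs. 6 and 9; the function names and the firing rates in the example are hypothetical, while the two NC values are those quoted in the legend of Fig. 7.

```python
def normalized_correlation(peak_count, offset_count):
    """NC: peak amplitude of the cross-correlogram above offset, divided by the offset."""
    return (peak_count - offset_count) / offset_count

def coactivation_index(nonoptimal_rates, dual_bar_rates):
    """Average over sites of (response to the non-optimal bar) / (response to the dual bar stimulus)."""
    ratios = [n / d for n, d in zip(nonoptimal_rates, dual_bar_rates)]
    return sum(ratios) / len(ratios)

def residual_correlation(nc_dual, nc_single):
    """NC in the dual bar condition as a fraction of NC in the single bar condition."""
    return nc_dual / nc_single

# NC values quoted in the legend of Fig. 7: 4.4% (dual bar) and 56.5% (single bar).
print(round(residual_correlation(4.4, 56.5), 3))

# Hypothetical firing rates in spikes/s for two sites (not data from the study).
print(round(coactivation_index([10.0, 5.0], [50.0, 50.0]), 3))
```

For the example pair of Fig. 7 the residual correlation is below one tenth of the single bar value; according to the relation shown in Fig. 9, such a pair would be expected to show little coactivation of either site by its non-optimal bar.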
Physiological effects of neural synchrony

If correlated discharge is of functional importance for cortical processing then it has to have physiological consequences which differ from those of uncorrelated spike trains with the same mean frequency. Direct experimental evidence for this is difficult to obtain because it would require individual manipulation and observation of the activity of many distributed neurons. However, there are several features of cortical neurons and their connectivity which suggest that synchronized spike trains drive target neurons more effectively than unsynchronized input patterns. One of the fundamental requirements to distinguish synchronized events from unsynchronized input is that the time constants for temporal integration are short. Measurements of the membrane time constants in the cortical slice preparation resulted in values between about 6 and 20 ms (Mason et al. 1991; Kim and Connors, 1993; Thomson and West, 1993). Based on these estimates and the high frequency of EPSPs evoked in physiologically stimulated cells it has been suggested that effective coincidence detection is impossible in cortical neurons (Shadlen and Newsome, 1994). However, the functionally relevant effective membrane time constant is influenced by synaptic activity which is usually absent in slice preparations. In vivo, synaptic conductances are permanently activated at multiple sites of the neurons and this is expected to reduce membrane resistance and consequently also membrane time constants. This effect should be particularly strong in the small dendritic compartments in which membrane time constants are shorter even without synaptic bombardment (Kim and Connors, 1993). The rise times of cortical EPSPs and IPSPs are usually 0.5-2 ms indicating that very fast modulations of membrane potentials are possible if not only passively decaying EPSPs but also the actively repolarizing currents of IPSPs are considered. Theoretical studies suggested that even coincidences within the sub-millisecond range could be detected if active propagation is present in dendrites (Softky, 1994).
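The effect of tonic synaptic activity on the integration window can be made explicit with the standard passive-membrane relation. This is a textbook sketch rather than a calculation taken from the studies cited above; the symbols $C_m$ (membrane capacitance), $g_{\mathrm{leak}}$ (resting leak conductance) and $g_{\mathrm{syn}}$ (summed tonic synaptic conductance) are introduced here for illustration only.

```latex
\tau_m = R_m C_m = \frac{C_m}{g_{\mathrm{leak}}},
\qquad
\tau_{\mathrm{eff}} = \frac{C_m}{g_{\mathrm{leak}} + g_{\mathrm{syn}}}
                    = \frac{\tau_m}{1 + g_{\mathrm{syn}} / g_{\mathrm{leak}}}
```

As a purely hypothetical example, a tonic synaptic conductance three times the leak conductance would shorten a slice value of 20 ms to an effective 5 ms.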
Direct evidence for the notion that the membrane potential can change sufficiently fast to discriminate short synchronous from successive events is provided by the intracellular recordings of cortical neurons activated in vivo by visual stimuli (Ferster, 1986; Jagadeesh et al. 1992). These reveal that fluctuations of membrane potential can be extremely fast both in de- and hyperpolarizing directions. Thus, the experimental evidence suggests that EPSPs and IPSPs working in concert can change the membrane potential in the intact cortical network fast enough to permit coincidence detection.

A characteristic feature of cortical networks is the high degree of convergence of synaptic input. In V1 it has been estimated that the average neuron receives 3900 synaptic contacts, 17% of them being GABAergic (Beaulieu et al. 1992). Since multiple contacts between an axon and the same postsynaptic cell are rare (Braitenberg and Schüz, 1991), the high degree of convergence is associated with relatively low efficacy of individual excitatory inputs (Mason et al. 1991; Thomson and West, 1993; Deuchars et al. 1994). Therefore, considerable temporal or spatial summation is required to reach threshold in postsynaptic neurons (Abeles, 1982; Abeles, 1991). If the time constants for the integration of synaptic potentials are indeed short then synchronously arriving EPSPs can be expected to be more effective than temporally dispersed inputs. In addition, there is the possibility that synchronous EPSPs summate in a nonlinear manner and produce stronger depolarization than suggested by linear interactions (Abeles, 1991).

Inhibitory interactions further increase the efficiency of synchronously active afferents relative to that of uncorrelated inputs (Fig. 10). Inhibitory neurons and pyramidal cells often receive input from collaterals of the same axon (McGuire et al. 1991). Therefore, inhibition rises in parallel with the activation of pyramidal neurons and limits their excitation (Ferster, 1986). Synchronously active afferents evoke EPSPs in both populations of neurons and are likely to reach threshold simultaneously in pyramidal cells and interneurons. Because the IPSPs are delayed by the interneurons they are likely to become effective in pyramidal neurons only after these have discharged (Douglas and Martin, 1991). Subsequently, the hyperpolarization can decline quickly since the IPSPs arrived simultaneously and no further afferent spikes activate the inhibitory interneurons. The next synchronized group of spikes approaches the pyramidal neuron again in a state of normal excitability. In contrast, unsynchronized input activates inhibitory interneurons more continuously and in order to reach threshold in pyramidal cells it has to overcome the inhibition sustained by permanently active interneurons.

Furthermore, the properties of cortical synapses also seem to favour transmission of synchronous input. Synapses among excitatory cortical neurons tend to exhibit marked paired pulse depression and frequency attenuation (Thomson and West, 1993) while excitatory synapses on inhibitory neurons show paired pulse facilitation and frequency potentiation (Thomson et al. 1993). In conjunction, these properties constrain the possibility to assure propagation of responses by increasing discharge rates. The alternative strategy to assure response propagation, the synchronization of inputs, is unaffected by these characteristics of cortical synapses because it does not require enhanced firing rates.

In conclusion, the basic biophysical properties of pyramidal cells, the architecture of excitatory connections, the wiring of inhibitory interneurons and the particular features of excitatory synapses appear to be ideally suited to favour propagation of synchronous activity and to attenuate transmission of temporally dispersed responses. Modulating the degree of synchrony among simultaneously active neurons is thus a highly effective mechanism to modulate the "saliency" of neuronal responses and to thereby select constellations of inputs for further joint processing.

Fig. 10. Schematic illustration comparing the effectiveness of several unsynchronized (a) or mutually synchronized (b) spike trains reaching a cortical network of excitatory and inhibitory neurons. The unsynchronized spike trains evoke only weak activation of the pyramidal neuron as compared to the synchronized spike trains (For detailed explanation see text).
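The temporal-summation part of this argument can be illustrated with a deliberately simple linear model. The sketch below assumes EPSPs with instantaneous rise and exponential decay that add linearly; it ignores inhibition, synaptic dynamics and nonlinear summation, and every number in it (EPSP size, number of inputs, threshold, spacing of the dispersed train) is a hypothetical illustration value. Only the two time constants are taken from the 6 to 20 ms range quoted above.

```python
import math

def peak_depolarization(arrivals_ms, tau_ms, epsp_mv):
    """Peak of linearly summed EPSPs (instantaneous rise, exponential decay).

    With this kernel the summed potential can only peak at an arrival time,
    so it is sufficient to evaluate it at those times.
    """
    peak = 0.0
    for t in arrivals_ms:
        v = sum(epsp_mv * math.exp(-(t - t0) / tau_ms)
                for t0 in arrivals_ms if t0 <= t)
        peak = max(peak, v)
    return peak

# All values below are hypothetical illustration values.
N_INPUTS = 20
EPSP_MV = 0.5          # depolarization contributed by one input
THRESHOLD_MV = 5.0     # depolarization needed to fire
synchronous = [10.0] * N_INPUTS                          # one volley at 10 ms
dispersed = [10.0 + 4.0 * k for k in range(N_INPUTS)]    # one EPSP every 4 ms

for tau in (20.0, 6.0):
    sync_peak = peak_depolarization(synchronous, tau, EPSP_MV)
    disp_peak = peak_depolarization(dispersed, tau, EPSP_MV)
    print(tau, sync_peak >= THRESHOLD_MV, disp_peak >= THRESHOLD_MV)
```

In this sketch the synchronous volley reaches the assumed threshold at either time constant, whereas the dispersed train does not, and the peak reached by the dispersed train shrinks further as the time constant shortens.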
Functional significance

Stimulus dependent synchronization has several functionally important consequences. First, it can serve to label the responses of neurons which ought to be processed together. As outlined in the first section population coded information can only be evaluated if the neurons coding for a particular content can be distinguished from the usually large number of other activated neurons which code for different contents. The experimental evidence that neurons tend to synchronize their activity if they respond to the same stimulus while their responses are uncorrelated if they are activated by different stimuli supports such a functional interpretation. Recent simulation studies also indicate that response synchronization can be used effectively to select responses representing the same perceptual object and to segregate them from responses evoked by different objects or embedding background (Wang et al. 1990; König and Schillen, 1990; Horn and Usher, 1991; Grossberg and Somers, 1991; Sporns et al. 1991; Arndt et al. 1992; Neven and Aertsen, 1992; Ritz et al. 1994; Schillen and König, 1994; Sompolinsky and Tsodyks, 1994). A particular advantage of this mechanism is that it does not interfere with rate codes. Synchronization in the millisecond range requires only slight temporal shifts of spikes which would have occurred anyway and does not require modification of the average spike rate.

A second important aspect of stimulus dependent synchronization concerns the protection of functional interactions between members of the same assembly from disturbance by other, unrelated input signals. Within a system characterized by a high degree of convergence and divergence the risk is high that processing of a particular content gets compromised by unrelated signals if several assemblies are activated simultaneously. Because of the virtually infinite number of possible constellations of assemblies, an anatomical separation of their respective networks is impossible. Therefore, a dynamic mechanism is required which rapidly and flexibly enhances functional interactions between neurons processing the same content and at the same time reduces the impact of unrelated signals from the many other active assemblies with which they are likely to have connections. Stimulus dependent synchronization is particularly well suited to serve this purpose since it concentrates the responses in a neuronal population to short epochs which are separated by intervals of low excitability. The reason is that once a synchronous volley of afferent activity has been transmitted by the selected group of pyramidal cells these very same cells will be inhibited for several tens of milliseconds because the same volley which caused their excitation will have driven inhibitory interneurons. This effect is probably further enhanced by feedback inhibition, because inhibitory interneurons receive excitatory inputs from the very same pyramidal cells that they inhibit. Potentially disturbing inputs from other active populations are by definition not synchronized with the activity in the population under consideration and hence will evoke EPSPs in the inactive phase where they will be shunted by IPSPs. The probability and effectiveness of interactions between neurons processing different stimuli is thus considerably reduced in comparison to neurons processing the same stimulus. Similarly the activity of neurons which are not recruited into any synchronized ensemble will remain rather inefficient because their EPSPs cannot benefit from spatial summation. This improves the signal to noise ratio for the organized activity patterns of populations. Thus, synchronization can enhance selectively the efficacy of functional interactions between different neurons related to the same, transiently defined population and between their responses at common target cells, despite the fixed synaptic connectivity in the network (Aertsen et al. 1989, 1994; Boven and Aertsen, 1990; Somers and Kopell, 1993; Aertsen and Preißl, 1991; Grannan et al. 1994). This improves the conjoint processing of responses evoked by the same stimulus and keeps interactions of responses related to different stimuli predictably small.
References

Abeles M (1982) Local cortical circuits. Berlin Heidelberg New York: Springer-Verlag
Abeles M (1991) Corticonics. Cambridge: Cambridge University Press
Aertsen A, Gerstein G and Johannesma P (1986) From neuron to assembly: Neuronal organization and stimulus representation. In: Brain Theory (Palm G, Aertsen A, eds), pp 7-24. Berlin: Springer-Verlag
Aertsen A, Erb M, and Palm G (1994) Dynamics of functional coupling in the cerebral cortex: an attempt at a model-based interpretation. Physica D 75:103-128
Aertsen A and Preißl H (1991) Dynamics of activity and connectivity in physiological neuronal networks. In: Nonlinear Dynamics and Neural Networks (Schuster H, ed), pp 281-301. Weinheim: VCH Verlag
Aertsen AMHJ, Gerstein GL, Habib MK, and Palm G (1989) Dynamics of neuronal firing correlation: modulation of "effective connectivity". J.Neurophysiol. 61:900-917
Arndt M, Dicke P, Erb M, Eckhorn R and Reitboeck HJ (1992) Two-layered physiology-oriented neuronal network models that combine dynamic feature linking via synchronization with a classical associative memory. In: Neural network dynamics (Taylor JG, Caianello ER, Cotterill RMJ, Clark JW, eds), pp 140-154. Springer
Barlow HB (1972) Single units and cognition: A neurone doctrine for perceptual psychology. Perception 1:371-394
Beaulieu C, Kisvarday Z, Somogyi P, Cynader M, and Cowey A (1992) Quantitative distribution of GABA-immunopositive and -immunonegative neurons and synapses in the monkey striate cortex (Area 17). Cereb.Cortex 2:295-309
Boven K-H and Aertsen A (1990) Dynamics of activity in neuronal networks give rise to fast modulations of functional connectivity. In: Parallel processing in neural systems and computers (Eckmiller R, Hartmann G, Hauske G, eds), pp 53-56. North-Holland: Elsevier Science Publishers B.V.
Braitenberg V (1978) Cell assemblies in the cerebral cortex. In: Lecture Notes in Biomathematics 21, Theoretical Approaches in Complex Systems (Heim R, Palm G, eds), pp 171-188. Berlin: Springer
Braitenberg V and Schüz A (1991) Anatomy of the Cortex - Statistics and Geometry. Berlin: Springer
Britten KH, Shadlen MN, Newsome WT, and Movshon JA (1992) The analysis of visual motion: A comparison of neuronal and psychophysical performance. J.Neurosci. 12:4745-4765
Bruce CJ, Desimone R, and Gross CG (1981) Visual properties of neurons in a polysensory area in superior temporal sulcus of the macaque. J.Neurophysiol. 46:369-384
Bullier J, Munk MHJ, and Nowak LG (1992) Synchronization of neuronal firing in areas V1 and V2 of the monkey. Soc.Neurosc.Abstr. 18:11
Desimone R, Schein SJ, Moran J, and Ungerleider LG (1985) Contour, color and shape analysis beyond the striate cortex. Vision Res. 25:441-452
Desimone R (1991) Face-selective cells in the temporal cortex of monkeys. J.Cogn.Neurosci 3:1-8
Deuchars J, West DC, and Thomson AM (1994) Relationships between morphology and physiology of pyramid-pyramid single axon connections in rat neocortex in vitro. J.Physiol. 478.3:423-435
Douglas RJ and Martin KAC (1991) A functional microcircuit for cat visual cortex. J.Physiol. 440:735-769
Eckhorn R, Frien A, Bauer R, Woelbern T, and Kehr H (1993) High frequency (60-90 Hz) oscillations in primary visual cortex of awake monkey. NeuroReport 4:243-246
Edelman GM and Mountcastle VB (1978) The Mindful Brain. Cambridge: MIT Press
Engel AK, König P, and Singer W (1991a) Direct physiological evidence for scene segmentation by temporal coding. Proc.Natl.Acad.Sci.USA. 88:9136-9140
Engel AK, Kreiter AK, König P, and Singer W (1991b) Synchronization of oscillatory neuronal responses between striate and extrastriate visual cortical areas of the cat. Proc.Natl.Acad.Sci.USA. 88:6048-6052
Ferster D (1986) Orientation selectivity of synaptic potentials in neurons of cat primary visual cortex. J.Neurosci. 6:1284-1301
Frien A, Eckhorn R, Bauer R, Woelbern T, and Kehr H (1994) Stimulus specific oscillations at zero phase between visual areas V1 and V2 of awake monkey. NeuroReport 5:2273-2277
Gallant JL, Braun J, and Van Essen DC (1993) Selectivity for polar, hyperbolic, and cartesian gratings in macaque visual cortex. Science 259:100-103
Gawne TJ and Richmond BJ (1993) How independent are the messages carried by adjacent inferior temporal cortical neurons? J.Neurosci. 13:2758-2771
Georgopoulos AP (1990) Neural coding of the direction of reaching and a comparison with saccadic eye movements. Cold Spring Harbor Laboratory Press 55:849-859
Gerstein GL, Bedenbaugh P, and Aertsen AMHJ (1989) Neuronal assemblies. IEEE Trans.Bio-Med.Eng. 36:4-14
Gerstein GL and Gochin PM (1992) Neural population coding and the elephant. In: Information processing in the cortex - Experiments and Theory (Aertsen A, Braitenberg V, eds), pp 139-173. Berlin: Springer
Gochin PM, Miller EK, Gross CG, and Gerstein GL (1991) Functional interactions among neurons in inferior temporal cortex of the awake macaque. Exp.Brain Res. 84:505-516
Grannan ER, Kleinfeld D, and Sompolinsky H (1994) Stimulus-dependent synchronization of neuronal assemblies. Neural Comp. 5:550-569
Gray CM, König P, Engel AK, and Singer W (1989) Oscillatory responses in cat visual cortex exhibit inter-columnar synchronization which reflects global stimulus patterns. Nature 338:334-337
Gross CG, Rocha-Miranda CE, and Bender DB (1972) Visual properties of neurons in inferotemporal cortex of the macaque. J.Neurophysiol. 35:96-111
Grossberg S (1980) How does a brain build a cognitive code? Psychol.Rev. 87:1-51
Grossberg S and Somers D (1991) Synchronized oscillations during cooperative feature linking in a cortical model of visual perception. Neural Netw. 4:453-466
Hasselmo ME, Rolls ET, and Baylis GC (1986) Selectivity between facial expressions in the responses of a population of neurons in the superior temporal sulcus of the monkey. Neuroscience Letters S26:S571
Hasselmo ME, Rolls ET, and Baylis GC (1989) The role of expression and identity in the face-selective responses of neurons in the temporal visual cortex of the monkey. Behav.Brain Res. 32:203-218
Hawken MJ and Parker AJ (1990) Detection and discrimination mechanisms in the striate cortex of the old-world monkey. In: Vision: coding and efficiency (Blakemore C, ed), pp 103-116. Cambridge: Cambridge University Press
Hebb DO (1949) The Organization of Behavior. New York: Wiley
Henry GH (1985) Physiology of cat striate cortex. In: Cerebral Cortex (Peters A, Jones EG, eds), pp 119-155. New York: Plenum Press
Hopfield JJ (1982) Neural networks and physical systems with emergent collective computational abilities. Proc.Natl.Acad.Sci.USA. 79:2554-2558
Hopfield JJ and Tank DW (1991) Computing with neural circuits: a model. Science 233:625-633
Horn D and Usher M (1991) Parallel activation of memories in an oscillatory neural network. Neural Comp. 3:31-43
Hubel DH and Wiesel TN (1959) Receptive fields of single neurons in the cat striate cortex. J.Physiol. 148:574-591
Hubel DH and Wiesel TN (1962) Receptive fields, binocular interaction and functional architecture in the cat's visual cortex. J.Physiol. 160:106-154
Jagadeesh B, Gray CM, and Ferster D (1992) Visually evoked oscillations of membrane potential in cells of cat visual cortex. Science 257:552-554
Kim HG and Connors BW (1993) Apical dendrites of the neocortex: Correlation between Sodium- and Calcium-dependent spiking and pyramidal cell morphology. J.Neurosci. 13:5301-5311
König P and Schillen TB (1990) Segregation of oscillatory responses by conflicting stimuli - desynchronizing connections in neural oscillator layers. In: Parallel processing in neural systems and computers (Eckmiller R, Hartmann G, Hauske G, eds), pp 117-120. North Holland: Elsevier Science Publishers B.V.
Kreiter AK (1992) Kodierung neuronaler Assemblies durch kohärente Aktivität: Korrelationsanalysen im Sehsystem von Säugetieren (Thesis, University of Tübingen)
Kreiter AK and Singer W (1992) Oscillatory neuronal responses in the visual cortex of the awake macaque monkey. Eur.J.Neurosci. 4:369-375
Kreiter AK and Singer W (1995a) Spike rate covariation in area MT of awake macaque monkeys. In: Learning and Memory - Proceedings of the 23rd Göttingen Neurobiology Conference 1995 (Elsner N, Menzel RM, eds), Stuttgart: Thieme
Kreiter AK and Singer W (1995b) Stimulus dependent synchronization of neuronal responses in the visual cortex of the awake macaque monkey. (submitted)
Krüger J and Aiple F (1988) Multimicroelectrode investigation of monkey striate cortex: spike train correlations in the infragranular layers. J.Neurosci. 60:798-828
Livingstone MS (1991) Visually-evoked oscillations in monkey striate cortex. Soc.Neurosci.Abstr. 17:176
Livingstone MS and Hubel DH (1988) Segregation of form, color, movement, and depth: Anatomy, physiology, and perception. Science 240:740-749
Mason A, Nicoll A, and Stratford K (1991) Synaptic transmission between individual pyramidal neurons of the rat visual cortex in vitro. J.Neurosci. 11:72-84
Maunsell JHR and Newsome WT (1987) Visual processing in monkey extrastriate cortex. Ann.Rev.Neurosci. 10:363-401
McGuire BA, Gilbert CD, Rivlin PK, and Wiesel TN (1991) Targets of horizontal connections in macaque primary visual cortex. J.Comp.Neurol. 305:370-392
Milner PM (1974) A model for visual shape recognition. Psychol.Rev. 81:521-535
Miyashita Y (1993) Inferior temporal cortex: where visual perception meets memory. Ann.Rev.Neurosci. 16:245-263
Munk MHJ, Nowak LG, and Bullier J (1993) Spatio-temporal response properties and interactions of neurons in areas V1 and V2 of the monkey. Soc.Neurosc.Abstr. 19:424
Murasugi CM, Salzman CD, and Newsome WT (1993) Microstimulation in visual area MT: effects of varying pulse amplitude and frequency. J.Neurosci. 13:1719-1729
Neven H and Aertsen A (1992) Rate coherence and event coherence in the visual cortex: a neuronal model of object recognition. Biol.Cybern. 67:309-322
Newsome WT, Britten KH, and Movshon JA (1989) Neuronal correlates of a perceptual decision. Nature 341:52-54
Orban GA (1984) Neuronal operations in the visual cortex. Berlin: Springer-Verlag
Palm G (1982) Neural Assemblies. Heidelberg: Springer
Palm G (1990) Cell assemblies as a guideline for brain research. Concepts in Neurosci. 1:133-147
Parker A and Hawken M (1985) Capabilities of monkey cortical cells in spatial-resolution tasks. J.Opt.Soc.Am. 2:1101-1114
Perrett DI, Smith PAJ, Potter DD, Mistlin AJ, Head AS, Milner AD, and Jeeves MA (1985) Visual cells in the temporal cortex sensitive to face view and gaze direction. Proc.R.Soc.Lond. B 223:293-317
Ritz R, Gerstner W, Fuentes U, and van Hemmen JL (1994) A biologically motivated and analytically soluble model of collective oscillations in the cortex: II. Application to binding and pattern segmentation. Biol.Cybern. 71:349-359
Rolls ET (1992) Neurophysiological mechanisms underlying face processing within and beyond the temporal cortical visual areas. Phil.Trans.R.Soc.Lond.B 335:11-21
Rumelhart DE and McClelland JL (1986) Parallel Distributed Processing: Explorations in the Microstructure of Cognition. Volume 1: Foundations. Cambridge: MIT Press
Saito H-A, Yukie M, Tanaka K, Hikosaka K, Fukada Y, and Iwai E (1986) Integration of direction signals of image motion in the superior temporal sulcus of the macaque monkey. J.Neurosci. 6:145-157
Sakai K and Miyashita Y (1991) Neural organization for the long-term memory of paired associates. Nature 354:152-155
Salzman CD, Britten KH, and Newsome WT (1990) Cortical microstimulation influences perceptual judgements of motion direction. Nature 346:174-177
Salzman CD, Murasugi CM, Britten KH, and Newsome WT (1992) Microstimulation in visual area MT: effects on direction discrimination performance. J.Neurosci. 12:2331-2355
Schillen TB and König P (1994) Binding by temporal structure in multiple feature domains of an oscillatory neuronal network. Biol.Cybern. 70:397-405
Shadlen M and Newsome WT (1994) Noise, neural codes and cortical organization. Current Biol. 4:569-579
Singer W (1990a) The formation of cooperative cell assemblies in the visual cortex. J.exp.Biol. 153:177-197
Singer W (1990b) Search for coherence: a basic principle of cortical self-organization. Concepts in Neurosci. 1:1-26
Singer W (1993) Synchronization of cortical activity and its putative role in information processing and learning. Annu.Rev.Physiol. 55:349-374
Singer W and Gray CM (1995) Visual feature integration and the temporal correlation hypothesis. Ann.Rev.Neurosci. 18:555-586
Softky W (1994) Sub-millisecond coincidence detection in active dendritic trees. Neurosci. 58:13-41
Somers D and Kopell N (1993) Rapid synchronization through fast threshold modulation. Biol.Cybern. 68:393-407
Sompolinsky H and Tsodyks M (1994) Segmentation by a network of oscillators with stored memories. Neural Comp. 6:642-657
Sporns O, Tononi G, and Edelman GM (1991) Modeling perceptual grouping and figure-ground segregation by means of active reentrant connections. Proc.Natl.Acad.Sci.USA. 88:129-133
Tanaka K (1993) Neuronal mechanisms of object recognition. Science 262:685-688
Thomson AM, Deuchars J, and West DC (1993) Single axon excitatory postsynaptic potentials in neocortical interneurons exhibit pronounced paired pulse facilitation. Neurosci. 54:347-360
Thomson AM and West DC (1993) Fluctuations in pyramid-pyramid excitatory postsynaptic potentials modified by presynaptic firing pattern and postsynaptic membrane potential using paired intracellular recordings in rat neocortex. Neurosci. 54:329-346
Tolhurst DJ, Movshon JA, and Dean AF (1983) The statistical reliability of signals in single neurons of cat and monkey visual cortex. Vision Res. 23:775-785
Ts'o DY and Gilbert CD (1988) The organization of chromatic and spatial interactions in the primate striate cortex. J.Neurosci. 8:1712-1727
Van Essen DC, Newsome WT, and Maunsell JHR (1984) The visual field representation in striate cortex of the macaque monkey: asymmetries, anisotropies, and individual variability. Vision Res. 24:429-448
Vogels R and Orban GA (1990) How well do response changes of striate neurons signal differences in orientation: A study in the discriminating monkey. J.Neurosci. 10:3543-3558
von der Malsburg C (1985) Nervous structures with dynamical links. Ber.Bunsenges.Phys.Chem. 89:703-710
von der Malsburg C (1986) Am I thinking assemblies? In: Brain Theory (Palm G, Aertsen A, eds), pp 161-176. Berlin: Springer-Verlag
von der Malsburg C and Schneider W (1986) A neural cocktail-party processor. Biol.Cybern. 54:29-40
von der Malsburg C and Singer W (1988) Principles of cortical network organization. In: Neurobiology of Neocortex (Rakic P, Singer W, eds), pp 69-99. Chichester: John Wiley & Sons Limited
Wang D, Buhmann J, and von der Malsburg C (1990) Pattern segmentation in associative memory. Neural Comp. 2:94-106
Yamane S, Kaji S, and Kawano K (1988) What facial features activate face neurons in the inferotemporal cortex of the monkey? Exp.Brain Res. 73:209-214
Young MP and Yamane S (1992) Sparse population coding of faces in the inferotemporal cortex. Science 256:1327-1331
Zeki SM (1975) The functional organization of projections from striate to prestriate visual cortex in the rhesus monkey. Cold Spring Harbor Symp.Quant.Biol 40:591-600
Zipser D and Andersen RA (1988) A back-propagation programmed network that simulates response properties of a subset of posterior parietal neurons. Nature 331:679-684
Zohary E, Shadlen M, and Newsome WT (1994) Correlated neuronal discharge rate and its implications for psychophysical performance. Nature 370:140-143
Brain Theory - Biological Basis and Computational Principles A. Aertsen and V. Braitenberg (Editors) © 1996 Elsevier Science B.V. All rights reserved.
Models for dynamic receptive fields and cross-correlograms in visual cortex.

George L. Gerstein and Jing Xing
Department of Neuroscience, University of Pennsylvania, Philadelphia PA 19104, U.S.A.

Supported by NIH MH-46428 and DC-01249.

1. ABSTRACT

We examine network models of a sensory and particularly a visual system with three layers corresponding to receptors, thalamus, and cortex. Spiking elements are used, with lateral excitatory and inhibitory connectivity in the cortical layer. Under the appropriate conditions we can reproduce a range of experiments that have demonstrated dynamics in receptive field sizes and dynamic changes in cross-correlations.

2. INTRODUCTION

One of the prime discoveries of the past few years has been the extent to which nervous system organization remains plastic and dynamic even in the adult long after the development period (Wall, 1988). Evidence for such changes comes from a number of different sensory systems, and generally has demonstrated that the cortical sensory map can be modulated through various peripheral manipulations. While some of these manipulations involved surgical deafferentation or input rearrangements (Gilbert and Wiesel 1992, Merzenich et al 1983, 1984), it turned out that map modulations could also follow after periods of intense use of a particular region of the periphery (a single finger for example) in a behavioral task (Jenkins et al 1990, Recanzone et al. 1992). All such map modulations occurred on a rather long time scale of months, although there were some immediate changes (generally related to unmasking of weak inputs) after some of the surgical interferences with the periphery.

Another set of observations has demonstrated dynamic changes in organization on a much faster time scale. Here we have intracortical microstimulation where in a matter of hours the receptive fields of neurons in the stimulated region can be changed (Bedenbaugh 1993, Dinse et al., 1993, Maldonado 1993). These changes are either a movement of the original receptive field to cover the representation at the point of cortical stimulation, or the enlargement of the original receptive field to include this representation. A still more rapid receptive field modulation has been reported either with retinal lesions, or, more simply, with use of a very large region of visual stimulation but with a mask over the original receptive field of the neuron under observation (Gilbert and Wiesel 1992, Pettet and Gilbert 1992). This has been called a "Scotoma" experiment, and produces RF enlargements within minutes. Subsequent stimulation in the scotoma region alone or, by removing the mask, in the whole large region, quickly collapses the RF to its original dimensions.

In a recent paper we have described neural net simulations that were particularly intended to explore rules for RF modulations in the scotoma experiments (Xing and Gerstein 1994). The simulations were based on spiking neuron elements, and particularly examined the role of lateral inhibition and excitation in a cortical layer. We review this work below, and extend it to new data available from the Gilbert laboratory (Das and Gilbert, 1995) about cross-correlation between neuron pairs recorded during the scotoma experiment. Such measurements in effect estimate the "effective connectivity" in the network, and begin to address properties of neuronal assemblies as they influence individual neuron receptive fields. Incorporation of such data into our models further restricts the classes of mechanism that (in such simplified models) replicate the experimental observations.
3. THE MODEL

Our simulation consists of three layers, each made up of 32x32 elements, and representing respectively input, subcortex, and cortex. The elements in the input layer in no way simulate retinal function, and serve merely to provide spatially and temporally controlled input currents to the subcortical layer. The projection from the input to the subcortical layer is 1 to 3x3 in register; similarly the projection from subcortical to cortical layer is 1 to 11x11. Within the stated region, the actual connections are assigned randomly within an in register Gaussian distribution with standard deviation 4. In this simplified model the spatial density of elements in visual space is constant, unlike the living system. It is hard therefore to express these numbers in terms of known anatomy. However, "typical" LGN receptive fields are about 0.75 degree, while typical cortical receptive fields are about 2.5 degrees (with much variance). We may, somewhat cabalistically, note that this is about the same ratio as the parameters of the projections defined above. As it happens, the modeling results are remarkably robust with respect to the subcortex to cortex projection parameter; there are no significant changes even at 1 to 3x3.

There are both excitatory and inhibitory lateral connections in the cortical layer. Each neuron forms excitatory synapses on three surrounding "neuron rings", with a probabilistic density that decreases monotonically with distance. Inhibitory synapses are formed probabilistically on seven surrounding rings, with a density that first increases, then decreases with distance. This density function also turns out to be a robust parameter; a monotonic decreasing function yields qualitatively similar results. Recent anatomical evidence on the spatial scale of nearby cortical pyramids does support the increasing/decreasing inhibitory scheme (Thomson and Deuchars 1994), as well as the monotonically decreasing excitation. However, on the spatial scale of hypercolumns, the lateral spread of axonal arborization is monotonically decreasing for both excitation and inhibition (Gilbert and Wiesel 1989). The spatial scale of elements in our model is somewhere between these extremes, but is robust to either spatial density scheme. Figure 1 shows a cartoon of the above network arrangements.
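The connection rules described in this section can be sketched in a few lines. The following is an illustrative reading of the text, not the code used for the simulations: the fan-out per element (`n_targets`), the per-ring connection probabilities, the use of square (Chebyshev) rings, and the clipping of connections at the layer border are all assumptions introduced here, since the text does not specify them.

```python
import random

SIZE = 32  # elements per side of each layer (from the text)

def feedforward_targets(src, window=11, sigma=4.0, n_targets=20, rng=None):
    """Sample cortical targets for one subcortical element.

    Offsets are drawn from a Gaussian (sd = sigma) centred in register on the
    source and rejected if they leave the window x window region or the grid.
    n_targets is a hypothetical fan-out; the text does not state it.
    """
    rng = rng or random.Random(0)
    half = window // 2
    row, col = src
    targets = set()
    for _ in range(100 * n_targets):          # bounded rejection sampling
        if len(targets) == n_targets:
            break
        d_row = round(rng.gauss(0.0, sigma))
        d_col = round(rng.gauss(0.0, sigma))
        if abs(d_row) > half or abs(d_col) > half:
            continue
        r, c = row + d_row, col + d_col
        if 0 <= r < SIZE and 0 <= c < SIZE:
            targets.add((r, c))
    return sorted(targets)

# Hypothetical connection probabilities per ring (ring 1 = nearest neighbours).
EXC_RING_PROB = [0.6, 0.4, 0.2]                       # decreases with distance
INH_RING_PROB = [0.1, 0.2, 0.3, 0.4, 0.3, 0.2, 0.1]   # rises, then falls

def lateral_targets(src, ring_probs, rng=None):
    """Sample lateral targets on the rings surrounding one cortical element."""
    rng = rng or random.Random(0)
    n_rings = len(ring_probs)
    row, col = src
    targets = []
    for d_row in range(-n_rings, n_rings + 1):
        for d_col in range(-n_rings, n_rings + 1):
            ring = max(abs(d_row), abs(d_col))   # square ring index (assumption)
            r, c = row + d_row, col + d_col
            if ring == 0 or not (0 <= r < SIZE and 0 <= c < SIZE):
                continue
            if rng.random() < ring_probs[ring - 1]:
                targets.append((r, c))
    return targets
```

Calling `lateral_targets((16, 16), EXC_RING_PROB)` and `lateral_targets((16, 16), INH_RING_PROB)` yields one excitatory and one inhibitory target set for a central cortical element; swapping `INH_RING_PROB` for a monotonically decreasing list corresponds to the alternative density scheme mentioned in the text.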
[Figure 1 diagram labels: Excitatory connections; Cortex; LGN; Retinal Input; Stimulated; Scotoma (Unstimulated).]
Figure 1. Diagram of the three layers in the model, together with examples of the connection scheme. Each plane contains 32x32 elements (although for pictorial clarity only 16x16 are indicated in the "retinal" layer). The scotoma area is indicated by the dark square in the middle of each plane; this is the subregion which is shielded from stimulation applied to the remaining portions of the "retinal" layer. Edges of the scotoma region in the two higher layers are fuzzier than indicated because of the divergence of each stage in the upward projection.
For greater verisimilitude, we have used spiking neuron elements in the model rather than just input-output rate functions. Neurons are simulated by a modified version of MacGregor's (1987) program PTNRN10, with added spontaneous activity. Thus the activity of each neuron is described by four state variables: transmembrane potential, threshold, potassium conductance, and a spike variable. When the conditions are appropriate, such a neuron produces an action potential, and evokes EPSPs and IPSPs at elements to which it projects. The postsynaptic potentials had a short rise time followed by exponential decay. The IPSP was set to be 2-3 times longer than the EPSP, and had an additional 2-6 ms latency (to mimic the action of one or more interneurons). A more detailed description of the neuron element is given in the appendix of Xing and Gerstein (1994).
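As a rough illustration of a point neuron with these four state variables, the following sketch uses simple first-order dynamics integrated by Euler steps. It is not the program used in the simulations: the class name, the form of the equations, and all parameter values are assumptions chosen for illustration, and spontaneous activity and synaptic potentials are left out.

```python
from dataclasses import dataclass

@dataclass
class PointNeuron:
    # Four state variables named in the text
    e: float = 0.0    # transmembrane potential relative to rest (mV)
    th: float = 10.0  # firing threshold (mV)
    gk: float = 0.0   # potassium conductance (relative units)
    s: int = 0        # spike variable: 1 on a step with a spike, else 0
    # Hypothetical parameters (not taken from the source)
    th0: float = 10.0     # resting threshold (mV)
    c: float = 0.5        # sensitivity of threshold to potential
    b: float = 5.0        # potassium conductance drive per spike
    ek: float = -10.0     # potassium equilibrium potential (mV)
    tau_mem: float = 5.0  # membrane time constant (ms)
    tau_th: float = 25.0  # threshold time constant (ms)
    tau_gk: float = 3.0   # potassium conductance time constant (ms)

    def step(self, input_current, dt=1.0):
        """Advance one Euler step; return 1 if the neuron spiked, else 0."""
        self.s = 1 if self.e >= self.th else 0
        de = (-self.e + input_current + self.gk * (self.ek - self.e)) / self.tau_mem
        dth = (-(self.th - self.th0) + self.c * self.e) / self.tau_th
        dgk = (-self.gk + self.b * self.s) / self.tau_gk
        self.e += dt * de
        self.th += dt * dth
        self.gk += dt * dgk
        return self.s

driven = PointNeuron()
spike_count = sum(driven.step(30.0) for _ in range(50))
```

In this sketch the threshold follows the recent potential, so sustained depolarization raises it; the potassium conductance rises after each spike and pulls the potential back toward `ek`.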
4. METHODS

1) General procedure. The experiments we are simulating (Pettet and Gilbert 1992) used a wide area conditioning (background) stimulus consisting of a moving grating, or smaller, uniformly oriented bars. A region larger than the receptive field of the neuron under study was masked, producing the artificial scotoma. After some minutes, the RF was observed to expand. Subsequent stimulation at the RF center caused its collapse to near the original dimensions. There was obviously a problem in determining the RF of the neuron, since this required stimulation that would cause RF collapse. Pettet and Gilbert (1992) used a protocol where periods of wide area (background) stimulation were alternated with short probings of the RF; presumably just a few probe stimuli at a time do not much affect the RF size, thus allowing determination of its expanded state. Our simulation procedure followed the same general plan.

2) Conditioning background stimulus (BS). At each time step of the simulation a randomly chosen 10% of the input elements were stimulated with a current step that caused a 10 mV depolarization. This was sufficient to give fairly high rates of firing in the subcortical layer of the model. The stimulus does not directly mimic the gratings or bar arrays used in the experiments, but replicates the wide area of effective stimulation. A central region of the input layer was not stimulated, thus mimicking the artificial scotoma; the corresponding "cortical scotoma region" was therefore also shielded from the background stimulation.

3) Center stimulus. For each cortical layer neuron studied, RF collapse after expansion was produced by intensely stimulating a 3x3 spot in the input layer at the location of the neuron's RF center. This stimulus was used either alone or together with the BS.

4) RF measurement. Responses of the neuron under study are explored by presenting an appropriate current step to 3x3 "spots" of the input layer. 
Ten scans (random location sequences) of the relevant input region are averaged to produce the two dimensional RF description. The process is intended to mimic RF measurement in experiments. 5) Possible mechanisms. Wide ranging lateral connections, both excitatory and inhibitory, are known to exist in cortex (Rockland et al. 1982; Ts'o et al. 1986; Gilbert and Wiesel 1989). Such connections allow influences on a given neuron that reach far beyond the classical
receptive field, and form a likely basic mechanism for the dynamics of RF size. The critical factor here would be the balance between excitation and inhibition sensed by a given neuron; RF enlargement and increased responses could result either from an increase of excitation or from a reduction of inhibition. We have studied four possible mechanisms that can change the local balance of excitation and inhibition. Mechanism A uses a cortical layer network which is inhibition dominant, with adaptation of neurons that are made to fire rapidly. Mechanism B uses only strong lateral inhibition. Mechanism C is excitation dominant, so that the background stimulus spreads strong lateral excitation; there is no adaptation. Mechanism D introduces activity based learning rules which allow either increase of excitation or reduction of inhibition.

5. RESULTS

In our previous paper (Xing and Gerstein 1994) we described in considerable detail the results of using all four of the candidate mechanisms defined above, and concentrated particularly on Mechanism A, which produced the best results. Here we will briefly review the performance of each mechanism for RF expansion properties and then, concentrating on Mechanisms A and C, we will examine the associated changes of cross-correlation between neurons. The left column of Figure 2 shows typical receptive fields for cortical layer neurons as they appear initially in the four different types of network before any modulating stimulation is applied. The gray level is proportional to the responsiveness of the neuron; overall the RF could roughly be described as a Gaussian hill. Diameters are typically 5-7 elements of the input layer. These RFs systematically tile the entire input layer, and overlap heavily.

Mechanism A. 
Here we assume that the cortical network is (a) inhibition dominant, (b) that the input layer has a steady random activity that always produces a non-zero resting firing level in the cortical neurons, and (c) that with constant stimulation, the cortical layer neurons show decreasing firing. The adaptation is modeled by making threshold rise when the neuron has a recent history of rapid firing. Results are robust with regard to the exact parameter values used for the adaptation. With this mechanism, we expect the background stimulation (BS) to cause strong activity in the cortical layer region that lies outside projection of the scotoma. These active neurons will adapt, and their firing levels will fall below the spontaneous level. This, in turn, removes lateral inhibition from the neurons within the cortical layer scotoma region, so that their receptive fields can expand. After presentation of the background stimulus (BS), and after the firing of the surrounding cortical layer neurons has stabilized into its adapted state, we typically observe an enlargement of the RF; this is shown in Figure 2Ab for the same neuron as in 2Aa. The time course of the enlargement, or its decay depends on the protocol with which the BS stimulus is interleaved with the RF measurements; this is precisely the observation in the experiments. When additional stimulation is applied at the center of the scotoma region, the expanded RF
[Figure 2 panel labels. Columns: (a) Before scotoma, (b) After scotoma, (c) Central Stimulus. Rows: A, B, C, D.]
Figure 2. Examples of receptive field for typical neurons near the center of the scotoma region (which is indicated by the square outline). Top row is for Mechanism A (inhibition dominant, with adaptation); second row is for Mechanism B (strong lateral inhibition, no adaptation); third row is for Mechanism C (excitation dominant, no adaptation); bottom row is for Mechanism D (activity-related synaptic learning rules). Column a: original receptive field; column b: receptive field after stimulation of surround, but not scotoma region; column c: receptive field after subsequent additional stimulation of the scotoma region. The results of Mechanisms A and B agree with experiment (expansion of receptive field, followed by contraction). The results of Mechanisms C and D show the expansion, but not the subsequent contraction. Gray levels indicate the cortical layer neuron's responsiveness. Note that only a 16x16 region centered on the RF is shown, although the networks are all 32x32.
reverts to near its original size, as shown in Figure 2Ac; again, this replicates the experimental observations. The center stimulus produces considerable inhibition throughout the scotoma region; this, in turn, reduces the mutual excitatory influences that contributed to the RF expansions. The balance between excitation and inhibition parameters in the network is critical for RF size modulation by the scotoma stimuli; inhibition dominant parameters must be used. Uniform stimulation of the entire system (i.e. BS stimulus without a scotoma region) has no effect on the receptive fields. Adaptation is uniform throughout the net, and the balance of influences remains about the same everywhere. This also replicates experimental observation.

Mechanism B. Here we (a) set a considerably larger inhibition parameter than in Mechanism A, (b) install similar spontaneous activity, and (c) have NO adaptation of thresholds to the firing history. Responses to the modulating stimulus are essentially instantaneous (since there is no adaptation time involved), and effects of the modulating stimulus cease equally rapidly upon its removal. Cortical layer neurons well inside the borders of the scotoma region show RF expansion after BS presentation (Figure 2Bb for the same neuron whose original RF is shown in 2Ba). After additional stimulation of the scotoma center, the RF reverts (Figure 2Bc). However, the neurons near the borders of the scotoma region do not undergo these changes, and may even show reduced receptive field sizes. Such non-uniformity, as well as the minimal time constant for the effects, differs from the experimental observations, although perhaps not to a fatal extent. The reason for the border non-uniformity is the strong inhibition reaching into the scotoma region from the surround that is stimulated by the BS. In turn, neurons far inside the border are relieved of inhibition, and can expand their RFs. 
The extent of the border region is determined by the size of the scotoma relative to the distance through which a given neuron exerts inhibition.

Mechanism C. Here we (a) set the network to be excitation dominant, (b) install spontaneous activity, and (c) allow NO adaptation. This type of network is considerably less stable than the inhibition dominant arrangements, and exhibits larger fluctuations of activity. Presentation of the BS causes increased activity in the surround region. The resulting excitation reaches into the scotoma region and causes RF expansion (Figures 2Ca to 2Cb). Subsequent additional center stimulation causes additional RF expansion (Figure 2Cc), in disagreement with the experimental results. Modulation effects here are also essentially instantaneous. The relative size of the scotoma and the distance through which a given neuron exerts excitation determine whether there will be a border non-uniformity as described for Mechanism B. Here, for a sufficiently large scotoma region, the border and center effects will be the mirror image of those from Mechanism B: center neurons would show no RF modulation effects. This is because excitation has a shorter spatial range than the inhibition, so that it will not reach the center neurons.
Mechanism D. Here we allow changes of synaptic weight through modified Hebbian or other learning rules. Two general variations were studied:

1) We allow an excitatory synapse to be strengthened when there is a coincidence of pre- and postsynaptic firing within some appropriate time window. The increment of synaptic strength is taken proportional to the difference between the present strength and some asymptotic maximum value. Normalization of the total outward synaptic strength from a given neuron is used, i.e. if the learning rule strengthened some synapses, others originating from the same neuron were weakened. After stabilization with the BS stimulus, this version of the model produced modest increases of RF size in the region outside the scotoma, with considerable increases of responsiveness. However, RF sizes for neurons within the scotoma region did not change. None of these observations corresponds to the experimental data.

2) We allow high frequency firing of a neuron to increase the strength of all its outward excitatory synapses (or alternatively to decrease the strength of all its outward inhibitory synapses). After stabilization with the BS stimulus, this model with decreasing strength of outward inhibitory synapses showed enlargement of RF sizes both outside and inside the scotoma region. However, subsequent stimulation within the scotoma region, or stimulation of the entire field, did not produce collapse of RFs to their original size. This is illustrated in Fig. 2D, and does not replicate the experiments.
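The first learning rule of Mechanism D can be illustrated with a short sketch. The function name `hebbian_update`, the learning rate, and the choice to restore the total outward strength by a common scale factor are assumptions made for illustration; the text states only that the increment is proportional to the distance from an asymptotic maximum and that the total outward strength is normalized.

```python
def hebbian_update(weights, coincident, w_max=1.0, rate=0.1):
    """Update the outward excitatory strengths of one neuron.

    weights:    current strengths of the synapses leaving the neuron
    coincident: one bool per synapse, True if pre- and postsynaptic firing
                coincided within the time window (window test not modeled here)
    """
    total_before = sum(weights)
    # Increment proportional to the distance from the asymptotic maximum.
    updated = [
        w + rate * (w_max - w) if hit else w
        for w, hit in zip(weights, coincident)
    ]
    total_after = sum(updated)
    if total_after == 0:
        return updated
    # Assumption: normalization rescales all outward synapses by a common factor,
    # so strengthening some synapses weakens the others.
    scale = total_before / total_after
    return [w * scale for w in updated]

new_weights = hebbian_update([0.2, 0.2, 0.2], [True, False, False])
```

In this example the first synapse ends up stronger than before and the other two weaker, while the summed outward strength is unchanged.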
[Figure 3 plot: bar chart; horizontal axis: Distance from the RF center (rings).]
Figure 3. Typical receptive field profiles for Mechanism A in original (dark bars) and expanded (light bars) conditions. Note the asymmetric expansion and strengthening of the receptive field; this asymmetry is typical for neurons away from the scotoma center.
5.1 Receptive field profiles

When receptive field changes occur, they usually have profiles similar to those shown in Figure 3 (for Mechanism A). The original receptive field profile is shown by the dark bars, the expanded RF profile by the light bars. Note that the expansion is asymmetric because the influence creating the expansion is coming from outside the scotoma region, and thus has a gradient of effect within the scotoma region. Note also that the expansion is not just an increase in responsiveness: the shape changes. If the expansion were the result of increased responsiveness alone, the size ratio of each light/dark bar pair would have been the same. The RF shape change implies a change in the local effective connectivity of the neuron, and agrees with experimental observations.

5.2 Cross-correlation results

All the model mechanisms that we have examined involve changes in the effective strength of the lateral connections in the cortical network. We would therefore expect to see changes in cross-correlation measurements between cortical neurons as the BS stimulus is presented and the RF expansions take place. Indeed a recent paper (Das and Gilbert 1995) has begun to report such measurements, initially limited to observations of two neurons at a time, both within the scotoma region. We can therefore at least partially compare modeling with physiological results. The basic experimental or theoretical cross-correlation measurement here involves comparison of the original and post BS stimulus conditions, i.e. before and after the BS stimulus has caused the RF expansions described above. Generally the firing rates of the neurons will undergo changes between the two conditions, so that it is important to properly normalize the "raw" cross-correlations to compensate. 
In addition, if a stimulus is used during the measurement of cross-correlation, a shift (or shuffle or PST) type of predictor must be used to subtract the direct contribution of the stimulus as it modulates the firing of both neurons. After both such normalizations we may interpret changes in the cross-correlogram peak structure as evidence for change in the strength of the lateral connectivity. In the experimental work (Das and Gilbert 1995) considerable effort was made to select and test the normalization to be used. Their procedure was to subtract the average shuffle predictor and then to normalize the difference correlogram to the average bin counts in the raw flanks, i.e. to N1*N2*dt/T, where N1 and N2 are the total spike counts for each of the two neurons, dt is the bin width, and T is the total observation time. The final normalized correlogram was smoothed with a 3 bin filter. In addition, stimulus strength in the experimental cross-correlation measurements was adjusted to approximately match overall firing rates before and after the BS stimulus presentation, thus reducing differences in the flank normalization of the correlograms being compared. An example of experimental results is shown in Figure 4 (Das and Gilbert, private communication, but similar to the results of Das and Gilbert 1995). Both neurons are inside the scotoma region, and the cross-correlation of their activities shows a symmetric central peak. The area of this peak is considerably increased after application of the scotoma stimulus.
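The normalization steps just described (subtract the predictor, divide by N1*N2*dt/T, smooth over 3 bins) can be sketched as follows. The function names are hypothetical, the predictor is assumed to be supplied as a list of bin counts computed elsewhere, and the 3 bin filter is assumed to be a plain running mean; none of these details is fixed by the text.

```python
def cross_correlogram(times_a, times_b, max_lag, dt):
    """Raw correlogram: counts of spike-time differences (b minus a) per bin."""
    half = int(round(max_lag / dt))
    counts = [0] * (2 * half + 1)
    for ta in times_a:
        for tb in times_b:
            k = int(round((tb - ta) / dt))
            if -half <= k <= half:
                counts[k + half] += 1
    return counts

def normalize_correlogram(raw, predictor, n1, n2, dt, total_time):
    """Subtract the predictor, scale by the flank level N1*N2*dt/T, smooth."""
    flank = n1 * n2 * dt / total_time  # expected count per bin for independent trains
    diff = [(r - p) / flank for r, p in zip(raw, predictor)]
    smoothed = []
    for i in range(len(diff)):
        # Assumption: running mean over 3 bins; edge bins use the bins that exist.
        window = diff[max(0, i - 1): i + 2]
        smoothed.append(sum(window) / len(window))
    return smoothed

spikes_a = [10.0, 20.0, 30.0]
spikes_b = [11.0, 21.0, 31.0]
raw = cross_correlogram(spikes_a, spikes_b, max_lag=2.0, dt=1.0)
result = normalize_correlogram(raw, [0] * len(raw), n1=3, n2=3, dt=1.0, total_time=90.0)
```

After this scaling, a value of 1 corresponds to one flank-level count per bin, which is what makes correlograms recorded at different firing rates comparable.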
Figure 4. Cross-correlations from real neurons, before scotoma stimulus on left, after on right. Receptive fields initially were both within the indicated scotoma region of the stimulus. Correlogram time scales are +/-50 ms. (Das and Gilbert, private communication; other examples in Das and Gilbert 1995).
[Figure 5 plots: four cross-correlogram panels (a)-(d); horizontal axis: Time shift (ms), from -100 to 100.]
Figure 5. Examples of cross-correlations between two model neurons, both inside the scotoma region. Top row: Mechanism A (inhibition dominant, with adaptation); bottom row: Mechanism C (excitation dominant, no adaptation). Left column: original condition; right column: after stimulation outside scotoma region, and corresponding to the expanded receptive field condition as shown in Figs. 2 and 3. Average shift predictors are subtracted and the difference cross-correlograms are normalized by the average value of the counts in their raw distant flanks. This matches the normalization used by Gilbert et al. in their experimental measurement of cross-correlation.
Although the above steps are certainly not the only possible way to normalize such cross-correlation sets, we have used the same procedure, including the stimulus strength adjustment. Thus comparison of our model results with Gilbert's data is appropriate. Results for Mechanism A (which fit the RF expansion/contraction data well) and for Mechanism C (which failed to replicate the data) are shown in the cross-correlograms of Figure 5. The upper row is for Mechanism A, the lower row for Mechanism C; the left column is the original state, the right column is after a period of BS stimulation adequate to produce the RF expansions. The data shown here are for two neurons which are both within the scotoma region. Note that all the model correlograms show a degree of high frequency oscillation which is also visible to some extent in the experimental data of Figure 4. Possible sources might include time quantization problems or actual instabilities in the networks; the conditions and details of such high-frequency oscillations remain to be investigated. In the following we will consider only envelopes of the correlogram peaks. The correlograms of Figure 5 all show a central peak that is symmetric about the origin, usually interpreted as the signature of shared input to the two neurons from unobserved sources. After presentation of the scotoma stimulus, both mechanisms produce an increase in the area of the central peak, corresponding to a strengthened connectivity. Note that the time scales of the experimental and model correlograms differ, indicating a need to fine tune the model timing parameters. However, comparison of experimental and model correlogram shapes and changes remains appropriate. For Mechanism A (upper row), in an inhibition dominant situation, the cross-correlogram peak is relatively narrow. Its source is mainly the shared input to the observed neurons from the thalamic layer. 
The effect of the BS stimulation is to decrease lateral inhibition, thus revealing previously suppressed input in the thalamic to cortical layer divergent projections. The result in the (top, right) cross-correlogram is an increase in central peak height, with some change in width. For Mechanism C (bottom row), in an excitation dominant situation, the cross-correlogram peak is initially much wider because of the additional contributions of the mutual excitation between neurons in the cortical layer. The effect of the BS stimulation is to increase this mutual excitation, causing more and broader synchronization of firing; there is an increase in central peak width without much change in height (bottom, right cross-correlogram). Although experimental results of cross-correlation between neurons that are either both outside the scotoma, or where one neuron is inside and the other outside the scotoma, are not yet available, the model makes at least a qualitative prediction of the expected results. The detailed changes of cross-correlation in the model depend on many of the parameters. However, across a reasonable parameter range, and for most neuron pairs, the changes of cross-correlation accompanying RF expansion are summarized in Table 1. The largest variability in these results was for the inside-outside pairs; the problem is partly that it is difficult to deal with the boundary region where such definitions are somewhat ambiguous because of the divergence of the ascending projection and the lateral connectivity.
Cross-Correlation Changes

Neuron pair    Mechanism A    Mechanism C
               (inh > exc)    (exc > inh)
Ins - Ins      +              +
Out - Out      -              + (or ++)
Ins - Out      -              + (or ++)
Table 1. Qualitative cross-correlogram changes observed for most neuron pairs as indicated. Ins = inside, Out = outside the scotoma region. Increase of central correlogram peak indicated by +, decrease by -. Note that there is a border region (because of the ascending projection divergence) where categorization into inside and outside is poorly defined. This is particularly important in examining I-O correlograms.
Table 1 shows a clear qualitative difference in the model predictions: in the outside-outside and inside-outside correlograms the BS stimulus leads to a reduction of central peak area for Mechanism A, while there is an increase in central peak area for Mechanism C. Further experimentation is clearly appropriate.

6. DISCUSSION

In this paper we have examined dynamic properties of a model network with lateral connectivity involving both excitation and inhibition. The elements were simplified point neurons, but included membrane properties, various ionic currents, spike generation, and synaptic potentials. We reviewed our previous work with replication of dynamic receptive field changes under the specialized condition that leaves a central region of the model without stimulus, while the surround receives stimulation. We have examined a number of different mechanisms and rules for the lateral connectivity, including networks with either dominant excitation or with dominant inhibition as well as with several forms of learning rules. Models with dominant inhibition were far more successful in replicating the full range of available experimental results on RF dynamics. It turned out that all the model variations we tried, both inhibition and excitation dominant, were able to produce some expansion of RF size for neurons within or at least near the borders of the stimulus scotoma region. However, subsequent shrinkage of the expanded RF by stimulating within the scotoma area was observed only with Mechanisms A and B, lateral inhibition dominant with or without adaptation. Models (Mechanism D) with learning rules which produced modification of synaptic strength under appropriate constellations of activity were not as useful, and generally contradicted experimental results.
In further tests of the model we examined cross-correlations between model elements before and after presentation of the stimulus conditions that lead to RF expansions. Both of the inhibition dominant models (Mechanisms A and B) produced increases of central peaks in correlograms calculated between the spike trains of two neurons both of which lay within the scotoma region. This agreed well with experimental observations. Differences between models arise, however, when spike train correlations are calculated for neuron pairs where one element is inside the scotoma and the other is outside, or when both are outside (Table 1). Unfortunately, experimental data for these conditions are not yet available. One of the most interesting aspects of the experimental observations (Kapadia et al. 1994, Das and Gilbert 1995) is the extreme rapidity with which the RF changes occur. The time scale is sufficiently short that synaptic weight changes following on anatomical sprouting seem most unlikely. An additional experimental observation is that if all stimulation is suspended in the expanded RF state, the expanded state persists to some extent for a period of at least minutes. These temporal properties are fitted by the dynamics and habituation properties of the models set out in this paper. Alternatively, however, the time scale is also suggestive of LTP and LTD like mechanisms. Strictly Hebbian mechanisms seem less likely, since the scotoma stimulation paradigm creates very little postsynaptic firing in the observed neurons. The models we have examined did not address LTP like mechanisms, but instead just used the dynamic lateral spread of activity from the stimulated to the unstimulated scotoma region.
REFERENCES

Bedenbaugh, P. (1993) Plasticity in the Rat Somatosensory Cortex Induced by Local Microstimulation and Theoretical Investigations of Information Flow Through Neurons. Ph.D. dissertation. Department of Bioengineering, University of Pennsylvania.
Das, A. and Gilbert, C.D. (1995) Receptive field expansion in adult visual cortex is linked to dynamic changes in strength of intrinsic cortical connections. (J. Neurophysiol. in press)
Dinse, H.R., Recanzone, G.H., and Merzenich, M.M. (1993) Alterations in correlated activity parallel ICMS induced representational plasticity. Neuroreport 5: 193-196.
Gilbert, C.D. and Wiesel, T.N. (1989) Columnar specificity of intrinsic horizontal and cortico-cortical connections in cat visual cortex. J. Neurosci. 9: 2432-2442.
Gilbert, C.D. and Wiesel, T.N. (1992) Receptive field dynamics in adult primary visual cortex. Nature 356: 150-152.
Jenkins, W.M., Merzenich, M.M., Ochs, M.T., Allard, T. and Guic, E. (1990) Functional reorganization of somatosensory representations within area 3b of adult owl monkey after behaviorally controlled tactile stimulation. J. Neurophysiol. 63: 82-104.
Kapadia, M.K., Gilbert, C.D. and Westheimer, G. (1994) A quantitative measure for short-term cortical plasticity in human vision. J. Neurosci. 14: 451-457.
MacGregor, R.J. (1987) Neural and Brain Modeling. Academic Press, New York.
Maldonado, P.E. (1993) Cortical plasticity and neuronal assemblies in rat auditory cortex. Ph.D. thesis. Department of Physiology, University of Pennsylvania.
Merzenich, M.M., Kaas, J.H., Wall, J.T., Nelson, R.J., Sur, M. and Felleman, D. (1983) Topographic reorganization of somatosensory cortical areas 3b and 1 in adult monkeys following restricted deafferentation. Neuroscience 8: 33-55.
Merzenich, M.M., Nelson, R.J., Stryker, M.S., Cynader, M.S., Schoppmann, A. and Zook, J.M. (1984) Somatosensory cortical map changes following digit amputation in adult monkeys. J. Comp. Neurol. 224: 591-605.
Pettet, M.W. and Gilbert, C.D. (1992) Dynamic changes in receptive field size in cat primary visual cortex. Proc. Natl. Acad. Sci. 89: 8366-8370.
Recanzone, G.H., Merzenich, M.M., Jenkins, W.M., Grajski, K.A., and Dinse, H.R. (1992) Topographic reorganization of the hand representation in cortical area 3b of owl monkeys trained in a frequency-discrimination task. J. Neurophysiol. 67: 1031-1056.
Rockland, K.S., Lund, J.S. and Humphrey, A.L. (1982) Anatomical banding of intrinsic connections in striate cortex of tree shrews (Tupaia glis). J. Comp. Neurol. 209: 41-58.
Thomson, A.M. and Deuchars, J. (1994) Temporal and spatial properties of local circuits in neocortex. Trends Neurosci. 17: 119-125.
Ts'o, T.Y., Gilbert, C.D. and Wiesel, T.N. (1986) Relationships between horizontal interactions and functional architecture in cat striate cortex as revealed by cross-correlation analysis. J. Neurosci. 6: 1160-1170.
Wall, J.T. (1988) Variable organization in cortical maps of the skin as an indication of the lifelong adaptive capacities of circuits in the mammalian brain. TINS 11: 549-557.
Xing, J. and Gerstein, G.L. (1994) Simulation of dynamic receptive field in primary visual cortex. Vision Res. 34: 1901-1911.
Brain Theory - Biological Basis and Computational Principles A. Aertsen and V. Braitenberg (Editors) © 1996 Elsevier Science B.V. All rights reserved.
Anatomical origin and computational role of diversity in the response properties of cortical neurons

Kalanit Grill Spector (a), Shimon Edelman (a) and Rafael Malach (b)

(a) Department of Applied Mathematics and Computer Science, The Weizmann Institute of Science, Rehovot 76100, Israel
(b) Department of Neurobiology, The Weizmann Institute of Science, Rehovot 76100, Israel
The maximization of diversity of neuronal response properties has been recently suggested as an organizing principle for the formation of such prominent features of the functional architecture of the brain as the cortical columns and the associated patchy projection patterns [1]. We report a computational study of two aspects of this hypothesis. First, we show that maximal diversity is attained when the ratio of dendritic and axonal arbor sizes is equal to one, as it has been found in many cortical areas and across species [1,2]. Second, we show that maximization of diversity leads to better performance in two case studies: in systems of receptive fields implementing oriented steerable/shiftable filters, and in matching spatially distributed signals, a problem that arises in visual tasks such as stereopsis, motion processing, and recognition.
1. Introduction

A fundamental feature of cortical architecture is its columnar organization, manifested in the tendency of neurons with similar properties to be organized in columns that run perpendicular to the cortical surface. This organization of the cortex was initially discovered by physiological experiments [3,4], and subsequently confirmed with the demonstration of histologically defined columns. Tracing experiments have shown that axonal projections throughout the cerebral cortex tend to be organized in vertically aligned clusters or patches. In particular, intrinsic horizontal connections linking neighboring cortical sites, which may extend up to 2-3 mm, have a striking tendency to arborize selectively in preferred sites, forming distinct axonal patches 200-300 μm in diameter. Recently, it has been observed [5-7] that the size of these patches matches closely the average diameter of individual dendritic arbors of upper-layer pyramidal cells. Insofar as this correlation between column or patch size and dendritic spread is a fundamental property that holds throughout various cortical areas and across species [2], one is led to assume that it constitutes an important characteristic of the columnar architecture of the cortex. Determining its functional significance may, therefore, shed light on the principles that drive the evolution of the cortical architecture.
One such driving principle may be the maximization of diversity in the neuronal population in the cortex [1]. According to this hypothesis, matching the sizes of the axonal patches and the dendritic arbors causes neighboring neurons to develop slightly different functional selectivity profiles, resulting in an even spread of response preferences across the cortical population, and in an improvement of the brain's ability to process the variety of stimuli likely to be encountered in the environment. The present work concentrates on two aspects of this hypothesis. First, we address the basic question of how the patchy columnar architecture can support the maximization of diversity. In section 2, we propose a quantitative definition of diversity and analyze its dependence on the ratio of axonal and dendritic patch sizes, showing that a maximum is attained when that ratio is equal to 1. Second, we explore the possible computational rationale behind the maximization of diversity. In section 3, we show that diversity in orientation and location of receptive fields (RFs) is beneficial when considered in the framework of oriented steerable/shiftable filter generation. In section 4, we consider the influence of diversity in RF location on the ability of RF-based systems to match spatially distributed signals - a problem that arises in visual tasks such as stereopsis, motion processing, and recognition.

2. An anatomical correlate of neuronal sampling diversity

To test the effect of the ratio between axonal patch and dendritic arbor size on the diversity of the neuronal population, we conducted computer simulations based on anatomical data concerning patchy projections [8,2,5,7].¹ The patches were modeled by disks, placed at regular intervals of twice the patch diameter, as revealed by anatomical labeling. Dendritic arbors were also modeled by disks, whose radii were manipulated in different simulations. 
The arbors were placed randomly over the axonal patches, at a density of 10,000 neurons per patch. We then calculated the amount of patch-related information sampled by each neuron, defined to be proportional to the area of overlap of the dendritic tree and the patch. The results of the calculations for three values of the ratio of patch and arbor diameters appear in Figure 1. The presence of two peaks in the histogram obtained with the arbor/patch ratio r = 0.5 indicates that two dominant groups are formed in the population, the first receiving most of its input from the patch, and the second from the inter-patch sources. A value of r = 2.0, for which the dendritic arbors are larger than the axonal patch size, yields near uniformity of sampling properties, with most of the neurons receiving mostly patch-originated input, as apparent from the single large peak in the histogram. To quantify the notion of diversity, we defined it as:

$$\text{diversity} \propto \frac{1}{\big\langle\, \left| n(p) - \langle n(p) \rangle \right| \,\big\rangle} \qquad (1)$$
where $n(p)$ is the number of neurons that receive $p$ percent of their inputs from the patch, and $\langle \cdot \rangle$ denotes the average over all values of $p$.

¹ Necessary conditions for obtaining dendritic sampling diversity are that dendritic arbors cross freely through column borders, and that dendrites which cross column borders sample with equal probability from patch and inter-patch compartments. These assumptions were shown to be valid in [5,1].

[Figure 1: two panels. Left: "Sampling percentage histogram" (horizontal axis: percent sampled from patch). Right: "Diversity" (horizontal axis: ratio between neuron and patch).]

Figure 1. Left: histograms of the percentage of patch-originated input to the neurons, plotted for three values of the ratio r between the dendritic arbor and the patch diameter (0.5, 1.0, 2.0). The flattest histogram is obtained for r = 1.0. Right: the diversity of neuronal properties (as defined in section 2) vs. r. The maximum is attained for r = 1.0, a value compatible with the anatomical data.

Figure 1, right, shows that diversity is
maximized when the size of the dendritic arbors matches that of the axonal patches, in accordance with the anatomical data. This result confirms the diversity maximization hypothesis stated in [1].

3. Orientation tuning as a functional manifestation of neuronal sampling diversity

In this section, we consider the smooth gradation of orientation tuning across the cortical surface in area V1 in mammals as a possible consequence of highly diverse sampling in the underlying neuronal population. We start with some background regarding the orientation columns in V1.

3.1. Background

The orientation columns in V1 are perhaps the best-known example of functional architecture found in the cortex. On the basis of single unit recordings Hubel and Wiesel [4] reported that cells encountered in penetrations perpendicular to the cortical surface have similar orientation selectivity. In tangential penetrations, orientation preference shifts as the electrode advances. In their early recordings, Hubel and Wiesel reported discrete shifts rather than a continuous change in orientation preference. In subsequent recordings they found that orientation preference can vary more smoothly, and concluded that it varies continuously throughout V1 [9]. Cortical maps obtained by optical imaging [10] reveal that orientation columns are patchy rather than slab-like, i.e., domains corresponding to
a single orientation appear as a mosaic of round patches, which tend to form pinwheel-like structures. Moreover, incremental changes in the orientation of the stimulus were found to lead to smooth shifts in the positions of these domains. We hypothesize that this smooth variation in orientation selectivity found in V1 originates in patchy projections, combined with diversity in the sampling properties of cortical neurons sampling from these projections. The simulations described in the rest of this section substantiate this hypothesis.

3.2. Modeling Orientation Columns: Computer Simulations

The goal of the simulations was to demonstrate that a limited number of discretely tuned elements can give rise to a continuum of responses. Several models for the emergence of orientation selectivity from the largely nonoriented RFs in the LGN have been suggested in the literature [4,11]. Some of these (in particular, Vidyasagar's model [11]) postulate that the basis for orientation selectivity in the cortex is provided by as few as two channels (see also Foster and Ward [12]). In our simulations, we concentrate on the possibility of deriving a continuum of orientation preferences from the small number of discrete orientation channels, rather than on the manner in which these channels are shaped by the properties of the geniculocortical projection. To set the size of the original discrete orientation columns, we invoked the notion of a point image [13,14], defined as the minimal cortical separation of cells with nonoverlapping receptive fields. Thus, we created a network of orientation columns, whose size was determined by the diameter of their constituent receptive fields. Each column was tuned to a specific angle, and located at an approximately constant distance from another column with the same orientation tuning (we allowed some scatter in the location of the receptive fields).
The receptive fields of adjacent units with the same orientation preference were overlapping, and the amount of overlap was determined by the number of receptive fields incorporated into the network. The preferred orientations were equally spaced at angles between 0 and $\pi$. The receptive fields used in the simulations were modeled by a product of a 2D Gaussian $G_1$, with center at $\vec{r}_j$, and an orientation selective filter $G_2$, with optimal angle $\theta_i$:

$$G(\vec{r}, \vec{r}_j, \theta, \theta_i) = G_1(\vec{r}, \vec{r}_j)\, G_2(\theta, \theta_i)$$

This model for a receptive field is equivalent to a directional derivative of a 2D Gaussian (we note that receptive fields of orientation selective cells in V1 resemble directional derivatives of Gaussians up to the 4th order; see [15]). According to the recent results on shiftable steerable filters [16,17], an RF located at $\vec{r}_0$ and tuned to the orientation $\phi_0$ can be obtained by a linear combination of basis receptive fields, as follows:²

$$G(\vec{r}, \vec{r}_0, \theta, \phi_0) = \sum_{j=0}^{M-1} \sum_{i=0}^{N-1} b_j(\vec{r}_0)\, k_i(\phi_0)\, G(\vec{r}, \vec{r}_j, \theta, \theta_i) = \sum_{j=0}^{M-1} b_j(\vec{r}_0)\, G_1(\vec{r}, \vec{r}_j) \sum_{i=0}^{N-1} k_i(\phi_0)\, G_2(\theta, \theta_i) \qquad (2)$$
² A mathematical formulation appears in appendix A.

[Figure 2: two panels, each annotated "noise = 3%, noise in coef = 2.5%, #steering filters = 11" (horizontal axis: number of shifting filters).]

Figure 2. The effects of (independent) noise in the basis receptive fields and in steering/shifting coefficients. Left: the approximation error vs. the number of basis receptive fields used in the linear combination. Right: the signal to noise ratio vs. the number of basis receptive fields. The SNR values were calculated as $10 \log_{10}(\text{signal energy}/\text{noise energy})$. Adding receptive fields to the basis increases the accuracy of the resultant interpolated RF.

From equation 2 it is clear that the linear combination is equivalent to an outer product
of the shifted RF and the steered RF. The numbers $\{k_i(\phi_0)\}_{i=0}^{N-1}$ and $\{b_j(\vec{r}_0)\}_{j=0}^{M-1}$ denote the steering and shifting coefficients, respectively. Because orientation and localization are independent parameters, the steering coefficients can be calculated separately from the shifting coefficients according to equations 17 and 20 in appendix A. The number of steering coefficients depends on the polar Fourier bandwidth of the basis receptive field, while the number of shifting filters is inversely proportional to the basis receptive field size (see appendix A.2).

To make the model more biologically plausible, we introduced noise into the model. Additive Gaussian white noise was added both to the functions representing the basis RFs, and to the shifting and steering coefficients. The simulations show that in the presence of noise the minimal basis has to be extended to achieve a good signal to noise ratio (SNR; see Figure 2). Combining the responses of many neurons decreases the sensitivity of the system to noise. The results of this simulation for several RF sizes are shown in Figure 3, left. As predicted by the mathematical formulation, the number of basis RFs required to approximate a desired RF profile is inversely proportional to the size of the basis RF. The right panel in Figure 3 shows that as the basis receptive fields are made bigger, fewer of them are needed to achieve a given approximation error.

[Figure 3: two panels. Left: horizontal axis: number of RFs. Right: "The dependence of the number of RFs on the variance" (horizontal axis: variance).]

Figure 3. Left: error of the steering/shifting approximation for several basis RF sizes. Right: the number of basis RFs required to achieve a given error for different RF sizes. The dashed line is the hyperbola num. RFs x size = const.

3.3. Steerability and biological considerations

The anatomical finding that the columnar "borders" are freely crossed by dendritic and axonal arbors [5], and the mathematical properties of shiftable/steerable filters outlined above suggest that the columnar architecture in V1 provides a basis for creating a continuum of RF properties, rather than being a form of organizing RFs in discrete bins. Computationally, this may be possible if the input to neurons is a linear combination of outputs of several RFs, as in equation 2. Is this assumption warranted by other anatomical and physiological data regarding cortical interconnection patterns? Horseradish peroxidase (HRP) labeling studies [8] have shown that lateral connections of orientation columns extend to a range of 2-4 mm. In other studies that used 2DG autoradiography and retrograde labeling, connectivity patterns were superimposed on functional maps [18]. The results showed that cells tended to connect to cells of like orientation preference. The relationship between functionally defined columns and patchy connections was studied in [7]. The authors used optical imaging techniques to construct functional maps of orientation columns, then targeted injections of biocytin tracer to selected functional domains. Their results show that long-range connections, extending one mm or more, tend to link cells with like orientation preference. In the short range, up to 400 μm from the injection site, connections were made to cells of diverse orientation preferences. The selectivity of the short-range connections is markedly disrupted, probably because dendritic arbors and axonal connections freely cross orientation column borders. We suggest that the long-range connections, which connect cells of like orientation preference, provide the inputs necessary to shift the position of the desired RF, while the short-range connections, which connect cells of diverse orientation preference, provide the connections needed to steer the RF to an arbitrary angle.³

³ Our simulations also support the findings of [19] on the relationship between cortical magnification and RF size. They reported that foveal RFs, of size 25'-30', show more overlap than peripheral ones, of size about 2°-4°, in accordance with our results, as depicted in Figure 3.
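The steering half of equation 2, and the noise argument of section 3.2, can be illustrated with a minimal sketch. It assumes first-order directional derivatives of a 2D Gaussian as basis RFs (so only the first polar harmonic is present), minimum-norm steering coefficients obtained with a pseudoinverse, and noise added to the basis RFs only. The function names, grid size, and noise level are illustrative choices, not values taken from the simulations reported above.

```python
import numpy as np

def oriented_rf(theta, size=21, sigma=3.0):
    """First directional derivative of a 2D Gaussian along angle theta."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    envelope = np.exp(-(x**2 + y**2) / (2.0 * sigma**2))
    return -(x * np.cos(theta) + y * np.sin(theta)) / sigma**2 * envelope

def steering_coefficients(basis_angles, phi):
    """Minimum-norm k with sum_i k_i * (cos, sin)(theta_i) = (cos, sin)(phi)."""
    constraints = np.vstack([np.cos(basis_angles), np.sin(basis_angles)])
    target = np.array([np.cos(phi), np.sin(phi)])
    return np.linalg.pinv(constraints) @ target

def steered_rf(basis_angles, phi, noise_std=0.0, rng=None):
    """Linear combination of (optionally noisy) basis RFs steered to angle phi."""
    basis = np.stack([oriented_rf(t) for t in basis_angles])
    if noise_std > 0.0:
        # Assumption: independent Gaussian noise on every basis RF sample.
        basis = basis + rng.normal(0.0, noise_std, size=basis.shape)
    k = steering_coefficients(basis_angles, phi)
    return np.tensordot(k, basis, axes=1)

phi = np.deg2rad(37.0)           # arbitrary target orientation
target_rf = oriented_rf(phi)
rng = np.random.default_rng(0)
exact_error, noisy_rms = {}, {}
for n in (2, 4, 16):
    angles = np.arange(n) * np.pi / n   # equally spaced in [0, pi)
    exact_error[n] = np.abs(steered_rf(angles, phi) - target_rf).max()
    noisy = steered_rf(angles, phi, noise_std=0.01, rng=rng)
    noisy_rms[n] = np.sqrt(np.mean((noisy - target_rf) ** 2))
    print(n, exact_error[n], noisy_rms[n])
```

Without noise, two basis orientations already reproduce the target RF up to rounding error. With noisy basis RFs, the minimum-norm coefficients spread the weight over more filters, so the error of the interpolated RF shrinks as the basis grows, which is the qualitative behavior described for Figure 2.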
Are both excitatory and inhibitory connections, present in our model, to be found in the biological data? Gilbert and Wiesel [18] note that the majority of long range horizontal connections are excitatory connections between pyramidal cells. Cross correlation studies [20] support the excitatory nature of these connections. Inhibitory connections come from two sources. First, a certain proportion of post-synaptic cells (as many as 10%) may be inhibitory interneurons [18]. Second, as noted in [21], there is a possibility that orientation-biased cells in cytochrome oxidase (CO) blobs in primates provide inhibitory inputs to the sharply tuned orientation selective cells (it has been shown that CO cells show high GABA-decarboxylase activity that is related to inhibition).

4. Matching with patchy connections

4.1. The problem of matching

Many visual tasks require matching between images taken at different points in space (as in binocular stereopsis) or time (as in motion processing). The first and foremost problem faced by a biological system in solving these tasks is that the images to be compared are not represented as such anywhere in the system: instead of images, there are patterns of activities of RFs, whose profile parameters and location in the visual field are, to a considerable extent, random. It is now a matter of common agreement that while the cortex is not wired as precisely as an electronic device, neither is it a free-for-all jumble of connections completely devoid of order. On the one hand, the intrinsic patchy connections exhibit a certain degree of wiring precision. On the other hand, there is also a significant patch-interpatch mixing [5], and a typical patch has a non-negligible diameter of 200-300 μm. The number of axonal arborizations in a patch is of an order of magnitude of $10^4$, and about the same number of dendrites sample it. On the average each dendritic tree makes one synapse with an overlapping axonal arbor [22,23], but the degree of target specificity in a patch is difficult to estimate. An additional complication is due to the plasticity of cortical connections, as evident both in the classical deprivation experiments [24], and also in the data on behaviorally controlled stimulation [25,26]. LeVay [27] suggests that intrinsic patchy connections could arise during development in response to a rule of the kind "cells that fire together wire together". It is possible that these connections emerge from profuse non-patchy projections, by a selective activity-dependent elimination of synapses. Patchy connections are not, however, necessarily bad news. As we show in the following section, a system composed of scattered RFs with smooth and overlapping tuning functions can perform matching precisely by allowing patchy connections between domains. Moreover, the weights that must be given to the various inputs that feed an RF carrying out the match are identical to the coefficients that would be generated by a learning algorithm required to capture a certain well-defined input-output relationship from pairs of examples.

4.2. Matching patchy signals: a mathematical formulation

We formulate the matching problem according to the scheme sketched in Figure 4, in which the dendrites of unit C are shown sampling two domains, A and B. The dendritic arbor is a patch of diameter equal to that of the projection profile of cells feeding areas A and B. This profile is modeled by a multi-dimensional Gaussian. The task faced by unit C is to determine the degree to which the activity patterns in domains A and B match.
Figure 4. Unit C receives patchy input from areas A and B which contain receptors with overlapping RFs.
Let $\phi_{jp}$ and $\theta_{jp}$ be the responses of the $j$'th unit in domains A and B, respectively, to an input $x_p$:

$$\phi_{jp},\ \theta_{jp} = G(\| x_p - x_j \|) \qquad (3)$$

where $G$ is the Gaussian RF profile and $x_j$ is the optimal pattern to which the $j$'th unit is tuned. If, for example, domains A and B contain orientation selective cells, then $x_j$ would be the optimal combination of orientation and location of a bar stimulus. For simplicity we assume that all the RFs are of the same size $\sigma$, that unit C samples the same number of neurons $N$ from both domains, and that the input from each domain to unit C is a linear combination of the responses of the units in each area. The input to C from domain A, with $x_p$ presented to the system, is then:

$$A_p = \sum_{j=1}^{N} a_j \phi_{jp} \qquad (4)$$

The problem is to find coefficients $\{a_j\}$ and $\{b_j\}$ such that on a given set of inputs $\{x_p\}$ the outputs of domains A and B will match. We define the matching error as follows:

$$E = \sum_{p=1}^{P} \left( \sum_{i=1}^{N} a_i \phi_{ip} - \sum_{i=1}^{N} b_i \theta_{ip} \right)^2 \qquad (5)$$
Theorem 1. The desired matching coefficients can be generated by an algorithm trained to learn an input/output mapping from a set of examples.
As an example of the learning algorithm, one may choose radial basis function (RBF) approximation [28]. This approach is particularly suitable for our purpose, because the basis functions in RBF approximation can be regarded as multidimensional Gaussian RFs.

Proof: To find the coefficients, we differentiate equation 5 with respect to each coefficient. The following linear system is obtained:

$$\frac{\partial E}{\partial a_j} = 0 \;\Rightarrow\; \sum_{p=1}^{P} \phi_{jp} \sum_{k=1}^{N} a_k \phi_{kp} = \sum_{p=1}^{P} \phi_{jp} \sum_{k=1}^{N} b_k \theta_{kp} \qquad \forall j = 1 \ldots N \qquad (6)$$

The inner sums in equation 6 are the outputs of the two domains on the training set (cf. equation 4). We require that they match, for each example in the training set. Therefore, to calculate the coefficients, the following set of equations must be solved for $\{a_j\}$ and $\{b_j\}$:

$$\sum_{k=1}^{N} a_k \phi_{kp} = \sum_{k=1}^{N} b_k \theta_{kp} = t_p \qquad (7)$$

where $t_p$ is the required output for the $p$'th example. Consider now an algorithm that learns an input-output relation by minimizing the total error on a given training set:

$$E = \sum_{p=1}^{P} \left( \sum_{i=1}^{N} a_i \phi_{ip} - t_p \right)^2 \qquad (8)$$
Minimizing the error in equation 8 yields the same system for $a_i$ and $b_i$ as equation 7. □

Further research is needed to generalize this result to the case when the two inputs should be related by a function which is not the identity. For example, in bilateral symmetry detection the patterns in A and B should be, in a sense, mirror images of each other. In general, therefore, to find the synaptic weights for unit C, one must minimize:

$$E = \sum_{p=1}^{P} \left( \sum_{i=1}^{N} a_i \phi_{ip} - f\!\left( \sum_{i=1}^{N} b_i \theta_{ip} \right) \right)^2 \qquad (9)$$

5. Summary

Our results show that maximal diversity of neuronal response properties is attained when the ratio of dendritic and axonal arbor sizes is equal to 1, a value found in many cortical areas and across species [2,1]. It also appears that maximization of diversity leads to better performance in systems of receptive fields implementing steerable/shiftable filters, which may be necessary for generating the seemingly continuous range of orientation selectivity found in V1, and in matching spatially distributed signals. Thus, the maximization of diversity of neuronal response properties considered as a cortical organization principle [1] may have the double advantage of accounting for the formation of the cortical columns and the associated patchy projection patterns, and of explaining how systems of receptive fields can support functions such as the generation of precise response tuning from imprecise distributed inputs, and the matching of distributed signals, a problem that arises in visual tasks such as stereopsis, motion processing, and recognition.
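The argument of Theorem 1 in section 4.2 can be made concrete with a minimal numerical sketch: the weights from each domain are fitted by least squares to the same required outputs $t_p$ (equation 8), after which the matching error of equation 5 vanishes on the training set. The one-dimensional inputs, the jittered RF centers, the target function, and all variable names are assumptions made for illustration only.

```python
import numpy as np

def gaussian_responses(inputs, centers, sigma):
    """Responses[p, j] of Gaussian RFs centered at `centers` to inputs x_p."""
    diff = inputs[:, None] - centers[None, :]
    return np.exp(-diff**2 / (2.0 * sigma**2))

rng = np.random.default_rng(1)
n_units, n_examples, sigma = 30, 10, 0.08

# Assumption: both domains tile [0, 1] with independently scattered RF centers.
grid = np.linspace(0.0, 1.0, n_units)
centers_a = grid + rng.normal(0.0, 0.01, n_units)
centers_b = grid + rng.normal(0.0, 0.01, n_units)

# Training inputs x_p and required outputs t_p (an arbitrary smooth target).
x_train = np.linspace(0.05, 0.95, n_examples)
t_train = np.sin(2.0 * np.pi * x_train)

phi = gaussian_responses(x_train, centers_a, sigma)    # domain A responses
theta = gaussian_responses(x_train, centers_b, sigma)  # domain B responses

# Minimize equation 8 separately for each domain (least squares).
a, *_ = np.linalg.lstsq(phi, t_train, rcond=None)
b, *_ = np.linalg.lstsq(theta, t_train, rcond=None)

# Matching error of equation 5 on the training set.
matching_error = np.sum((phi @ a - theta @ b) ** 2)
print(matching_error)
```

Because each domain has more units than there are training examples, both fits reproduce the targets essentially exactly, so the two weighted sums agree even though the RF centers in A and B differ.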
A. Calculating the Steering and Shifting Coefficients

Suppose that a certain processing task has to be carried out on a signal, using only a limited amount of hardware. An example of such a task is to filter the input signal at several angles. There are two approaches to this problem:

1. Find the response to the signal at many angles. Given a new angle, return the response of a filter at an angle closest to the desired angle.
2. Use a basis set of filters oriented to several angles. To obtain the filtered response at an arbitrary angle, interpolate between the responses of the basis filter set.

It is clear that method (2) is more efficient than method (1). The difficulty arises in selecting the basis filters and in calculating the coefficients of the linear combination. A mathematical formulation for this so-called steering problem is given by Freeman and Adelson [16]; a similar formulation was given by Simoncelli et al. [17], who also introduced the concept of shiftability. In the shiftability scheme a set of basis filters is located at specific intervals. Using a linear combination of these filters, it is possible to obtain an interpolated filter at an arbitrary location in the interval (not only at the grid points defined by the locations of the basis filters).

A.1. Mathematical Formulation

We shall consider the case of joint shiftability and steerability, and allow scatter in the location and the angle tuning of the filters. In this scheme there is a set of filters whose centers are located at points $\vec{r}_j = (x_j, y_j)$ and are tuned to orientations $\theta_i$. A general form of such a filter is:

$$G(\vec{r}, \vec{r}_j, \theta_i) = G_1(\vec{r}, \vec{r}_j)\, G_2(\theta_i) \qquad (10)$$

where $G_1$ is a circularly symmetric filter, with center at $\vec{r}_j$. The shifting constraint requires that $G_1$ have either a finite or a decaying Fourier transform (otherwise the filter cannot be properly sampled). An example of such a filter is the 2-dimensional Gaussian with center $\vec{r}_j$ and variance $\sigma^2$:

$$G_1(\vec{r}, \vec{r}_j) = e^{-|\vec{r} - \vec{r}_j|^2 / 2\sigma^2} \qquad (11)$$

$G_2(\theta_i)$ is a filter oriented at angle $\theta_i$. The steering constraint on $G_2(\theta_i)$ requires that it have a finite Fourier series expansion in polar coordinates:

$$G_2(\theta_i) = \sum_{n=-N'}^{N'} a_n(r)\, e^{jn\theta_i} \qquad (12)$$

The set of basis filters whose centers are located at points $\vec{r}_j$ and are oriented at angles $\theta_i$ is denoted by:

$$\{ G(\vec{r}, \vec{r}_j, \theta_i) \}_{j=0:M-1,\ i=0:N-1} \qquad (13)$$
where $M$ is the number of shifting filters and $N$ the number of steering filters. An example of such a set of basis filters is the derivatives of 2-dimensional Gaussians. Given the basis filters, we would like to filter an input image at an arbitrary location $\vec{r}_0$ in an interval $(-4\sigma, 4\sigma) \times (-4\sigma, 4\sigma)$ and an arbitrary angle $\phi$ ($0 < \phi < \pi$). To do so we need to calculate the interpolated filter $G(\vec{r}_0, \phi)$ and convolve it with our image. The result is a linear combination of the filters in the filter array:

$$\text{(a)}\quad G(\vec{r}_0, \phi) = \sum_{j=0}^{M-1} \sum_{i=0}^{N-1} b_j(\vec{r}_0)\, k_i(\phi)\, G(\vec{r}, \vec{r}_j, \theta_i)$$

$$\text{(b)}\quad G(\vec{r}_0, \phi) = \sum_{j=0}^{M-1} b_j(\vec{r}_0)\, G_1(\vec{r}, \vec{r}_j) \sum_{i=0}^{N-1} k_i(\phi)\, G_2(\theta_i) \qquad (14)$$
The first sum provides the localization and the second sum provides the orientation tuning. The sets $\{b_j(\vec{r}_0)\}_{j=0}^{M-1}$ and $\{k_i(\phi)\}_{i=0}^{N-1}$ denote the shifting and steering coefficients, respectively. Equation 14b is the outer product of the interpolated shifted and steered filters. Because location and orientation of a 2D filter are independent attributes, the steering coefficients can be calculated separately from the shifting coefficients.

Let us consider the problem of calculating the shifting coefficients. The shifting filters have the same impulse response but are centered at different points on the grid. Let $h(\vec{r} - \vec{r}_j)$ denote the impulse response of the $j$'th filter in the array. We assume that the new filter located at $\vec{r}_0$ can be written as a linear combination of the available filters:

$$h(\vec{r} - \vec{r}_0) = \sum_{j=0}^{M-1} b_j(\vec{r}_0)\, h(\vec{r} - \vec{r}_j) \qquad (15)$$

Taking the Fourier transform of both sides we get:

$$H[k]\, e^{j 2\pi k r_0 / 8\sigma} = \sum_{m=0}^{M-1} b_m(r_0)\, H[k]\, e^{j 2\pi k r_m / 8\sigma} \qquad (16)$$

(recall that $8\sigma$ is the size of the interval). This equation imposes a constraint only at frequencies where $H[k]$ is nonzero. If the shifting filter is modeled by a Gaussian, then its Fourier transform has infinite support and equation 16 should hold for all frequencies, i.e., for $k = 0, \ldots, M-1$. The resulting system,

$$\Big[\, e^{j 2\pi k r_m / 8\sigma} \,\Big]_{k,m = 0}^{M-1} \begin{pmatrix} b_0(r_0) \\ \vdots \\ b_{M-1}(r_0) \end{pmatrix} = \begin{pmatrix} 1 \\ e^{j 2\pi r_0 / 8\sigma} \\ \vdots \\ e^{j 2\pi (M-1) r_0 / 8\sigma} \end{pmatrix} \qquad (17)$$

(with $k$ indexing the rows and $m$ the columns of the matrix) is solved by inverting the matrix if it is non-singular, or by using the matrix pseudoinverse, which gives the best solution in the mean square error sense. Note that it is possible to solve the system also when the filters are not equally spaced, because the only restriction imposed on the locations of the filter centers is that they reside in the interval
$r_j \in (-4\sigma, 4\sigma)$. To obtain a stable solution, the filters should be spread uniformly over the interval. In our simulations we introduced a scatter to the locations of an equally spaced filter set: $r_j = 8\sigma j / M \pm r_{\text{tolerance}}$. The number of shifting filters is inversely proportional to the filter's size, and is determined according to the Nyquist criterion. For filters which are derivatives of Gaussians with variance $\sigma^2$ the number of shifting filters should obey the following inequality:

$$M \geq 2\, \frac{f_{\text{Nyquist}}}{[\ldots]} \qquad (18)$$

If we restrict the coefficients $b_j$ to be real, then:

$$M \geq 2\, \frac{[\ldots]}{[\ldots]} + 1 \qquad (19)$$
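The computation of the shifting coefficients can be sketched numerically. The example below is a one-dimensional illustration under several assumptions that are not taken from the text: the filters are treated as periodic over the interval of length $8\sigma$, the centers are offset so that they lie in $(-4\sigma, 4\sigma)$, the coefficients are constrained to be real by stacking the real and imaginary parts of the frequency-domain constraints for $k = 0, \ldots, (M-1)/2$, and the values of $M$, the scatter, and the target location are arbitrary.

```python
import numpy as np

def shifting_coefficients(centers, r0, length, k_max):
    """Real b with sum_m b_m exp(2j*pi*k*r_m/L) = exp(2j*pi*k*r0/L), k = 0..k_max."""
    k = np.arange(k_max + 1)[:, None]
    system = np.exp(2j * np.pi * k * centers[None, :] / length)
    rhs = np.exp(2j * np.pi * k[:, 0] * r0 / length)
    # Stack real and imaginary parts so that the solution is real-valued.
    stacked = np.vstack([system.real, system.imag])
    stacked_rhs = np.concatenate([rhs.real, rhs.imag])
    return np.linalg.pinv(stacked) @ stacked_rhs

def periodic_gaussian(r, center, sigma, length, images=3):
    """Gaussian of width sigma, periodically extended with period `length`."""
    shifts = np.arange(-images, images + 1)[:, None] * length
    return np.exp(-((r[None, :] - center + shifts) ** 2) / (2.0 * sigma**2)).sum(axis=0)

sigma = 1.0
length = 8.0 * sigma                 # size of the interval
n_filters = 11                       # M, an illustrative choice
k_max = (n_filters - 1) // 2
rng = np.random.default_rng(2)

# Equally spaced centers in (-4*sigma, 4*sigma) with a small scatter.
centers = -4.0 * sigma + length * np.arange(n_filters) / n_filters
centers = centers + rng.normal(0.0, 0.02 * sigma, n_filters)

r0 = 0.37 * sigma                    # arbitrary target location off the grid
b = shifting_coefficients(centers, r0, length, k_max)

r = np.linspace(-4.0 * sigma, 4.0 * sigma, 401)
basis = np.stack([periodic_gaussian(r, c, sigma, length) for c in centers])
interpolated = b @ basis
max_error = np.abs(interpolated - periodic_gaussian(r, r0, sigma, length)).max()
print(max_error)
```

The residual error comes from the frequencies above `k_max`, where the Gaussian spectrum is already very small; using fewer filters or a narrower Gaussian would increase it, in line with the statement that the number of shifting filters is inversely proportional to the filter size.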
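The steering coefficients are calculated similarly, but instead of taking a regular Fourier transform as in equation 16, we use a polar Fourier transform. The basis filters are oriented at angles $\theta_i = i\pi/N \pm \theta_{\text{tolerance}}$.

$$\begin{pmatrix} 1 & 1 & \cdots & 1 \\ e^{j\theta_0} & e^{j\theta_1} & \cdots & e^{j\theta_{N-1}} \\ \vdots & \vdots & & \vdots \\ e^{jN'\theta_0} & e^{jN'\theta_1} & \cdots & e^{jN'\theta_{N-1}} \end{pmatrix} \begin{pmatrix} k_0(\phi) \\ k_1(\phi) \\ \vdots \\ k_{N-1}(\phi) \end{pmatrix} = \begin{pmatrix} 1 \\ e^{j\phi} \\ \vdots \\ e^{jN'\phi} \end{pmatrix} \qquad (20)$$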
The number of steering coefficients is determined according to theorem 2 in [16]. Let $N'$ be the highest frequency in the polar Fourier expansion of $G_2(\theta)$ (see equation 12). The number of steering filters, $N$, is given by the following inequality:

$$N \geq 2N' + 1 \qquad (21)$$

For functions that can be expanded into a Fourier series that includes only even or odd terms, the minimal number of filters can be reduced to $N \geq N' + 1$.

A.2. General considerations

Following are some general observations regarding computation with steerable/shiftable receptive fields, as applied in the present work.

1. In a noiseless system it is sufficient to use the minimal number of steering and shifting filters as given in equations 19 and 21. To make the system stable with respect to noise in the filters and in the interpolating coefficients we should increase the number of basis filters. Increasing the number of filters increases the stability of the system and decreases its sensitivity to noise, i.e. the interpolated filter will be smooth even when the inputs are noisy. However, increasing the number of filters does not make the system insensitive to large variations in the coefficient values. Section 3 provides an analysis of the relationship between the number of filters in the system and the robustness of the system to noise.
2. Although the centers of the basis filters are located at specific intervals, the responses of different filters have regions of overlap. Portions of the signal will be filtered by several basis filters. This is reminiscent of channel coding, a signal representation scheme whose computational advantages are discussed, e.g., in [29].
3. In the filtering scheme we described, the basis filter set can be used to filter an image in a region of $(-4\sigma, 4\sigma) \times (-4\sigma, 4\sigma)$ and angle $\phi$ in the interval ($0 < \phi < \pi$). To extend the filtering to a larger interval one must duplicate the basis filter set for each new interval of size $(-4\sigma, 4\sigma) \times (-4\sigma, 4\sigma)$. The center filters of each interval will be non-overlapping, although filters at the boundaries of these regions will overlap.

REFERENCES

1. R. Malach. Cortical columns as devices for maximizing neuronal diversity. Trends in Neurosciences, 17:101-104, 1994.
2. J. S. Lund, S. Yoshita, and J. B. Levitt. Comparison of intrinsic connections in different areas of macaque cerebral cortex. Cerebral Cortex, 3:148-162, 1993.
3. V. B. Mountcastle. Modality and topographic properties of single neurons of cat's somatic sensory cortex. Journal of Neurophysiology, 20:408-434, 1957.
4. D. Hubel and T. Wiesel. Receptive fields, binocular interactions and functional architecture in the cat's visual cortex. Journal of Physiology, 160:106-154, 1962.
5. R. Malach. Dendritic sampling across processing streams in monkey striate cortex. Journal of Comparative Neurobiology, 315:305-312, 1992.
6. Y. Amir, M. Harel, and R. Malach. Cortical hierarchy reflected in the organization of intrinsic connections in macaque monkey visual cortex. J. Comp. Neurobiol., 334:19-46, 1993.
7. R. Malach, Y. Amir, M. Harel, and A. Grinvald. Relationship between intrinsic connections and functional architecture, revealed by optical imaging and in vivo targeted biocytin injections in primate striate cortex. Proceedings of the National Academy of Science, USA, 90:10469-10473, 1993.
8. K. S. Rockland and J. S. Lund. Widespread periodic intrinsic connections in the tree shrew visual cortex. Science, 215:1532-1534, 1982.
9. D. Hubel and T. Wiesel. Functional architecture of macaque monkey visual cortex. Proceedings of the Royal Society of London B, 198:1-59, 1977.
10. A. Grinvald, T. Lieke, R. D. Frostig, C. Gilbert, and T. Wiesel. Functional architecture of the cortex as revealed by optical imaging of intrinsic signals. Nature, 324:361-364, 1986.
11. T. R. Vidyasagar. Geniculate orientation biases as cartesian coordinates for cortical orientation detectors. In D. Rose and V. G. Dobson, editors, Models of the visual cortex, pages 390-395. Wiley, New York, 1985.
12. D. H. Foster and P. A. Ward. Asymmetries in oriented-line detection indicate two orthogonal filters in early vision. Proceedings of the Royal Society of London B, 243:75-81, 1991.
13. J. T. McIlwain. Large receptive fields and spatial transformations in the visual system. International Review of Physiology, 10:223-248, 1976.
14. J. T. McIlwain. Point images in the visual system: new interest in an old idea. Trends in Neurosciences, 9:354-358, 1986.
15. T. Young. The Gaussian derivative model for spatial vision: retinal mechanisms. Spatial Vision, 2:273-293, 1987.
16. W. T. Freeman and E. H. Adelson. The design and use of steerable filters. IEEE Transactions on Pattern Analysis and Machine Intelligence, 13:891-906, 1991.
17. E. Simoncelli, W. T. Freeman, E. H. Adelson, and D. Heeger. Shiftable multiscale transformations. IEEE Transactions on Information Theory, 38:587-607, 1992.
18. C. Gilbert and T. Wiesel. Columnar specificity of intrinsic horizontal and corticocortical connections in cat visual cortex. Journal of Neuroscience, 9:2432-2442, 1989.
19. B. M. Dow, A. Z. Cynader, R. G. Vautin, and R. Bauer. Magnification factor and receptive field size in foveal striate cortex of the monkey. Experimental Brain Research, 44:213-228, 1981.
20. D. Y. Ts'o, C. D. Gilbert, and T. Wiesel. Relationships between horizontal connections and functional architecture as revealed by cross correlation analysis. Journal of Neuroscience, 6:1160-70, 1986.
21. T. R. Vidyasagar. A model of striate response properties based on geniculate anisotropies. Biological Cybernetics, 57:11-23, 1987.
22. V. Braitenberg and A. Schuz. Anatomy of the cortex, statistics and geometry. Springer, Berlin, Heidelberg, New York, 1991.
23. A. Schuz. Randomness and constraints in the cortical neuropil. In A. Aertsen and V. Braitenberg, editors, Information processing in the cortex, pages 3-19. Springer-Verlag, Berlin Heidelberg New York, 1992.
24. D. Hubel. Eye, Brain, and Vision. Scientific American Library, 1988.
25. M. M. Merzenich, W. M. Jenkins, M. T. Ochs, M. Allard, and E. Guic-Robles. Functional reorganization of primary somatosensory cortex in adult owl monkeys after behaviorally controlled tactile stimulation. Journal of Neurophysiology, 63:82-104, 1990.
26. G. H. Recanzone, M. M. Merzenich, W. M. Jenkins, K. A. Grajski, and H. R. Dinse. Topographic reorganization of the hand representation in cortical area 3b of owl monkeys trained in a frequency discrimination task. Journal of Neurophysiology, 67:1031-1056, 1992.
27. S. LeVay. The patchy intrinsic projections of the visual cortex. Progress in Brain Research, 75:247-261, 1989.
28. T. Poggio and F. Girosi. Regularization algorithms for learning that are equivalent to multilayer networks. Science, 247:978-982, 1990.
29. H. P. Snippe and J. J. Koenderink. Discrimination thresholds for channel-coded systems. Biological Cybernetics, 66:543-551, 1992.
Brain Theory - Biological Basis and Computational Principles A. Aertsen and V. Braitenberg (Editors) © 1996 Elsevier Science B.V. All rights reserved.
Cell assemblies versus single cells

Horace Barlow
Physiological Laboratory, University of Cambridge, Cambridge CB2 3EG, England

In the report of the 1990 Brain Theory meeting I gave several reasons for the superiority of single cell descriptions of sensory function over cell assembly descriptions (Barlow 1992), and I have recently given a more complete argument (Barlow 1994) for believing that perception is best explained at the single neuron level. It is not of course claimed that descriptions at this level are appropriate for other problems in neuroscience, such as the actions of drugs, or explaining social behaviour; these require one to consider membrane receptor molecules in the one case, or how individual humans interact in the other, and the behaviour of single neurons would not be illuminating for either. I shall not repeat all this, but want to make three points. The first is to rebut an argument one frequently hears against single neuron explanations because it fails to understand what is claimed for single neurons and ignores the behavioural facts that their properties can account for. The second is to attack some of the arguments for cell assemblies that seem to me unfounded. Finally I shall list some evidence for assemblies that I am beginning to find convincing.
1 MISLEADING TESTS OF THE SINGLE NEURON HYPOTHESIS
An example of the misleading argument says that cells in V1 are only crudely tuned to orientation, so that the response of a single such cell would only change when the orientation of a stimulus line was changed by several degrees. It is then claimed that orientation changes well under one degree can be discriminated psychophysically; therefore, it is said, psychophysical orientation discrimination must be based on more than a single cell. This is wrong because the wrong psychophysical results have been selected for the comparison. As you increase the length of a line, orientation discrimination improves, and the high discrimination ability quoted above is only obtained for lines many times as long as the extent of the receptive field of the V1 cells (see Andrews 1967). One cannot expect a cortical neuron to be influenced by the parts of the line outside its receptive field, so the appropriate comparison to make for a cortical neuron with a receptive field, say, 9 minutes long is the psychophysical orientation discrimination for 9 minute long lines, and this is only a small fraction of the figure for long
lines. To make a fair comparison between psychophysical performance and that of single neurons one must use a task for which the neuron is adapted, and compare it with psychophysical performance at the identical task. It has been clear for a long time that known classes of neurons (i.e. ones whose activities are readily isolated) do not explain all aspects of psychophysical performance, nor would one expect them to. For instance single retinal ganglion cells in the cat appear to be sensitive enough to account for the ability of the intact animal to detect small, brief, flashes of light containing only a few quanta (Barlow, Levick and Yoon 1971), but no single retinal ganglion cell could account for the threshold for large, long duration, stimuli. It is quite obvious that no retinal ganglion cell can integrate information over very large extents of the visual field, but such integration is necessary to detect the small flux of quanta that the intact animal can detect. Presumably there are cells at more central locations in the visual pathway that integrate information from many retinal ganglion cells, but we do not know where they are so we cannot confirm or refute this. Likewise there are presumably cells that combine information over much more extended regions than the typical V1 neuron, and which are thereby able to discriminate orientations of a fraction of a degree, but again we do not know where they are so we cannot confirm or refute this speculation either. But we do know that the intact animal can make such discriminations, and the only mechanism we know of that could improve the signal-to-noise ratio sufficiently is integration by more centrally located neurons over larger areas, and perhaps longer times.
This is the reason for suggesting that "Whenever two stimuli can be distinguished reliably, then some analysis of the physiological messages they cause in some single neuron would enable them to be distinguished with equal or greater reliability" (Barlow 1994). Comparisons between behavioural and single unit performance have been made for tactile discriminations in monkey cortex (Talbot et al 1968), wavelength discrimination in monkey LGN (Devalois, Abramov and Mead 1967), sensitivity to light in retinal ganglion cells of the cat (Barlow, Levick and Yoon 1971), responses of single tactile fibres from the hand in humans (Vallbo 1989), spatial resolution (Parker and Hawken 1985) and contrast discrimination (Barlow, Kaushal, Hawken and Parker 1987) of cortical neurons in monkey, ability to distinguish coherent motion of crossed gratings in monkey V5 (MT) (Movshon, Adelson, Gizzi & Newsome 1985), the ability to detect coherent motion of random dots by neurons of monkey MT (Newsome, Britten & Movshon 1989), and many others that I do not know about or cannot now recall. There are still problems in explaining psychophysical performance:
1) Often we do not know where to find the cells that seem to be required for particular tasks.
2) We do not know what is the best measure of the response of a single neuron to use: peak impulses per second, the number in some time interval, or perhaps a measure involving synchrony or rhythmicity.
3) The dynamic range of single neurons often seems inadequate to account for psychophysical performance.
4) The general principle that emerges is that psychophysical sensitivity for a particular task parallels that of the most sensitive of the neurons, but how does the brain succeed in ignoring the noise contributed by the other neurons that are less sensitive?
In spite of these difficulties single neurons are way ahead of cell assemblies when it comes to accounting for psychophysical behaviour, and I hope no-one will continue to be misled by foolish comparisons between psychophysical performance at one task and that of single neurons at another.
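The integration argument made earlier in this section can be illustrated numerically. The following is a minimal sketch, not a model of any real neuron: it assumes that a central cell simply averages n_inputs independent inputs, each carrying the same signal plus Gaussian noise of equal standard deviation. Under those assumptions the signal-to-noise ratio grows as the square root of the number of inputs pooled. All function and parameter names are hypothetical.

```python
import math
import random
import statistics


def pooled_snr(signal, noise_sd, n_inputs):
    """Analytic SNR of the mean of n_inputs independent, equally noisy inputs.

    Assumption: noise is independent across inputs, so the standard deviation
    of the mean is noise_sd / sqrt(n_inputs).
    """
    return signal * math.sqrt(n_inputs) / noise_sd


def simulated_snr(signal, noise_sd, n_inputs, trials=5000, seed=0):
    """Monte Carlo estimate of the same quantity, using Gaussian noise."""
    rng = random.Random(seed)
    pooled = [
        statistics.fmean([rng.gauss(signal, noise_sd) for _ in range(n_inputs)])
        for _ in range(trials)
    ]
    # SNR of the pooled response: mean response divided by its trial-to-trial spread.
    return statistics.fmean(pooled) / statistics.stdev(pooled)


if __name__ == "__main__":
    for n in (1, 4, 16, 64):
        print(n, round(pooled_snr(1.0, 2.0, n), 2), round(simulated_snr(1.0, 2.0, n), 2))
```

If the noise in the inputs were correlated, the gain would be smaller than this sketch suggests; the square-root law holds only under the independence assumption stated above.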
2 IMAGINARY VIRTUES OF CELL ASSEMBLIES
Hebb (1949) attached great importance to cell assemblies in his attempt to develop a conceptual system "... which relates the individual nerve cell to psychological phenomena", and it is an idea that has had enormous appeal to others. I understand this appeal myself, for neurons come in such vast numbers, possess such vast variety, and behave with such vast irregularity, that at first they do not seem to offer a promising basis for explaining behaviour. But on the other hand some of the virtues of cell assemblies are imaginary.
2.1 Time scale
Hebb introduced reverberating cell assemblies as the means of extending the time scale of nerve excitation to match that of the psychological mechanisms involved in learning and perception, but we now know that the supposed short time scale of nerve excitation processes resulted from ignorance; in Hebb's time it was not even universally accepted that synaptic transmission was chemical, or that it could be inhibitory as well as excitatory. We now realize that there are many intracellular processes that occur on quite a slow enough time scale to match the psychological processes that Hebb was concerned with, for instance changes of intracellular Ca²⁺ levels, the actions of neuromodulators and second messengers, the restitution of ionic levels by pumps, and probably the slow diffusion and removal of external transmitters such as NO. The fact that cell assemblies are unnecessary for the main purpose for which Hebb introduced them does not of course prove that they do not exist, but it should make one cautious about them.
2.2 Reducing errors
Another imaginary virtue of cell assemblies is that they are less error-prone than single cells. Whether this is true or not depends upon the conditions that have to be met for the assembly to work correctly. If, for instance, each cell in the assembly has to function correctly for the whole assembly to function correctly, then it is easy to see that the assembly malfunctions when any of its component cells malfunctions, so the overall error rate is greater than that of the individual cells. Single cell representations can in fact be robust: if, for instance, you wish to represent a complex event or concept in a way that would resist the death or malfunction of individual neurons, then quite a good way of doing so would be to represent that event or concept by a single cell, and then make a few spare copies of that cell; the event or concept would then survive until all copies were lost.
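The two cases contrasted in section 2.2 can be put into numbers. The sketch below is only an illustration under a simplifying assumption that is not in the text: each cell fails independently with the same probability p_fail. The function names are hypothetical.

```python
def all_must_work_failure(p_fail, n_cells):
    """Failure probability of an assembly that needs every one of its cells.

    Assumption: cells fail independently, each with probability p_fail.
    The assembly works only if all n_cells work.
    """
    return 1.0 - (1.0 - p_fail) ** n_cells


def spare_copies_failure(p_fail, n_copies):
    """Failure probability of a single-cell representation with spare copies.

    The representation is lost only when every copy has failed.
    """
    return p_fail ** n_copies


if __name__ == "__main__":
    p = 0.01
    print("one cell:", p)
    print("assembly of 10, all required:", all_must_work_failure(p, 10))
    print("single cell with 2 spares:", spare_copies_failure(p, 3))
```

With a per-cell failure probability of 1%, an assembly that needs all ten of its cells fails roughly ten times as often as one cell, while three redundant copies of a single cell fail together far less often than one cell alone. An assembly with a more tolerant read-out rule would of course behave differently; the point is only that the comparison depends on the rule.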
2.3 Minimizing effects of cell death etc
Actually the argument from cell death is almost always bogus; it depends upon quoting the daily number of neurons that die, without pointing out that this is a minute fraction of the total number of cells. Similarly arguments from the resistance of the brain to mutilation are also weak because a) careful tests often reveal effects of mutilation that superficial examination had missed, b) cell assemblies and distributed representations (like computers) are not inherently resistant to damage, and c) a limited amount of reduplication of neurons will minimize the effects of injury.
2.4 Exploiting combinations
The fact that the number of combinations of cells is much greater than the number of cells is another imaginary virtue of cell assemblies. The idea is that if important external objects or events (a cow or a catastrophe) are represented by the joint activity of n of the N neurons, rather than by a single one, then it is possible to represent a much larger number of such events. But no-one has ever suggested that we use a mutually exclusive representation, in which only a single cell is active, and in any plausible distributed representation each of the individual cells corresponds to some definable set of objects or events in the field of things represented. Ideas differ about the sparseness of the representation (what proportion of the cells are typically active), and about the nature of the subsets of objects or events that cause a single cell to be active. At one extreme people believe that a high proportion of cells are active and that this subset is arbitrary and meaningless in the way that the bits of the ASCII code define almost arbitrary subsets of keyboard characters; at the other it is thought that the subsets are usually themselves meaningful and are selected to make it possible to represent naturally occurring events sparsely: the idea of cardinal cells (Barlow 1972, 1994).
One of the merits of representing meaningful and useful objects and events by single neurons is that it improves the efficiency of learning.
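The counting argument discussed in section 2.4 is easy to make concrete. The sketch below simply compares the number of events that N cells can label when exactly one cell is active with the number available when exactly n of the N cells are active together. The values N = 100 and n = 5 are arbitrary illustrations, and the function names are hypothetical.

```python
import math


def exclusive_capacity(n_cells):
    """Distinct events under a mutually exclusive code: one active cell per event."""
    return n_cells


def combinatorial_capacity(n_cells, n_active):
    """Distinct events when each event is a set of exactly n_active active cells."""
    return math.comb(n_cells, n_active)


def sparseness(n_cells, n_active):
    """Proportion of cells active for a typical event."""
    return n_active / n_cells


if __name__ == "__main__":
    N, n = 100, 5
    print(exclusive_capacity(N))         # 100
    print(combinatorial_capacity(N, n))  # 75287520
    print(sparseness(N, n))              # 0.05
```

The count says nothing about whether the subsets of events signalled by each cell are meaningful, which is the issue the text goes on to raise.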
2.5 Learning in distributed representations
One of the important things one has to do with a representation is to learn associations with particular classes of the things represented: one learns that certain sounds signify dinner, or that certain sights signify danger. Now learning such associations is essentially a matter of determining that the entries in a 2x2 contingency table are non-random, and any mechanism that can do this well requires access to all four entries. This is the main reason for believing that learning must occur at Hebbian synapses, for one cannot easily point to any other position in the brain which has available the information required to determine whether a pre- and post-synaptic neuron fire in an associated way. Now consider the task of learning an association with a class of objects or events represented by a combination of active cells in a cell assembly. To find out whether the class is present or not one must check that all the relevant cells in the assembly fire, except in the trivial case when all the cells fire only for that class (i.e. for non-overlapping assemblies, which are essentially reduplicated single cell representations). Now there is no single location where all the necessary information is available; one might form associations with each cell of the
assembly separately, and in some cases this might work reasonably well, but there are obvious problems in doing this. Imagine trying to form an association with a keyboard character by measuring the associations with each of the bits that represent it; random errors would arise from the fact that each bit is active for other letters than the one of interest, and there is also the possibility of consistent errors. Of course you could resolve the character from the bits, but this would mean creating a single neuron representation for the character, which is just what cell assemblies were supposed to avoid. If you want to form an association with a class of events you must surely find a location where the occurrence of that class is signalled; nerve cells have the characteristics required to do this while I know of nothing else in the brain that has.
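The keyboard-character example of section 2.5 can be sketched as a small simulation. The sketch makes several assumptions that are not in the text: characters are drawn uniformly from the eight letters A to H, the event to be associated (say, dinner) follows the target letter on every occurrence and at no other time, and the strength of association in each 2x2 contingency table is summarized by the phi coefficient. All names are hypothetical.

```python
import math
import random


def contingency(xs, ys):
    """2x2 table for two binary sequences: (both, x only, y only, neither)."""
    both = x_only = y_only = neither = 0
    for x, y in zip(xs, ys):
        if x and y:
            both += 1
        elif x:
            x_only += 1
        elif y:
            y_only += 1
        else:
            neither += 1
    return both, x_only, y_only, neither


def phi(table):
    """Phi coefficient of a 2x2 table; returns 0.0 when a margin is empty."""
    a, b, c, d = table
    denom = math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
    return (a * d - b * c) / denom if denom else 0.0


def demo(target="C", alphabet="ABCDEFGH", n_trials=2000, seed=1):
    rng = random.Random(seed)
    chars = [rng.choice(alphabet) for _ in range(n_trials)]
    outcome = [ch == target for ch in chars]  # assumed: the event follows the target only
    unit = [ch == target for ch in chars]     # a single cell dedicated to the target
    unit_phi = phi(contingency(unit, outcome))
    # The three low-order ASCII bits are enough to tell A..H apart.
    bit_phis = [
        phi(contingency([bool((ord(ch) >> k) & 1) for ch in chars], outcome))
        for k in range(3)
    ]
    return unit_phi, bit_phis


if __name__ == "__main__":
    unit_phi, bit_phis = demo()
    print("dedicated unit:", round(unit_phi, 2))
    print("individual bits:", [round(p, 2) for p in bit_phis])
```

In this toy setting the dedicated unit shows a perfect association, whereas each bit shows only a diluted one, because every bit is also set or cleared for letters other than the target. Combining the bits would recover the association, but the place where they are combined is then acting as a single unit for the character.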
3 BUT ALL THE SAME...
Cell assemblies do have some real virtues, and quite recently anatomical and physiological facts have emerged that I agree suggest that they are important in the cerebral cortex.
3.1 Intermediate level descriptions
The first point that carries weight with me is not actually new. It is obviously desirable to find a level of description intermediate between the whole brain and the single neuron, and I am impressed by the success of invertebrate neurophysiologists in accounting for the way small assemblies of cells generate rhythms and complex sequences of movements (see for instance Getting 1989). There must be similar cooperative interactions involved in generating directional selectivity and other forms of pattern selectivity in sensory pathways. Understanding how a group of half-a-dozen, or perhaps several hundred, cells cooperate to perform some task would obviously constitute an enormous step on the path of explaining how the 10^10 cells of the cerebral cortex control behaviour.
3.2 Connectivity of cortical neurons
Second, the facts elucidated by Braitenberg and Schüz (1991), Douglas and Martin (1991), and others about the connectivity of cortical neurons seem to point to assembly-like behaviour. What makes me hesitate here is that these anatomical and physiological facts do not provide an intuitively satisfactory account of the known physiological properties of cortical neurons. How is the receptive field of a cortical neuron formed? If the firing of a neuron depends upon synchronous input from a group of other neurons, should not these other neurons all have identical receptive fields? Why then are the receptive fields of cortical neurons always found to be different from each other? I shall be happier when the anatomy and physiology of cortical neurons fit together better.
3.3 Well-timed impulses
Third, well-timed impulses (Legéndy & Salcman 1985; Abeles et al 1993; Abeles, this volume) seem to me difficult, but perhaps not impossible, to account
for except by something like synfire chains. Nerve impulses can recur at astonishingly regular intervals in some preparations; it is just that this behaviour does not seem typical of single cortical neurons. And although single-cell events might recur with a timing accuracy of a few percent up to intervals of several hundred milliseconds, I don't think one would expect the absolute accuracy to be independent of interval. In one sense the discovery of well-timed impulses fits my arguments for the importance of single cells very well, for if the well-timed behaviour is widespread, neurons are not as noisy and irregularly behaved as they seem to be when one only attends to one cell at a time. The existence of well-timed impulses is certainly an important new fact about the cortex. For these three reasons I'm much more impressed by cell assemblies now than I was even a few years ago, and I'll be watching them with interest.
REFERENCES
Abeles, M., Prut, Y., Bergman, H., Vaadia, E. & Aertsen, A. (1993). Integration, synchronicity and periodicity. In A. Aertsen & W. v. Seelen (Eds.), Spatiotemporal aspects of brain function. Elsevier Science Publications.
Andrews, D.P. (1967). Perception of contour orientation in the central fovea. Part II: spatial integration. Vision Research, 7, 999-1013.
Barlow, H.B. (1972). Single units and sensation: a neuron doctrine for perceptual psychology? Perception, 1, 371-394.
Barlow, H.B. (1992). Single cells versus neuronal assemblies. In A. Aertsen & V. Braitenberg (Eds.), Information Processing in the Cortex (pp. 169-173). Berlin: Springer-Verlag.
Barlow, H.B. (1994). The neuron doctrine in perception. In M. Gazzaniga (Ed.), The Cognitive Neurosciences. Cambridge, Mass: MIT Press.
Barlow, H.B., Levick, W.R. & Yoon, M. (1971). Responses to single quanta of light in retinal ganglion cells of the cat. Vision Research, 11 (Supplement No. 3), 87-101.
Barlow, H.B., Kaushal, T.P., Hawken, M. & Parker, A.J. (1987). Human contrast discrimination and the contrast discrimination of cortical neurons. Journal of the Optical Society of America A, 4, 2366-2371.
Braitenberg, V. & Schüz, A. (1991). Anatomy of the Cortex: Statistics and Geometry. Berlin: Springer-Verlag.
Devalois, R.L., Abramov, I. & Mead, W.R. (1967). Single cell analysis of wavelength discrimination at the lateral geniculate nucleus in the macaque. Journal of Neurophysiology, 30, 415-433.
Douglas, R.J. & Martin, K.A.C. (1991). A functional microcircuit for cat visual cortex. Journal of Physiology, London, 440, 735-769.
Getting, P.A. (1989). Emerging principles governing the operation of neural networks. Annual Review of Neuroscience, 12, 185-204.
Hebb, D.O. (1949). The Organization of Behaviour. New York: Wiley.
Legéndy, C.R. & Salcman, M. (1985). Bursts and recurrences of bursts in the spike trains of spontaneously active striate cortex neurons. Journal of Neurophysiology, 53, 926-939.
Movshon, J.A., Adelson, E.H., Gizzi, M.S. & Newsome, W.T. (1985). The analysis of moving visual patterns. In C. Chagas, R. Gattass & C. Gross (Eds.), Pattern Recognition Mechanisms (pp. 117-151). Rome: Vatican Press. (Reprinted in Experimental Brain Research, Supplementum 11, 117-151, 1986.)
Newsome, W.T., Britten, K.H. & Movshon, J.A. (1989). Neuronal correlates of a perceptual decision. Nature, 341, 52-54.
Parker, A. & Hawken, M. (1985). The capabilities of monkey cortical cells in spatial resolution tasks. Journal of the Optical Society of America A, 2, 1101-1114.
Talbot, W.P., Darian-Smith, I., Kornhuber, H.H. & Mountcastle, V.B. (1968). The sense of flutter-vibration: comparison of human capacity with response patterns of mechano-receptive afferents from the monkey hand. Journal of Neurophysiology, 31, 301-334.
Vallbo, A.B. (1989). Single fibre microneurography and sensation. In C. Kennard & M. Swash (Eds.), Hierarchies in Neurology: a reappraisal of a Jacksonian concept (pp. 93-109). London: Springer.
Composition
Elie Bienenstock
Division of Applied Mathematics, Brown University, Box F, Providence, RI 02912, USA; on leave from CNRS, Paris, France *
This chapter is a composition on the theme of binding and composition. The binding problem—the investigation of the mechanisms used by the brain to bind with each other the representations of different parts or features of an object—has motivated a number of studies in recent years. In particular, attention has been given to the possible role of accurate temporal structure of multi-neuron spike trains in cortex [1-3]. The present paper uses linguistic examples to suggest that the mental material that is operated upon by the establishment of dynamical bonds largely consists, in itself, of dynamical bonds. Further, using the construction game Lego as a metaphor for compositional symbol systems, we stress the role of cooperativity among elementary bonds. We use the term map to refer to a coherent system of bonds, i.e., a system where elementary bonds cooperate with each other. We propose to view the fundamental operation of mental composition as consisting of the doing and undoing of maps between maps ... between maps. Cooperative binding in Lego can be characterized mathematically as the establishment of commutative mapping diagrams, and we outline an approach to language in which the construction of meaning consists of the establishment of recursively embedded commutative mapping diagrams. Drawing from these considerations, we suggest that the mechanism the brain uses to achieve recursive composition may consist of the cooperative binding of complex spatio-temporal patterns. Particularly interesting in this context is the synfire-chain model proposed by Abeles [4] to explain the accurately timed events in cortex reported by him and his group. This model offers a simple and plausible example of a system of recursively bound spatio-temporal neural activity patterns.
Moreover, in this model, the interpretation of cooperative binding as the establishment of commutative mapping diagrams is straightforward, and consistent with the generally accepted principle of Hebbian plasticity.
*This research was supported by Office of Naval Research contract N00014-91-J-1021, National Science Foundation contract DMS-9217655, and ARL contract MDA972-93-1-0012. My thanks go to Stuart Geman and to Lee Teverow for their careful critical reading of the manuscript.
1. ROLLERBLADES AND ROSES
This chapter is a discussion of the neurobiological substrate of higher cognitive functions, in particular language. The discussion will be focused on the microstructure of the patterns of neural activity that may be involved in such functions, rather than on
their gross-anatomical layout in the brain. It is, in essence, a discussion of the neural representation—sometimes called code—used by the brain. We shall make use of various metaphors during the course of the paper, and propose a formal framework for a theory of neural compositionality. The discussion will be fed mostly by abstract considerations. Our approach may thus be viewed as an attempt to derive specific suggestions about neural mechanisms from 'first principles.' Sections 1 to 5 offer some reflections on mind and language, and sections 6 to 8 attempt, with the help of the Lego metaphor, to organize these reflections in a mathematical framework. The last two sections contain suggestions as to neural representations proper. These suggestions rely on the observations made by Moshe Abeles and his group about well-timed events in cortex, and borrow from the synfire model proposed by Abeles to account for his findings [4-6].
Rather than attempting to define in general terms the notion of a neural representation, we introduce it with a simple example. Consider the following sample of utterances, which could occur in a conversation about the sport known as in-line skating, or, in practitioners' parlance, rollerblading²:
(1) How long have you had those blades?
(2) With this new braking technology, rollerblading is a piece of cake.
(3) If you stay long enough on each skate, you get the maximum out of your stroke.
²Note to the uninitiated: whereas the four wheels of an old-style rollerskate (or 'quad') form a rectangle, the wheels of a rollerblade are positioned along a single line. Hence in-line skate or rollerblade.
With reference to these utterances and to the context—linguistic and other—in which they could have arisen, one may ask the following questions. Can one characterize the pattern of neural activity that takes place in the brain of the speaker when he or she points at a specific pair of rollerblades while producing utterance 1? Assuming that such a characterization can be achieved, and referring from now on to this pattern as the neural representation of rollerblades, how does this representation change when another pair of rollerblades is being referred to? Some parts or aspects of the neural representation are likely to change, in connection with changes of the visual attributes of the rollerblades, such as shape, distance, size, color, orientation in space, etc. These changes undoubtedly affect the firing of many neurons in the visual areas of the brain of the speaker currently making reference to the rollerblades. One may conceive of a neural representation in a narrower sense, excluding everything that varies with the visual presentation of the rollerblades. That type of neural representation would not include the rates of activity of neurons in primary visual cortex. But there might exist in the brain a pattern of neural activity, located perhaps mostly in infero-temporal cortex (IT), which would be associated with the visual perception of rollerblades in a fully invariant way. Further, in principle, nothing seems to preclude the existence of an amodal neural representation of rollerblades. Such a pattern of neural activity would be elicited, reliably and reproducibly, each time the mental entity 'rollerblades' is evoked in connection with any mental operation, perceptual or not, and only then.
Is such an assumption sound? Is it not the case that no two visual occurrences of a complex object such as a pair of rollerblades are identical? Is it not the case that the number of substantially different occurrences is, for practical purposes, infinite? When carried out thoroughly, would not the elimination of all the variable aspects of the object lead one to discard virtually all that could possibly be part of a visual representation of it? Would not the same consideration hold for other modalities or aspects of the representation? If so, how is the 'rollerbladeness' of rollerblades best captured, and how could it be represented neurally? Now one might seek an invariant representation in the region of the brain that is responsible for the production or processing of the word rollerblade, a symbol used by individuals within a given community to communicate information about the real object. Inasmuch as the acoustic/phonetic signal corresponding to the word rollerblade is uniquely defined, there should exist in the brain of the speaker or listener a pattern of neural activity that corresponds uniquely to this word, hence to the conceptual entity rollerblade. This pattern, which will manifest itself each time the word rollerblade is uttered or heard, will be fully invariant with respect to the visual attributes of a specific object that might be referred to at that moment. However, it is hardly the case that the acoustic signal corresponding to a given word is unique. In fact, as has been discovered in particular by researchers in automatic speech recognition, the variability of speech signals is considerable. Therefore, there can hardly exist in the brain a single pattern of activity uniquely associated with the speech signal 'rollerblade.' Moreover, as exemplified in utterance 3, the very same rollerblades can, context permitting, be referred to by using a word such as 'skate,' phonetically quite distant from the word rollerblade.
Or, as illustrated in 1, the first two syllables of the word—the roller part—may be dropped without much harm. When this is done, context becomes all-important: the occurrence of the lexical item 'blade' in a different context might elicit a radically different meaning. Similarly, the highly ambiguous word 'stroke' receives from its context a quite specific meaning in utterance 3, and the metaphorical use of 'cake' in 2 takes this word quite far from the food category, where it originates. What is then the mechanism whereby the meaning of symbol 'rollerblade' is constructed? Note that the morpheme 'roller' in 'rollerblade' may be viewed either as a component in a composite item, or, alternatively, as an element of context acting to disambiguate the meaning of item 'blade.' One of the recurrent themes in this paper will be the difficulty in distinguishing the inside of an entity from its outside. Contrasting with these remarks, note now that entities as different as 'rollerblade,' 'rose,' 'route,' 'royalty,' 'rudeness' and 'rust' share the following important property (in addition to their common first letter). Each of these entities, when occurring in an appropriate context, is used, i.e., perceived, retrieved from memory, combined, communicated, or otherwise manipulated, in an all-or-none fashion, i.e., as a perfectly well-defined entity in its own right. When we say 'rollerblade' (or 'blade' as in 1, or 'skate' as in 3), we do not mean 'rollerskate,' or 'rollercoaster,' or 'knife blade,' or 'ski boot,' or 'mountain shoe.' Nor do we mean a hybrid, say 95% rollerblade and 1% each of the other five items. Rather, we have in mind a well-defined entity, perfectly distinct from fellow mental entities. Further, nearly all the time, we are successful in conveying to our listener that unique grain of
meaning, using an appropriate symbol, in context. It is the neural counterparts to such mental entities that we seek to investigate in this paper. We are concerned with the dual nature of these entities, the apparent paradox whereby each of these grains of meaning seems to enjoy an individual life like any physical object—animate or not—yet does so in the absence of a clear borderline that would separate the inside of the entity from its outside.
2. GRAINS AND BONDS
For the moment, we shall proceed with the following assumption: to each well-defined mental entity there corresponds a well-defined neural counterpart, i.e., a pattern of activity which, however simple or complex, localized in a small part of the brain or widely distributed across it, occurs each time this entity manifests itself, in perception, speech, or any other conscious mental process, and only then. Said otherwise, there exists a mapping from mental entities to some putative, yet to be defined, space of neural-activity-related quantities, and this mapping is one-to-one. We refer to this mapping as a neural representation. The neural-representation assumption can be found, explicitly or implicitly, in much of modern neurobiological literature. And yet, the few remarks made above point at a serious difficulty in the way to the elucidation of the nature of the map. The difficulty has to do with the observation that at any level and within any modality where one attempts to isolate an even mildly complex entity such as a pair of rollerblades, this entity will exhibit an unpleasant tendency to elude us. Thus, we noted, when examining the physical object rollerblade, that the visual presentation of the object, while perfectly unambiguous as a whole, was highly variable, hence ambiguous, in any of its parts. The same goes, in the visual modality, for virtually any physical object presenting itself as part of a natural scene.
Almost without exception, natural scenes include many objects and clutter, in overlapping and mutually obscuring configurations. Object boundaries in natural images are, more often than not, difficult if not impossible to trace locally. Our ability to segment a scene into meaningful parts is a complex process involving the use of stored knowledge about physical objects, their parts and subparts, the interactions between these and the ways in which we interact with them. This knowledge is of a high level, i.e., relational and combinatorial, and we would be hard-pressed to draw a precise boundary around the collection of relationships that exactly make up an entity such as a pair of rollerblades in the visual modality. When examining the linguistic manifestation of a mental entity, we found ourselves confronted with a similar difficulty: a semantically well-defined entity exists only by virtue of a complex set of relationships, which we may at times want to view as internal to the entity, and at other times as belonging to the context in which the entity is currently experienced. In this paper, we shall attempt to reconcile these two apparently contradictory aspects of mental entities. We shall use the term grain-like to refer to the all-or-nothing quality of mental entities as they manifest themselves in a fully specified context. But we shall also attempt to characterize the structure of mind as map-like. By this we mean that mental material itself consists of a collection of maps, or correspondences. This will appear most
clearly by examining some of the processes underlying the construction of meaning of linguistic material. On the neurobiological side, we shall follow a suggestion made by von der Malsburg [1, 7]—see also [8-10]—namely that dynamical bonds may be represented in the fine temporal structure of neural activity. We shall elaborate on this proposal, relying on experimental data on well-timed events in cortex collected in particular by the group of Moshe Abeles, and using the model of synfire chain proposed by Abeles to account for his observations. We shall argue in favor of a neural representation that maps mental material into synfire patterns—or other complex spatio-temporal patterns possibly defined on a lower time resolution—in such a way as to (i) preserve the map-like structure of mind, and (ii) account for the grain-like quality of mental entities by the cooperative binding of a large number of patterns into a globally coherent construct.
3. MONKEYS AND ATTRACTORS
We start with an example of mental entities that exhibit, for all practical purposes, grain-like structure, and are of interest in neurophysiological experiments. Consider the following situation, fairly typical in behaving-monkey experiments. A monkey has been trained to push a lever, or point in a certain direction, upon presentation of a given visual stimulus. The electrophysiologist, who records the activity of neurons in various areas of the brain of the animal while it is performing the task, focuses his/her attention on entities such as: "The second green light to the left has been on for two seconds or so," or: "The monkey is now pointing its hand to the left," or: "The image displayed on the monitor in front of the monkey is the 27th (in a fixed series of images used to train the monkey to perform a given task, every day of the last three months)" etc.
The neurophysiologist, in an attempt to 'read the mind' of the monkey, is looking for a neural correlate, or predictor, for one of these mental entities. In this example, the entities of interest can be construed as having grain-like structure. A grain of meaning is a simple percept or motor command, such as the perception of a green light, the perception of a given image or shape previously experienced, the pointing in a given direction, etc. The primary operation of the mind on these grains of meaning is activation, that is, the evoking or retrieving of an entity, in a clear-cut, all-or-none fashion, at a given instant of time. This activation obeys some rules, associative and other, subject to various types of fast or slow changes. Much of the physiological data collected in experiments of this type (e.g., [11]) indicates that for a given mental entity there is in the monkey's brain a well-defined collection of neurons, also called a cell assembly [12], the activity of which increases during the activation of that entity. This increase of activity is a predictor for the entity under study. Depending on the type of mental entity, the neurons which form the assembly may be located in different regions, in particular cortical areas, of the monkey's brain. It is, for instance, observed that the cell assembly corresponding to the perception of a small, simple, visual stimulus displayed at a given location on the retina includes neurons at a given location of the primary visual cortex of the animal. In the case of more complex visual stimuli, the cell assembly, inasmuch as its existence can be revealed, tends to be located in more central regions of the visual system, in particular IT.
The activation of cell assemblies is believed to be an all-or-none process, reflecting the grain-like aspect of mental entities. This aspect, already emphasized by Hebb [12], is rendered most clearly in recent models, which stress properties of error correction, or auto-association, or retrieval of memories from incomplete data [13]. Alternatively, these models are described as relying on the principle of computing with attractors [14]. The attractors of the dynamics of a feedback network correspond to grains of meaning, through a one-to-one neural representation. Our contention (§10) will be that this category of neural representations is too limited. Correspondences that assign grains of neural activity to grains of meaning do account for an important aspect of mental entities, namely their elicitation in an all-or-nothing fashion. However, they fail to render the 'dual' aspect of mental material, its map-like structure.

4. BANKS AND RAIN

Leaping now across species and modalities, we turn to linguistic examples. We use these examples to illustrate what we call the 'map-like' structure of mental material. Consider the lexical item 'bank.' Consider the mechanisms (syntactic, semantic and pragmatic) whereby context specifies or alters the meaning of this lexical item. Suppose we were assigned the task of exhaustively enumerating the collection of mental entities, call this collection $M_{\text{bank}}$, which can possibly come into play in these meaning-construction mechanisms. In attempting to establish such a list, i.e., to enumerate the set $M_{\text{bank}}$, we could get help from Webster's dictionary [15]. The list would likely include a few dozen acceptations of the word (as a noun and as a verb, the latter most of the time intransitive but also, as it turns out, transitive) as well as fragments of speech that illustrate these various uses. Note that the medium used in the dictionary to convey the different meanings of an item consists itself of strings of lexical items.
Since any entity is rendered in this way as a composition of entities of the same nature, a dictionary is a self-referential affair. Moreover, it is fairly clear that any list that we may propose as an enumeration of $M_{\text{bank}}$ will be incomplete and tentative; it will, at best, be a representative sample, since the collection of mental entities $M_{\text{bank}}$ is open-ended. These remarks suggest that the material of language and thought may not lend itself to a decomposition into discrete entities. The meaning of a lexical item is a highly connected object, bound to straddle many boundaries in any picture of a cognitive landscape that one may attempt to draw. The main difficulty with a grain-like conception of mind is that there is no clear level on which grains of meaning are to be found. As we just saw, lexical items envisaged before context comes into play are not appropriate candidates, since their meaning is, more often than not, poorly defined at that stage. Decomposing the meaning of an item into a number of specific alternative explanations, each involving a different context, will bring little salvation, since such a decomposition can be carried out only incompletely and imperfectly, and, what's worse, will rely on composite constructs utilizing other such entities, that is, on entities that are hardly simpler than the one we are attempting to decompose.
Let us then examine in more detail the mechanisms whereby context acts to allow us to construct the meaning of a given string, say one that includes the lexical item bank. Two examples of such strings, again from [15], are the following:

The engineers hadn't given the road enough bank. (4)

In the rainy season the clouds would bank up about midday, and showers fall with true tropical violence. (5)
These examples once more illustrate the crucial role of context in the focusing of semantic content; before context comes into play, the word bank is highly ambiguous. It appears that what we are doing in our mind, with remarkable speed, reliability, and effortlessness, when putting together the meaning of such strings can be roughly summarized as consisting of the following two processes (this account is partly inspired by [16]): (i) bringing to the fore, in some sort of active workspace, a number of concrete past experiences, as well as domains of knowledge, of more or less abstract nature; (ii) establishing maps, i.e., systems of bonds or relationships (more about 'maps' in sections 5 and 7), between these cognitive spaces. Specifically, to flesh out the meaning of utterance 4, I may, for instance, retrieve the specific recollection of a narrow winding road between two fields of corn on a summer afternoon. Having done that, I may alter the time of day and weather and repaint the scene in a dark rainy night. I may place, in the field of corn on the outer side of the turn, a heavy truck lying on its side, after its driver has lost control of it because of the rain, because he was sleepy, and because the road didn't have enough bank. I may also repaint the image a few days later, with, in place of the vehicle which has been removed by now, two fellows with yellow helmets (the engineers who designed the road) discussing the bank or lack thereof. In the process, I may have activated a number of fragments of knowledge, such as: 'Rain makes for slippery roads'; 'A lateral incline will counteract the centrifugal force'; 'Engineers should build safe roads,' etc. The evoking of these memories and fragments of knowledge, among many others of which I am hardly conscious, is part of the process of constructing an interpretation of the sentence.
The second important aspect in the process of meaning construction is the establishment of maps, or correspondences, between all these domains, experiences, or chunks of knowledge. Specifically, I will instantiate the engineers who occur in the sentence in a nondescript form (nothing is specified except their having done a poor job on this road) as two fellows with their yellow helmets; I will place the engineers at the edge of the field of corn to the left of the road two days after the accident; I will chain the propositions about roads, safety, rain, etc., into a logical sequence, which I will apply to that particular road. A crucial observation here is that these maps between cognitive domains, far from being arbitrary, obey very strict constraints. To appreciate the stringency of these constraints, consider the countless nonsensical ways in which one could connect the same entities with each other. Thus, consider on the one hand the engineer-related domain and some of the entities and fragments of knowledge it evokes: 'engineers build roads,' 'engineers are humans and humans have heads,' 'heads of engineers often carry yellow helmets,' etc. Consider, on the other hand, the road-related domain: 'roads are outdoors,' 'I remember
a road that crossed a field of corn,' 'outdoors is where precipitation occurs,' etc. Then build a picture that connects these two domains by having yellow helmets, rather than rain, descend on the road. By doing this one would have established a system of bonds between the two domains. Yet this system of bonds is nonsensical, and will not normally occur in the process of meaning construction. Clearly, only a small fraction of all possible systems of bonds are legitimate.³

5. MAPS BETWEEN MAPS

Let us then summarize, in two propositions, our discussion of the manipulation of linguistic/conceptual material.

Proposition 1: The chief mode of operation of the mind on conceptual material (construction of meaning) is composition, i.e., the bringing together of various parts, or regions, or fragments, or aspects, of this material into a structured map, i.e., a globally coherent pattern of bonds or relations. Composition takes place on the time scale characteristic of conscious mental processes, roughly 1 s.

Proposition 2: The process of composition is recursive. It may also, in some cases, be self-referential, that is, involve loops, or circular constructions.

The second part of Proposition 2 roughly states that if we attempt to recursively decompose a given entity, call it E, into constituents and these constituents into more elementary constituents and so on, we may at one point be faced with a situation where entity E shows up again, as one of the sub-sub...sub-parts of itself. This is not an essential tenet of our proposal. However, it would appear to follow from the above observation that any closed system such as a dictionary, when considered as a self-sufficient body of definitions for all the lexical items in a given language, has to include loops.
When constructing the meaning of a sentence, or phrase, proposed in a dictionary as a definition of a given item, the reader carries out in his/her mind compositional processes of the same nature as those outlined above for the road example. It could be argued that dictionaries also rely at times on non-verbal knowledge such as is conveyed by illustrations, and on implicit knowledge assumed to be shared by all the users of the dictionary. In this fashion, any concept, concrete such as roller blades or abstract such as rudeness, might, in principle, be construed as a loop-free, if complex, construction involving, at the bottom, objects that might be regarded as grains of meaning, such as green lights and bananas. The latter could conceivably be defined with the help of simple pictures, examples of real-life situations, actions, etc., without further recourse to linguistic/conceptual input. This debate need not affect much the practical conception one has of the structure of mental material and of language, as it has evolved and is in use in a given community. In effect, self-reference in language, if present at all, cannot easily be pinned down in a specific entity or a small collection thereof. Rather, it is a global property of a conceptual system. The definition of a given concept (whether taken from a dictionary, a textbook, a car owner's manual, a street conversation, etc.) would generally refer to a collection of other concepts, which themselves would refer to yet another collection and so on, until the trace of where the definition process starts and ends is lost.

³ Note, however, that one is generally free to choose from among many alternative compositions to construct a coherent meaning for a linguistic string such as 4. In particular, the past experiences elicited in the process are not only idiosyncratic to the speaker or listener but may also depend on his or her present 'state of mind.' In spite of this variability, the normal use of language is such that the global meaning of a complete string is, almost always, uniquely defined.

What constitutes the main tenet of the proposal made here is that composition is recursive. Again, the internal/semantic and/or contextual/pragmatic structure of any entity (inasmuch as one wants to distinguish between these two aspects) appears to be, essentially, a system of maps between maps ... between maps. Thus:

Proposition 3: The structure of mind/language is map-like.

In other words, the fragments of mental material which, according to Proposition 1, the mind composes with each other by the establishment of dynamical patterns of bonds, are, themselves, dynamical patterns of bonds. In the next sections, we shall make Proposition 3 more concrete by discussing a few metaphors for mind/language, all of which involve, in one way or another, the use of symbols. We will focus on one particular metaphor, the game of Lego, to illustrate the notion of maps between maps.

6. LANGUAGE AND LEGO

In this section we shall use the term symbol system to refer to any system made up of objects that represent, i.e., stand for or mimic in some sense, external objects or entities. For instance, the commodities traded in stock markets constitute a symbol system. An interesting phenomenon is that some of the brokers who buy and sell derivative products, the trade of which now dominates, by a large factor, the volume of transactions in many stock exchanges, care little, in their daily practice, about how these 'abstract' commodities relate to coffee beans, mineral oil, or semiconductor components. In the hands of these agents, the high-level commodities operate in a quasi-autonomous way, with little practical interaction with objects of low level.
One may draw a parallel between this observation and our proposal that linguistic/conceptual items are, in practice, systems of bonds between other linguistic entities. A family of examples of natural symbol systems is provided by genetic material, proteins and related biomolecules. A man-made example, which, like stock markets, has evolved millions of times faster than DNA and proteins, is a construction game such as Lego. All these systems are compositional: entities within the system are constructed by bringing together other entities, in a recursive/hierarchical manner.⁴ The following are two important properties of each of these compositional symbol systems. First, the system includes an infinite, or potentially infinite, collection of recursively/hierarchically constructed entities. This property is sometimes referred to as (infinite) productivity. Second, allowable constructions nevertheless respect specific constraints, whereby overwhelmingly most combinations are illegitimate, or meaningless. Note that in language, the surface structure is strictly hierarchical: phonemes bind into syllables; syllables bind into words, etc.

⁴ The notion of compositionality is sometimes used in a narrow sense, implying that the semantic content or functional properties of a composite symbol are strictly and uniquely determined by the list of its constituents and by the specification of the mutual relationships between them. Some authors therefore hold that language is not strictly compositional, for, as we have seen, meaning, more often than not, depends on context. Yet, again, we do not insist on a sharp boundary between the 'intrinsic' content of a lexical item (semantics and syntax) and its context (pragmatics), so that the issue is a moot one.

In this paper we focus on one particular symbol system, the game Lego. The simple, yet in some ways remarkable, binding mechanism of Lego makes it a useful metaphor for natural compositional systems such as language/mind. The Lego metaphor for language was introduced briefly in an earlier paper [21]. Our discussion of Lego will extend over several sections. This section and the next one discuss, in some detail, the mechanism of hierarchical composition in Lego. This mechanism consists of a cooperative interaction between the many elementary bonds that make up a composite bond between two composite Lego objects. We characterize mathematically this cooperativity principle as being equivalent to the commutativity of a mapping diagram. We then propose (sections 8 and 9) to use Lego, viewed as a symbol system, as a metaphor for mind/language. We suggest that there might be, in the mechanisms of neuronal interactions in cortex, an aspect analogous to the property of cooperativity of Lego bonds. In sections 10 and 11 we shall make this proposal more specific, and suggest that cooperativity of bonds in neural activity may also be, formally, a property of commutativity of mapping diagrams.

An elementary Lego brick, or simply brick, is a small rectangular parallelepiped of rigid plastic material with two rows of regularly spaced teeth on its upper surface and complementary concavities on its lower surface. This motif, forming a square lattice, allows a limited number of bindings between bricks. Thus, there are only four possible binding angles between two bricks. The number of allowable configurations between composite Lego constructs is also quite limited (see below). In spite of these limitations, the system is infinitely productive; Lego constructs can be made to represent, i.e., resemble, an unlimited class of real-life objects.⁵ For simplicity, we consider a game consisting of identical bricks only, say with 2 × 4 teeth.
We assume we are given an unlimited supply of such identical bricks. We first examine the mechanism which allows multiple-brick Lego constructions to be assembled dynamically, i.e., to be done, undone, and recomposed with each other in very many different stable configurations. Consider two Lego constructs $A_1$ and $A_2$, each in the shape of a brick, i.e., a rectangular parallelepiped. Say $A_1$ and $A_2$ are both of size 20 × 20 × 4: their upper and lower surface (length × width) is 20 × 20, while their height⁶ is 4. Consider the following two alternative situations:

(i) $A_1$ and $A_2$ are bound with each other along a whole horizontal face. For instance, the bottom face of $A_1$ is bound to the top face of $A_2$. The bond between $A_1$ and $A_2$ is then of size 20 × 20, and the composite $B$ obtained in this way is itself a parallelepiped, of size 20 × 20 × 8.

(ii) $A_1$ and $A_2$ are bound with each other, into composite $C$, through a bond of size 1, by bringing a corner brick of $A_1$ in contact with a corner brick of $A_2$.⁷

⁵ See a good toy store.

⁶ Length, width and height are measured against the corresponding dimensions of an elementary brick. Note that Lego bricks are assembled in such a way that each brick straddles two or more bricks underneath it. This may result in ragged vertical faces. Hence $A_1$ and $A_2$ are not perfect parallelepipeds. We ignore this in the following discussion.

⁷ For instance, the brick in the southwest corner of the bottom face of $A_1$ is brought on top of the brick in the northeast corner of the top surface of $A_2$. The composite $C$ obtained in this way, which is not a parallelepiped, has overall size 39 × 39 × 8.
Evidently $C$ is unstable: the bond between $A_1$ and $A_2$ is too weak. On the other hand, $B$ is just as stable as each of its components $A_1$ and $A_2$. What makes $B$ a stable construct is that it brings $A_1$ and $A_2$ together through a composite bond, i.e., a bond made of many bonds. We shall say that the 20 × 20 elementary bonds⁸ that ensure the stability of the composition of $A_1$ and $A_2$ into $B$ cooperate with each other, in the following sense. If one of the bricks of $A_2$, call this brick $\beta_2$, is correctly placed vis-a-vis one of the bricks of $A_1$, call that brick $\beta_1$, the same holds of all pairs of bricks through which $A_1$ and $A_2$ come into contact. By $\beta_2$ being 'correctly placed' with respect to $\beta_1$ we mean that $\beta_1$ and $\beta_2$ stand precisely in one of the small number of possible spatial relationships that allow a binding between these bricks. Cooperative binding in Lego results from the symmetry and exact reproducibility of elementary bricks. In short, all of the 20 × 20 elementary bonds that hold $A_1$ and $A_2$ together agree with each other as to the relative position of the composite constructs $A_1$ and $A_2$ in 3-D space. It is this cooperation of a large number of elementary bonds which ensures the stability of the composition of objects $A_1$ and $A_2$ into object $B$.

The notion of cooperation will play a crucial role in the rest of the paper, and we discuss it now in detail. Note first that there are many different ways in which an agreement between a large number of elementary bonds between $A_1$ and $A_2$ can be reached. These different bindings stand in a one-to-one correspondence with a subset $T$ of the discrete symmetry group of the infinite 2-D lattice of Lego-brick teeth. For each rigid transformation of the 2-D plane $\tau \in T$, there is a stable composite object $B_\tau$ deduced from $B$ by applying the transformation $\tau$ to $A_2$ while keeping $A_1$ in the same position.
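The contrast between the composites $B$ and $C$ can be made concrete with a small count. The sketch below is only an illustration under simplifying assumptions of ours: faces are modeled as 20 × 20 grids of lattice sites, one site per elementary bond (the count used in the text, which ignores the straddling of bricks), and only translations of $A_2$ are considered. The function names are hypothetical.

```python
from itertools import product

def face(n=20):
    """Lattice sites of an n x n face, one site per elementary bond (a simplification)."""
    return set(product(range(n), repeat=2))

def bond_size(offset, n=20):
    """Number of elementary bonds when the top face of A2 is shifted by `offset`
    (a lattice translation) underneath the bottom face of A1."""
    dx, dy = offset
    a1_sites = face(n)
    a2_sites = {(x + dx, y + dy) for (x, y) in face(n)}
    return len(a1_sites & a2_sites)

full_face = bond_size((0, 0))      # composite B: the whole faces coincide
corner = bond_size((19, 19))       # composite C: a single corner site in contact
half_face = bond_size((10, 0))     # an intermediate translation
```

Each admissible translation gives one member of the family $B_\tau$; the size of the composite bond, and with it the stability of the construct, depends on how many elementary bonds the translation brings into agreement.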
Next, we wish to stress, with the help of a thought experiment, that the mechanism of cooperativity in Lego, while very apparent, should not be taken for granted. Consider a noise-corrupted Lego world, which we call Mego, where elementary bricks are not identical. Like Lego bricks, Mego bricks are exactly of the same size, yet the lattice of teeth on the top face of Mego bricks is shifted by a small random amount in a random direction with respect to the rest of the brick, in particular with respect to the complementary lattice on the bottom. This shift is drawn, independently for each brick, from a probability distribution which we need not specify. On the other hand, Mego bricks are somewhat flexible, and a bond between two bricks can be established even if the match is not perfect. This is achieved by bending somewhat the bricks to be attached, and/or some other bricks in the construct. A successful Mego bond is, just like a Lego bond, dynamical and stable: it can be done and undone by pressing the two bricks together or pulling them apart. When there is too much discrepancy in the relative positions of the two bricks, the bond fails altogether.

To a certain extent, Mego 'works.' Indeed, suppose we were asked to reproduce in Mego a given $N$-brick Lego object $A$. By carefully choosing the $N$ bricks one by one, one may succeed, for small enough $N$, in creating a Mego version $A'$ of $A$. Note that this composite $A'$ is likely to be warped in an unpredictable manner. Hierarchical composition on the other hand fails: since Mego objects are warped each in a different way, one cannot compose them with each other to produce an unbounded hierarchy of arbitrarily complex objects as is done in Lego. It is impossible, in Mego, to bind with each other two large constructs, say two (almost-)parallelepipeds $A'_1$ and $A'_2$, which have been assembled independently of each other. Each one of them suffers from distortions of its own, making it incompatible with the other object. The incompatibility between $A'_1$ and $A'_2$ resides in the lack of agreement between the elementary tentative bonds that would hold the two together. Each one of these bonds has a slightly different idea about what the relative position of $A'_1$ and $A'_2$ should be, making cooperative binding impossible. Of course one could assemble $A'_1$ and $A'_2$ into a construct of the $C$ type, i.e., with only one elementary bond between $A'_1$ and $A'_2$. Yet the resulting construct would be unstable, again for lack of cooperative binding.

⁸ Because of the straddling of bricks, there are actually more than 20 × 20 elementary bonds, and each involves fewer than 8 teeth.

This thought experiment shows that the mechanism of compositionality in Lego does rely on a strong property of composite Lego constructs: they all use the same, universal (within the world of Lego) lattice of teeth and indentations, and, thanks to their symmetry, extend the lattice in a consistent way across arbitrarily large composite constructs. As a result, a composite construct $A_1$ will bind with a composite construct $A_2$ through a composite bond made up of a large number of cooperating elementary bonds. Another way to look at this is as follows. The binding of two Lego bricks $\alpha$ and $\beta$ with each other can be viewed as a seed for the binding of two larger constructs $A$ and $B$, where $A$ includes $\alpha$ and $B$ includes $\beta$. In effect, due to the rigidity and exact reproducibility of Lego bricks, $\alpha$ is, implicitly, part of an infinite Lego lattice, and so is $\beta$. Therefore, registering $\alpha$ with respect to $\beta$ allows us to predict accurately the registration of any brick in $A$ with respect to any brick in $B$. In the next section we shall formalize this idea, and propose to view the binding of brick $\alpha$ to brick $\beta$
as equivalent to a rigid transformation from the entire Euclidean space $\mathbb{R}^3$ onto itself. The formalism that will result is actually not necessary to study the composition mechanics of Lego, yet we shall make use of it later when discussing neural mechanisms of composition.

7. MAPS BETWEEN MAPS AGAIN

One of the aims of this paper is to suggest that a mechanism analogous to the cooperative binding mechanism of Lego may underlie the recursive dynamical binding of neural representations in cortex. In order to help us articulate this proposal, we shall now introduce some mathematics to the effect of characterizing the composition of composite Lego objects as a map between maps. The cooperativity, or mutual agreement, between the elementary bonds will be formalized as a property of commutativity of a mapping diagram. (Cooperativity is also an essential aspect of the binding mechanism of proteins, or nucleic acids, yet the mathematics there is likely to be less straightforward than in Lego.) We shall then conclude that, from a functional perspective, a composite Lego construct is, essentially, a system of maps acting on maps.

We start with the simplest of composites: a stable construct $A$ made of two elementary bricks $\alpha$ and $\beta$. As noted in the previous section, the composite $A$ is totally specified by a map $\phi$, the rigid transformation of $\mathbb{R}^3$ that takes $\alpha$ into $\beta$, and, in particular, the lattice of teeth of $\alpha$ into the lattice of teeth of $\beta$. The composition rules of Lego, dictated by the structure of Lego bricks, constrain this map to belong to a small discrete set of
transformations of $\mathbb{R}^3$. We call $\phi$ the binding map of $A$, and denote $A = (\alpha|\phi|\beta)$. We also say that $\phi$ is an elementary binding map, since it binds two elementary bricks.⁹

Consider now a stable $N$-brick Lego construct $A$, with $N > 2$. Stability, a physical property of Lego constructs, may be given the following coarse definition, using the notion of a chain. A chain is a sequence of elementary bricks $\alpha_1, \alpha_2, \ldots, \alpha_n$, such that $\alpha_1$ is directly bound to $\alpha_2$, $\alpha_2$ is directly bound to $\alpha_3$, ..., $\alpha_{n-1}$ is directly bound to $\alpha_n$. A Lego construct $A$ will be deemed stable if any two disjoint large subsets of $A$ are connected with each other, in $A$, through many non-overlapping chains. According to this definition, which we shall not make more precise, a 20 × 20 × 4 parallelepiped as described above is stable. On the other hand, a composite of type $C$, consisting of two such parallelepipeds assembled through a bond of size 1, is unstable.

We now define, for any stable multi-brick construct $A$, the binding diagram $\Psi(A)$. First, this diagram is a mapping diagram, i.e., a collection of maps between spaces, called the nodes of the diagram. The binding diagram $\Psi(A)$ is defined as follows:

1. $\Psi(A)$ contains $N$ nodes, i.e., $N$ spaces, one for each brick in $A$.
2. Each node of $\Psi(A)$ is identical¹⁰ to the Euclidean space $\mathbb{R}^3$.
3. For each pair of bricks that are directly bound in $A$, i.e., for each two-brick construct $(\alpha|\phi|\beta)$ that is a subobject of $A$, there is a map in $\Psi(A)$ between the nodes corresponding respectively to $\alpha$ and to $\beta$. This map is the elementary binding map $\phi$.

Now a mapping diagram is commutative if the following is true: for any two nodes $X$ and $Y$ (remember these nodes are spaces) such that there are several paths from $X$ to $Y$ in the diagram, the maps composed along these different paths (each of these chained maps carries $X$ into $Y$) are all identical.
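The definition of commutativity just given can be checked mechanically on a small construct. The following is a minimal sketch, not part of the formalism itself, and it rests on assumptions of ours: bricks keep a fixed orientation, so that every binding map reduces to a translation stored as an integer triple; the four-brick layout (two bricks side by side, two more straddling them on the layer above) and all function names are hypothetical.

```python
def translation(p, q):
    """Translation carrying position p onto position q."""
    return tuple(b - a for a, b in zip(p, q))

def binding_diagram(positions, bonds):
    """One map per directly bound pair of bricks, stored in both directions."""
    maps = {}
    for a, b in bonds:
        maps[(a, b)] = translation(positions[a], positions[b])
        maps[(b, a)] = translation(positions[b], positions[a])
    return maps

def paths(maps, start, end, visited=()):
    """Enumerate all simple paths from start to end in the diagram."""
    visited = visited + (start,)
    if start == end:
        yield visited
        return
    for a, b in maps:
        if a == start and b not in visited:
            yield from paths(maps, b, end, visited)

def compose(path, maps):
    """Compose the maps along a path (for translations: add the offsets)."""
    total = (0, 0, 0)
    for a, b in zip(path, path[1:]):
        total = tuple(s + d for s, d in zip(total, maps[(a, b)]))
    return total

def is_commutative(maps):
    """True if all paths between any two nodes compose to the same map."""
    nodes = {a for a, _ in maps}
    return all(
        len({compose(p, maps) for p in paths(maps, x, y)}) <= 1
        for x in nodes for y in nodes if x != y
    )

# Hypothetical layout, in tooth units: a and b on the lower layer, c and d above.
positions = {"a": (0, 0, 0), "b": (4, 0, 0), "c": (2, -1, 1), "d": (2, 1, 1)}
bonds = [("a", "c"), ("b", "c"), ("a", "d"), ("b", "d")]
lego = binding_diagram(positions, bonds)

# A distorted diagram in the spirit of Mego: one binding map is slightly off.
distorted = dict(lego)
distorted[("b", "d")] = (-1, 1, 1)   # the undistorted map is (-2, 1, 1)
```

In the undistorted diagram the two routes from `a` to `d` (directly, or through `c` and `b`) compose to the same translation; in the distorted one they disagree, which is the failure of cooperation described for Mego.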
Said otherwise, it does not matter which path one uses to go from node $X$ to node $Y$.¹¹ Note that a path in $\Psi(A)$ corresponds to a chain in $A$. Clearly, the binding diagram $\Psi(A)$ of any stable Lego construct $A$ is commutative. To spell it out, let $\alpha_1, \alpha_2, \ldots, \alpha_n$ be a chain of bricks within the composite object $A$. Let $\beta_1, \beta_2, \ldots, \beta_m$ be another chain of bricks within $A$ with the same first and last bricks as the $\alpha$ chain, i.e., such that $\alpha_1 = \beta_1$ and $\alpha_n = \beta_m$. Then the composition of binding maps along the $\alpha$ chain is identical to the composition of binding maps along the $\beta$ chain. Of course with $N$ large, there generally are, for any two given bricks $\alpha$ and $\beta$ in $A$, many different chains $\alpha_1, \alpha_2, \ldots, \alpha_n$ leading from $\alpha$ to $\beta$, i.e., such that $\alpha = \alpha_1$ and $\beta = \alpha_n$. The commutative diagram $\Psi(A)$ is thus redundant: the composite $A$ is actually totally specified by a subset¹² of the binding maps in $\Psi(A)$.

The commutativity of binding diagrams of stable Lego objects is trivial: by definition, the composition of binding maps along any chain from $\alpha$ to $\beta$ in a stable construct $A$ is the rigid $\mathbb{R}^3$ transformation that takes $\alpha$ into $\beta$.¹³ The binding diagram of an unstable construct is also commutative, yet it does not contain 'enough' commutative loops, and this is what makes the object unstable. While trivial, the property of commutativity should not be taken for granted. In Mego, compositionality fails precisely when the tentative binding diagram of a tentative composition fails to be commutative. On the other hand, the flexibility of Mego bricks allows us to repair the distortions in a binding diagram that is almost commutative, and render this diagram perfectly commutative.

⁹ Throughout, we assume that Lego bricks are given a fixed orientation in space. The transformation $\phi$ of $\mathbb{R}^3$ is then well-defined.

¹⁰ Formally, $\Psi(A)$ includes $N$ distinct spaces $S_\alpha$, $\alpha \in A$. Each $S_\alpha$ is a replica of $\mathbb{R}^3$.

¹¹ Commutative diagrams play an important role in many branches of mathematics.

¹² Any spanning tree.

¹³ Formally, this map transforms $S_\alpha$ into $S_\beta$.

To summarize, the stability of Lego construction is achieved by the cooperation of many elementary bonds within a composite bond. Mathematically, this cooperativity is a property of commutativity of binding diagrams.

We now extend the definition of a binding map, in the obvious way, to any pair of bricks $\alpha$ and $\beta$ within a given stable Lego construct $A$: the binding map from $\alpha$ to $\beta$ is the composite map along a chain from $\alpha$ to $\beta$. This definition is legitimate because $\Psi(A)$ is commutative: it does not matter which chain one uses to compute the binding map between distant bricks. We also revise our definition of binding diagrams: the binding diagram $\Psi(A)$ will from now on include the binding map between $\alpha$ and $\beta$ for every pair of elementary bricks $\alpha$ and $\beta$ in $A$. Of course $\Psi(A)$ is still a commutative mapping diagram.

Up to now, binding maps were defined only for pairs of elementary bricks $\alpha$ and $\beta$, whether directly bound with each other or indirectly bound within a given stable composite $A$. We now extend the notion of binding map so that it applies to the case where a composite construct, say $A_1$, is bound with another composite construct, say $A_2$, within a larger object, say $B$.
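The redundancy of a commutative binding diagram, and the extension of binding maps to distant bricks, can be illustrated as follows. This is a sketch under the same assumptions of ours as before (fixed orientation, binding maps reduced to integer translations, a hypothetical four-brick layout): starting from the binding maps of a spanning tree only, we place every brick and then read off the binding map between two bricks whose direct bond was left out of the tree.

```python
from collections import deque

def place_bricks(root, tree_maps):
    """Place every brick relative to `root`, using only the binding maps of a
    spanning tree: {(a, b): translation taking a into b}."""
    adjacency = {}
    for (a, b), t in tree_maps.items():
        adjacency.setdefault(a, []).append((b, t))
        adjacency.setdefault(b, []).append((a, tuple(-c for c in t)))  # inverse map
    placed = {root: (0, 0, 0)}
    queue = deque([root])
    while queue:
        a = queue.popleft()
        for b, t in adjacency.get(a, []):
            if b not in placed:
                placed[b] = tuple(p + c for p, c in zip(placed[a], t))
                queue.append(b)
    return placed

def binding_map(placed, a, b):
    """Extended binding map between any two bricks, direct bond or not."""
    return tuple(q - p for p, q in zip(placed[a], placed[b]))

# Three maps of a four-brick construct; the direct bond between b and d is omitted.
tree = {("a", "c"): (2, -1, 1), ("c", "b"): (2, 1, -1), ("a", "d"): (2, 1, 1)}
placed = place_bricks("a", tree)
recovered = binding_map(placed, "b", "d")
```

Because the diagram is commutative, the map recovered along the chain `b`, `c`, `a`, `d` is the one the omitted bond would have carried; any other spanning tree would give the same answer.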
Specifically, the composite binding map, or simply binding map, $\Phi$, from $A_1$ to $A_2$, within a given stable composite $B$ which includes $A_1$ and $A_2$, is defined as the collection of binding maps $\phi$ that take any brick $\alpha_1$ within $A_1$ to any brick $\alpha_2$ within $A_2$.¹⁴ This is a legitimate definition because the binding diagram $\Psi(B)$ is commutative; the binding map $\Phi$ is a subset of the collection of binding maps that make up $\Psi(B)$. If $A_1$ and $A_2$ make up all of $B$, the composite construct $B$ is totally (in fact, redundantly) specified by the binding map $\Phi$. We can then use the notation $B = (A_1|\Phi|A_2)$, a straightforward extension of the notation $A = (\alpha|\phi|\beta)$ used above for the binding of elementary bricks. Also, the binding diagram $\Psi(B)$ is then the disjoint union of the composite binding map $\Phi$ with the binding diagrams $\Psi(A_1)$ and $\Psi(A_2)$:

$\Psi(B) = \Psi((A_1|\Phi|A_2)) = \Phi \cup \Psi(A_1) \cup \Psi(A_2).$

In words, the binding map from $A_1$ to $A_2$ is exactly what it takes to complete the separate binding diagrams of these two objects to a bona fide Lego binding diagram. Of particular interest is the case where, given $A_1$ and $A_2$, there exists only one binding map $\Phi$ that makes $(A_1|\Phi|A_2)$ a stable construct. We then say that the bond $\Phi$ between $A_1$ and $A_2$ is of the 'lock-and-key' type,¹⁵ and we use the shorter notation $B = (A_1|A_2)$.

¹⁴ Formally, this elementary binding map transforms $S_{\alpha_1}$ into $S_{\alpha_2}$.

¹⁵ A simple example of a lock-and-key pair $A'_1$ and $A'_2$ is as follows. Let $A_1$ and $A_2$ be two identical rectangular parallelepipeds, of size 20 × 20 × 4 each, as in §6. Create $A'_2$ by binding to $A_2$ an additional 10 elementary bricks, at 10 different positions randomly drawn on the top face of $A_2$; add the same random pattern of 10 bricks on the bottom face of $A_2$; finally, create $A'_1$ by removing from the bottom face of $A_1$ a 'complementary' pattern of bricks. Clearly, $A'_1$ and $A'_2$ will bind, with $A'_1$ on top of $A'_2$, into a 20 × 20 × 8 regular rectangular parallelepiped, the construct we called $B$ in §6. This is the only stable construct that can be created by binding $A'_1$ to $A'_2$, since, with very high probability, any other composite bond between $A'_1$ and $A'_2$ will include at most 10 elementary bonds, hence be unstable, according to our definition of stability.
The commutativity of ^{B) implies in particular that if (/>i is a binding map between two bricks within Ai and 02 is a binding map between two bricks within A2, then the composition of (j)i by ^ (i.e., (j)i followed by $) is identical to the composition of $ by 02.^^ This commutativity allows us to view ^ as transforming the map 0i into the map (/>2. Since this applies to any (j)i € ^(^1) and any 02 G ^(^2), the binding map $ is seen to transform the binding diagram of object Ai into the binding diagram of object A2. Thus, the composition of Ai with A2 within B is precisely a map between maps, i.e., a map acting on maps. Mathematically, the operation of composing Ai with A2 is equivalent to transforming the binding diagram ^(^1) into the binding diagram ^(^2)Consider, finally, a stable composite Bi = {Ai\^i\A2) and another stable composite B2 = {^21^21^3) 5 such that the two bonds <^i and $2 can be established simultaneously, i.e., such that Ai and As do not overlap in space when composed with A2 through $1 and $2- Assuming that the resulting tripartite composite, say T, is stable too, we can denote it T = (Ai 1^11^21^21^3)- In the case where both $1 and $2 are lock-and-key bonds, this simplifies to T = (A1IA2IA3}. The intermediate construct A2 then acts like a binding map # between Ai and ^ 3 : (yli|$|A2) = (^i|A2|v43). Said otherwise, A2, for the purpose of composition with Ai and A^, is, formally, a binding map ^. We now summarize, in words, the main conclusions of our study of the 'compositional mechanics' of Lego. (Following each proposition in English we give, in parentheses, its mathematical equivalent.) First, since all Lego bricks are identical, the identity of an object Ai resides in the collection of internal spatial relationships between the constituents of this object. Simply put, a Lego object is made up of relationships (if ^(^1) = ^(^i)? then Ai and A[ are 'the same'). 
These relationships, due to the structure of elementary bricks, are highly constrained] in particular, they all agree with each other, i.e., form a coherent whole (the binding diagram "^{Ai) is commutative). Further, the collection of internal relationships that make up the identity of Ai also determines the function of Ai, i.e., its interactions with external objects (^(Ai) is a diagram of transformations of the entire space R^ into itself). Thus, the composition of Ai with another object A2 is an operation that acts on the internal structure of Ai and maps it, in complete detail, into the internal structure of A2 (in object (^i|^|A2), the map $ transforms ^(Ai) into ^(^2)). Finally, if Ai and A2 are indeed compatible, i.e., if there exists a composition operation (a binding map $) that binds them into a stable Lego construct B, then B, as a whole, is coherent: all the relationships within B, even those which 'cross over' from Ai to A2, agree with each other (the diagram ^(Ai) U^(^2) U ^ = ^((^i|^|^2)) is commutative). In particular, B can be further operated upon, i.e., composed with another Lego object C. That is, the composite structure of B can be further mapped into the inner structure of C. Thus, composition in Lego is recursive. informal definition of stability. (The 10 bricks on the bottom face of Aj prevent a stable binding with A[, with A'2 on top.) ^^As collections of maps, these are not identical, since their respective source and target spaces are not the same. Consider however, in each of these two collections of maps, the map that takes the source space of (/>i into the target space of <^2- The statement made here is that these two maps are identical.
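The commutativity condition on a Lego binding diagram can be made concrete with a small sketch. It rests on simplifying assumptions that are not part of the text: bricks are named by strings, each elementary binding map is a pure integer translation (rotations are ignored), and the function name `compose_positions` is hypothetical. The sketch propagates composite maps outward from one brick and reports whether every chain between two bricks yields the same composite map.

```python
from collections import deque


def compose_positions(bonds, root):
    """Propagate binding maps from `root`; return (positions, commutative).

    bonds: {(a, b): (dx, dy, dz)}, the translation taking brick a to brick b.
    Assumption (not from the text): binding maps are pure translations.
    """
    neighbors = {}
    for (a, b), (dx, dy, dz) in bonds.items():
        neighbors.setdefault(a, []).append((b, (dx, dy, dz)))
        # The inverse map lets us traverse the bond in either direction.
        neighbors.setdefault(b, []).append((a, (-dx, -dy, -dz)))

    positions = {root: (0, 0, 0)}  # composite map from root to each brick
    queue = deque([root])
    commutative = True
    while queue:
        a = queue.popleft()
        for b, offset in neighbors[a]:
            candidate = tuple(p + o for p, o in zip(positions[a], offset))
            if b not in positions:
                positions[b] = candidate
                queue.append(b)
            elif positions[b] != candidate:
                # Two chains to b disagree: the diagram does not commute.
                commutative = False
    return positions, commutative


# Four bricks bound in a closed loop a-b-d-c-a.
square = {
    ("a", "b"): (1, 0, 0),
    ("a", "c"): (0, 1, 0),
    ("b", "d"): (0, 1, 0),
    ("c", "d"): (1, 0, 0),
}
positions, commutative = compose_positions(square, "a")
```

When the diagram commutes, `positions["d"]` is the well-defined binding map from `a` to `d`, whichever chain is followed. Distorting a single bond, as in the Mego experiment, makes the two chains to `d` disagree and the check fails.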
8. LANGUAGE AND LEGO AGAIN
In Sections 1 to 5, we argued that the mechanism of construction of meaning in language could be viewed as the recursive building of a globally coherent pattern of bonds, and we summarized our discussion with Proposition 3, stating that the structure of language is map-like. Our study of Lego, a simple symbol system with a transparent map-like structure, suggests that composition might be analyzed quite generally in terms of commutative binding diagrams. [Footnote: One of the acceptations of the word 'bond' is the following [15]: 'A connection or system of connections in which adjacent parts of a structure are made to overlap so as to be tied or bound together.' Webster's lists eighteen different types of bond used in masonry: 'American bond,' 'blind bond,' etc. All these systems of connections are commutative patterns of binding maps, just like the Lego bond.] What makes up the coherence of a linguistic construct is, no doubt, more subtle and complex than the commutativity of the Lego bond. Yet we suggest that the notion of global coherence in language parallels that of global coherence in Lego in the sense that it is this coherence or consistency that lends its identity to a composite entity, i.e., allows it to function as a well-defined unit. In the earlier sections of this paper, we referred to this aspect of linguistic/mental constructs as the grain-like structure of mind. We shall now interpret more specifically the global coherence of a linguistic construct in terms of the commutativity of mapping diagrams.
Consider again utterance 1 (Section 1). Constructing the meaning of this sentence is a complex process, which involves, in particular, the instantiation of predicate 'have,' i.e., its binding with subject 'you' and object 'those blades.' In a first analysis, one may say that the predicate has slots, or roles, and that instantiation is a process of role filling. Thus, in 1, the subject slot is filled by the pronoun 'you,' and the object slot is filled by the noun phrase 'those blades.' Incorporating some of the semantics of predicate 'have' will allow a more detailed analysis of the process of instantiation. We outline such an analysis, using the composition formalism developed for Lego.
The composition of the linguistic symbol 'have' with the symbol 'you' and the symbol 'those rollerblades' into string 1 will be viewed as the establishment of a binding map between two mental constructs, one describing the semantic content of predicate 'have,' while the other describes the concrete situation at hand, i.e., the context in which this symbol occurs as string 1 is being used. We posit that the semantic content of predicate 'have,' a construct which we denote HAVE, subsumes a collection of past experiences of situations where this symbol was used. In each of these instances, symbol 'have' was used to express a relation of ownership between a person and an object. Thus for instance, the speaker herself, say Adina, may remember that when she was four she had a red bike. In Adina's mind, Adina is a construct which we denote ME, and the red bike in question is a construct which we denote RED-BIKE. Adina may also recollect that a friend of hers, say Pam, has a car. We posit that construct HAVE, consisting of memories or experiences of this kind, is elicited by each use of predicate 'have.' Like a Lego construct, HAVE can be 'decomposed' in this way into a collection of mutually agreeing (see below) binding maps. However, while the recursive decomposition of a Lego object all the way to elementary bricks is straightforward, and yields a unique collection of elementary bricks regardless of the order in which the decomposition was performed, we stressed in previous sections that such a thorough decomposition is hardly possible for linguistic/conceptual entities. Linguistic entities remain moving targets until they are composed, in a coherent, i.e., meaningful, way, within a specific context. Thus, in a specific enough context, the map-like structure of HAVE emerges. For instance, in the context of MOVING-THINGS-WITH-WHEELS, Adina may construe HAVE as a binding diagram $\Psi(\mathrm{HAVE})$ containing four constructs, ME, PAM, RED-BIKE, and CAR, and four maps, $\Phi_1$ mapping ME (i.e., Adina) to RED-BIKE, $\Phi_2$ mapping PAM to CAR, $\Phi_3$ mapping ME into PAM, and $\Phi_4$ mapping RED-BIKE into CAR. In this first analysis, which, we now see, is actually more a composition than a decomposition, the diagram $\Psi(\mathrm{HAVE})$ is already seen to be commutative: $\Phi_1$ composed with $\Phi_4$ is the same as $\Phi_3$ composed with $\Phi_2$. Said otherwise, $\Psi(\mathrm{HAVE})$ places ME in the same relative position vis-à-vis RED-BIKE as it places PAM vis-à-vis CAR. [Footnote: Ignoring tenses and other details.]
Now, in Lego, relative positions between elementary bricks are measured by applying rigid maps in $\mathbb{R}^3$, and relative positions between composite bricks are measured through maps between maps between ... maps of $\mathbb{R}^3$. When dealing with linguistic constructs we cannot, practically, decompose a construct all the way to the end. We do, however, decompose (actually compose) to a certain extent, i.e., specify contextual maps through which relative positions are expressed. For instance, two contextual maps that can operate on MOVING-THINGS-WITH-WHEELS are USE-FOR-TRANSPORTATION and GET-HIT-BY. With respect to these two maps, the relative position of ME and RED-BIKE is the same as that of PAM and CAR. Other maps could be used to compose the construct HAVE, even in a specific context such as provided by string 1. Thus, for instance, one may use the map WAS-RECEIVED-AS-BIRTHDAY-PRESENT: since YOU is a PERSON, and PERSONs can HAVE objects that are BIRTHDAY-PRESENTs, it may be the case that RED-BIKE and CAR stand vis-à-vis ME and PAM, respectively, in the relative position obtained through the map WAS-RECEIVED-AS-BIRTHDAY-PRESENT. Another commutative mapping diagram is thereby established, providing an alternative partial construction for HAVE. The two constructions are in mutual agreement, since MOVING-THINGS-WITH-WHEELS are popular BIRTHDAY-PRESENTs. This makes the union of the two diagrams commutative. In this larger context, both $\Phi_1$ and $\Phi_2$ emerge as consisting of the union of the map USE-FOR-TRANSPORTATION with the map WAS-RECEIVED-AS-BIRTHDAY-PRESENT. Note that we did not specify how constructs (i.e., maps) such as GET-HIT-BY or WAS-RECEIVED-AS-BIRTHDAY-PRESENT bind with (i.e., operate on) constructs such as ME or PAM. There is no need to do so because all the bonds involved are, with a few exceptions, of 'lock-and-key' type (see §7).
The instantiation of predicate 'have' in string 1 can now be analyzed as follows. Suppose the name of the person being asked the question 'How long have you had those rollerblades' (string 1) is Deborah. The context of this utterance has elicited, in the speaker's (i.e., Adina's) mind, a collection of partially bound constructs, including DEBORAH and ROLLERBLADES. The instantiation of 'have' in utterance 1 then consists of the dynamical binding of HAVE with DEBORAH and ROLLERBLADES, in such a way that the global binding diagram, including the maps mentioned above as well as USE-FOR-TRANSPORTATION, WAS-RECEIVED-AS-BIRTHDAY-PRESENT, etc. between DEBORAH and ROLLERBLADES, is commutative. This globally coherent construct, i.e., commutative binding diagram, makes explicit the fact that Deborah stands vis-à-vis her rollerblades in the same position as Pam vis-à-vis her car, in a number of distinct but partly overlapping contexts. Note that one context of interest is the physical proximity between the owner and the object owned: Deborah is on her rollerblades and Pam was in her car. This incorporation of low-level perceptual (in this case visual) binding maps into the semantics of HAVE may be seen as a step in the direction of a de-composition of that construct into 'primitives.'
9. LIGHT-YEARS AND HORSES
In this section we examine briefly some limitations of the Lego metaphor for language. In §4, we attempted to define a set $M_{\mathrm{bank}}$ as the collection of all mental entities that make up the entity BANK. We immediately noted the futility of this attempt, in particular because $M_{\mathrm{bank}}$ would be open-ended. Using the notation of the previous section, we can now construe the mental construct BANK as a recursively embedded mapping diagram, which we may denote $\Psi(\mathrm{BANK})$. The diagram $\Psi(\mathrm{BANK})$ is indeed open-ended: depending on the context in which the lexical item 'bank' is used, BANK will assume a different form. Thus, the structure of a mental entity is 'external' rather than 'internal.' While a Lego construct decomposes into a finite collection of elementary bricks, the meaning of a linguistic construct is obtained by a process of composition.
In this sense, the structure of mind appears to be fundamentally different from that of any strictly hierarchical symbol system assembled from objects that live in the physical space $\mathbb{R}^3$, such as Lego or DNA. [Footnote: On the surface however, language is strictly hierarchical: phrases decompose uniquely into words, which decompose uniquely into syllables, etc.] Said otherwise, the Lego metaphor is valid only locally, i.e., when composition is considered a few recursive steps at a time. The composition mechanics of composite Lego objects can be studied independently from the mechanics of elementary Lego bricks. As noted in §6, stock exchanges provide another such example.
Consider now, at a given time $t$ in the life of a given individual, the collection of all binding diagrams elicited before $t$ in the mind of this person. Let us agree to term the union of these binding diagrams the 'mind,' $\mathcal{M}$, of that person at time $t$. This union is not disjoint: there is a considerable overlap between the individual binding diagrams that make up $\mathcal{M}$, and $\mathcal{M}$ is itself an immense diagram of maps. Mentation after time $t$ may be viewed as the activation, on the 1-second time scale, of subdiagrams of $\mathcal{M}$. [Footnote: The process of memory then alters $\mathcal{M}$.] One may view the emergence of 'grains' of meaning as a mere consequence of the fact that some large enough subcollections of individual binding diagrams in $\mathcal{M}$ happen to bind with each other into globally coherent, i.e., commutative, diagrams. Again, these grains elude a rigorous definition. However, since any composite Lego construct $A$ can be viewed, alternatively, as a binding diagram $\Psi(A)$ (which makes the metaphor possible) or as a node in the composition diagram of $A$ with other Lego constructs, let us attempt to reconcile our approach with the notion of a 'semantic node': the semantic node BANK would be the counterpart of a very large multibrick subpart of an immense Lego construct $M$, the latter being the counterpart of $\mathcal{M}$. In other words we assume, for a moment, that the Lego metaphor for language really 'works' all the way to the end: there exists, in principle, a mapping $\mu$ which takes the mind $\mathcal{M}$ into a formidable Lego world $M$ in a composition-preserving way. Thus, in $M$, the image $\mu$(DEBORAH HAS ROLLERBLADES) of the $\mathcal{M}$ construct DEBORAH HAS ROLLERBLADES is a composite of the Lego constructs $\mu$(DEBORAH), $\mu$(HAVE), and $\mu$(ROLLERBLADES).
If we assume that such a Lego representation $(M, \mu)$ of $\mathcal{M}$ exists, we obtain a Lego world $M$ with rather unusual features. First, whereas the binding diagram of any familiar Lego object $B$ is completely connected, this is not the case of the binding diagram of $M$. Indeed, for any two subparts, elementary or not, of a real Lego object $B$, the binding diagram $\Psi(B)$ includes the binding map between these two parts. In $M$ however, consider for instance the nodes $\mu$(HORSE) and $\mu$(LIGHT-YEAR). It would seem quite unlikely that $\mathcal{M}$ includes a map between HORSE and LIGHT-YEAR, i.e., that these two nodes would ever have been activated together in a single experience, giving rise to a memory that assembles them in one and the same context. In effect, these two semantic nodes seem as distant from each other as two semantic nodes can possibly be. This problem is perhaps not insurmountable. One may attempt to picture $M$ as a noisy, Mego-style, Lego world (see §6). In such a model $M$, distortions, though small, would accumulate over distance, so that commutativity would eventually break down.
Said otherwise, binding maps would reach only up to a certain distance across $\mathbb{R}^3$, and (so we hope) the graph-theoretic distance induced by the binding maps on $M$ would reflect a 'semantic distance' between nodes in $\mathcal{M}$. Thus, $\mu$(HORSE) and $\mu$(LIGHT-YEAR) would be very distant from each other in $M$. With this picture in mind, consider the string
This is putting the cart one light-year ahead of the horse. (6)
String 6 creates a globally coherent $\mathcal{M}$ construct in which the 'semantic nodes' HORSE and LIGHT-YEAR, which were thought to be, until time $t$, very distant from each other, stand all of a sudden, at time $t$, in close contact. This breakdown of the Lego metaphor is quite serious. Examples such as string 6, which abound in all natural languages, particularly in English, demonstrate the futility of any attempt to map $\mathcal{M}$ into the Euclidean space $\mathbb{R}^3$: it seems that composition as our minds perform it pays little heed to any fixed distance that one may want to impose on 'mental space.' One way to pursue the Lego metaphor in the face of this mobility of mental material may be to give up altogether on low-dimensional Euclidean spaces and to view $M$ as a manifold embedded in a high-dimensional space $H$. The manifold $M$ would then be able to 'fold' upon itself in multiple ways (think of a 1-D chain of amino acids folding upon itself in 3-D space), thereby establishing connections, at time $t$, between regions of $M$ that were very distant from each other, in the topology of $M$, until time $t$.
The neural representation discussed in the next section is an attempt in this direction.
10. NEURONS AND LEGO
In sections 2 and 3 we discussed the view that mental 'entities' can be mapped, in a one-to-one fashion, into neural activity patterns. We argued in sections 4 and 5 that mind is best described with the help of recursively embedded systems of maps, rather than as a collection of grain-like entities. We suggested that what lends to mental material its grain-like aspect is the global consistency of the system of maps that we construct, on the fly, from a given input and context (linguistic and/or visual) and from memory. With the help of a few examples, we attempted to characterize the notion of global consistency of linguistic constructs. Using the Lego metaphor, we argued in sections 7 to 9 that consistency could formally be viewed as a property of commutativity of a recursively embedded mapping diagram. The composition mechanics of Lego is based, in a transparent way, on this commutativity property.
We now, after this long introduction, approach the goal of this paper, which is to derive, from first principles, a compositional neural representation. To this end we shall again use the Lego metaphor, carrying it this time all the way down to neurons and synapses. We shall offer mainly qualitative arguments. The reader is referred to [10] for a quantitative treatment of some of the aspects of the model.
The basic idea [1,8] is that binding takes place in time rather than space. Reasoning from first principles this has to be so, since neurons, unlike Lego bricks for instance, do not move in space. We shall therefore posit, very roughly, that two monosynaptically coupled neurons are bound whenever their firing times agree with the transmission delay between them. This shall result in a fairly straightforward correspondence between the composition mechanics of Lego and the composition mechanics of cortical activity patterns.
Whereas the former takes place in the Euclidean space $\mathbb{R}^3$, the latter takes place in time, a one-dimensional space. However, rather than using the entire real line $\mathbb{R}$, we shall consider neural events that take place in a brief interval $T$, say of length 1000 ms. Thus, while Lego binding maps are defined on the entire space $\mathbb{R}^3$ (the spatial coherence of Lego constructs is potentially infinite, thanks to the exact reproducibility of Lego bricks), we shall define neuronal binding maps on a limited interval $T$ rather than on the entire time axis $\mathbb{R}$. The correspondence we propose to establish between the Lego world and the neuron world is thus local. The main reason for this is that commutativity would not hold for long neuronal constructs. But note that mental constructs are indeed short-lived. While a record of a given construct (a synaptic trace) may remain in the system for arbitrarily long times, the construct itself, defined as a collection of binding maps, i.e., firing events, dies in a matter of a few seconds.
Another difference between Lego composition and neural composition is that, while elementary Lego bricks are all identical, no two neurons are morphologically the same, and, more importantly, no two neurons are connected in the same way. Neural composition uses the connectivity graph of cortex as a highly specific and idiosyncratic substrate, a substrate which contains records of the past function of one particular brain.
Most of the choices to be made in constructing the correspondence between Lego and cortex are virtually forced, from elementary anatomical and physiological considerations; these choices will appear as statements. [Footnote: We do not claim that these statements are unconditionally true, i.e., that the dynamics of cortex, modulo the correspondence thus established, is identical to the dynamics of Lego. We merely argue that these are reasonable choices to make in a Lego model of cortex.] Some issues however are left open, and these will appear as 'Questions.'
Elementary bricks. The elementary bricks in the cortical composition system are cortical pyramidal cells. The rationale for this is as follows. We argued in the last section that $M$, our tentative Lego model of mind, should be viewed as embedded in a high-dimensional space. The axonal and dendritic branches of stellate cells extend only locally. Therefore, their connectivity graph is of dimension three. [Footnote: A simple way to define the dimension of a large graph is as follows. For any node $\alpha$ in the graph and for any integer $k$, let $B(\alpha, k)$ be the ball of center $\alpha$ and radius $k$, i.e., the set of all nodes $\beta$ such that $d(\alpha, \beta) < k$. The graph-theoretic distance $d(\alpha, \beta)$ is the length of the shortest path between $\alpha$ and $\beta$ (in cortex, $d(\alpha, \beta)$ is the shortest synaptic distance between neurons $\alpha$ and $\beta$). If, when $k$ grows, the number of nodes in $B(\alpha, k)$ grows, on average, like $C \times k^r$ for some positive $C$ and $r$, then the graph is said to be of dimension $r$. Any local graph in $\mathbb{R}^3$, e.g. the network of smooth stellate cells, is of dimension 3. If the number of nodes in $B(\alpha, k)$ grows faster than $k^r$ for all $r$, the dimension is infinite.] In contrast, the axons of pyramids always extend to the white matter, and often project to distant cortical areas. As a result, the connectivity graph of pyramidal cells, which is exclusively excitatory and accounts for perhaps as much as 80% of cortical synapses, is of high dimension. For practical purposes, the dimension of this graph may as well be said to be infinite. Almost equivalently, the average synaptic distance between two pyramidal cells (call it the 'effective diameter' of the graph) is very small, most likely less than 5, a vanishingly small number compared to the size of the graph, 10^^. In short, the graph of pyramidal cells can be said to ignore the three-dimensional topology of our physical world (which is a fairly unique feature for a physical system).
Binding apparatus. The binding apparatus of a pyramidal cell consists of the synapses it establishes with other pyramids, either as the pre- or as the post-synaptic element. We shall denote the synapse between presynaptic neuron $\alpha$ and postsynaptic neuron $\beta$ (ignoring the possibility of multiple synapses) $(\alpha, \beta)$. Note that, like a Lego brick, which has teeth and complementary concavities, a neuron is oriented, i.e., comes equipped with binding devices of two complementary types. Thus, for a given pyramidal cell $\alpha$, there are from a few thousands to a few tens of thousands of synapses (up to about 40,000 in human cortex) of type $(\alpha, \beta)$ and about the same number of type $(\beta, \alpha)$.
Question: Which synapses support binding? Does neuronal binding machinery include all pyramid-pyramid synapses, or does it consist only of a subclass of these, defined perhaps by anatomical, morphological, or pharmacological criteria?
Elementary bond. If $\alpha$ and $\beta$ are two cortical pyramids with a synapse $(\alpha, \beta)$, the two cells are bound with each other whenever the spikes they emit are paired with each other, i.e., the relative timing of these spikes respects the total transmission delay between $\alpha$ and $\beta$. The synapse $(\alpha, \beta)$ is then said to be in bound state. Note that elementary Lego bricks are all identical, hence any one of them can bind with any other one. In contrast, in our model, a pyramidal cell can bind only with monosynaptically connected cells, a large but finite set.
Question: What is the precise condition for spikes to be paired? The total transmission delay between $\alpha$ and $\beta$ is the time it takes for a spike emitted by $\alpha$ to trigger a spike in $\beta$. Is this delay uniquely defined? While axonal conduction times and synaptic transmission delays appear to be constant and reproducible (from a fraction of a millisecond to a few milliseconds for the former and about 1 millisecond for the latter), there is reason to doubt that this is the case of somato-dendritic integration times. From what is currently known about cortical pyramidal cells, it appears that the time it takes for the membrane potential at the axon hillock of $\beta$ to reach the firing threshold, measured from the time of arrival of an EPSP (Excitatory Postsynaptic Potential) delivered to $\beta$ by $\alpha$, depends, among other factors, on the number of other EPSPs that impinged on $\beta$ close to that time. Matters may be further complicated by the relative positioning of synapses on the membrane of $\beta$. In the case of interest here, where a large volley of EPSPs impinges almost synchronously on cell $\beta$ (see below), the integration time may still depend on the size of that packet of EPSPs [17].
Question: How stringent is the condition for pairing of spikes? Assuming that the transmission delay between $\alpha$ and $\beta$ has been defined, how strictly should the temporal relationship be satisfied for two spikes to form a pair, i.e., for $(\alpha, \beta)$ to be in bound state? It is not unreasonable to require millisecond accuracy; this, as we shall see, leads to the synfire model.
Binding is statistical pairing. Most cross-correlograms of pairs of cortical cells are flat. In some cases a small narrow peak with an offset of a few milliseconds is observed, indicating the presence of a direct excitatory connection between the two cells. Thus, at best a small fraction of all spikes, in a given presynaptic cell $\alpha$, are paired with a spike in a given postsynaptic cell $\beta$.
This observation is consistent with the high fan-in/fan-out intracortical connectivity, and, to many authors, is evidence that there is little accurate time structure in the multi-dimensional point process made up of the firing times of any collection of cortical cells. While we defend the opposite view, the fact remains that, from the point of view of any given synapse $(\alpha, \beta)$, the majority of spikes produced by either $\alpha$ or $\beta$ are unpaired according to our definition. Therefore, in the model proposed, the bond between $\alpha$ and $\beta$ necessarily resides in a statistical property of the two-dimensional point process: we shall say that the synapse is in bound state if a large enough fraction of all spikes emitted during $T$ are paired. One may for instance use a criterion based on comparing the frequency of observed pairs with the expected frequency of pairs under the hypothesis that the two processes are independent, with marginal distributions as observed. This still leaves a few questions open. In particular, are bonds binary-valued as they are in Lego, or is their strength gradual, depending on how many of the spikes emitted during $T$ are paired, and on how accurate the pairing is?
Question: How are bonds stabilized? The Lego binding apparatus uses a friction mechanism, exploiting the elasticity of plastic material, to create bonds that can be undone yet are stable. What is the physical mechanism for the stabilization of neuronal bonds? Since a neuronal bond consists of paired spikes, stabilizing the bond means persistently generating enough paired spikes during the entire duration of $T$; this mechanism should be reversible. These considerations are strongly suggestive of fast, reversible, Hebbian-type synaptic plasticity, as proposed by von der Malsburg [1], and as documented in a number of studies, e.g. [18].
Competition between bonds. While an elementary Lego brick $\alpha$ can bind with any other elementary brick, it cannot bind simultaneously with more than a few. This exclusion rule, in Lego, results from the limited amount of $\mathbb{R}^3$ space available in the immediate neighborhood of any brick $\alpha$. In cortex, competition, like binding, takes place on $\mathbb{R}$, the time axis. Whereas in Lego the exclusion rule is strict, cortical competition is likely to take the form of a soft constraint relying on inhibitory interactions, such as provided, for instance, by chandelier cells, synapsing on the initial segments of pyramidal-cell axons. The precise biological underpinnings of the competition mechanism need not be discussed here. Suffice it to remark, in the spirit of our derivation from first principles, that some form of competition or exclusion is likely to be enforced on the short time scale: there must be a mechanism to prevent a large fraction of the cells that are monosynaptically coupled to a given cell $\alpha$ from firing simultaneously in a small interval of time around the firing of that cell $\alpha$. In view of the vanishingly small 'effective diameter' of the cortical graph (see above), this is roughly equivalent to excluding epileptic states.
Elementary binding map. If $\alpha$ and $\beta$ are two cortical pyramids with a synapse $(\alpha, \beta)$ between them, the binding map $\phi_{\alpha\beta}$ is the shift, defined on the time interval $T$, which maps a spike of $\alpha$ onto a paired spike of $\beta$. The neuronal construct $A$ that consists of two neurons $\alpha$ and $\beta$ bound with each other through the binding map $\phi_{\alpha\beta}$ will be denoted $A = (\alpha|\phi_{\alpha\beta}|\beta)$. Binding maps are thus shifts, and, for a given pair of neurons, the amount of shift is equal to the transmission delay between the two neurons. The map may be different for different pairs of neurons, and, in particular, longer if the two neurons reside in distant cortical areas. As noted above, a neuronal binding map is defined on a limited time interval $T$ rather than on the entire time axis $\mathbb{R}$.
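The notions of a binding map as a time shift and of binding as statistical pairing can be illustrated with a small sketch. Several choices here are assumptions rather than part of the model: the function names are hypothetical, the tolerance defaults to 1 ms (in line with the millisecond accuracy suggested above), and the chance level is approximated by treating the postsynaptic cell as an independent Poisson process at its observed mean rate, which is only one simple reading of 'marginal distributions as observed.'

```python
import math
from bisect import bisect_left, bisect_right


def paired_fraction(pre_spikes, post_spikes, delay_ms, tolerance_ms=1.0):
    """Fraction of presynaptic spikes that have a postsynaptic spike
    within `tolerance_ms` of the shifted time t + delay_ms."""
    post = sorted(post_spikes)
    paired = 0
    for t in pre_spikes:
        lo = bisect_left(post, t + delay_ms - tolerance_ms)
        hi = bisect_right(post, t + delay_ms + tolerance_ms)
        if hi > lo:  # at least one postsynaptic spike falls in the window
            paired += 1
    return paired / len(pre_spikes) if pre_spikes else 0.0


def expected_fraction(post_spikes, interval_ms, tolerance_ms=1.0):
    """Chance level under independence (assumption: Poisson postsynaptic
    firing at the mean rate observed over an interval of `interval_ms`)."""
    rate = len(post_spikes) / interval_ms
    return 1.0 - math.exp(-rate * 2.0 * tolerance_ms)


# Spike times in ms within an interval T of length 1000 ms.
pre = [10.0, 50.0, 90.0]
post = [13.0, 70.0, 93.5]
observed = paired_fraction(pre, post, delay_ms=3.0)
chance = expected_fraction(post, interval_ms=1000.0)
```

A synapse could then be declared to be in bound state when `observed` exceeds `chance` by some margin. The size of that margin, and whether the resulting bond is binary or graded, is exactly the kind of question the text leaves open.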
This stands in sharp contrast with Lego binding maps, which are defined on the entire space $\mathbb{R}^3$.

Constraints on elementary binding maps. In Lego, the binding map $\phi$ belongs to a small set of transformations of $\mathbb{R}^3$. In cortex, the binding map $\phi_{\alpha\beta}$ for a given pair of neurons $\alpha$ and $\beta$ is unique; this map, a simple time shift, can in principle be computed from anatomical/morphological data. We shall therefore use the notation $A = (\alpha|\beta)$ rather than $A = (\alpha|\phi_{\alpha\beta}|\beta)$. Thus, $(\alpha, \beta)$ denotes the synapse between $\alpha$ and $\beta$, whether in bound or unbound state, while $(\alpha|\beta)$ denotes the two-neuron construct consisting of $\alpha$, $\beta$, and the synapse $(\alpha, \beta)$ in bound state.

So far, we discussed the mechanism of elementary binding in cortex. We now extend this discussion to the recursive binding of composite constructs, following again the Lego example.

$N$-neuron constructs. We consider a collection $S$ of $N$ pyramidal cells, with $N$ large.* For some pairs of cells $\alpha \in S$, $\beta \in S$, there exists a synapse $(\alpha, \beta)$; further, some of these synapses are in bound state during $T$, i.e., are composed into a 2-neuron construct $(\alpha|\beta)$. A collection $A$ of 2-neuron constructs is said to be an $N$-neuron construct with support $S$ if (i) $A$ covers all the $N$ neurons of $S$, and (ii) it forms a connected graph. These two requirements are self-explanatory, but notice that $A$ need not include all active bonds between pairs of cells in $S$. In effect, $A$ will generally consist of a strict subset of the collection of all 2-neuron constructs made from neurons of $S$. In contrast, an $N$-brick Lego construct $A$ obviously includes all the two-brick subobjects of $A$. The reason why we leave out some bonds when defining $A$ is that a given neuron may take part simultaneously in several distinct multi-neuron constructs. As we shall see below, the binding diagram of each construct separately will then be commutative, yet the union of these binding diagrams will not. This situation, which we shall examine below in more detail in the case of synfire chains, does not arise at all in Lego.

* In Lego we did not need a name for a collection of bricks: since all elementary Lego bricks are interchangeable, any two $N$-brick collections are equivalent. Neurons, however, due to their connectivity, are not interchangeable.

Binding diagram. As for Lego, we define the binding diagram $\Psi(A)$ in two steps. For the moment, we define $\Psi(A)$ to be the collection of all elementary binding maps $\phi_{\alpha\beta}$, for all 2-neuron constructs $(\alpha|\beta)$ in $A$.

Commutativity. The binding diagram $\Psi(A)$ is commutative if the composition of elementary binding maps along different $A$-chains with the same first and last neurons is independent of the chain. Said otherwise, the transmission delay is the same, or very nearly the same, along all $A$-chains that start with the same neuron and end with the same neuron. For instance, if $A$ includes the synaptic bonds $(\alpha|\beta)$, $(\alpha|\gamma)$, $(\beta|\delta)$, $(\gamma|\delta)$, then the total transmission time from $\alpha$ to $\delta$ has to be the same whether one uses the $\beta$ path or the $\gamma$ path. Note that commutativity, even when defined, as it is now, on binding diagrams that include only monosynaptic bonds, is not granted. There are essentially three, non-exclusive, ways in which commutativity can be violated: (i) some $A$-chains start with the same neuron and end with the same neuron yet have different synaptic lengths; (ii) with reference to the above four-neuron example, the three neurons $\alpha$, $\beta$ and $\delta$ sit in the same cortical area whereas $\gamma$ is located in a distant area; (iii) different synapses have very different locations on the somato-dendritic membranes. Situation (i) can cause arbitrarily large violations of commutativity.
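The commutativity condition on the four-neuron example can be made concrete with a small sketch. This is only an illustration under stated assumptions, not part of the model itself: the function names `path_delays` and `is_commutative`, the delay values, and the 0.5 ms tolerance standing in for "very nearly the same" are all hypothetical, and the bond graph is assumed to be loop-free.

```python
from collections import defaultdict


def path_delays(bonds, source, target):
    """Total delay (ms) of every directed chain of bonds from source to target.

    `bonds` maps a (pre, post) pair to a transmission delay in ms.
    Assumption: the bond graph has no loops, so the walk terminates.
    """
    successors = defaultdict(list)
    for (pre, post), delay in bonds.items():
        successors[pre].append((post, delay))

    totals = []
    stack = [(source, 0.0)]
    while stack:
        node, elapsed = stack.pop()
        if node == target:
            totals.append(elapsed)
            continue
        for nxt, delay in successors[node]:
            stack.append((nxt, elapsed + delay))
    return totals


def is_commutative(bonds, tolerance_ms=0.5):
    """True if all chains sharing first and last neuron have nearly equal delay.

    The tolerance is a hypothetical stand-in for 'very nearly the same'.
    """
    neurons = {neuron for pair in bonds for neuron in pair}
    for first in neurons:
        for last in neurons:
            if first == last:
                continue
            totals = path_delays(bonds, first, last)
            if totals and max(totals) - min(totals) > tolerance_ms:
                return False
    return True


# Four-neuron example: alpha reaches delta through beta or through gamma.
local = {
    ("alpha", "beta"): 1.0,
    ("alpha", "gamma"): 1.0,
    ("beta", "delta"): 1.0,
    ("gamma", "delta"): 1.0,
}
# Violation of type (ii): gamma sits in a distant area, so its bond is slow.
distant = dict(local)
distant[("alpha", "gamma")] = 8.0

print(is_commutative(local))    # True
print(is_commutative(distant))  # False
```

Enumerating all chains is adequate for a handful of neurons; for large constructs one would rather test whether a single consistent firing time can be assigned to every neuron.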
If we pick a large collection of neurons $S$ at random, chances are that the construct which includes all the synapses in bound state between neurons of $S$ will be highly non-commutative. Certainly, by leaving out enough bonds, which we can always do according to our definition of an $N$-neuron construct, we may obtain a commutative construct $A$. Commutativity is for instance trivially realized if the graph $A$ is loop-free, i.e., if it is a tree. However, a construct obtained in this way need not be stable (see below). The existence of stable neural constructs is thus non-trivial. In the same way as it takes non-trivial engineering to create a Lego game that satisfies the commutativity axiom (cf. the Mego experiment in §6), it may take a specific ontogenetic development mechanism to equip the cortical network with large numbers of stable commutative constructs (see below).

Considering now a commutative construct $A$ with support $S$, we extend, as in Lego, the notion of a binding map to all pairs of neurons in $S$. Since $A$ is a connected graph, there exists a chain of monosynaptic bonds in $A$ between any two neurons $\alpha$ and $\beta$ in $S$, and we define the binding map $\phi_{\alpha\beta}$ as the composite map along this chain. Thanks to commutativity, the map is well defined. We now modify the definition of the binding diagram: $\Psi(A)$ will include the map $\phi_{\alpha\beta}$ for any pair of neurons $\alpha$ and $\beta$ in $S$; $\Psi(A)$, which includes monosynaptic bonds as well as polysynaptic bonds, is commutative. Note that a monosynaptic binding map $\phi_{\alpha\beta}$ is a short shift, generally of the order of 1 ms, and never much longer than 10 ms. On the other hand, a binding map $\phi_{\alpha\beta}$ between an arbitrary pair of neurons in $A$ can be an arbitrarily long shift, which, in view of the above remark, is undesirable. In principle, we should impose a limit on the length of
multi-neuron constructs.

Stable constructs. We give a loose definition of stability, following the Lego example: a construct is stable provided it contains enough elementary bonds that cooperate with each other. As in Lego, cooperativity is equivalent to the commutativity of the binding diagram. The physiological interpretation of cooperativity of neuronal bonds is as follows. Consider again the four-neuron example, where $\alpha$ is connected to $\delta$ through two different disynaptic pathways, one through $\beta$, the other through $\gamma$. As noted above, commutativity of this mapping diagram means that the total transmission time from $\alpha$ to $\delta$ is the same whether one uses the $\beta$ path or the $\gamma$ path. This condition favors the impinging of two simultaneous spikes on $\delta$, since the firing of $\alpha$ tends to cause the simultaneous firing of $\beta$ and $\gamma$. This, in turn, favors the spiking of $\delta$, since cortical neurons are triggered most effectively by simultaneous EPSPs.

However, as in Lego, commutativity per se does not grant stability. There must be enough cooperativity for the entire construct to hold together. Physiologically, two EPSPs impinging simultaneously on neuron $\delta$ will most of the time be insufficient to bring $\delta$ to fire. Estimates vary as to how many synchronous EPSPs are required to bring the membrane potential to firing, starting from resting level; a figure of 25 appears reasonable [5]. Thus, while Lego constructs need not be large to be stable, neuronal constructs cannot be stable until they reach a critical size; stability of binding is a collective property. This picture is consistent with the assumption that the stabilization of monosynaptic bonds relies on fast Hebbian-type plasticity (see above). If every neuron receives, from time to time, a large packet of synchronous EPSPs, all the synapses implicated will be reinforced. This in turn will favor the occurrence of paired spikes at these synapses, i.e., stabilize all the bonds in the construct.
This principle, consisting of short-term feedback between converging/diverging configurations of synaptic weights and packets of synchronous spikes, was originally proposed by von der Malsburg [1].

We shall now illustrate the mechanism of stable binding with the example of a synfire chain. We assume that the reader is familiar with the definition and properties of synfire chains [4,5,19]; for a summary, see [10]. In a synfire chain as defined in [5], all elementary transmission delays are the same, say 1 ms. If the chain is strictly feedforward, i.e., if a given neuron occurs only once in the chain, and if the chain operates in noise-free conditions, all spikes are paired, hence all synapses, according to our definition, are in bound state. The mapping diagram $\Psi(A)$ of the chain is the following: if $\alpha$ is in pool $i$ and $\beta$ is in pool $j$, then the map $\phi_{\alpha\beta}$ is a shift of $j - i$ ms. For such a synfire chain $A$, $\Psi(A)$ is evidently commutative. Stability of $A$ as defined above is moreover equivalent to stability of synfire transmission [5]. The conditions that guarantee stability in a synfire chain have been investigated numerically and analytically by a number of authors [5,19,10,17,20]. As shown in these studies, a key parameter for stability is the multiplicity of the chain, i.e., the degree of convergence/divergence between successive pools; multiplicity is the synfire-chain equivalent of our general notion of cooperativity.

The definition of a synfire chain can be generalized to allow non-uniform transmission times. Thus, in the above four-neuron example, the condition of commutativity may be met while the four individual transmission delays are all different. The resulting
network, termed a synfire braid in [10], is again a stable construct, provided there is enough cooperativity between elementary bonds.

As documented in a number of numerical studies, synfire transmission is resistant to various types of noise, either in the firing behavior of neurons or in the connectivity. Of particular interest is the case of chains with feedback, obtained by allowing any given neuron to recur in several pools [5,10]. In a chain with feedback, commutativity is clearly violated. However, the definition of a construct $A$ on a given support $S$ (see above) allows us to ignore some of the bonds between neurons of $S$. Thus, if we ignore the 'feedback' connections, resulting from the recurrence of neurons in the chain, we obtain a subgraph $A$ with the two properties required for stability: (i) $\Psi(A)$ is commutative and (ii) there is enough cooperativity to ensure stable synfire transmission. Note that the bonds that have to be ignored to make a given portion of the chain commutative (we may call these bonds 'spurious' from the perspective of that portion of the chain) may be essential to the stability of transmission in other portions of the chain. Our definitions of commutativity and stability nevertheless apply locally in the chain. Equivalently, one may consider that different portions of one chain are different chains altogether. In this situation, several strictly feedforward synfire chains are superimposed on one given collection of $N$ neurons; the synfire bonds that make up one of the chains are spurious, i.e., violate commutativity, from the perspective of other chains. When kept within limits, this violation of commutativity does not interfere with the stability of each of the chains sharing these same $N$ neurons.
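The following sketch illustrates, on a toy chain, why a neuron recurring in several pools breaks commutativity while each feedforward portion remains commutative. It rests on assumptions that go beyond the text: pools are fully connected to their successors, all delays are 1 ms, and commutativity is tested in a slightly stronger form, namely that every neuron can be given a single firing time compatible with all bonds. The names `chain_bonds` and `assign_firing_times` are hypothetical.

```python
def chain_bonds(pools, delay_ms=1.0):
    """Bonds of a synfire chain, assuming all-to-all links between successive pools."""
    bonds = {}
    for pool, next_pool in zip(pools, pools[1:]):
        for pre in pool:
            for post in next_pool:
                bonds[(pre, post)] = delay_ms
    return bonds


def assign_firing_times(bonds, tolerance_ms=1e-9):
    """Try to give each neuron one firing time t with t[post] - t[pre] == delay.

    Returns a dict of relative firing times, or None when two chains of bonds
    disagree, i.e. when the diagram is not commutative (in the stronger sense
    assumed here, where shifts may also be followed backwards).
    """
    neighbours = {}
    for (pre, post), delay in bonds.items():
        neighbours.setdefault(pre, []).append((post, delay))
        neighbours.setdefault(post, []).append((pre, -delay))

    times = {}
    for start in neighbours:
        if start in times:
            continue
        times[start] = 0.0
        stack = [start]
        while stack:
            node = stack.pop()
            for other, shift in neighbours[node]:
                expected = times[node] + shift
                if other not in times:
                    times[other] = expected
                    stack.append(other)
                elif abs(times[other] - expected) > tolerance_ms:
                    return None
    return times


# Toy chain with feedback: neuron "a" recurs in the first and the last pool.
pools = [("a", "b"), ("c", "d"), ("e", "f"), ("a", "g")]

whole_chain = assign_firing_times(chain_bonds(pools))
early_part = assign_firing_times(chain_bonds(pools[:3]))
late_part = assign_firing_times(chain_bonds(pools[1:]))

print(whole_chain)                        # None: "a" would need two firing times
print(early_part["e"] - early_part["a"])  # 2.0
print(late_part["a"] - late_part["c"])    # 2.0
```

Each portion taken alone admits consistent firing times, whereas their union does not, which is the sense in which the bonds of one portion are 'spurious' from the perspective of the other.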
What allows the chains to function independently of each other while sharing the same neuronal support is that (i) the incoming/outgoing binding apparatus of each neuron can accommodate many simultaneous bonds, and (ii) binding as expressed in terms of pairing of spikes is a statistical property, i.e., does not require that all spikes, presynaptic or postsynaptic to a given bond, be paired along that bond. Again, there is no Lego equivalent to this behavior, since it requires a substrate of high dimensionality (see above). The limits within which commutativity can be violated in this way without affecting synfire transmission are investigated in [10], under a simple synfire-superposition model. With the numerical values proposed in §8 of that paper, the critical value at which stability of synfire transmission breaks down corresponds to an overlap of about 400 synfire links per neuron. This means that, on average, the proportion of spikes that are paired along a given synapse among all presynaptic (or postsynaptic) spikes 'seen' by that synapse is 1/400. While this figure is a mere order of magnitude, it is consistent with the thesis [4,1,9,5] that cortex may support exquisitely accurate time structure of a type that may hardly show in pairwise correlograms.

In sum, synfire chains, with or without feedback, provide an example of a family of stable neuronal constructs for our Lego-inspired model. This model might perhaps accommodate constructs of some other type, yet synfire chains have, at least in the hands of one group of researchers, received convincing experimental support. Moreover, synfire chains can be shown to grow spontaneously, in an initially disordered graph, under the effect of slow Hebbian plasticity with a few additional mild conditions [21-23]. This growth mechanism may be seen as the counterpart, in the Lego
metaphor, of the design of a regular lattice of binding devices.* We shall now, in effect, pursue this metaphor in the case of composite constructs, and show that synfire chains, assuming they exist in cortex in large numbers, provide a substrate for recursive construction.

* This mechanism may also be viewed as a natural selection principle, with cooperativity playing the role of fitness.

Recursive binding of neuronal constructs. From now on we consider stable constructs of synfire type only. Let $A_1$ and $A_2$ be two synfire chains, of supports $S_1$ and $S_2$, with binding diagrams $\Psi(A_1)$ and $\Psi(A_2)$. Remember that these diagrams generally do not include all the bonds within $S_1$ and $S_2$, since the chains generally include feedback connections. $A_1$ and $A_2$ are bound into a stable construct $B$ whenever there are enough cooperating monosynaptic bonds of type $(\alpha_1|\alpha_2)$, with $\alpha_1 \in A_1$ and $\alpha_2 \in A_2$. As in Lego, these bonds, extended through $\Psi(A_1)$ and $\Psi(A_2)$ to polysynaptic bonds between any pair of neurons $\alpha_1 \in A_1$, $\alpha_2 \in A_2$, create a composite binding map $\Phi$, and we write $B = (A_1|\Phi|A_2)$. Physiologically, the mechanism of binding of the two chains $A_1$ and $A_2$ is the synchronization of the synfire waves that travel down each of the two chains. This mechanism has been studied in particular by [19,10,17], who showed that, thanks to the intrinsic stability of each of the two chains, a small number of bonds between them is enough to ensure their binding.

Note that, in order for $A_1$ and $A_2$ to be able to bind with each other, the collection of synaptic connections between them should include a large enough, spatio-temporally coherent, subset. This constraint, through which the composition mechanics of neuronal constructs is made to depend on development and learning, has no counterpart in Lego world. Our formal treatment of composite binding maps for Lego (§7) however applies, mutatis mutandis, to neuronal constructs. In particular, the binding diagram of the composite construct $B$ is the disjoint union of the binding diagrams $\Psi(A_1)$ and $\Psi(A_2)$ with the binding map $\Phi$. Further, the neuronal interpretation of the binding map $\Phi$ is, as in Lego, a map between maps: the commutativity of $\Psi(B)$ allows us to view $\Phi$ as transforming the diagram $\Psi(A_1)$ into the diagram $\Psi(A_2)$. Each of $\Psi(A_1)$ and $\Psi(A_2)$ is, essentially, a mapping of a collection of neurons, $S_1$ or $S_2$, onto the time axis. Thus, $\Phi$ is, in essence, a mapping of the time axis onto itself.†

† As usual, $\Phi$ is defined only locally, on an interval of say 1000 ms.

Finally, as in Lego, we may consider bonds of lock-and-key type. The bond between the two constructs $A_1$ and $A_2$ is of lock-and-key type if there is only one binding map between them; we may then write $B = (A_1|A_2)$. We may also view $A_2$ as equivalent to a binding map in the construct $(A_1|A_2|A_3)$, if the two bonds in this construct are of lock-and-key type.

With neuronal constructs, bonds of this lock-and-key type will arise naturally from Hebbian plasticity. Consider two synfire chains $A_1$ and $A_2$ that grew and functioned independently of each other up to a certain time. At this time, $A_1$ and $A_2$ are not connected with each other in a spatio-temporally coherent way, as required for binding. Synaptic connections are likely to exist between individual neurons of $A_1$ and $A_2$, since some of these pairs of neurons belong to a common third chain (each neuron, as seen above, is shared by many, perhaps hundreds, of synfire chains); however, these connections between $A_1$ and $A_2$ are unlikely to form a coherent spatio-temporal pattern. When, in an appropriate functional context, the two chains are activated together, Hebbian plasticity will result in the development of a coherent system of dynamical bonds between them. Since this composite bond between $A_1$ and $A_2$ is the only one at that point, it is of the lock-and-key type.
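The requirement that the bonds between two chains form a spatio-temporally coherent subset can be sketched as follows. Each cross bond, together with the firing times of its two neurons within their own chains, implies one candidate value for the shift $\Phi$; bonds are coherent when they imply the same shift. Everything specific here is an assumption made for illustration: the function names, the 1 ms resolution used to group shifts, the firing times, and the threshold of three agreeing bonds standing in for "enough cooperating bonds".

```python
from collections import Counter


def implied_shifts(times_1, times_2, cross_bonds, resolution_ms=1.0):
    """Count how many cross bonds support each candidate shift between two chains.

    times_1, times_2: firing time (ms) of each neuron relative to the start of
    the synfire wave in its own chain.
    cross_bonds: (pre in chain 1, post in chain 2, delay in ms) triples.
    A bond is paired if the wave in chain 2 starts `shift` ms after the wave
    in chain 1, with shift = times_1[pre] + delay - times_2[post].
    """
    counts = Counter()
    for pre, post, delay in cross_bonds:
        shift = times_1[pre] + delay - times_2[post]
        counts[round(shift / resolution_ms) * resolution_ms] += 1
    return counts


def candidate_binding_maps(counts, min_bonds=3):
    """Shifts supported by at least `min_bonds` bonds (hypothetical threshold)."""
    return sorted(shift for shift, n in counts.items() if n >= min_bonds)


def is_lock_and_key(counts, min_bonds=3):
    """True when exactly one shift is supported by enough cooperating bonds."""
    return len(candidate_binding_maps(counts, min_bonds)) == 1


# Two four-pool chains, one neuron per pool for brevity, 1 ms between pools.
times_1 = {"a1": 0.0, "a2": 1.0, "a3": 2.0, "a4": 3.0}
times_2 = {"b1": 0.0, "b2": 1.0, "b3": 2.0, "b4": 3.0}

cross_bonds = [
    ("a1", "b1", 2.0),  # coherent subset: all four imply a shift of 2 ms
    ("a2", "b1", 1.0),
    ("a3", "b2", 1.0),
    ("a4", "b3", 1.0),
    ("a1", "b4", 1.0),  # stray connection, implies -2 ms
    ("a4", "b1", 1.0),  # stray connection, implies 4 ms
]

counts = implied_shifts(times_1, times_2, cross_bonds)
print(candidate_binding_maps(counts))  # [2.0]
print(is_lock_and_key(counts))         # True
```

In this toy case the stray connections play the role of the incoherent synapses that exist between any two chains, while the four agreeing bonds are the kind of coherent system that Hebbian plasticity is assumed to develop.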
11. LANGUAGE, LEGO, AND NEURONS

Commutative mapping diagrams are the key mathematical concept which, in our proposal, should help us understand mental composition in terms of neural mechanisms. We introduced the notion of a binding map with the Lego example. We defined the binding map $\Phi$ between two complex Lego constructs $A_1$ and $A_2$ as a composite map between two systems of maps, the binding diagrams of $A_1$ and of $A_2$. $\Phi$ was thus defined as a map between maps. When each of $A_1$ and $A_2$ is in turn envisaged as a composite object constructed from simpler objects, $\Phi$ becomes a map between maps between maps. This recursion, we proposed, is suggestive of the recursive construction of meaning during mentation.

However, the composite Lego construct $(A_1|\Phi|A_2)$ is also, quite conspicuously, a map from the mere 3-D space $\mathbb{R}^3$ onto itself. Specifically, after we provide a thorough description of $A_1$ and $A_2$ separately, all it takes to further specify the composite $(A_1|\Phi|A_2)$ is merely the rigid $\mathbb{R}^3$ transformation that takes any given brick of $A_1$ into any given brick of $A_2$. And since all elementary Lego bricks are interchangeable, all it takes to provide a thorough functional description of an $N$-brick construct $A_1$ in the first place is a collection of $N - 1$ binding maps in $\mathbb{R}^3$. Thus, if it were only for Lego, we would have no use for maps between maps. In short, Lego provided only a convenient excuse for introducing our mathematical tools. These tools are quite unnecessary to the analysis of Lego itself.

Neuronal composition however is immensely more complex. Each and every neuron has an identity, associated with the graph $M$ of permanent synaptic connections in which the neuron is embedded (§9). If we were given a thorough functional description of two complex neuronal constructs, say two synfire chains $A_1$ and $A_2$, all it would take to specify a composite $(A_1|\Phi|A_2)$ would, indeed, be a mere shift from the time axis $\mathbb{R}$ (or from a 1000-ms time interval $T$) onto itself.
Yet what is a thorough functional description of a synfire chain embedded in cortex? We would certainly need to include in it a description of the identity of each neuron $\alpha$ in the chain. The amount of information that this would entail is gigantic: consider that we would have to specify the list of all cortical neurons presynaptic and postsynaptic to $\alpha$, in a format that would allow each of these neurons to be essentially any of the $10^{10}$ pyramidal cells in cortex.

Shall we conclude that the Lego metaphor, in the end, is useless? Perhaps not. Recall our attempt, in §8, to interpret the mental construct HAVE, used in the construction of the meaning of linguistic string 1, as consisting of an embedded system of commutative mapping diagrams, paralleling our (correct if unnecessary) account of Lego mechanics.
In light of our Lego-inspired model of neural composition, this attempt may be viewed as resting on the assumption that a functional description of a large neuronal construct $A_1$ can indeed be achieved much more succinctly than by listing all the external connections of each and every neuron in $A_1$. Such a description may for instance be one of the many possible descriptions of the mental construct HAVE, or of the mental construct BANK, or of the mental construct ROLLERBLADES. Unlike Lego constructs, for which a simple unique description can be achieved, mental constructs are never uniquely or thoroughly described. A mental construct can only be viewed as an embedded system of maps. Formally, however, the composition dynamics of mental constructs is identical to that of Lego constructs. The binding of two mental constructs is generally highly constrained, often of lock-and-key type, thanks to the precision of the wiring of $M$. These lock-and-key constraints allow us, for instance, to refer to the construct DEBORAH HAS ROLLERBLADES as the only construct that can possibly be assembled from DEBORAH, HAVE, and ROLLERBLADES (ignoring, as above, the tense and other modifiers).

In other words, the connectivity graph $M$, which, in our proposal, serves as a substrate for the composition mechanics of neuronal constructs, can be analyzed at two very different levels. When scrutinized at the neural level, $M$ may conveniently be viewed as a manifold embedded in a very high-dimensional, practically infinite-dimensional, space. While this level of analysis may provide some insight into the elementary neural mechanisms of binding, it is of little use when we try to account for the specifics of composition at the cognitive level. At that level, no reference is made to individual neurons any more.
Rather, with the help of what may appear to some as a leap of faith, we assume that the Lego analogy, plausible at the microscopic level, carries over to the level of highly embedded systems of maps between maps.

Our proposal leaves us, it seems, with more questions than we set out with. In addition to the numerous far-from-resolved issues, mentioned above, concerning the implementation of the composition dynamics at the neural level proper, we hardly touched upon the issue of how to translate regularities, say of a linguistic nature, into composition rules for neuronal constructs. To take but one example, imagine we view the neural representation of HAVE, used in a given linguistic context such as string 1, as the activation of a regular synfire chain $A$, with discrete pools numbered 1 to $p$. One may then, for instance, consider the following neural model for predicate instantiation. The binding of DEBORAH to HAVE with DEBORAH in the subject role will be achieved by the establishment of bonds between the synfire chain corresponding to DEBORAH and neurons in even-numbered pools of $A$; in contrast, odd-numbered pools of $A$ would provide the sites of binding for a direct object, e.g. ROLLERBLADES.* In our current state of knowledge, and lack thereof, about brain function, this is one out of a virtually limitless number of possibilities. As such, it is almost certainly wrong, even in outline. Nevertheless, there is a possibility [25] that some issues pertaining to the mechanisms of acquisition of linguistic regularities might be fruitfully addressed within the framework proposed here, using simple organization principles related to theories of natural selection.

* This is a rough synfire-chain equivalent of the neural-oscillator model of predicate instantiation proposed in [24].
Finally, our proposal may be of interest in the context of the recently much debated issue of the neural mechanisms of consciousness. In §9, we proposed to relate the 'mobility' of mental material, a concept which arose from our futile attempt to map mind into a low-dimensional space, to the high dimensionality of the graph of cortical pyramids. Our suggestion is that this high dimensionality, a fairly unique feature of evolved brains, might be responsible for our ability to bring together mental domains which do not, a priori, belong together.* One of the most striking capacities of the human mind is indeed that of drawing a common thread between disparate experiences. Such experiences may vary enormously in terms of sensory modality, time, context, etc.; we are nevertheless remarkably apt at connecting and integrating them with each other, revising them as needed, and synthesizing from them various multi-dimensional pictures, eventually resulting in encompassing, multi-faceted representations of the world. It has been proposed [26] that the mechanisms underlying this capacity might rely on the synchronization of large populations of cortical neurons that oscillate more or less regularly at some frequencies near 40 or 50 Hz. While the general principle of binding through the establishment of spatio-temporal relationships [1] appears not only basically sound but almost inevitable, the role of neuronal oscillators in this context might turn out to be quite limited. Simply put, the synchronization of oscillators, while allowing the binding of any collection of entities into a small number of classes, is hardly, however one wants to look at it, a mechanism for recursive composition. If our 'derivation from first principles' is anything near reality, our brains make full use of the high dimensionality of the connection graph of cortical pyramids. In this view, meaning is not located in 3-D space, neither in the form of cell-assembly activation, nor in the form of cell-assembly oscillation. Rather, meaning is to be sought in cortical events which are both well-timed and accurately defined in neuronal space, and which therefore almost completely elude current investigation techniques.

* As a further illustration, consider the titles of the various sections of the present chapter.

REFERENCES

1. von der Malsburg, C. 1981. The correlation theory of brain function. Internal Report 81-2, Max-Planck Institute for Biophysical Chemistry, Dept. of Neurobiology, Göttingen, Germany.
2. Gray, C.M., König, P., Engel, A.K., and Singer, W. 1989. Oscillatory responses in cat visual cortex exhibit inter-columnar synchronization which reflects global stimulus properties. Nature, 338:334-337.
3. Eckhorn, R., Bauer, R., Jordan, W., Brosch, M., Kruse, W., Munk, M., and Reitboeck, H.J. 1988. Coherent oscillations: a mechanism of feature linking in the visual cortex? Biol. Cybernetics, 60:121-130.
4. Abeles, M. 1982. Local cortical circuits: An electrophysiological study. Springer-Verlag, Berlin.
5. Abeles, M. 1991. Corticonics: Neuronal circuits of the cerebral cortex. Cambridge University Press, Cambridge, UK.
6. Abeles, M., Bergman, H., Margalit, E., and Vaadia, E. 1993a. Spatiotemporal Firing Patterns in the Frontal Cortex of Behaving Monkeys. J. Neurophysiol., 70(4):1629-1638.
7. von der Malsburg, C. 1987. Synaptic Plasticity as Basis of Brain Organization. In The Neural and Molecular Bases of Learning, J.P. Changeux and M. Konishi, Eds., Wiley and Sons, pp. 411-432.
8. Milner, P.M. 1974. A model for visual shape recognition. Psychol. Review, 81:521-535.
9. von der Malsburg, C., and Bienenstock, E. 1986. Statistical coding and short-term synaptic plasticity: A scheme for knowledge representation in the brain. In Disordered Systems and Biological Organization, E. Bienenstock, F. Fogelman, and G. Weisbuch, Eds., Springer-Verlag, Berlin, pp. 247-272.
10. Bienenstock, E. 1995. A model of neocortex. Network: Computation in Neural Systems, 6:179-224.
11. Miyashita, Y., and Chang, H.S. 1988. Neuronal correlate of pictorial short-term memory in the primate temporal cortex. Nature, 331:68-70.
12. Hebb, D.O. 1949. The Organization of Behavior. Wiley, New York.
13. Hopfield, J.J. 1982. Neural Networks and Physical Systems with Emergent Collective Computational Abilities. Proc. Natl. Acad. Sci. U.S.A., 79:2554-2558.
14. Amit, D.J. 1989. Modeling Brain Function. Cambridge University Press.
15. Webster's Third New International Dictionary of the English Language, 1986. Merriam-Webster, Springfield, MA.
16. Langacker, R.W. 1987. Foundations of Cognitive Grammar, Vol. 1, Theoretical Prerequisites. Stanford University Press.
17. Arnoldi, H.M.R., and Brauer, W. 1995. Synchronization without oscillatory neurons. Biol. Cybernetics, in press.
18. Zucker, R.S. 1989. Short-term synaptic plasticity. Annu. Rev. Neurosci., 12:13-31.
19. Abeles, M., Vaadia, E., Bergman, H., Prut, Y., Haalman, I., and Slovin, H. 1993. Dynamics of Neuronal Interactions in the Frontal Cortex of Behaving Monkeys. Concepts in Neuroscience, 4:131-158.
20. Diesmann, M., Gewaltig, M.O., and Aertsen, A. 1995. Characterization of synfire activity by propagating 'pulse packets.' In Proc. of the Conference Computation and Neural Systems '95, in press.
21. Bienenstock, E. 1991. Notes on the growth of a composition machine. In Proceedings of the Royaumont Interdisciplinary Workshop on Compositionality in Cognition and Neural Networks I, D. Andler, E. Bienenstock, and B. Laks, Eds., pp. 25-43.
22. Doursat, R. 1991. Contribution à l'étude des représentations dans le système nerveux et dans les réseaux de neurones formels. Ph.D. Thesis, Université Paris VI, Paris, France.
23. Bienenstock, E., and Doursat, R. The Hebbian Development of Synfire Chains. In preparation.
24. Shastri, L., and Ajjanagadde, V. 1993. From simple associations to systematic reasoning: A connectionist representation of rules, variables and dynamic bindings. Behavioral and Brain Sciences, 16:417-494.
25. Bienenstock, E. 1992. Suggestions for a Neurobiological Approach to Syntax. In Proceedings of the Royaumont Interdisciplinary Workshop on Compositionality in Cognition and Neural Networks II, D. Andler, E. Bienenstock, and B. Laks, Eds., pp. 13-21.
26. Crick, F., and Koch, C. 1990. Towards a neurobiological theory of consciousness. Seminars in the Neurosciences, 2:263-275.