List of Contributors
L.F. Abbott, Volen Center for Complex Systems and Department of Biology, Brandeis University, Waltham, MA 02454-9110, USA
E. Ahissar, Department of Neurobiology, The Weizmann Institute of Science, Rehovot 76100, Israel
L.A. Baccalá, Av. Prof. Luciano Gualberto, Trav. 3, #158, CEP 05508-900, São Paulo, SP, Brazil
J.G. Bjaalie, Department of Anatomy, Institute of Basic Medical Sciences, University of Oslo, P.O. Box 1105 Blindern, N-0317 Oslo, Norway
E. Covey, Department of Psychology, University of Washington, Box 351525, Seattle, WA 98195, USA
E. De Schutter, Born-Bunge Foundation, University of Antwerp, Universiteitsplein 1, B2610 Antwerp, Belgium
S.A. Deadwyler, Department of Physiology and Pharmacology, Wake Forest University School of Medicine, Winston-Salem, NC 27157-1084, USA
H.R. Dinse, Institute for Neuroinformatics, Theoretical Biology, Ruhr University Bochum, Bochum, Germany
J.P. Donoghue, Department of Neuroscience and Division of Applied Mathematics, Brown University, Providence, RI 02912, USA
R.P. Erickson, Departments of Psychology, Experimental and Neurobiology, Duke University, Durham, NC 27708, USA
E.E. Fetz, Department of Physiology and Biophysics and the Regional Primate Research Center, University of Washington, Seattle, WA 98195, USA
W.A. Freiwald, Institute for Brain Research, University of Bremen, FB2, P.O. Box 330440, D-28334 Bremen, Germany
B.H. Gaese, Institut für Biologie II, RWTH Aachen, Kopernikusstr. 16, D-52074 Aachen, Germany
Y. Garbourg, Department of Physiology and Neuroscience, New York University School of Medicine, 550 First Avenue, New York, NY 10016, USA
R.E. Hampson, Department of Physiology and Pharmacology, Wake Forest University School of Medicine, Winston-Salem, NC 27157-1083, USA
M.T. Harrison, Department of Neuroscience and Division of Applied Mathematics, Brown University, Providence, RI 02912, USA
N.G. Hatsopoulos, Department of Neuroscience and Division of Applied Mathematics, Brown University, Providence, RI 02912, USA
N. Jain, Department of Psychology, Vanderbilt University, 301 Wilson Hall, 111 21st Avenue South, Nashville, TN 37240, USA
D. Jancke, Institute for Neuroinformatics, Theoretical Biology, Ruhr University Bochum, Bochum, Germany. Present address: The Weizmann Institute of Science, Rehovot, Israel
J.H. Kaas, Department of Psychology, Vanderbilt University, 301 Wilson Hall, 111 21st Ave. South, Nashville, TN 37240, USA
J.S. Kauer, Department of Neuroscience, Tufts University School of Medicine, 136 Harrison Avenue, Boston, MA 02111, USA
A.K. Kreiter, Institute for Brain Research, University of Bremen, FB2, P.O. Box 330440, D-28334 Bremen, Germany
D. Margoliash, Department of Organismal Biology and Anatomy, The University of Chicago, 1027 E. 57th Street, Chicago, IL 60637, USA
J.T. McIlwain, Department of Neuroscience, Brown University, Providence, RI 02912, USA
S.L. Moody, Department of Computer Science, Wellesley College, 106 Central St, Wellesley, MA 02481, USA
M.A.L. Nicolelis, Department of Neurobiology, Box 3209, Duke University, Bryan Research Building, Room 333, 101 Research Drive, Durham, NC 27710, USA
S.I. Perlmutter, Department of Physiology and Biophysics and the Regional Primate Research Center, University of Washington, Seattle, WA 98195, USA
S.M. Potter, Division of Biology 156-29, California Institute of Technology, Pasadena, CA 91125, USA
Y. Prut, Department of Physiology and Biophysics and the Regional Primate Research Center, University of Washington, Seattle, WA 98195, USA
H.-X. Qi, Department of Psychology, Vanderbilt University, 301 Wilson Hall, 111 21st Avenue South, Nashville, TN 37240, USA
R.C. Reid, Department of Neurobiology, Harvard Medical School, 220 Longwood Avenue, Boston, MA 02115, USA
E. Salinas, Howard Hughes Medical Institute, Computational Neurobiology Laboratory, The Salk Institute for Biological Studies, 10010 North Torrey Pines Road, La Jolla, CA 92037, USA
K. Sameshima, Disc. Medical Informatics and Functional Neurosurgery Lab., School of Medicine, University of São Paulo, São Paulo, Brazil
C. Schwarz, Eberhard-Karls-Universität Tübingen, Department of Cognitive Neurology, Auf der Morgenstelle 15, D-72076 Tübingen, Germany
T.J. Sejnowski, Howard Hughes Medical Institute, Computational Neurobiology Laboratory, The Salk Institute for Biological Studies, La Jolla, CA 92037, USA and Department of Biology, University of California at San Diego, La Jolla, CA 92093, USA
M. Shuler, Biomedical Engineering, Duke University, Bryan Research Building, Room 333, 101 Research Drive, Durham, NC 27710, USA
J.D. Simeral, Department of Physiology and Pharmacology, Wake Forest University School of Medicine, Winston-Salem, NC 27157-1083, USA
W. Singer, Max-Planck Institute for Brain Research, Deutschordenstr. 46, D-60528 Frankfurt/Main, Germany
J.P. Welsh, Department of Physiology and Neuroscience, New York University, School of Medicine, 550 First Avenue, New York, NY 10016, USA
J. White, Department of Neuroscience, Tufts University School of Medicine, 136 Harrison Avenue, Boston, MA 02111, USA
S.P. Wise, Laboratory of Systems Neuroscience, National Institute of Mental Health, National Institutes of Health, 49 Convent Drive, MSC 4401, Bldg. 49 Room B1EE17, Bethesda, MD 20892-4401, USA
M. Zacksenhouse, Faculty of Mechanical Engineering, Technion - Israel Institute of Technology, Haifa, 32000, Israel
K. Zhang, Howard Hughes Medical Institute, Computational Neurobiology Laboratory, The Salk Institute for Biological Studies, La Jolla, CA 92037, USA
Preface

Although neuroscientists have long recognized the relevance of investigating the principles that underlie the interactions of large populations of neurons in behaving animals, they had to wait a long time for the introduction of experimental and analytical methods to begin exploring the physiological properties of large neural ensembles. Fortunately, during the last two decades a variety of such techniques have been introduced into the arsenal of tools currently employed in neuroscience. As a result, today one only needs to stroll through the scientific exhibits of major neuroscience conferences to realize that the field of neural ensemble physiology is rapidly becoming one of the premier areas of modern brain research.

Three years ago, motivated by the growing interest in neural ensemble research, Dennis Glanzman, the director of the Theoretical and Computational Neuroscience Research Program at the National Institute of Mental Health, and I organized a workshop on "Advances in Neural Population Coding" in Bethesda, Maryland. The main goal of that meeting was to bring together a distinguished group of neuroscientists to debate current and future developments in the area of neural ensemble physiology. By all accounts, the meeting was a great success. Soon after we departed from Bethesda, it occurred to me that the new and exciting ideas discussed in that meeting deserved to be disseminated to a broader community. The book that you are now holding was born out of this motivation.

Those who participated in the Bethesda meeting kindly agreed to participate in this adventure and provided articles describing some of the issues they discussed in their presentations. However, since the workshop did not cover all aspects of neural population coding, it was decided that several invited articles should be added to make the volume as comprehensive as possible. Thus, several colleagues, who were not able to participate in the original meeting, were kind enough to accept my invitation and contribute to this book with great enthusiasm. After two years of hard work, it is with great pleasure that I introduce the end product of this collective effort, which reflects the research of many scientists working in laboratories around the world.

This book is divided into six sections. The first one contains a historical overview of the concept of neural population coding. The second section introduces a series of new experimental paradigms and analytical techniques for investigating potential neural coding schemes. Then, the next four sections focus on recent advances in population coding in a broad range of areas of brain research, which include: sensory (Section III) and motor (Section IV) systems, learning (Section V), and cognitive neuroscience (Section VI).

I would like to take this opportunity to thank those who made this project come to fruition. First, I would like to thank Dr. Dennis Glanzman and the National Institute of Mental Health for supporting the organization of the Bethesda meeting on population coding. I would also like to thank all the authors who contributed to this book. Finally, I would like to acknowledge the continuous support and enthusiasm provided by Mrs. Jenny Henzen, the Publishing Editor of Neuroscience for Elsevier.

Miguel A. L. Nicolelis
M.A.L. Nicolelis (Ed.)
Progress in Brain Research, Vol. 130 © 2001 Elsevier Science B.V. All rights reserved
CHAPTER 1
Population coding: a historical sketch
James T. McIlwain *
Department of Neuroscience, Brown University, Providence, RI 02912, USA
* Corresponding author: J.T. McIlwain, Department of Neuroscience, Brown University, Providence, RI 02912, USA.

Although the idea of population (or ensemble or distributed) coding was introduced almost 200 years ago, it did not attract substantial attention until relatively recently. One reason for this, aptly stated by Hinton et al. (1986), is that distributed representations are "less familiar and harder to think about than local representations" (p. 77). Another reason is that neuroscience is only now emerging from an era in which the widespread use of microelectrodes focused experimental research on the behavior of single neurons and the possibility that their individual properties could account for much of what the brain does. As someone whose work began in the era of single-unit recording, I can testify personally to the seductiveness of this latter view. As you sit in a darkened laboratory with your attention riveted to the sounds of the audiomonitor and probe a neuron's receptive field with a tiny visual stimulus, it is easy to forget that the cell you are listening to is but one of many that are responding to the stimulus. And from there it is but a small step to the assumption that the cell is a 'labeled line' for that aspect of the stimulus which produces the most vigorous response.

This short essay is not the place for an exhaustive survey of all studies past and present that speak to the relative importance of single neurons versus ensembles of neurons in the mediation of sensory, motor and perceptual events. My intention is rather to highlight certain threads in this story, admitting at the outset that my perspective is inevitably constrained by the literature that has informed my own work. The debate forms a fascinating chapter in the history of neuroscience and deserves the attention of a professional historian, who, I believe, will recognize distinct parallels with the arguments over localization of function that occupied scientists in the nineteenth century. If one substitutes 'representation' for 'function' and 'single neuron' for 'area' in the arguments of the older debate, the underlying commonalities become clear. As is well known, the localization approach reached absurd heights in phrenology, which triggered a reaction among neurologists and other scientists that ranged from extremely holistic views to more moderate positions that recognized certain degrees of specialization in cortical areas, but held that cooperative activity was essential to brain function. The important contributions to this debate by such figures as Jackson, Hebb, Luria and Lashley are noted by Hinton et al. (1986) and discussed in detail by Finger (1994) and by Clark and O'Malley (1968). As suggested earlier, though, the interesting issues today are those that concern the role of individual neurons.

The earliest invocation of the notion of population coding of which I am aware is that of Thomas Young in his trichromatic theory of color vision. Young (1802) argued that "... as it is almost impossible to conceive each sensitive point on the retina to contain an infinite number of particles, each capable of vibrating in perfect unison with every possible undulation, it becomes necessary to suppose the number limited, for instance, to the three principal colours, red, yellow, and blue ... and that each of the particles is capable of being put in motion less or more forcibly, by undulations differing less or more from a perfect unison" (pp. 20-21). It is now generally accepted that visible wavelengths are represented in the retinas of most humans by the ratio of their effects on three distinct cone types, each sensitive to a broad, but not identical, range of wavelengths, and that the activity of any one cone type is ambiguous with respect to the chromatic composition of the impinging light. It is worth noting that Young was led to his conclusion by the realization that it was physically impossible to have at each point of the retina a labeled line for each color of the spectrum that the eye can distinguish. In an interesting inversion of this reasoning, modern investigators often invoke the idea of population coding when the broad tuning of individual neurons cannot account for the exquisite discriminative capacities of the circuits of which they form a part.

Thomas Young was a Fellow of Trinity College in Cambridge University, an institution that has provided other major chapters in the development of the idea of distributed representation. From the Cambridge Physiological Laboratory came the study by Adrian et al. (1931) in which they showed that cutaneous afferents in the frog had large receptive fields on the body surface. In discussing the limits such large fields might place on localization of punctate cutaneous stimuli, the authors concluded: "There is no reason to suppose that the widespread distribution of the sensory endings of a single fibre will necessarily interfere with the exact localization of a stimulus. Owing to the overlapping of the area of distribution of different fibres the stimulation of any point on the skin will cause impulse discharges in several fibres and the particular combination of fibres in action, together with the relative intensity of the discharge in each, would supply all the data needed for localization." (p. 384) Carl Pfaffman, working somewhat later in the Physiological Laboratory, was prompted by the relatively nonspecific responses of the cat's primary gustatory afferents to advance what is sometimes called the 'cross-fiber pattern' theory of gustatory coding. Pfaffman (1941) concluded that "In such a system, sensory quality does not depend simply on the 'all or nothing' activation of some particular fiber group alone, but on the pattern of other fibers active" (p. 255).

There is some irony in the fact that the most recent contribution to this debate from Cambridge is what is perhaps the strongest statement of the labeled-line hypothesis. The argument of Horace Barlow (1972) is based on the assumption that the brain is organized to achieve specific representations with activity in the minimum number of neurons. Perception occurs when there is activity in a small number of high-level neurons "each of which corresponds to a pattern of external events on the order of complexity of events symbolized by a word" (p. 371). Two decades earlier, Barlow (1953) had introduced the term 'detector' to characterize a cell whose activity signals the presence of a specific stimulus object. Discussing the behavior of one type of ganglion cell in the retina of the frog, Barlow wrote: "The receptive field of an 'on-off' unit would be nicely filled by the image of a fly at 2 in. distance and it is difficult to avoid the conclusion that the 'on-off' units are matched to the stimulus and act as 'fly detectors.'" (p. 86) Referring to another class of frog ganglion cells, the 'off' units, Barlow addressed the issue of spatial localization with an argument reminiscent of that of Adrian et al. (1931), but modified to emphasize the centrality of the single neuron in the coding process. "A population of ganglion cells with over-lapping receptive fields of the same size as the expected image is a neat method of judging the centre of a large object, for there will be a single, unique, ganglion cell whose field is completely filled by the image, and which will, therefore, be maximally excited." (p. 87)

With the publication of their paper entitled 'What the frog's eye tells the frog's brain', Lettvin et al. (1959) gave the idea of the 'feature-detector neuron' a major boost.
In this influential paper the authors applied the 'detector' terminology to four classes of retinal ganglion cells, and in the following decades, the concept of single neurons as labeled lines was routinely invoked by others in discussing the behavior of neurons elsewhere in the visual system.

This practice, however, did not go unchallenged. One cautionary voice was that of Robert Erickson, whose reviews reminded readers that the broad tuning of most sensory neurons for various stimulus dimensions was not consistent with such a neat story (Erickson, 1968, 1974). Erickson recalled the distributed nature of the widely accepted trichromatic theory and also observed that the feature-detector idea is in effect an extension of Müller's law of specific nerve energy to the single unit level. Another critical observer, apparently exasperated by what he perceived to be an excessive application of the feature-detector idea to psychological processes, referred to the era as the 'psychobiological silly season' (Uttal, 1971).

Even as the visual system was providing the greater part of the evidence advanced in support of the feature-detector/labeled-line idea, Stephen Kuffler (1952) expressed doubts in his pioneering study of the receptive fields of ganglion cells in the cat's retina: "Since receptive fields overlap, even the smallest light spot will excite numerous ganglion cells. For such reasons psychophysical conclusions based only on individual cell responses will have to remain limited." (p. 290) Here Kuffler also expressed a concern that "... the usefulness of the 'receptive field' concept may largely be lost, if it should really include a very large area." Although he did not elaborate, Kuffler must have been uneasy about the assumption that the behavior of a single cell is critical for the spatial localization of a stimulus. Subsequent work by the author of this essay showed that some ganglion cells in the cat can indeed be activated from areas distant from the localized 'on' and 'off' zones described by Kuffler (McIlwain, 1964)¹, and numerous studies have since revealed important contributions of areas beyond the 'classical' receptive field to the behavior of neurons throughout the visual system (reviewed in Allman et al., 1985).

¹ It was my great good fortune to have the opportunity to demonstrate these findings to the late Stephen Kuffler when he visited UCLA in 1963. For his gracious assistance in helping an unknown postdoc publish his then rather heretical observation, I am forever in his debt.

The 1970s witnessed efforts to quantify earlier observations that stimulation at a given point on the retina would potentially activate a significant number of cells. This direction was prompted in some measure by knowledge that the point-spread function of a linear optical system, i.e. the distribution of light in the image of a point source, completely defines the system and can be used to predict the distribution of light in the image of any object (Papoulis, 1968). In an analysis of the cat's retina, Burkhardt Fischer (1973) estimated that at least 60 ganglion cells had receptive-field centers that overlapped at a point, an aggregate he termed the 'Punktbild', which translates approximately as 'point image'. Fischer's results suggested that this number was invariant across the retina and from this he concluded that a compact collection of axons from any Punktbild would occupy the same cross-sectional area of the optic nerve regardless of its retinal origin. An analogous concept was advanced by Cleland et al. (1975) in the form of the 'coverage factor', defined as "equivalent to the number of receptive field centres which would be transfixed if a pin was pushed into the visual map at a particular point" (p. 169). Subsequent work on the visual cortex (Hubel and Wiesel, 1974; Albus, 1975; Dow et al., 1981; Van Essen et al., 1984; Grinvald et al., 1994) and superior colliculus (McIlwain, 1975) indicated that cells with receptive fields containing a given visual point occupied significant areas of these central stations in the visual pathway. The distributed representation of a visual point in the superior colliculus was mirrored in the widespread distribution of activity preceding saccades, suggesting that the active populations were involved in converting the retinotopic location of the saccade target to a representation of the metrics of the impending saccade (McIlwain, 1975; Sparks et al., 1976).
Here, then, was a case in which insight into the structure of the distributed code was facilitated by knowledge of the requirements of the output system, an advantage currently denied to those working on parts of the visual system concerned with perceptual processes. As neurophysiologists were coming to grips with the need to consider the consequences of broadly distributed neural activity representing sensory or motor events, there emerged more or less independently a stream of theoretical work that envisaged distributed representations as the neural substrates of memory. Pioneering models based on this idea were advanced by Anderson (1970) and Kohonen (1972)
and interviews with some of the early workers in the field have been collected by Anderson and Rosenfeld (1998). The development of optical holography lent credence to the notion that information could be stored in non-local fashion and retrieved, and there was an early effort to apply the holographic model to brain function (for review see Willshaw, 1989).

Over the past two decades neurophysiological and theoretical streams have merged significantly and have moved in directions that are beyond the scope of this essay. One example, though, will serve to illustrate this convergence. In 1984, G.E. Hinton published a demonstration of the efficiency of what he called 'coarse coding' for the representation of features (Hinton, 1984). He concluded that "The central result is a surprising one. If you want to encode features accurately using as few units as possible, it pays to use units that are very coarsely tuned, so that each feature activates many different units and each unit is activated by many different features" (p. 11). Thomas Young would be pleased.

This brief sketch suggests that a reliable sign or 'signature' of a system that employs a distributed code is that its neurons taken one at a time are broadly tuned along a dimension that the system nonetheless appears to resolve with a high degree of precision. For visual cells, the dimension may be retinal location, wavelength, orientation, speed, direction or, at higher levels, form and position in head- or body-centered coordinates. Neurons in the auditory system respond to a wide range of sound frequencies and intensities, and the broad tuning of olfactory and gustatory neurons is well established. There is ample evidence that this characteristic applies as well to the primary motor cortex and superior colliculus. As noted above, in certain cases progress has been made in estimating the distribution of activity in response to stimulation at some locus along the dimension involved.
These results are a useful starting point because they may be related to the idea of the point-spread function of the system and the body of theory developed around that concept. However, because neural systems are notoriously non-linear, the convolution techniques suitable for linear systems cannot be applied directly, and a major challenge is to develop methods to map the real distributions of activity to stimuli more complex than points. Multiunit recording and optical imaging
methods offer great promise here, but there is also a need for a strong theoretical underpinning for such efforts. If the brain uses distributed codes, as seems certainly to be the case, does this mean that neurons cannot be 'labeled lines'? Clearly, to support a code of any complexity, active populations must be discriminable from one another, which means that differences among the individual cells are important. Neurons cannot respond equally well to everything and form useful representations of different things. Thus, the sharp dichotomy between distributed coding and labeled lines seems to be a false one and the critical question is 'labeled how and with what'. Barlow's metaphor of the neuron as 'word' assigns more semantic stability to a cell's discharge than the evidence supports, but perhaps the role of the cell is analogous to that of the letter whose meaning in any instance is inseparable from its role in the ensemble that forms the word. Indeed, the task facing neuroscience is not unlike that encountered by scholars who deciphered the writing systems of the ancient Near East from the scraps of evidence at their disposal. Did the individual symbols represent words, syllables or letters and could the language be translated into one they already knew? Shrewd hypotheses, rigorous analysis and pure luck all played major roles in the success of that enterprise (Chadwick, 1958; Pope, 1975), and the same will probably be true for an eventual understanding of distributed neural codes.
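
The 'signature' described above, broadly tuned units that together resolve a dimension finely, can be illustrated with a toy calculation. The following is a minimal sketch and not a method taken from any of the studies cited: it assumes noise-free units with Gaussian tuning curves, evenly spaced preferred positions, and a simple center-of-mass read-out. All names and parameter values (the 5-unit spacing, the tuning width of 10, the stimulus at 42.3) are hypothetical choices made for illustration only.

```python
import math

def tuning_response(center, stimulus, width):
    """Response of one unit with a Gaussian tuning curve (assumed shape)."""
    return math.exp(-((stimulus - center) ** 2) / (2.0 * width ** 2))

def population_responses(centers, stimulus, width):
    """Responses of every unit in the population to a single stimulus."""
    return [tuning_response(c, stimulus, width) for c in centers]

def winner_take_all(centers, responses):
    """'Labeled-line' read-out: preferred position of the most active unit."""
    best = max(range(len(centers)), key=lambda i: responses[i])
    return centers[best]

def center_of_mass(centers, responses):
    """Population read-out: response-weighted mean of preferred positions."""
    total = sum(responses)
    return sum(c * r for c, r in zip(centers, responses)) / total

# Hypothetical population: 21 units with preferred positions 0, 5, ..., 100
# and a tuning width that is twice the spacing between neighbors.
centers = [5.0 * i for i in range(21)]
width = 10.0
stimulus = 42.3

responses = population_responses(centers, stimulus, width)
labeled_line_estimate = winner_take_all(centers, responses)
population_estimate = center_of_mass(centers, responses)
n_active = sum(1 for r in responses if r > 0.1)

print("units responding above 10% of maximum:", n_active)
print("most active unit prefers:", labeled_line_estimate)
print("center-of-mass estimate:", round(population_estimate, 3))
```

Under these assumptions the single most active unit can report only its own preferred position, so its error is limited by the spacing between units, whereas the weighted combination of several overlapping units recovers the stimulus position to within a small fraction of that spacing. Real neurons are noisy and non-linear, so the sketch shows only the logic of the argument, not a realistic decoder.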
References

Adrian, E.D., Cattell, M. and Hoagland, H. (1931) Sensory discharges in single cutaneous nerve fibers. J. Physiol. (Lond.), 72: 377-391.
Albus, K. (1975) A quantitative study of the projection area of the central and paracentral visual field in area 17 of the cat. I. Precision of the topography. Exp. Brain Res., 24: 159-179.
Allman, J., Miezin, F. and McGuinness, E. (1985) Stimulus specific responses from beyond the classical receptive field: neurophysiological mechanisms for local-global comparisons in visual neurons. Annu. Rev. Neurosci., 8: 407-430.
Anderson, J.A. (1970) Two models of memory organisation using interactive traces. Math. Biosci., 8: 137-160.
Anderson, J.A. and Rosenfeld, E. (1998) Talking Nets: An Oral History of Neural Networks. MIT Press, Cambridge, MA.
Barlow, H.B. (1953) Summation and inhibition in the frog's retina. J. Physiol. (Lond.), 119: 69-88.
Barlow, H.B. (1972) Single units and sensation: a neuron doctrine for perceptual psychology? Perception, 1: 371-394.
Chadwick, J. (1958) The Decipherment of Linear B. Cambridge University Press, New York.
Clark, E. and O'Malley, C.D. (1968) The Human Brain and Spinal Cord. University of California Press, Berkeley, CA.
Cleland, B.G., Levick, W.R. and Wässle, H. (1975) Physiological identification of a morphological class of cat retinal ganglion cells. J. Physiol. (Lond.), 248: 151-171.
Dow, B.M., Snyder, A.Z., Vautin, R.G. and Bauer, R. (1981) Magnification factor and receptive field size in foveal striate cortex of the monkey. Exp. Brain Res., 44: 213-228.
Erickson, R.P. (1968) Stimulus coding in topographic and nontopographic afferent modalities. Psych. Rev., 75: 447-465.
Erickson, R.P. (1974) Parallel 'population' neural coding in feature extraction. In: F.O. Schmitt and F.G. Worden (Eds.), The Neurosciences. Third Study Program. MIT Press, Cambridge, pp. 155-169.
Finger, S. (1994) Origins of Neuroscience. Oxford University Press, New York.
Fischer, B. (1973) Overlap of receptive field centers and representation of the visual field in the cat's optic tract. Vision Res., 13: 2113-2120.
Grinvald, A., Lieke, E.E., Frostig, R.D. and Hildesheim, R. (1994) Cortical point-spread function and long range lateral interactions revealed by real-time optical imaging of macaque monkey primary visual cortex. J. Neurosci., 14: 2545-2569.
Hinton, G.E. (1984) Distributed representations. Technical Report CMU-CS-84-157. Carnegie-Mellon, Pittsburgh, p. 11.
Hinton, G.E., McClelland, J.L. and Rumelhart, D.E. (1986) Distributed representations. In: D.E. Rumelhart and J.L. McClelland (Eds.), Parallel Distributed Processing: Explorations in the Microstructure of Cognition. Volume 1: Foundations. Bradford Books, Cambridge, MA, pp. 77-109.
Hubel, D.H. and Wiesel, T.N. (1974) Uniformity of monkey striate cortex: a parallel relationship between field size, scatter and magnification factor. J. Comp. Neurol., 158: 295-306.
Kohonen, T. (1972) Correlation matrix memories. IEEE Trans. Comput., 21: 353-359.
Kuffler, S.W. (1952) Neurons in the retina: organization, inhibition and excitation problems. Cold Spring Harb. Symp. Quant. Biol., 17: 281-292.
Lettvin, J.Y., Maturana, H.R., McCulloch, W.S. and Pitts, W.H. (1959) What the frog's eye tells the frog's brain. Proc. Inst. Radio Eng., 47: 1940-1951.
McIlwain, J.T. (1964) Receptive fields of optic tract axons and lateral geniculate cells: peripheral extent and barbiturate sensitivity. J. Neurophysiol., 27: 1154-1173.
McIlwain, J.T. (1975) Visual receptive fields and their images in the superior colliculus of the cat. J. Neurophysiol., 38: 219-230.
Papoulis, A. (1968) Systems and Transforms With Applications In Optics. McGraw-Hill, New York.
Pfaffman, C. (1941) Gustatory afferent impulses. J. Cell. Comp. Physiol., 17: 243-258.
Pope, M. (1975) The Story of Decipherment: from Egyptian Hieroglyphics to Maya Script. Thames and Hudson, London.
Sparks, D.L., Holland, R. and Guthrie, B.L. (1976) Size and distribution of movement fields in the monkey superior colliculus. Brain Res., 113: 21-34.
Uttal, W.R. (1971) The psychobiological silly season - or what happens when neurophysiological data become psychological theories? J. Gen. Psychol., 84: 151-166.
Van Essen, D.C., Newsome, W.T. and Maunsell, J.H.R. (1984) The visual field representation in striate cortex of the macaque monkey: asymmetries, anisotropies and individual variability. Vision Res., 24: 429-448.
Willshaw, D. (1989) Holography, associative memory, and inductive generalization. In: G.E. Hinton and J.A. Anderson (Eds.), Parallel Models of Associative Memory. Lawrence Erlbaum, Hillsdale, NJ, pp. 103-127.
Young, T. (1802) On the theory of light and colours. Phil. Trans. R. Soc. Lond., 92: 12-48.
CHAPTER 2
The evolution and implications of population and modular neural coding ideas
Robert P. Erickson *
Departments of Psychology, Experimental, and Neurobiology, Duke University, Durham, NC 27708, USA
Introduction

An examination of the evolution of the major current neural coding ideas reveals interesting and surprising steps in each that illuminate and clarify the others. The major aspects of these models allow most to be placed roughly into two major categories: population coding is one, and modular coding the other. Since these are approximately orthogonal modes of thinking, each throws the other into high relief. Within each the role of the temporal course of neural activity can be important. To bring order to the various ideas, the meanings of some of the terms used will be closely examined. This seems justifiable on two grounds. The first is that different researchers use the same terms in different ways, different terms are used for very similar or identical ideas, and terms are often simply not defined. The second is that simplification, a standardization of useful definitions (see Appendix), helps provide a coherent account of the ideas. Herein an attempt is made to have the definitions used¹ as close to typical contemporary meaning as possible.

Both the modular and population ideas about neural coding had their first formal statements in the early 1800s. In 1811 and 1825 Gall expressed the view that specific functions are carried out by correspondingly specific structures of the brain (modules, in modern terms). In 1802 and 1807, Young made a surprisingly astute and concise statement about the nature and value of population coding, in miniature, in his theory of color vision.

* Corresponding author: Robert P. Erickson, Department of Psychology, Duke University, Durham, NC 27708, USA. Tel.: +1-919-660-5718; Fax: +1-919-660-5726; E-mail: eric@psych.duke.edu

¹ Research, both past and present, will be described using modern terms, reduced to a few standard definitions (see Appendix). This is done to cast varieties of research into the common mold sought herein. For example, Gall and Young are described as using modular and population ideas, although they did not use these terms, and the 'fit' may not be exact. Contemporary researchers may be surprised to see such words attached to their research. If not explicitly stated otherwise, such use of these words is the present author's.

² Many scientists are sensitive to the influence of our words on our understanding. For example S.J. Gould discussed the problems of the verbal categorizing imperative thus: "The human mind seems to work as a categorizing device ... This deeply (perhaps innately) ingrained habit of thought causes us particular trouble when we need to analyze the many continua that form so conspicuous a part of our surrounding world. Continua are rarely so smooth and gradual in their flux that we cannot specify certain points or episodes as decidedly more interesting,

Modularity

Gall's modular approach was dictated by the implicit assumption that our words are usefully competent to represent our behavioral functions², and that each
10 word-function is mediated by a distinct part of the cortex dedicated to that function (for an accessible resource see Boring, 1929, 1942). One aspect of his basic idea was certainly well-reasoned; that the importance of a given function for a particular species was reflected in the size of its neural representation. Thus primates have large visual areas, birds large cerebelli, and bats large auditory areas - - a very acceptable and m o d e m point; Gall was an excellent comparative neuroanatomist. He ventured further with the reasonable assumption that, given variance between individuals within a species, here humans, there will be idiosyncratic differences in the size of various brain areas resulting in differences in brain shape. A musically gifted individual should have more neurons in a certain brain system (think Mozart), manifest in a slightly different brain shape. These enlargements would impress themselves onto the shape of the skull, which is easy to measure. This kind of thinking is only human. Given the properties of language, we simplify nature by packaging it into our categorical words - - what else is
or more tumultuous in their rates of change, than the vast majority of moments along the sequence. We therefore falsely choose these crucial episodes as boundaries for fixed (verbal) categories, and we veil nature's continuity in the wrappings of our mental habits ... We must also remember another insidious aspect of our tendency to divide continua into fixed categories. These divisions are not neutral; they are established for definite purposes by partisans of particular viewpoints." And Kimble wrote (1996) "Language is the agent of cognition, the currency of thinking, the tool-box of communication and the custodian of culture. To be useful, it must map onto the world with some precision. Unfortunately, however, the fact that it does so encourages the faith that the fit is perfect and that truth is in the dictionary: If there is a word for it, there must be a corresponding item of reality. If there are two words, there must be two realities and they must be different." William James (1890) (a foundational psychologist and brother of the author Henry James, who may have been a greater, but less systematic psychologist than William), recognized the problem: "Whenever we have made a word.., to denote a certain group of phenomena, we are prone to suppose a substantive entity existing beyond the phenomena. [And] the lack of a word quite often leads (to the idea) that no entity can be there ... "
possible? 3 In biology, identifiable body structures have always been given names, and then the functions of these structures were given names - - an approach validated over time by its usefulness. But a potentially powerful and core problem with this approach is that we cannot be certain that our words match the functions of the brain; we need imagination here, and some good luck! Fodor (1983) used the term 'modularity' to refer to the identification of brain structures with singular functions, both adequately identifiable with our words. In his definition, each structure/function is 'encapsulated' or insulated from other structure/functions, and thus 'impenetrable' by other functions 4. Bell, in 1811, and Magendie, in 1822 (see Boring, 1929, 1942 for accessable reviews) took Gall's modular line of thinking one step further, ascribing sensory and motor activity to the dorsal and ventral roots, and then another step was taken by Mueller (1838) by naming separate functions for the variously distinct sensory nerves. Further advances in ideas about neural function pretty well depended on advances in techniques. After Schwann showed that the brain is composed of cells, in 1839, Helmholtz (1862) continued the modular logic to the level of the individual neuron, with each auditory receptor and its
3 That the answer to this question is not easy should be cause for concern for all scientists. 4 Modules can be defined mathematically as 'crisp' sets. Crisp sets are groups in which the events are totally inpenetrable and encapsulated; each event is completely described by the set. If a person (event) is Swedish, this term totally describes his nationality, with no other terms needed to complete the description. Also, the 'Swedish' set cannot be used to define any other trait of the individual. On the other hand, 'fuzzy sets' allow other than complete membership of an event in each of several sets - - an event may have a 'grade of membership' of 0.6 in one set, 0.3 in another, and 0.5 in a third set, and each set includes various degrees of memberships from a variety of other events. One person (event) may be tall, musical and agreeable to various degrees (three fuzzy sets), and another person may have different ratings on these scales. Combinations of such differences allow definition of a great many more events than there are sets. This is analogous to population coding, as used herein.
11 neuron providing the code for a particular tone. With the next technological step, recordings of the electrical activity of individual neurons, Adrian (1928) and Bronk (Adrian and Bronk, 1928) found that the only neural signal available to each neuron was an 'all-ornothing' electrical spike. Since the size and form of the spike were unmodifiable in normal function, and since rate changes appeared to encode stimulus intensity, then the identity of the stimulus (pain, touch, vision etc.) could only be encoded by which neurons were active. This supported Helmholtz's idea in audition that these individual neural structures were labeled as to their function (one structure for each tone). A fitting epithet soon followed: 'labeled-line' coding (see Perkel and Bullock, 1968)5 wherein each neuron ('line') encoded its particular ('labeled') function. The work of Hubel and Wiesel established the presence of labeled 'straight line detectors' 6. This view of the role of individual neurons in neural coding " . . . holds that the output of one neuron can be interpreted without reference to the output of its neighbors" (Stevens and Zador, 1995). This is an exact description of modularity, but at the single neuron level. Other neural organizations specialized for situations of importance to a species, such as 'face cells' and species-specialized auditory neurons were discovered as our technical abilities evolved. On a larger scale, separate areas of the cortex were found for different aspects of visual behavior (e.g. Mishkin, 1983; Zeki, 1993) which presaged the exciting development of a great variety of complex modules, facilitated by the new techniques of neuroimaging. A sampling of these include executive control (Knight, 1994; Cohen et al., 1997; Godefroy et al., 1999); theory of mind (Peterson and Siegal,
1999); shyness and sociability (Schmidt, 1999); spatial and object memory (Mecklinger, 1998); semantic memory (Gabrieli et al., 1996); integration of relations (Waltz et al., 1999); abstract vs. specific object recognition Marsolek (1999); sense of self (Craik et al., 1999) etc. The point of the above is that the same style of modular thinking has been 'reinvented' at the singleneuron and larger levels many times since at least Gall. The historically continuous idea that certain easily identifiable structures have independent functions each of which could be identified by one of our words could hardly have been otherwise. For better or worse, to be verbal is to be Gallian. In this paper, modular coding refers to the situation in which a group of neurons totally fulfills one simply namable function, and participates in no other functions. This collection of neurons, and the event encoded, constitute a module. This definition closely follows Fodor 7, whose main functional criteria are that a module be 'encapsulated', and 'impenetrable' by other functions. Structurally, Thus modularity here includes any single-function system from single 'labeled-line' neurons or groups of neurons, to brain areas or neurotransmitter systems, or any other single-function system. In older, simpler terms, this is the idea that neural structure and function are two sides of the same coin.
5 Perkel and Bullock discuss many forms of neural codes, including 'modular', although that term had not yet come into existence. Their landmark definition of 'labeled-lines' gave the term much more flexibility than seen in current usage. They also provide an informative discussion of coding in the temporal nature of a neural response.

6 Hubel and Wiesel (1968) saw the implications of their work to be that, since there are very many neurons in the cortex, there could be modular labeled-lines for each conceivable visual object.

7 The term 'modularity' was used by Fodor (1983) to incorporate a failed structuralist psychology (Titchener, 1898) into the science of neural organization. According to that view, for which Fodor uses Gall as his neural model, human behavior is divided up into a number of informationally encapsulated structures. As an example, he speculated without formal rationale that comprehension or production of language may not use the same computer as one used for the understanding of categories such as 'animals' or 'cows'; thus modularity.
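The distinction drawn in footnote 4 between crisp and fuzzy sets can be made concrete with a small sketch. It is illustrative only: the set names other than 'Swedish', the second person's grades, and the helper `is_crisp` are assumptions introduced here, while the grades 0.6, 0.3 and 0.5 are those used in the footnote.

```python
def is_crisp(memberships):
    """True when the event belongs fully to exactly one set and not at all to the others."""
    grades = sorted(memberships.values())
    return grades == [0.0] * (len(grades) - 1) + [1.0]


# Crisp (modular) description: one label says everything about nationality.
# 'Norwegian' and 'Danish' are hypothetical alternatives added for the example.
nationality = {"Swedish": 1.0, "Norwegian": 0.0, "Danish": 0.0}

# Fuzzy (population-like) description: graded membership in three sets.
person_a = {"tall": 0.6, "musical": 0.3, "agreeable": 0.5}   # grades from footnote 4
person_b = {"tall": 0.2, "musical": 0.9, "agreeable": 0.5}   # hypothetical second person

if __name__ == "__main__":
    print("nationality is crisp:", is_crisp(nationality))
    print("person_a is crisp:", is_crisp(person_a))
    print("same three sets, different events:", person_a != person_b)
```

With crisp sets, three sets can describe only three events; with graded membership, the same three sets distinguish as many events as there are distinguishable combinations of grades.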
Population coding

At about the same time that Gall was fighting the main body of medical science, which contended that the brain was something of a bowl of soup filled with liquid 'humors', a brilliant linguist, who had the insight to decipher hieroglyphics using the Rosetta stone, came up with a very exciting idea about the economics of neural coding. This was Thomas Young, whose conception was orthogonal to Gall's modularization 8. His idea was propelled by a consideration of the economical representation of information, which also appears to have had a role in his analysis of the Rosetta stone 9. His hypothesis for the encoding of color is succinctly put in what are arguably the two most powerful sentences in the history of neuroscience.

"Now, as it is almost impossible to conceive each sensitive point of the retina to contain an infinite number of particles (receptors), each capable of vibrating in perfect unison (responding) with every possible undulation (wavelength), it becomes necessary to suppose the number limited, for instance (to three); and that each of the particles is capable of being put in motion less or more forcibly, by undulations differing less or more from a perfect unison; for instance, the undulations of green light will affect equally the particles in unison with yellow and blue, and produce the same effect as a light composed of those two species; ... " (Young, 1802; see Teevan and Birney, 1961, Ch. 1).

" ... the different proportions, in which (the sensations) may be combined, afford a variety of traits beyond all calculation." (Young, 1807; see Teevan and Birney, 1961, Ch. 1) 10.

His is a very illuminating and properly scientific note in that it begins with an abstract definition of the issue to be addressed, neural economy: whatever was in the eye, there was not room enough at each point in the retina for separate receptive 'particles', each 'labeled' for one perceivable color. His postulate moved us from a situation where the encoding of many colors is economically impossible, to one which can encode "a variety of traits beyond all calculation." This is an unusual mode of thinking in that it is not based on our words and techniques; his idea was driven by a precisely defined logical problem, a salutary scientific process that cannot be claimed by (at least) Gall. His entire comment sets out, in its most elemental form, the idea and rationale of parallel, distributed, population coding, which provides the means for the immense power of the brain to handle extremely large amounts of information at great speed. The simplicity of the model, set in the context of three neurons, makes the idea altogether clear, and thus generalizable to other more complex situations; this simplicity also makes it easy to ignore when considering these larger issues.

The cryptic but core idea here is that the nervous system accepts as a final code information distributed across neurons. This is a nearly impossible idea to accept when it comes to these larger issues; in fact it is hard to believe that it can really be a code even for color unless this distributed activity is 'read out' onto 'orange' and 'blue-green' cells somewhere in the depths of the brain. But such 'readout' would completely destroy the economy of his core idea of distributed information: it is not economically realistic that each event (here, all the different hues, saturations and intensities) could be separately represented centrally, for each visual location, by separately labeled individual neurons.

8 Although contemporaries, Gall and Young were unaware of each other's work, a provincialism symbolic of the main theme of this paper.

9 Young's genius was to realize that the hieroglyphic markings could have either phonetic (alphabetic) or ideographic meaning; up to his work, it was considered to be either one or the other, which proved to be a fatal obstacle. He found that some of the symbols had phonetic (alphabetic) roles, and thus in a combinational way, could have immense power in representing much information; this relates closely to his color vision theory and is a clear demonstration of population coding. Interestingly he also found that some symbols were the equivalents of modular representations; to various degrees they had only one meaning (or related meanings), that meaning was completely given by the symbol, and the symbol did not have other meanings. These are slight overstatements, but they give the flavor of the two kinds of codes.

10 I discuss this quote (see Erickson, 1978, 1984) with some change in the meaning given by Young and Helmholtz. They present their arguments in terms of sensations: 'red' for one neural channel, and 'green' and 'blue' for the other two. The manufacture of a single 'blue-green' sensation in response to activation of the 'blue' and 'green' channels was handed over to higher neural centers. This maneuver does not explain anything. Herein it does no violence to their economic logic to avoid the use of their phenomenal terms.
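Young's economic argument can be put as simple arithmetic. The sketch below is a hedged illustration, not a model from the text: it assumes that each neuron can signal a small number of distinguishable response levels (`n_levels`, a parameter introduced here; Young speaks only of particles moved "less or more forcibly") and compares how many events a labeled-line scheme and a population scheme can represent with the same number of neurons. The function names are hypothetical.

```python
from functools import reduce
from itertools import product
from math import gcd


def labeled_line_capacity(n_neurons):
    """One labeled neuron per event: capacity equals the number of neurons."""
    return n_neurons


def pattern_capacity(n_neurons, n_levels):
    """Every combination of response levels across the population is a distinct pattern."""
    return n_levels ** n_neurons


def quality_capacity(n_neurons, n_levels):
    """Count patterns that differ in their proportions, ignoring overall height.

    Patterns that are integer multiples of one another (same proportions,
    different height) are counted once, since height is taken to code intensity.
    """
    shapes = set()
    for pattern in product(range(n_levels), repeat=n_neurons):
        if not any(pattern):
            continue  # the all-silent pattern carries no quality
        divisor = reduce(gcd, pattern)
        shapes.add(tuple(level // divisor for level in pattern))
    return len(shapes)


if __name__ == "__main__":
    n, k = 3, 10  # three receptor types, ten assumed response levels
    print("labeled lines:", labeled_line_capacity(n))
    print("population patterns:", pattern_capacity(n, k))
    print("distinct proportions:", quality_capacity(n, k))
```

Under these assumptions three labeled lines represent three colors, whereas three graded receptors yield 10^3 patterns; even after discounting patterns that differ only in height, the count grows far faster than the number of neurons, which is the sense of "a variety of traits beyond all calculation."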
[Figs. 1 and 2 appeared here. Recoverable labels: Fig. 1, a stimulus continuum with stimuli P, Q, R and S and neurons 1, 2 and 3; Fig. 2, impulses (scale 0 to 60) plotted for chorda tympani fibers A to M in the rat, with a curve labeled 0.1 M NH4Cl.]
Fig. 1. Illustration of Young's population coding idea for color vision, and the applicability of this idea to other systems. (A) The tuning curves of three sensory neurons are drawn along their stimulus dimension, and their responses to four stimuli are shown. This example applies to all sensory and other systems where the dimension may be direction of movement, memory, or other functions. (B) From A. Modular consideration of the individual neuron as the functional unit. Note that each responds to the stimuli in various degrees, leaving the definition of the stimulus ambiguous within each neuron, and confusing the identity of the stimulus with stimulus intensity. (C) From A. Consideration of the population of three neurons as the functional unit. Each stimulus gives an unambiguous pattern across the population; intensity is coded by variations in the height of each pattern (not shown). (Erickson, 1968)
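The contrast between panels B and C of Fig. 1 can be sketched numerically. The following is a minimal illustration under stated assumptions: the tuning curves are taken to be Gaussian, and the peak positions, tuning width and stimulus positions are hypothetical values chosen for the example, not read from the figure. The names `response`, `pattern` and `shape` are introduced here.

```python
import math

# Hypothetical values chosen only for illustration; none are taken from Fig. 1.
CENTERS = (2.0, 4.0, 7.0)   # assumed peaks of the tuning curves of neurons 1, 2 and 3
WIDTH = 2.5                 # assumed breadth of tuning, the same for every neuron
STIMULI = {"P": 1.0, "Q": 3.0, "R": 5.5, "S": 8.0}  # assumed positions on the continuum


def response(center, stimulus, intensity=1.0):
    """Bell-shaped tuning: response falls with distance from the peak and scales with intensity."""
    return intensity * math.exp(-((stimulus - center) ** 2) / (2 * WIDTH ** 2))


def pattern(stimulus, intensity=1.0):
    """Responses of all three neurons to one stimulus (the across-neuron pattern)."""
    return tuple(response(c, stimulus, intensity) for c in CENTERS)


def shape(p):
    """Pattern rescaled to unit sum, so that only the proportions remain."""
    total = sum(p)
    return tuple(r / total for r in p)


if __name__ == "__main__":
    for name, position in STIMULI.items():
        print(name, [round(r, 3) for r in pattern(position)])
```

In this sketch neuron 1 responds identically to P and Q, because they lie at equal distances on either side of its peak, and neuron 2 cannot tell a weak Q from a somewhat stronger R. The three-neuron pattern separates all four stimuli, and its proportions are unchanged when intensity changes, so the height of the pattern is free to carry intensity.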
The essence of Young's idea is given in Fig. 1. In Panel A, the several curves represent the logically arranged responsiveness of neurons along their continuum. In Young's case, as illustrated by Helmholtz (1860), the continuum is visual wavelength. Since such bell-shaped curves are found in all sensory, motor and other systems, they have been given the general rubric 'Neural Response Functions' (NRFs; Erickson et al., 1965). Fig. 1B shows the modular (labeled line) approach to neural function; here the meaning of a neuron's activity is sought exclusively within its own activity, without the context of the other neurons' activity (see Stevens and Zador, 1995). Clearly each neuron in Fig. 1B, by itself, is confused about the identity and intensity of any stimulus; neuron 1 gives the same response to stimuli P and Q, and in neuron 2 a slightly more intense stimulus R would give the same response as stimulus Q. Young's view is given in Fig. 1C; here the response evoked across the population of three neurons gives an unequivocal code for each of a very large number of stimuli (Erickson, 1968, 1974, 1982, 1984; Rolls et al., 1997a,b), as illustrated in Fig. 2 (from Erickson, 1963). The height of a pattern defines the intensity of a given stimulus. This is the population idea at the micro level. The orientation taken here is that if the nervous system finds this principle acceptable and powerful at a very small level, it need not give it up at more molar levels where the economic demands are much larger.

In this paper, neural 'population coding' refers to the essential but idiosyncratic participation of neurons in the representation of a particular event; thus information is spread out over various neurons, each of which has a differential role in the code. This broad definition includes color coding as the simplest and most easily examined example. As a standard but more complicated example in visual coding, the location, form, color and movement of an object appear to be represented in separate neural areas. In population coding, these neural codes for location, form etc. need not be brought back together into a common neural pool for binding, but are a completed code in this distributed population state. Young's distributed theory, by whatever name, makes coding of many complex issues possible with limited neural resources.

Many investigators since Young have seen the importance of population coding 11. It is notable that these workers were and are regularly unaware of the ideas of their heritage or present family, and thus cannot benefit from them or cite them. The reinventors all came to this position from a recognition that the great breadth of each neuron's sensitivity (NRF) requires that each event must be encoded by many neurons of diverse sensitivity, and that no one neuron (or class of neurons) could unequivocally encode any event. These workers were each totally original. If the rubric 'population' or 'across-fiber pattern' had been invented, communication and progress would have been facilitated. The many researchers who see breadth of tuning of individual neurons as a basic problem to be eliminated neurally (perhaps by lateral inhibition), or phenomenologically to clean up the neural message, will not be discussed other than to observe that they are uncritically carrying the Gallian idea to the level of the single neuron. The economic problem of neural coding solved by broad NRFs and the 'population' scheme has escaped notice except by Young, and perhaps Adrian and Sperry in somesthesis.

Fig. 2. An example of a population response. Thirteen taste neurons are arranged along the abscissa with their amounts of response to three stimuli. Each neuron responds to all three stimuli such that within each neuron the identities of the stimuli are confused, along with their intensities (see Fig. 1B). The pattern of response across the neurons gives an unequivocal identification of each stimulus and, in the height of the pattern, the intensity of the stimulus (see Fig. 1C). The similarities of the stimuli are given in the similarities of their population responses (NH4Cl is very similar in taste to KCl, but neither tastes like NaCl). (Erickson, 1963).
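The point made in the caption of Fig. 2, that similar stimuli evoke similar population responses, can be illustrated with a small sketch. It does not use the recorded taste data. Instead it assumes thirteen model neurons with Gaussian tuning spaced evenly along a one-dimensional stimulus continuum, and it uses the Pearson correlation between two across-neuron patterns as the similarity measure. The measure, the tuning model and all numerical values are assumptions made for the example.

```python
import math

N_NEURONS = 13                       # same count as in Fig. 2, otherwise unrelated to it
CENTERS = [float(i) for i in range(N_NEURONS)]   # hypothetical best stimuli
WIDTH = 3.0                          # hypothetical breadth of tuning


def across_fiber_pattern(stimulus, intensity=1.0):
    """Responses of every model neuron to one stimulus."""
    return [intensity * math.exp(-((stimulus - c) ** 2) / (2 * WIDTH ** 2)) for c in CENTERS]


def pearson(a, b):
    """Pearson correlation between two equally long response patterns."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


if __name__ == "__main__":
    reference = across_fiber_pattern(4.0)
    print("near stimulus:", round(pearson(reference, across_fiber_pattern(4.5)), 3))
    print("far stimulus:", round(pearson(reference, across_fiber_pattern(10.0)), 3))
    print("same stimulus, doubled intensity:",
          round(pearson(reference, across_fiber_pattern(4.0, 2.0)), 3))
```

Neighboring stimuli give highly correlated patterns and distant stimuli do not, while a change of intensity alone leaves the correlation at 1, in keeping with the separation of identity (pattern) from intensity (height) described above.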
A brief illustration of some reinventions of population coding in various areas follows, the point being how frequently a good idea will be discovered. (Inclusions in parentheses within quoted materials are by the present writer.)

Olfaction

The earliest advocates of population coding were driven to this idea on the basis of the broad tuning of the afferents, which for them ruled out the possibility of Mueller's idea of 'Specific Nerve Energies' (Mueller, 1838) at the single neuron level. Lord Adrian (1956), the first scientist to record the electrical activity of individual afferents in any system, stated in summary of his earlier work in olfaction that since the afferents are so broadly tuned across stimuli, stimuli must be encoded in the "relative amounts of activity across mitral cells" 12. Using homey, easy-to-understand language he said: "One (mitral cell) will respond more readily to tea than to coffee, and another more readily to coffee than tea." His observation was that since each mitral cell responded broadly but idiosyncratically to diverse stimuli (see Fig. 1A), the code for any stimulus could exist only in the pattern of amounts of activity across the differentially responding neurons (Fig. 1C); unbeknown to Adrian, this was an exact restatement of Young's principle. The many workers in olfaction following Adrian, using the most sophisticated neural recording and brain imaging techniques, have confirmed his conclusions about great NRF breadth, and have accepted the idea of 'across-fiber patterning' for olfaction (for a review see Erickson, 2000). Other terms are sometimes used for the same coding idea (see Appendix), such as 'combinatorial' coding (Ressler et al., 1994; Fredrick and Korsching, 1998; Malnic et al., 1999). The broadly tuned NRFs are sometimes called 'generalists' (see Schneider, 1955; Pilpel and Lancet, 1999). The use of only one term for the same event would be helpful.

11 Far more numerous are the researchers who disclosed this kind of broad NRF without attempting hypotheses about its possible roles (see Erickson, 1968 for an early review).

12 Lord Adrian was a truly prolific research scientist, especially considering the lack of relevant prior work and the extremely primitive nature of his recording instruments. He worked in all peripheral afferent systems, as well as their central projections, providing extremely perceptive observations along the way, comments that are well worth reading today. The only omitted area was taste; this he assigned to Pfaffmann.
Gustation

As in olfaction, the remarkable breadth of tuning of the afferents played a primary role in Pfaffmann's espousal (1941) of population coding in taste. He stated that "In such a system, sensory quality does not depend simply on the 'all or nothing' activation of some particular fiber groups alone, but on the pattern of other fibers active." Zotterman (1958), one of Pfaffmann's colleagues, echoed his assertion thus: " ... even so-called primary taste sensations are built up from a composite input pattern of several taste fibre types." Erickson, one of Pfaffmann's students, in following this line of thinking concluded: " ... the neural message for gustatory quality is a pattern made up of the amount of neural activity across many neural elements" (1963). He used the term 'across-fiber pattern' (AFP) for population codes, and continued to develop this systems idea as a general neural code across diverse neural functions. Di Lorenzo (1989) developed a vector model to analyze this pattern of activity across taste neurons, similar to that invented by Georgopoulos for motor organization (see Georgopoulos, 1995 for a review). At present, opinion is divided over whether there are five 'basic', 'modular' tastes ('sweet', 'sour', 'salty', 'bitter' and 'umami'), each with its own private 'labeled lines', or a population AFP code across the many broadly tuned NRFs resulting in many tastes. A clarification of what the term 'basic taste' means would help resolve this debate. At any rate taste, like olfaction, demonstrates 'penetrable non-encapsulation', with each neuron responding to many stimuli, and each tastant modulating the perceptual effects of other stimuli. There could be exceptions, for example if there were a separate system encapsulated for homeostasis of sodium. But this would not serve for the encoding of the vast array of different taste stimuli among which animals must differentiate (see Erickson, 2000 for a review).
Somesthesis

The history of somesthesis is replete with reinventions of Young's theory. Adrian (1928) and Adrian et al. (1931) extended population coding from olfaction to somesthesis, this time mentioning not only the fact of broad tuning of each afferent, but also the accuracy of a code based on broad NRFs. "If there is much overlapping in the areas supplied by the terminal branches of different fibres it might be possible to localize the stimulus more exactly by comparing the relative intensities of discharge in the different fibres." (Adrian, 1928, p. 92). In the same line, Nafe (1929) restated Mueller's 'doctrine of specific nerve energies': "The specific accompaniment of sensory excitation is correlated with the number of nervous pulses and their temporal and spatial relations." In recordings from individual ciliary (trigeminal) neurons from the cat cornea, Tower (1943) found very broad receptive fields (a quarter or more of cornea, plus adjacent sclera and conjunctiva) whose sensitivity functions resembled the NRFs portrayed in Fig. 1A. She observed that in their overlap, a few of these neurons could accurately portray the location of the stimulus. She generalized this idea to all systems which have broad NRFs; it turns out that this includes all systems for which adequate data are available (Erickson, 1968, 1974, 1982, 1984). In describing the development of broad and overlapping NRFs of somatosensory neurons, Sperry (1959) speculated that "Precise localization is further enhanced by the overlapping of the terminal connections formed by the fibers in the skin", and in the legend of the accompanying figure mentioned, "Overlap of sensory fibers permits a subject to localize a pinprick accurately." His drawing is a 2-dimensional map comparable to an overhead view of the single dimension portrayed in Fig. 1A, including the form of the NRFs and their overlap. Loomis and Collins (1978) explicitly used Young's model to account for tactile perception. Johansson and Vallbo (1980) came to similar conclusions from quantitative considerations of tactile sensitivity.
Many other workers in the skin senses have made similar references to AFP coding. A few of these include Mountcastle (1980), who described profiles of neural activity across distributed ensembles; Johnson and Hsiao (1992), who used population responses for tactual form and texture; Merzenich and deCharms (1996), who used AFP logic for localization; Ray and Doetsch (1990) and Doetsch (2000), who used AFPs for the skin senses in general; Surmeier et al. (1986), who described an AFP code for heat; and Nicolelis et al. (1995), who described complex population responses which over time shift to other populations as a code for localization.

Audition

Given the breadth of auditory neuron NRFs, Helmholtz (1862) cannot be correct in assigning a separate modularly labeled neuron for each tone; this function is obviously carried by a population code. Azimuth is also encoded in AFPs (e.g. see Eisenmann, 1974; Eggermont, 1998). Using point stimuli, Knudsen (1982) showed broadly tuned and distributed auditory and visual responses in the tectum of the owl. Covey (2000) has reviewed AFP coding in the auditory system of bats.

Motor systems

It should be noted that vector analyses of efferent activity in motor systems assume AFP coding, that is, a population response summed in a special way. McIlwain (1975, 1991) used vector analysis of efferent activity from the superior colliculus, as did Georgopoulos and his colleagues for cortical motor neurons (see Lukashin and Georgopoulos, 1993; Georgopoulos, 1995); the response curves for these neurons closely resemble the curves in Fig. 1A. Eye movements were found to be driven by populations of neurons by Lee et al. (1988) and Law et al. (1998).

Muscle sense

Llinas and Welsh (1994) used AFP coding for movements, and Johansson et al. (1995) used "ensemble coding" to account for the accuracy of information about muscle stretch.

Joint position

Mountcastle et al. (1963) showed that afferents sensitive to joint position respond monotonically across most of the joint angle, with maximal responses at full flexion or extension. This instance raises the point that NRFs need not be bell-shaped as in Fig. 1A; they need only be simple (smooth) and broad to fulfil their AFP coding role.
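The vector analyses mentioned under 'Motor systems' treat the population response as something "summed in a special way". A minimal sketch of one such summation, a population vector, is given below. It is an assumption-laden illustration rather than the procedure of any study cited here: tuning is taken to be cosine-shaped, preferred directions are spaced evenly, and the baseline, gain and neuron count are hypothetical.

```python
import math

N_NEURONS = 8                                   # hypothetical population size
PREFERRED = [2 * math.pi * i / N_NEURONS for i in range(N_NEURONS)]  # evenly spaced (assumed)
BASELINE = 10.0                                 # hypothetical resting rate
GAIN = 8.0                                      # hypothetical depth of modulation


def rate(preferred, direction):
    """Broad, cosine-shaped tuning: every neuron responds to every direction."""
    return BASELINE + GAIN * math.cos(direction - preferred)


def population_vector(rates):
    """Sum each neuron's preferred-direction unit vector, weighted by its rate above baseline."""
    x = sum((r - BASELINE) * math.cos(p) for p, r in zip(PREFERRED, rates))
    y = sum((r - BASELINE) * math.sin(p) for p, r in zip(PREFERRED, rates))
    return math.atan2(y, x)


def decode(direction):
    """Direction read from the whole population's response to a movement direction."""
    return population_vector([rate(p, direction) for p in PREFERRED])


def best_single_neuron(direction):
    """Labeled-line readout: the preferred direction of the most active neuron."""
    return max(PREFERRED, key=lambda p: rate(p, direction))


if __name__ == "__main__":
    true_direction = math.radians(37.0)
    print("population estimate (deg):", round(math.degrees(decode(true_direction)), 2))
    print("best single neuron (deg):", round(math.degrees(best_single_neuron(true_direction)), 2))
```

Under these assumptions the summed vector recovers the direction to within rounding error, although no neuron is tuned to it, while a labeled-line readout of the most active neuron can only report that neuron's own preferred direction.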
Vestibular sense Adrian (1943) was the first to describe the broadly tuned activity of vestibular afferents. Fernandez et al. (1972) found each vestibular afferent neuron to be sensitive throughout all head positions, with NRFs as in Fig. 1A. Vision That broad NRFs require distributed coding has been obvious for many workers in various aspects of vision. The following quotes describe the heart of AFP coding exactly. Hartline (1940) spoke of activity in the peripheral nerve as follows: "It is evident that illumination of a given element of area on the retina results in a specific pattern of activity in a specific group of optic nerve fibers. The particular fibers involved, and the distribution of activity among them, are characteristic of the location on the retina of the particular element of area illuminated. Corresponding to different points on the retina are different patterns of nerve activity; even two closely adjacent point do not produce quite the same distribution of activity, although they may excite many fibers in common." Hartline, with his students Miller and Ratlliff (see Miller et al., 1961), showed that the visual phenomenon of heightening of contrast at visual borders ('Mach bands') depended on an AFP of activity distributed across a population of neurons 13 Nelson (1975) showed broad NRFs for stereopsis, requiring AFP coding. Since a visual point produces widespread activity in the superior colliculus, McI1wain (1975, 1991) knew that each neuron had a very broad spatial NRF, and thus the spatial code could only reside in the AFP across many of these cells. He (1986) beautifully summarized the issue of population coding, using visual location in the superior colliculus as an example: " ... the traditional question of recepfive-field analysis is turned on its head: instead of asking 'Here is a cell: where are the points that it sees?' the question becomes,
L3Each demonstration of the 'Mach band' neural population effect was obtained from one neuron, the stimulus being moved so that this neuron would play the part of all neurons involved in the natural situation.
'Here is a point; where are the cells (constituting the AFP) that see it?' " Gochin et al. (1994) advocated AFP coding in inferotemporal cortex for visual form, as did Spinelli (1966) in the retina and Spinelli et al. (1970) in the cortex. Bishop (1970) predicted that "The discrimination of form may always be the property of groups of neurons, perhaps always of large assemblies of neurons and never in any sense the trigger feature for one neuron." For AFP coding for faces see Young and Yamane (1992) and Rolls et al. (1997a,b); in inferotemporal cortex see Vogels (1999) for visual categories, and Booth and Rolls (1998) for complex objects.

Memory

Lashley (1951) set the stage for population coding for memory. Because of the broad and overlapping neural areas involved, he concluded that neurons supporting memory must be broadly tuned across many functions; that is, no individual neuron could be given over to a particular memory. Many others followed this population format. John (1972) found neural responses related to memory to be extensively distributed in large assemblies of nerve cells. Abbott et al. (1996) and Sakurai (1998) proposed broad neural tuning with population coding for memory, all close to Lashley's position.

General

Hebb (1949) presented a 'big view' of neural coding and plasticity in responses of neural populations (large sets of cell-assemblies). Lashley had the courage to raise issues that are important rather than convenient (e.g. Lashley, 1929, 1942, 1951), although he resolved few. Within a spatial and temporal patterning model he sensibly addressed 'untouchable' issues such as sensory equivalence and motor equivalence, insight, generalization, reasoning, and temporal order in behavior (such as language).
In his discussions of neural organization in general, he concluded (1929) that information must be encoded in patterns of activity across neurons: "The problem of reaction to ratio (both the stimuli and AFPs consist of ratios, i.e. the characteristics of a figure, or the amounts of activity across neurons) thus seems to underlie all phases of behavior, to such
a degree that we might be justified in saying that the unit of neural organization is ... the mechanism ... by which reaction to a ratio is produced." Barnes et al. (1997) discussed cognitive maps as residing in populations of hippocampal cells. According to Llinas and Pare (1996), cognition requires spatial and temporal 'patterns of activity' across neuronal assemblies. Sperry (1969) and Mountcastle (1980) presented the idea that consciousness could arise as an emergent property of spatial and temporal patterns of distributed neural activity, and Tononi and Edelman (1998) argue for a 'central core' hypothesis for consciousness: that consciousness is not represented in one place, or over the whole brain, but by activity distributed across a few specific brain areas (each of which is here considered a module) which work together as an ensemble. The basic tenet of Gestalt neuroscience is that information is spread out as a pattern across functionally heterogeneous brain tissue; the Gestaltists' further comments about electrical field theory, which have caused them grief, are not an essential part of the theory. Whether or not acknowledged, Gestalt-type thinking pervades ideas about neural organization, especially for visual form. In a general statement about information representation, Goldman-Rakic (1988a) presents the opinion that cortical function be approached " ... in terms of information processing functions and systems rather than traditional but artificially segregated sensory, motor, or limbic components and individual neurons within only one of these components." Erickson has applied the AFP population model to neural functions in general (Erickson, 1968, 1974, 1978, 1982, 1984, 1986, 2000). It might fairly be said that, in a very basic sense, almost all of these reinventions of population coding did not, at their core, advance us past Young's 1802 and 1807 statements.
With better communication and systematization, workers would have been spared the labor of reinvention, and could instead have been building on the preceding data and thought.

Modular and population coding

Modularity usually implies population coding, even if this is not made explicit. On the one hand, a number of functionally diverse neurons usually act as a population to precisely define a putative modular function, such as the form or color of an object. On the other hand, several modules may work together as a population code; three examples follow. Modules for color, location, form and movement could cooperate as a population to represent a complex stimulus such as a face (Edelman and Mountcastle, 1978; Young and Yamane, 1992; Zeki and Bartels, 1998), or to represent body-object relationships (Graziano and Gross, 1995), with each of these modules performing its function by way of the population response of neurons of diverse function. A second example may be seen in the work of Posner and Petersen (1990) and Posner (1992), who posit four modular functions for directed attention: orientation of gaze, word form, word meaning, and vigilance. Each of these functions is posited to be as modular as the optic nerve is for vision, but they cooperate as a population in providing directed attention. As a third example, the various 'modular' aspects of pain, discriminative (where is it, how strong?), affective (negativity), and cognitive (aspects of pain influenced by attention etc.), appear to be given to various brain areas, acting in distributed form to produce the unitary experience of pain (Casey, 1999). But to group several such modules together as a higher-order encapsulated module would seem to allow further dilution of the modularity concept to the point of meaninglessness. To provide coherent behavior, modules must function in parallel, which puts the concept of modular encapsulation and insularity in question. Thus modularity includes and requires the concept of population coding. While the brain necessarily operates on population principles, the need for modularity has not been shown.
Issues for the modular and population neural coding schemes

When are modular or population coding appropriate?

An efficient neural science should be characterized by its ability to anticipate in which cases its proposed codes would be appropriate; after-the-fact arguments are doomed to succeed. Several rationales for the appropriateness of given coding mechanisms are examined next.
Rapidity of neural processing

Rapidity of neural processing is certainly a major issue given the size of the brain and the complexity of the networks. For complex issues, the many synaptic delays needed for the formation of a module could, of themselves, slow the organism considerably. But Treisman and Kanwisher (1998) argue that, given their compact nature, modules should provide very rapid processing. On the other hand, the implication of parallel processing is that when the relevant neural activity reaches its distributed form, the processing is complete. In other words, even highly complex processing could be accomplished in the very short time it takes neural activity to reach its distributed destinations: e.g. the various visual areas for form, location, color etc. In that distributed form, information is immediately available for the full appreciation of the visual scene without the further generation of labeled 'higher-order' neurons or modules. This extreme rapidity of population coding may be difficult to accept, especially in complex cases, but it is the exact implication of Young's theory.

Importance

The importance of an event (e.g. visual, auditory, motor etc.) for a given species is a first principle of neural organization in neuroethology. And the importance of a function seems to be a primary rationale for nomination to modularity, although this was not a first principle in Fodor's presentation (Fodor, 1983). For example, Desimone (1991) suggested modularity for face perception in humans because of the importance of this function. But no detailed rationale is available that importance requires modularity. On the other hand, in the Neural Mass Differences (NMD) aspect of population coding, it is reasoned that functions require neural mass in proportion to their complexity and importance to the species in question (Erickson, 1986).
Singularity

Just on their face, some simple functions seem to be singular, which is another way of saying 'encapsulated' and 'impenetrable'. Certain organ systems, such as the kidney and heart, are mono-functional, or 'modular'. The eye and optic nerve are clearly modular in that they provide vision and nothing else, and no other neurons provide visual input; other first-order sensory neuron systems are similarly modular (Mueller, 1838). Alpha-motoneurons are modular in that they cause contraction of specified muscles. The giant axon of the crayfish performs only one function, escape, encapsulated and impenetrable even to guidance; this very simple act is truly modular. Pheromone systems, based on olfactory 'grandmother neurons', provide a reasonable example of modular coding by neurons dedicated to eliciting one behavior, such as finding a mate (unlike the crayfish example, the behavior is guided (penetrated) as necessary to find the target). The terms 'non-combinatorial' or 'focal' are used for such 'labeled-line' pheromonal coding (Fredrick and Korsching, 1998). 'Releaser mechanisms', simple behaviors triggered by circumscribed stimuli (e.g. Ingle and Crews, 1985), are at about the same level of singularity as pheromonal systems. Neural modularity seems appropriate for very simple, circumscribed behaviors.

If modular singularity is sufficiently identified above, can it be used for more complex functions? There appears to be a single-function song production system in some birds, and perhaps a language module in humans and some great apes, certainly functions of great importance to these species. The dissociability of symptoms is one method used to define modular singularity; in the 'theory of mind' module (see Stone et al., 1998; Tager-Flusberg and Boshart, 1998), singularity is defined in its dissociability from other behaviors. That is, the capacity for a theory of mind is disturbed in autism, but not in Down or Williams syndrome. Alas, it is very difficult to prove an idea true, especially with one dissociation, because the next test may prove it false (Popper, 1962).
An extensive theoretical rationale for modularity would make reasonable tests possible. In the AFP view, all activities are necessarily interwoven and interpenetrable as the organism functions coherently and simultaneously on many fronts. Many activities are carried on at the same time, and in sensible relation to each other. That is, vision is part of the complex fabric of coordinated and interdependent organismic activities; vision is neither encapsulated nor impenetrable.
Economics and complexity

Brains are finite. Modules are very expensive in neural resources. An adequate definition of modules should include a logical and quantitative discussion of which and how many a brain needs and can afford. As discussed for singularity above, modules seem appropriate only for the simplest of functions, yet they are nominated for the most complex, from directed attention to theory of mind. With across-fiber pattern models, very large amounts of information can be expressed with a few neurons. If each neuron has 10 discriminable levels of activity, 13 neurons could carry 10^13 patterns, a large quantity with respect to the total number of ('grandmother') neurons or possible modules in a human brain (see Fig. 2). This model is thus very appropriate for high information loads. The enormous number (here 10^13) of potential antigens could be detected with a few (here 13) broadly tuned chemoreceptors (see Lancet and Ben-Arie, 1993; Pilpel and Lancet, 1999). One million proteins could easily be generated by 34,000 genes in their various combinations. The Oxford English Dictionary is not cramped by the 26 letters on which it is based, and all of music is based on 12 notes (Bernstein, 1970).

Modules or functions?

There is a difference between a set of neurons modularly dedicated to a particular function and neurons usually involved in this activity. That is, organisms certainly direct their attention, and over repeated instances many of the same neurons must be involved. This does not require, and it does not seem economically reasonable, that this is the only function for these neurons. In the population view, neurons are usually multi-purpose, so that those used during directed attention, for example, will probably also be involved in other activities; thus 'directed attention' and its components may be defined as functions while not necessarily requiring modularity.
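The combinatorial estimate in the economics discussion above can be checked with a few lines of arithmetic. The following is only a minimal sketch: the function names are hypothetical, and the count is an idealized, noise-free upper bound that assumes every combination of activity levels is distinguishable. It contrasts the number of across-fiber patterns with the number of events that the same neurons could signal if each were a dedicated labeled line.

```python
def pattern_capacity(levels: int, neurons: int) -> int:
    """Idealized count of distinct across-fiber patterns.

    Assumes (for illustration only) that each neuron has `levels`
    discriminable activity levels and that every combination of
    levels across the `neurons` is a usable, distinguishable pattern.
    """
    if levels < 1 or neurons < 0:
        raise ValueError("levels must be >= 1 and neurons must be >= 0")
    return levels ** neurons


def labeled_line_capacity(neurons: int) -> int:
    """Events signalled if each neuron is dedicated to exactly one event."""
    if neurons < 0:
        raise ValueError("neurons must be >= 0")
    return neurons


if __name__ == "__main__":
    # The example from the text: 10 levels in each of 13 neurons.
    print(pattern_capacity(10, 13))   # 10000000000000, i.e. 10^13
    print(labeled_line_capacity(13))  # 13
```

Real neurons are noisy and their activity levels are not independent, so the usable number of patterns would be smaller; the sketch only illustrates why the capacity of a population code grows exponentially with the number of neurons while that of dedicated lines grows linearly.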
Readout

It is a primary characteristic of population coding that not only are input and output information in AFP form, but for the same economic reasons all functions internal to these must be in population form. The information is at all times expressed in distributed patterns, with no readout agencies needed to interpret these patterns. The absence of a readout system provides for the great economy and rapidity of processing large amounts of information. As for color vision, motor responses or other 'internal' productions occur in the time it takes for neural activity to reach its distributed form, this being the 'final' code. An internal modular 'readout' concept implies multiple massive, cumbersome, and time-consuming changes of format between input and output.
Predictability

When classifying neural functions with words, the difficulty arises about which words should have modular representations. For example, what a priori arguments indicate whether the following are modular functions, or not? Visual color, form and location, sex, self esteem, religiousness, music, passage of time/continuity, and causality. A sense of self, decency, honor, power, fear, humor, patriotism, humility, fatigue or love. Theory of mind, directed attention, executive control, reading, anxiety, language, response error detection, insight, or memory in its many varieties. Comprehension of speech, animals, objects, word meaning, tunes, lyrics, my grandmother. Attention, aggression, creativity, family love, erotic love, love of mankind. Fear, anger, bladder/bowel control or hunger. The first nine putative modules were included by Gall among his listing of over 50 human faculties [14]. Geschwind (1979) nominated the last four for specific neural networks or special-purpose neural processes on the basis of the existence of such areas for language and faces. Could any of these be accepted or rejected as modules on a priori principles? Is modularity to be accepted for any function after the finding of its singularity? Is anything that seems to be 'encapsulated' a likely module? Why? Perhaps we share Gall's fatal flaw of not having a logical theory of exactly what to look for. If a logical and clearly predictive theory is not developed, the future may see a swing away from modularity as it has from Gall.

[14] Many of Gall's choices of modular human faculties seem entirely modern, without rationale. Some modules include self-defense, property, disposition to quarrel, courage, homicide/suicide/destruction, carnivorism, cunning, covetousness, propensity to steal, pride/hauteur, vanity, ambition, cautiousness, memory (separately of things, people, facts), mathematics, constructiveness, understanding of locality/space (cognitive maps?), music, painting, language, understanding of relationships (comparisons), wit, metaphysical depth of thought, poetry, moral sense, mimicry, god/religion, perseverance, knowing the interior of man by his exterior (theory of mind?), mimicry/pantomime, propagation, self-defense, and satire. He gives reasonable discussions of many of these, such as their natural history and specific (disassociation) effects of disease.

Fine coding, coarse coding, and hyperacuity

'Fine' and 'coarse' coding are terms that have been employed to describe the breadth of tuning of individual neurons (NRF extent). These terms roughly correspond to modular and population coding, respectively. 'Hyperacuity' refers to the fact that precision of neural function is much greater than either fine or coarse tuning would imply. 'Fine' tuning seems close to 'labeled-line' coding, a modular idea ('specialists', see Schneider, 1955). Fine tuning means that the breadth of the NRFs is small. However, such tuning is small only with respect to the total relevant dimension; this tuning is actually large with respect to the discriminative capacities of the system, a property called 'hyperacuity'. For example, the spatial extent of the NRF of each retinal ganglion cell is indeed small considering the spatial extent of the total retina, but very large with respect to visual spatial acuity. This type of coding was termed 'topographic' by Erickson (1968) for events which are represented across neural space. For example, visual and somesthetic space are laid out as maps across neural tissue. Since in this situation the availability of neurons is great, the NRFs are relatively small in order to restrict the total mass of the neural input. This restriction evidently conserves neural mass to that sufficient for each sensory event (Erickson, 1986); looked at another way, broad or coarse somesthetic or retinal NRFs would produce
unnecessarily large CNS activity; imagine the input from a point stimulus if each somesthetic or retinal neuron were responsive to the total skin or retinal surface. Rolls et al. (1997a,b) provide a quantification of how the amount of information in population responses for faces etc. is sufficient with a small neural mass ('narrow' NRFs).

Coarse tuning, on the other hand, refers to the representation of information by neurons with broad NRFs ('generalists'; see Appendix and Schneider, 1955) as in taste, olfaction, color, temperature and vestibular sensitivity. In the modular view, this breadth is seen as a detriment to the accurate portrayal of an encapsulated event since each neuron is evidently involved in more than one function. However, as described above, broad NRFs are essential for the accurate representation of large amounts of information, especially when the quantity of neurons available is small. For example, the extent of the NRF of a retinal or lateral geniculate cell across the color dimension is very large. Evidently, this is because there are so few neurons available for color coding at each visual location that they must be broadly tuned to both include the total wavelength dimension and to maximize the mass of the neural input. This kind of coding is termed 'non-topographic' (Erickson, 1968) since the relevant dimension is not laid out across neural space. In brief, 'fine' and 'coarse' tuning are equivalent in that the NRF is always much broader than the acuity would suggest from the labeled-line point of view ('hyperacuity', next). In each case the breadth of tuning is probably evolutionarily designed to provide sufficient neural mass differences (NMDs) between discriminable events over the population of responding neurons; fine and coarse tuning are by design, not by mistake.

Hyperacuity.
The fine precision seen in neural systems in general, but emphasized in sensory systems, is a puzzle given the breadth of tuning of the neurons involved; this is called 'hyperacuity'. For example the fine discriminations between wavelengths and spatial resolutions in vision are far finer than the NRF widths of the neurons involved. Hyperacuity is a large problem if the modular view is assumed since the individual neurons respond to many different stimuli (broad NRFs), but are unifunctional. In the population view, the acuity of the
system is not limited by the NRF width in either fine or coarse coding (Erickson, 1968; however see Zhang and Sejnowski, 1998). The baseline in Fig. 1A may be considered as a small space of skin or retina ('fine' coding), or as the full dimension for color, taste or olfaction, attitude of the head, temperature, arm position, or direction of limb movement ('coarse' coding), in all of which hyperacuity is evident. As one example, visual 'simple cells' were first considered to be detectors for straight lines of particular orientation; that these neurons are about as broadly tuned as possible, often spanning nearly 180°, has been shown by Henry et al. (1974), Soodak et al. (1987), and Vogels and Orban (1991). These neurons still provide exquisite orientation sensibility since in population coding acuity does not depend on the breadth of the NRF but on the size of the resultant NMDs, the latter being independent of NRF width (Erickson, 1986).

Summary

The test for modularity will come when some researcher predicts not only that a certain function must be based on a neural module, but also clearly explains why. A statement of this nature would be vastly helpful whether it was upheld or falsified. It might be even more useful to make clear which functions could not be modules, and why. Population coding models have a strong rationale, the first being the economic one presented by Young. This AFP model is clear and testable. It accommodates itself to many issues as described herein, perhaps most importantly to the economy of neural resources, rapidity of action, and congruence with known facts of neural organization such as the puzzlingly broad sensitivities of individual neurons. It is especially appropriate for large and complex issues of high information load such as the concepts nominated for modularity.

Why are events taken apart? Neural mass and neural mass differences
The puzzle of distributed coding, for example wherein several aspects of a visual image are given to different brain areas, has two parts. The first is "Why was the image taken apart?" The second is
"How does it get put back together?" These are addressed in turn.

Neural mass

One rationale for taking things apart (e.g. visual form, location, color, movement) is that this maneuver gives each aspect of an informationally dense event its own substantial piece of neural tissue. This is similar to a sculptor using a large piece of granite to produce a detailed and complex figure. Thus in taking things apart, each important aspect of an event is given its own large piece of granite for careful and detailed sculpting (see Churchland and Sejnowski, 1992). The analogous amount of neural tissue involved (areas under the three curves in Fig. 2) is termed 'neural mass' (NM) herein (Erickson, 1986). It has been argued that discriminability is dependent on total differences in activity produced by the perceived events summed over all participating neurons (NMDs, Erickson, 1986) (absolute differences between the curves in Fig. 2). The size of an NMD is a positive function of the NM, which is the amount of activity summed over neurons, and the time over which the summation occurs (NM = N × F × T, where N = number of neurons, F = amount of activity evoked in each neuron, and T = time over which summation occurs). If it is important for one aspect of an event to receive detailed analysis, then a large number of neurons especially responsive to that event would help make this possible; e.g. if the form of an object is important (e.g. a face), then assigning form to its own group of neurons especially sensitive to this function would provide a large NM, and a large potential for NMDs. Conversely, the size of the NM elicited by a given function should indicate the importance (at least the degrees of differentiation) of that function for that species, as assumed by Gall. The careful expenditure of neurons is not a central point in modular thinking, but is of first importance in AFP coding.
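The two quantities defined above, neural mass (NM = N × F × T) and the neural mass difference (NMD, the absolute difference in activity summed over all participating neurons), can be made concrete with a small numerical sketch. Everything beyond those two definitions is an assumption made for illustration: the Gaussian shape of the tuning curves, the tuning width, the 13 evenly spaced preferred values, and all function names are hypothetical and are not taken from Erickson (1986).

```python
import math


def tuning(preferred: float, stimulus: float, width: float, peak: float = 1.0) -> float:
    """Response of one broadly tuned neuron (Gaussian NRF, an assumed shape)."""
    return peak * math.exp(-0.5 * ((stimulus - preferred) / width) ** 2)


def across_fiber_pattern(stimulus, preferred_values, width, peak=1.0):
    """Activity of every neuron in the population for one stimulus."""
    return [tuning(p, stimulus, width, peak) for p in preferred_values]


def neural_mass(pattern, duration: float) -> float:
    """Activity summed over neurons, times the summation time T.

    With the same activity F in each of N neurons this is N * F * T.
    """
    return sum(pattern) * duration


def neural_mass_difference(pattern_a, pattern_b, duration: float) -> float:
    """Absolute activity differences summed over neurons, times T."""
    if len(pattern_a) != len(pattern_b):
        raise ValueError("patterns must cover the same neurons")
    return sum(abs(a - b) for a, b in zip(pattern_a, pattern_b)) * duration


if __name__ == "__main__":
    # 13 neurons with evenly spaced preferred values on a 0..1 dimension.
    preferred = [i / 12 for i in range(13)]
    a = across_fiber_pattern(0.50, preferred, width=0.3)
    b = across_fiber_pattern(0.52, preferred, width=0.3)
    print("NM  for stimulus a:", neural_mass(a, duration=1.0))
    print("NMD between a and b:", neural_mass_difference(a, b, duration=1.0))
```

In this toy setting the two stimuli differ by far less than the width of any single tuning curve, yet their across-fiber patterns still differ, and the summed difference grows with the number of neurons and with the summation time. That is the sense in which the text relates NMD size to NM.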
Constancy of neural mass

That the neural mass evoked for certain functions must be important is strongly suggested by the fact that functional populations are often of a standard size. This means that a definable number of neurons is required for that function (NM = K). For example, Hubel and Wiesel (1974) indicated a constancy in the size of the cortical representation of visual stimuli of various eccentricity, and Capuano and McIlwain (1981) and McIlwain (1986) show that 'point images' in the superior colliculus are about the same size for stimuli of various eccentricity. Merzenich et al. (1984) suggested that when the cortical neural area available to a digit increases (i.e. when there is a loss of input from adjacent digits), the size of the receptive fields for each neuron decreases proportionally; this results in a constancy of neural mass activated for each point stimulus. Perhaps such 'neural quanta' are a general rule for diverse neural functions (Erickson, 1982, 1986).

Quantification of neural mass

Quantification of how much neural mass is required for the representation of an event is ultimately an important issue in distributed neural coding. Beginning attempts include those by Erickson (1986), Ray and Doetsch (1990), Nagai et al. (1995), Rolls et al. (1997a,b), and Zhang and Sejnowski (1998). In general, these reports show that up to a point (perhaps the 'neural quantum' mentioned above) increased neural mass provides for greater information representation. As examples, Nagai et al. (1995) showed with taste neurons that the amount of information decreases gracefully with reductions in their numbers; further, a greater loss of information accompanies loss of the more strongly responding neurons, those which have more variance across different stimuli and thus more potential for NMDs. The capacity for strongly responding neurons to allow large differences between neural representations (NMDs) is shown in Fig. 2. Similarly, Rolls et al. (1997b) showed that with visual neurons sensitive to faces, the amount of information is a positive function of the number of neurons. Abbott et al. (1996) showed information degradation with lessening of number of neurons.
Lashley (1931) alluded to the power in numbers in his concept of 'mass action'. Some functions of large importance to a species, as song in birds and language in humans, have developed large neural masses suggesting relatively great information capacity.
How do events get put back together?

Binding and readout. Goldman-Rakic (1988a,b) posed that the bringing back together of separated aspects of an event, 'binding', is one of the major problems for modern neuroscience. This issue of 'binding' is certainly of primacy for neuroscientists; once different aspects of information about an event are spread across neural tissue, how do they get put back together again in consciousness or other coherent behavior such as movement? Farmer (1998) states the problem well: It is assumed that " ... at some (spatial) point information must be gathered together as a single percept." Damasio and Damasio (1992) suggest that the components of a concept (e.g. for a cup of coffee, the components would be the cup form and color, the coffee aroma and warmth etc.) would be brought back together into the same ensemble of neurons. 'Binding' is a problem only with assumption of the 'modular' point of view. In population terms, the distributed aspects of an event need not be brought back together to correspond with a human construct. To accept the fact that the brain functions in terms of distributed populations of neurons is to understand, as Young did, that this distributed format is
the final language of the brain; the brain does not bind events back together again. This is very clear in motor output (or for any coherent production of the brain) wherein the motor neuron activity underlying a coordinated movement is in the form of a population and is not further coalesced. Sitting requires the correct level of activity of all motor neurons, except perhaps for those of the middle ear. The output, whether in motor neurons, an emotional state, or a memory etc., is in the same kind of population, distributed form as the input (sensory, or from other brain sources). Why would there be a conceptual difference in neural organization (e.g. modularity) between intervening processing levels and input and output? In all these, the same economic problems are evident. The simplest and most conservative position is that neural information does not take categorically different forms in different parts of the nervous system, until proven otherwise. From this point of view, the inclusion of a second (modular) code in neural organization is not necessary and may not be helpful.
In population coding, various aspects of an event may be brought together into the same ensemble to permit the generation of neural differences between discriminable events (NMDs) across a common pool of neurons. This would be in line with the convergence of relevant neural information onto common ensembles (e.g. see Damasio and Damasio, 1992; Zeki, 1993). However the information in the ensemble would never leave population form (Erickson, 1978, 1986). In summary, as for the representation of color in Young's theory, great economic problems would be caused if the various aspects of an event were brought back together as expected in binding; the degree of economic embarrassment would be a rapidly increasing function of the complexity of the event to be 'reassembled'.
The role of temporal patterns

Since all behavior is temporal, the primary importance of the temporal aspects of neural activity must be accepted. First, they obviously signal simple temporal aspects of an event, such as its onset and offset. Above this, there is considerable evidence that temporal patterns of activity carry other, more subtle kinds of information. This coding has been treated in the modular and/or population formats. In the modular form, a given temporal pattern encodes an event. In the population format, temporal patterns gain their meaning in the context of the different but concurrent temporal patterns in parallel neurons.

Modular temporal coding

In the modular view, each temporal pattern can represent a different event, or different events at different times; these would be 'labeled temporal patterns' in analogy with individual neuron 'labeled lines'. Those who take this position include Von der Malsburg (1995), Softky (1995), Stevens and Zador (1995), Llinas and Pare (1996), Gerstner et al. (1997), and Ehert (1997). Following Covey's seminal study in taste, Di Lorenzo and Hecht (1993) showed that an electrically driven temporal pattern per se in the nucleus of the solitary tract can produce sufficient information to identify tastes. Quantification
of the information in temporal patterns in V1 and IT cortex was examined by Baddeley et al. (1997). McClurkin et al. (1996) entertained the notion that color and visual pattern are both encoded in the same neurons in V1, V2 and V4. These authors suggest that each of these neurons carries separate, but simultaneous (multiplexed) temporal codes for pattern and color information; thus, separate temporal modules for different colors and forms coexist within the same neurons. They aver that these temporal modules take the main, perhaps the only, role in color and pattern vision. Sakurai (1996a, 1998) reviews temporal population coding in memory, including Hebb's 1949 contribution (Sakurai, 1996b).

Population temporal coding

In distinction to the modular idea of temporal coding, Erickson et al. (1994) established that, at least in taste, there are a few (about three) basic temporal patterns of activity (fuzzy temporal sets [4]) which, viewed in population format, help establish the identity of the various stimuli. They do this much as Young's broad NRFs encode color, in that each stimulus evokes these few multiplexed temporal patterns to degrees idiosyncratic to each stimulus and neuron. That is, each stimulus will produce one or more of these temporal patterns to various degrees in each neuron. In its variation across neurons, this complex temporal pattern identifies the taste stimulus along with population rate coding (average rate of firing). Friston (1997) finds the potential for information in differences in temporal patterns (not in synchrony) over various neural areas, a population idea.
Binding and temporal patterns

Timing of neural activity is seen as a strong contender for solving the 'binding problem'. A common hypothesis is that some kind of temporal synchrony over the neural population accomplishes binding. For example, Farmer (1998) suggests that such "... binding problems may be solved through transient temporal synchronization of the discharges of populations of neurones." But the binding problem is not solved, or even addressed, by distributed temporal patterns. Given temporal synchrony there is still the problem of how to get the distributed events bound together. Temporal coincidence of increased activity in the neurons involved may hold the events up together for special note (large neural mass) in an otherwise relatively quiescent field. But how this 'binds' has not thereby been made more evident. If the idea of population coding is accepted it should be clear that, although good signal level (as through in-phase temporal synchrony) is always appreciated, events do not need to be physically brought together to be 'bound'.

Others

Never shunning the important problems for the convenient, Lashley (1951) emphasized the time dimension as a neglected but important aspect of neural coding, including the issue of temporal organization over long periods of time as in language. Hebb (1949) used neural timing to account for the development of coherent neural organization distributed over cell assemblies.

Temporal/spatial patterns changing over time

Nicolelis et al. (1995) showed that a tactile stimulus leading to a movement is represented by complex temporal patterns across changing ensembles of neurons; this shifting pattern identifies the stimulus/response event. In place of static topography they suggest that "... spatiotemporal complexity substitutes for topography as the main strategy for the coding of sensory information."

Conclusions

This small and selective review suggests that the evolution of molar neuroscience can be largely characterized by two general coding ideas, modular and population, but that these ideas have not evolved (changed form) for two centuries. The evolution that has occurred has been in technical methodologies, with our statements of the ideas being recast in terms of these techniques rather than leading to them (see Erickson, 1978). These techniques are inherently modular, from Gall's examinations of skull shape,
Bell and Magendie's (and many to follow) stimulation and ablation methods, to neural recording techniques and brain imaging. It seems difficult to think in terms other than the modular data these techniques provide. Second, the development of ideas has also been constrained by the nature of our language (Erickson, 1978) in which we are encouraged to express our ideas in modular form. The only true exception was that of Young, whose very successful effort was to solve the basic problem of neural economics quite aside from language and techniques. It seems largely true that our technology and our verbal nature have guided the evolution of our science. It is suggested that before we can be comfortable with conclusions about the nature of brain function, be they modular or population, the experimental techniques and language we have used to direct our science should be objects of investigation themselves; the fact that it seems difficult to proceed in any other way does not justify the means. Being the basic tools of our trade, we should know what role they play in understanding a brain that may not be organized in their terms. This should be a concern of the first order.
Acknowledgements

The critical and insightful readings by Drs. Donald Katz and Bruce Halpern are greatly appreciated.

Appendix. Comments on definitions

The proposed equivalence of some terms as presented in this paper is an attempt at parsimony. But do these 'equivalent' terms actually represent the same ideas as suggested herein, or do the different words express truly different ideas? In many cases, it is clear that investigators have not been aware of their predecessors' and peers' terms and ideas. This makes probable the reinvention of an idea when the idea is good, but described with different words. Certainly this statement will not be acceptable to all the reinventors, and they may be right. So the present definitions are arbitrarily but simply made to encourage sacrifice by any clear, reasonable argument to the contrary. Physics would not have progressed as it has if each worker used the term 'force' in their own, idiosyncratic way, or if that idea were given various names; in some cases at least, this is the probable situation in molar neuroscience. There is inadvertent mischief in the use of the same term for different events, or a multiplicity of terms for the same idea. Examples follow.

As used herein, 'population coding' includes any event whose representation requires the activity in neurons of diverse contribution to the code, including 'ensemble', 'across-fiber pattern' (AFP), 'parallel', 'combinatorial', 'assembly', and 'distributed' codes; these all express the same idea, with the AFP model being additionally based on the amounts of activity in each neuron. Alternatively, the use of these terms to refer to redundant neural activity given to many neurons spread out over local or large brain areas represents redundancy, not population codes. The terms 'modular', 'focal', 'non-combinatorial', 'singular', 'specific' and 'labeled-line' have been treated in roughly equivalent ways by various investigators, the idea varying only in the number of neurons involved (labeled-line refers to one neuron, module to many neurons), but not in general concept. Therefore, they are reduced to a common idea in this paper.

Concerning other terms, within this paper 'function' and 'event' are roughly equivalent (neither very clearly defined). Neural Response Function (NRF) refers to the breadth of tuning of neurons along their relevant dimensions. The form of this tuning takes some simple form, such as a bell-shaped curve for color, vestibular sensitivity, line orientation in 'simple' cells, and motor neurons; other forms include smoothly and monotonically increasing or decreasing curves as in joint position, and s-shaped curves for color-coding beyond the receptor level. An NRF is an explicitly defined and generalized 'tuning curve'. NM refers to neural mass, a combined and positive function of the number of neurons involved in the representation of an event, the response magnitude of these neurons, and the time over which integration occurs. It is approximately represented in brain imaging techniques. Absolute changes in neural mass, summed across neurons, are taken as the basis for neural information, and are termed neural mass differences (NMDs) (Erickson, 1986).

References
Abbott, L.F. and Blum, K.I. (1996) Functional significance of long-term potentiation for sequence learning and prediction. Cereb. Cortex, 6: 406-416.
Abbott, L.F., Rolls, E.T. and Tovee, M.J. (1996) Representational capacity of face coding in monkeys. Cereb. Cortex, 6: 498-505.
Adrian, E.D. (1928) The Basis of Sensation. Christophers, London.
Adrian, E.D. (1943) Discharges from the vestibular receptors in the cat. J. Physiol., 101: 389.
Adrian, E.D. (1956) The action of the mammalian olfactory organ: the Semon Lecture: 1955. J. Laryngol. Otol., 70: 1-14.
Adrian, E.D. and Bronk, D.W. (1928) The discharge of impulses in motor nerve fibres. I. Impulses in single fibres of the phrenic nerve. J. Physiol., 66: 81-101.
Adrian, E.D., Cattell, M. and Hoagland, H. (1931) Sensory discharges in single cutaneous nerve fibres. J. Physiol., 72: 377-391.
Baddeley, R., Abbott, L.F., Booth, M.C., Sengpiel, F., Freeman, T. and Rolls, E.T. (1997) Responses of neurons in primary and inferior temporal visual cortices to natural scenes. Proc. R. Soc. Lond. B Biol. Sci., 264: 1775-1783.
Barnes, C.A., Suster, M.S., Shen, J. and McNaughton, B.L.
(1997) Multistability of cognitive maps in the hippocampus of old rats. Nature, 388: 272-275.
Bernstein, L. (1970) The Infinite Variety of Music, Chapter 2. The New American Library, Inc., New York.
Bishop, P.O. (1970) Beginning of form vision and binocular depth discrimination. In: F.O. Schmitt (Ed.), The Neurosciences: Second Study Program. R.U. Press, New York, pp. 471-485.
Booth, M.C. and Rolls, E.T. (1998) View-invariant representations of familiar objects by neurons in the inferior temporal visual cortex. Cereb. Cortex, 8: 510-523.
Boring, E.G. (1929) A History of Experimental Psychology. Appleton-Century, New York.
Boring, E.G. (1942) Sensation and Perception in the History of Experimental Psychology. Appleton-Century-Crofts, New York.
Capuano, U. and McIlwain, J.T. (1981) Reciprocity of receptive field images and point images in the superior colliculus of the cat. J. Comp. Neurol., 196: 13-23.
Casey, K. (1999) Forebrain mechanisms of nociception and pain: analysis through imaging. Proc. Natl. Acad. Sci. USA, 96: 7668-7674.
Churchland, P.S. and Sejnowski, T.J. (1992) The Computational Brain. MIT Press, Cambridge, MA.
Cohen, J.D., Perlstein, W.M., Braver, T.S., Nystrom, L.E., Noll, D.C., Jonides, J. and Smith, E.E. (1997) Temporal dynamics of brain activation during a working memory task. Nature, 386: 604-608.
Covey, E. (1980) Temporal neural coding in gustation. Doctoral dissertation, Duke University, Durham, NC.
Covey, E. (2000) Neural population coding and auditory temporal patterns analysis. Physiol. Behav., 69: 211-220.
Craik, F.I.M., Moroz, T.M., Moscovitch, M., Stuss, D.T., Winocur, G., Tulving, E. and Kapur, S. (1999) In search of self: A positron emission tomography study. Psychol. Sci., 10: 26-34.
Damasio, A.R. and Damasio, H. (1992) Brain and Language. Sci. Am., Sept., 89-95.
Desimone, R. (1991) Face-selective cells in the temporal cortex of monkeys. J. Cognit. Neurosci., 3: 1-8.
Di Lorenzo, P.M.
(1989) Across unit patterns in the neural response to taste: vector space analysis. J. Neurophysiol., 62: 823-833.
Di Lorenzo, P.M. and Hecht, G.S. (1993) Perceptual consequences of electrical stimulation in the gustatory system. Behav. Neurosci., 107: 130-138.
Doetsch, G. (2000) Patterns in the brain: neuronal population coding in the somatosensory system. Physiol. Behav., 69: 187-201.
Edelman, G. and Mountcastle, V.B. (1978) The Mindful Brain. MIT Press, Cambridge, MA.
Eggermont, J.J. (1998) Azimuth coding in primary auditory cortex of the cat. II. J. Neurophysiol., 80: 2151-2161.
Ehert, G. (1997) The auditory cortex. J. Comp. Physiol. Sect. A Sensory Neural Behav. Physiol., 181: 547-557.
Eisenmann, L.M. (1974) Neural encoding of sound location: an
electrophysiological study in auditory cortex (AI) of the cat using free field stimuli. Brain Res., 75: 203-213.
Erickson, R.P. (1963) Sensory neural patterns and gustation. In: Y. Zotterman (Ed.), Olfaction and Taste, Vol. I. Pergamon Press, Oxford, pp. 205-213.
Erickson, R.P. (1968) Stimulus coding in topographic and nontopographic afferent modalities. Psychol. Rev., 75: 447-465.
Erickson, R.P. (1974) Parallel 'population' neural coding in feature extraction. In: F.O. Schmitt and F.G. Worden (Eds.), The Neurosciences: Third Study Program. MIT Press, Cambridge, MA, pp. 155-169.
Erickson, R.P. (1978) Common properties of sensory systems. In: R.B. Masterton (Ed.), Handbook of Behavioral Neurobiology, Vol. 1. Plenum, New York, pp. 73-90.
Erickson, R.P. (1982) The across-fiber pattern theory: an organizing principle for molar neural function. In: W.D. Neff (Ed.), Contributions to Sensory Physiology, Vol. 6. Academic Press, New York, pp. 79-110.
Erickson, R.P. (1984) On the neural bases of behavior. Am. Sci., 72: 233-241.
Erickson, R.P. (1986) A neural metric. Neurosci. Biobehav. Rev., 10: 377-386.
Erickson, R.P. (2000) The evolution of neural coding ideas in the chemical senses. Physiol. Behav., 69: 3-13.
Erickson, R.P., Doetsch, G.S. and Marshall, D. (1965) The gustatory neural response function. J. Gen. Physiol., 49: 247-263.
Erickson, R.P., Di Lorenzo, P.M. and Woodbury, M.A. (1994) Classification of taste responses in brain stem: Membership in fuzzy sets. J. Neurophysiol., 71: 2139-2150.
Farmer, S.F. (1998) Rhythmicity, synchronization and binding in human and primate motor systems. J. Physiol., 509: 3-14.
Fernandez, C., Goldberg, J.M. and Abend, W.K. (1972) Response to static tilts of peripheral neurons innervating otolith organs of the squirrel monkey. J. Neurophysiol., 35: 978-997.
Fodor, J.A. (1983) The Modularity of Mind: An Essay on Faculty Psychology. MIT Press, Cambridge, MA.
Fredrick, R.W. and Korsching, S.I.
(1998) Chemotopic, combinatorial, and non-combinatorial odorant representations in the olfactory bulb revealed using a voltage-sensitive axon tracer. J. Neurosci., 23: 9977-9988.
Friston, K.J. (1997) Another neural code? Neuroimage, 5: 213-220.
Gabrieli, J.D.E., Desmond, J.E., Demb, J.B., Wagner, A.D., Stone, M.V., Vaidya, C.J. and Glover, G.H. (1996) Functional magnetic resonance imaging of semantic memory processes in the frontal lobes. Psychol. Sci., 7: 278-283.
Georgopoulos, A.P. (1995) Motor cortex and cognitive processing. In: M.S. Gazzaniga (Ed.), The Cognitive Neurosciences. MIT Press, Cambridge, MA, pp. 507-517.
Gerstner, W., Kreiter, A.K., Markram, H. and Herz, A.V. (1997) Neural codes: firing rates and beyond. Proc. Natl. Acad. Sci. USA, 94: 12740-12741.
Geschwind, N. (1979) Specializations of the human brain. Am. Sci., 9: 180-199.
Gochin, P.M., Colombo, M., Dorfman, G.A., Gerstein, G.L.
and Gross, C.G. (1994) Neural ensemble coding in inferior temporal cortex. J. Neurophysiol., 71: 2325-2337.
Godefroy, O., Cabaret, M., Petit-Chenal, V., Pruvo, J.P. and Rousseaux, M. (1999) Control functions of the frontal lobes. Modularity of the central-supervisory system? Cortex, 35: 1-20.
Goldman-Rakic, P.S. (1988a) Topography of cognition: parallel distributed networks in primate association cortex. Annu. Rev. Neurosci., 11: 137-156.
Goldman-Rakic, P.S. (1988b) Changing concepts of cortical connectivity: parallel distributed cortical networks. In: P. Rakic and W. Singer (Eds.), Neurobiology of Neocortex. John Wiley and Sons, New York, pp. 177-202.
Gould, S.J. (1994) The persistently flat earth. Natural History, 3: 12-19.
Graziano, M.S.A. and Gross, C.G. (1995) The representation of extrapersonal space: a possible role for bimodal, visual-tactile neurons. In: M.S. Gazzaniga (Ed.), The Cognitive Neurosciences. MIT Press, Cambridge, MA, pp. 1021-1034.
Hartline, H.K. (1940) The receptive field of the optic nerve fibers. Am. J. Physiol., 130: 690-699.
Hebb, D.O. (1949) The Organization of Behavior: A Neuropsychological Theory. John Wiley and Sons, New York.
Helmholtz, H. von. (1860) Handbuch der physiologischen Optik. Translated from the third German edition as J.P.C. Southall (Ed.), Helmholtz's Treatise on Physiological Optics II. 1924, Dover, New York, NY.
Helmholtz, H.L.F. von (1862) On the Sensations of Tone. Second edition (1885) republished in 1954 by Dover, New York, NY (see pp. 144-149).
Henry, G.H., Dreher, B. and Bishop, P.O. (1974) Orientation specificity of cells in cat striate cortex. J. Neurophysiol., 37: 1394-1409.
Hubel, D.H. and Wiesel, T.N. (1968) Functional architecture of the striate cortex. In: F.C. Carlson (Ed.), Physiological and Biochemical Aspects of Nervous Integration. Prentice-Hall, Englewood Cliffs, NJ, pp. 153-161.
Hubel, D.H. and Wiesel, T.N.
(1974) Uniformity of monkey striate cortex: a parallel relationship between field size, scatter and magnification factor. J. Comp. Neurol., 158: 295-306.
Ingle, D. and Crews, D. (1985) Vertebrate neuroethology: definitions and paradigms. Annu. Rev. Neurosci., 8: 457-494.
James, W. (1890) Principles of Psychology. Henry Holt, New York, p. 195.
Johansson, H., Bergenheim, M., Djupsjobacka, M. and Sjolander, P. (1995) A method for analysis of encoding of stimulus separation in ensembles of afferents. J. Neurosci. Methods, 63: 67-74.
Johansson, R.S. and Vallbo, A.B. (1980) Spatial properties of the population of mechanoreceptive units in the glabrous skin of the human hand. Brain Res., 184: 353-366.
John, E.R. (1972) Switchboard versus statistical theories of learning and memory. Science, 177: 850.
Johnson, K.O. and Hsiao, S.S. (1992) Neural mechanisms of tactual form and texture perception. Annu. Rev. Neurosci., 15: 227-250.
Kimble, G. (1996) Psychology: The Hope of a Science. MIT Press, Cambridge, MA, p. 137.
Knight, R.T. (1994) Attention regulation and human prefrontal cortex. In: A.M. Thierry, J. Glowinski, P. Goldman-Rakic and Y. Christen (Eds.), Motor and Cognitive Functions of the Prefrontal Cortex. Research and Perspectives in Neurosciences. Springer, New York, pp. 160-173.
Knudsen, E.I. (1982) Auditory and visual maps of space in the optic tectum of the owl. J. Neurosci., 9: 1177-1194.
Lancet, D. (1986) Vertebrate olfactory reception. Annu. Rev. Neurosci., 9: 329-355.
Lancet, D. and Ben-Arie, N. (1993) Olfactory receptors. Curr. Biol., 3: 668-674.
Lashley, K.S. (1929) Neural mechanisms in adaptive behavior. In: Brain Mechanisms and Intelligence. Univ. Chicago Press, Chicago, pp. 157-163.
Lashley, K.S. (1931) Mass action in cerebral function. Science, 73: 245-254.
Lashley, K.S. (1942) The problem of cerebral organization in vision. In: H. Kluever (Ed.), Biological Symposia, Vol. VII, Visual Mechanisms. Jaques Cattell Press, Lancaster, pp. 301-322.
Lashley, K.S. (1951) The problem of serial order in behavior. In: L.A. Jeffress (Ed.), Cerebral Mechanisms in Behavior. John Wiley and Sons, New York, pp. 112-136.
Law, I., Svarer, C., Rostrup, E. and Paulson, O.B. (1998) Parieto-occipital cortex activation during self-generated eye movements in the dark. Brain, 121: 2189-2200.
Lee, C., Rohrer, L.C. and Sparks, D.L. (1988) Population coding of saccadic eye movements by neurons in the superior colliculus. Nature, 332: 357-360.
Llinas, R. and Pare, D. (1996) The brain as a closed system modulated by the senses. In: R. Llinas and P. Churchland (Eds.), The Mind-Brain Continuum. MIT Press, Cambridge, MA, pp. 1-18.
Llinas, R. and Welsh, J. (1994) On the cerebellum and motor learning. Curr. Opin. Neurobiol., 3: 958-965.
Loomis, J.M. and Collins, C.C. (1978) Sensitivity to shifts of a point stimulus: an instance of tactile hyperacuity. Percept. Psychophys., 24: 487-492.
Lukashin, A.V.
and Georgopoulos, A.P. (1993) A dynamical neural network model for motor cortical activity during movement: population coding of movement trajectories. Biol. Cybernet., 69: 517-524.
Malnic, B., Hirono, J., Sato, T. and Buck, L.B. (1999) Combinatorial receptor codes for odors. Cell, 96: 713-723.
Marsolek, C.J. (1999) Dissociable neural subsystems underlie abstract and specific object recognition. Psychol. Sci., 10: 111-118.
McClurkin, J.S., Zarbock, J.A. and Optican, L.M. (1996) Primate striate and prestriate cortical neurons during discrimination II. Separable temporal codes for color and pattern. J. Neurophysiol., 75: 496-507.
McIlwain, J.T. (1975) Visual receptive fields and their images in superior colliculus of the cat. J. Neurophysiol., 38: 219-230.
McIlwain, J.T. (1986) Point images in the visual system: new interest in an old idea. Trends Neurosci., 9: 354-358.
McIlwain, J.T. (1991) Distributed spatial coding in the superior colliculus: a review. Vis. Neurosci., 6: 3-13.
Mecklinger, A. (1998) On the modularity of recognition memory for object form and spatial location: a topographic ERP analysis. Neuropsychologia, 36: 441-460.
Merzenich, M.M., Nelson, R.J., Stryker, M.P., Cynader, M.S., Schoppmann, A. and Zook, J.M. (1984) Somatosensory cortical map changes following digit amputation in adult monkeys. J. Comp. Neurol., 224: 591-605.
Merzenich, M.M. and deCharms, R.C. (1996) Neural representations, experience, and change. In: R. Llinas and P. Churchland (Eds.), The Mind-Brain Continuum. MIT Press, Cambridge, MA, pp. 61-81.
Miller, W.H., Ratliff, F. and Hartline, H.K. (1961) How cells receive stimuli. Sci. Am., Sept., W.H. Freeman, San Francisco, CA, pp. 1-12.
Mishkin, M. (1983) Object vision and spatial vision. Trends Neurosci., 6: 414-417.
Mountcastle, V.B. (1966) The neural replication of sensory events in the somatic afferent system. In: J.C. Eccles (Ed.), Brain and Conscious Experience. Springer, New York, pp. 85-115.
Mountcastle, V.B. (1979) An organizing principle for cerebral function: the unit module and the distributed system. In: F.O. Schmitt and F.G. Worden (Eds.), The Neurosciences: Fourth Study Program. MIT Press, Cambridge, MA, pp. 21-42.
Mountcastle, V.B. (1980) Sensory receptors and neural encoding: introduction to sensory processes. In: V.B. Mountcastle (Ed.), Medical Physiology. 14th ed., C.V. Mosby, St. Louis, pp. 327-346.
Mountcastle, V.B., Poggio, G.F. and Werner, G. (1963) The relation of thalamic cell response to peripheral stimuli varied over an intensive continuum. J. Neurophysiol., 38: 908.
Mueller, J. (1838) Handbuch der Physiologie des Menschen, Vol. II. Holscher, Coblentz.
Nafe, J.P. (1929) A quantitative theory of feeling. J. Gen. Psychol., 2: 199-210.
Nagai, T., Katayama, H., Aihara, K. and Yamamoto, T. (1995) Pruning of rat cortical taste neurons by an artificial neural network model. J.
Neurophysiol., 74: 1010-1019.
Nelson, J.I. (1975) Globality and stereoscopic fusion in binocular vision. J. Theor. Biol., 49: 1-88.
Nicolelis, M.A.L., Baccala, L.A., Lin, R.C.S. and Chapin, J.K. (1995) Sensorimotor encoding by synchronous neural ensemble activity at multiple levels of the somatosensory system. Science, 268: 1353-1358.
Perkel, D.H. and Bullock, T.H. (1968) Neural Coding. Neurosciences Res. Prog. Bull., 6: 227-343.
Peterson, C.C. and Siegal, M. (1999) Representing inner worlds: theory of mind in autistic, deaf, and normal hearing children. Psychol. Sci., 10: 126-129.
Pfaffman, C. (1941) Gustatory afferent impulses. J. Cell. Comp. Physiol., 17: 243-258.
Pilpel, Y. and Lancet, D. (1999) The variable and conserved interfaces of modeled olfactory receptor proteins. Protein Sci., 8: 969-977.
Popper, K. (1962) Conjectures and Refutations. Basic Books, New York.
Posner, M.I. (1992) Attention as a cognitive and neural system. Curr. Dir. Neurosci., 1: 11-14.
Posner, M.I. and Petersen, S.E. (1990) The attention system of the human brain. Annu. Rev. Neurosci., 13: 25-42.
Ray, R.H. and Doetsch, G.S. (1990) Coding of stimulus location and intensity in populations of mechanosensitive nerve fibers of the raccoon, II. Across-fiber response patterns. Brain Res. Bull., 25: 533-550.
Ressler, K.J., Sullivan, S.L. and Buck, L.B. (1994) Information coding in the olfactory system: evidence for a stereotyped and highly organized epitope map in the olfactory bulb. Cell, 79: 1245-1255.
Rolls, E.T., Treves, A. and Tovee, M.J. (1997a) The representational capacity of the distributed encoding of information provided by populations of neurons in primate temporal visual cortex. Exp. Brain Res., 114: 149-162.
Rolls, E.T., Treves, A., Tovee, M.J. and Panzeri, S. (1997b) Information in the neuronal representation of individual stimuli in the primate temporal visual cortex. J. Comput. Neurosci., 4: 309-333.
Sakurai, Y. (1996a) Population coding by cell assemblies - what it really is in the brain. Neurosci. Res., 26: 1-16.
Sakurai, Y. (1996b) Hippocampal and neocortical cell assemblies encode memory processes for different types of stimuli in the rat. J. Neurosci., 16: 2809-2819.
Sakurai, Y. (1998) The search for cell assemblies in the working brain. Behav. Brain Res., 91: 1-13.
Schmidt, L.A. (1999) Frontal brain electrical activity in shyness and sociability. Psychol. Sci., 10: 316-320.
Schneider, D. (1955) Mikro-electroden registrieren die elektrischen Impulse einzelner Sinnesnervenzellen der Schmetterlingsantenne. Ind. Electron. Forsch. Fertigung, 3: 3-7.
Softky, W.R. (1995) Simple codes versus efficient codes. Curr. Opin. Neurobiol., 5: 239-247.
Soodak, R.E., Shapley, R.M. and Kaplan, E.
(1987) Linear mechanisms of orientation tuning in the retina and lateral geniculate nucleus of the cat. J. Neurophysiol., 58: 267-275.
Sperry, R.W. (1959) The growth of nerve circuits. Sci. Am., 201: 68-75.
Sperry, R.W. (1969) A modified concept of consciousness. Psychol. Rev., 76: 532-536.
Spinelli, D.N. (1966) Receptive fields in the cat's retina. Science, 152: 1768-1769.
Spinelli, D.N., Pribram, K.H. and Bridgeman, B. (1970) Visual receptive field organization of single units in the visual cortex of monkey. Int. J. Neurosci., 1: 67-74.
Stevens, C.F. and Zador, A. (1995) The enigma of the brain. Curr. Biol., 5: 1370-1371.
Stone, V.E., Baron-Cohen, S. and Knight, R.T. (1998) Frontal lobe contributions to theory of mind. J. Cognit. Neurosci., 10: 640-656.
Surmeier, D.J., Honda, C.N. and Willis, W.D. (1986) Responses of primate spinothalamic neurons to noxious thermal stimulation of glabrous and hairy skin. J. Neurophysiol., 56: 328-350.
Tager-Flusberg, H. and Boshart, J. (1998) Reading the windows to the soul: evidence of domain-specific sparing in Williams syndrome. J. Cognit. Neurosci., 10: 631-639.
Teevan, R.C. and Birney, R.C. (Eds.) (1961) Color Vision. Van Nostrand, Princeton, Chapters 1 and 2.
Titchener, E.B. (1898) The postulates of a structural psychology. Philos. Rev., VII: 449-465.
Tononi, G. and Edelman, G.M. (1998) Consciousness and complexity. Science, 282: 1846-1851.
Tower, S.S. (1943) Pain: Definition and properties of the unit for sensory reception. In: Wolff, H.G., Gasser, H.S. and Hinsey, J.C. (Eds.), Pain. Research Publications: Association for Research in Nervous and Mental Disease, Vol. 23, pp. 16-43.
Treisman, A.M. and Kanwisher, N.G. (1998) Perceiving visually presented objects: recognition, awareness, and modularity. Curr. Opin. Neurobiol., 8: 218-226.
Vogels, R. (1999) Categorization of complex visual images by Rhesus monkeys. Part 2: single-cell study. Eur. J. Neurosci., 11: 1239-1255.
Vogels, R. and Orban, G.A. (1991) Quantitative study of striate single unit responses in monkeys performing an orientation discrimination task. Exp. Brain Res., 84: 1-11.
Waltz, J.A., Knowlton, B.J., Holyoak, K.J., Boone, K.B., Mishkin, R.S., de Menezes Santos, M., Thomas, C.R. and Miller, B.L. (1999) A system for relational reasoning in human prefrontal cortex. Psychol. Sci., 10: 119-125.
Young, M. and Yamane, S. (1992) Sparse population coding of faces in the inferotemporal cortex. Science, 256: 1327-1329.
Von der Malsburg, C. (1995) Binding in models of perception and brain function. Curr. Opin. Neurobiol., 5: 520-526.
Zeki, S. (1993) A Vision of the Brain. Blackwell, Oxford.
Zeki, S. and Bartels, A. (1998) The autonomy of the visual systems and the modularity of conscious vision. Phil. Trans. Roy. Soc. Ser. B: Biol. Sci., 353: 1911-1914.
Zhang, K. and Sejnowski, T.J. (1998) Neuronal tuning: to sharpen or broaden? Neural Comput., 11: 75-84.
Zotterman, Y. (1958) Studies in the nervous mechanism of taste. Exp. Cell Res., Suppl. 5: 520-526.
M.A.L. Nicolelis (Ed.)
Progress in Brain Research, Vol. 130 © 2001 Elsevier Science B.V. All rights reserved
CHAPTER 3
Overcoming the limitations of correlation analysis for many simultaneously processed neural structures

Luiz A. Baccalá 1,* and Koichi Sameshima 2

1 Telecommunications and Control Engineering Department, Escola Politécnica, Av. Prof. Luciano Gualberto, Trav. 3, #158, University of São Paulo, São Paulo, SP, CEP 05508-900, Brazil
2 Disc. Medical Informatics and Functional Neurosurgery Laboratory, School of Medicine, University of São Paulo, São Paulo, Brazil
* Corresponding author: Luiz A. Baccalá, Telecommunications and Control Engineering Department, Escola Politécnica, Av. Prof. Luciano Gualberto, Trav. 3, #158, University of São Paulo, São Paulo, SP, CEP 05508-900, Brazil. E-mail: [email protected]

Introduction

Despite modern methods in molecular biology, neuroanatomy, and functional imaging, monitoring electric signals from neuronal depolarization remains important when evaluating the functional aspects of both normal and pathological neural circuitry. Correlation methods remain popular and are used extensively to analyze functional interaction in the electroencephalogram (EEG), the magnetoencephalogram, local field potentials and, more recently, in simultaneously recorded single- and multi-unit activity of many structures (tens to hundreds at a time). This last application has attracted increasing attention because of its potential for bridging the gap between the study of isolated single neurons and the understanding of encoding and processing of information by neuronal populations (Eichenbaum and Davis, 1998; Nicolelis, 1998). A host of other analytical techniques have emerged, some employing information theoretic rationales by assessing mutual information (Yamada et al., 1993; Rieke et al., 1997; Brunel and Nadal, 1998) or interdependence between signal pairs (Schiff et al., 1996; Arnhold et al., 1999), while others are extensions of spectral analysis/coherence analysis (Glaser and Ruchkin, 1976; Duckrow and Spencer, 1992; Christakos, 1997; Rosenberg et al., 1998). Despite these advances, a large fraction of neuroscientists still rely chiefly on the cross-correlation between the activity of pairs of neural structures to infer their functionality. Like cross-correlation, all of these methods are in one way or another restricted in their calculations to using the signals of just two structures at a time.

In this article, we show that it is not only possible but also desirable to analyze more than two structures simultaneously. Furthermore, we show that effective structural inference is only possible if simultaneous signals from many (representative) structures are jointly analyzed. To handle many simultaneous structures, we employ the recently introduced notion of partial directed coherence (PDC). This is a novel frequency domain approach for simultaneous multichannel data analysis based on Granger causality that employs multivariate auto-regressive (MAR) models for computational purposes (Baccalá and Sameshima, 2001). We review PDC in Section 2 and illustrate its usefulness via toy linear models simulating multi-electrode EEG measurements in Section 3, where we contrast
it to other techniques (correlation/coherence analysis). We discuss an application to experimental data in Section 4. Further examples of PDC in a single- and multi-unit activity context are available in Sameshima and Baccalá (2001).

Partial directed coherence

The concept of partial directed coherence is the latest development of a number of time series analysis efforts for describing how neural structures are interconnected (Baccalá and Sameshima, 2001; Sameshima and Baccalá, 2001). Its remote origin is the paper by Saito and Harashima (1981), which introduced the notion of directed coherence between the activity of pairs of structures. Their method allows factoring the classical coherence function (the frequency domain counterpart of correlation analysis) of a pair of structures into two 'directed coherences': one representing the feedforward and the other representing the feedback aspects of the interaction between these two neural structures. Examples of the use of pairwise directed coherence in studying the relation between Parkinson's tremor and lack of feedback in motor control are contained in Schnider et al. (1989). In an attempt to generalize directed coherence to the analysis of more than two simultaneously processed structures, the so-called method of directed transfer function (DTF) was introduced with several equivalent variants (Franaszczuk et al., 1994; Baccalá and Sameshima, 1998; Baccalá et al., 1998). This method was applied to foci determination and to EEG studies in mesial temporal lobe seizure (Franaszczuk et al., 1994). Details on DTF are contained in Appendix A. In their original paper, Saito and Harashima (1981) allude to a possible rationale for their method. This concept is now known as Granger causality (Granger, 1969).
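As a rough numerical illustration of the Granger-causality idea (defined in the next paragraph), the sketch below compares how well a series is predicted from its own past alone with how well it is predicted once the past of a second series is added. It is only a pairwise least-squares toy on simulated data, not the Granger causality test or the PDC estimator discussed in this chapter; the function names (`residual_variance`, `granger_gain`), the model order, and the simulated coefficients are all assumptions made for illustration.

```python
import numpy as np


def residual_variance(target, predictors):
    """Least-squares fit of target on the predictors plus an intercept;
    returns the variance of the fit residuals."""
    design = np.column_stack([np.ones(len(target))] + predictors)
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return np.var(target - design @ coef)


def granger_gain(x, y, order=2):
    """Log ratio of residual variances when y is predicted from its own past
    only versus from its own past plus the past of x.
    Values well above zero suggest that the past of x helps predict y."""
    n = len(y)
    target = y[order:]
    # Lagged copies: element i of the k-th list entry is the series at lag k
    own_past = [y[order - k:n - k] for k in range(1, order + 1)]
    other_past = [x[order - k:n - k] for k in range(1, order + 1)]
    restricted = residual_variance(target, own_past)
    full = residual_variance(target, own_past + other_past)
    return np.log(restricted / full)


# Hypothetical simulated pair: x drives y with a one-sample delay,
# while y has no influence on x.
rng = np.random.default_rng(0)
n = 2000
x = np.zeros(n)
y = np.zeros(n)
ex = rng.standard_normal(n)
ey = rng.standard_normal(n)
for t in range(1, n):
    x[t] = 0.5 * x[t - 1] + ex[t]
    y[t] = 0.5 * y[t - 1] + 0.8 * x[t - 1] + ey[t]

gain_x_to_y = granger_gain(x, y)
gain_y_to_x = granger_gain(y, x)
print(f"x -> y gain: {gain_x_to_y:.3f}")
print(f"y -> x gain: {gain_y_to_x:.3f}")
```

With this simulated pair, the gain from x to y should come out clearly positive while the gain from y to x stays near zero. A proper analysis would replace the raw gain with a statistical test and, for more than two structures, with a jointly fitted multivariate model.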
According to it, an observed time series x(n) Granger-causes another series y(n) if knowledge of x(n)'s past significantly improves prediction of y(n); this kind of predictability improvement is not reciprocal, i.e. x(n) may Granger-cause y(n) without y(n) necessarily Granger-causing x(n). This lack of reciprocity is the basic property behind the determination of the direction of information flow between pairs of structures which,
in turn, is the basis for decomposing classical coherence into directed feedforward and feedback coherence factors. Following that rationale, we investigated how generalizations of directed coherence to N simultaneously processed structures, like DTF, compared to statistical tests of Granger causality for N simultaneous time series (Baccalá et al., 1998). We realized that DTF provided a physiologically interesting frequency domain picture, yet structural inference based on its computation did not always agree with the result of Granger causality tests (GCT). We could show that this was due to intrinsic aspects of DTF's definition (Baccalá and Sameshima, 2001) (see also Appendix A). Because Granger causality is a more fundamental concept than the ad hoc generalization represented by DTF, we went on to introduce the notion of partial directed coherence (Baccalá and Sameshima, 2001). This new structural connectivity estimator relies on the simultaneous processing of N > 2 time series and is able to expose a frequency domain picture of the feedforward and feedback interactions between each and every pair of structures within the set of N simultaneously processed signals. Perhaps more importantly, PDC reflects Granger causality closely by paralleling the definition of Granger causality test estimators. The main preliminary ingredient of both PDC and GCT (and of DTF as well, but in a fundamentally different way) is their practical use of multivariate autoregressive models as exemplified for N = 3 simultaneously monitored structures in the model
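\[
\begin{bmatrix} x_1(n) \\ x_2(n) \\ x_3(n) \end{bmatrix}
= \sum_{r=1}^{p}
\begin{bmatrix}
a_{11}(r) & a_{12}(r) & a_{13}(r) \\
a_{21}(r) & a_{22}(r) & a_{23}(r) \\
a_{31}(r) & a_{32}(r) & a_{33}(r)
\end{bmatrix}
\begin{bmatrix} x_1(n-r) \\ x_2(n-r) \\ x_3(n-r) \end{bmatrix}
+
\begin{bmatrix} w_1(n) \\ w_2(n) \\ w_3(n) \end{bmatrix}
\tag{1}
\]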
In this model, x1(n) depends on its own past values x1(n - r) through the coefficients a11(r) while, for example, x1(n)'s dependence on the past values of the other series, like x2(n - r), is through the a12(r) coefficients. As such, the time series x2(n) only Granger-causes x1(n) if we can statistically show that a12(r) ≠ 0 for some values of r. Or equivalently, rejecting the null hypothesis of aij(r) = 0 means that xj(n) does Granger-cause xi(n). The partial directed coherence from series j to series i, at frequency f, can be defined as

\[
\pi_{ij}(f) = \frac{\bar{a}_{ij}(f)}{\sqrt{\bar{\mathbf{a}}_{j}^{H}(f)\,\bar{\mathbf{a}}_{j}(f)}}
\tag{2}
\]

where

\[
\bar{a}_{ij}(f) =
\begin{cases}
1 - \sum_{r=1}^{p} a_{ij}(r)\, e^{-\mathrm{j} 2\pi f r}, & \text{if } i = j \\
-\sum_{r=1}^{p} a_{ij}(r)\, e^{-\mathrm{j} 2\pi f r}, & \text{otherwise}
\end{cases}
\tag{3}
\]

and \bar{\mathbf{a}}_{j}(f) is the vector

\[
\bar{\mathbf{a}}_{j}(f) = \begin{bmatrix} \bar{a}_{1j}(f) \\ \vdots \\ \bar{a}_{Nj}(f) \end{bmatrix}
\]

Because of its dependence on aij(r) in Eq. 3, the nullity of πij(f) at a given frequency implies lack of Granger causality from xj(n) to xi(n) at that frequency. PDC is, therefore, a direct frequency domain counterpart of GCT. Though further details on PDC are contained elsewhere (Baccalá and Sameshima, 2001), a compact summary is available in Appendix A together with its relation to DTF. Methods of MAR model fitting are reviewed elsewhere (Marple, 1987). In the next section, we discuss some examples of PDC's application contrasting it to other techniques.

Illustrative simulations

To provide objective comparisons of PDC with other techniques, we use time series generated from known linear toy models. In this case, exact theoretical calculations of pairwise cross-correlation, classical coherence, DTF and PDC can be made and allow exposing the relative methodological merits of each approach while avoiding possible pitfalls of experimental signals collected from neural systems with unknown structure. The use of toy models is further motivated by the desire to investigate possible structural inference impairments when simultaneously processing fewer than the N structures important to the dynamics. The first toy model example, mimicking local field potential measurements along hippocampal structures, is represented by the following set of linear difference equations with N = 7 structures:

Model I

\[
\begin{aligned}
x_1(n) &= 0.95\sqrt{2}\,x_1(n-1) - 0.9025\,x_1(n-2) + w_1(n) \\
x_2(n) &= -0.5\,x_1(n-1) + w_2(n) \\
x_3(n) &= 0.4\,x_1(n-4) - 0.4\,x_2(n-2) + w_3(n) \\
x_4(n) &= -0.5\,x_3(n-1) + 0.25\sqrt{2}\,x_4(n-1) + 0.25\sqrt{2}\,x_5(n-1) + w_4(n) \\
x_5(n) &= -0.25\sqrt{2}\,x_4(n-1) + 0.25\sqrt{2}\,x_5(n-1) + w_5(n) \\
x_6(n) &= 0.95\sqrt{2}\,x_6(n-1) - 0.9025\,x_6(n-2) + w_6(n) \\
x_7(n) &= -0.1\,x_6(n-2) + w_7(n)
\end{aligned}
\tag{4}
\]

with wi(n) standing for innovation noises. These equations are designed so that x1(n) behaves as an oscillator driving the other structures, either directly or indirectly, according to the diagram in Fig. 1. Note that the interaction between x1(n) and x3(n) is both via a direct path and via an indirect route through x2(n). The dynamics of the pair x4(n) and x5(n) is designed so that they jointly represent an oscillator, whose intrinsic characteristics are due to their mutual signal feedback but which is entrained to the rest of the structure via x3(n). The signals x6(n) and x7(n) belong to a totally separate substructure where x6(n) is designed to generate oscillations at the same frequency as x1(n); x7(n) does not feed back anywhere. A sample of the signals produced in this way can be appreciated in Fig. 2.

Fig. 1. Signal flow diagram for Model I.

Fig. 2. Signals obtained by simulating Model I (x-axis: time in samples).

We begin our analysis by inspecting the theoretical pairwise cross-correlation contained in the array of plots in Fig. 3a. Consider just the latencies and lead structures represented by the theoretical correlation maxima of Fig. 3a as summed up in Fig. 3b's diagram, whose arrows are labelled with the absolute values of the latencies and originate in the leading structure. In deducing the structural relationships between the signals using this information, we may attempt to trim the diagram in Fig. 3b. This leads to several possible hypothetical structures compatible with the observed latencies, such as those in Fig. 3c,d. Note not only that structural ambiguities remain but also that no conceivable trimming of Fig. 3b can possibly produce the correct solution in Fig. 1, because Fig. 3b's relation between x3(n) and x2(n) turns out inverted with respect to that in Fig. 1. In short, this example shows that correlation information alone leads to ambiguous structural inference when considering several time series measurements simultaneously.
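As a rough numerical companion to Eqs. 2 and 3, the following Python sketch evaluates the magnitude of the PDC directly from the coefficients of Model I (Eq. 4) and lists the (source, target) pairs whose PDC is nonzero at some frequency. It is only an illustration under stated assumptions, not the code used for the figures: the function names, the 65-point frequency grid, the 1e-12 threshold and the use of NumPy are choices made here.

```python
import numpy as np

SQ2 = np.sqrt(2.0)

def model_one_coefficients():
    """Model I as an array A of shape (p, N, N); A[r-1, i, j] holds a_ij(r), 0-based i, j."""
    A = np.zeros((4, 7, 7))
    A[0, 0, 0] = 0.95 * SQ2    # x1(n-1) -> x1
    A[1, 0, 0] = -0.9025       # x1(n-2) -> x1
    A[0, 1, 0] = -0.5          # x1(n-1) -> x2
    A[3, 2, 0] = 0.4           # x1(n-4) -> x3
    A[1, 2, 1] = -0.4          # x2(n-2) -> x3
    A[0, 3, 2] = -0.5          # x3(n-1) -> x4
    A[0, 3, 3] = 0.25 * SQ2    # x4(n-1) -> x4
    A[0, 3, 4] = 0.25 * SQ2    # x5(n-1) -> x4
    A[0, 4, 3] = -0.25 * SQ2   # x4(n-1) -> x5
    A[0, 4, 4] = 0.25 * SQ2    # x5(n-1) -> x5
    A[0, 5, 5] = 0.95 * SQ2    # x6(n-1) -> x6
    A[1, 5, 5] = -0.9025       # x6(n-2) -> x6
    A[1, 6, 5] = -0.1          # x6(n-2) -> x7
    return A

def pdc(A, freqs):
    """Return |pi_ij(f)| (Eq. 2) as an array of shape (len(freqs), N, N)."""
    p, N, _ = A.shape
    lags = np.arange(1, p + 1)
    out = np.empty((len(freqs), N, N))
    for k, f in enumerate(freqs):
        phases = np.exp(-2j * np.pi * f * lags)
        # Eq. 3: identity on the diagonal minus the transformed coefficients.
        a_bar = np.eye(N) - np.tensordot(phases, A, axes=(0, 0))
        # Norm of each column a_bar_j(f), i.e. sqrt(a_bar_j^H a_bar_j).
        col_norm = np.sqrt(np.sum(np.abs(a_bar) ** 2, axis=0))
        out[k] = np.abs(a_bar) / col_norm
    return out

freqs = np.linspace(0.0, 0.5, 65)          # normalized frequency (assumed grid)
P = pdc(model_one_coefficients(), freqs)
peak = P.max(axis=0)
edges = sorted((j + 1, i + 1)              # (source, target), 1-based
               for i in range(7) for j in range(7)
               if i != j and peak[i, j] > 1e-12)
print(edges)
```

With the coefficients as transcribed in Eq. 4, the listed pairs should be 1 → 2, 1 → 3, 2 → 3, 3 → 4, 4 → 5, 5 → 4 and 6 → 7, which is the connectivity described for Model I in the text. Because of the normalization in Eq. 2, the squared PDC magnitudes leaving any one source sum to 1 at every frequency.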
The results of the pairwise interaction using the theoretical DTF are depicted by dark shaded curves along the off-diagonals of the 7 x 7 array of plots in Fig. 4a. Along the shaded main diagonal of the array in Fig. 4a, we portray the power spectrum for each time series. Note the spectral similarity that characterizes all signals for this structure. To facilitate comparisons, solid-line graphs along the off-diagonals of the array depict the high pairwise classical coherences among those structures that are interconnected. Fig. 4c's schematic represents a summary of the relations described by DTF in which signal sources are labelled along the x-axis and targets along the y-axis. Thus, for instance, in Fig. 4c, an arrow leaves x1(n) and reaches x5(n) because the first column of Fig. 4a has a significant shaded area in row five. No direct reverse arrow exists as there is no dark shaded area in column 5, row 1 of Fig. 4a. In this and later graphs, thinner/dashed arrows portray weaker connections. This leads to a complex connectivity pattern in the graph describing DTF relationships (Fig. 4c). As in the case of cross-correlation, the only possible inference is, for example, that the signal in x1(n) affects all other nodes, without clues as to how or through which pathway this interaction takes place. Using the same
Fig. 3. The theoretical autocorrelations are shown along the main diagonal and, below it, all the theoretical cross-correlation functions are plotted (a), with the x-axis ranging from -50 to 50 sample points and the y-axis from -1 to 1. Directed graph summarizing signal propagation latency information (encoded via arrow labels) contained in the cross-correlation function (b). Two possible simplified structures compatible with the theoretically calculated latencies are shown in (c) and (d). Note that, in the graph simplification from (c) to (d), the connection 1 → 4 (with propagation latency 5 time units) is removed because it can be explained by the pathway 1 → 3 → 4 with the same total propagation latency value.
rule for associating source to target in labelling pairwise interaction using PDC, a completely distinct situation emerges in Fig. 4b where PDC calculations (dark shaded curves along the off-diagonal) lead to the correct structure of Fig. 4d (compare with Fig. 1). We next slightly increase the complexity of this example by adding a feedback from x5(n) to x1(n). This is accomplished by rewriting the first equation in Model I as

\[
x_1(n) = 0.95\sqrt{2}\,x_1(n-1) - 0.9025\,x_1(n-2) + 0.5\,x_5(n-2) + w_1(n)
\tag{5}
\]

in accord with the diagram of Fig. 5. As in the previous case, the theoretical DTF is difficult to analyze (Fig. 6a,c), as opposed to PDC (Fig. 6b,d), which clearly reflects the newly added feedback. This pattern of straightforward analysis using PDC carries over to a practical simulation scenario using 500 data points, where the feedback-free situation (Fig. 7a) is easily distinguishable
from that when feedback from x5(n) to x1(n) is present (Fig. 7b). To provide some sense of the potential temporal resolution of the method, we display the time evolution of PDCs involving x1(n) and x5(n) (Fig. 8) while randomly switching the feedback on and off. Each PDC estimate uses 250 simulated data points with 50% overlap between adjacent data segments. In comparing these examples, note that DTF's graph has arrows connecting almost all structures when the feedback is switched on (Fig. 6c); this is related to the fact that the PDC graph contains pathways (direct or indirect) that connect any two structures. Some arrows in Fig. 6c are missing (x2(n) → x1(n), x3(n) → x1(n), x3(n) → x5(n)). Their presence would have made Fig. 6c's graph fully connected. Though corresponding to existing signal pathways, the missing arrows correspond to small (but theoretically nonzero) DTFs that reflect
the weakness of the connection strength of the total pathways between structures that are far apart from one another. In fact, DTF can be interpreted as a marker for signal energy 'reachability' (see Remark 3 in Appendix A) and must be analyzed with care. For example, examine the dip in the DTF from x2(n) to x3(n) in Figs. 4a and 6a. It coincides with the maximum of the power spectrum of both these series. Rather than meaning lack of pathway connection at that frequency, this dip exemplifies (by model design) how the energy reaching a structure (x3(n)) from another structure (x1(n)) at one frequency may be almost exactly cancelled by the energy coming through another pathway (x2(n)) due to a phase inversion in the signal. To emulate scalp EEG, our second example employed

Model II

\[
\begin{aligned}
x_1(n) &= 1.8982\,x_1(n-1) - 0.9025\,x_1(n-2) + w_1(n) \\
x_2(n) &= 0.9\,x_1(n-2) + w_2(n) \\
x_3(n) &= 0.85\,x_2(n-2) + w_3(n) \\
x_4(n) &= 0.82\,x_1(n-2) + 0.6\,x_6(n-3) + w_4(n) \\
x_5(n) &= -0.9\,x_6(n-2) + 0.4\,x_2(n-4) + w_5(n) \\
x_6(n) &= 0.9\,x_5(n-2) + w_6(n)
\end{aligned}
\]

Fig. 4. Comparison between the theoretical DTF (a) (and its inferred structural interactions (c)) and the theoretical PDC results (b) (with its signal flow diagram (d)). In both cases the signal flow graphs are constructed by assigning an arrow from the source structure (x-axis) to the targets (y-axis) when dark shaded areas are significant. The spectral densities for the time series are depicted along the shaded main diagonal of the arrays. The pairwise classical coherences (solid lines) are also depicted. In all plots, the x-axis represents the normalized frequency in the 0 to 0.5 range, while the y-axis for power spectrum plots is scaled between 0 and peak value, and values for the other coherence plots lie between 0 and 1.

Fig. 5. Schematic diagram describing the inclusion of feedback from x5(n) to x1(n) into Model I.
Fig. 6. Comparison between the theoretical DTF (a) (and its inferred structural interactions (c)) and the theoretical PDC results (b) (with its signal flow diagram (d)) after turning on the feedback from x5(n) to x1(n). Spectral densities for the time series are depicted along the shaded main diagonal of the arrays. The pairwise classical coherences (solid lines) are also depicted. Thin/dashed arrows portray weaker connections.

Fig. 7. Estimated PDC for Model I without (a) and with (b) feedback from x5(n) to x1(n) using 500 simulated points. Note how x6(n) and x7(n) show residual classical coherence with the other time series, despite their lack of direct connection.
where x1(n) is an oscillator driving, directly or indirectly, all the other structures. In this emulation, the odd numbered signals represent the left hemisphere, leaving the other ones to map the other hemisphere as in Fig. 9. An example of the simulated signals is portrayed in Fig. 10.
Fig. 8. Gray scale plots show the time evolution of the PDC from structure 5 to structure 1 (b), of the PDC from structure 1 to structure 5 (c), and of the classical coherence between these structures (d), calculated from 250 simulated data point segments (overlapping by 50%) as the feedback from x5(n) to x1(n) in Model I is switched on and off (a). The x-axis is time in samples.
DTF results are shown in Fig. 11a together with the corresponding inferred signal flow graph in Fig. 11c. Note that DTF correctly identifies x1(n) as the source driving all other structures (this is the basis of DTF's success as an identifier of epileptic foci in Franaszczuk et al. (1994)). As in the previous example, however, DTF remains ambiguous as to the pathway actually followed by the signal. These possible signal pathway alternatives are resolved in a much simpler fashion by examining PDC in Fig. 11b, which leads to a correct signal flow graph (Fig. 11d).
Fig. 9. Signal flow diagram for Model II.
Fig. 10. Signals obtained by simulating Model II (x-axis: time in samples).

Fig. 11. Comparison between the theoretical DTF (a) (and its inferred structural interactions (c)) and the theoretical PDC results (b) (with its signal flow graph (d)). Spectral densities for the time series are depicted along the shaded main diagonal of the array. The pairwise classical coherences (solid lines) are also shown. In the signal flow graphs (c) and (d), connection arrow thickness is drawn proportional to the magnitude of DTF or PDC. Note that signal power is confined to lower frequencies.

Matching PDC calculations using simulated rather than theoretical values are shown in Fig. 12. According to PDC, x1(n)'s role as a signal source for the whole structure is equally well deducible. A question that can arise about the use of PDC regards what happens when calculations are based on the processing of fewer structures than are actually representative of the structure and dynamics of the process. Suppose we want to infer the direct interhemispheric influence by looking only at pairs of signals from both hemispheres along the sagittal plane, i.e. by computing the pairwise PDC of pairs like (x1(n), x2(n)), (x3(n), x4(n)) and (x5(n), x6(n)) without making joint calculations involving the other structures. The results of these separate pairwise analyses are shown in Fig. 13c, 13b and 13a, respectively. In the presence of actual connections, as for Fig. 13a,c, their mutual feedback is correctly deduced from simulated data, as opposed to the relationship between x3(n) and x4(n), where feedback presence is incorrectly detected despite the absence of an actual direct interconnection. This means that we must not leave out signals from essential structures while performing the joint simultaneous signal analysis for functional structure inference.
The only hope for understanding the relationships among diverse neural structures lies in the processing of a representative number of signals simultaneously.
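The pitfall of pairwise processing can be illustrated numerically. The Python sketch below simulates Model II and compares, for the link x4(n) → x3(n), a time-domain Granger-type ratio of residual variances (prediction of x3(n) without versus with the past of x4(n)) in two settings: using only the pair (x3, x4), and using all six series jointly. This is a simplified stand-in for the pairwise PDC of Fig. 13, not the procedure used there; the model order p = 8, the 20000-sample length, the seed, unit-variance Gaussian innovations and all function names are assumptions made for this illustration.

```python
import numpy as np

def simulate_model_two(n_samples, seed=0, burn_in=2000):
    """Iterate Model II with unit-variance Gaussian innovations (assumed)."""
    rng = np.random.default_rng(seed)
    total = n_samples + burn_in
    w = rng.standard_normal((total, 6))
    x = np.zeros((total, 6))
    for n in range(4, total):
        x[n, 0] = 1.8982 * x[n - 1, 0] - 0.9025 * x[n - 2, 0] + w[n, 0]
        x[n, 1] = 0.9 * x[n - 2, 0] + w[n, 1]
        x[n, 2] = 0.85 * x[n - 2, 1] + w[n, 2]
        x[n, 3] = 0.82 * x[n - 2, 0] + 0.6 * x[n - 3, 5] + w[n, 3]
        x[n, 4] = -0.9 * x[n - 2, 5] + 0.4 * x[n - 4, 1] + w[n, 4]
        x[n, 5] = 0.9 * x[n - 2, 4] + w[n, 5]
    return x[burn_in:]          # discard the initial transient

def lagged(x, p):
    """Stack x(n-1), ..., x(n-p) column-wise for n = p, ..., len(x) - 1."""
    return np.hstack([x[p - r: len(x) - r] for r in range(1, p + 1)])

def residual_variance(target, predictors, p):
    """Least-squares one-step prediction error of target from lagged predictors."""
    X = lagged(predictors, p)
    y = target[p:]
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return np.var(y - X @ coef)

def granger_ratio(x, source, target, others, p=8):
    """Residual variance without the source's past divided by that with it."""
    restricted = residual_variance(x[:, target], x[:, [target] + others], p)
    full = residual_variance(x[:, target], x[:, [target] + others + [source]], p)
    return restricted / full

x = simulate_model_two(20000)
# Channels are 0-based here: x3 is column 2 and x4 is column 3.
pairwise = granger_ratio(x, source=3, target=2, others=[])
joint = granger_ratio(x, source=3, target=2, others=[0, 1, 4, 5])
print(f"x4 -> x3, pair only: {pairwise:.3f}; all six series: {joint:.3f}")
```

A ratio clearly above 1 indicates that the past of x4(n) improves the prediction of x3(n). In the pairwise setting this improvement appears even though Model II has no direct link between the two series, because x4(n) carries an earlier copy of x1(n); once x1(n) and x2(n) are included, the ratio should fall back to approximately 1.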
Fig. 12. Estimated PDC for Model II using 500 simulated data points.

An application to experimental data
We illustrate PDC in connection to local field potentials recorded from a rat in exploratory behavior. The simultaneously processed structures comprise the hippocampal CA1 field, somatosensory (A3) and motor (A10) cortical areas and the dorsal raphe (DR), where rhythmic oscillations in the theta range are observed during desynchronized sleep and alert states. Detailed DTF analysis of the same record of
these four jointly processed structures appeared in Baccalá et al. (1998) with special attention to the relationship between CA1 and A3. Fig. 14 depicts the PDC time evolution between these structures using the electromyogram from neck muscles to label behavioral states. For the first 30 s of this recording segment, the rat actively explored a lighted cage, then gradually turned inactive, as can be followed in the electromyogram. Around 52 s, the rat resumed the exploratory behavior when the cage lights were turned off. As attested by the classical coherence and recording traces of Figs. 15 and 16, rhythmic oscillations are more prominent during active exploratory behavior. In choosing special episodes in this record, we consider the period between 18 and 20 s (Fig. 15a) as characterized by high-amplitude electromyographic activity of neck muscles. The main feature of the DTF's interactions (Fig. 15b) is the fully connected graph (Fig. 15d) that portrays the active participation of all structures. PDC (Fig. 15c) reveals the underlying signal feedback pattern (two-way interactions) and highlights DR's possibly important role. A drastically different picture emerges for the segment between 48 and 50 s of this same record (Fig. 16a), when the animal's neck muscles show low activity. DTF interactions (Fig. 16b) produce a more sparsely connected graph (Fig. 16d), characterized by fewer strong connections. In comparing PDC calculations in Fig. 15c,e against Fig. 16c,e, one notices DR's role reversal, switching from being predominantly a source to being an information sink. Furthermore, note how the influence of CA1 over DR is essentially indirect, with the signal first flowing through A3 and A10 in Fig. 16e, in sharp contrast to the PDC functional connectivity graph corresponding to the exploratory behavior segment (Fig. 15e), where all structures receive substantial influence from dorsal raphe (DR). Also in the quiet state (Fig. 16e), information is mostly being relayed from the other structures through A10 to DR. This example highlights the distinct and potentially interesting functional connectivity patterns that characterize different behavioral states.

Fig. 13. PDC results of processing pairwise structures (5,6) (a), (3,4) (b), and (1,2) (c) representing opposite hemispheres in Model II that show the possibility of incorrect structural inference for the pair (3,4) (b) when not all the structures relevant to the dynamics of the model are considered jointly.

Fig. 14. Time evolution of the PDC analysis highlighting the relationship between A3 and CA1 for a rat whose exploratory behavior is labelled via the electromyogram (EMG) from its neck muscles. A gray scale is used to represent the magnitude of PDC and classical coherence. The corresponding time evolution of the DTF between these structures appeared in Baccalá et al. (1998).

Conclusions and comments
By analyzing linear toy models, we show PDC's superior performance over other commonly used methods, especially cross-correlation and classical coherence, while DTF analysis provides complementary
information whose analysis is less clear than PDC's. The main advantage of PDC lies in the graphically unambiguous frequency domain display of the relationships among simultaneous measurements of several time series, as PDC can clearly expose the feedback structure between directly connected pairs of neural elements provided all the structures representative of the dynamics are jointly processed. Simultaneous recording, as well as the analysis of representative samples of neural elements through multi-site multichannel recording, is therefore crucial for deducing functional connectivity. This acquires special importance in view of the fact that even PDC pairwise analysis may induce misleading conclusions about the nature of the interaction among neural elements. We therefore conclude that techniques based on processing pairwise time series are doomed to failure and that only the processing of many simultaneous structures can lead to an understanding of neural ensemble information processing and coding. Though unaddressed in this paper, statistical issues are important. While asymptotic results for the aij(r) coefficients exist and lead to Granger causality
Fig. 15. Ten-second-long segment recording (corresponding to the segment 15-25 s of Fig. 14, sampled at 256 Hz) (a) from a rat engaged in active exploratory behavior. The upper trace (head) is the electromyogram from neck muscles; the other four traces are local field potentials showing rhythmic theta oscillations recorded from motor (A10) and somatosensory (A3) cortices, hippocampus (CA1) and dorsal raphe (DR). The DTFs (b) determined from segment 18-20 s show strong functional connectivity between most of the structures; weaker directional information flow occurs for the pairs DR → CA1, A10 → CA1 and A10 → A3. In (d) arrows indicate the direction of information flow resulting from DTF analysis; the weaker DTFs are indicated by dashed lines. Matching PDC analysis results (c) from the same segment lead to the functional connectivity graph (e), which shows that DR not only sends but also receives information from the other structures. As in (d), weak PDC values are represented by dashed arrows (pairs A10 → DR, A10 → CA1, A3 → DR, and CA1 → DR). In all plots in (b) and (c), the x-axis represents the frequency in the range 0-32 Hz, while the y-axis for power spectrum plots is scaled between 0 and peak value, and values for the other coherence plots lie between 0 and 1.
Fig. 16. Ten-second-long recording segment (a) (corresponding to the 45-55 s behavioral segment of Fig. 14, sampled at 256 Hz) showing the transition from the quiet state to active exploratory behavior induced by turning lights off at around 52 s. See Fig. 15 for details on channel labels. Note that theta waves become prominent in all four brain structures concomitantly with the onset of electromyographic activity. When compared to Fig. 15d, DTFs determined from segment 48-50 s (b) show a larger number of weaker functional connections (d), indicated by dashed arrows. The corresponding PDC analysis (c) and its functional connectivity graph (e) portray weaker connectivity from DR to all other structures. During this episode the DTF graph connectivities (d) CA1 → DR and A3 → DR are not matched by direct connections in the PDC graph (e). They can, however, be explained respectively by the existence of indirect signal pathways CA1 → A10 → DR and A3 → A10 → DR. Note also that DR is essentially an information sink while CA1 is mainly an information source. In all plots in (b) and (c), the x-axis represents the frequency in the 0 to 32 Hz range, while the y-axis for power spectrum plots is scaled between 0 and peak value, and for coherence plots between 0 and 1.
tests (Lütkepohl, 1993; Baccalá et al., 1998), their usefulness is less clear because of the 'quasi-stationary' nature of neural signals. More importantly perhaps, as our examples show, is the fact that weak connections can be barely detectable. This is especially true in the case of DTF, where the effect of weakly connected signal pathways is compounded (for example see Fig. 6a,c). Also, the number of simultaneously processed structures affects the achievable temporal resolution, as more data points become necessary to ensure statistically reliable detection of weak connections. Finally, it is important to have in mind that, though based on linear time series modelling, PDC proves applicable and useful for the analysis of multiple structures that involve some levels of nonlinear interactions, as discussed in Sameshima and Baccalá (2001). In fact, to our knowledge, PDC is the only existing practical method that effectively goes beyond pairwise analysis and is capable of efficiently handling multiple structures simultaneously, as is essential for reliable functional structural inference.

Acknowledgements

This work received financial support from CNPq (301273198-0, LAB), PRONEX (41.96.0925.00) and FAPESP (96/12118-9, KS; 99/07641-2, LAB) grants to the authors.

Appendix A

If xi(n), 1 ≤ i ≤ N, represent simultaneously processed discrete time signals, the canonical way to represent the relationship between these time series in the frequency domain is via their joint power spectral density matrix, which reads

\[
S(f) =
\begin{bmatrix}
S_{1}(f) & S_{12}(f) & S_{13}(f) \\
S_{21}(f) & S_{2}(f) & S_{23}(f) \\
S_{31}(f) & S_{32}(f) & S_{3}(f)
\end{bmatrix}
\tag{6}
\]

in an N = 3 example. In general, one may calculate S(f) by using a multivariate (vector) autoregressive model (Priestley, 1981; Lütkepohl, 1993)

\[
\begin{bmatrix} x_1(n) \\ \vdots \\ x_N(n) \end{bmatrix}
= \sum_{r=1}^{p} A_r
\begin{bmatrix} x_1(n-r) \\ \vdots \\ x_N(n-r) \end{bmatrix}
+
\begin{bmatrix} w_1(n) \\ \vdots \\ w_N(n) \end{bmatrix}
\tag{7}
\]

with wi(n) standing for white uncorrelated innovation noises. The coefficient matrices A_r, for each lag r, may be estimated either using least squares or fast maximum entropy methods. The appropriate order of the model p can be inferred using Akaike's AIC criterion (Marple, 1987). In this case, the power spectral density matrix may be written as

\[
S(f) = H(f)\,\Sigma\,H^{H}(f)
\tag{8}
\]

where

\[
\Sigma =
\begin{bmatrix}
\sigma_{11}^{2} & \cdots & \sigma_{1N} \\
\vdots & \ddots & \vdots \\
\sigma_{N1} & \cdots & \sigma_{NN}^{2}
\end{bmatrix}
\tag{9}
\]

is the covariance matrix of wi(n), and

\[
H(f) = \bar{A}^{-1}(f) = (I - A(f))^{-1}
\tag{10}
\]

with

\[
A(f) = \left. \sum_{r=1}^{p} A_r z^{-r} \right|_{z = e^{\mathrm{j} 2\pi f}}
\]

At this point, one option for describing the mutual interaction between pairs of time series may be obtained through a generalized definition of the directed coherence from j to i as (Baccalá and Sameshima, 1998; Baccalá and Sameshima, 2001)

\[
\gamma_{ij}(f) = \frac{\sigma_{jj}\,H_{ij}(f)}{\sqrt{S_{i}(f)}}
\tag{11}
\]

where

\[
S_{i}(f) = \sum_{j=1}^{N} \sigma_{jj}^{2}\,|H_{ij}(f)|^{2}
\tag{12}
\]

Remark 1. The definition of DTF in Franaszczuk et al. (1994) is a special case where the σii are made equal to 1 in Eq. 11. Because of its relationship to the power spectral density matrix, |γij(f)| may be interpreted as a fraction of the power originating in xj(n) that reaches xi(n).

An alternative way to describe the mutual interaction is via the aij(r) elements of A_r, i.e. by testing Granger causality directly. Let \bar{a}_{ij}(f) be \bar{A}(f)'s i,j-th element, i.e. the i-th component of the j-th column \bar{\mathbf{a}}_{j}(f) of \bar{A}(f). After suitable normalization, discussed elsewhere (Baccalá and Sameshima, 2001), one possible definition for partial directed coherence is

\[
\pi_{ij}(f) = \frac{\bar{a}_{ij}(f)}{\sqrt{\bar{\mathbf{a}}_{j}^{H}(f)\,\bar{\mathbf{a}}_{j}(f)}}
\tag{13}
\]

Remark 2. This name, 'partial directed coherence', comes from an interpretation of πij(f) as a factor in the partial coherence κij(f) between two time series (Bendat and Piersol, 1986; Baccalá and Sameshima, 2001).

Remark 3. The relationship between PDC and DTF is that they are based, respectively, on the matrices \bar{A}(f) and H(f), which are inverses of one another. This inverse matrix relationship occurs in graph theory, where a matrix like \bar{A}(f) describes the connections of directed graphs while a matrix like H(f) is analogous to the graph reachability matrix, which records the graph structures reachable from a given node (Baccalá et al., 1991).

Remark 4. When N = 2, PDC leads exactly to the same estimator as DTF (Baccalá and Sameshima, 2001).
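Remarks 3 and 4 can be checked numerically with a short Python sketch. It assumes unit innovation variances, so that DTF is Eq. 11 with σii = 1 as in Remark 1, and works with magnitudes only. The three-channel chain x1 → x2 → x3 and the random two-channel coefficients below are hypothetical test cases introduced here, not models from this chapter, and the function names are likewise choices made for the illustration.

```python
import numpy as np

def a_bar(A, f):
    """A_bar(f) = I - sum_r A_r exp(-j 2 pi f r), for A of shape (p, N, N)."""
    p, N, _ = A.shape
    phases = np.exp(-2j * np.pi * f * np.arange(1, p + 1))
    return np.eye(N) - np.tensordot(phases, A, axes=(0, 0))

def pdc(A, f):
    """|pi_ij(f)| of Eq. 13: each column of A_bar(f) is normalized to unit norm."""
    ab = a_bar(A, f)
    return np.abs(ab) / np.sqrt(np.sum(np.abs(ab) ** 2, axis=0))

def dtf(A, f):
    """|gamma_ij(f)| of Eq. 11 with sigma_ii = 1: each row of H(f) is normalized."""
    H = np.linalg.inv(a_bar(A, f))          # Eq. 10
    return np.abs(H) / np.sqrt(np.sum(np.abs(H) ** 2, axis=1, keepdims=True))

# Hypothetical chain x1 -> x2 -> x3 with no direct link from x1 to x3.
chain = np.zeros((1, 3, 3))
chain[0, 0, 0] = 0.5    # x1(n-1) -> x1
chain[0, 1, 0] = 0.8    # x1(n-1) -> x2
chain[0, 2, 1] = 0.8    # x2(n-1) -> x3
pdc_chain = pdc(chain, 0.1)
dtf_chain = dtf(chain, 0.1)
print("1 -> 3 at f = 0.1: PDC =", pdc_chain[2, 0], "DTF =", dtf_chain[2, 0])

# Remark 4: for N = 2 the directed (off-diagonal) terms of PDC and DTF coincide.
rng = np.random.default_rng(1)
pair = 0.2 * rng.standard_normal((3, 2, 2))   # arbitrary order-3 coefficients
off_diagonal = ~np.eye(2, dtype=bool)
same_offdiag = all(
    np.allclose(pdc(pair, f)[off_diagonal], dtf(pair, f)[off_diagonal])
    for f in np.linspace(0.0, 0.5, 11)
)
print("N = 2 off-diagonal terms agree:", same_offdiag)
```

In the chain, the PDC from x1 to x3 vanishes because the corresponding entry of \bar{A}(f) is zero, whereas the DTF is nonzero because x3 is reachable from x1 through x2. This is the sense in which DTF marks reachability and PDC marks direct connections.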
References

Arnhold, J., Grassberger, P., Lehnertz, K. and Elger, C.E. (1999) A robust method for detecting interdependencies: application to intracranially recorded EEG. Physica D, 134: 419-430.
Baccalá, L.A. and Sameshima, K. (1998) Directed coherence: a tool for exploring functional interactions among brain structures. In: M.A.L. Nicolelis (Ed.), Methods for Neural Ensemble Recordings. CRC Press, Boca Raton, FL, pp. 179-192.
Baccalá, L.A. and Sameshima, K. (2001) Partial directed coherence: a new concept in neural structure determination. Biol. Cybern., in press.
Baccalá, L., Nicolelis, M., Yu, C. and Oshiro, M. (1991) Structural analysis of neural circuits using the theory of directed graphs. Comp. Biomed. Res., 24: 7-28.
Baccalá, L.A., Sameshima, K., Ballester, G., Valle, A.C. and Timo-Iaria, C. (1998) Studying the interaction between brain structures via directed coherence and Granger causality. Appl. Sig. Process., 5: 40-48.
Bendat, J.S. and Piersol, A.G. (1986) Random Data: Analysis and Measurement Procedures. John Wiley, New York, 2nd ed.
Brunel, N. and Nadal, J.P. (1998) Mutual information, Fisher information, and population coding. Neural Comput., 10: 1731-1757.
Christakos, C.N. (1997) On the detection and measurement of synchrony in neural populations by coherence analysis. J. Neurophysiol., 78: 3453-3459.
Duckrow, R.B. and Spencer, S.S. (1992) Regional coherence and the transfer of ictal activity during seizure onset in the medial temporal lobe. Electroencephalogr. Clin. Neurophysiol., 82: 415-422.
Eichenbaum, H.B. and Davis, J.L. (Eds.) (1998) Neuronal Ensembles: Strategies for Recording and Decoding. John Wiley, New York.
Franaszczuk, P.J., Bergey, G.K. and Kaminski, M.J. (1994) Analysis of mesial temporal seizure onset and propagation using the directed transfer function method. Electroencephalogr. Clin. Neurophysiol., 91: 413-427.
Glaser, E. and Ruchkin, D. (1976) Principles of Neurobiological Signal Analysis. Academic Press, New York.
Granger, C.W.J. (1969) Investigating causal relations by econometric models and cross-spectral methods. Econometrica, 37: 424-438.
Lütkepohl, H. (1993) Introduction to Multiple Time Series Analysis. Springer, Berlin, 2nd ed.
Marple, S.L., Jr. (1987) Digital Spectral Analysis. Prentice-Hall, Englewood Cliffs, NJ.
Nicolelis, M.A.L. (Ed.) (1998) Methods for Neural Ensemble Recordings. CRC Press, Boca Raton, FL.
Priestley, M.B. (1981) Spectral Analysis and Time Series. Academic Press, London.
Rieke, F., Warland, D., de Ruyter van Steveninck, R. and Bialek, W. (1997) Spikes: Exploring the Neural Code. MIT Press, Cambridge, MA.
Rosenberg, J., Halliday, D., Breeze, P. and Conway, B. (1998) Identification of patterns of neuronal connectivity - partial spectra, partial coherence, and neuronal interactions. J. Neurosci. Methods, 83: 57-72.
Saito, Y. and Harashima, H. (1981) Tracking of information within multichannel EEG record - causal analysis in EEG. In: N. Yamaguchi and K. Fujisawa (Eds.), Recent Advances in EEG and EMG Data Processing. Elsevier, Amsterdam, pp. 133-146.
Sameshima, K. and Baccalá, L. (1999) Using partial directed coherence to describe neuronal ensemble interactions. J. Neurosci. Methods, 94: 93-103.
Schiff, S.J., So, P., Chang, T., Burke, R.E. and Sauer, T. (1996) Detecting dynamical interdependence and generalized synchrony through mutual prediction in a neural ensemble. Phys. Rev. E, 54: 6708-6724.
Schnider, S.M., Kwong, R.H., Lenz, F.A. and Kwan, H.C. (1989) Detection of feedback in the central nervous system using system identification techniques. Biol. Cybern., 60: 203-212.
Yamada, S., Nakashima, M., Matsumoto, K. and Shiono, S. (1993) Information theoretic analysis of action potential trains, I. Analysis of correlation between 2 neurons. Biol. Cybern., 68: 215-220.
M.A.L. Nicolelis (Ed.)
Progress in Brain Research, Vol. 130 © 2001 Elsevier Science B.V. All rights reserved
CHAPTER 4
Distributed processing in cultured neuronal networks
Steve M. Potter *
Division of Biology 156-29, California Institute of Technology, Pasadena, CA 91125, USA
Introduction
Thanks to a number of recent technical advances, it will become increasingly popular to study the very basics of distributed information processing using cultured neuronal networks. Most researchers studying population coding are working with intact, living animals. Clearly, cultured neuronal networks lack many features of real brains, but they retain many others. They develop organotypic synaptic connections and exhibit a rich variety of distributed patterns of electrical activity. Progress in multi-electrode array technology, optical recording, and multi-photon microscopy has made it possible to observe, monitor, stimulate, and manipulate every cell in a cultured monolayer network non-destructively, with temporal resolution in the submillisecond range and spatial resolution in the submicron range. At present, such detailed and complete analysis of neural circuits is not feasible in living animals, or even brain slices. It is an open question, however, whether any of the 'processing' done by cultured neurons is relevant to that carried out by intact brains. This chapter presents efforts from a number of groups that lay the groundwork for an in vitro approach to studying population coding. I will suggest what it might take to advance the state of the art to the point where we can consider studying learning, memory, and distributed information processing in vitro.
Dissociated neuronal networks
Mammalian neurons can be mechanically and enzymatically dissociated from brain tissue and grown in culture for months, given proper attention to sterility, temperature, pH, osmolarity, oxygenation, and a supply of nutrients and growth factors. This technology was worked out years ago (reviewed in Banker and Goslin, 1998), although improvements continue to be made. During the first week in culture, the neurons extend many neurites, form synapses, and begin to develop spontaneous activity (Habets et al., 1987; Corner and Ramakers, 1991; Gross et al., 1993a; Basarsky et al., 1994). These activity patterns, including complex sequences of action potentials in isolation and in bursts (rapid barrages), continue to develop over the course of a month in vitro. Underlying these activity changes are morphological changes of the neurons, as they grow elaborate dendritic and axonal arbors and form numerous synaptic connections (Corner, 1994). Usually (but not always) the neurons are terminally differentiated at the time they are plated onto a culture dish. The glial cells, if present in the dish, continue to divide and proliferate until limited by contact inhibition or exogenous inhibitors of cell division (Banker and Goslin, 1998). Glial cells provide necessary trophic factors for cultured neurons (Meyer-Franke et al., 1995; Banker and Goslin, 1998), and there is evidence that direct contact between neurons and glia is also crucial for neuronal survival, if not synaptic processing as well (Pfrieger and Barres, 1997).
*Corresponding author: S.M. Potter, Division of Biology 156-29, California Institute of Technology, Pasadena, CA 91125, USA. E-mail:
[email protected]
Multi-electrode array history
Traditionally, the excitable properties of neuronal cultures are studied using glass micropipet electrodes. Because each electrode must be held and tediously positioned by a bulky mechanical micromanipulator, it is very difficult to record from or stimulate more than a couple of cells at a time. This limitation has not prevented neurophysiologists from learning much about single-cell properties, ion channels, pharmacology, and synaptic plasticity in vitro (Cotman et al., 1988; Misgeld et al., 1998). However, like observing the world through a drinking straw, these approaches miss many of the collective properties of neuronal networks. Multi-electrode array culture dishes allow simultaneous recording from and stimulation of over a hundred neurons, greatly expanding our field of view, while keeping the single cell in sharp focus. These wired Petri dishes are most often referred to as MEAs (multi-electrode arrays or micro-electrode arrays), but have also been called multi-microelectrode plates, planar electrode arrays, and multi-electrode dishes. MEA technology enables the study of distributed patterns of electrical activity in cultured networks via non-invasive extracellular electrodes built into the substrate. These electrodes can also be used to stimulate neurons extracellularly and non-destructively (Regehr et al., 1989; Gross et al., 1993b), allowing a long-term two-way connection between a cultured neuronal network and a computer.
MEAs have been around for a while. Thomas and co-workers first described multi-electrode arrays for monitoring activity in electrically excitable cells in 1972 (Thomas et al., 1972). They recorded field potentials from spontaneously contracting sheets of cultured chick cardiac myocytes, but could not record activity from single cells. A few years later, Pine (1980) and Gross et al. (1982) independently developed arrays for chronic multi-single-cell recording and electrical stimulation of cultured neuronal networks. Until recently, custom-made MEAs, hardware and software were created by each of the labs that dared to get involved in this technically demanding field (Pine, 1980; Israel et al., 1984; Novak and Wheeler, 1986; Connolly et al., 1990; Eggers et al., 1990; Janossy et al., 1990; Borroni et al., 1991; Jimbo and Kawana, 1992; Martinoia et al., 1993; Gross and Schwalm, 1994). Fortunately, MEA technology is now accessible to labs that do not care to delve into the subtleties of computer programming, array microfabrication and electronics development. Complete MEA systems capable of recording from at least 60 electrodes are produced by MultiChannel Systems of Germany (the 'MEA60' ¹), and Panasonic of Japan (the 'MED System' ²). Guenter Gross (U. of N. Texas) supplies MEAs that can be used with multi-electrode processing hardware and software made by Plexon Inc. ³ Only very recently has computer and data storage technology made it feasible to record continuously from 60 electrodes, at sampling rates over 20 kHz/channel. We will continue to see rapid advances in the capabilities of commercial MEA systems, propelled by advances in microfabrication, computer speed, and data analysis. The in vivo multi-electrode probe community is also helping to advance the state of the art, since they share many of the same hardware and data analysis problems with the in vitro community. ⁴
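To give a sense of scale for the continuous recording mentioned above, the arithmetic can be sketched as follows. The electrode count and sampling rate come from the text; the 2 bytes per sample (a 16-bit converter) is an assumption made only for illustration, not a specification of any commercial system.

```python
# Back-of-envelope storage estimate for continuous MEA recording.
N_ELECTRODES = 60          # from the text
SAMPLE_RATE_HZ = 20_000    # from the text (20 kHz per channel)
BYTES_PER_SAMPLE = 2       # ASSUMPTION: 16-bit samples


def bytes_per_second(n_channels, sample_rate_hz, bytes_per_sample):
    """Raw, uncompressed data rate in bytes per second."""
    return n_channels * sample_rate_hz * bytes_per_sample


rate = bytes_per_second(N_ELECTRODES, SAMPLE_RATE_HZ, BYTES_PER_SAMPLE)
gb_per_hour = rate * 3600 / 1e9
gb_per_day = gb_per_hour * 24

print(f"{rate / 1e6:.1f} MB/s, {gb_per_hour:.2f} GB/hour, {gb_per_day:.1f} GB/day")
```

Under these assumptions the raw stream is about 2.4 MB/s, or roughly 200 GB per day, which suggests why continuous recording depended on advances in storage technology.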
¹ http://www.multichannelsystems.com
² http://www.panasonic.com/medical_industrial/med-index.html
³ http://www.plexoninc.com/
⁴ I set up an internet mailing list, The MEA-Users, to facilitate interaction within and between these groups and provide a clearing-house for rapid dissemination of relevant information. To subscribe, send the message (no subject line, no quotation marks, no signature) 'subscribe mea-users' to [email protected]. To receive a description of the group, send the message 'info mea-users' to the same address.
MEA fabrication
MEAs consist of a number of cell-sized electrodes (10-100 µm) arrayed across the bottom of a cell culture dish. The substrate is usually glass, with leads made of gold or the transparent conductor indium-tin oxide, that carry signals from electrodes to external electronics, and carry stimuli to the electrodes (Fig. 1). (Indium-tin oxide electrodes and leads are commonly used for liquid-crystal displays on digital watches and other consumer electronics.) MEA electrodes must be biocompatible, durable, and have a reasonably low impedance (less than 500 kΩ at 1 kHz) to allow the detection of small extracellular signals (from 10 to 100 microvolts). The low impedance also allows sufficient stimulation current to be passed without exceeding the electrochemical breakdown voltage of water and other components of the medium (usually around one volt). The electrodes on MEAs have traditionally been electroplated with porous platinum ('platinum black'). This is not very durable, and thus the impedance rises unacceptably when the MEAs are re-used and even during long-term culturing. This problem can be greatly reduced by electroplating while sonicating, which allows only durable platinum crystals to form (Marrese, 1987). Recently, relatively tough, low-impedance electrode coatings have been created by sputtering iridium oxide (Blau et al., 1997) or titanium nitride (Egert et al., 1998). The surface of the MEA and the electrode leads are coated with some biocompatible insulator (usually polyimide or silicon nitride/oxide) that prevents electrical shorting to the bath, and allows cell adhesion after coating with traditional cell culture substrates such as polyamino acids and laminin.
Fig. 1. 60-electrode MEA from MultiChannel Systems, with developing rat cortical culture after 6 days in vitro. The 10-µm diam. electrodes (not visible in the center of a 30-µm gold disk) are 200 µm apart, with sputtered titanium nitride to reduce impedance. The gold leads (beneath an insulating layer of silicon nitride) travel under a glass ring, containing cell culture medium, to contacts around the perimeter of the 50-mm glass plate (inset). These are connected to preamplifiers, analog-to-digital converters, and to a computer. They can also be connected to stimulation circuitry, allowing long-term two-way communication between the neurons growing over the electrodes and the computer.
MEAs have also been fabricated out of silicon (Pancrazio et al., 1998; Maher et al., 1999), and there has been some success recording from and capacitively stimulating neurons growing on the insulated gates of silicon field-effect transistors (Fromherz and Stett, 1995; Offenhausser et al., 1997; Vassanelli and Fromherz, 1997). The Pine group has produced a 16-well silicon 'neurochip' designed to hold 16 neurons in close apposition to electrodes at the bottom of the wells (Maher et al., 1999). It has been difficult to design an effective grillwork on the well that keeps the cell soma in the well, yet allows neurites to grow out and make contacts. My Pine lab colleagues and I have observed that neurons persistently escape from the wells, especially if there are glial cells nearby for them to adhere to.
MEAs for studying neural coding
Retina researchers have provided fruitful examples of the directions we may wish to take with dissociated cultured networks. The 61-electrode Pine-style MEAs have been used quite successfully by several groups studying processing in the retina (Wong et al., 1993; Warland et al., 1997; Nirenberg and Latham, 1998). Explanted retinas are laid down on the MEAs, and exposed to various types of light stimuli, while recording ganglion cell responses (Meister et al., 1994). For the retina, the appropriate inputs are reasonably well-defined, that is, spatial patterns of light; and the sole output of the retina is sequences of action potentials in retinal ganglion cells. Nirenberg and Latham (1998) suggest that knowing the input-output relationship of the retina is equivalent to knowing how it encodes a visual stimulus. The words 'encoding' and 'processing' suggest some sort of non-trivial transformation of information. What would it take to believe that a dissociated cultured network had performed a non-trivial transformation of information? Sakurai (1996) as well as contributions in this volume demonstrate that intact brains rely on population coding. To verify that neurons can also make use of population coding in vitro, we must first devise a system in which there is any coding at all. Inputs, outputs, and a non-trivial transformation must be defined.
Recent progress with MEAs
Understanding the relevant parameters for coding in cultured networks is likely to require long-term monitoring and stimulation. MEAs make this possible because, unlike glass micropipets, MEA electrodes are non-invasive. For example, Welsh and co-workers demonstrated the usefulness of MEAs for chronic recording from cultured networks of cells from the rat suprachiasmatic nucleus (SCN) (Welsh et al., 1995). The circadian activity intrinsic to these neurons was followed continuously for weeks, and it was demonstrated to be a single-cell property, not a network property, by reversibly blocking synaptic transmission with tetrodotoxin. After washout, the activity of individual neurons resumed in phase with their pre-treatment activity. This and more recent studies (Herzog et al., 1997, 1998; Honma et al., 1998) have shown that SCN neurons in culture exhibit a variety of circadian frequencies and phase relationships, and the mechanisms by which they are synchronized in vivo are now being tested in vitro (Liu et al., 1997).
Potential 'outputs' of cultured networks might be the recurring patterns of action potential firing they spontaneously exhibit. The Gross lab has pioneered the analysis and categorization of the "bewildering variety of spatio-temporal spike and burst patterns" (Gross and Kowalski, 1991, p. 66) in MEA cultures prepared from mouse spinal cord (Droge et al., 1986; Gross and Kowalski, 1991; Gross et al., 1993a; Rhoades et al., 1996). Of 120 cultures surveyed for spontaneous activity between 3 and 12 weeks in vitro, 60% showed "predominant bursting with an ever-changing sequence of random, patterned (possibly chaotic), and short periodic burst sequences" (Gross and Kowalski, 1991, p. 66). 10% of the cultures were silent (but activity could be induced pharmacologically), 20% exhibited mostly isolated action potentials and little bursting, and 10% exhibited periodic bursting. This activity is usually synchronized across all active electrodes. They demonstrated that cultures could be switched between different modes of bursting (e.g., periodic vs. random) by washing in and out various pharmacological agents (Gross et al., 1993a). Activity on each electrode was summed using a 'leaky integrator' process in which each action potential causes the pen of a chart recorder to rise a fixed, tiny increment, while it descends exponentially with a slow time constant (e.g., 300 ms). This process facilitates burst analysis, with each burst appearing as a large peak on the chart. However, it discards the subtle timing information of individual action potentials within and between bursts.
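The 'leaky integrator' summation described above can be illustrated with a minimal sketch. This is not the Gross lab's implementation (which drove a chart recorder); the function name, the 1-ms time step, and the unit increment are assumptions for illustration. Only the 300-ms time constant is taken from the text.

```python
import math


def leaky_integrate(spike_times_ms, duration_ms, dt_ms=1.0, tau_ms=300.0, increment=1.0):
    """Return a leaky-integrator trace sampled every dt_ms.

    Each spike raises the level by a fixed increment; between spikes the
    level decays exponentially with time constant tau_ms.
    """
    n_steps = int(round(duration_ms / dt_ms))
    decay = math.exp(-dt_ms / tau_ms)

    # Count spikes falling into each time step.
    counts = [0] * n_steps
    for t in spike_times_ms:
        idx = int(t // dt_ms)
        if 0 <= idx < n_steps:
            counts[idx] += 1

    trace = []
    level = 0.0
    for c in counts:
        level = level * decay + increment * c
        trace.append(level)
    return trace


# Hypothetical data: a burst of 10 spikes 5 ms apart, then one isolated spike.
burst = [100.0 + 5.0 * k for k in range(10)]
isolated = [3000.0]
trace = leaky_integrate(burst + isolated, duration_ms=4000.0)

burst_peak = max(trace[100:200])
isolated_peak = trace[3000]
print(round(burst_peak, 2), round(isolated_peak, 2))
```

In this toy example the burst produces a peak roughly nine times higher than the isolated spike, which is why bursts stand out on the chart. The trace retains only a smoothed level, so the individual 5-ms inter-spike intervals cannot be recovered from it.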
Recurring patterns of action potential firing with precise timing have been observed in a number of brain circuits, such as the hippocampus (Nadasdy et al., 1999), respiratory centers (Frostig et al., 1990) and cortex (Abeles et al., 1994). It is clear that the arrival time of individual action potentials carries a lot more information in animals than does the mean firing rate (reviewed in Gerstner et al., 1997 and Rieke et al., 1997). If such activity patterns exist in cultured networks, their dynamics might be overlooked using the 'leaky integrator' approach. Evidence that such subtle, recurring action potential patterns do exist in cultured networks comes from the Kawana lab at Nippon Telegraph and Telephone in Japan. Using their own custom MEA hardware and software, they have pioneered the study of plasticity in spontaneous and stimulated activity patterns in dissociated rat cortical cultures (Jimbo and Kawana, 1992; Robinson et al., 1993a,b; Maeda et al., 1995, 1998; Kamioka et al., 1996; Kawana, 1996; Watanabe et al., 1996; Canepari et al., 1997; Jimbo et al., 1998, 1999; Konno et al., 1998; Tateno and Jimbo, 1999). Jimbo and co-workers used MEAs to reveal distributed changes in the network properties of cortical cultures as a result of extracellular stimulation via the substrate electrodes. They elegantly demonstrated that they could induce both potentiation and depression of network activity in a pathway-specific manner (Jimbo et al., 1999). They used cultures that had been growing on 64-electrode MEAs for at least one month. After this time, the cultures have reached a developmentally stable period (Jimbo et al., 1999), exhibiting a complicated pattern of spike-firing and bursting (Kamioka et al., 1996). They monitored network response to a single probe pulse stimulus (biphasic: 100 µs +0.6 V, 100 µs -0.6 V) applied to each electrode in succession at 3-s intervals. The responses at each electrode (the activity of one to five neurons near it) were averaged for 10 scans of this probe pulse across all channels. The response of the whole network to probe pulses on any given electrode was quite reproducible for the first 50 ms after the pulse (Fig. 2).
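Averaged responses to repeated probe pulses are commonly summarized as a post-stimulus time histogram, as in Fig. 2. The sketch below shows one way such a histogram could be computed. The function name, the spike latencies, and the conversion to spikes/s are illustrative assumptions, not the authors' analysis code; the 0.5-ms bin width and 80-ms window follow Fig. 2.

```python
def psth(trials, window_ms=80.0, bin_ms=0.5):
    """Post-stimulus time histogram.

    trials: list of lists of spike latencies (ms after the probe pulse).
    Returns the mean firing rate per bin, in spikes/s per trial.
    """
    if not trials:
        raise ValueError("need at least one trial")
    n_bins = int(round(window_ms / bin_ms))
    counts = [0] * n_bins
    for spikes in trials:
        for t in spikes:
            idx = int(t / bin_ms)
            # Ignore spikes before the pulse or beyond the window.
            if t >= 0 and idx < n_bins:
                counts[idx] += 1
    scale = 1000.0 / (bin_ms * len(trials))
    return [c * scale for c in counts]


# Hypothetical latencies: an early, tightly locked spike and a later jittered one.
trials = [[5.2, 30.1], [5.3, 41.7], [5.1, 22.4]]
rates = psth(trials)
peak_bin = rates.index(max(rates))
print(peak_bin * 0.5, "ms")
```

A response that is reproducible from trial to trial, like the early spike here, piles up in one bin, while jittered spikes spread thinly across many bins.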
[Figure 2: panels A, B, and C; horizontal axes: time (msec), 0 to 80.]
Fig. 2. Example of the reproducible response of an MEA culture to probe stimuli, adapted from Tateno and Jimbo, 1999 (with permission). Raster plots (top) and post-stimulus time histograms (bottom, 0.5 ms bins) of action potentials recorded from one MEA electrode are shown for three different blocks of 50 probe pulses applied to electrode 'C2,R2'. Between blocks A and B, a strong tetanic stimulus was applied to electrode 'C5,R6', and between blocks B and C, the tetanic stimulus was applied to both 'C5,R6' and 'C2,R2'. Note how the response timing generally shortens and sharpens up with stimulation, yet still contains reproducible patterns past 50 ms after the probe pulse.
To induce synaptic weight changes, a strong stimulus was delivered to the network at a single site (tetanic pulse sequence of 20 trains (5-s intervals) of 10 pulses (20 Hz, as above)). Finally, the original 10 scans of network response to single probe pulses were repeated. Across 8 MEA cultures studied (41-53 days in vitro), an average of 22 electrodes (out of 64) per dish showed a potentiated response after single-site tetanus, while an average of 6 electrodes (out of 64) showed a depressed response. An analysis of cross-correlations between the activity on the tetanized electrode and on the others was very revealing: those neurons that tended to fire in synchrony with the tetanized pathway were potentiated. Those whose correlation was poor gave a depressed response. Interestingly, both potentiated and depressed pathways showed enhanced synchrony with the neurons recorded on the tetanized electrode, after tetanus. They concluded that potentiation and depression of pathways in these cultures are two possible outcomes of the same process, whose details are still unknown. Tightly correlated pathways become potentiated, loosely correlated pathways become depressed. This study represents a significant advance on the paired-cell recording and stimulation work that showed similar influence of relative spike timing on plasticity (Bi and Poo, 1998; Markram et al., 1998; Zhang et al., 1998), because the multi-electrode approach showed that the changes were synapse-specific and network-wide, not cell-specific. That is, looking at activity on a specific electrode, they saw enhanced responses to some probe stimuli (as they sent a single probe pulse to each electrode in turn), and depressed responses to others, all resulting from tetanus at a single electrode in the MEA. However, this conclusion is weakened by the fact that some electrodes contact more than one neuron, and the number of cells directly activated by the tetanus is not known. They showed previously that while intracellular tetanus to a single cell had no effect on network activity, extracellular tetanus (presumably exciting more than one cell near the electrode) evoked a large network response (Jimbo et al., 1993). In a separate study by Tateno and Jimbo, a similar tightening of synchrony was observed as a result of tetanic stimulation (Tateno and Jimbo, 1999, fig. 2). The authors hypothesize that "changes in synaptic efficacy enhance or reduce the reliability and reproducibility of spatially correlated neuronal responses in networks" (p. 45). In none of these studies did they monitor the changes in synaptic weight past one hour. It would be informative to carry out more long-term recording to determine how permanent the changes induced by various types of stimulation are.
Brain slices on MEAs
There are a number of groups applying MEA technology to brain slices, either acute (freshly cut) or maintained in organotypic culture (Wheeler and Novak, 1986; Novak and Wheeler, 1988; Borroni et al., 1991; Boppart et al., 1992; Heck, 1995; Borkholder et al., 1997; Stoppini et al., 1997; Thiebaud et al., 1997, 1999; Egert et al., 1998; Fejtl et al., 1998; Duport et al., 1999; Jahnsen et al., 1999). Slices have the advantage that their cytoarchitectonics and connectivity are closer to those of intact brains, compared to dissociated cultures. Thus their 'inputs' and 'outputs' might be more clearly defined. However, for analyzing networks and cells in great detail, brain slices have many of the same problems as whole animals, with too many cells packed too closely.
MEAs record field potentials from slices, not single-cell activity. For acute slices, there is the concern that the electrodes are closest to a layer of dead or dying cells near the cut surface. To surmount this problem, some are experimenting with MEAs that have electrodes on the ends of small spikes (Thiebaud et al., 1999). This 'bed-of-nails' approach might allow recording and stimulation of more healthy cells within the slice. For cultured slices, the problem is that there is poor access of oxygen and nutrients to the cells at the bottom of the slice. Thus, the ones near the electrodes again are the least healthy. The creation of porous MEAs may eliminate this problem (Boppart et al., 1992; Stoppini et al., 1997; Thiebaud et al., 1999). Because of these difficulties, the slice-MEA field is still in its infancy, but we can expect some advances in the near future that will help fill the gap between intact brains and dissociated networks.
Optical imaging of cultured networks
Unlike slices, dissociated neural cultures form a monolayer on a clear substrate, lending themselves well to optical recording of activity in individual cells. By imaging the calcium signals in developing cortical cultures using the calcium-sensitive dye, Fluo-3, Voigt and co-workers showed that cells that fired bursts in synchrony with the rest of the culture (on the time scale of seconds) survived better (63% survival 4 days after optical recording) than those that did not (22% survival, asynchronous and nonbursting groups combined) (Voigt et al., 1997). This suggests that neural co-activation plays an important developmental role in network architecture, even in vitro. The temporal resolution of calcium imaging systems is usually not fast enough to see individual action potentials, only bursts of them. Jimbo and co-workers used simultaneous MEA and optical recording to verify that these optical calcium signals correspond to bursts of electrical activity (Jimbo et al., 1993). It remains to be determined whether subtleties in action-potential timing responsible for the synaptic weight changes observed by Jimbo et al. (as described above) are also involved in neuronal survival. Optical recording of membrane voltage, in contrast to imaging calcium signals, can provide a direct, fast measure of electrical activity in many individual neurons of neuronal networks.
In 1973, Davila, Salzberg, and Cohen presented the first optical recording of an action potential using a voltage-sensitive dye and a single photodiode, and proposed that: "An apparatus with a large number of photodiodes, arranged so that each detector would receive the light from an individual cell body, could, with a small computer, monitor the activity of, perhaps, a hundred cells at once. Such a large increase in the number of monitored cells could facilitate the determination of functional connexions between cells, and ultimately lead to an understanding of the neuronal basis of behaviour" (Davila et al., 1973, p. 160). Since then, multi-single-neuron optical recording of voltage signals (as distinct from optical recording of field potentials or intrinsic signals in bulk tissue) has been used with great success in invertebrate ganglia (Wu et al., 1994, 1998), and recently in mammalian intestinal enteric plexus, which is naturally a monolayer network (Obaid et al., 1999). Thus, it is reasonable to expect that optical recording should allow the observation of distributed processing in small networks of cultured neurons, in even greater detail than using MEAs. Three things have impeded the realization of this goal: (1) the optical signals from cultured (especially mammalian) neurons are very small, usually less than 1% change during an action potential; (2) sensitive, fast imaging systems with submillisecond and single-cell resolution are not readily available; and (3) the potentiometric dyes used tend to be very phototoxic and photobleach (fade) rapidly. It is well worth trying to overcome these difficulties, and to combine optical recording with MEAs. Because electrically recorded extracellular signals are approximately 100 µV or less, and extracellular stimuli are often 10,000 times larger, electrical recording from an MEA electrode during stimulation is not feasible. Optical recording during stimulation would make it possible to observe exactly which cells were stimulated by current injection through substrate electrodes. It would also allow us to observe activity in cells too far from substrate electrodes to record from electrically.
Spike-sorting algorithms could be tested out on cases where the same cells are recorded optically and electrically. The optical recordings would provide the 'ground truth', that is, exactly where each cell is in relation to the electrode and when it fired. These tests would be of interest to the in vivo multi-electrode probe community, where the ground truth for the multiunit activity picked up by the probe is not readily accessible to the experimenter.
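One way such a test could be scored is sketched below: spike times assigned to a unit by a sorting algorithm are matched against optically recorded 'ground truth' spike times within a small tolerance. The function name, the 1-ms tolerance, and the spike times are hypothetical, and the greedy matching is just one simple choice among several.

```python
def score_sorting(truth_ms, detected_ms, tolerance_ms=1.0):
    """Match detected spikes to ground-truth spikes, one-to-one.

    Returns (hits, misses, false_positives).
    """
    truth = sorted(truth_ms)
    detected = sorted(detected_ms)
    i = j = hits = 0
    while i < len(truth) and j < len(detected):
        diff = detected[j] - truth[i]
        if abs(diff) <= tolerance_ms:
            hits += 1          # matched pair
            i += 1
            j += 1
        elif diff < 0:
            j += 1             # detection with no true spike nearby
        else:
            i += 1             # true spike that the sorter missed
    misses = len(truth) - hits
    false_positives = len(detected) - hits
    return hits, misses, false_positives


# Hypothetical spike times (ms): optical ground truth vs. one sorted unit.
optical = [10.0, 25.0, 40.0, 90.0]
sorted_unit = [10.3, 24.6, 60.0, 90.4]
print(score_sorting(optical, sorted_unit))  # (3, 1, 1)
```

Here three of the four true spikes are recovered, one is missed, and one detection has no optical counterpart.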
Traditionally, optical recording is done using photodiode arrays, with 10 × 10 or 25 × 25 pixel resolution (Chien and Pine, 1991). To allow higher-resolution high-speed imaging, Pine and I designed and built a CCD (charge-coupled device) camera with 64 × 64 28-µm pixels capable of recording spontaneous and evoked action potentials (in a single trial) in cultured rat neurons (Pine and Potter, 1997; Potter et al., 1997b). This camera has the unique ability to digitize any arbitrary combination of pixels, and pass over uninteresting ones, to allow imaging at over 1000 frames/s. Bullen, Patel and Saggau created a functionally similar, but entirely original optical recording device that rapidly scans a laser beam from cell to cell using computer-driven acousto-optic deflectors (Bullen et al., 1997). This was used to record optical signals in single cultured hippocampal neurons with a 5 mV, 0.5 ms resolution (Bullen and Saggau, 1999) (Fig. 3). Such a device should be capable, as should our high-speed CCD, of detecting subthreshold spontaneous activity simultaneously in over a hundred neurons. However, the necessary light dose is quite damaging. In an effort to reduce photodamage, Obaid et al. (1999) bathed enteric neurons in a cocktail of the carotenoid pigment astaxanthin, and the enzymes glucose oxidase and catalase. Presumably by reducing oxygen concentrations and free-radical-mediated reactions, this mixture allowed continuous recording for up to 5 min. This is a tremendous improvement over the commonly accepted few seconds of potentiometric dye recording, but still a long way from being able to record for hours or days, as with MEA electrodes. Blau, Friedrich, and I are presently exploring new dyes, filter combinations, and voltage-sensitive fluorescent proteins (VSFPs) (Siegel and Isacoff, 1997; Blau, 1999; Friedrich et al., 1999), to enhance signal-to-noise ratios and reduce phototoxicity and photobleaching.
Until significant progress is made in reducing the photodamage problems, the optical recording approach is limited to short-term, terminal experiments.
[Figure 3: recording traces; vertical axis: Vm (mV), -80 to 40; horizontal axis: Time (ms), 0 to 200.]
Fig. 3. Optical recordings made from different parts of a single cultured rat hippocampal neuron, using the laser-scanning system of Bullen et al., 1997 (reprinted with permission). The top trace is a standard whole-cell electrode recording, showing both spontaneous (marked with an asterisk) and elicited action potentials. The second trace shows current injected (100 pA). The bottom three traces show single-trial (no averaging) fluorescence signals from the circled regions of a neuron stained with the voltage-sensitive fluorescent dye, di-8-ANEPPS.
Flat monolayer cultures also lend themselves to detailed morphological analysis by imaging at much slower time scales. The advent of 2-photon laser-scanning microscopy (Denk et al., 1990) has made it possible to carry out time-lapse imaging of fluorescently labeled neurons continuously for many hours without concern about photodamage (Potter, 1996). Time-lapse imaging allows us to observe how changes in cellular and network morphology relate to changes in the electrical properties of the network. High-resolution (submicron) time-lapse imaging can also be carried out non-destructively using sensitive cooled scientific CCD cameras (Ramakers et al., 1998). However, mature MEA cultures can be quite complex, with many overlapping neurites. By labeling a subpopulation of the network with lipophilic dyes, one can follow changes in individual cells in crowded cultures (Potter et al., 1996; Potter, 2000). My colleagues at Caltech are developing more long-lasting labeling using viruses to infect cells with the gene for different colored fluorescent proteins (Okada et al., 1999; Nadeau et al., 2000). These should allow new lines of inquiry relating cellular morphology to electrical activity, which are difficult or impossible to carry out using living animals.
Embodied, situated neuronal cultures Even if optical and MEA technologies are capable of observing and influencing distributed patterns of activity in cultured networks, they will not allow us to say much about learning, memory, and information processing because these networks are removed from a body, and therefore isolated from the rest of the world. There is a movement gaining momentum that neural systems should not be studied in isolation (Clark, 1997). They evolved to serve a body, and that body interacts with an environment. They are described as embodied and situated. This notion has been promoted, at several conferences on the Simulation of Adaptive Behavior, as the 'animats approach' (Meyer and Wilson, 1991; Meyer and Guillot, 1994). An animat is a simulated animal. Animats, either software simulations or actual robots, have been used to develop more 'natural' artificial
57 intelligence, that is good at solving the types of problems real animals have to solve, such as locomotion, obstacle avoidance, finding food, or group behaviors such as flocking (Meyer and Guillot, 1994). The animats approach may solve the problem that unlike retinas, neural cultures lack obvious inputs and outputs. It may suggest candidate population codings or non-trivial transformations that could be carded out by a network of neurons growing in a dish suitably interfaced with a computer using MEAs. In order to embody a cultured network, we are creating the first neurally controlled animat (Potter et al., 1997a, DeMarse et al., 2000), a culture of dissociated cortical neurons on an M E A whose electrical activity controls the behavior of a simulated animal on a computer. An embodied culture capable
of behaving may then exhibit changes in behavior as a result of experience, that is, learning. The animat is situated within a computer-simulated environment, a sort of 'virtual reality'. Sensory input to the animat is fed back to the culture as patterns of electrical stimulation, in real time, allowing a sensory-motor feedback loop (Fig. 4). The behaviors and the environment give meaning to the patterns of activity within the culture. This meaning has been the key missing element in the study of population coding in cultured networks. Without it, we are merely studying the collective dynamics of a network of coupled excitable elements. But as soon as these dynamics are imbued with meaning by connecting the culture to a body and situating it within an environment, we can legitimately discuss the processing of information by cultured cells. Because the mapping from neural activity to the animat's behaviors is arbitrary, as is the mapping of sensory input to patterns of stimulation, we are much less constrained than those studying population coding in vivo. Distributed coding may exist at many different spatial and temporal scales. Rybka and I have begun to characterize the types of patterns that may be used to control the animat's behaviors (Rybka, 1999). DeMarse and I are exploring which parameters of patterns of extracellular electrical stimuli through the MEA substrate generate robust network responses (T. DeMarse and S. Potter, unpublished). These will be used as sensory inputs to the neurally controlled animat. Eventually, this system may help to bridge the gap between top-down (behavioral, cognitive) and bottom-up (molecular and cellular) neuroscience approaches.

[Figure 4 diagram; recoverable labels: neurally controlled animat in virtual environment, high-speed CCD camera, sensory input, stimulator.]
Fig. 4. Plan for an embodied cultured neuronal network. MEA technology allows us to create a long-term two-way communication between a small network of cultured cells and a computer. The computer uses patterns detected in the spontaneous neural activity to control the behavior of a simulated animal, the 'neurally controlled animat'. This animat is situated within a simulated environment, and its sensory inputs are fed back to the culture as spatio-temporal patterns of electrical stimulation. This allows one to do 'in vitro neuroethology'. Because MEA cultures are so accessible, we can follow changes in great detail at the millisecond time scale (with high-speed optical recording) and at the minutes or hours time scale (with two-photon microscopy), to make connections between the animat's behavior and the morphology and activity patterns of the neurons and supporting cells.
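The sensory-motor loop described in this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the system being built: the living culture is replaced by a hypothetical stand-in function (`fake_culture_response`), the world is a one-dimensional corridor, and the two mappings (`decode_motor`, `encode_sensory`) are arbitrary choices, which is exactly the freedom the text points out.

```python
import random

N_ELECTRODES = 60  # roughly the electrode count of the MEAs discussed in the text


def fake_culture_response(stimulated, rng):
    """Hypothetical stand-in for the living culture (an assumption, not a model).

    Returns one spike count per electrode; stimulated electrodes fire more.
    """
    return [rng.randint(0, 5) + (3 if i in stimulated else 0)
            for i in range(N_ELECTRODES)]


def decode_motor(spike_counts):
    """Arbitrary activity-to-behavior mapping: returns a velocity in [-1, 1].

    Activity on the first half of the array pushes the animat to the right,
    activity on the second half pushes it to the left.
    """
    half = len(spike_counts) // 2
    first, second = sum(spike_counts[:half]), sum(spike_counts[half:])
    return (first - second) / max(first + second, 1)


def encode_sensory(x, world_size):
    """Arbitrary sensation-to-stimulation mapping: which electrodes to stimulate.

    Being near the left wall stimulates the first five electrodes, being near
    the right wall stimulates the last five, and open space stimulates none.
    """
    if x < 0.2 * world_size:
        return set(range(5))
    if x > 0.8 * world_size:
        return set(range(N_ELECTRODES - 5, N_ELECTRODES))
    return set()


def run_loop(steps, world_size=100.0, seed=0):
    """Run the closed loop and return the animat's position after each step."""
    rng = random.Random(seed)
    x, stimulated, trajectory = world_size / 2, set(), []
    for _ in range(steps):
        counts = fake_culture_response(stimulated, rng)    # record activity
        velocity = decode_motor(counts)                    # activity -> behavior
        x = min(max(x + 5.0 * velocity, 0.0), world_size)  # move, clipped at walls
        stimulated = encode_sensory(x, world_size)         # sensation -> stimulation
        trajectory.append(x)
    return trajectory
```

Calling `run_loop(50)` returns fifty positions inside the corridor. In a real experiment the stand-in function would be replaced by recording from and stimulating through the MEA, and any change in the trajectory over time would be the behavioral readout.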
Future innovations
Now that MEA hardware is readily available, multiunit researchers are presently hampered most by the paucity of powerful software tools that allow spike detection, spike sorting, and recognition of dynamic spatio-temporal patterns of neural activity in real time. A number of other fields, such as satellite imaging or economics, are also generating very large data sets in need of automated analysis and this has resulted in a boom in the 'data mining' meta-field (Fayyad et al., 1996) that MEA researchers will certainly benefit from. Number-crunching of multi-neuron signals, recorded either optically or electrically, would seem to be a perfect application for parallel processing systems. The signal from each electrode, pixel, or neuron could be analyzed by a single microprocessor of a many-processor computer. Already a large bank of digital signal processors (Wheeler and Valesano, 1985), such as the system developed by Plexon Inc., is used by a number of labs to do real-time spike-sorting and analysis of MEA data. Even for a thousand-neuron culture, to record with 60 electrodes is a vast undersampling of the net's activity. Assuming the hardware and software can keep up (a difficult task!), it would be useful to have MEAs with many more electrodes. Getting all those signals out to external electronics presents a significant wiring problem that might be solved using multi-layer fabrication, or on-chip multiplexing and analog-to-digital conversion. Progress is being made in this direction for both in vivo probes (Najafi and Wise, 1986; Ji and Wise, 1992) and in vitro MEAs (Pancrazio et al., 1998). 2-Photon uncaging of neurotransmitter receptor agonists (Furuta et al., 1999) allows stimulation at more sites than presently possible using electrodes, and it is likely that it will be used on cultured networks in conjunction with MEA- or optical recording. Until MEAs with many electrodes are realized, the 60 or so electrodes presently available should be used optimally. It would be helpful if companies that supply MEAs could rapidly fabricate custom electrode geometries to suit the specific needs of each researcher. Such a personalized fabrication service for in vivo silicon probes at the University of Michigan has been quite successful (see footnote 5).

5 http://www.engin.umich.edu/facility/cnct/

Conclusion
The nascent field of population coding in networks of cultured neurons is poised for rapid expansion, thanks to advances in a number of key technologies. Neural cell culture, long-term multi-electrode recording and stimulation, and multi-single-unit optical recording are now accessible to many labs. Recent studies show that these networks exhibit a variety of recurring activity patterns that can be modified by electrical stimulation. Computers are fast and cheap enough to allow real-time spike analysis and stimulus generation, which will make it possible to give cultured networks a simulated body to behave with, and an environment to interact with. By allowing the culture to behave and receive sensory input (even if artificial), meaning can be ascribed to the patterns of electrical activity it produces, and persistent changes in network activity can be thought of as learning. Simultaneous high-resolution timelapse imaging using 2-photon or video microscopy will enable the study of the morphological correlates of this learning. Artificial neural networks, with only a few tens or hundreds of computer-modeled neurons so simple they are usually called 'units', have accomplished many interesting and useful learning, pattern recognition and processing tasks (e.g., Dowla and Rogers, 1996). Thus, I suspect that a network of a few thousand real, living neurons, with all their intracellular complexity and prolific interconnectivity, is capable of quite a bit of distributed information processing.

Abbreviations
animat    simulated animal
CCD       charge-coupled device
MEA       multi-electrode array
SCN       suprachiasmatic nucleus
VSFP      voltage-sensitive fluorescent protein
Acknowledgements
I thank Profs. Scott Fraser and Jerome Pine for their continued support and guidance. I thank Drs. Tom DeMarse and Axel Blau for editorial comments. Our work is supported by NIH grant 1RO1NS38628-01 from the NINDS, and by the Beckman Foundation.

References
Abeles, M., Prut, Y., Bergman, H. and Vaadia, E. (1994) Synchronization in neuronal transmission and its importance for information-processing. Prog. Brain Res., 102: 395-404.
Banker, G. and Goslin, K. (1998) Culturing Nerve Cells. MIT Press, Cambridge, MA, 2nd ed.
Basarsky, T.A., Parpura, V. and Haydon, P.G. (1994) Hippocampal synaptogenesis in cell-culture: developmental timecourse of synapse formation, calcium influx, and synaptic protein distribution. J. Neurosci., 14: 6402-6411.
Bi, G.Q. and Poo, M.M. (1998) Synaptic modifications in cultured hippocampal neurons: Dependence on spike timing, synaptic strength, and postsynaptic cell type. J. Neurosci., 18: 10464-10472.
Blau, A. (1999) Bioelectronical Neuronal Networks. PhD. thesis, University of Tuebingen, Tuebingen.
Blau, A., Ziegler, C., Heyer, M., Endres, F., Schwitzgebel, G., Matthies, T., Stieglitz, T., Meyer, J.U. and Gopel, W. (1997) Characterization and optimization of microelectrode arrays for in vivo nerve signal recording and stimulation. Biosens. Bioelectron., 12: 883-892.
Boppart, S.A., Wheeler, B.C. and Wallace, C.S. (1992) A flexible perforated microelectrode array for extended neural recordings. IEEE Trans. Biomed. Eng., 39: 37-42.
Borkholder, D.A., Bao, J., Maluf, N.I., Perl, E.R. and Kovacs, G.T.A. (1997) Microelectrode arrays for stimulation of neural slice preparations. J. Neurosci. Methods, 77: 61-66.
Borroni, A., Chen, F.M., LeCursi, N., Grover, L.M. and Teyler, T.J. (1991) An integrated multielectrode electrophysiology system. J. Neurosci. Methods, 36: 177-184.
Bullen, A., Patel, S.S. and Saggau, P. (1997) High-speed, random-access fluorescence microscopy, I. High-resolution optical-recording with voltage-sensitive dyes and ion indicators. Biophys. J., 73: 477-491.
Bullen, A. and Saggau, P. (1999) High-speed, random-access fluorescence microscopy, II. Fast quantitative measurements with voltage-sensitive dyes. Biophys. J., 76: 2272-2287.
Canepari, M., Bove, M., Maeda, E., Cappello, M. and Kawana, A. (1997) Experimental analysis of neuronal dynamics in cultured cortical networks and transitions between different patterns of activity. Biol. Cybern., 77: 153-162.
Chien, C.B. and Pine, J. (1991) An apparatus for recording synaptic potentials from neuronal cultures using voltage-sensitive fluorescent dyes. J. Neurosci. Methods, 38: 93-105.
Clark, A. (1997) Being There: Putting Brain, Body, and the World Together Again. MIT Press, Cambridge, MA.
Connolly, P., Clark, P., Curtis, A.S., Dow, J.A. and Wilkinson, C.D. (1990) An extracellular microelectrode array for monitoring electrogenic cells in culture. Biosens. Bioelectron., 5: 223-234.
Corner, M.A. (1994) Reciprocity of structure-function relations in developing neural networks: the odyssey of a self-organizing brain through research fads, fallacies and prospects. Prog. Brain Res., 102: 3-31.
Corner, M.A. and Ramakers, G.J. (1991) Spontaneous bioelectric activity as both dependent and independent variable in cortical maturation. Chronic tetrodotoxin versus picrotoxin effects on spike-train patterns in developing rat neocortex neurons during long-term culture. Ann. N.Y. Acad. Sci., 627: 349-353.
Cotman, C.W., Monaghan, D.T. and Ganong, A.H. (1988) Excitatory amino acid neurotransmission: NMDA receptors and Hebb-type synaptic plasticity. Annu. Rev. Neurosci., 11: 61-80.
Davila, H.V., Salzberg, B.M., Cohen, L.B. and Waggoner, A.S. (1973) A large change in axon fluorescence that provides a promising method for measuring membrane potential. Nature, 241: 159-160.
DeMarse, T.B., Wagenaar, D.A., Blau, A.W. and Potter, S.M. (2000) Neurally-controlled computer-simulated animals: a new tool for studying learning and memory in vitro. Soc. Neurosci. Abstr., 26: 467.20.
Denk, W., Strickler, J.H. and Webb, W.W. (1990) 2-photon laser scanning fluorescence microscopy. Science, 248: 73-76.
Dowla, F.U. and Rogers, L.L. (1996) Solving Problems in Environmental Engineering and Geosciences with Artificial Neural Networks. MIT Press, Cambridge, MA.
Droge, M.H., Gross, G.W., Hightower, M.H. and Czisny, L.E. (1986) Multielectrode analysis of coordinated, multisite, rhythmic bursting in cultured CNS monolayer networks. J. Neurosci., 6: 1583-1592.
Duport, S., Millerin, C., Muller, D. and Correges, P. (1999) A metallic multisite recording system designed for continuous long-term monitoring of electrophysiological activity in slice cultures. Biosens. Bioelectron., 14: 369-376.
Egert, U., Schlosshauer, B., Fennrich, S., Nisch, W., Fejtl, M., Knott, T., Muller, T. and Hammerle, H. (1998) A novel organotypic long-term culture of the rat hippocampus on substrate-integrated multielectrode arrays. Brain Res. Brain Res. Protoc., 2: 229-242.
Eggers, M.D., Astolfi, D.K., Liu, S., Zeuli, H.E., Doeleman, S.S., McKay, R., Khuon, T.S. and Ehrlich, D.J. (1990) Electronically wired petri dish: a microfabricated interface to the biological neuronal network. J. Vac. Sci. Technol., B 8: 1392-1398.
Fayyad, U.M., Piatetsky-Shapiro, G. and Smyth, P. (Eds.) (1996) Advances in Knowledge Discovery and Data Mining. MIT Press, Cambridge, MA.
Fejtl, M., Knott, T., Leibrock, C., Schlosshauer, B., Nisch, W., Egert, U., Muller, T. and Hammerle, H. (1998) Multi-site recording as a new tool to study epileptogenesis in organotypic hippocampal slices. Eur. J. Neurosci., 10: 44.
Friedrich, R.W., Gonzalez, J.E., Potter, S., Chien, C.-B., Tsien, R.Y., Scheel, J. and Laurent, G. (1999) GFP-based optical recording from a C. elegans sensory neuron. Soc. Neurosci. Abstr., 25: 742.
Fromherz, P. and Stett, A. (1995) Silicon-neuron junction: capacitive stimulation of an individual neuron on a silicon chip. Phys. Rev. Lett., 75: 1670-1673.
Frostig, R.D., Frysinger, R.C. and Harper, R.M. (1990) Recurring discharge patterns in multiple spike trains, II. Application in forebrain areas related to cardiac and respiratory control during different sleep-waking states. Biol. Cybern., 62: 495-502.
Furuta, T., Wang, S.S.H., Dantzker, J.L., Dore, T.M., Bybee, W.J., Callaway, E.M., Denk, W. and Tsien, R.Y. (1999) Brominated 7-hydroxycoumarin-4-ylmethyls: photolabile protecting groups with biologically useful cross-sections for two photon photolysis. Proc. Natl. Acad. Sci. USA, 96: 1193-1200.
Gerstner, W., Kreiter, A.K., Markram, H. and Herz, A.V.M. (1997) Neural codes: firing rates and beyond. Proc. Natl. Acad. Sci. USA, 94: 12740-12741.
Gross, G.W., Williams, A.N. and Lucas, J.H. (1982) Recording of spontaneous activity with photoetched microelectrode surfaces from mouse spinal neurons in culture. J. Neurosci. Methods, 5: 13-22.
Gross, G.W. (1979) Simultaneous single unit recording in vitro with a photoetched laser deinsulated gold multimicroelectrode surface. IEEE Trans. Biomed. Eng., 26: 273-279.
Gross, G.W. and Kowalski, J. (1991) Experimental and theoretical analysis of random nerve cell network dynamics. In: P. Antognetti and E.B. Milutinovic (Eds.), Neural Networks: Concepts, Applications, and Implementations. Prentice-Hall, NJ, pp. 47-110.
Gross, G.W., Rhoades, B.K. and Kowalski, J.K. (1993a) Dynamics of burst patterns generated by monolayer networks in culture. In: H.W. Bothe, M. Samii and R. Eckmiller (Eds.), Neurobionics: An Interdisciplinary Approach to Substitute Impaired Functions of the Human Nervous System. North-Holland, Amsterdam, pp. 89-121.
Gross, G.W., Rhoades, B.K., Reust, D.L. and Schwalm, F.U. (1993b) Stimulation of monolayer networks in culture through thin-film indium-tin oxide recording electrodes. J. Neurosci. Methods, 50: 131-143.
Gross, G.W. and Schwalm, F.U. (1994) A closed flow chamber for long-term multichannel recording and optical monitoring. J. Neurosci. Methods, 52: 73-85.
Habets, A., Vandongen, A.M.J., Vanhuizen, F. and Corner, M.A. (1987) Spontaneous neuronal firing patterns in fetal-rat cortical networks during development in vitro: a quantitative analysis. Exp. Brain Res., 69: 43-52.
Heck, D. (1995) Investigating dynamic aspects of brain-function in slice preparations: spatiotemporal stimulus patterns generated with an easy-to-build multielectrode array. J. Neurosci. Methods, 58: 81-87.
Herzog, E.D., Geusz, M.E., Khalsa, S.B.S., Straume, M. and Block, G.D. (1997) Circadian rhythms in mouse suprachiasmatic nucleus explants on multimicroelectrode plates. Brain Res., 757: 285-290.
Herzog, E.D., Takahashi, J.S. and Block, G.D. (1998) Clock controls circadian period in isolated suprachiasmatic nucleus neurons. Nat. Neurosci., 1: 708-713.
Honma, S., Shirakawa, T., Katsuno, Y., Namihira, M. and Honma, K. (1998) Circadian periods of single suprachiasmatic neurons in rats. Neurosci. Lett., 250: 157-160.
Israel, D.A., Barry, W.H., Edell, D.J. and Mark, R.G. (1984) An array of microelectrodes to stimulate and record from cardiac cells in culture. Am. J. Physiol., 247: H669-H674.
Jahnsen, H., Kristensen, B.W., Thiebaud, P., Noraberg, J., Jakobsen, B., Bove, M., Martinoia, S., Koudelka-Hep, M., Grattarola, M. and Zimmer, J. (1999) Coupling of organotypic brain slice cultures to silicon-based arrays of electrodes. Methods, 18: 160-172.
Janossy, V., Toth, A., Bodocs, L., Imrik, P., Madarasz, E. and Gyevai, A. (1990) Multielectrode culture chamber: a device for long-term recording of bioelectric activities in vitro. Acta Biol. Hung., 41: 309-320.
Ji, J. and Wise, K.D. (1992) An implantable CMOS circuit interface for multiplexed microelectrode recording arrays. IEEE J. Solid-State Circuits, 27: 433-443.
Jimbo, Y. and Kawana, A. (1992) Electrical stimulation and recording from cultured neurons using a planar electrode array. Bioelectrochem. Bioenerg., 29: 193-204.
Jimbo, Y., Robinson, H.P.C. and Kawana, A. (1993) Simultaneous measurement of intracellular calcium and electrical activity from patterned neural networks in culture. IEEE Trans. Biomed. Eng., 40: 804-810.
Jimbo, Y., Robinson, H.P.C. and Kawana, A. (1998) Strengthening of synchronized activity by tetanic stimulation in cortical cultures: application of planar electrode arrays. IEEE Trans. Biomed. Eng., 45: 1297-1304.
Jimbo, Y., Tateno, T. and Robinson, H.P.C. (1999) Simultaneous induction of pathway-specific potentiation and depression in networks of cortical neurons. Biophys. J., 76: 670-678.
Kamioka, H., Maeda, E., Jimbo, Y., Robinson, H.P.C. and Kawana, A. (1996) Spontaneous periodic synchronized bursting during formation of mature patterns of connections in cortical cultures. Neurosci. Lett., 206: 109-112.
Kawana, A. (1996) Formation of a simple model brain on microfabricated electrode arrays. In: H.C. Hoch, L.W. Jelinski and H.G. Craighead (Eds.), Nanofabrication and Biosystems. Cambridge University Press, Cambridge, pp. 258-275.
Konno, N., Fukami, T., Shiina, T. and Jimbo, Y. (1998) Estimation of network structure for signal propagations by the analysis of multichannel action potentials in cultured neural networks. Trans. IEE Jpn., 118-C: 999-1006.
Liu, C., Weaver, D.R., Strogatz, S.H. and Reppert, S.M. (1997) Cellular construction of a circadian clock: period determination in the suprachiasmatic nuclei. Cell, 91: 855-860.
Maeda, E., Kuroda, Y., Robinson, H.P.C. and Kawana, A. (1998) Modification of parallel activity elicited by propagating bursts in developing networks of rat cortical neurones. Eur. J. Neurosci., 10: 488-496.
Maeda, E., Robinson, H.P.C. and Kawana, A. (1995) The mechanisms of generation and propagation of synchronized bursting in developing networks of cortical-neurons. J. Neurosci., 15: 6834-6845.
Maher, M.P., Pine, J., Wright, J. and Tai, Y.C. (1999) The neurochip: a new multielectrode device for stimulating and recording from cultured neurons. J. Neurosci. Methods, 87: 45-56.
Markram, H., Gupta, A., Uziel, A., Wang, Y. and Tsodyks, M. (1998) Information processing with frequency-dependent synaptic connections. Neurobiol. Learn. Mem., 70: 101-112.
Marrese, C.A. (1987) Preparation of strongly adherent platinum black coatings. Anal. Chem., 59: 217-218.
Martinoia, S., Bove, M., Carlini, G., Ciccarelli, C., Grattarola, M., Storment, C. and Kovacs, G. (1993) A general-purpose system for long-term recording from a microelectrode array coupled to excitable cells. J. Neurosci. Methods, 48: 115-121.
Meister, M., Pine, J. and Baylor, D.A. (1994) Multi-neuronal signals from the retina: acquisition and analysis. J. Neurosci. Methods, 51: 95-106.
Meyer, J.-A. and Guillot, A. (1994) From SAB90 to SAB94: four years of animat research. In: D. Cliff, P. Husbands, J.-A. Meyer and S.W. Wilson (Eds.), From Animals to Animats 3: Proceedings of the Third International Conference on Simulation of Adaptive Behavior. MIT Press, Cambridge, MA, pp. 2-11.
Meyer, J.A. and Wilson, S.W. (1991) From Animals to Animats: Proceedings of the First International Conference on Simulation of Adaptive Behavior. MIT Press, Cambridge, MA.
Meyer-Franke, A., Kaplan, M.R., Pfrieger, F.W. and Barres, B.A. (1995) Characterization of the signaling interactions that promote the survival and growth of developing retinal ganglion-cells in culture. Neuron, 15: 805-819.
Misgeld, U., Zeilhofer, H.U. and Swandulla, D. (1998) Synaptic modulation of oscillatory activity of hypothalamic neuronal networks in vitro. Cell. Mol. Neurobiol., 18: 29-43.
Nadasdy, Z., Hirase, H., Czurko, A., Csicsvari, J. and Buzsaki, G. (1999) Replay and time compression of recurring spike sequences in the hippocampus. J. Neurosci., 19: 9497-9507.
Nadeau, H., Anderson, D.J. and Lester, H.A. (2000) Long-term, low-level expression of ROMK1 (Kir1.1) causes apoptosis and chronic silencing of hippocampal neurons. J. Neurophysiol., 84: 1062-1075.
Najafi, K. and Wise, K.D. (1986) An implantable multielectrode array with on-chip signal-processing. IEEE J. Solid-State Circuits, 21: 1035-1044.
Nirenberg, S. and Latham, P.E. (1998) Population coding in the retina. Curr. Opin. Neurobiol., 8: 488-493.
Novak, J.L. and Wheeler, B.C. (1986) Recording from the Aplysia abdominal ganglion with a planar microelectrode array. IEEE Trans. Biomed. Eng., 33: 196-202.
Novak, J.L. and Wheeler, B.C. (1988) Multisite hippocampal slice recording and stimulation using a 32 element microelectrode array. J. Neurosci. Methods, 23: 149-159.
Obaid, A.L., Koyano, T., Lindstrom, J., Sakai, T. and Salzberg, B.M. (1999) Spatiotemporal patterns of activity in an intact mammalian network with single-cell resolution: optical studies of nicotinic activity in an enteric plexus. J. Neurosci., 19: 3073-3093.
Offenhausser, A., Sprossler, C., Matsuzawa, M. and Knoll, W. (1997) Field-effect transistor array for monitoring electrical activity from mammalian neurons in culture. Biosens. Bioelectron., 12: 819-826.
Okada, A., Lansford, R., Weimann, J.M., Fraser, S.E. and McConnell, S.K. (1999) Imaging cells in the developing nervous system with retrovirus expressing modified green fluorescent protein. Exp. Neurol., 156: 394-406.
Pancrazio, J.J., Bey, P.P., Loloee, A., Manne, S.R., Chao, H.C., Howard, L.L., Gosney, W.M., Borkholder, D.A., Kovacs, G.T.A., Manos, P., Cuttino, D.S. and Stenger, D.A. (1998) Description and demonstration of a CMOS amplifier-based system with measurement and stimulation capability for bioelectrical signal transduction. Biosens. Bioelectron., 13: 971-979.
Pfrieger, F.W. and Barres, B.A. (1997) Synaptic efficacy enhanced by glial cells in vitro. Science, 277: 1684-1687.
Pine, J. (1980) Recording action potentials from cultured neurons with extracellular microcircuit electrodes. J. Neurosci. Methods, 2: 19-31.
Pine, J. and Potter, S.M. (1997) A high-speed CCD camera for optical recording of neural activity. Soc. Neurosci. Abstr., 23: 259.
Potter, S.M. (1996) Vital imaging: two photons are better than one. Curr. Biol., 6: 1595-1598.
Potter, S.M. (2000) Two-photon microscopy for 4D imaging of living neurons. In: R. Yuste, F. Lanni and A. Konnerth (Eds.), Imaging Neurons: A Laboratory Manual. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, pp. 20.1-20.16.
Potter, S.M., Fraser, S.E. and Pine, J. (1997a) Animat in a petri dish: cultured neural networks for studying neural computation. In: Proceedings of the 4th Joint Symposium on Neural Computation, UCSD, pp. 167-174.
Potter, S.M., Mart, A.N. and Pine, J. (1997b) High-speed CCD movie camera with random pixel selection, for neurobiology research. SPIE Proc., 2869: 243-253.
Potter, S.M., Pine, J. and Fraser, S.E. (1996) Neural transplant staining with DiI and vital imaging by 2-photon laser-scanning microscopy. Scanning Microsc. Suppl., 10: 189-199.
Ramakers, G.J.A., Winter, J., Hoogland, T.M., Lequin, M.B., van Hulten, P., van Pelt, J. and Pool, C.W. (1998) Depolarization stimulates lamellipodia formation and axonal but not dendritic branching in cultured rat cerebral cortex neurons. Dev. Brain Res., 108: 205-216.
Regehr, W.G., Pine, J., Cohan, C.S., Mischke, M.D. and Tank, D.W. (1989) Sealing cultured invertebrate neurons to embedded dish electrodes facilitates long-term stimulation and recording. J. Neurosci. Methods, 30: 91-106.
Rhoades, B.K., Weil, J.C., Kowalski, J.M. and Gross, G.W. (1996) Distribution-free graphical and statistical-analysis of serial dependence in neuronal spike trains. J. Neurosci. Methods, 64: 25-37.
Rieke, F., Warland, D., de Ruyter van Steveninck, R. and Bialek, W. (1997) Spikes: Exploring the Neural Code. MIT Press, Cambridge, MA.
Robinson, H.P.C., Kawahara, M., Jimbo, Y., Torimitsu, K., Kuroda, Y. and Kawana, A. (1993a) Periodic synchronized bursting and intracellular calcium transients elicited by low magnesium in cultured cortical neurons. J. Neurophysiol., 70: 1606-1616.
Robinson, H.P.C., Torimitsu, K., Jimbo, Y., Kuroda, Y. and Kawana, A. (1993b) Periodic bursting of cultured cortical-neurons in low magnesium: cellular and network mechanisms. Jpn. J. Physiol., Suppl. 1, 43: S125-S130.
Rybka, G. (1999) Tools and Techniques for Analyzing Neural Data from Multi-Electrode Arrays. Summer Undergraduate Research Fellowship, California Institute of Technology.
Sakurai, Y. (1996) Population coding by cell assemblies: what it really is in the brain. Neurosci. Res., 26: 1-16.
Siegel, M.S. and Isacoff, E.Y. (1997) A genetically encoded optical probe of membrane voltage. Neuron, 19: 735-741.
Stoppini, L., Duport, S. and Correges, P. (1997) A new extracellular multirecording system for electrophysiological studies: application to hippocampal organotypic cultures. J. Neurosci. Methods, 72: 23-33.
Tateno, T. and Jimbo, Y. (1999) Activity-dependent enhancement in the reliability of correlated spike timings in cultured cortical neurons. Biol. Cybern., 80: 45-55.
Thiebaud, P., Beuret, C., Koudelka-Hep, M., Bove, M., Martinoia, S., Grattarola, M., Jahnsen, H., Rebaudo, R., Balestrino, M., Zimmer, J. and Dupont, Y. (1999) An array of Pt-tip microelectrodes for extracellular monitoring of activity of brain slices. Biosens. Bioelectron., 14: 61-65.
Thiebaud, P., deRooij, N.F., Koudelka-Hep, M. and Stoppini, L. (1997) Microelectrode arrays for electrophysiological monitoring of hippocampal organotypic slice cultures. IEEE Trans. Biomed. Eng., 44: 1159-1163.
Thomas, C.A., Springer, P.A., Loeb, G.E., Berwald-Netter, Y. and Okun, L.M. (1972) A miniature microelectrode array to monitor the bioelectric activity of cultured cells. Exp. Cell Res., 74: 61-66.
Vassanelli, S. and Fromherz, P. (1997) Neurons from rat brain coupled to transistors. Appl. Phys. Mater. Sci. Process., 65: 85-88.
Voigt, T., Baier, H. and Delima, A.D. (1997) Synchronization of neuronal-activity promotes survival of individual rat neocortical neurons in early development. Eur. J. Neurosci., 9: 990-999.
Warland, D.K., Reinagel, P. and Meister, M. (1997) Decoding visual information from a population of retinal ganglion cells. J. Neurophysiol., 78: 2336-2350.
Watanabe, S., Jimbo, Y., Kamioka, H., Kirino, Y. and Kawana, A. (1996) Development of low magnesium-induced spontaneous synchronized bursting and GABAergic modulation in cultured rat neocortical neurons. Neurosci. Lett., 210: 41-44.
Welsh, D.K., Logothetis, D.E., Meister, M. and Reppert, S.M. (1995) Individual neurons dissociated from rat suprachiasmatic nucleus express independently phased circadian firing rhythms. Neuron, 14: 697-706.
Wheeler, B.C. and Novak, J.L. (1986) Current source density estimation using microelectrode array data from the hippocampal slice preparation. IEEE Trans. Biomed. Eng., 33: 1204-1212.
Wheeler, B.C. and Valesano, W.R. (1985) Real-time digital-filter-based data-acquisition system for the detection of neural signals. Med. Biol. Eng. Comput., 23: 243-248.
Wong, R.O.L., Meister, M. and Shatz, C.J. (1993) Transient period of correlated bursting activity during development of the mammalian retina. Neuron, 11: 923-938.
Wu, J.Y., Cohen, L.B. and Falk, C.X. (1994) Neuronal-activity during different behaviors in Aplysia: a distributed organization. Science, 263: 820-823.
Wu, J.Y., Lam, Y.W., Falk, C.X., Cohen, L.B., Fang, J., Loew, L., Prechtl, J.C., Kleinfeld, D. and Tsau, Y. (1998) Voltage-sensitive dyes for monitoring multineuronal activity in the intact central nervous system. Histochem. J., 30: 169-187.
Zhang, L.I., Tao, H.W., Holt, C.E., Harris, W.A. and Poo, M.M. (1998) A critical window for cooperation and competition among developing retinotectal synapses. Nature, 395: 37-44.
M.A.L. Nicolelis (Ed.)
Progress in Brain Research, Vol. 130 © 2001 Elsevier Science B.V. All rights reserved
CHAPTER 5
Long-term chronic multichannel recordings from sensorimotor cortex and thalamus of primates
Neeraj Jain *, Hui-Xin Qi and Jon H. Kaas
Department of Psychology, Vanderbilt University, Nashville, TN 37240, USA
*Corresponding author: Neeraj Jain, Department of Psychology, 301 Wilson Hall, Vanderbilt University, 111, 21st Ave. South, Nashville, TN 37240, USA. Tel.: +1-615-322-7491; Fax: +1-615-343-8449; E-mail: neeraj.jain@vanderbilt.edu

Introduction
The use of chronically implanted microwire microelectrodes offers the opportunity to record from large numbers of single neurons and neuron clusters over long periods of time and from several structures in the sensorimotor system of monkeys. Chronically implanted arrays of microwires have been used to record from neurons in the sensorimotor system of rats (Nicolelis and Chapin, 1994; Nicolelis et al., 1995). More recently these procedures have been applied to the somatosensory system of monkeys with considerable success (Jain et al., 1998c, 1999; Nicolelis et al., 1998). Such recording procedures offer obvious advantages over traditional single-unit recording approaches. Most notably, it is possible to record from a large number of different neurons in different parts of the system at the same time. Thus, it is possible to see how neurons in different parts of the system respond to the same events, and determine how the responses of individual neurons relate to each other. In addition, the chronically implanted electrodes allow recording under a number of experimental conditions, including recordings in awake animals during behavior. Finally, recordings over weeks to months allow studies of how neurons in a system might or might not change in their response characteristics as a result of experience, sensory deprivation, learning, drug treatments, or damage to part of the system. Thus, the multielectrode approach seems to be potentially valuable in studies of both the normal functions of a system, and the plasticity of a system.
Of course, such a multielectrode approach is useful only to the extent that it can be used to reliably address relevant issues, and there are several obvious concerns. Thus, we started to evaluate the approach. We wanted to know if recordings could be maintained from the same neurons or small clusters of neurons for long periods of time. We also wanted to know if it is possible to use the chronically implanted electrodes without significant damage to the surrounding tissue. In addition, to the extent that the recording approach proved to be useful, we wanted to start answering questions about the neuronal functions of the sensorimotor system in monkeys. Specifically, we asked if the responses of different neurons recorded at the same time are correlated with each other during spontaneous and stimulus-evoked activity when these neurons are within a limited region of area 3b of somatosensory cortex, in comparable parts of area 3b of the two hemispheres, in area 3b and the thalamus of the same hemisphere, or in area 3b and primary motor cortex of the same hemisphere. In this paper, we first present results showing that multielectrode recording from monkeys can provide useful, reliable data. We then present our preliminary results on the response characteristics of neurons distributed in the primate somatosensory system.
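Whether two simultaneously recorded neurons fire in a correlated way is commonly examined with a cross-correlogram of their spike times. The Python sketch below illustrates that general technique only; it is not the analysis used in this chapter, and the function names, the 50 ms window and the 5 ms bin width are assumptions chosen for the example. Spike times are taken as integer milliseconds so that the binning is exact.

```python
def cross_correlogram(ref_ms, target_ms, window_ms=50, bin_ms=5):
    """Histogram of spike-time lags (target minus reference) within the window.

    Bin k counts lags from -window_ms + k * bin_ms up to, but not including,
    -window_ms + (k + 1) * bin_ms. Times are integer milliseconds.
    """
    if window_ms % bin_ms != 0:
        raise ValueError("window_ms must be a multiple of bin_ms")
    counts = [0] * (2 * window_ms // bin_ms)
    for t_ref in ref_ms:
        for t in target_ms:
            lag = t - t_ref
            if -window_ms <= lag < window_ms:
                counts[(lag + window_ms) // bin_ms] += 1
    return counts


def bin_start_lags(window_ms=50, bin_ms=5):
    """Lag (in ms) at which each correlogram bin starts."""
    return list(range(-window_ms, window_ms, bin_ms))
```

A peak near zero lag suggests that the two units tend to fire together, whereas a flat correlogram suggests independent firing. Comparing correlograms computed from spontaneous and from stimulus-evoked periods is one way to separate shared stimulus drive from other sources of correlation.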
Suitability of the chronic recording technique
A number of different electrode assemblies have been devised for simultaneous multineuron single-unit recordings from chronically implanted electrodes. These include (1) a regular array of electrodes fabricated from p-type silicon material (Nordhausen et al., 1996) popularly known as the Utah system, which is placed directly on the surface of the brain, (2) a floating or fixed electrode system which is popularly known as the Michigan system (Drake et al., 1988), (3) individual floating wires placed in the region of interest and affixed in place on the dura (Schmidt et al., 1976; Donoghue et al., 1998), and (4) arrays of stainless steel or other microwires that are fixed to the skull (Nicolelis et al., 1993; Jain et al., 1999). In our experiments we have used arrays of the latter type. The arrays are made of low-impedance Teflon-coated stainless-steel microwires (NB Labs, Dallas, TX). For our cortical implants we have used configurations with sixteen electrodes in two rows of eight electrodes each or eight electrodes in a single row. For thalamic implants we used a bundle of sixteen microwires cut to different lengths. The bundle was coated with melted sucrose prior to insertion to provide stiffness for ease of insertion (Chorover and DeLuca, 1972).
In the present experiments, we used New World squirrel monkeys. Squirrel monkey cortex is largely devoid of sulci and nearly all of the sensorimotor cortex is exposed on the surface. This makes it easy to explore and implant the desired somatotopic representation in a cortical area. The method of implantation has been described before (Nicolelis and Chapin, 1994). The arrays were implanted through small craniotomies after reflecting the dura. Before implanting the arrays, the region of interest was mapped to ensure accurate placement of arrays in the desired regions.
The somatosensory areas were mapped by standard multiunit mapping methods using tungsten electrodes (1 MΩ at 1 kHz; Microprobe Inc., Gaithersburg, MD; Jain et al., 1995). The motor areas were mapped by intracortical microstimulation (ICMS) techniques (Preuss et al., 1996). We found that it was helpful to map the brain before inserting
the arrays because there was too much variability between individual monkeys to rely on the stereotaxic coordinates alone. The arrays were lowered to the required depth with the help of a micromanipulator while continuously monitoring the responses by recording from the electrodes. The opening in the skull was sealed and the microconnector fixed to the skull using dental cement. Recordings started after a recovery period of 1 week. During the first week, we felt that the brain might move slightly relative to the electrodes, but that recordings after 1 week would more likely be stable. The spikes were sorted and captured using a 32-channel Multichannel Neuronal Acquisition Processor (MNAP) system (Plexon, Dallas, TX). We isolated and recorded the activity of single units as well as multiunit clusters. Both single units and unit clusters were isolated using two time-amplitude windows. Because the responses of adjacent neurons are typically quite similar, recording from clusters proved to be useful in addressing many questions. The configuration files on every recording day were saved and reused to estimate the number of neurons that could be repetitively isolated. The data were analyzed using the analysis software Stranger (Biographics, Winston-Salem, NC). The data obtained from these procedures allow us to determine the suitability of our system for long term recordings from normal monkeys and monkeys with peripheral or spinal lesions.
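The isolation of units with two time-amplitude windows can be sketched as follows. This is a minimal illustration under stated assumptions, not the MNAP implementation: the window representation `(sample index, minimum, maximum)`, the function names `passes_windows` and `sort_spikes`, and the sample waveforms are all hypothetical and chosen only for the example.

```python
from typing import List, Sequence, Tuple

# A time-amplitude window: (sample index, minimum amplitude, maximum amplitude).
# This representation is an assumption made for illustration only.
Window = Tuple[int, float, float]


def passes_windows(waveform: Sequence[float], windows: Sequence[Window]) -> bool:
    """Return True if the waveform falls inside every window at that window's sample index."""
    for t_index, v_min, v_max in windows:
        if t_index >= len(waveform):
            return False
        if not (v_min <= waveform[t_index] <= v_max):
            return False
    return True


def sort_spikes(waveforms: Sequence[Sequence[float]],
                windows: Sequence[Window]) -> List[int]:
    """Return the indices of the waveforms accepted by all windows."""
    return [i for i, w in enumerate(waveforms) if passes_windows(w, windows)]


# Made-up waveforms (arbitrary units), eight samples each.
unit_like = [0.0, -10.0, -60.0, -20.0, 30.0, 40.0, 10.0, 0.0]
too_small = [0.0, -5.0, -25.0, -10.0, 10.0, 15.0, 5.0, 0.0]
wrong_shape = [0.0, -10.0, -60.0, -50.0, -30.0, -10.0, 0.0, 0.0]

# Two windows: one on the negative trough, one on the later positive phase.
windows = [(2, -80.0, -40.0), (5, 20.0, 60.0)]
accepted = sort_spikes([unit_like, too_small, wrong_shape], windows)
print(accepted)
```

Only the first waveform passes both windows; the second fails on amplitude and the third on shape. Saving the window settings and reapplying them on later days, as with the configuration files described above, amounts to reusing the same `windows` list.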
Stability of receptive fields We implanted electrode arrays in area 3b of both hemispheres in three squirrel monkeys. Area 3b is the primary representation of cutaneous receptors in cortex (Kaas, 1983). We wished to see how normal movements of the brain in the skull affect the stability of the wires over extended recording periods when the electrodes are skull mounted. Considerable stability would be necessary for our planned use of this technique to determine the progression of possible changes in the responses of neurons over a period of months after peripheral or central injuries to the system. A simple measure of determining spatial stability of electrodes is to determine if the recorded receptive fields change in location over time. Receptive fields may shift if the electrode wires tear through the cortex or move deeper
and across cortex. In addition, progressive damage to and gliosis of the cortex surrounding the electrode wires may lead to recordings from neurons progressively farther from the original insertion site, resulting in a shift of the receptive field and weaker responses. Two weeks after the implantation and thereafter for a period lasting more than a year, we mapped the location of the receptive field on the contralateral hand for each of the wire electrodes. The skin of the hand was stimulated with hand-held probes and receptive fields were drawn on an outline diagram of the hand (Merzenich et al., 1978). While the definition of a receptive field is subjective, experienced investigators remapping the receptive field for the same neuron delimit highly overlapping receptive fields. We found that over a period lasting more than a year, the locations of receptive fields for many recording sites did not change significantly. For example, the receptive field for one of the electrodes in area 3b remained confined to the middle and proximal phalanges of digit 3 for a period of 121 days after implantation. The minimal best responsive receptive field did not shift and remained in the region of the distal portion of the proximal phalange of the digit. The weakly responding surround, however, did show a slight variability on the proximal and middle segment of the digit (Fig. 1a). After this period, a restricted lesion of afferents in the spinal cord (Jain et al., 1997) abolished responses to peripheral stimulation. In a second squirrel monkey (Fig. 1b) the receptive fields of neurons recorded on an electrode wire in area 3b remained near the joint between the first and second phalange in digit 3 for more than a year. The recordings indicated that the receptive fields tended to cluster on the middle phalange of the digit after a period of about 2 months instead of the distal portion of the proximal phalange.
Thus, a small change in receptive field location occurred after 2 months, possibly due to a slight change in electrode position. These results suggest that electrodes reliably record from the same neuronal groups over weeks or longer periods of time, but small shifts in location can occur over the long term. Thus the chronically implanted electrodes would be most useful over weeks of time in studies of brain plasticity where only small changes in receptive field location are likely (Merzenich et al., 1983), and over at least
Fig. 1. Receptive fields obtained over long periods of time from the same chronically implanted microwire electrodes. (a) The receptive fields for neurons recorded from one of the microwire electrodes after the array was implanted in area 3b of a squirrel monkey remained on the same digit over a period of 121 days. Minimal receptive field is shown and numbered according to day after implantation. Only minor differences in the locations of the minimal receptive field were observed. The weakly responding surround showed some variation in the location, but it remained on the adjacent phalanges of the same digit. The response was lost after 121 days as a result of an experimental deafferentation. (b) In area 3b of a second monkey, the minimal receptive fields for recording from another single microwire continued to be located in the region of joint between proximal and middle phalange for a period of 13 months. In this case there was a tendency for the receptive field to be located on the ulnar side of middle phalange of digit 3 after 117 days of implantation.
months of time where large changes are possible (Kaas, 1991; Pons et al., 1991; Jain et al., 1997, 1998b).
Stability of recordings over extended periods We also wanted to know if the responses of the same neuron could be recorded over a long period of time. A previous report suggested that the same single unit could be isolated repeatedly for up to 29 days when free-floating microwires were implanted in the cortex of macaque monkeys (Schmidt et al., 1976). We realize that it is not possible to be absolutely certain that the same neuron is isolated unless the recordings are done without interruption. This is, of course, impractical without telemetry. Moreover, shapes of the waveforms can drift (Quirk and Wilson, 1999) and different neurons can have very similar waveforms. However, if a number of parameters of the firing characteristics of a neuron isolated on the same channel are nearly identical over time, it is reasonable to infer that the same neuron is most likely being isolated. For example, in area 3b of a squirrel monkey we obtained recordings over a period of more than 4 months that appeared to be from the same neuron (Fig. 2). Neuron '8a' had similar waveforms, auto-correlograms, peristimulus time histograms, spontaneous firing rates and receptive field locations over this period. Overall, we observed a slow drift in the population of neurons that were isolated over a period of weeks. Many of the neurons appeared to be the same, some new units appeared, while others could no longer be isolated. We are performing a detailed analysis of all these features for all the neurons to determine the stability of recordings.
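One way to make the comparison of firing parameters across days concrete is to score the similarity of binned measures such as the mean waveform or the auto-correlogram. The sketch below uses a Pearson correlation coefficient. The choice of metric, the threshold of 0.9, the function names and the sample values are all assumptions made for illustration; they are not the analysis used in this study.

```python
import math
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("sequences must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


def likely_same_unit(measures_a, measures_b, threshold: float = 0.9) -> bool:
    """True if every paired measure (waveform, correlogram, ...) correlates at or above threshold."""
    return all(pearson(a, b) >= threshold for a, b in zip(measures_a, measures_b))


# Made-up mean waveforms (arbitrary units) from two recording days.
waveform_early = [0.0, -10.0, -60.0, -20.0, 30.0, 40.0, 10.0, 0.0]
waveform_late = [0.0, -12.0, -55.0, -18.0, 28.0, 37.0, 9.0, 0.0]
waveform_other = [0.0, 20.0, 50.0, 10.0, -30.0, -10.0, 0.0, 0.0]

same = likely_same_unit([waveform_early], [waveform_late])
different = likely_same_unit([waveform_early], [waveform_other])
print(same, different)
```

A high score on a single measure is weak evidence, because different neurons can have very similar waveforms; requiring agreement on several measures at once, as `likely_same_unit` does when given more than one pair, is closer to the reasoning in the text.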
State of the cortex after extended periods of implantation We also determined histological appearance of the cortex in the region of the microwires. All the squirrel monkeys that have been implanted are still being recorded. In an earlier set of experiments (Nicolelis et al., 1998), posterior parietal cortex of an owl monkey was implanted and recordings were carried out over a period of 18 months. After this period the monkey was perfused transcardially with
paraformaldehyde, the brain was removed and sectioned on a freezing microtome into 50 μm sections. The sections were stained for Nissl substance (Fig. 3). An examination of the tissue showed that even after 18 months of chronic recording, the state of the tissue was healthy and the damage did not extend beyond the immediate vicinity of the electrode tracks. In another monkey we stained a series of sections for the glial marker glial fibrillary acidic protein (GFAP). As expected there was gliosis surrounding the region of the electrode location. However, the dark reaction product did not extend more than 30 μm from the track of individual wires, although scattered gliosis was observed up to a distance of 300 μm. We conclude that the microwires can be in place for months without seriously compromising the local circuitry of the surrounding tissue (Schmidt et al., 1976; Agnew et al., 1986).
Recordings from neurons in area 3b In three squirrel monkeys, we implanted microwires bilaterally in the region of area 3b that represents the digits. In addition to determining the stability of receptive fields (see above), we recorded responses of neurons to peripheral stimulation in anesthetized and awake monkeys. The stimulation was provided by a computer-controlled Chubbuck type stimulator (see Sur et al., 1984) tipped with a camel hair brush with a tip diameter of about 0.5 mm. The hair brush stimulator was more effective than a solid probe in preventing the spread of stimulation in hard tissue due to vibration or movement of the hand. The entire stimulator assembly was on a table separate from the one on which the monkey was lying. A stimulator on the same table as the monkey could transmit vibration through the table, leading to spurious responses.
Cross-correlation of spontaneous activity across the two hemispheres Much effort in computational neuroscience has been devoted to determining if and how the activities of neurons across large distances in the brain are modulated together to generate a neural code for coherent sensory perception. The problem of how neurons responding to different aspects of the same stimulus
[Fig. 2 panel labels: (a) 40 Days; (b) 74 Days; (c) 125 Days. Horizontal axes: Time (msec).]
Fig. 2. Waveforms (left column), auto-correlograms (middle column) and peristimulus time histograms (right column) of neuron '8a' (a) 40, (b) 77 and (c) 125 days after chronic implantation of a microwire array in area 3b. Similarities in these three parameters, in addition to the receptive field locations and firing frequencies, suggest that the same neuron was isolated and recorded over this period of about 3 months. After this period the monkey underwent an experimental deafferentation and we could no longer record from this neuron.
object together code that object has been termed the binding problem in the visual system (Singer and Gray, 1995; von der Malsburg, 1995). It has been proposed that the neurons dynamically form functional ensembles that are linked together by synchronized activity (Engel et al., 1991b). Synchronized activity of neurons has been found in visual, motor and somatosensory cortex (Konig and Engel, 1995; Usrey and Reid, 1999). In the motor cortex, units were found to have cross-correlated activities during the period of oscillations in the local field potential (LFP) (Murthy and Fetz, 1996b). In the motor
cortex, the cross-correlated activity depends on the extent to which neurons are behaviorally related, the correlation being greater if the neurons are united by their involvement in the same task. Therefore, such correlations have been related to focusing of attention on a task (Murthy and Fetz, 1996b). However, the cross-correlation of spontaneous activity of neurons in the motor cortex was found to greatly depend on the distance between neurons (Lee et al., 1998), suggesting an important role for the short horizontal connections in cortex. Cross-correlations have also been found between pairs of neurons in
Fig. 3. A Nissl-stained section from the brain of an owl monkey 18 months after chronic implantation of a microwire array in the posterior parietal cortex. The arrow points to tracks left by the electrodes. Note the healthy state of the tissue despite the extended period with implants. Scale bar, 1 mm.
the posterior parietal cortex and somatosensory cortex (Lee et al., 1998). A synchronized oscillatory behavior was seen in units isolated at multiple levels of the somatosensory system in rats. These oscillations were related to the onset of whisker twitching, a behavior that predicts the beginning of exploratory behavior and perhaps attention (Nicolelis et al., 1995). We found that neurons in the hand region of area 3b of squirrel monkeys show cross-correlations (Perkel et al., 1967) in their spontaneous firing (Fig. 4a). The electrodes implanted in the hand region spanned an area of about 2 mm mediolaterally. While most of the recordings were from anesthetized monkeys, the cross-correlated activity did not depend on the anesthesia, since it was also seen in recordings from awake freely behaving monkeys (Fig. 4d). There is additional evidence from somatosensory cortex that the oscillations in the neuronal activity do not depend on the level of anesthesia (Ahissar and Vaadia, 1990) and therefore attention might not play a role in these kinds of cross-correlations. We have not yet determined how these cross-correlations are modulated if the monkeys are attending to a task. However, in area S2 of the somatosensory cortex attention to a task affects the degree of synchrony between neuron pairs (Steinmetz et al., 2000; see however Ahissar and Vaadia, 1990). Interestingly, the activities of neurons were highly cross-correlated in matched parts of area 3b of two hemispheres (Fig. 4a) as well as in primary motor
cortex M1 of the two hemispheres. Similar interhemispheric cross-correlation in the neuronal activity has been previously described in the motor cortex of monkeys (Murthy and Fetz, 1996a,b) and in the visual cortex of cats (Engel et al., 1991a). In cats, the interhemispheric synchronization was eliminated after section of the corpus callosum, and it did not depend on an overlap of the receptive fields or a matching of orientation preferences. Similar results have been obtained from recordings of single and multiunit activity along the area 17-18 border of cats (Nowak et al., 1995). In our experiments, we did not classify neurons by submodality. However, cross-correlations were seen even if neurons had non-overlapping receptive fields. We do not know to what extent such inter- and intrahemispheric cross-correlations depend on the distances between neurons within an area because all of our electrodes were in the hand region of area 3b. We did not, however, see any cross-correlated activity between neurons in area 3b and area 4 of the same hemisphere. We have not yet investigated if the interhemispheric cross-correlations depend on the callosal transfer of information or are due to shared subcortical inputs. The interhemispheric cross-correlations were abolished if receptive fields on the hand of one side were stimulated (Fig. 4b and c). Neurons in area 3b contralateral to the stimulated hand showed cross-correlations that were entrained to the stimulus, as seen in cat S1 (Johnson and Alloway, 1995). Interestingly, the neurons in area 3b ipsilateral to the hand being stimulated continued to show intrahemispheric cross-correlation that was independent of the stimulus. Therefore, the neurons in each hemisphere show intrahemispheric cross-correlations while interhemispheric cross-correlation was abolished by unilateral stimulation.
It is possible that intrahemispheric cross-correlations help form a coherent percept across inputs via discontinuous receptor sheets in the periphery such as the skin of the fingers scanning the object, and across the discontinuities in the central representations of the digits in area 3b (Jain et al., 1998a). While we did not stimulate both hands at once, cross-correlations between the two hemispheres under such circumstances might play a role in bimanual exploratory behavior and motor coordination.
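The distinction drawn above between cross-correlations that are entrained to the stimulus and those that are independent of it can be illustrated with a cross-correlogram and a shuffled control. The sketch below is an illustration under stated assumptions, not the analysis performed with Stranger: the shuffling procedure is not described in the text, so it is approximated here by circularly shifting one spike train by whole stimulus periods (a shift predictor). The function names, the spike times and the 5 s recording duration are hypothetical.

```python
from typing import List, Sequence


def cross_correlogram(ref: Sequence[int], target: Sequence[int],
                      max_lag_ms: int, bin_ms: int) -> List[int]:
    """Histogram of (target - reference) spike-time differences within +/- max_lag_ms."""
    n_bins = (2 * max_lag_ms) // bin_ms
    counts = [0] * n_bins
    for r in ref:
        for t in target:
            lag = t - r
            if -max_lag_ms <= lag < max_lag_ms:
                counts[(lag + max_lag_ms) // bin_ms] += 1
    return counts


def shift_predictor(ref: Sequence[int], target: Sequence[int], period_ms: int,
                    n_shifts: int, duration_ms: int,
                    max_lag_ms: int, bin_ms: int) -> List[float]:
    """Mean correlogram after circularly shifting the target by 1..n_shifts stimulus periods."""
    n_bins = (2 * max_lag_ms) // bin_ms
    total = [0.0] * n_bins
    for k in range(1, n_shifts + 1):
        shifted = [(t + k * period_ms) % duration_ms for t in target]
        cc = cross_correlogram(ref, shifted, max_lag_ms, bin_ms)
        total = [a + b for a, b in zip(total, cc)]
    return [x / n_shifts for x in total]


# Hypothetical spike times in ms: 5 s of recording, stimulus every 1000 ms (1 Hz).
PERIOD, DURATION, MAX_LAG, BIN = 1000, 5000, 10, 1

# Pair 1: synchrony unrelated to the stimulus (target fires 2 ms after the reference).
ref_spont = [130, 1420, 2710, 3250, 4890]
tgt_spont = [t + 2 for t in ref_spont]
raw_spont = cross_correlogram(ref_spont, tgt_spont, MAX_LAG, BIN)
pred_spont = shift_predictor(ref_spont, tgt_spont, PERIOD, 4, DURATION, MAX_LAG, BIN)

# Pair 2: both neurons fire at a fixed latency after every stimulus.
ref_stim = [20 + PERIOD * i for i in range(5)]
tgt_stim = [25 + PERIOD * i for i in range(5)]
raw_stim = cross_correlogram(ref_stim, tgt_stim, MAX_LAG, BIN)
pred_stim = shift_predictor(ref_stim, tgt_stim, PERIOD, 4, DURATION, MAX_LAG, BIN)

print(raw_spont[12], pred_spont[12], raw_stim[15], pred_stim[15])
```

In the first pair the peak in the raw correlogram disappears in the shifted control, which is the signature of correlation that does not follow the stimulus. In the second pair the control reproduces the peak, which is what is expected when the correlation is entrained to the stimulus.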
[Fig. 4 panel labels: (a) Spontaneous Activity; (b) Stimulation on right D4; (c) Stimulation on right D4; (d) Awake, Spontaneous Activity. Columns: Left Hemisphere, Right Hemisphere. Units: DSP07b, DSP13, DSP23, DSP24. Horizontal axes: Time (msec), -200 to 200.]
Fig. 4. Cross-correlations between the activity of neurons in area 3b of a squirrel monkey. (a) Inter- and intra-hemispheric cross-correlation of spontaneous activity of four neurons, two neurons from area 3b of each hemisphere. The monkey was anesthetized with ketamine. The reference neuron (not shown) is from the left hemisphere. The lower line in each graph is a shuffled correlogram around a point event at 1 Hz. (b, c) Cross-correlations between the same neurons shown in a when the middle phalange of digit 4 of the right hand was stimulated with a Chubbuck type stimulator at 1 Hz (20 ms on). In b the reference neuron is in the left hemisphere. Note that firing of neurons in the left hemisphere is cross-correlated, and that this cross-correlation is related to the stimulus (lower lines in the graphs are fifty times shuffled cross-correlograms). However, interhemispheric cross-correlation is abolished. In c we show the same data as in b except that the reference neuron is in the right hemisphere. Note that the neurons of the right hemisphere, which do not receive direct stimulus inputs, continue to show intrahemispheric cross-correlation that is unrelated to the stimulus. (d) Inter- and intra-hemispheric cross-correlations between the same neurons shown above when the monkey was awake and freely behaving. The neurons in the left hemisphere show intrahemispheric cross-correlation but there is no cross-correlation between the two hemispheres. Neurons in the right hemisphere similarly show intrahemispheric cross-correlation (not shown). In a freely behaving monkey each hand is receiving independent stimuli and thus resembles the condition shown in b and c.
Thalamic implants and corticothalamic interactions In one squirrel monkey, we implanted electrodes in the ventroposterior (VP) nucleus of the thalamus and ipsilateral area 3b. The thalamic electrode array was a 16-wire bundle. The electrodes recorded from a region of the VP nucleus that spanned across hand
and foot representations mediolaterally and from the deepest part of VPL (ventroposterior lateral nucleus) to the dorsally capping VPS (ventroposterior superior nucleus). The receptive fields ranged from the hand to the foot and from tips of toes in ventral VPL to movements of the leg in VPS (Kaas et al., 1984). Stimulation of the receptive fields led to the expected response in the VPL and cortex
[Fig. 5 panel labels: Area 3b (Left Side); Ventroposterior Nucleus (Left Side). Horizontal axes: Time (ms), -400 to 400.]
Fig. 5. Cross-correlation between the activity of neurons in VPL and ipsilateral area 3b. Proximal portion of thenar pad of the right hand was stimulated at 1 Hz using a camel hair (20 ms on). The array in the cortex was in the hand region of area 3b, while the array in the thalamus covered both forelimb and hindlimb representations. The reference neuron is 15a in area 3b. Cross-correlations are seen with neurons that had receptive fields on the hand but not if the receptive fields were on the foot (e.g. neurons 20b, 22b and 22c; neuronal cluster 22a had a very weak response to the stimulation of the hand in addition to a response to the stimulation of the foot). The numbers referring to a channel indicate the electrode wire designation, while the letters indicate the separate neuron or neuronal clusters that were isolated on the same electrode.
with the expected latencies. An examination of the cross-correlations of activity between the thalamic and cortical neurons showed that the neurons with
overlapping receptive fields on the hand showed correlated activity in response to the stimulation of the hand (Fig. 5). The responses of thalamic neurons
with receptive fields on the foot were not correlated with those with receptive fields on the hand when the contralateral hand was stimulated. As expected from the data shown in Fig. 4, neurons in the cortex showed cross-correlations in spontaneous activities. In addition, thalamic neurons with receptive fields on the foot showed cross-correlated spontaneous firing but not with thalamic neurons with receptive fields on the hand. Thus, the neurons in VP related to the hand seem to be connectionally independent from those related to the foot.
References Agnew, W.F., Yuen, T.G., McCreery, D.B. and Bullara, L.A. (1986) Histopathologic evaluation of prolonged intracortical electrical stimulation. Exp. Neurol., 92: 162-185. Ahissar, E. and Vaadia, E. (1990) Oscillatory activity of single units in a somatosensory cortex of an awake monkey and their possible role in texture analysis. Proc. Natl. Acad. Sci. USA, 87: 8935-8939. Chorover, S.L. and DeLuca, A.M. (1972) A sweet new multiple electrode for chronic single unit recording in moving animals. Physiol. Behav., 9: 671-674. Donoghue, J.P., Sanes, J.N., Hatsopoulos, N.G. and Gaal, G. (1998) Neural discharge and local field potential oscillations in primate motor cortex during voluntary movements. J. Neurophysiol., 79: 159-173. Drake, K.L., Wise, K.D., Farraye, J., Anderson, D.J. and BeMent, S.L. (1988) Performance of planar multisite microprobes in recording extracellular single-unit intracortical activity. IEEE Trans. Biomed. Eng., 35: 719-732. Engel, A.K., Konig, P., Kreiter, A.K. and Singer, W. (1991a) Interhemispheric synchronization of oscillatory neuronal responses in cat visual cortex. Science, 252:1177-1179. Engel, A.K., Konig, P. and Singer, W. (1991b) Direct physiological evidence for scene segmentation by temporal coding. Proc. Natl. Acad. Sci. USA, 88: 9136-9140. Jain, N., Catania, K.C. and Kaas, J.H. (1997) Deactivation and reactivation of somatosensory cortex after dorsal spinal injury. Nature, 386: 495-498. Jain, N., Catania, K.C. and Kaas, J.H. (1998a) A histologically visible representation of the fingers and palm in primate area 3b and its immutability following long term deafferentation. Cereb. Cortex, 8: 227-236. Jain, N., Florence, S.L. and Kaas, J.H. (1995) Limits on plasticity in somatosensory cortex of adult rats: hindlimb cortex is not reactivated after dorsal column section. J. Neurophysiol., 73: 1537-1546. Jain, N., Florence, S.L. and Kaas, J.H. (1998b) Reorganization of somatosensory cortex after nerve and spinal cord injury. 
News Physiol. Sci., 13: 143-149. Jain, N., Qi, H.-X. and Kaas, J.H. (1999) Chronic multielectrode recordings reveal changes in interhemispheric interactions in
the somatosensory cortex following dorsal column lesions in adult monkeys. Soc. Neurosci., 25: 1683. Jain, N., Qi, H.-X., Strata, F., Schuller, M., Nicolelis, M.A.L. and Kaas, J.H. (1998c) Correlated action potentials between large numbers of single neurons and neuron clusters recorded simultaneously with chronically implanted microwires in somatosensory cortex of monkeys. Soc. Neurosci., 24: 134. Johnson, M.J. and Alloway, K.D. (1995) Evidence for synchronous activation of neurons located in different layers of primary somatosensory cortex. Somatosens. Mot. Res., 12: 235-247. Kaas, J.H. (1983) What if anything is SI? The organization of the 'first somatosensory area' of cortex. Physiol. Rev., 63: 206-231. Kaas, J.H. (1991) Plasticity of sensory and motor maps in adult mammals. Annu. Rev. Neurosci., 14: 137-167. Kaas, J.H., Nelson, R.J., Sur, M., Dykes, R.W. and Merzenich, M.M. (1984) The somatotopic organization of the ventroposterior thalamus of the squirrel monkey Saimiri sciureus. J. Comp. Neurol., 226: 111-140. Konig, P. and Engel, A.K. (1995) Correlated firing in sensory-motor systems. Curr. Opin. Neurobiol., 5: 511-519. Lee, D., Port, N.L., Kruse, W. and Georgopoulos, A.P. (1998) Variability and correlated noise in the discharge of neurons in motor and parietal areas of the primate cortex. J. Neurosci., 18: 1161-1170. Merzenich, M.M., Kaas, J.H., Sur, M. and Lin, C.S. (1978) Double representation of the body surface within cytoarchitectonic areas 3b and 1 in 'SI' in the owl monkey (Aotus trivirgatus). J. Comp. Neurol., 181: 41-73. Merzenich, M.M., Kaas, J.H., Wall, J., Nelson, R.J., Sur, M. and Felleman, D. (1983) Topographic reorganization of somatosensory cortical areas 3B and 1 in adult monkeys following restricted deafferentation. Neuroscience, 8: 33-55. Murthy, V.N. and Fetz, E.E. (1996a) Oscillatory activity in sensorimotor cortex of awake monkeys: synchronization of local field potentials and relation to behavior. J. Neurophysiol., 76: 3949-3967.
Murthy, V.N. and Fetz, E.E. (1996b) Synchronization of neurons during local field potential oscillations in sensorimotor cortex of awake monkeys. J. Neurophysiol., 76: 3968-3982. Nicolelis, M.A., Baccala, L.A., Lin, R.C. and Chapin, J.K. (1995) Sensorimotor encoding by synchronous neural ensemble activity at multiple levels of the somatosensory system. Science, 268: 1353-1358. Nicolelis, M.A., Ghazanfar, A.A., Stambaugh, C.R., Oliveira, L.M., Laubach, M., Chapin, J.K., Nelson, R.J. and Kaas, J.H. (1998) Simultaneous encoding of tactile information by three primate cortical areas. Nat. Neurosci., 1: 621-630. Nicolelis, M.A.L. and Chapin, J.K. (1994) Spatiotemporal structure of somatosensory responses of many-neuron ensembles in the rat ventral posterior medial nucleus of thalamus. J. Neurosci., 14: 3511-3532. Nicolelis, M.A.L., Lin, R.C.S., Woodward, D.J. and Chapin, J.K. (1993) Induction of immediate spatiotemporal changes in thalamic networks by peripheral block of ascending cutaneous information. Nature, 361: 533-536.
Nordhausen, C.T., Maynard, E.M. and Normann, R.A. (1996) Single unit recording capabilities of a 100 microelectrode array. Brain Res., 726: 129-140. Nowak, L.G., Munk, M.H., Nelson, J.I., James, A.C. and Bullier, J. (1995) Structural basis of cortical synchronization. I. Three types of interhemispheric coupling. J. Neurophysiol., 74: 2379-2400. Perkel, D.H., Gerstein, G.L. and Moore, G.P. (1967) Neuronal spike trains and stochastic point processes. II. Simultaneous spike trains. Biophys. J., 7: 419-440. Pons, T.P., Garraghty, P.E., Ommaya, A.K., Kaas, J.H., Taub, E. and Mishkin, M. (1991) Massive cortical reorganization after sensory deafferentation in adult macaques. Science, 252: 1857-1860. Preuss, T.M., Stepniewska, I. and Kaas, J.H. (1996) Movement representation in the dorsal and ventral premotor areas of owl monkeys: a microstimulation study [published erratum appears in J. Comp. Neurol. 1997 Jan 27; 377(4): 611]. J. Comp. Neurol., 371: 649-676. Quirk, M.C. and Wilson, M.A. (1999) Interaction between spike
waveform classification and temporal sequence detection. J. Neurosci. Methods, 94: 41-52. Schmidt, E.M., Bak, M.J. and McIntosh, J.S. (1976) Long-term chronic recording from cortical neurons. Exp. Neurol., 52: 496-506. Singer, W. and Gray, C.M. (1995) Visual feature integration and the temporal correlation hypothesis. Annu. Rev. Neurosci., 18: 555-586. Steinmetz, P.N., Roy, A., Fitzgerald, P.J., Hsiao, S.S., Johnson, K.O. and Niebur, E. (2000) Attention modulates synchronized neuronal firing in primate somatosensory cortex. Nature, 404: 187-190. Sur, M., Wall, J.T. and Kaas, J.H. (1984) Modular distribution of neurons with slowly adapting and rapidly adapting responses in area 3b of somatosensory cortex in monkeys. J. Neurophysiol., 51: 724-744. Usrey, W.M. and Reid, R.C. (1999) Synchronous activity in the visual system. Annu. Rev. Physiol., 61: 435-456. Von der Malsburg, C. (1995) Binding in models of perception and brain function. Curr. Opin. Neurobiol., 5: 520-526.
M.A.L. Nicolelis (Ed.)
Progress in Brain Research, Vol. 130 © 2001 Elsevier Science B.V. All rights reserved
CHAPTER 6
Temporal and spatial coding in the rat vibrissal system Ehud Ahissar 1,* and Miriam Zacksenhouse 2 1 Department of Neurobiology, The Weizmann Institute of Science, Rehovot 76100, Israel 2 Faculty of Mechanical Engineering, Technion - Israel Institute of Technology, Haifa 32000, Israel
Peripheral encoding In light of the spatio-temporal nature of the world around us, it is not surprising that sensory encoding is based on spatial and temporal cues that encode static and dynamic events. Spatial encoding of the world is accomplished by arrays of receptors, which are spatially organized across the sensory organs. Each receptor is most sensitive to a specific and limited range within the sensation spectrum. For example, each hair cell in the cochlea is most sensitive to a limited range of acoustic frequencies, which is mapped to a specific location along the cochlea. Each photoreceptor in the retina is sensitive to a limited range of the visual field, and the receptive field (RF) of each mechanoreceptor on the skin is restricted to a limited area of the skin. Temporal encoding is attained by the temporal pattern of receptor firing. Interestingly, temporal encoding is not limited to the dynamic aspects of the stimulus, and can also be used for encoding stationary aspects. For the latter, the stationary stimulus is modulated by the sensing organ, as, for example, in active touch. Primates explore the texture of objects by moving their fingers across it, and rodents scan their environment by moving their whiskers. Changes in pressure, caused by the presence of an object, or by the ridges and grooves across its surface, are detected by the mechanoreceptors. The
* Corresponding author: Ehud Ahissar, Department of Neurobiology, The Weizmann Institute of Science, Rehovot 76100, Israel. Tel.: +972-8-934-3748; Fax: +972-8-934-4140; E-mail:
[email protected]
temporal intervals between such changes, detected by the same or different receptors, represent spatial distances (dx = v(t)dt; where dx is a spatial interval, v(t) is the scanning velocity and dt is a temporal interval). Similarly, fixational eye movements might be used to scan objects visually, with changes in illumination across the surface being detected by the photoreceptors. As in the tactile case, the temporal intervals between such changes represent spatial distances (Ahissar and Arieli, 1997). Most receptors respond mainly (or only) to changes in the sensed signal. Thus, the endogenous movements of the sensory organs may have evolved to provide sufficient changes in the sensed stimulus, even when the environment is stationary. Active sensing is significantly more advantageous than passive sensing, which relies heavily on external changes. Furthermore, active sensing facilitates 'hyper-acuity', that is, higher resolution than that allowed by receptor size or spacing. Hyper-acuity is enabled by the scanning movements, which cover the space within the receptive field of a single receptor, and between adjacent receptors. The encoding here is temporal: relative distances are represented by temporal intervals between the respective responses.
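The relation dx = v(t)dt can be made concrete with a short numerical sketch that converts a temporal interval between two receptor responses into the spatial distance it represents. The scanning velocities and intervals below are made-up values chosen for easy arithmetic, not measurements, and the function name `spatial_interval` is hypothetical.

```python
from typing import Callable


def spatial_interval(velocity: Callable[[float], float],
                     t_start: float, t_end: float, n_steps: int = 1000) -> float:
    """Integrate the scanning velocity v(t) from t_start to t_end (midpoint rule).

    With velocity in mm/s and time in s, the result is a distance in mm.
    """
    dt = (t_end - t_start) / n_steps
    return sum(velocity(t_start + (i + 0.5) * dt) for i in range(n_steps)) * dt


# Assumed constant scanning velocity of 40 mm/s: two responses 25 ms apart
# then correspond to surface features about 1 mm apart.
d_constant = spatial_interval(lambda t: 40.0, 0.0, 0.025)

# Assumed accelerating scan, v(t) = 80 t mm/s: the first 50 ms cover about 0.1 mm.
d_accelerating = spatial_interval(lambda t: 80.0 * t, 0.0, 0.05)

print(round(d_constant, 6), round(d_accelerating, 6))
```

The same temporal interval maps onto different distances when the scanning velocity differs, so a decoder of this kind needs access to v(t) as well as to the response times.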
Vibrissal encoding of object location The rat vibrissal system provides a clear example of dissociated coding schemes. The mystacial pad contains about 35 large whiskers, which are arranged in 5 rows and about 7 arcs (columns) with different lengths (Fig. 1). During whisking, the whiskers move back and forth with a rhythmic motion at 4-10 Hz (Welker, 1964; Carvell and Simons, 1990; Fanselow
Fig. 1. Spatial organization of the mystacial vibrissae of the rat. Blood sinuses surrounding the whiskers' roots were visualized using xylene (see Haidarliu and Ahissar, 1997). A-E, rows; 1-7, arcs; α-δ, straddlers; NV, nasal vibrissae; NS, nostril; FBP, furry buccal pad. Courtesy of S. Haidarliu.
and Nicolelis, 1999; Kleinfeld et al., 1999), thus covering several cubic centimeters near the snout (Wineski, 1983; Brecht et al., 1997). Each whisker along the same arc scans a different trajectory while all the whiskers of the same row scan roughly the same trajectory (Brecht et al., 1997). Thus, spatial information along the vertical axis can be obtained by comparing the activity of individual whiskers along the same arc. The extraction of spatial information about the location of the object in the horizontal plane is more complicated. Whisking facilitates continuous scanning of the environment along each row of whiskers (Brecht et al., 1997). The horizontal location of an object is encoded by the temporal interval between whisker firing at protraction onset and whisker firing when it hits the object (Fig. 2). Thus, although whisking facilitates hyper-acuity, it has a price. Encoding takes time, and furthermore, the simple coding provided by labeled lines, that is, 'tell me which whisker fires¹ and I'll tell you the location of the object',
¹ The term 'whisker firing' is used here to denote firing of the whisker's receptors.
Fig. 2. Temporal encoding during whisking. Only a single whisker and representative brief bursts are depicted for simplicity. Bursts of spikes are generated by the whisker's receptors at the beginning of each protraction cycle, due to velocity-threshold crossing (Zucker and Welker, 1969; Brown and Waite, 1974; Shipley, 1974; Nicolelis et al., 1995). The length of this burst probably depends on the profile of whisker movement. Another burst is generated by the same or other receptors at the whisker's root, when the whisker hits an external object (Zucker and Welker, 1969; Brown and Waite, 1974). The temporal interval between the onsets of the two bursts encodes the spatial interval between protraction onset and object location. A more forward location (black ellipsoid) will produce a longer interval. These temporal intervals have to be decoded to represent horizontal object location.
is not sufficient. The brain has to know when the whisker fired in order to decode the location of the object. Thus, determination of the horizontal location (relative to whisker position at protraction onset) requires temporal processing of the pattern of activity of the whiskers. Similar temporal information is contained in the firing patterns of all whiskers that touch an object during whisking. In fact, horizontal object location can be extracted by processing the pattern of activity of individual whiskers. Further information can be obtained later by lateral comparisons along the row. For example, the radial distance of an object from the snout might be computed by triangulation involving the angles that the different whiskers along the row make when hitting the object. Alternatively, decoding might be inherently spatio-temporal, and take into consideration the activity of all the whiskers along the same row simultaneously. In the latter case, both signal to noise ratio and decoding efficiency would probably be improved. In this review we do not attempt to distinguish between these two alternatives, assuming that both should obey similar principles of temporal decoding.
Parallel afferent pathways Vibrissal information is conveyed to the somatosensory ('barrel') cortex via two parallel pathways, the lemniscal and paralemniscal (Bishop, 1959; Diamond and Armstrong-James, 1992; Woolsey, 1997). The lemniscal pathway, which ascends via the ventral posterior medial nucleus (VPM) of the thalamus, contains large-diameter axons (Bishop, 1959) that end in focal, clustered terminals (Williams et al., 1994). Lemniscal neurons respond at short latency and have relatively small RF centers (Diamond, 1995). The paralemniscal pathway, which ascends via the medial division of the posterior nucleus (POm) of the thalamus, contains smaller-diameter axons that form widespread terminal patterns (Williams et al., 1994). Paralemniscal neurons exhibit slower responses and larger RF centers (Diamond, 1995). These two pathways form two distinct thalamocortical loops. The VPM projects to the barrels in layer 4 and to layers 5b and 6a, and receives feedback from layers 5b and 6a (Chmielowska et al., 1989; Lu and Lin, 1993; Bourassa et al., 1995). The POm projects to layers 1 and 5a and to the inter-barrel septa in layer 4 (Koralek et al., 1988; Lu and Lin, 1993; Diamond, 1995) and receives feedback from layers 5 and 6b (Lu and Lin, 1993; Bourassa et al., 1995). The existence of two nearly parallel pathways to the cortex has been puzzling. One possibility, which relies on the strong cortico-POm connections (Hoogland et al., 1988; Diamond et al., 1992), suggests that the POm implements an integrative cortico-thalamo-cortical loop, and does not process vibrissal information directly (Hoogland et al., 1988; Diamond, 1995). However, such an interpretation leaves the ascending sensory connections to the POm unexplained. An alternative possibility is that the paralemniscal pathway processes sensory information that is different from the information processed by the lemniscal pathway, and whose processing requires strong cortical feedback and longer delays.
Recently, we provided evidence that is consistent with the latter scheme. In anesthetized rats, we showed that the temporal frequency of whisker movements is represented differently in the two pathways: by response amplitude (and, as a result, spike count) in the lemniscal, and by latency (and, as a
result, spike count) in the paralemniscal pathway (Ahissar et al., 1998). These internal representations are first expressed in the thalamus (Sosnik et al., 2001), and are preserved in the corresponding cortical domains (Ahissar et al., 2001). Representation-wise, the thalamo-cortical coupling is more significant in this system than the columnar affiliation in the cortex. This was evident during vertical penetrations into the cortex. When recording from the lemniscal layers, 4 and 5b, the response latency was usually constant for stimulation frequencies between 2 and 11 Hz. However, when recording from the paralemniscal layer 5a, a robust latency representation of the whisker frequency emerged. Thus, although the two different thalamo-cortical systems share the same cortical 'columns', they utilize different coding schemes to represent the whisker frequency. So far, these observations have only been obtained in anesthetized animals. However, in anesthetized animals, the physiological conditions of the neurons are affected by the anesthesia, and the motor-sensory loop is 'open', that is, sensory acquisition does not depend on the motor system, as it does during whisking. Recent studies demonstrated that the effect of both these factors on response latency is small, on the order of a few milliseconds (Simons et al., 1992; Fanselow and Nicolelis, 1999; Friedberg et al., 1999). Thus, the effects of anesthesia on latency are much smaller than the tens of milliseconds latency variations observed in the paralemniscal pathway. Consequently, sensory neurons would not be expected to behave qualitatively differently just because the animal is awake or whisking. Yet, natural whisking may induce specific computational constraints that are not expressed in the anesthetized animal. Thus, the working hypothesis developed here should ultimately be tested in freely moving animals performing object localization or identification during natural whisking.
Decoding and re-coding by the brain: a working hypothesis Herein, decoding refers to the extraction of information encoded in the transmitted signal, and re-coding refers to the representation of the same information using another coding scheme. The brain probably performs both operations simultaneously (Perkel and
Bullock, 1968). Our experimentally driven working hypothesis is that the lemniscal pathway decodes spatially encoded information, whereas the paralemniscal system decodes temporally encoded information. Both types of information are obtained during whisking, and encode spatial parameters, such as the location of objects in the surrounding space.
Temporal decoding
In principle, temporally encoded information can be decoded using either a 'passive' or 'active' strategy (Fig. 3A). Passive decoding employs feedforward arrays of time-sensitive elements, such as delay lines (Jeffress, 1948) or synaptic time constants (Buonomano and Merzenich, 1995), to convert temporal intervals to either labeled-line or population rate coding. Active decoding utilizes internal oscillations as 'temporal rulers', or temporal expectations, against which the temporal interval between input events is compared. This comparison can be implemented by open or closed loops and can result in a labeled-line or a population rate code (Ahissar, 1995). Whereas delay lines are inefficient for decoding temporal intervals greater than a few milliseconds (Carr, 1993), oscillation-based circuits are efficient particularly in this range (Ahissar, 1998). This suggests that delay-line solutions should be favored for processing temporal intervals in the microsecond range, which occur in binaural localization or echolocation systems, while oscillatory solutions should be preferable for processing intervals in the millisecond (a few to a few hundreds) range, which occur in visual or tactile perception. In fact, single-cell oscillators exist in the somatosensory and visual cortices. In the barrel cortex, the majority of oscillators have a spontaneous frequency around 10 Hz (Ahissar et al., 1997), which is suitable for decoding (by means of comparison) temporal information obtained during whisking. Interestingly, cortical oscillators in other species and other modalities exhibit frequency ranges that also match their corresponding sensory channels. Somatosensory oscillators in the monkey oscillate mostly around 30 Hz (Ahissar and Vaadia, 1990; Lebedev and Nelson, 1995), which is suitable for temporal information carried by rapidly adapting fibers (Talbot et al., 1968; Johansson et al.,
Fig. 3. Possible mechanisms for temporal decoding. (A) Passive decoding is assumed to flow through feedforward connections, where the activity at each level depends on the activity at a lower (more peripheral) level. Arrows represent feedforward elements including 'simple' axons, delay lines, and synapses. Active decoding involves an independent cortical source of information: local oscillators. Information flows both ways and is compared, in this example, in the thalamus. The circuits can be closed (dashed lines) or open loops. In direct coupling, the cortical oscillators are forced by the input via relay neurons. (B) The thalamo-cortical inhibitory phase-locked loop (iPLL) decoder. PD, phase detector; RCO, rate-controlled oscillator; INH, inhibitory neurons; ●, inhibitory connection; $\phi$, phase difference between the sensory input and the oscillator. (C) Predicted dependencies of thalamic and cortical responses (output rate and latency) on the input frequency, for 'passive' decoding or direct coupling and for 'active' decoding (iPLL). Adapted from Ahissar et al. (1997).
1982). Visual oscillations are mostly in the alpha (around 10 Hz) and gamma (40-100 Hz) ranges (Eckhorn et al., 1993; Gray and McCormick, 1996), matching the alpha and gamma ranges of retinal activation frequencies caused by the miniature eye movements (Bengi and Thomas, 1972; Eizenman et al., 1985) (see Ahissar and Arieli, 1997).
Fig. 4. Frequency locking of single-cell cortical oscillators (OSCs). (A and B) An OSC recorded from layer 2-3 of the barrel cortex of an anesthetized rat during stimulations of whisker E2 with square-wave stimuli. (A) ISI histograms computed for the entire stimulation periods (black) and the interleaved spontaneous periods (gray). Red arrows point to the inter-stimulus intervals. (B) Lock-in dynamics during single stimulus trains. ISIs (black circles), inter-stimulus intervals (open squares), and OSC delays (gray triangles) are plotted as a function of time. Time 0 and the dashed line denote the beginning of a stimulus train. Note the 1:1 firing (one OSC spike per stimulus cycle) and constant phase difference during stabilized states. In the trial presented in the lower panel, the OSC remained 'locked' for 2 additional cycles after the stimulus train ended. The sensitivity value extracted from the steady-state data is $\sigma = d\phi/dI \approx (160 - 55)/(200 - 125) = 1.4$. Note that the phase difference, $\phi$, here is the complement of the OSC delay (see text). (C) Frequency locking ranges for 13 single-cell oscillators of the barrel cortex, plotted against stimulus frequency (Hz). Symbols show the spontaneous frequencies of these oscillators. Locking index $= 1 - |f_i - f_o|/(f_i + f_o)$, where $f_i$ is the stimulation frequency and $f_o$ is the oscillator frequency during stimulation. Oscillators were grouped (left and right panels) according to their locking ranges. Adapted from Ahissar et al. (1997).
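The two quantities defined in the caption of Fig. 4 are simple enough to compute directly. The sketch below is illustrative only: the function names are hypothetical, and it assumes that the numerator values in the caption (55 and 160) are steady-state phases in ms and that the denominator values (125 and 200) are the corresponding inter-stimulus intervals in ms.

```python
def locking_index(stim_freq_hz: float, osc_freq_hz: float) -> float:
    """Locking index = 1 - |f_i - f_o| / (f_i + f_o).

    Equals 1 when the oscillator frequency matches the stimulation frequency
    and decreases as the two frequencies diverge.
    """
    if stim_freq_hz <= 0 or osc_freq_hz <= 0:
        raise ValueError("frequencies must be positive")
    return 1.0 - abs(stim_freq_hz - osc_freq_hz) / (stim_freq_hz + osc_freq_hz)


def sensitivity(phase_a_ms: float, phase_b_ms: float,
                period_a_ms: float, period_b_ms: float) -> float:
    """Finite-difference estimate of d(phase)/d(input period) between two steady states."""
    if period_a_ms == period_b_ms:
        raise ValueError("the two input periods must differ")
    return (phase_b_ms - phase_a_ms) / (period_b_ms - period_a_ms)


# Values quoted in the caption of Fig. 4: (160 - 55) / (200 - 125).
fig4_sensitivity = sensitivity(55.0, 160.0, 125.0, 200.0)
print(f"sensitivity from Fig. 4 values: {fig4_sensitivity:.2f}")
print(f"locking index for a perfectly locked 5 Hz oscillator: {locking_index(5.0, 5.0):.2f}")
```

Because the index is normalized by the sum of the two frequencies, it is symmetric in its arguments and stays between 0 and 1 for positive frequencies.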
Oscillation-based temporal decoders could involve open or closed loops (Ahissar, 1995). Of the two, the closed loop is the more efficient, since it is more adaptive, and thus requires fewer elements to achieve a given decoding resolution. Indeed, in the rat vibrissal system, we found evidence for the operation of an oscillation-based closed-loop system. This loop, known as the phase-locked loop (PLL, Ahissar, 1998; see also Kleinfeld et al., 1999), provides specific predictions for neuronal activity during periodic, or frequency modulated (FM), stimuli. One prediction is that the cortical oscillators should follow the external frequency during whisker stimulation. Indeed, within specific frequency ranges, this was the case with most barrel cortex oscillators (Fig. 4; see Ahissar et al., 1997). Another prediction is that the latency and intensity of the responses to periodic stimuli should be inversely related (Fig. 3C). The expected inverse relationship was found in the paralemniscal, but not in the lemniscal, system (Fig. 5; Ahissar et al., 2000). In the paralemniscal system, the latency increases with increasing stimulus frequencies, while the spike count (per stimulus cycle) decreases. This direction of the relationships indicates that the neuronal implementation of the PLL includes inhibition, and is therefore referred to as iPLL (Ahissar, 1998). Since cortico-POm connections probably do not send collaterals to the reticular nucleus of the thalamus (Bourassa et al., 1995), the inhibition in this loop should be cortical. The main principles of decoding by thalamo-cortical PLL circuits are: (i) the input timing (from the whiskers) is compared at the thalamus against the timing of the cortical oscillations; and (ii) the oscillation frequency is controlled by the resulting 'error signal' so as to match the input frequency (negative feedback). The PLL is an optimal temporal decoder, since it implements a narrow band-pass filter, which adapts its center frequency to match the input frequency (Gardner, 1979). By implementing PLLs, thalamo-cortical loops convert temporally encoded information into a rate-population code. This conversion is monotonic within a limited range of input frequencies, which determines the working range (see Section 'Sensitivity, gain, and working range').
Conversion of temporal code to rate-population code The main components of the iPLL are (Fig. 3B) the thalamic phase detector (PD), the inhibitory neuron (INH) and the cortical rate-controlled oscillator (RCO).
The information in the input to the PLL is encoded temporally in the inter-event intervals. The PLL re-encodes this information in the form of rate code at the output of the PD. This process is accomplished in three stages: (1) phase detection, (2) rate conversion, and (3) adaptation. Phase detection and rate conversion are accomplished by the PD, whose total output decreases with the detected phase difference. This is probably implemented by a simple
Fig. 5. Typical steady-state tuning curves of local neuronal populations along the two sensory pathways. Examples of pooled single and multi-units recorded from single electrodes in the brainstem (Pr5 and Sp5I nuclei), thalamus (VPM and POm nuclei), and cortex (layer 4, barrels, and layer 5a, non-barrels). Stimuli were air-puffs directed to several whiskers of 1 or 2 rows (3 s trains of 50 ms pulses in the protraction direction for all frequencies; Ahissar and Haidarliu, 1998). Graphs depict the mean ± S.E.M. (small vertical bars), during steady state (0.5 < t < 3 s), of response latency (solid lines, latency to half-height rising edge of the PSTH) and response magnitude (broken lines, spike counts per stimulus cycle), as a function of stimulus frequency (Hz). Brainstem responses exhibited a negligible, if any, dependency on the stimulus frequency. Lemniscal thalamo-cortical latencies were constant, whereas spike counts decreased with increasing frequencies. Paralemniscal latencies and spike counts exhibited inverse dependencies: latencies increased, whereas spike counts decreased with increasing stimulus frequencies. Latencies are plotted with a common scale (0 to 60 ms), whereas spike counts are plotted with independent arbitrary scales that emphasize the relative values for each local population. Although the absolute spike count values varied significantly between different local populations, their frequency dependency patterns were typical for each nucleus or cortical layer. Latency values were consistent within each nucleus and cortical layer.
AND gating at the thalamus (Ahissar, 1998). The critical variable is the total output of the PD over a single cycle of the RCO, or equivalently the average rate over the relevant unit of time provided by the period of the oscillator. The instantaneous rate of the PD does not matter. For example, one possible implementation (Ahissar et al., 1997) considers a PD whose active instantaneous rate is constant. Varying the active duration inversely with the phase difference controls the total output of the PD along the cycle. Adaptation is accomplished by the RCO, whose period is affected by the output rate of the PD. By virtue of the closed loop, the RCO tunes its period to match that of the input. The tuning affects the phase relationship with the input events, and thus, the corresponding output from the PD. Upon reaching a steady state, the PLL uniquely re-encodes the input period in the phase difference and the output rate. Tuning also enables the PLL to quickly detect changes in the period of the input, which has been internally represented as expectations in the form of the adapted period of the RCO. Both of these capabilities may be characterized by the sensitivity of the PLL.
Sensitivity, gain, and working range of the PLL The period of the RCO varies in order to match the period of the input (see previous section). When a periodic input is applied, a stable PLL will reach a new steady state in which the RCO oscillates at the period of the input and maintains a constant phase difference² relative to the input events. In such a case, the steady-state phase re-encodes the period of the input, and is further encoded by the steady-state output rate of the PD. This process depends critically on the open-loop gain of the PLL. The open-loop gain, $\beta$, determines the change in the period of the RCO due to a small change in the phase difference, and is given by the product of the PD and RCO gains. (The PD gain determines the change in the output rate of the PD due to a small change in the phase, and the RCO gain determines the change in the period of the RCO due to a small change in its input.) When both the PD and the RCO are linear, the gains of the PD and RCO are constant, and so is the open-loop gain. In other instances, the gains may vary with the frequency. The open-loop gain determines the internal loop dynamics and thus also the stability of the PLL (Ahissar, 1998); it should be less than 2 for the PLL to be stable, and less than 1 for it to be monotonically stable³. Note that these are the same stability conditions as for a forced oscillator (Perkel et al., 1964). A critical aspect of the PLL is that it constitutes a closed loop so that changes in the period of the RCO affect the phase difference and thus the subsequent period of the RCO. The overall response of the closed loop is described by the closed-loop gain, which differs from the open-loop gain. The closed-loop gain can be related to one of the output variables of the loop: the output rate or the phase difference. For comparison with forced oscillators, we relate it here to the phase difference and term the corresponding closed-loop gain sensitivity. Specifically, the sensitivity, $\sigma$, is the ratio between variations in the steady-state phase difference and variations in the normalized input period (normalized with respect to the intrinsic period of the RCO). We define $\sigma = d\phi/dI$, where $d\phi$ is the change in the response lead (the complement of the response latency) and $dI$ is the change in the input period. The sensitivity of the PLL is inversely related to the open-loop gain ($\sigma = 1 + 1/\beta$; Zacksenhouse, 2001), so the sensitivity of a stable PLL should be larger than 1.5. The sensitivity is also inversely related to the working range, which is the range of input periods over which the phase difference and output rate vary monotonically. Temporal decoding presents the PLL with the contradictory requirements for high sensitivity and a wide working range. Given this, what is the selection made by the nervous system?
² The phase difference can be defined as either the response latency or its complement, the response lead, normalized with respect to the intrinsic period of the RCO.
³ Note that the polarity of the gain depends on the definition of the phase difference (phase lag versus phase lead).
Extraction of the relevant values from data collected during our cortical and thalamic recordings (Ahissar et al., 1997, 2000) revealed that the brain does not employ high sensitivities; the sensitivity
values were usually larger than 1, but less than 1.5 (see, for example, Fig. 4). This range (1-1.5) of experimentally estimated decoding sensitivity is further revealing, since it has different implications depending on the nature of the underlying neural circuit. In this section, the implications of the assumption that the cortical oscillators are embedded in PLLs will be considered. In the following section, the implications of the assumption that the cortical oscillators are directly forced will be considered (and rejected). Assuming that the thalamocortical loops implement iPLLs, the estimated sensitivity of 1-1.5 indicates that these loops operate outside the stable range, or at the most on the verge of stability. This conclusion is supported by the observation that during applications of an external periodic stimulus, cortical and thalamic neurons usually toggle between phase-locked (stable) and unlocked periods (Ahissar et al., 1997; Ahissar et al., 2000). Furthermore, the quasi-steady phase locking sections are preceded by phase oscillations (e.g. Fig. 4). These observations further support the conclusion that the neural circuits operate in a high-gain, low-sensitivity regime. While the high open-loop gain of a PLL has the disadvantage of de-stabilizing the response, it has the advantage of increasing the working range over which the system is sensitive to temporal variations in the input. Indeed, in the paralemniscal system, the tracking ranges of thalamic and cortical neurons are large in comparison to their intrinsic periods. For example, typical tracking ranges stretch from 3 to at least 7-8 Hz (Ahissar et al., 2000, 2001; Sosnik et al., 2000; see Fig. 4), which corresponds to working ranges of >200 ms. If indeed these neurons participate in iPLLs, the intrinsic frequency of the RCOs should be above 7-8 Hz, and thus, these working ranges are significantly larger than the intrinsic period.
This can only be achieved with high open-loop gains and low sensitivities ($\beta > 2$, $\sigma < 1.5$), and thus, would be outside the stability range of the PLL (Ahissar, 1998; Zacksenhouse, 2001). We may conclude that the neural implementation achieves a large working range at the expense of the stability of individual PLLs. In light of this instability of individual PLLs, reliable temporal-to-rate transformation may require the co-operation of a pool of PLLs (see Section 'Weakly coupled PLLs' below).
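The stability and sensitivity relations used in this section can be checked with a minimal discrete-time sketch of a linear iPLL. This is not the simulation of Ahissar (1998): the linear PD and RCO, the function names, and all parameter values (a 100 ms intrinsic period, up to 60 ms of inhibitory period lengthening, a 22 ms initial delay) are assumptions chosen only for illustration. The PD output falls linearly with the delay of the RCO relative to the input, inhibition lengthens the RCO period in proportion to the PD output, and the product of the two slopes is the open-loop gain.

```python
def simulate_ipll(input_period_ms, beta, n_cycles=40,
                  intrinsic_period_ms=100.0, max_inhibition_ms=60.0,
                  initial_delay_ms=22.0):
    """Return the RCO delay (ms) relative to the input event on each cycle.

    Linear sketch: rates are not clipped, so diverging runs only indicate
    instability and are not meant to be physiologically meaningful.
    """
    pd_slope = beta / max_inhibition_ms  # PD gain: output lost per ms of delay
    delay = initial_delay_ms
    delays = [delay]
    for _ in range(n_cycles):
        # Phase detection and rate conversion: output decreases with the delay.
        pd_output = 1.0 - pd_slope * delay
        # Adaptation: inhibition driven by the PD lengthens the RCO period.
        rco_period = intrinsic_period_ms + max_inhibition_ms * pd_output
        # Closed loop: a period mismatch shifts the delay on the next cycle.
        delay = delay + rco_period - input_period_ms
        delays.append(delay)
    return delays


def steady_state_lead(input_period_ms, beta):
    """Response lead (complement of the delay) after lock-in."""
    return input_period_ms - simulate_ipll(input_period_ms, beta)[-1]


def estimated_sensitivity(beta, period_a_ms=110.0, period_b_ms=120.0):
    """Finite-difference estimate of d(lead)/d(input period)."""
    lead_a = steady_state_lead(period_a_ms, beta)
    lead_b = steady_state_lead(period_b_ms, beta)
    return (lead_b - lead_a) / (period_b_ms - period_a_ms)


def gain_from_sensitivity(sigma):
    """Invert sigma = 1 + 1/beta (valid for sigma > 1)."""
    if sigma <= 1.0:
        raise ValueError("sigma must exceed 1 for a PLL")
    return 1.0 / (sigma - 1.0)


for beta in (0.8, 1.5):
    print(f"beta={beta}: simulated sigma={estimated_sensitivity(beta):.3f}, "
          f"1 + 1/beta={1 + 1 / beta:.3f}")
print(f"sigma=1.4 implies beta={gain_from_sensitivity(1.4):.2f}")
```

In this sketch the deviation from the locked delay is multiplied by (1 - beta) on every cycle, so lock-in is monotonic for beta below 1, oscillatory but convergent for beta between 1 and 2, and divergent for beta above 2. The sensitivity of about 1.4 estimated from Fig. 4 corresponds to a gain of about 2.5, which falls in the divergent regime of the sketch.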
PLLs versus forced oscillators Sensitivity analysis provides a tool for discriminating between oscillators embedded within iPLLs and oscillators that are forced directly by the input (Fig. 3). The sensitivity of a forced oscillator is $\sigma = 1/\beta$ (Zacksenhouse, 2001), so a forced oscillator whose sensitivity is larger than 0.5 is stable, and is monotonically stable if its sensitivity is larger than 1. However, although the sensitivity of neurons in the vibrissal system has been estimated to be larger than 1, their response patterns do not exhibit monotonic stability. For example, the sensitivity of the neuron whose response is depicted in Fig. 4 has been estimated to be around 1.4, but its phase oscillates before reaching a steady state. Thus, the sensitivity analysis enables us to reject, at least in this and other similar cases, the hypothesis that the oscillating units are directly forced by the input.
Weakly coupled PLLs How do unstable PLLs perform reliably and provide meaningful rate code? One appealing explanation relies on a pool of weakly coupled PLLs. The coupling may be achieved at the level of the RCOs. In response to a given input, the percentage of RCOs that phase lock to the input and oscillate at its frequency increases. The weak coupling between the RCOs may then evoke the 'magnet effect' (von Holst, 1973), and thereby attract the rest of the RCOs to oscillate at the same frequency. Indeed, computer simulations support this scenario (E. Ahissar, unpublished observations). Thus, it might be that the rate-coded output should be read out from an ensemble of parallel PLL circuits in order to obtain reliable representations.
Rate-population code at the output of the PLL Thalamic implementations of PDs most likely involve local populations of a few tens of 'relay' neurons (Ahissar, 1998), whose total spike count (per cycle) establishes the output code of the PLL. The basic output code relies on the summed activity across all PD neurons.
However, specific implementations may also involve a population-vector code (in addition to the population-sum code), which is based on
Fig. 6. A computer simulation of the population coding at the output of the PLL. (A) Simulated circuit and spike trains. The circuit was composed of one input cell (I), 20 PD neurons (PD), 20 different delay lines from the input to the PD neurons, and one RCO neuron (RCO) that receives an inhibitory input from each of the 20 PD neurons. The timings of the input spikes and the membrane voltage of the RCO are presented at the bottom. (B) The output of the PLL, which is the population output of the PD. The spike trains of the 20 PD neurons are depicted. Each line represents, as a function of time, the membrane voltage of one PD neuron. (C) The integrated input of the RCO, i.e. the total inhibitory conductance caused by synaptic input to the RCO neuron. (D) The instantaneous ISIs of the input I and the RCO are described as a function of time. After a lock-in stage (whose duration and dynamics depend on the loop gain and the initial phase; Ahissar, 1998), the two curves essentially merge. The simulation was performed on a DEC 3100 workstation using Genesis (Wilson and Bower, 1989) and using Hodgkin-Huxley kinetics. Adapted from Ahissar (1998).
the identity of the active PD neurons, as illustrated in Fig. 6 (see Ahissar, 1998). In this implementation, specific input periods are represented by both (i) specific sets of synchronously active PD neurons and (ii) specific values of the summed activity of all PD neurons. Whether such a population-vector coding takes place is not yet known. A temporal-to-rate code transformation should result in a cortical rate-coded representation of horizontal object location. PLL simulations demonstrate that when a new object appears within the whisking trajectory, this appearance should evoke a transient response in the PD spike count. The magnitude of the transient response should be proportional to the spatial angle between the position of the protraction onset and the location of the object (Fig. 7), and should depend on the loop sensitivity. This prediction of the PLL mechanism has not yet been tested.
Spatial decoding Spatial decoding mechanisms commonly involve spatial derivatives (or comparisons) which are obtained by lateral inhibition. In the lemniscal system, spatial decoding may be implemented at the VPM, where lateral inhibition is accomplished via the thalamic reticular nucleus (TRN). The known anatomy and physiology of these two thalamic nuclei (Salt, 1989; Lee et al., 1994) support such a function. However, the massive, well-organized, cortical feedback to the thalamus indicates that this is not the entire story. Cortical projections to the VPM diverge along the arcs, whereas projections to the TRN (as well as those to the POm) diverge along the rows (Welker et al., 1988). We postulate that this anatomical polarization enables the thalamocortical machinery to focus
84 150 • 1
~ 0
0
0
~
1000
0
[Figure 7: simulation traces; horizontal axis, Time (ms).]
Fig. 7. Computer simulation of an iPLL: Cortical representation of object location. Cortical spontaneous oscillations were at 10 Hz (Io, circles). The whisking (I = 110 ms; squares) commenced at t = 300 ms, and the object was introduced at t = 1200 ms and a spatial angle of 32°. The introduction of the object was simulated by inserting, in every whisking cycle, an additional input spike at the time (20 ms from the protraction onset) when the whisker would have touched the object. After a transient response, the iPLL re-locked to the full whisking cycle, but with a new phase (upper panel; dt is the delay between protraction onset and RCO firing, triangles). Different object locations (8°, 16° and 32°) are represented by different magnitudes of the transient response (lower panel; arbitrary vertical scale). Adapted from Ahissar et al. (1997).
on one specific arc (see Ghazanfar and Nicolelis, 1997), the arc whose whiskers are the first to hit the object. Since the whiskers are not all of the same length (Brecht et al., 1997), the first arc to be activated depends on the radial distance of the object from the animal's snout. The VPM representation of that arc (Haidarliu and Ahissar, 2001) will be depolarized via the arc-oriented cortical feedback, while the VPM representations of the other arcs will be inhibited via the row-oriented TRN activation. Depolarization moves the VPM cells into a gating mode, which facilitates fine computation (Sherman and Guillery, 1996). Computation might be based on simple lateral comparisons or, more likely, on a closed-loop thalamo-cortical process. In the latter case, the computation may involve cortical expectations and error signals, as in the PLL-based temporal processing carried out by the paralemniscal system. Two distinct types of computational tasks are relevant here: localization and identification. Localization relies on the identity of the activated whisker, and thus may be determined using lateral comparisons along the arc. Identification relies on the fine textural details of the surface, which are encoded both temporally and spatially. Temporal phase relationships across arc-wise neighboring whiskers represent vertical details, such as offsets or curvatures, while temporal intervals in the activity of individual whiskers represent horizontal details, such as spatial frequency or roughness. The temporal frequencies involved in identification depend on the object's spatial frequencies (Carvell and Simons, 1990) and are usually much higher than those involved in horizontal localization, which is postulated to be computed by the paralemniscal system. However, the lemniscal system might still utilize the PLL strategy to decode such high-frequency information. If so, the lemniscal system should rely on the higher-frequency cortical oscillations (Ahissar et al., 1997; Jones and Barth, 1999), and should exhibit latency coding of the input temporal frequency in the range of the decoded frequencies (> 10 Hz).
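The PLL strategy mentioned above can be illustrated with a deliberately simplified, first-order loop. The sketch below is not the simulation of Fig. 7 or the model of Ahissar et al. (1997); it is a generic toy in which the function name, the linear phase detector, the 50 ms detector window and the loop gain of 0.5 are all assumptions chosen for illustration. It shows only the qualitative point: an oscillator with a 100 ms intrinsic period locks to a 110 ms input, and the locked delay and detector output then encode the input period.

```python
def simulate_pll(input_period_ms, intrinsic_period_ms=100.0, window_ms=50.0,
                 gain=0.5, initial_delay_ms=0.0, n_cycles=60):
    """Toy first-order phase-locked loop (illustrative assumptions only).

    Returns two lists with one entry per cycle: the delay between input
    onset and oscillator firing (ms), and the phase-detector output
    (arbitrary units).
    """
    delay = initial_delay_ms
    delays, outputs = [], []
    for _ in range(n_cycles):
        # Hypothetical phase detector: output falls linearly as the
        # oscillator lags further behind the input, and is zero beyond
        # the detection window.
        output = max(0.0, window_ms - delay)
        # Inhibitory coupling: detector output lengthens the next
        # oscillator period relative to its intrinsic period.
        oscillator_period = intrinsic_period_ms + gain * output
        delays.append(delay)
        outputs.append(output)
        # The delay grows when the oscillator runs slower than the input
        # and shrinks when it runs faster.
        delay += oscillator_period - input_period_ms
    return delays, outputs


delays, outputs = simulate_pll(110.0)
print(round(delays[-1], 3), round(outputs[-1], 3))
```

With these assumed parameters the loop settles where the oscillator period equals the input period, that is, at a detector output of (110 - 100) / 0.5 = 20 and a delay of 50 - 20 = 30 ms. A longer input period yields a larger steady-state output, which is the sense in which a temporal code is converted into a rate-like code in this toy.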
Integration
Tactile acquisition by the whiskers is an active motor-sensory process. Whiskers are moved by the motor system to acquire information, which is then analyzed by the sensory system. This system operates in a closed loop, since sensory information drives motor circuits at multiple levels (Kleinfeld et al., 1999). As with any other closed loop, there is no starting point; the process is not more sensory-motor than it is motor-sensory. The sensory part of the loop contains two parallel pathways: the lemniscal and paralemniscal systems. Our working hypothesis is that the paralemniscal system decodes the low-frequency (whisking-range) temporally encoded information, and that the lemniscal system decodes spatially encoded information and high-frequency (texture-dependent) temporally encoded information. Since temporal decoding by PLLs results in a rate-population code, the integration of all the decoded outputs should be straightforward. Indeed, signs of such integration were observed in layer 2/3 of the barrel cortex (Ahissar et al., 2001). Yet, some anatomical segregation is preserved between these two systems, even at the cortical level (Kim and Ebner, 1999), which suggests that even the sensory-motor control systems operate in parallel, at least up to a later motor stage.
From a functional point of view, the vibrissal system includes three parallel sensory-motor processes. These processes can be distinguished by their decoding of sensory cues: one decodes temporally encoded low-frequency horizontal localization cues, another decodes temporally encoded high-frequency textural cues, and the third decodes spatially encoded vertical cues. The sensory part of the first process is accomplished by the paralemniscal system, whereas those of the other two are implemented by the lemniscal system. Theoretically, each process could be implemented by a separate sensory-motor loop, and could even have a separate muscular system on the mystacial pad. Alternatively, the three processes could share at least some of the motor circuits and muscles. Although the latter scheme seems more plausible, the stage at which the different processes merge is not yet clear. As with the local sensory loops, the operation of each sensory-motor loop is characterized by its loop gain and sensitivity. The gains of the different sensory-motor loops are probably under global control so that the brain can regulate the strength of the sensory-motor coupling of each loop. This regulation can bring a specific sensory-motor process to the 'foreground', while keeping the other two processes in the 'background', during the performance of a specific task.
Abbreviations
PLL: phase-locked loop
iPLL: inhibitory PLL
ePLL: excitatory PLL
FM: frequency modulation
RCO: rate-controlled oscillator
PD: phase detector
VPM: ventral posterior medial nucleus of the thalamus
POm: medial division of the posterior nucleus of the thalamus
TRN: thalamic reticular nucleus
Non-keyboard characters
sensitivity: σ (Greek sigma)
gain: β (Greek beta)
phase: φ (Greek phi)
straddlers: α, β, γ, δ (Greek alpha, beta, gamma, delta)
Acknowledgements
We wish to thank S. Haidarliu and R. Sosnik for their help with the experiments and data analysis and B. Schick for reviewing the manuscript. This work was supported by the United States-Israel Binational Science Foundation (Israel) Grant 97-222 and the MINERVA Foundation, Germany.
References
Ahissar, E. (1995) Conversion from temporal-coding to rate-coding by neuronal phase-locked loops. The Weizmann Institute of Science, Rehovot, Technical Report GC-EA/95-4.
Ahissar, E. (1998) Temporal-code to rate-code conversion by neuronal phase-locked loops. Neural Comput., 10(3): 597-650.
Ahissar, E. and Arieli, A. (1997) Seeing through miniature eye movements: A hypothesis. Neurosci. Lett. Suppl., 48: S2.
Ahissar, E. and Vaadia, E. (1990) Oscillatory activity of single units in a somatosensory cortex of an awake monkey and their possible role in texture analysis. Proc. Natl. Acad. Sci. USA, 87: 8935-8939.
Ahissar, E., Haidarliu, S. and Zacksenhouse, M. (1997) Decoding temporally encoded sensory input by cortical oscillations and thalamic phase comparators. Proc. Natl. Acad. Sci. USA, 94: 11633-11638.
Ahissar, E., Sosnik, R. and Haidarliu, S. (2000) Transformation from temporal to rate coding in a somatosensory thalamocortical pathway. Nature, 406: 302-306.
Ahissar, E., Sosnik, R. and Haidarliu, S. (2000) Temporal frequency of whisker movement: II. Laminar organization of cortical representations. J. Neurophysiol., in press.
Bengi, H. and Thomas, J.G. (1972) Studies on human ocular tremor. In: R.M. Kenedi (Ed.), Prospectives in Biomedical Engineering. Macmillan, London, pp. 281-292.
Bishop, G.H. (1959) The relation between nerve fiber size and sensory modality: Phylogenetic implications of the afferent innervation of cortex. J. Nerv. Ment. Dis., 128: 89-114.
Bourassa, J., Pinault, D. and Deschenes, M. (1995) Corticothalamic projections from the cortical barrel field to the somatosensory thalamus in rats: a single-fibre study using biocytin as an anterograde tracer. Eur. J. Neurosci., 7: 19-30.
Brecht, M., Preilowski, B. and Merzenich, M.M. (1997) Functional architecture of the mystacial vibrissae. Behav. Brain Res., 84: 81-97.
Brown, A.W. and Waite, P.M. (1974) Responses in the rat thalamus to whisker movements produced by motor nerve stimulation. J. Physiol., 238: 387-401.
Buonomano, D. and Merzenich, M.M. (1995) Temporal information transformation into a spatial code by a neural network with realistic properties. Science, 267: 1028-1030.
Carr, C.E. (1993) Processing of temporal information in the brain. Annu. Rev. Neurosci., 16: 223-243.
Carvell, G.E. and Simons, D.J. (1990) Biometric analyses of vibrissal tactile discrimination in the rat. J. Neurosci., 10: 2638-2648.
Chmielowska, J., Carvell, G.E. and Simons, D.J. (1989) Spatial organization of thalamocortical and corticothalamic projection systems in the rat SmI barrel cortex. J. Comp. Neurol., 285: 325-338.
Diamond, M.E. (1995) Somatosensory thalamus of the rat. In: E.G. Jones and I.T. Diamond (Eds.), Cerebral Cortex, Vol. 11. Plenum Press, New York, pp. 189-219.
Diamond, M.E. and Armstrong-James, M. (1992) Role of parallel sensory pathways and cortical columns in learning. Concepts Neurosci., 3: 55-78.
Diamond, M.E., Armstrong-James, M., Budway, M.J. and Ebner, F.F. (1992) Somatic sensory responses in the rostral sector of the posterior group (POm) and in the ventral posterior medial nucleus (VPM) of the rat thalamus: dependence on the barrel field cortex. J. Comp. Neurol., 319: 66-84.
Eckhorn, R., Frien, A., Bauer, R., Woelbern, T. and Kehr, H. (1993) High frequency (60-90 Hz) oscillations in primary visual cortex of awake monkey. NeuroReport, 4: 243-246.
Eizenman, M., Hallett, P.E. and Frecker, R.C. (1985) Power spectra for ocular drift and tremor. Vis. Res., 25: 1635-1640.
Fanselow, E.E. and Nicolelis, M.A.L. (1999) Behavioral modulation of tactile responses in the rat somatosensory system. J. Neurosci., 19: 7603-7616.
Friedberg, M.H., Lee, S.M. and Ebner, F.F. (1999) Modulation of receptive field properties of thalamic somatosensory neurons by the depth of anesthesia. J. Neurophysiol., 81: 2243-2252.
Gardner, F.M. (1979) Phaselock Techniques. Wiley, New York.
Ghazanfar, A.A. and Nicolelis, M.A. (1997) Nonlinear processing of tactile information in the thalamocortical loop. J. Neurophysiol., 78: 506-510.
Gray, C.M. and McCormick, D.A. (1996) Chattering cells: superficial pyramidal neurons contributing to the generation of synchronous oscillations in the visual cortex. Science, 274: 109-113.
Haidarliu, S. and Ahissar, E. (1997) Spatial organization of facial vibrissae and cortical barrels in the guinea pig and golden hamster. J. Comp. Neurol., 385: 515-527.
Haidarliu, S. and Ahissar, E. (2001) Size gradients of barreloids in the rat thalamus. J. Comp. Neurol., 429: 372-387.
Von Holst, E. (1973) Relative coordination as a phenomenon and as a method of analysis of central nervous functions. In: R.D. Martin (Ed.), The Behavioral Physiology of Analysis of Central Nervous Functions of Animals and Man. Selected Papers of Eric von Holst. University of Miami Press, Coral Gables.
Hoogland, P.V., Welker, E., Van der Loos, H. and Wouterlood, F.G. (1988) The organization and structure of the thalamic afferents from the barrel cortex in the mouse; a PHA-L study. In: M. Bentivoglio and R. Spreafico (Eds.), Cellular Thalamic Mechanisms. Elsevier, Amsterdam, pp. 152-162.
Jeffress, L.A. (1948) A place theory of sound localization. J. Comp. Physiol. Psychol., 41: 35-39.
Johansson, R.S., Landstrom, U. and Lundstrom, R. (1982) Responses of mechanoreceptive afferent units in the glabrous skin of the human hand to sinusoidal skin displacements. Brain Res., 244: 17-25.
Jones, M.S. and Barth, D.S. (1999) Spatiotemporal organization of fast (>200 Hz) electrical oscillations in rat vibrissa/barrel cortex. J. Neurophysiol., in press.
Kim, U. and Ebner, F.F. (1999) Barrels and septa: separate circuits in rat barrels field cortex. J. Comp. Neurol., 408: 489-505.
Kleinfeld, D., Berg, R.W. and O'Connor, S.M. (1999) Anatomical loops and their electrical dynamics in relation to whisking by rat. Somatosens. Mot. Res., 16: 69-88.
Koralek, K.A., Jensen, K.F. and Killackey, H.P. (1988) Evidence for two complementary patterns of thalamic input to the rat somatosensory cortex. Brain Res., 463: 346-351.
Lebedev, M.A. and Nelson, R.J. (1995) Rhythmically firing (20-50 Hz) neurons in monkey primary somatosensory cortex: Activity patterns during initiation of vibratory-cued hand movements. J. Comput. Neurosci., 2: 313-334.
Lee, S.M., Friedberg, M.H. and Ebner, F.F. (1994) The role of GABA-mediated inhibition in the rat ventral posterior medial thalamus. I. Assessment of receptive field changes following thalamic reticular nucleus lesions. J. Neurophysiol., 71: 1702-1715.
Lu, S.M. and Lin, R.C. (1993) Thalamic afferents of the rat barrel cortex: a light- and electron-microscopic study using Phaseolus vulgaris leucoagglutinin as an anterograde tracer. Somatosens. Mot. Res., 10: 1-16.
Nicolelis, M.A.L., Baccala, L.A., Lin, R.C.S. and Chapin, J.K. (1995) Sensorimotor encoding by synchronous neural ensemble activity at multiple levels of the somatosensory system. Science, 268: 1353-1358.
Perkel, D.H. and Bullock, T.H. (1968) Neural coding. Neurosci. Res. Progr. Bull., 6: 221-248.
Perkel, D.H., Schulman, J.H., Bullock, T.H., Moore, G.P. and Segundo, J.P. (1964) Pacemaker neurons: effects of regularly spaced synaptic input. Science, 145: 61-63.
Salt, T.E. (1989) Gamma-aminobutyric acid and afferent inhibition in the cat and rat ventrobasal thalamus. Neuroscience, 28: 17-26.
Sherman, S.M. and Guillery, R.W. (1996) Functional organization of thalamocortical relays. J. Neurophysiol., 76: 1367-1395.
Shipley, M.T. (1974) Response characteristics of single units in the rat's trigeminal nuclei to vibrissa displacements. J. Neurophysiol., 37: 73-90.
Simons, D.J., Carvell, G.E., Hershey, A.E. and Bryant, D.P. (1992) Responses of barrel cortex neurons in awake rats and effects of urethane anesthesia. Exp. Brain Res., 91: 259-272.
Sosnik, R., Haidarliu, S. and Ahissar, E. (2001) Temporal frequency of whisker movement: I. Representations in brainstem and thalamus. J. Neurophysiol., in press.
Talbot, W.H., Darian-Smith, I., Kornhuber, H.H. and Mountcastle, V.B. (1968) The sense of flutter-vibration: comparison of the human capacity with response patterns of mechanoreceptive afferents from the monkey hand. J. Neurophysiol., 31: 301-334.
Welker, E., Hoogland, P.V. and Van der Loos, H. (1988) Organization of feedback and feedforward projections of the barrel cortex: a PHA-L study in the mouse. Exp. Brain Res., 73: 411-435.
Welker, W.I. (1964) Analysis of sniffing of the albino rat. Behaviour, 22: 223-244.
Williams, M.N., Zahm, D.S. and Jacquin, M.F. (1994) Differential foci and synaptic organization of the principal and spinal trigeminal projections to the thalamus in the rat. Eur. J. Neurosci., 6: 429-453.
Wilson, M.A. and Bower, J.M. (1989) The simulation of large-scale neural networks. In: C. Koch and I. Segev (Eds.), Methods in Neuronal Modeling: from Synapses to Networks. MIT Press, Cambridge, MA, pp. 291-333.
Wineski, L.E. (1983) Movements of the cranial vibrissae in the golden hamster (Mesocricetus auratus). J. Zool. Lond., 200: 261-280.
Woolsey, T.A. (1997) Barrels, vibrissae and topographic representations. In: G. Adelman and B. Smith (Eds.), Encyclopedia of Neuroscience, Vol. I. Elsevier, Amsterdam, pp. 195-199.
Zacksenhouse, M. (2001) Sensitivity of basic oscillatory mechanisms for pattern generation and decoding. Biol. Cybern., in press.
Zucker, E. and Welker, W.I. (1969) Coding of somatic sensory input by vibrissae neurons in the rat's trigeminal ganglion. Brain Res., 12: 138-156.
M.A.L. Nicolelis (Ed.)
Progress in Brain Research, Vol. 130 © 2001 Elsevier Science B.V. All rights reserved
CHAPTER 7
Thalamocortical and corticocortical interactions in the somatosensory system
Miguel A.L. Nicolelis 1,2,3,* and Marshall Shuler 1
1 Department of Neurobiology, Duke University, Durham, NC 27710, USA
2 Department of Biomedical Engineering, Duke University, Durham, NC 27710, USA
3 Department of Psychology: Experimental, Duke University, Durham, NC 27710, USA
*Corresponding author: Miguel A.L. Nicolelis, Department of Neurobiology, Box 3209, Bryan Research Building, Room 333, 101 Research Drive, Durham, NC 27710, USA. Tel. +1-919-684-4580; Fax: +1-919-684-5435; E-mail: [email protected]
Introduction
Until recently, neurophysiological theories aimed at accounting for the exquisite tactile perceptual capabilities of mammals have been dominated by the notion that the somatosensory system relies primarily on feedforward computations to generate a broad spectrum of sensations (e.g. fine touch, thermosensation, pain, etc.) (Mountcastle, 1957, 1974; Dykes, 1983; Johnson et al., 1995). Despite the widespread acceptance of this view, over the last three decades considerable anatomical, physiological, and behavioral evidence has been put forward to challenge a pure feedforward view of touch. Accordingly, more than ever, the potential contribution of some of the main 'building blocks' of this model of touch, concepts such as the classic receptive field, independent parallel pathways from the periphery to the cortex, cortical columns, and static somatotopic maps, has been the subject of considerable debate (Merzenich et al., 1983; Purves et al., 1992; Ghazanfar and Nicolelis, 1999).
The anatomical organization of the somatosensory system provides the first hint that tactile information processing involves more than just feedforward interactions (Fig. 1). On their way to the neocortex, ascending somatosensory pathways converge on neurons located in a series of subcortical nuclei in the spinal cord, brainstem, and thalamus (Kaas and Pons, 1988). In addition to these parallel feedforward pathways that convey information from the periphery to the cortex, neurons located at cortical and subcortical regions that define the different processing levels of the somatosensory system receive convergent input from multiple descending feedback pathways, originating in several cortical areas (Kaas, 1990). These feedback projections define extensive thalamocortical and corticocortical loops that are largely ignored by the classical feedforward model of touch. Part of the reason for this omission is the inherent experimental difficulty of measuring the potential effects of feedback projections, which has contributed to a scarcity of information and conflicting data regarding the physiological role of recurrent circuits.
Research on artificial neural networks suggests that the main computational advantage of a neural system with highly recurrent projections is that such networks do not have to synthesize a new view of the world every time new raw information is sampled (as suggested by a pure feedforward model). Instead, a recurrent system can use previously learned experiences to generate an 'internal' model of the world (Mumford, 1994), and take advantage of this model to generate expectations and predictions every time an
Fig. 1. Schematic diagram of the rat trigeminal somatosensory system. Whiskers on the rat's snout are labeled according to the row and column in which they are located. Whisker columns are labeled from 1 to 5, caudal to rostral, while whisker rows are labeled A to E, dorsal to ventral. Peripheral nerve fibers innervating single whisker follicles have their cell bodies located in the trigeminal ganglion (Vg). Here, only the projections from Vg neurons to two main subdivisions of the trigeminal brainstem complex, the principal trigeminal nucleus (PrV) and the spinal trigeminal nucleus (SpV), are illustrated. Proponents of the feedforward model of touch usually divide these projections into rapidly adapting (RA) and slowly adapting (SA) fibers, according to their physiological responses to tactile stimuli (see text). Each of these categories contains further subdivisions, which are not described here. Neurons located in these two brainstem nuclei give rise to parallel excitatory projections to the ventroposterior medial nucleus of the thalamus (VPM). Neurons in VPM give rise to projections to layer IV of the primary somatosensory cortex (SI). Collaterals of these thalamocortical projections reach the reticular nucleus of the thalamus (RT), whose neurons provide the main source of GABAergic inhibition to the VPM. Descending excitatory corticothalamic projections, originating in layer VI of the SI cortex, reach the VPM and the RT. The assumed topographic arrangement of these projections in the VPM and the RT is illustrated in the scheme. Feedback corticofugal projections originating in layer V of the SI cortex also reach the trigeminal brainstem complex, targeting primarily the SpV subdivision.
exploratory tactile behavior is planned. In the case of the somatosensory system, the vast network of corticocortical connections and massive corticofugal pathways to the thalamus, brainstem and spinal cord would provide the anatomical substrate for the dissemination of predictions of an internal touch model of the world across multiple cortical and subcortical areas (Grossberg, 1999). The interaction between these 'expectations' and raw tactile information provided by feedforward pathways could then define the computations needed to allow animals to accurately perceive the nature of any given tactile stimulus as well as provide a continuous update of the internal model. This would be accomplished through dynamic interactions between descending and ascending projections that reciprocally connect populations of cortical and thalamic neurons.
The recent introduction of electrophysiological methods that allow one to simultaneously record the activity of large populations of single neurons, located in multiple cortical areas and subcortical nuclei, while carrying out selective and reversible pharmacological inactivation of specific regions of the cortex has provided new insights into the functional contribution of corticocortical and thalamocortical loops to tactile information processing. Here, we review some physiological results obtained in our laboratory regarding two of these recurrent circuits in the rat somatosensory system: the thalamocortical loop between the primary somatosensory cortex (SI) and the ventral posterior medial (VPM) nucleus of the thalamus; and the loop formed by the reciprocal callosal connections between the SI cortices.
The potential role of corticothalamic feedback projections in tactile information processing
Like all other mammalian sensory systems, the somatosensory system contains massive feedback corticocortical and corticothalamic projections, which define closed loops between cortical areas and between the cortex and the thalamus. For instance, in primates, feedforward somatosensory pathways terminate in four distinct somatosensory areas located in the anterior parietal cortex (areas 3a, 3b, 1 and 2). Projections from the anterior parietal cortex also reach motor cortical areas in the frontal lobe, the secondary somatosensory cortex, and the somatosensory and multi-modal cortical areas of the posterior parietal cortex (Kaas et al., 1983). Somatosensory cortical areas in the parietal cortex are reciprocally connected through feedforward and feedback projections. In addition, cortical neurons located in some somatosensory and motor cortical areas are also connected by massive corticocortical feedback projections. Massive corticofugal projections, which originate in the infragranular layers of the primary and higher order somatosensory cortical areas, also reach all intermediary subcortical relays (i.e. spinal cord, brainstem, and thalamic nuclei) of the somatosensory system (Kaas et al., 1983). Indeed, once the full domain of these feedback projections is considered, the somatosensory system can only be defined as a highly recurrent network, in which multiple feedback projections are intertwined with several parallel feedforward pathways.
The importance of corticothalamic projections can be illustrated by a brief description of the anatomical organization and the physiological effects mediated by these pathways in the somatosensory system of rodents. As in every other mammalian species (Andersen et al., 1972; Adams et al., 1997; Deschenes et al., 1998; Sherman and Guillery, 1996), feedback projections from several somatosensory cortical areas (e.g. SI, SII, PV, etc.) (Chmielowska et al., 1989; Bourassa et al., 1995) converge on neurons located in primary and secondary thalamic nuclei (e.g. VPM, POM, and ZI) of the trigeminal system of rodents. Studies in mice (Hoogland et al., 1987, 1991) and rats (Bourassa et al., 1995; Deschenes et al., 1994) have shown that these corticothalamic projections terminate primarily in the distal dendrites of these thalamic neurons (Pinault et al., 1997). In the case of the ventral posterior medial (VPM) nucleus, the primary thalamic relay of the trigeminal system, these corticothalamic projections seem to be organized in a topographic manner (Hoogland et al., 1987; Bourassa et al., 1995; Deschenes et al., 1998; Zhang and Deschenes, 1998) (see Fig. 1). In this arrangement, corticothalamic projections originating from layer VI neurons, which are located under a particular cortical 'barrel' (Woolsey and Van der Loos, 1970) (e.g. barrel D1 in layer IV), terminate on thalamic neurons located across the thalamic barreloids (Van der Loos, 1976) that define the representation of a whisker arc or column (e.g. barreloids A1, B1, C1, D1, E1) in the VPM (see Fig. 1). Corticothalamic projections from layer VI also reach the reticular nucleus (RT) of the thalamus (Pinault et al., 1995, 1997), the main source of GABAergic inhibition in the rat VPM (Pinault and Deschenes, 1998). Anatomical evidence suggests that these cortical-RT projections are also organized in a topographic arrangement, which seems to be orthogonal to that observed in the VPM nucleus (Hoogland et al., 1987, 1988). Thus, axons from layer VI neurons located under a given cortical 'barrel' (e.g. C1) target neurons located across the representation of a whisker row (e.g. C1, C2, C3, C4) in the RT nucleus (Hoogland et al., 1987, 1988). Neurons located in secondary somatosensory thalamic nuclei, such as the posterior medial nucleus (POM), also receive corticothalamic terminals (Hoogland et al., 1987, 1991), although these are primarily derived from pyramidal neurons located in layer V of the somatosensory cortex. The morphology of corticothalamic terminals also varies according to whether they terminate in the primary (e.g. VPM) or secondary (e.g. POM) thalamic relay nuclei (Hoogland et al., 1988, 1991). Physiological studies have shown that corticothalamic projections are primarily excitatory and likely employ glutamate as their main neurotransmitter (Turner and Salt, 1998, 1999).
The glutamate released from these corticothalamic terminals acts on AMPA, NMDA, and metabotropic receptors located in the distal dendrites of thalamic neurons (McCormick and von Krosigk, 1992; Salt and Eaton, 1996; Turner and Salt, 1999). Activation of metabotropic receptors by in vitro stimulation of corticothalamic axons produces long-lasting, slow-rising EPSPs in the thalamus (Salt and Turner, 1998; Turner and Salt, 1998). Based on some of these findings, corticothalamic-mediated activation of metabotropic receptors has been suggested to produce the modulation of neuronal firing in the VPM nucleus (Salt and Turner, 1998; Turner and Salt, 1998). For instance, it is conceivable that the slowly rising depolarization produced by activation of corticothalamic projections could allow thalamic neurons to reach firing threshold in the presence of subthreshold synaptic input. In addition, corticothalamic afferents could also contribute to the slow activation of a low-threshold calcium conductance that underlies the production of bursts of action potentials by thalamic neurons (Sherman and Guillery, 1996).
Despite a wealth of anatomical, pharmacological, and in vitro physiological information, the role played by corticothalamic projections in tactile information processing has remained elusive. For instance, penicillin-induced epileptic discharge in the cat somatosensory cortex (Ogden, 1960) and cortical spreading depression in the rat cortex (Albe-Fessard et al., 1983) were found to induce a depression of sensory evoked responses in the thalamus. In another series of experiments, carried out in both anesthetized and awake preparations, Yuan et al. (1985, 1986) reported that lidocaine-induced inactivation of SI cortex resulted in reduced thalamic responses to electrocutaneous stimulation without any effect on the spontaneous activity, stimulus threshold, response latency, or receptive fields of the same thalamic neurons. Other studies, however, have reported a facilitatory influence of SI cortex on evoked thalamic discharges (Andersen et al., 1967, 1972) using cortical spreading depression (Waller and Feldman, 1967) or electrical stimulation (Anderson et al., 1964). It is likely that part of the confusion in the literature arises because corticothalamic pathways can mediate both a monosynaptic excitatory and a disynaptic inhibitory (via the RT nucleus) postsynaptic potential in the thalamus.
Thus, depending on how the cortex is stimulated or blocked, a variety of facilitatory and inhibitory effects could be induced in the thalamus. This hypothesis is supported by the observation that microstimulation of small territories of the SI cortex can lead to a range of thalamic effects, in addition to an overall suppressive influence on thalamic sensory responses, depending upon the relative topographic location of the stimulus and neurons in the ventral posterior nucleus of the thalamus (Shin and Chapin, 1990c).
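The orthogonal topography described earlier in this section (layer VI output fanning out along a whisker arc in the VPM, but along a whisker row in the RT) can be made concrete with a small lookup sketch. This is only an illustration of the stated anatomical scheme, not a model from the reviewed studies: the function names are hypothetical, and the regular grid of rows A to E and columns 1 to 5 is an idealization taken from the labeling convention of Fig. 1.

```python
ROWS = "ABCDE"        # whisker rows, dorsal to ventral (Fig. 1 convention)
COLUMNS = (1, 2, 3, 4, 5)  # whisker columns (arcs), caudal to rostral


def parse_barrel(barrel):
    """Split a label such as 'D1' into (row, column); idealized grid only."""
    row, column = barrel[0].upper(), int(barrel[1:])
    if row not in ROWS or column not in COLUMNS:
        raise ValueError(f"unknown barrel label: {barrel!r}")
    return row, column


def vpm_targets(barrel):
    """VPM barreloids along the same arc: same column, every row."""
    _, column = parse_barrel(barrel)
    return [f"{row}{column}" for row in ROWS]


def rt_targets(barrel):
    """RT representations along the same row: same row, every column."""
    row, _ = parse_barrel(barrel)
    return [f"{row}{column}" for column in COLUMNS]


print(vpm_targets("D1"))  # arc-oriented excitatory feedback
print(rt_targets("C1"))   # row-oriented drive to the inhibitory RT
```

In this idealization, a barreloid that shares only the arc with the active barrel receives direct excitation, one that shares only the row is reached through the RT and would be inhibited, and the barrel's own barreloid receives both, which is one way to picture why cortical stimulation can produce either facilitation or suppression depending on relative location.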
In our hands, pharmacological block of SI cortical activity by focal infusion of the GABAA agonist muscimol, and the consequent silencing of pools of cortical neurons that give rise to corticofugal projections to the thalamus and brainstem, produced a series of physiological effects in the rat VPM (Krupa et al., 1999). First, we observed that blocking cortical activity altered both the short- and long-latency components of the tactile responses of VPM neurons. These modifications demonstrate that corticofugal projections contribute to the definition of the complex spatiotemporal structure (Krupa et al., 1999) of the RFs of VPM neurons. These results were obtained by using traditional single-whisker stimuli. When more complex tactile stimuli were employed in our experiments, we observed that the ability of VPM neurons to integrate complex tactile stimuli (e.g. multi-whisker deflections) in a non-linear way was also significantly reduced by a pharmacological block of cortical activity. Both supra- and sublinear summation of multi-whisker stimuli (Ghazanfar and Nicolelis, 1997) were reduced in these experiments (Ghazanfar et al., 1997). Overall, these findings not only support the hypothesis that corticothalamic projections may mediate both facilitatory and suppressive effects on thalamic neurons, but they also suggest that the action of these corticofugal projections may depend on the type of tactile stimulus provided to the somatosensory system. As described below, there is direct evidence that the physiological contribution of these descending pathways to tactile information processing may also depend on the behavioral state of the animal.
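One simple way to express supra- and sublinear summation numerically is to compare the response to a multi-whisker stimulus with the sum of the responses to each whisker stimulated alone. The sketch below is an illustrative convention, not the analysis used in the studies cited above; the function names, the ratio-based index and the 10% tolerance band are assumptions.

```python
def summation_index(multi_response, single_responses):
    """Ratio of the multi-whisker response to the linear prediction.

    The linear prediction is the sum of the single-whisker responses
    (for example, evoked spikes per stimulus above baseline).
    """
    linear_prediction = sum(single_responses)
    if linear_prediction <= 0:
        raise ValueError("single-whisker responses must sum to a positive value")
    return multi_response / linear_prediction


def classify_summation(multi_response, single_responses, tolerance=0.1):
    """Label a response as 'supralinear', 'sublinear' or 'linear'.

    `tolerance` is a hypothetical band around 1.0 within which the
    response is treated as matching the linear prediction.
    """
    index = summation_index(multi_response, single_responses)
    if index > 1.0 + tolerance:
        return "supralinear"
    if index < 1.0 - tolerance:
        return "sublinear"
    return "linear"


print(classify_summation(12.0, [3.0, 4.0]))
print(classify_summation(5.0, [3.0, 4.0]))
```

Under this convention, a reduction of non-linear integration after cortical block would appear as index values moving toward 1 for both the supralinear and the sublinear cases.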
The functional relevance of corticofugal projections in the rat somatosensory system was investigated in studies carried out in our laboratory to evaluate their contribution to the ability of subcortical neurons to express unmasking of novel tactile responses following a peripheral deafferentation (Krupa et al., 1999). This reorganization process, which we dubbed 'immediate or acute plasticity', is known to trigger a system-wide reorganization of the somatotopic maps located at cortical, thalamic, and brainstem levels (Faggin et al., 1997). The most conspicuous effect of this immediate reorganization is the shifting of receptive fields of individual neurons away from the deafferented region due to the unmasking of neuronal tactile responses that were not present before the peripheral block. Interestingly, such unmasking tends to occur almost simultaneously in the brainstem, thalamus, and cortex (Faggin et al., 1997). In a recent series of experiments, we observed that blocking neuronal activity in the infragranular layers of the SI cortex, a procedure that silences the projecting neurons that give rise to corticobulbar and corticothalamic feedback projections, reduces by almost 50% the number of VPM thalamic neurons that exhibit unmasking of tactile responses following a partial and reversible peripheral deafferentation (Krupa et al., 1999). Although plastic reorganization in the VPM nucleus is still observed after cortical inactivation, its spatial extent is reduced significantly. These findings have been confirmed and extended further by the recent demonstration that the immediate, but not the late, phase of plastic reorganization in the ventral posterior lateral nucleus (the thalamic relay for somatosensory fibers from the rest of the body) is reduced or eliminated by removal of corticofugal projections (Parker and Dostrovsky, 1999).
Further support for the functional relevance of descending corticofugal projections comes from the observation that these projections affect the physiological properties of several other subcortical relays of the somatosensory system. For instance, block of neuronal activity in the SI cortex has been reported to eliminate most of the tactile responses of neurons located in the POM nucleus of the thalamus (Diamond et al., 1992). In addition, corticobulbar projections have also been shown to influence the physiological properties of neurons located in the brainstem nuclei that relay ascending somatosensory information to the thalamus (Jacquin et al., 1990b).
For instance, removal of corticofugal projections in rats increases the responsiveness of neurons in the spinal trigeminal brainstem complex to whisker stimuli (Jacquin et al., 1990b). Overall, the results reviewed above make a compelling case for the need to incorporate recurrent corticofugal projections as an integral part of a comprehensive and realistic model of touch. Indeed, the recurrent nature of the somatosensory system further strengthens our hypothesis that the mammalian somatosensory system relies on highly distributed neuronal interactions, which emerge from the dynamic interplay of multiple ascending and descending pathways, to represent tactile information (Nicolelis et al., 1993b, 1995, 1996, 1997, 1998a,b; Nicolelis and Chapin, 1994; Nicolelis, 1996). Although the concept of distributed processing is not new, and many investigators have proposed schemes based on population coding (Hebb, 1949; Erickson, 1968, 1986; Georgopoulos et al., 1986; Sejnowski et al., 1988; Mumford, 1992; Deadwyler and Hampson, 1997), this encoding scheme has recently attracted the attention of neuroscientists because of the successful application of artificial neural networks in pattern recognition problems (Grossberg, 1976, 1988; Bishop, 1995). In a distributed coding scheme, divergent neural connections ensure that specific units of information are not held in single or small groups of neurons, but instead are widely distributed, or 'encoded', by large neural ensembles located at multiple cortical and subcortical levels of the system (Hebb, 1949). Consequently, each neuron contributes in some way to processing of most of the information handled by the network. In line with this hypothesis, a series of studies in our and other laboratories (Nicolelis et al., 1993a, 1995, 1998b; Kleinfeld and Delaney, 1996; Masino and Frostig, 1996; Moore and Nelson, 1998; Ghazanfar and Nicolelis, 1999; Polley et al., 1999) have begun to re-examine traditional views of information encoding by the somatosensory system.
Anatomical evidence in favor of a distributed model includes the fact that ascending feedforward (FF) somatosensory pathways that carry information from the periphery to the SI cortex exhibit different degrees of divergence (Lu and Lin, 1986; Rhoades et al., 1987; Chmielowska et al., 1989; Jacquin et al., 1990a; Lin et al., 1990; Chiaia et al., 1991; Lu and Lin, 1993; Pinault and Deschenes, 1998; Veinante and Deschenes, 1999), which contribute to the large multi-whisker RFs observed in the VPM and SI (Nicolelis and Chapin, 1994; Ghazanfar and Nicolelis, 1999). Thus, the effects of even small but incremental changes at each processing level of the pathway (e.g. from brainstem to thalamus) would tend to multiply through successive relays and could be markedly amplified by the time they reached the cortex. In addition, wide-field sensory inputs, such as high-threshold mechanical and noxious stimuli,
which are transmitted through paralemniscal pathways, could also converge on cortical neurons. These effects could be further amplified by corticocortical connections within the SI and between the SI and other cortical areas (Chapin et al., 1987; Fabri and Burton, 1991; Nicolelis et al., 1991). In this context, the existence of massive divergent corticofugal feedback projections to all subcortical somatosensory relay nuclei provides almost unlimited opportunity for increasing the ultimate radius of influence from a single sensory event (Mumford, 1991). In this model, single neurons would not serve as the functional unit of the system. Instead, neurons would work as part of ensembles that are capable of representing and processing multiple tactile attributes of a given complex stimulus simultaneously. Massive corticofugal projections that reach somatosensory relay structures located in the thalamus, brainstem, and the spinal cord could offer the anatomical substrate for the definition of such multitasking networks. Such distributed and recurrent networks could be formed by somatosensory, motor, limbic, and association cortical areas, and influence the activity of neurons located in subcortical centers, even before mechanoreceptors in the skin were activated by a tactile stimulus. According to this view, corticofugal feedback projections could incorporate subcortical nuclei into the computational processes required for the emergence of tactile percepts. Although rarely discussed in the literature, reciprocal loops between cortical and thalamic nuclei could also mediate a different type of corticocortical communication, in which thalamic networks combine convergent signals from one or more cortical areas and then disseminate the resulting signals to vast cortical territories.
Such an interactive view of the somatosensory system would predict that top-down influences would be capable of modulating the activity of subcortical neurons during different behavioral states. But is there evidence for the existence of such top-down influences in the somatosensory system? In the next section we describe a well-known phenomenon that may provide the key for unraveling the physiological role played by corticofugal feedback on tactile perception and, hence, serve as the basis for mounting a formidable challenge to the FF model of touch.
A potential physiological role for corticothalamic pathways in tactile processing: sensory gating of neural responses during active tactile exploration
A number of studies carried out in many species indicate that during different exploratory behaviors the magnitude and latencies of tactile responses, as well as the manner in which the brain responds to complex tactile stimuli, can change considerably. Thus, in rats, reductions in responses to tactile stimuli during motor activity have been observed in SI (Chapin and Woodward, 1981, 1982a,b; Shin and Chapin, 1990b), the ventral posterior lateral thalamus (VPL) (Shin and Chapin, 1990a,b), and the dorsal column nuclei (DCN) (Shin and Chapin, 1989). Similarly, in cats, medial lemniscus sensory responses elicited by stimulation of the radial nerve are reduced in magnitude during limb movement (Ghez and Lenzi, 1971; Coulter, 1974). Primates also show modulations in SI (Nelson, 1984, 1987; Chapman et al., 1988) prior to and during movement. Alterations in sensory responses during movement have also been observed in human evoked-potential studies (Coquery, 1971; Lee and White, 1974; Cohen and Starr, 1987). These observations imply that the nervous system is capable of dynamically altering how cortical and subcortical neurons respond to a tactile stimulus, depending on the behavioral context in which such a stimulus is presented to the animal. The crux of this argument, therefore, lies in the hypothesis that the emergence of the broad spectrum of natural tactile sensations experienced by mammals results from a much more intimate association between the somatosensory and motor systems than postulated by previous neurophysiological theories of touch. But what is the significance of endowing the somatosensory system with the capability of altering the type of tactile information that can reach the cortex during motor activity?
First, the alterations in response magnitude may allow tactile and proprioceptive information pertinent to execution or completion of the movement to be selectively enhanced. Thus, during the execution of a planned motor act, reciprocal interactions between the motor and the somatosensory cortices would ensure that certain types of input are gated 'in' while others are gated 'out'. This may be necessary in order to allow
the movement to occur as planned without interference from extraneous sensory feedback. In support of this idea, Chapin and Woodward (1982a) showed that tactile responses, across the cortical and subcortical relays of the somatosensory system, can be inhibited or enhanced at different epochs of the step cycle in locomoting rats. According to these authors, irrelevant tactile responses, such as those caused by the movement itself, would be selectively gated out at certain times during the movement, while sensory information that would describe certain movement epochs (e.g. foot fall) would be enhanced. There is strong evidence in the literature supporting the hypothesis that this selective modulation of tactile responses is mediated by descending efferent activity from the motor cortex or other central motor nuclei. The most relevant experimental finding supporting the occurrence of centrally mediated gating of tactile information is the observation that sensory responses across the somatosensory pathway can be reduced as much as 100 ms prior to the initiation of a given movement. These findings, which have been obtained in both monkeys (Nelson, 1987; Chapman et al., 1988) and cats (Coulter, 1974; Ghez and Lenzi, 1971), strongly suggest that reductions in tactile responses in the cortex and subcortical relays of the somatosensory system do not result from alterations in tactile or proprioceptive feedback generated by the movement itself. Instead, they may be related to the central motor command that is generated hundreds of milliseconds prior to the movement onset. Further support for this view comes from the observation that microstimulation of the motor cortex can reduce tactile neuronal responses throughout the somatosensory system.
For example, Shin and Chapin showed that stimulation of the forepaw region in the MI cortex, prior to electrical stimulation of the forepaw, led to a 43% suppression of tactile responses in the thalamus (Shin and Chapin, 1990b) and an 8% reduction in the dorsal column nuclei (Shin and Chapin, 1989). Another set of experiments has demonstrated that the amount of tactile response modulation varies across the different intermediary relays of the ascending somatosensory pathways. For instance, in rats, SI and VPL tactile responses to forepaw stimulation have been shown to decrease by 71 and 31%, respectively, when the stimuli are delivered during
the animal's locomotion (Chapin and Woodward, 1981; Shin and Chapin, 1990b). These findings illustrate the general observation that the magnitude and frequency of the somatosensory gating effect tends to increase as one ascends through the somatosensory system. As such, this observation further supports the hypothesis that modulatory signals derive from central neural networks responsible for generating the motor command. Despite robust evidence favoring the existence of a central mechanism for modulating tactile neuronal responses, in some cases one can also demonstrate that feedforward mechanisms contribute to the alteration of cortical and subcortical tactile responses during the execution of exploratory behavior. For example, Schmidt et al. (1990b) have shown that when anesthesia was applied to one or more sensory nerves of the hand, inhibition of tactile stimulation that normally occurred when subjects moved the finger being stimulated, was reduced (i.e. there was less gating of the response) by up to 70%. This led to the conclusion that a significant portion of the gating effect, but not all, was caused by afferent sensory stimulation generated in the periphery by the movement itself. Support for the existence of peripheral mechanisms of gating has also been provided by Chapman et al. (1988), who reported that tactile responses in SI to stimulation of the medial lemniscus or the thalamus were not reduced prior to movement, but only during the execution of the movement. These authors reported that somatosensory gating occurring at the level of the dorsal column nuclei (DCN) was caused by central modulation, since it occurred prior to the movement. However, in their hands any additional gating at higher levels of the somatosensory system was caused by motor-induced peripheral afferent activity. 
Another series of experiments partially supported this view by showing that while passive movements do not cause tactile gating in the DCN, they can induce a certain degree of gating in the thalamus (VPLc) and SI (Chapman et al., 1988). Thus, even though there is substantial evidence for centrally mediated modulation of tactile responses, one cannot discard the possibility that proprioceptive and tactile afferent signals, generated during the execution of movements, contribute to the gating of tactile responses observed at higher levels of the somatosensory system.
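The percentage reductions quoted in this section follow from a simple comparison of response magnitudes between two conditions. A minimal sketch of that arithmetic is given below, assuming that each response is summarized as a mean evoked spike count per stimulus. The function name and the example numbers are hypothetical and are not data from the cited studies.

```python
def percent_suppression(control_response: float, test_response: float) -> float:
    """Return the reduction of test_response relative to control_response, in percent.

    Positive values mean the response was suppressed (gated 'out');
    negative values mean it was enhanced (gated 'in').
    """
    if control_response <= 0:
        raise ValueError("control_response must be positive")
    return 100.0 * (control_response - test_response) / control_response


# Hypothetical mean evoked spike counts per stimulus (illustrative only).
at_rest = 10.0          # stimulus delivered while the animal is at rest
during_movement = 2.9   # same stimulus delivered during movement

print(round(percent_suppression(at_rest, during_movement), 1))
```

Expressing gating as a signed percentage allows the same measure to describe both suppression and enhancement at different relays or movement epochs.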
It is important to emphasize that noradrenergic, serotoninergic, and cholinergic projections, which originate in different locations of the brainstem and diencephalon and target all intermediary relays of the somatosensory system, could also contribute to the central modulation of tactile neuronal responses during different behavioral states (McCormick and Pape, 1990; Waterhouse et al., 1994) as they do in other sensory systems (McLean and Waterhouse, 1994). Over the last three decades, one of the most elegant examples of multi-disciplinary research in neuroscience has indicated that some of these modulatory systems play a fundamental role in the control of the ascending flow of nociceptive information from the periphery that is used for the perception of pain (Fields and Heinricher, 1985). The demonstration that physiological or pharmacological activation of these descending modulatory projections can block the ascending flow of nociceptive information through the spinothalamic system and produce maintained analgesia has revolutionized our understanding of pain perception. Since pain belongs to the spectrum of tactile sensation that all mammals experience, these observations offer more experimental support for our contention that top-down influences cannot be ignored by any theory aimed at describing the neurophysiological basis of tactile perception. In line with this hypothesis, recent studies in the trigeminal system of awake, freely moving rats have corroborated and extended our conviction that top-down influences, such as those mediating the phenomenon of 'somatosensory gating', play a crucial role in the emergence of tactile perception. In these experiments, simultaneous, multi-site chronic recordings were employed to monitor the activity of large populations of single cortical, thalamic, and brainstem somatosensory neurons, while rats moved freely in a behavioral box.
Initially, these experiments allowed us to investigate how the expression of different behaviors (e.g. awake immobility, active whisking, moving without whisker movements) could influence the physiological properties of populations of cortical and subcortical neurons in freely behaving animals (Fanselow and Nicolelis, 1999). Subsequently, the same experimental paradigm was used to measure how similar tactile stimuli are processed under different behavioral conditions across the rat somatosensory system.
We observed that complex and dynamic corticothalamic interactions tend to precede any active tactile discrimination in freely behaving rats. For instance, as awake rats assume an immobile posture (i.e. standing on the four paws without producing any whisker or other major body movements), most of the neurons in the SI cortex and VPM thalamus start producing rhythmic bursts of action potentials, which are translated into 7-12 Hz rhythmic oscillations (Nicolelis et al., 1995; Fanselow and Nicolelis, 1999). In the vast majority of the analyzed events, these 7-12 Hz rhythmic oscillations initiate in the whisker area of the rat SI cortex (the barrel fields) (Fig. 2). After a few tens of milliseconds, these oscillations appear in the VPM nucleus and later on they can be observed in the spinal nucleus (but not in the principal) of the trigeminal brainstem complex (Figs. 2 and 3). Importantly, these oscillations were never detected in the rat trigeminal ganglion, suggesting that they are generated centrally. Further analysis revealed that these 7-12 Hz thalamocortical oscillations usually precede, by hundreds of milliseconds, the initiation of small amplitude rhythmic facial whisker twitching (WT) movements in the same frequency range (Nicolelis et al., 1995). We also observed that the initiation of WT movements modulated these oscillations (Nicolelis et al., 1995). Thus, soon after the onset of WT movements, rats invariably started to produce slower (4-6 Hz) rhythmic whisker protractions, which had much larger amplitudes than the WT movements. Previous behavioral studies have indicated that rats use these large rhythmic whisker movements to discriminate the tactile attributes of objects (Carvell and Simons, 1990). In fact, 'whisking' is present in most rodents and is considered an important exploratory behavior of rats. In our experiments, we also documented that as soon as the animal started to produce these slower and larger whisker movements, the 7-12 Hz thalamocortical oscillations disappeared (Nicolelis et al., 1995). Altogether, these observations are reminiscent of a phenomenon originally described by Gastaut in human scalp EEG recordings carried out in the 1950s (Gastaut, 1952). Since its original discovery, both EEG and magnetoencephalographic recordings have been used to demonstrate the occurrence of widespread 10 Hz oscillations, originating in the hand representation of the primary somatosensory cortex of the vast majority of healthy human subjects and non-human primates (Niedermeyer, 1993). In the EEG literature, these oscillations were named motor (μ) rhythm, since their main characteristic is to appear during awake immobility and disappear as soon as the subject starts any hand movement, a key tactile exploratory behavior of primates. The existence of these similarities led us to postulate that the 7-12 Hz oscillations that precede and are modulated by whisker movements in rats are equivalent to the μ rhythm of primates. The functional role of μ oscillations in primates and rodents is still unclear. Although MEG recordings have clearly indicated that this preparatory rhythm is present in most normal human subjects, no clear consensus has been reached regarding the potential physiological role played by these oscillations. Because rats use rhythmic whisker movements as their main tactile exploratory behavior, the presence of the μ rhythm in this species led us to propose that these thalamocortical oscillations could prepare the somatosensory system for the imminent onset of a cycle of tactile exploration.
[Figure 2: field-potential traces labelled SI cortex and VPM thalamus; scale bar 1 s.]
Fig. 2. Simultaneous field-potential recordings in VPM and SI during μ rhythm reveal that 7-12 Hz oscillations tend to appear first in the SI cortex and only later are evident in the somatosensory thalamus.
[Figure 3: cross-correlograms for neurons in SpV, VPM and SI; horizontal axis: time lag (s).]
Fig. 3. During whisker twitching movements, activity in SI and SpV phase leads activity in VPM. Recordings were made from microelectrodes chronically implanted in SpV, VPM and SI in awake rats. This figure depicts cross-correlograms (CCs) for neurons in these three areas during 7-12 Hz whisker twitching movements, which were accompanied by μ-oscillations. The CCs are centered on the VPM neuron depicted by the arrow, and the numbers above the CCs show the number of milliseconds by which a given SpV or SI neuron phase-led the VPM reference neuron. It can be seen that activity in SI and SpV phase led activity in VPM.
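The phase-lead measurements summarized in Fig. 3 rest on cross-correlograms between pairs of spike trains. The following is a minimal sketch of how such a correlogram and its peak lag could be computed, not the analysis pipeline used in the experiments. The function names, the bin width, the window, and the synthetic spike times are all assumptions chosen for illustration. The sign convention assumed here is that a negative lag means the target neuron fired before the reference neuron.

```python
from typing import List, Tuple


def cross_correlogram(reference_ms: List[float], target_ms: List[float],
                      window_ms: float = 50.0,
                      bin_ms: float = 4.0) -> Tuple[List[float], List[int]]:
    """Histogram of (target spike time - reference spike time) within +/- window_ms.

    Returns the bin centres (ms) and the spike-pair counts per bin.
    """
    n_bins = int(round(2 * window_ms / bin_ms))
    counts = [0] * n_bins
    for t_ref in reference_ms:
        for t_tgt in target_ms:
            lag = t_tgt - t_ref
            if -window_ms <= lag < window_ms:
                # Guard against floating-point rounding at the upper edge.
                index = min(int((lag + window_ms) / bin_ms), n_bins - 1)
                counts[index] += 1
    centres = [-window_ms + (i + 0.5) * bin_ms for i in range(n_bins)]
    return centres, counts


def peak_lag(centres: List[float], counts: List[int]) -> float:
    """Lag (ms) of the tallest correlogram bin; negative means the target leads."""
    return centres[counts.index(max(counts))]


# Synthetic example: a reference (VPM-like) unit firing at 10 Hz and a
# target (SI-like) unit firing 12 ms earlier on every cycle.
vpm_spikes = [100.0 * k for k in range(1, 11)]
si_spikes = [t - 12.0 for t in vpm_spikes]

centres, counts = cross_correlogram(vpm_spikes, si_spikes)
print(peak_lag(centres, counts))
```

In real recordings the peak would be broader and would need to be assessed against a shuffled or shift-predictor baseline, which this sketch omits.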
According to this hypothesis, during the occurrence of 7-12 Hz oscillations, tactile information would continue to flow from the VPM to the somatosensory cortex. However, since during these oscillations a significant percentage of VPM neurons are producing bursts, it is conceivable that these neurons would have difficulty in faithfully transmitting complex spatiotemporal patterns of tactile information, which are likely to be generated when rats use their whiskers to actively explore an object. Instead, we proposed (Nicolelis et al., 1995) that these 7-12 Hz oscillations could be used to enhance or even maximize the ability of the somatosensory system to detect the presence of tactile stimuli, either during awake immobility or during the production of whisker twitching movements. In other words, these 7-12 Hz oscillations could represent an 'expectation' signal, a template that could be produced by the rat somatosensory system in anticipation of whisking. In our view, this 'expectation' signal could be generated as part of a central motor program and be disseminated to most of the somatosensory system through corticofugal projections. We speculate that once the presence of a tactile stimulus is detected by an immobile rat (or during WT movements), larger amplitude and slower (4-6 Hz) rhythmic whisker protractions are initiated so that a more detailed tactile exploration of objects by the arrays of vibrissae can be accomplished. As the animal's behavior changes, so does the physiological setting of the thalamocortical loop. During the execution of these rhythmic large amplitude whisker movements, thalamocortical oscillations vanish and VPM neurons switch to a tonic firing mode. In this physiological state, VPM neurons have higher spontaneous firing rates and can faithfully represent and transmit complex spatiotemporal patterns of tactile inputs to the SI cortex. It is important to emphasize, however, that during this active tactile exploration, corticothalamic projections can still mediate important interactive computations at the level of the thalamus. In this interactive view of the somatosensory system, the thalamus is no longer considered as a simple, passive relay of tactile information from the periphery to the cortex.
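The distinction between burst and tonic firing drawn above can be made operational from spike times alone. The sketch below labels a spike as part of a burst when it belongs to a run of short inter-spike intervals that follows a period of silence. The thresholds (at least 100 ms of silence, intervals of 4 ms or less) are assumptions chosen for illustration and are not taken from the experiments described here, and the function name is hypothetical.

```python
from typing import List


def burst_fraction(spike_times_ms: List[float],
                   min_silence_ms: float = 100.0,
                   max_isi_ms: float = 4.0) -> float:
    """Fraction of spikes that belong to bursts.

    A burst is assumed to be a run of two or more spikes separated by
    intervals <= max_isi_ms and preceded by >= min_silence_ms without spikes.
    The first spike of the train is treated as preceded by silence.
    """
    n = len(spike_times_ms)
    if n == 0:
        return 0.0
    in_burst = [False] * n
    i = 0
    while i < n - 1:
        silent_before = (i == 0 or
                         spike_times_ms[i] - spike_times_ms[i - 1] >= min_silence_ms)
        if silent_before and spike_times_ms[i + 1] - spike_times_ms[i] <= max_isi_ms:
            # Extend the burst while successive intervals stay short.
            j = i + 1
            while j + 1 < n and spike_times_ms[j + 1] - spike_times_ms[j] <= max_isi_ms:
                j += 1
            for k in range(i, j + 1):
                in_burst[k] = True
            i = j + 1
        else:
            i += 1
    return sum(in_burst) / n


# Synthetic trains (illustrative only): rhythmic triplets versus regular firing.
bursting = [0.0, 3.0, 6.0, 110.0, 113.0, 116.0, 220.0, 223.0, 226.0]
tonic = [25.0 * k for k in range(20)]

print(burst_fraction(bursting), burst_fraction(tonic))
```

A summary of this kind would let the proportion of burst spikes be compared between awake immobility and active whisking, under whatever burst criterion a given study adopts.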
Instead, through its reciprocal interactions with different cortical areas, the somatosensory thalamus (which includes the tactile portion of the reticular nucleus) could participate in a variety of computations, such as non-linear summation of tactile stimuli (Ghazanfar and Nicolelis, 1997; Shigemi et al., 1999), signal segmentation through resonant interactions, template matching, and error generation. Similar to the recurrent models of Grossberg (1999) and Mumford (1994), the somatosensory thalamus could function as the site where incoming afferent tactile information is compared with cortically stored templates that summarize the previous tactile experience of the animal. As feedforward and feedback projections may be required for the definition of the complex spatiotemporal structure of receptive fields in the rat VPM (Krupa et al., 1999), ensembles of these neurons could participate in template-matching operations and other computations on afferent tactile signals.
In order to test some of these assumptions, another series of experiments was carried out in our laboratory. In these experiments, multi-site chronic recordings were performed while a nerve cuff electrode was used to provide consistent stimulation to the infraorbital (IO) nerve, the nerve that carries tactile information from the vibrissae to the central nervous system, as rats switched between a series of behavioral states (Fanselow and Nicolelis, 1999). In the first series of experiments, individual electrical stimuli that produced neuronal responses that mimic those obtained by mechanical stimulation of multiple facial whiskers were delivered to the IO nerve while rats were immobile, or when they produced the two different types of whisker movements described above. The cortical (SI) and thalamic (VPM) sensory responses elicited by the electrical stimuli were then compared. As predicted by previous studies, the magnitudes of the neural responses in SI and VPM neurons were substantially reduced during the production of rhythmic whisker movements (Fig. 4). Interestingly, this reduction was not observed when the same animals engaged in behaviors that did not include the whisker movements (e.g. movement of the head or body). These findings corroborate work in cats (Coulter, 1974) and in humans (Schmidt et al., 1990a,b), in which decreases in sensory responsiveness were most robust when the sensory stimulus was applied to the part of the body engaged in a tactile exploration, as compared to adjacent digits or contralateral limbs. Thus, as previously suggested by many authors, the phenomenon of somatosensory gating appears to be fairly topographically specific, since it occurs only during motor activity used for active tactile exploration, rather than following any action involved in increasing the general arousal level of the animal.
Previous studies have shown that the presence of one stimulus can alter the ability of cortical and subcortical neurons to respond to a subsequent stimulus for a period of time (Simons, 1985; Simons and Carvell, 1989). Evidence from our experiments and other studies (Castro-Alamancos and Connors, 1996a,b) suggests that the ability of one tactile response to modulate the magnitude of a subsequent one is substantially decreased during motor activity (Fig. 5). For example, when two tactile stimuli were presented with an inter-stimulus interval of 25-75 ms in the absence of any whisker movements (i.e. awake immobility), the response to the second
[Figure 4: post-stimulus firing traces for VPM and SI; horizontal axis: ms post stimulus, from -50 to 200.]
Fig. 4. Activity levels in VPM and SI following peripheral stimulation differ depending on the behavioral state of an animal. Individual electrical pulses were presented to the infraorbital nerve in awake, freely moving rats and responses to this stimulation were recorded from chronically implanted microwires in VPM and SI. When the animal was in a state of quiet immobility, the initial excitatory response was followed by a period of suppressed firing, during which activity went below pre-stimulus baseline levels (dotted line). This period of suppressed firing was followed by a late excitatory component at approximately 125 ms post-stimulus. In contrast, during exploratory whisking behavior, the period of suppressed firing was substantially shorter in VPM and non-existent in SI, and there was no late excitatory component in either area. Error bars represent ±SEM. The initial excitatory peaks have been clipped in order to show the other components of the traces more clearly.
[Figure 5: four panels (VPM Quiet, VPM Whisking, SI Quiet, SI Whisking); horizontal axis: first stimulus, then time to second stimulus in ms, from 25 to 200.]
Fig. 5. Responses to peripheral stimulation differ depending on the behavioral state of an animal. Recordings were made from multiple chronically implanted microwires in awake, freely moving rats. Stimulation was provided to a nerve cuff electrode implanted around the infraorbital nerve. Stimuli were presented in pairs with interstimulus intervals (ISI) ranging from 25 to 200 ms. This figure demonstrates two effects we observed by looking at the responses to these stimuli during two different behaviors, quiet immobility and exploratory whisking. First, during whisking, responses to the first stimulus in the pairs were smaller than those during the quiet state. This indicates that during the whisking state there is an overall gating of responses to ascending stimuli. The second effect was that when animals were in a state of quiet immobility, responses to the second stimulus in a pair were suppressed if the ISI was 25-75 ms. In contrast, during active, exploratory whisking behavior, the responses were not significantly suppressed for any ISI, compared to the response to the first stimulus in the pair. Error bars represent ±SEM.
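The paired-stimulus comparison described for Fig. 5 can be expressed as a ratio of the second response to the first at each interstimulus interval. The sketch below is illustrative only. The function names and all response values are hypothetical, and the fixed ratio threshold is a crude stand-in for the statistical comparison used in the study.

```python
from typing import Dict, List


def paired_pulse_ratios(first: float,
                        second_by_isi: Dict[int, float]) -> Dict[int, float]:
    """Ratio of second-stimulus response to first-stimulus response, keyed by ISI (ms)."""
    if first <= 0:
        raise ValueError("first response must be positive")
    return {isi: second_by_isi[isi] / first for isi in sorted(second_by_isi)}


def suppressed_isis(ratios: Dict[int, float], threshold: float = 0.8) -> List[int]:
    """ISIs at which the second response falls below threshold x the first (assumed cut-off)."""
    return [isi for isi, ratio in ratios.items() if ratio < threshold]


# Hypothetical response magnitudes (arbitrary units), not data from the study.
quiet = paired_pulse_ratios(10.0, {25: 4.0, 50: 5.0, 75: 7.0, 100: 9.5, 200: 10.0})
whisking = paired_pulse_ratios(6.0, {25: 5.7, 50: 6.0, 75: 6.3, 100: 6.0, 200: 5.9})

print(suppressed_isis(quiet))
print(suppressed_isis(whisking))
```

Normalizing by the first response within each behavioral state separates the two effects described in the caption: the overall reduction of the first response during whisking, and the state-dependent suppression of the second response.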
stimulus was significantly reduced (Fig. 5). However, during periods in which the same rats produced whisker movements, the response of VPM and SI neurons to the second stimulus was not statistically
different in magnitude from the first at any interstimulus interval tested (Fig. 5). Further examination of these results indicated that these effects paralleled a change in the amount of post-excitatory inhibition
that follows the first IO stimulus in different behavior states (immobility vs. whisker movements). Thus, in the absence of any movement of the whiskers, we observed the occurrence of a long period of reduced firing following the presentation of the first tactile stimulus (Hellweg et al., 1977; Simons, 1985; Simons and Carvell, 1989). However, this period is substantially shorter (in VPM) or non-existent (in SI) during the presence of exploratory whisker movements (Fig. 5), suggesting that motor-activity related changes in post-stimulus inhibition could account for the differential responses to paired stimuli we observed in different behavioral conditions. Overall, the results of these experiments suggest that during different behavioral states, different types of thalamocortical transmission may occur (in this case, awake immobility versus 'whisking') and that these different modes of transmission may serve different perceptual purposes. Thus, differences in cortical and subcortical tactile response characteristics, from periods of whisker immobility to periods of whisker movements, suggest that the somatosensory system can shift from a state of high sensitivity for detecting individual punctate stimuli (i.e. during immobility and thalamic bursting), to a state in which the system can process with high fidelity the complex incoming tactile afferent information that is generated by the active tactile exploratory behavior employed by the animal to probe its surrounding environment. Although there is no definitive proof that corticofugal projections are responsible for either the recruitment of VPM neurons into 7-12 Hz oscillations during awake immobility, or the switch of VPM neurons from bursting to tonic firing mode, several indirect observations can be used to build a strong case in favor of this hypothesis. First, in the vast majority of our recordings, μ oscillations clearly initiate in the SI cortex and only later appear in the VPM thalamus.
Likewise, in all other species in which the μ rhythm has been reported it was found to originate at the cortical level. The hypothesis that corticofugal projections provide the anatomical substrate for recruiting the thalamus into a massive wave of synchronous activity is also supported by the observation that removal of corticothalamic projections significantly reduces or completely abolishes the synchronization of neuronal firing across the thalamus (Contreras et al., 1996).
The potential contribution of corticothalamic projections to the switching in firing mode of thalamic neurons is also supported by several indirect observations. Since corticothalamic axons terminate in the distal dendrites of VPM neurons, and exert their direct excitatory effects through metabotropic receptors, they could provide the type of slow depolarizing synaptic events that are required for activating the low-threshold calcium conductance that endows thalamic neurons with the ability to fire in bursts. De-inactivation of this calcium conductance, which requires hyperpolarization of VPM neurons, could also be achieved by corticothalamic projections acting through the reticular nucleus, which provides GABAergic innervation to thalamic relay neurons. Though many more experiments are required to fully demonstrate the computations carried out by the interplay of corticofugal and ascending somatosensory pathways, the central assumption of our argument remains valid. The type of dynamic thalamocortical interactions described above cannot be explained by a simple feedforward description of the somatosensory system. As seen above, changes in behavioral state significantly alter the responses to tactile stimuli across cortical and subcortical levels of the somatosensory system. These studies have demonstrated that neuronal response properties can be altered on the order of seconds, as animals switch from one behavioral state to the next. The possibility of altering the manner in which somatosensory neurons respond to the same tactile stimuli under different circumstances confers a high degree of adaptability to the animals, since it may allow them to filter information in different ways, as required by the situation in which they are involved.
This rapid, behavior-dependent adaptation may also provide the somatosensory system with more flexibility for detecting a wider range of stimuli, or allow preferential detection of certain types of stimulation under different circumstances.

Corticocortical loops as the substrate for the integration of bilateral whisker information in the rat barrel cortex

The second loop investigated in our studies of the rat somatosensory system is the one defined by reciprocal callosal connections between both rat SI cortices.
The experimental evidence reviewed in this section suggests that this loop plays a fundamental role in integrating left and right side whisker information required for the formation of bilateral tactile percepts of the environment. In these studies, the role the cerebral cortex plays in the integration of bilateral tactile information was investigated by inferring from extracellular recordings the temporal and spatial transformations performed by cortical neurons on convergent subcortical and corticocortical input. Study of cortical processes in the barrel cortex is facilitated in part by an ability to exploit an orderly topography of connections found throughout the whisker-barrel axis that reflects the arrangement of contralateral whiskers at the periphery (Woolsey and Van der Loos, 1970; Killackey, 1973). It is upon this anatomical topography that the classical hypothesis regarding barrel cortical function emerged, which postulates that activity within a given barrel cortical column directly relates to the attributes of a stimulus applied to a corresponding contralateral whisker. Incorporating more recent findings, contemporary theories of barrel cortical function have evolved to emphasize the role the barrel cortex plays in integrating information across whiskers to form behaviorally relevant percepts, rather than in extracting information from individual whiskers. Experiments using condition-test paradigms have provided a basic understanding of the temporal and spatial nature of integration between barrel regions corresponding to pairs of whiskers (Simons, 1985; Simons and Carvell, 1989; Brumberg et al., 1996). These experiments have provided evidence that excitation of a region of the barrel cortex is followed by a prolonged period of inhibition, which attenuates over time and acts to diminish the probability of a second, stimulus-evoked response.
The magnitude of inhibition appears to be maximal in the region corresponding to the whisker stimulated, with the spatial distribution of inhibition decreasing as a function of distance from this center. However, such studies are limited by the fact that observed cortical responses are not solely functions of cortical processes but may also reflect the processes of subcortical structures along the whisker-barrel axis. Whatever the nature of subcortical integration along the whisker-barrel axis, corticocortical integration may still be addressed by exploiting the role the corpus callosum plays in integrating sensory information that is lateralized subcortically. In this context, if the rat is to create a perception of the environment regarding both sides of its face, that is, if it is to generate an appropriate behavioral response to bilateral stimuli, then this information must also be integrated. That rats are capable of navigating complex terrain even in absolute darkness by virtue of their whiskers seems to dictate that they make comparisons between groups of whiskers. As rats bilaterally and synchronously whisk objects they encounter, to successfully detect the orientation of an obstacle or the width of an aperture they must gather and integrate information regarding the distance of objects from the whisker pads on either side of the face. The most logical place to identify and characterize how bilateral information is integrated, therefore, is at the level where left and right side whisker information first converge (Fig. 6). As ascending whisker-related pathways are fully crossed subcortically, this convergence is thought to occur at the level of the SI barrel cortices, as they are interconnected via the corpus callosum in a roughly homotopic manner (White and DeAmicis, 1977; Koralek et al., 1990; Olavarria et al., 1992; Cauller et al., 1998). This presumption is also supported by evidence collected by Pidoux and Verley (1979), who provided local field potential recordings indicating that ipsilaterally evoked responses exist in the barrel cortex and are mediated by the corpus callosum.

Fig. 6. Schematic of whisker-barrel anatomy illustrating convergence of ascending contralateral, whisker-related pathways with ipsilateral, whisker-related callosal pathways. Neurons innervating the whisker pad terminate in the principal trigeminal nucleus (PrV) and spinal trigeminal nuclei (SpV). Projections from these nuclei then decussate and terminate in two thalamic nuclei: the ventroposterior medial nucleus (VPM) and the posterior medial nucleus (PoM), which in turn project to the barrel cortex. The barrel cortex also receives callosal input regarding the ipsilateral whisker pad from a portion of neurons in supra- and infragranular layers of the opposite barrel cortex.

In support of this view, we have recently provided direct evidence that indicates layer V barrel cortical neurons integrate not only contralateral whisker information, but whisker information from both sides of the face, and further postulate that such interactions underlie the formation of bilateral tactile percepts. By combining methods for creating bilateral, multi-whisker stimuli with multi-electrode recordings we addressed whether single barrel cortical neurons respond to both contra- and ipsilateral whisker stimuli, characterized ipsilaterally evoked response properties, and determined the spatial and temporal aspects of interaction evoked by bilateral whisker stimulation (Shuler et al., 2001). Pharmacological inactivation of the opposite barrel cortex corroborated the proposition that the source of ipsilateral input is the opposite barrel cortex. We further demonstrated that during inactivation, not only were ipsilaterally evoked responses abolished, but the ensuing inhibitory influence such responses exerted on subsequent contralaterally evoked activity was also removed. Lastly, by designing a behavioral task that requires the cooperativity of the barrel cortices, we determined that rats are, in fact, capable of forming bilateral tactile percepts. What constitutes an effective ipsilateral whisker stimulus was addressed by varying the number and location of whiskers stimulated ipsilateral to cortically implanted electrode arrays.
The arrangement of 16 independently drivable whisker deflectors allowed single whiskers, as well as all possible combinations of two, three, and four whiskers, to be stimulated in each of the four whisker columns (or arcs) tested. The result of such a stimulus regime was the characterization of ipsilaterally evoked responses in 72% of neurons recorded, with an average probability of evoked response of 21.8 ± 13% (mean ± SD). Compared to an 11 ± 3 ms minimal latency
for responses elicited by contralateral stimuli, the average minimal latency for a response elicited by an ipsilateral stimulus was 23 ± 4 ms. Numerous instances of 'supra-linear' responses were detected, as combinations of simultaneously deflected whiskers frequently were capable of eliciting ipsilateral responses when no responses were elicited from the individual whiskers that defined the ipsilateral stimulus. Although response probabilities were also found to increase as the number of whiskers deflected increased, this increase was decidedly sublinear for neurons that responded to the constituent parts of the stimulus when given alone. Therefore, not only is the barrel cortex responsive to ipsilateral stimulation, but the proportions of neurons and their underlying firing probabilities are nonlinearly affected by multi-whisker ipsilateral stimuli. Given the presence of ipsilateral responses in the barrel cortex, we next addressed the impact such responses may have on contralaterally evoked activity, and vice versa. The nature of bilateral whisker-evoked interactions within the barrel cortex was investigated by using a condition-test paradigm that varied the spatiotemporal attributes of left and right side whisker stimuli (Fig. 7). Three parameters of bilateral whisker stimuli were varied: (1) the hemispheric sequence (ipsi- then contralateral, or contra- then ipsilateral stimulation); (2) the inter-stimulus interval (ISI); and (3) the spatial location of condition stimuli. Variation of these three factors tested the null hypotheses that the hemispheric sequence, ISI, and spatial relationship between bilateral whisker stimuli do not change the firing probabilities of barrel cortical neurons as compared to responses evoked when ipsi- or contralateral stimuli are given alone.
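The multi-whisker stimulus set and the supra-/sublinear comparison described above can be illustrated with a minimal sketch. It assumes four whiskers per arc (rows b to e, as named in Fig. 7) and labels the four arcs 1 to 4, which is a hypothetical choice since the text does not say which arcs were tested. The benchmark used here, the plain sum of the constituent single-whisker response probabilities, is likewise an assumption for illustration and not necessarily the criterion used in the original analysis; the probabilities in the example calls are invented.

```python
from itertools import combinations

ROWS = ("b", "c", "d", "e")  # whisker rows named in Fig. 7 (b3, c3, d3, e3)
ARCS = (1, 2, 3, 4)          # hypothetical arc labels; the text only says "four arcs"


def arc_stimuli(arc):
    """Return every non-empty subset of the four whiskers in one arc."""
    whiskers = [f"{row}{arc}" for row in ROWS]
    return [combo
            for k in range(1, len(whiskers) + 1)
            for combo in combinations(whiskers, k)]


def classify_summation(p_combined, p_parts):
    """Compare a multi-whisker response probability with the sum of its parts.

    The linear benchmark (sum of constituent probabilities, capped at 1)
    is an illustrative assumption.
    """
    linear = min(1.0, sum(p_parts))
    if p_combined > linear:
        return "supralinear"
    if p_combined < linear:
        return "sublinear"
    return "linear"


stimuli = {arc: arc_stimuli(arc) for arc in ARCS}
per_arc = len(stimuli[3])                          # 4 + 6 + 4 + 1 subsets
total = sum(len(s) for s in stimuli.values())
print(per_arc, total)

# Invented probabilities: a pair that responds although neither whisker does alone,
# and a pair whose response falls short of the sum of its parts.
print(classify_summation(0.20, [0.0, 0.0]))
print(classify_summation(0.30, [0.25, 0.25]))
```

Under these assumptions each arc yields 15 distinct ipsilateral stimuli, or 60 across four arcs, which gives a sense of the size of the stimulus regime even before bilateral pairings are considered.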
These attributes of bilateral stimuli were all shown to significantly impact the evoked response probabilities of barrel cortical neurons, demonstrating that the barrel cortices are capable of integrating bilateral whisker information. Furthermore, the change in response probability caused by bilateral interactions could not be explained by postulating that barrel cortical neurons simply were less likely to fire to the test stimulus on trials in which they had fired to the condition stimulus. This result indicates that ipsilaterally, as well as contralaterally, evoked suprathreshold activity is followed by an epoch of inhibition of an even greater spatial extent. This
conclusion is further supported by noting that even for neurons without identifiable ipsilateral responses, contralaterally evoked responses to test stimuli were significantly impacted by prior ipsilateral stimulation. Therefore, bilateral interactions give rise to hemispheric and spatial differences in recovery of subsequent responses, potentially reflecting a differential activation of inhibitory networks. To determine the source of ipsilateral input, the opposite barrel cortex was pharmacologically inactivated by infusion of the GABAA agonist, muscimol, prior to bilateral stimulation. Not only did inactivation remove ipsilaterally evoked responses in the intact hemisphere, but the observed effects of prior ipsilateral stimulation on contralaterally evoked responses were also negated. Such results indicate that the barrel cortices provide one another with ipsilateral whisker information. Considering that callosal connections are thought to be excitatory, these results suggest that ipsilaterally evoked suprathreshold activity subsequently activates local inhibitory networks, rather than indicating that such inhibition derives from yet another source. A central question raised by these physiological results is whether rats can make use of bilateral tactile cues to discriminate objects. Surprisingly, though a number of behavioral studies have addressed how rats use their whiskers to discriminate tactile features of the environment (Hutson and Masterton, 1986; Guic-Robles et al., 1989; Barneoud et al., 1991; Pazos et al., 1995; Brecht et al., 1997), no study to date has directly addressed the ability of rats to compare bilateral tactile features. To test the hypothesis that rats can form bilateral percepts, we developed a discrimination task in which rats learn to compare the relative distance of two walls, one to each side of the face, using only the facial whiskers.
Of eight rats trained on this task, all learned to associate equidistant or non-equidistant bilateral stimuli with
water reward made available at one of two reward windows, respectively. These results provide the first evidence that rats can indeed combine information from both whisker pads (Shuler et al., 2000). The anatomical constraints of the whisker-barrel system make it ideal for studying cortical integration of independent sources of sensory input. We exploited this anatomy by investigating cortical integration of contralateral whisker-evoked activity ascending via thalamocortical pathways with that of callosally converging ipsilateral activity. Ipsilaterally evoked responses, as well as interactions due to the hemispheric, temporal, and spatial attributes of bilateral stimuli, strongly contradict the classical notion that the rat barrel cortex solely represents stimuli delivered to the contralateral whisker pad. Such interactions evoked by multi-whisker stimuli do not support the hypothesis that cortical profiles of activity result from the topographic, linear superposition of individual responses. Furthermore, such interactions cannot be explained by postulating the existence of superimposed contra- and ipsilateral topographic maps. To propose such a coding scheme, a countless number of topographies would be required to uniquely describe all possible permutations of bilateral whisker stimuli. Rather, we propose that ascending thalamic as well as converging transhemispheric input differentially excite the barrel cortex, subsequently initiating a wake of spreading inhibition. Spatial and temporal asymmetries in activating excitatory and inhibitory elements of the network result in spatiotemporally unique profiles of cortical activity, allowing each hemisphere to render unambiguously the attributes of a bilateral stimulus. In conclusion, we propose that such bilateral interactions are fundamental to forming bilateral tactile percepts, allowing rats to discriminate ethologically meaningful stimuli, such as the orientation and diameter of apertures.
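The trial-by-trial argument made in this section for the condition-test data, namely that the reduced response to the test stimulus is not simply a consequence of the neuron having already fired to the condition stimulus, can be illustrated with a short sketch. The function name, the trial vectors, and the test-alone probability below are all hypothetical and serve only to show the logic; they are not data from the study, and the original analysis may have been carried out differently.

```python
def split_test_probability(cond_fired, test_fired):
    """Split the test-stimulus response probability by condition-stimulus outcome.

    cond_fired, test_fired: sequences of 0/1 flags, one pair per
    condition-test trial (1 = the neuron fired to that stimulus).
    """
    if len(cond_fired) != len(test_fired):
        raise ValueError("both sequences must have one entry per trial")

    def prob(flags):
        # Probability of a response; NaN when there are no such trials.
        return sum(flags) / len(flags) if flags else float("nan")

    after_spike = [t for c, t in zip(cond_fired, test_fired) if c]
    after_silence = [t for c, t in zip(cond_fired, test_fired) if not c]
    return {
        "all_trials": prob(list(test_fired)),
        "after_cond_spike": prob(after_spike),
        "after_cond_silence": prob(after_silence),
    }


# Invented example: 10 condition-test trials and an assumed test-alone probability.
cond = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
test = [0, 0, 1, 0, 0, 0, 1, 0, 0, 0]
p_test_alone = 0.6  # hypothetical response probability to the test stimulus alone

result = split_test_probability(cond, test)
# If the test response stays below the test-alone level even on trials where the
# neuron was silent to the condition stimulus, the neuron's own prior firing
# cannot account for the suppression.
suppressed_without_prior_spike = result["after_cond_silence"] < p_test_alone
print(result, suppressed_without_prior_spike)
```

In this invented example the response probability on trials without a condition-evoked spike (2 of 8) remains well below the assumed test-alone level, which is the pattern the text attributes to network-level inhibition and not to the recorded neuron's own prior firing.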
Fig. 7. Evoked responses of two layer V barrel cortical neurons to bilateral whisker stimulation. Simultaneous, single unit recordings from neurons in the left (neuron 1) and right (neuron 2) hemispheres were obtained while stimulating left and right whisker arcs (whiskers b3, c3, d3, e3). Solid vertical lines centered at time 0 denote delivery of test stimuli, while dashed vertical lines denote delivery of condition stimuli. Six stimulus conditions are shown for neurons 1 and 2. The top two rows depict responses to the test stimuli (left whisker arc, L; right whisker arc, R) when given alone. The bottom four rows depict responses under condition-test stimulation: left then right whisker arcs with an ISI of 35 ms (L-R 35 ms), left then right whisker arcs with an ISI of 175 ms (L-R 175 ms), right then left whisker arcs with an ISI of 35 ms (R-L 35 ms), and finally, right then left whisker arcs with an ISI of 175 ms (R-L 175 ms). The y-axis is in spike counts per 1 ms bin of time (300 presentations of each stimulus configuration were given).
[Fig. 7 graphic: peristimulus histograms in two columns (Neuron 1, Neuron 2) and six rows (L, R, L-R 35 ms, L-R 175 ms, R-L 35 ms, R-L 175 ms), each response labeled Contra or Ipsi; x-axis: Time (sec), -0.1 to 0.1.]
Conclusions

Although the classic feedforward model of touch has provided a fundamental blueprint for the development of somatosensory research in the last five decades, a variety of experimental findings and theoretical arguments demonstrate that this model no longer offers an accurate description of how tactile perception emerges in the mammalian brain. Instead, anatomical, physiological, and computational arguments favor the hypothesis that tactile perception emerges through recurrent interactions between the multiple cortical and subcortical levels that define the mammalian somatosensory system. Central to this recurrent model of touch is the experimental demonstration that the massive corticofugal projections, which originate in the neocortex and reach most of the subcortical structures that form the somatosensory system, may play as relevant a role in tactile information processing as the parallel feedforward pathways of this system.
Acknowledgements

This chapter describes research supported by grants from DARPA-ONR (N00014-98-1-0676), NSF IBN99-80043, and NIH DE-11121-01 to M.A.L.N. and an NRSA (1 F31 MH12570-01A1) to M.S.
References

Adams, N.C., Lozsadi, D.A. and Guillery, R.W. (1997) Complexities in the thalamocortical and corticothalamic pathways. Eur. J. Neurosci., 9: 204-209.
Albe-Fessard, D., Condes-Lara, M., Kesar, S. and Sanderson, P. (1983) Tonic cortical controls acting on spontaneous and evoked thalamic activity. In: G. Macchi, A. Rustioni and R. Spreafico (Eds.), Somatosensory Integration in the Thalamus: a Reevaluation Based on New Methodological Approaches. Elsevier, Amsterdam.
Andersen, P., Eccles, J.C. and Sears, T.A. (1964) Cortically evoked depolarization of primary afferent fibers in the spinal cord. J. Neurophysiol., 27: 63-77.
Andersen, P., Junge, K. and Sveen, O. (1967) Cortico-thalamic facilitation of somatosensory impulses. Nature, 214(92): 1011-1012.
Andersen, P., Junge, K. and Sveen, O. (1972) Corticofugal facilitation of thalamic transmission. Brain Behav. Evol., 6: 170-184.
Barneoud, P., Gyger, M., Andres, F. and Van der Loos, H. (1991) Vibrissa-related behavior in mice: transient effect of ablation of the barrel cortex. Behav. Brain Res., 44: 87-99.
Bishop, C.M. (1995) Neural Networks for Pattern Recognition. Clarendon Press, Oxford.
Bourassa, J., Pinault, D. and Deschenes, M. (1995) Corticothalamic projections from the cortical barrel field to the somatosensory thalamus in rats: a single-fibre study using biocytin as an anterograde tracer. Eur. J. Neurosci., 7(1): 19-30.
Brecht, M., Preilowski, B. and Merzenich, M.M. (1997) Functional architecture of the mystacial vibrissae. Behav. Brain Res., 84: 81-97.
Brumberg, J.C., Pinto, D.J. and Simons, D.J. (1996) Spatial gradients and inhibitory summation in the rat whisker barrel system. J. Neurophysiol., 76(1): 130-140.
Carvell, G.E. and Simons, D.J. (1990) Biometric analyses of vibrissal tactile discrimination in the rat. J. Neurosci., 10: 2638-2648.
Castro-Alamancos, M.A. and Connors, B.W. (1996a) Short-term synaptic enhancement and long-term potentiation in neocortex. Proc. Natl. Acad. Sci. USA, 93(February): 1335-1339.
Castro-Alamancos, M.A. and Connors, B.W. (1996b) Spatiotemporal properties of short-term plasticity in sensorimotor thalamocortical pathway of the rat. J. Neurosci., 16: 2767-2779.
Cauller, L.J., Clancy, B. and Connors, B.W. (1998) Backward cortical projections to primary somatosensory cortex in rats extend long horizontal axons in layer I. J. Comp. Neurol., 390: 297-310.
Chapin, J.K. and Woodward, D.J. (1981) Modulation of sensory responsiveness of single somatosensory cortical cells during movement and arousal behaviors. Exp. Neurol., 72(1): 164-178.
Chapin, J.K. and Woodward, D.J. (1982a) Somatic sensory transmission to the cortex during movement: gating of single cell responses to touch. Exp. Neurol., 78(3): 654-669.
Chapin, J.K. and Woodward, D.J. (1982b) Somatic sensory transmission to the cortex during movement: phasic modulation over the locomotor step cycle. Exp. Neurol., 78(3): 670-684.
Chapin, J.K., Sadeq, M. and Guise, J.L.U. (1987) Corticocortical connections within the primary somatosensory cortex of the rat. J. Comp. Neurol., 263: 326-346.
Chapman, C.E., Jiang, W. and Lamarre, Y. (1988) Modulation of lemniscal input during conditioned arm movements in the monkey. Exp. Brain Res., 72(2): 316-334.
Chiaia, N.L., Rhoades, R.W., Bennett-Clarke, C.A., Fish, S.E. and Killackey, H.P. (1991) Thalamic processing of vibrissal information in the rat. I. Afferent input to the medial ventral posterior and posterior nuclei. J. Comp. Neurol., 314(2): 201-216.
Chmielowska, J., Carvell, G.E. and Simons, D.J. (1989) Spatial organization of thalamocortical and corticothalamic projection systems in the rat SmI barrel cortex. J. Comp. Neurol., 285: 325-338.
Cohen, L.G. and Starr, A. (1987) Localization, timing and specificity of gating of somatosensory evoked potentials during active movement in man. Brain, 110(Pt 2): 451-467.
Contreras, D., Destexhe, A., Sejnowski, T.J. and Steriade, M. (1996) Control of spatiotemporal coherence of a thalamic oscillation by corticothalamic feedback. Science, 274(5288): 771-774.
Coquery, J.M. (1971) Changes in somaesthetic evoked potentials during movement. Brain Res., 31(2): 375.
Coulter, J.D. (1974) Sensory transmission through lemniscal pathway during voluntary movement in the cat. J. Neurophysiol., 37(5): 831-845.
Deadwyler, S.A. and Hampson, R.E. (1997) The significance of neural ensemble codes during behavior and cognition. Annu. Rev. Neurosci., 20: 217-244.
Deschenes, M., Bourassa, J. and Pinault, D. (1994) Corticothalamic projections from layer V cells in rat are collaterals of long-range corticofugal axons. Brain Res., 664(1-2): 215-219.
Deschenes, M., Veinante, P. and Zhang, Z.W. (1998) The organization of corticothalamic projections: reciprocity versus parity. Brain Res. Rev., 28(3): 286-308.
Diamond, M.E., Armstrong-James, M., Budway, M.J. and Ebner, F.F. (1992) Somatic sensory responses in the rostral sector of the posterior group (POm) and in the ventral posterior medial nucleus (VPM) of the rat thalamus: dependence on the barrel field cortex. J. Comp. Neurol., 319: 66-84.
Dykes, R.W. (1983) Parallel processing of somatosensory information: a theory. Brain Res. Rev., 6: 47-115.
Erickson, R.P. (1968) Stimulus coding in topographic and nontopographic afferent modalities: on the significance of the activity of individual sensory neurons. Psychol. Rev., 75(6): 447-465.
Erickson, R.P. (1986) A neural metric. Neurosci. Biobehav. Rev., 10: 377-386.
Fabri, M. and Burton, H. (1991) Ipsilateral cortical connections of primary somatic sensory cortex in rats. J. Comp. Neurol., 311: 405-424.
Faggin, B.M., Nguyen, K.T. and Nicolelis, M.A. (1997) Immediate and simultaneous sensory reorganization at cortical and subcortical levels of the somatosensory system. Proc. Natl. Acad. Sci. USA, 94(17): 9428-9433.
Fanselow, E. and Nicolelis, M. (1999) Behavioral modulation of tactile responses in the rat somatosensory system. J. Neurosci., 19: 7603-7616.
Fields, H.L. and Heinricher, M.M. (1985) Anatomy and physiology of a nociceptive modulatory system. Phil. Trans. R. Soc. Lond. B, 308: 361-374.
Gastaut, H. (1952) Etude electrocorticographique de la reactivite des rythmes rolandiques. Rev. Neurol. (Paris), 87: 176-182.
Georgopoulos, A.P., Schwartz, A.B. and Kettner, R.E. (1986) Neuronal population coding of movement direction. Science, 233: 1416-1419.
Ghazanfar, A.A. and Nicolelis, M.A.L. (1997) Nonlinear processing of tactile information in the thalamocortical loop. J. Neurophysiol., 78(1): 506-510.
Ghazanfar, A.A. and Nicolelis, M.A.L. (1999) Spatiotemporal properties of layer V neurons of the rat primary somatosensory cortex. Cereb. Cortex, 9: 348-361.
Ghazanfar, A.A., Krupa, D.J. and Nicolelis, M.A.L. (1997) Tactile processing by thalamic neural ensembles: the role of cortical feedback. Soc. Neurosci. Abstr., 1797.
Ghez, C. and Lenzi, G.L. (1971) Modulation of sensory transmission in cat lemniscal system during voluntary movement. Pflugers Arch. Eur. J. Physiol., 323(3): 273-278.
Grossberg, S. (1976) Adaptive pattern classification and universal recoding: II. Feedback, expectation, olfaction, illusions. Biol. Cybern., 23: 187-202.
Grossberg, S. (1988) Nonlinear neural networks: principles, mechanisms and architectures. Neural Networks, 1: 17-61.
Grossberg, S. (1999) The link between brain, learning, attention, and consciousness. Consciousness Cognit., 8: 1-44.
Guic-Robles, E., Valdivieso, C. and Guajardo, G. (1989) Rats can learn a roughness discrimination using only their vibrissal system. Behav. Brain Res., 31(3): 285-289.
Hebb, D.O. (1949) The Organization of Behavior: a Neuropsychological Theory. John Wiley and Sons, New York.
Hellweg, F.C., Schultz, W. and Creutzfeldt, O.D. (1977) Extracellular and intracellular recordings from cat's cortical whisker projection area: thalamocortical response transformation. J. Neurophysiol., 40(3): 463-479.
Hoogland, P.V., Welker, E. and Van der Loos, H. (1987) Organization of the projections from barrel cortex to thalamus in mice studied with Phaseolus vulgaris-leucoagglutinin and HRP. Exp. Brain Res., 68: 73-87.
Hoogland, P.V., Welker, E., Van der Loos, H. and Wouterlood, F.G. (1988) The organization and structure of the thalamic afferents from the barrel cortex in the mouse; a PHA-L study. In: M. Bentivoglio and R. Spreafico (Eds.), Cellular Thalamic Mechanisms. Elsevier Science, Amsterdam, pp. 151-161.
Hoogland, P.V., Wouterlood, F.G., Welker, E. and Van der Loos, H. (1991) Ultrastructure of giant and small thalamic terminals of cortical origin: a study of the projections from the barrel cortex in mice using Phaseolus vulgaris leuco-agglutinin (PHA-L). Exp. Brain Res., 87: 159-172.
Hutson, K.A. and Masterton, R.B. (1986) The sensory contribution of a single vibrissa's cortical barrel. J. Neurophysiol., 56(4): 1196-1223.
Jacquin, M.F., Chiaia, N.L., Haring, J.H. and Rhoades, R.W. (1990a) Intersubnucleus connections within the rat trigeminal brainstem complex. Somatosens. Motor Res., 7(4): 399-420.
Jacquin, M.F., Wiegand, M.R. and Renehan, W.E. (1990b) Structure-function relationships in rat brain stem subnucleus interpolaris. VIII. Cortical inputs. J. Neurophysiol., 64(1): 3-27.
Johnson, K.O., Hsiao, S.S. and Twombly, I.A. (1995) Neural mechanisms of tactile form recognition. In: M.S. Gazzaniga (Ed.), The Cognitive Neurosciences. MIT Press, Cambridge, MA, pp. 253-267.
Kaas, J.H. (1990) Somatosensory system. In: G. Paxinos (Ed.), The Human Nervous System. Academic Press, San Diego, pp. 813-844.
Kaas, J.H. and Pons, T.P. (1988) The somatosensory system of primates. In: H.D. Steklis and J. Erwin (Eds.), Comparative Primate Biology, Vol. 4. Alan R. Liss, New York, pp. 421-468.
Kaas, J.H., Merzenich, M.M. and Killackey, H.P. (1983) The organization of the somatosensory cortex following peripheral nerve damage in adult and developing mammals. Annu. Rev. Neurosci., 6: 325-356.
Killackey, H.P. (1973) Anatomical evidence for cortical subdivisions based on vertically discrete thalamic projections from the ventral posterior nucleus to cortical barrels in the rat. Brain Res., 51: 326-331.
Kleinfeld, D. and Delaney, K.R. (1996) Distributed representation of vibrissa movement in the upper layers of somatosensory cortex revealed with voltage-sensitive dyes [published erratum appears in J. Comp. Neurol. (1997) 378(4): 594]. J. Comp. Neurol., 375(1): 89-108.
Koralek, K.A., Olavarria, J. and Killackey, H.P. (1990) Areal and laminar organization of corticocortical projections in the rat somatosensory cortex. J. Comp. Neurol., 299: 135-150.
Krupa, D.J., Ghazanfar, A.A. and Nicolelis, M.A.L. (1999) Immediate thalamic sensory plasticity depends on corticothalamic feedback. Proc. Natl. Acad. Sci. USA, 96: 8200-8205.
Lee, R.G. and White, D.G. (1974) Modification of the human somatosensory evoked response during voluntary movement. Electroencephalogr. Clin. Neurophysiol., 36(1): 53-62.
Lin, C.S., Nicolelis, M.A., Schneider, J.S. and Chapin, J.K. (1990) A major direct GABAergic pathway from zona incerta to neocortex [see comments]. Science, 248(4962): 1553-1556.
Lu, S.-H. and Lin, R.C.S. (1993) Thalamic afferents of the rat barrel cortex: a light- and electron-microscopic study using Phaseolus vulgaris leucoagglutinin as an anterograde tracer. Somatosens. Motor Res., 10: 1-16.
Lu, S.M. and Lin, C.S. (1986) Cortical projection patterns of the medial division of the nucleus posterior thalami in the rat. Soc. Neurosci. Abstr., 12: 1434.
Masino, S.A. and Frostig, R.D. (1996) Quantitative long-term imaging of the functional representation of a whisker in rat barrel cortex. Proc. Natl. Acad. Sci. USA, 93: 4942-4947.
McCormick, D.A. and Pape, H.-C. (1990) Noradrenergic and serotonergic modulation of a hyperpolarization-activated cation current in thalamic relay neurones. J. Physiol., 431: 319-342.
McCormick, D.A. and von Krosigk, M. (1992) Corticothalamic activation modulates thalamic firing through glutamate metabotropic receptors. Proc. Natl. Acad. Sci. USA, 89: 2774.
McLean, J. and Waterhouse, B.D. (1994) Noradrenergic modulation of cat area 17 neuronal responses to moving visual stimuli. Brain Res., 667: 83-97.
Merzenich, M.M., Kaas, J.H., Wall, J.T., Nelson, R.J., Sur, M. and Felleman, D.J. (1983) Topographic reorganization of somatosensory cortical areas 3b and 1 in adult monkeys following restricted deafferentation. Neuroscience, 8(1): 33-55.
Moore, C.I. and Nelson, S.B. (1998) Spatio-temporal subthreshold receptive fields in the vibrissa representation of rat primary somatosensory cortex. J. Neurophysiol., 80(6): 2882-2892.
Mountcastle, V. (1957) Modality and topographic properties of single neurons of cats' somatic sensory cortex. J. Neurophysiol., 20: 408-434.
Mountcastle, V. (1974) Neural mechanisms in somesthesia. In: V. Mountcastle (Ed.), Medical Physiology, Vol. I. C.V. Mosby, St. Louis, pp. 307-347.
Mumford, D. (1991) On the computational architecture of the neocortex. I. The role of the thalamo-cortical loop. Biol. Cybern., 65(2): 135-145.
Mumford, D. (1992) On the computational architecture of the neocortex. II. The role of corticocortical loops. Biol. Cybern., 66: 241-251.
Mumford, D. (1994) Neuronal architectures for pattern-theoretic problems. In: D. Koch (Ed.), Large-Scale Neuronal Theories of the Brain. MIT Press, Cambridge, MA, pp. 125-152.
Nelson, R.J. (1984) Responsiveness of monkey primary somatosensory cortical neurons to peripheral stimulation depends on 'motor-set'. Brain Res., 304(1): 143-148.
Nelson, R.J. (1987) Activity of monkey primary somatosensory cortical neurons changes prior to active movement. Brain Res., 406(1-2): 402-407.
Nicolelis, M.A.L. (1996) Beyond maps: a dynamic view of the somatosensory system. Braz. J. Med. Biol. Res., 29: 401-412.
Nicolelis, M.A. and Chapin, J.K. (1994) Spatiotemporal structure of somatosensory responses of many-neuron ensembles in the rat ventral posterior medial nucleus of the thalamus. J. Neurosci., 14(6): 3511-3532.
Nicolelis, M.A.L., Chapin, J.K. and Lin, R.C.S. (1991) Ontogeny of corticocortical projections of the rat somatosensory cortex. Somatosens. Motor Res., 8(3): 193-200.
Nicolelis, M.A., Lin, R.C., Woodward, D.J. and Chapin, J.K. (1993a) Dynamic and distributed properties of many-neuron ensembles in the ventral posterior medial thalamus of awake rats. Proc. Natl. Acad. Sci. USA, 90(6): 2212-2216.
Nicolelis, M.A.L., Lin, R.C.S., Woodward, D.J. and Chapin, J.K. (1993b) Induction of immediate spatiotemporal changes in thalamic networks by peripheral block of ascending cutaneous information. Nature, 361: 533-536.
Nicolelis, M.A., Baccala, L.A., Lin, R.C. and Chapin, J.K. (1995) Sensorimotor encoding by synchronous neural ensemble activity at multiple levels of the somatosensory system. Science, 268(5215): 1353-1358.
Nicolelis, M.A.L., Oliveira, L.M.O., Lin, R.C.S. and Chapin, J.K. (1996) Active tactile exploration influences the functional maturation of the somatosensory system. J. Neurophysiol., 75: 2192-2196.
Nicolelis, M.A., Fanselow, E.E. and Ghazanfar, A.A. (1997) Hebb's dream: the resurgence of cell assemblies. Neuron, 19(2): 219-221.
Nicolelis, M.A., Katz, D. and Krupa, D.J. (1998a) Potential circuit mechanisms underlying concurrent thalamic and cortical plasticity. Rev. Neurosci., 9(3): 213-224.
Nicolelis, M.A.L., Ghazanfar, A.A., Stambaugh, C.R., Oliveira, L.M.O., Laubach, M., Chapin, J.K., Nelson, R.J. and Kaas, J.H. (1998b) Simultaneous encoding of tactile information by three primate cortical areas. Nature Neurosci., 1(7): 621-630.
Niedermeyer, E. (1993) The normal EEG of the waking adult. In: E. Niedermeyer and F. Lopes da Silva (Eds.), Electroencephalography. Basic Principles, Clinical Applications and Related Fields. Williams and Wilkins, Baltimore, MD, pp. 131-152.
Ogden, T.E. (1960) Cortical control of thalamic somato-sensory relay nuclei. Electroenceph. Clin. Neurophysiol., 12: 621-634.
Olavarria, J.F., DeYoe, E.A., Knierim, J.J., Fox, J.M. and Van Essen, D.C. (1992) Neuronal responses to visual texture patterns in middle temporal area of the macaque monkey. J. Neurophysiol., 68(1): 164-181.
Parker, J. and Dostrovsky, J. (1999) Cortical involvement in the induction, but not expression, of thalamic plasticity. J. Neurosci., 19: 8623-8629.
Pazos, A.J., Orezzoli, S.L., McCabe, P.M., Dietrich, W.D. and Green, E.J. (1995) Recovery of vibrissae-dependent behavioral responses following barrelfield damage is not dependent upon the remaining somatosensory cortical tissue. Brain Res., 689(2): 224-232.
Pidoux, B. and Verley, R. (1979) Projections on the cortical somatic I barrel subfield from ipsilateral vibrissae in adult rodents. Electroencephalogr. Clin. Neurophysiol., 46: 715-726.
Pinault, D. and Deschenes, M. (1998) Projection and innervation patterns of individual thalamic reticular axons in the thalamus of the adult rat: a three-dimensional, graphic and morphometric analysis. J. Comp. Neurol., 391(2): 180-203.
Pinault, D., Bourassa, J. and Deschenes, M. (1995) The axonal arborization of single thalamic reticular neurons in the somatosensory thalamus of the rat. Eur. J. Neurosci., 7(1): 31-40.
Pinault, D., Smith, Y. and Deschenes, M. (1997) Dendrodendritic and axoaxonic synapses in the thalamic reticular nucleus of the adult rat. J. Neurosci., 17(9): 3215-3233.
Polley, D.B., Chen-Bee, C.H. and Frostig, R.D. (1999) Varying the degree of single-whisker stimulation differentially affects phases of intrinsic signals in rat barrel cortex. J. Neurophysiol., 81(2): 692-701.
Purves, D., Riddle, D.R. and LaMantia, A.-S. (1992) Iterated patterns of brain circuitry (or how the cortex gets its spots). Trends Neurosci., 15(10): 362-368.
Rhoades, R.W., Belford, G.R. and Killackey, H.P. (1987) Receptive-field properties of rat ventral posterior medial neurons before and after selective kainic acid lesions of the trigeminal brain stem complex. J. Neurophysiol., 57(5): 1577-1600.
Salt, T.E. and Eaton, S.A. (1996) Functions of ionotropic and metabotropic glutamate receptors in sensory transmission in the mammalian thalamus. Prog. Neurobiol., 48: 55-72.
Salt, T.E. and Turner, J.P. (1998) Modulation of sensory inhibition in the ventrobasal thalamus via activation of group II metabotropic glutamate receptors by 2R,4R-aminopyrrolidine-2,4-dicarboxylate. Exp. Brain Res., 121(2): 181-185.
Schmidt, R.F., Schady, W.J. and Torebjork, H.E. (1990a) Gating of tactile input from the hand. I. Effects of finger movement. Exp. Brain Res., 79(1): 97-102.
Schmidt, R.F., Torebjork, H.E. and Schady, W.J. (1990b) Gating of tactile input from the hand. II. Effects of remote movements and anaesthesia. Exp. Brain Res., 79(1): 103-108.
Sejnowski, T.J., Koch, C. and Churchland, P.S. (1988) Computational neuroscience. Science, 241: 1299-1306.
Sherman, S.M. and Guillery, R.W. (1996) Functional organization of thalamocortical relays. J. Neurophysiol., 76(3): 1367-1395.
Shigemi, S., Ichikawa, T., Akasaki, T. and Sato, H. (1999) Temporal characteristics of response integration evoked by
multiple whisker stimulations in the barrel cortex of rats. J. Neurosci., 19: 10164-10175. Shin, H.C. and Chapin, J.K. (1989) Mapping the effects of motor cortex stimulation on single neurons in the dorsal column nuclei in the rat: direct responses and afferent modulation. Brain Res. Bull., 22(2): 245-252. Shin, H.C. and Chapin, J.K. (1990a) Modulation of afferent transmission to single neurons in the ventroposterior thalamus during movement in rats. Neurosci. Lett., 108(1-2): 116-120. Shin, H.C. and Chapin, J.K. (1990b) Movement induced modulation of afferent transmission to single neurons in the ventroposterior thalamus and somatosensory cortex in rat. Exp. Brain Res., 81(3): 515-522. Shin, H.-C. and Chapin, J.K. (1990c) Mapping the effects of SI cortex stimulation on somatosensory relay neurons in the rat thalamus: direct responses and afferent modulation. Somatosens. Motor Res., 7: 421--434. Shuler, M.G., Krupa, D.J. and Nicolelis, M,A.L. (2000) Discrimination of bilateral whisker stimuli in the freely behaving rat.
Soc. Neurosci. Abstr. Shuler, M.G., Krupa, D.J. and Nicolelis, M.A.L. (2001) Bilateral integration of whisker information in the primary somatosensory cortex of rats, submitted for publication. Simons, D.J. (1985) Temporal and spatial integration in the rat SI vibrissa cortex. J. Neurophysiol., 54(3): 615-635. Simons, D.J. and Carvell, G.E. (1989) Thalamocortical response transformation in the rat vibrissa/barrel system. J. Neurophysiol., 61(2): 311-330. Turner, J.E and Salt, T.E. (1998) Characterization of sensory and corticothalamic excitatory inputs to rat thalamocortical neurones in vitro. J. Physiol. (Lond.), 510(Pt 3): 829-843. Turner, J.E and Salt, T.E. (1999). Group III metabotropic glutamate receptors control corticothalamic synaptic transmission in the rat thalamus in vitro [In Process Citation]. J. Physiol. (Lond.), 519 Pt 2, 481-491. Van dcr Loos, H. (1976) Barreloids in mouse somatosensory thalamus. Neurosci. Lett., 2: 1-6. Veinante, E and Deschenes, M. (1999) Single- and multi-whisker channels in the ascending projections from the principal trigeminal nucleus in the rat. J. Neurosci., 19(12): 50855095. Waller, H.J. and Feldman, S.M. (1967) Somatosensory thalamic neurons: effects of cortical depression. Science, 157: 10741077. Waterhouse, B.D., Border, B., Wahl, L. and Mihalloff, G.A. (1994) Topographic organization of rat locus coeruleus and dorsal raphe nuclei: distribution of cells projecting to visual system structures. J. Comp. Neurol., 336: 345-361. White, E. and DeAmicis, R. (1977) Afferent and efferent projections of the region in mouse SmI cortex which contains the posteromedial barrel subfield. J. Comp. NeuroL, 175: 455482. Woolsey, T.A. and Van der Loos, H. (1970) The structural organization of layer IV in the somatosensory region (SI) of mouse cerebral cortex: the description of a cortical field composed of discrete cytoarchitectonic units. Brain Res., 17: 205-242.
110
Yuan, B., Morrow, T.J. and Casey, K.L. (1985) Responsiveness of ventrobasal thalamic neurons after suppression of SI cortex in the anesthetized rat. J. Neurosci., 5: 2971-2978. Yuan, B., Morrow, T.J. and Casey, K.L. (1986) Cortifugal influences of S 1 cortex on ventrobasal thalamic neurons in the
awake rat. J. Neurosci., 6:3611-3617. Zhang, Z.-W. and Deschenes, M. (1998) Projections to layer VI of the postermedial barrel field in the rat: a reappraisal of the role of corticothalamic pathways. Cereb. Cortex, 8: 428-436.
M.A.L. Nicolelis (Ed.)
Progress in Brain Research, Vol. 130 © 2001 Elsevier Science B.V. All rights reserved
CHAPTER 8
Synchronization and assembly formation in the visual cortex
Winrich A. Freiwald 1,2,*, Andreas K. Kreiter 1 and Wolf Singer 2
1 Institute for Brain Research, University of Bremen, FB2, P.O. Box 330440, D-28334 Bremen, Germany
2 Max-Planck-Institute for Brain Research, Deutschordenstrasse 46, D-60528 Frankfurt/Main, Germany
* Corresponding author: Winrich A. Freiwald, Institute for Brain Research, University of Bremen, FB2, P.O. Box 330440, D-28334 Bremen, Germany. Tel.: +49-421-218-9095; Fax: +49-421-218-9004; E-mail: [email protected]

Neural assemblies

The main challenge for information processing by the brain's sensory systems is the complexity of natural environments. This complexity is of a combinatorial nature: while elementary features at very different levels of physical organization reappear and may be quite limited in number, the space of possible feature combinations is virtually unlimited. Since the unlikely, the new and even the physically impossible can be seen with ease, the coding schemes used by the visual system are adapted to provide representations for all possible feature constellations and not only likely ones. For this reason, covariance of feature appearance (e.g. the co-occurrence of the color green and the form of leaves) does not reduce the principal demands for visual information processing. What mechanisms then does the visual system employ to cope with combinatorial complexity? Donald Hebb's cell assembly concept (Hebb, 1949) may be conceived as the proposal to let the problem be the solution. In the same way as natural objects in the outside world are composed of elementary features, internal neural representations are generated by combinations of elementary neural responses. The presence of a feature is signaled by the activity of a specific neuron or a set of neurons; the presence of a whole object is signaled by a spatially distributed and cooperatively interacting population of neurons, termed a cell assembly. Thus, while neurons are outnumbered by possible feature constellations (Sejnowski, 1986), cell assemblies are not, because they are of a combinatorial nature. For this theoretical reason, single 'gnostic units' at the tip of a processing hierarchy (Konorski, 1967) can hardly be the only neural correlate of object representations (Sherrington, 1941; Barlow, 1972b; Harris, 1980), even though they could serve to encode frequently occurring, highly familiar constellations of features (Miyashita, 1988; Sakai and Miyashita, 1994; Vogels and Orban, 1994; Logothetis et al., 1995; Kobatake et al., 1998; Gauthier et al., 1999).

The combinatorial nature of neural assemblies yields two more desirable coding properties. First, even though assembly coding seems energetically expensive in terms of the number of active neurons, it is economical in terms of the overall number of neurons necessary (Hinton, 1981; Field, 1994), because neurons can be used in different stimulus contexts by recombination with other neurons to form assemblies. Second, this coding scheme is very flexible, because variations in stimulus constellations are met by variations of neural activity patterns. By the same token, newly encountered stimuli can be represented by new patterns of already existing building blocks (Braitenberg, 1978; Grossberg, 1980; Edelman, 1987; Gerstein et al., 1989; Engel et al., 1997). Furthermore, cooperativity, the second
key feature of assembly codes, allows for pattern completion of only partly visible stimuli, generalization over similar stimuli and fault tolerance (Palm, 1982; Rumelhart and McClelland, 1986; Hopfield and Tank, 1991). Finally, assembly codes are reliable due to redundancy of neural population responses. Since neural systems can adapt to the statistics of different environments and behavioral demands, very different aspects of an object may serve as elementary features. Yet commonly co-occurring features appear to be represented by hard-wired neural responses that are then grouped together with other responses representing different features. Thus, the concept of assembly coding is open to incorporate diverse response selectivities of neurons. However, combinations of highly abstract cardinal neurons alone (Barlow, 1972a) are insufficient for the representation of the detailed structure of natural objects (Kreiter and Singer, 1996a).

Distributed coding in the visual cortex

The quintessential principle of assembly coding is the generation of complex internal activity patterns for the representation of outside world objects. Internal representational complexity is already reflected in the structural organization of the visual cortex (Tononi et al., 1994), which has been revealed by anatomical and more recently functional imaging techniques. More than 30 cortical areas have been described in the macaque visual system (Livingstone and Hubel, 1988; Zeki and Shipp, 1988; Felleman and van Essen, 1991; Bullier and Nowak, 1995), a number likely to be paralleled by other primate species, including humans. Visual areas are typically further subdivided into smaller compartments (Kaas and Krubitzer, 1991; Krubitzer, 1995). This structural segmentation is accompanied by functional specialization. Response properties of neurons within areas and in areal subcompartments are more similar than responses of cells in different compartments.
Thus, the many different compartments operate as functionally specialized modules for the analysis of the visual scene. Every visual stimulus will therefore activate modules in very different parts of the visual cortex (Zeki, 1993), leading to a largely distributed representation, as documented by functional imaging studies. Distributed representations also exist within
each module, because most neurons respond like broadly tuned filters (van Essen et al., 1992; Martin, 1994), and thus, many neurons will be activated by any feature relevant for a particular module. While these considerations show that responses are distributed within the visual system, they do not by themselves provide evidence for true parallel processing within the visual system. This conclusion can however be reached by considering the anatomical coupling of cortical areas. Based on laminar patterns of inter-areal connections that correlate with input-output functions (Rockland and Pandya, 1979), the various cortical areas can be placed into hierarchical schemes (Felleman and van Essen, 1991; Hilgetag et al., 1996; Crick and Koch, 1998), but several cortical areas have to be placed at every level of the hierarchy. Thus, visual information is processed in parallel at every level of the hierarchy including its final stages. The existence of rich and typically reciprocal coupling between areas, including feedforward, lateral and feed-back connections (Lamme et al., 1998), further suggests that this processing is not accomplished independently in each cortical area, but rather through cooperative interactions. In agreement with this view, strong effects of feed-back connections on primary visual cortex responses have been described recently (Hupé et al., 1998).

Distributed coding and neural interactions

For the reasons discussed in the previous paragraphs, the concept of distributed coding has become almost commonplace. However, less agreement exists on the properties of these distributed representations. Issues in question are the population size, with a scale of possibilities ranging from coarse coding (see e.g. Rumelhart and McClelland, 1986; Eurich and Schwegler, 1997) to sparse coding (see e.g.
Young and Yamane, 1992) and the strategies used for stimulus encoding and decoding of population activity (Churchland and Sejnowski, 1992; Snippe and Koenderink, 1992; Sanger, 1996; Brown et al., 1998; Zhang et al., 1998; Eurich and Wilke, 1999; Oram et al., 1999; Zhang and Sejnowski, 1999). This second issue raises the question of how population signals are internally organized. As stated above, a central aspect of the Hebbian cell assembly concept is the cooperativity of its members. These
interactions of neurons result in temporally correlated activity (Aertsen et al., 1986; Johannesma et al., 1986; Gerstein et al., 1989; Singer et al., 1990, 1997), and for this reason synchronous firing of neurons has been considered as an operational definition for the experimental detection of cell assemblies. In contrast, influential models of population coding, especially vector coding (see e.g. Georgopoulos, 1990), have assumed independence of cell firing. The rationale of these concepts is the ergodic hypothesis, namely that the nervous system can take population averages over neurons to achieve reliable stimulus coding in the same way as the experimenter equipped with a single electrode can determine the single neuron's exact response property by averaging over trials of stimulus presentations (Gerstein and Gochin, 1992). Within this framework, covariation of neural firing has been conceived as a limiting factor for effective sampling over populations due to the introduction of redundancies (Britten et al., 1992; Zohary et al., 1994). Contrary to this point of view, however, the potential usefulness of temporally coordinated activity for information processing by neural populations has recently been demonstrated by both theoretical arguments and experimental findings. First, information theoretical considerations have shown that correlations of neural firing, both stimulus-dependent and stimulus-independent ones, can markedly enhance the accuracy of population responses, increase information rates and therefore allow for increased stimulus discriminability (Snippe and Koenderink, 1992; Richmond and Gawne, 1998; Abbott and Dayan, 1999; Oram et al., 1999; Panzeri et al., 1999). Thus, synchrony does not necessarily imply redundancy for stimulus encoding but can be the source of synergy. Second, synergy by synchrony has been found experimentally in the visual system.
Synchronous spike pairs can carry extra information not available from the individual spikes elicited by two neurons independently. This finding was obtained for visual cortical and retinal cell pairs by demonstrating the difference between the receptive fields (RFs) of the individual neurons and the so-called 'bicellular receptive fields' reconstructed from synchronous events (Ghose et al., 1994; Meister et al., 1995; Meister, 1996) and by an information theoretic analysis of neural pairs in the lateral geniculate
nucleus of the thalamus (Dan et al., 1998). Third, in visual, auditory, motor and premotor cortex and hippocampus, changes of synchronization in relation to external or internal events have been observed without concomitant changes of firing rates (Vaadia et al., 1995; deCharms and Merzenich, 1996; Kreiter and Singer, 1996b; Riehle et al., 1997; Sakurai, 1999), indicating that information about these events was available to the nervous system by the relative timing of spikes but not by activity levels. In the motor cortex, in addition, it has been shown that extra information about the direction of a forthcoming arm movement was provided by the synchronous firing of cells beyond that available from the rates or rate changes (Hatsopoulos et al., 1999; Maynard et al., 1999). Taken together, these theoretical arguments and experimental findings show that coordinated neural interactions, a key element of the cell assembly concept, may indeed play an important role in neural population coding.
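
The redundancy argument mentioned above (covariation as a limit on population averaging) can be made concrete with a small calculation. The sketch below is illustrative only and is not taken from the chapter: it assumes n neurons with identical response variance and a uniform pairwise correlation coefficient, and the function name and the value rho = 0.1 are hypothetical choices. Under these assumptions the variance of a plain population average is sigma^2 (1 + (n - 1) rho) / n, which approaches rho sigma^2 rather than zero as n grows.

```python
def variance_of_population_mean(n, sigma2=1.0, rho=0.0):
    """Variance of the mean of n responses with equal variance sigma2
    and uniform pairwise correlation rho (illustrative assumption)."""
    return sigma2 * (1.0 + (n - 1) * rho) / n


if __name__ == "__main__":
    # Compare independent firing (rho = 0) with weakly correlated firing.
    for n in (1, 10, 100, 1000):
        independent = variance_of_population_mean(n, rho=0.0)
        correlated = variance_of_population_mean(n, rho=0.1)
        print(n, independent, correlated)
```

This captures only the averaging argument for identically tuned cells; it says nothing about the synergistic effects of correlations discussed in the text, which depend on how the population activity is decoded.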
The temporal binding hypothesis

If correlated activity can be the signature of neural assemblies, it may be essential in the context of multiple object encoding (von der Malsburg, 1981; von der Malsburg and Schneider, 1986; Singer, 1990a, 1993). The basic argument, illustrated in Fig. 1, is the following. Assume for the moment that a cell assembly in the visual cortex is solely defined as a population of cells activated by an object in the environment. The several objects which are typically contained within a natural visual scene will therefore activate several assemblies. Since, by definition, assemblies are distributed representations, i.e. are not confined to a specific cortical locus, they will superpose in cortical space. This is especially evident for objects in close proximity or with partial overlap in visual space, since their representations will overlap even within processing modules with precise topographical mapping and high spatial resolution, e.g. within areas V1 and V2, even more so due to cells' non-classical RF properties (see e.g. von der Heydt et al., 1984, Allman et al., 1985 and Lee et al., 1998).

Fig. 1. Schematic illustration of multiple-object encoding by neural assemblies. The visual scene considered here contains two complex objects, a mouse and an apple. Each object is represented by a distributed group of cells which are responding to the object's features. These two cell assemblies representing the two objects are shown in the middle of the diagram. For four cells, the receptive field properties are indicated. These are intended to match the moderate selectivity found for neurons within the inferotemporal cortex (Tanaka, 1996), showing that a distributed representation is also needed in this final stage of the ventral path processing hierarchy (Felleman and van Essen, 1991). Since the two objects overlap in visual space, the two assemblies are overlapping in cortical space. Due to this superposition, the membership relation of active cells is ambiguous. It is not clear which cells belong to the same and which belong to different cell assemblies. By the same token, the relationships between the represented features are lost. To avoid this so-called superposition catastrophe, a label is needed that uniquely identifies constituents of the same object and distinguishes between members of different assemblies. This label could be provided by the synchronous firing of neurons of the same assembly and asynchronous firing of cells in different assemblies (shown on the right-hand side). Extending the classical Hebbian definition of a cell assembly from mere activation to synchronous activity preserves the essential property of cooperativity of assemblies, because synchronization need not be provided by an external source, but can be brought about by neural groups in a self-organizing manner (see e.g. von der Malsburg and Schneider, 1986; Sompolinsky et al., 1990; Wang et al., 1990; Grossberg and Somers, 1991; König and Schillen, 1991; Sporns et al., 1991; Neven and Aertsen, 1992; von der Malsburg and Buhmann, 1992; Chawanya et al., 1993; Ritz et al., 1994; Schillen and König, 1994; Kappen, 1997; Lumer et al., 1997; Eckhorn, 1999a).

This overlap of multiple representations (A, B, C, ...) is a serious challenge for all distributed coding schemes, coined the 'superposition problem', because any part of a full representation A, say A_i, might as well belong to any other part of representations B, C, ... as to other parts of A, A_j, A_k and so forth. Thus, the relationships between neurons may be confounded, and by the same token, the relationships between features in the outside world remain ambiguous. Not even the number of objects present can be deduced from the activity of neurons alone. Clearly, a mechanism is needed which unambiguously identifies the members of the same assembly and distinguishes members of different assemblies. A solution to this so-called 'binding problem' has been suggested by von der Malsburg (1981, 1986), Abeles (1982a) and, in a preliminary form, by Milner (1974). According to this proposal, cells belonging to the same assembly fire action potentials synchronously with a precision of a few milliseconds, and cells belonging to different assemblies
fire asynchronously. This way, relationships between features are made explicit and multiple objects can be represented within the same cortical space. This concept, which we will refer to as the 'temporal binding hypothesis', is an attractive solution to the binding problem, because it preserves the essential features of assembly coding discussed above. Essentially, the notion of neural interactions in the Hebbian concept is made more explicit: synchrony with a precision of a few milliseconds is the label for assembly membership. The temporal binding hypothesis requires cortical neurons or microcircuits to act as coincidence detectors (Abeles, 1982b; Softky, 1994; König et al., 1996), a proposal for which ample direct and indirect experimental evidence has accumulated in recent years (Alonso et al., 1996; Matsumara et al., 1996;
Castelo-Branco et al., 1998; Margulis and Tang, 1998; Prut et al., 1998; Rager and Singer, 1998; Stevens and Zador, 1998; Volgushev et al., 1998; Azouz and Gray, 1999; Larkum et al., 1999).

Several alternative solutions to the binding problem have been suggested and critically discussed in detail by Singer (1990b), Singer and Gray (1995), Kreiter and Singer (1996a), Engel et al. (1997) and Roelfsema (1998). Briefly, the proposal to introduce binding units, i.e. neurons which are selective for feature conjunctions, has to be rejected, because it re-introduces the problem which assembly codes had actually been designed to avoid, i.e. scaling neural numbers with the number of feature combinations. Mechanisms based on a spotlight of attention (Treisman, 1986, 1996; Olshausen et al., 1993) certainly help to distinguish neural responses to spatially separate objects, but do not provide a general solution to the binding problem, since multiple objects can be included within the spotlight. Furthermore, selective attention is based on the results of pre-attentive segregation and binding processes working in parallel within the whole visual field, which by definition are not attentive.

Thus, synchronous activity within the millisecond time range may be the signature of neural assembly formation. In this review, we will present experimental data from the mammalian visual cortex related to this proposal. We therefore do not aim to provide a full review of temporal coding within the visual system, but rather restrict the scope of this paper in several ways. First, since most of the results presented have been obtained using the cross-correlation technique (Gerstein and Perkel, 1969) (which is briefly explained in the legend of Fig.
3) and the Joint-PSTH (Aertsen et al., 1989), we will use the term synchrony to mean a significant excess of coincident events within a time window of a few milliseconds, as detected with these techniques, and thereby avoid the problem of finding a general definition for synchrony (Aertsen and Arndt, 1993). Second, we will focus on temporally precise forms of spike time coordination. It should be noted, however, that in the visual cortex very different ranges of precision of synchronization have been found, ranging from a single millisecond to several hundreds of milliseconds (see e.g. Gochin et al., 1991, Nowak et al., 1995, 1999, Lampl et al., 1999 and Bair,
1999 for a review). All of these forms may be of functional relevance in the same way as firing rates within temporal intervals of different duration are. Third, we will concentrate on synchronization in the context of assembly formation, also recently termed emergent synchrony (Usrey and Reid, 1999). Other forms of synchrony in the visual system, caused by anatomical divergence, have recently been reviewed (Usrey and Reid, 1999). Finally, relative spike timing of neurons is but one form of temporal coding. Within the temporal binding hypothesis, it is not specified in which format a single neuron encodes features of an object. For the sake of simplicity and congruence with common belief, we may assume that a feature is encoded in the neuron's firing rate, but this does not necessarily have to be the case. Binding by synchrony is compatible with very different single-cell encoding schemes: a cell which preferentially responds to temporally structured input patterns, may itself use both a temporal code for representation of features and employ relative timing to express relationships between features. Thus, while at a conceptual level, single-cell and assembly codes may be treated independently, the necessity for coincidence detection by individual neurons or small neural circuits may indicate that these two issues are intimately intertwined at a mechanistic level. These are important issues, and we will therefore refer to three recent reviews on the general topic of temporal coding in the visual system (Bair, 1999; Gawne, 1999; Victor, 1999).
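
The operational definition of synchrony adopted above, an excess of coincident events within a window of a few milliseconds in a cross-correlogram, can be sketched in a few lines. This is a minimal illustration under stated assumptions and not the analysis procedure of the studies reviewed here: the function names, the 1 ms bins, the 2 ms window and the toy spike trains are all hypothetical, and the chance level assumes stationary, independent spike trains and ignores edge effects. Published analyses additionally test significance against suitable controls, a step omitted here.

```python
import math
from collections import Counter


def cross_correlogram(spikes_a, spikes_b, max_lag_ms=50.0, bin_ms=1.0):
    """Count spike pairs per lag bin, with lag = t_b - t_a (times in ms)."""
    counts = Counter()
    for ta in spikes_a:
        for tb in spikes_b:
            lag = tb - ta
            if abs(lag) <= max_lag_ms:
                # Assign the lag to the nearest bin centre.
                counts[math.floor(lag / bin_ms + 0.5)] += 1
    return counts


def coincidence_excess(spikes_a, spikes_b, duration_ms, window_ms=2.0):
    """Return (observed, expected) numbers of spike pairs with |lag| <= window_ms.

    'expected' is the chance level for independent, stationary trains:
    n_a * n_b * (2 * window) / duration (a simplifying assumption).
    """
    observed = sum(1 for ta in spikes_a for tb in spikes_b
                   if abs(tb - ta) <= window_ms)
    expected = len(spikes_a) * len(spikes_b) * 2.0 * window_ms / duration_ms
    return observed, expected


# Toy data: cell A fires every 40 ms for 1 s; one partner follows within
# 1 ms, the other fires half a cycle (20 ms) later.
cell_a = list(range(10, 1000, 40))
cell_b_sync = [t + 1 for t in cell_a]
cell_b_async = [t + 20 for t in cell_a]

if __name__ == "__main__":
    print(coincidence_excess(cell_a, cell_b_sync, 1000.0))
    print(coincidence_excess(cell_a, cell_b_async, 1000.0))
    print(cross_correlogram(cell_a, cell_b_sync)[1])
```

In the toy example the synchronous pair yields far more near-coincident spike pairs than the chance estimate, whereas the half-cycle-shifted pair yields none within the window, although both pairs have identical firing rates.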
Evidence for spike synchronization in the visual cortex

Temporally correlated activity of individual neural pairs within the visual cortex has been investigated in many laboratories starting in the early 1980s (Toyama et al., 1981a,b; Michalski et al., 1983; Ts'o et al., 1986; Volgushev, 1988; Aiple and Krüger, 1988; Hata et al., 1988; Krüger and Aiple, 1988; Ts'o and Gilbert, 1988; Gochin et al., 1991; Hata et al., 1991, 1993; Schwarz and Bolz, 1991; Liu et al., 1992; Roe and Ts'o, 1992; Aarnoutse et al., 1997; Albus et al., 1998; Alonso and Martinez, 1998; Molotchnikoff and Shumikhina, 1998; Shumikhina et al., 1998), most often with a motivation to reveal structural coupling between cells. This approach to
functional anatomy had been methodologically and conceptually outlined by Perkel et al. (1967), Moore et al. (1970), Kirkwood (1979), Aertsen and Gerstein (1985) and Surmeier and Weinberg (1985) and successfully applied to the invertebrate nervous system (Bryant et al., 1973). The occurrence of centered or shifted peaks and troughs in the cross-correlograms was interpreted as evidence for the presence of excitatory or inhibitory connections either directly connecting one cell to the other or providing common input to both recorded cells. The results of these many different studies taken together clearly demonstrated the presence of temporally precise synchronization within the cat and macaque visual cortex. While this conclusion is in general agreement with the temporal binding hypothesis, it remained to be shown that not only individual pairs of cells, but large groups of cells would fire synchronously in response to a visual stimulus.
Local synchronization and gamma oscillations

Evidence has indeed been obtained that many neurons within a column of cat visual cortex can engage in a state of highly synchronous activity in response to an optimally oriented moving light bar (Gray and Singer, 1987; Gray et al., 1989). Oscillatory activity in the gamma frequency range was observed in both multi-unit activity (MUA), i.e. spike sequences elicited by clusters of neighboring cells which were not further subdivided into contributions of individual cells, and local field potential (LFP) signals, which may be thought of as the local EEG signal of a cortical column (Fig. 2, and cf. Schillen et al., 1992). LFP oscillations can only be observed when many neurons fire in synchrony, since otherwise the individual neurons' electric fields would simply cancel out. Furthermore, the occurrence of high frequencies demonstrates that local synchrony is generated with high temporal precision, which is also indicated by the MUA responses. Thus, whole groups of neighboring neurons were shown to discharge synchronously in response to the same visual object, an important conclusion for any concept of assembly coding which could not have been reached by recordings from pairs of individual cells.

Fig. 2. Example of a single trial of a recording from cat area 17 with stimulation by an optimally oriented moving light bar. Recordings were performed with electrodes selective enough to record action potentials, but with impedances low enough to record from several cells simultaneously (multi-unit activity, MUA), shown in lines 2 and 4. Filtering the signal between 1 Hz and 100 Hz yields the local field potential (LFP) which can be thought of as a local EEG, shown in lines 1 and 3. The upper two traces display at a slow time scale that the onset of the MUA response is associated with a change in the frequency composition of the LFP. The lower two traces show the MUA and LFP during peak activity at an expanded time scale. Rhythmic activity in the gamma range (35-45 Hz) can be observed in both LFP and MUA signals. Spike discharges are tightly correlated with the negative phases of the LFP. Upper and lower voltage scales are for the LFP and MUA, respectively; upper and lower time scales refer to upper and lower pairs of LFP and MUA traces. (Adapted from Gray and Singer, 1989.)

Furthermore, this finding is in general agreement with the hypothesis that neighboring cells with similar functional properties are tightly coupled to form what is called a neural group (Edelman, 1987). High-frequency, locally synchronous oscillatory activity has since been observed in many studies of the visual cortex (Freeman and van Dijk, 1987;
Eckhorn et al., 1988, 1993a; Kreiter and Singer, 1992; Frien et al., 1994; Brosch et al., 1995, 1997; Molotchnikoff et al., 1996; Molotchnikoff and Shumikhina, 1996; Fries et al., 1997; Gray and Viana Di Prisco, 1997) and non-cortical visual structures (Ghose and Freeman, 1992; Neuenschwander and Varela, 1993; Neuenschwander and Singer, 1996; Brecht et al., 1998a; Castelo-Branco et al., 1998). In addition to triggering most of these studies, the early finding of locally synchronous high-frequency oscillations had at least four more implications for further research. First, it was an indication for a distinct temporal structuring of neural group activity which could in principle be used for temporal assembly specification. Second, oscillators operating at high frequencies could be the means to establish precise synchronization among spatially remote parts of an assembly, since it is known from theoretical studies that coupled oscillators can show a rich dynamic repertoire (see e.g. Ermentrout and Kopell, 1984; Georgopoulos, 1990; Kopell and Ermentrout, 1990; Sompolinsky and Golomb, 1991; Hansel and Sompolinsky, 1992; Lumer and Huberman, 1992; Tass and Haken, 1996; Hoppensteadt and Izhikevich, 1997; Ramana Reddy et al., 1998). Third, the output of such a synchronous neural group onto any target region could be expected to be especially effective in driving the target cells due to spatial summation. Therefore, such states of local coherence could be expected to be especially favorable for establishing long-range synchronization. Fourth, methodologically the occurrence of local synchrony was taken as justification to use multi-unit activity for studying synchronization between spatially separated recording sites. This way, more spikes could be obtained at any recording site within a given amount of time, which facilitated experimental procedures for testing multiple-stimulus conditions and yet reliably computing cross-correlograms for each of these conditions.
Furthermore, multi-unit recordings have proved to be sensitive tools for detecting synchrony also in the auditory and somatosensory cortex (deCharms and Merzenich, 1996; Roy and Alloway, 1999) and have been shown to be more sensitive than single-unit responses both experimentally (Bedenbaugh and Gerstein, 1997; Roy and Alloway, 1999) and theoretically (Bedenbaugh and Gerstein, 1997).
Long-range synchronization
According to the temporal binding hypothesis, spatially distributed activity is integrated into a coherent cell assembly by synchronous firing of its constituent neurons. Thus, cells recorded from separate electrodes should fire in synchrony when stimulated by the same visual object. Most of the studies on visual cortex functional anatomy mentioned above have been performed with multiple electrodes and thus provided evidence for long-range synchronization. Further evidence was provided by studies directly designed to test this prediction (Eckhorn et al., 1988; Gray et al., 1989, 1990; Nelson et al., 1992; Brosch et al., 1995, 1997; König et al., 1995b; Gabriel and Eckhorn, 1999). Synchrony was found to occur even between recording sites spaced 7 mm apart within area 17 (Gray et al., 1989). Furthermore, a dependence of the establishment of long-range synchronization on local synchronous high-frequency oscillations was observed (König et al., 1995b), indicating that gamma oscillations may be instrumental for long-range assembly formation. Synchronization also occurs between neural groups located in different hemispheres (Engel et al., 1991a; Munk et al., 1995; Nowak et al., 1995). These results confirm a prediction of the temporal binding hypothesis, because the mechanisms for representing a visual object crossing the midline of the visual field, i.e. activating cells in both cortical hemispheres, should be the same as the mechanisms for objects on either side alone. Moreover, long-range interactions have also been shown to exist between different cortical areas (Eckhorn et al., 1988; Engel et al., 1991c; Nelson et al., 1992; Frien et al., 1994; Steriade and Amzica, 1996; Roelfsema et al., 1997; Roe and Ts'o, 1999).
This is especially noteworthy in the case of synchronization between areas with quite different response properties like cat areas 17 and PMLS (Engel et al., 1991c), the latter of which is primarily involved in motion processing while the former engages in the representation of fine spatial structure. Thus, inter-areal synchronization could be the means to integrate different features of a stimulus into a coherent representation. Finally, long-range synchronization has also been observed between cell groups in the visual cortex and the superior colliculus (Brecht et al., 1998b) and the LGN, respectively (Castelo-Branco et al., 1998), again linking quite diverse stimulus selectivities. Taken together, these studies demonstrate the existence and abundance of local synchrony of large neural groups and long-range synchronization between remote cortical columns, either within the same or in different cortical areas, including visual brain structures with quite different response selectivities. Thus, spatially distributed activities can in principle be grouped together.
Stimulus dependence of neural synchrony
Since neurons may belong to the same cell assembly in one stimulus condition and to separate assemblies in another, the temporal binding hypothesis predicts the existence of stimulus-dependent changes of synchrony. It has been shown that temporal correlations between cells within the primary visual cortex can change in a dynamic way depending on elementary aspects of a coherent stimulus and activity levels of the recorded cells (Aertsen et al., 1987, 1989; Aertsen and Gerstein, 1991; Freiwald et al., 1994; König et al., 1995a). Therefore, synchronous firing does not merely reflect the fixed anatomical connectivity. However, a more critical prediction of the temporal binding hypothesis is that the global stimulus configuration, which determines whether two cells belong to the same or to different assemblies, should be the critical factor for the occurrence and strength of synchrony. This prediction was first confirmed for long-range synchronization in cat area 17 (Gray et al., 1989) using the paradigm shown in Fig. 3. In the first stimulus condition two local cell groups were activated by the same object and therefore should belong to the same cell assembly. In agreement with the temporal binding hypothesis, the two cell groups fired in synchrony. In the second stimulus condition, however, both groups were stimulated individually. In this condition, where the two groups are expected to belong to different assemblies, they fired asynchronously. In this experiment, the two Gestalt criteria of contour continuity and common direction of motion are the critical determinants which modify synchronization in a way analogous to human perception of one or two Gestalten, respectively. The same criteria also determined the occurrence of
response synchronization of cell groups with different orientation preferences and overlapping receptive fields (Engel et al., 1991b; Nowak et al., 1995), in different cortical areas (Engel et al., 1991c) and the superior colliculus (Brecht et al., 1999). In all these experiments multi-unit activity was recorded in the anaesthetized cat. Three questions for further research arose out of these findings, which we shall discuss below. First, do pairs of individual neurons behave the same way as pairs of groups of neurons? Second, do the same rules governing synchrony and asynchrony also apply to recordings from awake animals? Third, if large assemblies of neurons dispersed over several square millimeters in the visual cortex fire in synchrony in an oscillatory mode in the gamma frequency range, would it not be possible to observe similar results in EEG experiments in humans as well? The question about the behavior of pairs of individual neurons arises from two considerations. The first is illustrated in Fig. 4. Changes in synchrony between MUA signals need not necessarily be caused by changes of synchronization of individual pairs of cells. The reason is that the composition of cells generating the MUA signal might change from one stimulus condition to another. This change of active cells may account for differences in synchronization if the active populations in one stimulus condition are coupled by horizontal connections and the populations active in the second condition are not. In the case of the long bar experiment discussed before, this purely anatomical explanation rests on two assumptions: the MUA signal can be composed of spikes from cells with opposite directional preference, and cells in different columns with like directional preference are linked into a network of horizontal connections. The first assumption might at first glance seem incompatible with the systematic feature mapping observed in area 17 of the cat.
However, recent research trying to directly access the functional properties of neighboring neurons has stressed the existence of local inhomogeneities (Maldonado and Gray, 1996), thereby complementing the picture obtained by mapping studies. As is shown in Fig. 5, directly neighboring cells in cat area 17 can have opposite directional preferences. Therefore, MUA recordings could easily sample over these diverse populations and results are fraught with the
[Fig. 3 graphic: panels A, B and C; axes: Time [s] (0 to 10) and Spike Time Difference [ms] (-50 to 50).]
Fig. 3. Example of stimulus-dependent long-range synchronization in cat area 17. Multi-unit activity was recorded from two electrodes separated by 7 mm. Cells had similar orientation preferences, and receptive fields, represented here by open rectangles, were co-linearly arranged. Synchronization was determined by use of the cross-correlation technique (Perkel et al., 1967). Cross-correlation histograms (or cross-correlograms) are shown at the bottom in black for the motion directions indicated in the figures on top and in white for the reverse directions. (Cross-correlograms depict the number of spikes elicited by one of the cells, the reference cell, as a function of time difference towards spikes from the second cell, which serve as the trigger events. Synchronous firing is indicated by a central peak, i.e. by a larger number of spikes elicited simultaneously by both neurons compared to other time intervals.) The occurrence of synchrony depended on the global stimulus configuration. The three different stimulus conditions are shown at the top with black bars representing light bars moving across the visual fields as indicated by the arrows. When both cells were activated by a long light bar, strong synchronization between the responses of the two neurons was observed (A), irrespective of the motion direction. When cells were activated by light bars moving in opposite directions, the cross-correlogram was flat, i.e. no signs of synchronization were observed (B). When both light bars moved in the same direction, synchronization of cells was found (C), but was less pronounced than in condition A. Thus, the neurons respond synchronously to a stimulus obeying the Gestalt criteria of continuity and common fate, but not to two incoherent stimuli.
This result is predicted by the temporal binding hypothesis, because the two neurons should belong to the same cell assembly in the first and to two different cell assemblies in the second stimulus setting. (Modified from Gray et al., 1989.)
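The cross-correlation technique described in the caption can be sketched in a few lines of Python. This is a minimal illustration, not the analysis code of Gray et al. (1989): the function name `cross_correlogram`, the 1 ms bin width, the 50 ms lag window and the synthetic spike trains (shared events with 0.5 ms jitter plus independent background spikes) are all assumptions chosen for the example.

```python
import numpy as np

def cross_correlogram(ref_spikes, trig_spikes, max_lag_ms=50.0, bin_ms=1.0):
    """Count reference spikes as a function of time difference to trigger spikes.

    Spike times are in ms. Bins are centred on zero lag, so synchronous
    firing shows up as a peak in the central bin.
    """
    ref = np.sort(np.asarray(ref_spikes, dtype=float))
    n_bins = 2 * int(round(max_lag_ms / bin_ms)) + 1
    edges = (np.arange(n_bins + 1) - n_bins / 2.0) * bin_ms
    counts = np.zeros(n_bins, dtype=np.int64)
    for t in np.asarray(trig_spikes, dtype=float):
        # Only reference spikes inside the lag window can contribute.
        lo = np.searchsorted(ref, t + edges[0], side="left")
        hi = np.searchsorted(ref, t + edges[-1], side="right")
        counts += np.histogram(ref[lo:hi] - t, bins=edges)[0]
    lags = 0.5 * (edges[:-1] + edges[1:])
    return lags, counts

# Synthetic data (assumed parameters, not taken from the study).
rng = np.random.default_rng(0)
duration_ms = 100_000.0
shared = rng.uniform(0.0, duration_ms, 1500)  # events common to both cells

def make_cell(shared_events):
    """Shared events with small timing jitter plus independent background spikes."""
    jittered = shared_events + rng.normal(0.0, 0.5, shared_events.size)
    background = rng.uniform(0.0, duration_ms, 1500)
    return np.concatenate([jittered, background])

cell_1, cell_2 = make_cell(shared), make_cell(shared)
cell_3 = rng.uniform(0.0, duration_ms, 3000)  # independent control cell

lags, coupled = cross_correlogram(cell_1, cell_2)
_, uncoupled = cross_correlogram(cell_1, cell_3)
print("lag of peak (ms):", lags[np.argmax(coupled)])
print("peak / median, coupled:  ", coupled.max() / np.median(coupled))
print("peak / median, uncoupled:", uncoupled.max() / np.median(uncoupled))
```

In this toy setting the coupled pair produces a central peak well above the flat background, while the independent pair yields a flat histogram, mirroring the qualitative difference between conditions A and B.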
problem mentioned above. Thus, in order to demonstrate changes of synchrony, recordings of pairs of individual cells in different columns had to be performed. The second problem in the interpretation of changes of MUA synchronization stems from theoretical arguments showing that changes of synchrony within the local populations generating the MUA signals may lead to changes in the observed strength of synchrony between two MUA recording sites (Bedenbaugh and Gerstein, 1997). The recordings from pairs of individual neurons located in different hypercolumns (Freiwald et al., 1995a) yielded three main findings (Fig. 6). First, the cells can synchronize and desynchronize their responses. Therefore, synchrony or functional connectivity is not a simple by-product of the fixed anatomical connectivity. Second, these changes are stimulus dependent in the way predicted by the temporal binding hypothesis: in the coherent stimulus condition the cells are synchronizing their responses,
while in the incoherent stimulus condition they are firing asynchronously. Third, the strength of this effect is more pronounced than that observed in the MUA recordings. This might be due to the fact that only cells which clearly responded to the stimuli could be used for this experiment, while in MUA recordings some non-specific responses might be included as well. Thus, with respect to the discussion of the two models of Fig. 4, we have to conclude that a purely anatomical model is not sufficient to explain our findings. Rather, individual cells can synchronize their responses in ways determined by the global stimulus configuration as discussed above. This is not to say that the anatomical factors are irrelevant for temporal correlation patterns across the population of neurons within the visual cortex. The probability of observing synchronization between any two cells is indeed influenced by the network of horizontal connections (Ts'o et al., 1986; Ts'o and Gilbert, 1988; Roe and Ts'o, 1999). However, the findings
[Fig. 4 graphic: recording sites 1 and 2; models labeled 'Changing Synchronization' and 'Changing Population'.]
Fig. 4. Two alternative explanations for the experimental findings in Fig. 3. In the upper part, the experimental design and the main findings are sketched: recordings are made from groups of neurons at two locations (1 and 2, only two neurons are shown at each recording site). In the coherent stimulus condition (A), a peak in the cross-correlogram is observed, indicating synchronous firing of the cells. When stimulated incoherently (B), the cells are firing asynchronously as shown by the flat cross-correlogram. The model suggested by the temporal binding hypothesis to explain these findings is shown on the lower left-hand side in a simplified form ('Changing Synchronization'). Each individual cell pair is firing in synchrony in stimulus condition A and without any temporal relationship in condition B. Thus, a true 'change in synchronization' of individual cell pairs is the mechanism underlying the observed phenomena. However, the alternative scenario shown on the lower right also accounts for the finding ('Changing Population'). Two assumptions are made here. The first is that neighboring cells can have opposite direction preferences (indicated by the arrows at the side of the neurons). The second is that cells of like direction preference are connected by horizontal fibers, while cells with dissimilar direction preference are not coupled. Working on these assumptions, we would expect to observe synchronous firing in the first stimulus condition, because the stimulus has a unique direction of motion and is therefore activating cells with like direction preference (indicated by gray shading) whose synchronous firing is solely determined by their anatomical connections. In the second stimulus condition, the two bars have opposite directions of motion and are therefore activating cells with opposite direction preferences at the two recording sites which, by construction of this model, are not coupled and therefore do not fire in synchrony. 
In essence, if this model were true, the observed changes in MUA synchronization would not be true changes of synchrony, but simply changes in the composition of the active population of cells and explicable in purely anatomical terms. The second model is not implausible because horizontal fibers are the likely substrate of long-range synchronization (Engel et al., 1991a, 1992) and because the mechanisms thought to be involved in ontogenetic pruning of these horizontal fibers (Löwel and Singer, 1992) make preferential connections between cells of like directional preference a reasonable assumption, in analogy to the connectivity scheme of orientation columns (Gilbert and Wiesel, 1989). Thus, a direct test is needed to prove that individual pairs of cells can synchronize and desynchronize their responses in relation to the external stimulus condition.
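The 'Changing Population' scenario can be made concrete with a toy simulation. The sketch below is illustrative only and rests on made-up parameters: each recording site holds one rightward- and one leftward-preferring cell, like-preferring cells at the two sites share common input events (standing in for fixed horizontal coupling), and a cell fires only when the bar moves in its preferred direction. The names `coupled_pair` and `peak_ratio` and all rates are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
DURATION_MS = 100_000.0

def coupled_pair(n_shared=500, n_private=500, jitter_ms=0.5):
    """Two spike trains (ms) whose coupling is fixed via shared input events."""
    shared = rng.uniform(0.0, DURATION_MS, n_shared)

    def cell():
        return np.concatenate([
            shared + rng.normal(0.0, jitter_ms, n_shared),
            rng.uniform(0.0, DURATION_MS, n_private),
        ])

    return cell(), cell()

def peak_ratio(site_1, site_2, max_lag_ms=50, bin_ms=1.0):
    """Central cross-correlogram bin divided by the median bin count."""
    n_bins = 2 * int(max_lag_ms / bin_ms) + 1
    edges = (np.arange(n_bins + 1) - n_bins / 2.0) * bin_ms
    diffs = np.subtract.outer(site_1, site_2).ravel()
    counts = np.histogram(diffs, bins=edges)[0]
    return counts[n_bins // 2] / np.median(counts)

# Anatomy is fixed: right-preferring cells are coupled across sites,
# left-preferring cells are coupled across sites, mixed pairs are not.
right_1, right_2 = coupled_pair()
_left_1, left_2 = coupled_pair()

# Condition A: one long bar moving right drives the right-preferring
# cell at both sites, so the MUA at each site comes from coupled cells.
ratio_coherent = peak_ratio(right_1, right_2)

# Condition B: bars moving in opposite directions drive a right cell at
# site 1 and a left cell at site 2, which share no input.
ratio_incoherent = peak_ratio(right_1, left_2)

print("coherent:", ratio_coherent, "incoherent:", ratio_incoherent)
```

Here the measured synchrony between the two sites changes with the stimulus even though no pair of cells changed its coupling, which is why recordings from pairs of single units are needed to separate the two models.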
[Fig. 5 graphic: panel A, scatter plot with x-axis AP-Amplitude [Channel x]; panel B, PSTHs of individual cells plotted against Time [sec].]
presented here show that this connectivity pattern, while setting the framework for possible interactions, does not determine actual synchronization strength.
Synchronization in the primate visual cortex
Synchronization between pairs of cells or cell groups has been found in primate visual cortical areas,
Fig. 5. Example for different directional selectivities of neighboring cells in cat area 17. Recordings were made with the stereotrode recording technique (McNaughton et al., 1983), because this technique allows for more reliable spike sorting of action potentials compared to conventional extracellular recording electrodes. Stereotrodes are double-electrode probes consisting of two insulated metal wires glued together and cut at their tips. Since the distances from the spike generating zone to the two electrode tips usually differ from each other, the amplitudes of the two recorded waveforms also differ. Thus, some positional information on the firing cells is gained which can be used for spike sorting in addition to differences of spike waveforms (see e.g. Abeles and Goldstein, 1977 and Gerstein et al., 1983). In A, a scatter plot of action potential amplitude pairs recorded from a stereotrode is shown. For each action potential, the amplitude recorded on channel y is plotted against the amplitude recorded on channel x. Four clusters can be seen corresponding to responses from four neighboring single units. This result of spike sorting was confirmed using further waveform parameters. In B, the PSTHs of the four cells are plotted on top of each other and the sum over all four units, the MUA, is shown on top of them. Each PSTH is scaled in firing rate [Hz] ranging from zero to the maximal value plotted on the upper right of each diagram. A light bar was oriented according to the orientation preference of the neurons and moved back and forth twice. Cell 1 shows a clear direction preference, since it is only excited by movements to the right and even inhibited during movements to the left. Cell 2 shows the exact opposite preference, and also cell 3 prefers movements to the left over movement to the right, showing a clear directional bias. 
Finally, cell 4 shows only little modulation of its firing rate in relation to the stimulus and so does not exhibit any directional preference. This demonstrates that neighboring cells in cat area 17 can have opposite directional preferences. (Modified from Freiwald et al., 1995b.)
namely V1 and V2 (Bach and Krüger, 1986; Aiple and Krüger, 1988; Krüger and Aiple, 1988; Krüger and Mayer, 1990; Kreiter, 1992; Frien et al., 1994; Livingstone, 1996; Müller et al., 1996; Lamme and Spekreijse, 1998), area V5 (or MT) (Kreiter and Singer, 1992, 1996b; Cardoso de Oliveira et al., 1997) and inferotemporal cortex (Gochin et al., 1991; Freiwald et al., 1998, 1999) and between the LGN and V1 (Usrey et al., 1998). Most of these findings have been obtained from awake behaving animals, thereby showing that synchronization is no side effect of anesthesia. Moreover, synchronization was shown to change in a context-dependent way, i.e. to be stimulus or state dependent, in primate, feline and rodent visual cortex (Bach and Krüger, 1986; Aiple and Krüger, 1988; Krüger and Aiple,
[Fig. 6 graphic: stimulus conditions at the top; panels C to F, cross-correlograms plotted against Spike Time Difference [ms] (-63 to +63); panel G, scatter plot (n = 18) with x-axis Two Bars Condition-NMA [%].]
Fig. 6. Stimulus dependency of inter-columnar synchronization of single-cell discharges. The upper line depicts two stimulus conditions equivalent to those in Fig. 3. In the next two rows, results for two cell pairs are shown for the coherent stimulus condition (left) and the incoherent condition (right): In C/D and E/F, cross-correlograms for two cell pairs are depicted together with Gabor-Gauss functions (thick smooth lines) and the shift predictors (thick lines). The Gabor-Gauss functions are used to quantify synchronization strength. To this end, the normalized modulation amplitude (NMA = normalized central peak height above offset, divided by offset height) is computed. Its value is shown in the upper right of each diagram. Shift predictors indicate the amount of synchronization attributable to stimulus-locked rate modulations. Since they are flat, synchrony must have been generated internally by the neural network. In the first cell pair, a very strong synchronization occurs in the first stimulus condition (left), and no synchronization at all in the second condition (right). The second cell pair shows a weaker effect, but still the strength of synchrony is higher in the coherent condition than in the incoherent. The lowermost diagram compiles the results for all cell pairs which showed a sign of synchronization in at least one stimulus condition (18 cell pairs, 67% of all pairs encountered in this study). Correlation strength in the coherent condition (NMA) is plotted versus correlation strength in the incoherent stimulus condition. Without any exception, all data points are above the diagonal, i.e. were more strongly synchronized in the coherent than in the incoherent stimulus condition. Only three cell pairs showed a residual correlation in the incoherent stimulus condition. (Correlation strength in the incoherent condition was about 5% of that in the coherent condition.) The two examples shown above are indicated as black dots.
Thus, the stimulus dependency of long-range synchronization of individual cells is a very strong and robust phenomenon in cat area 17. (Modified from Freiwald et al., 1995b.)
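The two quantities used in this figure, the shift predictor and the normalized modulation amplitude, can be illustrated as follows. This is a simplified sketch under stated assumptions: instead of fitting a Gabor function as in the study, it takes the offset to be the median of the flank bins and the peak to be the largest of the central bins. The trial structure, firing rates and function names are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(3)
TRIAL_MS, N_TRIALS = 2000.0, 20

def correlogram(a, b, max_lag_ms=63, bin_ms=1.0):
    """Histogram of all pairwise spike time differences a - b (ms)."""
    n_bins = 2 * int(max_lag_ms / bin_ms) + 1
    edges = (np.arange(n_bins + 1) - n_bins / 2.0) * bin_ms
    return np.histogram(np.subtract.outer(a, b).ravel(), bins=edges)[0]

def raw_and_shift(trials_a, trials_b):
    """Raw correlogram (same trial) and shift predictor (next trial)."""
    n = len(trials_a)
    raw = sum(correlogram(trials_a[i], trials_b[i]) for i in range(n))
    shift = sum(correlogram(trials_a[i], trials_b[(i + 1) % n])
                for i in range(n))
    return raw, shift

def nma_percent(counts, n_center=3, flank_gap=10):
    """Crude NMA: (central peak - offset) / offset, in percent.

    Simplification: the offset is the median of bins more than
    `flank_gap` bins away from zero lag, not a fitted parameter.
    """
    mid = len(counts) // 2
    half = n_center // 2
    peak = counts[mid - half: mid + half + 1].max()
    flanks = np.concatenate([counts[:mid - flank_gap],
                             counts[mid + flank_gap + 1:]])
    offset = np.median(flanks)
    return float("nan") if offset == 0 else 100.0 * (peak - offset) / offset

def simulate(n_shared):
    """80 spikes per cell and trial, of which n_shared are common events."""
    trials_a, trials_b = [], []
    for _ in range(N_TRIALS):
        shared = rng.uniform(0.0, TRIAL_MS, n_shared)
        for trials in (trials_a, trials_b):
            trials.append(np.concatenate([
                shared + rng.normal(0.0, 1.0, n_shared),
                rng.uniform(0.0, TRIAL_MS, 80 - n_shared),
            ]))
    return trials_a, trials_b

raw_coh, shift_coh = raw_and_shift(*simulate(40))  # "coherent" condition
raw_inc, _ = raw_and_shift(*simulate(0))           # "incoherent" condition

nma_coherent = nma_percent(raw_coh)
nma_shift = nma_percent(shift_coh)
nma_incoherent = nma_percent(raw_inc)
print(nma_coherent, nma_shift, nma_incoherent)
```

Because the shared events in this simulation are not locked to trial onset, the shift predictor stays flat while the raw correlogram of the coherent condition shows a clear central peak, which is the pattern the caption interprets as internally generated synchrony.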
1988; Krüger and Mayer, 1990; Frien et al., 1994; Kreiter and Singer, 1996b; Livingstone, 1996; Munk et al., 1996; Cardoso de Oliveira et al., 1997; Alonso and Martinez, 1998; van der Togt et al., 1998; Das and Gilbert, 1999; Herculano-Houzel et al., 1999; Lampl et al., 1999) (but see Lamme and Spekreijse, 1998). Therefore, the occurrence of synchrony seems to be more an attribute of the dynamic state of the neural network studied than a mere side effect of anatomical coupling. Two studies (Kreiter and Singer, 1992, 1996b) in which coherent high-frequency oscillations and long-range synchronization were investigated in the motion-sensitive area MT of the awake fixating
macaque monkey yielded several important findings. First, they demonstrated the existence of synchronization of local and distant cell groups and also oscillatory activity in the gamma frequency range in a cortical area outside primary visual cortex in the awake animal, indicating that these phenomena might be of general importance for visual information processing. Second, synchronization was shown to be stimulus dependent (Fig. 7): a given pair of neurons or neural groups which synchronized their responses when activated with a single contour fired independently when stimulated with two different contours. Third, while synchronization differed significantly between these two stimulus
[Fig. 7 graphic: cross-correlograms plotted against Spike Time Difference [ms] (-63 to +63) and PSTHs plotted against Time [s]; scatter plots with x-axes Dual Bar Condition-NMA [%] and Firing Rate in Dual Bar Condition [Hz].]
Fig. 7. Stimulus dependency of synchronization in area MT of the awake, fixating macaque monkey. Conventions are the same as in Fig. 6. The two stimulus conditions are shown in A for the single- and in B for the dual-bar condition. The dot marked 'F' corresponds to the fixation spot. Cross-correlograms and PSTHs corresponding to stimulus conditions A and B are shown in C and D, respectively. Thin vertical lines in the PSTHs mark the beginning and end of the response periods over which cross-correlograms have been computed. The vertical scale bars to the right of the PSTHs correspond to firing rates of 40 Hz. Synchronization is pronounced in the single-bar condition (C) and absent in the dual-bar condition (D). A scatter plot of synchronization strength in the single- vs. the dual-bar condition is shown for 19 MUA pairs in E. In all cases synchronization is considerably stronger in the single-bar than in the dual-bar condition. Firing rates in the two conditions, however, do not differ significantly (F). (Modified from Kreiter and Singer, 1996b.)
conditions, firing rates did not change in a systematic way. Thus, some information about the stimulus is conveyed by the relative timing of spikes which is unavailable from firing rates alone. Fourth, synchronization was shown to be independent of the detailed characteristics of orientation and motion of the coherent stimulus (Fig. 8). Indeed, if the incoherent stimulus condition was turned into a coherent one by removing one of the light bars, synchronization was of comparable strength to that in the original coherent stimulus condition, in which the stimulating light bar had a considerably different orientation and direction of motion. Thus, synchronization was not involved in the representation of some elementary stimulus features which are known to determine firing rates in a systematic way. Fifth, in cases with residual correlation in the incoherent stimulus condition, this was quantitatively explicable by a residual coherence of the stimulus, i.e. both cells responding in part to the same one of the two simultaneously presented bars. Thus, the most parsimonious explanation of the findings is that synchronization signals the coherence of the encoded stimulus configuration according to Gestalt criteria of coherence and common fate, while it was unaffected by physical details of the stimuli.
Synchronous high-frequency oscillatory activity in the human brain
Rhythmic activity in the gamma frequency range (30-80 Hz) has been observed in human brain activity in the auditory, visual and somatosensory modalities and during motor tasks (see e.g. Bressler, 1990, Basar-Eroglu et al., 1996b, Pulvermüller et al., 1997 and Tallon-Baudry and Bertrand, 1999 for review). According to Galambos (1992) gamma activity can be classified into five different forms. Spontaneous gamma waves are not related to any stimulus. Steady-state or driven responses are precisely time-locked to a periodically modulated stimulus (see e.g. Müller, 1997 for review). Emitted gamma band oscillations are time-locked to an expected stimulus which does not occur (Bullock et al., 1994). Transient evoked gamma band responses, like driven responses, are elicited by and precisely time-locked to a stimulus. These responses are characterized by precise phase locking to the stimulus and can therefore be found in trial-averaged data. This stands in sharp contrast to the last category of gamma activity, induced gamma responses. These responses are related to the occurrence of a stimulus, but the latency of
Fig. 8. Synchronization in primate area MT is largely independent of different directions of motion: stimulus configurations are sketched in A, B, and C together with the corresponding cross-correlograms and PSTHs. Conventions as in Fig. 7. (A) Strong synchronization is observed in the single-bar condition arranged to elicit maximal responses at both recording sites. In one block of ten trials (not shown) the NMA equaled 74%, in a repetition (A) NMA equaled 52%. (B) Discharges in the dual-bar condition were almost linearly independent of each other (NMA = 4%). (C) Changing orientation and direction of motion of the single bar to that of one of the dual bars resulted in a synchronization of comparable strength to the single-bar condition in A (NMA = 57%). Since a third single-bar condition (not shown) with the light bar of an intermediate orientation to those in A and C yielded a similar result as well (NMA = 61%), synchronization strength was largely independent of single-bar orientation or direction of motion and of changes of firing rate. Furthermore, this control experiment shows that the reduction of synchronization in the dual-bar condition cannot be attributed to the particular parameters of either of the bars, because each of them presented alone can induce synchronization if responses are elicited at both recording sites. Results for all recording sites tested in single- and 'one of the dual-bars' condition are summarized in D. In most cases, synchronization was of comparable strength and only in a few cases, strong activation of both recording sites resulted in stronger synchronization than sub-optimal activation in the 'one of the dual bars' condition. Thus, the critical determinant for the occurrence of response synchronization is the coherence of the effective visual stimulus.
This control experiment also shows that due to the broad directional tuning of MT neurons, often both light bars in the dual-bar condition were capable of eliciting responses at both sites and not just one recording site. This observation can explain the fact that synchronization was often reduced, but not completely abolished in the incoherent stimulus condition. Neurons at both recording sites would then partly contribute to the representation of the same stimulus and partly to the representation of different stimuli and should for the first reason still exhibit some amount of synchronization, but for the second this should be reduced in strength in comparison to the completely coherent stimulus condition. This hypothesis is tested by determining the correlation between the strength of residual correlation and the amount of coactivation (E). Residual correlation strength is expressed as the ratio of NMA values in the dual- and in the single-bar conditions, and coactivation was quantified as the ratio of the firing rate elicited by the non-optimal of the dual bars alone and the firing rate elicited in the dual-bar condition. Both parameters are positively correlated (r = 0.632, P < 0.005). Thus, in agreement with the logic of the temporal binding hypothesis, residual correlations can at least partly be attributed to coactivation of cells by a single stimulus. (Modified from Kreiter and Singer, 1996b.)
oscillatory bursts jitters from trial to trial, similar to high-frequency oscillations found in cat visual cortex (Eckhorn et al., 1990). Thus, in the visual modality induced gamma is most relevant for concepts of synchronous assembly formation, because any oscillatory activity which can be recorded at the scalp must have been brought about by large groups of synchronously firing cells.
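The practical difference between evoked and induced gamma responses can be shown with a small simulation. The sketch below is only an illustration under assumed parameters (a 40 Hz burst with a Gaussian envelope, white noise, 100 trials, latency jitter of up to 50 ms in either direction); none of these values come from the studies cited. It compares gamma-band power of the trial average with the average of single-trial gamma-band power.

```python
import numpy as np

rng = np.random.default_rng(2)
FS = 1000                           # sampling rate in Hz (assumed)
t = np.arange(0.0, 1.0, 1.0 / FS)   # one-second trial
F_GAMMA = 40.0                      # burst frequency in Hz (assumed)

def burst(latency_s):
    """Gamma burst whose envelope and phase both shift with its latency."""
    envelope = np.exp(-0.5 * ((t - latency_s) / 0.05) ** 2)
    return envelope * np.cos(2 * np.pi * F_GAMMA * (t - latency_s))

def simulate(n_trials, latency_jitter_s):
    """Trials x samples array: one burst per trial plus white noise."""
    latencies = 0.3 + rng.uniform(-latency_jitter_s, latency_jitter_s,
                                  n_trials)
    signal = np.array([burst(lat) for lat in latencies])
    return signal + rng.normal(0.0, 0.5, signal.shape)

def gamma_power(x):
    """Mean spectral power between 30 and 50 Hz along the last axis."""
    spectrum = np.abs(np.fft.rfft(x, axis=-1)) ** 2
    freqs = np.fft.rfftfreq(x.shape[-1], d=1.0 / FS)
    band = (freqs >= 30.0) & (freqs <= 50.0)
    return spectrum[..., band].mean(axis=-1)

ratio = {}
for name, jitter in [("evoked", 0.0), ("induced", 0.05)]:
    trials = simulate(100, jitter)
    power_of_average = gamma_power(trials.mean(axis=0))
    average_of_power = gamma_power(trials).mean()
    # Close to 1: activity survives trial averaging (phase-locked).
    # Close to 0: activity cancels in the average (latency jitter).
    ratio[name] = power_of_average / average_of_power
    print(name, ratio[name])
```

With jittered latency the bursts largely cancel in the trial average although every single trial contains them, which is why induced responses have to be quantified on single trials before averaging.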
High-frequency oscillations in human EEG and the temporal binding hypothesis
While gamma band responses have been observed in the human brain for more than forty years (Sem-Jacobsen et al., 1956; Chatrian et al., 1960; Perez-Borja et al., 1961), the findings of synchronous high-frequency oscillations in other mammalian species,
[Fig. 8 graphic: panels A to C, each with a cross-correlogram plotted against Spike Time Difference [ms] (-63 to +63) and a PSTH plotted against Time [s]; panel D (n = 17), x-axis Single Bar Condition-NMA [%]; panel E (n = 16), x-axis Coactivation Index.]
which we have discussed above, stimulated further research. According to the temporal binding hypothesis, induced gamma activity could be expected to be a reflection of a binding mechanism. This hypothesis would predict an enhancement of induced gamma when a coherent percept is created. This was tested directly with the 'long bar' stimulus paradigm which had been introduced by Gray et al. (1989) for animal studies (Müller et al., 1996; Fig. 9). In the coherent stimulus configuration, an enhancement of induced gamma activity relative to the baseline condition was observed, indicating synchronization of oscillatory activity in large groups of neurons, while no increase was found in the incoherent configuration. This latter finding is expected if two bars activated two cell ensembles without any specific temporal relationships between them. In addition, it was shown that lower-frequency activity in the alpha range did not differ significantly between the two stimulus conditions, establishing the functional specificity of these high-frequency responses. This experiment was later replicated with essentially the same results in a different laboratory (Müller et al., 1997). A similar design was used in another study (Lutzenberger et al., 1995) which employed irregularly and regularly spaced series of light bars moving incoherently or coherently between neighboring quadrants of the visual field. Increases in gamma activity were observed which followed the retinotopic map of human primary and secondary visual cortex. Thus, evidence for stimulus-dependent synchronous assembly formation has been obtained with moving bar stimuli from the level of single pairs of cells (Freiwald et al., 1995a; Kreiter and Singer, 1996b), over MUA and LFP (Gray et al., 1989; Engel et al., 1991a,b,c; Kreiter and Singer, 1996b; Livingstone, 1996) up to scalp recordings (Lutzenberger et al., 1995; Müller et al., 1996, 1997) in the visual cortex of four mammalian species.
These studies of induced gamma band responses have been extended in several ways in order to test possible relationships with perceptual states and to study attentional and emotional effects. The paradigm shown in Fig. 9 has been modified to study effects of spatial attention on gamma power (Müller, 1997; Gruber et al., 1999). An increase in power of stimulus-induced gamma activity was observed in conjunction with a topographical shift and focusing of the response to the hemisphere contralateral to the attended hemifield (Gruber et al., 1999). Thus, an active involvement of subjects in the task, in this case by spatial attention, might lead to a more stable synchronous ensemble of more rhythmically firing neurons, which can therefore be more easily observed with scalp recordings. Because these findings parallel results from fMRI experiments reporting increases in metabolic activity in the MT/MST complex upon attention (Tootell et al., 1998), increases in synchronous high-frequency activity may accompany increases in activity in these cortical areas. However, in a different study more complex effects of attention on gamma band activity have been observed (Shibata et al., 1999). Stimulus dependence of induced gamma activity was directly tested under conditions of active perception using a visual discrimination task with static figures (Tallon-Baudry et al., 1996; Fig. 10). In this experimental design, perceptual objects composed of subjective contours (illusory figures of Kanizsa, 1979) were generated by small changes in the orientation of notched circles. An occipital enhancement of induced gamma activity (30-50 Hz) was found in conditions of coherent figure perception. Since substantial increases in fMRI signals under comparable conditions have been observed (Goebel et al., 1999; Mendola et al., 1999), it can again be suggested that induced gamma activity might be a correlate of active cell ensembles. As in the case of moving bars, behavioral demand seems to be an important factor shaping synchronous ensemble activity. The results of Tallon-Baudry et al. (1996) could be reproduced in a different laboratory (Tallon-Baudry et al., 1997b) and in intracranial recordings in humans (Lachaux et al., 1998), but not under conditions requiring less active participation of the subjects (Herrmann et al., 1999). In this last study, gamma activity was found in the time window reported by Tallon-Baudry et al.
(1996), but this gamma activity was of the evoked type and had less energy. An occipital increase in gamma power was also observed during perception of a 3-D figure in stereograms relative to perception of infusible random-dot auto-stereograms, in which no object was visible (Revonsuo et al., 1997). In both experimental conditions, gamma power varied in relation to perception, while no systematic variations of power were observed in other frequency bands. Gamma power in the free fusion experiment was largest at intervals immediately preceding the perception of the virtual 3-D object. Similarly, frontal gamma band enhancement has been found during phases of switching in bi-stable perception (Basar-Eroglu et al., 1996a). In all these experiments, physical changes of the stimuli were minimal or absent, yet they generated different perceptual states which were accompanied or predicted by enhanced EEG power in the gamma frequency range. This dissociation of physical stimulus characteristics and perception was taken one step further in an experiment in which subjects were viewing the same stimulus material (composed of black blobs on a gray background) but, due to different instructions, were either unaware of a hidden shape (a Dalmatian dog) or selectively searching for it (Tallon-Baudry et al., 1997a,b). A strong increase in induced gamma activity was found during active search compared to EEG activity in naive subjects, irrespective of whether the stimulus contained the Dalmatian dog or not. Since this increase in induced gamma power has the same time course as that in the Kanizsa triangle experiment discussed above, and is considerably larger in amplitude, induced gamma activity has been suggested as a possible correlate of an internal representation that is activated in order to guide active search for an expected figure (Tallon-Baudry et al., 1998; Tallon-Baudry and Bertrand, 1999). In agreement with this hypothesis, induced gamma (and beta) activity has been found in the delay period of a delayed match-to-sample task (Tallon-Baudry et al., 1998, 1999), during which an internal representation needs to be activated as well. In these cases of combined top-down and bottom-up processing, the topography of gamma activity significantly differed from that in purely perceptual tasks. This agrees with an essential characteristic of ensemble coding, the option to establish associations between different parts of the cortex beyond mere bottom-up stimulus representation. Induced gamma band responses have also been found during processing of visual stimuli with strong emotional content (Müller et al., 1999). It was shown that the topography of gamma band responses changes in relation to the emotional valence of the stimulus, again indicating that cells differently distributed in the brain have been bound into a synchronous ensemble. Topographic differences in gamma band activity have also been observed during perception of drawn face stimuli with different emotional expressions (Keil et al., 1999).

[Fig. 9 panel labels: (A-D) long bar stimulus; ipsilateral electrodes P3 + T5 + O1; contralateral electrodes P4 + T6 + O2.]

Fig. 9. Stimulus dependence of induced gamma activity in the human brain. Subjects had to fixate a central fixation spot while either one light bar or two light bars were moving in the left visual hemifield. These two conditions will be referred to as the coherent and the incoherent motion condition. Bars were presented statically prior to motion onset. EEG was recorded from electrodes P3, PZ, P4, O1, OZ, O2, T5, and T6 of the 10/20 system. To follow temporal changes of induced oscillatory activity, time-frequency representations of the signals were computed. These time-resolved power spectra (so-called 'evolutionary spectra'; Priestley, 1988), estimated by the discrete Gabor transform (Qian and Chen, 1993), are shown as grand averages over seven subjects for the sum over posterior derivations P3/4, T5/6 and O1/2, respectively, corrected by the baseline responses to the static stimuli. The left column depicts recordings from the left hemisphere, the right column from the right hemisphere; the upper row shows responses to the coherent motion condition, and the lower row to the incoherent motion condition. Power is plotted at logarithmic scale along the vertical axis as a function of frequency and time. Since the stimuli were presented in the left hemifield, a change in gamma activity is only expected above the right hemisphere for anatomical reasons and only in the coherent stimulus condition, because the signals generated by two assemblies within very similar cortical regions of the same hemisphere would cancel on the scalp. Results confirm these predictions. An increase in gamma activity is only observed over occipital electrodes of the right hemisphere in response to the coherent stimulus (upper right). This result was specific to the gamma frequency band, because activity in the alpha band in the right hemisphere was modulated similarly in both stimulus conditions. Thus, the increase in gamma power cannot be an artifactual change in harmonics due to some change in power of a lower-frequency band. Taken together, this result is in agreement with the temporal binding hypothesis. (Modified from Müller et al., 1996.)
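The distinction between evoked (phase-locked) and induced (non-phase-locked) gamma activity that runs through the studies above can be illustrated with a small numerical sketch. It is not the analysis pipeline of any cited study: the wavelet width (`n_cycles`), the sampling rate, the synthetic 40 Hz bursts and all function names are assumptions chosen for illustration. Power of the trial average captures only phase-locked activity, whereas the average of single-trial power also captures bursts whose phase varies from trial to trial.

```python
import numpy as np

def morlet_power(signal, fs, freq, n_cycles=7.0):
    """Power time course of `signal` at `freq` (Hz) from a complex Morlet wavelet."""
    sigma_t = n_cycles / (2.0 * np.pi * freq)         # s.d. of the Gaussian envelope (s)
    t = np.arange(-4.0 * sigma_t, 4.0 * sigma_t, 1.0 / fs)
    wavelet = np.exp(2j * np.pi * freq * t) * np.exp(-t**2 / (2.0 * sigma_t**2))
    wavelet /= np.sqrt(np.sum(np.abs(wavelet) ** 2))  # normalize to unit energy
    return np.abs(np.convolve(signal, wavelet, mode="same")) ** 2

def evoked_and_total_power(trials, fs, freq):
    """trials: array (n_trials, n_samples). Returns (evoked, total) power time courses."""
    evoked = morlet_power(trials.mean(axis=0), fs, freq)                     # power of the average
    total = np.mean([morlet_power(tr, fs, freq) for tr in trials], axis=0)  # average of the power
    return evoked, total

# Synthetic data (all parameters are hypothetical): a 40 Hz burst centered at 0.5 s.
rng = np.random.default_rng(0)
fs, n_trials = 500.0, 60
t = np.arange(0.0, 1.0, 1.0 / fs)
burst = np.exp(-(t - 0.5) ** 2 / (2.0 * 0.05**2))

def noise():
    return 0.1 * rng.standard_normal(t.size)

# Same phase on every trial (evoked-like) versus random phase per trial (induced-like).
locked = np.array([burst * np.cos(2 * np.pi * 40 * t) + noise()
                   for _ in range(n_trials)])
jittered = np.array([burst * np.cos(2 * np.pi * 40 * t + rng.uniform(0, 2 * np.pi)) + noise()
                     for _ in range(n_trials)])

ev_locked, tot_locked = evoked_and_total_power(locked, fs, 40.0)
ev_jit, tot_jit = evoked_and_total_power(jittered, fs, 40.0)
```

In this sketch the phase-locked burst survives averaging, so its evoked power is close to its total power, while the phase-jittered burst largely cancels in the average and shows up only in the total power. One common, but not the only, estimate of induced power is the difference between total and evoked power, usually expressed relative to a prestimulus baseline.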
[Fig. 10 panel label: real triangle (target).]

Fig. 10. Stimulus dependence of induced gamma band activity. (A) In this visual discrimination task subjects were asked to silently count the number of occurrences of the target stimulus, a curved illusory Kanizsa triangle. Stimulus dependence was studied by comparing responses to coherent stimuli, illusory and real triangle, with responses to the incoherent stimulus ('no-triangle' stimulus). Note that the physical differences between 'illusory triangle' and 'no-triangle' stimulus are rather small, but lead to pronounced perceptual differences. Second, due to the similarity of illusory triangle and target stimulus, this task is not easily performed under conditions of foveation of the fixation cross. Therefore, this task ensured that subjects remained attentive throughout the experiment and went through perceptual states dominated by either one coherent object or three individual objects. (B) Results are presented as time-frequency plots of EEG power (at electrode Cz, grand mean over all subjects). These representations were obtained by means of the continuous Morlet's wavelet transform (Kronlandt-Martinet et al., 1987; Grossmann et al., 1989; Bertrand et al., 1996), which provides a better compromise between time and frequency resolution than time-varying spectra. Results are shown for the coherent stimulus conditions at the top and the incoherent condition at the bottom, with the prestimulus interval power taken as baseline level for each frequency. A first increase in gamma activity (~40 Hz) around 100 ms after stimulus onset occurred in both stimulus conditions, phase-locked to the stimulus onset. A second gamma peak (30-60 Hz) occurred around 280 ms after stimulus onset and was much stronger in the coherent than in the incoherent stimulus condition. Unlike the first gamma peak, the second one was not phase-locked to the stimulus. Thus, the occurrence of induced gamma activity differentiates between stimulus conditions. Since small differences in the orientation of the three 'packmen' are rather unlikely to bring about changes in overall EEG signals, it can be concluded that induced gamma activity is a likely neural correlate of object perception. (Modified from Tallon-Baudry et al., 1996.)

Long-range synchronization in human EEG

These studies on induced gamma band activity in human scalp recordings have provided sound evidence for synchronization of large neural populations in human cortex. This line of research has recently been extended by studies demonstrating that the occurrence of induced gamma activity is accompanied by long-range synchronization (Bruns et al., 1999; Miltner et al., 1999; Rodriguez et al., 1999). Induced gamma activity was observed in a face detection task (Rodriguez et al., 1999; Fig. 11) with a time course compatible with studies on visual perception of simpler shapes discussed above. In addition, a later gamma peak was described. The first gamma peak was accompanied by frequency-specific, zero-phase inter-electrode synchronization in the gamma band. This is followed by a phase of anti-synchronization, before the system settles into a topographically different, stimulus-independent synchronization pattern in preparation of the motor response. In conditions where no face is being perceived, the initial gamma increase is smaller, but the more distinguishing feature with respect to the condition of face perception is the lack of initial synchronization and following anti-synchronization. Therefore, it has been suggested that transitions between different synchronous cell assemblies active during cognitive acts (here: face perception and motor responses) are characterized by phases of de- or anti-synchronization allowing for reformation of the new assembly (Varela, 1995; Rodriguez et al., 1999). Furthermore, since the physical differences between the two stimulus conditions are rather small in this task ('Mooney' faces are rotated), gamma-phase synchronization patterns are most likely correlates of perception rather than of stimulus-driven responses. Thus, the link between gamma activity and long-range synchronization that had been established in a previous animal study (König et al., 1995b) seems to hold also for the human brain. Further support for the hypothesis that synchronous gamma band activity in the human EEG is a correlate of cortical assemblies has been obtained in a visuo-tactile classical conditioning task (Miltner
et al., 1999), where it was assumed, along the lines of Hebbian theory (Hebb, 1949), that the association between a visual and a tactile stimulus should be represented by a cell assembly comprising cell groups representing either stimulus. The authors found a specific increase in gamma coherence between visual and somatosensory cortex representing the conditioning stimulus and the stimulated finger. Topographic changes of coherence upon reversal of the conditioned hand and disappearance of gamma coherence with extinction provide evidence for a close relation between the coherence measure and behavior. These results resemble earlier findings in animal studies (Bressler et al., 1993; Roelfsema et al., 1997), even though differences exist with respect to the precision of the coherence band of long-range inter-areal interactions. In the latter study zero-phase synchronization of local field potential signals was found between visual and parietal as well as parietal and motor cortex of cats attending to a visual stimulus and reacting with a motor response. Synchrony disappeared, however, during reward and inter-trial periods. A direct correlation between conscious perception in humans and synchronous activity among large populations of widely distributed neural groups has recently been reported in MEG studies of binocular rivalry (Tononi et al., 1998; Srinivasan et al., 1999). In this paradigm each eye is presented with a different stimulus, leading to random transitions between states of awareness of either the left or the right eye stimulus. To disentangle neuromagnetic responses elicited by the two eyes, the authors used a frequency tagging method by flickering the two stimuli at different frequencies. Neuromagnetic responses are strongly modulated at the respective flickering frequencies. It was shown that coherence strength between the steady-state responses from different recording sites increased selectively for responses to the stimulus the subjects reported seeing. Thus, changes of perception during unchanged stimulus conditions were reflected in systematic changes of complex coherence patterns throughout the cortex. Similar observations have been made in multi-electrode recordings from the early visual cortex of non-amblyopic strabismic cats (Fries et al., 1997). Perception in these animals alternates between the two eyes. In this study, oscillatory responses in the gamma range showed increased synchronization when evoked by the eye that conveyed the perceived stimulus, while the reverse was true for responses to the suppressed eye. No apparent rate changes were observed. Thus, selection for further processing seems to have been achieved by modulations of synchronization strength. It needs to be considered, however, that in the MEG studies on rivalry the flicker frequencies were around 10 Hz and therefore steady-state responses oscillated in a different frequency range than the internally generated high-frequency rhythms recorded in the awake cat. For this reason it remains to be seen which interaction patterns emerge at higher frequencies, and how findings in cats and humans can be integrated with observations from single-unit recordings in awake monkeys during binocular rivalry (Logothetis, 1999). The latter demonstrate only weak and unsystematic rate changes in early visual areas (Leopold and Logothetis, 1996), but substantial and perception-related changes in the inferior temporal cortex (Sheinberg and Logothetis, 1997). A recent EEG study demonstrating amplitude modulations of steady-state visual evoked potentials depending on the direction of spatial attention (Müller et al., 1998) already used flicker frequencies in the lower gamma range (20-30 Hz) and demonstrated that changes in these potentials correlate with perception.

Fig. 11. Changes of induced gamma power and long-range synchronization of human EEG during face perception. Examples show 'Mooney' faces used as stimuli in this study (Rodriguez et al., 1999). These are high-luminance-contrast human faces presented upright or turned upside down. In the latter condition, faces are difficult to recognize. Below, average scalp distributions of gamma activity and synchrony are shown in the two behavioral conditions (columns) for four consecutive time slices (rows) of 180 ms duration, each comprising the time between stimulus onset and motor response. Frequency-specific synchronization between electrode pairs was evaluated by a method (Lachaux et al., 1999) which measures the degree of phase-locking of two signals after the frequency components of interest have been enhanced by Gabor wavelet transforms. Black and grey lines between recording sites correspond to significant increases or decreases in synchrony, respectively. Gamma power is shown at grey scale and is larger in the second time interval during face perception. At the same time, phase synchrony of opposite signs is observed in the two perceptual states. During face perception, this is followed by a phase of strong anti-synchronization, settling into a pattern of positive correlation again in preparation of the motor response, which is very similar in the 'no perception' condition. (Modified from Rodriguez et al., 1999.)
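The phase-locking measure used to assess long-range synchronization between electrode pairs (Lachaux et al., 1999) can be sketched in a few lines. What follows is an illustration under stated assumptions, not the published implementation: the wavelet width, the synthetic signals and the function names are hypothetical, and the surrogate-based significance test that decides whether synchrony has increased or decreased is omitted. The idea is to extract the instantaneous phase of each signal at the frequency of interest with a complex Gabor-type wavelet, and then to measure, at each time point, how consistent the phase difference between two signals is across trials.

```python
import numpy as np

def band_phase(trials, fs, freq, n_cycles=7.0):
    """Instantaneous phase at `freq` (Hz) for each trial (rows of `trials`)."""
    sigma_t = n_cycles / (2.0 * np.pi * freq)   # s.d. of the Gaussian envelope (s)
    t = np.arange(-4.0 * sigma_t, 4.0 * sigma_t, 1.0 / fs)
    wavelet = np.exp(2j * np.pi * freq * t) * np.exp(-t**2 / (2.0 * sigma_t**2))
    return np.angle([np.convolve(tr, wavelet, mode="same") for tr in trials])

def phase_locking_value(x_trials, y_trials, fs, freq):
    """PLV(t) = |mean over trials of exp(i * (phase_x - phase_y))|, between 0 and 1."""
    dphi = band_phase(x_trials, fs, freq) - band_phase(y_trials, fs, freq)
    return np.abs(np.mean(np.exp(1j * dphi), axis=0))

# Synthetic example (all parameters are hypothetical).
rng = np.random.default_rng(1)
fs, n_trials, f0 = 500.0, 100, 40.0
t = np.arange(0.0, 1.0, 1.0 / fs)
phi = rng.uniform(0, 2 * np.pi, n_trials)[:, None]   # trial-specific phase of x and y
psi = rng.uniform(0, 2 * np.pi, n_trials)[:, None]   # independent phase of z

def noise():
    return 0.5 * rng.standard_normal((n_trials, t.size))

x = np.cos(2 * np.pi * f0 * t + phi) + noise()
y = np.cos(2 * np.pi * f0 * t + phi + np.pi / 4) + noise()  # constant lag relative to x
z = np.cos(2 * np.pi * f0 * t + psi) + noise()              # phase unrelated to x

plv_xy = phase_locking_value(x, y, fs, f0)
plv_xz = phase_locking_value(x, z, fs, f0)
mid = t.size // 2
```

In this example `x` and `y` have a phase that varies from trial to trial, so neither is phase-locked to the stimulus, yet their phase difference is constant and the phase-locking value is close to 1. For `x` and `z` the phase difference is random across trials and the value stays near the chance level expected for the number of trials. Note that the measure is insensitive to amplitude, which is why it can dissociate synchrony from changes in gamma power.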
Concluding remarks

In this review, we have presented experimental findings demonstrating that distributed neural populations in the visual cortex process information in a cooperative way. Evidence has accumulated that temporal relations among the responses of concurrently activated neurons convey information which is unavailable from the single-cell responses alone. Furthermore, a large body of data is compatible with the hypothesis that visual objects are represented by assemblies of synchronously firing neurons. The experimental approaches used in this context range from the level of multiple single-cell to EEG and MEG recordings and from anaesthetized animal preparations to human experiments with active cognitive tasks. Therefore, we may state as our main conclusion that the strongest evidence in favor of the temporal binding hypothesis is its capacity to explain these diverse sets of results in a most parsimonious way. The central claim of the temporal binding hypothesis is that relations between features in visual space are represented by synchronization of neural responses, but the concept neither specifies what the elementary features are which are used as building blocks for ensemble formation nor how they are encoded. Therefore, the synchronous assembly concept is compatible with the existence of very different levels of RF complexity along the visual pathways as well as the non-classical (Allman et al., 1985; Sillito et al., 1995; Gilbert, 1998), dynamic (Eckhorn et al., 1993b; DeAngelis et al., 1995; Ringach et al., 1997; Cottaris and De Valois, 1998; Sugase et al., 1999) and state-dependent (Wörgötter et al., 1998) properties of receptive fields (Singer and Phillips, 1997; Eckhorn, 1999b). The temporal binding hypothesis is also compatible with different strategies of how elementary pieces of information are encoded. The simplest and most widespread assumption is that features are encoded by the average firing rates of small neural populations. However, more complex coding schemes are likely to prevail in the visual cortex. Evidence for the relevance of precise spike timing of single-cell responses for information processing in the visual cortex (Richmond et al., 1990; Buracas et al., 1998; Mechler et al., 1998; Bair, 1999; Gawne, 1999) casts doubt on average firing rate as the dominant code used in neuronal processing. It has further been argued that a coding strategy based on synchronous groups of coincidence-detecting neurons can explain the high timing precision of cells observed in area MT (Bair and Koch, 1996; Buracas and Albright, 1999). Thus, temporal coding at the single-cell level and synchronous population activity are not only compatible at a conceptual level, but may rely on the same or similar mechanisms of synchronization which, in addition, have been shown to contribute to mechanisms structuring receptive fields (Reid and Alonso, 1995; Alonso and Martinez, 1998). The interesting possibility exists that these processes are intimately related, because synchrony at a lower processing level will influence the saliency of responses in distributed feedforward pathways (Abeles, 1991). New experiments are needed to further explore these issues.
Abbreviations

LFP  Local field potential
MUA  Multi-unit activity
NMA  Normalized modulation amplitude
RF   Receptive field
Acknowledgements

The authors are grateful to Christian Eurich, Matthias M. Müller and Eugenio Rodriguez for critically reading the manuscript. We thank Matthias M. Müller, Catherine Tallon-Baudry, and Eugenio Rodriguez for supply of figure material and Renate Ruhl-Völsing for the preparation of graphics. Help during literature search by Sunita Mandon, Sabine Melchert, and Detlef Wegener is gratefully acknowledged. This work was supported by HFSP Grant RG-20/95 B, 'Oscillatory Event-Related Brain Dynamics', and SFB 517, 'Neurocognition'.
References Aarnoutse, E.J., Lee, B., Gfidicke, R. and Albus, K. (1997) The dependency of correlated neuronal tiring in the visual cortex on stimulus orientation and movement direction. Soc. Neurosci. Abstr. 24: (Abstr.).
Abbott. L.F. and Dayan, R (1999) The effect of correlated vanability on the accuracy of a population code. Neural Compur.. 11:91-101 Abeles, M. (1982a) Local Cortical Circuits. An Electrophysiological Study. Springer. Berlin. Abeles. M. (1982b) Role of the cortical neuron: integrator or coincidence detector. Isr. J. Med. Sci.. 18: 83-92. Abeles. M. (1991) Corticonics. Cambridge University Press. Cambridge. Abeles. M. and Goldstein. M.H. (1977) Multiple spike train analysis. IEEE. 65: 762-773. Aertsen. A. and Arndt, M. (1993) Response synchronization in the visual cortex. Curr.. Opin. NeurobioL, 3: 586-594. Aertsen. A.. Gerstein. G. and Johannesma. E (1986) From neuron to assembly: neuronal organization and stimulus representation. In: G. Palm and A. Aertsen (Eds.), Brain Theory. Springer, Berlin, pp. 7-24. Aertsen, A., Bonhoeffer, T. and Krtiger, J. (1987) Coherent activity in neuronal populations: analysis and interpretation. In: E.R. Caianiello (Ed.), PhySics of Cognitive Processes. World Scientific Publishing, Singapore, pp. 1-34. Aertsen, A.M.H.J. and Gerstein, G.L. (1985) Evaluation of neuronal connectivity: sensitivity of cross-correlation. Brain Res.. 340: 341-354. Aertsen, A.M.H.J. and Gerstein, G.L. (1991) Dynamic aspects of neuronal cooperativity: fast stimulus-locked modulations of effective connectivity. In: J. Krtiger (Ed.), Neuronal Cooperativity. Springer, Berlin, pp. 52-67. Aertsen, A.M.H.J., Gerstein. G.L., Habib. M.K. mad Palm, G. (1989) Dynamics of neuronal firing correlation! modulation of 'effective connectivity'. J. Neurophysiol.. 61: 900-917. Aiple, E and Kriiger, J. (1988) Neuronal synchrony in monkey striate cortex: interocular signal flow and dependency on spike rates. Exp. Brain Res., 72: 141-149. Albus, K., Aarnoutse, E.J. and G~idicke, R. (1998) A light stimulus induces either correlation or decorrelation of neuronal tiring in the visual cortex depending on the state of the local EEG. Soc. Neurosci. Abstr.. 24: (Abstr.I. 
Allman, J., Miezin, F. and McGuinness, E.L. (1985) Stimulus specific responses from beyond the classical receptive field: neurophysiological mechanisms for local-global comparisons in visual neurons. Annu. Rev. Neurosci., 8: 407-430.
Alonso, J.-M. and Martinez, L.M. (1998) Functional connectivity between simple cells and complex cells in cat striate cortex. Nat. Neurosci., 1: 395-403.
Alonso, J.-M., Usrey, W.M. and Reid, R.C. (1996) Precisely correlated firing in cells of the lateral geniculate nucleus. Nature, 383: 815-819.
Azouz, R. and Gray, C.M. (1999) Cellular mechanisms contributing to response variability of cortical neurons in vivo. J. Neurosci., 19: 2209-2223.
Bach, M. and Krüger, J. (1986) Correlated neuronal variability in monkey visual cortex revealed by a multi-microelectrode. Exp. Brain Res., 61: 451-456.
Bair, W. (1999) Spike timing in the mammalian visual system. Curr. Opin. Neurobiol., 9: 447-453.
Bair, W. and Koch, C. (1996) Temporal precision of spike trains in extrastriate cortex of the behaving monkey. Neural Comput., 8: 44-66.
slgnificat~ce. Int. J. Psychophysiol., 24:101-112. Bedenbaugh, R and Gerste~n, G.L. (1997) Multiunit normalized Cross'correiafion d~ffers from the average single-unit normalized 'correlation. Neural Comput., 9: 1265-1275. Bertrand, O., Tailon-Baudry. C. and Pernier. J. (1996) Time frequency analysis of oscillatory y-band activity: wavelet approach and phase locking estimation. In: C.C. Wood et al. (Eds.), Biomag96: Advances in Biomagnetism Research. Springer, New York. Braltenberg, V. (1978) Cell assemblies in the cerebral cortex. In: R. Helm mad G, Palm (Eds.), Architectonics of the Cerebral Cortex. Lecture Notes in Biomathematics. Springer, Berlin, pp. 171-188. Brecht. M., NeuenschWander. S.. Nase. G., Singer. W. and Engel, A.K. (1998a) Correlation patterns of visual activity in the vertebrate rectum: a comparative study. Soc. Neurosci. Abstr., 24: (Abstr.). Brecht, M., Singer, W. and Engel, A.K t1998b) Correlation analysis of corticotectal interactions in the cat visual system. J. Neurophysiol., 79[ 2394-2407. Brecht, M.. Singer, W. and Engel, A.K. (1999) Patterns of synchronization in the superior colliculus of anesthetized cats. J. Neurosci.. 19: 3567-3579. Bressler, S.L. (1990) The gamma wave: a cortical information carrier? Trends Neurosei.. 13: 161-162. Bressler, S.L., Coppo!a, R. and Nakamura, R. (1993) Episodic multiregi0nal Coherence at multiple frequencies during visual task pe;'formance. Nature. 366: 153-156. Britten, K.H., Shadlen, M.N., Newsome. W.T. and Movshon. J.A. (1992) The analysis of visual motion: A comparison of neuronal and psychophysical performance. J. Neurosei.. 12: 4745-4765. Brosch. M.. Bauer. R. and Eckhorn, R. (1995) Synchronous high-frequency oscillations in cat area 18: Eur. J. Neurosci.. 7: 86-95. Brosch. M., Bauer. R. and Eckhorn, R. (1997) Stimulus-dependent modulations o f correlated high-frequency oscillations in cat visual cortex. Cereb. Cortex. 7: 70-76. Brown, E.N., Prank. L.M., Tang, D, Quirk, M.C. and Wilson. M.A. 
(t998) A statistical paradigm for neural spike train decoding applied to position prediction from ensemble firing patterns of rat hi.ppocampal place ceils. J. Neurosci.. 18:7411 7425. Bruns. A.. Ecldaorn. R.. Jokeit, H. and Ebner, A. (1999) Suhdural high-frequency signals in humans reflect cognitive processes,
Task- and event-related changes of conventional and novel coupling measures. GOttingen Neurobiol. Rep. 1999, 1I: 488 (Abstr.). Bryant. J.. Marcos. A.R. and Segundo, G.H, (1973) Correlations of neuronal spike discharges produced by monosynaptic connections and by common inputs. J. Neurophysiol.. 36: 205225." Bullier, J. and Nowak, L.G_ (1995) Parallel versus serial processing: new vistas on the distributed organization o f the visual system. Curt. Opin. Neurobiol., 5: 497-503. Bullock, T.H.. Karamtirsel. S.. Achimowicz. J.Z., McClune. M.C. and Basar-Eroglu, C. (1994) Dynamic properties of human visual evoked and omitted stimulus potentials. Electroencephalogr. Clin. Neurophysiol.. 91: 42-53. Buracas, G.T. and Albright, T.D. (1999) Gauging sensory representations in the brain. Trends Neurosci.. 22: 303-309. Buracas, G.T., Zador, A.M., DeWeese, M.R. and Albnght, T.D. (1998) Efficient discrimination of temporal patterns by motion-sensitive neurons m primate visual cortex. Neuron, 20: 959-969. Cardoso de Oliveira, S., Thiele, A. and Hoffmann, K.-R (1997) Synchronization of neural: activity during stimulus expectation in a direction discrimination task. J. Neurosci.. 17: 9248-9260. Castelo-Branco. M.. Neuenschwander. S. and Singer. W. (1998) Synchronization of visual responses between the cortex, lateral geniculate nucleus, and retina in the anesthetized cat. J. Neurosci., 18: 6395-6410. Chatrian, G.E., Bickford, R,G. and Uihlein, A. (1960) Depth electrographic stndy of a fast rhythm evoked from the human calcarine region by steady illumination. Electroencephalogr. Clin. Neurophysiol.. 12: 167-176. Chawanya, T., Aoyagi, T.. Nishikawa. I.. Ojuda, K. and Kuramom. Y. (1993) A model for feature linking via collective oscillations in the primary visual cortex. Biol. Cybern., 68: 483-490. Churchland, RS. and Sejnowski. T.J. (1992) The Computational Brain. MIT Press. Cambridge. MA. Cottaris. N.R and De Valois. R. 
M.A.L. Nicolelis (Ed.)
Progress in Brain Research, Vol. 130 © 2001 Elsevier Science B.V. All rights reserved
CHAPTER 9
Divergence and reconvergence: multielectrode analysis of feedforward connections in the visual system
R.C. Reid *
Department of Neurobiology, Harvard Medical School, Boston, MA 02115, USA
* Corresponding author: R. Clay Reid, Department of Neurobiology, Harvard Medical School, 220 Longwood Avenue, Boston, MA 02115, USA. Tel.: +1-617-432-3621; Fax: +1-617-734-7557; E-mail: clay [email protected]

Introduction

Multielectrode recording is rapidly becoming a standard tool in systems neurophysiology. Its main advantage over single-electrode recordings is that temporal relations among the firing patterns of two or more neurons can be examined. The interpretation of these temporal relations, however, is notoriously difficult. The study of temporal relations between neurons can serve two purposes: (1) to explore the connectivity or synaptic physiology of neurons in a circuit; or (2) to elucidate strategies of neural processing or encoding by an ensemble. These two aspects of multielectrode recording have been exemplified in numerous studies of the mammalian visual system. In some studies, correlations have been used to study the functional connectivity between visual neurons in retina, LGN and visual cortex (Cleland et al., 1971; Toyama et al., 1981; Mastronarde, 1983a,b,c, 1989; Tanaka, 1983, 1985; Cleland and Lee, 1985; Ts'o et al., 1986; Reid and Alonso, 1995; Alonso et al., 1996; Brivanlou et al., 1998; Usrey et al., 1998, 1999; Usrey and Reid, 1999). In others, correlations have been examined as a potential means to encode visual information, such as with oscillations (Eckhorn et al., 1988; Gray et al., 1989; Jagadeesh et al., 1992; Frien et al., 1994; Kreiter and Singer, 1996; Livingstone, 1996; Neuenschwander and Singer, 1996) or correlated firing (Ghose et al., 1994; Meister et al., 1995; Meister, 1996; Dan et al., 1998). Of course, many studies have elements of both approaches. In this review, we will concentrate on studies that emphasize the first approach. In particular, we will summarize the strategy employed in our recent studies of the pathway from retina to thalamus (lateral geniculate nucleus or LGN) to visual cortex in the cat.

Temporal relations between neurons can be characterized in simple ways, such as with cross-correlation analysis of a pair of neurons, or the analysis of how several neurons can respond in a correlated fashion to a sensory input. They can also be characterized in far more complex ways, such as the clustering of ensembles of neurons into co-active groups (Gerstein et al., 1985). Here, we will concentrate on the use of first-order cross-correlations between pairs of cells to ask a simple question: are the two neurons monosynaptically connected? Higher-order correlations among three or more neurons can be used to analyze more complex questions. For instance, correlations among multiple neurons have been used to analyze patterns of firing and their relation to sensory stimuli (Vaadia et al., 1995). We have used similar techniques to address somewhat narrower questions: (1) In divergent feedforward systems, those in which one input neuron synapses on two output neurons: if a presynaptic spike drives one target to threshold, is it more or less effective in driving the second target? (2) In convergent feedforward systems, those in which two inputs synapse on a common output neuron: are nearly synchronous spikes more effective than asynchronous spikes in driving the common target?
Recording strategies

In our recent studies, we have recorded from two levels of the visual system at once in order to assess feedforward connections. In our simultaneous recordings in retina and LGN (Usrey et al., 1998, 1999) we built upon the earlier work of Cleland, Dubin, Levick, and colleagues (Cleland et al., 1971; Cleland and Lee, 1985). In simultaneous recordings in LGN and visual cortex, we built upon the work of Tanaka (Tanaka, 1983, 1985). As noted in these earlier studies, and discussed below, feedforward connections of this type are particularly well suited to the use of cross-correlation analysis. The novel feature of these experiments is that we have recorded from multiple neurons in the intermediate level of this pathway in order to analyze not only feedforward connections, but also the functional consequences of divergence and convergence.

In the cat, retinal ganglion cells diverge to synapse on several target neurons in the LGN. An even larger number of geniculate neurons, in turn, converge onto each target neuron in visual cortex. Geniculocortical divergence is even more extensive, but will not be discussed here. Numerous studies have described synchronous activity within a given neural ensemble (reviewed in Usrey and Reid, 1999). The study of the afferent connections to an ensemble allows for at least a partial analysis of the causes of synchrony. Conversely, the study of the efferent connections from an ensemble allows an analysis of the functional significance of correlated activity. Specifically, it allows one to ask the question: are synchronous spikes from two neurons especially effective in driving a common postsynaptic target?
Interpretation of cross-correlations in feedforward systems

Cross-correlations between neurons in a strictly feedforward circuit can in many cases be quite easy to interpret. In order to discuss the topic, it is useful to define several terms so that they can be used unambiguously. Connections between two pools of neurons, A and B, will be called strictly feedforward if there are direct excitatory connections from neurons in the presynaptic pool (A) to neurons in the postsynaptic pool (B), but no direct connections from B to A. By this definition, a system with polysynaptic feedback can still be called strictly feedforward. For instance, the connections from lateral geniculate nucleus to layer 4 of visual cortex are strictly feedforward, even though there is indirect feedback to the LGN via cortical layer 6.

Any correlation between neurons in the presynaptic pool and neurons in the postsynaptic pool will be called a feedforward correlation. This term will not be used to connote mechanism. Thus a slow correlation between a geniculate neuron and a cortical neuron will still be called a feedforward correlation, even though its ultimate cause may in fact be slow correlations in the retina (Mastronarde, 1983a,b,c, 1989). By contrast, the term monosynaptic correlation obviously connotes a mechanism and thus carries with it a significant burden of proof. By 'monosynaptic correlation', we mean a cross-correlation between two connected cells that is due to spikes in the presynaptic cell directly causing spikes in the postsynaptic cell (see Figs. 2 and 3, below). It is well known that cross-correlation analysis can easily be misinterpreted (correlation need not imply causation), so great care must be taken in analyzing correlogram peaks even in this simplest case.

One can safely call a peak monosynaptic if two criteria are met. First, the peak must appear at a reasonable delay between A and B. In a retinogeniculate correlation, the peak is usually at +3-5 ms, which is consistent with the conduction velocities of ganglion-cell axons, plus the synaptic delay (Henry et al., 1979). Second, and much more importantly, the rising phase of the correlogram must be fast.
In most cases, the shape of a correlogram between two synaptically connected neurons is close to the first derivative of the excitatory postsynaptic potential (Fetz and Gustafsson, 1983), and therefore can be quite fast. In many cases, it is this rate of rise that allows one to exclude potential sources of artifact, as discussed below.

If a peak in the correlation between neurons A and B is consistent with a monosynaptic connection (Fig. 1a), it is important to rule out other potential causes of this peak. Assuming that A and B are part of a strictly feedforward system, they could falsely appear to have a monosynaptic connection if there were:
(1) common neural input to A and B from one or more cells, C (Fig. 1b);
(2) common external input, such as a sensory stimulus, to A and B (Fig. 1c);
(3) correlation between A and A', another member (or members) of the presynaptic group that is itself connected with B (Fig. 1d); or
(4) correlation between B and B', another member (or members) of the postsynaptic group that receives input from A (Fig. 1e).

[Figure 1: circuit diagrams. Panel titles: (a) Monosynaptic correlation; (b, c) Correlation from common input: neural, external; (d, e) Correlation from synchrony in: presynaptic pool, postsynaptic pool.]

Fig. 1. Five model circuits that could lead to correlation between neurons in a strictly feedforward pathway. In each example, neuron A is in the presynaptic pool, neuron B in the postsynaptic pool. (a) A and B are monosynaptically connected. (b, c) A and B are not connected, but they both receive common input. In one case (b), this common input comes directly from a neuron, C, in a different region. In another case (c), the common input comes indirectly from an external source, such as a sensory input. (d) A and B are not connected, but there is another neuron, A', in the presynaptic pool, that both is correlated with A and is monosynaptically connected to B. (e) A and B are not connected, but there is another neuron, B', in the postsynaptic pool, that both is correlated with B and receives monosynaptic input from A.

In many cases, such as in the retinogeniculate system, common neural input can be ruled out on anatomical grounds. In virtually all feedforward systems, however, the types of correlations in cases 2 through 4 (Fig. 1c-e) are potentially present and therefore must be excluded. This is usually achieved by relying on arguments about timing. In general, it is necessary to demonstrate that monosynaptic correlations are too fast to be caused by other sources: either stimulus-dependent correlations or what can be called lateral correlations, that is, correlations among neurons in either the presynaptic or the postsynaptic pool. It is important to emphasize that lateral correlations need not be caused by lateral connections but instead can be from any source, such as common input (neuron C in Fig. 1b).

Consider the case of the retinogeniculate pathway. Correlations that provisionally could be called monosynaptic typically have a peak with a delay between retina and LGN of 3-5 ms and a rise time (from 1/2 maximum) of less than 0.3 ms (Usrey et al., 2000). These peaks can be seen under conditions in which stimulus-dependent correlations are on the order of hundreds of milliseconds (during slowly varying stimulation, such as drifting sinusoidal gratings) or 5-10 ms (during spatiotemporal white noise stimulation, Reid et al., 1997). It should be noted that some stimuli, such as rapidly modulated, spatially diffuse flicker, can drive both retinal (Berry et al., 1997; Berry and Meister, 1998) and geniculate (Reich et al., 1997; Reinagel and Reid, 2000) neurons to respond reproducibly with sub-millisecond precision. Cross-correlations between retina and LGN would therefore be impossible to interpret under such conditions.

Lateral correlations
While it is possible to work under conditions such that stimulus-dependent correlations are far slower than monosynaptic correlations, it is not possible to remove lateral correlations. In many cases, however, lateral correlations are slower than feedforward correlations. The reasons why slower correlations cannot lead to a 'false positive' monosynaptic correlation are outlined below. In the following sections, illustrations will be drawn from the retinogeniculate system and the geniculocortical system.

For example, a peak in the correlation between retinal cell A and LGN cell B could appear in the absence of a monosynaptic connection if neuron A were correlated with another ganglion cell, A', which in turn was monosynaptically connected with B (Fig. 1d; see Fig. 2, below). In the absence of any other source of correlation between A and B, the first-order prediction of the cross-correlation between A and B will be the convolution of the transfer function between the neurons A and A' ($k_{A',A}(t)$, discussed below) with the true monosynaptic correlation between neurons A' and B. To simplify matters, the spike train $A(t)$ can be considered as the sum of two processes, one which is independent of the spike train $A'(t)$, $A_{\perp A'}(t)$, and another which is linearly dependent upon $A'(t)$: $A_{\parallel A'}(t)$. Given this formalism, the dependence between A' and A can be expressed with the transfer function, $k_{A',A}(t)$:

$$
\langle A(t)\rangle = A_{\perp A'}(t) + \sum_{t''=0}^{T_k} A'(t - t'')\,k_{A',A}(t'')
= A_{\perp A'}(t) + \langle A_{\parallel A'}(t)\rangle
\tag{1}
$$

We can think of $\langle A(t)\rangle$ as the mean or expected value of the spike train $A(t)$, taken over many samples of the spike train $A'(t)$. $k_{A',A}(t)$ is the temporal weighting function that gives the relationship between $\langle A(t)\rangle$ and $A'(t)$. $\langle A(t)\rangle$ depends on the past history of $A'(t)$ up to a relative delay of $T_k$. Given this relationship between spike trains A and A', it is possible to represent the cross-correlation between A and B,

$$
XC_{A,B}(t) = \lim_{T\to\infty}\frac{1}{T}\sum_{t'=0}^{T} A(t'+t)\,B(t')
$$

with the following series of equations:

$$
\begin{aligned}
XC_{A,B}(t) &= \lim_{T\to\infty}\frac{1}{T}\sum_{t'=0}^{T}\langle A(t'+t)\rangle\,B(t')
&& \text{(ergodicity of } A\text{)}\\
&= \lim_{T\to\infty}\frac{1}{T}\sum_{t'=0}^{T}\Big[A_{\perp A'}(t'+t) + \sum_{t''=0}^{T_k} A'(t'+t-t'')\,k_{A',A}(t'')\Big]B(t')
&& \text{(from Eq. 1)}\\
&= \lim_{T\to\infty}\frac{1}{T}\sum_{t'=0}^{T}\Big[A_{\perp A'}(t'+t)\,B(t') + \sum_{t''=0}^{T_k} A'(t'+t-t'')\,k_{A',A}(t'')\,B(t')\Big]
\end{aligned}
\tag{2}
$$

$$
\begin{aligned}
XC_{A,B}(t) &= \langle A_{\perp A'}\rangle\langle B\rangle + \lim_{T\to\infty}\frac{1}{T}\sum_{t'=0}^{T}\sum_{t''=0}^{T_k} A'(t'+t-t'')\,k_{A',A}(t'')\,B(t')
&& \text{(from assumed independence of } A_{\perp A'} \text{ and } B\text{)}\\
&= \langle A_{\perp A'}\rangle\langle B\rangle + \sum_{t''=0}^{T_k} k_{A',A}(t'')\,\lim_{T\to\infty}\frac{1}{T}\sum_{t'=0}^{T} A'(t'+t-t'')\,B(t')\\
&= \langle A_{\perp A'}\rangle\langle B\rangle + \sum_{t''=0}^{T_k} k_{A',A}(t'')\,XC_{A',B}(t-t'')
\end{aligned}
\tag{3}
$$

Thus the cross-correlation between neurons A and B is equal to a constant plus the true monosynaptic cross-correlation between A' and B, convolved (or smoothed) by the transfer function between A' and A. The constants, $\langle A_{\perp A'}\rangle$ and $\langle B\rangle$, are respectively the mean firing rates of the portion of A that is independent of A', and of B.

It should be noted that correlations have been defined (Eq. 2) in terms of the joint probability per unit time of two events, A and B, occurring with a relative delay, $t$. Various normalizations of this function can be used to express it in units more suited for a given application. Most normalizations involve $N_A$ and $N_B$, the total number of spikes fired by A or B, and $\tau$, the bin width at which time is quantized. If the correlogram is divided by $N_A\tau$, then the result can be thought of as the firing rate of neuron B, in spikes per second, with respect to the average spike from A. If it is normalized by $N_A$, then it can be thought of as the conditional probability of B firing, per time bin, averaged over all occurrences of A.

A very similar calculation can describe the case illustrated in Fig. 1e, in which neuron A projects only to postsynaptic neuron B', which in turn is correlated with the 'postsynaptic' neuron B. The only difference in the calculation is that B, the postsynaptic neuron, must be decomposed into dependent and independent terms: $B_{\parallel B'}$ and $B_{\perp B'}$. In this case, the cross-correlation between the unconnected cells, A and B, is equal to a constant term plus the convolution of the lateral transfer function between B' and B, $k_{B',B}$, and the cross-correlation between A and B':

$$
XC_{A,B}(t) = \langle A\rangle\langle B_{\perp B'}\rangle + \sum_{t''=0}^{T_k} k_{B',B}(t'')\,XC_{A,B'}(t-t'')
\tag{4}
$$
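The normalizations described above can be sketched in a few lines of code. The following is a minimal illustration only, not the analysis code used in the studies cited: the function names (`cross_correlogram`, `normalize`), the brute-force pairing of spikes, and the sign convention (lag = time of B minus time of A, so that a feedforward peak falls at positive lags) are all assumptions made for the example.

```python
def cross_correlogram(spikes_a, spikes_b, max_lag_ms=10.0, bin_ms=0.5):
    """Histogram of lags (t_B - t_A) over all spike pairs within +/- max_lag_ms.

    Returns (lag_centers_ms, counts). Positive lags mean that B fired after A;
    this sign convention is an assumption of the sketch.
    """
    n_bins = int(round(2.0 * max_lag_ms / bin_ms))
    counts = [0] * n_bins
    for t_a in spikes_a:          # brute force: fine for short illustrative trains
        for t_b in spikes_b:
            lag = t_b - t_a
            if -max_lag_ms <= lag < max_lag_ms:
                idx = min(int((lag + max_lag_ms) // bin_ms), n_bins - 1)
                counts[idx] += 1
    lag_centers = [-max_lag_ms + (i + 0.5) * bin_ms for i in range(n_bins)]
    return lag_centers, counts


def normalize(counts, n_a, bin_ms):
    """Two normalizations of a raw correlogram.

    rate_hz:      counts / (N_A * tau), firing rate of B (spikes/s) per spike of A
    prob_per_bin: counts / N_A, probability of a B spike per bin per spike of A
    """
    tau_s = bin_ms / 1000.0       # bin width tau, converted to seconds
    rate_hz = [c / (n_a * tau_s) for c in counts]
    prob_per_bin = [c / n_a for c in counts]
    return rate_hz, prob_per_bin
```

Spike times are taken to be in milliseconds. Neither normalization removes the baseline (the constant term in Eqs. 3 and 4); that would have to be subtracted separately.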
In the retino-geniculo-cortical system, lateral correlations in all three levels could pose a problem in the interpretation of feedforward correlations. Discussion of the very fast correlations in the LGN will be postponed until the next section; here we will examine the potential confounding effect of intraretinal and intracortical correlations. To apply Eqs. 3 and 4 to these cases, it would be best to know the transfer function between two neurons at a given level ($k_{A',A}$ in Eq. 3 and $k_{B',B}$ in Eq. 4). The cross-correlation functions between the same two neurons ($XC_{A',A}$ and $XC_{B',B}$), however, are very closely related to the transfer functions; they can be substituted to first approximation.

Potential artifacts caused by intraretinal and intracortical correlations

In the retina of the cat, there are three classes of correlations between ganglion cells (Mastronarde, 1983a,c): the slow correlations between two X cells (peak width: 5-20 ms), the faster but weak correlations between X and Y cells (peak width: 2-3 ms), and yet faster and stronger correlations between Y cells (peak width: ~1-2 ms). The correlations that involve X cells are much too slow to confound the interpretation of monosynaptic peaks. The correlations between Y cells, however, have a time-scale that approaches that of retinogeniculate monosynaptic correlations. The most dramatic evidence that Y cells in the retina are coupled is that electrical stimulation of one Y cell can evoke a strong response in another Y cell with a delay of ~1 ms and a peak width of <1 ms (Mastronarde, 1983a). The important factor to consider when applying Eq. 3, however, is not this one-directional effect of Y-cell coupling, but instead the actual cross-correlation between two Y cells. When examined at a fine time-scale, the cross-correlation between two Y cells has two peaks, one on each side of time zero, each with a width of ~1 ms (Fig. 2a, upper right, $XC_{A,A'} \approx k_{A,A'}$; data from fig. 2 of Mastronarde, 1983a). When this lateral correlation ($k_{A',A}$, re-plotted in Fig. 2b, middle) is convolved with what are presumed to be monosynaptic correlations between retina and LGN (Fig. 2a, $XC_{A,B}$ and $XC_{A,B'}$, histograms; see Eq. 3), the results are two-peaked correlations (Fig. 2b, right and left plots, dark gray lines). These two-peaked correlations in no way resemble the actual monosynaptic correlation. Thus cross-correlations between retinal cells could not account for the extremely fast correlation peaks found between retinal and LGN cells. We can therefore conclude that the retinogeniculate correlations are in fact truly monosynaptic correlations (potential confounds due to intrageniculate synchrony are addressed below).

Similarly, correlations between cortical neurons, even the very fastest ones (Lampl et al., 1999, fig. 7b; re-plotted in Fig. 3a, bottom right, $XC_{B,B'}$), are much too slow to confound the interpretation of monosynaptic inputs from LGN (Eq. 4; Fig. 3c). Geniculocortical correlations that pass our criteria for calling them monosynaptic correlations (Reid and Alonso, 1995; Alonso et al., 1996; Usrey et al., 2000) have a rise time on the order of 1 ms. When these same correlations are convolved with a cross-correlation between two cortical cells (Fig. 3c, middle), the results (Fig. 3c, right and left plots, dark gray lines) have rise times on the order of several milliseconds. These slow correlations would not pass our criteria for monosynaptic correlations.
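The argument above, that a correlation 'inherited' through a lateral correlation is a smoothed or two-peaked version of the true monosynaptic peak, can be illustrated with a toy version of the convolution in Eq. 3. This is a sketch under stated assumptions rather than a reproduction of the published analysis: the function names are hypothetical, the constant term of Eq. 3 is omitted, the kernel is allowed to have negative lags (following the substitution of the two-sided cross-correlation for the transfer function), and the numbers are invented for illustration.

```python
def predict_inherited(xc_mono, kernel, kernel_lags):
    """Convolution term of Eq. 3: sum over k of kernel[k] * xc_mono[t - lag_k].

    xc_mono     : baseline-subtracted monosynaptic correlogram (one value per bin)
    kernel      : weights of the lateral correlation (stand-in for k_{A',A})
    kernel_lags : lag of each kernel weight, in bins (may be negative)
    """
    n = len(xc_mono)
    out = [0.0] * n
    for i in range(n):
        for w, lag in zip(kernel, kernel_lags):
            j = i - lag
            if 0 <= j < n:
                out[i] += w * xc_mono[j]
    return out


def rise_time_bins(xc):
    """Bins from the first contiguous bin at or above half-maximum to the peak."""
    peak = max(range(len(xc)), key=lambda i: xc[i])
    half = xc[peak] / 2.0
    i = peak
    while i > 0 and xc[i - 1] >= half:
        i -= 1
    return peak - i


# Invented example with 0.1 ms bins: a sharp, asymmetric 'monosynaptic' peak ...
xc_mono = [0.0] * 30 + [10.0, 6.0, 4.0, 2.0, 1.0] + [0.0] * 25
# ... and a two-peaked lateral correlation centered about 1 ms on each side of zero.
kernel_lags = [-12, -11, -10, -9, -8, 8, 9, 10, 11, 12]
kernel = [1, 2, 3, 2, 1, 1, 2, 3, 2, 1]
predicted = predict_inherited(xc_mono, kernel, kernel_lags)
```

In this toy case the predicted correlogram has two lobes flanking the original latency and none at the latency itself, and its rise from half-maximum takes longer than that of the original peak, which is the qualitative point made in the text.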
Potential artifacts caused by intrageniculate correlations

The experiments illustrated in Figs. 2 and 3 were taken from studies (Alonso et al., 1996; Usrey et al., 1998, 1999) that were motivated by the original finding of correlations between neurons in the LGN (Alonso et al., 1996; as predicted in Cleland, 1986). These correlations are both quite fast, on the order of 1.0 ms, and strong, ranging from ~1% up to ~30% of the spikes in either spike train. These lateral correlations, while interesting as a phenomenon unto themselves, pose a potential problem in the interpretation of feedforward correlations, either between retina and LGN or between LGN and cortex. As discussed above, lateral correlations in the retina and cortex are too slow (and usually too weak) to influence the interpretation of feedforward correlations, but the fast and strong correlations in the LGN are precisely the sort that could pose a problem.

Fast correlations in the LGN could be seen as a potential problem for the interpretation of retinogeniculate correlations. If the lateral correlations in the LGN were intrageniculate in origin, that is, if they were caused by excitatory local interactions between neurons, then it would be possible to have a retinogeniculate correlation that appeared as a monosynaptic correlation, even in the absence of a direct connection (Fig. 1e). This is unlikely for two reasons. First, even though the intrageniculate correlations are extremely fast, on the order of 0.2-0.3 ms half-width at half-maximum, they tend to be symmetrical, while retinogeniculate correlations have faster rise times than decays (Usrey et al., 1999). The correlations between retina and LGN that would be predicted in the absence of a connection, calculated by convolving the correlation between two LGN cells (Fig. 2c, middle) with the actual retinogeniculate correlations (Fig. 2c; see Eq. 4), are therefore slightly slower than the actual retinogeniculate correlations. In particular, they have a symmetric tail, before the peak as well as after, which is not present in the actual correlations. Second, in admittedly anecdotal simultaneous recordings from sets of three neurons, two in the LGN and one in the retina (as in Fig. 2a), we have found that virtually all correlated firing in a pair of LGN cells can be derived from the common retinal input. Specifically, the vast majority of synchronous spikes in the correlated pair of neurons are preceded with a 'monosynaptic latency' by spikes from a specific retinal input (Usrey et al., 1998).

A much more problematic consequence of the fast correlations in the LGN is that they complicate the interpretation of correlations between LGN and visual cortex. Geniculocortical correlations, with a rise time of ~1 ms, are slower than intrageniculate correlations. Therefore, it would be possible for the scenario illustrated in Fig. 1d to yield false positives: cross-correlations with the characteristics of monosynaptic correlations in the absence of a true connection (Fig. 3b, right and left plots, gray lines). To examine the issue of false positives, we therefore recorded simultaneously from groups of three neurons: two geniculate neurons (A and A') along with a cortical neuron (B) that exhibited apparently monosynaptic correlations with both inputs (Alonso et al., 1996).

[Figure 2: panels (a)-(c). Panel titles: Retina → LGN; Divergence; Synchrony. Receptive-field maps and cross-correlograms; time axes in msec.]

Fig. 2. (a) An example of a triplet of simultaneously recorded neurons (adapted from Usrey et al., 1998 and Usrey and Reid, 1999), one in the retina (A) and two in the LGN (B and B'). The three panels show receptive field maps of simultaneously recorded neurons (Retina A, LGN B, LGN B'; each is an on-center Y cell; grid size: 0.6°). The circles shown over the receptive field centers correspond to the best fitting Gaussian of the retinal receptive-field center (radius: 2.5 σ_ret). The three histograms labeled $XC_{A,B}$, $XC_{A,B'}$ and $XC_{B,B'}$ illustrate the cross-correlograms between each pair of neurons. The histogram $XC_{A,A'}$, representing the expected correlation with another Y cell (not recorded), is re-plotted from another study (Mastronarde, 1983a, fig. 2). The retinogeniculate correlograms ($XC_{A,B}$ and $XC_{A,B'}$) both have the features indicative of a monosynaptic connection: a short latency and a very fast rise time. 84% of the synchronous events in correlogram $XC_{B,B'}$ are accounted for by the peaks in correlograms $XC_{A,B}$ and $XC_{A,B'}$ (Usrey et al., 1998). Hence, divergence leads to synchrony. (b) Illustration of potential artifacts caused by intraretinal synchrony. Middle: line plot of intraretinal correlation, from (a). Right and left: line plots superimposed on the retinogeniculate correlograms indicate the expected correlograms if the pair were not monosynaptically connected, but instead 'inherited' the correlation because of lateral correlations and a true monosynaptic connection (as in Fig. 1d; Eq. 3). Offset and gain are arbitrary. (c) Middle: line plot of intrageniculate correlation, from (a). Right and left: line plots superimposed on the retinogeniculate correlograms indicate the expected correlograms in absence of true connection (as in Fig. 1e; Eq. 4).

[Figure 3: panels (a)-(c). Panel titles: LGN → Cortex; Convergence; Synergy. Cross-correlograms for LGN A, LGN A' and Cortex B - Cortex B'; time axes in msec, -10 to 10.]

Fig. 3. (a) An example of a triplet of simultaneously recorded neurons (adapted from Alonso et al., 1996 and Usrey and Reid, 1999), two in the LGN (A, A') and one in the visual cortex (B). Conventions as in Fig. 2. The histogram $XC_{B,B'}$, expected correlation with another cortical cell (not recorded), is re-plotted from another study (Lampl et al., 1999, fig. 7b). (b) Right and left: light lines show expected correlations due to lateral correlations within the LGN (middle; as in Fig. 1d; Eq. 3). (c) Dark lines: expected correlations due to lateral correlations within the cortex (middle; as in Fig. 1e; Eq. 4).
Second-order analysis: interactions between inputs

In order to analyze triplets of neurons (two highly correlated neurons in the LGN and a postsynaptic neuron in visual cortex), we split the presynaptic spike trains, A and A', into three derived spike trains: A&A', A* and A'*. A&A' is the spike train of synchronous spikes in the two LGN cells (arriving within 1.0 ms of each other). The remaining, nonsynchronous events are put in the two spike trains A* and A'*. In this analysis, it was clear that isolated spikes from both LGN cells (A* and A'*) affected the firing of cortical neurons with a monosynaptic latency (Alonso et al., 1996; Usrey et al., 2000). Of course, this scenario could be complicated by other LGN cells (A'') that are correlated with both A and A'. Given that highly correlated cells in the X-cell system probably occur in groups not much greater than three neurons (two of which we have recorded; see Cleland, 1986), this sort of argument ensures that at least two cells in such an ensemble are connected to the postsynaptic neuron.
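The splitting of the two presynaptic spike trains into A&A', A* and A'* can be sketched as follows. This is only one plausible implementation: the function name `split_synchronous`, the one-to-one greedy pairing of spikes, and the choice to return each synchronous event as a pair of spike times are assumptions for illustration. The text specifies only the 1.0 ms synchrony window.

```python
def split_synchronous(a, a_prime, window_ms=1.0):
    """Split two sorted spike trains (times in ms) into three derived trains.

    Returns (sync, a_star, a_prime_star):
      sync         : list of (t_a, t_a_prime) pairs within window_ms of each other
      a_star       : spikes of A with no synchronous partner in A'
      a_prime_star : spikes of A' with no synchronous partner in A
    Each spike is used in at most one synchronous pair (an assumption here).
    """
    sync, a_star = [], []
    used = set()                  # indices of A' spikes already paired
    j = 0
    for t in a:
        # Skip A' spikes that are too early to pair with this or any later A spike.
        while j < len(a_prime) and a_prime[j] < t - window_ms:
            j += 1
        match = None
        k = j
        while k < len(a_prime) and a_prime[k] <= t + window_ms:
            if k not in used:
                match = k
                break
            k += 1
        if match is None:
            a_star.append(t)
        else:
            used.add(match)
            sync.append((t, a_prime[match]))
    a_prime_star = [t for i, t in enumerate(a_prime) if i not in used]
    return sync, a_star, a_prime_star
```

The three derived trains can then each be cross-correlated with the cortical spike train, so that the effect of isolated spikes (A*, A'*) is measured separately from that of synchronous events (A&A').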
Functional role of divergence and reconvergence: synchrony and synergy

Recording from triplets of correlated cells, two in the LGN and one in the visual cortex, would seem an unreasonable amount of work if the only benefit were to prove that 'monosynaptic correlations' corresponded in fact to true connections. A more interesting result of these experiments, however, is an analysis of the potential effect of synchronous input to visual cortex (Usrey and Reid, 1999). Given the phenomenon that neurons in the lateral geniculate nucleus fire near-synchronous action potentials, the question naturally arises: are synchronous spikes arriving at a cortical neuron treated differently than isolated spikes? Of course, the two inputs will sum together in some fashion, so the question can be reformulated: do synchronous spikes sum in a fashion that is different from a reasonable null hypothesis?

What might this null hypothesis be? A number of studies, both theoretical and experimental, have addressed the question of whether two excitatory postsynaptic potentials (EPSPs) to a cortical neuron sum linearly. It should be noted that this limited question is distinct from the more global question: do the combined effects of the thousands of inputs to a neuron sum linearly (Ferster, 1994; Borg-Graham et al., 1998; Hirsch et al., 1998; reviewed in Wandell, 1993)? Summation could be locally roughly linear but globally nonlinear, or even vice versa. Various models would predict that two inputs would add together in a linear fashion (Rall, 1964, for small, distant inputs; see also Bernander et al., 1994), a sublinear fashion (Rall, 1964, for larger, nearby inputs), or even a supralinear fashion (Mel, 1993; Softky, 1994). Most experimental studies have found that linear summation tends to be the rule: see Langmoen and Andersen (1983) and Cash and Yuste (1999).

All of these studies have examined how two EPSPs interact. Our question, however, is somewhat different. Given that we have no access to the intracellular potential, we are asking: what is the expected probability of a postsynaptic spike given near-simultaneous spikes from both presynaptic cells? In other words, what are the combination rules for cross-correlograms from two feedforward connections onto a common target? Do they simply add or is the interaction more complex?

In addressing these questions, the first thing is to define what is meant by simple addition of inputs. The linear summation of EPSPs is a well-defined concept. Before the concept of summation of correlograms can be considered, however, there must be some interpretation of what the correlogram represents. The most natural interpretation of a correlogram in this regard is to consider it as the differential probability, above chance, that the postsynaptic cell, B, fires at a certain time after the presynaptic cell, A. To make a correlogram (as defined in Eq. 2) correspond to this interpretation, one must first subtract the baseline activity of neuron B, or the portion of B that is independent of A: $\langle B_{\perp A}\rangle$. Next, the cross-correlogram must be divided by $N_A$, the total number of spikes fired by A. This baseline-subtracted, normalized correlogram corresponds to the differential probability that B fires a spike following A. Finally, the integral of the 'monosynaptic peak' in this function yields what is known as the efficacy of A (Levick et al., 1972). Very loosely, the efficacy can be thought of as the percentage of A's spikes that caused a spike in B for a given duration after A fired. In our studies of connections between LGN and visual cortex, we set the duration of the monosynaptic peak equal to 3.0 ms to compute efficacy (Reid and Alonso, 1995; Alonso et al., 1996; Usrey et al., 2000).
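The three steps just described (subtract the baseline, divide by $N_A$, integrate over the monosynaptic peak) can be written as a short function. This is a sketch under assumptions, not the procedure of the cited studies: the function name `efficacy` and its parameters are hypothetical, and in particular the baseline is estimated here as the mean count in a window of negative lags, a choice the text does not specify.

```python
def efficacy(counts, lag_centers_ms, n_a, peak_start_ms,
             peak_ms=3.0, base_from_ms=-10.0, base_to_ms=-2.0):
    """Efficacy of A onto B from a raw correlogram (counts per lag bin).

    counts         : raw correlogram counts, one per bin
    lag_centers_ms : lag at the center of each bin (positive = B after A)
    n_a            : total number of spikes fired by A
    peak_start_ms  : lag at which the monosynaptic peak begins
    peak_ms        : duration of the peak window (3.0 ms in the text)
    base_from_ms, base_to_ms : window used to estimate the baseline
                               (an assumption of this sketch)
    """
    base = [c for c, lag in zip(counts, lag_centers_ms)
            if base_from_ms <= lag < base_to_ms]
    baseline = sum(base) / len(base)          # mean chance count per bin
    peak_end_ms = peak_start_ms + peak_ms
    excess = sum(c - baseline for c, lag in zip(counts, lag_centers_ms)
                 if peak_start_ms <= lag < peak_end_ms)
    return excess / n_a                       # fraction of A spikes 'followed' by B
```

As a usage example with invented numbers: for 1 ms bins, a flat baseline of 5 counts per bin, peak bins of 25, 15 and 8 counts, and 1000 presynaptic spikes, the excess is 20 + 10 + 3 = 33 counts, giving an efficacy of 0.033, or about 3%.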
In this interpretation of cross-correlograms, in which the efficacy represents a marginal increase in probability that the postsynaptic cell fires a spike, approximate linear summation is well-defined and would be expected, at least for extremely small inputs. In virtually any probabilistic model, even a grossly nonlinear one, infinitesimal marginal probabilities combine linearly. Given the fact that thalamic inputs to cortical neurons have an efficacy of several percent (small, but certainly not infinitesimal),
one might expect some deviation from linearity. The question remains, therefore, what would be the expected deviation from linearity given a reasonable model of synaptic integration and neuronal spiking? In one analysis of this question, Abeles (1982, 1991) examined the relation between the asynchronous attenuation, A_A (equal to the inverse of the efficacy), and the synchronous attenuation, A_S (the number of synchronous inputs needed to create a spike 50% of the time). In order to quantify this idea, he defines the term coincidence advantage, 'CA', given by:

$$\mathrm{CA} = 0.5\,A_A / A_S \qquad (5)$$
The coincidence advantage can be thought of as a ratio: half the number of identical asynchronous spikes needed to evoke an incremental spike in the postsynaptic spike train (0.5 A_A), divided by the number of synchronous spikes needed to evoke one spike 50% of the time (A_S). A_A is easily measurable (it is the inverse of the efficacy of a single input), but A_S is a high number (Abeles estimates that 37 is a reasonable number for cortico-cortical synapses), so the coincidence advantage would be virtually impossible to measure experimentally. We have asked a slightly different question (Alonso et al., 1996): how much better are two synchronous inputs than the sum of the same inputs arriving asynchronously? This number, which we term the synergy ratio, will almost certainly be less than the coincidence advantage for many inputs, but it is one that can be measured experimentally. Considering no more than the threshold for firing, it might be expected that both quantities, the coincidence advantage and the synergy ratio, would tend to be greater than 1. Small inputs are unlikely to reach threshold and are therefore ineffective, while synchronous inputs are more likely to push the neuron over the threshold. It is worthwhile to examine this simple idea in more detail. As outlined by Abeles (1982), it is relatively straightforward to analyze the behavior of a neuron that adds together many small synaptic inputs in a roughly linear fashion and that has a fixed threshold. Abeles made the following simplifying assumptions. First, a neuron can be characterized in terms of the probability distribution of its instantaneous membrane potentials (Fig. 4a). Second, the probability that a neuron fires is linearly related to the probability that the membrane potential is above threshold (a premise that can be proven by simple statistical arguments). Further, a typical cortical neuron tends to fire at rates far below the upper limit set by its refractory period. It is therefore reasonable to assume that most of the time it is below the spiking threshold. Given these assumptions, the neuron will have a coincidence advantage greater than one if "the function describing the probability of the (instantaneous) transmembrane potential being above the threshold vs. the threshold level is concave". This complicated statement is best illustrated when the instantaneous membrane potential, V, has a Gaussian probability density (Fig. 4a):

$$p(V) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-V^2/2\sigma^2} \qquad (6)$$

where V is measured relative to V_rest, and σ is the standard deviation of V. At any given time, the probability that V is above threshold, T, is given by:

$$P(V > T) = P(x > T/\sigma) = \frac{1}{\sqrt{2\pi}} \int_{T/\sigma}^{\infty} e^{-x^2/2}\,dx = \mathrm{erfc}\!\left(T/\sqrt{2}\,\sigma\right)/2 \qquad (7)$$

where x = V/σ, and erfc is the complementary error function (equivalent to the dark shaded areas in Fig. 4a; plotted in Fig. 4b,c).

Fig. 4. Analysis of the firing probability of a neuron with a threshold, T; a membrane voltage, V, with standard deviation, σ; and EPSPs of asynchronous inputs, V_A* and V_A'*. (a) Probability density function of membrane voltage. Abscissa: normalized membrane voltage, V/σ, in dimensionless units. The area of the dark shaded region, where V > T, is proportional to the firing rate of the neuron. The area of the light shaded region, where T − V_A* < V < T, is proportional to the average efficacy of asynchronous inputs, A*, in driving the neuron. (b) The probability that the membrane voltage is greater than threshold versus normalized voltage, V/σ. This is equivalent to the probability that one spike fires in any given 3 ms period (see text). Graph illustrates the calculation yielding coincidence advantage (Abeles, 1982). Horizontal line from T: baseline probability of firing. O, probability of firing (0.5) given 9 synchronous EPSPs of size V_A*. X, cumulative additional probability of firing (above baseline) for 9 asynchronous EPSPs of size V_A*. The coincidence advantage is the ratio: (O minus baseline)/(X minus baseline). (c) Same plot as (b), but over a narrower range. Graph illustrates calculation yielding synergy ratio (see text). O, probability of firing given 2 synchronous EPSPs of sizes V_A* and V_A'*. X, cumulative additional probability of firing (above baseline) for two asynchronous EPSPs of sizes V_A* and V_A'*. Synergy ratio is the ratio: (O minus baseline)/(X minus baseline).

As argued by Abeles, the mean firing rate of a neuron under these conditions is approximately P(V > T)/Δt, where Δt is the interval over which the neuron will tend to fire exactly one spike if V is above threshold. That is, Δt is somewhere between the absolute refractory period and the relative refractory period. For our purposes, we can take Δt = 3 ms. Although this is clearly an approximation, it allows T/σ to be estimated as the unique value that satisfies:

$$P(x > T/\sigma) = \mathrm{erfc}\!\left(T/\sqrt{2}\,\sigma\right)/2 = \lambda\,\Delta t \qquad (8)$$
where λ is the mean firing rate. Given this scenario, if a spike from presynaptic neuron A depolarizes the postsynaptic neuron by an amount V_A, this shifts the distribution of V to the right by V_A. For our purposes, this is equivalent on average to shifting the threshold to T − V_A. This increases the probability of firing by the marginal increase in area under the Gaussian distribution (Fig. 4a, light shaded region). This increase in area is clearly not a linear function of V_A, as further illustrated in Fig. 4b,c. Although we cannot determine V_A from an extracellular recording, we can determine the average efficacy of the synapse by cross-correlation. Again, the efficacy is the marginal increase in firing probability during the monosynaptic peak (again, defined as 3 ms for the thalamocortical synapse; Reid and Alonso, 1995; Alonso et al., 1996). Therefore, V_A/σ is the unique value such that:

$$P(x > T/\sigma - V_A/\sigma) = \mathrm{erfc}\!\left(T/\sqrt{2}\,\sigma - V_A/\sqrt{2}\,\sigma\right)/2 = \lambda\,\Delta t + \text{efficacy} \qquad (9)$$
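Eqs. 8 and 9 have no closed-form solution but are easy to invert numerically. The sketch below does so with the inverse Gaussian cumulative distribution from Python's standard library. The function names, and the example rate and efficacy in the last lines, are illustrative assumptions; only the form of the two equations and Δt = 3 ms come from the text.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # zero mean, unit variance


def upper_tail(z):
    """P(x > z) for a standard Gaussian, i.e. erfc(z / sqrt(2)) / 2."""
    return 1.0 - _STD_NORMAL.cdf(z)


def normalized_threshold(rate_hz, dt_s=0.003):
    """Solve Eq. 8 for T/sigma: P(x > T/sigma) = rate * dt."""
    p_fire = rate_hz * dt_s
    if not 0.0 < p_fire < 1.0:
        raise ValueError("rate * dt must lie strictly between 0 and 1")
    return _STD_NORMAL.inv_cdf(1.0 - p_fire)


def normalized_epsp(rate_hz, efficacy, dt_s=0.003):
    """Solve Eq. 9 for V_A/sigma: P(x > T/sigma - V_A/sigma) = rate * dt + efficacy."""
    p_with_input = rate_hz * dt_s + efficacy
    if not 0.0 < p_with_input < 1.0:
        raise ValueError("rate * dt + efficacy must lie strictly between 0 and 1")
    shifted_threshold = _STD_NORMAL.inv_cdf(1.0 - p_with_input)
    return normalized_threshold(rate_hz, dt_s) - shifted_threshold


# Illustrative (hypothetical) values: 10 spikes/s baseline, 2% efficacy.
t_over_sigma = normalized_threshold(10.0)
v_over_sigma = normalized_epsp(10.0, 0.02)
print(t_over_sigma, v_over_sigma)
```

Because the upper-tail probability is monotonic in the threshold, each equation has a single solution, which is why the text can speak of "the unique value".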
For the cortical neuron, B, illustrated in Fig. 3, the baseline firing rate was ~10 spikes/s and the efficacies of the non-synchronous inputs, A* and A'* (defined above: second-order analysis: interactions between inputs), were 1.71% and 1.41%, respectively (Alonso et al., 1996). Substituting these values into Eqs. 8 and 9 yields:

$$T/\sigma = 1.88, \quad V_{A^{*}}/\sigma = 0.21, \quad \text{and} \quad V_{A^{\prime *}}/\sigma = 0.18 \qquad (10)$$

By assumption, the synchronous activation of both inputs yields a depolarization that is the sum of the two individual depolarizations:

$$V_{A\&A'}/\sigma = 0.39 \qquad (11)$$
Finally, substituting this value into Eq. 9 yields a value of 3.8% for the predicted efficacy of the synchronous input, A&A'. Under the simple model of a Gaussian distribution of membrane potential and linear summation of synaptic inputs, therefore, one would have predicted a synergy ratio of:

$$\mathrm{Syn}_{\mathrm{model}} = 3.8\% / (1.71\% + 1.41\%) = 1.22 \qquad (12)$$

This point can be made graphically. The lines tangent to the curves in Fig. 4b,c indicate the expected efficacy of two or more inputs if efficacy added linearly. The concavity of the curve (defined in Eq. 7) assures that the actual efficacy is greater than the linear prediction. Given this relatively modest synergy ratio for two synchronous inputs, it is instructive to use the same model to examine the coincidence advantage, as defined by Abeles (1982), for a larger number of synchronous inputs. The coincidence advantage is defined in terms of the number of identical synaptic inputs required to overcome the threshold, T (i.e., equivalent to shifting the threshold all the way to zero in Fig. 4a). For the stronger of the two asynchronous inputs to B, A*, this number, termed the synchronous attenuation, is given by:

$$A_S = T / V_{A^{*}} = 9.0 \qquad (13)$$

Substituting this value into Eq. 5, we get a coincidence advantage, 'CA', equal to:

$$\mathrm{CA}_{\mathrm{model}} = 0.5\,(1/1.7\%)/9.0 = 3.27 \qquad (14)$$
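The chain of substitutions in Eqs. 8 to 14 can be retraced numerically. The following sketch starts from the three measured quantities quoted in the text (baseline rate of about 10 spikes/s, efficacies of 1.71% and 1.41%, Δt = 3 ms) and applies the Gaussian threshold model. The variable names are hypothetical, and rounding the intermediate values to two decimals is an assumption made to mirror the rounded figures quoted in Eqs. 10 and 11.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()
DT = 0.003                                   # s, interval used in the text
RATE = 10.0                                  # spikes/s, baseline rate of cell B
EFFICACY = {"A*": 0.0171, "A'*": 0.0141}     # asynchronous efficacies


def upper_tail(z):
    """P(x > z) for a standard Gaussian (Eq. 7)."""
    return 1.0 - STD_NORMAL.cdf(z)


def inverse_upper_tail(p):
    """The z for which P(x > z) = p."""
    return STD_NORMAL.inv_cdf(1.0 - p)


baseline = RATE * DT                         # probability of a spike in DT

# Eq. 8: normalized threshold, rounded as in Eq. 10.
t_norm = round(inverse_upper_tail(baseline), 2)

# Eq. 9: normalized EPSP size of each input, rounded as in Eq. 10.
v_norm = {name: round(t_norm - inverse_upper_tail(baseline + eff), 2)
          for name, eff in EFFICACY.items()}

# Eq. 11: linear summation of the two depolarizations.
v_sync = sum(v_norm.values())

# Eq. 9 applied to the summed depolarization, then Eq. 12.
efficacy_sync = upper_tail(t_norm - v_sync) - baseline
synergy_model = efficacy_sync / sum(EFFICACY.values())

# Eqs. 13 and 14 (via Eq. 5), for each input taken separately.
a_sync = {name: t_norm / v for name, v in v_norm.items()}
ca_model = {name: 0.5 * (1.0 / EFFICACY[name]) / a_sync[name]
            for name in EFFICACY}

print(t_norm, v_norm, round(synergy_model, 2))
print({name: round(ca, 2) for name, ca in ca_model.items()})
```

With the intermediate rounding, the sketch should land close to the values quoted in the text (a synergy ratio near 1.22 and coincidence advantages near 3.27 and 3.40). Carrying full precision through the calculation gives slightly smaller figures, which is a reminder that the model predictions are approximate.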
This means that the net effect of nine synchronous inputs is 3.27 times stronger than if the inputs arrived asynchronously. For the weaker asynchronous input, A'*, the coincidence advantage is slightly higher: CAmodel = 3.40. In summary, given the assumptions of linear summation of EPSPs and a simple threshold model of firing, the effect of many synchronous thalamic inputs would be strongly synergistic, but two inputs would only be slightly synergistic. From a past study (Alonso et al., 1996), however, we have found that two thalamic inputs sum more synergistically than would be expected from this simple model: the synergy ratio had a mean value of 1.71 and a median of 1.50. In this study, an important control was that the synergy ratio was high not only for pairs of synchronized inputs (as in Fig. 3), but also for uncorrelated
a visual stimulus is strong and thus driving the retina to a high rate. Second, several groups have suggested that the preferential transmission of synchronous thalamic inputs might help in the cortical processing of certain classes of visual stimuli (Sillito et al., 1994; König et al., 1996; Neuenschwander and Singer, 1996). Finally, preferential cortical responses to synchronous inputs may help in the transmission of visual information. For pairs of synchronized LGN cells, if synchronous spikes are considered as a separate spike train (A&A', see above), information-theoretic analysis with the stimulus reconstruction technique (Rieke et al., 1997) yields more information about the stimulus than if synchrony were ignored. Thus the ability to 'read off' synchronous spikes might help in transmitting information from retina to cortex, through the potential bottleneck of the thalamus.
Acknowledgements

This work was supported by NIH grants EY10115 and EY12196. I thank Pamela Reinagel, Mark Andermann, and John Reppas for careful readings of the manuscript.
References

Abeles, M. (1982) Role of the cortical neuron: integrator or coincidence detector? Isr. J. Med. Sci., 18: 83-92.
Abeles, M. (1991) Corticonics: Neural Circuits of the Cerebral Cortex. Cambridge University Press, London.
Alonso, J.M., Usrey, W.M. and Reid, R.C. (1996) Precisely correlated firing in cells of the lateral geniculate nucleus. Nature, 383: 815-819.
Bernander, O., Koch, C. and Douglas, R.J. (1994) Amplification and linearization of distal synaptic input to cortical pyramidal cells. J. Neurophysiol., 72: 2743-2753.
Berry II, M.J. and Meister, M. (1998) Refractoriness and neural precision. J. Neurosci., 18: 2200-2211.
Berry, M.J., Warland, D.K. and Meister, M. (1997) The structure and precision of retinal spike trains. Proc. Natl. Acad. Sci. USA, 94: 5411-5416.
Borg-Graham, L.J., Monier, C. and Frégnac, Y. (1998) Visual input evokes transient and strong shunting inhibition in visual cortical neurons. Nature, 393: 369-373.
Brivanlou, I.H., Warland, D.K. and Meister, M. (1998) Mechanisms of concerted firing among retinal ganglion cells. Neuron, 20: 527-539.
Cash, S. and Yuste, R. (1999) Linear summation of excitatory inputs by CA1 pyramidal neurons. Neuron, 22: 383-394.
Cleland, B.G. (1986) The dorsal lateral geniculate nucleus of the cat. In: J.D. Pettigrew, K.S. Sanderson and W.R. Levick (Eds.), Visual Neuroscience. Cambridge University Press, London, pp. 111-120.
Cleland, B.G. and Lee, B.B. (1985) A comparison of visual responses of cat lateral geniculate nucleus neurones with those of ganglion cells afferent to them. J. Physiol., 369: 249-268.
Cleland, B.G., Dubin, M.W. and Levick, W.R. (1971) Simultaneous recording of input and output of lateral geniculate neurones. Nat. New Biol., 231: 191-192.
Dan, Y., Alonso, J.M., Usrey, W.M. and Reid, R.C. (1998) Coding of visual information by precisely correlated spikes in the lateral geniculate nucleus. Nat. Neurosci., 1: 501-507.
Eckhorn, R., Bauer, R., Jordan, W., Brosch, M., Kruse, W., Munk, M. and Reitboeck, H.J. (1988) Coherent oscillations: a mechanism of feature linking in the visual cortex? Multiple electrode and correlation analyses in the cat. Biol. Cybern., 60: 121-130.
Ferster, D. (1994) Linearity of synaptic interactions in the assembly of receptive fields in cat visual cortex. Curr. Opin. Neurobiol., 4: 563-568.
Fetz, E.E. and Gustafsson, B. (1983) Relation between shapes of post-synaptic potentials and changes in firing probability of cat motoneurones. J. Physiol., 341: 387-410.
Frien, A., Eckhorn, R., Bauer, R., Woelbern, T. and Kehr, H. (1994) Stimulus-specific fast oscillations at zero phase between visual areas V1 and V2 of awake monkey. Neuroreport, 5: 2273-2277.
Gerstein, G.L., Perkel, D.H. and Dayhoff, J.E. (1985) Cooperative firing activity in simultaneously recorded populations of neurons: detection and measurement. J. Neurosci., 5: 881-889.
Ghose, G.M., Ohzawa, I. and Freeman, R.D. (1994) Receptive-field maps of correlated discharge between pairs of neurons in the cat's visual cortex. J. Neurophysiol., 71: 330-346.
Gray, C.M., König, P., Engel, A.K. and Singer, W. (1989) Oscillatory responses in cat visual cortex exhibit inter-columnar synchronization which reflects global stimulus properties. Nature, 338: 334-337.
Henry, G.H., Harvey, A.R. and Lund, J.S. (1979) The afferent connections and laminar distribution of cells in the cat striate cortex. J. Comp. Neurol., 187: 725-744.
Hirsch, J.A., Alonso, J.M., Reid, R.C. and Martinez, L.M. (1998) Synaptic integration in striate cortical simple cells. J. Neurosci., 18: 9517-9528.
Jagadeesh, B., Gray, C.M. and Ferster, D. (1992) Visually evoked oscillations of membrane potential in cells of cat visual cortex. Science, 257: 552-554.
Kaplan, E., Purpura, K. and Shapley, R.M. (1987) Contrast affects the transmission of visual information through the mammalian lateral geniculate nucleus. J. Physiol., 391: 267-288.
König, P., Engel, A.K. and Singer, W. (1996) Integrator or coincidence detector? The role of the cortical neuron revisited. Trends Neurosci., 19: 130-137.
Kreiter, A.K. and Singer, W. (1996) Stimulus-dependent synchronization of neuronal responses in the visual cortex of the awake macaque monkey. J. Neurosci., 16: 2381-2396.
Lampl, I., Reichova, I. and Ferster, D. (1999) Synchronous membrane potential fluctuations in neurons of the cat visual cortex. Neuron, 22: 361-374.
Langmoen, I.A. and Andersen, P. (1983) Summation of excitatory postsynaptic potentials in hippocampal pyramidal neurons. J. Neurophysiol., 50: 1320-1329.
Levick, W.R., Cleland, B.G. and Dubin, M.W. (1972) Lateral geniculate neurons of cat: retinal inputs and physiology. Invest. Ophthal., 11: 302-311.
Livingstone, M.S. (1996) Oscillatory firing and interneuronal correlations in squirrel monkey striate cortex. J. Neurophysiol., 75: 2467-2485.
Mastronarde, D.N. (1983a) Correlated firing of cat retinal ganglion cells. I. Spontaneously active inputs to X- and Y-cells. J. Neurophysiol., 49: 303-324.
Mastronarde, D.N. (1983b) Correlated firing of cat retinal ganglion cells. II. Responses of X- and Y-cells to single quantal events. J. Neurophysiol., 49: 325-349.
Mastronarde, D.N. (1983c) Interactions between ganglion cells in cat retina. J. Neurophysiol., 49: 350-365.
Mastronarde, D.N. (1987) Two classes of single-input X-cells in cat lateral geniculate nucleus. II. Retinal inputs and the generation of receptive-field properties. J. Neurophysiol., 57: 381-413.
Mastronarde, D.N. (1989) Correlated firing of retinal ganglion cells. Trends Neurosci., 12: 75-80.
Meister, M. (1996) Multineuronal codes in retinal signaling. Proc. Natl. Acad. Sci. USA, 93: 609-614.
Meister, M., Lagnado, L. and Baylor, D.A. (1995) Concerted signaling by retinal ganglion cells. Science, 270: 1207-1210.
Mel, B.W. (1993) Synaptic integration in an excitable dendritic tree. J. Neurophysiol., 70: 1086-1101.
Neuenschwander, S. and Singer, W. (1996) Long-range synchronization of oscillatory light responses in the cat retina and lateral geniculate nucleus. Nature, 379: 728-732.
Rall, W. (1964) Theoretical significance of dendritic trees for neuronal input-output relations. In: R.F. Reiss (Ed.), Neural Theory and Modeling. Stanford University Press, Palo Alto, pp. 73-97. [Reprinted in Rall, W. (1995), I. Segev, J. Rinzel and G.M. Shepherd (Eds.), The Theoretical Foundation of Dendritic Function: Selected Papers of Wilfrid Rall with Commentaries. MIT Press, Cambridge, MA, pp. 122-146.]
Reich, D.S., Victor, J.D., Knight, B.W., Ozaki, T. and Kaplan, E. (1997) Response variability and timing precision of neuronal spike trains in vivo. J. Neurophysiol., 77: 2836-2841.
Reid, R.C. and Alonso, J.M. (1995) Specificity of monosynaptic connections from thalamus to visual cortex. Nature, 378: 281-284.
Reid, R.C., Victor, J.D. and Shapley, R.M. (1997) The use of m-sequences in the analysis of visual neurons: linear receptive field properties. Vis. Neurosci., 14: 1015-1027.
Reinagel, P. and Reid, R.C. (2000) Temporal coding of visual information in the thalamus. J. Neurosci., 20: 5392-5400.
Rieke, F., Warland, D., de Ruyter van Steveninck, R. and Bialek, W. (1997) Spikes: Exploring the Neural Code. MIT Press, Cambridge, MA.
Sillito, A.M., Jones, H.E., Gerstein, G.L. and West, D.C. (1994) Feature-linked synchronization of thalamic relay cell firing induced by feedback from the visual cortex. Nature, 369: 479-482.
Softky, W. (1994) Sub-millisecond coincidence detection in active dendritic trees. Neuroscience, 58: 13-41.
Tanaka, K. (1983) Cross-correlation analysis of geniculostriate neuronal relationships in cats. J. Neurophysiol., 49: 1303-1318.
Tanaka, K. (1985) Organization of geniculate inputs to visual cortical cells in the cat. Vis. Res., 25: 357-364.
Toyama, K., Kimura, M. and Tanaka, K. (1981) Cross-correlation analysis of interneuronal connectivity in cat visual cortex. J. Neurophysiol., 46: 191-214.
Ts'o, D.Y., Gilbert, C.D. and Wiesel, T.N. (1986) Relationships between horizontal interactions and functional architecture in cat striate cortex as revealed by cross-correlation analysis. J. Neurosci., 6: 1160-1170.
Usrey, W.M. and Reid, R.C. (1999) Synchronous activity in the visual system. Annu. Rev. Physiol., 61: 435-456.
Usrey, W.M., Reppas, J.B. and Reid, R.C. (1998) Paired-spike interactions and synaptic efficacy of retinal inputs to the thalamus. Nature, 395: 384-387.
Usrey, W.M., Reppas, J.B. and Reid, R.C. (1999) Specificity and strength of retinogeniculate connections. J. Neurophysiol., 82: 3527-3540.
Usrey, W.M., Alonso, J.-M. and Reid, R.C. (2000) Synaptic interactions between thalamic inputs to simple cells in cat visual cortex. J. Neurosci., 20: 5461-5467.
Vaadia, E., Haalman, I., Abeles, M., Bergman, H., Prut, Y., Slovin, H. and Aertsen, A. (1995) Dynamics of neuronal interactions in monkey cortex in relation to behavioural events. Nature, 373: 515-518.
Wandell, B. (1993) Foundations of Vision. Sinauer Associates, Sunderland, MA.
M.A.L. Nicolelis (Ed.)
Progress in Brain Research, Vol. 130
© 2001 Elsevier Science B.V. All rights reserved

CHAPTER 10

Comparative population analysis of cortical representations in parametric spaces of visual field and skin: a unifying role for nonlinear interactions as a basis for active information processing across modalities

Hubert R. Dinse * and Dirk Jancke 1

Institute for Neuroinformatics, Theoretical Biology, Ruhr-University Bochum, Bochum, Germany
Introductory

From a phenomenological point of view, the concept of population analysis is a rather straightforward and inescapable consequence of the observation that a huge number of broadly tuned neurons is activated after even the simplest form of sensory stimulation or in relation to motor outputs. This mass activity includes both spiking and suprathreshold activity. Our approach outlined in this chapter was developed in order to account for population activity recorded in early sensory cortices at a spiking level. Our goal was to visualize and to analyze cortical activity distributions in the coordinates of the respective stimulus space to explore cooperative processes (Dinse et al., 1996; Jancke et al., 1996; Kalt et al., 1996; Erlhagen et al., 1999; Jancke et al., 1999). The basic assumption is that neuronal interactions are an intricate part of cortical information processing generating
* Corresponding author: Hubert R. Dinse, Institute for Neuroinformatics, Theoretical Biology, Ruhr-University Bochum, Bochum, Germany. Tel.: +49-234-32-25565; Fax: +49-234-32-14209; E-mail: hubert@neuroinformatik.ruhr-uni-bochum.de
1 Present address: The Weizmann Institute of Science, Rehovot, Israel.
internal representations of the environment beyond simple one-to-one mappings of the input parameter space. Using this approach, we can demonstrate that the spatio-temporal processing of sensory stimuli is characterized by a delicate, mutual interplay between stimulus-dependent and interaction-based strategies contributing to the formation of widespread cortical activation patterns.
Why populations?

In 1972, Barlow published the well-recognized article 'Single units and sensation: a neuron doctrine for perceptual psychology'. He proposed that "active high-level neurons directly and simply cause the elements of our perception" (Barlow, 1972). This work articulated the prevailing conceptual framework of that time and had a great impact on research of sensory information processing in early cortical areas. In fact, during the late fifties and sixties, single-cell recordings, the monitoring of extracellular potential changes, had become a feasible routine in every laboratory. It is tempting to speculate to what extent purely technical aspects of that type boosted the conceptual framework of single-cell analysis. While this approach became dominant during the next decades, at the same time it became more and more evident that there might be more
to higher brain processes than revealed by single-cell recordings. It should be stressed that the emphasis on distributed population activity, instead of a single cell, does not imply the underestimation of the performance of single cells. On the contrary, there is more and more experimental evidence that axons, passive and excitable dendrites and spines play a possibly underestimated role in signal transfer and processing (Segev and Rall, 1998). Anatomical analysis of cortical networks revealed the enormous richness of connectivity and interconnectedness of cortical networks (for review see Braitenberg and Schüz, 1991). According to their minute analysis, a single cortical cell receives on average synaptic inputs in the magnitude of 10^5. However, the proportion of direct sensory afferents is only 20% even in layer 4, which provides most of the sensory inputs. The degree of interconnectedness is best illustrated by calculations made by Braitenberg and Schüz (1991), according to which 1 mm³ of cortical volume contains about 150,000 neurons, 3 km of axonal fibers and 450 m of dendritic branches. Consequently, cortical networks are characterized by densely coupled, widespread arborization of dendritic and axonal connections. From a theoretical point of view, these anatomical constraints have been interpreted as an ideal substrate for parallel processing based on recurrent loops, in which lateral interactions and nonlinearities play a key role. From a functional point of view, there is abundant evidence that is fully in line with the outlined theoretical and anatomical considerations:

(1) Cortical point-spread functions are broad. It is well established that widespread patterns of cortical activation are evoked by even very small and simple, i.e. 'point-like', stimuli. This is true for visual, auditory and somatosensory cortical areas. It simply implies that whatever the stimulus is, large populations of many thousands of neurons are invoked.
It is interesting to note that most recently developed techniques to record neural activity do in fact measure equivalents of the cortical point-spread function, as is the case for PET, fMRI and optical imaging of intrinsic or dye-coupled signals.

(2) Cortical processing is active. Neurons in striate visual cortex have been characterized with respect to physical key features, such as visual field location, orientation, motion direction, ocular dominance, and spatial frequency. These approaches made it possible to analyze neural representations within parameter spaces that are explicitly defined by physical stimulus attributes. However, dependent on stimulus context, a large number of visual illusions, e.g. the perception of illusory contours (Kanizsa, 1976; Von der Heydt et al., 1984; Ramachandran et al., 1994; Sheth et al., 1996; Mendola et al., 1999), indicate that the visual system must contain additional mechanisms leading to representations within parameter spaces which have no physical counterpart. This is in line with the observation that single neurons exhibit complex, non-predictive behavior dependent on stimulus context (for review see Gilbert et al., 2000). Accordingly, these complex spatio-temporal response properties can be modified by stimulation displaced from the receptive field center or even from outside the classical receptive field (Allman et al., 1985; Dinse, 1986; Gilbert and Wiesel, 1990; Sillito et al., 1995). This can be restated as follows: if interaction contributes significantly to neural activation in the visual cortex, then representations of the visual environment will differ from a simple feedforward remapping of visual space.

(3) Behavioral performance is superior to single-cell performance. Except for a few examples, the performance inferred from single-cell tuning characteristics is not sufficient to explain the performance seen at a behavioral level. There are usually two responses to that: (a) neurons at higher (possibly unknown) stages of processing will show the required characteristics; and (b) the required performance will be generated as soon as a 'pool' of neurons is taken into consideration. A famous example is 'hyperacuity', the threshold of which is several fold beyond that of single cells (Westheimer, 1979).
The 'coarse coding' framework is an attempt to explain how high resolution can be easily obtained with broadly tuned elements (Hinton et al., 1986). Taken together, in order to address the implications listed above, the conceptual consideration of neural populations, their recording, analysis and understanding appears straightforward. A less straightforward question is how to accomplish this goal.
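The coarse-coding idea mentioned above, that a pool of broadly tuned elements can resolve a stimulus far more finely than the spacing or width of any single element, can be illustrated with a toy calculation. The sketch below is not the scheme of Hinton et al. (1986). It assumes noise-free Gaussian tuning curves and a centre-of-mass readout, and all names and parameter values are hypothetical choices made for the illustration.

```python
import math


def tuning_response(stimulus, center, width):
    """Response of one broadly tuned unit (Gaussian tuning curve, peak 1)."""
    return math.exp(-0.5 * ((stimulus - center) / width) ** 2)


def decode_center_of_mass(responses, centers):
    """Estimate the stimulus as the response-weighted mean of preferred values."""
    total = sum(responses)
    if total == 0:
        raise ValueError("population is silent; nothing to decode")
    return sum(r * c for r, c in zip(responses, centers)) / total


# 21 units whose preferred values are 1.0 apart, each with tuning width 1.5,
# so every unit responds to a wide range of stimulus values.
centers = [float(c) for c in range(-10, 11)]
width = 1.5

stimulus = 0.37          # lies between two preferred values
responses = [tuning_response(stimulus, c, width) for c in centers]
estimate = decode_center_of_mass(responses, centers)
print(estimate)
```

In this idealized, noise-free case the population estimate falls much closer to the true stimulus value than the 1.0 spacing between preferred values. With realistic response variability the achievable resolution would depend on the noise and on the number of units pooled.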
Emergence of de novo representations?

As each single neuron is part of a population, a single neuron's activity is based on the entire network activity and, vice versa, the network activity is dependent on the contributing single neurons. It has in fact been shown that the activity of a single neuron reflects the actual state of the entire neural network (Arieli et al., 1996; Tsodyks et al., 1999). Yet, an important question remains whether a population is able to create de novo 'qualia' neither explicitly present at the single-cell level nor in the input (Lehky and Sejnowski, 1999). The most prominent example might be the sensation of 'white' arising from the trichromatic color vision system (Young, 1802) by the joint activation of a population of retinal receptors tuned to different wavelengths. There are a number of recent experimental findings suggesting the population-based creation of de novo properties (Diesmann et al., 1999; Jancke, 2000; Thief et al., 2000).

Why parametric?

In principle, when investigating the visual system, a distribution of population activity within the parametric space of the visual field is equivalent to activities recorded in functional imaging studies such as fMRI or optical imaging, assuming a clean retinotopy. There are a number of differences, however. The main problem arises from the fact that the retinotopy is far from a clean representation of the visual field. This is particularly obvious at a spatial scale that differentiates between visual angles less than 1° apart (Hubel and Wiesel, 1962; Albus, 1975). The main constraints arise from the considerable scatter of RF position, which is larger than or in the same range as the required systematic shifts due to a topographic gradient in the map. Again, this holds true for other modalities in an analogous way. At a larger scale of several degrees, a clear retinotopic gradient is present, though distorted.
Yet at this scale, other factors complicate the aspect of a clean topography. As extensively studied in the visual system, the retinotopic gradient of the cortical map of the visual field is overlaid by so-called 'functional maps'. Functional maps contain an orderly arrangement of certain stimulus attributes in a repeated way for certain portions of the respective retinal locations (Hübener et al., 1997; Kim et al., 1999; Swindale, 2000). At present, functional maps have been established for orientation of moving gratings (Blasdel and Salama, 1986; Swindale et al., 1987; Bonhoeffer and Grinvald, 1991), direction of motion (Weliky et al., 1996) and the spatial frequency of the moving grating (Shoham et al., 1997; Kim et al., 1999). In addition, maps exist for the inputs of the two eyes (ocular dominance maps; Wiesel et al., 1974; LeVay et al., 1978) and disparity (Burkitt et al., 1998). There is, in fact, evidence that the retinotopic map contains discontinuities to account for the discrete organization according to certain stimulus attributes (Das and Gilbert, 1997). Taken together, the requirement for 'cleanness' of the map is not fulfilled at either spatial scale. For a discussion of multiple functional maps in auditory cortex, see Schreiner (1995). In summary, our parametric population approach (1) takes into account that neurons are broadly tuned, e.g. covering large ranges of parameter values; and (2) makes it possible to analyze their common responses within the metrics of given stimulus attributes. In addition, the construction of distributions of population activity that are defined in physical metrics can help to find underlying neural transformation strategies that map sensory stimulus parameters onto the cortical anatomy.

Which metrics?

One fundamental question arises when discussing the metrics within which populations of neurons should be studied. Probably the main advantage of the conventional single-cell receptive field (RF) approach was to describe neural activity within the metrics of the stimulus space, and not in the metrics of the anatomical connections, e.g. the dendritic branching of the cell.
This simple remapping of activity made it possible to study the cell's firing as a function of any possible stimulus attribute in terms of its parameter space. As detailed below, our approach similarly consists of a systematic remapping of population activity from their cortical coordinates back into the parametric space (Fig. 1). Accordingly, constructing
[Fig. 1 panels: population representation in cortical coordinates (scale bar 0.5 mm); visual hemifield; population representation in stimulus coordinates (scale bar 0.4°; filled square = stimulus)]
Fig. 1. Population representation in different coordinates exemplified for visual cortex studies. A stimulus is presented at a fixed location in the visual field (bottom, left). Recording of evoked activity results in a cortical activation corresponding to the cortical point spread function (top, right). Shown is an intrinsic optical imaging map evoked by a small square of light. Extracellular recordings of single cell responses (top, right) were used to determine the receptive fields (RFs) defined in the visual field coordinates (bottom, left, colored RF outlines). In contrast to conventional methodologies, we pursue a non-centered approach, in which a stimulus is kept at a fixed position independent from the location of the RFs being studied (bottom, left: relation of black square to RF outlines in blue and green. see also Fig. 2). The definition of single cell activity in parameter space allows a systematic investigation of the cells activity as a function of variation of stimulus parameter. Our approach accomplishes a transformation of population activity in parameter space which corresponds to a 'population RF' (see also Fig. 2). It should be noted that recording of activity across the cortical surface as obtained by optical imaging equivalently refers to a non-centered field approach. In contrast, a distribution of population activity in parameter space can be regarded as the inverted cortical point-spread function ('cortical spread-point function'. As a consequence, this procedure allows to investigate population distributions and their interactions in the physical metrics of the stimuli.
a parametric distribution of population activation can b e r e g a r d e d as a ' p o p u l a t i o n r e c e p t i v e f i e l d ' . W e w i l l d e m o n s t r a t e t h a t t h i s a p p r o a c h is h i g h l y s u i t a b l e to
reveal insight into processing principles, including n e u r a l i n t e r a c t i o n s t h a t g o b e y o n d t h o s e d e f i n e d at the level of classical single-cell approaches.
159
Two types of averaging

Implication of the non-centered field approach
Our population approach is based on two different types of averages. Here we discuss the averaging across different spatial locations within the RFs. In the conventional RF approach, stimuli are applied to the RF center. Accordingly, as a first step, the approximate shape and center of an RF have to be determined. Once that is done, stimuli of various types are presented at the center or along a centered orientation in order to study possible dependencies of the firing rate on stimulus variations. We call this procedure the RF-centered approach. In contrast, as our main goal is to study a population response to a given stimulus, i.e. the contributions of all neurons [...] across RFs. In our view, this way of stimulus presentation and averaging is crucial for an understanding of how complex scenes are represented in the visual cortex. A similar approach has been pursued by van Essen and coworkers, who investigated the activity of visual cortex cells under natural viewing conditions (Gallant et al., 1998). In a way, this procedure corresponds to a systematic shift of a stimulus throughout an RF (cf. Szulborski and Palmer, 1990). We, however, do not shift the stimulus, but sample the contributions of the RFs of many neurons shifted randomly across the stimulus.

Multidimensional spaces, subpopulations, and the contributing neuron
A second important averaging is performed across many different cell types. Neurons in area 17 potentially contribute to the representations of many different parameters, such as retinal position, orientation, curvature, length, motion direction et cetera (see also above, 'overlaying maps'). To characterize the contribution of each neuron to the representation of a given stimulus, one might conceive of the high-dimensional space spanned by its different parameters. Each neuron could be thought of as a point in this parametric space. This point corresponds to a set of preferred values for all represented parameters. By asking only how the neuron's firing rate depends on visual field position, the contributions of all neurons are averaged, although their preferred parameter sets may differ along other dimensions. In this sense, the population distribution is a projection from a potentially high-dimensional space onto a common neuronal space representing only visual field position (Jancke et al., 1999). In this way, the population receptive field can be regarded as the inverse of the cortical point-spread function ('cortical spread-point function').

Reconstruction of information
There is agreement that physical attributes of sensory stimuli are encoded as activity levels in populations of neurons. Reconstruction or decoding describes the inverse problem in which the physical attributes are estimated from neural activity. Reconstruction methods have been regarded as useful, first, in quantifying how much information about the physical attributes is present in a neural population and, second, in providing insight into how the brain might use distributed activity (Nicolelis, 1996; Zhang and Sejnowski, 1999; Doetsch, 2000). However, given that (a) an optimal reconstruction method is utilized; (b) the population is of sufficient size, i.e. it contains a sufficient number of neurons; and (c) the stimuli are within the range of behavioral relevance and resolution, i.e. belong to a stimulus that is represented in the brain, we argue that mere reconstruction does not yield much additional information. Of course, one of the problems behind this is the question of who reads the code. In the case of optimal reconstruction, an implicit assumption is that the brain is able to perform a comparable analysis. Ways to test this assumption are to compare population data with psychophysical data on performance, thresholds, discrimination abilities, reaction times et cetera. An ultimate control consists of the execution of behavior, as exemplified in the study by Chapin et al. (1999), where rats were trained to position a robot arm to obtain water by pressing a lever. Mathematical transformations were used to convert multineuron signals into 'neuronal population functions' that accurately predicted lever trajectory and were used by the animals as a substitute for executed behavior to position the robot arm and obtain water. More generally, during recent years, it became evident that a critical step for the investigation of how distributed cell assemblies process behaviorally relevant information is the introduction of methods for data analysis that can identify functional neuronal interactions within high-dimensional data sets (cf. Nicolelis, 1999).

Fig. 2. Schematic illustration of stimulation and construction of population distributions exemplified for the visual cortex. (A) Illustration of the non-centered field approach. Stimuli, indicated by the black square, were presented independent of the locations of the receptive fields (RFs) of the measured neurons (schematically illustrated by the ellipses). The frame with the cross-hair (gray) illustrates the analyzed portion of the visual space (2.8 x 2.0°). (B-D) Illustrations of the Gaussian interpolation method to construct the distribution of population activity. (B) The location of the RF center of each neuron as determined by response plane techniques was weighted with its firing rate, illustrated as vertical bars of varying length at various locations. (C) The distribution of population activity was obtained by Gaussian interpolation (width = 0.6°). (D) View of the distribution of population activation using gray levels to indicate activation. The location of the stimulus is indicated by the square outlined in white together with the stimulus frame. In the results section, activity distributions are shown as color-coded contour plots.
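The Gaussian interpolation illustrated in Fig. 2 (B-D) can be sketched in a few lines. The following is a minimal illustration, not the authors' implementation: the function name, the toy RF centers and the toy firing rates are hypothetical, the firing rates are assumed to be already normalized, and the stated 'width' of 0.6° is assumed here to be the standard deviation of the Gaussian (the text does not say whether it is a standard deviation or another width measure).

```python
import numpy as np


def population_distribution(rf_centers, rates, grid_x, grid_y, sigma=0.6):
    """Sum one 2-D Gaussian per neuron, centered on its RF center and
    scaled by its (already normalized) firing rate.

    sigma is ASSUMED to be the Gaussian standard deviation in degrees.
    """
    rf_centers = np.asarray(rf_centers, dtype=float)      # shape (n, 2)
    rates = np.asarray(rates, dtype=float)                # shape (n,)
    xx, yy = np.meshgrid(grid_x, grid_y)                  # shape (ny, nx)
    dx = xx[None, :, :] - rf_centers[:, 0][:, None, None]
    dy = yy[None, :, :] - rf_centers[:, 1][:, None, None]
    kernels = np.exp(-(dx ** 2 + dy ** 2) / (2.0 * sigma ** 2))
    # Weighted sum over neurons -> activity map of shape (ny, nx)
    return np.tensordot(rates, kernels, axes=1)


# Hypothetical toy population: RF centers on a regular lattice and
# broadly tuned responses to one stimulus at (1.0, 0.0) deg.
xs = np.arange(-1.0, 3.01, 0.5)
ys = np.arange(-1.0, 1.01, 0.5)
toy_centers = np.array([(x, y) for x in xs for y in ys])
stimulus = np.array([1.0, 0.0])
toy_rates = np.exp(-np.sum((toy_centers - stimulus) ** 2, axis=1) / 2.0)

grid_x = np.linspace(-0.4, 2.4, 29)   # 2.8 deg wide analysis frame
grid_y = np.linspace(-1.0, 1.0, 21)   # 2.0 deg high analysis frame
dist = population_distribution(toy_centers, toy_rates, grid_x, grid_y)
iy, ix = np.unravel_index(np.argmax(dist), dist.shape)
print("peak at", grid_x[ix], grid_y[iy])
```

Because the stimulus stays fixed while the RF centers are scattered around it, the same function can be reused for every stimulus condition by swapping in the firing rates recorded for that condition.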
What is gained: neural interaction information
Our approach seeks an alternative by comparing different states of population activity evoked by different classes of stimuli. Instead of asking how accurately a parameter such as stimulus location can be reconstructed or decoded, we were primarily interested in analyzing interaction-based deviations of population representations that depend on defined variations of stimulus configurations. This is an important point of departure from estimation theory, with which we otherwise share interests: our analysis aimed to investigate how the representation of position is affected by interaction among neurons. The impact of interaction on human perception has been repeatedly shown. Repulsion effects between neighboring stimuli have been described in humans. Errors that occur when subjects estimate the visual distance between two spots of light depend systematically on the retinal distance of the stimuli. Small separations are underestimated, large distances are overestimated (Hock and Eastman, 1995). Similar results have been obtained for estimation of the orientation of stimuli (Westheimer, 1990).

[Figure 3: panel labels 'elementary stimuli' and 'composite stimuli'. Top row: 0.4 deg, separation, 2.4 deg, 2.8 deg, nasal, temporal. Bottom row: sites 1-5; 7 mm, 12 mm, 20 mm, 29 mm]

Fig. 3. Comparison of stimulation configurations used for the investigation of nonlinear interaction in the visual and the somatosensory cortex. Stimuli employed were small squares of light (cat visual cortex) and small tactile stimuli (light taps) applied to the glabrous skin of the hindpaw (rat somatosensory cortex), which were denoted as elementary stimuli. Top row: in the visual cortex, the elementary stimuli (squares of light, 0.4 x 0.4°) were presented at seven horizontally shifted positions within the central visual field representation. Bottom row: in the somatosensory cortex, elementary stimuli (1.5 mm in diameter) were presented at five positions along the distal-proximal axis of the hindpaw. Both studies had in common that composite stimuli were assembled from combinations of the elementary stimuli. In the visual cortex, they were presented at six different separation distances of 0.4-2.4° (top row). The left stimulus component was always kept at a fixed nasal position. In the somatosensory cortex, the composite stimuli were presented at four different separation distances between 7 and 29 mm skin surface (bottom row). The most distal stimulus component (site 1) was always kept fixed.

In the experiments described below, we introduce similar paradigms contrasting population activity resulting from so-called 'simple' or 'elementary' stimuli with composite stimuli that were assembled from the elementary ones using different separation distances (Fig. 3). In detail, we extract the contribution of neurons to the representation of the location of small squares of light, which we called 'elementary' stimuli. We then project the neural responses to 'composite' stimuli assembled from two elementary stimuli of varied separations onto this subspace by analyzing population distributions weighted with the responses to composite stimuli.
If nonlinear interactions contribute significantly to neural activation in the visual cortex, then the representations of composite stimuli will systematically differ from the superposition of the two elementary representations. Insight into neural interactions is gained from the analysis of distance-dependent deviations of the distributions from additivity, i.e. how interactions distort the distributions of activation. Such interactions may arise from recurrent connectivity within the cortical area as well as from recurrency within the network providing the sensory input.

Methodological considerations and population construction
Construction of a distribution of population activation
Our general idea behind constructing a population distribution is to extract the contributions of neurons to the representation of a particular stimulus parameter. To obtain entire distributions that are defined for sensory field location, two types of analysis were applied (for details see Appendix): (1) Based on the measured RF profiles, the calculated RF centers served to construct two-dimensional distributions of population activity by interpolating the normalized firing rates of each contributing neuron with a Gaussian profile. Population representations in the somatosensory cortex were calculated with an analogous Gaussian interpolation. (2) To minimize the reconstruction error for the elementary stimulus conditions, we extended the Optimal Linear Estimator (OLE) (Salinas and Abbott, 1994), resulting in one-dimensional distributions of population activity.

Data acquisition: visual cortex
We recorded responses of single units in the central visual field representation in area 17 of the left hemisphere of anesthetized cats. Stimuli were always presented to the contralateral eye. Recordings were performed simultaneously with two or three glass-coated platinum electrodes (resistance between 3.5 and 4.5 MOhm; Thomas Recording, Germany). The bandpass-filtered (500-3000 Hz) electrode signals were fed into spike sorters based on an on-line principal component analysis (Gawne and Richmond, NIH, USA).

Visual stimulation
Stimuli were displayed on a PC-controlled 21-inch monitor (120 Hz, non-interlaced) positioned at a distance of 114 cm from the animal. An identical set of common stimuli was presented to all neurons: (1) elementary stimuli (Fig. 3), small squares of light (size 0.4 x 0.4°), were flashed at one of seven different horizontally contiguous locations within a fixed reference frame; and (2) composite stimuli, two simultaneously flashed squares of light, were separated by distances that varied between 0.4 and 2.4°. Each stimulus was flashed for 25 ms. The interstimulus interval (ISI) was 1500 ms.
There was a total of 32 repetitions of each stimulus, arranged in pseudo-random order across the different conditions. Stimuli had a luminance of 0.9 cd/m² against a background luminance of 0.002 cd/m². The retinal position of these common stimuli was constant, irrespective of the RF location of individual neurons. The profile of each individual RF was assessed quantitatively with a separate set of stimuli, consisting of small dots of light (diameter 0.64°) which were flashed in pseudo-random order (20x) for 25 ms (ISI 1000 ms) on the 36 locations of an imaginary 6 x 6 grid, centered over the hand-plotted RF. To control for eye drift, RF profiles were repeatedly measured during each recording session.

Data acquisition: somatosensory cortex
Single unit activity was extracellularly recorded in layers III-IV of the primary somatosensory cortex of anesthetized rats at depths of 700-750 µm using glass micro-electrodes filled with concentrated NaCl (1-2 MOhm). The bandpass-filtered (500-3000 Hz) electrode signals were fed into a window discriminator. The output TTL pulses were stored on a PC with a time resolution of 1 ms. Raw analog recordings were displayed on oscilloscopes and on audio monitors. Digitized neural responses were displayed as post-stimulus time histograms (PSTHs) on-line during the recording sessions. Data were analyzed offline in the IDL graphical environment (RSI, USA).

Tactile stimulation
Penetrations were usually placed 200-300 µm apart to map the entire spatial extent of the hindpaw representation. The location and extent of RFs on the glabrous skin of the hindpaw was determined by hand-plotting (Merzenich et al., 1984). RFs were defined as those areas of skin at which just-visible skin indentation evoked a reliable neural discharge. Other studies have shown that just-visible indentation is in the range of 250-500 µm, which is in the middle of the dynamic range of cutaneous mechanoreceptors (Johnson, 1974; Gardner and Palmer, 1989). Cells responding to high-threshold stimuli, joint movements or deep inputs were classified as non-cutaneous and were excluded from further evaluation.
The size (area of skin in mm²) of cutaneous RFs was quantitatively analyzed by planimetry. For the analysis of neural population representations, an identical set of computer-controlled so-called 'common stimuli' was presented to all neurons recorded, independent of their RF location.
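The second type of analysis named above, a linear estimator that minimizes the reconstruction error for the elementary stimulus conditions, can be illustrated with ordinary least squares. The sketch below is not the extended OLE procedure used in the chapter (its details are in the Appendix and in Jancke et al., 1999); it is a plain least-squares stand-in, and every name and number in it (toy RF centers, tuning width, target profiles, the 0.7 scaling of the composite response) is a hypothetical assumption chosen only to make the example run.

```python
import numpy as np


def fit_linear_estimator(responses, targets):
    """Least-squares weights W minimizing ||responses @ W - targets||^2.

    responses: (n_stimuli, n_neurons) firing rates to elementary stimuli
    targets:   (n_stimuli, n_positions) desired 1-D distributions
    """
    weights, *_ = np.linalg.lstsq(responses, targets, rcond=None)
    return weights                                  # (n_neurons, n_positions)


rng = np.random.default_rng(0)
positions = np.linspace(0.0, 2.8, 57)               # visual-field axis (deg)
stim_pos = 0.2 + 0.4 * np.arange(7)                 # seven elementary positions (assumed)
rf_centers = rng.uniform(-0.5, 3.3, 60)             # hypothetical RF centers
# Toy broadly tuned responses (Gaussian tuning, width 0.6 deg, assumed)
responses = np.exp(-(stim_pos[:, None] - rf_centers[None, :]) ** 2 / (2 * 0.6 ** 2))
# Toy target: a narrow profile centered on each elementary stimulus
targets = np.exp(-(positions[None, :] - stim_pos[:, None]) ** 2 / (2 * 0.2 ** 2))

W = fit_linear_estimator(responses, targets)

# Same estimator, now weighted with a (toy, subadditive) composite response
composite_response = 0.7 * (responses[0] + responses[3])
composite_distribution = composite_response @ W
superposition = (responses[0] + responses[3]) @ W
```

One consequence of using a linear estimator is worth noting: the distribution is a linear function of the firing rates, so the distribution for a composite stimulus differs from the superposition of the elementary distributions only if the firing rates themselves are not additive.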
Here we summarize recent experimental findings, in which we explore the role of neural interactions for the representation of sensory stimuli in early cortical areas. In detail, we describe the distance-dependent interactions for composite stimuli observed in the visual and the somatosensory cortex that share substantial similarities.

Distance-dependent interactions for composite stimuli observed in cat visual cortex
We constructed distributions of population activity in response to a set of small squares of light (so-called 'elementary stimuli') which differed in their position along a virtual horizontal line (Figs. 3 and 4). The distributions were defined in the visual space and were based on single-cell responses from 178 neurons recorded in the central visual field representation of cat area 17. In order to obtain these distributions, we used a two-dimensional Gaussian interpolation procedure, in which the RF centers were weighted with the normalized firing rate of each neuron. Responses obtained during the [...] were analyzed (30-80 ms after stimulus onset). The width of the Gaussian was set uniformly to 0.6° to match the average RF profile of all neurons (Jancke et al., 1999). In addition, based on [...] that the representation of visual [...] considered as a function of activity [...] space, we minimized the error [...] one-dimensional distributions using the optimal linear estimator (OLE) procedure. [...] optimal in the sense that it extracts [...] information from the firing rates under [...] least-squares fit. Both approaches [...] results. For the OLE-derived [...] resolved approach that captured [...] population responses and the analysis [...] see Jancke et al. (1999). As shown in Fig. 4, the interpolation-derived [...] were monomodal and centered [...] visual field position (indicated by white squares). The spatial arrangement of activity of these distributions implies that neurons in primary visual cortex contribute as an ensemble to the representation of visual field location although each neuron's RF might be broadly tuned to stimulus location, i.e. is characterized by RF sizes several fold larger than the stimuli employed. As discussed above, the mere construction of these representations does not provide much information about ongoing processing mechanisms. We therefore asked to what extent the representation of composite stimuli consisting of two elementary stimuli can be predicted from the representations of the elementary ones, thereby addressing the question of neural interactions within the population representation. If there were no interactions within the population, then the distributions of the composite stimuli would be predicted to be the linear superpositions of the distributions of the component elementary stimuli. To test this hypothesis, we built distributions based on the same estimator used for elementary stimuli, but now weighting each cell's contribution with the firing rate observed in response to the composite stimuli. Fig.
5 illustrates the distributions of composite stimuli and their superpositions. Both the measured and the superimposed distributions of population activity were monomodal for small, and bimodal for large stimulus separations, the transition occurring at around 1.6° separation. The most striking deviation from the linear superposition was a reduction of activity in the measured responses, which was particularly strong for small stimulus separations. This reduction was not due to a saturation of population activity since it was also observed for composite stimuli of larger separations, where the distributions were bimodal and had little overlap (for a more extended discussion, see Jancke et al., 1999).

Fig. 5. The measured two-dimensional activity distributions recorded in the visual cortex of the six composite stimuli (left, from top to bottom, 0.4-2.4° separation, cf. Fig. 3) were compared to the superpositions of their component elementary stimuli (right). The activity distributions were based on spike activity of 178 cells averaged over the time interval from 30 to 80 ms after stimulus onset. Same conventions as in Fig. 4; the color scale was normalized to peak activation separately for each row. For small stimulus separation, note the remarkably reduced level of activation for the measured as compared to the superimposed responses. (This behavior could not be explained by saturation effects, see text.) A transition from monomodal to bimodal distributions was found between 1.2 and 1.6° separation. The bimodal distribution recorded for the largest stimulus separation comes close to matching the superposition. However, inhibitory interaction can still be observed. An asymmetry in the shape and amplitudes between the representations of the left and the right stimulus component was present for the measured as compared to the superimposed distributions, specifically for stimulus separations of 1.2 and 1.6°. Reproduced, with permission, from Jancke et al., 1999. Copyright 1999 by the Society for Neuroscience.

A quantitative assessment of this suppressive interaction revealed its dependence on stimulus distance. The total activation in the population distribution was computed as the area under the distribution and was expressed as a percentage of the total activation contained in the superposition. This percentage was always below 100%, confirming the inhibitory effect as a consequence of using composite stimuli. Suppression was strongest for small distances and decreased with increasing distances (Fig. 6).

[Figure 6: x-axis 'separation distance (degree)', 0.4-2.4; y-axis in percent, 60-100]

Fig. 6. Interaction-based suppression of the population activity recorded in cat visual cortex induced by composite stimuli as a function of separation between the two component stimuli. The total activation in the distribution was expressed as a percentage of the total activation in the superposition. Inhibition was strongest for zero distance (0.4° separation) and decreased almost monotonically with increasing distance, but was still present at the largest separation tested (2.4°).

[Figure 7: x-axis 'stimulus locations (deg)']

Fig. 7. Constructed versus real position of the elementary stimuli using the averaged spike activity. The positions of the maximum of the population distributions are shown for the seven elementary stimuli as a function of the real stimulus position. The dotted line indicates the perfect match between estimated and real stimulus position. Stimulus position can be accurately estimated; however, there is a systematic deviation in localization for all locations that might reflect a true mislocalization as described recently for human perception.

To quantitatively assess the accuracy with which the distributions of population activity represent location, we compared the position of the maximum of each distribution to the respective stimulus position. Fig. 7 plots these constructed positions against the real stimulus positions. For all positions tested, there was a systematic deviation of on average 0.20 ± 0.11°. Interestingly, in a recent psychophysical study, briefly presented stimuli have been found to be mislocalized. When observers were asked to localize the peripheral position of a probe with respect to the midposition of a spatially extended comparison stimulus, they tended to judge the probe as being more toward the periphery than the midposition of the comparison stimulus (Müsseler et al., 1999). It might therefore be speculated that the systematic error is not due to a bias in sampling RFs or to errors in reconstruction, but instead reflects a true mislocalization in the representation.
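The two summary measures used in this section, total activation relative to the superposition (Fig. 6) and the position of the distribution maximum (Fig. 7), are simple to compute once the distributions are sampled on a common, evenly spaced grid. The sketch below assumes exactly that; the function names are hypothetical, and the toy distributions (Gaussian profiles and a uniform 25% reduction) are invented for illustration and are not data from the study.

```python
import numpy as np


def percent_of_superposition(measured, superposition):
    """Total activation of the measured composite distribution as a
    percentage of the total activation of the linear superposition.
    On a common, evenly spaced grid the bin size cancels in the ratio."""
    return 100.0 * np.sum(measured) / np.sum(superposition)


def peak_position(distribution, axis_values):
    """Position (in stimulus coordinates) of the distribution maximum."""
    return axis_values[np.argmax(distribution)]


# Toy 1-D example (invented numbers, not data from the chapter)
x = np.linspace(-0.4, 2.8, 321)                       # visual-field axis (deg)
elem_a = np.exp(-(x - 0.2) ** 2 / (2 * 0.5 ** 2))     # elementary stimulus at 0.2 deg
elem_b = np.exp(-(x - 1.0) ** 2 / (2 * 0.5 ** 2))     # elementary stimulus at 1.0 deg
superposition = elem_a + elem_b
measured = 0.75 * superposition                       # assumed uniform suppression

pct = percent_of_superposition(measured, superposition)
error_a = peak_position(elem_a, x) - 0.2              # constructed minus real position
print(pct, error_a)
```

A value below 100 corresponds to the suppressive interaction described above; repeating the calculation for each separation distance gives the kind of curve plotted in Fig. 6.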
Fig. 8. Population representations of the five elementary stimuli (cf. Fig. 3) recorded in rat somatosensory cortex depicted as two-dimensional activity distributions (top) over the glabrous skin surface of the hindpaw (cf. Figs. 1 and 2). The construction was based on the activity of 206 neurons. Activity distributions were computed in the time interval between 16 and 30 ms after stimulus onset corresponding to the neural peak responses. The activation level is shown in a color scale normalized to maximal activation separately for each stimulus. The grid overlying the distribution is intended to facilitate localization of activity. Red indicates high levels of activation. Bottom: figurines of the hindpaw schematically indicate the position of the elementary stimuli. Note that for each stimulus, the focal zone of activation is fairly centered on each stimulus location.
Distance -dependent interactions for composite stimuli observed in rat somatosensory cortex Similar to our procedure utilized for the exploration of interaction effects in cat visual cortex, we constmcted distributions of population activation in response to a set of small tactile stimuli which differed in their position along the distal-proximal extension from the tip o f digit 3 to the palm. In analogy to the visual cortex study, these stimuli were termed 'elementary stimuli'. The distributions of population activity were derived from single-cell responses from 206 neurons recorded in the hindpaw representation of rat somatosensory cortex. They were obtained after backprojection of cortical activity onto stimulus space and can be regarded as a population receptive field defined in skin space. Similar to the visual cortex study, we used a two-dimensional Ganssian interpolation procedure, in which the RF centers were weighted with the normalized firing rate of each neuron. As the distribution of all recorded RFs was not homogenous, a higher number of neurons w e r e sampled at t h e distal part of the paw representation, the population activity was normalized for sampling density. The resulting color coded activity distributions of the neural population are depicted in :Fig. 8. The corresponding stimulus configuration is indicated in a schematic drawing. T h e areas with maximum population activity matched fairly well the sites of tactile stimulation, except for the representation of the proximal hindpaw of stimulus site 5, where we observed a fairly broad distribution without a sharp peak of activity. The shape of the distributions showed a substantial degree of variability, a finding not that evident for visual representations. The activity distri,
bntions for stimulation of sites 2 and 4 were rather distinct compared to the population representations for stimulation sites 1 and 3. The activity distribution was scaled to cover the whole color table in order to illustrate the shape of the population distributions. Quantitative assessment revealed that the population response amplitudes varied only by about 15% between sites 1 and 4. Despite the density normalization, activity at site 5 reached only 55% of the maximum activity elicited at site 4. We therefore assume that the weak population response of this site might reflect a genuine difference of the cortical representation of this very proximal part of the hindpaw. To address the question of neural interaction dynamics within the population representations, we compared the activity distribution obtained for composite stimuli to their superposifions. As in the visual cortex study, evidence for interactions was inferred from deviations of the population representations of the composite stimuli from the linear superpositions of the component elementary stimuli. We therefore build population representations based on the same estimator used for elementary stimuli, but now weighting each cell's contribution with the firing rate observed m response to the composite stimuli. In Fig. 9, the population activity distribution for the measured and the superimposed distributions are illustrated. The measured distributions were monomodal for smaller stimulus separations (stimuli 1-2 and 1-3, 7 and 12 turn). Bimodal distributions were only found for larger separations of 20 and 29 mm (stimuli 1-4 and 1-5). Similar to what we had found in the visual cortex, the most striking devia-
167
~
__imm!
tion from the linear superposition was a reduction of activity compared :to tiae measured responses. Again, this suppressive effect was particul~ly strong for smaller stimulus separations between 7 and 12 mm. The distance-dependent Suppression was quantified
Fig. 9 The measured two-dimensional activity distributions recorded in rat somatosensory cortex of the four composite stimuli itopi from left to right: 5-24 mm skin surface separation, cf. Fig. 3)were compared to the superpositions of the representations oJ their component elementary stimuli (middle). The activity distt[bufions were based on spike activity 0f206 ceils averaged oi er the time interval from 16 to 30 ms after stimulus onset, S mac conventions as in Fig. 8, the cot0r-scale was normalized to :oak activation separately for each column. For small stimuius ~ iparation, note the remarkably reduced Ievel of activation for the measured (top) as compared to the superimposed (middle) sponses. The measured distributions were monomodal for sn ll-er stimulus separations (stimUli 1-2 and 1-3, 7 mm and t2 ~). Bimodal distributions were only found for larger separatioan~ of 20 and 29 mm (stimuli 1-4 and i-5). The bimodal distribution recorded for the largest stimulus separation comes close to match the superposition. As in the case of the visual cortex study, inhibitory interaction can still be observed. An asymmetry in the shape and amplitudes between the representations was present for the measured as compared tO the superimposed disttbutions, with the tendency of an attraction of activity towards the distal aspects Of the hindpaw.
Q
168 "-', 10o
40-
j,
L~ 9o "~
7 °
E E v
35-
g
30
°
0
80
25-
f
0D
t.p
,° /
..go 20-
70
15 •-
60 0
5
10
15
20
25
30
separation distance (ram skin) Fig. 10. Interaction based suppression of the population activity recorded in the somatosensory cortex induced by composite stimuli as a function of separation between the two component stimuli. Similar to our findings in the visual cortex (cf. Fig. 6), the total activation in the distribution was expressed as percentage of the total activation in the superposition. Inhibition was strongest for the smallest separation tested and decreased almost monotonically with increasing distance, but was still present at the largest separation tested (29 mm skin surface).
by expressing the integrated population activity in the measured distributions to those obtained from superposition. As shown in Fig. 10, we found a clear distance dependence very similar to that obtained in the v~sual cortex. ' We also found a substantial a s y m m e t r y in the distributions for composite stimuli (Fig. 9) that was even more pronounced than that observed in the Visual cortex. This asymmetry consisted in an attraction of activity evoked from composite stimuli towards the distal aspects of the hindpaw, i.e. in the direction of the digit representation. It is well known that the cortical map of the hindpaw is dominated by the representation of the digits. In contrast, a similar asymmetry in the visual field representations at the scale of our stimuli is not present. In single-cell recordings f r o m monkey area 3b, a systematic bias in t h e spatial distribution of inhibitory surrounds of tactile RFs towards the fingertip was reported (DiCarlo et al., 1998). :We therefore conclude that the substantial asymmetry found in SI probably reflects the representational constraints present in the somatotopic representation. In order to illustrate the accuracy with which the population distributions represent locations within the skin o f the hindpaw, the coordinates along the distal to proximal dimension of the locations with maximal activity levels were plotted as function of
the real stimulus location. Points on the diagonal indicate an ideal representation of the stimulus location. The distance of the points from the diagonal is a measure of the discrepancy between the reconstructed and the real stimulus locations. As shown in Fig. 11, the reconstructed locations were usually shifted to the proximal portions of the paw. Only in the case of stimulation at site 4 was the reconstructed stimulus location situated more distally than the actual position. We are not aware of any psychophysical evidence for a systematic mislocalization towards the tip of the hands, although there are many reports of significant mislocalizations after plastic reorganizations (Sterr et al., 1998; Braun et al., 2000). Accordingly, it remains an open question how far the observed errors in localization reflect shortcomings in RF sampling. Using a population of 206 neurons, the stimulus position could be reconstructed with an accuracy of ±3.39 mm, given an average RF size of 56 mm² (cf. Fig. 11).
Fig. 11. Constructed versus real position of the elementary stimuli using the averaged spike activity (abscissa: stimulus locations, mm). The positions of the maximum of the population distributions are shown for the five elementary stimuli as a function of the real stimulus position. The dotted line indicates the perfect match between estimated and real stimulus position. Stimulus position can be accurately estimated; however, fluctuations appear to be more random.
Recently, it was reported that simultaneous multi-site neural ensemble recordings in three areas of the primate somatosensory cortex (areas 3b, SII and 2), consisting of small neural ensembles (30-40 neurons) of broadly tuned somatosensory neurons, were able to identify correctly the location of a single tactile stimulus on a single trial (Nicolelis et al., 1998). We performed a type of bootstrap analysis, in which neurons from the entire population were randomly selected to generate subpopulations of 33, 40, 50 and 67 neurons. Re-analysis of the data showed that the localization error was significantly reduced when we increased the number of neurons from 33 to 40, while further increase yielded little further improvement in localization (Fig. 12A). Interestingly, a similar analysis performed in the visual cortex revealed a comparable dependency on neuron number (Fig. 12B).
Fig. 12. (A) Average deviation of the activity distributions from an ideal localization of the stimulated sites as a function of neuron numbers in the somatosensory cortex (abscissa: number of neurons in population). There is a fairly strong reduction in the deviation from ideal localization when the number of neurons is increased from 30 to 40 cells. Further increase of neuron number yields only little further increase in performance of localization. (B) Average deviation of the activity distributions from an ideal localization of the stimulated sites as a function of neuron numbers in the visual cortex (abscissa: neuron number). Similar to the data from somatosensory cortex, the most prominent gain in localization is obtained for a fairly small size of population in the range of 40-50 neurons.
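The subsampling analysis described above can be sketched in a few lines. The following is a minimal illustration on synthetic data, not the analysis used for Fig. 12: the one-dimensional skin axis, the Gaussian tuning, the noise level, the kernel width and the function names `estimate_location` and `subsample_error` are all assumptions made for this example, and subpopulations are drawn without replacement, which the text does not specify.

```python
import numpy as np

def estimate_location(centers, rates, grid, width):
    """Location of the maximum of a density-normalized Gaussian population profile."""
    # One Gaussian profile per neuron, evaluated at every grid position.
    profiles = np.exp(-(grid[:, None] - centers[None, :]) ** 2 / (2.0 * width ** 2))
    # Sum of unweighted profiles corrects for uneven sampling by RF centers.
    density = np.maximum(profiles.sum(axis=1), 1e-300)
    return grid[np.argmax(profiles @ rates / density)]

def subsample_error(centers, rates, stimulus, size, n_draws, grid, width, rng):
    """Mean absolute localization error over random subpopulations of a given size."""
    errors = np.empty(n_draws)
    for k in range(n_draws):
        # Assumption: neurons are drawn without replacement.
        idx = rng.choice(centers.size, size=size, replace=False)
        estimate = estimate_location(centers[idx], rates[idx], grid, width)
        errors[k] = abs(estimate - stimulus)
    return errors.mean()

# Synthetic population (hypothetical values, not recorded data).
rng = np.random.default_rng(0)
n_neurons, stimulus = 206, 20.0                      # stimulus position in mm
centers = rng.uniform(0.0, 40.0, n_neurons)          # RF centers along the skin axis
rates = np.exp(-(centers - stimulus) ** 2 / (2.0 * 4.0 ** 2))
rates = rates + rng.normal(0.0, 0.05, n_neurons)     # response variability
grid = np.linspace(0.0, 40.0, 401)

for size in (33, 40, 50, 67, n_neurons):
    err = subsample_error(centers, rates, stimulus, size, 200, grid, 2.0, rng)
    print(f"{size:3d} neurons: mean |error| = {err:.2f} mm")
```

With real recordings, `centers` and `rates` would be replaced by the measured RF centers and normalized responses; how quickly the error levels off depends on the tuning width and noise assumed here.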
Similarities and differences of population representation of simple elementary stimuli and resulting interactions from composite stimuli
Our goal was to explore the general relevance of active, cooperative processes in the representation of simple sensory stimuli by comparing neural interaction effects recorded in primary visual and somatosensory areas. We therefore constructed sets of stimuli that combined aspects of simplicity with aspects of comparability. In our view, small squares of light displayed in the visual field at contiguous locations along a virtual horizontal line might be maximally equivalent to small taps of cutaneous stimuli presented at adjacent locations along the distal-proximal axis of the glabrous skin of the paw. Most importantly, in both cases, the composite stimuli were assembled from these elementary stimuli to yield a set of 'complex' stimuli of variable separation distances (cf. Fig. 3). Under these assumptions, we could demonstrate considerable similarities in the overall properties of population activity recorded in the visual and the somatosensory cortex: (1) Population activity in the parametric space of the visual field or the hindpaw glabrous skin could be characterized by sharply focused distributions centered around the location of the stimulus. (2) The distributions evoked by the composite stimuli could not be predicted from the superpositions of the distributions obtained for the elementary ones. (3) Accordingly, in both modalities, nonlinear interactions played an important role in the generation of 'complex' representations. (4) The effects of interactions consisted of a substantial suppression of response that was dependent on the separation distance between the stimuli. (5) The amount of inhibition seen and the slope of the distance-response function were similar, indicating that the metrics that govern the interactions are comparable, despite their enormous differences in physical attributes. (6) Analyzing the number of neurons necessary to yield a given performance resulted in a comparable population size. However, we also observed a number of dissimilarities. The observed differences mostly deal with aspects of asymmetry within the representation.
Most notably, while population activity in the parametric space of the visual field was fairly similar in amplitude and shape across the seven elementary stimuli, we observed a gradient in amplitude in the representations of the tactile stimuli, with the representation of the most proximal aspects of the hindpaw being considerably weaker and broader.
Embedding parametric maps and interaction ranges in cortical anatomy
It remains to be clarified how far the similarities in the distance dependence of interactions translate into the metrics of their respective cortical representations. In order to provide a first approximation, we take into account the mapping experiments of Tusa and coworkers in cat visual cortex (Tusa et al., 1978). According to their data, 2.8° of the central visual field corresponds roughly to 3 mm of cortical surface. In contrast, the dimensions of the rat hindpaw map in primary somatosensory cortex are in the range of 1-1.5 mm (Chapin and Lin, 1984). This comparison implies that comparable interaction mechanisms might operate on spatial scales of cortical coordinates that differ by a factor of two. It has been speculated that long-range horizontal connections are instrumental in providing a substrate mediating the interaction effects. Widespread horizontal connections spanning several millimeters have been described for visual cortex (Gilbert and Wiesel, 1985; Löwel and Singer, 1992; Kisvarday et al., 1997; Sur et al., 1999). In addition, short-range interactions have been shown to be involved in contextual modulations (Das and Gilbert, 1999). It is conceivable that the spatial ranges of short- and long-range connections are different in rat cortex (cf. Hickmott and Merzenich, 1998; Cauller et al., 1998), thereby providing a species- and modality-specific adaptational scaling to the needs of interaction dimensions (Dinse and Schreiner, 2001).
Conclusions
We used a population coding approach to visualize and to investigate representations of simple sensory stimuli in terms of their parametric spaces. We provided evidence that nonlinear interactions are crucially involved in the generation of representations evoked by composite stimuli built from simple ones. This approach was applied in studies of the visual and somatosensory cortex to address the question of a possible modality- and area-independent role of neural interactions (Dinse and Schreiner, 2001). We found that the population activity to elementary and composite stimuli shared a number of substantial similarities. Most notably, we found a comparable distance dependence of nonlinear suppressive effects. These data raise the question of a modality-specific adaptational scaling of the spatial ranges of short- and long-range connections to the needs of interaction dimensions. We are currently extending our approach to the analysis of moving stimuli and the role of stimulus history for the establishment of a moving trajectory of cortical activity. In addition, we have initiated studies to explore the impact of plastic reorganizations on nonlinear interactions. An ultimate goal would be to record neural activity simultaneously, which would additionally allow insight into the ongoing dynamics of cooperative processes. Combining this technique with real-time optical imaging can provide insight into the cortical layout of the spatial interaction ranges and their implementation by anatomical connections.
Acknowledgements
This work was supported by the Deutsche Forschungsgemeinschaft and by the Institute for Neuroinformatics. D.J. holds a Minerva stipend. We thank our colleagues for extensive and extended discussions: Drs. Amir Akhavan, Wolfram Erlhagen, Martin Giese, Gregor Schöner, Axel Steinhage and Werner von Seelen. We also thank Luis Tissot for help in data analysis. We gratefully acknowledge the cooperation of Dr. Thomas Kalt in the studies of rat somatosensory cortex and for providing instructive material.
Appendix
Constructing two-dimensional distributions of population activity by Gaussian interpolation
For each location on the 6 × 6 grid, an average response strength was determined for each cell by averaging the firing rate in the time interval between 40 and 65 ms after stimulus onset, corresponding to the peak responses in the PSTHs. RF profiles [...] a Gaussian [...] RF of each [...] part of the [...] rate. [...] stimulus [...] time interval [...] over 32 stim[...] [...]mated as the [...]. For the [...], the firing [...] rate, $m_n$, [...] and during any single 10-ms bin in the time interval from stimulus onset to 100 ms after stimulus onset. This normalized firing rate

$$F_n(s, t) = \frac{f_n(s, t) - b_n}{m_n - b_n} \qquad (1)$$

was always well defined and positive. The normalized firing rates, $F_n(s, t)$, were depicted at the position of each neuron's calculated RF center. For interpolation of the data points, the width of the Gaussian profile was chosen equal to 0.6° in visual space (approximately corresponding to the average RF width of all neurons recorded (Fig. 2A)). To correct for uneven sampling of visual space by the limited number of RF centers, the distribution was normalized by dividing by a density function, which was simply the sum of unweighted Gaussian profiles (width = 0.64°) centered on all RF centers.
Calculation of population representations in the somatosensory cortex
To obtain population distributions that are defined for the parameter skin field location, the area of the hand-plotted RF was convolved with a Gaussian, normalized to an amplitude of unity. To be independent of this particular type of normalization, in a second approach, the same procedure was followed with the Gaussian distribution normalized to an integral of unity. To construct two-dimensional activity distributions across the skin, the calculated Gaussians were summed. To correct for uneven sampling, the distribution was normalized by dividing by a density function, which was the sum of the unweighted Gaussian profiles. The resulting population representation reflects the local distribution of neuronal activation, with the highest amplitude at the actual location of a common stimulus encoded by the entire population.
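The interpolation and density normalization described in this appendix can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: "width" is interpreted as the standard deviation of an isotropic Gaussian, the RF centers and responses are synthetic, and the function names `normalize_rate` and `population_distribution` are hypothetical.

```python
import numpy as np

def normalize_rate(f, b, m):
    """Normalization in the form of Eq. (1): (f - b) / (m - b)."""
    return (f - b) / (m - b)

def population_distribution(rf_centers, rates, grid, width):
    """Gaussian-interpolated population activity, corrected for sampling density.

    rf_centers: (n_neurons, 2) RF center coordinates
    rates:      (n_neurons,) normalized firing rates
    grid:       (n_points, 2) locations at which the distribution is evaluated
    width:      Gaussian standard deviation (assumed meaning of "width")
    """
    # Squared distance from every grid point to every RF center.
    d2 = ((grid[:, None, :] - rf_centers[None, :, :]) ** 2).sum(axis=-1)
    profiles = np.exp(-d2 / (2.0 * width ** 2))    # shape (n_points, n_neurons)
    weighted = profiles @ rates                    # rate-weighted sum of Gaussians
    density = profiles.sum(axis=1)                 # sum of unweighted Gaussians
    return weighted / np.maximum(density, 1e-12)

# Synthetic example: unevenly sampled RF centers and a stimulus at the origin.
rng = np.random.default_rng(0)
rf_centers = rng.uniform(-3.0, 3.0, size=(200, 2))        # degrees, hypothetical
rates = np.exp(-(rf_centers ** 2).sum(axis=1) / (2.0 * 1.0 ** 2))
axis = np.linspace(-3.0, 3.0, 61)
grid = np.array([(gx, gy) for gx in axis for gy in axis])
activity = population_distribution(rf_centers, rates, grid, width=0.6)
print("peak of the population distribution at", grid[np.argmax(activity)])
```

Dividing by the density term means that a region covered by many RF centers is not favored over a sparsely sampled one: if every neuron fired at the same rate, the resulting distribution would be flat.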
References
Albus, K. (1975) A quantitative study of the projection area of the central and the paracentral visual field in area 17 of the cat. I. The precision of the topography. Exp. Brain Res., 24: 159-179.
Allman, J., Miezin, F. and McGuiness, E.L. (1985) Stimulus specific responses from beyond the classical receptive field. Annu. Rev. Neurosci., 8: 407-430.
Arieli, A., Sterkin, A., Grinvald, A. and Aertsen, A. (1996) Dynamics of ongoing activity: explanation of the large variability in evoked cortical responses. Science, 273: 1868-1871.
Barlow, H.B. (1972) Single units and sensation: a neuron doctrine for perceptual psychology? Perception, 1: 371-394.
Blasdel, G.G. and Salama, G. (1986) Voltage-sensitive dyes reveal a modular organization in monkey striate cortex. Nature, 321: 579-585.
Bonhoeffer, T. and Grinvald, A. (1991) Iso-orientation domains in cat visual cortex are arranged in pinwheel-like patterns. Nature, 353: 429-431.
Braitenberg, V. and Schüz, A. (1991) Anatomy of the Cortex. Springer, New York.
Braun, C., Schweizer, R., Elbert, T., Birbaumer, N. and Taub, E. (2000) Differential activation in somatosensory cortex for different discrimination tasks. J. Neurosci., 20: 446-450.
Burkitt, G.R., Lee, J. and Ts'o, D.J. (1998) Functional organization of disparity in visual area V2 of the macaque monkey. Soc. Neurosci. Abstr., 24: 1978.
Cauller, L.J., Clancy, B. and Connors, B.W. (1998) Backward cortical projections to primary somatosensory cortex in rats extend long horizontal axons in layer I. J. Comp. Neurol., 390: 297-310.
Chapin, J.K. and Lin, C.S. (1984) Mapping the body representation in the SI cortex of anesthetized and awake rats. J. Comp. Neurol., 229: 199-213.
Chapin, J.K., Moxon, K.A., Markowitz, R.S. and Nicolelis, M.A. (1999) Real-time control of a robot arm using simultaneously recorded neurons in the motor cortex. Nat. Neurosci., 2: 664-670.
Das, A. and Gilbert, C.D. (1997) Distortions of visuotopic map match orientation singularities in primary visual cortex. Nature, 387: 594-598.
Das, A. and Gilbert, C.D. (1999) Topography of contextual modulations mediated by short-range interactions in primary visual cortex. Nature, 399: 655-661.
DiCarlo, J.J., Johnson, K.O. and Hsiao, S.S. (1998) Structure of receptive fields in area 3b of primary somatosensory cortex in the alert monkey. J. Neurosci., 18: 2626-2645.
Diesmann, M., Gewaltig, M.O. and Aertsen, A. (1999) Stable propagation of synchronous spiking in cortical neural networks. Nature, 402: 529-533.
Dinse, H.R. (1986) Foreground-background-interaction: stimulus dependent properties of the cat's area 17, 18 and 19 neurons outside the classical receptive field. Perception, 15: A6.
Dinse, H.R. and Jancke, D. (2001) Time-variant processing in V1: from microscopic (single cell) to mesoscopic (population) levels. Trends Neurosci., 24: 203-205.
Dinse, H.R. and Schreiner, C.E. (2001) Do primary sensory areas play homologous roles in different sensory modalities? In: A. Schüz and R. Miller (Eds.), Cortical Areas: Unity and Diversity: Conceptual Advances in Brain Research. Harwood, in press.
Dinse, H.R., Jancke, D., Akhavan, A.C., Kalt, T. and Schöner, G. (1996) Dynamics of population representations of visual and somatosensory cortex based on spatio-temporal stimulation. In: S.I. Amari, L. Xu, L.W. Chan, I. King and K.S. Leung (Eds.), Progress in Neural Information Processing. International Conference on Neural Information Processing, ICONIP '96. Springer, Singapore, pp. 1285-1290.
Doetsch, G.S. (2000) Patterns in the brain. Neuronal population coding in the somatosensory system. Physiol. Behav., 69: 187-201.
Erlhagen, W., Bastian, A., Jancke, D., Riehle, A. and Schöner, G. (1999) The distribution of neuronal population activation (DPA) as a tool to study interaction and integration in cortical representations. J. Neurosci. Methods, 94: 53-66.
Gallant, J.L., Connor, C.E. and van Essen, D.C. (1998) Neural activity in areas V1, V2 and V4 during free viewing of natural scenes compared to controlled viewing. NeuroReport, 9: 2153-2158.
Gardner, E.P. and Palmer, C. (1989) Simulation of motion on the skin. I. Receptive fields and temporal frequency coding by cutaneous mechanoreceptors of OPTACON pulses delivered to the hand. J. Neurophysiol., 62: 1410-1436.
Gilbert, C.D. and Wiesel, T.N. (1985) Intrinsic connectivity and receptive field properties in visual cortex. Vision Res., 25: 365-374.
Gilbert, C.D. and Wiesel, T.N. (1990) The influence of contextual stimuli on the orientation selectivity of cells in primary visual cortex of the cat. Vision Res., 30: 1689-1701.
Gilbert, C., Ito, M., Kapadia, M. and Westheimer, G. (2000) Interactions between attention, context and learning in primary visual cortex. Vision Res., 40: 1217-1226.
Hickmott, P.W. and Merzenich, M.M. (1998) Single-cell correlates of a representational boundary in rat somatosensory cortex. J. Neurosci., 18: 4403-4416.
Hinton, G.E., McClelland, J.L. and Rumelhart, D.E. (1986) Distributed representations. In: J.A. Feldman, P.J. Hayes and D.E. Rumelhart (Eds.), Parallel Distributed Processing. Exploration in the Microstructure of Cognition. Volume 1: Foundations. MIT Press, Cambridge, MA, pp. 77-109.
Hock, H.S. and Eastman, K.E. (1995) Context effects on perceived position: sustained and transient temporal influences on spatial interactions. Vision Res., 35: 635-646.
Hubel, D.H. and Wiesel, T.N. (1962) Receptive fields, binocular interaction and functional architecture in the cat's visual cortex. J. Physiol., 160: 106-154.
Hübener, M., Shoham, D., Grinvald, A. and Bonhoeffer, T. (1997) Spatial relationships among three columnar systems in cat area 17. J. Neurosci., 17: 9270-9284.
Jancke, D. (2000) Orientation formed by a spot's trajectory: a two-dimensional population approach in primary visual cortex. J. Neurosci., 20: RC86.
Jancke, D., Akhavan, A.C., Erlhagen, W., Giese, M., Steinhage, A., Schöner, G. and Dinse, H.R. (1996) Population coding in cat visual cortex reveals nonlinear interactions as predicted by a neural field model. In: C. von der Malsburg, W. von Seelen, J.C. Vorbrüggen and B. Sendhoff (Eds.), Artificial Neural Networks ICANN '96. Springer, pp. 641-648.
Jancke, D., Erlhagen, W., Dinse, H.R., Akhavan, A.C., Giese, M., Steinhage, A. and Schöner, G. (1999) Parametric population representation of retinal location: neuronal interaction dynamics in cat primary visual cortex. J. Neurosci., 19: 9016-9028.
Johnson, K.O. (1974) Reconstruction of population response to a vibratory stimulus in quickly adapting mechanoreceptive afferent fibre population innervating glabrous skin of the monkey. J. Neurophysiol., 37: 48-72.
Kalt, T., Akhavan, A.C., Jancke, D. and Dinse, H.R. (1996) Dynamic population coding in rat somatosensory cortex. Soc. Neurosci. Abstr., 22: 105.
Kanizsa, G. (1976) Subjective contours. Sci. Am., 234: 48-52.
Kim, D.S., Matsuda, Y., Ohki, K., Ajima, A. and Tanaka, S. (1999) Geometrical and topological relationships between multiple functional maps in cat primary visual cortex. NeuroReport, 10: 2515-2522.
Kisvarday, Z.F., Toth, E., Rausch, M. and Eysel, U.T. (1997) Orientation-specific relationship between populations of excitatory and inhibitory lateral connections in the visual cortex of the cat. Cereb. Cortex, 7: 605-618.
Lehky, S.R. and Sejnowski, T.J. (1999) Seeing white: qualia in the context of decoding population codes. Neural Comput., 11: 1261-1280.
LeVay, S., Stryker, M.P. and Shatz, C.J. (1978) Ocular dominance columns and their development in layer IV of the cat's visual cortex: a quantitative study. J. Comp. Neurol., 179: 223-244.
Löwel, S. and Singer, W. (1992) Selection of intrinsic horizontal connections in the visual cortex by correlated neuronal activity. Science, 255: 209-212.
Mendola, J.D., Dale, A.M., Fischl, B., Liu, A.K. and Tootell, R.B.H. (1999) The representation of illusory and real contours in human cortical visual areas revealed by functional magnetic resonance imaging. J. Neurosci., 19: 8560-8572.
Merzenich, M.M., Nelson, R.J., Stryker, M.P., Cynader, M.S., Schoppmann, A. and Zook, J.M. (1984) Somatosensory cortical map changes following digit amputation in adult monkeys. J. Comp. Neurol., 224: 591-605.
Müsseler, J., Van der Heijden, A.H.C., Mahmud, S.H., Deubel, H. and Ertsey, S. (1999) Relative mislocalization of briefly presented stimuli in the retinal periphery. Percept. Psychophys., 61: 1646-1661.
Nicolelis, M.A.L. (1996) Beyond maps: a dynamic view of the somatosensory system. Braz. J. Med. Biol. Res., 29: 401-412.
Nicolelis, M.A.L. (Ed.) (1999) Methods in Neural Ensemble Recordings. CRC Press, New York.
Nicolelis, M.A., Ghazanfar, A.A., Stambaugh, C.R., Oliveira, L.M., Laubach, M., Chapin, J.K., Nelson, R.J. and Kaas, J.H. (1998) Simultaneous encoding of tactile information by three primate cortical areas. Nat. Neurosci., 1: 621-630.
Ramachandran, V.S., Ruskin, D., Cobb, S. and Rogers-Ramachandran, D. (1994) On the perception of illusory contours. Vision Res., 34: 3145-3152.
Salinas, E. and Abbott, L.F. (1994) Vector reconstruction from firing rates. J. Comp. Neurosci., 1: 89-107.
Schreiner, C.E. (1995) Order and disorder in auditory cortical maps. Curr. Opin. Neurobiol., 5: 489-496.
Segev, I. and Rall, W. (1998) Excitable dendrites and spines: earlier theoretical insights elucidate recent direct observations. Trends Neurosci., 21: 453-460.
Sheth, B.R., Sharma, J., Rao, S.C. and Sur, M. (1996) Orientation maps of subjective contours in visual cortex. Science, 274: 2110-2115.
Shoham, D., Hübener, M., Schulze, S., Grinvald, A. and Bonhoeffer, T. (1997) Spatio-temporal frequency domains and their relation to cytochrome oxidase staining in cat visual cortex. Nature, 385: 529-533.
Sillito, A.M., Grieve, K.L., Jones, H.E., Cudeiro, J. and Davis, J. (1995) Visual cortical mechanisms detecting focal orientation discontinuities. Nature, 378: 492-496.
Sterr, A., Müller, M.M., Elbert, T., Rockstroh, B., Pantev, C. and Taub, E. (1998) Perceptual correlates of changes in cortical representation of fingers in blind multifinger Braille readers. J. Neurosci., 18: 4417-4423.
Sur, M., Angelucci, A. and Sharma, J. (1999) Rewiring cortex: the role of patterned activity in development and plasticity of neocortical circuits. J. Neurobiol., 41: 33-43.
Swindale, N.V. (2000) How many maps are there in visual cortex? Cereb. Cortex, 10: 633-643.
Swindale, N.V., Matsubara, J.A. and Cynader, M.S. (1987) Surface organization of orientation and direction selectivity in cat area 18. J. Neurosci., 7: 1414-1427.
Szulborski, R.G. and Palmer, L.A. (1990) The two-dimensional spatial structure of nonlinear subunits in the RFs of complex cells. Vision Res., 30: 249-254.
Thier, P., Dicke, P.W., Haas, R. and Barash, S. (2000) Encoding of movement time by populations of cerebellar Purkinje cells. Nature, 405: 72-76.
Tsodyks, M., Kenet, T., Grinvald, A. and Arieli, A. (1999) Linking spontaneous activity of single cortical neurons and the underlying functional architecture. Science, 286: 1943-1946.
Tusa, R.J., Palmer, L.A. and Rosenquist, A.C. (1978) The retinotopic organization of area 17 (striate cortex) in the cat. J. Comp. Neurol., 177: 213-235.
Von der Heydt, R., Peterhans, E. and Baumgartner, G. (1984) Illusory contours and cortical neuron responses. Science, 224: 1260-1262.
Weliky, M., Bosking, W.H. and Fitzpatrick, D. (1996) A systematic map of direction preference in primary visual cortex. Nature, 379: 725-728.
Westheimer, G. (1979) Cooperative neural processes involved in stereoscopic acuity. Exp. Brain Res., 36: 585-597.
Westheimer, G. (1990) Simultaneous orientation contrast for lines in the human fovea. Vision Res., 30: 1913-1921.
Wiesel, T.N., Hubel, D.H. and Lam, D.M. (1974) Autoradiographic demonstration of ocular-dominance columns in the monkey striate cortex by means of transneuronal transport. Brain Res., 79: 273-279.
Young, T. (1802) II. The Bakerian Lecture. On the theory of light and colors. Phil. Trans. R. Soc. Lond., 91: 12-48.
Zhang, K. and Sejnowski, T.J. (1999) A theory of geometric constraints on neural activity for natural three-dimensional movement. J. Neurosci., 19: 3122-3145.
M.A.L. Nicolelis (Ed.)
Progress in Brain Research, Vol. 130 © 2001 Elsevier Science B.V. All rights reserved
CHAPTER 11
Coordinate transformations in the visual system: how to generate gain fields and what to compute with them
Emilio Salinas 1,* and L.F. Abbott 2
1 Howard Hughes Medical Institute, Computational Neurobiology Laboratory, The Salk Institute for Biological Studies, 10010 North Torrey Pines Road, La Jolla, CA 92037, USA
2 Volen Center for Complex Systems and Department of Biology, Brandeis University, Waltham, MA 02454-9110, USA
* Corresponding author: Emilio Salinas, Computational Neurobiology Laboratory, The Salk Institute, 10010 North Torrey Pines Road, La Jolla, CA 92037, USA. E-mail: [email protected]
Introduction
Studies of population coding, which explore how the activity of ensembles of neurons represents the external world, normally focus on the accuracy and reliability with which sensory information is represented. However, the encoding strategies used by neural circuits have undoubtedly been shaped by the way the encoded information is used. The point of encoding sensory information is, after all, to generate and guide behavior. The ease and efficiency with which sensory information can be processed to generate motor responses must be an important factor in determining the nature of a neuronal population code. In other words, to understand how populations of neurons encode, we cannot overlook how they compute.
Gain modulation, which is seen in many cortical areas, is a change in the response amplitude of a neuron that is not accompanied by a modification of response selectivity. Just as population coding is a ubiquitous form of information representation, gain modulation appears to be a widespread mechanism of neuronal computation. In particular, it allows information from different sensory and cognitive modalities to be combined. Gain modulated neurons simultaneously represent multiple forms of information in a population code. The responses of ensembles of neurons are necessary to understand what the population is representing; no single neuron is sufficient. The distributed, multi-modal representations of gain modulated neurons are ideally configured to facilitate certain kinds of computations, namely coordinate transformations. Functionally, the gain-modulated population code forms a distributed substrate for both information representation and processing.
Cortical areas that process visual information are subdivided functionally and anatomically into two pathways. The 'where' pathway runs dorsally from primary visual cortex into posterior parietal cortex, and the 'what' pathway runs ventrally from primary visual cortex into inferotemporal cortex (Ungerleider and Mishkin, 1982; Goodale and Milner, 1992). Parietal cortex is involved in the spatial analysis necessary for motor planning and for the localization of external objects (Andersen, 1989; Andersen et al., 1997), whereas inferotemporal (IT) neurons are important for object recognition (Goodale and Milner, 1992; Gross, 1992). In spite of their distinct functional roles, neuronal populations in the two streams are subject to similar forms of gain modulation. Gaze direction provides a strong gain control signal in the dorsal stream, while attention provides a similar signal in the ventral stream. Although gaze-dependent and attention-dependent gain modulation act preferentially on separate processing streams and, to a good extent, independently of each other, they seem to serve the same purpose: the computation of coordinate transformations.
This functional interpretation of a widespread neuronal modulatory mechanism is the subject of this chapter. We review experimental evidence revealing gain modulatory processes in the dorsal and ventral visual pathways, focusing on two questions that have been addressed through analytical and simulation methods: (1) How can gain modulation be implemented by cortical microcircuitry? and (2) How can gain modulation be used to perform behaviorally useful computations?
Retinocentered receptive fields
Neurons that respond to sensory stimuli are typically characterized by their selectivity, which is expressed in terms of a receptive field. In this context, we use a somewhat expanded definition of a receptive field. For example, the receptive field of a visually responsive neuron defines not only the location in the visual field where an image must be placed to trigger a response, but also the specific image pattern that elicits the maximal response at a given stimulus intensity. The receptive field thus defines both the preferred location and preferred visual stimulus for a given neuron. These two receptive field attributes change progressively as information flows more centrally in the visual system, so that receptive field sizes tend to increase while preferred images become more complex. The receptive fields of neurons early in the visual pathway, such as those of retinal ganglion cells and cells in the lateral geniculate nucleus of the thalamus (LGN), are best described in retinal coordinates. This is because the locations of the receptive fields of these neurons are fixed to the eye or, equivalently, are always the same relative to the direction of gaze. Neurons in primary visual cortex or V1 are usually described in retinal coordinates as well, although recent reports suggest a more complex description (Guo and Li, 1997; Trotter and Celebrini, 1999; but see Sharma et al., 1999). A retinocentric receptive field is schematized in Fig. 1. At this level of processing the actual gaze direction, which is determined by a combination of eye and head positions with respect to the body, does not influence neuronal activity by itself. If an image is shown at two different locations in visual space, the neuronal response does not change systematically as long as these locations correspond to the same position on the retina.
Fig. 1. Early visual neurons operate in retinocentered coordinates. Thalamic visual neurons respond to contrasting center-surround patterns. The bars on the right represent the expected firing rate of a hypothetical thalamic neuron in response to the images shown on the left. The cross indicates the fixation point, the location to which gaze is directed. (a) The image is aligned with the receptive field, and the neuron fires rapidly. (b) If the pattern is shown at the same physical location, but at a different position with respect to the fixation point, the neuron does not fire. (c) If the pattern is moved to the original location with respect to the fixation point the neuron responds again, regardless of the actual gaze angle. In other words, when the gaze direction changes, the receptive field moves with the eyes.
Gain modulation in parietal cortex Richard Andersen, initially working with Mountcastle (Andersen and Mountcastle, 1983) and later in collaboration with others (Andersen et al., 1985, 1990; Andersen, 1989; Brotchie et al., 1995), showed that, in contrasl to thalamic or retinal ganglion neurons, visual responses in parietal cortex depend on both the retinal location of a visual stimulus and on gaze direction. In these experiments, parietal neurons responded to spots of light located at various places within the visual field, ff gaze direction is held fixed and the response is plotted as a function of
177
(a)
(b)
::o@o
(c)
(d)
- IOO
{ '°°-I
80
'5.
".~
6o
> k~ O
<
'
0
90
180
270
360
Stimulus location (degrees)
I
'
-20
I
0
'
~'~'I
4o
E
20
"~
o
20
Gaze direction (degrees)
Fig. 2. Gain modulation of parietal neurons. (a) Visual stimuli are displayed at different locations while a monkey directs its gaze (fixates) straight ahead. The cross indicates:the fixation point, the large circle indicates the location of the neuron's receptive field, and the small circles show the tocauons where stimuli are presented, one at a time. (b) Visual stimuli are displayed at a similar set of locations while the monkey :directs its gaZe to the left. In a mad b, the stimuli are presented at the same locations as measured by retinal coordinates. (c) Neuronal activi~ recorded at two different gaze angles during experiments like those in a and b (although both eye and head position were varied). A neuron's firing rate is plotted as a function of stimulus location in retinal coordinates, and the two data sets (filled and open symbols) c0~espond to two different gaze directions. The continuous lines are Ganssian fits. The peaks, which correspond to the preferred spot locations, are the same in the two cases, but the amplitude or gain is different. (d) Hypothetical gain field (i.e. gain factor as a function of gaze angle) for the same neuron. For simplicity, only dependence on the horizontal direction is indicated. Full gain fields are two-dimensional. Diagrams redrawn from Andersen et al. ( 1985): data redrawn from Brotchie et al. (1995).
the position of the spot, the resulting curve typically has a single peak and can be fitted to a Gaussian function. Fig. 2c shows an example. Notice that the x-axis represents retinal coordinates, referring to the stimulus position with respect to the location of gaze (the fixation point). If the set of measurements is repeated using a different fixation point and thus a different gaze direction, the neural response follows a curve with similar shape and preferred location, but with a different amplitude. Thus, the amplitude or gain of the receptive fields of these parietal neurons depends on gaze. As a comparison, the same experiment probing early visual neurons would produce the same peaked curve for all gaze directions. The term 'gain field' was coined to describe this gaze-dependent gain modulation and how it varies as a function of gaze direction. The dependence on gaze direction is fairly linear, although sometimes it is closer to sigmoidal due to saturation effects.
A network-based mechanism for gain-modulatory interactions
One of the striking aspects of gain modulation is that the interaction between gaze direction and retinal location of the visual image is very close to being multiplicative. Neither the baseline firing rate nor the shape of the response curves of the parietal neurons change as a function of gaze direction, and the gain-modulated responses are well described by a product of two functions. One, f(x), depends on stimulus location in retinal coordinates, x, and corresponds to the Gaussian response curve discussed above. The other function, g(y), depends on gaze angle, y, and corresponds to the gain field. The firing rate can then be written as
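$$r = f(x)\,g(y). \tag{1}$$

[Fig. 3a inset: connection weight $w_{ij}$ plotted against receptive field overlap. Network equation shown in the figure: $r_i = [h^G(y) + h^V_i(x) + \sum_j w_{ij} r_j - \theta]_+$.]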
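The product form of Eq. 1 can be illustrated with a short numerical sketch. The Gaussian tuning curve and the rectified linear gain field follow the description in the text, but every parameter value (preferred location, width, peak rate, slope, offset) is an illustrative assumption and is not taken from the recordings.

```python
import numpy as np

def f(x, pref=180.0, sigma=40.0, r_max=80.0):
    """Gaussian tuning to retinal stimulus location x (degrees); parameters are assumed."""
    return r_max * np.exp(-(x - pref) ** 2 / (2.0 * sigma ** 2))

def g(y, slope=0.01, offset=0.6):
    """Roughly linear gain field over gaze angle y (degrees), rectified at zero; parameters are assumed."""
    return np.clip(offset + slope * y, 0.0, None)

def rate(x, y):
    """Firing rate of Eq. 1: r = f(x) g(y)."""
    return f(x) * g(y)

x_grid = np.linspace(0.0, 360.0, 361)   # stimulus locations, 1 degree apart
left = rate(x_grid, -20.0)              # response curve with gaze at -20 degrees
right = rate(x_grid, 20.0)              # response curve with gaze at +20 degrees

# The preferred location is unchanged; only the amplitude differs.
peak_left = x_grid[np.argmax(left)]
peak_right = x_grid[np.argmax(right)]
gain_ratio = right.max() / left.max()
```

With these assumed values the two curves peak at the same stimulus location and differ only by the constant factor g(20)/g(-20), which is the signature of a multiplicative gain field.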
Neurons are typically modeled and thought of as integrators that compute weighted sums of their inputs. How can they achieve the type of nonlinear, multiplicative behavior seen in gain modulation? An early study (Mel, 1993) proposed that nonlinear cooperative interactions between neighboring synapses connected to the same neuron could generate responses that depended multiplicatively on the inputs. This explained multiplicative interactions between two input signals on the basis of nonlinear interactions between synaptic conductances. Although this is a plausible scenario, the extent to which these or other nonlinearities (Koch and Poggio, 1992) give rise to an effective gain-like multiplication remains a question. Another mechanism, proposed later (Salinas and Abbott, 1996), is based on the rich dynamics of networks with recurrent connections. Recurrent connectivity is a well-described feature of cortical circuits (Gilbert and Wiesel, 1983, 1989; Ahmed et al., 1994, 1997), and its dynamical properties have been implicated in many aspects of cortical function, such as response selectivity (Ben-Yishai et al., 1995; Somers et al., 1995; Chance et al., 1999), signal amplification (Douglas et al., 1995), and sustained neuronal activity (Seung, 1996).
Fig. 3a shows a model network representing a small patch of parietal cortex where all cells have similar gain fields. Each of the model parietal cells receives two kinds of external inputs. One, $h^G(y)$, provides an eye-position signal and is the same for all target neurons. The other input, $h^V_i(x)$, provides visual information which can be thought of as coming from early visual cells responding to spots of light. This input is different for each target neuron, giving them different preferred stimulus locations. The two external inputs to target neuron i, $h^G$ and $h^V_i$, are added together. However, the response is not simply additive because, in addition to these external
[Fig. 3 panels. (a) Network diagram. (b) Upper graph: firing rate versus stimulus location (x); lower graph: total external input versus preferred retinal location.]
Fig. 3. A model circuit that produces multiplicative gain fields. (a) A group of parietal neurons (filled circles) receives input from two external networks (open circles). The network on the left provides a signal $h^G(y)$ that depends on gaze angle, y, and is identical for all target cells. The network on the right provides visual signals $h^V_i(x)$ that depend on the location of the visual stimulus, x, and are different for different target cells. The $h^V_i(x)$ terms correspond to Gaussian receptive fields centered at different locations. As indicated by the equation, the firing rate of target cell i is determined by the sum of its external inputs plus a weighted sum of the activity of its neighbors minus a constant threshold. The square brackets with a plus subscript indicate rectification. As indicated by the upper plot, the weight $w_{ij}$ is positive (excitatory) if neurons i and j have similar receptive fields and negative (inhibitory) otherwise. (b) The upper graph shows the responses of a model parietal neuron. Each curve traces the firing rate as a function of stimulus location in retinal coordinates for a fixed gaze angle. The three curves correspond to three different gaze directions. The squares are the actual simulation results, and the lines are fits using Eq. 1. The lower panel shows the total external input (the sum of the two h terms) for each neuron in the network. Because the gaze signal is the same for all neurons, the input curves simply shift up or down as a function of gaze. Nevertheless, the output curve shows a multiplicative interaction. Results modified from Salinas and Abbott (1996).
inputs, the model parietal cells receive recurrent input from their neighbors. This is a key property of the network, and the critical parameters are the connection weights $w_{ij}$. For the multiplicative responses to arise, similarly tuned neurons (with overlapping receptive fields) should excite each other, whereas neurons with different (non-overlapping) receptive fields should inhibit each other. As long as this general rule holds, the exact dependence of the connection strengths on receptive field properties does not affect the result.
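The network equation described in the Fig. 3 caption, a rectified sum of the gaze input, the visual input, and the recurrent input minus a threshold, can be relaxed to a steady state numerically. The sketch below is only an illustration of those dynamics. The weight profile, input amplitudes, threshold, and time step are assumptions, not the values used by Salinas and Abbott (1996), and the weights are rescaled to a spectral norm of 0.9 purely so that convergence is guaranteed, which may weaken the multiplicative effect reported for the original model.

```python
import numpy as np

def simulate(x, y, n=41, steps=500, dt=0.5):
    """Relax r_i = [h_G(y) + h_V_i(x) + sum_j w_ij r_j - theta]_+ to a fixed point."""
    prefs = np.linspace(-40.0, 40.0, n)              # preferred retinal locations (deg), assumed
    diff = prefs[:, None] - prefs[None, :]
    # Excite cells with overlapping receptive fields, inhibit the rest (assumed profile).
    w = np.exp(-diff ** 2 / (2.0 * 8.0 ** 2)) - 0.4
    # Assumption: rescale so the update is a contraction and always converges.
    w *= 0.9 / np.linalg.norm(w, 2)
    h_v = 30.0 * np.exp(-(x - prefs) ** 2 / (2.0 * 10.0 ** 2))  # visual input, differs per cell
    h_g = 0.3 * y                                    # gaze input, identical for all cells
    theta = 5.0                                      # constant threshold, assumed

    def drive(r):
        return np.maximum(h_g + h_v + w @ r - theta, 0.0)   # rectification [.]_+

    r = np.zeros(n)
    for _ in range(steps):
        r += dt * (drive(r) - r)                     # Euler step of dr/dt = -r + [.]_+
    residual = np.max(np.abs(r - drive(r)))          # distance from the fixed point
    return prefs, r, residual

prefs, rates_a, res_a = simulate(x=0.0, y=-10.0)
_, rates_b, res_b = simulate(x=0.0, y=10.0)
```

Comparing `rates_a` and `rates_b` for different gaze angles, and repeating the run with the recurrent weights set to zero, is one way to explore how much of the gain change is carried by the recurrent term under these assumed parameters.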
[...] described by Eq. 1, so they have an almost perfectly multiplicative gain field. This result is quite robust. When noise is added to the inputs or when variability is introduced in the network parameters, the response functions are somewhat distorted and some variability across neuronal response curves arises, but the effect is still close to multiplicative. However, when the recurrent connections are turned off in the model, the responses are far from multiplicative. Thus, recurrent connections seem to be critical for generating multiplicative gain modulation (Salinas and Abbott, 1996).
This model has an additional property: if two inputs at two different retinal locations are presented to the network, the activity profile will have a single peak located at or close to the location of the strongest input. Thus the network also provides a mechanism for target selection, suppressing the activity driven by weak input stimuli. This is consistent with the finding that at least some parts of parietal cortex have a very sparse representation of the visual scene, coding faithfully the locations of only those stimuli that are salient or behaviorally relevant (Gottlieb et al., 1998).
In summary, individual neurons in a recurrently connected network may add inputs driven by independent sources corresponding to different sensory variables, and nevertheless generate responses characterized by a product of functions for each of those variables. In this mechanism, there is no need to invoke explicit multiplication at the synaptic or cellular level. Gain modulation is an emergent property of the network. Since recurrent connections are ubiquitous in the cortex, this connectivity may also provide a basis for multiplicative interactions between other kinds of input signals across cortical areas and modalities (see below).
Coordinate transformations for object localization
Imagine that you are facing a computer monitor (Fig. 4), directing your eyes toward the left corner of the screen, and you want to reach the mouse without shifting your gaze. Arm movements are generated with respect to body position (see, for example, Georgopoulos, 1995), and to compute the direction to the mouse in body-centered coordinates, the retinal location of the mouse must be combined with the
Fig. 4. Schematic example of a simple coordinate transformation. Gaze is directed toward the left corner of the monitor and the task is to reach the mouse without shifting gaze. x is the location of the mouse in retinal coordinates and y is the gaze angle. To reach the mouse, a movement in the direction specified by x + y must be performed.
current eye position relative to the body. The angle x corresponding to the retinal location of the mouse and the gaze direction y must be added to obtain the angle x + y describing the mouse location with respect to the body axis (Fig. 4). How do neurons perform this addition needed to generate the coordinate transformation from retinal to body-centered reference frames?
The first indication that gain modulation could be useful for such coordinate transformations came from the work of Zipser and Andersen (1988). They trained a three-layered artificial neural network to perform the transformation just described. The network was presented with various target locations in retinal coordinates and with various gaze angles. The network was trained, using backpropagation and many examples of correct input-output associations, to compute the target locations in body-centered coordinates. Once the network had learned the correct transformations, they examined the properties of the neurons in the hidden layer. These responses, generated by connections that the backpropagation procedure had produced during training, were similar to the gain-modulated receptive fields found in the recorded parietal neurons. This result suggested that gain modulation provides an efficient solution to the coordinate transformation problem given the input and output representations. This work revealed that the measured neurophysiological properties of real neurons could be understood to underlie a specific, nontrivial computation.
Zipser and Andersen imposed the computation of a coordinate transformation on a network and observed that gain-modulated responses resulted. Another approach is to put gain-modulated responses into a network from the start, and determine the conditions under which coordinate transformations arise.
This provides insight into how and under what conditions gain modulation can perform coordinate transformation calculations (Salinas and Abbott, 1995). The network used for this purpose is shown in Fig. 5. The bottom portion of Fig. 5a represents a set of parietal neurons that have gain-modulated receptive fields like those described experimentally. Their responses are determined by Eq. 1, with the set of neurons including combinations of receptive field locations and gain field modulations taken from the reported distributions across the population of parietal neurons.
[Fig. 5 panels. (a) Output array labeled $r = H(x + y)$ above the parietal array labeled $r = f(x)\,g(y)$. (b) Response curves plotted against stimulus location (x).]
Fig. 5. A model network that performs a coordinate transformation using gain modulation. (a) The large, square network represents a population of parietal neurons with all combinations of the receptive field locations and gain modulation parameters described experimentally. These neurons respond to object location x and gaze angle y according to the equation appearing under the network. The network on the top represents an array of neurons that must encode a linear combination of x and y to generate a motor response given, as indicated by the equation above the linear array of neurons, as a function of the sum x + y. The output neurons are driven by the parietal model neurons through synaptic connections indicated by arrows. (b) The bottom panel illustrates the responses of one gain-modulated parietal neuron, with the three curves corresponding to different gaze directions. The amplitude of the response changes with gaze, but the location of the peak response remains constant. The top panel shows the response of one output neuron. The tuning curve shifts when the gaze angle changes, but its amplitude remains constant. Results modified from Salinas and Abbott (1995).
The output neurons at the top of the network figure represent an array of neurons that encode the target location in body-centered coordinates and can generate a motor response such as reaching to the target. They must have firing rates that are functions of the sum of stimulus location and eye position. The connections between the two layers allow the bottom array to drive the top one. For a particular target position in retinal coordinates, x, and gaze angle, y, some of the parietal neurons are activated, and they must drive the output neurons so that these encode x + y, the target location in body-centered coordinates.
Given this setup, we can determine the synaptic connections that allow the parietal neurons, which combine x and y through a gain interaction, to drive downstream neurons that have responses that depend on x + y.
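A toy version of this two-layer arrangement shows how correlation-based (Hebbian) weights let a gain-modulated population drive output units tuned to x + y. The sketch is a simplified illustration, not the network of Salinas and Abbott (1995). It assumes Gaussian gain fields instead of the roughly linear ones reported experimentally, a regular grid of receptive field centers and gain field centers, and Gaussian output tuning. The names `parietal`, `train_weights`, `decode`, and the `sign` parameter are hypothetical.

```python
import numpy as np

grid = np.arange(-40.0, 41.0, 2.0)        # centres a, b and output preferences c (degrees), assumed

def gauss(u, sigma=5.0):
    return np.exp(-u ** 2 / (2.0 * sigma ** 2))

def parietal(x, y):
    """Gain-modulated population r = f(x - a) g(y - b) for every (a, b) pair, flattened."""
    return np.outer(gauss(x - grid), gauss(y - grid)).ravel()

def train_weights(sign=1.0):
    """Hebbian weights: accumulate (output activity) x (parietal activity) over training pairs."""
    w = np.zeros((grid.size, grid.size ** 2))
    for x in grid:
        for y in grid:
            target = gauss(x + sign * y - grid)      # desired output tuning H(x + sign*y - c)
            w += np.outer(target, parietal(x, y))    # correlation-based increment
    return w

def decode(w, x, y):
    """Preferred value of the most active output unit."""
    out = w @ parietal(x, y)
    return grid[np.argmax(out)]

w_sum = train_weights(sign=1.0)     # readout of x + y
w_diff = train_weights(sign=-1.0)   # a second readout of x - y from the same population
```

Because the same parietal responses feed both weight matrices, `decode(w_sum, 6.0, 10.0)` and `decode(w_diff, 6.0, 10.0)` read out two different linear combinations from one population. The output peak moves with gaze while the parietal tuning curves only change amplitude.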
[...] images provide examples of correct transformations because, when the hand acts as the target, the retinal and body-centered representations are automatically aligned, i.e. the arm angle always equals x + y. Indeed, babies watch their own limb movements before they can control them (Van der Meer et al., 1995). Finally, a third result is that not only the sum, but also any other linear combination of target location and gaze angle can be represented by the array of output neurons, as long as this linear combination corresponds to the output representation used during learning. This means that one downstream network may extract or read out x + y from the activity of parietal neurons, while another downstream network may equally well read out another combination such as x - y from the same responses using similar mechanisms.
This last point has been thoroughly elaborated by Pouget and Sejnowski (1997a). They showed that many psychophysical and lesion data in which the coordinate frame used for object localization could not originally be determined unambiguously, are actually consistent. The key is that the positions of objects are not encoded in one fixed coordinate frame, but through neural activity in parietal cortex from which any appropriate coordinate frame may be read out according to ongoing task requirements (Pouget and Sejnowski, 1997a). They also simulated the effects of a lesion in a model of parietal cortex (Pouget and Sejnowski, 1997b) and found that the model reproduced many of the typical effects found in patients. In particular, the deficit affected multiple frames of reference, including object-centered. The consequences of parietal lesions have been difficult to reconcile, perhaps because the flexibility of the gain-modulated spatial representation used by parietal cells led to widely different outcomes, depending on the combination of encoded quantities being read out by downstream networks.
In conclusion, the advantage of a gain-modulated representation, consisting of a set of neurons tuned to a quantity x and gain modulated by a quantity y, is that a downstream network can easily extract any linear combination of x and y using correlation-based learning. This is particularly useful for sensory-motor control where movements can be practiced.
Additional evidence for modulatory gain control
Following the early studies of Andersen and colleagues, gaze-dependent gain modulation has been reported at numerous stages of the visual system (Galletti and Battaglini, 1989; Galletti et al., 1989; Van Opstal et al., 1995; Guo and Li, 1997; Snyder et al., 1998; Trotter and Celebrini, 1999) and in areas involved in motor functions (Boussaoud et al., 1998). Some studies have shown that even in primary visual cortex around 50% of the neurons display substantial gaze-dependent gain modulation of their orientation or disparity tuning curves (Guo and Li, 1997; Trotter and Celebrini, 1999; but see Sharma et al., 1999). Thus, gain modulation depending on eye position may operate simultaneously at different processing stages, possibly generating a larger final effect.
Reaching for objects requires the computation of multiple coordinate transformations similar to the one shown in Fig. 4. The gain-modulated representations in parietal cortex may be the basis for the
subsequent spatial representations needed for motor execution and object localization (Pouget and Sejnowski, 1997a,b). The simple two-layer network model for such transformations shows that explicit visual representations of the world in head-centered or body-centered coordinates are not absolutely necessary, because the readout units could be the same neurons producing the motor commands. However, visual neurons that are invariant to eye position have indeed been found (Graziano et al., 1997; Duhamel et al., 1997). They represent the output readout layer of a transformation process. In fact, the full transformation from retinal to world-centered coordinates seems to be explicitly computed, and the underlying mechanism appears to be gain modulation. Recent recordings from parietal cortex have shown that area LIP has mostly gain fields that depend on gaze direction, leading to body-centered coordinates useful for gaze control and object reaching, whereas area 7a has mostly gain fields that depend on body position with respect to the world, which may lead to world-referenced responses invariant to eye, head, and body orientation (Snyder et al., 1998). This is consistent with the existence of place fields in the rat hippocampus that encode the animal's position with respect to its environment. Area 7a projects directly to this structure (Snyder et al., 1998), which is believed to be strongly involved in spatial computation (O'Keefe and Nadel, 1978). Thus, gain-modulated signals that depend on eye, head, and body position may be progressively combined until a map of extrapersonal space is formed that is fully invariant with respect to the subject's position.
Coordinate transformations for object recognition
We mentioned at the beginning that thalamic and retinal ganglion neurons operate in retinocentered coordinates because their receptive fields move with the eyes. This, however, poses a problem. A characteristic feature of our visual system is that we are able to recognize objects independently of their location and size. This is not true for the full visual field, since a familiar face may not be recognizable if it appears far in the periphery, but it is true for a large, central region where we can identify a familiar image or object even if we do not look directly at it.
A neural correlate of this phenomenon is provided by high-level visual neurons like those found in area IT. IT neurons are often selective for highly structured complex images (Desimone, 1991; Logothetis et al., 1995). They may respond strongly to faces, for example, producing little or no response for a large variety of other objects. The receptive fields of these neurons are large; diameters of 60° or more are not unusual. More important than sheer size is the property that a relatively small image can have approximately the same effect no matter where it is placed, as long as it is inside the receptive field perimeter (Schwartz et al., 1983; Desimone et al., 1984; Tovee et al., 1994). Thus, if a neuron is selective for faces, a face presented anywhere inside the receptive field will typically produce a much stronger response than a non-face image anywhere inside the receptive field. This is known as translation invariance.
Translation-invariant responses correlate with our capacity to perform location-independent object recognition, but how are these responses generated? This is not a trivial problem, since translation-invariant responses must be evoked by the activity of early visual neurons that are not themselves translation-invariant. Some models achieved invariance to location (and to other image parameters, such as scale and perspective) through synaptic modification rules that link the images of objects appearing close together in time (Földiák, 1991; Wallis and Rolls, 1997). Buonomano and Merzenich (1998) also exploited temporal structure, in this case the spike patterns arising from a distribution of latencies, to generate position-invariant neural responses. Other models have been based on the hierarchical multilayered structure and nonlinearities exhibited by the visual system (Fukushima, 1980; Wallis and Rolls, 1997; Riesenhuber and Poggio, 1998, 1999).
Detailed experimental support for these mechanisms is not abundant, but it is possible that the visual system uses some or all of them in order to achieve the level of translation invariance exhibited behaviorally. A different approach was taken by Olshausen, Anderson and Van Essen who, building on earlier ideas proposed by Hinton (1981a,b), developed a model in which the translation invariance problem is cast as a coordinate transformation, in this case from a retinocentered representation to an object-centered one based on attention (Anderson and Van Essen.
[...] 1997) studied the responses of neurons in V4, an area that projects directly to IT (Felleman and Van Essen, 1991). V4 neurons have receptive fields that [...] depends on the image I being shown. The function F describes the receptive field properties, such as location, orientation preference, spatial frequency selectivity, and so on, and it can be characterized by filtering operations like those commonly used to describe the responses of V1 complex cells (Heeger, 1991). The second term, G, depends on the location z where attention is directed, relative to a point called the preferred attentional locus of the neuron, b. The function G defines the attentional gain field, and b is the locus where directed attention produces the maximum gain (Fig. 6a). When z is equal to b, this modulatory term is maximal, whereas if attention is directed far away from the preferred attentional locus, it goes to zero. Combining the two terms gives

$$R = F(I; a)\,G(z - b). \tag{2}$$
Here a corresponds to the receptive field center. This expression, which represents a reasonable mathematical fit to the data, summarizes the experimental findings and describes how the locus of attention controls the gain of V4 cells.
Neurons with visual responses modulated by attentional gain fields can generate translation-invariant responses in downstream neurons if these are driven by synapses with appropriate strengths (Salinas and Abbott, 1997a,b). This can be shown analytically if a few simplifications are allowed, and its validity under more general conditions can be verified by simulating a model network. The key for this result to be true is that the set of V4 neurons should include many combinations of receptive field properties and gain field centers that are not correlated or aligned. Thus, a given location in the visual field must be covered by neurons with different combinations of preferred orientation, preferred spatial frequency, and other receptive field parameters. For each of these receptive fields, there must be several neurons with different preferred attentional loci that are independent of the receptive field parameters. The experimental data support these assumptions (Connor et al., 1997).
The model used in the simulations is schematized in Fig. 6b. An image is projected on a pixel array to generate a set of model V4 responses determined by Eq. 2. These responses are then synaptically weighted through feedforward connections to produce a model IT response. The array of V4 responses consists of a set of 32 × 16 receptive field centers spread uniformly. At each location, there are neurons with four orientation preferences, three frequency selectivities, and six different gain field
centers (for horizontal translation only). This gives a total of 32 × 16 × 4 × 3 × 6 = 36,864, or about 37,000, model V4 responses. The crucial elements in the network are the synaptic weights $W_i$. We have found a mathematical condition that the weights must satisfy in order for the IT neuron to be translation-invariant. This condition can be satisfied if the weights develop through simple correlation-based learning, which is precisely how they were set in the simulations.
One simple training procedure that produces synaptic connections that satisfy the condition for translation invariance is the following. During a training period, a selected image is presented and translated to all locations while the IT neuron is set active (i.e. its firing rate, $R_{IT}$, is set to a high value throughout this period) while Hebbian learning takes place. Every time the training image appears at
[Fig. 6 panels. (b) Pixel array, V4 grid and IT neuron, with the expression $R_{IT} = [\sum_i W_i R_i - \theta]_+$. (c), (d) IT response versus image location (pixels 0 to 64).]
a given location, the V4 responses are computed and an amount proportional to the product $R_i R_{IT}$ is added to each synapse $W_i$, where $R_i$ represents the firing rate of V4 neuron i and $R_{IT}$ is the preset response of the IT neuron. During the training period, the location of attention is maintained at the center of the training image. Once the image has appeared at all locations, the weights are not modified any more, and the model is tested to determine how the IT cell responds. It is important to stress that this particular mechanism for establishing the connections is not crucial for the success of the model. Weights that satisfy the condition needed for translation-invariant responses are not unique and could thus be established in different ways.
Fig. 6c shows the response of the model IT neuron to translated versions of the same image used
Fig. 6. A model of translation-invariant responses based on attentional gain modulation. (a) The response of a V4 neuron depends on the product of its receptive field and its attentional gain field (Connor et al., 1996, 1997). The small open circle represents the receptive field of a V4 neuron with its center at position a. The large gray circle represents the attentional gain field of the neuron with its center at position b, the preferred attentional locus. To evoke a strong response, an image that matches the receptive field selectivity must appear at a, while attention is directed to a location near b. (b) Network model for translation-invariant responses. The bottom grid is a pixel array on which images are displayed. The middle grid represents an array of V4 neurons that respond to the image and are gain modulated by attention according to Eq. 2. Each crossing point in this grid represents a set of V4 neurons with the same receptive field location but with different combinations of preferred orientation, optimal spatial frequency and preferred attentional locus. The topmost neuron represents an IT cell that is driven by the activity of the V4 layer through synaptic connections $W_i$. In the expression shown, $\theta$ is a constant threshold, $R_i$ is the firing rate of V4 cell i, and the square brackets with a plus subscript indicate rectification. (c) Response of the model IT neuron versus the location of a preferred image. The response is large because this is the same image used during the training of the network. Filled symbols correspond to attention located at pixel 16 and open symbols correspond to attention located at pixel 48. The IT response depends on the location of the image relative to the point where attention is focused. (d) Response of the same model IT neuron versus the location of a less effective image. The response is much reduced, because the cell is selective for the image used during training, not this image. As in c, the response depends on stimulus location in attention-centered coordinates. Image sizes were 16 × 16 pixels. Results modified from Salinas and Abbott (1997a).
Fig. 7. Correspondence between invariant object recognition and activity in inferotemporal cortex according to the model based on attentional gain modulation. Rectangles correspond to visual displays. The small crosses represent the fixation point (the point to which gaze is directed), and the crosshairs, which are not part of the actual visual display, indicate the location where attention is directed. The bars on the right show the expected response of a hypothetical face-selective IT neuron in the three situations (a, b and c), according to the model.
during learning. The plot shows the firing rate of the IT cell as a function of image location. The filled circles correspond to attention located at pixel 16, whereas the hollow circles correspond to attention directed at pixel 48. The IT neuron responds strongly whenever attention is focused close to the center of the image, regardless of the location of the image on the retina or viewing screen. Fig. 6d shows that when a different pattern is shown to the same IT cell the evoked response is smaller. This response falls off gradually as the image moves away from the location of attention. The receptive field of the model IT neuron thus shifts with attention, while retaining its selectivity for a specific image pattern; the model neuron operates in attention-centered coordinates.
Fig. 7 schematizes the relationship between object recognition and IT activity according to the mechanism based on a coordinate transformation from a retinal to an attention-centered reference frame. This figure applies equally to the models by Olshausen et al. (1993) and Salinas and Abbott (1997a,b). When
an object for which the neuron is selective appears in the visual field, the IT neuron responds when attention is directed toward that object (Fig. 7a). If a different object appears and attention is drawn to it, the neuron stops responding because, even though the new image is located at the center of the neuron's receptive field, the cell is not selective for it (Fig. 7b). However, when attention is directed back to the first image, the neuron fires rapidly again (Fig. 7c). Notice that the visual display is exactly the same in Fig. 7b and c; only the locus of attention has changed.
The models predict that an object may be recognized when attention is focused on it, but not when it appears far away from the attended location. Mack and Rock and collaborators have performed psychophysical experiments that support this prediction (Mack and Rock, 1998). They designed a paradigm in which a test stimulus is displayed while subjects perform an attentionally demanding visual task unrelated to the test stimulus. In trials in which subjects performed the task and expected the appearance of the additional but unspecified stimulus, they were able to identify it reliably, without affecting performance of the primary task. In this condition, the subjects presumably divided their attention in such a way that they focused on the two stimuli simultaneously and effectively. In contrast, when subjects were engaged in the primary task and were not expecting the appearance of a test stimulus, a significant fraction of the time they were not even able to detect it; the test stimulus simply went unnoticed in many of these trials (~25%), presumably because attention was fully devoted to the primary task. This, in itself, was quite remarkable, but most importantly for our discussion, the subjects that did report seeing an additional stimulus in this 'inattention' condition could not identify its shape above chance levels.
This is in marked contrast to other attributes of the test stimulus, such as color, orientation, location and numerosity, which were not correctly identified all the time, but still were identified more often than expected from random guessing. These results suggest that, if attention is far away from a visual stimulus, its shape cannot be determined. This provides strong evidence for attention playing an essential role in object recognition, as suggested by the theoretical models.
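The Hebbian training procedure for the attentional gain-field model described earlier, and the resulting attention-centered response, can be sketched in one dimension. This is a heavily simplified illustration and not the simulation of Salinas and Abbott (1997a). It assumes that F reduces to the pixel intensity at the receptive field center a, that the gain field G is Gaussian, that pixels lie on a ring (periodic boundaries, chosen to avoid edge effects), and that the threshold is zero. The function names are hypothetical.

```python
import numpy as np

N = 64                                   # number of pixels, arranged on a ring (assumption)
positions = np.arange(N)                 # used for centres a, attentional loci b, image positions

def ring_dist(u, v):
    d = np.abs(u - v) % N
    return np.minimum(d, N - d)

def place(pattern, p):
    """Embed a short pattern in the ring with its centre at pixel p."""
    image = np.zeros(N)
    idx = (p - pattern.size // 2 + np.arange(pattern.size)) % N
    image[idx] = pattern
    return image

def v4_responses(image, z, sigma=4.0):
    """Eq. 2 for all cells: R[a, b] = F(I; a) * G(z - b), with F taken as pixel intensity."""
    gain = np.exp(-ring_dist(z, positions) ** 2 / (2.0 * sigma ** 2))
    return np.outer(image, gain)

def train(pattern, r_it=1.0):
    """Show the image at every location with attention on it; add R_i * R_IT to each weight."""
    w = np.zeros((N, N))
    for p in positions:
        w += r_it * v4_responses(place(pattern, p), z=p)
    return w

def it_response(w, pattern, p, z, theta=0.0):
    """Rectified weighted sum of V4 responses for an image at p with attention at z."""
    total = np.sum(w * v4_responses(place(pattern, p), z))
    return max(total - theta, 0.0)

rng = np.random.default_rng(0)
pattern = rng.random(16)                 # arbitrary non-negative training image, 16 pixels wide
w = train(pattern)
attended_16 = it_response(w, pattern, p=16, z=16)
attended_48 = it_response(w, pattern, p=48, z=48)
unattended = it_response(w, pattern, p=48, z=16)
```

In this toy setting the response depends only on the offset between image location and attended location, so the two attended cases give the same value, while the response collapses when attention is directed 32 pixels away from the image.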
Two important points should be stressed about the mechanism for translation invariance that we have described. First, we have modeled only a single IT neuron, but other units could be included in the network, driven by the same population of V4 neurons. Different IT neurons could then be selective for different images. The major constraint imposed by the simple learning mechanism that we have proposed is that only small numbers of neurons should be active at the same time during learning. But regardless of the specific mechanism, if the synaptic weights satisfy the proper condition, any number of receptive fields can operate in attention-centered coordinates. The second point is that the model requires on the order of 250,000 V4 neurons modulated by attention for full two-dimensional translation. It thus seems that the cost of invariance is a large number of driving neurons. However, these V4 cells have simpler receptive fields than IT neurons, and the same population of V4 neurons can provide the basis for any number of complex IT receptive fields that need to be translated to an attention-centered system. Desimone et al. (1984) found that face-selective neurons in IT cortex of anesthetized monkeys responded strongly to faces presented anywhere inside a large bilateral receptive field. This does not necessarily constitute evidence against an attention-centered reference frame as a basis for these responses, because attention could operate under the control of 'automatic' mechanisms. Eye movements can be controlled consciously but they need not be, and they can also take place during sleep. Similarly, we can consciously direct our attention, but this does not prove that an attentional locus is absent under anesthesia. Manipulations that are known to eliminate attentional effects selectively would be required to settle this question.
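The weight condition mentioned above can be made concrete with a small numerical sketch. It is a one-dimensional toy, not the model used in the chapter: positions lie on a ring, the stimulus is a single point rather than an image, and the condition is assumed to be that the weight from a V4 unit with receptive field center a and preferred attentional locus b depends only on the difference a - b. All names (`v4_population`, `it_response`) and all parameter values are hypothetical.

```python
import numpy as np

N = 60                      # positions on a 1-D ring (illustrative size)
pos = np.arange(N)

def ring_dist(u, v):
    """Shortest distance between positions u and v on the ring."""
    d = np.abs(u - v) % N
    return np.minimum(d, N - d)

def bump(d, sigma):
    """Gaussian tuning as a function of distance d."""
    return np.exp(-0.5 * (d / sigma) ** 2)

def v4_population(x, y):
    """Gain-modulated responses r[a, b] = f(x - a) * g(y - b)."""
    f = bump(ring_dist(x, pos), 3.0)   # receptive field centered at a
    g = bump(ring_dist(y, pos), 3.0)   # attentional gain field centered at b
    return np.outer(f, g)

# Assumed weight condition: w[a, b] depends only on a - b.
# This hypothetical IT unit prefers a stimulus 5 positions from the attended locus.
offset = (pos[:, None] - pos[None, :]) % N
w = bump(ring_dist(offset, 5), 4.0)

def it_response(x, y):
    """Summed drive to the model IT unit for a stimulus at x with attention at y."""
    return float(np.sum(w * v4_population(x, y)))

print(it_response(10, 5), it_response(40, 35))   # same stimulus-attention offset
print(it_response(10, 30))                        # attention directed elsewhere
```

Shifting the stimulus and the attended locus together leaves the drive unchanged, while moving attention away from the stimulus removes it. In this sketch the downstream response is tuned to the stimulus position relative to attention, which is what operating in attention-centered coordinates means here.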
A word should also be said about learning mechanisms that could possibly give rise to synaptic weights satisfying the condition for translation invariance. We already mentioned that there is no unique set of such synaptic connections. Nevertheless, regardless of the synaptic modification rules considered, one aspect of the training procedure does put a general constraint on the model: an object can be recognized at a given position only if it has appeared previously at that position during learning. Therefore, if during learning an object appears only
in the left hemifield, later the model IT cell will respond to that object strongly and independently of position only when it appears in the left hemifield. No response will be observed when it is shown anywhere in the right hemifield. The learning process itself is not translation invariant. This applies particularly to distances larger than a few degrees, because invariance over smaller distances could also be obtained through other mechanisms, for example from the properties of complex cells (Logothetis et al., 1995; Riesenhuber and Poggio, 1998, 1999). This restriction is consistent with the results of psychophysical experiments. In certain visual tasks, such as discrimination of unfamiliar images, in which increases in psychophysical performance can be controlled and quantified, learning is location-specific. Increases in performance are either absent outside the specific location used during learning (Karni and Sagi, 1991; Dill and Fahle, 1997), or acquired gradually for subsequent locations (Sigman et al., 1999). These results suggest that translation invariance may indeed require objects to appear at different locations during a training period.
Gain modulation as a generalized control mechanism
In the 'where' visual pathway, eye position affects activity at multiple points in the processing chain. Attention also seems to act in parallel at multiple sites, as effects have been found in many visual areas, including primary visual cortex (Motter, 1993; Connor et al., 1996, 1997; Luck et al., 1997; Vidyasagar, 1998; McAdams and Maunsell, 1999; Treue and Martínez-Trujillo, 1999). As noted for eye position, attentional control may be more effective when acting at different points along a hierarchical processing stream. Visual neurons are typically selective for a number of stimulus attributes, such as orientation, color, and spatial frequency, and may be modulated by multiple quantities as well, such as eye position and attention. Previously, we argued that recurrent connections could give rise to gaze-dependent gain modulation, but could this mechanism account for the effects of other or even multiple modulatory influences? In the model, the same modulatory input hG(y) (see Fig. 3) is added to a group of recurrently
connected neurons. As long as this input is common to them and independent of other tuning properties, its modality or origin is irrelevant. An input that was a function of the location of attention would produce exactly the same scaling of a tuning curve as seen in the example using eye position modulation. If two modulatory inputs depending on quantities y and z act independently, so that the total modulatory input is hG(y) + hG(z), the two influences would be combined additively to determine the total gain. Recent studies are consistent with this prediction. McAdams and Maunsell (1999) investigated the effects of attention on the orientation selectivity of neurons in V4 and V1, and found that tuning curves were almost exactly scaled by attention. The authors pointed out the ubiquity of multiplicative interactions among many stimulus dimensions, noting that by virtue of its multiplicative effect attention is put on the same footing as many other sensory attributes. Contrast, for instance, provides a well-known example of a multiplicative influence on tuning properties (McAdams and Maunsell, 1999). In another study, Treue and Martínez-Trujillo (1999) showed that attending to a selected location and to a selected feature both have almost perfectly multiplicative effects on direction tuning in area MT. Furthermore, they showed that the effects combined additively. Thus, the model based on recurrent activity fits well with a variety of measured interactions between tuning properties and modulatory inputs. Nevertheless, its tendency to select one stimulus over others when many of them are presented simultaneously may be incompatible with the many cortical areas that are subject to attentional modulation under a wide variety of conditions. Attention sometimes does act as if it were suppressing irrelevant stimuli (Reynolds and Desimone, 1999), but not all observations are consistent with this (Treue and Martínez-Trujillo, 1999).
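The two properties described in this paragraph, multiplicative scaling of a tuning curve and additive combination of modulatory inputs, can be written down in a few lines. The following is a descriptive sketch only: it does not simulate the recurrent network, and the linear gain function, its `slope`, and the Gaussian tuning curve are assumptions chosen for illustration.

```python
import numpy as np

theta = np.linspace(-90.0, 90.0, 181)      # stimulus orientation in degrees

def tuning(theta, pref=0.0, width=30.0):
    """Stimulus tuning curve, independent of the modulatory inputs."""
    return np.exp(-0.5 * ((theta - pref) / width) ** 2)

def gain(h_total, slope=0.5):
    """Hypothetical gain: linear in the summed modulatory input."""
    return 1.0 + slope * h_total

def response(theta, h_y=0.0, h_z=0.0):
    """Tuning curve scaled by a gain set by the total modulatory input."""
    return gain(h_y + h_z) * tuning(theta)

base = response(theta)                       # no modulation
attended = response(theta, h_y=1.0)          # one modulatory input
both = response(theta, h_y=1.0, h_z=0.6)     # two independent inputs

ratio = attended / base                      # constant across orientations
print(ratio.min(), ratio.max())
```

Because the modulatory input enters only through the gain, the ratio of modulated to unmodulated responses is the same at every orientation, so the curve is scaled rather than shifted or sharpened. With this linear gain function, the gain change produced by both inputs together equals the sum of the changes produced by each alone.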
Thus, although the recurrent network may capture some important properties of the circuits underlying gain modulation in general, the specific mechanisms that may produce attentional modulation consistent with the growing body of experimental observations need to be worked out in more detail.
Although we have discussed and modelled gain modulation that is multiplicative, this is not a critical feature. The key property is that gain modulation
must combine information about the modulatory influence with information about the sensory stimulus in a nonlinear way. In simulations, we have found that even large deviations from a product relationship can produce results similar to those found with exact multiplication, as long as the two terms involved are combined nonlinearly. However, our results suggest that multiplicative gain control may be advantageous if it is combined with a Hebbian synaptic modification mechanism. In this case, neural representations in a new reference frame can be established through correlation-based learning. The repertoire of synaptic modification rules available to a circuit must place strong constraints on the neural representations that it can use.
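Why the combination must be nonlinear can be checked numerically. The sketch below is a toy under stated assumptions, not a reproduction of the simulations mentioned above: positions lie on a one-dimensional ring, downstream weights depend only on the difference between the two preferred values, and the three combination rules (`product`, `additive`, `threshold`) are hypothetical examples. It measures how strongly a downstream sum depends on the modulatory variable y for a fixed stimulus x.

```python
import numpy as np

N = 60
pos = np.arange(N)

def ring_dist(u, v):
    """Shortest distance between positions u and v on the ring."""
    d = np.abs(u - v) % N
    return np.minimum(d, N - d)

def bump(d, sigma):
    return np.exp(-0.5 * (d / sigma) ** 2)

# Assumed weights: a function of the difference a - b only.
offset = (pos[:, None] - pos[None, :]) % N
w = bump(ring_dist(offset, 5), 4.0)

def downstream(x, y, combine):
    """Weighted sum over a population whose units combine f and g by `combine`."""
    f = bump(ring_dist(x, pos), 3.0)[:, None]   # stimulus tuning, indexed by a
    g = bump(ring_dist(y, pos), 3.0)[None, :]   # modulatory tuning, indexed by b
    return float(np.sum(w * combine(f, g)))

rules = {
    "product": lambda f, g: f * g,
    "additive": lambda f, g: f + g,
    "threshold": lambda f, g: np.maximum(f + g - 1.0, 0.0),  # nonlinear, not a product
}

def modulation_depth(combine, x=10):
    """Relative variation of the downstream sum as y sweeps the ring."""
    r = np.array([downstream(x, y, combine) for y in range(N)])
    return (r.max() - r.min()) / r.max()

for name, rule in rules.items():
    print(name, round(modulation_depth(rule), 3))
```

With the additive rule the downstream sum separates into one term that depends only on x and one that depends only on y, and on the ring each term is constant, so no selectivity for the relation between x and y survives. Both nonlinear rules, including the thresholded sum that is far from an exact product, leave the downstream unit strongly tuned to x relative to y.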
Closing remarks
In conclusion, gain modulation may be a generalized mechanism by which populations of neurons may encode sensory stimuli and other kinds of information, with the advantage that such representation may greatly facilitate certain computations. The two models for coordinate transformations that we discussed involved separate anatomical structures in the dorsal and ventral visual pathways, as well as independent modulatory effects. However, as can be appreciated by comparing Eqs. 1 and 2, the two gain control signals exert identical effects, regulating the amplitude of a set of visual responses. Our findings support earlier conclusions by Andersen and collaborators, who suggested that gain fields are an efficient means to perform coordinate transformations in general. The presence of gain fields at one stage of a processing pathway suggests that responses at a downstream stage will be in a different coordinate system. Similarly, the presence of transformed or invariant responses at one area suggests that gain modulated responses will be found at points upstream from this area. We also found that gain fields arise naturally from the recurrent connectivity that is characteristic of cortical circuitry (Gilbert and Wiesel, 1983, 1989; Ahmed et al., 1994, 1997). Further experiments should outline more precisely how gain fields are built and exploited by the nervous system, but they will probably remain a prime example of neural design serving a computational purpose.
Acknowledgements Work supported by the National Science Foundation (IBN-9817194), the Sloan Center for Theoretical Neurobiology at Brandeis University, and the W.M. Keck Foundation. E.S. thanks Terry Sejnowski and the Howard Hughes Medical Institute for their support.
References
Ahmed, B., Anderson, J.C., Douglas, R.J., Martin, K.A. and Nelson, J.C. (1994) Polyneuronal innervation of spiny stellate neurons in cat visual cortex. J. Comp. Neurol., 341: 39-49.
Ahmed, B., Anderson, J.C., Martin, K.A. and Nelson, J.C. (1997) Map of the synapses onto layer 4 basket cells of the primary visual cortex of the cat. J. Comp. Neurol., 380: 230-242.
Andersen, R.A. (1989) Visual and eye movement functions of posterior parietal cortex. Annu. Rev. Neurosci., 12: 377-403.
Andersen, R.A., Bracewell, R.M., Barash, S., Gnadt, J.W. and Fogassi, L. (1990) Eye position effects on visual, memory, and saccade-related activity in areas LIP and 7a of macaque. J. Neurosci., 10: 1176-1198.
Andersen, R.A., Essick, G.K. and Siegel, R.M. (1985) Encoding of spatial location by posterior parietal neurons. Science, 230: 456-458.
Andersen, R.A. and Mountcastle, V.B. (1983) The influence of the angle of gaze upon the excitability of light-sensitive neurons of the posterior parietal cortex. J. Neurosci., 3: 532-548.
Andersen, R.A., Snyder, L.H., Bradley, D.C. and Xing, J. (1997) Multimodal representation of space in the posterior parietal cortex and its use in planning movements. Annu. Rev. Neurosci., 20: 303-330.
Anderson, C.W. and Van Essen, D.C. (1987) Shifter circuits: a computational strategy for dynamic aspects of visual processing. Proc. Natl. Acad. Sci. USA, 84: 6297-6301.
Ben-Yishai, R., Bar-Or, R.L. and Sompolinsky, H. (1995) Theory of orientation tuning in visual cortex. Proc. Natl. Acad. Sci. USA, 92: 3844-3848.
Boussaoud, D., Jouffrais, C. and Bremmer, F. (1998) Eye position effects on the neuronal activity of dorsal premotor cortex in the macaque monkey. J. Neurophysiol., 80: 1132-1150.
Brotchie, P.R., Andersen, R.A., Snyder, L.H. and Goodman, S.J. (1995) Head position signals used by parietal neurons to encode locations of visual stimuli. Nature, 375: 232-235.
Buonomano, D.V. and Merzenich, M.M.
(1998) A neural network model of temporal code-generation and position-invariant pattern recognition. Neural Comput., 11: 103-116.
Chance, F.S., Nelson, S.B. and Abbott, L.F. (1999) Complex cells as simple cells at high cortical gain. Nat. Neurosci., 2: 277-282.
Connor, C.E., Gallant, J.L., Preddie, D.C. and Van Essen, D.C. (1996) Responses in area V4 depend on the spatial relationship
between stimulus and attention. J. Neurophysiol., 75: 1306-1308.
Connor, C.E., Preddie, D.C., Gallant, J.L. and Van Essen, D.C. (1997) Spatial attention effects in macaque area V4. J. Neurosci., 17: 3201-3214.
Desimone, R. (1991) Face-selective cells in the temporal cortex of monkeys. J. Cognit. Neurosci., 3: 1-8.
Desimone, R., Albright, T.D., Gross, C.G. and Bruce, C. (1984) Stimulus-selective properties of inferior temporal neurons in the macaque. J. Neurosci., 4: 2051-2062.
Dill, M. and Fahle, M. (1997) The role of visual field position in pattern-discrimination learning. Proc. R. Soc. Lond. Ser. B, 264: 1031-1036.
Douglas, R.J., Koch, C., Mahowald, M., Martin, K.A.C. and Suarez, H.H. (1995) Recurrent excitation in neocortical circuits. Science, 269: 981-985.
Duhamel, J.R., Bremmer, F., BenHamed, S. and Graf, W. (1997) Spatial invariance of visual receptive fields in parietal cortex neurons. Nature, 389: 845-848.
Felleman, D. and Van Essen, D.C. (1991) Distributed hierarchical processing in the primate cerebral cortex. Cereb. Cortex, 1: 1-47.
Földiák, P. (1991) Learning invariance from transformed sequences. Neural Comput., 3: 194-200.
Fukushima, K. (1980) Neocognitron: a self-organized neural network model for a mechanism of pattern recognition unaffected by shift in position. Biol. Cybern., 36: 193-202.
Gallant, J.L., Braun, J. and Van Essen, D.C. (1993) Selectivity for polar, hyperbolic, and Cartesian gratings in macaque visual cortex. Science, 259: 100-103.
Gallant, J.L., Connor, C.E., Rakshit, S., Lewis, J.W. and Van Essen, D.C. (1996) Neural responses to polar, hyperbolic, and Cartesian gratings in area V4 of the macaque monkey. J. Neurophysiol., 76: 2718-2739.
Galletti, C. and Battaglini, P.P. (1989) Gaze-dependent visual neurons in area V3A of monkey prestriate cortex. J. Neurosci., 9: 1112-1125.
Galletti, C., Battaglini, P.P. and Fattori, P. (1989) Eye position influence on the parieto-occipital area PO (V6) of the macaque monkey. Eur. J.
Neurosci., 7: 2486-2501.
Georgopoulos, A.P. (1995) Current issues in directional motor control. Trends Neurosci., 18: 506-510.
Gilbert, C.D. and Wiesel, T.N. (1983) Clustered intrinsic connections in cat visual cortex. J. Neurosci., 3: 1116-1133.
Gilbert, C.D. and Wiesel, T.N. (1989) Columnar specificity of intrinsic horizontal and corticocortical connections in cat visual cortex. J. Neurosci., 9: 2432-2442.
Goodale, M.A. and Milner, A.D. (1992) Separate visual pathways for perception and action. Trends Neurosci., 15: 20-25.
Gottlieb, J.P., Kusunoki, M. and Goldberg, M.E. (1998) The representation of visual salience in monkey parietal cortex. Nature, 391: 481-484.
Graziano, M.S.A., Hu, T.X. and Gross, C.G. (1997) Visuospatial properties of ventral premotor cortex. J. Neurophysiol., 77: 2268-2292.
Gross, C.G. (1992) Representation of visual stimuli in inferior temporal cortex. Phil. Trans. R. Soc. Lond., 335: 3-10.
Guo, K. and Li, C.-Y. (1997) Eye position-dependent activation of neurones in striate cortex of macaque. NeuroReport, 8: 1405-1409.
Heeger, D.J. (1991) Nonlinear model of neural responses in cat visual cortex. In: M. Landy and J.A. Movshon (Eds.), Computational Models of Visual Processing. MIT Press, Cambridge, MA, pp. 119-133.
Hinton, G.E. (1981a) A parallel computation that assigns canonical object-based frames of reference. In: Proceedings of the Seventh International Joint Conference on Artificial Intelligence, Vol. II. Vancouver, Canada, pp. 683-685.
Hinton, G.E. (1981b) Shape representation in parallel systems. In: Proceedings of the Seventh International Joint Conference on Artificial Intelligence, Vol. II. Vancouver, Canada, pp. 1088-1096.
Karni, A. and Sagi, D. (1991) Where practice makes perfect in texture discrimination: evidence for primary visual cortex plasticity. Proc. Natl. Acad. Sci. USA, 88: 4966-4970.
Koch, C. and Poggio, T. (1992) Multiplying with synapses and neurons. In: T. McKenna, J.L. Davis and S.F. Zornetzer (Eds.), Single Neuron Computation. Academic Press, Cambridge, MA, pp. 315-345.
Logothetis, N.K., Pauls, J. and Poggio, T. (1995) Shape representation in the inferior temporal cortex of monkeys. Curr. Biol., 5: 552-563.
Luck, S.J., Chelazzi, L., Hillyard, S.A. and Desimone, R. (1997) Neural mechanisms of spatial selective attention in areas V1, V2, and V4 of macaque visual cortex. J. Neurophysiol., 77: 24-42.
Mack, A. and Rock, I. (1998) Inattentional Blindness. MIT Press, Cambridge, MA.
McAdams, C.J. and Maunsell, J.H.R. (1999) Effects of attention on orientation tuning functions of single neurons in macaque cortical area V4. J. Neurosci., 19: 431-441.
Mel, B.W. (1993) Synaptic integration in an excitable dendritic tree. J. Neurophysiol., 70: 1086-1101.
Motter, B.C.
(1993) Focal attention produces spatially selective processing in visual cortical areas V1, V2, and V4 in the presence of competing stimuli. J. Neurophysiol., 70: 909-919.
O'Keefe, J. and Nadel, L. (1978) The Hippocampus as a Cognitive Map. Oxford University Press, Oxford.
Olshausen, B.A., Anderson, C.H. and Van Essen, D.C. (1993) A neurobiological model of visual attention and invariant pattern recognition based on dynamical routing of information. J. Neurosci., 13: 4700-4719.
Pouget, A. and Sejnowski, T.J. (1997a) Spatial transformations in the parietal cortex using basis functions. J. Cognit. Neurosci., 9: 222-237.
Pouget, A. and Sejnowski, T.J. (1997b) A new view of hemineglect based on the response properties of parietal neurones. Phil. Trans. R. Soc. Lond. Ser. B, 352: 1449-1459.
Reynolds, J. and Desimone, R. (1999) Competitive mechanisms subserve attention in macaque areas V2 and V4. J. Neurosci., 19: 1736-1753.
Riesenhuber, M. and Poggio, T. (1998) Just one view: invariances
in inferotemporal cell tuning. In: M.I. Jordan, M.J. Kearns and S.A. Solla (Eds.), Advances in Neural Information Processing Systems 10. MIT Press, Cambridge, MA, pp. 215-221.
Riesenhuber, M. and Poggio, T. (1999) Hierarchical models of object recognition in cortex. Nat. Neurosci., 2: 1019-1025.
Salinas, E. and Abbott, L.F. (1995) Transfer of coded information from sensory to motor networks. J. Neurosci., 15: 6461-6474.
Salinas, E. and Abbott, L.F. (1996) A model of multiplicative neural responses in parietal cortex. Proc. Natl. Acad. Sci. USA, 93: 11956-11961.
Salinas, E. and Abbott, L.F. (1997a) Invariant visual responses from attentional gain fields. J. Neurophysiol., 77: 3267-3272.
Salinas, E. and Abbott, L.F. (1997b) Attentional gain modulation as a basis for translation invariance. In: J. Bower (Ed.), Computational Neuroscience: Trends in Research 1997. Plenum, New York, pp. 807-812.
Schwartz, E., Desimone, R., Albright, T.D. and Gross, C.G. (1983) Shape recognition and inferior temporal neurons. Proc. Natl. Acad. Sci. USA, 80: 5776-5778.
Seung, H.S. (1996) How the brain keeps the eyes still. Proc. Natl. Acad. Sci. USA, 93: 13339-13344.
Sharma, J., Dragoi, V., Miller, E.K. and Sur, M. (1999) Modulation of orientation specific responses in monkey V1 by changes in eye position. Soc. Neurosci. Abstr., 25: 677.
Sigman, M., Westheimer, G. and Gilbert, C.D. (1999) The role of perceptual learning in object oriented attention. Soc. Neurosci. Abstr., 25: 1049.
Snyder, L.H., Grieve, K.L., Brotchie, P. and Andersen, R.A. (1998) Separate body- and world-referenced representations of visual space in parietal cortex. Nature, 394: 887-891.
Somers, D.C., Nelson, S.B. and Sur, M. (1995) An emergent model of orientation selectivity in cat visual cortical simple cells. J. Neurosci., 15: 5448-5465.
Tovee, M.J., Rolls, E.T. and Azzopardi, P. (1994) Translation invariance in the responses to faces of single neurons in the temporal visual cortical areas of the alert macaque. J.
Neurophysiol., 72: 1049-1060.
Treue, S. and Martínez-Trujillo, J.C. (1999) Feature-based attention influences motion processing gain in macaque visual cortex. Nature, 399: 575-579.
Trotter, Y. and Celebrini, S. (1999) Gaze direction controls response gain in primary visual-cortex neurons. Nature, 398: 239-242.
Ungerleider, L.G. and Mishkin, M. (1982) The two cortical visual systems. In: D.J. Ingle, M.A. Goodale and R.J.W. Mansfield (Eds.), Analysis of Visual Behavior. MIT Press, Cambridge, MA, pp. 549-586.
Van der Meer, A.L.H., Van der Weel, F.R. and Lee, D.N. (1995) The functional significance of arm movements in neonates. Science, 267: 693-695.
Van Opstal, A.J., Hepp, K., Suzuki, Y. and Henn, V. (1995) Influence of eye position on activity in monkey superior colliculus. J. Neurophysiol., 74: 1593-1610.
Vidyasagar, T.R. (1998) Gating of neuronal responses in primary visual cortex by an attentional spotlight. NeuroReport, 9: 1947-1952.
Wallis, G. and Rolls, E. (1997) Invariant face and object recognition in the visual system. Prog. Neurobiol., 51: 167-194.
Zipser, D. and Andersen, R.A. (1988) A back-propagation programmed network that simulates response properties of a subset of posterior parietal neurons. Nature, 331: 679-684.
M.A.L. Nicolelis (Ed.)
Progress in Brain Research, Vol 130 © 2001 Elsevier Science B.V. All rights reserved
CHAPTER 12
olfactory population coding using an artificial olfactory system
Joel White* and John S. Kauer
Department of Neuroscience, Tufts University School of Medicine, 136 Harrison Ave., Boston, MA 02111, USA
Introduction
The vertebrate olfactory system is a molecular detector of great sensitivity and is thought to be capable of discriminating thousands of different odorants. Research has provided a number of insights into the anatomical, physiological, cellular, molecular, and biochemical mechanisms of olfactory function. Many questions remain, however, regarding the nature of the olfactory code: How are odorants represented at the various levels of the olfactory pathway? How does processing at each level alter the odorant representation? How do we interpret these representations?
Distributed coding in olfactory processing
The basic hypothesis for odorant coding in the olfactory system is that identity and concentration are represented in the spatio-temporal pattern of activity distributed across populations of neurons at each level of the olfactory pathway (for reviews, see Kauer, 1987, 1991). Numerous anatomical and physiological observations have led to this general hypothesis. Fig. 1 presents a simplified, schematic view of the organization of the vertebrate peripheral
* Corresponding author: Joel White, Department of Neuroscience, Tufts University School of Medicine, 136 Harrison Ave., Boston, MA 02111, USA. Tel.: +1-617-636-0329; Fax: +1-617-636-0476; E-mail:
[email protected]
olfactory system based on these data and is used as a framework for stating this hypothesis. While the focus here is on vertebrate olfaction, anatomical and physiological similarities also exist with invertebrate olfactory systems (e.g. see Hildebrand and Shepherd, 1997).
Physiological properties of olfactory sensory neurons
Single-unit recordings from olfactory sensory neurons (OSNs) indicate that individual OSNs can respond to a broad range of different mono-molecular stimuli (Revial et al., 1982; Firestein et al., 1993; Kang and Caprio, 1995; Duchamp-Viret et al., 1999). With increasing odorant concentration, OSNs fire more action potentials at higher frequency and shorter latency (Getchell and Shepherd, 1978; Duchamp-Viret et al., 1999). In some cases, high odorant concentrations elicit an initial burst of high frequency, short latency spikes that is followed by a quiescent period corresponding to the peak of the electro-olfactogram (an OSN pop[...] then by a later burst of lower frequency action potentials. This temporal response suggests that the spiking mechanism is saturated at these concentrations (Revial et al., 1982; Duchamp-Viret et al., 1999). While the temporal responses of OSNs appear to be rather simple, there is some evidence of inhibitory responses to some odorants in some species (Dionne, 1992; Duchamp-Viret et al., 1999). The receptor and transduction mechanisms producing these inhibitory responses remain
[Fig. 1 diagram labels: Olfactory Epithelium, Olfactory Bulb, Piriform Cortex]
Fig. 1. Schematic overview of peripheral olfactory anatomy, including a representation of olfactory sensory neuron (OSN) mapping between the olfactory epithelium (OE) and olfactory bulb and the distributed nature of mitral cell (MC) projections to piriform cortex. Three types of OSN are represented with three different shading patterns. Each OSN type is randomly distributed within a region of the OE, but the axons of each type project in a convergent manner to the same glomerulus (represented by the same shading pattern). In mammals, multiple MCs each have a single primary dendrite in each glomerulus (only two MCs each shown here for clarity). Mitral cell activity is modulated by at least two populations of interneurons, periglomerular cells (PC) and granule cells (GC). Mitral cell axons then project in a distributed fashion to olfactory cortical areas, with the piriform cortex represented here. Mitral axons provide direct, excitatory inputs to piriform pyramidal cells. Piriform cortex also has populations of feedforward (FF) and feedback (FB) inhibitory neurons and has an extensive network of excitatory association fibers (lines with arrows at both ends).
unclear, as well as their significance for olfactory coding. The observation that individual OSNs respond to a range of odorants also implies the converse: a mono-molecular odorant can activate multiple different OSNs. Population recordings from the olfactory epithelium (OE) are consistent with this interpretation, showing widespread activity elicited by stimulation with mono-molecular odorants (MacKay-Sim et al., 1982; Kent and Mozell, 1992; MacKay-Sim and Kesteven, 1994). The activity is not homogeneous across the OE, however, and different odorants elicit different spatio-temporal patterns of activity (see, for example, Fig. 2). With increasing odorant concentration, the overall level of activity across the OE increases, but the relative pattern of activity
produced by a particular odorant appears to remain constant (MacKay-Sim and Shaman, 1984; Kent and Mozell, 1992). The receptor mechanisms underlying broad OSN tuning have not been entirely worked out, although recent progress has been made. The two most likely possibilities are that: (1) an individual OSN expresses multiple receptor proteins, each responsive to different odorant molecule(s); or (2) an individual OSN expresses a single-receptor protein, which is responsive to multiple different odorant molecules. Evidence is accumulating that individual OSNs express a single-receptor protein, at least in rodents (Ressler et al., 1993; Vassar et al., 1993; Malnic et al., 1999), providing support for the latter possibility. However, few data are available on the response
Fig. 2. Video images of voltage-sensitive dye signals from the olfactory epithelium of tiger salamander. Frames show activity at the times indicated after the onset of a 500 ms puff of odorant. Voltage-sensitive dye signals are from a single animal, shown in pseudo-color over an image of the olfactory epithelium. Green and red indicate increasing levels of depolarization. Anterior is to the left, midline is at the top of each image. PAc, propyl acetate; BAc, butyl acetate; AAc, amyl (pentyl) acetate.
Fig. 3. Video images of voltage-sensitive dye signals from the olfactory bulb of tiger salamander. Frames show activity at the times indicated after the onset of a 500-ms puff of odorant. Voltage-sensitive dye signals represent data averaged from 7 animals, shown in pseudo-color over a representative image of the olfactory bulb. Green and red indicate increasing levels of depolarization. Anterior is to the left, midline is at the top of each image. PAc, propyl acetate; BAc, butyl acetate; AAc, amyl (pentyl) acetate.
profiles of individual receptor proteins (Krautwurst et al., 1998; Zhao et al., 1998; Malnic et al., 1999). If an OSN expresses a single-receptor protein, yet responds to several different odorant molecules, is there a framework for understanding these interactions? It is clear from pharmacological studies that biological receptors in general bind to and interact with multiple ligand molecules; this is the basis of modern therapeutic medicine (e.g. the β-adrenergic receptor; Gilman et al., 1990). While initial functional studies using limited odorant test sets may suggest that olfactory receptor proteins are narrowly tuned (Krautwurst et al., 1998; Zhao et al., 1998; Malnic et al., 1999; Touhara et al., 1999), further studies with additional odorants may indicate broader tuning (e.g. Araneda et al., 1999; see also White et al., 1999). In an attempt to provide a receptor mechanism for broad OSN tuning, it has been hypothesized that an individual OSN is activated by a particular molecular feature of a molecule (Polak, 1973; Holley and Døving, 1977; Kauer, 1980, 1991; Wright, 1982). If a mono-molecular odorant is composed of several molecular features, it would then activate multiple OSN types. This hypothesis is consistent with numerous olfactory data, but requires detailed structure-activity and binding studies with defined receptor protein types. Approaches combining gene expression and physiological measurements may provide data from sufficient quantities of single-receptor protein types to enable such an analysis (Krautwurst et al., 1998; Murrell and Hunter, 1999). Odorant identity coding in the OE therefore appears to be characterized by a spatially distributed population response resulting from the broadly tuned
[...] sites of synaptic interaction (glomeruli) in the olfactory bulb (e.g. 25,000:1 for [...]; see review by Hildebrand and Shepherd, 1997). Anatomical studies have provided some information on the nature of this convergence. Dye tracing in a number of species indicates that small regions of the OB receive input from OSNs distributed over wide areas of the olfactory epithelium (Kauer, 1981; Baier and Korsching, 1994; Schoenfeld et al., 1994; Bozza and Kauer, 1998). Data from physiological studies are consistent with this organization, showing that individual mitral cells (MCs) in the salamander OB respond to punctate odorant stimulation over widespread areas of the OE (Kauer and Moulton, 1974). Conversely, OSNs from small regions of the OE project to widespread areas of the OB (Kauer, 1981). Data from recent molecular biology studies are also consistent with this organization, indicating that OSNs expressing the same olfactory receptor mRNA are distributed randomly over large areas of the OE (Ressler et al., 1993; Vassar et al., 1993) but project to one or a few glomeruli in each OB (Ressler et al., 1994; Vassar et al., 1994; Mombaerts et al., 1996). The location of these glomeruli appears to be invariant across individual animals within a species, suggesting that the 'glomerular map' is fixed (see also Baier and Korsching, 1994) and hence likely important for olfactory coding (Schild, 1988). The convergent projections between the OE and the OB, the physiological properties of OB mitral cells, and the molecular anatomy suggest the following organization: OSNs of a given type (i.e. having the same odorant response profile and therefore the same receptor or receptor complement) are scattered [...] The convergence of OSN axons in the peripheral olfactory system then appears to bring together the terminal projections of OSNs expressing the same receptor protein(s) to one or two individual glomeruli in the OB. The nature of this reorganization is currently unclear. Recent molecular studies suggest that OSNs expressing receptor mRNAs with similar sequences project to nearby glomeruli (Tsuboi et al., 1999), implying that neighboring glomeruli may have similar response properties. Voltage-sensitive dye recordings in zebrafish indicate that while neighboring glomeruli can have similar response profiles, this is not always the case (Friedrich and Korsching, 1997). Images of the molecular convergence from the OE to the OB can be quite striking (Mombaerts et al., 1996). It is not clear, however, whether all OSNs expressing the same protein actually have the same odorant response profile. OSNs retrogradely labeled by small dye injections into the OB can respond similarly to test odorants (Bozza and Kauer, 1998). However, there does seem to be some variability in the responses of OSNs expressing ostensibly 'identical' receptor proteins (Malnic et al., 1999). Clearly, more data are needed to define the response properties of convergent OSN types.
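The organization outlined above, with broadly tuned OSN types scattered across the OE and converging by type onto glomeruli, can be illustrated with a small simulation. This is a sketch under loud assumptions rather than a model from the chapter: the affinity matrix is random, each OSN's response is taken to be its affinity times a saturating function of concentration plus independent noise, and every OSN of a type projects to a single glomerulus. The names `affinity`, `osn_responses` and `glomerular_input` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n_types, n_per_type, n_odorants = 20, 500, 3

# Hypothetical broad tuning: every OSN type has some affinity for every odorant.
affinity = rng.random((n_types, n_odorants))

# Receptor type of each OSN. The array index stands for position in the OE,
# so shuffling scatters each type across the whole sheet.
osn_type = rng.permutation(np.repeat(np.arange(n_types), n_per_type))

def osn_responses(odorant, conc, noise_sd=1.0):
    """Noisy single-OSN responses with an assumed saturating concentration dependence."""
    drive = affinity[osn_type, odorant] * conc / (conc + 1.0)
    return drive + noise_sd * rng.standard_normal(osn_type.size)

def glomerular_input(osn_resp):
    """Average the responses of all OSNs of one type onto their glomerulus."""
    sums = np.bincount(osn_type, weights=osn_resp, minlength=n_types)
    return sums / n_per_type

def cosine(u, v):
    """Cosine similarity between two activity patterns."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

g_low = glomerular_input(osn_responses(0, conc=1.0))
g_high = glomerular_input(osn_responses(0, conc=10.0))
print(cosine(g_low, g_high))          # similarity of the two glomerular patterns
print(g_high.sum() > g_low.sum())     # does overall activity grow with concentration?
```

Pooling by type rather than by position turns a spatially scattered, noisy OE response into one value per glomerulus, and averaging over many OSNs suppresses the single-cell noise. Under the separable response assumed here, raising the concentration increases the overall level of activity while leaving the relative pattern across glomeruli nearly unchanged.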
Physiological properties of mitral cells
The organization of OE to OB connections, the OSN response properties described above, and the structure of OB circuits provide a framework for interpreting OB recordings. As for the OE, population recordings from the OB indicate widespread activity upon stimulation with mono-molecular odorants (Lancet et al., 1982; Kauer, 1988; Cinelli et al., 1995; see Fig. 3). Because an individual glomerulus receives inputs from OSNs of a given type, and because many OSN types respond to a mono-molecular stimulus, many glomeruli are also activated (Friedrich and Korsching, 1997; Rubin and Katz, 1999). In turn, MCs receiving direct and indirect inputs via these glomeruli (plus their associated OB interneurons; Fig. 1) are caused to respond, producing widespread OB activity (Fig. 3). At first glance, recordings from single MCs in the OB (i.e. the output cells) are consistent with this general interpretation. In mammalian olfactory systems, MCs generally receive input via a single glomerulus. Coupled with the response properties of the glomeruli described above, this connectivity provides a partial explanation for the broad range of mono-molecular stimuli that can activate individual MCs (Kauer, 1974; Meredith and Moulton, 1978; Hamilton and Kauer, 1989; Wellis et al., 1989). However, lateral interactions in the bulb modify mitral activity in non-linear ways, making simple 'OSN → glomerulus → mitral' transfer of olfactory information unlikely. In particular, MCs produce temporal patterns of spike activity with odorant stimulation (Kauer, 1974; Meredith and Moulton, 1978; Hamilton and Kauer, 1989). These temporal activity patterns arise from OE input influenced by lateral interactions with OB interneurons (Fig. 1). As a result, the relationship between mitral spike patterns and odorant identity in physiological studies has not been obvious. For example, temporal patterns of spike activity can change in non-monotonic ways with changes in stimulus concentration, i.e. increasing odorant concentration does not usually simply lead to more spikes (Kauer, 1974; Meredith and Moulton, 1978; Meredith, 1986; Hamilton and Kauer, 1989; Wellis et al., 1989).
Furthermore, similar spike patterns can be elicited by different odorants at different concentrations. Physiological observations such as these have thus made the temporal components of OB coding in single mitral cells difficult to elucidate.
In addition to the temporal patterns of activity in single cells, populations of OB neurons may oscillate in synchrony with relatively long stimulus pulses. These oscillations occur at approximately 40 Hz in mammals (Adrian, 1942; Kashiwadani et al., 1999) and at lower frequencies in frogs (7-13 Hz; Delaney and Hall, 1996), salamanders (12-20 Hz; Dorries and Kauer, 2000), and turtles (8-16 Hz; Lam et al., 2000). Oscillations may be important for olfactory coding in some olfactory systems (Laurent, 1996), although there can be multiple and complex origins of the oscillatory activity in vertebrates (e.g. see Dorries and Kauer, 2000).
The olfactory code in the OB therefore appears to be a transformation of what is represented in the OE. Similar to the OE, mono-molecular odorant identity in the OB is likely represented by activity distributed across many broadly responsive neurons. Unlike OSNs in the OE, however, glomeruli do not appear to be randomly distributed in the OB, suggesting that the spatial relationships between glomeruli and their associated mitral output cells may be important for odorant coding. One important aspect of these spatial relationships is the lateral interactions within the OB, which contribute to the complex temporal patterns of MC spike activity that change in nonlinear ways with changes in odorant identity and concentration.
Piriform cortex
Mitral cell axons project widely to olfactory cortical structures, with the piriform cortex being perhaps the best analyzed, although there are still relatively few studies. The number of neurons in the piriform cortex is large relative to the number of MCs (approximately 2.5 × 10^5 MCs in rat (Meisami and Safari, 1981); and approximately 10^7 piriform cortex neurons in opossum (Haberly, 1985)), suggesting that OB/cortical connections are divergent. An individual mitral axon can have numerous terminations among several cortical structures, as well as multiple terminations within a single area such as piriform cortex (Ojima et al., 1984; shown schematically in Fig. 1). These projections, in general, appear to be distributed and non-topographic, although topographic projections to the anterior olfactory nucleus have been noted (Scott et al., 1985). Individual piriform pyramidal cells respond to several different odorants and, conversely, many cells respond to a single odorant (Tanabe et al., 1975; Nemitz and Goldberg, 1983; Duchamp-Viret et al., 1996). These data suggest that at the level of piriform cortex (and possibly other olfactory structures), olfactory information is still encoded in a distributed form (Haberly, 1985). The nature of this code is also still unclear, however. Inhibitory interactions via interneurons and excitatory interactions through association fibers (Fig. 1) have suggested to researchers that piriform cortex functions in a manner similar to a content-addressable memory for processing spatial patterns of mitral output (Haberly, 1985; Hasselmo et al., 1990). Other hypotheses suggest that the cortex may be involved in processing temporal patterns of mitral activity (Hopfield, 1995). Of course, these functional hypotheses are not necessarily mutually exclusive.
Additional constraints on olfactory codes
In addition to the temporal components of olfactory codes produced by cellular interactions within the system, animals dynamically sample their odor environment through sniffing (sniff rates up to 2-3 Hz in salamanders (Kauer, unpublished); up to 6-8 Hz in rats (Youngentob et al., 1987)). A single sniff is sufficient for odorant identification in humans for some odorants (Laing, 1986). Sniffing behavior provides only transient exposure of OSNs to odorants, which necessarily places constraints on the nature of the olfactory code. For example, at behavioral sniff rates and typical mitral firing rates (up to 35 Hz in salamander (Kauer, 1974); up to 55 Hz in hamster (Meredith, 1986); up to 100 Hz in rat (Mair, 1982)), the inspiratory phase of a single sniff would provide relatively few spikes to convey olfactory information. Likewise, for oscillatory activity in the olfactory system (see above), a typical inspiratory phase of a sniff is likely to encompass approximately three oscillatory cycles. If population oscillations are important for olfactory coding, this again limits the number of temporal events that can convey olfactory information.
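The arithmetic behind this constraint can be sketched in a few lines. The sniff and firing rates below are the upper values quoted in the text; the assumption that inspiration occupies half of each sniff cycle, the function name, and the pairing of rates are illustrative choices and are not taken from the cited studies.

```python
def events_per_inspiration(sniff_rate_hz, event_rate_hz, inspiratory_fraction=0.5):
    """Events (spikes or oscillation cycles) that fit in one inspiratory phase.

    inspiratory_fraction is an assumed value: the share of each sniff
    cycle spent inspiring. The text does not state it.
    """
    inspiration_s = inspiratory_fraction / sniff_rate_hz
    return event_rate_hz * inspiration_s


# (sniff rate in Hz, event rate in Hz); rates quoted in the text
cases = {
    "salamander MC spikes": (3.0, 35.0),
    "rat MC spikes": (8.0, 100.0),
    "mammalian oscillation cycles": (7.0, 40.0),
}

for label, (sniff_hz, event_hz) in cases.items():
    n = events_per_inspiration(sniff_hz, event_hz)
    print(f"{label}: about {n:.1f} per inspiration")
```

Under these assumptions a mitral cell contributes only a handful of spikes per sniff even at its maximum rate, and a 40 Hz oscillation completes roughly three cycles, in line with the estimate above.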
Testing population coding hypotheses
The picture of olfactory processing that emerges from the available data is that even mono-molecular odorants elicit temporal patterns of activity across populations of neurons at each level of the olfactory pathway. This population coding hypothesis contains numerous elements that are difficult to test directly. The aspects of an odorant stimulus that are represented by this distributed activity are still unknown. Although data are beginning to accumulate on the response properties of individual OSNs expressing identified olfactory receptor proteins (Krautwurst et al., 1998; Zhao et al., 1998; Malnic et al., 1999; Murrell and Hunter, 1999; Touhara et al., 1999), it is likely to be some time before a significant number of the approximately 1000 receptor proteins in rats or mice are characterized with more than a few of the thousands of possible odorous molecules. There are also few data available on the nature of the mapping function of OSN projections to the glomeruli of the OB (Schild, 1988), although studies are beginning to approach this problem using molecular tools (Tsuboi et al., 1999). While numerous studies have been conducted on the circuitry and synaptic interactions in the OB and piriform cortex, relatively few specific details are known about how odorant information is encoded in the spatio-temporal output of the array of OB mitral cells and about how olfactory cortical structures process these spatio-temporal patterns.
An artificial olfactory system for testing hypotheses
Although many aspects of olfactory circuitry and processing are experimentally tractable, investigations of coding are made difficult by the distributed nature of the olfactory code. One approach we have taken that complements our physiological studies is to investigate olfactory coding by developing an artificial olfactory system. The artificial system consists of an array of chemical sensors, a series of processing steps modeled directly on the neural circuitry of the olfactory bulb, and the use of pattern recognition algorithms that mimic biological function.
Chemical sensor array
Input to the artificial olfactory system comes from an array of rapidly responding, broadly sensitive chemical detectors chosen to have response properties similar to those of biological OSNs (Dickinson et al., 1996; White et al., 1996). The sensors are based on an optical method that exploits the use of fluorescent dyes in polymer matrices that change their light output upon exposure to organic vapors. Different sensors are made by using different polymers
OSNs (Fig. 4B). A sensor's input to its respective subpopulation is scaled across neurons by a gaussian distribution function (Fig. 4, inset) to simulate response variability in a group of cells. Thus, in the example shown in Fig. 4, there are six subpopulations of OSNs with six response types, each mapped onto 200 OSNs.
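The mapping from sensors to OSN subpopulations described above can be illustrated with a minimal sketch. It assumes that the Gaussian scaling acts as a peak-normalised profile across the cell index within each subpopulation; the width parameter, the function names, and the sensor amplitudes are hypothetical and are not taken from White et al. (1998).

```python
import math


def gaussian_weights(n_cells=200, sigma_fraction=0.25):
    """Peak-normalised Gaussian weights across one OSN subpopulation.

    sigma_fraction (profile width as a share of the subpopulation size)
    is an assumed value chosen for illustration.
    """
    centre = (n_cells - 1) / 2.0
    sigma = sigma_fraction * n_cells
    return [math.exp(-0.5 * ((i - centre) / sigma) ** 2) for i in range(n_cells)]


def map_sensors_to_osns(sensor_values, n_cells=200):
    """Scale each sensor value across its own subpopulation of OSNs."""
    weights = gaussian_weights(n_cells)
    return [[value * w for w in weights] for value in sensor_values]


# Hypothetical amplitudes for six sensors at one instant in time
sensor_values = [0.2, 0.9, 0.5, 0.1, 0.7, 0.3]
currents = map_sensors_to_osns(sensor_values)
total_osns = sum(len(row) for row in currents)
print(total_osns)  # six subpopulations of 200 cells each
```

Cells near the centre of a subpopulation receive nearly the full sensor amplitude and cells at the edges receive less, which is one simple way to obtain graded latencies and firing rates within a single response type.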
Fig. 4. Steps involved in using sensor signals as inputs to the OB simulation. (A) Responses of six chemical sensors to amyl alcohol. These signals were used as input currents to a population of 1200 simulated OSNs, which produced spike activity as shown in (B). All 1200 OSNs are represented along the y-axis. A dot in the raster plot indicates the occurrence of a single spike. Inset shows the method for mapping the signal from one sensor (asterisk in A) onto an OSN subpopulation (asterisk in B; every tenth OSN shown). Sensor input is scaled by a gaussian function across the OSN subpopulation. Larger amplitude input currents (heavy arrows) elicit shorter spike latencies, higher firing rates, and longer burst durations than smaller currents (thinner arrows). A raster plot is then generated from the population of spike patterns. (C) Membrane potential changes and spike activity in 12 simulated mitral cells (MC), elicited by the OSN pattern shown in B. In each plot, the heavy horizontal bar indicates the time course of the vapor pulse applied to the sensors; the horizontal axis is time (ms). From White et al. (1998), with permission from Springer-Verlag, Berlin, Heidelberg.
Olfactory bulb simulation
The population of OSNs provides input to a representational computer simulation of the OB (White et al., 1992). Bulb mitral cells (MCs) and two populations of interneurons (periglomerular and granule cells) are used in the simulation. While the number of cells in the simulation is far smaller than that in any vertebrate species, the proportions among the numbers of cells are similar to what is found in the olfactory bulbs of most species. Properties of the simulated OB cells and the connections between them are represented by a set of coupled differential equations, with parameters determined by anatomical and physiological data (White et al., 1992). These equations are integrated to produce changes in membrane potential and spike activity in the simulated OB cells that evolve over time. To reproduce the OE to OB connection topography seen in the biological olfactory system, each OSN subpopulation driven by a sensor provides input to a restricted, glomerulus-like region of the OB simulation. Inputs are arranged so that OSN subpopulations with similar response profiles provide input to adjacent OB areas. Using the sensor-driven OSN spike activity as inputs, the membrane potential changes and spiking patterns in the simulated mitral cells (Fig. 4C; see White et al., 1996; White and Kauer, 1999) are similar to those seen in extracellular and intracellular recordings from biological mitral cells (see Kauer, 1974; Hamilton and Kauer, 1989). Simulated MCs show several of the response types described by Kauer (1974), such as suppression, excitation, and various combinations of excitation and suppression. The patterns of spike activity produced by the array of simulated mitral cells vary with different odorants, even among odorants that differ by a single carbon in chain length (Fig. 5). Relative to the OSN input activity, OB output represents odorant information with fewer cells (12 MCs vs. 1200 OSNs) and with fewer spikes per cell (1-6 per MC vs. 1-22 per OSN) (White et al., 1998). With increasing odorant concentration, the number of spikes produced by MCs remains relatively constant (e.g. the four higher DMSO concentrations in Fig. 6). In general, the spatio-temporal pattern of mitral spikes also remains relatively constant over the same concentration range. Spike latency, however, tends to decrease with increasing odorant concentration (Fig. 6). A similar relationship between spike latency and concentration has been observed in physiological recordings
from mitral cells (Kauer, 1974; Hamilton and Kauer, 1989; Wellis et al., 1989).
Fig. 5. Mitral cell (MC) spike activity generated from six optical fiber sensor responses to nine organic vapors: AAl, amyl (pentyl) alcohol; BAl, butyl alcohol; PAl, propyl alcohol; AAc, amyl (pentyl) acetate; BAc, butyl acetate; PAc, propyl acetate; Ben, benzene; Tol, toluene; Xyl, xylene. For each odorant, a raster plot represents the spike activity of 12 MCs; the horizontal axis is time (0-300 ms). Sensor responses to carrier gas with no vapor were below OSN spike threshold and did not elicit mitral activity. From White et al. (1998), with permission from Springer-Verlag, Berlin, Heidelberg.
Fig. 6. Raster plots of simulated mitral cell responses to pulses of dimethyl sulfoxide (DMSO) at the concentrations (vapor molarity) indicated: 2.9 × 10^-6, 4.1 × 10^-6, 9.5 × 10^-6, 1.4 × 10^-5 and 2.9 × 10^-5. The horizontal axis is time (0-400 ms). From White and Kauer (1999), with permission from Elsevier Science.
MC spike pattern processing
The spatio-temporal patterns of spike activity produced by the array of simulated mitral cells are then recognized by a delay line neural network (DLNN; White et al., 1998; White and Kauer, 1999). In essence, the DLNN is a matching algorithm for spatio-temporal patterns of MC spikes. The DLNN consists of a single layer of units, each receiving input from all 12 MCs via 12 delay lines (DLs). There is a DL unit for each odorant of interest (i.e. six DL units for six odorants in Fig. 7). The DLs to a given unit are set so that an MC spike pattern elicited by an odorant of interest (i.e. a target pattern) maximally activates the unit. While such a configuration would seem to make each DL unit specific for a particular odor, in fact odorant identity is best represented by considering activity across all DL units (White et al., 1998; White and Kauer, 1999). In addition, odorant concentration is represented by the latency of DL unit activity (Fig. 7 legend; White et al., 1998; White and Kauer, 1999). While the DLNN is not a representational model of piriform cortex, it performs a latency analysis similar to that hypothesized for the piriform cortex (Hopfield, 1995). Interestingly, a similar relationship between odorant concentration and spike latency has been seen in recordings from cells in the lateral cortex of frogs (homologous to mammalian piriform cortex; Duchamp-Viret et al., 1996). Thus, the distributed output activity produced by the OB simulation supports odorant identity and concentration recognition in the DLNN output.
Properties of olfactory code in artificial olfactory system
The odorant/sensor interaction in the artificial system is not mediated by a 'receptor' in the sense used for biological olfactory systems. Relating spike activity in the simulated OSNs or MCs to a particular structural feature of the odorant is therefore difficult. However, in the tests described here, processing by the OB and DLNN supports detailed odorant recognition even without the kind of defined receptor/ligand binding usually associated with biological systems.
Progressing through each level of the artificial olfactory system, the population code representing odorant information appears transformed. In the spike patterns of the OSNs, odorant identity and concentration appear to be combined in rather complex ways. Individual OSNs change firing rate and latency with changes in both odorant identity and concentration. The code in the simulated OSNs (and in the biological olfactory system also) can be seen as redundant in at least two ways: a single odorant can activate OSNs of several types, and each OSN type occurs in large numbers of cells. Through interactions in the bulb simulation, identity and concentration representations appear to be partially separated. The output of the OB simulation presents one possible olfactory population code that has a number of desirable properties. There is a compression of odorant information into fewer spikes in fewer cells than the OSN input, but it still exists in a distributed manner and the representation is still redundant. Furthermore, odorant information is contained in brief spike patterns lasting approximately 200 ms or less (Figs. 5 and 6), a time course compatible with the brief odorant applications generated by sniffing. In the final output of the DLNN, identity and concentration information are separable into two distinct representations: odorant identity by a spatial code across DL units and odorant concentration by a temporal code in the latency of DL unit responses.
The DLNN represents one possible way of interpreting the spatio-temporal output of the OB. Other biologically plausible ways of interpreting OB output are also possible, such as considering the order of spike occurrence across MCs rather than their absolute latency (Rabinovich et al., 1999).
The artificial olfactory system incorporates a number of elements of the biological system, but several aspects are not included in the current implementation. For example, there are centrifugal inputs to the OB from higher olfactory structures that may be important for processing (e.g. see Linster and Gervais, 1996; Linster and Hasselmo, 1997). This and other aspects of the biological olfactory system will be explored in future incarnations of the artificial system.
Fig. 7. Outputs from a delay line neural network for two odorants, (A) acetone and (B) dimethyl sulfoxide, over a range of vapor concentrations (indicated by vapor molarity). The horizontal axis lists the DL units (Ace, BAc, Ben, DMSO, Hep, PAl). Asterisks mark the concentration used to set delay lines for the corresponding DL unit (i.e., the 'Ace' unit for acetone and the 'DMSO' unit for dimethyl sulfoxide; see the 9.5 × 10^-6 pattern in Fig. 6 for the DMSO target). In A, the response latencies for the Ace DL unit were: +37.8, +10.4, 0.0, -9.9, -9.4 ms (low to high concentration; negative values indicate shorter latencies). In B, the response latencies for the DMSO DL unit were: +30.5, +20.8, 0.0, -12.9, -16.6 ms. BAc, butyl acetate; Ben, benzene; Hep, heptane; PAl, propyl alcohol. From White and Kauer (1999), with permission from Elsevier Science.
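The delay line matching idea described above can be sketched as follows. This is a simplified illustration and not the published DLNN: it assumes one spike (the first-spike latency) per mitral cell, a fixed coincidence window, and invented latency patterns; the function names are hypothetical.

```python
def set_delays(target_latencies_ms):
    """Delays that make every spike of the target pattern arrive together."""
    latest = max(target_latencies_ms)
    return [latest - t for t in target_latencies_ms]


def unit_response(latencies_ms, delays_ms, window_ms=2.0):
    """Return (peak coincidence count, time of peak in ms) for one DL unit.

    window_ms is an assumed coincidence window.
    """
    arrivals = sorted(t + d for t, d in zip(latencies_ms, delays_ms))
    best_count, best_time = 0, None
    for i, start in enumerate(arrivals):
        # Count delayed spikes arriving within the window that opens at 'start'
        count = sum(1 for a in arrivals[i:] if a - start <= window_ms)
        if count > best_count:
            best_count, best_time = count, start
    return best_count, best_time


# Hypothetical first-spike latencies (ms) of 12 MCs for two target odorants
target_a = [20, 35, 50, 28, 61, 44, 33, 70, 25, 55, 48, 39]
target_b = [60, 22, 31, 75, 40, 27, 66, 35, 52, 29, 71, 45]
units = {"A": set_delays(target_a), "B": set_delays(target_b)}

# Odorant A at a higher concentration: same pattern, every spike 10 ms earlier
shifted_a = [t - 10 for t in target_a]
responses = {name: unit_response(shifted_a, delays) for name, delays in units.items()}
for name, (count, time_ms) in responses.items():
    print(f"unit {name}: {count} coincident inputs at {time_ms} ms")
```

In this sketch the unit tuned to odorant A still receives all 12 inputs in coincidence, only 10 ms earlier than for its target pattern, while the unit tuned to B sees few coincident inputs. That is the sense in which identity can be read from the pattern of activity across units and concentration from response latency.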
Conclusion
The first 'artificial nose' was described by Persaud and Dodd (1982) and since that time a number of research and commercial devices have been produced. All of these devices incorporate the two defining characteristics of an artificial nose, namely an array of cross-reactive sensors and a means of pattern recognition. As such, these devices provide support for the hypothesis that a system using an array of cross-reactive sensors is capable of molecular recognition and discrimination. Our studies with the artificial olfactory system reviewed here further suggest that distributed neuronal processing modeled directly after the biological olfactory system can also identify odorants over a range of concentrations. The aspects of the OB population code described above were not explicitly included in the equations and definitions of the computer simulation. They are instead properties that emerge as a result of OSN input patterns modified by lateral interactions in the OB. Whether olfactory information is coded in this way by the biological system is currently unknown. However, these studies suggest a number of testable hypotheses about various aspects of odorant coding in the biological olfactory system. For example, response latency appears to be an important component of the olfactory code at the OB and DLNN levels in the artificial system. While decreased latency with increased stimulus concentration has been reported in OB and piriform cortex single-unit recordings, studies of this aspect of odorant coding across populations of neurons have not been conducted. Imaging of voltage-sensitive dye signals and/or multisite, single-unit recordings from OB and piriform cortex could be used to investigate the relationship between firing latency in populations of neurons and odorant identity and concentration coding. In effect, the artificial olfactory system described here represents a formal statement of our hypotheses of olfactory structure and function from the periphery to the piriform cortex. The resulting device thus serves the dual purposes of providing a real-world chemical sensing device as well as providing a platform for developing hypotheses about odorant coding for testing in the biological olfactory system.
Acknowledgements
Supported by grants from the NIDCD and ONR, and a contract from DARPA.
References
Adrian, E.D. (1942) Olfactory reactions in the brain of the hedgehog. J. Physiol., 100: 459-473.
Araneda, R.C., Kini, A. and Firestein, S. (1999) Structure-activity relation between agonists and the mammalian octanal receptor. Soc. Neurosci. Abstr., 25: 386.
Baier, H. and Korsching, S. (1994) Olfactory glomeruli in the zebrafish form an invariant pattern and are identifiable across animals. J. Neurosci., 14: 219-230.
Bozza, T.C. and Kauer, J.S. (1998) Odorant response properties of convergent olfactory receptor neurons. J. Neurosci., 18: 4560-4569.
Cinelli, A.R., Hamilton, K.A. and Kauer, J.S. (1995) Salamander olfactory bulb neuronal activity observed by video rate, voltage-sensitive dye imaging. III. Spatial and temporal properties of responses evoked by odorant stimulation. J. Neurophysiol., 73: 2053-2071.
Delaney, K.R. and Hall, B.J. (1996) An in vitro preparation of frog nose and brain for the study of odour-evoked oscillatory activity. J. Neurosci. Methods, 68: 193-202.
Dickinson, T.A., White, J., Kauer, J.S. and Walt, D.R. (1996) A chemical-detecting system based on a cross-reactive optical sensor array. Nature, 382: 697-700.
Dionne, V. (1992) Chemosensory responses in isolated olfactory receptor neurons from Necturus maculosus. J. Gen. Physiol., 99: 415-433.
Dorries, K.M. and Kauer, J.S. (2000) Relationships between odor-elicited oscillations in the salamander olfactory epithelium and olfactory bulb. J. Neurophysiol., 83: 754-765.
Duchamp-Viret, P., Palouzier-Paulignan, B. and Duchamp, A. (1996) Odor coding properties of frog olfactory cortical neurons. Neuroscience, 74: 885-895.
Duchamp-Viret, P., Chaput, M.A. and Duchamp, A. (1999) Odor response properties of rat olfactory receptor neurons. Science, 284: 2171-2174.
Firestein, S., Picco, C. and Menini, A. (1993) The relation between stimulus and response in olfactory receptor cells of the tiger salamander. J. Physiol., 468: 1-10.
Friedrich, R.W. and Korsching, S.I. (1997) Combinatorial and chemotopic odorant coding in the zebrafish olfactory bulb visualized by optical imaging. Neuron, 18: 737-752.
Getchell, T.V. and Shepherd, G.M. (1978) Responses of olfactory receptor cells to step pulses of odour at different concentrations in the salamander. J. Physiol., 282: 521-540.
Gilman, A.G., Rall, T.W., Nies, A.S. and Taylor, P. (Eds.) (1990) The Pharmacological Basis of Therapeutics. Pergamon Press, New York.
Haberly, L.B. (1985) Neuronal circuitry in olfactory cortex: anatomy and functional implications. Chem. Senses, 10: 219-238.
Hamilton, K.A. and Kauer, J.S. (1989) Patterns of intracellular potentials in salamander mitral/tufted cells in response to odor stimulation. J. Neurophysiol., 62: 609-625.
Hasselmo, M.E., Wilson, M.A., Anderson, B.P. and Bower, J.M. (1990) Associative memory function in piriform (olfactory) cortex: computational modeling and neuropharmacology. Cold Spring Harbor Symp. Quant. Biol., LV: 599-610.
Hildebrand, J.G. and Shepherd, G.M. (1997) Mechanisms of olfactory discrimination: converging evidence for common principles across phyla. Annu. Rev. Neurosci., 20: 595-631.
Holley, A. and Døving, K.B. (1977) Receptor sensitivity, acceptor distribution, convergence and neural coding in the olfactory system. In: J. LeMagnen and P. MacLeod (Eds.), Olfaction and Taste VI. IRL Press, London, pp. 113-123.
Hopfield, J.J. (1995) Pattern recognition computation using action potential timing for stimulus representation. Nature, 376: 33-36.
Johnson, S.R., Sutter, J.M., Engelhardt, H.L., Jurs, P.C., White, J., Kauer, J.S., Dickinson, T.A. and Walt, D.R. (1997) Identification of multiple analytes using an optical sensor array and pattern recognition neural networks. Anal. Chem., 69: 4641-4648.
Kang, J. and Caprio, J. (1995) In vivo responses of single olfactory receptor neurons in the channel catfish, Ictalurus punctatus. J. Neurophysiol., 73: 172-177.
Kashiwadani, H., Sasaki, Y., Uchida, N. and Mori, K. (1999) Synchronized oscillatory discharges of mitral/tufted cells with different molecular receptive ranges in the rabbit olfactory bulb. J. Neurophysiol., 82: 1786-1792.
Kauer, J.S. (1974) Response patterns of amphibian olfactory bulb neurones to odour stimulation. J. Physiol., 243: 695-715.
Kauer, J.S. (1980) Some spatial characteristics of central information processing in the vertebrate olfactory pathway. In: H. van der Starre (Ed.), Olfaction and Taste VII. IRL Press, London, pp. 227-236.
Kauer, J.S. (1981) Olfactory receptor cell staining using horseradish peroxidase. Anat. Rec., 200: 331-336.
Kauer, J.S. (1987) Coding in the olfactory system. In: T.E. Finger and W.L. Silver (Eds.), Neurobiology of Taste and Smell. John Wiley and Sons, New York, pp. 205-231.
Kauer, J.S. (1988) Real-time imaging of evoked activity in local circuits of the salamander olfactory bulb. Nature, 331: 166-168.
Kauer, J.S. (1991) Contributions of topography and parallel processing to odor coding in the vertebrate olfactory pathway. Trends Neurosci., 14: 79-85.
Kauer, J.S. and Moulton, D.G. (1974) Responses of olfactory bulb neurones to odour stimulation of small nasal areas in the salamander. J. Physiol., 243: 717-737.
Kent, P.F. and Mozell, M.M. (1992) The recording of odorant-induced mucosal activity patterns with a voltage-sensitive dye. J. Neurophysiol., 68: 1804-1819.
Krautwurst, D., Yau, K.W. and Reed, R.R. (1998) Identification of ligands for olfactory receptors by functional expression of a receptor library. Cell, 94: 917-926.
Laing, D.G. (1986) Identification of single dissimilar odors is achieved by humans with a single sniff. Physiol. Behav., 37: 163-170.
Lam, Y.-W., Cohen, L.B., Wachowiak, M. and Zochowski, M.R. (2000) Odors elicit three different oscillations in the turtle olfactory bulb. J. Neurosci., 20: 749-762.
Lancet, D., Greer, C.A., Kauer, J.S. and Shepherd, G.M. (1982) Mapping of odor-related neuronal activity in the olfactory bulb by high-resolution 2-deoxyglucose autoradiography. Proc. Natl. Acad. Sci. USA, 79: 670-674.
Laurent, G. (1996) Dynamical representation of odors by oscillating and evolving neural assemblies. Trends Neurosci., 19: 489-496.
Linster, C. and Gervais, R. (1996) Investigation of the role of interneurons and their modulation by centrifugal fibers in a neural model of the olfactory bulb. J. Comput. Neurosci., 3: 225-246.
Linster, C. and Hasselmo, M. (1997) Modulation of inhibition in a model of olfactory bulb reduces overlap in the neural representation of olfactory stimuli. Behav. Brain Res., 84: 117-127.
MacKay-Sim, A. and Kesteven, S. (1994) Topographic patterns of responsiveness to odorants in the rat olfactory epithelium. J. Neurophysiol., 71: 150-160.
MacKay-Sim, A. and Shaman, P. (1984) Topographic coding of odorant quality is maintained at different concentrations in the salamander olfactory epithelium. Brain Res., 297: 207-216.
MacKay-Sim, A., Shaman, P. and Moulton, D.G. (1982) Topographic coding of olfactory quality: odorant-specific patterns of epithelial responsivity in the salamander. J. Neurophysiol., 48: 584-596.
Mair, R.G. (1982) Response properties of the rat olfactory bulb neurones. J. Physiol., 326: 341-359.
Malnic, B., Hirono, J., Sato, T. and Buck, L.B. (1999) Combinatorial receptor codes for odors. Cell, 96: 713-723.
Meisami, E. and Safari, L. (1981) A quantitative study of the effects of early unilateral olfactory deprivation on the number and distribution of mitral and tufted cells and of glomeruli in the rat olfactory bulb. Brain Res., 221: 81-107.
Meredith, M. (1986) Patterned response to odor in mammalian olfactory bulb: the influence of intensity. J. Neurophysiol., 56: 572-597.
Meredith, M. and Moulton, D.G. (1978) Patterned response to odor in single neurones of goldfish olfactory bulb: influence of odor quality and other stimulus parameters. J. Gen. Physiol., 71: 615-643.
Mombaerts, P., Wang, F., Dulac, C., Chao, S.K., Nemes, A., Mendelsohn, M., Edmondson, J. and Axel, R. (1996) Visualizing an olfactory sensory map. Cell, 87: 675-686.
Murrell, J.R. and Hunter, D.D. (1999) An olfactory sensory neuron line, Odora, properly targets olfactory proteins and responds to odorants. J. Neurosci., 19: 8260-8270.
Nemitz, J.W. and Goldberg, S.J. (1983) Neuronal responses of rat pyriform cortex to odor stimulation: an extracellular and intracellular study. J. Neurophysiol., 49: 188-203.
Ojima, H., Mori, K. and Kishi, K. (1984) The trajectory of mitral cell axons in the rabbit olfactory cortex revealed by intracellular HRP injection. J. Comp. Neurol., 230: 77-87.
Persaud, K. and Dodd, G. (1982) Analysis of discrimination mechanisms in the mammalian olfactory system using a model nose. Nature, 299: 352-355.
Polak, E.H. (1973) Multiple profile-multiple receptor site model for vertebrate olfaction. J. Theor. Biol., 40: 469-484.
Rabinovich, M.I., Huerta, R., Volkovskii, A., Abarbanel, H.D.I. and Laurent, G. (1999) Sensory coding with dynamically competitive networks. http://eprints.lanl.gov/
Ressler, K.J., Sullivan, S.L. and Buck, L.B. (1993) A zonal organization of odorant receptor gene expression in the olfactory epithelium. Cell, 73: 597-609.
Ressler, K.J., Sullivan, S.L. and Buck, L.B. (1994) Information coding in the olfactory system: evidence for a stereotyped and highly organized epitope map in the olfactory bulb. Cell, 79: 1245-1255.
Revial, M.E., Sicard, G., Duchamp, A. and Holley, A. (1982) New studies on odour discrimination in the frog's olfactory receptor cells. I. Experimental results. Chem. Senses, 7: 175-190.
Rubin, B.D. and Katz, L.C. (1999) Optical imaging of odorant representations in the mammalian olfactory bulb. Neuron, 23: 499-511.
Schild, D. (1988) Principles of odor coding and a neural network for odor discrimination. Biophys. J., 54: 1001-1011.
Schoenfeld, T.A., Clancy, A.N., Forbes, W.B. and Macrides, F. (1994) The spatial organization of the peripheral olfactory system of the hamster. Part I: Receptor neuron projections to the main olfactory bulb. Brain Res. Bull., 34: 183-210.
Scott, J.W., Ranier, E.C., Pemberton, J.L., Orona, E. and Mouradian, L.E. (1985) Pattern of rat olfactory bulb mitral and tufted cell connections to the anterior olfactory nucleus pars externa. J. Comp. Neurol., 242: 415-424.
Tanabe, T., Iino, M. and Takagi, S.F. (1975) Discrimination of odors in olfactory bulb, pyriform-amygdaloid areas, and orbitofrontal cortex of the monkey. J. Neurophysiol., 38: 1284-1296.
Touhara, K., Sengoku, S., Inaki, K., Tsuboi, A., Hirono, J., Sato, T., Sakano, H. and Haga, T. (1999) Functional identification and reconstitution of an odorant receptor in single olfactory neurons. Proc. Natl. Acad. Sci. USA, 96: 4040-4045.
Tsuboi, A., Yoshihara, S., Yamazaki, N., Kasai, H., Asai-Tsuboi, H., Komatsu, M., Serizawa, S., Ishii, T., Matsuda, Y., Nagawa, F. and Sakano, H. (1999) Olfactory neurons expressing closely linked and homologous odorant receptor genes tend to project their axons to neighboring glomeruli on the olfactory bulb. J. Neurosci., 19: 8409-8418.
Vassar, R., Chao, S.K., Sitcheran, R., Nunez, J.M., Vosshall, L.B. and Axel, R. (1994) Topographic organization of sensory projections to the olfactory bulb. Cell, 79: 981-991.
Vassar, R., Ngai, J. and Axel, R. (1993) Spatial segregation of odorant receptor expression in the mammalian olfactory epithelium. Cell, 74: 309-318.
Wellis, D.P., Scott, J.W. and Harrison, T.A. (1989) Discrimination among odorants by single neurons of the rat olfactory bulb. J. Neurophysiol., 61: 1161-1177.
White, J., Bozza, T.C. and Alkasab, T.K. (1999) Probability considerations in the study of olfactory receptor tuning. Chem. Senses, 24: 592.
White, J., Dickinson, T.A., Walt, D.R. and Kauer, J.S. (1998) An olfactory neuronal network for vapor recognition in an artificial nose. Biol. Cybern., 78: 245-251.
White, J., Hamilton, K.A., Neff, S.R. and Kauer, J.S. (1992) Emergent properties of odor information coding in a representational model of the salamander olfactory bulb. J. Neurosci., 12: 1772-1780.
White, J. and Kauer, J.S. (1999) Odor recognition in an artificial nose by spatio-temporal processing using an olfactory neuronal network. Neurocomputing, 26-27: 919-924.
White, J., Kauer, J.S., Dickinson, T.A. and Walt, D.R. (1996) Rapid analyte recognition in a device based on optical sensors and the olfactory system. Anal. Chem., 68: 2191-2202.
Wright, R.H. (1982) The Sense of Smell. CRC, Boca Raton, FL.
Youngentob, S.L., Mozell, M.M., Sheehe, P.R. and Hornung, D.E. (1987) A quantitative analysis of sniffing strategies in rats performing odor detection tasks. Physiol. Behav., 41: 59-69.
Zhao, H., Ivic, L., Otaki, J.M., Hashimoto, M., Mikoshiba, K. and Firestein, S. (1998) Functional expression of a mammalian odorant receptor. Science, 279: 237-242.
M.A.L. Nicolelis (Ed.)
Progress in Brain Research, Vol. 130 © 2001 Elsevier Science B.V. All rights reserved
CHAPTER 13
Neural population coding in the auditory system
Ellen Covey *
Department of Psychology, University of Washington, Box 351525, Seattle, WA 98195, USA
*Corresponding author: Ellen Covey, Department of Psychology, University of Washington, Box 351525, Seattle, WA 98195, USA. Tel.: +1-206-616-8112; Fax: +1-206-685-3157; E-mail: [email protected]
Population coding in sensory systems
Representation of sensory information in the form of a neural population code is an intuitively obvious concept that most neurobiologists take for granted. There is overwhelming empirical evidence for population coding in a wide variety of neural systems, one of the most universally familiar being the three classes of cones that contribute to human color vision. In its simplest form, a population code is a neural representation in which information is conveyed by the relative amounts of activity across multiple units within an array; although no individual unit provides unambiguous information, the activity of the population as a whole is unambiguous. This model of population coding quickly becomes incomplete when one considers the properties of real neural networks operating in biological systems. First, each unit within the population may be excitatory or inhibitory, so that its activity could be represented by a negative value as well as a positive value. Second, the response of a unit may be highly non-linear with respect to different values of each of the stimulus parameters that determine its response, and there may be considerable interactivity among stimulus parameters. In addition to variations in spike count or probability of firing, the latency and temporal pattern of each unit's response may vary according to stimulus conditions. Moreover, any unit with a branching axon may participate in multiple populations and its activity may have very different significance depending on which target neuron is being considered.
Within the context of these facts, the actual mechanisms by which population codes operate remain for the most part obscure. This article will present some examples from the auditory system of experimental data that show how a single neuron that receives convergent input from a given population provides a 'read-out' of activity in that population. It becomes clear that a concept like this is necessary when subthreshold activity is examined, revealing complex patterns of excitatory and inhibitory synaptic input over time which only under certain conditions cause the neuron to reach threshold and produce an action potential. The article will conclude by exploring the idea that even though each individual neuron provides a 'read-out' of a population code it is, in turn, just one element of a new and equally complex population code that transmits a spatio-temporally distributed pattern of information to other sets of neurons and/or motor units.
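The simplest form of population code described above can be illustrated with a toy model loosely patterned on the three cone classes. This is only a sketch: the Gaussian tuning curves, the peak wavelengths, the tuning width and the function names are assumptions chosen for illustration, not measured values. It shows that a single unit gives the same response to a dim stimulus at its peak and a brighter stimulus off its peak, whereas the relative pattern across the three units identifies the wavelength at any intensity.

```python
import math

# Hypothetical peak sensitivities (nm) and tuning width; illustrative, not measured values.
PEAKS_NM = (440.0, 535.0, 565.0)
WIDTH_NM = 60.0


def unit_response(peak_nm, wavelength_nm, intensity):
    """Response of one unit: Gaussian tuning scaled by stimulus intensity."""
    tuning = math.exp(-((wavelength_nm - peak_nm) ** 2) / (2.0 * WIDTH_NM ** 2))
    return intensity * tuning


def population_response(wavelength_nm, intensity):
    """Responses of all three units to the same stimulus."""
    return [unit_response(p, wavelength_nm, intensity) for p in PEAKS_NM]


def relative_pattern(responses):
    """Normalize so that only the relative amounts of activity remain."""
    total = sum(responses)
    return [r / total for r in responses]


def decode_wavelength(responses, candidates=range(400, 701)):
    """Pick the candidate wavelength whose relative pattern matches best."""
    target = relative_pattern(responses)

    def mismatch(wavelength_nm):
        pattern = relative_pattern(population_response(wavelength_nm, 1.0))
        return sum((a - b) ** 2 for a, b in zip(pattern, target))

    return min(candidates, key=mismatch)


# One unit alone cannot tell a dim light at its peak from a brighter light off-peak.
dim_at_peak = unit_response(535.0, 535.0, 0.5)
off_peak_tuning = unit_response(535.0, 600.0, 1.0)
bright_off_peak = unit_response(535.0, 600.0, 0.5 / off_peak_tuning)
print(dim_at_peak, bright_off_peak)  # equal up to floating-point rounding

# The pattern across the three units recovers the wavelength at either intensity.
print(decode_wavelength(population_response(500.0, 0.3)))
print(decode_wavelength(population_response(500.0, 2.0)))
```

The sketch deliberately leaves out everything the text goes on to list as missing from this simple picture: inhibitory units, non-linear interactions among stimulus parameters, and response timing.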
Spatial and temporal aspects of population coding
Many models of population coding largely ignore the temporal dimension of neural activity by summing or averaging spikes over some arbitrarily chosen time period following stimulus presentation to produce a spatially organized array or matrix of positive response values for each of the units that make up a given population. It is commonly assumed that a given set of stimulus conditions will invariably result in the same (or a 'noisy', but otherwise identical) set of values within the population matrix, providing a 'stable' population code. The resulting model is a population that essentially performs the same function as a labeled line since, even though the activity of a single neuron is ambiguous, the activity across the population is unambiguous, resulting in a stable multiunit map of a particular stimulus dimension. In order for this strategy to work effectively under all conditions, one must assume that the activity of a given neural element or population provides information about a single stimulus dimension or parameter. Given what we know about neural activity in sensory systems, it seems much more likely that information about multiple stimulus dimensions is conveyed by the spatio-temporal pattern of activity across the population and that information about multiple stimulus dimensions is multiplexed in the discharge of each single unit within the population. One way in which the multiplexed information may be separated is by divergence of a neuron's output to multiple targets so that the 'meaning' of the information conveyed by a given input unit depends on the other inputs that the target cell receives.
Read-out of a population code
At every level, a population code necessarily requires some kind of 'read-out'. Very few studies have addressed the question of how population codes are 'read', even in the obvious case of color vision, and fewer yet have addressed the question of how the resulting 'read-out' is used at subsequent stages. This is an issue that needs to be grappled with before any significant progress can be made in realistically modeling information processing in large populations of neurons and neural systems. The read-out of a population code is sometimes thought of in vague terms as a neuron in the cortex that observes and interprets the neural activity that goes on below. In more specific terms, the 'read-out' may be thought of as the last cell in a hierarchy of progressively more specialized cells that act as detectors for specific constellations of stimulus parameter values. This process culminates in so-called 'grandmother cells', cells that are essentially labeled lines for highly complex stimuli. Although many cells in sensory systems do respond selectively
to specific stimulus features, selective responses do not mean that the cell's only function is to 'detect' that feature, or that it would not respond to other stimulus features in other contexts. Perhaps the earliest and best understanding of how sensory population codes produce 'read-outs' comes from studies of the electrosensory system of fish (e.g. Heiligenberg, 1990, 1991a,b). However, as more is learned about information processing in the mammalian auditory system, it becomes increasingly apparent that many features of population coding are common to both the electrosensory system and the auditory system. In both fish and mammals, midbrain neurons receive convergent excitatory and inhibitory input from large populations of lower brainstem neurons and provide divergent output to a variety of sensory and motor pathways. The convergence of inputs at the midbrain results in selective filtering and, ultimately, neural population responses that control behavior. In the fish, for example, 'sign-selective' cells in the midbrain and diencephalon respond selectively to beat frequencies generated by interaction of the electric organ discharge of a neighboring fish with the fish's own discharge; these cells indicate whether the frequency of the neighbor's discharge is higher or lower than the fish's own. Comparisons are made across every point on the fish's body and the results are pooled to give a population estimate of whether the fish should raise or lower its own discharge frequency to avoid 'jamming' of its own electrosensory system by the neighbor's signals. In this system, each midbrain neuron makes a 'decision' based on the population inputs that it receives as to whether the neighbor's signal is higher or lower in frequency than the fish's own. The resulting motor output that changes the frequency of the fish's electric organ discharge depends on whether the majority of midbrain neurons indicate that the frequency should be raised or lowered. 
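The pooling step described for the electric fish can be sketched as a majority vote. In this minimal illustration each hypothetical 'sign-selective' unit sees a noisy local estimate of the frequency difference and votes on whether the neighbor is higher or lower; the noise level, the number of units and the function names are assumptions, and the rule that the fish shifts its own frequency away from the neighbor's is taken as given rather than derived.

```python
import random


def unit_vote(difference_hz, noise_sd_hz, rng):
    """One hypothetical sign-selective unit: +1 if the neighbor seems higher, -1 if lower."""
    noisy_estimate = difference_hz + rng.gauss(0.0, noise_sd_hz)
    return 1 if noisy_estimate > 0 else -1


def population_decision(votes):
    """Pool the votes; the fish shifts its own frequency away from the neighbor's."""
    total = sum(votes)
    if total > 0:
        return "lower own frequency"   # majority says the neighbor is higher
    if total < 0:
        return "raise own frequency"   # majority says the neighbor is lower
    return "no change"


rng = random.Random(0)  # fixed seed so the sketch is repeatable
# Neighbor is 4 Hz above the fish's own discharge; each unit sees a noisy local estimate.
votes = [unit_vote(4.0, 10.0, rng) for _ in range(501)]
print(sum(v > 0 for v in votes), "of", len(votes), "units vote 'higher'")
print(population_decision(votes))
```

Individual units are unreliable under these assumed noise levels, yet the pooled decision is stable, which is the sense in which the behavior depends on the population rather than on any single 'read-out' cell.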
Studies of the auditory system in bats and other mammals indicate that neurons in the midbrain, thalamus and cortex perform similar neural computations that depend on population input, with different neurons responding selectively to different behaviorally relevant features of sounds (e.g. O'Neill and Suga, 1982; Olsen and Suga, 1991a,b; Ehrlich et al., 1994; Fuzessery, 1994). One of the best approaches to studying the nature of these computational mechanisms is intracellular recording, which provides information about the nature and time course of the different inputs to an individual midbrain neuron from the population that provides its input.
Read-out of population activity in the auditory system
Studies of the mammalian central auditory system, especially those conducted in echolocating bats, provide important clues as to how population codes are implemented in terms of neural circuitry, and suggest what form the 'read-out' of a population code might take in this system. In the mammalian auditory brainstem, as in the brainstem of electric fish, timing precision is enhanced at early stages of processing through convergence of a population of afferents onto a target neuron (e.g. Kawasaki et al., 1988; Covey and Casseday, 1991; Joris et al., 1994). Each individual neuron in the auditory midbrain receives a spatio-temporally distributed pattern of information from a specific population of input neurons (e.g. Casseday and Covey, 1995; Covey and Casseday, 1995, 1999). Each neuron within the input population may be excitatory or inhibitory. The synaptic potentials it contributes to the target neuron have a characteristic latency and time course which may change according to stimulus parameters. The effect of activity in any one neuron within the input population necessarily depends on the magnitude, nature, and time course of input from the other neurons in that same population. Moreover, the effect of the net input from the population at any instant depends on the intrinsic properties, prior history, and current state of the target neuron. As in the electric fish, the convergence of parallel pathways on cells in the auditory midbrain results in selectivity for biologically relevant features of sound. For echolocating bats, these features include the location of the sound source, signal amplitude, signal duration, and the pattern of frequency change over the course of the signal.
Intracellular recordings from individual neurons in the mammalian auditory midbrain and the midbrain of electric fish show that the synaptic currents and membrane potential recorded in response to a stimulus follow a time course that reflects the pattern of activity across the population of input neurons that project to the target cell; this pattern of activity
varies systematically as a function of any stimulus parameter that is varied (Rose and Call, 1992, 1993; Rose, 1995; Covey et al., 1996; Kuwada et al., 1997). Not only do the membrane currents or potentials of each neuron provide a 'read-out' of the activity that occurs across the population of input neurons; they also allow us to see how the cell's pattern of spike output is shaped based upon the characteristics of the subthreshold 'read-out' (Nelson and Erulkar, 1963; Casseday et al., 1994; Covey et al., 1996; Kuwada et al., 1997; Covey and Casseday, 1998).
Fig. 1 shows a simple example of information that is present in subthreshold activity, but not in spike output. This recording is from the midbrain auditory center, the inferior colliculus (IC) of an echolocating bat. The stimuli in this case are sinusoidally amplitude modulated tones, at three different modulation rates. This neuron, like many others in the IC from which extracellular recordings have been obtained, discharges action potentials only at the onset of the modulated signal and acts as a lowpass filter, responding only up to modulation frequencies of 100-200 Hz. The onset response characteristic is not a peculiarity of the conditions of intracellular recording, but rather a commonly observed property of midbrain auditory neurons (for review, see Casseday and Covey, 1996). From the oscillatory nature of the traces in Fig. 1, it is clear that the subthreshold pattern of synaptic currents resulting from input to the neuron at each modulation rate provides a fairly accurate reflection of the amplitude profile of the stimulus, but the spike output does not. The fact that spikes occur only in response to the first cycle may be due to intrinsic properties of the cell that prevent it from responding to a constant depolarizing input or, in this case, to a rapid series of depolarizing inputs (for review see Trussell, 1999). Spike output in response to the first cycle appears to be determined by the rate at which amplitude initially increases and by the period of time over which the amplitude of the first cycle remains above threshold. The neuron evidently receives a sequence of inhibitory and excitatory inputs that, combined with its intrinsic membrane properties, provide lowpass filter characteristics that progressively reduce its response to even the first cycle at higher modulation rates. The mechanism for creating low-pass filters through interaction of excitation and inhibition seems to be common to auditory neurons at several levels (e.g. Grothe, 1994; Covey et al., 1996) as well as neurons in the midbrain of electric fish (Rose, 1995).
Fig. 1. Responses of an IC neuron to a 100-ms stimulus consisting of a tone with a carrier frequency of 26 kHz, sinusoidally amplitude modulated at three different rates. The sound level was 35 dB SPL. The cell's resting membrane potential was -67 mV. Modulation rates are indicated at the right side of each trace. This and all other whole-cell patch-clamp recordings illustrated were made in voltage clamp mode, with the cell held at its resting membrane potential. Upward deflections represent inhibitory postsynaptic currents (IPSCs) and downward deflections represent excitatory postsynaptic currents (EPSCs). Because the cell had an extensive dendritic tree, rapid, large depolarizing currents caused the cell to escape voltage clamp and fire action potentials. Recording methods are described in detail in Covey et al. (1996). Figure modified from Covey et al. (1996).
The role of convergent inputs from populations of inhibitory and excitatory neurons in determining filter characteristics of auditory neurons will be further explored later on in this review.
What happens to the read-out of a population code?
Neurons in the auditory midbrain not only project to the thalamocortical system; they also project to multiple pathways that feed into motor systems (e.g. Covey et al., 1987; Schuller et al., 1991; Casseday and Covey, 1996). These findings suggest that a spatio-temporal pattern of activity across a large population of 'read-out' neurons in the auditory midbrain is transmitted onto different populations of output neurons to ultimately generate different spatio-temporal sequences of muscle activation and/or sensory activation of thalamocortical pathways. This means that a given midbrain auditory neuron can potentially participate in more than one output population. The following sections will first summarize the basic circuitry underlying the auditory population code and then go on to describe in detail several examples of how responses are shaped by integration of excitatory and inhibitory synaptic events originating in a population of input neurons. Finally, they will consider one example of how the output of an IC 'read-out' neuron might be used in different contexts, with the neuron acting as one element in several distinct populations that project to subsequent levels.
The brainstem auditory pathways: circuitry and organizing principles
Fig. 2 shows a schematic diagram of the major ascending pathways in the mammalian brainstem from the cochlear nucleus to the level of the midbrain auditory center, the inferior colliculus (IC). At the input stage, information carried by each auditory nerve fiber diverges to three major subdivisions of the cochlear nucleus, each containing multiple cell types. The subdivisions of the cochlear nucleus give rise to a number of distinct ascending pathways. Some of these (the dorsal cochlear nucleus and anteroventral cochlear nucleus, for example) project directly to the contralateral IC. The anteroventral cochlear nucleus also projects to structures in the contralateral brainstem that in turn project to the IC, thus forming a system of indirect 'monaural' pathways from the cochlear nucleus to the IC. Examples of indirect, monaural pathways are the projections via the intermediate and ventral nuclei of the lateral lemniscus (INLL and VNLL). Bilateral projections from the anteroventral cochlear nucleus terminate in the nuclei of the superior olivary complex and other bilaterally innervated pathways in the lower brainstem. These in turn project directly or indirectly to the IC, forming a complex system of 'binaural' pathways. Examples of binaural pathways are the projections via the lateral and medial superior olivary nuclei and the dorsal nucleus of the lateral lemniscus. The end result of this complicated connectional network in the lower brainstem is that any cell in the IC can potentially receive 'straight-through' input from at least two major populations of cells in the cochlear nucleus as well as binaural and monaural input from 10 or more different neural populations that have already performed various types and degrees of information processing. Because the latency ranges of cells in all of the 'straight-through' and multisynaptic pathways overlap, a neuron in the IC could potentially receive information from the direct and indirect pathways simultaneously.
Fig. 2. Simplified diagram showing the major monaural pathways (black) and binaural pathways (gray) in the auditory brainstem of an echolocating bat. Each cochlear nucleus (CN) receives input from the ipsilateral auditory nerve and projects bilaterally to multiple structures in the superior olivary complex (SOC). The SOC is the source of a direct pathway to the inferior colliculus (IC). The SOC also gives rise to bilateral pathways that project to the dorsal nucleus of the lateral lemniscus (DNLL, here included with the SOC to form the binaural system of pathways); the DNLL, in turn, projects bilaterally to the IC. Monaural pathways from the CN include a direct projection from the CN to the contralateral IC as well as indirect projections via the contralateral monaural nuclei of the lateral lemniscus (INLL, VNLLc and VNLLm). The monaural nuclei of the lateral lemniscus, in turn, also project to the IC. From Covey and Casseday (1999).
Tonotopic organization and population coding
The general structure of the brainstem auditory pathways follows several fundamental organizing principles. The first of these is tonotopic organization. It is well known that the active and passive mechanical properties of the cochlea give rise to a tonotopic place code within the receptor array such that hair cells at the base of the cochlea are maximally stimulated by high frequencies and those at the apex by low frequencies. The resulting tonotopic organization is commonly thought of as a labeled line code that produces a fixed spatial map of 'pitch' that is maintained at every level of the ascending auditory system through the auditory cortex. There are several reasons why this idea is an oversimplification. First, the place code is by no means the only source of information about the pitch of a sound. It has been known for decades that the activity of auditory nerve fibers is synchronized in time with the waveform of a low frequency tone or with the pattern of amplitude modulation of a high frequency tone (e.g. Kiang et al., 1965; Rose et al., 1968; Brugge et al., 1969), and that either form of 'phase-locking' gives rise to the perception of a pitch corresponding to the frequency of the phase-locked activity (e.g. Wever and Bray, 1937; Wever, 1949; De Boer, 1976; Nordmark, 1978). There is good evidence that the temporal distribution of action potentials across multiple channels carries usable information about pitch that can override, or at least be integrated with, the information conveyed by the place code (e.g. Licklider, 1951, 1956, 1959; Delgutte, 1980; Srulovicz and Goldstein, 1983; Delgutte and Cariani, 1992). Second, due to the mechanical properties of the cochlea, auditory nerve fibers do not respond to a single frequency at suprathreshold levels, but rather to a range of frequencies that broadens considerably as sound intensity is increased so that at the sound level of normal conversation, virtually all auditory nerve fibers are continually active (e.g. Kiang et al., 1967; Kiang and Moxon, 1972). This means that for a given auditory nerve fiber or neuron in the central auditory system, the same spike rate can be elicited by either a low intensity sound at the optimal frequency, or by a high intensity sound at a non-optimal frequency. Thus, the spike rate cannot provide unambiguous information about either frequency or intensity. This ambiguity can only be resolved by comparing relative amounts of activity in different units and by examining the distribution of spikes over time. Finally, the response of any auditory nerve fiber or neuron within the central auditory system to a particular sound segment depends not only on frequency and intensity, but also on the context in which that sound segment is heard. For many neurons in the central auditory system, the magnitude of their response depends heavily on what sounds have occurred at specific times prior to the stimulus under consideration (e.g. Feng et al., 1978; O'Neill and Suga, 1982; Margoliash, 1983, 1986; Covey, 1993a; Palombi et al., 1994).
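The frequency-intensity ambiguity of a single unit's spike rate, and its resolution by comparison across units, can be made concrete with a toy V-shaped tuning curve. This is a sketch under stated assumptions: the minimum threshold, the 60 dB/octave slope, the linear rate function and the two best frequencies are all hypothetical values chosen so that the arithmetic is easy to follow.

```python
import math

# Hypothetical tuning-curve parameters, chosen only for illustration.
MIN_THRESHOLD_DB = 10.0       # threshold at the unit's best frequency
SLOPE_DB_PER_OCTAVE = 60.0    # threshold rises this fast away from best frequency


def threshold_db(best_freq_khz, freq_khz):
    """V-shaped tuning curve: threshold grows with distance (in octaves) from best frequency."""
    octaves = abs(math.log2(freq_khz / best_freq_khz))
    return MIN_THRESHOLD_DB + SLOPE_DB_PER_OCTAVE * octaves


def rate(best_freq_khz, freq_khz, level_db):
    """Spike rate in arbitrary units: how far the sound level exceeds threshold."""
    return max(0.0, level_db - threshold_db(best_freq_khz, freq_khz))


def population_rates(best_freqs_khz, freq_khz, level_db):
    """Rates of several units, each with its own best frequency, for one stimulus."""
    return [rate(bf, freq_khz, level_db) for bf in best_freqs_khz]


unit_a = 26.0                 # best frequency of unit A (kHz)
unit_b = 26.0 * 2 ** 0.5      # unit B is tuned half an octave higher

quiet_on_bf = (unit_a, 30.0)  # (frequency in kHz, level in dB): quiet tone at A's best frequency
loud_off_bf = (unit_b, 60.0)  # louder tone half an octave above A's best frequency

# Unit A alone responds almost identically to the two stimuli.
print(rate(unit_a, *quiet_on_bf), rate(unit_a, *loud_off_bf))

# Adding unit B separates them: B is silent for the first stimulus and active for the second.
print(population_rates([unit_a, unit_b], *quiet_on_bf))
print(population_rates([unit_a, unit_b], *loud_off_bf))
```

The model captures only the first of the two remedies named in the text, the comparison of relative activity across units; it says nothing about the distribution of spikes over time.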
One important consequence of the auditory system's sensitivity to context is that we are better at discriminating frequency relationships than we are at making absolute frequency judgements. For example, we are able to recognize a melody regardless of the key in which it is played, and regardless of whether it is transposed upward or downward by one or more octaves. Thus, the primary function of frequency encoding in the auditory system is probably not pitch perception per
se, but rather the perception of relationships among different frequencies that occur simultaneously or sequentially. Despite the popularity of the 'auditory place code' in textbook accounts of the neural basis of pitch perception, it is not entirely clear how tonotopic organization relates to pitch perception. Nevertheless, one important consequence of tonotopic organization is that it provides multiple channels that can be combined in different temporal relationships at higher levels of processing. By the level of the IC, individual target neurons receive convergent input from a population of input neurons tuned to different frequency ranges, and having different thresholds and latencies. Moreover, the inputs from some members of the population may be excitatory, while others are inhibitory. The result is selectivity for sounds in which particular frequency relationships occur, either simultaneously or in the temporal domain. Fig. 3 shows intracellular recordings from a neuron in the IC of the big brown bat in response to tones at three different frequencies. A tone at 26 kHz elicited a train of action potentials, indicating that this neuron received excitatory input from one or more neurons tuned to a frequency range that included 26 kHz. Tones at higher frequencies (e.g. 29 kHz) and lower frequencies (e.g. 25 kHz), elicited inhibitory postsynaptic currents, indicating that the IC neuron received inhibitory input from neurons tuned to ranges that included these frequencies. One of the most striking features of these inputs is the large difference in latency between the inhibitory postsynaptic current evoked by the 29-kHz sound (about 40 ms) and that evoked by the 25-kHz sound (about 20 ms). This means that certain temporal sequences of sounds would maximally drive the neuron, while other sequences would suppress its response. For example, the response of this neuron would be suppressed when a 29-kHz tone preceded a 26-kHz tone by approximately 20 ms. Not only are the relative amounts of activity in the different frequency-specific units that make up the input population important for determining this neuron's response, their timing is an important factor in determining whether the IC neuron will fire in response to a given sequence of sounds.
Fig. 3. Responses of an IC neuron to a 5-ms, 45-dB SPL tone at three different frequencies, as seen by whole-cell patch-clamp recording. Sound frequency is indicated to the left of each trace. Note that the frequencies are arranged in nonsequential order. When the tone was at the neuron's best frequency (bottom trace, 26 kHz) the response was an EPSC accompanied by multiple action potentials. When the frequency of the tone was above the neuron's best frequency (middle trace, 29 kHz), there was a long latency IPSC, possibly followed by a small rebound. When the frequency of the tone was below the neuron's best frequency (upper trace, 25 kHz), there was a short latency IPSC. The cell's resting membrane potential was -70 mV. Spikes have been truncated for ease of viewing. Modified from Covey et al. (1996).
Auditory neurons selective for specific combinations of frequencies presented simultaneously or sequentially are common in the auditory system of echolocating bats, starting at the level of the IC (e.g. Suga, 1969; Feng et al., 1978; Casseday and Covey, 1992; Fuzessery, 1994; Mittmann and Wenstrup, 1995; Dear and Suga, 1995) and proceeding through the thalamus (Olsen and Suga, 1991a,b), and cortex (e.g. O'Neill and Suga, 1982; Dear et al., 1993). Neurons selective for specific frequency combinations clearly develop this form of selectivity through convergent inputs from a population of neurons differentially tuned to frequency.
For neurons that respond best to a specific temporal relationship between (or among) different frequencies, the inputs from different members of the population arrive with different delays relative to sound onset, augmenting or suppressing one another when the properties of the stimulus are such that the evoked postsynaptic potentials coincide.
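The dependence of the response on the relative timing of frequency-specific inputs can be sketched with rectangular synaptic events summed on a 1-ms grid. The two inhibitory latencies (about 40 ms for 29 kHz, about 20 ms for 25 kHz) come from the text; the 20-ms excitatory latency for 26 kHz, the 10-ms duration of each event, the equal input strengths and the threshold are assumptions made only so that the example runs.

```python
# Latency (ms) and sign of the input evoked by each tone frequency (kHz).
# The inhibitory latencies follow the text; the 20-ms excitatory latency is an
# assumption, as are the event duration and the spike threshold below.
INPUTS = {26: (20, +1.0), 29: (40, -1.0), 25: (20, -1.0)}
PSC_DURATION_MS = 10
SPIKE_THRESHOLD = 0.5


def net_input(tones, t_max_ms=150):
    """Sum rectangular synaptic events on a 1-ms grid.

    tones is a list of (frequency_khz, onset_ms) pairs.
    """
    trace = [0.0] * t_max_ms
    for freq_khz, onset_ms in tones:
        latency_ms, sign = INPUTS[freq_khz]
        start = onset_ms + latency_ms
        for t in range(start, min(start + PSC_DURATION_MS, t_max_ms)):
            trace[t] += sign
    return trace


def fires(tones):
    """True if the net input exceeds the spike threshold at any time."""
    return max(net_input(tones)) > SPIKE_THRESHOLD


print(fires([(26, 0)]))             # 26-kHz tone alone
print(fires([(29, 0), (26, 20)]))   # 29 kHz leads 26 kHz by 20 ms: the events coincide
print(fires([(29, 0), (26, 0)]))    # simultaneous onset: inhibition arrives too late
print(fires([(25, 0), (26, 0)]))    # 25 kHz with 26 kHz: short-latency inhibition coincides
```

Under these assumptions the model neuron is silenced only by those sequences in which the delayed inhibition lands on the excitation, which is the sense in which the same input population supports selectivity for temporal order as well as for frequency.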
Parallel processing and population coding
Although tonotopic 'mapping' is the best known organizing principle in the ascending auditory pathway, an equally important principle is the division of the pathway into multiple parallel streams of processing. The auditory system is unusual among mammalian sensory systems in the amount of processing that occurs prior to the thalamus. Much of this processing depends on the divergence of inputs into separate pathways, each of which may consist of multiple levels. At the level of the auditory nerve, all fibers discharge in fundamentally the same, primary-like, pattern. This means that for a sound of a given duration at a fixed frequency and intensity, the firing rate is initially high, but soon decreases to a steady state that is maintained for the remaining duration of the sound. In the cochlear nucleus, auditory nerve fibers diverge to terminate in different characteristic patterns on different populations of neurons with different morphologies and different intrinsic membrane properties. These factors alone are sufficient to transform the primary-like discharge patterns of auditory nerve fibers to a multitude of new temporal discharge patterns ranging from sustained responses to purely onset responses consisting of a single action potential (e.g. Oertel, 1991; Trussell, 1999). Neurons with different response types form parallel output pathways that innervate a number of separate auditory processing centers in the lower brainstem. Each of these centers performs some specialized transformational and/or computational function. The underlying mechanisms and nature of the different types of processing that occur in these pathways are numerous and complex.
However, examples include the comparison of information from the two ears in the binaural pathway (for reviews see Kuwada and Yin, 1987; Casseday and Covey, 1987; Irvine, 1992; Konishi, 1993), sharpening of timing information through frequency convergence in the monaural pathway (for reviews see Covey, 1993b; Oertel, 1999), and lengthening of latency or conversion of sustained responses to offset responses through the action of neural inhibition (e.g. Grothe et al., 1992; Grothe, 1994). Fig. 4 shows how convergence of excitatory and inhibitory inputs can produce latency changes far greater than any that could result from the known differences in axon length or the number of synapses through which information is known to travel within the pathways of the auditory brainstem. Because the excitatory and inhibitory inputs can differ in their thresholds and other properties, their relationship changes as a function of sound intensity, causing corresponding changes in the target neuron's pattern of response, including a complete suppression of the response at high sound levels in some IC neurons.
Fig. 4. Growth of short-latency inhibition and consequent increases in spike latency as a function of sound level as seen in whole-cell patch-clamp recordings. Traces show responses of an IC neuron to 5-ms tones with a frequency of 27 kHz, presented at three different sound levels. Sound pressure level is indicated at the right of each trace. First-spike latency is indicated beneath the arrow on each trace. As sound level increases, the magnitude and duration of the initial IPSC grow, progressively truncating the initial portion of the evoked spike train. The cell's resting membrane potential was -70 mV. Spikes have been truncated. Modified from Covey et al. (1996).
There is massive convergence at the IC from all of the different stages of the binaural and monaural pathways in the brainstem. In fact, it is likely that a single neuron in the IC could receive direct input from the cochlear nucleus as well as indirect input from neurons that are one, two, three, or more synapses removed from the cochlear nucleus, with latencies that vary from 1 millisecond to many tens of milliseconds (e.g. Haplea et al., 1994; Covey and Casseday, 1995). Thus, it is reasonable to regard the collective outputs of the lower brainstem auditory pathways not only as a highly sophisticated population code in which qualities such as 'pitch' or 'timbre' are represented by relative amounts of activity across a population at a given point in time, but also as one in which population activity evoked by an 'instantaneous' stimulus becomes widely distributed in time. In the following section, it will become clear how such a spatially and temporally distributed code can facilitate the analysis of the temporal structure of sounds.
Spatio-temporally distributed population activity and duration tuning
In the IC of the bat, more than one-third of all neurons respond only to certain durations of sound that are within a biologically relevant range corresponding to the typical durations of its echolocation calls (Ehrlich et al., 1997; Fuzessery, 1994). Fig. 5 shows examples of bandpass-type duration tuned neurons. These neurons do not respond if the sound is too short, nor do they respond if it is too long. Whole-cell patch-clamp recording and neuropharmacological experiments have shown that duration tuning is created through the convergence of multiple excitatory and inhibitory inputs in a temporally specific fashion (Casseday et al., 1994; Covey et al., 1996; Casseday et al., 2000). Fig. 6 summarizes the basic neural mechanism for creating duration tuning. According to this model, the neuron's inputs can be separated into three major components. The first is a transient excitatory component that arrives at a fixed latency relative to sound onset. The second is a sustained inhibitory component that arrives with a latency less than or equal to that of the onset excitatory input, partially cancelling the excitation and rendering it subthreshold. The third component is transient subthreshold excitation (possibly due to an excitatory rebound from inhibition) that occurs soon after sound offset. The neuron's membrane potential reflects the interaction of all of these inputs regardless of signal duration, but an output in the form of action potentials is produced only when the sound's duration is such that the onset and offset excitatory
Fig. 5. Examples of two duration-tuned neurons in the IC. Each graph shows the number of spikes per stimulus, averaged over 20 trials, plotted as a function of stimulus duration. Stimuli were pure tones, 36 kHz, 52 dB SPL for the neuron in (A) and 28 kHz, 84 dB SPL for the neuron in (B). Recordings were extracellular, as described in detail in Ehrlich et al. (1997). The dashed line in (A) indicates this neuron's spontaneous discharge rate. The neuron in (B) did not have any spontaneous activity. Modified from Ehrlich et al. (1997).
components coincide in time with one another. We do not know at present whether each component of the input originates in a single neuron or in a large population of neurons. Although the latter is most likely the case, this level of detail does not affect the basic model which, in any case, requires multiple inputs.
Redistribution of information from a neural population
The strategy of simply counting up the total number of spikes fired in response to a sensory stimulus over some arbitrary time period is often assumed to provide accurate information about stimulus magnitude or the magnitude of the resulting sensation. In
Fig. 6. Schematic diagram showing the basic mechanism responsible for duration tuning of IC neurons. The neuron receives a transient excitatory input (bottom trace in A and B) that always arrives at a fixed latency relative to sound onset. It also receives a sustained inhibitory input (middle trace in A and B) that arrives at the same time as, or slightly before, the excitatory input. This inhibitory input partially cancels the excitatory input, rendering it subthreshold. The inhibitory input is followed by a rebound or second transient excitatory input, the timing and magnitude of which are determined by the time of sound offset. If the duration of the sound is such that the onset excitation and offset rebound (or excitation) coincide, the neuron reaches threshold and fires one or more action potentials (top trace in A). If the sound duration is such that the two depolarizing events do not coincide, the neuron does not fire (top trace in B). Modified from Covey and Casseday (1999).
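The mechanism summarized in Fig. 6 can be illustrated with a minimal numerical sketch. It is not the authors' model: the Gaussian shape of the transient inputs, all latencies, the inhibitory strength and the threshold are hypothetical values in arbitrary units, chosen only so that onset and offset excitation coincide for tones of about 10 ms.

```python
import math

def transient(t_ms, center_ms, width_ms=1.5):
    """Unit-amplitude, Gaussian-shaped depolarization centred at center_ms (assumed shape)."""
    return math.exp(-0.5 * ((t_ms - center_ms) / width_ms) ** 2)

def peak_potential(duration_ms, onset_latency=12.0, offset_latency=2.0,
                   inhib_latency=8.0, inhib_strength=0.4,
                   dt_ms=0.1, t_max_ms=80.0):
    """Largest net depolarization (arbitrary units) evoked by one tone.

    All parameter values are hypothetical and serve only as an illustration.
    """
    peak = 0.0
    for i in range(int(t_max_ms / dt_ms) + 1):
        t = i * dt_ms
        # Onset excitation: fixed latency relative to sound onset.
        v = transient(t, onset_latency)
        # Offset excitation (or rebound): fixed latency relative to sound offset.
        v += transient(t, duration_ms + offset_latency)
        # Sustained inhibition: leads the onset excitation and lasts as long as the sound.
        if inhib_latency <= t <= inhib_latency + duration_ms:
            v -= inhib_strength
        peak = max(peak, v)
    return peak

def fires(duration_ms, threshold=1.4):
    """The model neuron spikes only if the summed input reaches threshold."""
    return peak_potential(duration_ms) >= threshold

# Durations (ms) for which the sketch produces a spike.
tuned = [d for d in range(1, 41) if fires(d)]
print(tuned)
```

With these assumed values each excitatory transient alone stays below threshold, so the model fires only for the narrow band of durations at which the two transients overlap. Changing `onset_latency` or `offset_latency` shifts the preferred duration.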
some cases, this may be true, or at least a good approximation. In other cases, the relationship between stimulus magnitude and neural firing is more complicated. For example, most neurons in the auditory system show bandpass tuning to sound frequency, responding over some limited portion of the audible range. Typically, the frequency range that will evoke a response broadens with increases in sound amplitude, with the greatest neural discharge being evoked by a frequency near the center of the range, and falling off at both higher and lower frequencies. In addition to their frequency tuning, however,
Fig. 7. Responses of an IC neuron to a 25-kHz tone, 5 ms in duration, presented at four different stimulus amplitudes, as seen by whole-cell patch-clamp recording. Sound amplitude is indicated to the left of each trace. The traces are arranged with low sound levels at the bottom and high sound levels at the top. There are no spikes in the bottom trace, where sound level (24 dB SPL) was below threshold. In the top trace (54 dB SPL), the only response is a small IPSC (arrow), indicating that the predominant input to the cell was inhibitory. The cell's resting membrane potential was -59 mV. Modified from Covey et al. (1996).
many neurons in the central auditory system, especially at the level of the midbrain and higher, have a non-monotonic relationship between sound amplitude and firing rate. That is, up to some point the neuron's firing rate increases with increases in sound amplitude, but further increases in amplitude cause it to fire fewer spikes or even cease firing in response to loud sounds (e.g. Casseday and Covey, 1992). Thus, the range of amplitudes over which the neuron provides output is limited, creating bandpass tuning to sound amplitude in the same neurons that have bandpass tuning for frequency. Fig. 7 shows an example of intracellular recordings from an amplitude-tuned neuron in the IC. This neuron receives excitatory input over a range of relatively low sound amplitudes, but at high sound amplitudes the net input is inhibitory. This suggests that the neuron receives excitatory input from one population made up of low-threshold neurons and inhibitory input from another population made up of high-threshold neurons. Moreover, at high sound amplitudes, the strength of the inhibitory input is more than sufficient to cancel out the contribution of the excitatory inputs.
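The way a low-threshold excitatory population and a stronger, high-threshold inhibitory population could combine to give bandpass tuning for amplitude can be sketched as follows. This is an illustration under stated assumptions, not a fit to the recordings in Fig. 7: the thresholds, gains and saturation values are hypothetical, and only the four test levels are taken from the figure.

```python
def drive(level_db, threshold_db, gain_per_db, max_drive):
    """Rectified, saturating input from one population (arbitrary units)."""
    return min(max_drive, max(0.0, (level_db - threshold_db) * gain_per_db))

def net_response(level_db):
    """Output of the target neuron: excitation minus inhibition, rectified.

    All numerical values below are hypothetical.
    """
    # Low-threshold excitatory population: saturates early.
    excitation = drive(level_db, threshold_db=25.0, gain_per_db=1.0, max_drive=15.0)
    # High-threshold inhibitory population: steeper growth and a higher ceiling.
    inhibition = drive(level_db, threshold_db=40.0, gain_per_db=1.5, max_drive=30.0)
    return max(0.0, excitation - inhibition)

# Sound levels used in Fig. 7 (dB SPL).
for level in (24, 34, 44, 54):
    print(level, net_response(level))
```

In this sketch the response is zero below the excitatory threshold, grows over low amplitudes, and is cancelled at high amplitudes once inhibition exceeds the saturated excitation, which is the non-monotonic pattern described above.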
Fig. 8. Contour plots showing the spatial receptive field of an amplitude-tuned neuron in the left IC of the big brown bat, for two different sound amplitudes as measured at the loudspeaker. (A) Spatial receptive field when loudspeaker output was 10 dB above the neuron's threshold. (B) Spatial receptive field when loudspeaker output was 30 dB above the neuron's threshold. Each contour plot is constructed from a matrix of 320 response values, recorded with the loudspeaker in one of 32 horizontal positions and one of 10 vertical positions as described in detail in Grothe et al. (1996). The black dot indicates the location that elicited the maximal response. The surrounding dark area indicates the region lying between the maximal response and 75% of the maximal response, with each subsequent contour indicating a 12.5% decrement in response magnitude. The stimulus was a downward frequency modulated (FM) sweep. Modified from Grothe et al. (1996).
Fig. 8 shows an example of how an amplitude-tuned auditory midbrain neuron responds to a frequency modulated sound of a fixed amplitude as a function of where the sound source is located within the auditory field of the animal, in this case an echolocating bat. The bat's outer ears, or pinnae,
are highly directional, so that a sound source at a fixed distance from the head, emitting a fixed amplitude sound, will provide the highest amplitude at the ear (i.e. the least attenuation) when located about 45° from the midline in the horizontal dimension, and a few degrees below the horizon in the vertical dimension, although this region of peak amplitude varies somewhat with frequency. For a loudspeaker emitting a sound at an amplitude just above the neuron's threshold (Fig. 8A: 10 dB re threshold), the directionality of the pinna creates a clearly defined spatial receptive field, as measured by spike counts evoked when the loudspeaker is located at different points in auditory space relative to the head. For a stimulus 10 dB above threshold, the spatial receptive field of this neuron is centered at about 50° horizontal and -5° vertical. At this amplitude, the signal received at the ear is above the neuron's threshold only in the area at which pinna sensitivity is highest. In contrast, when the sound amplitude emitted by the loudspeaker is raised to 30 dB above threshold (Fig. 8B), the neuron's response is suppressed when the loudspeaker is located in the region of greatest pinna sensitivity, and thus loudest. Because the neuron's response is suppressed at higher sound amplitudes, the greatest spike counts are evoked when the loudspeaker is located within a ring-shaped area surrounding the field of maximal pinna sensitivity. Clearly this neuron could not contribute to a stable representation of auditory space if we assumed that its level of activity was always proportional to the amplitude of the sound at some spatial location that represents the center of its spatial 'receptive field'. Although the spatial location of a sound source is one determinant of this neuron's discharge or lack thereof, it is by no means the only parameter encoded by its activity.
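The shift from a compact receptive field to a ring-shaped one follows directly from combining pinna directionality with amplitude tuning, as the short sketch below illustrates. The best direction (50° horizontal, -5° vertical) is taken from the text; the quadratic attenuation profile and the triangular amplitude-tuning function are hypothetical simplifications.

```python
import math

BEST_AZ, BEST_EL = 50.0, -5.0  # direction of greatest pinna sensitivity (from the text)

def pinna_attenuation_db(az, el):
    """Assumed pinna profile: 0 dB at the best direction, -20 dB at 30 degrees away."""
    dist = math.hypot(az - BEST_AZ, el - BEST_EL)
    return -20.0 * (dist / 30.0) ** 2

def response(level_at_ear_db):
    """Assumed amplitude tuning: zero at or below threshold (0 dB), maximal at
    10 dB above threshold, and suppressed again at 20 dB or more."""
    if level_at_ear_db <= 0.0 or level_at_ear_db >= 20.0:
        return 0.0
    return 1.0 - abs(level_at_ear_db - 10.0) / 10.0

def azimuth_profile(speaker_db, el=BEST_EL):
    """Normalized response versus loudspeaker azimuth at one elevation."""
    return {az: response(speaker_db + pinna_attenuation_db(az, el))
            for az in range(0, 136, 5)}

def best_azimuths(profile):
    """Azimuths at which the response is maximal."""
    top = max(profile.values())
    return [az for az, r in profile.items() if r == top]

print(best_azimuths(azimuth_profile(10.0)))  # loudspeaker 10 dB above threshold
print(best_azimuths(azimuth_profile(30.0)))  # loudspeaker 30 dB above threshold
```

At 10 dB above threshold the maximal response sits at the pinna axis. At 30 dB the axis itself is suppressed and the maxima move to the flanks where the pinna attenuates the sound by about 20 dB; in two dimensions these flanks form the ring described above.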
Other parameters encoded include the amplitude of the sound, its spectral content and its pattern of frequency change over time. For example, this neuron did not respond to pure tones, but responded well to a downward frequency modulation. If we try to derive the 'meaning' of the neuron's activity, it may well be that it is different depending on which of its target neurons we consider, and whether the neuron's output is excitatory or inhibitory. Fig. 9 shows a simple theoretical circuit that demonstrates how the output of an IC neuron like
that shown in Fig. 8 might be interpreted by different 'read-out' neurons to which it sends divergent projections. In the diagram, it is assumed that the IC neuron's output is inhibitory, but the model could work equally well (with different outcomes) if it were excitatory. The IC neuron's axon diverges to two different target regions. One region (e.g. the superior colliculus) is concerned with sound source localization, and the other (e.g. medial geniculate nucleus) is concerned with analyzing the time that elapses between the bat's loud outgoing vocalization and the faint returning echo. The target neuron(s) in either region receive inhibitory input from the IC cell when sound amplitude is low, but none when it is high. The responses of the target neuron in the region concerned with sound localization are determined by interaction of the inhibitory input from the IC neuron with an excitatory input from another source. In the model, all of these neurons have similar spatial receptive fields, located near the region of greatest pinna sensitivity (about 50° contralateral). Background sounds originating in regions of space peripheral to the IC neuron's receptive field would result in low sound amplitude within the receptive field, causing the IC neuron to respond. The resulting activity would cancel the excitatory input to the target neuron in the superior colliculus and prevent it from responding. The same sound source located within the IC neuron's spatial receptive field would likely be loud enough to suppress the IC neuron's activity, allowing the target neuron to be driven by the excitatory input. Thus, the output of the IC neuron provides a sort of 'lateral inhibition', suppressing the response of the target cell to sounds outside the spatial receptive field and enhancing its response to sounds within the spatial receptive field. The specificity of the neuron in Fig.
8 for downward FM signals similar to those used by bats during echolocation suggests that in addition to participating in sound source localization, it may participate in the analysis of echolocation sounds. In the model circuit in Fig. 9, it is assumed that the IC cell is inhibitory and that it responds to low amplitude sounds, but not to loud sounds. When the bat produces its vocalization, which is high amplitude, up to 100 dB SPL, the IC neuron will be unresponsive. The target cell in a brain region specialized for processing echolocation signals would be driven by its excitatory input. The excitatory response is transmitted through a delay line to produce a subthreshold excitatory postsynaptic potential (EPSP) at another cell that acts as a coincidence detector. If a low-amplitude echo returns from an object, the IC neuron will respond and the resulting inhibition will cancel excitation at the target cell, preventing it from eliciting a second EPSP in response to the echo. The neuron that sends an undelayed input to the coincidence detector will be active, however. If the time intervening between the emitted pulse and the returning echo is such that the undelayed EPSP from the 'echo' neuron coincides with the delayed EPSP from the 'pulse' neuron, the coincidence detector will fire, signalling the occurrence of a loud FM sound followed after a specific time interval by a faint FM sound. The inhibitory output of the IC neuron prevents the coincidence detector from responding to a faint sound followed by a loud one.
Fig. 9. Hypothetical and greatly simplified circuit to illustrate two different ways in which the output of an amplitude-tuned, FM selective neuron such as the one in Fig. 8 could be used at subsequent stages of processing. The amplitude-tuned neuron (black) is assumed in this diagram to be inhibitory and to have a spatial receptive field centered around 50° contralateral (RF 50). The inset graph shows its spike output (% maximum) as a function of sound level. The black neuron is one member of a population that projects to a region concerned with localization of sound sources in space so that a motor response can be directed to a region from which a loud sound originates (gray neuron on right). This region receives input from a population of non-amplitude tuned excitatory neurons whose spatial receptive fields are also centered around 50° contralateral, represented by the gray neuron on the left. A relatively loud sound source at 50° contralateral will inhibit the black neuron, allowing the excitation in the gray pathway to bring the output neuron to threshold. The same sound at another location (e.g. 30° contralateral) will be attenuated at the ear. If it is within the range of low amplitudes that cause the black neuron to fire, it will inhibit firing in the gray output neuron. This system provides a sort of lateral inhibition which could sharpen the spatial specificity of the gray neuron. The black neuron is also a member of a population that projects to a neural circuit (white neurons) concerned with analyzing sequences of echolocation signals consisting of a loud emitted vocalization followed after some time interval by a low-amplitude echo. The two intermediate neurons in this circuit receive excitatory input from a non-amplitude tuned, FM selective neuron and send subthreshold inputs to a third neuron which acts as a coincidence detector. The input from the top neuron is delayed relative to that from the bottom neuron. The loud vocalization will suppress input from the black neuron, allowing the top, 'pulse' neuron to fire. If the lower amplitude echo falls within the range of amplitudes that cause the black cell to fire, it will suppress the 'pulse' neuron's activity, allowing only the lower 'echo' neuron (with no time delay) to fire. If the EPSP evoked by the non-delayed pathway coincides in time with that evoked by the time delayed pathway, the coincidence detector will fire, signaling an object at a specific distance that is proportional to the delay in the pathway. In this circuit, the function of the black cell is to prevent the cell that gives rise to the delayed pathway from responding to the echo.
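The pulse-echo branch of the hypothetical circuit in Fig. 9 can be sketched in a few lines. This is an event-based illustration only: the 60 dB boundary between 'faint' and 'loud', the coincidence window and the 6-ms delay are assumptions, and the conversion from delay to target range uses the approximate speed of sound in air (343 m/s), which does not appear in the text.

```python
SPEED_OF_SOUND_M_PER_S = 343.0  # approximate value for air at room temperature

def ic_neuron_fires(level_db, loud_cutoff_db=60.0):
    """Hypothetical amplitude-tuned inhibitory IC cell: active for faint sounds only."""
    return level_db < loud_cutoff_db

def coincidence_times(sounds, delay_ms, window_ms=0.5):
    """Times at which the coincidence detector fires.

    sounds is a list of (time_ms, level_db) pairs for FM sounds.
    """
    # 'Pulse' neuron: responds only when the IC cell is silent (loud sound);
    # its EPSP reaches the detector after the delay line.
    delayed = [t + delay_ms for t, level in sounds if not ic_neuron_fires(level)]
    # 'Echo' neuron: not amplitude tuned, sends an undelayed EPSP for every sound.
    undelayed = [t for t, _ in sounds]
    # The detector fires when an undelayed EPSP meets a delayed one.
    return [t for t in undelayed if any(abs(t - d) <= window_ms for d in delayed)]

def target_range_m(delay_ms):
    """Distance corresponding to a pulse-echo delay (sound travels out and back)."""
    return SPEED_OF_SOUND_M_PER_S * (delay_ms / 1000.0) / 2.0

loud_then_faint = [(0.0, 100.0), (6.0, 40.0)]
faint_then_loud = [(0.0, 40.0), (6.0, 100.0)]
print(coincidence_times(loud_then_faint, delay_ms=6.0))
print(coincidence_times(faint_then_loud, delay_ms=6.0))
print(target_range_m(6.0))
```

In the sketch, a loud sound followed 6 ms later by a faint one drives the detector, whereas the reverse order does not, because the inhibitory IC cell keeps the faint sound out of the delayed pathway.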
Conclusions
The above examples of auditory processing mechanisms illustrate just a few of the many ways in which population codes can operate. Nevertheless, they point to some general principles that apply to most or all population codes, whether they be in the auditory system, other sensory systems, motor systems, or in systems whose function is not so obvious. First, a population code comprises activity in multiple units with different thresholds, sensitivity functions, discharge patterns and latencies; moreover, the different units in the population may be either excitatory or inhibitory. The subthreshold activity of each individual target neuron that receives input from the population, seen as changes in membrane potential or synaptic currents, reflects the net effect of input from the entire population and can be thought of as a read-out of population activity. The target neuron will produce an output only if the timing and nature of the combined input from the population is such that the read-out causes the target neuron to be depolarized strongly enough and quickly enough to reach threshold. Thus, information contained in the read-out may or may not be transmitted to subsequent groups of neurons. Second, the fact that different units respond with different latencies means that the population code for any instantaneous set of information is widely distributed in time. Put another way, the instantaneous pattern of neural activity at any given point in time represents information that has occurred at many different points in time prior to the instant at which the measurement is made. The fact that many neural populations include sets of delay lines means that information from one point in time is constantly being integrated with or compared to information from another point in time.
Delay lines allow the nervous system to perform operations similar to autocorrelation, as originally suggested by Licklider (1951, 1956, 1959) as a mechanism for auditory pitch perception based on phase-locked discharges. Third, because all neural systems operate in a dynamic, ever-changing environment, any neural representation of information must necessarily incorporate a time dimension, both at the input stage and at the output stage. It seems only reasonable to postulate that the 'read-out' of a spatio-temporally distributed population code is itself a spatio-temporally distributed population code. Finally, the significance of an individual neuron's output may be very different depending on which of its targets is considered. As we continue to increase our understanding of how populations of neurons generate, operate on, and redistribute spatio-temporal patterns of information, we will come closer to discovering the true nature of the neural interface between a living organism and its environment.
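The autocorrelation-like operation attributed above to delay lines can be illustrated with a small sketch. It is not Licklider's formulation: the spike train is an idealized, perfectly phase-locked train (one spike every 5 ms, i.e. 200 Hz), and the set of delays and the coincidence window are hypothetical.

```python
def delay_line_autocorrelation(spike_times_ms, delays_ms, window_ms=0.2):
    """Count coincidences between a spike train and delayed copies of itself."""
    counts = {}
    for delay in delays_ms:
        delayed = [t + delay for t in spike_times_ms]
        counts[delay] = sum(
            1 for t in spike_times_ms
            if any(abs(t - d) <= window_ms for d in delayed)
        )
    return counts

# Idealized train phase-locked to a 200-Hz periodicity: one spike every 5 ms.
spikes = [5.0 * k for k in range(20)]
delays = [float(d) for d in range(1, 9)]  # a bank of delay lines, 1 to 8 ms

counts = delay_line_autocorrelation(spikes, delays)
best_delay = max(counts, key=counts.get)
print(best_delay, 1000.0 / best_delay)  # best delay (ms) and implied periodicity (Hz)
```

The delay line whose length matches the period of the input collects the most coincidences, so the identity of the most active coincidence detector carries the periodicity of the stimulus.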
Acknowledgements
Supported by National Institute on Deafness and Other Communication Disorders grants DC00607 and DC00287.
References
Brugge, J.F., Anderson, D.J. and Aitkin, L.M. (1969) Time structure of discharges in single auditory nerve fibers of the squirrel monkey in response to complex periodic sounds. J. Neurophysiol., 32: 386-401.
Casseday, J.H. and Covey, E. (1987) Structural basis for directional hearing. In: W.A. Yost and G. Gourevitch (Eds.), Directional Hearing. Springer-Verlag, New York, pp. 109-145.
Casseday, J.H. and Covey, E. (1992) Frequency tuning properties of neurons in the inferior colliculus of an FM bat. J. Comp. Neurol., 319: 34-50.
Casseday, J.H. and Covey, E. (1995) Mechanisms for analysis of auditory temporal patterns in the brainstem of echolocating bats. In: E. Covey, H.L. Hawkins and R.F. Port (Eds.), Neural Representation of Temporal Patterns. Plenum, New York, pp. 25-52.
Casseday, J.H. and Covey, E. (1996) A neuroethological theory of the operation of the inferior colliculus. Brain Behav. Evol., 47: 311-336.
Casseday, J.H., Ehrlich, D. and Covey, E. (1994) Neural tuning for sound duration: role of inhibitory mechanisms in the inferior colliculus. Science, 264: 847-850.
Casseday, J.H., Ehrlich, D. and Covey, E. (2000) Neural calculations of sound duration: control by excitatory-inhibitory interactions in the inferior colliculus, submitted for publication.
Covey, E. (1993a) Response properties of single units in the dorsal nucleus of the lateral lemniscus and paralemniscal zone of an echolocating bat. J. Neurophysiol., 69: 842-859.
Covey, E. (1993b) The monaural nuclei of the lateral lemniscus: parallel pathways from cochlear nucleus to midbrain. In: M.A. Merchan, J.M. Juiz and D.A. Godfrey (Eds.), The Mammalian Cochlear Nuclei: Organization and Function. Plenum, New York, pp. 321-334.
Covey, E. and Casseday, J.H. (1991) The ventral lateral lemniscus in an echolocating bat: parallel pathways for analyzing temporal features of sound. J. Neurosci., 11: 3456-3470.
Covey, E. and Casseday, J.H. (1995) The lower brainstem auditory pathways. In: A.N. Popper and R.R. Fay (Eds.), Handbook of Auditory Research, Vol. 5: Hearing and Echolocation in Bats. Springer, New York, pp. 235-295.
Covey, E. and Casseday, J.H. (1998) Brainstem circuits for processing time-varying information. In: A.R. Palmer, A. Rees, A.Q. Summerfield and R. Meddis (Eds.), Psychophysical and Physiological Advances in Hearing. Whurr, London, pp. 536-545.
Covey, E. and Casseday, J.H. (1999) Timing in the auditory system of the bat. Annu. Rev. Physiol., 61: 457-476.
Covey, E., Hall, W.C. and Kobler, J.B. (1987) Subcortical connections of the superior colliculus in the mustache bat, Pteronotus parnellii. J. Comp. Neurol., 263: 179-197.
Covey, E., Kauer, J.A. and Casseday, J.H. (1996) Whole-cell patch clamp recording reveals subthreshold sound-evoked postsynaptic currents in the inferior colliculus of awake bats. J. Neurosci., 16: 3009-3018.
Dear, S.P. and Suga, N. (1995) Delay-tuned neurons in the midbrain of the big brown bat. J. Neurophysiol., 73: 1084-1100.
Dear, S.P., Fritz, J., Haresign, T., Ferragamo, M. and Simmons, J.A. (1993) Tonotopic and functional organization in the auditory cortex of the big brown bat, Eptesicus fuscus. J. Neurophysiol., 70: 1988-2009.
De Boer, E. (1976) On the 'residue' and auditory pitch perception. In: W.D. Keidel and W.D. Neff (Eds.), Handbook of Sensory Physiology. Springer Verlag, Berlin, pp. 479-583.
Delgutte, B. (1980) Representation of speech-like sounds in the discharge patterns of auditory nerve fibers. J. Acoust. Soc. Am., 68: 843-857.
Delgutte, B. and Cariani, P. (1992) Coding of fundamental frequency in the auditory nerve: A challenge to rate-place models. In: M.E.H. Schouten (Ed.), The Processing of Speech. Mouton-DeGruyter, Berlin, pp. 37-45.
Ehrlich, D., Covey, E. and Casseday, J.H. (1997) Neural tuning to sound duration in the inferior colliculus of the big brown bat, Eptesicus fuscus. J. Neurophysiol., 77: 2360-2372.
Feng, A.S., Simmons, J.A. and Kick, S.A. (1978) Echo detection and target-ranging neurons in the auditory system of the bat Eptesicus fuscus. Science, 202: 645-648.
Fuzessery, Z.M. (1994) Response selectivity for multiple dimensions of frequency sweeps in the pallid bat inferior colliculus. J. Neurophysiol., 72: 1061-1079.
Grothe, B. (1994) Interaction of excitation and inhibition in processing pure tone and amplitude-modulated stimuli in the medial superior olive of the mustached bat. J. Neurophysiol., 71: 706-721.
Grothe, B., Vater, M., Casseday, J.H. and Covey, E. (1992) Monaural interaction of excitation and inhibition in the medial superior olive of the mustached bat: an adaptation for biosonar. Proc. Natl. Acad. Sci. USA, 89: 5108-5112.
Grothe, B., Covey, E. and Casseday, J.H. (1996) Spatial tuning of neurons in the inferior colliculus of the big brown bat: effects of sound level, stimulus type, and multiple sound sources. J. Comp. Physiol. A, 179: 89-102.
Haplea, S., Covey, E. and Casseday, J.H. (1994) Frequency tuning and response latencies at three levels in the brainstem of the echolocating bat, Eptesicus fuscus. J. Comp. Physiol. A, 174: 671-683.
Heiligenberg, W. (1990) Electrosensory systems in fish. Synapse, 6: 196-206.
Heiligenberg, W. (1991a) The jamming avoidance response of the weakly electric fish, Eigenmannia: computational rules and their neuronal implementation. Semin. Neurosci., 3: 3-18.
Heiligenberg, W. (1991b) Sensory control of behavior in electric fish. Curr. Opin. Neurobiol., 1: 633-637.
Irvine, D.R.F. (1992) Physiology of the auditory brainstem. In: A.N. Popper and R.R. Fay (Eds.), The Mammalian Auditory Pathway: Physiology. Springer-Verlag, New York, pp. 153-231.
Joris, P.X., Smith, P.H. and Yin, T.C.T. (1994) Enhancement of neural synchronization in the anteroventral cochlear nucleus. II. Responses in the tuning curve tail. J. Neurophysiol., 71: 1031-1037.
Kawasaki, M., Rose, G. and Heiligenberg, W. (1988) Temporal hyperacuity in single neurons of electric fish. Nature, 336: 173-176.
Kiang, N.Y.-S. and Moxon, E.C. (1972) Physiological considerations in artificial stimulation of the inner ear. Ann. Otol. Rhinol. Laryngol., 81: 714-730.
Kiang, N.Y.-S., Watanabe, T., Thomas, E.C. and Clark, L.F. (1965) Discharge patterns of single fibers in the cat's auditory nerve. Research Monograph No. 35. MIT Press, Cambridge, MA.
Kiang, N.Y.-S., Sachs, M.B. and Peake, W.T. (1967) Shapes of tuning curves for auditory nerve fibers. J. Acoust. Soc. Am., 42: 1341-1342.
Konishi, M. (1993) Listening with two ears. Sci. Am., 268: 66-73.
Kuwada, S. and Yin, T.C.T. (1987) Physiological studies of directional hearing. In: W.A. Yost and G. Gourevitch (Eds.), Directional Hearing. Springer Verlag, New York, pp. 146-176.
Kuwada, S., Batra, R., Yin, T.C.T., Oliver, D.L., Haberly, L.B. and Stanford, T.R. (1997) Intracellular recordings in response to monaural and binaural stimulation of neurons in the inferior colliculus of the cat. J. Neurosci., 17: 7565-7581.
Licklider, J.C. (1951) A duplex theory of pitch perception. Experientia, 7: 128-134.
Licklider, J.C. (1956) Auditory frequency analysis. In: C. Cherry (Ed.), Information Theory. Butterworth, London, pp. 253-268.
Licklider, J.C. (1959) Three auditory theories. In: S. Koch (Ed.), Psychology: A Study of a Science. Study I. Conceptual and Systematic. McGraw-Hill, New York, pp. 41-144.
Margoliash, D. (1983) Acoustic parameters underlying the responses of song-specific neurons in the white-crowned sparrow. J. Neurosci., 3: 1039-1057.
Margoliash, D. (1986) Preference for autogenous song by auditory neurons in a song system nucleus of the white-crowned sparrow. J. Neurosci., 6: 1643-1661.
Mittmann, D.H. and Wenstrup, J.J. (1995) Combination-sensitive neurons in the inferior colliculus. Hear. Res., 90: 185-191.
Nelson, P.G. and Erulkar, S.D. (1963) Synaptic mechanisms of excitation and inhibition in the central auditory pathway. J. Neurophysiol., 26: 908-923.
Nordmark, J.O. (1978) Frequency and periodicity analysis. In: Handbook of Perception, Vol. IV. Academic Press, New York, pp. 243-282.
Oertel, D. (1991) The role of intrinsic neuronal properties in the encoding of auditory information in the cochlear nuclei. Curr. Opin. Neurobiol., 1: 221-228.
Oertel, D. (1999) The role of timing in the brain stem auditory nuclei of vertebrates. Annu. Rev. Physiol., 61: 497-519.
Olsen, J.F. and Suga, N. (1991a) Combination-sensitive neurons in the medial geniculate body of the mustached bat: encoding of relative velocity information. J. Neurophysiol., 65: 1254-1274.
Olsen, J.F. and Suga, N. (1991b) Combination-sensitive neurons in the medial geniculate body of the mustached bat: encoding of target range information. J. Neurophysiol., 65: 1254-1274.
O'Neill, W.E. and Suga, N. (1982) Encoding of target range information and its representation in the auditory cortex of the mustached bat. J. Neurosci., 2: 17-24.
Palombi, P.S., Backoff, P.M. and Caspary, D.M. (1994) Paired-tone facilitation in dorsal cochlear nucleus neurons: a short-term potentiation model testable in vivo. Hear. Res., 75: 175-183.
Rose, J.E., Brugge, J.F., Anderson, D.J. and Hind, J.E. (1968) Patterns of activity in single auditory nerve fibers in the squirrel monkey. In: Hearing Mechanisms in Vertebrates. Churchill, London, pp. 144-157.
Rose, G.J. (1995) Representation of patterns of signal amplitude in the anuran auditory system and electrosensory system. In: E. Covey, H.L. Hawkins and R.F. Port (Eds.), Neural Representation of Temporal Patterns. Plenum, New York, pp. 1-24.
Rose, G.J. and Call, S.J. (1992) Evidence for the role of dendritic spines in the temporal filtering properties of neurons: the decoding question and beyond. Proc. Natl. Acad. Sci. USA, 89: 9662-9665.
Rose, G.J. and Call, S.J. (1993) Temporal filtering properties of midbrain neurons in an electric fish: Implications for the function of dendritic spines. J. Neurosci., 13: 1178-1189.
Schuller, G., Covey, E. and Casseday, J.H. (1991) Auditory pontine grey: connections and response properties in the horseshoe bat. Eur. J. Neurosci., 3: 648-662.
Srulovicz, P. and Goldstein, J.L. (1983) A central spectrum model: a synthesis of auditory-nerve timing and place cues in monaural communication of frequency spectrum. J. Acoust. Soc. Am., 73: 1266-1276.
Suga, N. (1969) Classification of inferior colliculus neurons of bats in terms of responses to pure tones, FM sounds and noise bursts. J. Physiol., 200: 555-574.
Trussell, L.O. (1999) Synaptic mechanisms for coding timing in auditory neurons. Annu. Rev. Physiol., 61: 477-496.
Wever, E.G. (1949) Theory of Hearing. Wiley, New York.
Wever, E.G. and Bray, C.W. (1937) The perception of low tones and the resonance-volley theory. J. Psychol., 3: 101-114.
M.A.L. Nicolelis (Ed.)
Progress in Brain Research, Vol. 130 © 2001 Elsevier Science B.V. All rights reserved
CHAPTER 14
Population coding in the auditory cortex
Bernhard H. Gaese *
Institut für Biologie II, RWTH Aachen, Kopernikusstrasse 16, D-52074 Aachen, Germany
Introduction
The auditory cortex receives information from the ascending central auditory pathway. The usually complex structured auditory information is fed into many different frequency channels as a result of the spectral analysis in the cochlea. Information from one ear arrives at the cortical level through several different possible pathways interconnecting the nuclei of the auditory brainstem, the auditory thalamus and the auditory cortex, interacting with incoming activity from the other ear. Arriving activity at the cortical level therefore is of dynamic structure and comes through many channels in parallel, always activating not only a few, but a population of neurons. This article will review data on auditory stimulus encoding at the cortical level with an emphasis on population coding, thereby trying to develop ideas on how the distributed encoding of activity is structured and how it might be read out.
*Corresponding author: Bernhard H. Gaese, Institut für Biologie II, RWTH Aachen, Kopernikusstrasse 16, D-52074 Aachen, Germany. Tel.: +49-241-804854; Fax: +49-241-888-8133; E-mail: [email protected]
Representation of stimulus domains in the auditory cortex
There is a long tradition of research on the auditory cortex with an emphasis on single unit recording in anesthetized animals, trying to establish a general picture of auditory information processing. To some extent this approach has been very successful. It provided a picture of the auditory cortex as holding overlaid topographical representations of several independent stimulus domains. These classical data are summarized here with an emphasis on the work in cats and rats, as important contributions to the understanding of population coding in the auditory cortex came from studies in these animals (see below):
(1) Cortical neurons encode sound frequency mainly in V-shaped response areas that are tuned to one 'characteristic frequency' (Brugge and Reale, 1985; Sally and Kelly, 1988). Neurons of similar characteristic frequency are arranged in iso-frequency bands perpendicular to the frequency axis.
(2) Sound intensity is encoded in monotonic or non-monotonic rate-intensity functions (Phillips and Kelly, 1989). However, a strong relationship to the encoding of stimulus onset parameters was found in the cat (Heil, 1997a,b).
(3) Binaural response characteristics are represented in patches or in bands oriented orthogonal to the iso-frequency bands (Middlebrooks et al., 1980; Kelly and Sally, 1988).
(4) Cortical neurons encode the repetition rate of amplitude-modulated stimuli, click trains or frequency-modulated stimuli in the low-frequency range in phase-locked activity (Schreiner and Urbas, 1988; Heil et al., 1992; Gaese and Ostwald, 1995a).
This shows that individual auditory cortical neurons are involved in the processing of several stimulus parameters. Their firing rate is influenced by the dynamics of several parameters and is by no means unambiguous. Thus, reading out auditory information from cortical activity essentially relies on distributed patterns of activity in many neurons. The underlying strategy depends on the amount of information that is signaled by each individual neuron. This aspect will be discussed for the most dominant stimulus parameter, sound frequency, in the following sections.
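The phase-locked encoding of repetition rate mentioned in point (4) is often quantified with the vector strength. The text does not name this measure, so the sketch below is an assumption about how such phase locking could be measured, not the method of the cited studies. The function name `vector_strength` and the example spike times are hypothetical.

```python
import math

def vector_strength(spike_times_s, modulation_hz):
    """Vector strength of spike times relative to a periodic stimulus.

    Each spike is mapped to a phase of the modulation cycle and treated as a
    unit vector. The length of the mean vector is 1 for perfect phase locking
    and close to 0 when spikes are spread evenly over the cycle.
    """
    if not spike_times_s:
        raise ValueError("need at least one spike")
    phases = [2.0 * math.pi * modulation_hz * t for t in spike_times_s]
    mean_cos = sum(math.cos(p) for p in phases) / len(phases)
    mean_sin = sum(math.sin(p) for p in phases) / len(phases)
    return math.hypot(mean_cos, mean_sin)

# Hypothetical example: 10 Hz modulation (100 ms period).
locked = [0.02 + 0.1 * k for k in range(20)]  # one spike per cycle, same phase
spread = [0.025 * k for k in range(40)]       # four evenly spaced phases per cycle
vs_locked = vector_strength(locked, 10.0)
vs_spread = vector_strength(spread, 10.0)
```

In this toy case the perfectly locked train yields a value near 1 and the evenly spread train a value near 0; real cortical responses would fall in between.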
Beyond the 'classical view' of frequency encoding
Until recently, the general view of frequency encoding at the cortical level mainly emphasized simple V-shaped tuning of excitatory response areas at the single cell level. Although the auditory cortex is several synapses away from the sensory epithelium, at least for the rat only simple V-shaped tuning characteristics were described (Sally and Kelly, 1988; Kilgard and Merzenich, 1998). This fits the 'classical view' of frequency encoding in the mammalian cortex. More complex response areas have been described in other species, but only in a few cases (Sutter and Schreiner, 1991; de Ribaupierre, 1997). The underlying response characteristics at the cortical level consist mainly of a phasic ON-response, sometimes followed by a tonic component indicating the duration of the stimulus (Fig. 1). One important point determining this picture seemed to be that the vast majority of investigations were performed under anesthesia. The few data available from awake animals indeed indicate that one could find a higher percentage of 'complex tuned' cells. This was true for the cat (Abeles and Goldstein, 1972) and for monkeys (Pelleg-Toiba and Wollberg, 1989).
It has long been known that anesthesia reduces the general level of excitability and the level of spontaneous activity in the auditory system (Erulkar et al., 1956; Evans and Nelson, 1973; Kuwada et al., 1989; Zurita et al., 1994). For the rat auditory cortex it was shown that there are additional strong influences on the size and shape of excitatory response areas (Gaese and Ostwald, 1997). The majority of neurons lost significant tuning under light anesthesia. Quantification of tuning areas based on statistical evaluation showed that in the remaining neurons the sizes of tuning areas were strongly reduced. Interestingly, minimal threshold was not influenced; the reduction in size therefore resulted in an increased sharpness of tuning (Fig. 2). In the rat auditory cortex we found strong indications of an increased influence of inhibition under anesthesia. Neurons, for example, were more susceptible to anesthesia when they had (in the awake state) low levels of spontaneous activity, high thresholds and small tuning areas. The form of excitatory tuning areas, which is thought to be shaped by inhibitory influences mainly mediated by GABA (Palombi and Caspary, 1996), was strongly influenced by anesthetic agents. The inhibitory influence already active in the awake state was enhanced under anesthesia. As was shown recently, it is not only pentobarbital sodium and chloral hydrate, the drugs used in these studies, that have an effect on GABAergic transmission (Tanelian et al., 1993), but also many volatile anesthetics like halothane (Franks and Lieb, 1994).
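According to the caption of Fig. 2, the statistical quantification of tuning areas relies on Wilcoxon's matched-pairs signed-rank test applied to the first 20 ms of evoked activity in 20 repetitions. The sketch below shows one way such a test could be computed for a single frequency/intensity combination. The text does not state what the onset counts were compared against, so the pre-stimulus baseline, the normal approximation for the P value, the function names and the example spike counts are all assumptions.

```python
import math

def wilcoxon_signed_rank(evoked, baseline):
    """Two-sided Wilcoxon matched-pairs signed-rank test (normal approximation).

    Returns (w_plus, p_value). Zero differences are dropped; tied absolute
    differences receive average ranks and the variance is tie-corrected.
    """
    diffs = [e - b for e, b in zip(evoked, baseline) if e != b]
    n = len(diffs)
    if n == 0:
        return 0.0, 1.0
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        # Extend j over the run of equal absolute differences.
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        t = j - i + 1
        tie_term += t ** 3 - t
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mean = n * (n + 1) / 4.0
    var = n * (n + 1) * (2 * n + 1) / 24.0 - tie_term / 48.0
    z = (w_plus - mean) / math.sqrt(var)
    return w_plus, math.erfc(abs(z) / math.sqrt(2.0))

def significance_level(p):
    """Map a P value to the three dot sizes of Fig. 2 (0 = not significant)."""
    for level, threshold in ((3, 0.001), (2, 0.01), (1, 0.05)):
        if p < threshold:
            return level
    return 0

# Hypothetical spike counts in 20 repetitions (assumed: 20 ms windows).
onset_counts = [3, 2, 4, 1, 0, 2, 3, 5, 2, 1, 0, 3, 2, 4, 1, 2, 3, 0, 2, 1]
baseline_counts = [0, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 1, 0]
w_plus, p_value = wilcoxon_signed_rank(onset_counts, baseline_counts)
level = significance_level(p_value)
```

Repeating this for every frequency/intensity combination would give a grid of significance levels comparable in spirit to the dot plots of Fig. 2. For only 20 repetitions an exact test may be preferable to the normal approximation used here.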
Fig. 1. Schematic drawing indicating the 'classical view' of frequency encoding in the rat auditory cortex (axes: frequency and sound pressure level; the characteristic frequency, CF, and the minimal threshold are marked). The excitatory response area shows typical V-shaped tuning to only one frequency range. The frequency at minimal threshold is the 'characteristic frequency'. The insert shows schematic response characteristics elicited by a pure tone lying well inside the response area. They are either phasic (with an ON-response only) or phasic-tonic (shown here).
Distributed encoding of sound frequency
The distributed encoding of stimulus and movement parameters has now been shown for different sensory modalities and the motor system (Georgopoulos, 1994; Nicolelis, 1997). Activity in the awake rat auditory cortex shows several characteristics that strongly suggest the distributed encoding of important stimulus dimensions as well. One of these parameters is sound frequency, which serves also as the dominating organizational principle along the whole central auditory pathway including auditory cortex (Merzenich and Schreiner, 1992). Several aspects of
neuronal firing characteristics indicate that different spectral components are not only analyzed by a few neighboring neurons of similar characteristic frequency (CF). One of these indications is the high trial-to-trial variability in stimulus-evoked activity that is very obvious in the awake rat (Gaese and Ostwald, 1996). Even for stimulation well inside the response area, the typical phasic ON-response does not occur at every stimulus repetition in spite of random stimulation procedures (Fig. 3). This cannot be simply explained by slow trends in neuronal excitability (as might occur as a result of changes in vigilance). There is no consistent pattern of lacking or occurring responses along the time course of the experiment, and the temporal patterns elicited by similar stimuli are mainly different. Response components occurring after the initial peak are even more unreliable. The dependence of trial-to-trial variability on the depth of anesthesia was shown recently (Kisley and Gerstein, 1999). The differences in variability of responses under awake and anesthetized conditions fit the general pattern indicated there. A similarly high degree of trial-to-trial variability in single neuron responses was found in the cat auditory cortex (Furukawa et al., 2000).
Fig. 2. The effect of anesthesia on the response areas of auditory cortical neurons in the rat (left panel: awake; right panel: anesthesia; abscissa: frequency [kHz]). Under light anesthesia the response area shows increased tuning sharpness (right). Each neuron was tested with pure tones of 4 different intensities and 22 different frequencies (0.25 octave spacing) covering the whole hearing range of the rat. Stimuli were presented in randomized order. Evaluation of response areas was based on the phasic onset activity (first 20 ms of stimulus-evoked activity) in 20 repetitions of each frequency/intensity combination using non-parametric statistics (Wilcoxon's MPSR-test). Significance of the excitatory response is coded in dot size (P < 0.05, P < 0.01, P < 0.001) with large dots indicating highly significant excitation. Dashes indicate non-significant changes. Underlying activity is partially shown in Fig. 3.
A second indication is that most neurons can be excited by spectral energy from a broad range of frequencies at medium sound pressure levels. Recordings from the awake rat auditory cortex revealed a much more complicated pattern of frequency encoding compared to the previously prevailing view of simple tuning characteristics. Neurons in the primary auditory cortex showed response areas that very often included a part that strongly resembled the typical V-shaped tuning. They could, however, be rather broad in several cases. Sometimes the neurons responded to stimuli in one or two additional frequency ranges as well (Fig. 4). These parts of the response area could have V-shaped tuning as well or were totally separated from the rest. Response areas, finally, could show intriguing local discontinuities even for neighboring frequency/intensity combinations. Highly significant responses were often found right beside non-significant responses or even responses in the opposite direction.
Frequency response areas are usually determined, as shown above, using pure tone stimuli. Naturally occurring signals, however, are complex in their spectral and temporal structure. One step towards more dynamic and complex stimuli would be to use narrow-band noise to test neuronal response behavior. We compared frequency response areas, as determined using pure tones, to response areas determined using narrow-band noise. The two stimulus types do not differ much in their spectral components, but their psychophysical quality is rather different. As the auditory cortex might very well participate in
high-level processes of pattern recognition, one could expect differences in the neuronal representation of these two different stimulus types that would shed some light on the underlying mechanisms of information processing. Contrary to these predictions, tuning to pure tones and narrow-band noise was rather comparable. This was true for small and large response areas. CFs of frequency response areas as determined by the two types of stimuli were usually the same. Response areas to noise band stimuli were again of complex structure showing local discontinuities. Neurons, however, seemed to be more sensitive to noise band stimulation in general than to pure tones. In spite of that, the sizes of the response areas for a given neuron, as determined by the two different stimulus types, correlated rather well. Thus, at this level of auditory processing, the spectral content rather than the psychophysical quality was the prevalent parameter for neuronal activity (Gaese and Ostwald, 1996).
Fig. 3. Responses of a neuron in the awake rat auditory cortex as an example of the typically high degree of variability in the activity (abscissa: frequency [kHz]). Dot-displays (20 trials) and peristimulus-time histograms (binwidth 2 ms, 150 ms duration) are shown for stimulation with pure tones of 28 different frequency/intensity combinations. Stimulus onset is indicated by a vertical line in the dot-displays; stimulus duration is 100 ms. Response characteristics and the degree of trial-to-trial variation change with stimulus parameters. Depicted is only a part (19.0 to 53.8 kHz) of the frequency range tested. Statistical evaluation of this activity is shown in Fig. 2 (awake condition, left).
Fig. 4. Response areas of neurons in awake rat primary auditory cortex showing activity in more than one frequency range (panels a and b; abscissa: frequency [kHz]). Evaluation of responses to pure tones of 22 different frequencies and 4 different intensities as described for Fig. 2. Example (b) shows a distinct response area in the high-frequency range, separated from the responses to lower frequencies, which occur only at high stimulus intensities.
Data from studies investigating cortical activity using modern imaging techniques are consistent with the data on distributed encoding of sound frequency at the single neuron level as presented so far. Investigating the rat auditory cortex using optical imaging and stimulation at suprathreshold levels again revealed a complicated picture of frequency representation (Bakin et al., 1996). The representation of sound frequency was more widespread over the cortical surface than expected from the proposed schema of aligned isofrequency lines (or patches). In addition, representations of very different frequencies could be widely overlapping, suggesting that neuronal populations involved in the processing of different sound frequencies might not be spatially segregated.
Spectrotemporal dynamics of frequency tuning
One aspect strongly neglected by the 'classical view' of frequency encoding is the fact that responses usually change over time, indicating that the representation of sound frequency is a dynamic process. Although already investigated in earlier studies (e.g. Abeles and Goldstein, 1972; Aertsen and Johannesma, 1981), spectrotemporal dynamics of activity in the auditory cortex during stimulus presentation have only recently gained more interest. Although the analysis so far was mainly based on onset activity, it is not clear which temporal window in the response is important for the processing of static stimuli. Dynamic stimuli might be specifically analyzed by neurons whose response areas change in frequency and intensity at a rate that closely matches the changing stimulus parameters. Distributed information, on the other hand, also shows spectrotemporal dynamics (Fukunishi et al., 1992; Uno et al., 1993), revealing a dynamic organization of stimulus processing at the population level.
A major portion of the neurons in the awake rat auditory cortex show spectrotemporal dynamics in their responses with lasting excitatory and/or inhibitory response components (Fig. 5). Although we evaluated responses by integrating over 20 ms time bins because of response variability, a systematic and mostly continuous change of response components was obvious (Gaese and Ostwald, 1996). After the initial response the response areas were mostly reduced and stayed in the frequency range around the initial CF, which would roughly correspond to a tonic response characteristic. In a few cases, a part of the response area stayed active which was separated from the frequency range around CF. Late inhibitory components were not as common as excitatory ones and occurred mainly after an initial excitatory response. These inhibitory components mostly lasted longer and covered a larger area than long-lasting excitation. They occurred more often on the high-frequency than on the low-frequency side of the neuron's CF. Only a few cases showed both excitatory and inhibitory components after the ON-response.
Spectrotemporal dynamics of response areas can also be determined very efficiently using the reverse correlation technique. This was successfully applied to the auditory cortex of awake owl monkeys. Response areas, again, were of complex structure, often including excitatory and inhibitory subfields that changed over time (deCharms et al., 1998). The importance for auditory processing was tested by creating stimuli that followed exactly the spectrotemporal changes. These stimuli proved to excite the specific neuron much more strongly than simpler stimuli such as pure tones or noise bands without changes in frequency over time. This, again, confirms the importance of the specific structure of a response area for spectrotemporal analysis.
Fig. 5. Example of a rat auditory cortical neuron showing spectrotemporal dynamics of frequency response areas. Stimulus-induced activity was analyzed in five time windows for the whole stimulus duration (0-20 ms, 20-40 ms, etc.). Numbers to the left indicate the start of the window relative to stimulus onset. Each plot comprises the statistical evaluation of stimulus-induced activity in a 20 ms time window as described in Fig. 2. The first plot in this figure is comparable to the plots in Figs. 2 and 4. The level of significance of stimulus-induced excitation (filled circles) or inhibition (open circles) is coded in dot size with highly significant responses marked as larger dots. The examples show initial excitation in a broad frequency range and later an inhibitory response component that slowly gets smaller.
Distributed encoding of sound localization
The functional importance of the auditory cortex for spectral analysis is still unclear in several respects. The importance of the auditory cortex for sound localization performance, however, was shown in several studies (e.g. Jenkins and Masterton, 1982). Still, auditory cortical neurons do not, as one might expect, encode sound-source location in sharply tuned spatial receptive fields. Instead, auditory cortical neurons (mostly studied in the cat) are broadly tuned to sound azimuth (Middlebrooks and Pettigrew, 1981; Brugge et al., 1996). Tuning to sound source elevation is broad as well (Xu et al., 1998). Accordingly, studies have failed to demonstrate a topographical representation of auditory space in the cortex (Imig et al., 1990; Brugge et al., 1996; Middlebrooks et al., 1998). As spike patterns of many neurons can carry information about sound-source locations throughout 360° of space, a distributed 'panoramic' code was proposed, in which information about any point in auditory space is distributed across large populations of broadly tuned neurons (Middlebrooks et al., 1994, 1998). Further investigations used an artificial neural network algorithm to infer the location of sound sources from spike patterns of several neurons simultaneously. They revealed that location signaling by neuronal ensembles of moderate size (more than 100 neurons) approached the level of accuracy exhibited in behavioral tasks (Furukawa et al., 2000).
The view of the encoding of sound-source location at subcortical levels is usually dominated by the picture found in the cat and monkey superior colliculus, where neurons exhibit mainly sharp spatial tuning at low stimulus intensities and are arranged in a map of auditory space (King, 1993). This suggests that only a low degree of interaction at the population level is necessary to explain behavioral localization performance. A comparison of data from different nuclei along the ascending auditory pathway, however, hints at a distributed encoding of relevant parameters for sound localization and the involvement of larger populations of neurons in retrieving specific locations of sound sources. This was shown by comparing the encoding of interaural time differences (ITD), one of the important stimulus parameters for sound localization, at different levels along the auditory pathway. Neuronal encoding of ITD in the superior olivary complex, the inferior colliculus, and the medial geniculate body is broad and can only explain the behavioral performance under the assumption of strong interactions (Fitzpatrick et al., 1997). Recent data indicate that at least in the rat superior colliculus the broad spatial tuning of cells suggests a more distributed encoding as well (Gaese and Johnen, 2000).
Interactions among neurons
Of basic importance for distributed encoding of information are mechanisms for the flow of activity among neurons in a region of the brain and the possibilities for (dynamic) functional interactions between groups of neurons. The auditory system relies on the concurrent activation of large populations of neurons to process the variety of stimulus parameters that contribute to normal auditory perception. Single auditory entities are, as pointed out above, each represented in the activity of a subset of these neurons, best described by the concept of the 'neural assembly' (Hebb, 1949). The formation and dynamic structure of these assemblies are based on correlated firing of participating neurons.
We observed interactions in the temporal firing patterns of neurons in the awake auditory cortex which indicate the existence of structured neuronal assemblies. In line with earlier studies (Dickson and Gerstein, 1974; Frostig et al., 1983; Eggermont, 1992) we found a high degree of correlated firing among auditory cortical neurons (significant correlations in more than half of the pairs of neurons investigated). The majority of those were temporally precise, indicating more or less direct synaptic interactions. The degree of correlation diminished with increasing distance between neurons. Pairs of neurons with similar response characteristics, however, showed correlated activity even over large distances. This pattern holds true for similar characteristic frequency and, interestingly, also for similar size of the response area (Gaese and Ostwald, 1995b).
A direct functional importance of correlated firing patterns was found for stimulus-induced correlations in the auditory cortex (Espinosa and Gerstein, 1988; Eggermont, 1994). Relevant information was signaled in many cases by average firing rate as well as correlated activity. Encoding in correlated firing patterns without signaling by the average firing rate was shown for pure tones in general (deCharms and Merzenich, 1996) and specifically for the movement of auditory stimuli (Ahissar et al., 1992a). Patterns of correlated activity could be expressed in the auditory cortex of rats even in a task-specific manner. In animals involved in a Go/NoGo-task, patterns of correlated activity could predict the occurrence of specific behavioral responses with high reliability (Villa et al., 1999a).
The dynamics of stimulus-related activity in the auditory system, and especially the induction of spatiotemporal activity patterns over tens or even hundreds of milliseconds, as seen e.g. in the rat auditory cortex (Villa et al., 1999a), is perhaps strongly related to the modulation of ascending information by the well developed descending auditory pathways (Villa et al., 1999b). Direct connections go not only from the auditory cortex down to the auditory thalamus and midbrain, but also to the earliest stages of the ascending auditory pathway, the cochlear nucleus and the superior olive, as shown in the rat (Weedman and Ryugo, 1996).
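The correlated firing between pairs of neurons discussed in this section is commonly examined with a cross-correlogram, a histogram of spike-time differences between two simultaneously recorded units. The text does not specify which analysis the cited studies used, so the sketch below is only an assumed illustration. The function name, lag range, bin width and spike times are hypothetical.

```python
def cross_correlogram(ref_spikes_ms, target_spikes_ms, max_lag_ms=50.0, bin_ms=1.0):
    """Histogram of time differences (target minus reference) within +/- max_lag_ms.

    Returns (lags, counts), where lags holds the bin centres in ms. A narrow
    peak near zero lag points to temporally precise correlated firing.
    """
    n_bins = int(round(2.0 * max_lag_ms / bin_ms))
    counts = [0] * n_bins
    for t_ref in ref_spikes_ms:
        for t_tar in target_spikes_ms:
            lag = t_tar - t_ref
            if -max_lag_ms <= lag < max_lag_ms:
                index = min(int((lag + max_lag_ms) // bin_ms), n_bins - 1)
                counts[index] += 1
    lags = [-max_lag_ms + (k + 0.5) * bin_ms for k in range(n_bins)]
    return lags, counts

# Hypothetical pair of units: unit B fires 2 ms after every spike of unit A.
unit_a = [10.0, 50.0, 120.0, 300.0]
unit_b = [t + 2.0 for t in unit_a]
lags, counts = cross_correlogram(unit_a, unit_b)
peak_lag = lags[counts.index(max(counts))]
```

In practice the raw correlogram would also need a correction for stimulus-locked rate changes (for example a shift predictor) before a peak is interpreted as a neuronal interaction. That step is omitted here.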
Conclusions
The results reviewed here suggest that the classical tonotopic maps, consisting of neurons sharply tuned to a narrow frequency range and observed at all levels of the rat auditory system, provide a very limited description of the interactions between neuronal populations that underlie the function of this system. Investigations describing the distributed encoding of important stimulus parameters in other animals (e.g. the encoding of sound-source azimuth in the cat auditory cortex) strongly support this view. Quantitative reconstruction of receptive fields and neuronal population responses demonstrates the existence of dynamic and distributed representations of auditory information. The dynamic nature of stimulus encoding was evidenced by the existence of spectrotemporal receptive fields. Finally, simultaneous recording from pairs of units revealed that neuronal interactions can be observed, suggesting that neural assemblies across the auditory system might be essential for the (often simultaneous) representation of auditory entities. This suggests the following directions for further research on auditory information processing.
(1) The application of modern recording techniques would greatly improve the understanding of auditory processing. Recording from only a few single units at once is still the most widely used approach to investigate the encoding of important stimulus properties. With the advances in recording techniques and data analysis it is no longer necessarily the case that "The scientist studying single units has been compared to a collector of beautiful butterflies. Attention is focussed on the colorful, dramatic, or unusual" (cited from Goldstein and Abeles, 1975). Recently developed recording techniques allow for the recording from many neurons distributed over a specific brain area (Nicolelis et al., 1997). Datasets including simultaneously recorded activity from a large number of single units in the auditory cortex are still lacking.
(2) One of the important questions is: how are neuronal populations in the auditory cortex read out? One possibility would be the formation of neuronal assemblies of more global structure, maybe based on synchronous oscillatory activity, as was suggested for the visual system (see Freiwald et al., 2001). High-frequency oscillating potentials in the gamma range, possibly underlying such assembly formation, were found in the rat auditory cortex (Barth and Macdonald, 1996). However, one might argue that the modulation of neuronal activity in this frequency range could easily interfere with stimulus-related modulations (e.g. elicited by amplitude or frequency modulations). Therefore the importance of such oscillatory activity remains to be shown.
(3) Further advances in the understanding of information processing in the auditory cortex might come from studies using a neuroethological approach, i.e. combining multiple single unit recording and behavioral analysis. A strong dependence of auditory cortical activity on behavioral relevance was already found using several different approaches (Ahissar et al., 1992b; Sakurai, 1994, 1996).
(4) The detailed structure and dynamics of spectrotemporal receptive fields of auditory cortical neurons are of great interest for further investigation of the population encoding of relevant stimulus parameters. The reverse-correlation technique is very well suited for that and deserves further application (deCharms et al., 1998). Combined with virtual acoustic stimulation it provides an efficient way to also investigate spatiotemporal receptive fields (Jenison et al., 1999).
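In its simplest form, the reverse-correlation technique named in point (4) can be illustrated as a spike-triggered average: the stimulus segments preceding each spike are averaged to estimate a spectrotemporal receptive field. The stimulus design and normalization used in the cited studies are not described here, so the sketch below rests on assumptions. These are a stimulus discretized into frequency channels and time frames, and spikes given as frame indices. All names and the toy data are hypothetical.

```python
def spike_triggered_average(stimulus, spike_frames, n_lags):
    """Estimate a spectrotemporal receptive field by spike-triggered averaging.

    stimulus[t][c] is the amplitude of frequency channel c in time frame t.
    Returns sta[lag][c], the mean stimulus `lag` frames before a spike.
    Spikes without a complete stimulus history are skipped.
    """
    n_channels = len(stimulus[0])
    sums = [[0.0] * n_channels for _ in range(n_lags)]
    n_used = 0
    for t in spike_frames:
        if t < n_lags - 1 or t >= len(stimulus):
            continue
        n_used += 1
        for lag in range(n_lags):
            frame = stimulus[t - lag]
            for c in range(n_channels):
                sums[lag][c] += frame[c]
    if n_used == 0:
        raise ValueError("no spikes with a complete stimulus history")
    return [[s / n_used for s in row] for row in sums]

# Hypothetical toy stimulus: 3 frequency channels, 40 time frames.
stimulus = [[1.0 if t % (c + 4) == 0 else 0.0 for c in range(3)] for t in range(40)]
# Toy neuron that fires two frames after channel 1 was active.
spike_frames = [t for t in range(2, 40) if stimulus[t - 2][1] == 1.0]
sta = spike_triggered_average(stimulus, spike_frames, n_lags=4)
```

For this toy neuron the average for channel 1 peaks at a lag of two frames, which is the latency built into the example. With real recordings a random or noise-like stimulus and many spikes would be needed for a meaningful estimate.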
Acknowledgements
The author is most grateful to Jo Ostwald for continuous support and to Katrin Böhning-Gaese and Harald Luksch for critical comments on earlier versions of the manuscript. This work was supported by the DFG (SFB 307, Tübingen and SPP 1001 'Sensomotorische Integration').
References
Abeles, M. and Goldstein, M.H. (1972) Responses of single units in the primary auditory cortex of the cat to tones and tone pairs. Brain Res., 42: 337-352.
Aertsen, A.M.H.J. and Johannesma, P.I.M. (1981) The spectro-temporal receptive field - a functional characteristic of auditory neurons. Biol. Cybern., 42: 133-143.
Ahissar, M., Ahissar, E., Bergman, H. and Vaadia, E. (1992a) Encoding of sound-source location and movement: activity of single neurons and interactions between adjacent neurons in monkey auditory cortex. J. Neurophysiol., 67: 203-215.
Ahissar, E., Vaadia, E., Ahissar, M., Bergman, H., Arieli, A. and Abeles, M. (1992b) Dependence of cortical plasticity on correlated activity of single neurons and on behavioral context. Science, 257: 1412-1415.
Bakin, J.S., Kwon, M.C., Masino, S.A., Weinberger, N.M. and Frostig, R.D. (1996) Suprathreshold auditory cortex activation visualized by intrinsic signal optical imaging. Cereb. Cortex, 6: 120-130.
Barth, D.S. and Macdonald, K.D. (1996) Thalamic modulation of high-frequency oscillating potentials in auditory cortex. Nature, 383: 78-81.
Brugge, J.F. and Reale, R.A. (1985) Auditory cortex. In: A. Peters and G. Jones (Eds.), Cerebral Cortex, 4. Associative and Auditory Cortex. Plenum, New York, pp. 229-271.
Brugge, J.F., Reale, R.A. and Hind, J.E. (1996) The structure of spatial receptive fields of neurons in primary auditory cortex of the cat. J. Neurosci., 16: 4420-4437.
deCharms, R.C. and Merzenich, M.M. (1996) Primary cortical representation of sounds by the coordination of action-potential timing. Nature, 381: 610-613.
deCharms, R.C., Blake, D.T. and Merzenich, M.M. (1998) Optimizing sound features for cortical neurons. Science, 280: 1439-1443.
de Ribaupierre, F. (1997) Acoustical information processing in the auditory thalamus and cerebral cortex. In: G. Ehret and R. Romand (Eds.), The Central Auditory System. Oxford University Press, New York, 1st ed., pp. 317-397.
Dickson, J.W. and Gerstein, G.L. (1974) Interactions between neurons in auditory cortex of the cat. J. Neurophysiol., 37: 1239-1261.
Eggermont, J.J. (1992) Neural interaction in cat primary auditory cortex: dependence on recording depth, electrode separation, and age. J. Neurophysiol., 68: 1216-1228.
Eggermont, J.J. (1994) Neural interaction in cat primary auditory cortex, 2. Effects of sound stimulation. J. Neurophysiol., 71: 246-270.
Erulkar, S.D., Rose, J.E. and Davies, P.W. (1956) Single unit activity in the auditory cortex of the cat. Bull. Johns Hopkins Hosp., 99: 55-86.
Espinosa, I.E. and Gerstein, G.L. (1988) Cortical auditory neuron interactions during presentation of 3-tone sequences: effective connectivity. Brain Res., 450: 39-50.
Evans, E.F. and Nelson, P.G. (1973) The responses of single neurones in the cochlear nucleus of the cat as a function of their location and anaesthetic state. Exp. Brain Res., 17: 402-427.
Fitzpatrick, D.C., Batra, R., Stanford, T.R. and Kuwada, S. (1997) A neuronal population code for sound localization. Nature, 388: 871-874.
Franks, N.P. and Lieb, W.R. (1994) Molecular and cellular mechanisms of general anaesthesia. Nature, 367: 607-614.
Freiwald, W.A., Kreiter, A.K. and Singer, W. (2001) Synchronization and assembly formation in the visual cortex. In: M.A.L. Nicolelis (Ed.), Advances in Neural Population Coding, Progress in Brain Research, Vol. 130. Elsevier Science, Amsterdam, pp. 111-140.
Frostig, R.D., Gottlieb, Y., Vaadia, E. and Abeles, M. (1983) The effects of stimuli on the activity and functional connectivity of local neuronal groups in the cat auditory cortex. Brain Res., 272: 211-222.
Fukunishi, K., Murai, N. and Uno, H. (1992) Dynamic characteristics of the auditory cortex of guinea pigs observed with multichannel optical recording. Biol. Cybern., 67: 501-509.
Furukawa, S., Xu, L. and Middlebrooks, J.C. (2000) Coding of sound-source location by ensembles of cortical neurons. J. Neurosci., 20: 1216-1228.
Gaese, B.H. and Johnen, A. (2000) Coding for auditory space in the superior colliculus of the rat. Eur. J. Neurosci., 12: 1739-1752.
Gaese, B.H. and Ostwald, J. (1995a) Temporal coding of amplitude and frequency modulation in the rat auditory cortex. Eur. J. Neurosci., 7: 438-450.
Gaese, B.H. and Ostwald, J. (1995b) Temporal interactions in the activity of single cells in the rat auditory cortex. In: N. Elsner and R. Menzel (Eds.), Learning and Memory. Proceedings of the 23rd Goettingen Neurobiology Conference, 1995. Thieme, Stuttgart, p. 157.
Gaese, B.H. and Ostwald, J. (1996) Tuning characteristics in the auditory cortex of awake rats. In: N. Elsner and H.U. Schnitzler (Eds.), Brain and Evolution. Proceedings of the 24th Goettingen Neurobiology Conference, 1996. Thieme, Stuttgart, p. 228.
Gaese, B.H. and Ostwald, J. (1997) Effects of anesthesia on stimulus representation and processing in the rat auditory cortex. Soc. Neurosci. Abstr., 23: 2070.
Georgopoulos, A.P. (1994) New concepts in generation of movement. Neuron, 13: 257-268.
Goldstein, M.H. and Abeles, M. (1975) Single unit activity of the auditory cortex. In: W.D. Keidel and W.D. Neff (Eds.), Handbook of Sensory Physiology. Springer, Berlin, pp. 199-218.
Hebb, D.O. (1949) The Organization of Behavior: A Neuropsychological Theory. Wiley, New York.
Heil, P. (1997a) Auditory cortical onset responses revisited, 1. First-spike timing. J. Neurophysiol., 77: 2616-2641.
Heil, P. (1997b) Auditory cortical onset responses revisited, 2. Response strength. J. Neurophysiol., 77: 2642-2660.
Heil, P., Rajan, R.
and /rvine, D.R.E (1992) Sensitivity of neurons in cat primary auditory cortex to tones and frequency-modulated stimuli, I. Effects of variation of stimulus parameters. Hear. Res., 63: 108-134. Imig, T.J., Irons, W.A. and Samson, F.R. (1990) Single-unit selectivity to azimuthal direction and sound pressure level of noise bursts in cat high-frequency primary auditory cortex. J. Neurophysiol.. 63: 1448-1466. Jenison, R.L., Schnupp, J.W.. Reale. R.A. and Brugge, J.F. (1999) Auditory space-time receptive fields derived by white noise analysis. Soc. Neurosci. Abstr., 25: 395. Jenkins, W.M. and Masterton, R.B. (1982) Sound localization: effects of unilateral lesions in central auditory system. J. Neurophysiol.. 47: 987-1017. Kelly, J.B. and Sally, S.L. (1988) Organization of auditory cortex in the albino rat: binaural response properties. J. Neurophysiol., 59: 1756-1769, Kilgard, M.P. and Merzenich. M.M. (1998) Cortical map reor-
230 gauization enabled by nucleus basalis activity. Science, 279: 1714-t7t8. King, A J= (1993) A map of auditory space in the mammalian brain - neural computation and development. Exp. Physiol., 75:: 559-590. Kisley, M.A. and Gers~ein, G.L. (1999) Trial-to-trial variability arxd state-dependent modulation of auditory-evoked responses in Cortex. Z Neurosci.. 19: 10451-10460. Kuwada, S., Batra, R. and Stanford, T.R. (1989) Monaural and Nnaural response properties of neurons in the inferior collicuIns of the rabbit: effects of sodium pentobarbital. J. Neuropt~ysiol., 61: 269-282 Mefzenich. M.M. and Schreiner, C.E, (1992) Mammalian auditory cortex - - some comparative observations. In: D.B. Webster, :R.R. Fay and A,N. Popper (Eds.), The Evolutionary Biology of Hearing Springer, Berlin, pp. 673-689. Middlebrooks. J.C. and i~ettigrew, J.D. (1981) Functional classes of neurons in primary auditory cortex of the cat distinguished by sensitivity to sound location. J. Neurosci., 1: 107-120. Mi~lebrooks, J.C., Clock, A.E., Xn, L, and Green, D.M. (1994) A panoramic code for sound location by cortical neurons. Science. 264: 842-844. Middlebrooks, J.C., Xu, L., Eddins, A.C. and Green, D.M. (I998) Codes for sound-source location in nontonotopic auditory cortex. J. Neurophysiol.. 80:863-881 Middlebrooks, J.D., Dykes, R.W. and Merzenich, M.M. (1980) Binata-al response-specific bands in primary cortex (A1) of the cat topographical organization orthogonal to isofrequency Contours. Brain Res., 181: 31-48. Nicoletis. M.A. (19977 Dynamic and distributed somatosensory representations as the snbstrate for cortical and subcortical plastic}ty. Semin. Neurosci.. 9: 24-33. Nicolelis, M.A.. Gbazanfar, A.A., Faggin, B.M., Votaw. S. and Oliveira. L.M. (1997) Reconstructing the engram: simultaneous. muttisite, many single neuron recordings. Neuron, 18: 529-537. Palombi. RS, and Caspary, D.M. 
(1996) GABA inputs control discharge rate primarily within frequency receptive fields of inferior cdliculus neurons. J. Neurophysiol., 75:2211-2219. Pelteg-Toiba~ R. and Wollberg, Z. (1989) Tuning properties of auditolT cortex cells in the awake squirrel monkey. Exp. Brain Res.. 7~: 353-364. Phillips, D.R and Kelly. J.B. (1989) Coding of tone-puise ampli-
tude by single neurons in auditory cortex of albino rats (Rattu~ norvegicus). Hear. Res.. 37: 269-280. Sakurai, Y. (1994) Involvement of auditory cortical and hippocampal neurons in auditory working memory and reference memory in the rat. J. Neurosci., 14: 2606-262% Sakurai. Y. (1996) Hippocampal and neocortical cell assemblies encode memory processes for different types of stimuli in the rat. J. Neurosci., 16: 2809-2819. Sally, S.L. and Kelly, J.B. (1988) Organization of auditory cortex m the albino rat: sound frequency. J. Neurophysiol.. 59: 16271638. Schreiner. C.E. and Urbas, J.V. (1988) Representation of amplitude modulation in the auditory cortex of the cat, II. Comparison between cortical fields. Hear. Res.. 32: 49-64. Sutter. M.L. and Schreiner, C.E. (1991) Physiology and topography of neurons with mulfipeaked tuning curves in cat primary auditory cortex. J. Neurophysiol.. 65: 1207-1226. Tanelian, D.L., Kosek, R, Mody, I. and McIver, M B. (1993) The role of the GABA A receptor/chloride channel complex in anesthesia, Anesthesiology, 78: 757-776. Uno, H.. Mural, N. and Fukuuishi. K. (1993) The tonotopic representation in the auditory cortex of the guinea pig with optical recording. Neurosci. Lett.. 150: 179-182. Villa, A.E., Tetko, I.V.. Hyland. B. and Najem. A. (1999a) Spatiotemporal activity patterns of rat cortical neurons predict responses in a conditioned task. Proc. Natl. Aead. Sci. USA. 96: 1106-1111. Villa. A.E., Tetko, I.V., Dutoit. R, Deribanpierre. Y. and Deribaupierre, F. (1999b) Corticofugal modulation of functional connectivity within the auditory thalamus of rat, guinea pig and cat revealed by cooling deactivation. J. Neurosci. Methods. 86: 161-178. Weedman, D.L. and Ryugo. D.K. (1996) Pyramidal cells in primary auditory cortex project to cochlear nucleus in rat. Brain Res.. 706: 97-102. Xu. L.. Furukawa, S. and Middlebrooks. J C (1998) Sensitivity to sound-source elevation in nontonotopic auditory cortex. J. NeurophysioL, 80: 882-894. Zurita. 
P.. Villa. A.E.P., Deribaupierre. Y.. Deribanpierre, F. and Rouiller. E.M. (1994) Changes of single unit activity in the cat's auditory thalamus and cortex associated to different anesthetic conditions. Neurosci. Res. 19: 303-316.
M.A.L. Nicolelis (Ed.)
Progress in Brain Research, Vol. 130 © 2001 Elsevier Science B.V. All rights reserved
CHAPTER 15
Representations based on neuronal interactions in motor cortex
Nicholas G. Hatsopoulos 1,*, Matthew T. Harrison 2 and John P. Donoghue 1
1 Department of Neuroscience, Box 1953, Brown University, Providence, RI 02912, USA
2 Division of Applied Mathematics, Brown University, Providence, RI 02912, USA
Introduction
The brain is remarkably adept at constructing complex sensory and motor representations to perceive and act upon the outside world. The neural basis of such representations has been both elusive and highly debated. The concept that representations of complex objects are formed by a single most sensitive neuron (i.e. the lower envelope principle) (Barlow, 1972) or by a so-called 'grandmother cell' which, by itself, mediates an entire representation has become less tenable, although there remain strong proponents of the idea (Parker and Newsome, 1998). One serious problem with such representational schemes is that there is a nearly infinite variety of complex representations that can be learned and, therefore, there are simply not enough neurons for them to act as unique encoders. Neural representations undoubtedly recruit groups of neurons, but this observation alone does not clarify how neuron populations form representations. In visual cortex, experimental evidence accumulated over the past thirty years suggests that separate clusters of neurons represent specific features of a complex visual scene such as color, form, texture, and motion.
* Corresponding author. Nicholas G. Hatsopoulos, Department of Neuroscience, Box 1953, Brown University, Providence, RI 02912, USA. Tel.: +1-401-863-1874; E-mail:
[email protected]
Thus, the neural representation of a complex scene would be composed of the activations of multiple groups of neurons, each of which would represent a simple visual feature. This would solve the problem of representing a nearly infinite number of possible objects that might occur in the world because, according to this scenario, each cluster of neurons represents a visual primitive which is used and reused in different combinations with other clusters to represent any complex object (Bienenstock and Geman, 1995). While less clear than in sensory systems, complex movements appear to be composed of elementary components whose neural representations are assembled together in some manner to form a global representation of a desired action. In the motor cortex, cells coding for basic motor components, such as direction, amplitude or force are readily encountered. Within motor cortex even the simplest of motor behaviors such as repetitive movements of a single finger involves a distributed group of cells (Schieber and Hibbard, 1993; Sanes et al., 1995; Kakei et al., 1999). The existence of a global motor representation can be demonstrated by writing one's signature using the fingers, the wrist, the whole arm, the head or the foot. In each case the general form of the signature is the same (though the quality varies), despite formidable differences in the effector used to create that signature. For such a movement to be made, some general plan of action must be processed into the muscle language that each of these mechanically
quite different body parts speaks. The plan of action we term a motor representation.
Although the idea that complex representations are built up from simpler representations is very appealing, von der Malsburg was one of the first to point out a fundamental problem with such a scheme (von der Malsburg, 1981). If multiple objects needed to be represented at the same time, there would be no way to unambiguously distinguish the feature representations comprising one object from those belonging to the other object. Consider the condition where multiple color-selective cells are active when looking at several different colored objects. How does the system know how to ascribe the correct color to the correct object? The 'superposition catastrophe', as it is called, could be solved, according to von der Malsburg, if there was some mechanism by which neurons representing features belonging to an object were linked (von der Malsburg, 1981). He suggested that the correlated firing of these neurons would be a way to establish such links. Detection of fine temporal synchrony in visual cortex has been a major driving force for continued investigations of this hypothesis (Singer and Gray, 1995). Spikes from different neurons occurring at the same time, or in some regular temporal relationship, could signal that cells belong to the same grouping (Abeles et al., 1993). Despite recent interest in the phenomenon of synchrony, correlated activity amongst cortical neurons has been known for many years. However, these patterns were often dismissed as being a result of shared input from a common anatomical connection (Fetz et al., 1991), and not the reflection of a dynamic process that might signal linking of elementary neural representations. The hypothesis is currently hotly debated (see e.g. the recent issue of the journal Neuron, vol. 24, issue 1, dedicated to binding), and no generally convincing resolution of this argument has been presented.
The technical difficulty in simultaneously recording from multiple neurons, the problem of manipulating synchrony and firing rate independently, and the theoretical need for appropriate statistical methods have limited the resolution of the issue. The recent emergence of several multiple neuron recording methods has removed the first barrier, and advances in statistical tools have helped with the last of these challenges. Directly manipulating timing in isolation has been achieved only in invertebrate preparations (Stopfer et al., 1997) and remains a formidable experimental challenge.
In this chapter, we briefly review recent evidence concerning the nature of representations formed from groups of neurons in motor cortex and also draw upon examples from vision and from other systems to illustrate how representations might be coded through higher-order interactions among neuronal groups. We also present some of our recent analytical results which demonstrate that groups of motor cortical neurons carry substantial amounts of information related to motor behavior. As suggested from the discussion above, we will make a formal distinction between population codes, which assume that representations involve the collective activity of independently firing neurons, and ensemble (or relational) codes, which depend on statistical dependencies (such as correlations) among neurons to form a represented variable.
Population code
From a statistical point of view, a population code is a first-order code because it assumes that information resides in the mean activities of the engaged neurons. The work of Georgopoulos and his colleagues in the motor cortex has clearly demonstrated the power of population codes based on vector averages. This work established that firing rates of primary motor cortex (MI) neurons vary with direction of arm reaching. A cosine describes well the directional tuning of these neurons; the peak of this function defines the preferred direction (PD; i.e. the direction of maximal firing) of the cell (Georgopoulos et al., 1982). However, individual cell firing rates vary considerably from trial to trial (i.e. the variance is larger than the mean), which contributes 'noise' to this form of directional coding (Lee et al., 1998; Maynard et al., 1999). Georgopoulos presented a simple and elegant scheme that relieved this noise problem: averaging the firing rate of individual cells across a population. The decoding algorithm is based upon the population vector average, in which each contributing neuron defines a vector whose direction corresponds to its PD and whose length is proportional to its firing rate on a particular trial. Such a population vector algorithm returns a reliable estimate of the actual direction performed. This approach has great appeal because it helps to deal with the apparent noisiness of neurons, and averaging is a mechanism that is biologically plausible. Vector averaging has been applied to a variety of systems, including the visual system to estimate the orientation of faces (Oram et al., 1998), and the auditory system to localize a sound source based on interaural time differences (Fitzpatrick et al., 1997).
There are, however, several strong assumptions made by vector-averaged population coding. These include the requirement that each cell has a single-peaked tuning curve and the necessity that the population of neurons have PDs that cover the space of possible directions (or values of any other vector quantity that is represented) uniformly. However, a more fundamental assumption of this method is most relevant to the central thesis of this paper. Namely, the firing rates of the population are assumed to be statistically independent. That is, conditional on a particular movement direction, knowing the firing rate of one neuron provides no information about the firing rate of any other neuron in the population. It is this assumption, in fact, which provides this coding scheme with its ability to reduce noise so effectively. Mathematically, the variability of the population estimate of direction or any parameter decreases as one over the square root of the number of neurons in the population, if the neurons fire independently. At the other extreme, if the neurons are perfectly correlated, the variability of the population estimate remains constant with the number of neurons (Rieke et al., 1997). That is, nothing is gained by adding more neurons into the decoder because each neuron is providing a redundant estimate of the parameter.
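The population vector scheme described in this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the published algorithm: the function names, the cosine-tuning parameters (baseline and modulation depth), the subtraction of the baseline rate before weighting, and the independent Gaussian noise are all hypothetical choices made for the example.

```python
import numpy as np

def cosine_tuning_rates(theta, preferred_dirs, baseline=20.0, depth=15.0):
    """Mean rate (Hz) of each cell for movement direction theta (radians).

    Assumed tuning: baseline + depth * cos(theta - PD).
    """
    return baseline + depth * np.cos(theta - preferred_dirs)

def population_vector(rates, preferred_dirs, baseline=20.0):
    """Angle of the rate-weighted sum of unit vectors pointing along each PD.

    Weighting by (rate - baseline) is an assumption of this sketch.
    """
    weights = rates - baseline
    x = np.sum(weights * np.cos(preferred_dirs))
    y = np.sum(weights * np.sin(preferred_dirs))
    return np.arctan2(y, x)

def angle_difference(a, b):
    """Signed difference a - b wrapped to (-pi, pi]."""
    return np.angle(np.exp(1j * (a - b)))

rng = np.random.default_rng(0)
n_cells = 200
# PDs spread uniformly around the circle, as the method assumes.
preferred_dirs = np.linspace(0.0, 2.0 * np.pi, n_cells, endpoint=False)
true_dir = np.deg2rad(135.0)

mean_rates = cosine_tuning_rates(true_dir, preferred_dirs)
noisy_rates = mean_rates + rng.normal(0.0, 10.0, n_cells)  # independent noise

estimate_clean = population_vector(mean_rates, preferred_dirs)
estimate_noisy = population_vector(noisy_rates, preferred_dirs)
print(np.rad2deg(estimate_clean), np.rad2deg(estimate_noisy))
```

With noise-free rates and uniformly spaced PDs the estimate recovers the true direction exactly; with independent trial-to-trial noise the estimate scatters around it, and the scatter shrinks as more cells are added.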
Cells in motor cortex are embedded in a rich matrix of modifiable intrinsic connections (Hess and Donoghue, 1994) and, therefore, on anatomical grounds, it would be unlikely that motor cortical neurons fire independently. Numerous cross-correlation studies in motor cortex show that MI neurons are not statistically independent (Allum et al., 1982; Murphy et al., 1985a,b; Fetz et al., 1991; Hatsopoulos et al., 1998b). If we extend the theoretical arguments made by von der Malsburg to motor cortex, statistical interactions among motor cortical neurons could exist to combine elementary motor representations to generate a coordinated motor action.
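The two limiting cases quoted for the population estimate (independent versus perfectly correlated neurons) are the end points of a standard result for the variance of an average. As a worked sketch, assume that each of N neurons supplies an unbiased estimate x_i with the same variance σ² and the same pairwise correlation ρ; these symbols are introduced here for illustration and do not appear in the chapter.

```latex
\operatorname{Var}\!\left(\frac{1}{N}\sum_{i=1}^{N} x_i\right)
  = \frac{1}{N^{2}}\left[N\sigma^{2} + N(N-1)\rho\,\sigma^{2}\right]
  = \frac{\sigma^{2}}{N}\left[1 + (N-1)\rho\right]
```

For ρ = 0 the variance is σ²/N, so the standard deviation falls as 1/√N. For ρ = 1 the variance stays at σ² regardless of N. For intermediate ρ the variance approaches the floor ρσ² as N grows, so averaging over more neurons eventually stops helping.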
Ensemble codes
Instead of assuming that information resides only in the first-order statistics of neurons (i.e. their mean firing rate), ensemble codes incorporate higher-order interactions among neurons, such as pair-wise correlations. It is immediately apparent that these kinds of codes could provide richer representations and, therefore, additional information unavailable from neurons treated independently. Our decimal system of numeric representation is one simple example of an ensemble code (i.e. given 1 and 2, four numbers are possible: 1, 2, 12, 21; meaning is determined by the relationship of one number to the next as well as by the numbers themselves). Given the seemingly immense representational capacity of the cortex, some form of ensemble code seems likely to overcome the coding limitations of populations of independent neurons. Our recent work has revealed that pairs of motor cortical neurons engage in correlated discharge on several different time scales (Hatsopoulos et al., 1998a,b; Maynard et al., 1999). Moreover, we have shown that more accurate prediction of movement direction is possible when these correlations are incorporated into a decoding algorithm.
Until recently, it has been difficult to investigate the existence of ensemble coding because it required recording simultaneously from large sets of cortical neurons. However, new methods have emerged to allow such recording. These methods include the use of chronically implanted microwires, multiple moveable microelectrodes, and fixed arrays of many electrodes. We have been using a fixed silicon-based array with 100 microelectrodes (Nordhausen et al., 1994) to chronically record extracellular action potentials in behaving primates. Currently, this method makes it possible to monitor up to 50 single sites from a 10 x 10 grid of silicon electrodes each separated by 400 µm. One or more single units can be detected on many of these electrodes.
Using this method we have recorded from up to 14 well-isolated units in macaque monkey MI cortex during arm reaching movements in the horizontal plane (Fig. 1). The monkeys have been trained to hold a manipulandum, which controls a position-feedback cursor on a computer monitor, and are instructed to move the cursor from a center target to one of eight peripherally positioned targets (the so-called 'center-out' task used by a number of researchers to examine directional tuning; see Hatsopoulos et al. (1998a) for more details on the behavioral task). Briefly, each trial was composed of three periods: a 500 ms hold period during which the monkey had to keep the cursor over the center target, a variable (1-1.5 s) instruction period during which one of the eight peripherally positioned targets appeared, and a go period at which time the peripheral target began blinking, which signaled to the animal to execute the instructed action. Unless specified otherwise, we will present results using a task with only two possible movement directions: left and right. This simplified version of the center-out task offers the advantage that we were able to collect the sufficiently large numbers of trials necessary to use methods that incorporate higher-order statistical interactions within neuronal ensembles.
[Figure 1: array of peri-event time histograms; orientation labels: medial, lateral, rostral, caudal; x-axis: time (s) relative to movement onset.]
Fig. 1. Diversity of firing patterns across MI during movement. The peri-event time histograms show data from 14 primary motor cortical (MI) neurons recorded simultaneously while a monkey performed leftward reaching movements. The histograms (20 ms bin-width) are based on 150 trials of data aligned on movement onset and have been smoothed. The spatial configuration of the histograms corresponds to the spatial layout of the array on the cortex; adjacent histograms are separated by 400 µm. Two units have been isolated from one electrode located to the far right.
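A peri-event time histogram of the kind shown in Fig. 1 can be computed roughly as follows. This is a minimal sketch: the 20 ms bin width and the count of 150 trials come from the caption, while the function names, the analysis window, the moving-average smoother, and the synthetic spike train are assumptions made for illustration.

```python
import numpy as np

def peri_event_histogram(spike_times, event_times, window=(-1.0, 1.0), bin_width=0.02):
    """Trial-averaged firing rate (Hz) in bins aligned on each event time (s)."""
    n_bins = int(round((window[1] - window[0]) / bin_width))
    edges = window[0] + bin_width * np.arange(n_bins + 1)
    counts = np.zeros(n_bins)
    for t0 in event_times:
        # Re-express spike times relative to this event, then bin them.
        counts += np.histogram(spike_times - t0, bins=edges)[0]
    rate = counts / (len(event_times) * bin_width)
    return edges[:-1], rate

def smooth(rate, width=3):
    """Moving-average smoothing (the smoother used for Fig. 1 is not specified)."""
    kernel = np.ones(width) / width
    return np.convolve(rate, kernel, mode="same")

# Synthetic example: 150 movement onsets, one spike 50 ms after each onset.
event_times = 3.0 * np.arange(1, 151)
spike_times = event_times + 0.05
bin_starts, rate = peri_event_histogram(spike_times, event_times)
smoothed = smooth(rate)
```

In the synthetic example every trial contributes one spike to the bin starting at 40 ms, so that bin reads 1 spike / 0.02 s = 50 Hz and all other bins are empty.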
Fine temporal synchrony
As others have shown previously, motor cortical neurons often engage in synchronous activity. Using standard cross-correlation techniques, we found that up to 30% of cell pairs fired synchronously (Hatsopoulos et al., 1998b), meaning that they had a peak in their cross-correlogram centered at 0 (±1 ms). Among those cell pairs, the temporal precision of synchrony, defined as the width of the cross-correlation histogram (CCH) peak at half-height, was between 10 and 15 ms for the majority of pairs. However, we occasionally observed CCHs with very narrow peaks ranging from 1 to 3 ms wide, especially when these histograms were built using data from a narrow time window (Fig. 2A). This high temporal precision was particularly remarkable given the large inter-electrode distance that could occur between these neurons (over 1 mm). Although cross-correlation techniques have been traditionally used to infer anatomical relationships between neurons, our main interest was to investigate the role that synchrony might play in motor representation. Towards this end, we examined the temporal dynamics of the observed synchrony and found that it was not a static property of neuronal pairs but tended to occur around movement onset (Fig. 2B) (Hatsopoulos et al., 1998b). Moreover, the strength of synchronous discharge varied not only with time but also with movement direction (Fig. 2C). This strongly suggested that synchrony might actually carry information about the direction of the arm movement. Using an information-theoretic analysis, we found that directional information was available in the synchronous spikes between simultaneously recorded neurons and that the amount of information increased near movement onset.
However, as is true of most MI neurons, firing rates modulate either up or down around movement onset. The number of synchronous spikes will likewise modulate up or down around movement onset, simply as a consequence of these rate changes, even if the two cells are conditionally independent. To account for this confounding factor, we estimated the information available from synchronous spikes after shuffling the trial order of one of the neurons relative to the other. This shuffling technique removed the trial-specific synchrony but preserved the synchrony that was due to common rate modulation. By shuffling multiple times we could estimate the distribution of information available from synchrony, assuming that the two neurons fired independently. These data allowed a statistical test of the amount of synchrony. We found that 45% of cell pairs carried directional information in their synchronous discharge (defined on a 1 ms time scale) around movement onset beyond that available from chance coincidences. This does not imply, however, that synchronous spikes provide additional directional information beyond that available
[Figure 2: (A) cross-correlation histogram, x-axis: lead/lag (ms); (B) cross-correlogram as a function of time across the trial, go signal marked; (C) cross-correlograms for left and right movements, x-axis: lead/lag (ms).]
Fig. 2. Synchronous discharge between MI neurons during arm movements. (A) Significant synchrony between two MI neurons can occur on a millisecond time scale. This figure shows the cross-correlation histogram (1 ms bin-width) between two neurons; the correlogram plots the number of spikes occurring in one neuron (the target neuron) relative to all spikes in another neuron (the reference neuron), using data from a 400 ms period straddling movement onset. (B) Synchrony is a dynamic property of neuron pairs. This plots the cross-correlogram (1 ms bin-width) between two neurons, which measures the magnitude of coincident spikes at different leads and lags and as a function of time across the trial. The color code ranges from dark blue, which represents non-significant coincidence rates, to bright red, which reflects significant coincidence rates (P < 0.01; see Abeles (1982) for significance test). (C) Synchrony varies with movement direction. Note the presence of a correlation peak only for the rightward direction. The cross-correlogram is from one pair of MI neurons for left and right movements based on a 400 ms period around the signal instructing the animal to initiate the movement.
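The logic of the trial-shuffling control can be sketched as follows. Note the substitution: the chapter applies shuffling to an information-theoretic measure, whereas this sketch uses a simpler stand-in, the raw count of zero-lag coincidences in 1 ms bins. The function names, the synthetic firing-rate profile, and the injected synchronous spikes are all hypothetical and serve only to show why shuffling preserves rate-driven coincidences while removing trial-specific ones.

```python
import numpy as np

def coincidence_count(a, b):
    """Number of 1 ms bins in which both neurons spike on the same trial.

    a, b: boolean arrays of shape (n_trials, n_bins).
    """
    return int(np.sum(a & b))

def shuffled_counts(a, b, n_shuffles=200, rng=None):
    """Coincidence counts after permuting the trial order of neuron b.

    Shuffling keeps each neuron's rate profile but breaks within-trial pairing.
    """
    rng = np.random.default_rng() if rng is None else rng
    n_trials = a.shape[0]
    out = np.empty(n_shuffles, dtype=int)
    for i in range(n_shuffles):
        out[i] = coincidence_count(a, b[rng.permutation(n_trials)])
    return out

rng = np.random.default_rng(1)
n_trials, n_bins = 100, 400
t = np.arange(n_bins)
# Both cells share a rate increase mid-window (common rate modulation).
p_spike = 0.02 + 0.03 * np.exp(-0.5 * ((t - 200) / 50.0) ** 2)
a = rng.random((n_trials, n_bins)) < p_spike
b = rng.random((n_trials, n_bins)) < p_spike
# Inject three truly synchronous spikes per trial at trial-specific times.
for trial in range(n_trials):
    idx = rng.choice(n_bins, size=3, replace=False)
    a[trial, idx] = True
    b[trial, idx] = True

observed = coincidence_count(a, b)
null = shuffled_counts(a, b, n_shuffles=200, rng=rng)
threshold = np.percentile(null, 99)  # approximate 1% significance level
print(observed, null.mean(), threshold)
```

The shuffled counts estimate how many coincidences common rate modulation alone would produce; only the excess of the observed count over that distribution is attributed to trial-specific synchrony.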
Spike count covariance
In addition to synchrony measured on a relatively fine time scale, correlated activity occurs on the tens and hundreds of millisecond time scale. That is, the number of spikes counted in a large time window from one neuron covaries with that from another neuron if measured repeatedly over different trials (Fig. 3A). This broad spike covariance has been observed in many different cortical areas besides motor cortex including visual cortex (van Kan et al., 1985; van der Togt et al., 1998), inferotemporal cortex (Gochin et al., 1991; Gawne and Richmond, 1993), and medial temporal cortex (Zohary et al., 1994). It has been considered to be a nuisance variable for population coding schemes because they require that spike count variability or 'noise' across the neuronal population is independent in order to wash away its effects on parameter estimation. Although this form of correlation is observed by many researchers, it has largely been dismissed as being insignificant especially at large inter-electrode distances (Lee et al., 1998). Lee and colleagues showed that this broad correlation is more strongly related to the similarity of the preferred directions of the cells at small inter-electrode distances but did not directly demonstrate that the correlation strength decreases with distance. Our results suggest that the strength of covariation in motor cortex remains relatively constant with inter-electrode distance (Maynard et al., 1999). Recent theoretical studies have shown that broad correlations can improve neural discrimination of different stimuli or behaviors (Oram et al., 1998). Using more sophisticated statistical models incorporating the observed covariance between neurons in a population, we were the first to demonstrate using simultaneously recorded unit data that estimation of movement direction can be improved (Maynard et al., 1999). While others have computed single estimates of spike count correlation across all experimental conditions (Gawne and Richmond, 1993; Zohary et al., 1994; Lee et al., 1998), we measured the spike count correlation between neurons separately for each movement direction. We found that the correlation strength varies in time and with movement direction (Fig. 3B) and contributes to the improved directional estimation.
[Figure 3: (A) scatter plot of the spike counts of two neurons, R = 0.48; (B) panels for left and right movements, x-axis: time (s) relative to the go signal.]
Fig. 3. Pairs of MI neurons engage in correlated activity on a broad time scale. (A) Scatter plot of the number of spikes generated by one neuron relative to another neuron measured in a 500 ms period after movement onset. Each point in the scatter plot represents a different trial. The plot demonstrates that the spike counts between the two neurons are significantly correlated (P < 0.01). (B) Correlations vary dynamically over time. These plots show the Fisher z-transform of the correlation coefficient between the spike counts of two neurons, computed for 100 ms intervals, as a function of time relative to the go signal for movements to the left and to the right. The dashed line corresponds to 2.33 standard deviations above the mean of 50 correlation coefficients (z-transformed), computed by randomly shuffling the trial order of one of the neurons relative to the other. Therefore, the dashed line approximates the 99% significance level. That is, correlation coefficients above the dashed line are significantly different from zero at the 1% level.
To demonstrate the contribution that correlated activity makes in direction estimation, we tried to predict movement direction from single trials of multi-neuron data. We used a simple maximum-likelihood classifier to assign individual trials of data into different directional classes. We used a Gaussian model to fit the observed number of spikes measured in a time window anchored to one of several stimulus or behavioral events: the instruction signal, the go signal, movement onset, and end of movement. The window anchoring time represented the end of the window and not the middle.
This convention was adopted because neural activity in motor cortex generally precedes movement onset and, therefore, is assumed to predict motor behavior in the future. For single neurons, two parameters had to be estimated: the mean and the variance of the number of spikes. For a population of N neurons, we had to estimate a mean vector of length N and a covariance matrix of size N x N. By creating a separate model for each movement direction, we could estimate the likelihood of observing a particular set of spike counts on a single trial given each model, and assign that trial to the class with the largest likelihood. To avoid overfitting, we cross-validated the classification by leaving one trial of data out for testing and estimating the model parameters on the remaining set of data.
The results of this classification procedure demonstrate a number of important points with regard to neural representations in motor cortex, specifically, and to cortical coding, in general. First, the performance of the classifier using an 8 neuron ensemble (thick solid line, Fig. 4) is better than that of the single best neuron (thin solid line, Fig. 4). Note that at every time point the performance of the best of the 8 neurons is plotted even if different neurons are best at different times. The ensemble classifies nearly perfectly (i.e. 100% correct classification) near the end of the reaction time period, which corresponds to about 300-400 ms after the go signal. Although the single best neuron can perform quite well and considerably better than many other neurons in the population, the neural ensemble almost always provides more predictive power. Thus, the performance of the neural ensemble cannot be accounted for by the most sensitive neuron in the recorded population, which suggests that Barlow's 'lower envelope' principle may not hold in this case (Barlow, 1972). In Fig. 5A, the mean performance over all possible subsets of N neurons is plotted as N increases. The performance of the classifier increases monotonically as the number of cells in the ensemble increases. This monotonic function is shown more clearly in Fig. 5B, where the classifier's performance based on a 300 ms period after the onset of the go signal is plotted versus the number of cells.
Second, the ability to reliably predict the movement direction can occur quite early during the trial (see Fig. 4, leftmost panel). As early as 300 ms (i.e.
100 ms to 300 ms given a 200 ms integration window) after the onset of the instruction signal, one can predict the movement direction with 70% accuracy compared to chance performance of 50%. This period of time is over 1000 ms before the movement is executed. Thus, MI neurons are not simply 'upper motor neurons' that execute a command that is planned somewhere else in the brain. It is also noteworthy that the classifier's performance increases
[Figure 4 plot: classification performance (%, axis from 50 to 100) versus time (ms) for the neural ensemble and the single best neuron, in panels aligned on instruction, go cue, movement onset, and end of movement.]
Fig. 4. Performance of the maximum-likelihood classifier on single-trial data from a two-direction (left/right) movement task. The thick solid line plots the performance of the classifier based on cross-validated data aligned on different stimulus or behavioral events: instruction signal, go cue, movement onset, and end of movement. At each time point, the classifier's performance is based on the number of spikes in each neuron in a 200 ms period ending at that time point. By comparison, the thin solid line plots the performance of the single neuron that classifies direction best at each time point. Note that each point on the thin solid line does not necessarily represent the same neuron. The dotted line is the 95% significance level based on the classifier's maximum performance from 59 shuffles in which each trial's direction label is randomly assigned to another trial. Thus, values exceeding this level are significantly different from chance at the 5% level. The arrow indicates that both the ensemble and the single best neuron can predict the movement direction with about 70% accuracy as early as 300 ms after the instruction signal.
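The classification procedure described in the text (one model per movement direction, each with a mean vector and a covariance matrix of spike counts, evaluated with leave-one-out cross-validation) can be sketched as follows. This is a minimal illustration rather than the authors' code: it assumes a multivariate Gaussian likelihood, which the mean-and-covariance parameterization suggests but the text does not state, and the function names and the small `ridge` term that keeps the covariance invertible are hypothetical.

```python
import numpy as np

def gaussian_log_likelihood(x, mean, cov):
    """Log density of a multivariate Gaussian at spike-count vector x."""
    diff = x - mean
    _, logdet = np.linalg.slogdet(cov)
    mahalanobis = diff @ np.linalg.solve(cov, diff)
    return -0.5 * (len(x) * np.log(2.0 * np.pi) + logdet + mahalanobis)

def leave_one_out_accuracy(counts, labels, ridge=1e-3):
    """Fraction of trials assigned to the correct direction.

    counts: (trials, neurons) spike counts in one integration window.
    labels: (trials,) movement-direction label of each trial.
    ridge:  hypothetical regularizer added to the covariance diagonal
            (an assumption of this sketch, not from the text).
    """
    counts = np.asarray(counts, dtype=float)
    labels = np.asarray(labels)
    n_trials, n_neurons = counts.shape
    correct = 0
    for held_out in range(n_trials):
        # Estimate every direction model without the held-out trial.
        train = np.arange(n_trials) != held_out
        best_label, best_ll = None, -np.inf
        for label in np.unique(labels[train]):
            x = counts[train & (labels == label)]
            mean = x.mean(axis=0)
            cov = np.atleast_2d(np.cov(x, rowvar=False))
            cov = cov + ridge * np.eye(n_neurons)
            ll = gaussian_log_likelihood(counts[held_out], mean, cov)
            if ll > best_ll:
                best_label, best_ll = label, ll
        # Assign the trial to the class with the largest likelihood.
        correct += int(best_label == labels[held_out])
    return correct / n_trials
```

Calling `leave_one_out_accuracy` once per window position (for example, a 200 ms window slid along the trial) would give a performance-versus-time curve of the kind plotted in Fig. 4. Each class needs at least two training trials for the covariance estimate to be defined.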
sharply from 200 to 300 ms after the onset of the instruction signal. It then almost levels off until 800 ms after the instruction onset, at which time it begins to increase steadily again. Third, the fact that prediction can be quite good even using very small integration time windows calls into question the sharp distinction between rate coding and temporal coding that is often posited in the literature (Fig. 6). Even at a peak rate of 150 Hz around movement onset, an integration window of only 10 ms will contain only about 1 to 2 spikes. Although direction prediction reaches a peak of about 85% for 8 cells, it is important to note that many more neurons are active during such a task, and, therefore, it may not be unreasonable to assume that direction estimation could be nearly perfect using only the presence or absence of a single spike across a population. Fourth, the ability to predict movement direction on a single-trial basis is improved if the spike count correlations between neurons are incorporated into the statistical models used to decode the neural data.
We have shown that classification in an 8-direction task improves by 3% to 19% (average of 11% over all data sets) when the pair-wise correlations are taken into account (Maynard et al., 1999). In that study, we did not use a maximum-likelihood classifier when considering the complete neuronal ensemble but rather implemented a Monte Carlo technique to simulate data with the pair-wise correlations observed in the real data and used a nearest-neighbor classifier. In the two-direction task (left or 180°/right or 0°) that we examine here, we found no statistical difference in classification performance between the maximum-likelihood classifier that incorporated the correlations and the same classifier that ignored them. This is not surprising given that the individual spike rates of the population differ so markedly between opposite movement directions. In fact, the performance of the classifier becomes perfect (i.e. 100%) after movement onset when all 8 neurons are included (see Fig. 4). This ceiling effect prevents other statistical parameters such as correlation from improving performance: no further information can be added by a new measure if it is fully specified by the original one. However, by examining data that were collected using two directions that differ by only 45°, we were able to observe improved classification when the pairwise correlations were included. Fig. 7 compares the performance of the 2-neuron classifier using a 600 ms integration window when the correlations are included (solid line) versus when the correlations are ignored (dashed line). Notice how the classifier performs better (gray region) when the spike count covariations are included from 200 ms after movement onset (i.e. -400 ms to +200 ms) to 500 ms (i.e. -100 ms to +500 ms).
[Figure 5 plots: (A) classification performance (%) versus time (ms) for an increasing number of cells, in panels aligned on instruction, go cue, movement onset, and end of movement; (B) classification performance (%) versus number of neurons (1 to 8).]
Fig. 5. The maximum-likelihood classifier's performance improves as more neurons are added into the ensemble. (A) The classifier's performance is plotted in time (based on the number of spikes in a 200 ms period) averaged over all subsets of N neurons from 1 to 8. (B) The classifier's performance based on a 300 ms period after the go signal as a function of the number of neurons. The black line connects the mean performance averaged over all subsets of neurons as the number of neurons in the subset increases. The dashed horizontal lines represent the median performance, and the boxes span the range from the 1st to the 3rd quartile. The 'error-bars' represent the full range of performance values.
[Figure 6 plot: classification performance (%) versus time (ms), in panels aligned on instruction, go cue, movement onset, and end of movement, with the 95% significance level marked.]
Fig. 6. Classification success for small numbers of spikes. The plot shows the performance of a maximum-likelihood classifier on single-trial data from a two-direction (left/right) movement task. The analysis is the same as in Fig. 4, using all 8 neurons, but here the integration window is only 10 ms instead of 200 ms wide. Performance reaches a maximum of ~85% at around movement onset. In such a narrow integration window, motor cortical neurons may fire only one or two spikes. A decoding scheme based on the presence or absence of a single spike across a population of neurons can predict movement direction quite well, and, therefore, the assumption that rate codes require large integration windows may not be well founded.
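The spike-count arithmetic behind the 10 ms window can be checked directly. The sketch below computes the expected number of spikes for a given firing rate and window; the second function additionally assumes Poisson firing, which is an assumption of this illustration and not a claim made in the text. Both function names are hypothetical.

```python
import math

def expected_spikes(rate_hz, window_ms):
    """Expected spike count for a neuron firing at rate_hz over window_ms."""
    return rate_hz * window_ms / 1000.0

def prob_at_least_one_spike(rate_hz, window_ms):
    """Chance of observing one or more spikes, assuming Poisson firing
    (an assumption of this sketch, not from the text)."""
    return 1.0 - math.exp(-expected_spikes(rate_hz, window_ms))

# Peak rate of 150 Hz in a 10 ms window, as quoted in the text.
print(expected_spikes(150.0, 10.0))  # 1.5
```

At the quoted peak rate the window holds 1.5 spikes on average, consistent with "about 1 to 2 spikes"; at lower rates the count drops toward the presence or absence of a single spike.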
Conclusion
Large groups of cortical neurons are simultaneously active when a stimulus is perceived or a motor act is planned and executed. The fact that nearly all units from which we recorded, using an immovable, chronically implanted array in motor cortex, modulated their activity during execution of a simple reaching movement attests to the idea that a very large number of neurons participate in even the simplest of behaviors. What is less clear is whether the statistical interactions among these neurons participate in perceptual and motor representations that are used by the nervous system to drive perception or guide behavior. Population decoding strategies such as the population vector algorithm assume that neurons are noisy but independent encoders. These types of decoding algorithms have demonstrated that pooling the noisy signals from many neurons can reduce the detrimental effects of noise and can predict the direction of movement quite reliably. However, what has been termed 'noise' is actually correlated and carries information. We have tried to extend the idea of a first-order population code by taking into account higher-order relationships between neurons as well as their mean firing rates. By recording from multiple neurons simultaneously, we have found that neurons are not independent encoders of movement direction. They exhibit statistical dependencies on both a fine time scale (i.e. synchrony) ... By incorporating these correlations into our statistical coding and decoding schemes, we have shown that they are not necessarily detrimental to decoding movement direction but can improve our predictive power.
Acknowledgements
This work was supported by a NIMH grant (1KO1 MH01671) to NGH, a NINDS grant (NS25074) to JPD, a W.M. Keck Foundation grant (991710) to JPD, and a National Defense Science and Engineering Graduate Fellowship to MTH.
[Figure 7 plot: classification performance (%, axis from 50 to 100) versus time (ms, -400 to 400) relative to movement onset.]
Fig. 7. Higher-order codes may optimize separation of similar activity patterns. This figure plots the classification rate for two movements that are close to the same directions (45° apart). The performance of the classifier is stronger after movement onset when the 2nd-order interactions among the neurons are explicitly included in the classifier's statistical models (see bracket; gray shading). The solid line is generated by the maximum-likelihood classifier described in the paper. The dashed line is generated in the same manner except that the covariances are forced to zero, i.e. the neurons are assumed to be conditionally independent given the movement direction. A 600 ms integration window was used.
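The comparison in Fig. 7, the same classifier with and without covariances, can be illustrated with a two-neuron example. The sketch below assumes a Gaussian likelihood (an assumption of this illustration) and uses made-up numbers; `independent=True` forces the off-diagonal covariances to zero, which corresponds to treating the neurons as conditionally independent.

```python
import numpy as np

def log_likelihood(x, mean, cov, independent=False):
    """Gaussian log likelihood of spike counts x under one direction model.

    With independent=True the off-diagonal covariances are set to zero,
    i.e. the neurons are treated as conditionally independent.
    """
    x = np.asarray(x, dtype=float)
    mean = np.asarray(mean, dtype=float)
    cov = np.asarray(cov, dtype=float)
    if independent:
        cov = np.diag(np.diag(cov))
    diff = x - mean
    _, logdet = np.linalg.slogdet(cov)
    mahalanobis = diff @ np.linalg.solve(cov, diff)
    return -0.5 * (len(x) * np.log(2.0 * np.pi) + logdet + mahalanobis)

# Hypothetical model: two neurons with equal means and positively
# correlated spike counts (illustrative values only).
mean = [10.0, 10.0]
cov = [[4.0, 3.0], [3.0, 4.0]]
along = [12.0, 12.0]    # both counts above the mean together
against = [12.0, 8.0]   # one count above, one below

# The full model prefers the pattern that follows the correlation;
# the independent model cannot tell the two patterns apart.
print(log_likelihood(along, mean, cov) > log_likelihood(against, mean, cov))
print(np.isclose(log_likelihood(along, mean, cov, independent=True),
                 log_likelihood(against, mean, cov, independent=True)))
```

When two movement directions produce similar mean rates, as for directions 45° apart, this extra sensitivity to the joint pattern is the only information the independent model lacks, which is in keeping with the improvement shown in the gray region.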
References
Abeles, M. (1982) Quantification, smoothing, and confidence limits for single-units' histograms. J. Neurosci. Methods, 5: 317-325.
Abeles, M., Bergman, H., Margalit, E. and Vaadia, E. (1993) Spatiotemporal firing patterns in the frontal cortex of behaving monkeys. J. Neurophysiol., 70: 1629-1638.
Allum, J.H.J., Hepp-Reymond, M.C. and Gysin, R. (1982) Cross-correlation analysis of interneuronal connectivity in the motor cortex of the monkey. Brain Res., 231: 325-334.
Barlow, H.B. (1972) Single units and sensation: A neuron doctrine for perceptual psychology. Perception, 1: 371-394.
Bienenstock, E. and Geman, S. (1995) Compositionality in neural systems. In: M. Arbib (Ed.), The Handbook of Brain Theory and Neural Networks. Bradford Books/MIT Press, Cambridge, MA, pp. 223-226.
Fetz, E.E., Toyama, K. and Smith, W. (1991) Synaptic interactions between cortical neurons. In: A. Peters and E.G. Jones (Eds.), Cerebral Cortex. Plenum, New York, pp. 1-80.
Fitzpatrick, D.C., Batra, R., Stanford, T.R. and Kuwada, S. (1997) A neuronal population code for sound localization. Nature, 388: 871-874.
Gawne, T.J. and Richmond, B.J. (1993) How independent are the messages carried by adjacent inferior temporal cortical neurons? J. Neurosci., 13: 2758-2771.
Georgopoulos, A.P., Kalaska, J.F., Caminiti, R. and Massey, J.T. (1982) On the relations between the direction of two-dimensional arm movements and cell discharge in primate motor cortex. J. Neurosci., 2: 1527-1537.
Gochin, P.M., Miller, E.K., Gross, C.G. and Gerstein, G.L. (1991) Functional interactions among neurons in inferior temporal cortex of the awake macaque. Exp. Brain Res., 84: 505-516.
Hatsopoulos, N.G., Ojakangas, C.L., Maynard, E.M. and Donoghue, J.P. (1998a) Detection and identification of ensemble codes in motor cortex. In: H. Eichenbaum and J. Davis (Eds.), Neuronal Ensembles: Strategies for Recording and Decoding. Wiley, New York, pp. 161-175.
Hatsopoulos, N.G., Ojakangas, C.L., Paninski, L. and Donoghue, J.P. (1998b) Information about movement direction obtained from synchronous activity of motor cortical neurons. Proc. Natl. Acad. Sci., 95: 15706-15711.
Hess, G. and Donoghue, J.P. (1994) Long-term potentiation of horizontal connections provides a mechanism to reorganize cortical motor maps. J. Neurophysiol., 71: 2543-2547.
Kakei, S., Hoffman, D.S. and Strick, P.L. (1999) Muscle and movement representations in the primary motor cortex. Science, 285: 2136-2139.
Lee, D., Port, N.L., Kruse, W. and Georgopoulos, A.P. (1998) Variability and correlated noise in the discharge of neurons in motor and parietal areas of the primate cortex. J. Neurosci., 18: 1161-1170.
Maynard, E.M., Hatsopoulos, N.G., Ojakangas, C.L., Acuna, B.D., Sanes, J.N., Normann, R.A. and Donoghue, J.P. (1999) Neuronal interactions improve cortical population coding of movement direction. J. Neurosci., 19: 8083-8093.
Murphy, J.T., Kwan, H.C. and Wong, Y.C. (1985a) Cross correlation studies in primate motor cortex: Event related correlation. Can. J. Neurol. Sci., 12: 24-30.
Murphy, J.T., Kwan, H.C. and Wong, Y.C. (1985b) Cross correlation studies in primate motor cortex: Synaptic interactions and shared input. Can. J. Neurol. Sci., 12: 11-23.
Nordhausen, C.T., Rousche, P.J. and Normann, R.A. (1994) Optimizing recording capabilities of the Utah intracortical electrode array. Brain Res., 637: 27-36.
Oram, M.W., Foldiak, P., Perrett, D.I. and Sengpiel, F. (1998) The 'ideal homunculus': decoding neural population signals. Trends Neurosci., 21: 259-265.
Oram, M.W., Hatsopoulos, N.G., Richmond, B.J. and Donoghue, J.P. (2001) Synchrony in motor cortical neurons provides direction information that is redundant with the information from coarse temporal response measures. (Submitted for publication.)
Parker, A.J. and Newsome, W.T. (1998) Sense and the single neuron: Probing the physiology of perception. Annu. Rev. Neurosci., 21: 227-277.
Rieke, F., Warland, D., de Ruyter van Steveninck, R. and Bialek, W. (1997) Spikes: Exploring the Neural Code. MIT Press, Cambridge, MA.
Sanes, J.N., Donoghue, J.P., Thangaraj, V., Edelman, R.R. and Warach, S. (1995) Shared neural substrates controlling hand movements in human motor cortex. Science, 268: 1775-1777.
Schieber, M.H. and Hibbard, L.S. (1993) How somatotopic is the motor cortex hand area? Science, 261: 489-492.
Singer, W. and Gray, C.M. (1995) Visual feature integration and the temporal correlation hypothesis. Annu. Rev. Neurosci., 18: 555-586.
Stopfer, M., Bhagavan, S., Smith, B.H. and Laurent, G. (1997) Impaired odour discrimination on desynchronization of odour-encoding neuronal assemblies. Nature, 390: 70-74.
Van der Togt, C., Lamme, V.A. and Spekreijse, H. (1998) Functional connectivity within the visual cortex of the rat shows state changes. Eur. J. Neurosci., 10: 1490-1507.
Van Kan, P.L.E., Scobey, R.P. and Gabor, A.J. (1985) Response covariance in cat visual cortex. Exp. Brain Res., 60: 559-563.
Von der Malsburg, C. (1981) The Correlational Theory of Brain Function. Max Planck Institute for Biophysical Chemistry, Göttingen.
Zohary, E., Shadlen, M.N. and Newsome, W.T. (1994) Correlated neuronal discharge rate and its implications for psychophysical performance. Nature, 370: 140-143.
M.A.L. Nicolelis (Ed.)
Progress in Brain Research, Vol. 130
Published by Elsevier Science B.V.
CHAPTER 16
Connectionist contributions to population coding in the motor cortex
Sohie Lee Moody and Steven P. Wise *
Laboratory of Systems Neuroscience, National Institute of Mental Health, National Institutes of Health, 49 Convent Drive, MSC 4401, Building 49, Room B1EE17, Bethesda, MD 20892-4401, USA
* Corresponding author: Steven P. Wise, Laboratory of Systems Neuroscience, National Institute of Mental Health, National Institutes of Health, 49 Convent Drive, MSC 4401, Building 49, Room B1EE17, Bethesda, MD 20892-4401, USA. Tel.: +1-301-402-5481; Fax: +1-301-402-5441; E-mail: [email protected]
Introduction
Studies of the motor cortex have contributed early and often to understanding population coding in the nervous system. In a sense, this historical fact should be surprising. The mammalian motor cortex is often regarded as too complex to yield a clear-cut understanding of its mechanisms. By contrast, invertebrate nervous systems offer many well known advantages for studies of population coding. Further, in the motor system it is much more difficult to control experimental variables than, for example, in the visual system. However, notwithstanding the inherent advantages of invertebrate and visual neurophysiology, many of the seminal advances in understanding population coding have come from studies of the primate motor cortex, a highly complex part of an experimentally difficult system. The pioneering studies of Humphrey et al. (1970) first showed that the correlation of motor cortical activity and force improved dramatically by combining the activity of a small population of motor cortical neurons. Georgopoulos et al. (1982, 1983a) later found that a simple computation derived from a population of motor cortical neurons, termed the population vector, correlated closely with the direction of limb movement. Taken together, these studies led to the conclusion that populations of motor cortical neurons code voluntary action.
The pioneering work cited above relied on the method of behavioral neurophysiology, as developed by Edward V. Evarts in the early 1960s. It typically entailed sampling single-cell activity one neuron at a time. Because this process took many weeks, population studies were limited to relatively stereotyped behaviors. This aspect of the method nearly precluded a neurophysiology of learning (but cf. Mitz et al., 1991). Further, interactions among neurons were rarely studied and analysis was limited, for the most part, to firing rates and other aspects of spike trains.
Behavioral neurophysiology is now passing from the Evarts era into one characterized by the simultaneous recording of activity from many cells with multiple electrodes. Indeed, the process has been underway for some time in rodent neurophysiology and is progressing apace in primate neurophysiology, as well. This technical development will allow a more systematic study of population activity on a trial-by-trial basis, during less stereotyped behavior. There will certainly be more focus on the analysis of the interactions among neurons and their changes during learning. We can expect many problems of interpretation to arise during the era of simultaneous, multiple electrode recordings, especially in its earliest phases. Neurophysiologists have relatively little experience dealing with single-unit data arriving simultaneously
from separate elements of a neural system. However, there is a branch of cognitive and computational neuroscience that deals routinely with an analogous situation. In a sense, simultaneous single-unit recording is an inherent feature of connectionist models using parallel distributed processing networks. Accordingly, we think it pertinent to consider population coding in neural network models and some of the lessons those models offer. Unlike neurophysiological data, neural network simulations come with the certainty that the individual neurons in the network causally link the network's inputs to its outputs. An attempt to understand the functional operations of a model network corresponds with the ideal neurophysiological situation of simultaneously recording from the entire population of neurons that contribute to a given behavior. Given the possibility of identifying causal relations between motor cortical activity and action, the combination of causally determined neural network simulations and motor cortex neurophysiology seems auspicious (Fetz and Shupe, 1990). No a priori assumptions need be made about the means by which a network will transform an input signal into a specified output, and there are therefore relatively few constraints on the kinds of sensory inputs or motor outputs available for simulation (see Lee, 1996 and Moody et al., 1998). We will refer to such sensorimotor transforms as mappings. Of course, we do not mean to imply that artificial neural networks veridically replicate the biological system. Nor must we assume that the learning algorithm resembles the mechanisms of learning in the brain. However, the properties of a model network's hidden units during input-output mapping have proven useful in contemplating similar properties in cortical neurons and their potential function in biological networks (Zipser and Andersen, 1988; Fetz and Shupe, 1990; Zipser et al., 1993; Moody et al., 1998; Moody and Wise, 2000).
Visuomotor behavior and motor cortex
We will consider the properties of the population vector during three classes of visuomotor behavior, defined by the relationship between visual input and spatially directed output. The first class involves the movement of an effector, e.g. the hand, limb, or fovea (i.e. gaze), to a visible target. In this, the direct or standard mapping case, the required movement terminates at the same location as the visual stimulus. The contrasting forms of visuomotor control have been termed nonstandard mapping (Wise et al., 1996) to contrast all of its variations to standard mapping. Nonstandard mapping can be divided into the second and third classes we will consider. In the second, which we will term transformational mapping, an algorithm relates the location of a visual stimulus to the direction of movement. In a common version of transformational mapping, a movement must be made at a 90° deviation from the visual stimulus (see Fig. 12, inset). The transform is generally the same for all stimulus locations. Although several investigators have referred to these transforms as 'rotations', we note that there are two distinct senses of that term. Rotation can describe any transformation of a given stimulus from location A to location B, when A and B are angularly displaced from one another. Rotation can also refer to the act of revolving, where intermediate stages between locations A and B are instantiated during the transition from A to B. The distinction between these senses of rotation will become important in considering the properties of the population vector. In the final class of nonstandard mapping we will discuss, the relationship between the visual stimulus and the movement is arbitrary. The location of the visual stimulus does not indicate the direction of movement to be made, nor does any algorithm relate stimulus location to movement direction. Instead, the system must store all of the learned stimulus-response pairings. In a typical arbitrary mapping task, nonspatial aspects of a visual stimulus, such as color, instruct a given movement direction or a movement target. Thus, we distinguish three classes of visuomotor behavior: standard, transformational, and arbitrary mapping.
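The three classes of mapping can be made concrete with a small sketch. It treats stimulus locations and movement directions as angles in degrees; the function names, the table name, and the color-to-direction pairings are hypothetical and serve only as illustration.

```python
def standard_mapping(stimulus_deg):
    """Standard mapping: the movement is directed at the stimulus location."""
    return stimulus_deg % 360

def transformational_mapping(stimulus_deg, offset_deg=90):
    """Transformational mapping: one rule (here a fixed angular offset,
    90 degrees as in the common version described in the text) relates
    every stimulus location to a movement direction."""
    return (stimulus_deg + offset_deg) % 360

# Arbitrary mapping: no rule relates stimulus to movement, so every
# learned pairing has to be stored. These pairings are made up.
LEARNED_PAIRINGS = {"red": 0, "green": 90, "blue": 180, "yellow": 270}

def arbitrary_mapping(stimulus_color):
    """Arbitrary mapping: look up the stored stimulus-response pairing."""
    return LEARNED_PAIRINGS[stimulus_color]

print(standard_mapping(45))            # 45
print(transformational_mapping(315))   # 45
print(arbitrary_mapping("green"))      # 90
```

Note that `transformational_mapping` captures only the first sense of 'rotation' discussed above, the relation between locations A and B; it says nothing about whether intermediate directions are instantiated along the way.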
Neurophysiological studies in nonhuman primates have pointed to motor cortex, especially the premotor areas, as playing an important role in the transformation of visuospatial signals into directional motor outputs (for a brief review, see Wise et al., 1997). Brain-imaging studies have confirmed the importance of visual inputs to the motor cortex of human subjects. Inoue et al. (1998), for example, reported that either the primary motor cortex (MI) or the dorsal premotor cortex (PMd) showed a significant increase in regional cerebral blood flow attributed to visual feedback of hand position during a pointing task. Other brain-imaging studies have, likewise, reported increases in motor cortical blood flow during visually guided reaching and pointing (Grafton et al., 1996; Matsumura et al., 1996). These blood flow changes seem to be especially prominent during periods of visuomotor learning (Kawashima et al., 1995), in accord with neurophysiological data showing changes in activity during such learning (Flament et al., 1993; Wise et al., 1998).
Questions concerning the population vector
In examining the population vector, we will not attempt a systematic survey of the literature (see Georgopoulos, 1991, 1995, 1996). Instead, we will address certain questions related to the three classes of visuomotor mapping listed above. We applied the first question to models of standard mapping. Do the preferred directions of model neurons correspond to their direct effects on the network's output elements? This question interested us because of the prevailing assumption that, for example, cells with leftward preferred directions should have strong connections to motor pools that produce leftward movements (Georgopoulos, 1995; Lukashin et al., 1996). This preconception makes sense from an engineering perspective: one might connect units with leftward preferred directions to left output units. However, in a distributed parallel processing network, a model neuron's preferred direction may not relate to its direct influence on output. To address the first question we examined the correlation between the preferred directions of model neurons and their synaptic weights upon the output units in two different models: a single-layer, fully recurrent neural network and a multi-layered model that simulates certain aspects of corticomotoneuronal organization. The second set of questions involves the properties of both units and the population vector during learning, which will be addressed for both standard and nonstandard mappings. Do hidden units in the network resemble neurons in the motor cortex of a monkey trained to perform analogous transformational mapping tasks (Wise et al., 1998)? And how does the population vector behave during initial learning or adaptation to novel mappings?
Standard mapping
Fully distributed model
In the present study and its predecessor (Lee, 1996), model neural networks were trained using a version of the Backpropagation Through Time learning algorithm, modified to approximate a continuous system of equations (Zipser et al., 1993). The algorithm evaluated error based on the difference between the mx and my given by the model and the values of mx and my required to connect the current hand position to the target location (Fig. 1, inset). A network minimized this error as it established a representation of the correct movement vector.
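The error signal used in training, the difference between the movement vector components (mx, my) produced by a network and those required to carry the hand from its current position to the target, can be sketched as below. The sketch works directly in Cartesian coordinates and uses a summed squared difference; both choices and the function names are assumptions of this illustration, since the text specifies only that the error was based on the difference.

```python
import numpy as np

def required_movement(hand_xy, target_xy):
    """Movement vector (mx, my) connecting the hand to the target."""
    return np.asarray(target_xy, dtype=float) - np.asarray(hand_xy, dtype=float)

def movement_error(predicted_m, hand_xy, target_xy):
    """Summed squared difference between predicted and required (mx, my).

    The squared-error form is an assumption of this sketch.
    """
    diff = np.asarray(predicted_m, dtype=float) - required_movement(hand_xy, target_xy)
    return float(np.sum(diff ** 2))

# Hand at (1, 1), target at (4, 5): the required movement is (3, 4).
print(required_movement((1, 1), (4, 5)))
print(movement_error((3, 4), (1, 1), (4, 5)))  # 0.0 for a perfect output
```

A learning algorithm such as backpropagation through time would adjust the network weights to reduce this quantity over the training set.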
As in motor cortex (Georgopoulos et al., 1982, 1983a), model neurons were found to be broadly tuned to movement direction and the population vector, computed on the basis of those preferred directions, accurately predicted the direction of network output (Lee, 1996; Moody and Zipser, 1998). Other similarities to motor cortex included starting-position dependency (Caminiti et al., 1991). The preferred direction of each model neuron was calculated by fitting the activity in response to eight equally spaced angular locations to a cosine curve. The peak of the curve determined the preferred direction. The dynamic range was defined as the difference between minimum and maximum activity across the eight test directions. The baseline was defined as the activity when the target input was the same as the current hand position input. In these respects, the tuning of model neurons resembled that seen in motor cortex and in several other parts of the motor system (Lee, 1996; Moody and Zipser, 1998). Thus, unlike some models of motor cortex, which imposed coarse directional tuning on the model's hidden units (e.g. Lukashin et al., 1996), application of the neural systems identification method showed that such constraints were unnecessary (Lee, 1996; Moody and Zipser, 1998). When the simulated hand position was fixed throughout testing, the network activity could be interpreted as a representation of intended movement, rather than as a simulated movement. The population vector and its error could then be computed based on the activity of individual recurrent neurons, in
[Figure 1 diagram: fully recurrent network with joint-angle inputs (target and current position) and outputs for the limb position coordinates and the sign of each coordinate.]
Fig. 1. Architecture for fully recurrent network model. Hidden units are indicated by the triangles (pointing right). Each hidden unit receives external input and feedback from other units. Only three model units are shown, but all network models contained between 9 and 25 units. Each model unit's activation was determined by a logistic function (see Moody et al., 1998). The network had four inputs, each in joint-angle coordinates. Two inputs reflected the current position of the limb, the other two inputs the target position. The output neurons are also indicated by triangles (pointing up). The four outputs represented signed Cartesian coordinates, with separate output units for the magnitude and sign of each dimension. The inset in the upper left shows a schematic of the simulated, two-joint arm and both the arm (at a starting position) and the target in joint-angle coordinates. The movement vector, m, is divided into its Cartesian components, mx and my. Adapted from Lee (1996).
a manner analogous to that used for motor cortex. The population vector accurately predicted the output of the network (Fig. 3A), with a mean error of 14 ± 10° and a maximum error of 30°. In the model, directional error in the network's output, which was trained directly, was about half that of the population vector. The Pearson's correlation coefficient, r², for the population vector and network output was ~0.99 for the network depicted in Fig. 3A,B. This degree of correlation and error were on the same order of magnitude as experimentally derived population vectors (Georgopoulos et al., 1983a). We then addressed the issue of hidden unit directionality vs. their direct influence on output. For each hidden unit with a significant preferred direction, the output weight vector was calculated as follows: the synaptic weights connecting a given
model neuron to the x and y outputs, respectively, were added vectorially to create a direction vector. For example, if a given model neuron was linked to the x and y output units with weights of +0.5 and +0.5, respectively, then the output weight vector would be 45°. As shown in Fig. 3B, the correlation between each neuron's preferred direction and the synaptic weight connecting that model neuron to the output units was 0.01 for the population of all model neurons. Notwithstanding this lack of direct effect on network output and as noted above, the model's population vector equaled or surpassed the accuracy of experimental data (Fig. 3A). Thus, while the population vector computation assumes that each unit contributes to network output in its preferred direction, the general expectation among neurophysiologists has been that this would be achieved by
connecting cortical neurons with motor pools tending to move the limb in the cells' preferred direction. However, the model network showed no orderly relationship between the directionality of a given neuron and the direct influence the model neuron exhibits upon movement. We repeated this finding for multiple implementations of models with the architecture shown in Fig. 1, with variations including polar, Cartesian, and joint-angle coordinate frames. The correlation of the population of hidden units with their direct output effect was always near zero. In order to understand this finding further, we examined why the hidden units were tuned and how the population vector functioned, despite the fact that one of its fundamental assumptions appeared to be violated. In the connectionist model of the sort illustrated in Fig. 1, we understand why and how the hidden units showed coarse directional tuning (Moody and Zipser, 1998). The directional tuning of a hidden unit was a function of its activity in response to different input patterns. A hidden unit was considered untuned if
its response was unaltered across various inputs (for example, Fig. 2A, the dash-dot line). A tuned hidden unit, on the other hand, exhibited different activation in response to various inputs (for example, Fig. 2A, the solid line). The activation of a model neuron was the nonlinear weighted sum of inputs and synaptic weights (black diamonds in Fig. 1) connecting the inputs to a given model neuron, and each hidden unit had a unique set of input weights. Note that different hidden units, with different input weights, could show similar tuning curves (Fig. 2D, solid and dashed), but, in general, preferred directions were well distributed in the workspace. Most hidden units had clear tuning functions. Coarse coding of movement direction will emerge from any fully recurrent system that computes a nonlinear input-output mapping, without additional constraints or assumptions. The tuning will be cosine-tuned because the movement can be characterized in terms of a direction and an amplitude, an assertion elaborated in detail by Lee (1996), ch. 2. Directional tuning was essential
Fig. 2. (A-E) Tuning curves for the 18 units comprising the model network depicted in Fig. 3. The x-axis gives stimulus location (0-315° in 45° steps); zero degrees is to the right. The y-axis is normalized to maximal hidden unit activity. Tuning curves are grouped in sets of four for illustrative convenience, not to indicate any relation between or differences among the groups. Note that there is a broad range of coarsely coded preferred directions. The inset in the lower left shows the convention for naming locations and directions.
to the operations of the network because the hidden units needed to differentiate among the various input patterns to subserve optimal network performance (Moody and Zipser, 1998). Cosine (or other broadly tuned) neurons are one of the two requirements of the population vector computation. The other is that the preferred directions are well distributed throughout the workspace. With these two criteria met, the population vector, based as it is on the assumption that units contribute output in their preferred direction, always accurately corresponds to movement direction. In the present model, the preferred directions were distributed throughout the workspace and the units were mostly cosine-tuned, as shown in Fig. 2. Therefore, it was not surprising that the population vector computation was accurate and highly correlated with the correct network output direction (Fig. 3A; r² = 0.99). [This perspective has been used to argue that the population vector is a tautology (see E.E. Fetz, discussion, p. 133 in Georgopoulos, 1987; see also Sanger, 1994), but some applications of the method suggest otherwise, a point we will take up below in the section entitled Arbitrary Mapping.] This explanation accounts for why the population vector works in the model network, not how it does so. That is, how could the hidden units produce an output in their preferred direction, if not through direct connections with the outputs? The answer is that the hidden units worked through all of their interconnections and not mainly through their direct influence on output neurons. This finding casts doubt on models of the motor system that depend on the activation of various combinations of motor primitives (see, for example,
[Figure 3: three scatter plots, panels A-C. Axis labels recovered from the original: 'Network Output' (A) and 'Hidden Unit Preferred Direction' (B, C), each spanning 0-360°. Panel statistics: A, r² = 0.99; B, r² = 0.01, P = 0.75; C, r² = 0.04, P = 0.37.]
Fig. 3. (A) The population vector computed from the hidden units correlates very closely with the direction of movement produced by the network in a simulation of an 8-target center-out paradigm (see Georgopoulos et al., 1983a). (B) The preferred directions of the hidden units do not, as a population, correlate with their direct influence on the output neurons. The hidden unit preferred direction was calculated by regression. The direct influence was calculated as the vector sum of the weights on the x and y outputs (output weight vector). A and B from a fully recurrent network with the architecture illustrated in Fig. 1. (C) Data comparable to B, for a network simulating motor pools, in contrast to the fully recurrent network in B.
d'Avella and Bizzi, 1998). As applied to motor cortex, such an architecture might be instantiated as a set of hidden units producing outputs in a given movement direction, with varying directions of movement being obtained by recruiting combinations of these canonical vectors. Unlike models of M1 neurons that have enforced this kind of architecture (Lukashin et al., 1996), the current approach shows that such constraints are unnecessary either for the accuracy of the population vector or for the mapping computed by the network. Models based on motor primitives may prove useful for understanding lower-order components of the motor system, but the findings from neural network models suggest that they are not necessary to produce the outputs and transformations observed at the cortical level. Of course, many alternative and more anatomically realistic models of the motor cortex have been developed (e.g., Kuperstein, 1988; Mel, 1991; Burnod et al., 1992; Bullock et al., 1993; Cisek et al., 1998). Most have emphasized the transformation of input coordinate space to one more suitable for controlling movements. We will not venture into that terrain here, except to note that for visually guided movements this transformation is most likely from retinal space to joint coordinate space (Scott and Kalaska, 1997), rather than, as in the present model, from joint-angle space into Cartesian coordinates. However, the fundamental operations of the network would not have been affected by the particular coordinate frames chosen for input and output, given a nonlinear mapping.
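The population vector computation discussed above can be illustrated with a minimal Python sketch. It is not the trained network of Fig. 1: the 18 units here are hard-wired with cosine tuning and evenly spaced preferred directions, and the names `cosine_rate` and `population_vector`, the baseline and the gain are assumptions made purely for illustration. Under those two conditions (broad cosine tuning, well-distributed preferred directions), the weighted vector sum recovers the movement direction, whatever the units' actual connections to the outputs may be.

```python
import math

def cosine_rate(direction_deg, preferred_deg, baseline=0.5, gain=0.5):
    """Activity of one cosine-tuned unit for a given movement direction.
    The baseline and gain values are illustrative assumptions."""
    return baseline + gain * math.cos(math.radians(direction_deg - preferred_deg))

def population_vector(rates, preferred_degs):
    """Sum the units' preferred-direction vectors, each weighted by the
    unit's activity relative to the population mean, and return the
    direction of the resulting vector in degrees (0-360)."""
    mean_rate = sum(rates) / len(rates)
    x = sum((r - mean_rate) * math.cos(math.radians(p))
            for r, p in zip(rates, preferred_degs))
    y = sum((r - mean_rate) * math.sin(math.radians(p))
            for r, p in zip(rates, preferred_degs))
    return math.degrees(math.atan2(y, x)) % 360.0

# 18 hypothetical units with preferred directions every 20 degrees.
preferred = [i * 20.0 for i in range(18)]

# Eight movement directions, as in a center-out task.
for movement in range(0, 360, 45):
    rates = [cosine_rate(movement, p) for p in preferred]
    print(movement, population_vector(rates, preferred))
```

Weighting by activity relative to the population mean is one common variant; other baselines could be substituted without changing the point of the example.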
Corticospinal model

It could be argued that the lack of correlation shown in Fig. 3B resulted from the architecture of the network. On this view, the fact that the network was fully recurrent led to a more distributed solution to the input-output problem than would more realistic architectures. To examine this possibility, we constrained the architecture of the model to more closely resemble the corticomotoneuronal system. That model incorporated additional layers of network model neurons that corresponded to 'corticospinal neurons' and 'motor pools', respectively. The motor pools and corticospinal neurons had explicit directionality. There were two separate motor pools, one each representing the x and y axis of a two-dimensional movement. There were also two corresponding corticospinal populations, each connected directly to its respective motor pool. A recurrent pool of model neurons was connected to all input lines, and then each model neuron was connected to both corticospinal pools. The aim was not to mimic the nervous system very precisely, but rather to examine what might be undermining the tendency toward a correlation of hidden-unit preferred direction and output effects. We then examined the relationship between the preferred directions of the 'cortical' recurrent pool of hidden units and their synaptic weights to the 'corticospinal neurons'. As with the single-layered, fully recurrent networks (Fig. 3B), we found no correlation between a model neuron's preferred direction and the synaptic weight upon the output unit (r² = 0.04, P > 0.36) in the multi-layered model (Fig. 3C). This finding confirmed that the fully recurrent architecture of the previous model was not the sole factor undermining the assumed relationship between hidden unit preferred direction and direct output effects. Instead, the lack of such a correlation appeared to be a general property of distributed, nonlinear mapping networks.
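The two quantities compared in Fig. 3B and C can be sketched as follows. Preferred direction is estimated by regressing a unit's activity on the cosine and sine of the stimulus direction, and the direct output effect is taken as the direction of the vector formed by the unit's weights onto the x and y outputs. The function names, the closed-form fit (valid only for evenly spaced directions spanning the full circle) and the example numbers are assumptions for illustration, not the authors' code.

```python
import math

def preferred_direction(activities, directions_deg):
    """Fit a = b0 + b1*cos(theta) + b2*sin(theta) by least squares.

    Assumes the directions are evenly spaced around the full circle
    (at least three of them), in which case the coefficients b1 and b2
    have the closed form used below.
    """
    k = len(directions_deg)
    b1 = 2.0 / k * sum(a * math.cos(math.radians(t))
                       for a, t in zip(activities, directions_deg))
    b2 = 2.0 / k * sum(a * math.sin(math.radians(t))
                       for a, t in zip(activities, directions_deg))
    return math.degrees(math.atan2(b2, b1)) % 360.0

def output_weight_direction(w_x, w_y):
    """Direction of the vector formed by a unit's weights on the x and y outputs."""
    return math.degrees(math.atan2(w_y, w_x)) % 360.0

def angular_difference(a_deg, b_deg):
    """Smallest absolute angle between two directions, in degrees (0-180)."""
    return abs((a_deg - b_deg + 180.0) % 360.0 - 180.0)

# Hypothetical unit: tuned to 135 degrees, with output weights pointing elsewhere.
directions = [i * 45.0 for i in range(8)]
activity = [0.5 + 0.4 * math.cos(math.radians(t - 135.0)) for t in directions]
pd = preferred_direction(activity, directions)
owd = output_weight_direction(0.8, 0.1)
print(pd, owd, angular_difference(pd, owd))
```

A unit like this hypothetical one, whose preferred direction and output weight vector disagree, is the kind of case that drives the population-level correlation toward zero.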
Transformational mappings

There are many kinds of transformational mappings, i.e., any number of algorithms can relate input to output in a nonarbitrary manner. The most common kind of transformational mapping that has been studied involves an angular deviation between the location of a stimulus and the direction of movement, often termed a rotation (see Fig. 12, inset).
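As a concrete instance of such an angular deviation, the sketch below rotates the vector from the hand to the stimulus by a fixed angle to obtain the required movement target. The function name `remap_target`, its arguments and the coordinate values are hypothetical; a rotation of 0° corresponds to the standard mapping, in which the limb moves to the stimulus itself.

```python
import math

def remap_target(hand_xy, stimulus_xy, rotation_deg):
    """Return the movement target obtained by rotating the hand-to-stimulus
    vector by rotation_deg (counterclockwise positive) about the hand."""
    dx = stimulus_xy[0] - hand_xy[0]
    dy = stimulus_xy[1] - hand_xy[1]
    c = math.cos(math.radians(rotation_deg))
    s = math.sin(math.radians(rotation_deg))
    return (hand_xy[0] + c * dx - s * dy, hand_xy[1] + s * dx + c * dy)

# Standard mapping (0 degrees) versus a 90-degree counterclockwise rotation
# for a stimulus directly to the right of the hand.
standard = remap_target((0.0, 0.0), (1.0, 0.0), 0.0)
rotated = remap_target((0.0, 0.0), (1.0, 0.0), 90.0)
print(standard, rotated)
```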
Single-input network

Single-input networks received a set of input signals once, then produced an output. We will discuss a model that underwent two different training stages: (1) from naive (i.e., a random initial state) to the mastery of a standard mapping, and (2) from the standard mapping to the mastery of a transformational mapping. As the network model learned to perform these tasks, the correlations between the model neurons' preferred directions and weights upon outputs were
monitored, as well as the correlation between the model neurons' population vector and the model network's output. The architecture of the model network was like that depicted in Fig. 1. The network began from a state of randomly initialized synaptic weight values, with all values in the range [-1.00, 1.00]. In such networks, the population vector developed at approximately the same rate as the network output (Fig. 4). As shown in Fig. 4A, the output of the population vector quickly came to correlate with the network output direction. This particular network reached a correlation of r² ≈ 0.95, somewhat lower than the network illustrated in Fig. 3A, but nevertheless a reasonable correspondence. Had the network been trained longer, it seems likely that we would have obtained a higher correlation. However, instead of continuing to train the network on the standard mapping task, we changed the input-output requirements to simulate a 90° counterclockwise transformational mapping, the results of which are shown in Fig. 5. As
with the initial training from a random weight state, training from the standard-mapping weight state resulted in a period of poor performance, followed by convergence on the desired output. The network eventually produced an output that came within a mean of 5 ± 4° of the target direction, averaged over eight movement directions. The correlation of the population vector with the network output improved to r² ≈ 0.9 and, notwithstanding early periods of unstable performance, the population vector's error decreased to < 10°. From a qualitative perspective, learning the standard mapping from a random state (Fig. 4) did not differ in any remarkable way from learning the nonstandard mapping from the standard mapping state (Fig. 5). In a sense, the similarity of the data shown in Figs. 4 and 5 was not surprising because, as in biological systems (Held and Bauer, 1974), even a standard mapping must be learned and we know of no reason to suppose that initial-learning mechanisms differ from those underlying adaptations to alternative mappings. However, from another perspective, one could imagine that the weight state used for standard mapping could impede learning the transformational mapping or, conversely, that the ability to perform a (standard) mapping, which shared many of the features of the transformational mapping, might have promoted adaptation. Our models supported neither supposition. Examination of hidden unit properties, as they changed during adaptation to the 90° transform, allowed a comparison with analogous neurophysiological data. Fig. 6 shows the tuning curves of six hidden units from a model network of the kind illustrated in Fig. 5. During the initial cycles (solid lines), the tuning curves had not changed from those observed during optimal performance of the standard mapping. The tuning curves of the same hidden units changed once the transformational remapping had been learned (dashed lines). Note the wide variety of changes in the preferred direction, width and depth of the tuning curves for these model units (Fig. 6).

Fig. 4. Initial learning of standard input-output mapping (panel title: Random to Standard; x-axis: time in cycles × 10^4). In standard mapping, the simulated limb moves to the target. (A) Correlation of population vector and network output. Beginning from random weights, the population vector of the network's hidden units has, at first, a low correlation with the direction of network output. (B) Network output error. The grand mean error of the population vector decreases during learning, as does the error of the network output.

Fig. 5. (A,B) Learning a nonstandard mapping (panel title: Standard to 90-degree Rotation; A, correlation of population vector and network output; B, network output error; x-axis: time in cycles × 10^4). In this version of nonstandard mapping, the simulated limb moves at 90° counterclockwise to the target. Such a remapping, based on angular deviation, is often termed a 'rotation' task.
For example, the hidden unit illustrated in Fig. 6A was well tuned for the intermediate stimulus locations in the later cycles, once the remapping had been learned. However, the same unit was much less distinctly tuned, and to a different preferred direction, in the early cycles when the tuning curve reflected that for the standard mapping. Fig. 6C shows a different hidden unit, one that remained well tuned during both the standard and nonstandard mapping, but shifted preferred direction. The hidden unit illustrated in Fig. 6B increased its depth of tuning during adaptation, but retained the same preferred direction in stimulus coordinate space. The hidden unit shown in Fig. 6D mainly became more deeply tuned, whereas that in Fig. 6F became more narrowly tuned. The same general phenomena were observed in motor cortical cells during the adaptation to a 90° clockwise or counterclockwise angular deviation (Wise et al., 1998). Fig. 7 shows tuning curves from six motor cortical neurons. For example, as with the model unit in Fig. 6D, the cortical unit in Fig. 7D became more
Fig. 6. (A-F) Changes in tuning properties of 6 of the network's hidden units, as the network learns the novel, nonstandard mapping depicted in Fig. 5 (panel title: Neural Network Model; x-axis: stimulus location; y-axis: normalized activity, 0.0-1.0). Each plot shows the data for a single hidden unit. The heavy, solid lines (early) indicate the unit's tuning during the first trials of the remapping, which corresponds to the weights obtained at the end of training the network to perform the standard mapping task, as shown in Fig. 4. The dashed lines (late) show the same units' tuning after the nonstandard mapping has been learned, as shown in Fig. 5.
deeply tuned without a dramatic change in preferred direction. Other cells, such as those illustrated in Fig. 7A,C, showed a dramatic change in preferred direction, much like the hidden unit in Fig. 6C. To put these comparisons on a more quantitative basis, we examined three measures of directional tuning, employed previously in comparing the tuning properties of biological and model units (Moody et al., 1998). The method for calculating preferred direction has been described above. Two other indices were computed, using the equations shown in Fig. 8. The selectivity index (s) for each unit (i) was calculated as illustrated in the equation in the upper left part of the figure, where k was the number of stimulus locations, i_n was the activity of a neuron for each stimulus location (in this case for the period immediately before the onset of movement, or the analogous period in model units), and i_max was the maximum activity for the set of stimulus locations.
The depth of tuning index (d) was calculated as shown in the lower left of Fig. 8, where i_min was the minimum activity for the set of stimulus locations. As shown in the figure, high values corresponded to narrowly and deeply tuned units. Fig. 9 compares the population of hidden units of a model network to data obtained from the motor cortex of monkeys (Wise et al., 1998). Specifically, the figure plots the change in each of the three tuning indices from the early trials to the late trials during the adaptation to a 90° clockwise or counterclockwise rotation. We will not labor the comparison, except to note the general similarity. All of the general kinds of changes in tuning that were observed in the monkey, such as increases or decreases in selectivity, were also seen in the model network and, with few exceptions, the magnitude of change was similar. A comparison of the model units to the monkey's showed that the changes in preferred direction did not significantly
Fig. 7. (A-F) Single-neuron activity from the motor cortex of a monkey adapting to a 90° counterclockwise transformational mapping, in the format of Fig. 6 (x-axis: stimulus location in degrees). Error bars, standard deviation. Each point is an average of three trials, either at the beginning or end of an adaptation block (see Wise et al., 1998).
differ (Wilcoxon rank-order test, P > 0.16), nor did the changes in the selectivity index (P > 0.12), although the changes in tuning depth were marginally different (P = 0.049). In further examining the neurophysiological data, we found that a given motor cortex unit might have different tuning properties during different blocks of the standard-mapping task (Wise et al., 1998; see also Gandolfo et al., 2000). For example, we studied the same neuron for a block of standard-mapping trials, followed by a block with some transformational mappings, and a third block of return to the standard mapping. We had assumed that the cells would return to their former properties upon re-adaptation to the standard mapping task. However, only about half of the motor cortical neurons did so (Wise et al., 1998). Accordingly, we examined the neural network model to see if a similar phenomenon occurred. Fig. 10
shows the tuning of five model units. Each row of Fig. 10 shows the activity of a single hidden unit, and the data are the same in all three columns. Highlighted in each column, by a thicker, broken line, are the tuning curves for the mapping stated at the top of the column. Thus, reading from left to right in the top row, one can see that the hidden unit was well tuned to about 180° once the standard mapping had been first learned (left column). Then, after the same network mastered the nonstandard mapping, the cell became completely untuned (middle column), but returned to its original tuning during the retraining to a standard mapping (right column). The hidden unit in the second row shows a very different pattern. Like the unit in the top row, the cell lost its tuning after the transition from standard mapping to a 90° counterclockwise remapping. However, when the network was retrained on standard mapping, the
Fig. 8. Graphical depictions of our definitions of two tuning parameters measured for both hidden units in the model network and in motor cortical neurons in the biological network. The selectivity index (s) measures the degree to which the unit differentiated among the various spatial locations: $s_i = \left(k - \sum_{n=1}^{k} i_n / i_{\max}\right) / (k - 1)$. The depth of tuning index (d) measures the signal modulation for a given tuning curve: $d_i = (i_{\max} - i_{\min}) / i_{\max}$. Example curves in the original figure are labeled with low and high index values (s = 0.1; d = 0.1 and d = 0.9); x-axis: stimulus location. These measures were used in the analysis shown in Fig. 9. Adapted from Moody et al. (1998).
unit remained weakly tuned and much more closely resembled its tuning for transformational mapping. That hidden unit was essentially recruited out of the network, in a functional sense, by the remapping. The bottom three rows show different patterns, but each unit failed to return to its original properties when the network was retrained on the original standard mapping task. From a connectionist perspective, this finding was entirely unsurprising. There was no reason to suppose that the network would solve a given input-output function the same way once the weights had been changed to compute a different mapping function. However, this finding does illuminate the similar observations made in the monkey's motor cortex. Fig. 11 shows single-unit data from the monkey motor cortex in approximately the same format as for the model's hidden units in Fig. 10. The same general phenomena were observed. This finding undermines the simple concept that a given motor cortical cell has an intrinsic preferred direction and directional tuning, and that afferents to the motor cortex need only activate the appropriate
combinations of those tuned neurons to achieve a given output. Instead, it is clear from this work and from several other studies of motor cortex (Li et al., 2001) that the preferred directions, depth of tuning, and directional selectivity of an individual unit change dramatically during the life of a network, be it artificial or biological. As noted above in the section entitled Standard Mapping, the computation of the output resulted from the conjoint operations of the population as a whole. This conclusion held for nonstandard, transformational mappings as well.
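The two tuning indices used in these comparisons (Fig. 8) can be computed with a few lines of Python. The depth of tuning follows the definition given in the text; the exact form of the selectivity index is reconstructed here from the variables the text defines (k, i_n, i_max) and should be treated as an assumption to be checked against Moody et al. (1998). The function names and example tuning curves are hypothetical, and activities are assumed to be non-negative with a positive maximum.

```python
def selectivity_index(activities):
    """Assumed form: s = (k - sum(i_n / i_max)) / (k - 1).
    0 for a flat tuning curve, 1 when only one location evokes activity."""
    k = len(activities)
    i_max = max(activities)
    return (k - sum(a / i_max for a in activities)) / (k - 1)

def depth_of_tuning(activities):
    """d = (i_max - i_min) / i_max: modulation relative to peak activity."""
    i_max = max(activities)
    i_min = min(activities)
    return (i_max - i_min) / i_max

# Hypothetical tuning curves over eight stimulus locations.
flat = [2.0] * 8
broad = [1.0, 0.9, 0.7, 0.5, 0.4, 0.5, 0.7, 0.9]
narrow = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]

for name, curve in (("flat", flat), ("broad", broad), ("narrow", narrow)):
    print(name, selectivity_index(curve), depth_of_tuning(curve))
```

Applying both functions to a unit's early-trial and late-trial tuning curves and taking the difference gives the kind of change score compared across model and monkey.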
Multiple-input network

A different recurrent neural network model of motor cortex was designed to simulate some aspects of movement dynamics, while introducing as few arbitrary initial constraints as possible. The inputs and outputs of this model network were the same as those illustrated in Fig. 1, but the training was different. In the model discussed earlier, there was always a current position and a target position, and
the network output was the vector that connects those two points in the workspace. In the model to be discussed presently, the network output took fixed steps between the current position and the target. A step size of about 7% of the average distance between a given current position and a given target was chosen. Further, target switches were allowed to occur unexpectedly. Changes in direction may have occurred after a target was reached or in mid-reach when the current target was replaced by a new target. This training regime generated a continuous stream of target-location and hand-location input pairs, covering all of the workspace (Moody and Zipser, 1998). When the hand reached the target, it paused for a random interval and then pursued the next target. If the target changed location, the hand either changed direction and moved toward a new target immediately, or paused. The rate of hand movement, and the probabilities of target-location change, hand stops and hand starts were all parameters generated by the training program.

Fig. 9. Comparison of neural network model (top row) and motor cortical cells (bottom row) in changes of tuning properties during learning. Both the network model and the monkey learned a 90° nonstandard mapping during the monitoring of activity. Three tuning parameters were examined: preferred direction, selectivity and depth of tuning. Each histogram shows the change in that parameter for the first three trials vs. the last three trials of a learning block. During the learning block, the system begins with a different mapping function and ends with successful remapping to a 90° angular deviation ('rotation').

Fig. 10. Effect of relearning the standard mapping on tuning properties of the network's hidden units. Columns: 1st Standard, 90-degree Transform, 2nd Standard. Row labels in the original figure: Return, Stay, Continue, Intermediate, Task-specific tuning. The x-axis gives stimulus location. See text for details.

Before we consider the issue of population vector rotation, three kinds of change in target signals need to be distinguished, each of which has been investigated experimentally. One approach involved a discrete change in location of a visual target just after movement began (Georgopoulos et al., 1983b). This manipulation led to a continuous change in the direction of hand movement from pointing toward the first target to pointing to the second. A different experimental paradigm involved continuous movements along complex visually specified trajectories such as ellipses and sinusoids (Schwartz, 1993, 1994). The population vector also rotated smoothly during these movements such that it pointed in the direction of hand motion. Yet another kind of signal change, and the one we will focus on here, required movement in a direction offset by a fixed angle from the direction of a visual target (Georgopoulos et al., 1989; Lurito et al., 1991). It has been argued that the actual direction of motion must have been computed internally by 'mental rotation'. The mental rotation was hypothesized to involve the creation and rotation of a population vector from (near) the direction of target location to the direction of actual movement. Experimental results revealed rotation of the population vector in this task, in apparent confirmation of this idea (Georgopoulos et al., 1989; Lurito et al., 1991). The population vector first pointed in the direction of the visual target and then rotated in the appropriate direction until it reached the direction of actual movement. Only then did movement occur. We have confirmed that the population vector rotates in a model network under these conditions, but the interpretation in terms of mental rotation deserves closer examination. The model supported an alternative concept equally well. If target signals
were simply switched, a smoothly rotating population vector resulted (Lee, 1996). This finding leads to an alternative interpretation of population vector rotation, which has also been suggested by Whitney et al. (1997) and by Scott and Cisek (1997, 1999). The dynamical properties of model neurons were analyzed by simulating tasks similar to the ones used in experiments with monkeys. However, there was one important difference between how the model was tested and how experiments were generally performed. During testing of the model, only the movement representation changed; no simulated hand movement occurred. This was accomplished by fixing the hand-position input to correspond to a point in the center of the training region, then measuring responses to targets at various locations. To simulate the task, the visual input was assumed to be supplied to motor cortex. Somehow, this supplied target triggered the new, rotated target location. This
Fig. 11. Effect of relearning the standard mapping on tuning properties of motor cortex neurons. Columns: 1st Standard, 90-degree Transform, 2nd Standard. Row labels in the original figure: Stay, Return, Continue. The x-axis gives stimulus location. Each row shows the tuning curves for a single motor cortex cell. The middle column shows variance in standard deviation. See text for details.
second, substitute target was presumed to be computed elsewhere, either on the basis of an algorithm (for transformational mapping tasks) or a look-up table (for arbitrary mapping tasks). These assumptions accord well with findings that signals in premotor cortex reflect the movement to be made on the basis of a visuospatial signal (di Pellegrino and Wise, 1993) and the target of the movement (Shen and Alexander, 1997a, b), and with the finding that certain premotor areas are necessary for arbitrary mapping (Passingham, 1993). In the model, the stimulus signal (representing standard mapping) decreased in activity as the target signal (representing transformational or arbitrary mapping) increased in activity. When the two targets were presented sequentially to the network model, the population vector rotated as a function of time. Fig. 12 shows the rotation of the neuronal population vector alongside the model population vector. In the
model, the rotation of the population vector resulted from a combination of two factors: (1) a switch between two targets, the initial standard one and the final nonstandard one, and (2) the internal dynamics of the network, which in turn were engendered by the training paradigm described above. The phrase 'internal dynamics' refers to the way in which the hidden units' activation was calculated. At time t, a hidden unit's activity was computed based on the current input values to the hidden unit as well as the unit's activity at time t - 1. In other words, the history of activity of a given hidden unit influenced its present activation. The network progressed toward the target location in small, fixed increments, taking roughly 15 time steps to reach a target. Together, the small step size and the recurrent model connections ensured that the population vector would reflect a smooth rotation from the current target to a newly instantiated target.
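The two factors just described, a discrete target switch and unit activity that depends on its own value at time t - 1, are enough to produce a smoothly rotating population vector even in a deliberately simplified setting. The sketch below is not the trained recurrent network: it assumes hard-wired cosine-tuned units, a simple leaky update with an arbitrary `decay` parameter, and activity expressed relative to baseline (so it may be negative). All names and parameter values are hypothetical.

```python
import math

def simulate_target_switch(first_deg, second_deg, n_units=18, steps=15, decay=0.8):
    """Switch the target from first_deg to second_deg at time 0 and return
    the population vector direction (degrees) after each time step.

    Each unit's activity at time t mixes its activity at t - 1 with the
    drive from the new target, a crude stand-in for recurrent dynamics."""
    preferred = [i * 360.0 / n_units for i in range(n_units)]
    # Steady-state activity (relative to baseline) for the first target.
    activity = [math.cos(math.radians(first_deg - p)) for p in preferred]
    angles = []
    for _ in range(steps):
        drive = [math.cos(math.radians(second_deg - p)) for p in preferred]
        activity = [decay * a + (1.0 - decay) * d
                    for a, d in zip(activity, drive)]
        x = sum(a * math.cos(math.radians(p)) for a, p in zip(activity, preferred))
        y = sum(a * math.sin(math.radians(p)) for a, p in zip(activity, preferred))
        angles.append(math.degrees(math.atan2(y, x)) % 360.0)
    return angles

# A stimulus at 0 degrees followed by a required movement at 90 degrees.
for step, angle in enumerate(simulate_target_switch(0.0, 90.0), start=1):
    print(step, angle)
```

No intermediate directions are ever supplied as input in this sketch; the gradual sweep arises only because each unit carries over part of its previous activity.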
Fig. 12. Rotation of the neuronal population vector from neurophysiological data (Monkey) and model (Model). The stimulus and movement directions are indicated by the lines in the inset at the top. Left: The population vector in two-dimensional space is shown for successive time frames (labeled 90-200 ms in 10-ms steps) beginning 90 ms after stimulus onset. Adapted from Georgopoulos et al. (1989), with permission. Right: Network model reproduction of rotation of the population vector using a target-switch paradigm (the target change is marked in the original figure). Adapted from Lee (1996).
Following the first report of this possible interpretation of population vector rotation (Lee, 1996), two other groups of investigators (Scott and Cisek, 1997, 1999; Whitney et al., 1997) addressed the same issue in somewhat different ways. These two models essentially replicated population vector rotation by providing their respective models with two asynchronous inputs. Both of these models, however, contain 'hard-wired' cosine tuning and preferred directions to span the reaching space. But both models go beyond the findings of Lee (1996) to address the activation of cells, reaction-time effects and intermediate preferred directions. The model of Whitney et al. accounted for reaction-time differences that increase with angular difference, including the special case of short reaction time for the 180° transformation case. Reaction time was defined using the termination condition that (1) the angle of the population vector essentially became constant and (2) the length of the population vector reached a certain arbitrary length. The direction of the population vector was determined by an equation
that was always 0 with a transformational angle of 180° [because sin(180°) = 0 and arctan(0) = 0]. So, with a transformational angle of 180°, the population vector angle instantly stabilized, which accounted for the dramatic drop in reaction time. Scott and Cisek also showed an effect of spatial stimulus-response incompatibility on reaction time. The reaction-time effects in their model were more similar to what has been observed experimentally in certain transformational tasks (di Pellegrino and Wise, 1993) and were much smaller than those observed in experiments involving the mental rotation of visual images. Both the Whitney et al. and the Scott and Cisek models addressed the issue of the activity of cells with intermediate preferred direction engaging during the transformation. Scott and Cisek described the activation of cells with intermediate preferred directions as a result of the timing difference for nonpreferred directions (Scott, 1997). Cells 45° off the preferred direction were activated later than those at the preferred direction. This phenomenon, alone, would account for the activation of intermediate-tuned neurons during the transformation. In the present model, cells with intermediate preferred directions were not activated. However, the evidence put forward in transformational mappings has not been entirely satisfying, in our opinion, and therefore the reported findings cannot be considered dispositive. In summary, three different simulations have shown that a population vector could rotate smoothly as a result of a discrete change from a standard mapping input to a nonstandard one. There was no need to posit an internal computation that smoothly transforms the standard input into the transformed output. This conceptualization of transformational mapping thus distinguishes population-vector rotation from the perceptually based transforms of mental imagery, in which shapes must be transformed in analogical fashion.
Arbitrary mappings

The third class of visuomotor mappings is perhaps the most flexible. In arbitrary mappings there is no systematic relationship between the visual stimulus and the direction of action. Instead, the response direction must be selected on the basis of a memorized 'look-up table'. In neurophysiological studies of motor cortex activity change during the learning of arbitrary mappings, often termed conditional motor learning, neurons changed activity levels (Mitz et al., 1991; Chen and Wise, 1995) and preferred direction (Chen and Wise, 1996). These findings lead to the same questions as addressed above for the transformational type of nonstandard mapping: How does the population vector change during learning to correspond to network output? As with the previous classes of visuomotor mappings, we used neural systems identification methods to simulate the mappings that occurred. Neural networks could master arbitrary mappings and learn reversals of some mappings without losing the ability to perform familiar and unchanged mappings simultaneously, just as experimental subjects can. However, the mechanisms of the network were sufficiently straightforward that they will not be described here. Other neural network models have been published (Fagg and Arbib, 1992; Dominey et al., 1995), and the reader is referred to those publications.

Chen and Wise (1997) performed a population vector analysis on the neurons with learning-dependent activity in the supplementary eye field (SEF). As always in a population vector computation, some reference or canonical preferred direction must be used. Because the preferred direction of most SEF neurons changes, we chose to regard each cell's preferred direction for the set of four familiar stimuli, during the period just before and just after the saccade, as the canonical preferred direction. We used a mean circular weighted vector as a measure of the cell's preferred direction. Fig. 13 shows the video display presented to the monkeys. The monkey fixated the center of the display, where a multidimensional visual stimulus appeared. Each stimulus was mapped to one of the four potential eye-movement targets that appeared 1.5-3.0 s later. At the time a fixation point disappeared, the monkey was required to make a saccadic eye movement to one of the four targets. Only one target was correct for each stimulus, and the monkey was presented both with familiar stimuli, for which the mappings had been well learned, and novel ones that the monkeys had not previously seen. Fig. 14 shows the development of a preferred direction for one SEF
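Fig. 13. Schematics of the video screen as it appeared before monkeys performing an oculomotor version of the arbitrary mapping task. Panel labels: Instruction Stimulus, Delay Period, Trigger Stimulus. Timing labels recoverable from the figure: 0.5 s; 1.5-3.0 s; 0.55 s to acquire target; 0.6 s fixation of target.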
Fig. 14. Development of directional selectivity during learning, from one SEF neuron. Activity during learning is divided into four novel phases, with different symbols marking different eye movement directions. The rightmost data points (familiar) show the activity of the same cell during performance of arbitrary visuomotor mappings according to four familiar stimuli. The bar at the far right of the figure shows the reference (Ref) or 'background' activity level ± S.D. From Chen and Wise (1996). Labels recoverable from the figures on this page: SEF Population Vector; Mapping; Down, Right, Left, Up; Novel; Familiar; Ref.
neuron. During the early stage of learning the arbitrary mappings, the cell showed little or no modulation. Later, once the set of visuomotor mappings had been mastered, the cell showed a distinct preference for saccadic eye movement down and to the right. Figs. 15 and 16 show the results of our population vector analysis (Chen and Wise, 1997). The dashed curves in Fig. 15 show the error, defined as the angular difference between the output of the population vector and the ideal saccade direction, for the familiar instruction stimuli. In a sense, those computations were tautologous in that they are population averages of the cells based on the assumption that the cells contributed to a movement in their preferred directions. As discussed above, when the individual tuning functions are reasonably broad and well distributed across the workspace, the population vector will always correspond approximately with the direction of movement. However, the data for novel instruction stimuli were not subject to this objection. For those population vectors, the data for novel stimuli were computed with respect to the preferred directions for the trials with familiar stimuli. In this experiment, novel and familiar instructions were interleaved from trial to trial, to eliminate the concern
Fig. 15. Error in degrees for population vectors based on familiar instruction stimuli (solid lines) and those based on novel stimuli (dashed lines). All four saccade directions are averaged. Error bars: S.E.M. See text for details. Adapted from Chen and Wise (1997). X-axis: Learning Phase (early, middle, late, established).
that the overall excitability of the cell changed during learning. As shown in Fig. 15 for an average of all response directions, the population vector error for peri-saccadic activity decreased from approximately 70° to the level seen for the familiar stimuli as learning progressed through its early, middle and later phases. The ~70° error may not seem large, but because random variation for a circular error is 90°, the population for the earliest trials conveyed little, if any, information about the direction of the saccadic eye movement made on that trial. With learning, the population came to provide as much information about the direction of the saccade for the novel stimuli as for the familiar stimuli. This improvement in the accuracy of the population vector during the learning of arbitrary mappings resembled, at least in a qualitative manner, that seen for standard (Fig. 4B) and transformational, nonstandard (Fig. 5B) mappings. As the peri-saccadic SEF population vector became more accurate, it increased in length, which
Fig. 16. Perisaccadic population vectors for trials with novel (A) and familiar (B) instruction stimuli (IS). Both sets of population vectors are divided into the same learning phases; those for novel stimuli are displayed on separate reference circles. See text for details. Adapted from Chen and Wise (1997). Panel titles: A, Novel-IS Trials; B, Familiar-IS Trials. Learning-phase labels: early, middle, late.
reflects a stronger signal as learning and consolidation progress (Chen and Wise, 1997). This is shown in Fig. 16, which shows the same result as Fig. 15, but in polar coordinates. The origin of each population vector was placed at the perimeter of a circle. For example, the peri-saccadic population vector for upward saccades originates from the top of the circle for each of the first three phases of learning (early, middle, and late, from left to right in Fig. 16A). As shown in Fig. 16B, data for trials with the familiar instruction stimuli, even when the trials were broken down by the same phases of learning, pointed in roughly the correct direction. For novel instruction stimuli, early during learning, the population vectors did not point in the direction of the correct saccade. As outlined above, only correctly executed trials contributed to the population average, so the monkeys always moved their eyes to the correct target for the data shown, even for the earliest phases of learning. But the population vector was nevertheless highly inaccurate. Note, for example, the population vector for novel stimuli instructing
upward saccades. It began the learning period at a virtually random orientation, pointing roughly to the right rather than up, about 90° from the direction of the saccade. During the middle phase of learning, it remained inaccurately oriented, even as the population vector for the other directions improved. During the late phase of learning, the population vector for upward saccades lengthened and adopted the accurate orientation seen in the upper right part of Fig. 16.

Conclusions

Despite the many disparities with biological systems, the study of neuronal network models has contributed to our understanding of certain issues concerning population coding in the motor cortex.

(1) Single unit activity, alone, accounted for input-output transforms in distributed neural networks. There was no need to refer to transient synchrony or correlation for a causal account of transforms of the sort studied in the motor cortex, notwithstanding the additional information that
may be reflected in those measures (cf. Riehle et al., 1997). We view those measures as an attractive epiphenomenon, most likely lacking causal contributions to the cortical network's mapping function. One corollary to this conclusion is that serial recording during a stereotyped task is no handicap in understanding population coding, assuming a reasonable neuronal sample and a hypothesis related to relatively stereotyped input-output mappings. We do not mean to disparage recent work involving simultaneous recording of single-unit activity with multiple electrodes, but rather to direct such analysis toward the problems to which it can be applied most usefully. Complex, emergent features of neuronal ensembles may not be needed for an understanding of input-output mappings, but may play a role in more fundamental information-processing functions such as learning, coordination among parallel networks, and attention to action.

(2) The population vector could be computed from the activity of hidden units, but its accuracy did not depend on direct influences of hidden units on output elements. Instead, all of the interactions of the neuronal network acted in concert to produce the network output. Cosine-like tuning and population vector coding emerged from a minimal-assumption network and need not be imposed.

(3) The population vector rotated when the input-output function was changed from the standard mapping to a nonstandard one. However, there was no need to posit an internally mediated rotation of the population vector or any other representation of movement. Instead, population-vector rotation could be accounted for by a transition between two inputs, one representing the visual signal and reflecting standard visuomotor mapping, the other the target signal into which it was transformed and reflecting nonstandard mapping.
Thus, the mechanism and function of population vector rotation for nonstandard, transformational mappings may have little in common with mental rotation as applied to visual images.

(4) The changes of hidden-unit tuning properties that occurred while the network learned a nonstandard transform corresponded to those observed in the motor cortex of a monkey that learned analogous transforms. The tuning properties of individual cells changed, often dramatically, during the learning of both transformational and arbitrary mappings. The
changes were often long lasting, in the sense that the units did not necessarily return to their 'original' tuning properties when the mapping returned to standard. During learning, the population vector became more accurate in both model and biological networks, for both standard and nonstandard mappings, and for both transformational and arbitrary nonstandard mappings.
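The quantities used in the population vector analysis above (a weighted circular mean as preferred direction, the population vector itself, its angular error, and the 90° chance level for a circular error) can be sketched in a few lines. This is an illustration under stated assumptions, not the analysis code of Chen and Wise (1997): the function names, the four-direction example rates and the use of raw activity values as weights are all hypothetical.

```python
import math
import random


def circular_weighted_mean(directions_deg, weights):
    """Direction (deg, 0-360) of the weight-scaled vector sum.

    Used here as a stand-in for a cell's preferred direction.
    """
    x = sum(w * math.cos(math.radians(d)) for d, w in zip(directions_deg, weights))
    y = sum(w * math.sin(math.radians(d)) for d, w in zip(directions_deg, weights))
    return math.degrees(math.atan2(y, x)) % 360


def population_vector(preferred_deg, activity):
    """Return (direction in deg, length) of the activity-weighted vector sum."""
    x = sum(a * math.cos(math.radians(p)) for p, a in zip(preferred_deg, activity))
    y = sum(a * math.sin(math.radians(p)) for p, a in zip(preferred_deg, activity))
    return math.degrees(math.atan2(y, x)) % 360, math.hypot(x, y)


def angular_error(a_deg, b_deg):
    """Unsigned angular difference in the range 0-180 deg."""
    d = abs(a_deg - b_deg) % 360
    return min(d, 360 - d)


def chance_error(n_trials, seed=0):
    """Mean error of a uniformly random direction relative to a fixed target."""
    rng = random.Random(seed)
    total = sum(angular_error(rng.uniform(0, 360), 90.0) for _ in range(n_trials))
    return total / n_trials


if __name__ == "__main__":
    # Hypothetical cell: rates (spikes/s) for saccades right, up, left, down.
    print("preferred direction:", circular_weighted_mean([0, 90, 180, 270], [30, 5, 5, 30]))
    # Hypothetical four-cell population with one strongly active cell.
    print("population vector:", population_vector([0, 90, 180, 270], [1, 3, 1, 1]))
    print("mean error at chance:", chance_error(20000))
```

Because a randomly oriented vector has an expected unsigned error of 90°, an observed error near 70° early in learning is only modestly better than chance, which is the point made in the text about Fig. 15.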
References

Bullock, D., Grossberg, S. and Guenther, F.H. (1993) A self-organizing neural model of motor equivalent reaching and tool use by a multijoint arm. J. Cogn. Neurosci., 5: 408-435.
Burnod, Y., Grandguillaume, P., Otto, I., Ferraina, S., Johnson, P.B. and Caminiti, R. (1992) Visuomotor transformations underlying arm movements toward visual targets: a neural network model of cerebral cortical operations. J. Neurosci., 12: 1435-1453.
Caminiti, R., Johnson, P.B., Galli, C., Ferraina, S. and Burnod, Y. (1991) Making arm movements within different parts of space: The premotor and motor cortical representation of a coordinate system for reaching to visual targets. J. Neurosci., 11: 1182-1197.
Chen, L.L. and Wise, S.P. (1995) Neuronal activity in the supplementary eye field during acquisition of conditional oculomotor associations. J. Neurophysiol., 73: 1101-1121.
Chen, L.L. and Wise, S.P. (1996) Evolution of directional preferences in the supplementary eye field during acquisition of conditional oculomotor associations. J. Neurosci., 16: 3067-3081.
Chen, L.L. and Wise, S.P. (1997) Conditional oculomotor learning: Population vectors in the supplementary eye field. J. Neurophysiol., 78: 1166-1169.
Cisek, P., Grossberg, S. and Bullock, D. (1998) A cortico-spinal model of reaching and proprioception under multiple task constraints. J. Cogn. Neurosci., 10: 425-444.
d'Avella, A. and Bizzi, E. (1998) Low dimensionality of supraspinally induced force fields. Proc. Natl. Acad. Sci. USA, 95: 7711-7714.
di Pellegrino, G. and Wise, S.P. (1993) Visuospatial vs. visuomotor activity in the premotor and prefrontal cortex of a primate. J. Neurosci., 13: 1227-1243.
Dominey, P., Arbib, M. and Joseph, J.P. (1995) A model of corticostriatal plasticity for learning oculomotor associations and sequences. J. Cogn. Neurosci., 7: 311-336.
Fagg, A.H. and Arbib, M.A. (1992) A model of primate visual-motor conditional learning. J. Adapt. Behav., 1: 3-37.
Fetz, E.E. and Shupe, L.E. (1990) Neural network models of the primate motor system. In: R. Eckmiller (Ed.), Advanced Neural Computers. Elsevier, Amsterdam, pp. 43-50.
Flament, D., Onstott, D., Fu, Q.-G. and Ebner, T.J. (1993) Distance- and error-related discharge of cells in premotor cortex of rhesus monkeys. Neurosci. Lett., 153: 144-148.
Gandolfo, F., Benda, B.J., Padoa-Schioppa, C. and Bizzi, E. (2000) Cortical correlates of learning in monkeys adapting to a new dynamical environment. Proc. Natl. Acad. Sci. USA, 97: 2259-2263.
Georgopoulos, A.P. (1987) Cortical mechanisms subserving reaching. In: G. Bock, M. O'Connor and J. Marsh (Eds.), Motor Areas of the Cerebral Cortex. CIBA Foundation Symposium 132. Wiley, Chichester, pp. 125-141.
Georgopoulos, A.P. (1991) Higher order motor control. Annu. Rev. Neurosci., 14: 361-377.
Georgopoulos, A.P. (1995) Current issues in directional motor control. Trends Neurosci., 18: 506-510.
Georgopoulos, A.P. (1996) Arm movements in monkeys: behavior and neurophysiology. J. Comp. Physiol., 179: 603-612.
Georgopoulos, A.P., Kalaska, J.F., Caminiti, R. and Massey, J.T. (1982) On the relations between the direction of two-dimensional arm movements and cell discharge in primate motor cortex. J. Neurosci., 2: 1527-1537.
Georgopoulos, A.P., Caminiti, R., Kalaska, J.F. and Massey, J.T. (1983a) Spatial coding of movement: a hypothesis concerning the coding of movement direction of motor cortical populations. In: J. Massion, J. Paillard, J.W. Schultz and M. Wiesendanger (Eds.), Neural Coding of Motor Performance. Springer, Berlin, pp. 327-336.
Georgopoulos, A.P., Kalaska, J.F., Caminiti, R. and Massey, J.T. (1983b) Interruption of motor cortical discharge subserving aimed arm movement. Exp. Brain Res., 49: 327-340.
Georgopoulos, A.P., Lurito, J.T., Petrides, M., Schwartz, A.B. and Massey, J.T. (1989) Mental rotation of the neuronal population vector. Science, 243: 234-236.
Grafton, S.T., Fagg, A.H., Woods, R.P. and Arbib, M.A. (1996) Functional anatomy of pointing and grasping in humans. Cereb. Cortex, 6: 226-237.
Held, R. and Bauer, J.A. (1974) Development of sensorially-guided reaching in infant monkeys. Brain Res., 71: 265-271.
Humphrey, D.R., Schmidt, E.M. and Thompson, W.D. (1970) Predicting measures of motor performance from multiple cortical spike trains. Science, 170: 758-762.
Inoue, K., Kawashima, R., Satoh, K., Kinomura, S., Goto, R., Koyama, M., Sugiura, M., Ito, M. and Fukuda, H. (1998) PET study of pointing with visual feedback of moving hands. J. Neurophysiol., 79: 117-125.
Kawashima, R., Roland, P.E. and O'Sullivan, B.T. (1995) Functional anatomy of reaching and visuomotor learning: A positron emission tomography study. Cereb. Cortex, 5: 111-122.
Kuperstein, M. (1988) An adaptive neural model for mapping invariant target position. Behav. Neurosci., 102: 148-162.
Lee, S.J. (1996) The representation, storage and retrieval of reaching movement information in motor cortex. Doctoral dissertation in cognitive science, University of California, San Diego.
Li, C.S.R., Padoa-Schioppa, C. and Bizzi, E. (2001) Neuronal correlates of motor performance in the primary motor cortex of monkeys adapting to an external force field. Neuron, in press.
Lukashin, A.V., Amirikian, B.R., Mozhaev, V.L., Wilcox, G.L. and Georgopoulos, A.P. (1996) Modeling motor cortical operations by an attractor network of stochastic neurons. Biol. Cybern., 74: 255-261.
Lurito, J.T., Georgakopoulos, T. and Georgopoulos, A.P. (1991) Cognitive spatial-motor processes. 7. The making of movements at an angle from a stimulus direction: studies of motor cortical activity at the single cell and population levels. Exp. Brain Res., 87: 562-580.
Matsumura, M., Kawashima, R., Naito, E., Satoh, K., Takahashi, T., Yanagisawa, T. and Fukuda, H. (1996) Changes in rCBF during grasping in humans examined by PET. Neuroreport, 7: 749-752.
Mel, B.W. (1991) A connectionist model may shed light on neural mechanisms for visually guided reaching. J. Cogn. Neurosci., 3: 273-292.
Mitz, A.R., Godschalk, M. and Wise, S.P. (1991) Learning-dependent neuronal activity in the premotor cortex of rhesus monkeys. J. Neurosci., 11: 1855-1872.
Moody, S.L. and Wise, S.P. (2000) A model that accounts for activity prior to sensory inputs and match responses during matching-to-sample tasks. J. Cogn. Neurosci., 12: 1-20.
Moody, S.L. and Zipser, D. (1998) A model of reaching dynamics in primary motor cortex. J. Cogn. Neurosci., 10: 35-45.
Moody, S.L., Wise, S.P., di Pellegrino, G. and Zipser, D. (1998) A model that accounts for activity in primate frontal cortex during a delayed matching-to-sample task. J. Neurosci., 18: 399-410.
Passingham, R.E. (1993) The Frontal Lobes and Voluntary Action. Oxford University Press, Oxford.
Riehle, A., Grün, S., Diesmann, M. and Aertsen, A. (1997) Spike synchronization and rate modulation differentially involved in motor cortical function. Science, 278: 1950-1953.
Sanger, T.D. (1994) Theoretical considerations for the analysis of population coding in motor cortex. Neural Comput., 6: 29-37.
Schwartz, A.B. (1993) Motor cortical activity during drawing movements: Population representation during sinusoid tracing. J. Neurophysiol., 70: 28-36.
Schwartz, A.B. (1994) Direct cortical representation of drawing. Science, 265: 540-542.
Scott, S.H. (1997) Comparison of onset time and magnitude of activity for proximal arm muscles and motor cortical cells before reaching movements. J. Neurophysiol., 77: 1016-1022.
Scott, S.H. and Cisek, P. (1997) Population vector rotation without mental rotation. Soc. Neurosci. Abstr., 23: 1555.
Scott, S.H. and Cisek, P. (1999) Population vector rotation without mental rotation. Neurosci. Lett., 272: 1-4.
Scott, S.H. and Kalaska, J.F. (1997) Reaching movements with similar hand paths but different arm orientations. 1. Activity of individual cells in motor cortex. J. Neurophysiol., 77: 826-852.
Shen, L. and Alexander, G.E. (1997a) Neural correlates of a spatial sensory-to-motor transformation in primary motor cortex. J. Neurophysiol., 77: 1171-1194.
Shen, L. and Alexander, G.E. (1997b) Preferential representation of instructed target location versus limb trajectory in dorsal premotor area. J. Neurophysiol., 77: 1195-1212.
Whitney, C.S., Reggia, J. and Cho, S. (1997) Does rotation of neuronal population vectors equal mental rotation? Connection Sci., 9: 253-268.
Wise, S.P., di Pellegrino, G. and Boussaoud, D. (1996) The premotor cortex and nonstandard sensorimotor mapping. Can. J. Physiol. Pharmacol., 74: 469-482.
Wise, S.P., Boussaoud, D., Johnson, P.B. and Caminiti, R. (1997) The premotor and parietal cortex: Corticocortical connectivity and combinatorial computations. Annu. Rev. Neurosci., 20: 25-42.
Wise, S.P., Moody, S.L., Blomstrom, K.J. and Mitz, A.R. (1998) Changes in motor cortical activity during visuomotor adaptation. Exp. Brain Res., 121: 285-299.
Zhang, J., Riehle, A., Requin, J. and Kornblum, S. (1997) Dynamics of single neuron activity in monkey primary motor cortex related to sensorimotor transformation. J. Neurosci., 17: 2227-2246.
Zipser, D. and Andersen, R.A. (1988) A back-propagation programmed network that simulates response properties of a subset of posterior parietal neurons. Nature, 331: 679-684.
Zipser, D., Kehoe, B., Littlewort, G. and Fuster, J. (1993) A spiking network model of short-term active memory. J. Neurosci., 13: 3406-3420.
M.A.L. Nicolelis (Ed.) Progress in Brain Research, Vol. 130 © 2001 Elsevier Science B.V. All rights reserved
CHAPTER 17
Distributed processing in the motor system: spinal cord perspective

Yifat Prut, Steve I. Perlmutter and Eberhard E. Fetz *

University of Washington, Department of Physiology and Biophysics and the Regional Primate Research Center, Seattle, WA 98195, USA
Background

The motor system has long been thought to operate in a serial mode of processing (Thach, 1978; Kurata, 1993). In this mode, information regarding motor action evolves sequentially through a series of motor stations, with each successive station translating the arriving information into a set of instructions that progressively contain more motor-related information. Ultimately, the neural information is transformed into muscle action by the segmental spinal circuitry. However, there is abundant evidence, both anatomical and physiological, that conflicts with this notion, leading to the conclusion that the system incorporates parallel processing as well as serial processing (Alexander et al., 1990; Alexander and Crutcher, 1990). The relative extent to which the system employs each of these modes of operation is not yet known. In this paper we will summarize some of the evidence supporting parallel processing in the motor system during generation of voluntary arm movements, and will estimate the role of synchronous firing in such information processing.
Parallel descending information

The anatomy of descending pathways to the spinal cord (Kuypers, 1981) shows that cortical motor centers control motoneurons (MNs) through three layers (Fig. 1).
* Corresponding author: Eberhard E. Fetz, Department of Physiology, University of Washington, Seattle, WA 98195, USA. Tel.: +1-206-543-4839; Fax: +1-206-685-8606; E-mail: fetz@u.washington.edu
Fig. 1. Schematic illustration of cortical control of MNs. Three main routes are available: CM control originating from layer V cortical cells in primary and non-primary motor areas, which synapse on MNs (black filled cell); corticospinal control, which originates from sensory and motor cortices and terminates on spinal INs (gray filled cells), which in turn make synaptic contacts with MNs; and the cortico-reticulo-spinal pathway, which affects MNs through a polysynaptic route, including a synapse in the reticular formation and a synapse on spinal INs.
The first is the corticomotoneuronal (CM) control mediated through direct synapses of cortical neurons on MNs. CM axons arise from layer V pyramidal cells located in primary and non-primary motor cortical areas (Murray and Coulter, 1981; Toyoshima and Sakai, 1982; Dum and Strick, 1991; Porter and Lemon, 1993). It has been shown that a single corticospinal axon can contact most, or even all, MNs of a given pool, as well as multiple pools (Lawrence et al., 1985; Shinoda et al., 1986). The second layer of control is the indirect corticospinal pathway; cortical axons contact spinal interneurons (INs), which project mono- or polysynaptically to MNs. These contacts can span up to several segments (Kuypers, 1981) and their specific termination zone is dependent on the area of origin (Martin, 1996). The third layer of control is indirectly mediated through brain stem nuclei (Moll and Kuypers, 1977; Kuypers, 1981). This cortico-reticulo-spinal pathway mostly terminates on long propriospinal INs, which distribute their axons to multiple spinal segments (often throughout the length of the cord). The latter two levels of supraspinal control affect MNs indirectly through segmental INs. Although this indirect influence may seem weak compared with the direct CM synapses, it can have a powerful impact given that spinal INs provide more than 90% of the input to MNs (Porter and Lemon, 1993). It has been estimated that each MN receives input from several thousand segmental INs (Alstermark and Kummel, 1990). The relative numbers of MNs (several thousand) and INs (a few hundred thousand) in each segment (Binder, 1989) suggest a high degree of convergence of IN input on MNs. The supraspinal information arriving at the spinal cord through these various routes is diverse, containing motor, premotor and sensory information (Murray and Coulter, 1981; Toyoshima and Sakai, 1982; Dum and Strick, 1996).
Our experiments were designed to evaluate the impact of the indirect corticospinal route on the segmental spinal network during voluntary movement.
Motor preparatory activity

In addition to the anatomy of the motor system, which provides the structural foundation for parallel processing of information, the physiology is also in agreement with this parallel mode of operation. Existence of pre-movement activity was documented for many supraspinal structures in a task that included an instructed delay period, namely a time period between a cue onset and a trigger 'go' cue (Tanji and Evarts, 1976; Thach, 1978; Kubota and Hamada, 1979; Weinrich and Wise, 1982; Wise et al., 1986; Georgopoulos et al., 1989; Riehle and Requin, 1989). Alexander and Crutcher (1990) observed that the onset time of this preparatory activity greatly overlapped across motor centers regardless of the large difference in their 'synaptic distance' from the effectors (i.e., muscles). Since preparatory activity was considered to be a reflection of motor planning of forthcoming movement, it was suggested that the motor plan is generated in a distributed manner throughout the supraspinal motor system. This notion provided major support to the idea that information in the motor system is processed in parallel.
Spinal preparation for movement

To date, the participation of the spinal cord in preparation for movement has been estimated only indirectly. Modulations of monosynaptic reflex pathways (Requin et al., 1977; Bonnet and Requin, 1982; Brunia et al., 1982; Komiyama and Tanaka, 1990) were considered to reflect cortically mediated presynaptic inhibition, which modifies sensory inflow and motoneuronal excitability in preparation for the ensuing movement. Recently, we have shown (Prut and Fetz, 1999) that in primates performing a flexion/extension wrist task with an instructed delay period, many spinal INs exhibit specific pre-movement delay period modulation of firing rate (Fig. 2A
Fig. 2. Examples of spinal INs exhibiting pre-movement preparatory activity. (A) Spinal IN with an excitatory rate modulation during pre-extension period. Extension (left, black histogram) and flexion (right, gray histogram) trials are plotted separately. For each set of trials, the torque traces (bottom chart), the raster plot (middle chart) and the peri-stimulus-time histogram (PSTH, top chart) are plotted. All trials and events are aligned on cue onset (time zero). In each trial, the event before cue onset is the start of the trial. The events after cue onset are the 'go' signal (triangles), movement onset (diamonds), and movement offset (squares). (B) Spinal IN with inhibitory preparatory activity. (From Prut and Fetz, 1999.) Panel labels: A, Cell 1: Extension, 14 trials; Flexion, 15 trials. B, Cell 1: Extension, 39 trials; Flexion, 40 trials. X-axis: Time (sec), with 0 at cue onset.
and B). This preparatory activity was found in more than 30% of the recorded cells. The distribution of onset times for spinal preparatory activity had an average of 220 ms (median of 190 ms) and greatly overlapped the onset time previously reported by Alexander and Crutcher (1990) for cortical cells. However, it should be recognized that our measuring technique differed from that employed by others, and this may have contributed methodological differences in the estimated onset times. The preparatory rate modulation that was found in spinal INs was often inhibitory, and in reverse polarity compared with the subsequent movement response of the cell (Fig. 3). Often, the preparatory activity that was found in correct trials was absent when the monkey made an erroneous movement response to the cue. This further indicates that spinal preparatory activity reflects a correct cue-to-response match, as opposed to being cue-related or purely motor-related activity. The existence of rate modulation during the delay period may not be surprising given the extensive input to the spinal cord from sensory and motor areas that exhibit such activity. Spinal preparatory activity may not only reflect supraspinal processes. It may also take an active part in gating or modulating sensory information arriving from the periphery, which will subsequently be transmitted to supraspinal centers to continuously update the motor plan (Evarts and Granit, 1976; Tanji and Evarts, 1976; Evarts et al., 1984). In this scenario, as soon as preparation for movement starts, activity reverberates throughout the motor system via anatomical parallel loops.
Synchronization within spinal and supraspinal circuitry

In view of the broad and diverse impact of descending pathways on spinal activity and the high degree of convergence of input from segmental INs upon MNs, what is the mechanism by which descending inputs arriving at the cord select a specific motor pattern? In part this input affects spinal circuitry through a direct excitatory drive. Indeed, interrupting descending pathways results in muscle weakness, which presumably reflects the loss of the excitatory drive (Rymer, 1993). An additional mechanism through which supraspinal drive can shape
Fig. 3. Relations between polarity of movement response and polarity of delay-period response. Each point represents a case of significant delay modulation in either flexion (black) or extension (gray) trials. Each cell contributes either one or two points. X-axis value is the averaged rate modulation during the delay period relative to the rate during the rest period. Y-axis value is the averaged rate modulation during the subsequent active hold period (when the monkey is actively generating either flexion- or extension-directed torque) relative to the rate during the rest period. The upper-right and lower-left quadrants represent cases of congruency in the polarity of the delay-period activity and hold-period activity (excitation or inhibition in both periods relative to rest). These two quadrants contain 102/167 (61%) of the cases (number of points in each quadrant shown in corners). The upper-left and lower-right quadrants represent cases of reverse polarity between delay- and hold-period activity (excitation/inhibition or inhibition/excitation, respectively, relative to rest). These two quadrants contain 65/167 (39%) of the cases. Of 167 cases plotted, 98 (59%) have inhibitory delay-period modulation (the left two quadrants of the graph). (From Prut and Fetz, 1999.) Labels recoverable from the figure: N total = 167 (F+E, t-test); legend: Flexion, Extension; x-axis: ΔRate in Delay (λDelay − λRest), sp/sec.
spinal activity and control muscle recruitment is by modifying the temporal properties of the firing of these neurons. Synchronized firing was shown for both motor cortical cells (Smith and Fetz, 1989; Vaadia et al., 1995; Hatsopoulos et al., 1998; Grammont and Riehle, 1999; Maynard et al., 1999) and motor units (MUs) (Datta and Stephens, 1990; Nordstrom et al., 1992; De Luca et al., 1993; Schmied et al., 1993). Many have suggested that the synchronized firing of MUs is the result of cortical input (Baker et al., 1988; Farmer et al., 1993; Schmied et al., 1994; Smith et al., 1999) and is most likely generated by branching of CM fibers within MN pools and across pools of similar muscles (Datta et al., 1991). It seems remarkable that branches emerging from single cells could have such a prominent impact on the firing of two MNs. In fact, other factors may play a role in the observed synchronization. The first factor is synchronization among CM cells. Two synchronized CM cells, projecting to different MUs, may entrain those MUs to fire synchronously without direct branching input from either CM cell to both MUs. A second factor that may contribute to correlation in firing among MUs is synchronization among spinal INs.
In our study we recorded activity of spinal INs from three behaving monkeys performing wrist flexion and extension movements. Recordings were made from either a single electrode or a pair of electrodes. EMG activity was simultaneously recorded from up to 14 forearm muscles. Online and offline spike-sorting techniques were used to separate multiple waveforms from a single electrode. Spike-triggered averages were compiled for each neuron to determine functional connectivity between cells and muscles.
Firing properties of spinal INs
We first studied the firing properties of single spinal INs (Fig. 4). In many cases, the autocorrelation of spinal INs (Fig. 4A) and MNs (Fig. 4B) had periodic features, reflecting a tendency to fire with regular interspike intervals (ISI). The extent of regularity in firing was quantified by the coefficient of variation (CV) of the ISI. The mean CV values for INs during rest and hold periods (Fig. 4C) were 0.72 and 0.65, respectively; MNs exhibited the highest level of regularity (~0.35). In contrast, the CV of cortical neurons tends to be higher, above 1 (Lee et al., 1998; Shadlen and Newsome, 1998). A CV of 0.8 is a lower bound for cortical neurons (Stevens and Zador, 1998). It is clear that spinal circuitry expresses a much higher level of regular firing than cortical cells. For comparison, primary afferent inputs to spinal INs fire more regularly than INs (Matthews and Stein, 1969; Nordh et al., 1983). However, when the fusimotor drive to the muscle spindle is severed, the firing of primary afferents becomes even more regular (Matthews and Stein, 1969). This observation led to the assumption that the supraspinal input, which affects the γ-system, is the source for the increase in variability of primary afferents. Similar supraspinal inputs on spinal INs may act to increase their variability in firing as well.
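As a rough illustration of the regularity measure discussed above, the following sketch computes the CV of the ISI from a list of spike times. It is not the analysis code used in the study: the function names `isi_cv` and `gamma_train`, the 20 spikes/s rate, and the gamma-renewal surrogate trains are assumptions introduced here only to show how CV values near 0.65 (as reported for INs during hold) and near 1 (Poisson-like firing, as reported for cortical cells) can arise.

```python
import numpy as np

def isi_cv(spike_times_s):
    """Coefficient of variation (SD / mean) of the interspike intervals."""
    isis = np.diff(np.sort(np.asarray(spike_times_s, dtype=float)))
    if isis.size < 2:
        raise ValueError("need at least three spikes to estimate a CV")
    return float(np.std(isis, ddof=1) / np.mean(isis))

def gamma_train(target_cv, rate_hz, n_spikes, rng):
    """Hypothetical surrogate: renewal process with gamma-distributed ISIs.

    For a gamma distribution the ISI CV equals 1 / sqrt(shape), so the
    shape parameter is chosen from the requested CV.
    """
    shape = 1.0 / target_cv ** 2
    scale = 1.0 / (rate_hz * shape)        # keeps the mean ISI at 1 / rate
    return np.cumsum(rng.gamma(shape, scale, size=n_spikes))

rng = np.random.default_rng(0)
in_like = gamma_train(0.65, 20.0, 5000, rng)       # regular, IN-like surrogate
poisson_like = gamma_train(1.0, 20.0, 5000, rng)   # shape 1 = Poisson process

cv_in_like = isi_cv(in_like)
cv_poisson_like = isi_cv(poisson_like)
print(f"IN-like CV: {cv_in_like:.2f}, Poisson-like CV: {cv_poisson_like:.2f}")
```

The gamma renewal process is used only because its CV can be set directly; it says nothing about the mechanism that makes spinal INs fire regularly.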
Correlation in firing among spinal INs
We have found that neighboring spinal INs are more likely to exhibit similar response properties than are neurons located far apart. Such similarity in response patterns can be the result of a common drive. This type of common drive also was considered as a source for a rapid correlation in firing (on the order of a few tens of milliseconds) between cortical neurons (Abeles, 1982). Fig. 5 shows an example of two spinal INs with a significant correlation in their firing. The results of this study are summarized in Table 1. Correlations were measured between pairs of spinal INs recorded by single or neighboring electrodes. The duration of the computed correlation was ±100 ms. Only cases where at least 200 spikes from each of the units (i.e., the trigger unit and the reference unit) were available were considered. For each pair of units three correlograms were compiled: during rest, during flexion-hold, and during extension-hold (periods in which the monkey actively maintained the cursor within the target on the screen). Of the 438 neuronal pairs for which correlation was computed in at least one behavioral mode, 255 were pairs of units recorded from the same electrode and 183 were pairs recorded from two different electrodes. Of all the histograms, 49 (11%) had a peak in at least one behavioral mode. The frequencies of correlation as a function of behavioral mode and distance between cells are listed in Table 1. As shown in Fig. 5, in some cases the correlations were dynamic, changing in strength or existence depending on the behavioral mode. We counted the number of cases in which correlation exhibited such dynamic features for those pairs whose correlation was computed in more than one behavioral mode (385 pairs). Only 212 pairs provided sufficient data to compute correlation in three behavioral modes, 173 pairs in two modes, and 53 pairs in one mode only. Of these, 42 pairs had a peak in at least one behavioral mode.
Table 2 summarizes the tendency
[Fig. 4 graphic not reproduced. Panels A and B: autocorrelograms for REST, HOLD FLX and HOLD EXT, x-axis Time (ms); A: ww3709, unit 1, Ntrig = 1139, 580 and 565; B: ww6809me, unit 1, Ntrig = 0, 890 and 1. Panel C: histograms of CV; HOLD (flx+ext): N = 610, μ = 0.65 (σ = 0.27).]
Fig. 4. Firing properties of single spinal INs. (A) Autocorrelation of a spinal IN computed for data collected during rest (left histogram), flexion-hold (middle histogram) and extension-hold (right histogram). The number of triggers is given for each histogram. (B) The same as A but for a spinal MN that is inactive during rest and extension periods. (C) Distribution of CV values computed during rest (left, black) and hold period (right, gray) for the whole population of spinal neurons. The total number of cases, the average value and the standard deviation are given for each histogram.
[Fig. 5 graphic not reproduced. Cross-correlograms for REST, HOLD EXT and HOLD FLX, x-axis Time (ms), -75 to 75. A: F3002a, 5*6; Ntr = 519, Nrf = 421, Ttime = 14112; Ntr = 745, Nrf = 527, Ttime = 6516; Ntr = 807, Nrf = 651, Ttime = 9494. B: F3205, 4*5; Ntr = 2496, Nrf = 1792, Ttime = 54578 (remaining panel labels not recoverable).]
Fig. 5. Two examples of cross-correlograms computed for pairs of spinal INs. (A) Correlograms computed during rest (left histogram), extension-hold (middle histogram) and flexion-hold (right histogram). A significant peak in the correlogram exists during extension-hold. The horizontal lines show the average rate of the target unit (middle line) and the 99% confidence interval (upper and lower lines). The central dip is the result of the fact that the two cells were recorded by the same electrode. (B) The same as A but for a different pair of units. In this case, a peak of different extent can be seen in all three behavioral modes, while it crosses the lines of significance during flexion-hold only.
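The correlogram procedure described in the text and in Fig. 5 can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used in the study: the 5-ms bin width, the Poisson/normal approximation for the 99% band, the function names and the surrogate spike trains with a shared input are all hypothetical. Only the 200-spike inclusion criterion, the ±100 ms window and the use of a 99% confidence interval come from the text.

```python
import numpy as np

MIN_SPIKES = 200   # inclusion criterion stated in the text
Z_99 = 2.576       # two-sided 99% normal quantile (assumed CI method)

def cross_correlogram(trigger, reference, window_s=0.1, bin_s=0.005):
    """Count reference spikes in lag bins around every trigger spike."""
    trigger = np.sort(np.asarray(trigger, dtype=float))
    reference = np.sort(np.asarray(reference, dtype=float))
    if min(trigger.size, reference.size) < MIN_SPIKES:
        raise ValueError("fewer than 200 spikes in one of the units")
    n_bins = int(round(2 * window_s / bin_s))
    edges = np.linspace(-window_s, window_s, n_bins + 1)
    counts = np.zeros(n_bins, dtype=np.int64)
    for t in trigger:
        lo = np.searchsorted(reference, t - window_s, side="left")
        hi = np.searchsorted(reference, t + window_s, side="right")
        counts += np.histogram(reference[lo:hi] - t, bins=edges)[0]
    return edges, counts

def chance_band(n_trigger, n_reference, duration_s, bin_s=0.005):
    """Expected count per bin for independent units, with a 99% band."""
    expected = n_trigger * (n_reference / duration_s) * bin_s
    half_width = Z_99 * np.sqrt(expected)
    return expected - half_width, expected, expected + half_width

# Hypothetical surrogate data: two units that share part of their input.
rng = np.random.default_rng(1)
duration = 200.0  # seconds

def poisson_train(rate_hz):
    return rng.uniform(0.0, duration, rng.poisson(rate_hz * duration))

shared = poisson_train(2.0)
unit_a = np.concatenate([poisson_train(8.0), shared + rng.normal(0, 0.002, shared.size)])
unit_b = np.concatenate([poisson_train(8.0), shared + rng.normal(0, 0.002, shared.size)])

edges, counts = cross_correlogram(unit_a, unit_b)
lower, expected, upper = chance_band(unit_a.size, unit_b.size, duration)
peak_bin = int(np.argmax(counts))
has_peak = counts[peak_bin] > upper
print(f"peak lag bin starts at {edges[peak_bin] * 1e3:.0f} ms, significant: {bool(has_peak)}")
```

The sketch does not model the central dip seen when both units come from the same electrode, which in real data reflects the inability to detect overlapping spikes.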
TABLE 1
Frequency of synchronization among spinal INs (N¹ = 49/438)

                                      Rest   Hold flexion   Hold extension
Cross-correlogram peaks                 22        28              18
NS² correlograms                       250       227             239
% Peaks                                  9        12               8
Peaks at distance³ = 0                  19        26              17
NS correlograms at distance = 0        123       118             132
% Peaks                                 15        22              13

¹ Number of pairs with a significant peak in at least one behavioral mode over the total number of pairs. ² Non-significant. ³ Distance = 0 refers to cells recorded by a single electrode.
liseconds), and their peak area, which corresponds to their strength, was often small.
TABLE 2
Modulation of synchronization across behavioral modes

Computed correlograms¹   Type²               N      %
3 modes                  3 peaks             4     10
                         2 peaks + 1 NS      6     14
                         1 peak + 2 NS      16     38
2 modes³                 2 peaks             5     12
                         1 peak + 1 NS      11     26
Total                                       42    100
Dynamic/constant⁴                        33/42     79

NS = non-significant. ¹ Number of behavioral modes (out of 3) at which correlograms were computed. ² Type of observed correlograms, namely either flat (NS) or with a significant peak. ³ Two modes were found for pairs for which there were not enough data in one mode to compute a correlogram. In a few cases the correlogram had a clear peak, which was either insignificant or was based on a low number of triggers or reference spikes (<200). These correlograms were excluded. ⁴ Ratio between pairs for which synchronization was modified according to the behavioral mode and pairs with a constant synchronization.
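The percentages in Table 2 follow directly from the pair counts. The short sketch below recomputes them; the dictionary layout and the rule used to label a pair "dynamic" (at least one non-significant correlogram alongside at least one peak) are assumptions made here to reproduce the 33/42 figure, not a description of the authors' bookkeeping.

```python
# Pair counts copied from Table 2: (number of modes computed, correlogram type) -> N
counts = {
    ("3 modes", "3 peaks"): 4,
    ("3 modes", "2 peaks + 1 NS"): 6,
    ("3 modes", "1 peak + 2 NS"): 16,
    ("2 modes", "2 peaks"): 5,
    ("2 modes", "1 peak + 1 NS"): 11,
}

total = sum(counts.values())

# Assumed rule: a pair is "dynamic" when its peak is missing in some mode.
dynamic = sum(n for (_, kind), n in counts.items() if "NS" in kind)

percent = {key: round(100 * n / total) for key, n in counts.items()}
dynamic_percent = round(100 * dynamic / total)

for (modes, kind), pct in percent.items():
    print(f"{modes:8s} {kind:15s} {pct:3d}%")
print(f"dynamic pairs: {dynamic}/{total} ({dynamic_percent}%)")
```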
Discussion
Our study of spinal activity during preparation for movement further supports the idea of distributed, parallel processing in the motor system. The spinal manifestation of pre-movement delay period activity suggests that spinal circuitry is involved already in early stages of movement preparation. The role of this activity and its source remain unknown. It was previously suggested (Moll and Kuypers, 1977; Baumgartner et al., 1996) that the premotor cortex has a predominantly inhibitory effect on spinal activity. Premotor cortex cells exhibit very robust preparatory activity (Weinrich and Wise, 1982; Wise, 1985; Wise et al., 1986; Kurata, 1993) and therefore these cells could be the source of inhibition of INs during the preparatory period. Further study will be required to test this hypothesis.
Existence of synchrony between spinal INs of correlogram to change in different behavioral modes. The majority of correlograms were sensitive to the behavioral mode. Therefore the existence of correlated activity reflects not only the hardwired input arriving to these cells (which is constant throughout the trial), but also the behavioral epoch. Also, correlated firing seemed to be more frequent during flexion-hold periods than during rest and extension-hold periods (Table 1). Additional differences between the properties of INs during flexion and extension were previously reported from our lab (Perlmutter et al., 1998). Accordingly, functional linkages between cells and muscles are observed more frequently in spike-triggered averages of flexor than of extensor muscles. It therefore may be that part of the observed effect of IN firing on muscle activity could be explained by synchrony in firing among spinal INs. The increased tendency for correlated firing during flexion may increase both the frequency and the potency of the observed effects on flexor muscles, making them more common. However, it is important to note that the observed synchronizations between spinal INs occurred over a short time scale (often not more than a few mil-
The second question that we addressed in our study is the way the diverse information arriving at the spinal cord is processed by spinal INs. We found that spinal INs tend to fire in a very regular manner. This finding is in contrast with the data provided for cortical cells (including M1), which fire in a highly irregular manner (Smith, 1989; Softky and Koch, 1993; Lee et al., 1998; Shadlen and Newsome, 1998; Stevens and Zador, 1998). Furthermore, spinal circuitry also differs from motor cortical neurons in the extent of correlation in firing. Despite the fact that nearby spinal INs have similar response properties, only a small fraction of them were correlated in their firing, and these correlations were brief and weak. For M1 neurons, the estimated numbers of correlated cells ranged between 20% and 35% (Smith, 1989; Fetz et al., 1991; Hatsopoulos et al., 1998; Lee et al., 1998), while in the spinal cord the fraction of correlated pairs was about 11%. The two results (higher regularity and lower correlation) are in agreement with other findings demonstrating the inverse relations between these two factors for γ-motoneurons (Davey and Ellaway, 1984). A possible reason for the relatively low number of correlated pairs found among spinal INs may reside in a specific connectivity scheme, in which cells of the same type are connected, regardless of their anatomical distance. However, it has been shown that the pattern of input arriving to INs is in some cases random (Harrison and Jankowska, 1985), so that local branches of a given fiber will synapse on INs in their vicinity regardless of the specific origin of this fiber. Also, we have found a tendency of correlated activity to occur among neighboring cells (those that were recorded by the same electrode). This suggests that neighboring INs are more likely to share their input (as opposed to a case where cell identity rather than its location determines its input), and therefore the low frequency of correlated activity found in our study is not due to the bias in recording mostly neighboring cells. Also, the impact of distance on cortical correlation seems to be less pronounced (Smith, 1989; Vaadia et al., 1995; Maynard et al., 1999) than in the cord. These facts are all in agreement with a tendency for asynchronous firing among INs compared with cortical neurons.
Functional role of spinal synchrony
Several observations suggest that spinal synchrony has a functional context. The first finding is that spinal synchrony was modulated by the behavioral mode. The majority of correlated pairs were correlated only in some but not all the behavioral epochs. This result is similar to the dynamic correlations of firing that were found in the prefrontal cortex (Vaadia et al., 1995) and the 'information capacity' reported for M1 synchronization (Hatsopoulos et al., 1998; Maynard et al., 1999). Given the firing properties of spinal INs, with their highly regular firing, and the lower number of correlations in the cord compared with the number of correlations in M1, spinal synchronization may be induced by converging inputs arriving to the cord, rather than intrinsic computational processes taking place within spinal circuitry. Accordingly, the behavioral modulation of spinal synchronization would be the result of modulations among corticospinal inputs. The second finding that suggests a functional role for the correlation between spinal INs is the consistency between the tendency of spinal cells to have a functional linkage with flexor muscles (Perlmutter et al., 1998) and the excess of correlation during active flexion. Regardless of the source of the synchronization, such activity can enhance the impact of a single cell upon its target muscle. This is particularly important for cells that have an oligosynaptic linkage to MNs, whose efficacy tends to be relatively weak. On the other hand, the limited extent of the correlation may help to limit the muscle fields of spinal INs compared with the larger muscle fields of CM cells (Perlmutter et al., 1998). Accordingly, spinal synchronization may sharpen the broad input to spinal neurons, as the existence or absence of synchronization within a group of INs may respectively amplify or attenuate the impact of arriving inputs. The contribution of spinal correlation to MU synchronization is still unclear. Although IN synchronization is comparable to MU synchronization in duration (Datta and Stephens, 1990; Schmied et al., 1994) and task dependency (Bremner et al., 1991), the two events differ in their frequency. The number of synchronized MU pairs found within a muscle and across muscles is about 60-90% (Datta and Stephens, 1990; Bremner et al., 1991). On the other hand, there is a large convergence in the spinal cord, where hundreds of thousands of INs project to several thousands of MNs. Such a convergence pattern may amplify the limited correlation found between INs. However, a direct test is required to estimate the role of spinal INs in MU synchronization.
Summary and conclusions
Recordings of spinal INs during a flexion/extension wrist task with an instructed delay period have shown directly that many spinal neurons modulate their rate during the preparatory period soon after a visual cue. The onset time and the relation between the delay period activity of spinal INs and the ensuing movement response suggest that this type of activity is not simply related to the forthcoming motor action, but rather reflects a correct match between the visual cue and the motor response. The existence of such activity further supports the notion that the motor system operates in a parallel mode of processing, so that even during early stages of motor processing multiple centers are activated regardless of their anatomical distance from muscles.
The firing properties of spinal INs during the performance of the task seem to differ from the comparable properties of motor cortical cells. Spinal INs fire in a highly regular manner; their CV is substantially lower than the observed CV of cortical cells. Also, although neighboring cells tend to have similar response properties, the frequency of significant correlation is lower than for cortical cells and the anatomical extent of the correlation seems to be narrower. The similarity and differences between cortical and spinal cells in terms of response and firing properties suggest that while both types of cells are active in parallel throughout the behavioral phases of the motor task, each may operate in a different mode of information processing.
Abbreviations
INs   spinal interneurons
CM    corticomotoneuronal
MU    motor units
MN    motoneurons
M1    primary motor cortex
Acknowledgements
We thank J. Garlid and L. Shupe for expert technical assistance and K. Elias for editorial help. This work was supported by NIH grants NS12542, NS36781 and RR00166.
References
Abeles, M. (1982) Local Cortical Circuits: An Electrophysiological Study. Springer, Berlin.
Alexander, G.E. and Crutcher, M.D. (1990) Preparation for movement: neural representations of intended direction in three motor areas of the monkey. J. Neurophysiol., 64: 133-150.
Alexander, G.E., Crutcher, M.D. and DeLong, M.R. (1990) Basal ganglia-thalamocortical circuits: parallel substrates for motor, oculomotor, 'prefrontal' and 'limbic' functions. Prog. Brain Res., 85: 119-146.
Alstermark, B. and Kummel, H. (1990) Transneuronal transport of wheat germ agglutinin conjugated horseradish peroxidase into last order spinal interneurones projecting to acromio- and spinodeltoideus motoneurones in the cat. 2. Differential labelling of interneurones depending on movement type. Exp. Brain Res., 80: 96-103.
Baker, J.R., Bremner, F.D., Cole, J.D. and Stephens, J.A. (1988) Short-term synchronization of intrinsic hand muscle motor units in 'Deafferented' man. J. Physiol., 396: 155.
Baumgartner, C., Podreka, I., Olbrich, A., Novak, K., Serles, W., Aull, S., Almer, G., Lurger, S., Pietrzyk, U., Prayer, D. and Lindinger, G. (1996) Epileptic negative myoclonus: An EEG-single-photon emission CT study indicating involvement of premotor cortex. Neurology, 46: 753-758.
Binder, M.D. (1989) Peripheral motor control: spinal reflex actions of muscle, joint and cutaneous receptors. In: H.D. Patton, A.M. Scher, A.F. Fuchs, R. Steiner and B. Hille (Eds.), Textbook of Physiology, Vol. 1. W.B. Saunders, Philadelphia, PA, pp. 522-548.
Bonnet, M. and Requin, J. (1982) Long loop and spinal reflexes in man during preparation for intended directional hand movements. J. Neurosci., 2: 90-96.
Bremner, F.D., Baker, J.R. and Stephens, J.A. (1991) Effect of task on the degree of synchronization of intrinsic hand muscle motor units in man. J. Neurophysiol., 66: 2072-2083.
Brunia, C.H., Scheirs, J.G. and Haagh, S.A. (1982) Changes of Achilles tendon reflex amplitudes during a fixed foreperiod of four seconds. Psychophysiology, 19: 63-70.
Datta, A.K. and Stephens, J.A. (1990) Synchronization of motor unit activity during voluntary contraction in man. J. Physiol. (Lond.), 422: 397-419.
Datta, A.K., Farmer, S.F. and Stephens, J.A. (1991) Central nervous pathways underlying synchronization of human motor unit firing studied during voluntary contractions. J. Physiol. (Lond.), 432: 401-425.
Davey, N.J. and Ellaway, P.H. (1984) Patterns of discharge of γ-motoneurons and their tendency to synchronized firing. Neurosci. Lett. Suppl., 18: S267.
De Luca, C.J., Roy, A.M. and Erim, Z. (1993) Synchronization of motor-unit firings in several human muscles. J. Neurophysiol., 70: 2010-2023.
Dum, R.P. and Strick, P.L. (1991) The origin of corticospinal projections from the premotor areas in the frontal lobe. J. Neurosci., 11: 667-689.
Dum, R.P. and Strick, P.L. (1996) Spinal cord terminations of the medial wall motor areas in macaque monkeys. J. Neurosci., 16: 6513-6525.
Evarts, E.V. and Granit, R. (1976) Relations of reflexes and intended movements. Prog. Brain Res., 44: 1-14.
Evarts, E.V., Shinoda, Y. and Wise, S.P. (1984) Neurophysiological Approaches to Higher Brain Functions. Wiley, New York.
Farmer, S.F., Swash, M., Ingram, D.A. and Stephens, J.A. (1993) Changes in motor unit synchronization following central nervous lesions in man. J. Physiol. (Lond.), 463: 83-105.
Fetz, E.E., Toyama, K. and Smith, W. (1991) Synaptic interactions between cortical neurons. In: A. Peters and E.G. Jones (Eds.), Cerebral Cortex, Vol. 9. Plenum, New York, pp. 1-47.
Georgopoulos, A.P., Crutcher, M.D. and Schwartz, A.B. (1989) Cognitive spatial-motor processes. 3. Motor cortical prediction of movement direction during an instructed delay period. Exp. Brain Res., 75: 183-194.
Grammont, F. and Riehle, A. (1999) Precise spike synchronization in monkey motor cortex involved in preparation for movement. Exp. Brain Res., 128: 118-122.
Harrison, P.J. and Jankowska, E. (1985) Organization of input to the interneurones mediating group I non-reciprocal inhibition of motoneurones in the cat. J. Physiol. (Lond.), 361: 403-418.
Hatsopoulos, N.G., Ojakangas, C.L., Paninski, L. and Donoghue, J.P. (1998) Information about movement direction obtained from synchronous activity of motor cortical neurons. Proc. Natl. Acad. Sci. USA, 95: 15706-15711.
Komiyama, T. and Tanaka, R. (1990) The differences in human spinal motoneuron excitability during the foreperiod of a motor task. Exp. Brain Res., 79: 357-364.
Kubota, K. and Hamada, I. (1979) Preparatory activity of monkey pyramidal tract neurons related to quick movement onset during visual tracking performance. Brain Res., 168: 435-439.
Kurata, K. (1993) Premotor cortex of monkeys: set- and movement-related activity reflecting amplitude and direction of wrist movements. J. Neurophysiol., 69: 187-200.
Kuypers, H.G.J.M. (1981) Anatomy of the descending pathways. In: J.M. Brookhart and V.B. Mountcastle (Eds.), The Nervous System, Vol. II. American Physiological Society, Bethesda, MD, pp. 597-666.
Lawrence, D.G., Porter, R. and Redman, S.J. (1985) Corticomotoneuronal synapses in the monkey: light microscopic localization upon motoneurons of intrinsic muscles of the hand. J. Comp. Neurol., 232: 499-510.
Lee, D., Port, N.L., Kruse, W. and Georgopoulos, A.P. (1998) Variability and correlated noise in the discharge of neurons in motor and parietal areas of the primate cortex. J. Neurosci., 18: 1161-1170.
Martin, J.H. (1996) Differential spinal projections from the forelimb areas of the rostral and caudal subregions of primary motor cortex in the cat. Exp. Brain Res., 108: 191-205.
Matthews, P.B. and Stein, R.B. (1969) The regularity of primary and secondary muscle spindle afferent discharges. J. Physiol. (Lond.), 202: 59-82.
Maynard, E.M., Hatsopoulos, N.G., Ojakangas, C.L., Acuna, B.D., Sanes, J.N., Normann, R.A. and Donoghue, J.P. (1999) Neuronal interactions improve cortical population coding of movement direction. J. Neurosci., 19: 8083-8093.
Moll, L. and Kuypers, H.G. (1977) Premotor cortical ablations in monkeys: contralateral changes in visually guided reaching behavior. Science, 198: 317-319.
Murray, E.A. and Coulter, J.D. (1981) Organization of corticospinal neurons in the monkey. J. Comp. Neurol., 195: 339-365.
Nordh, E., Hulliger, M. and Vallbo, A.B. (1983) The variability of inter-spike intervals of human spindle afferents in relaxed muscles. Brain Res., 271: 89-99.
Nordstrom, M.A., Fuglevand, A.J. and Enoka, R.M. (1992) Estimating the strength of common input to human motoneurons from the cross-correlogram. J. Physiol., 453: 547-574.
Perlmutter, S.I., Maier, M.A. and Fetz, E.E. (1998) Activity of spinal interneurons and their effects on forearm muscles during voluntary wrist movements in the monkey. J. Neurophysiol., 80: 2475-2494.
Porter, R. and Lemon, R.N. (1993) Corticospinal Function and Voluntary Movement. Clarendon Press, Oxford.
Prut, Y. and Fetz, E.E. (1999) Primate spinal interneurons show pre-movement instructed delay activity. Nature, 401: 590-594.
Requin, J., Bonnet, M. and Semjen, A. (1977) Is there a specificity in the supraspinal control of motor structures during preparation? In: S. Dornic (Ed.), Attention and Performance, Vol. VI. Lawrence Erlbaum, Hillsdale, NJ, pp. 139-147.
Riehle, A. and Requin, J. (1989) Monkey primary motor and premotor cortex: single-cell activity related to prior information about direction and extent of an intended movement. J. Neurophysiol., 61: 534-549.
Rymer, W.Z. (1993) Spinal cord injury: physiology and transplantation. Adv. Neurol., 59: 157-162.
Schmied, A., Ivarsson, C. and Fetz, E.E. (1993) Short-term synchronization of motor units in human extensor digitorum communis muscle: relation to contractile properties and voluntary control. Exp. Brain Res., 97: 159-172.
Schmied, A., Vedel, J.P. and Pagni, S. (1994) Human spinal lateralization assessed from motoneurone synchronization: dependence on handedness and motor unit type. J. Physiol. (Lond.), 480: 369-387.
Shadlen, M.N. and Newsome, W.T. (1998) The variable discharge of cortical neurons: implications for connectivity, computation, and information coding. J. Neurosci., 18: 3870-3896.
Shinoda, Y., Yamaguchi, T. and Futami, T. (1986) Multiple axon collaterals of single corticospinal axons in the cat spinal cord. J. Neurophysiol., 55: 425-448.
Smith, H.C., Davey, N.J., Savic, G., Maskill, D.W., Ellaway, P.H. and Frankel, H.L. (1999) Motor unit discharge characteristics during voluntary contraction in patients with incomplete spinal cord injury. Exp. Physiol., 84: 1151-1160.
Smith, W.S. (1989) Synaptic interactions between identified motor cortex neurons in the active primate. Ph.D. Thesis, Department of Physiology and Biophysics, University of Washington, Seattle, WA.
Smith, W.S. and Fetz, E.E. (1989) Effects of synchrony between primate corticomotoneuronal cells on post-spike facilitation of muscles and motor units. Neurosci. Lett., 96: 76-81.
Softky, W.R. and Koch, C. (1993) The highly irregular firing of cortical cells is inconsistent with temporal integration of random EPSPs. J. Neurosci., 13: 334-350.
Stevens, C.F. and Zador, A.M. (1998) Input synchrony and the irregular firing of cortical neurons. Nat. Neurosci., 1: 210-217.
Tanji, J. and Evarts, E.V. (1976) Anticipatory activity of motor cortex neurons in relation to direction of an intended movement. J. Neurophysiol., 39: 1062-1068.
Thach, W.T. (1978) Correlation of neural discharge with pattern and force of muscular activity, joint position, and direction of intended next movement in motor cortex and cerebellum. J. Neurophysiol., 41: 654-676.
Toyoshima, K. and Sakai, H. (1982) Exact cortical extent of the origin of the corticospinal tract (CST) and the quantitative contribution to the CST in different cytoarchitectonic areas. A study with horseradish peroxidase in the monkey. J. Hirnforsch., 23: 257-269.
Vaadia, E., Haalman, I., Abeles, M., Bergman, H., Prut, Y., Slovin, H. and Aertsen, A. (1995) Dynamics of neuronal interactions in monkey cortex in relation to behavioural events. Nature, 373: 515-518.
Weinrich, M. and Wise, S.P. (1982) The premotor cortex of the monkey. J. Neurosci., 2: 1329-1345.
Wise, S.P. (1985) The primate premotor cortex fifty years after Fulton. Behav. Brain Res., 18: 79-88.
Wise, S.P., Weinrich, M. and Mauritz, K.H. (1986) Movement-related activity in the premotor cortex of rhesus macaques. Prog. Brain Res., 64: 117-131.
M.A.L. Nicolelis (Ed.)
Progress in Brain Research, Vol. 130
© 2001 Elsevier Science B.V. All rights reserved
CHAPTER 18
Coding in the granular layer of the cerebellum
Erik De Schutter 1,* and Jan G. Bjaalie 2
1 Born-Bunge Foundation, University of Antwerp, Universiteitsplein 1, B2610 Antwerp, Belgium
2 Department of Anatomy, Institute of Basic Medical Sciences, University of Oslo, P.O. Box 1105 Blindern, N-0317 Oslo, Norway
Introduction
In this paper we formulate a new theory of how information is coded along the parallel fibers in the cerebellar cortex. A question which may arise is why such a new theory is needed at all. Previously we have argued that the dominant theory of cerebellar coding, i.e. the perceptron learning theory formulated by Marr (1969) and Albus (1971) that was extended by Ito (1982, 1984) and more recently by Mauk and colleagues (Raymond et al., 1996; Mauk, 1997), does not comply with current experimental data. The basic assumption of these theories, that long-term depression (LTD) is the mechanism by which memory traces are coded at the parallel fiber to Purkinje cell synapse and that LTD induction is controlled by the climbing fiber input, is not beyond doubt (De Schutter, 1995, 1997). For example, recent data showing that LTD can be induced by pure parallel fiber excitation without any conjunctive signal (Hartell, 1996; Eilers et al., 1997; Finch and Augustine, 1998) do not conform to the theory proposed by Marr, Albus and Ito. Instead these findings indicate that the climbing fiber signal is not required to induce learning at the parallel fiber synapse and that, in fact, LTD may have quite a different function. Moreover, studies using transgenic mice in which LTD induction was blocked have raised serious doubts about a link between LTD and cerebellar motor control (e.g. De Zeeuw et al., 1998) and about the necessity of cerebellar LTD for eyeblink reflex conditioning (reviewed in De Schutter and Maex, 1996). As we have discussed this issue extensively elsewhere (De Schutter, 1995, 1997; De Schutter and Maex, 1996), we will not further address it here. Instead we will focus on the function of the input layer of the cerebellar cortex, the granular layer, and the mossy fiber projections to it. As this layer processes the mossy fiber input, it makes sense to first try to understand how it transforms inputs to the cerebellum into parallel fiber signals before considering in more detail the role of LTD at the parallel fiber synapse. Such a study is necessary, especially now that recent experimental data have raised doubts about the effectiveness of parallel fiber input in exciting Purkinje cells (Cohen and Yarom, 1998; Gundappa-Sulur et al., 1999). Most of the data presented and reasoning developed in this chapter concern the hemispheres of the rat cerebellum and the corticopontine somatosensory projections to this region. This emphasis reflects both our own work and the wealth of data available on these parts of the cerebellum. Considering the conserved cytoarchitecture from archi- to neocerebellum (Palay and Chan-Palay, 1974; Ito, 1984) it seems reasonable to expect that our conclusions about the neocerebellum will also apply to the rest of the cerebellum.
* Corresponding author: Erik De Schutter, Born-Bunge Foundation, University of Antwerp, Universiteitsplein 1, B2610 Antwerp, Belgium. Fax: +32-3-820-2669; E-mail: [email protected]
A short review of the anatomy and physiology of the granular layer

In this section we briefly introduce the reader to both well known facts and more recent data on the granular layer of the cerebellar cortex. As mentioned before, we focus on the mossy fiber system, which is numerically the most important input to the cerebellum (Murphy and Sabah, 1971; Brodal and Bjaalie, 1992). For the processing of mossy fiber input the anatomy of cerebellar cortex can be approximated by a two-layered network. The granule cell input layer encodes the incoming mossy fiber signals and transmits them through the parallel fiber system to the output layer, consisting mainly of the Purkinje cells. In both layers neural activity is controlled by inhibitory neurons: the Golgi cells in the input layer, and the basket and stellate cells in the output layer. Because of the large number of granule cells (about 101 billion in man; Andersen et al., 1992), the granule cell to Golgi cell ratio is very high. Recent estimates of a ratio of 400 (Korbo et al., 1993) are lower than those used previously (Ito, 1984), but all these studies have probably underestimated the ratio as they assumed that all large neurons in the granular layer are Golgi cells, which is not the case (Dieudonné and Dumoulin, 2000; Geurts et al., 2001). Mossy fibers activate both the excitatory granule cells and the inhibitory Golgi cells in the granular layer (Fig. 1).

Fig. 1. Schematic representation of the organization of the granular layer of the cerebellum, transverse view. Mossy fibers originating external to the cerebellum excite both granule and Golgi cells, granule cells excite Golgi cells through their long parallel fibers, and Golgi cells inhibit granule cells. Each granule cell receives about 4 mossy fiber inputs and about 10 inhibitory contacts, but it is unclear whether these come from different Golgi cells or not. The number of parallel fiber contacts onto Golgi cells is not known.

Each granule cell receives input from
multiple mossy fibers, but physiological recordings suggest that mossy fibers projecting to a particular region code similar information (see below and Bower et al., 1981). The granule cell axon has an ascending part (Gundappa-Sulur et al., 1999), which may have a strong excitatory influence on overlying Purkinje cells (Bower and Woolston, 1983; Cohen and Yarom, 1998), and then splits into two parallel fiber segments. The parallel fibers not only transmit information to the Purkinje cell output layer, but also provide additional excitatory input to Golgi cells. Each Golgi cell in turn inhibits the many granule cells present within the range of its axonal arbor (Eccles et al., 1966). Unique to the granular layer circuit is the absence of inhibitory connections between Golgi cells and of excitatory connections between granule cells. Combined with the parallel fiber excitation of Golgi cells and their inhibition of granule cells, this means that it contains pure feedback inhibition loops. In addition, the direct excitation of Golgi cells by mossy fibers forms a feed-forward inhibition connection.

Recently, cerebellar slice recordings have provided additional insights into granule and Golgi cell physiology. Granule cells are regularly firing neurons which do not show adaptation (D'Angelo et al., 1995; Brickley et al., 1996). In rat cerebellar slices they have a rather high threshold, requiring co-activation of two or more mossy fiber inputs to fire the cell (D'Angelo et al., 1995). The mossy fiber to granule cell synapse can undergo long-term potentiation (LTP; D'Angelo et al., 1999) and under particular conditions granule cells may show burst firing (D'Angelo et al., 1998). Golgi cells are spontaneously active in slice (3-5 Hz; Dieudonné, 1998) and show firing rate adaptation upon current injection. This firing rate adaptation plays an important role in how these cells synchronize in vivo (see below and Maex et al., 2000).
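The circuit described above can be written down as a small signed graph, which makes the two inhibitory motifs explicit. The following is only a minimal sketch: the cell labels, the `CONNECTIONS` table and the helper functions are illustrative names chosen here, and the graph records only the connections named in the text, not synapse counts or strengths.

```python
# Signed connectivity of the granular layer circuit as described in the text.
# +1 = excitatory, -1 = inhibitory. All names are illustrative labels.
CONNECTIONS = {
    ("mossy_fiber", "granule_cell"): +1,
    ("mossy_fiber", "golgi_cell"): +1,
    ("granule_cell", "golgi_cell"): +1,   # through the parallel fibers
    ("golgi_cell", "granule_cell"): -1,
}


def sign(pre, post):
    """Return +1, -1, or None when the text describes no such connection."""
    return CONNECTIONS.get((pre, post))


def is_feedback_inhibition(principal, interneuron):
    """Principal cell excites the interneuron, which inhibits it in return."""
    return sign(principal, interneuron) == 1 and sign(interneuron, principal) == -1


def is_feedforward_inhibition(source, principal, interneuron):
    """One input excites both cells, and the interneuron inhibits the principal cell."""
    return (sign(source, principal) == 1
            and sign(source, interneuron) == 1
            and sign(interneuron, principal) == -1)


def is_pure_loop(principal, interneuron):
    """'Pure' as used in the text: no recurrent links within either population."""
    return (is_feedback_inhibition(principal, interneuron)
            and sign(principal, principal) is None
            and sign(interneuron, interneuron) is None)


print(is_pure_loop("granule_cell", "golgi_cell"))
print(is_feedforward_inhibition("mossy_fiber", "granule_cell", "golgi_cell"))
```

Both checks hold for the table as written: the granule-Golgi pair forms a feedback loop without Golgi-Golgi or granule-granule links, and the mossy fiber input to both cell types adds the feed-forward path.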
The fractured somatotopy of mossy fiber projections

The response characteristics of the granular layer to tactile stimulation have been studied extensively in the cerebellar hemispheres of the anesthetized rat.

Fig. 2. The tactile receptive field map of the cerebellar folia crusIIa, crusIIb, and the paramedian lobule (PML). Each patch represents either ipsilateral, contralateral or bilateral responses. The patch-like mosaic representation of different body parts, with adjacent patches often receiving projections from non-adjacent body parts, has been termed fractured somatotopy. The schematic map shown in this figure emphasizes the multiple representations of the upper lip. Modified from Welker (1987) and Bower and Kassel (1990).

At the level of field potentials, which probably reflect the activation of mossy fiber synapses, one finds a fractured somatotopy of the tactile receptive fields (Shambes et al., 1978; Welker, 1987; Bower and Kassel, 1990). This means that the receptive field map is a mosaic of small patches (on the order of a few 100 µm in diameter), each representing a different part of the body surface (Fig. 2). Furthermore, each particular input location, e.g. the upper lip region, is represented multiple times, but always surrounded by different neighboring patches. This particular arrangement of the receptive fields, combined with the dominance of the ascending component of the granule cell axon (Bower and Woolston, 1983; Gundappa-Sulur et al., 1999), has led Bower (1997) to propose that the parallel fibers may have a role different from the ascending component. The parallel fiber would carry context signals from distant
patches to the Purkinje cells, which integrate these signals with the dominant local mossy fiber input from the underlying patch. The field potentials recorded in each of the patches in response to tactile stimulation contain two components. The early one (8-10 ms delay) is caused by a direct pathway from the trigeminal nuclei, while the late one (16-22 ms delay) reflects mossy fiber activation through a thalamo-corticoponto-cerebellar loop (Morissette and Bower, 1996). The two separate mossy fiber pathways project to the same patches in cerebellar cortex (Bower et al., 1981), though the pontocerebellar mossy fibers tend to have a more diffuse projection and more often carry bilateral signals than the trigeminal ones (Morissette and Bower, 1996). Recently we have started recording the responses of inhibitory Golgi cells in these areas to tactile stimulation (Fig. 3) (Vos et al., 1999b, 2000). In contrast to the fractured somatotopy observed in field potential recordings, Golgi cell receptive fields are very large and often bilateral. They usually also reflect the consecutive activation of the two different mossy fiber pathways, with delays of the respective peak responses which are similar to those observed in the field potential recordings. The large receptive fields observed in Golgi cell recordings are probably due to the activation of each Golgi cell by parallel fibers originating from patches with different input representations. In addition, for most Golgi cells it is possible to find a particular response pattern which has a trigeminal component consisting of two or more sharp and highly accurate peaks (Fig. 3). This specific response pattern can be evoked from only one location on the rat's face and presumably reflects the direct activation of the Golgi cell by mossy fibers in the local patch (Vos et al., 1999b, 2000). Another intriguing property of the Golgi cell responses is the long silent period following the initial excitatory response (Fig.
3; Vos et al., 1999b; De Schutter et al., 2000). As Golgi cells are spontaneously active, this silent period is quite noticeable. Similar silent periods are also found in other parts of the somatosensory system (Mountcastle et al., 1957; Mihailoff et al., 1992; Nicolelis and Chapin, 1994), where they are assumed to be the consequence of local feed-forward inhibition (Dykes et al., 1984). Note, however, that in somatosensory cortex such silent periods are observed in excitatory neurons while inhibitory neurons remain active (Brumberg et al., 1996). This is clearly not the case in the granular layer, where a silent period is observed in the inhibitory Golgi cell.

Fig. 3. Response of two Golgi cells to tactile stimulation. (A) Recording sites in crusII marked on top view of the cerebellum. (B) Same on a transverse section. (C) Location of the tactile stimulus. (D) Responses of the two cells; in both cases the complete response over 600 ms following the stimulus (notice the long silent period) and a blow-up of the initial response (first 50 ms) are shown. Because of the double early peak (7 and 11 ms) the cell to the left is presumed to receive direct trigeminal mossy fiber input; the one to the right is activated through parallel fiber synapses. Both cells show an early trigeminal (<15 ms) and late corticopontine component. Modified from Vos et al. (1999b).
Coding in the corticopontine pathway

Knowing the properties of the corticopontine pathway is important as this may provide clues to what type of information the cerebellum wants to receive. Compared to the situation in the cerebellar hemispheres, the mapping from the neocortex to the pontine nuclei (PN) is relatively simple. In the developing rat, axons originating in restricted cortical regions grow into widespread but specific lamellar subspaces in the PN (Leergaard et al., 1995; Fig. 4A). There is an orderly topographic relationship between cortical sites of origin and the PN lamellar target regions, possibly related to temporal gradients operative within the cortex and PN (Fig. 4C). The anterolateral part of the cortex projects to an internal, central core of the PN, ventral to the descending fiber tract. Cortical sites at increasing distance from this anterolateral region innervate progressively more external lamellar subvolumes. This 3-D pontine topographical arrangement observed in young animals preserves the overall neighboring relationships of the cortical map. Corticopontine projections in adult animals have classically been described as topographically organized (for review, see Brodal and Bjaalie, 1992). Compared to the initially widespread projections in the young animal, adult projections are more restricted and the continuous lamellar pattern is broken into pieces, described as patches or clusters within lamellar regions (Bjaalie et al., 1997; Leergaard and Bjaalie, 1998; Leergaard et al., 2000a). In single sections these separated patches may be interpreted as the substrate for the fractured map in cerebellar cortex. But, as neighbor relationships among the multiple patches or clusters of terminal fields in the PN largely reproduce those found in the neocortex (Leergaard and Bjaalie, 1998; Leergaard et al., 2000a), the
overall [...] tracing, [...] cortical [...] onto the [...] face [...] the PN, [...] trunk [...] SI [...] in the pontine map. What then happens at a smaller scale? For example, could there be a fractured projection from smaller regions of SI onto the PN? We are currently studying [...] from the [...] we [...] et al., [...] locations [...] We have so far [...] or major jumps would be [...] While the [...] may be [...] it becomes [...] of the stimulus representation [...] has been [...] system. Thus, [...] contains a more even distribution of distal versus proximal body representations than SI itself (Overby et al., 1989; Vassbø et al., 1999). Similarly, the corticopontine projection from several visual cortical areas has a more even distribution of foveal versus extrafoveal representations than the neocortex (Bjaalie and Brodal, 1983; Bjaalie, 1985, 1986). It is known that distal body parts are emphasized in the somatosensory cortical areas, in the sense that they occupy disproportionally larger cortical volumes (Nelson et al., 1980). If this emphasis were [...]

Fig. 4. [...] adult rats. (A) Projections in the [...] is upwards and ventral to the [...] hemisphere) are superimposed [...] (red-yellow-blue) corresponds to an internal-to-external shift of distribution in the PN, largely preserving neighboring relationships. (B) Adult rat projections from three major adjacent SI body representations. Presentation as in (A). Note that the representations of the trunk (yellow) and hindlimb (blue) surround the representation of the face (red). The adult PN contains multiple representations for each body part, but overall neighboring relationships of the SI map are preserved. (C) Cartoon of the Leergaard et al. (1995) hypothesis explaining the establishment of general topographic organisation in the rat corticopontine system. Temporal gradients, from early to late, are illustrated by the colors red-yellow-blue. Early arriving corticopontine fibres innervate the early established central core of the PN, whereas later arriving fibres innervate progressively more external volumes. (A) and (C) are modified from Leergaard et al. (1995) and (B) from Leergaard et al. (2000a).
The findings summarized in Figs. 5 and 6 have important implications as they suggest that cortical information is rescaled and partially renormalized before being transmitted to the cerebellum. While maps in the cortex typically overrepresent functionally important parts of the input map, like the fovea for the visual system, the hand for the monkey somatosensory system or the vibrissae for the rat somatosensory system, this may not be the case to the same extent for input to the cerebellum. As far as the map of somatosensory projections to the rat cerebellar hemisphere is known (one should realize that only the crowns of the folia have been mapped in detail, e.g. Fig. 2), it seems that vibrissal input is represented to a smaller extent than in somatosensory cortex (Chapin and Lin, 1990; Voogd and Glickstein, 1998). We can conclude that the corticopontine projections transmit a different subset of the input space from the one represented in the cerebral cortex. Furthermore, it seems likely that the transformation from a continuous map in SI to a fractured somatotopic map in the cerebellum takes place primarily in the pontocerebellar projection, given that the corticopontine projection is not basically fractured, but the pontocerebellar pathway needs further study. Finally, it is well known that the corticopontine projections are among the fastest pathways in the human nervous system (Allen and Tsukahara, 1974).

Fig. 5. Flattened map of the cynomolgus monkey area 3b, showing density gradients of corticopontine cells in shades of grey. White indicates high density; dark grey low density. The PN was injected with large amounts of wheat germ agglutinin horseradish peroxidase and the retrogradely labelled neurons in the cortex were quantitatively mapped. (A) Three-dimensional landscape presentation of the density distribution. (B) Two-dimensional density map. Dashed lines indicate the approximate boundaries of the major body representations in area 3b, as outlined by Nelson et al. (1980). Note that the highest densities of corticopontine neurons are found in the representations of the trunk, proximal hindlimb (HL), and proximal forelimb (FL). The same pattern is found in other postcentral somatosensory areas. Modified from Vassbø et al. (1999).
Fig. 6. Distribution of corticopontine neurons in cat visual area 18. The PN was injected with large amounts of wheat germ agglutinin horseradish peroxidase and the retrogradely labelled neurons in the cortex were quantitatively mapped. The histograms show the distribution of corticopontine neurons in equally sized blocks of the lower visual field close to the vertical meridian (azimuth 0°-20°). Upper left: Densities of corticopontine neurons (cells/mm² cortex) decrease from the representation of the lower peripheral visual field (elevation -50°) to the central visual field representation (elevation 0°). Lower left: The number of corticopontine neurons devoted to each equally sized block of the visual field (cells/mm² cortex × mm² cortex/visual field block) is higher for blocks close to the central visual field representation. The perimeter chart shows the relative strength of the corticopontine projection from different parts of the visual field represented in area 18, based on quantitative data exemplified in the lower right histogram. Modified from Bjaalie (1985).

The function of the cerebellar granular layer

Most cerebellar theories give little consideration to the function of the granular layer; they focus instead on the interaction between parallel fibers and Purkinje cells (Ito, 1982; Braitenberg et al., 1997) and, more recently, also on that between Purkinje cells and neurons in the deep cerebellar nuclei (Raymond et al., 1996). This focus on the output side of the cerebellar circuitry may be misconceived, considering that the granular layer contains 98% of the cerebellar neurons (Palay and Chan-Palay, 1974). In fact, Marr (1969) and Albus (1971) did consider the function of the granular layer in detail in their original papers. They attribute to it an important function in recoding the mossy fiber input so that the simple perceptron learning rule, which they propose for the parallel fiber to Purkinje cell synapse, can be applied to complex input patterns. Without such a recoding scheme perceptrons cannot learn to distinguish patterns that are not linearly separable (Minsky and Papert, 1969). Albus' paper contains a nice example where he shows how the recoding of mossy fiber input by the granular layer, which is in effect a combinatorial expansion by about two orders of magnitude, can circumvent this problem. In these theories the inhibitory Golgi cells control the activation threshold of granule cells, thereby keeping the number of active parallel fibers relatively small and constant over large variations in the number of active mossy fibers (Marr, 1969). This control over the number of active parallel fibers enhances the performance of the perceptron learning rule. Albus (1971) used the term 'automatic gain control' to describe the role of the feedback inhibition by Golgi cells. Overall this would restrain the number of active parallel fibers contacting a single Purkinje cell to 1% (Albus, 1971) or 0.3 to 6% (Marr, 1969). We have recently criticized the proposed gain control function of Golgi cells (De Schutter et al., 2000) and will not repeat our arguments in detail here. Instead we will focus on our recent modeling and experimental work, which suggests another function for cerebellar Golgi cells: the control over the timing of granule cell spikes (Maex and De Schutter, 1998a; Vos et al., 1999a).
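The recoding argument of Marr and Albus can be illustrated with a toy problem. The sketch below is not the example from Albus' paper: it uses the XOR pattern as a stand-in for a non-separable input, and the `expand` function, which adds a single conjunction unit, is a hypothetical and drastically reduced substitute for the combinatorial expansion attributed to the granular layer. All function names are chosen here for illustration.

```python
from itertools import product


def train_perceptron(samples, n_epochs=100):
    """Classic perceptron rule; returns (weights, bias, converged)."""
    n_inputs = len(samples[0][0])
    weights = [0.0] * n_inputs
    bias = 0.0
    for _ in range(n_epochs):
        errors = 0
        for x, target in samples:
            activation = sum(w * xi for w, xi in zip(weights, x)) + bias
            output = 1 if activation > 0 else 0
            delta = target - output
            if delta != 0:
                errors += 1
                weights = [w + delta * xi for w, xi in zip(weights, x)]
                bias += delta
        if errors == 0:          # one full pass without mistakes
            return weights, bias, True
    return weights, bias, False


def expand(x):
    """Hypothetical recoding: keep both inputs and add their conjunction."""
    x1, x2 = x
    return (x1, x2, x1 * x2)


# XOR: a pattern set that is not linearly separable in its raw form.
xor_samples = [((a, b), a ^ b) for a, b in product((0, 1), repeat=2)]

_, _, raw_converged = train_perceptron(xor_samples)
_, _, expanded_converged = train_perceptron(
    [(expand(x), target) for x, target in xor_samples]
)
print(raw_converged, expanded_converged)
```

On the raw inputs the rule never completes an error-free pass, whereas on the expanded code it settles within a few passes. The point is only that a larger, recoded input layer makes the same simple learning rule sufficient.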
Golgi cells fire synchronously along the parallel fiber beam

Our modeling studies of cerebellar cortex indicate that the cerebellar granular layer is highly prone to synchronous oscillations (Maex and De Schutter, 1998b).

Fig. 7. Raster plot showing spike timing of 10 Golgi cells (upper part) and 300 granule cells (lower part). Initially the network is not activated and only the Golgi cells fire occasionally. At the time indicated by the vertical line, homogeneous 40 Hz mossy fiber input is applied and the complete circuit starts firing synchronously at a regular rhythm of about 20 Hz. Simulation of the standard network configuration described in Maex and De Schutter (1998b) but with a more dense packing of the units (30 Golgi cells, 21555 granule cells and 2160 mossy fibers).

A typical example is shown in Fig. 7. The raster plots show the activity in a large one-dimensional network simulation, where all units are positioned along the parallel fiber axis. Initially, no
mossy fiber input is provided and the spontaneous activity of Golgi cells results in a low rate of desynchronized firing. When the simulated mossy fibers are activated, however, all Golgi cells synchronize immediately and start firing rhythmically. In comparison, the granule cells show more complicated behavior. While they are also entrained in the synchronous oscillation, they fire less precisely and often skip cycles. The differences in behavior of individual granule cells in Fig. 7 can be explained by the randomization of connectivity and intrinsic excitability parameters (Maex and De Schutter, 1998b), indicating that such relatively small sources of variability can generate complex activity patterns within the overall regular oscillation.

The appearance of synchronous oscillations is explained by the intrinsic dynamics of the pure feedback inhibition circuit (Fig. 1). This can be easily understood by first considering the subcircuit consisting of a single Golgi cell and its many postsynaptic granule cells. Inhibitory neurons exert a strong influence on the timing of action potentials in their target neurons (Lytton and Sejnowski, 1991; Cobb et al., 1995). The simulated granule cells will fire when inhibition is at its lowest, which is just before the next Golgi cell spike. Consequently the large population of granule cells postsynaptic to one Golgi cell will fire at about the same time. The loosely synchronous granule cell activity then excites the same Golgi cell and causes it to fire immediately, leading to the establishment of a synchronized oscillation within this subcircuit, with granule spikes shortly preceding the Golgi cell spike. The long parallel fibers which are typical for the structure of the cerebellar cortex couple many of these oscillatory subcircuits together. Common parallel fiber input will cause Golgi and granule cells located along the same transverse axis to fire (almost) synchronously. This is a dynamic property of the cerebellar circuitry; once the granular layer is activated sufficiently the most stable form of spiking is a synchronous oscillation (Maex and De Schutter, 1998b).
The accuracy of this synchronization increases with increased mossy fiber activity, which also leads to an increased Golgi cell firing rate (Maex and De Schutter, 1998b). Consequently we expect to find a firing-rate dependency of the synchronization (Maex et al., 2000). As seen in Fig. 7 the synchronization is immediate upon activation; there is no delay due to the slow parallel fiber conduction velocity. Finally, Golgi and granule cell populations are synchronized over the complete extent of the transverse
axis where mossy fibers are activated, even if this is much longer than the mean parallel fiber length of 4.7 mm (Pichitpornchai et al., 1994). Because both cell populations fire in loose synchrony, Golgi cell activity can be used to estimate the timing of granule cell spikes, though individual granule cells may skip cycles of the oscillation (Fig. 7). This is important as one cannot isolate single granule cells in vivo, while it is relatively easy to isolate Golgi cells (Edgley and Lidierth, 1987; Van Kan et al., 1993). These modeling predictions were confirmed using multi-single-unit recordings of spontaneous Golgi cell activity in the rat cerebellar hemispheres (Vos et al., 1999a). A total of 42 Golgi cell pairs in 38 ketamine-xylazine anesthetized rats were recorded. Of these, 26 pairs were positioned along the transverse axis (i.e. along the same parallel fiber beam), while the other 16 pairs were located along the sagittal axis (no common parallel fiber input). All transverse pairs except one showed a highly significant coherence, measured as the height of the central peak in the normalized cross-correlogram. An example of such a cross-correlogram obtained from a pair of Golgi cells along the transverse axis is shown in Fig. 8. Conversely, in 12 out of 16 sagittal pairs no synchrony could be found. The remaining four sagittal pairs showed low levels of coherence, but in each of these pairs the Golgi cells were located within 200 µm of each other. We assume that in these latter four pairs the cells were so close to each other that their dendritic trees overlapped (Dieudonné, 1998), allowing them to sample common mossy and/or parallel fiber input despite their parasagittal separation. These findings confirmed the main prediction of the network simulations: Golgi cells along the parallel fiber beam do indeed fire synchronously.
Additionally, as predicted, the accuracy of synchronization, evidenced by higher and sharper central peaks in the cross-correlogram, increased with the Golgi cell firing rate (fig. 1 of Vos et al., 1999a). This indicates that synchronization may be much more accurate in awake animals, compared to the loose synchrony observed in the anesthetized rat. The only data presently available from awake animals are field potential recordings (Pellerin and Lamarre, 1997; Hartmann and Bower, 1998). These studies have also demonstrated the presence of oscillations in the granular layer that may correspond to those predicted by the model.

Fig. 8. Cross-correlation of spontaneous activity of two Golgi cells receiving common parallel fiber input. Horizontal axis: time lag (ms), from -250 to 250 ms. The highly significant central peak (the cross-correlogram has been normalized for firing frequency) is indicative of synchronous firing which is not very accurate as the peak is wide. This is the same Golgi cell pair as in Fig. 4. See Vos et al. (1999a) for experimental and statistical procedures.
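For readers unfamiliar with the measure, a cross-correlogram of two spike trains can be sketched as a histogram of all pairwise spike-time differences. The code below is a minimal sketch under stated assumptions: the spike trains are synthetic (two cells following shared events with independent jitter), the bin width and lag range are arbitrary choices, and the comparison with the median bin count is a crude stand-in for, not a reproduction of, the firing-frequency normalization and statistics of Vos et al. (1999a).

```python
import numpy as np


def cross_correlogram(spikes_a, spikes_b, max_lag_ms=250.0, bin_ms=10.0):
    """Count spike-time differences (b minus a) in bins centred on multiples of bin_ms."""
    a = np.asarray(spikes_a, dtype=float)
    b = np.asarray(spikes_b, dtype=float)
    diffs = (b[None, :] - a[:, None]).ravel()        # all pairwise lags
    diffs = diffs[np.abs(diffs) <= max_lag_ms]
    edges = np.arange(-max_lag_ms - bin_ms / 2, max_lag_ms + bin_ms, bin_ms)
    counts, _ = np.histogram(diffs, bins=edges)
    centers = (edges[:-1] + edges[1:]) / 2
    return centers, counts


# Synthetic data (an assumption of this sketch, not recorded spike trains).
rng = np.random.default_rng(0)
shared = rng.uniform(0.0, 100_000.0, size=500)       # common event times in ms
train_a = shared + rng.normal(0.0, 2.0, size=500)    # each cell follows with jitter
train_b = shared + rng.normal(0.0, 2.0, size=500)

centers, counts = cross_correlogram(train_a, train_b)
peak_lag = centers[np.argmax(counts)]
baseline = np.median(counts)
print(peak_lag, counts.max(), baseline)
```

With shared input the central bin stands far above the flanks. Widening the jitter broadens and lowers the peak, which is the sense in which a wide central peak indicates loose synchrony.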
The effect of spatially localized mossy fiber input

In the previous section we considered the synchronization of Golgi cells in response to a spatially homogeneous mossy fiber input in the network simulations and compared this to experimental data obtained without any stimulation. This is of course a rather artificial assumption. Considering the patchy receptive fields in the granular layer (Fig. 2), one expects stimulation to cause spatially heterogeneous mossy fiber activation. Recent modeling work in our laboratory demonstrates that a similar behavior can be observed when a patchy mossy fiber input is applied. Specifically, if two patches are activated by comparable levels of mossy fiber input they will synchronize immediately, even if separated by a few millimeters of only weakly activated granular layer (Franck et al., 2001). This can be observed in Fig. 9A, which shows the spike trains of the model Golgi cells and of a subset of granule cells in a one-dimensional model with strong feedback inhibition. Activation of two small patches (200 µm diameter, containing about 50 mossy fibers and 500 granule cells each, 1 mm separation) leads to the immediate synchronization of all Golgi cells along about 6 mm of the parallel fiber beam overlying the patches. As parallel fibers are 5 mm long in the network model, this means that all Golgi cells receiving input from the two patches become entrained in the rhythm, though the synchronization is clearly less robust than that evoked by homogeneous input (Fig. 7). The other Golgi cells in the network fire less than before or are hardly affected at all.

The picture looks somewhat different for granule cells. In Fig. 9 only a small subset of granule cell traces can be shown, so the borders of the patches are not represented. It can nevertheless be seen that they are activated inside the patches only. Between the two patches and at the outer borders of the patches they are actively inhibited by the activated Golgi cells. The granule cell activity within and between the two patches is highly synchronized. As in the fully activated network (Fig. 7), granule cells spike together with Golgi cells, but sometimes skip cycles. In addition granule cells sometimes fire bursts of two spikes. The latter behavior was even more pronounced when the network parameters from previous studies (Maex and De Schutter, 1998b) were used (to diminish bursting in Fig. 9A the synaptic strengths of parallel fiber and Golgi cell synapses have been raised, making the feedback inhibition loop stronger). The possible importance of granule cell bursting for induction of synaptic plasticity at the parallel fiber to Purkinje cell synapse (Linden and Connor, 1995; Finch and Augustine, 1998) has [...]
[Fig. 9]
[...] inhibition, activation of spatially separated patches can also couple the activity of Golgi cells. In their effect on granule cell firing Golgi cells seem to have both a discriminating and an integrating function: they increase the contrast in firing rate between activated and non-activated granule cells and they synchronize firing of activated granule cells. In contrast, if a similar input is applied to a network version with weak feedback inhibition and strong feed-forward inhibition the situation looks quite different (Fig. 9B). These networks show no synchronization of Golgi cell firing for homogeneous mossy fiber input (Maex and De Schutter, 1998b). In Fig. 9B the Golgi cells are more active because of their direct excitation by mossy fibers. Only the Golgi cells receiving increased mossy fiber input in the patches (and a few of their immediate neighbors) show a response to the activation: they increase their firing rate without synchronizing. The granule cell activity is very heterogeneous: some fire at rates similar to those in Fig. 9A, others do not. While some of these spikes are clearly synchronized within a patch, there is almost no synchronization of activity between the two patches. The synchronization within patches is explained by common inhibition from one Golgi cell. As the parallel fiber activation of Golgi cells is very weak in these simulations, it is not sufficient to synchronize the Golgi cells.
Spatio-temporal coding along the parallel fiber beam

In the left panels of Fig. 9 we compare the granule cell spiking in a synchronized version (Fig. 9A) and a desynchronized version (Fig. 9B) of the granular layer network model. In the right panels we compare the effect of synchronization on parallel fiber spike transmission by showing two snapshots for each network version.
Both simulations show waves of spikes traveling along the parallel fiber beam, but with an important distinction. In the feedback inhibition model the patches fire loosely synchronized so that the spikes originate in both patches, while in the feed-forward inhibition model all the spikes that can be observed at one time originate in only one of the patches. Only rarely did one observe spikes originating outside the patches or, in the case of Fig. 9D, a spike originating in the other patch. A comparison of these two simulations suggests that at the low firing rates present in the model granule cells the synchronization of the feedback inhibition model has an important effect on the patterns of parallel fiber spiking that are perceived by Purkinje cells. In particular, in the synchronized model the two patches activate Purkinje cells at roughly the same time (Fig. 9C) while without synchronization they will do so separately (Fig. 9D). The spike waves of Fig. 9C,D are reminiscent of the tidal waves proposed by Braitenberg et al. (1997), with the important difference that their timing is generated inside the cerebellar cortex, not outside of it. But Braitenberg et al. (1997) proposed that Purkinje cells would be activated only where all the spikes synchronize along the parallel fiber beam. Because of the loose synchronization in the network model this is unlikely to occur. We assume that accurate synchronization may not be important, at least not along the parallel fiber beam, because our Purkinje cell model (De Schutter and Bower, 1994a,b) is a very poor coincidence detector (De Schutter, 1998). This is to be expected as the Purkinje cell dendrite does not contain fast sodium channels (Stuart and Häusser, 1994). Instead its window of temporal integration is determined by much slower activating dendritic calcium channels (Regan, 1991; Usowicz et al., 1992). There are two ways in which the synchronized spike waves may be decoded by Purkinje cells.
The simplest one is to assume a population rate coding scheme (Rieke et al., 1997). Because both patches fire loosely synchronized (Fig. 9A), they cause short bursts of spiking activity along the parallel fiber beam (Fig. 9C) during their activation. Purkinje cells receiving synaptic input from those parallel fibers could simply integrate this input with a total excitation determined by the size and number of patches activated (for a particular mossy fiber input rate). In support of this hypothesis, the conduction time needed for a spike to travel along one half of a parallel fiber (5-12 ms; Bernard and Axelrad, 1991; Vranesic et al., 1994) is in the same range as the typical time window within which granule cells in both patches spike in the simulations of Fig. 9A (about 10 ms).

An alternative hypothesis is to assume a temporal code, captured in the relative timing of the granule cell spikes. It was Hopfield (1995) who first hypothesized that the nervous system can use relative phase lags between spikes to encode information. In the context of pattern recognition such a temporal code has the advantage of being much less sensitive to stimulus amplitude than standard rate codes (which are used by the perceptron learning rule of Marr and Albus). While Hopfield proposed that coincidence detection combined with different afferent delays would be used to decode such phase lags, one can imagine alternative schemes which decode the phase lags directly (Steuber and Willshaw, 1999).

Whichever hypothesis one prefers, the synchronization of firing of granule cells which are positioned along the same parallel fiber beam contributes to the transformation of spatial patterns present in the mossy fiber input into a temporal pattern that is transmitted along the parallel fiber system. According to the population rate coding hypothesis it is the burst of synchronized spikes which associates the activity originating in two spatial locations; in the temporal coding hypothesis the actual phase lag between spikes is the source of information. To distinguish between these two possibilities it would be helpful to know how accurate the timing is in awake animals, as the temporal coding hypothesis requires more accurate synchronization than that observed in anesthetized rats (Vos et al., 1999a; Maex et al., 2000). We are currently investigating this issue (Vos et al., 1999c).
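The time scales invoked for the population rate hypothesis can be checked with simple arithmetic. The sketch below is an assumption-laden illustration: it combines the measured mean parallel fiber length (4.7 mm) with the quoted conduction times and the 1 mm patch separation of the model, although the model itself uses 5 mm fibers, and it treats conduction velocity as uniform along the fiber. The variable names are chosen here for illustration.

```python
# Values quoted in the text; combining them this way is an assumption of this sketch.
MEAN_FIBER_LENGTH_MM = 4.7          # mean parallel fiber length
HALF_FIBER_TIME_MS = (5.0, 12.0)    # conduction time along one half of a fiber
PATCH_SEPARATION_MM = 1.0           # separation of the two model patches
SPIKE_WINDOW_MS = 10.0              # approximate spread of granule cell spikes (Fig. 9A)


def velocity_mm_per_ms(distance_mm, time_ms):
    """Mean conduction velocity, assuming it is uniform along the fiber."""
    return distance_mm / time_ms


half_length_mm = MEAN_FIBER_LENGTH_MM / 2.0
fastest = velocity_mm_per_ms(half_length_mm, min(HALF_FIBER_TIME_MS))
slowest = velocity_mm_per_ms(half_length_mm, max(HALF_FIBER_TIME_MS))

# Extra travel time for spikes from the farther patch to reach a Purkinje
# cell lying beyond both patches on the same beam.
lag_fast_ms = PATCH_SEPARATION_MM / fastest
lag_slow_ms = PATCH_SEPARATION_MM / slowest

print(f"implied velocity: {slowest:.2f} to {fastest:.2f} mm/ms")
print(f"extra travel time over {PATCH_SEPARATION_MM} mm: "
      f"{lag_fast_ms:.1f} to {lag_slow_ms:.1f} ms")
```

Under these assumptions the extra travel time between the two patches is a few milliseconds, shorter than the roughly 10 ms spread of granule cell spikes, so conduction delays alone would not separate the two bursts.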
Bringing it all together
Pontine sensory input to the cerebellum copies cortical activity, without any obvious mixing of signals. Furthermore, the output from the SI cortex to the cerebellum via the pontine nuclei is renormalized to represent different body parts more equally (Figs. 5 and 6). Because of the patchy, fractured somatotopy of mossy fiber input (Fig. 2), tactile input will generate specific spatial patterns consisting of several co-activated patches in the granular layer (see also Peeters et al., 1999). The complex spatial pattern of activation of patches may therefore be used to distinguish between different stimuli and/or activation patterns in neocortex. We propose that the Golgi cell feedback inhibition loosely synchronizes the activity of granule cells in co-activated patches and thereby supports the transformation of a spatially encoded mossy fiber signal into a temporal code of spike waves transmitted along the same parallel fiber beam. Without synchronization much higher granule cell firing rates are required to ensure loose coincidence of spikes originating in different patches (e.g. Fig. 9D). The spatio-temporal transformation hypothesis accords with the properties of the synchronization described above. The immediate synchronization of Golgi cells (Fig. 7) allows for an efficient transformation, while the lack of accuracy fits with the variable conduction velocities of parallel fibers (Bernard and Axelrad, 1991; Vranesic et al., 1994), which will slowly desynchronize the signal anyway. We expect that in awake animals transient synchronous oscillations linking several different or identical receptive field patches rise and wane continuously, transforming the spatial pattern of input into short sequences of synchronous spike waves along the parallel fiber beam. The amplitude of the input pattern will determine the frequency of these oscillations (Maex and De Schutter, 1998b). In principle this spatio-temporal code hypothesis is compatible with an additional combinatorial expansion of the mossy fiber signal, as suggested by Marr (1969) and Albus (1971). Nevertheless, alternative explanations for the large number of granule cells are available.
Our modeling studies show that a minimum number of parallel fiber contacts onto each Golgi cell must be activated to sustain the synchronous oscillations (Maex and De Schutter, 1998a). This is much more easily achieved in a sparsely activated network containing many granule cells. In support of the spatio-temporal code hypothesis we have found that the most important stimulus aspect determining the fine temporal shape of Golgi ...
... compatible with the perceptron learning as proposed by Marr (1969) and Albus (1971), with the important addition of a spatio-temporal recoding not included in their theories. The phase coding scheme may require more specific learning rules (e.g. Steuber and Willshaw, 1999) or interactions between parallel fiber excitation and inhibition by stellate and basket cells (Jaeger et al., 1997; Jaeger and Bower, 1999). Concerning the pontine nuclei, an additional intriguing possibility is that they enhance the generation of synchronous oscillations in the granular layer. The synchronization of Golgi and granule cell spiking between patches assumes that the mossy fiber excitation of each patch is roughly equal (Franck et al., 2001). Therefore it would be useful to have a mechanism available which keeps mossy fiber excitation evenly distributed across activated fibers. How the pontine nuclei could achieve this with only very few interneurons (Brodal et al., 1988; Border and Mihailoff, 1990) remains unclear, though subcortical projections to the pontine nuclei (reviewed in Brodal and Bjaalie, 1992) and feedback projections from the cerebellum (Schwarz and Thier, 1999) might play a role. An alternative mechanism to keep activation by mossy fiber input in different patches roughly equal could be plasticity of the mossy fiber to granule cell synapse (D'Angelo et al., 1999). As LTP at this synapse is suppressed by Golgi cell inhibition, it may preferentially enhance transmission at synapses which were not effective in activating the Golgi cell inhibitory feedback loop and thus boost mossy fiber transmission where it was relatively weak compared to elsewhere.
Conclusions
We propose that mossy fiber input to the cerebellum is coded primarily in spatial patterns, as reflected by the fractured somatotopy of the receptive fields in the granular layer. Feedback inhibition by Golgi cells loosely synchronizes granule cell firing along the parallel fiber beam. Simultaneous activation of granular layer patches causes synchronized firing of the activated granule cells and transforms the spatial code into a temporal code on the parallel fiber beam. The corticopontine projection contributes by distributing a (partially) renormalized copy of cortical activity to multiple patches and possibly by equalizing activity across fibers.
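The statement made earlier, that without synchronization much higher granule cell firing rates are required to ensure loose coincidence of spikes from different patches, can be illustrated with a back-of-the-envelope calculation. The sketch below is not the network model used for Fig. 9; it assumes, for illustration only, that two unsynchronized patches fire as independent Poisson processes and asks how likely both are to fire within the same window. The function names and the 10 ms window are assumptions.

```python
import math

def coincidence_probability(rate_hz, window_ms):
    """P(both patches emit at least one spike in the same window), independent Poisson firing."""
    p_one_patch = 1.0 - math.exp(-rate_hz * window_ms / 1000.0)
    return p_one_patch ** 2

def rate_needed(target_p, window_ms):
    """Firing rate (Hz) at which unsynchronized patches reach the target coincidence probability."""
    return -math.log(1.0 - math.sqrt(target_p)) / (window_ms / 1000.0)

low = coincidence_probability(50.0, 10.0)    # modest firing rate
high = coincidence_probability(200.0, 10.0)  # high firing rate
needed = rate_needed(0.9, 10.0)              # rate for 90% coincidence in 10 ms
print(round(low, 3), round(high, 3), round(needed, 1))
```

In this toy setting reliable coincidence within 10 ms requires rates of several hundred Hz, whereas patches that are synchronized by a shared oscillation fire inside the same window on every cycle regardless of rate.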
Acknowledgements
We thank Trygve Leergaard and Knut Vassbø for assistance with the preparation of figures, Hugo Cornelis for the necessary software development, and Reinoud Maex and Volker Steuber for careful reading of the manuscript. This research was funded by EC contract BIO4-CT98-0182, by The Research Council of Norway, by The Jahre Foundation, by IPA Belgium (P4/22) and by the Fund for Scientific Research, Flanders (FWO-Vl) (G.0401.00). EDS is supported by the FWO-Vl.
References
Albus, J.S. (1971) A theory of cerebellar function. Math. Biosci., 10: 25-61.
Allen, G.I. and Tsukahara, N. (1974) Cerebrocerebellar communication systems. Physiol. Rev., 54: 957-1005.
Andersen, B.B., Korbo, L. and Pakkenberg, B. (1992) A quantitative study of the human cerebellum with unbiased stereological techniques. J. Comp. Neurol., 326: 549-560.
Bernard, C. and Axelrad, H. (1991) Propagation of parallel fiber volleys in the cerebellar cortex: a computer simulation. Brain Res., 565: 195-208.
Bjaalie, J.G. (1985) Distribution in areas 18 and 19 of neurons projecting to the pontine nuclei: a quantitative study in the cat with retrograde transport of HRP-WGA. Exp. Brain Res., 57: 585-597.
Bjaalie, J.G. (1986) Distribution of corticopontine neurons in visual areas of the middle suprasylvian sulcus: quantitative studies in the cat. Neuroscience, 18: 1013-1033.
Bjaalie, J.G. and Brodal, P. (1983) Distribution in area 17 of neurons projecting to the pontine nuclei: a quantitative study in the cat with retrograde transport of HRP-WGA. J. Comp. Neurol., 221: 289-303.
Bjaalie, J.G., Sudbø, J. and Brodal, P. (1997) Corticopontine terminal fibres form small scale clusters and large scale lamellae in the cat. Neuroreport, 8: 1651-1655.
Border, B.G. and Mihailoff, G.A. (1990) GABAergic neural elements in the rat basilar pons: electron microscopic immunochemistry. J. Comp. Neurol., 295: 123-135.
Bower, J.M., Beerman, D.H., Gibson, J.M., Shambes, G.M. and Welker, W. (1981) Principles of organization of a cerebrocerebellar circuit. Micromapping the projections from cerebral (SI) to cerebellar (granule cell layer) tactile areas of rats. Brain Behav. Evol., 18: 1-18.
Bower, J.M. (1997) Is the cerebellum sensory for motor's sake or motor for sensory's sake: the view from the whiskers of a rat? Prog. Brain Res., 114: 463-496.
Bower, J.M. and Kassel, J. (1990) Variability in tactile projection patterns to cerebellar folia crus IIA of the Norway rat. J. Comp. Neurol., 302: 768-778.
Bower, J.M. and Woolston, D.C. (1983) Congruence of spatial organization of tactile projections to granule cell and Purkinje cell layers of cerebellar hemispheres of the albino rat: vertical organization of cerebellar cortex. J. Neurophysiol., 49: 745-766.
Braitenberg, V., Heck, D. and Sultan, F. (1997) The detection and generation of sequences as a key to cerebellar function. Experiments and theory. Behav. Brain Sci., 20: 229-245.
Brickley, S.G., Cull-Candy, S.G. and Farrant, M. (1996) Development of a tonic form of synaptic inhibition in rat cerebellar granule cells resulting from persistent activation of GABAA receptors. J. Physiol., 497: 753-759.
Brodal, P. and Bjaalie, J.G. (1992) Organization of the pontine nuclei. Neurosci. Res., 13: 83-118.
Brodal, P., Mihailoff, G., Border, B., Ottersen, O.P. and Storm-Mathisen, J. (1988) GABA-containing neurons in the pontine nuclei of rat, cat and monkey. An immunocytochemical study. Neuroscience, 25: 27-45.
Brumberg, J.C., Pinto, D.J. and Simons, D.J. (1996) Spatial gradients and inhibitory summation in the rat whisker barrel system. J. Neurophysiol., 76: 130-140.
Chapin, J.K. and Lin, C.-S. (1990) The somatic sensory cortex of the rat. In: B. Kolb and R.C. Tees (Eds.), The Cerebral Cortex of the Rat. MIT Press, Cambridge, MA, pp. 341-380.
Cobb, S.R., Buhl, E.H., Halasy, K., Paulsen, O. and Somogyi, P. (1995) Synchronization of neuronal activity in hippocampus by individual GABAergic interneurons. Nature, 378: 75-78.
Cohen, D. and Yarom, Y. (1998) Patches of synchronized activity in the cerebellar cortex evoked by mossy-fiber stimulation: questioning the role of parallel fibers. Proc. Natl. Acad. Sci. USA, 95: 15032-15036.
D'Angelo, E., De Filippi, G., Rossi, P. and Taglietti, V. (1995) Synaptic excitation of individual rat cerebellar granule cells in situ: evidence for the role of NMDA receptors. J. Physiol., 482: 397-413.
D'Angelo, E., De Filippi, G., Rossi, P. and Taglietti, V. (1998) Ionic mechanism of electroresponsiveness in cerebellar granule cells implicates the action of a persistent sodium current. J. Neurophysiol., 80: 493-503.
D'Angelo, E., Rossi, P., Armano, S. and Taglietti, V. (1999) Evidence for NMDA and mGlu receptor-dependent long-term potentiation of mossy fiber-granule cell transmission in rat cerebellum. J. Neurophysiol., 81: 277-287.
De Schutter, E. (1995) Cerebellar long-term depression might normalize excitation of Purkinje cells: a hypothesis. Trends Neurosci., 18: 291-295.
De Schutter, E. (1997) A new functional role for cerebellar long term depression. Prog. Brain Res., 114: 529-542.
De Schutter, E. (1998) Dendritic voltage and calcium-gated channels amplify the variability of postsynaptic responses in a Purkinje cell model. J. Neurophysiol., 80: 504-519.
De Schutter, E. and Bower, J.M. (1994a) An active membrane model of the cerebellar Purkinje cell. I. Simulation of current clamps in slice. J. Neurophysiol., 71: 375-400.
De Schutter, E. and Bower, J.M. (1994b) An active membrane model of the cerebellar Purkinje cell. II. Simulation of synaptic responses. J. Neurophysiol., 71: 401-419.
De Schutter, E. and Maex, R. (1996) The cerebellum: cortical processing and theory. Curr. Opin. Neurobiol., 6: 759-764.
De Schutter, E., Vos, B.P. and Maex, R. (2000) The function of cerebellar Golgi cells revisited. Prog. Brain Res., 124: 81-93.
De Zeeuw, C.I., Hansel, C., Bian, F., Koekkoek, S.K., van Alphen, A.M., Linden, D.J. and Oberdick, J. (1998) Expression of a protein kinase C inhibitor in Purkinje cells blocks cerebellar LTD and adaptation of the vestibulo-ocular reflex. Neuron, 20: 495-508.
Dieudonné, S. (1998) Submillisecond kinetics and low efficacy of parallel fibre-Golgi cell synaptic currents in the rat cerebellum. J. Physiol., 510: 845-866.
Dieudonné, S. and Dumoulin, A. (2000) Serotonin-driven long-range inhibitory connections in the cerebellar cortex. J. Neurosci., 20: 1837-1848.
Dykes, R.W., Landry, P., Metherate, R. and Hicks, T.P. (1984) Functional role of GABA in cat primary somatosensory cortex: shaping receptive fields of cortical neurons. J. Neurophysiol., 52: 1066-1093.
Eccles, J.C., Llinás, R.R. and Sasaki, K. (1966) The mossy fibre-granule cell relay of the cerebellum and its inhibitory control by Golgi cells. Exp. Brain Res., 1: 82-101.
Edgley, S.A. and Lidierth, M. (1987) The discharges of cerebellar Golgi cells during locomotion in the cat. J. Physiol., 392: 315-332.
Eilers, J., Takechi, H., Finch, E.A., Augustine, G.J. and Konnerth, A. (1997) Local dendritic Ca2+ signaling induces cerebellar long-term depression. Learn. Memory, 4: 159-168.
Eycken, A., Bjaalie, J.G., Volny-Luraghi, A. and De Schutter, E. (2000) Electrophysiology of the pontine nuclei in the anesthetized rat. Eur. J. Neurosci. Suppl., 11: 430.
Finch, E.A. and Augustine, G.J. (1998) Local calcium signalling by inositol-1,4,5-trisphosphate in Purkinje cell dendrites. Nature, 396: 753-756.
Franck, RR.C.A., Maex, R. and De Schutter, E. (2001) Synchronization between patches of local excitation in a cerebellar granular layer model. Neurocomputing, in press.
Geurts, F., Timmermans, J.-P. and De Schutter, E. (2001) Morphological and neurochemical differentiation of large granule layer interneurons in the adult rat cerebellum. Neuroscience, in press.
Gundappa-Sulur, G., De Schutter, E. and Bower, J.M. (1999) Ascending granule cell axon: an important component of the cerebellar cortical circuitry. J. Comp. Neurol., 408: 580-596.
Hartell, N.A. (1996) Strong activation of parallel fibers produces localized calcium transients and a form of LTD that spreads to distant synapses. Neuron, 16: 601-610.
Hartmann, M.J. and Bower, J.M. (1998) Oscillatory activity in the cerebellar hemispheres of unrestrained rats. J. Neurophysiol., 80: 1598-1604.
Harvey, R.J. and Napper, R.M.A. (1991) Quantitative studies of the mammalian cerebellum. Prog. Neurobiol., 36: 437-463.
Hopfield, J.J. (1995) Pattern recognition computation using action potential timing for stimulus representation. Nature, 376: 33-36.
Ito, M. (1982) Cerebellar control of the vestibulo-ocular reflex -- around the flocculus hypothesis. Annu. Rev. Neurosci., 5: 275-298.
Ito, M. (1984) The Cerebellum and Neural Control. Raven Press, New York.
Jaeger, D. and Bower, J.M. (1999) Synaptic control of spiking in cerebellar Purkinje cells: dynamic current clamp based on model conductances. J. Neurosci., 19: 6090-6101.
Jaeger, D., De Schutter, E. and Bower, J.M. (1997) The role of synaptic and voltage-gated currents in the control of Purkinje cell spiking: a modeling study. J. Neurosci., 17: 91-106.
Korbo, L., Andersen, B.B., Ladefoged, O. and Møller, A. (1993) Total numbers of various cell types in rat cerebellar cortex estimated using an unbiased stereological method. Brain Res., 609: 262-268.
Leergaard, T.B. and Bjaalie, J.G. (1998) From cortical 2-D to brain stem 3-D maps: organisation of corticopontine projections in developing and adult rats. Abstr. Soc. Neurosci., 24: 262.9.
Leergaard, T.B., Lakke, E.A.J.F. and Bjaalie, J.G. (1995) Topographical organization in the early postnatal corticopontine projection. A carbocyanine dye and 3-D computer reconstruction study in the rat. J. Comp. Neurol., 361: 77-94.
Leergaard, T.B., Lyngstad, K.A., Thompson, J.H., Taeymans, S., Vos, B.P., De Schutter, E., Bower, J.M. and Bjaalie, J.G. (2000a) Rat somatosensory cerebro-ponto-cerebellar pathways: spatial relationships in the SI somatotopic map are preserved in a three-dimensional clustered pontine map. J. Comp. Neurol., 422: 246-266.
Leergaard, T.B., Alloway, K.D., Mutic, J.J. and Bjaalie, J.G. (2000b) Three-dimensional topography of corticopontine projections from rat barrel cortex: correlations with corticopontine organization. J. Neurosci., 20: 8474-8484.
Linden, D.J. and Connor, J.A. (1995) Long-term synaptic depression. Annu. Rev. Neurosci., 18: 319-357.
Lytton, W.W. and Sejnowski, T.J. (1991) Simulations of cortical pyramidal neurons synchronized by inhibitory interneurons. J. Neurophysiol., 66: 1059-1079.
Maex, R. and De Schutter, E. (1998a) The critical synaptic number for rhythmogenesis and synchronization in a network model of the cerebellar granular layer. In: L. Niklasson, M. Bodén and T. Ziemke (Eds.), ICANN 98. Springer, London, pp. 361-366.
Maex, R. and De Schutter, E. (1998b) Synchronization of Golgi and granule cell firing in a detailed network model of the cerebellar granule cell layer. J. Neurophysiol., 80: 2521-2537.
Maex, R., Vos, B.P. and De Schutter, E. (2000) Weak common parallel fibre synapses explain the loose synchrony between rat cerebellar Golgi cells. J. Physiol., 523: 175-192.
Marr, D.A. (1969) A theory of cerebellar cortex. J. Physiol., 202: 437-470.
Mauk, M.D. (1997) Roles of cerebellar cortex and nuclei in motor learning: contradictions or clues? Neuron, 18: 343-346.
Mihailoff, G.A., Kosinski, R.J., Azizi, S.A., Lee, H.S. and Border, B.G. (1992) The expanding role of the basilar pontine nuclei as a source of cerebellar afferents. In: R.R. Llinás and C. Sotelo (Eds.), The Cerebellum Revisited. Springer, Berlin, pp. 135-164.
Minsky, M. and Papert, S. (1969) Perceptrons: An Introduction to Computational Geometry. MIT Press, Cambridge, MA.
Morissette, J. and Bower, J.M. (1996) Contribution of somatosensory cortex to responses in the rat cerebellar cortex granule cell layer following peripheral tactile stimulation. Exp. Brain Res., 109: 240-250.
Mountcastle, V.B., Davies, P.W. and Berman, A.L. (1957) Response properties of neurons of cat's somatic sensory cortex to peripheral stimuli. J. Neurophysiol., 20: 374-407.
Murphy, J.T. and Sabah, N.H. (1971) Cerebellar Purkinje cell responses to afferent inputs. II. Mossy fiber activation. Brain Res., 25: 469-482.
Nelson, R.J., Sur, M., Felleman, D.J. and Kaas, J.H. (1980) Representations of the body surface in postcentral parietal cortex of Macaca fascicularis. J. Comp. Neurol., 192: 611-643.
Nicolelis, M.A.L. and Chapin, J.K. (1994) Spatiotemporal structure of somatosensory responses of many-neuron ensembles in the rat ventral posterior medial nucleus of the thalamus. J. Neurosci., 14: 3511-3532.
Overby, S.E., Bjaalie, J.G. and Brodal, P. (1989) Uneven densities of corticopontine neurons in the somatosensory cortex. A quantitative experimental study in the cat. Exp. Brain Res., 77: 653-665.
Palay, S.L. and Chan-Palay, V. (1974) Cerebellar Cortex. Springer, New York.
Peeters, R.R., Verhoye, M., Vos, B.P., Van Dyck, D., Van Der Linden, A. and De Schutter, E. (1999) A patchy horizontal organization of the somatosensory activation of the rat cerebellum demonstrated by functional MRI. Eur. J. Neurosci., 11: 2720-2730.
Pellerin, J.-P. and Lamarre, Y. (1997) Local field potential oscillations in primate cerebellar cortex during voluntary movement. J. Neurophysiol., 78: 3502-3507.
Pichitpornchai, C., Rawson, J.A. and Rees, S. (1994) Morphology of parallel fibers in the cerebellar cortex of the rat: an experimental light and electron microscopic study with biocytin. J. Comp. Neurol., 342: 206-220.
Raymond, J.L., Lisberger, S.G. and Mauk, M.D. (1996) The cerebellum: a neuronal learning machine? Science, 272: 1126-1131.
Regan, L.J. (1991) Voltage-dependent calcium currents in Purkinje cells from rat cerebellar vermis. J. Neurosci., 11: 2259-2269.
Rieke, F., Warland, D., de Ruyter van Steveninck, R.R. and Bialek, W. (1997) Spikes. Exploring the Neural Code. MIT Press, Cambridge, MA.
Schwarz, C. and Thier, P. (1999) Binding of signals relevant for action: towards a hypothesis of the functional role of the pontine nuclei. Trends Neurosci., 22: 443-451.
Shambes, G.M., Gibson, J.M. and Welker, W. (1978) Fractured somatotopy in granule cell tactile areas of rat cerebellar hemispheres revealed by micromapping. Brain Behav. Evol., 15: 94-140.
Steuber, V. and Willshaw, D.J. (1999) Adaptive leaky integrator models of cerebellar Purkinje cells can learn the clustering of temporal patterns. Neurocomputing, 26: 271-276.
Stuart, G. and Häusser, M. (1994) Initiation and spread of sodium action potentials in cerebellar Purkinje cells. Neuron, 13: 703-712.
Usowicz, M.M., Sugimori, M., Cherksey, B. and Llinás, R.R. (1992) Characterization of P-type calcium channels in cerebellar Purkinje cells. Abstr. Soc. Neurosci., 18: 974.
Van Kan, P.L.E., Gibson, A.R. and Houk, J.C. (1993) Movement-related inputs to intermediate cerebellum of the monkey. J. Neurophysiol., 69: 74-94.
Vassbø, K., Nicotra, G., Wiberg, M. and Bjaalie, J.G. (1999) Monkey somatosensory cerebro-cerebellar pathways: uneven densities of corticopontine neurons in different body representations of areas 3b, 1, and 2. J. Comp. Neurol., 406: 109-128.
Volny-Luraghi, A., De Schutter, E. and Vos, B.P. (1999) Responses of cerebellar Golgi cells to tactile stimuli of different sizes. Abstr. Soc. Neurosci., 25: 914.
Voogd, J. and Glickstein, M. (1998) The anatomy of the cerebellum. Trends Neurosci., 21: 370-375.
Vos, B.P., Maex, R., Volny-Luraghi, A. and De Schutter, E. (1999a) Parallel fibers synchronize spontaneous activity in cerebellar Golgi cells. J. Neurosci., 19 (RC6): 1-5.
Vos, B.P., Volny-Luraghi, A. and De Schutter, E. (1999b) Cerebellar Golgi cells in the rat: receptive fields and timing of responses to facial stimulation. Eur. J. Neurosci., 11: 2621-2634.
Vos, B.P., Taeymans, S., Wijnants, M. and De Schutter, E. (1999c) Miniature carrier with six independently moveable electrodes for recording of multiple single-units in the cerebellar cortex of awake rats. J. Neurosci. Methods, 94: 19-26.
Vos, B.P., Volny-Luraghi, A., Maex, R. and De Schutter, E. (2000) Precise spike timing of tactile-evoked cerebellar Golgi cell responses: a reflection of combined mossy fiber and parallel fiber activation? Prog. Brain Res., 124: 95-105.
Vranesic, I., Iijima, T., Ichikawa, M., Matsumoto, G. and Knöpfel, T. (1994) Signal transmission in the parallel fiber Purkinje cell system visualized by high-resolution imaging. Proc. Natl. Acad. Sci. USA, 91: 13014-13017.
Welker, W. (1987) Spatial organization of somatosensory projections to granule cell cerebellar cortex: functional and connectional implications of fractured somatotopy (summary of Wisconsin studies). In: J.S. King (Ed.), New Concepts in Cerebellar Neurobiology. Alan R. Liss, New York, pp. 239-280.
M.A.L. Nicolelis (Ed.)
Progress in Brain Research, Vol. 130
© 2001 Elsevier Science B.V. All rights reserved
CHAPTER 19
The cerebellum as a neuronal prosthesis machine
John P. Welsh 1,*, Cornelius Schwarz 2 and Yoni Garbourg 3
1 Neurological Sciences Institute, Oregon Health Sciences University, 505 NW 185th Avenue, Beaverton, OR 97006, USA
2 Eberhard-Karls-Universität Tübingen, Department for Cognitive Neurology, Auf der Morgenstelle 15, D-72076 Tübingen, Germany
3 Department of Physiology and Neuroscience, New York University School of Medicine, 550 First Avenue, New York, NY 10016, USA
* Corresponding author: John P. Welsh, Neurological Sciences Institute, Oregon Health Sciences University, 505 NW 185th Avenue, Beaverton, OR 97006, USA. Tel.: +1-503-418-2645; Fax: +1-503-418-2501; E-mail: [email protected]
Introduction
The purpose of this chapter is to present the idea that the architecture of the cerebellar cortex represents a substrate that can be exploited for controlling skilled movements by a multielectrode device. A hypothesis is being proposed that a 'neuronal prosthesis' in cerebellar cortex can correct pathological movement resulting from a central motor disease in other parts of the brain. Although the idea of a neuronal prosthesis to supplement function or to replace lost function is not new, it has not been approached, as far as we know, from the viewpoint of cerebellar neurophysiology. The concept of a cerebellar neuronal prosthesis derives from our work on multielectrode neurophysiology in behaving animals, which has elucidated highly defined activity patterns within cerebellar cortex that are functionally related to the performance of skilled movements.
The idea of a neuronal prosthesis exists conceptually in two forms. A 'read-out' form envisions that brain signals recorded from an ensemble of neurons by a multielectrode device can be transformed into signals for controlling peripheral hardware or stimulating one or more muscles. A 'read-in' form envisions that a peripheral event can be transformed into a neuronal ensemble signal and then injected into the brain through a multielectrode device in order to reproduce actual brain function. The requirements for realizing the two types of brain prostheses are very different. The 'read-out' form requires development of signal transform algorithms that will be unique for both the particular brain area whose signals are to be utilized and the peripheral device that is to be controlled. To date, there has been less emphasis on the specific neuronal types to be recorded than on the ability to transform whatever brain signal is measured into a calibrated action of a peripheral device. The 'read-in' form of prosthesis presumes that naturalistic patterns of brain activity can be induced by multisite stimulation to produce natural function. Here, substantial emphasis is placed both on the neuronal types to be stimulated and on the spatio-temporal patterns of stimulation that represent a peripheral event within the brain. Attempts have been made to actualize the 'read-in' prosthesis for inducing activities in sensory cortices and nuclei in order to provide sensation in peripherally deafferented patients (Schmidt et al., 1996; Otto et al., 1998), while the 'read-out' form of brain prosthesis is largely envisioned as a strategy to provide motor capability to spinally damaged patients (Chapin et al., 1999).
A cerebellar neuronal prosthesis would take the form of a 'read-in' device for the motor system. Such a device would utilize the cerebellum as a piece of hardware already connected to the peripheral motor system, a 'neuronal machine' (Eccles et al., 1967), in order to compensate for dysfunction in higher motor centers. It is generally understood that the cerebellum does not initiate volitional movements, but rather modulates and supplements descending motor commands in order to increase the accuracy and speed of movements initiated by forebrain substrates (Holmes, 1939). A cerebellar neuronal prosthesis could capitalize on this natural role of the cerebellum and induce activities to compensate for dysfunctional commands being transmitted to the spinal cord from diseased motor centers in the forebrain. One might conceive of a situation in which a stroke in motor cortex or basal ganglia alters the strength and accuracy of a volitional limb movement and a cerebellar prosthesis compensates for the performance problem in real time, boosting the movement's velocity or redirecting its trajectory by inducing a pattern of cerebellar activity that supplements or corrects for the aberrant pyramidal command.
It is evident that the above idea represents a futuristic view of how multielectrode recording and stimulation of the cerebellum might be integrated into a neuro-prosthetic application. Yet we may ask: how far can the idea of a cerebellar neuronal prosthesis be taken, and what experimental evidence might already support such an idea? At first approximation, two requirements have to be satisfied. First, spatio-temporal patterns of cerebellar activity causally related to particular muscle activations or specific movement trajectories have to be elucidated. Second, spatially specific patterns of cerebellar activation, induced by a multielectrode device, that will reliably and predictably alter the characteristics of movements as they are being performed have to be established. Below we describe results that we have obtained using multielectrode recording and stimulation of cerebellar cortex in order to evaluate the feasibility of the cerebellar neuronal prosthesis envisioned above.
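To make the notion of a 'read-out' signal transform concrete, the following is a minimal sketch of the simplest possible case: a one-dimensional linear calibration from a summed ensemble firing rate to a device position. It is an illustration under stated assumptions and not a method taken from the studies cited above; the calibration values and the names `fit_linear_readout` and `decode` are hypothetical.

```python
def fit_linear_readout(rates, targets):
    """Ordinary least squares fit of target = gain * rate + offset."""
    n = len(rates)
    mean_r = sum(rates) / n
    mean_t = sum(targets) / n
    cov = sum((r - mean_r) * (t - mean_t) for r, t in zip(rates, targets))
    var = sum((r - mean_r) ** 2 for r in rates)
    gain = cov / var
    offset = mean_t - gain * mean_r
    return gain, offset

# Hypothetical calibration session: ensemble rate (Hz) versus device position (arbitrary units).
calib_rates = [10.0, 20.0, 30.0, 40.0]
calib_positions = [1.0, 2.0, 3.0, 4.0]

gain, offset = fit_linear_readout(calib_rates, calib_positions)

def decode(rate_hz):
    """Transform a newly measured ensemble rate into a calibrated device command."""
    return gain * rate_hz + offset

print(decode(25.0))
```

A practical transform would be multidimensional and specific to both the recorded brain area and the controlled device, as noted above; the sketch only shows the calibrate-then-decode structure.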
Cerebellar cortex: a spatio-temporal activity generator modulating movement
Our current work on cerebellar cortex has examined both major afferent systems of the cerebellum, the mossy/parallel fiber system and the climbing fiber system, within the context of skilled movements using multiple Purkinje cell recording. As we described in a previous chapter (Welsh and Schwarz, 1999), cerebellar Purkinje cells are a unique cell type for multielectrode recording for a variety of reasons. Unlike probably all other neurons of the central nervous system, Purkinje cells show two different classes of action potentials that are triggered by, and therefore indicate activity in, two different afferent systems (Fig. 1A). The single climbing fiber that each Purkinje cell receives (Fig. 1B) triggers a massive postsynaptic potential that is distributed throughout the dendritic tree which, in turn, elicits a barrage of dendritic spikes. Recorded extracellularly from the molecular layer and filtered appropriately, this set of events appears as a multiphasic spike burst that is typically called a 'complex spike' (Eccles et al., 1966b; Thach, 1967). The complex spike differs dramatically from the isolated axon hillock response to a summation of graded postsynaptic potentials, typically recorded as a triphasic event from the outside of a spiking neuron. This triphasic event is present in Purkinje cells also, but here it is called a 'simple spike' (Fig. 1A) and represents the axonal response of the neuron to the summed input of probably hundreds of parallel fibers over its dendrites (Fig. 1B; Eccles et al., 1966a). Because the Purkinje cell is the sole output neuron of the cerebellar cortex, its spike activity also provides a measure of what the cerebellar cortex is transmitting to its projection target, the deep cerebellar nuclei (Fig. 1B). In a multielectrode experiment, recordings obtained simultaneously from many Purkinje cells (Fig. 1C) and processed to differentiate simple from complex spikes (Fig. 1A) can provide real-time measures of activity within two afferent systems, their functional integration at the cortical level, and a measure of cerebellar cortex output. It has been understood for some time that the two major afferent systems of the cerebellum are dramatically different from a functional point of view.
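The step of differentiating simple from complex spikes can be sketched in a few lines. The fragment below is not the procedure used for Fig. 1A; it is a deliberately simplified illustration that assumes the recording has already been reduced to threshold-crossing times and that a complex spike shows up as a tight burst of several crossings. The 3 ms grouping gap, the three-spikelet criterion and the function names are hypothetical; real spike sorting relies on waveform shape.

```python
def group_events(crossings_ms, max_gap_ms=3.0):
    """Group threshold crossings separated by at most max_gap_ms into single events."""
    events = []
    for t in sorted(crossings_ms):
        if events and t - events[-1][-1] <= max_gap_ms:
            events[-1].append(t)   # continue the current burst
        else:
            events.append([t])     # start a new event
    return events

def classify(events, min_spikelets=3):
    """Label each event by its number of spikelets: bursts are 'complex', the rest 'simple'."""
    return [("complex" if len(ev) >= min_spikelets else "simple", ev[0]) for ev in events]

# Hypothetical crossing times (ms) from one Purkinje cell electrode.
crossings = [5.0, 20.0, 50.0, 52.0, 54.0, 56.0, 90.0]
labels = classify(group_events(crossings))
print(labels)
```

Separating the two spike classes in this way yields two event trains per electrode, one reflecting climbing fiber input and one reflecting mossy/parallel fiber input.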
The inferior olive, the source of the climbing fibers and elicitor of complex spikes in Purkinje cells, is an electrotonically coupled nucleus whose neurons act as pacemakers when they fire rhythmically (Llinás et al., 1974; Sotelo et al., 1974; Llinás and Yarom, 1981a,b, 1986). These characteristics induce synchronous and sometimes rhythmic firing of Purkinje cells due to the monosynaptic relation between the olivary neurons and Purkinje cells (Bell and Kawasaki, 1972; Llinás and Sasaki, 1989). Yet the firing rate of inferior olivary neurons is quite low, among the lowest in the nervous system, and reaches an absolute maximum of 10-15 Hz within episodic bouts of spike trains. In contrast, the mossy fiber system is fast (100-200 Hz), originates from many brainstem and spinal nuclei, and affects Purkinje cells only via a disynaptic circuit involving the granule cells and their parallel fiber axons. The climbing fibers show a one-to-one relationship to individual Purkinje cells such that a complex spike is triggered by one, and only one, climbing fiber. In contrast, the mossy/parallel fiber system shows a high degree of convergence and divergence, such that each parallel fiber innervates hundreds of Purkinje cells and simple spikes in Purkinje cells are triggered by any number of combinations of around 200,000 parallel fibers (Napper and Harvey, 1988). While synchrony appears to be a modus operandi of the inferior olive and is ensured by electrotonic coupling, the prevalence of synchrony within the parallel fiber system is relatively low. So while the climbing fiber system uses one-to-one projections, low rates of firing, synchrony, and all-or-none excitation, the mossy fiber system employs convergence/divergence, very high rates of firing, asynchrony, and graded excitation to exert its function. It is important to recognize that the function of both systems is highly distributed in a way that is determined by their unique anatomies but is expressed through a final common substrate, the Purkinje cell. Climbing fiber function is distributed due to the electrotonic coupling in the inferior olive that leads to spatially distributed but synchronous activation of Purkinje cells.
The function of the mossy fiber system is distributed due to the multiple irmervation of granule ceils by mossy fibers, multiple innervation of Purkinje Cells by parallel fibers, and the distributed excitation from multiple parallel fibers required to trigger a simple spil~e in a Purkinje celt. Lastly, the different afferents elicit different output within Purkinje cells, with complex spikes triggering a 500 Hz burst of 3-5 action potentials running down the Purkinje cell's axon (Ito and Simpson, 1971) while each simple spike portends a single axonal spike at the tern-final ending. The differential density of output spikes triggered by the two afferent systems allows for the possibility that cerebellar nuclear cells may process signals from the climbing fiber
and mossy fiber/parallel fiber systems differently (Llinás and Mühlenthaler, 1988). So, the questions from a prosthesis-development point of view are: how does the distributed activity within these temporally different systems relate to the organization of movement, and how precisely can these activities be mimicked in their common output substrate to modulate movement with a multielectrode device? A previous paper presented the idea that the two afferent systems of the cerebellum may play independent but complementary roles in motor control (Welsh and Llinás, 1997). To summarize, it was hypothesized that cerebellar control of movement is dynamic, with the climbing fiber system providing episodic bouts of feedforward control that reduce the need for continuous feedback regulation of a movement. On the basis of the finding that synchronous complex spike activity occurred within time-varying patches of cerebellar cortex during the execution of skilled tongue movements (Fig. 1D: Welsh et al., 1995), it was hypothesized that the climbing fiber system established motor synergies during movement. Those data agreed well with the growing appreciation of a patchy division of muscle representations within cerebellar compartments (Cicirata et al., 1992). Such patches of synchronous olivocerebellar activity were observed during specific epochs of skilled tongue and arm movements and changed rapidly, even in the absence of sensory feedback, indicating a feedforward influence of olivocerebellar synchrony on movement. The implication of the experiments was that single muscles might be represented in discrete, local groups of olivary neurons in the brainstem and that the rapid repatterning of electrotonic coupling within the inferior olive during movement could be a substrate for dynamically changing the combination of muscles that synchronously contract at any given moment.
Behavioral support for the hypothesis was obtained when selective removal of the climbing fibers altered the timing of the skilled tongue movements (Fig. 2: Welsh, 1998). It has long been known that proprioceptive information carried to the cerebellum by mossy fibers regarding limb position and limb dynamics plays an important role in motor control, likely for the feedback regulation of movement (Grant, 1962; Arshavsky et al., 1972). Proprioceptive information
has also been demonstrated in the activity of single olivary neurons (Gellman et al., 1983). Thus, a synthesis of cerebellar operations is that precise patterning of olivocerebellar synchrony during movement could allow a desired movement to be closely approximated under 'feedforward' control, by a priori specification and activation of muscle synergies, against a background of proprioceptive activity and cerebro-cerebellar influences carried by both afferent systems that could allow for a more continuous fine-tuning of the movement to achieve a final targeted position. An important question concerning our multiple Purkinje cell data is whether the patterns of complex spike synchrony play a role in the control or generation of movement. If so, we would expect that the synchronous patterns are specific for the performed behavior; particular patterns of Purkinje cell synchrony should be assignable to specific motor events and separable from non-motor events. To begin to test this issue, we used discriminant analysis (Tabachnick and Fidell, 1996) to attempt to predict whether a rat extended its tongue to a target from the particular set of patterns of Purkinje cell complex spike synchrony. The specific analytical question to be addressed was: can we discriminate, using the pattern of complex spike synchrony at any given moment, whether a rat is performing or is about to perform a tongue movement from whether it had recently heard a tone-conditioned stimulus or had recently received a water reinforcer? Purkinje cell
groups showing synchronous complex spike firing were derived using the cluster analysis that we previously described in detail (Welsh and Schwarz, 1999). As preprocessing for discriminant analysis, cluster analysis was applied to each successive 50 ms epoch for a 2.5 s period after the onset of a tone-conditioned stimulus to derive groups of Purkinje cells that fired complex spikes synchronously at various times during the behavioral trials (Fig. 3). These groups were then used as predictor variables for discriminant analysis using the moments in time immediately after tone onset, after water delivery, or during tongue protrusion as grouping variables. The rats were previously trained to protrude their tongue in response to the tone-conditioned stimulus to receive a water reinforcer. In the two examples presented in Fig. 4, the complex spike activity of 26 Purkinje cells was recorded simultaneously during more than 300 conditioning trials yielding hundreds of tongue movements. The cluster analysis detected 10 and 14 different groups of synchronously firing Purkinje cells through the 2.5 s period in the two examples. Discriminant analysis was used to find a set of classification equations based on patterns of complex spike synchrony for classifying the moments in time surrounding tongue movements, tone onsets, and water delivery with NCSS 2000 (Kaysville, UT) statistical software. A slightly different approach toward discriminant analysis was used than has been employed previously for multineuron data. While previous applications of discriminant analysis to
Fig. 1. (A) Purkinje cell electroresponsiveness. Examples of complex and simple spikes from a single Purkinje cell obtained at various depths below the pial surface are presented for a single electrode track. Complex spikes can be easily distinguished from simple spikes based on their different waveforms. Taken from Welsh and Schwarz (1999). (B) Schematic of the olivocerebellar system and parallel fibers. The olivocerebellar system consists of Purkinje cells, cerebellar nuclear neurons, and inferior olivary neurons. Olivary neurons are electrotonically coupled and their axons project to Purkinje cells as climbing fibers (red) where they trigger complex spikes. Purkinje cells project to the cerebellar nuclei. Climbing fiber activity fires Purkinje cells synchronously, generating inhibitory postsynaptic potentials in the cerebellar nuclei which follow the early excitation produced by the collaterals. Feedback produced by cerebellar nuclei inhibition to the inferior olive projects to sites of electrotonic coupling and is viewed as a pattern generator that regulates the degree of coupling between olivary neurons and the spatial pattern of synchronous cerebellar activation. Cerebellar nuclear neurons not projecting to the inferior olive are excitatory and project to motor nuclei in the thalamus and brainstem. Parallel fibers are indicated in green to point out the second major afferent of Purkinje cells. Parallel fibers working in combination trigger simple spikes in Purkinje cells. Modified from Llinás and Welsh (1993). (C) Location where multineuron recordings of Purkinje cells were obtained using 40 microelectrodes during skilled movement. Each dot represents the location where a single microelectrode was inserted into the Crus IIa folium of the cerebellar cortex. (D) Spatial distribution of synchronous complex spike firing during conditioned tongue protrusion alone (bottom) or during a complex movement synergy involving the forepaw and tongue (top).
The cartoons are taken from video images obtained during the movement at 50 ms intervals. The dot figures show spatial plots of synchronously firing Purkinje cells within the recording array in temporal registration with the movement trajectory.
[Fig. 2 graphic. Panel labels: climbing fibers intact; climbing fibers absent; licks 2-5. x-axis: time after first lick (ms).]
Fig. 2. (A) High-speed documentation of skilled tongue protrusion in a rat. Video images are presented every 10 ms after the jaw began to open in response to a tone-conditioned stimulus that signaled the availability of a drop of water at a target 5 mm away from the mouth. (B) Selective removal of the climbing fibers throughout the cerebellum degrades the temporal precision of repetitive licking. Response density plots show the distribution of successive licks in a train of licks that follow the skilled movement shown in A. Licking of normal rats is repetitive and concentrated within narrow time windows, reflecting the highly rhythmic and stereotypic nature of the movement sequence. In the absence of the climbing fibers, the temporal precision of the repetitive licking is severely degraded without altering the modal frequency of the rhythm. Taken from Welsh (1998).
multineuron data have used single neurons as predictor variables (Chapin, 1999), we used the Purkinje cell groupings derived from cluster analysis as predictors. An initial series of analyses indicated that discrimination based solely on the occurrence of the synchrony patterns was weak. Thus, additional predictor variables were derived from the synchrony group data in order to provide more information to the analysis (Fig. 4A). Here, 'combination predic-
tors', consisting of all possible pairs of synchrony patterns, were added to the synchrony group predictors to increase the total number of predictor variables. For example, in one case (Fig. 4B) 10 different synchrony groups were derived from the cluster analysis; the 10 groups provided 45 different paired combinations that allowed for a total of 55 predictor variables. For each time of interest during the neurophysiology experiment (50 ms after tone
[Fig. 3 graphic. Timeline of tone, water, and licks over 0-2500 ms, divided into 50 ms epochs; synchrony plots for the epochs 1-50, 51-100, 101-150, 151-200, 201-250, and 251-300 ms; orientation label: medio-lateral.]
Fig. 3. Schematic of the method for deriving synchrony groups as predictor variables for discriminant analysis. The 2.5 s after onset of a tone-conditioned stimulus was divided into 50 ms epochs. Spatial patterns of Purkinje cells that fired complex spikes synchronously during each epoch were determined using the cluster analysis described by Welsh and Schwarz (1999). For illustration purposes, typical patterns of complex spike synchrony are plotted for the first 300 ms from tone onset. The dots within the synchrony plots represent locations of Purkinje cells within the Crus IIa folium (shown in Fig. 1C). Note that not all time epochs yield the same number of groups.
onset, 25 ms after water delivery, and 80 ms before the tongue contacted the target) a value of 1 or -1 was assigned to each of the synchrony groups if they appeared or did not appear during that particular time of interest. Then, the combination predictors were derived as the product of the values of each paired combination of synchrony groups. For example, if two synchrony groups were active (both value = 1) at any time during a time of interest, their combination predictor equaled 1 (1 x 1). If both were inactive (both value = -1) during a time of interest, their combination predictor also equaled 1 (-1 x -1). If only one of the two was active, then their combination predictor equaled -1 (1 x -1). With this strategy, not only could synchrony groups, per se, be used as predictors, but the presence as well as the absence of combinations of synchrony groups could be added as predictor variables in the analysis. It is important to note that the times of interest had a duration of 25-80 ms, therefore providing ample time for more than one synchrony group to be active, since synchrony was defined as simultaneous activity within the same millisecond.
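The predictor coding described above can be illustrated with a minimal sketch. The function name `build_predictors` and the example set of active groups are hypothetical and chosen only for illustration; the sketch assumes that the synchrony groups active during a time of interest are already known from the cluster analysis.

```python
from itertools import combinations

def build_predictors(active_groups, n_groups):
    """Code one time of interest as +1/-1 group values plus pairwise products."""
    # +1 if the synchrony group appeared during the time of interest, -1 otherwise
    single = [1 if g in active_groups else -1 for g in range(n_groups)]
    # One combination predictor per unordered pair: the product of the two values
    paired = [single[i] * single[j] for i, j in combinations(range(n_groups), 2)]
    return single + paired

# Hypothetical time of interest in which groups 0 and 3 (of 10) were active
predictors = build_predictors({0, 3}, n_groups=10)
print(len(predictors))  # 10 single + 45 paired predictors
```

With 10 synchrony groups this gives the 55 predictor variables mentioned in the text; the same coding applied to 14 groups would give 14 + 91 = 105.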
The discriminant analysis was programmed to utilize stepwise variable selection to find the 18 most beneficial predictor variables. From these 18 variables, 3 classification equations were generated corresponding to the 3 grouping variables (tone, water, tongue). The classification equations are like regression equations in that they are weighted linear summations of the values of the predictor variables that can optimally discriminate between the grouping variables. There is a different set of coefficients for each classification equation. On each instance of a grouping variable (tone, tongue, water), a classification score was generated for each classification equation and the event was classified to the grouping variable for which it had the highest score. Percentage correct classification was calculated at the end of the procedure to provide a quantitative measure of how accurately tongue movement could be discriminated from water delivery and tone onset on the basis of the complex spike synchrony patterns. To provide a graphical assessment of classification, two canonical discriminant functions were created that allowed calculation of canonical variate scores.
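The scoring rule described above (one weighted linear sum per grouping variable, with the event assigned to the highest score) can be sketched as follows. This is only an illustration of the classification step: the coefficients, the three-predictor events, and the function names are hypothetical, and neither the stepwise selection nor the fitting of coefficients performed by the statistical software is reproduced.

```python
def classify(predictors, equations):
    """Return the label whose classification equation gives the highest score."""
    scores = {}
    for label, (constant, weights) in equations.items():
        # Weighted linear summation of the predictor values
        scores[label] = constant + sum(w * x for w, x in zip(weights, predictors))
    return max(scores, key=scores.get)

def percent_correct(events, equations):
    """events: list of (true_label, predictor_values). Returns {label: percent}."""
    totals, hits = {}, {}
    for true_label, predictors in events:
        totals[true_label] = totals.get(true_label, 0) + 1
        if classify(predictors, equations) == true_label:
            hits[true_label] = hits.get(true_label, 0) + 1
    return {label: 100.0 * hits.get(label, 0) / totals[label] for label in totals}

# Hypothetical coefficients for 3 predictors (the analysis in the text used 18)
equations = {
    "tone":   (0.0, [1.0, -1.0, 0.0]),
    "water":  (0.0, [-1.0, 1.0, 0.0]),
    "tongue": (0.5, [0.0, 0.0, 1.0]),
}
events = [
    ("tone",   [1, -1, -1]),
    ("water",  [-1, 1, -1]),
    ("tongue", [-1, -1, 1]),
    ("tongue", [1, -1, -1]),  # resembles the tone pattern, so it is misclassified
]
print(percent_correct(events, equations))
```

In this toy example the second tongue event scores highest on the tone equation, which illustrates how a motor event can be misassigned to a sensory category when the synchrony patterns overlap.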
[Fig. 4 graphic. (A) Synchrony groups as single predictors (present = 1, absent = -1) and as combination predictors (both present, both absent, one present). (B) Case 1: 26 Purkinje cells, 10 synchrony groups. (C) Case 2: 26 Purkinje cells, 14 synchrony groups. Legend: sensory events, tone onset (0-50 ms) and water onset (0-25 ms); motor event, tongue protrusion (-80-0 ms). Axes: canonical variate scores.]
Fig. 4. Discriminant analysis of complex spike synchrony. (A) Predictor variables used for the analysis consisted of the synchrony groups derived from cluster analysis (left) and paired combinations of synchrony groups (right). For each time of interest in the analysis, a value was assigned to each synchrony group corresponding to whether it showed activity or not (1 and -1, respectively). Combination predictors were derived as the product of all possible paired values. The values were used as independent variables for discriminant analysis using the grouping variables of tone, water, and tongue protrusion as dependent variables. Times of interest for the grouping variables were: (a) tone onset +50 ms; (b) water onset +25 ms; and (c) 80 ms before the tongue contacted the target. (B,C) Canonical variate plots of the dependent variables of tone onset, water onset, and tongue protrusion for 2 rats, each in which the complex spikes of 26 Purkinje cells were recorded simultaneously. Canonical variates were derived from the 18 independent variables (synchrony groups and combination predictors) that best predicted the times of interest. Classification functions quantified the ability of the synchrony groups to discriminate among the behavioral grouping variables. Percent correct classification is given for each grouping variable and the diameter of the symbols reflects a sum of cases that overlapped in the plot.
The canonical variates defined an optimal 2-dimensional space into which instances of the grouping variables could be separated on the basis of the population synchrony data (Tabachnick and Fidell, 1996; Chapin, 1999). The results of two discriminant analyses of multielectrode cerebellar data are presented in Fig. 4B and C. Here, the canonical variate scores for individual instances of sensory or motor events were plotted against each other. It can be seen that the complex spike synchrony data for both cases provided fair separation of the two sensory events from the tongue motor event. Classification functions quantified the ability of the synchrony data to discriminate between the sensory and motor events. In cases 1 and 2, respectively, 88% and 49% of the tongue movements could be correctly identified as tongue movements based on the synchrony data. In case 1 (Fig. 4B), detection of tongue movements from the synchrony data was better than for sensory events, as the correct identification of both water and tone was significantly less than for tongue movement. However, when incorrect classifications of a sensory event occurred, they were more likely to be identified as a tongue movement rather than as one of the other sensory events. This may be related to temporal overlap between the sensory events and movements not recorded with the behavioral technique. An opposite situation occurred in case 2 (Fig. 4C), where the correct classification of the sensory events was slightly better than for the tongue movement. Nevertheless, 49% of the tongue movements were still correctly identified as such on the basis of the synchrony data alone. The discriminant analysis provided the first quantitative insight toward clarifying two important issues.
First, the ability to discriminate a specific motor act from various sensory events solely on the basis of the pattern of complex spike synchrony in a group of simultaneously recorded Purkinje cells increases confidence that the synchrony patterns are not chance events but are causally related to the generation of movement. Second, the good but, nevertheless, imperfect ability to classify a relatively stereotypic movement based on complex spike synchrony may be related to an insufficient number of Purkinje cells recorded. That is, 25-30 Purkinje cells may not provide sufficient information in the synchrony of their complex spike activity to perfectly classify these tongue movements. Yet, the analyses provide optimism that a very good specification of a motor behavior can be derived solely from the complex spike activity of many simultaneously recorded Purkinje cells.
Distributed network interactions between climbing fiber and parallel fiber function
Our most recent data obtained with multielectrode recording have revealed an unexpected operation of cerebellar cortex that implicates the cerebellar cortical interneurons in cooperative interactions between the climbing and mossy fiber systems. Such interaction may represent a second state that may supplement the independent operation of the two cerebellar afferents. The idea that the two cerebellar afferents interact within cerebellar cortex is not new, as single neuron techniques have reported a wide variety of influences of complex spikes on simple spike rate: sometimes inhibitory (Bell and Grimm, 1969; Colin et al., 1980; Mano et al., 1986; Sato et al., 1992; Simpson et al., 1996) and other times facilitative (Ebner and Bloedel, 1981a,b,c). Yet, most of these outcomes have been interpreted as resulting from local interactions driven by the biophysical effects of synaptic input on Purkinje cell membranes. It is important to recognize that interactions between the two afferent activities are not required to occur on the membranes of individual Purkinje cells, but rather can occur as the result of engaging the inhibitory interneurons of the cerebellar cortex. Although evidence is scant, studies have mentioned that collaterals of climbing fibers innervate inhibitory neurons of the cerebellar cortex, such as Golgi and basket cells (Lemkey-Johnston and Larramendi, 1968; Schulman and Bloom, 1981). Such connections may allow climbing fibers to modulate the simple spike firing of Purkinje cells that they do not innervate directly. For example, a climbing fiber that innervates one Purkinje cell may modulate another Purkinje cell's simple spike activity by collaterally innervating inhibitory interneurons which, in turn, directly innervate that other Purkinje cell or the granule cells that trigger simple spikes in that Purkinje cell (Fig. 5C). The likelihood
[Fig. 5 graphic. (A) PC firing: SS and CS firing rates for PC 1-8 versus peristimulus time (0-800 ms); inset: electrode locations. (B) Normalized correlated activity versus peristimulus time (ms) for CS-CS pairs (n=21) and CS-SS pairs (n=19). (C) Circuit diagram.]
Fig. 5. (A) Firing rates of Purkinje cells' (PC) simple spikes (SS, green) and complex spikes (CS, orange) simultaneously recorded from a linear array of 8 electrodes lowered 300 µm within Crus IIa. The inset shows the locations of the electrodes. Electrical stimulation of motor cortex (0.3 ms pulse, 100 µA) evoked a short latency CS at 10 ms after stimulation, followed by a pause (50-180 ms), and a 12 Hz oscillation (180-250 ms). Two peaks of oscillation are discernible in most PCs recorded. SS trains were discriminated in PCs 1, 3, and 5. SS rate was enhanced during the initial CS pause (about 50-180 ms) and showed a negative imprint of the CS oscillation at 180-250 ms. (B) Correlated activity calculated by joint peristimulus time analysis was corrected for variations in firing rates and normalized (Aertsen et al., 1989). The histograms show the average correlated activity of CS-SS pairs (red) recorded from different PCs and all CS-CS pairs (orange) at a time delay of -100 to 800 ms. CS-SS pairs are negatively correlated and their interaction is highly dynamic. The negative correlation decreased after the stimulus (0-180 ms) and was enhanced during the CS oscillation. Correlated with these dynamics, the neuronal interaction in CS-CS pairs showed a decrease in positive correlation during the first 180 ms post-stimulus and strong synchronization during the oscillation (180-250 ms). (C) Circuitry of cerebellar cortex to account for population CS-SS interactions. Electrotonically coupled olivary neurons issue climbing fiber collaterals that innervate inhibitory basket cells and Golgi cells that, in turn, innervate PCs and granule cells (green), respectively. This may allow individual climbing fibers to control SS rate in PCs that they do not innervate directly. Abbreviations: BC, basket cell; cf, climbing fiber; CN, cerebellar nuclear neuron; GC, Golgi cell; IO, inferior olive; mf, mossy fiber; PC, Purkinje cell; pf, parallel fiber; SC, stellate cell.
Cellular elements are colored according to the activity that they drive in the histograms shown in A and B.
that such interactions could appear as reliable single neuron events when one microelectrode is used for recording may be related to the prevalence of synchrony in the climbing fiber system that phasically increases the spatial magnitude of intracortical inhibition. We used a multielectrode approach in ketamine anesthetized rats that allowed us to record complex and simple spikes simultaneously from up to 16
Purkinje cells in Crus IIa using the methods that we described previously (Welsh and Schwarz, 1999). In order to increase the probability of observing complex spike-simple spike interactions, we used electrical stimulation of a cerebro-cerebellar pathway originating from the tongue area of the primary motor cortex, which triggers both climbing fiber and mossy fiber activity within Crus IIa. Stimulation of motor cortex with currents as low as 15 µA for 0.3
ms evoked short latency (approximately 10 ms) responses in complex and simple spike trains recorded in Crus IIa. We studied not only short latency responses to cerebral stimulation but also longer latency responses that occurred several hundreds of milliseconds after the stimulation (Fig. 5A). Climbing fiber activity in this paradigm was highly stereotypic, showing a sequence of an initial short latency complex spike, suppression of complex spike activity for 180 ms, and then a 10-15 Hz oscillation that lasted for up to 3 periods. The inverse pattern of response was observed in simultaneously recorded trains of simple spikes, such that simple spike activity was significantly increased during the period of complex spike inhibition and suppressed during the peaks of complex spike oscillation. The cause of such profound simple spike modulation was most probably the climbing fiber system, since the rate modulation within the mossy fiber system far outlasted the duration of the cerebral stimulation and was temporally related to the complex spike oscillation, a behavior that derives from the intrinsic membrane properties and electrotonic coupling of olivary neurons. In order to investigate whether complex-simple spike interactions take place across Purkinje cells and to study their dynamics, we used a cross-correlation analysis based on the normalized joint peri-event time histogram (Aertsen et al., 1989). This analysis provides a measure of correlated activity over time that removes correlation due to variations in firing rate. Thus, the technique isolates correlated activity due to neuronal interaction. Furthermore, the time-varying correlated activity is normalized, which allows quantitative comparison of different pairs of spike trains. With this technique, we could compute the average correlated activity from multiple simultaneously recorded Purkinje cells in order to provide a population measure of neuronal interaction.
We wanted to focus specifically on the interaction among Purkinje cells, so correlations from complex and simple spike trains recorded from the same Purkinje cell were excluded from the analysis. The average normalized correlated firing between the complex and simple spikes of different Purkinje cells showed a negative sign but was not static over time after motor cortex stimulation (red histogram, Fig. 5B). With respect to prestimulus values, the variation in correlation typically consisted of a
reduction in negative correlation during the reduction in complex spike frequency and an increase in negative correlation during the period of oscillatory complex spike firing. The results indicated that complex spikes both enhance and suppress simple spike rate and reconcile the disparate results previously reported. An important point to be taken from these multielectrode experiments is that the effect of complex spikes on simple spike rate varies over time relative to a stimulus event, that is, it is dynamic. Complex spikes have an immediate inhibitory effect on simple spike firing but can have a delayed enhancing effect on simple spike rate after the occurrence of strong input to the inferior olive. In order to determine whether complex spike-simple spike interactions were related to synchrony within the inferior olive, we computed the normalized correlated complex spike firing between many Purkinje cell pairs. The average normalized correlated activity over time after motor cortex stimulation showed a prestimulus level of complex spike synchrony that increased to a maximum during the oscillatory phase (orange histogram, Fig. 5B). Such patterns of complex spike firing and synchrony after motor cortex stimulation are very similar to those found after an awake rat receives a conditioned stimulus that elicits a conditioned response. Such synchrony, together with the evidence of climbing fiber innervation of basket and Golgi cells and innervation of Purkinje cells by axonal collaterals of other Purkinje cells (Fig. 5C), seems well suited for modulating simple spike firing in a population of Purkinje cells.
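The normalization used in this section can be sketched in a few lines. The sketch assumes the usual formulation of the normalized joint peri-event time histogram: the trial-averaged product of the binned counts of two spike trains, minus the product of their trial-averaged counts, divided by the product of their across-trial standard deviations. The function name, the binning, and the toy data are hypothetical, and the authors' actual implementation is not reproduced.

```python
from math import sqrt

def normalized_jpsth(trials_a, trials_b):
    """trials_x: list of trials, each a list of spike counts per time bin.
    Returns a matrix J[u][v] of correlation coefficients between bin u of
    train A and bin v of train B, computed across trials."""
    n_trials = len(trials_a)
    n_bins = len(trials_a[0])

    def mean_and_sd(trials):
        means = [sum(t[b] for t in trials) / n_trials for b in range(n_bins)]
        sds = [sqrt(sum((t[b] - means[b]) ** 2 for t in trials) / n_trials)
               for b in range(n_bins)]
        return means, sds

    mean_a, sd_a = mean_and_sd(trials_a)
    mean_b, sd_b = mean_and_sd(trials_b)
    jpsth = []
    for u in range(n_bins):
        row = []
        for v in range(n_bins):
            # Raw coincidence rate, averaged across trials
            raw = sum(ta[u] * tb[v] for ta, tb in zip(trials_a, trials_b)) / n_trials
            # Subtracting the product of the mean rates removes rate-driven correlation
            cov = raw - mean_a[u] * mean_b[v]
            denom = sd_a[u] * sd_b[v]
            row.append(cov / denom if denom > 0 else 0.0)
        jpsth.append(row)
    return jpsth

# Toy data: 4 trials x 2 bins; the trains never fire together in the same bin
cs_trials = [[1, 0], [0, 1], [1, 0], [0, 1]]
ss_trials = [[0, 1], [1, 0], [0, 1], [1, 0]]
print(normalized_jpsth(cs_trials, ss_trials))
```

The diagonal of the matrix (u equal to v) gives the zero-delay correlation as a function of time after the stimulus. Averaging such matrices across pairs of different Purkinje cells would give a population measure of the kind described above.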
Implications of multielectrode neurophysiology for a cerebellar neuronal prosthesis
The foregoing evidence has strong implications for the possible design and implementation of a cerebellar neuronal prosthesis. Our analysis of multi-Purkinje cell physiology indicates that a first approximation of such a device should try to inject climbing fiber function into an array of Purkinje cells in order to modulate movement. This view is based on two fundamental findings. First, synchronous activation of Purkinje cells during movement, in the way that the climbing fiber system operates, is likely to have a strong influence on the activity of deep cerebellar nuclear neurons and behavior, by virtue of the focused
inhibition and rebound excitation that deep nuclear cells exhibit during synchronized olivocerebellar activity (Llinás and Mühlenthaler, 1988). Second, by engaging the intracortical circuitry, synchronized and oscillatory climbing fiber-like activity injected into cerebellum by a multielectrode device might naturalistically control the dynamics of simple spike firing in groups of Purkinje cells. This is a time-varying influence that can suppress or amplify mossy fiber throughput relative to a moment of high synchrony. Lastly, there is good evidence to suggest that the climbing and mossy fiber systems, respectively, provide feedforward and feedback control of the same motor act. Thus, it is possible that we may be able to control these aspects independently by selectively stimulating complex spike or simple spike patterns of activity in identified populations of Purkinje cells.
Multisite stimulation of cerebellar cortex: toward a cerebellar neuronal prosthesis
The primary question to be addressed was: can the trajectory of a skilled movement be redirected predictably and reliably by spatially and temporally specific stimulation of cerebellar cortex with a multielectrode device? To answer this question, we employed multisite stimulation of cerebellar cortex using the microelectrode array technology that we have described (Welsh and Schwarz, 1999). The skilled behavior that we examined was conditioned protrusion of the tongue 5 mm out of the mouth toward a target in space. Our previous results with multineuron recording of Purkinje cells' complex spike activity clearly revealed time-varying patterns of olivocerebellar synchrony correlated with the ongoing performance of this skilled movement (Welsh et al., 1995). The results of multivariate analyses of the data were consistent with a causal role of these time-varying patterns in the movement's performance, a conclusion reinforced by the finding that selective removal of the climbing fiber system degraded the precision of the movement (Welsh, 1998). A linear array of 8 finely etched tungsten microelectrodes (1-3 MΩ) spaced 250 µm apart was placed into the right Crus II and paramedian lobe under electrophysiological guidance. The array was oriented medio-laterally and the electrodes were lowered in order to obtain as many Purkinje cell recordings as possible. After implantation, two recording sessions were carried out in order to determine the temporal relations between the spikes of the recorded neurons and conditioned tongue protrusion. Recordings obtained from the middle of the array yielded spike-rate modulations that correlated best with various aspects of the motor performance. The positions of 6 recording sites (electrodes 1-6 in the array) are shown in Fig. 6. To approximate climbing fiber-like activation of the cerebellum during movement, we stimulated 6 cerebellar sites alone or synchronously in different combinations at various times during conditioned tongue protrusion. A 3 ms train of biphasic current pulses (15-50 µA, 500 µs width, cathodal first) was given via each of the 6 electrode sites shown in Fig. 6 at different times during the movement. As shown in Fig. 2A, conditioned protrusion of the tongue is extremely rapid and takes 25-50 ms to travel from inside the mouth to a target 5 mm away from the mouth. Thus, an apparatus had to be assembled that not only could deliver cerebellar stimulation at different times during the 50 ms travel time of the tongue but could provide sufficient temporal resolution to visualize the effects of the stimulation on the movement. We used a Kodak EktaPro digital imaging system that captured 5000 video images per second. The system was operated in record-on-command, remote-triggered mode so that it would capture 200 ms before and 200 ms after the cerebellar stimulation (400 video images). Images of tongue trajectory were obtained from below the animal. Cerebellar stimulation was delivered relative to the first time the rat opened its mouth in response to the tone-conditioned stimulus. A digital trigger activated by the mandible interrupting an infrared photobeam was used to signal a programmable 8-channel microstimulator (Multichannel Systems; Reutlingen, Germany).
Fig. 6. Locations of 6 recording/stimulation microelectrodes in the right crus II and paramedian lobe. Electrodes are spaced 250 µm apart from one another in the medio-lateral plane and are contained within 300 µm in the anterior-posterior plane. Section thickness is 30 µm. Scale bar, 1 mm.

The stimulator was programmed to deliver current through different combinations of electrodes at specific times during transit of the tongue to the target. Cerebellar stimulation was given at one of three times during the movement: (1) at the time of mouth opening (Fig. 7A); (2) during protrusion of the tongue to the target, typically 5-20 ms after mouth opening and 5-10 ms before the tongue hit the target (Fig. 7B); or (3) at about the time that the tongue hit the target (Fig. 7C). The video data were downloaded from the Kodak processor to a hard drive of a personal computer and analyzed.

Fig. 8 shows 3 trials illustrating a prominent effect of multisite cerebellar stimulation on movement trajectory. In all trials, time 0 represents the time that the mouth opened in response to the conditioned stimulus. The top row shows a control trial in which no stimulation was given. Under control conditions, the tongue took 18 ms to contact the target and continued to press into the target for another 24 ms before it was retracted and moved back into the mouth. On control trials, the trajectory of the tongue was simply straight to the target and straight back into the mouth. The bottom two rows in Fig. 8 show 2 trials in which 3 cerebellar sites were stimulated synchronously as the tongue moved toward the target (sites 4-6 in Fig. 6). Here, a very unusual behavior
was observed. At 30 ms after cerebellar stimulation and about 20 ms after the tongue had contacted the target in the usual manner, the tip of the tongue was conspicuously redirected to the right (arrows in Fig. 8) before it was retracted into the mouth. This unusual sequence of events, which never occurred on control trials, gave the distinct impression that the tongue was being redirected toward a 'phantom target' that did not exist, 3 mm to the right of the actual target toward which the rat was trained to protrude its tongue.

Fig. 7. Stimulation paradigm. Biphasic current pulses to the cerebellum (3 ms, 500 µs pulse width, 50 µA maximum) are given simultaneously with mouth opening (A), during tongue protrusion (B), or when the tongue contacts a target (C). Images of the movement are obtained from below the mouth with a high-speed video camera that captured an image every millisecond. Calibration on the right of the images shows distance in mm. The target contacted by the tongue delivered a 40 µl drop of water as a reinforcer. Trace labels: current, mouth, tongue.

Figs. 9 and 10 show overlays of individual tongue trajectories. The data are plotted as the position of the tip of the tongue every 3 ms after mouth opening and are colored so that individual trajectories can be followed. The squares in the trajectory plots represent the position of the tongue at the time of cerebellar stimulation. Fig. 9 demonstrates that cerebellar stimulation altered the tongue trajectory in highly defined ways that depended upon the position of the tongue at the time of stimulation. Stimulation immediately before maximal protrusion dramatically altered the trajectory and moved the tongue toward the 'phantom target' as described above (Fig. 9B). Cerebellar stimulation at movement onset did not move the tongue to the phantom target, but instead realigned the entire protrusion rightward (Fig. 9C). Cerebellar stimulation at the transition from protrusion to retraction shifted the trajectory of the retraction toward the right (Fig. 9D). The bottom row of Fig. 9 shows the mean trajectory produced by each time of stimulation. In these average plots, the different effects can be seen clearly. It can also be seen that the latency to effect on the movement trajectory was approximately 12 ms, a latency consistent with a relatively direct circuitry linking cerebellar cortex to the hypoglossal motoneurons.

Lastly, the experiments indicated a high degree of spatial specificity for altering tongue trajectory with cerebellar stimulation. It was found that individual stimulation of each of the 6 sites depicted in Fig. 6 in isolation could not modify tongue trajectory
and that synchronous stimulation through combinations of the most medial electrodes (sites 1-3) only weakly and non-significantly affected tongue trajectory. The most robust effects on trajectory were produced with stimulation through electrodes 4-6, all of which were contained within 500 µm of paramedian lobe. When electrode sites 4, 5, and 6 were stimulated synchronously, all of the types of trajectory deviation described above could be produced (Fig. 10A). To determine if the effect could be localized to a subset of these three sites, the experiment was repeated by stimulating through different combinations of electrodes. When stimulation was given through electrodes 4 and 5 only, no effect on tongue trajectory was produced (Fig. 10B). However, when electrodes 4 and 6 were stimulated synchronously, large changes in trajectory were produced (Fig. 10C). Stimulation through electrodes 5 and 6 (Fig. 10D) or through 6 alone (Fig. 10E) did not alter the trajectory. The results clearly demonstrated spatial specificity of multisite stimulation for altering movement trajectory.

The experiments indicated that spatially distinct patterns of cerebellar stimulation by a multielectrode device can predictably and reliably alter the trajectory of a skilled movement. Whereas single electrode stimulation of a cerebellar folium did not alter movement trajectory, simultaneous activation of non-contiguous sites effectively altered movement trajectory. There was specificity within distributed circuits such that not all spatial patterns of cerebellar activation, even within 500 µm, produced a similar outcome. Importantly, the effectiveness of cerebellar stimulation for modifying movement trajectory varied in time with a time constant in the millisecond domain. This implies that the ability of cerebellar stimulation to modulate movement is context sensitive such that there are discrete time windows within which cerebellar stimulation can optimally redirect movement.

Fig. 9. Different types of trajectory change produced by different latencies of multisite cerebellar stimulation. Each dot represents the position of the tip of the tongue as viewed from underneath the jaw. Time resolution is 3 ms. The squares indicate the position of the tongue at the time when the cerebellum was stimulated. In the top row, individual trials are color coded. The bottom row shows mean trajectories obtained by averaging the single trial data shown above. (A) Control trajectories. The tongue made a straight trajectory toward and from the target, often off center to the left. (B) Cerebellar stimulation during tongue protrusion. Note that in each instance, the tongue was redirected to the right, often as a second protrusion to a 'phantom target'. In one trial (yellow trajectory), the protrusion began substantially off center and cerebellar stimulation redirected the movement precisely into the target. (C) Cerebellar stimulation at movement onset. Stimulation at movement onset realigned the trajectory to the right. (D) Cerebellar stimulation at protrusion termination. Stimulation at this time redirected the trajectory of the retraction to the right. In all cases, stimulation was given through electrode positions 4-6 as shown in Fig. 6.

Fig. 10. Spatial specificity of multisite cerebellar stimulation on tongue trajectory. (A) Simultaneous stimulation at sites 4-6 produced both redirection to the phantom target (green) and realignment (red, blue, black) changes in trajectory. (B) Simultaneous stimulation at sites 4 and 5 did not change trajectory. (C) Simultaneous stimulation of sites 4 and 6 redirected tongue trajectory to the phantom target. (D) Simultaneous stimulation at sites 5 and 6 did not change trajectory. (E) Stimulation at site 6 alone did not change trajectory. Trajectory plots are presented as in Fig. 9. The data show that the specific combination of cerebellar stimulation at sites 4 and 6 produced the trajectory changes and demonstrate functional specificity of microregions of cerebellar cortex for prosthetic stimulation.
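The mean trajectories and the latency to effect described for Fig. 9 can be illustrated with a short sketch. It assumes that each trial is a list of tongue-tip positions sampled every 3 ms and that "effect" means a lateral shift beyond some threshold relative to the control mean. The 0.5 mm threshold, the function names, and the positions below are made up for illustration and are not data or methods from the experiments.

```python
from statistics import mean

SAMPLE_INTERVAL_MS = 3  # one tongue-tip position every 3 ms, as in the plots


def mean_trajectory(trials):
    """Average (x, y) positions across trials, sample by sample.

    Each trial is a list of (x_mm, y_mm) tuples; trials are truncated to the
    length of the shortest one.
    """
    n = min(len(trial) for trial in trials)
    return [
        (mean(trial[i][0] for trial in trials),
         mean(trial[i][1] for trial in trials))
        for i in range(n)
    ]


def deviation_latency_ms(control, stimulated, stim_sample, threshold_mm=0.5):
    """Time from stimulation to the first sample whose lateral (x) position
    differs from control by at least threshold_mm; None if it never does."""
    n = min(len(control), len(stimulated))
    for i in range(stim_sample, n):
        if abs(stimulated[i][0] - control[i][0]) >= threshold_mm:
            return (i - stim_sample) * SAMPLE_INTERVAL_MS
    return None


# Made-up positions (mm): straight control protrusions, and stimulated trials
# that shift to the right three samples after stimulation at sample 2.
control_trials = [
    [(0.0, float(i)) for i in range(8)],
    [(0.2, float(i)) for i in range(8)],
]
stimulated_trials = [
    [(0.1 if i < 5 else 1.1, float(i)) for i in range(8)],
    [(0.1 if i < 5 else 1.3, float(i)) for i in range(8)],
]
control_mean = mean_trajectory(control_trials)
stimulated_mean = mean_trajectory(stimulated_trials)
latency = deviation_latency_ms(control_mean, stimulated_mean, stim_sample=2)
```

With 3 ms sampling, any latency estimated this way is quantized to multiples of 3 ms, so it should be read as approximate.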
Conclusions and future directions

There is much work to do to achieve a 'cerebellar neuronal prosthesis' as envisioned in this chapter. Although the idea may seem futuristic to some, it is buoyed by a number of factors. First, and perhaps foremost, a multielectrode stimulation device interfaced to the cerebellum for modulating movement taps into the natural function of the cerebellum to modulate ongoing voluntary movement. So, just as multineuron stimulation of visual cortex (Schmidt et al., 1996) or the dorsal cochlear nucleus (Otto et al., 1998), in principle, could provide vision and audition to peripherally deafferented patients, multineuron stimulation of cerebellum, in principle, could provide movement control to a class of centrally deefferented patients. Second, cerebellar cortex may be among the best understood neuronal systems in the brain from a multineuron physiological point of view. Thus, the cerebellar cortex may be one of the first systems in which we have the opportunity to try to utilize natural spatio-temporal patterns of activation to induce function. In the future, we should be able to choose among known, natural patterns of cerebellar cortical activity to induce function as opposed to choosing patterns arbitrarily, as would be required when working with a less completely understood neuronal circuit.

Many challenges and questions lie ahead. Can sustained, fine control of a limb be induced in addition to all-or-none changes in trajectory? It is likely that combinations of synchronous activation of olivocerebellar clusters to provide inertial breaks and asynchronous fast activations to maintain or fine-tune position will have to be utilized. What would the minimum number of electrode sites be to induce function, and can current waveforms be optimized to take specific effect on small clusters of Purkinje cells? How can invasiveness be reduced and stability increased? How would sensory feedback be integrated into the system? Many of these issues are not unique to a cerebellar application. Nevertheless, it is anticipated that future experiments in well-defined models of animal behavior can answer these questions, allow the concept of a cerebellar neuronal prosthesis to be continuously evaluated, and, most importantly, provide fundamental information regarding the functional physiology of the cerebellum.

Acknowledgements

This research was supported by grants from the United States National Institute for Neurological Disorders and Stroke (NS31224), the German Research Foundation (DFG-SCHW 577/4), and the German Ministry for Education and Science (BMBF 0311858). Technology development for multi-microstimulation of cerebellum was supported by a Bioengineering Grant from the Whitaker Foundation with a subcontract to Plexon, Inc. We thank Mr. Harvey Wiggins of Plexon for valuable technological support.
References

Aertsen, A.M.H.J., Gerstein, G.L., Habib, M.K. and Palm, G. (1989) Dynamics of neuronal firing correlation: modulation of 'effective connectivity'. J. Neurophysiol., 61: 900-917.
Arshavsky, Y.I., Berkinblit, M.B., Fukson, O.I., Gelfand, I.M. and Orlovsky, G.N. (1972) Recordings of neurones of the dorsal spinocerebellar tract during evoked locomotion. Brain Res., 43: 272-275.
Bell, C.C. and Grimm, R.J. (1969) Discharge properties of Purkinje cells recorded on single and double microelectrodes. J. Neurophysiol., 32: 1044-1055.
Bell, C.C. and Kawasaki, T. (1972) Relation among climbing fiber responses of nearby Purkinje cells. J. Neurophysiol., 35: 155-169.
Bloedel, J.R. and Roberts, W.J. (1971) Action of climbing fibers in cerebellar cortex of the cat. J. Neurophysiol., 34: 17-31.
Chapin, J.K. (1999) Population-level analysis of multi-single neuron recording data: Multivariate statistical methods. In: M.A.L. Nicolelis (Ed.), Methods for Neural Ensemble Recordings. CRC Press, Boca Raton, FL, pp. 193-228.
Chapin, J.K., Moxon, K.A., Markowitz, R.S. and Nicolelis, M.A. (1999) Real-time control of a robot arm using simultaneously recorded neurons in the motor cortex. Nat. Neurosci., 2: 664-670.
Cicirata, F., Angaut, P., Serapide, M.F., Panto, M.R. and Nicotra, G. (1992) Multiple representation in the nucleus lateralis of the cerebellum: An electrophysiological study in the rat. Exp. Brain Res., 89: 352-362.
Colin, F., Manil, J. and Desclin, J.C. (1980) The olivocerebellar system. I. Delayed and slow inhibitory effects: an overlooked salient feature of cerebellar climbing fibers. Brain Res., 187: 3-27.
Eccles, J.C., Llinás, R. and Sasaki, K. (1966a) Parallel fiber stimulation and the responses induced thereby in the Purkinje cells of the cerebellum. Exp. Brain Res., 1: 17-39.
Eccles, J.C., Llinás, R. and Sasaki, K. (1966b) The excitatory synaptic action of climbing fibres on the Purkinje cells of the cerebellum. J. Physiol. (Lond.), 182: 268-296.
Eccles, J.C., Ito, M. and Szentágothai, J. (1967) The Cerebellum as a Neuronal Machine. Springer, New York.
Ebner, T.J. and Bloedel, J.R. (1981a) Temporal patterning in simple spike discharge of Purkinje cells and its relationship to climbing fiber activity. J. Neurophysiol., 45: 933-947.
Ebner, T.J. and Bloedel, J.R. (1981b) Correlation between activity of Purkinje cells and its modification by natural peripheral stimuli. J. Neurophysiol., 45: 948-961.
Ebner, T.J. and Bloedel, J.R. (1981c) Role of climbing fiber afferent input in determining responsiveness of Purkinje cells to mossy fiber inputs. J. Neurophysiol., 45: 962-971.
Gellman, R., Houk, J.C. and Gibson, A.R. (1983) Somatosensory properties of the inferior olive of the cat. J. Comp. Neurol., 215: 228-243.
Grant, G. (1962) Spinal course and somatotopically localized termination of the spinocerebellar tracts. An experimental study in the cat. Acta Physiol. Scand., 56 (Suppl.) 193: 1-45.
Holmes, G. (1939) The cerebellum of man. Brain, 62: 1-30.
Ito, M. and Simpson, J.I. (1971) Discharges in Purkinje cell axons during climbing fiber activation. Brain Res., 31: 215-219.
Lemkey-Johnston, N. and Larramendi, L.M.H. (1968) Types and distribution of synapses upon basket and stellate cells of the mouse cerebellum: An electron microscopic study. J. Comp. Neurol., 134: 73-112.
Llinás, R. and Mühlethaler, M. (1988) Electrophysiology of guinea-pig cerebellar nuclear cells in the in vitro brain stem-cerebellar preparation. J. Physiol. (Lond.), 404: 241-258.
Llinás, R. and Sasaki, K. (1989) The functional organization of the olivo-cerebellar system as examined by multiple Purkinje cell recordings. Eur. J. Neurosci., 1: 587-602.
Llinás, R. and Welsh, J.P. (1993) On the cerebellum and motor learning. Curr. Opin. Neurobiol., 3: 958-965.
Llinás, R. and Yarom, Y. (1981a) Properties and distribution of ionic conductances generating electroresponsiveness of mammalian inferior olivary neurones in vitro. J. Physiol. (Lond.), 315: 569-584.
Llinás, R. and Yarom, Y. (1981b) Electrophysiology of mammalian inferior olivary neurones in vitro. Different types of voltage-dependent ionic conductances. J. Physiol. (Lond.), 315: 549-567.
Llinás, R. and Yarom, Y. (1986) Oscillatory properties of guinea-pig inferior olivary neurones and their pharmacological modulation: an in vitro study. J. Physiol. (Lond.), 376: 163-182.
Llinás, R., Baker, R. and Sotelo, C. (1974) Electrotonic coupling between neurons in cat inferior olive. J. Neurophysiol., 37: 560-571.
Mano, N., Kanazawa, I. and Yamamoto, K. (1986) Complex-spike activity of cerebellar Purkinje cells related to wrist tracking movement in monkey. J. Neurophysiol., 56: 137-158.
Napper, R.M.A. and Harvey, R.J. (1988) Number of parallel fiber synapses on an individual Purkinje cell in the cerebellum of the rat. J. Comp. Neurol., 274: 168-177.
Otto, S.R., Shannon, R.V., Brackmann, D.E., Hitselberger, W.E., Staller, S. and Menapace, C. (1998) The multichannel auditory brain stem implant: Performance in twenty patients. Otolaryngol. Head Neck Surg., 118: 291-303.
Sato, Y., Miura, A., Fushiki, H. and Kawasaki, T. (1992) Short-term modulation of cerebellar Purkinje cell activity after spontaneous climbing fiber input. J. Neurophysiol., 68: 2051-2062.
Schmidt, E.M., Bak, M.J., Hambrecht, F.T., Kufta, C.V., O'Rourke, D.K. and Vallabhanath, P. (1996) Feasibility of a visual prosthesis for the blind based on intracortical microstimulation of the visual cortex. Brain, 119: 507-522.
Schulman, J.A. and Bloom, F.E. (1981) Golgi cells of the cerebellum are inhibited by inferior olive activity. Brain Res., 210: 350-355.
Simpson, J.I., Wylie, D.R. and DeZeeuw, C.I. (1996) On climbing fiber signals and their consequence(s). Behav. Brain Sci., 19: 384-398.
Sotelo, C., Llinás, R. and Baker, R. (1974) Structural study of inferior olivary nucleus of the cat: Morphological correlates of electrotonic coupling. J. Neurophysiol., 37: 541-559.
Tabachnick, B.G. and Fidell, L.S. (1996) Using Multivariate Statistics. Harper Collins College Publishers, New York, 3rd ed.
Thach, W.T. (1967) Somatosensory receptive fields of single units in cat cerebellar cortex. J. Neurophysiol., 30: 675-696.
Welsh, J.P. (1998) Systemic harmaline blocks associative and motor learning by the actions of the inferior olive. Eur. J. Neurosci., 10: 3307-3320.
Welsh, J.P. and Llinás, R. (1997) Some organizing principles for the control of movement based on olivocerebellar physiology. In: C.I. DeZeeuw, P. Strata and J. Voogd (Eds.), The Cerebellum: From Structure to Control. Prog. Brain Res., 114: 449-461.
Welsh, J.P. and Schwarz, C. (1999) Multielectrode recording from the cerebellum. In: M.A.L. Nicolelis (Ed.), Methods for Neural Ensemble Recordings. CRC Press, Boca Raton, FL, pp. 79-100.
Welsh, J.P., Lang, E.J., Sugihara, I. and Llinás, R. (1995) Dynamic organization of motor control within the olivocerebellar system. Nature, 374: 453-457.
M.A.L. Nicolelis (Ed.)
Progress in Brain Research, Vol. 130
© 2001 Elsevier Science B.V. All rights reserved
CHAPTER 20
Do birds sing? Population coding and learning in the bird song system
Daniel Margoliash *
Department of Organismal Biology and Anatomy, The University of Chicago, 1027 E. 57th St., Chicago, IL 60637, USA
Introduction

Three related issues dominate current discussions of population coding. The first is the nature of the code, and has recently focused on temporal coding: is there information in the timing (precise or otherwise) of individual spikes; if so, how general is this phenomenon; how is this information represented within ensembles of neurons; and how is timing represented at the cell/synaptic level? The second is distributed representations: how is information (genetic or epigenetic) represented in the spatiotemporal patterns of activities of neurons, and how broadly tuned are neurons? The third is how interactions between separate areas of the brain combine to establish computational rules that describe the mechanisms of behavior. Birdsong learning and the analysis of the attendant avian song system are instructive for all three questions.
Temporal coding

Traditionally, extracellular activity in sensory systems has been described by examining changes in the instantaneous rate of sequentially recorded, individual neurons in response to repeated presentations
* Corresponding author: Daniel Margoliash, Department of Organismal Biology and Anatomy, The University of Chicago, 1027 E. 57th St., Chicago, IL 60637, USA. Tel.: +1-773-702-8090; Fax: +1-773-702-0037; E-mail: [email protected]
of the same stimulus. This approach stemmed from technical limitations of single cell electrophysiology, but it also has been justified by the belief that a significant component of spiking is dominated by noise, which cannot be assessed in individual trials. The limitation of such an approach is that it cannot detect information in the joint probability distributions that groups of neurons may exhibit, which may demonstrate cross-interactions that would otherwise not be observed. Simultaneous multiple recordings can address this limitation, but it may be difficult to target the proper set of neurons, the strength of interactions may be rapidly modulated with behavioral dynamics, and it may be difficult to assess the statistical significance of the interactions.

Evidence for temporal coding, however, can also be observed in recordings from single neurons. Although neurons in some systems appear to respond in a stochastic manner, in other cases responses are highly repeatable (see Mainen and Sejnowski, 1995). This leads to the general question of whether greater precision of deterministic activity is hidden in seemingly random activity. Temporal coding can exist at all scales of integration (Reinagal and Reid, 2000). Thus temporal precision of neuronal activity is particularly suggestive of temporal coding, but its absence is not diagnostic of a lack thereof. There are now well-established specific examples of temporal coding phenomena, such as coincidence detection and temporal summation. These are known to play an essential role in initial stages of processing in some sensory systems (Reinagal and Reid, 2000;
Carr et al., 1986a,b; Carr and Konishi, 1990). The question remains as to the generality of temporal coding at higher levels of sensory systems, and the mechanisms for temporal representations.

In the bird song system, several lines of evidence provide support for a temporal coding hypothesis. The RA is the main forebrain output nucleus in the song system, analogous to the primary motor cortex in mammals. RA neurons have precise oscillatory ongoing discharge properties, which result from both intrinsic and network properties (Mooney, 1992). Perhaps the most direct evidence for temporal coding in birdsong is the temporal precision of complex neural activity in the descending motor system in relation to behavior (Yu and Margoliash, 1996). When a bird sings, neurons in RA exhibit precise bursts of activity. Each burst is associated with a local feature of a syllable, possibly related to the activity of a single muscle or small set of muscles in the syrinx (vocal organ). The temporal structure of the burst activity changes with each associated feature. These changes can occur many times in the course of a single syllable that may last 50-300 ms. Nevertheless there is remarkable, submillisecond control of the timing of individual spikes. Because each burst has a unique pattern, and because spike timing is highly reliable, once the experimenter learns the associations between burst patterns and vocal behavior, the behavior can be reliably described from single traces of neural activity. Clearly no averaging need be posited for this system. Rather, an 'instantaneous rate' should be considered, or simply a temporal code where each spike counts. During singing, each syringeal (vocal organ) muscle contributes to each syllable, so that the set of syringeal muscles must be activated in a dynamically complex fashion to produce the notes of a syllable, and in coordination with respiration in the longer intervals between syllables (Goller and Suthers, 1996a,b).
Centrally, this requires dynamic reconfiguration of the RA network driving the brainstem motoneurons. Ensembles of RA neurons could be created based on modulation of strength of coupling between oscillators with the moment-to-moment dynamics of singing. This would provide the complexity of output and high degree of reliability required for the stereotypy of complex vocal output, which is a hallmark of bird song.
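Two ideas from this section, the reliability of individual spike times across renditions and the reading of behavior from a single burst, can be made concrete with a small sketch. It assumes spike times in milliseconds aligned to the onset of the same syllable feature, and the same number of spikes per burst in every rendition. The function names, the template labels, and the spike times are hypothetical and are not taken from the recordings discussed here.

```python
from statistics import stdev


def spike_jitter_ms(renditions):
    """Standard deviation of each spike's time across renditions.

    `renditions` is a list of spike-time lists (ms), one per song rendition.
    """
    if len({len(r) for r in renditions}) != 1:
        raise ValueError("sketch assumes the same spike count per rendition")
    return [stdev(times) for times in zip(*renditions)]


def interspike_intervals(spikes):
    """Intervals between successive spikes of one burst."""
    return [b - a for a, b in zip(spikes, spikes[1:])]


def decode_feature(burst, templates):
    """Label of the template burst whose interval pattern is closest.

    Closeness is the summed squared difference between interspike intervals;
    templates with a different spike count are skipped.
    """
    burst_isis = interspike_intervals(burst)
    best_label, best_distance = None, float("inf")
    for label, template in templates.items():
        template_isis = interspike_intervals(template)
        if len(template_isis) != len(burst_isis):
            continue
        distance = sum((a - b) ** 2 for a, b in zip(burst_isis, template_isis))
        if distance < best_distance:
            best_label, best_distance = label, distance
    return best_label


# Made-up example: three renditions of one burst and two template patterns.
renditions = [[10.0, 12.1, 14.3], [10.2, 12.0, 14.1], [9.9, 12.2, 14.2]]
templates = {"note_a": [0.0, 2.0, 4.0], "note_b": [0.0, 1.0, 5.0]}
jitter = spike_jitter_ms(renditions)
label = decode_feature([10.1, 12.0, 14.2], templates)
```

Matching on interspike intervals rather than absolute times makes the decoding step insensitive to a small shift of the whole burst, which is one simple way to express a code in which each spike counts.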
Temporal domain sensorimotor interactions

Interestingly, in some behavioral states, RA neurons also exhibit auditory responses (Dave et al., 1998). The auditory activity uncovered under these conditions is matched in timing and structure to the pre-motor activity of the same neurons (see below). Thus, the time-domain representation of neuronal activity is not limited to the motor system, but is clearly evident in sensory representations as well. This supports the conclusion that in bird song learning, the problem of forming a mapping between motor and sensory modalities is solved with a mechanism that is sensitive to the timing of individual spikes. Deriving such a mapping is the fundamental problem that birds have to solve during learning.

Neural mechanisms of supervised learning have been studied extensively in sensorimotor systems where the motor output is mapped in a continuous topographic representation of one or two dimensions, and sensory input is correspondingly mapped (see Knudsen, 1994). In contrast, the representation for sensorimotor mapping in song learning is apparently of higher dimensionality, because motor output is not continuously topographically mapped, and the mapping of sensory (auditory) input onto motor output is not topographic. In such cases, a coding scheme based on the timing of individual spikes can act as an efficient representation. In this context, efficiency may result from maximizing information content, supporting dynamic interactions, and mapping without the requirement for topography. The neural solution to the mapping problem observed for song learning may generalize to this class of problems. Such coding schemes may be present in other systems, but may be more difficult to observe because the behaviors are more variable and also less precisely characterized.
With regard to temporal coding, the demonstration that signals are represented by the precise patterns of spiking activity at the level of RA leaves unanswered what is the nature of the code. That is, what information is represented in the burst patterns of RA neurons? The stereotypic nature of birdsong production that has permitted insight into the significance of spike timing also limits its analysis, because neuronal activity under a broad range of motor behaviors, and across a continuum of motor behaviors,
is not observed. It is likely that the complex oscillations observed at the level of single RA neurons during singing reflect the strength and phase of the interactions of the simple individual neuronal oscillators, and their excitatory and inhibitory interactions, when these simple oscillators are driven during singing or in response to auditory playback of song (Spiro et al., 1999). From this perspective, local networks of RA neurons are being dynamically shaped on a feature-by-feature basis during singing, and perhaps with longer time-scale dynamics as well. This may imply the dynamic formation of neuronal ensembles as well as the modulation of individual neurons within a network. Thus, as in other systems, there can be information in the synchronous activity of populations of neurons that cannot be readily observed in the activity of single RA neurons. Such data may be obtained in future experiments that achieve recordings from two or more cells simultaneously. These issues may also be examined in species which sing multiple song repertoires, or under conditions in juveniles and adults when singing is more variable.
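If simultaneous recordings from two cells were available, one standard first look at synchrony would be a cross-correlogram of their spike times. The sketch below is a generic illustration of that technique, not an analysis from this chapter; the function name, the lag window, the bin width, and the spike times are assumptions chosen for the example.

```python
def cross_correlogram(train_a, train_b, max_lag_ms=10.0, bin_ms=1.0):
    """Histogram of spike-time differences (b minus a) within +/- max_lag_ms.

    Both trains are lists of spike times in ms. Bin k covers lags from
    -max_lag_ms + k * bin_ms up to, but not including, the next bin edge.
    """
    n_bins = int(round(2 * max_lag_ms / bin_ms))
    counts = [0] * n_bins
    for t_a in train_a:
        for t_b in train_b:
            lag = t_b - t_a
            if -max_lag_ms <= lag < max_lag_ms:
                index = int((lag + max_lag_ms) // bin_ms)
                counts[min(index, n_bins - 1)] += 1  # guard against rounding
    return counts


# Made-up trains: cell b fires 0.5 ms after cell a on every occasion.
train_a = [10.0, 50.0, 90.0]
train_b = [10.5, 50.5, 90.5]
counts = cross_correlogram(train_a, train_b)
```

A peak confined to the bins around zero lag would suggest near-synchronous firing. Whether such a peak is statistically significant is a separate question that this sketch does not address.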
Distributed representations

Distributed representations are common in systems-level neural network modeling. In one form of a distributed representation, the connections between neurons at a given layer (level) or between layers of a model are either random, or initially the network is fully connected and some of these connections are pruned during the process of training (or self-organization) of the model. The popularity of distributed representations in network models arises partially from the technical simplification this brings to the modeling process: assuming simple patterns of connections results in a reduction of the number of free parameters that must be assigned specific values, or other a priori information required to initially describe the model. Distributed neural network models may have properties, such as graceful degradation of performance in the presence of partial damage to the network, that seem to mimic the performance of real neural systems under similar conditions. The role of spike timing in information coding also seems to support a distributed representation perspective, because the spatial location of the neurons 'informationally' coupled by temporal coincidence may vary on a moment-by-moment basis, and so may be of secondary importance.

Whether such descriptions apply to real neural systems, however, remains in considerable doubt. It is now clear that throughout brainstem, midbrain, and primary forebrain areas, the general rule is precision of connections within and between specific morphological and functional classes of cells. 'Crystalline' precision within topographically organized patterns of connections is more commonly used to describe real neural systems, rather than random or distributed connections. Temporal coding exists within topographically organized structures. Also, although brain lesions can result in a graded reduction of performance, closer analysis of lesioned animals often reveals specific, focal deficits. In contrast, in non-primary, higher levels of the forebrain, the anatomical and functional organization of patterns of connections is less well understood. It is at these levels of the forebrain that multimodal mapping and associational properties are computed in a manner analogous to the mapping computed in neural network models. Thus, the value of the distributed representation perspective remains an open debate. The analysis of the bird song system can contribute to this discussion.

HVc is organized in a distributed fashion
The HVc is a candidate site where information regarding song production and song acoustics is represented in a spatially distributed fashion across populations of neurons. HVc is a major site where auditory and motor activity patterns are synthesized and combined. As suggested above, the sensorimotor mapping in birdsong is apparently a problem with high dimensionality that may not lend itself to resolution into spatial coordinates or systematic connectivity. Anatomically, the data seem to point to a distributed representation of information within HVc (Fortune and Margoliash, 1995). For example, the nuclei robustus archistriatalis (RA) and area X, which receive projections from the nuclei HVc and lateral subdivision of the magnocellular nucleus of the anterior neostriatum (lMAN), exhibit a topographic organization which is ultimately referred to
the muscles of the syrinx (Vicario and Nottebohm, 1988; Johnson et al., 1995). Small injections of tracers restricted to a part of RA or area X retrogradely label neurons only in restricted regions of lMAN. The same injections, however, label neurons throughout HVc. Similarly, auditory inputs to HVc ultimately arise from structures that are tonotopically organized (spatial organization based on sound frequency). However, a topographic organization has yet to be observed in the mapping of afferents onto HVc.

The functional properties of HVc are also consistent with a distributed representation hypothesis (Yu and Margoliash, 1996). Perhaps all HVc neurons are active during singing. The time structure of the premotor activity varies for different syllables; however, typically for each neuron the maximal or average rate of activity does not vary dramatically with syllable type. Individual HVc neurons can exhibit complex premotor patterns that are predicted by the syllable type but not predicted as well by the constituent notes that make up each syllable type. Thus, when analyzing instantaneous firing rates during singing, spatial localization of function within the nucleus based on large-scale units of song (syllables), or finer-scale temporal segmentation of individual syllables, is not apparent.

A similar conclusion emerges when considering auditory activity of HVc neurons. The fundamental sensory property observed in HVc (and throughout the song system) is remarkable selectivity for the individual bird's own song (Margoliash and Konishi, 1985). In playback experiments, in each animal, virtually all HVc neurons with significant auditory responsiveness exhibit stronger responses to that individual's own song than to any other song or stimulus presented. Some neurons may not exhibit suprathreshold auditory activity, but in these cases, when there is subthreshold auditory activity, it too is song selective (Lewicki, 1996; Mooney, 2000).
In each individual bird, there is a tendency for all neurons to share some features of temporal dynamics of response, for example a restricted portion of song that elicits the strongest response, independent of the spatial position of the neurons within the nucleus (Sutter and Margoliash, 1994). A subset of neurons in HVc are temporal combination sensitive (TCS), responding to a sequence of syllables (or notes) but not to the constituent elements of the sequence presented in isolation or in incorrect order (Margoliash, 1983; Margoliash and Fortune, 1992). Such properties are probably established as part of the mechanism for sequence learning. Although many HVc neurons exhibit sequence sensitivity, TCS neurons are easily distinguished from the larger population by their highly non-linear facilitative responses, phasic responses, and low rates of ongoing discharge. Both classes of HVc projection neurons have TCS properties, and there is no apparent spatial segregation of TCS neurons within the nucleus (Lewicki, 1996; Mooney, 2000). Thus, each functional property that has been observed for HVc neurons is observed in a pattern that is spatially distributed throughout the nucleus.
Stimulation and lesion studies of HVc have arrived at similar conclusions regarding spatial organization. During singing, dynamic modulation of song can be achieved by minute, brief electrical stimulation (Vu et al., 1994). In HVc, such stimulation tends to arrest singing at syllable boundaries, with singing then recommencing at the start of a 'motif' (a fixed temporal sequence of syllables). This effect of electrical stimulation during singing is seen at loci throughout HVc. Similarly, when electrical or chemical lesions were made in restricted regions of HVc, parameters of singing performance (e.g. fundamental frequency) were affected. This was particularly compelling in the case when such lesions were made in white-throated sparrows, where unilateral lesions as small as 2% of the nucleus could have temporary effects and unilateral lesions as large as 100% still resulted in songs that could be analyzed. Under these conditions, the degree of frequency shift of the sparrow's songs was related to the size, but not the location, of the lesion within HVc (Hardin and Margoliash, 1992). These results could be interpreted as graceful degradation of performance.
HVc organization is not fully distributed
In spite of the considerable evidence for distributed representation within HVc, there remains the distinct possibility that an underlying anatomical and corresponding functional spatial structure exists that has yet to be detected. By one measure, HVc has three classes of neurons: neurons that exclusively project to either RA (HVc-RAn) or area X (HVc-Xn), and
interneurons. A recent in vivo intracellular study assessed auditory responses of HVc neurons with and without manipulation of membrane potential to determine excitatory and inhibitory components of the response to song (Mooney, 2000). These data demonstrate that all classes of HVc neurons are auditory, and suggest a scheme in which HVc-RAn project onto interneurons, which in turn project onto HVc-Xn. In addition, feedback may exist at many levels of this circuit. If, as this study suggests, there are distinct patterns of feedforward and feedback projections within HVc, then it may be possible to identify an HVc canonical circuit. Furthermore, there are several morphological features of HVc that suggest local structure. Three cytoarchitectonic subdivisions of HVc have been identified (Kirn et al., 1989; Fortune and Margoliash, 1995). Within each subdivision, cells are organized into clusters. The cellular constituents of HVc clusters are not well established, but cells within clusters may communicate in part through gap junctions (e.g. Gahr and Garcia-Segura, 1996), a common feature of cell clusters in dorsal ventricular ridge (Ulinski, 1983). Strong functional connectivity within cell clusters opens the possibility that the cell cluster is a functional unit of organization. One possible model is that intrinsic collaterals that arise throughout the nucleus are processed locally within each cluster. If so, a critical feature that must be assessed is the spatial and functional distribution of intrinsic projections within HVc onto individual clusters of HVc neurons, and the differences between functional properties of cells within and between clusters. These are challenging observations that have yet to be made. In some experiments, it will be necessary to achieve dual intracellular recordings, to examine cell-cell anatomical and functional interactions.
In other experiments, it will be necessary to achieve simultaneous extracellular recordings during singing and during auditory playback of song of two or more single neurons whose projection class is identified, to examine timing of activity of populations of neurons. In summary, HVc is one of the better studied brain nuclei to support a distributed representation hypothesis (Margoliash et al., 1994). Gross anatomical and functional observations of HVc yield results that are consistent with a distributed representation
model, but such observations could not detect spatial organization if it were present at the level of small clusters of cells. Single cell functional extracellular data from HVc are consistent with complexity, but cannot distinguish between distributed and localized (e.g. cell cluster) models. A similar level of ambiguity applies for the analysis of many forebrain regions. For HVc, recent data suggest the existence of a canonical circuit, which constrains the forms of distributed representation. A fully distributed representation, in the sense of neural network models, is unlikely to obtain in HVc.
State-dependent functional re-wiring of the song system
The functional properties of song system neurons have been observed under a variety of behavioral states. The first observations were a comparison of HVc neuronal activity during singing and the auditory response properties of the neurons during broadcast (playback) of songs (McCasland and Konishi, 1981; Yu and Margoliash, 1996). The premotor activity of neurons in area X and nucleus lMAN has also been observed during production of 'vacuum' behavior and compared with activity recorded when the male is singing in the presence of a female (Hessler and Doupe, 1999). The changes in variance of activity patterns observed under these two conditions may be the result of activation of a dopaminergic 'reward' system, including input to area X (Lewis et al., 1981). Circadian patterns of activity have also been observed in RA, HVc, and area X, and these have also been compared with data collected in the anesthetized state (Dave et al., 1998; Schmidt and Konishi, 1998). In recent years, this collection of data, mostly gathered using chronic recording techniques, has become a substantive set of observations with which to constrain models of the function of the birdsong system. In interpreting these data, however, it is important to appreciate the relative difficulty in achieving single neuron recordings in chronic preparations, especially under a variety of behavioral conditions. This implies a potentially significant uncontrolled bias in the sample of single neurons represented in these recordings. Many conclusions from physiological studies are sensitive to such biases; such conclusions must therefore be considered working hypotheses rather than established facts. Finally, to date virtually all the work has been conducted in zebra finches. Although this is a particularly well-studied species, such a narrow focus robs the field of much of its natural comparative strengths.
Auditory-motor interactions in HVc
In zebra finches, in all cases the premotor activity was tonic in nature (cf. recordings from mockingbirds; see McCasland, 1987). The tonic nature of premotor activity was observed in single unit (Yu and Margoliash, 1996) as well as in multiunit (McCasland and Konishi, 1981) recordings. Some of the single neurons recorded also had auditory responses during playback of song, exhibiting the same selective responses to the bird's own song observed for HVc neurons in anesthetized preparations. Of these auditory/motor neurons, some had phasic responses to song playback, a feature of activity that in a recent intracellular study was associated with HVc projection neurons; other neurons had more tonic responses that are more common in HVc interneurons (Mooney, 2000). When the auditory responses of these single neurons were compared with the neurons' premotor activity patterns, in general there was poor correspondence of the temporal structures of the two patterns of physiological activity (Yu and Margoliash, 1996). A parsimonious explanation for this observation is that the premotor activity in the HVc provides strong input to all cells, driving them tonically and more strongly than occurs by auditory stimulation alone. This is consistent with the observation that only some HVc neurons exhibit auditory responses (Margoliash, 1983; Margoliash and Fortune, 1992; see below). This simple explanation does not rule out the possibility that a portion of the premotor activity recorded, that which occurs after syllable onset, is accounted for by auditory feedback.
We have observed for a single HVc neuron, however, an excellent correspondence between premotor and auditory response patterns of activity (Rauske and Margoliash, 1999). For this one cell, the premotor activity preceded each of the introductory notes of song, but thereafter the temporal structure of activity during singing switched, tending to occur
during and after each syllable. When the neuron was presented with song playback, responses also occurred during and after each syllable, and the shape of the response PSTH was quite similar to the PSTH derived from premotor activity. One of two explanations can obtain. Either the activity during singing was predictive of the expected auditory feedback activity (see below), or the activity during singing was the result of actual auditory feedback. Inspection of the timing of peaks of activity during singing (and playback) was insufficient to distinguish between these two possibilities. In other HVc cells recently recorded, however, the pattern of firing during singing, especially the concentration of activity following syllables, has the characteristics of auditory responses and not premotor activity (Rauske and Margoliash, unpublished data). This suggests differential access to auditory feedback during singing for different classes of HVc neurons. Another dynamic feature of HVc activity is the circadian modulation of auditory responses. Many neurons have little or no response to song playback during the daytime hours, but have strong responses to song playback at night. This effect is mimicked when comparing anesthetized with awake preparations (Schmidt and Konishi, 1998). Other HVc neurons exhibit clear auditory responses throughout the day and night. The temporal pattern of activity of such neurons in response to song playback does not change as the bird transitions between the day and night (Rauske and Margoliash, unpublished data). However, no HVc neuron has been observed that has as strong or stronger a response during the day than at night. This is consistent with the idea that each HVc neuron receives broadly from other HVc neurons, which would predict that at night all HVc neurons would receive more auditory input.
RA neurons have virtually no auditory activity during the day (Dave et al., 1998), whereas some auditory responsiveness has been detected in area X during the day (Rauske and Margoliash, unpublished data). Thus, it would seem that the HVc neurons with auditory responses during the day include HVc-Xn, whereas the daytime non-responsive neurons include the HVc-RAn. It will be valuable to determine if auditory input to HVc-RAn is gated during singing, and developmentally regulated in the context of sensorimotor learning.
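The comparison of premotor and playback response patterns described above for a single HVc neuron (similar PSTH shapes during singing and during song playback) can be illustrated with a minimal sketch. This is not the analysis used in the cited work: the alignment of trials to a common event, the 10 ms bin width, the function names `psth` and `pearson`, and all spike times are assumptions chosen only for illustration.

```python
from math import sqrt

def psth(trials, t_start, t_stop, bin_width):
    """Peri-stimulus time histogram: mean rate (spikes/s) per bin.

    `trials` is a list of spike-time lists (seconds), each aligned to the
    same reference event (hypothetically, the onset of a syllable).
    """
    n_bins = int(round((t_stop - t_start) / bin_width))
    counts = [0] * n_bins
    for spikes in trials:
        for t in spikes:
            if t_start <= t < t_stop:
                idx = min(int((t - t_start) / bin_width), n_bins - 1)
                counts[idx] += 1
    return [c / (len(trials) * bin_width) for c in counts]

def pearson(x, y):
    """Pearson correlation of two equal-length histograms (0.0 if flat)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)

# Hypothetical spike times (s) for two singing trials and two playback trials.
singing_trials = [[0.012, 0.015, 0.031], [0.013, 0.034]]
playback_trials = [[0.014, 0.036], [0.011, 0.016, 0.033]]

premotor_psth = psth(singing_trials, 0.0, 0.05, 0.01)
auditory_psth = psth(playback_trials, 0.0, 0.05, 0.01)
similarity = pearson(premotor_psth, auditory_psth)
```

A high correlation only says the two temporal patterns have a similar shape. As noted above, it cannot by itself distinguish activity that predicts auditory feedback from activity that results from it.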
A neural representation of sensorimotor mapping in birdsong
A particularly compelling set of observations has resulted from analysis of behavioral state modulation of RA neurons. In most animals, RA neurons are completely refractory to song playback during the day, so that playback does not modulate the very regular pattern of ongoing neuronal discharge. This is particularly remarkable because the same neurons exhibit strong auditory responses at night (Dave et al., 1998). This has been observed consistently when recording from RA neurons, either while recording from multiple units over a day/night cycle (Dave et al., 1998), or more recently recording from single neurons before, during, and after a period of sleep (Dave and Margoliash, 2000). When the auditory responses of RA neurons in sleeping birds were compared with the premotor data of the same neurons, there were many syllables for which the neuron exhibited premotor activity, but little or no auditory activity. When the neuron exhibited auditory responses, however, there was a remarkable correspondence of those patterns with the premotor patterns. This implies that at the level of RA, the auditory system codes for activity with the same temporal structure of spike bursts used by the motor system to control singing. The most parsimonious explanation for this phenomenon is that the same pattern generating circuit is recruited during singing and in response to song playback. It is important to consider the timing of the auditory activity in response to song playback in relation to the timing of premotor activity. In RA, premotor activity occurs as bursts of spikes that often precede a syllable. The activity is premotor, in the sense that it depends on the following syllable. This may be examined in the case when the bird sings syllable sequences ...A-B... and ...A-C..., or when the bird terminates the song at A.
Interestingly, the auditory-evoked patterns of RA neurons could also commence prior to the onset of syllables (Dave and Margoliash, 2000). Furthermore, there was relatively little lag in the timing of the auditory response compared to the timing of the premotor activity, far less lag than would be predicted for auditory activity that would result from feedback during singing. How does the auditory system accomplish this? To explore
this question, we presented RA neurons recorded in sleeping birds with songs that had systematic deletions of syllables in one part of the song (the syllables were replaced with background noise of the same duration). Under these conditions, the RA neurons stopped responding at specific points in song when a preceding syllable was deleted. The deleted syllable could be one, two, or in some cases even three or more syllables preceding the syllable during which the change in response was observed. These data are reminiscent of the temporal combination sensitive neurons that have been extensively described in HVc. Thus, it appears that the auditory response of RA neurons is a result of sensitivity to a sequence of syllables. The sequence of syllables is used to predict the response to the following syllable. The auditory response is termed a prediction because it can begin prior to the onset of the target syllable but, based on the premotor data of the same neuron, is associated with the target syllable and not the syllable preceding the target syllable. Clearly, the pattern of activity of such neurons could emerge over time during song development in concert with the statistics of singing.
Replay of song during sleep
Additional insight has been garnered by observing the structure of the ongoing discharges of RA neurons recorded at night while birds were sleeping (Dave and Margoliash, 2000). The neuronal activity was observed to have complex bursts that for most neurons resulted in bimodal distributions of interspike intervals, with the short-interval peak in the distribution only observed during nighttime recordings. When the bursts of ongoing neuronal activity were compared with premotor activity of the same neurons, visually compelling matches were observed, in some instances extending over many syllables. By comparing all bursts recorded during sleep with an exemplar of premotor activity, we identified many bursts that matched premotor data; a bootstrap procedure demonstrated these matches were highly statistically significant. It is noteworthy, however, that this match was not perfect. Many patterns observed during sleep, while clearly (and statistically significantly) derived from the sensorimotor patterns of activity, were patterns that were never observed in the
premotor data. At least at a single neuron level, replay of song during sleep engenders embellishments on the normal singing behavior otherwise observed. Thus replay during sleep could explore patterns of singing not observed in actual singing performance. The observation that spiking activity at night carries meaningful information related to singing that occurs during the day suggests the possibility that the patterns observed during sleep are involved in some aspect of song regulation, and by extension that song learning in juveniles may evidence a circadian pattern (see Tchernichovski et al., 2001). This hypothesis is consistent with a large literature relating sleep to learning (e.g., Karni et al., 1994). One implication of such a hypothesis is that there may be greater modulation of singing behavior in adult birds than has been traditionally accepted. The classical deafening experiments found large effects on song development in juvenile-deafened birds, but little or no disruption of song in adult-deafened birds (Konishi, 1965). This, plus the stereotypy of adult song observed in many species, has led to the implicit assumption that there is little auditory-feedback mediated sculpting of song in the adult. This assumption has been shown to be incorrect, at least for some species. For example, zebra finches were shown to maintain their adult songs by reference to auditory feedback: adults that were deafened showed marked song deterioration, but only after a period of weeks to months (Nordeen and Nordeen, 1992). In the related Bengalese finches, songs of adult deafened birds may deteriorate even faster (Okanoya and Yamaguchi, 1997). When realtime analysis was used to provide disruptive feedback time-locked to specific vocal elements produced by adult zebra finches, the song deteriorated, but slowly, over a period of weeks and months (Leonardo and Konishi, 1999).
In this case, it was possible to remove the abnormal feedback, and demonstrate that birds recovered their original songs. Thus, zebra finches have a stable internal representation of song that is maintained and can be modified by auditory feedback, but was not abolished by the abnormal auditory feedback that was presented. These features of internal representations may map onto the two main forebrain pathways, as described below. More recently, preliminary work has identified subtle, but consistent, changes in vocal output of
intact, unmanipulated zebra finch adults (Chi and Margoliash, unpublished data). The variation in vocal output was observed only after a high-resolution analysis procedure was developed to identify local acoustic features of syllables of songs. A complex pattern of variation was observed within and between songs, and variation was observed both in the acoustics of song and the timing of RA single neuron activity in relation to those acoustics. Whether the variation manifests itself as a circadian pattern is currently under investigation. Such analyses can also directly address the issue of coding of RA neurons, identifying the variation in features of vocal behavior associated with variation in specific burst patterns.
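The matching of sleep bursts against an exemplar of premotor activity, and the assessment of its significance, can be sketched in a few lines. The details of the bootstrap procedure are not given here, so the sketch below substitutes a simple interval-shuffling resampling test. The overlap score, the 5 ms bins, the function names, and all spike times are assumptions made for illustration, not the published method.

```python
import random

def bin_train(spike_times, duration, bin_width):
    """Spike counts per bin for one spike train (times in seconds)."""
    n_bins = int(round(duration / bin_width))
    counts = [0] * n_bins
    for t in spike_times:
        if 0 <= t < duration:
            counts[min(int(t / bin_width), n_bins - 1)] += 1
    return counts

def match_score(a, b):
    """Overlap of two binned trains: sum of per-bin minimum counts,
    divided by the larger total spike count (1.0 means identical)."""
    total = max(sum(a), sum(b))
    if total == 0:
        return 0.0
    return sum(min(x, y) for x, y in zip(a, b)) / total

def shuffled_train(spike_times, rng):
    """Surrogate train: same first spike and same interspike intervals,
    but with the intervals in random order."""
    isis = [t2 - t1 for t1, t2 in zip(spike_times, spike_times[1:])]
    rng.shuffle(isis)
    train = [spike_times[0]]
    for isi in isis:
        train.append(train[-1] + isi)
    return train

def resampling_p_value(sleep, premotor, duration, bin_width,
                       n_resamples=2000, seed=0):
    """Observed match score and the fraction of interval-shuffled
    surrogates that match the premotor exemplar at least as well."""
    rng = random.Random(seed)
    template = bin_train(premotor, duration, bin_width)
    observed = match_score(bin_train(sleep, duration, bin_width), template)
    hits = 0
    for _ in range(n_resamples):
        surrogate = bin_train(shuffled_train(sleep, rng), duration, bin_width)
        if match_score(surrogate, template) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_resamples + 1)

# Hypothetical premotor exemplar and a sleep burst with slight spike jitter.
premotor = [0.011, 0.014, 0.017, 0.101, 0.104, 0.191, 0.194, 0.197]
sleep = [0.012, 0.014, 0.018, 0.102, 0.104, 0.192, 0.194, 0.198]
observed, p_value = resampling_p_value(sleep, premotor, 0.25, 0.005)
```

Shuffling interspike intervals preserves the interval distribution of the sleep burst while destroying its temporal order, so a small p-value here indicates that the order of intervals, not just their statistics, matches the premotor pattern.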
Models of birdsong learning
Two pathways
A fundamental distinction in the birdsong system has been the roles of the two main forebrain pathways that have been the focus of most of the experimental attention. The vocal motor pathway (VMP) includes HVc, RA, and the descending and feedback projections of RA (Fig. 1). As described above, lesion, electrical stimulation, and physiological recordings during singing all support a role for this pathway in motor control. The second, anterior forebrain pathway (AFP) has long been thought to act as some sort of motor control/error correction circuit. This is consistent with its analogous organization with basal ganglia pathways in mammals (see Bottjer and Johnson, 1997; Luo and Perkel, 1999a). This circuit starts with area X, which receives input from HVc. A feedback projection from RA combines with the output from area X in the thalamic structure DLM. DLM in turn projects to lMAN, from which arise feedback projections to area X and a projection to RA. Lesion studies provided initial insight into the role of the AFP in song learning. Lesions of lMAN disrupt song acquisition in juvenile birds, yet lesions of lMAN have little effect on otherwise intact adult birds (Bottjer et al., 1984). The adult songs of birds that sustained lMAN lesions as juveniles are simple and highly stereotyped, and crystallize early in development. In contrast, juvenile birds that receive lesions of area X develop songs in adulthood that lack stereotypy; these songs are somewhat suggestive of the songs that develop from birds deafened early in life (Sohrabji et al., 1990; Scharff and Nottebohm, 1991).
Fig. 1. A simplified schematic of some basic forebrain connections of the song system discussed in this paper. Inputs arrive at HVc (here used as its proper name) from multiple sources including auditory and premotor structures. HVc projects to the nucleus robustus archistriatalis (RA), which has many outputs including projections to the brainstem nuclei controlling vocal muscles and respiration. HVc also projects to area X, the start of a three-nuclei pathway termed the anterior forebrain pathway (AFP). Area X projects to the medial subdivision of the dorsolateral nucleus of the thalamus (DLM), which projects to the lateral subdivision of the magnocellular nucleus of the anterior neostriatum (lMAN). Feedback pathways from RA to DLM and lMAN to area X are also shown.
The dichotomy between the developmental effects of area X and lMAN lesions is important because it gives some confidence that the site of the lesion is related to the observed effect, always a difficult problem in interpreting lesion studies. Attempts have been made to classify the role of the AFP in juvenile song learning and adult song maintenance as 'auditory' or 'motor'. Yet we now know that the AFP has singing-related changes in gene expression (Jarvis and Nottebohm, 1997), auditory activity (Doupe, 1997), changes in premotor activity patterns in relation to behavioral state (Hessler and Doupe, 1999), may be involved in consolidation of song models during the sensory phase of acquisition (Basham et al., 1996), and may contribute to
the perception of adult songs (Hamilton et al., 1997). Clearly the relation between activity in AFP and behavior is complex, and is not easily captured by such simple distinctions. It has also been common in the literature to attempt to distinguish between the AFP and the VMP based on global criteria, such as the claim that the AFP is the 'learning' pathway, sometimes made in the context of the differential effects of lesions in the two pathways. Technically, this approach is flawed because lesions of the VMP always affect song output (it is a motor pathway); this may obscure other actions of the VMP. Conceptually, this approach is flawed because the criteria used to distinguish these pathways fail to provide distinct functional definitions of the nuclei, which can support specific, hence falsifiable, models. For example, it is also the case that the VMP is premotor (Nottebohm et al., 1976; McCasland and Konishi, 1981), auditory (McCasland and Konishi, 1981; Williams and Nottebohm, 1985; Margoliash, 1986), sensitive to behavioral state (Dave et al., 1998; Schmidt and Konishi, 1998), and contributes to adult song perception (Brenowitz, 1991). A direct test of the role of the VMP in sensory acquisition has yet to be reported, but see Bolhuis et al. (2000). The distinction between the roles of the AFP and VMP awaits morpho-functional descriptions of the activities of the various nuclei during the singing and learning processes. More specific models of the AFP have slowly begun to emerge. One major insight was the observation of differential development of HVc and lMAN axons projecting to RA (Gurney, 1981). In zebra finches, lMAN axons arrive in RA by about 15 days of age, whereas HVc axons do not arrive until about 25 days of age (Konishi and Akutagawa, 1985; Herrmann and Arnold, 1991).
The lMAN axons access the NMDA-type glutamate receptor exclusively, whereas the HVc axons access both NMDA- and AMPA-type glutamate receptors in RA (Mooney and Konishi, 1991; Stark and Perkel, 1999). These data support the hypothesis that lMAN inputs may act to organize HVc inputs and RA local circuits during development. Because lesions of the AFP have little effect on adult song, the potential role of the AFP in adult song production had been ignored. This is a common error in interpretation of lesion data. Indeed, recent
data provide direct evidence of a role of the AFP in regulating adult song production. When the nerve innervating the syrinx (vocal organ) is cut, adult zebra finches typically produce abnormal songs, which only recover as the nerve grows back over a period of many weeks. In contrast, when lMAN was lesioned bilaterally in conjunction with peripheral nerve cuts, there was much less reorganization of song (Williams and Mehta, 1999). This clever experiment yielded important results, demonstrating an active role of lMAN in song maintenance, and suggesting that the modulatory role of lMAN is normally balanced or canceled by normal auditory feedback. This interpretation has been verified and extended by the careful analysis of the effects of lMAN lesions in deafened birds (Brainard and Doupe, 2000).
Solving the temporal credit assignment problem
Models have value in providing succinct explanations of observed phenomena and making novel predictions whose verification and falsification can unambiguously distinguish between interesting classes of mechanisms. Network computational modeling in neurobiology has been more successful in providing global descriptions than in directing specific experimental programs. For birdsong studies, network models have been valuable in focusing attention on the computational implications of feedback timing (Doya and Sejnowski, 1995; Troyer et al., 1996). In sensorimotor integration, such as feedback control of birdsong, a fundamental problem is how to compensate for the time interval between when a motor pattern is generated and when sensory feedback is received. The delay between the command and the error signal can be long; in birdsong it is estimated to be on the order of 70-100 ms. The match in auditory/motor properties of RA neurons in adult birds is one endpoint of sensorimotor learning in this system: a neural representation of the solution to the problem of mapping auditory feedback onto motor output. In one possible model to explain this result, during juvenile singing it is hypothesized that RA neurons exhibit premotor activity in response to premotor activity in HVc, but RA is also activated in response to auditory feedback arriving from HVc. The feedforward prediction
expressed by the auditory activity of RA neurons is ultimately compared against other RA neurons, with the output of the AFP (nucleus lMAN) providing the evaluative signal that modifies connections in RA. The major feature of this model is that feedback is used in realtime to modify motor output, via the HVc to RA projection. The model predicts that RA neurons express auditory activity during singing in juvenile birds, and that the activity of RA neurons becomes refractory to auditory feedback as song develops. We have yet to formulate an explicit solution to the temporal credit assignment problem under this hypothesis (but see Troyer and Doupe, 2000). An alternative and complementary model embraces a role for circadian patterns in learning (Dave and Margoliash, 2000). The neuronal data share some similarity with temporal difference models of sensorimotor integration. In the birdsong system, we do not yet have a learning rule specified; however, a plausible heuristic model motivated by these data can be proposed. Briefly, this model proposes that daytime singing results in sensorimotor efference copy signals that arise in RA, traverse the AFP, and are delayed in DLM. By this model it is predicted that the delay is of sufficient magnitude to bring the efference copy signal that arrives in area X (via the lMAN projection) into temporal coincidence with realtime auditory feedback that arrives in area X from HVc. Synaptic mechanisms with considerable delay have been described for DLM neurons (Luo and Perkel, 1999b). The comparison in area X could be used to store information (probably directly in area X) about motor behavior, expected results, and observed results. Such information would not modify motor output in realtime, possibly because lMAN projections onto RA would not be in temporal coincidence with RA activity. At night, the bursting activity observed throughout the song system in sleeping birds would permit access to the information stored during the day.
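The timing requirement of this heuristic model can be made concrete with a small worked calculation. Only the 70-100 ms feedback delay comes from the text; the conduction time along the RA to DLM to lMAN to area X route, the coincidence window, and the function names below are hypothetical values and labels chosen for illustration.

```python
def arrival_times(motor_onset_ms, feedback_delay_ms,
                  afp_conduction_ms, dlm_delay_ms):
    """Arrival times (ms) in area X of (a) real auditory feedback via HVc
    and (b) the efference copy routed RA -> DLM -> lMAN -> area X."""
    feedback = motor_onset_ms + feedback_delay_ms
    efference = motor_onset_ms + afp_conduction_ms + dlm_delay_ms
    return feedback, efference

def required_dlm_delay(feedback_delay_ms, afp_conduction_ms):
    """DLM delay (ms) that would align the efference copy with feedback."""
    return max(0.0, feedback_delay_ms - afp_conduction_ms)

def coincident(t1_ms, t2_ms, window_ms):
    """True if two arrival times fall within a coincidence window."""
    return abs(t1_ms - t2_ms) <= window_ms

# Feedback delay taken from the 70-100 ms range given in the text;
# the 15 ms conduction time and 10 ms window are assumptions.
FEEDBACK_DELAY_MS = 85.0
AFP_CONDUCTION_MS = 15.0
WINDOW_MS = 10.0

dlm_delay = required_dlm_delay(FEEDBACK_DELAY_MS, AFP_CONDUCTION_MS)
with_delay = arrival_times(0.0, FEEDBACK_DELAY_MS, AFP_CONDUCTION_MS, dlm_delay)
without_delay = arrival_times(0.0, FEEDBACK_DELAY_MS, AFP_CONDUCTION_MS, 0.0)
aligned = coincident(with_delay[0], with_delay[1], WINDOW_MS)
misaligned = coincident(without_delay[0], without_delay[1], WINDOW_MS)
```

Under these assumed numbers, most of the feedback latency would have to be made up by the delay in DLM, which is why the existence of slow synaptic mechanisms in DLM matters for the plausibility of the model.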
The bursting activity at night would be approximately synchronous throughout the system, so that replay would not need to compensate for the long delays that real feedback engenders. The nighttime replay would represent a mechanism for sensorimotor consolidation of patterns that were practiced during the day. The major feature of this model is that feedback is compared with a sensorimotor efference
copy signal in area X. By these mechanisms, motor output is not modified in realtime; instead, local RA circuits are modified at night during bursts of spikes that replay song. The model predicts that during singing, but not at night, lMAN activity significantly lags activity in RA. It presents an explicit solution to the temporal credit assignment problem.
Conclusion
Birdsong learning is rapidly approaching the phase when models that make precise predictions can be proposed, tested, and modified or discarded. The behavior is an example of a general class of problems in reinforcement learning in sensorimotor systems. Insights into its implementation in real neural circuits are likely to be of general interest.
Acknowledgements
I thank Amish S. Dave for a valuable critique of this manuscript. Supported by NIH Grants MH59831 and MH60276.
M.A.L. Nicolelis (Ed.)
Progress in Brain Research, Vol. 130
© 2001 Elsevier Science B.V. All rights reserved
CHAPTER 21
Accuracy and learning in neuronal populations
Kechen Zhang 1,* and Terrence J. Sejnowski 1,2
1 Howard Hughes Medical Institute, Computational Neurobiology Laboratory, The Salk Institute for Biological Studies, La Jolla, CA 92037, USA
2 Department of Biology, University of California, San Diego, La Jolla, CA 92093, USA
Introduction
The information about various sensory and motor variables contained in neuronal spike trains may be quantified by either Shannon mutual information or Fisher information. Although they are related, the Fisher information measure is more convenient for dealing with continuous variables, which are more common in lower-level sensory and motor representations. The accuracy of encoding and decoding by a population of neurons as described by Fisher information has some general properties, including a universal scaling law with respect to the width of the tuning functions. The theoretical accuracy for reading out information from population activity can be reached, in principle, by Bayesian reconstruction, which can be simplified by exploiting Poisson spike statistics. The Bayesian method can be implemented by a feedforward network, where the desired synaptic strength can be established by a Hebbian learning rule that is proportional to the logarithm of the presynaptic firing rate, suggesting that the method might be potentially relevant to biological systems.
* Corresponding author: Kechen Zhang, Computational Neurobiology Laboratory, The Salk Institute for Biological Studies, La Jolla, CA 92037, USA. Tel.: +1-858-453-4100, ext. 1420; Fax: +1-858-587-0417; E-mail:
[email protected]
Accuracy of neural coding: Fisher information vs. Shannon information
Neuronal spike trains carry information about various sensory and motor variables. To quantify the amount of information, Shannon's mutual information has often been used (Rieke et al., 1997). Fisher information was introduced in the 1920s, two decades earlier than the introduction of Shannon information, but the application of Fisher information to neural coding is more recent (Paradiso, 1988; Lehky and Sejnowski, 1990). To compare the two information measures, consider a neuron that encodes a one-dimensional variable x by the number of spikes n evoked within a certain time interval. Shannon mutual information between the encoded parameter x and the number of spikes n is defined as

$$I = \sum_{n,x} p(n, x) \ln \frac{p(n, x)}{p(n)\,p(x)} \tag{1}$$

$$\phantom{I} = \sum_{n,x} p(n \mid x)\,p(x) \ln \frac{p(n \mid x)}{p(n)}, \tag{2}$$

where the second step is an identity based on the definition of conditional probability p(n, x) = p(n | x) p(x), together with p(n) = Σ_x p(n | x) p(x). The sums here are over all possible values of the encoded variable x and the number of spikes n. If x is continuous, the sum is understood as an approximation that approaches an integral. Fisher information is defined
as

$$J(x) = \left\langle \left( \frac{\partial}{\partial x} \ln p(n \mid x) \right)^{2} \right\rangle \tag{3}$$

$$\phantom{J(x)} = \sum_{n} \frac{p'(n \mid x)^{2}}{p(n \mid x)}, \tag{4}$$

where the average ⟨ ⟩ is over n with respect to the conditional probability distribution p(n | x), and the second step is the result of the average, with p'(n | x) = ∂p(n | x)/∂x being a derivative function that describes how the firing of the neuron is affected by a slight change of the encoded variable x. Note that Fisher information J(x) is defined for each value of the encoded variable x, whereas Shannon information averages over the whole range of values according to the distribution p(x). Thus Shannon information is a global measure and Fisher information is a local measure for the coding of the variable of interest. Fisher information is defined for a continuous variable x only, and is closely related to the minimal variance for estimating the variable from the spikes.

These two information measures are related in several ways. When a probability distribution is perturbed by adding gaussian noise, the rate of change of Shannon entropy is proportional to the Fisher information according to de Bruijn's identity (Cover and Thomas, 1991). Another relation is that Shannon mutual information between a probability density and its slightly shifted version is proportional to Fisher information (Frieden, 1998). The two measures are also related by an inequality because, given the Fisher information or given the variance of parameter estimation, the entropy cannot exceed that of a gaussian distribution with the same variance (Brunel and Nadal, 1998).

Examples of Fisher information
Sensory and motor variables are typically represented by a population of broadly tuned neurons. We consider several commonly encountered tuning functions and give the total Fisher information for a population of these neurons. Fisher information J can characterize how accurately a variable x is encoded by the population because its inverse is the Cramér-Rao lower bound on the variance or mean square error:

$$E[\varepsilon^{2}] \ge \frac{1}{J}, \tag{5}$$

which applies to all possible unbiased estimation methods that can read out the variable x from the population activity without systematic error (Paradiso, 1988; Seung and Sompolinsky, 1993; Snippe, 1996). Here ε² is the square error for estimating variable x in a single trial, and E[ε²] is the average error across repeated trials with a fixed x. The error is caused by the randomness of spikes so that the true value of x can never be completely determined, regardless of the method for reading out the information. If the population of neurons represents a continuous D-dimensional vector variable x = (x_1, x_2, ..., x_D), the square error in a single trial is given by

$$\varepsilon^{2} = \varepsilon_{1}^{2} + \varepsilon_{2}^{2} + \cdots + \varepsilon_{D}^{2}, \tag{6}$$

with ε_i the error for estimating x_i. Eq. 5 is still valid assuming that different representations in different dimensions are uncorrelated.

In the following, we always assume that there are N neurons with identical tuning functions whose peak positions are scattered uniformly across the range of the encoded variable. Because of the uniform distribution, the Fisher information is the same for all values of the variable. The spikes from different neurons are assumed to be independent so that the total Fisher information for the whole population is the sum of the Fisher information for each of the neurons. The spike statistics are assumed to be Poisson, a reasonable first approximation. The time interval in which the spikes are collected is always indicated by τ.

Example 1: cosine tuning
A cosine tuning function is given by

$$f = A \cos\theta + B, \tag{7}$$

where f is the mean firing rate, θ is the encoded variable, and A and B are two constants. Here θ is a one-dimensional variable, corresponding to a reaching direction in two-dimensional workspace. The total Fisher information for N neurons with preferred directions distributed uniformly around the circle is

$$J = \tau N \left( B - \sqrt{B^{2} - A^{2}} \right). \tag{8}$$
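
As a numerical sanity check of Eq. 8, the following Python sketch sums the single-neuron Poisson Fisher information τf′²/f over N evenly spaced preferred directions and compares the result with the closed form. The function names and the parameter values (A = 20 and B = 30 spikes/s, τ = 0.5 s, N = 200) are illustrative assumptions, not values taken from the chapter; B > A is needed so that all rates stay positive.

```python
import math

def cosine_population_fisher(n_neurons, a, b, tau, theta=0.3):
    """Sum tau * f'(theta)^2 / f(theta) over evenly spaced preferred directions."""
    total = 0.0
    for i in range(n_neurons):
        pref = 2.0 * math.pi * i / n_neurons
        rate = a * math.cos(theta - pref) + b      # Eq. 7 centered on this neuron's preferred direction
        slope = -a * math.sin(theta - pref)        # derivative of the rate with respect to theta
        total += tau * slope * slope / rate        # Poisson Fisher information of one neuron
    return total

def cosine_closed_form(n_neurons, a, b, tau):
    """Closed form of Eq. 8."""
    return tau * n_neurons * (b - math.sqrt(b * b - a * a))

# Illustrative (assumed) parameter values.
numeric = cosine_population_fisher(n_neurons=200, a=20.0, b=30.0, tau=0.5)
closed = cosine_closed_form(n_neurons=200, a=20.0, b=30.0, tau=0.5)
print("numerical sum:", numeric)
print("Eq. 8        :", closed)
print("Cramer-Rao bound on the standard deviation (rad):", 1.0 / math.sqrt(closed))
```

The two numbers should agree closely for any test direction theta, because an evenly spaced sum over a smooth periodic function converges quickly to the corresponding integral.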

Example 2: circular normal tuning
The mean firing rate of a circular normal tuning function is given by

$$f = C \exp(K \cos\theta), \tag{9}$$

where θ is a one-dimensional circular variable, and C and K are constants. The total Fisher information for N neurons is

$$J = \tau N C K I_{1}(K), \tag{10}$$

where I_1(K) = i J_{-1}(iK) is the modified Bessel function of imaginary argument, with J_{-1} being the Bessel function of the first kind of order −1. For small K, the Bessel function becomes

$$I_{1}(K) \approx \frac{K}{2}, \tag{11}$$

and the circular normal tuning function in Eq. 9 approaches the cosine tuning function in Eq. 7, with B ≈ C and A ≈ CK. Now the Fisher information formula in Eq. 8 is recovered. For large K, the Bessel function becomes

$$I_{1}(K) \approx \frac{\exp(K)}{\sqrt{2\pi K}}, \tag{12}$$

and the circular normal tuning function in Eq. 9 approaches the gaussian tuning function in Eq. 13 (D = 1), with σ² ≈ 1/K, F ≈ C exp(K) and η ≈ N/(2π), so that the Fisher information formula in Eq. 14 (D = 1) is recovered.

Example 3: gaussian tuning
The mean firing rate of a gaussian tuning function in a space of dimension D is given by

$$f = F \exp\left( -\frac{x_{1}^{2} + \cdots + x_{D}^{2}}{2\sigma^{2}} \right), \tag{13}$$

where (x_1, x_2, ..., x_D) are the encoded variables, F is the peak firing rate and σ is the tuning width parameter. The centers of the tuning functions for different neurons are assumed to be uniformly distributed. Total Fisher information is

$$J = \frac{(2\pi)^{D/2}}{D}\, \eta \tau F \sigma^{D-2}, \tag{14}$$

where η is the number of neurons per unit volume in the D-dimensional space of the encoded variables (Zhang et al., 1998).

Correlated noise
In the examples above, the total Fisher information is always proportional to the total number of neurons as well as the total number of spikes from the whole population of neurons. This is a consequence of the assumptions of the independence of different cells and the Poisson spike statistics, which implies that the information for each cell is proportional to f′²/f, which in turn is proportional to the peak firing rate. Here f is the tuning function and f′ is its derivative with respect to the encoded variable. The property that Fisher information is proportional to the total number of neurons and total number of spikes still holds even when the neurons are not independent but have pairwise correlations of their noise. The initial observation (Zohary et al., 1994) that even weak correlated noise may destroy this proportionality depends on how this information is read out (Abbott and Dayan, 1999). The exact results also depend on the form of the correlations (Yoon and Sompolinsky, 1999). Sometimes even when the correlated noise is deliberately ignored, the decoding error may still be the same as in the independent case (Wu et al., 2000). In particular, as shown below, the universal scaling law for tuning width is insensitive to some common forms of noise correlation.
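
The circular normal result in Eq. 10 and its two limits (Eqs. 11 and 12) can be checked in the same spirit. The sketch below is only illustrative: `bessel_i1` evaluates I_1(K) from its standard power series, and the values C = 10 spikes/s, K = 2, τ = 0.5 s and N = 400 are assumptions chosen for the demonstration.

```python
import math

def bessel_i1(k, terms=60):
    """Modified Bessel function I_1(k) from its power series."""
    total = 0.0
    for m in range(terms):
        total += (k / 2.0) ** (2 * m + 1) / (math.factorial(m) * math.factorial(m + 1))
    return total

def circular_normal_fisher(n_neurons, c, k, tau, theta=0.7):
    """Sum tau * f'^2 / f over evenly spaced preferred directions for Eq. 9."""
    total = 0.0
    for i in range(n_neurons):
        phase = theta - 2.0 * math.pi * i / n_neurons
        rate = c * math.exp(k * math.cos(phase))   # Eq. 9 centered on this neuron
        slope = -k * math.sin(phase) * rate        # derivative with respect to theta
        total += tau * slope * slope / rate
    return total

# Illustrative (assumed) parameter values.
n_cells, c_amp, k_conc, tau = 400, 10.0, 2.0, 0.5
numeric_j = circular_normal_fisher(n_cells, c_amp, k_conc, tau)
closed_j = tau * n_cells * c_amp * k_conc * bessel_i1(k_conc)   # Eq. 10
print("numerical sum:", numeric_j, " Eq. 10:", closed_j)

# Limits of the Bessel function used in Eqs. 11 and 12.
small_ratio = bessel_i1(0.01) / (0.01 / 2.0)
large_ratio = bessel_i1(50.0) / (math.exp(50.0) / math.sqrt(2.0 * math.pi * 50.0))
print("I1(K) / (K/2) at K = 0.01:", small_ratio)
print("I1(K) / (exp(K)/sqrt(2 pi K)) at K = 50:", large_ratio)
```

Both ratios should be close to one, with the large-K ratio approaching one only slowly as K grows.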
Universal scaling law for tuning width
In Example 3 in the preceding section, the Fisher information scales with the tuning width σ according to ησ^(D−2), where D is the dimension of the encoded variable, and η is the density of neurons for the encoded variable. This specific example assumes a gaussian tuning function, Poisson spike statistics and independence of different neurons. In general, the accuracy of population coding by tuned neurons as a function of tuning width follows the same universal scaling law regardless of the exact shape of the tuning function and the exact probability distribution of spikes, and allows some correlated noise between neurons. The general results are described below, followed by an intuitive explanation of this universal scaling
(Fig. 1). Formal treatment of the result is given in Zhang and Sejnowski (1999a), following earlier works on related issues (Hinton et al., 1986; Baldi and Heiligenberg, 1988; Snippe and Koenderink, 1992; Zohary, 1992; Zhang et al., 1998).

Fig. 1. Schematic diagrams showing the tuning curves of a population of neurons, with each tuning curve describing the mean firing rate of a neuron as a function of the value of the encoded variable. The accuracy of the encoding by the whole population depends on the width of the tuning curves in a universal manner regardless of the exact shape of the curve and the exact spike statistics, provided that the tuning width is not too small compared with the spacing of the tuning curve centers (vertical ticks), and also not too large compared with the full range of the variable.

The tuning function can be an arbitrary radially symmetric function that describes the dependence of the mean firing rate f(x) of a neuron on the variable of interest x = (x_1, x_2, ..., x_D):

$$f(\mathbf{x}) = F\, \phi\!\left( \frac{|\mathbf{x} - \mathbf{c}|^{2}}{\sigma^{2}} \right), \tag{15}$$

which depends only on the Euclidean distance to the center c. Here σ is a parameter that describes the tuning width, which scales the tuning function without changing its shape. This general formula includes all radially symmetric functions, of which gaussian tuning is a special case. The probability P(n | x, τ) for n spikes to occur within a time interval of length τ can be an arbitrary function of the mean firing rate f(x). These general assumptions are strong enough to prove that the total Fisher information has the form:

$$J \propto \eta \sigma^{D-2}, \tag{16}$$

where η is the number of neurons whose tuning function centers fall into a unit volume in the D-dimensional space of the encoded variable, assuming that these neurons fire independently, and the centers of the tuning functions are uniformly distributed in the space of the encoded variable. The proportionality constant (not shown) may depend on the time interval τ and the exact shape of the tuning function φ.

The universal scaling law with the factor σ^(D−2) implies that for a one-dimensional feature (D = 1), more information can be coded per neuron for a sharper tuning curve, provided that all other factors are fixed, such as peak firing rate and noise correlation. For two-dimensional features (D = 2), such as the spatial representation by hippocampal place cells, coding accuracy should be insensitive to the tuning width (Zhang et al., 1998). In three and higher dimensions (D ≥ 3), such as the multiple visual features represented concurrently in the ventral stream of the primate visual system, more information can be coded per neuron by broader tuning. Although sharpening makes individual neurons appear more informative, it reduces the number of simultaneously active neurons, a factor that dominates in higher dimensions where neighboring tuning functions overlap more substantially. On the other hand, sharpening can always improve the Fisher information coded per spike and thus energy efficiency for spike generation (Zhang and Sejnowski, 1999a).

The universal scaling law in Eq. 16 still holds when the firing rate fluctuations of different neurons are weakly correlated. The result is

$$J \propto \eta \sigma^{D-2} \left( \frac{A}{1 - q} + B \right), \tag{17}$$

ignoring contributions from terms slower than linear with respect to the population size, where A and B are constants independent of the correlation strength q (Zhang and Sejnowski, 1999a). Here the noise covariance between neurons i and j is given by the average:

$$C_{ij} = E[(n_{i} - \mu_{i})(n_{j} - \mu_{j})] = \begin{cases} C_{i}^{2} & \text{if } i = j, \\ q\, C_{i} C_{j} & \text{otherwise}, \end{cases} \tag{18}$$

where C_i = ψ(μ_i) is an arbitrary function of the mean number of spikes μ_i = E[n_i] within a certain time interval. The spike statistics are assumed to follow a multivariate gaussian distribution. Thus the noise correlation does not affect the scaling factor ησ^(D−2), and the proportionality constant is affected by the factor 1/(1 − q), which slightly increases the Fisher information when there is positive correlation (q > 0).

An intuitive derivation of the universal scaling factor ησ^(D−2) is as follows. The average square error for estimating the encoded variable from the activity of a population of neurons should scale as:

$$E[\varepsilon^{2}] \propto \frac{\sigma^{2}}{N'}, \tag{19}$$

where N′ is the total number of activated neurons for each fixed value of the encoded variable. The factor 1/N′ may be justified by the square root law for using a large number of independent neurons. The factor σ² arises because of the dimensionality requirement: if the length scale of the encoded variable is changed, both the value of the tuning width and the square error should change accordingly, leading to the factor σ². Because each neuron is tuned with width σ in each of the D dimensions, the number of activated neurons in response to a fixed value of the encoded variable should be proportional to:

$$N' \propto \eta \sigma^{D}, \tag{20}$$

where η is the density of neurons as before. Substituting Eq. 20 into Eq. 19 yields the average square error:

$$E[\varepsilon^{2}] \propto \frac{1}{\eta \sigma^{D-2}}. \tag{21}$$

Using Eq. 5, we see that the Fisher information should scale as ησ^(D−2), which is the same as Eq. 16. More generally, if the tuning function is not radially symmetric but has different widths for different dimensions, then Eq. 20 should be replaced by

$$N' \propto \eta \sigma_{1} \sigma_{2} \cdots \sigma_{D}, \tag{22}$$

so that the average square error for the i-th dimension should be:

$$E[\varepsilon_{i}^{2}] \propto \frac{\sigma_{i}^{2}}{N'} \propto \frac{\sigma_{i}^{2}}{\eta \sigma_{1} \sigma_{2} \cdots \sigma_{D}}. \tag{23}$$

A formal derivation of this result using Fisher information is given in Eurich and Wilke (2000). If
the width σ_i in dimension i is sharpened while the widths of all other dimensions are kept constant, the error should decrease in proportion to σ_i, consistent with the previous result for a one-dimensional problem. When the tuning widths of different dimensions are identical, σ_1 = σ_2 = ... = σ, the factor ησ^(D−2) is recovered.
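
The σ^(D−2) dependence can be illustrated numerically for the gaussian case. The following Python sketch is a minimal check under stated assumptions: independent Poisson neurons with gaussian tuning (Eq. 13) whose centers lie on a regular grid, with the grid spacing, peak rate and truncation at 8σ chosen arbitrarily. It sums the Fisher information about a single coordinate x_1, which carries the same width dependence, and reports how the sum changes when σ is doubled.

```python
import math
from itertools import product

def gaussian_population_fisher(dim, sigma, spacing=0.5, peak=10.0, tau=1.0):
    """Fisher information about x_1 at the origin from Poisson neurons on a regular grid."""
    half = int(math.ceil(8.0 * sigma / spacing))          # cover +/- 8 sigma around the origin
    axis = [k * spacing for k in range(-half, half + 1)]
    total = 0.0
    for center in product(axis, repeat=dim):
        r2 = sum(c * c for c in center)
        rate = peak * math.exp(-r2 / (2.0 * sigma ** 2))  # gaussian tuning evaluated at x = 0
        slope = rate * center[0] / sigma ** 2             # d(rate)/d(x_1) at x = 0
        total += tau * slope * slope / rate               # Poisson Fisher information of one neuron
    return total

def width_ratio(dim):
    """Ratio J(sigma = 2) / J(sigma = 1) at fixed neuron density."""
    return gaussian_population_fisher(dim, 2.0) / gaussian_population_fisher(dim, 1.0)

ratios = {dim: width_ratio(dim) for dim in (1, 2, 3)}
for dim, ratio in ratios.items():
    print("D =", dim, " J(2 sigma) / J(sigma) =", ratio, " predicted:", 2.0 ** (dim - 2))
```

Under these assumptions, doubling the width should roughly halve the information for D = 1, leave it unchanged for D = 2, and double it for D = 3, in line with the scaling factor.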
Origin of cosine tuning function
In the above, different shapes of the tuning functions are directly used without considering why a particular shape exists. There is a general argument for why the cosine tuning curves are so widespread in the sensory and motor systems (Zhang and Sejnowski, 1999b). A cosine tuning function implies a dot product between a fixed preferred direction and the actual movement direction (Georgopoulos et al., 1986; Schwartz et al., 1988). This suggests a linear relation with the movement direction, although the actual coding may have various forms (Mussa-Ivaldi, 1988; Sanger, 1994). The general argument below exploits the inherently low dimensionality of natural movements and shows that neuronal activity tuned to movement often obeys simple generic rules as a first approximation, insensitive to the exact sensory or motor variables that are encoded and the exact computational interpretation (Zhang and Sejnowski, 1999b).

Consider the reaching problem and assume that the mean firing rate of a neuron relative to its baseline is proportional to the time derivative of an arbitrary unknown function of hand position in space during stereotyped reaching movement. As a linear approximation, the firing rate can always be written as

$$f = f_{0} + p v \cos\alpha, \tag{24}$$

where f_0 is the baseline rate, p is a constant, v is the instantaneous reaching speed of the hand, and α is the angle between the preferred direction and the instantaneous reaching direction. A similar linear approximation applies to the visual system. In response to a three-dimensional object moving at translational speed v and angular speed ω, the mean firing rate of a motion-sensitive neuron should be given by

$$f = f_{0} + p v \cos\alpha + q \omega \cos\beta, \tag{25}$$

where f_0 is the baseline rate, p and q are constants, α is the angle between the instantaneous translational velocity and a fixed preferred direction, and β is the angle between the instantaneous angular velocity and a fixed preferred rotation direction. The assumption here is that the firing rate is proportional to the time derivative of an arbitrary function of the position and orientation of the object in three-dimensional space (Zhang and Sejnowski, 1999b). Thus, given a particular view of a particular object, the response above baseline is predicted to be the sum of two components, one translational and one rotational, each with cosine tuning and multiplicative modulation by speed and angular speed, respectively.

In the motor system, broad cosine-like tuning curves have been observed in many brain areas, including the primary motor cortex, premotor cortex, parietal cortex, cerebellum, basal ganglia, and somatosensory cortex. The tuning rule in Eq. 24 can account for various experimental facts besides the cosine shape, including multiplicative speed modulation, trajectory reconstruction by the population vector, curvature power law, elbow position, and the linear relation between the spike count and reaching distance. In the visual system, the predicted tuning rule in Eq. 25 has not been tested with realistic moving three-dimensional objects. A partial confirmation of a special case of this tuning rule is provided by neurons selective to wide-field spiral visual motion with broad cosine tuning in monkey medial superior temporal area (MST), the ventral intraparietal area (VIP), and the parietal area 7a, because such motions may be generated plausibly by a large moving planar object facing the observer. A population of such neurons tuned to three-dimensional object movement might be useful for updating static-view representations in the ventral visual stream (Zhang et al., 1998).
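
The step from "time derivative of an arbitrary function of hand position" to the cosine rule of Eq. 24 is the chain rule: the derivative along a movement equals the gradient of the function dotted with the velocity, that is, gradient magnitude times speed times the cosine of the angle between them. The Python sketch below illustrates this numerically in two dimensions. The function `potential`, the starting position and the speed are arbitrary, hypothetical choices made only for the demonstration.

```python
import math

def potential(x, y):
    """An arbitrary smooth function of hand position (purely illustrative)."""
    return math.sin(0.8 * x + 0.3) + 0.5 * x * y + math.exp(0.4 * y)

def rate_above_baseline(direction, speed, x0=0.2, y0=-0.1, dt=1e-6):
    """Time derivative of potential() along a straight movement, by central difference."""
    dx = speed * math.cos(direction) * dt
    dy = speed * math.sin(direction) * dt
    return (potential(x0 + dx, y0 + dy) - potential(x0 - dx, y0 - dy)) / (2.0 * dt)

def gradient(x0=0.2, y0=-0.1, h=1e-6):
    """Numerical gradient of potential() at the starting position."""
    gx = (potential(x0 + h, y0) - potential(x0 - h, y0)) / (2.0 * h)
    gy = (potential(x0, y0 + h) - potential(x0, y0 - h)) / (2.0 * h)
    return gx, gy

gx, gy = gradient()
p = math.hypot(gx, gy)            # gain p in Eq. 24
preferred = math.atan2(gy, gx)    # preferred direction of this model neuron
speed = 0.3                       # reaching speed v (arbitrary units)

directions = [2.0 * math.pi * k / 36 for k in range(36)]
worst = max(
    abs(rate_above_baseline(a, speed) - p * speed * math.cos(a - preferred))
    for a in directions
)
print("preferred direction (rad):", preferred)
print("largest deviation from p * v * cos(alpha):", worst)
```

The deviation should be at the level of finite-difference error, whatever smooth function is substituted for `potential`; only the gain p and the preferred direction change.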
Optimal decoding by Bayesian method
Various methods have been used to 'decode' or read out information from the spike trains of a population of neurons. Within a probabilistic framework, the theoretically optimal methods are based on the Bayes rule (Földiák, 1993; Sanger, 1996; Zhang et al., 1998; Brown et al., 1998).
A popular reconstruction method is called the population vector scheme, first used in motor cortex, where the average firing rate of a given neuron is maximal when the arm movement is in a particular direction, known as the preferred direction for that neuron (Georgopoulos et al., 1986). The population vector method estimates the direction of arm movement by summing the preferred direction vectors weighted by the firing rate of each neuron. A more general approach to reconstruction is to allow the neurons to represent more general basis functions of the physical variables (Girosi and Poggio, 1990; Pouget and Sejnowski, 1997). Each neuron contributes a basis function in this space of variables whenever it fires, and the best estimate of the physical variables can be computed from the sum of these functions weighted by the number of spikes occurring in each neuron during a time interval.

An alternative method for decoding a population code is based on Bayesian reconstruction or maximum likelihood estimation. These optimal probabilistic methods take into account prior probabilities and attempt to reconstruct the entire probability distribution. Instead of adding together the kernels, as in the basis function method, the probabilistic approach multiplies them, assuming that the spikes in each neuron are independent.

An example of how the timing of spikes in a population of neurons can be used to reconstruct a physical variable is the reconstruction of the location of a rat in its environment from the place cells in the hippocampus of the rat. The place cells are silent most of the time, and they fire maximally only when the animal's head is within a restricted region in the environment called its place field (Muller et al., 1987; Wilson and McNaughton, 1993). The reconstruction problem is to determine the rat's position based on the spike firing times of a few dozen simultaneously recorded place cells.
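
The population vector scheme described above can be sketched in a few lines of Python. This is a minimal illustration, not the procedure of any cited study: it assumes cosine-tuned neurons (as in Eq. 7) with evenly spaced preferred directions, Poisson spike counts drawn with a simple textbook sampler, and hypothetical parameter values.

```python
import math
import random

def poisson_sample(mean, rng):
    """Draw a Poisson count with Knuth's multiplication method (adequate for small means)."""
    limit = math.exp(-mean)
    count, running = 0, rng.random()
    while running > limit:
        count += 1
        running *= rng.random()
    return count

def population_vector(counts, preferred):
    """Sum preferred-direction unit vectors weighted by spike counts; return the angle."""
    x = sum(n * math.cos(p) for n, p in zip(counts, preferred))
    y = sum(n * math.sin(p) for n, p in zip(counts, preferred))
    return math.atan2(y, x)

# Illustrative (assumed) population: 64 cosine-tuned neurons, 0.5 s counting window.
n_neurons, a, b, tau = 64, 20.0, 30.0, 0.5
true_direction = 1.0
preferred = [2.0 * math.pi * i / n_neurons for i in range(n_neurons)]
mean_counts = [tau * (a * math.cos(true_direction - p) + b) for p in preferred]

rng = random.Random(0)            # fixed seed so the run is repeatable
spike_counts = [poisson_sample(m, rng) for m in mean_counts]

print("true direction      :", true_direction)
print("estimate, mean rates:", population_vector(mean_counts, preferred))
print("estimate, one trial :", population_vector(spike_counts, preferred))
```

With evenly spaced preferred directions the baseline term cancels in the vector sum, so the estimate from the mean rates recovers the true direction, while a single noisy trial scatters around it.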
Examples of the Bayesian method and the direct basis function method applied to the place cell data are shown in Fig. 2. The Bayesian reconstruction method works as follows. Assume that a population of N neurons encodes several variables $(x_1, x_2, \ldots)$, which will be written as vector $x$. Here $x$ is the position of the animal in the maze. From the number of spikes $n = (n_1, n_2, \ldots, n_N)$ fired by the N neurons within a time interval $\tau$, we want to estimate the value of $x$ using the Bayes rule for conditional probability:

$$P(x \mid n) = \frac{P(n \mid x)\,P(x)}{P(n)}, \tag{26}$$

assuming independent Poisson spike statistics of different neurons. The final Bayesian estimate is:

$$P(x \mid n) = k\,P(x) \left( \prod_{i=1}^{N} f_i(x)^{n_i} \right) \exp\left( -\tau \sum_{i=1}^{N} f_i(x) \right), \tag{27}$$

where $k$ is a normalization constant, $P(x)$ is the prior probability for the animal to visit position $x$, which can be estimated from the data, and $f_i(x)$ is the empirical tuning function, i.e. the average firing rate of neuron $i$ for each position $x$. Examples of the probability $P(x \mid n)$ for the probable position of the rat are shown in Fig. 2. The most probable value of $x$ can thus be obtained by finding the $x$ that maximizes $P(x \mid n)$, namely,

$$\hat{x} = \arg\max_{x} P(x \mid n). \tag{28}$$

By sliding the time window forward, the entire time course of $x$ can be reconstructed from the time-varying activity of the neural population.

Fig. 2. Predicting the position of a freely moving rat from the spike trains of 25 simultaneously recorded hippocampal place cells. Snapshots of the reconstructed distribution density for the position of the rat are compared with its true position, which occupied a single pixel on the 64 × 64 grid, corresponding to 111 × 111 cm in real space. The probabilistic method or Bayesian method often yielded a sharp distribution with a single peak by combining the place fields multiplicatively, whereas the direct basis method typically led to a broader distribution with multiple peaks by combining the place fields additively. Taken from Zhang et al. (1998) with permission. [Panels: columns at Time = 0, 1 and 2 sec; rows show True Position, Probabilistic Method and Direct Basis Method.]
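Eqs. 27 and 28 can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the implementation used by Zhang et al. (1998): positions are discretized into five bins on a hypothetical linear track, and the tuning values, prior, and spike counts are invented toy numbers. The computation is done in log space to avoid underflow, which is a design choice of this sketch.

```python
import math

def bayes_decode(spike_counts, tuning, prior, tau):
    """Posterior over discretized positions under independent Poisson spiking:
    P(x|n) = k * P(x) * prod_i f_i(x)**n_i * exp(-tau * sum_i f_i(x))."""
    n_pos = len(prior)
    log_post = []
    for x in range(n_pos):
        lp = math.log(prior[x])
        for n_i, f_i in zip(spike_counts, tuning):
            lp += n_i * math.log(f_i[x]) - tau * f_i[x]
        log_post.append(lp)
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(log_post)
    unnorm = [math.exp(lp - peak) for lp in log_post]
    k = 1.0 / sum(unnorm)                      # normalization constant
    posterior = [k * u for u in unnorm]
    x_hat = max(range(n_pos), key=lambda x: posterior[x])   # Eq. 28
    return posterior, x_hat

# Toy tuning functions f_i(x) in spikes/s for 3 cells over 5 position bins (assumed).
tuning = [
    [20.0, 5.0, 1.0, 0.5, 0.5],
    [1.0, 5.0, 20.0, 5.0, 1.0],
    [0.5, 0.5, 1.0, 5.0, 20.0],
]
prior = [0.2] * 5            # uniform prior over positions (assumed)
counts = [0, 18, 1]          # spikes observed in a window of tau = 1 s (assumed)
posterior, x_hat = bayes_decode(counts, tuning, prior, tau=1.0)
```

Tuning values must be strictly positive here because the logarithm of a zero rate is undefined; a small floor on the empirical tuning function is one common way to handle silent bins.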
A comparison of different reconstruction methods for this problem shows that the Bayesian reconstruction method was the most accurate (Zhang et al., 1998). As the number of neurons included in the reconstruction is increased, the accuracy of all the methods increased. The best mean error using 25-30 simultaneously recorded cells was about 5 cm, in the range of the intrinsic error of the infrared position tracking system. Alternative formulas were derived from Bayes rule assuming a Gaussian model of place field and updating the estimated position only when a spike occurs (Brown et al., 1998), although their accuracy seemed to be slightly lower than that of the Bayesian method above (Chan et al., 1999). There are thousands of place cells in the hippocampus of a rat that respond in any given environment. However, it is not known how this information is used by the rat in solving spatial and memory problems.
Synaptic learning for Bayesian decoding
Although these reconstruction methods may be useful for telling us what information could be available in a population of neurons, they do not tell us what information is actually used by the brain. In particular, it is not clear whether these reconstruction methods could be implemented with neurons. Pouget et al. (1998) show how maximum likelihood decoding can be performed using the highly recurrent architecture of cortical circuits, and thus demonstrate that the theoretical limit corresponding to the Fisher information is achievable. Zhang and Sejnowski (1999b) show how a feedforward network with one layer of weights could in principle read out a Bayesian code. Thus, optimal decoding is well within the capability of the network mechanisms known to exist in the cortex.
How a feedforward network can implement the Bayesian decoding method is shown in Fig. 3. The first layer contains N cells tuned arbitrarily to a variable of interest $x$, and the tuning function $f_i(x)$ of cell $i$ describes its mean firing rate. The cells in the second layer represent the value of the encoded variable $x$ (discretized if it is continuous) by their locations in the layer. Let function $\phi_i(x)$ be the synaptic connection weight from cell $i$ in the first layer to cell $x$ in the second layer. Given the numbers of spikes $n_1, n_2, \ldots, n_N$ from the N cells in the first layer within a unit time interval, the distribution of activity in the second layer computes the sum

$$\sum_{i=1}^{N} n_i\,\phi_i(x). \tag{29}$$

Fig. 3. The Bayesian decoding method can be implemented by a feedforward network whose synaptic strength can be learned by a Hebbian learning rule that is proportional to the presynaptic firing rate on a log scale, suggesting that the biological system might possibly be able to learn to use the theoretically optimal Bayes rule for reading out information contained in a neuronal population.
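The weighted sum in Eq. 29 can be sketched as a single-layer readout. The sketch below is illustrative only: the weights are set to the logarithm of toy tuning functions and the bias to $\log P(x) - \tau \sum_i f_i(x)$, which is what taking the logarithm of Eq. 27 suggests; the function names, tuning values, prior, and spike counts are all hypothetical.

```python
import math

def layer2_activity(spike_counts, weights, bias):
    """Activity of each layer-2 cell x: bias(x) + sum_i n_i * phi_i(x)."""
    n_units = len(bias)
    return [bias[x] + sum(n_i * w_i[x] for n_i, w_i in zip(spike_counts, weights))
            for x in range(n_units)]

def winner_take_all(activity):
    """Index of the most active layer-2 cell."""
    return max(range(len(activity)), key=lambda x: activity[x])

# Toy tuning functions for 3 layer-1 cells over 5 layer-2 cells (assumed values).
tuning = [
    [20.0, 5.0, 1.0, 0.5, 0.5],
    [1.0, 5.0, 20.0, 5.0, 1.0],
    [0.5, 0.5, 1.0, 5.0, 20.0],
]
prior = [0.2] * 5
tau = 1.0

# Weights proportional to log tuning; bias does not depend on the spike counts.
weights = [[math.log(f) for f in f_i] for f_i in tuning]
bias = [math.log(prior[x]) - tau * sum(f_i[x] for f_i in tuning) for x in range(5)]

counts = [0, 18, 1]
activity = layer2_activity(counts, weights, bias)
decoded = winner_take_all(activity)
```

Because the activity equals the log posterior up to an additive constant, the winner-take-all choice coincides with the maximum of the Bayesian posterior for the same inputs.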
To reconstruct the true value of $x$, the cell with the highest activity in the second layer should be chosen with a winner-take-all circuit. Different reconstruction methods correspond to different basis functions $\phi_i(x)$, as shown in more detail in Zhang et al. (1998). A constant bias term $A(x)$ that is independent of the activity may be added to implement the Bayesian rule in Eq. 27. We only need the basis function:

$$\phi_i(x) \propto \log f_i(x), \tag{30}$$

because by taking the logarithm of Eq. 27, the only term depending on the spikes $n_i$ is the sum $\sum_{i=1}^{N} n_i \log f_i(x)$. Thus, the synaptic weight should be proportional to the logarithm of the tuning function. Here we show that a simple Hebb rule is sufficient to establish the weights needed for the Bayesian method. We propose that the synaptic weight should change according to:

$$\Delta W \propto \mathrm{Post} \times \log(\mathrm{Pre}), \tag{31}$$
where for the presynaptic cell, Pre = firing rate, and for the postsynaptic cell, Post may be taken as a binary variable. During training, the activation at the second layer is confined to a single unit corresponding to the true value of the encoded variable, which should vary, sampling all possible values uniformly. It can be shown that the final outcome of this learning is the synaptic weight patterns that are proportional to the average firing rate of the presynaptic cell in log units. That is, the requirement in Eq. 30 is satisfied after the training. Thus, the Bayesian decoding method considered in the preceding section can be implemented by a feedforward network and the synaptic weights can be learned with a synaptic weight plasticity proportional to presynaptic firing rate at logarithm scale. The bias or prior in the Bayesian method should be implemented as a tonic input independent of the activation of the cells in the first layer, so that when the context changes, the tonic input should switch accordingly. The learning rule shows how to pool a bank of existing feature detectors with Poisson spikes to quickly develop new detectors at the next stage that are optimal. In particular, this may be used to develop detectors for complex patterns, such as a 'grandmother cell'. Overall it does not seem to take much to implement the Bayesian formula to achieve optimal performance in the special cases considered above, even though an explicit readout of a population code may not be needed until the final common pathway of the motor system since projections between cortical areas may simply perform transformations between different population codes.
References
Abbott, L.F. and Dayan, P. (1999) The effect of correlated variability on the accuracy of a population code. Neural Comput., 11: 91-101.
Baldi, P. and Heiligenberg, W. (1988) How sensory maps could enhance resolution through ordered arrangements of broadly tuned receivers. Biol. Cybernet., 59: 313-318.
Brown, E.N., Frank, L.M., Tang, D., Quirk, M.C. and Wilson, M.A. (1998) A statistical paradigm for neural spike train decoding applied to position prediction from ensemble firing patterns of rat hippocampal place cells. J. Neurosci., 18: 7411-7425.
Brunel, N. and Nadal, J.-P. (1998) Mutual information, Fisher information, and population coding. Neural Comput., 10: 1731-1757.
Chan, K.-L., Zhang, K.-C., Knierim, J.J., McNaughton, B.L. and Sejnowski, T.J. (1999) Comparison of different methods for position reconstruction from hippocampal place cell recordings. Soc. Neurosci. Abstr., 25: 2166.
Cover, T.M. and Thomas, J.A. (1991) Elements of Information Theory. Wiley, New York.
Eurich, C.W. and Wilke, S.D. (2000) Multi-dimensional encoding strategy of spiking neurons. Neural Comput., 12: 1519-1529.
Földiák, P. (1993) The 'ideal homunculus': Statistical inference from neural population responses. In: F. Eeckman and J. Bower (Eds.), Computation and Neural Systems 1992. Kluwer Academic, Norwell, MA.
Frieden, B.R. (1998) Physics from Fisher Information: A Unification. Cambridge University Press, Cambridge.
Georgopoulos, A.P., Schwartz, A.B. and Kettner, R.E. (1986) Neuronal population coding of movement direction. Science, 233: 1416-1419.
Girosi, F. and Poggio, T. (1990) Networks and the best approximation property. Biol. Cybernet., 63: 169-176.
Hinton, G.E., McClelland, J.L. and Rumelhart, D.E. (1986) Distributed representations. In: D.E. Rumelhart and J.L. McClelland (Eds.), Parallel Distributed Processing, Vol. 1. MIT Press, Cambridge, MA, pp. 77-109.
Lehky, S.R. and Sejnowski, T.J. (1990) Neural model of stereoacuity and depth interpolation based on a distributed representation of stereo disparity. J. Neurosci., 10: 2281-2299.
Muller, R.U., Kubie, J.L. and Ranck Jr., J.B. (1987) Spatial firing patterns of hippocampal complex-spike cells in a fixed environment. J. Neurosci., 7: 1935-1950.
Mussa-Ivaldi, F.A. (1988) Do neurons in the motor cortex encode movement direction? An alternative hypothesis. Neurosci. Lett., 91: 106-111.
Paradiso, M.A. (1988) A theory for the use of visual orientation information which exploits the columnar structure of striate cortex. Biol. Cybernet., 58: 35-49.
Pouget, A. and Sejnowski, T.J. (1997) Spatial transformations in the parietal cortex using basis functions. J. Cognit. Neurosci., 9: 222-237.
Pouget, A., Zhang, K.-C., Deneve, S. and Latham, P.E. (1998) Statistically efficient estimation using population code. Neural Comput., 10: 373-401.
Rieke, F., Warland, D., de Ruyter van Steveninck, R. and Bialek, W. (1997) Spikes: Exploring the Neural Code. MIT Press, Cambridge, MA.
Sanger, T.D. (1994) Theoretical considerations for the analysis of population coding in motor cortex. Neural Comput., 6: 12-21.
Sanger, T.D. (1996) Probability density estimation for the interpretation of neural population codes. J. Neurophysiol., 76: 2790-2793.
Schwartz, A.B., Kettner, R.E. and Georgopoulos, A.P. (1988) Primate motor cortex and free arm movements to visual targets in three-dimensional space. I. Relations between single cell discharge and direction of movement. J. Neurosci., 8: 2913-2927.
Seung, H.S. and Sompolinsky, H. (1993) Simple models for reading neuronal population codes. Proc. Natl. Acad. Sci. USA, 90: 10749-10753.
Snippe, H.P. (1996) Parameter extraction from population codes: a critical assessment. Neural Comput., 8: 511-529.
Snippe, H.P. and Koenderink, J.J. (1992) Discrimination thresholds for channel-coded systems. Biol. Cybernet., 66: 543-551.
Wilson, M.A. and McNaughton, B.L. (1993) Dynamics of the hippocampal ensemble code for space. Science, 261: 1055-1058. (Corrections in Vol. 264, p. 16).
Wu, S., Nakahara, H., Murata, N. and Amari, S. (2000) Population decoding based on an unfaithful model. In: S.A. Solla, T.K. Leen and K.-R. Müller (Eds.), Advances in Neural Information Processing Systems, Vol. 12. MIT Press, Cambridge, MA, pp. 192-198.
Yoon, H. and Sompolinsky, H. (1999) The effect of correlations on the Fisher information of population codes. In: M.S. Kearns, S.A. Solla and D.A. Cohn (Eds.), Advances in Neural Information Processing Systems, Vol. 11. MIT Press, Cambridge, MA, pp. 167-173.
Zhang, K.-C. and Sejnowski, T.J. (1999a) Neuronal tuning: to sharpen or broaden? Neural Comput., 11: 75-84.
Zhang, K.-C. and Sejnowski, T.J. (1999b) A theory of geometric constraints on neural activity for natural three-dimensional movement. J. Neurosci., 19: 3122-3145.
Zhang, K.-C., Ginzburg, I., McNaughton, B.L. and Sejnowski, T.J. (1998) Interpreting neuronal population activity by reconstruction: unified framework with application to hippocampal place cells. J. Neurophysiol., 79: 1017-1044.
Zohary, E. (1992) Population coding of visual stimuli by cortical neurons tuned to more than one dimension. Biol. Cybernet., 66: 265-272.
Zohary, E., Shadlen, M.N. and Newsome, W.T. (1994) Correlated neuronal discharge rate and its implications for psychophysical performance. Nature, 370: 140-143.
M.A.L. Nicolelis (Ed.)
Progress in Brain Research, Vol. 130
© 2001 Elsevier Science B.V. All rights reserved
CHAPTER 22
What ensemble recordings reveal about functional hippocampal cell encoding
Robert E. Hampson *, John D. Simeral and Sam A. Deadwyler
Department of Physiology and Pharmacology, Wake Forest University School of Medicine, Winston-Salem, NC 27157-1083, USA
Introduction
One of the key advantages of recording from neuronal ensembles in the hippocampus, or any brain area, is the potential for deciphering behavioral and cognitive correlates. Single neuron recordings have been made in all major subfields in the hippocampus, and cell identification via firing signature is not a problem in most cases (Fox and Ranck Jr., 1981; Christian and Deadwyler, 1986; Freund and Buzsaki, 1996). However, with the exception of place field firing, recordings of single hippocampal neurons reveal encoding of only 'pieces' of the information puzzle required to be encoded during ongoing behavioral performance (Hampson and Deadwyler, 1996a). In order to extract the complete 'code' it is therefore necessary to record from ensembles of hippocampal neurons and employ population or multivariate analyses to extract such codes (Hampson and Deadwyler, 1998, 1999).
There are several means of constructing ensembles of neurons: either from combined single neuron recordings ('serial' reconstruction) or from simultaneous recordings of multiple neurons ('parallel' recording). 'Serial' reconstruction of ensembles by combining single neuron recordings has previously been used to relate behavioral correlates to ensemble activity (Miller et al., 1993; Schultz et al., 1993; Gochin et al., 1994). However, such techniques applied to ensembles of hippocampal neurons have failed to reveal any more encoding than simultaneously recorded neurons (Hampson and Deadwyler, 1996a). One reason for this result is that serially recorded data increase uncorrelated noise, and it may not be possible to isolate some of the ensemble information code from this elevated noise level. Another reason may be that serial recording of single neurons does not account for 'local circuits' within the hippocampal anatomy which may influence the activity of neurons recorded in different locations. In other words, the activity of one neuron in a circuit influences the activity of other neurons within the circuit; however, when only a single neuron is recorded, this covariance cannot be determined. Despite differences in the neural 'codes' determined using serial vs. parallel recording techniques, the single neuron and ensemble codes are likely mutually dependent. Once a code is identified within the ensemble, it should be possible to determine how single neurons contribute to the ensemble pattern. Identification of robust single neuron encoding may only be possible within the context of the ensemble code, thus revealing codes that would be missed by serial recording of single neurons. Multineuron recordings using a recording electrode 'array' (Deadwyler et al., 1996), with electrode tips positioned along specific anatomic projections, can be used to simultaneously record ensembles of neurons within hippocampus. Once a population of neurons is recorded, ensemble analyses can proceed using multiple cross-correlations, population vector analysis, or linear discriminant analyses. Given these factors, it should be possible to determine how single neuron activity within hippocampal circuits is integrated within ensemble codes for behavioral and cognitive events.
* Corresponding author: Robert E. Hampson, Department of Physiology and Pharmacology, Wake Forest University School of Medicine, Winston-Salem, NC 27157-1083, USA. Tel.: +1-336-716-854t; Fax: +1-336-716-8501; E-mail: [email protected]
Identifying information encoded by neural ensembles
As stated in a previous report (Hampson and Deadwyler, 1999), cross-correlation analyses are useful in examining functional connectivity between neurons. In an ensemble recorded from a patterned array of electrodes, this may be useful in defining neural circuits across the hippocampus. However, since cross-correlations can only be practically calculated for pairs of neurons, this technique is not suited to extracting simultaneous firing across ensembles of neurons. In addition, recent reports by Gerstein (Bedenbaugh and Gerstein, 1997) have shown that incomplete separation of multiple neuron recordings from single electrodes results in contamination of cross-correlations between neurons recorded from different electrodes. Alternatively, population vector analyses can take advantage of correlated firing between neurons to extract patterns of firing across the whole ensemble. Vector analyses have been successfully applied by Georgopoulos, Schwartz and others (Georgopoulos et al., 1989; Moran and Schwartz, 1999a,b; Georgopoulos, 1995; Pellizzer et al., 1995) to extract ensemble codes for direction and velocity of motor movements. Since the population vector treats the instantaneous firing rate of all neurons in the ensemble as a single vector in multidimensional space, different behavioral 'states' can also be represented by unique population vectors (i.e. vectors pointing in different 'directions') (Chapin et al., 1989).
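The idea that behavioral 'states' correspond to population vectors pointing in different 'directions' can be made concrete with a small sketch. It assumes, purely for illustration, that each state is summarized by a template vector of mean firing rates and that an observed rate vector is assigned to the template with the largest cosine similarity; the state labels and rate values are hypothetical and do not come from the cited recordings.

```python
import math

def cosine(u, v):
    """Cosine of the angle between two firing-rate vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def nearest_state(rate_vector, templates):
    """Return the label of the template pointing most nearly in the same direction."""
    return max(templates, key=lambda name: cosine(rate_vector, templates[name]))

# Hypothetical mean firing rates (spikes/s) of four neurons in two behavioral states.
templates = {
    "sample": [12.0, 3.0, 1.0, 8.0],
    "nonmatch": [2.0, 11.0, 9.0, 1.0],
}
observed = [10.0, 4.0, 2.0, 6.0]      # instantaneous ensemble rates (assumed)
label = nearest_state(observed, templates)
```

Cosine similarity ignores overall firing level and compares only direction, which matches the geometric description in the text; other distance measures would be equally reasonable choices.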
One weakness of both correlational and vector analyses is that they do not provide a means of identifying codes for multiple pieces of information that are represented within the ensemble at the same time. Identification and separation of multiple overlapping codes requires a multivariate analysis, such as linear or canonical discriminant analysis (CDA). This type of analysis subdivides the sources of variance within the ensemble firing into components which can be used to discriminate the behavioral events encoded by the ensemble activity (Deadwyler et al., 1996; Hampson and Deadwyler, 1998). The CDA used extensively in our laboratory is a form of linear discriminant analysis which has allowed the identification of variance sources corresponding to three orthogonal dimensions of a delayed-nonmatch-to-sample (DNMS) behavioral task (task phase, response position and trial performance; see Deadwyler et al., 1996; Hampson and Deadwyler, 1996a). The analysis revealed that the ensemble activity simultaneously encoded different types of information that were critical to performance of the task (Hampson and Deadwyler, 1996b; Deadwyler and Hampson, 1997; Hampson et al., 1999). Although the CDA operated on fixed temporal intervals corresponding to behavioral events within the trial, later analyses revealed that the task-specific information appeared at other times during the task as well (Hampson and Deadwyler, 1999).
Where does ensemble information come from?
Was the above successful determination of an ensemble code for behavior due in large part to the simultaneous nature of the ensemble recordings? In other words, was the code broadly distributed across neurons, or did each neuron encode a portion of the code with multiple neurons adding up to a complete code? Although these two questions appear similar, they in fact imply two different underlying processes. If the ensemble code is broadly distributed across neurons, then examination of each neuron's firing would reveal only a small fraction of firing related to behavioral events. Only when a large number of such small fluctuations occur synchronously would a coherent pattern (i.e. 'code') be detected. On the other hand, if each neuron encoded a distinct portion of the complete code, it would be possible to detect that for a single cell (even if encoding were incomplete).
The synchronized successive firing of all neurons, each encoding different aspects of each behavioral event, would be revealed across the ensemble activity as a complete code for the event. The above scenario leads to two possible ways of deciphering the means by which information is encoded by neural ensembles: a top-down model implies that information is distributed across neurons, while a bottom-up model suggests that neurons each encode a portion of the ensemble code. Unfortunately, if the information within neural ensembles is both distributed and locally encoded, it may be difficult to identify the specific aspects of single neuron firing that critically correspond to the encoded event. This is the case with ensembles of hippocampal neurons, where prior investigations have reported both local (Hampson et al., 1993; Heyser et al., 1993) as well as distributed encoding (Deadwyler et al., 1996; Hampson and Deadwyler, 1996a, 1999; Deadwyler and Hampson, 1997). One means of bridging this problem is to first extract variance components from the entire ensemble firing pattern, then identify specific neurons that contribute to those variance components. In this manner, both local and distributed encoding of information is addressed within the same ensemble using statistical constraints that avoid over-interpretation of weak firing correlates.
A recent refinement of the CDA has been the implementation of a 'sliding analysis window', which allows examination of ensemble encoding for all temporal segments throughout all phases of the task (Hampson and Deadwyler, 1998, 1999). The CDA is iteratively recalculated throughout segments of prescribed duration by sliding the analysis 'window' of discriminant function calculations across successive temporal segments. By using the same coefficients of the discriminant functions derived from key events within the trial (Fig. 1), the state of extracted variance sources can be determined throughout the trial. Data were collected from 23 animals using the electrode array system (Fig. 1A) during performance of a DNMS task. Each animal (male Long-Evans rat) performed the task by pressing a left or right presented lever in the sample phase, nosepoking into a lighted device on the opposite wall for the duration of a random 1-40 s Delay phase, then pressing the opposite lever during the nonmatch phase (Hampson et al., 1993; Deadwyler et al., 1996). The CDA was applied successively to the intertrial interval (ITI), sample response (SR), last nosepoke (LNP) in the delay interval, and nonmatch response (NR) events (Fig. 1B). Five sets of canonical discriminant functions were derived, with the first three corresponding to task phase and type of responses and the fourth and fifth corresponding to levers pressed in the sample and nonmatch phases (cf. Deadwyler et al., 1996). By reassessing discriminant functions derived from one aspect of the task at all other times within the trial, patterns of ensemble firing corresponding to these factors predictably varied in accordance with the behavioral contingencies of the task (Hampson and Deadwyler, 1999).

Fig. 1. Ensemble recording and analysis during delayed-nonmatch-to-sample (DNMS) task. (A) Example of an electrode array for hippocampal recording. Array consisted of two rows of eight microwires (28-56 µm diameter) at 200-µm intervals oriented to traverse the longitudinal axis of hippocampus. Rows were fixed 800 µm apart, with 1200 µm difference in length to allow simultaneous placement in CA1 (depth approximately 2.8-3.2 mm) and CA3 (depth approximately 4.0-4.4 mm) (Deadwyler et al., 1996). (B) Ensemble histogram from a single animal showing averaged firing distribution of all cells (n = 16) during the entire DNMS trial. SP, sample presentation; SR, sample response; LNP, last nosepoke during delay; NR, nonmatch response. The canonical discriminant analysis (CDA) was calculated using ensemble firing within a 3-second interval around SR, LNP, NR, and a point 2 s prior to the start of the trial during the intertrial interval (ITI). The 'sliding' CDA consisted of applying the discriminant functions to 3-s intervals throughout the trial (cross-hatched and shaded band) to calculate discriminant scores for the complete trial. (C) Discriminant scores for three canonical discriminant functions (CAN1, CAN4, CAN5) obtained from the CDA (see text). For each function, discriminant scores were calculated at 3-s intervals through DNMS trials, sorted by trial, and averaged within each trial type. Filled circles represent CDA scores on left trials (initiated by a left sample response), while unfilled circles represent scores from right trials (initiated by a right sample response). Scores are shown for CAN1, which discriminated sample from nonmatch phase responses; CAN4, which discriminated left from right responses irrespective of phase; and CAN5, which discriminated all responses for a given trial type (irrespective of phase and actual response position).

The plot in Fig. 1C shows discriminant scores throughout the entire trial, for three of the discriminant functions: CAN1, which tracked task phase (38% of total variance), CAN4, which tracked lever response position (10% of variance), and CAN5, which tracked the type of trial (8% of variance). Each function shows firing changes during the trial which decreased across the delay interval and were increased during sample and nonmatch phases. Note that CAN1 and CAN4 showed 'crossover' from sample to nonmatch in accordance with either the different phase (CAN1), or the opposite lever response (CAN4) required by the DNMS task. There was no difference in CAN1 variance for left and right lever trials as determined by sample lever position. However, variances for CAN4 and CAN5 exhibited mirror image differences between the two trial types defined by the encoded sample lever. While CAN1 did not discriminate position, discriminant scores were quite significantly different (F(1,1213) = 11.2, P < 0.001) for sample vs. nonmatch events, but not for lever position (F(1,1213) = 1.9, P = 0.16). Scores for CAN4 were significantly different (F(1,1213) = 7.4, P < 0.01) for left vs. right trials, and as expected, were significantly different (F(1,1213) = 9.6, P < 0.001) for sample vs. nonmatch responses as required by the 'DNMS' contingency. Scores for CAN5 were significantly different (F(1,1213) = 9.1, P < 0.01) throughout all phases of left vs. right trials, including the delay, but were not significantly different (F(1,1213) = 2.7, P = 0.11) for sample vs. nonmatch responses within those same trials. This function (CAN5) therefore encoded trial type, but did not discriminate individual events within the trial (i.e. sample vs. nonmatch or left vs. right responses). Other derived CDA components behaved in a similarly predictable manner in other portions of the trial, primarily encoding lever pressing vs. nosepoking and within-trial vs. intertrial events. Employing the CDA in this manner, as a 'sliding window', provided not only profiles from which altered individual neuronal activity could be extracted, but also 'benchmarks' regarding the level of such activity as well.
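The 'sliding window' step can be sketched as follows: a fixed set of discriminant coefficients is applied to ensemble spike counts in successive 3-s windows to give one score per window. This is a simplified illustration, not the authors' analysis code. It assumes the coefficients have already been derived elsewhere, it omits the mean-centering and scaling a full CDA would include, and the two-neuron ensemble, spike times, and coefficient values are hypothetical.

```python
def bin_counts(spike_times, start, width):
    """Number of spikes falling in the window [start, start + width)."""
    return sum(1 for t in spike_times if start <= t < start + width)

def sliding_scores(ensemble_spikes, coefficients, t_start, t_end, width=3.0, step=3.0):
    """Apply one fixed discriminant function to successive windows of ensemble firing.

    ensemble_spikes: one list of spike times (s) per neuron.
    coefficients:    one discriminant coefficient per neuron (derived beforehand).
    Returns a list of (window_start, score) pairs.
    """
    scores = []
    t = t_start
    while t + width <= t_end:
        counts = [bin_counts(spikes, t, width) for spikes in ensemble_spikes]
        score = sum(c * n for c, n in zip(coefficients, counts))
        scores.append((t, score))
        t += step
    return scores

# Hypothetical two-neuron ensemble: neuron 0 fires early in the trial,
# neuron 1 fires late in the trial.
ensemble = [
    [0.5, 1.0, 1.5, 2.0],
    [9.2, 9.9, 10.5, 11.0, 11.5],
]
coefficients = [1.0, -1.0]    # assumed; opposite signs for the two neurons
scores = sliding_scores(ensemble, coefficients, t_start=0.0, t_end=12.0)
score_values = [s for _, s in scores]
```

In this toy case the score changes sign between the first and last windows, loosely analogous to the 'crossover' between sample and nonmatch phases described for CAN1.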
Using the canonical discriminant functions to indicate which neurons and temporal intervals were critical to encoding specific DNMS events, hippocampal cells could be categorized as firing differentially and significantly to the position of lever response (left or right), or phase of the task (sample or nonmatch), and were cias:sifiedas position- or phase-" ' cells. Perievent histograms were constructed around each behavioral event (within temporal domains identified by the canonical discriminant functions) and used to classify individual cell types. A cell was considered
to encode the event if its z-score ([Maximum Baseline] + SD of Baseline) exceeded 3.19 (equivalent to a significance of P > 0.001). Further, the cell was considered to differentially encode the event, if one or more peaks were at least 3.19 (z)larger than the remaining event peaks. Of the complete set of neurons (n = 263), only 20 could not be identified with differential event encoding that conformed to the above three categories. Examples of these 'functional cell types' (FCTs) are illustrated by the rastergrams in Fig. 2A.B. The same task phase information extracted by CAN1 was represented by the phase-only functional cell types. Fig. 3 illustrates an 'ensemble' constructed of six sample-only and six nonmatchonly neurons selected from the complete set of 80 such neurons. The firing of the same 12 neurons is shown in both Fig. 3A and B - - for left and right sample trials, respectively. The firing patterns are identical throughout the trial for sample-only cells (foreground) or nonmatch-only cells (background) (Fig. 3A,B). This is further illustrated by tlae non-
~.
~.
Contribution of different cell types to CDA patterns
Position
..~'P~
,~
.
.
o
llo|
•
'',
g
.. ~ ..º.~,~.aCiP'.' ~.' ' "
u i
,,"~?.i~:'; ' .',
, ..
~ ®|
.
| '°L m
o
"T..':.'./.' "i:~a
.eo | .
"'
C,
"I
" .'..
j
I
i I
'~ j •
. ., . . -
-1.0
i~ •
t
.oo
.I D II
~ ~
Osec
Sample
I
a!
, g 01~
. . . . ;'
•
~,,',° |I0 I -
+1,0
i
"
-
I
-
*
I /I '' u
-1,0
i I
.
F
.'
'I
'1
I
•
"r
i
•
• ,,,
,
"•''
I
I
Osec
Nonmatch
I,
-
~
+1,0
I
' ~ II .I
i
II I
I
I
I l n l l m JmI | , I
i, , ,,. .,'.'~r'
]
' .
,'1'I ,,. ',.~'" .
..
' , ',°
I
I
lJlCo, i ',,,.
.
" .
..,.~,~
"
.
""'," ....:' "'•
I
Trial Type I
Iii I
• I I | I
"
i
II |
I
|
....
I
II
i||e
I1~1 i I
II
', t ,,P';'¢.I.
"" II
''
iiii
•
|
° i gI
I
I hill
" ,. . . .
....
.,. r~ '~,:!~-~r,'d"~,'...' . , "
s Iit
i / ml
,'.,~
v., I
"
D.
.
e o m
L
II
'
I
,
Conjunctive
"
Phase
:'
IiIll
t
i
',: ,'
.,J ~'.,"~r':...'.:.
'
i.i
,
I I ~1
"
~llm ~.
,,I.,,
"
'
"
"
"
...... .
r,
,
,,~,,,~
•
-1.0
0
se
Sample
+1.0
-1.0
0 Sec
+1.0
Nonmatch
Fig. 2. Single,trial cell tiring records shown in 'raster' form (each row represents a single trial) for four different functional cell types (FCTs) of hippocampal cells. Each vertical tick mark represent a single action potential occurrence. Tick marks within a row represent all action potentials tired within the 3-s interval for a given trial. Ten trials are shown for each condition. Vertical line indicates lever response during Sampte or nonmatch phase on the left (L) or right (R) lever, thus for each FCT, four sets of rasters are depicted, one for each combination of phase and response position. (A) Position cell fires only to responses at the left (L) and not the right (R) lever. (B) Phase ceil fires during response in the nonmatch (not sample) phase irrespective of response position. (C) Example of a Conjunctive cell that fires only to the combination of left (L) response in the nonmatch phase. (D) Trial type cell fires in sample and nonmatch phases, but to one trial contiguration fi.e. right sample and left nonmatch. RS-LN) and not the other.
350
3.
A,
P h a s e Cell I
I
s
is
l
II
II
aI I I
I
|l
1!
| iiii
e ii
!
o
e I ii
II
,,
i
,,
| ,
I |
|
".'" , :;'
| |
, i~
|.°.
I el Bill l i t I IIIII I I • I • IllO • I I iI lil|l II | I I I III | I Illl I loll I II Rill
Osec
I
='n|,~om]='=
u,'
•
'
'• .... ~l"t'.,, "4,., ,,,..,-,,.l.;., '', ,,',,,
a
,o,~
a|''|°"
+i.O '
•
"
-1.0
Sample
D ,
I
I I I loll I I II I •
...."'t ' '';"
"
"-1.1)
| Ill I
/
"
!
"
0 sec +1.0 Nonmatch
CAN1
-
Trial Phase
B 2
1 m
0
o
-1
~;)
-2
.
u~
ITI
SP
SR
5
10
15
20
25
30
35 LNP NR ITI-
Trial Phase Fig. 3. Examples of phase-only cells. An 'ensemble' was constructed of six sample-only and six nonmatch-only neurons selected from 80 phase-only neurons. (A) Firing of the 12 neurons during a left trial (i.e. initiated with left sample response). Axis at left represents individual neurons, and separates the sample-only cells (right or foreground) from nonmatch-only cells (left or background). Axis at right depicts time throughout the DNMS trial, with the sample response (SR) at left and nonmatch response (NR) at right (see Fig. 1B for comparison). Note firing of sample-only and nonmatch-only cells during the appropriate phases. (B) Firing of the same 12 neurons during a right trial (i.e. initiated with right sample response). Axes and order of neurons is the same as m A. Note that there was no change in the pattern of neuron firing. (C) Rastergram las in Fig. 2) depicting firing of a single nonmatch-only cell. (D) 'Sliding' CDA discriminant scores for CAN1 (phase) throughout DNMS trial. Filled circles, CDA discriminant scores averaged over left trials: unfilled circles. CDA scores averaged over right trials. The CDA shows the same discrimination of phase (irrespective of position] as the phase-only cells, with the exception of the difference in sign between phases. This difference is likely contributed by the fact that there are two different subpopulations of sample-only and nonmatch-only cells within the phase-only cells.
match-only cell rastergram for a nonmatch-only cell type (Fig. 3C). Thus the trial-dependent reciprocal discriminant scores extracted by CAN1 (Fig. 3D) result in part from the activity of two distinct populations of phase-only firing cells, one of which encodes only sample, and the other only nonmatch. The position-only cell firing corresponded to the same information represented by CAN4 (response position), in a manner similar to that described above for CAN1 and phase-only cells. There were 33 cells identified with left-only or right-only firing. Fig. 4 illustrates the firing of six of each cell type in response to left or right trials (Fig. 4A,B). The left-only cells (foreground) fired only during the sample phase, while the right-only cells (background) fired only during the nonmatch phase. Fig. 4 also shows that some cells fired not only at the respective left or right responses, but also during the delay phase prior to an appropriate nonmatch response.

[Fig. 4 graphic: panel titles "Position Cell" (rasters for L and R responses, Sample and Nonmatch, -1.0 to +1.0 s around the response at 0 sec) and "CAN4 - Lever Position" (discriminant scores, RIGHT TRIAL trace labeled, plotted against trial phase: ITI, SP, SR, 5-35, LNP, NR, ITI).]

Fig. 4. Examples of position-only cells. An ensemble similar to that shown in Fig. 3 was constructed of six left-only and six right-only neurons selected from 33 position-only neurons. (A) Firing of the 12 neurons during a left trial. Axes are the same as in Fig. 3A, except that the six right-only cells have been placed to the right (foreground) of the neuron axis, and the six left-only cells were placed to the left (background) of the neuron axis. (B) Firing of the same 12 neurons during a right trial. Axes and order of neurons are the same as in A. Note that position-only cells fire in different phases on different trial types in accordance with the nonmatch demands of the DNMS task. (C) Rastergram (as in Fig. 2) depicting firing of a single left-only cell. The cell fired in both sample and nonmatch phases as long as the response was in the left (L) position. (D) 'Sliding' CDA discriminant scores for CAN4 (position) throughout the DNMS trial show the same discrimination of position. Note that again, the difference in sign (negative vs. positive) likely results from the two subpopulations of left-only and right-only cells.

For example, left-only cells increase firing during the delay preceding a left nonmatch, while right-only cells increase firing during the delay preceding a right nonmatch. The rastergram for a representative left-only cell in Fig. 4C confirms that the cell exhibited nearly identical firing in the sample or nonmatch phase if the response position was the same (left). Hence the reciprocity in CAN4 during the trial (Fig. 4D) also resulted from a combination of two subpopulations of position-only cells, one encoding left responses, and the other encoding right responses. There were considerably fewer position-only cells (n = 33) than phase-only cells (n = 80) identified in the complete data set, and this was reflected in the discriminant function for position (CAN4), which accounted for less ensemble firing variance (10%) than the discriminant function for phase (CAN1, 38%). Other cells could be further differentiated and classified with respect to conjunctions of the above two features (i.e. right sample, left sample, right nonmatch or left nonmatch), and these were called conjunctive cells (Table 1, Fig. 2C, see also Hampson and Deadwyler, 1999). A total of 101 conjunctive cells were identified, with nearly equal numbers of
TABLE 1
Functional cell types: frequencies of occurrence

Phase              No Position                 Left                    Right                   Totals
No Phase           No Correlate n=20 (7%)      Left-Only n=14 (5%)     Right-Only n=19 (7%)    53
Sample             Sample-Only n=38 (14%)      [not legible]           [not legible]           85
Nonmatch           Nonmatch-Only n=42 (16%)    [not legible]           [not legible]           94
Trial Type Cells                               Left Trial n=17 (6%)    Right Trial n=12 (5%)   29
Totals             100                         79                      84                      263

Conjunctive cell types (Sample and Nonmatch rows of the Left and Right columns) are shown in the shaded area.
cells corresponding to each of the four possible conjunctions (Table 1, gray shading). Each cell type fired only during the appropriate conjunction or combination of events; hence a left-sample cell would fire only during the sample response at the left lever, but not during any nonmatch or right sample responses. Each conjunctive cell thus encodes only one of the four possible responses that comprise a DNMS trial, and can thus contribute to the encoding of these events by the ensemble as extracted by both CAN1 and CAN4. Since equal numbers of cells were identified for each conjunction, the number of neurons that encode any events according to phase is increased from 80 to 181, and the number of neurons encoding position from 33 to 134, consistent with a higher percentage of total ensemble variance being accounted for by phase than by position factors. Yet another cell type was identified which could be classified as firing specifically to the two events that make up a trial type (i.e. right sample and left nonmatch), but not to the events that make up the other trial type (i.e. left sample and right nonmatch); these were designated trial-type cells (Fig. 2D). Only 29 trial-type cells were identified, whose firing was shown to correspond to the same information extracted as CAN5 (Fig. 2) across the DNMS trial. Fig. 5A and B depict six left-trial and six right-trial cells that fired throughout the respective DNMS trial type. The left-trial cells (foreground) fired only on trials initiated with a left sample response (Fig. 5A), and were essentially silent during trials initiated by a right sample response (Fig. 5B). Thus this functional cell type fired 'disjunctively' on opposite levers in different phases, but always to the same combination (trial) of phases and levers (Fig. 5C). The initial peak, abrupt decrease, and slow 'ramping' of firing during the delay prior to the nonmatch response (Fig. 5A,B) was similar to the behavior of the discriminant scores for CAN5 (Fig. 5D). This form of delay-phase firing did not occur in phase-only or conjunctive cells for left or right sample, or consistently in any other cell type. As with CAN1 and CAN4, the appropriate conjunctive cells may have contributed to the ensemble firing patterns represented by CAN5, but only the 29 trial-type cells unambiguously encoded a given trial type.
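The cell-count arithmetic in the preceding passage can be checked with a short sketch. It only restates the counts quoted in the text (80 phase-only, 33 position-only, 101 conjunctive, 29 trial-type, 263 cells in total); the variable names are illustrative and not part of the original analysis.

```python
# Cell counts quoted in the text (see also Table 2).
counts = {"phase_only": 80, "position_only": 33, "conjunctive": 101, "trial_type": 29}
total_cells = 263  # all cells analyzed, including the 20 with no correlate

# A conjunctive cell (e.g. left-sample) carries both a phase and a position
# component, so it is counted toward both features.
phase_encoding = counts["phase_only"] + counts["conjunctive"]
position_encoding = counts["position_only"] + counts["conjunctive"]

print(phase_encoding, position_encoding)  # 181 134

# Share of the whole population for each functional cell type.
for name, n in counts.items():
    print(f"{name}: {n} cells, {100 * n / total_cells:.1f}% of {total_cells}")
```

Counting each conjunctive cell toward both features is what raises the phase tally more in relative terms than it could ever raise the position tally above it, which is the asymmetry the text relates to CAN1 and CAN4.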
Functional hippocampal cell types in ensembles

The relative frequencies of functional cell types (FCTs) are provided in Table 1. Of the 263 cells analyzed in 21 animals, only 20 did not show a discrimination of some aspect of trial phase or response position, and were excluded from the analysis (Table 1, 'No Correlate'). Phase-only cells accounted for 30% of cells, while position-only cells accounted for only 12% of cells. The largest population of
[Fig. 5 graphic: panel titles "Trial-Type Cell" (rasters for Sample and Nonmatch, -1.0 to +1.0 s around the response at 0 sec) and "CAN5 - Trial Type" (discriminant scores for LEFT TRIAL and RIGHT TRIAL, plotted against trial phase: ITI, SP, SR, 5-35, LNP, NR, ITI).]
Fig. 5. Examples of trial-type cells. An ensemble of six left-trial and six right-trial cells (out of 29 trial-type neurons) was constructed as in Figs. 3 and 4. (A) Firing of the 12 neurons during a left trial. Axes are the same as in Fig. 4 except for placement of neurons. Right-trial neurons were placed to the right of the neuron axis, while left-trial neurons were placed to the left of the neuron axis. (B) Firing of the same neurons during a right trial. Note that the opposite trial-type cells are virtually silent during the inappropriate trial. (C) Rastergrams of a right-trial cell show firing at the right sample and left nonmatch responses appropriate to a right trial, but no firing at the responses (left sample, right nonmatch) appropriate to a left trial. (D) 'Sliding' CDA scores for CAN5 (trial type) show the same discrimination as the trial-type cells. Note that the scores are reduced immediately after the sample response (SR) but are 'ramped up' during the delay interval. This same pattern is also exhibited by the delay firing of trial-type cells (A,B).
TABLE 2
Functional cell types and DNMS task-specific information

                                                   Phase-only   Position-only   Conjunctives   Trial-type
Proportion of DNMS information encoded per cell    1/4          1/4             1/8            1/2
Number of cells (%)                                80 (30%)     33 (12%)        101 (38%)      29 (11%)

cell types were the conjunctive cells, accounting for 38%, with trial-type cells accounting for the smallest group at 11% of the total. These proportions were used to create a 'prototypical' ensemble consisting of 20 neurons encompassing all the FCTs, as shown in Fig. 6. The firing of this prototype ensemble during left and right trials illustrates that some neurons fired at the same times on different trials (i.e.
[Fig. 6 graphic: panel labels L and R; firing-rate scale bar, 5 Hz.]
Fig. 6. Simulated ensembles of 18 neurons containing a proportional mixture of functional cell types (FCTs). The simulated ensemble was constructed with one left-only, three sample-only, two left-nonmatch, two right-nonmatch, one right-trial, one left-trial, two right-sample, two left-sample, three nonmatch-only, and one right-only cells. Neurons are ranked in this same order from left to right along the neuron axis. Time axis is the same as in Figs. 3-5. (A) Firing pattern of the simulated ensemble on a left trial. (B) Firing of the same ensemble on a right trial. Note that although one-third of the cells fire the same in both trials (phase-only cells), the overall pattern of firing is distinct for the two trials. Conjunctive cells, which were not assigned to CAN1, CAN4, or CAN5 (Figs. 3-5), are included here, and contribute to the overall discrimination of phase and position.
phase-only cells), while others fired at specific times during the trials (i.e. position-only cells). The above FCTs accounted for only 42% of the population of cells; the remaining 58% of the population were conjunctive cells and trial-type cells, which fired at unique combinations of phase and position. In terms of the information required to complete the DNMS task, each phase-only and position-only cell could encode at most only 25% of the information required to represent a trial, while each conjunctive cell could uniquely encode half of that information (Table 2). The trial-type cells encoded nearly all of the information required to represent a given trial, but each cell could only encode one of the two possible DNMS trial types (Table 2). FCTs that encoded the highest proportion of information about DNMS events occurred with the lowest frequencies, whereas those that encoded less information occurred more frequently. The results in Table 2 suggest that as few as eight neurons with the appropriate FCTs would sufficiently encode all information required to perform the DNMS task; however, since recorded hippocampal ensembles rarely contained an 'ideal' mixture of cell types, larger ensembles would generally be required to encode all DNMS events (Hampson and Deadwyler, 1996a).
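The behavior of the simulated ensemble in Fig. 6 can be illustrated with a minimal sketch. The ensemble composition is taken from the Fig. 6 caption; everything else is an assumption made for illustration: each cell is reduced to an idealized all-or-none response at the sample and nonmatch responses, delay-period firing is ignored, and the names `ENSEMBLE`, `TRIALS`, `fires` and `pattern` are hypothetical.

```python
from fractions import Fraction

# Ensemble composition as listed in the Fig. 6 caption (18 neurons).
ENSEMBLE = (
    ["left-only"] + ["sample-only"] * 3 + ["left-nonmatch"] * 2
    + ["right-nonmatch"] * 2 + ["right-trial"] + ["left-trial"]
    + ["right-sample"] * 2 + ["left-sample"] * 2
    + ["nonmatch-only"] * 3 + ["right-only"]
)

# A left trial is a left sample response followed by a right nonmatch
# response; a right trial is the reverse.
TRIALS = {
    "left": [("sample", "left"), ("nonmatch", "right")],
    "right": [("sample", "right"), ("nonmatch", "left")],
}

def fires(cell_type, trial, phase, position):
    """Idealized all-or-none response of one functional cell type (assumption)."""
    first, second = cell_type.split("-")
    if second == "only":   # phase-only or position-only cell
        return first in (phase, position)
    if second == "trial":  # trial-type cell
        return first == trial
    # Conjunctive cell, e.g. "left-sample": needs both position and phase.
    return first == position and second == phase

def pattern(trial):
    """(sample, nonmatch) response of every cell in the ensemble for one trial type."""
    return [
        tuple(int(fires(cell, trial, ph, pos)) for ph, pos in TRIALS[trial])
        for cell in ENSEMBLE
    ]

left, right = pattern("left"), pattern("right")
same = sum(l == r for l, r in zip(left, right))
shared_fraction = Fraction(same, len(ENSEMBLE))
print(len(ENSEMBLE), same, shared_fraction)  # 18 6 1/3
```

Under these simplifying assumptions only the six phase-only cells respond identically on the two trial types, which matches the one-third noted in the Fig. 6 caption, while the ensemble-wide patterns for left and right trials remain distinct.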
If single neuron FCTs encode so much information about DNMS events, then do hippocampal ensembles exhibit distributed or local encoding? Prior analyses have shown the presence of both distributed and local encoding in hippocampal ensembles: place cells exhibit distinct local coding in the form of place fields, yet the representation of a complete spatial environment is distributed over a large number of neurons (Wilson and McNaughton, 1993; McHugh et al., 1996; Bedenbaugh and Gerstein, 1997; Brown et al., 1998). Likewise, spatial navigation has been [...] show distinct receptive fields (Georgopoulos et al., 1986; Miller et al., 1993; Lee et al., 1998), but encode considerably more information as a population (Pellizzer et al., 1995; Nicolelis et al., 1995, 1998; Chapin et al., 1999). Encoding of DNMS events was previously shown to be distributed across neurons (Hampson and Deadwyler, 1996a; Deadwyler and Hampson, 1997), yet an anatomic or procedural segregation of information (Hampson and Deadwyler, 1999) was observed when ensembles were examined for single neuron contributions. The prototypical histograms in Fig. 6 illustrate one mechanism by which the hippocampal FCTs may provide a procedural map of a DNMS trial. During the sample phase, only the respective sample and position cells are active, falling silent at the beginning of the delay phase. Meanwhile the nonmatch-only and opposite position cells remain silent until the later delay phase. Only trial-type cells remain active throughout the trial. As the delay progresses in the nonmatch phase, the nonmatch and appropriate position cells fire (Fig. 6).
In this manner, hippocampal cells would fire in sequence through a DNMS trial, encoding task-specific information as much by firing sequence as by the individual neural correlates. Thus the encoding of the DNMS trial is distributed across the ensemble, but relies on the local encoding revealed by the FCTs. If single neurons encoded DNMS events with the precision revealed by the FCTs, what additional information did ensemble analysis contribute? The answer lies in the manner in which FCTs were identified. The canonical discriminant analysis (CDA) of the ensembles was essential to determining the FCTs. Prior analyses of single neurons (Hampson et al., 1993; Heyser et al., 1993) showed that each neuron's firing varied in response to several events. Ensembles constructed of non-simultaneously recorded neurons contained too much noise uncorrelated with task events to accurately encode all DNMS events (Hampson and Deadwyler, 1996a). After the total variance of the simultaneously recorded neurons was analyzed as an ensemble via CDA, the discriminant functions were used to identify firing patterns and temporal intervals, and to search for FCTs. Thus the single neuron FCTs were not identified until after the ensemble analysis, and relied on the CDA for segregation of variance associated with DNMS event encoding. Neural ensemble recording revealed a feature of DNMS event encoding by hippocampal neurons that was not immediately apparent from single neuron recordings, yet single neurons exhibited features that contributed to the ensemble code. Previous reports revealed both simple and conjunctive firing patterns of single hippocampal neurons (Otto and Eichenbaum, 1992; Hampson et al., 1993; Schoenbaum and Eichenbaum, 1995b; Young et al., 1997), but did not relate that firing to ensemble encoding. The above results demonstrate that both local and distributed encoding occur within the hippocampus.
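The role of the canonical discriminant analysis described above can be made concrete with a small numerical sketch. It is not the authors' analysis pipeline: it assumes the textbook formulation of CDA (eigenvectors of the within-group scatter inverse times the between-group scatter), applies it to synthetic firing rates for six hypothetical neurons, and expresses each root's contribution as its share of the eigenvalue sum, which may differ from how the variance percentages in the text were computed. The function name and all rate values are invented for illustration.

```python
import numpy as np

def canonical_discriminant_analysis(rates, labels):
    """Return each canonical root's share of the eigenvalue sum, and trial scores.

    rates  : (n_trials, n_neurons) array of firing rates
    labels : (n_trials,) array of event labels
    """
    rates = np.asarray(rates, dtype=float)
    labels = np.asarray(labels)
    grand_mean = rates.mean(axis=0)
    n_neurons = rates.shape[1]
    within = np.zeros((n_neurons, n_neurons))   # within-group scatter
    between = np.zeros((n_neurons, n_neurons))  # between-group scatter
    for group in np.unique(labels):
        group_rates = rates[labels == group]
        group_mean = group_rates.mean(axis=0)
        centered = group_rates - group_mean
        within += centered.T @ centered
        offset = (group_mean - grand_mean)[:, None]
        between += len(group_rates) * (offset @ offset.T)
    # Discriminant functions are eigenvectors of within^-1 @ between.
    eigvals, eigvecs = np.linalg.eig(np.linalg.pinv(within) @ between)
    order = np.argsort(eigvals.real)[::-1]
    eigvals = np.clip(eigvals.real[order], 0.0, None)
    eigvecs = eigvecs.real[:, order]
    share = eigvals / eigvals.sum()
    scores = (rates - grand_mean) @ eigvecs
    return share, scores

rng = np.random.default_rng(0)
events = ["LS", "RS", "LN", "RN"]  # left/right sample, left/right nonmatch
n_trials, n_neurons = 40, 6
# Hypothetical mean rates: neurons 0-1 prefer sample, 2 prefers nonmatch,
# 3 prefers left, 4 prefers right, 5 has no correlate.
means = {
    "LS": [6, 6, 1, 3, 1, 2],
    "RS": [6, 6, 1, 1, 3, 2],
    "LN": [1, 1, 6, 3, 1, 2],
    "RN": [1, 1, 6, 1, 3, 2],
}
rates = np.vstack(
    [rng.normal(means[e], 1.0, size=(n_trials, n_neurons)) for e in events]
)
labels = np.repeat(events, n_trials)

share, scores = canonical_discriminant_analysis(rates, labels)
is_sample = np.isin(labels, ["LS", "RS"])
print(np.round(share[:3], 3))  # share of the first three roots
print(scores[is_sample, 0].mean(), scores[~is_sample, 0].mean())
```

In this toy data set the phase effect is deliberately larger than the position effect, so the first root separates sample from nonmatch events with scores of opposite sign, loosely mirroring the reciprocal scores described for CAN1.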
Similar findings have been reported for other brain areas (Nicolelis et al., 1998). The presence of neurons with distinctly tuned visual or sensory receptive fields is often cited as evidence for local encoding of information (cf. Georgopoulos et al., 1986; Gawne and Richmond, 1993). However, even with distinct receptive fields, the 'tuning' of a given field is often not precise enough to solely account for the precision of the code by ensembles (Salinas and Abbott, 1994; Georgopoulos, 1995; Fitzpatrick et al., 1997). Thus it is possible to have highly tuned receptive fields which encode only very specific information, similar to the event specificity exemplified by conjunctive cells, yet still have a code distributed across neurons. Single cell and ensemble encoding are mutually dependent processes, and both analyses reveal characteristics that could not be obtained by either alone: ensemble analyses revealed task-specific information that was encoded across neurons, yet portions of the code could be represented by single neurons. However, it was only after the ensemble firing patterns were extracted that the corresponding single neuron firing patterns could be effectively identified. Thus both techniques together showed how DNMS events were represented within the ensemble (across neurons), as well as how complete DNMS task-specific information could be represented in ensembles of only 10-20 neurons.
Acknowledgements

The authors thank D.R. Byrd, J.K. Konstantopoulos, J. Brooks and T. Bunn for technical support. This work was supported by NIH Grants DA08549 to R.E.H. and DA03502 and DA00119 to S.A.D.
References

Bedenbaugh, P. and Gerstein, G.L. (1997) Multiunit normalized cross correlation differs from the average single-unit normalized correlation. Neural Comput., 9: 1265-1275.
Brown, E.N., Frank, L.M., Tang, D., Quirk, M.C. and Wilson, M.A. (1998) A statistical paradigm for neural spike train decoding applied to position prediction from ensemble firing patterns of rat hippocampal place cells. J. Neurosci., 18: 7411-7425.
Chapin, J., Nicolelis, M.A., Yu, C.-H. and Sollot, S. (1989) Characterization of ensemble properties of simultaneously recorded neurons in somatosensory (SI) cortex. Soc. Neurosci. Abstr., 15: 312.
Chapin, J.K., Moxon, K.A., Markowitz, R.S. and Nicolelis, M.A. (1999) Real-time control of a robot arm using simultaneously recorded neurons in the motor cortex. Nat. Neurosci., 2: 664-670.
Christian, E.P. and Deadwyler, S.A. (1986) Behavioral functions and hippocampal cell types: evidence for two nonoverlapping populations in the rat. J. Neurophysiol., 55: 331-348.
Deadwyler, S.A. and Hampson, R.E. (1997) The significance of neural ensemble codes during behavior and cognition. In: W.M. Cowan, E.M. Shooter, C.F. Stevens and R.F. Thompson (Eds.), Annual Review of Neuroscience, Vol. 20. Annual Reviews, Palo Alto, CA, pp. 217-244.
Deadwyler, S.A., Bunn, T. and Hampson, R.E. (1996) Hippocampal ensemble activity during spatial delayed-nonmatch-to-sample performance in rats. J. Neurosci., 16: 354-372.
Fitzpatrick, D.C., Batra, R., Stanford, T.R. and Kuwada, S. (1997) A neuronal population code for sound localization. Nature, 388: 871-874.
Fox, S.E. and Ranck Jr., J.B. (1981) Electrophysiological characteristics of hippocampal complex-spike cells and theta cells. Exp. Brain Res., 41: 399-410.
Freund, T.F. and Buzsaki, G. (1996) Interneurons of the hippocampus. Hippocampus, 6: 347-470.
Gawne, T.J. and Richmond, B.J. (1993) How independent are the messages carried by adjacent inferior temporal cortical neurons? J. Neurosci., 13: 2758-2771.
Georgopoulos, A.P. (1995) Current issues in directional motor control. Trends Neurosci., 18: 506-510.
Georgopoulos, A.P., Schwartz, A.B. and Kettner, R.E. (1986) Neuronal population encoding of movement direction. Science, 233: 1416-1419.
Georgopoulos, A.P., Lurito, J.T., Petrides, M., Schwartz, A.B. and Massey, J.T. (1989) Mental rotation of the neuronal population vector. Science, 243: 234-236.
Ghazanfar, A.A. and Nicolelis, M.A. (1997) Nonlinear processing of tactile information in the thalamocortical loop. J. Neurophysiol., 78: 506-510.
Gochin, P.M., Colombo, M., Dorfman, G.A., Gerstein, G.L. and Gross, C.G. (1994) Neural ensemble coding in inferior temporal cortex. J. Neurophysiol., 71: 2325-2337.
Hampson, R.E. and Deadwyler, S.A. (1996a) LTP and LTD and the encoding of memory in small ensembles of hippocampal neurons. In: M. Baudry and J. Davis (Eds.), Long-Term Potentiation, Volume 3. MIT Press, Cambridge, MA, pp. 199-214.
Hampson, R.E. and Deadwyler, S.A. (1996b) Ensemble codes involving hippocampal neurons are at risk during delayed performance tests. Proc. Natl. Acad. Sci. USA, 93: 13487-13493.
Hampson, R.E. and Deadwyler, S.A. (1998) Methods, results and issues related to recording neural ensembles. In: H. Eichenbaum and J. Davis (Eds.), Neuronal Ensembles: Strategies for Recording and Decoding. Wiley, New York, pp. 207-234.
Hampson, R.E. and Deadwyler, S.A. (1999) Pitfalls and problems in the analysis of neuronal ensemble recordings during performance of a behavioral task. In: M. Nicolelis (Ed.), Methods for Simultaneous Neuronal Ensemble Recordings. Academic Press, New York, pp. 229-248.
Hampson, R.E., Heyser, C.J. and Deadwyler, S.A. (1993) Hippocampal cell firing correlates of delayed-match-to-sample performance in the rat. Behav. Neurosci., 107: 715-739.
Hampson, R.E., Simeral, J.D. and Deadwyler, S.A. (1999) Distribution of spatial and nonspatial information in dorsal hippocampus. Nature, 402: 610-614.
Heyser, C.J., Hampson, R.E. and Deadwyler, S.A. (1993) The effects of delta-9-THC on delayed match to sample performance in rats: alterations in short-term memory produced by changes in task specific firing of hippocampal neurons. J. Pharmacol. Exp. Ther., 264: 294-307.
Lee, D., Port, N.P., Kruse, W. and Georgopoulos, A.P. (1998) Neuronal population coding: multielectrode recording in primate cerebral cortex. In: H. Eichenbaum and J. Davis (Eds.), Neuronal Ensembles: Strategies for Recording and Decoding. Wiley, New York.
McHugh, T.J., Blum, K.I., Tsien, J.Z., Tonegawa, S. and Wilson, M.A. (1996) Impaired hippocampal representation of space in CA1-specific NMDAR1 knockout mice. Cell, 87: 1339-1349.
Miller, E.K., Li, L. and Desimone, R. (1993) Activity of neurons in anterior inferior temporal cortex during a short-term memory task. J. Neurosci., 13: 1460-1478.
Moran, D.W. and Schwartz, A.B. (1999a) Motor cortical activity during drawing movements: population representation during spiral tracing. J. Neurophysiol., 82: 2693-2704.
Moran, D.W. and Schwartz, A.B. (1999b) Motor cortical representation of speed and direction during reaching. J. Neurophysiol., 82: 2676-2692.
Muller, R.U., Stead, M. and Pach, J. (1996) The hippocampus as a cognitive graph. J. Gen. Physiol., 107: 663-694.
Nicolelis, M.A., Baccala, L.A., Lin, R.C. and Chapin, J.K. (1995) Sensorimotor encoding by synchronous neural ensemble activity at multiple levels of the somatosensory system. Science, 268: 1353-1358.
Nicolelis, M.A., Ghazanfar, A.A., Stambaugh, C.R., Oliveira, L.M., Laubach, M., Chapin, J.K., Nelson, R.J. and Kaas, J.H. (1998) Simultaneous encoding of tactile information by three primate cortical areas. Nat. Neurosci., 1: 621-630.
Otto, T. and Eichenbaum, H. (1992) Neuronal activity in the hippocampus during delayed non-match to sample performance in rats: evidence for hippocampal processing in recognition memory. Hippocampus, 2: 323-334.
Pellizzer, G., Sargent, P. and Georgopoulos, A.P. (1995) Motor cortical activity in a context-recall task. Science, 269: 702-705.
Redish, A.D. (1999) Beyond the Cognitive Map. MIT Press, Cambridge, MA.
Salinas, E. and Abbott, L.F. (1994) Vector reconstruction from firing rates. J. Comput. Neurosci., 1: 89-107.
Samsonovich, A. and McNaughton, B.L. (1997) Path integration and cognitive mapping in a continuous attractor neural network model. J. Neurosci., 17: 5900-5920.
Schoenbaum, G. and Eichenbaum, H. (1995a) Information coding in the rodent prefrontal cortex. II. Ensemble activity in orbitofrontal cortex. J. Neurophysiol., 74: 751-762.
Schoenbaum, G. and Eichenbaum, H. (1995b) Information coding in the rodent prefrontal cortex. I. Single-neuron activity in orbitofrontal cortex compared with that in pyriform cortex. J. Neurophysiol., 74: 733-750.
Schultz, W., Apicella, P. and Ljungberg, T. (1993) Responses of monkey dopamine neurons to reward and conditioned stimuli during successive steps of learning a delayed response task. J. Neurosci., 13: 900-913.
Wilson, M.A. and McNaughton, B.L. (1993) Dynamics of the hippocampal ensemble code for space. Science, 261: 1055-1058.
Young, B.J., Otto, T., Fox, G.D. and Eichenbaum, H. (1997) Memory representation within the parahippocampal region. J. Neurosci., 17: 5183-5195.