The Primacy of Grammar
Nirmalangshu Mukherji
A Bradford Book The MIT Press Cambridge, Massachusetts London, England
© 2010 Massachusetts Institute of Technology

All rights reserved. No part of this book may be reproduced in any form by any electronic or mechanical means (including photocopying, recording, or information storage and retrieval) without permission in writing from the publisher.

MIT Press books may be purchased at special quantity discounts for business or sales promotional use. For information, please e-mail [email protected] or write to Special Sales Department, The MIT Press, 55 Hayward Street, Cambridge, MA 02142.

This book was set in Times New Roman and Syntax on 3B2 by Asco Typesetters, Hong Kong. Printed and bound in the United States of America.

Library of Congress Cataloging-in-Publication Data

Mukherji, Nirmalangshu.
The primacy of grammar / Nirmalangshu Mukherji.
p. cm. "A Bradford book." Includes bibliographical references and index.
ISBN 978-0-262-01405-2 (hardcover : alk. paper)
1. Biolinguistics. 2. Language and languages—Philosophy. I. Title.
P132.M85 2010
401—dc22
2009031988

10 9 8 7 6 5 4 3 2 1
[I]t seems obvious, when you think about it, that the notion of language is a much more abstract notion than the notion of grammar . . . grammars have to have a real existence. . . . But there is nothing in the real world corresponding to language. In fact, it could very well turn out that there is no intelligible notion of language. —Noam Chomsky
Contents
List of Figures xi
Abbreviations xiii
Preface xv
1 The Loneliness of Biolinguistics 1
  1.1 Some Classical Issues 2
  1.2 Limits of Cognitive Inquiry 4
  1.3 Overview of Biolinguistics 11
    1.3.1 Language and Biology 13
    1.3.2 A Body of Doctrines 19
    1.3.3 A Mind-Internal System 24
2 Linguistic Theory I 29
  2.1 Russell's Scope Problem 30
  2.2 Principles and Parameters 35
  2.3 Government-Binding Theory 37
    2.3.1 D-Structure 39
      2.3.1.1 C-Selection 40
      2.3.1.2 X-Bar Theory 43
      2.3.1.3 Theta Theory 47
    2.3.2 S-Structure 51
      2.3.2.1 Case Theory 52
      2.3.2.2 Wh-Movement 55
      2.3.2.3 Binding Theory 57
    2.3.3 LF 62
  2.4 Grammar and Scope Problem 66
3 Grammar and Logic 73
  3.1 Chinese Room 74
  3.2 PFR and SFR 75
  3.3 LF and Logical Form 83
  3.4 Truth and Meaning 87
  3.5 Limits of Formal Semantics 95
    3.5.1 External Significance 95
    3.5.2 Syntax of Thought? 100
    3.5.3 Russell's Equivalence 106
  3.6 Summing Up 115
4 Words and Concepts 119
  4.1 "Incompleteness" of Grammar 119
  4.2 Lexical Data 124
    4.2.1 Uncertain Intuitions 126
    4.2.2 Nature of Lexical Inquiry 132
  4.3 Lexical Decomposition 134
    4.3.1 Initial Objections 136
    4.3.2 Nouns 145
    4.3.3 Verbs 150
  4.4 Crossroads 156
5 Linguistic Theory II 161
  5.1 Minimalist Program 162
    5.1.1 Conceptual Necessity 163
    5.1.2 Feature Checking 165
    5.1.3 (New) Merge 169
      5.1.3.1 Merge and Syntax 171
      5.1.3.2 Merge and Semantics 173
    5.1.4 Economy Principles 176
  5.2 CHL and Linguistic Specificity 179
    5.2.1 Principles 182
    5.2.2 Displacement 185
6 Language and Music 189
  6.1 Musilanguage Hypothesis 189
    6.1.1 Evidence 190
    6.1.2 What the Evidence Means 194
  6.2 Strong Musilanguage Hypothesis 197
    6.2.1 Music and Meaning 198
    6.2.2 Themes from Wittgenstein 201
      6.2.2.1 Music and Emotions 202
      6.2.2.2 Internal Significance 205
    6.2.3 Recursion in Music 209
7 A Joint of Nature 215
  7.1 Merge and Music 216
  7.2 Faculty of Music 222
  7.3 "Laws of Nature" 229
    7.3.1 Forms of Explanation 229
    7.3.2 Scope of Computationalism 235
Notes 243
References 253
Index 271
List of Figures
Figure 2.1 Government-Binding theory
Figure 2.2 X-bar organization
Figure 2.3 Clause structure
Figure 2.4 θ-role assignment
Figure 2.5 That-trace
Figure 3.1 Montague tree
Figure 5.1 Minimalist Program
Figure 5.2 Unambiguous paths
Figure 5.3 Merge in language
Figure 6.1 Structure of a raaga
Figure 7.1 External Merge in music
Abbreviations
A-P  Articulatory-perceptual system
CHL  Single computational system of human language
C-I  Conceptual-intentional system
CS  Computational system of a language
CSR  Construction-specific rule
DP  Determiner phrase
ECP  Empty Category Principle
EPP  Extended Projection Principle
FI  Principle of Full Interpretation
FL  Faculty of language
FLB  Broad faculty of language
FLI  FL-driven interpretation
FLN  Narrow faculty of language
FM  Faculty of music
FMI  FM-driven interpretation
G-B  Government-Binding framework
GLP  General linguistic principle
GTTM  Generative Theory of Tonal Music
IP  Inflectional phrase
LF  Logical form
LI  Lexical item
LSR  Language-specific rule
MLC  Minimal Link Condition
MLH  Musilanguage Hypothesis
MP  Minimalist Program
NP  Noun phrase
NTC  No Tampering Condition
PCE  Principle of Computational Efficiency
PCP  Purely computational principle
PF  Phonetic form
PFR  Purely formal rule
PHON  Phonetic output of grammar
PLD  Primary linguistic data
PP  Prepositional phrase
P&P  Principles-and-parameters approach
QP  Quantified phrase
Q-PCP  Quasi-PCP
QR  Quantifier raising
SDC  Shortest Derivation Condition
SEM  Semantic output of grammar
SFR  Stands-for rule
SM  Sensorimotor System
SMH  Strong Musilanguage Hypothesis
SMT  Strong Minimalist Thesis
SO  Syntactic object
UG  Universal Grammar
VP  Verb phrase
WP  Wh-phrase
Preface
Human languages are commonly viewed as complex, porous, and moldable systems that we construct by active human agency to meet a variety of sociocultural ends. On this view, languages are more like institutions such as legal and political systems; they are no doubt codified in some sense, but their codified character should not mislead us into thinking that there is some natural basis to their organization. In contrast, a small minority does hold the view that languages are natural objects with a biological basis, not unlike the respiratory or immune systems. Even there, most think of biological systems as irreducibly complex and messy, systems to which the methods of the exact sciences such as theoretical physics do not apply. In any case, for most people, a prominent lesson from the history of the sciences is that there are reasons to be skeptical about genuine scientific advances in the study of what may be called the "inner" domains.

In the prevalent intellectual scenario, it is of considerable interest that the contemporary discipline of generative linguistics—also called "biolinguistics"—has raised the prospects for developing a form of inquiry achieved only in some of the basic sciences. Biolinguistics is arguably the only attempt in the history of ideas in which, according to Noam Chomsky, a study of an aspect of the human mind—language—is beginning to have the "feel of scientific inquiry." Biolinguistics is currently suggesting that the structure of language may be "perfect" in design, not unlike the arrangement of petals in the sunflower and the double helix of DNA. Yet these advances have been accomplished essentially independently of the natural sciences, especially biology. In that sense, biolinguistics has initiated a (basic) science in its own terms.

In view of these startling developments, there ought to be some interest in the foundations of this discipline. For instance, it is most natural to ask: Which aspect of nature does this science investigate? The topic is introduced in chapter 1.
I made a very preliminary attempt to address this issue in a short monograph earlier (Mukherji 2000). The present book vastly extends the scope of that work. Here I have made a more serious effort to construct a philosophical discourse that weaves in and out of some of the basic ideas in biolinguistics, including some technical ones, as I examine its internal logic. My contention is that the theoretical beauty of biolinguistics cannot be adequately displayed without marshaling some degree of rigor and formal precision. Biolinguistics is not just a clever coverage of data; it is a search for invariants in nature, as noted. The limited technical discussion of grammatical theory in chapters 2 and 5 is directed at nonlinguist readers, but is neither intended nor enough to turn them into working linguists. As for professional linguists, there is perhaps some novelty in the way familiar material has been presented. In any case, I need this material for the complex argument that follows.

What It Is Not
I am aware that biolinguistics is not alone in syntax research, not to speak of studies on language as a whole; there are other perspectives on the organization of language and the architecture of the mind. However, as my basic interest is to understand the scope and limits of biolinguistics and to extract some general consequences from that understanding, I have made no attempt to engage in comparative studies; to that extent, this work is not a defense of biolinguistics. In fact, the work is pretty much confined to Chomsky's contributions to linguistic theory. Apart from the fact that Chomsky continues to be the prime mover in biolinguistics, the issues that interest me arise directly from his work, or so I think. I am aware that almost everything Chomsky is currently saying on the design of the language faculty is under dispute even within the core biolinguistic community; I have noted as many of these disputes as possible while trying not to clutter the text with field-internal debates. In fact, I myself will express a variety of disagreements with Chomsky. So, as with any rational inquiry, the general argument of the book is conditional in character: assuming the validity of Chomsky's claims that I find attractive (and I will tell you why), certain general conclusions seem to follow that go far beyond those claims. I am not suggesting that Chomsky is likely to agree with these conclusions. This basic focus on Chomsky's work in biolinguistics has also (severely) constrained the extent to which I could address specific topics from other disciplines that merit extensive discussion on their own. Apart
from topics internal to biolinguistics, the work touches on topics in the history and philosophy of science, epistemology, philosophy of language, philosophy of mind, lexical and formal semantics, and psychology of music, among others. I discuss them, at a suitable level of abstraction, only to see what light they throw on the character of biolinguistic inquiry from the perspective that interests me. In any case, it is physically impossible by now to look at even a fraction of the literature on language and related capacities, not to speak of developing expertise in all the areas listed above. My own professional location is in the philosophy of language. Venturing out from there, I had to make hard choices regarding the material to include, and a host of rich details internal to the individual disciplines had to be set aside, mostly by design but sometimes out of sheer exhaustion as well. Some perspective on the nature of language and mind does seem to follow once these choices are made.

What It Is
Biolinguistics is centrally concerned with Plato’s problem: How do humans come to know so much from so little exposure to the environment? It is interesting that biolinguistics has in fact furnished a partial but substantive answer to this ancient question for a restricted domain that falls within the wider domain of language; the identification of this restricted domain is a task that will occupy us through much of this work. As discussed in chapters 3 and 4, biolinguistics has been able to maintain some distance from topics traditionally thought to be central to the study of language: concepts, truth conditions, and communication. In this very specific sense, biolinguistics is concerned with the study of grammars. As in the opening citation from Chomsky, I use grammar mostly to designate the object of biolinguistic inquiry, but sometimes I use it to label the inquiry itself. I expect the context to make clear which of the two uses of grammar is intended. For example, the title of this work should strictly read The Primacy of Grammatical Theory. The context makes it very clear: no useful theoretical sense can be attached to the idea that the object—a piece of the brain—has primacy. Grammars consist of schematic and computational aspects of the mind/brain. Principles of grammar compute over symbols (representations) that may be used to express thoughts and emotions. But these principles do not compute over contents of thoughts and emotions, where content covers both the (mind-internal) conceptual aspects and the
external significance of language. Against the current perhaps, I take the isolation of this rather austere object to be the central contribution of Noam Chomsky; its significance lies in its frugality. The history of the more established sciences suggests that once some observed complexity has been successfully analyzed into parts, a study of the original complexity often drops out of that line of research, perhaps forever.

This restriction to grammar, and the abstraction away from "language," opens the possibility that the computational system of human language may be involved in each cognitive system that requires similar computational resources. In chapter 6 and the early part of chapter 7, a mixture of analytical argumentation, varieties of empirical and introspective evidence, and some speculation suggests a picture in which a computational system consisting of very specific principles and operations is likely to be centrally involved in each articulatory symbol system—such as music—that manifests unboundedness (see the illustrative sketch following the list below). Finally, in the rest of chapter 7, I suggest that the following things converge:

• The scientific character of biolinguistics
• Its independence from the rest of science
• Its basic explanatory form
• The domains of its application
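A schematic illustration may help nonlinguist readers fix the notion of unboundedness before the formal discussion arrives. The sketch below is mine, not the book's: it assumes nothing but a single binary operation, in the spirit of the Merge operation taken up in chapter 5, applying repeatedly to its own output; the function name and the placeholder symbols are hypothetical.

```python
# A minimal sketch of unbounded symbolic computation: one recursive,
# binary operation builds arbitrarily deep structures out of atomic
# symbols. The operation never inspects what the symbols mean; it
# computes over representations only.

def merge(a, b):
    """Combine two symbolic objects into a new, unordered object."""
    return frozenset({a, b})

structure = "x"                # an unanalyzed symbol: a word, a musical unit
for _ in range(5):             # the bound here is ours, not the system's
    structure = merge("y", structure)

print(structure)               # five levels of embedding; nothing bars a sixth
```

The same operation applies whether the atoms are words or musical units; on the picture just sketched, what varies across such symbol systems is not the computation but the systems that interpret its output.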
From this perspective, the real gain of the biolinguistic approach to cognitive phenomena is that the approach may have identified, after thousands of years of inquiry, a specific structure of the human mind, perhaps a real joint of nature.

Acknowledgments
I have been working on this book, off and on (more off than on), for over a decade. It is a pleasure to record my gratitude for people who helped me sustain this effort for so long in otherwise difficult circumstances. As noted throughout this book, my foremost intellectual debt is to the work of Noam Chomsky, apart from my personal debt to him for his swift, lengthy, and typically critical but constructive comments in many e-conversations over the years. I am also delighted to mention Ramakant Agnihotri, Daniel Andler, Roberto Casati, Probal Dasgupta, Pierre Jacob, Lyle Jenkins, Mrinal Miri, Bibhu Patnaik, Susrut Roy, and Rajender Singh for their support to the project in various ways. After several unsuccessful attempts (including computer disasters), the basic structure of this work finally fell (more or less) in place during the
summer of 2003 in Paris, in studios located in the ancient Latin Quarter and the ravishing Montmartre. I am grateful to Roberto Casati, close friend and critic, for proposing the visit on behalf of the marvelous Institut Jean Nicod, and for maintaining a constant watch on my well-being. I am indebted to Maurice Aylmard and Gilles Tarabout of Maison des Sciences de l'Homme, and Pierre Jacob, Director of Institut Jean Nicod, for supporting the trip. Unfortunately, several years elapsed before I could return to the work as I shifted to other more pressing concerns soon after returning from Paris. I am fortunate that Massimo Piattelli-Palmarini, Howard Lasnik, Norbert Hornstein, Cedric Boeckx, and Wolfram Hinzen read one of the recent versions, in part or in full. I must mention that I had no personal acquaintance with any of these renowned scholars when I approached them. Yet, they agreed to look at the manuscript (repeatedly, in some cases); there must be some invisible hand after all. I also learned much from the reports of the reviewers for the publisher on the penultimate version. I deeply appreciate the efforts of the editorial team at the MIT Press for executing the project with exemplary understanding.
1 The Loneliness of Biolinguistics
Reflecting on the state of language research, after a decade of work in the principles-and-parameters framework, Noam Chomsky (1991b, 51) observed that the systems found in the world will not be regarded as languages in the strict sense, but as more complex systems, much less interesting for the study of human nature and language, just as most of what we find around us in the world of ordinary experiences is unhelpful for determining the real properties of the natural world. . . . I have spoken only of language, which happens to be one of the few domains of cognitive psychology where there are rather far-reaching results. But I think it would hardly be surprising if the truth of the matter were qualitatively similar in other domains, where far less is known . . . only ancient prejudice makes this prospect appear to many to be unlikely.
I find it instructive to open the discussion with a somewhat free interpretation of these remarks. The citation has two parts. In the first, it is suggested that the study of "complex systems" "found in the world" is not likely to lead to the discovery of the "real properties of the natural world." In the second part, the citation mentions the discipline of cognitive psychology where "rather far-reaching results" have been achieved in some of its domains. The results have been "far-reaching" in the sense that something has been learned in these domains at a sufficient remove from "the world of ordinary experiences." Combining the two, it follows that, in these few domains of cognitive inquiry, research has been able to abstract away from the complexities of systems found in ordinary experience to isolate some simple systems whose properties may be viewed as real properties of nature. The implicit reference here is to some small areas of the more established sciences such as physics, chemistry, and certain corners of molecular biology, where rather surprising and deep properties of the natural world are sometimes reached by abstracting away from common experiences
and expectations (Stainton 2006). I will have many occasions in this work to evaluate advances in the "few domains" of cognitive psychology in terms of the history and methodology of physics and other advanced sciences (also see Boeckx 2006). For now, Chomsky seems to be generally suggesting that, in these domains, something like the explanatory depth of the natural sciences is almost within reach. How did it happen?

1.1 Some Classical Issues
If "cognitive psychology" is understood broadly as a systematic study of human cognitive behavior (as contrasted to, say, motor behavior), then the study is probably as old as human inquiry itself. Extensive, and sometimes quite rigorous, studies on this aspect of human nature dominated much of philosophical thinking across cultures for centuries. These studies were not always cast in direct psychological terms—that is, in terms of the properties of the human mind. For example, language was often studied as an independent, "external" object by itself, and the character of the studies ranged from mystical reflections to more critical and often constructive suggestions on the nature of this object. Such studies proliferated in large parts of the ancient Indian intellectual tradition. In the Ṛgveda (c. 1000 BC), for instance, the phenomenon of language is once described as a "spirit descending and embodying itself in phenomena, assuming various guises and disclosing its real nature to the sensitive soul."1 On the other hand, much later but within the same tradition, Pāṇini (c. 450 BC) worked out the first extensive and rigorous grammatical account of Sanskrit to trigger discussion and analysis that continue today (Kiparsky 1982; Barbosa et al. 1998, 2; Dasgupta, Ford, and Singh 2000; Coward and Kunjunni Raja 2001). Although nothing like the sophistication of Paninian grammar was ever reached in other domains, vigorous discussion of conditions governing human knowledge, perception, memory, logical abilities, and the like, continued for over a millennium in eight basic schools of thought with many subschools within each. The complexity and the depth of this tradition have begun to be understood in contemporary terms only recently. Unfortunately, the context and agenda of the present book do not allow more detailed comments on this tradition.2

Similar variations are found in the Western tradition as well. For the mystical part of the tradition, one could cite Hegel, for whom language is "the medium through which the subjective spirit is mediated with the
being of objects." The critical and constructive part of the enterprise took shape with Plato and Aristotle and continued through Descartes, Leibniz, Kant, Hume, and later thinkers such as Wilhelm von Humboldt (1836).3 Here as well we notice the interesting unevenness between linguistic studies, say, in the Aristotelian and Cartesian–Port Royal traditions, and the rest of the studies on human cognition. While studies on language and logic grew in sophistication, it is hard to see any radical progress since, say, the Theory of Ideas proposed by Plato in the fifth century BC. Very tentatively, therefore, there seems to be a sense in which the "few domains" of language and related objects are such as to open themselves to focused theoretical inquiry.4 It is not difficult to reinterpret at least some of these studies from either tradition in naturalistic terms to suggest that they were directed at uncovering the "real properties" of one part of nature, namely, the human mind. For Bhartṛhari (c. 450–500 AD), a philosopher of language in the Paninian tradition, speech is of the nature of the Ultimate Reality (Śabda-Brahma): "Although the essence of speech is the eternal Brahman, its significance evolves in the manner in which the world evolves."5 The thought is subject to a variety of (often conflicting) interpretations. However, no familiar conception of divinity—for example, an object of worship—attaches to the concept of Brahman. In that sense, nothing is lost if Brahman is understood as a system of invariants that constrains both the evolution of the world and the significance of speech.

For the Western tradition, consider what Chomsky takes to be the central question in cognitive psychology: How do humans come to know so much from so little exposure to the environment? In different places, Chomsky calls this problem variously "Plato's problem" (Chomsky 1986), "Descartes' problem" (Chomsky 1966), or "Russell's problem" (Chomsky 1972b). These names suggest that, at least in the Western tradition, the general problem was raised throughout directly in psychological terms—that is, in terms of constraints on human knowledge. Nevertheless, despite the noted unevenness between linguistic and other studies, studies from Ṛgveda to Russell hardly qualify as scientific studies in any interesting sense of "science." Suppose we label the most rigorous efforts in this area as "proto-science." For some domains of current cognitive psychology, in contrast, what Chomsky is claiming is a lot stronger. He is claiming that studies in these domains already exhibit some of the properties of the most advanced corners of some of the natural sciences. So the situation is this: the general
questions currently asked in these domains are fairly classical though the form and the content of the answers have radically changed. I can think of only one way this could have happened. Recall that only a few domains of "cognitive psychology" seemed to be intrinsically open to serious theoretical inquiry leading to proto-science; they at once await and motivate, as it were, the development of new ideas and theoretical tools. Whenever new ideas and tools are directed at classical questions, interesting answers begin to appear at a certain remove from common experience only in these domains from among the assorted domains to which the general, philosophical inquiry was initially, somewhat aimlessly, directed. Assuming this, it is no wonder that the object responded to the efforts of Paninian, Aristotelian, and Port Royal grammarians, as well as to contemporary generative linguists. Also, it could have been the case that the object, the new tools, and novel ideas formed a symbiotic relationship in that these tools and ideas interestingly applied only to this object. If so, then we have some explanation of why thousands of years of philosophical investigations into the nature of the rest of human knowledge in either tradition revolved around basically the same set of ideas and problems while formal studies on language and related topics flourished. I return to these issues shortly.

1.2 Limits of Cognitive Inquiry
The picture sketched above sheds some light on what seems to me to be a major perplexity in contemporary studies on language and mind. The perplexity is this: although there are reasons, both historical and conceptual, for skepticism about the very idea of (serious) cognitive inquiry, certain approaches to language have rapidly reached the standards of the advanced sciences. There is an old adage that a theory of language is an impossibility since the theory has to be stated in some language or other. Thus, the theory always falls short of its object. It quickly generalizes to a dim view of theories of mind as well: a theory of mind is an impossibility since the theory itself will be a product of the mind, and hence a part of the object under examination. The adage appeals to the image of eyeglasses: we can give only a partial and distorted description of the glasses when we wear them; we can take them off, but then we cannot see. This adage is distinct from classical skepticism that denies the possibility of any knowledge. The effect of the adage is restricted only to cognitive inquiry; in that
sense, it allows the possibility of knowledge of the "external world," say, the world of physics. Moreover, the adage needs to be distinguished from the more general observation by Chomsky (1980) that, since the human science-forming capacity is itself a natural object, there ought to be limits on scientific inquiry: the constraints that govern the capacity both allow and restrict the formation of scientific theories. Thus, unsolved problems divide into two kinds: "puzzles" that the human mind can in fact solve, and "mysteries" whose solutions, perhaps even intelligible formulation, lie beyond the power of the human mind (Chomsky 1975). The scope of this suggestion is difficult to estimate. On the one hand, the suggestion seems to apply to all the sciences: Are the unsolved problems of the origins of life or of the universe puzzles or mysteries? On the other hand, it is unclear whether it applies to the entire study of inner domains. For example, Chomsky specifically thinks that what he has called "the creative aspect of language use"—our essentially unbounded ability to produce and interpret sentences appropriately in novel circumstances—is a mystery. In the limit, we could conjecture that any significant general study of the science-forming capacity itself is beyond the power of the capacity. It does not follow, as the adage requires, that the study of cognitive systems such as the visual system, structural aspects of human reasoning, the observed diversity of languages, and so on also falls beyond the capacity, as Chomsky's own work on language testifies.

The adage is also different from a more recent skeptical perspective on the history of science. According to Chomsky, lessons from the history of the natural sciences seem to suggest that "most things cannot be studied by contemporary science." On this issue, it seems to him that Galileo's intuition that humans will never completely understand even "a single effect in nature" is more plausible than Descartes' confidence that "most of the phenomena of nature could be explained in mechanical terms: the inorganic and organic world apart from humans, but also human physiology, sensation, perception, and action to a large extent." Developments in post-Cartesian science, especially Newtonian science, "not only effectively destroyed the entire materialist, physicalist conception of the universe, but also the standards of intelligibility that were based on it." Thus Chomsky (2001b) supports Alexandre Koyré's remark that "we simply have to accept that the world is constituted of entities and processes that we cannot intuitively grasp." Clearly, these remarks apply to the whole of science including, as noted, the most innovative proposals
in theoretical physics. The remarks tell us about the kind of science we are likely to have at best; they do not deny that some form of science is available to humans in most domains of inquiry. The adage under discussion, on the other hand, suggests that scientific explanation may not be available for the study of "inner" domains at all, notwithstanding the character of scientific explanation already available for the "outer" domains. Nevertheless, the adage and Chomsky's observations possibly converge around the issue of complexity. Chomsky suggests that sciences of outer domains work under severe constraints, cognitive and historical. These constraints perhaps lead to the striking unevenness in the development of science. Genuine theoretical understanding seems to be restricted to the study of simple systems even in the hard sciences such that "when you move beyond the simplest structures, it becomes very descriptive. By the time you get to big molecules, for example, you are mostly describing things" (Chomsky 2000a, 2). Thus the quality of explanation falls off rapidly as inquiry turns to more complex systems. Given that the organization of our inner domains—that is, the respects in which we wish to understand them—is vastly more complex than free electrons or isolated genes, it is not surprising that we lack scientific progress in these domains. These remarks suggest that, even when we reach some understanding of cognitive domains such as language, the understanding is likely to be restricted to small and simple parts of the domain such as grammar.

The adage fosters a lingering intuition that our ability to have a theoretical grasp of ourselves must be severely restricted somewhere: "There are inevitably going to be limits on the closure achievable by turning our procedures of understanding on themselves" (Nagel 1997, 76). It is likely that when we approach that point our theoretical tools begin to lose their edge and the enterprise simply drifts into banalities since, according to the adage, our resources of inquiry and the objects of inquiry begin to get hopelessly mixed up from that point on. Such a point could be reached in the "hard sciences" as well when they attempt to turn "inward." This may be one way of understanding the origin of the deep puzzles around the so-called measurement problem in quantum physics. The conjecture here is that, for inner domains such as reasoning and language, such points show up sooner rather than later.

Despite the intellectual appeal of the adage, it is not clear how to examine it in a theoretically interesting manner. In fact, from the point of view of the cognitive sciences, the adage may be viewed as intrinsically uninteresting. How can we tell now what an enterprise is going to look like in the
future (Fodor 2000, 11 n. 1)? Skeptical questions could have been, indeed must have been, raised at the beginning of physics. But physics progressed, through calm and stormy times, without ever directly answering them. The questions were ultimately answered indirectly by the growth of physics itself to the point that skepticism became uninteresting. However, there is no credible evidence in the history of the sciences—just the opposite in fact, as we will see briefly in the context of biology—that lessons from the history of physics generalize to other domains of inquiry. It could be that Galilean physics is an exception rather than the rule in scientific inquiry. To emphasize, Galilean physics could be an exception precisely because it could extract and focus on simple parts of nature. In any case, the natural sciences typically focus on "outer" domains of nature, called the "external world" in the philosophical literature; the study of inner domains just does not belong to serious science. This is one source of the classical mind-body problem. The mind (the collection of inner domains) is thought to be so fundamentally different from the body (the collection of outer domains) that the forms of scientific explanation available for the latter are not supposed to obtain for the former. Chomsky has dubbed this doctrine "methodological dualism" (see Chomsky 2000d, chapter 4, for extensive criticism). When we add the further assumption that the forms of explanation that apply to the outer domains are the only ones in hand, it follows that inner domains fall out of science.6

To find some grip on these very general issues, I will assume, as noted, that the study of inner domains is essentially concerned with what Chomsky has called "Plato's problem": How do organisms form rich cognitive structures from little exposure to the environment? I take this to be the original and fairly classical motivation for cognitive science although not everyone who currently works in the cognitive sciences shares the motivation.7 The problem arises from what has come to be known as the "poverty-of-the-stimulus arguments," which show that there is not enough information in the environment for the rich systems constructed by organisms (Chomsky 1959; Piattelli-Palmarini 1980; Wexler 1991; Crain and Pietroski 2001; Berwick and Chomsky 2009, etc.). As Chomsky (2000a, 6) puts it, "We can check the experience available; we can look at it and see what it is. It's immediately obvious that it's just much too limited and fragmentary to do anything more than shape an already existing common form in limited fashions." The observation applies across the board to human language, the visual system, bird songs, insect navigation, bee dances, and so on. For
example, in the visual-cliff experiment, a given pattern is broken into an upper and a lower half with a glass top extending from the "shallow" half over the "deep" half. Thus, in the absence of depth perception, the lower half looks continuous with the upper. Newly hatched chicks and one-day-old goats will stop at the upper edge of a visual cliff at the very first exposure; a goat will in fact extend its forelegs as a defensive measure when placed on the "deep" side and leap onto the "shallow" side (Kaufman 1979, 237). For human language, which is our basic concern, it has been extensively documented that children rapidly acquire languages not only on the basis of impoverished information, but also, in many cases, seemingly without any relevant information at all (Jackendoff 1992, chapter 5). In a particularly telling case, three deaf children were able to construct a sign language secretly and for use only among themselves in the face of parental opposition. Investigations later showed that this language compared favorably with the spoken language developed by normal children of the same age (Goldin-Meadow and Feldman 1977; Gleitman and Newport 1995; Goldin-Meadow 2004). In fact, studies show that deaf and normal children make the same "mistakes" at the corresponding stage of acquisition. At a certain stage, normal children typically use you to refer to themselves and I to refer to the addressee. Amazingly, corresponding gestures in American Sign Language for you and I by deaf children show very similar reversal despite the fact that these gestures are iconic (Chiat 1986; Petitto 1987). Studies show that twelve-hour-old babies can distinguish between linguistic and nonlinguistic acoustic inputs. Jacques Mehler and his associates showed further that four-day-old infants can distinguish between the prosodic contours of, say, Russian and French (Mehler et al. 1986). Turning to more abstract syntactic abilities, four-month-old babies are sensitive to the clause boundaries of, say, Polish and of English, their native tongue. By six months, however, they lose their sensitivity to Polish clause boundaries, but retain the same for English (Karmiloff-Smith 1992, 37). In other terms, as we will see, some parameters of specific languages are fixed by then (Baker 2001). The general task of the cognitive sciences is to explain this astonishing ability in every domain in which it is displayed.

Returning to the adage and setting other inner domains aside, it is already clear that language escapes the suggested divide between what does and does not fall under science. Language not only belongs to the inner domain, it is an extremely complex system even when it is studied under the so-called top-down—rules-and-representations—approach
(Jackendoff 2002, 6); at the level of neurons and their connections, the complexity is astronomical. This is where we would least expect genuine scientific understanding. Yet, in just over four decades of research, we not only have substantive solutions to Plato's problem in this domain, the solutions have the form of the most advanced corners of science. Somehow the adage lost its skeptical power when scientific attention was directed at a specific aspect of human language. If the results of language research are to be coherently accommodated within the current scientific outlook, some fundamental assumptions have to give way. In contrast, despite immense international effort accompanied by technological development, progress in other classical cognitive domains continues to be largely elusive. According to Jerry Fodor (2000, 2), "The last forty or fifty years have demonstrated pretty clearly that there are aspects of higher mental processes into which the current armamentarium of computational models, theories, and experimental techniques offers vanishingly little insight." Even if Fodor's rather sharp remark is only partly true, it seems the adage continues to control the study of these domains.

The opening citation from Chomsky, together with its free interpretation, gives some preliminary idea of what might be happening. The basic idea, as hinted, is that linguistic inquiry could escape the adage precisely because it could address Plato's problem in an area of human cognition that has traditionally allowed inquiries to abstract away from common experience. Inquiries that are more directly involved with common experience, even implicitly, seem to fail to do so. For now, I will make some brief remarks on this point with some speculation on how our cognitive capacities might be organized with respect to our ability to study them. The rest of the work may be viewed as a gradual unfolding of these preliminary ideas.

It seems that our grammatical—not linguistic—capacity is such that we have no firm common beliefs about its nature and function; we just use it to form "surface" intuitions in the form of judgments of acceptability. That is, our use of this capacity does not require that we form some opinion of it; we are not congenital syntacticians. Karmiloff-Smith (1992, 31) disagrees: "Normally developing children not only become efficient users of language; they also spontaneously become little grammarians." Very young children, no doubt, make surprisingly sophisticated grammatical judgments on the basis of substantial tacit knowledge, as Karmiloff-Smith documents, but they do not know what noun phrase or anaphora means. In this sense, the grammatical system is "opinion-encapsulated." Therefore, it is possible for the human science-forming capacity to study
these "surface" intuitions reflected in the production and interpretation of speech, and to abstract away from them relatively undisturbed by "folk syntax." In other words, grammatical competence is typically put to use without any knowledge that a grammar is in use—that is, without knowing that the user is putting something to use. When asked if the (complicated) structure John is too intelligent to fail is okay, a competent user can give assent without having any resource to explain why it is so. If this is roughly correct, it explains why linguists can place their own linguistic intuitions under scientific scrutiny, thus opening up an explosion of data for language research. As we will see much later in the work (in section 4.2), even when human grammatical judgments are uncertain, we cannot remove the uncertainty by conscious effort. So, even uncertain intuitions become data for science. Interestingly, what I just said about grammaticality judgments seems to apply to perceptual judgments as well when the contexts are properly controlled and the stimulus is presented rapidly. It is generally said that, among the studies on cognitive capacities, the sciences of language and vision have made the most progress, though the sharp unevenness of progress between the two is also acknowledged.

In other domains—for example, the human conceptual system—it seems that we need to become "folk semanticists" in varying degrees to be able to use this system. This is because this system is directly involved, at varying levels of consciousness, with what beliefs we form about the world so that we can lead a life in it. Thus, we need to form fairly conscious judgments regarding which concept is related to which one, which is "higher" and which is "lower," which has a sharp boundary and which is relatively loose, and so on, in order to be able to use them in appropriate contexts. In that sense, users of dog or apple need to be prepared to explain what they are talking about. In fact, asked to explain if the sentence mentioned above, John is too intelligent to fail, is okay, a user is likely to answer in terms of the meanings of John, intelligent, and fail, rather than whether the small clause is correctly placed. These judgments interfere, quite fatally as we will see, with our scientific ability to penetrate below them to examine the "real" structures that no doubt exist: "One cannot guess how a word functions. One has to look at its use and learn from that. But the difficulty is to remove the prejudice which stands in the way of doing this. It is not a stupid prejudice" (Wittgenstein 1953, paragraph 340). This is not to deny that we might form some common opinion on what counts as "language" essentially in terms of this "folk semantics," an
opinion that in turn might lead to some opinion on what counts as "grammar." Schoolchildren are constantly subjected to grammatical lessons on, say, how to convert direct speech to indirect speech in terms of specific contexts of use. But, if the preceding analysis is broadly correct (and partly clear), then we know why it is possible for the biolinguist to ignore such opinion without much difficulty—or, to use such opinion itself as data—and focus on the underlying object instead.

1.3 Overview of Biolinguistics
Chomsky initiated the contemporary research on language nearly half a century ago essentially to solve Plato's problem for the domain of language, as noted. From the beginning, the research focused on language as a cognitive system in the mind/brain that solves Plato's problem for the child (Chomsky 1955a);8 hence, the enterprise is called "biolinguistics." In the domain of language, Plato's problem took an interesting form very early in the research program: the tension between descriptive and explanatory adequacies (Chomsky 1965, chapter 1). The tension arose as follows. When researchers attempted to give a precise description of the properties of expressions in individual languages—the condition of descriptive adequacy—they were compelled to postulate very complex mechanisms with varied grammatical constructions that mostly looked specific to the language under study. Following Plato's problem, the condition of explanatory adequacy required that the construction of grammars be based on the impoverished conditions of language acquisition. So, the very complexity of the descriptions made the languages essentially unlearnable because there just is not enough information available to the language learner for constructing those elaborate grammars. Moreover, since children are not born with genetic dispositions to learn specific languages, the language faculty ought to allow every normal child to acquire any of the thousands of human languages with equal facility, so Plato's problem compounds. As a matter of fact, children do acquire any human language rapidly and with ease with little stimulus from the environment, as we saw. In most cultures, children acquire a number of languages before they know the names of these things; they do not even know that they have acquired languages. The rich descriptions of languages were thus incompatible with what children do.

Notice that the problem is somewhat different from the closely related problem of acquiring the visual system. The problem of explanatory
adequacy for language learning (alternatively, "the logical problem of language acquisition") clearly suggests that the initial human language system, the faculty of language, ought to be simple and uniform across the species. This must also be the case with the visual system, as with every cognitive system, given Plato's problem. But the visual system is not only uniform across the species like the language system; the states that it can attain, unlike the language system, are largely uniform as well, pathology aside. The shapes that occur in this line, for instance, can be copied by anyone, but they will be understood only by competent users of English. The states that the language system can attain vary wildly, as thousands of human languages and dialects testify. This led quite naturally to the principles-and-parameters framework, as we will see.

Continuing with historical remarks, the research that ensued for the domain of language began receiving some acceptance in the early 1960s, most notably at MIT, where Chomsky taught. Still, the field of biolinguistics was rather small at this stage, with only a handful of researchers at MIT and elsewhere. Chomsky reports that "it used to be possible, not so long ago, to teach [biolinguistics] from zero to current research within a term or a year or so" (Chomsky, Huybregts, and Riemsdijk 1982, 52). In just a few decades since, biolinguistics has become a major scientific enterprise across the globe. Jenkins (2000, ix) reports that, apart from research in theoretical linguistics (syntax, semantics, morphology, lexicon, phonology) covering hundreds of languages and dialects, the enterprise now actively touches on areas such as articulatory and acoustic phonetics, language acquisition, language change, specific language impairment, language perception, sign language, neurology of language, language-isolated children, Creole language, split-brain studies, linguistic savants, and electrical activity of the brain, among others.

Notwithstanding astonishing growth within a short time, biolinguistics is very far from being the acclaimed program in studies on language in general. Apart from biolinguists and some of their coresearchers in psychology and the neurosciences, researchers on language include other varieties of linguists such as sociolinguists and historical linguists, practitioners of a large and amorphous discipline called "cognitive linguistics," literary theorists, semioticians, philosophers of language, logicians and formal semanticists, communication theorists, varieties of computational linguists (including those who work on machine translation), and so on. Although some people from these disciplines do work within the broad generative enterprise, it is a safe bet that most researchers in these disci-
plines not only do not work within the biolinguistic enterprise, they are positively hostile to it.

1.3.1 Language and Biology
Some of this resistance comes from (1) the antiscientific aspects of the general intellectual culture especially when it concerns topics of "human" interest, (2) varying conceptions of language and theory of language, (3) discomfort with formal analysis, and (4) the continuing influence of traditions of linguistic research outside the generative enterprise. Many of these strands can be traced to the widespread belief that language is not an object for the natural sciences at all. In other words, the basic source of this resistance to the enterprise is the central claim of biolinguistics that, in studying the nature and function of human language, linguists are in fact studying some biological aspect of the human brain. People working on language are generally uncomfortable with the idea that language is essentially a biological system (Koster 2009): it is a determinate, restrictive structure that grows in the mind of the child under highly specific inner constraints. This is in conflict with a conception of language shared by many that language is a "cultural" entity; it is thus flexible and moldable, not unlike the alleged malleability of social institutions, customs, and political beliefs.9 However, contrary to expectations, Chomsky does not defend the biological basis of linguistic theory by citing (corroborative) evidence from the brain sciences; just the opposite in fact in major respects for now. Somehow then the claim that language is a natural object—a biological system with a genetic component—is maintained independently of the advances in the biological sciences! In recent work, Chomsky has argued for this perspective by showing that, even if the current biological sciences do not provide any manifest basis for the results of linguistic research, there is no coherent alternative to the view that language is a biological system; any alternative perspective is likely to fall into one untenable version of dualism or another (Chomsky 2000d). Indeed, according to Chomsky (2005), those who explicitly resist the idea of the biological basis of language often adopt it implicitly for coherence. Even then it is not immediately clear what naturalistic basis to ascribe to language in the absence of direct support from biology.

The issue of the biological basis of the generative enterprise can be raised for different aspects of the enterprise. These aspects fall into two broad categories: observational and theoretical. First, a number of
observations on the character of linguistic and related behavior are made to argue that the human language system must be highly constrained innately. Second, linguistic theories are formulated to seek specific properties of the mind/brain that give rise to the observed phenomena. The issue of biological basis affects these two aspects in different ways. Chomsky's position, stated above, belongs essentially to this second aspect, although it touches on the first as well.

The idea that organisms are highly constrained innately is taken to be a truism in the study of organic systems: "Take the fact that people undergo puberty at a certain age . . . if someone were to propose that a child undergoes puberty because of, say, peer-pressure ('others are doing it, I'll do it too'), people would regard that as ridiculous" (Chomsky 2000a, 7). Still, prevalent conceptions of language require that the truism be explicitly demonstrated. Central to these demonstrations are the poverty-of-stimulus arguments, noted above. In this sense, Chomsky and others have drawn on a variety of evidence, from the way children learn languages to brain disorders of language, to argue that the linguistic system is innately constrained. The most plausible way of construing these innate constraints is to think of them as having a biological (= genetic) basis. However, apart from telling us, in general terms, that there ought to be a biological basis of language, poverty-of-stimulus and related arguments do not supply any clue about what that basis is: "Poverty of stimulus considerations tell us that some knowledge is innately represented; they don't tell us how the knowledge is represented or processed" (Collins 2004, 506). These "arguments" are really observations that help set up a problem—essentially, Plato's problem—that theories of language try to solve. Note also that such arguments, including those from split-brain studies, are typically focused on behavior, which is the output of the concerned cognitive system; they are not directly focused on the biological properties of the system. In solving the problem raised by these arguments, a theory of language attempts to go below the behavior to isolate the specific properties of the innate biological system involved here. Thus, a more demanding concept of biological basis arises when we shift to particular proposals—theories—regarding the innate constraints. Here, as I understand it, Chomsky's perspective is that biolinguistics stands essentially on its own; biological sciences do not provide any support.

In fact, the basic problem that is currently animating linguistic research is even more enigmatic than the problem just mentioned. The enigma arises as follows. Suppose that the biolinguistic framework is reluctantly admitted if only because, as noted, coherent alternatives are difficult to
conceive. Now, biological systems are standardly viewed as poor solutions to the design problems posed by nature. These are, as Chomsky (2000a, 18) puts it, "the best solution that evolution could achieve under existing circumstances, but perhaps a clumsy and messy solution." In contrast, the so-called exact sciences, such as physics and parts of chemistry, follow the Galilean intuition that nature is perfect: natural effects obtain under conditions of "least effort." Thus, the search for these conditions in nature had been a guiding theme in these sciences.

The design problem that the human linguistic system faces is the satisfaction of legibility conditions at the interfaces where language interacts with other cognitive systems. Roughly, the sensorimotor systems access representations of sound, and conceptual-intentional systems access representations of meaning. As Chomsky (2000a, 17) phrases the design problem, "To be usable, the expressions of the language faculty (at least some of them), have to be legible by the outside systems. So the sensorimotor system and the conceptual-intentional system have to be able to access, to 'read' the expressions; otherwise the systems wouldn't even know it's there." Explorations under what is known as the Minimalist Program are beginning to substantiate the view that the system is "perfect": it solves the design problem under conditions of least effort. What look like apparent imperfections in the system, such as the existence of (semantically) uninterpretable features in the lexicon, are best explained as optimal mechanisms for meeting legibility conditions imposed by systems external to language. We will look at the phenomenon later (chapter 5).

How do we accommodate these discoveries with the idea that biological systems are "clumsy and messy"? Some years ago, Chomsky (1995b, 1–2) formulated the big puzzle that emerges as follows: "How can a system such as human language arise in the mind/brain, or for that matter, in the organic world, in which one seems not to find anything like the basic properties of human language?" Chomsky thought that the "concerns are appropriate, but their locus is misplaced; they are primarily a problem for biology and the brain sciences, which, as currently understood, do not provide any basis for what appear to be fairly well established conclusions about language."

Unless one is intrinsically excited about the prospect of discovering a new aspect of nature in whatever terms are available, especially in the "inner" domains, one is not likely to be convinced by Chomsky's diagnosis of the problem without further arguments. Given the power and prestige of the "hard sciences," it is difficult to swallow the idea that
biolinguists are right and all the life sciences, as currently understood, are wrong, or at least insufficient, in this respect. In fact, if the enterprise is not to be viewed as just a technique for generating linguistic structures, then it is an open question how many generative linguists themselves seriously subscribe to the idea that, for example, in studying the intriguing structure John had a book stolen they are in fact studying the human brain.10 In a conversation twenty-five years ago regarding the early developments in transformational grammar, Chomsky remarked that "it was just used as another descriptive device." "There are things," Chomsky continued, "that you can describe in that way more easily than in terms of constituent structure, but that is not a fundamental conceptual change, that is just like adding another tool to your bag" (Chomsky, Huybregts, and Riemsdijk 1982, 40). It will be surprising if the general attitude has changed much in the meantime, even if the "tools" have.

Chomsky's remark has an immediate echo in an intriguing period in the history of science that he has alluded to from various directions in recent years. The period at issue concerns the character of chemistry, as viewed by most of its principal practitioners before its unification with (quantum) physics. As Chomsky (2001b) puts it, "It was claimed, up until the 1920s by Nobel laureates, philosophers of science, and everyone else, that chemistry is just a calculating device; it can't be real. This is because it couldn't be reduced to physics." Since linguistics could not be "reduced" to biology either, it is not wholly unreasonable to view the generative enterprise as a "calculating device" by its practitioners. As noted, Chomsky placed the "locus" of the concern on the biological sciences; others might prefer to place it on the generative enterprise itself.

Could it be that the entire discipline of biolinguistics lacks foundations? Although nothing can be ruled out, finding something fundamentally wrong with the internal research of biolinguistics now requires working through this increasingly difficult discipline with its very abstract formulations and a massive body of interdisciplinary research, as noted. It is likely that, from now onward, foundational problems with the generative enterprise, if any, will be noticed within the enterprise itself, as in physics and mathematics—not from the outside. In fact, the enterprise has already faced a number of such problems. The conflict between descriptive and explanatory adequacies, mentioned above, is one of the earlier ones. In the 1980s, the postulation of "inner" levels of representation, d- and s-structures, posed another fundamental
problem since no other system of the mind accesses them. In the current Minimalist Program, such problems include the existence of uninterpretable features: lexical features, such as structural Case, that have no semantic interpretation are found in every language; certain operations seem to require "look-ahead" information; and so on (I return to these issues in section 5.1.2). Notice that foundational problems have progressively become more theory-internal, as expected in an advancing science (see Freidin, Otero, and Zubizarreta 2008). It is not surprising that attempts to challenge the foundations of the discipline from the outside have more or less faded out in recent decades. Outside the enterprise, a more convenient strategy is to grant that Chomsky may be right in what he is doing, but he is doing very little—so little, in fact, that we need to redo the whole thing, including syntax. As Peter Gärdenfors (1996, 164–165) puts it, "Semantics is primary to syntax and partly determines it. . . . This thesis is anathema to the Chomskyan tradition within linguistics." Anna Wierzbicka (1996, 7) complains that the "Chomskyan anti-semantic bias" has "led to a preoccupation with formalisms . . . in which 'meaning-free' syntax has for decades usurped the place rightfully belonging to the study of meaning." To show the extent of disapproval, she cites Nobel laureate Gerald Edelman: "The set of rules formulated under the idea that a grammar is a formal system are essentially algorithmic. In such a system, no use is made of meaning. Chomsky's so-called generative grammar . . . assumes that syntax is independent of semantics."11 Ray Jackendoff (2002, 269) suspects that "the underlying reason for this crashing wave of rejections is the syntactocentrism of mainstream generative grammar: the assumption that the syntactic component is the sole source of generative capacity in language." Hence, he is led to suggest "a radical reformulation of linguistic theory that in some strange sense 'turns the grammar inside out'" (p. xii). Given the continuing popularity of these complaints against biolinguistics, apparently leading to a "crashing wave of rejections," some brief remarks are in order at this point. I will discuss the question of meaning in biolinguistics at length as I proceed, especially in chapters 3–7. "Meaning," Chomsky (1957) observed in his early work, is a "catchall" term. The term evokes a variety of expectations, not all of which can be met in serious theoretical inquiry. Furthermore, there is no assurance that, when the common concept of meaning is placed under theoretical scrutiny, whatever remains of the common concept will be located in one theoretical place. It is more likely that the thick and loose ordinary
concept will be broken down into theoretically salient parts, and that the individual parts will be attached to different corners of the total theoretical plane. Keeping these points in mind, consider Chomsky's general characterization of the computational system of language. A computational system consists of "rules that form syntactic constructions or phonological or semantic patterns of varied sorts to provide the rich expressive power of human language" (Chomsky 1980, 54ff.). Notice that this characterization includes "semantic patterns." Almost every topic in biolinguistics is directly concerned with semantics and questions of meaning. For example, the treatment of grammatical phenomena such as understood Subject, antecedents of anaphors and pronouns, quantifier movement, and so on is directly semantically motivated. In fact, the entire nonphonological part of computation (N → SEM computation) is currently viewed as geared to form an "image" SEM in a way such that the (configurational) demands placed by the systems of thought are optimally met. Furthermore, it is most natural to view the language faculty itself as containing what may be called "I-meanings": representations encoded in the formal-semantic features of lexical items. Finally, other naturalistic things being equal, the domain of syntax may be broadened to include much of what goes by the label "formal semantics"; thus, the concept "semantic value" could cover syntactic objects internal to the mind but external to the language faculty. So much for Chomsky's "preoccupation with 'meaning-free' syntax," and his stopping "people from working on meaning" (Marvin Minsky, cited in Jenkins 2000, 52). It is hard to find any interest, then, in the objections to the biolinguistic enterprise from the outside. Therefore, the only option is to try to make sense of linguistic research in the context of current science. And here the stumbling block, to repeat, is that there is nothing in the relevant current sciences that tells us how to make that sense. The problem, as Chomsky notes, may well lie with biology and the brain sciences, which do not provide any basis for what appear to be well-established conclusions about language. More specifically, the biological sciences may not have sufficiently advanced to respond to the questions posed to them by linguistic research. It could be, as Chomsky (2000d, 104) observes, that "any complex system will appear to be a hopeless array of confusion before it comes to be understood, and its principles of organization and function discovered." I will briefly cite two examples to suggest what is at stake here. Consider the research on nematodes. Nematodes are very simple organisms
with a few hundred neurons in all, so people have been able to chart their wiring diagrams and developmental patterns fairly accurately. Yet Chomsky (1994b) reports that an entire research group at MIT devoted to the study of the "stupid little worm," just a few years ago, could not figure out why the "worm does the things it does." More recently, citing cognitive neuroscientist Charles Gallistel, Chomsky writes that "we clearly do not understand how the nervous system computes, or even the foundations of its ability to compute, even for the small set of arithmetical and logical operations that are fundamental for any computation for insects" (Chomsky 2001b; Gallistel 1998). Commenting on Edward Wilson's optimism about a "coming solution to the brain-mind problem," Chomsky remarks that the "grounds for the general optimism" regarding "the question of emergence of mental aspects of the world" are at best "dubious." There are serious attempts in biology itself to address the tension between the concept of perfection and what is known about biological systems. In recent years, there has been increasing application of considerations from physics (such as symmetry, least-energy requirement, and the like) to try to understand the organization and function of complex biological systems (Jenkins 2000; Leiber 2001; Piattelli-Palmarini and Uriagereka 2004; Chomsky 2005, etc.). If this approach is successful in providing an account of some of the complex physical structures and patterns found in the biological domain, then biology will also confirm the intuition about nature's drive for the beautiful, which has been a guiding theme of modern science ever since its origins, as Chomsky (2001b) remarks following Ernst Haeckel. Still, even if we grant that the patterns on zebras or the icosahedral structure of viruses have interesting least-effort explanations, the chance of such explanations extending to the abstract structures of language is at best remote. I return to this point in section 7.3.1.
1.3.2 A Body of Doctrines
Pending such advances in biology, the only option is to make scientific sense of linguistic research in its own terms. In effect, I view the basic vocabulary and the constructs of linguistics—its lexical features, clause structures, island constraints, argument structures, landing sites, constraints on derivation, and so on—as theoretical devices to give an account of at least part of the organic world, namely, the human grammatical mind, and perhaps much more. More specifically, one should be allowed to draw a tree diagram and claim that it describes a state of the brain.
Returning at this point to the period in the history of chemistry mentioned above, recall that chemistry was viewed as a mere "calculating device" on the grounds that it could not be unified with physics. The gap seemed unbridgeable essentially because the chemist's matter was discrete and discontinuous, while the physicist's energy was continuous (Chomsky 2001b). Under the assumption that the physicist's view of the world is "basic" at all times, it is understandable that chemistry was viewed as "unreal." However, as Chomsky has repeatedly pointed out in recent years, the gap was bridged by unifying a radically changed physics with a largely unchanged chemistry. Analogically, from what we saw about the current state of biological research on cognition and behavior, it is possible that a "radically changed" biology, perhaps on the lines sketched above, will unify with a "largely unchanged" linguistics. Since the likelihood of such a biology is remote, all we have in hand is the body of linguistic research itself. Chomsky drives the point home by citing what he calls the "localist" conception of science attributed to the eighteenth-century English chemist Joseph Black: "Let chemical affinity be received as a first principle . . . till we have established such a body of doctrine as [Newton] has established concerning the laws of gravitation" (quoted in Chomsky 2000d, 166). Thus, chemical research proceeded along a different path from physics, and the gap between the disciplines widened. In my opinion, the gap between biology and linguistic research is even wider. After all, the chemist's view of matter as discrete and discontinuous was not something unheard of in (earlier) physics. Newton himself was a "corpuscularean" about many aspects of nature in the sense that he thought that all matter in the universe is made up of the same "building blocks." So, for him, even something as ethereal as light also consists of "particles," a view confirmed from a wholly different direction two centuries later. In that sense, in adopting the chemist's view of matter, (later) physics was reconstructing a part of its own past. Furthermore, the disciplines did get unified. It could mean either that something totally unexpected happened, or that the disciplines were "proximal" for centuries for this to happen eventually. The bare fact that this form of unification is a very rare event in science lends support to either interpretation. Still, the second option of proximality is more plausible since the first involves a miracle. In contrast, although the general idea of biolinguistics goes back to ancient times in many traditions, there is no record of "proximity" between the disciplines of linguistics and biology—just the opposite in most cases, as we have seen. In general, if
there is exactly one case of large-scale unification in the whole of science, it is natural to expect that there is a history to it, which does not often repeat itself. Pursuing the point a bit further, it seems that although physics and chemistry became separable bodies of doctrine at some point, there was some conception of a unified picture throughout. That is, some conception of properties of more "basic" elements combining to produce both physical and chemical effects guided much research for centuries. In this connection, the issue of John Dalton's professional identity is interesting: Was Dalton a chemist or a physicist? The Britannica Micropaedia article on Dalton (vol. 3, 358) lists him as both a chemist and a physicist; so does the main Macropaedia article (vol. 5, 439). The Macropaedia article on the history of the physical sciences, however, lists him as a chemist only. But the section on chemistry in this article (vol. 14, 390) begins as follows:

Eighteenth century chemistry derived from and remained involved with questions of mechanics, light, and heat as well as iatrochemistry and notions of medical therapy. . . . Like the other sciences, chemistry also took many of its problems and much of its viewpoint from [Newton's] Opticks and especially the "Queries" with which that work ended. Newton's suggestion of a hierarchy of clusters of unalterable particles formed by virtue of the specific attractions of its component particles led directly to comparative studies of interactions and thus to the table of affinities.
Following these remarks, the article goes on to view Joseph Priestley's work on chemical affinities as continued explorations of Newtonian queries. This is exactly what I had in mind about the "proximity" of chemistry and physics. In light of these considerations, the alleged "divergence" between physics and chemistry as separable "bodies of doctrine" could be viewed as a late-nineteenth-century construction motivated largely by the temporary decline of corpuscular theories and the rise of wave mechanics in physics. Similarly, the alleged "convergence" of the physical and chemical in the postquantum era could be a twentieth-century construction based on the revival of "corpuscular" theories in physics. Chemistry, as Chomsky emphasized, remained essentially unchanged throughout. It is at least questionable how much weight should be placed on these temporary phases to form a general conception of science as it develops over centuries. In fact, without this background continuum, the concept of body of doctrine with its "locality" does not make clear sense, just as clusters of South Pacific land masses are "islands" only in the context of the continuum of the ocean. I have no problem with the concept of body of doctrine
that separates the physical, chemical, biological, geographic, and so on with respect to a general conception of science emanating from a unitary source, here Newton. Otherwise, the concept just seems to label any inquiry whatsoever (astrological, sociological, economic, etc.) and is, therefore, without empirical force. In sharp contrast, there is neither any historical effort nor any contemporary evidence for us to be able to place studies on language somewhere in this continuum. There are two crucial points to this. First, language theory is envisaged here entirely in terms of its object, which is an abstract computational system with certain output properties and, I think, a possible range of application across related domains such as music, arithmetic, and logical reasoning: this object is called "grammar." It does not include the conceptual system, and its operations are fairly "blind" with respect to the band of information it computes on. We will see all this as we proceed. Second, the most amazing fact is that language theory is available in the Galilean style. As long as language theory was not there, we had some loose "philosophical" conception of a domain that also did not belong to the Newtonian continuum precisely because it was not a science at all. So, a very different issue opens up once the Galilean style began to apply to language, and language theory in its recent form emerged. On the one hand, the availability of the Galilean style surely signals the arrival of a science of language; on the other, this arrival has had no historical link with the only scientific continuum in hand, namely, the Newtonian one. In fact, there is a sense in which there indeed is a "continuum" in which to place language theory, as hinted in section 1.1. The curve begins with, say, Pāṇini, and continues through Aristotle, Port Royal, von Humboldt, Saussure, Turing, and so on, to lead to generative grammar. The continuum could well be called the "generative enterprise," in a wider sense. No doubt, this continuum, unlike the Newtonian continuum, is an abstract conception without direct historical-textual lineage. Its themes originated more or less independently in different textual traditions in India and Europe; the spirit of high ideas knows no boundary. With the intervention of German "orientalists," we can also think of the two traditions converging at Saussurean linguistics in the late nineteenth or early twentieth century. This enterprise flourished essentially independently of, and parallel to, the Newtonian continuum more or less throughout. The emergence of language theory shows, in hindsight, that it had always been a scientific enterprise. With two continuums in hand, various possibilities for unification arise. The present point is that such unification is
likely to be very different in structure from the ones within the Newtonian continuum. Nevertheless, Black's "isolationist" conception of science still holds fairly decisively with respect to his original example: Newton's theory of gravitation. It is well known that the postulation of universal gravitation immediately raised a storm of controversy strikingly similar to contemporary controversies around Universal Grammar. Much reordering has happened in science since Newton's formulation of the theory over three hundred years ago. Newton's original conception ("action at a distance") was replaced by the concept of a gravitational field. The concept of field was extended to the phenomena of electricity and magnetism, which were subsequently unified under Maxwell's laws. More recently, there has been further unification within physics of electromagnetic forces with forces internal to the atom. As noted, physics was unified with chemistry, and chemistry with molecular biology. Through all this turbulent history, gravitation remained an enigma; the concept just could not be coherently accommodated with the rest of physics (Held 1980; Hawking and Israel 1987, for history of gravitation theory). It continues to be a problem even in the recent advances in quantum field theory, the most general unified theory in physics currently available (Cao 1997). Roger Penrose (2001) formulates the general problem as follows. According to him, quantum theory and gravitation will be properly unified—that is, we can expect gravitational effects at the quantum scale—only in a "new physics"; the current scales of quantum theory and relativity theory are insufficient. If we take a free electron (current scale of quantum theory), we get the relevant quantum effects, but the gravitational effects are too small. If we take a cat (current scale of relativity), quantum theory produces paradoxes. So we settle for an intermediate scale, say, the scale of a speck of dust: "With a speck of dust you can start to ask the question, 'could a speck of dust be in this place and in that place at the same time?'" The point is that a speck of dust is not in the domain of either quantum theory or relativity theory. Pushing the analogy with gravitation further, it is of much interest that, for several centuries after Newton postulated the force, the physical character of gravitation, even in the field version, remained an enigma. That is, although the properties of gravitation itself were mathematically well understood and its empirical effect widely attested, no one really knew what it means for the physical universe to contain such a force. Albert Einstein finally characterized universal gravitation in terms of other parameters of the physical universe, namely, the spatial properties of
bodies. But, as Penrose's remarks suggest, in so doing he had to construct a theory that just would not mesh with the rest of physics. In other words, although Einstein explained the "evolution" of gravitation (from bodies), the theory of evolution he needed could not be arrived at from some theory of proto-gravitation already available in existing physics. There is a case, then, in which a body of doctrine has resisted unification with the rest of physics for over three hundred years, despite some of the most imaginative scientific reflections in human history. Note also that this problem has persisted in research essentially on "outer" domains. In contrast, the unification problem facing biolinguistics arises for an "inner" domain. Here a solution of the unification problem requires not only a radical shift in domains, but also that the purported solution works across "inner" and "outer" domains. It can only be ancient prejudice that the entire body of biolinguistics is often dismissed on the grounds that it does not come armed with a certificate from existing theories of organic evolution. Biolinguistics is a body of doctrines that is likely to remain isolated, in the sense outlined, from the rest of science far into the future. To emphasize, this conclusion is based on the history of science, namely, that the problem of unification between psychological studies and biology is as unresolved today as it was two centuries ago.12 The crucial recent dimension to this history is that psychological studies now contain a scientific theory, so there is a genuine partition in science. I will draw on this perspective a lot in what follows.
1.3.3 A Mind-Internal System
In the meantime, we can ask other questions about biolinguistics. Given its (current) isolation, a natural question is, "What is its reach?" This question can be rephrased as follows. According to Jenkins (2000, 1), the biolinguistics program was supposed to answer five basic questions:

(1) What constitutes knowledge of language?
(2) How is this knowledge acquired?
(3) How is this knowledge put to use?
(4) What are the relevant brain mechanisms?
(5) How does this knowledge evolve?

How many of these issues are within the reach of current biolinguistic inquiry? In view of the state of the unification problem, it is clear that substantive answers to questions 4 and 5 are currently beyond reach. This does not mean that no answers are available, especially for 4. For example, one could simply take the constructs of linguistics to be properties of brain states as currently understood, and proceed from there; some of that could already be happening in the brain-imaging literature.
To consider just one case, it is suggested that a system of neurons executes what is known in linguistics as the "trace deletion hypothesis" (Grodzinsky 2000). Similar proposals are routine in physics; geodesics and potentials are viewed as located all over the universe. Yet, the difference between physics and linguistics is that the universe is what physics says it is; there is no other account of the universe once we grant that scientific understanding is limited to its intelligible theories. But the brain is not what linguistics says it is.13 There are independent electrochemical-microbiological accounts of the brain on which these images take place. These accounts do not explain what it means for the trace deletion hypothesis to be executed there (Smith 2000). As for answers to question 5, until explanations are available via a "radically changed" biology envisaged above, what we have in hand for now, according to Chomsky (2002), are more like "fables" and "stories." Substantive work in biolinguistics, through all its phases, has been basically concerned with question 1: "What constitutes knowledge of language?" In the generative enterprise, this question was pursued in an interesting symbiosis with question 2, the issue of acquisition of language. As noted, the enterprise was directly concerned with what is known as the "problem of explanatory adequacy": "languages must somehow be extremely simple and very much like one another; otherwise, you couldn't acquire any of them" (Chomsky 2000a, 13). Question 1, therefore, was taken to be shorthand for "What constitutes the knowledge of language such that Plato's problem is solved in this domain?" In that sense, answers to question 1 have an immediate bearing on question 2: we would want to know if the abstract conditions on acquisition, which are postulated to answer question 1, are in fact supported by, say, child data. Thus within the computational-representational framework, interesting answers to question 2 flow directly from substantive answers to question 1 (Crain and Pietroski 2001 and references). However, beyond this framework, the problem of unification affects answers to question 2 insofar as we expect the issue of acquisition to be addressed in terms of the physical mechanisms of organisms. These remarks extend to question 3 as well. If the issue of language use concerns mechanisms, not to speak of actions, the unification problem blocks substantive answers. However, subtle questions of language use can be posed within biolinguistics itself. Recall that the design problem for the language faculty was posed in terms of language use: To be usable, the expressions of the language faculty (at least some of them) have to be legible to the outside systems. It looks as though the computational
system of the faculty of language is, in a way, sensitive to the requirements of the conceptual-intentional systems—the "thought" systems. For example, to meet its own conditions of optimality, the computational system sometimes places a linguistic object in a location where pragmatic conditions are also satisfied. Consider the so-called passive constructions in which the Object of the main verb moves to the Subject position: John read the book → The book was read by John. This is a purely grammatical phenomenon that I will look at in some detail from different directions later. Now consider the following sentences (Pinker 1995a, 228):

Scientists have been studying the nature of black holes. The collapse of a dead star into a point perhaps no larger than a marble creates a black hole.

The trouble with the "active" second sentence is that there is a discontinuity of topic with the first. The first sentence has introduced the topic of black holes and the second should take off from there. This is achieved by the passive sentence: A black hole is created by the collapse of a dead star into a point perhaps no larger than a marble. In that sense, substantive answers to question 1 also lead to interesting answers to question 3. Furthermore, these results, apart from suggesting a unified approach to questions 1–3, also offer a theoretically motivated division of the concept of language use. Pretheoretically, it is natural to view the question of language use thickly in terms of whatever it is that we do with language: express thoughts and feelings, talk about the world, ask questions, get someone to do something, and so on. In this picture, there is a cognitive system called "language" that we put to use in the external social setup to enable us to do these things. Therefore, question 3 ("How is this knowledge put to use?") is classically viewed as a question about, say, how we talk about the world: the problem of reference. As we just saw, a very different set of issues emerges when we frame the question of language use in terms of satisfaction of legibility conditions at the interfaces. Here the question, roughly, is about language-external but mind-internal systems that not only immediately access but also (partly) influence the form of the representations constructed by the language faculty. In this sense, in meeting the legibility conditions, language has already been put to use! But this concept of use, restricted to mind-internal systems, need not appeal to any concept of use that involves, say, the concept of reference; in other words, the question of how language is related to the rest of the world to enable us to refer is now delinked from the question of how language relates to mind-internal systems. This is fundamental progress, and it was achieved by answering
question 1 in minimalist terms. From this perspective, the classical problem of Intentionality—how language relates to the world—basically falls out of biolinguistics. Is there a meaningful problem of Intentionality outside of biolinguistics? Chomsky's (2000d, chapters 2 and 7) basic position is that words do not refer, people do; in fact, "I can refer to India without using any word, or any thought, that has any independent connection to it." "It is possible," Chomsky (2000d, 132) suspects, "that natural language has only syntax and pragmatics"; "there will be no provision" for what is assumed to be the "central semantic fact about language," namely, that it is used to represent the world. I return to these issues in chapter 3. Chomsky's conclusion may be viewed as a rejection of what Jerry Fodor and Ernest Lepore have called "Old Testament semantics": "Semantic relations hold between lexical items and the world and only between lexical items and the world." According to Fodor and Lepore (1994, 155), there is no semantic level of linguistic description: "The highest level of linguistic description is, as it might be, syntax or logical form." Roughly, the claim is that the output of grammar, LF, is not a semantic level.14 I will question this conception of LF in some detail in this work to argue that LF itself may be viewed as a (genuine) semantic level. Keeping to Fodor's conceptions of syntax and semantics, much of Fodor's recent work (Fodor 1994, 1998) may be viewed as a defense of Old Testament semantics—the study of language-world connections—against any other form of semantics such as those involving conceptual roles, exemplars, prototypes, and the like (Murphy 2002): call them "New Testament semantics." If Fodor is right in his rejection of New Testament semantics, and if Chomsky is right in rejecting Old Testament semantics, no intelligible concept of semantics survives outside the internalist concept of language use proposed in biolinguistics (Bilgrami and Rovane 2005). Beyond biolinguistics, vast gaps of understanding surround studies on language and related mental aspects of the world, even when we set aside the various dimensions of the unification problem. We now have some idea of the respects in which biolinguistics is isolated from the rest of human inquiry, including other inquiries on language. The twin facts of the isolation and the scientific character of biolinguistics raise the possibility that biolinguistics may have identified a new aspect of the world. I assume that we talk (legitimately) of an aspect of the world only in connection with a scientific theory of an advanced character with the usual features of abstract postulation, formalization, depth of
explanation, power of prediction, departure from common sense, and so on. This is what Black's notion of a "body of doctrines" implies, in my opinion. Chemical, optical, and electrical count as bodies of doctrines because, in each case, there is a cluster of scientific theories that give a unified account of a range of processes and events that they cover: the broader and more varied the range, the more significant the appellation "body of doctrines." With the exception of rare occasions of unification, science typically proceeds under these separate heads, extending our understanding of the aspect of the world it already covers. Thus, not every advance in science results in the identification of a new aspect of the world. It follows that, since biolinguistics is a science, it extends our understanding of some aspect of the world. However, since it is isolated from the rest of science, the aspect of the world it covers does not fall under the existing heads; therefore, biolinguistics has identified a new aspect of the world. We need to make (metaphysical) sense of the puzzling idea that the object of biolinguistics stands alone in the rest of the world. The obvious first step to that end is to form some conception of how biolinguistics works. As noted, biolinguistics attempts to solve a specific version of Plato's problem with explicit articulation of the computational principles involved in the mind-internal aspects of language, including aspects of language use that fall in this domain. We have to see how exactly meaningful solutions to Plato's problem are reached within these restrictions.
2 Linguistic Theory I
I provided a general historical overview of the generative enterprise in the last chapter; hence, I will skip discussion of the early phases of the enterprise (see Boeckx 2006, chapter 2). The current review thus goes straight into the principles-and-parameters (P&P) framework, developed in the early 1980s and generally viewed as a watershed in the short history of the field. The P&P framework essentially consists of two phases: an earlier phase known as Government-Binding Theory (G-B) and the current Minimalist Program (MP). I have organized the discussion of these phases as follows. Except for some brief remarks on MP near the end, I will concentrate on G-B in this chapter because most of the traditional issues concerning the notions of grammar, language, and meaning discussed in the next two chapters (chapters 3 and 4) can be addressed with G-B in hand. Once I have done so, I will return to MP in chapter 5 to show that it takes these issues to a different plane altogether. Moreover, although possible, it is difficult to describe the P&P framework directly with MP, just as it is difficult to describe the theory of relativity without grounding the discussion first in Newtonian theory (Piattelli-Palmarini 2001, 3). This is reinforced by the fact that, as we will see in section 5.2, each principle postulated in MP derives in one way or another from G-B itself (Boeckx 2006, chapter 3; also Hornstein and Grohmann 2006). Finally, a crucial discussion in chapter 5 will require a comparative study of the principles of G-B and MP; to that end, a discussion of G-B is needed in any case. I begin the discussion with a classic philosophical problem in this area—the scope problem raised by Bertrand Russell—to show how contemporary linguistics offers a marvelous solution to this problem. This is not the way Chomsky and other linguists would like to introduce their discipline.1 Needless to say, the basic scientific program throughout is to solve Plato's problem in the domain of language. Solution of classical
philosophical problems from within the same explanatory goals, then, ought to be viewed as a bonus; it also helps prepare the ground for questioning nongrammatical approaches to language.

2.1 Russell's Scope Problem
In an epoch-making paper, the philosopher Bertrand Russell (1905) raised the following problem. We know that France had ceased to be a monarchy at a certain point in history. Then what should be the truth value of sentence (1) uttered after that time?

(1) The king of France is wise.

Since there is no king of France, the sentence cannot be true. Can it be false? The trouble is that if (1) is false, then sentence (2), other things being equal, ought to be true.

(2) The king of France is not wise.

How can (2) be true if there is no king of France? Whose lack of wisdom is asserted here? After rejecting a number of obvious and, in my opinion, fairly plausible options, Russell suggested that the notation of (first-order) quantification be used to rewrite (1) as

(3) (∃x)(Kx & (∀y)(Ky → x = y) & Wx)

which means informally that there is a king of France, x, and that if anyone else, y, is a king of France then x and y must be one and the same; and also that x is wise. Russell thought that (3) captured the meaning of (1). Clearly, (3) will be false if there is no king of France. Then Russell suggested an ingenious move for (2). Thinking of not as the familiar negation operator (¬), Russell argued that there are two places in (3) where the operator can occur, giving rise to (4) and (5).

(4) (∃x)(Kx & (∀y)(Ky → x = y) & ¬Wx)
(5) ¬(∃x)(Kx & (∀y)(Ky → x = y) & Wx)

It is clear that (4) also will be false if there is no king of France, so that the uneasy issue of "who-is-unwise" does not arise. When (3) is false, (5) indeed is true, but it is not true of someone; hence, the problem disappears. Many authors, including myself, have held that, for definite descriptions, such a solution does not work and is perhaps not even required (Strawson 1950, 1961; Hawkins 1978; Heim 1982; Mukherji 1987, 1995, etc.). I postpone a discussion of this aspect of Russell's theory of definite descriptions
(see section 3.5.3). Nevertheless, notwithstanding the merits of the specific solution, Russell's analysis of the problem brought out a general feature of languages. The heart of Russell's solution was the observation that (2), a sentence of English, is structurally ambiguous. Example (2) is not ambiguous because some word(s) occurring in (2) are ambiguous; thus the ambiguity of (2) is quite different from the ambiguity of, say, The kite is flying high where kite could mean either a bird or an artifact. According to Russell, (4) and (5) represent disambiguated versions of (2) and, if his analysis is correct, we know the exact source of this ambiguity. The ambiguity lies in the relationship between the quantifier (∃x) and the negation operator (¬): in (4), the operator is inside the scope of the quantifier; in (5), the relation is reversed. Statements (4) and (5) bring out these facts formally by specifying the location/position of these items in each case. For example, in (4), the negation operator is said to have a "narrow" scope since it occurs to the right of the quantifier; the quantifier in (4) has a "wide" scope since it occurs to the left of the operator. Likewise for (5). This is the sense in which (2) is structurally ambiguous. Several interesting points emerge. First, the surface or phonetic form of (2) conceals the ambiguity. Second, the ambiguity affects, as we saw, how (2) is to be semantically interpreted—that is, whether (2) is true or false. Third, the semantic ambiguity can be traced to structural ambiguity in a suitable canonical notation. All of this led Russell to distinguish between the surface form of a sentence such as (2), that is, how the sentence looks and sounds, and its logical form(s) such as (4) and (5), that is, how the sentence is to be interpreted. Clearly, the distinction obtains for any sentence in any language even if a particular sentence is not structurally ambiguous: the (unique) logical form of the sentence tells us why. I have ignored what Russell took to be the central feature of the distinction, namely, that the unit The king of France is no longer represented either in (4) or in (5). This led Russell to conclude that The king of France is semantically insignificant; it is an "incomplete symbol." It is interesting that, in philosophical circles, this conclusion still generates a lot of heat (Buchanan and Ostertag 2005). Perhaps the most interesting point of the logical form (5) is that here the negation operator "¬" occurs at the front of the sentence whereas not occurred near the end of (2). Given that (5) represents a (canonical) interpretation of (2), it follows that the lexical item not has not been interpreted where it has been sounded. In that sense, the interpretation of not is displaced—that is, phonetic and semantic interpretations are assigned
at a "distance." Russell's analysis gives an explicit account of a specific displacement because of the specificity of his interests. Yet, perhaps fortuitously, Russell put his finger on a much wider phenomenon. Scope distinctions, whether or not they involve the particular operator and the quantifier mentioned above, can now be generally viewed as examples of displacement. Consider (6).

(6) Every boy danced with a girl.

Example (6) is clearly ambiguous between (i) every boy found at least one girl to dance with, and (ii) a girl is such that every boy danced with her. In fact, the phenomenon is even wider. Consider an active-passive pair such as (7) and (8).

(7) Bill has read the book.
(8) The book has been read by Bill.

The expressions the book and Bill continue to have the same interpretations—object and agent, respectively, of the action of reading—in (7) and (8), although they occupy very different positions in these sentences. Such examples give rise to the even more general idea that sound-meaning connections in natural languages are typically indirect in that one cannot read off the meaning of a sentence from its phonetic form; hence the need for canonical representation of meaning. In that sense, Russell opened our eyes to a fundamental feature of natural languages. This is perhaps the place to list some of the classic examples that have occupied linguists over the years to show just how widespread the phenomenon really is. Each of these displays various complex and indirect relationships between phonetic and semantic interpretation, thereby bolstering the argument from the poverty of stimulus: the stimulus properties of the datum supply insufficient evidence for the child to decide how a sentence is to be semantically interpreted.

(9) Flying planes can be dangerous.
(10) Shooting of the hunters disturbed Mary.
(11) The troops stopped drinking in the village.
(12) John is easy to please.
(13) John is eager to please.
(14) John is too stubborn to talk to Bill.
(15) John is too stubborn to talk to.
Examples (9), (10), and (11) are structurally ambiguous in various ways. For example, (9) might mean that it is dangerous to fly planes, or that planes that fly can be dangerous; in (10), what disturbed Mary could be either that hunters themselves were getting shot or the fact that hunters were shooting, say, birds; a similar analysis accompanies interpretations of (11). As the paraphrases show, these examples have similarities with scope ambiguities. The pairs (12)/(13) and (14)/(15), on the other hand, require major differences in underlying semantic interpretations despite very close similarities in their phonetic shapes. In (12), John is the Object of please, while in (13) some arbitrary individual(s) is the Object of please; similarly for (14) and (15). How good is Russell's own analysis of the general phenomenon even if we ignore the specific theory of descriptions? The trouble is that canonical forms such as (4) and (5) are central to Russell's approach, and it is not at all clear what is accomplished by these forms. As I view the matter, displacement is clearly a natural phenomenon demanding a principled account. For various reasons, it is rather doubtful whether Russell has given such an account. First, it is not clear in what sense, say, (4) and (5) give an account of (2). The notation of logical theory itself belongs to an artificial language. So what Russell has done in (4) and (5) is to capture two of his intuitive interpretations of (2) in two sentences of this artificial language. In effect, this exercise has, at best, the same force as two Hindi sentences displaying the ambiguity of (2). In fact, for those who understand Hindi, the latter exercise is likely to be preferred over Russell's "analysis" since Hindi speakers can depend on their own linguistic intuitions that are naturally associated with their knowledge of Hindi. This point is exemplified by the English paraphrases (i) and (ii) of (6). Clearly, these paraphrases are informal displays of the phenomenon; the question will be begged if they are thought to give an account of the interpretation(s) of (6). The logical notation, on the other hand, is a tool created by the logician and, hence, it is not associated with any linguistic intuitions independently of the intuitions of the native speaker of English. The fact that we "see" that (2), for instance, has the same meaning as (4) is not because we have some independent knowledge of the meaning of (4), but because we simply (and intuitively) associate (4) with our knowledge of the meaning of (2). In other words, if we are asked to explain the meaning of (4), the most we can do is to say that it has one of the meanings of (2). So if the task is to explain what knowledge of the meaning of (2) we have, then it is not accomplished by writing (4) down. We return to this issue in
chapter 3, after witnessing the explanatory power of an alternative framework in this chapter. Second, the logical notation does not even suffice as an adequate notational scheme. Even if we grant, say, that (4) and (5) give an account of (2) in some (yet to be clarified) sense, the account ought to begin with the structural features of (2) and lead systematically to the structural features of (4) and (5). This will be one way of "rationally reconstructing" the native speaker's knowledge of (2), notwithstanding the lack of explanatory force of this exercise. Any procedure that establishes a structural link between (2) and (4) cannot begin unless we have a proper syntactic characterization of (2). To have that characterization is to already have a canonical notation. Of course, once we have that characterization with a syntactic theory, we may use its resources to "plug in" logical formalism as systematically as we can. But then we need independent justification for the duplication of the effort as in much work in formal semantics, as we will see. Thus we want linguistic theory to solve two major problems simultaneously: the unique linguistic version of Plato's problem, and the scope problem as an instance of the general displacement problem. This project will take us to the end of this chapter. After completing the project, we return to a more detailed examination of the relations between logical and grammatical theories in the next chapter. I have just raised, and will continue to raise, a variety of objections against the use of logical theory to explain the workings of natural languages.2 None of this is meant as an objection to logical theory itself. I hold the development of mathematical logic through the work of Gottlob Frege, Bertrand Russell, Alfred Tarski, Kurt Gödel, Alonzo Church, Alan Turing, and a host of others, as one of the most significant achievements in the history of human thought. As an ardent student of logic, I cannot fail to admire the beauty of its constructions, its metaproofs, and a series of surprising results on the character of formal systems culminating in Kurt Gödel's mind-boggling work. But praise is due where it belongs. The basic objection I am raising is simple and, to my mind, pretty obvious. Almost all of the work in mathematical logic originated with an abstract and intuitive characterization of a small list of words from natural language: & for and, ¬ for not, ∀ for all, ∃ for some, → for if, (variable) x for it, and so forth. It was a stupendous feat to construct extremely complex systems and proof strategies from such a small basis. One of the natural requirements for these constructions was to keep the
basic formalism, including a scheme of interpretation, under strict control so that the vagaries of the basis do not contaminate the resulting constructions. It is not surprising, therefore, that the characterization of the basis not only required a prior intuitive understanding of certain English words; it reflected the meanings of these words only partially to enable the logician to reach a "core" meaning to which the logical concerns of deducibility, validity, and so on can be systematically attached. As Peter Strawson (1952, 148) described the logical enterprise, "the rules of the system" of logic ensured that the constants of the system formed "neatly systematic relations to one another" that the words of English do not have. By the same token, we cannot expect expressions of logic to give an account of the meanings of these English words. A very different form of inquiry is needed to explain the workings of natural language. In pursuing it, no doubt, lessons from the rest of human inquiry, including logical theory, will be drawn on. In fact, the basic explanatory format of linguistic theory, namely, the computational-representational framework, is adopted directly from a specialized branch of mathematical logic known as "computability theory" advanced by Gödel, Church, Turing, and others. But not surprisingly, the domains of application of mathematical logic and linguistic theory differ sharply.
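Since the model-theoretic content of Russell's proposal is very compact, it may help to see the scope facts run mechanically. The following is only a minimal sketch, not anything in Russell or in linguistic theory: the domain, the predicate names, and the encoding of uniqueness are all illustrative assumptions. It evaluates the narrow-scope reading (4) and the wide-scope reading (5) over a domain that contains no king of France.

```python
# A toy evaluation of Russell's two readings of (2) in a kingless domain.
# All names here (domain members, predicates) are illustrative assumptions.

def unique_king(domain, is_king):
    """Return the sole king if there is exactly one, else None."""
    kings = [x for x in domain if is_king(x)]
    return kings[0] if len(kings) == 1 else None

def narrow_scope(domain, is_king, is_wise):
    # (4): there is a unique king of France, and he is not wise.
    k = unique_king(domain, is_king)
    return k is not None and not is_wise(k)

def wide_scope(domain, is_king, is_wise):
    # (5): it is not the case that there is a unique king of France who is wise.
    k = unique_king(domain, is_king)
    return not (k is not None and is_wise(k))

domain = ["de Gaulle", "Sartre"]   # post-monarchy France, illustratively
is_king = lambda x: False          # no individual is king of France
is_wise = lambda x: True

print(narrow_scope(domain, is_king, is_wise))  # False: fails for want of a king
print(wide_scope(domain, is_king, is_wise))    # True: true, but not true *of* anyone
```

As the comments indicate, (4) comes out false and (5) true, which is just Russell's resolution of the puzzle about (2).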
2.2 Principles and Parameters
I now turn to the treatment of the scope problem in generative grammar.3 According to Chomsky (2000d), Universal Grammar (UG) postulates the following provisions of the faculty of language (FL) that enter into the acquisition of language:

A. A set of features
B. Principles for assembling features into lexical items
C. Operations that apply successively to form syntactic objects of greater complexity

CS, the computational system of a language, incorporates (C) in that it integrates lexical information to form linguistic expressions ⟨PF, LF⟩ at the interfaces where language interacts with other cognitive systems of the mind. Although there has been significant progress in recent decades on principles of lexical organization (Pustejovsky 1995; Jackendoff 2002; also Pinker 1995a for a popular review), linguistic theory has been primarily concerned with the properties of CS. In what follows, therefore, I
will also concentrate on CS to review the character of the principles contained there. The basic features of the P&P framework (Chomsky 1981) can be brought out as follows. We may think of four kinds of rules and principles that a linguistic theory may postulate. First, the formulation of some rules may be tied to specific languages; call them "language-specific rules" (LSR): relative clauses in Japanese, passivization in Hindi, and so on. Second, some rules may refer to specific constructions without referring to specific languages; call them "construction-specific rules" (CSR): NP-preposing, VP → V NP, and the like. (I am introducing this group for expository purposes. In practice, these rules often refer to language typologies—for example, VP → V NP holds only for head-first languages. It does not affect the discussion that follows.) Third, we may have rules that refer neither to specific languages nor to specific constructions, but to general linguistic categories; call them "general linguistic principles" (GLP): a lexical item may have a θ-role just in case it has Case, an anaphor must be bound in a local domain, there is a head parameter, and the like. Finally, we may have rules that simply signal combinatorial principles and general principles of interpretation without any specific mention of linguistic categories; call them "purely computational principles" (PCP): all elements in a structure must be interpretable, the shorter of two converging derivations is valid, and so on. I discuss this taxonomy from a different direction in section 5.2. The remarkable thing about current linguistic theory is that there is a real possibility that rules of the first two kinds, namely, LSR and CSR, may be totally absent from linguistic theory. The P&P framework made this vast abstraction possible. In slightly different terms than mine, Chomsky (1991a, 23–24) brings out the basic features of a P&P theory as follows. Consider two properties that descriptive statements about languages might have: a statement may be language-particular or language-invariant [±lp, where +lp = LSR], or it could be construction-particular or construction-invariant [±cp, where +cp = CSR]. Then, according to Chomsky, a P&P theory contains only general principles of language that are [−lp] and [−cp], and a specification of parameters that is [+lp] and [−cp]. Traditional grammatical constructions such as active-passive, interrogative, and the like are "on a par with such notions as terrestrial animal or large molecule, but are not natural kinds." Once the parameters are set to mark off a particular language, the rest of the properties of the expressions of this language follow from the interaction of language-invariant principles: "the property [±cp] disappears."
Figure 2.1 Government-Binding theory
However, the property [±lp] cannot disappear because of the existence of thousands of human languages. By hypothesis, principles of UG that solve this problem must be [−lp], but they cannot solve the problem so narrowly as to pick out the same language for any data since there is more than one language; hence [+lp]. Chomsky finds it "surprising"—perhaps a "defect" in the scheme of evolution—that there is more than one language (Baker 2001; Chomsky 2000b). To account for multiplicity of languages, the P&P framework suggests the following compelling picture of language acquisition: "We can think of the initial state of the faculty of language as a fixed network connected to a switchbox; the network is constituted of the principles of language, while the switches are the options to be determined by experience. Each particular human language is identified as a particular setting of switches" (Chomsky 1997, part 1:6). These ideas were first articulated in the scheme of grammar known as Government-Binding Theory (figure 2.1).
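Before turning to the details of G-B, the switchbox picture itself can be made concrete with a small sketch. Everything in it is an expository assumption rather than a claim of the theory: the single head-directionality parameter, the function name, and the "English-like" and "Japanese-like" mini-examples merely dramatize how one invariant principle plus a binary switch yields different surface orders.

```python
# A toy "network and switchbox": the principle (a phrase is a head combined
# with its complement) is invariant; a binary parameter, fixed by experience,
# determines only the linear order. Purely illustrative.

def phrase(head, complement, head_first):
    # Invariant principle: combine a head with its complement.
    # Parameter setting: decide which comes first.
    return f"{head} {complement}" if head_first else f"{complement} {head}"

# "English-like" setting: heads precede complements.
print(phrase("read", "the book", head_first=True))   # read the book
# "Japanese-like" setting: heads follow complements.
print(phrase("yonda", "hon-o", head_first=False))    # hon-o yonda
```

On this picture, "learning English" versus "learning Japanese" reduces, for this tiny fragment, to flipping one switch.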
2.3 Government-Binding Theory
Given the lexicon, the boldface levels represent the various structures generated by the computational system. Each of the surrounding principles
of UG mentioned in the boxes is language-independent, though as we will see, some of the "principles" (alternatively, "modules" of grammar) are subject to parametric variations. Move-α is the only transformational rule that converts phrase markers into phrase markers, where α could be any syntactic category. Such a rule will obviously overgenerate wildly. Principles of UG work in tandem to constrain free movement—that is, any category can be moved anywhere provided no principle of UG is violated. It is important to recognize that the boldfaced levels of representation are "events" in the sense that they are theoretical constructs entirely identified in terms of what happens at each level. What happens is that certain principles of UG progressively interact with the lexicon to generate certain structures. For example, d-structure is identified because this is where, among other things, X-bar theory holds; s-structure is identified because this is where Case theory and Binding theory apply; and so on. So if the course of theory requires some redesigning of the principles, then the nature of interactions and hence the character of a level will change. In principle, if nothing happens at a level, then the level does not exist. However, there is a distinction between the "inner" levels of d-structure and s-structure and the "outer" levels of PF and LF. What happens at PF and LF is that, apart from the application of various principles as in the "inner" levels, they interact with nonlinguistic components of the mind. Hence, in that sense, they must exist regardless of the course of the theory as long as the general organization depicted in figure 2.1 holds: PF and LF are theoretically indispensable since they are the interface levels of language. In other words, PF and LF are "conceptually necessary" while d-structure and s-structure, to which no other component of mind has access, are mere constructs of theory. This, among other things, led Chomsky (1993) to eliminate the intermediate levels in the Minimalist Program. Obviously, what used to "happen" at these levels now happens elsewhere, if at all, as we will see in chapter 5. The phenomenon of displacement, we saw, quite literally requires that elements be moved from one structure to form other structures geared to proper interpretation(s), the sense of proper varying with the nature of constructions. For example, scope distinctions typically require that elements be moved such that different semantic interpretations are generated; passivization requires, on the other hand, that elements are so moved and introduced that semantic interpretation remains unchanged. Therefore, we need to identify an initial form on which movement
takes place, and another form that results from the movement. We will see that semantic interpretation will require a further—LF—level of representation.
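The overall organization just described can be summarized in a toy pipeline. This is only a sketch of the flow in figure 2.1 under strong simplifying assumptions: the class name and stub functions are mine, the "movement" step is an identity map, and the actual G-B principles (X-bar theory, Case theory, Binding theory, and so on) are reduced to comments.

```python
# A toy pipeline for the G-B organization of figure 2.1:
# lexicon -> d-structure -> (Move-alpha) -> s-structure -> PF and LF.

from dataclasses import dataclass

@dataclass
class Structure:
    level: str
    form: str

def project_d_structure(lexical_array):
    # In G-B, X-bar theory and theta theory constrain this step.
    return Structure("d-structure", " ".join(lexical_array))

def move_alpha(d):
    # Move-alpha may relocate any category; UG principles (Case theory,
    # Binding theory, etc.) filter out illicit movements. Here: no movement.
    return Structure("s-structure", d.form)

def branch_to_interfaces(s):
    # s-structure feeds both interface levels.
    return Structure("PF", s.form), Structure("LF", s.form)

pf, lf = branch_to_interfaces(
    move_alpha(project_d_structure(["Bill", "read", "the", "book"])))
print(pf)
print(lf)
```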
2.3.1 D-Structure
How are the initial forms—d-structures—generated? It is natural to think of d-structures as generated from a system of rewriting rules called "phrase-structure rules," a sample of which is presented in (16).4 The fact that this rule system was essentially eliminated in favor of deeper principles over a quarter of a century ago is not yet fully recognized outside linguistics circles. Hence it will be worthwhile to spend some time on this development to appreciate one significant point of departure for the P&P framework.

(16) a. S → NP VP
     b. VP → V NP
     c. NP → DET N
     d. NP → N
Rules in (16) will analyze a sentence (S) down to the basic syntactic categories (V, N, DET), at which point lexical items may be suitably "inserted" to generate an initial (phrase-marker) representation. For example, the sentence Bill read the book has a structure that may be displayed either as a labeled bracketing (17) or, equivalently, as a tree diagram (17′). Note that the first line in (17) represents a sample of rules of lexical insertion for either diagram.

(17) N → Bill, book; V → read; DET → the
     [S [NP [N Bill]] [VP [V read] [NP [DET the] [N book]]]]

(17′) [tree diagram equivalent to the labeled bracketing in (17)]
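Since (16) together with the rules of lexical insertion in (17) amounts to a small rewriting system, the derivation can be simulated mechanically. The following minimal sketch in Python is illustrative only; the dictionary encoding of rules and lexicon is mine, not part of the theory. It expands S top-down and prints the labeled bracketing in (17).

```python
# A minimal sketch of the rewriting system (16) plus lexical insertion
# as in (17). The dictionary encoding is illustrative only.

RULES = {                          # phrase-structure rules (16a-d)
    "S":  [["NP", "VP"]],          # S  -> NP VP
    "VP": [["V", "NP"]],           # VP -> V NP
    "NP": [["DET", "N"], ["N"]],   # NP -> DET N | N
}

LEXICON = {                        # rules of lexical insertion, as in (17)
    "N": ["Bill", "book"], "V": ["read"], "DET": ["the"],
}

def derive(category, words):
    """Expand `category`, consuming `words` left to right.
    Returns (labeled bracketing, remaining words), or None on failure."""
    if category in LEXICON:        # basic category: insert a lexical item
        if words and words[0] in LEXICON[category]:
            return f"[{category} {words[0]}]", words[1:]
        return None
    for expansion in RULES.get(category, []):   # try each rewriting rule
        parts, rest = [], words
        for subcategory in expansion:
            step = derive(subcategory, rest)
            if step is None:
                break
            bracket, rest = step
            parts.append(bracket)
        else:
            return f"[{category} {' '.join(parts)}]", rest
    return None

bracketing, _ = derive("S", ["Bill", "read", "the", "book"])
print(bracketing)
# [S [NP [N Bill]] [VP [V read] [NP [DET the] [N book]]]]
```

Adding a recursive clause along the lines of (18) below turns this same finite machinery into one with infinite generative capacity.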
Since there is a finite enumeration of lexical items, the rule system (16) will generate only a finite number of "structure descriptions" (expressions)
such as (17) or (17′). The system can be given infinite capacity by adding a recursive clause (18c) and modifying rules (16c) and (16d) accordingly as in (18a) and (18b).

(18) a. NP → DET N′
     b. N′ → N
     c. N′ → N S

With proper addition of lexical items, this system will now have infinite generative capacity. For example, with the addition of lexical items belief and that, the rule system will generate a structure description for the belief that Bill read the book. With a few more additions to the lexicon, this system will generate sentences such as John's finding that the belief that Bill read the book surprised Tom interested Harry.5 The system will also be adequate in the sense that it can be used to represent all and only (d-structure) expressions of English.

The system of phrase-structure rules specifies the configuration—that is, the order—in which lexical items appear in syntax. Roughly, only those lexical items may be inserted via the rules of insertion for which predetermined positions are available in the structure. For example, the expression *Bill read the book the boy will be rejected since the extra element the boy cannot be inserted in the configuration. Thus both the number and the order of the constituents are exactly determined. If the phrase-structure system needs to be given up on independent grounds, these properties will have to be expressed by alternative means.

There are a variety of problems with this rule system. I will briefly discuss a few of them (see Chomsky 1972c; 1986, 56–64, 80–82, for more). Notice that the discussion here is restricted to whether phrase-structure rules are needed for d-structure representation. The wider issue of whether this system alone is adequate for syntactic description of languages is discussed in Chomsky 1957, with the conclusion that a transformational component is also needed. I take this conclusion for granted.

2.3.1.1 C-Selection

The system plainly has an ad hoc, taxonomic character in the sense that it is basically a list of categorial relationships of a preliminary sort. For example, the book is a part of speech consisting of two subparts the and book in that order. Rule (16c) simply states this observation in categorial terms. Thus the system, though observationally correct, is likely to have missed underlying uniformities and generalizations. Two of these underlying uniformities, among others, deserve immediate mention.
First, the lexicon, which the child has to acquire in any case, is richly structured (Chomsky 1965, 164–170). Each lexical item has three categories of information in a full dictionary: phonetic (how it is pronounced), semantic (what it means), and categorial or syntactic (how it is structurally characterized). Boy, for example, is categorially characterized as a noun; it has a certain sound; and it has semantic features such as +animate, +human, +male, and so on. More to the point, verbs and other basic categories carry categorial information regarding their C-selection ("C" stands for categorial); alternatively, basic categories carry information about what they subcategorize for. Apart from verbs, other basic categories such as nouns, adjectives, and prepositions also C-select: the destruction of the city, angry at Sam, out of the room, and so forth (Jackendoff 1983, 59–60). Keeping to verbs, consider the verb read. On acquiring the verb the child knows that it has the structure "x(read)y" whereas die has the structure "x(die)"—that is, read is transitive and die intransitive. However, the child also knows that transitivity comes in different forms. Notice the variety of things that follow the main verb in the following examples: read a book (NP), made me angry (NP, AP), went out of the room (PP), paid handsomely (AdvP), gave an apple to the boy (NP, PP), thinks that John is sick (Finite Clause), told Mary that John is sick (NP, Finite Clause), and persuaded Mary to visit John (NP, Infinitival Clause). These facts allow us to say that, for example, read C-selects an NP, persuade C-selects an NP and an infinitival clause, and so on. This the child already knows. Notice also that all of the examples listed above are VPs such that we get full sentences if we attach a Subject—say, Jones (NP)—in front of them. So, in effect, in learning a verb, the child has learned the categorial information encoded in a VP involving that verb.

Let us say a verb phrase has two major parts: a head and a complement. The head of a phrase is the most prominent element of the phrase in the sense that the head is what the phrase essentially is: the old man is a man, read a book is a reading, told Mary that John is sick is a telling. A complement is what the head C-selects. So various head-complement relationships that constitute VPs are listed, among other things, in the verbal part of the lexicon. To that extent, the phrase-structure rule (16b) [VP → V NP], for example, is redundant. However, unlike the rule, C-selection does not specify the direction in which the complement NP is to be found. Moreover, there are uniformities in the head-complement relationships across categories that phrase-structure rules miss. Generalizing from the
VP case, we can say that a noun phrase (NP) consists of a head noun followed by a complement. Now, notice the striking similarities between (19), (20), and (21).

(19) Bill [VP observed [S that Jamie was still awake]]
(20) the [N′ observation [S that Jamie was still awake]]
(21) Bill's [N′ observation [S that Jamie was still awake]]

Similar structures can be obtained for pairs such as claimed/claim, construct/construction, and the like. The noun observation that heads the structure [observation [S that Jamie was still awake]]—and that is the nominal form of the verb observe—has the same complement as the verbal head in (19). So the property of C-selecting a (finite) clausal complement really belongs to the word observe and its derivatives such that the forms (19)–(21) are largely predictable from lexical information alone. This generalization across phrasal categories is missed in the relevant phrase-structure rules. Also notice that (20) and (21) are NPs according to the phrase-structure rule (16c) read with (18): (NP → DET N′, N′ → N S). Hence, this rule is redundant as well insofar as it conveys categorial organization of NPs. Now if we simply adopt a global principle that links lexical information to syntactic constituency, then phrase-structure rules are not needed to that extent. This is achieved by the Projection Principle:

(22) Lexical structure must be fully represented categorially at every syntactic level.

The principle enables us to view syntactic structures as "projections" from the lexicon.

The structures (20) and (21) have other related interests. The value of DET in (20) is the definite article the while in (21) it is John's, a full noun phrase (Chomsky 1986). It also appears in structures such as (23),

(23) [NP John's [VP hitting the ball]]

where it occurs to the front of a VP rather than an N′ as in (21). This suggests that DET belongs to a category more abstract than the standard category of articles and that the category is available across phrases. Let us call this category "Specifier" (Spec), which in English occurs at the front of a phrase. Then both (21) and (23) have the structure [Spec [XP . . . ]], where "XP" is either an N′ or VP. The sequence Spec-head-complement then fully captures the categorial information of phrases such as (20), (21), and (23). Extending the idea, the element very in very
confident of his ideas may now be viewed as the Spec of an adjectival phrase headed by the adjective confident.

2.3.1.2 X-Bar Theory

These ideas can be explored further. Let us think of two basic category features, "N(ominal)" and "V(erbal)," each of which has binary (±) options. This will generate four phrasal heads as in (24); the traditional categories of noun, verb, and the like are no longer basic syntactic categories, but are mnemonics for a collection of syntactic features.

(24) +N, −V : N(oun)
     −N, +V : V(erb)
     −N, −V : P(reposition)
     +N, +V : A(djective)
The heads N, V, P, and A project four categories of phrases, namely, NP, VP, PP, and AP. Each of these will obey the general scheme Spec-head-complement suggested above. Further, it has been a discovery of some importance that various generalizations can be captured if the scheme has the following general pattern called "X-bar theory" (figure 2.2).

[Figure 2.2: X-bar organization]

The important point to note is that a head X projects a category X-bar consisting of X and its complement, while a further projection of X belongs to the category X-bar-bar (= XP, a phrase) consisting of the specifier and X-bar; hence, the Spec belongs to a level "higher" than the level of X and its complement. This hierarchy is of much use in defining various structural relationships between categories. For example, lexical information regarding C-selection simply says that a verb C-selects, say, an NP. It does not explicitly say which NP is C-selected, though there is an implicit suggestion that only that NP counts that is "related" to the verb somehow—the one that falls within the "domain" of the verb. How is this suggestion structurally realized? We may now use X-bar-theoretic notions to bring it out explicitly.

For two mutually exclusive elements α and β, let us say "α c-commands β" just in case every maximal projection dominating α dominates β. We say that an element α "governs" an element β just in case α and β c-command each other and α is a head—that is, α ∈ {N, V, A, P, Infl}. I explain Infl in a moment; for now, let us just concentrate on structural relations.

[Figure 2.3: Clause structure]

Thus, in figure 2.3, Infl c-commands V; V does not c-command either Infl or [NP, S], though V c-commands [NP, VP], which c-commands V as well. Therefore, Infl does not govern V but V governs [NP, VP]; [NP, VP] does not govern V since the former is not a head. It is clear, then, that C-selection is satisfied under government (see Duarte 1991 for more).
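Both structural relations are mechanically decidable over a labeled tree, which is one way to see that they are genuinely computational notions. Here is a minimal sketch under my own encoding (the parent pointers and the is_max flags marking maximal projections are illustrative conventions, not theoretical claims), checking the judgments just listed for the clause skeleton of figure 2.3.

```python
# Minimal sketch of c-command and government as just defined. The tree
# encoding (parent pointers, is_max flags for maximal projections) is an
# illustrative convention, not a theoretical claim.

HEADS = {"N", "V", "A", "P", "Infl"}

class Node:
    def __init__(self, label, parent=None, is_max=False):
        self.label, self.parent, self.is_max = label, parent, is_max

def dominates(upper, lower):
    cur = lower.parent
    while cur is not None:
        if cur is upper:
            return True
        cur = cur.parent
    return False

def ancestors(node):
    out, cur = [], node.parent
    while cur is not None:
        out.append(cur)
        cur = cur.parent
    return out

def c_commands(a, b):
    """a and b mutually exclusive, and every maximal projection
    dominating a dominates b."""
    if dominates(a, b) or dominates(b, a):
        return False
    maximal = [m for m in ancestors(a) if m.is_max]
    return all(dominates(m, b) for m in maximal)

def governs(a, b):
    """a and b c-command each other, and a is a head."""
    return c_commands(a, b) and c_commands(b, a) and a.label in HEADS

# Skeleton of the clause in figure 2.3: [S NP [Infl' Infl [VP V NP]]]
S     = Node("S", is_max=True)
subj  = Node("NP", parent=S, is_max=True)      # [NP, S]
inflP = Node("Infl'", parent=S)
infl  = Node("Infl", parent=inflP)
VP    = Node("VP", parent=inflP, is_max=True)
V     = Node("V", parent=VP)
obj   = Node("NP", parent=VP, is_max=True)     # [NP, VP]

print(c_commands(infl, V))   # True:  Infl c-commands V
print(c_commands(V, subj))   # False: VP dominates V but not [NP, S]
print(governs(infl, V))      # False: V does not c-command Infl back
print(governs(V, obj))       # True:  V governs its Object
print(governs(obj, V))       # False: [NP, VP] is not a head
```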
Recall that phrase-structure rules conveyed two sorts of information: categorial and configurational, namely, the nature and the order of the constituents. We saw above that the categorial part is captured in the lexicon. Capturing the configurational part is a complex problem. X-bar theory now may be viewed as solving part of this problem. The rest of the ordering, as we will see, will be accomplished by other "modules" of grammar—for example, Case theory.

Notice that X-bar theory is language-independent though it is associated with a few (probably, just two) options: the scheme could be left-ordered, head-first, as above for languages such as English and French, or it could be right-ordered, head-final, as in Japanese. The interesting fact is that all categories of phrases in a given language (typically) obey identical orientation. Children can fix the value of the orientation parameter for their language from very simple data such as drink milk. Thus, for English, C-selection in combination with X-bar theory now requires that the complement(s) be located to the right of the head.

The X-bar scheme can be easily generalized to sentences and clauses (figure 2.3). I have so far considered "proper" heads: N, V, A, and P. Verbal forms typically contain various kinds of information regarding tense, modality, aspect, and agreement. Depending on the language, some or all of this information is morphologically realized. Let us collectively call these elements "Infl" (for "inflection"). I will use Infl to represent just the tense information. The tense part of a verbal form is interesting in that it does not really modify the "basic" meaning of the verb. In some sense, ate means the same as will eat insofar as the action is concerned, but of course the timing of this action, past and future respectively, varies. In that sense, the tense information is detachable from the verb itself: Tense + V. It also stands to reason that this information is really related to how the whole sentence is to be interpreted. Thus, the difference between John ate a cake and John will eat a cake relates only to the difference in the timing of the event John's-eating-a-cake. The tense information, therefore, is central in distinguishing between these two sentences. By parity, the infinitival to is also central in the same sense since it suppresses timing information in the clause in which it occurs. On these considerations, the tense inflection Infl can be thought of as heading the sentence, where the sentence itself is the maximal projection of this head, Infl-bar-bar. The intermediate projection Infl-bar will establish the sisterhood between Infl and the complement VP. Thus, all requirements of the X-bar format are met (for more, see Jackendoff 2002, 6.2.2).

Furthermore, classical notions of Subject and Object, which play crucial roles in the syntactic description of languages (Chomsky 1965), can also be explicitly defined within the X-bar format. Subject is the NP immediately dominated by S—that is, [NP, S]. Object of the main verb is the NP immediately dominated by VP—that is, [NP, VP]. Following exactly the same line of reasoning and generalizing further, we postulate a sentence-external category Comp (for "complementizer"), which in English may be that, for, or null. Thus a clause, finite or infinitival, has the following structure:

(25) [S-bar Comp [S NP [Infl-bar Infl [VP V . . . ]]]]
Taking the complement of V to be NP, (25) will be the general representation for a tensed clause such as that English is a language as well as for an untensed clause like for John to visit the hospital.

The net result so far is a striking response to Plato's problem. A large part of the computational system becomes operative just as minimal data trigger off the categorial and inflectional systems and determine one of the binary choices of the X-bar template; the child must seek this data in any case.

Some words of caution are needed here. First, no doubt, the brief sketch of X-bar theory was designed to show how this part of grammar solves Plato's problem for the child by constructing a large chunk of grammatical organization when simple data for head orientation is made available to the child. This was not meant to show that the theory of parameters makes language learning trivial. After all, a grammar provides just an abstract format; it must be hooked on to the child's experience and other cognitive resources. For example, the child needs to analyze even this simple data (drink milk) to figure out the syntactic properties of heads and complements from the rapid use of words in the linguistic environment.6 Not an easy task (Baker 2001, 213; Fisher, Church, and Chambers 2004). Fisher et al. (1994) present evidence that while children find it relatively easy to acquire sound-noun connections from environmental contexts alone, acquisition of verbs—even simple ones like go, do, push—seems to present insurmountable problems. Still, most interestingly, the corpus of verbs in caregiver speech directed at very young children frequently includes abstract verbs like think and want (Snedeker and Gleitman 2004). Fisher and colleagues contend that verb learning requires considerable syntactic support. By "syntax" they basically mean argument structure with θ-roles filled in (see below).

Second, I discussed only the head-orientation parameter because it is easy to understand (and relatively uncontroversial), and it conveys the flavor of the rest of the machinery. Mark Baker's meticulous work across a wide variety of typologically different languages suggests that up to six parameters need to be fixed to reach a specific language type. As Baker (2001) shows, sets of parameters branch out in regular ways to determine language types such that some language types may not have more than one common parameter. The set of parameters for a given language type forms a hierarchy of increasing "depth," suggesting an increasing degree of difficulty for the child. With respect to Baker's hierarchy, his review of research on language acquisition suggests that while the head-orientation parameter is fixed as early as twelve months, the fixing of the "deepest" ones might be delayed until the thirtieth month.
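To get a rough feel for what fixing a single parameter involves computationally, here is a toy parameter-setter. Its input format is an idealization of mine in which each datum arrives already analyzed into head and complement, which, as just noted, is itself the hard part for the child.

```python
# Toy illustration of fixing the head-orientation parameter. The input
# format is an idealization: each datum arrives already analyzed into a
# head and its complement, which is precisely the hard part in reality.

def fix_head_parameter(data):
    """data: (head, complement, words_as_heard) triples."""
    for head, complement, heard in data:
        if heard == [head, complement]:
            return "head-first"    # e.g., English drink milk
        if heard == [complement, head]:
            return "head-final"    # e.g., Japanese order: milk drink
    return "unset"                 # no informative datum yet

print(fix_head_parameter([("drink", "milk", ["drink", "milk"])]))
# head-first
```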
Finally, I must mention that current research has already questioned some of the ideas sketched above. For example, Chomsky 1994a proposes a minimalist approach to X-bar theory leading to many modifications—in fact, a virtual abandonment—of X-bar theory; I discuss the new ideas later (section 5.1). In the current picture, the head parameter is not attached to the X-bar template since there is no such template. According to Chomsky 1995b, parameters are located in the morphological part of the lexicon, not in the computational system. More radically, there are proposals for a nonparametric conception of the faculty of language (Miyagawa 2006, cited in Chomsky 2006a).

From a very different direction, Richard Kayne (1994) proposes an elegant reformulation of phrase-structure theory from a single axiom, the Linear Correspondence Axiom (LCA), which states that "syntactic structure is universally and without exception of the form S-H-C" (specifier-head-complement) (Kayne 2004, 3).7 It follows, with wide support from crosslinguistic studies, that the distinction between head-last and head-initial languages may not be a primitive of syntactic theory. A closer look at syntactic heads across many languages suggests more fine-grained "microstructures." According to Jenkins (2004, xviii), the "fine-grained study of language from the microparametric point of view might be compared to the study of the 'fine-structure' of the gene which followed the earlier coarser approaches of classical genetics."

2.3.1.3 Theta Theory

Apart from information about what they C-select, verbs also carry information, perhaps more obviously, about what they S-select ("S" for "semantic"). Thus, looking back at the list of verbs with various C-selection properties, apart from saying that read C-selects an NP, it is natural to say that it S-selects a theme. (Perhaps it is more traditional to say that read S-selects an object. However, since object is also used for the functional categorial concept [NP, VP], I will use theme to avoid confusion. Sometimes I will also use object in an extended sense to include other complements such as [PP, VP] and [S′, VP] as well.) Extending the idea, we could say that hit S-selects a patient (who gets hit), plan S-selects a goal, believe S-selects a proposition, persuade S-selects a theme and a proposition, and so on. With one exception to be noted below, these verbs will also S-select an agent of the given action where the agency will be assigned to the element in the Subject position. Notions such as theme, goal, patient, proposition, agent, and the like, which verbs S-select for, are collectively called "thematic roles" (θ-roles). These notions allow us to view verbs as predicate-argument structures
familiar from logical theory: θ-roles are arguments that fill the relevant argument places in the matrix of a verb/predicate; similarly, for other lexical heads such as N, V, A, P, though I will keep to S-selectional properties of verbs only. If we include agency, then a sentence may be viewed as a predicate-argument structure as well. Clearly, there is a certain amount of redundancy between the C-selection and the S-selection systems; they basically establish head-complement relationships. In some places, Chomsky has suggested that the entire system of C-selection is possibly eliminable in favor of S-selection (Chomsky 1986, 86–90). In other places, some doubts have been raised about the alleged redundancy from the point of view of language acquisition (Chomsky and Lasnik 1993, 12–14). For now, let us hold onto two indisputable facts: thematic relations are available to children as part of their lexical knowledge, and thematic structure is intimately related to the language module though possibly separate from it. We might as well try to make some use of these facts in grammatical theory, especially if they help solve some computational problems.

As a step toward that end, let us ignore the conceptual aspects of the thematic structure by putting aside fine-grained conceptual distinctions between, say, themes, goals, recipients, patients, agents, experiencers, instruments, and the like, and work just with abstract θ-roles. In other words, theoretically we need not care which role is assigned to what argument as long as arguments are assigned roles. However, I will continue to use notions of agent, theme, goal, and proposition for expository purposes. Let us assume that these notions ultimately get interpreted when the grammar interacts with the conceptual system; in grammar, I assume that abstract thematic roles just play computational roles. In doing this, I am indeed invoking thematic roles "as a thinly disguised wild card to meet the exigencies of syntax" (Jackendoff 1990, 2.2).

Though it is part of lexical information that predicates S-select θ-roles for various arguments, arguments cannot be lexically so marked since arguments are structural entities. Thus θ-roles need to be assigned to designated arguments in a structure. Keeping to verbal predicates, the nature of assignment may be displayed cumulatively as follows (figure 2.4).

[Figure 2.4: θ-role assignment]

The agent role is assigned "compositionally" to the external argument by the phrasal node VP—while internal arguments are assigned θ-roles by the predicate itself. The figure shows which role is assigned to which category of argument, although these details are not computationally relevant, as noted. It is obvious that internal θ-role assignment takes place under government.
There is one universal exception to this system of assignment: if V is a passive participle—for example, kiss-en—then the external θ-role is not assigned; in other words, the passive morphology en absorbs the external θ-role. This idea surely has a stipulative character though its effect is widely attested empirically (Lasnik and Uriagereka 2005, 126–128). Also, there is a problem with data such as [John will kiss Mary], where the Infl will intervenes between the Subject John and the VP; this prevents the VP from assigning the external θ-role to John. This is one motivation for the VP-internal-Subject hypothesis (see below and Radford 1997).

With this system of assignment in hand, a pending problem can now be addressed. Recall again that phrase-structure rules carried two sorts of information: categorial and configurational. We are still in the process of fixing the configurational part. X-bar theory, we saw, solves part of the problem by imposing a hierarchy on categorial elements along with the orientation parameter. C-selection, via the projection principle (22), links lexical properties with the X-bar template. We know that the (transitive) verb visit takes an NP complement, so the VP visit John is well formed; the expression satisfies C-selection and X-bar requirements along with the head-first choice for English. Note that C-selection and X-bar requirements are satisfied once at least one NP "somewhere" to the right of the verbal head becomes available. The restrictions do not say anything regarding (i) how many NPs at most are to be available to the right of visit, and (ii) where exactly to the right of visit these NPs are to be located. Therefore, there is nothing in the requirements so far to prevent, say, *visit John an apple, which is not well formed; however, give John an apple is fine. Such facts are now easily explained via the thematic structure of verbs: an apple in *visit John an apple is not licensed since an apple is not a proper argument that visit S-selects for; give an apple to John is
well formed because give S-selects two θ-roles, perhaps a recipient and a theme, assigned here to John and an apple. In general, then, θ-assignments may be used as a check on the enumeration of arguments in a structure in answer to (i) above; note that (ii) has not been answered yet. θ-theory enables us to enumerate the arguments S-selected by the predicates in the sense that when each of the predicates has been inspected, the system knows exactly how many arguments are "expected" (Chomsky, Huybregts, and Riemsdijk 1982, 86–89). It is natural to take the next step to formulate a preliminary criterion called the "θ-criterion" as follows; "preliminary" because it will soon be replaced by a more unified idea (in (33)) below.

(26) Every argument must be assigned a θ-role and every available θ-role must be assigned to an argument.

Notice that this criterion does not prevent assignment of two θ-roles to one argument. Consider John left the room angry. Here John gets a θ-role compositionally from the VP left the room. Now compare this sentence with John made [Bill angry/leave], in which the Subject Bill of the "small clause" must have a θ-role that can only be assigned by the predicate angry or leave. So, by parity of assignment, the predicate angry in John left the room angry assigns a second θ-role to John (Chomsky 1986, 91). Intuitively, it is John who (both) left the room and was angry. Since (26) will ensure a tight fit between θ-roles and arguments, (26) may now be viewed as a constraint that every subsequent syntactic structure must obey; in fact it is reasonable to think that θ-theory was invoked essentially to enable us to formulate this criterion.

In combination with the lexicon, and subject to constraint (26), the projection principle thus displays the complete predicate-argument information at d-structure; a d-structure may therefore be thought of essentially as a θ-structure. In this display, there are positions that may possibly be occupied by arguments obeying (26). These are called "argument positions," A-positions. Nonargument positions are called A-bar positions; in particular, a Comp position is an A-bar position. Since d-structure is now viewed as a projection of lexical information, important consequences follow. For example, ignoring details of hierarchy, (27) will be the d-structure representation of the passive sentence Mary was kissed.

(27) e [Infl kiss-en Mary]

where en is the passive marker that attaches to the verb (see Lasnik and Uriagereka 2005, 125–128). Here kiss S-selects a theme that is assigned
to the argument Mary, as required. In (27), however, the empty element e is projected due to a general principle (28), which we may call the "Extended Projection Principle" (EPP).

(28) Every clause must have a Subject.

The lexicon must therefore be viewed as containing a finite enumeration of empty elements to be projected onto syntactic structure as and when required. The presence of two projection principles might look inelegant. So, it may be desirable to unify this principle with the projection principle (22) (Chomsky 1986, 116). In light of developments in MP, the idea seems unnecessary. The projection principle is really needed for the intermediate syntactic structures such as d- and s-structures; it is trivially satisfied (or not) at the interface levels. Hence if the intermediate structures are not needed otherwise, as in MP, this principle will be infructuous. The Extended Projection Principle, however, continues to be a crucial constraint on well-formedness on independent grounds, such as the existence of pleonastic Subjects in some languages such as English: there ensued a riot in Delhi, it rained heavily last night. I return to EPP in chapter 5.

Returning to (27), the Subject position is an A-position that is not assigned any θ-role due to the presence of a passive marker in the assigner (see the exception to figure 2.4). This allows the movement of Mary to the Subject position, leaving a new empty element, an NP-trace, behind. I return to the details of this movement after introducing Case theory. For the time being, notice that this movement gives a very natural explanation of why Mary, which is the theme/object of kiss, nevertheless occupies a Subject position in the passive sentence; also, the movement is essentially triggered by the passive morphology. We get the first glimpse of how different components of grammar interact to produce a grammatical "action."
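Read computationally, criterion (26) is a checkable condition on structures. The following minimal sketch treats θ-roles as mere countable slots, as suggested above; the θ-grid entries are my own illustrative stipulations, and the passive absorption of the external role is ignored here.

```python
# Minimal sketch of the preliminary theta-criterion (26), with theta-roles
# treated as bare countable slots. The theta-grid entries are illustrative
# stipulations; passive absorption of the external role is ignored.

THETA_GRID = {        # number of internal theta-roles S-selected
    "visit": 1,       # a theme
    "give":  2,       # say, a recipient and a theme
}

def satisfies_theta_criterion(verb, internal_args, has_subject=True):
    """Every argument gets a theta-role; every available role is assigned."""
    if len(internal_args) != THETA_GRID[verb]:
        return False              # unassigned role, or unlicensed argument
    return has_subject            # the external (agent) role needs a Subject

print(satisfies_theta_criterion("give", ["John", "an apple"]))   # True
print(satisfies_theta_criterion("visit", ["John", "an apple"]))  # False (*visit John an apple)
print(satisfies_theta_criterion("visit", ["John"]))              # True
```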
2.3.2 S-Structure
Returning to the problem of ordering of elements, once the system has an enumeration of arguments licensed by criterion (26), an order may now be imposed on them if some principle that links arguments with the relevant positions in the structure is found. Now, an enumeration of all and only positions is already available from the general structure of a clause satisfying X-bar theory. So the "relevant" positions will be a proper subset of these positions; these can only be A-positions. Therefore, a principle that maps the set of arguments onto the set of available A-positions will finally solve the problem that arose with the elimination of phrase-structure rules.
Thinking abstractly, it is clear that the principle must identify some property of arguments that is not satisfied until a given argument occupies a designated position. In some cases, the principle will be automatically satisfied as soon as the argument is displayed at the d-structure. In other cases, the argument will move to a suitable (and licit) position to satisfy the principle, causing another structure, s-structure, to form. Failing this, the argument cannot be licensed and the string in which it occurs will be rejected. So, by s-structure, all arguments must satisfy the principle; hence it is not relevant at the d-structure even though it may be satisfied at the d-structure.

2.3.2.1 Case Theory

The demand for such a principle is quickly met. Case is a familiar grammatical property that is classically linked to such functional notions as Subject and Object and is thus linked to designated positions in a structure. Thus, the Subject of a clause is said to have a nominative Case, the Object an accusative Case, the noun phrase of a PP an oblique Case, and so on. It is generally agreed that all languages have a Case system (Chomsky 1988, 101; Pinker 1995a, 115–117), though the system is not overtly realized fully in most languages. Languages such as Latin and Sanskrit have a rich inflectional system in which varieties of Case are morphologically realized. English, on the other hand, is a Case-poor language in that Case is overtly realized only in pronouns. Noting that only NPs have Cases and assuming universality of at least a core Case system, we may temporarily adopt principle (29), called the "Case Filter."

(29) Each phonetically realized NP must have Case.

Since Case is typically linked to functional notions, Case is also assigned to various designated NPs as follows.

(30) a. Tense Infl assigns nominative Case to [NP, S]
     b. An active verb assigns accusative Case to [NP, VP]
     c. A preposition assigns oblique Case to [NP, PP]
     d. Genitive Case is assigned without assigner to possessives
A special mention must be made of three categories that do not assign Case (a minimal illustration of the assignments and the filter follows below):

(i) [+N] categories—that is, nouns and adjectives—apparently do not assign Case
(ii) Infl does not assign Case when it has the infinitival value to
(iii) A passive particle does not assign but absorbs Case
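The assignments in (30), together with the non-assigners (i)–(iii), can be read as a simple lookup procedure over which the Case Filter (29) is then run. The encoding of assigners and positions below is invented for illustration.

```python
# Minimal sketch of Case assignment (30) and the Case Filter (29).
# The (assigner, position) string encoding is invented for illustration.

def assign_case(assigner, position):
    """Return the Case assigned to an NP at `position`, or None."""
    if assigner == "Infl-tensed" and position == "[NP,S]":
        return "nominative"      # (30a)
    if assigner == "V-active" and position == "[NP,VP]":
        return "accusative"      # (30b)
    if assigner == "P" and position == "[NP,PP]":
        return "oblique"         # (30c)
    if position == "possessive":
        return "genitive"        # (30d): assigned without an assigner
    return None  # covers (i)-(iii): [+N] heads, infinitival to, passive -en

def case_filter(overt_nps):
    """(29): each phonetically realized NP must have Case."""
    return all(assign_case(a, p) is not None for a, p in overt_nps)

# Mary next to passive kiss-en, as in d-structure (27): caseless
print(case_filter([("V-passive", "[NP,VP]")]))    # False
# Mary as Subject of a tensed clause: nominative
print(case_filter([("Infl-tensed", "[NP,S]")]))   # True
```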
Each of the exceptions (i)–(iii) is controversial. Setting complications aside, (i)–(iii) have far-reaching consequences. For example, (i) has the consequence that expressions such as the noun phrase translation [the book] and the adjectival phrase full [water] are correctly blocked. These expressions otherwise satisfy each of X-bar theory, C-selection, and S-selection. In English and in many other languages, these expressions require the insertion of a vacuous preposition of (translation of the book, full of water), which can now assign Case (Chomsky 1988, 110–112).

Consequences of (ii) are particularly interesting. By the Extended Projection Principle, every clause, including an infinitival clause, must have a Subject. So, the VP to go home needs a Subject in order to be licensed. Sometimes this Subject is an empty element called "PRO," as in John wants [PRO to go home]. By (ii), PRO will not have Case but that will not violate principle (29) since PRO is not a phonetically realized element. We will see that other considerations imply that PRO has "inherent" Case.

English and some other Case-poor languages have an additional requirement that Case is assigned to an adjacent NP—that is, nothing may intervene between an NP and its Case assigner. Let us think of it as a parameter of the universal Case theory. Thus, we cannot have *put (on the table) (the glass) while put the glass on the table is permitted. When this condition is combined with the orientation parameter of X-bar theory, a rather strict word ordering (and hence, uniqueness of positions) follows for English-like languages. Adjacency requires that Case is assigned to the complement that is immediately to the right or to the left of the head. English is a head-first language in that a complement is placed to the right of its head, so this position is uniquely designated for arguments to be licensed. The problem of configuration is thus finally solved. Languages that do not require Case adjacency will therefore have free word order to that extent (see Baker 2001 and Jackendoff 2002 for extensive discussion).

Turning to (iii) and returning to (27) (e [Infl kiss-en Mary]), it is clear that the argument Mary is not licensed since it does not have Case (Rouveret and Vergnaud 1980), showing that a d-structure is not subject to the Case filter. Hence, Mary must move—raise—to the only A-position available, namely, the Subject position, which is currently occupied by an empty element. Since this position may be assigned nominative Case, Mary gets the Case. We think of each movement as leaving a coindexed empty element—a "trace" of the vacated category—behind (Chomsky 1980, 146). Notice that movement vacates a category but does not eliminate it. Movement is essentially deletion and insertion. At the Object
position, Mary vacates the category NP where an empty element is inserted since the category continues to exist. At the Subject position, the movement deletes the existing empty element and Mary is inserted. The result is a new structure, an s-structure, roughly represented in (31).

(31) Mary_i [Infl kiss-en e_i]

When (31) is fed into the PF component, spelling rules finally generate the expression Mary was kissed. The interest here is that, of the two positions occupied by Mary and e respectively, the former has Case while the latter has a θ-role. Thus the chain (Mary, e), marked by coindexing in (31), satisfies both criterion (26) and the Case filter (29). Think of Mary as heading this chain and e as referentially dependent on Mary. For a simple sentence such as John likes Mary, each of John and Mary is assigned both Case and θ-role; thus, John and Mary may be thought of as single-membered chains. This analysis establishes a close link between θ-theory and Case theory: there seems to be a relationship of mutual satisfaction. Suppose we capture this by imposing a Visibility Condition (Chomsky 1986, 94).

(32) A chain is visible for θ-marking only if it is assigned Case.

So we now have three licensing conditions: θ-criterion (26), Case Filter (29), and Visibility Condition (32). They seem to say similar things in slightly different ways. Can they be put together in a better package? It is clear that once we have the interactive condition (32), much of the effect of (29) is already captured. The effect of (32) is to make sure that a chain will not be θ-marked unless it is Case-marked, which, in fact, is the requirement of (29). Since an interactive principle is always preferable over an isolated one, we adopt (32) and give up (29). The notions of chain and visibility may now be used to formulate a modified θ-criterion (33).

(33) A chain has at most one θ-position; a θ-position is visible in its maximal chain.
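The mutual satisfaction just described can be made concrete: a chain is licensed when some position supplies Case, rendering the chain visible per (32), and the chain contains its single θ-position per (33). The chain records below are my own illustrative encoding.

```python
# Minimal sketch of chain licensing under the Visibility Condition (32)
# and the revised theta-criterion (33). Chain records are invented:
# one dict per position, marking its Case (or None) and whether it is
# a theta-position.

def chain_licensed(chain):
    has_case = any(pos["case"] is not None for pos in chain)  # (32)
    theta_count = sum(1 for pos in chain if pos["theta"])
    # (33): at most one theta-position; an argument chain needs its
    # theta-role, so exactly one here.
    return has_case and theta_count == 1

# (31): (Mary, e) -- the head has nominative Case, the trace the theta-role
mary_chain = [{"case": "nominative", "theta": False},
              {"case": None,         "theta": True}]
# John in 'John likes Mary': a single-membered chain with both properties
john_chain = [{"case": "nominative", "theta": True}]
# a caseless chain: invisible for theta-marking, hence not licensed
caseless   = [{"case": None, "theta": True}]

print(chain_licensed(mary_chain))  # True
print(chain_licensed(john_chain))  # True
print(chain_licensed(caseless))    # False
```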
The s-structure (31), we saw, was formed due to the movement/raising of an NP. Movement, surely, is always from an A-position for, by now, we are familiar with a basic P&P idea that movement is always forced by licensing requirements that are not met at a d-structure, which, we saw, is essentially a θ-structure. This leaves two options: an element moves either to another A-position or to an A-bar position. NP-movement, an example of which we saw, is typically an A-to-A movement. There are other varieties of A-to-A movement that I put aside.

2.3.2.2 Wh-Movement

A typical example of A-to-A-bar movement is the movement of wh-phrases (WPs) in questions and relative clauses. We will look at the properties of the first kind of movement. There is a pretheoretical intuition that WPs are NPs that are sometimes, and in some sense, best thought of as direct objects of verbs: John ate what? is a natural response to the remark John ate five cockroaches. At other times, a WP is naturally viewed as the Subject of a sentence: Who ate five cockroaches? is also a natural response to the preceding remark. So the d-structures for sentences (34)–(36) will be something like (37)–(39), respectively.
(34) I wonder who John saw
(35) What is it easy to do today
(36) Who ate what

(37) [S-bar [S I wonder [S-bar [S John [VP Infl see who]]]]]
(38) [S-bar [S It is easy [VP to do what] today]]
(39) [S-bar [S Who [VP Infl eat what]]]

A number of interesting facts emerge. In the actual sentence (35) the WP has moved from the VP-internal position in (38) to the front of the main clause. For (34) as well the WP moves in (37), but it moves to the front of the embedded clause. In sentence (36) it seems that the first WP has not moved since it is already in a clause-front position; the second WP in (36) stays at the VP-internal position. How do we explain these movements, or the apparent lack of them? When WPs do move, where do they move to? A host of other questions arise, as we will see.

Although a WP is naturally viewed as a Subject or a direct Object, we cannot think of a WP as a (specific) agent or theme or goal. We will see shortly that WPs are ultimately interpreted as quantifiers. Hence they cannot continue to occupy the position of the complement of a verb where a specific thematic interpretation is typically assigned. Similar remarks apply to the Subject position. This is one of many arguments, leading away from the Standard Theory (Chomsky 1965), that suggest that a d-structure representation is not the proper vehicle for semantic interpretation since it suppresses the nonthematic character of wh-elements (Chomsky 1977).
A WP thus must move from the visible positions in (34)–(36) to a nonvisible position. Since A-positions are typically visible in a chain, a WP moves to an A-bar position, namely, a Comp position that is a clause-external position. The character of this movement is controversial, and it animates much discussion in the Minimalist Program (Hornstein 1995; Johnson 2000; Boeckx 2006). I note some of the controversies as I proceed.

In English, a WP raises to a Comp position. Which one? At this point the Subjacency Principle of Bounding Theory plays a crucial role. The theory requires the concept of bounding nodes: a single application of Move-α may not cross more than one bounding node. Intuitively, a bounding node determines the extent to which instances of Move-α apply in one stroke; in that sense, the principle has a "least-effort" flavor, to which I return in chapter 5. Initially, NP and S were taken to be the bounding nodes in English; other languages may have S-bar and NP as bounding nodes. Insofar as this is true, the module is possibly parameterized. I have assumed, following Chomsky 1986, that Subjacency applies between d- and s-structures; others argue that it applies between s-structure and LF. This issue, as well as the issue of where ECP applies (see below), is somewhat moot in the light of the Minimalist Program since, as noted, MP does not have s-structures.

In (38), there is one bounding node to cross for the only available Comp position at the front of the sentence; hence the WP adjoins to the front in one bound. In (37), the WP adjoins to the first of the available Comp positions: a possible "hopping" movement to the higher Comp is barred due to the lexical properties of wonder. In (39), the Subject WP adjoins to the only available Comp position, forcing the other WP to remain in situ at the s-structure, so this is how the sentence is going to be pronounced. The movement of who forces "superiority" (i.e., the leftmost WP moves first); this again has a "least-effort" flavor. The other WP what must now be assigned a specific θ-role contrary to the nonthematic character of such phrases. We will see that even this phrase will move covertly in LF to avoid the problem. Movement as usual creates coindexed trace elements. The resulting s-structures corresponding to (34)–(36), then, are roughly as follows:

(40) [S I wonder [S-bar [Comp Who_i [S John Infl see e_i]]]]
(41) [S-bar [Comp What_i [S it is easy to do e_i today]]]
(42) [S-bar [Comp Who_i [S e_i Infl eat what]]]
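Subjacency, so stated, is a counting condition and can be checked mechanically: list the nodes a single application of Move-α crosses and require at most one bounding node among them. The crossed-node lists below are rough hand transcriptions of mine, not computed from the structures.

```python
# Minimal sketch of the Subjacency condition: one application of
# Move-alpha may cross at most one bounding node. The crossed-node
# lists are rough hand transcriptions, not computed from (37)-(38).

BOUNDING = {"NP", "S"}   # bounding nodes for English

def subjacent(crossed):
    """True if a single movement step crosses <= 1 bounding node."""
    return sum(1 for node in crossed if node in BOUNDING) <= 1

# (38): 'what' reaches the front Comp crossing only the matrix S
print(subjacent(["VP", "S"]))                # True
# a one-stroke escape from an embedded clause would cross two S nodes,
# so movement must instead proceed Comp-to-Comp in short steps
print(subjacent(["VP", "S", "S-bar", "S"]))  # False
```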
So far two basic kinds of empty categories have been postulated: those that are projected by the lexicon essentially to satisfy θ-theory and those that are created by movement. The latter kind, namely, trace, again subdivides into two categories: those created by A-to-A movement and those created by A-to-A-bar movement; let us call them "NP-trace" and "wh-trace" respectively. We also saw briefly that the Subject position of infinitival clauses is sometimes occupied by an empty element PRO. Some languages—for example Spanish and Hindi (but not English)—have an additional empty element called (small) pro, which sometimes occurs as the Subject of finite clauses in the so-called null-subject languages. We will see that all these empty categories are classified into four basic kinds.

This proliferation of empty categories, forced throughout by theory as we saw, creates a problem for the language learner. Since these are not phonetically realized, how does the child interpret them? In fact, how does the child know that they are there? Speaking roughly, but quite correctly, interpretation of a sentence ultimately accrues from the meanings of words, which the child has to learn independently in any case. Thus, one of the principal goals of the P&P framework is to shift the child's burden only to the learning of words while the rest of the business of interpretation is placed on the universal principles of the computational system (Wasow 1985). Empty categories do not seem to fit this explanatory strategy.

The preceding way of stating the problem itself suggests how the problem is to be addressed. Empty elements will not be a problem if there are principled ways in which each empty element is shown to be linked to some phonetically realized element; in so linking, the empty element will be endowed with some "proxy" interpretation. In other words, the natural general idea is that empty elements be viewed, across the board, as dependent elements whose antecedents are ultimately some independently interpretable items. The qualification "ultimately" is related to the concept of maximal chain mentioned in connection with the revised θ-criterion (33). A chain may have more than two members: John seems e_1 to have been hit e_2 by a car contains the chain (John, e_1, e_2) headed by John. The last empty element e_2 ultimately receives its semantic interpretation via John. What the theory needs to do is to give a naturalistic account of which empty category is linked to what antecedent to receive which interpretation. In this way, the "burden" will remain with the computational system (Chomsky 1988, 90–91).

2.3.2.3 Binding Theory

In fact, there are also phonetically realized dependent elements in languages that require a similar account. Pronouns as in John thought that he [John] needed a shave and reflexive pronouns
as in John decided to shave himself are paradigmatic examples of such dependent elements (he may have a disjoint reference as in John thought that he [Bill] needed a shave). So ideally, instead of treating empty categories in a separate block, the theory should explain the general phenomenon of dependency. Perhaps, still more generally, the theory simply gives an account of how the NPs in s-structure are distributed. Much of this ideal is fulfilled in Binding theory, though some of the residual problems with empty categories are treated separately in the Empty Category Principle (ECP). Let us assume that all A-positions are freely indexed at s-structure, perhaps barring those, if any, that are already indexed by Move-α (which, we saw, may index some A-bar positions as well).

(43) Definitions: For categories α and β,
     a. α A-binds β just in case: (i) α c-commands β, and (ii) α and β are coindexed arguments.
     b. The governing category for α is the smallest NP or S containing α and the governor of α.

(44) Typology of arguments:
     +a, −p: anaphor (himself, each other, NP-trace)
     −a, +p: pronominal (he, him, them, pro)
     −a, −p: r-expression (John, the man, wh-trace)8
     +a, +p: pronominal anaphor (PRO)

(45) Principles of A-binding:
     Principle A: An anaphor is bound in its governing category.
     Principle B: A pronominal is free in its governing category.
     Principle C: An R-expression is A-free.

The preceding definitions and principles of Binding theory have natural explanations as follows. The general and, therefore, the minimal definition (43a) imposes two natural conditions: (i) a dependent element must occur in the domain of its antecedent and, since c-command is the widest grammatically salient concept of a domain, an antecedent must at least c-command its dependent; and (ii) among all the argument-NPs that occur within this domain, only those that are specifically related to each other count, so that an antecedent and its dependent(s) must be coindexed. Thus the minimal definition maximally captures the concept of grammatical binding as distinguished from, say, pragmatic binding.
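Principles A through C are decision rules over indexed structures, and their workings can be illustrated mechanically. In the minimal sketch below the encoding is mine: each NP record carries its type from (44), its index, and its governing category, precomputed by hand rather than derived from the tree via (43b).

```python
# Minimal sketch of Principles A-C in (45). The flat encoding is mine:
# each NP record carries its type from (44), its index, and its governing
# category (precomputed by hand rather than derived via (43b)). Each
# potential A-binder is a pair (index, categories containing both NPs).

def bound_in(np, binders, category):
    """A-bound within `category` by a coindexed c-commanding argument."""
    return any(idx == np["index"] and category in cats
               for idx, cats in binders)

def check_binding(np, binders):
    if np["type"] == "anaphor":        # Principle A: bound in its GC
        return bound_in(np, binders, np["gc"])
    if np["type"] == "pronominal":     # Principle B: free in its GC
        return not bound_in(np, binders, np["gc"])
    if np["type"] == "r-expression":   # Principle C: A-free altogether
        return not any(idx == np["index"] for idx, _ in binders)
    return False  # pronominal anaphors fall outside Binding theory

# John_i shaved himself_i : anaphor bound in the smallest S -- well formed
print(check_binding({"type": "anaphor", "index": "i", "gc": "S1"},
                    [("i", {"S1"})]))   # True
# *John_i shaved him_i : pronominal bound in its GC -- blocked
print(check_binding({"type": "pronominal", "index": "i", "gc": "S1"},
                    [("i", {"S1"})]))   # False
# Bill_i said that John shaved him_i : the binder lies outside the
# embedded GC (S2), so Principle B is respected
print(check_binding({"type": "pronominal", "index": "i", "gc": "S2"},
                    [("i", {"S1"})]))   # True
```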
Additional restrictions are needed to specify the binding relationships for various subclasses of NPs. To that end, two things are needed. First, we need some concept of "local domain," obviously narrower than c-command, to serve as a unit for computing dependencies for various subclasses of arguments. This is achieved in definition (43b). Second, we need a principled way of partitioning the class of arguments to be distributed. This is stated in (44). Following the paradigmatic cases of phonetically realized dependent elements as mentioned above, we think of two basic features with binary options: anaphoric (±a) and pronominal (±p). This generates four categories, as shown. I have also listed some suggestive examples alongside each category.

It is obvious that the category of pronominal anaphora cannot be treated in Binding theory since the definition of the category (+a, +p) requires that it is treated both as an anaphor and a pronominal and, hence, it is both bound and free in its governing category. The way out of this contradiction is to suggest that definition (43b) does not apply to this category—that is, the category is ungoverned. Since it is ungoverned it cannot receive Case. But Case theory requires that every phonetically realized NP must receive Case; hence, this category cannot be phonetically realized—that is, it is empty. The element that simultaneously meets these conditions is the empty element PRO that occurs as the Subject of an infinitival clause.

Returning to Binding theory, the rest of the cases are covered individually as required by (45). Principles A and B are quickly verified by the following examples.

(46) a. [S John_i shaved himself_i]
     b. [NP John_i's shaving himself_i]
     c. *Bill_i said that [S John shaved himself_i]
     d. *[S John_i shaved him_i]
     e. *[NP John_i's shaving him_i]
     f. Bill_i said that [S John shaved him_i]
     g. [S John_i shaved him_j]

Except for (46g), definition (43a) is satisfied in these cases since α, which is either John or Bill, is coindexed with β, which α c-commands. In each case, the α of definition (43b), which is either the anaphor himself or the pronoun him, is governed by the verb shave, and both it and its governor are contained in the smallest S or NP as indicated by bracketing; so definition (43b) is satisfied as well. In (46a) and (46b), the argument John correctly binds the anaphor himself, showing that both S and NP are governing categories. The anaphor cannot be bound by an argument outside these categories, as (46c) shows.
Hence Principle A is satisfied in both directions. Similar arguments extend to NP-trace. We put aside subtle issues regarding indexing that arise for such apparent failures of Principle A as [the children]_i thought that [S [NP pictures of [each other]_i] were on sale], called "long-distance binding," in which an anaphor each other is bound outside its governing category S (Chomsky 1986, 173).

The ungrammatical structures (46d) and (46e) show, on the other hand, that the pronoun him is not bound—that is, it is free—in either category. The pronoun is bound by Bill in (46f) but Bill lies outside the governing category. In contrast, (46g) is fine because John and him are not even coindexed; hence him is disjoint and definition (43a) does not apply. Therefore, Principle B is not violated. These examples suggest that anaphors and pronouns have a complementary distribution in that the domain in which an anaphor must find an antecedent to be licensed is the domain in which a pronoun must not have an antecedent.

R-expressions contrast with both anaphors and pronominals. Consider (47a), in which an anaphor each other, a reciprocal, and (47b), in which a pronoun them are correctly bound, obeying Principles A and B respectively.

(47) a. The musicians_i like [each other]_i
     b. The musicians_i wanted John to like them_i

Replacement of the bound element by an r-expression, say, the men, however, yields the ungrammatical expressions (48a) and (48b).

(48) a. *The musicians_i like [the men]_i
     b. *The musicians_i wanted John to like [the men]_i

Both (48a) and (48b) are fine if the men has a different index. So r-expressions are not only not A-bound in the governing category, they are not A-bound at all; r-expressions are A-free. The expressions in (48) thus violate Principle C. The importance of the qualification "A" in "A-free" is shown in (49b), where an r-expression the fool has an "antecedent" John but the relationship lies beyond Binding theory. In contrast, (49a) again shows a violation of Principle C (Chomsky 1986, 79).

(49) a. *John_i didn't realize that [the fool]_i had left the headlights on
     b. John_i turned off the motor, but [the fool]_i had left the headlights on

Interestingly, wh-traces are also r-expressions. Consider the examples in (50).
(50) a. *Who_i does he_i think e_i is intelligent
     b. *Who_i does he_i think that I like e_i
     c. Who_i does he_j think e_i is intelligent
     d. Who_i e_i thinks he_i is intelligent

(50a) and (50b) are examples of "strong crossover" in which a WP has crossed over the coindexed pronoun he while moving to the main-clause front. In (50a), the wh-trace is bound in its governing category while in (50b) it is A-bound by the pronoun he. The resulting expression in either case is ungrammatical. In (50c) and (50d), there is no crossover since either the pronoun is not coindexed as in (50c) or the pronoun has not been crossed as in (50d). In each case, the trace is bound by a WP from an A-bar position, not an A-position, showing that a wh-trace is A-free. Hence a wh-trace is covered under Principle C. The assimilation of an empty category in the class of such obviously referring expressions as John and the men has important theoretical consequences, as we will see.

The current point is that we are beginning to have a rather natural account of the chain (wh-, e). The original intuition that an empty category must be a dependent category is still upheld by the idea that wh-traces, I may now say, must be A-bar bound in order to be licensed; this, of course, cannot be said of other r-expressions discussed so far since they may not be A-bar bound. In fact, this property of a wh-trace lends further support to the idea that a WP must move to the Comp position; otherwise the trace cannot be licensed given Principle C. But this licensing requirement is not part of Binding theory and, as we will see in a moment, it is not even independently needed since it follows from a more restrictive condition called the "Empty Category Principle."

Each of the elements of the chain (wh-, e) is thus interesting: a WP is an NP that is not a genuine argument though it functions as a Subject or a direct Object; a wh-trace is also an NP that is an empty, and thus a dependent, element, though it is classed as an r-expression. These features will have a natural explanation when we get to LF-interpretations. So far the theory has not imposed any explicit criterion of dependency on wh-traces; we only know that these are A-free. Even if we uphold the implicit idea that wh-traces are A-bar bound, it still does not fully square with our original intuition that empty elements must be somewhat "narrowly" dependent since the only element a wh-trace is linked with is a rather "distant" element, namely, a WP in the A-bar position, especially if the WP has "hopped."

The general idea of "narrow" dependence of any category is largely captured in the concept of government since it imposes a local relationship
between two elements, a governor and a governee, where the governor is typically a lexical head and the governee a complement. However, a Subject is either ungoverned or is governed by Infl, not really a "proper" governor. This is particularly problematic for traces, since they do, on occasion, occupy the Subject position. Thus it is necessary to impose (51), the Empty Category Principle,

(51) A trace must be properly governed.

where "proper government" is defined as in (52).

(52) α properly governs β just in case
     a. α governs β and α ∈ {N, V, A, P}, or
     b. α locally A-bar binds β

So for wh-traces, three possibilities of well-formedness follow: (i) when a WP moves from an Object position, the trace will be governed by a lexical head that is typically a verb, (ii) when the phrase moves from the Subject position, it adjoins to the next Comp position, from where it locally binds its trace, and (iii) if the phrase has moved/hopped further, then the trace of the second movement locally binds the first trace. Consider (53).

(53) a. Who_i do you think Bill saw e_i
     b. Who_i e_i thinks that Bill is intelligent
     c. Who_i do you think e_i (e_i left)
     d. *Who_i do you think e_i that (e_i left)
The first three examples illustrate the listed options respectively; the last example shows the failure of option (iii). In (53a), the trace is properly governed by the verb see. In (53b), the trace is locally A-bar bound by the WP. In (53c), the trace of the second movement locally A-bar binds the first trace; note that the WP properly governs the second trace. In (53d), however, the complementizer that intervenes between the two traces to block proper government, as seen clearly in the tree diagram in figure 2.5 (adapted from Sells 1985).

[Figure 2.5: That-trace]
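Like the binding principles, the disjunction in (52) reduces to a mechanical check per trace. In the minimal sketch below the trace records are hand-made descriptions of the configurations in (53), not computed representations.

```python
# Minimal sketch of the ECP (51)/(52): a trace must be properly governed,
# that is, governed by a lexical head or locally A-bar bound. The trace
# records are hand-made descriptions of the configurations in (53).

LEXICAL_HEADS = {"N", "V", "A", "P"}

def properly_governed(trace):
    """(52a): governed by a lexical head, or (52b): locally A-bar bound."""
    return (trace.get("governor") in LEXICAL_HEADS
            or trace.get("locally_abar_bound", False))

def ecp(traces):
    return all(properly_governed(t) for t in traces)

# (53a): the trace is governed by the verb 'see'
print(ecp([{"governor": "V"}]))                                  # True
# (53b): a Subject trace, locally A-bar bound by the WP in Comp
print(ecp([{"governor": "Infl", "locally_abar_bound": True}]))   # True
# (53d): 'that' blocks local binding of the Subject trace, and Infl
# is not a proper governor
print(ecp([{"governor": "Infl", "locally_abar_bound": False}]))  # False
```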
2.3.3 LF
After witnessing the elaborate licensing conditions on s-structures, we face the general problem of interpreting s-structures. I will ignore the issue of phonetic interpretation and turn directly to issues regarding semantic interpretation. Over the last three decades, there has been a growing consensus that s-structures are not directly semantically interpreted. Rather, s-structures
feed into another level of representation to which, in turn, semantic interpretation is attached. The displacement problem and the consequent problem of indirect sound-meaning connections make the requirement for an additional level of representation after s-structures quite natural. As noted, it was found that d-structure representations are unsuitable for feeding into the semantic component—for example, wh-elements need to move in order to be interpreted. The general idea is that, in order to uphold indirectedness, computation must branch. As usual, this issue has become moot with the advent of MP. Since MP does not postulate ‘‘inner’’ levels of representation, it proposes that phonetic computation and semantic computation proceed in parallel, although computation does branch at a point called ‘‘Spell-Out,’’ where the phonetic features are stripped away to form a separate phonological representation PF. However, Spell-Out is not a level of representation. Since LF and PF are the only levels of representation in MP, it changes the mechanisms of computation fairly drastically. For example, we can no longer treat multiple-wh constructions at different levels. Keeping to G-B, another need for a separate level for semantic representation comes from a global constraint on phonetic and semantic representations. It is natural to expect that each of these levels will consist only of those elements that are relevant for the level concerned, such that if a level contains elements that cannot be interpreted at that level, then the representation will be rejected. Thus a principle of Full Interpretation
(FI)—a representation may not contain any vacuous element—may be invoked for the interface levels. Consider pleonastic elements such as it and there. Since English requires a lexical Subject, these elements appear at the d- and s-structures, as noted. They must also appear at the phonetic interface since English sentences contain them. However, they cannot appear at the semantic interface since they cannot be given semantic interpretation. Thus, in order to obey FI, they must be deleted before a representation is ready for semantic interpretation, although there is much controversy on how the deletion of pleonastic elements is executed. The same applies to the vacuous preposition of in destruction of the city, which was needed solely for licensing the city, as noted. From the other direction, if semantic representation requires introducing covert elements, such as wh-trace, as a necessary condition for semantic interpretation only, then such elements cannot be allowed to either remain or be introduced at the phonetic interface. It is natural, then, that FI is satisfied only if computations branch at the s-structure. Another argument for the existence of a distinct level of semantic representation is that meanings are largely ‘‘wired-in’’: we do not expect languages to vary significantly in the semantic component; languages vary in how they sound. What notion of ‘‘semantics’’ is at issue here? In the literature, ‘‘semantics’’ is standardly defined in terms of a list of phenomena: relative quantifier scope, scope of negation, modality, opacity, pronoun binding, variable binding, focus and presupposition structure, adverbial modification, and so forth (Hornstein 1995, 1). It is not immediately obvious that ‘‘semantics,’’ so characterized, ought to be ‘‘wired-in.’’ Why should, say, the properties of variable binding or adverbial modification be invariant across languages? It is frequently suggested by Chomsky and others that semantics, even in the restricted sense, is likely to be ‘‘wired-in’’ because ‘‘the most fundamental principles of semantics are . . . remote from the data available to the child (situations of utterance, the behaviour of other speakers etc.)’’ (Higginbotham 1985, 550, cited in Hornstein 1995, 3). The child has to form judgments about semantic organization essentially by observing what people are doing with language, and what people are doing is subject to so many different interpretations that it supplies only vague cues about how a sound is to be interpreted. The data for the sound systems, in contrast, are somewhat more directly available to the child in that the ‘‘perceptual apparatus just tunes in’’ (Chomsky, in Hornstein 1995, 4; Karmiloff-Smith 1992, 37; Ramus et al. 2000). In sum, the idea is that
the argument from the poverty of stimulus applies more strongly for semantics than for phonetics. The issues of whether LF exists and whether LF is language-invariant are distinct, though the idea of language invariance bolsters the existence claim. In view of indirectedness, it is natural to postulate a level of representation where all ‘‘grammatically determined information relevant to interpretation are consolidated’’ (Higginbotham 1985, 549), where ‘‘information relevant to interpretation’’ stands for the list of semantic issues drawn up above. Explicit empirical argumentation is then needed to determine if the postulation makes a difference in explanation, especially in terms of wider coverage of data, and whether the argument for language invariance is bolstered. There are a variety of interesting arguments to that end. I will barely scratch the surface below.9 A number of crosslinguistic studies conducted in recent years seem to supply the evidence needed. To take just one study that is widely mentioned in the literature, Huang (1982) suggested that languages such as Chinese and Japanese do not have overt wh-movement in syntax—that is, at s-structure. In other words, unlike English, WPs in Chinese and Japanese remain in situ in syntax. However, when the interpretations of sentences containing WPs are closely investigated, especially with respect to their scopal properties, it turns out that WPs in Chinese and Japanese covertly move to the clause boundary just as in English; there are other more complex considerations. Therefore, not only is interpretation assigned to a level of representation distinct from, but derived from, s-structure; the structure to which that interpretation is assigned also conceals differences between languages. There is good reason, then, to believe that there is a level of representation, to be called ‘‘Logical Form’’ (LF), that is largely language-invariant. Recall a pending problem concerning (42), repeated below as (54), in which the second WP of the sentence who ate what could not be moved out of the direct Object position at s-structure. It is a problem since, despite being a nonthematic item in the sense suggested, a WP cannot fail to be assigned a θ-role in that position.

(54) [S-bar [Comp whoi ][S ei Infl eat what]]

Following the example of covert wh-movement in Chinese, it is natural to think that the element what also moves covertly at LF. This results in the LF-structure (55).

(55) [S-bar [Comp whatj [Comp whoi ]][S ei INFL eat ej ]]
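Covert movement of this sort is mechanically simple: the in-situ wh-element is adjoined to the Comp position and a coindexed trace is left behind. The following minimal sketch in Python, with a flat-list encoding invented purely for illustration, derives something with the shape of (55) from (54).

```python
# A toy derivation of (55) from (54): the in-situ wh-element moves to
# Comp at LF, leaving a coindexed trace. Structures are nested lists.

def covert_wh_move(comp, clause, wh_word, index):
    """Adjoin wh_word to comp; replace its clause position with a trace."""
    moved = ["e" + index if w == wh_word else w for w in clause]
    return [[wh_word + index] + comp, moved]

# (54): [S-bar [Comp who_i][S e_i Infl eat what]]
comp_54 = [["whoi"]]
clause_54 = ["ei", "INFL", "eat", "what"]

print(covert_wh_move(comp_54, clause_54, "what", "j"))
# [['whatj', ['whoi']], ['ei', 'INFL', 'eat', 'ej']]
```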
In a similar fashion, we may now extend the analysis easily to various multiple wh-constructions such as those in (56).

(56) a. I wonder who gave the book to whom.
     b. Who remembers where John read what?

There are other interests here that are widely discussed in the literature. For example, some LF-representations of multiple wh-constructions seem to violate Subjacency. This suggests that Subjacency might hold only at s-structure and be violated at LF, just as Case theory holds at s-structure and is sometimes violated at d-structure. In English the wh-instance of the rule Move-α obeys Subjacency at s-structure, and ECP at LF. In the Minimalist Program, obviously, these issues of selective application disappear.

2.4 Grammar and Scope Problem
We can now appreciate how scope ambiguities of sentences containing quantified phrases (QPs), which were at the heart of Russell’s theory that brought us here, may be handled in the general theory of LF. In fact, the theory easily extends uniformly to various subclasses of quantifiers, to interactions between them as well as to interactions between quantifiers and WPs. These include number quantifiers (two), ‘‘strong’’ quantifiers (every, most), ‘‘indefinites’’ (a) and ‘‘definites’’ (the), among others. A small sample of simple examples is given in (57).

(57) a. Most Indians speak two languages.
     b. Every boy danced with a girl.
     c. Who gave a gift to every child?

To simplify matters, let us assume that this instance of Move-α—that is, quantifier raising (QR)—has exactly the same properties as wh-movement except that a quantifier (controversially) raises to the IP node.10 Let us define ‘‘quantifier scope’’ in terms of asymmetric c-command: a QP β is in the scope of a QP α just in case α c-commands β but β does not c-command α. Move-α (= QR) generates multiple structures from a single set of lexical elements, creating traces along the way, if all of the structures are individually licensed. The first two sentences in (57) have, roughly, the LF-representations (58a) and (58b). I am skipping fuller analysis of these structures since their basic forms should be fairly obvious by now. I will assume that scopal ambiguities of each sentence have been structurally represented with full generality, as promised above (section 2.1).
(58) a. LF1: [IP [most Indians]i [ei speak two languages]]
        LF2: [IP [two languages]j [IP [most Indians]i [ei speak ej ]]]
     b. LF1: [IP [every boy]i [ei danced with a girl]]
        LF2: [IP [a girl]j [IP [every boy]i [ei danced with ej ]]]
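Since the definition of quantifier scope above is stated over asymmetric c-command, and since the adjunction structures in (58) nest one QP inside the sister of the other, relative scope can be read off the nesting depth of the QPs. The following toy sketch in Python, with bracketings encoded as nested lists purely for illustration, does just that for (58b).

```python
# A toy computation of relative quantifier scope from LF bracketing: in
# adjunction structures like (58), a higher-adjoined QP asymmetrically
# c-commands everything inside its sister, so a QP at a shallower depth
# takes wider scope.

def scope_order(tree, qps, depth=0, found=None):
    """Return QPs sorted by depth: shallower = wider scope."""
    if found is None:
        found = []
    if isinstance(tree, str):
        if tree in qps:
            found.append((depth, tree))
    else:
        for child in tree:
            scope_order(child, qps, depth + 1, found)
    return [qp for _, qp in sorted(found)]

# (58b) LF2: [IP [a girl]_j [IP [every boy]_i [e_i danced with e_j]]]
lf2 = ["a girl", ["every boy", ["ei", "danced", "with", "ej"]]]
print(scope_order(lf2, {"a girl", "every boy"}))
# ['a girl', 'every boy']  -- the girl takes wide scope over every boy
```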
A natural way of initiating the discussion of the interpretation of these structures is to go all the way back to the pending problem of how to interpret the elements of the chain (wh-, e). Recall that a WP moves to the Comp position and a wh-trace is created at the original position. At that point, we just recorded that each of the elements of the chain (wh-, e) is interesting. Let us review some of the facts about the empty element e, a wh-trace. We know that e is an r-expression like John and the child since it is unlike an NP-trace in two respects: (i) e is A-free while an NP-trace is A-bound, and (ii) e is assigned both Case and θ-role while an NP-trace typically lacks Case. Thus, a WP-trace may be viewed as a ‘‘place holder for a true referential expression’’ (Chomsky 1988, 88). However, e is also unlike John and the child in that e must be locally A-bar bound either to a WP or to its trace; in other words, it must appear in a chain (wh-, e1, . . . , en). Turning to WPs, although a WP must head a chain containing wh-traces and, hence, is coindexed with them, coindexing here, unlike argument chains, does not amount to coreference since a WP is a nonreferring expression. In sum, a WP binds an element in its domain where the element is otherwise given full semantic interpretation. These are exactly the properties of quantifiers and bound variables in standard quantification theory. Though a bound variable, like a name and a free variable, can be assigned an independent interpretation (in a model under a system of assignment), it can only occur, unlike names and free variables, in association with a quantifier whose task is to bind the variable. So a WP and its trace are naturally viewed as a quantified phrase and a variable respectively. Let us examine this crucial point from a slightly different angle. It is well known that standard QP-constructions of English, such as all men are mortal, are naturally viewed as having a quantifier-variable structure in the sense just outlined. This explains why, despite serious problems noted above (section 2.1), canonical representations of natural-language sentences in the notation of first-order logic look so natural. Adoption of this notation helps explain the telling semantic differences between, say, all men are mortal and Socrates is mortal. All men are mortal expresses a general proposition—that is, a proposition that is not about anyone in particular. The proposition concerns, at best, a class whose members are
referred to, as Quine puts it, ‘‘with studied ambiguity.’’ Socrates is mortal, on the other hand, expresses a singular proposition since it is plainly about Socrates. This difference is traceable to differences between all men and Socrates and the differential relationship they bear to the predicate terms. All of this is clearly brought out in the notation of (restricted) quantification as follows:

(59) All men are mortal: (all x: man x)(x is mortal)
     Socrates is mortal: (Mortal)Socrates

Therefore, the idea that the chain (wh-, e) has a quantifier-variable structure will be bolstered if we can show explicitly that WPs are more naturally aligned with QPs than with names. Following Chomsky 1980, 160–162, consider (60) and (61).

(60) Johni betrayed the woman hei loved
(61) The woman hei loved betrayed Johni

The coindexing shows that in either case we may take the pronoun he to refer to John. Now, let us replace John in either case by a QP everyone to yield (62) and (63).

(62) Everyonei betrayed the woman hei loved
(63) *The woman hei loved betrayed everyonei

This time coindexing fails for the ‘‘quantified’’ version of (61). It is natural to explain these results in the following idiom. (62) allows the interpretation that for each person x, x betrayed the woman x loved. This interpretation is not allowed in (63); however, if he and everyone are not coindexed, (63) will allow the interpretation that for each person x, the woman he loved betrayed x, where he refers to someone else. Suppose we officially adopt this interpretive idiom and assign it to LF-representations. I have not given the full LF-representations for (62) and (63) because the essential forms can be easily constructed. In the LF-representation for (62), the QP everyone moves from the Subject position of the main clause to adjoin to the IP, leaving a coindexed trace behind; this creates the LF-structure [everyonei ei . . . hei . . . ]. In (63), the QP moves from the Object position of the embedded clause to the sentence front, and the coindexed trace appears at the sentence end; this creates the LF-structure [everyonei . . . hei . . . ei ]. (62) and (63) may thus be said to have the logical forms (64) and (65) after the idiom is assigned.
(64) for every person x, x betrayed the woman he loved
(65) *for every person x, the woman he loved betrayed x

In (64), the r-expression/variable x may be coindexed with the pronoun he in accordance with Principle B of Binding theory. However, the variable cannot be coindexed with he in (65), since it will result in ‘‘weak crossover.’’ The adopted notation thus enables a convergence of semantic facts with structural explanations. Also note that all and only elements of the respective LF-representations, including the empty elements, are given ‘‘logical’’ interpretation. Thus the LF-representations accord with the requirements of FI. Next, we replace John in (60) and (61) with the WP who such that, after fronting and inversion where required, we get (66) and (67).

(66) Whoi betrayed the woman hei loved
(67) *Whoi did the woman hei loved betray

Clearly, the relevant facts match exactly those for the QP-constructions in (62) and (63). A wh-element, then, is better viewed as a quantifier like every than as a name like John. Further, a wh-trace has the properties of a QP-trace. We get the logical forms for (66) and (67) simply by replacing every with which in (64) and (65). The quantifier-variable notation thus naturally extends to WPs as well, resulting in a large unification of theory.11 Returning to the examples in (56) and (57), the ground is now prepared for assigning interpretations to the various multiple-wh and multiple-QP constructions. For instance, (56a), I wonder who gave the book to whom, has the logical form (68).

(68) I wonder (for which persons x,y (y gave the book to x))

The sentence has a natural pair-list interpretation—that is, it is about two persons, one of whom gave a book to the other. This interpretation is concealed in the phonetic form but is brought out in (68). (56b), who remembers where John read what, on the other hand, admits of multiple interpretations depending on the relative scopes of the embedded WPs. Similarly, the first two examples in (57), repeated here in (69a) and (69b) respectively, and whose LF-representations were given in (58), may be assigned dual logical forms as follows.

(69) a. Most Indians speak two languages
        LFI 1: For most Indians x, x speak two languages
        LFI 2: Two languages y are such that for most Indians x, x speak y
     b. Every boy danced with a girl
        LFI 1: For every boy x, x danced with a girl
        LFI 2: A girl y is such that for every boy x, x danced with y

We recall that, notwithstanding glaring problems, one of the reasons for the persistent use of logical notation is that it does seem to give a natural account of an aspect of language, namely, the crucial distinction between names and quantifiers. The quantifier-variable notation of first-order logic thus looked indispensable. We just saw that the structural aspects of this notation can be smoothly incorporated and even extended—to wh-constructions, for example—within grammatical theory itself as a part of the solution to Plato’s problem. The present discussion of linguistic theory started as a challenge to provide an alternative to Russell’s way of handling scope distinctions for QPs. So far I have been discussing grammatical theory basically within the G-B framework. I have mentioned some of the central features of the Minimalist Program (MP) only in passing. As noted, I will discuss MP more fully in chapter 5. However, it seems to me that MP can be used at this point to highlight the contrasts between Russell and grammatical theory even more dramatically. For this limited goal, I will describe very briefly how scope distinctions are represented in MP with an example. For a fuller understanding of this example, readers might want to take a quick look at chapter 5 first. A basic idea in MP is that phrases must get their uninterpretable features checked in syntax to meet FI, which is an economy condition. For example, noun phrases have (semantically) uninterpretable Case features that need to be checked. For checking, another instance of the feature is to be found in a local domain, say, c-command. Sometimes, NPs need to displace from their original position to meet this requirement. Displacement, needless to say, must meet economy conditions such as (the MP version of) subjacency. Now, consider a simple multiple-quantifier sentence such as (70). I will follow the analysis in Hornstein 1995, 155; 1999. I wish to emphasize the character of Hornstein’s analysis, rather than its internal details that are controversial (Kennedy 1997).

(70) Someone attended every seminar.

It is obvious that the sentence is doubly ambiguous and we will want the theory to generate the two (desired) interpretations of the sentence. As
noted, Subjects such as someone are initially located inside the verb phrase itself: the VP-internal-Subject Hypothesis. For feature checking, particularly Case checking, the Subject someone and the Object every seminar must move outside the VP. Notice that this is the familiar NP movement, not QR, which Hornstein (1995, 153–155) rejects on minimalist grounds. Since movement is copying, a copy of these items is inserted above the VP-shell. Ignoring details, this results in (71).

(71) [Someone [every seminar [VP someone [VP attended every seminar]]]]

A general economy condition on interpretation requires that no argument chain can contain more than one element at LF; this accords with the principle of Full Interpretation, noted above. Since (71) contains two such argument chains ((someone, someone); (every seminar, every seminar)), one member from each needs to be deleted. This generates four options (72)–(75), where deleted elements are marked here in angle brackets.

(72) [Someone [every seminar [VP ⟨someone⟩ [VP attended ⟨every seminar⟩]]]]
(73) [Someone [⟨every seminar⟩ [VP ⟨someone⟩ [VP attended every seminar]]]]
(74) [⟨Someone⟩ [⟨every seminar⟩ [VP someone [VP attended every seminar]]]]
(75) [⟨Someone⟩ [every seminar [VP someone [VP attended ⟨every seminar⟩]]]]

Another constraint on interpretations says that arguments with strong quantifiers must be interpreted outside the VP-shell: the Mapping Principle. The requirement is purely grammatical. Consider, there [VP ensued [a/some fight(s) in Delhi]], in which the determiners a/some occur inside the VP. We cannot replace a/some with the/every/any (‘‘strong’’ quantifiers). This grammatical fact, noticed by Milsark (1974), is captured in the strong/weak classification of quantifiers (Reuland and ter Meulen 1987; Hinzen 2006, 5.5). In Diesing 1992, chapter 3, this fact is (ultimately) explained via a distinction between the IP-part and the VP-part of a syntactic tree (see section 3.3). Strong quantifiers are said to presuppose the existence of entities (in the discourse): they are discourse-linked (d-linked). D-linked quantifiers can be interpreted only when they occur in the IP-part.
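The deletion-and-filtering procedure just described is small enough to be enumerated mechanically. The following sketch in Python, a toy rendering of the steps above rather than Hornstein’s own formulation, generates the four options and discards those in which a strong quantifier survives inside the VP-shell, anticipating the conclusion drawn below.

```python
# A sketch of the copy-deletion procedure: each argument chain must shrink
# to one copy (Full Interpretation), and a strong quantifier surviving
# inside the VP-shell is uninterpretable (the Mapping Principle). The four
# options correspond to (72)-(75); filtering leaves two readings of (70).

from itertools import product

chains = {
    "someone":       ["IP", "VP"],   # positions of the two copies
    "every seminar": ["IP", "VP"],
}
strong = {"every seminar"}           # d-linked, strong quantifier

readings = []
for choice in product(*chains.values()):        # pick a surviving copy
    survivors = dict(zip(chains, choice))
    if any(survivors[qp] == "VP" for qp in strong):
        continue                                # Mapping Principle fails
    readings.append(survivors)

print(readings)
# [{'someone': 'IP', 'every seminar': 'IP'},   corresponding to (72)
#  {'someone': 'VP', 'every seminar': 'IP'}]   corresponding to (75)
```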
Following Hornstein, I am using only the grammatical part of the argument. Since strong quantifiers such as every are d-linked, (73) and (74), in which the strong quantifier occurs inside the VP-shell, have no interpretation. This leaves (72) and (75) as the only interpretable structures, just as desired. This is one way—Hornstein’s—of accomplishing the task; there are others (Lasnik and Uriagereka 2005, 6.3–6.9). Also, as noted, we need not accept all the details of the argument. For example, we may doubt if the phenomenon sought to be covered by the Mapping Principle requires an independent principle. Still, the analysis does show that there are purely grammar-internal grounds—feature checking and Full Interpretation—on which such displacements take place. However, in satisfying them, some requirements of the external C-I systems—for example, d-linked items can only be interpreted at the edge of a clause—are also met. In this sense, conditions on meaning follow from the satisfaction of narrow computational properties of the system. In sum, all that Russell achieved by imposing a logical notation has been achieved without it. To emphasize, all of this is a bonus: the system is basically geared to solve Plato’s problem. Can we conclude already that grammatical theory itself qualifies as a theory of language? Traditionally, grammars are not viewed as capturing the notion of meaning; grammars are viewed as representing the ‘‘syntax’’ part of languages, where ‘‘syntax’’ contrasts with ‘‘semantics.’’ For example, it would be said that even if (72) and (75) capture the structural aspects of the relevant scope distinction, we still need to say what these structures mean. As noted, Fodor and Lepore (1994, 155) hold that ‘‘the highest level of linguistic description is, as it might be, syntax or logical form: namely a level where the surface inventory of nonlogico-syntactic vocabulary is preserved;’’ that is, linguistic description up to LF is syntactic—not semantic—in character. Even biolinguists seem to waver on this point: thus, Fox (2003) holds that LF is a syntactic structure that is interpreted by the semantic component.12 So the demand is that we need to enrich grammatical theory with a theory of meaning or a semantic theory to capture human semantic understanding; only then do we reach a proper language system. A number of familiar questions arise. Does the notion of grammar capture some notion of meaning? Beyond grammar, is the (broader) notion of language empirically significant? What are the prospects of ‘‘enriching’’ grammatical theory with postgrammatical notions of meaning? I turn to these and related questions in the next two chapters.
3
Grammar and Logic
‘‘Semantic understanding’’ is a catchall expression for whatever is involved in the large variety of functions served by language. The rough idea is that grammatical knowledge somehow interacts with our reasoning abilities concerning whatever we know about intentions, desires and goals, the world, cultures, and the like. It is assumed that, somewhere in this vast network of knowledge, there is a semantics-pragmatics distinction. Faced with such imprecision, we need some general plan to be able to raise specific questions on the issues facing us. Given the vastness of the semantic enterprise and the limitations of space here, I need to state, before I proceed, what exactly my goals are. As noted in various places, a central motivation for this work is to see if the concept of language embedded in grammatical theory stands on its own as an adequate theory of human languages. Keeping to the meaning side, in effect the issue is whether the scope of grammatical theory needs to be expanded at all by incorporating what are believed to be richer conceptions of meaning. I assume that two broad semantic programs are immediately relevant here: formal semantics and lexical semantics. To urge the adequacy of grammatical theory, I will argue in this chapter that the sharp traditional distinction between syntax and semantics, explicitly incorporated in logical theories, does not plausibly apply to the organization of grammatical theory; in that sense, grammatical theory contains a semantic theory. The rest of the discussion attempts to ‘‘protect’’ this much semantics. Thus, next, with respect to some classical proposals in formal semantics, I will argue not only that some of the central motivations of formal semantics—for example, the need for a canonical representation of meaning—can be achieved in grammatical theory, but that it is unclear if formal semantics carries any additional empirical significance. In that sense, I will disagree with the very goals of formal semantics.
In the next chapter, in contrast, I will agree that, from what we can currently imagine, a theory of conceptual aspects of meaning is desperately needed, but, with respect to some of the influential moves in lexical semantics, there are principled reasons to doubt if the data for lexical semantics is theoretically salient, and if we can take even the first steps toward an abstract, explanatory theory in this domain at all. As indicated in chapter 1 (sections 1.2, 1.3.3), I am indeed rather skeptical about all nongrammatical approaches in semantics that work either with constructs of logic or invoke concepts; it is hard to see what is left. However, proving skepticism is not the agenda here. All I want to do within the space available is to bring out enough foundational difficulties in the current form of these approaches to motivate a parting of ways from traditional concerns of language theory, while the search for deeper theories in formal and lexical semantics continues.

3.1 Chinese Room
I wish to take a detour to develop an image that might help us secure a firmer perspective on how to conceptualize the semantic component of language. In a very influential essay, the philosopher John Searle (1980) proposed a thought experiment to evaluate the scope of the general idea that certain mental systems can be viewed as symbol-manipulating devices. We saw that this idea certainly guides biolinguistic research. Searle invites us to imagine a room that contains a monolingual English speaker S, a number of baskets filled with Chinese symbols, and a ‘‘rulebook’’ that contains explicit instructions in English regarding how to match Chinese symbols with one another. Suppose S is handed a number of questions in Chinese. He is then instructed to consult the rulebook and hand out answers in Chinese. Suppose the Chinese speakers find that these answers are eminently plausible; hence, S passes the Turing test (Turing 1950). Yet, according to Searle, for all that S knows, he does not understand Chinese. He simply matched one unintelligible symbol with another and produced unintelligible strings on the basis of the rulebook. A symbol-manipulating device, therefore, cannot represent genuine understanding. Since Chinese speakers by definition understand Chinese, Chinese speakers cannot (just) be symbol-manipulating devices.
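The rulebook, whatever its actual content, is often pictured as nothing more than a lookup table. The following caricature in Python, with placeholder strings standing in for Chinese expressions, brings out the point at issue: the procedure matches symbol shapes without any access to what the symbols stand for.

```python
# A minimal caricature of the rulebook: a table mapping input symbol
# strings to output symbol strings. The strings are arbitrary placeholders,
# not actual Chinese; the procedure has no access to what they stand for.

RULEBOOK = {
    "SYMBOL-STRING-1": "SYMBOL-STRING-2",
    "SYMBOL-STRING-3": "SYMBOL-STRING-4",
}

def chinese_room(question):
    """Return a 'plausible answer' by pure symbol matching."""
    return RULEBOOK.get(question, "SYMBOL-STRING-0")  # default response

print(chinese_room("SYMBOL-STRING-1"))  # SYMBOL-STRING-2
```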
taking computer simulations—whether computers can be programmed to mimic human understanding—too realistically, then we might readily agree with the spirit of the argument (Chomsky, Huybregts, and Riemsdijk 1982, 12). If, on the other hand, Searle’s argument is designed to be a global refutation of computational theories of mind and language, then we would want to be clear about several details. To mention the most obvious of them: What is the content of the rulebook? What concept of S’s understanding enters into his understanding the instructions of the rulebook? Why should we infer, from the activity of matching ‘‘unintelligible’’ symbols, a total lack of understanding on S’s part? In any case, these questions have been extensively discussed in the literature (Dennett 1991; Block 1995, etc.). Instead, I will be concerned directly with the general conclusion Searle draws, notwithstanding its source. Thus Searle (1990, 27) says, ‘‘Having the symbols by themselves—just having the syntax—is not sufficient for having the semantics. Merely manipulating symbols is not enough to guarantee knowledge of what they mean. . . . Syntax by itself is neither constitutive of nor sufficient for semantics.’’ Searle offers what we needed: some concept of syntax and some way of distinguishing it from semantics. Unfortunately, the concept of semantics continues to be uncomfortably thick: all we have been told is that semantics is not syntax. But, at least, we have been told what syntax is: it is mere manipulation of symbols. Although the statement is not exactly clear, it is a start.
3.2 PFR and SFR
Let us grant the (perhaps) obvious point that the person inside the Chinese room does not understand Chinese insofar as, ex hypothesi, he does not know what the various Chinese symbols ‘‘stand for.’’ Until we are told more about the contents of the rulebook, this is the only sense in which S does not understand Chinese; no other sense of his lack of understanding of Chinese has been proposed. So, according to Searle, a linguistic system executes two functions: a syntactic function that establishes ‘‘purely formal relationships’’ (PFRs) between a collection of lexical items, and a semantic function that relates the collection to what it stands for (SFRs). According to Searle, S understands a language L just in case both the functions, in order, are executed. Let us examine what PFR and SFR mean, if anything, in the context of grammatical theory. The conceptions of PFR and SFR are deeply ingrained in the logical tradition from where, presumably, Searle and others cull their conception
of how language works. A logical theory is usually conceived of in two stages. The first stage is called ‘‘syntax,’’ which states the rules of well-formedness defined over the primitive symbols of the system to execute PFRs, where a PFR is understood to be just an arrangement of ‘‘noise’’ or marks on paper. A logical theory also contains a stage of ‘‘semantics,’’ which is viewed as a scheme of interpretation that gives the satisfaction conditions (in a model) of the well-formed strings supplied by syntax. That is, SFRs are executed when the scheme of interpretation is applied to the well-formed strings. For example, syntax says that the string ‘‘P∧Q’’ is well formed and semantics says that ‘‘P∧Q’’ is true just in case each of ‘‘P’’ and ‘‘Q’’ is true. In other words, (logical) syntax outputs a set of structures that are taken to be interpretable though uninterpreted, and (logical) semantics says, in general terms, what that interpretation is. As we will see, the general picture is routinely taken for granted in the formal semantics program. I am not suggesting that Searle, or others who uphold the syntax-semantics divide, would automatically approve of the formal semantics program as it is typically pursued. In fact, from what I can make of Searle’s position on the issue, he would want to go far beyond merely the logical conditions to capture what he takes to be genuine linguistic understanding: ultimately, it will involve communication intentions, illocutionary acts, and the like. So, Searle is likely to think of the formal semantics program as so much more of syntax. Nevertheless, there is no doubt that Searle’s notion of semantics begins to get captured with the logical conditions themselves. In that sense, I am suggesting that a logic-motivated syntax-semantics divide has been seen to be necessary for launching any semantics program beyond LF; by parity, doubts about the divide lead to doubts about the entire program no matter where it culminates. In this program, sound-meaning correlation is taken to be direct since, by stipulation, a logical ‘‘language’’ has no room for ambiguity: a given string has exactly one interpretation. So, there is no need either to correlate different interpretations with the same string or to attach the same interpretation to different strings (Moravcsik 1998). I feel, tentatively for the moment, that this idea of direct correlation between sound and meaning is part of the motivation for thinking that syntax and semantics are strictly distinct; it encourages a ‘‘syntax first, semantics next’’ picture. This is one of the many reasons why a logical system is, at best, an artificial language.
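To fix ideas, the two-stage picture described above can be made concrete with a fragment of propositional logic. The following minimal sketch in Python, invented purely for illustration, gives a ‘‘syntax’’ that checks well-formedness over atomic symbols and the connective ‘‘&,’’ and a ‘‘semantics’’ that assigns truth values to the well-formed strings under a valuation.

```python
# A minimal two-stage logical theory: syntax checks well-formedness
# (PFRs); semantics assigns truth values under a valuation (SFRs).
# Formulas are atoms or nested tuples such as ("P", "&", "Q").

ATOMS = {"P", "Q", "R"}

def well_formed(f):                      # the syntactic stage
    if isinstance(f, str):
        return f in ATOMS
    return (len(f) == 3 and f[1] == "&"
            and well_formed(f[0]) and well_formed(f[2]))

def true_in(f, valuation):               # the semantic stage
    if isinstance(f, str):
        return valuation[f]
    return true_in(f[0], valuation) and true_in(f[2], valuation)

f = ("P", "&", "Q")
print(well_formed(f))                       # True: syntax passes the string
print(true_in(f, {"P": True, "Q": False})) # False: both conjuncts needed
```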
An artificial language is conceived of as an external object constructed for various ends: examples include computer languages, logistic systems and other formal languages, perhaps even systems such as Esperanto, and so on. When we design such languages, it is likely that we bring in certain pretheoretical expectations in their construction. One such expectation is a sharp syntax-semantics divide. Even natural languages could be viewed as external objects such as sets of sentences, and not as something internalized by the child. It is no wonder that similar expectations will be brought in to study natural languages as well even if we do not deliberately create them. This may have given rise to the idea that there is an absolute distinction between syntax and semantics in natural languages as well. The sharp distinction between the well-formedness and interpretability of a sentence fails to accommodate the intuition that we expect these things to converge—that the form of a sentence is intrinsically connected to its interpretation even if actual convergence may not be uniformly available.1 For a flavor of this very complex issue, consider the following. The expression who you met is John is (immediately) interpretable without being grammatical (Chomsky 1965, 151); there seems to a man that Bill left, on the other hand, is not immediately interpretable without being ungrammatical (Hornstein 1995, 70, following Chomsky 1993). We would like to think that, at these points, the system ‘‘leaks’’ in that it allows generation of various grades of gibberish. Thus an empirically significant theory will include a subtheory of gibberish: ‘‘The task of the theory of language is to generate sound-meaning relations fully, whatever the status of an expression’’ (Chomsky 2006a). Logical theory, on the other hand, blocks all gibberish by stipulation. We saw that a grammatical theory takes the lexicon as given and gives a generative account of a pair of representations, PF and LF, formed thereof. A PF-representation imposes grammatical constraints on how a string is going to sound, while an LF-representation captures the grammatically sensitive information that enters into the meaning of a string. Is there a PFR-SFR distinction as separate stages in grammatical computation? Note that in order to say that the output of grammar itself is an arrangement of ‘‘noise’’ (PFRs), we should be able to say that both the outputs of grammar, PF and LF, represent noise—that is, PF and LF individually represent PFRs. Could we say that the phonological part of the language system executes just PFRs? No doubt the phonetic properties of individual lexical items are distinct from their nonphonetic properties—that is, categorial and semantic properties. There is also no doubt that there is a distinct
(branching) phonological computation based on phonetic features alone. Furthermore, the principles of phonological computation may be radically different from the rest of the computational principles (Bromberger and Halle 1991; Halle 1995; Harley and Noyer 1999; Pinker and Jackendoff 2005). However, the task faced by the phonological component is not merely to list the phonetic properties of the given lexical items (they are already listed in the lexicon!), but to decide whether a sequence of phonetic properties of lexical items—that is, the phonetic representation of the utterance—is an expression of a language. Alec Marantz (2005, 3.1) shows that we can construct ‘‘word salads’’ with English words but with a head-final structure as in Japanese (*Man the book a women those to given has). It will not be recognized as an English sentence precisely because it violates structural conditions. Therefore, a phonological processor cannot process even phonetic information unless it is supplied with categorial information; hence the need for branching at a certain stage of computation. The grammar of the language assigns a structure description to the phonetic representation of an utterance to generate the phonological form. Central to this approach is the idea of a syntax-governed ‘‘phonological phrase’’ that enters into the possible phonological interpretation of a string (Chomsky and Halle 1968, 5–11; Nespor 2001). As we will see in chapter 5, the current assumption is that the same syntactic object—phase—is transferred to the two interfaces. So, the basic idea is that if you know that a given string is, say, an English sentence, then you cannot fail to attach some semantic interpretation to it. It is not ruled out that the interpretation you attach holds the sentence to be gibberish. It follows that the conception of a phonological form as a representation of ‘‘noise’’ is without any clear sense in the grammatical framework within which we are working. However, this does not rule out the possibility of an isolated LF-processor (Chomsky 1997). It could be that the mind, beginning with the lexicon, generates an abstract and semantically interpretable expression, yet no articulation takes place. According to Chomsky (2001b), most of language use is in fact geared to unarticulated ‘‘inner thought.’’ In that sense the sound part could be viewed as ‘‘ancillary’’ to the basic language system, as we will see. The significance of the notion of syntax-governed ‘‘phonological phrase’’ may be brought out by considering an example in which both the licit and the illicit sequences are meaningless in Searle’s sense. This
will enable us to control for the distinction between (the grammatical) concept of interpretability and the thick concept of semantics Searle is working with. Consider the classic collection of lexical items green, ideas, sleep, colorless, and furiously. Given categorial information regarding the features ±N(ominal) and ±V(erbal) and keeping to G-B, the computational system will decide that the string colorless green ideas sleep furiously is fine since it satisfies, say, X-bar theory, C-selection, number agreement, and so on. But *sleep green colorless furiously ideas is not fine although, in Searle’s thick sense of meaning, both the strings are meaningless. The former string is accepted precisely because it is interpretable. The point is illustrated by the fact that anyone who listens to the string for the first time tries to attach some meaning to it by stretching the imagination, invoking metaphors, and the like. This is true of most of the popular examples of ‘‘deviant’’ strings such as this stone is thinking of Vienna, John frightens sincerity, they perform their leisure with diligence, the unbearable lightness of being, and so on. In each case, some interpretation seems to be available after a little thought; in fact, some of them become literary masterpieces. As more information regarding S-selection, binding, and so forth is invoked, richer interpretative possibilities come into view. To sum up: a phonological representation, unlike an acoustic representation, is computationally connected with meaning—the connection being largely indirect in the case of natural languages. For natural languages, then, the syntax-semantics divide, even if there is one, does not cut the joints of language in ways that logical theory demands. This raises doubts regarding the applicability of the logical concept of syntax and the related concept of well-formedness in grammatical theory. These pretheoretical notions do not play any significant role in grammatical theory since, whatever these are, the theory attempts to unpack them. Except for serving some expository purposes, the widespread pretheoretical idea that grammatical theory partitions the class of strings into grammatical and ungrammatical subclasses is quite unnecessary (Chomsky 1993). Every theory, of course, will naturally partition strings in theory-internal terms.
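To see in miniature how categorial information alone can separate the two strings discussed above, consider the following toy sketch in Python; the category labels and the single template are simplifications invented for this illustration, not the actual feature system.

```python
# A toy category-sequence check that accepts "colorless green ideas sleep
# furiously" but rejects the scrambled string, even though both are
# "meaningless" in Searle's thick sense. Labels and template are invented.
import re

CATEGORIES = {
    "colorless": "A", "green": "A", "ideas": "N",
    "sleep": "V", "furiously": "Adv",
}

# One simplified template: any number of adjectives, then a noun,
# a verb, and an optional adverb.
TEMPLATE = re.compile(r"(A )*N V( Adv)?$")

def interpretable(string):
    cats = " ".join(CATEGORIES[w] for w in string.split())
    return TEMPLATE.match(cats) is not None

print(interpretable("colorless green ideas sleep furiously"))  # True
print(interpretable("sleep green colorless furiously ideas"))  # False
```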
As an aside, we might note that perhaps the global concept of acceptability of a string is the only one that matters for grammatical theory since this concept directly attaches to data. However, even this concept is suspect. The language system may pass strings that are not immediately acceptable to the native speaker, or are accepted on the wrong grounds; on the other hand, a speaker may accept strings that are rejected by the system. For example, complex structures with central embedding are often difficult to accept: the nurse whom the cook whom the maid met saw heard the butler (Miller and McNeill 1969, 707). ‘‘Garden-path’’ sentences such as the horse raced past the barn fell are typically accepted because raced is wrongly identified as the main verb. The string no head injury is too trivial to ignore is taken to mean that no matter how trivial a head injury is, it should not be ignored; however, the language system interprets it as: no matter how trivial a head injury is, it should be ignored (Collins 2004, 518 n. 14). The child seems sleeping is accepted for further conversation even if the string is rejected by grammar due to the violation of the selectional properties of the verb seem. These examples suggest that the concept of acceptability, being a global property of strings, is extremely thick. The acceptance or the rejection of a string might involve grammatical factors, contributions from nonlinguistic factors, performance factors, and so on. The divide between these factors is essentially theoretical in nature; the data do not wear the divides on their sleeves. The theory, in turn, attempts the best possible explanation of how, and which, lexical information is accessed and processed by which component of the mind. In this sense, the concept of lexical information is the only salient concept for language theory. So far I have been arguing that the concept of PFR does not play any sensible role in grammatical theory. Similar remarks apply to the concept of SFR, the stands-for relationship, but from the opposite direction. From what we know about the role of categorial information in grammar, it is not clear at all whether the properties of having a θ-role, having a Case, having Subject agreement, having the feature of an anaphor, and so on, belong to the SFR part or not, although all of these properties are progressively invoked and formally established by the grammar, as we saw in some detail. All we know is that the grammar specifies which formal requirements are met by lexical items in having these properties; just this much does not make the computations purely formal since many of these properties clearly have semantic significance. For example, to say that an element is an anaphor is to say that it is dependent on some other formally identified element for interpretation; in other words, an anaphor does not by itself ‘‘stand for’’ anything. An r-expression, on the other hand, may stand for some object (in a model). Binding theory does not use the concept of SFR, but it takes interpretations very close to this notion. The property of having a θ-role is particularly interesting from this point of view (for more, see Hinzen 2006, 152ff.). Higginbotham (1989) thinks of the thematic structure of a sentence as a ‘‘partial determination
of (its) meaning.’’ Once a θ-role is assigned, the concerned item cannot fail to stand for something—agent, patient, experiencer, theme, goal, whatever—even if a θ-role is assigned to a trace, an empty element. We saw that an empty element is always viewed as a dependent element linked to an antecedent. A θ-role is thus assigned to a chain consisting of the dependent element and its antecedent, which ultimately is a lexical element, which, in turn, will stand for something, even if the ‘‘stands-for’’ relation here may not amount to the full referential relation. What an element with a θ-role stands for will depend on accessing further information from the concerned lexical item by other cognitive capacities of the mind that are designed to process that information. I believe that this point about the semantic character of θ-roles is illustrated by the following example from Jackendoff 1983, 207, although Jackendoff himself does not use this example to that end. There is something in the thematic structure of the verb grow that allows the pair every oak grew out of an acorn and every acorn grew into an oak, but that does not allow the pair *an oak grew out of every acorn and *an acorn grew into every oak. The grammatical phenomenon of quantifier scope seems to be sensitive to the semantic phenomenon regarding which QP has what θ-role. We have two options here: either the grammar is enlarged to accommodate (full-blooded) semantic information, or, we think of, say, an acorn grew into every oak as gibberish passed by the grammar (compare: a bird flew into every house), but rejected by the concept of growth. I am working under the second option; apparently, Jackendoff is working under the first. Thus, with respect to the information processed by grammar, as currently conceived, not only is the general nonphonological interpretability of a string determined; some cues about how it is to be interpreted are partly determined as well. This is one way of thinking that grammar progressively executes parts of SFR. Since there is no natural joint in grammar where the execution of PFR ends and the execution of SFR begins, these notions have no real meaning in grammatical theory. Hence, Searle-type parables are not likely to apply to this theory. I am not suggesting that all formal properties have semantic significance. Structural Case, as noted, has no semantic significance. Thematic roles and binding properties, on the other hand, have semantic significance. Some agreement features, such as number feature of nouns, have semantic significance, while other agreement features, such as number feature of verbs, have no semantic significance. In MP, these variations are handled in terms of legibility conditions at the LF-interface.
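The legibility idea can be given a rough procedural shape. In the following sketch in Python, with a made-up feature inventory, each feature is marked as interpretable or not, and only the interpretable ones survive to the LF-interface.

```python
# A rough sketch of legibility at the LF-interface: lexical items carry
# features marked as semantically interpretable (True) or not (False);
# only interpretable features are kept. The inventory is invented.

ITEMS = [
    # Nominal number is semantically interpretable; structural Case is not.
    {"word": "boys", "features": {("number", "plural"): True,
                                  ("Case", "nom"): False}},
    # Verbal agreement number is uninterpretable.
    {"word": "dance", "features": {("number", "plural"): False}},
]

def to_lf(items):
    """Keep only the features that 'enter into understanding'."""
    return [{"word": it["word"],
             "features": {f: v for f, v in it["features"].items() if v}}
            for it in items]

print(to_lf(ITEMS))
# [{'word': 'boys', 'features': {('number', 'plural'): True}},
#  {'word': 'dance', 'features': {}}]
```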
Roughly, features that enter into ‘‘understanding’’ are brought to the interface; the rest are systematically wiped out during computation. There is no prior syntax-semantics division here. There are lexical features and there are legibility conditions; together they generate interpretable expressions. This part of the computation is often called ‘‘N → LF computation’’ (narrow syntax), meaning the part of the computation that begins with a numeration N of lexical items and generates LF phrase markers; there are no intermediate stages. The present point again is that there is nothing like separate PFR- and SFR-stages in the computational process. We recall that, in the Government-Binding framework (G-B), there indeed was a syntax-semantics divide between computation up to s-structures and computations from s-structures to LF. But that distinction, as we saw, was entirely internal to theory and is no longer maintained in more recent conceptions of grammar. By parity of enterprise, therefore, the traditional syntax-semantics divide ought to be viewed as an artifact of (naive, commonsensical) theory rather than as a fact about languages. The main thrust of the preceding way of looking at the grammatical system is that the traditional syntax-semantics divide needs to be given up since, otherwise, it is difficult to make sense of the claim that LF is the level where semantic (namely, nonphonetic) information begins to cluster. I return to the point in a moment. Are we overemphasizing the semantic nature of LF in the picture just sketched? As noted, the concept of semantics allegedly captured in LF is narrowly defined in terms of a list of phenomena such as quantifier scope, pronoun binding, variable binding, adverbial modification, and the like. It may be argued that since these things have nothing to do with SFRs, as Searle and others construe them, I have overstretched the idea that grammatical theory already executes parts of SFRs. I think this argument basically raises terminological issues. Let me explain. The only problem I have been discussing is whether grammatical theory gives an account of some nonphonetic understanding of a string. If it does, then, by Searle’s definition, we cannot equate grammar with a system executing PFRs. If, on the other hand, execution of SFRs is viewed as the only legitimate semantic enterprise, then grammatical theory certainly does not contain semantics. But then, the nonphonetic understanding that a grammatical theory does capture escapes the PFR/SFR divide. In other words, the PFR/SFR distinction does not apply to natural languages, if the phonetic/nonphonetic distinction is to apply to them. As a corollary, it follows that if ‘‘semantics’’ is viewed as a theoretical
construct defined in terms of SFRs, then grammatical theory shows that this construct is (theoretically) dispensable.
3.3 LF and Logical Form
The preceding considerations suggest at most that, insofar as grammatical organization is concerned, there are fundamental differences between the structures of formal logic and natural languages; therefore, the structure of logical theory cannot be mimicked inside grammatical theory. Just this much leaves open the possibility that lessons from formal logic may be added to the output of grammar to construct a more comprehensive theory of language. More specifically, lessons from formal logic may be used for an enriched description of the semantic component of languages beyond the description reached at LF. To understand this project, let me review what happens at LF. LF, we saw, has the following properties, among others. In the T-model, grammatical computation on a selection of lexical items branches at some point to generate two representations, PF and LF. PF is where ‘‘phonological and phonetic information is represented,’’ and LF is where ‘‘interpretive-semantic information is represented’’ (Hornstein 1995, 5). LF then represents nonphonological information; specifically, it represents interpretive-semantic information. At LF, all grammatically determined ambiguities, including scope ambiguities, are segregated, and all arguments, including NP-trace, are assigned thematic roles, among other things. Specifically, it is natural to think of the thematic structure of a sentence as a ‘‘partial determination of (its) meaning’’ (Higginbotham 1989), as noted. The correlation between PF and LF thus captures the traditional conception of language as a system of sound-meaning connections (Chomsky 1995b). Furthermore, the principles that determine LF-structure include some of the ‘‘most fundamental principles of semantics’’ (Hornstein 1995, 7). As we saw, the following principles, among others, apply at LF: Full Interpretation (a structure is licensed iff each element in the structure has an interpretation), the Principles of Binding (such as, an anaphor must have an antecedent in the local domain), the θ-criterion (every argument must have a thematic role), and the Mapping Principle (discourse-linked arguments must be interpreted outside the VP-shell). As we saw, each of these is semantically motivated. In sum, not only is semantic information represented at LF, but its structure is determined by principles of semantics.
As Chomsky (1991b, 38) put it, ‘‘Much of the fruitful inquiry and debate over what is called ‘the semantics of natural language’ will come to be understood as really about the properties of a certain level of syntactic representation—call it LF.’’ It seems natural, then, to view LF itself as a level of semantic representation, and the theory of LF as a semantic theory. More generally, once the requirement of sound-meaning correlation has been met at LF, we may identify the scope of language theory with that of grammatical theory, and view other conceptions of semantics, if viable, as falling beyond language theory per se. Pushing it, we could even say that ‘‘semantics’’ is whatever is done at the nonphonetic end of grammatical theory. The last point needs emphasis because it is not ruled out that developments at this end of grammatical theory might expand beyond the current conception(s) of LF to include phenomena that fell earlier under other conceptions of meaning. For example, semantic features such as +/-animate or pragmatic features such as +/-definiteness may play a role in grammatical computation (Ormazabal 2000; Diesing 1992). More radically and from an opposite direction, grammatical theory may not even require a level of representation, LF or SEM or whatever; it may consist of just computation on the lexicon and interface conditions (Chomsky 2005), as we will see in chapter 5. The basic point is that ‘‘semantics’’ is wherever the internal drive of grammatical theory leads us to at this end; residual conceptions of meaning fall under the study of other faculties of the mind that, together with the faculty of language, lead to the (extremely complex) phenomenon of language use. However, Hornstein (1995, 6) seems to be saying something slightly different when he says that LF is ‘‘where all grammatically determined information that is relevant to [semantic] interpretation is consolidated’’—that is, LF ‘‘provides the requisite compositional structure’’ for executing ‘‘interpretive procedures’’ concerning various facts taken to be ‘‘characteristic of meaning.’’ These facts include relative quantifier scope, scope of negation, modality, opacity, pronoun binding, variable binding, focus and presupposition structure, adverbial modification, and so forth. In this picture, although these things are ‘‘done off the LF-phrase marker,’’ LF itself is not viewed straightforwardly as a level of semantic interpretation, but as something that provides the necessary scaffolding for other nongrammatical theories to execute ‘‘interpretive procedures.’’ Chomsky (1991b, 46) also said that LF is ‘‘associated with’’ semantic interpretation, without saying explicitly that it is a level of semantic interpretation, even if LF has some of the properties usually described in model theory. We
may conclude that, according to these authors, although LF is certainly semantically sensitive, it is best viewed as preparing the ground for (post-LF) semantic interpretation; in that sense, LF is missing something, semantically speaking.2 To focus on what is missing, it is well known that grammatical theory self-consciously stays away from things like conceptual roles, background beliefs, speaker intentions, cultural and historical expectations, and the like (Chomsky and Lasnik 1977, 428; Chomsky 2000d, 26). It is unlikely then that when Chomsky and Hornstein think of LF as missing something, they have these things in mind. Richard Montague (1974, 188), among many others, raised a much narrower issue. Montague held that the ‘‘construction of a theory of truth is the basic goal of serious syntax and semantics.’’ ‘‘Developments emanating from the Massachusetts Institute of Technology,’’ he immediately added, offered ‘‘little promise towards that end.’’ Since the theory of LF does not contain a subtheory of truth, Montague’s complaint is factually correct. Moreover, Montague’s objection seems to fall in place with what we have seen so far. As noted, Hornstein thought that facts such as relative quantifier scope, scope of negation, modality, opacity, and so on fall at the borderline of LF and other ‘‘interpretive procedures.’’ Beginning with the work of Gottlob Frege and Bertrand Russell, most of these facts have been addressed by now by logic-based semantic theories of the type Montague advocated.3 Also, these theories are typically unconcerned about conceptual roles, background beliefs, speaker intentions, and the like. These are abstract theories formulated in logical notation to explore structural conditions on meaning. In that sense, they touch the domain of the theory of LF. However, since these theories are truth theories without question, they differ sharply from the theory of LF. If the (semantic) scope of grammatical theory is to expand at all, it seems natural that the first thing to add to grammatical theory is some version of truth theory.4 So, we can envisage an enterprise in which versions of truth theory are attached suitably to the theory of LF. In general, the envisaged project basically amounts to establishing relations, if any, between the grammatical concept of LF and the philosophical concept of logical form. The philosophical project of logical form has two parts: postulation of canonical expressions to capture the meaning of expressions of natural languages, and attachment of some version of semantic metatheory, which includes a model theory, to the canonical expressions. Although
Bertrand Russell may be credited with initiating the philosophical notion of logical form in connection with his landmark theory of definite descriptions, the original theory did not contain any explicit metatheory. So his theory of descriptions, especially his account of scope distinction, offered a way of studying just the first part of the project, as we saw in section 2.1. From what we saw, Russell's project looked questionable because there was no clear justification as to why the notation of predicate logic is to be imposed on expressions of natural language. We saw that our linguistic intuitions suggest that certain sentences—say, The king of France is not wise—are structurally ambiguous. We can even give informal paraphrases of those intuitions in English, not surprisingly. It was unclear what more is accomplished by the imposition of logical notation. All that the notational scheme of (4) and (5) did was to represent those intuitions. (4) and (5) thus just represent data; they do not give an account of the data. Furthermore, we saw that the task of representing scope distinctions in a canonical notation can be accomplished within the theory of LF with full generality and explanatory power. With the theory of LF in hand, especially in the minimalist version, we are in a position to substantiate these initial impressions. Consider again (6), mentioned in the last chapter.

(6) Every boy danced with a girl.

In the notation of restricted quantification, the ambiguity of (6) can be represented as in (76) and (77).

(76) (every x: boy x)((a y: girl y)(x danced with y))
(77) (a y: girl y)((every x: boy x)(x danced with y))

In this representation, the indefinite article a in the phrase a girl is viewed as an existential quantifier. Following the work of Irene Heim and Peter Geach, Molly Diesing (1992, 7) writes the logical form of (78) as (79).

(78) Every llama ate a banana.
(79) Every x ((x is a llama) ((∃y) y is a banana & x ate y))

In this notation, the phrase a banana is viewed rather as introducing a variable y, which in turn is bound by an "abstract" quantifier ∃. Similarly, a llama introduces a variable x that is overtly bound by the quantifier every. Finally, the existential clause is embedded within the scope of every to bring out the intuition underlying (78) that each member of the set of llamas ate a banana.
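Before assessing what this notation accomplishes, the truth-conditional difference between the two scope readings can be made concrete. The following minimal sketch, in Haskell, evaluates (76) and (77) in a tiny model; the particular boys, girls, and dancedWith pairs are invented assumptions, supplied only for illustration.

-- A toy model for "Every boy danced with a girl"; all facts are invented.
boys, girls :: [String]
boys  = ["Al", "Bo"]
girls = ["Cy", "Di"]

dancedWith :: [(String, String)]
dancedWith = [("Al", "Cy"), ("Bo", "Di")]   -- each boy danced with a different girl

-- (76): every boy danced with some girl or other (wide-scope "every")
reading76 :: Bool
reading76 = all (\b -> any (\g -> (b, g) `elem` dancedWith) girls) boys

-- (77): some one girl is such that every boy danced with her (wide-scope "a")
reading77 :: Bool
reading77 = any (\g -> all (\b -> (b, g) `elem` dancedWith) boys) girls

main :: IO ()
main = print (reading76, reading77)   -- (True, False)

In this model every boy danced with some girl, but no single girl danced with every boy, so the two readings come apart exactly as the scope distinction predicts.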
Thus, the notation of logical theory, as used in (79), is a tool that helps in explicit representation of data, namely, what interpretation native speakers attach to (78); it does not give an account of the data. The account ensues when the structure of (79) is exploited to provide a grammatical analysis of (78); it ensues because a phenomenon is now analyzed, not just represented, in terms of empirically significant theoretical postulations. As Diesing notes, (79) suggests that there could be a split in the underlying syntactic structure of (78) such that every llama and a banana are interpreted at different parts of the structure. This led to the distinction between the IP and the VP parts of a structure via the VP-internal-Subject hypothesis, as we saw. It took linguists several years to incorporate this specific syntactic idea within the general grammatical theory. Several steps, some of which we saw, were required. First, a general theory of clause structure was formulated (Pollock 1989) in which the VP-internal-Subject hypothesis (Koopman and Sportiche 1991) was incorporated. Second, a general theory of displacement had to be found (Case checking). Third, principles had to be designed to place strong quantifiers at specific locations outside the VP-shell (Mapping Principle), and so on. Logical theory played no role in this at all. Furthermore, Diesing's representation of (78) as (79) shows that logical notation, like musical notation, can be changed abruptly and arbitrarily for convenience of exposition. This is not to deny that there is some element of convenience in any choice of notation whose arbitrary character shows up when theoretical explanations get deeper. It happened for parts of grammatical theory as well: "A lot of uses of such devices as proper government and indices turn out to be pseudo-explanations which restate the phenomena in other technical terms, but leave them as unexplained as before" (Chomsky 2000a, 70). But this fact, common in science, cannot apply to logical notations since their uses do not even qualify as pseudo-explanations, as argued above. To emphasize, a new logical notation cannot show that the uses of old "devices" turn out to be "pseudo-explanations." For example, the adoption of the new notation in (79) does not show that the old notation was a pseudo-explanation; it shows that the old notation did not even represent data perspicuously enough for analyzing the user's intuition. So, why do we need the notation of logic anymore?
3.4 Truth and Meaning
For a large (and growing) number of linguists and philosophers, the notation of logic is still needed because the imposition of this notation
facilitates systematic assignment of truth conditions to sentences of a language. To feel this need is to entertain a conception of language theory that includes truth theory as a part. Although there is a distinguished history to this conception (Wittgenstein 1922, 4.024; Strawson 1952, 211), it is difficult to find an explicit argument that supports this conception of language. To my knowledge, this issue was never fully raised in the literature since it was taken for granted—even by many proponents of generative grammar—that generative grammar fails to include an adequate semantic theory. What we find is that people simply proclaim a conception of language theory that contains truth theory by stipulation: "Semantics with no treatment of truth conditions is not semantics" (Lewis 1972, 169); "construction of a theory of truth is the basic goal of serious syntax and semantics" (Montague 1974, 188). If this is just an unargued assertion (Pietroski 2005; Stainton 2006), then the motivation for expanding the scope of grammatical theory considerably weakens: different people with different assumptions do different things, period. To illustrate this crucial point, I will examine in some detail the origins and structure of the theory of meaning proposed by Donald Davidson. In a classic paper (Davidson 1967), which may well be thought of as having initiated the contemporary program of formal semantics, Davidson gave a new direction to a conception of semantics he traced to Gottlob Frege. Frege held the near truism that the meaning of a sentence is determined by the meaning of words in it.5 We would thus expect a theory of meaning to spell out how individual words systematically contribute to the meaning of a sentence, the meanings of sentences differing when (nonsynonymous) words in them differ. Given that the linguistic resources of users of a language L are finite, it is plausible to place the condition (M) on a theory of meaning of L such that the theory recursively generates, for each sentence s, a theorem of the form

(M) s means m,

where m gives the meaning of the sentence (Davidson 1967, 307). The trouble is that Frege's own conceptions of meaning fail to satisfy these requirements. If, on the one hand, the "meaning of a sentence is what it refers to, all sentences alike in truth value must be synonymous—an intolerable result" (p. 306). If, on the other, individual words and sentences have meanings distinct from their reference, then the most we can say is that the meanings of Theaetetus and flies are what they contribute to the meaning of Theaetetus flies, and the meaning of Theaetetus flies is what is
contributed by the meanings of Theaetetus and flies. Since we "wanted to know what the meaning of Theaetetus flies is, it is no progress to be told that it is the meaning of Theaetetus flies." Keeping this in mind, a number of consequences could be drawn from these results regarding the form and the scope of a putative theory of meaning. For example, why should Frege's choices exhaust the options for what is to count as the right entities mentioned by the singular term m? Suppose we hold that the concept of meaning, whatever it is, that figures in LF-representation(s) of a sentence exhausts the scope of a theory of language; as I proposed earlier, any other concept of meaning could then be viewed as either incoherent or falling beyond the scope of a theory of language. Now recall that so far Davidson has placed two constraints on a theory of meaning: it has the form "s means m" where m is systematically correlated with s, and m cannot be Fregean. It is easy to see that these conditions are satisfied by the schema for grammatical meaning (GM),

(GM) s means LFs,

where s is a phonological form of s,6 and LFs names the LF-representation of s, a theoretical expression systematically correlated with s. LFs is a genuine (abstract) singular term that mentions perhaps a specific state of the mind/brain. This should not be a problem for Davidson since his "objection to meanings in the theory of meaning is not that they are abstract or that their identity conditions are obscure, but that they have no demonstrated use" (p. 307). Since LF-representations do have demonstrated use, we could say that grammatical theory generates each instance of GM. GM also escapes another objection to M. Davidson's objection to M is sometimes interpreted to suggest that M is uninformative. Someone may know an instance of M without knowing what the relevant expression means: one may know "Theaetetus flies means what is meant by Theaetetus flies" without knowing what Theaetetus flies means. In contrast, GM is not only noncircular; in addition, the conception of (overt) knowledge of language that gives rise to this objection does not apply to a theoretical formulation such as GM; only linguists have overt knowledge of GM. Well-formulated recursive syntax and dictionary thus seem adequate. Davidson, however, chooses a different course because "knowledge of the structural characteristics that make for meaningfulness (= syntax) in a sentence, plus knowledge of the meanings of the ultimate parts, does not add up to knowledge of what a sentence means" (p. 307). This is because just this much knowledge fails for propositional attitudes where
"we cannot account for even as much as the truth conditions of such sentences on the basis of what we know of the meanings of the words in them" (p. 308). It is well known that the problem of propositional attitudes persists even within semantic programs that have more resources than Davidson's own truth-theoretic semantics (Chomsky, Huybregts, and Riemsdijk 1982, 91); for that matter, it apparently persists irrespective of conceptions of meaning (Kripke 1979). In any case, the vexing, unsolved problem of propositional attitudes (Schiffer 1987) cannot be the starting point for a general theory of meaning. If Davidson's appeal to propositional attitudes is merely an attempt to draw attention to the general point that syntax and dictionary together fail to show how the meaning of the complex is a function of its constituents—the "contrast with syntax is striking"—then the objection does not apply to GM since syntax does generate LF-representations computationally from the lexicon. In that sense, whatever an LF-representation means is a function of the meanings of its parts. It follows that Davidson simply requires at least that a theory of meaning give an account of truth conditions of sentences. The pretheoretical assumption shows up even more directly in the way Davidson formulates his theory. When Fregean conceptions failed, Davidson did not pursue the option of finding a non-Fregean singular term, as noted; he just asserted that the only way to reformulate M is to change m to a sentence p. No argument is advanced for the totally artificial suggestion that the meaning of a sentence is to be "given" by another sentence in some language or other. If anything, it is natural to think of the meaning of a sentence as a property of the sentence to be described in some theoretical term; it is hard to see how another sentence carrying the burden of its own meaning can do the job. As we will see, there is exactly one way of not begging the obvious questions. Setting the issue aside for now, the move from m to p enables Davidson to narrow down the options. In formulating a theory of meaning, we can no longer use the two-place predicate means to relate s and p; for that we now need a sentential connective. Davidson imposes an additional constraint that the theory of meaning must appeal only to extensional relations, since with the "non-extensional 'means that' we will encounter problems as hard as . . . the problems our theory is out to solve" (p. 309). Once we stipulate that m be reformulated with a sentence p, and p and s be related by an extensional sentential connective, then for the whole thing to work as a definition of sentence meaning, schema (T) becomes inevitable:
(T) s is T if and only if p,

where s is a structure description of the sentence under study, p is a metalinguistic expression that is systematically correlated with s, and "is T" is a metalinguistic predicate that is recursively characterized by the theory for each sentence of L. To see that schema T is inevitable, recall that p has to bear a lot of weight: it needs to be both systematically and uniquely correlated with s, and it must give the meaning of s in some sense. This last condition rules out formulations such as schema G.

(G) s is grammatical if and only if there is a derivation that yields LFs.

This schema meets most of Davidson's conditions: it "provides" s with its own predicate "is grammatical," and it has a "proper" sentential connective that has a sentence to its right. G has the additional virtue that it carries exactly the import of GM without mentioning meanings, a virtue that Davidson claims for Convention T (310). However, there is no clear sense in which the entire sentence there is a derivation that yields LFs by itself "gives" the meaning of s; the derivation does not give the meaning, LFs is the meaning of s.7 Therefore, p can only be s itself if L is contained in the metalanguage ML, or p is a translation of s in ML. This forces schema T. Despite these moves, we still do not know what "is T" is, since we have taken much care to detach the content of "is T" from that of "means that." However, it is somehow clear to Davidson that "the sentences to which the predicate is T applies will be just the true sentences of L, for the condition we have placed on satisfactory theories of meaning is in essence Tarski's Convention T that tests the adequacy of a formal semantical definition of truth" (pp. 309–310).

(Convention T) s is true if and only if p.

No doubt schema T looks like Tarski's Convention T, and alternatives virtually die out once the preceding conditions are enforced. Yet, some independent argument is needed for us to view the identification of "is T" with "is true" as "in the nature of a discovery" (p. 310). Otherwise, it is difficult to resist the thought that just those assumptions have been built into the form of a putative theory of meaning that filter out everything except Convention T. A theory of meaning, then, is what we get when we work backward from a theory of truth.
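Since everything in this construction turns on the recursive characterization of the truth predicate, it may help to see how little machinery a Tarski-style definition needs for a toy fragment. The Haskell sketch below is only an illustration under invented assumptions: the fragment, its two connectives, and the atomic facts recorded in holds are mine, not anyone's published formulation.

-- A Tarski-style truth definition for a toy object language.
data Sentence = Atom String | Not Sentence | And Sentence Sentence

-- the "world": which atomic sentences hold (invented facts)
holds :: String -> Bool
holds "snow is white" = True
holds _               = False

-- the truth predicate, characterized recursively, one clause per construction
true :: Sentence -> Bool
true (Atom s)  = holds s
true (Not p)   = not (true p)
true (And p q) = true p && true q

main :: IO ()
main = print (true (And (Atom "snow is white") (Not (Atom "grass is red"))))
-- True: the sentence is true iff snow is white and grass is not red

Each clause of true mirrors a syntactic construction, so every sentence of the fragment receives its truth condition compositionally; instances of Convention T fall out theorem by theorem.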
Davidson does provide justification for viewing a Tarski-type truth theory itself as a theory of meaning, but it comes after the "discovery" has been made. So, it could have been proposed without taking recourse to the famous argument I just reviewed. Thus, it is suggested that Convention T will generate theorems such as "snow is white is true iff snow is white." Any competent speaker of English, namely, one who has already internalized the meaning of the metalinguistic relation is true iff, will recognize that this instance of Convention T is true; in recognizing this, a speaker has displayed understanding of the sentence snow is white in some sense. Therefore, a recursive characterization of the truth predicate for each sentence of a language "recovers" something of the user's understanding of the language. What is the sense in which competent users of English recognize that "snow is white is true iff snow is white" is true? The answer is that, in giving assent to "snow is white is true iff snow is white," the users have indicated the right circumstance in which snow is white is to be used, namely, the circumstance in which snow in fact is white. In other words, by displaying the correct use of snow is white in the right-hand side of the formula, the users have displayed their competence with snow is white itself, mentioned in the left-hand side of the formula. The bridging truth predicate is true iff thus systematically matches sentences with descriptions of circumstances. In this sense, Convention T brings out the information-bearing aspect of language.8 From this perspective, Davidson's proposal concerning Convention T may be viewed as a philosophical clarification of the concept of meaning. As competent speakers of a language, we need some common hold on what counts as the significance of a sentence in that language. Since it is generally believed that the most promiscuous use of language consists in talking about the world, the common notion of significance may be captured in formulations such as "snow is white is true iff snow is white"; thus, as Tarski saw, competent speakers of a language are likely to assent to it. In other words, native speakers assent to instances of Convention T precisely because it agrees with their folk conception of significance/meaning. It follows that Davidson's proposal does unearth a folk conception of meaning. Not surprisingly, this part of Davidson's work set off one of the most interesting literatures in philosophy, which includes Davidson's own subsequent work on the structure of beliefs, knowledge, truth, subjectivity, and the like. However, philosophical clarification of a network of common conceptions—such as the "triangularity" of truth, meaning, and beliefs—is a very different activity from giving an empirically significant account of some properties of linguistic objects in the mind/brain. This is because
folk conceptions of meaning themselves demand explanation, perhaps in a form of inquiry Chomsky (2000d) calls "ethnoscience"; also, folk conceptions could be false with respect to—or inapplicable to—the properties of the mind/brain. Davidson meets this objection by proposing a side project: the conception of meaning via Tarski's truth theory can be formulated as an empirically significant theory of meaning in consonance with the grammatical theory we already have. For the rest of this chapter, I will be exclusively concerned with this project. What sort of theory might ensue from the fact that native speakers assent to instances of Convention T? For Tarski, the task was to construct a theory for an entire language such that the effect of Convention T obtains for each sentence of the language. The task obviously requires that we are able to fully characterize a language L recursively such that its sentences can be explicitly identified. Tarski held that it can only be done for formal languages. Supposing it to have been accomplished, what is the character of such a theory? At its best, that is, for a sufficiently rich formal language, the theory could be viewed, as with any logical theory, as a "rational reconstruction" of the native speaker's informal intuition of the correct use of the truth predicate captured in Convention T. In other words, the theory would be an explicit generalization of the native speaker's intuition displayed in cases such as "snow is white is true iff snow is white." It will not be an explanation of the intuition: Tarski and, as we will see, Richard Montague never intended the project to be so (Stainton 2006). For natural languages, even this much is hard to achieve since there is no prior characterization of a natural language: structures of natural languages are matters of fact, not stipulation. So the only route available here is to first focus attention on that fragment of a natural language that does match formal languages in the relevant respects under suitable abstraction: to use "new" English to throw light on "old" English. Then we proceed to extend the system to cover further fragments. This part of the project is certainly empirical in the (very) narrow sense that we do not know in advance which structures will fall under the enlarging truth theory. In other words, the project is empirical in its attempts to catch up with Tarski in covering progressively richer fragments of these languages. At each stage of empirical progress, therefore, we have a more adequate rational reconstruction of the native speaker's original intuition; prima facie, no other notion of empirical significance attaches to the theory beyond a philosophical clarification of linguistic
significance for fragments of natural languages. Here, empirical work in the syntax of natural languages provides the relevant structures to which Tarski-type (metalinguistic) structures are suitably attached. This is exactly how Davidson proceeds. Davidson's general suggestion is "to mechanize as far as possible what we now do by art when we put ordinary English into one or another canonical notation." After all, it would be a "shame to miss the fact that as a result of these two magnificent achievements, Frege's and Tarski's, we have gained a deep insight into the structure of our mother tongues" (1967, 315). Frege showed how some of the quantificational idiom of English can be "put" into the canonical notation of first-order logic; Tarski (1935) showed how to give truth definitions for a variety of formal languages, including first-order logic. Davidson's suggestion is that we align these efforts. Invoke some canonical notation to "tame" fragments of English, then apply Tarski's method to the fragment so tamed—the project of logical form, as I envisaged it.9 Drawing on pre-LF work in generative grammar, Davidson invited "Chomsky and others" to join the project since they are "doing much to bring the complexities of natural languages within the scope of serious semantic theory" (p. 315). Specifically, canonical notation of logic will accomplish two tasks. First, we need to state "semantic axioms" such as "John is wise is true iff John is wise" by means of the satisfaction relation in a model whose domain contains John, and that enables us to characterize/construct the set of wise things. Second, rules of logical theory will enable us to compute the values of more complex sentences such as John is wise or snow is white from the values of their parts. I will attend only to the first part of the project. We need not deny that this program "works"; sometimes it does so extensively and elegantly, as in Larson and Segal 1995. In fact, as we saw for every llama ate a banana, an appeal to logical form might on occasion throw some light on how syntax may be organized; there are other notable (and more technical) examples in the literature (Heim and Kratzer 1998; Chierchia and McConnell-Ginet 2000). But caloric theory also worked elegantly for a wide range of thermal phenomena; otherwise, it is difficult to explain its popularity with distinguished scientists for hundreds of years (for more examples of this sort, see Bennett and Hacker 2003, 4–5). As with caloric theory, the issue is whether the program is needed. For that, we need to see why attachment of a truth theory coherently falls within the explanatory program of generative grammar beyond the semantics
already covered at LF. From what we have seen so far, the explanatory significance of the combined theory of LF and logical form does not exceed the explanatory significance the theory of LF already has.
3.5 Limits of Formal Semantics
In the post-LF era of generative grammar, the convergence between LF and logical form is typically motivated as follows. Robert May (1991, 336) thinks that the philosophical concept of logical form—that is, the concept of logical form that is concerned with "the interpretation of language"—serves as an "extrinsic constraint" on language theory. For example, we will want LF to have a form such that (truth-theoretic) semantic rules can apply, and "compositional interpretation" can be articulated (Larson and Segal 1995, 105).10 Once we have done so, LF and logical form will jointly serve to determine the structure "in the course of providing a systematic and principled truth definition for (a language) L." As a result, "a fully worked out theory of LF will be a fully worked out theory of logical form" (Neale 1994, 797). Supposing that we have a "fully worked out" theory, what does the theory explain beyond the narrow sense of empirical significance discussed above? There are two prominent responses to this query in the literature: (i) formal semantics explains the external significance of language; (ii) it explains the (mind-)internal significance of language beyond syntactic organization.
3.5.1 External Significance
It is widely held (Barwise and Perry 1983; Larson and Segal 1995) that one of the main goals of a semantic theory of human language is to furnish an account of the external significance of linguistic expressions: the fact that humans use languages to talk about the world. A semantic theory that does furnish such an account is then empirically significant. There is no doubt that grammatical theory is not empirically significant in this sense; the task is to see if theories of logical form are so significant. I assume that it is plausible to hold a pretheoretical conception of language theory in which we give an account of how language relates to the world. As noted, I also grant that our grasp of the truth predicate indicates the significance we attach to this function of language. To that extent, I assume that there is a sense of "understanding" in which a user
of L fails to understand a sentence of L if she cannot figure out the conditions in which the sentence turns out to be true or false. Since LF does not furnish an account of this understanding, LF needs to be supplemented by a truth-conditional theory in this sense. Notice that this conception can be questioned: just which aspects of understanding a sentence are to be covered by language theory? It is certainly an aspect of a competent user's understanding of English (or some dialect of it) that the expression get out of here may be interpreted either as an insult or an endearment depending on who uttered it on what occasion. It is unclear if such phenomena of language use fall under language theory per se. Studies on autistic and aphasic patients dramatically illustrate the point. Faced with the request Can you pass the salt please, an autistic patient said Yes and did nothing further. The saying Too many cooks spoil the broth elicited this response from an aphasic patient: Too many cooks, you know, cooks standing around the broth, they are talking and cooking (Gardner 1975, 79). In a famous work on the subject called Laura, Jeni Yamada (1990) found that Laura's language capacities were apparently intact, but her cognitive and pragmatic competence was limited. For example, Laura knew when she should describe herself and others as sad or happy, but apparently without the capacity to feel sad or happy. In each case, there is a clear sense in which a user fails to understand some English sentences. There is considerable evidence that such "pragmatic" failures are caused by selective impairment dissociated from the language area of the brain (Kasher 1991; Fromkin 1991); hence, these notions of understanding a sentence do not fall under language theory. So, even if we agree with Larson and Segal (1995, 31) that understanding a sentence requires knowing its truth condition "at the very least," we will expect some additional argument as to why this notion of understanding is to be covered by the "general enterprise initiated by Noam Chomsky" (1995, 10), unless it is held that every aspect of human understanding falls under the Chomsky-initiated general enterprise. In that event, the impairment cases just cited also fall under the enterprise. The issue clearly is: What falls under a version of the enterprise restricted just to some theoretically salient conception of language? Holding onto the pretheoretical (= unargued) conception, nonetheless, we will expect an empirically significant theory of truth to attach somehow to grammatical theory, possibly as part of a general theory of language use. This additional theory will purport to give an account of the extremely complicated ability to relate items of language to aspects of
the world, the ability that enables a user to determine if a sentence is true or false. Notice that all I am currently discussing is whether the formal semantic truth conditions explain the external significance of language; the postulated properties of expressions under discussion will be those that bring out this significance. Therefore, an empirically significant account of these properties must tell us how these properties obtain so as to relate mind-internal entities with those in the world. If formal semantics is not designed to do so, then either the program is assuming what it needs to explain or it explains some other significance of language, to which I turn in the next section. From this perspective, even superficial reflection suggests that a wide variety of cognitive capacities must work in tandem to connect items of language to aspects of the world. Keeping just to names such as John and Mt. Everest, not to speak of the external significance of whole sentences, at least the following things are involved: a system of linguistic structure to place the name, and a system of conceptual relations and conditions, along with factual beliefs, to place the thing named (Chomsky 1975, 46; Malt, Sloman, and Gennari 2003, 82–84). Hence, we need, if at all, a cluster of theories about the world, about linguistic practices, about formation of beliefs, and the like, to reach something like a naturalistic version of truth theory; then we find some way of attaching the cluster directly to grammatical theory. It is important to note that each of these theories will be ultimately needed in a naturalistic explanation of the external significance of linguistic expressions, for, together, they explain just the abilities that enable a competent speaker to assent to instances of Convention T such as "snow is white is true iff snow is white." In other words, it is hard to see how a speaker who lacks these abilities can master the use of the truth predicate to assent to instances of Convention T. Needless to say, the desired cluster of theories is not even in sight. Lowering our sights for now, the only issue that concerns us is whether the imposition of logical form is a desirable step toward the envisaged naturalistic theory. If it is not, then, granting that it is the only technique we know of for attaching a truth theory to a theory of language, we are either left with an "incomplete" theory of language in the form of grammatical theory, or we begin to doubt whether the pretheoretical conception of language theory is to be entertained at all. So, how does a theory of logical form begin to account for the external significance of language? One answer, underlying much of the formal
semantics literature, is that the recognition of differing truth conditions of the two readings of, say,

(2) The king of France is not wise.

is in fact the relevant intuition to be explained by language theory "at the very least." In other words, our recognition that (2) is structurally ambiguous is dependent on our intuition that (2) may be interpreted in two different ways, and that intuition, in turn, rests on the fact that the interpretations differ in truth conditions. Since this last intuition is the one we ultimately need to give an account of to resolve scope ambiguity, we must begin with structural ambiguity and end by showing how the structural ambiguity results in differing truth conditions to which differing truth values may be systematically assigned. Since the logical forms (4) and (5) motivate the desired progression, they are empirically significant. We need not deny that expressions such as (4) and (5), or their informal English paraphrases, enable us to recognize that the readings of (2) differ in truth conditions; that, in other words, is data. Just what is achieved by the use of logical notation beyond this? The only constructive response that I can think of is that logicians have mastered a technique not only for representing the said intuition in structures such as (4), but also for characterizing that intuition explicitly. Thus, having reached (4), logical theory imposes a scheme of interpretation in a model. We define satisfaction conditions for each of the primitive terms and assign them compositionally to structures such as (4). Suppose we have two sentences, John is wise and Mt. Everest is a mountain. As noted, the model-theory part of the theory of logical form assigns John to John, Mt. Everest to Mt. Everest, and so on. It is claimed that the resulting expressions in the semantic metalanguage will mimic the varying truth conditions. As far as I can see, the claim of external significance of truth theories ensues from these assignments alone; once these denotations are plugged in, the rest is Tarski-style construction, as noted. Just what is accomplished by picking John and mountain from a model such that John denotes John, or that mountain denotes the set of mountains, and leaving matters at that? Suppose we want to know how the external significance of John and mountain contributes to the external significance of John climbed a mountain. Clearly, we need to say something about John, mountain, and climbing, beyond saying that these things are denotations of John, mountain, and climbing. It is difficult to see how one can refuse to say these things while claiming that the expressions at issue have external significance.
Denotational theorists will, of course, deny this. Concerning the sentence France is hexagonal, and it is a republic, Pietroski (2005) notes that speakers can use France to (simultaneously) refer to various things—a certain terrain, a particular nation, or whatever. However, according to Fodor and Lepore (cited by Pietroski), "this does not yet show that semantic theories should mark such distinctions. Perhaps we should diagnose such facts as reflections of what speakers know about France, and not what they know about 'France.'" Fodor and Lepore thus seem to allow that France is a hexagonal republic—which is different from, say, a triangular dictatorship—since that is what France denotes. The entire explanatory weight thus is on denotes. What is denoting? Unfortunately, we have very little to work with here.11 In the formal semantics literature, denote is taken to be a primitive and is used in formulations such as "John denotes John." This contrasts sharply with grammatical theory. Grammatical theory uses the term "r-expression," where "r" abbreviates "referring." In that sense, the theory uses some aspect of referring. But the aspect of referring it uses is theoretically characterized: an r-expression is A-free. Needless to say, the theory does not characterize—hence, does not use—other aspects of the common term. We would expect formal semantics to have taken over that unfinished task. Formal semantics fails us in using the common term itself. It seems that people for whom "John denotes John" appears to be essentially analytic (= nonempirical) could have the right intuition. Suppose, by "denotes" we mean something like stands-for. Bertrand Russell (1919) held that a symbol stands for something. Hence, the knowledge that John is a symbol is the knowledge that John stands for something. This general knowledge and the device of disquotation yield "John stands-for/denotes John." On this view, the device of disquotation spells out that John is a symbol. One may know this without knowing what John (specifically) stands for or denotes; all we know is that John has external significance. The argument extends to common nouns like mountain: one may know "mountain denotes mountain" without knowing what mountain (specifically) means.12 We wanted to know the external significance of John and mountain to find out how they contribute to the external significance of John climbed a mountain; it is no progress to be told that John and mountain have external significance. The net situation, it seems, is that what would count as a genuine theory of the external significance of language is not in hand, and what is in hand is not a theory of external significance (see Stainton 2006 and Pietroski 2006 for more).
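The poverty of purely disquotational denotation can be made vivid in code. The following Haskell sketch is entirely my illustration, not a proposal from the formal semantics literature: when object language and metalanguage share their words, "denotes" reduces to the identity function on strings, licensing every instance of "w denotes w" while saying nothing about what any word stands for.

-- Disquotational "denotation" as the identity function; purely illustrative.
denotes :: String -> String
denotes word = word

main :: IO ()
main = putStrLn ("\"John\" denotes " ++ denotes "John")
-- prints: "John" denotes John -- true, but explanatorily empty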
Common conceptions of denotation and designation, and cognate conceptions of truth and meaning, then, have little role in a theory of language. In fact, their lack of explanatory value puts significant pressure on the classical notion of semantics—that is, the study of how language relates to the world. We may doubt whether this study applies to natural languages at all, whether or not there could be such a study for artificial languages or nonhuman signal systems. "It is possible," Chomsky (2000d, 132) suspects, "that natural language has only syntax and pragmatics." There is a certain irony here. We are discussing whether the concept of denotation interestingly captures the external significance of language. According to Chomsky, only two kinds of systems may be viewed as illustrating the common concept of denotation: animal call/signal systems and logical systems. The irony is that no significant notion of language applies to animal call systems, although the systems have external significance in the desired sense. Logical systems, on the other hand, may well be viewed as products or "offshoots" of the human linguistic system, but they do not have external significance. So, there is no case in which the notions of denotation, language, and external significance converge.
3.5.2 Syntax of Thought?
The preceding observations need not immediately affect formal semantics in its entirety. Although formal semantics does indeed claim to address the external significance of language, arguably its more interesting parts can be viewed as concerning essentially the structural conditions that enter into the computation of meaning. In other words, the program grants that words mean whatever they do; the program just looks at the general conditions they must meet on entering the semantic part of the computational system. Of course the program need not be restricted to just this much. Having described the structural conditions, the program might try to link them up with the conceptual system; the conceptual system, in turn, might be viewed as linked up with items in the world. Chomsky (2001b) describes the full enterprise: the denotation relation holds between the internal semantic representation of 'books' and its "semantic value," which in turn relates somehow to books; "the child acquires the denotation relation by virtue of causal properties of the world that relate external phenomena to mind-internal entities, say 'concepts'. . . . We can forward further inquiries to the physics department, or maybe the sociology department."
Accordingly, the total program has three parts: (i) structural conditions on logical and nonlogical terms, (ii) conceptual characterization of nonlogical terms, and (iii) linking nonlogical terms to items in the world. From what we saw, (iii) faces the problems discussed in the last section. For (ii), I will assume, following Chomsky 2002, 159, that "nobody really has much of an idea about the computational processes right outside the language faculty. One could say there is a language of thought or something like that, there are concepts, etc., but there has never been any structure to the system outside the language faculty." We will study the prospects for a theory of conceptual organization in the next chapter. In any case, the program begins and is at its sharpest at (i), where it is concerned with the structural conditions that enter into the computation of meaning. This is done by focusing basically on what are generally called "closed items" of a language. Words of a language fall into two broad groups: closed and open. Closed items, such as prepositions and articles, form a fixed and restricted set—a few dozen in English, for example. Typically, though not always, they do not have a meaning standing alone; they must combine with open items, such as nouns and verbs, to form larger units that have independent meaning. It is reasonable to think of the semantic properties of closed items as "wired-in" such that vagaries of mind-external language use are not likely to affect their operations. The semantics of open items such as nouns and verbs, in contrast, not only involves the conceptual system directly, but the significance of these items is largely derived from the external contexts of their use, as noted. The restricted attention to the structural conditions governing the (contribution of) meanings of closed items raises the possibility that formal semantics need not be primarily viewed as giving an account of the mind-external significance of language. Instead, it may be seen as describing the mind-internal properties of language. With the change in perspective, formal semantics will no longer be semantics in the sense of relating language and the world; rather, it will establish systematic relations between language and some mind-internal entities. Chomsky has held this perspective on formal semantics for several decades. In Chomsky, Huybregts, and Riemsdijk 1982, 46–47, he suggested that Montague semantics is "not semantics really" since "it does not deal with the classical questions of semantics such as the relation between language and the world." More recently (2003, 271), he has held that formal semantics should be regarded as a "form of syntax." Hence, it is "a study of symbolic objects and their properties—in this case, internal objects,
linguistic expressions, and semantic values." "Postulation of semantic values," he adds, "faces the same challenges as postulation of other theoretical entities: phonemes, atoms, whatever." Formal semantics thus extends the notion of syntax beyond grammar. What do we make of the altered perspective even if, for Chomsky, formal semantics may not meet the standards of science? Can we think of the structures described by formal semantics as internal to the child's mind, an account of a cognitive capacity at all? To answer these questions, we need at least a cursory look at how part (i) of formal semantics in fact works. Given the fixed, universal character of closed items, it is possible to use the abstract format of logical theory to spell out the conditions governing the semantic contribution of closed items to the overall meaning of the phrase in which they occur. As we saw, Russell invoked quantification theory to represent the contribution of the in a definite description. Once the contribution is spelled out in the logical format, the standard schemes of interpretation may then be attached to the expressions so constructed. It stands to reason that the treatment could be extended beyond the usual quantificational part to any closed item of a language, including wh-items, pronouns and anaphors, tense and agreement features, modality, and so on. For a brief look at how all this is supposed to work, consider Richard Montague's treatment of every man walks (Dowty, Wall, and Peters 1981, 108–109). As noted, the open items man and walk are viewed as nonlogical "constants" that are incorporated into Montague's formal system as man′ and walk′ respectively—that is, the latter are the "primed variants" of the corresponding English words. Although man′ is called "Common Noun (CN)" and walk′ is called "Intransitive Verb (IV)," both are viewed as semantic objects denoted by the type set of individuals, represented as ⟨e, t⟩ (= semantic value), a map from entities/individuals to truth values. Notice that the (vast) conceptual differences between man and walk are simply set aside; these open items enter the system only as abstract semantic types. The syntax of the sentence is displayed in terms of the usual phrase structure. The real action is on the closed item every. Keeping to the extensional part of the system, every is viewed as a logical constant belonging to the type ⟨⟨e, t⟩, ⟨⟨e, t⟩, t⟩⟩, a map from sets of individuals to truth values. This is arrived at as follows. Every is viewed as an operation on two one-place predicates of the type ⟨e, t⟩. Let these arbitrary predicates be P and Q, ranging over constant predicates such as man′ and walk′.
[Figure 3.1: Montague tree]
Thus, we have the familiar first-order form ∀x[Px → Qx] for the general expression every P is Q. Next, we apply λ-abstraction over the predicate expressions to yield λP[λQ∀x[Px → Qx]], where λ-abstraction turns a functional expression into an individual expression (Dowty, Wall, and Peters 1981, 98); in this case, λP[λQ∀x[Px → Qx]] is an abstract determiner. We can view this expression as capturing the meaning of every since it has the desired type ⟨⟨e, t⟩, ⟨⟨e, t⟩, t⟩⟩; it is desired because when the values for man′ and walk′ are inserted, the two ⟨e, t⟩s cancel out and a t results. Thus, the logical form of every man walks is λP[λQ∀x[Px → Qx]](man′)(walk′). The derivation is represented in figure 3.1. In frameworks that appeal only to (restricted) first-order notation, the general logical form for a sentence of the form every F is G is (every x: Fx)(Gx), with the "truth clause" |F - G| = 0, meaning the set of things that are F but not-G is empty (Neale 1990, 42–43). So, the logical form and the truth clause for every man walks are (every x: man x)(walk x) and |Man - Walk| = 0 respectively, where Man and Walk represent the cardinality of the sets {man} and {walk}.
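To make the type-theoretic bookkeeping concrete, here is a minimal sketch in Haskell in which the types E and T play the roles of Montague's e and t, so that E -> T corresponds to ⟨e, t⟩ and every receives the type corresponding to ⟨⟨e, t⟩, ⟨⟨e, t⟩, t⟩⟩. The three-element domain and the membership facts for man′ and walk′ are invented assumptions, supplied only so that the sketch runs.

-- Extensional "every man walks" with Montague-style types; model invented.
type E = String   -- entities (type e)
type T = Bool     -- truth values (type t)

domain :: [E]
domain = ["a", "b", "c"]

man', walk' :: E -> T   -- primed variants of the open items, type <e,t>
man'  x = x `elem` ["a", "b"]
walk' x = x `elem` ["a", "b", "c"]

-- every = \P -> \Q -> for all x, Px -> Qx; type <<e,t>,<<e,t>,t>>
every :: (E -> T) -> ((E -> T) -> T)
every p q = all (\x -> not (p x) || q x) domain

main :: IO ()
main = print (every man' walk')   -- True

Applying every to man′ and then to walk′ consumes the two ⟨e, t⟩ arguments and yields a t, exactly the cancellation described above.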
The preceding sketch is enough evidence that formal semantics can be "done," as long as grammatical theory keeps supplying enough structures for semanticists to attach logical expressions to. Much of the current work thus depends parasitically on linguistic theory to furnish fine-grained analyses of linguistic structures; once those structures are made available, say by Binding theory, the primitive basis of logical theory is suitably expanded to accommodate the new structures. Could all this be viewed as describing the mind of the child? Consider the remark, once made by Chomsky, that Peano axioms are actually embedded in the (child's) mind. The postulation was supposed to explain why, pathology aside, children are able to acquire the standard number system without fail.13 Arguments from poverty of the stimulus suggest that children must have internalized a system that recursively generates numbers and other discrete infinities. With regard to numbers, Peano axioms also do the same. Yet, it will need a lot more than just these two facts—psychological and textual—to conclude that children have internalized these axioms. Peano axioms are "rational reconstructions" of some body of mathematics; hence they are essentially normative in character. They are useful for studying various formal properties of arithmetic; they are not intrinsically useful for studying the nature of arithmetical knowledge internalized by the child. If anything, Peano axioms are products of what Chomsky (1980) calls the "science-forming capacity," whose object in this case is the body of arithmetic, not the child's mind. It is not at all obvious that the same "axioms" will show up when the capacity shifts its gaze to the child. What children do is a matter of fact, and a very different line of inquiry is needed to account for it. In fact, Montague (1974, chapter 3) was very clear about the character of his project. He insisted that the theory is entirely mathematical in nature in that no psychological implications should be read into it. His goal was to treat English as a formal language with the "usual syntax and model theory (or semantics) of the predicate calculus" (p. 189). With Davidson, he regarded "the construction of a theory of truth . . . as the basic goal of serious syntax and semantics." He was not giving an account of how English speakers in fact attach interpretations to sentences; he was merely assigning logical interpretations to a class of English expressions taken as objects of inquiry by themselves. According to Thomason 1974, 3, Montague never suggested that "his work should be applied to topics such as the psychology of language acquisition."14 A number of recent and influential monographs on the subject pursue the topic exactly as Montague did, differing only in the specific logical theory they adopt: Heim and Kratzer (1998) use the extensional part of Montague's model theory; Larson and Segal (1995) use only first-order logic and its familiar scheme of interpretation. But now the claim is that
they are studying "I-languages"—that is, they are studying "knowledge" of language as internalized by native speakers. Thus, a purely mathematical inquiry has turned into a psychological one without any noticeable change in the internal vocabulary of the enterprise. Constructs of formal semantics such as entity, truth value, semantic value, and so on have now assumed an "internalist" character such that they are to be viewed as theoretical postulations uncovering aspects of the mind. Restricted to mind-internal aspects of language, what do the postulations of formal semantics now pick out? Given what entity and truth value commonly mean, how do we conceptualize a mapping of entities to truth values internalistically? The question is pertinent because it is natural to view postulations such as entity, truth value, and semantic value as relating to the mind-external significance of language, as an attempt to explain the "aboutness" of linguistic expressions, whatever the merit of the project for a theory of language may be. No doubt, the "world-bound" character of formal semantics remains incomplete until several other steps are taken, as we saw. The world side of the language-world relation that obtains for Nixon simply uses Nixon: Nixon denotes Nixon; it tells us nothing about what denotation accomplishes. Nonetheless, the postulated relations of denotation, reference, and designation intend to relate linguistic expressions to items in the world. Thus, in generating "Arnold Schwarzenegger is big is true iff (the individual) Arnold Schwarzenegger is big" as a theorem, formal semantics is committed to the claim that Arnold Schwarzenegger designates Arnold Schwarzenegger, the famous bodybuilder, "the actual, physical person" (Larson and Segal 1995, 197, 203), not an "internal representation" of Arnold Schwarzenegger. It is hard to see what else designate could mean. That is why fictional names such as Nixon—where Nixon names a pain that has ceased to exist—are a puzzle in semantics. Since it is unclear what Nixon means in a pain-free (state of the) world, we are prone to entertain the false belief that "pains like Nixon never cease to 'exist'" (Kaplan 1989, 612). Given what we know about "internal representations," it would not have been a problem if Nixon picked out (just) an internal representation of Nixon. For the same reason, Chomsky's examples of the average man, the flaw in the argument, John Doe, and Joe Six-Pack are problems for formal semantics. Not only is it difficult to locate flaws and average men in the world, but some of these singular terms—the average man, John Doe, Joe Six-Pack—are used precisely for making
general comments about the world, and not for picking out anything in particular. Hence, it is unclear what it means to have a model with John Doe and Joe Six-Pack occurring as individuals in it. Constructs of formal semantics have an intrinsic urge to fly off the paper. No wonder major books on formal semantics are often full of pictures and hand drawings of people, dogs, cats, perambulators, and the like (Barwise and Perry 1983; Larson and Segal 1995). The contrast with syntax is striking. Postulates of grammatical theory such as noun phrase, anaphora, trace, c-command, and so on, make no reference to entities outside the mind: there are no noun phrases or reflexives in the world. If anything, these are likely to be properties of the mind/brain. Notions like truth, reference, and the rest, on the other hand, do not seem to be "psychological categories" at all: it is better to think of them as "modes of Dasein" (Jerry Fodor, cited in Jackendoff 1992, 159). From this perspective, it is implausible to assimilate truth and entity with anaphora and c-command under the common head "syntax." To summarize, the actual domain of formal semantics—structural conditions on closed items—raised the prospect of viewing it as restricted to language-external but mind-internal aspects of meaning; in that sense, it apparently enlarged the scope of language theory. However, its conceptual tools are suitable, if at all, for addressing classical language-world relations. Given its conceptual resources, formal semantics cannot have it both ways.
Russell’s Equivalence
To pursue the topic of formal semantics from yet another direction, recall that we have so far agreed that the use of logical notation may help represent the data of linguistic intuitions somewhat perspicuously. On favorable occasions such as (79) above, logical representations might even suggest directions for linguistic inquiry. Obviously, only those representations will be taken notice of (or deliberately constructed) that fall within the current scope of grammatical theory. To that extent, formal semantics supports grammatical theory without enlarging its scope. From this limited perspective, maybe we can think of formal semantics just as a "mapping" device concerning three symbol systems: linguistic expressions, expressions of logic, and expressions of set theory with arbitrary elements—"a study of symbolic objects and their properties," as Chomsky put it. It seems that this was Montague's only interest in the project, as noted. The task of formal semantics then is to set up two sets of relations—between linguistic and logical expressions, and between logical and set-theoretic expressions.
Thus, we relate every man walks to (every x: man x)(walk x), which in turn is linked to |Man - Walk| = 0. A systematic matching of three symbol systems for a sufficiently large fragment of, say, English under the overall constraints imposed by syntax is no mean feat, whatever be the point of the exercise. To emphasize, we are no longer concerned with what expressions of formal semantics denote. That is, we are ignoring the relationship of the last of the triad—set theory—with nonsyntactic objects such as concepts or entities in the world. The current perspective essentially means that we are viewing the model-theory part of formal semantics, at best, as highlighting some general set-theoretical intuitions; such intuitions accompany linguistic intuitions anyway. In fact, studies of these intuitions have an eminent precedent. Although, as noted, Russell (1905) did not have an explicit model theory in his theory of descriptions, he appealed to intuitive set theory whenever needed. Recall his famous remark on the sentence The present king of France is bald: "If we enumerated the things that are bald, and then the things that are not bald, we should not find the present king of France in either list. Hegelians, who love a synthesis, will probably conclude that he wears a wig" (p. 485). In this light, the model-theory part of formal semantics is to be seen as a visual aid to the (set-theoretic) intuitions we already have, not as an account of those intuitions. Even there, the resources of formal semantics seem severely restricted in representing linguistic intuitions. Formal semantic treatment of English definite descriptions illustrates the point. I do not have the space to develop a positive theory of descriptions (Mukherji 1987, 1989, 1995). I will only make some brief remarks on why I think that the resources of logical theory fail even to represent the linguistic intuitions concerning definite descriptions, notwithstanding over a century of intense effort. (I must add that positions on this turbulent topic are so hardened by now that nothing is even remotely settled.) To begin, it is natural to think that the original proposal for logical form must have been based on some intuitive understanding of some natural-language sentences of a rather simple sort: Socrates is wise, All men are mortal, Tully is Cicero. So it is no wonder that, given stipulation, at least these sentences will display their meanings via their logical forms, since the logical forms "display" their meanings via these sentences. Therefore, we need to show that, beyond the initial stipulations, certain theoretically interesting aspects of the sentences of a language fragment fall under the scope of logic. This much, I assume, is maintained at least since Frege; it is certainly assumed by Montague.
Bertrand Russell (1905) identified such a language fragment. Beginning with the stipulated resources, he proposed an innovative canonical form for English sentences with definite descriptions. The canonical form enabled him to make an empirically significant distinction between the surface and the logical forms of English sentences, as we saw. Armed with the distinction, he proposed a solution within logical form to a semantic problem that arose in the surface form. Definite descriptions thus were something of a test case for the canonical power of formal logic. The problem is that, in over one hundred years of voluminous discussion on this topic, there is still no consensus on how to treat definite descriptions: the has been viewed as a singular quantifier, universal quantifier, referring expression, term operator, abstractor, Hilbert's epsilon operator, and so on. As a result, the test case has continued to be the problem case. Nevertheless, even if there is disagreement about the operational character of the, there is wide consensus on the basic meaning of the definite article via what David Kaplan (1972) has called Russell's "fundamental equivalence"; it is also called the "uniqueness condition."15 I will argue that the fundamental equivalence gives an entirely wrong account of the intuitions of English speakers when they use/interpret definite descriptions.16 Russell's fundamental equivalence required that an English sentence of the form
(80) The F is G
is equivalent to two other English sentences of the form (81) and (82),
(81) One and only one F is G
(82) Exactly one F is G
Assuming the equivalence between (81) and (82), Russell argued that, say, (82) could be rewritten as (83),
(83) (∃x)(Fx & (∀y)(Fy → x = y) & Gx)
as we saw, or equivalently as (84),
(84) (∃y)(∀x)((Fx ↔ x = y) & Gy)
So, the "quantificational" treatment of the depends crucially on the equivalence of (80) with (81) and (82)—that is, the translation of (82) into (83) or (84) requires the fundamental equivalence between (80) and (81) or (82). As an aside, I note that the canonical forms (83) and (84) are hardly perspicuous for triggering set-theoretic intuitions. For that
purpose, their quasi-English rendition and, in fact, the original (80) seem better suited, unsurprisingly. As noted, most formal semantics approaches take the stated equivalence for granted, even if they disagree with the specific canonical form of (83)–(84). To take some examples, Montague held that the role of the definite determiner is to make two assertions: (i) that there is one and only one individual that has the property of being F, and (ii) that this individual also has the property G (Dowty, Wall, and Peters 1981, 197). Montague brought out this role by translating the as (85), by applying λ-abstraction over (84).
(85) λF[λG∃y[∀x[Fx ↔ x = y] & Gy]]
According to Neale 1990, 45, The F is G is true iff (i) all Fs are Gs and (ii) there is exactly one F. Thus the truth clause for The F is G is
(86) '[the x: Fx] (Gx)' is true iff |F − G| = 0 and |F| = 1
Although Larson and Segal (1995, chapter 9) recognize the "namelike" or "referring" character of some the-phrases, generally they seem to favor a quantificational treatment along Russellian lines; otherwise, they would need to hold that the, a closed item, is widely ambiguous. Thus, The F is G is true just in case there is exactly one F and this F Gs (Larson and Segal 1995, 320). In their formal notation, The F is G translates as (87).
(87) Val(⟨X, Y⟩, the, s) iff |Y − X| = 0 and |Y| = 1
This means, roughly, that the semantic value of the is such that, given a sequence of objects s, the the-relation holds between (the cardinalities of) two sets X and Y just in case Y is a subset of X and the cardinality of Y is 1. In effect, both Neale, and Larson and Segal, implement Chomsky's idea that the F is to be viewed as a universal quantifier with the special condition that the set of Fs is a unity when F is in the singular, and more than one when F is in the plural. This enabled Chomsky (1977, 51) to say that the "meaning of the, then, is not that one and only one object has the property designated by the common noun phrase to which it is attached." As shown in the citations, both Neale and Larson-Segal miss this important qualification while implementing Chomsky's idea. In any case, as Evans (1982, 59) points out approvingly, Chomsky's formulation turns the into a straightforward numerical quantifier. It is the "numerosity" picture of the that is central to Russell's equivalence. In contrast to the quantificational treatment, Kaplan (1972) introduces a primitive operator, called the "definite description operator," which
generates a term t in association with a description, such that t denotes one and only one object satisfying that description. Thus Kaplan treats the F as a term (a designator) rather than as a quantified phrase, although his semantic rule obeys Russell's fundamental equivalence. I will hold on to this point exclusively to enquire if the is to be viewed as a quantity-generating operation at all, in whatever guise. Since quantifiers, especially numerical quantifiers, can certainly be viewed as "importing" quantity (Quine 1980, 169), the question whether the is a quantifier becomes a special case of a more general question that arises even if we treat the-phrases as terms obeying Russell's equivalence. By focusing on the fundamental equivalence, therefore, I am setting aside the standard controversy between namelike and quantifierlike treatments of definite descriptions. As such I will not be concerned with scope distinctions, anaphoric properties, referential uses, functions across possible worlds, and the like. To raise these theoretical issues, we need some pretheoretical idea of the role of the in language. My contention is that people get into these theoretical issues because it is thought that the pretheoretical issue has already been settled in Russell's equivalence.17 Furthermore, almost all these theoretical issues are raised within a restricted set of choices. That is, it is taken for granted that if you are not happy with the quantificational treatment of the, it is your burden to show that the-phrases are namelike, failing which you return to the quantificational picture somehow (or settle for ambiguity of a closed item). This assumes that there must be a satisfactory account of the within the broad framework of logic. By suspending the theoretical issues, then, we are leaving open the possibility that the closed item the—perhaps the most frequently used closed item of English—escapes the broad framework of logic; the may not be a logical operation, in any clear sense of "logic," at all (Mukherji 1987, chapter 3). To create interest in this issue, I wish simply to raise a number of queries of varying degrees of salience regarding the fundamental equivalence, more or less at random; I have no space for addressing them individually. First, while the semantic value of (the singular) The F is one in the frameworks of Neale, Chomsky, and Larson and Segal, Russell's equivalence required the quantity to be not just one, but exactly one or one and only one. How do we capture these additional qualifications set-theoretically? Second, one item of English, the, is said to be synonymous with two other items of English, one and only one and exactly one. Why should there be such a proliferation of synonymy among the closed items of a language? Third,
turning to plural descriptions, when I say The world wars were catastrophic events, do I mean that exactly two things were world wars and each was a catastrophic event? If yes, then what do I mean when I say The claws of a tiger are dangerous? I do not even know how many claws a tiger has. If not, why not? Why don't we have definite articles for exactly two, exactly three, and so on? Fourth, we know that the-phrases can sometimes "grow capital letters" (Strawson 1950, 341) and turn into names: The White House, The Good Book, The Old Pretender. Can any other determiner phrase, including exactly one house, ever do so? Fifth, Kripke (1972) suggested that singular descriptions and complex demonstratives may be used to "fix the reference" of a name: the teacher of Plato may fix the reference of Socrates without being synonymous with it, that guy with dark glasses may fix the reference of Karunanidhi. Can a numerically quantified phrase such as exactly one teacher fix the reference of a name? Sixth, we noted that Quine (1980, 169) held that quantifiers import quantity; why did he treat descriptions separately as importing "uniqueness"? Is there a distinction between importing quantity and uniqueness? We can go on like this (see Mukherji 1987 for more).18 Hoping that the issue of fundamental equivalence is now open, and setting other issues aside, consider how Strawson (1961, 403) compares numerically quantified phrases with the-phrases:

One who says that there exists one thing with a certain property typically intends to inform his hearer of this fact. Thereby he does indeed supply the hearer with resources of knowledge which constitute, so to speak, a minimal basis for a subsequent identifying reference to draw on. But the act of supplying new resources is not the same act as the act of drawing on independently established resources.
Observations like this are typically interpreted in the literature as introducing a "pragmatic" new-given (or new-old) distinction; however, the "semantics" (= intuitive set theory) continues to be the same. For example, Jackendoff (2002, 397 n.) holds that "the definite article expresses a claim . . . that a unique individual satisfies this description (more or less as in Russell's 1905 explication of definite descriptions)." However, if a speaker anticipates a situation in which "the purported referent is not present" in the hearer's knowledge base, a definite description or an "unadorned" proper name won't be used. Also, we need to "fall back on some repertoire of repair strategies" if it so happens that the speaker says the apple on the table, but the hearer fails to see any apples, or sees two of them (Jackendoff 2002, 325). So, the fundamental equivalence holds along with the new-given distinction.
No doubt, Strawson does want to introduce the new-given distinction in the passage cited: the so-called familiarity theory of descriptions. But in doing so, he is making a much deeper point—routinely missed in the literature—on the role of the definite article. I also think that Strawson's own earlier—and more famous and controversial—view of definite descriptions (Strawson 1950) is partly responsible for this systematic misinterpretation. In that work, Strawson suggested that although the F does not mean one and only one F as Russell thought, nonetheless, one and only one F gives the condition/presupposition for asserting the F. I have argued elsewhere that this weaker claim does no damage to the basic spirit of Russell's theory (Mukherji 1995). Without repeating the arguments here, it is clear that Strawson did not detach himself completely from Russell's fundamental equivalence. The distinction between identifying and resource-presenting functions advocated in the passage cited, I will argue, amounts to a distinction between all typically quantified phrases including, in particular, one and only one F, on the one hand, and definite descriptions of all varieties (and perhaps proper names and demonstratives as well, but I will not argue this point here), on the other. At a minimum, Strawson is claiming that the expressions the F and one and only one F signal very different speech acts. To proceed, let me try to understand this claim with straightforward linguistic intuitions. Let us suppose a schoolteacher asks in a history class (scenario 1), "How many kings of France were guillotined?" Assuming just one king of France was guillotined, an appropriate and correct response would be, "One." If the teacher pursues the matter by asking, "Isn't that more than one?", an appropriate answer would be, "No, exactly (or just or only) one." The point is, the question requested a number, possibly a unique one, and that request is not fulfilled by uttering the king of France. Suppose now the teacher asks, while showing pictures of kings from different countries (scenario 2), "Which one of them ruled from Versailles?" Now an appropriate and correct answer would be, "The king of France." In this case, it would be totally inappropriate to respond with one and only one king of France. Before I proceed to build on these intuitions, notice that it is too late for formal semantics to claim that these intuitions show, at best, that the king of France and one and only one king of France differ in felicity conditions rather than in truth conditions.19 John Austin (1961, 231) observed that to be true (or false) is not the "sole business" of utterances; utterances
can also be "happy" or "unhappy" in a variety of ways. There are "conventional procedures" for using specific linguistic expressions, and "the circumstances in which we purport to invoke this procedure must be appropriate for its invocation" (p. 237). These circumstances will be the felicity conditions for the use of the relevant expressions. To take Austin's example, suppose someone says, while playing a game at a children's party, "I pick George," and George says, "I'm not playing" (p. 238). The felicity condition for the (correct) use of I pick George requires that George is playing; his refusal makes the use of the expression, on that occasion, infelicitous. Suppose that Austin's distinction applies beyond what he called "performative utterances" such as I promise, I name, I pick, I congratulate, and the like. Suppose also that the felicity conditions for uses of the F and one and only one F are as described in scenarios 1 and 2 respectively. Finally, suppose that the truth-conditional contribution of these expressions is the same via the fundamental equivalence. In sum, we grant the formal semanticist her best ground. This is the best ground because the felicity condition for definite phrases proposed, say, in Heim 1982, 165, is not sufficient for distinguishing the-phrases from one and only one–phrases, though Heim's condition may be sufficient for distinguishing definite and indefinite phrases. This is because Heim views the uniqueness condition itself as part of the felicity condition rather than the truth condition; as noted, the uniqueness condition applies to one and only one–phrases as well. Supposing that the-phrases and one and only one–phrases differ in their felicity conditions as above, what follows? Notice that the point at which Strawson's distinction was introduced concerned just the limited issue of whether set-theoretic approaches capture salient linguistic intuitions. Central to set-theoretic approaches is the issue of whether the is a quantifier (or a name) at all. Strawson's distinction shows that our linguistic intuitions suggest that the is not a quantifier in the set-theoretic sense, even if one and only one is. If this distinction is traced to felicity conditions rather than truth conditions, fine, but then the set-theoretic step of truth conditions fails to capture the distinction that our intuitions demand. In other words, what is central to our intuitions about the—the reason for having it specifically in (some) languages in addition to one and only one—is not captured in the formal semantics approach, given that such approaches are incapable of capturing crucial felicity conditions such as "drawing on independently established resources." To my knowledge, Heim 1982 is the first serious formal attempt
to incorporate Strawson’s view of descriptions in her file semantics: ‘‘For every indefinite, start a new card; for every definite, update a suitable old card’’ (Heim 1982, 276). Unfortunately, technology apart, it is hard to distinguish file semantics from model theory, as Heim’s later work based on Montague semantics illustrates. Further, grant that Heim’s proposal works for interactions between indefinite descriptions and anaphoric definite descriptions: A dog bit a woman. The dog ran away (p. 275). Still, it is unclear how to ‘‘update a suitable old card’’ for a typical nonanaphoric ‘‘Russellian’’ description like the king of France or the first line of Gray’s Elegy; obviously, the entire explanatory weight is on suitable (Mukherji 1987, 1995 for more on ‘‘Russellian’’ descriptions). Returning to Strawson’s observations, it seems that we respond with exactly one to questions of the form how many?; we respond with the F to questions of the form who/what?, when asked in a context in which the speaker is assured that the audience can draw on an independently established resource. It begins to look as if Russell’s equivalence is involved in a fundamental confusion amounting almost to a category mistake. We may view the passage cited as a theoretical explanation of that confusion. It is important to distinguish such who/what questions from another sort mentioned in Strawson 1950. There, Strawson suggested that we must have some way of forestalling the question, what (who, which one) are you talking about, as well as the question, what are you saying about it (him, her). That distinction, in Strawson’s technical sense, was between referring and ascribing, and it grouped both the F and exactly one F as referring expressions in that technical sense. In the passage cited from his later writing (1961), exactly one F performs a resource-presenting function, while the F performs an identifying function within the generic referring function mooted in Strawson 1950. The distinction can be generalized. We answer a how many question with any of all Fs, some F, a few F, most F, as well as exactly one F.20 Quantified phrases, therefore, do what they are supposed to, namely, signal a quantity of Fs.21 Designators, on the other hand, simply refer back to these objects signaled in advance. Contrary to what Russell suggested, a use of the never produces a quantitative picture, implicitly or otherwise. This is not to deny that, in a use of a singular definite description, the use of the predicate F in the singular itself bears information (roughly) paralleling the information borne by one and only one F, set-theoretically speaking. But this fact has nothing to do with the definite article and has no e¤ect on a general theory of def-
descriptions. Curiously, the same point emerges in Strawson's rare agreement with Russell: "There are certain well-marked cases in which a full-scale Russellian account is obviously correct. . . . The only man who refused to sign was shot. . . . It is clear that part of the content of the information explicitly conveyed is precisely that one and only one man refused to sign" (Strawson 1995, 400). Explaining further, Strawson observed that the "presence of the word only is just what marks such cases as clearly Russellian" (p. 400). Strawson did not cite a case in which the presence of the marked it as clearly Russellian. So, the semantic value for the F proposed independently by Neale and Larson-Segal is now explained. Not surprisingly, both set up semantic values only for the singular F without setting up the value for the full definite phrase. It is hard to see what else to do with the resources of formal semantics, for there is no additional semantic value to be set up for the definite article itself. The same truth clause will remain intact for a variety of quantified phrases—one F, one and only one F, exactly one F, nonzero but less than two Fs, unique F, and so on—even if natural languages failed to contain the definite article. The most ubiquitous closed item of a language turns out to be meaningless from the (formal) semantic point of view. At least one crucial instance thus supports Strawson's (1950) general claim that "ordinary language has no exact logic."22
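In the notation of (86), the observation can be compressed into a single clause; the grouping of the phrases is mine, but the truth condition is just Neale's:

[the x: Fx](Gx), [exactly one x: Fx](Gx), and [one and only one x: Fx](Gx) are each true iff |F − G| = 0 and |F| = 1

Whatever distinguishes the from its numerical cousins, then, must lie outside this clause.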
3.6 Summing Up
In this chapter, I have reached the following conclusions:
1. There is no coherent division between syntax and semantics in grammatical theory; in that sense, grammatical theory already contains a semantic theory.
2. Logical notation may at best be viewed as representing data of linguistic intuitions; this much may be accomplished by grammatical notation, which also has (independent) explanatory value.
3. Many of Davidson's structural conditions for truth theory can be captured in grammatical theory; what cannot be captured seems merely to be pretheoretical assumptions.
4. There is no interesting sense in which formal semantics explains the external significance of language, although its vocabulary purports to attain that goal.
5. For this reason, formal semantics cannot be viewed as explaining the internal significance of language either.
6. Linking up with (2), resources of formal logic cannot even represent linguistic intuitions in the crucial case of definite descriptions.
The upshot is that there is no compelling reason so far to deny the original proposal that the nonphonological component of grammatical theory may itself be viewed as a theoretically salient semantic theory (section 3.3). This is because the most relevant nongrammatical theory, namely, formal semantics, lacks explanatory motivation; hence, attaching this theory to grammatical theory does not enlarge the explanatory scope of language theory. Recall from chapter 2 that the scope of grammatical theory is so far restricted to certain mind-internal computational properties of language. To that extent, biolinguistics promotes a specific, theory-internal conception of language. It is plausible to hold, from what we have seen so far, that the grammatical conception of language is all that we have. After all, a "final" account of utterance tokens must deal with many physiological factors that relate to the production and hearing of sound. Yet we do not expect a linguistic theory to include a subtheory of ear and throat mechanisms. So why not drop the axe even earlier, at the natural joint where grammar ends and other things begin? I am not advocating, in advance of research, what grammatical theory ought to look like and what it should cover in the future. Research on grammars will proceed under its internal compulsions. In so doing, it will abandon some of its earlier goals and cover new phenomena on a case-by-case basis. In some cases, as we saw, it might embrace entirely new domains not anticipated in its earlier phases—for example, the by-now familiar incorporation of LF, some properties of external systems, introduction of optimal mechanisms, and so on. It is also not ruled out that some of the semantic phenomena currently covered by theories of logical form—insofar as they were designed to capture structural conditions on meaning—might fall under grammatical theory itself. In that case, notions like predication, propositional form, and truth-"indications" will no longer be viewed as primitives of semantic theory, but as entities projected by syntax for the C-I interface: "As if syntax carved the path interpretation must blindly follow" (Hinzen 2006, cited in Chomsky 2006a; also Pietroski 2005; Stainton 2006 from a different direction). For these purported extensions of grammatical theory, we need to examine closely the additional machinery invoked to see if it has independent explanatory motivation, or whether it pushes the problems with
logical form just a step back. If successful, the approach will reinforce the idea that grammatical theory itself contains the semantics of human languages, understood as structural conditions on meaning. In any case, progress in such directions is likely to be painstaking and incremental and always from within the form of theory already reached. The process is common in the natural sciences. Studies of the microworld and fields that are now routine in contemporary physics were not part of classical mechanics that studied projectiles and pendulums. Why should the course of biolinguistics be otherwise?
4
Words and Concepts
As currently pursued, one of the goals of biolinguistics—perhaps the goal—is to explain how sound-meaning correlations are established in languages. The Government-Binding Theory sought to explain those correlations as interactions between lexical information and universal principles, with language variation restricted to parameters attached to these principles. As we will see in the next chapter, the Minimalist Program went several steps further to establish such correlations in even more restrictive terms. But the descriptive goal remained the same: establish sound-meaning correlations.
‘‘Incompleteness’’ of Grammar
Now the problem is that, in some widely accepted sense of "meaning," biolinguistics fails to establish sound-meaning correlations. We argued at length in the previous chapter that the grammatical level of LF represents meanings. But grammar will not distinguish between (the wide differences in) the meanings of (88) and (89).
(88) John decided to attend college.
(89) Bill tried to attend church.
(88) and (89) have nearly identical LF representations although they differ widely in sound. Thus, two PF representations have been correlated with one LF representation. Let me be very clear about what exactly the problem is. The task for biolinguistics is not to establish sound-meaning correlations per se, but to establish them in accordance with the native speaker's intuitions, which are based on the knowledge of language internalized by the native speaker. So, properties of English sentences are not the focus of inquiry: knowledge of English, under usual abstractions, is. If the native speaker takes
a single sound to have different meanings—ambiguity, as in flying planes—the theory explains that by postulating mechanisms for displacement. If the native speaker takes two sounds to have the same meaning—synonymy, as in active-passive pairs—the theory explains that as well, again by postulating mechanisms for displacement. By parity, if the native speaker takes two sounds to have two different meanings, as in (88) and (89), the theory ought to explain that too: "linguistic semantics, if it is to explicate the matching between form and meaning, must be able to distinguish the meanings of words" (Jackendoff 2002, 286). If this is a legitimate demand on language theory, then grammatical theory fails to accomplish it. The problem could be identified as follows. (88) and (89) differ at three places: the proper names John and Bill, the verbs decide and try, and the common nouns church and college. These pairs vary in how the LF-representations of (88) and (89) fail to meet Jackendoff's demand. Take the verbs decide and try, which are listed in the lexicon as transitive verbs that take infinitival clauses; they are also control verbs in that they must have thematic Subjects, among other interesting properties. This cluster of properties distinguishes them from other transitive verbs that take infinitival clauses—for example, seem and appear have the complex property of deleting the S-bar node in their complement structure in the G-B formulation. So grammatical theory does get tantalizingly close to a unique characterization of individual lexical items. Yet, as things stand now, we are still left with at least two verbs—(seem, appear) and (decide, try)—in each subclass such that grammatical theory cannot distinguish between the members of a subclass; in that sense, decide and try have the same LF-meaning. However, it is surely an empirical question whether there are grammatically significant differences between decide and try; we just need to know more. Suppose there are. In that case, grammatical theory will succeed in matching each verb with its (own) LF-meaning. Now, if Jackendoff demands that language theory must also account for the conceptual differences between deciding and trying, then he would be appealing to a conception of language theory we may want to question, as mooted in the previous chapter (section 3.6). It seems that, in any case, we need to reach decisions about where to place viable limits on what counts as "language theory"; as Chomsky observes, a theory of language cannot be a theory of "everything." Some authors seem to hold the view that a theory of language is supposed to furnish "a complete algorithmic account" of how "understanding is reached from specific acoustic properties of utterance tokens" to the
‘‘speaker’s communicative intentions’’ (Fodor 1989, 5). It seems to me that this conception of what a (general) theory of language ought to accomplish in the final analysis is shared subliminally by most people thinking on language; we are not really done until each of the components of the entire stretch somehow falls in place. I do not see what it means to satisfy this goal, not to speak of whether it is satisfiable. For one thing, grammatical theory does not even pretend to generate full tokens in any useful sense. Insofar as theories are concerned, this must be the case with any theory: grammar, truth conditions, speech acts, or whatever—none takes you from ‘‘specific acoustic properties of utterance tokens’’ to the ‘‘speaker’s communicative intentions.’’ In that sense, a speech-act theory is no more ‘‘complete’’ than grammatical theory. So, the current suggestion is that, since a cut has to be made in any case, let us make it at the sharpest joint, namely, right after LF, and shift other notions of meaning and content beyond language theory. At the current state of understanding, the suggestion lacks plausibility. No doubt, as noted, the issue of whether there are significant grammatical distinctions between decide and try is an empirical one. Nonetheless, from what we know, it seems rather unlikely that enough grammatical distinctions will be found across languages to (uniquely) identify each verb in grammatical terms. Moreover, each of the subclasses noted above is certainly ‘‘open-ended’’ in that new verbal labels are introduced as speakers of English experience new processes and events; some of these labels might enlarge the subclasses just listed. The problem obviously compounds for (John, Bill ) and (college, church). We called them ‘‘proper nouns’’ and ‘‘common nouns’’ respectively. These are not even grammatical categories. For example, grammar will not distinguish between the proper noun, The White House, and the ‘‘ordinary’’ det-phrase (DP), the white house. In fact, John itself could be viewed as a DP with null det; similarly, for college and church in (88)– (89). Roughly, insofar as grammatical theory is concerned, these are all þN items that occur in DPs, period. Thus, given a pair of grammatical representations hPF, LFi for the strings John tried to attend college and John decided to attend church, no further partitioning of representations is possible in grammatical theory. This much is pretty obvious (for more, see Marconi 1996, chapter 1). What is not so obvious is the lesson to be drawn from this. Suppose we have a concept of meaning under which (88) and (89) are viewed as synonymous. I am surprised to learn that a similar view has sometimes been o‰cially aired. Jackendo¤ (2002, 338) cites Grimshaw as follows:
"Linguistically speaking, pairs like [break and shatter] are synonyms, because they have the same structure. The differences between them are not visible to the language." Maybe break and shatter are too close in meaning for users to tell the difference; so maybe no significant linguistic difference is involved: compare elm and beech (Putnam 1975). Is it plausible to declare that even John and Bill, or college and church, are synonyms, linguistically speaking? Suppose we extend Grimshaw's idea from the "thin" cases to "thick" cases as well wherever needed, such as John and Bill, church and college, and so on. In effect, we view the pair (88)–(89) on a par with, say, a pair of active-passive sentences. Our current conception of meaning is such that the synonymy of a pair of active-passive sentences forms crucial data for linguistic theory.1 But then, by parity of conception, the pair (88)–(89) cannot be viewed as synonymous at the same time. In fact, if we are to give an account of the sameness of meaning for active-passive pairs, then giving an account of the difference in meaning between (88) and (89) becomes part of the agenda for linguistic theory, as noted. To probe a bit, we saw that an active-passive pair is synonymous because of sameness of θ-roles. If we are admitting θ-roles in language theory to determine the sameness or difference in linguistic meaning, why can't we admit conceptual categories such as ±place, ±institution, ±religious, and so on to distinguish the meanings of college and church? Roughly, college will carry the features +place, +institution, −religious, and church will have +place, +institution, +religious (ignore problems with church colleges). Such is the pressure of the current, largely commonsense concept of meaning. Are we willing to submit to it? I do not think such problems arise in physics (anymore). In principle, if two things differ in the space-time framework, then, ceteris paribus, the magnitude of the forces acting on them will differ as well. Thus, there is a complete physical characterization of anything insofar as it has physical properties at all. No doubt, physics cannot fully characterize all the properties of those objects that happen to have physical properties as well. For example, physics cannot furnish a complete account of what makes something a tree or an elephant. The additional properties that are required to distinguish trees from elephants thus belong to other disciplines; biology, in this case. Therefore, unless one is a "reductionist" in the sense that one believes that all the properties of anything at all must have a physicalist account, no clear sense of incompleteness attaches to physics. What I am driving at is the well-known point that, during the development of a science, a point comes when our pretheoretical expectations
that led to the science in the first place have changed enough, and have been accommodated enough in the science, for the science to define its objects in a theory-internal fashion. At this point, the science—viewed as a body of doctrines—becomes complete in carving out some specific aspect of nature. From that point on, only radical changes in the body of theory itself—not pressures from common sense—force further shifting of domains (Mukherji 2001). In the case of grammatical theory, either that point has not been reached or, as I believe, the point has been reached but not yet recognized. The apparent problem of the incompleteness of grammar thus leaves us with the following three options, as far as I can see. First, we can continue with the current thick concept of meaning, attempt to disentangle its parts, and attach accounts of these parts, arranged in some order, to grammatical theory to achieve growing completeness. In doing so, we have a variety of options that range from Jackendoff's apparently modest demand to Fodor's full-blooded one. Two further options seem to follow disjunctively if this first option fails. As a second option, we can try to dissociate the scope of grammatical theory from the current putative scope of (broad) language theory. Whether we continue to call grammatical theory "linguistic theory" becomes a verbal issue. We think of grammatical theory as defining, not unlike physics, its own domain internally. Some initial data, such as active-passive pairs, no doubt triggered the search for this theory. But once a study of some core data has enabled us to reach an internally coherent model that addresses Plato's problem, we simply kick the ladder. This is a routine practice even in linguistic research anyway (Larson and Segal 1995, 8–9), not to speak of the more advanced sciences. I am suggesting that we push this practice to its logical end. Clearly, the major task at that point is to come up with some conception of the domain so defined. (It is no longer a secret that I am pursuing this choice in this work.) Or, finally, as a third option, we reach the skeptical conclusion that grammatical theory is indeed an inherently incomplete theory, and we try to find some philosophical justification for this conclusion. I think a number of philosophers profess this option without really considering the second option. Notice that we could settle on the skeptical option immediately after the failure of the first option. That means we could have taken the current conception of language theory for granted to conclude that the theory cannot be realized. This is how the philosophical/skeptical literature usually works; despite its apparently radical stance, the skeptical literature is essentially conservative.
The second and third options arise only when the first fails. We have not seen that happen yet.

4.2 Lexical Data
Pursuing the first option, then, a necessary, but certainly not sufficient, condition for generating a representation of a token is to invoke enough nongrammatical types to capture specific meanings of words. A natural first step in that direction is to attach "selectional features" to lexical items. In Chomsky 1965, 85, lexical items belonging to the category [N] were assumed to have features such as ±Common, ±Count, ±Animate, ±Human, and so on, arranged in the order just stated. These features were then used to display more fully some of the subcategorization properties of verbs. For example, subcategorization frames were now supposed to mention types such as [[+Abstract] Subject], [[−Animate] Object], and the like. This generates an elaborate system of agreements in which the subcategorization frames of verbs will be checked to see whether they match the selectional features of arguments. Clearly, resources such as these may now be used, in fairly obvious ways, to throw out "deviant" strings such as colorless green ideas sleep furiously, golf plays John, and misery loves company, while admitting strings such as revolutionary new ideas appear infrequently, John plays golf, and John loves company (Chomsky 1965, 149). An immediate problem with selectional features is that one does not know where to stop. Jackendoff (1990, 51–52) points out that the verb drink takes (names of) liquids as internal arguments. Should we, therefore, include [+Liquid] in the subcategorization frame of drink, and as a selectional feature of whatever happens to be in the internal argument position? To see the beginning of a problem that will occupy us for much of this chapter, consider the Bengali verb khaawaa (eat). It takes any of solids, liquids, and gases, among various other things: khaabaar (food), bhaat (rice), jol (water), haawaa (air), cigarette, gaal (abuses), chumu (kisses), dhaakkaa (push/jolt), and so on. A number of these complements are shared by the verb taanaa (pull): khaabaar, bhaat, jol, cigarette. However, taanaa also takes other things that khaawaa does not: nisshaash (breath), khaat (bed), dori (rope), darjaa (door), naak (nose), kaan (ear), and so on. (Naak taanaa typically does not mean pulling one's own or someone else's nose; rather, it means drawing in nasal fluid. Kaan taanaa, however, does mean pulling ears.) How are the subcategorization frames of khaawaa and taanaa structured?
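As a rough picture of how such agreement checking might work, consider the following toy sketch; the features and frames are stipulated for the illustration and make no claim to reproduce Chomsky's actual formalism:

```python
# Selectional features of some nominals, stipulated for this sketch.
FEATURES = {
    "John":  {"+Animate", "+Human"},
    "golf":  {"-Animate"},
    "ideas": {"+Abstract"},
}

# Subcategorization frames: what a verb demands of its Subject.
FRAMES = {
    "play":  {"Subject": {"+Animate"}},
    "sleep": {"Subject": {"+Animate"}},
}

def admits(subject, verb):
    """True if the Subject's features include everything the frame demands."""
    return FRAMES[verb]["Subject"] <= FEATURES[subject]

print(admits("John", "play"))    # True:  John plays golf
print(admits("golf", "play"))    # False: *golf plays John
print(admits("ideas", "sleep"))  # False: *colorless green ideas sleep furiously
```

The khaawaa/taanaa case shows why this bookkeeping quickly becomes unmanageable: no natural feature, short of an open-ended list, collects food, air, abuses, and kisses while excluding ropes and doors.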
Selectional features, which attach to the category [N], do not explain other varieties of meaning relationships that can be traced to synonymy between pairs of verbs. Additional mechanisms are needed to account for these facts. Consider the following pairs of sentences (Chomsky 1965, 162).
(90) John strikes me as pompous / I regard John as pompous.
(91) John bought the book from Bill / Bill sold the book to John.
(92) John struck Bill / Bill received a blow from John.
In each case, roughly, the relation that the first verb establishes between the two NPs—(John, I) and (John, Bill) respectively—is maintained by the second verb, although the relative positions of the NPs vary. It is natural to express these constancies in terms of the thematic roles of the NPs. Thus, in (92), the two sentences are related by the fact that, in each case, John is the agent and Bill the recipient/patient. Similarly for (90) and (91). A somewhat different "meaning" relationship—not of synonymy, but of entailment—obtains between the verbs persuade and intend. Thus, Chomsky (1991b, 35) suggests that John persuaded Bill to attend college implies, apparently independently of world knowledge, that Bill decided or intended to attend college. Further, there is some sort of a presuppositional link between John is proud of what Bill did and John has some responsibility for Bill's actions, which needs to be explained in terms of the universal concepts PRIDE and RESPONSIBILITY (Chomsky 1972a, 60). (I adopt the convention that words in uppercase mark concepts.) These then form a small sample of the data that a putative lexical semantics needs to give an account of. Chomsky did pay some attention to these facts and to the issues that arise from them in his Aspects of the Theory of Syntax (1965), as noted. There is some discussion in this book regarding the form a semantic theory might take to account for the facts listed above. Still, Chomsky concluded that "selectional rules play a rather marginal role in the grammar." Thus, "One might propose . . . that selectional rules be dropped from the syntax and that their function be taken over by the semantic component" (p. 153). Chomsky's writings on language, both technical and informal, have been torrential in the decades that followed Aspects. Excluding his technical monographs on grammatical theory, he has written to date at least a dozen books, and a very large number of papers of varying length, devoted primarily to traditional and informal issues in language theory.
Yet it will be surprising if his constructive remarks on the issues just raised exceed a few dozen pages in all.2 In these pages, he basically repeats the examples to suggest that a universal theory of concepts is urgently needed, without suggesting, as Jackendoff (2002, 275) also notes, how this theory is supposed to get off the ground. Even though the scope of grammatical theory has been enlarged since Aspects to include semantics (LF), selectional rules still do not play any role in this theory. For example, it is routinely said (Chomsky 1993, 1995b) that lexical items, say book, carry semantic features, such as +artifact, along with the usual phonological and formal features. It is not clear, however, that this feature is ever introduced in grammatical computation. As far as I can see, nothing happens to these features after lexical insertion except that they are simply carried over to the semantic component; this is one reason why systems such as Distributed Morphology divide lexical features into those that enter into computation to LF and those that do not.3 I touch on Distributed Morphology below. It is reasonable to conclude, then, that problems of lexical semantics will probably be addressed, if at all, in a nonlinguistic theory of concepts, where the concepts that are in fact verbalized will belong to semantics proper. Chomsky's prolonged silence on how this can be done could be interpreted as his basic reservation about this enterprise.
4.2.1 Uncertain Intuitions
It seems that the data cited above are not as salient as the typical data for grammatical theory. As we saw for dozens of examples in chapter 2, the core data for grammatical theory carry a sense of immediacy and irrefutability. Confronted with paradigmatic cases of unacceptable strings, it is hard to see how the cases may be "saved." *John appeared to the boys to like each other is flat wrong, and tinkering with the meanings of appear or each other does not have any payoff; in fact, such tinkering is not even attempted by the native user since, by the time he reaches each other, all possible interpretations have collapsed. More interestingly, even when judgments of acceptability are relatively uncertain, the uncertainty, typically, can neither be removed nor enhanced on reflection. Thus, in an attempt to extend their coverage of data, linguists depend not only on sharp judgments of acceptability, but also on uncertain ones—cases where native judgments waver, for example, over whether a given string is okay. The data is listed in some order of increasing uncertainty, and an attempt is made to explain the
uncertainties according to the order in which they arise. Consider the following (Chomsky 1986, 76–77).
(93) *the man to whom I wonder what to give
(94) *the man whom I wonder what to give to
(95) *the man to whom I wonder what he gave
(96) *the man whom I wonder what he gave to
Although each of these sentences is marked as unacceptable, it is clear that they are not all unacceptable to the same degree: (93) is perhaps the most acceptable, (96) the most hopeless, and the rest fall somewhere in between. Or, notice the subtle difference between the unacceptable expressions (97) and (98) (Chomsky 2006a).
(97) *which book did they wonder why I wrote
(98) *which author did they wonder why wrote that book
We will expect a grammatical theory to order these sentences as they occur because, although our judgments are uncertain, the degree to which a judgment is uncertain, for most speakers of a language, is largely invariant and is not likely to change with further thought. This property of grammatical intuitions enables "the theory of language" to "generate sound-meaning relations fully, whatever the status of an expression" (Chomsky 2006a). Semantic judgments, in contrast, are typically open to further thought, even if such thoughts might, on occasion, confirm our initial intuitions. As noted in passing before, "deviant" strings illustrate this point directly. In class, I need to shake my head rather vigorously when students attempt to attach coherent interpretations to the string colorless green ideas sleep furiously. The point is, I am never allowed to just mention the string and proceed. We can throw this string out by invoking selection restrictions, as noted. But we may as well get it in by relaxing some of them; that is what the students want. Prinz and Clark (2004, 60) suggest that even the most mundane "word salads," such as arrogant bananas and argue an earnest cake, "will summon ideas from beyond their boundaries." As for golf plays John, John's addiction to golf might lead to the point where it's golf which takes over.4 Misery loves company sounds fine and is frequently in use since misery is infectious; it spreads. I am obviously stretching things here, but how could I allow myself to do so if the relations are supposed to highlight some aspect of my biological makeup?
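One crude way to picture such an invariant ordering is to imagine the grammar assigning each string a count of violated constraints and ranking accordingly; the counts below are stipulated purely for illustration and are not derived from any actual analysis of (93)–(96):

```python
# Stipulated violation counts; the point is only that the ordering,
# once fixed by the grammar, does not shift on further reflection.
violations = {"(93)": 1, "(94)": 2, "(95)": 2, "(96)": 3}

for example, count in sorted(violations.items(), key=lambda kv: kv[1]):
    print(example, "violations:", count)  # (93) first, (96) last
```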
Consider the relation between persuade and intend, Chomsky's favorite example, as noted. If I have persuaded X to do Y, does it always follow that X intends to do Y? Is it meaningless to say, "I have been persuading John to attend college, but so far he hasn't agreed to"? Persuasion seems to be an act that is stretched over time; John's intentions, on the other hand, are not actions, but states of John, which he either has or does not have. It is not obvious that I have failed to act at all just because John failed to attain the relevant state. It will be said that my action was really not of persuading John, but of trying to persuade John; the try goes through even if the persuasion fails. So the suggested entailment does hold, confirming Chomsky's intuitions. But it took some persuasion to make a fairly competent user of English agree. I am not suggesting that the observed relation between persuade and intend is without theoretical interest. As Chomsky has shown recently in a (rare) extensive discussion of the issue (Chomsky 2003: "Reply to Horwich"), the lexical items under consideration occur in fairly restrictive syntactic contexts. For example, persuade typically occurs in the syntactic frame (99), instantiated in (100):
(99) Nominal — V — Nominal — [Infinitival Clause]
(100) John persuaded Mary to buy the book.
Chomsky points out that (100) entails something about Mary's intentions, namely that she intends to buy the book, but it entails nothing about John's intentions. Interestingly, as with a host of other verbal items, the lexical item expect also appears in the syntactic frame (99), without entailing anything about Mary's intentions.
(101) John expected Mary to buy the book.
The parallel breaks down further, as persuade cannot appear in (102), but expect can appear in (103):
(102) *John persuaded there to be a hurricane tomorrow.
(103) John expected there to be a hurricane tomorrow.
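A hedged sketch of the contrast, with lexical entries invented here for illustration: both verbs fit frame (99), but persuade, as an Object-control verb, assigns a thematic role to its object and so rejects the expletive there, while expect does not:

```python
# Toy lexical entries; 'object_thematic' encodes whether the verb assigns
# a thematic role to its object position. Entries are invented for this sketch.
LEXICON = {
    "persuade": {"frame": "NP V NP [to-infinitive]", "object_thematic": True},
    "expect":   {"frame": "NP V NP [to-infinitive]", "object_thematic": False},
}

def admits_object(verb, obj):
    """The expletive 'there' is barred from thematic object positions."""
    return not (obj == "there" and LEXICON[verb]["object_thematic"])

print(admits_object("persuade", "Mary"))   # True:  (100)
print(admits_object("expect", "Mary"))     # True:  (101)
print(admits_object("persuade", "there"))  # False: (102)
print(admits_object("expect", "there"))    # True:  (103)
```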
These facts certainly show that specific aspects of meanings of verbs apparently have syntactic consequences—that is, these aspects seem to enter into computation.5 Chomsky calls these aspects of meaning "I-meanings." The result reinforces the conclusion reached earlier that, insofar as the domain of grammatical computation is concerned, there is no sharp division between syntax and semantics.
However, it is important to be clear about the domain covered by I-meanings. Chomsky (2003, 298–299) is explicit on the status of I-meanings in language theory: "Principles of FL may enrich and refine the semantic properties of [lexical items] and the image SEM determined by computation," as with central grammatical phenomena such as referential dependence, understood Subject, passivisation, direct causation, and the like. However, "there is no reason to expect that what is informally called 'the meaning of X' will be fully determined" by the faculty of language. In other words, the study of I-meanings falls under grammatical theory. So far, these effects are typically listed as properties of lexical items—for example, persuade is listed as an Object-control verb. Hopefully, a more principled and abstract grammatical description will be found in the future as the understanding of the structure of the lexicon improves. Returning to the relationship between persuade and intend, the I-meaning properties just observed suggest that persuade is a causative, say, "make-intend." The fact remains that the preceding form of analysis tells us little about the individual concepts PERSUADE, INTEND, and EXPECT. As noted, all we know is that the meaning of persuade is such that persuade acts as a causative; we do not know what persuade means. The point can be extended to cover examples (90)–(92) above. Thus, sell is another causative amounting to "make-buy," strike amounts to "make-receive-blow," and so on. The burden of explanation thus shifts to the individual concepts INTEND, BUY, BLOW-RECEIVE, and the like, plus some general cognitive account of the formation of causatives. Given Plato's problem, it was always obvious that we need an explanatory account of the system of concepts. The data cited above were expected to supply some point of entry to this complex system. The worry is that they do not. They simply state what we knew before, namely, that an account of individual concepts is needed. In my opinion, the preceding (skeptical) remarks on lexical data also apply to some of the well-known examples in lexical semantics frequently cited by Chomsky in recent years. I will consider three of them (Chomsky 2006b):
(104) If John painted the house brown then he put the paint on the exterior surface, though he could paint the house brown on the inside.
(105) When John climbed the mountain he went up, although he can climb down the mountain.
(106) Books are in some sense simultaneously abstract and concrete, as in John memorized and then burned the book.
It seems to me that there is some ambiguity about what Chomsky wants to accomplish with these examples. At some places, he uses these examples to cast doubt on specific semantic theories, sometimes even on the very possibility of semantic theory. For example, versions of (106) are used to show the limitations of formal semantics; to my knowledge, Chomsky uses (106), and similar examples, only for this negative purpose. In the sentence John memorized the book and then burned it, the pronoun it is referentially dependent on the book, where the book is both memorized and burned; it is unclear how the denotation of the book is to be captured. Sometimes, examples (104)–(106) are also used to highlight the complexity of innate knowledge associated with lexical items as well as the variety of their particular uses, creating serious organizational problems for lexical semantics. We will get a glimpse of this problem with respect to paint and climb below. I will study this aspect of the problem for lexical semantics more fully later in this chapter in connection with lexical decomposition. At other places, in contrast, Chomsky (2006b) gives the impression that examples (104)–(106) form the core data for a theory of I-language along with familiar grammatical data involving understood Subject, pronoun binding, and the like. To say that the I-language specifies the listed properties of paint, climb, and book is to say that these properties fall under the category of I-meanings, and hence under (extended) syntax. Earlier, we saw that some of the lexical properties of persuade, intend, expect, and so on fall under I-meanings since these properties enter into grammatical computation. It is unclear if the cited properties of paint, climb, and book also fall under I-meanings in the same way. In fact, it is unclear to me if these properties belong to the study of language at all in the sense in which understood Subject, pronoun binding, and causatives belong to it. Consider (104), paint. No doubt, it is generally true that painting the exterior is the default option for painting because the color of the exterior is privileged in appearance; in fact, in the typical cases of painting walls, doors, fences, and so on, the exterior is the only plausible option. As Chomsky observes elsewhere, this could extend to imaginary objects such as golden mountains, even to impossible objects such as a round square. On this basis, Chomsky suggests that the option is mind-directed (= innate) rather than based on world knowledge.
My intuitions differ. I can think of apartment complexes in which the exteriors of the buildings are maintained by the management while residents have the freedom to paint the interiors of their apartments the way they wish. In such circumstances, John painted his apartment green has the default meaning that the interior of the apartment has been painted. Furthermore, suppose the distribution of expertise is such that laypersons are able to paint the interiors of their houses themselves, while painting the exteriors requires professional help. If we know that John is not a professional, then John painted his house brown will typically mean the interiors. Finally, Gricean maxims (Grice 1975) suggest that, other things being equal, the sentence John painted the entire house brown is likely to mean the interiors since people typically paint individual rooms differently. That is, since people typically do not paint the exteriors with different colors, the phrase the entire house violates the maxim of relevance if it is intended to mean the exteriors. In this light, the sentence John painted the entire town brown has the opposite effect. Default readings of paint vary as they are tied to world knowledge about houses and towns (see Stainton 2006, 931–932). Climb seems to pose a different set of problems. Chomsky (1988, 190–191; 2000a, 75) observes that its meaning is "very complicated" such that human nature gives the concept CLIMB virtually "for free." Jackendoff's analysis of climb brings out some of this complexity (1992, 46–47). It is instructive to study his analysis to see if the semantic intuition reported in (105) can be satisfactorily explained as part of a theory of I-language. Jackendoff cites four examples to capture the "feature system" of the "cluster" concept CLIMB.
(107) Bill climbed (up) the mountain.
(108) Bill climbed down the mountain.
(109) The snake climbed (up) the tree.
(110) ?The snake climbed down the tree.
According to Jackendoff, uses of climb involve two independent features: (A) an individual is traveling upward, and (B) the individual is moving with characteristic effortful grasping motions. Jackendoff calls this manner of motion "clambering," so climbing is clambering upward. For Jackendoff, (107) is salient since it satisfies both (A) and (B); (108) and (109) are partly deviant because they satisfy either (A) as in (109) or (B) as in (108), not both; (110) is fully deviant because it fails to satisfy
either (A) or (B). In particular, the up in (107) is optional since, by (A), climbing up is the default meaning.

There are a number of problems with both (A) and (B). Since, according to Jackendoff, climb is defined as clambering upward, strictly we cannot say, without redundancy, John climbed up the mountain. Further, since up is part of the meaning of climb, strictly we cannot say, without contradiction, John climbed down the mountain. As a matter of fact, we can say either, without redundancy or contradiction, suggesting that up cannot be central to the meaning of climb. If up is not central, what explains the default reading? As his analysis suggests, Jackendoff is likely to say that up as in (A) is central, but because (A) needs to interact with (B), the deviant option (108) needs to be admitted, which in turn forces up as a (deletable) option in (107); in other words, unless up is an option in (107), (108) cannot be admitted at all. Why is (108) admitted? Because (107) and (108) share (B), the manner of motion.

Thus, the entire weight of the analysis shifts to the characterization of the manner of motion. Now the characterization "effortful grasping motions" is vague enough to admit of a large variety of motion. Walking through rocks or thickets or a dark room full of furniture requires effortful grasping motions. None of these are nondefault readings of climb! Further, when we look specifically at the manners of motion involved in, say, climbing up and down a mountain, it is unclear if the manners of motion match at all in the way in which, say, the manners of motion of swimming up and down a river match. In fact, the manners of motion of a snake climbing up and down a tree are a lot more similar than those involved in humans climbing up and down a mountain. Yet Jackendoff declares (110) to be fully deviant because, as far as I can see, (110) fails the "feature system" stipulated by him. Also, when leopards or mountain goats climb a mountain, they do not seem to require effortful grasping motions; in fact, even humans do not require grasping motions when they climb stairs. It is unclear to me if Jackendoff's internalist analysis of climb with conceptual features explains the intuitions reported in Chomsky's example (105), even if particular uses of the verb currently seem to confirm those intuitions.
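For concreteness, the cluster analysis can be stated mechanically. The toy sketch below is my own reconstruction, not Jackendoff's formalism: each use of climb is scored on the two features, and the predicted judgment is read off the number of features satisfied.

    # Toy reconstruction of the "cluster" analysis of CLIMB.
    # Feature A: the individual travels upward.
    # Feature B: the motion is effortful grasping ("clambering").
    def climb_judgment(upward, clambering):
        satisfied = int(upward) + int(clambering)
        return {2: "salient", 1: "partly deviant", 0: "fully deviant"}[satisfied]

    examples = {
        "(107) Bill climbed up the mountain":    (True, True),
        "(108) Bill climbed down the mountain":  (False, True),   # B only
        "(109) The snake climbed up the tree":   (True, False),   # A only: no clambering
        "(110) The snake climbed down the tree": (False, False),
    }
    for sentence, (a, b) in examples.items():
        print(sentence, "->", climb_judgment(a, b))

As the text above argues, the trouble lies not in the mechanics but in the two boolean inputs: whether a given motion counts as "clambering" is exactly what the analysis fails to pin down.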
4.2.2 Nature of Lexical Inquiry
To say that lexical intuitions are uncertain is not to say that we do not have pretty firm intuitions of nongrammatical anomaly in some cases. The asking of certain questions is moot—for example, Can you hear me? We are likely to be puzzled by strings such as Bill admires the Pope's wife,
John's subordinate is John's boss, The mat sat on the cat, and the like. Any popular piece on newspaper bloopers, political speeches, student papers, and so forth contains dozens of examples of this sort. But these are not usually cited in scholarly treatises on semantics, for a very good reason. Examples from language use, unless they are properly controlled, admit of any number of factors, not all of which pertain directly to a theory of language; clearly, the anomaly of Bill admires the Pope's wife has nothing to do with English (or Italian, for that matter). When such popular, and often hilarious, examples are closely scrutinized, it turns out that they involve grammatical failure, performance failure, pragmatic failure, and insufficient world knowledge, among many other things (Pinker 1995a).

I am suggesting, however, that when it comes to data that is supposed to be theoretically interesting, the quality of the data leaves much to be desired insofar as nongrammatical (= for now, conceptual) aspects of meaning are concerned. So the chances are that problems with the data may lead to problems with the theories they spawn.

I have attempted to raise two points simultaneously: an explicit point and an implicit one. The explicit point was to cast some preliminary doubt as to whether the alleged data for lexical semantics is salient enough for us to feel interested in the search for deep theory. Because we are looking for a theory in this domain from LF-up, as it were, we expect to find some initial data that has some promise of throwing further light on the human makeup: that is the only notion of an explanatory (cognitive) theory under discussion in this work. I am personally not convinced that the data cited in this regard hold that promise. The implicit point was to show that what is called "semantics" in this area essentially amounts to giving at least a systematic account of the organization of a vast network of concepts used by humans. In each case, the data demanded that we know more about individual concepts and their relations to one another, as they attach to words, whether a noun or a verb. Apparently, then, the two points are opposed to one another: the second urges a research program that the first casts doubt on. Yet they are not really in opposition. If we could pursue the program with some degree of success, then it is quite possible that new data would show up, and doubts about the quality of the initial data would gradually subside.

Perhaps we can already see that the resulting enterprise will have a very different flavor than that of grammatical research, in which a general research program took off from very specific (and incontrovertible) data of
linguistic behavior. The same, it could be argued, is the case with physics. It is hard to imagine that physics could have reached the level it eventually achieved if some ancient scientists had proclaimed, "Let us try to understand the physical universe." It would not have been clear what was specifically there to understand. Even though the general query might well have been entertained in classical philosophy as a quest for "being," what led to physical theory were some sharp and deep facts that demanded sustained explanation: day and night, tides, eclipses, the motion of pendulums and projectiles, the angles of shadows cast by the sun, and so on. Such facts abound in grammatical theory at various levels of generality, as we saw: rapid acquisition of language, the ambiguity of flying planes can be dangerous, the missing lexical Object in John is eager to please, to mention a few. I am not convinced that this sort of motivation exists for lexical semantics; all we have is some general motivation, not unlike the proclamation cited above. For example, a contemporary work on lexical semantics begins with the following project: "It would be at least useful to investigate our semantic competence, that is, to wonder what kind of knowledge and abilities we possess that make it possible for us to understand language" (Marconi 1996, 1).

4.3 Lexical Decomposition
The general task for lexical semantics is to come up with some account of how individual concepts are organized to lend, so to speak, meanings to individual words. Following poverty-of-stimulus arguments, it is natural to think that most concepts are somehow built out of primitive concepts, which must be just a handful. In this picture, the meanings of most lexical items, perhaps all, will be captured "decompositionally"; in other words, the total meaning of an item will be broken down until the conceptual primitives are reached. Once the system is made available to the child, some of the nodes of the system—that is, the individual concepts—will be associated with the sounds that the child hears: each sound-node association will count as a word. This is a fairly obvious and standard assumption: "Two things are involved in knowing the meaning of a word—having the concept and mapping the concept onto the right form. This is the sense of 'knowing the meaning of a word' implicit in most discussions of language development, both scientific and informal" (Bloom 2000, 17). The child has to learn these associations individually in any case. I said the project looks "natural," although it is far from clear how it is going to be executed.
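As a picture, this is simple enough to pseudo-implement. The sketch below is mine, with marker names invented purely for illustration; it encodes only the two components just described: a stock of concepts decomposing into putative primitives, and sound-to-concept associations that count as words.

    # Illustrative sketch of the decompositional picture, not a worked-out theory.
    PRIMITIVES = {"HUMAN", "ANIMAL", "MALE", "ADULT", "UNMARRIED"}  # hypothetical

    DECOMPOSITION = {
        "BACHELOR": ["HUMAN", "MALE", "ADULT", "UNMARRIED"],
    }

    LEXICON = {"bachelor": "BACHELOR"}  # sound-to-node association learned by the child

    def decompose(concept):
        # Expand a concept until only (putatively) primitive markers remain.
        if concept in PRIMITIVES:
            return [concept]
        parts = DECOMPOSITION.get(concept)
        if parts is None:
            return [concept]  # undecomposed residue: the hard, unsolved part
        out = []
        for part in parts:
            out.extend(decompose(part))
        return out

    print(decompose(LEXICON["bachelor"]))  # ['HUMAN', 'MALE', 'ADULT', 'UNMARRIED']

Everything contentful in the project lies in filling in PRIMITIVES and DECOMPOSITION in a principled way; the sections that follow ask whether that can be done.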
It is obvious that the project requires some notational scheme to mark the concepts themselves, independently of how they are marked by the sound systems of particular languages. The overall organization of concepts is thus described in terms of a system of semantic or conceptual markers. The form of explanation is called "markerese explanation." Almost everyone who looks at the universal bases of semantics adopts markerese explanation in one form or another, even if there has been a good deal of resistance in recent years to using discredited labels like "decomposition" and "markerese explanation." This is particularly true of empirical research across languages. I will restrict the discussion to only those approaches which at least profess to adopt the broad explanatory goal of biolinguistics, namely, to solve Plato's problem in this domain.

Before I proceed, I must note that the notion of lexical decomposition is also used in some recent work in distributed morphology (Harley and Noyer 1999; Embick and Marantz 2006, section 2), which apparently does not take recourse to markerese explanation.6 The distributed morphology approach proposes that lexical information is not located in one store called the "lexicon," but is distributed/decomposed across at least three components. The formal morphosyntactic items, but not the phonological ones, enter into syntactic computation to LF and are then mapped onto the meaning interface. The insertion of "vocabulary items," such as +count and +animate, along with phonological properties among others, is delayed and fed directly into the meaning interface via phonological computation. Encyclopedic information, such as +canine, is fed into the meaning interface directly. In this scheme, then, there is no "sound" in narrow syntax (= computation to LF), and LF is viewed as a level of representation which exhibits certain meaning-related structural relations, such as quantifier scope (Harley and Noyer 1999). Insofar as the second set of items do not enter into syntactic computation to LF, the scheme bypasses the vexing issue, noted above, regarding the role of these "semantic" features, such as +artifact, during syntactic computation; according to the proposed scheme, these features do not enter narrow syntax at all. For the purposes of this work, I certainly welcome as frugal a conception of LF (= output of narrow syntax) as possible. In that sense, the postulation, if valid, of "late insertion" of vocabulary items (on which most of the work in distributed morphology has been done, not surprisingly) strengthens the perspective developed in this work. As for the third—encyclopedia—component, it is hard to see how the scheme avoids markerese explanation in one form or other when a systematic attempt is made to describe the general organization of this
component. Notice that the organization of the post-LF component of meaning is exactly the topic currently under discussion.
4.3.1 Initial Objections
Some authors working on language dismiss the very project of markerese explanation. One of these objections runs as follows. As Norbert Hornstein (1989, 36) puts it: "how does saying that in understanding a sentence we map them into mentalese help explain anything unless we understand what it is for someone to understand mentalese? Postulating semantic markers only pushed the problem back one step. Instead of explaining what it was for a word to have a content, the problem was to explain what it was for a marker in mentalese to have a content."

There are many domains of inquiry where pushing "the problem back one step" signals significant progress. The theory of evolution is a case in point. Arguably, the task for the theory of evolution is to explain how species, any species, evolved. But the explanations actually offered typically fall short of the global task; usually it is said that species X evolved from species Y, Y from Z, and so on, without really hitting a nonspecies end until very recently. Similar remarks apply to cosmological theories in physics. It does not follow that hundreds of years of evolutionary and cosmological explorations have been a waste of time. It seems, then, that the force of Hornstein's objection depends entirely on the specific nature of the "pushing back," not on the general fact alone.

A related objection to the very project is reported by Ray Jackendoff (1990, 4), a leading exponent of markerese semantics himself: "How do you know your putative semantic primitives really are primitive? Mightn't there be an infinite regress?" Again, the demand that the "real" primitives ought to be available all at once, and at the very beginning of the enterprise, is unfair. As Jackendoff rightly notes, by these standards even the search for the fundamental constituents of the physical universe (the "primitives" of physics) will appear to be dubious, since physics, at no point in its history, can legitimately claim to have reached rock bottom. There are fairly standard ways in which progress can still be judged from incremental steps. The question of course is whether markerese semantics meets these standards. It is hard to see that this question has an answer in advance of the enterprise.

Perhaps this is the right place to examine a set of more general objections that aim to beset the very idea of language research, including especially the study of word meanings currently under discussion. Recall that, in a decompositional analysis, concepts are linked in a complex network.
Each link in that network represents a conceptual relation, which, in turn, represents a part of the meaning of related words, if the (nodes of the) links are verbalized at all. For example, COW will form a link with ANIMAL such that, when verbalized, the conceptual relation COW → ANIMAL will represent a part of the meaning of cow (and animal). Since this inferential link represents a part of the meaning of cow, the link must hold come what may, that is, whenever cow is used; in other words, the enterprise requires that there are analytical inferences of the sort just sketched (Fodor and Lepore 1994).

Following the work of Willard Quine (1953, 1960), the very idea of analytic relationships between linguistic expressions has been widely questioned. Skipping the details of Quine's argument (Mukherji 1983), his basic observation is that knowledge of linguistic meaning cannot be interestingly separated from extralinguistic knowledge. It follows, according to Quine, that since extralinguistic knowledge is in principle revisable, so are relationships of meaning. Hence, there are no analytical inferences that display the (invariant) meaning-relationship between expressions.

It is difficult to estimate the scope of this generalization. For example, it is indisputable that there are analytical relationships between, say, an active sentence and its passive: John likes Mary strictly entails Mary is liked by John, but does not entail Mary likes John. Turning to words, we saw that there are aspects of the meanings of persuade and intend such that John persuaded Mary to buy the book entails Mary intends to buy the book, but does not entail anything about John's intentions. We also saw that in each case, general grammatical explanations are available. Active-passive entailment is predicted by the interaction of θ-theory and Case theory with the theory of movement; the entailment between persuade and intend is captured in the general syntactic frames for causatives. It follows that there are aspects of the concept of analyticity that are satisfactorily covered within grammatical theory; if there are other, problematic aspects of the concept, then they are likely to belong to the nongrammatical aspects of linguistic expressions. Grammatical theory thus looks insulated from Quinean charges.

The question arises as to whether the lessons from grammatical theory can be extended to the study of specific word meanings. It is unclear if the question can be meaningfully addressed on a priori philosophical grounds alone, as Quine advocated (Uriagereka 1998). After all, Quinean charges could have been generally advanced for the relationship between persuade and intend before the causative frames were unearthed. Once grammatical understanding was reached, the charges
lost their general effect. Maybe, if we begin with such facts, we will be able to reach a point of abstraction where some allegedly analytical inferences will not look so bad, in that they will engender a sufficiently attractive semantic picture; maybe we will not even call them "analytical inferences" anymore. Or, maybe, we will locate specific reasons why these charges continue to hold for word meanings.

The strategy of deferment possibly extends to a more intractable problem raised by Quine (1960). Imagine an English-speaking linguist visiting an alien culture to learn its language. Suppose a rabbit scurries by and a native utters gavagai. Should the linguist write down that gavagai means rabbit? As Quine (1960, 52) observes, "Point to a rabbit and you have pointed to a stage of a rabbit, to an integral part of a rabbit, to the rabbit fusion, and to where rabbithood is manifested." In effect, the linguist has no empirical control on the (specific) meaning of gavagai, and, thus, on the concept GAVAGAI. Decades after it was proposed, many authors continue to view the problem as fundamental for any coherent theory of word meaning, semantics, and the learning of words (Larson and Segal 1995, 16–17; Bloom 2000, 3–4; Jackendoff 2002, 88; among others).

As Chomsky (1980, 14) notes, Quine intended the problem to arise not just for "problems of meaning but to any theoretical move in linguistics." Again, the scope of this generalization is unclear. As Chomsky (1980) suggested, Quine needs to show that the gavagai-problem appears in the same form for grammatical theory as well. For example, grammatical theory holds that the sentence the man you met read the book I wrote contains two grammatically significant units (noun phrases), the man you met and the book I wrote, but met read the is no phrase at all. Quine needs to hold that there is no fact of the matter here. Recall that the "indeterminacy" arose in the gavagai-case because the English-speaking linguist was faced with at least two "equivalent descriptions":

(111) gavagai means (whole enduring) rabbit
(112) gavagai means (undetached) rabbit-stage

This can only happen if both RABBIT and RABBIT-STAGE are simultaneously available to the linguist as resources for describing the native's utterance of gavagai. In effect, (111) and (112) are equivalent descriptions of the linguist's intuition of what gavagai means to the native. In order for this case to extend to grammar, then, the linguist should be able to formulate two equivalent descriptions of the intuition underlying the grammaticality of the man you met read the book I wrote: one in terms of noun phrases, the other in terms of clusters like met read the. It is
totally unclear what it means to formulate a description of the latter kind. In any case, there is plenty of evidence showing why a phrase-structure description of linguistic streams is salient (Chomsky 1957). Jenny Saffran (2002) shows that structures such as (The) (professor graded the) (students) are not "predictable dependencies"; that is, these structures are simply not learned from the linguistic environment, even though the words are presented serially in the stimuli (see also Chomsky 2000d, 121). Equivalent descriptions are a fact of science, since theories are necessarily underdetermined by the evidence they cover; from equivalent descriptions, we choose the best theory we can, basically from nonevidential considerations (Chomsky 1980, 22). Quine's problem thus looks rather benign once again when raised for grammatical theory: "indeterminacy" is nothing other than familiar underdetermination (George 1986; see Hornstein 1991 for more).

Does this conclusion extend to the gavagai-case? Chomsky (1980) thinks so. According to him, (111) and (112) are just equivalent descriptions familiar in science. A geneticist works with fruit flies, and he assumes that two fruit flies are the same in relevant respects. There is no conclusive empirical control on this assumption, yet he holds on to it as long as it works. Similarly, when a linguist works on an alien language, he assumes that the speakers of that language are like us in relevant respects, such that a native is likely to mean rabbit rather than rabbit stage in uttering gavagai, just as we do in uttering rabbit. No special problem arises when we shift from the study of fruit flies to the study of words.

Chomsky's response may not be as decisive as it looks. No doubt the geneticist simply assumes that, other things being equal, the two fruit flies he is working with are alike in relevant respects. When other things are not equal, there are ways to find that out. Again, each such finding will be underdetermined by whatever evidence is relied on by the geneticist at that point. Thus, suppose one of the fruit flies is in fact an engineered object slipped into the sample by a jealous colleague. We can imagine this object to be as close to a real fruit fly in its appearance as we wish, except that it differs in one minute feature caused by a planted gene or something; the feature is so minute that it escapes standard methods of observation, but it takes the inquiry totally off track. We hope such things do not happen, but there are ways of finding out when they do, including a confession from the remorseful colleague.

It is unclear if a parallel is available for the gavagai case. The way we posed the problem, it does not easily disappear with the assumption that the native is sufficiently like us. In fact, if the native does resemble us,
then he is all the more likely to entertain either (111) or (112) without our being able to figure out which one, for, ultimately, it is we, namely the linguists, who formulated the alternative hypotheses on the native's behalf. The native just uttered gavagai, and in doing so he might have been following some of our respectable philosophical traditions. Furthermore, each of the hypotheses is itself underdetermined by the nonlinguistic evidence—the glimpsed rabbit—that gives rise to such hypotheses. When we are ascribing two of these to the native, the problem is not just that they are individually underdetermined but that, from the evidence we have about the native—namely, his gestures at a rabbit—it is underdetermined which of the (two) underdetermined schemes is at work. This, as far as I can see, is the point about "additional" indeterminacy mentioned by Quine (1969a, 303). Finally, unlike the fruit-fly case, it is not easy to think of some "discriminating" evidence that will temporarily settle the problem, for the same problem is likely to be built into any conceivable additional evidence: will that be evidence about rabbits or rabbit stages? Contrasted with the problem of analyticity, indeterminacy is a wonderfully deep problem.

Nevertheless, it is also true that we typically think of whole enduring rabbits when uttering rabbit. If the native is sufficiently like us, why should he typically think otherwise? In that sense, Quine could be making just "a philosophical point" (Quine 1969b, 34). It could be that the precondition for forming (112) is (111); that is, there could be some explanation that acquisition of RABBIT necessarily precedes that of RABBIT-STAGE. On this view, unless other conditions so warrant, rabbit means rabbit by default; so does gavagai. This will meet Quine's challenge provided we are able to study the structure of RABBIT that enters into our uses of rabbit without begging any of Quine's questions. If, in other words, there is a sufficiently abstract general theory of concrete nouns, we will be in a position to examine how the specific meaning of rabbit plays out.

A similar perspective obtains for Saul Kripke's influential remarks on rule following (Kripke 1982). It is well known that the concept of rule following permeates much of the research in the cognitive sciences, including biolinguistics. Tracing the brief history of contemporary approaches to mental phenomena, Chomsky (2005, 2) suggests that in many cases, "The best available explanatory theories attribute to the organism computational systems and what is called 'rule following' in informal usage." As Chomsky points out, the usage applies not only to straightforward rule-governed phenomena such as language, but also to the functioning of the visual system, insect navigation, and much else.
Following some remarks by Wittgenstein (1953), Kripke argues that the very concept of rule following is suspect. Kripke explains some of Wittgenstein's "elusive" remarks in terms of the quus-puzzle, where two alternative definitions, (113) and (114), are given for computing the arithmetical sum of two numbers. (113) could be viewed as the "standard" rule.

(113) If X and Y are numbers, X plus Y =def X + Y, the arithmetical sum.

However, the use of plus by someone, say, Ali, must be based on a finite number of past uses of this word. Hence, there must be some given numbers, say, 55 and 65, beyond which Ali has not checked his use so far. Thus, the entirety of Ali's past usage is also compatible with the following (nonstandard) rule:

(114) If X and Y are numbers, X quus Y =def X + Y for X, Y ≤ 65; =def 10, otherwise.

How can Ali tell, when he is currently using plus, that he is not using it in the sense of quus, since all of Ali's past uses of plus are compatible with (114) as well? It seems we need some matter of fact that uniquely decides in favor of one of the definitions/usages—preferably (113)—as a representation of Ali's ability to compute the arithmetical sum of two numbers.
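The puzzle is easy to render computationally. Here is a minimal sketch of (113) and (114), mine and purely illustrative: on every case Ali has actually checked (both arguments at most 65, per the supposition above), the two functions agree, so his past usage cannot discriminate between them.

    def plus(x, y):
        # (113): the "standard" rule
        return x + y

    def quus(x, y):
        # (114): the nonstandard rule, agreeing with plus on all checked cases
        return x + y if x <= 65 and y <= 65 else 10

    # Every case Ali has checked so far is compatible with both rules:
    assert all(plus(x, y) == quus(x, y) for x in range(66) for y in range(66))

    # Only an unchecked case separates them:
    print(plus(70, 5), quus(70, 5))   # 75 10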
As Kripke's ingenious handling of the available options shows, there are no non-question-begging criteria, including a survey of Ali's introspective states, that a rule follower may invoke to justify his particular choice (Kripke 1982, chapter 2). According to Kripke, the problem just raised leads to Wittgenstein's (1953, paragraph 201) remark that "no course of action could be determined by a rule, because every course of action can be made to accord with the rule." Hence, to "think one is obeying a rule is not to obey a rule. Otherwise, thinking one was obeying a rule would be the same thing as obeying it" (paragraph 202). Therefore, "when I obey a rule, I do not choose; I obey the rule blindly" (paragraph 219).

There is a massive philosophical literature that tries to understand what Kripke's problem and the associated remarks from Wittgenstein mean (Goldfarb 1985; Ebbs 1997; Horwich 1998; Hattiangadi 2007, etc.). Of direct interest here is Kripke's suggestion that "if statements attributing rule following are neither to be regarded as stating facts, nor to be thought of as explaining our behavior, it would seem that the use of the ideas of rules and competence in linguistics needs serious reconsideration." Kripke (1982, 31–32) also thinks that the "problems are compounded if, as in linguistics, the rules are thought of as tacit, to be reconstructed by the scientist and inferred as an explanation of behavior." Since the idea of rule following is used to describe visual systems and insect navigation, the problem compounds even further.

Chomsky (1986, 225) responds to all this as follows: "If I follow R, I do so without reason. I am just so constructed. I follow R because S0 maps data presented into Ss, which incorporates R. There is no answer to the Wittgensteinian skeptic and there need be none." According to Chomsky, then, the question of justification, or the lack of it, as posed by Kripke simply does not arise in the context of cognitive explanations.

Again, the scope of Kripke's very general problem—and Chomsky's sweeping denial of it—is unclear. Immediately following these remarks, Chomsky gives three examples where he thinks the skeptic need not be answered: "I know that 27 + 5 = 32, that this thing is a desk, that in a certain sentence a pronoun cannot be referentially dependent on a certain noun phrase etc." The first example concerns (113), and it falls squarely within the scope of Wittgenstein's skeptic. It is hard to see that Ali's knowledge that 27 + 5 = 32, rather than 27 + 5 = 10, can be traced to a fact about Ali's constitution. If Ali were so constituted, how could he come up with a widely different rule in his skeptical moment? The question is all the more pertinent because nonstandard formulation of standard arithmetical practices is part of the practice of mathematics itself. Nonstandard mathematics would have been impossible if our constitution allowed only standard formulations. So, it is more likely that if there are facts of constitution underlying our arithmetical practices—there must be such facts—then a "rule-following" description of the constitution will be more abstract than either (113) or (114), such that both are admissible as equivalent descriptions. The desired description will essentially give an account of the meaning of plus, or of the concept PLUS. The issue is whether such an account is available in a non-question-begging manner.

Suppose the second example concerns a meaning rule: desk means desk. Then, in the light of the above and in a Quinean vein, Kripke would suggest that such rules are not qualitatively different from arithmetical rules. For example, Kripke (1982, 19) asks, "Can I answer a skeptic who supposes that by table in the past I meant tabair, where a tabair is anything that is a table not found at the base of the Eiffel Tower, or a chair found there?" Recalling the discussion of Quine's gavagai-problem, it is unlikely that the problem of equivalent descriptions, as it arises for
meaning postulations or translation manuals, has a clear solution in advance of detailed empirical research. In fact, it could well turn out that problems of "indeterminacy" resist any coherent empirical approach toward understanding our constitution in these respects. In view of this uncertainty, we do not know what it means to trace Ali's preference for the "standard" meaning of desk to some aspect of his constitution.

Does it follow that all ascriptions of rule following have the same (uncertain) effect? Consider the contemporary research on face recognition. It has been found that the recognition of faces, say, from photographs, depends on interesting geometrical properties which remain invariant under varying light and shade, tilt, rotation, degree of camouflage, and so on (Carey 1979). There are threshold points up to which this ability works with remarkable regularity and success. No one is ever taught these regularities; in fact, no one knew about them until recently. Suppose we write these regularities down in some notation to develop a rule system—a "grammar"—of face recognition. It would be absurd to suggest that recognition of faces is an unjustified activity just because the relevant capacity has been described in terms of rules and representations, and the subjects are viewed as having internalized this rule system.7 There may be equivalent descriptions here as well, as a matter of routine scientific practice. But these are not alternative descriptions we could adopt; they are alternative descriptions of a single objective reality, a matter of fact, concerning some aspect of human behavior. There is no answer to the skeptic and there need be none. We are just constructed that way. It may even be counterintuitive to call such behavior "rule-governed," displaying the usual vagueness with which we enter big philosophical debates.

For the case of face recognition, some neurological evidence has been found to support and extend the early "top-down" theories (Rodman 1999). The neurological story is in its infancy, but it is enough to suggest that we are looking at natural principles, perhaps specific to the domain (Jackendoff 1992, 73). But suppose there were no neurological evidence, as indeed was the case at the beginning of this research. That should not alter the basic methodological issue: the presence or absence of neurological evidence cannot suddenly change a matter of faith to a matter of fact. If there is a matter of fact to be studied at all, even the use of the vocabulary of rules and representations in the early stages of research ought to be viewed as describing the same aspect of nature. The basic issue seems to be the character of what we are studying, not just the vocabulary in which studies are phrased.
Consider now Chomsky's third example. The example concerns the phenomenon of pronoun binding. Speakers of English know that them, as in the men expected to like them, cannot be referentially dependent on the men. We saw that the phenomenon is explained by Principle B of binding theory: a pronominal is free in a local domain. The phenomenon is grammatical in character, and the relevant principle is buried deep in the human constitution, such that a nonlinguist is not likely to have any access to it. Even Shakespeare would not have been able to justify the rule that the men cannot bind them in the cited construction; he would have followed it blindly. Needless to say, a linguist can now justify this practice, including his own, on usual scientific grounds by furnishing a variety of evidence, some of which we saw.

The point is further substantiated by another example discussed by Chomsky (1986, 227). It is well known that, at a certain stage of language development, children characteristically overgeneralize the rule for forming the past tense—for example, they say sleeped instead of slept (Pinker 1995b, 2001; Pinker and Ullman 2002). Chomsky holds that we have "no difficulty in attributing to them rules for past tense, rules that we recognize to be different from our own." Whatever principle a theorist may come up with to account for this, the desired explanation will probably invoke several abstract principles working in tandem which do not prevent the child at that stage from saying sleeped, so she says it. No one taught her; just the opposite, actually. She probably will become adjusted to the normative ways of adult speech and learn to say slept. Let us say sleep, slept, slept will then be just the sort of rule which enables a speaker to conform to a social practice; it is of some interest that we view slept as "irregular."8 By parity, the earlier rule sleep, sleeped, sleeped can only be viewed as a fact about the child's constitution. The application of both rules is "blind," but in different ways.

The skeptic need not be answered, then, if we can restrict attention to only those aspects of the human language learning ability which may properly be called "grammatical." Two things happen in this area: (i) it allows a level of explanation for a universal object and, hence, the explanation constitutes something like natural principles on a par with the face-recognition paradigm; (ii) we can avoid meaning rules as part of the explanatory vocabulary, enabling us, thereby, to avoid the principal thrust of the skeptic. It would be puzzling to claim that Webster's Dictionary reflects how we are constituted; it is eminently plausible that the principles of universal grammar have a biological basis.
It is reasonable, then, to conclude that the challenges posed by Quine and Kripke do not affect grammatical theory. As to word meanings, I am not rejecting any of the objections stated in this section, but I am not accepting them either at this stage of inquiry. Each of the objections, we saw, can be admitted, set aside, or deferred provided a more abstract account of the structure of concepts is reached. At first guess, it seems that the question of whether such an account is reachable is an empirical one. Thus, more material is needed before we can evaluate the effect of these objections. I return to this topic in section 4.4.
4.3.2 Nouns
The classic work of Jerrold Katz (Katz and Fodor 1963; Katz 1972) contains the most explicit articulation of the goal of semantic decomposition. Katz proposed his theory several decades ago, and everyone in the field knows that it has fatal problems. Yet, it is unclear if these problems pertain to the specific formulations in Katz's theory, or whether the stated goal of lexical semantics is fundamentally flawed. To focus on this issue, I will ignore several aspects of Katz's work that he thought to be central to his theory, and which have been strongly criticized in the subsequent literature.9 Thus, I will simply pick a noun and ask what concepts can be listed, in some order, to specify its meaning—that is, I will treat all concepts that enter into the decomposition of a complex concept as semantic markers.

Katz's basic idea is that once we have a sufficient body of decompositions in terms of semantic markers for a variety of nouns, some patterns and uniformities are likely to emerge that lead us toward a finite set of general concepts which are superordinate to the rest. Following this intuition, the only constraint I will impose on the theory is superordination: a more general concept must dominate a less general concept in a semantic tree. This will ensure that, in the final analysis, the most general concepts (= primitives) dominate the rest. Testing for the taxonomic organization of common nouns as coordinate (arm-leg) and superordinate (fruit-apple) is a standard method of measuring semantic competence (Morais and Kolinsky 2001, 471).

Furthermore, it is advisable that we begin with nouns whose decompositional character is something of common knowledge—that is, nouns that have generally agreed definitions for some of their central uses. It is well known that explicit decomposition of meaning is a rare phenomenon in any case. It seems to work, if at all, for fairly restricted classes
of lexical items: these include jargon vocabularies (ketch, highball), terms in axiomatized systems (triangle), and kinship vocabularies (grandmother, bachelor) (Fodor et al. 1980). As Jerry Fodor (1998, 70–72), following Putnam (1983), observes, it is difficult to deny the "conservative" intuition that bachelor and unmarried have an "intrinsic conceptual connection" such that bachelors are unmarried is "boringly analytic." I will focus on bachelor to judge the theory on its strongest classical ground.

Since semantic markers are nothing but concepts, we represent a semantic marker also in uppercase. Markers such as HUMAN and COLOR are not supposed to be English words. They are supposed to be "theoretical constructs" lexicalized in English as human and color. In a language different from English, these constructs will be either lexicalized differently, or borrowed from some other language, or not lexicalized at all. In other words, the English word human is supposed to invoke, among other things, the conceptual information HUMAN; similarly for other common nouns such as male, adult, animal, and the like. Thinking of common nouns as "predicates" that take (a set of) semantic markers as "arguments," these markers may now be used to decompose the meaning of a complex predicate such as bachelor. Following the constraint of superordination, the dictionary entry for bachelor will then be listed as a path with markers [HUMAN MALE ADULT UNMARRIED], in that order of increasing specificity.

Despite the high rhetoric borrowed from grammatical research, the theory is beset with serious internal problems from the beginning. The lexical item bachelor typically means an unmarried male, as noted; but it could also mean a male animal without a mate during breeding season (mawam), among other things. It is important that these two readings apply to the same word bachelor, since it is part of the native user's grasp of bachelor that, unlike kite and bat, it is not ambiguous. How do we capture this in a single lexical entry? The problem is that the differences between these two readings of bachelor begin straight away. Since bachelor could cover either people or mawams, the markerese paths branch right at the top, HUMAN or ANIMAL. So, the semantic decompositions for the two readings will consist of [HUMAN MALE ADULT UNMARRIED] and [ANIMAL MALE YOUNG WITHOUT-A-MATE], respectively. Notice that we must use MALE after HUMAN (or ANIMAL), since HUMAN/ANIMAL is a higher category. This means that the marker MALE will have to be listed twice: once after HUMAN to terminate in UNMARRIED, and again after ANIMAL to
terminate in WITHOUT-A-MATE. The generalization that MALE is common to the two readings of bachelor is missed. The result is that we are forced to view bachelor as lexicalizing two entirely different conceptual paths with nothing in common. But in fact, apart from MALE, the two readings have a central feature in common—something like WITHOUT-A-MATE; how do we insert this item? Notice that we cannot begin with WITHOUT-A-MATE at the top to solve the problem, because this category is subordinate to both HUMAN and ANIMAL: everything that is without a mate is either a human or an animal (or a robot), but not all humans or animals (or robots) are without a mate. If we insert WITHOUT-A-MATE after HUMAN or ANIMAL in separate paths, the commonality will not be represented.

Suppose, as Fodor suggested, we give up everything else and settle for UNMARRIED as the meaning of bachelor, since we failed to accommodate this crucial item in tandem with other markers except HUMAN and MALE. As noted, there is also an intuition that the sense in which bachelor applies to people has something in common with its application to mawams. That intuition has to be given up now, since UNMARRIED will not apply to anything outside humans. In this picture, HUMAN immediately dominates the sequence [MALE ADULT UNMARRIED], where the path terminates. The composite concept BACHELOR now applies to each unmarried adult human male.

A popular counterexample, due to George Lakoff (1987), is the Pope; the Pope is unmarried, but is he a bachelor? Suppose we introduce the item CAN-MARRY to exclude the Pope. All human males (of marriageable age) are thus subjected to a binary classification, UNMARRIED and MARRIED. Under UNMARRIED, let us have a second binary branching into CAN-MARRY and CANNOT-MARRY. So, a bachelor is [UNMARRIED CAN-MARRY]; the Pope is the other one. But then a Muslim (another popular counterexample), who is allowed up to four wives but in fact has two, cannot be placed in the scheme: he is both married and can marry. If we now allow the second binary branching under MARRIED as well, it generates the contradictory path [MARRIED CANNOT-MARRY]. To solve the problem, we go back and revise the item CAN-MARRY to CAN-MARRY-AGAIN, so that we have a consistent path [MARRIED CAN-MARRY-AGAIN] that accommodates the Muslim. But now, a bachelor is a human male who is unmarried, but who can marry again. But CAN-MARRY-AGAIN applies only to someone who is already married, which conflicts with the item UNMARRIED. We cannot have it both ways.
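The bookkeeping problems just rehearsed can be displayed directly. In the toy encoding below (my notation, not Katz's), each reading of bachelor is a top-down path obeying superordination; the forced duplication of MALE, and the absence of any single node expressing WITHOUT-A-MATE across the two paths, are visible in the data structure itself.

    # Toy markerese paths for the two readings of bachelor, ordered from
    # most general (superordinate) marker to most specific.
    BACHELOR_PATHS = [
        ["HUMAN", "MALE", "ADULT", "UNMARRIED"],
        ["ANIMAL", "MALE", "YOUNG", "WITHOUT-A-MATE"],
    ]

    # The trees branch at the top (HUMAN vs. ANIMAL), so the shared marker
    # MALE must be listed once per path and cannot be factored out:
    shared = set(BACHELOR_PATHS[0]) & set(BACHELOR_PATHS[1])
    print(shared)  # {'MALE'}: the generalization the format cannot state

    # Nor can WITHOUT-A-MATE be promoted above the branch point without
    # violating superordination: not everything human or animal lacks a mate.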
We can go on like this. All we are left with is that bachelors are some sort of solitary individuals, but solitary individual and bachelor are not only not synonymous, they do not even have the same extension. The philosopher of science Norwood Hanson once opened a very high-profile conference on the meaning of theoretical terms in science with the observation that electron means little things that wiggle.

Most of these problems are well known in the literature, as noted. Thus, regarding the attempt to give an "exhaustive decomposition" of the "necessary and sufficient conditions" governing the meaning of bachelor, Ray Jackendoff (2002, 375) dismisses the entire project because "the meaning of bachelor is inseparable from the understanding of a complicated social framework in which it is embedded." But then he suggests immediately that "someone has to study these more complex aspects of meaning eventually" (p. 376). This raises the prospect that someone, maybe Jackendoff himself, is prepared to enter into an understanding of the "complicated social framework." The fact that the project is never launched by Jackendoff, or by anyone else to my knowledge, deserves examination.

In our analysis of bachelor, we just asked for some nontrivial addition to the grammatical story that tells us how to represent the meaning of bachelor in terms of concepts. The basic problem seems to be quite overwhelming, namely, that in order to analyze the meaning of a common noun such as bachelor just to find a distinguishing concept, we have to jump into a vast sea of knowledge fairly blindly; this sea does not seem to part in accordance with our semantic intuitions. To review, it is a central part of the data for bachelor that bachelors are unmarried males is "boringly analytic"; to that extent, there is no problem in representing this intuition in terms of the concepts [ADULT, MALE, HUMAN, UNMARRIED] in some order.10 But the data for bachelor also include intuitions such as mawams are bachelors, bachelor is closer in meaning to spinster than to professor, the Pope though unmarried is not a bachelor, bachelors are without a mate, two unmarried adult male shipmates are both bachelors, and so on. The complexity of the organization of these intuitions has little to do with the "complicated social framework" involved in the concept of marriage, without denying that marriage is complicated. In other words, the problem is not just that whatever decomposes BACHELOR is itself complicated; the problem is that our intuitions seem to wander across these complicated decomposers. These intuitions concern any of marriage, mates, adulthood, heterosexual activity, loneliness, and so on, on a
piecemeal basis, apparently without any concern for global coherence with respect to the decomposers just listed.

The preceding diagnosis of the problem might be seen as a step towards a more comprehensive lexical semantics that takes into account many more dimensions of world knowledge. Thus James Pustejovsky (1995) proposes that we enlarge the scope of semantics to include aspects of world knowledge such as the origin, material constitution, layout, function, and future course of things. It is hard to see how these notions throw significant light on the semantics of abstract terms such as bachelor. But maybe a systematic application of these dimensions allows a more comprehensive semantics for concrete nouns (Moravcsik 1981). It is unclear if the complex character of semantic intuitions described above can be interestingly accommodated even with such rich resources.

Chomsky's analysis of river is a dramatic illustration of the suspicion just raised (Chomsky 2001b). Elaborating on Hobbes's definition that we call something the "same river" if it comes from the same source (= origin), Chomsky points out that if the river changes its course by several miles, we will continue to call it the "same river" as long as it has the same source. But then suppose the course of the river is reversed, that is, it does not have the same source any more and we know about it; even then it would be the same river. Next, we can have a river that is artificially broken into tributaries so that it ends up somewhere else; it will be the same river. Suppose it is filled with waste containing 99 percent arsenic from some chemical plant upstream, thus changing its material constitution drastically; it is still the same river. Similarly, if the river dries up completely, the material constitution changes, but it is the same river. In contrast, suppose we make a minuscule quantum-theoretic change so tiny that nobody can even detect it, and the river hardens into a glassy substance. Suppose we sprinkle something on it to add friction, draw a line in the middle, and cars start going up and down. It is the same object at the same place with virtually the same material constitution. But it will not be called a "river"; it will be called a "highway." To add to Chomsky, suppose further that the river remains dry for many years, the bed fills up with erosion from the banks, and people start cultivation. Suppose this state of affairs continues for several generations, and the name of the river disappears from the local dialect. Then, one day, torrential rains start, the sand and the mud are washed away, and water starts flowing. It will now be called a "flooded field." Chomsky concludes: "It's physically identical to what it was before, but it's not a river. On the other
hand, you can change its course, move it, you can reverse its direction, and you can change its content, it still will be the same river." The net result is that even if we enlarge the scope of semantics in many dimensions to include the origin, material constitution, layout, function, and future course of things, as well as social expectations, conventions, and psychological needs, it is unclear how these additional dimensions fit the organization of semantic intuitions. It seems that semantic decomposition basically boils down to making lists of current uses of a term, as above, to be supplemented in the future as and when new uses are detected. One could add some theoretical flavor to such lists by introducing technical expressions such as "polysemy," "family resemblance," "cluster concept," "fuzziness," and the like. But all they do is highlight the fact that we have a list, nothing more.

Ray Jackendoff raises a similar complaint about much influential work in lexical semantics: building "an industry on the endless details of a single word" is not "properly systematic," he says. Thus, he is unhappy with the work of Steven Pinker and Anna Wierzbicka because "the result is all too often a tiring list, impossible for any but the most dedicated reader to assimilate" (Jackendoff 2002, 377). Jackendoff's complaint is that the activity of making unsystematic lists is not even a first step toward the solution of Plato's problem. To that end, Jackendoff (2002, xvi) suggests that we stick to the original agenda of semantic decomposition, but aim for a "far richer notion of lexical decomposition." Thus, he wishes to reformulate the agenda with the explicit recognition that word meanings constitute a "richly structured system" along a variety of dimensions. These dimensions include "conditions that shade away from focal values, conditions that operate in a cluster system, features that cut across semantic fields," and so on. We note that these suggestions were made in the context of the complexity of bachelor and proper names, among other things (p. 377). It is totally unclear how these dimensions are supposed to harness the complexity of bachelor or river. In any case, Jackendoff does not give any indication of how to do so for, say, bachelor.
4.3.3 Verbs
However, there is another way of looking at Jackendoff's suggestion that does seem to offer, perhaps subliminally, a theoretical handle on the issue of the complexity of nouns. Jackendoff could be suggesting that notions such as "focal value" and "semantic field" already form a part of the
theoretical vocabulary in the field of lexical semantics. But the domain in which these notions have theoretical value does not currently include nouns. Chomsky (2001b) makes a similar point: the "internal structure of nouns is not studied very much" because when "you start looking at the structure of the simplest nouns, their complexity is overwhelming." As a result, lexical semantics has dealt mostly with verbs because "you can discover things; you can find the primitive elements that seem to be rearranged in different ways." From this perspective, the study of common nouns with "richly structured systems" of meanings becomes a future research program already implemented in less complex cases. Thus it is not surprising that most of the work, including Jackendoff's, on semantic fields and related aspects of lexical meaning is concentrated on verbs and prepositions rather than on nouns. In our terms, we may not yet know what distinguishes college from church, or rabbit from rabbit part, in a systematic study of nouns, but there is some promise of a theoretical framework that will ultimately make a systematic postgrammatical distinction between decide and try. The hope is that once the semantics of verbs is adequately understood, it might lead to a better understanding of the semantics of nouns.

Setting nouns aside, then, we try to find some theoretical guidelines from the study of verbs. Over fifteen years ago, Jackendoff (1990, 3) suggested that the state of semantics at that point was akin to that of generative syntax in the early 1960s, with its emphasis on attaining "descriptive power." Jackendoff asked for "some years of experience" in "semantic description" before "issues of explanation" could be meaningfully explored. The remark is puzzling, since work on syntax-inspired semantics started in the early 1960s itself (Katz and Fodor 1963; Katz and Postal 1964; Gruber 1965), not to mention philosophical explorations that go back at least to Austin 1962, Wittgenstein 1953, and Frege 1892. In any case, according to Jackendoff, lexical semantics continued to be "descriptive" in the early 1990s while grammatical research was entering the "Galilean" phase. In the decade that followed this remark, the cumulative international effort on semantics far exceeded that on syntax. Yet, in his later work, Jackendoff (2002, 377) concludes an extensive survey of lexical semantics with the following words: "there are fundamental methodological and expository difficulties in doing lexical semantics . . . (p)erhaps there is no way out: there are just so many goddamned words, and so many parts to them." I am reminded at once of Chomsky's (2000d, 104) remark that
any "complex system will appear to be a hopeless array of confusion before it comes to be understood, and its principles of organization and function discovered." With these somber thoughts in mind, I will consider perhaps the most encouraging work in the semantics of verbs: the study of semantic fields (Gruber 1965). Jackendoff has described and elaborated on Gruber's proposals in a number of books (1972, 1983, 1990, 1992, 2002, etc.). As with the gavagai problem, some authors (Larson and Segal 1995, 4) view the study of semantic fields as a central concern of semantic theory. Jackendoff (1990, 25–27; 2002, 356–357) appeals to the following data:

(115) a. Spatial location and motion
   (i) The bird went from the ground to the tree.
   (ii) The bird is in the tree.
   (iii) Harry kept the bird in the cage.
b. Possession
   (i) The inheritance went to Philip.
   (ii) The money is Philip's.
   (iii) Susan kept the money.
c. Ascription of properties
   (i) The light went/changed from green to red.
   (ii) The light is red.
   (iii) Sam kept the crowd happy.
d. Scheduling of activities
   (i) The meeting was changed from Tuesday to Monday.
   (ii) The meeting is on Monday.
   (iii) Let's keep the trip on Saturday.

How salient is the data, and what is the data for? At the minimum, the list displays the obvious fact that the same verb can be used in a variety of ways. Consider run. All sorts of things run, some without moving at all: people, dogs, noses, books, computers, cars, batteries, paints, ink, ideas, commentaries, and so on. Things can run into buildings, trouble, persons, surprises, applause, 500 pages, the thirteenth week, big time; things can run out of breath, time, life, the house, gas, money, ink. As we saw, the Bengali verbs khaawaa and taanaa take a vast range of assorted complements. We also saw that some of these complements are shared between the two verbs. It is not surprising that go, be, change, and keep also have a variety of uses, while they share some of them. Should we think of semantic fields in terms of a list of these complements? If yes, then a list of semantic fields just represents obvious data.
Jackendoff's claim, of course, is that these verbs not only have a variety of uses, but that these uses fall under delineable patterns captured by a principled array of semantic fields. The underlying idea seems to be that relatively stable linguistic devices such as verbs and prepositions interact in ways such that they generate a spectrum of uses to be realized in specific contexts: "The variety of uses is not accidental" (1990, 26). Thus, keep and in interact to determine the spatial location of things, keep and on determine the scheduling of activities, and so on.
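To see what this proposal amounts to mechanically, consider a toy rendering (mine, not Jackendoff's) in which the semantic field of a use is looked up from the verb-complement pair. Whether such a table is a theory or merely a list is, of course, exactly the question pressed below.

    # Toy verb-and-preposition table of semantic fields, after (115).
    FIELD = {
        ("keep", "in"): "spatial location",   # kept the bird in the cage
        ("keep", "on"): "scheduling",         # keep the trip on Saturday
        ("go", "to"): "possession/motion",    # the inheritance went to Philip
    }

    def semantic_field(verb, prep):
        try:
            return FIELD[(verb, prep)]
        except KeyError:
            # Unlisted uses (kept the secret close to his heart, went into
            # a coma) get no principled classification; they can only be
            # added to the table, which is just more list-making.
            return "unclassified"

    print(semantic_field("keep", "in"))     # spatial location
    print(semantic_field("keep", "close"))  # unclassified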
sentences each express a change of some sort," and so on. Notice that the suggested characterizations are nonunique in the sense that lots of verbs satisfy them: turn, leap, run, fall, and so on all express a "change of some sort." Some of these verbs perhaps do not appear in some of the fields occupied by go. But that only brings out the well-known fact that uses of words "criss-cross and overlap," as Wittgenstein put it. Just a list such as (115) merely restates this fact in informal terms without suggesting even the beginning of an explanation. Suppose we grant that the study of semantic fields at least brings out some nonunique but stable conditions on the meaning of verbs: keep signals causation of a state that endures over a period of time. Fodor (1998, 51) points out the obvious problem—familiar to us by now following the study of bachelor and river—that the generalization depends crucially on the univocality of causation, state, endurance, and the like. In other words, the claim that keep sentences always denote the causation of a state that endures requires that the concepts CAUSE, STATE, and ENDURANCE be invariant across the variety of uses. Since Fodor has discussed this problem at length, I will turn to what seems to me a more direct objection to the project. It seems that, even if we grant invariance to the concepts that supposedly explain the meaning of a word, Jackendoff's description of how the invariant concepts help generalize the meaning of a verb is simply not true. Thus, granting invariance to ENDURANCE, why should we always suppose that keeping something results in enduring states? Suppose John promised that he would destroy the contract if Ali disagreed. Ali disagreed and John tore up the contract. We could say, John kept his word. What was the enduring state caused by/to John, and where? Following John Austin (1961, 233–252), we could say that keeping one's word, like promising, is a performance that begins and ends with the utterance of the expression; in that sense, these uses of keep need not involve ENDURANCE at all. It seems that the idea of an enduring state is based on selective uses of keep, as in kept the crowd happy or kept the bird in the cage; it does not apply to all uses of keep. In fact, even for kept the bird in the cage, it is unclear if the generalization applies. Suppose the bird escaped as soon as John kept it in the cage: what was the enduring state? Jackendoff is surely depending on the normal circumstance that if you keep a bird in a cage (carefully), it stays there for a while. From familiar expectations about birds, cages, and our ability to hold them there, Jackendoff proceeds to formulate universal conditions on meaning: some birds are not likely to oblige.
We have merely scratched the surface of the massive literature in lexical semantics.11 Still, the basic problem ought to be reasonably clear by now. Undoubtedly, words—bachelor, river, keep, go—not only differ in sound, they differ in meaning as well. Also, each of these words has a variety of uses in varying contexts. It is natural to expect that the meaning, which distinguishes one word from another, also plays out in its varying uses. Here, the general suggestion in the literature is that a word A is narrowly understood in terms of a manageably finite set of concepts C, where the meaning of A is determined by the conceptual role conferred on it by C: the meaning of bachelor is captured in terms of UNMARRIED, ADULT, and MALE; the meaning of keep is determined by CAUSE, STATE, and ENDURANCE. The study of word meanings is expected to bring out neat conceptual networks in which the individual meanings are placed. The problem is to construct a sufficiently abstract theory that captures this expectation. The problem looks unsolvable because the putative theory needs some terms in which to describe the system of unity-in-diversity, and those terms can only be some words in use. As suggested throughout the preceding exercise, the idealization we hope for is that we describe word meanings with a system of (nonlinguistic) concepts, perhaps with some "parameters" attached to account for varying uses. Yet, devices like capitalization notwithstanding, the expressions that supposedly stand for concepts—male, mate, source, cause, endurance—are themselves words. So, what meaning they lend to the word meanings depends entirely on what these (concept) words themselves mean. In other words, we do not have any access to the conceptual system except through the word meanings the system is supposed to describe. The problem would arise even if each word had exactly one, definite use in every context. The problem compounds, as we saw, if the same word is used in a variety of contexts such that no delineable pattern emerges to cover all of them. In fact, even a cursory study of word use—bachelor, river, keep—suggests that the conceptual relations in which word meanings are embedded are not only indefinitely porous and open-ended, they are often in conflict for the uses of the same word. As a result, either we are restricted to a list of preferred uses or we get trapped in cycles of word meanings when we attempt to generalize. It is important to emphasize that the complaint is about the suggested form of explanation, not about a competent speaker's uses of words, for there must be universal constraints on the content of words for children to learn them rapidly and effortlessly. The suggested form of explanation does not unearth those constraints.
Recall that we entered the study of word meanings to make some progress toward a more specific account of sound-meaning correlations. The project was faced with a number of initial objections (section 4.3.1). In each case, we asked for deferment since we wanted to survey empirical research in this domain. It looks as though each of the objections is borne out by what we saw. Hornstein's objection that postulating semantic markers only pushes the problem back one step is justified since the expressions that stand for semantic markers (= concepts) themselves require an explanation of word meanings. The objection noted by Jackendoff is essentially borne out because, in the absence of even the first abstract generalization, no route to primitives has been found. Quine's objection to the very idea of analytic connections is plausible because we failed to discern any stable pattern between words and their conceptual structure independently of use in specific contexts. Quine's problem of indeterminacy is valid for roughly the same reason: since the desired patterns could not be detected, uncontrolled conceptual variation cannot be ruled out within the suggested form of explanation. Kripke's problem is justified directly because in each case we saw that the explanations were based on a limited list of current uses which do not generalize. Thus, current approaches in this domain do not raise the prospect of (even the beginning of) an abstract theoretical hold on the phenomena under consideration. As Jackendoff (2002, 377) observes, "There are fundamental methodological and expository difficulties in doing lexical semantics."

4.4 Crossroads
What happens then to the classical idea that language is a system of sound-meaning correlations? To recapitulate (section 4.1), in the face of the alleged incompleteness of grammatical theory, I suggested three options for linguistic inquiry: (i) to supplement grammatical theory with theories covering the full-blooded classical notion of meaning; (ii) to declare that language theory is indeed incomplete in principle; or (iii) to dissociate the scope of grammatical theory from the putative scope of (broad) language theory. From what we saw in the last two chapters, it stands to reason that the first option is unlikely to materialize from contemporary formal and lexical semantics approaches. This negative result leads to two contrasting perspectives.

Phenomenal Complexity of Language
Global skepticism about approaches from formal and lexical semantics does not preclude the possibility
that grammatical theory may cover aspects of traditional notions of meaning—truth conditions, conceptual connections, and so forth—from its own resources in the future. In principle, then, option (i) can still be resurrected and pursued from the direction of grammatical theory itself. What are the prospects? Insofar as the current form and scope of grammatical theory is concerned, it is unlikely that it can be extended to qualify as a complete traditional theory of language. Even if resources of grammatical theory such as hierarchy, Subject-Object asymmetry, argument structure, and so on are exploited to capture some of the abstract, structural aspects of predication and truth condition (Hinzen 2006, chapter 5), as noted, it is hard to see how grammatical theory can determine the content of these structures. Also, I do not see how all aspects of lexical richness—beyond I-meanings, that is—involved in climb, run, bachelor, river, keep, khaawaa, and so forth can be predicted from grammatical resources alone. As we saw, there is nothing grammatical about the palpable conceptual distinction between John decided to attend college and Bill tried to attend church since the conceptual distinction arises after grammatical resources have been exhausted. In the next chapter, I will suggest that the scope of semantics in grammar is basically restricted to what I will call "FL-driven interpretive systems." As things stand now, grammatical theory will continue to be an incomplete theory of language—option (ii)—for establishing sound-meaning correlations, if our desired notion of language includes the full-blooded concept of meaning. This is because, as we saw, any composite theory that attempts to harness the phenomena that are sought to be covered by formal and lexical semantic approaches needs to plunge into the massive complexity of natural languages. I recall Chomsky's remarks with which I opened chapter 1: "The systems found in the world will not be regarded as languages in the strict sense, but as more complex systems, much less interesting for the study of human nature and language." To get a feel for what this implies, consider the putative components of a "full" theory of language that begins with grammatical theory. First, apart from the core computational system CS, we need a systematic account of the complex human lexicon. Next, we need a principled enumeration of conditions enforced by the complex array of "external systems" (at least) at the two interfaces.12 Finally, we need a theory of language use in the complex conditions of world knowledge, including speakers' and hearers' intentions and goals, not to mention more "local" factors such as personal idiosyncrasies, historical accidents, cultural effects,
mixing of populations, injunctions of authority, and the like. Needless to say, an account of the extremely complex human conceptual system needs to be woven into many of the preceding components if we are to capture anything like the full-blooded concept of meaning. The foundational difficulties with the study of the conceptual system (assuming there to be one) alone cast doubt on the intelligibility of the enterprise, notwithstanding the salience of the grammatical component and its (very) restricted extensions. The phenomenal complexity of language thus keeps the second option alive, suggesting that a study of complex systems is unlikely to unearth "real properties" of nature, as Chomsky suggested.

Simplicity of Grammar
In sharp contrast, as progressively abstract theoretical inquiry was directed at the phenomenal complexity of language, linguistic theory was able to extract a strikingly simple grammatical system. Even if there are sharp differences as to how language-specific features are to be covered under a general theory, there is wide agreement that the computational principles and operations are invariant across languages.13 On the one hand, given the formal part of the human lexicon, the system establishes sound-meaning correlations of a restricted sort in some core cases (Chomsky 2000c, 95); in this sense, the system captures the essential feature of what counts as a language "in the strict sense." On the other, the system does not cover conceptual and categorial organization, structure of intentionality, background beliefs, pragmatic factors such as speaker's intentions, and the like (Chomsky and Lasnik 1977, 428; Chomsky 2000d, 26). This much abstraction was fully in force during the G-B phase, but it is discernible throughout the biolinguistic project. Thus, without failing to be languages in the strict sense, grammars radically depart from traditional conceptions of language both in terms of coverage and in the principles contained in them. Looking back at the recent thrust of research in linguistic theory, it seems to me that the enterprise looks at the rest of the phenomenal complexity of language basically to isolate just those parts that can be brought into harness with the grammatical system. As a result, as far as I can see, linguistic inquiry is getting increasingly restricted to just those phenomena that can be directly explained with the properties of the computational system and the relevant properties of the lexicon alone. At least in Chomsky's own work—my focus—there is less and less concern with the sound system and, as we will see, even the quasi-semantic topics of G-B (θ-theory, Binding theory, scope ambiguity, and so on) are progressively set
aside from the (core) domain of grammar, without losing sight of the phenomena, of course. It is a lesson from the history of the sciences that once a simple optimal design of a system has been reached through rigorous empirical argumentation, it becomes harder to handle the phenomenal complexity that abounds in any case. Typically, we try to "protect" the simple design. It took several centuries of research, supported by a range of mathematical innovations, to spell out and identify the scope of Newton's strikingly simple laws of motion. Phenomenal complexity continuously threatened those laws, but they were too elegant to be dispensed with. It is one of the wonderful mysteries of science that, in the favorable cases, these restrictive moves actually lead to wider coverage of (previously unanticipated) phenomena. In any case, as noted at the outset, the history of the more established sciences suggests that once some observed complexity has been successfully analyzed into parts, a study of the original complexity often drops out of that line of research, perhaps forever. I am possibly overstating the current case for grammatical theory, but that seems to be one of the prominent future directions; the other, of course, is to continue to pursue the first option somehow. As we will soon see, in the more recent Minimalist Program, the even more restrictive grammatical system—called "CHL"—consists of exactly one recursive operation, Merge, and a small set of principles of efficient computation that constrain the operations of Merge. This object is buried deep down somewhere in the total cognitive architecture of humans such that people do not have introspective access to it in any intelligible sense. Its existence was not even known until recently. In CHL, linguistic theory is most likely to have unearthed a "real property of matter." These considerations strongly suggest an even more focused line of inquiry directed at CHL itself, but on a plane different from that of classical linguistic inquiry. The existence of this object raises a new set of questions. What are the relations between the different components of CHL? What is the notion of meaning minimally captured by the operations of this system? Given the very abstract character of its principles, is the effect of CHL restricted to human language? What does it mean to have such a system in the broad architecture of the human mind? In addressing these and related questions, we essentially leave the traditional choice (the first option) behind, and pursue the third.
5 Linguistic Theory II
Since its inception, the biolinguistic project has always viewed languages as systems consisting of a computational system (CS) and a lexicon. CS works on elements of the lexicon to generate the sound-meaning connections of human languages. This foundational aspect of the project was firmly articulated in Chomsky 1980, 54–55, where a principled distinction was made between the computational and the conceptual aspects of languages. A computational system includes the syntactic and the semantic systems that together provide the "rich expressive power of human language"; the conceptual system was taken not to belong to the language faculty at all but to some other faculty that provides "common sense understanding" of the world. To my knowledge, the strikingly more abstract notion of the single computational system of human language—CHL—was first used officially in Chomsky 1994a.1 To appreciate how the notion emerged, recall that the G-B system postulated a computational system each of whose principles and operations was universal—that is, the principles themselves did not vary across languages. But most of the principles came with parametric choices such that particular languages could not be described in this system until the parameters were given specific values. In that sense, the G-B system also worked with a plurality of computational systems for describing human languages. A very different conception of the organization of the language faculty emerged as follows. To track the printed story, Chomsky (1991c, 131) proposed that "parameters of UG relate, not to the computational system, but only to the lexicon." If this proposal is valid, then "there is only one human language, apart from the lexicon, and language acquisition is in essence a matter of determining lexical idiosyncrasies" (p. 131). The "lexical idiosyncrasies" are viewed as restricted to the morphological part of the lexicon; the rest of the lexicon is also viewed as universal. In that sense,
Chomsky (1993, 170) held that "there is only one computational system and one lexicon." Finally, in Chomsky 1994a, this computational system was given a name: a single computational system CHL for human language. "CHL" was used extensively in Chomsky 1995b. The guiding assumption—mentioned earlier but spelled out much later, in Chomsky 2000b—is that language variation itself ought to be viewed as a problem, an "imperfection," for the learnability of languages; there are just too many of them. Notice that this idea has a different flavor from the classical considerations from the poverty of the stimulus. Even if the salience of those considerations is questioned (Pullum and Scholz 2002; Crain and Pietroski 2002 for response), no one can deny that there are thousands of languages and dialects. Linguistic theory, therefore, should be guided by the "Uniformity Principle" (116).

(116) In the absence of compelling evidence to the contrary, assume languages to be uniform, with variety restricted to easily detectable properties of utterances.

Assuming the detectable properties to reside (essentially) in the morphological part of the lexicon, the conception of a single computational system for (all) human languages follows. The Minimalist Program (MP) for linguistic theory attempts to articulate this conception. As usual, I will keep to the basic design of the system following Chomsky.

5.1 Minimalist Program
To set the stage for MP, let me quickly restate one of the major aspects of the principles-and-parameters framework that was stressed at the very beginning (see section 2.2): the property Gcp (construction-particularity) disappears.2 Literally, there are no rules in the G-B system; there are only universal principles with parametric choices. Move-α is the only transformational operation, and it applies to any category. Although the property Glp (language-particularity) cannot disappear, nothing in the system explicitly mentions specific languages such as English or Hopi. Specific languages are thus viewed as an "epiphenomenon" produced by the system when triggered by particular experience (Chomsky 1995b, 8). In this sense, the framework abstracts away from the vast particularities of languages to capture something like "human language" itself. By any measure, this is a striking abstraction. It is surprising that, despite this leap in abstraction, empirical coverage has in fact substantially increased, explaining the exponential growth in crosslinguistic studies in the last three decades. On the basis of what we can see from here with
limited understanding, the Minimalist Program proposes to push the growing abstraction very close to the limit.
5.1.1 Conceptual Necessity
As preparation to that end, let us ask, following Chomsky 1993: Can the linguistic system be described entirely in terms of what is (virtually) conceptually necessary? Suppose we are able to form some conception of what is minimally required of a cognitive system that is geared to the acquisition, expression, and use of human language with its specific properties of unboundedness and construction of "free expression" for speech and interpretation (Chomsky 1994a). Chomsky is asking whether this minimal conception can be put to maximal explanatory use. In other words, can the relevant facts of language be fully explained with this conception alone? To appreciate what is involved, let us see if there are nonminimal conceptions in the earlier theory. Consider the distinction between "inner" and "outer" levels of grammatical representation. We assume that (at least) two outer levels are needed in any case since the system must be accessible to external systems: sensorimotor (SM) and conceptual-intentional (C-I). Keeping to G-B, let us assume for now that PF and LF are those levels; PF and LF are then the "interface" levels of representation. The inner levels (d- and s-structures), in contrast, are just theoretical constructs to which no other system of the mind has access. In current terms, the inner levels are not conceptually necessary. So we have a lexicon and two interface levels for the SM and the C-I systems. The task of the computational system then is to map lexical information onto the two interfaces for the "external" systems to read them: satisfaction of legibility conditions. As for the design of CHL, we not only assume, as noted, that it is completely universal, we would want the system to contain only those principles and operations that are required under conceptual necessity. Now, we know that there is at least one "imperfection" in human languages in that sound-meaning connections are often indirect: the displacement problem. We return to the issue of whether it in fact is an imperfection. Since syntactic objects need to be displaced, conceptual necessity suggests that there is one operation that effects it: Move-α; better, Affect-α. This part of conceptual necessity (as with much else), then, was already achieved in G-B. Since movement is "costly," we assume, again following the spirit of conceptual necessity, that movement happens only as a "last resort" and with "least effort." These are notions of economy that endow an optimal
character to a system. We assume that only economy principles with "least-effort" and "last-resort" effects occur in the system to enforce optimal computation. Move-α (or Affect-α) may now be viewed quite differently: nothing moves unless there is a reason for movement, and then the preference is for the most economical movement (Marantz 1995). Finally, we ensure, following conceptual necessity, that optimal computation generates only those syntactic objects that are least costly for the external systems to read. In other words, representations at the interfaces also must meet conditions of economy. Let us say that a system that meets these design specifications is a "perfect" system. We thus reach the Strong Minimalist Thesis (SMT): "Language is an optimal solution to interface conditions that FL must satisfy." If SMT held fully, "UG would be restricted to properties imposed by interface conditions" (Chomsky 2006c). (See figure 5.1.) The first thing we would want to know is how the displacement problem is addressed within these restrictive conditions. We saw that the postulation of s-structures allowed computations to branch off to accommodate this fact. Without s-structures, the problem reappears. We saw that displacement requires movement. So the issue of how to handle displacement gets closely related to the issue of what causes movement and
Figure 5.1
Minimalist program
what constrains it. The basic idea in MP is that certain lexical features cause movement under conditions of optimality enforced by economy principles. The idea is explained and implemented as follows.
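Before turning to the implementation, the minimal architecture motivated in this section—a lexicon and exactly two interface levels, with no inner levels for other systems to access—can be fixed in a schematic sketch. The sketch is my own, purely expository; the names and types are not Chomsky's formalism:

```python
# Expository sketch (mine, not Chomsky's formalism) of the minimal design:
# lexical information is mapped to just two interface representations.
from typing import List, NamedTuple

class Interfaces(NamedTuple):
    pf: str  # representation read by the sensorimotor (SM) systems
    lf: str  # representation read by the conceptual-intentional (C-I) systems

def c_hl(lexical_selection: List[str]) -> Interfaces:
    """Map lexical information onto the two interfaces; the output is
    all that external systems can inspect (legibility conditions)."""
    derivation = " ".join(lexical_selection)  # stands in for optimal computation
    return Interfaces(pf=f"PF({derivation})", lf=f"LF({derivation})")

print(c_hl(["the", "man", "saw", "it"]))
# Interfaces(pf='PF(the man saw it)', lf='LF(the man saw it)')
```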
5.1.2 Feature Checking
Suppose the system is presented with an array of lexical items. This array is called a "numeration" N, which is a (reference) set {LI, i}, where LI is a lexical item and i its index showing the number of occurrences of LI in the array. A lexical item enters the system with the operation Select. Select maps N to a pair of representations at the interfaces. Each time Select picks a lexical item, the index is reduced by one. A representation is not well formed until all indices reduce to zero; essentially, the procedure guarantees in purely algorithmic terms that each occurrence of every lexical item of the given array has entered the system. Since, by the time computation reaches the interfaces, complex syntactic objects must be formed for inspection by the external systems, two or more objects picked up by Select need to combine. The operation that repeatedly combines lexical items as they individually enter the system is Merge. What are the items that must be "visible" at the interfaces for the external systems to read? It seems the following items are needed at most: properties of lexical items, and certain types of larger units formed of lexical items—units traditionally called "noun phrase," "verb phrase," and so on. As we will see, the semantic component seems to recognize things that can be interpreted as topic, new item, argument structure, perhaps even proposition, and the like. For this, individual lexical items (their interpretable features) and structures like CP, DP, and "light-verb" phrases are needed at the interface. In X-bar-theoretic terms, just the minimal and maximal projections are needed; in particular, X-bar levels are not needed. Conceptually speaking, lexical items enter the system via Select anyway, and Merge combines larger units from them "online." So, X-bar theory is not needed under conceptual necessity. Having suggested the conceptual point, I will leave matters at that, since, technically, it is not clear that bar levels can in fact be dispensed with.3 The elimination of bar levels suggests another minimalist move. Recall that bar levels were introduced during computation; similarly for indices for binding theory, not to speak of d- and s-structures. Under conceptual necessity, lexical information must enter CHL. In contrast, bar levels, indices, and inner structures seem to be just theoretical devices, and of these, as we saw, some are certainly eliminable. In the spirit of minimalism, then, we adopt the Inclusiveness Condition: no new objects are added during
computation apart from rearrangement of lexical properties. In effect, the condition requires that what must enter the system is the only thing that enters. Conceptually, the Inclusiveness Condition captures the natural idea that to know a language is to know its words (Wasow 1985); the computational system does not have anything else to access. We recall that most of the constraints on movement were located at the inner levels of computation (figure 2.1). With the inner levels removed, the system will generate wildly. The effects of some of these constraints, therefore, need to be rephrased or captured somehow in minimalist terms. Under conceptual necessity, there are a limited number of options available: either the burden of the inner levels is distributed over the outer levels, or they are traced back to the lexicon, or the principles of the system are redesigned; or, as it turns out, all of them together. Basically, we will expect the conditions of computational efficiency, such as least effort and last resort, to constrain the operations of Merge such that, after optimal computation, all and only those structures are typically retained which meet legibility conditions. The huge empirical task is to show that this in fact is the case. Suppose Merge combines two lexical items α and β to form the object K, a "phrase," which is at least the set {α, β}. K cannot be identified with this set because we need to know which phrase it is: verb phrase, noun phrase, and so on. So K must be of the form {γ, {α, β}}, where γ identifies the type of K; γ is, therefore, the label of K. As discussed below, lexical items are nothing but collections of features. By the Inclusiveness Condition, γ can be either the set of features in α or in β, or a union of these sets, or their intersection. For simplicity, suppose γ is either α or β, the choice depending on which of the two items in the combination is the head: the head of a combination is the one that is "missing something," the one that needs supplementation. Suppose it is α;4 the other one, β, is then the complement. Thus K = {α, {α, β}} (for more, see Uriagereka 1998, 176–182). When the and man are combined by Merge, we get the structure K = [the [the man]]. Note that K here is of the type "determiner phrase" (DP). Suppose Merge forms another object L from saw and it. Merging K and L, we get

(117) [VP saw [DP the [the man]] [V′ = V saw [saw it]]]

The resulting structure (117) is not exactly pretty in that it is difficult to interpret visually; the proposed tree diagrams are even more so (see Chomsky 1993). We seem to lose the familiar sense of sentencehood. But
then the subroutines that underlie the single-click operations Delete or Escape in your computer, not to speak of Change All Endnotes to Footnotes, are not simple either. We expect a genuine theory to describe (states of) its objects at a remove from common expectations. (117) is what the mind "sees," only a part of which is made available to the articulatory systems via economy conditions internal to the phonological system. This unfamiliar structure enables a range of novel theoretical moves. Recalling X-bar concepts, think of K, which is the maximal projection of the, as the "Specifier" of saw, and the maximal projection of saw as the "Verb Phrase": all and only the items needed at the LF-interface have been made available under minimalist assumptions. The following local domains are also available: specifier-head and head-complement, along with a modified c-command, details of which I put aside. Notice that X-bar theory—especially the intermediate projections (the bar levels)—is essentially abandoned on the basis of the most elementary assumptions (Chomsky 1995b, 249). Notice also that the scheme naturally incorporates the VP-internal-Subject hypothesis, which, as noted, has independent motivation—for example, cliticization and external θ-assignment. There are other advantages of generating phrase structure "online" as Select and Merge introduce and combine sets of lexical features. We will see that it leads to a very natural and economical theory of displacement. Regarding the nature of (lexical) information that enters the system, the basic idea is that lexical items are a cluster of features: phonetic features (how it sounds), semantic features (what its linguistic meaning is), and formal features (what its category is). Suppose we introduce a general requirement that formal features need to be "checked" in the course of a derivation. "Checking" really means checking for agreement, a grammatical well-formedness condition known since antiquity. For example, the number feature of verbs has to agree with the same feature of Subject-nouns: they do versus he does. The number feature makes a difference to the semantic interpretation of nouns, for example, the difference between singularity and plurality. But, although the verbs have to agree with the nouns in this respect, the number feature of verbs does not contribute to their semantic interpretation: do means exactly what does does. The fact that the number feature of verbs is uninterpretable at the semantic interface conflicts with optimal design because a principle of economy on representations, Full Interpretation (FI), will be violated. The presence of uninterpretable features thus renders the system "imperfect." Uninterpretable features abound in languages: number, gender, and person features on verbs; Case features on nouns; number and gender
features on articles, and so on. The peculiar and ubiquitous property of the agreement system in human languages is at once variously exploited and explained in MP. For one such exploitation, we "save" semantic interpretation by letting the number feature of verbs be wiped out from the semantic part of the computation once the sound requirement of agreement is met. Generalizing sweepingly on this, we first impose the requirement of feature checking of certain formal features, which means, in effect, that another instance of the (agreeing) feature to be checked must be found somewhere in the structure. By minimalism, the relation between the feature-checker and the feature-checked must be established in a local domain: least effort. So, if an item is not in the local domain of its feature-checker, it must move—as a last resort, obeying least effort throughout—to a suitable position to get its feature checked (and be wiped out if it is not making any semantic difference). Thus, certain formal features of lexical items such as gender, person, and number, often called "φ-features," force displacement under the stated conditions. To take a simple example, skipping many details, consider the sentence she has gone (Radford 1997, 72). The item she has the formal features third-person female nominative in the singular (3FNomS). Apart from the tense feature (Present), the item has has the feature that it needs to agree with a 3NomS Subject. Has is in the local domain of she (specifier-head) and its features match those in the Subject; similarly for the features in has and gone. The structure is well formed. Among features, Nom is a Case feature. Case features do not make any difference to semantic interpretation. So, Nom is wiped out from she once the requirement of has has been met; similarly with the 3S features of has. However, the 3SF features of she remain in the structure since these are semantically significant. The situation is different with she mistrusts him. Following Radford 1997, 122, suppose Merge creates the structure (118) in three steps.

(118) [she [Infl [mistrusts him]]]
      she: 3FS, Nom   Infl: Tense (unvalued)   mistrusts: Pres, 3SNom

Now, mistrusts requires a 3SNom Subject to check its uninterpretable 3SNom features. She is the desired Subject, but she and mistrusts are not in a local relation since Infl (= Tense) intervenes. The tense feature of Infl, on the other hand, lacks any value. Let us say Infl attracts the tense feature Pres of mistrusts. Suppose that the attraction of the Pres feature
moves other features of mistrusts as well: general pied-piping. This results in a local relation between the (features of) she and the attracted features such that the checking of the rest of the features of mistrusts takes place, and uninterpretable features are wiped out. Additionally, Infl gets filled with the tense information Pres such that Infl now becomes LF-interpretable (satisfies FI). The scheme just described addresses an obvious problem. Surely, uninterpretability is something that external systems determine; how does the computational system know which features are uninterpretable without "look-ahead" information? The basic solution is that CHL looks only for valued and unvalued features, not interpretable and uninterpretable ones. For example, CHL can inspect that she has the value "female" for the gender feature or that some Infl is unvalued for the tense feature, as we saw. So, the computation will be on values of features rather than on their interpretability. Assume that features that remain unvalued after computation is done will be uninterpretable by the external systems. CHL will "know" this in advance, and the computation will crash. Again, a host of complex issues arise that I am putting aside.
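The valuation idea can be made concrete with a small sketch. The feature representation below is a toy of my own devising, loosely following the she mistrusts him example; it models only valuation (unvalued features seeking matching valued ones), not interpretability or the wiping out of features, and none of it is the official formalism:

```python
# Toy model of feature valuation: a probe with unvalued features (None)
# is valued by a matching goal; anything left unvalued "crashes".
# Illustrative sketch of the logic only, not Chomsky's formalism.

def agree(probe: dict, goal: dict) -> None:
    """Value the probe's unvalued features (marked None) from the goal."""
    for feature, value in probe.items():
        if value is None and goal.get(feature) is not None:
            probe[feature] = goal[feature]

def converges(*items: dict) -> bool:
    """A derivation converges only if no feature remains unvalued."""
    return all(v is not None for item in items for v in item.values())

# she mistrusts him: Infl's tense is unvalued; mistrusts carries Pres.
she  = {"person": 3, "number": "S", "gender": "F", "case": "Nom"}
infl = {"tense": None}
verb = {"tense": "Pres", "person": 3, "number": "S"}

agree(infl, verb)                  # Infl attracts the Pres feature of mistrusts
print(infl)                        # {'tense': 'Pres'}
print(converges(she, infl, verb))  # True: nothing left unvalued
```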
5.1.3 (New) Merge
What I just sketched belongs basically to the early phase of the program (Chomsky 1995b). Developments in the years that followed suggest that we can do even better in terms of SMT. I must also note that, as with every major turn in theory, early adherents begin to branch out as the logic of a given proposal is pursued from different directions. Hence, as noted in the preface to this work, there is much less agreement with Chomsky's more recent proposals than with the early phase of the minimalist program, even within biolinguistics.5 With this caveat in mind, recall that MP postulated two elementary operations: Merge for forming complex syntactic objects (SOs), and Move for taking SOs around. Do we need both? We can think of Merge as an elementary operation that has the simplest possible form: Merge (α, β) = {α, β}, incorporating the No Tampering Condition (NTC), which leaves α and β intact. The effect is that Merge, in contrast to the earlier formulation, now projects the union of α and β simpliciter, without labels—that is, without identifying the type of syntactic object constructed. As a result, the phrase-structure component stands basically eliminated from the scheme, leading to the postulation of "phases," as we will presently see. Also, Merge takes only two objects at a time—again the simplest possibility (Boeckx 2006, 77–78)—yielding "unambiguous paths" in the form
Figure 5.2
Unambiguous paths
of binary branching, as in figure 5.2.6 It is obvious that NTC is at work: γ is not inserted inside α and β; it is inserted on top (= edge) of α, β: NTC forces hierarchy. Merge is conceptually necessary. That is, "unbounded Merge or some equivalent is unavoidable in a system of hierarchic discrete infinity" because complex objects need to form without bound, so "we can assume that it 'comes free'" (Chomsky 2005, 12; also 2006c, 137). The present formulation of Merge is the simplest since, according to Chomsky, anything more complex—for example, Merge forming the ordered pair ⟨α, β⟩—needs to be independently justified. The argument works even if it is suggested that, due to the physical conditions imposed on human sign systems, a specific order—linearity—is inevitable (Uriagereka 1999, 254–255). If the physical design forces linearity anyway, why should Merge specifically reflect that fact? The simplest design of the language faculty thus treats linear ordering of linguistic sequences as a property enforced by the sensorimotor systems, since humans can process acoustic information only in a single channel; apparently, dolphins can process acoustic information in two channels simultaneously. The computational system itself does not enforce order; it enforces only hierarchy—that is, sisterhood and containment.7 The emergence of Merge signaled the "Great Leap Forward" in evolution. Now, unless we make the special assumption that α and β in Merge (α, β) are necessarily distinct, β could be a part of α. Since special assumptions are "costly," we allow the latter since it comes "free." In that case, (Internal) Merge can put parts together repeatedly as long as other things are equal. The original part will appear as copies (= traces) conjoined to other parts:

(119) The book seems [the book] to have been stolen [the book]
Here, displacement of the book from the original Object position just means that only one of the copies, the leftmost one, is sounded, for reasons of economy in the phonological component; the others are left as covert elements to be interpreted by the C-I systems. Internal Merge thus functions as Move under the copy theory of movement.

5.1.3.1 Merge and Syntax  Consider the effects of Merge in the system. As noted, linguistic information enters the computational system in the form of lexical features divided into three types: phonetic, formal, semantic. External Merge takes lexical features as inputs and constructs complex SOs obeying NTC; internal Merge sets up further relations within an SO. These ideas enable us to redescribe the familiar phenomenon of dislocation. Following Chomsky 2002, I will give a brief and informal description (see Chomsky 2000b, 2000c, 2006c for more formal treatment). To implement dislocation, three things are needed:
(i) A target (= probe) located in a head that determines the type of category to be dislocated
(ii) A position to be made available for dislocation
(iii) A goal located at the category to be dislocated

By the Inclusiveness Condition, lexical information is all and only the information available to the computational system. Lexical information is distributed as properties of features. So, the preceding three requirements can be met if we can identify three features that have the relevant properties. In fact, there are these three features in a range of cases requiring dislocation. For example, in some cases, the goal is identified by the uninterpretable Structural Case, the probe is identified by redundant agreement features, and the dislocable position (= landing site) is identified by the EPP (extended projection principle) feature. It is easy to see how the phenomenon represented in (118) can now be redescribed in probe-goal terms (Infl = Tense is the locus of EPP). The scheme provides a natural explanation for the existence of the EPP feature. Previously, the Extended Projection Principle—the Subject requirement typically satisfied by pleonastic there and it in English (see section 2.3.1.3)—was considered "weird" because no semantic role is involved. Now we know what the role is: it is a position for dislocation, namely the Subject position (Chomsky 2002; Bošković 2007 for a different approach to EPP). To see how the basic scheme extends to wh-movement and to illustrate the subtlety of research, I have followed Pesetsky and Torrego (2006) and
Figure 5.3
Merge in language
Pesetsky (2007), who propose somewhat different ideas from Chomsky's regarding valuation and interpretability. Setting many details (and controversies) aside, figure 5.3 displays the phenomenon of dislocation of which book in the structure (120); only the features relevant for wh-movement are shown.

(120) (I wonder) which book the girl has read.

In the diagram (adapted from Pesetsky 2007), (external) Merge generates the basic structure in six steps. According to Pesetsky and Torrego, the probe C—on analogy with T, which is typically unvalued and which hosts EPP—is viewed as containing an unvalued but interpretable wh-feature along with EPP; it thus seeks a matching valued feature to "share" the value of the latter. A valued (interrogative) but uninterpretable wh-feature is located in the goal as indicated. The EPP feature of C determines the position C′, which is the projection of the head C, for (internal) Merge to place a copy of the goal which book there.8 The dislocation of which book to the edge satisfies the "external" condition that which book is to be interpreted as a quantifier. In effect, conditions on meaning follow from the satisfaction of narrow conditions on computation. Furthermore, the incorporation of Merge enforces a drastic reduction in computational complexity. As Chomsky (2005) explains, the G-B
model required three internal levels of representation—D-Structure, S-Structure, and LF—in addition to the interface levels. This increases cyclicity. Intuitively, a syntactic "cycle" refers to the syntactic operations in a domain, where the domain is determined by a selection from the lexicon (Boeckx 2006 for more). Now, the postulation of three internal levels in G-B required five cyclic operations on a selection from the lexicon: (i) the operations forming D-Structures by the cyclic operations of X-bar theory; (ii) the overt syntactic cycle from D- to S-Structure; (iii) the phonological/morphological cycle mapping S-Structure to the sound interface; (iv) the covert syntactic cycle mapping S-Structure to LF; and (v) formal semantic operations mapping LF compositionally to the meaning interface. SMT suggests that all this can be reduced to a single cycle, dispensing with all internal levels; as a result, the distinction between overt and covert computation is given up. Incorporation of Merge enables the elimination of compositional cycles (i), (ii), and (iv). This leaves two mapping operations, (iii) and (v), to the interfaces. For these, Chomsky invokes the notion of a minimal SO called a "phase." Phases denote syntactic domains constructed by Merge. In some cases, phases look like classical phrases such as CP and DP (see figure 5.3); in some others, they are new objects such as "light-verb" phrases (vP) with full argument structure. Details (and controversies) aside, the basic point for SMT is that phases are all that the grammatical system generates and transfers to the interfaces; perhaps the same phase is transferred to both interfaces.

5.1.3.2 Merge and Semantics  As noted, computations at the sound end of the system might pose problems for this coveted austere picture, since phonological computation is thought to be more complicated than computation to the C-I interface (= narrow syntax): "We might discover that SMT is satisfied by phonological systems that violate otherwise valid principles of computational efficiency, while doing the best it can to satisfy the problem it faces" (Chomsky 2006c, 136). The austere picture might still hold if we recall classical conceptions of language "as a means of 'development of abstract or productive thinking'" with "communicative needs a secondary factor in language evolution" (Chomsky 2006c, 136).9 In contemporary terms, this means that the principal task of FL is to map syntactic objects to the C-I interface optimally; mapping to the SM interface will then be viewed as "an ancillary process."10 SM systems are needed for externalization, such as the ability to talk in the dark or at a distance; if humans were equipped with
telepathy, SM systems would not be needed. Thus, SM systems have little, if anything, to do with the productivity of language. Viewing narrow syntax thus as the principal effect of FL, syntactic operations essentially feed lexical information into the C-I systems single-cyclically (phase by phase), period. Finally, the Merge-based mechanisms just sketched have interesting consequences, in line with SMT, for the meaning side. First, the single-cyclic operation eliminates the level of representation LF/SEM; this is because, by (v) above, the existence of this level of representation would add another cycle. Second, Merge constructs two types of structures, each a type of phase, that are essentially geared to semantic interpretation. According to Chomsky, most of the grammatically sensitive semantic phenomena seem to divide into two kinds: argument structure and "elsewhere." "Elsewhere" typically constitutes semantic requirements at the edge, such as questions, topicalization, old/new information, and the like, as noted. These requirements seem to fall in place with what Merge does: as we saw, external Merge constructs argument structures, internal Merge moves items to the edge. At the sound end, internal Merge forces displacement. For our purposes, it is of much interest that, although the relevant format for the C-I systems is made available by Merge, much of the semantic phenomena handled in G-B is no longer covered inside the narrow syntax of MP. However, as argued above (chapter 3), some semantic computation must be taking place in mapping lexical information to the C-I interface; there is no alternative conception of what is happening in this cycle. In that sense, the narrow syntax of MP captures what may be currently viewed as "minimum" semantics. Given SMT, much of what fell under grammatical computation earlier—binding, quantifier scope, antecedent deletion condition, theta theory, and so on—can no longer be viewed as part of FL proper. According to Chomsky (2002, 159), the operations that are involved in these things—assignment of θ-roles to arguments to enforce the θ-criterion, free indexing to set up coindexing for Binding theory, and mechanisms for marking scope distinctions—"are countercyclic, or, if cyclic, involve much more complex rules transferring structures to the phonological component, and other complications to account for lack of interaction with core syntactic rules."11 These systems are best viewed as located just outside FL, hence at the edge of the C-I systems, which, to emphasize, are performance systems: "It is conceivable that these are just the interpretive systems on the meaning side, the analogue to articulatory and acoustic phonetics, what is going
on right outside the language faculty" (Chomsky 2002, 159). These interpretive systems are activated when data for anaphora, thematic roles, and so on carry information for these procedures. This enables us to think of CHL itself as independent of these procedures.12 To speculate, although Chomsky has placed these systems "right outside" FL, it seems to me that they require a special location there. So far, the "outside" of FL on the semantic side was broadly covered by the entire array of the conceptual-intentional (C-I) systems. These comprise the conceptual system and the "pragmatic" systems giving instructions for topicalization, focus, and perhaps instructions for truth conditions; call the package "classical C-I." The systems just expelled from FL seem to differ quite radically from the classical C-I systems: (i) they enforce structural conditions like scope distinction and referential dependency; (ii) their description needs grammatical notions such as anaphors and wide scope; and, most importantly, (iii) few of them, if any, are likely to be shared with, say, chimpanzees, who are otherwise viewed as sharing much of the C-I elements with humans (Hauser, Chomsky, and Fitch 2002; Premack and Premack 2003; Reinhart 2006). To capture the distinction between these and the classical C-I systems, let us call the former "FL-driven Interpretation" (FLI) systems. To follow Chomsky's tale, classical C-I (and sensorimotor) systems were already in place when the brain of an ape was "rewired" to insert FL. In contrast, as we will see in section 5.2, although FLI systems are not language-specific, they are best viewed as linguistically specific; we do not expect every cognitive system requiring manipulation of symbols to have them. In other words, although FLI systems are invariant across specific languages such as English and Hopi, their application is restricted to the domain of human language. Given their linguistically specific nature, it is hard to think of FLI systems as existing in the ape's brain prior to the insertion. In that sense, FLI systems belong to FL without belonging to CHL. In the extended view of FL just proposed, FLI systems occupy the space between CHL and classical C-I.13 If this perspective makes sense, then we may not view FLI systems as "external" in the sense in which classical C-I systems are external. In this sense, they are dedicated to language even if they are viewed as located outside of CHL. We are going by just a tentative list of these systems so far. As discussed in chapter 3, much of the preparatory structural work for the thought systems seems to take place here: the θ-criterion is satisfied, referential dependency is established, scope ambiguities are resolved, and so forth. What falls under
FLI is an empirical issue, but the conception ought to be reasonably clear. The proposed organization thus retains the spirit of G-B in that the information encoded at the output representation of the FLI systems recaptures LF; however, in G-B, these "modules" were seen as operative in narrow syntax itself. FLI systems thus (immediately) surround CHL. In doing so, they carve the path "interpretation must blindly follow" (Chomsky 2006a). CHL is a computational device that recursively transfers symbolic objects optimally to the FLI systems. If CHL has optimal design, then we will expect CHL to transfer (mostly) those objects that meet the conditions enforced by the FLI systems. For example, we will expect the structural conditions for establishing various dependencies in Binding theory, or scope distinctions for quantifiers, to follow from computational principles contained in CHL (Chomsky 2006a, 2006d; Rouveret 2008). FLI systems then enforce additional conditions—perhaps countercyclically—to establish sound-meaning correlations within grammar. In doing so, they generate structures to which the classical C-I systems add content. From this perspective, most of the interesting work trying to articulate the grammatically sensitive semantics of human language is in fact focused on FLI systems (Hinzen 2006; Reinhart 2006; Uriagereka 2008).
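Before turning to the economy conditions, the bare workings of Merge described in section 5.1.3 can be put in miniature. The sketch below is a toy rendering of my own, not Chomsky's formalism: it builds label-free binary sets, respects the No Tampering Condition by never altering its inputs, and treats internal Merge as re-merging a part of an object, leaving the lower occurrence in place as a copy, as in (119):

```python
# Minimal sketch of Merge as binary set formation, respecting the
# No Tampering Condition (inputs are never altered, only embedded).
# External Merge combines two distinct objects; internal Merge
# re-merges a part of an object, leaving a copy in place.

def merge(alpha, beta):
    """Merge(a, b) = {a, b}; frozensets keep the inputs intact (NTC)."""
    return frozenset([alpha, beta])

def contains(so, part):
    """Is `part` a term of the syntactic object `so`?"""
    if so == part:
        return True
    return isinstance(so, frozenset) and any(contains(x, part) for x in so)

def internal_merge(so, part):
    """Re-merge a part of `so` at the edge; the lower occurrence
    survives as a copy (= trace), unpronounced at PF."""
    assert contains(so, part), "internal Merge targets a part of the object"
    return merge(part, so)

vp = merge("stolen", "the-book")             # external Merge
clause = merge("seems", merge("to-have-been", vp))
raised = internal_merge(clause, "the-book")  # yields (119): copy left inside
print(raised)
```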
5.1.4 Economy Principles
Chomsky (1995b) extensively discussed two global ideas: least effort and last resort. As far as I can figure, these were discussed as "considerations" or "factors" that must be implemented somehow in the workings of the computational system for human language; it is hard to locate an actual statement of them as principles on a par with, say, the projection principle or the principles of binding.14 Be that as it may, the conceptual significance of these ideas is not difficult to appreciate. As the name suggests, last resort requires that syntactic operations—especially operations effecting dislocation—be resorted to only when some syntactic requirement cannot be met by other means. This is because syntactic operations are "costly." Least effort requires that, in case syntactic operations are to be executed, the global preference is for executions that are least costly. The point is that, whatever the specific articulation of these ideas, the minimalist program needs them in one form or another. Hence, subsequent controversies about the character of specific principles (noted below) do not affect the general requirement. Although Chomsky examines a range of syntactic phenomena directly from these "considerations," he does propose some specific principles that
appear to fall under the last resort and least effort categories. For example, Chomsky 1995b made extensive use of the principles of Greed and Procrastinate. Greed required that a syntactic operation, essentially Move, apply to an item α only to satisfy morphological properties of α, and of nothing else. This clearly is a last resort principle—Chomsky christened it "self-serving last resort"—which has a markedly descriptive character. It is eliminable if we simply require that Move works only under morphological considerations—that is, Move raises features, which, in fact, is all that Move does. The idea is even more naturally incorporated in the probe-goal framework with (Internal) Merge, as we saw. Procrastinate, another last resort principle, required that covert operations to LF be delayed until overt operations to PF are completed because PF operations are "costly." Skipping details, this had to do with a distinction between strong (phonological) features and weak features (for more, see Lasnik and Uriagereka 2005, 3.4). As the distinction was given up, so was Procrastinate. In any case, as we saw, in the recent design there is no overt/covert distinction. The point to note is that both of these last resort principles were cast in linguistic terms—morphological features and strong features—and both were eliminated soon after they were proposed. But the elimination of specific principles does not mean that the last resort condition has disappeared. As Boeckx (2006) shows for a variety of cases, the last resort condition is now deeply embedded in "local" syntactic mechanisms; for example, an element is no longer accessible for computation once its Case has been checked, which means that Case checking is a last resort. We can state this rather specific condition because it is a reflex of the general last resort condition. In this sense, the last resort idea remains central, but as a consideration or an effect to be implemented in syntax whenever required (Lasnik and Uriagereka 2005, 174–175). As for least effort principles, three stand out: Full Interpretation (FI), the Minimal Link Condition (MLC), and the Shortest Derivation Condition (SDC)—all drawn essentially from G-B. Although the general conceptions of last resort and least effort are clear enough, it is not obvious that specific principles fall neatly under them. Take Procrastinate. In the preceding discussion, Procrastinate was taken to be a last resort principle because it delayed (covert) computation to LF—that is, between PF- and LF-computation, LF-computation is the last one resorted to. By the same token, since Procrastinate allows PF-computation first, it can be viewed as a least effort principle that reduces the number of "costly" overt operations. This is reaffirmed as follows.
Shortest Derivation Condition says that, between two converging derivations, the one with fewer steps is preferred. It is then clearly a least-effort principle. However, Kitahara (1997) suggested that Procrastinate can be derived from the least-effort SDC; it follows, in any case, that Procrastinate is not independently needed.

Although the least-effort spirit of SDC is well taken, it looks problematic as formulated, since it requires that two or more converging derivations be compared for their length. Comparing derivations "globally" after they are over is a hugely costly affair, hardly becoming of a least-effort principle. The natural solution is basically to block multiple derivations by keeping derivations short, so that alternative derivations do not get the chance to branch out, as it were (Boeckx 2006). This is achieved by reducing syntactic domains and the operations on them: cyclicity. In that sense, Merge-driven single-cycle derivations by phase implement the spirit of SDC without invoking it; as a principle, SDC is not required.

This leaves the principles FI and MLC. As noted, Full Interpretation requires that no illegible objects appear at the interfaces (for more, see Lasnik and Uriagereka 2005, 105–106); Minimal Link Condition requires "shortest move" in that an element cannot move to a target if another element of the same category occurs between the first element and the target. MLC thus imposes a condition of relativized minimality—"relativized" because minimal links are defined with respect to specific categories such as WP. Both FI and MLC are clearly least-effort conditions. As with last resort and SDC, the question is whether they are to be stated as specific principles. As Chomsky (1995b, 267–268) observes, if we state MLC as a principle, then it can only be implemented by inspecting whether another derivation has shorter links or not. Similar observations apply to FI if it is to be formulated as an output condition for choosing the most economical representation. In each case, the problem of globality arises. As with last resort, the natural solution is to think of these conditions as enforced directly in the system. Thus, the reflex of MLC obtains by simply barring anything but the shortest move (Reinhart 2006, 22); the reflex of FI obtains by rejecting structures with uninterpretable elements within CHL, as noted.

Empirically, we know that these conditions on movement and representations are obeyed in a vast range of cases. To that extent, the conditions are empirical generalizations. If, however, evidence is located in which, say, the minimal link condition on movement is apparently violated, we try not to withdraw the least-effort condition but to explain the anomaly by drawing on other factors (Boeckx 2006, 104). Beyond empirical generalization, then, the least-effort conditions act as general constraints on the system.
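The contrast between the costly global formulation and its local reflexes can be made vivid in a toy sketch. The positions, the wh-items, and the length measure below are invented for illustration; the point is only that the local version never compares whole derivations at all.

    def sdc_global(derivations):
        """Global least effort: compare finished converging derivations by
        length and keep the shortest; costly, since the alternatives must
        first be completed."""
        return min(derivations, key=len)

    def move_local(candidates, target):
        """Local reflex of the MLC: of the same-category elements that could
        move to the target, only the closest may move; no derivation is ever
        compared with another."""
        return min(candidates, key=lambda c: abs(c["position"] - target))

    whs = [{"item": "what", "position": 3}, {"item": "who", "position": 5}]
    print(move_local(whs, target=6))  # 'who' is closer: anything but the
                                      # shortest move is simply barred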
To sum up, it is reasonable to say that the last-resort condition obtains in CHL in a general way. The least-effort condition seems to obtain more specifically in three subconditions implementing the effects of SDC, MLC, and FI: restricted cyclicity, a condition on movement, and a condition on representations, respectively. To emphasize, neither the last-resort condition nor the least-effort condition is stated as a specific economy principle. During computation, we expect the last-resort and least-effort conditions to obtain throughout, ensuring that the computational system, CHL, meets the conditions of optimal design in its operations.
5.2 CHL and Linguistic Specificity
Although intricate and sometimes elaborate computations take place in CHL as information from the extremely complex lexicon of human languages enters the system, CHL itself consists of just Merge operating under last-resort and least-effort conditions—apparently, nothing else.15 To put it differently, linguistic information is essentially stored in the lexicon and, working on it, CHL generates symbolic objects at the interfaces, which are interpreted by the relevant external systems. As we saw, some of these systems are likely to be linguistically specific. Is CHL itself—or, better, its components—linguistically specific?

I am raising this question because, once we reach the austere design of CHL under the Minimalist Program, it is difficult to dispel the intuition that the system seems to be functioning "blindly" just to sustain efficient productivity. There is a growing sense that, as the description of the human grammatical system gets progressively simpler, the terms of description become progressively less linguistically specific as well.

Let us say that a principle/operation P of a system Si is nonspecific if P makes no reference to Si-specific categories. Suppose that the collection of Ps is sufficient for describing a major component of Si for us to reach some nontrivial view of the entire system. Recall that, with respect to the language system, we have called such a principle a "purely computational principle" (PCP) earlier (section 2.2). It is the "purely computational" nature of the functioning of CHL that gives rise to the intuition of (the relevant notion of) nonspecificity. Intuitively, to say that P is purely computational is to say that the character—and hence the formulation—of P is such that its application
need not be tied to any specific system Si. In that sense, P could be involved in a system Sj which is (interestingly) different from the Si in which P was originally found. It could turn out, of course, that only Si has P, since only it requires P, even if its formulation is nonspecific—that is, it could be that there is no need for P anywhere else (but, in that case, the nonspecific character of P remains unexplained). So the idea really is that, if a computational system other than the language system required P, then P must be nonspecific; it does not follow from this statement alone that there are other computational systems requiring P. That is an empirical issue, but it opens up interestingly only when the collection of Ps in Si begins to look non-Si-specific.

Until very recently, linguists, including Chomsky, held a very different view of the language system. The GLOW manifesto, which represents the guiding spirit and motivation of current linguistic work, states explicitly that "it appears quite likely that the system of mechanisms and principles put to work in the acquisition of the knowledge of language will turn out to be a highly specific 'language faculty'" (Koster, Riemsdijk, and Vergnaud 1978, 342). In general, Chomsky had consistently held that, even if the "approaches" pursued in linguistic theory may be extended to study other cognitive systems, the principles postulated by the theory are likely to be specific to language: "There is good reason to suppose that the functioning of the language faculty is guided by special principles specific to this domain" (Chomsky 1980, 44; also Chomsky 1986, xxvi). Notice that this view was expressed during the G-B period, which promoted a strongly modular view of the organization of language (Boeckx 2006, 62–66), as we saw. The point of interest here is that the idea of linguistic specificity was advanced for the principles and operations that constitute the computational system of human languages.

Nonetheless, I am asking whether the elements of FL are dedicated to language alone, or whether there is some motivation for thinking that significant parts of FL might apply beyond language. I am suggesting that the most reasonable way to pursue this motivation, if at all, is to focus on the combinatorial part of the system and to ask whether some of the central principles and operations of this part could be used for other cognitive functions. Therefore, the term "CHL" is to be understood as a rigid designator that picks out a certain class of computational principles and operations, notwithstanding the built-in qualification regarding human language. However, so far, I am thinking of CHL as restricted to language and some other human cognitive systems, especially those that may be viewed as "languagelike," ones that are likely to require P under a first
approximation. In this formulation of the issue, the human specificity of these systems is not denied, although the domain specificity of some of their central organizing principles is questioned. The formulation arises out of the intuition that, besides language, there are many other cognitive domains where combinatorial principles seem to play a central role: arithmetic, geometry, music, logical thinking, and the interpretation of syntactic trees, maps, and other graphic representations (Casati and Varzi 1999; Roy 2007), to name a few. If the elements of FL are to be used elsewhere at all, it is quite likely that they reappear in some of these domains; that is the step of generalization I have in mind.

In a related way, the proposal might enable us to make better sense of the architecture of the language faculty, sketched earlier, in which domain-specific FLI systems are viewed as separate from the core computational system itself. If language is a distinct cognitive domain, we will expect some linguistically specific effects to cluster somewhere, while the computational system itself effects bare productivity in other domains as well. For that to happen, the computational system itself needs to be linguistically nonspecific.

To my knowledge, there has been little discussion of whether, and to what extent, the principles actually discovered in the study of language can be extended to other cognitive domains. Clearly, the issue under discussion here arises only for those cognitive domains for which a fairly abstract body of principles is already in hand. In other words, if the principles postulated for a cognitive domain are too directly tied to the phenomena they cover, then their very form will resist generalization across phenomenal domains. For example, questions of generalization could not have been interestingly asked for the system of rules discussed in the Aspects model of language (Chomsky 1965). It is generally acknowledged that a sufficiently abstract formulation has been reached, if at all, only for a few cognitive domains, including language, and that, too, very recently. Thus, given the lack of sufficient advance in studies of other "languagelike" cognitive domains, the question that concerns us here has not been routinely asked.

Postponing Chomsky's current and very different views on this issue to chapter 7, I will propose that a significant component of the language system, under suitable abstractions, consists wholly of purely computational principles. The proposal requires a study of the organization of grammar, principle by principle, to see if it is valid. Since we have just traced the development of grammatical theory, it seems to me that this is the right place (while grammatical theory is still fresh in our minds) to pursue the
proposal in terms of a quick review of what we have seen so far. I will discuss the significance of the issue, including its empirical motivation, in the chapters that follow.

To recapitulate (section 2.2), we may think of four kinds of rules and principles that a linguistic theory may postulate: language-specific rules (LSR), construction-specific rules (CSR), general linguistic principles (GLP), and purely computational principles (PCP). It is obvious that, for the issue of nonspecificity, only PCPs count. From that point of view, the four kinds of rules basically form two groups: linguistically specific (LSR, CSR, GLP) and linguistically nonspecific (PCP). If PCP is empty, then the language system is entirely specific. If PCP is nonempty but "poor," then the issue of nonspecificity is uninteresting beyond language. Thus, the real question is: is PCP rich? In other words, how much of the working of CHL can be explained with PCPs alone?

As we saw, the principles-and-parameters framework (P&P) postulates that rules of the first two kinds, that is, LSR and CSR, may be totally absent from linguistic theory. In these terms, a linguistic theory under the P&P framework postulates just two kinds of principles, GLP and PCP. However, it is clear that the framework alone is not enough for our purposes, since it allows both GLP and PCP. Therefore, unless a more abstract scheme is found within the P&P framework in which PCPs at least predominate, no interesting notion of nonspecificity can emerge. The issue obviously is one of degree: the more PCPs (and the fewer GLPs) there are in CHL, the more nonspecific it is. The task, then, is to examine the short internal history of the P&P framework itself to see if a move toward progressively PCP-dominated conceptions of CHL can be discerned. As noted, CHL has two components: some principles and one operation. I discuss these in turn.
5.2.1 Principles
Recall the organization of grammar in G-B theory schematically represented in figure 2.1. The following principles are postulated in that grammar: the projection principle, X-bar theory, the θ-criterion, the Case filter, the principles of Binding, the Empty Category Principle, Subjacency, the chain condition, and Full Interpretation, among others. Let us now see how the principles postulated by G-B theory fall under the suggested categories of GLP and PCP. The classification is going to be slightly arbitrary; we will see that this does not affect the general argument.

The projection principle stipulates that lexical information is represented at all syntactic levels, to guarantee that input information may not
be lost to the system. Any computational system requires that none of the representations that encode information be lost until a complete interpretation is reached. However, the formulation of the projection principle mentions the linguistic notion of lexical information. This suggests an intermediate category of principles; call it "quasi-PCP" (Q-PCP): linguistically specific in formulation, but PCP in intent.

X-bar is a universal template, with parametric options, that imposes a certain hierarchy among syntactic categories. Again, it stands to reason that any computational system will require some notion of hierarchy if a sequence of its elements is to meet conditions of interpretation. Still, it is not obvious that every symbol system must have the rather specific hierarchy of specifiers, heads, and complements captured in X-bar theory. In that sense, the principle falls somewhere between GLP and Q-PCP. Given the uncertainty, let us assume the worst case: that X-bar theory is a GLP.

θ-theory seems linguistically specific in that it is exclusively designed to work on S-selectional properties of predicates. The θ-criterion ("each argument must have a θ-role"), the main burden of this theory, is phrased in terms of these properties. But what does the criterion really do, computationally speaking? As we saw (section 2.3), two kinds of information are needed to precisely determine the relations between the arguments projected at d-structure: an enumeration of the arguments, and the order of the arguments (Chomsky, Huybregts, and Riemsdijk 1982, 85–86). Thinking of thematic roles as lexical properties of predicates, the θ-criterion checks to see whether elements in argument position do have this lexical property. To the extent that the θ-criterion accomplishes this task, it is a PCP. Yet, as noted, it is phrased in GLP terms. In my opinion, it ought to be viewed as a Q-PCP.

The Case filter ("each lexical NP must have Case"), the main burden of Case theory, is linguistically specific in exactly the same way: it cannot be phrased independently of linguistically specific properties. Yet, as with the θ-criterion, the Case filter serves a purely computational purpose, checking the ordering part of the set of arguments; as we saw, the system does not care which lexical NP has which Case as long as it has a Case. In that sense, it is a Q-PCP as well.

Binding theory explicitly invokes such linguistically specific categories as anaphors, pronominals, and r-expressions to encode a variety of dependency relations between NPs. It is implausible to think of, say, musical quantifiers, anaphors, and pronominals, just as it makes no sense to look for subject-object asymmetries in music. Notice that the problem is not that other symbol systems may lack dependency relations in general; they
cannot. The issue is whether they have relations of this sort. Similar remarks apply to the Empty Category Principle (ECP). These, then, are GLPs.

This brings us to the principle of Full Interpretation (FI) and to Bounding theory. Bounding theory contains the Subjacency principle, which stipulates the legitimate "distance" for each application of Move-α. These distances are defined in terms of bounding nodes, which in turn are labeled with names of syntactic categories such as NP or S. Abstracting over the particular notion of bounding nodes, Subjacency is an economy principle that disallows anything but the "shortest move" and, as such, is not linguistically specific; it is a Q-PCP. Finally, the principle of Full Interpretation does not mention linguistic categories at all in stipulating that every element occurring at the levels of interpretation must be interpretable at that level; in other words, all uninterpretable items must be deleted. FI, then, is a PCP.
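For bookkeeping, the classification argued for so far can be recorded in a small table. The verdicts are the ones just reached in the text; the code is merely a toy summary, and the crude count at the end stands in for the informal question of how rich PCP is.

    GB_PRINCIPLES = {
        "projection principle": "Q-PCP",  # PCP in intent, linguistic in phrasing
        "X-bar theory":         "GLP",    # the worst-case assumption made above
        "theta-criterion":      "Q-PCP",
        "Case filter":          "Q-PCP",
        "Binding theory":       "GLP",
        "ECP":                  "GLP",
        "Subjacency":           "Q-PCP",  # "shortest move" economy in disguise
        "Full Interpretation":  "PCP",    # mentions no linguistic category
    }

    non_glp = sum(v != "GLP" for v in GB_PRINCIPLES.values())
    print(f"{non_glp}/{len(GB_PRINCIPLES)} principles are PCP or Q-PCP")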
Notice that most of the principles cluster at the inner levels of representation: d-structure and s-structure. Moreover, the principles discussed are a mixed bag of GLPs, Q-PCPs, and PCP—predominantly Q-PCPs, in my opinion. In this scenario, although PCP is nonempty, it is poor; hence, the system is not really nonspecific. But the predominance of Q-PCPs, and the relatively meager set of GLPs, suggests that there are large PCP-factors in the system concealed under their linguistic guise. If these factors are extracted and explicitly represented in the scheme, G-B theory can turn into one more suitable for nonspecificity. I will argue that the scheme currently under investigation in the Minimalist Program may be profitably viewed in that light.

As noted, the Minimalist Program is more directly motivated by the assumption that FL has optimal design. Two basic concepts, legibility conditions and conceptual necessity, are introduced to capture this assumption. On the one hand, we saw that the intermediate levels of d- and s-structure are eliminable from the system on grounds of conceptual necessity. On the other, we saw that most of the complex array of principles of G-B theory was clustered at these inner levels. With the elimination of these levels, MP enforced a drastic reordering of principles.

Recall that we viewed X-bar theory, Binding theory, and ECP as GLPs. CHL, the computational system in MP, (arguably) does not contain any of them. Further, the projection principle, Subjacency, θ-theory, and Case theory were viewed as Q-PCPs. While the projection principle as such is no longer required in the system, Case theory is basically shifted to the lexicon. θ-theory, Binding theory, and ECP are shifted to an external cluster of FLI systems, as we saw. This leaves Subjacency, a Q-PCP.

Q-PCPs are essentially PCPs under linguistic formulation. This raises the possibility, as noted, that just the PCP-factor may be extracted out of them and explicitly represented in the system. The MP principle Minimal Link Condition (MLC) serves that purpose with respect to Subjacency. Similar remarks apply to the G-B condition on chains. This condition was first replaced by an economy condition called the Shortest Derivation Condition (SDC), which requires that, in case there are competing derivations, the derivation with the fewest steps be chosen (Chomsky 1995b, 130, 182); as we saw, the condition was then implemented directly by restricting the domain of operation. Thus, the only G-B principle fully retained for CHL in MP is Full Interpretation (FI), a PCP.

In sum, insofar as the G-B principles are concerned, all linguistically specific factors have either been removed from the system in MP or been replaced by economy conditions. As I attempted to show, all the principles of MP have been factored out of those of G-B—that is, no fundamental principle has been added to the system in MP. The general picture, therefore, is that CHL in MP is predominantly constituted of PCPs.

The preceding discussion of MP is not exhaustive. Let us also grant that the rendition of some of the individual principles and operations, regarding the presence or absence of linguistically specific factors in them, could be contentious. Yet, plainly, when compared to the G-B framework, the overall picture is one of greater generality and abstraction away from linguistic specificity. Recall that the only issue currently under discussion is whether we can discern a progressively PCP-dominated conception of CHL.
5.2.2 Displacement
A variety of objections may be raised against this picture. A general objection is that, granting that successive phases of linguistic theory do show a movement from GLPs to PCPs, PCPs are to be understood in the context of linguistic explanation (only). The objection is trivially true if its aim is to draw attention to a certain practice. There is no doubt that these PCPs were discovered while linguists were looking only at human languages. We need not have entered the current exercise if someone had also discovered them in the course of
investigating music or arithmetic. But the future of a theoretical framework need not be permanently tied down to the initial object of investigation. As Chomsky observed in the past, a sufficiently abstract study of a single language, say, Hidatsa, can throw light on the entire class of human languages; hence, on FL. This observation cannot be made if it is held that the non-Hidatsa-specific principles that enter into an explanation of Hidatsa cannot be extended to Hindi because Hindi was not on the original agenda.

No doubt, the laws and principles postulated by a theory need to be understood in their theoretical context. For example, the notions of action and reaction as they occur in Newton's force-pair law ("every action has an equal and opposite reaction") have application only in the context of physical forces, even though the law does not mention any specific system. We cannot extend its application to, say, psychological or social settings such as two persons shaking hands. Global limits on theoretical contexts, however, do not prevent theoretical frameworks from evolving and enlarging within those limits. The force-pair law does not apply to social situations, but it does apply to a very large range of phenomena, in some cases perhaps beyond Newton's original concerns. For instance, the law has immediate application in static phenomena like friction, but it also applies to dynamical phenomena such as jet propulsion. So the question whether principles of CHL apply to other cognitive systems is more like asking whether the force-pair law applies to jet propulsion than like asking whether it applies to people shaking hands. The burden is surely on the linguist now to tell us what exactly the boundaries of the linguistic enterprise are.

A specific objection to the picture arises as follows. As we saw in some detail earlier, human languages require that an element sometimes be interpreted in a position different from where it is sounded. John and the book receive identical interpretations in markedly different structures such as John read the book and the book was read by John. It is the task of a transformational generative grammar to show the exact mechanism by which the element the book moves from its original semantic position to the front of another structure without altering semantic interpretation. A basic operation, variously called Move-α or Affect-α in G-B, and Move, Attract, or Internal Merge in MP, implements displacement. We saw all this. Now the objection is that nothing is more linguistically specific than the phenomenon just described. A major part of CHL is geared to facilitate instances of movement in an optimal fashion. Thus, even if the requirement of optimality leads to PCPs, the reason why they are there
is essentially linguistic. In that sense, the phenomenon of displacement could be viewed as blocking any clear conception of nonspecificity of the computational system.

To contest this, I will outline a number of directions to suggest that the issue of displacement (hopefully) breaks down into intelligible options that are compatible with the general picture of nonspecificity.

First, suppose displacement is specific to human languages. In that case, the general picture will not be disturbed if the phenomenon is linked to other linguistically specific aspects of the system. From one direction, that seems to be the case. We saw that the lexicon, which is a collection of features, is certainly linguistically specific in the sense under discussion here. One of the central ideas in MP is that the lexicon contains uninterpretable features such as Case. Since the presence of these features at the interfaces violates FI, CHL wipes them out during computation. The operation that executes this complex function is Move. Move is activated once uninterpretable features enter CHL; displacement is entirely forced by elements that are linguistically specific.16

There are several ways of conceptualizing this point within the general picture. If Move is an elementary operation in the system, then we may think of this part of the system as remaining inert until linguistically specific information enters the system. The rest of CHL will still be needed for computing nonlinguistic information as Merge is activated to form complex syntactic objects. In effect, only a part of the system will be put to general use, and the rest will be reserved for language. Chomsky (1988, 169) says exactly the same thing for arithmetic: "We might think of the human number faculty as essentially an 'abstraction' from human language, preserving the mechanism of discrete infinity and eliminating the other special features of language." Alternatively, Move may be viewed not as an elementary operation but as a special case of existing operations. There are suggestions in which Move is viewed as specialized Merge (Kitahara 1997; Chomsky 2001a). As we saw in some detail, an even simpler view is that Move is simply internal Merge. So if you have (external) Merge, you have (internal) Merge for free.

Second, we may ask whether displacement is in fact linguistically specific. We saw a CHL-internal reason for displacement, triggered off by uninterpretable features. However, there is another reason for displacement.
like. The elimination of uninterpretable features takes an element exactly where it receives, say, a definiteness or quantifier interpretation. Given that linguistic notions such as topicalization, definiteness, and so on—‘‘edge’’ phenomena—are viewed as special cases of more general notions such as focus, highlight, continuity, and the like, could it be that the external systems that enforce these conditions are not themselves linguistically specific, at least in part? If yes, then these parts could be viewed as enforcing conditions on structures which are met in di¤erent ways by di¤erent cognitive systems in terms of the internal resources available there. For example, language achieves these conditions by drawing on uninterpretable features specifically available in the human lexicon; as we will see, music could be enforcing similar deleting operations with the ‘‘unstable’’ feature of notes that occur in musical progression. This will make the implementation of displacement specific to the cognitive system in action; but the phenomenon of displacement need not be viewed as specific to any of them. In any case, the issue seems to be essentially empirical in character; we just need to know more.
6 Language and Music
It seems that whenever there is an urge to talk about languagelike systems, people typically mention music, arithmetic, and logic, among others. After characterizing "hominization" as a process that includes the acquisition of language, music, mathematics, and logic, Derek Bickerton (2000, 161–162) thinks that it would be bizarre to suppose that "each of these capacities had a separate and independent birth."1 According to Bickerton, the supposition would be bizarre because these "traits" are essentially unique to humans, yet it is "entirely beyond belief" that a variety of "unconnected capacities of this magnitude" could have emerged in the short period of time that has elapsed since the hominid line split from the rest of the primates. It is more likely, then, that these capacities have a common origin. Call the suggested list of capacities the "hominid set." Since much of Bickerton's paper is concerned with computational or syntactic aspects of language, I will assume that the preceding concerns about common origin have the same thrust: the computational principles, or some (abstract) version of them, that underlie human linguistic competence could be implicated in the domains of music, mathematics, and logic as well. The list of capacities just mentioned is intuitive and obviously incomplete; if the hominid set is to denote a natural category, what falls under it cannot be stipulated. Extensive theoretical and empirical inquiry is needed to determine which cognitive systems in fact satisfy the early, aprioristic formulation.
6.1 Musilanguage Hypothesis
I will focus on the cognitive system of music to see if something like a "musilanguage hypothesis" (MLH)—the hypothesis that music and language share underlying syntactic properties—makes sense. The choice of music, as against,
say, arithmetic, is interesting, since arithmetic is generally taken to be "derived" from language anyway. Informally, people often speak of music as a language too, sometimes even as a universal language of mankind.2 Such superficial impressions aside, language and music seem to be very different cognitive systems marked by domain-specific properties—for example, language stores lexical information, music stores tonal information. We will note many other differences as we proceed (section 7.2). In that sense, the proposed generalization that cognitive systems other than language might share syntactic properties with language will be significant if it extends to the case of music. Furthermore, MLH might begin to address the mystery raised by Charles Darwin (1871, 878): "As neither the enjoyment nor the capacity of producing musical notes are faculties of the least direct use to man in reference to his ordinary habits of life, they must be ranked amongst the most mysterious with which he is endowed." MLH suggests that language and music "could have arisen due to the occurrence of an ancestral stage that was neither linguistic nor musical but that embodied the shared features of modern-day music and language" (Brown 2000, 277).

I am not proposing any worked-out theory of music cognition in what follows. The stage has not yet been reached at which MLH can be developed into anything like an empirically significant theory. My hope is that, once the proposal is seen to be empirically plausible and relatively immune to conceptual objections, theory building in this obscure but intrinsically interesting area—by people more directly engaged in linguistics and music theory—might take a definite course, in contrast to the rather partial and disjoint efforts currently in view, as we will presently see. In fact, to anticipate, it might be possible to entertain an even stronger hypothesis on which the computational principles already discovered in the language case may be used to explain aspects of musical competence. If that makes preliminary sense, then we may be spared the trouble of discovering at least the basic form of some of the fundamental principles of musical computation independently and (entirely) afresh.
6.1.1 Evidence
No conclusive evidence supports any version of MLH. Nonetheless, over the years, some evidence has accumulated from a variety of largely independent directions. Even if individual pieces of evidence do not directly support MLH, the cumulative body of evidence does seem to encourage further theoretical inquiry into MLH. The most interesting pieces of evidence may be listed as follows.
Prima facie, it seems that music, like human language, is a completely universal, species-specific capacity. Infants a few months old can distinguish between clause boundaries of languages (Karmiloff-Smith 1992, 37) as well as between consonant and dissonant notes in a melody (Schellenberg and Trehub 1996; Trehub 2003; Hauser and McDermott 2003). Furthermore, there is striking convergence between language and music in stages of maturation, including critical periods (Fitch, Hauser, and Chomsky 2005 and references). Like language, every culture develops at least vocal music, even in adverse environmental conditions, independently of race, gender, cultural achievement, and the like. The Sami people of Lapland developed extremely complex vocal music, although they failed to develop instrumental music because of the scarcity of material in such latitudes (Krumhansl et al. 2000).

Archaeologists have discovered flutes made from animal bones by Neanderthals living in Eastern Europe (Kunej and Turk 2000). Although the interpretation of the discovery is controversial (Fitch 2006), it is instructive to look at some of its implications for the issue at hand (see also Mithen 2005). The discovery consists of a perforated thigh bone of a young cave bear; the bone was buried at layer 8 of the Divje babe I cave site in Slovenia. The authors report that, previous to this find, the earliest known bone flute had been dated by the radiocarbon method at 36,000 years and assigned to the Upper Paleolithic. Layer 8 has been radiocarbon dated to an interval from 43,000 to 45,000 years, and is assigned to the Middle Paleolithic. The authors claim that the holes in the bone flute could not have been made by an animal: strong evidence is given that the holes were made by humans, with stone tools designed for that purpose.

The authors note, not surprisingly, that the spread of "technology" in those days took a long time, perhaps tens of thousands of years. They also note that a more common method of making flutes could have been the use of hollowed bark, which is much easier to handle. Being less durable, any wooden flutes would have been lost by now. The use of bones worked with special tools could have been a later method (Fitch 2006), perhaps adopted to obtain a sharper, more durable tone; we can only guess. Further, it is a plausible conjecture that, given the trouble needed to fashion appropriate instruments, instrumental music could have emerged much later than vocal music. This would place the emergence of music even further back in time. Estimates of the emergence of language vary widely, some tracing it to as far back as 100,000 years. However, according to one respectable estimate (Holden 1998), the faculty of language emerged about 40,000 years
ago. It is most interesting that the sudden increase in human brain size has also recently been traced to about 100,000 years ago (Striedter 2006, cited in Chomsky 2007, 19).

Next, outside of language, music is the only (other) cognitive system for which a fairly detailed generative theory has been proposed on the model of research in generative grammar: the generative theory of tonal music, GTTM (Lerdahl and Jackendoff 1983). Although the theory is focused on Western tonal music, the structures it isolates suggest an underlying basis to musical experience that looks invariant across a vast range of music in that genre. Some parts of the theory were subsequently verified in terms of actual audience response (Jackendoff 1992; Krumhansl 1995). Subsequent theoretical and experimental work has now yielded a theory some of whose predictions can be quantified (Lerdahl 1996, 2001). Notice that I am using the emergence of GTTM itself as possible evidence for MLH, without supposing that, as a theory, GTTM illustrates MLH. The idea is that we get some glimpse of the general properties of an object from the sort of theoretical moves it responds to (see section 1.1). In this case, the object appears to be formal in character.

Further, experimental research on various aspects of music cognition, across a wide spectrum of musical traditions and cultures, is beginning to provide evidence for the "parametric" nature of musical organization within a narrow range. Researchers have identified "a core set of psychological principles underlying melody formation whose relative weights appear to differ across musical styles" (Krumhansl et al. 2000, 13–14). For example, in the paper just cited, it was reported that in studies on melodic expectancy and tonal hierarchies, considerable agreement was found between listeners from within the music's cultural context and listeners from outside it. Thus, "the inexperienced listeners were able to adapt quite rapidly to different musical systems" (p. 14).

Finally, there is some evidence that "tonal syntax is closely analogous to the part of language we call grammar," as Carol Krumhansl interprets a recent study on Broca's area of the brain (Maess et al. 2001; also Patel 2003). In an attempt "to localize the neural substrates that process music-syntactic incongruities," Maess and his colleagues studied brain processes "elicited by harmonically inappropriate chords occurring within a major-minor tonal context." They found that such chords elicited an early effect "localized in Broca's area and its right-hemisphere homologue, areas involved in syntactic analysis during auditory language comprehension."
This suggests "that these areas are also responsible for an analysis of incoming harmonic sequences, indicating that these regions process syntactic information that is less language-specific than previously believed."

Turning to some other properties of the language system, it is plausible to assume that musical systems satisfy some of the well-known general properties of language, such as unboundedness and weak external control (Chomsky 2002). Apparently, every musical system consists of a small set of notes with a universal core. Informally speaking, these notes are compiled over and over again to generate progressively complex objects such as chords, phrases, passages, movements, and so on. The generation of complex objects is unbounded and countable (Brown 2000, 273; Fitch 2006). I return to the topic for extensive discussion. The entire system is totally "internal" in the sense that there seems to be little external control on the form and development of the relevant cognitive structures. As with language, music leads to music; that is, the primary tonal data that triggers off the musical system is a product of the musical system itself. As far as I can tell, children do not develop musical competence by listening to birds, even if some birds are viewed as musical. As noted, the system is at once universal and "parametric" in a general way. As Chomsky notes, these properties are pretty rare and surprising among cognitive systems, including most human systems. So, if they are available (only) for language, music, and a small class of other systems, some unifying explanation is called for.

Additionally, introspective evidence, for what it is worth, suggests that, unlike vision (to which I return), simultaneous access to the musical and linguistic systems is at least difficult.3 At one extreme, it is nearly impossible both to sing and to listen to someone at the same time. However, this piece of evidence alone is not persuasive, since the difficulty could arise not from conflict in the computational system but from conflict in the same (auditory) channel of information. Turning, then, to two different channels: it is also extremely difficult to sing and read something with equal efficiency, unless it is the score we are singing from. It continues to be difficult to listen to music while reading something; we are unlikely to bring a book to a concert. It is also quite difficult to listen to music while thinking (hard) about something else. Introspective evidence also suggests that listening to music does not affect purely "visual reading," such as attending to shapes of letters, spacing, and so on, as in proofreading. Problems begin when we attend to syntactic properties such as agreement and clause boundaries; the music is simply "switched off" at that point. Needless to say, the difficulties compound with increases in the complexity
of the piece of music and of the object of reading or thinking at issue. This last point suggests that, other things being equal, the systems compete for the same computational resources.4
6.1.2 What the Evidence Means
None of the pieces of evidence just cited directly supports MLH, although their salience with respect to MLH varies. The fact that both linguistic and musical syntactic abilities begin to show up in early infancy is not decisive, since many other abilities, such as the ability to recognize faces, to detect and express emotions, and so on, may also be showing up at about the same stage. It would be difficult to hold that the ability to recognize faces influenced the emergence of language. Similarly, the sudden increase in brain size might have led to the emergence of a variety of other abilities—for example, advanced use of the digits for toolmaking. On the basis of current evidence, there seems to be little connection between toolmaking and the ability to hum.

Although the emergence of a generative theory of music (GTTM) soon after that of language is interesting, it proves little by itself. As Lerdahl and Jackendoff (1983) explicitly observe, their work shows little similarity between language and music with respect to core linguistic systems such as phonology, syntax, and semantics; parallels, if any, are to be found in areas such as rhythmic and prosodic structures, which are generally viewed as not restricted to language, or even to humans in any case (Ramus et al. 2000; Fitch 2006). Lerdahl and Jackendoff suggest that the parallels with linguistic theory are to be found more in the methodology and style of inquiry; and, according to Jackendoff (1992), the "style" extends to domains such as vision and social cognition, which fall beyond the scope of MLH. It follows that, if MLH is to hold, GTTM is not likely to satisfy it.

In my opinion, the most compelling evidence is the introspective evidence, which seems to filter out all cognitive systems except the hominid set under consideration. But then, as emphasized, the evidence is just introspective, and people's judgments are likely to vary. We must learn more from controlled experimentation on musical ability across various grades of autism, in children with specific language impairment, and so on, to find out exactly which musical capacities, if any, remain unimpaired along with varieties of linguistic impairment, and vice versa. In any case, it is hard to see what exactly to look for in the suggested cases unless we already have something like a theoretical framework in hand.
The point applies, perhaps more clearly, to neural evidence (Mukherji 1990). For example, the neural evidence regarding Broca's area cited above could just mean that we have been wrong in identifying the resources of this area too narrowly. Also, following Patel 2003, it is unclear to me whether the neural evidence regarding Broca's area directly explains musical competence or whether it brings out certain general patterns of acoustic processing shared by language and music. Again, it is hard to see how to distinguish definitively between these alternative interpretations of the neural data without the guidance of a theory. Finally, the suggestion that music, like language, is unbounded and only weakly subject to external control is more a proposal than a statement of fact; it depends crucially on what we mean by "unboundedness" and "weak control," and on which properties are in fact jointly satisfied. I will thus spend considerable time getting clear about the properties of unboundedness and weak control.

MLH does look more promising with respect to the cumulative body of evidence cited above. When we consider the full basket of evidence, individual pieces might be viewed as reinforcing each other. Thus, the neural evidence that Broca's area could be processing both musical and linguistic syntactic information supports the introspective evidence concerning difficulties of simultaneous access. These two pieces of evidence together support the evidence of a surprising match of critical periods in early infancy, within the narrow domain of phrase boundaries, for both music and language. The growing body of evidence then aligns with the evolutionary evidence regarding the almost simultaneous emergence of the two systems, perhaps due to the increase in brain size. Following this direction, to me the most promising aspect of the noted evidence for music is that, except for arithmetic, logic, and the like, I do not know of any other human nonlinguistic cognitive system, not to speak of nonhuman systems, where so many languagelike properties cluster.

Consider the visual system. Recall that an obvious general property of language is that it is a formal, articulated system. That is, it is a system of perceptually distinguishable signs that individually and collectively express the information encoded in the representations associated with the signs. This contrasts sharply with the visual system, which is a "passive" system; it is not a system of signs at all.5 No doubt, visual representations can be described in combinatorial terms (Marr 1982; Hoffman 1998). But the system itself is not a "language"; we use signs to describe its structure (see section 7.3). Another general property of language is that it is a
system of discrete infinity, as noted. In the absence of a system of "expressions," it is unclear how to determine the magnitude of what the visual system "generates."

Most importantly, environmental conditions strongly influence the properties of perceptual systems; they only weakly influence, if at all, the properties of the language and musical systems. It is well known that the sensory systems of organisms, including the visual system, degenerate when the environmental conditions that enforced those systems are no longer present. In the other direction, these systems adjust to changed environmental conditions to develop or amplify alternative sensory properties. The blind mole rat (Spalax ehrenbergi) illustrates the point. As the species moved underground millions of years ago, its eyes atrophied and became subcutaneous (David-Gray et al. 1998). It was naturally assumed that its visual system had become completely dysfunctional, but recent studies have shown that only those parts of the brain that support image formation have atrophied. The eye and the other parts of the brain have continued to develop an auditory system suited for the perception of vibratory stimuli (Bronchti et al. 2002).6 This phenomenon just does not apply to the language system, since the only "external" condition it has to meet is the linguistic "environment," not the physical properties of the world: the linguistic "environment," the source of primary linguistic data, is a product of the language system itself.

Furthermore, the familiar argument from the poverty of the stimulus suggests that the initial human language system, the faculty of language, ought to be simple and uniform across the species; this must also be the case with the visual system. But not only is the visual system uniform across the species, like the language system; the states it can attain, unlike those of the language system, are largely uniform as well, pathology aside. The states that the language system can attain vary wildly, as thousands of human languages and dialects testify. Within the species, the language system is thus parametric; the visual system is not. Moreover, given the "master-eye" hypothesis, the human visual system may not be restricted to the species at all; the language system, in contrast, is largely unique to the species in major respects. It is not surprising, therefore, that there is no common core in the specific operating principles of the two systems: the language system does not have anything like the rigidity principle of the visual system, and the visual system does not seem to require anything like internal Merge.

Possibly, the remark extends to the principles of acquisition of these systems. Jenny Saffran (2002) makes the interesting observation that strategies of statistical learning of predictive dependencies—for example, the ability to predict phrase boundaries from incoming streams—may extend beyond natural languages to include "artificial languages" and music. These strategies do not seem to apply to the visual modality when the stream of input is presented simultaneously—not serially—which is typically the case with vision.
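The strategy Saffran describes can be sketched in a few lines: in a serial stream, boundaries are guessed where the transitional probability between adjacent elements dips. The syllable stream below is an invented toy in the spirit of such experiments, not actual experimental stimuli.

    from collections import Counter

    def transitional_probabilities(stream):
        """P(next | current) for each adjacent pair in a serial stream."""
        pairs = Counter(zip(stream, stream[1:]))
        firsts = Counter(stream[:-1])
        return {pair: n / firsts[pair[0]] for pair, n in pairs.items()}

    # The "words" bi-da-ku and pa-do-ti repeat: transitions inside a word
    # are predictable, transitions across a word boundary are not.
    stream = "bi da ku pa do ti bi da ku bi da ku pa do ti".split()
    for pair, p in sorted(transitional_probabilities(stream).items(),
                          key=lambda kv: kv[1]):
        print(pair, round(p, 2))  # low-probability pairs mark likely boundaries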
Finally, when we are discussing whether two cognitive systems differ in their computational principles, it is natural to ask if there is any conflict in their operations. Introspective evidence seems to suggest that, unlike in the music case, there is no obvious conflict in the simultaneous operation of the linguistic and visual systems, enabling us to report on what we see. We can change visual fields, and zoom in and out of them, while continuing to report on all these changes with near-perfect efficiency. Heart surgeons and sports commentators are able to give running commentaries on the intricate, rapidly changing scenarios in front of them, including their own actions. In fact, on this line of reasoning, our ability to talk about what we see would seem to require that the systems of visual and linguistic computation be separate.7

Other things being equal, and pending more controlled experimentation, it is most likely that the linguistic and visual systems access different computational systems; it is hard to find a dimension in which to place the visual system on a par with the linguistic system. The music system, in contrast, seems to satisfy all those "languagelike" conditions that the visual system fails to satisfy.
6.2 Strong Musilanguage Hypothesis
The net result seems to be that the cumulative body of evidence demands a theory along MLH lines, so that we are able to furnish a unified account of otherwise disjoint and individually inadequate pieces of evidence. What are the prospects of giving theoretical shape to MLH? The simplest answer would show that the same syntactic system underlies the capacity to generate unbounded sequences in both music and language. Ideally, the structuring principles already discovered in the language case would constitute such a system.

In what follows, I will concentrate on the syntactic framework proposed by Noam Chomsky in the Minimalist Program, without ruling out that other frameworks—for example, that of Richard Kayne (1994)—may be relevant. For now, it seems to me that Chomsky's framework has direct implications for music. In that sense, the ability to cover cognitive
systems other than language may well be a criterion for choosing among various syntactic frameworks. As a first step in that direction, we will require that the principles of linguistic organization not be linguistically specific. Under the current minimalist conception (Chomsky 1995b), the core linguistic system consists of two things: a recursive operation, Merge, and some principles of computational efficiency (PCE). We saw in some detail that the abstract principles and the operation that constitute the computational system of human language, CHL, are not linguistically specific. Could these principles be involved in musical organization as well? If the answer is positive, then CHL is the sole computational system of music and language. This will count as the strongest version of the musilanguage hypothesis (SMH).

SMH continues to be (just) a hypothesis; an empirically significant theory ensuing from SMH is nowhere in sight. Detailed empirical and theoretical research is needed to show that the principles of CHL, or some (abstract) version of them, in fact explain properties of musical organization.8 I have been trying to promote MLH, and now SMH, as an attempt to make a variety of insights, mysteries, and individual pieces of evidence converge. Nevertheless, a number of conceptual or foundational issues need to be addressed before SMH is allowed to get off the ground. For the rest of this chapter, I will be concerned with what I consider to be the conceptual issues of immediate interest. For example, I am aware that broad philosophical or musicological objections may be raised against the very idea of a computational theory of music (Scruton 1997), just as they were raised against linguistic theory in the past. As with linguistic theory, such objections can only be addressed by simply pursuing an otherwise plausible theoretical framework. So these cannot be of immediate interest.9

Two general issues do seem to require pressing attention before SMH is seriously entertained: (i) Is music a system of sound-meaning correlations at all? (ii) Is music recursive in the sense in which language is?
6.2.1 Music and Meaning
Not everyone is convinced that music is a symbol system in the right sense. One may even doubt whether music is a system of symbols at all—that is, music may be viewed as nothing but a system of sounds. The underlying idea is that a sound is a symbol if it has a meaning, and the relation between sound and meaning is largely arbitrary. For example, Bertrand Russell (1919) held that a symbol "stands for" something else. The traditional way in which a linguistic symbol stands for something is for the
sound of the word to be associated with a concept or an object—typically, both. Since we cannot associate a musical sound with either a concept or something in the world (Fitch 2006; Boghossian 2007), musical sounds are not symbols.

Notice that the objection makes it virtually impossible to inquire whether a certain noise or mark on paper is a symbol unless it is very much like a linguistic mark; that is, we cannot meaningfully ask if there are symbol systems other than language (Raffman 1993, 40–41). More significantly, on the Russellian count, it is difficult to assign any theoretical salience to the notion of a linguistic symbol. From what we can judge now, word-concept and word-object relations may fail to be theoretically salient; as we saw at length in chapters 3 and 4, we can even doubt whether these relations obtain for natural languages. Suppose we do so; that does not prevent us from entertaining some concept of meaning within grammatical theory itself.

To recall, when a set of lexical items enters the grammatical system, computation begins. If the computation does not crash, a relationship is established between the phonetic form (PF) and the logical form (LF) such that the pair ⟨PF, LF⟩ captures the traditional idea of language as a system of sound-meaning correlations (Chomsky 1995b). In that sense, an LF is an organization of symbols, but the semantic content of LF does not include either denotational or conceptual information. At LF, all phonetic features have been stripped away and only semantic and formal features remain; but the semantic features are not interpreted at LF, since LF is the output of grammar;10 only formal features, such as person, number, and gender, play a computational role. So the LF-structure should be viewed as consisting of only these purely structural items. This must be the case whatever the output of narrow syntax is—LF, SEM, or just an interpretable phase. As proposed above, LF-information is best viewed as captured in the output of FLI systems, rather than in the output of the computational system. Even there, we do not know, say, what the gender feature "female" means; we just know that the feature has to agree. In effect, grammatical theory provides compelling evidence for postulating some (as yet unclear) notion of internal significance of a sequence of symbols (McGilvray 2005). I do not see why this restricted notion of meaning/significance cannot apply to music.

From this perspective, consider again the LF-representation of the sentence every boy danced with a girl. As we saw, the sentence is two-ways ambiguous, and the ambiguity can be represented at LF as follows:

(121) (a) Representation: [IP [every boy]_i [e_i danced with a girl]]
          Interpretation: For every boy x, x danced with a girl
      (b) Representation: [IP [a girl]_j [IP [every boy]_i [e_i danced with e_j]]]
          Interpretation: A girl y is such that for every boy x, x danced with y

Since representations (121a) and (121b) carry lexical information of quantifiers, nouns, verbs, and so forth, they are linguistic expressions par excellence; in particular, scope distinctions are forced by linguistically specific properties of the expression, as we saw. I am not suggesting that musical representations look like these.11 Nevertheless, I wish to draw attention to some general features of this example which, in my opinion, are available beyond language. First, (121a) and (121b) are structurally distinct in that the relative positions of the symbolic objects in them differ. Second, these structural differences are directly related to how a representation is to be interpreted. Third, the interpretations make no reference to what the world is like, to the beliefs of the people interpreting them, to the vagaries of the associated culture, and the like. In fact, in order to differ, the interpretations do not require that there be an "external" world at all.
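The point can be made concrete with a toy computation over the two representations. The mini "model" below is an arbitrary stipulation internal to the computation; nothing in it requires that there be a world being described. It serves only to show that the two structures, by themselves, fix two different interpretations.

    boys, girls = {"b1", "b2"}, {"g1", "g2"}
    danced_with = {("b1", "g1"), ("b2", "g2")}   # each boy with a different girl

    # (121a): for every boy x, some girl y is such that x danced with y
    reading_a = all(any((x, y) in danced_with for y in girls) for x in boys)

    # (121b): some girl y is such that every boy x danced with y
    reading_b = any(all((x, y) in danced_with for x in boys) for y in girls)

    print(reading_a, reading_b)  # True False: the readings come apart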
(121) (a) Representation: [IP [every boy]i [ei danced with a girl]]
          Interpretation: For every boy x, x danced with a girl
      (b) Representation: [IP [a girl]j [IP [every boy]i [ei danced with ej]]]
          Interpretation: A girl y is such that for every boy x, x danced with y

Since representations (121a) and (121b) carry lexical information of quantifiers, nouns, verbs, and so forth, they are linguistic expressions par excellence; in particular, scope distinctions are forced by linguistically specific properties of the expression, as we saw. I am not suggesting that musical representations look like these.11 Nevertheless, I wish to draw attention to some general features of this example which, in my opinion, are available beyond language. First, (121a) and (121b) are structurally distinct in that the relative positions of the symbolic objects in them differ. Second, these structural differences are directly related to how a representation is to be interpreted. Third, the interpretations do not make any reference to what the world is like, the beliefs of people interpreting them, the vagaries of the associated culture, and the like. In fact, in order to differ, the interpretations do not require that there be an "external" world at all.
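To see how the two structurally distinct representations in (121) enforce different interpretations, here is a minimal sketch in Python. It is an illustration of the scope difference only, not a piece of the grammatical theory at issue; the toy domain and the relation danced_with are arbitrary stand-ins introduced solely to make the two readings come apart.

```python
# Toy domain, purely illustrative; any assignment of the relation will do.
boys = {"b1", "b2"}
girls = {"g1", "g2"}
danced_with = {("b1", "g1"), ("b2", "g2")}  # who danced with whom

# Reading (121a): for every boy x, there is a girl y such that x danced with y.
surface_scope = all(any((x, y) in danced_with for y in girls) for x in boys)

# Reading (121b): there is a girl y such that every boy x danced with y.
inverse_scope = any(all((x, y) in danced_with for x in boys) for y in girls)

print(surface_scope, inverse_scope)  # True False: the two readings diverge
```

The only difference between the two computations is the relative order of the two quantifier loops, mirroring the relative positions of the quantified phrases in (121a) and (121b).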
Turning to music, Diana Raffman observes that musicians are typically concerned with structural issues, such as whether a given phrase ends at a certain E-natural because the note prepares a modulation to the dominant (Raffman 1993, 59). Three possibilities arise: the phrase ends before the E-natural, the phrase ends at the E-natural, and the phrase extends beyond the E-natural. As anyone familiar with music knows, these structural variations make substantial differences in the interpretation of music. Depending on the group of notes at issue, and the location of the group in a passage, some of the structural decisions may even lead to bad music. This is because these decisions often make a difference as to how a given sequence of notes is to be resolved. Any moderately experienced listener of music can tell the differences phenomenologically, though their explicit explanation requires technical knowledge of music (such as modulation to the dominant). Lerdahl and Jackendoff's work (1983), especially Jackendoff and Lerdahl 2006, shows how different groupings impose different hierarchies on musical surfaces such that each hierarchical organization gets linked to a specific interpretation of the surface. For example, if phrase boundary is marked with pauses or interludes, grouping of the same sequence of notes—with pitch and meter fixed but the location of the pauses varying—creates very different musical surfaces (Jackendoff and Lerdahl 2006, 2.1).

The existence of delineable grouping structures explains why composers and performers spend much time on "marking" a score to show how exactly they wish a sequence of notes to be grouped. The practice is explicit in musical traditions that use scores. But it can be observed in any tradition, for example, by attending its training sessions. Training means attention to the pitch of individual notes and to how notes are to be organized. When the music becomes complex, and it begins to tax memory and attention, various devices are used to highlight the salient properties of tonal organization. These include emphasis, typically by suitable ornamentation; organization of music in delineable cycles such as rondo; display of the unity of larger sections by cadences; exploiting the cyclic features of the accompanying beat; and so on. The description is obviously very incomplete, but it is pretty clear that, in some sense, there is nothing else to music. Interpretations in music are sensitive solely to the formal properties of representations. That does not make musical sounds any less symbolic.
6.2.2 Themes from Wittgenstein
We just saw that grammatical theory postulates the notion of "internal significance" of a sequence of symbols; the postulation perhaps extends to musical symbolism. What is internal significance? The topic is large and currently pretty obscure though, as noted, it has to be faced eventually. For now, I will only make some brief remarks to indicate the issues involved. In my opinion, a fruitful way of doing so is to examine some puzzling remarks of Ludwig Wittgenstein, since Wittgenstein's observations on language led him directly into drawing parallels with the significance of musical expressions. Even there, I will avoid (the dangerous ground of) exegesis of Wittgenstein, and attend only to some of his assorted remarks. It is interesting that Wittgenstein's interest in a joint study of language and music spanned his entire philosophical career, despite radical changes in his philosophical position. In his earliest work, Tractatus Logico-Philosophicus (1922), for instance, he held a "picture theory" of meaning in which the internal organization of linguistic symbols was viewed as establishing relations between language and the world: "A proposition is a picture of reality: for if I understand a proposition, I know the situation that it represents" (1922, 4.021). Strangely, he extended the idea to music: "A gramophone record, the musical idea, the written notes, and the
sound-waves, all stand to one another in the same internal relation of depicting that holds between language and the world" (1922, 4.014). Although he subsequently abandoned the picture theory, the joint study continued. The proposal appears repeatedly in his Blue and Brown Books (1931/1958)—thought to be the notes for his last work, Philosophical Investigations (1953)—and in other places. In the Investigations, he asserts: "Understanding a sentence lies nearer than one thinks to what is ordinarily called understanding a musical theme" (1953, paragraph 527). This engagement throughout his philosophical career suggests that Wittgenstein might have had something deeper in mind than merely using music as a handy analogy for his remarks on language. Keeping to his later works, it seems that the suggested parallel between music and language involved two broad but related steps: rejection of the "emotive" theory of music, and emphasis on the internal significance of music.

6.2.2.1 Music and Emotions
Wittgenstein rejected what is generally held to be the most salient aspect of music: music "represents emotions in a way that can be recognized by listeners" (Dowling and Harwood, cited in Raffman 1993, 42). Raffman proceeds to cite Roger Scruton: it is "one of the given facts of musical culture" that the hearing of music is "the occasion for sympathy." For Scruton, if someone finds the last movement of the Jupiter Symphony "morose and life-negating," he would be "wrong" (see also Boghossian 2007). Thus the literature on the emotional significance of music includes items such as being merry, joyous, sad, pathetic, spiritual, lofty, dignified, dreamy, tender, dramatic; it also includes reference to feelings of utter hopelessness, foreboding, a sea of anxiety, a terrified gesture, and the like. Jackendoff and Lerdahl (2006) suggest the following list: gentle, forceful, awkward, abrupt, static, earnest, opening up, shutting down, mysterious, sinister, forthright, noble, reverent, transcendent, tender, ecstatic, sentimental, longing, striving, resolute, depressive, playful, witty, ironic, tense, unsettled, heroic, or wild.12 Following Jackendoff 1992 and Jackendoff and Lerdahl 2006, I will call these things collectively "musical affect." Wittgenstein dismissed the whole thing: "It has sometimes been said that what music conveys to us are feelings of joyfulness, melancholy, triumph etc., etc. and what repels us in this account is that it seems to say that music is an instrument for producing in us sequences of feelings." Elaborating on why he is "repelled," he says that it is a "strange illusion"
that possesses us when "we say 'This tune says something,' and it is as though I have to find what it says" (1958, 178). In the absence of supporting arguments and evidence from Wittgenstein, we may interpret these remarks as follows. Let us ask why emotions are seen to be so strongly associated with the expressive content of music. Music seems to pose the dilemma that it both conveys and does not convey "thoughts"; appeal to emotions seems to solve the dilemma. To appreciate the dilemma: for anyone even marginally engaged with music, it seems that "music is not understood in a vacuum, as a pure structure of sounds fallen from the stars, one which we receive via some pure faculty of musical perception" (Levinson 2003). In that sense, a tune seems to "say" something. In the paper cited, Jerrold Levinson supports the thoughtful nature of music with a range of musical examples.13 Thus, we can think of music as drawing a conclusion (Beethoven's Piano Sonata, op. 110; Dvorak's Seventh Symphony), coming to a close (the minuet movements from Mozart's 40th or 41st), asserting (the opening of Schubert's Piano Trio No. 2, op. 100), questioning (the opening phrases of Beethoven's Piano Sonata No. 18, op. 31, no. 3), imploring (the flute introduction to Bellini's aria Casta Diva), defying (the opening of Beethoven's Fifth Symphony), disapproving (the orchestral interjections in the first part of the finale of Beethoven's Ninth Symphony), and so on. Levinson asks: "Can a medium capable of summoning up such a range of mindful actions be a domain in which thought is absent?" However, the most obvious candidate for what-is-said, namely, a proposition with truth value, simply does not apply to music. In other words, music does not seem to correlate with extramusical objects which either do or do not satisfy a given musical expression: "I know that [a tune] doesn't say anything such that I might express in words or pictures what it says" (Wittgenstein 1931/1958, 166). What then do musical thoughts express or signify? The widespread intuition that music is intimately associated with affects helps address the problem. Given the "strange illusion" that the sayability of music needs to be captured in some or other extramusical terms, people find it natural to assume that music expresses affects; perhaps one could even say that (specific) affects are what given pieces of music mean.14 In fact, a closer look at Levinson's defense of Wittgenstein's claim that music is a thoughtful activity does not preclude that these thoughts basically produce affects. For example, apart from the instances of musical "thought" cited above, Levinson also includes the
following: angrily despairing, menacing, cajoling, comforting, bemoaning, heaven-storming, and so on. It is hard to make sense of these things without some notion of affect. These and the earlier examples suggest that Levinson is more interested in the "pragmatic" aspects of musical thought than in its "semantic" aspects; that is, he is more interested in what music does (to people) than in what music is. It is also instructive that, despite his frequent citations from the Blue and Brown Books, Levinson fails to discuss the crucial remark at (178) that recommends dissociating music from its affects, as we saw.

Notwithstanding its intuitive appeal, there is also a growing body of literature that questions the idea that musical understanding can be sufficiently explained in terms of musical affects, although the "association" between them is never denied. For brevity, I will mention just two related objections to the idea (for more, see Mukherji 2000, chapter 4; Vijaykrishnan 2007, chapter 7, for objections to Mukherji). It seems that the ascription of specific affects to given pieces of music is not stable: the first movement of Mozart's G minor symphony, now considered tragic, was viewed by nineteenth-century critics as cheerful. Further, two very different pieces of music from the same or different genres can both be viewed as cheerful: Mozart's Eine kleine Nachtmusik and the medium-tempo parts of performances of, say, raagas Jayjawanti or Bilaval. If the "cheerfulness" is traced to the tempi, then the tonal structure of music becomes irrelevant, apart from the implausible consequence that all music in a certain tempo would have to be viewed as cheerful. Keeping to the tonal structure, suppose we think of the Beatles' song Michelle as generally sad. Ignoring the (palpably sad) words, could the sadness be traced to the fact that much of the melody in Michelle "moves in a relatively small range in the mid-to-low vocal range, with a generally descending contour," as Jackendoff and Lerdahl (2006, 63) seem to suggest? Unless sadness is associated with this range by definition, I can easily cite sad music that "wails" as the notes ascend in the mid-to-high range. In contrast, (the early part of) the slow movement "in a relatively small range in the mid-to-low vocal range" of the so-called morning raaga aahir bhairav is meant to evoke not sadness, but spirituality tinged with joy as the "dawn breaks."

This leads to the problem—hardly mentioned in the otherwise massive literature on the topic, to my knowledge—that, even if we are able to ascribe specific affects to given pieces of music with some degree of agreement between people hearing it, affects can only be global properties of (large) chunks of music; typically, affects are assigned to an entire piece of music,
as we saw. But, informally speaking, music is a complex organization of tones which we hear on a tone-by-tone basis, forming larger and larger groups as we proceed. It is hard to see how the global (affective) property of the piece is computationally reached from its smaller parts. To me, the problem seems pretty overwhelming. If a cheerful piece of music is cheerful in each of its parts—an implausible assumption in any case—then the compositionality of the global property of cheerfulness is trivial; that is, the music as a whole is not saying anything different from its parts. If a (globally) cheerful piece of music is not cheerful in its parts, then cheerfulness is not a compositional property of the piece of music, allowing Eine kleine Nachtmusik and Jayjawanti to have identical global properties, as noted; affects just do not compute in the desired sense. In sum, the notion of musical affect makes the deeply complex and significant internal structure of music essentially irrelevant for the understanding of music. It is interesting that Jackendoff and Lerdahl (2006, 61) note, almost in passing, that "musical meaning" is "the affects that the listener associates with the piece by virtue of understanding it." So, the "association" of affect with the piece follows its understanding; the associated affect, therefore, does not explain what it is to understand a piece of music. The dilemma, posed above, persists.

None of this is meant to deny that musicians and their audiences are often affected by music—as reflected, in part, in their complex facial responses. The point is that this fact need not be traced to the internal structure of music itself. Jackendoff (1992, chapter 7) shows that it is possible to explain why we want to hear the same music from the properties of musical processing alone, not because the piece invokes—although it may—pictures of reality, desires, and the like, that we wish to revisit. From a different direction, Massimo Piattelli-Palmarini (personal communication) suggests that, even if opera shows that certain emotions (habitually) go with certain kinds of melodies, it could be that representations are not involved. If these suggestions hold, then it should be possible to explain the undoubtedly intimate relationship between music and human affects without tracing the relationship to the internal structure of music. As songs attest, outputs of both language and music certainly access affects. Perhaps music accesses affects more directly and definitively because music has nothing else to access.

6.2.2.2 Internal Significance
Given the salience of the notion of musical thought, and the total absence of anything extramusical to correlate these thoughts with, Wittgenstein (1931/1958, 166) proposed perhaps the only
option left: "Given that what a tune 'says' cannot be said in words, this would mean no more than saying 'It expresses itself.' To bring out the sense of a melody then 'is to whistle it in a particular way.'" As noted, he extends the claim to language as well—to the understanding of a sentence, for example. He suggests that what we call "understanding a sentence" has, in many cases, a much greater similarity to understanding a musical theme "than we might be inclined to think." The point is that we already know that understanding a musical theme cannot involve the invoking of "pictures." Now the suggested similarity between music and language is meant to promote a similar view of language as well, that is, that no "pictures" are made even in understanding a sentence. "Understanding a sentence," he says, "means getting hold of its content; and the content of the sentence is in the sentence" (p. 167).

The remarks quoted from Wittgenstein comprise three ideas: that music expresses itself, that what the music expresses can only be shown by whistling it in a particular way, and that understanding sentences of a language is much like understanding pieces of music. Is there a way of giving a coherent shape to these apparently disjoint claims? To emphasize, we are aiming for a scheme in which each of these ideas plays a significant role. The idea that music expresses itself is the natural place to start since it signals the point of departure from "externalist" understandings of music. Roger Scruton (2004) supports Wittgenstein in questioning the prevalent idea that musical meaning is a relation between music and something else, such as emotions. Scruton holds that, for Wittgenstein, it is either not a relation at all or, at best, an "internal relation" in the "idealist's sense" that "denies the separateness of the things it joins." In line with this thought, Scruton suggests that "the connection has to be made in the understanding of those who use the sign." However, for Scruton, the "idealist's sense" is best understood in terms of the analogy of (gestalt properties of) facial expressions and "first-person ascriptions." But then Scruton also holds that Wittgenstein's reference to language is "no more than an analogy" since linguistic understanding, unlike musical understanding, involves "semantic properties" governed by a generative grammar.15 No doubt, in some scattered remarks, Wittgenstein did toy with the analogy of facial expressions (I am setting aside the first-person issue). But he explored it—without much success, in my opinion—primarily for language, especially for word meanings, to study what may be involved in "semantic interpretation." So, by Scruton's own account, if the study of
musical understanding is like the study of facial expressions, then the similarities with language cannot be just "no more than an analogy." In any case, appeals to such analogies just shift the problem back one step: What structural conditions give rise to facial expressions? In sum, not only does Scruton fail to take into account Wittgenstein's lifelong conviction about deep similarities in linguistic and musical understanding, he gives no clue as to how the "internal relation" between musical "signs" gives rise to a particular way of whistling them.

For Levinson (2003), in contrast, music is languagelike insofar as linguistic meaning is essentially characterized in terms of the thought expressed. In fact, he suggests that the process of musical thinking can be traced to the internal structural development of music. He calls it "intrinsic musical thinking"; it resides "in the mere succession from chord to chord, motive to motive, or phrase to phrase at every point in any intelligible piece of music." Levinson illustrates the point with an almost measure-by-measure analysis of the first movement of Beethoven's Tempest Sonata (op. 31, no. 2). For example, in the first two measures, a four-note rising motif "has about it a pronounced air of uncertainty and wonder." In measures 2–6, the initial motif is followed by a descending allegro motif which "anxiously frets," ending in an adagio turn of "questioning character," and so on. The richness of Levinson's description makes it clear that he traces the "intelligibility" of "intrinsic musical thinking" essentially to the affects of the interpreter, not to the tonal and relational properties of the musical structure itself. To understand a piece of music, Levinson holds, is to learn "how to respond to it appropriately and how to connect it to and ground it in our lives." This not only raises all the problems about the relation between musical structure and affects all over again; it also fails to give a sufficiently narrow interpretation of Wittgenstein's idea that music expresses itself. In other words, Levinson places the Wittgensteinian theme of the language-likeness of music in a framework that does not make durable contact with the rest of Wittgenstein's themes.

The theme of "internal relations" has been explored more directly by Yael Kaduri (2006). Instead of describing the tonal structure of music, Kaduri focuses on the general pause—the silent figure (aposiopesis)—in Haydn's instrumental music. Working through "the immensity of Haydn's repertoire," Kaduri notes that Haydn's music "contains so many general pauses that it seems they form an intrinsic component of his musical language." The pauses seem to fall under definite style-specific categories, differing as styles differ: string quartets, symphonies,
movements in sonata form, rondo movements, and minuets. Further, "the pauses almost always appear in the same positions in each of the different forms"—for example, in the rondo finale movements, the pauses are "juxtaposed to the successive repetition of a small motif." According to Kaduri, thus, the "different ways in which Haydn employs the general pause and the logical links between them constitute a 'grammar' of the general pause, which provide its meaning." The data of (Haydn's use of) the general pause suggests how musical understanding can be explained in part from structural conditions alone. To that extent, Kaduri's adoption of the notion of "grammar" from Wittgenstein seems appropriate.

In fact, it is another puzzle in Wittgenstein's work that, although he gave up the notion of "logical form" pursued in his early work, he continued to invoke some notion of "grammar" in his later work on language. Without entering into exegesis, it stands to reason that he was searching for a notion that captured just the structural conditions met by symbols, conditions that furnished the underlying basis for interpreting them. In his earlier work, Wittgenstein could have thought that the prevalent Frege-Russell notion of logical form captured those conditions. Having found that notion to be untenable, he might have searched for an alternative in the conditions of musical understanding. As Sarah Worth (1997) insightfully remarks: "Music is often seen as being problematic because of its lack of detectable meaning, but this is precisely why Wittgenstein thinks it has an advantage. The necessarily abstract quality of music allows us to avoid being absorbed merely with the quest for referential meaning." The phenomenon of the general pause shows how the "necessarily abstract quality" works. As Kaduri observes, the pauses by themselves are not meaningful; they contribute to the meaning of a piece of music as it progresses by relating chunks of musical structure in formally specifiable ways. Kaduri does not tell us how the internal significance of the chunks that the pauses relate itself arises. But Levinson's analysis of Beethoven's Tempest Sonata clearly shows that such chunk-specific meaning does arise at least on a measure-by-measure basis, although Levinson's conception of what that meaning is would not likely have impressed Wittgenstein.

Pursuing the proposed reconstruction of Wittgenstein a bit further, it is plausible to assume that, as with anyone interested in the workings of language, Wittgenstein could have been intrigued by the fact that a bunch of symbols in association with one another is somehow capable of conveying
meaning. A natural thought is that much of the "meaning" so conveyed must ensue from the structural conditions governing the array of symbols themselves; hence the need for some notion of grammar. In the case of music, it looks as though structural meaning is all there is to the notion of musical meaning. Once that meaning is grasped by a competent performer or listener, the music may be "whistled" in a particular way—the particularity being determined by the specific structural meaning assigned to a selection of notes. In the language case, this perspective is hard to maintain in view of the dominance of "referential meaning" in linguistic understanding. Linguistic understanding thus comes in progressively thicker layers (Mukherji 2003b). Yet, as Wittgenstein saw, insofar as a linguistic expression is essentially a structure of symbols, the notion of structured meaning must obtain for language as well at some level. Following the analogy of musical expression and the whistling of it, a natural division thus obtains between language per se and its use. On this view, to refer, or to convey extralinguistic information by some other means, or to engage in a speech act, is to "whistle" the sentence in a particular way that is largely determined by the internal organization of linguistic signs. Contrary to some popular interpretations of Wittgenstein's later philosophy, then, the use of a sentence is not its meaning; the meaning of a sentence is in the sentence—already, prior to use, as it were. This could well be a general phenomenon for the natural symbol systems, the hominid set, at issue; it may not obtain for artificial and nonhuman systems. In my opinion, this structured meaning is what is represented at LF, the semantic output of language. In that sense, just as Wittgenstein turned to music to throw light on language, we turn to language to throw light on music. We cannot assume, of course, that Wittgenstein had LF in mind. But he did have logical syntax in mind when looking for an adequate notion of grammar in his early work. And the basic reason why he gave up logical syntax is that it required a "picture" theory; the theory of LF does not require such "referential meaning."
6.2.3 Recursion in Music
Suppose then that music is a symbol system. Even so, there could be reservations about whether music is a system of discrete infinity in the sense available for language. Following an influential discussion in Hauser, Chomsky, and Fitch 2002, the issue has been sharpened recently by Fitch, Hauser, and Chomsky 2005.
Both papers hold that, pending further research on a variety of cognitive domains in humans and nonhuman animals, only the human language system is endowed with what they call the "narrow faculty of language" (FLN). FLN is to be distinguished from the "broad faculty of language" (FLB), which constitutes what we may think of as the complex organization of the human linguistic system. The complex organization is certainly unique to humans, but all its parts may not be. Some of the parts of FLB are shared by a variety of nonhuman organisms; some others are shared with other human but nonlinguistic domains. According to Hauser, Chomsky, and Fitch 2002, only FLN contains what is unique to both humans and language; hence, FLN is unique to the human linguistic system. FLN is nonempty since it contains the mechanism of recursion. It follows that, music being a different domain from language (I agree), FLN cannot be involved in music by definition; that is, the parts, if any, that music shares with language can only belong to FLB. Thus, music cannot contain recursion. Since the property of (hierarchic) discrete infinity arises in language essentially by virtue of the existence of a recursive mechanism, music cannot be a system of discrete infinity in that sense. Although neither paper rules out that FLN could turn out to be empty, currently the authors seem inclined to place the convergences between language and music, if any, in FLB.

Hauser, Chomsky, and Fitch (2002, 1571) offer a fairly exhaustive description of the property of discrete infinity for language: (i) sentences are built up of discrete units: there are 6-word sentences and 7-word sentences, but no 6.5-word sentences; (ii) any candidate sentence can be "trumped" by, for example, embedding it in "Mary thinks that . . ."; and (iii) there is no nonarbitrary upper bound to sentence length.
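Properties (ii) and (iii) are easy to visualize with a toy sketch; the following lines are merely illustrative and form no part of the cited description:

```python
# Property (ii): any candidate sentence can be "trumped" by embedding it.
def trump(sentence: str) -> str:
    return "Mary thinks that " + sentence

s = "the apple is rotten"
for _ in range(3):          # the loop bound is arbitrary; nothing stops it
    s = trump(s)
print(s)
# Mary thinks that Mary thinks that Mary thinks that the apple is rotten
```

Since every application of the embedding yields a longer sentence, no sentence is the longest one, which is property (iii).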
As for music, Fitch, Hauser, and Chomsky 2005 agree that music has a phrase structure and that the phrasal structure of music shows no obvious limit on embedding. Still, the authors hold that "there are no unambiguous demonstrations of recursion in other human cognitive domains, with the only clear exceptions (mathematical formulas, computer programming) being clearly dependent upon language." In the absence of a lexicon-driven generative theory, it is difficult to cite precise (bottom-up) examples of musical recursion. But we might find some examples at intermediate levels of complexity. For example, Douglas Hofstadter (1979, 129–130) suggests that "we hear music recursively—in particular, that we maintain a mental stack of keys, and that each new modulation pushes a new key onto the stack." According to Hofstadter, the process is dramatically illustrated in Bach's Little Harmonic Labyrinth, in which the original key of G is first nested (modulated) into the key of D; the music then "jumps" back again into G but at a much higher level of complexity. Kaduri (2006) suggests a very similar description for all music in the sonata form. The sonata form has two parts. In the first part, a theme is inaugurated in the key of the movement. Then "a transitional section" leads to a second theme in a different key. The second part then modulates from one key to another while developing the theme of the first part. This part ends with the "retransition" section that leads back to the tonic.

However, these descriptions of musical recursion, if valid, seem to lean on specific aspects of Western tonal classical music such as modulation and harmonic changes. Hence, they do not directly generalize to other forms of music (Jackendoff and Lerdahl 2006). I will now give an example of endless embedding of musical phrases from Indian classical music, which does not have the cited features of Western tonal music. In general, the Indian raaga system crucially depends on the experienced listener's ability to periodically recover versions of the same chalan (progressions typical to specific raagas) and bandish (melodic themes specially designed to highlight the tonal structure of a raaga) through ever-growing phrasal complexity and at varying pitch levels. The basic recurring feature of the tonal structure of a raaga is best illustrated in a "noise-free" way in the first part of Indian classical music known as aalaap (melodic prelude); the part is central for both Hindustani (North Indian) and Carnatic (South Indian) music. This part systematically introduces the basic tonal structure of a raaga. The part is noise-free because it is free of any definite beat (taala). Hence, except for a drone marking the tonic and the fourth or the fifth above the tonic, no other accompaniment is used.16 For the same reason, it stays away from any specific melodic theme (bandish), since melodic themes are composed around specific beats (taalas). In this sense, the aalaap part represents what may be viewed as pure tonal music, and nothing else. A study of the initial phases of this part thus presents something of a test case for the recursion issue that may be linked directly to the tonal form of music itself. I have sketched the first few lines of a typical (= beginner's) aalaap of the common raaga Yaman of North Indian classical music (figure 6.1); the raaga is widely used in compositions of popular—including film—music.
Figure 6.1 Structure of a raaga
I have chosen this raaga since it belongs to the Kalyan scale (thaata), which includes all seven notes with one sharp note (marked by +) and no flat notes, reminiscent of the common G-major scale in the Western system.17 These properties make it by far the simplest of the scales. The basic notes are given in the middle octave; lower and higher octaves are marked by ' before and after a note, respectively.18 Looking at the organization of the raaga, it is obvious that the system satisfies each property of discrete infinity mentioned by Hauser, Chomsky, and Fitch 2002 for the language case. The structures are built out of a small set of discrete units: S, R, G, M+, P, D, N. Items selected from this set are combined and repeatedly embedded in more complex structures to sustain the character of the raaga, as follows (each line embeds the whole of the previous line as its final sequence):
S
R-S
'N-R-S
R-'N-R-S
. . .
'N-R-G-R-'N-R-S
. . .
'N-R-G-M+-P-M+-G-R-'N-R-G-R-'N-R-S, etc.
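The pattern can be rendered as a small program, given purely as an illustration and paralleling the sentence-embedding sketch earlier in this section; the prefixes are read off the lines above (the two elided stretches are skipped), and nothing in the procedure imposes an upper bound:

```python
# Each new aalaap line prepends fresh material to the whole previous line,
# so every line embeds all earlier ones. The prefixes below are read off
# the sample; any further choice of prefixes would extend the series.
prefixes = ["R", "'N", "R", "'N-R-G", "'N-R-G-M+-P-M+-G-R"]

line = "S"                      # the ground note
lines = [line]
for p in prefixes:
    line = p + "-" + line       # previous line embedded whole as the tail
    lines.append(line)

for l in lines:
    print(l)
# S
# R-S
# 'N-R-S
# R-'N-R-S
# 'N-R-G-R-'N-R-S
# 'N-R-G-M+-P-M+-G-R-'N-R-G-R-'N-R-S
```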
I have listed just five relatively simple lines of aalaap for expository purposes. Barring human fatigue, there is no principled limit to these embeddings. The example suggests that, if anything, the phrasal complexity of music resembles language far more than arithmetic.19 There are other points of interest in this example to which I return in the next chapter.20

Skeptics are not easy to please. So, let us ask: What counts as an "unambiguous demonstration of recursion" in cognitive domains? For example, what is the unambiguous demonstration of recursion in the language case? As noted, the only "demonstration" we have is that we can keep embedding, say, relative clauses at will, but we can never demonstrate empirically the presence of discrete infinity in any domain. Daniel Everett 2005 has proposed that the Pirahã language does not have recursion because speakers of Pirahã do not comment on anything except "immediate experience." Chomsky offered the following counterexample (in English): the apple that I am now looking at is rotten (cited in Nevins, Pesetsky, and Rodrigues 2007). What else could Chomsky do to show that Pirahã, like any human language, allows endless embeddings? For arithmetic, once we have the successor function, we do have reductio proofs that the concept of the largest number is incoherent. But that is theory, not an (empirical) demonstration; anecdotally speaking, young children find it difficult to comprehend that the number system has no limit. There is no such proof in the language case. While reporting their work on possible recursive mechanisms in songbirds, Gentner et al. 2006 observe: "In practice, however, the stimulus sets used to test such claims must be finite." This leads to "theoretical difficulties in proving the use of context-free rather than finite-state grammars." As they proceed to note, these difficulties "extend to studies of grammatical competence in humans
as well, and therefore call into question the falsifiability of claims regarding [context-free grammars] in humans compared to non-humans." I return to this paper.

How then do we view language as a recursive system? We develop the intuition that the number series maps onto the sequence of objects generated by the language system: discrete infinity. Then we look for an unbounded operation, and introduce heavy theory in terms of the competence-performance distinction. Not surprisingly, we begin with "surface" recursion that directly appeals to intuitive data such as the embedding of relative clauses. The complex history of generative grammar shows that the discovery of Merge was not easy, although it looks simple once it has been found. In this light, SMH could be viewed as a plea for theory: think of Merge and the economy principles as applying to music, then find the suitable organization of musical information that would use the package to generate musical surfaces. Once that happens, the demand for "unambiguous demonstration" will be automatically met. What are the prospects for a theoretical framework in which the actual components of CHL are viewed as implicated in music? As we will now see, the query gives rise to a new set of issues regarding the organization of the human mind that go far beyond the specific issue of music.
7 A Joint of Nature
Universal Grammar (UG) specifies the initial state of the language faculty; suppose we think of UG as consisting (only) of linguistically specific items in the sense outlined. According to Chomsky (2001a), apart from parametric variations in the morphological system (and "Saussurian arbitrariness"), human language consists of a single lexicon, as noted. We may view at least the formal features of the lexicon as specified in the initial state, perhaps much else. If CHL is not linguistically specific, as noted, then only the suggested aspects of the lexicon belong to UG: UG is (just) a universal store of lexical features. To expand, CHL consists of Merge and the principles of efficient computation, both viewed as purely computational principles (PCPs) in our formulation. In other words, even though Merge and the principles of efficient computation are specified in the initial state, they do not belong to UG. On this view, PCPs, which constitute the generative procedure of language, satisfy two conditions at once: they are specified in the initial state without being specific to language. Hence, they may well be involved in generative procedures elsewhere, such as music. We are aiming for a generalization that there is a small class of generative systems—the hominid set—consisting of a domain-specific "lexicon" and CHL. The progressive austerity of the grammatical system under the biolinguistic program suggests that there could be a generalization of the suggested sort somewhere.

In his recent writings (Chomsky 2001a and after), Chomsky makes a rather different suggestion. According to him, the design of the language faculty consists of three factors: (i) linguistic experience, that is, primary linguistic data (PLD); (ii) UG exactly as viewed above: the linguistically specific initial state; and (iii) principles of computational efficiency (PCE) of the least-effort and last-resort varieties: the "third factor."1 Since PLD is needed in any case to activate the language system, we set it aside. Before it is triggered, FL then consists of UG and PCE.
On the surface, this taxonomy seems compatible with what I suggested. Appearances notwithstanding, Chomsky has a very different conception of what falls in UG and of the status of PCE. For him, UG consists of both the single lexicon and Merge. For me, UG contains just the lexical part, if anything: so my conception of UG is simpler. PCEs, on the other hand, are "laws of nature" for Chomsky in that they are viewed as general properties of organisms, perhaps on a par with such physical principles as the least-energy requirement or minimal "wire length" (Cherniak, Mokhtarzada, and Nodelman 2002; Cherniak 2005).2 The picture presents two problems for the conception of the hominid set: (1) Merge is specific to language, hence it cannot apply elsewhere; and (2) PCEs are general properties of organisms; they are not specifically involved in either language or music—that is, since PCEs are allegedly found across (every) cognitive system of organisms, they do not help in conceptualizing the restricted class of human "languagelike" cognitive systems. At best, the hominid set is single-membered; at worst, the conception is vacuous. I will suggest that, while (1) is an undergeneralization, (2) is an overgeneralization. In my view, the right generalization supports the conception of the hominid set. As noted, in this work I concentrate on two members of the hominid set: language and music. Hence, the focus continues to be on the strong musilanguage hypothesis (SMH).

7.1 Merge and Music
Is Merge involved in musical computation? The question is central for SMH because, to emphasize, Merge is the sole recursive device in language postulated in the Minimalist Program. Even if we assume music to be a recursive system in a general way, as above, SMH will not hold unless the specific recursive operation of language could be viewed as implicated in music as well. As noted, the interest is that, although Merge is certainly involved in language, its formulation—and thus its effects—does not seem to be linguistically specific (section 5.2.2). Since I know of no direct literature (with one exception noted below) in which the status of Merge has been studied vis-à-vis something like SMH, I will basically examine what Chomsky has to say on Merge, without suggesting that he himself holds any position on musical recursion. To recapitulate, "unbounded Merge or some equivalent is unavoidable in a system of hierarchic discrete infinity, so we can assume that it 'comes
free'" (Chomsky 2006c). The characterization is completely general in that it does not mention any specific domain or system. Further, Chomsky observes that Merge is an elementary operation that has the simplest possible form: Merge(a, b) = {a, b}, incorporating the No Tampering Condition (NTC), which leaves a and b intact. This formulation of Merge is the simplest since, according to Chomsky, anything more complex—for example, Merge forms the ordered pair ⟨a, b⟩—needs to be independently justified. As we saw (figure 6.1), musical structures are at least hierarchies of sets of progressively embedded discrete symbols: {S}, {R,S}, {'N,{R,S}}, {R,{'N,{R,S}}}, {{'N,{R,G}},{R,{'N,{R,S}}}}, etc. How can Merge fail to apply to music?

I can think of two possibilities in which Merge, notwithstanding its linguistically nonspecific formulation, operates only on linguistic information. The first possibility is that there could be a domain-internal relationship between Merge and what it computes upon. At one place, Chomsky suggests that, to be able to enter into computation, a lexical item (LI) must have some property that permits it to merge with an available syntactic object: "A property of an LI is called a 'feature,' so an LI has a feature that permits it to be merged" (Chomsky 2006c, 139). Suppose we interpret these remarks as suggesting that Merge is sensitive only to lexical features. The suggestion looks very much like a stipulation unless it is explained why the very general characterization of Merge is satisfied only under the highly restrictive condition of lexical items of language. In any case, even the stipulation looks suspect since the same Merge is supposed to generate the infinite system of numbers. Following Chomsky, we think of a "language" that has just one "lexical item," called "one." Now, the first application of Merge to one yields {one}, called "two"; the second application yields {{one}}, called "three," and so on: "In effect, Merge applied in this manner yields the successor function." It is hard to see that Merge, so applied, is computing on lexical features of the kind available in UG. Notice that Chomsky is not saying that what applies in the domain of numbers is an "equivalent" of Merge; it is Merge itself, perhaps because arithmetic is routinely viewed as an "offshoot" of language. As far as I can see, the only reason why arithmetic is viewed as an "offshoot" of language is that arithmetic is recursive, which begs the current question.3
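A schematic rendering may help fix ideas; it is my own illustration, not Chomsky's formalism, and the function names are invented for the sketch:

```python
# Merge as bare set formation: Merge(a, b) = {a, b}, leaving a and b
# intact (the No Tampering Condition). frozenset is used only so that
# the outputs can themselves be merged again.
def merge(a, b):
    return frozenset({a, b})

# Applying Merge to a single object, as in the numeral construction:
# one -> {one} ("two") -> {{one}} ("three"): the successor function.
def successor(x):
    return frozenset({x})

pair = merge("the", "book")   # {'the', 'book'}
one = "one"
two = successor(one)          # {one}
three = successor(two)        # {{one}}
# Note that nothing here inspects lexical features: merge() and
# successor() apply to any objects whatever, which is the point at issue.
```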
Moreover, the operation of Merge just described is restricted to a single item. This gives the most perspicuous example of recursion, in which the same item is fed back into the same function over and over again, noniteratively, to generate new syntactic objects. Syntactic recursion in language typically does not look like arithmetic recursion at all, precisely because the human lexicon is complex. Typical examples of recursion, such as The man who met the woman who . . . , or I wonder if Susan knows that Fred assured Lois that . . . (Jackendoff 2002, 38–39), do not repeat the same items. So, linguistic recursion is to be understood in terms of (syntactic) types, such as relative clauses. Recursion in language means that a type may be embedded in the same type indefinitely, performance factors aside. Identification of syntactic types clearly ensues from lexical features such as ±wh. So, if Merge is sensitive only to these features, Merge cannot apply to arithmetic. It follows that if Merge is to apply to arithmetic, Merge cannot be sensitive only to linguistic information.

A second possibility is that there could be recursion elsewhere, such as music, without Merge applying. We will shortly study a specific proposal from that direction. We have already covered the problem that arithmetic poses to such proposals. Now, if music is a system of hierarchic discrete infinity and if Merge does not apply, then the only option is that "some equivalent" of Merge applies: call it "Murge." The human cognitive architecture then has at least two recursive devices, perhaps more. Since Merge is the simplest recursive operation, Murge is either a notational variation of Merge or is more complex than Merge. Setting the former aside, a more complex operation, pace Chomsky, needs to be independently justified (after formulating the operation, of course). Further, Chomsky views the emergence of Merge as a "Great Leap Forward" with no known evidence about how such a mechanism got inserted in the species; it is a mystery. For Chomsky, the only plausible speculation is that some critical physical event "rewired" the brain of an ape that had the conceptual-intentional system more or less in place. How many times did the ape's brain get rewired to incorporate Murge and other recursive devices? Finally, we can perhaps make some sense of the adaptive advantages of language to the species with Merge in place—for example, the ability to plan ahead. In contrast, as Darwin's remark pointed out, the adaptive advantages of music remain a mystery (Pinker 1997; Wallin, Merker, and Brown 2000; Hauser and McDermott 2003; etc.). The addition of Murge to the architecture then (quite unnecessarily) adds to the mystery we already have with Merge.

It follows that Merge needs to be characterized abstractly in any case, independently of where it applies. Merge then yields domain-specific constructions by using specific resources available in a given domain.
Chomsky (2006a) seems to be saying something very similar: "The conclusion that Merge falls within UG holds whether such recursive generation is unique to FL or is appropriated from other systems. If the latter, there still must be a genetic instruction to use Merge to form structured linguistic expressions satisfying the interface conditions." Given the wide variance in the specific resources of the concerned domains (lexical features, tones, numbers), and given the plausible idea that much of these resources could have been independently available to the broad cognitive architecture from the rest of the organic world, it is a plausible assumption that each domain requires specific genetic instructions to access the otherwise general-purpose Merge. Prima facie, the assumption looks simpler than the conjecture that the architecture requires a variety of independent recursive operations.

With this general perspective in mind, I turn to what seems to be a direct rejection of the claim that musical recursion could be languagelike. Jackendoff and Lerdahl (2006; also Jackendoff 2009) suggest that "the kind of recursion appearing in pitch reductions seems to be special to music. In particular, there is no structure like it in linguistic syntax. Musical trees invoke no analogues of parts of speech, and syntactic trees do not encode patterns of tension and relaxation." I have already covered the conceptual implausibility of the idea that recursion in music is "special." Nonetheless, there are a variety of specific problems with the remarks just cited.

First, it looks as though the authors are making these suggestions from within the framework of the generative theory of tonal music (GTTM) proposed by them earlier (Lerdahl and Jackendoff 1983) and subsequently developed in Lerdahl 2001 and Temperley 2001, among others. As far as I can see, GTTM is at best a systematic description of how listeners parse a musical text. GTTM accomplishes this task by taking a musical surface and showing how a listener detects patterns of tension and release, identifies phrases in terms of, say, pauses and interludes, and so on. For example, their prolongational-reduction trees describe how a listener undergoes tension and relaxation as the music progresses; they do not describe how the concerned piece of music is generated such that listeners undergo the suggested states. Not surprisingly, the description uses traditional music-theoretic notions such as progression, inversion, and harmonic function. GTTM does not give a generative account of these traditional descriptive tools by unearthing the underlying laws that govern the generation of musical surfaces. This contrasts sharply even with the earliest proposals in generative grammar (Chomsky 1955a/1975)
that "recovered" traditional notions like Subject, Object, and Subject-Object asymmetry in generative terms, as we saw. In that sense, GTTM covers some restricted aspect of the perception of music in mostly music-specific terms. So the theoretical notions it invokes, such as musical recursion, are likely to be musically specific from that perspective. It makes little sense to make the general claim that musical recursion is specific to music on that basis. Consider the analogy of the Aspects model of linguistic theory (Chomsky 1965) alluded to earlier (section 5.2). Although, in comparison to GTTM, the Aspects model was far richer and a truly generative account of all aspects of linguistic competence under the usual idealizations, no significant claim about whether linguistic recursion (such as the rule N′ → N S) applied to anything else could be made from that model. Despite its richness, the design of the Aspects model was linguistically specific, as noted. In sum, the claim that musical recursion is specific to music is a consequence of the descriptive framework of GTTM, rather than a property of music itself.

Second, the requirement that, in order for musical recursion to be like linguistic recursion, the structure of pitch reduction must look like "analogues" of parts of speech is not immediately plausible (but see below). To my knowledge, nobody wants to claim that music contains analogues of noun phrases and anaphora while language contains analogues of octaves and dissonances. It is like asking for cardinals and surds in linguistic structure, and reflexives and pleonastic elements in arithmetic, to show that the recursive mechanisms in language and arithmetic are the same. Since music and language (and arithmetic) are no doubt distinct domains, it stands to reason that the information processed and stored thereof must be domain-specific as well. Thus, after factoring out all nonspecific aspects, it could turn out that "tonal space—the system of fixed pitches and intervals, and its hierarchy of pitches, chords, and keys and distances among them—is entirely specific to music and therefore to melodic organization" (Jackendoff and Lerdahl 2006). It does not follow that the computational principles that access and rearrange that information must also be domain-specific.

Finally, it can be shown that some of the structures proposed by GTTM can be redescribed with Merge, not surprisingly. Recall figure 5.3, in which the tree for which book the girl has read is derived with repeated application of Merge. Now, referring to the prolongational-reduction structures proposed by Lerdahl and Jackendoff in GTTM, David Pesetsky (2007) claims that these structures are also "binary branching and involve headed phrases. They are thus characterizable as
products of External Merge." To establish the point, Pesetsky redraws the diagram for the prolongational-reduction structure proposed by Ray Jackendoff for the opening of Mozart's piano sonata K. 331 (see figure 7.1).

Figure 7.1 Merge in music. Reproduced from Pesetsky 2007 with permission.

Skipping details, the basic idea is to treat "I" and "V" as heads of phrases, where "I" is the tonic chord and "V" the dominant chord.4 So we can think of I-phrases and V-phrases as "parts of speech" of musical organization after all! Those who hold the view that musical expressions are nonpropositional in character may want to rethink the issue.
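Schematically, and only as an illustration of the idea just stated (the encoding and the choice of complement chords are mine, not Pesetsky's), headed binary branching for chords can be built with the same set-forming operation:

```python
# A headed, binary-branching phrase: (head, complement). The chord
# labels "I" and "V" follow the text; the nesting below is a toy
# cadence-like structure, not an analysis of K. 331.
def merge(head, complement):
    return (head, complement)   # external Merge, with the head projecting

v_phrase = merge("V", "ii")     # a V-phrase headed by the dominant chord
i_phrase = merge("I", v_phrase) # an I-phrase headed by the tonic chord
print(i_phrase)                 # ('I', ('V', 'ii'))
```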
Assuming the general validity of Pesetsky's suggestion, how do we explain the fact that a GTTM-motivated—that is, an avowedly music-specific—structure is redescribed by linguistic means? As noted, GTTM does not seem to have the form to satisfy SMH; hence, the broad properties of design that SMH seeks, in line with the design of the language faculty, are not likely to be available in GTTM. However, it goes without saying that, since GTTM does inquire into some of the aspects of the structure of musical competence, many of its specific insights are likely to find their way into a more principled SMH-guided theory, just as some of the major insights of the Aspects model, such as island constraints, are incorporated in the Minimalist Program under different technology. Supposing Pesetsky's proposals to form a part of the envisaged SMH-guided theory of music, we saw the first example of how insights of GTTM may be incorporated within it; we will see more of this soon. In this case, as far as we know, it is plainly a fact of musical experience that musical surfaces have prolongational-reduction structures; it is a phenomenon that any theory of music must capture. Insofar as these structures are syntactically governed, Pesetsky's ingenious analysis shows that, as he boldly puts it, "musical syntax is language syntax."

Suppose then that external Merge applies to music. Does internal Merge also apply to music? To recall, (internal) Merge is not a distinct operation: it is Merge working on parts of existing SOs. So, its "availability" is not the issue; the issue is whether it is active. After establishing external Merge in music as above, Pesetsky 2007 proceeds to show that internal Merge is (also) at work in music as it maps the tonal structure shown above to the rhythmic interface. Specifically, Pesetsky suggests that internal Merge dislocates/raises the tonal tree, branch by branch, to the (top) edge by constructing relevant nodes at IP and above. However, on the one hand, much complex music does not have any pronounced rhythmic structure, as Pesetsky notes; the aalaap is a case in point, chosen specifically for that reason.5 On the other hand, as Pesetsky demonstrates, the pronunciation of linguistic items sometimes has manifest rhythmic structure: Carrots & lemons & coffee & pepper (his example), or mind and body, heart and soul. Pesetsky suggests that, like music, internal Merge works in the language case also to map SOs to the rhythmic (= sound) interface. But, as discussed above (section 5.1.3), there is a deep sense in which that is not what internal Merge is (basically) for, since the sound system can be viewed as "ancillary" to the language system (though necessary for communication, etc.).6 The basic idea in the Minimalist Program is that internal Merge essentially creates copies for the semantic interface; in the process, it dislocates items at the sound interface. Given the existence of nonrhythmic music, it is not obvious why a similar picture cannot be extended to music if musical structures are to meet conditions of "thought." We speculate on this as follows.

7.2 Faculty of Music
CHL, we saw, is geared to generate complex SOs and check for uninterpretable features. For structures such as she has gone, feature checking takes place at the (base-generated) positions at which external Merge places syntactic objects; thus, as far as current understanding goes, no movement is needed. For which book the girl has read, in contrast, the
SO which book needs to move to CP for checking the wh-feature; thus, a copy of which book is generated by internal Merge. The computational requirement of feature checking serves two purposes for the external systems: at the meaning interface we get the quantifier interpretation for a WP, and at the sound interface the copy is pronounced (in English). Basically then, internal Merge is activated when copies are needed, as a last resort, to meet external conditions.
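The copy-forming role of internal Merge can be pictured with a small sketch. The representation is an assumption of the illustration: ordered pairs stand in for the unordered sets of the theory, and the name internal_merge is mine.

```python
# Schematic only: an SO is a pair; internal Merge re-merges a sub-part
# of an existing SO with that SO itself, so the result contains two
# occurrences of the part - the "copy" - while the original stays intact.
def merge(a, b):
    return (a, b)                       # external Merge (ordered for display)

def internal_merge(so, part):
    return (part, so)                   # re-merge `part` at the edge of `so`

wh = ("which", "book")
vp = merge("read", wh)                  # ('read', ('which', 'book'))
cp = internal_merge(vp, wh)             # the wh-phrase now occurs twice
print(cp)  # (('which', 'book'), ('read', ('which', 'book')))
```

One occurrence feeds the meaning interface and the other is available for pronunciation, which is the division of labor just described.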
With this mechanism in hand, we may draw what Lerdahl and Jackendoff call "prolongational-reduction" trees showing how the listener perceives the transitions from states of relaxation to states of tension (and back) on a note-by-note basis, while an overall move toward relaxed states is progressively attained. In Indian music, the "ground note" S (Saa) is taken to be the most stable note. Drone instruments are typically tuned to this note—along with P, a fifth above S—irrespective of the raaga. In that sense, S qualifies as a "tonic" (Lerdahl and Jackendoff 1983, 295). Thus, it has priority over the "real" tonic (vaadi) of a raaga, the selected tonal space; as noted, the tonic for the tonal space of Yaman is G, and N is the dominant (samvaadi). Thus consider the first line of the aalaap for raaga Yaman (figure 6.1). We may chart the stable points of the structure as follows (the chart itself is not the prolongational-reduction tree).

(122) [S, 'N-R-S, 'N-R-G-R-'N-R-S, S-'N-'D-'N-'D-'P, 'M+-'D-'N-R-'D-'N, R-G-R, S]
g d g d t d g g d d d d t g (g = ground note, d = dominant, t = tonic; in the original chart, each label sits under the corresponding stable note of (122))
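As a rough illustration of how such a chart could be computed, the sketch below derives the labels mechanically, assuming (as a deliberate simplification) that only occurrences of the ground note S, the tonic G, and the dominant N count as stable points, with octave markers ignored. The encoding and names are hypothetical:

```python
# Hypothetical encoding: phrases as hyphen-separated notes, lower octave
# marked with a leading apostrophe, as in (122) above.

STABILITY = {"S": "g", "G": "t", "N": "d"}  # ground note, tonic, dominant

def stable_points(phrases):
    """Label the stable notes of each phrase, in order of occurrence."""
    labels = []
    for phrase in phrases:
        for note in phrase.split("-"):
            note = note.lstrip("'")         # ignore the octave marker
            if note in STABILITY:
                labels.append(STABILITY[note])
    return labels

aalaap = ["S", "'N-R-S", "'N-R-G-R-'N-R-S", "S-'N-'D-'N-'D-'P",
          "'M+-'D-'N-R-'D-'N", "R-G-R", "S"]
print(" ".join(stable_points(aalaap)))
# -> g d g d t d g g d d d d t g, matching the chart for (122)
```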
Roughly then, the motion from the first note S to 'N increases tension, which is further increased in the motion from 'N to R, followed by a motion of relaxation to S. Thus, the overall motion over the first two phrases is toward relaxation. Taking into account differences in the duration of notes and the distribution of relative stability with respect to the ground note, the tonic, and the dominant respectively, these motions can be suitably represented in a tree diagram, as noted (Lerdahl and Jackendoff 1983, chapter 8). This rather narrow computational mechanism generates the immensely complex melodic structures of music. Prolongational-reduction trees describe what motion an experienced listener of a musical style perceives; they do not explain why the music has the structure to trigger the specific motion for the listener, as noted.8 It stands to reason that, other things being equal, just the mechanism of tension and relaxation enforcing least-effort considerations suggests that a music (preferably) stays at the tonic (and its neighborhood) to sustain the most relaxed state. In fact, much music does precisely that: sections of rap, chants, choruses, lullabies, cheers, and so on. So, why does music rise and fall prominently at all?
Specifically, we would want to know why the line above is heard as a line of Yaman, but not, say, as the raagas Kedaar or Bhupali, which also belong to the same Kalyan scale (thaata), sharing essentially the same tonal space. We saw (figure 6.1) that the structures of raaga Yaman are so generated that the characteristic features of the raaga are sustained over growing complexity. It looks as though the computational system works "blindly" to increase stability; the characteristic features of the raaga, which is an organization of a certain tonal space, constrain the working of the computational system. Suppose we capture the phenomenon by making a principled distinction between the computational system per se and the "external" conditions that constrain its operations. The resulting musical surface then is to be viewed as an optimal solution to these interface conditions.

As an aside, I note that Lerdahl and Jackendoff frequently appeal to tonal spaces to describe musical motion—for example, Jackendoff and Lerdahl (2006, 35) suggest that the melody in the Beatles' song "Norwegian Wood" so moves as to satisfy the E-major triad, B–G♯–E–B. In their framework, it is unclear to me where this information about the triad is located and how it is accessed. In general, most theories of music cognition appeal to the notion of an "experienced listener" without unpacking it. They simply tell us what experienced listeners do without telling us what it is about experienced listeners—how the relevant material is organized in their minds—such that they are able to do it.

The perspective just sketched offers some conception of the relevant external systems that are internalized by an experienced listener, such that interface conditions are enforced in music. As a certain set of notes is selected from the (largely) universal musical lexicon, certain interpretive conditions—scales, modes, ascent/descent structures, characteristic motifs, delineated musical forms, and so on—are enforced on the selection.9 I am calling these "interpretive" conditions since, in some sense, they are available in advance to create a space of expectations about how the melody is going to be organized; these conditions are already in store before the onset of music, so to speak. When these conditions differ, the melodic organization ensuing from the same selection of tones differs as well. Indian music has many sets of raagas—such as Yaman, Bhupali, and Kedaar, as noted—in which members of a given set share exactly the same set of tones; in Western music, a sonata in G major is a very different form of music than a symphony in the same key. It is important to note that these interpretive systems are cognitive systems in that their organization is only weakly determined, if at all, by the physical properties of sound.
Lerdahl and Jackendoff (1983, 11.5) forcefully argue that all attempts to determine the scale systems of the world from, say, the overtone series turn out to be unsuccessful. This is even more so for the Indian raaga system—not really a scale system—which directly illustrates Helmholtz's idea that "the principle of tonal relationship" must be regarded "to some extent as a freely selected principle of style" (cited in Lerdahl and Jackendoff 1983, 293; see also Vijaykrishnan 2007, 2.2.5). Needless to say, the more consciously constructed musical forms, such as sonata, concerto, dhrupad, gazal, and so on, are even more specific to musical traditions. In this sense, I am proposing that the interpretive conditions constitute the "thought" systems of music. I am not suggesting that these are irreducibly "cultural" constructs of some sort where a "system of differences" rules. The suggestion is rather that any investigation into the universal basis of these systems ought to search for cognitive categories rather than acoustic ones. For example, an explanation of how "inexperienced listeners [are] able to adapt quite rapidly to different musical systems" (Krumhansl et al. 2000, 14) is not likely to follow from acoustic properties alone.

If this picture is roughly valid, then, in some sense, music creates and enforces its own external conditions: music expresses itself. Also, the "parameters" of music are likely to be located here rather than in the "lexicon," since the musical lexicon—individual tones, not tonal spaces—appears to be by and large universal. One advantage of this "in-house" organization of interpretive conditions is that it keeps the significance of music inside music, so to speak; it seems to guarantee that music has only internal significance. In that sense, the external conditions that a particular musical computation must meet are enforced by the faculty of music (FM) itself. I am obviously suggesting a parallel between these FM-driven interpretive conditions (FMI) and what I earlier called FLI systems—interpretive systems driven by FL itself (section 5.1.3.2). If the parallel holds, then both lend direct internal significance to structures transferred by the computational system. For music, the basic point is that these interpretive systems impose conditions on the selected tones that progressions with these tones must meet. Apart from these, musical progression may also meet more general requirements of thought systems that are adapted to music, as suggested earlier (section 5.2.2). These could include musical analogues of such "pragmatic" linguistic phenomena as topicalization, focus, new information, and the like—perhaps things such as highlight and continuity as well.
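To illustrate what enforcing such interpretive conditions on a selection might look like computationally, here is a schematic sketch in which one tonal space is paired with different raaga-specific conditions. The particular "characteristic motifs" encoded are invented placeholders chosen only for illustration, not a faithful grammar of these raagas:

```python
# One tonal space (Kalyan), different raaga-specific interpretive conditions.
# The motifs below are invented placeholders standing in for the real
# ascent/descent structures and characteristic phrases of these raagas.

KALYAN_SPACE = {"S", "R", "G", "M+", "P", "D", "N"}

CHARACTERISTIC_MOTIFS = {
    "Yaman": ["N-R-G"],       # placeholder for Yaman's characteristic motion
    "Kedaar": ["S-M+"],       # placeholder for Kedaar's
}

def admissible(raaga, phrase):
    """Does the phrase satisfy this raaga's (toy) interpretive conditions?"""
    notes = [n.lstrip("'") for n in phrase.split("-")]
    if not set(notes) <= KALYAN_SPACE:       # must stay within the tonal space
        return False
    flat = "-".join(notes)
    return any(motif in flat for motif in CHARACTERISTIC_MOTIFS[raaga])

print(admissible("Yaman", "'N-R-G-R-'N-R-S"))   # True
print(admissible("Kedaar", "'N-R-G-R-'N-R-S"))  # False: wrong motif
```

The design point the sketch carries is just this: the same selection of tones passes or fails depending on which interpretive conditions are in store, so the conditions, not the tones, differentiate the musical "thoughts."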
Notwithstanding these crucial parallels, even these earliest reflections on the organization of the music faculty seem to lead to a markedly different conception of the music faculty when compared with the language faculty. We noted that the musical lexicon is largely universal, unlike the lexicon of language, which has parametric properties. Further, no intelligible notion of a numeration of lexical items seems to apply in the music case. In the language case, a numeration signals a "one-time" selection of lexical items; in the music case, a "selection" is stored in the interpretive systems, such as a raaga or a scale, from which the selected items are repeatedly drawn "online."10 At many places, Chomsky explains why CHL computes only on a selection from the lexicon made once and for all. If CHL is allowed to go back to the lexicon, the entire lexicon needs to be "carried along" for each computation. It is like carrying a refinery, not just a tank, in a moving car (Chomsky 2002). Since the musical lexicon is very small, it might as well be carried along. With respect to the interpretive systems, structural significance perhaps terminates with the FMI systems in the music case; in the language case, semantic interpretation apparently extends beyond FLI systems to interface with classical C-I systems, suggesting a major difference in the organization of language and music. In this sense, the picture sketched above "saves the appearance," namely, the wide phenomenal divergence between music and language. However, the computational system remains the same, with very similar internal operations. In a way, then, the sameness of the computational system, required by SMH, forces a very different conceptualization of FM (almost) everywhere else: a pleasing result, if true.

With the thought systems of music in place, we might want to know if the conditions imposed by these systems are sometimes met by internal Merge. Internal Merge, we know, essentially generates copies of structures already generated by external Merge. Thus, we are asking if conditions of optimal computation are sometimes met when copies are made available by internal Merge as a last resort. The need for copies in music seems all-pervasive since "most background level of reduction for every piece is a statement of the tonic; hence the tonic is in some sense implicit in every moment of the piece" (Lerdahl and Jackendoff 1983, 295, emphasis added).
If a generative theory of music is to make explicit what is implicit in the understanding of music, such as the "presence" of the tonic throughout, the theory must postulate some device to that effect. A similar demand seems to arise for much larger (and later) structures if the interpretive role of musical devices such as recapitulation, coda, and cadence (Katz and Pesetsky 2009) is to be explained by a theory. Let us assume that these demands are met when internal Merge generates (covert) copies as a last resort at the relevant positions. Be that as it may, in an optimal design, the copies created by internal Merge to meet FMI conditions will also be used by the rhythmic system when required. Much of the rhythmic structure of music seems to act as a guide to memory—"reminders"—of past events; in Indian music, the return to the first beat (som) of a beat cycle typically coincides with a return to the tonic (or its neighborhood) after meeting (some of) the conditions of the raaga during the earlier cycle. In that sense, the requirements of the raaga coincide with the rhythmic structure. Beyond this very general intuition, however, it is currently unclear whether the computational requirement met with operations of internal Merge coincides with the simultaneous satisfaction of the external requirements of rhythm and interpretation.

Assuming internal Merge to be in place, we will expect some economy conditions to govern computations in music as well. In the language case, as we saw, a small class of principles of efficient computation (PCE) not only enforce optimal computation, (ideally) they so constrain the operations of Merge that only interpretable structures meet the interface conditions; the rest is filtered out. We also saw that PCEs are linguistically nonspecific; that is, PCEs are purely computational principles, in the sense outlined. If FM is optimally designed with Merge in place, the system will require some economy principles that enable Merge-generated structures to meet legibility conditions optimally. Could the PCEs (of language) be those principles? I have already used least-effort and last-resort considerations in a general way while speculating on the organization of music. Some version of the least-effort principle MLC seems to be operative in the fact that an unstable pitch tends to anchor on a proximate, more stable, and immediately subsequent pitch, as noted. The other least-effort principle, FI, is observed in facts such as that a pitch "in the cracks" between two legitimate pitches D♯ and E will be heard as out of tune (Jackendoff and Lerdahl 2006, 47). In general, the phenomena of "dissonance" and intonation seem to require FI since no note by itself is either dissonant or out of tune. If these speculations make sense, then, as with Merge, we will expect these economy conditions to be available in an abstract manner across FL and FM.
Specific resources internal to a domain will then be used to implement them: in FL, PCEs constrain feature movement; in FM, they control tonal motion. Is this view valid?
7.3 "Laws of Nature"
In a way, the answer is trivially affirmative. According to Chomsky, PCEs are "laws of nature" in that they are general properties of organisms, perhaps on a par with physical principles such as the least-energy requirement or minimal "wire length," as noted. PCEs thus apply to, say, music because they apply everywhere. Therefore, assuming that Merge applies to music, CHL satisfies SMH at once! Apparently, then, the Minimalist Program makes it rather easy for SMH to obtain. The sweeping generality enforced by PCEs could be viewed as a ground for casting doubt on—hence, an objection to—the substantive character of SMH. SMH is supposed to be a substantive proposal in that it attempts to capture (just) the computational convergence between language and music. Under Chomsky's proposal, the cherished restricted character of SMH collapses. Let C be any cognitive system of any organism. If PCEs cover all organisms, and if C contains Merge, then a strong C-language hypothesis (SCH) holds. Since we have not specified the scope of Merge so far, there is no reason why Merge cannot obtain widely. Insofar as it does, SCH also holds widely. SMH, then, is just an instance of SCH; there is nothing specifically "languagelike" about music that SMH promised to cover. Unlike the issue of Merge, the issue is no longer whether some component of CHL—that is, the PCEs—applies to music. That is apparently already granted under Chomsky's generalization. The issue is rather whether PCEs can be prevented from applying to systems outside the hominid set. The discussion, therefore, will be concerned more with the general organization of the cognitive systems of organisms than with (human) music and language.
7.3.1 Forms of Explanation
In his recent writings, Chomsky has drawn attention to two competing perspectives in biology: the "standard view" that biological systems are "messy," and an alternative view that biological systems are optimally designed. It seems that, currently, Chomsky's views on this topic are moving away from the standard view and toward the alternative. For example, in Chomsky 2006b, he actually criticized the British geneticist Gabriel Dover, who held that "biology is a strange and messy business, and 'perfection' is the last word one would use to describe how organisms work."
Interestingly, Chomsky himself held this view just a few years ago: "Biological systems usually are . . . bad solutions to certain design problems that are posed by nature—the best solution that evolution could achieve under existing circumstances, but perhaps a clumsy and messy solution" (Chomsky 2000a; also see 1995b, 2002). From what I can follow, the shift in perspective was essentially motivated by an extremely plausible methodological idea, which emerged when it became reasonably established that large parts of grammatical computation can be explained by PCEs alone; capturing the remaining parts, such as c-command and the classical island constraints on extraction, then becomes a research problem (Chomsky 2006d). UG specifies the initial state of FL with linguistically specific items—that is, elements of UG belong to the faculty of language proper. Suppose now we want an explanation of how FL evolved. Clearly, the more things UG contains, the more difficult it is to explain why things are specifically that way. It follows that "the less attributed to genetic information (in our case, the topic of UG) for determining the development of an organism, the more feasible the study of its evolution" (Chomsky 2006a). If so, then there is a need to reduce UG to the narrowest conception. What can we take away from the things just listed and plausibly assign elsewhere? As we saw, what we cannot take away from UG, according to Chomsky, includes at least (some prominent parts of) the human lexicon and Merge. That leaves the PCEs, the principles of efficient computation. If PCEs also belong to UG, a principled explanation has to be found as to why they are so specifically located. The explanation will not be needed if it can be suggested that they are available to the faculty of language in any case, as part of the general endowment of organisms. From what I can follow, this suggestion has been advanced along the following steps. The first step is methodological: the "Galilean style" of explanation in science begins by assuming that nature—or, at least, the aspects of nature we can fruitfully study—is perfect. The second step consists in showing that general principles such as the least-energy requirement (least-effort principles) have played a major role in the formulation of scientific theories, including reflections on biological phenomena. A variety of natural phenomena seems to require essentially the same form of explanation, to the effect that nature functions under optimal conditions. These include phenomena such as the structure of snowflakes, the icosahedral form of poliovirus shells, the dynamics of lattices in superconductors, minimal search operations in insect navigation, stripes on a zebra, the location of brains at the front of the body axis, and so on.
The third step shows that PCEs are optimal conditions of nature, given that language is a natural object. Although almost everything just listed is under vigorous discussion, I will simply assume that each of these steps has been successfully advanced.11 There is no doubt that the range of discoveries listed above has played a major role in drawing attention to the alternative view of biological forms. On the basis of this evidence, I will assume that nature, including the biological part of nature, is perfect; therefore, human language, also a part of nature, has a perfect design. I can afford to assume all this because it still will not follow that PCEs are general properties of organic systems such as insects, not to mention inorganic systems such as snowflakes. For that ultimate step, we need to shift from historical parallels and analogies, however plausible, to theory. Basically, shifting the PCEs to the third factor conflates the distinction between a (general) form of explanation and an explanation of a specific (range of) phenomena. Suppose the preferred form of explanation is the "Galilean style"—mathematico-deductive theories exploiting symmetries, least-effort conditions, and so on. We may assume that physics since Galileo has adopted the Galilean style. But that did not prevent two of the most sophisticated and recent theories, namely, relativity theory and quantum theory, from differing sharply about the principles operating in different parts of nature. The separation between these two theories is so fundamental that it divides nature into two parts, obeying different principles. As noted in section 1.3.2, this divide can only be bridged by unification, perhaps in a "new" physics, as Roger Penrose suggests. Until that happens, the two general theories of nature are best viewed as two separate bodies of doctrine, while both adopt the (same) Galilean style. Given the overwhelming complexity and variety of the organic world, a Galilean form of explanation is harder to achieve in biology, which explains the wide prevalence of the standard view sketched above. Application of the Galilean style to this part of nature thus requires at least two broad steps. First, we show that the apparent diversity of forms in a given range of phenomena can in fact be given a generative account from some simple basis: the condition of explanatory adequacy. Next, we go beyond explanatory adequacy to show that the generative account can be formulated in terms of the symmetry and least-effort considerations already noted in the nonorganic part of nature. The distinction between form of explanation and specific explanation seems to apply to each of these steps.
Consider the idea that the "innate organizing principles [of UG] determine the class of possible languages just as the Urform of Goethe's biological theories defines the class of possible plants and animals" (Chomsky, cited in Jenkins 2000, 147). If the parallel between UG and the Urform of plants is intended to highlight the general scientific goal of looking for generative principles in each domain, it satisfies the first step. It is totally implausible if the suggestion is that a given Urform applies across domains: UG does not determine the class of plants, just as Goethe's Urform fails to specify the class of languages. Similarly, the very interesting discovery of homeotic transformations of floral organs into one another in the weed Arabidopsis thaliana (Jenkins 2000, 150) does not have any effect on wh-fronting. To play a role in the theoretical explanation of phenomena, the general conceptions of Urform and transformation need to be specifically formulated in terms of principles operating in distinct domains, pending unification. Turning to the issue of whether a particular least-effort principle of language, say, the Minimal Link Condition, might apply in other domains and organisms, consider Chomsky's (2000d, 27) general idea that "some other organism might, in principle, have the same I-language (= brain state) as Peter, but embedded in performance systems that use it for locomotion." The thought is difficult to comprehend if "I-language" has a full-blooded sense that includes lexical features, Merge, and PCEs (plus PLD, if the I-language is not at the initial state). To proceed, let us assume that by "I-language" Chomsky principally had minimal search conditions in FL—that is, PCEs—in mind. To pursue it, Hauser, Chomsky, and Fitch 2002 suggest that "comparative studies might look for evidence of such computations outside of the domain of communication (e.g., number, navigation, social relations)." Elaborating, the authors observe that "elegant studies of insects, birds and primates reveal that individuals often search for food using an optimal strategy, one involving minimal distances, recall of locations searched and kinds of objects retrieved." Following the Galilean assumption that nature is perfect, optimal search could well be a general property of every process in nature, including the functioning of organisms. As such, principles of optimal search could be present from the collision of particles and the flow of water to the formation of syntactic structures in humans. However, it requires a giant leap of faith to assume that the same principles of optimal search hold everywhere. Plainly, we do not wish to ascribe "recall of locations searched" to colliding particles or to the trajectory of a comet.
In the reverse direction, there is (currently) no meaningful sense in which principles of optimal water flow are involved in insect navigation, not to speak of syntactic structures in humans. To emphasize, I am not denying that, say, foraging bees execute optimal search, as do singing humans and colliding particles. The problem is to show that there is a fundamental unity in these mechanisms. There could be an underlying mechanism of optimal search in nature that has "parametric" implementation across particles, bees, and humans. But the unearthing of this mechanism will require the solution of virtually all the problems of unification. Even in such a general theory of nature as Newtonian mechanics ("a theory of everything"), economy considerations are formulated in terms of principles specific to a domain. Newton's first law of motion has two parts: (i) "Every body perseveres in its state of rest or uniform motion in a straight line," and (ii) "except insofar as it is compelled to change that state by forces impressed on it." The first part states a least-effort principle, and the second a last-resort one, in terms of properties specific to a theoretically characterized domain. Clearly, this very general law of nature does not belong to CHL since, at the current state of knowledge, nothing in grammar moves in rectilinear or elliptical paths, and no forces act on SOs. From the perspective of physics, the language system is an abstract construct; the laws of physics do not apply to it, just as they do not apply to the vagaries of the soul. Yet, to return to the theme of chapter 1, there is no doubt that biolinguistics is a profound body of doctrine that has unearthed some of the principles underlying an aspect of nature. As Reinhart (2006, 22) observes, Chomsky's definition of "Attract" also combines ideas of last resort and least effort: K attracts F if F is the closest feature that can enter into a checking relation with a sublabel of K. This time the combination is implemented specifically for the aspect of nature under investigation, in a different scientific "continuum" obtaining since Pāṇini (section 1.3.2). It is certainly a law of nature if valid, but it does not apply to particles or planetary motions. As far as we can see, this aspect of nature is somehow located in the human brain and not in the joints of the knee; it is also conceivable that the laws of physics (ultimately) apply to the human brain. Thus, one is perfectly justified in using the bio in biolinguistics. Nonetheless, currently there is nothing in the formulation of CHL that requires that CHL cannot be located in knee joints.
Therefore, even if narrow physical channels have influenced the evolution of the brain, as with much else in nature (Cherniak 2005; Carroll 2005), it has not been shown that the influence extends to the design of the language faculty. In fact, it is unclear what there is to show: "What do we mean for example when we say that the brain really does have rules of grammar in it? We do not know exactly what we mean when we say that. We do not think there is a neuron that corresponds to 'move alpha'" (Chomsky, Huybregts, and Riemsdijk 1982, 32). Chomsky made this remark over a quarter of a century ago, but the situation does not seem to have changed in the meantime. For example, over a decade later, Chomsky (1994b, 85) observed that "the belief that neurophysiology is even relevant to the functioning of the mind is just a hypothesis." Several years later, he continued to hold, after an extensive review of the literature, that "I suspect it may be fair to say that current understanding falls well short of laying the basis for the unification of the sciences of the brain and higher mental faculties, language among them, and that many surprises may lie along the way to what seems a distant goal" (Chomsky 2002, 61). In general, the problem of unification between "psychological studies" such as linguistics and biology is as unresolved today as it was two centuries ago (Chomsky 2001b). The "locus" of the problem continues to lie with biology and the brain sciences (Chomsky 1995b, 2). To insist on some unknown biological basis for the actual operations and principles contained in CHL is to miss the fact that, with or without biology, the theory of CHL already uncovers an aspect of nature in its own terms. To sum up, while the Galilean idea is a guide to science, nothing of empirical significance follows from the idea itself. We need to find out, for each specific system, how the idea is implemented there—if at all, because it cannot be assumed that every aspect of nature can be subjected to the Galilean form of inquiry. In that sense, it has been a groundbreaking discovery that the principles of CHL implement the Galilean idea in the human faculty of language. As far as I can see, the only empirical issue at this stage of inquiry is whether these principles specifically apply somewhere else. It is like asking if a principle of least effort witnessed in water flow extends, after suitable abstraction, to all fluids (liquids and gases).12 I have proposed that the economy principles postulated in the Minimalist Program to describe the functioning of CHL also apply to music under the suggested abstractions; perhaps they apply to the rest of the systems in the hominid set. In other words, the suggestion is that the economy principles of language ("water") extend to a restricted class of systems ("fluids").
7.3.2 Scope of Computationalism
What, then, are the restrictions on the domain(s) in which the economy principles of language apply? To recall, PCEs, as formulated in biolinguistics, are principles of computational efficiency. I will suggest that the notion of computation, and its intimate connection with the notion of a symbol system, enables us to generalize from language to a restricted class (the hominid set), and that the notion of computation, strictly speaking, is restricted to just this class. The notion of computation thus characterizes a (new) aspect of nature. Following some of the ideas in Turing (1950), the reigning doctrine in much of the cognitive sciences—for over half a century by now—is that cognitive systems are best viewed as computational systems (Pylyshyn 1984); the broad doctrine could be called "computationalism." I have no space here to trace the history of the doctrine and its current status (see Fodor 2000). Basically, once we have the mathematical theory of computation, any device whose input and output can be characterized by some mathematical function or other may be viewed as an instance of a Universal Turing machine, given the Church-Turing thesis (Churchland and Grush 1999, 155). Since brains no doubt are machines that establish relations between causes and effects (stimulus and behavior), brains are Turing machines. However, as Churchland and Grush immediately point out, this abstract characterization of brains as computational systems, merely by virtue of the existence of some I/O function, holds little conceptual interest. Any system that functions at all could be viewed as a computational system: livers, stomachs, geysers, toasters, solar systems, and so on—and, of course, computers and brains. Turing (1950) made a narrower and more substantive proposal. The proposal was articulated in terms of a thought experiment—the "Turing Test." The test consists of an interrogator A, a man B, and a woman C. On the basis of a question-answer session, A is to find out which one of B and C is the woman. Almost in passing, Turing imposed two crucial conditions: (i) A should be in a separate room from B and C, and (ii) a teleprinter is to facilitate communication between the players "in order that tones of voice may not help the interrogator." Turing asked: what happens when a Turing machine in the form of a digital computer takes the place of B? Turing's proposal was that if A's rate of success (or failure) in identifying the gender of B does not change significantly when the human is replaced by a computer, we should not hesitate to ascribe thinking to the computer.
As the conditions of the thought experiment show, the issue of whether a digital computer is "intelligent" was posed in terms of whether it can sustain a humanlike discourse (Michie 1999). To enforce this condition, Turing deliberately "screened off" the computer from its human interlocutor and allowed the "discourse" to take place only with the help of a teleprinter. Turing's insight exploited a central aspect of Turing machines: Turing machines are symbol manipulators (and nothing else).13 The pointer of a Turing machine moves one step at a time on a tape of blank squares to either erase or print a symbol at designated locations. In doing so, it can compute all computable functions in the sense that, given some interpretation of the input symbols, it can generate an output that can be interpreted as a sequence with a truth value. Thus, after some operations, suppose the tape has a sequence of eight strokes ("|"). In a machine with different "hardware," these could be sequences of lights. When these strokes are suitably spaced on the tape, the sequence can be given the arithmetic interpretation 2 + 2 = 4. Abstracting away from the particular design of the machine and the mode of interpretation enforced on it, the basic point is that computation takes place on things called "symbols," where a symbol is a representation that has an interpretation.
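The tape-and-symbol picture can be made concrete with a toy machine in this spirit. The unary encoding (writing 2 + 2 as "||+||"), the state names, and the halting convention below are my own illustrative choices, not Turing's original formalism:

```python
# A toy symbol manipulator in the spirit of a Turing machine: a tape of
# symbols, a head position, a state, and local read/write/move operations.
# '||+||' encodes 2 + 2 in unary; the machine leaves '||||', i.e., 4.

def unary_add(tape_str):
    tape = list(tape_str)
    state, head = "seek-plus", 0
    while state != "halt":
        if state == "seek-plus":
            if tape[head] == "+":
                tape[head] = "|"        # print: fuse the two numerals
                state = "seek-end"
            else:
                head += 1               # move right over strokes
        elif state == "seek-end":
            if head == len(tape) - 1:
                tape[head] = " "        # erase the now-surplus stroke
                state = "halt"
            else:
                head += 1
    return "".join(tape).strip()

result = unary_add("||+||")
print(result, "=", len(result))         # '||||' = 4, under our interpretation
```

Nothing in the machine itself knows about numbers; the strokes count as a sum only under the interpretation we enforce on them, which is the point at issue.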
The notion of "interpretation" at issue is of course theory-internal (Chomsky 1995b, 10 n. 5). In the case of language and music, as noted, internal significance—satisfaction of legibility conditions for FLI and FMI systems, respectively—is that notion. The notions of computation and symbol systems are thus intimately related. From this narrower perspective, a system may be viewed as a computational system just in case it is a symbol system.14 If a computer is to be viewed as "intelligent," its symbolic operations ought to generate sequences that may be intuitively interpreted as signs of intelligence. It follows that, in this setup, the intuition of intelligence is restricted to only those systems whose operations may be viewed as representable in terms of the symbolic operations of the system. Suppose that the ability to carry on a conversation in a human language, which is a symbol system, is a sign of intelligence. Insofar as human languages are computable functions, they are in principle (effectively) representable in a Turing machine. Assume so. Under these conditions, we may think of the computer as participating in a humanlike discourse. Since the entire issue of "intelligence" hangs on the satisfaction of the preceding conditions, everything else that distinguishes humans from computers needs to be screened off. In effect, Turing's project was not to examine whether computers are intelligent (a meaningless issue anyway); he was proposing a research program to investigate whether aspects of human cognition can be explained in computational terms. The proposed connection between computation and symbolism gave rise to the computational-representational view of the mind, in which a "central and fundamental role" was given to rules of computation, which are "rules for the manipulation and transformation of symbolic representations" (Horgan and Tienson 1999, 724–725). By its very conception, the project is restricted to systems where the notion of symbolic representation applies, not just to I/O systems. In my view, a further restriction applies when we try to turn Turing's formal program into an empirical inquiry, because the connection between computation and symbolic representation places a stringent constraint on cognitive theories. Extremely sophisticated mathematical devices are routinely used in physics to describe the world. Thus, suppose a particular state of a system, such as some colliding particles, is described by a certain solution to a certain complex differential equation. As noted, the system can be described in computational terms. However, it is always possible to hold an "instrumentalist" view of the effort such that we do not make the further suggestion that the colliding particles are solving differential equations, or that snowflakes have "internalized" fractal geometry. For cognitive theories, the burden is greater. Particles do not "internalize" symbolic/mathematical systems; (natural) minds do. For example, we need to say that a human infant has internalized the rules of language. I am not suggesting that, in internalizing the linguistic system, the human infant has internalized G-B theory. What I am suggesting is that the human infant has internalized—"cognized" (Chomsky 1980)—a computational system which we describe in G-B terms, wrongly perhaps; that is the crucial distinction between a toaster/comet/snowflake and a human infant. In recent years, Chomsky has argued with force that our theories of nature are restricted to what is intelligible to us, rather than to what nature is "really" like (Chomsky 2000d; Hinzen 2006; Mukherji, forthcoming b); within the restrictions of intelligibility, we aim for the best theories. It does not follow that the notion of intelligibility applies in the same fashion in each domain of inquiry; what is intelligible in one domain may be unintelligible in the next. Thus, the motion of comets is captured in our best theory, which uses differential equations to make it intelligible to us why comets move that way; the theory would not be intelligible if it required that comets solve those equations.
For computational theories in cognitive domains, in contrast, our best theories make it intelligible to us why a child uses an expression in a certain way by requiring that the child is computing (explanatory adequacy); the theory would not be very intelligible otherwise, since it would fail to distinguish between the child and snowflakes. It follows that we genuinely ascribe computational rules only to those systems to which we can intelligibly ascribe the ability to store and process symbolic representations.15 We need to distinguish, then, between symbol-processing systems per se and our ability to describe some system with symbols—the latter obviously deriving from the former. Mental systems are not only describable in computational terms; they are computational systems. And the only way to tell whether a system is computational is to see whether we can view the system as a genuine symbol manipulator. Once we make the distinction, we may not want to view stomachs and toasters as computational systems, since it is hard to tell that these are symbol-processing systems themselves. A large variety of systems thus falls outside the range of computational systems understood in this narrow sense: systems of interacting particles, assembly of DNA, chemical affinity, crystal structures, and so on. As noted, the list possibly includes the visual system, which is a "passive" system; it is not a system of symbols at all. From this narrow perspective, it is not at all obvious that nonhuman systems, such as the system of insect navigation, also qualify as computational systems in the sense in which human language and music qualify.16 Charles Gallistel (1998) raised much the same problem, in my view. As Gallistel's illuminating review of the literature shows, sophisticated computational models have been developed to study the truly remarkable aspects of insect navigation, such as dead reckoning. How do we interpret these results? According to Gallistel, "A system that stores and retrieves the values of variables and uses those values in the elementary operations that define arithmetic and logic is a symbol-processing system" (p. 47). "What processes enable the nervous system," Gallistel asks, "to store the value of a variable . . . to retrieve that value when it is needed in computation?" "(W)e do not know with any certainty," Gallistel observes, "how the nervous system implements even the operations assumed in neural net models, let alone the fuller set of operations taken for granted in any computational/symbolic model of the processes that mediate behaviour" (p. 46). As noted in chapter 1, Chomsky (2001b) mentioned these remarks to illustrate the general problem of unification between biology and psychology. In my opinion, Gallistel's concerns cover more. As Gallistel observed, to say that a system consists of computational principles is to say that it is a symbol-processing system.
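Gallistel's talk of storing and retrieving the values of variables can be made vivid with the textbook model of dead reckoning (path integration), in which the animal is credited with a stored "home vector" updated at every step. The sketch below is just that textbook model; whether an insect's nervous system genuinely stores and retrieves such a variable is precisely what is at issue:

```python
import math

def path_integrate(steps):
    """Textbook dead reckoning: maintain a stored position variable (x, y),
    updating it with each step's compass heading (degrees) and distance;
    the 'home vector' is then read off that variable."""
    x = y = 0.0
    for heading, distance in steps:
        x += distance * math.cos(math.radians(heading))
        y += distance * math.sin(math.radians(heading))
    home_distance = math.hypot(x, y)
    home_heading = math.degrees(math.atan2(-y, -x)) % 360
    return home_distance, home_heading

# A meandering outbound run; the model 'ant' can still head straight home.
dist, heading = path_integrate([(0, 10), (90, 5), (45, 7)])
print(f"home vector: {dist:.1f} units at heading {heading:.0f} degrees")
```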
It is of some interest that Gallistel mentioned arithmetic and logic to illustrate symbol-processing systems. No doubt, the postulation of computational processes for, say, human language raises exactly the same problem with respect to the nervous system: we do not know with any certainty how the nervous system implements the operations assumed in any computational/symbolic model of the processes that mediate linguistic behavior. This is the familiar unification problem that arises in any case. But Gallistel does not seem to be raising (only) this problem; he seems to be particularly worried about insects. Gallistel may be questioning the intelligibility of the idea that insects are symbol processors, because their cognitive systems simply do not fit the paradigms of arithmetic and logic, not to mention language and music. How, then, do we decide that a certain system is not only describable in computational terms but may also be intelligibly viewed as a computational system? Keeping to organic systems, which aspects of an organism's behavior are likely to draw our intelligible computationalist attention? To recapitulate, we saw that the sole evidence for the existence of CHL is the unbounded character of a variety of articulated symbol systems used by humans. In particular, we saw that CHL is likely to be centrally involved in every system that (simultaneously) satisfies the three general properties of language: symbolic articulation, discrete infinity, and weak external control. Symbolic articulation and its structure indicate an inner capacity with the properties of unboundedness and freedom from external control. As the French philosopher René Descartes put it, "All men, the most stupid and the most foolish, those even who are deprived of the organs of speech, make use of signs," and signs are the "only certain mark of the presence of thought hidden and wrapped up in the body" (cited in Chomsky 1966, 6).17 In other words, we look for CHL when we find these properties clustering in the behavior of some organism. From this perspective, it is not at all clear what sense may be made of the proposal that the foraging behavior of animals displays properties of discrete infinity and weak external control, since these animals simply do not exhibit the required behavior in any domain. After an extensive review of the literature on the cognitive capacities of nonhuman systems, Penn, Holyoak, and Povinelli (2008) conclude that, for a significant range of human capacities, including communication with symbols and navigation with maps, there is not only absence of evidence in nonhuman systems but also evidence of absence. In this sense, the analogy between insects and humans is no more credible, for now, than that between humans and comets/snowflakes.
Insects may be "too imperfect," to use Descartes' phrase,18 but some other animals may be viewed as meeting the suggested criteria. For example, contemporary research on birdsongs shows that some species of birds are capable of producing songs of impressive variety and complexity, so these are certainly systems of articulation. The complexity of the songs and the method of acquisition suggest an inner drive that appears to be largely free from external control; suppose so. Thus, for researchers in this area, the only issue seems to be recursion (= discrete infinity) in nonhuman species. In an innovative experiment, Fitch and Hauser (2004) tested monkeys for their ability to "master" two different rule systems. It is important to be clear about the exact scope of this work. After noting that human languages have the "capacity to generate a limitless range of meaningful expressions from a finite set of elements," the authors wanted to see if cotton-top tamarin monkeys can be ascribed this capacity. "Weak" grammars that contain only "local organizational principles, with regularities limited to neighboring units" cannot represent this capacity. The capacity requires that the grammar incorporate "hierarchical structure": a "phrase-structure" grammar. To capture the distinction, they devised a rule system of the form (AB)ⁿ—a weak grammar—which generates iterative sequences like ABABAB. . . . The monkeys readily mastered this grammar. Fitch and Hauser devised another rule system of the form AⁿBⁿ—a "phrase-structure grammar"—which generated embedded structures like AAABBB . . . , that is, AB embedded in AB, AABB embedded in AB, and so on: central embedding. The monkeys failed to master this system. According to Fitch and Hauser, then, (i) mastering hierarchical structures is necessary for mastering human language, and (ii) monkeys cannot master hierarchical structures. It follows that monkeys cannot learn human languages. Although Fitch and Hauser do not make it perspicuous, it does not follow that, had the monkeys learned the AⁿBⁿ grammar, they would have learned human languages. In a more recent study, Timothy Gentner and colleagues studied some European starlings (Sturnus vulgaris) from the same direction (Gentner et al. 2006). They found that these songbirds could master a rule system of the form AⁿBⁿ after over forty thousand trials. Do these systems contain CHL and belong to the hominid set? The crucial issue is whether the starlings display symbolic articulation indicating the "capacity to generate a limitless range of meaningful expressions from a finite set of elements."
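The formal contrast between the two rule systems used in these experiments is easy to display. In the sketch below (an illustration of the grammars' extensions, not of the training procedure used in the experiments), recognizing (AB)ⁿ requires no memory beyond the parity of the current position, whereas recognizing AⁿBⁿ requires keeping count:

```python
def matches_ab_n(s):
    """(AB)^n, n >= 1: checkable position by position, no counting."""
    return (len(s) >= 2 and len(s) % 2 == 0
            and all(c == "AB"[i % 2] for i, c in enumerate(s)))

def matches_an_bn(s):
    """A^nB^n, n >= 1: the A's must be counted to match the B's."""
    n = len(s) // 2
    return len(s) >= 2 and len(s) % 2 == 0 and s == "A" * n + "B" * n

for s in ["ABABAB", "AAABBB", "AABBB"]:
    print(s, matches_ab_n(s), matches_an_bn(s))
# ABABAB True False / AAABBB False True / AABBB False False
```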
As Gentner et al. note, central embedding of more than one linguistic item is not the preferred form of productivity in human symbol systems; performance systems begin to collapse after two or three embeddings: the nurse [whom the cook [whom the maid met] saw] heard the butler (Miller and McNeill 1969, 707).19 Furthermore, as Chomsky (personal communication) suggests, the starlings could just be storing sequences of A's and B's up to the familiar limits on short-term storage (Miller 1956). If these observations are valid, both groups of researchers, Fitch-Hauser and Gentner et al., could have been looking at the wrong phenomenon to locate analogues of human language in other species. In any case, with respect to the issue of symbolic articulation, Gentner et al. note that "starlings sing long songs composed of iterated motifs (smaller acoustic units) that form the basic perceptual units of individual song recognition." It follows that starlings themselves do not produce hierarchically organized sequences. Also, it is well known that even elaborate birdsongs convey only global information, such as announcing the presence of a sexual partner or the guarding of territory.20 Unlike human language and music, individual units of birdsongs do not combine to generate more complex meaningful units. Peter Marler (2000, 36) calls this "phonocoding": "recombinations of sound components in different sequences where the components themselves are not meaningful." In other words, the phenomenon of phonocoding signifies the absence of computational meaning in birdsongs.21 I have made the following points about starlings: (i) their own productions (songs) do not have the telling features of human recursion leading up to computational meanings; (ii) the evidence of recursion in their perception of acoustic stimuli is far from conclusive; (iii) in particular, the absence of computational meaning in the production case suggests its absence in the perception case as well. Taking the cumulative effect of (i)–(iii), it is implausible that PCEs are operative in these systems, since the basic task of the PCEs is to generate computational meaning under optimal conditions. Since there is no compelling reason to attribute the relevant notion of a computational system to these organisms, it is currently implausible that their structural organization be viewed on a par with human linguistic and musical systems. Studies on birdsongs currently do not encourage extending the notion of a computational system to anything beyond the hominid set.
To review the central concerns of this chapter: on the one hand, it seems plausible to hold that Merge, the singular recursive mechanism, applies beyond language to a narrow class of symbol processors, especially to human music; on the other, it is doubtful that PCEs, the restricted set of principles of computational economy, apply beyond this narrow class. In effect, I have made two suggestions. First, each symbol processor, narrowly conceived, contains the same computational system (CHL). Second, it is currently not intelligible that computational systems, in the relevant sense, exist anywhere else in nature. If these suggestions jointly hold, then, strictly speaking, CHL, which replicates under reproduction in humans, is the only computational system in nature.
Notes
Chapter 1

1. Rgveda 4.58.3. Interpretation by T. R. V. Murthy; cited in Coward 1980, vii.

2. For some analysis with respect to the issues discussed in this work, see Matilal and Shaw 1985, Matilal 1990, Siderits 1991, Mohanty 1992, Ganeri 1999, Dasgupta, Ford, and Singh 2000.

3. For remarks on possible convergences between classical and contemporary inquiries in the Western tradition, see Chomsky 1966, 1972a, 1975, 1995a, 1997.

4. Interestingly, similar remarks apply to formal theories of music in either tradition. See Mukherji 2000, chapter 4, and McLain 1976 for some discussion and references. Developments in number theory provide another example; see Lakoff and Nunez 2000. Note that the remarks concern just these two traditions; there are others.

5. The original Sanskrit couplet is: Anaadinidhanam brahma sabdatattvam yad aksaram / Vivartate 'rthabhaavena prakriyaa jagato yataah (I.1). Bhartrhari's classic work Vaakyapadiya ("On Words and Sentences") begins with these words.

6. "At their politest, [neuroscientists] say that the gap between molecules and mind is so vast that it is absolutely fruitless to think about bridging the gap. At their worst, my colleagues warn me to stick to molecules, because it is the only hard-nosed approach" (James Schwartz, in Chomsky 1994b, 72).

7. See Leiber 1991 for a similar conception of cognitive science. See Seidenberg 1995 for a somewhat different conception.

8. The label mind/brain just signals that the unification problem between linguistics and the brain sciences remains unsolved, perhaps fundamentally so; see sections 1.3.1 and 7.3.2. I do not think the objection to this concept by Bennett and Hacker 2003 applies in the specific case of biolinguistic research, even if their objection holds for more questionable areas of cognitive neuroscience such as studies on consciousness, beliefs, and perception.

9. One piece of evidence is the immediate popularity of (persistent) attempts to show that even the grammar of languages is culturally determined. The latest effort is by Everett 2005; see Nevins, Pesetsky, and Rodrigues 2007 for criticism.
10. The structure is ambiguous in many ways; its resolution by language users touches on some deep aspects of language acquisition. For some discussion, see Chomsky 1965, 22, and Chomsky 2000a.

11. See Edelman 1992, 243, as well as Chomsky 2000d, 103–104, on Edelman. See Jenkins 2000, 52ff., for more examples.

12. This observation needs to be sharply distinguished from the a priori claim that, since mental properties are "nomologically autonomous," their study is "not part of the rest of science" (cited in Churchland and Sejnowski 1992, 2).

13. See Mukherji 1990 for a more general discussion of the issue.

14. "Roughly" because Fodor and Lepore's actual claim is more complicated. They claim that natural languages such as English do not have a semantics at all (Fodor and Lepore 2002); semantics, in the form of symbol-world connections, applies systematically to the "language of thought" (Fodor 2007). See Pietroski 2006 for discussion and criticism.

Chapter 2

1. Saul Kripke (2005, 1012) points out that, although the scope problem studied by linguists is claimed to be a "very recently noticed distinction," the basic idea goes back to Russell. Kripke's observation thus lends some support to the strategy adopted here.

2. See Chomsky 1955b for an early and illuminating discussion of this issue.

3. I am indebted to Howard Lasnik for going through the rest of this chapter carefully and for making many suggestions. Any mistakes that still remain are mine.

4. Phrase-structure rules do not automatically generate a labeled bracketing or a tree diagram. For this we need to define a "phrase marker" as a structure consisting of (i) a collection of partially ordered nodes with a top node uniquely defined, and (ii) a collection of labels that are assigned to the nodes. Now the rewriting rules may be suitably interpreted to fit this structure. For more on this and related issues, see Chomsky 1955a, chapter 7; 1965, 88–90; 1986, 56–58. Also see Martin and Uriagereka 2000, 2–6.

5. See Jackendoff 2002, 38–39, for a variety of very natural examples such as I wonder if Susan knows that Fred assured Lois that Clark would remind Pat to buy food for dinner. Also see Soames and Perlmutter 1979, 281.

6. There is some evidence that the head-complement structure of a language has phonological consequences in terms of which word of a phonological phrase has prosodic prominence. In effect, when babies are able to segment speech inputs into phonological phrases, they can hear the prominence and settle on the value of the head parameter for that language (Mehler, Christophe, and Ramus 2000, 62–63).

7. The best accessible discussion that I know of is in Uriagereka 1998, chapter 3, especially section 3.5.
8. "r" for referring expressions: the category also contains variables; wh-traces, as we will see, are paradigmatic examples of variables.

9. See Hornstein and Weinberg 1990 and Hornstein 1995 for more on this topic, especially the argument from "weak crossover" for some classic cases. See also Lasnik and Uriagereka 2005, chapter 6, for a more recent review.

10. In Hungarian, QPs and WPs in fact move overtly to the clause front, such that the syntax of the Hungarian equivalent of John loves Mary and John loves everyone is rather different overtly (for details, see Lasnik and Uriagereka 2005, 6.2).

11. See Roeper and de Villiers 1993 for more on this from the point of view of language acquisition.

12. For Fox (and a range of linguists he cites), the semantic component is described by truth-theoretic semantics. In what follows, the theoretical salience of this conception of a post-LF semantic component is precisely the issue under discussion.

Chapter 3

1. See Heim and Kratzer 1998, 47–53, for an illuminating discussion of this issue. It is not clear to me what lesson they draw regarding the syntax-semantics divide, since their concept of interpretability is explicitly drawn from logical theory.

2. In contrast, Hinzen (2006, 221) grants full semantic status to LF: it is a level of representation in the linguistic system "in which semantic information is coded by means of narrowly linguistic principles and constraints."

3. I am aware of major internal differences between Montague Semantics and Davidsonian Semantics. See Lepore 1982 for an influential discussion of this topic. According to Lepore, Montague Semantics, unlike Davidsonian/Structural Semantics, fails to be an "absolute" theory of truth that explains language-world connections. Thus, I concentrate on Davidsonian Semantics when discussing these (purported) connections; I turn to Montague Semantics when discussing mind-internal properties of expressions. Hence, I cover both eventually.

4. I am not suggesting that Chomsky and Hornstein approve of this addition; they do not.

5. It is not a downright truism because of vacuous words such as pleonastic it and there, vacuous inflections on words, and "frozen phrases" such as idioms.

6. I am setting aside the issue of whether the phonological form of s suffices as a structure description of it.

7. I cannot think of other ways of formulating the right-hand side with "LFs" occurring in it.

8. I set aside the issue of whether the predicate in question has enough constraints to match a sentence with its "own" circumstance.

9. Both Frege and, as noted, Tarski were rather strongly opposed to the use of formal logic to characterize "our mother tongues."
10. See Higginbotham 1986, 1991, for a different angle on the idea. See Lappin 1997 for a more comprehensive list.

11. Dictionaries I could lay my hands on are of little help; denote is simply paired with terms that require exactly the same clarification: designate, indicate, mean, signify, refer, and the like.

12. This argument, first suggested by Michael Dummett to my knowledge, is different from the Davidsonian objection (Lepore 1982) that one may know "John denotes what is denoted by John" and fail to know what John denotes. I am suggesting that one fails to know what John denotes even if one has the linguistic knowledge John denotes John.

13. We need not assume that Chomsky was making a serious empirical proposal. All he could have meant was that, in order to account for this universal ability displayed by children, we need to postulate something like the Peano axioms.

14. Current followers of Montague obviously differ with him on this issue. See Partee 1979, 1980, 1992, for more.

15. Szabo (2005) holds that the so-called uniqueness condition in the theory of descriptions is not only misleading (as I will also argue from a different direction), but that Russell never really gave any substantive argument in support of this condition.

16. Russell's enigmatic 1905 paper raises many issues. I am only concerned with its application to uses of English the-phrases. Even there, there are many objections to Russell's theory within this concern that I will not touch. See Schiffer 2005 for a recent revival of Keith Donnellan's objections from referential uses (Donnellan 1966); also Mukherji 1987, 1989. I will focus on the fundamental equivalence.

17. For instance, in a recent special issue of the journal Mind (vol. 114, no. 456) devoted to Russell 1905, Stephen Neale, Saul Kripke, David Kaplan, and Nathan Salmon all simply take the uniqueness condition for granted and proceed to extremely complex theoretical discussion.

18. "More" includes data such as all the boys, some of the boys, most of the boys, exactly one of the boys, and so on, but not *that the boy.

19. Thanks to Norbert Hornstein (personal communication) for raising this issue.

20. Hornstein (1984, chapter 3) gave some grammatical support to this distinction via his distinction between Type I and Type II quantifiers. Type I quantifiers include names, descriptions, any, a certain, and so on; Type II quantifiers include a, some, every, and so forth. The distinction was essentially based on whether QR (quantifier raising) applies or not. Since Hornstein rejected QR on minimalist grounds later (Hornstein 1995), as noted (section 3.3), the distinction is no longer motivated in those terms.

21. Siderits (1998, 507) observes that, according to the Navya-Nyaya school of Indian philosophy, universal quantifiers are always used to mention a number: "To say 'all crows are black' is to say that everything qualified by crowness is also qualified by a particular number, namely the number of crows, and that everything qualified by that number-particular is in turn qualified by blackness. We, of course, do not know the number of this totality of crows, but Nyaya is quite confident that there is such a number, since otherwise we should be unable to refer to all the crows"!

22. The conclusion probably extends to believes that in the de dicto sense (Ray 2007).

Chapter 4

1. The assumption is questionable in view of data such as many (of his) arrows didn't hit the target / the target wasn't hit by many (of his) arrows (Lasnik and Uriagereka 2005, 181). Perhaps we can question the very idea of (large-scale) synonymy for natural languages (Hinzen 2006).

2. I am not counting his (often lengthy) critical remarks on theories of meaning proposed by others, especially the philosophers (Chomsky 1975, 1980, 1986, 1994b, 2000d).

3. Ormazabal (2000) suggests that the feature +/−animate plays a significant role in syntax—for example, the presence of this feature in tandem with the Minimal Link Condition helps explain a variety of raising of nominals. As we will see, part of this phenomenon is typically explained in terms of feature checking of syntactic Case. If correct, Ormazabal's thesis goes some way in eliminating the Case system from the syntactic component. The move seems natural to Ormazabal because "unlike syntactic Case, which is a theory-internal construct, features like +/−animate . . . survive to one of the interface levels" (p. 254).

4. See Fodor and Lepore (1994, 146–147) for remarks on the answer calculated John.

5. See Lidz, Gleitman, and Gleitman 2004 for review and implications for language acquisition.

6. Thanks to Wolfram Hinzen (personal communication) for pointing this out.

7. Ascription of rules to insects and other nonhuman organisms, and to domains such as face recognition, gives rise to other conceptual problems that I discuss later in the work (chapter 7). I will ask: What does it mean to view face recognition, as contrasted with language, as a rule-governed behavior in the first place? Kripke is not raising this specific problem about nonlinguistic domains. He objects to the idea of rule following anywhere, especially in the case of language.

8. There is some evidence that children with Specific Language Impairment (SLI) have difficulty in mastering regular tense morphology, but have fewer problems with irregular verbs (Leonard 1998, 59–66).

9. These include the claims that semantic decomposition is exhaustive, gives the necessary and sufficient conditions for the meaning of a word, requires a distinction between semantic knowledge and encyclopedia, and so on.

10. It is a legitimate question whether this notational move has any explanatory value, or whether it is just a representation of data. I am setting such problems aside for now because the issue of the (explanatory) salience of a notational scheme arises only after a coherent scheme has been found. It is unclear if that stage has been reached for bachelor.

11. I consider Fodor's criticism of a variety of other approaches in lexical semantics to be decisive (Fodor 1998). It is interesting that Murphy (2002), which is supposed to be a comprehensive document on recent empirical work on concepts, does not address any of Fodor's objections although Fodor (1998) is listed in the bibliography. As noted, I also fail to attach explanatory significance to Fodor's own Old Testament Semantics (see sections 1.3.3 and 3.5.1 of this book).

12. Apart from the assumed SM and C-I interfaces, language is certainly interfacing with the visual and other perceptual systems, and the system of emotions.

13. Thanks to Norbert Hornstein (personal communication) for phrasing the current scene like this. Also see Baker 2003.

Chapter 5

1. If this evidence holds, then the observation in Lasnik and Uriagereka 2005, 2, that CHL was an "aspect" in the classical theories in generative grammar is misleading. It misses the radical departure proposed in MP.

2. Thanks to Norbert Hornstein and Bibhu Patnaik for a number of suggestions on this section. Needless to say, I am responsible for the remaining mistakes.

3. The sketch of MP presented here is obviously sanitized; it ignores the turbulent debates over each G-B principle, spanning over a decade, through which an outline of MP slowly emerged. In fact, some G-B principles were found to be wrong, not just dispensable. X-bar theory could be one of them. The presence of SVO structures in supposedly SOV languages, the striking example of polysynthetic languages (Baker 2001), and so on suggested that X-bar may not even be true. Thanks to Wolfram Hinzen (personal communication) for raising this. Also, see the illuminating discussion of phrase-structure theory in Hinzen 2006, 180–193.

4. Formally, it could be b as there is nothing to distinguish between a and b at the first step of derivation since both are LIs. We suppose that one of the products of Merge will be "deviant" and thrown out. Notice the problem disappears after the first step for the concerned lexical items (Chomsky 2006a); it will reappear if two new LIs are to be merged, as in the tree below (figure 5.3). (See the sketch following note 8 below.)

5. See Boeckx 2006, chapter 5, for a glimpse of the range of controversies.

6. The postulation of unambiguous paths has a variety of consequences for virtually every component of the system. Chomsky (2006c) observes that this "limitation" could be involved in minimal search within a probe-goal framework, the requirement of linearization, conditions of predicate-argument structure, and others. (See below.)

7. This is not to trivialize the complex problem of how linear order is in fact obtained in sound—that is, outside CHL.

8. Thanks to David Pesetsky (personal communication) for help.
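The labeling indeterminacy described in note 4 above can likewise be pictured with a toy sketch. The code is mine and purely illustrative (none of the names are Chomsky's): it treats Merge as unordered binary combination with a label projected from one of the two constituents.

```python
# A minimal sketch of (new) Merge as unordered binary set formation, per
# note 4: Merge(a, b) = {a, b}, with the label projected from one of the
# two merged objects. Illustrative only; names and representation are mine.

def merge(a, b, head):
    """Merge two syntactic objects; 'head' supplies the label (projection).

    At the first step of a derivation both a and b are lexical items, so,
    as note 4 observes, nothing yet distinguishes them as label-giver:
    either choice is formally possible, and one product is 'deviant'.
    """
    assert head in (a, b), "the label must come from one of the merged objects"
    return {"label": head["label"], "members": (a, b)}  # set-like: order immaterial

the = {"label": "D", "members": None}  # lexical item
boy = {"label": "N", "members": None}  # lexical item

dp = merge(the, boy, head=the)  # the conventional choice: D projects
np = merge(the, boy, head=boy)  # formally available at step one; 'deviant'
print(dp["label"], np["label"])  # D N
```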
9. Although I have cited from a recent paper, the general idea appears repeatedly in Chomsky 1995b.

10. Hauser 2008 finds some indirect support for this view from cross-species studies.

11. See Reinhart 2006 for differences in computational mechanisms within the (narrow) language system and the systems just outside it leading to complicated "interface strategies."

12. One consequence, among many, is that the operation (External) Merge can no longer be viewed as taking place at predesignated θ-positions.

13. If we view the system of I-meanings, which enter into computation (chapter 4), as another layer before classical C-I, the conception of FL would be even broader.

14. Chomsky (1995b, 261) does propose alternative formulations for a principle of last resort. But this was largely a rhetorical device to reject them all.

15. What follows is a quick summary of Mukherji 2003a; see this paper for details. I am indebted to Taylor and Francis Publishers for permission to use the material.

16. There seem to be some exceptions, involving long-distance agreement, to this general phenomenon, currently under study; see Chomsky 2006c for a powerful explanation.

Chapter 6

1. Bickerton also includes "self-consciousness" as a part of hominization.

2. See Mukherji 2000, chapter 4, for criticism of popular impressions in this area.

3. I am setting aside the complex phenomena of singing and listening to songs. Songs seem to require simultaneous access to linguistic and musical systems. Introspective evidence is too thick for determining the nature of "trade-offs" between the two systems.

4. I am indebted to Roberto Casati for raising this general issue.

5. By "passive" I do not mean that the visual system is an inactive receptor of stimuli; it is well known that perceptual systems actively participate in the formation of visual representations—addition of "depth" in a 2½-D sketch, for example.

6. I am indebted to Lyle Jenkins (personal communication) for this example and references.

7. This assumes for now that the visual system is a computational system, which I doubt (see chapter 7).

8. Notice that SMH does not require that the systems of language and music access CHL in the same way or to the same extent. Plainly, language and music are very different cognitive systems geared to process and store different kinds of information. These and other system-specific properties are likely to influence their conditions of access to CHL and developmental patterns. In turn, these differences might explain cases of selective impairment of language and music (Peretz 2001; Peretz and Coltheart 2003; Schellenberg and Peretz 2007).

9. See Mukherji 2000, chapter 4, for a discussion of some other objections against the musilanguage hypothesis.

10. In distributed morphology, as we saw, semantic features such as +human, +count, +animate, and the like do not even enter the syntactic computation to LF (Harley and Noyer 1999).

11. Curiously, Fitch, Hauser, and Chomsky 2005, who favor the view that language and music are distinct domains, do not rule out the possibility that music may have "purely syntactic aspects of constituents such as complementizers, auxiliaries, or function words" (p. 203).

12. I have heard serious philosophers of music talk about heavyhearted resoluteness. Others hold "persona" theories in which a given piece of music serves as a prop for make-believe behavior as if someone is angry, joyous, sad, and so on; see Walton (1993) for one proposal in this direction. A detailed discussion of this amazing proposal is beyond the scope here. But the following remark from Wittgenstein 1980, 69e may be instructive: "Are we supposed to imagine the dance, or whatever it may be, while we listen? . . . If seeing the dance is what is important, it would be better to perform that rather than the music. But that is all misunderstanding."

13. Visit the site to listen to the music.

14. In the literature on aesthetics and philosophy of music, there is a good deal of dispute as to how the relation between music and its affect is to be formulated: as the meaning of music, the expression of music, or simply as what music is expressive of; see Davies 1994.

15. Scruton also invokes Frege's conception of meaning in the context of musical understanding. He had brought in Frege earlier in Scruton 1983, which I had criticized in detail in Mukherji 2000, 105–111.

16. A drone instrument, such as a taanpuraa, is used as a support because Indian classical music has no fixed pitch. I have used Western idioms such as tonic, dominant, etc. only for suggestive comparison. There is much controversy as to whether these terms are accurate for Indian music.

17. Visit http://www.youtube.com/?v=SQYuQnTTN90 for a performance of aalaap in Yaman. Also go to http://www.youtube.com/?v=cENz3lPRcPU for a lucid introduction to the raaga.

18. I have omitted all "ornamentations," variations in duration of notes, emphasis, and the like. Sometimes these things are crucial for determining the form of a raaga.

19. Note that the present task is to show how musical progression meets the informal criteria of recursion suggested by Hauser, Chomsky, and Fitch 2002. Later, in section 7.1, we will see a more explicit hierarchical organization of music due to Pesetsky 2007.

20. I am indebted to Krishna Bisht, professor of music (University of Delhi) and a well-known artist of Hindustani classical music herself, for help with aalaap in Yaman.
Chapter 7

1. Chomsky's "third factor" consists of a variety of other things, such as resources for analyzing perceptual data; I set them aside. I am also setting aside the FLI systems from the picture since it is unclear if they belong to FL proper.

2. Cherniak's leading idea is that the "body plan" of biological systems often exhibits a preference for "least-path" organization, such as the location of the eye and the brain near the front of the body axis.

3. I am setting aside possible independence between language and arithmetic witnessed in some cases of lesion resulting in hemispheric dissociation (Varley et al. 2005). As Chomsky (2006c) rightly points out, their significance is uncertain in view of the competence-performance distinction.

4. Pesetsky explains: "V = 'five' (i.e., a chord built on the fifth note of the scale, called the dominant), I = 'one' (a chord built on the first note of the scale, called the tonic). The subscripted arabic numerals following the Roman numeral indicate which note of the chord is the lowest note heard, and whether there are any pitches in the chord besides the three pitches of the basic triad. This is traditional chord notation." Pesetsky's ingenious proposal is currently available only in the form of an enigmatic handout. Many details are currently missing. See also Katz and Pesetsky 2009.

5. Pesetsky also suggests that the "hierarchical organization of rhythmic beats . . . is the product of a distinct system that provides an additional kind of organization (perhaps absent in certain 'free' styles, e.g., chant or recitative)."

6. I am not saying of course that sound is ancillary to music as well. However, we need to explain the fact that Beethoven composed his later music while he was totally deaf.

7. I will use the more general notion of tonal space, rather than the specific notion of pitch space, because not all music, such as Indian music, has fixed pitch.

8. A part of the why-question can be addressed in terms of a list of what Lerdahl and Jackendoff (1983, chapter 9) call "well-formedness rules" and "preference rules," which are essentially attempts to formalize facts about prolongational-reduction trees such that application of these rules to specific pieces of music will generate the trees.

9. In contrast, Vijaykrishnan (2007, chapter 8) suggests that all of individual pitches, scales, raagas and their distinguishing motifs, among other things, constitute the lexicon of Carnatic music. There are at least two problems. First, scales, motifs, familiar tunes, and the like are themselves complex SOs built out of simpler elements. Which combinatorial system constructs these objects? Second, as his earlier elegant discussion (2007, 87–88) brings out, traditional claims of 22 microtones for Carnatic music notwithstanding, this form of music essentially uses the familiar 12-semitone system per octave used in much of Western classical music. However, he goes on to suggest that the additional microtones in Carnatic music are "cognitively" constructed in the system as raaga-specific requirements are enforced (chapter 8). It follows that there is a clear division of labor between a largely universal set and the one enforced by the raaga-system in a specific tradition. To dump everything under "lexicon" misses this crucial architectural feature. See Mukherji 2009.

10. From this fact, Pesetsky (2007) concludes that music has no lexicon! The conclusion seems unwarranted. If there is no lexicon, what is Merge merging?

11. Apart from Chomsky's thoughts on these issues already cited, see Stewart 1995 and 2001, Flake 1998, Leiber 2001, Jenkins 2000, Uriagereka 1998, Martin and Uriagereka 2000, Piattelli-Palmarini and Uriagereka 2004, and Carroll 2005 for detailed case studies from physics, chemistry, molecular biology, and ethology that bear on biolinguistics. Much of this work mentions d'Arcy Thompson 1917 and Alan Turing 1952 as starting points; some trace it to Goethe 1790.

12. Fluid of course is a technical term entrenched in theory: a "perfect fluid" has zero viscosity.

13. Another closely related aspect is that Turing machines are, in an abstract mathematical sense, "mechanical" devices.

14. I assume that not every symbol system, especially the finitary ones, is a computational system; only unbounded (recursive) ones are.

15. It is another story why this obvious restriction on computationalism got sidetracked in the brief history of the cognitive sciences.

16. I will only make some breezy remarks on what seem to me to be the immediate worries. Thus I will not discuss the (neural) cost of having a computational system, the possible distinction between computational and "causal" explanations, the "Saussurean arbitrariness" of sign systems, and the like, not to mention the ubiquitous problem of "rule following."

17. See Mukherji 2000 for an attempted reconstruction of the doctrine of dualism from this direction; also see Mukherji, forthcoming a.

18. Regarding the issue of whether nonhuman animals have the ability to use signs, Descartes' worry was that "there is no reason to believe it of some animals without believing it of all, and many of them such as oysters and sponges are too imperfect for this to be credible" (cited in Leiber 1991).

19. In my opinion, the example that Boeckx (2006, 31) wants us to "imagine" is even worse: anti-anti-anti-missile missile missile missile.

20. This is not true for (some) vocalizing primates (Cheney and Seyfarth 1999), although, unlike birdsongs, primate calls exhibit very little internal structure.

21. In his response to Gentner et al., Marc Hauser points out that the mere presence, if any, of recursion in songbirds does not imply that songbirds can acquire human languages; this is because human languages crucially involve meanings (reported in "Songbirds Learn Grammar," The Hindu, April 27, 2006). This obvious problem could have been raised before Hauser and colleagues themselves embarked on the study of tamarin monkeys to find out if these animals have the capacity to learn human languages.
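The formal contrast driving note 21 can be stated compactly. Fitch and Hauser 2004 tested tamarins on strings of the form (AB)^n, which a finite-state device can handle, against strings of the form A^nB^n, whose recognition requires supra-finite-state (recursive) resources; Gentner et al. 2006 report starlings succeeding on the latter. A minimal sketch of the two patterns (illustrative only; the function names are mine):

```python
# The two string patterns at issue in note 21 (Fitch and Hauser 2004;
# Gentner et al. 2006): (AB)^n is finite-state iteration, while A^nB^n
# is generated by a recursive, context-free rule. Sketch is mine.

def ab_n(n):
    """(AB)^n: simple iteration, no embedding required."""
    return "AB" * n

def a_n_b_n(n):
    """A^nB^n, generated by the recursive rule S -> A S B | empty."""
    if n == 0:
        return ""
    return "A" + a_n_b_n(n - 1) + "B"

for n in range(1, 4):
    print(ab_n(n), a_n_b_n(n))
# AB AB
# ABAB AABB
# ABABAB AAABBB
```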
References
Austin, J. 1961. Performative utterances. In J. Austin, Philosophical Papers, 233–252. Oxford: Oxford University Press.

Austin, J. 1962. How to Do Things with Words. Ed. J. Urmson. Oxford: Oxford University Press.

Baker, M. 2001. The Atoms of Language: The Mind's Hidden Rules of Grammar. New York: Basic Books.

Baker, M. 2003. Linguistic differences and language design. Trends in Cognitive Sciences 7(8): 349–353.

Barbosa, P., D. Fox, P. Hagstrom, M. McGinnis, and D. Pesetsky, eds. 1998. Is the Best Good Enough? Optimality and Competition in Syntax. Cambridge, MA: MIT Press.

Barwise, J., and J. Perry. 1983. Situations and Attitudes. Cambridge, MA: MIT Press.

Bennett, M., and P. Hacker. 2003. Philosophical Foundations of Neuroscience. Oxford: Blackwell Publishers.

Berwick, R., and N. Chomsky. 2009. Poverty of the stimulus revisited: Recent challenges reconsidered. Ms., MIT.

Bhartrhari. Vaakyapadiya. Ed. K. Abhyankar and V. Limaye. Poona: University of Poona, 1965.

Bickerton, D. 2000. Can biomusicology learn from language evolution studies? In N. Wallin, B. Merker, and S. Brown, eds., The Origins of Music, 153–163. Cambridge, MA: MIT Press.

Bilgrami, A., and C. Rovane. 2005. Mind, language, and the limits of inquiry. In J. McGilvray, ed., The Cambridge Companion to Chomsky, 181–203. Cambridge: Cambridge University Press.

Block, N. 1995. The mind as the software of the brain. In E. Smith and D. Osherson, eds., An Invitation to Cognitive Science: Thinking, vol. 3, 377–425. Cambridge, MA: MIT Press.

Bloom, P. 2000. How Children Learn the Meanings of Words. Cambridge, MA: MIT Press.
Boeckx, C. 2006. Linguistic Minimalism: Origins, Concepts, Methods, and Aims. Oxford: Oxford University Press.

Boghossian, P. 2007. Explaining musical experience. In K. Stock, ed., Philosophers on Music: Experience, Meaning, and Work. Oxford: Oxford University Press.

Bošković, Ž. 2007. On the locality and motivation of Move and Agree: An even more minimal theory. Linguistic Inquiry 38(4): 589–644.

Bromberger, S., and M. Halle. 1991. Why phonology is different. In A. Kasher, ed., The Chomskyan Turn, 56–77. Oxford: Blackwell.

Bronchti, G., P. Heil, R. Sadka, A. Hess, H. Scheich, and Z. Wollberg. 2002. Auditory activation of "visual" cortical areas in the blind mole rat (Spalax ehrenbergi). European Journal of Neuroscience 16: 311–329.

Brown, S. 2000. The "musilanguage" model of music evolution. In N. Wallin, B. Merker, and S. Brown, eds., The Origins of Music, 271–300. Cambridge, MA: MIT Press.

Buchanan, R., and G. Ostertag. 2005. Has the problem of incompleteness rested on a mistake? In S. Neale, ed., 100 Years of "On Denoting." Special issue. Mind 114(456): 889–914.

Cao, T., ed. 1997. Conceptual Developments of 20th Century Field Theories. Cambridge: Cambridge University Press.

Carey, S. 1979. A case study: Face recognition. In E. Walker, ed., Explorations in the Biology of Language, 175–202. Cambridge, MA: MIT Press.

Carroll, S. 2005. Endless Forms Most Beautiful. New York: Norton.

Casati, R., and A. Varzi. 1999. Parts and Places: The Structures of Spatial Representation. Cambridge, MA: MIT Press.

Cheney, D., and R. Seyfarth. 1999. Mechanisms underlying the vocalizations of nonhuman primates. In M. Hauser and M. Konishi, eds., The Design of Animal Communication, 629–644. Cambridge, MA: MIT Press.

Cherniak, C. 2005. Innateness and brain-wiring optimization: Non-genomic nativism. In A. Zilhão, ed., Evolution, Rationality and Cognition, 103–112. New York: Routledge.

Cherniak, C., Z. Mokhtarzada, and U. Nodelman. 2002. Optimal-wiring models of neuroanatomy. In G. Ascoli, ed., Computational Neuroanatomy: Principles and Methods, 71–82. Totowa, NJ: Humana Press.

Chiat, S. 1986. Personal pronouns. In P. Fletcher and M. Garman, eds., Language Acquisition: Studies in First Language Development, 339–355. Cambridge: Cambridge University Press.

Chierchia, G., and S. McConnell-Ginet. 2000. Meaning and Grammar: An Introduction to Semantics. 2nd ed. Cambridge, MA: MIT Press.

Chomsky, N. 1955a. The Logical Structure of Linguistic Theory. Doctoral dissertation, University of Pennsylvania. Most of a 1956 revision was published under the same title in 1975. New York: Plenum Press.
Chomsky, N. 1955b. Logical syntax and semantics: Their linguistic relevance. Language 31(1–2): 36–45.

Chomsky, N. 1957. Syntactic Structures. The Hague: Mouton.

Chomsky, N. 1959. Review of B. F. Skinner's Verbal Behavior. Language 35: 26–58.

Chomsky, N. 1965. Aspects of the Theory of Syntax. Cambridge, MA: MIT Press.

Chomsky, N. 1966. Cartesian Linguistics. New York: Harper and Row.

Chomsky, N. 1972a. Language and Mind. 2nd ed. New York: Harcourt Brace Jovanovich.

Chomsky, N. 1972b. Problems of Knowledge and Freedom. New York: Pantheon.

Chomsky, N. 1972c. Remarks on nominalization. In Studies on Semantics in Generative Grammar, 11–61. The Hague: Mouton.

Chomsky, N. 1975. Reflections on Language. New York: Pantheon.

Chomsky, N. 1977. Essays on Form and Interpretation. Amsterdam: North-Holland.

Chomsky, N. 1980. Rules and Representations. Oxford: Blackwell.

Chomsky, N. 1981. Lectures on Government and Binding. Dordrecht: Foris.

Chomsky, N. 1986. Knowledge of Language. New York: Praeger.

Chomsky, N. 1988. Language and the Problem of Knowledge: The Managua Lectures. Cambridge, MA: MIT Press.

Chomsky, N. 1991a. Linguistics and adjacent fields: A personal view. In A. Kasher, ed., The Chomskyan Turn, 1–25. Oxford: Blackwell.

Chomsky, N. 1991b. Linguistics and cognitive science: Problems and mysteries. In A. Kasher, ed., The Chomskyan Turn, 26–53. Oxford: Blackwell.

Chomsky, N. 1991c. Some notes on economy of derivation and representation. In R. Freidin, ed., Principles and Parameters in Comparative Syntax. Cambridge, MA: MIT Press. Reprinted in Chomsky, The Minimalist Program, 129–166. Cambridge, MA: MIT Press, 1995.

Chomsky, N. 1993. A Minimalist Program for linguistic theory. MIT Working Papers in Linguistics. Reprinted as chapter 3 in Chomsky, The Minimalist Program, 167–217. Cambridge, MA: MIT Press, 1995.

Chomsky, N. 1994a. Bare phrase structure. MIT Working Papers in Linguistics. Reprinted in G. Webelhuth, ed., Government and Binding Theory and the Minimalist Program, 383–439. Cambridge, MA: MIT Press, 1995.

Chomsky, N. 1994b. Language and Thought. London: Moyer Bell.

Chomsky, N. 1995a. Language and nature. Mind 104: 1–61.

Chomsky, N. 1995b. The Minimalist Program. Cambridge, MA: MIT Press.

Chomsky, N. 1997. Language and mind: Current thoughts on ancient problems, Parts 1 and 2. Reprinted in L. Jenkins, ed., Variation and Universals in Biolinguistics, 379–405. Amsterdam: Elsevier, 2004.
Chomsky, N. 2000a. The Architecture of Language. Ed. N. Mukherji, B. Patnaik, and R. Agnihotri. New Delhi: Oxford University Press.

Chomsky, N. 2000b. Derivation by phase. MIT Occasional Papers in Linguistics, 18. http://web.mit.edu/mitwpl.

Chomsky, N. 2000c. Minimalist inquiries: The framework. In R. Martin, D. Michaels, and J. Uriagereka, eds., Step by Step: Essays in Honor of Howard Lasnik, 89–155. Cambridge, MA: MIT Press.

Chomsky, N. 2000d. New Horizons in the Study of Language and Mind. Cambridge: Cambridge University Press.

Chomsky, N. 2001a. Beyond explanatory adequacy. MIT Occasional Papers in Linguistics, 20. http://web.mit.edu/mitwpl.

Chomsky, N. 2001b. Language and the rest of the world. Bose Memorial Lecture in Philosophy, Delhi University, November 4.

Chomsky, N. 2002. On Nature and Language. Cambridge: Cambridge University Press.

Chomsky, N. 2003. Replies. In L. Antony and N. Hornstein, eds., Chomsky and His Critics. London: Blackwell.

Chomsky, N. 2005. Three factors in language design. Linguistic Inquiry 36(1): 1–22.

Chomsky, N. 2006a. Approaching UG from below. Ms., MIT.

Chomsky, N. 2006b. Introduction. Language and Mind. 3rd ed. Cambridge: Cambridge University Press.

Chomsky, N. 2006c. On phases. Ms., MIT. Printed in R. Freidin, C. Otero, and M. Zubizarreta, eds., Foundational Issues in Linguistic Theory: Essays in Honor of Jean-Roger Vergnaud, 133–166. Cambridge, MA: MIT Press. Page references are to the printed paper.

Chomsky, N. 2006d. San Sebastian lecture. Ms.

Chomsky, N. 2007. Biolinguistic explorations: Design, development, evolution. International Journal of Philosophical Studies 15(1): 1–21.

Chomsky, N., and M. Halle. 1968. The Sound Pattern of English. New York: Harper and Row.

Chomsky, N., R. Huybregts, and H. Riemsdijk. 1982. The Generative Enterprise. Dordrecht: Foris.

Chomsky, N., and H. Lasnik. 1977. Filters and control. Linguistic Inquiry 8(3): 425–504.

Chomsky, N., and H. Lasnik. 1993. The theory of principles and parameters. In J. Jacobs, A. von Stechow, W. Sternefeld, and T. Vennemann, eds., Syntax: An International Handbook of Contemporary Research. Berlin: Walter de Gruyter. Reprinted in Chomsky, The Minimalist Program, 13–127. Cambridge, MA: MIT Press, 1995.

Churchland, Patricia, and R. Grush. 1999. Computation and the brain. In R. Wilson and F. Keil, eds., MIT Encyclopedia of the Cognitive Sciences, 155–158. Cambridge, MA: MIT Press.
Churchland, Patricia, and T. Sejnowski. 1992. The Computational Brain. Cambridge, MA: MIT Press.

Collins, J. 2004. Faculty disputes. Mind and Language 19(5): 503–533.

Coward, H. 1980. Sphota Theory of Language. Delhi: Motilal Banarsidass.

Coward, H., and K. Kunjunni Raja, eds. 2001. The Philosophy of the Grammarians: Encyclopedia of Indian Philosophers. Vol. 5. Delhi: Motilal Banarsidass.

Crain, S., and P. Pietroski. 2001. Nature, nurture and universal grammar. Linguistics and Philosophy 24: 139–186.

Crain, S., and P. Pietroski. 2002. Why language acquisition is a snap. Linguistic Review 19: 163–183.

Darwin, C. 1871. The Descent of Man, and Selection in Relation to Sex. London: Murray.

Dasgupta, P., A. Ford, and R. Singh. 2000. After Etymology: Towards a Substantivist Linguistics. Munich: Lincom Europa.

David-Gray, Z., J. Janssen, W. DeGrip, E. Nevo, and R. Foster. 1998. Light detection in a "blind" mammal. Nature Neuroscience 1(8): 655–656.

Davidson, D. 1967. Truth and meaning. Synthese 17: 304–323.

Davies, S. 1994. Musical Meaning and Expression. Ithaca, NY: Cornell University Press.

Dennett, D. 1991. Mother nature versus the walking encyclopedia: A western drama. In W. Ramsey, S. Stich, and D. Rumelhart, eds., Philosophy and Connectionist Theory, 21–30. Hillsdale, NJ: Erlbaum.

Diesing, M. 1992. Indefinites. Cambridge, MA: MIT Press.

Donnellan, K. 1966. Reference and definite descriptions. Philosophical Review 75: 281–304.

Dowty, D., R. Wall, and S. Peters, eds. 1981. Introduction to Montague Semantics. Dordrecht: D. Reidel.

Duarte, I. 1991. X-bar theory: Its role in GB theory. In M. Filgueiras, ed., Natural Language Processing: Lecture Notes in Artificial Intelligence, No. 476, 25–54. Berlin: Springer Verlag.

Ebbs, G. 1997. Rule-Following and Realism. Cambridge, MA: Harvard University Press.

Edelman, G. 1992. Bright Air, Brilliant Fire: On the Matter of the Mind. New York: Basic Books.

Embick, D., and A. Marantz. 2006. Architecture and blocking. Ms.

Evans, G. 1982. Varieties of Reference. Oxford: Oxford University Press.

Everett, D. 2005. Cultural constraints on grammar and cognition in Pirahã. Current Anthropology 46: 621–646.

Fisher, C., D. Hall, S. Rakowitz, and L. Gleitman. 1994. When it is better to receive than to give: Syntactic and conceptual constraints on vocabulary growth. In L. Gleitman and B. Landau, eds., The Acquisition of the Lexicon, 333–375. Cambridge, MA: MIT Press.

Fisher, C., B. Church, and K. Chambers. 2004. Learning to identify spoken words. In D. Geoffrey Hall and S. Waxman, eds., Weaving a Lexicon, 3–40. Cambridge, MA: MIT Press.

Fitch, W. 2006. The biology and evolution of music: A comparative perspective. Cognition 100: 173–215.

Fitch, W., and M. Hauser. 2004. Computational constraints on syntactic processing in a nonhuman primate. Science 303: 377–380.

Fitch, W., M. Hauser, and N. Chomsky. 2005. The evolution of the language faculty: Clarifications and implications. Cognition 97(2): 179–210.

Flake, G. 1998. The Computational Beauty of Nature: Computer Explorations of Fractals, Chaos, Complex Systems, and Adaptation. Cambridge, MA: MIT Press.

Fodor, J. 1989. Why should the mind be modular? In A. George, ed., Reflections on Chomsky, 1–22. Oxford: Blackwell.

Fodor, J. 1994. The Elm and the Expert. Cambridge, MA: MIT Press.

Fodor, J. 1998. Concepts: Where Cognitive Science Went Wrong. Oxford: Clarendon Press.

Fodor, J. 2000. The Mind Doesn't Work That Way: Scope and Limits of Computational Psychology. Cambridge, MA: MIT Press.

Fodor, J. 2007. Semantics: An interview with Jerry Fodor. Revista Virtual de Estudos da Linguagem 5(8). www.revel.inf.br.

Fodor, J., M. Garrett, E. Walker, and C. Parkes. 1980. Against definitions. Cognition 8: 263–367.

Fodor, J., and E. Lepore. 1994. Why meaning (probably) isn't conceptual role. In S. Stich and T. Warfield, eds., Mental Representation, 142–156. London: Blackwell.

Fodor, J., and E. Lepore. 2002. The Compositionality Papers. Oxford: Oxford University Press.

Fox, D. 2003. On logical form. In R. Hendrick, ed., Minimalist Syntax, 82–123. London: Blackwell.

Frege, G. 1892. On sense and reference. In P. Geach and M. Black, eds., Translations from the Philosophical Writings of Gottlob Frege, 56–78. Oxford: Blackwell, 1966.

Freidin, R., C. Otero, and M. Zubizarreta, eds. 2008. Foundational Issues in Linguistic Theory: Essays in Honor of Jean-Roger Vergnaud. Cambridge, MA: MIT Press.

Fromkin, V. 1991. Language and brain: Redefining the goals and methodology of linguistics. In A. Kasher, ed., The Chomskyan Turn, 78–103. Cambridge: Blackwell.

Gallistel, C. 1998. Symbolic processes in the brain: The case of insect navigation. In D. Scarborough and S. Sternberg, eds., An Invitation to Cognitive Science: Methods, Models, and Conceptual Issues, vol. 4, 1–51. Cambridge, MA: MIT Press.

Ganeri, J. 1999. Semantic Powers. Oxford: Clarendon Press.

Gärdenfors, P. 1996. Conceptual spaces as a basis for cognitive semantics. In A. Clarke, ed., Philosophy and Cognitive Science, 159–180. The Netherlands: Kluwer Academic Publishers.

Gardner, H. 1975. The Shattered Mind. New York: Knopf.

Gentner, T., K. Fenn, D. Margoliash, and H. Nusbaum. 2006. Recursive syntactic pattern learning by songbirds. Nature 440: 1204–1207.

George, A. 1986. Whence and whither the debate between Quine and Chomsky. Journal of Philosophy, September, 489–500.

Gleitman, L., and E. Newport. 1995. The invention of language by children: Environmental and biological influences on the acquisition of language. In D. Osherson, ed., An Invitation to Cognitive Science, vol. 1 (ed. L. Gleitman and M. Liberman), 1–24. Cambridge, MA: MIT Press.

Goethe, J. 1790. Versuch die Metamorphose der Pflanzen zu erklären. English trans. in D. Miller, ed., J. W. Goethe: Scientific Studies, 53–128 for his biological works, 76–97 for Metamorphosis. New York: Suhrkamp, 1988.

Goldfarb, W. 1985. Kripke on Wittgenstein and rules. Journal of Philosophy 82(9): 471–488.

Goldin-Meadow, S. 2004. Lexical development without a language model: Are nouns, verbs, and adjectives essential to the lexicon? In D. Geoffrey Hall and S. Waxman, eds., Weaving a Lexicon, 225–256. Cambridge, MA: MIT Press.

Goldin-Meadow, S., and H. Feldman. 1977. The development of language-like communication without a language model. Science 197: 401–403.

Grice, P. 1975. Logic and conversation. In P. Grice, Studies in the Way of Words, 22–40. Cambridge, MA: Harvard University Press, 1989.

Grodzinsky, Y. 2000. The neurology of syntax: Language use without Broca's area. Behavioral and Brain Sciences 23: 1–71.

Gruber, J. 1965. Studies in lexical relations. In J. Gruber, Lexical Structures in Syntax and Semantics, 1–120. Amsterdam: North-Holland, 1976.

Halle, M. 1995. Feature geometry and feature spreading. Linguistic Inquiry 26: 1–46.

Harley, H., and R. Noyer. 1999. State-of-the-article: Distributed morphology. GLOT International 4(4): 3–9.

Hattiangadi, A. 2007. Oughts and Thoughts: Rule-Following and the Normativity of Content. Oxford: Clarendon Press.

Hauser, M. 2008. The illusion of biological variation: A minimalist approach to the mind. In M. Piattelli-Palmarini, J. Uriagereka, and P. Salaburu, eds., Of Minds and Language: The Basque Country Encounter with Noam Chomsky. Oxford: Oxford University Press.
Hauser, M., N. Chomsky, and W. Fitch. 2002. The faculty of language: What is it, who has it, and how did it evolve? Science 298: 1569–1579.

Hauser, M., and J. McDermott. 2003. The evolution of the music faculty: A comparative perspective. Nature Neuroscience 6(7): 663–668.

Hawking, S., and W. Israel, eds. 1987. Three Hundred Years of Gravitation. Cambridge: Cambridge University Press.

Hawkins, J. 1978. Definiteness and Indefiniteness. Atlantic Highlands, NJ: Humanities Press.

Heim, I. 1982. The Semantics of Definite and Indefinite Noun Phrases. Doctoral dissertation, University of Massachusetts, Amherst.

Heim, I., and A. Kratzer. 1998. Semantics in Generative Grammar. Oxford: Blackwell.

Held, A., ed. 1980. General Relativity and Gravitation: One Hundred Years after the Birth of Albert Einstein. 3 vols. New York: Plenum.

Higginbotham, J. 1985. On semantics. Linguistic Inquiry 16(4): 547–593.

Higginbotham, J. 1986. Linguistic theory and Davidson's program in semantics. In E. Lepore, ed., Truth and Interpretation: Perspectives on the Philosophy of Donald Davidson, 29–48. Oxford: Blackwell.

Higginbotham, J. 1989. Elucidations of meaning. Linguistics and Philosophy 12. Reprinted in P. Ludlow, ed., Readings in the Philosophy of Language, 157–178. Cambridge, MA: MIT Press, 1997.

Higginbotham, J. 1991. Belief and logical form. Mind and Language 6(4): 344–369.

Hinzen, W. 2006. Mind Design and Minimal Syntax. Oxford: Oxford University Press.

Hoffman, D. 1998. Visual Intelligence. New York: Norton.

Hofstadter, D. 1979. Gödel, Escher, Bach: An Eternal Golden Braid. New York: Vintage Books.

Holden, C. 1998. No last word on language origins. Science 282: 1455–1458.

Horgan, T., and J. Tienson. 1999. Rules and representations. In R. Wilson and F. Keil, eds., MIT Encyclopedia of the Cognitive Sciences, 724–726. Cambridge, MA: MIT Press.

Hornstein, N. 1984. Logic as Grammar. Cambridge, MA: MIT Press.

Hornstein, N. 1989. Meaning and the mental: The problem of semantics after Chomsky. In A. George, ed., Reflections on Chomsky, 23–40. Oxford: Blackwell.

Hornstein, N. 1991. Grammar, meaning and indeterminacy. In A. Kasher, ed., The Chomskyan Turn, 104–121. Oxford: Blackwell.

Hornstein, N. 1995. Logical Form: From GB to Minimalism. Oxford: Blackwell.

Hornstein, N. 1999. Minimalism and quantifier raising. In S. Epstein and N. Hornstein, eds., Working Minimalism, 45–75. Cambridge, MA: MIT Press.

Hornstein, N., and A. Weinberg. 1990. The necessity of LF. Linguistic Review 7: 129–167.
Hornstein, N., and K. Grohmann. 2006. Understanding Minimalism. Cambridge: Cambridge University Press.

Horwich, P. 1998. Meaning. Oxford: Oxford University Press.

Huang, C.-T. J. 1982. Logical Relations in Chinese and the Theory of Grammar. Doctoral dissertation, MIT.

Humboldt, W. von. 1836. Über die Verschiedenheit des menschlichen Sprachbaues. English trans., G. Buck and F. Raven, Linguistic Variability and Intellectual Development. Philadelphia: University of Pennsylvania Press, 1972.

Jackendoff, R. 1972. Semantic Interpretation in Generative Grammar. Cambridge, MA: MIT Press.

Jackendoff, R. 1983. Semantics and Cognition. Cambridge, MA: MIT Press.

Jackendoff, R. 1990. Semantic Structures. Cambridge, MA: MIT Press.

Jackendoff, R. 1992. Languages of the Mind: Essays in Mental Representation. Cambridge, MA: MIT Press.

Jackendoff, R. 2002. Foundations of Language: Brain, Meaning, Grammar, Evolution. Oxford: Oxford University Press.

Jackendoff, R. 2009. Parallels and nonparallels between language and music. Music Perception 26: 195–204.

Jackendoff, R., and F. Lerdahl. 2006. The capacity for music: What's special about it? Cognition 100: 33–72.

Jenkins, L. 2000. Biolinguistics: Exploring the Biology of Language. Cambridge: Cambridge University Press.

Jenkins, L. 2004. Introduction. In L. Jenkins, ed., Variation and Universals in Biolinguistics, xvii–xxii. Amsterdam: Elsevier.

Johnson, K. 2000. How far will quantifiers go? In R. Martin, D. Michaels, and J. Uriagereka, eds., Step by Step: Essays on Minimalist Syntax in Honor of Howard Lasnik, 187–210. Cambridge, MA: MIT Press.

Kaduri, Y. 2006. Wittgenstein and Haydn on understanding music. Contemporary Aesthetics 4. www.contempaesthetics.org/newvolume/pages/article.php?articleID=397.

Kaplan, D. 1972. What is Russell's theory of descriptions? In D. Pears, ed., Bertrand Russell: A Collection of Critical Essays, 227–244. New York: Doubleday Anchor.

Kaplan, D. 1989. Afterthoughts. In J. Almog, J. Perry, and H. Wettstein, eds., Themes from Kaplan, 565–614. New York: Oxford University Press.

Karmiloff-Smith, A. 1992. Beyond Modularity: A Developmental Perspective on Cognitive Science. Cambridge, MA: MIT Press.

Kasher, A. 1991. Pragmatics and Chomsky's research programme. In A. Kasher, ed., The Chomskyan Turn, 122–149. Oxford: Blackwell.

Katz, J. 1972. Semantic Theory. New York: Harper and Row.

Katz, J., and J. Fodor. 1963. The structure of a semantic theory. Language 39: 170–210.
Katz, J., and P. Postal. 1964. An Integrated Theory of Linguistic Descriptions. Cambridge, MA: MIT Press.

Katz, J., and D. Pesetsky. 2009. The recursive syntax and prosody of tonal music. Ms., MIT.

Kaufman, L. 1979. Perception: The World Transformed. New York: Oxford University Press.

Kayne, R. 1994. The Antisymmetry of Syntax. Cambridge, MA: MIT Press.

Kayne, R. 2004. Antisymmetry and Japanese. In L. Jenkins, ed., Variation and Universals in Biolinguistics, 3–25. Amsterdam: Elsevier.

Kennedy, C. 1997. Antecedent-contained deletion and the syntax of quantification. Linguistic Inquiry 28: 662–688.

Kiparsky, P. 1982. Some Theoretical Problems in Panini's Grammar. Pune: Bhandarkar Oriental Research Institute.

Kitahara, H. 1997. Elementary Operations and Optimal Derivations. Cambridge, MA: MIT Press.

Koopman, H., and D. Sportiche. 1991. The position of subjects. Lingua 85: 211–258.

Koster, J. 2009. Ceaseless, unpredictable creativity: Language as technology. Biolinguistics 3(1): 61–92.

Koster, J., H. van Riemsdijk, and J.-R. Vergnaud. 1978. The GLOW manifesto. Reprinted in C. Otero, ed., Noam Chomsky: Critical Assessments. London: Routledge and Kegan Paul, 1993.

Kripke, S. 1972. Naming and necessity. In D. Davidson and G. Harman, eds., Semantics of Natural Language. Dordrecht: D. Reidel. Reprinted with an expanded introduction by Harvard University Press, Cambridge, MA, 1980.

Kripke, S. 1979. A puzzle about belief. In A. Margalit, ed., Meaning and Use, 239–275. Dordrecht: D. Reidel.

Kripke, S. 1982. Wittgenstein on Rules and Private Language. Cambridge, MA: Harvard University Press.

Kripke, S. 2005. Russell's notion of scope. In S. Neale, ed., 100 Years of "On Denoting." Special issue. Mind 114(456): 1005–1037.

Krumhansl, C. 1995. Music psychology and music theory: Problems and prospects. Music Theory Spectrum 17: 53–90.

Krumhansl, C., T. Eerola, P. Toiviainen, T. Järvinen, and J. Louhivuori. 2000. Cross-cultural music cognition: Cognitive methodology applied to North Sami yoiks. Cognition 76: 13–58.

Kunej, D., and I. Turk. 2000. New perspectives on the beginnings of music: Archeological and musicological analysis of a middle paleolithic bone "flute." In N. Wallin, B. Merker, and S. Brown, eds., The Origins of Music, 235–268. Cambridge, MA: MIT Press.

Lakoff, G. 1987. Women, Fire, and Dangerous Things. Chicago: University of Chicago Press.
Lakoff, G., and R. Núñez. 2000. Where Mathematics Comes From: How the Embodied Mind Brings Mathematics into Being. New York: Basic Books.

Lappin, S. 1997. The Handbook of Contemporary Semantic Theory. Oxford: Blackwell.

Larson, R., and G. Segal. 1995. Knowledge of Meaning: An Introduction to Semantic Theory. Cambridge, MA: MIT Press.

Lasnik, H., and J. Uriagereka. 2005. A Course in Minimalist Syntax: Foundations and Prospects. Oxford: Blackwell.

Leiber, J. 1991. An Invitation to Cognitive Science. Oxford: Blackwell.

Leiber, J. 2001. Turing and the fragility and insubstantiality of evolutionary explanations: A puzzle about the unity of Alan Turing's work with some larger implications. Philosophical Psychology 14(1): 83–94.

Leonard, L. 1998. Children with Specific Language Impairment. Cambridge, MA: MIT Press.

Lepore, E. 1982. What model theoretic semantics cannot do. Synthese 54: 167–187.

Lerdahl, F. 1996. Calculating tonal tension. Music Perception 13: 319–364.

Lerdahl, F. 2001. Tonal Pitch Space. New York: Oxford University Press.

Lerdahl, F., and R. Jackendoff. 1983. A Generative Theory of Tonal Music. Cambridge, MA: MIT Press.

Levinson, J. 2003. Musical thinking. Journal of Music and Meaning 1, sec. 2. www.musicandmeaning.net.

Lewis, D. 1972. General semantics. In D. Davidson and G. Harman, eds., Semantics for Natural Languages, 169–218. Dordrecht: D. Reidel.

Lidz, J., H. Gleitman, and L. Gleitman. 2004. Kidz in the 'Hood: Syntactic bootstrapping and the mental lexicon. In D. Geoffrey Hall and S. Waxman, eds., Weaving a Lexicon, 603–636. Cambridge, MA: MIT Press.

Maess, B., S. Koelsch, T. Gunter, and A. Friederici. 2001. Musical syntax is processed in Broca's area: An MEG study. Nature Neuroscience 4(5): 540–545.

Malt, B. C., S. Sloman, and S. Gennari. 2003. Speaking versus thinking about objects and actions. In D. Gentner and S. Goldin-Meadow, eds., Language in Mind: Advances in the Study of Language and Thought, 81–112. Cambridge, MA: MIT Press.

Marantz, A. 1995. The Minimalist Program. In G. Webelhuth, ed., From GB to Minimalism, 349–382. Oxford: Blackwell.

Marantz, A. 2005. Generative linguistics within the cognitive neuroscience of language. Linguistic Review 22: 429–445.

Marconi, D. 1996. Lexical Competence. Cambridge, MA: MIT Press.

Marler, P. 2000. Origins of music and speech. In N. Wallin, B. Merker, and S. Brown, eds., The Origins of Music, 31–48. Cambridge, MA: MIT Press.

Marr, D. 1982. Vision. New York: Freeman.
Martin, R., and J. Uriagereka. 2000. Some possible foundations of the Minimalist Program. In R. Martin, D. Michaels, and J. Uriagereka, eds., Step by Step, 1–29. Cambridge, MA: MIT Press.

Matilal, B. 1990. The Word and the World. Delhi: Motilal Banarsidass.

Matilal, B., and J. Shaw. 1985. Analytical Philosophy in Comparative Perspective: Exploratory Essays in Current Theories and Classical Indian Theories of Meaning and Reference. Dordrecht: D. Reidel.

May, R. 1991. Syntax, semantics and logical form. In A. Kasher, ed., The Chomskyan Turn, 334–359. Oxford: Blackwell.

McGilvray, J. 2005. Meaning and creativity. In J. McGilvray, ed., The Cambridge Companion to Chomsky, 204–224. Cambridge: Cambridge University Press.

McLain, E. 1976. The Myth of Invariance. New York: Nicholas Hays.

Mehler, J., G. Lambertz, P. Jusczyk, and C. Amiel-Tison. 1986. Discrimination de la langue maternelle par le nouveau-né. Comptes Rendus Académie des Sciences 303, série III, 637–640.

Mehler, J., A. Christophe, and F. Ramus. 2000. How infants acquire language: Some preliminary observations. In A. Marantz, Y. Miyashita, and W. O'Neil, eds., Image, Language, Brain, 51–76. Cambridge, MA: MIT Press.

Michie, D. 1999. Alan Mathison Turing. In R. Wilson and F. Keil, eds., MIT Encyclopedia of the Cognitive Sciences, 847–849. Cambridge, MA: MIT Press.

Miller, G. 1956. The magical number seven, plus or minus two: Some limits on our capacity for processing information. Psychological Review 63: 81–97.

Miller, G., and D. McNeill. 1969. Psycholinguistics. In G. Lindzey and E. Aronson, eds., The Handbook of Social Psychology, vol. 3, 666–794. Reading, MA: Addison-Wesley.

Milsark, G. 1974. Existential Sentences in English. Doctoral dissertation, MIT.

Mithen, S. 2005. The Singing Neanderthals: The Origins of Music, Language, Mind, and Body. Cambridge, MA: Harvard University Press.

Miyagawa, S. 2006. A design for human language: A non-parametric theory of narrow syntax. Ms.

Mohanty, J. 1992. Reason and Tradition in Indian Thought. Oxford: Oxford University Press.

Montague, R. 1974. Pragmatics. In R. Montague, Formal Philosophy, 95–118. New Haven, CT: Yale University Press.

Morais, J., and R. Kolinsky. 2001. The literate mind and the universal human mind. In E. Dupoux, ed., Language, Brain, and Cognitive Development: Essays in Honor of Jacques Mehler, 463–480. Cambridge, MA: MIT Press.

Moravcsik, J. 1981. How do words get their meanings? Journal of Philosophy 78: 5–24.

Moravcsik, J. 1998. Meaning, Creativity, and the Partial Inscrutability of the Human Mind. Stanford, CA: CSLI.
Mukherji, N. 1983. Against indeterminacy. In D. Chattopadhyay, ed., Humans, Meanings, Existences, 139–159. New Delhi: Macmillan.

Mukherji, N. 1987. Language and Definiteness. Doctoral dissertation, University of Waterloo, Canada.

Mukherji, N. 1989. Descriptions and group reference. Journal of Indian Council of Philosophical Research 6(3): 89–107.

Mukherji, N. 1990. Churchland and the talking brain. Journal of Indian Council of Philosophical Research 7(3): 133–140.

Mukherji, N. 1995. Identification. In P. Sen and R. Verma, eds., Philosophy of P. F. Strawson, 32–55. New Delhi: Allied Publishers.

Mukherji, N. 2000. The Cartesian Mind: Reflections on Language and Music. Shimla: Indian Institute of Advanced Study.

Mukherji, N. 2001. Shifting domains. In M. Chadha and A. Raina, eds., Basic Objects: Case Studies in Theoretical Primitives, 91–106. Shimla: Indian Institute of Advanced Study.

Mukherji, N. 2003a. Is CHL linguistically specific? Philosophical Psychology 16(2): 289–308.

Mukherji, N. 2003b. Is there a general notion of interpretation? In A. Ritivoi, ed., Interpretation and Its Objects: Studies in the Philosophy of Michael Krausz, 39–54. Amsterdam: Rodopi.

Mukherji, N. 2009. Dawn of music theory in India. Annual Review of South Asian Languages & Linguistics, in press.

Mukherji, N. Forthcoming a. Doctrinal dualism. In P. Ghosh, ed., Materialism and Immaterialism in India and Europe, Volume 12: Levels of Reality, Part 5. New Delhi: Centre for Studies in Civilizations, Project of History of Indian Science, Philosophy and Culture (PHISPC).

Mukherji, N. Forthcoming b. Truth and intelligibility. In A. Dev, ed., Science, Literature, Aesthetics. New Delhi: Centre for Studies in Civilizations, Project of History of Indian Science, Philosophy and Culture (PHISPC).

Murphy, G. 2002. The Big Book of Concepts. Cambridge, MA: MIT Press.

Nagel, T. 1997. The Last Word. New York: Oxford University Press.

Neale, S. 1990. Descriptions. Cambridge, MA: MIT Press.

Neale, S. 1994. Logical form and LF. In C. Otero, ed., Noam Chomsky: Critical Assessments, vol. 2, 788–838. London: Routledge and Kegan Paul.

Nespor, M. 2001. About parameters, prominence, and bootstrapping. In E. Dupoux, ed., Language, Brain, and Cognitive Development: Essays in Honor of Jacques Mehler, 127–142. Cambridge, MA: MIT Press.

Nevins, A., D. Pesetsky, and C. Rodrigues. 2007. Pirahã exceptionality: A reassessment. Ms.

Ormazabal, J. 2000. A conspiracy theory of case and agreement. In R. Martin, D. Michaels, and J. Uriagereka, eds., Step by Step: Essays on Minimalist Syntax in Honor of Howard Lasnik, 235–260. Cambridge, MA: MIT Press.
Partee, B. 1979. Semantics–mathematics or psychology? In R. Bäuerle, U. Egli, and A. von Stechow, eds., Semantics from Different Points of View, 1–14. Berlin: Springer.

Partee, B. 1980. Montague Grammar, mental representation and reality. In S. Öhman and S. Kanger, eds., Philosophy and Grammar, 59–78. Dordrecht: D. Reidel.

Partee, B. 1992. Naturalizing formal semantics. In Proceedings of the XVth World Congress of Linguists: Texts of Plenary Sessions, 62–76. Quebec: Laval University.

Patel, A. 2003. Language, music, syntax and the brain. Nature Neuroscience 6: 674–681.

Penn, D., K. Holyoak, and D. Povinelli. 2008. Darwin's mistake: Explaining the discontinuity between human and nonhuman minds. Behavioral and Brain Sciences 31: 109–178.

Penrose, R. 2001. Conversation. In R. Nair, ed., Mind, Matter and Mystery: Questions in Science and Philosophy, 119–133. New Delhi: Scientia.

Peretz, I. 2001. The biological foundations of music. In E. Dupoux, ed., Language, Brain, and Cognitive Development: Essays in Honor of Jacques Mehler, 435–445. Cambridge, MA: MIT Press.

Peretz, I., and M. Coltheart. 2003. Modularity of music processing. Nature Neuroscience 6: 688–691.

Pesetsky, D. 2007. Music syntax is language syntax. Ms. web.mit.edu/linguistics/people/faculty/pesetsky/Pesetsky_Cambridge_music_handout.pdf.

Pesetsky, D., and E. Torrego. 2006. The syntax of valuation and the interpretability of features. In S. Karimi, V. Samiian, and W. Wilkins, eds., Clausal and Phrasal Architecture: Syntactic Derivation and Interpretation. Amsterdam: John Benjamins.

Petitto, L. 1987. On the autonomy of language and gesture: Evidence from the acquisition of personal pronouns in ASL. Cognition 27: 1–52.

Piattelli-Palmarini, M., ed. 1980. Language and Learning: The Debate between Jean Piaget and Noam Chomsky. Cambridge, MA: Harvard University Press.

Piattelli-Palmarini, M. 2001. Portrait of a "classical" cognitive scientist: What I have learned from Jacques Mehler. In E. Dupoux, ed., Language, Brain, and Cognitive Development: Essays in Honor of Jacques Mehler, 3–21. Cambridge, MA: MIT Press.

Piattelli-Palmarini, M., and J. Uriagereka. 2004. The immune syntax: The evolution of the language virus. In L. Jenkins, ed., Variation and Universals in Biolinguistics, 342–378. Amsterdam: Elsevier.

Pietroski, P. 2005. Meaning before truth. In G. Preyer and G. Peters, eds., Contextualism in Philosophy, 253–300. Oxford: Oxford University Press.

Pietroski, P. 2006. Character before content. In J. Thomson and A. Byrne, eds., Content and Modality: Themes from the Philosophy of Robert Stalnaker, 34–60. Oxford: Oxford University Press.
Pinker, S. 1995a. The Language Instinct: How the Mind Creates Language. New York: HarperCollins.

Pinker, S. 1995b. Why the child holded the baby rabbits: A case study in language acquisition. In L. Gleitman and M. Liberman, eds., An Invitation to Cognitive Science, Volume 1: Language, 107–133. Cambridge, MA: MIT Press.

Pinker, S. 1997. How the Mind Works. New York: Norton.

Pinker, S. 2001. Four decades of rules and association, or whatever happened to the past tense debate? In E. Dupoux, ed., Language, Brain, and Cognitive Development: Essays in Honor of Jacques Mehler, 157–179. Cambridge, MA: MIT Press.

Pinker, S., and R. Jackendoff. 2005. The faculty of language: What's special about it? Cognition 95(2): 201–236.

Pinker, S., and M. Ullman. 2002. The past and future of the past tense. Trends in Cognitive Sciences 6(11): 456–463.

Pollock, J. 1989. Verb movement, universal grammar, and the structure of IP. Linguistic Inquiry 20: 365–424.

Premack, D., and A. Premack. 2003. Original Intelligence: Unlocking the Mystery of Who We Are. New Delhi: McGraw-Hill.

Prinz, J., and A. Clark. 2004. Putting concepts to work: Some thoughts for the 21st century (a reply to Fodor). Mind and Language 19(1): 57–69.

Pullum, G., and B. Scholz. 2002. Empirical assessment of stimulus poverty arguments. Linguistic Review 19: 9–50.

Pustejovsky, J. 1995. The Generative Lexicon. Cambridge, MA: MIT Press.

Putnam, H. 1975. The meaning of "meaning." In K. Gunderson, ed., Language, Mind, and Knowledge, 131–193. Minnesota Studies in the Philosophy of Science, 7. Minneapolis: University of Minnesota Press.

Putnam, H. 1983. "Two Dogmas" revisited. In H. Putnam, Realism and Reason: Philosophical Papers, vol. 3, 87–97. Cambridge: Cambridge University Press.

Pylyshyn, Z. 1984. Computation and Cognition. Cambridge, MA: MIT Press.

Quine, W. 1953. Two dogmas of empiricism. In W. Quine, From a Logical Point of View, 20–46. Cambridge, MA: Harvard University Press.

Quine, W. 1960. Word and Object. Cambridge, MA: MIT Press.

Quine, W. 1969a. Reply to Chomsky. In D. Davidson and J. Hintikka, eds., Words and Objections, 302–311. Dordrecht: D. Reidel.

Quine, W. 1969b. Speaking of objects. In W. Quine, Ontological Relativity and Other Essays, 1–25. New York: Columbia University Press.

Quine, W. 1980. The variable and its place in reference. In Z. Van Straaten, ed., Philosophical Subjects: Essays Presented to P. F. Strawson, 164–173. Oxford: Clarendon Press.

Radford, A. 1997. Syntax: A Minimalist Introduction. Cambridge: Cambridge University Press.

Raffman, D. 1993. Language, Music, and Mind. Cambridge, MA: MIT Press.
Ramus, F., M. Hauser, C. Miller, D. Morris, and J. Mehler. 2000. Language discrimination by human newborns and by cotton-top tamarin monkeys. Science 288: 349–351.
Ray, J. 2007. Beliefs and Logical Form. Doctoral dissertation, Department of Philosophy, University of Delhi.
Reinhart, T. 2006. Interface Strategies: Optimal and Costly Computations. Cambridge, MA: MIT Press.
Reuland, E., and A. ter Meulen, eds. 1987. The Representation of (In)definiteness. Cambridge, MA: MIT Press.
Rodman, H. 1999. Face recognition. In R. Wilson and F. Keil, eds., MIT Encyclopedia of the Cognitive Sciences, 309–311. Cambridge, MA: MIT Press.
Roeper, T., and J. de Villiers. 1993. The emergence of bound variable structures. In E. Reuland and W. Abraham, eds., Knowledge of Language, vol. 1, 105–140. Dordrecht: Kluwer Academic Publishers.
Rouveret, A. 2008. Phasal agreement and reconstruction. In R. Freidin, C. Otero, and M. Zubizarreta, eds., Foundational Issues in Linguistic Theory: Essays in Honor of Jean-Roger Vergnaud, 167–196. Cambridge, MA: MIT Press.
Rouveret, A., and J.-R. Vergnaud. 1980. Specifying reference to the subject: French causatives and conditions on representations. Linguistic Inquiry 11: 97–202.
Roy, S. 2007. Understanding Graphical Expressions. Doctoral dissertation, Department of Philosophy, University of Delhi.
Russell, B. 1905. On denoting. Mind 14: 479–493.
Russell, B. 1919. The philosophy of logical atomism. Monist. Reprinted in R. Marsh, ed., Logic and Knowledge, 177–281. London: George Allen and Unwin, 1956.
Saffran, J. 2002. Constraints on statistical language learning. Journal of Memory and Language 47: 172–196.
Schellenberg, E., and I. Peretz. 2007. Music, language and cognition: Unresolved issues. Trends in Cognitive Sciences 12(2): 45–46.
Schellenberg, E., and S. Trehub. 1996. Natural musical intervals: Evidence from infant listeners. Psychological Science 7: 272–277.
Schiffer, S. 1987. Remnants of Meaning. Cambridge, MA: MIT Press.
Schiffer, S. 2005. Russell's theory of definite descriptions. In S. Neale, ed., 100 Years of “On Denoting.” Special issue. Mind 114(456): 1135–1184.
Scruton, R. 1983. The Aesthetic Understanding. Manchester: Carcanet Press.
Scruton, R. 1997. The Aesthetics of Music. Oxford: Oxford University Press.
Scruton, R. 2004. Wittgenstein and the understanding of music. British Journal of Aesthetics 44(1): 1–9.
Searle, J. 1980. Minds, brains and programs. Behavioral and Brain Sciences 3. Reprinted in J. Haugeland, ed., Mind Design II: Philosophy, Psychology, Artificial Intelligence, 183–204. Cambridge, MA: MIT Press, 1997.
Searle, J. 1990. Is the brain's mind a computer program? Scientific American 262: 26–31.
Seidenberg, M. 1995. Language and connectionism: The developing interface. In J. Mehler and S. Franck, eds., COGNITION on Cognition, 415–432. Cambridge, MA: MIT Press.
Sells, P. 1985. Lectures on Contemporary Syntactic Theories. Stanford, CA: CSLI.
Siderits, M. 1991. Indian Philosophy of Language: Studies in Selected Issues. Dordrecht: Kluwer Academic Publishers.
Siderits, M. 1998. Review of Studies in Humanities and Social Science 3(2), 1996. Philosophy East and West 48(3): 503–513.
Smith, N. 2000. Introduction. In N. Chomsky, New Horizons in the Study of Language and Mind, vi–xvi. Cambridge: Cambridge University Press.
Snedeker, J., and L. Gleitman. 2004. Why it is hard to label our concepts. In D. Geoffrey Hall and S. Waxman, eds., Weaving a Lexicon, 257–294. Cambridge, MA: MIT Press.
Soames, S., and D. Perlmutter. 1979. Syntactic Argumentation and the Structure of English. Berkeley: University of California Press.
Stainton, R. 2006. Meaning and reference: Some Chomskyan themes. In E. Lepore and B. Smith, eds., Handbook of Philosophy of Language, 913–940. Oxford: Oxford University Press.
Stewart, I. 1995. Nature's Numbers: Discovering Order and Pattern in the Universe. London: Weidenfeld and Nicolson.
Stewart, I. 2001. What Shape Is a Snowflake? London: Weidenfeld and Nicolson.
Strawson, P. 1950. On referring. Mind, July. Reprinted in P. Strawson, Logico-Linguistic Papers, 1–27. London: Methuen, 1971.
Strawson, P. 1952. Introduction to Logical Theory. London: Methuen.
Strawson, P. 1961. Singular terms and predication. Journal of Philosophy 58. Reprinted in P. Strawson, Logico-Linguistic Papers, 53–74. London: Methuen, 1971.
Strawson, P. 1995. Replies. In P. Sen and R. Verma, eds., Philosophy of P. F. Strawson, 398–434. New Delhi: Allied Publishers.
Striedter, G. 2006. Précis of Brain Evolution. Behavioral and Brain Sciences 29(1).
Szabó, Z. 2005. The loss of uniqueness. In S. Neale, ed., 100 Years of “On Denoting.” Special issue. Mind 114(456): 1185–1222.
Tarski, A. 1935. The concept of truth in formalized languages. In A. Tarski, Logic, Semantics, Metamathematics, trans. J. Woodger, 152–278. Oxford: Clarendon Press, 1956.
Temperley, D. 2001. The Cognition of Basic Musical Structures. Cambridge, MA: MIT Press.
Thomason, R. 1974. Introduction. In R. Montague, Formal Philosophy, 1–69. New Haven, CT: Yale University Press.
Thompson, D'Arcy. 1917. On Growth and Form. Cambridge: Cambridge University Press.
Trehub, S. 2003. The developmental origins of musicality. Nature Neuroscience 6: 669–673.
Turing, A. 1950. Computing machinery and intelligence. Mind 59: 433–460.
Turing, A. 1952. The chemical basis of morphogenesis. Philosophical Transactions of the Royal Society of London, series B, 237: 37–72. Reprinted in P. Saunders, ed., The Collected Works of A. M. Turing: Morphogenesis. Amsterdam: North-Holland, 1992.
Uriagereka, J. 1998. Rhyme and Reason: An Introduction to Minimalist Syntax. Cambridge, MA: MIT Press.
Uriagereka, J. 1999. Multiple spell-out. In S. Epstein and N. Hornstein, eds., Working Minimalism, 251–282. Cambridge, MA: MIT Press.
Uriagereka, J. 2008. Syntactic Anchors. Cambridge: Cambridge University Press.
Varley, R., N. Klessinger, C. Romanowski, and M. Siegal. 2005. Agrammatic but numerate. Proceedings of the National Academy of Sciences, early edition, 1–6. www.pnas.org/cgi/doi/10.1073/pnas.0407470102.
Vijaykrishnan, K. 2007. The Grammar of Carnatic Music. Berlin: Mouton de Gruyter.
Wallin, N., B. Merker, and S. Brown, eds. 2000. The Origins of Music. Cambridge, MA: MIT Press.
Walton, K. 1993. Metaphor and prop oriented make-believe. European Journal of Philosophy 1(1): 39–57.
Wasow, T. 1985. “Postscript” to Sells 1985. Reprinted in C. Otero, ed., Noam Chomsky: Critical Assessments, vol. 1. London: Routledge and Kegan Paul, 1994.
Wexler, K. 1991. On the argument from the poverty of the stimulus. In A. Kasher, ed., The Chomskyan Turn, 252–270. Oxford: Blackwell.
Wierzbicka, A. 1996. Semantics: Primes and Universals. Oxford: Oxford University Press.
Wittgenstein, L. 1922. Tractatus Logico-Philosophicus. Trans. D. Pears and B. McGuinness. London: Routledge and Kegan Paul.
Wittgenstein, L. 1931/1958. The Blue and Brown Books. Oxford: Blackwell.
Wittgenstein, L. 1953. Philosophical Investigations. Trans. G. Anscombe. Oxford: Blackwell.
Wittgenstein, L. 1980. Culture and Value. Trans. P. Winch. Oxford: Blackwell.
Worth, S. 1997. Wittgenstein's musical understanding. British Journal of Aesthetics 37(2): 158–167.
Yamada, J. 1990. Laura: A Case for the Modularity of Language. Cambridge, MA: MIT Press.
Index
Aalaap, 211–213, 222
Advanced sciences. See Basic sciences
Analytical inferences, 137–138, 146, 148
Anaphora, 9, 18, 36, 58–60, 80, 83, 102, 106, 175, 183
Argument structure, 47–48, 50, 157, 165, 174, 248n
Aristotle, 3, 4, 22
Arithmetic, 19, 22, 104, 181, 186–187, 189–190, 213, 217–218, 220, 236, 238–239, 251n
Articulatory system, xviii, 12, 167, 174, 195, 239–240
Artificial language, 76, 100, 197
Austin, J., 112–113, 151, 154
Baker, M., 8, 37, 46, 53
Barbosa, P., 2
Barwise, J., 95, 106
Basic sciences, xv–xvi, xviii, 1, 3, 15, 17, 27, 159
Bee dances, 7
Bennett, M., 94, 243n
Berwick, R., 7
Bhartṛhari, 3, 243n
Bickerton, D., 189
Binding theory, 37–38, 57–61, 80, 83, 104, 144, 158, 164–165, 174, 176, 182–185
Biolinguistics, xv–xviii, 1, 11–14, 16–18, 20, 24–25, 27–28, 74, 116, 119, 135, 140, 161, 169, 233
Biological basis of language, 13–16, 18–20, 25, 234, 252n
Bird songs, 7, 193, 213, 240–241, 252n21
Black, J., 19, 23, 28
Blind mole rat, 196
Block, N., 75
Bloom, P., 134, 138
Boeckx, C., 2, 29, 56, 169, 173, 177–178
Boghossian, P., 199, 202
Bone flutes, 191
Bošković, Ž., 171
Brahman, 3
Brain size, 192, 194–195
Broca's area, 192, 195
Bromberger, S., 78
Bronchti, G., 196
Brown, S., 190, 193, 218
Call systems, 100
Caloric theory, 94
Carey, S., 143
Carroll, S., 234
Casati, R., 181
Case adjacency, 53
Case, structural, 17, 36, 59, 67, 70–71, 80–81, 167–168, 171, 177, 187, 247n
Case theory, 38, 44, 51–55, 59, 87, 137, 182–184
C-command, 44, 58–59, 66, 70, 106, 167, 230
Chemical affinity, 20
Cherniak, C., 216, 234, 251n
Chiat, S., 8
Chierchia, G., 94
Chomsky, N., xvi–xviii, 2, 3, 5–7, 17, 20, 21, 25, 29, 57, 75, 93, 104, 140, 142, 144, 181, 187, 192, 193, 197, 219, 229, 230, 232, 236, 237, 239, 241
  on G-B theory, 40, 42, 45, 50–54, 56, 60, 67, 68, 127, 183, 185
  on language theory, v, xv, 1, 9, 11, 12, 16, 18, 35–37, 41, 55, 77–79, 83, 85, 87, 90, 100, 116, 120, 138, 139, 157–158, 161, 176, 186, 199, 213, 215–216, 220, 244n10
  on minimalism, 38, 47, 161–164, 166–167, 169–173, 175, 177–178, 180, 198, 217–218, 227, 233, 248n4, 248n6, 249n16, 251n1
  on semantics, 18, 27, 64, 84, 97, 101–102, 105–106, 109–110, 124–126, 128–132, 149, 151, 174
  on unification problem, 13–15, 18–19, 25, 234, 238
and nonlinguistic domains, 181, 187, 189, 197, 239, 241, 249n Turing and, 235–236 Conceptual-intentional systems (C-I), 15, 26, 72, 116, 163–164, 171, 173– 176, 218, 227 Conceptual necessity, 38, 163–166, 170, 184 Conceptual system, 10, 22, 97, 100– 101, 133–134, 136–137, 145, 155, 158, 161 Consonance/dissonance, 191, 228 Construction specific rules (CSR), 36, 182 Convention T, 91–93, 97 Crain, S., 7, 25, 162 Creative aspect of language use, 5 Cyclicity, 173–174, 176, 178. See also Phase Dalton, J., 21 Darwin, C., 190, 218 Dasgupta, P., 2, 243n David-Gray, Z., 196 Davidson, D., 88–94, 104, 115 Definite descriptions, Russell’s theory of, 30–33, 66, 86, 102, 107–115, 246n Dennett, D., 75 Descartes, R., 3, 5, 239–240 Descriptive adequacy, 11, 16 Diesing, M., 71, 84, 86–87 Discrete infinity. See Unboundedness Displacement/dislocation problem, 31–33, 38, 63, 120, 163–164, 167, 171–172, 176, 185–188 Distributed morphology, 126, 135– 136, 250n D-structure, 16, 37–40, 50–53, 55–56, 63–64, 163, 165, 173, 183–184 Dover, G., 230 Dowty, D., 102–103, 109 Ebbs, G., 141 Economy, principles of, 70–71, 159, 164–165, 167, 171, 176–179, 184– 185, 214, 228, 233–235, 242. See
also Computational e‰ciency (PCP), Purely computational principles Edelman, G., 17, 244n Edge phenomenon, 72, 170, 172, 174, 187–188 Einstein, A., 23 Embick, D., 135 Emotions, xvii, 194, 202–205, 206, 248n. See also Musical a¤ects Empty category principle (ECP), 37, 56, 58, 61–62, 66, 182, 184–185 Established sciences. See Basic sciences Ethnoscience, 93 European starlings, 240–241 Evans, G., 109 Everett, D., 213, 243n Explanatory adequacy, 12, 16, 25, 231, 238 Extended projection principle (EPP), 51, 52, 171–172 External significance of language, xviii, 95, 97–101, 105, 115 External systems, 15, 18, 26, 72, 116, 157, 163–165, 169, 172, 175, 179, 185, 187–188, 223, 225–226, 228 Face recognition, 143, 194 Faculty of language (FL), 12, 26, 35, 129, 161, 164, 173–175, 180–181, 184–185, 191, 196, 210, 215, 227, 229–230, 234. See also Biolinguistics, Universal grammar Faculty of music (FM), 222–223, 226–229 Felicity condition, 113 Feature checking, 70–72, 222–223. See also Uninterpretable features Fisher, C., 46 Fitch, W., 175, 191, 193–194, 199, 209–210, 212, 232, 240–241 FL-driven interpretation (FLI), 157, 175–176, 185, 199, 223, 226–227, 236 FM-driven interpretation (FMI), 226– 227, 236 Fodor, J., 7, 9, 27, 72, 99, 106, 121, 123, 137, 145–147, 151, 153–154, 235
273
Formal semantics, xvii, 18, 73, 76, 97–103, 105–107, 115–116, 156– 157 Fox, D., 72 Frege, G., 34, 85, 88–89, 94, 107, 151, 208, 245n Freidin, R., 17 Frompkin, V., 96 Galileo, Galilean, 5, 7, 15, 22, 151, 230–232, 234 Gallistel, C., 19, 238–239 Ga¨rdenfors, P., 17 Garden-path sentences, 80 Gardner, H., 96 Geach, P., 86 General linguistic principles (GLP), 36, 182–184 General pied-piping, 169 Generative theory of tonal music (GTTM), 192, 194, 219–223 Gennari, J., 97 Gentner, T., 213, 240–241 George, A., 139 Gibberish, 77, 81 Gleitman, L., 8, 46 GLOW manifesto, 180 Goal, 171. See also Probe Go¨del, K., 34–35 Goldfarb, W., 141 Goldin-Meadow, S., 8 Government-Binding theory (G-B), 29, 37, 70, 79, 82, 119–120, 158, 161– 163, 172, 174, 176–177, 180, 182, 184–186, 237 Grammar, 10–11, 17, 27, 29, 51, 66, 72, 77–78, 80, 83, 102, 123, 143, 157, 159, 182, 199, 209, 213–214, 223, 234, 240, 243n. See also, Grammar, Universal Grammar, Biolinguistics and conceptual system, 48, 119, 125, 176 generative, xv, 17, 22, 35, 94–95, 88, 186, 192, 206, 214, 219, 248n modules of, 38, 44, 46 of music, 192, 208, 252n
  as object of inquiry, v, xvii–xviii, 6, 22, 81–82, 116, 138, 158, 181
  Pāṇinian, 2, 4
Grammatical theory, 48, 75, 77, 89, 95, 103, 120, 123, 125, 134, 138–139, 145, 159, 181
  as a form of inquiry, 87, 97, 99, 116, 201
  and language-theory, 70, 72, 73, 84, 93, 96–97, 120, 123, 127
  scope of, 70, 72–73, 79–83, 85, 88, 106, 115–116, 121, 126, 129, 137, 156–157, 199
Gravitation, 20, 23–24
Grice, P., 131
Grimshaw, J., 121–122
Grodzinsky, Y., 25
Grohman, K., 29
Gruber, J., 151–152
Hacker, P., 94
Haeckel, E., 19
Halle, M., 78
Hansen, N., 148
Harley, H., 78, 135
Hattiangadi, A., 141
Hauser, M., 175, 191, 209–210, 212, 218, 232, 240–241
Hawking, S., 23
Hawkins, J., 30
Head-first languages, 36, 44, 47
Hegel, G., 2
Heim, I., 30, 86, 94, 104, 113–114
Held, A., 23
Higginbotham, J., 64, 80
Hinzen, W., 71, 80, 116, 157, 176, 237
Hoffman, D., 195
Hofstadter, D., 210–211
Holden, C., 191
Hominid set, 189, 194, 209, 215–216, 229, 234–235, 240–241
Horgan, T., 237
Hornstein, N., 29, 56, 64, 70–72, 77, 83–85, 136, 139, 156
Horwich, P., 128, 141
Huang, J., 65
Hume, D., 3
Huybregts, R., 12, 16, 50, 75, 90, 101, 183, 234
I-meanings, 18, 128–129, 157, 249n
Inclusiveness condition, 165–166, 171
Indian intellectual tradition, 2
Inner domains, xv, 5–8, 15, 24
Insect navigation, 7, 140, 144, 231–233, 238–239
Intentionality. See Reference
Internal significance, 199, 201, 205–209, 223, 226
Island constraints, 19, 230
Jackendoff, R., 41, 45, 81, 106, 126, 138, 143, 156
  and language-theory, 8–9, 17, 48, 53, 78, 120, 123, 218
  and lexical semantics, 35, 111, 121, 124, 131–132, 136, 148, 150–154
  and music, 192, 194, 200–202, 204–205, 211, 219–221, 223–228, 251n
Jenkins, L., 12, 18–19, 24, 47, 232
Johnson, K., 56
Kaduri, Y., 207–208, 211
Kant, I., 3
Kaplan, D., 105, 108–110
Karmiloff-Smith, A., 8, 9, 64, 191
Kasher, A., 96
Katz, J., 145, 151
Kaufman, L., 8
Kayne, R., 47, 197
Kennedy, C., 70
Kiparsky, P., 2
Kitahara, H., 178, 187
Kolinsky, R., 145
Koopman, H., 87
Koster, J., 13, 180
Koyré, A., 5
Kratzer, A., 94, 104
Kripke, S., 90, 111, 140–145, 156, 244n
Krumhansl, C., 191, 192, 226
Kunej, D., 191
Kunjunni Raja, K., 2
Lakoff, G., 147
Lambda abstraction, 103, 109
Landing site, 19
Languagelike systems, 180–181, 189, 195, 197, 207, 216, 219, 229
Language specific rules (LSR), 36, 182
Language theory. See Theory of language
Larson, R., 94–96, 104–106, 109, 110, 115, 123, 138, 152
Lasnik, H., 48–50, 72, 85, 158, 177, 178
Last resort, 163, 164, 166, 168, 176–177, 179, 215, 228, 233
Laura, 96
Least effort, 15, 19, 56, 163, 164, 166, 168, 176–179, 215, 228, 230–234, 251n
Least energy, 19, 216, 229, 230
Legibility conditions, 15, 25, 26, 81–82, 163, 166, 184, 228, 236
Leiber, J., 19, 243n
Leibniz, G., 3
Lepore, E., 27, 72, 99, 137
Lerdahl, F., 192, 194, 200–202, 204–205, 211, 219–220, 223–228
Levinson, J., 203–204, 207–208
Lewis, D., 88
Lexical features, 17, 19, 82, 126, 165, 167, 171, 215, 217–219, 232
Lexical semantics, xvii, 73–74, 126, 129, 130, 133–134, 145, 151, 155–157, 248n
Linear correspondence axiom, 47
Linguistic specificity, 175, 179, 183–188, 198, 215–216, 220
Local domain, 59, 70, 83, 144, 166–169, 223
Localist conception of science, 19, 21
Logic, Logical systems, 19, 22, 34–35, 73–76, 79, 86–87, 94, 100, 104, 106–107, 110, 115, 181, 238–239
Logical Form (LF), 27, 35–39, 56, 61–63, 65–72, 76–78, 82–86, 89–90, 94–95, 116, 119–121, 126, 133, 135–136, 167, 169, 173–177, 199, 209, 245n
Long distance binding, 60
Look-ahead, 17, 169
McConnell-Ginet, S., 94
McDermott, J., 191, 218
McGilvray, J., 199
McNeill, D., 80, 241
Maess, B., 192
Malt, B., 97
Mapping principle, 71–72, 83, 87
Marantz, A., 78, 135, 164
Marconi, D., 134
Markerese/mentalese explanation, 135–136, 145–146, 156
Marler, P., 241
Marr, D., 195
Matilal, B., 243n
May, R., 95
Mehler, J., 8
Merge, 159, 164–173, 178–179, 197, 214–220, 229–230, 232, 241
  external, 171–172, 174, 221, 223, 227, 249n
  internal, 170–172, 174, 177, 186–187, 196, 222, 228
Michie, D., 236
Miller, G., 80, 241
Milsark, G., 71
Mind-body problem, 7, 252n17
Mind/brain, xvii, 14, 89, 92, 93, 106, 243n
Mind-internal, xvii, 26, 28, 95, 97, 101, 105, 116, 245n
Minimalist program, 15, 17, 29, 38, 50, 56, 63, 66, 70, 81, 119, 159, 162–165, 168–169, 179, 184–185, 197, 216, 221, 222, 229, 234
Minimal link condition, 177–179, 185, 228, 232
Minsky, M., 18
Mithen, S., 191
Miyagawa, S., 47
Model theory, 84, 85, 94, 104, 107
Mohanty, J., 243n
Montague, R., 85, 93, 101–109, 113
Moravcsik, J., 76, 149
Move-α, Move, 36–38, 53, 56, 66, 162–164, 169, 177, 184, 186–187, 234
Mukherji, N., xvi, 30, 107, 110–113, 123, 137, 188, 195, 204, 209, 237
Murge, 218
Music, xviii, 22, 181, 183, 186, 189–197, 200–210, 214–227, 229, 234, 236–239
Musical affects, 202–205, 250n12, 250n14
Musical notes, 193, 200, 201, 251n
Musilanguage hypothesis (MLH), 189–190, 192, 194–195, 197–198
Nagel, T., 6
Natural sciences. See Basic sciences
Neale, S., 95, 103, 109, 110, 115
Nematodes, 18–19
Nespor, M., 78
Nevins, A., 213
Newport, E., 8
Newtonian science, 5, 21, 22–23, 29, 158, 186, 233
No tampering condition, 169–171, 217
Noyer, R., 78, 135
Ormazabal, J., 84, 247n
Ostertag, G., 31
Otero, C., 17
Outer domains, 6, 7, 24
Pāṇini, 2, 3, 4, 22, 233
Passive constructions, 26, 32, 36, 38, 49, 52, 122, 129, 137, 247n
Patel, A., 192, 195
Peano, G., 104, 246n
Penn, D., 239
Penrose, R., 23–24
Perfection, Perfect system, xv, 15, 19, 164, 230–232
Perry, J., 95, 106
Pesetsky, D., 171, 172, 213, 220–222, 228, 251n, 252n
Peters, S., 102, 103, 109
Petitto, L., 8
Phases, 78, 169, 173, 174
Phonetic form (PF), 35, 37, 38, 63, 77, 83, 119, 121, 177, 199
Phonological phrase, 78, 244n
Phrase structure rules, 39–42, 44, 51, 139, 169, 240, 244n
Piattelli-Palmarini, M., 7, 19, 29, 205
Pietroski, P., 7, 25, 88, 99, 115, 162
Pinker, S., 26, 35, 52, 78, 133, 144
Plato, 3
Plato's problem, xvii, 3, 7, 9, 11–12, 14, 25, 28, 29, 34, 46, 70, 72, 123, 129, 135, 150
Pollock, J., 87
Port Royal, 3, 4, 22
Postal, P., 151
Poverty of stimulus, 7, 14, 32, 65, 134, 162, 196
Povinelli, D., 239
Premack, D., 175
Priestley, J., 21
Principles and parameters, 1, 12, 29, 36, 37, 39, 54, 57, 162, 182
Prinz, J., 127
Probe, 171, 172
Procrastinate, 177, 178
Projection principle, 37, 42, 50, 176, 182–184
Prolongation-reduction, 219, 221, 222, 224
Proto science, 3, 4
Pullum, G., 162
Purely computational principles (PCP), 36, 179, 182–186, 215
Purely formal relations (PFR), 75–77, 80–82
Pustejovsky, J., 35, 146
Putnam, H., 1
Pylyshyn, Z., 235
Quantifier, Quantifier movement, 18, 29–31, 64, 66–71, 81–86, 102–103, 108–111, 172, 174, 176, 188, 200, 245n, 246n20, 246n21. See also Scope problem
Quantum physics, 16, 21, 23, 231
Quasi-PCP, 183, 184, 185
Quine, W., 68, 110, 111, 137–140, 142, 145, 156
Raaga, 204, 211–212, 224–228
Radford, A., 49, 168
Raffman, D., 199, 200, 202
Ramus, F., 64, 194
Recursion, recursive systems, 40, 88–93, 104, 159, 176, 209–214, 216–220, 240–241, 250n19, 252n11, 252n19
Reference, problem of, 26–27, 99, 100, 105, 158, 246n
Reinhart, T., 175–176, 233
Reuland, E., 71
R-expression, 58–59, 69, 80, 99, 183, 245n
Ṛgveda, 2, 243n
Rhythmic, prosodic structures, 8, 194, 222–223, 228
Riemsdijk, H., 12, 16, 50, 75, 90, 101, 180, 183, 234
Rodman, H., 143
Rouveret, A., 53, 176
Rovane, C., 27
Rule following, 140–141, 144, 247n
Rules and representations, xvii, 8, 141. See also Computational-representational framework
Russell, B., 29–34, 70, 72, 85–86, 99, 102, 107–109, 115, 198, 208, 246n
Śabda-brahman, 3
Saffran, J., 139, 196
Sanskrit, 2
Saussure, F., 22
Schellenberg, E., 191
Schiffer, S., 90
Science forming capacity, 5, 9, 104
Scope problem, 29, 30–35, 38, 70, 72, 86, 98, 158, 175
Scruton, R., 198, 202, 206, 207
Searle, J., 74–82
Segal, G., 94–96, 104–106, 109, 110, 115, 123, 138, 152
Selectional features, 124–126
Sells, P., 62
Semantic value, 18, 100, 102, 103, 105, 109
Sensorimotor systems, 15, 163, 164, 173–174
Shortest derivation condition, 177–179, 185
Sign language, 8, 12
Smith, N., 25
Snedeker, J., 46
Specific language impairment, 12, 194, 247n
Spell-out, 63, 164
Sportiche, D., 87
S-structures, 16, 37–38, 50–53, 56, 62, 64–66, 163–165, 173, 184
Stability in music, 188, 223–225, 228
Stainton, R., 2, 88, 93, 99, 116, 131
Stands-for relation (SFR), 75, 77, 80–83
Strawson, P., 30, 34, 88, 111–115
Striedter, G., 192
Strong crossover, 61
Strong minimalist thesis (SMT), 164, 169, 173–174
Strong musilanguage hypothesis (SMH), 197–198, 214, 216, 227, 229, 249n
Subjacency principle, 56, 66, 70, 182, 184–185. See also Minimal link condition
Superiority, 56
Superordination, 145–146
Symbol manipulation, 74–75, 175, 236, 238
Symbols, xvii, 74–76, 99, 101, 106–107, 176, 179, 183, 198, 199, 201, 208–209, 235–242, 252n18
Szabó, Z., 246n
Thomason, R., 104 Thoughts, thought systems, xvii, 18, 26, 78, 101, 207, 222, 226, 239 Tonal space, tonal context, 192, 204– 205, 207, 211, 219–220, 222–223, 225–227 Torrego, E., 171–172 Trace-deletion hypothesis, 25 Transformational grammar, 16, 40, 162, 186 Trehub, S., 191 Truth theory, 85, 88, 90–91, 93–94, 96–98, 115, 245n Turing, A., 22, 34–35, 74, 235, 237 Turing test, 235–237 Turk, I., 191 Ullman, M., 144 Unboundedness, xviii, 193, 195, 210, 213–214, 216, 239 Unification problem, 16, 20–24, 25, 28, 233, 234, 238 Uniformity principle, 162 Uninterpretable features, 17, 70, 167– 169, 172, 187–188, 222, 224 Universal grammar (UG), 23, 35, 37, 38, 144, 161, 164, 215–217, 219, 230, 232 Urform, 232 Uriagereka, J., 19, 49–50, 72, 137, 166, 170, 176–178 Varzi, A., 181 Vergnaud, J., 53, 180 Vijaykrishnan, K., 204, 206 Visibility condition, 54 Visual cli¤ experiment, 8 Visual system, 5, 7, 11–12, 140, 142, 193, 194, 195–197, 238, 248n, 249n von Humboltd, W., 3, 22 VP-internal subject, 49, 71, 87, 167 Wall, R., 102–103, 109 Wallin, N., 218 Wasow, T., 57, 166 Wexler, K., 7 Wilson, E., 19
Wierzbicka, A., 17 Wittgenstein, L., 10, 88, 141, 151, 154, 201–203, 205–209 X-bar theory, 37–38, 43–47, 49, 51, 53, 79, 165, 167, 173, 182–184, 248n Yamada, J., 96 Zubizarreta, M., 17