object; hence the verb can merge with both the nominal constituent that book and the prepositional constituent to Pat. On the other hand, in (11b), the verb admire has an object feature; therefore this verb can merge with that book, but not with to Pat.

(11) a. Chris gave that book to Pat
     b. Chris admired that book (*to Pat)

Relatedly, the determiner that has a singular Number feature <SING>, so it will not be able to merge with any nouns that have a non-matching plural Number feature.
(13) a. I like [that book]
     b. [that book] I like

(14) Which politicians does Chris think will be elected?

To explain how a constituent can appear in a syntactic position distal from its merged position, Chomsky (2001) claims there is a second syntactic operation, called Internal Merge (IM), which can re-locate constituents within a Derivation. This operation differs from EM in one essential way – it builds structure not by linking elements from N into D, but by re-positioning elements already in D somewhere else in D. That is, IM is a D-to-D mapping, as stated in (15).

(15) IM: D → D

What IM does, in particular, is take an element a ∈ D and remerge it in D. So, if D consists of [d [b a]], then IM can select the element a and remerge it (see (16)).

(16) IM: {a, [d [b a]]} → [a [d [b a]]], where the lower a is a phonetically inert copy of a.

As with our first formulation of EM (8), IM (16) is much too powerful. It would allow any element in D to be remerged, thereby permitting the verb have in (17a) to remerge as in (17b).

(17) a. Pat shouldn’t have left – IM →
     b. *Have Pat shouldn’t have left.

The fact that (17b) is ungrammatical demonstrates that IM (16) overgenerates syntactic structure and that IM must reduce its derivational capacity. Chomsky (2001) lessens the computational power of IM in two ways. First, rather than letting any element remerge at any point in the derivation, Chomsky limits remerge to cases involving feature-match. For remerge to occur, a head (say d) must have a concatenative feature
Notice that both EM and IM build syntactic structure by locally linking constituents with matching features. The second way Chomsky constrains IM is by delimiting the search domain of the head that initiates remerge. This can be seen in the derivation in (19), which is driven by a C head that has both a <MOD> feature and a <WH> feature.

(19) a. [C<MOD, WH> [Pat will<MOD> read [what<WH> to whom<WH>]]] – IM →
     b. [will<MOD> [C<MOD, WH> [Pat will read [what<WH> to whom<WH>]]]] – IM →
     c. [what<WH> [will<MOD> [C<MOD, WH> [Pat will read [what to whom<WH>]]]]]
     ‘What will Pat read to whom?’

In (19a), the C<MOD, WH> head will look for a constituent with a <MOD> feature within its search domain. Once it locates the constituent will<MOD>, the search will terminate and IM will apply, remerging will<MOD> in the matrix CP. Since the C head also has a <WH> feature, the head must begin another search – one that looks for a <WH> counterpart feature. The search will terminate when the first active <WH> is found on the constituent what<WH>. IM will apply again, this time remerging what<WH> in the matrix CP. Despite the seeming ease with which (19c) is derived, this derivation is actually extremely complex. If we look closely at how the derivation proceeds from (19b) to (19c), we will see some of this complexity. Of note here is the fact that IM can’t apply directly to (19b). Rather, before IM can apply to (19b), the C<WH> head must initiate a search for a constituent
with a <WH> feature – that is, a FIND<WH> operation must apply to D. The FIND operation itself is a complex (composite) operation that must LOOK-AT-FEATURES of every constituent K in the Derivation, and that must DISREGARD all constituents lacking the relevant <WH> feature.
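To see why FIND is computationally burdensome, consider the minimal Python sketch below. It is our own illustration, not the authors’ formalism; the feature labels simply follow the reconstruction of (19). The probing head must inspect the features of every constituent in the derivation, disregarding non-matching ones, until the first active match is found – so the cost of every remerge grows with the size of D.

```python
# A minimal sketch (ours, not the authors') of the composite FIND operation:
# locating a remerge candidate forces the probing head to LOOK-AT-FEATURES of
# every constituent K in D and to DISREGARD each non-matching one, so the
# cost of the search grows with the size of the derivation.

def find(feature, derivation):
    """Return the first constituent bearing `feature`, plus the search cost."""
    for cost, (constituent, features) in enumerate(derivation, start=1):
        if feature in features:     # first active match terminates the search
            return constituent, cost
        # otherwise DISREGARD this constituent and keep looking
    return None, len(derivation)

# (19b), top-down: the search for <WH> must pass over will, Pat, and read
# before reaching what<WH>.
d19b = [("will", {"MOD"}), ("Pat", set()), ("read", set()),
        ("what", {"WH"}), ("whom", {"WH"})]
print(find("WH", d19b))   # -> ('what', 4)
```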
This is a rather problematic operation in that it is unclear where this operation could take place. Another problem with current minimalism is that the C-head possessing the [Q]-feature in the derivation illustrated in (19) – the head that must peer down into its c-command domain to find a relevant goal – has until that point been blind to the wh-items that have existed in the derivation for quite some time. That these wh-items are not found and recognized until the very end of the derivation is problematic for a true minimalist analysis of syntax. For example, Internal Merge requires ancillary operations such as FIND PHASE EDGE that burden the processing load of the language faculty.
3. Internal merge and reconstruction effects

In this section, we look more closely at the IM operation by considering how an IM analysis can account for the Reconstruction Effects in (20).

(20) a. [which picture of Bill that John likes] did he buy? *Bill…he / OK John…he
     b. He bought [a picture of Bill that John likes]. *he…Bill / *he…John

What needs to be explained in (20) is why the coreferentiality relationship between the pronoun he and the DP Bill is not affected by whether the bracketed DP undergoes displacement (as in (20a)) or not (as in (20b)), while the coreferentiality relationship between the pronoun and the DP John is affected by the displacement of the bracketed DP. We need to determine what role, if any, IM plays in these differing coreferentiality relationships. The data in (20) led Chomsky (1993) to conclude that arguments contained within displaced constituents, such as the DP Bill in (20a), behave, in terms of Principle C relations, as if they are in their pre-displacement positions (see (20b)). In other words, the constituent [which picture of Bill] in (20a) acts as if it shows up twice in a syntactic derivation – once in its displaced (internally merged) position and once, in reconstructed copy form, in its (externally) merged position, as in (21).

(21) [which picture of Bill that John likes] did he buy [which picture of Bill]

For Chomsky (1993), Lebeaux (1995), Epstein et al. (1997), Fox (2003), and many others, the data in (20) provide important support for some version of a copy theory of movement (variants of Internal Merge). The above analysis, however, does not hold for adjuncts contained in displaced constituents. As we can see in (20a), the adjunct (the relative clause) contained within the displaced constituent does not appear to show up twice syntactically. If it did show up in the same places that the argument DP Bill does, then we would expect that (20a) would have (22) as its syntactic representation at some point in the derivation.

(22) [which picture of Bill that John likes] did he buy [which picture of Bill that John likes]
But (22) is not a possible representation for (20a) because in (22) the pronoun he c-commands (and should not be co-referential with) the adjunct-contained DP John. The fact that the pronoun and the DP can be coreferential argues against (22). What this means is that although the adjunct can show up once the bracketed constituent is displaced in (20a), it cannot also show up syntactically in the pre-displaced bracketed constituent. Within IM-style analyses, there are only two ways of explaining why the adjunct in (20a) shows up syntactically after the wh-constituent has been displaced. Either the adjunct is merged into the derivation after the wh-displacement (the Late Merge Hypothesis) or the adjunct is merged into the derivation prior to the displacement, but becomes syntactically visible only after the displacement (the Simpl Hypothesis). Late Merge analyses of data such as (20a) have been advanced by Lebeaux (1991), Chomsky (1993), Fox (2003), and others. Under the Late Merge analysis, (20a) is derived as follows:

(23) a. he did buy [which picture of Bill] – wh-movement →
     b. [which picture of Bill] did he buy [which picture of Bill] – adjunct Merge →
     c. [which picture of Bill [that John likes]] did he buy [which picture of Bill]

Even though this derivation expresses all the structural constraints on coreferential relations between the pronoun he and both the DP Bill and the DP John, it is an untenable analysis. Behind the apparent simplicity of (23) lurks a significant problem. As Chomsky (2001) notes, no Merge operation should “tuck” elements into a derivation D because of the processing complications that arise when structure is re-modeled from within (the structure would have to be, in essence, dismantled and then rebuilt). Merge operations, according to Chomsky, must be cyclic operations that build structure at the edges of D, not within D. If we look at derivation (23), we will see that the Late Merge of the adjunct is a case of a non-cyclic Merging of the relative clause into the wh-constituent; therefore this Merge should be disallowed. It might be possible, however, to circumvent the “tucking in” problem by allowing the IM operation to re-situate the wh-constituent in (23a) in some work-space outside the derivation (where it could Merge the adjunct) prior to remerging the constituent back into D; unfortunately, doing this would add several complexities to the IM operation beyond those we discussed in the previous section. These complexities would include
changing IM from a D-to-D mapping to a D-to-Workspace mapping and then requiring a subsequent operation that could remerge elements from the Workspace to D, among other complexities. At this point, the IM operation loses any semblance of simplicity and, in fact, loses any claim to being an operation that involves the internal merge of constituents. It would seem, then, that if we are to have the IM operation in our Narrow Syntax, we must follow Chomsky in assuming that the adjunct in (20a) is not merged late into the derivation; rather, it is merged in its externally Merged position shown in (20b), though it is Merged differently than an argument would be. The adjunct is Pair Merged into D, which means, essentially, that it is merged, but it is not syntactically visible; and it will remain syntactically invisible and inert until the Simpl operation makes the adjunct syntactically present. Under these assumptions, (20a) will have the partial derivation in (24). (Recall that the italicized constituent – rendered here between underscores – is syntactically invisible.)

(24) a. he did buy [which picture of Bill [_that John likes_]] – wh-movement →
     b. [which picture of Bill [_that John likes_]] did he buy [which picture of Bill [_that John likes_]] – Simpl →
     c. [which picture of Bill [that John likes]] did he buy [which picture of Bill [_that John likes_]]

Attractive as this analysis may be for the data in (20), it does not hold up if we look at some other adjunct constructions. Consider the data in (25).

(25) a. [after Pat wakes up] I want her to leave
     b. I want her to leave [after Pat wakes up]

In (25a), the bracketed constituent is a temporal adjunct that modifies the embedded verb leave and, under Chomsky’s analysis of adjuncts, this adjunct will have to be externally Merged in its verb-modifying position, as in (25b). Since the adjunct is Pair Merged in the embedded VP, it will be syntactically invisible there. But now we face a quandary. We need to move the adjunct from its embedded position in (25b) into its displaced position in the matrix sentence (as in (25a)). However, this adjunct can’t get a free ride the way the adjunct in (20a) does. That is, the adjunct in (20a) is contained within a wh-constituent that gets displaced, so the adjunct gets displaced as a by-product of the wh-movement. Since the adjunct in (25b) can’t get a free ride, it will have to move on its own accord. This requires, however, that the adjunct be syntactically visible for the IM operation. In
other words, before the adjunct can be moved, it will have to undergo Simpl. The derivation for (25a) will have to proceed as in (26).

(26) a. I want her to leave [_after Pat wakes up_] – Simpl →
     b. I want her to leave [after Pat wakes up] – IM →
     c. [after Pat wakes up] I want her to leave [after Pat wakes up]

This derivation leads us to a problematic conclusion; that is, given derivation (26), it should be impossible to have co-referential relations between the pronoun her and the DP Pat because the pronoun c-commands the DP in structures where the adjunct is syntactically visible – (26b,c). The fact that the pronoun and the DP can be co-referential suggests that the Simpl-based derivation (26) is not a viable derivation for (25a). Of the two IM analyses we’ve considered for adjunct-displacement constructions, neither of them is feasible. Since it seems at present that there are only two possible IM analyses for these constructions, we are left, as we were at the end of the last section, with the nagging sense that IM (or any movement-related operation) lacks both conceptual and empirical motivation.
4. Surviving reconstruction

After exposing the conceptual faults that we encountered with the two IM-analyses in accounting for adjunct-displacement constructions and the potential added burden that they place on the computational system with regard to language processing, in this section we present an alternative analysis along the lines of Stroik (1999, to appear) and Putnam and Stroik (in preparation). Under this view, we interpret the displacement of syntactic objects from their base position as driven not by Attract or Move, but by survival. Stroik defines this grammatical primitive as the Survive Principle:

(27) The Revised Survive Principle (based on Stroik 1999: 286)
     If Y is a syntactic object (SO) in an XP headed by X, and Y has an unchecked feature [+F] which is incompatible with the features of X, Y remains active in the Numeration.

To provide an illustration of the Survive Principle in action, consider the sentence in (28) with its derivational history following in (29) (data taken from Stroik (to appear): 79–80).
(28) Who snores?

(29) a. Merge {who, snores} → who snores
     b. Survive {who}
     c. Merge {T, {who, snores}} → T who snores
     d. Remerge {who, {T, {who, snores}}} → who T who snores
     e. Survive {who}
     f. Merge {C, {who, {T, {who, snores}}}} → C who T who snores
     g. Remerge {who, {C, {who, {T, {who, snores}}}}} → who C who T who snores
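The derivational logic of (29) can be made concrete with a short Python sketch. This is our own illustration of the Merge-Survive-Remerge loop, not the authors’ implementation; the feature labels (theta, agr, Q) simply follow the discussion of (29) in the text below.

```python
# A minimal sketch (ours, not the authors') of the Merge-Survive-Remerge
# loop in (29). An item remerges whenever a newly merged head matches one of
# its still-unchecked features, and it survives (stays active in the
# Numeration) so long as any features remain unchecked.

def derive(numeration, heads):
    derivation = []                     # D, most recent element leftmost
    active = dict(numeration)          # items still available for (re)merge
    for head, head_features in heads:
        derivation.insert(0, head)     # Merge the head itself
        for item, features in list(active.items()):
            matched = features & head_features
            if matched:
                derivation.insert(0, item)      # (Re)merge on feature match
                remaining = features - matched
                if remaining:
                    active[item] = remaining    # Survive: still active
                else:
                    del active[item]            # fully licensed
    return " ".join(derivation)

# 'who' bears a theta-feature (checked by V), an agreement feature (checked
# by T), and a Q-feature (checked by C), so it merges three times.
numeration = {"who": {"theta", "agr", "Q"}}
heads = [("snores", {"theta"}), ("T", {"agr"}), ("C", {"Q"})]
print(derive(numeration, heads))   # -> 'who C who T who snores', cf. (29g)
```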
Upon the concatenation – through the operation Merge – of a syntactic object with a head, both bearing the matching feature δ, the syntactic object will survive and remain active in the lexicon if the syntactic object bears any additional features not present on the immediately governing head. In the derivation above the wh-item who will be mapped into the vP to check its θ-feature. At this point in the derivation a link is established signaling to the external interfaces (e.g. LF, PF) the thematic identity associated with this concatenate structure (cf. Putnam 2006b, 2007). Immediately after the concatenation of <who, snores> (29a), who survives from this position due to the additional features it possesses that must be properly licensed through iterative applications of Merge and Remerge in the course of the derivation. In steps (29d) and (29g) who remerges from the lexicon in order to properly discharge its agreement and Q-features. Perhaps the term ‘discharge’ is a bit of a misnomer, because the true motivation behind the sequence of Merge-Survive-Remerge is to generate concatenate structures that are interface interpretable. As explained earlier in Section 2 of this paper, the mapping of copies of lexical items into the narrow syntax rather than the objects themselves eliminates the need for Copy Theory and “movement” a priori from the theory, thus providing a purely derivational account of syntactic operations rather than a view of mixed theory that is weakly representational (cf. Brody 1998, 2002). The iterative application of Merge-Survive-Remerge also provides a straightforward account of long-distance wh-movement previously unattainable in minimalism.

(30) What_i do [TP t_i you think [CP t_i John likes t_i]]?

(31) *What_i do [TP t_i you think [CP that John likes t_i]]?
Sentences similar to (30) and (31) serve as canonical examples in the generative tradition to illustrate the reality of cyclic movement. In (30) the wh-item what must move to the left periphery of the embedded CP, TP and then to its final destination in the matrix CP. Example (31) shows that what must strictly adhere to cyclic movement or else the system will ultimately crash. Any theory of syntax employing either a Move or Attract model of constituent construal must delay the final feature-evaluation and subsequent checking or valuing process until the final C enters the derivation. Furthermore, successive cyclicity is an unsubstantiated formative in these models, i.e. it is a necessary component of the theory although we have little if any proof why it exists. Our version of XP-displacement under the Merge-Survive-Remerge mechanism forces the evaluation of the feature identity of all lexical items upon the merger of every head into the narrow syntax. In example (30), after concatenating with V, the wh-item what immediately survives (due to its remaining [Q] feature) and remains active in the lexicon for further operations. This syntactic object is an eligible candidate to remerge into the syntax at any time; however, it can only do so when a head with a matching feature appears. Upon every application of head merger an evaluation process takes place within the computational system. Returning to the focus of this paper, the remainder of this section will illustrate the conceptual advantages our approach has in properly deriving reconstruction structures while avoiding the aforementioned pitfalls of IM-analyses. First, let’s return to example (20) from the previous section.

(20) a. [which picture of Bill that John likes] did he buy? *Bill…he / OK John…he
     b. He bought [a picture of Bill that John likes]. *he…Bill / *he…John
5. Bear in mind that the derivational histories in (30) and (31) represent a movement-based analysis akin to former instantiations of generative theory. The proposal put forward in this paper does not support the theoretical approach that constituent displacement takes place by means of Move but rather Survive.
6. David Pesetsky (personal communication) points out that the application of evaluation processes upon every iterative head merger could also potentially be very costly from a processing standpoint. Be that as it may, it is far more economical to envision a system which immediately evaluates candidates rather than one that makes use of look-ahead and look-behind operations.
Both the application of Late Merge and the Simpl operation are untenable options in explaining these Condition C inconsistencies. Late Merge (as currently formulated) requires the “tucking in” of the adjunct [that John likes] into the wh-item, which in itself is an undesirable result, while the “peek-a-boo” effects of Simpl cloak the adjunct through some of the derivation and make it visible for syntactic operations and effects later on. Although Simpl is an attractive alternative solution for (20), it does not hold up when we consider other adjunct constructions such as (25).

(25) a. [after Pat wakes up] I want her to leave
     b. I want her to leave [after Pat wakes up]

The crux of the matter is determining how and when the adjunct [that John likes] enters the syntax. Fox’s (2004) current version of Late Merge faces the unwanted “tucking in” problem since it applies in the Derivation rather than in the Numeration. In a Survive-based model of syntactic derivation, we can avoid tucking the adjunct (qua Merge) into the complex wh-item by arguing that the adjunct resides in the Numeration and adjoins to the wh-item [which picture of Bill] prior to its remerging into the syntax, since the wh-item survives and returns to the lexicon due to its [Q]-features, which must be checked in CP. Call this operation Late Num Merge. Two points must be clarified here to understand the conceptual advantages of our minimalist, derivational approach to generating syntax: First, the adjunct [that John likes] is a syntactic object in the Numeration; therefore its concatenation with [which picture of Bill] will not be non-cyclic and therefore does not fall victim to “tucking in”. Second, the cyclic application of our reformulation of Late Merge forces the adjunct [that John likes] to be visible in the syntax for all operations. This fact allows us to abandon the now unnecessary Simpl operation on the grounds of virtual conceptual necessity. Since the DP John was not a part of the original complex wh-item [which picture of Bill] that merged into the VP prior to its repulsion, there is no point in the derivation during which the pronoun he could potentially c-command John, thus explaining how John and he can be co-referential in (20a). The derivational history in (32) below highlights the pivotal steps in the composition of (20a).
7. We take an agnostic view at this time as to whether adjunction is motivated by some sort of feature or feature-like entity active in the Numeration or Derivation (also see Putnam 2006a: ch. 4 and Rubin 2003).
(32) a. Merge {buy, [which picture of Bill]} → buy which picture of Bill
     b. Survive [which picture of Bill] ([Q]-feature) →
     c. Merge {he, {buy [which picture of Bill]}} → he buy which picture of Bill
     d. Merge {did, {he, {T, {buy [which picture of Bill]}}}} → did he buy which picture of Bill
     e. Merge {C, {did, {he, {T, {buy [which picture of Bill]}}}}} → C did he T buy which picture of Bill
     f. Late Num Merge {[which picture of Bill], [that John likes]}
     g. Remerge {[which picture of Bill that John likes], {C, {did, {he, {T, {buy [which picture of Bill]}}}}}} → which picture of Bill that John likes C did he T buy which picture of Bill
     ‘Which picture of Bill that John likes did he buy?’

The non-cyclic application of Late Num Merge (32f) in the Numeration rather than in the course of the Derivation provides a straightforward explanation of Condition C asymmetries within core minimalist desiderata.
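A minimal continuation of the earlier sketch shows where Late Num Merge differs from IM-style Late Merge. Again, this is our own illustration, not the authors’ formalism: the adjunct combines with the surviving wh-item inside the Numeration, so already-built structure is never reopened.

```python
# A sketch (ours) of Late Num Merge: the adjunct adjoins to the wh-item while
# the latter sits in the Numeration (having survived), so no already-built
# structure is "tucked into". Only Remerge then touches the derivation.

numeration = {"which picture of Bill": {"Q"}}
adjuncts = ["that John likes"]
derivation = ["C did he T buy [which picture of Bill]"]    # after (32a-e)

# (32f) Late Num Merge, inside the Numeration:
wh, feats = numeration.popitem()
wh = f"{wh} {adjuncts.pop()}"    # 'which picture of Bill that John likes'
numeration[wh] = feats

# (32g) Remerge: the enlarged wh-item checks its [Q]-feature in CP.
derivation.insert(0, f"[{wh}]")
print(" ".join(derivation))
# -> '[which picture of Bill that John likes] C did he T buy [which picture of Bill]'
```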
5. Theoretical consequences

The removal of Internal Merge (Move) would undoubtedly have wide-sweeping effects on the entire generative enterprise. A closer look at these changes shows that they would be a welcome adjustment to the minimalist program. First, by dismissing Internal Merge in favor of a Survive approach, the theory needn’t construe and enforce economy constraints upon the language faculty anymore; the fact that local evaluation processes take place at every step of Merge and Remerge throughout the course of the derivation mitigates the necessity of the existence of such constraints on the grammar. Economy is a natural by-product of the Survive Principle. Second, Chomsky’s formulation of phases (vP and CP) and multiple Spell-Out (cf. Uriagereka 1999 and a host of others) can be removed from the system. In current minimalist models of syntax supporting either a Move or Attract stance on XP-construal, not only must some version of look-ahead or look-back feature evaluation exist, but some sort of evaluating property that recognizes the shape of phases must also be a component of the theory. In a minimalist syntax that is argued to be label-free in the spirit of Collins (2002), it is conceptually puzzling and taxing from a processing standpoint why the human language faculty should/must be responsible for both
feature evaluation and the recognition of larger units such as phases. A derivational model based on the Survive Principle has the distinct advantage of destroying the theory’s reliance on such rich ontological commitments. Before moving to the conclusion, it must be pointed out that through the Survive Principle we seek to remove Internal Merge without abandoning a derivational view of syntax. Taking into account Brody’s (1998, 2002) excellent and accurate criticism of minimalism as a mixed, weakly derivational theory, we needn’t surrender minimalism to a purely representational line of thought. On the contrary, we seek to remove Internal Merge which, according to Stroik (to appear), is the culprit behind minimalism’s classification as partly derivational and partly representational. The elimination of Internal Merge in favor of Survive creates a purely derivational view of minimalism. Most importantly, our revision of the minimalist program does not increase the computational workload of the language faculty, but rather significantly reduces the constraints and ontological commitments internal to the narrow syntax, as well as what materials appear at the interfaces. These adjustments bring us one step closer to Frampton and Gutmann’s (2002) vision of a crash-proof syntax and how it should operate.
6. Conclusion

In this paper we set out to address the Principle C asymmetries connected with adjunction reconstruction. We showed where both proposed solutions affiliated with Internal Merge (or any theory of constituent displacement by means of Move, for that matter) face significant shortcomings in the analysis presented here. Chomsky’s (2001) Simpl taxes the computational language faculty with an operation that forces it to reconfigure previously ‘invisible’ structure for syntactic considerations, while the application of Fox’s (2004) Late Merge forces the allowance of “tucking in” of structures to previously merged constituents. Based on the work of Stroik (1999, to appear) and Putnam and Stroik (in preparation), we proposed a version of constituent displacement driven by the repulsion of objects rather than attraction (Survive). Furthermore, we adopted a version of Fox’s (2004) Late Merge that applies in the Numeration rather than in the narrow syntax, which we labeled Late Num Merge. Two conceptual advantages of our approach immediately come to the forefront: First, we do not have nor require any non-cyclic applications of Merge (or any other operation, for that matter) in our system. Second, Late Num Merge obviates the “tucking in” issue associated with Fox’s (2004) current formulation of Late Merge.
The implementation of Survive has far-reaching effects on a view of minimalist syntax. By removing Internal Merge from the system, economy constraints, the concept of phases, and multiple Spell-Out are also deemed disposable, due to the fact that they are no longer conceptually necessary.
Acknowledgements

Portions of this paper have been presented at LASSO 2005, DEAL 2005 and the Syntax Support Group Colloquy Series at the University of Michigan. In connection with these presentations, this paper has benefited from comments and discussions with Sam Epstein, Kleanthes Grohmann, Dennis Ott, David Pesetsky and Peter Sells. All remaining shortcomings and errors remain our own.
References

Adger, David
  2003 Core Syntax. Oxford: Oxford University Press.
Brody, Michael
  1998 Projection and phrase structure. Linguistic Inquiry 29: 367–398.
  2002 On the status of representations and derivations. In Derivation and Explanation in the Minimalist Program, Samuel D. Epstein and T. Daniel Seely (eds.), 90–105. Oxford: Blackwell.
Chomsky, Noam
  1965 Aspects of the Theory of Syntax. Cambridge, MA: MIT Press.
  1980 Lectures on Government and Binding. Dordrecht: Foris.
  1993 A minimalist program for linguistic theory. In The View from Building 20, K. Hale and S. J. Keyser (eds.), 1–52. Cambridge, MA: MIT Press.
  1995 The Minimalist Program. Cambridge, MA: MIT Press.
  2001 Beyond explanatory adequacy. MIT Occasional Papers in Linguistics 20. Cambridge, MA: MITWPL.
Collins, Chris
  1997 Local Economy. Cambridge, MA: MIT Press.
  2002 Eliminating labels. MIT Occasional Papers in Linguistics 20. Cambridge, MA: MIT Press.
Epstein, Samuel D., Erich M. Groat, Ruriko Kawashima, and Hisatsugu Kitahara
  1997 A Derivational Approach to Syntactic Relations. Oxford: Oxford University Press.
Fox, Danny
  2003 On logical form. In Minimalist Syntax, Randall Hendrick (ed.), 82–123. Oxford: Blackwell.
  2004 Condition A and scope reconstruction. Linguistic Inquiry 35: 475–485.
Frampton, John and Sam Gutmann
  2002 Crash-proof syntax. In Derivation and Explanation in the Minimalist Program, Samuel D. Epstein and T. Daniel Seely (eds.), 90–105. Oxford: Blackwell.
Frege, Gottlob
  1884 Die Grundlagen der Arithmetik. Eine logisch mathematische Untersuchung über den Begriff der Zahl. Breslau: Koebner.
Lebeaux, David
  1991 Relative clauses, licensing, and the nature of the derivation. In Perspectives on Phrase Structure: Heads and Licensing, S. Rothstein (ed.), 209–239. San Diego, CA: Academic Press.
  1995 Where does the Binding Theory apply? University of Maryland Working Papers in Linguistics 3: 63–88. Department of Linguistics, University of Maryland.
Putnam, Michael
  2006a Scrambling in West Germanic as XP-adjunction: A critical analysis of prolific domains. Ph.D. dissertation, University of Kansas.
  2006b Eliminating Agree. Ms., University of Michigan.
  2007 Scrambling and the Survive Principle. Amsterdam: John Benjamins.
Putnam, Michael and Thomas Stroik
  in prep. The Survive Manifesto: A Derivational Guide to Syntactic Theory [tentative title]. Michigan State University and the University of Missouri–Kansas City.
Rubin, Edward
  2003 Determining Pair-Merge. Linguistic Inquiry 34: 660–668.
Stroik, Thomas
  1999 The survive principle. Linguistic Analysis 29: 278–303.
  to appear Locality in Minimalist Syntax. Cambridge, MA: MIT Press.
Uriagereka, Juan
  1999 Multiple spell-out. In Working Minimalism, Samuel D. Epstein and Norbert Hornstein (eds.), 251–282. Cambridge, MA: MIT Press.
On the interface(s) between syntax and meaning

Anjum P. Saleemi
This paper addresses the well-known foundational issue of the tension between form and content (meaning in a wider sense of the term); more specifically, between morphosyntactic form and meaning. We assume syntax to be a gradual, combinatorial interface between the two broad interface systems, namely, the Articulatory-Perceptual interface (which is not dealt with here) and the Conceptual-Intentional interface. The latter is conceived of as something considerably more complex than is traditionally assumed, encompassing the derivational level known as LF (Logical Form) and a number of other aspects of meaning, in particular the conceptual-lexical one and the pragmatic-illocutionary one. So conceived, the semantic component contains most of the necessary elements required for basic interpretation, which are transformed into increasingly syntactic (and ultimately phonetic) representations as a result of adaptation to a language-specific lexicon, phrase structure, and various transformations. As part of the reconceptualization of linguistic theory outlined above, one of our major aims is to demonstrate that the framework in question is sufficient to successfully explain a range of specific phenomena, such as the conceptual basis of argument and event structure within v*P and the relationship between CP and illocutionary force. Our investigation reinforces the view that a major function of syntactic Phases is to repackage universal semantic representations into speech signals in a step-by-step, but at the same time parallel, fashion. In addition, it reveals that Phases represent the intersection between the purely syntactic and the broadly semantic categories of the human linguistic system, functioning as hybrid categories constituting the interfaces between form and meaning. Finally, adopting a more coherent idealization, a ‘neo-Saussurean’ version of minimalism, deploying linear or horizontal derivations, rather than the usual bifurcated ones, is proposed.
1. Introduction

What follows is an attempt to make some contribution to the theory of interfaces and Phase theory, both part of the Minimalist Program (e.g., Chomsky
1993, 1995a, 1995b, 2000, 2001). We have two related types of goals in mind: the programmatic-conceptual ones and the empirical ones. The accuracy of the former clearly must remain subject to further support and verification from data, as in most cases of scientific inquiry. Almost entirely focusing on the syntax-meaning interface, often known as the Conceptual-Intentional interface, we do not have much to say regarding the Articulatory-Perceptual interface, which involves the phonological-phonetic system as well as the actual sensori-motor mechanisms of speech.

A relatively recent addition to the Minimalist Program (henceforth MP) is Phase theory – PT for short (Chomsky 2001, 2004, 2005, to appear), which is an attempt to bring the issues bearing on the ‘psychological’ validity of linguistic constructions to the forefront of inquiry, assuming that packets or quanta (i.e., Phases, or Domains of Phases) of derived material are transferred to the two interface systems, namely, the Conceptual-Intentional interface (C-I ≈ LF) and the Articulatory-Perceptual interface (A-P ≈ PF), as soon as they acquire legibility in the relevant technical sense. Thus computational complexity and load are important motivating considerations in MP and PT, as is a renewed search for the identity of real linguistic units at a level of abstraction that strives to reduce structural categories and relations to a bare minimum (Chomsky 1995b), a line of inquiry explicitly initiated in Chomsky (1993), though its roots can be traced at least as far back as Chomsky (1955/1975). However, Phases, understood as a subset of ‘proposition-like’ syntactic categories (i.e., v*P and CP), are there not only to decrease the computational load on working memory, but also to ensure mutual intelligibility and connectivity between form and meaning, or, more appropriately in the present context, between syntax and meaning.

Ever since the Extended Standard Theory (EST) and its Y-model of derivation started to undergo radical changes and the elimination of D-structure and S-structure was proposed, it has been commonly understood that while the interface between syntax and A-P is relatively well-defined, C-I cannot just be equated with LF; thus Chomsky states that while “The level A-P has generally been taken to be PF; the status and character of C-I have been more controversial” (1993: 2; see Grohmann, to appear, for some suggestions regarding PF). Indeed, it has even been suggested recently that “…the final internal level LF is eliminated, if at various stages of computation there are Transfer operations” that deliver the already constructed Phases “to the semantic component, which maps it to the C-I interface” (Chomsky, to appear: 9). Such a view of C-I resonates well with the viewpoint outlined in the present work, one of whose aims is to demonstrate that the syntactic system as a whole is a complex set of interfaces between the two other primary or ‘outer’ interfaces of meaning and spoken form, and to suggest that the term ‘interface’ should not be confined to a single derivational level, particularly insofar as C-I is concerned.

The other issues addressed in the following pages range from those that pertain to the inner workings of the derivational cycle to those that belong, more importantly, to a certain larger project that has not been paid sufficient attention in the research; namely, that of defining and explicating the ‘outer’ interfaces, in particular the intentional part of C-I, the illocutionary (or pragmatic) aspects of meaning not having received the attention that they in principle deserve. It must be pointed out that our major focus, strictly speaking, is on illocutionary force as a unit of meaning, not on illocutionary acts or speech acts, which belong more to the A-P interface, even though they too must originate in the first instance as mental events. We emphasize the crucial role of the illocutionary aspects of meaning in addition to the conceptual-propositional ones, arguing that the semantic interface should in fact be treated as a many-layered and composite system. Further, it is suggested that the relevant elements from illocutionary logic and speech act theory (Searle and Vanderveken 1985, henceforth S&V; see also Searle 1969, 1979) should be incorporated into C-I to function as the pragmatic-semantic counterparts of the higher level(s) of the CP structure. In other words, the periphery of the CP Phase parallels the illocutionary force intended by the speaker, much as conceptual-lexical structure serves as the foundation of the basic propositional content of the v*P Phase. Also emphasized is the need to alter the derivational model currently in use in MP, typically assumed to bifurcate at SPELL-OUT into LF and PF, by ‘linearizing’ it. In essence such a perspective is by no means a radical departure from the standard minimalist approach, but it is innovative to the extent that it (re)introduces a ‘neo-Saussurean’ (Saussure 1916/1986) element in regard to the way the linguistic interfaces are interrelated. In general, our approach is meant to be closer to MP in spirit than much other work, which remains confined to the syntactic execution of things within pre-set limits. In a sense what we try to accomplish is to delineate a perspective based upon and conceptualized within the MP, though in part the latter is also viewed from without, as it were, with the intention of attempting to further those of its objectives which remain by and large unrealized.

1. Phases are not without historical antecedents, as Boeckx and Grohmann (to appear) rightly point out; see also this work for some constructive critical remarks.
2. The processing/functional considerations in regard to Phases are important, but it is most likely that they exist as a result of phylogenetic change, rather than in an online and ontogenetic sense. The phylogenetic structure of language, moreover, might have been determined by a range of factors in addition to those bearing directly on functional efficiency; see Chomsky (2005), and Hauser, Chomsky and Fitch (2002) for some specific remarks and a general discussion.
3. Speech acts are the units of human communication, and illocutionary acts are minimal units of this kind. Further, an illocutionary point, one of the seven components underlying an illocutionary act, is the “purpose which is internal to it being an act of that type” (S&V: 13). See S&V and the other related work for further explanation.
4. We rely in particular on this work because it embodies an illocutionary logic that is thoroughly worked out, but we do not take for granted all that it offers, nor do we consider it definitive. Consequently, later in this paper (see section 5.2.) some modifications to the theory presented in it are suggested.
2. The C-I interface: Some fundamental issues

We believe it is important to recall and reconsider the background of the form-meaning relationship within a broader conceptual and philosophical perspective, in part since any mention of illocutionary force and speech acts immediately brings to mind various ill-fated attempts within generative semantics, in particular the performative deletion analysis, to bring speech acts into syntactic derivations, or in general to reduce meaning to syntax (see Ross 1970; Gordon and Lakoff 1971/1975), not to mention some linguistic approaches which believe that the opposite aim, i.e., that of reducing syntax to meaning, should be held to be paramount. Thus it should be clearly stated that, unlike generative semantics, our approach does not involve the reduction of the illocutionary level of meaning to technical devices traditionally characteristic of transformational-generative syntax, nor indeed do we wish to achieve the converse. Some preliminary general remarks seem to be in order here to establish the relevant conceptual and philosophical backdrop. Thus, an apparent analogy can be drawn between the familiar mind-body problem and the problem of form vs. content, with the crucial difference that the latter problem involves entities both of which are mental. At any rate, an enormous gap has existed within each of these seemingly antithetical pairs of entities.
5. See Searle (1979) for some critical remarks on this enterprise.
One of the prevalent opinions is that the relationships between the members of each pair do not involve eliminative reduction, but are very likely to eventually boil down to mutual co-existence, perhaps in the end turning into some sort of unification (see Saleemi 2005, and the references cited therein, for a general discussion). It is noteworthy that in the case of the mind-body problem historically it is the concept of matter, one of whose manifestations is the body, that has undergone radical changes, not that of the mind, such that the former has expanded to include energy, fields of force, and the like. The distinction between form and content in human sciences, like the philosophy of meaning and linguistics, appears not to have evolved to the point where the validity of a certain conception of form could be addressed without recourse to its reduction in the direction of meaning, as in the case of generative semantics, or vice versa, as still remains the case in various kinds of cognitive or functional approaches to linguistics. Although the concept of form (particularly morphological, syntactic, and phonological) has admittedly kept undergoing changes even within a single paradigm like the generative one, most of its manifestations continue to retain an overextended concept of form. At least in principle, a notion of syntactic form constrained by “virtual conceptual necessity” (Chomsky 1993), MP, PT, and the theory of interfaces together seem to be a step in the right direction, as they aim to minimize or contract the scope of syntactic form, while at the same time to expand the role of the interfaces, without any tacit or open attempt at absolute reduction. This should make it easier to tease out form, on the reasonable assumption that any sort of reduction is impossible, and that syntactic form is not meaning in some elaborate disguise, but an independent vehicle intended to make the expression of meaning possible, with the two held together in an intricate fashion by a host of mapping devices. Consider, for example, the following three simple logical possibilities in respect of the form-meaning relationship, partially reiterated here: (1)
     a. Reduction of form to meaning.
     b. Reduction of meaning to form.
     c. Redefinition of the concept of form, resulting in an expansion of the interfaces.
It is clearly the third possibility (1c) that should be considered to be more viable in MP. In contrast, neither (1a) nor (1b) can explain the relative grammaticality of an expression like (2), a sentence containing nonsense lexical items but retaining minimally correct morphosyntactic form, nor can
it account for examples such as (3), for similar reasons, or for that matter for the oft-cited example in (4), in which the lexical items are individually meaningful but mutually incoherent. It should be mentioned here, though, that most of the functional morphemes employed (-ed, -s, it, is, that, were, by) are interpretable, serving as anchors maintaining basic links between syntax and meaning, and that even when semantically bleached uninterpretable items (pleonastic it and there, for example) are used, as in (3), they could well be non-randomly related to their cognate interpretable items.
Bleep tinged shoops.
(3)
It’s shoops that were tinged by bleep.
(4)
Colourless green ideas sleep furiously.
On the other hand, in (5), an instance of random scrambling employing meaningful English lexical items, the interpretation remains recoverable although the form (basically, the word order) is illegitimate, again indicating the relative independence of meaning from pure syntactic form. (5)
*crime is poverty that it breeds well-known.
Let us emphasize once more that none of these examples is totally ungrammatical or meaningless. Now look at the following contrast. (6)
It’s well-known that poverty breeds crime.
(7)
*It be well-known that poverty breeds crime.
Here (6) is the grammatical version of (5), and (7) an ill-formed one, since in it the tense is missing. It can be postulated that sentences like (6) result from a lexical array (LA) that is conceptually coherent and ordered in conformity with the canonical word order of a given language (modulo transformational intervention), that (7) exemplifies the non-application of proper V-to-T movement (or, equivalently, feature matching followed by deletion), and that (5) is simply the outcome of the unordered insertion of the lexical items.
7
Formally speaking, an array is an ordered set of related elements, or an arrangement of quantities and symbols in rows and columns. The reasoning employed here, and the investigation of (un)grammaticality, can be carried out in a more fine-grained manner, for example to incorporate various
On the interface(s) between syntax and meaning
187
Note that nothing in this conceptualization bars the generation of deliberately unordered LAs. As we know, Phases (CP or v*P) are considered to be propositional (e.g., Chomsky 2004). However, it may be the case that Phases are not merely propositional, but also have an illocutionary force associated with them. That propositional expressions can exist without a typical illocutionary force associated with them is trivially illustrated by innumerable run-of-the-mill expressions cited as linguistic examples in “out-of-the-blue …contexts” (Rizzi 2004: 238), such as (8). (8)
The cat sat on the mat.
That is to say, the syntactic engine can be made to work completely or partially, leading to PF strings that are well-formed, or which crash for one or more reasons. Thus the degrees of grammaticality and semantic coherence can be more or less independent of each other, depending on the extent of the intersection between the two, which is what we really consider to be the meaning-syntax interface, of which what is known as C-I forms the basis. Clearly, under MP the concept of syntactic form must be reduced in order to better accommodate elements that reside in this intersection, which by definition should consist solely of such interface devices. Any changes along these lines must also affect the status of argument structure (see Levin and Rappaport Hovav 2005 for a survey) and event structure (Rothstein 2004), both a significant part of the interface between the conceptual-lexical system and syntax, as is evident, e.g., in Hale and Keyser (1993, 2002: see section 5.1. below). However, a notable desideratum in the currently prevalent MP model, as already implied, is that although it does take on board the conceptual aspects of C-I insofar as argument and event structure is concerned, the illocutionary part of the picture has received scant attention in it, in spite of the not infrequent complimentary references in the literature to entities like the language of thought – in the sense of Fodor (1975) – propositional attitudes, and “mentally constructed” speech acts (see Chomsky 2000: innovative deviations in literary style, as stated by Chomsky (to appear). To quote: “Merge can apply freely, yielding expressions interpreted at the interface in many different kinds of ways. They are sometimes called ‘deviant’, but that is only an informal notion… And expressions that are ‘deviant’ are not only often quite normal but even the best way to express some thought; metaphors, to take a standard example… That includes even expressions that crash, often used as literary devices and in informal discourse, with a precise and felicitous interpretation at the interfaces.” (Chomsky, to appear: 11.)
188 Anjum P. Saleemi 91, 94; 2005a: 4, and indeed elsewhere), testifying to the importance of the intentional and illocutionary factors. To sum up, since illocutionary force is as integral a part of C-I as conceptual structure, any reformulation of the form-meaning interface should include the interrelationships between syntax and both these components of meaning. Needless to add, we consider the philosophical underpinnings outlined above to be crucial to the proper execution of the theory of C-I. Generally speaking, they are not at all incompatible with MP, because minimizing the role of syntax entails concomitantly maximizing the role of the interface mechanisms.
3. Propositional and illocutionary meaning Conceptual-lexical structure builds up expressions primarily in a bottom-up manner, whereas illocutionary structure primarily serves as a top-down constraint on the generation of expressions, with syntax (and linguistic form in general) intervening between the two as a formal interface. The former generates skeletal v*P structure, primarily conceptual-propositional in nature; in contrast CP is essentially illocutionary-propositional in nature, designed to interface with the illocutionary basis of an expression. Moreover, CP functions through the mediation of TP, and possibly some other bridging structures, such as AspP, extending the basic proposition to incorporate tense, modality, aspect, and voice. The alliance between syntactic mechanisms (call them S) generating a sentence (CP) and the related combination of illocutionary force (F) and propositional content (P), may be expressed as the simple schema below (based on S&V), where ↔ represents the interface between the two levels, namely: (i) P and F per se, and (ii) their syntactic expression in the form of CP. (9)
F(P) ↔ S(CP)
The point is that it is F(P) that is largely responsible for the selection of appropriate syntactic mechanisms, as well as lexical items, that are available in a particular language. Each expression is to be understood as having a purely conceptual, _intentionally decontextualized_ interpretation, besides an _intentionally contextualized_ interpretation. The italicization here is meant to emphasize the difference between intrinsic (mental) and extrinsic (situational) discourse determinants, a distinction that is often ignored. The initial step in the generation of an expression is the correct selection of an LA, that is, the set of lexical items and features constitutive of that
expression. An LA may be directly grounded in a speaker intention, embodied in its illocutionary force (henceforth, IF) and conceptual structure (CS), as is the case when language is used in normal situations, such that the mapping to LA yields the intended expression. The IF of an expression may be related to it only vacuously, in the sense that no specific type of it is identifiable, as exemplified by expressions cited out of context. In the latter case, an LA embodies a proposition P whose truth or otherwise may be subjected to investigation regardless of any reference to IF, in order to ascertain if it hangs together, i.e., is inherently coherent as a proposition in respect of the overall conceptual relationships among the individual items contained in the LA in question. As pointed out above and as will be discussed further below, in natural conditions the composition of an LA will also be contingent upon its illocutionary purpose, depending on what is intended by a particular speaker; for example, the speaker may want to make a statement, ask a question, or convey his surprise about the content of a P. Therefore, the IF of an LA, even its components (see S&V: 87ff.), may be a crucial factor in demarcating the membership of the LA, eventually determining the clause type of the resulting expression as well. To state the obvious, CS (as described, e.g., in Jackendoff 1983, but see Pustejovsky 1995 for an alternative, more generative view of the conceptual bases of the lexicon), typically realized through the lexicon, being one of the two major components of C-I, is responsible for expressing, among other things, the argument and event structure of an expression. It contains constraints which are internal to it, so that each concept is often tied up with several others (but note that these constraints have their origin in factors which are ultimately intentional, as explained below). To return to the question of lexical selection, our conception of it relies upon both the components of the C-I system, and is conceived as a two-fold but necessarily overlapping process, involving illocutionary selection (driven by IF) as well
8. A conceptual view of the lexicon, reaching down to the sublexical elements, appears to have implications for syntactic operations that are far from trivial, as for example demonstrated by Culicover and Jackendoff (1995) in regard to binding theory.
9. An example of a conceptual-lexical generative mechanism proposed in Pustejovsky (1995) is lexical inheritance structure, which may be considered to incorporate relations between a given lexical item X and the set of nodes in a semantic lattice that X is related to up to the nth degree of remove from itself, thus defining the range over which (for instance) co-compositionality and type coercion can operate.
as conceptual selection (based upon CS), on the assumption that both IF and CS are necessary for an LA to be selected and an expression to be generated. Let us assume that CS primarily selects a (sub)array LA_CS that is appropriate for the basic elements of P, and IF selects another (sub)array LA_IF, not necessarily disjoint from LA_CS, that inserts items relevant to the illocutionary factors. In general the selection of items contained in LA_IF, and indeed the related syntactic structures, is based upon a search for the Illocutionary Force Indicating Devices (S&V), i.e., the linguistic correlates (as opposed to the mere propositional-content expressing devices) best suited for the expression of IF. These may be clause types, illocutionary verbs, or a variety of other elements. Evidently, illocutionary features are not confined to any particular part of syntactic structure, let us say, to the Edges of Phases. While recognizing a certain amount of tension between conceptual and illocutionary meaning, it should be made clear that, strictly speaking, even propositional meaning is (ultimately) intentional, either intrinsically or derivatively (see Searle 1990 for a clarification of this distinction), so we consider C-I to be basically intentional. Consequently, both CS and IF are major components of C-I (= intentional meaning), interrelated in the manner shown in (10).

(10)
        C-I
       /   \
     CS     IF
In other words, CS largely comprises the micro-elements of the propositional content of intentions, and IF typically consists of the macro-aspects of meaning, i.e., those related to types or classes of illocutionary intentions. Conceptual categories, however, have to be cognitively antecedent to the illocutionary ones, since it is they which form the fundamental defining units of intentionality and thought. IF categories are thus inevitably couched in the terminology of CS, forming as they do a special subset of the latter – the major reason why CS and IF appear to be somewhat deceptively indistinguishable from each other. However, IF categories are special because they are particularly relevant to the expression of the illocutionary aspects of intentional thought.

10. For instance, illocutionary verbs (order, declare, …) have both conceptual-propositional and illocutionary content.
11. Briefly, on Searle’s (1990, and much subsequent work) view, the distinction between intrinsic or inherent and derived intentionality hinges on the former being an inalienable characteristic of the human mind, of which linguistic and most other kinds of representations are a derived and externalized reflection. In other words, a sentence is intentional only by virtue of the fact that it can be uttered by a human speaker in some context or other.
4. Illocutionary and syntactic force

Among the Illocutionary Force Indicating Devices, what are traditionally known as clause or mood types are especially important for our purposes (though illocutionary verbs and certain other types of lexical items also play a significant role here). As a result, it is important to work out the interface system between particular types of illocutionary points (see footnote 3) at the clause level and the syntactic clause types; in other words, between IF and syntactic force (henceforth, SF). Not unlike the number of θ-roles in the theory of argument structure, the range of fundamental types of either SF or illocutionary points of IF is strikingly limited. It is generally agreed that there are five or so basic clause or mood types (see Huddleston 2002, Allan 2006; cf. Chomsky 1955/75: 301ff.). Comparably, S&V’s classification of the fundamental types of IF consists only of the following five categories: assertives (which state the way things are), commissives (which express one’s commitment to doing something), directives (that are meant to get other people to do things), declaratives (which bring about changes in the ‘world’ through one’s utterances), and expressives (aiming to convey one’s feelings and attitudes). Likewise, Allan (2006) identifies the following five categories of what he calls “typical primary illocution” of clauses: declarative, interrogative, imperative, hypothetical, and expressive. Note that according to the framework followed in this paper “Questions are always directive, for they are attempts to get the hearer to perform a speech act” (S&V: 199).
Interestingly, Chomsky (1955/75: 301) remarks that, “… had we given a more complete grammar…, the first statement of this grammar would have had to be Imperative Sentence Interrogative Sentence Sentence → Declarative Sentence : . …”
A thorny issue that remains to be resolved is the exact nature of the algorithms involved in IF ↔ SF. As is well known, the relationship between the types of SF and IF is hardly ever one-to-one, but is usually one-to-many or many-to-one, as partially exemplified below, where (11) contains syntactically distinct sentences all expressing directives (e.g., requests, 11a–b; commands, 11c–e), and the examples in (12) show the same basic sentence or clause type, namely (Yes/No or Wh-) interrogative, used to express two different IF types, i.e., directive (12a–b) and expressive (12c–d); note that (11b), a directive, is an interrogative too. (See Lin and Saleemi 2005;[13] also Saleemi 1998.)

(11) a. Please tell me the name of this animal.
     b. Would you tell me the name of this animal?
     c. Tell me the name of this animal.
     d. You WILL tell me the name of this animal.
     e. You WILL tell me what the name of this animal is.

(12) a. Is this your house?
     b. Who lives next door?
     c. Is this a dance or what?
     d. Where in the world can you find such enemies?
As is evident, particular kinds of IF and SF are interrelated in a rather complex fashion; thus, while discussing types of illocutionary acts, it has been stated that "Often one can do more than one of these things in the same utterance" (S&V: 52, emphasis ours). It seems to us that this happens to the extent that changes in a speaker's illocutionary perspective can be assumed to take place 'online'. Although in this context it is customary to talk about direct and indirect speech acts, and derived illocutionary force, the fact is that illocutionary acts may shift from one type of IF to another, depending on the speaker's intentions and the context of utterance, somewhat like the type shift that word meanings undergo when they co-occur with one set of words rather than another – what Pustejovsky (1995) refers to as "sense-in-context". Therefore, we venture to propose that, beyond the fundamental IF types, illocutionary acts are as generative in nature as are syntax proper or the lexicon (see Pustejovsky 1995 on the nature of the lexicon). However, the IF ↔ SF relationship may not be entirely unsystematic. For one thing, some combinations tend to be more characteristic than others (cf. Allan 2006, Huddleston 2002: 853ff. for a discussion); see the following correlations.

(13) a. Assertive ↔ Indicative
     b. Commissive ↔ Indicative
     c. Declarative ↔ Indicative
     d. Directive ↔ Indicative ∨ Interrogative
     e. Expressive ↔ Indicative ∨ Exclamative

[13] See Lin and Saleemi (2005) for further elaboration of this point.
The recurrence of the indicative SF points to another reason why the IF ↔ SF mapping may not be random. Generally speaking, only a relatively small subset of the set of all logically possible linkages between the members of the two sets is actually attested. Moreover, on the illocutionary side, assertives are apparently the most general or basic class, perhaps because they are held to be truth-functional (at least according to S&V), in comparison with commissives, directives, declaratives, and expressives, whose truth value can at best only be taken for granted, as it is tied to the speaker's viewpoint. On the linguistic side, indicatives appear to be the most general or superordinate clausal form; arguably, they, too, are the most directly truth-functional of the clause types. At the other end of the spectrum, the declarative IF and the exclamative SF happen to be the most impoverished types, having a rather limited range of uses. (See Lin and Saleemi 2005 for further detail.) Regardless of the enormous difficulties involved in systematically relating IF and SF, various parallels between syntax in general and IF are not too hard to locate: consider, for example, the following examples of negation (14–16) and conditionals (17–18), all taken or adapted from Lin and Saleemi (2005).

(14) [I promised [that I'd come]].
(15) F(~P): I promised not to come.
(16) ~F(P): I didn't promise to come.

As is well known, negation is understood as having scope over a certain domain. Thus, (15–16), the two negative counterparts of (14), which contains the commissive verb promise, are supposed to differ in respectively exhibiting narrow and wide scope – a structural difference that can also be captured by, and in fact springs naturally from, the deeper intentional distinction between whether the speaker's intention is to negate the embedded proposition (15) or the IF of the root clause (16), i.e., the distinction between propositional negation (15) and what S&V term denegation (16). As to the following conditionals used in conjunction with bet, a commissive verb (both 17–18 are slightly adapted from S&V; here C = conditional operator), scope is operative in much the same way as it is in negation, depending on whether a sentence expresses an IF whose propositional content is conditional (17), or whether it is the IF itself that is conditional (18): "a conditional illocutionary act is not performed categorically but only on the condition that the proposition P that is its antecedent is true" (S&V: 157).

(17) F(C(P)): I bet you five dollars that if a candidate gets a majority of the electoral votes, then he'll win. (A bet on a conditional proposition.)

(18) C(F(P)): If Bush is the next Republican candidate, then I bet you five dollars that the Democrats will win. (A conditional bet.)
To return to the relationship between IF and SF, a unique syntactic projection headed by Force has been proposed as part of the CP system by Rizzi (1997; also see Chomsky 1995). Rizzi's (1997) formulation of ForceP is illustrated below (cf. Rizzi 2004).[14]

(19) … [ForceP [ that [TP she gave [v*P [VP [VP the bottle to the baby] full ]]]]]

As Rizzi states, "We can think of the complementizer system as the interface between a propositional content (expressed by the IP) and the superordinate structure (a higher clause or, possibly, the articulation of discourse, if we consider a root clause). As such, we expect the C system to express at least two kinds of information, one facing the outside and the other facing the inside." (Rizzi 1997: 282, emphases added). However, despite an explicit recognition of the role of "the articulation of discourse", Rizzi makes no attempt whatsoever to consider in any detail the kind of information "facing the outside" – an issue which we feel must be addressed, and which will therefore be taken up again in greater detail in section 5.2.

[14] See Rizzi (2004) for an even more elaborate version of the split-CP hypothesis, under which the C system has a considerably greater number of components.
5. Phases and interfaces
Let us now take a closer look at the semantic interface(s) and the derivational model, reconsidering their role and reconceptualizing them, especially in terms of the commonly recognized Phases (i.e., v*P and CP), and consolidating and introducing some supportive arguments and empirical evidence in the process.
5.1. Phase one: the verbal complex

Some aspects of the syntax-meaning interface have been studied relatively well, though they are often not treated as such. For instance, lexical-conceptual structure (e.g., argument structure and event structure) can itself be taken to be an interface between the lexicon and syntactic derivations, or between the CS part of C-I and the syntactic structure and the lexicon. Recall that CS provides the basic propositional input to derivations, in collaboration with that provided by the outer illocutionary interface in C-I, namely, IF. As Chomsky points out, "The phases have correlates at the interfaces: argument structure or full propositional structure at meaning side, relative independence at the sound side. But the correlation is less than perfect, which raises questions" (2005: 18). In any case, conceptual interdependence among lexical items, based on a generative view of the mapping between concepts and morphological entities, could be instrumental in defining the notion of Phase – a proposition-like construct that, it will be suggested, combines syntactic and semantic characteristics, as is partly apparent from the fact that Phases are distinct from purely syntactic categories such as VP, vP, v*P, TP and CP, or their bare counterparts in MP. For an illustration of this point at the level of the verbal Phase (v*P, to be precise), take Hale and Keyser's (1993, 2002) 'bare' view of argument structure. Commendable though such an enterprise devoted to finding the structural or templatic correlates of meaning is, it is far too obvious that it can explain the facts only within a limited range. As Hale and Keyser (2002) point out, contrasts such as the one exemplified in (20) – from Hale and Keyser (2002: 24) – compel one to look for factors other than those definable exclusively in terms of bare argument structure.

(20) a. The kids splashed mud on the wall.
     b. Mud splashed on the wall.
     c. The kids smeared mud on the wall.
     d. *Mud smeared on the wall.
Additional semantic features – in this case, manner features of splash and smear – have to be invoked to explain this particular contrast between verbs whose argument structure is otherwise identical. Patently, the situation in regard to figurative expressions must require even further modifications in the concept of selection. Even if delimited to non-figurative meaning, selectional factors, we believe, often go well beyond simple configurational and thematic matters, to the extent that the items selected may alter each other's meaning compositionally (cf. the notion of "sense extension" in Pustejovsky 1995[15]). To reemphasize the propositional character of v*P, it should be fruitful to consider an example of the interaction between verbal prefixation and event structure. The meaning of the verbal prefix re- (as in repaint), it has been claimed, does not connote the exact repetition of an event, on the basis of data like the following (from Marantz 2005).

(21) a. They repainted the walls.
     b. *John re-put the books on the table.
     c. *John re-smiled.
     d. John smiled again.

On this view, re-prefixation, contrary to claims to that effect (e.g., Lieber 2004), appears not to be semantically equivalent to adverbial modification by again,[16] and consequently Marantz (2005; cf. Keyser and Roeper 1992) claims that re- in fact originates as a nominal prefix, having scope over the nominal, and appears when the nominal in question is of a certain change-of-state, eventive type – for example cake, which does not admit a simple iteration of the action denoted by a verb like rebake; after rebaking, the outcome is never going to be the same cake. Such an approach is clearly both counterintuitive and problematic, because re- does after all seem to be a verbal affix, and it can in fact mean simple repetition of an action or event, as for instance in (22), or have other connotations when used metaphorically or idiomatically (23).

[15] More generally, see Pustejovsky's (1995) distinction between semantically "monomorphic" and "polymorphic" languages, and his evidently correct characterization of natural languages as necessarily belonging to the latter class of languages in terms of their expressive power.

[16] But note that there are languages, such as Urdu(-Hindi), in which a prefixal equivalent of re- is not available, and therefore appropriate adverbial modification is the only option.
(22) a. John redialled the number.
     b. Please reenter your password.

(23) a. I must repay your generosity at some point. [i.e., 'pay back' the generosity.]
     b. I hope the outcome will repay the efforts. [i.e., the efforts will be rewarded.]

An alternative could be to consider the meaning of re- to range from the mere iteration of an action or event to a change of nominal state, and indeed well beyond, with the exact fixation of the sense to be determined in context, i.e., in composition with the nominal, verbalized through conflation (as in denominal verbs or adjectives) or otherwise, such that the referential nature of the nominal influences the interpretation of the verbal meaning. The relevant semantic conditions may be met simply by postulating that re- occupies a verbal head position immediately above the lexical verb, which the latter moves into. Thus:

(24) [v*P … [V re- [V+N V [N]]]]

This will enable the prefix in question to have scope over both V and N. One consequence of this suggestion will be that the semantics of lexical items is compositional, but only in a complex, co-compositional sense (cf. Pustejovsky 1995). Further, systematic sense extension plays a significant role in trying to find an LA_CS that is as close as possible to LA_IF. To conclude, the driving force behind many such phenomena is the intentional nature of meaning, which is not averse to finding ways to exploit a language as much as possible in order to achieve maximum approximation to the intended meaning.

Given the fact that purely syntactic structures such as VP/vP/v*P, and the corresponding propositional complexes, can achieve a full interpretation only when the totality of an LA is taken into account, and that choices regarding the thematic and conceptual appropriateness of the subject, or the insertion of expletive elements, if required, can be made only after the conceptual meaning has been computed, we suggest a modification in PT. We postulate that Phases, especially their Edges (as distinct from their purely syntactic counterparts), mediating between meaning and syntactic structure as they do, are themselves interface devices; in other words, Phases are in fact interfaces, which in the ideal case must obey the conditions imposed by both syntax (or linguistic form in general) and C-I; let us call this view the hybrid thesis of Phases. A natural corollary of this thesis is that some syntactic constraints could be overridden by certain C-I constraints just in case (a given segment of) an expression is not a Phase, or the expression in question is fragmentary and, therefore, as before, does not constitute a Phase. This can help account for some annoying exceptions to syntactic principles without having to modify the latter in an increasingly (and often self-defeatingly) complex manner, or to treat such exceptions as mere discourse-related oddities. To summarize, the v*P Phase contains both structural and intentional information (pertaining to CS or IF) in the form of Edge features (EF, for short), so it may be thought of as an entity of the following type, with ↔, as before, representing the mapping mechanism(s), and AE standing for argument and event structure.

(25) [v*P {EF [CS] ↔ [AE]} … ]

It is hoped that an extension of this view of Phases to CP, to be undertaken in the following section and naturally assumed to explicitly incorporate the illocutionary component of meaning as well, will further support the assumptions underlying the hybrid thesis of Phases.
5.2. Phase two: the clausal complex

Evidently, the clausal equivalent of (25) should look like (26).

(26) [CP {EF [IF] ↔ [SF]} [TP …]]

The usefulness of (26), or of (25) by extension, can be illustrated by citing some data from Urdu(-Hindi), a typically head-final language, except when the complement CP is finite (see Davison 2007 for a discussion of this split in the wider context of South Asian languages). Besides, Urdu allows a good deal of scrambling.[17] The data in question contain 'right-dislocation' of a CP element, a process that is supposed to take place before the Domain of the CP Phase is sealed off, due to the Phase Impenetrability Condition of PT (Chomsky 2000), and its contents sent off to A-P – or even before that if the constituent involved is not a Phase, for instance if it is a DP or a PP. Consider the contrast between (27) and (28): in (28) the DP appearing at the right Edge has been either moved or copied from inside the TP, and arguably right-adjoined to it, whereas it appears in situ in (27); here much structural detail at both the verbal and the clausal level is suppressed.

(27) us nee kahaa [CP ki [TP aaj [DP doonoN umiidvaaroN ki darmian mukaabla] hoo gaa]]
     s/he ERG said that today both candidates Particle between competition be will
     'Today there'll be a competition between the two candidates.'

(28) us nee kahaa [CP ki [TP [TP aaj hoo gaa] [DP doonoN umiidvaaroN ki darmian mukaabla]]]

If we consider the basic structure of (28) to be (29), it seems reasonable to claim that the scrambling out of the TP is triggered by the presence of an intentional Focus feature at its right Edge, sitting with a corresponding right-dislocation syntactic feature, presumably both inherited from the CP. Thus, contrary to Rizzi (1997, 2004), it is claimed that Focus is an intentional-illocutionary, rather than a syntactic, feature, because it is a manifestation of the degree of strength,[18] one of the essential components of IF (S&V: 15, 98–100).

(29) [CP [TP [TP …] {EF [IF Focus] ↔ [SF Right Adjunction]}]]

This is not a mere terminological shift or notational variation, but part of a general attempt to reduce redundancy in the study of various aspects of language. Its particular significance in the present context is that such reclassification of features will reduce the ever-expanding vocabulary and structure of syntax, whose mechanisms should merely specify the exact variety of the movement or copying operation available in a language (in this case, Internal Merge to a right-adjoined position), as dictated by the true spirit of MP. The alternative, of course, is to make syntax increasingly elaborate by introducing non-syntactic, even pragmatic, categories into it, as, for instance, is done in Rizzi's (1997) split-CP hypothesis, where the sentence or the clause (CP) is claimed to have an elaborate structure of the following sort – a move closer to generative semantics than to MP.

(30) [ForceP [Force] [TopP [Top] [FocP [Foc] [FinP [Fin]]]]]

[17] We put aside any cases involving further scrambling in this language.
[18] Compare the illocutionary verbs suggest and insist, both directives but differing in the degree of strength of their IF.
To consider some further empirical evidence reinforcing our hypothesis, an interesting case can be made about the way syntactic and illocutionary factors interact, with empirical consequences for syntax, by considering the distribution of the English word any, which can appear both as a so-called Negative Polarity Item (NPI), licensed by an explicit negator like not, and as a lexical item with a 'free-choice' (FC) interpretation, almost like a positive-polarity universal quantifier; observe the contrasts between (31) and (32), and between (31a) and (31b), all assertive-indicative sentences: (31a) is grammatical, but (31b) is not.

(31) a. He didn't buy any books.        [NPI]
     b. *He bought any books.

(32) a. Any man can write a book.       [FC]
     b. Almost any man can write a book.
     c. He bought any books he wanted.

This can obviously be taken to imply that any is in fact two semantically distinct but morphologically isomorphic lexical items. An alternative possibility is that any is a single lexical item, but that it inherits its polarity (positive or negative) in a given context from a negator like not and (ultimately) from the P of the IF. If the latter option is taken, it will mean, by assumption, that the basic representational schema of (31a) and (32) is as follows, where the illocutionary-propositional feature set (in IF) and the syntactic feature set (in SF) are both marked for polarity (±Pol). Here S = sentence, A = assertive, and I = indicative.

(33) [CP {EF [IF ±Pol(A(P))] ↔ [SF ±Pol(I(S))]} [TP … ]]

If the approach illustrated in (33) is adopted, well-formed examples of the type of (31a) and (32), as well as the ill-formed (31b), can be accounted for: both (31a) and (32) are marked for polarity by the speaker's choice, as implied by (33); the positive polarity of (32) is shown more specifically in (34).

(34) [CP {EF [IF +Pol(A(P))] ↔ [SF +Pol(I(S))]} [TP … ]]

Accordingly, the representation for (31a), where any receives a negative interpretation, is given in (35).

(35) [CP {EF [IF –Pol(A(P))] ↔ [SF –Pol(I(S))]} [TP … ]]
To return to the ungrammatical (31b), we suggest that its underlying representation is not marked for either positive or negative polarity at all, as shown in (36), and that its EF is therefore incomplete. As a result, since any is sensitive to polarity, but not to specifically negative or positive polarity, and given the LA the sentence contains, any cannot receive any interpretation in (31b).

(36) [CP {EF [IF (A(P))] ↔ [SF (I(S))]} [TP … ]]

To sum up the observations made so far, any can be construed either positively or negatively, or not at all, depending on the relevant choices made in EF and in the selection of the LA correspondingly deployed. Comparably, directive-imperative (37), commissive-infinitive (38), and declarative-indicative (39) contexts permit either FC or NPI any, much like assertives (31a–32) under our analysis of them (see 33, above); here the (a) examples are FC, while the (b) examples are NPI.

(37) a. Buy almost any book you want!
     b. Don't buy any book sold here.

(38) a. I promise to borrow any book on syntax.
     b. I promise not to borrow any book on syntax.

(39) a. I declare you the president of almost any country you like.
     b. I don't declare you the president of any country.

Therefore the explanation of these contrasts, too, could fall under the schema in (33). However, many non-assertive environments permit only the NPI any, whether or not their polarity is marked as present: see the following examples of directive-Yes/No interrogative (40a–b), directive-Wh interrogative (41a–b), and commissive-if-conditional (42a–b) contexts.

(40) a. Do you have any book on syntax?
     b. Don't you have any book on syntax?

(41) a. Who buys any books here?
     b. Who doesn't buy any books here?

(42) a. If you have any book on syntax, I promise to borrow it.
     b. If you have any book on syntax, I promise not to borrow it.
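The polarity calculus of (33)–(36) amounts to a small decision procedure, which can be glossed in a toy sketch (ours; the function and labels are invented for exposition and are no part of the formalism itself): the construal of any simply tracks the polarity value, if any, that the speaker's choice fixes in EF.

```python
# Toy model of the polarity account of "any" in (33)-(36).
# Purely illustrative; our own gloss, not part of the proposal.

def construe_any(polarity):
    """polarity is '+', '-', or None (EF unmarked for polarity, as in (36))."""
    if polarity == "-":
        return "NPI reading"          # (31a): He didn't buy any books.
    if polarity == "+":
        return "free-choice reading"  # (32):  Any man can write a book.
    return None                       # (31b): *He bought any books -- no interpretation

print(construe_any("-"))   # NPI reading
print(construe_any("+"))   # free-choice reading
print(construe_any(None))  # None: "any" cannot be interpreted
```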
Thus, it is only the non-assertive environments in (40–42) which truly exemplify the exclusive use of the 'negative' any. Our explanation for this is that any type of IF (call it X) may be either polar or non-polar (¬Pol). However, Yes/No and Wh-interrogatives, both obviously questions (Q), and if-conditionals (C), are indeterminate in regard to polarity, or are inherently non-polar (¬Pol_i), at the root-clause level, as opposed to being so by the speaker's choice (though clearly they can contain embedded polar propositions). This, we surmise, is due to certain fundamental conceptual factors, which eventually must be precisely articulated. What happens in the case of the particular combinations of SF and IF in the contexts under consideration (namely, 40–42) is that the non-polarity, and possibly the slight conceptual mismatch between the IF and SF polarity features, result in an indefinite interpretation (Indef) of EF, and consequently that of any. Thus (¬ = non-polar):

(43) [CP Indef{EF [IF ¬Pol(X(P))] ↔ [SF ¬Pol_i(Q ∨ C(S))]} [TP … ]]

Therefore, it appears feasible to formulate a unified account of the distribution of any, as an alternative to the various standard accounts (e.g., see the papers in Horn and Kato 2000, Hsiao 2006, and the other references cited in these works), which is able to explain a wider range of data, at least in part because it cuts across the conceptual-illocutionary factors and the syntactic ones. Such an approach should also account for the 'logophoric' any, i.e., the appearance of any in contexts where no c-commanding licenser is available (see Progovac 2000); at any rate, Chomsky points out that c-command may not actually be necessary in explaining various syntactic phenomena (to appear: 8), presumably especially if they are attested in non-Phases. To revise our previous summation, we propose that there is only one lexical item any, which is construed negatively, positively, not at all, or as an indefinite word, depending on the C-I factors outlined above.

Another case in point, likewise demonstrating the usefulness of the C-I categories, is binding theory, which Chomsky considers to lie "at the outer edge of the C-I interface" (to appear: 8). The data we seek to examine include non-core or logophoric reflexive anaphors (44–48) as well as the core cases of canonically bound anaphors (49–50) of this sort (see Büring 2005 for an overview).

(44) There was a wall right [PP behind ourselves].
(45) [ConjP As to myself], I don't know where I'll be staying.
(46) [DP Pictures of himself_i] please John_i a lot.
(47) [DP Nude pictures of herself_i] don't offend Mary_i.
(48) [DP Stories about himself_i] excite John_i.

(49) a. John saw himself.
     b. *Himself saw John.

(50) a. He wants [PRO_i to hurt himself_i].
     b. *He wanted [Mary_i to hurt himself_i].

It is typical to assume that most of the problematic cases (44–48) do not submit to the usual sentence-internal binding conditions due to some discourse or other factors. We wish to argue that the incorporation of certain illocutionary elements can help one achieve a unified, and therefore more elegant, explanation for the distribution of most if not all kinds of anaphors. Our account is based upon (i) PT, and (ii) the notion of "direction of fit", one of the components of IF (S&V: 92ff.). The direction of fit determines whether IF is designed to express the way the world is, or to make it the way intended by IF. S&V propose the following directions of fit: (i) the word-to-world direction of fit (assertives), (ii) the world-to-word direction of fit (commissives and directives), which can be speaker-based (commissives) or hearer-based (directives), (iii) the double direction of fit (declaratives), and (iv) the null or empty direction of fit (expressives). It should be made explicit that the term "world" is used here in a relative sense, not necessarily in the literal one. First consider the role of PT, more specifically our modified version of it. It appears that in several contexts non-core unbound anaphors occur in non-Phase constituents, such as DP or PP (46–48 and 44, respectively), or the sort of Conj(unction)P used at the start of (45). This means that they are not subject to the binding conditions, on the reasonable assumption that if an anaphor cannot find an explicit antecedent within the Phase immediately containing it, it is free to look for one elsewhere. This is unlike (49–50) above and the contrast in (51) below, where the binding must occur within the CP, clearly a Phase.

(51) a. [CP For himself to win the contest], John decided to take all possible measures.
     b. *Himself to win the contest, John decided to take all possible measures.

In regard to (44–48), instead of assuming the presence of phonetically null antecedents (a legitimate option for 50a), and regardless of the referential
ambiguity of such antecedents (who took the picture – John/Mary or someone else?), an alternative is to consider the non-Phase constituents to be part of a sentence which expresses a new kind of direction of fit, one which makes reflexivity possible in general. Although the original claim is that "There are four and only four directions of fit in language" (S&V: 52), we suggest that the range of the direction of fit be extended to include the world-to-world fit, which naturally subsumes the word-to-word fit, thereby exhausting all the logical possibilities of how various entities relate to each other in the relevant terms. Note that 'entity' could mean a variety of things: a person or an object about which something is stated to be the case or to which something is attributed (52), the speaker himself (53), the hearer (54), an abstract object (55), literally the whole world (56), even a linguistic expression (57). More importantly, this fifth kind of direction of fit, it is claimed, must co-occur with all the other kinds of direction of fit, as also partially and indirectly illustrated by the following examples: (52), (54), (56), and (57) are assertive (see also 49a and 50a above), while (53) is a declarative, and (55) a directive (exhibiting a sort of speaker-to-hearer-to-hearer fit!), with one of the four standard categories of the direction of fit applying to each type of IF as well.

(52) This room can clean itself.
(53) I declare myself to be the president of this country.
(54) This system can destroy itself.
(55) Go hang yourself!
(56) The world will destroy itself.
(57) This expression contradicts itself.

Armed with the new category of the world-to-world fit, we can now claim that the IF in these non-binding, and indeed all other binding, cases is marked for this particular kind of fit, over and above containing one of the remaining four categories. So it is this property of IF that legitimizes the use of reflexives in the non-core examples under consideration (44–48), where the reflexives, as we already know, are all part of non-Phase constituents. The implication is not only that the resulting overarching illocutionary factors (redundantly) apply to the core syntactic contexts in which anaphoric binding must occur, but also that the same factors can explain the appearance of reflexives in the non-core environments.
A problem remains to be resolved, namely why DPs containing picture/story-type nouns are ill-formed in certain sentences (58–59), but not in others (cf. 46–48).

(58) *[DP Nude pictures of herself_i] absolved Mary_i of the crime.
(59) *[DP Stories about himself_i] don't describe John_i very well.

An obvious solution to this problem is that it is the CS-IF of these verbs that is responsible for these facts: the default meaning of please, offend, and excite (see 46–48) is experiencer-oriented (CS) and expressive (IF), but that of absolve and describe is agent-oriented (CS), and declarative and assertive, respectively (IF). This difference in meaning makes it impossible for the DPs in (58–59) to be licensed by any IF feature in EF that would also participate in the selection of the verbs they contain, as each of these DPs contains the theme of the verb in the form of a reflexive. An expanded binding theory, as outlined in the foregoing pages, is better able to account for expressions, or parts thereof, whether or not they happen to be Phases, thereby showing the importance of the C-I interface, especially its illocutionary component, in human language. Presumably, it should not be too hard to incorporate into EF the features relevant to this approach to the distribution of reflexives, as exemplified in the previous part of this section for Focus and the distribution of any. Finally, it is hoped that the same kind of approach can be extended, mutatis mutandis, to reciprocal pronouns, also an object of inquiry in binding theory.
6. Phasing out the residue of the Y-model

We begin to conclude this paper with a plea to straighten out an unnecessary and undesirable wrinkle in the derivational model currently in use in MP, which is in fact merely a diluted version of the more traditional EST/Y-model and therefore carries over some residue from the older conceptualization. More precisely, it is suggested that the bifurcated derivational model retained in MP be realigned so that the syntax-meaning interface (C-I) constitutes one end of a linear or horizontal derivation, and A-P the other – a suggestion that is in keeping with the spirit of the arguments presented so far. The EST/Y-model, depicted in (60), is not appropriate under the fundamental assumptions of MP, and is therefore no longer required; the dotted path in (60) depicts the step-by-step construction of expressions as a result of the successive application of External Merge and the other syntactic operations, e.g., Internal Merge.

(60) The bifurcated model

     Lexicon
        ·
        ·
     SPELL-OUT ——→ A-P
        ·
        ·
       C-I
The revised model is graphically represented below in (61) for the sake of explicit comparison.

(61) The linear model

                    CS
                    ↓↑
     LEXICON → C-I (= Intentional Meaning) → … SYNTAX … (non-)PHASE → A-P
                    ↓↑
                    IF

It is argued that (61) is more desirable because:

(a) It is conceptually more elegant, and there is really no need to set apart brute-force combinatorial mechanisms and the lexicon, as they can be built into the modified model without any loss of explanatory force.
(b) It is equally feasible to execute in (61) any derivations that the existing model is able to handle.
(c) It can generate, among other things, expressions based on both ordered and unordered LAs, and is thus able to account for why some derivations crash but can still be interpreted with more or less effort (see examples 5 and 7 above), while others converge, but only to be understood as incoherent gibberish (2–4).
(d) It incorporates Phases as part of the system, and provides for the generation of fragmentary or non-Phasic expressions.

The two interfaces (A-P and C-I) must be treated as such, not as syntactic levels, and the lexicon should not be located orthogonally along a derivational axis different from the direct C-I/A-P axis, since it is the linguistic
starting point of any derivation, and therefore a significant part of the C-I interface. One major consequence of such a shift in the place of the lexicon is that there is no dual SPELL-OUT to A-P and C-I, but only one SPELL-OUT towards the A-P end of the derivation. Another important shift is that derivations, since they no longer have two distinct starting points, do not need to be made separately legible to A-P and C-I. Instead, the main idea is to make meaning representations legible to linguistic form as a whole.
7. Some implications and consequences

A major outcome of the perspective outlined above will be a truly minimal view of the core of the linguistic system, or "the faculty of language in the narrow sense," perhaps one solely consisting of certain combinatorial and recursive properties; such a view, inevitably, will need to be complemented by a derivational system vastly enriched at the interfaces, constituting, as it were, "the faculty of language in the broad sense" (Hauser, Chomsky and Fitch 2002). Part of our effort is aimed at teasing apart pure or narrow syntactic mechanisms and those apparent properties of syntax which in fact reflect an assortment of conceptual and illocutionary factors. The syntactic and semantic entities are by no means completely autonomous, but part of the divergence between them is brought about by their specific properties and mechanisms coming into conflict with each other in the course of derivations. The concept of Phases advocated in this paper, as encapsulated in the hybrid thesis of Phases, is not only conceptually and empirically of a piece with their propositional-illocutionary character, but should also help redefine the relationship between form and meaning. Given all these factors, it would seem best to adopt the most adequate, yet essentially simpler, idealization, namely the one famously proposed by Saussure (1916/1986), as any other versions promote the illusion that they are empirically the case, and as a result tend to be treated too narrowly and rigidly. Such derivational systems, in our view, are eventually bound to fail as the scope of linguistic theory and its database expand further. It is evident to us that considerably more empirical support is needed for our relatively innovative approach, and before that much existing empirical evidence and argumentation will have to be reconsolidated. However, we are convinced that the general direction outlined in the foregoing pages is fundamentally along the right lines.
References

Allan, Keith 2006 Clause-type, primary illocution, and mood-like operators in English. Language Sciences 28: 1–50.
Boeckx, Cedric and Kleanthes K. Grohmann 2007 Putting phases in perspective. Syntax 10: 204–222.
Büring, Daniel 2005 Binding Theory. Cambridge: Cambridge University Press.
Chomsky, Noam 1955/75 The Logical Structure of Linguistic Theory. New York/London: Plenum Press.
Chomsky, Noam 1993 A minimalist program for linguistic theory. In The View from Building 20: Essays in Linguistics in Honor of Sylvain Bromberger, Kenneth Hale and Samuel Jay Keyser (eds.), 1–52. Cambridge, MA: MIT Press.
Chomsky, Noam 1995a The Minimalist Program. Cambridge, MA: MIT Press.
Chomsky, Noam 1995b Bare phrase structure. In Government and Binding Theory and the Minimalist Program, Gert Webelhuth (ed.), 383–439. Cambridge, MA/Oxford: Blackwell.
Chomsky, Noam 2000 Minimalist inquiries: The framework. In Step by Step: Essays on Minimalist Syntax in Honor of Howard Lasnik, Roger Martin, David Michaels, and Juan Uriagereka (eds.), 89–155. Cambridge, MA: MIT Press.
Chomsky, Noam 2001 Derivation by phase. In Ken Hale: A Life in Language, Michael Kenstowicz (ed.), 1–52. Cambridge, MA: MIT Press.
Chomsky, Noam 2004 Beyond explanatory adequacy. In Structures and Beyond: The Cartography of Syntactic Structures, Vol. 3, Adriana Belletti (ed.), 104–131. Oxford: Oxford University Press.
Chomsky, Noam 2005 Three factors in language design. Linguistic Inquiry 36 (1): 1–22.
Chomsky, Noam 2008 On phases. In Foundational Issues in Linguistic Theory: Essays in Honor of Jean-Roger Vergnaud, Robert Freidin, Carlos P. Otero and Maria Luisa Zubizarreta (eds.), 133–166. Cambridge, MA: MIT Press.
Culicover, Peter W. and Ray Jackendoff 1995 Something else for the binding theory. Linguistic Inquiry 26 (2): 249–275.
Davison, Alice 2007 Word order, parameters, and the extended COMP projection. In Linguistic Theory and South Asian Languages, Josef Bayer, Tanmoy Bhattacharya, and M. T. Hany Babu (eds.), 175–198. Amsterdam/Philadelphia: John Benjamins.
Fodor, Jerry 1975 The Language of Thought. New York: Crowell.
Gordon, David and George Lakoff 1971/75 Conversational postulates. In Syntax and Semantics, Vol. 3: Speech Acts (CLS 7), Peter Cole and Jerry L. Morgan (eds.), 63–84. New York: Academic Press.
Grohmann, Kleanthes K. to appear The road to PF. In Proceedings of the 17th International Symposium on Theoretical and Applied Linguistics, E. Agathopoulou, M. Demitrikapoulkou, and D. Papadopoulou (eds.). Thessaloniki: Aristotle University of Thessaloniki.
Hale, Kenneth and Samuel Jay Keyser 1993 On argument structure and the lexical expression of syntactic relations. In The View from Building 20: Essays in Linguistics in Honor of Sylvain Bromberger, Kenneth Hale and Samuel Jay Keyser (eds.), 53–109. Cambridge, MA: MIT Press.
Hale, Kenneth and Samuel Jay Keyser 2002 Prolegomenon to a Theory of Argument Structure. Cambridge, MA: MIT Press.
Hauser, Marc D., Noam Chomsky, and W. Tecumseh Fitch 2002 The faculty of language: What is it, who has it, and how did it evolve? Science 298: 1569–1579.
Horn, Laurence and Yasuhiko Kato (eds.) 2000 Negation and Polarity: Syntactic and Semantic Perspectives. Oxford: Oxford University Press.
Hsiao, Katherine 2006 Polarity and 'Bipolar' Constructions: Epistemic Biases in Questions. MA thesis, National Chi Nan University.
Huang, C.-T. James 1984 On the distribution and reference of empty pronouns. Linguistic Inquiry 15: 531–574.
Huddleston, Rodney 2002 Illocutionary force and clause types. In The Cambridge Grammar of the English Language, Rodney Huddleston and Geoffrey Pullum (eds.). Cambridge: Cambridge University Press.
Jackendoff, Ray 1990 Semantic Structures. Cambridge, MA: MIT Press.
Keyser, Samuel Jay and Thomas Roeper 1992 Re: The abstract clitic hypothesis. Linguistic Inquiry 23: 89–125.
Levin, Beth and Malka Rappaport Hovav 2005 Argument Realization. Cambridge: Cambridge University Press.
Lieber, Rochelle 2004 Morphology and Lexical Semantics. Cambridge: Cambridge University Press.
Lin, Nelly and Anjum P. Saleemi 2008 Global linguistic derivations. Proceedings of CLS 41.
Marantz, Alec 2005 Rederived generalizations. Handout of a talk delivered at National Tsing Hua University, Hsinchu.
Progovac, Ljiljana 2000 Coordination, c-command, and 'logophoric' N-words. In Negation and Polarity: Syntactic and Semantic Perspectives, Laurence Horn and Yasuhiko Kato (eds.), 88–114. Oxford: Oxford University Press.
Pustejovsky, James 1995 The Generative Lexicon. Cambridge, MA: MIT Press.
Rizzi, Luigi 1997 The fine structure of the left periphery. In Elements of Grammar, L. Haegeman (ed.), 281–337. Dordrecht: Kluwer.
Rizzi, Luigi 2004 Locality and left periphery. In Structures and Beyond: The Cartography of Syntactic Structures, Vol. 3, Adriana Belletti (ed.), 223–251. Oxford: Oxford University Press.
Ross, John R. 1970 On declarative sentences. In Readings in English Transformational Grammar, R. A. Jacobs and P. S. Rosenbaum (eds.), 222–272. Waltham, MA: Ginn & Co.
Rothstein, Susan 2004 Structuring Events: A Study in the Semantics of Aspect. Malden, MA/Oxford: Blackwell.
Saleemi, Anjum P. 1998 Skinner's razor and the varieties of mentalism. In Text in Education and Society, Desmond Allison, Lionel Wee, Bao Zhiming, and Sunita Anne Abraham (eds.). Singapore: Singapore University Press.
Saleemi, Anjum P. 2005 Introduction: The enigma of unification. In In Search of a Language for the Mind-Brain: Can the Multiple Perspectives be Unified?, Anjum P. Saleemi, Ocke-Schwen Bohn and Albert Gjedde (eds.). Aarhus: Aarhus University Press.
Saussure, Ferdinand de 1916/86 Course in General Linguistics. La Salle, IL: Open Court.
Searle, John R. 1969 Speech Acts: An Essay in the Philosophy of Language. Cambridge: Cambridge University Press.
Searle, John R. 1979 Expression and Meaning: Studies in the Theory of Speech Acts. Cambridge: Cambridge University Press.
Searle, John R. 1990 Consciousness, explanatory inversion and cognitive science. Behavioral and Brain Sciences 13 (4): 585–595.
Searle, John R. and Daniel Vanderveken 1985 Foundations of Illocutionary Logic. Cambridge: Cambridge University Press.
Dynamic economy of derivation

Takashi Toyoshima
1. Introduction

Since the onset of generative grammar (Chomsky 1951, 1955), a notion of economy has figured in theoretical considerations, as an evaluation metric for the choice among competing grammars, namely the Simplicity Measure. As the theory advanced, the original simplicity measure was short-lived, but it has also become increasingly clear that some kind of economy plays a significant role in the fundamental properties of language. Perhaps the first explicit endeavor to put economy considerations on the research agenda is Chomsky (1992). There, the Least Effort Principle was suggested, unifying the two types of economy conditions, Economy of Derivation and Economy of Representation. Since then, the issue of economy has been a lively topic of research, and it crystallized in the Minimalist Program initiated in Chomsky (1993), in which linguistic expressions are taken to be nothing more than "optimal" realizations of interface conditions imposed by the external systems (Economy of Representation), and those linguistic expressions are generated in an "optimal" way (Economy of Derivation).

One of the issues that have been hotly debated in the Minimalist Program is the nature of economy conditions: whether they can be "global" or should be "local" (Collins 1997; Johnson & Lappin 1997, 1999; Nakamura 1997; Yang 1997; among others). The Shortest Derivation Condition as informally stated in Chomsky (1992) is a "global" condition, in the sense that it requires a comparison of all the derivations that are completed (trans-derivational comparison; Groat & O'Neil 1996, among others). The Procrastinate Principle proposed in Chomsky (1993) is also "global", as it "looks ahead" to determine whether or not a particular application of overt operations at a given point in a derivation would lead to convergence at the interfaces, potentially ending up with a comparison of all the derivations that are completed, as well as the ones that terminate in a crash. These conditions entailed a computational complexity of combinatorial explosion, and it has been commonly assumed that "look-ahead" is to be avoided in the formulation of derivational economy.

Yet, it has not been made clear what the computational complexity of the minimalist syntax actually is, not to mention how it is to be measured. It has become a buzzword that appears in numerous works, which claim that their proposals reduce it, but without making explicit what they mean by computational complexity and how it is reduced.

In this paper, I will attempt to frame the issue of the computational complexity of the minimalist syntax within the theory of computational complexity developed in mathematics and information science. Although human syntactic computation may turn out to require a distinct notion of complexity, it is certainly desirable if the complexity of human syntactic computation can be made amenable to such a mathematical theory of computational complexity, as the latter is not domain-specific. I believe that it is useful to formulate the economy conditions of the minimalist syntax in terms of the mathematical theory of computational complexity, to make the preferential metric involved more explicit in comparing different proposals.

Toward that end, I propose a dynamic economy principle of derivation, arguing that the notion of locality must be differentiated for Economy of Derivation and for Economy of Representation, and that "look-ahead" is the very nature of derivational theory. In particular, I argue that "look-ahead" does not necessarily entail "global" comparison, and that what must be avoided is the non-local "look-far-ahead" over all the potential derivations, not the derivationally local "look-one-step-ahead" of only the actual derivation at a given stage. This conception of derivational economy allows a dynamic choice, at every local step of a derivation, of which operation to apply: (external) Merge or Move (internal Merge). In turn, the question that motivated the concept of Chomsky's (2000: 106ff.) Lexical Subarray (LSA) – why Merge of an expletive does not always preempt Move – can be answered without introducing such a novel concept as the LSA. Chomsky (ibid.) claims that an LSA is "propositional," determining a phase, and hence can be selected straightforwardly from the initial lexical array (LA); an LSA is determined by a single choice of C or of v. I show that the concept of LSA is problematic: its determination is not so straightforward that complexity can be reduced as Chomsky envisages, and it entails a combinatorial explosion of an exponential order.

Reviewing Fukui's (1996) insightful overview of the nature of economy in language, section 2 discusses the nature of economy conditions as optimization, and the issue of "global" vs. "local" optima, in relation to the theory of computational complexity. In section 3, I review the background problems of misgeneration that motivated the concept of the Lexical Subarray (LSA) in Chomsky (2000), and in section 4, I demonstrate the determination problem that the notion of LSA entails. I also review the non-deterministic problem of Collins' (1997)
formulation of Local Economy in section 5, and the general problem of static economy that predetermines some operation over another, scrutinizing Shima's (2000) preference for Move over Merge, in section 6. In section 7, I propose a dynamic economy principle of Minimum Feature Retention (MFR) that makes a locally deterministic choice without a predetermined preference for Merge or Move, which solves the misgeneration problem that motivated the concept of LSA. We see there that the computational complexity of Merge-over-Move with LSA is far greater than that of MFR. In section 8, I propose that the expletive there is selected as a kind of "External Argument" of v, which dissolves an apparent problem for MFR, and section 9 summarizes the main points with some concluding remarks.
2. Economy in language

2.1. Fukui (1996)

Fukui (1996) claims that economy in language is best captured as a discrete optimization problem, as studied in combinatorial mathematics and theoretical computer science. An optimization problem in general can be formalized as a search for the value of the decision variable x that optimizes (minimizes or maximizes) the value of the objective function ƒ(x) under some constraints (Papadimitriou & Steiglitz 1982). This can be formulated as follows:

(1) Let the decision variable x be an n-dimensional vector x = ⟨x₁, x₂, x₃, …, xₙ⟩.
    Objective function: ƒ(x)
    Constraints: x ∈ R
    Problem: find x such that OPT(ƒ(x)),
    where the domain of ƒ is an appropriately chosen set D (⊇ R).
The set R is called the feasible region of the optimization. If x satisfies the constraint x ∈ R, it is called a feasible solution to the optimization problem (1), and if a feasible solution x* (x* ∈ R) satisfies the following condition, x* is the global optimum:[1]

(2) ƒ(x*) ≤ ƒ(x), for all x ∈ R

[1] The condition (2) is for optimization problems as minimization. For maximization, the condition will be:

(2′) ƒ(x*) ≥ ƒ(x), for all x ∈ R
Given a neighborhood set U(x*) that includes a feasible solution x* (∈ R), x* is called a local optimum if the following relation holds:

(3) ƒ(x*) ≤ ƒ(x), for all x ∈ R ∩ U(x*)
The global optimum is necessarily a local optimum, but the converse is not necessarily true. That is, there are optimization problems that can be solved globally, but not locally. It is not exceptional for complex problems to get stuck in a local optimum that is not the global optimum. In the minimalist syntax, this means that a local economy principle does not necessarily yield the optimal derivation. We will see this in section 5, where Collins' (1997) Local Economy principles over-generate ungrammatical derivations.

Formulating both the Condition on Economy of Representation and the Condition on Economy of Derivation as discrete optimization problems, Fukui (1996) argues that the former is solvable in a (strictly) local fashion, as its satisfaction can be verified with(in a well-defined domain of) a single representation at the interface. On the other hand, the derivational nature of language, i.e., the Condition on Economy of Derivation, is fundamentally global. Thus formulated, the issue of economy in language is not an either/or question (Collins 1997; Johnson & Lappin 1997, 1999), nor one of the division of labor between global and local economy (Nakamura 1997). Rather, it is whether local economy can "solve" the global optimization of language, i.e., the combinatorial recursive generative property of its derivational character.
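Since a local optimum need not be global, a greedy local search can demonstrably halt at the wrong place. The following sketch is purely illustrative (the feasible region, objective function, and neighborhood are invented for the demo, not drawn from Fukui): it implements (1)–(3) directly, descending through the neighborhood U(x) until no better neighbor exists.

```python
# Toy illustration of (1)-(3): a local search over a finite feasible
# region R can halt at a local optimum that is not the global optimum.

R = range(0, 10)            # feasible region

def f(x):
    # invented objective with two basins: global optimum at x = 1,
    # a merely local optimum near x = 7
    return (x - 1) ** 2 * ((x - 7) ** 2 + 1)

def U(x):
    """Neighborhood: the adjacent feasible points of x."""
    return [y for y in (x - 1, x + 1) if y in R]

def local_search(x):
    """Greedy descent: move to a better neighbor until none exists."""
    while True:
        better = [y for y in U(x) if f(y) < f(x)]
        if not better:
            return x        # a local optimum in the sense of (3)
        x = min(better, key=f)

print(local_search(9))   # 7  -- stuck at a local optimum
print(local_search(3))   # 1  -- happens to reach the global optimum
print(min(R, key=f))     # 1  -- brute-force global search over R
```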
2.2. Computational complexity of global optimization

The most important problem for global optimization arises in the computational complexity that is induced. In the theory of computational complexity, problems, formalized as mathematical objects, are classified into complexity classes in terms of the computational power and resources that an algorithm requires to solve them (Johnson 1990, Papadimitriou 1994, among others). The computational power is modeled on formal automata, such as Turing machines (Turing 1937). The resource requirements are usually characterized by the growth rate of time and/or space with respect to the "size" of the problem.[2]

[2] For other computational resources that can be used for complexity measures, see Blum (1967), Hartmanis & Hopcroft (1971), Seiferas (1990), among others.
Time is measured by the number of (discrete) operational steps the algorithm takes to solve a problem (time complexity), and space by the memory capacity expended by the algorithm (space complexity). Any problem-solving procedure that can reasonably be called an "algorithm" can be formalized as a Turing machine (Church's Thesis).[3]

The complexity class of decision problems that are solvable by deterministic Turing machines in polynomial-bounded time is called the class P. The complexity class of decision problems that can be solved by non-deterministic Turing machines in polynomial-bounded time is called the class NP (Garey & Johnson 1979). Informally speaking, the class P problems are computationally tractable, i.e., "efficiently solvable" (Edmonds 1965), whereas the class NP problems are not.[4]

There are often tradeoffs between time and space, but the space classes include the corresponding time classes: the polynomial-time classes (N)P are included in the polynomial-space classes (N)PSPACE, and the exponential-time classes (N)EXP are included in the exponential-space classes (N)EXPSPACE (whether these inclusions are proper remains an open question). It is also known that the exponential classes (N)EXP(SPACE) properly include the corresponding polynomial classes (N)P(SPACE), whether deterministic or non-deterministic, and whether time-bounded or space-bounded. That is, for the reduction of overall computational complexity, the reduction of space complexity is formally more effective than the reduction of time complexity.

Optimization problems can be reformulated into roughly equivalent decision problems by supplying a target value (reduction). Thus reformulated, optimization problems can be classified in terms of their computational complexity. A variety of optimization problems, e.g., the network flow problem, are known to be reducible to the class P.

On the other hand, the traveling salesman problem is a classic example of an optimization problem that has not been shown reducible to the class P. It asks one to find the shortest tour of a given number of locations with specified distances from one another.
4
Church’s Thesis (Church 1941) was independently reached in its essentials in Turing (1937), Kleene (1936), Post (1936), and Markov (1954). Although it has not yet been formally proven, it is generally believed that the class NP properly includes the class P, as non-finite state non-deterministic automata are generally more powerful than the corresponding deterministic automata. It does not mean that the class NP problems cannot be solved, but that there is no general algorithm known to solve arbitrary instances of their types. If a given instance of a problem is small enough, it may be solved without difficulty.
216 Takashi Toyoshima tances from one another. It can be solved by enumerating all the possible tours, computing the total distance of each, and picking the shortest. This is a brute-force “global” search, in which the solution is guaranteed to be found, but requires time factorial to the number n of locations ( n–1! 2 ). However, once a candidate solution is supplied, such an optimization problem reduces to a class P decision problem; for the traveling salesman problem, it asks whether there is an equal or shorter tour with respect to the supplied value. That is, once an instance is provided, it becomes easy to verify the answer.5 This can be done by a “guess-and-check” procedure, “guessing” a candidate value deterministically, and “checking” it in polynomial-time bound. Various approximation schemes and “heuristic” algorithms have been proposed for these purposes. The strategy shared by all those heuristic algorithms is the use of local optimization to approximate the global optimal solution. Conjecturing that the Condition on Economy of Derivation, formulated as a discrete optimization problem, probably does not belong to the class P, Fukui (1996) argues that language is equipped with local conditions as biological counterparts of a heuristic algorithm that reduces the computational complexity induced by its global nature, inherent to the derivational system.
2.3. Economy conditions as optimization Adapting Fukui (1996), I formulate the Economy Condition of Derivation as a discrete optimization problem as follows: search for a derivation d that minimizes the value of the cost function ƒc (d), which maps a set D of derivations {d1, d2, d3, …, dn } to a set I of non-negative integers (cost of d), under the constraint that d belongs to a set Dc of convergent derivations. (5)
The Economy Condition of Derivation Objective function: ƒc (d): D → I Constraints: d ∈ Dc Problem: find d such that MIN(ƒc (d))
5
The “easy-verifiability” of the supplied value does not mean that the optimization is solved. To get the optimal value, many trials are needed with different supplied values, and yet there is no guarantee that the optimal value can be found, unless all possible combinatorial permutations are tried out – which simply reverts back to the brute-force exhaustive search.
Fukui (1996) takes the decision variable to be a binary relation r such that r ∈ R = {〈(π, λ), D〉}, where (π, λ) is a structural description of a linguistic expression and D is a convergent derivation that forms it, leaving the constraints open as premature to postulate. I take a derivation to be the decision variable, whether convergent or not. A derivation, in turn, can be described as a binary relation, but from a given Lexical Array (LA) to a set S of structural descriptions (SDs), that is, d: LA → S.6 This allows the separation of the convergence condition as the constraint for optimization, defining the reference set among which the optimal derivation can be determined. In this way, the problem of misgeneration that will be discussed in the following sections can be addressed (Chomsky 1995: 220ff., 227ff.). Yet, non-convergent derivations do not compete for the purpose of economy of derivation (Chomsky 1993, et seq.). For a derivation to converge, the given LA must be empty, and the structural description SD derived by d must be fully interpretable at both the Articulatory-Perceptual (A-P) interface and the Conceptual-Intentional (C-I) interface. That is, SD must be a pair of output representations <π, λ> that conform to the Economy Condition of Representation, namely the Principle of Full Interpretation, containing all and only information interpretable at the respective interfaces. Thus, I suggest the following formulation of the Economy Condition of Representation:

(6) The Economy Condition of Representation
    Objective function: ƒc(SD): S → I
    Constraint: SD =| d
    Problem: MIN(ƒc(SD))
Following Fukui again, I take the Economy Condition of Representation to be a discrete optimization problem of a search for a structural description SD = <π, λ> that minimizes the value of the cost function ƒc(SD), which maps a set S of structural descriptions {SD1, SD2, SD3, …, SDn} to a set I of non-negative integers (the cost of SD), under the constraint that SD is a pair <π, λ> formed by a derivation d, which is denoted as SD =| d in (6). Again, Fukui does not specify the constraint, but the interface representations π and λ cannot just be an arbitrary pair of representations that converge at the A-P and the C-I interfaces independently (Chomsky 1995: 225); they must be properly paired by a legitimate derivation d. This is not a trivial matter, insofar as some kind of deletion operation exists. The atomic sound-meaning correspondence paired in lexical items alone cannot guarantee the proper association between the structured propositional meaning and the phonetic symbols. Formulating the constraint this way embodies the Principle of Recoverability (Chomsky 1964) as a part of Economy of Representation without conceptual conflict.7 Being formulated as an optimization (minimization) problem, the Economy Condition of Representation as a whole also embodies the Principle of Inclusiveness (Chomsky 1995), insofar as there is no operation that “adds” an element absent in LA.

6. I take a derivation not to be functional. This is because two distinct outputs can be derived from exactly the same choice of lexical items as well as their associated formal features. For example, the following two sentences are both perfectly grammatical: (i) Mary thinks that John is intelligent. (ii) John thinks that Mary is intelligent. If LA is somehow structured in a way to control the order of lexical choices, or dynamically formed as suggested in Chomsky (2000), a derivation may be understood as a function, the problem of which is the main topic discussed in Section 4.
2.4. Other economy conditions in the Minimalist Framework

Other than the Shortest Derivation Condition, Chomsky’s (1992) Economy of Derivation included the Shortest Movement Condition and the Last Resort Condition. There was a tension, in fact a conflict, between the Shortest Derivation Condition and the Shortest Movement Condition: derivations with a single long-distance movement in one fell swoop are shorter than derivations with many shortest possible movement steps achieving the corresponding long-distance effect. In the Minimalist Program of Chomsky (1995), the operation Move is reformulated in terms of Attract, incorporating the Shortest Movement Condition (Minimal Link Condition) as one of its defining properties. The Last Resort Condition is also incorporated as feature-checking (or Agree in Chomsky 2000, et seq.), another defining property of the operation Move.

7. Obviously, the most economical representation is the one in which everything is deleted, but unless a deletion operation can wipe out multiple items in one step, deleting more items takes more steps, resulting in longer derivations (cf. Kitahara 1995, among others). Thus, there is a tension between Economy of Representation and Economy of Derivation that is balanced on the Principle of Recoverability.
This meant that neither the Shortest Movement Condition nor the Last Resort Condition is any longer an economy condition on derivation; there is simply no movement operation that is not shortest or last resort. The computational system of human language CHL just does not have such an operation. That leaves only the Shortest Derivation Condition as the economy condition of derivation, resolving the conflict with the Shortest Movement Condition. Yet, it remains to be seen how we can get beyond its informal idea and improve on its implementation. Determining the shortest derivation is not easy; a “global” search among all the possible derivations is guaranteed to succeed but is not computationally feasible. The task is, then, to formulate a local condition that is guaranteed to yield the shortest derivation. To that end, the notion of locality must be rethought; in section 7, I propose a derivationally local principle that yields the shortest derivation. In order to see the problems of the conventional notion of locality, let us turn to the problems of misgeneration.
3. Problems of misgeneration

3.1. The over-generation problem

Since Chomsky (1995), it has widely been held that the operation Merge is “costless,” whereas the operation Move is less economical than Merge. This is because Merge is required for any symbolic system, whereas Move is not usually used in well-designed artificial symbolic systems. Thus, Move is an “imperfection” of natural language. In addition, Move can be considered as being composed of the suboperations Feature-Checking (or Agree in Chomsky 2000) + Merge (+ Generalized “Pied-Piping”). On this view, in which Merge preempts Move, there is an over-generation problem when an expletive is involved. In Chomsky’s (1995: 295ff.) account, superraising (7) cannot be generated.

(7) *Johni seems (that) it was told ti [CP that TP]
Presumably, (7) would have gone through the following stages in its derivation:
(8) a. [Tʹ T was told John [CP that TP]]
    b. [TP it [Tʹ T was told John [CP that TP]]]   Merge (it)
    c. [CP (that) [TP it [Tʹ T was told John [CP that TP]]]]
    d. [v/VP seems [CP (that) [TP it [Tʹ T was told John [CP that TP]]]]]
    e. [Tʹ T [v/VP seems [CP (that) [TP it [Tʹ T was told John [CP that TP]]]]]]
At the stage (8a), there was an option of either Merge (it) or Move (John). As Merge (it) is assumed to be more economical than Move (John), the former is chosen (8b). When the derivation reaches the stage (8e), there is no more relevant lexical item to fill the matrix Spec(TP). Thus, the only operation available at this stage is Move. As the expletive it is closer to the matrix T than John is, it moves to become the matrix Spec(TP), yielding the following.

(9) *[TP Iti [Tʹ T [v/VP seems [CP (that) [TP ti [Tʹ T was told John [CP that TP]]]]]]]
Chomsky (ibid.) reasoned that although the matrix T satisfies its EPP requirement, its Case-feature cannot be checked, since the Case-feature of it has been checked and erased when it is Merged to the embedded Spec(TP) – hence the derivation crashes. Thus, superraising (7) is blocked by the crashing derivation (9). As has been noticed, however, there are two serious oversights in this account. The first is that crashing derivations should not be able to block other derivations. Otherwise, Move will never be applied insofar as Merge preempts it. By assumption, derivations involving only Merge are more economical than the ones involving Move. However, most, perhaps all, I suspect, derivations involving only Merge crash. If crashing derivations can block other derivations, no derivation with Move survives. Thus, another convergent derivation must be sought to block superraising (7). The second is that covert raising of John’s Case-feature in (9) should be able to check the Case-feature of the matrix T, as John’s Case-feature has not been checked. Thus, (9) should converge, contrary to Chomsky’s (ibid.) claim. That is, although superraising (7) is correctly blocked, another ungrammatical derivation (9) is over-generated.
3.2. The under-generation problem

In the Phase-Based Probe-Goal Theory of Chomsky (2000), superraising (7) and the over-generation of (9) are both blocked. When the derivation
reaches the stage (8e), the probe T in the matrix clause cannot access John or it in the embedded TP, by the Phase-Impenetrability Condition (PIC).

(10) Phase-Impenetrability Condition (PIC) (Chomsky 2000: 108 (21))
     In phase α with head H, the domain of H is not accessible to operations outside α; only H and its edge are accessible to such operations.

In (8e), the relevant phases are the matrix vP and the embedded CP, so the probe T in the matrix clause cannot access it in the embedded TP. Therefore, neither the superraising (7) nor the ungrammatical (9) can be over-generated. Even if PIC does not hold, the superraising (7) cannot be derived from (8e), because of the Defective Intervention Constraint (DIC).

(11) Defective Intervention Constraint (DIC) (Chomsky 2000: 123 (42))
     α > β > γ
     where > is c-command, β and γ match the probe α, and β is inactive.
In (8e), the probe α is the matrix T, β is it, and γ is John. Even though John is still active and is a potential goal for the probe α, the matrix T, it is inactive and intervenes between them, so that Agree between the matrix T and John cannot be established. Thus, John is unable to move to the matrix Spec(TP), correctly blocking the superraising (7). Yet, the grammatical (12) cannot be derived either, insofar as Merge (it) preempts Move (John) at the stage (8a) – the under-generation problem.

(12) [TP It [Tʹ T [v/VP seems [CP (that) [TP Johni [Tʹ T was told ti [CP that TP]]]]]]]

Note in passing that in this account, superraising is doubly ruled out, by PIC and by DIC, while the under-generation problem is created. Minimalist intuitions lead us to cast doubt on whether this is the optimal solution.
4. The determination problem of Lexical Subarray

Acknowledging the under-generation problem, Chomsky (2000: 106ff.) introduces a novel concept of Lexical Subarray (LSA), in terms of which a phase is defined. At the start of a derivation, the initial lexical array (LA) is selected from the lexicon, and as the derivation proceeds, an LSA, a subset of lexical items in LA, is extracted, placed in active memory (the “workspace”), and submitted to the computation. When the LSA is exhausted, the computation may proceed, or it may return to LA, extracting another LSA, and proceed as before. Thus, at a given stage of derivation, Move may take place if the LSA in active memory does not contain an expletive.8 Chomsky (ibid.) asserts that this cyclic access to LA reduces “operative” complexity.9 Chomsky (ibid.) then characterizes LSA as “propositional,” and claims:

LAi [= LSA: TT] can then be selected straightforwardly: LAi [= LSA: TT] contains an occurrence of C or v, determining clause or verb phrase … [emphasis added: TT]
Even if LSA can be selected straightforwardly as Chomsky (ibid.) claims, it cannot be a genuine solution; it merely replaces the problem of (narrow) syntactic computation with the problem of how LSA is extracted from LA. “Expletive insertion” is a matter of syntax proper; it is the heart of the problem of EPP, a perennial troublemaker throughout the history of generative grammar. It should not be relegated to some “exo-syntactic” process of LSA formation, but should be resolved in narrow syntactic computation per se.10 In other words, it should be accounted for in terms of how lexical items are combined to build a structure, not in terms of how lexical items are made available for narrow syntactic computation. Merely characterizing LSA as “propositional” cannot guarantee the exclusion of an expletive from LSA either, since expletives are presumably “meaningless,” and hence it should not matter whether one is included in LSA or not; it neither contributes any meaning to, nor hampers the content of, a “proposition.” Neither will it suffice to determine LSA by containment of a single C or v. Selection of a relevant LSA is not so straightforward as Chomsky (2000) claims.
8. This is, in fact, an over-simplification. Move cannot take place if the LSA in active memory contains any lexical items, or if any complex structures have already been built in parallel spaces, that satisfy the EPP requirement, presumably of DP category. Merge of these should preempt Move of the raising argument in question.
9. I take what Chomsky (2000) calls “operative” complexity to correspond to “computational” complexity measured by the number of “operational” steps, i.e., time-complexity, in information science and discrete mathematics.
10. LSA extraction is “exo-syntactic” in the sense that it is not itself a structure-building operation. Yet, it still must be internal to the computational system of human language CHL.
In principle, there are nCm possible ways to form an LSA by extracting a subset of m lexical items from an LA of n remaining lexical items. Take, for instance, the following example.

(13) it T1 seems [that friendsi T2 were told ti [CP ]]   (cf. op. cit.: 129 (48b))
At the stage where the complement CP of told has somehow been constructed successfully, LA contains at least 8 remaining lexical items {T1, seem, that, T2, were, told, friends, it}, assuming for simplicity that were = v and that there is no matrix C.11 The relevant LSA in question must contain at least 1 lexical item, either that (= C) or were (= v), but not both. Thus, there are 6 remaining lexical items {T1, seem, T2, told, friends, it}, from which anywhere from none to all items can be extracted to form an LSA, either with that or with were. That is, there are 6C0 + 6C1 + 6C2 + 6C3 + 6C4 + 6C5 + 6C6 = 64 ways of extracting a subset to form an LSA with that, and likewise with were. Therefore, there are 64 × 2 = 128 ways in total to form an LSA at the stage where the complement CP of told has been constructed. If the expletive it is to be excluded at this same stage, the potential combinations to be considered are among the 5 remaining lexical items {T1, seem, T2, told, friends}, from which anywhere from none to all items can be extracted to form an LSA, either with that or with were. There are 5C0 + 5C1 + 5C2 + 5C3 + 5C4 + 5C5 = 32 ways of extracting a subset that can form an LSA, either with that or with were, and therefore 32 × 2 = 64 ways in total to form an LSA without the expletive it at this stage. See Appendix. In a nutshell, there are 128 potential LSA, and 64 candidate LSA that do not contain the expletive it (underlined in Appendix), out of which there is only 1 correct LSA (marked ❖ in Appendix) that yields the desired derivation (12). Thus, extraction of the correct LSA is not a trivial matter, requiring a search among a number of candidates exponential (m · 2^(n−m)) in the number n of remaining lexical items in LA, where m is the number of phase-defining lexical items it contains, i.e., C or v. That is, extraction of the correct LSA is a problem of the exponential class, be it the time-class (N)EXP or the space-class (N)EXPSPACE (see section 2.2).
11. In the structure (13), were is not raised to T2, for ease of the discussion to follow. As far as I can see, there is no relevant effect even if were is not v but is directly generated as T2. If a matrix C must always be present, the combinatorial problem to be discussed below will be further compounded.
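The counts just given are easy to check mechanically. The snippet below is a throwaway verification, not part of the theory: it enumerates the candidate LSA for (13) exactly as described, one phase-defining head plus any subset of the non-phase items.

```python
from itertools import chain, combinations

def candidate_lsas(phase_heads, others):
    """All LSA: exactly one phase-defining head plus any subset of the rest."""
    subsets = chain.from_iterable(
        combinations(others, k) for k in range(len(others) + 1))
    return [frozenset({head, *rest})
            for rest in subsets for head in phase_heads]

others = ["T1", "seem", "T2", "told", "friends", "it"]
all_lsas = candidate_lsas(["that", "were"], others)
print(len(all_lsas))                                 # 128 potential LSA
print(len([s for s in all_lsas if "it" not in s]))   # 64 without the expletive
```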
In sum, the “operative” complexity in derivation is reduced at the cost of added “computational” complexity of an exponential order in the “exo-syntactic” process of extracting the correct LSA. Taking into account the redundancy that superraising is doubly ruled out by PIC and DIC, this does not strike me as a genuine minimalist solution.
5. The problem of non-deterministic economy: Collins (1997)

Underneath the problems we have reviewed lies the idea that Merge preempts Move (whenever possible). This view seems to persist even after Chomsky (2004: 110) reinterprets Move as internal Merge (IM), which comes “free,” just as the conventional Merge does, which is in turn reinterpreted as external Merge (EM). Thus, Chomsky (2004: 114) still asserts, “Move = Agree + Pied-piping + (external) Merge.” The displacement property of language is a necessity, not an imperfection, so that Move (IM) is a freely available operation, just as (external) Merge (EM) is; IM and EM are isotopes of Merge, and they both come “free.” One may argue that IM conceptually presupposes EM, as IM can be applied only to syntactic objects constructed through EM. On the other hand, EM operates on two separate syntactic objects, whereas IM operates on a single syntactic object. Or one may argue that Pure Merge, Merge that is not part of Move (Chomsky 2000: 103), involves c-/s-selection and determination of whether the terms are drawn from LA or from the syntactic objects already built; these are on a par with Agree and Pied-Piping for Move. Conceptual arguments can be made both ways, and ultimately the question is empirical: whether either one has priority over the other or they are on an equal footing. Mere reinterpretation of Move (internal Merge) and (external) Merge as different or not comparable does not solve the over-generation problem we reviewed in section 3. In fact, Collins (1997) argues that Move and Merge are not comparable, and makes a proposal which in effect renders Move and Merge equal in cost. He contends that economy conditions should be formulated in a local fashion as follows:

(14) Local Economy (op. cit.: 4 (3))
     Given a set of syntactic objects Σ which is part of derivation D, the decision about whether an operation OP may apply to Σ (as part of optimal derivation) is made only on the basis of information available in Σ.
Collins (op. cit.: 90ff.) regards Move as Copy + Merge, and the operation Select as a Copy operation out of the lexicon. Thus, Move and Select + Merge are both instances of Copy + Merge operations, so that they both need to be triggered as last resort operations. He then formulates Last Resort as an independent economy condition, rather than as a defining property of an operation:

(15) Last Resort (op. cit.: 9 (12))
     An operation OP involving α may apply only if some property of α is satisfied.

For Move, Collins (op. cit.) adopts the contemporary minimalist assumption that the relevant property to be satisfied is the checking of uninterpretable formal features. For Merge, he proposes the following principle:

(16) Integration (op. cit.: 66 (8))
     Every category (except the root) must be contained in another category.

Then, Collins (op. cit.) formulates Minimality as another independent economy condition.

(17) Minimality (op. cit.: 9 (13))
     An operation OP (satisfying Last Resort) may apply only if there is no smaller operation OPʹ (satisfying Last Resort).

Collins (op. cit.) adopts the standard Minimalist conception of the Minimal Link Condition as the metric for “smallness” of the operation Move. For Merge, the number of merged objects is counted, under a new formulation of Merge, which he calls Unrestricted Merge (op. cit.: 75ff.). Unrestricted Merge is a generalized grouping operation that applies to any number of constituents, but its vacuous application to no element does not satisfy Last Resort, so it is not possible. The binary application is smaller than ternary, quadripartite, etc., applications involving more than two elements. Given Last Resort and Minimality, a unary application to a single element would be the “smallest” application of Unrestricted Merge, but since such a unary application inevitably yields infinite recursion, it is stipulated to be impossible. Therefore, only the binary application involving two elements is chosen by Minimality.
Given these, Collins (1997) claims that Move and (Unrestricted) Merge are not comparable, and that they are equally economical insofar as they obey Minimality (17).12 Thus, neither one is favored over the other in his local economy. In effect, they are equal in cost, and hence both options can be pursued. This non-deterministic nature of Collins’ (op. cit.) local economy faces the same over-generation problem as Chomsky (1995), which we have seen in section 3. Since Merge (it) and Move (John) are not comparable and hence equal in cost at the stage (8a), repeated below as (18), there are two possible continuations, (8b) repeated below as (19), and (20).

(18) [Tʹ T was told John [CP that TP]]
(19) [TP it [Tʹ T was told John [CP that TP]]]   Merge (it)
(20) [TP Johni [Tʹ T was told ti [CP that TP]]]   Move (John)
As desired, the continuation from (19 = 8b) cannot lead to superraising (7), repeated below as (21), since it violates Minimality, and the continuation from (20) may correctly yield the grammatical (12), repeated below as (22).

(21) *[TP Johni [Tʹ T [v/VP seems [CP (that) [TP it [Tʹ T was told ti [CP that TP]]]]]]]
(22) [TP It [Tʹ T [v/VP seems [CP (that) [TP Johni [Tʹ T was told ti [CP that TP]]]]]]]

Yet, another continuation from (19 = 8b) will lead to over-generation of the ungrammatical (9), repeated below as (23).

(23) *[TP Iti [Tʹ T [v/VP seems [CP (that) [TP ti [Tʹ T was told John [CP that TP]]]]]]]
12. There is an oversight in this reasoning of Collins’ (1997). Merge is not necessarily of lexical items, e.g., a complex subject DP with a transitive verb-complement structure (whether Vʹ or vʹ), so that it need not be preconditioned by Select (as Copy out of the lexicon). Thus, Merge of two complex structures already built in parallel, which involves no Copy (= Select of a lexical item), may still be more economical than Move (= Copy + Merge).
6. The problem of static economy: Move over Merge

It is clear by now that Move (John) needs to preempt Merge (it) at the stage (18 = 8a) in order to derive the grammatical (22 = 12) while blocking superraising (21 = 7) and the over-generation of the ungrammatical (23 = 9). Yet, Move cannot just always take precedence over Merge. Consider the continuation (24) from (20).

(24) a. [CP (that) [TP Johni [Tʹ T was told ti [CP that TP]]]]
     b. [v/VP seems [CP (that) [TP Johni [Tʹ T was told ti [CP that TP]]]]]
     c. [Tʹ T [v/VP seems [CP (that) [TP Johni [Tʹ T was told ti [CP that TP]]]]]]

At this stage, Move (John) over Merge (it) will yield the following, leaving it in LA.

(25) [TP Johni [Tʹ T [v/VP seems [CP (that) [TP tʹi [Tʹ T was told ti [CP that TP]]]]]]]

Chomsky (1995: 226) has claimed that if LA is not exhausted, “no derivation is generated and no questions of convergence or economy arise.” At (25), LA still contains it, so (25) is not a completed derivation from (24c). Thus, Merge (it) is the only choice, yielding the grammatical (22 = 12). Notice at this point that the derivation of the grammatical (22 = 12) is one step shorter than that of the ungrammatical (23 = 9): in order to satisfy the EPP-feature and the Case-feature of the matrix T, the former invokes one movement of John, whereas the latter invokes two movements, overt movement of it for EPP and covert movement of John’s Case-feature, with the number of mergers and of all the other movements being equal. Thus, the Shortest Derivation Condition, imposing the fewest-step requirement, could have chosen the grammatical (22 = 12) while blocking the superraising (21 = 7) as well as the over-generation of the ungrammatical (23 = 9), but only if Move had been chosen over Merge at the stage (18 = 8a). This illustrates a case where the computation is stuck in a local optimum that is not the global optimum, unable to derive the shortest derivation because of the fixed decision for Merge over Move applied locally. Yet, a simple-minded appeal to the necessity of LA exhaustion does not solve the problem unequivocally. Suppose the derivation reached the stage (25), and LA still contained not only it, but also another T, is, likely, and another that (C). Then, the derivation could have continued as (26a–d).13

13. The putative raising of is to T is ignored here for ease of discussion.
(26) a. [CP that [TP Johni [Tʹ T [v/VP seems [CP (that) [TP tʹi [Tʹ T was told ti [CP that TP]]]]]]]]
     b. [v/VP is likely [CP that [TP Johni [Tʹ T [v/VP seems [CP (that) [TP tʹi [Tʹ T was told ti [CP that TP]]]]]]]]]
     c. [Tʹ T [v/VP is likely [CP that [TP Johni [Tʹ T [v/VP seems [CP (that) [TP tʹi [Tʹ T was told ti [CP that TP]]]]]]]]]]
     d. *[TP It [Tʹ T [v/VP is likely [CP that [TP Johni [Tʹ T [v/VP seems [CP (that) [TP tʹi [Tʹ T was told ti [CP that TP]]]]]]]]]]]

At (26d), LA is exhausted, so the derivation is complete, and yet it is ungrammatical. This is a classic case of a Tensed-S Condition violation (Chomsky 1973), and in the Phase-Based Probe-Goal Theory (Chomsky 2000, et seq.), it is attributed to the Activity Condition, which renders a goal inactive after its uninterpretable/unvalued features are deleted/valued.14 The point is that the necessity of LA exhaustion cannot force Merge (it) at the stage (24c). Stipulating the following ancillary condition, Shima (2000) maintains that Move is preferred over Merge whenever possible.

(27) [Spec, TP] can be filled only by a DP with structural Case. (op. cit.: 377 (10))

By (27), Move (John) is not an option at the stage (24c), since John has already checked its Case-feature in the embedded Spec(TP). Therefore, it is merged as the only choice to become the matrix Spec(TP), yielding the grammatical (22 = 12). Thus, Shima’s (op. cit.) proposal that Move is preferred over Merge, together with the condition (27), correctly generates (22 = 12). At the same time, both superraising (21 = 7) and the over-generated (23 = 9), repeated as (28) and (29) below respectively, are correctly blocked as desired.

(28) *[TP Johni [Tʹ T [v/VP seems [CP (that) [TP it [Tʹ T was told ti [CP that TP]]]]]]]
(29) *[TP Iti [Tʹ T [v/VP seems [CP (that) [TP ti [Tʹ T was told John [CP that TP]]]]]]]
14. For problems of the Activity Condition, see Nevins (2004).
Nevertheless, this cannot be the whole story, either. For the contrast in existential constructions (30), Shima (2000: 382ff.) offers a Case-based analysis on the assumptions in (31), following Belletti (1988) and Lasnik (1995).

(30) a. Therei seems [ ti to [be someone in the room]]
     b. *There seems [someonei to [be ti in the room]]

(31) a. The expletive there has a [structural: TT] Case feature, and a postcopular DP is optionally assigned [an inherent: TT] partitive Case by a copular.
     b. The expletive there has a formal feature to be checked by that of a DP with partitive Case. [emphases in bold added: TT]

The common intermediate stage for (30a,b) is the following.

(32) [Tʹ to [be someone in the room]]

If someone is assigned a partitive Case, it cannot fill Spec(TP), by the condition (27). Therefore, there will be merged as Spec(TP), and the derivation can continue as follows, yielding (33d = 30a).

(33) a. [TP there [Tʹ to [be someone in the room]]]
     b. [v/VP seems [TP there [Tʹ to [be someone in the room]]]]
     c. [Tʹ T [v/VP seems [TP there [Tʹ to [be someone in the room]]]]]
     d. [TP therei [Tʹ T [v/VP seems [TP ti [Tʹ to [be someone in the room]]]]]]
If someone is not assigned a partitive Case, say, nominative, it can fill Spec(TP). Then, the preference for Move over Merge dictates its movement over merger of there at the stage (32), and the derivation would continue as follows.

(34) a. [TP someonei [Tʹ to [be ti in the room]]]
     b. [v/VP seems [TP someonei [Tʹ to [be ti in the room]]]]
     c. [Tʹ T [v/VP seems [TP someonei [Tʹ to [be ti in the room]]]]]

At this point, there are two options: Move (someone) and Merge (there). As someone still has a structural Case and there is also assumed to have a structural Case, they both meet the condition (27). Then, the preference for Move over Merge demands that someone move to the matrix Spec(TP).

(35) [TP someonei [Tʹ T [v/VP seems [TP tiʹ [Tʹ to [be ti in the room]]]]]]   {there}
But that leaves there in LA. Shima (ibid.) claims, “hence [it] does not produce a single syntactic object, which makes the derivation crash (Chomsky 1995: 226).” However, this deduction is inaccurate; incomplete derivations neither converge nor crash. They fail to yield outputs at the interface levels. To quote Chomsky (1995: 226) accurately:

Note that no question arises about the motivation for application of Select or Merge in the course of application. If Select does not exhaust the numeration [≒ LA: TT], no derivation is generated and no questions of convergence or economy arise. …
In other words, if a choice made by an economy consideration does not yield a complete derivation, that economy consideration becomes irrelevant, and an alternative that completes the derivation needs to be sought. Thus, at the stage (34c), Move (someone) over Merge (there) is not really an option, and only Merge is a possible continuation, leading to (30b), which is ungrammatical and so should be a crashing derivation. In Shima’s (2000) analysis, what is wrong with (30b) must be the inability of someone to check some formal feature of there. By hypothesis, someone in (34–35) is not assigned a partitive Case, and hence, by the assumption (31b), it cannot check whatever formal feature there has. Then, the question boils down to what the feature of there that needs to be checked is, and why it cannot be checked by anything other than that of a DP with partitive Case. Thus, Shima’s (op. cit.) account crucially relies on the stipulated condition (27) and the assumption (31) that there must in effect co-occur with a DP with partitive Case, which is dubious, as such Case is morphophonologically undetectable.
7. Preemptive Move: Dynamic economy

A desideratum that emerges from the observations thus far is that Move needs to preempt Merge sometimes, but not always. That is, an economy consideration that chooses between Merge and Move had better not be fixed statically. This view meshes well with Chomsky’s (2000: 110) reinterpretation of Move as internal Merge, “an operation that is freely available,” so that neither (external) Merge nor Move (internal Merge) is more economical than the other. This amounts to Collins’ (1997) proposal, but he did not pursue the issue further, ending up with the non-deterministic economy that did not solve the over-generation problem, as we have seen in section 5.
What kind of condition can allow such a shifting, yet deterministic, choice of operations, and how can we formulate it? In order to approach the answers to these questions, let us review the decisive points in the derivation for the “superraising” example. At the stage (18 = 8a), repeated below as (36), we want Move (John) over Merge (it).

(36) [Tʹ T was told John [CP that TP]]

At the stage (24c), repeated below as (37), Merge (it) needs to preempt Move (John).

(37) [Tʹ T [v/VP seems [CP (that) [TP Johni [Tʹ T was told ti [CP that TP]]]]]]

What could motivate these choices? At the stage (36 = 18 = 8a), the Case-feature of John and the φ-set of T are unvalued, and the EPP-feature of T needs to be deleted. Move (John) will delete the EPP-feature of T, valuing both the Case-feature of John and the φ-set of T. Merge (it) will also delete the EPP-feature of T, valuing the φ-set of T and the Case-feature of it itself, but will leave John’s Case-feature unvalued. At the stage (37 = 24c), the φ-set of the matrix T is unvalued, and its EPP-feature needs to be deleted as well. Move (John) will delete the EPP-feature of the matrix T, but cannot value its φ-set, as the Case-feature of John has already been valued in the embedded Spec(TP). Furthermore, it will be left in LA, generating no derivation. On the other hand, Merge (it) will delete the EPP-feature of the matrix T and value its φ-set as well as the Case-feature of it itself. Notice that the undesired choice of operations leaves unvalued/uninterpretable features in the resulting stage of the derivation. Note also that the number of features deleted/checked or valued/matched is the same whether Move (John) or Merge (it) applies at the stage (36 = 18 = 8a). Thus, neither Total Checking (Poole 1998) nor Maximum Matching (McGinnis 1998, Chomsky 2001: 15 (14)) of the sort can make the right choice. Exploiting these facts, I propose the following economy principle of derivation.

(38) Principle of Minimum Feature Retention (MFR)
     At every stage Σn of a derivation, choose the operation that leaves the fewest unvalued/uninterpretable features in the resulting stage Σn+1 of the derivation.
The intuitive idea behind this principle is that unvalued/uninterpretable features must ultimately be valued or deleted, so that carrying more of them along the derivation is less economical than valuing/deleting them as soon as possible. Keeping track of which unvalued/uninterpretable features are still unvalued or undeleted demands more memory capacity than forgetting about them as soon as they are deleted or valued. As we have reviewed in section 2.2, space-complexity is more important than time-complexity in reducing the over-all computational complexity. That is, memory considerations matter more than the number of operational steps in the reduction of computational complexity. Although MFR was conceived from an intuitive idea of memory reduction, it is formulated so as to reduce the number of operational steps. MFR dynamically makes a locally deterministic choice between (external) Merge and Move (internal Merge). MFR is dynamic in that it makes a choice whenever options arise, without any preference predetermined for either Merge or Move. MFR is deterministic in that it chooses either Merge or Move (unless there is a tie) at every point in the derivation where options arise, unlike Collins’ (1997) non-deterministic economy, which allows different operations to proliferate different continuations whenever options arise.15 Furthermore, MFR is local in that it does not make any trans-derivationally global comparison of all the derivations to determine which operation to choose at each point of the derivations. It crucially involves the so-called “look-ahead” of only one step in the derivation, Σn → Σn+1, but it does not “look far ahead” in the derivation, Σn → Σn+m (m > 1). This transitional one-step “look-ahead” is essential for the notion of locality in derivational economy, and it does not impose any significant computational load, as the choices are, at worst, time-bound by a linear function f = n + m + l of the number n of remaining lexical items in LA, the number m of complex structures already constructed in parallel, and the number l of candidate constituents that can be moved.16
15. Determinism of MFR is limited to the choice between Merge and Move, and I leave open for now the choice among Merge (x), or the choice among Move (y). For Move, see also fn. 16.
16. Given the “attract-the-closest” formulation of Move with feature-checking, and the fact that Move is contingent on unvalued/uninterpretable features of the probe/target that have not been valued or deleted, the number l of candidate constituents that can be moved is in most cases limited to 1. Also, n + m + l never exceeds the number k of lexical items in the initial LA at the start of the derivation.
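The decision procedure MFR describes can be rendered as a one-step look-ahead over the currently available operations. The sketch below is a schematic model under invented types (a stage is reduced to a label and a count of retained unvalued/uninterpretable features); it is not a parser, and the feature bookkeeping is deliberately toy-sized.

```python
from typing import Callable, Sequence, Tuple

Stage = Tuple[str, int]  # (label of the stage, count of retained unvalued features)

def mfr_choose(stage: Stage,
               operations: Sequence[Callable[[Stage], Stage]]) -> Stage:
    """Apply every available operation to the current stage and keep the
    result that retains the fewest unvalued/uninterpretable features
    (one-step look-ahead: compare candidate stages at n+1 only, never n+m)."""
    candidates = [op(stage) for op in operations]
    return min(candidates, key=lambda s: s[1])

# Toy rendering of stage (36 = 18 = 8a): Move (John) values everything,
# Merge (it) leaves John's Case-feature unvalued.
move_john = lambda s: ("Move (John)", 0)
merge_it = lambda s: ("Merge (it)", 1)
print(mfr_choose(("(8a)", 3), [merge_it, move_john]))  # ('Move (John)', 0)
```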
The infamous “look-ahead” is “look-far-ahead” of this sort, Σn → Σn+m (m > 1), which exponentially proliferates the potential derivations to be compared, just as in Collins’ (1997) non-deterministic economy. In fact, if absolutely no “look-ahead” is allowed, as in Collins’ (op. cit.: 4 (3)) local economy (14), the economy conditions will end up being equivalent to representational filters, as argued in Johnson & Lappin (1997, 1999). I argue that the notion of locality for the Economy of Derivation needs to be differentiated from the one for the Economy of Representation. Derivations, by nature, map one representation to potentially more than one representation in a sequential manner. Thus, locality in derivational economy should be defined in terms of one step in derivations, as illustrated in (39), rather than on a single representation, which remains the basis for the locality notion in representational economy.17
(39) [Tree diagram: LA branches into stages Σ1a, Σ1b, Σ1c, Σ1d, Σ1e, …; each of these branches into stages Σ2a, Σ2b, …, which in turn branch into Σ3a, Σ3b, …, and so on. Each branching from one stage to the next constitutes a derivationally local one step.]
Let us now go through in some detail how MFR contrasts with Merge over Move escorted by LSA in Chomsky’s (2000, et seq.) Phase Theory, in deriving (13), repeated below with a little more detailed structure as (40).

(40) [TP It T1 [vP v [VP seems [CP that [TP friendsi T2 [vP were [VP ti told [ CP ]]]]]]]
17. We may reformulate MFR to incorporate a minimality notion of representational economy as follows:
(i) Principle of Minimum Feature Retention (reformulated)
    At every stage Σn of a derivation, choose an operation that yields the minimum representation in the resulting stage Σn+1 of the derivation.
How we define the minimum representation is an issue, however, which I have to leave aside for another occasion. The number of unvalued/uninterpretable features is certainly relevant, but so is the “size” of the structures. With the standard assumption of the Extension Requirement (Chomsky 1993: 22ff.), Merge always grows the structure “taller,” whereas Move need not, as in the case of head-to-head adjunction or of covert movement, in the relevant sense.
To converge as (40), the derivation must have gone through the following stage:18

(41) [Tʹ T2 [vP were [VP friends told [ CP ]]]]

At this stage, there are n1 = 5 lexical items left in LA, {that, T1, seems, v, it}, assuming, again for simplicity, that were = v and there is no matrix C (see fn. 11). Presumably, there is m1 = 0 complex structure constructed separately, and there is l1 = 1 constituent that can be moved, namely, friends. Therefore, there are (n + m + l) = (5 + 0 + 1) = 6 options at this stage: Merge (that), Merge (T1), Merge (v), Merge (seems), Merge (it), and Move (friends).

(42) a. [TP that [Tʹ T2 [vP were [VP friends told [ CP ]]]]]
     b. [TP T1 [Tʹ T2 [vP were [VP friends told [ CP ]]]]]
     c. [vP v [Tʹ T2 [vP were [VP friends told [ CP ]]]]]
     d. [TP seems [Tʹ T2 [vP were [VP friends told [ CP ]]]]]
     e. [TP it [Tʹ T2 [vP were [VP friends told [ CP ]]]]]
     f. [TP friendsi [Tʹ T2 [vP were [VP ti told [ CP ]]]]]
The computational system of human language CHL performs all 6 possible operations, in order for MFR to compare the resulting stages. Merge (that) may delete the EPP-feature of T2, but cannot value the φ-set of T2 or the Case-feature of friends, leaving 2 uninterpretable/unvalued feature(set)s in the resulting stage. Merge (T1) may also delete the EPP-feature of T1, as well as the EPP-feature of T2, but does not value the φ-set of T2 or the φ-set of T1, as they can be conflicting. Consequently, the Case-feature of friends cannot be valued either, leaving 3 uninterpretable/unvalued feature(set)s in the resulting stage.19 Merge (v) or Merge (seems) may also be able to delete the EPP-feature of T2, but cannot value the φ-set of T2 or the Case-feature of friends, leaving 2 uninterpretable/unvalued feature(set)s in the resulting stage. Merge (it) will delete the EPP-feature of T2, valuing the Case-feature of it itself and, perhaps incorrectly, the φ-set of T2 as singular, but leaves the Case-feature of friends unvalued in the resulting stage.
18. I assume that the verb tell takes a complement CP as its direct object in its complement position and an indirect object in its specifier position as their base argument positions.
19. I assume that the φ-set of T is valued by the interpretable φ-set of DP, and the Case-feature of DP by the interpretable tense-feature of T.
Move (friends) will delete the EPP-feature of T2, and value the φ-set of T2 and the Case-feature of friends, leaving no uninterpretable/unvalued feature in the resulting stage. Therefore, MFR chooses Move (friends) as the most economical operation at the stage (41), and discards all the other options. Or one may think of MFR as choosing the resulting representation (42f) that is “minimum,” the representation with the fewest unvalued/uninterpretable feature(set)s (see fn. 17). This may appear to be a complex process, but any system has to decide which lexical item to Merge at a given point of a derivation, even if Merge always preempts Move. Verification of the remaining uninterpretable/unvalued feature(set)s is not implied in other systems, but any system needs to keep track of uninterpretable/unvalued feature(set)s. CHL performs all the possible operations, either in parallel or in random sequence one by one, keeping each result in memory until a new winner appears. In either case, MFR does not really need to count their exact cardinality; MFR only needs to tell which resulting stage has fewer uninterpretable/unvalued feature(set)s. Collins (1997: 132, n.10) also broaches an alternative to counting, and in fact, counting may not be such a strange bedfellow of grammar after all – if the core property of the Faculty of Language in the Narrow sense (FLN) is recursion (Hauser, Chomsky & Fitch 2002; Fitch, Hauser & Chomsky 2005) and the recursive application of Merge yields the successor function (Chomsky 2008, Hinzen 2008), it will be no surprise if grammar turns out to be able to count.

In Chomsky’s (2000, et seq.) Phase Theory with Merge over Move, on the other hand, before the derivation reaches the stage (41), two LSA must be formed after the embedded CP is successfully constructed. At the stage where the embedded CP was completed, there were n = 9 lexical items left in LA, {it, T1, v, seems, that, T2, were, friends, told}, which contained m = 3 phase-defining lexical items, {v, that, were}. Thus, there were 2^(9−3) × 3 = 192 ways to form an LSA, out of which only one correct LSA, {were, friends, told}, had to be found for the vP-phase.

(43) [vP were [VP friends told [ CP ]]]

Furthermore, in order to derive (43), those three lexical items had to be merged in the correct order: Merge (told), Merge (friends), and Merge (were). There are 3! = 6 possible permutations, among which the correct order had to be found. Even if a wrong LSA is formed, CHL would not know it until all the lexical items in that LSA have been merged in all the possible orders. The largest
possible LSA contains 7 lexical items: a single phase-defining lexical item, plus all 6 of the non-phase-defining lexical items {it, T1, seems, T2, friends, told}. As there are 3 phase-defining lexical items, there are 3 such largest LSA, for each of which there are 7! = 5040 permutations. The second largest LSA contains 6 lexical items, and there are 3 × 6 = 18 such LSA, for each of which there are 6! = 720 permutations. The third largest LSA contains 5 lexical items, and there are 3 × 15 = 45 such LSA, for each of which there are 5! = 120 permutations. The fourth largest LSA contains 4 lexical items, and there are 3 × 20 = 60 such LSA, for each of which there are 4! = 24 permutations. The fifth largest LSA contains 3 lexical items, and there are 3 × 15 = 45 such LSA, for each of which there are 3! = 6 permutations, and the sixth largest LSA contains 2 lexical items, and there are 3 × 6 = 18 such LSA, for each of which there are 2! = 2 permutations. The smallest possible LSA are the ones with only a phase-defining lexical item, v, that, or were. Thus, there are (3 × 5040) + (18 × 720) + (45 × 120) + (60 × 24) + (45 × 6) + (18 × 2) + 3 = 35,229 orders of Merge, out of which only one order, namely Merge (told) – Merge (friends) – Merge (were), could derive (43) from the stage where the complement CP had been completed. After such a massive search, another LSA had to be extracted for the next CP-phase, in order for the derivation to reach (41). When the vP-phase (43) was completed, there were n = 6 lexical items left in LA, {it, T1, v, seems, that, T2}, which contained m = 2 phase-defining lexical items, {v, that}. The largest possible LSA contains 5 lexical items, and there are 2 such largest LSA, for each of which there are 5! = 120 permutations. The second largest LSA contains 4 lexical items, and there are 2 × 4 = 8 such LSA, for each of which there are 4! = 24 permutations. The third largest LSA contains 3 lexical items, and there are 2 × 6 = 12 such LSA, for each of which there are 3! = 6 permutations. The fourth largest LSA contains 2 lexical items, and there are 2 × 4 = 8 such LSA, for each of which there are 2! = 2 permutations. The smallest LSA contains just 1 phase-defining lexical item, and there are 2 such smallest LSA. Thus, for this CP-phase LSA, there are (2 × 120) + (8 × 24) + (12 × 6) + (8 × 2) + 2 = 522 possible orders of Merge. Therefore, in the worst-case scenario, Chomsky’s (2000, et seq.) Phase Theory with Merge over Move by LSA needs to try out a whopping 35,229 + 522 = 35,751 orders of Merge in order for the derivation to reach (41) correctly. On the other hand, if there is no LSA formation, there are at most 9P4 = 3,024 orders of Merge that need to be tried out in order for the derivation to reach (41) from the stage where the most embedded CP was completed. It must by now be obvious that the time-complexity involved in LSA is far
greater than without it, by an order of magnitude. MFR requires just 6 more options to be considered after the stage (41). Note, in passing, that the exponential problem of extracting the correct LSA could be resolved if the lexical items were ordered in LA for extraction, contrary to the standard assumption. Furthermore, the factorial problem of Merge orders could be resolved as well, by ordering the lexical items in LA or in LSA. Nevertheless, ordering of the lexical items nullifies the raison d’être of LSA itself; LSA is not needed if the ordering can control when the expletive is merged. As MFR makes a deterministic choice, the derivational paths are pruned down logarithmically as derivations proceed. Likewise, Chomsky’s (2000, et seq.) Phase Theory presumably discards all the derivational paths that are not chosen at every derivational stage, and yet an LSA needs to be formed when the derivation starts and whenever the previous LSA is exhausted. Thus, while the derivational paths are reduced logarithmically, there are stages where an exponential search for the desired LSA is involved, and consequently a factorial search for the correct derivational path, as we have seen.
(44) [Tree diagram: LA branches into candidate LSA1a, LSA1b, LSA1c, LSA1d, … (an exponential search); the chosen LSA then branches into stages Σ1a, Σ1b, …, which branch into Σ2a, Σ2b, …, and so on. At a later stage Σnx, another set of candidates LSA2a, LSA2b, LSA2c, LSA2d, … must be formed (another exponential search), branching into Σ(n+1)a, Σ(n+1)b, ….]
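The Merge-order totals computed above can likewise be verified mechanically. The snippet below is a throwaway check under the counting scheme described in the text: an LSA consists of one phase-defining head plus k other items, and each LSA of size k + 1 can be tried in (k + 1)! orders.

```python
from math import comb, factorial, perm

def merge_orders(n_phase_heads, n_others):
    """Total Merge orders over all LSA: one phase head plus k other items,
    each LSA of size k+1 tried in (k+1)! orders."""
    return sum(n_phase_heads * comb(n_others, k) * factorial(k + 1)
               for k in range(n_others + 1))

print(merge_orders(3, 6))   # 35229  (vP-phase: {v, that, were} + 6 others)
print(merge_orders(2, 4))   # 522    (CP-phase: {v, that} + 4 others)
print(perm(9, 4))           # 3024   (no LSA: ordered choice of 4 of 9 items)
```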
8. There as external argument

An immediate question may arise with respect to the there-existential constructions we reviewed for Shima’s (2000) static economy of Move over Merge in section 6. MFR appears to face a problem with the contrast (30), repeated below as (45).
(45) a. Therei seems [ ti to [be someone in the room]]
     b. *There seems [someonei to [be ti in the room]]

Presumably, both (45a,b = 30a,b) are derived from the common intermediate stage (32), repeated below as (46).

(46) [Tʹ to [be someone in the room]]

MFR does not seem to be able to decide between Move (someone) and Merge (there) at this stage, because of a tie: Move (someone) will delete the EPP-feature of T (to) but leave the Case-feature of someone unvalued; Merge (there) will also delete the EPP-feature of T (to) and leave the Case-feature of someone unvalued. Assuming with Chomsky (1995, et seq.), but contra Shima (2000), that there does not have any Case-feature, either Move (someone) or Merge (there) will leave the same number of features unvalued, namely the Case-feature of someone. We want Merge (there) for (45a,b = 30a,b), but we cannot exclude Move (someone) at the stage (46 = 32), given the ECM paradigm, such as the following:

(47) a. *Mary believes [ to [be someone in the room]]
     b. Mary believes [someonei to [be ti in the room]]

Do we need to fall back on Shima’s (2000) static economy of Move over Merge, with its Case-theoretical assumptions? Rather than a Case-based approach such as Shima (op. cit.) takes, I would pursue another possibility, hinted at by Lasnik (1995: 624ff., fn. 14):

“… However, if it could be stipulated in some such fashion that there can be introduced only as the subject of unaccusative verbs and (certain instances of) be, several of the phenomena that have come up in the present discussion would reduce to this one property, as observed by the reviewer. For example, the following would all be excluded, along with (16):
(i) *There seems/is likely [that John is tall]. (12)
(ii) *There strikes John/someone that Mary is intelligent. (13)
(iii) *I want [there someone here at 6:00]. (14b)
(iv) *There seems/is likely [there to be a man here]. (from footnote 10)

(16) *There is likely [someonei to be [ti here]]. [(16) appended for reference: TT]

The selectional restriction possibility thus deserves further investigation, which I put aside for another occasion.”
Kidwai (2002) independently argues that merger of the expletive there is restricted to Spec(v*P), a specifier of v with full argument structure: transitive v or v for experiencer verbs (Chomsky 2001: 9, 43, n. 8), surveying the observations of Levin & Rappaport-Hovav (1995) as follows (Kidwai 2002: 4 (10)):

Verb type                               there-existential
inherently directed motion (arrive)     yes
manner of motion (fall)                 yes
existence (live)                        yes
appearance (appear)                     yes
occurrence (ensue)                      yes
spatial configuration (lie)             yes
disappearance (disappear)               no
change of state (break verbs)           no
change of state (bend verbs)            no
change of state (cook verbs)            no
change of color (-en/-ify verbs)        no
change of state (deadjectivals)         no
From Levin & Rappaport-Hovav’s (1995) study, Kidwai (op. cit.) draws the generalization that the there-existential construction is possible only with unaccusatives that are incompatible with an interpretation of external causation and take both a Locative and a Theme argument. Remarking further that passives (48) and some ergatives (verbs of internally caused change of state (49a), verbs of sound/light emission (49b), as well as agentive manner of motion verbs (49c)) participate in the there-existential construction, Kidwai (op. cit.: 14) proposes a configurational licensing condition on the EPP-specifier (50).

(48) There was a building demolished. (ibid. (41))

(49) a. There bloomed a rose in the garden. (ibid. (42))
     b. There boomed a cannon in the distance. (ibid. (43))
     c. There ran a man into the room. (ibid. (44))

(50) v* can be assigned an EPP-feature iff v is merged with [VP DP V X(P)]. (ibid. (40))
The configuration depicted as [VP DP V X(P)] in (50) is instantiated as follows for unaccusatives (51a), passives (51b), and unergatives (51c).

(51) a. Unaccusatives: [VP DP(Theme) [Vʹ Vunacc PP(Locative)]]
     b. Passives: [VP DP(Direct Object) [Vʹ Vpass Prt(PARTICIPLE)]]
     c. Unergatives: [VP DP(External Argument) [Vʹ Vunerg PP(Locative/Directional)]]
Dynamic economy of derivation
241
ii) an unergative verb that selects a Locative/Directional argument and an External Argument of internally caused change of state, of sound/light emission, or of manner of motion; or iii) passivized participle. Then, I modify the intransitive structures as follows: (54) a. Unaccusatives vP wo there vʹ wo v VP wo Vʹ PP 3 (Locative) Vunacc DP (Theme) b. Passives vP wo there vʹ wo v VP (be) wo (XP) Vʹ (Indirect 3 Object) Vpass DP (PARTICIPLE) (Theme) c. Unergatives vP wo there vʹ wo v VP wo DP Vʹ (External 3 Argument) Vunerg PP (Locative/ Directional)
242 Takashi Toyoshima I take the Theme argument, i.e., direct object, to be the complement of both unaccusative and passivized verbs,20 and following Kidwai (2002), External Argument of internally caused change of state, of sound/light emission, of manner of motion, is located at Spec(VP). The idea that the expletive there is an “External Argument” is not anything new, as Chomsky (2004: 126, n.37) acknowledges; … Note that nothing requires that EA [External Argument: TT] be an actual argument: it could be a there-type expletive with φ-features …
Given these, the contrast (45 = 30), repeated below as (55), ceases to be a problem for MFR. (55) a. Therei seems [ ti to [be someone in the room]] b. *There seems [someonei to [be ti in the room]] The common intermediate stage for (55 a,b = 45a,b = 30 a,b) was not (46 = 32), repeated below as (56), but (57). (56) [Tʹ to [be someone in the room]] (57) [Tʹ to [there be someone in the room]] With the structure at the stage (57), the relevant options to be considered are: Move (there) and Merge of whatever DP, available in LA or constructed already in parallel. MFR dictates Move (there) here; Move (there) will delete the EPP-feature of T (to) and leaves the Case-feature of someone unvalued, whereas Merge (DP) can also delete the EPP-feature of T (to) but will “add” the Case-feature of that DP, which will be left unvalued, on top of leaving the unvalued Case-feature of someone. That is, Move (there) leaves fewer unvalued/uninterpretable features at the next stage in the derivation than Merge (DP), which adds an unvalued Case-feature of DP that is left unvalued. Notice that the number of unvalued/uninterpretable features is the same in the structure at the stage (57), i.e., the EPP-feature of T (to) and the Case-feature of someone, and either Move (there) or Merge (DP) deletes only the EPP-feature of T (to) and leaves someone’s Case-feature for the next stage of the derivation. 20
20. Although I do not exclude the possibility that the Locative PP in unaccusatives is a right-branching Spec(VP), I take it to be an adjunct internal to VP. In passives, the External θ-role is suppressed, but an “External Argument” can still be optionally realized as there in Spec(vP), I assume.
This is where the derivational notion of locality in MFR becomes crucial, with its one-step “look-ahead,” and it is why the economy principle I propose is formulated in terms of minimum feature retention, instead of “maximum feature elimination.” Accordingly, MFR chooses Move (there), and the derivation can proceed as follows:

(59) a. [TP therei [Tʹ to [ ti be someone in the room]]]
     b. [v/VP seems [TP therei [Tʹ to [ ti be someone in the room]]]]
     c. [Tʹ T [v/VP seems [TP therei [Tʹ to [ ti be someone in the room]]]]]
     d. [TP Therei [Tʹ T [v/VP seems [TP ti [Tʹ to [ ti be someone in the room]]]]]]
If there were in LA but not selected, no derivation would be generated, as the LA would not be exhausted.

(60) *Someonei seems [tiʹ to [be ti in the room]]   {there}
If there were not in LA, the following would result:

(61) Someonei seems [tiʹ to [be ti in the room]]

Again, Merge of some other DP, say, another expletive it, even if available in LA, is not an option, either at the embedded Spec(TP) or at the matrix Spec(TP), by MFR. In the embedded clause, Merge (it) at Spec(TP) will delete the EPP-feature of T (to) but leave the Case-feature of someone unvalued, and further “add” the unvalued Case-feature of it itself. Move (someone) to Spec(TP) will delete the EPP-feature of T (to) and leave the Case-feature of someone unvalued, but “add” no further unvalued/uninterpretable features. In the matrix clause, Merge (it) at Spec(TP) will delete the EPP-feature of T and value the Case-feature of it itself, but leave the Case-feature of someone unvalued. Move (someone) to Spec(TP) will delete the EPP-feature of T and value the Case-feature of someone, and “add” no further unvalued/uninterpretable features. This entails that the expletive it can only be Merged at Spec(TP) when all the unvalued/uninterpretable features have been deleted or valued in its domain, i.e., vP. In other words, it is the only “pure EPP expletive” that is “meaningless,” whereas there is a selected “External Argument” with an existential import. Consequently, the following are underivable by MFR.
(62) a. [TP Iti [Tʹ T [v/VP seems [TP ti [Tʹ to [be someone in the room]]]]]]
     b. [TP It [Tʹ T [v/VP seems [TP someonei [Tʹ to [be ti in the room]]]]]]

Furthermore, the following contrast, which Shima’s (2000: 384, fn. 8) Case-based approach has trouble with, will follow from MFR under the assumption that there is selected as an “External Argument” of v.

(63) a. It seems that there is someone in the room.
     b. *There seems that it is someone in the room.

Shima (ibid.) admits:

The preference for Move over Merge has nothing to do with the choice between it-insertion and there-insertion, and it, with no partitive Case, does not block association of someone, with partitive Case, to there. … I tentatively speculate that [Spec, T], which selects the copular be with partitive Case, must be filled by there rather than it, but otherwise I leave this problem open. [emphasis in bold added: TT]
That is, Shima (op. cit.) has to resort to “selection” of there after all, on top of his Case-theoretical assumptions (31), which were dubious, as we have seen. Finally, consider the ECM paradigm (47), repeated as (64) below. If there were in LA and selected, (65) would have been derived.

(64) a. *Mary believes [ to [be someone in the room]]
     b. Mary believes [someonei to [be ti in the room]]

(65) Mary believes [therei to [ti be someone in the room]]

If there were in LA but not selected, no derivation would be generated, as the LA would not be exhausted.

(66) *Mary believes [someonei to [be ti in the room]]
     {there}   (unselected residue in LA)

If there were not in LA, (64b) would result.
9. Concluding remarks

As I have shown, determination of the Lexical Subarray (LSA) is not so straightforward that computational complexity can be reduced in the way Chomsky (2000) envisaged, and it should therefore be eliminated from the theory. The question that motivated the concept of LSA, namely why Merge of an expletive does not always preempt Move, can be answered by MFR (38), which dynamically makes a derivationally local, deterministic choice between (external) Merge and Move (internal Merge), without recourse to how lexical items are made available for narrow syntactic computation.

(38) Principle of Minimum Feature Retention (MFR)
     At every stage Σn of a derivation, choose an operation that leaves the fewest unvalued/uninterpretable features in the resulting stage Σn+1 of the derivation.

MFR invokes a one-step “look-ahead” Σn → Σn+1, which is the crucial notion of “derivational locality,” without which the notion of locality would end up being equivalent to representational constraints. For there-existential constructions, MFR appears to face a problem of indeterminacy because of a tie, but the tie is only apparent, since there is a selected “External Argument” of an intransitive VP that is saturated.

(52) The expletive there can optionally be selected as a kind of “External Argument” of v that selects an intransitive VP that is saturated.

(53) An intransitive VP is saturated if its head verb V is either:
     i) an unaccusative verb that is incompatible with an interpretation of external causation and selects a Locative and a Theme argument;
     ii) an unergative verb that selects a Locative/Directional argument and an External Argument of internally caused change of state, of sound/light emission, or of manner of motion; or
     iii) a passivized participle.

Insofar as I can see, there is no genuine case of a tie with regard to MFR. This implies that there is no real syntactic optionality. Alleged optionalities of word order, or constructional alternatives, must be due to lexical choices, that is, to distinct specifications of feature composition, not to a free choice among syntactic operations that economy conditions make available.
With regard to the notion of phase, I remain agnostic but open. Yet I contend that a phase must be redefined in some fashion other than through the Lexical Subarray, whose determination imposes on “exo-syntactic” processes a computational load that grows exponentially, as I have shown. My hunch is that a strong phase is to be defined, if it needs to be, in terms of completed formal licensing: when an XP has deleted/valued or moved away all of its uninterpretable/unvalued features, it is a strong phase. Otherwise it is a weak phase, which I do not think needs to be defined. It remains to be seen whether the Phase-Impenetrability Condition and the Defective Intervention Constraint retain their functions once a new definition of phase is made.
Appendix: Possible LSAs

6C0: {that}
6C0: {were}

6C1: {T1, that} {seem, that} {that, T2} {that, told} {that, friends} {that, it}
6C1: {T1, were} {seem, were} {T2, were} {were, told} {were, friends} {were, it}

6C2: {T1, seem, that} {T1, that, T2} {T1, that, told} {T1, that, friends} {T1, that, it} {seem, that, T2} {seem, that, told} {seem, that, friends} {seem, that, it} {that, T2, told} {that, T2, friends} {that, T2, it} {that, told, friends} {that, told, it} {that, friends, it}
6C2: {T1, seem, were} {T1, T2, were} {T1, were, told} {T1, were, friends} {T1, were, it} {seem, T2, were} {seem, were, told} {seem, were, friends} {seem, were, it} {T2, were, told} {T2, were, friends} {T2, were, it} {were, told, friends} {were, told, it} {were, friends, it}

6C3: {T1, seem, that, T2} {T1, seem, that, told} {T1, seem, that, friends} {T1, seem, that, it} {T1, that, T2, told} {T1, that, T2, friends} {T1, that, T2, it} {T1, that, told, friends} {T1, that, told, it} {T1, that, friends, it} {seem, that, T2, told} {seem, that, T2, friends} {seem, that, T2, it} {seem, that, told, it} {seem, that, told, friends} {seem, that, friends, it} {that, T2, told, friends} {that, T2, told, it} {that, T2, friends, it} {that, told, friends, it}
6C3: {T1, seem, T2, were} {T1, seem, were, told} {T1, seem, were, friends} {T1, seem, were, it} {T1, T2, were, told} {T1, T2, were, friends} {T1, T2, were, it} {T1, were, told, friends} {T1, were, told, it} {T1, were, friends, it} {seem, T2, were, told} {seem, T2, were, friends} {seem, T2, were, it} {seem, were, told, it} {seem, were, told, friends} {seem, were, friends, it} {T2, were, told, friends} {T2, were, told, it} {T2, were, friends, it} {were, told, friends, it}

6C4: {T1, seem, T2, were, told} {T1, seem, T2, were, friends} {T1, seem, told, were, friends} {T1, T2, were, told, friends} {seem, T2, were, told, friends} {T1, seem, T2, were, it} {T1, seem, were, told, it} {T1, T2, were, told, it} {seem, T2, were, told, it} {T1, seem, were, friends, it} {T1, T2, were, friends, it} {seem, T2, were, friends, it} {T1, were, told, friends, it} {seem, were, told, friends, it} {T2, were, told, friends, it}
6C4: {T1, seem, that, T2, told} {T1, seem, that, T2, friends} {T1, seem, that, told, friends} {T1, that, T2, told, friends} {seem, that, T2, told, friends} {T1, seem, that, T2, it} {T1, seem, that, told, it} {T1, that, T2, told, it} {seem, that, T2, told, it} {T1, seem, that, friends, it} {T1, that, T2, friends, it} {seem, that, T2, friends, it} {T1, that, told, friends, it} {seem, that, told, friends, it} {that, T2, told, friends, it}

6C5: {T1, seem, T2, were, told, friends} {T1, seem, T2, were, told, it} {T1, seem, T2, were, friends, it} {T1, seem, were, told, friends, it} {T1, T2, were, told, friends, it} {seem, T2, were, told, friends, it}
6C5: {T1, seem, that, T2, told, friends} {T1, seem, that, T2, told, it} {T1, seem, that, T2, friends, it} {T1, seem, that, told, friends, it} {T1, that, T2, told, friends, it} {seem, that, T2, told, friends, it}

6C6: {T1, seem, T2, were, told, friends, it}
6C6: {T1, seem, that, T2, told, friends, it}
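The lists above can be generated mechanically, which also makes the combinatorial claim in the text concrete: with six optional items there are 6C0 + 6C1 + … + 6C6 = 2^6 = 64 candidate subarrays per choice of embedded head, and 2^n candidates in general for n items. The sketch below is a toy reconstruction of mine, not part of the original text; the item names follow the appendix.

```python
from itertools import combinations

# For each choice of embedded head ("that" vs. "were"), every subset of
# the six remaining items is a candidate Lexical Subarray to evaluate.
items = ["T1", "seem", "T2", "told", "friends", "it"]

for head in ["that", "were"]:
    total = 0
    for k in range(len(items) + 1):
        subarrays = [set(c) | {head} for c in combinations(items, k)]
        print(f"6C{k} ({head!r}): {len(subarrays)} subarrays")
        total += len(subarrays)
    print(f"total ({head!r}): {total}")   # 2**6 = 64 candidates
```

Running it prints the binomial counts 1, 6, 15, 20, 15, 6, 1 per level, matching the lists in the appendix (the order of sets within a level may differ).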
Acknowledgements

This work is based on my earlier work, Toyoshima (1999, 2000), and is a revised version of the paper delivered at InterPhases: A Conference on Interfaces in Current Syntactic Theory, held in Nicosia, Cyprus, 18–20 May, 2006. I thank the audience for the questions and comments raised during the conference, and I am indebted to Kleanthes Grohmann for his patience in editing and to an anonymous reviewer for detailed constructive criticisms, to which I hope to have done due justice. Any remaining errors or shortcomings are my own. This work is partially supported by a Grant-in-Aid for Scientific Research (C) #14510626 and a Grant-in-Aid for Exploratory Research #19652044 from the Japan Society for the Promotion of Science, which I gratefully acknowledge here.
References

Belletti, Adriana
1988 The case of unaccusatives. Linguistic Inquiry 19: 1–34.
Blum, Manuel
1967 A machine-independent theory of the complexity of recursive functions. Journal of the Association for Computing Machinery 14: 322–336.
Chomsky, Noam
1951 The Morphophonemics of Modern Hebrew. M.A. thesis, University of Pennsylvania. [Published in 1979 from Garland Publishing: New York.]
1955 The logical structure of linguistic theory. Ms., Harvard University. [Published partially with revision in 1975 from Plenum Press: New York. Reprinted in 1985 from Chicago University Press: Chicago, IL.]
1973 Conditions on transformations. In A Festschrift for Morris Halle, Stephen R. Anderson and Paul Kiparsky (eds.), 232–286. New York: Holt, Rinehart and Winston.
1992 Some notes on economy of derivation and representation. In Principles and Parameters in Comparative Grammar, Robert Freidin (ed.), 417–454. Cambridge, MA: MIT Press.
1993 A minimalist program for linguistic theory. In The View from Building 20: Essays in Linguistics in Honor of Sylvain Bromberger, Kenneth Hale and Samuel Jay Keyser (eds.), 1–52. Cambridge, MA: MIT Press.
1995 Categories and transformations. In The Minimalist Program, Noam Chomsky, 219–394. Cambridge, MA: MIT Press.
1998 Some observations on economy in generative grammar. In Is the Best Good Enough? Optimality and Competition in Syntax, Pilar Barbosa, Danny Fox, Paul Hagstrom, Martha McGinnis and David Pesetsky (eds.), 115–127. Cambridge, MA: MIT Press and MITWPL.
2000 Minimalist inquiries: The framework. In Step by Step: Essays on Minimalist Syntax in Honor of Howard Lasnik, Roger Martin, David Michaels and Juan Uriagereka (eds.), 89–155. Cambridge, MA: MIT Press.
2001 Derivation by phase. In Ken Hale: A Life in Language, Michael Kenstowicz (ed.), 1–52. Cambridge, MA: MIT Press.
2004 Beyond explanatory adequacy. In Structures and Beyond: The Cartography of Syntactic Structures, Vol. 3, Adriana Belletti (ed.), 104–141. Oxford: Oxford University Press.
2008 On phases. In Foundational Issues in Linguistic Theory: Essays in Honor of Jean-Roger Vergnaud, Robert Freidin, Carlos P. Otero and Maria Luisa Zubizarreta (eds.), 133–166. Cambridge, MA: MIT Press.
Church, Alonzo
1941 The Calculi of Lambda-Conversion. Princeton, NJ: Princeton University Press.
Collins, Chris
1997 Local Economy. Cambridge, MA: MIT Press.
Edmonds, Jack
1965 Paths, trees, and flowers. Canadian Journal of Mathematics 17: 449–467.
Fitch, W. Tecumseh, Marc D. Hauser, and Noam Chomsky
2005 The evolution of the language faculty: Clarifications and implications. Cognition 97: 179–210.
Fukui, Naoki
1996 On the nature of economy in language. Cognitive Studies: Bulletin of the Japanese Cognitive Science Society 3: 51–71.
Garey, Michael R. and David S. Johnson
1979 Computers and Intractability: A Guide to the Theory of NP-Completeness. San Francisco, CA: W. H. Freeman.
Groat, Erich and John O’Neil
1996 Spell-out at the LF interface: Achieving a unified syntactic computational system in the Minimalist Framework. In Minimal Ideas: Syntactic Studies in the Minimalist Framework, Werner Abraham, Samuel D. Epstein, Höskuldur Thráinsson and C. Jan-Wouter Zwart (eds.), 113–139. Amsterdam: John Benjamins.
Hartmanis, Juris and John E. Hopcroft
1971 An overview of the theory of computational complexity. Journal of the Association for Computing Machinery 18: 444–475.
Hauser, Marc D., Noam Chomsky and W. Tecumseh Fitch
2002 The faculty of language: What is it, who has it, and how did it evolve? Science 298: 1569–1579.
Hinzen, Wolfram
2008 The successor function + LEX = human language? In InterPhases: Phase-Theoretic Investigations of Linguistic Interfaces, Kleanthes K. Grohmann (ed.). Oxford: Oxford University Press.
Johnson, David S.
1990 A catalog of complexity classes. In Handbook of Theoretical Computer Science A: Algorithms and Complexity, Jan van Leeuwen (ed.), 67–161. Amsterdam: Elsevier / Cambridge, MA: MIT Press.
Johnson, David E. and Shalom Lappin
1997 A critique of the Minimalist Program. Linguistics and Philosophy 20: 273–333.
1999 Local Constraints vs. Economy. Stanford, CA: CSLI Publications.
Kidwai, Ayesha
2002 Unaccusatives, expletives, and the EPP-feature of v*. Ms., Jawaharlal Nehru University, New Delhi.
Kitahara, Hisatsugu
1995 Target α: Deducing strict cyclicity from derivational economy. Linguistic Inquiry 26: 47–77.
Kleene, Stephen C.
1936 General recursive functions of natural numbers. Mathematische Annalen 112: 727–742.
Lasnik, Howard
1995 Case and expletives revisited: On Greed and other human failings. Linguistic Inquiry 26: 615–633.
Levin, Beth and Malka Rappaport-Hovav
1995 Unaccusativity: At the Syntax–Lexical Semantics Interface. Cambridge, MA: MIT Press.
Markov, Andrey A.
1954 Teoriya Algorifmov. Akademii Nauk SSSR: Moskva. [Theory of Algorithms. The Academy of Sciences of the USSR: Moscow. English translation by Jacques J. Schorr-Kon and PST Staff (1961), Theory of Algorithms. The Israel Program for Scientific Translations: Jerusalem.]
McGinnis, Martha
1998 Locality in A-movement. Doctoral dissertation, MIT, Cambridge, MA.
Nakamura, Masanori
1998 Global issues. Proceedings of the North East Linguistic Society 28 (1): 301–318.
Nevins, Andrew
2004 Derivations without the Activity Condition. In Perspectives on Phases: MIT Working Papers in Linguistics 49, Martha McGinnis and Norvin Richards (eds.), 287–310. Cambridge, MA: MITWPL.
Papadimitriou, Christos H.
1994 Computational Complexity. Reading, MA: Addison-Wesley.
Papadimitriou, Christos H. and Kenneth Steiglitz
1982 Combinatorial Optimization: Algorithms and Complexity. Englewood Cliffs, NJ: Prentice-Hall.
Poole, Geoffrey
1998 Constraints on local economy. In Is the Best Good Enough? Optimality and Competition in Syntax, Pilar Barbosa, Danny Fox, Paul Hagstrom, Martha McGinnis and David Pesetsky (eds.), 385–398. Cambridge, MA: MIT Press and MITWPL.
Post, Emil L.
1936 Finite combinatory processes – formulation 1. The Journal of Symbolic Logic 1: 103–105.
Seiferas, Joel I.
1990 Machine-independent complexity theory. In Handbook of Theoretical Computer Science A: Algorithms and Complexity, Jan van Leeuwen (ed.), 163–186. Amsterdam: Elsevier / Cambridge, MA: MIT Press.
Shima, Etsuro
2000 A preference for Move over Merge. Linguistic Inquiry 31: 375–385.
Toyoshima, Takashi
1999 Move 1st: A dynamic economy plan. In Proceedings of the North East Linguistic Society 29 (1): 409–425.
2000 Head-to-Spec movement and dynamic economy. Doctoral dissertation, Cornell University, Ithaca, NY.
2005 Preemptive move toward elimination of lexical subarray: Dynamic economy. Proceedings of the Israel Association for Theoretical Linguistics 21. (http://atar.mscc.huji.ac.il/~english/IATL/21/)
Turing, Alan M.
1937 On computable numbers, with an application to the Entscheidungsproblem. Proceedings of the London Mathematical Society, Second Series 42: 230–265.
Yang, Charles D.
1997 Minimal Computation: Derivation of the Syntactic Structure. M.Sc. thesis, MIT, Cambridge, MA.
The conceptual necessity of phases: Some remarks on the minimalist enterprise

Dennis Ott
1. Biolinguistics and Turing’s Thesis

I would like to discuss some very general issues pertaining to recent proposals in linguistic theory. In order to do this, it is necessary to review some fundamental aspects of the Minimalist Program (MP). This first section introduces the framework, emphasizing that the MP is not a theory of language but a bet on what kind of biological object I-language is. One particular theoretical reflex of this ontological (actually, metaphysical) commitment of the program – the notion of syntactic phases – will be investigated in section 2. Section 3 is a brief comment on some conceptual problems, and section 4 concludes.[1] The overall goal of this paper is very modest: to show that once certain plausible assumptions about the architecture of the Language Faculty are adopted, the notion of phase is a conceptual necessity.

[1] In what follows, MI = Chomsky (2000), DbP = Chomsky (2001), BEA = Chomsky (2004), OP = Chomsky (to appear), AUB = Chomsky (2007).

The question of what kind of object an I-language is is closely related to the evolutionary origin of natural language. Evidently, our tentative assumptions about the nature of the language organ depend to a significant extent on our equally tentative assumptions about the major factors that enter into its Entwicklungsgeschichte (evolution and development). As many authors have stressed, it seems reasonable to assume that an I-language provides an evolutionary advantage, and that a linguistic organism will be favored by selection. While this vague assumption appears relatively innocuous, it does not ipso facto provide any insight into the structure of I-language. The problem of morphogenesis, which arises in the study of the evolution of any biological object, is summarized by developmental biologist Gunther J. Eble, who notes that “form, and more generally, the phenotype, has always remained a fundamental problem demanding explanation in evolutionary biology. (…) While the structured entities of evolutionary dynamics certainly include genes, structure is not equivalent to genetic information; genes are necessary but not sufficient for the emergence of form” (Eble 2003: 34). Hence, “Darwin’s Dilemma” (Boeckx 2006a) remains in the age of the modern synthesis, calling for a general internalist research program in theoretical biology.[2] The MP is one such program, appealing to structural principles of evolution and development. These observations are not without precursors: the gist of Alan Turing’s seminal work on morphogenesis is that “the forms we observe in organisms are often determined more by physics than by selection,” in the words of mathematician-biologist Peter Saunders (1994: 373). This is what we may call, with Chomsky (2006), Turing’s Thesis. Discussing the role of developmental morphospaces, Eble (2003: 42) notes that “[p]hysical models of form have been shown to be very suggestive of actual principles involved [in defining morphospaces]. (…) To the extent that such physical principles are involved in development, they become part and parcel of the developmental matrix that is potentially involved in structuring evolution in morphospace.” Unless we adopt some methodological dualism, this should apply to I-language as it does to other organs of the body; natural law is a “Third Factor” that enters into its design, as does (presumably) genetic information.[3]

For the domain of I-language, Darwin’s Dilemma is aggravated by the findings of theoretical linguistics: the entities postulated by linguistic theory (say, c-command or the CED) are unlikely to be results of successive, gradual adaptation of the language organ to adaptive pressures. More plausibly, they are epiphenomenal effects of a mutation that had deep consequences:[4]

Many of the details of language (…) may represent by-products of [FLN, the Faculty of Language in the narrow sense], generated automatically by neural/computational constraints and the structure of [FLB, the Faculty of Language in the broad sense] (…). [The development of the Faculty of Language is] constrained by biophysical, developmental, and computational factors shared with other vertebrates. (…) [S]tructural details of FLN may result from such preexisting constraints, rather than from direct shaping by natural selection targeted specifically at communication. Insofar as this proves to be true, such structural details are not, strictly speaking, adaptations at all. (Hauser et al. 2002: 1574)

[2] For review see e.g. Webster & Goodwin (1996). See also Gould (2002) for an extensive outline of internalism in biology, reaching back to Goethe’s rationalist morphology.
[3] For general discussion of the Third Factor, see Chomsky (2005a,b).
[4] This conjecture is supported by the biological isolation of I-language, which is a species-specific human property, apparently without any relevant evolutionary precursors; notice e.g. Gould’s (1991: 59) assertion that “attributes unique to our species are likely to be exaptations,” hence not to be explained in terms of selection.
In general, the internal make-up of I-language does not appear to be specifically adjusted to communicative demands, and its essentials might to a significant extent be determined by the Third Factor (perhaps along with some minor co-adaptation of grammar and external systems, a possibility not contradicted by Hauser et al.’s suggestions).[5] The MP assumes Turing’s Thesis to be on the right track, for I-language and beyond. As Boeckx (2006b) argues at length (see also Chomsky 2002: ch. 4), this speculative metaphysical thesis has proven fruitful for natural science in its quest for explanation beyond description: phenomenology is a hopeless chaos, but this chaos is merely the distorted expression of a higher nature, where basic laws and principles interact. To understand the chaos as peripheral and our models of the higher nature behind the chaos as real is perhaps the most important aspect of the Galilean scientific revolution.

[5] For all we know, a human I-language might be the natural result of self-organizational principles of neural tissue, and the changes it underwent in the “great leap forward.” We can only speculate, but there is some suggestive work in neuroscience, most notably by Alessandro Treves, on the neural basis of infinite recursion (see Treves 2005).

What does this mean for the study of I-language? Adopting Turing’s Thesis, the MP is an attempt to show that I-language approaches some significant degree of structural optimality, understood in terms of naturalness: its internal mechanisms and operations are not arbitrary, but a least-effort solution to minimal design specifications.[6] Given that at least a subset of the body of knowledge that is given to the speaker by her I-language is usable, we can assume that (some of) the outputs of its computations are poised for access by external (performance) systems: to some extent, I-language meets interface conditions, sometimes termed “bare output conditions” (Chomsky 1995: 221; BEA: 106). If I-language is essentially an expression of the Third Factor, we expect it to meet interface conditions in a way that makes use of the minimal machinery necessary.[7] Assuming that the system interfaces (minimally) with sensorimotor systems on the one hand and systems of conceptual organization and intentionality on the other, this claim amounts to saying that I-language is a perfect solution to the task of linking these kinds of systems. The Strong Minimalist Thesis (SMT) in its most rigorous form maintains exactly this assumption:

(SMT) I-language comprises only the minimal mechanisms necessary to compute representations that can be accessed by the interfacing external systems.[8]

SMT is a guideline for theory construction, based on the bet that Turing captured some deep aspect of nature, and that I-language, due to its extraordinary evolutionary (non-)history, reflects this nature in a rather straightforward way. If SMT held fully, all principles of I-language would reduce to natural law or to interface conditions, hence hold in virtue of metaphysical necessity or the structure of external systems, not genetic encoding (that is, the principles would come “for free”). In this sense, the MP “approaches UG from below” (AUB) in that it attempts to shift the explanatory burden from the genetic endowment to natural principles of computation. A successful pursuit of the MP could eventually lend support to theories of I-language evolution along the lines of Hauser et al. (2002), making them more than a reasonable bet: “the less attributed to genetic information (in our case, the topic of UG) for determining the development of an organism, the more feasible the study of its evolution” (AUB: 3).

[6] For specific proposals along these lines, see Chomsky (1995 et seq.), Fox (2000), the papers in Epstein & Seely (2002), and much related work.
[7] Moreover, once an element in the design of I-language is identified as an interface requirement, it receives principled explanation: “Insofar as properties of [I-language] can be accounted for in terms of [interface conditions] and general properties of efficient computation and the like, they have a principled explanation: we will have validated the Galilean intuition of perfection of nature in this domain” (BEA: 2).
[8] For other formulations of SMT, see MI: 96, DbP: 1.

2. The necessity of phases

2.1. Unconstrained merge and “overgeneration”

As a first approximation, let us assume that I-language comprises a lexicon, a (narrowly) syntactic component, and (broadly) syntactic mapping components – “broadly” syntactic because they’re often labeled “semantics” and “phonology”, despite their presumably syntactic character. This architecture suffices in principle to generate an infinite number of (mind-internal) Expressions, in the technical sense yet to be made precise. Minimally, UG makes available an operation of set-formation or “Merge”, which creates a syntactic object: Merge(X,Y) = {X,Y}. Recursive Merge yields digital infinity. Another way of saying this is that UG equips lexical items (LIs) with Edge Features (EFs), in the terminology of OP. EF on LI will result in LI being merged to some other LI. Iterated Merge yields complex structures as well as “dislocation”; merging an object that is already present in the structure yields multiple occurrences of the object in the tree. Following BEA (p. 110), we can descriptively distinguish external and internal Merge: if an element is merged from the lexicon, it is merged externally; if it is part of the already-formed structure, it is merged internally. But notice that the distinction is purely descriptive; no operation of movement or copying must be stipulated in addition to Merge. Dislocation follows logically as a subcase of unconstrained Merge (see Berwick 1998 for a very early formulation of this insight). In addition to elementary tree construction by Merge, there seem to be grammatical operations triggered by formal features of the elements in the derivation (surfacing in inflectional morphology). If some element bears a formal feature that is not valued in the lexicon, it reflexively probes the structure (naturally, its c-command domain) in order to find some valued non-distinct feature to agree with, its goal. Simplifying massively, it seems that Agree(X,Y) may trigger IM of the goal, but need not, depending on the probe’s EPP/EF status.[9]

[9] For details, see MI, DbP, BEA, OP, AUB, and references therein.

Two mutually exclusive views of Merge have been advanced in the literature:

1. Merge is subject to Last Resort: each application of Merge(X,Y) must be licensed by a probe-goal relation between X and Y.
2. Merge applies freely (or, alternatively: EF on LI deletes freely).

Most of the literature effectively takes the first route, assuming some kind of featural relation to ensure “correct” selection at each Merge step (see, e.g., Svenonius 1994; Collins 2002; Frampton & Gutmann 2002; Adger 2003; Pesetsky & Torrego 2004), and at least in some cases arguing explicitly that this makes syntax “crash-proof.” Chomsky has argued for the second view (see, in particular, BEA: 111f.). He argues that we can dispense with both c-selection (Pesetsky 1982) and s-selection as properties of syntax. Rather, selectional properties are “read off” the relevant heads at the semantic interface. This view implies that syntax is not crash-proof, and that Merge will freely generate an infinitude of structures that violate selectional (and other) requirements. At the interface, the interpretation of a head in a particular configuration will induce deviance if selectional properties are violated. Take, for instance, UTAH, as discussed by Baker (1997). If Baker is on the right track, his proposals raise the question whether UTAH is built into Merge, conditioning its application, or whether it is imposed by C-I systems. The minimalist working assumption (and Baker’s) is that UTAH is a descriptive statement of C-I interpretation principles, while the (undesirable) alternative would be to enrich narrow syntax accordingly. Notice that this would conflict with the “from below” approach of the MP, which seeks to minimize the amount of syntax-specific mechanisms.

Free application of Merge entails the existence of an infinite array of structures that do not conform to externally imposed interface conditions (IC). In particular, “[elimination of s-selection] entails that derivations cannot be failure-proof (“crash-free”), thus undermining the strong version of IC (…). IC must be weakened” (BEA: 9f.), allowing for degrees of deviance. It seems to me that only this second view of Merge is adequate, given the flexibility of interpretation and the gradedness of deviance. Consider deviant cases like “John slept Mary”, “John kicked”, or “John ran Mary blue”, all of which are assigned some interpretation (and the same is true for many other violations in various domains; actual cases of true unintelligibility are hard to make up). As Chomsky (p.c.) puts the matter, “it’s important to remember that we do want to ensure that ‘deviant’ expressions are generated, even word salad. The I-language assigns it an interpretation, reflexively, with only limited variability” (my emphasis). In fact, the argument recapitulates a familiar truism, going back to Chomsky (1955), that natural language has no notion of well-formedness or “grammaticality” – a striking contrast with formal languages. Rephrasing this insight in modern terminology, Chomsky (p.c.) suggests that conditions on the application of Merge are actually empirically inadequate, and that a more plausible assumption is that degrees of deviance are interface effects (see also OP: 10, BEA: 3):

We could add further conditions [on Merge] to block ‘overgeneration’ (e.g., no more SPECs after theta roles have been exhausted, keep to selectional requirements, etc.). But that seems to me wrong. First, such conditions duplicate what’s done independently at the C-I interface. Second, it imposes some ±grammatical distinction, and since LSLT in 1955, that’s seemed wrong to me.
I will assume that this view is correct, and that subjecting Merge to Last Resort is an empirically and conceptually dubious move. It is unclear how a crash-proof syntax could yield a substantial part of the linguistic knowledge that speakers undoubtedly have, and how the gradedness of deviance could arise. We thus shift the explanatory burden to interface conditions, while syntax is free to generate an infinite array of structures. It is plausible to assume that conceptual-intentional (C-I) and sensorimotor (SM) systems impose quite different conditions; call the syntactic objects mapped onto these interfaces SEM and PHON, respectively. Assume that in analogy to PHON, we can view SEM essentially as an instruction to construct a complex concept (see Pietroski to appear, forthcoming). The conceptual system will naturally impose conditions on what counts as a proper instruction; only syntactic objects of a certain kind will be able to interface with “thought” (whatever the relevant properties are). These syntactic objects are what Chomsky calls “phases,” and the empirical project is to determine what makes a syntactic object a phase.
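Before turning to phases, it may help to make the earlier characterization of Merge concrete. The sketch below is an informal model only: frozensets stand in for syntactic objects, and the helper names are invented for exposition. It shows that “internal Merge” requires no added copying operation, since remerging a term of the current object automatically yields two occurrences.

```python
# Informal model of Merge as binary set formation.

def merge(x, y):
    """Unconstrained Merge: Merge(X, Y) = {X, Y}."""
    return frozenset({x, y})

def occurrences(obj, target):
    """Count the positions at which `target` occurs inside `obj`."""
    n = 1 if obj == target else 0
    if isinstance(obj, frozenset):
        n += sum(occurrences(part, target) for part in obj)
    return n

# External Merge: both inputs are new to the structure.
vp = merge("read", "what")      # {read, what}
cp = merge("C", vp)             # {C, {read, what}}

# "Internal Merge": one input is already a term of the other. No copying
# operation is invoked; remerge alone yields two occurrences ("copies").
moved = merge("what", cp)       # {what, {C, {read, what}}}

print(occurrences(cp, "what"))     # 1
print(occurrences(moved, "what"))  # 2 -- base position and edge
```

Nothing in merge itself blocks the remerge step; that is exactly the sense in which dislocation comes “for free” under unconstrained Merge.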
2.2. Propositional cycles

By definition, phases are those syntactic objects that can interface with “thought.” At the same time, the phase defines the syntactic cycle, a traditional notion in linguistic theory (see Lasnik 2006 for review). C-I interface conditions thus determine that the syntactic cycle must be “meaningful” (BEA: 108), corresponding to a proper instruction for concept construction. The question, then, is: What makes a syntactic object “meaningful” in the relevant sense?[10]

[10] Chomsky’s original formulation of the phase was mainly extensional: in MI and DbP, a phase is defined relative to a lexical subarray (LA). The LA, a collection of items from the lexicon, is exhausted by the computational system; the resulting structure (the phase) is transferred, and the next subarray is used. The argument for subarrays was mainly based on “Merge over Move” effects, an incoherent notion once the assumptions about internal/external Merge in BEA and later work are adopted. I will not discuss this view any further.

Chomsky suggests that phasal syntactic objects are defined as such by their “propositionality”:

Ideally, phases should have a natural characterization in terms of [interface conditions, IC]: they should be semantically and phonologically coherent and independent. At SEM, v*P and CP (but not TP) are propositional constructions: v*P has full argument structure, CP is the minimal construction that includes Tense and event structure [note omitted, DO], and (at the matrix at least) force. (BEA: 124; cf. also MI: 106, DbP: 12)
Let us summarize this statement as a first approximation:

1. CP is a phase because it specifies force/full event structure.
2. v*P is a phase because it represents full argument structure.

CP and v*P thus correspond to what Chomsky calls the “duality of semantics”: local predicate-argument structure, and derived positions, which both enter into the semantic interpretation of a full expression. The particular phases we find in human syntax are thus not a matter of necessity; if the C-I system were structured differently, different structures would be ‘picked out’ for concept construction. It is perfectly coherent to imagine a C-I system that can only deal with local theta-structure configurations but has no use for CP-like structures. The structure of Expressions is thus not determined by narrow syntax, but by C-I properties.[11]

[11] Hinzen’s (2006) program is exactly the opposite of the view described here. Hinzen argues against independent C-I constraints, arguing that all semantic properties are fully determined by intrinsic properties of syntax.

Given the structural parallelisms of the verbal and nominal domains (cf. Chomsky 1970; Abney 1987, and much subsequent work), we can speculate that certain nominal structures will also meet Chomsky’s criterion of propositionality. Longobardi (1994, 2005) provides substantial evidence that there are at least two fundamentally different kinds of nominal phrases: referential, definite DP (“the picture of John”) and predicative, indefinite NP (“a picture of John”). Longobardi shows that only DP can be an argument, with either a definite determiner or the head noun raising covertly to D. Assuming that intentional systems allow the speaker to use certain linguistic expressions to refer to non-linguistic entities, Longobardi’s work can be interpreted as showing that it is D that encodes referentiality of DP, making it a “special” unit at the interface, hence that DP (but not NP) is a phase (cf. AUB: 25f.). Let us thus add a third hypothesis:

3. DP is a phase because it is referential.
The identification of CP, v*P and DP as phases is plausible, given the duality of semantics plus referential use of nominal expressions. Needless to say, hypotheses 1–3 require much clarification and empirical justification. However, notice that some notion of phase is conceptually necessary in any syntactic theory, in particular when Merge is assumed to be unconstrained. There must be some way of “picking out” those special structures that can instruct C-I systems; there is nothing in the structures themselves that marks them propositional (contra Hinzen 2006). In this sense, any generative model has to give substance to the notion of a “meaningful” cycle; the relevant C-I constraints being of course a matter of empirical discovery. Hence, it is of central importance to linguistic theory to develop a proper technical notion of propositionality that applies to those structures that instruct conceptual systems. Earlier theories took for granted that the basic unit of computation is essentially the clause (CP), i.e. that there is only a single phase. Phase theory denies that the clause is the only relevantly “special” structure: parallel propositional structures are expected in the verbal (v*P) and nominal (DP) domains as well. If true, the phases presumably reflect deep aspects of the C-I systems, perhaps divided along the following lines:

– Information/discourse structure: CP
– Thematic structure: v*P
– Referentiality: DP
Notice that the multiple-phase approach pushes many global properties of (semantic and phonological) Expressions outside of grammar proper. Instances of long-distance binding and global aspects of interpretation beyond local compositionality (discussed in Chomsky 1977) are now a matter of computation beyond the interface. I-language provides an infinite array of propositional Expressions (clausal, verbal, and nominal), which are apt for concept construction. But the C-I system also interprets “full expressions” (something like sentences) that comprise several propositional units. Likewise, many prosodic and intonational properties are presumably computed over cumulative phonetic representations in the SM system.[12] Hence, much more computational work is ascribed to external systems in assembling the “full expression” that the speaker actually interprets in language use. In contrast to earlier models of locality (see Boeckx & Grohmann 2007 for review), phases are both propositional units of interpretation and derivational units. Hence, the phasal cycle should have syntactic repercussions.

[12] Notice in this regard that the basic intonation pattern of a sentence is at least in part determined by C, the functional head specifying force. However, C is fed into the phonological component only at the very end of the C phase; therefore, the computation of an intonation matrix requires a full expression, hence must take place outside of grammar.
2.3. Derivational cyclicity

Notice that the phase-based model is a significant departure from the standard Y-model of syntax (Chomsky & Lasnik 1993). The previous independent cycles of X-bar theory, D-structure, S-structure and LF (as the output of narrow syntax and the input to covert operations) are reduced to a single cycle that maps a set of lexical items to the semantic interface, mapping to SM being an ancillary process (see DbP: 5, 15). Assuming that phases define syntactic cycles as well as units of interpretation, the size of the cycle is expected to restrict the application of operations. Chomsky attempts to relate the notions of phasal derivation and minimal search space by assuming that phases are in some sense impenetrable for later operations. As Müller (2003: 6) phrases this desideratum, “minimality effects should emerge as epiphenomena of constraints that reduce the space in which the derivation can look for items that may participate in an operation (…).” The goal is to reduce phenomena of locality to the necessary conditions on conceptual instructions. Concretely, Chomsky assumes that once a phase is completed, its domain (the complement of the phase head) is transferred:

(1) Phase Impenetrability Condition, version 1 (PIC1)
    In phase HP with head H, the domain of H is not accessible to operations outside HP; only H and its edge are accessible to such operations. (Cf. MI: 108)

Notice that the PIC as such does not state whether it holds for narrow syntax, for the mapping components, or both. Since the options are logically distinct, the domain of application of PIC is an empirical matter. Assume first that narrow syntax forgets transferred phases. As an illustration, consider the reduction of T’s search space under PIC1 (the transferred portion is the bracketed [VP … V … ]):[13]

[13] The notation is borrowed from Richards (2006), who in turn relies on Müller (2003).

(2) T [v*P … v* … [VP … V … ]]
As soon as the v*P phase is completed, VP is transferred, hence no longer accessible to T. Notice that, as the above condition states, it is not exactly the phase itself that gets transferred; rather, the phase head’s complement domain is transferred. Successive-cyclic movement is a necessary consequence of (1), forcing elements to move through intermediate phase edges on their way to the final landing site. In addition to familiar empirical evidence for successive-cyclic movement (see Boeckx 2007 for extensive review), Fox (2000) and others have specifically argued for reconstruction effects at the v*P edge:

(3) a. [Which of the papers that hei wrote for Ms. Brownk] did every studenti get herk to grade twh
    b. *?Every student got herk to grade the first paper that he wrote for Ms. Brownk

The binding relations indicated in (3a) require a copy of the wh-phrase in a lower position. (3b) shows that the relevant reconstruction site cannot be the base position, which yields a Principle C violation. Thus, the reconstruction site must be in between, leaving SPEC-v* as the most plausible option:

(3) a'. [TP every studenti did [v*P [which of the papers that hei wrote for Ms. Brownk] [v*P get herk to grade twh]]]

A’-reconstruction thus provides some evidence for the special status of the v*P edge, as expected if v*P is a phase. Notice that Chomsky explicitly excludes defective v (passive and unaccusative) from his typology of phase heads. His motivation for restricting phasal status to v*P is that subextraction out of an agentive subject yields stronger deviance than subextraction from a passive subject (see OP: 14 for discussion). However, Legate (2003) casts doubt on Chomsky’s identification of v*P as the only relevant verbal structure. While confirming the phasal status of v*P (and CP), Legate argues on the basis of reconstruction effects, parasitic gaps and prosodic data that passive and unaccusative vP is equally phasal, exhibiting the same degree of ‘interface isolability.’[14] As an illustration, consider the pair in (4). In both the active (4a) and the passive (4b) case, the bracketed wh-phrase reconstructs below every student/every man, i.e. in the edge of the verbal phrase (marked “t’wh”), as indicated by binding relations.[15]

[14] As a result of Legate’s caveat, Chomsky (in DbP) introduces the distinction of strong vs. weak phases. The distinction is entirely obscure: while abiding by his original conception in referring to CP and v*P as strong phases, Chomsky acknowledges the phasal character of the relevant VPs, declaring them weak phases. It remains unclear what the distinction, stipulated without further argument, actually expresses, since there seems to be no relevant difference between weak phases and non-phasal elements. It was consequently abandoned by Chomsky in class lectures (MIT, summer 2005) as well as in OP, AUB.
[15] As in the previously discussed cases, reconstruction in a lower position would presumably lead to a Principle C violation.

(4) a. [Which of the papers that hei gave Maryk] did every studenti [v*P t’wh ask herk to read twh carefully]
    b. [At which of the parties that hei invited Maryk to] was every mani [vP t’wh introduced to herk twh]

Diagnostics like this suggest that the edge of passive vP serves as an intermediate landing site for movement, indicating that it is actually phasal (see Legate’s paper for discussion).[16] In addition, Sauerland (2003) has provided compelling evidence that A-movement, too, can reconstruct in SPEC-v:

(5) [Every child]i doesn’t [vP ti seem to hisi father to be smart]

[16] But see Den Dikken (2006) for a thorough evaluation of Legate’s arguments.

Assuming that the quantified phrase reconstructs below negation, this yields the “not every” reading of the sentence. It is thus still an open question whether Chomsky’s v*/v distinction in terms of phase heads is adequate; at least with regard to edge effects, v* and v appear to be on a par.[17]

[17] For some inconclusive evidence concerning the DP edge, see Matushansky (2005) and Heck & Zimmermann (2004). I’ll set the issue aside in what follows.

Moreover, stated as in (1), PIC faces empirical challenges. Notice that according to PIC1, T in particular is unable to probe into VP. However, this incorrectly rules out instances of Agree(T, DP), as in English existential constructions or Icelandic DAT-NOM structures, where nominative is valued on DP by Agree(T, DP), with DP in situ (= in VP). In particular, in Icelandic a finite verb or auxiliary outside v*P can agree with a nominative DP inside VP across a transitive (hence, phasal) v*.[18]

[18] The example is from Jonsson (1996: 157). Similar instances of T probing into VP and valuing nominative Case on DP in situ can be found in German, as discussed by Wurmbrand (2006). DP may not actually be in situ, strictly speaking: if feature inheritance (see below) extends to the v*-V complex, then v* (via V) may raise the object to SPEC-V. But even in this case, it would still be within the lower phase, hence inaccessible to T.

(6) Joni virdhast hafa likadh thessir sokkar
    Jon.DATi seem.PL [TP have [v*P ti like [VP these.NOM socks.NOM]]]
    ‘Jon seems to have liked these socks’

T apparently probes across the phase boundary, hence v*P is apparently not impenetrable in the sense of PIC1.[19] There is also (less clear) evidence for agreement across the CP-phase boundary. In Tsez, some verbs that take clausal complements agree optionally with an argument embedded in the complement clause (see Polinsky & Potsdam 2001: 606). Assuming, as seems plausible, that these clausal complements are CPs, the data provide some further evidence against CP being impenetrable to agreement:

[19] However, Svenonius (2004) notes that Icelandic long-distance agreement is limited to Number agreement, whereas more local instances of Agree always involve Person.

(7) enir uza magalu bac’ruli biyxo
    mother.DAT [CP boy.ERG bread.III.ABS III.eat] III.know
    ‘The mother knows the boy ate the bread’

In DbP, the PIC was revised accordingly: a phase domain is transferred only when the next higher phase head is merged. That is, VP remains within T’s search space, but is transferred upon merger of C.

(8) Phase Impenetrability Condition, version 2 (PIC2)
    For a phase HP with phase head H, the domain/complement of H is not accessible to operations at the next-higher phase ZP; only H and its edge are accessible to such operations, the edge being the residue outside of H’ (specifiers of H or adjuncts to HP). (Cf. DbP: 13)

Under this conception, H and its complements are accessible until the next-higher phase head is merged (but no longer). Consider again the example of T’s search space under the revised PIC:

(9) T [v*P … v* … [VP … V … ]]
    C [TP T [v*P … v* … [VP … V … ]]]

Unlike under PIC1, the domain of the lower phase is fully accessible to T (in case no higher element intervenes), for it is only transferred when the next-higher phase head (C, in this case) is merged. Thus under PIC1, VP is accessible to v*, but not to any higher elements; under PIC2, it remains accessible to T, while C triggers its Transfer. In the light of problematic cases like those discussed above, this formulation of Phase Impenetrability appears to be empirically preferable to the earlier definition.
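The difference between the two formulations can be made concrete with a toy accessibility check over the configuration in (9). The sketch below is schematic and mine alone: a linear list of heads stands in for the tree, the head itself represents “H and its edge,” and specifiers and intervention are ignored.

```python
# Toy comparison of PIC1 and PIC2 for: C [TP T [v*P v* [VP V DP]]].
structure = ["DP", "V", "v*", "T", "C"]      # bottom-up order
phase_heads = {"v*", "C"}

def accessible(probe, version):
    """Positions below `probe` that it may still search."""
    p = structure.index(probe)
    visible = []
    for i in range(p):
        # phase heads strictly between the candidate and the probe ...
        between = [j for j in range(i + 1, p) if structure[j] in phase_heads]
        # ... plus the probe itself, if it is a phase head (PIC2 sealing)
        sealers = between + ([p] if structure[p] in phase_heads else [])
        if version == "PIC1":
            blocked = len(between) >= 1      # sealed at the phase itself
        else:  # PIC2: sealed only once a second, higher phase head merges
            blocked = len(between) >= 1 and len(sealers) >= 2
        if not blocked:
            visible.append(structure[i])
    return visible

print("PIC1, probe T:", accessible("T", "PIC1"))   # ['v*'] -- VP sealed
print("PIC2, probe T:", accessible("T", "PIC2"))   # ['DP', 'V', 'v*']
print("PIC2, probe C:", accessible("C", "PIC2"))   # ['v*', 'T'] -- VP sealed
```

On this toy model, T sees into VP only under PIC2, while C is barred from VP under both versions, matching the summary in the text.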
2.4. Feature Inheritance

Chomsky (OP, AUB) argues that the relation between the phase heads C and v and their non-phasal complements is characterized by a mechanism of feature inheritance: T and V receive the uninterpretable features of C and v, respectively, at the point when the latter are merged.[20] In this way, Chomsky attempts to capture the rather traditional observation that both T and V appear to be more complete (perhaps, phi-complete) when selected by C and v*, respectively, but defective in other contexts (in particular, ECM); see Fortuny (2006: section 3.1) for discussion. It follows that only phase heads are probes in a direct sense; if T or V probe, they do so only indirectly, by virtue of the features inherited from the phase head. For example, the raising of a goal XP to SPEC-T is the result of C’s features probing down the tree, but these features appear on T (not C), after inheritance. The goal’s unvalued features are valued after raising to SPEC-T, hence movement stops there (in accord with the Activity Condition, which says that elements with all features valued are inert for further operations[21]) and does not proceed further to SPEC-C. T on its own is unable to probe; hence, structures that lack C (ECM/raising) also lack T-agreement effects (nominative Case on DP). If C is present, its EF and agreement features (the latter realized on T) probe simultaneously. Chomsky’s account raises the question whether feature inheritance is an ad-hoc solution or a conceptually necessary design feature. That is, why can the features on phase head P not remain on P, but instead must be passed on to the complement head H?[22]

[20] In AUB, Chomsky extends the reasoning to definite DP (n*P in his terminology), effectively declaring it a phase.
[21] See BEA, OP; also Nevins (2005) for an opposing view on activity as a precondition for movement.
[22] As Richards (2007) points out, feature inheritance cannot be a C-I requirement to encode the A/A’ distinction properly. (This is the rationale offered by Chomsky in OP: EF of P yields IM to A’ (= SPEC-P); EF of H yields IM or EM to A (= SPEC-H).) Assume that instead of T, C is merged to vP. C has, in this model, all the features familiar from T, and hence the A/A’ distinction could simply be encoded via first and second merge to C (multiple SPECs, in traditional terminology), without feature inheritance. Moreover, as Richards notes, the rationale only pertains to C-T, not to the v*-V complex. But, clearly, any attempt to trace feature inheritance to SMT must not discriminate between the different phasal complexes.

Feature inheritance solves a problem concerning the valuation of uninterpretable features and the timing of Transfer, pointed out by Epstein & Seely (2002b). After valuation, there is no way for the system to tell which features had originally entered the derivation unvalued: once valued, all features ‘look the same,’ namely valued. This means that there is no way for the system to tease apart interpretable and uninterpretable features, hence to feed the respective interface with those and only those features relevant to it (Full Interpretation; Chomsky 1986). The only way out seems to be back-tracking (reconstruction of the derivation), but this is clearly undesirable, as it dramatically adds to computational complexity. The solution proposed by Chomsky (OP, AUB) and discussed in detail by Richards (2007) is that valuation of unvalued features and Transfer coincide: features are not valued before Transfer, but at Transfer (that is, at the phase level). But this requires that the phase-head complements inside the phase domain not value any features prior to the phase level (viz., merger of the phase head). Hence, if valuation applies at the phase level, it must be the case that all instances of valuation are triggered by the phase heads. But if the unvalued features are on the non-phasal complement head H from the start, it would be implausible to claim that H delays probing until the phase head comes in (as we would be led to claim if valuation and Transfer were simultaneous). It is far more natural to assume that the features are not there before the phase head comes in: only if a non-phasal head mediates between probe and goal can valuation occur at the point of Transfer. From this point of view, the phase head’s “passing on” its features indeed follows from good-design considerations: if valuation and Transfer must be one and the same operation for reasons of Full Interpretation, then SMT requires unvalued features to exist in phase heads only and be inherited by the complement head, which in turn must exist as a “receptacle” that receives these features in order to avoid timing problems.[23]

[23] The above considerations not only provide a rationale for certain elements in terms of SMT, but also for a particular arrangement of these elements that has so far simply been taken for granted. In the tree, each phase head P must dominate a receptacle head H: P … H … P … H … The basic structure of the phase (phase head, non-phase head) falls out immediately. This kind of structure is therefore a natural product of SMT: if we are on the right track so far, this is the clausal skeleton by the “perfect” nature of the system. Richards (2007) notes that this line of reasoning “might constrain the possible expansions of the core functional sequence into more richly articulated hierarchies” (Rizzi, Cinque). I will not dwell on the issue here; see Boeckx (to appear), Hinzen & Uriagereka (2006), Fortuny (2006) for some related proposals.
What this means in more concrete terms is that when T inherits C’s phi-features, the subject in SPEC-v* is probed by both EF on C and phi-features on T. Agree(T,XP) assigns Case to XP and triggers raising to SPEC-T. To avoid countercyclicity, Chomsky proposes that all these operations apply simultaneously when the C-T complex is merged. Similarly, V inherits the phi-features of v*, raising the object XP to SPEC-V (V-to-v* raising restores word order). And some similar mechanism should exist in the nominal domain as well. Notice that the feature-inheritance variety of phase theory has an immediate empirical consequence if we assume it to hold for narrow syntax: it restores the original version of PIC (1) and is incompatible with the revised version (8). Hence, it does not straightforwardly capture those cases discussed above that led to the revision, on the assumption that it constrains agreement. If feature inheritance takes place in order to ensure valuation at Transfer, then Transfer of the complement must apply when the phase head is merged, and cannot be delayed until the next-higher phase (contrary to PIC2). Since feature inheritance seems to be supported on independent grounds, this invites the conclusion that PIC holds only for the mapping components, not for narrow syntax, considerably reducing computational load. This entails that Agree (as a narrow-syntactic operation) is not constrained by phase boundaries but only by intervention (cf. OP: 9), the data discussed in section 2.3 being instances of non-intervention. (See also Bošković 2007 for a similar conclusion.)
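The timing problem, and the sense in which valuation-at-Transfer solves it, can be stated in a few lines of toy code. All names below are mine, and the model is purely expository, not an implementation of the theory.

```python
# Toy illustration of the Epstein & Seely timing problem.

class Feature:
    def __init__(self, name, value=None):
        self.name = name
        self.value = value          # None: entered the derivation unvalued

# The problem: if Agree values a feature *before* Transfer, the record
# of its having been unvalued (hence uninterpretable) is lost.
u_phi = Feature("person")           # uninterpretable phi, e.g. on T
i_phi = Feature("person", 3)        # interpretable phi on the goal
u_phi.value = i_phi.value           # valuation by Agree
# Now u_phi and i_phi are indistinguishable without back-tracking.

# The solution: value and strip in one step, so the valued/unvalued
# distinction is still visible exactly when SEM-bound features are chosen.
def transfer(features, goal):
    """Value unvalued features and strip them in the same step."""
    to_sem = []
    for f in features:
        if f.value is None:         # still unvalued here => uninterpretable
            f.value = goal.value    # valued at Transfer, withheld from SEM
        else:
            to_sem.append(f)        # lexically valued => interpretable
    return to_sem

sem = transfer([Feature("person"), Feature("person", 3)], Feature("person", 3))
print([(f.name, f.value) for f in sem])   # only the interpretable one survives
```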
3. A Note on “I-functional Explanation”

It has been pointed out (see Boeckx & Grohmann 2007 and references therein) that phase theory suffers from serious conceptual flaws. In particular, it is unclear why there is a discrepancy between the phases and the domains of Transfer, the phase-head complements. If CP, v*P, and DP are propositional objects, then it seems implausible to assume that what actually reaches the interface is not the phase, but only part of it (TP, VP, and NP, respectively – but set DP aside for now). Epstein (2007b) notes that “VP and TP are claimed to have NONE of the interface properties (…) that are alleged to make vP and CP natural candidates for phasehood.” Prima facie, Transfer of CP and v*P appears to be the optimal way, given their interface properties. But if CP and v*P were transferred, CP would not exist, because no derivation could go on beyond the level of the v*P phase. Epstein thus provides an I-functionalist rationale for the phase edge, echoed by Richards (2006: 9): “The disparity between CP/vP and TP/VP (…) follows as a way to ensure that Full Interpretation can be met – only a subpart of a phase can possibly be spelled out if language is to conform to SMT (…).” He goes on to argue that “no integrated, compositional, or convergent structure could be formed without some notion of phase edge (…), so that the latter is a requirement of good (let alone optimal) design.”[24]

Epstein (2007a,b) explicitly argues for the general validity of “I-functionalist” or “inter-organ explanation” in this sense: without property X, the system would not work, would not make use of all of its expressive potential, or would generate only deviant expressions; hence, X is a necessary design feature. This line of reasoning, however, is dubious with regard to explanatory adequacy. As Hinzen (2006: 213) points out, I-functional proposals have a “‘teleological’ flavor” to them: if property X is taken to exist because otherwise derivations would “crash”, one can ask: “why should it not crash? And how can the derivation, prior to reaching the putative interface, know what demands will be made there, and arrange itself accordingly?”[25]
24 Similarly, Chomsky reasons: “applied to a phase PH, [Spell-Out] must be able to spell-out PH in full, or root clauses would never be spelled out [note omitted]. But we know that S-O cannot be required to spell-out PH in full, or displacement would never be possible” (BEA: 108).
25 Hinzen illustrates the point with the case of dislocation. Prior to its reduction to Merge, externalist explanations were sought for the system’s property of displacement on principled grounds: ‘movement is made for requirement X of an external system.’ This, indeed, has an obvious teleological flavor to it. Chomsky’s more recent explanation of displacement as merely a consequence of unbounded Merge, in contrast, is fully internalist, relying only on ‘design’ factors internal to the narrow Language Faculty.
I agree with Hinzen: I-functional explanations of syntactic mechanisms are as empty as the engineering solutions that characterized a good deal of earlier P&P theorizing. This is true in particular of explanations that appeal (explicitly or implicitly) to “richness of expression”; it is the system’s expressiveness itself that has to be explained.
4. Conclusions
Let me briefly summarize the main points of this paper:
1. Free generation (unconstrained Merge) is empirically preferable to Last Resort conditions on Merge. Once the inadequate ±grammatical distinction is abandoned, syntax cannot be “crash-proof” as a matter of empirical fact.
2. If syntax is “unprincipled” in this sense, it must be C-I constraints that determine which structures have the properties necessary for concept construction (“propositionality”). Those structures that can interface with “thought” are the phases.
3. Plausibly, “propositionality” is (at least) threefold: information/discourse semantics, thematic structure, and referentiality. The first two properties reflect the “duality of semantics”: surface vs. base properties that enter into interpretation; “referentiality” allows intentional (world-directed) language use.
4. The corresponding syntactic structures are CP, v*P (perhaps also vP), and DP (but not NP). C specifies force and comprises full event structure. v* contains information about thematic structure (the status of defective v remains unclear). The D position appears to be related to referentiality, and to the argumenthood of nominal phrases.
5. The main empirical project is the identification of the C-I properties that give flesh to the notion of “propositionality”. Moreover, many empirical questions concerning the syntactic repercussions of the phasal cycle (edge effects, feature inheritance) remain open. It seems questionable that PIC holds for narrow syntax.
6. I-functional explanation is not valid, for “richness of expression” is itself a property to be explained.
It remains to be seen whether any of the assumptions sketched here will resist empirical challenges. While I have not even attempted to advance the empirical debate concerning phase theory, my main goal in this paper was
to show that the notion of phase (as a proper format for instructions to C-I systems) is conceptually necessary, at least if current assumptions about the architecture of the broad Faculty of Language are roughly on the right track.
Acknowledgements
I am indebted to Marc Richards, Cedric Boeckx, Wolfram Hinzen, Volker Struckmeier, Noam Chomsky, and Juan Uriagereka for valuable comments. All mistakes and misinterpretations are mine.
References
Adger, David
2003 Core Syntax. New York: Oxford University Press.
Baker, Mark C.
1997 Thematic roles and syntactic structure. In Elements of Grammar, Liliane Haegeman (ed.), 73–137. Dordrecht: Kluwer.
2001 The Atoms of Language. New York: Basic Books.
Berwick, Robert C.
1998 Language evolution and the minimalist program. In Approaches to the Evolution of Language, James R. Hurford, Michael Studdert-Kennedy and Chris Knight (eds.), 320–340. Cambridge: Cambridge University Press.
Boeckx, Cedric
2003 Islands and Chains: Resumption as Stranding. Amsterdam: John Benjamins.
2006a Darwin’s problem – or how did language get its spine, its spots, and its niche? Paper presented at Of Minds and Language, San Sebastian, Basque Country.
2006b Linguistic Minimalism. New York: Oxford University Press.
2007 Understanding Minimalist Syntax: Lessons From Locality. Oxford: Blackwell.
2008 Bare Syntax. New York: Oxford University Press.
Boeckx, Cedric and Kleanthes K. Grohmann
2007 Remark: Putting phases in perspective. Syntax 10 (2): 204–222.
Bošković, Željko
2007 Agree, phases, and intervention effects. Linguistic Analysis 33: 54–96.
Cherniak, Christopher
2005 Innateness and brain-wiring optimization: Non-genomic nativism. In Evolution, Rationality and Cognition, A. Zilhão (ed.), 103–112. London: Routledge.
Chomsky, Noam
1955 The Logical Structure of Linguistic Theory. Mimeographed, Harvard University/MIT.
1977 Essays on Form and Interpretation. New York: North-Holland.
1986 Knowledge of Language. New York: Praeger.
1995 The Minimalist Program. Cambridge, MA: MIT Press.
2000 Minimalist inquiries. In Step by Step: Essays on Minimalist Syntax in Honor of Howard Lasnik, Roger Martin, David Michaels and Juan Uriagereka (eds.), 89–156. Cambridge, MA: MIT Press.
2001 Derivation by phase. In Ken Hale: A Life in Language, Michael J. Kenstowicz (ed.), 1–52. Cambridge, MA: MIT Press.
2002 On Nature and Language. Cambridge: Cambridge University Press.
2004 Beyond explanatory adequacy. In Structures and Beyond, Adriana Belletti (ed.), 104–131. New York: Oxford University Press.
2005a Three factors in language design. Linguistic Inquiry 36 (1): 1–22.
2005b Some simple Evo-Devo theses: How true might they be for language? Ms., MIT.
2006 Turing’s Thesis. Talk delivered at InterPhases, Nicosia, Cyprus.
2007 Approaching UG from below. In Interfaces + Recursion = Language? Chomsky’s Minimalism and the View from Syntax-Semantics, U. Sauerland and H.-M. Gärtner (eds.), 1–30. Berlin/New York: Mouton de Gruyter.
2008 On phases. In Foundational Issues in Linguistic Theory: Essays in Honor of Jean-Roger Vergnaud, Robert Freidin, Carlos P. Otero and Maria Luisa Zubizarreta (eds.), 133–166. Cambridge, MA: MIT Press.
Chomsky, Noam and Howard Lasnik
1993 The theory of principles and parameters. In Syntax: An International Handbook of Contemporary Research, Vol. 1, Joachim Jacobs, Arnim von Stechow, Wolfgang Sternefeld, and Theo Vennemann (eds.). Berlin/New York: de Gruyter. (Reprinted in Chomsky 1995.)
Collins, Chris
2002 Eliminating labels. In Epstein and Seely (2002a), 42–64.
Den Dikken, Marcel
2006 A reappraisal of vP being phasal: A reply to Legate. Ms., CUNY.
Eble, Gunter J.
2003 Developmental morphospaces and evolution. In Evolutionary Dynamics, James P. Crutchfield and Peter Schuster (eds.), 35–65. New York: Oxford University Press.
Epstein, Samuel D.
2007a Physiological linguistics, and some implications regarding disciplinary autonomy and unification. Mind & Language 22: 44–67.
2007b On I(nternalist)-Functional Explanation in Minimalism. Linguistic Analysis 33: 20–53.
Epstein, Samuel D. and T. Daniel Seely (eds.)
2002a Derivation and Explanation in the Minimalist Program. Oxford: Blackwell.
Epstein, Samuel D. and T. Daniel Seely
2002b Rule applications as cycles in a level-free syntax. In Epstein and Seely (2002a), 65–89.
Fortuny, Jordi
2006 The emergence of order in syntax. Doctoral Dissertation, University of Barcelona.
Fox, Danny
2000 Economy and Semantic Interpretation. Cambridge, MA: MIT Press.
Frampton, John and Sam Gutmann
2002 Crash-proof syntax. In Epstein and Seely (2002a), 90–105.
Gould, Stephen J.
1991 Exaptation: A crucial tool for evolutionary psychology. Journal of Social Issues 47: 43–65.
2002 The Structure of Evolutionary Theory. Cambridge, MA: Harvard University Press/Belknap Press.
Hauser, Marc D., Noam Chomsky, and W. Tecumseh Fitch
2002 The faculty of language: What is it, who has it, and how did it evolve? Science 298: 1569–1579.
Heck, Fabian and Malte Zimmermann
2004 Phasenentwicklung in der Entwicklungsphase: Randphänomene der DP. Paper presented at GGS, Mannheim.
Hinzen, Wolfram
2006 Mind Design and Minimal Syntax. Oxford: Oxford University Press.
Hinzen, Wolfram and Juan Uriagereka
2006 On the metaphysics of linguistics. Erkenntnis 65 (1): 71–96.
Jónsson, Jóhannes Gísli
1996 Clausal Architecture and Case in Icelandic. Doctoral Dissertation, University of Massachusetts at Amherst.
Lasnik, Howard
2006 Conceptions of the cycle. In Wh-Movement: Moving On, Lisa Cheng and Norbert Corver (eds.), 197–216. Cambridge, MA: MIT Press.
Legate, Julie Anne
2003 Some interface properties of the phase. Linguistic Inquiry 34 (3): 506–516.
Longobardi, Giuseppe
1994 Reference and proper names: A theory of N-movement in syntax and logical form. Linguistic Inquiry 25 (4): 609–655.
2005 Toward a unified grammar of reference. Zeitschrift für Sprachwissenschaft 24: 5–44.
Matushansky, Ora
2005 Going through a phase. In McGinnis and Richards (2005), 157–181.
McGinnis, Martha and Norvin Richards (eds.)
2005 Perspectives on Phases (MIT Working Papers in Linguistics 49). Cambridge, MA: MITWPL.
Müller, Gereon
2003 Phrase impenetrability and wh-intervention. In Minimality Effects in Syntax, Artur Stepanov, Gisbert Fanselow and Ralf Vogel (eds.), 289–326. Berlin/New York: de Gruyter.
Nevins, Andrew
2005 Derivations without the Activity Condition. In McGinnis and Richards (2005), 287–310.
Pesetsky, David
1982 Paths and categories. Doctoral Dissertation, MIT.
Pesetsky, David and Esther Torrego
2004 Tense, case, and the nature of syntactic categories. In The Syntax of Time, Jacqueline Guéron and Jacqueline Lecarme (eds.), 495–538. Cambridge, MA: MIT Press.
Pietroski, Paul M.
to appear Systematicity via monadicity. Croatian Journal of Philosophy.
forthcoming Semantics Without Truth-values. New York: Oxford University Press.
Polinsky, Maria and Eric Potsdam
2001 Long-distance agreement and topic in Tsez. Natural Language & Linguistic Theory 19: 583–646.
Richards, Marc
2006 On phases, phase heads, and functional categories. Ms., University of Cambridge.
2007 On feature inheritance: An argument from the Phase Impenetrability Condition. Linguistic Inquiry 38 (3): 563–572.
Sauerland, Uli
2003 Intermediate adjunction with A-movement. Linguistic Inquiry 34 (2): 308–314.
Saunders, Peter T.
1994 Evolution without natural selection: Further implications of the Daisyworld parable. Journal of Theoretical Biology 166: 365–373.
Svenonius, Peter
1994 C-selection as feature-checking. Studia Linguistica 48 (2): 133–155.
2004 On the edge. In Peripheries: Syntactic Edges and their Effects, David Adger, Cécile de Cat and George Tsoulas (eds.), 261–287. Dordrecht: Kluwer.
Treves, Alessandro
2005 Frontal latching networks: A possible neural basis for infinite recursion. Cognitive Neuropsychology 22 (3/4): 276–291.
Webster, Gerry and Brian Goodwin
1996 Form and Transformation: Generative and Relational Principles in Biology. Cambridge: Cambridge University Press.
Wurmbrand, Susi
2006 Licensing case. Journal of Germanic Linguistics 18 (3): 175–235.
Yang, Charles D.
2002 Knowledge and Learning in Natural Language. New York: Oxford University Press.
Contributors
Carlo Geraci
Università degli Studi di Milano-Bicocca
Piazza Dell’Ateneo Nuovo, 1
20126 Milano
Italy
[email protected]

Dennis Ott
Harvard University
Department of Linguistics
Boylston Hall, 3rd floor
Cambridge, MA 02138
USA
[email protected]

Kleanthes K. Grohmann
University of Cyprus
Department of English Studies
75 Kallipoleos
P.O. Box 20537
1678 Nicosia
Cyprus
[email protected]

Michael T. Putnam
Carson-Newman College
Department of Foreign Languages
1646 Russell Ave
341 Henderson Humanities Building
Jefferson City, TN 37760
USA
[email protected]

Martin Haiden
Université François Rabelais de Tours
Inserm U 930 “Brain and Imagery”
3, rue des Tanneurs
37041 Tours Cedex 1
France
[email protected]

Anjum P. Saleemi
GC University
Higher Education Commission of Pakistan, Department of English
Lahore 54000
Pakistan
saleemi_ [email protected]

Dalina Kallulli
Universität Wien
Institut für Sprachwissenschaft
Berggasse 11
1090 Vienna
Austria
[email protected]

Tobias Scheer
Université Nice Sophia-Antipolis
Laboratoire BCL
CNRS UMR 6039, MSH de Nice
98 Bd E. Herriot
06200 Nice
France
[email protected]

Thomas Stroik
University of Missouri – Kansas City
Department of English
Kansas City, MO 64110
USA
[email protected]

Takashi Toyoshima
Kyushu Institute of Technology
Department of Human Sciences
Kawazu 860-4, Iizuka
Fukuoka 820-8502
Japan
[email protected]

Hisao Tokizaki
Sapporo University
Department of English
Nishioka 3–7
Sapporo 062-8520
Japan
[email protected]
Index
accentuation (see also deaccentuation)
adjunct, 10, 14, 161–163, 169–175, 242, 265
affix class-based phenomena, 28, 37
Agree, 6, 10, 119, 126–128, 134, 135–137, 152–153, 157, 162, 218–221, 224, 257, 264–270
Albanian, 117–121, 125, 128
assembly problem, 12, 97
bare prosodic structure, 11, 67, 84–89
Branch Right, 99
causativization, 75–80
clausal complements, 12, 115–117, 265
clitic doubling, 117–120, 127
combinatorial, 181, 206–207, 211–216, 223
complexity, 9, 15–16, 167–168, 182, 211–216, 222–224, 232, 245, 267
Condition C asymmetries, 14, 161–162, 175–176
contraction, 105
CV model (of phonological representation), 69, 71
cyclic, 6, 23–59, 69, 95, 133–134, 138, 144, 147, 151–157, 170, 174–175, 222
  ~ derivation, 25, 41, 44
  ~ spell-out (of words), 23–24, 27, 37–44, 48–49, 53–58
Czech, 50
deaccentuation, 12, 115–116, 126–129
Defective Intervention Constraint (DIC), 221, 224, 246
derivation, 2–6, 9–16, 23, 25–27, 29, 32, 35, 44–47, 67–85, 90, 95–109, 115–116, 121, 126, 133–135, 138–139, 144–156, 161–177, 181–184, 195, 205–207, 211–237, 242–245, 257–258, 262, 267, 269
  ~ by phase, 2, 23, 26–27, 35, 44
determinism, 15, 212, 215, 224, 226, 230–233
discourse, 97–98, 106, 118, 124, 129, 186–188, 194, 203, 261, 270
dynamic, 2–10, 15, 211–213, 230–232
economy, 6–9, 15–16, 26, 45–47, 162, 174–178, 211–238, 243, 245
expletive, 16, 127, 152, 154, 197, 212–213, 219, 222–223, 229, 237–245
factivity, 116–121, 124–128
  ~ triggers, 116
feature, 1, 6, 10–14, 27, 37, 44, 67, 73, 78–83, 86–90, 101–105, 116, 125–128, 134–137, 144, 151–157, 163–168, 172–177, 186–190, 196–202, 205, 217, 225, 228–235, 238, 242–246, 257, 264–270
  edge ~, 134–135, 198, 257
  ~ inheritance, 264–270
  EPP ~, 127, 134–136, 151–154, 227, 231, 234–235, 238–243
f-marking, 129
generative lexicon, 189, 192, 195
generative semantics, 184, 185, 199
German, 11, 68, 75–76, 86–89, 117, 120–121, 124–125, 128, 264
givenness, 12, 115, 118–121, 128–129
global, 15–16, 33, 44, 47, 68, 211–216, 219, 227, 232, 261
Greek, Modern, 117–120, 125
head (definition), 4–5, 13, 27, 31, 36–38, 46–48, 71–72, 79–85, 88–90, 96, 102, 116, 126–127, 133–137, 140, 143–147, 151–154, 157, 166–167, 173–174, 197, 221, 240, 245, 258–268
Icelandic, 264–265
I-functionalism, 268–270
I-language, 25–258, 261
illocutionary, 183, 187–195, 199–200, 202, 204–205, 207
  ~ force, 15, 181, 183–184, 187–192
  ~ logic, 183
  ~ meaning (intentional), 14, 181, 183–184, 188, 190, 198–199
Inclusiveness Condition (IC), 25, 67, 70–71, 85, 258, 260
information structure, 8, 13, 54, 115–130
insertion, 16, 40, 54, 186, 197, 222
  lexical ~, 58, 96
interactionism, 10–11, 23–26, 44, 128, 196
intermodular argumentation, 10–11, 23–27, 35–39, 56–59
intonation, 40, 48–58, 261
Italian Sign Language (LIS), 13, 133–134, 139–149, 154–156
Late Merge, 150, 156, 161, 170, 175, 177
left-branching structure, 99, 108–109
level 1/2 rules, 29, 35
lexical (sub)array, 3–6, 13, 16, 152, 164, 186, 212, 217, 221, 245–246, 259
Linear Correspondence Axiom (LCA), 107
Linearize Down, 12, 96–101, 111
  successive-cyclic linearization, 13, 138
  top-down linearization, 95, 111
local (locally/locality), 4, 7, 14–16, 36, 126–127, 162, 167, 176, 211–216, 219, 224–227, 232–233, 243–245, 260–262, 265
look-ahead, 11, 15–16, 67–75, 79, 89–90, 174, 176, 211–212, 232–233, 243, 245
medial gemination, 71
Merge (definition) (see also Late Merge, Remerge), 4, 6, 9, 12–16, 28, 36, 52–53, 71, 79–88, 95–96, 98–101, 105–106, 109–111, 133–138, 151–154, 157, 161–178, 186, 199, 206, 212–213, 219–239, 242–245, 256–261, 265–270
  bottom-up ~, 98
  External ~, 135, 152, 154, 157, 164, 206
  Internal ~, 14, 40–41, 44–45, 89, 133–136, 151–154, 157, 162–171, 176–178, 183, 189, 199, 206, 212, 222, 224, 230, 232, 242, 245, 255–259, 269
  Pair ~, 162, 171
Minimal Link Condition, 218, 225
Minimalism/Minimalist Program, 2, 5–6, 9–15, 67, 80, 82, 90, 95, 111, 135, 162, 168, 173, 176–177, 181–182, 211, 218, 253
Minimum Feature Retention, Principle of (MFR), 16, 213, 231–238, 242–245
Model, 69, 73, 80, 83, 115, 174–176, 254–255, 261–262
  bifurcated ~, 14, 181, 184, 205–206
  derivational ~, 14, 75, 99, 101, 126, 177, 182, 184, 195, 205
  linear ~ (‘neo-Saussurean’), 15, 181, 184, 206
  T- ~, 2, 14, 25, 52
  Y- ~, 2, 14, 205, 262
morpheme, 24–25, 29–32, 39–44, 50, 54–58, 126, 186
  ~-specific phonologies, 28–32, 41–43
  abstract ~, 116, 126
multiple computational systems in phonology, 13, 23–24, 30–32, 39–46, 51, 68, 81, 84, 100, 111, 115, 164, 167, 172, 219, 222, 234, 259
no look-back mechanism, 4, 26–27, 33, 35, 37, 44–47, 55
non-branching constituent, 100, 107–108
non-concatenative, 11, 88
Numeration, 3, 4, 12–14, 101, 116, 126, 134–135, 152, 154, 162–164, 172, 175–177, 230
Optimality Theory (OT), 29, 51
  stratal ~, 29
optimization, 14, 16, 162, 211–224, 269
overgeneration, 256, 258
parsing, 12, 105–110
  vacuous ~, 108
partitive (Case), 229–230, 244
pause, 101–102, 105–110
performative deletion analysis, 184
phase, 1–16, 23–27, 31, 35–39, 44–46, 52, 55, 96–101, 105, 116, 133–134, 138–140, 144–157, 167–168, 176–178, 181–184, 187, 190, 195–198, 203–207, 212, 221, 233–237, 246, 253, 256, 259–271
  ~ edge, 10, 23, 27, 35–39, 46, 134, 139–140, 151, 154–157, 168, 263, 269
  ~ theory, 1–4, 7–9, 13, 16, 25–27, 35–37, 46, 96, 133–134, 148–150, 181–182, 233–237, 261, 268–270
  ~ unit, 97–98, 101
Phase Impenetrability Condition (PIC), 4–7, 23–27, 35–38, 44–45, 55–58, 144, 151, 198, 216, 221, 224, 261–265, 268, 270
  ~ in phonology, 35, 37, 44
  parameterisation of the ~, 55
phonologically empty element, 104, 106
phonology, 4, 11, 24–25, 28–32, 35–37, 40–58, 67–72, 75, 81, 84–87, 256
  lexical ~, 25–32, 35–44, 55, 58
  lexical vs. post-lexical ~, 42, 54–55
  word- vs. sentence- ~ (Praguian segregation), 41–44, 54, 58
pitch accent, 121, 124, 129
polarity, 15, 200–202
presupposition, 12, 115–116, 120, 124
probe-goal, 6, 126–128, 151, 220, 228, 257
Projection Principle, 85, 89
pronoun, 116–118, 121, 124–129, 145, 161–163, 169–170, 172, 175, 205
  clitic ~, 117
  correlate ~, 120, 128
  pleonastic ~, 118, 124
proposition(ality), 15–16, 36, 120, 165, 184, 187–188, 190, 193–197, 202, 212, 218, 222, 259–262, 269–270
  propositional meaning, 188, 190, 218, 222
prosodic information, 126, 130
reconstruction, 14, 105–106, 161–164, 169, 172, 174, 177, 263–264, 267
recursion, 49–52, 67, 86–87, 225, 235, 255
  ~ of prosodic structure in intonation, 49–52, 86
remerge, 14, 146–147, 155, 161–163, 166, 171–174, 176
representation(al), 8–9, 11, 13, 15, 40, 57, 67–71, 75, 79, 90, 96, 101–105, 110–111, 115, 169–173, 177, 181, 190, 200–201, 207, 211–214, 217–218, 233, 235, 245, 256, 261
rewriting rules, 96
rightward movement, 143–146, 150, 156
  ~ of wh-phrases, 150
sandhi (external), 40, 44, 55, 57
  cyclicity-induced ~ (absence of ~), 40, 42, 44, 59
sense extension, 196–197, 259
Shortest Derivation Condition, 211, 218–219, 227
Shortest Movement Condition, 218–219
silent demibeat, 12, 95, 102–111
Simpl, 162, 171–177
Spanish, 129
speech act, 183–184, 187, 191–192
Spell-out, 2–13, 23–49, 53–58, 67–68, 74–75, 95–111, 116, 133, 138–139, 154–155, 176, 178, 184, 206–207, 269
  ~ mechanism, 23–24, 27, 31, 36, 38–41, 46–47, 55–58
  ~ your sister, 37, 39, 56
  ~-as-you-merge, 36
  selective ~, 25–27, 31–32, 35–37, 56
  word ~, 10, 23–24, 27, 38, 48, 54–58
  ~-model, 4, 6–7, 12, 98, 100
split-CP, 15, 194, 198–199
Strong Minimalist Thesis (SMT), 256, 266–269
Survive, 14, 162–163, 172–178, 246
syntactic bracket, 12, 100–105, 111
template, 11, 67–76, 87–90
third factor, 254–255
topichood, 118–119
Transfer, 3–4, 7, 9, 69, 126, 183, 266–269
Tsez, 265
umlaut, 78–80, 89
unaccusativity, 79, 238–242, 245, 263
underapplication, 28–29, 32–33, 35, 37
Universal Theta Assignment Hypothesis (UTAH), 258
up-down paradox, 99
zig-zag movement, 13–14, 133–134, 139–140, 144, 148–151, 154–157