Figure 5. Accumulation of the impression of foreign accent by means of the combined effect of the inappropriate placement, choice and realization of pitch accents in the utterance ‘Und wenn auch die Mehrzahl von ihnen nur so lange Zeit blieb wie der Umtausch in Anspruch nahm, so gab es doch einige, die sich hinsetzten und gleich auf der Stelle zu lesen begannen’ (“And even if the majority of them only stayed as long as it took to complete the exchange, there still were some who sat down and started reading right away”).
In summary, it can be reiterated that the multitude of different facets of tonal variation strongly impedes any attempt to provide a structured overall representation or classification of its association with clearly defined semantic interpretations. This inherent characteristic of intonation has the disadvantage of making it difficult for the researcher to determine which tonal choices are really inappropriate and why. However, it also has the advantage of offering to the language teacher the possibility of identifying, selecting and teaching a wide variety of individual correspondences between particular intonation patterns and interpretations that he or she considers to be especially useful or important. As a matter of fact, this choice need not be restricted to a number of closely defined discourse situations, which are connected to specific tunes.
Matthias Jilka
5. Overall impression of intonational foreign accent
Unlike the individual and immediately conspicuous cases of tonal deviations discussed in section 3, impressions of foreign accent caused by cumulative effects are not only associated with the moment when no reasonable interpretation for the overall intonation pattern is possible anymore (see section 4). Very often it is rather a process of becoming aware of preexisting subconscious impressions that something indefinable in the speaker’s productions is unusual (provided of course that there are no other more obvious errors, for example on the segmental level). The listener may therefore perceive one complex, overall impression as opposed to discrete individual deviations following each other. It is thus not unreasonable to postulate that when the many individual events potentially expressing foreign accent are combined, such a common overall impression is created. In other words, several intonational features together would conspire towards a specific overall intonation characteristic. Summarizing observations made with respect to American speakers’ intonation in German, several features can indeed be shown to exhibit similar tendencies. A comparison of the same sentence read by native speakers of American English and German as depicted in Figure 6 shows that the Americans use twice as many pitch accents as the Germans in the same stretch of speech and that they tend to have wider pitch ranges (in section 3.1 an identical observation was confirmed by measurements within the framework of PaIntE parameters). The American speaker’s production in our example can thus be described as comprising more tonal movements, rises and falls, with more extreme endpoints. The resulting perception is that of a much more lively intonation. This impression is representative of American speakers as opposed to Germans in general.
Further support for this tendency can be found in the fact that if a transfer of a tonal category takes place, it is likely to lead to additional tonal movements as well, as for example in the transfer of the continuation rise described in section 3.1, in which the comparison showed an extra fall and rise (L*H % vs. L+H* L-H%) in the American speaker’s production.
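The two global tendencies described above (more pitch accents per stretch of speech, wider pitch range) can be expressed as simple numbers. The following is only a minimal sketch: the function names, the convention that 0 marks an unvoiced frame, and all F0 and accent values are invented for illustration and are not measurements from this study.

```python
import math

def pitch_range_semitones(f0_hz):
    """Distance between the lowest and highest voiced F0 value, in semitones."""
    voiced = [f for f in f0_hz if f > 0]  # assumption: 0 marks an unvoiced frame
    if not voiced:
        raise ValueError("no voiced frames in contour")
    return 12.0 * math.log2(max(voiced) / min(voiced))

def accents_per_second(accent_times_s, duration_s):
    """Number of labelled pitch accents per second of speech."""
    return len(accent_times_s) / duration_s

# Hypothetical F0 samples (Hz) and pitch accent times (s) for one 4-second sentence.
german = {"f0": [110, 0, 120, 140, 125, 0, 105],
          "accents": [0.4, 1.9, 3.1], "duration": 4.0}
american = {"f0": [100, 0, 130, 180, 95, 0, 160],
            "accents": [0.3, 0.9, 1.6, 2.2, 2.9, 3.5], "duration": 4.0}

for name, spk in (("German", german), ("American", american)):
    span = pitch_range_semitones(spk["f0"])
    rate = accents_per_second(spk["accents"], spk["duration"])
    print(f"{name}: range {span:.1f} st, {rate:.2f} accents/s")
```

Expressing the range in semitones rather than Hz makes speakers with different voice registers comparable, which is why that unit is chosen here.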
Different manifestations and perceptions of foreign accent in intonation
Figure 6. Differences in overall intonation characteristics in the sentence ‘Alle kochten bereits vor Wut und der Mann konnte jetzt von allen Seiten Schimpfwörter hören’ (“Everybody was boiling with rage and the man could hear swearwords (directed at him) from all sides”) between productions by German (top contour) and American speakers (bottom contour). American speakers typically use about twice as many pitch accents and make more generous use of their pitch ranges
These accumulated patterns create a form of “global” intonational foreign accent that is language-distinctive, if not language-specific, due to the influence of the native language. This form of foreign accent exhibits a certain independence from the segmental level. Knowledge of the relationship between phonetic and phonological parameters (e.g., temporal alignment, choice and placement of pitch accents) and their interpretation is not necessary for listeners to be able to recognize and possibly identify the foreign accent. This language-distinctive independence of prosodic features has been demonstrated in a number of studies including, just to name an example, a thorough language identification task by Ramus and Mehler (1999) that used different stages of delexicalization by means of resynthesis. This aspect was also examined specifically with respect to the perception of foreign-accented speech in Jilka (2000). Listeners were presented with low pass-filtered stimuli and asked to decide whether the language they heard was English or German. The stimuli had been produced by native speakers of American English and German and were selected in such a way that the majority of them were foreign-accented. Therefore listeners were expected to identify the speakers’ native languages. While for stimuli
of varying duration identification rates (i.e., correct recognition of the speaker’s native language) were generally not significant, there was a significant (p = 0.030) correlation of 0.786 (Spearman’s rho) between identification rate and stimulus duration. For this reason, a small-scale additional test with eight stimuli longer than 35 seconds was performed. As expected, the speakers’ native languages were recognized in all cases, in six cases significantly so (p < 0.00005). Such results can certainly be interpreted as confirmation of the idea postulated earlier that the overall impression of foreign accent, independent of semantic content, slowly accumulates during a stretch of speech. The longer it is, the more hints at unusual tonal features reach the listener’s ear, until they eventually cross the threshold of awareness (see note 2).
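For readers unfamiliar with the rank correlation reported above, the following minimal sketch shows how Spearman’s rho can be computed as the Pearson correlation of rank-transformed values. The helper names and the duration and identification-rate values are hypothetical and are not the data from Jilka (2000); the significance test is not reproduced.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson product-moment correlation of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical stimulus durations (s) and identification rates (proportion correct).
durations = [4, 7, 11, 15, 20, 26, 33]
rates = [0.48, 0.55, 0.52, 0.61, 0.66, 0.63, 0.74]
print(round(spearman_rho(durations, rates), 3))
```

Because only ranks enter the computation, the coefficient captures whether identification improves monotonically with duration without assuming a linear relationship.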
6. Possible conclusions for language teaching
The presented characteristic aspects of intonation address potentially difficult challenges for intonation research and teaching alike. It can be shown, however, that these challenges can be met to a considerable degree and that the discussion of these aspects can lead to insights that underline the usefulness of strictly research-related problems to the development of teaching methods. The question of the significance of perspective, i.e., the dependence on the model of intonation description, expresses a general uncertainty as to what the true representation of intonation is. This certainly is problematic for intonation research. However, from a more practical, pedagogical perspective it can also be argued that the multitude of different representations provides the chance to deal with an intonation error from different starting points. In section 3 the causes of some intonational deviations were shown to be either unknown or of a non-transparent nature. As a result it would be extremely difficult, if not impossible, to understand these particular sources of foreign accent and develop systematic approaches to predicting and avoiding them. On the other hand there are some well-defined environments, especially the basic discourse situations such as declaratives, wh-questions, yes/no-questions, continuation rises etc., for which it should be possible to make sure that the final tunes associated with them are produced correctly and do not contain any obvious cases of tune transfer. A similar approach could be applied to successfully identified transfer phenomena concerning
the phonetic realization of equivalent categories. See section 3.1 for example for both types of transfer. The importance of the high variability of intonation was discussed as a factor complicating the relationship between prosody and meaning. The virtually infinite number of tonal variations and corresponding interpretations makes it impossible for intonation researchers to provide a formal description that relates all possible variations in all possible contexts to the intended corresponding interpretations. Even if such a description existed it would obviously still be unreasonable to expect a second language learner to be able to acquire and apply it. It was pointed out, however, that from the pedagogical point of view this variability also has the benefit of allowing the identification and teaching of specific tonal constellations that are guaranteed to express the intended discourse meaning. The selection of such exemplary tonal patterns and interpretations, as well as the development of suitable teaching methods, may be challenging but is nevertheless well within the grasp of the language teaching community. Finally, the overall characteristics of foreign accent described in section 5 contribute an essential share of the impression of foreign accent that a non-native speaker conveys. One interesting inherent property that these general characteristics have is that it is not necessary to relate them to particular meanings or positions in the phrase. 
If specific differences exist between two languages, like they do between German and American English, and it is possible to teach learners conscious control of these global features (e.g., “don’t extend your pitch range too much”, “use fewer pitch accents” etc.), then this measure alone should greatly reduce the impression of intonational foreign accent, even though it would not affect the more persistent tonal deviations that are due to misrepresentations of equivalent contexts or categories in the learners’ native languages. The observations and suggestions contained in this study are made from the point of view of intonation research and do not incorporate insights from the fields of teaching methodology or even pedagogy in general. They express a relatively broad objective and would of course not lead to a completely accentless pronunciation (which is not a realistic goal anyway). It can be argued, however, that their application together with a heightened awareness of the nature of intonational errors will help foreign language teachers such as teachers of German as a foreign language to develop more systematic approaches to dealing with foreign-accented intonation. The application of speech technology in the form of F0 generation and resynthesis as demonstrated in Figure 4 should also be of use eventually,
helping for example to make clear the difference between intonation contours that a learner has produced and more appropriate realizations generated with the learner’s own voice (some commercially available products that attempt to go in this direction already exist).
Notes
1. British School transcriptions follow the models given in Cruttenden (1997: 61).
2. A number of studies have also shown that rhythmic features alone may be sufficient to distinguish languages. In Jilka (2000) it is, however, shown, using low pass-filtered stimuli with a constant F0 of 220 Hz, that language identification rates are significantly better when intonation information is present.
References
Beckman, Mary and Gail Ayers 1994 Guidelines to ToBI Labelling. Version 2.0. Ohio State University.
Best, Catherine T. 1995 A direct-realist view of cross-language speech perception. In: Winifred Strange (ed.), Speech Perception and Linguistic Experience: Theoretical and Methodological Issues, 171–204. Timonium, MD: York Press.
Cruttenden, Alan 1997 Intonation. Cambridge: Cambridge University Press.
Flege, James E. 1995 Second language speech learning: Theory, findings and problems. In: Winifred Strange (ed.), Speech Perception and Linguistic Experience: Theoretical and Methodological Issues, 233–277. Timonium, MD: York Press.
Halliday, Michael A. K. 1967 Intonation and Grammar in British English. The Hague: Mouton.
Jilka, Matthias 2000 The Contribution of Intonation to the Perception of Foreign Accent. PhD Dissertation. AIMS 6(3). Stuttgart: University of Stuttgart.
Jilka, Matthias and Bernd Möbius 2006 Towards a comprehensive investigation of factors relevant to peak alignment using a unit selection corpus. Proceedings of Interspeech, Pittsburgh, 2054–2057.
Jilka, Matthias, Gregor Möhler and Grzegorz Dogil 1999 Rules for the generation of ToBI-based American English intonation. Speech Communication 28, 83–108.
Kingdon, Roger 1958 The Groundwork of English Intonation. London: Longman.
Klein, Wolfgang and Clive Perdue 1997 The basic variety (or: Couldn’t natural languages be much simpler?). Second Language Research 13, 301–347.
Möhler, Gregor 1998 Theoriebasierte Modellierung der deutschen Intonation für die Sprachsynthese. PhD Dissertation. AIMS 4(2). Stuttgart: University of Stuttgart.
Möhler, Gregor and Alistair Conkie 1998 Parametric modelling of intonation using vector quantization. Proceedings of the 3rd ESCA Workshop on Speech Synthesis, Jenolan Caves (Australia), 311–316.
Müller, Karin 1998 German Focus Particles and their Influence on Intonation. Master’s Thesis, University of Stuttgart.
O’Connor, Joseph D. and Gordon F. Arnold 1973 Intonation of Colloquial English. London: Longman.
Palmer, Harold E. 1922 English Intonation. Cambridge: Heffer.
Pierrehumbert, Janet 1980 The Phonology and Phonetics of English Intonation. PhD Dissertation. Cambridge, MA: MIT.
Pierrehumbert, Janet 1981 Synthesizing intonation. Journal of the Acoustical Society of America 70, 985–995.
Pike, Kenneth 1945 The Intonation of American English. Ann Arbor: University of Michigan Press.
Ramus, Franck and Jacques Mehler 1999 Language identification with suprasegmental cues: A study based on speech resynthesis. Journal of the Acoustical Society of America 105, 512–521.
van Santen, Jan and Julia Hirschberg 1994 Segmental effects on timing and height of pitch contours. Proceedings of the 3rd International Conference on Spoken Language Processing, Yokohama (Japan), 719–722.
Silverman, Kim, Mary Beckman, John Pitrelli, Mari Ostendorf, Colin Wightman, Patti Price, Janet Pierrehumbert and Julia Hirschberg 1992 ToBI: A standard for labelling English prosody. Proceedings of the 2nd International Conference on Spoken Language Processing, Banff (Canada), 867–870.
Trager, George L. and Henry L. Smith 1951 An Outline of English Structure. Norman, OK: Battenburg Press.
Rhythm as an L2 problem: How prosodic is it?
William J. Barry
1. Introduction
Making L2 learners aware of pronunciation problems in general and, more specifically, of the difference between their own pronunciation and the pronunciation they are supposed to acquire is extremely difficult, as any language teacher (interested in pronunciation) will attest (see note 1). It should therefore be paramount that the terms we use to direct learners’ attention to problem areas should be clearly defined and easy to associate with the phenomenon that they need to learn. And there’s the rub! It is well-known that the pronunciation problems we face are difficult to illustrate, explain and demonstrate because:
(i) Acoustic phenomena remain as pre-categorical percepts in our consciousness for no more than a fraction of a second (Massaro 1972; Kallman and Massaro 1983) and as perceived categories (which already resist change in our manner of dealing with them) for no more than a few seconds (Crowder and Morton 1969, and compare Crowder 1993 and de Gelder & Vroomen 1997).
(ii) We do not process the time-varying signal uniformly over time: The mechanisms we have developed in our L1 for decoding the phonetic information contained in the acoustic signal are attention-directed and the properties to which attention is directed can differ in importance from language to language (cf. for example Hazan 2002, and see Quené and Port 2005 for effects of “rhythmically” induced attention).
(iii) Our decoding mechanisms are geared primarily to the extraction of communicatively relevant information (the semantics of an utterance, its significance for the ongoing communication act). For this we do in fact make use of phonetic nuances of the utterance, but in terms of speaker identity interpretation (cf. Palmeri, Goldinger and Pisoni 1993), which may also serve speaker-attitude interpretation. But we are not concerned with pronunciation analysis.
In summary, becoming aware of and learning a foreign pronunciation is problematical. But it is not impossible, as some people’s natural acquisition of an acceptable L2 accent testifies. That we all do react to the differences between external models and our internal pronunciation habits is illustrated by many adults living abroad who, after many years in the foreign-language environment, lose their perfect native pronunciation but do not acquire perfect L2 pronunciation (cf. Markham 1997). The potential for using the acoustic differences in teaching depends, however, on directing a learner’s attention to the differences, or to quote the opening thought in this introduction: “making L2 learners aware of pronunciation problems”. Finding a “hook” on which to hang the problem is a vital first step. Different problems present different degrees of difficulty in finding the right hook, and prosodic problems are particularly difficult. The thesis behind this paper is that Rhythm (see note 2) presents the greatest difficulties and we therefore need to rethink the status of Rhythm in pronunciation teaching.
2. The “hooks” to swing on
Segmental problems are the easiest problems to explain because we have an orthography-to-sound relationship (itself a “spelling” problem of course) which our Western, reading- and writing-orientated education fixes in our mind. Of course, as pronunciation teachers, we have to fight continuously against the confusion between letters and sounds, but the letters (and letter combinations) provide a permanently recordable focus (on paper) for developing exercises. The PERmanent GRAPHic RECord can also be exploited in making learners aware of the word-stress concept. In terms of accessibility and learner awareness, word-stress is not so problematical because word identity (meaning) is central to everyone’s idea of learning a language. If, by chance, there are minimal pairs relying on word-stress, then the way you reCORD them helps to strengthen the concept.
[Figure 1: panels a and b; x-axis Time (s), 0 to 1.34376]
Figure 1. Microphone signal, F0 and spectrogram of a) REcord and b) reCORD
Nowadays, with the ubiquitous notebook (PC) and readily available signal-processing freeware (perhaps the most powerful package available is Praat: www.fon.hum.uva.nl/praat/), a signal-based graphic record can be presented together with the auditory example (see Fig. 1, and listen to the sound-file REcord-reCORD.wav) to create the necessary link between intellectual understanding of the concept and experience of the phenomenon itself. At this level, the relationship between the simple experience of syllabic prominence and the complex of prominence-bearing signal properties (duration, F0, intensity and vowel spectrum) can be demonstrated and may also become comprehensible beyond being merely a verbal formula. Both the “hooks” already mentioned in connection with word-stress can be exploited for work on sentence-stress, and the learner’s awareness of the phenomenon can be easily stimulated because, here too, the natural process of decoding the meaning of an utterance results in a difference in understanding of e.g., “I THOUGHT he aGREED” and “I THOUGHT he aGREED”. Of course, the graphic signal representation with an accompanying acoustic demonstration (see Fig. 2 and listen to the sound-file Fig2-agreed-a+b.wav) adds flesh to the skeletal understanding of the concept triggered by the purely orthographic representation.
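To make the “complex of prominence-bearing signal properties” concrete, the following minimal sketch measures three of them (duration, intensity and F0) on two synthetic “syllables”. Everything in it is an assumption made for illustration: the sample rate, the sine-wave syllables, the function names and the crude zero-crossing F0 estimate, which works only for clean periodic signals and is no substitute for the pitch analysis in a tool such as Praat. Vowel spectrum is left out.

```python
import math

SAMPLE_RATE = 16000  # Hz, assumed for this illustration

def synth_syllable(f0_hz, dur_s, amp):
    """Stand-in for a recorded syllable: a pure sine wave."""
    n = int(round(dur_s * SAMPLE_RATE))
    return [amp * math.sin(2 * math.pi * f0_hz * i / SAMPLE_RATE) for i in range(n)]

def duration_s(samples):
    """Duration in seconds."""
    return len(samples) / SAMPLE_RATE

def rms_db(samples):
    """Root-mean-square level in dB relative to full scale (amplitude 1.0)."""
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return 20 * math.log10(rms)

def crude_f0(samples):
    """Estimate F0 from the spacing of upward zero crossings."""
    ups = [i for i in range(1, len(samples)) if samples[i - 1] < 0 <= samples[i]]
    if len(ups) < 2:
        raise ValueError("too few zero crossings to estimate F0")
    return (len(ups) - 1) * SAMPLE_RATE / (ups[-1] - ups[0])

# Hypothetical stressed vs. unstressed syllable: longer, louder, higher.
stressed = synth_syllable(f0_hz=120, dur_s=0.30, amp=0.8)
unstressed = synth_syllable(f0_hz=100, dur_s=0.15, amp=0.4)

for name, syl in (("stressed", stressed), ("unstressed", unstressed)):
    print(f"{name}: {duration_s(syl):.2f} s, {rms_db(syl):.1f} dB, {crude_f0(syl):.0f} Hz")
```

The point of the sketch is only that one perceived difference (prominence) corresponds to several measurable differences at once, which is what the graphic record lets the learner see.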
[Figure 2: panels a) and b); x-axis Time (s), 0 to 2.59565]
Figure 2. Microphone signal and F0 trace of a) “I THOUGHT he aGREED” and b) “I THOUGHT he aGREED”
With the signal representation, the relationship is again illustrated between the complex signal structure (duration, F0, intensity and – in this case less so – vowel spectrum) and the less complex perceived difference in the prominence patterns between the sentences (with the accompanying difference in their meanings). If we now look for a “hook” on which to hang the concept of intonation, we begin to run into a number of difficulties. Firstly, the melodic pattern, which is fundamental to intonational structure, cannot be so simply, or at least not so naturally, demonstrated using the orthographic manipulation tools that were so helpful for word- and sentence-stress. But careful progression through the methods used in intonation description, from iconic to more abstract (see Fig. 3a-d), should help to develop the learner’s awareness (see note 3). Secondly, even though we recognize the primary role of tonal properties in intonation, too narrow an understanding of intonation as only the melodic pattern carried by the fundamental frequency contour is patently wrong. The contour carrying version b) of the sentence in Fig. 2 (I THOUGHT he AGREED) can be seen to rise from “I” to “thought”, to remain level for “he a-” and then to scoop down low and rise again during “greed”.
[Figure 3: panels a) to d) showing “I THOUGHT he aGREED” with tonal annotations including H* and ^H* L-%]
Figure 3. Different graphical means of conveying the tonal contour of an utterance, with increasing abstractness from a) to d).
[Figure 4: panels a and b, original (Orig.) and manipulated contours; x-axis Time (s), 0 to 2.98785]
Figure 4. “I THOUGHT he aGREED” and “I THOUGHT he AGREED” with same tonal contour. (top contour: original production, bottom: manipulated contour)
Figure 4, however, shows basically the same contour in a perfectly acceptable realization of version a), i.e., with the secondary sentence accent on “thought” and the primary accent on “agreed” (I THOUGHT he aGREED; listen to sound-files Fig4-orig.wav and Fig4-manip.wav). Although the melodic contour is the same, no one would wish to say that the intonation is the same. The two versions of the utterance (Fig. 2b and Fig. 4) clearly have a different meaning (see note 4), and that is due to the difference in intonation, which is the product of the tonal movements in relation to the duration and intensity of the accented syllables.
3. A hook for Rhythm?
Having looked for and found (albeit with increasing difficulty) “hooks” to hang our awareness teaching on, we can now ask what Rhythm is. Is it something above and beyond the three prosodic structuring levels – word-stress, sentence-stress and intonation – that we have considered so far, or is it perhaps below and part of them? Before addressing that question, we need to recognize that there is a progressive overlap in the acoustic and linguistic nature of each of the phenomena as we consider them in turn: Sentence-stress makes use of the lexical stress patterns to structure its prominences (and appears to use the self-same acoustic parameters); intonation needs the sentence-stress structure to fit its melodic pattern over. Looking at it in another way, we see that the separation of word-stress from sentence-stress, and sentence-stress from intonation is an artificial product of the particular level of observation and analysis. In reality they are not separable: In a one-word “sentence”, word-stress is sentence-stress and it also carries the intonation contour. Similarly, in the more usual multi-word utterances, sentence-stress relies in part on the tonal movements of the intonation contour to make the important words prominent, and the tonal movement relies on the durational and (apparently to a lesser extent) intensity properties of the accented words. What, then, is Rhythm in spoken language? One approach to the question is to try to relate the prosodic structuring of spoken language to a more general understanding of Rhythm.
3.1. Rhythm in music and spoken language
Outside language, particularly in music of the Western tradition, rhythm is commonly understood to be the repeated pattern of prominent beats and the less prominent beats between them. We talk about a whole piece of music being “rhythmic” if there is a regular strong beat. But the nature of the rhythm depends on the number of weaker beats between the strong ones. These, it seems, have to be of a predominantly constant number, though an occasional reduction or increase in the number doesn’t change the perceived nature of the rhythm as a whole, as long as the temporal relationship between the strong and the weak parts of the bar is kept constant. Another important feature is that rhythm is not continuous throughout a piece, but is manifested within phrases, which often have boundary properties (e.g., a weak beat before the first strong beat, a final strong beat with no accompanying weak beats, etc.) which are different from the regular beats within the phrase. Projecting this common understanding of rhythm onto spoken language, we can immediately appreciate that spoken verse can be produced and perceived as “rhythmic” in a similar sense. This is because the words and phrases are selected to conform to one of the classical poetic metrical patterns of strong (¯) and weak (ˇ) syllables – iambic (ˇ ¯), trochaic (¯ ˇ), dactylic (¯ ˇ ˇ), anapaest (ˇ ˇ ¯) – often with a strict number of beats (feet) in the phrase (line). The close relationship between musical and poetic rhythm is apparent in words put to music and tunes to which words are written. However, the natural production of a poetically well-formed phrase in normal speech communication, though possible, is rather rare and regarded as special (as the post-hoc observation “I was a poet and didn’t know it” bears out).
A further consideration which separates classical poetic metre from natural speech is its application across (Western) languages, independent of a language’s status in terms of linguistic rhythm typology. The perceptual effect in different languages of, technically, the same metrical structure can be very different. We can thus close the case on normal spoken language rhythm being the same as musical rhythm and come to a second approach, the language-typology approach to spoken-language rhythm. Since Lloyd (1940: 25), who famously described French as having a “machine-gun rhythm” and English as having “morse-code rhythm” (see note 5), an almost mystical belief has arisen in a rhythm-based division of the languages of the world into what Pike (1946) termed “syllable-timed” and “stress-timed” languages. The
identification of a third type – “mora-timed” – was separate from this dichotomy and has been attributed to Bloch (1950) and to Ladefoged (1975). This characterisation is as attractive as it is problematic, both in general scientific terms and in respect of its possible application to L2 pronunciation.
3.2. Rhythm in language typology
Scientifically, binary (or even ternary) features which contribute to the categorisation of language phenomena are attractive concepts which demand serious examination. To be phonologically relevant, however, there needs to be some structural correlate of Rhythm which is best explained by that concept rather than another (already established) phonological category. Alternatively, the term can be based on the conceptual grouping of a number of structural correlates, possibly already established at other levels of description. Ideally, these structural properties should have identifiable phonetic exponents, either in measurable aspects of speech production or in reliable perceptual reflexes. Although there is extensive linguistically orientated and often experimentally supported discussion of the supposed universal rhythmic distinctions (cf. Bertinetto 1989 for a thorough and humorously (self-)critical discussion of the literature up to that point), the majority devoted to the syllable- vs. stress-timed distinction, no single structural correlate has been found which justifies the labels as phonological categories in the normal sense of the term. On the other hand, it has been suggested (Bertinetto 1981; Dauer 1983, 1987) that differences in rhythm type are the product of a number of phonologically relevant dimensions, among which are structural properties such as syllable complexity, vowel-length distinctions and word-stress, and interactional prosodic effects such as vowel-duration- and vowel-quality-dependency on stress, the coincidence (or not) of intonational F0 peaks and troughs and of lexical tones with accented syllables.
In this respect, Rhythm becomes a phonologically relevant cover term, but no longer in the sense of a rhythm dichotomy or trichotomy. There is no reason to expect the properties listed by Dauer (1987) to group into two neat packages supporting the syllable- vs. stress-timed division, and the position of the mora-timed languages relative to the implied continuum (if the properties group freely) is undefined.
Psycholinguistic research, on the other hand, offers some support for the perceptual reality of the rhythmic typology divisions in terms of lexical processing, at least for the languages which are cited as being prototypical for the three rhythmic types (French, English and Japanese). Cutler et al. (1986), Cutler & Otake (1994), Cutler (1997), Cutler, Murty and Otake (2003), Otake et al. (1993) have demonstrated lexical access differences in terms of the effect of the syllable or the mora on the speed of access. It must be conceded, however, that, real as these processing differences are, they do not relate to any concept of Rhythm we have discussed. A perceptual acceptability study by Bertinetto and Fowler (1989), however, demonstrated that English listeners are relatively insensitive to durational manipulation which shortens unstressed syllables compared to Italian listeners (though neither is particularly sensitive to lengthening of unstressed syllables). This corresponds to the results of production analyses for Italian (Farnetani & Kori, 1990) and Greek (Arvaniti, 1994) which, for these two languages, support the “syllable-timing” claim that sequences of more than two unstressed syllables are articulated without any “eurhythmic” differentiation. In languages such as English, sequences of more than two unstressed syllables are produced in such a way as to provide longer, less reduced syllables between shorter, more reduced ones, resulting in a perceptible alternating “rhythm”. It should be borne in mind, however, that this “rhythmic” difference occurs, and becomes apparent, only in the perceptually less prominent parts of utterances between the more prominent syllables of sentence-accentuated (i.e. informationally important) words. The prominence patterning of the complete “information package” (possibly a sentence, or an intonation phrase within a longer sentence) will be necessarily more complex than the sequence of unstressed syllables alone.
This may, in part, explain why no instrumental analyses looking for isochrony (either of syllables or feet) have been successful (Roach 1982; compare Bertinetto 1989). Instrumentally based attempts to define the rhythmic types in quantitative terms at the level of production can be divided mainly into two approaches: those seeking syllable- vs. foot-based durational regularity or isochrony (cf. Bertinetto 1989) and those looking for differences between languages in the degree of variability in consecutive (part-of-) syllable durations (Grabe and Low 2002; Ramus, Nespor and Mehler 1999; Gibbon and Gut 2001; Wagner and Dellwo 2004). The earlier studies sought regularity, seeking some confirmation in substance for the original auditory impressions. They were singularly unsuccessful, and it appears currently to be generally accepted that there is no direct acoustic or articulatory measure of the syllable- vs. stress-timed distinction. Studies that included mora-timing in their remit (Hoequist 1983a, 1983b) have been no more successful. The studies quantifying the structural variability of syllables are based broadly on the theoretical framework suggested by Bertinetto (1981) and Dauer (1983, 1987), though their measures are restricted to durational derivatives directly or indirectly linkable to many of the structural properties. They capture either the overall variability of the syllable, vowel or consonantal interval durations (e.g. with the standard deviation) or the average degree of durational change from one interval to the next throughout an utterance or a corpus. There has been considerable success in differentiating between languages traditionally regarded as belonging to one of the three rhythm types (cf. Ramus et al. 1999; Grabe and Low 2002). However, there is no reason to consider the measures to be a reflex of specifically rhythmic rather than general structural properties (Barry et al. 2003, Wagner and Dellwo 2004), and it has been demonstrated (Steiner 2003) that, in the Bonn database at least, it was any subdivision of the sound inventory into “vocalic” and “consonantal” intervals, and ultimately the distribution of /l/ and /n/ in the different languages, which served as language differentiators. But as measures of language classification (rather than language differentiation), they may be unreliable because they can be strongly influenced by speech rate (cf. fig. 5 from Barry et al. 2003), showing a shift from more to less variable structure with increasing articulation rate (see also Engstrand & Krull 2001 for similar observations on Swedish read vs. spontaneous speech). The extent of this shift observed in Barry et al.
(2003) is almost certainly, in part, an artefact of the structural basis for the calculation of articulation rate, i.e., syllables per second. This is unreliable for spontaneous speech, since the word sequences being compared are not identical and structurally less complex syllables are, ceteris paribus, produced more quickly than complex ones. In other words, the division of the corpus into three sub-corpora of differing articulation rates is also a division into utterances with different average syllable complexities. However, even the Bonn speech rate corpus (Dellwo et al. 2004) shows considerable inter-syllabic variability over slow-to-fast speech rates for lexically controlled (albeit read) speech (Dellwo and Wagner 2003). It is again the consonant variability measure (DeltaC) which shows the strongest variation (for German, English and French, though with a deviation from the general pattern for the fastest rate in English). In their discussion, Dellwo & Wagner (2003) touch on the problem of different tempo norms in French compared to English or German, and suggest the normalizing variation coefficient (varco = DeltaC * 100 / meanC) as a means of teasing out language differences. They report an interesting separation of French (varco remains constant) on the one side from German and English on the other (varco changes with articulation rate). In articulatory terms, we suggest (without insight at present into the details of their findings) that German and English (and by analogy also Swedish, cf. Engstrand and Krull 2001) tend to simplify the potentially more complex syllable structure with increasing articulation rate, whereas French has less scope for such simplification.
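The normalization suggested by Dellwo & Wagner (2003) can be illustrated with a minimal sketch in Python. The interval durations are invented for illustration, the function names are hypothetical, and the use of the population standard deviation for DeltaC is an assumption, since the formula quoted above does not specify it. The sketch only shows why varco, unlike DeltaC, is unaffected by a purely proportional change in tempo.

```python
from statistics import mean, pstdev

def delta_c(durations):
    """DeltaC: standard deviation of consonantal interval durations.
    (Population standard deviation is an assumption of this sketch.)"""
    return pstdev(durations)

def varco(durations):
    """Normalizing variation coefficient: DeltaC * 100 / meanC."""
    return delta_c(durations) * 100 / mean(durations)

# Invented consonantal interval durations in ms (illustration only)
slow = [80.0, 160.0, 120.0, 200.0]
# The same intervals compressed uniformly, as if spoken twice as fast
fast = [d / 2 for d in slow]

print("DeltaC slow/fast:", delta_c(slow), delta_c(fast))
print("varco  slow/fast:", varco(slow), varco(fast))
```

If all consonantal intervals shrink in the same proportion, DeltaC shrinks with them while varco stays the same, so a varco that changes with articulation rate must reflect something other than uniform compression.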
[Figure 5 (plots not reproduced): two scatter plots, one relating DELTA-C to DELTA-V and one relating PVI-C to PVI-V, with data points labelled Gsp, Gr, Pi, Na, Ba1, Ba2, Bu1 and Bu2.]
Figure 5. Measures of consonantal and vocalic variation (calculated after Ramus et al. 1999 and Grabe & Low 2002) as a function of articulation rate (syll/sec). (Gsp = spontaneous German; Gr = read German; Pi = Pisa; Na = Naples; Ba1 and Ba2 = Bari; Bu1 and Bu2 = Bulgarian)
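The two families of measures plotted in Figure 5 can be sketched as follows. This is a minimal illustration in Python, not the procedure used in the studies cited: the interval durations are invented, the function names are hypothetical, Delta is taken to be the population standard deviation (an assumption), and the pairwise variability index is given in the raw and the rate-normalized form defined by Grabe & Low (2002).

```python
from statistics import mean, pstdev

def delta(durations):
    """Overall variability: standard deviation of the interval durations."""
    return pstdev(durations)

def raw_pvi(durations):
    """Mean absolute difference between successive intervals."""
    pairs = zip(durations, durations[1:])
    return mean([abs(a - b) for a, b in pairs])

def normalized_pvi(durations):
    """As raw_pvi, but each difference is divided by the mean of the
    two intervals concerned, and the result is scaled by 100."""
    pairs = zip(durations, durations[1:])
    return 100 * mean([abs(a - b) / ((a + b) / 2) for a, b in pairs])

# Invented interval durations in ms: the same four values in two orders
alternating = [100, 50, 100, 50]
grouped = [50, 50, 100, 100]

print("Delta:", delta(alternating), delta(grouped))
print("rPVI: ", raw_pvi(alternating), raw_pvi(grouped))
print("nPVI: ", normalized_pvi(alternating), normalized_pvi(grouped))
```

The two sequences contain the same durations, so Delta cannot tell them apart, whereas the pairwise index is higher for the alternating order. Neither measure, however, refers to anything beyond interval durations, which is consistent with the reservation that they capture structural rather than specifically rhythmic properties.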
A different approach by Cummins and Port (1998), using short, two-beat phrase repetition, does show clear production-pattern differences between French and English which are interpretable in terms of stress- vs. syllable-timing. Whereas English speakers appear to introduce an underlying, silent beat in order to regularize the timing of the repeated phrases at foot level, French speakers do not. This is interpretable as a sensitivity to foot-based structuring in English speakers and an inability in French speakers to structure the utterance rhythmically above the syllable level. The question arises, however, whether the “isochronic tendency” that is observable across phrases when they are repeated – akin to Abercrombie's silent stressed syllable in “__ ’kyou” (observed in repeated “Thank you” utterances, e.g. by a bus conductor, cf. Abercrombie 1967: 36) – corresponds to a need to regularize “feet” or “stress units” within a phrase. To summarize the attempts to pigeonhole Rhythm in linguistic terms over the past half-century, it is true to say that the many “isochrony” studies have looked for something measurable that is immediately relatable to “regularly repeated beats”. They have attempted (in vain) to verify instrumentally the original auditory observations by skilled phoneticians about contrasting “rhythmic” impressions of a small number of languages (originally only two). Since the 1980s, structural differences between languages have been moved into focus, and the thrust of work has been to identify differences between languages which conspire in one but not in the other group of languages to prevent the syllables in an utterance from occurring at equal intervals. Some are based in the segmental structure, like vocalic quantity oppositions or variable syllable complexity; others are observations of prosodic behaviour, like the tendency, or lack of it, (a) to reduce the duration and spectral distinctiveness of unstressed syllables between accented ones and (b) to compensatorily shorten accented syllables as a function of the number of unstressed syllables following. The instrumental measures associated with this theoretical view (summarized above) have indeed shown that languages can be differentiated, and that they appear to divide up into groups containing languages that have traditionally been described as syllable-timed or stress-timed.
However, the wide range of values across languages belonging to the “same” rhythmic group, the assumption of “mixed” rhythm types, and the conflicting positioning of the same language from one study to another within the language selection examined cast doubt both on the validity and on the “rhythmic” basis of the distinctions.
3.3. Rhythm in L2
In connection with the teaching and acquisition of a correct Rhythm in a foreign language, the first question could well be whether the discussion so far has any relevance at all.
Most teachers probably associate the idea of Rhythm with the regular beats discussed and rejected as a normal phenomenon in non-poetic speech (section 3.1). A well-established German programme for young French learners (Andreas Fischer: www.phonetik-atelier.de), which makes explicit use of Rhythm as an integral part of teaching active speech production, in fact uses rhythmic movement, simple rhythm instruments and silent beats to help the learners to produce regular accent intervals and avoid the perceptually much more equal weight attached to consecutive syllables in French. On the other hand, a more complex and analytic view of rhythm in speech production is presented by Stock & Veličkova (2002, cf. also Veličkova 1990, 1993). Equally concerned with the practicalities of teaching and learning (with the focus on adult learners), they acknowledge the persistence of isochrony as the established view while discussing the possible bi-directional interpretation of Rhythm (a) as the determinant of the segmental and prosodic properties associated with the rhythm-typology divisions and (b) as the product of those properties (cf. also Krull and Engstrand 2003). The use in teaching of gestural support for segmental properties which affect the overall prosodic pattern of a phrase (e.g. vowel length in stressed syllables, cf. Veličkova 1990, 1993) underlines the recognition that properties from all levels of language structure contribute to a gestalt-like experience of Rhythm in an utterance. They consequently deal with rhythmic patterns of phrases, both as stand-alone expressions and sequenced within texts, which reflect the lexical stress patterns of the words used and the information weighting of those words within the context. The question of isochrony in stress-timed languages hardly arises because many phrases, even within a longer text, do not exceed two accents, even if they contain more than two accentable words.
The example given by Stock and Veličkova (2002: 29) may serve as an illustration of how the number of accents can vary for a given word sequence (one mark per foot, one variant per row):
// manche kol / legen / wissen das aber / nicht //
(1) 0 0 X 0
(2) X 0 X 0
(3) X X X 0
(4) X X X X
(The accents, marked with X, can vary from one to four. Only the fourth variant, with all four feet accented, deviates from the default nuclear position (wissen). Isochrony is testable in the fourth variant and, in a more limited way, in the third, by excluding the last foot from the metrical frame.)
Putting it at its simplest and most extreme, we can say: Every utterance has its own particular “Rhythm-pattern”, determined by the relative communicative weight attributed (by the particular speaker) to the particular words within the particular syntactic structure within the particular communicative context. We thus define Rhythm as the situation- and utterance-dependent pattern of prominences and shall use the term “prominence pattern” instead of Rhythm from now on. This information-based view allows for a considerable amount of variation in the “rhythmic” realisation of any given sequence of words, and underlines the importance of the teaching maxim that nothing should be taught without contextualisation. However, given the situation and the linguistic pre-context, and a not-too-eccentric speaker, the actual degree of freedom is much smaller. The choice of which and how many words to make communicatively prominent, as well as the relative prominence of the accented words, is fairly strictly delimited. Figure 6 shows the tight syllable-duration clustering for the main accents in the sentence “Heute morgen bin ich zu spät aufgestanden” (I got up too late this morning) spoken in an unmarked manner, as if introducing a story. For the unaccented words there is more individual variation. In unaccented multi-syllabic words, the lexical-stress pattern for citation-form production can disappear, be dynamically (but not tonally) retained, or even shifted, depending on the language (e.g. its status within traditional rhythm-typology classes). Above all, the treatment of unstressed syllables will depend on the language, though in spontaneous speech, the occurrence of elision and assimilation phenomena, particularly at word boundaries, appears to cut across traditional rhythm-typology differences (Barry and Andreeva 2001).
However, the general observation which is supposed to separate “stress-timed” from “syllable-timed” languages, namely the tendency to reduce unstressed syllables in the former and to retain their full phonetic identity in the latter, certainly has some language-differentiating validity. But the “reducing” languages do not all behave in the same way. As a comparison of English, Bulgarian and Russian on the one hand with German, Dutch and Swedish on the other (all of them “reducing” languages) will show, there are very different forms and degrees of reduction. Dutch, English, German and Swedish reduce the quantity of long vowels in unstressed position, but only English has systematic quality reduction of the vowels (towards schwa). Bulgarian and Russian do not have a long-short vowel opposition, which precludes quantity reduction, but they do (like English) have spectral change in unstressed vowels, albeit in a more complex manner than the general centralization tendency found in English.
[Figure 6 (plot not reproduced): segmentally normalized syllable duration (y-axis, 0.1 to 0.9) against syllable position 1 to 12 in the sentence „Heute morgen bin ich zu spät aufgestanden“ (x-axis), plotted separately for speakers AD, AK, MK, SO, SW and TH.]
Figure 6. “1-Heu 2-te 3-mor 4-gen 5-bin 6-ich 7-zu 8-spät 9-auf 10-ge 11-stan 12-den”. Articulation-rate-normalized syllable durations for 6 native speakers of German.
However, it is not only the fact that languages differ in their reduction patterns which leads to prominence-pattern deviations in the speech production of non-native speakers. Prosodic differences alone may lead to learners with “stress-timed” L1 reducing the articulatory effort invested in the (unstressed) syllables between accents when speaking a “syllable-timed” L2, and possibly introducing spurious (eurhythmic) prominences into sequences of more than two unstressed syllables (cf. Arvaniti 1994; Farnetani and Kori 1990). However, all learners, whatever their L1, tend to overarticulate in comparison to native speakers. This makes the step for a “syllable-timed” learner producing “stress-timed” utterances particularly difficult, although, as figure 7 shows, even experienced “stress-timed” learners of another “stress-timed” language (English, Russian) together with speakers of an assumed “syllable-timed” language (Korean) all tend to deviate from the average native-speaker pattern in a way which reflects too little differentiation of accented syllables and unstressed syllables (see note 6; cf. also Gut 2002, 2003; Benkwitz 2003).
[Figure 7 (plot not reproduced): segmentally normalized syllable duration, averaged per accent-strength category (y-axis, 0.2 to 0.7), against accent-strength category (x-axis: 1 = main; 2 = secondary; 3 = unstressed). Legend entries: English (bb), Korea 1 (kh-a), Korea 2 (kh-b), Russian (ot), German native-speaker average.]
Figure 7. “Heute morgen bin ich zu spät aufgestanden”. Articulation-rate-normalized syllable durations for 5 L2-speakers of German in comparison to the average native-speaker durational pattern.
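The comparison shown in Figure 7 amounts to averaging normalized syllable durations within each accent-strength category and asking how far apart the categories are kept. The following Python sketch illustrates the idea only: the duration values are invented, the normalization is assumed to have been carried out already, and the ratio used to summarize the degree of differentiation is a hypothetical choice made for this illustration, not a measure taken from the study.

```python
from collections import defaultdict
from statistics import mean

def category_means(syllables):
    """Average normalized duration per accent-strength category.

    `syllables` is a list of (category, normalized_duration) pairs,
    with 1 = main accent, 2 = secondary, 3 = unstressed.
    """
    by_category = defaultdict(list)
    for category, duration in syllables:
        by_category[category].append(duration)
    return {c: mean(d) for c, d in sorted(by_category.items())}

def differentiation(means):
    """Hypothetical summary: main-accent mean over unstressed mean."""
    return means[1] / means[3]

# Invented values for illustration only
native = [(1, 0.60), (3, 0.25), (1, 0.64), (2, 0.40), (3, 0.27)]
learner = [(1, 0.50), (3, 0.36), (1, 0.52), (2, 0.44), (3, 0.38)]

for label, data in (("native", native), ("learner", learner)):
    means = category_means(data)
    print(label, means, round(differentiation(means), 2))
```

A ratio closer to 1 for a learner than for the native-speaker average would correspond to the “too little differentiation” of accented and unstressed syllables described above.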
We see from the preceding discussion that some factors affecting the prominence pattern stem from segmental changes which (depending on the language) do or do not co-occur with stress- and accent-status. Since such segmental changes are unlikely to become established as an unconsciously absorbed corollary of explicit, holistic rhythmic speaking practice (even with young French children), they need to be dealt with specifically. Indeed (taking English as an example), (i) weak forms, (ii) final voiced consonants (with their longer preceding vowels), (iii) vowel-length and -quality contrasts, (iv) consonant-cluster reductions at word boundaries, etc. are all accepted points of pronunciation practice, whether the learner comes from an assumed “syllable-timed” or from a “stress-timed” language. The thesis postulated here is that the sum of these (essentially segmental) properties is the determining feature of an acceptable (prosodic) prominence pattern. Introducing the concept of foot-based isochrony (i.e. rhythmic regularity) on top of all these syllable-realization exercises is not only unnecessary, but also induces an element of stylization and artificiality which, if it actually becomes established in the learners’ production patterns, will have to be unlearned again.
4.
Conclusions
The assumption behind our discussion has been that the goal of pronunciation teaching is to make the learners aware of the nature of the task they have to practise. With regard to the Rhythm concept, awareness is most easily linked to the idea of isochrony, i.e. to a regular beat, traditionally considered to characterize so-called “stress-timed” languages (a regular beat of accented syllables) and “syllable-timed” languages (a regular syllabic beat). We maintain, however, that this is both unhelpful and misleading in the L2 teaching environment. Richard Cauldwell’s (2002: 1) recent summing up of the situation corresponds very much to the view we have tried to present in this paper:
Although the formal events of speech – phones, strong and weak syllables, words, phrases – occur ‘in time’ (they can be plotted on a time line) they do not occur ‘on time’, (they do not occur at equal time intervals). English is not stress-timed, French is not syllable-timed. The rare patches of rhythmicality are either ‘elected’ – as in scanning readings of poetry and the uttering of proverbs – or ‘coincidental’ – the side-effects of higher order choices made by speakers. Coincidental rhythmicality is most likely to occur where there are equal numbers of syllables between stresses. In spontaneous speech, the speaker’s attention is on planning and uttering selections of meaning in pursuit of their social-worldly purposes, and this results in an irrhythmic norm which aids comprehension.
Ulrike Gut (2003) retains the term “rhythmical” in her study of prosodic behaviour in a number of different learner-groups' production of L2 German. However, operationally, she breaks Rhythm down into durational and metrical characteristics, the latter being defined as the relative prominence of units such as syllables. It is debatable whether prominence can, ultimately, be separated from duration (cf. Kochanski et al. 2005, however, who consider “loudness” separately from “duration” as determinants of prominence) but this is irrelevant for teaching purposes. It is important that individually learnable properties of language be brought into focus – informationally important (prominent) words, informationally less important words (and syllables within multi-syllabic words), long and short vowels, spectrally reduced vowels, consonant elision, etc. The contextualized introduction and practice of these properties in an optimal sequence is, of course, a non-trivial task. But their command will lead to a globally correct prosody and, in time, to a sense of prosodic “rightness” for the particular communicative intention in the same way that learning verb or noun morphology and syntactic regularities will lead to a command of the correct form and sequence of words. In neither of these areas would one think of introducing teaching points by appealing to a sense of “Morphology” or “Syntax”. We suggest that the appeal to a general idea of “Rhythm” which is abstracted from the prominence pattern of the particular utterance is equally unproductive. The implication of the message conveyed by this discussion will no doubt annoy those teachers who would like a lot of different pronunciation problems to be covered by one “rhythmic blanket”. But the facts remain: Prosodic differences between languages – and our discussion leads to the conclusion that correct Rhythm is the sum of the communicatively correct (i.e. 
contextually and situationally correct) prosodic properties – are distributed over all levels of phonetic-phonological structure. Correct pronunciation cannot emerge from an appeal to an undefined blanket term. Knowledge and treatment of individual problems remain essential. The articulation of individual sounds, which can be “new sounds” (like /y/ for English speakers or /θ, ð/ for German speakers), combinations of familiar sounds (like /kn/ for English speakers), combinations of new with familiar sounds (alveolar non-sibilants followed by dental fricatives for most learners of English) or new distributional patterns (like final voiced consonants for Germans), leads to a slowing down of articulatory processes in their vicinity, which inevitably affects the overall prosodic pattern. Direct prosodic repercussions arise from differences in length oppositions between languages (whether L1 and L2 both have long and short vowels or long and short consonants) and intra-syllabic phonetic length relations (e.g. long consonants following short vowels and vice versa, as is the case in Swedish). Thus we see that a considerable amount of segmentally orientated pronunciation work, assuming that it is satisfactorily contextualized, contributes to rhythmically correct speech. At the level of prosodic or pronunciation practice, correct word-stress location is an obvious and fundamental contributor to the correct rhythmic identity of an utterance. But, as we identified in the discussion, even with correct stress location, the phonetic means of realizing the stress can be different from one language to another and thus distort the rhythmic impression. In Italian, for example, the vowels in open syllables are lengthened when stressed, and even more lengthened when given a topical or focal accent. This leads to the well-known rhythmic distortions of Italian speakers of other languages, but is, of course, also the source of rhythmic distortion for learners of Italian. French is also a language that exploits a large degree of syllabic lengthening for the informational or affective weighting of words at utterance level (despite the fact that, phonologically, French has neither phonemic vowel length distinctions nor word-stress). It must be clear from this short selection of rhythm-distorting problems that a global appeal to language rhythm as “stress-timed” or “syllable-timed” is of no advantage.
This does not mean, however, that learners of French should not be made aware of the fact that (outside the topical, focal or emphatic accents) syllables are all given as equal a weight as possible, and vowels are not reduced (a statement often associated with “syllable-timing”). Nor does it mean that learners of English should not be made to practise the reduction and temporal compression of unstressed and unaccented syllables in words and phrases (a statement often associated with “stress-timing”). It does mean that teachers need to be aware of a lot more differences between the respective L1s and L2s, of the problems that contribute to incorrect pronunciation in general, and to the incorrect rhythmic impression of utterances in particular.
Notes
1. In fact it is so difficult that many teachers neglect pronunciation because their own awareness has lagged way behind their expertise in other areas such as grammar and vocabulary. These have the advantage of being capturable in a permanent form – in writing – for post hoc consideration.
2. We use the word Rhythm throughout the paper (with a capital R) for the term we are discussing and calling in question as an independently identifiable phenomenon.
3. A universal ability to register tonal differences and types of tonal movement in speech should not be taken for granted, even if the universal ability (for the non-handicapped) to communicate with speech might make us assume it. How absolutely necessary the decoding of tonal structure is for successful (contextualized) speech communication has not been investigated, and tonal structure is accompanied by several other signal properties, as we have already shown in figs 1 and 2. This suggests the possibility of compensatory decoding, i.e., making use of other than tonal properties for speaker-hearers insensitive to tone.
4. The fig. 2b version implies that the speaker is confirming that the fact of the other person’s agreement corresponded to his/her (i.e. the speaker’s) assumption. The fig. 4 version implies that the speaker’s previous assumption of agreement by the other person might not be true; it expresses some degree of protest.
5. Quoted from Abercrombie (1967), p. 171, endnote 7.
6. The three accent-strength categories over which syllable durations were calculated are: (i) tonally prominent accented syllables, (ii) unaccented syllables without vocalic reduction and (iii) unstressed syllables.
References
Abercrombie, David
1967 Elements of General Phonetics. Edinburgh: Edinburgh University Press.
Arvaniti, Amalia
1994 Acoustic features of Greek rhythmic structure. Journal of Phonetics 22, 239–268.
Barry, William and Bistra Andreeva
2001 Cross-language similarities and differences in spontaneous speech patterns. Journal of the International Phonetic Association 31, 51–66.
Barry, William, Bistra Andreeva, Michela Russo, Snezhina Dimitrova and Tania Kostadinova
2003 Do rhythm measures tell us anything about language type? Proceedings of 15th International Congress of Phonetic Sciences, Barcelona, Vol. 3, 2693–2696.
Benkwitz, Anneliese
2004 Kontrastive phonetische Untersuchungen zum Rhythmus. (Hallesche Schriften zur Sprechwissenschaft und Phonetik 14). Frankfurt am Main etc.: Peter Lang.
Bertinetto, Pier Marco
1981 Strutture prosodiche dell’italiano. Firenze: Accademia della Crusca.
1989 Reflections on the dichotomy ‘stress’ vs. ‘syllable-timing’. Revue de Phonétique Appliquée 91-92-93, 99–130.
Bertinetto, Pier Marco and Carol Fowler
1989 On the sensitivity to durational modifications in Italian and English. Rivista di Linguistica 1, 69–94.
Bloch, Bernard
1950 Studies in colloquial Japanese IV: Phonemics. Language 26, 86–125.
Cauldwell, Richard
2002 The functional irrhythmicality of spontaneous speech: A discourse view of speech rhythms. Applied Language Studies: Apples 2,1, 1–24.
Crowder, Robert G. and John Morton
1969 Precategorical acoustic storage (PAS). Perception and Psychophysics 5, 363–73.
Crowder, Robert G.
1993 Short-term memory: Where do we stand? Memory and Cognition 21, 14–145.
Cummins, Fred and Robert F. Port
1998 Rhythmic constraints on stress timing in English. Journal of Phonetics 26, 145–171.
Cutler, Anne, Jacques Mehler, Dennis G. Norris and Juan Seguí
1986 The syllable’s differing role in the segmentation of French and English. Journal of Memory and Language 25, 385–400.
Cutler, Anne and Takashi Otake
1994 Mora or phoneme? Further evidence for language-specific listening. Journal of Memory and Language 33, 824–844.
Cutler, Anne
1997 The syllable’s role in the segmentation of stress languages. Language and Cognitive Processes 12, 839–845.
Cutler, Anne, Lalita Murty and Takashi Otake
2003 Rhythmic similarity effect in non-native listening? Proceedings of the 15th International Congress of Phonetic Sciences, Barcelona, Vol. 1, 29–332.
Dauer, Rebecca M.
1983 Stress-timing and syllable-timing reanalyzed. Journal of Phonetics 11, 51–62.
1987 Phonetic and phonological components of language rhythm. Proceedings of the 11th International Congress of Phonetic Sciences, Tallinn (Estonia), Vol. 5, 447–450.
Dellwo, Volker and Petra Wagner
2003 Relations between language rhythm and speech rate. Proceedings of the 15th International Congress of Phonetic Sciences, Barcelona, Vol. 1, 471–474.
Dellwo, Volker, Ingmar Steiner, Bianca Aschenberner, Jana Dankovičová and Petra Wagner
2004 The BonnTempo-Corpus & BonnTempo-Tools: A database for the study of speech rhythm and rate. Proceedings of the 8th International Congress of Speech and Language Processing. ICSLP, Jeju Island (Korea), 777–780.
Engstrand, Olle and Diana Krull
2001 Simplification of phonotactic structures in unscripted Swedish. Journal of the International Phonetic Association 31, 41–50.
Farnetani, Edda and Shiro Kori
1990 Rhythmic structure in Italian noun phrases: A study on vowel duration. Phonetica 47, 50–65.
Gelder, Beatrice de and Jean Vroomen
1997 Modality effects in immediate recall of verbal and non-verbal information. European Journal of Cognitive Psychology 9(1), 97–110.
Grabe, Esther and EeLing Low
2002 Durational variability in speech and the rhythm class hypothesis. In: Carlos Gussenhoven and Natasha Warner (eds.) Papers in Laboratory Phonology VII, 515–546, Berlin, New York: Mouton de Gruyter.
Gibbon, Dafydd and Ulrike Gut
2001 Measuring speech rhythm. Proceedings of Eurospeech 2001, Aalborg (Denmark), 91–94.
Gibbon, Dafydd
2003 Computational modelling of rhythm as alternation, iteration and hierarchy. Proceedings of the 15th International Congress of Phonetic Sciences, Barcelona, Vol. 3, 2489–2492.
Gut, Ulrike
2003 Prosody in second language speech production: the role of the native language. Zeitschrift für Fremdsprachen Lehren und Lernen 32, 133–152.
Hazan, Valerie
2002 L'apprentissage des langues. Proceedings of XXIVemes Journees d'etude de la parole, Nancy, 1–5.
Hoequist, Charles J.
1983a Durational correlates of linguistic rhythm categories. Phonetica 40, 19–31.
1983b Syllable duration in stress-, syllable- and mora-timed languages. Phonetica 40, 203–237.
Kallman, Howard J. and Dominic W. Massaro
1983 Backward masking, the suffix effect, and preperceptual storage. Journal of Experimental Psychology: Learning, Memory, and Cognition 9, 312–327.
Kochanski, Greg, Esther Grabe, John Coleman and Bert Rosner
2005 Loudness predicts prominence: fundamental frequency lends little. Journal of the Acoustical Society of America 118, 1038–1054.
Krull, Diana and Olle Engstrand
2003 Speech rhythm – intention or consequence? Cross-language observations on the hyper-hypo dimension. PHONUM 9, 133–136.
Ladefoged, Peter
1975 A Course in Phonetics. New York: Harcourt Brace Jovanovich.
Lloyd James, Arthur
1940 Speech Signals in Telephony. London: Sir I. Pitman & Sons.
Markham, Duncan
1997 Phonetic Imitation, Accent, and the Learner. Lund: Lund University Press.
Massaro, Dominic W.
1972 Preperceptual images, processing time, and perceptual units in auditory perception. Psychological Review 79, 124–145.
Mehler, J., J.-Y. Dommergues, U. Frauenfelder and J. Seguí
1981 The syllable’s role in speech segmentation. Journal of Verbal Learning and Verbal Behaviour 20, 298–305.
Otake, Takashi, Giyoo Hatano, Anne Cutler and Jacques Mehler
1993 Mora or syllable? Speech segmentation in Japanese. Journal of Memory and Language 32, 358–378.
Palmeri, Thomas J., Stephen D. Goldinger and David B. Pisoni
1993 Episodic encoding of voice attributes and recognition memory for spoken words. Journal of Experimental Psychology: Learning, Memory, and Cognition 19, 309–328.
Pike, Kenneth L.
1946 The Intonation of American English. Ann Arbor: University of Michigan Press.
Quené, Hugo and Robert F. Port
2005 Effects of timing regularity and metrical expectancy on spoken-word perception. Phonetica 62, 1–13.
Ramus, Franck, Marina Nespor and Jacques Mehler
1999 Correlates of linguistic rhythm in the speech signal. Cognition 73, 265–292.
Roach, Peter
1982 On the distinction between “stress-timed” and “syllable-timed” languages. In: David Crystal (ed.), Linguistic Controversies, 73–79, London: Edward Arnold.
Steiner, Ingmar
2005 On the analysis of speech rhythm through acoustic parameters. In: Bernhard Fisseni, Hans-Christian Schmitz, Bernhard Schröder and Petra Wagner (eds.) Sprachtechnologie, mobile Kommunikation und linguistische Ressourcen: Beiträge zur GLDV-Tagung 2005 in Bonn. (Computer Studies in Language and Speech 8). 647–658, Frankfurt/Main: Peter Lang.
Stock, Eberhard and Ludmila Veličkova
2002 Sprechrhythmus im Russischen und Deutschen. (Hallesche Schriften zur Sprechwissenschaft und Phonetik 8), Frankfurt/M.: Peter Lang.
Veličkova, Ludmila
1990 Untersuchungen zur Theorie und Praxis des Phonetikunterrichts. Habilitationsschrift, Halle.
1993 Die Vermittlung phonologischer Distinktionen mit einem Gestensystem. Deutsch als Fremdsprache 30, 253–258.
Wagner, Petra and Volker Dellwo
2004 Introducing YARD (Yet Another Rhythm Determination) and reintroducing isochrony to rhythm research. Proceedings of Speech Prosody, Nara (Japan), 227–230.
Temporal patterns in Norwegian as L2
Wim A. van Dommelen
1.
Introduction
One of the fundamental properties of spoken language is that it, like all physical events, extends over time. Consequently, much of the last decades’ research in phonetics has been devoted to the investigation of the temporal organization of speech. An issue that has evoked much debate and stimulated empirical research concerns rhythmical differences between languages. For a discussion of the traditional classification of languages into stress-timed, syllable-timed and mora-timed, see Barry (this volume; cf. also Dauer, 1983). In spite of numerous research efforts devoted to issues of speech timing in general and of language-specific timing in particular, our knowledge and understanding are far from complete. This is true for temporal aspects of language spoken by native speakers (L1 speech) and, even more so, for L2 speech.
The approach chosen for the present study is to analyze temporal aspects of Norwegian as a second language spoken by speakers from various language backgrounds in comparison with native Norwegian. The rationale behind this is that we hope to obtain information not only about what deviations from the L1 standard occur but also about whether such deviations pattern in language-specific ways.
Our study consists of two main parts. Following the description of the subjects, speech material and segmentation in Section 2, the first part (Section 3) deals with the temporal structure of dyads consisting of a vowel followed by a consonant. A special property of the Norwegian phonological system is the quantity system, which involves not only the vowels but also the consonants. In a stressed syllable, the vowel can be either short or long, while unstressed syllables can contain only short vowels. Consonants in stressed syllables have a complementary distribution of duration, being long after a short vowel (e.g. in matte [ˈmatːə] ‘mat’) and short following a long vowel (e.g. in mate [ˈmaːtə] ‘[to] feed’).
The phonological specification of the VC: vs. V:C opposition has been the subject of some debate, the question being whether we are dealing with a vowel or a consonant quantity opposition (cf. Kristoffersen, 2000: 116–120). From a phonetic viewpoint it seems reasonable to argue that the vowel is the carrier of the quantity opposition. Previous investigations have shown that the ratio V:/V is considerably larger than the C:/C ratio. Fintoft (1961) measured vowel and consonant durations in isolated Norwegian logatomes. He reported a V:/V ratio of approximately 1.9 (varying between 1.7 and 2.1 depending on the nature of the following consonant as a fricative, nasal or liquid). In contrast, the duration ratio of medial long vs. short consonants amounted only to approximately 1.3 (varying between 1.2 and 1.4). Quite similar relations were found by Behne, Moxness and Nyland (1996) through their measurements of the durations of long and short vowels preceding voiced and voiceless plosives in Norwegian sentence-embedded words. From the data presented in their Figure 1, average duration ratios (pooled across voiced and voiceless plosives) of 1.8 and 1.3 can be calculated for V:/V and C:/C, respectively. Also, results on the perception of a long vs. short vowel followed by a voiceless stop by van Dommelen (1999a) suggest that vowel duration is a far more important cue for the perception of vowel quantity than the consonant (cf. also Krull, Traunmüller and van Dommelen, 2003).
In our study we thus address the question of how users of Norwegian as a second language realize the VC: and V:C dyads. A point of particular interest will be the kind and amount of variation in the L2 productions. If the variation in vowel and consonant durations is relatively limited, we might be able to detect deviation patterns that are characteristic for L2 user groups from specific language backgrounds. Larger variation, on the other hand, could obscure such possible patterns and render it difficult to draw firm conclusions about typical deviations from Norwegian reference values and differences between the realizations from the L2 speaker groups.
The second part of our investigation (Section 4) is concerned with speech rhythm in L2 compared with L1 speech. In recent studies attempts have been made to classify languages according to rhythmical categories using various metrics. To investigate rhythm characteristics of eight languages, Ramus, Nespor and Mehler (1999) calculated the average proportion of vocalic intervals and the standard deviation of vocalic and consonantal intervals over sentences. Though their metrics appeared to reflect aspects of rhythmic structure, considerable overlap was also found. Grabe’s Pairwise Variability Index (PVI; see Section 4.1) is a measure of differences in vowel duration between successive syllables and has been used by, e.g., Grabe and Low (2002), Ramus (2002) and Stockmal, Markus and Bond (2005). In order to achieve more reliable results Barry et al. (2003) proposed to extend existing PVI measures by taking consonant and vowel intervals together. In her 2003 study Gut compared the speech of learners of German with English, Chinese and Romance languages as L1 with the speech of two native speakers of German. For utterances produced by these speakers she used a Rhythm Ratio (RR) to explore the temporal organization of subsequent syllables. Though the Romance language speakers produced syllables that tended to be of more similar duration than those from the German speakers, the difference did not achieve statistical significance. The RR values for the English and Chinese subjects did not differ significantly either.
The present study approaches the issue of language-specific speech rhythm indirectly by comparing the temporal structure of L2 utterances with similar utterances produced by native speakers. More specifically, we will use different measures derived from the sequences of syllables in utterances and use a discriminant analysis to explore whether those measures can be related to the different L2 groups investigated. For the present purposes, the main function of a discriminant analysis is the following. The first step is to define a number of variables (here mean syllable duration, durational differences between consecutive syllables, etc.; a complete description is given in Section 4.1). Secondly, these variables are entered into the analysis together with an a priori classification which in our case represents the six different groups of L2 users. The discriminant analysis then uses the variables to classify the input data into groups, importantly without any prior information about the predefined groups. The output of the analysis tells the user which of the variables entered into the analysis contributed significantly to the statistical grouping.
The most interesting question for us is to what degree the purely statistical grouping of the data agrees with the user-defined classification according to L2 user groups. A reasonably large degree of agreement between the two classifications will indicate that the chosen measures capture relevant aspects of L1-influenced speech rhythm.
2.
Subjects, speech material and segmentation
A total of 37 subjects served as speakers in this study, divided into the following seven groups. There were six second language speaker groups with the following L1s (number of speakers in parentheses): Chinese (7), English (4), French (6), German (4), Persian (6) and Russian (4). In an attempt
to collect speakers having approximately the same level of proficiency in Norwegian, most of the speakers were recruited from a Norwegian course offered at the Department of Language and Communication Studies (NTNU, Trondheim). Six native speakers of Norwegian served as a control group.
The speech material used was chosen from existing recordings made for the Language Encounters project (see Acknowledgement). The recordings were made in the department’s sound-insulated studio and were subsequently stored with a sampling frequency of 44.1 kHz. The material comprises readings of a short text, 120 different sentences and some spontaneous speech. Since the sentences were designed to contain all the Norwegian phonemes and relevant VC: and V:C dyads, this part of the material was considered most suited for a systematic investigation and, therefore, a number of sentences were selected for the present study. For the first part of our investigation, eight different sentences were chosen containing words with the short vowels /a/ and /ø/ (two sentences each) and their long counterparts /a:/ and /ø:/ (two sentences each), all followed by the voiceless plosive /t/. For the second part, ten different sentences were selected containing between 9 and 15 syllables (mean of 12.2 syllables). The total number of utterances investigated was thus 37 (subjects) x 8 (utterances) = 296 for the first part, and 37 x 10 = 370 for the second part.
The segmentation of the speech material was done by visual and auditory inspection of the waveform and the spectrogram of the speech signal using Praat (Boersma and Weenink, 2006). Figure 1 shows an example of how vowel and consonant durations were measured for part 1. The test word under scrutiny is møtte ([ˈmøtːə] ‘met’). Determining the starting point of the VC: dyad (the transition from the nasal to the vowel) is a relatively straightforward task. The end of the dyad was set at the beginning of the schwa, i.e. at the end of the postaspiration. In contrast to these two points in time, defining the exact end of the vowel (= the start of the intervocalic plosive) is not a trivial task. The Norwegian speaker shown in the figure has produced preaspiration, the realization of which can vary but which is here characterized by a short vowel portion with breathy voice quality followed by a voiceless friction phase. Segmentation of the present speech material always followed the convention illustrated here, i.e. defining preaspiration (if any) as the sum of the breathy part and the friction (both of which can be absent).
[Figure 1: waveform and spectrogram with segments labelled 1–4; frequency axis 0–5000 Hz, time axis 0.567–0.827 s]
Figure 1. Waveform (top) and spectrogram (bottom) of the word møtte ([ˈmøtːə]) spoken by a female Norwegian subject. Indicated are (1) vowel, (2) preaspiration (breathy vowel + voiceless friction), (3) occlusion, and (4) postaspiration.
The segmentation for part 2 consisted of dividing the utterances into syllables and determining their durations. Syllabification was guided primarily by the aim of achieving consistent results across speakers and utterances. In words containing a long vowel followed by a short consonant in a V:CV context (e.g., fine [ˈfiːnə] ‘nice’), the boundary was placed before the consonant (yielding fi-ne); in words with a short vowel plus long consonant, as in minne ([ˈminːə] ‘memory’), it was placed after the consonant (minn-e). Only when the intervocalic consonant was a voiceless plosive was the boundary always placed after the consonant (e.g. in mat-et ‘fed’).
3.
Duration patterns of vowels and consonants
This section describes the results from the first part of our investigation, dealing with the temporal structure of VC: and V:C dyads as produced by both L2 and L1 speakers of Norwegian. After the inspection of mean durations in 3.1, Section 3.2 looks into the phenomenon of preaspiration, which was not only produced by the Norwegian natives but also by a group of L2
users. In Section 3.3 the problem of variation is dealt with and it is argued that the interpretation of the empirical data to a large degree depends on the perspective chosen for the evaluation.
3.1. Mean vowel and consonant durations
Figure 2 depicts mean segment durations for VC: (Figure 2a) and V:C (Figure 2b) dyads as produced by the six groups of L2 users and the Norwegian control group. As the first thing of note for the latter group we observed a relatively long preaspiration (breathy vowel + friction) in both VC: (35 ms) and V:C (29 ms). Traditionally, the occurrence of preaspiration is considered to be restricted to a few dialectal variants, our present speakers (having dialect backgrounds from South-East Norwegian and the Trøndelag region) not belonging to them. Our data therefore suggest that preaspiration occurs more frequently in Norwegian than is usually assumed and they confirm similar findings from previous studies (van Dommelen, 1999b). Further, calculation of duration ratios for V:/V and C:/C (where preaspiration is included in the consonant) yielded values of 2.24 and 1.19, respectively. Fintoft’s (1961) material did not include postvocalic stops (see Section 1) so that a direct comparison with his results is precluded. (In this connection it should be noted that preaspiration only occurs with voiceless stops.) By and large, Fintoft’s mean values of 1.9 for V:/V and 1.3 for C:/C can be said to be not too different. The same conclusion can be drawn concerning Behne, Moxness and Nyland’s (1996) ratios of 1.8 and 1.3 (averaged across voiced and voiceless plosives; see Section 1). In their description no mention is made of the occurrence of preaspiration, so that their data are inconclusive in this respect.
As to the productions by the L2 speakers, Figure 2 shows that the deviations of the vowel and consonant durations from the L1 reference values are not as large as might have been expected. This is especially true for the V:C dyad.
A more systematic pattern was found for the short vowel, which was produced with longer durations than the Norwegian mean value by all L2 speaker groups. To see how the second language users master the V:/V quantity opposition, let us compare their V:/V ratios with the value measured for the reference group, which amounts to 2.24 (i.e., for V excluding preaspiration). While the German, English and Russian speakers had relatively high ratios (1.53, 1.38, and 1.38, respectively), the remaining groups (Chinese, French, and Persian) had less clear durational contrasts (values of 1.07, 1.08, and 1.04, respectively).
[Figure 2: stacked bar charts, panels (a) and (b); y-axis: Duration (ms), 0–350; x-axis: speaker groups Ch, En, Fr, Ge, Pe, Ru, No]
Figure 2. Mean segment durations (in ms) in words containing /a(:)/ and /ø(:)/ followed by a voiceless plosive spoken by different speaker groups (Chinese, English, French, German, Persian, Russian, and Norwegian). (a): V, preaspiration (PA) and C:; (b): V:, preaspiration, C.
It may seem somewhat surprising that the Russian speakers pattern with the German and English groups, although Russian lacks the vowel quantity/tenseness contrast present in the L1s of the latter. An explanation could be sought in the Russian word stress system, where vowels are lengthened when stressed (cf. Svetozarova 1998). This means that the speakers are at least familiar with conditioned vowel duration. In contrast to the apparently rather regular behaviour of the present Russian subjects, Markus and Bond (1999) report difficulties of Russian talkers in employing duration as a correlate of vowel quantity in Latvian. Similarly, the Russian L2 speakers of Latvian in Bond, Markus and Stockmal (2003) inappropriately produced short vowels with lengthening and failed to reach appropriate durations for long vowels. Prompted by these seemingly diverging results, we inspected our present data more closely. This inspection showed that the behaviour of the Russian speakers as a group is less appropriate after all. One of the subjects produced very long V: durations (mean value of 245 ms) while the other three Russian speakers produced much shorter durations (mean value of 113 ms). Given respective mean durations of 131 ms and 97 ms for the short vowels, the V:/V ratio was remarkably high for the former (1.87) and much lower for the latter (1.16). This result thus demonstrates the issue of variation within a group and so it may be worthwhile to have a closer look at the data and to investigate to what extent variation of individual segment durations can tell us more about L2 performance. Before doing this, however, we will focus on the transition of the vowel into the stop as produced by the English subjects, i.e. preaspiration.
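The two Russian subgroup ratios quoted above are simple quotients of mean durations. As a minimal sketch (the function name `quantity_ratio` is our own label for illustration, not part of the study), they can be rechecked from the reported means:

```python
def quantity_ratio(long_ms, short_ms):
    """Return the long/short duration ratio, e.g. V:/V."""
    if short_ms <= 0:
        raise ValueError("short duration must be positive")
    return long_ms / short_ms


# Mean vowel durations (ms) reported in the text for the Russian group.
one_speaker = quantity_ratio(245, 131)   # the subject with very long V:
other_three = quantity_ratio(113, 97)    # the remaining three subjects

print(round(one_speaker, 2), round(other_three, 2))  # 1.87 1.16
```

Because both inputs are already group or speaker means, such a ratio says nothing about token-level spread, which is the topic of Section 3.3.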
128
Wim A. van Dommelen
3.2. Preaspiration
A detail deserving our attention is the fact that the speakers with an English background produced relatively strong preaspiration in both VC: and V:C context (with respective mean durations of 62 ms and 50 ms even longer than the productions of the Norwegian control group; 35 ms and 29 ms, respectively). For some other groups (French, German, Persian, and Russian) short preaspiration portions were measured, but their short durations indicate that we are dealing with presumably physiologically conditioned transitions from the vowel into the stop. For the English speakers, however, the question is how to explain their substantial production of preaspiration. According to impressionistic observation, the proficiency in Norwegian pronunciation of this speaker group was not notably higher than for most of the other L2 groups. It seems, therefore, improbable that the English group had acquired the production of preaspiration through intensive learning and contact with Norwegian. According to occasional inspection of English speech material, preaspiration appears to occur in this language as well. Interestingly, the production of preaspiration seems to have gone unnoticed in the literature. At least, investigation of some textbooks on English phonetics reveals that the feature of preaspiration is not mentioned and, therefore, does not belong to the catalogue of relevant characteristics. For example, in her workbook on the pronunciation of English, Kenworthy (2000) deals with aspiration, but not with preaspiration. The same is true for the introduction to phonetic science by Ashby and Maidment (2005). Ladefoged and Maddieson (1996:70–73) discuss the occurrence of preaspiration in such well-known examples as Icelandic, Scottish Gaelic and Faroese but are silent on the (possible) production of preaspiration in English. Also the very detailed account of spoken English by Shockey (2003) only includes aspiration.
Further, the absence of preaspiration (in contrast to postaspiration) in the textbook on English phonetics for Norwegian students by Davidsen-Nielsen (1996) suggests that this phenomenon does not have a very prominent position in practical phonetics in Norway. Based on the present results it might seem worthwhile for future research to have a closer look at the occurrence of preaspiration in English. It is not impossible that this feature plays a certain role in English speech sound production but until now has escaped our notice.
3.3. The problem of variation
Inspection of mean segment durations can tell us a good deal about how second language users master the durational contrast long/short vowel in V:C vs. VC: dyads. But, as indicated above in Section 1, we will also have to take into account that individual tokens will to a lesser or larger extent vary around the mean. Rather differently distributed duration values can result in the same average. Table 1 gives one of the most usual measures describing dispersion, namely standard deviation. (This measure was not included in Figure 2 in order to avoid overloading the picture.) It can be seen from the table that the native speakers produced durations with relatively small variations (standard deviation for the vowels on average 15 ms, for the consonants 32 ms). For the group of L2 speakers as a whole higher values were found (pooled across the six groups 36 ms and 46 ms, respectively). Taking averages pooled across all four conditions of long/short vowel and long/short consonant as a measure, the German group was most consistent in their productions (mean standard deviation of 24 ms), while the mean values for the other groups were rather similar to each other (lying between 42 ms for the French and 48 ms for the Chinese subjects). One might wonder whether this rather strong degree of similarity can be interpreted as similar L2 behaviour or whether other perspectives could supply us with additional useful information.
Table 1.
Standard deviations (in ms) for vowels and consonants in words containing /a(:)/ and /ø(:)/ followed by a voiceless plosive spoken by different speaker groups. n = number of tokens.
        Chinese  English  French  German  Persian  Russian  Norwegian
n          28       16      24      16       24       16        24
V:         48       26      37      20       36       67        20
C          45       62      44      20       45       23        28
V          38       27      25      30       41       38        10
C:         61       55      61      27       51       53        34
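The per-group averages discussed in Section 3.3 can be recomputed from the rounded entries of Table 1. The sketch below is only a consistency check under the assumption that a plain unweighted mean over the four conditions was used; because the table entries are themselves rounded, results may differ from the figures in the text by about a millisecond.

```python
# Standard deviations (ms) from Table 1, in the row order V:, C, V, C:.
TABLE1_SD = {
    "Chinese":   [48, 45, 38, 61],
    "English":   [26, 62, 27, 55],
    "French":    [37, 44, 25, 61],
    "German":    [20, 20, 30, 27],
    "Persian":   [36, 45, 41, 51],
    "Russian":   [67, 23, 38, 53],
    "Norwegian": [20, 28, 10, 34],
}


def mean(values):
    """Unweighted arithmetic mean."""
    return sum(values) / len(values)


# Mean standard deviation per group across the four conditions.
group_means = {group: mean(sds) for group, sds in TABLE1_SD.items()}

for group, value in group_means.items():
    print(f"{group:10s} {value:5.2f}")
```

For the German, French and Chinese groups this gives 24.25, 41.75 and 48.00 ms, in line with the rounded values of 24, 42 and 48 ms quoted in the text.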
[Figure 3: four scatter plots, panels (a) Norwegian, (b) Chinese, (c) French, (d) German; x-axis: vowel duration (ms), 0–300; y-axis: consonant duration (ms), 0–350; separate symbols for V:C and VC:]
Figure 3. Vowel and consonant durations (in ms) in words containing /a(:)/ and /ø(:)/ followed by a voiceless plosive spoken by native speakers of (a) Norwegian (vowel duration does not include preaspiration), (b) Chinese, (c) French, and (d) German. Each data point represents one token.
To answer this question we will demonstrate how one can obtain a more informative impression of variation in production through a graphic representation depicting the durational relationships of vowels and consonants in VC: and V:C. Figure 3 illustrates this for a selection of four speaker groups (native speakers of Norwegian, Chinese, French, and German). As can be seen from Figure 3a, the durations of the segments in the V:C and VC: dyads produced by the Norwegian speakers fall into distinct categories. This is more than could have been expected, because the test words occurred in different positions in the utterances (utterance-medial and final) and in the evaluation no attempt has been made to normalize for speech rate. Further, it can easily be seen that the main durational correlate of the VC: - V:C opposition is the vowel. Consonant durations for the two members of the opposition pair overlap to a large degree. In stark contrast with the two distinct categories found for the natives, the Chinese speakers’ performance is characterized by almost complete overlap (Figure 3b). There is almost no distinction between the durations of V and V: or between those of C: and C. Though the values for the French group (Figure 3c) show less overlap, these speakers did not realize clearly distinct categories either. Presumably due to the lack of a vowel quantity opposition in French, both V and V: have relatively short durations. At the same time, consonant duration is not being used to distinguish between the two dyads. Finally, the German speakers (Figure 3d) handle the VC: - V:C distinction more like the Norwegian natives. In spite of some overlap in vowel durations, a tendency to distinguish two categories can be noticed.
4.
Quantification of rhythm
This section deals with the second part of our study of timing in L2 speech production, namely the question whether speakers from different language backgrounds produce different speech rhythms and whether typical rhythmical properties can be quantified. To that aim, Section 4.1 presents seven measures related to speech rhythm that have been used in a discriminant analysis. Section 4.2 presents the results for a central measure, mean syllable duration. In the last section (4.3) the results of a discriminant analysis are presented showing that aspects of speech rhythm can in fact be captured by some of the measures presented here.
4.1. Definition of measures
To compare the temporal structure of the L2 utterances with the L1 reference utterances, seven different types of measures were defined. In all cases calculations were related to each of the seven groups of speakers as a whole. The first measure was syllable duration averaged over all syllables of each utterance, yielding one mean syllable duration for each sentence and each speaker group, i.e. 7 (groups) x 10 (utterances) = 70 mean syllable durations in total. (For all measures used in the discriminant analysis the total number of observations is n = 70.) Second, the standard deviation for the syllable durations pooled over the speakers of each group was calculated for each of the single utterances’ syllables. The mean standard deviation was then taken as the second measure, thus expressing mean variation of syllable durations across each utterance. Figure 4 illustrates this for the 10-syllable sentence To barn matet de tamme dyrene (‘Two children fed the tame animals’) produced by the Chinese and the Norwegian speaker group. In this figure, vertical bars indicate ± 1 standard deviation. The
mean of the ten standard deviation values represents the second measure as defined above (for Norwegian 27 ms; for Chinese 63 ms). Figure 4 may also serve as an example illustrating the definition of the third and fourth measure. For the Norwegian reference group, mean syllable durations are indicated by closed symbols and ranked in ascending order. Similarly, open symbols depict the durations for the same syllables produced by the group of seven Chinese speakers. Note that the order of the syllables is the same as for the Norwegian natives. Also indicated are regression lines fitted to the two groups of data points. The correlation coefficient for the relation between syllable duration and the rank number of the syllables as defined by the Norwegian reference is the third measure in this study. The higher this correlation coefficient, the better agreement between the overall temporal organization of the syllables and the Norwegian reference. For the Chinese speaker group presented in the figure the value is relatively low: r = 0.541. Further, the slope of the regression line was taken as the fourth measure (here: 18.7). As illustrated in Figure 4, the measures three and four will contain information about the joint duration pattern of the syllables in an utterance. In the example it is obvious that the pattern produced by the Chinese subjects is rather different from the Norwegian reference.
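The third and fourth measures can be sketched as follows. The syllables are ranked by their mean duration in the reference group, and the durations of the group under study are then correlated with, and regressed on, those ranks. The function names and the four-syllable durations below are hypothetical and serve only as an illustration; whether the original analysis computed the coefficients in exactly this way is an assumption.

```python
from math import sqrt


def ranks_from_reference(reference_ms):
    """Rank syllables 1..n by ascending duration in the reference group."""
    order = sorted(range(len(reference_ms)), key=lambda i: reference_ms[i])
    ranks = [0] * len(reference_ms)
    for rank, index in enumerate(order, start=1):
        ranks[index] = rank
    return ranks


def correlation_and_slope(x, y):
    """Return Pearson's r and the least-squares slope of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy / sqrt(sxx * syy), sxy / sxx


# Hypothetical mean syllable durations (ms), NOT data from the study.
reference = [140, 100, 160, 120]   # L1 group, syllables in utterance order
learners = [210, 190, 180, 230]    # L2 group, same syllables

ranks = ranks_from_reference(reference)             # [3, 1, 4, 2]
r, slope = correlation_and_slope(ranks, learners)   # measures 3 and 4
print(ranks, round(r, 3), round(slope, 1))
```

A learner group whose durations follow the reference ordering closely yields r near 1, while a slope close to that of the reference regression line indicates a similar spread between the shortest and longest syllables.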
[Figure 4: syllable duration (ms, 0–600) plotted against syllable rank (0–12); syllable labels in rank order: e, re, de, et, ne, dy, to, tamm, mat, barn]
Figure 4. Mean duration of syllables in a Norwegian utterance ranked according to increasing duration for six native speakers (closed symbols with regression line). Open symbols indicate mean durations for a group of seven Chinese subjects with syllable rank as for the L1 speakers. Vertical bars indicate ± 1 standard deviation.
As measure number five speech rate was chosen, defined as the number of (actually produced) phonemes per second. This yielded one single value per utterance and speaker group, that is also here resulting in a total of n= 70 values. Subsequently, as the sixth measure for each utterance the standard deviation belonging to the speech rate value was computed. The standard deviation was calculated across the speakers of each group and thus indicates the degree to which mean speech rate varied within a group. Finally, the seventh measure was the normalized Pairwise Variability Index (nPVI) as used by Grabe and Low (2002):
(1)
$$\mathrm{nPVI} = 100 \times \left[ \sum_{k=1}^{m-1} \left| \frac{d_k - d_{k+1}}{(d_k + d_{k+1})/2} \right| \Big/ (m-1) \right]$$
In this calculation the difference of the durations (d) of two successive syllables is divided by the mean duration of the two syllables. This is done for all (m-1) successive syllable pairs in an utterance (m = the number of syllables). Finally, by dividing the sum of the (m-1) amounts by (m-1) a mean normalized difference is calculated and expressed as percent. For the convenience of the reader the present measures are repeated below:
1. mean syllable duration
2. standard deviation for syllable durations
3. correlation coefficient
4. slope of regression line
5. mean speech rate
6. standard deviation for speech rate
7. nPVI
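The seventh measure translates directly into a few lines of code. The sketch below follows the verbal description of the nPVI given above and uses the absolute value of each pairwise difference, as in Grabe and Low's (2002) definition; the function name `npvi` and the example durations are our own and purely illustrative.

```python
def npvi(durations_ms):
    """Normalized Pairwise Variability Index over successive durations."""
    m = len(durations_ms)
    if m < 2:
        raise ValueError("nPVI needs at least two syllable durations")
    total = 0.0
    for d_k, d_next in zip(durations_ms, durations_ms[1:]):
        # Absolute difference of a pair, normalized by the pair's mean.
        total += abs(d_k - d_next) / ((d_k + d_next) / 2)
    return 100 * total / (m - 1)


# Hypothetical syllable durations (ms), not taken from the study.
print(npvi([100, 100, 100]))  # 0.0   (perfectly even syllables)
print(npvi([100, 300, 100]))  # 100.0 (strong alternation)
```

Because each difference is divided by the local mean of the pair, the index is largely insensitive to overall speech rate, which is reported separately as measure five.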
4.2. Results: Mean syllable duration
Since the main temporal unit under scrutiny is the syllable, let us first see whether and to what extent the various speaker groups produced different syllable durations. As can be seen from Table 2, mean syllable durations vary substantially. Shortest durations were found for the natives (176 ms), while the subjects with a Chinese L1 produced the longest syllables (286 ms). The other groups have values that are more native-like, in particular the German speakers with a mean of 196 ms. For all speaker groups the standard deviations are quite large, which is due to both inter-speaker variation and the inclusion of all the different types of syllables. (Note that the standard deviation described here was computed across all single tokens (e.g., for the Chinese n = 837) and thus differs from the second measure defined above in Section 4.1.) According to a one-way analysis of variance, the overall effect of speaker group on syllable duration is statistically significant (F(6, 4490) = 97.841; p < .0001). In order to obtain information about differences between syllable durations for all possible pairs of language groups, a Games-Howell post-hoc analysis was performed. The result showed that only the difference between the two mean durations for the English group (222 ms) and the Russian group (216 ms) was non-significant. All the remaining differences turned out to be statistically significant at a significance level of p = 0.05. Therefore, it can be concluded that the measure mean syllable duration captured characteristic differences between the speaker groups.
Here one might raise the question of how to explain the differences in mean syllable duration. They need not necessarily be due to L1-dependent behaviour but could reflect differences in speech rate correlating with the subjects’ general performance level in Norwegian. A possible approach to investigating this issue could be to collect and analyze speech material from the present speaker groups for their respective L1s. But firstly, due to the considerable research efforts needed, until now we had to refrain from such an enterprise. Secondly, though L2 performance certainly is affected by L1-specific factors, we cannot assume a linear transfer of temporal patterns from L1 to L2.
Nevertheless, previous investigations of temporal similarities and dissimilarities between different languages can provide us with a frame of reference. Delattre (1966) compared syllable durations in English, German, French and Spanish. His material consisted of five minutes of spontaneous speech produced by one native speaker of each of these languages. Conditioning factors were syllable weight (stressed/unstressed), place (final/non-final) and type (open/closed). Mean durations of final, stressed closed/open syllables turned out to be longer for English (408 ms/335 ms) than for German (362 ms/298 ms) and French (341 ms/246 ms). For nonfinal syllables rather small differences between English (259 ms/192 ms) and German (246 ms/197 ms) were found (note that in French stressed syllables occur only in final position). Unstressed non-final closed/open syllable durations showed a reversed order for the three languages: French
(192 ms/137 ms) > German (175 ms/132 ms) > English (155 ms/120 ms). These results indicate that the impact of syllable weight, place and type differs considerably between languages and that it could be worthwhile to look into the more complex matter of speech rhythm rather than average syllable durations. In particular, it should be kept in mind that the values presented in Table 2 represent averages across all three conditions of stress, position and type, which limits the possibility of comparing results.
Roach (1982) measured syllable durations in samples of spontaneous speech produced by one native speaker each of three so-called syllable-timed languages (French, Telugu and Yoruba) and three stress-timed languages (English, Russian and Arabic). He does not present absolute syllable durations but gives their standard deviation as a measure of variability. The hypothesis of more variable durations in stress-timed languages is not borne out by the data: rather similar values were found for ‘stress-timed’ English (86 ms) and Russian (77 ms) on the one hand and ‘syllable-timed’ French (75.7 ms) on the other. The data presented in Table 2 are in line with this outcome, the standard deviation for French (101 ms) being comparable to that for English (106 ms) and Russian (107 ms) and even larger than for German (87 ms). Section 4.3 will take up the issue of speech rhythm and investigate whether the measure of syllable duration and the other six measures mentioned above contain sufficient speech rhythm information to classify the utterances according to their membership of the different groups.

Table 2. Mean syllable durations and standard deviations in ms for six groups of L2 speakers and a Norwegian control group. Means are across ten utterances and all speakers in the respective speaker groups.

        Chinese  English  French  German  Persian  Russian  Norwegian
mean    286      222      241     196     258      216      176
sd      113      106      101     87      107      112      86
n       837      489      731     488     732      488      732
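The summary statistics reported in Table 2, and the use of the standard deviation as a measure of durational variability in the manner of Roach (1982), can be illustrated with a minimal Python sketch. The duration values and the function name are invented for illustration and are not data from the study. The sketch also assumes the sample standard deviation, since the text does not state which variant was used.

```python
from statistics import mean, stdev


def summarise_durations(durations_ms):
    """Return (mean, sample standard deviation, n) for syllable durations in ms."""
    return mean(durations_ms), stdev(durations_ms), len(durations_ms)


# Hypothetical syllable durations (ms) for one utterance; not data from the study.
example = [150, 200, 250, 200]
m, sd, n = summarise_durations(example)
print(f"mean = {m:.1f} ms, sd = {sd:.1f} ms, n = {n}")
```

Pooling all syllables of a speaker group before calling such a function would correspond to the group-level rows of the table, assuming that is how the averages were formed.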
4.3. Discriminant analysis
In order to investigate whether rhythmical differences between utterances from the different speaker groups can be captured by the seven measures
Wim A. van Dommelen
defined above, a discriminant analysis was performed. Before going into the question of the possible contribution of the different measures, let us see how the statistical analysis classified the 70 utterances. The results are presented in Table 3. Here it can be seen that in the majority of cases the L2-produced utterances were correctly classified. The overall correct classification rate amounts to 92.9%. All utterances produced by the Chinese, German, Persian and Russian speakers were classified in accordance with their actual L1 group membership. Of the ten utterances from the English group, one utterance was classified as French and one as German-produced. One utterance from the French subjects was confused with the category English. The classification of two utterances from the Norwegian reference group as German confirms the native-like temporal structure of the speech produced by the Germans (Section 3.3).

Table 3. Predicted L1 group membership (percent correct) of ten utterances according to a discriminant analysis using seven measures (see Section 4.1).

            Predicted L1 group membership
L1 group    Chinese  English  French  German  Persian  Russian  Norwegian
Chinese     100      0        0       0       0        0        0
English     0        80       10      10      0        0        0
French      0        10       90      0       0        0        0
German      0        0        0       100     0        0        0
Persian     0        0        0       0       100      0        0
Russian     0        0        0       0       0        100      0
Norwegian   0        0        0       20      0        0        80
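The overall rate of 92.9% follows directly from the diagonal of Table 3. The short Python sketch below recomputes it from the tabulated percentages, assuming ten utterances per group as stated in the table caption. The variable and function names are illustrative only.

```python
GROUPS = ["Chinese", "English", "French", "German", "Persian", "Russian", "Norwegian"]

# Rows: actual L1 group; columns: predicted L1 group (percent), as in Table 3.
TABLE_3 = [
    [100, 0, 0, 0, 0, 0, 0],
    [0, 80, 10, 10, 0, 0, 0],
    [0, 10, 90, 0, 0, 0, 0],
    [0, 0, 0, 100, 0, 0, 0],
    [0, 0, 0, 0, 100, 0, 0],
    [0, 0, 0, 0, 0, 100, 0],
    [0, 0, 0, 20, 0, 0, 80],
]
UTTERANCES_PER_GROUP = 10


def overall_rate(table, per_group):
    """Convert row percentages to counts and return (correct, total, percent correct)."""
    counts = [[cell * per_group // 100 for cell in row] for row in table]
    correct = sum(counts[i][i] for i in range(len(counts)))
    total = sum(sum(row) for row in counts)
    return correct, total, 100 * correct / total


correct, total, rate = overall_rate(TABLE_3, UTTERANCES_PER_GROUP)
print(f"{correct} of {total} utterances correct ({rate:.1f}%)")
```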
We will now turn to the contribution of the present measures to this classification. The discriminant analysis was performed stepwise, which means that variables are entered one after another as long as they contribute significantly to the model. It turned out that four of the seven measures achieved statistical significance (in order of entrance):
• Measure 1: mean syllable duration
• Measure 6: standard deviation for speech rate
• Measure 3: correlation coefficient
• Measure 5: mean speech rate
This outcome suggests that three types of temporal information can be distinguished. First, the correlation measure contains information about the overall patterning of syllable durations. Second, measures 1 and 5 both reflect speech rate. Finally, measure 6 captures aspects of variation in speech rate. It seems obvious that the information contained in measures 1 and 5 could overlap to a large degree, or even that including one of them could make the other redundant. In order to get an impression of the role of these two measures, the discriminant analysis was run again without measure 1, mean syllable duration. This lowered the classification rate from the original 92.9% to 81.4%. Doing the same for measure 5, mean speech rate, resulted in an overall rate of 91.4%. These percentages suggest that the two measures indeed contain redundant information, with mean syllable duration having the greater predictive power. Though the present analysis has succeeded in classifying the seven different speaker groups according to their respective language backgrounds, the issue of L1-specific speech rhythm is far from solved. Specifically, in interpreting the results one should take into consideration that speech rate and rhythm measures have been shown to co-vary. For example, Dellwo and Wagner (2003) demonstrated that the standard deviation of consonantal intervals as used by Ramus et al. (1999) is heavily speech rate dependent. A similar conclusion was drawn by Barry et al. (2003) with regard to, among other things, Grabe and Low’s (2002) PVI measures for vowels and consonants. It is conceivable that the differences in speech rate for the present speaker groups are only partly language-dependent and vary mainly with the speakers’ general skills in Norwegian.
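To make the comparison of the three classification rates more concrete, the following Python sketch converts each reported percentage back into a number of correctly classified utterances. It assumes that every run classified the same 70 utterances; the counts are therefore inferred from the percentages rather than reported in the text.

```python
TOTAL_UTTERANCES = 70  # 7 speaker groups x 10 utterances (assumed for every run)

# Classification rates (percent) as reported in the text.
RATES = {
    "full stepwise model": 92.9,
    "without measure 1 (mean syllable duration)": 81.4,
    "without measure 5 (mean speech rate)": 91.4,
}


def correct_count(rate_percent, total=TOTAL_UTTERANCES):
    """Nearest whole number of correctly classified utterances for a given rate."""
    return round(rate_percent * total / 100)


baseline = correct_count(RATES["full stepwise model"])
for label, rate in RATES.items():
    n = correct_count(rate)
    print(f"{label}: {n}/{TOTAL_UTTERANCES} correct ({baseline - n} fewer than the full model)")
```

Under this assumption, leaving out mean syllable duration costs several correct classifications, whereas leaving out mean speech rate costs only one, which is in line with the redundancy noted above.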
5. Conclusions
The goal of the present study has been to shed some light on temporal aspects of Norwegian spoken as a second language. In general, it could be shown that speakers from six different native languages at the level of single vowels and consonants as well as syllables produced patterns that differed from the Norwegian reference. In its generality this is, of course, a result that could be expected. Going into more detail, a central question is to what extent the data revealed deviation patterns that could be characteristic for the different speaker groups involved, i.e. depending on their respective native languages. To answer this question, data from measurements on the temporal structure of dyads VC: and V:C were evaluated in different
ways. From the average durations for each of the elements under scrutiny (V, V:, C, C:) it was not easy to detect any systematic differences between L2 and L1 productions. More informative in this respect was the duration ratio V:/V. Here, there was a tendency for speakers from languages closer to the target language to have somewhat more native-like values. This tendency was, however, not very clear. On the one hand, the Russian speakers performed similarly to the German and the English speakers, which does not seem to be in congruence with the degree of language family membership. On the other hand, the French subjects’ ratios deviated more from the Norwegian reference and were, possibly somewhat surprisingly, similar to those for the Chinese and Persian speakers. The most revealing perspective from which to evaluate and interpret the present data was to inspect how the durations of the long and short vowels and consonants relate to each other and what the duration patterns for the classes of VC: vs. V:C look like. The most native-like performance was found for the German speakers, thus confirming the previously observed tendencies for this group. While the data for the French subjects seemed to reflect the lack of vowel quantity in their native language, the Chinese speakers showed considerable scatter and so failed to systematically distinguish between the VC: vs. V:C categories. A fundamental problem in interpreting data like those from the present study is the complexity of the factors that contribute to the measurable output. First of all, there is at present no model to predict what kind of interference phenomena can be expected. Of the current models, Flege’s (1995) model can be used to make global predictions, but it seems difficult to make predictions about specific deviations.
Apart from these L1 influences, which at least in principle could be predicted, there are many further contributing factors at the individual level: duration and intensity of contact with the second language, degree of familiarity with other languages, formal training in L2, education level, family situation as to the use of one, two or even more languages, motivation to learn a new language – just to mention some. All these factors contribute to obscure possible systematic effects to different degrees. The results for the VC: vs. V:C dyads confronted us with the kind of interpretation difficulties as mentioned above. Here, it has become clear that purely phonological reasoning cannot explain the data satisfactorily. The performance of the Russians was more native-like compared to the productions of the Chinese though in both native languages vowel quantity is absent. Further, it is difficult to give more than a rather superficial explanation of the substantial variation in the performance of the Chinese, saying
that this reflects the pronunciation difficulties they encounter. It is conceivable that the observed variation is to a certain extent caused by uncertainties in grapheme-to-phoneme conversion in reading. All this does not mean, however, that the present results are without practical implications. For example, in the teaching of Norwegian pronunciation to German target groups there will presumably not be much need to focus on issues related to vowel quantity. Consequently, more time would be available to emphasize other aspects. Dealing with French as L1, it seems useful to make speakers aware of the long durations necessary to produce appropriate phonologically long vowels. At the same time, the complementary consonant duration differences in the VC: vs. V:C opposition should be brought to the learners’ attention. Learners with a language background that is more distant, like the present Chinese speakers, can be expected to need and to profit from very thorough instruction concerning the temporal aspects of Norwegian. An unexpected outcome of the measurements was the presence of preaspiration in Norwegian produced by native speakers of English. This finding demonstrates the potential usefulness of phonetic analyses for pronunciation teaching. Though in many cases the human ear is unsurpassable as an instrument for judging speech productions, some relevant details might escape our attention until revealed by an instrumental analysis. So, instrumental methods may make us more aware of pronunciation phenomena and potentially contribute to improving teaching practice. In the present case of preaspiration, drawing the attention of the learners of Norwegian to this detail of consonant production might help to make their pronunciation more authentic. Nowadays, with the help of the omnipresent computer and a freeware program like Praat it does not require much specialist knowledge to integrate sound demonstrations into pronunciation teaching.
In this way, learners could acquire a better understanding of all kinds of pronunciation aspects such as vowel reduction, assimilation, intonation or, in a language like Norwegian, a notoriously difficult feature such as the realization of tonal accents. As was expected from the outset, investigation of speech rhythm revealed different temporal patterns for the six speaker groups. It seems reasonable to ascribe the deviations at least partly to the influence of the respective native languages. With rhythm-related measures as input a discriminant analysis classified L2 utterances according to their L1 membership with a relatively high degree of accuracy (92.9% correct). As to the relevance of the present measures of speech rhythm, only four out of the seven measures turned out to contribute significantly. Probably most closely related to speech rhythm, the correlation coefficient measure seems to convey relevant information about the overall patterning of syllable durations. Two further relevant measures (mean syllable duration and mean speech rate expressed in phonemes per second) are both related to speech rate and appear to contain overlapping information. The fourth significant measure involved the variation in speech rate. It thus appears that a large portion of the information about the utterances’ L1 membership originates from the rate of speech delivery. Since it is conceivable that speech rate does not represent an L1-specific factor, but varies with the level of proficiency in L2 in general, further research on this issue will be needed. At present, one can only speculate about the reasons why three measures do not seem to convey rhythm information. Finally, we would like to point out that the present measures were of an exploratory character and some of them were possibly too crude to capture details of speech rhythm. Also, and presumably more importantly, operationalizing speech rhythm as the temporal organization of syllables means a strong reduction which fails to do justice to the complex of interacting factors involved. It is hoped, however, that future efforts studying more aspects of speech rhythm, both in production and perception, will eventually give a better understanding of this phenomenon.
Acknowledgement
This research is supported by the Research Council of Norway (NFR) through grant 158458/530 to the project Språkmøter (Language Encounters). The speech material was developed and recorded by Snefrid Holm (Department of Language and Communication Studies, NTNU) as part of her PhD project. I would like to thank Rein Ove Sikveland (Department of Language and Communication Studies, NTNU) for the segmentation of the speech material.
References
Ashby, Michael and John Maidment 2005 Introducing Phonetic Science. Cambridge: Cambridge University Press.
Barry, William J., Bistra Andreeva, Michela Russo, Snezhina Dimitrova and Tanya Kostadinova 2003 Do rhythm measures tell us anything about language type? Proceedings of the 15th International Congress of Phonetic Sciences, Barcelona, 2693–2696.
Behne, Dawn, Bente Moxness and Anne Nyland 1996 Acoustic-phonetic evidence of vowel quantity and quality in Norwegian. Fonetik 96, Papers presented at the Swedish Phonetics Conference, Nässlingen, 29–31 May 1996. KTH (Royal Institute of Technology), Speech, Music and Hearing. Quarterly Progress and Status Report, TMH-QPSR 2/1996, 13–16.
Boersma, Paul and David Weenink 2006 Praat: doing phonetics by computer (Version 4.4.11) [Computer program]. Retrieved February 23, 2006, from http://www.praat.org/.
Bond, Dzintra, Dace Markus and Verna Stockmal 2003 Prosodic and rhythmic patterns produced by native and nonnative speakers of a quantity-sensitive language. Proceedings of the 15th International Congress of Phonetic Sciences, Barcelona, 527–530.
Dauer, Rebecca M. 1983 Stress-timing and syllable-timing reanalyzed. Journal of Phonetics 11, 51–62.
Davidsen-Nielsen, Niels 1996 English Phonetics. Translated and adapted for use in Norway by Barbara Bird and Per Moen. Oslo: Gyldendal Norsk Forlag A/S (Seventh impression).
Delattre, Pierre 1966 A comparison of syllable length conditioning among languages. International Review of Applied Linguistics 4, 183–198.
Dellwo, Volker and Petra Wagner 2003 Relations between language rhythm and speech rate. Proceedings of the 15th International Congress of Phonetic Sciences, Barcelona, 471–474.
van Dommelen, Wim A. 1999a Auditory accounts of temporal factors in the perception of Norwegian disyllables and speech analogs. Journal of Phonetics 27, 107–123.
van Dommelen, Wim A. 1999b Preaspiration in intervocalic /k/ vs. /g/ in Norwegian. Proceedings of the 14th International Congress of Phonetic Sciences, San Francisco, 2037–2040.
Fintoft, Knut 1961 The duration of some Norwegian speech sounds. Phonetica 7, 19–39.
Flege, James 1995 Second language speech learning: Theory, findings, and problems. In: Winifred Strange (ed.), Speech perception and linguistic experience: Issues in cross-language research, 233–277. Timonium: York Press.
Grabe, Esther and Ee Ling Low 2002 Durational variability in speech and the rhythm class hypothesis. In: Carlos Gussenhoven and Natasha Warner (eds.), Laboratory Phonology 7, 515–546. Berlin/New York: Mouton de Gruyter.
Gut, Ulrike 2003 Prosody in second language speech production: the role of the native language. Zeitschrift für Fremdsprachen Lehren und Lernen 32, 133–152.
Kenworthy, Joanne 2000 The Pronunciation of English: A Workbook. London: Arnold. (Co-published in the USA by Oxford University Press Inc., New York.)
Kristoffersen, Gjert 2000 The Phonology of Norwegian. Oxford: Oxford University Press.
Krull, Diana, Hartmut Traunmüller and Wim A. van Dommelen 2003 The effect of local speaking rate on perceived quantity: a comparison between three languages. Proceedings of the 15th International Congress of Phonetic Sciences, Barcelona, 1739–1742.
Ladefoged, Peter and Ian Maddieson 1996 The Sounds of the World’s Languages. Oxford: Blackwell Publishers Ltd.
Markus, Dace and Dzintra Bond 1999 Stress and length in learning Latvian. Proceedings of the 14th International Congress of Phonetic Sciences, San Francisco, 563–566.
Ramus, Franck 2002 Acoustic correlates of linguistic rhythm: Perspectives. Proceedings Speech Prosody 2002, Aix-en-Provence (France), 115–120.
Ramus, Franck, Marina Nespor and Jacques Mehler 1999 Correlates of linguistic rhythm in the speech signal. Cognition 73, 265–292.
Roach, Peter 1982 On the distinction between ‘stress-timed’ and ‘syllable-timed’ languages. In: David Crystal (ed.), Linguistic controversies. Essays in linguistic theory and practice in honour of F.R. Palmer, 72–79. London: Edward Arnold.
Shockey, Linda 2003 Sound Patterns of Spoken English. Malden, USA: Blackwell Publishing Ltd.
Stockmal, Verna, Dace Markus and Dzintra Bond 2005 Measures of native and non-native rhythm in a quantity language. Language and Speech 48, 55–63.
Svetozarova, Natalia 1998 Intonation in Russian. In: Daniel Hirst and Albert di Cristo (eds.), Intonation Systems. A Survey of Twenty Languages, 261–274. Cambridge: Cambridge University Press.
Learner corpora in second language prosody research and teaching
Ulrike Gut
1. Introduction
This article addresses methodological issues in L2 prosody research and teaching and argues for a corpus-based approach in both areas. Current research methods in L2 prosody have a number of limitations. A survey of all empirical studies on L2 prosody published in the major international journals in second language acquisition (SLA) research in the past 25 years demonstrates that research in L2 prosody tends to be based on a relatively small data base with a limited number of participants. Research on intonation, for example, is carried out with an average number of 22.6 participants (range 2 to 75); research on word stress is based on even fewer participants, on average 7.7, ranging from 4 to 10. The analysis of the productions of only few participants, however, precludes the study of variation between learners, for which representatively sized groups are necessary. Furthermore, empirical research on non-native prosody typically elicits data in a relatively controlled setting and is restricted to one speech style. Most studies base their investigations on the readings of words and sentences. Arguments against experimentally elicited data are brought forth for example by Leather (1999), who argues that some phonological structures may be more susceptible to errors in an experimental setting and suggests that “observations from artificial speech tasks cannot always be extrapolated to natural conditions” (p. 32). Moreover, data analysis in L2 prosody research usually focuses on just one aspect of non-native prosody such as a particular intonational structure. The relationship between different prosodic domains, however, is not investigated. Finally, studies rarely relate their findings to non-linguistic factors assumed to influence the acquisition of prosody in an L2. The only explanatory aspect of language learning under investigation has been the influence of the learners’ native language on their L2 prosody. 
If factors such as age, motivation and speech style are analysed, only one of them is studied. No longitudinal studies,
where speech is collected from the same individuals at multiple intervals over a period of time, have yet been carried out. Recently, it has been suggested that a corpus-linguistic approach should be introduced into research in language acquisition. It is widely argued that a corpus-based methodology can complement the current research methods in second language learning and possibly compensate for some of their weaknesses (Biber, Conrad, and Reppen 1994, Botley et al. 1996, Kettemann and Marko 2002, Granger, Hung and Petch-Tyson 2002, Sinclair 2004, Granger 2004). However, so far, corpus linguistics and second language research have mainly co-existed and have not yet joined forces (cf. Hasselgard 1999). Due to the scarcity of learner speech corpora, the analysis of learner phonology or prosody has so far been impossible (cf. Nesselhauf 2004). The recently completed LeaP (Learning Prosody) corpus fills this gap by providing a fully annotated speech corpus of learner English and learner German. Apart from serving as a resource for empirical research, language corpora are increasingly used in the classroom and the recognition of their pedagogical value is growing (e.g. Ghadessy, Henry and Roseberry 2001, Kettemann and Marko 2002, Granger, Hung and Petch-Tyson 2002, Sinclair 2004). It has been claimed that the application of corpora in the classroom supports inductive learning processes and the creation of language awareness in language students. By investigating corpora, students are stimulated to enquire and speculate about language structures and develop the ability to recognize language patterns. In corpus-based “data-driven learning”, for example, students have the opportunity to work as researchers by developing a research question and analysing it with real-language data.
It has been suggested that activities based on a comparison between native and non-native corpus data enable language learners
– to focus on negative evidence and typical errors
– to train their ability to notice differences between native and non-native language use
– to increase their language awareness
By observing the errors learners typically and most frequently make, students might find it easier to become aware of the features of their own interlanguage, which may in turn stimulate a restructuring of their own language use and knowledge (e.g. Granger and Tribble 1998). Due to the scarcity of
learner speech corpora, the analysis of learner phonology or pronunciation in a classroom setting has so far been impossible (cf. Nesselhauf 2004). The aim of this article is to report on the advantages and new opportunities offered by the corpus-based approach in L2 prosody. Section 2 gives a brief overview of corpus linguistics, the various types of corpora that have been collected and the advantages of a corpus-based approach. In section 3, the learner corpus LeaP is described. It serves as the basis of the analysis of non-native vowel reduction in both L2 English and L2 German (Section 4). Section 5 summarizes the findings of a preliminary study on the application of the LeaP corpus in language teaching. The implications of the results of the analysis for research in L2 prosody and for the teaching of prosody are discussed in section 6.
2. Corpus linguistics
Corpus linguistics as a method to study the structure and use of language can be traced back to the 18th century (Kennedy 1998: 13). Modern corpora began to be collected in the 1960s. In modern definitions the term corpus is usually used to refer to a substantial collection of language texts or transcriptions of spoken language in electronic form (Biber, Conrad, and Reppen 1996: 4). McEnery and Wilson (2001) list representativeness, sufficient size, a machine-readable form and its function as a standard reference as typical requirements for a corpus. Representativeness refers to the fact that the collection of speech data should be maximally representative of the aspect under investigation, that is, provide researchers with a picture of the occurrence and variation of the phenomena under investigation that is as accurate as possible. Modern corpora have to be machine-readable so that their purpose, the rapid (semi-)automatic analysis of large amounts of data, can be realized. The computer-based storage form furthermore allows an enrichment of the corpus by annotations. In general, it is assumed that a corpus functions as a standard reference for the language or language variety it represents.
2.1. Types of corpora
Several types of corpora can be distinguished: Text corpora consist of collections of written samples of a language variety; speech corpora constitute
a collection of spoken samples of a language variety. The latter is often also referred to with the term spoken language corpus. Text corpora are naturally not suited for the description and development of linguistic theory in the area of phonetics and phonology. Corpora can be unannotated or annotated. The term annotation refers to the enhancement of the primary data (audio or video recordings in the case of speech corpora) with various types of linguistic and non-linguistic information. Several types of linguistic annotations are in use and include orthographic transcriptions, phonemic and prosodic transcriptions, part of speech tagging, semantic annotation, anaphoric annotation and lemmatization. For example, the content of a recording may be transcribed orthographically, and an additional phonetic transcription may be carried out. Non-linguistic corpus annotations usually consist of meta-data, i.e. additional information about the corpus or its content. This includes information about the recording (e.g. time and place), about the speakers (e.g. age, sex, native language), about the recording situation (e.g. speech style elicited, instructions) and about the corpus (e.g. who collected it, where, when, with which purpose). A text-to-tone alignment, which links the transcriptions (annotations) with the audio or video recording, provides direct access from each annotated element to the primary data, i.e. the original recordings. By clicking on any annotated element the corresponding part of the recording will be played back by the annotation software. This is especially useful for the analysis of the corpus because items in question can be listened to again or additional phonetic analyses offered by the software such as a spectrographic analysis or pitch tracking can be carried out. In addition, this function enables language teachers and language learners to make use of the corpus in the classroom. 
In order to create, analyse, query and distribute an annotated speech corpus, an appropriate data format is required. The currently most widely used data format is based on the Extensible Markup Language (XML) technology, which allows an efficient document engineering of speech data by providing tools for the data collection (XML editors), for data analysis (e.g. XSL-T) and for data presentation. Corpora can be further divided into native corpora and learner corpora, the former containing language produced by native speakers, the latter containing language produced by learners of a language. Finally, corpora may contain only one language variety (monolingual corpora) or more than one language (multilingual corpora).
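The idea of an XML-based annotation in which every annotated element is linked to a stretch of the recording can be sketched in Python with the standard library. The element names, attribute names, time stamps and metadata values below are hypothetical and do not reproduce the actual format of any particular corpus; they only illustrate how time-aligned tiers and speaker metadata might be stored and queried.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotated recording: metadata plus two time-aligned tiers (times in seconds).
SAMPLE = """\
<recording id="demo01">
  <meta>
    <speaker age="25" sex="f" native-language="French"/>
    <session style="reading passage"/>
  </meta>
  <tier name="words">
    <event start="0.00" end="0.31">the</event>
    <event start="0.31" end="0.78">tiger</event>
  </tier>
  <tier name="syllables">
    <event start="0.00" end="0.31">the</event>
    <event start="0.31" end="0.55">ti</event>
    <event start="0.55" end="0.78">ger</event>
  </tier>
</recording>
"""


def events(root, tier_name):
    """Return (label, start, end) tuples for all events on the named tier."""
    tier = root.find(f"./tier[@name='{tier_name}']")
    return [(e.text, float(e.get("start")), float(e.get("end"))) for e in tier.findall("event")]


def contained(root, tier_name, start, end):
    """Labels of events on a tier that lie entirely within the interval [start, end]."""
    return [label for label, s, e in events(root, tier_name) if s >= start and e <= end]


root = ET.fromstring(SAMPLE)
word, start, end = events(root, "words")[1]
native_language = root.find("./meta/speaker").get("native-language")
print(word, start, end, contained(root, "syllables", start, end), native_language)
```

Because each event carries its own start and end time, annotation software can use the same information to play back the corresponding part of the audio file.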
2.2. Corpus analysis
Two major types of corpus analysis can be distinguished: qualitative and quantitative approaches. In qualitative research, small numbers of phenomena are described in detail and the focus lies on the variation of the data. Its main drawback is that the findings cannot be generalized to larger populations with a sufficient degree of certainty. In contrast, a quantitative analysis of a corpus gives a precise account of the frequency and rarity of particular language structures. The specific findings can be tested to discover whether they are statistically significant and can be generalized to a larger population. In early corpus-based studies quantitative analysis was restricted to a simple counting of occurrences of linguistic items. However, the analysis of an annotated corpus allows the computation of various statistical measurements such as correlations between variables, i.e. the analysis of systematic ways in which some linguistic features vary with other linguistic features or how certain non-linguistic features vary with certain linguistic ones, and other multivariate measurements such as factor analyses and cluster analyses. The weakness of quantitative approaches lies in the risk that rare phenomena are not recognized and that fine distinctions are blurred. Corpus-based studies thus benefit most from combining both approaches (McEnery and Wilson 2001: 77).
A number of advantages of using speech corpora in research on non-native speech have been suggested (Biber, Conrad, and Reppen 1994, McEnery and Wilson 2001, Granger 2002, 2004):
– Corpora contain objective language data which reflect authentic natural language use. A representative corpus of non-native speech constitutes a large empirical database of naturally occurring language structures and patterns of use and thus stands in contrast to the laboratory speech elicited in experimental studies on non-native speech, which has often been criticized as artificial and not generalizable (Leather 1999: 32). Corpora of non-native speech permit empirical investigations of the patterns of actual language use and allow quantitative and qualitative analyses whose results are generalizable to larger populations.
– Corpus-based research allows an examination of more varied and larger amounts of data than any other methodology in second language research. This opens up the possibility that, in an explorative manner, previously unsuspected linguistic phenomena may be discovered and that access to previously inaccessible structures and patterns of use is
provided. In this manner, researchers can for the first time test strongly held convictions and intuitions about frequency and type of learner errors. Granger (2004: 123) suggests that corpus-based research in L2 provides a basis for a new way of thinking which may challenge some of the deeply-rooted ideas about learner language. Similarly, Biber, Conrad, and Reppen (1994) report from morphosyntactic and lexical studies that researchers’ intuitions can prove incorrect when tested against actual frequencies and usage in the corpus. As pointed out variously, corpora constitute the only reliable source of evidence for questions of frequency.
– A richly annotated corpus of non-native speech not only gives access to specific learner errors but also provides a comprehensive description of all aspects of the learners’ interlanguage, combining information on different linguistic levels and non-linguistic information. This satisfies Leather’s (1999) call for an “ecological” approach to theoretical modelling in second language speech. He argued for paying more attention to experiential and environmental factors of the acquisition process and for research in non-native speech to take a broader view. A corpus that extends over a wide selection of variables such as speaker learning history, learning situation, age and sex and across a variety of speech styles allows investigations of new issues such as the co-occurrence of structures or the co-occurrence of certain linguistic with non-linguistic features.
– Corpora provide information about variation in non-native speech. By dividing the corpus into smaller subcorpora by, for example, grouping learners with the same native language or age at first exposure to the target language, or by comparing certain structures in non-native speech in different speech styles, the extent and type of variation in non-native speech can be analysed.
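The division of a corpus into subcorpora mentioned in the last point, combined with a simple quantitative summary, might look as follows in Python. The records, field names and values are invented for illustration and do not come from any existing corpus; the sketch merely groups recordings by a metadata field and reports the number of recordings and the mean of one measure per group.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical annotated recordings; field names and values are illustrative only.
RECORDINGS = [
    {"speaker": "s01", "native_language": "French", "style": "reading", "mean_syllable_ms": 240.0},
    {"speaker": "s01", "native_language": "French", "style": "interview", "mean_syllable_ms": 260.0},
    {"speaker": "s02", "native_language": "Russian", "style": "reading", "mean_syllable_ms": 210.0},
    {"speaker": "s03", "native_language": "Russian", "style": "reading", "mean_syllable_ms": 230.0},
]


def subcorpora(records, key):
    """Split the records into subcorpora that share the same value for a metadata field."""
    groups = defaultdict(list)
    for record in records:
        groups[record[key]].append(record)
    return dict(groups)


def summarise(records, key, measure):
    """Map each subcorpus to (number of recordings, mean of the given measure)."""
    return {
        group: (len(members), mean(m[measure] for m in members))
        for group, members in subcorpora(records, key).items()
    }


by_l1 = summarise(RECORDINGS, "native_language", "mean_syllable_ms")
by_style = summarise(RECORDINGS, "style", "mean_syllable_ms")
print(by_l1)
print(by_style)
```

The same grouping function can be applied to any other metadata field, for example age at first exposure, which is what makes richly annotated metadata useful for studying variation.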
Until recently, corpus-based research in L2 prosody was impossible due to the lack of an appropriate corpus. A small number of learner speech corpora have been set up in the area of speech technology in the past few years, mainly collected to train speech recognition systems which can then be used in man-machine conversations such as telephone booking of train tickets (e.g. the FAE (Foreign Accented English) corpus and the VILTS (Voice Interactive Language Training System) corpus). However, none of these corpora in their present form are immediately reusable for researchers in non-native prosody since they do not contain phonetic or phonological
annotations. Recently, a prosodically annotated learner corpus has become available, which will be described in the next section.
3. The LeaP corpus
The LeaP corpus was collected between May 2001 and July 2003 as part of the LeaP (Learning Prosody in a Foreign Language) project, which investigated the acquisition of prosody by second language learners of German and of English. The corpus consists of a total of 359 fully annotated recordings adding up to 73,941 words. The total amount of recording time is more than 12 hours. It comprises four different types of speech:
– free speech in an interview situation (length between 10 and 30 minutes)
– reading passage (length about 2 minutes)
– retellings of the story (length between 2 and 10 minutes)
– readings of nonsense word lists (30 to 32 words)
In the LeaP corpus, different learner groups are represented: native speakers of English and of German, serving as controls, especially advanced learners (near-natives), learners before and after a training course in prosody and learners before and after a stay abroad. The English subcorpus contains recordings of 46 non-native and 4 native speakers. The mean age of the non-native speakers is 32.3 years and ranges from 21 to 60. 32 of them are female and 14 are male, and altogether they have 17 different native languages. The average age at first contact with English is 12.1 years, ranging from one year to 20 years of age. In the German subcorpus, the mean age of the 55 non-native speakers at the time of the recording is 28.9 years and ranges from 18 to 54 years. 35 of them are female and 20 are male. Altogether, they have 24 different native languages. The average age at first contact with German is 16.7 years, ranging from three years to 33 years of age. A large amount of additional data was collected for each recording, including data
Ulrike Gut
– about the recording (date, place, interviewer and language of the interview)
– about the non-native speaker (age, sex, native language/s, second language/s, age at first contact with target language, type of contact [formal vs. natural], duration and type of stays abroad, duration and type of formal lessons in prosody, prosodic knowledge)
– about motivation and attitudes (reasons for acquiring the language, motivation to integrate in the host country, attributed importance to competence in pronunciation compared to other aspects of language, interest, experience and ability in music and in acting)
Annotation and text-to-tone alignment of the LeaP corpus was carried out for all reading passages, retellings and two-minute extracts of each interview. The manual annotation comprised six tiers; two further tiers were added automatically:
– On the phrase tier, speech and non-speech events were annotated. The interviewee’s speech is divided into intonational phrases.
– On the words tier, words were transcribed orthographically.
– On the syllable tier, syllables were transcribed in SAMPA.
– On the segments tier, all vocalic and consonantal intervals plus the intervening pauses were annotated.
– On the tones tier, pitch accents and boundary tones were annotated.
– On the pitch tier, the initial high pitch, the final low pitch and intervening pitch peaks and valleys were annotated.
– On the POS tier, part-of-speech coding was annotated automatically.
– On the lemma tier, lemmata were annotated automatically.
For a recording of about one minute in length, on average, 1000 events were annotated. Figure 1 illustrates the manually annotated tiers and the annotation process with the waveform (top) and spectrogram (middle) and the six manually annotated tiers (bottom).
Figure 1. Manual annotation in the LeaP corpus. From bottom to top the tiers are the phrase tier, the words tier, the syllable tier, the segments tier, the pitch tier and the tones tier.
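The tier structure described above can be pictured as a flat list of time-stamped events, each belonging to one tier. The following is a minimal sketch only: the `Event` class, the `events_within` helper and all labels and time stamps are invented for illustration and do not reproduce the actual LeaP file format.

```python
from dataclasses import dataclass

# Tier names as listed in the text; the grouping into two tuples is illustrative.
MANUAL_TIERS = ("phrase", "words", "syllable", "segments", "tones", "pitch")
AUTOMATIC_TIERS = ("POS", "lemma")


@dataclass(frozen=True)
class Event:
    """One annotated event on one tier (hypothetical representation)."""
    tier: str
    start: float  # seconds
    end: float    # seconds; equal to start for point events such as tones
    label: str


def events_within(events, tier, start, end):
    """Return the events of one tier that lie entirely inside [start, end]."""
    return [e for e in events
            if e.tier == tier and e.start >= start and e.end <= end]


# Invented toy annotation of one intonational phrase.
events = [
    Event("phrase", 0.0, 1.0, "phrase-1"),
    Event("words", 0.0, 0.45, "guten"),
    Event("words", 0.45, 1.0, "Morgen"),
    Event("syllable", 0.0, 0.2, "gu:"),
    Event("syllable", 0.2, 0.45, "t@n"),
    Event("syllable", 0.45, 0.75, "mOR"),
    Event("syllable", 0.75, 1.0, "g@n"),
    Event("tones", 0.1, 0.1, "H*"),
]

words_in_phrase = [e.label for e in events_within(events, "words", 0.0, 1.0)]
print(words_in_phrase)
```

Keeping every tier in one list with explicit time stamps makes it straightforward to relate information across tiers, for instance to find all syllables that belong to a given word or phrase.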
4.
A corpus-based analysis of vowel reduction
All previous studies investigating vowel reduction by learners of either English or German found that, in non-native speech, vowels are not reduced to an appropriate extent. Often, full vowels instead of reduced vowels are produced in unstressed syllables and the durational difference between full vowels and reduced vowels is not sufficiently large (Wenk 1985, Bond and Fokes 1985, Mairs 1989, Flege and Bohn 1989, Zborowska 2000 for English and Kaltenbacher 1998, Gut 2003 for German). Some experiments involved a comparison of L2 vowel reduction with vowel reduction processes in native speech; some compared learner groups with different native languages. In some approaches learners were presented with reading material of word lists or short phrases. Less frequently, semi-spontaneous speech as in story retellings was elicited. Many aspects of vowel reduction are still unexplored: As yet, there are no longitudinal studies on the acquisition of vowel reduction. Vowel deletion, which is very common in native speech (e.g. Helgason and Kohler 1996 for German), has not been studied yet. Although native language influence has been investigated as a possible constraint on non-native vowel reduction, cross-linguistic comparisons of target languages have not yet been carried out. Furthermore, no systematic analysis of the co-variance of speech style and vowel reduction has been carried out, and the correlation with other prosodic features of non-native speech has not been investigated yet. In order to address these research questions, vowel reduction in the LeaP corpus was analysed quantitatively and qualitatively. For the quantitative analysis, the following measurements were taken:
mean length sfv: mean length of all syllables containing a full vowel
mean length srv: mean length of all syllables containing the reduced vowels /ə/, /ɪ/ and /ɐ/ (/ɐ/ in German only)
mean length sdv: mean length of all syllables with a deleted vowel
percentage red/del: 100 × number of all syllables with either reduced or deleted vowel divided by total number of syllables
syllable ratio: mean durational ratio of all syllable pairs in which a syllable with a full vowel is followed by a syllable with either a reduced or a deleted vowel
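The five measurements can be sketched in a few lines of code. This is an illustration under stated assumptions, not the procedure used in the study: the `Syllable` class and the function name are hypothetical, the syllable ratio is taken as the mean of the per-pair ratios (the text does not say whether a mean of ratios or a ratio of means was used), and adjacency is judged purely by list order, ignoring pauses and phrase boundaries.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Syllable:
    duration_ms: float
    vowel: str  # "full", "reduced" or "deleted" (hypothetical coding)


def vowel_reduction_measures(syllables):
    """Compute the five measures for one recording (illustrative sketch)."""
    if not syllables:
        raise ValueError("at least one syllable is required")

    def mean_length(kind):
        durations = [s.duration_ms for s in syllables if s.vowel == kind]
        return mean(durations) if durations else None

    weak = ("reduced", "deleted")
    n_weak = sum(1 for s in syllables if s.vowel in weak)

    # Assumption: one ratio per pair "full-vowel syllable followed by a
    # reduced- or deleted-vowel syllable", averaged over all such pairs.
    ratios = [first.duration_ms / second.duration_ms
              for first, second in zip(syllables, syllables[1:])
              if first.vowel == "full" and second.vowel in weak]

    return {
        "mean_length_sfv": mean_length("full"),
        "mean_length_srv": mean_length("reduced"),
        "mean_length_sdv": mean_length("deleted"),
        "percentage_red_del": 100 * n_weak / len(syllables),
        "syllable_ratio": mean(ratios) if ratios else None,
    }


# Invented toy data: two full-vowel syllables, each followed by a weak one.
toy = [Syllable(240, "full"), Syllable(120, "reduced"),
       Syllable(200, "full"), Syllable(100, "deleted")]
measures = vowel_reduction_measures(toy)
print(measures)
```

A syllable ratio of 2 in this toy example means that the full-vowel syllable is on average twice as long as the following reduced- or deleted-vowel syllable, which is how values such as 1.76:1 in the tables below can be read.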
4.1. Results: Vowel reduction in native and non-native speech
Vowel reduction in non-native German differs from that in native German in nearly all measured features (see Table 1). The mean length of all types of syllables is longer in non-native German and the syllable ratio, the durational difference between adjacent syllables with a full vowel and those with a reduced or deleted vowel, is lower. Only the percentage of syllables with reduced and deleted vowels is not significantly different between non-native German and native German. In all variables, the standard deviation is much higher in the speech of the learners of German compared to the native speakers. The native speaker norm was defined as the native speakers’ mean value ± one standard deviation. For the syllable ratio, it lies between 1.58:1 and 1.94:1. Of the recordings with the learners of German 47 or 27.2% fall within this range. The vast majority of recordings outside the native normal range show a durational difference between the two types of syllables that is too small; only in two cases is the durational difference larger than that found in native speech.
Table 1. Mean length and standard deviation of syllables with full vowels (sfv), syllables with reduced vowels (srv), syllables with deleted vowels (sdv), the percentage of syllables with reduced and deleted vowel of all syllables and the mean durational ratio of adjacent syllable pairs with the first syllable containing a full and the second a reduced or deleted vowel (syllable ratio) for all syllables in non-native German and native German. (Significant differences are indicated by **=p<0.01, ***=p<0.001)
                              non-native German   native German    sig.
mean length sfv (ms)          240.7 (31.9)        202.9 (17.4)     ***
mean length srv (ms)          179.5 (30.9)        139.7 (12.99)    ***
mean length sdv (ms)          188.7 (49.4)        150 (34)         **
percentage red/del syllables  28.66% (7.05)       29.2% (2.5)      n.s.
syllable ratio                1.49:1 (0.28)       1.76:1 (0.18)    ***
n                             50017               3261
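The native speaker norm used in this section (mean ± one standard deviation) can be reproduced from the rounded values in Table 1. The sketch below is illustrative only; the function names are hypothetical, and because the table reports rounded figures the result need not match the study's own computation in every case.

```python
def native_normal_range(native_mean, native_sd):
    """Native speaker norm: mean value plus/minus one standard deviation."""
    return native_mean - native_sd, native_mean + native_sd


def within_range(value, low, high):
    """True if a recording's value lies inside the native normal range."""
    return low <= value <= high


# Native German syllable ratio from Table 1: 1.76:1 (SD 0.18).
low, high = native_normal_range(1.76, 0.18)
print(f"{low:.2f}:1 to {high:.2f}:1")  # prints 1.58:1 to 1.94:1

# The non-native German group mean (1.49:1) lies below the range.
print(within_range(1.49, low, high))
```

Applied to each recording rather than to the group mean, the same check yields counts of the kind reported above (recordings inside the range, below it, or above it).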
The even greater differences between non-native English and native English vowel reduction are illustrated in Table 2. The learners of English produce on average longer syllables of all kinds, fewer syllables with reduced and deleted vowels and a smaller durational difference between neighbouring syllables with a full vowel and a reduced or deleted vowel. In all measured variables the standard deviation is much higher in non-native English than in native English. Of the learners of English 56 or 33.3% fall within the native speaker range. As observed in non-native German, those recordings outside the native normal range do not show enough durational difference between non-reduced and reduced syllables. In both native German and native English, no significant differences in vowel reduction were found across the different speech styles. Neither do non-native speakers of German produce different vowel reduction strategies in the different speech styles. By contrast, non-native speakers of English, on average, produce a higher syllable ratio in reading passage style and the story retellings than in free speech.
Table 2. Mean length and standard deviation of syllables with full vowels (sfv), syllables with reduced vowels (srv), syllables with deleted vowels (sdv), the percentage of syllables with reduced and deleted vowel of all syllables and the mean durational ratio of adjacent syllable pairs with the first syllable containing a full and the second a reduced or deleted vowel (syllable ratio) for all syllables in non-native English and native English. (Significant differences are indicated by **=p<0.01, ***=p<0.001)
                              non-native English  native English   sig.
mean length sfv (ms)          236.1 (44.38)       210.75 (19.9)    ***
mean length srv (ms)          155.07 (41.02)      101.875 (13.4)   ***
mean length sdv (ms)          157.07 (63.43)      85 (39.04)       ***
percentage red/del syllables  24.01% (6.9)        30.65% (5.74)    **
syllable ratio                1.98:1 (0.4)        2.45:1 (0.33)    **
n                             41670               2492
For both native German and native English, the syllable ratio is significantly correlated with articulation rate measured in mean number of syllables per second (.6 [p<0.05] for German and .89 [p<0.01] for English). No correlation between speech rate and vowel reduction was found for either the non-native German speakers or the non-native English speakers.
4.2. Target language properties in L2 prosody
Table 3 compares the mean length and standard deviation of all syllables with full vowels (sfv), all syllables with reduced vowels (srv) and all syllables with deleted vowels (sdv) as well as the percentage of syllables with reduced and deleted vowel of all syllables and the syllable ratio in non-native German and non-native English. In both non-native German and English, syllables with a full vowel have an average length of between 236 ms and 240 ms. Syllables with a reduced vowel and syllables with a deleted vowel are on average significantly shorter in non-native English than in non-native German. The percentage of syllables with reduced or deleted vowels is higher in non-native German (28.66%) than in non-native English (24.01%). There is a significantly larger durational difference between neighbouring syllables with a full vowel and syllables with reduced or deleted vowels in non-native English compared to non-native German. The percentage of syllables with reduced or deleted vowels and the syllable ratio are correlated significantly in non-native English with .32 (p<0.01) but not in non-native German. This means that the fewer reduced-vowelled syllables are produced in non-native English, the smaller the durational difference between full-vowelled and reduced-vowelled syllables.
Table 3. Mean length of syllables with full vowels (sfv), syllables with reduced vowels (srv), syllables with deleted vowels (sdv), the percentage of syllables with reduced and deleted vowel of all syllables and the mean durational ratio of adjacent syllable pairs with the first syllable containing a full and the second a reduced or deleted vowel (syllable ratio) for all syllables in non-native German and non-native English. (Significant differences are indicated by **=p<0.01, ***=p<0.001)
                              non-native German   non-native English   sig.
mean length sfv (ms)          240.7               236.1                n.s.
mean length srv (ms)          179.5               155.07               ***
mean length sdv (ms)          188.7               157.07               ***
percentage red/del syllables  28.66%              24.01%               ***
syllable ratio                1.49:1              1.98:1               ***
n                             50017               41670
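The correlations reported in sections 4.1 and 4.2 relate two measures taken per recording, for example the percentage of reduced or deleted syllables and the syllable ratio. The source does not state which coefficient was used; the sketch below assumes Pearson's product-moment correlation, and the function name and the toy values are invented for illustration.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / sqrt(sxx * syy)


# Invented per-recording values: percentage red/del and syllable ratio.
percentages = [20.0, 24.0, 28.0, 32.0]
ratios = [1.6, 1.8, 2.0, 2.2]
r = pearson_r(percentages, ratios)
print(round(r, 3))
```

A positive coefficient, as reported for non-native English (.32), means that recordings with a higher share of reduced or deleted syllables also tend to show a larger durational difference between the two syllable types.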
Differences between target languages can also be found in a comparison of four speakers in the LeaP corpus who were recorded as learners of both German and English. Two of them can be classified as German-dominant since this was the language they learned before English and use more frequently: speakers AB and BD. Two learners can be called English-dominant: speakers AZ and CD. Each speaker shows distinct differences in fluency between his or her two foreign languages, and this difference lies in the direction suggested by the speaker’s learning history and language use. Speakers AB and BD are more fluent in German, producing a higher articulation rate, a longer mean length of run and fewer filled pauses. Conversely, speakers AZ and CD are more fluent in English with a higher articulation rate and a longer mean length of run in all three speech styles.
Table 4 illustrates the syllable ratio and the mean percentage of syllables with reduced or deleted vowels in both the speakers’ non-native German and non-native English speech.
Table 4. Mean syllable ratio and mean percentage of syllables with reduced or deleted vowels in the speech of non-native speakers of both German (G) and English (E). (Significant differences are indicated by ***=p<0.001, *=p<0.05)
                               AB               AZ               BD               CD
                               G       E        G       E        G       E        G       E
syllable ratio***              1.5:1   1.92:1   1.3:1   1.89:1   1.4:1   1.99:1   1.35:1  2.2:1
percentage red/del syllables*  30.8%   22.4%    24.26%  24.8%    34.3%   21.9%    31.65%  24%
n                              705     1602     851     1783     1549    2343     1289    1261
All of the speakers, notwithstanding their level of competence and experience with English, be it their L2 or L3, show a higher syllable ratio and a lower percentage of syllables with reduced or deleted vowels in English than in German.
4.3. Acquisition of vowel reduction
In a longitudinal study, vowel reduction in the speech of 17 non-native speakers of German and 13 non-native speakers of English was analysed before and after a six-month stay abroad in Germany (n=5) or England (n=5) or before and after a six-month course in German (n=12) and English (n=8) pronunciation and prosody. No significant difference in the syllable ratio between full-vowelled syllables and syllables with reduced or deleted vowels was found at the two points in time for either learner group (Table 5). An individual analysis of each speaker, however, revealed that three of the non-native speakers of German who had produced a syllable ratio within the native normal range before going abroad or taking a pronunciation course produced a lower syllable ratio, outside the normal native range, six months later. The same was observed for four non-native speakers of English.
Table 5. Mean syllable ratio and mean percentage of syllables with reduced or deleted vowels in the speech of non-native speakers of German and English before and after a 6-month stay abroad or a pronunciation training course. (Significant differences are indicated by **=p<0.01, *=p<0.05)
                            mean syll      mean syll     mean percentage   mean percentage
                            ratio before   ratio after   before            after
non-native German (n=15)    1.58:1         1.38:1        25.8%             30.8%**
non-native English (n=13)   2.09:1         2.05:1        23.5%             25.9%*
On average, both the German and the English non-native speaker groups succeeded in producing a significantly higher percentage of reduced- or deleted-vowel syllables after the stay abroad or after the pronunciation training course. The German native speakers’ normal range of percentage of syllables with reduced or deleted vowels lies between 26.8% and 31.7%. Of the nine non-native speakers of German who produced an overall percentage of syllables with reduced or deleted vowels below this normal range before going abroad or taking part in a pronunciation course, only two did not succeed in increasing the relative frequency of reduced-vowelled syllables to a native-like extent in the retellings. The English native speakers’ normal range of overall percentage of reduced-vowelled syllables lies between 24.91% and 36.39%. Of the nine non-native speakers of English whose percentage was below this range before the course or the stay abroad, three produced an overall “normal” percentage of syllables with reduced or deleted vowels in the recording afterwards. No difference was found between the group of learners going abroad and the group taking a pronunciation course.
4.4. Qualitative analysis of linguistic structures
In a qualitative analysis of the vowel reduction patterns of three different learner groups, two particular inflectional morphemes in German were investigated. In German, word-final post-tonic C+<-en> and C+<-em> syllables, as for example in treffen (‘to meet’) and diesem (dative form of the demonstrative pronoun ‘this’), where C stands for any consonant, are produced as C+[ən] and C+[əm] with the reduced vowel schwa. In connected speech, the vowel may even be deleted (e.g. Helgason and Kohler 1996). The realization of these C+<-en> and C+<-em> syllables was analysed for three native and 16 non-native speakers of German with different language backgrounds: English native speakers (n=5), Italian native speakers (n=6) and Mandarin Chinese native speakers (n=5). Table 6 illustrates the phonetic realisation of the word-final syllables C+<-en> and C+<-em> by all speakers. The percentages of productions without vowel (deleted), productions with a schwa /ə/, the a-schwa /ɐ/ or a full vowel are given for each group. The German native speakers produce roughly half of the word-final syllables C+<-en> and C+<-em> without a vowel and half with the reduced vowel [ə]. A-schwa and full vowels never occur in these syllables. The English learners show a different pattern, deleting the majority of these vowels. The Italian learners produce a similar quantity of syllables without vowel and with [ə]. In 9% of these syllables, however, a full vowel is produced, which is significantly different from the German native speakers. The Chinese non-native speakers of German show a clear preference for the [ə] vowel in these positions, followed by some deleted vowels (17%) and some full vowels. A-schwa occurs in 6% of the cases, which is significantly different from the German native speakers. There are significant differences in the vowel reduction strategies between the different non-native speaker groups. An ANOVA revealed a significant (p<0.05) difference in the percentage of schwas produced in the syllables of the type C+<-en> and C+<-em> between the three non-native speaker groups. The Chinese produce significantly more schwas in this phonetic environment than the other two speaker groups. Deleted vowels in these syllables are produced significantly more often (p<0.01) by the English non-native speakers of German than by the other two speaker groups. A-schwas and full vowels in this environment are produced only by the Italian and the Chinese non-native speakers of German, but not by the native English speakers.
Table 6. Mean percentage of production of word-final syllables ending in C+<-en> and C+<-em> with deleted vowel, /ə/, /ɐ/ or a full vowel by each speaker group in the story retellings. (Significant differences from the native speaker group are indicated by *=p<0.05)
           deleted   /ə/    /ɐ/    full vowel   total
German     54%       46%    -      -            44
English    87%       13%    -      -            66
Italian    44%       45%    2%     9%*          59
Chinese    17%       76%    6%*    1%           118
Table 7 illustrates the mean duration and the percentage of deletion for all post-tonic syllables of the type C+<-en> in the reading passages and retellings of German native speakers and the English, Italian and Chinese non-native speakers of German. The English native speakers delete more syllables of this type when speaking German than the German native speakers. In those few cases when the vowel is not deleted, however, it is on average significantly longer than that produced by the German native speakers. The Italian and the Chinese non-native speakers of German delete fewer vowels in these syllables than the German native speakers. In addition, the Chinese learner group produces on average significantly longer vowels. An ANOVA carried out for the three learner groups revealed significant differences in vowel duration between them (F(2,255)=7.53, p<0.001). Vowel quality was compared in the female speech of all speaker groups by measuring the mean values of the first two formants F1 and F2 (Table 8). Unfortunately, the number of vowels produced by the two English non-native speakers of German is very small so that a statistical evaluation is difficult. These two speakers have a higher F1, which reflects a lower tongue position, than the German native speakers. Both the Italian and the Chinese non-native speakers of German also have higher values for F1 and in addition also for F2. This means that the vowel they produce is tenser than the one produced by the German native speakers. An ANOVA carried out for the F1 of the vowels produced by the three groups of English, Italian and Chinese non-native speakers of German revealed no significant group differences.
Table 7. Mean duration of all vowels in the post-tonic syllables of the type C+<-en> produced by the German native speakers and the three learner groups in the reading passages and retellings. (Significant differences from the native speaker group are indicated by ***=p<0.001, **=p<0.01, *=p<0.05)
                 duration (s)   percentage deleted   n
German (n=3)     0.046          76.5%                98
English (n=5)    0.06*          88.9%                118
Italian (n=6)    0.054          48.3%                178
Chinese (n=6)    0.068***       32.2%                236
Table 8. Mean F1 and F2 of all vowels in the post-tonic syllables of the type C+<-en> produced by the women among the German native speakers and the three learner groups in the reading passages and retellings. (Significant differences from the native speaker group are indicated by ***=p<0.001, **=p<0.01, *=p<0.05)
                 F1 (Hz)     F2 (Hz)    n
German (n=2)     376         1440       9
English (n=2)    629***      1427       4
Italian (n=4)    517.5**     1968***    40
Chinese (n=4)    521.9**     1600*      122
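The group comparisons in this section rest on one-way analyses of variance, such as the reported F(2,255)=7.53 for vowel duration across the three learner groups. The following sketch shows how such an F statistic and its degrees of freedom are obtained in a standard one-way design; the function name and the toy data are invented, and the source does not specify the software or exact design used. Under this standard design, the reported degrees of freedom would correspond to three groups (3 − 1 = 2) and 258 measured vowels (258 − 3 = 255).

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a standard one-way ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or n_total <= k:
        raise ValueError("need at least two groups and more values than groups")

    group_means = [sum(g) / len(g) for g in groups]
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Variation of the individual values around their own group mean.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within


# Invented toy values for three groups (not data from the study).
f_value, df_b, df_w = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(f"F({df_b},{df_w}) = {f_value:.2f}")  # prints F(2,6) = 13.00
```

The larger the F value, the more the group means differ relative to the variation within each group; the significance level is then read from the F distribution with the two degrees of freedom.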
5.
The LeaP corpus in language teaching
The LeaP corpus was used as a tool for inductive learning in a University course entitled “Phonetic properties of non-native speech”, in which 21 students of English at the University of Freiburg in Germany participated. The course lasted for one semester (October 2004 to February 2005) and consisted of 15 classes comprising a mix of lecture, discussion and corpus work. In 13 classes, the students worked with the corpus, using Praat, and solved small tasks such as the measurement of vowel lengths. At the end of the term, the students carried out a group project on an empirical research question of their choice. Research questions included for example “Final devoicing in English by German learners” and “Fluency after a stay abroad”.
After the course, the students filled in a questionnaire about the corpus work. In questions 1 and 2 they were asked to rate their preferred teaching method and to estimate where they learned most. Rating options given ranged from 1 (best) to 5 (worst). The students, on average, preferred the discussion (2.2) and lecture (2.2) over corpus work (2.5), reading (2.6) and the presentations by students (3.3). They felt they had learned most in the lecture parts (1.66), followed by their own reading (1.8) and the discussions (2.47). Corpus work (2.66) and the presentations by students (3.25) were rated lowest. In the third question, the students agreed that corpus work was communicative (yes: 75% / no: 25%), interesting (95% / 5%), stimulating (86% / 14%) and varied (62% / 38%). On the whole, they did not judge it to be boring (yes: 11% / no: 89%), too difficult (0% / 100%), too easy (5% / 95%) or discouraging (0% / 100%). Furthermore, 90% agreed that they had learned a lot about foreign accent and that they had become more aware of foreign accents (81%). Only 10%, however, claimed that their own accent had improved through the corpus work, but 72% believed that their language teaching will improve.
6.
Summary and outlook
The aim of this article was to illustrate how a corpus-based analysis of non-native prosody can complement current research methods and to demonstrate the new opportunities it offers in L2 prosody research and teaching. For this purpose, the LeaP corpus was analysed with respect to vowel reduction in non-native German and non-native English. Comparing non-native speech with native speech, the results obtained confirm the observations reported in the small-scale studies carried out by Wenk (1985), Bond and Fokes (1985), Flege and Bohn (1989), Kaltenbacher (1998) and Gut (2003). The major difference between non-native speech and native speech lies in the lack of durational difference between syllable pairs in which a syllable with a full vowel precedes a syllable with a reduced or deleted vowel. Overall, syllables of any kind are longer and therefore non-native speech is slower than native speech. In addition, non-native speakers of English do not succeed in a quantitatively sufficient reduction or deletion of vowels. Non-native and native vowel reduction also differ in terms of their correlation with other phonological features. Whereas in native German and native English the extent of vowel reduction correlates with the speaking rate, no such correlation exists in non-native speech.
The corpus-based analysis moreover offered the opportunity to carry out a cross-linguistic comparison between speech produced by learners of two different languages, which has not been attempted so far. It was shown that non-native German and non-native English differ significantly in terms of vowel reduction. The durational difference between non-reduced and reduced syllables is greater in non-native English than in non-native German. The percentage of reduced syllables in non-native German, however, is greater than in non-native English. This difference between the two interlanguages can be interpreted as target-language influence, probably based on different syllable structures and morphology in the two languages. This is furthermore corroborated by the analysis of some non-native speakers of both German and English: they show distinctly different vowel reduction processes when speaking German than when speaking English. Moreover, a comparison of vowel reduction in different speech styles was carried out. Significant effects of style were only found for non-native English, where free speech shows less vowel reduction than story retellings and reading passage style. Apart from large-scale quantitative analyses, qualitative studies of speaker subgroups and particular linguistic structures were carried out. Several analyses showed that although there is no difference in the amount of reduced syllables between native and non-native German, the distribution of these syllables differs significantly. Learners do not seem to reduce or delete vowels according to the same phonological rules as native speakers. In particular, certain inflectional morphemes with obligatory vowel reduction in German showed no or very little vowel reduction in non-native speech. An acoustic analysis of the quality of the vowel produced furthermore showed significant differences in tongue position and tenseness in non-native vowels.
Some differences in the distribution and quality of reduced vowels were found for learners with different language backgrounds. However, in many areas, the learner subgroups exhibited the same production patterns. Finally, longitudinal data was analysed with the aim of identifying characteristics and factors of the acquisition process. Two learner groups were compared: one participating in a stay abroad programme and the other participating in a pronunciation training course. As concerns learning context, no difference between the two was found. Results showed that both groups improved some aspects of vowel reduction, but the variation among learners was high. Whereas both learner groups improved the number of reduced syllables in the direction of native speaker values, individual learners showed divergent learning paths. None of the learners succeeded in acquiring native-like differences between full-vowelled and reduced-vowelled syllables even after the course or stay abroad. This leads to the tentative proposal that the appropriate amount of vowel reduction is acquired before the appropriate phonetic realisation of it. In summary, the article demonstrated the advantages of corpus-based research in L2 prosody in comparison with current experimental methods. Large-scale quantitative analyses of corpora yield generalizable results concerning the frequency of linguistic structures and their variation among learners, which cannot be derived from experimental studies based on the productions of a few participants. In this study, for example, a total of more than 90,000 syllables was analysed. These large-scale analyses furthermore offer insights into hitherto unresearched areas, for example variation among learners. It was demonstrated that variation is constrained by target language influences and that certain aspects of vowel reduction are acquired after others. However, it was also shown that quantitative analyses must be complemented by qualitative analyses. The observation of the overall frequency of reduced vowels in syllables is of limited value unless it is augmented by an investigation of particular types of linguistic structures such as specific unstressed syllables. A first application of the corpus in a university course showed that corpus work has some potential for raising students’ language awareness. However, despite enjoying the experience, students doubt that it contributes to an improvement of their pronunciation in an L2. Further research is necessary to test the claims made by corpus linguists about the pedagogical value of corpus-based work in the classroom.
Notes
1. Funded by the Ministry of Education, Research and Science of North-Rhine Westphalia, Germany.
References Biber, David, Susan Conrad, and Randi Reppen 1994 Corpus-based approaches to issues in applied linguistics. Applied Linguistics 15, 169–187.
166
Ulrike Gut
Bond, Z. and Joann Fokes 1985 Non-native patterns of English syllable timing. Journal of Phonetics 13, 407–420. Botley, Simon, Julia Glass, Tony McEnery, and Andrew Wilson (eds.) 1996 Proceedings of Teaching and Language Corpora 1996. Lancaster: UCREL technical papers volume 9. Flege, James and Ocke Schwen Bohn 1989 An instrumental study of vowel reduction and stress placement in Spanish-accented English. Studies in Second Language Acquisition 11, 35–62. Ghadessy, Mohsen, Alex Henry, and Robert Roseberry 2001 Small Corpus Studies and ELT. Amsterdam: John Benjamins. Granger, Sylviane 2002 A bird’s eye view of learner corpus research. In: Sylvaine Granger, Joseph Hung and Stephanie Petch-Tyson (eds.) Computer Learner Corpora, Second Language Acquisition and Foreign Language Teaching, 3–33. Amsterdam: Benjamins. 2004 Computer learner corpus research: current status and future prospects. In: Ulla Connor and Thomas Upton (eds.), Applied Corpus Linguistics. A multidimensional perspective, 123–145. Amsterdam: Rodopi. Granger, Sylvaine and Christopher Tribble, 1998 Learner corpus data in the foreign language classroom: formfocused instruction and data-driven learning. In: Sylviane Granger (ed.), Learner English on Computer. 199–209. London: Longman. Granger, Sylviane, Joseph Hung and Stephanie Petch-Tyson (eds.) 2002 Computer Learner Corpora, Second Language Acquisition and Foreign Language Teaching. Amsterdam: Benjamins. Gut, Ulrike 2003 Non-native speech rhythm in German. Proceedings of the 15th International Congress of Phonetic Sciences, Barcelona, 2437– 2440. Hasselgard, Hilde 1999 Review of S. Granger (ed.), Learner English on Computer. ICAME Journal 23, 148–152. Helgason Pétur and Klaus Kohler 1996 Vowel deletion in the Kiel Corpus of Spontaneous Speech. In: Klaus Kohler, Claudia Rehor and Adrian Simpson (eds.), Sound Patterns in Spontaneous Speech, Arbeitsberichte des Instituts für Phonetik und digitale Sprachverarbeitung 30, Universität Kiel, 115–157.
Part 2. Teaching practice
Teaching prosody in German as foreign language
Ulla Hirschfeld and Jürgen Trouvain

1. Introduction
The “sound of a language”, which is primarily transmitted by prosodic features, does not just convey the content of an utterance but also other important communicative information. It marks the emotional state of the speaker, and it can come across as calming, repellent, encouraging, warm, cold, intimate or strange. Moreover, the individual way of speaking is a key feature of a speaker’s personality, an audible “business card”. From a foreign accent and other features, native speakers infer educational status, social affiliation, degree of intelligence and even certain traits of the individual’s character (cf. Hirschfeld 1994). For these reasons, prosodic features should be given first priority when teaching pronunciation.

What can be expected from “theory” on prosodic structure and the realisation of prosody? What can be conveyed to teachers of German as a foreign language (henceforth DaF, for Deutsch als Fremdsprache)? How can it be made clear? Theories about what exactly prosody is and how it can be described are not as simple as a teacher may wish, because the rules for realising prosody are not as clear-cut as the rules for realising the sounds of a word. Various conditions, such as the communicative situation and the text type, determine the correctness and acceptability of the prosody of an utterance. Seen from the teachers’ perspective, researchers do not always have a good knowledge of what teachers are interested in. High-quality teaching of prosody requires a fruitful dialogue between researchers and teachers. The goal of this article is to show how concepts from “theory” can be applied and integrated into foreign language teaching, illustrated with examples from German as the target language.

2. Typology of deviant forms on the prosodic level
Pronunciation and phonetics were ignored in German language teaching for a long time (and some popular textbooks that include practical material still ignore them), but a change can be observed over the last 15 years. However, exercises focusing on prosody are rare, apart from those dealing with lexical stress. If pronunciation is part of a language teaching book, the topic is often reduced to segmental phenomena such as the “ich-Laut” [ç], the glottal fricative [h] or the rounded front vowels [yː ʏ øː œ]. Other consonants and vowels, such as schwa [ə], are seldom considered. This is striking because schwa is by far the most frequent vowel in German (cf. Kohler 1995: 222), and clearly the most frequent vowel in unstressed syllables, where it plays a vital role in the rhythmic alternation of stressed and unstressed syllables. Although in Figure 1 the vowels are presented in their underlying form, the very common elisions of schwa in endings like <-en>, <-el>, <-em> in German represent a problem for learners with respect to vowel quality (schwa is replaced by a full vowel such as [ɛ]) as well as to rhythmic patterns.

Figure 1. Occurrences of vowels in a German speech corpus in percent, sorted by frequency, based on the data by Kohler (1995: 222).
In addition, many phenomena responsible for a foreign accent in German are of a suprasegmental or prosodic nature. Take, for example, a study in which learners of German with different L1s produced the sentence “Es regnet.” (“It is raining.”). When the accent fell on the last syllable of the verb (accentuation pattern “es regNET”), more than half of the German listeners (of various ages) were unable to recognise the sentence correctly. With the correct pattern “es REGnet”, only 5% of the listeners failed to get the message. For “es regNET”, most listeners understood sentences such as “Ist sehr nett.” (“(It) is very nice.”) or “Es ist nett?” (“It is nice?”). The fact that declaratives as well as questions were understood shows that the melodic shape of the utterance was not clear and also that the segmental structure had been adapted to the suprasegmental structure. Here, the location of the accented vowels plays a decisive role. Too many interruptions in the form of pauses within an utterance led to further difficulties. Suprasegmental deviations in combination with segmental mistakes made the utterances completely incomprehensible for many listeners in this study (cf. Hirschfeld 1994: 102 ff.).

Although there is a great individual range of errors, some problems can be observed in groups of learners with heterogeneous language backgrounds. The list in Table 1 summarises possible errors attributable to the prosodic levels of prominence (of words and utterances, respectively), pitch contour and phrasing.

Table 1. Problems of L2 speech on different prosodic levels.

lexical stress
phonological  1. on the wrong syllable
phonetic      2. lengthening of short stressed vowels
phonetic      3. too little contrast when realising stressed vs. unstressed syllables
phonetic      4. lack of segmental reductions in unstressed syllables
phonetic      5. melodic deviations in stressed and/or unstressed syllables (cf. melodic contour)
phonetic      6. over-strong secondary stresses in longer words, esp. compound words

pitch accents
phonological  1. too many pitch accents
phonological  2. accent on the wrong word/s
phonetic      3. incorrect or too strong secondary pitch accents
phonetic      4. over-lengthening of pitch accented short vowels
phonetic      5. too little contrast when realising pitch accented vs. unaccented syllables
phonetic      6. lack of segmental reductions in unaccented syllables

melodic contour (pitch range, pitch accent realisation, end of utterances)
phonetic      1. melodic deviations in stressed and/or unstressed syllables
phonological  2. wrong melodic contour at the end of utterances

pauses and phrasing structure
phonological  1. too many pauses
phonological  2. pauses at wrong locations
phonetic      3. too long pauses

3. L1 influences
Some features of German prosody nearly always lead to difficulties for language learners: typical melodic contours (e.g. reaching the final low at the end of utterances), the assignment and production of lexical stress, and the structuring and realisation of rhythm. Depending on their first language or previously acquired foreign languages, speakers exhibit different problems with their German L2 prosody. The examples in Table 2 illustrate typical deviations for some languages. Additional types of deviation are described in the contrastive studies in Hirschfeld et al. (2002 ff.), where a survey of 40 languages is given. It must be borne in mind that for many language learners German is the second foreign language after English, and sometimes even the third or fourth. Thus, interference from previously acquired foreign languages must also be expected.

Table 2. Typical deviant prosodic forms for L2 German speakers with various first languages.
[Table 2: the columns are the first languages Arabic, English, French, Korean, Russian and Spanish; the rows are the deviant forms listed below. The crosses marking which form is typical for which language are not reproduced here.]
phonological deviations
– rules of lexical stress
– length and quality of vowels in stressed syllables
phonetic deviations
– contrast stressed–unstressed
– insufficient reduction of unstressed syllables
– over-strong reduction of unstressed syllables
– accent marking predominantly by intensity
– pitch contour in utterance-final low

4. Phonetics in textbooks, learning materials and foreign language teaching
In teaching German as a foreign language, pronunciation training traditionally plays only a minor role, if there is any training at all (cf. Hirschfeld 2003: 193 f.). The following sub-sections present an impressionistic analysis of the various factors and processes in current German language teaching practice.

4.1. Teaching and learning materials and methods
Teaching and learning materials and methods offer no satisfactory choice with respect to the types of exercises, the mix of methods, and the number of exercises. There is insufficient additional material for the particular needs of different learner groups, which may differ in several respects such as:
– the learners’ first language(s)
– their proficiency level in the foreign language
– the age of the learner
– the learning goals
The methods also have to differ in aspects such as
– the teaching traditions the learners are familiar with
– the group characteristics (linguistically homogeneous vs. heterogeneous; group size)
– the teaching situation (time; location; technical equipment).

4.2. Research on pronunciation training
There are still too few published research studies on didactic and methodological issues in pronunciation teaching. This explains why the analysis of research results is still not an integral part of teachers’ education, especially at universities.

4.3. Consideration of research in learning materials
The authors of textbooks do not usually take publications in phonetic research into consideration, e.g. studies in contrastive phonetics, investigations of norms and variation in standard German, or studies dealing with the functions and effects of phonetic features in intercultural communication.

4.4. Teachers
The many peculiarities that distinguish the teaching and learning of pronunciation from other skills are not considered in a differentiated way. Instead of dealing with pronunciation problems that have individual and/or L1-related causes, many teachers give up before they start, arguing that the possible effect does not justify the effort. There is a widespread opinion that phonetics is a luxury. This attitude leads to the Cinderella role of pronunciation in teaching: phonetics, as the foundation of speaking and hearing in spontaneous conversation as well as of learning alphabet-based writing systems, is simply ignored. Even though the situation has improved somewhat with regard to speech sounds (and lexical stress) in the last 15 years, the teaching of prosody is still completely unsatisfactory.

A second problem is that phonologically relevant prosodic and segmental characteristics and standards of pronunciation are not just transmitted by means of appropriate exercises but also by the teacher’s own speech production. In foreign language training, learners orientate themselves to their teachers. Thus, the teacher’s pronunciation plays an important role because it functions as a model (Dieling and Hirschfeld 2000: 19 ff.). Many teachers of German are a poor model because they speak with a non-German or a regional accent. It is therefore important that the teaching material functions as a model. The teachers in question should be aware of their accented speech and acknowledge it.

4.5. Training of teachers
In teacher training, the phonetic/phonological and the pedagogical basics are not taught sufficiently. The consequence is that many teachers feel insecure about how to introduce phonetic forms and make learners aware of them, how to correct deviant forms and how to help learners to automate correct forms. During all training phases, teachers must be prepared
– to determine the use of the didactic methods in class, but also for individual teaching, i.e. to develop concepts of exercises for listening and speaking,
– to recognise prosodic (and segmental) deviant forms,
– to point out the deviant forms and to correct them, ideally in an emphatic but motivating and effective way,
– to select and invent appropriate exercises, and to ensure a sufficient level of automation,
– to convey rules and knowledge (differentiating quantity and methods for different groups of learners),
– to accept and understand their role as a model of language and speech (with the consequences for their respective foreign or regional accents).
These requirements are much more than those we can observe in classrooms nowadays. In the next two sections various types of exercises are presented followed by concrete examples applied in university classes.

5. Types of exercises
Many teachers of German assume that prosody is acquired by listening and imitation (“parrot method”). Most teaching materials also focus on imitation exercises. However, few learners are able to produce acceptable imitations. Teenagers and adult learners in particular exhibit problems that can be attributed to various causes. Therefore, the types of exercises for the development of perception and production skills must be carefully selected and their timing carefully organised. The first step, listening exercises, does not aim at understanding the content but at phonological and phonetic listening. The focus of phonological or phonematic listening is to distinguish and identify elements which differentiate meaning:
a. ein FACH – EINfach (English “a compartment”/“a subject” – “simple”)
b. Ja? – Ja! (English “Yes?”/“Really?” – “Yes!”)

Phonological listening is the foundation for the further processing and interpretation of spoken utterances. The next step, phonetic listening, goes beyond the simple differentiation of meaning: it requires the perception of phonetic variants which occur frequently in daily situations, e.g. the speech melody in accented syllables or the lengthening of pauses. Here, the common practice of providing audio examples and asking learners to “listen carefully” is not sufficient. Teachers as well as learners need to know where difficulties are likely to occur. This knowledge is only possible when the results of the listening are monitored. There are several ways of doing this, ranging from marking syllables and words to transcribing; for quick feedback, hand signals can be used. Controllable listening exercises with minimal pairs are also recommended. These are easily prepared, using first and second names or geographical names. For a discrimination task, two or three names can be given, e.g.
a. Which town is stressed on the last syllable?
   Luzern - Salzburg, Berlin - Halle - München
b. Which name contains a long vowel in the stressed syllable?
   Müller - Mühler, Mehler - Meller - Möller
For an identification task the teacher gives an example in advance, e.g.
a. Which syllable is lexically stressed? The first, the second, the third or the last?
   Mönchengladbach
b. Is the stressed vowel a long vowel or a short vowel?
   Möhler
The learner can even practise this type of listening exercise without a teacher if appropriate software is at hand, such as “Phonothek interaktiv” (Hirschfeld and Stock 2000).

It is recommended that exercises for listening are linked to those for speaking by also using the listening material for imitating, reading aloud, variation and combination. Furthermore, monologue and dialogue texts, word lists and grammar exercises taken from the textbook can be used as a basis for phonetic exercises; they all contain examples which are appropriate for practising. Examples can be used for visual highlighting, for word search, for sorting, listening, humming, articulating and reading aloud. They can be used in different contexts and they can also be accompanied by gestures. Pauses, melodic contours and accent patterns can be marked in texts, either after listening or from memory. Learners can articulate synchronously with the speakers of the audio examples. It is important that different learner strategies are stimulated and that the same type of exercise is not always offered. Exercises should vary and the requirements should continuously increase. Exercises for automation should start with rhythmic-melodic units larger than a word; the practice of single sounds and the articulation of words in isolation should be restricted to the first phase and the correction phase.

5.1. Methodological steps
We recommend a methodological procedure that has been validated across a wide range of teaching situations (cf. Hirschfeld 2003: 202):
1. introduction of the topic, e.g. with a comprehension text
2. listening control, i.e. differentiate (compare) and identify (recognise) prosodic features
3. imitation attempts, individually and in chorus in order to rehearse anonymously
4. correction of deviant forms, to make the learners aware of the critical phonetic features
5. repeated listening control
6. further imitation attempts with feedback
7. automation by repeating, reading, variation of speaking style

5.2. Typology of exercises
Providing a good mix of methods includes the provision of different types of exercises. In Dieling and Hirschfeld (2000: 47 ff.) various types of exercises are suggested. The most important ones are:
– listening exercises
  – preparatory listening exercises as warm-up exercises: e.g. first names in rhymes, proverbs, texts
  – identification: e.g. recognising the stressed syllable in a first name (Michael, Michaela, Christian, Christiane)
  – discrimination: e.g. compare stress position in first names (Peter = Petra, Robert ≠ Roberta)
  – applied listening: e.g. first names in texts
– imitation exercises
– creative production exercises
  – alter, add, combine linguistic elements
  – in combination with work on grammar and vocabulary
– applied production exercises
  – read aloud, oral presentation
  – free speech
– acting in scenes

5.3. Central points for exercises in prosody
The focus of exercises should differ according to the learners’ first language. Native speakers of tone languages have greater and more complex difficulties than native speakers of Germanic languages (other than German). The following topics are fundamental for the comprehension of German; they should be given a central role among the phonetic exercises:
– lexical stress
  – stress assignment (application of stress rules)
  – vowel length in stressed syllables (long vs. short)
  – contrast of stressed vs. unstressed syllables
– rhythm
  – accentuation at utterance level
  – alternation of stressed and unstressed syllables in rhythmic groups
– schwa
  – realisation of schwa-syllables, especially elision of schwa in word-endings
  – as an important element of rhythmic structure
– pauses and phrase structuring
– typical melodic contours
  – fall-rise contour in yes/no-questions and contact-eliciting or very friendly utterances
  – rise-fall contour in terminal declarative utterances
  – (extreme) final low at the end of utterances

6. Examples of exercises
In this section we show some examples of exercises that can be applied in almost every pronunciation lesson and that can easily be varied. Their two most important features are:
1. Apart from the phonetic topic there is always a content theme such as place names or clothing. The exercises in sections 6.1 and 6.2 can be individually modified: instead of town names one can practise the stress patterns with food terms, hobbies, names of bus stations etc. This content orientation makes the exercises interesting for the learners and supports better memorisation of the phonological pattern as well as the vocabulary.
2. Each exercise consists of several steps which elicit and support the activity of the learner. The exercises are not restricted to a chance collection of a few isolated words and sentences which have to be heard and repeated. The structure of exercises proposed here makes a high degree of automation possible because further steps can be added continuously, i.e. the same material can be practised with different tasks.
The following exercises are taken from Hirschfeld and Reinke (1997), where further practical suggestions are given.

6.1. Lexical stress
Step 1: Listen to the town names and assign them to the stress patterns.
Berlin, Hannover, Hamburg, Magdeburg, Neuruppin …
[Grid of four numbered stress patterns (1.–4.), each a sequence of symbols for stressed and unstressed syllables, under which the town names are entered.]
Step 2: Listen again and repeat.
Step 3: Can you find other German towns that fit these patterns?
Step 4: Draw stress patterns for towns in your own language.
Step 5: Practice exercise: plan a journey through towns with bi-syllabic names and stress on the first (second) syllable.
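
The assignment in Step 1 can be sketched in a few lines of Python, for instance to prepare an answer key for the class. This is only an illustrative sketch under stated assumptions: the syllable counts and stress positions are entered by hand and reflect the standard pronunciation of the town names, and the notation ('X' for a stressed, 'x' for an unstressed syllable) as well as the names TOWNS, stress_pattern and group_by_pattern are hypothetical and not taken from Hirschfeld and Reinke (1997).

```python
from collections import defaultdict

# Hand-annotated data (an assumption of this sketch, not part of the exercise):
# town name -> (number of syllables, 1-based position of the stressed syllable)
TOWNS = {
    "Berlin": (2, 2),
    "Hannover": (3, 2),
    "Hamburg": (2, 1),
    "Magdeburg": (3, 1),
    "Neuruppin": (3, 3),
}


def stress_pattern(syllables, stressed):
    """Return a pattern such as 'xX': 'X' is the stressed syllable, 'x' an unstressed one."""
    if not 1 <= stressed <= syllables:
        raise ValueError("stressed syllable position out of range")
    return "".join("X" if i == stressed else "x" for i in range(1, syllables + 1))


def group_by_pattern(towns):
    """Group town names under their stress pattern, keeping the input order."""
    groups = defaultdict(list)
    for name, (syllables, stressed) in towns.items():
        groups[stress_pattern(syllables, stressed)].append(name)
    return dict(groups)


if __name__ == "__main__":
    # Print one line per stress pattern with the towns that belong to it.
    for pattern, names in sorted(group_by_pattern(TOWNS).items()):
        print(pattern, ", ".join(names))
```

The same structure can be reused for the variations suggested above (food terms, hobbies, names of bus stations) by replacing the hand-annotated word list.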

6.2. Vowel length
Step 1: Listen to example words (Mantel, Schal, Hose, Socke) and show with your hands whether the stressed vowel is long or short.
Step 2: Write down (ten) pieces of clothing below the appropriate heading, short or long, in a two-column table, depending on the length of the stressed vowel.

short    long
Hemd     Hose
Rock     Schal
...      ...

Step 3: With this table, the learners have to find out for themselves which spelling rules indicate whether a vowel is long or short. These spelling-to-sound rules for the vowels should be summarised by the teacher.
Step 4: For practice, the learners are asked to “pack a suitcase for a short trip, taking only clothes that have a short vowel!”. Alternatively, or as an exercise after the “short-vowel journey”, a “long-vowel journey” can be offered.

6.3. Melody and phrasing
Step 1: Have a look at the following lines. The sentences consist of the same words in the same order.
1 PAULA WILL PAUL NICHT
2 PAULA WILL PAUL NICHT
3 PAULA WILL PAUL NICHT
4 PAULA WILL PAUL NICHT
5 PAULA WILL PAUL NICHT
...
Step 2: Listen to the examples and add the punctuation marks. This should be done step by step, example by example, week by week. You end up with a list like this one:
1 PAULA WILL PAUL NICHT.
2 PAULA WILL, PAUL NICHT.
3 PAULA WILL? PAUL NICHT.
4 PAULA, WILL PAUL NICHT?
5 PAULA WILL PAUL, NICHT?
Step 3: Find further examples and add the following punctuation marks between the words: ? ! , ; : . „ “
Step 4: Compare your results with your partner’s.
Step 5: Read the different variants aloud. Your partner should correct your performance.

6.4. Sentence or (pitch) accent
Step 1: Listen to the examples and underline the accented words. The list will look like this (the underlining of the accented word in each line is not reproduced here):
1 PAULA WILL PAUL NICHT.
2 PAULA WILL PAUL NICHT.
3 PAULA WILL PAUL NICHT.
4 PAULA WILL PAUL NICHT.
Step 2: What are the meanings of the variants? Can you imagine a situation where “Paula” is in the focus of the utterance?
Step 3: Read the different sentences aloud. This can be done as a partner exercise.

6.5. Schwa
Step 1: The letter <e> is the most frequent letter in German. It sometimes corresponds to [eː], [ɛ], or is part of [aɪ] [ɔɪ] [iː]. But on many occasions the <e> has a different pronunciation, as in the words hatte, rede, sage, liebe. Can you produce this e-sound in isolation?
Step 2: What are the corresponding infinitive and plural forms? What happens to the written <e> in hatten, reden, sagen, lieben?
Step 3: Look at the following word stems: red-, sag-, lieb-. What does the verb sound like in the first person singular compared to the first person plural? [ˈreːdə] vs. [ˈreːdn̩]
Step 4: Listen to the following three words: Härte, härter, Hertha. What is the difference? Can you hear the difference between the unstressed [a] and the “vocalized r”?
Step 5: What do the comparative forms of the adjective klein sound like? Fill in: Eine klein- Schwester, Ein klein- Bruder.
Step 6: Mark in the text where an <e> represents a schwa, a deleted schwa or where <e> stands for a “vocalized r”.

7. Methodological recommendations
The most important features of prosody must be made clear to both learners and teachers in an adequate way. What is the role of the teacher? How are teachers to be convinced that prosody is important? How are teachers to be taught? Here is a summary list of the requirements for foreign-language teachers (cf. Hirschfeld 2003: 213 f.):
1. motivate
2. visualise (e.g. body movements)
3. show knowledge of phonetic characteristics
4. make learners aware of deviant forms
5. check performance in listening exercises
6. provide interesting, non-routine and creative fun exercises
7. provide enough exercises for a sufficient level of automation
8. provide exercises: better frequent & short than infrequent & long
9. focus on rhythmic-melodic units (i.e. larger than just one word)
10. integrate exercises into situations and context
11. combine exercises with work on grammar and vocabulary
In our view, teachers must be willing and able to recognise, to explain, and to correct the most serious problems in the area of prosody and pronunciation in general and also to give adequate feedback. They must know rules, characteristics, and the structures of the native language/s of their language learners. They have to apply multiple methods, since the “parrot method” as the most commonly practised method is not sufficient.

8. Conclusion
This article aimed to describe the state of the art of prosody and pronunciation teaching in DaF. Despite some progress in the last few decades, we can still identify enormous deficits. These deficits concern practice, i.e. the knowledge of teachers and the teaching materials available, as well as theory, i.e. research in second language acquisition that does not take the practicalities of teaching into consideration. An understanding of both theory and practice is necessary to reduce these deficits. We hope that this article can contribute to this goal by showing the most important problems but also by presenting some practical solutions.

There is clearly much to do in order to develop satisfactory methods of teaching prosody. This concerns the diagnosis of phonetic, especially prosodic, deviations, the application of exercises, and of course a measure for comparing levels of prosodic mastery. Another unsolved question that requires intensive discussion and research is how to assess various teaching methods. Is there any way to test learners’ progress when teachers integrate prosody into their pronunciation teaching? What is the impact of prosody training? Is there an impact of prosody training in the first place? To answer these questions, multiple factors must be taken into account, such as the learners’ L1, the group size or the learners’ proficiency level in the foreign language. However, the reality in classrooms usually does not allow all those factors to be controlled. Nevertheless, experience in teaching practice has clearly shown that systematic training of prosodic elements raises the degree of intelligibility in the foreign language (Hirschfeld 1994; Missaglia this volume). At the same time the pronunciation of vowels and consonants improves because the learners are now sensitive to features that are also relevant for segmental structure, such as duration and articulatory tension, e.g. for vowel oppositions like [iː-ɪ, uː-ʊ, oː-ɔ] and so forth. Therefore we think that teaching prosody goes hand in hand with teaching the pronunciation of sound segments.

References
Dieling, Helga and Ursula Hirschfeld 2000 Phonetik lehren und lernen. München: Langenscheidt.
Hirschfeld, Ursula 1994 Untersuchungen zur phonetischen Verständlichkeit Deutschlernender. (Forum Phoneticum, Bd. 57). Frankfurt/M.: Hector.
Hirschfeld, Ursula 2003 Phonologie und Phonetik in Deutsch als Fremdsprache. In: Claus Altmayer and Roland Forster (eds.), Deutsch als Fremdsprache: Wissenschaftsanspruch – Teilbereiche – Bezugsdisziplinen, 189–233. Frankfurt/M. etc.: Peter Lang.
Hirschfeld, Ursula, Heinrich P. Kelz and Ursula Müller (eds.) 2002 Phonetik international. Grundwissen von Albanisch bis Zulu. Ein Online-Portal: www.phonetik-international.de. Waldsteinberg: Heidrun Popp Verlag.
Hirschfeld, Ursula and Kerstin Reinke 1997 Simsalabim. Übungskurs zur deutschen Phonetik (Video, Kassette, Arbeitsbuch). München: Langenscheidt.
Hirschfeld, Ursula and Eberhard Stock (eds.) 2000 Phonothek interaktiv (CD-ROM). München: Langenscheidt.
Kohler, Klaus J. 1995 Einführung in die Phonetik des Deutschen. 2nd edition. Berlin: Erich Schmidt Verlag.

Metacompetence-based approach to the teaching of L2 prosody: practical implications
Magdalena Wrembel

1. Introduction
A global perspective in teaching has brought about an increasingly common understanding of pronunciation as an integral part of oral communication (cf. e.g. Celce-Murcia 1987, Morley 1987). Traditionally, pronunciation instruction has been primarily associated with the accurate production of segments; however, under the influence of discourse-based approaches, suprasegmental features of language have been found to exert the greatest impact on comprehensibility and communication. This has resulted in a significant shift of priorities from the narrow segmental focus to a broader “top-down” perspective highlighting the importance of prosody and contextual meaning (de Bot and Mailfert 1982). Advocates of such an approach consider pronunciation to be a non-segmental, non-discrete and non-autonomous phenomenon, emphasising that only by departing from the traditional understanding of phonology in terms of discrete segments can it be viewed as the phonological aspect of speech in real communication (cf. Pennington 1989).

In spite of the widespread consensus about the significance of prosodic features for successful communication, which has resulted in a more balanced treatment of segmental and suprasegmental aspects of pronunciation in some English as a Foreign Language (EFL) course books, prosody still appears to be the ‘problem child’ from the pedagogical perspective and is considered notoriously difficult to teach (cf. e.g. Dalton and Seidlhofer 1994, Celce-Murcia 1987). Prosodic patterns, especially differences in pitch movements, are usually regarded as more difficult to perceive and produce than the segmentals. According to Roach, “the complexity of the total set of sequential and prosodic components of intonation and of paralinguistic features makes it a very difficult thing to teach” (1991: 168).
There seems to be an inverse relationship between communicative importance and teachability, as suggested by Dalton and Seidlhofer (1994: 72–73), who point out that individual sound segments are high on the learnability scale yet relatively less important for communication, whereas the suprasegmentals, or more specifically the attitudinal function of intonation, are extremely important in discourse yet more difficult to adapt for direct teaching. Consequently, prosody has not been given a prominent place in most EFL teaching materials, with the exception of such publications as Brazil (1994), Bradford (1988), Laroy (1996) or Vaughan-Rees (1994).

What makes intonation teaching even more problematic is the fact that some of the most recent trends in pronunciation teaching seem to be influenced by the highly controversial proposal of the Lingua Franca Core, LFC (cf. Jenkins 2000), aimed at the simplification of English phonology and the reduction of the pronunciation teaching load to only those features that appear essential for international intelligibility. For instance, the majority of the suprasegmental features, including pitch movements or stress-timing, are excluded from Jenkins’ LFC as not crucial and unteachable. This runs counter to the commonly held view that correct intonation, rhythm and accentuation are indispensable for intelligibility (cf. Brazil 1994, Kenworthy 1990, Morley 1994) and that “all learners should be proficient in making use of pitch movements as important cues for signalling salient words or syllables” (Gimson 2001: 312).

The present contribution, therefore, seeks to address the need for adequate methodological guidelines for teaching prosody to foreign language learners. The proposed theoretical model of phonological acquisition is aimed at presenting a rationale for developing phonological metacompetence in L2 learners acquiring foreign language prosody. The need for the construction of such a model has arisen from the recognition that traditional pronunciation teaching techniques are not fully adequate, particularly in the case of L2 prosody.
The new rationale rests on the claim that prosody teaching should be directed at consciousness-raising and the analysis of theoretical knowledge rather than practice alone. Teaching about the language is a contentious issue relevant to the perennial debate between second language acquisition researchers on the role played by explicit and implicit knowledge in developing the competence of the second language learner (for a detailed discussion of the role of consciousness in language learning see Wrembel 2006). A prevailing trend nowadays seems to emphasise the natural language learning ability reflected in a naturalistic approach to acquiring languages in a purely intuitive manner. The ability to analyse language in a conscious manner is frequently seen as a totally different kind of skill that is usually fostered in a
Metacompetence-based approach to the teaching of L2 prosody
traditional, formal classroom setting with varying results. The present paper intends to address this misconception and demonstrate the importance of metalinguistic knowledge and awareness as facilitators of the process of learning second language prosody. Moreover, it aims to show that the enhancement of phonological metacompetence may be accomplished by means of an array of novel and attractive techniques and classroom practices.
2. Metacompetence-oriented model of phonological acquisition
The proposed model is an attempt at providing a comprehensive framework of acquisition of L2 phonology, encompassing L2 prosody. To this end, the model is constructed as consisting of three major component blocks (cf. Fig. 1):
a) acquisition process (explicated within the framework of Natural Phonology),
b) metacompetence as a facilitating device,
c) conditioning socio- and psycholinguistic factors.
Figure 1. Metacompetence-oriented model of phonological acquisition
2.1. Process of phonological acquisition
Predictions concerning the phonological acquisition of a second language are encompassed within the model of Natural Phonology (cf. e.g. Stampe 1973). According to the natural framework, learning L1 phonology does not require cognitive processing (cf. Donegan 1985); however, in second language acquisition the learner’s starting point is completely different. In the course of SLA the access to universal processes is more difficult as the phonological system of an adult learner is already established, i.e. it is limited to selected processes and underlying representations as well as rules. To gain access the learner needs to unsuppress, re-order and limit anew some processes in a conscious and controlled manner (Dziubalska-Kołaczyk 1990). The process of second language phonological acquisition is presented schematically in Fig. 1 Part 1 – Process of L2 acquisition, as an adaptation of a model proposed by Dziubalska-Kołaczyk (1990). The first stage of phonological acquisition at the level of perception consists in learning to perceive adequately L2 surface realisations (i.e. L2 outputs), which otherwise are filtered through the grid of one’s L1 and associated with L1-specific sound intentions. Formal instruction geared particularly at guided ear training and consciousness raising at the level of contrastive analysis acts at this stage as an intake facilitator. Explicit theoretical training may enhance the learning process also at the second stage when learners attempt to decode L2-specific sound intentions and form mental representations on the basis of adequately perceived outputs. At the third stage conscious phonological knowledge helps learners to associate inputs with outputs and to work out phonological processes operating in L2. This, in turn, leads to a reactivation of universal processes and to a complete recovery of L2-specific universal preferences.
The role of metacompetence at the second and third stages is that of acquisition facilitator. Finally, perception feeds
into production and conscious knowledge of articulation assists a learner’s phonetic performance acting as a monitoring device and offering the possibility of reflective feedback.
2.2. Phonological metacompetence
The core construct of the model is phonological metacompetence, which is understood as conscious knowledge of and about the grammar of the language and which may be developed by making the learner metalinguistically aware of L2 phonetics and phonology. The notion of metacompetence alludes to the distinction in cognitive psychology between ‘declarative knowledge’ and ‘procedural knowledge’ that has been recently applied to Second Language Acquisition (SLA). Broadly speaking, declarative linguistic knowledge refers to a speaker’s knowledge of linguistic facts, whereas procedural knowledge refers to know-how in using the language. In the course of skill development declarative knowledge is converted into procedural form, i.e. it gets proceduralised and leads to L2 competence. The present author advocates interpreting phonological metacompetence as a multilevel construct consisting of the three following blocks: (1) metalinguistic consciousness, (2) explicit formal instruction, and (3) first language competence (see Fig. 1 Part 2 - Facilitating device). The first subcomponent, i.e. metalinguistic consciousness, is explicated adopting Schmidt’s (1990) typology and further developed as referring to the following constructs:
(a) awareness
– perception (i.e. different degrees of conscious noticing),
– language awareness raising,
(b) intention
– controlled/monitored production,
– learning strategies based on conscious choice,
(c) knowledge
– conscious theoretical knowledge of L2 phonetics and phonology.
Awareness at the level of perception corresponds to conscious noticing and understanding as a necessary condition for the input to become intake and to be stored in a learner’s temporary memory.
Conscious analysis at this level consists in learners’ making a comparison between the observed phonetic input and their own production. The second subcomponent, intention,
is applicable to consciousness at the level of speech production and it implies controlled/monitored phonetic output. Metalinguistic consciousness at the level of intention involves also a deliberate choice of learning strategies corresponding to the learner’s preferred learning style. The element of knowledge pertains to L2 competence developed through conscious analysis of knowledge of phonetics and phonology acquired as a result of theoretical instruction. The analysis is a precondition for declarative knowledge to be converted into procedural knowledge. The second building block of metacompetence, as proposed by the present author, involves explicit formal instruction. It consists in theoretical training in phonetics and phonology targeted at developing conscious knowledge of the second language phonological system. Apart from providing theoretical foundations, pronunciation instruction should also offer reflective feedback on learners’ pronunciation performance and equip them with appropriate tools and strategies for self-monitoring in order to empower them to continue the learning process outside the classroom. Finally, phonological metacompetence is believed to benefit also from drawing on a learner’s first language competence, as a complete detachment from the native tongue is neither psychologically possible nor pedagogically desirable. This expectation corresponds to the notion of ‘psycholinguistic learning strategy’ as proposed by Færch and Kasper (1986), which consists in conscious reliance on an L2 learner’s prior linguistic knowledge (of L1 or any Ln) to form hypotheses about L2, in contrast to a purely inductive strategy that relies solely on the L2 intake. A similar stance was embraced also in the naturalist perspective by Dziubalska-Kołaczyk (2002), who called for raising language awareness through the mediation of the first language.
Making learners aware of the ‘competences’ they already possess may thus constitute a methodological remedy targeted at suppressing L1 interference and reinforcing the L2 acquisition process as such. Adopting Ellis’ (1994) stipulations on the functioning of explicit knowledge, it is postulated in the present model that phonological metacompetence may act in a threefold manner as:
1) Facilitator of intake operating at the level of perception and helping input to become conscious intake. It consists in conscious noticing of specific characteristics of L2 sounds by attracting learners’ attention to those linguistic features they have learnt about through formal explicit instruction.
2) Acquisition facilitator, as metacompetence is predicted to have the potential to facilitate the process of acquisition and form adequate representations by deciphering underlying intentions and preventing the mapping into the L1 system. Finally, this assistance may lead to the reactivation of latent universal processes.
3) Monitoring device exercising control of the output, as conscious L2 competence helps to provide reflective feedback on the production. Moreover, metacompetence is a means of empowering L2 learners and enhancing their autonomy by equipping them with necessary tools for self-monitoring and self-correction.
2.3. Influencing factors
The third component of the present model (see Fig. 1 Part 3 - Influencing factors) is in line with the sociolinguistic perspective on the nature of language according to which the formal system of language is seen as embedded in its social context. In an attempt to provide a broad and reliable acquisition framework, the model aims at accounting for an array of socio- and psycholinguistic factors that condition, to a large extent, phonological acquisition of a second language. It is generally agreed that pronunciation, more than any other aspect of language, is influenced by personal factors and that they are particularly at play with respect to the acquisition of L2 prosody, which constitutes a focal point of personal resistance to learning. The reasons for this are that the rhythm and intonation of our mother tongue are intimately linked with our identity (cf. Laroy 1996) and having to change our normal pitch range or pitch patterns to adapt to foreign language standards appears to jeopardise our language ego and self-confidence. Therefore, it seems desirable to acknowledge the conditioning potential of these extralinguistic factors and make an attempt to control them in a conscious manner.
Factors that were selected and incorporated in the model are generally considered to have the greatest impact on SLA in general and some are more specific for the acquisition of pronunciation. To begin with, cognitive factors encompass language aptitude, intelligence and learning styles and strategies. They remain fairly fixed and are amenable to training only to a limited extent. The varying impact of a learner’s intelligence or language aptitude on pronunciation attainment can be nullified or compensated for with high quality language instruction. The only element of the cognitive
variable that can be modified and enhanced to a considerable extent involves language learning styles and strategies. The present model promotes equipping learners with a broader range of innovative techniques for conscious learning as well as strategies empowering them with self-monitoring abilities and reinforcing various perceptual learning modalities through multi-modal means of presentation and practice. Sociolinguistic factors, on the other hand, include attitude and motivation, which can be further subdivided into integrative and instrumental orientation. It is generally agreed that positive attitudes towards the target language and its language community are conducive to successful L2 learning. Moreover, concern for good pronunciation motivating L2 learning was found to be one of the most vital predictors of success (cf. Purcell and Suter 1980). The variables described above are susceptible to change as a result of conscious training with a view to enhancing L2 acquisition. Since a learner’s internal motivation can be reinforced by interest-invoking instruction, a new tendency, advocated also by the present author, is to enrich pronunciation training by accompanying traditional classroom practices with novel, more engaging teaching techniques incorporating elements of theatre arts, Neuro-Linguistic Programming (NLP) or advanced technologies (Computer Assisted Language Learning, CALL). The selection criteria of conditioning factors allow also for the specific character of foreign language pronunciation learning as opposed to the learning of other components of grammar, namely a high level of sensitivity to emotional and psychological factors such as identity, language ego permeability, empathy or self-esteem. It is a common understanding in modern pronunciation pedagogy that affective and psychological factors can foster or inhibit oral mimicry and thus influence pronunciation performance to a considerable extent.
Carefully designed pronunciation instruction should thus provide a basis for conscious change in the psychological and affective dimension of learning. Awareness raising in this respect should be geared towards creating the most favourable socio- and psychological conditions conducive to the acquisition of second language prosody. The conditioning potential of psychological and affective variables is particularly significant because foreign language pronunciation learning, especially learning L2 prosody, may be a stress-inducing experience. Stress, in turn, results in muscular tension and stiffened articulators, a learning disadvantage which is largely beyond learners’ control. Moreover, the anxiety level is likely to grow as learners’ effort and diligence do not always
lead to immediate improvement and success. Therefore, comprehensive pronunciation training should incorporate confidence building and stress reducing strategies. These strategies consist in conscious efforts towards reducing muscular tension by means of relaxation, breathing exercises and articulatory warm-ups or by adopting drama voice techniques aimed at greater agility and control of articulators as well as confidence building (cf. Wrembel 2001). Furthermore, physiological limitations of the speech apparatus and the motor element inherently involved in pronunciation learning compelled the present author to encompass also oral and auditory capacities of an individual amongst the pertinent factors affecting acquisition. Oral capacities involve the learner’s ability to adapt to different articulatory configurations, whereas auditory capacities concern auditory sensitivity to target language sounds. The aptitude for oral mimicry, which some learners are particularly endowed with, can be reinforced by various imitation exercises such as mouthing, mirroring or modelling adapted from drama voice techniques. Auditory capacities, on the other hand, can be enhanced by consciousness raising at the perceptual level and guided ear-training.
3. Pedagogical implications; techniques for teaching L2 prosody
The proposed model of the natural approach to the acquisition of second language phonology entails practical recommendations for the teaching of L2 prosody that have been translated into a number of specific classroom practices. The scope of the proposed techniques for the development of phonological metacompetence is multifarious, ranging from alternative and innovative methods integrating cognitive, affective and psycho-motor aspects of pronunciation learning to more mainstream activities involving conscious analysis of theoretical linguistic knowledge. The former include general awareness-raising techniques incorporating extra- and para-linguistic elements such as gestures, mimicry or relaxation in order to foster conscious control of articulators and perceptual tuning-in. The latter correspond to more elaborate practices that often rely on advanced technologies providing a new range of feedback and presentation modes. The schematic presentation of the suggested techniques (see Table 1) is based on different degrees of explicitness, on the one hand, and elaboration, on the other.
Table 1. Metacompetence developing techniques (vertical axis: Elaboration; horizontal axis: Explicitness, covert - overt)

Upper row, left: B Articulatory control
Articulatory warm-up exercises
Drama voice techniques:
Articulatory setting exercises:
* voice quality
* imitation and oral mimicry

Upper row, right: D Multimedia learning aids
Animated views of the articulators
Video close-ups of the mouth
Computerised displays of speech patterns
Spectrograms

Lower row, left: A Basic awareness-raising
Relaxation, breathing, visualisation
Sensitisation:
* perceptual tuning-in
Awareness raising activities:
* discussions
* questionnaires
* metaphonetic trivia
* concern for pronunciation

Lower row, right: C Informed teaching techniques
Theoretical foundations (rules)
Contrastive information
Pitch-contour notation
Guided ear-training - analytic listening
Self-monitoring techniques
3.1. Basic awareness-raising activities
The present proposal assumes that the initial stages of conscious teaching/learning of L2 prosody should focus on building awareness and concern for pronunciation and preparing the articulatory and auditory apparatus for the forthcoming practice. This stage (section A) involves the lowest degree of explicitness and elaboration in language consciousness raising, yet it constitutes a necessary foundation for the development of phonological metacompetence. As emphasised by Dalton and Seidlhofer (1994), due to the largely subliminal nature of intonation, which makes it difficult to describe and teach, sensitising and awareness raising activities are particularly important. One of the major issues at this stage is to develop a concern for intonation in foreign language learning through stimulating discussions on its role in communication. Such discussions, geared at increasing learners’
motivation and stimulating interest, can be prompted by questionnaires or tape-based tasks (examples of such questionnaires can be found in Hewings 2004, Laroy 1996, Kenworthy 1990). Other awareness raising techniques involve the investigation of the general nature of prosody by attuning learners’ ears to pitch movements, humming the tune instead of using words, recognising moods and acting out tales or using arithmetic to consciously analyse the division of speech into tone units, e.g. (2 + 3) x 5 = 25 vs. 2 + (3 x 5) = 17 (cf. Dalton and Seidlhofer 1994). Awareness of intonation can also be developed by associating pitch movements with other impressions such as a firework rising or falling in the sky, a plane taking off or landing, as well as with various emotions they cause (e.g. anger, happiness). Since intonation is often referred to as ‘vocal gesture’ (cf. Dalton and Seidlhofer 1994: 77), an awareness-raising activity may focus on the significance of gestures in general and then lead to replacing gestures by vocalisations. Other examples of metacompetence enhancing techniques at this stage involve developing physical awareness of suprasegmental features, as kinaesthetic involvement seems particularly applicable to the teaching of suprasegmentals. Such applications of ‘whole body’ motion to practise key aspects of stress, rhythm and intonation include walking or stamping the rhythm, hand raising corresponding to word stress patterns (cf. Miller 2000), tracing intonation contours with arms or acting out pitch movements when learners are assigned particular syllables in an utterance and are responsible for presenting the sentence with their bodies by assuming an appropriate posture corresponding to the pitch level (Acton 1998: 7):
a) on toes – the highest pitch level
b) standing – slightly raised pitch
c) knees bent, hands on knees – starting position, mid-pitch
d) squatting – general pitch of unstressed vowels
e) kneeling – utterance-final, falling pitch.
Moreover, prosodic awareness of L2 learners may be boosted by using various materials including metaphonetic trivia such as advertising leaflets, billboards, SMS and Internet-lore in which some aspects of suprasegmental phonetics come to the fore mainly through puns (cf. Sobkowiak 2003).
They may constitute good starting points for awareness raising discussions and be particularly stimulating and memorable due to their humorous component. For instance, to illustrate the phonetic importance of an open juncture and semantic consequences of word stress, the learners may be presented with a postcard that reads “Two lips from Amsterdam” as opposed to “Tulips from Amsterdam” (see Sobkowiak 2003: 162). Another step to facilitate an accurate prosodic production in the second language and, in general, to improve voice quality involves conscious relaxation of the muscles of the articulatory apparatus and assuming an appropriate frame of mind, which can be achieved by means of relaxation techniques including breathing exercises (e.g. breathing in, holding the breath and releasing it for the count of three) or visualisation (i.e. guided imagery exercises) (cf. e.g. Celce-Murcia, Brinton and Goodwin 1996, Acton 1997, Laroy 1996). As far as basic awareness raising techniques at the level of perception are concerned, the model allows for the so-called sensitisation, i.e. perceptual tuning into the language. This activity consists in getting learners used to the general auditory impression of the target language, rather than listening for a particular phonetic feature, and approaching the melody of the language in terms of its affective value and aesthetic impact. For instance, when listening to English, students are asked to judge whether:
– it smells like a meadow / it smells like a town,
– it is like the sound of waves breaking on the beach / it is like the sound of a mountain brook,
– it sounds ideal for giving orders / it sounds ideal for courtship (Laroy 1996: 25–26).
Developing metacompetence at this stage involves making learners aware of how they perceive the target language by activating all their senses, helping them to overcome their prejudices and thus making their language egos more permeable.
3.2. Articulatory control exercises
Section B enumerates metacompetence developing techniques based on articulatory control that involve a higher degree of elaboration, though they are still not fully explicit in providing declarative knowledge of the phonetic system of the target language. Such practices involve voice modulation techniques typically used by drama coaches and articulatory warm-up exercises that aim at a greater articulatory agility and, consequently, a more native-like performance. These techniques of metalinguistic and extralinguistic awareness raising aim at regaining conscious control over the process of articulation through pre-speech physical preparation including, among others, postural alignment, muscular tension release and warming, vocal work-out, massaging face and jaw muscles, lip and tongue activation, warming the voice and releasing resonance as well as pitch, volume and speech rate modulation exercises (cf. Wessels and Lawrence 1995). Conscious employment of theatre arts techniques contributes also to L2 prosody improvement from a psychological perspective by increasing learners’ self-esteem and confidence as well as enabling them to transcend the normal limits of fluency. A further aspect of phonological metacompetence is related to developing a more authentically native-like ‘voice quality’ or ‘setting’, which can be achieved through a conscious attempt at an adaptation of a long-term articulatory posture specific for a particular target language, i.e. a characteristic pitch level, vowel space, tongue position and the degree of muscular activity. The present model advocates specific voice quality setting exercises involving, among others, oral mimicry (e.g. making an English face or finding one’s English voice) and conscious imitations of model intonation patterns. To give some examples of the controlled imitative practice one may enumerate the following:
– mouthing – miming a dialogue without words,
– mirroring – repeating simultaneously with the speaker and imitating his/her gestures and facial expressions,
– tracing – repeating simultaneously without mirroring the speaker’s gestures,
– echoing – repeating slightly after the speaker (cf. Celce-Murcia, Brinton and Goodwin 1996).
Particularly noteworthy are extralinguistic features (i.e. elements of body language) which tend to be incorporated into the imitative practice. The activities described in sections A and B should be viewed as a first stage in the process of L2 prosody learning, i.e. a way to raise general prosodic awareness, to ‘open the ears’ and to establish strategies which can be later consolidated and extended. Some of the techniques proposed above
fall into the scope of the so-called alternative and innovative methods that co-exist under a general label SALT (i.e. a System of Accelerative Learning Techniques). The major drawbacks of such alternative techniques include apparently limited practical applications, the lack of systematic features and limited empirical validation, as pointed out by Pfeiffer (2001). However, their major contribution to the facilitation of L2 prosody learning concerns primarily the affective domain, whose role in the case of pronunciation is of paramount importance. These techniques represent a comprehensive approach integrating cognitive, emotional and physical aspects of pronunciation learning. They are particularly geared at reducing learning inhibitions by creating a positive atmosphere, enhancing learners’ confidence in L2 production and incorporating extra- and para-linguistic elements such as gestures, mimicry and relaxation exercises. In the current model proposed by the author, alternative techniques perform an auxiliary function, accompanying and enriching the repertoire of mainstream practices rather than replacing it.
3.3. Mainstream techniques for informed pronunciation teaching
Section C represents more mainstream pronunciation teaching activities referred to as informed teaching techniques. Contrary to some opinions restricting prosody training to imparting motor and auditory skills, the present approach attaches paramount importance to the cognitive aspect of phonological acquisition. Metacompetence-oriented theoretical training in L2 prosody advocates conscious knowledge of phonetics and phonology; therefore, elements of theoretical grounding (e.g. Brazil 1994, Roach 1991, Gimson 2001) are expected to constitute an integral part of the pronunciation training. This recommendation is particularly valid in the context of teacher training, where the trainees are to become potential pronunciation models for their learners.
In an effort to overcome interference from the sound system of the native language it is advisable to establish certain basic discriminatory skills enabling learners to distinguish consciously between features of their own language and those of the target. Therefore, it is advocated in the model to allow for contrastive exercises involving the comparison of specific issues in the target and source languages. Conscious training of auditory skills may take various forms ranging from simple discrimination and identification tasks to more elaborate
guided ear-training. Guided listening may include, for instance, exercises in which listeners must recognise which word is made prominent (highlighted) by choosing a suitable context for what they hear, e.g.
A: Let’s go to Paris.
OR
A: Have you had a good weekend?
B: I’ve been to Paris. (Bradford 1988: 9).
The model endorses also appeals to learners’ different modalities through multisensory means of presentation and practice. The impact of theoretical phonetic training in L2 prosody may be particularly enhanced by means of visual reinforcement including pitch-contour notation. Learners may be encouraged to represent the pitch range by drawing two parallel lines depicting the highest and the lowest limits of the range and drawing lines corresponding to pitch movements within these limits. Moreover, pitch contours may be depicted visually as arrows (cf. e.g. Vaugham-Rees 1994), bending lines (Brazil 1994), dotted lines (Roach 1999) or dots representing syllables (e.g. Gimson 2001). Similar procedures can be used to encourage stress pattern notation including: a) underlining the stressed syllable using a particular colour, b) putting a dot above or under the stressed syllable, c) writing this syllable in a different script, d) representing the stressed/unstressed syllables by different shapes (cf. Laroy 1996: 47). The present metacompetence-oriented framework strives to empower learners by equipping them with self-monitoring and self-correction strategies so that they may be involved consciously in the speech modification process. In practice, it entails helping L2 learners to develop self-rehearsal techniques (e.g. talking to oneself, audio- or videotaping presentations or rehearsing in small groups) as well as providing them with procedures for self-diagnosis and concrete self-study guidelines (e.g. Use strong, vigorous speech! Use controlled speed and pause by phrase groups. Take time to slow your rate of speech and vary tempo. Use clear emphasis. Establish the rhythmic stress-unstress pattern of English including reductions and contractions; link words into phrase groups across word boundaries. Use lively, expressive voice qualities – adapted from Morley 1994: 87). 
The present author’s recommendations concerning multimodal techniques of L2 prosody training as well as the application of autonomous learning strategies are based on personal experience as a phonetics teacher
and mostly positive and enthusiastic feedback received from the students who were taught within the framework of the approach discussed above.
3.4. Elaborate and technologically advanced techniques
Finally, section D offers the highest level of elaboration and explicitness as far as phonological metacompetence enhancement techniques are concerned. The majority of techniques suggested therein rely on multimedia learning aids and advanced technologies. As the oral speech mechanism is readily accessible to direct observation, some computer assisted instructional programs or web pages offer animated views of the articulators during speech or vocal folds in motion as an additional visual support for the conscious analysis of the articulatory process. Another option is to videotape learners’ faces during speech production and subsequently examine such close-up frames of articulators in order to analyse the articulatory gesture and the overall articulatory posture, i.e. muscular tension in the supralaryngeal tract or the position of the larynx. Furthermore, more advanced pronunciation teaching courses available on CD-ROMs offer instant audio-visual feedback in the form of computerised displays of speech patterns allowing learners to record their utterances and compare a visual display of their own intonation contours with prerecorded native-speaker models (cf. e.g. CD-ROM Better Accent Tutor, CD-ROM Connected Speech). Identification and discrimination of pitch movements as well as nuclear stress placement can now be further enhanced by multimedia offering visual feedback support, e.g. CD-ROMs for teaching intonation or Internet resources such as Sound Machines, available at John Maidment’s web site, offering programs designed to help recognise the nuclear syllable and nuclear tones in presented sentences. The perception and production of foreign sounds may be reinforced by a conscious analysis of the acoustic spectrum displayed in the form of a spectrogram.
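The kind of comparison between a learner's intonation contour and a native-speaker model that such displays make visible can be illustrated with a minimal sketch. It assumes that fundamental frequency (F0) values in Hz have already been extracted by some pitch tracker, with 0 marking unvoiced frames; the function names and the two sample contours are hypothetical illustrations and are not taken from any of the programs cited above. Pitch range is expressed in semitones, a common choice because it makes speakers with different voice registers comparable.

```python
import math


def semitones(f_hz, ref_hz):
    """Interval between two frequencies in semitones (12 per octave)."""
    return 12.0 * math.log2(f_hz / ref_hz)


def pitch_range_semitones(f0_hz):
    """Pitch range of a contour: highest vs. lowest voiced F0 value.

    Frames with F0 <= 0 are treated as unvoiced and ignored
    (an assumption about how the pitch tracker marks them).
    """
    voiced = [f for f in f0_hz if f > 0]
    if not voiced:
        raise ValueError("contour contains no voiced frames")
    return semitones(max(voiced), min(voiced))


# Hypothetical F0 contours (Hz) of the same utterance.
model_contour = [220.0, 247.0, 330.0, 294.0, 0.0, 196.0, 165.0]
learner_contour = [200.0, 210.0, 230.0, 220.0, 0.0, 195.0, 185.0]

model_range = pitch_range_semitones(model_contour)
learner_range = pitch_range_semitones(learner_contour)

print(f"model range:   {model_range:.1f} semitones")
print(f"learner range: {learner_range:.1f} semitones")
if learner_range < model_range:
    print("The learner's pitch range is narrower than the model's.")
```

A single range figure is of course a crude summary of a contour; it is meant only to show how the visual impression of a 'flatter' or 'livelier' contour can be turned into a number that learners can monitor themselves.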
As advocated by Schwartz (2004), conscious knowledge of acoustic phenomena may represent a useful, albeit fairly new, tool in pronunciation pedagogy. Acoustic phenomena are concrete and relatively easy to identify and describe; they can therefore be of great benefit in the learning process. Such pedagogy-oriented spectrographic analyses may thus serve as an awareness-raising tool, helping learners to become familiar with acoustic phenomena such as fundamental frequency (pitch) or amplitude (loudness). An experiment conducted by Schwartz and Głogowska (2004)
Metacompetence-based approach to the teaching of L2 prosody
demonstrated that even a brief training session focused on such acoustic phenomena as the range of pitch movement or breathy voice quality resulted in perceptible acoustic progress in the learners’ L2 production. It is undeniable that technology-assisted language learning techniques have a special appeal, particularly for young learners, and their effectiveness may be influenced, to a large extent, by the fact that they enhance learners’ motivation and interest. They pose, however, a greater challenge to the teacher and require some specific know-how, as in the case of acoustic speech analysis.
3.5. Empirical verification of the proposed techniques
The major predictions of the proposed model of pronunciation teaching were empirically validated in a study on the role of phonological metacompetence in the acquisition of foreign language phonetics by adult advanced learners of English (Wrembel 2003, 2005). The results indicated that phonological awareness raising and conscious theoretical instruction in English phonetics related significantly to the improvement in the overall L2 pronunciation performance in the experimental group, which outperformed the controls that received traditional pronunciation practice and relied solely on procedural knowledge. An issue that merits further investigation, however, is the efficiency of particular innovative techniques for teaching L2 prosody that were presented in this contribution, such as articulatory warm-ups, relaxation and breathing or drama voice procedures. Their effectiveness has so far been supported only by the present author’s informal observations and longstanding experience as a phonetics teacher, as well as by self-reported data collected from the students by means of questionnaires. To the best of my knowledge, little controlled research has been conducted to validate the efficiency of these novel approaches.
The evaluation of specific techniques in terms of how beneficial they are for the learner’s perception and production of L2 prosody seems to be a rather difficult task. Their impact depends to a large extent on how well they correspond to the learners’ learning styles, preferred modalities or even personalities. Therefore, it seems that the best recommendation for the teachers would be to use their classrooms as a testing ground and to try out what seems particularly applicable to specific contexts and learners’ needs.
4. Conclusions
The major goal of the present contribution has been to provide insights into new trends in L2 prosody teaching and to illustrate them with a range of innovative techniques, and consequently, to broaden the repertoire of activities used traditionally in the language classroom. It is worth stressing that the proposed list of metacompetence enhancement techniques tailored to the teaching of L2 prosody is by no means exhaustive. The present author aimed at providing a sample of varied activities of differing degrees of elaboration and explicitness, thus offering an innovative perspective on pronunciation pedagogy that may be appealing to foreign language educators, learners and materials designers alike. To sum up, the presented model reflects a consciousness-based approach to the acquisition of foreign language pronunciation, and its practical implications are aimed particularly at increasing the effectiveness of teaching L2 prosody. It is advocated that this aim may be achieved by developing learners’ phonological metacompetence, i.e. by means of raising awareness of foreign language intonation and promoting conscious theoretical instruction therein. The major goal is to make prosody an integral part of informed pronunciation teaching by conscious employment of various metacompetence-oriented techniques and activities in order to foster L2 learners’ productive and receptive skills. It is hoped that because of its theoretical foundations (i.e. grounding in a specific linguistic theory) and a broad perspective of second language acquisition, the presented model may prove particularly applicable to the teaching and learning of second language prosody.
References
Acton, William
1997 Seven Suggestions of Highly Successful Pronunciation Teaching. The Language Teacher Online 21.2. http://langue.hyper.chubu.ac.jp/jalt/pub/tlt/97/feb/seven (date of access: 22 Dec. 2006)
1998 The Syllablettes. Alternatives. Speak Out! 22, 5–10.
Better Accent Tutor for English
http://www.betteraccent.com/ (date of access: 22 Dec. 2006)
de Bot, Kees and Mailfert, K.
1982 The teaching of intonation: Fundamental research and classroom applications. TESOL Quarterly 16, 71–77.
Bradford, Barbara
1988 Intonation in Context. Cambridge: Cambridge University Press.
Brazil, David
1994 Pronunciation for Advanced Learners of English. Cambridge: Cambridge University Press.
Celce-Murcia, Marianne
1987 Teaching Pronunciation as Communication. In: Joan Morley (ed.), Current Perspectives on Pronunciation: Practices Anchored in Theory, 1–12. Washington, D.C.: TESOL.
Celce-Murcia, Marianne, Donna Brinton and Janet Goodwin
1996 Teaching Pronunciation. A Reference for Teachers of English to Speakers of Other Languages. Cambridge: Cambridge University Press.
Connected Speech
http://www.proteatextware.com.au/cs.htm (date of access: 22 Dec. 2006)
Dalton, Christiane and Barbara Seidlhofer
1994 Pronunciation. Oxford: Oxford University Press.
Donegan, Patricia
1985 On the Natural Phonology of Vowels. New York: Garland.
Dziubalska-Kołaczyk, Katarzyna
1990 A Theory of Second Language Acquisition within the Framework of Natural Phonology. A Polish-English Contrastive Study. Poznań: AMU Press.
2002 Conscious competence of performance as a key to teaching English. In: Ewa Waniek-Klimczak and Patrick Melia (eds.), Accents and Speech in Teaching English Phonetics and Phonology. EFL Perspective, 97–106. Frankfurt: Peter Lang.
Ellis, Rod
1994 The Study of Second Language Acquisition. Oxford: Oxford University Press.
Faerch, Claus and Gabriele Kasper
1986 Cognitive dimensions of language transfer. In: Eric Kellerman and Michael Sharwood Smith (eds.), Crosslinguistic Influence in Second Language Acquisition, 49–65. New York: Pergamon.
Gimson, Alfred C.
2001 Gimson’s Pronunciation of English. 6th edition. Revised by A. Cruttenden. London: Edward Arnold.
Hewings, Martin
2004 Pronunciation Practice Activities. Cambridge: Cambridge University Press.
Jenkins, Jennifer
2000 The Phonology of English as an International Language. Oxford: Oxford University Press.
Kenworthy, Joanne
1990 Teaching English Pronunciation. London: Longman.
Laroy, Clement
1996 Pronunciation. Oxford: Oxford University Press.
Maidment, John
Sound Machines. http://www.eptotd.btinternet.co.uk/vm/soundmachines.htm (date of access: 22 Dec. 2006)
Miller, Sue F.
2000 Targeting Pronunciation: the Intonation, Sounds, and Rhythm of American English. Boston, MA: Houghton Mifflin Company.
Morley, Joan
1987 Current Perspectives on Pronunciation: Practices Anchored in Theory. Washington, DC: TESOL.
1994 Pronunciation Pedagogy and Theory, New Views, New Directions. Alexandria: TESOL.
Pennington, Martha
1989 Teaching pronunciation from the top down. RELC Journal 20, 20–38.
Pfeiffer, Waldemar
2001 Nauka języków obcych. Od praktyki do praktyki. Poznań: WAGROS.
Purcell, Edward T. and Richard W. Suter
1980 Predictors of pronunciation accuracy: A reexamination. Language Learning 30, 271–88.
Roach, Peter
1991 English Phonetics and Phonology. Cambridge: Cambridge University Press.
Schmidt, Richard
1990 The role of consciousness in Second Language Learning. Applied Linguistics 11, 129–158.
Schwartz, Geoff
2004 Voice quality in students’ production of the English tense/lax contrast. In: Włodzimierz Sobkowiak and Ewa Waniek-Klimczak (eds.), Dydaktyka Fonetyki Języka Obcego. Zeszyt Naukowy Instytutu Neofilologii (3), Wydawnictwo PWSZ w Koninie, 75–79.
Schwartz, Geoff and Małgorzata Głogowska
2004 Acoustic tools for students’ production of English long (tense) vowels. In: Włodzimierz Sobkowiak and Ewa Waniek-Klimczak (eds.), Dydaktyka Fonetyki Języka Obcego. Zeszyt Naukowy Instytutu Neofilologii (3), Wydawnictwo PWSZ w Koninie, 80–85.
Sobkowiak, Włodzimierz
2003 Materiały ulotne jako źródło metakompetencji fonetycznej. (Raising phonetic awareness through trivia). In: Włodzimierz Sobkowiak and Ewa Waniek-Klimczak (eds.), Dydaktyka Fonetyki Języka Obcego. Zeszyty Naukowe PWSZ w Płocku (5), Płock: Wydawnictwo PWSZ, 151–166.
Stampe, David
1973 A Dissertation on Natural Phonology. Bloomington: Indiana University Linguistic Club.
Vaughan-Rees, Michael
1994 Rhymes and Rhythm. Hong Kong: Macmillan Publishers Ltd.
Wessels, Charlyn and Kate Lawrence
1995 Using Drama Voice Techniques in the Teaching of Pronunciation. In: A. Brown (ed.), Approaches to Pronunciation Teaching, 29–37. Hemel Hempstead: Prentice Hall International.
Wrembel, Magdalena
2003 An empirical study on the role of metacompetence in the acquisition of foreign language phonology. Proceedings of the 15th International Congress of Phonetic Sciences, Barcelona (Spain), 985–988.
2005 Phonological metacompetence in the acquisition of second language phonetics. Unpublished Ph.D. dissertation, Adam Mickiewicz University, Poznań.
2006 Consciousness in Pronunciation Teaching and Learning. IATEFL PL Newsletter, Post-Conference Edition No. 26, Warszawa.
Individual pronunciation coaching and prosody
Grit Mehlhorn
1. Introduction
This paper gives a general overview of the motivation, goals and methods of Individual Pronunciation Coaching (IPC). It will be shown that the following factors influence the learner’s progress: first, the individual diagnosis of the deviations in the target pronunciation; second, an increase of the learner’s consciousness with respect to the foreign pronunciation and the choice of individual learning strategies; and third, the permanent feedback on learning progress. These factors lead to an increased self-reflection on the part of the learner regarding their learning process, language awareness, and they also serve to foster learner autonomy. Special attention is given to the prosodic organization of the foreign language – an aspect of pronunciation which is often neglected in foreign language teaching. The empirical examples reported here are from foreign students at German universities. However, the concept of IPC should work for other target languages as well.
2. Motivation
The difficulties experienced while learning the pronunciation1 of a foreign language are strongly dependent on the mother tongue of the learners, and on other foreign languages they have already acquired (cf. among others Kaltenbacher 1998; Gut 2003; Hirschfeld, Kelz and Müller 2003). Many of these pronunciation difficulties can be predicted to a certain degree if one compares the phonetic systems of the mother tongue and the target language. However, this may yield overgeneralisations, since learners with a very similar learning background show considerable differences in their individual pronunciation. As Baran (2002: 315) puts it:
... even within groups where learners are of one age, mother tongue and gender; where individuals receive comparable amount and type of exposure; the same explicit formal training; where students are highly motivated, and their attitudes are positive; where all learners are taught by the same teacher who uses specific teaching methods and techniques, the pronunciation of individuals still differs sometimes even to a great extent.
Foreign language learners of the same mother tongue differ not only with respect to the amount and grade of particular deviations in the pronunciation of the foreign language, but also in terms of:
– their ability of segmental and prosodic differentiation,
– their articulatory skills,
– their cognitive learning styles (e.g. with respect to their preferred perception mode),
– learning strategies used,
– the degree of language awareness,
– their self-monitoring skills,
– their motivation and
– their expectations regarding their pronunciation level.
Pronunciation and prosody practice plays only a marginal role in standard foreign language teaching (Gehrmann 1999). Even if there is pronunciation teaching, only the worst mistakes are corrected. Learners are expected to repeat given forms. The mere imitation, however, does not take into account the cognitive skills of adult learners. As a consequence, even very advanced learners have no clear idea of how and where their pronunciation deviates from the pronunciation of native speakers. They seldom have adequate conscious strategies to improve their pronunciation themselves. As deficient pronunciation influences other language skills such as reading, listening, speaking and writing (de Jong and Kaunzner 2000), the whole acquisition process is slowed down. Eventually, some prosodic deviations can lead to undesirable effects on the native hearer. In the worst case, some foreign accent features are interpreted as bad character traits of the person speaking. Since pronunciation likewise plays only a marginal role in important German language tests like the DSH and the TestDaF2, many students concentrate on the improvement of the skills explicitly required in the tests, but pay only little attention to the improvement of their pronunciation. Therefore, the pronunciation of many language learners shows fossilizations – much more often than their grammar or vocabulary. Indeed, for many learners, it is not enough to be exposed to the foreign language. Without a certain experience in systematic listening to sounds or intonation (Dieling
1989) and without awareness of what to focus on, much of the input is “filtered out”. Thus, the pronunciation of surrounding native speakers can have only little influence on what learners produce themselves. Many foreign students consider their foreign accent a barrier preventing them from entering into contact with German students. Native speakers seem to have certain associations or emotional reactions when hearing a given foreign accent (Müller 1994: 182; Stibbard 1996; Cunningham-Andersson 1997: 133, 142; Gibbon 1998: 89). These associations are unconsciously related to the personality of the speaker, which can even lead to a stigmatization of this person (Grotjahn 1998). Thus, foreign students, more often than Germans, are afraid of taking part in seminars, let alone oral presentations or oral exams. Although phonetics has been represented to a greater extent in textbooks and audio material for German as a foreign language since the nineties, it is still neglected in the classroom. This is in part due to the fact that in heterogeneous learner groups with different mother tongues, it is practically impossible to deal with pronunciation difficulties of the individual learner. Hence, lack of time is the main argument against phonetics in the foreign language classroom. Another reason is that teachers’ knowledge in this area is often limited. Therefore, many teachers restrict themselves to the immediate correction of only very striking pronunciation deviations. A conscious discussion of reasons for these deviations rarely takes place. To remedy the aforementioned problems, teachers would have to acquire additional competences and phonetic knowledge. However, since not every teacher can be an expert on every mother tongue of his or her learners, the idea of individual coaching with a special focus on foreign pronunciation took form.
This is a new kind of individual coaching which coexists with general language learning coaching (Kleppin 2003), coaching for tandem learners (Brammerts, Calvert, and Kleppin 2005; Schmelter 2004), and the coaching of students at the beginning of their studies (Mehlhorn 2005). Very often, a feeling of success or sense of achievement in the foreign pronunciation is reached only after a longer period of time, after persistent practising. Therefore, the motivation of the learner plays a particularly important role in this field of language learning. This is another reason for individual pronunciation coaching. With the help of the pronunciation diagnosis and individual feedback at different times of the coaching, small, otherwise not realised progress can be shown.
3. Goals
Individual learning coaching is based on the concept of learner autonomy3 (Riley 1997, Cotteral and Crabbe 1999, Benson 2001). The learner is seen as an individual who is capable of taking control of his or her own learning. It is an important goal of the individual learning coaching to support the learner’s independence. There are many ways to learn German. The learners taking part in IPC have followed different paths to acquire their present level of competence in German and knowledge about German. Every learner has different difficulties. In IPC, they can learn in which aspects their pronunciation needs correction and how they can improve it. Experience shows that there can be huge differences between pronunciation features which seem problematic to the learner and those which are perceived as deviant by the pronunciation coach. Learners of German as a foreign language, for instance, often mention the uvular [R] and the high front rounded vowels in the first place. Only few learners are aware of deviations in the target intonation or rhythm, however. These deviations should not be neglected, however, since prosody organizes spoken language in patterns fitting the appropriate communication need. “In the speech of advanced learners, departures from what we regard as desirable are said to be more often matters of intonation than matters of how particular sounds are made” (Brazil 1994: 3). Therefore, it is important to inform the learner about possible effects of deviations in the target prosody. Detailed phonetic descriptions are avoided wherever possible, but an essential aim of IPC is to help the learner gain sufficient knowledge of the phonetic system of German to feel at ease when using it. Having understood the system, the learner will be more likely to feel in control of it; and feeling in control will almost certainly reduce the anxiety felt when speaking the foreign language. 
Supporting the self-confidence of the learner is therefore one of the fundamental goals of the individual coaching. The coach and the learner are concerned with recognising and remedying things that are peculiar to the learner. Confidence building here takes the form of making clear to the learner which phenomena they need to work on and which sounds they can safely consider to be less problematic. The diagnosis, therefore, is aimed at enabling the learner to identify their problems on their own.
It is not necessarily the aim of IPC to reach a native-level pronunciation. Instead, every learner sets up their own, personal standard they want to reach in their pronunciation. The coach can help them break down overly ambitious, far-reaching goals into smaller, attainable ones. Reducing anxiety consists largely of the development of the learner’s power of self-appraisal (Weskamp 2003). Through constant engagement with their own pronunciation peculiarities, with the feedback given by the coach, with the comparison with a standard (e.g. a native speaker, audio media, language learning software), and with the coach-initiated self-observation and self-evaluation on the part of the learner, the self-appraisal of the learner should, step by step, replace the feedback of the coach. Eventually, the learner should be able to work on his or her pronunciation without the help of the coach. IPC has a limited time frame. The sooner the learner is able to learn independently, the better. The increase of the learners’ autonomy, self-reflection and their capability of self-appraisal are further aims accompanying the improvement of the learners’ pronunciation skills. At the beginning of the coaching process, the learner is informed about the procedure and the possibilities of the coaching:
– The coach makes a diagnosis of the learner’s individual pronunciation.
– She recommends material designed to improve the pronunciation and shows the learner how to work with it.
– The coach supports the learner in splitting up his main goals into small, realistic sub-goals.
– She helps the learner develop appropriate learning strategies and
– gives feedback on the learner’s progress.
All these measures are intended to make the learner work more independently. At the same time, it is necessary to make clear that work on pronunciation must necessarily be active. The learner will not improve his or her pronunciation by being told what to do, but by doing it. Obviously, if learners are treated as autonomous persons, their progress depends almost entirely on their own effort. IPC is intended to provide maximum support, but does not spare the learner the effort of learning.
4. Methods in IPC
4.1. The first coaching session
In the first coaching session, the coach inquires about the so-called learning biography. This means information about
– the mother tongue,
– foreign languages,
– strategies in pronunciation learning,
– phonetic knowledge (i.e. knowledge about the phonetic particularities of the mother tongue and the target language),
– self-assessment of the pronunciation by the learner,
– already recognized problems,
– crucial experiences with his or her foreign pronunciation (e.g. perceived misunderstandings),
– aims,
– time frame, etc.
This information is necessary to provide the learner with coaching that directly addresses their individual needs. In a second step, the learner reads a short text aloud. This text is directly recorded onto the computer. This recording serves as a diagnosis and starting point for coaching. While the learner reads the text, the coach marks the deviations in her own copy of the text. The results are then discussed with the learner. The coach can encourage the learner to mark the items in the text which the learner should pay special attention to in the next reading task. For this purpose, the coach recommends certain notation symbols (e.g. vertical lines to mark potential breaks, accents on stressed vowels – if the stress was wrong in the first place –, marking of pitch accents, etc.). For learners preferring a visual learning style, it is helpful to be able to actually see their pronunciation problems. The learner then gets an electronic version of the text and can do further work at home. Figure 1 illustrates this marking process with an example of a learner from Mongolia. This student had problems to produce adequate German rise- and fall contours – and, evidently, marked her text accordingly.4
[Figure: the learner’s marked reading text, with letters of individual words set at different heights to mark the pitch movement: “... Die Jahre vergingen, / und der Herr wurde alt. Hinter ihm / lag ein Leben voller Entbehrungen.”]
Figure 1. Individual example of learners’ intonation deviations
When discussing the identified deviations in the recording, it is recommended to differentiate between more striking deviations, which could lead to misunderstandings or make communication more difficult, and less irritating deviations. For each learner, this individual diagnosis is documented on an evaluation sheet. Table 1 illustrates such an evaluation sheet5 of the prosody for Chinese learners of German. Since the prosody of a given language cannot be evaluated in terms of “right” and “wrong” but rather shows a continuous spectrum from “not understandable” to “native-like”, a seven-point Likert scale is used to cover gradual pronunciation deviations from 1 (“very deviant”) to 7 (“not deviant”). In order to see progress in the foreign pronunciation, it is helpful to work with the same diagnostic text over a longer period of time. This way, it will be possible to document even small improvements, which are important for the learner’s motivation.
Table 1. Extract from an evaluation sheet of pronunciation for Chinese learners of German
Scale for every criterion: 1 2 3 4 5 6 7 (very deviant ← … → native-like), with one X marking the rating [positions of the X marks in the sample sheet not recoverable]
rhythm
a) segmentation (e.g. number of pauses within phonological phrases)
b) reduction of unstressed syllables
c) syllable structure (e.g. change of syllable structure through deletion or insertion of vowels)
intonation
d) intonation of the whole utterance
e) intonation on punctuation marks
accent positions
f) position of word stress
g) position of phrase accents
means of accentuation
h) duration compared to unstressed syllables
i) loudness compared to unstressed syllables
j) pitch variation compared to unstressed syllables
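For coaches who keep their evaluation sheets electronically, the rating scheme of Table 1 and the comparison of diagnoses over time could be sketched in a few lines of Python. This is only an illustration under stated assumptions, not part of the IPC materials: the names EvaluationSheet and progress, the session labels and all ratings are hypothetical; only the seven-point scale and the criterion labels are taken from Table 1.

```python
from dataclasses import dataclass, field

# Seven-point scale from Table 1: 1 = very deviant ... 7 = native-like.
SCALE_MIN, SCALE_MAX = 1, 7


@dataclass
class EvaluationSheet:
    """One diagnosis of a learner's prosody (hypothetical helper class)."""
    session: str
    ratings: dict = field(default_factory=dict)

    def rate(self, criterion: str, score: int) -> None:
        """Store one rating, rejecting values outside the 1-7 scale."""
        if not SCALE_MIN <= score <= SCALE_MAX:
            raise ValueError(f"score must lie between {SCALE_MIN} and {SCALE_MAX}")
        self.ratings[criterion] = score


def progress(earlier: EvaluationSheet, later: EvaluationSheet) -> dict:
    """Change per criterion between two diagnoses of the same text."""
    return {
        criterion: later.ratings[criterion] - score
        for criterion, score in earlier.ratings.items()
        if criterion in later.ratings
    }


# Invented ratings, for illustration only.
first = EvaluationSheet("session 1")
first.rate("position of word stress", 3)
first.rate("reduction of unstressed syllables", 2)

second = EvaluationSheet("session 2")
second.rate("position of word stress", 5)
second.rate("reduction of unstressed syllables", 2)

print(progress(first, second))
```

A positive difference indicates movement towards the native-like end of the scale; because the same diagnostic text is rated each time, even a change of one scale point can be shown to the learner.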
As some of the pronunciation problems are due to perception difficulties, one can give perception tasks to find out whether the learner has perception problems or not. In order to test the perception of word stress, one can use tasks where the learner is required to identify the stressed syllable in multisyllabic words (see Figure 2 and the related audio samples 1–10 on the CD-ROM):
In this task you are required to identify the stressed syllable. The first two examples are done for you:
example 1: Weih–nachts–mann
example 2: In–sek–ten–stich
1. Pho–ne–tik
2. E–bers–wal–de
3. zwei–und–zwan–zig
4. ver–ab–rei–chen
5. Hei–lig–a–bend
6. Süd–a–fri–ka
7. In–fi–ni–tiv
8. Ost–fries–land
9. Groß–bri–tan–nien
10. um–fah–ren
Figure 2. Perception test: identification of the stressed syllable
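The evaluation of such a perception test can be sketched as follows. The function name, the data layout and the learner answers are hypothetical, and the answer key for the four words shown is an assumption based on standard German word stress, not the key supplied with the audio samples.

```python
def score_stress_test(key, responses):
    """Return the share of correct answers and the words answered wrongly.

    key and responses map each word to the 1-based index of the syllable
    marked as stressed; a missing response counts as an error.
    """
    if not key:
        raise ValueError("the answer key must not be empty")
    errors = [word for word, syllable in key.items()
              if responses.get(word) != syllable]
    share_correct = (len(key) - len(errors)) / len(key)
    return share_correct, errors


# Illustrative key for four of the test words (an assumption, see above).
key = {
    "Pho-ne-tik": 2,
    "E-bers-wal-de": 3,
    "ver-ab-rei-chen": 2,
    "Groß-bri-tan-nien": 3,
}

# Invented learner answers: first-syllable stress chosen twice.
responses = {
    "Pho-ne-tik": 1,
    "E-bers-wal-de": 3,
    "ver-ab-rei-chen": 1,
    "Groß-bri-tan-nien": 3,
}

share_correct, errors = score_stress_test(key, responses)
print(share_correct, errors)
```

The list of wrongly answered words, rather than the bare score, is what the coach would probably want to discuss with the learner, since it shows whether the errors follow a pattern such as consistent first-syllable stress.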
If the learner has difficulties with perceiving word stress6 it seems appropriate to start with listening exercises (identification and discrimination of stressed and unstressed syllables) before going on to produce the stress patterns. Further tests can involve the identification of sentence accent or different accent types. An advantage of a detailed diagnosis at the beginning of the coaching process is the possibility to compare the initial data
with production and perception data of the learner at a later time. This also helps to make the learning progress comprehensible, and, more importantly, visible for both learner and coach. Now that the learner knows about his individual problems, he decides which particular difficulty he wants to work on. Then, the learner and the coach discuss different approaches and possibilities for pronunciation practice. The coach indicates suitable exercises7 and introduces the learner to different techniques like the use of a speaking dictionary. As not every strategy suits everyone, the adult learner chooses the one he thinks would suit him best. Depending on the learner’s difficulties it can be useful to explain the relation between sounds and letters in German, to show articulation places of certain consonants, to show differences in the rhythms between L1 and L2, or to draw the attention of the learner to particular intonation patterns or stress rules. Often, it is helpful to include information structure rules, i.e., to explain which words are highlighted (focussed) and why other words are deaccented in a given text. The explicit knowledge of phonetic and prosodic rules concerning their own difficulties can help the learners to take control of their own pronunciation. The explanations of the coach should not be limited to linguistic knowledge. She can demonstrate how to use language learning software, how to profit from listening tasks, songs, audio books or a vocabulary trainer, where the learner can find rules and exercises for his individual problems on a CD-ROM, which exercises are appropriate for which difficulty, etc. Together, the learner and the coach discuss which procedure could be helpful for the learner’s working on his pronunciation. However, the decision about the path and direction the learner wants to take is up to the learner, since he is the one who has to put his chosen methods into action. 
At the end of the session, the learner formulates his goals and defines the necessary steps to reach them, i.e., the exercises he will do until the next session. This is a kind of verbal contract between the learner and his coach and serves as a starting point for the next session. Usually, those sessions take place on a regular basis, normally every three or four weeks. The time frame for the whole coaching is a few months.
4.2. Subsequent coaching sessions
The subsequent coaching sessions can proceed in the following phases:
1. The learner reports on his learning: what he trained and how he trained, which difficulties he encountered, in which areas he noticed progress, which sub-goals he reached, etc.
2. Taking the learner’s self-evaluation as a starting point, a new diagnosis can be made with the help of a new recording of the learner’s pronunciation. The feedback from the coach with respect to the learner’s performance in certain aspects of pronunciation, and comparisons with former recordings serve as a means for showing the learner his progress in pronunciation.
3. The consequence of this evaluation can be either
a. to maintain the strategies used, if they worked for the learner, and to set a new sub-goal, i.e. to work on the next pronunciation difficulty, or
b. to revise the procedure, if it did not suit the learner’s needs. In the latter case, the original goal would be maintained and a new strategy, i.e., other methods and/or exercises, should be tried.
4. During the coaching session, it can be necessary to make the learner aware of pronunciation rules of the target language or to explain and demonstrate new learning strategies, e.g. how to use a pronunciation dictionary, how to concentrate on certain aspects of prosody, etc.
5. A last step consists in the agreement on the next sub-goals and the learning strategies to reach them.
Depending on the learner’s personality and their capability of self-reflection, there can be slight deviations from this procedure. The above-mentioned phases, however, have proven successful in the practical application of IPC. Moreover, they give the sessions a fixed structure, which is often perceived as helpful (Kleppin 2003, Kleppin and Mehlhorn 2005).
5. Language awareness
5.1. Foreign language pronunciation and language awareness
According to Little (1999), the autonomous learner possesses language awareness. Learners come to the IPC with a certain learning need. After analyzing the learner’s performance in the diagnosis, their “awareness of [his own] learning needs” (Lernbedarfsbewusstheit, Knapp-Potthoff 1997: 13) becomes more acute, since they are now in a position to make a more concrete evaluation of their ability and the pronunciation phenomena they want to improve. The learner needs to develop awareness and monitoring skills that will allow learning opportunities outside the coaching environment (Otlowski 1998). In order to raise language awareness, it is necessary to direct the learner’s attention to form, i.e., to how utterances are realized, e.g. where the differences between the rhythm of L1 and L2 are, how the intonation of a given sentence sounds or where the stress is put (for the concept of focus on form see Long 1991). A first step towards raised language awareness can be taken by concentrating on problematic sounds or intonation patterns. Here, language-related knowledge is helpful, e.g., knowledge about the existence of final devoicing and glottal stops in German, stress rules, the information structure of utterances or the use of certain intonation patterns. A next step of consciousness raising could consist of letting the learner hear his deviations. He listens to his recording and has to concentrate on the marked items. This is done to account for the fact that unless the learner recognizes his deviations, he is hardly able to change his prosody. Once the learner is able to hear his pronunciation difficulties, he can pay attention to the given problems while reading aloud. At the beginning, the learner marks the pronunciation phenomena he wants to focus on in the given text.
A further step towards language awareness is reached when the learner succeeds in identifying the pronunciation difficulty under consideration in texts that are unknown to him. The receptive sensitivity thus attained shows in his noticing individual pronunciation difficulties in the speech of others, often fellow countrymen, and in his own speech. In certain situations, the learner should succeed more and more in concentrating on formal aspects in the speech of native speakers, e.g. the speech melody in polite requests or the realization of reduced vowels in unstressed syllables. However, noticing such phenomena and being more aware of them does not mean that
the learner can produce them automatically in an adequate way. Nevertheless, the noticing of deviations is an important prerequisite for controlling one’s own pronunciation. Through the raised awareness concerning the foreign pronunciation, the learner advances hypotheses concerning the target pronunciation and prosody, e.g. that stressed syllables are longer than unstressed ones. The coach encourages the learner to build hypotheses, on the one hand, but, on the other hand, she tries to restrict possible overgeneralizations made by the learner (e.g. in German there are short stressed vowels as well). In my experience, such hypotheses and other mnemonic devices the learner has created for himself are more helpful for him than abstract phonetic rules.
If the learner has developed the ability to focus on his difficulties in a written text and has achieved a correct pronunciation, the next step consists of applying this knowledge to spontaneous speech. One possibility to reach this ultimate goal is the use of different word lists. Figure 3 demonstrates an example of a word list for a Slovak learner of German who consistently placed the word stress on the first syllable. Therefore, during the coaching she made a list with the most problematic words on which she wanted to concentrate in the following weeks.

interesSANT    ‘interesting’
überSETzen    ‘to translate’
AleXANdra    (the name of a fellow student)
die SlowaKEI, aber: der SloWAke, die SloWAkin    ‘Slovakia; but: Slovak’
das PERfekt, aber: (Das ist) perFEKT.8    ‘the perfect tense; but: This is perfect.’

Figure 3. Word list (main emphasis: word stress)
Then, the learner discusses with the coach in which context she will try to apply this list, e.g., in a prepared, fairly emotion-free setting such as an oral presentation. If she succeeds in these situations, she will gradually master these words in all kinds of spontaneous speech. It is well known that it is easier to concentrate on word stress or segmental particularities than on suprasegmental particularities. The question
is then which techniques to use to raise the learner’s awareness of prosody, making use of their individual cognitive learning styles. As many learners are visual perceivers, one can use both visual and auditory means to illustrate certain differences between the learner’s native and the target language, e.g., the different rhythm patterns of syllable-timed and stress-timed languages.9 The picture in Figure 4 was developed by Özen (1986: 13) who treated the differences between Turkish and German.
[Panels: syllable-timed, stress-timed]
Figure 4. Rhythm of L1 vs. L2 (duration and pitch height of syllables)
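The contrast shown in Figure 4 can also be worked through numerically. The following Python sketch is an idealization, not a model of measured speech: it assumes that a syllable-timed rhythm gives every syllable the same duration, while a stress-timed rhythm gives every stretch from one accented syllable to the next the same duration. The function names, the millisecond values and the accent pattern are hypothetical and chosen only for illustration.

```python
def syllable_timed(accents, syllable_ms=200):
    """Idealized syllable timing: every syllable has the same duration."""
    return [syllable_ms] * len(accents)


def stress_timed(accents, foot_ms=400):
    """Idealized stress timing: every foot has the same total duration.

    A foot here is an accented syllable plus the unaccented syllables that
    follow it. Its duration is shared equally among its syllables, which is
    a deliberate simplification (assumption, not a measured model).
    """
    feet = []  # number of syllables in each foot
    for accented in accents:
        if accented or not feet:
            # Start a new foot; leading unaccented syllables form their own group.
            feet.append(0)
        feet[-1] += 1
    durations = []
    for size in feet:
        durations.extend([foot_ms / size] * size)
    return durations


# Hypothetical accent pattern: True marks an accented syllable.
pattern = [True, False, True, False, False, False, True]
print(syllable_timed(pattern))  # [200, 200, 200, 200, 200, 200, 200]
print(stress_timed(pattern))    # [200.0, 200.0, 100.0, 100.0, 100.0, 100.0, 400.0]
```

In the stress-timed output, the foot that contains three unaccented syllables is compressed to 100 ms per syllable, which is the kind of shortening the picture is meant to make visible.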
The picture in Figure 4 can also be used for learners with Romance L1s if their rhythm is strikingly deviant from the German one. What this contrast illustrates nicely is the steady rhythm of a syllable-timed language, where all syllables have nearly the same length. In a stress-timed language, it is the distance between two accented syllables which has approximately the same length. Therefore, if there are many unaccented syllables between two accented ones, they are produced much more quickly, so that reduction takes place.
Visual displays of speech are another valuable means for raising language awareness of the various aspects of speech, especially intonation. The coach can extract a problematic sentence from the diagnosis text of the learner and transfer it to a computer program which generates the pitch contour (fundamental frequency) of the sentence. For the following example, the computer program PRAAT was used. Among several other functionalities (which are not crucial for our purposes), PRAAT provides a visual display and a feature that enables the user to overlay a native speaker version of a given sentence on the learner’s version. Figure 5 shows the realization of a sentence spoken by a German native speaker (in the upper half of the figure, cf. audio example 11) and the learner (here: a Russian native speaker, cf. audio example 12). The way the Russian speaker produces this sentence is characterized by a higher number of pitch accents than required, and a wider pitch range on the accented syllables.10 These prosodic features may lead to misjudgements on the part of the hearer with respect to the speaker’s intention or attitude. Comparing the two realizations of this sentence, one can not only hear but also see that there are at least two additional (redundant) pitch accents on the indefinite article “einem” (at the beginning of the sentence) and the verb form “fand”. For the learner, this illustration is extremely helpful because she can see the contrast between the two spoken sentences and the concrete deviations directly, as they are shown in a visual model of speech. This is a lot less abstract than simply saying that her pitch range is too wide and that she produces too many pitch accents.
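The two observations made about the learner’s contour, a wider pitch range and additional pitch accents, can be approximated from the fundamental frequency values themselves. The sketch below is written in plain Python and does not use PRAAT or any of its functions; it assumes that the F0 values (in Hz, with 0 for unvoiced frames) have already been exported from a pitch tracker. The two contours, the function names and the 3-semitone threshold are hypothetical, and counting F0 peaks is only a rough stand-in for identifying pitch accents, which requires a phonological analysis.

```python
import math


def to_semitones(f0_hz):
    """Convert voiced F0 values (Hz) to semitones above the speaker's lowest value."""
    voiced = [f for f in f0_hz if f > 0]  # 0 marks an unvoiced frame
    floor = min(voiced)                   # assumes at least one voiced frame
    return [12 * math.log2(f / floor) for f in voiced]


def pitch_range_semitones(f0_hz):
    """Distance between the lowest and the highest voiced F0 value, in semitones."""
    return max(to_semitones(f0_hz))


def count_pitch_peaks(f0_hz, min_rise_st=3.0):
    """Count local F0 maxima that rise at least min_rise_st semitones
    above the lowest value reached since the previously counted peak."""
    st = to_semitones(f0_hz)
    peaks = 0
    valley = st[0]
    for i in range(1, len(st) - 1):
        valley = min(valley, st[i - 1])
        is_local_max = st[i] > st[i - 1] and st[i] >= st[i + 1]
        if is_local_max and st[i] - valley >= min_rise_st:
            peaks += 1
            valley = st[i]  # restart the valley search after this peak
    return peaks


# Hypothetical contours for the same sentence (invented values, not measured data).
native_f0 = [110, 115, 120, 150, 130, 115, 110, 105, 100]
learner_f0 = [110, 160, 115, 110, 170, 120, 105, 165, 100]

print(round(pitch_range_semitones(native_f0), 1), count_pitch_peaks(native_f0))    # 7.0 1
print(round(pitch_range_semitones(learner_f0), 1), count_pitch_peaks(learner_f0))  # 9.2 3
```

Expressing the range in semitones relative to each speaker’s own lowest value, rather than in Hz, keeps differences in absolute pitch height between speakers out of the comparison.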
Figure 5. Illustration of deviant intonation of a learner with Russian L1 (bottom) compared to native model (top)
A related problem is seen in Figure 6, where an interrogative sentence spoken by the (speech) model (cf. audio example 13) is not replicated successfully by the Chinese speaker (cf. audio example 14):
Figure 6. Illustration of deviant prosody of a learner with Chinese L1 (bottom) compared to German model (top)
As can be seen in the lower half of the figure, the Chinese student (unintentionally) splits the utterance into two intonation phrases. This is caused by the realization of a strong rise on the unstressed syllable of “Abend” – the word carrying the focus accent in this utterance – and a new “onset” on the next word “wird”, which starts very low. Furthermore, one can see that the words and syllables which should be deaccented are pronounced too long by the learner. Hence, this example shows both intonation and rhythm deviations. The coach should give an adequate explanation of what the computer display shows, i.e. information about what it represents and the articulatory correlates of the acoustic signal. If the deviations illustrated here are explained to the learner in a comprehensible way, visible speech can be a valuable learning device for the learner. This only works if the learner is able to “read” the important information from the intonation contour. She
has to abstract from irrelevant individual details like absolute pitch height or the overall length of the utterance, for example.
If the learner succeeds in focussing on form, then noticing can take place. The term noticing is defined as the recognition of specific structures in the target language as a consequence of focussed attention (Schmidt 1995; Eckerth and Riemer 2000). This is a prerequisite for processing these structures and their integration in the learner language. It is assumed that what learners notice in input is what becomes intake for learning (noticing hypothesis, Schmidt 1990). Noticing takes place when learners compare their own performance with the native speaker’s performance and recognize deviations (Schmidt and Frota 1986). In foreign language acquisition, noticing is a necessary step but not a sufficient one since noticing alone does not yet lead to a proper realisation of the target prosody. While noticing refers to surface phenomena, i.e. the learning of individual information as well as its anchoring in short-term memory, understanding refers to deeper levels of abstraction, like the organization and restructuring of information in long-term memory (Eckerth and Riemer 2000: 230). For adult learners, it is important to understand the nature of their deviations.
In IPC, the coach tries to maximize the possibilities of noticing via focus on form. This enables a reflection and awareness process which yields deeper processing and hence more profound learning – even outside the coaching context. The following quotations from coaching sessions show that the learners focus on their own pronunciation as well as on the pronunciation of their fellow countrymen and German native speakers:
(1) “I tried to speak simultaneously with the speaker on the CD and I noticed the different speed. „An einem Abend im April …“, the speaker makes a pause, while I continue reading … my segmentation of the text was different.”
(2) “Yesterday I took the bus. There was a woman who wanted to get off and said (citing) „Darf ich bitte vorbei?“ This was exactly the intonation, this polite intonation! This polite question intonation! In this moment, I heard it!”
(3) “In the beginning, I was speaking too high. Now it’s going much better.”
(4) “Now, I hear much more, my own errors. And I notice other Italians making the same errors.”
(5) “I just came back from Russia after a long time away. There I suddenly noticed the newsreaders indeed speaking totally different from the German ones. She speaks somehow “emotionally”, so “excitedly”. Before, I didn’t notice that.”
5.2. Learning awareness
Edmondson (1997) and De Florio-Hansen (1997) carve out a sub-concept of language awareness: “Sprachlernbewusstheit” (learning awareness). According to Edmondson’s (1997) definition, language (learning) awareness is the explicit and implicit knowledge of learners concerning their own learning processes, their learning motivation, their personal learning styles, and inventories of learning strategies. It is obtained through experience and reflection (1997: 93). This knowledge also seems to influence the success of foreign language learning. Through the individual coaching and the self-determined work on their own pronunciation, the learning awareness of the learner is sharpened. The learner tries several learning strategies in order to reach their goals. A separate checklist (Table 2) gives an overview of possible pronunciation learning strategies. With the help of this checklist, the learners can evaluate which strategies they know and use successfully and which they could try out. The coach can explain and demonstrate new learning strategies.
Table 2. Extract from a checklist of learning strategies
Answer columns: I will try this | I did not do this | I did this | This is not relevant for me
In order to improve my pronunciation …
… I plan a certain pronunciation phenomenon for practising (instead of learning pronunciation in an unfocussed and incidental manner)
… I begin with listening tasks before doing pronunciation exercises
… I learn transcription symbols of German and write down information about new words’ pronunciation while studying them
… I use listening and pronunciation exercises with a key (in order to control my performance)
… I practise German prosody with longer segments (i.e. compounds, phrases, sentences, texts)
… I practise prosody with the help of audio versions of a text
… I ask my tandem partner to record an interesting German text on tape so that I can work with it
… I record my own speech and compare my performance with the prosody of the native speaker
… I ask native speakers to correct deviant pronunciation
… I pay attention to word stress, sentence accent, and pauses while working with language learning software or while listening to the foreign language
… I mark certain phenomena in an exercise text (e.g. pauses, intonation arrows) which I want to focus on when imitating later
… I create mnemonic devices to remember the articulation of difficult sounds and problematic intonation patterns
… I learn pronunciation with songs and poems
… I read texts aloud
… I read sentences simultaneously with the native model on the CD
…
This is an open list to which learners should add their individual strategies. It is helpful to fill out the same checklist again after a few months and to compare it with the learner’s original list of learning strategies. This way, learner and coach are able to see which new learning strategies the learner has found for himself since the first coaching session. Even the insight into which learning paths do not suit the learner can help raise his language awareness. Depending on their individual learning styles, learners prefer different strategies. The knowledge of different learning strategies and the reflection on the learning method help the learner to evaluate which procedure is appropriate for them (Rampillon 2000; Tönshoff 2003). Strategies and techniques which the learner used successfully for improving their German pronunciation should facilitate the learning of further languages in many cases. The more strategies the learner tries, the more consciously they decide to adopt a certain procedure.
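The comparison of two completed checklists can be sketched as a simple lookup. The example assumes that each checklist is stored as a mapping from a strategy to the answer the learner ticked; the data, the function name and the storage format are hypothetical and serve only to illustrate the comparison.

```python
DONE = "I did this"


def newly_adopted(first, later):
    """Return strategies marked as done in the later checklist but not in the first."""
    return sorted(
        strategy
        for strategy, answer in later.items()
        if answer == DONE and first.get(strategy) != DONE
    )


# Hypothetical answers from the first session and from a few months later.
first_session = {
    "I read texts aloud": "I did this",
    "I record my own speech and compare it with the native speaker": "I will try this",
    "I learn pronunciation with songs and poems": "I did not do this",
}
after_some_months = {
    "I read texts aloud": "I did this",
    "I record my own speech and compare it with the native speaker": "I did this",
    "I learn pronunciation with songs and poems": "This is not relevant for me",
    "I mark pauses in an exercise text": "I did this",  # strategy added by the learner
}

for strategy in newly_adopted(first_session, after_some_months):
    print(strategy)
```

Because the list is open, the lookup also picks up strategies that appear only in the later checklist, such as the one the learner added.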
6.
The role of corrective feedback and self-evaluation
The potential for motivating the learner can be extremely high in individual coaching (Kleppin 2004) if the coach is able to give constructive feedback. Here it is important not to regard the learner as deficient, but to concentrate on their learning progress. Therefore, feedback should promote the learning process but not overtax the learner. Hence, for example, it is not useful to list all segmental deviations in the analysis of a text read by the learner, if it was their goal to concentrate on the intonation in this text. Feedback should be differentiated, to allow the learner to develop sensitivity with respect to his pronunciation problems, but only to an extent that does not have a negative impact on his motivation to improve his pronunciation. At the same time, the learner should feel that increased language awareness is already learning progress. It would be a missed opportunity for motivation if this were not treated as a success in itself, i.e. if the learning progress were only measured by the language production of the learner. Eventually, the learner should be encouraged to “organize” his feedback himself. This can be done by asking acquainted native speakers to pay attention to certain “problematic” sounds and to correct the learner, or through the individual work with audio media and learning software, if
these offer informative feedback. Hardison (2004) found significant effects of computer-assisted training in the acquisition of L2 prosody and, more importantly, a generalization as to segmental accuracy and novel sentences (for further examples of computer-assisted prosody programs see Stibbard 1996; Chun 1998). The native speaker pitch contour displays can serve as salient feedback for the comparison with the learner’s attempts. Students interested in technology may enjoy working with computer software. The auditory and visual feedback should contribute to their learning. The reflection on the learning experience, the learning progress, and the evaluation of successfully used learning strategies is bound to open new perspectives for the organization of further learning (Weskamp 2003). If the learner is able to identify “weak spots” himself and to draw the right consequences from self-observation, comparison with a standard, and self-evaluation, he will no longer depend on the feedback of the coach. The language (learning) awareness he has reached, and the gain in self-confidence, which enables him to overcome his hesitation when speaking the target language, form the prerequisites for further autonomous learning and pronunciation improvement.
7.
Concluding remarks
IPC is not meant to substitute, but to complement pronunciation training in the classroom and is specifically aimed at overcoming the barrier to acquiring a correct pronunciation. Therefore it is sensible to offer IPC to accompany language courses. At self-learning centres where there is a special need for individual coaching (Langner 2004), the presence of pronunciation coaches is particularly desirable. As a rule, IPC should be voluntary. Possible target groups are foreign students, future language teachers and interpreters as well as learners who want to improve their pronunciation. Experience shows that learners who know their specific pronunciation problems and have worked independently on the improvement of their pronunciation profit much more from autonomous language learning. This article provided a detailed overview of IPC. Since little attention is given to pronunciation in foreign language classrooms and students seldom know how to practise pronunciation outside of class, IPC can play a valuable role in enhancing learner autonomy in the areas of pronunciation that
cause difficulties for foreign learners. One of the key points discussed is how to draw the learners’ attention to the prosodic features of German. Since there are as many different learning styles and preferences as there are learners, there is no particular type of training that would suit all students. However, it seems to be beneficial to the individual learner to be acquainted with several learning strategies and to try those strategies in order to find the right method for oneself. It was argued that perception and production tests at several points in the coaching process and constructive feedback by the coach can yield a diagnosis of pronunciation deviations and, at the same time, document the learning progress of the learner. This should raise the language awareness of learners and increase their confidence when speaking the foreign language.
Notes
1. The term pronunciation used here refers both to the segmental and suprasegmental aspects of learners’ utterances in the target language.
2. DSH is the abbreviation of “Deutsche Sprachprüfung für den Hochschulzugang ausländischer Studienbewerber”; TestDaF means “Test Deutsch als Fremdsprache”. Both tests are designed to test whether the foreign German learners’ language abilities are sufficient for studying at German universities.
3. The autonomous learner himself decides to improve his pronunciation, decides when and where to work with which material and which strategies he wants to use.
4. This is an extract of a text taken from the pronunciation material for German learners “Simsalabim” (Hirschfeld and Reinke 1998). In addition to the printed text, there exists a spoken version of a native speaker on audio cassette so that the learner can do further work on this text at home.
5. The evaluation sheet is an adaptation from Dieling and Hirschfeld (2000: 198) for German as a Foreign Language and was modified for Chinese learners. Apart from prosody, it contains segmental deviances for vowels, consonants, consonant clusters, etc.
6. Dupoux, Peperkamp and Sebastián-Gallés (2001) observed in their experiments a tendency for “stress deafness” in native speakers of French.
7. For exercises and pronunciation material for German as a foreign language, see Hirschfeld and Trouvain (this volume).
8. A similar problem arises for learners of English as a foreign language. While the word “perfect” as a noun and an adjective demands the stress on the first syllable, the verb “to perfect” is stressed on the second syllable.
9. Although the traditional syllable-timed/stress-timed distinction is not supported by experimental findings and the classification is by no means clear-cut (e.g., Bertinetto 1989), there are significant rhythmic differences between languages like German, English and Russian on the one hand (“stress-timed” languages) and languages like French, Turkish or Chinese on the other (“syllable-timed” languages). For the purpose of raising the learners’ awareness of rhythmic differences between the mother tongue and the target language, it seems legitimate to exaggerate those differences.
10. This “typical Russian” intonation seems to be responsible for negative emotions felt by some native speakers of German who intuitively rate this kind of speaking as “exaggerated” or “theatrical”. Müller (1994: 182) describes this as an impression of irritated, unduly emotional language. For the Russian speaker, however, this was merely a normal and objective (i.e. in no way emotional) piece of information.
References
Baran, Małgorzata
2002 The advantage of auditory perceivers and sharpeners in learning foreign language pronunciation. In: Ewa Waniek-Klimczak and Patrick James Melia (eds.), Accents and Speech in Teaching English Phonetics and Phonology. EFL perspective, 315–327. Frankfurt: Lang.
Benson, Phil
2001 Teaching and Researching Autonomy in Language Learning. London: Longman.
Bertinetto, Pier Marco
1989 Reflections on the dichotomy ‘stress’ vs. ‘syllable-timing’. In: Revue de Phonétique Appliquée 91-93, 99–130.
Brammerts, Helmut, Mike Calvert and Karin Kleppin
2005 Ziele und Wege bei der individuellen Lernberatung. In: Helmut Brammerts and Karin Kleppin (eds.), Selbstgesteuertes Sprachenlernen im Tandem. Ein Handbuch, 47–54. Tübingen: Stauffenburg.
Brazil, David
1994 Pronunciation for Advanced Learners of English. Cambridge University Press.
Chun, Dorothy M.
1998 Signal analysis software for teaching discourse intonation. Language Learning & Technology 2, 61–77.
Cotteral, Sarah and David Crabbe (eds.)
1999 Learner Autonomy in Language Learning: Defining the Field and Effecting Change. Frankfurt: Lang.
Cunningham-Andersson, Una
1997 Native speaker reactions to non-native speech. In: Allan James and Jonathan Leather (eds.), Second-Language Speech. Structure and Process, 133–144. Berlin, New York: Mouton de Gruyter.
De Florio-Hansen, Inez
1997 ‘Learning Awareness’ als Teil von ‘Language Awareness’. Zur Sprachbewußtheit von Lehramtsstudierenden. Fremdsprachen Lehren und Lernen 26, 144–155.
De Jong, John H.A.L. and Ulrike Kaunzner
2000 Acoustic training and development of general language proficiency. In: Ulrike Kaunzner (ed.), Pronunciation and the Adult Learner: Limitations and Possibilities. (Bibliotheca della Scuola Superiore di Lingue Moderne per Interpreti e Traduttori, Forli 27). Bologna: CLUEB.
Dieling, Helga
1989 Zu einigen Aspekten des Hörens im Fremdsprachenunterricht. In: Christian Gutowski and Eberhard Stock (eds.), Phonetik des Deutschen: Grundlagen und Anwendungen, 30–43. Halle.
Dieling, Helga and Ursula Hirschfeld
2000 Phonetik lehren und lernen. München: Langenscheidt.
Dupoux, Emmanuel, Sharon Peperkamp and Núria Sebastián-Gallés
2001 A robust method to study stress ‘deafness’. Journal of the Acoustical Society of America 110, 1606–1618.
Eckerth, Johannes and Claudia Riemer
2000 Awareness and Motivation: Noticing als Bindeglied zwischen kognitiven und affektiven Faktoren des Fremdsprachenlernens. In: Claudia Riemer (ed.), Cognitive Aspects of Foreign Language Learning and Teaching, 228–246. Tübingen: Narr.
Edmondson, Willis J.
1997 Sprachlernbewußtheit und Motivation beim Fremdsprachenlernen. Fremdsprachen Lehren und Lernen 26, 88–110.
Gehrmann, Siegfried
1999 Sprechen als Tätigkeit. Koordinations- und lerntheoretische Grundlagen des zweitsprachlichen Ausspracheerwerbs. Heidelberg: Universitätsverlag.
Gibbon, Dafydd
1998 Intonation in German. In: Daniel Hirst and Albert Di Christo (eds.), Intonation Systems. A Survey of Twenty Languages, 78–95. Cambridge University Press.
Grotjahn, Rüdiger
1998 Ausspracheunterricht: Ausgewählte Befunde aus der Grundlagenforschung und didaktisch-methodische Implikationen. Zeitschrift für Fremdsprachenforschung 9, 35–83.
Gut, Ulrike
2003 Prosody in second language speech production: the role of the native language. Fremdsprachen Lehren und Lernen 32, 133–151.
Hardison, Debra M.
2004 Generalization of computer-assisted prosody training: quantitative and qualitative findings. Language Learning & Technology 8, 34–52.
Hirschfeld, Ursula, Heinrich P. Kelz and Ursula Müller (eds.)
2003ff. Phonetik international. Von Afrikaans bis Zulu. Kontrastive Studien für Deutsch als Fremdsprache. Waldsteinberg: Heidrun Popp Verlag. (www.phonetik-international.de)
Hirschfeld, Ursula and Kerstin Reinke
1998 Phonetik Simsalabim. Ein Übungskurs für Deutschlernende. Berlin: Langenscheidt.
Kaltenbacher, Erika
1998 Zum Sprachrhythmus des Deutschen und seinem Erwerb. In: Heide Wegener (ed.), Eine zweite Sprache lernen. Empirische Untersuchungen zum Zweitspracherwerb, 21–38. Tübingen: Narr.
Kleppin, Karin
2003 Sprachlernberatung: Zur Notwendigkeit eines eigenständigen Ausbildungsmoduls. Zeitschrift für Fremdsprachenforschung 1/2003, 71–85.
2004 ‘Bei dem Lehrer kann man ja nichts lernen!’ Zur Unterstützung von Motivation durch Sprachlernberatung. Zeitschrift für interkulturellen Fremdsprachenunterricht 2/2004. http://zif.spz.tu-darmstadt.de/jg-09-2/beitrag/Kleppin2.htm (21.05.2005).
Kleppin, Karin and Grit Mehlhorn
2005 Sprachlernberatung. In: Rüdiger Ahrens and Ursula Weier (eds.), Englisch in der Erwachsenenbildung des 21. Jahrhunderts, 71–90. Heidelberg: Universitätsverlag.
Knapp-Potthoff, Annelie
1997 Sprach(lern)bewußtheit im Kontext. Fremdsprachen Lehren und Lernen 26, 9–23.
Langner, Michael
2004 Sprachenlernen – Lernberatung – Neue Medien. Didaktische Verbundkonzeptionen in der Spannung zwischen Autonomie und Sprachunterricht. In: Christina Lang and Gerhard von der Hand (eds.), Sprachenlernen im Verbund, 101–117. Bielefeld: wbv.
Little, David
1999 Metalinguistic awareness: The cornerstone of learner autonomy. In: Bettina Mißler and Uwe Multhaup (eds.), The Construction of Knowledge, Learner Autonomy and Related Issues in Foreign Language Learning, 3–12. Tübingen: Stauffenburg.
Long, Mike
1991 Focus on form: a design feature in language teaching methodology. In: Kees de Bot, Ralph B. Ginsberg and Claire Kramsch (eds.), Foreign Language Research in Cross-Cultural Perspective, 39–51. Amsterdam: Benjamins.
Mehlhorn, Grit
2005 Studienbegleitung für ausländische Studierende an deutschen Hochschulen. Individuelle Lernberatung – ein Leitfaden für die Beratungspraxis. Unter Mitarbeit von Karl-Richard Bausch, Tina Claußen, Beate Helbig-Reuter und Karin Kleppin. München: Iudicium.
Müller, Ursula
1994 Phonetische Probleme und ihre Ursachen bei Deutschlernern mit der Muttersprache Russisch. In: Bernd Spillner (ed.), Fachkommunikation. Kongreßbeiträge zur 24. Jahrestagung der Gesellschaft für Angewandte Linguistik GAL e.V. (= Forum Angewandte Linguistik, 27), 177–183. Frankfurt: Lang.
Otlowski, Marcus
1998 Pronunciation: What are the expectations? The Internet TESL Journal, Vol. IV, 1/1998, http://iteslj.org/Articles/Otlowski-Pronunciation.html (21.5.2005).
Özen, Erhan
1986 Phonetische Probleme türkischsprachiger Deutschlerner. Teil 1: Der andere Rhythmus. Deutsch lernen. Zeitschrift für den Sprachunterricht mit ausländischen Arbeitnehmern. Heft 3, 11–55.
Rampillon, Ute
2000 Aufgabentypologie zum autonomen Lernen Deutsch als Fremdsprache. Ismaning: Hueber.
Riley, Philip
1997 The guru and the conjurer: aspects of counselling for self-access. In: Phil Benson and Peter Voller (eds.), Autonomy and Independence in Language Learning, 114–131. London, New York: Longman.
Schmelter, Lars
2004 Selbstgesteuertes und potenziell expansives Fremdsprachenlernen im Tandem. (= Gießener Beiträge zur Fremdsprachendidaktik), Tübingen: Narr.
Schmidt, Richard W.
1990 The role of consciousness in second language learning. Applied Linguistics 11, 129–158.
1995 Consciousness and foreign language learning: A tutorial on the role of attention and awareness in learning. In: Richard Schmidt (ed.), Attention and Awareness in Foreign Language Learning, 1–63. Honolulu, Hawai‘i: University of Hawai‘i, Second Language Teaching & Curriculum Center.
2001 Attention. In: Peter Robinson (ed.), Cognition and Second Language Instruction, 3–32. Cambridge University Press.
Schmidt, Richard W. and Sylvia Nagem Frota
1986 Developing basic conversational ability in a second language: a case study of an adult learner of Portuguese. In: Richard R. Day (ed.), Talking to Learn: Conversation in Second Language Acquisition, 237–336. Rowley, MA: Newbury House.
Stibbard, Richard
1996 Teaching English intonation with a visual display of fundamental frequency. The Internet TESL Journal, Vol. II, No. 8, 1996. http://iteslj.org/Articles/Stibbard-Intonation/ (21.5.05).
Tönshoff, Wolfgang
2003 Lernerstrategien. In: Karl-Richard Bausch, Herbert Christ and Hans-Jürgen Krumm (eds.), Handbuch Fremdsprachenunterricht, 331–335. Tübingen, Basel: A. Francke.
Weskamp, Ralf
2003 Self-assessment/Selbstkontrolle, Selbsteinschätzung und -einstufung. In: Karl-Richard Bausch, Herbert Christ and Hans-Jürgen Krumm (eds.), Handbuch Fremdsprachenunterricht, 382–384. Tübingen, Basel: A. Francke.
Prosodic training for adult Italian learners of German: the Contrastive Prosody Method
Federica Missaglia
1.
Introduction
This paper presents the Contrastive Prosody Method (CPM), a prosody-centred pronunciation training method aimed at developing prosodic competence in L2-German of adult Italian learners, including both beginners and advanced learners. Using the familiarity with Italian learners’ pronunciation difficulties as the empirical starting-point, the CPM is aimed at preventing and correcting specific prosodic errors and fossilized features in L2-perception and production, mainly concerning intonation contours and word and sentence stresses. The method is also intended to develop prosodic awareness, a native-like ability to identify and discriminate between different prosodic variants relevant in transmitting, together with the verbal component of speech, the speaker’s communicative intentions and emotions. In the CPM, prosody is given priority with respect to segments not only in content, but also as a means to produce adequate speech acts and to develop communicative competence. The CPM is characterized by both a bilingual and a contrastive approach: the students are never considered simply language learners, i.e. potential L2-speakers, but are always treated as bilingual speakers. Moreover, while training L2-prosody, the learners’ L1 is never excluded: L1-prosody is used as a means to produce and acquire accurate word and sentence stresses, and correct intonation contours in L2. Correct prosodic perception is attained by proper identification of L2-suprasegmentals, achieved by monitoring L1-suprasegmentals. This effect extends to other levels, leading to native-like German prosody and segments. Before illustrating the method in detail (section 4), I will briefly present the theoretical and empirical framework of the CPM (section 2) and I will discuss the different concepts of both the “bilingual approach” (section 3.1) and the “contrastive approach” (section 3.2) in relation to the method. Finally, empirical data obtained by comparing traditional segment-centred with prosody-centred pronunciation training will be investigated in view of L2-prosody and segments (section 5).
2.
Theoretical and empirical framework
The CPM was developed within a multidisciplinary theoretical framework primarily concerned with cognitive and emotional aspects of phonetic and phonological development in first and second language acquisition. Research on cognition and emotion in language acquisition was later supplemented with further results in the fields of psychoacoustic and experimental phonetics, theoretical and experimental neurology (Damasio 1994, 1999, 2003; Hüther 2002), neurolinguistics and bilingualism (Paradis 1994, 1997; De Houwer 1995; Fabbro 1996; Mack 2003; Bhatia and Ritchie 2004), cognitive psychology applied to perception and categorization in phonetic and phonological acquisition (Kuhl 1993b, 1995; Miller 1994; Kuhl and Meltzoff 1995, 1997), and mental representation of linguistic items (Sendlmeier 1989, 1996). Practical experience in L2-German pronunciation training courses designed for Italian university students was fundamental to the CPM, together with theoretical and experimental research in phonetics and phonology applied to Italian (L1) and German (L2), but also to the specific learner group’s interlanguage (Ioup and Weinberger 1987; Missaglia 1999a). The empirical starting-point of the CPM was the error analysis of linguistic competence tests carried out with advanced Italian learners, i.e. adult learners with high level L2 competence, mostly bilinguals with an intensive German scholastic background (for details see Missaglia 1997). The aim was to collect a corpus of specific interferences, i.e. the fossilized forms of near-native L2-learners, to define their competence level, which was supposed to be the students’ target after advanced L2-courses in an institutional setting (school, university, etc.). However, the errors, numerous and distributed on all phonetic levels (i.e. the segmental, intersegmental and suprasegmental level), largely surpassed the interferences which were thought to be sporadic and specific. 
The bilinguals’ phonological interferences were mostly comparable with those of Italian high school or university students. Data involving cross-sectional research on absolute beginners and the bilingual group showed that both learner groups were not equipped
to discriminate specific features of German prosody and tended to carry incompatible Italian intonation contours and stress patterns over into German contexts. The phonetic performances of the two groups – beginners vs. very advanced (bilingual) learners – did not differ significantly in qualitative terms; the differences between them did not mainly concern the error types, but rather the extent or degree of error. Most errors and interferences at the segmental level were not primarily related to incorrect pronunciation of single segments, but rather to a lack of competence at the suprasegmental level. Incorrect L2-pronunciation by Italian native speakers is mainly attributable to the learners’ distorted perception of L2 sounds and prosody, i.e. a perception filtered by L1, rather than to defective speech or to a deficit in the speakers’ phonatory apparatus. It appears that the so-called foreign accent may not be related to an articulatory deficit, i.e. the learners’ phonetic incapacity to produce L2 sounds, but rather to an incorrect categorial – phonological – interpretation. Trubetzkoy (1939) already pointed to the interaction of perception and production, and recent results in the field of language acquisition corroborate the theory of a perception-production interdependence (Vihman 1993, 1996; Kuhl 1995; Strange 1995a, 1995b). Experiments on the transition from universal perception and production patterns towards language-specific patterns, i.e. from the perception of phonetic differences by newborn babies towards a categorial perception of L1 sounds after the 6th and 10th month (for vowels and consonants, respectively, cf. Kuhl et al.
1992; De Boysson-Bardies 1993; Werker and Polka 1993; Vihman 1996; Werker 2003) led to the hypothesis that sound categories are mentally represented as phonetic prototypes (for the prototype concept in cognitive psychology see Rosch 1973, 1975; in phonetics Kuhl 1992; Miller 1994), rather than as abstract phonemes or as bundles of distinctive features, as was stated in traditional structuralist linguistics. Kuhl’s Native Language Magnet Theory (Kuhl 1991, 1993; Kuhl and Iverson 1995) holds that phonetic prototypes, i.e. the central and most representative instances of phonological categories, act as perception magnets. They attract the sounds belonging to the same category and hinder native speakers from perceiving acoustic differences between prototypes and phonetically similar sounds. The perception patterns based on phonetic prototypes are language-specific insofar as the perception categories tally with the phonological categories of the language. On the basis of language-specific perception patterns, analogous production patterns are established, enabling native speakers to produce L1 sounds correctly. L1 phonetic prototypes act as phonetic magnets in L2 perception and production, too, thus leading to the so-called foreign accent. In the light of prototype theory it can be assumed that the phonetic and prosodic errors made by Italians when speaking German are generated while learning L2; the fact that they are common to advanced learners and beginners leads to the hypothesis that they turn up in the very first contact with L2, then fossilize, thus hindering the production of correct German utterances. The phonetic stumbling block seems to hinder even the first access to L2. Experimental data (Missaglia 1997) further showed that correct pronunciation is largely dependent on the self-control of intonation – also in L1 – and on the correct accentuation of German words and sentences. With minimal effort, both beginners and advanced learners were able to master German pronunciation. Once learners acquired a rudimentary prosodic competence, many phonological interferences disappeared, suggesting that accentuation and intonation have a controlling function over syllables and segments. Correct prosodic perception and production have proved to have positive consequences on the segmental level. By contrast, work with segments alone, such as traditional structural exercises for the language lab, exercises with minimal pairs, substitution exercises, pattern practice and pattern drills, has not proved to be productive. The correction of single segments has no lasting effect and it has negative consequences on the intonation contour and the melody of the sentence. Thus in Second Language Acquisition (SLA) correct prosody is to be considered primary with respect to segments, also because prosodic deviations have a more negative influence on the communicative effect of speech acts than segmental mistakes. Within this theoretical and empirical framework, we began training adult Italian learners on L2-German prosody with a learner-centred, bilingual approach.
3.1.
The bilingual approach
The bilingual approach in L2-German pronunciation training was the result of theoretical and practical understanding in the field of research on bilingualism and especially of the discovery that the distinction between bilinguals and L2-learners is relative (for a discussion see Missaglia 1997). The differences between bilinguals and L2-learners mainly concern specific modalities of access to mental representation of L1 and L2 items (Paradis 2003) rather than degree of competence. The CPM’s bilingual approach is based on the assumption that learners should not be considered simply learners, isolating them from their reality, i.e. that of speaking two languages and of living with two languages. Starting from their first encounter with L2, each learner is not simply to be viewed as a potential L2-speaker, but has to be considered a bilingual individual, i.e. the “locus of the [linguistic] contact” (Weinreich 1953: 1); language learners are bilingual speakers whose linguistic processes and competence can be studied and described within the framework of research on bilingualism. Thus L1 should not be seen as an obstacle to L2-acquisition, but rather as the threshold connecting L1 with L2. Errors belong to each interlanguage (Selinker 1972; Corder 1981) or approximative system (Nemser 1971), i.e. to each stage along the road which leads to bilingualism, as do other speaker-specific characteristics. For this reason in the CPM errors are considered positively, namely as indicators of problematic aspects of the two linguistic systems and also of the characteristics of the learners’ interlanguage. Recognizing the phonetic circumstances both in the L1 and in the learners’ interlanguage has proved to be indispensable in bilingual and contrastive pronunciation training for L2-learners.
3.2.
The contrastive approach
In the CPM prosody is treated within a contrastive German-Italian framework. With the key word “contrastive” we usually refer to a traditional research area, to contrastive grammar and contrastive analysis and the derived error analysis. Our studies on German and Italian phonetics in contact (and in contrast) are founded on error analysis. But in the CPM the meaning of the key word “contrastive” is far from that attributed to it by the Contrastive Analysis Hypothesis (Lado 1957).
Monolingual and contrastive representations of German and Italian phonetics and phonology show great differences and few similarities concerning both segments and prosody. Most – even recent – contrastive descriptions of German and Italian phonetics and phonology are still dominated by a phonematic perspective. They aim at determining the phonemes’ functional role and, reflecting the methodological implications of traditional Contrastive Analysis, at listing and comparing “common” and “language-specific” phonemes.
The structural differences between the two languages’ phonemic systems and the description of articulatory and auditory characteristics should help learners to avoid mistakes related to production and perception difficulties and should thus simplify L2-acquisition. The real – i.e. phonetic – quality of sounds is not investigated and it is not clear whether and to what extent, in native speakers’ articulation, different or identical phonetic realizations correspond to so-called “common” phonemes. Practical experience in L2 pronunciation training shows that lists of structural differences and phonetic transcriptions do not automatically lead to native-like German pronunciation, because the phonetic quality of so-called “common” or “very similar” phonemes, i.e. phonemes which are represented by the same IPA-symbol, can be very different. The differences concern mostly vowels, as they are inherently more susceptible to modification or cross-language influence than consonants. The representation of German and Italian vowels with separate vowel charts in which “corresponding” vowels appear at the same height is not appropriate because L2-learners are led to assume that “corresponding” vowels are phonetically identical. The consequence is that they simply carry identical presumed L1 sounds over to L2, both in L2 production and perception. Our empirical results show great phonetic differences in German vowel production and perception by German and Italian native speakers. The data involve auditory experiments determining the relevance of quantitative and qualitative perception differences in the German vowel system for German native speakers and adult Italian learners and also articulatory and acoustic analyses of the variations in German vowel production by German and Italian native speakers (for details see Missaglia 2004). It can be hypothesized that the structural – phonological – and phonetic differences between the German and the Italian vowel inventory (15 vs.
7 vowel phonemes in stressed position; 9 vs. 5 in unstressed position) lead to different language-specific perceptual categories and thus to different auditory strategies by native speakers. Contrary to German, in which vowels in stressed position differ phonetically in concomitant quality and quantity, in the Italian vowel system quantity and quality are not correlated; this may cause difficulties when Italian native speakers must identify and discriminate between most German vowels. The co-existence of four concomitant phonetic features for most phonological oppositions in the German vowel inventory, which enables German native speakers to use different strategies to discriminate between stressed vowels (Sendlmeier 1981),
can hinder correct vowel production and perception by Italian learners, but it can also be used in L2-pronunciation training. In fact, even if vowel quantity is not relevant at the segmental level in Italian, the experimental results on vowel perception showed that for Italians to discriminate most German vowel pairs, quantity rather than quality was phonologically relevant. This fact can be explained by the specific characteristics of the Italian vowel system and by the Italians’ sensitivity towards differences in vowel duration. In Italian vowel duration has no phonological relevance at the segmental level, i.e. it is not relevant for distinguishing the vowel phonemes (Bertinetto 1981), but it has a positive influence from an interlingual point of view due to its phonological significance at the suprasegmental level. L2 pronunciation can effectively be trained only when L2 sounds are perceived adequately; an adequate mental representation of L2 phonetic items must be established, which can be controlled with a sort of monitor during sound, word and sentence production (see Sendlmeier 1989b, 1994). The great differences between native speakers of German and Italian concern both vowel perception and production. For the qualitative analysis of vowel production by Italian learners in comparison with German native speakers, vowels were acoustically measured by extracting F1 and F2 values. A contrastive formant chart best shows the great differences and few similarities in German vowel production by Italian and German speakers:
[Formant chart: axes F1 and F2; vowel positions plotted separately for the DM and IM groups.]
Figure 1. Contrastive formant chart with mean values of German vowel production by German (DM) and Italian (IM) native speakers (for details see Missaglia and Sendlmeier 1999).
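The kind of comparison summarized in a contrastive formant chart can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the token values below are hypothetical and are not taken from the study, the function names are invented for the example, and a plain Euclidean distance in Hz is used for simplicity (a perceptual scale could be substituted).

```python
from math import hypot
from statistics import mean


def mean_formants(tokens):
    """Average (F1, F2) in Hz over a list of measured vowel tokens."""
    f1 = mean(t[0] for t in tokens)
    f2 = mean(t[1] for t in tokens)
    return f1, f2


def formant_distance(a, b):
    """Euclidean distance in Hz between two (F1, F2) points."""
    return hypot(a[0] - b[0], a[1] - b[1])


# Hypothetical (F1, F2) measurements for one vowel; NOT data from the study.
dm_tokens = [(300, 2200), (320, 2100), (310, 2150)]  # German speakers (DM)
im_tokens = [(350, 2000), (370, 1950), (360, 2050)]  # Italian speakers (IM)

dm_mean = mean_formants(dm_tokens)
im_mean = mean_formants(im_tokens)
print(dm_mean, im_mean, round(formant_distance(dm_mean, im_mean), 1))
```

A larger distance between the two group means for the same target vowel would correspond to a larger gap between the DM and IM symbols in the chart.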
The experimental data on vowel production founded on acoustic analyses suggest that the German vowels produced by Italian learners are in some way filtered by the corresponding Italian vowels. Italian learners seem to correlate L2-phonemes lacking in the phoneme system of L1 with Italian vowels which are perceived as more similar. It can be hypothesized that the filtering of L2 centralized lax vowels and their realization as the corresponding tense vowels depends on the phonological system of L1 rather than on the effective phonetic realization of L1. In fact, acoustic experiments with Italian native speakers (Albano Leoni, Cutugno and Savy 1995) showed that in spontaneous speech Italians produce many reduced and centralized vowels, which are perceived phonologically, i.e. as the corresponding non-centralized Italian phonemes. Starting from prototype theory the experimental data on vowel production by German and Italian native speakers can be explained as follows: mentally represented language-specific Italian prototypes are activated in both L1 and in L2. L2-learners have internalized the phonological system and the phonetic prototypes of L1, whereas the internalization of the correct phonological system and of the phonetic prototypes of L2 is still incomplete. For L2 sound discrimination and production this lack of L2-prototypes is compensated by the unconscious recourse to L1-specific phonetic prototypes and selective perception patterns. The difficulties in producing correct German vowels by Italian learners depend on L1-specific perception patterns and on interlingual differences in the phonetic and phonological use of the articulation area by languages with different vowel inventories.
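The idea that an L2 sound without its own category is absorbed by the closest L1 prototype can be illustrated with a nearest-prototype assignment in F1/F2 space. The sketch below is a deliberate simplification and not a model proposed in this chapter or in the Native Language Magnet Theory itself: the prototype positions, the token and the function name are all hypothetical.

```python
from math import hypot

# Hypothetical L1 prototype positions as (F1, F2) in Hz; illustrative only.
L1_PROTOTYPES = {"i": (300, 2300), "e": (400, 2100), "a": (750, 1400)}


def nearest_prototype(token, prototypes):
    """Return the label of the prototype closest to the token in F1/F2 space."""
    return min(
        prototypes,
        key=lambda label: hypot(
            token[0] - prototypes[label][0],
            token[1] - prototypes[label][1],
        ),
    )


# A hypothetical L2 token that has no L1 category of its own.
l2_token = (330, 2200)
print(nearest_prototype(l2_token, L1_PROTOTYPES))
```

Under these assumed values the token is assigned to an existing L1 category, which is the sense in which the L1 prototype "filters" the L2 sound.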
Italian native speakers are used to wide articulatory and acoustic variations and they tend not to perceive – and consequently, not to produce – slight acoustic differences of tenseness, openness and centralization, because in the Italian vowel system the differences between the seven stressed vowels are great enough to allow greater freedom in vowel articulation. This freedom leads Italian native speakers to pronounce tense vowels more open and centralized than would be necessary for discriminating and identifying them. This is the main reason why it is so difficult for Italian learners to acquire the German vowel system consisting of 15 phonemes, as they are confronted with a language in which small spectral differences are important for vowel discrimination. These experimental data have important implications for L2-pronunciation training. It can be hypothesized that, besides segmental prototypes, prosodic prototypes are also established, concerning for example stress pattern, intonation contour and rhythm. These prosodic prototypes
may influence and distort L2-perception. Speakers of a syllable-timed language may establish prototypes which reflect the phonological characteristics of their mother tongue. When in contact with a stress-timed language these prototypes may cause difficulties in L2 perception, such as in relation to discrimination and identification of stress position, intonation contour, quantity relations, vowel quality and degrees of prominence. Prosodic and rhythmic differences between a stress-timed language such as German (Kohler 1982, 1983, 1991) and a syllable-timed language such as Italian (Bertinetto 1977, 1989a, 1989b) may explain why Italian learners have fewer production and perception difficulties with German segments than with German prosody, vowel reduction processes and consonant clusters. In German, traditionally a stress-timed language, deaccenting processes are extremely important; they are even more relevant than accenting processes, as at the level of intersegmental co-ordination processes they mostly lead to reductions and assimilations (Kohler 1979, 1990). Unstressed vowels undergo strong reductions tending towards schwa (Meinhold 1962, 1967), and voiced consonants in the syllable coda are devoiced (Auslautverhärtung). For Italian speakers vowel reductions and centralizations in unstressed positions and final devoicing are difficult tasks to accomplish because in Italian, traditionally a syllable-timed language, there is no phonological distinction between stressed and unstressed vowels and consonants in syllable onset and in syllable coda. Thus many segmental mistakes depend on the fact that Italian learners spontaneously carry L1 perception patterns and distribution rules over to L2, emphasizing accurate articulation and elaborate pronunciation of segments over correct prosodic realization. Deviations from the rhythmical structure of the stress-timed German language have a negative influence on communication with German native speakers.
4.
The method
Familiarity with Italian learners’ difficulties in L2-German prosodic perception and production and the knowledge of specific features of German and Italian phonetics and phonology led to the Contrastive Prosody Method (Missaglia 1999b). The CPM is characterized by systematic attention to intonation contours, stress patterns and especially deaccenting processes with the consequent typical reduction phenomena of German. Furthermore,
the CPM systematically directs the learners’ attention towards communicatively adequate prosodic realizations of speech acts in L1 and L2, whereas segmental aspects are largely neglected. In the initial phases of the training, learners deal with prosody and phonetics intuitively, and they are taught rules explicitly later on. The training is performed with authentic German texts (dialogues, songs, modern poetry, etc.), which, due to their characteristics, must be read aloud. The exercises rely on two rules: (1) the learners have to imagine real communicative situations and act accordingly; and (2) they have to produce only one strong stress in each sentence, deaccenting and reducing all the words without sentence accent. For Italian learners this means a reduction, perceived as drastic, of all secondary stresses towards the primary stress. Italian learners with little knowledge of word and sentence accent rules in German initially do not know which element bears the primary stress. As they cannot rely on their L2-competence, they have to work with the only language they know they can handle without running the risk of making mistakes, i.e. the mother tongue. So they start practicing by treating their Italian as if it were German. By trial and error, learners can experiment with their mother tongue, where they are certain not to make pronunciation mistakes, monitoring it as if it were German. With minimal effort, both beginners and advanced students are able to master German pronunciation. Not caring about the difficulties connected with the production of segments foreign to the mother tongue, learners produce utterances monitoring the generic (prosodic) elements of the communicative situation. They produce Italian sentences with only one strong stress and deaccent all the other words, trying to exaggerate each word accent, whereas the other learners judge whether the Italian sentences produced sound unnatural or spontaneous.
At this stage the learners only rely on their L1-competence. Only when they realize which accent cannot be eliminated in Italian – and this is often a surprising discovery – do they realize with their mother-tongue sentences which word is endowed with the primary stress. At this point the learner switches to German assuming that the German equivalent of the accented word in the Italian sentence dominates the German sentence, too; in the first phases of the training it is important to present the learners with texts in which the German and the Italian primary stresses coincide. When producing the German sentence the learner exaggerates only the sentence accent by deaccenting all other elements. Thus a perfect German
speech act is produced effortlessly: by exaggerating the sentence accent, there is little energy for voiced consonants in syllable coda and for long tense vowels in an unstressed position. The reduction and centralization of unstressed vowels, the disappearance of schwa-epenthesis and the final devoicing are automatically accomplished. The aim of the CPM is not to produce sentences, but speech acts, in which prosody is felt as the interface between grammar and the speaker’s affective reality, for example his emotions, fears, anxiety, etc. Realistic, i.e. natural and spontaneous, speech acts have to be produced first with Italian sentences which correspond to the German sentences of the exercise, embedding them into self-constructed virtual situations. Following the premises of a bilingual approach, the didactic units centred on prosody do not exclude the mother tongue, but make regular use of translations. The learners are not given translations in L1; they have to construct them starting from the situation given by the sentence/text. The learners are autonomous; they never repeat model-sentences: correct pronunciation is attained without the teacher’s instructions and rules; the learners never have models to repeat and imitate, but utterances – from their peers – to judge and improve. They act inside a group of learners, who listen to the sentences of the other learners and judge their acceptability in L1 and in L2. The reference point of each trial, phonetic variant and self-correction is the learners’ speech acts. The method’s principle is that learning is efficient only when it is self-learning inside a group of peers. Not being dependent on an external model offered by the teacher gives the learners a feeling of security and of success, which leads to a positive change of the learning attitude: the learners quickly realize that they are able to produce German sentences on their own.
The learners are autonomous and nobody is excluded from the acquisition process, as everyone – both the speaker and the listeners – must respectively activate their language awareness in order to produce prosodically correct sentences (the former) and control their communicative efficiency (the latter). “Perfection” is reached only when the group’s members judge the sentences as not scholastic but normal-sounding and adequate in reflecting the speaker’s intentions and emotions. When the speech acts in L1 are considered efficient, i.e. natural sounding, the speaker has to produce an equivalent German speech act, then he must quickly switch from Italian to German and back again. By often switching from one language to the other, the learners improve their speed in code-switching. The prosodic correction of the German speech acts is performed by comparing them with the learner’s Italian model; for the prosodic control of Italian and German speech acts the judges are the other learners. The first step in this teamwork is to find an agreement on the norms determining the communicative efficiency of speech acts, both in L2 and L1. The most difficult task is the production of prosodically correct sentences in L1, whereas passing to L2 becomes extremely easy, given that the rules and regularities deduced from Italian are applied with minimal adaptations. Thus learners acquire prosodic competence first in L1, and then profit from the experience in Italian for the acquisition of prosodic competence in German. In fact, what is trained is the learners’ awareness towards prosodic aspects, a sort of prosodic awareness, which has to be reached first in L1, as it can be controlled more easily and without anxiety. In a further step, prosodic awareness will enhance the acquisition of L2 prosodic competence. The positive effect of the training method depends on the fact that the German sentences of the exercises are translated into realistic Italian sentences by the learners themselves. Moreover, in a first step the Italian sentences have to be realized as speech acts which (1) sound natural and (2) present only one strong stress (the sentence accent). Only when the peers judge that these two conditions are met in the Italian sentences is the second step reached, and the original German sentences can be realized. The aim is again to produce speech acts with only one strong stress, which at the same time sound natural to the judging peers. Practical experience with the CPM shows that prosody as a basis for a contrastive approach in pronunciation training is useful, because from the beginning learners effortlessly and unconsciously avoid mistakes which otherwise would hinder them from correctly perceiving, producing and acquiring L2.
Furthermore, this has positive consequences on their learning attitude as they soon realize that their sentences “sound German”.
5.
Prosody-centred vs. segment-centred pronunciation training in second language acquisition
To check the validity of the CPM, a systematic control of the method in comparison with traditional segment-centred pronunciation training was introduced on an experimental basis (Missaglia 1999b). An empirical investigation under controlled conditions with a pre- and post-test design was performed by comparing the phonetic performance of an experimental
group trained with the CPM (PT) with the phonetic performance of a control group (ST) which received a traditional segment-centred training. In order to protect the experimental results from inter-group-effects unrelated to the training methods, all students were selected from the first university year: all subjects were beginners with little experience in German and they were selected on a voluntary basis. The aim of the experiment was twofold: 1) to find out to what extent segmental or suprasegmental competence determines the intuition-based global impression of L2-learners’ comprehensibility by native speakers; and 2) to measure the efficiency of the CPM by comparing the improvement rates after prosody-centred and segment-centred training procedures, both on the basis of subjective auditory judgements and on an objective phonetic-acoustic basis. For the first part of the investigation, i.e. the intuition-based auditory judgements, native speakers – German university teachers – had to judge global pronunciation competence, segmental and suprasegmental competence, i.e. correct segments vs. adequate word and sentence stress and intonation pattern. In a further step, the recordings were phonetically analyzed in detail at the segmental and suprasegmental level (for details see Missaglia 1999b) in order to have an objective basis for evaluating the subjects’ improvement rates in addition to the native speakers’ subjective-auditory judgements. The comparison of global judgements with segmental vs. suprasegmental competence shows that native speakers are more influenced by suprasegmental competence (46.6%) than by segmental competence (22.5%). In 9.9% of cases the two marks were equal, in 20.9% the global impression was the exact mean between the marks for the two competences.
[Bar charts for pre-test and post-test: segmental competence, suprasegmental competence and global judgement; mark scale 15 to 30.]
Figure 2. Native speakers’ judgements of global pronunciation competence, segmental competence and suprasegmental competence expressed in Italian marks (min. 18 for sufficiency, max. 30).
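How a comparison of global judgements with the two component marks might be tallied can be sketched as follows. The chapter does not spell out the procedure, so the four categories, the reading of "the two marks were equal" as equal segmental and suprasegmental marks, the function name and the sample marks are all assumptions made for illustration.

```python
from collections import Counter


def classify(segmental, suprasegmental, global_mark):
    """Label how a global mark relates to the two component marks."""
    if segmental == suprasegmental:
        return "marks equal"
    if global_mark == (segmental + suprasegmental) / 2:
        return "exact mean"
    d_seg = abs(global_mark - segmental)
    d_supra = abs(global_mark - suprasegmental)
    # Equal distances with unequal marks can only occur at the exact mean,
    # which was handled above, so one distance is strictly smaller here.
    return "closer to suprasegmental" if d_supra < d_seg else "closer to segmental"


# Hypothetical (segmental, suprasegmental, global) marks; NOT the study's data.
judgements = [(22, 26, 25), (24, 20, 23), (25, 25, 25), (20, 24, 22)]

tally = Counter(classify(*j) for j in judgements)
shares = {label: 100 * n / len(judgements) for label, n in tally.items()}
print(shares)
```

With real judgement data, the resulting shares would be the counterpart of the percentages reported above.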
An individual and a group-specific statistical analysis was performed to measure the mean individual improvement rates and that of the experimental group (PT) in comparison with that of the control group (ST), i.e. to calculate exactly to what extent the two different groups’ improvement rates diverge; statistics (means, standard deviations and t-tests) were calculated with SPSS for Windows 7.5. In the post-test both the PT and the ST obtained higher marks than in the pre-test, which confirms a positive effect of both training procedures. Individual results show an obvious difference between pre- and post-test for all subjects (PT and ST), but group-specific results (PT vs. ST) also show evident between-group differences. There are greater differences between pre- and post-test for the experimental group (PT) than for the control group (ST) (Figure 3).
[Chart: PRE and POST marks for ST, PT and the mean; mark scale 15 to 30.]
Figure 3. Pre- and post-test performance before (PRE) and after (POST) segmental (ST) and prosodic (PT) training – auditory judgements by native speakers.
Statistical evidence shows that in the pre-test the means of the PT (21.8) and of the ST (20.8) did not significantly differ (p>.05), whereas there was a highly significant difference (p<.01) between the means of the PT (27.3) and the ST (22.8) in the post-test. A t-test for paired samples between pre- and post-test performances of the two groups shows a highly significant difference (p<.01). The improvement rates varied for the PT between 3 and 8 marks (mean 5.5), and between -2.2 and 6 marks (mean 2.0) for the ST.
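The two kinds of t-test mentioned here can be illustrated with a minimal sketch using only the Python standard library. It computes the test statistics but not the p-values, which would normally be obtained from a statistics package or a t-table. The marks are hypothetical, the function names are invented for the example, and the independent-samples version assumes equal variances (pooled variance), which may or may not match the settings used in the original analysis.

```python
from math import sqrt
from statistics import mean, stdev, variance


def paired_t(pre, post):
    """t statistic for paired samples (same learners before and after)."""
    diffs = [b - a for a, b in zip(pre, post)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))


def independent_t(x, y):
    """Student's t statistic for two independent groups, pooled variance."""
    nx, ny = len(x), len(y)
    pooled = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)
    return (mean(x) - mean(y)) / sqrt(pooled * (1 / nx + 1 / ny))


# Hypothetical marks on the 18-30 scale; NOT the study's data.
pt_pre, pt_post = [21, 22, 20, 23], [27, 28, 25, 29]
st_post = [22, 24, 21, 23]

print(paired_t(pt_pre, pt_post))        # within-group: pre- vs. post-test
print(independent_t(pt_post, st_post))  # between-group: PT vs. ST post-test
```

The paired version corresponds to the within-group comparison (pre- vs. post-test), the independent version to the between-group comparison (PT vs. ST).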
[Bar chart: PT and ST; vertical scale 15 to 30.]
Figure 4. Mean improvement rates of the PT and the ST.
As the two groups did not significantly differ in the pre-test (p>.05), whereas both the between-group (PT vs. ST) and the within-group (pre- vs. post-test) differences were highly significant (p<.01), it can be concluded that a training-dependent change has taken place; the different improvement rates can definitely be attributed to a training-effect. Statistical evidence shows that higher improvement rates were achieved by prosody-centred training than by segment-centred training (Figure 5).
[Bar chart: mean marks (scale 15 to 30) for ST and PT, PRE vs. POST.]
Figure 5. PT and ST performance in pre- and post-test – global impression by native speakers.
In relation to global impression by native speakers, the experimental group (PT) improved its pronunciation more notably than the control group (ST). Similar improvement rates could be observed on the basis of the number of segmental mistakes (Figure 6).
[Bar chart: segmental mistakes (scale 0 to 140) for ST and PT, PRE vs. POST.]
Figure 6. PT and ST performance in pre- and post-test – sum of segmental mistakes.
In relation to the mistakes at the segmental level, both the t-test for independent samples of the post-test results between PT and ST means and the t-test for paired samples between pre- and post-test of the PT and the ST showed highly significant differences (p<.01), whereas a non-significant difference (p>.05) between PT and ST segmental mistakes in the pre-test shows that at the beginning of the programmes the two groups did not significantly differ from each other. The comparison between the improvement rates of the prosodically trained experimental group (PT) with those of the segmentally trained control group (ST) favours the PT. Both the auditory judgments of native speakers and a detailed analysis of segmental mistakes indicate that adult Italian learners of German profit more from a prosody-centred pronunciation training than from a traditional segment-centred one. Not only is the general performance – the global impression of correctness, comprehensibility and communicative efficiency by native speakers – affected, but also the segmental production. In fact, the data show that attention towards prosodic aspects in the initial phases of L2-acquisition also has positive effects on the segmental level: the mean improvement rates at the segmental level are 78.9% for the PT and 31.9% for the ST (Figure 7).
[Bar chart: PT and ST, pre vs. post; vertical scale 10 to 90.]
Figure 7. Sum of segmental mistakes before (PRE) and after (POST) segmental (ST) and prosodic (PT) training.
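The comparisons reported above (independent samples between groups, paired samples within groups, and a percentage improvement rate) can be illustrated with a minimal sketch. It is not the analysis actually run in the study: the mistake counts are invented for illustration, the pooled-variance form of the independent-samples test is an assumption, and the improvement rate is assumed here to mean the relative reduction in the sum of mistakes, which the text does not define explicitly.

```python
import math
from statistics import mean, stdev


def paired_t(pre, post):
    """t statistic for paired samples (the same learners before and after training)."""
    diffs = [a - b for a, b in zip(pre, post)]
    n = len(diffs)
    return mean(diffs) / (stdev(diffs) / math.sqrt(n))


def independent_t(x, y):
    """Student's t statistic for two independent samples (pooled variance, assumed)."""
    nx, ny = len(x), len(y)
    pooled = ((nx - 1) * stdev(x) ** 2 + (ny - 1) * stdev(y) ** 2) / (nx + ny - 2)
    return (mean(x) - mean(y)) / math.sqrt(pooled * (1 / nx + 1 / ny))


def improvement_rate(pre, post):
    """Relative reduction in the sum of mistakes, in percent (assumed definition)."""
    return 100 * (sum(pre) - sum(post)) / sum(pre)


# Hypothetical segmental-mistake counts per learner; NOT the study's data.
pt_pre, pt_post = [12, 15, 11, 14, 13], [3, 4, 2, 3, 2]
st_pre, st_post = [13, 14, 12, 15, 11], [9, 11, 8, 10, 8]

print("pre-test, PT vs. ST (independent):", independent_t(pt_pre, st_pre))
print("post-test, PT vs. ST (independent):", independent_t(pt_post, st_post))
print("PT, pre vs. post (paired):", paired_t(pt_pre, pt_post))
print("ST, pre vs. post (paired):", paired_t(st_pre, st_post))
print("improvement rate PT (%):", improvement_rate(pt_pre, pt_post))
print("improvement rate ST (%):", improvement_rate(st_pre, st_post))
```

Each t statistic would then be compared against the t distribution with the appropriate degrees of freedom (n − 1 for the paired test, n₁ + n₂ − 2 for the pooled independent test) to obtain the p-values of the kind quoted in the text.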
Prosodic training for adult Italian learners of German
The experimental results indicate that L2 learners trained with prosody-centred and segment-centred programmes improve at different rates, both according to the global impression of native speakers, which proved to be strongly influenced by suprasegmental competence, and at the segmental level. In both cases the statistical evidence favours prosody-centred pronunciation training: the significantly different improvement rates of the two procedures illustrate the more positive effects of prosody-centred training on L2 German pronunciation and give evidence of the priority of prosody in L2 pronunciation training and in L2 phonetic acquisition. The positive results concerning both of the investigated aspects – L2 segments and prosody – as well as the emotional component involved in the acquisition process point to the need to invert the traditional priorities in L2 pronunciation training and to give prosody a primary role in second language acquisition.
Language index
Arabic, 135
Bini, 32
Bulgarian, 107, 110
Chinese, 32
Danish, 12, 58
Dutch, 46, 55, 110
English, 7f., 11f., 27, 32, 37, 40, 46, 58f., 62, 64, 67, 83, 103, 105–107, 110, 112, 128, 134f., 155f.
Faroese, 128
French, 11, 103, 106f., 115, 131, 134f.
German, 7, 27, 32, 37–40, 58f., 64, 67, 83, 106f., 110, 134f., 155f., 160ff., 172, 223, 242, 245
German/Turkish bilinguals, 64
Greek, 37, 39, 60, 105
Hungarian, 39
Icelandic, 128
Indian English, 35
Italian, 33, 35–37, 39f., 58f., 105, 107, 115, 242f., 245
Japanese, 7, 27, 32, 62
Korean, 46, 112
Norwegian, 7, 121, 124, 126, 130, 132, 135
Polish, 8
Portuguese, 12
Romanian, 39
Russian, 110, 112, 127, 135
Scottish Gaelic, 128
Singapore English, 58, 63
Spanish, 55, 134
Swedish, 7, 32, 58, 106f., 110, 115
Telugu, 135
Turkish, 7, 223
Welsh, 7
Yoruba, 135
Index of L1–L2 combinations
L1 Chinese – L2 German 112, 160ff., 225
L1 Chinese – L2 Norwegian 126, 129f., 132, 134f., 138f.
L1 Dutch – L2 English 55
L1 Dutch – L2 Greek 60f., 71
L1 English – L2 French 57
L1 English – L2 German 82, 85, 90, 112, 114, 123, 160ff.
L1 English – L2 Italian 59
L1 English – L2 Norwegian 126, 128f., 134f., 138f.
L1 French – L2 German 109
L1 French – L2 Norwegian 126, 129f., 134f., 138f.
L1 French – L2 Spanish 8
L1 German – L2 English 54, 61, 65, 69, 86, 114
L1 German – L2 Italian 59
L1 German – L2 Norwegian 126, 129f., 134f., 138f.
L1 Italian – L2 German 160ff., 238f., 242–244, 252
L1 Japanese – L2 English 62
L1 Korean – L2 German 112
L1 Persian – L2 Norwegian 126, 129, 134f., 138f.
L1 Polish – L2 English 8
L1 Russian – L2 German 112, 223f.
L1 Russian – L2 Latvian 127
L1 Russian – L2 Norwegian 126, 129, 134f., 138f.
L1 Spanish – L2 English 53, 59, 62
L2 English 155f.
L2 German 154f., 173, 175
Subject index
accent, 6
  dynamic, 7
alignment (see tonal alignment)
articulation rate, 9
autosegmental-metrical model, 14, 43f., 56f., 79f.
consonant quantity, 121f.
  in L2, 126f., 129–131
corpus (see language corpora)
declination, 40
domain initial strengthening, 30f.
effort code, 40
final lengthening, 31
fluency, 9f.
focus, 35
frequency code, 39
information structure, 34f.
interlinear transcription, 14
intonation contour, 14
intonation language, 32f.
intonation, 14, 25, 100
  and focus, 35–37
  and marking of information structure, 34f.
  and speech acts, 37f.
  autosegmental-metrical model of, 14, 43f., 56f., 79f.
  British School model of, 42f., 79f.
  foreign-accented, 82–92
  highlighting function, 27f.
  in L2, 53f., 55, 82–89, 174
  model for speech synthesis, 81f.
  paralinguistic function, 38–41
  teaching of, 14
  teaching material for, 223ff.
intonational phonology and phonetics, 57
intonational phrasing, 29f.
  in L2, 174
  teaching material for, 183
isochrony, 11, 105
L2 consonant quantity, 126f., 129–131
L2 intonation, 53f., 55, 82–89, 174
L2 phonological acquisition process, 192
  and extralinguistic factors, 195–197
L2 phrasing, 174
L2 pitch accent, 174
L2 pitch level, 66–70
L2 pitch range, 67f.
  in L2, 67f.
  measurement of, 65f.
L2 pitch span, 66–70
L2 prosody research methodology, 145
L2 sentence stress, 61–63, 173
L2 speech rhythm, 111–113, 131–137
L2 tonal alignment, 59–61
L2 vowel reduction, 153–162, 247
language awareness, 8, 193f., 221, 247f.
  teaching material for, 198–200, 221–226
language corpora, 147–149
  as a research method, 149–151
  in teaching, 146, 162f.
learner autonomy, 214, 247
learning awareness, 227–229
phonological metacompetence, 193
  teaching of, 198, 202–204
pitch accent language, 7, 32
pitch accent, 27
pitch level, 64f.
pitch movement, 26
pitch range, 26, 63f.
pitch span, 64f.
preaspiration, 128
  in L2, 128
rhythm, 11f., 102–108
  and language typology, 11f., 104–108
  in L2, 111–113, 131–137
  in music, 103
  measurement of, 12f.
  teaching of, 13, 109, 113–115
sentence stress, 6, 98f.
  in L2, 61–63, 174
  teaching material for, 184, 246f.
speech act, 37f., 246
  in L2, 246f.
speech rate, 9
  measurement of, 9f.
  teaching of, 10f.
  variation of, 10
speech rhythm (see rhythm)
stress, 6, 27f.
  deafness, 8
  fixed, 6
  phonetic realisation, 7
  teaching of, 8
ToBI, 44f., 79f.
tonal alignment, 58
tone language, 32
vowel quantity, 121f.
vowel reduction
  in L2, 153–162, 247
  teaching material for, 182, 184
word stress, 6, 98f.
  perception of, 218
  teaching materials for, 178, 182